Zalingo Synthetic Finance — Credit Applications — 100k
Synthetic Tabular Data
£249
About
Zalingo Synthetic Finance — Credit Applications — 100k Core Sample (Parquet)
A 100,000-row synthetic credit-applications sample designed for scorecard demos, feature exploration, and pipeline QA, with no PII. Records include product and channel, financials, bureau-like attributes, decision outcomes, and a basic 12-month performance label to support PD-style (probability of default) baselines.
Need richer risk features or larger volumes? See our Focused (adds binned/bucketed risk features and performance windows) and Premium Evaluation Kit (~1M rows) listings.
Dataset Features (representative)
- Application: application_id, ts_utc, channel (online | branch | partner), product (credit_card | personal_loan | auto | mortgage)
- Requested Terms: amount_requested, term_months, purpose
- Affordability: income_monthly, employment_status, dti (debt-to-income), housing_status
- Credit History (synthetic): credit_history_length_m, prior_defaults_ct, enquiries_90d, utilisation_ratio, bureau_score
- Decision & Offer: approval_flag (0/1), decision_code, offer_apr, offer_limit
- Performance Label (basic): outcome_default_12m (0/1)
(Columns may vary slightly; see the preview file for the exact schema.)
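As a rough illustration of how the enumerated and binary fields above could be checked during pipeline QA, the sketch below validates one record at a time. The allowed values come from this listing. The function name validate_row and the choice of checks are assumptions made for the example, and it assumes flags have already been parsed to integers (as they would be when read from Parquet rather than CSV).

```python
# Allowed values are taken from the listing; the checks themselves are illustrative.
ALLOWED = {
    "channel": {"online", "branch", "partner"},
    "product": {"credit_card", "personal_loan", "auto", "mortgage"},
}
BINARY_COLUMNS = ("approval_flag", "outcome_default_12m")


def validate_row(row):
    """Return a list of human-readable problems found in one record (a dict)."""
    problems = []
    for column, allowed in ALLOWED.items():
        if row.get(column) not in allowed:
            problems.append(f"{column}: unexpected value {row.get(column)!r}")
    for column in BINARY_COLUMNS:
        if row.get(column) not in (0, 1):
            problems.append(f"{column}: expected 0 or 1, got {row.get(column)!r}")
    return problems
```

An empty list means the record passed. Since columns may vary slightly, confirm the column names against the preview file before relying on these checks.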
Distribution
- Format: ZIP containing Parquet data (100k_sample.parquet), sample_100.csv (preview), and schema.json
- Volume: 100,000 rows, ~20–30 columns
- Approx Size: 3–6 MB zipped (category-dependent)
- Structure: a single Parquet file (or a few shards); schema stable across core credit samples
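A minimal sketch of inspecting the bundle before loading the full Parquet file, using only the Python standard library. The file names sample_100.csv and schema.json come from this listing. The function name inspect_bundle is hypothetical, the internal layout of schema.json is not documented here (it is returned as parsed JSON without interpretation), and the sketch assumes both files sit at the top level of the archive.

```python
import csv
import io
import json
import zipfile


def inspect_bundle(zip_source):
    """Return (preview_columns, schema) from the ZIP described in the listing.

    zip_source may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_source) as bundle:
        # Assumption: files are at the archive root; adjust if they are nested.
        preview_text = bundle.read("sample_100.csv").decode("utf-8")
        schema = json.loads(bundle.read("schema.json").decode("utf-8"))
    # The first CSV row is taken to be the header.
    preview_columns = next(csv.reader(io.StringIO(preview_text, newline="")))
    return preview_columns, schema
```

Comparing preview_columns with the contents of schema.json is a cheap first check before reading the Parquet data with a library of your choice.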
Usage
- Scorecard baselines & PD demos — feature trials, threshold tuning
- Underwriting dashboards & sandboxes — approval/offer diagnostics
- Pipeline QA / MLOps — schema contracts, drift checks, visualisations
- Education & enablement — safe examples without consumer PII
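One common way to approach the drift checks mentioned above is the Population Stability Index (PSI), sketched here for a single numeric column such as bureau_score. This is an illustrative technique and not something the dataset ships with. The function name psi, the bucket edges, and the small floor used for empty buckets are all assumptions for the example.

```python
import math


def psi(expected, actual, edges):
    """Population Stability Index between two numeric samples.

    edges are ascending cut points: values below the first edge fall in
    bucket 0, values at or above the last edge fall in the final bucket.
    """
    def shares(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            # Bucket index = number of cut points the value has reached.
            counts[sum(v >= e for e in edges)] += 1
        # A small floor avoids log(0) when a bucket is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    exp, act = shares(expected), shares(actual)
    return sum((a - e) * math.log(a / e) for e, a in zip(exp, act))
```

A usage sketch: call psi(baseline_scores, new_scores, edges=[580, 640, 700, 760]), where both arguments are lists of bureau_score values from two time slices and the edges are placeholders. Identical distributions give 0, and larger values indicate a bigger shift.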
Coverage
- Geographic: Multi-country synthetic coverage (ISO codes)
- Time Range: Recent multi-year synthetic window with weekly/seasonal patterns
- PII: None — fully synthetic; not re-identifiable
Who Can Use It
- Risk/Data Science — feature engineering, baseline scorecards
- Underwriting/FinOps — policy experiments, KPI diagnostics
- Product/Analytics — cohort & approval-rate analysis
- Vendors/SIs — demo environments and pipeline validation
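For the cohort and approval-rate analysis mentioned above, a minimal standard-library sketch follows. It assumes each record is a dict carrying the channel and approval_flag columns from the feature list. The function name approval_rate_by is hypothetical.

```python
from collections import defaultdict


def approval_rate_by(rows, key="channel"):
    """Map each value of `key` to the share of rows with approval_flag == 1."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for row in rows:
        totals[row[key]] += 1
        approved[row[key]] += int(row["approval_flag"])
    return {group: approved[group] / totals[group] for group in totals}
```

Passing key="product" gives the same breakdown by product type. In practice a dataframe group-by does the same job; the plain-Python version is shown only to keep the example dependency-free.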
Notes / Disclaimers
- Synthetic data; not real consumer applications.
- Not for production credit decisions. Rates and labels follow calibrated synthetic distributions and do not represent any specific lender/bureau.