Zalingo Synthetic Finance — Credit Risk Signals — 100k
Synthetic Tabular Data
£749
About
Zalingo Synthetic Finance — Credit Risk Signals — 100k Focused (Parquet)
A 100,000-row, privacy-safe focused sample for credit risk modelling and scorecard prototyping. The pack enriches core application data with bureau-like aggregates, affordability and stability flags, utilisation and enquiry features, and 6- and 12-month performance labels. It suits feature engineering, baseline PD models, and policy experiments without handling real consumer data (no PII).
Need larger volumes or notebooks? Upgrade to the Premium Evaluation Kit (~1M rows) or ask for a weekly/daily enterprise feed.
Dataset Features (representative)
- Application: application_id, ts_utc, channel (online | branch | partner), product, amount_requested, term_months, purpose
- Demographic/Economic (synthetic): employment_status, employment_tenure_m, income_monthly, housing_status, address_age_m
- Affordability & Stability: dti, debt_service_ratio, affordability_flag, stability_score
- Bureau-like Signals (synthetic): credit_history_length_m, prior_defaults_ct, late_payments_12m, enquiries_90d, utilisation_ratio, limit_total, balance_total
- Segmentation & Scores: segment_bucket (prime | near-prime | subprime), pd_score_0_1
- Decision & Offer: approval_flag (0/1), decision_code, offer_apr, offer_limit
- Performance Labels: outcome_default_6m (0/1), outcome_default_12m (0/1)
(Columns may vary slightly; see the included preview for exact schema.)
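Because the exact schema can differ from the representative list, a quick column check against the preview header may be useful before building features. The following is a minimal sketch using only the Python standard library. The group names, the helper functions, and the demo header are illustrative assumptions; the column names are copied from the list above.

```python
import csv
import io

# Column groups copied from the representative feature list.
# The listing says columns may vary, so treat this as a starting contract.
EXPECTED_COLUMNS = {
    "application": ["application_id", "ts_utc", "channel", "product",
                    "amount_requested", "term_months", "purpose"],
    "demographic_economic": ["employment_status", "employment_tenure_m",
                             "income_monthly", "housing_status", "address_age_m"],
    "affordability_stability": ["dti", "debt_service_ratio",
                                "affordability_flag", "stability_score"],
    "bureau_like": ["credit_history_length_m", "prior_defaults_ct",
                    "late_payments_12m", "enquiries_90d", "utilisation_ratio",
                    "limit_total", "balance_total"],
    "segmentation_scores": ["segment_bucket", "pd_score_0_1"],
    "decision_offer": ["approval_flag", "decision_code", "offer_apr", "offer_limit"],
    "performance_labels": ["outcome_default_6m", "outcome_default_12m"],
}


def read_header(csv_text):
    """Return the first row (the header) of a CSV supplied as text."""
    return next(csv.reader(io.StringIO(csv_text)))


def missing_by_group(header, expected=EXPECTED_COLUMNS):
    """Return {group: [columns absent from header]} for groups with gaps."""
    present = set(header)
    gaps = {}
    for group, cols in expected.items():
        absent = [c for c in cols if c not in present]
        if absent:
            gaps[group] = absent
    return gaps


# Hypothetical, deliberately incomplete header used only to show the output shape.
demo_header = read_header(
    "application_id,ts_utc,channel,pd_score_0_1\n"
    "1,2024-01-01T00:00:00Z,online,0.07\n"
)
print(missing_by_group(demo_header))
```

In practice the header would come from the first line of sample_100.csv rather than from the inline demo string.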
Distribution
- Format: ZIP containing Parquet data (100k_sample.parquet), sample_100.csv (preview), and schema.json
- Volume: 100,000 rows, ~24–36 columns
- Approx Size: 3–6 MB zipped (mix-dependent)
- Structure: Single Parquet (or a few shards); schema stable across focused credit packs
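One way to look inside the download before loading the full Parquet file is to list the ZIP members and read the two small companion files. This sketch uses only the standard library. The function name inspect_pack is hypothetical, and the internal layout of schema.json is not documented in the listing, so it is parsed as generic JSON; the demo archive and its contents are fabricated purely for illustration.

```python
import csv
import io
import json
import zipfile


def inspect_pack(zip_source):
    """Summarise a pack ZIP: member names, preview header, parsed schema.json."""
    summary = {"members": [], "preview_columns": None, "schema": None}
    with zipfile.ZipFile(zip_source) as zf:
        summary["members"] = zf.namelist()
        for name in summary["members"]:
            base = name.rsplit("/", 1)[-1]  # tolerate a top-level folder
            if base == "sample_100.csv":
                text = zf.read(name).decode("utf-8")
                summary["preview_columns"] = next(csv.reader(io.StringIO(text)), None)
            elif base == "schema.json":
                # Structure is unknown here, so keep whatever JSON it holds.
                summary["schema"] = json.loads(zf.read(name).decode("utf-8"))
    return summary


# Build a tiny in-memory stand-in for the real archive (illustrative only).
demo_zip = io.BytesIO()
with zipfile.ZipFile(demo_zip, "w") as zf:
    zf.writestr("sample_100.csv", "application_id,ts_utc\n1,2024-01-01T00:00:00Z\n")
    zf.writestr("schema.json", json.dumps({"columns": ["application_id", "ts_utc"]}))
    zf.writestr("100k_sample.parquet", b"")  # placeholder bytes, not real Parquet
demo_zip.seek(0)

print(inspect_pack(demo_zip))
```

The Parquet member itself can then be read with any Parquet-capable library, for example pandas.read_parquet with pyarrow or fastparquet installed.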
Usage
- PD / scorecard baselines — feature engineering, binning, threshold tuning
- Underwriting policy experiments — approval strategy & offer simulations
- Portfolio analytics — cohorting, delinquency/roll-rate diagnostics
- MLOps QA — schema contracts, drift monitors, dashboard demos
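As an illustration of the threshold-tuning and approval-strategy use cases, the sketch below sweeps a cutoff on pd_score_0_1 and reports the approval rate and the 12-month default rate among approved rows. It rests on assumptions the listing does not confirm: that a higher pd_score_0_1 means higher risk, and that outcome_default_12m is populated for every row. The function name policy_summary and the toy rows are hypothetical.

```python
def policy_summary(rows, max_pd):
    """Approve rows with pd_score_0_1 <= max_pd and summarise the outcome.

    Assumes higher scores mean higher risk and that every row carries
    an outcome_default_12m label (0 or 1).
    """
    approved = [r for r in rows if r["pd_score_0_1"] <= max_pd]
    n_total = len(rows)
    n_approved = len(approved)
    n_defaults = sum(r["outcome_default_12m"] for r in approved)
    return {
        "max_pd": max_pd,
        "approval_rate": n_approved / n_total if n_total else 0.0,
        "default_rate_approved": n_defaults / n_approved if n_approved else 0.0,
    }


# Toy rows for illustration only; not taken from the dataset.
toy_rows = [
    {"pd_score_0_1": 0.02, "outcome_default_12m": 0},
    {"pd_score_0_1": 0.05, "outcome_default_12m": 0},
    {"pd_score_0_1": 0.12, "outcome_default_12m": 1},
    {"pd_score_0_1": 0.30, "outcome_default_12m": 0},
    {"pd_score_0_1": 0.55, "outcome_default_12m": 1},
]

for cutoff in (0.05, 0.15, 0.60):
    print(policy_summary(toy_rows, cutoff))
```

Loosening the cutoff raises the approval rate and, on these toy rows, the default rate among approvals, which is the trade-off a policy experiment would typically chart.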
Coverage
- Geographic: Multi-country synthetic coverage (ISO codes)
- Time Range: Recent multi-year synthetic window with weekly/seasonal patterns
- PII: None — fully synthetic; not re-identifiable
Who Can Use It
- Risk/Data Science — PD modelling & feature stores
- Underwriting/FinOps — policy design & KPI diagnostics
- Product/Analytics — approval, offer, and performance sandboxes
- Vendors/SIs — demo environments & connector validation
Notes / Disclaimers
- Synthetic data; not real consumer applications or bureau data.
- Not for production credit decisions. Distributions and labels are synthetic and calibrated; they do not represent any specific lender/bureau.