Zalingo Synthetic Finance — Credit Risk Signals — 100k

Synthetic Tabular Data

Tags and Keywords

Synthetic, Data, Finance, Credit, Applications, Underwriting, Scorecards, Risk, Signals, PD, Default, Bureau, Affordability, Parquet, CSV, PII-safe, Anonymised

Listed on the Opendatabay data marketplace.

£749

About

Zalingo Synthetic Finance — Credit Risk Signals — 100k Focused (Parquet)
A 100,000-row, privacy-safe focused sample for credit risk modelling and scorecard prototyping. The pack enriches core application data with bureau-like aggregates, affordability and stability flags, utilisation and enquiry features, and 6- and 12-month performance labels. It suits feature engineering, baseline PD models, and policy experiments without handling real consumer data (no PII).
Need larger volumes or notebooks? Upgrade to the Premium Evaluation Kit (~1M rows) or ask for a weekly/daily enterprise feed.

Dataset Features (representative)

  • Application: application_id, ts_utc, channel (online | branch | partner), product, amount_requested, term_months, purpose
  • Demographic/Economic (synthetic): employment_status, employment_tenure_m, income_monthly, housing_status, address_age_m
  • Affordability & Stability: dti, debt_service_ratio, affordability_flag, stability_score
  • Bureau-like Signals (synthetic): credit_history_length_m, prior_defaults_ct, late_payments_12m, enquiries_90d, utilisation_ratio, limit_total, balance_total
  • Segmentation & Scores: segment_bucket (prime | near-prime | subprime), pd_score_0_1
  • Decision & Offer: approval_flag (0/1), decision_code, offer_apr, offer_limit
  • Performance Labels: outcome_default_6m (0/1), outcome_default_12m (0/1)

(Columns may vary slightly; see the included preview for exact schema.)
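
Because the exact columns may vary, a quick header check against the preview file can catch mismatches before any modelling starts. The sketch below is a minimal example using only the Python standard library. The `EXPECTED_COLUMNS` set is an assumption taken from the representative list above, and `check_header` is a hypothetical helper, not part of the pack.

```python
import csv
import io

# Columns named in the listing. The shipped schema may differ slightly,
# so treat this set as an assumption and adjust it after opening the preview.
EXPECTED_COLUMNS = {
    "application_id", "ts_utc", "amount_requested", "term_months",
    "income_monthly", "dti", "utilisation_ratio", "segment_bucket",
    "pd_score_0_1", "approval_flag",
    "outcome_default_6m", "outcome_default_12m",
}


def check_header(csv_file, expected=EXPECTED_COLUMNS):
    """Return (missing, extra) column names for an open CSV file object."""
    header = next(csv.reader(csv_file), [])
    found = {name.strip() for name in header}
    return sorted(expected - found), sorted(found - expected)


# Hypothetical two-column header standing in for sample_100.csv.
demo = io.StringIO("application_id,ts_utc\n")
missing, extra = check_header(demo)
print("missing:", len(missing), "extra:", extra)
```

To run it on the real preview, open the file with `open("sample_100.csv", newline="")` and pass the file object to `check_header`.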

Distribution

  • Format: ZIP containing Parquet data (100k_sample.parquet), sample_100.csv (preview), and schema.json
  • Volume: 100,000 rows, ~24–36 columns
  • Approx Size: 3–6 MB zipped (mix-dependent)
  • Structure: Single Parquet (or a few shards); schema stable across focused credit packs
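
Before loading anything, it can help to confirm the download contains the files the listing names. The following is a rough sketch using the standard `zipfile` module. `missing_members` is a hypothetical helper, and the demo archive holds placeholder contents rather than real pack data. If the Parquet data ships as several shards, the expected names would need adjusting.

```python
import io
import zipfile

# File names as stated in the listing.
EXPECTED_MEMBERS = {"100k_sample.parquet", "sample_100.csv", "schema.json"}


def missing_members(zip_source, expected=EXPECTED_MEMBERS):
    """Return the expected file names that are absent from the archive.

    Only base names are compared, since the archive may nest files in a folder.
    """
    with zipfile.ZipFile(zip_source) as zf:
        base_names = {name.rsplit("/", 1)[-1] for name in zf.namelist()}
    return sorted(expected - base_names)


# Demo on an in-memory archive with placeholder contents (not real data).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("pack/sample_100.csv", "application_id\n")
    zf.writestr("pack/schema.json", "{}")
buf.seek(0)
print(missing_members(buf))
```

`missing_members` also accepts a path to the downloaded ZIP on disk, because `zipfile.ZipFile` takes either a path or a file-like object.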

Usage

  • PD / scorecard baselines — feature engineering, binning, threshold tuning
  • Underwriting policy experiments — approval strategy & offer simulations
  • Portfolio analytics — cohorting, delinquency/roll-rate diagnostics
  • MLOps QA — schema contracts, drift monitors, dashboard demos
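
As one illustration of threshold tuning, the sketch below sweeps an approval cutoff over `pd_score_0_1` and reports the approval rate and the 12-month default rate among approved applications. It assumes that a higher `pd_score_0_1` means higher risk, which the listing does not state. `threshold_report` is a hypothetical helper, and the toy rows are invented for the example rather than taken from the dataset.

```python
def threshold_report(rows, cutoff, label="outcome_default_12m"):
    """Approve rows with pd_score_0_1 below cutoff and summarise the result.

    Assumes a higher pd_score_0_1 means higher risk (not confirmed by the listing).
    """
    approved = [r for r in rows if float(r["pd_score_0_1"]) < cutoff]
    approval_rate = len(approved) / len(rows) if rows else 0.0
    defaults = sum(int(r[label]) for r in approved)
    default_rate = defaults / len(approved) if approved else 0.0
    return {
        "cutoff": cutoff,
        "approval_rate": approval_rate,
        "default_rate": default_rate,
    }


# Toy rows invented for illustration; values are strings, as csv.DictReader yields.
toy = [
    {"pd_score_0_1": "0.02", "outcome_default_12m": "0"},
    {"pd_score_0_1": "0.08", "outcome_default_12m": "0"},
    {"pd_score_0_1": "0.15", "outcome_default_12m": "1"},
    {"pd_score_0_1": "0.40", "outcome_default_12m": "1"},
]

for cutoff in (0.05, 0.10, 0.20):
    print(threshold_report(toy, cutoff))
```

Rows read from the preview CSV with `csv.DictReader` can be passed in directly, since the helper converts the string values itself.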

Coverage

  • Geographic: Multi-country synthetic coverage (ISO codes)
  • Time Range: Recent multi-year synthetic window with weekly/seasonal patterns
  • PII: None — fully synthetic; not re-identifiable

Who Can Use It

  • Risk/Data Science — PD modelling & feature stores
  • Underwriting/FinOps — policy design & KPI diagnostics
  • Product/Analytics — approval, offer, and performance sandboxes
  • Vendors/SIs — demo environments & connector validation

Notes / Disclaimers

  • Synthetic data; not real consumer applications or bureau data.
  • Not for production credit decisions. Distributions and labels are synthetic and calibrated; they do not represent any specific lender/bureau.

Listing Stats

  • Views: 4
  • Downloads: 0
  • Listed: 12/09/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0