Zalingo Synthetic Finance, Card-Not-Present Fraud & Chargebacks
Synthetic Tabular Data
£749
About
Zalingo Synthetic Finance — Card-Not-Present Fraud & Chargebacks — 100k Focused Sample (Parquet)
A 100,000-row privacy-safe focused sample of e-commerce / CNP transactions with fraud & chargeback labels. Built for rapid feature engineering, rule testing, and model baselines without handling real cardholder data (no PII).
Need larger volumes or scheduled refresh? After purchasing this focused sample, message us about enterprise bundles and monthly/weekly/daily subscriptions (CNP only or mixed).
Dataset Features (representative)
- transaction_id, account_id — Synthetic identifiers (non-linkable).
- ts_utc, amount, currency — Event time, value, ISO-4217 code.
- cnp_flag, channel — CNP=1; ecommerce | wallet | mail/phone.
- merchant_id, merchant_country, mcc, category — Merchant context.
- device_fingerprint, ip_country, user_agent — Client hints (synthetic).
- three_ds_result, avs_result, cvv_result — Auth signals (categorical).
- auth_result, decline_reason_code — Approved/Declined + reason.
- velocity_txn_1h / 24h / 7d — Synthetic velocity aggregates.
- distance_km_billing_shipping — Geo-consistency proxy.
- first_time_merchant_flag, recurring_flag, coupon_used — Behavioural context.
- fraud_label (0/1), risk_score_0_1 — Labels + calibrated score.
- chargeback_flag (0/1), chargeback_reason_code — Post-event outcome.
(Exact columns may vary slightly; see the included preview for this sample's schema.)
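Because the exact columns may vary, a quick comparison against the names you depend on can catch surprises before any modelling starts. The following is a minimal sketch using only the standard library. The `EXPECTED_COLUMNS` set is an assumption built from a subset of the representative names listed above, and `compare_schema` is a hypothetical helper, not part of the product.

```python
# Representative column names copied from the listing above. The real
# sample may differ, so treat this set as an assumption to adjust.
EXPECTED_COLUMNS = {
    "transaction_id", "account_id", "ts_utc", "amount", "currency",
    "cnp_flag", "channel", "merchant_id", "merchant_country", "mcc",
    "fraud_label", "risk_score_0_1", "chargeback_flag",
}


def compare_schema(actual_columns, expected=EXPECTED_COLUMNS):
    """Return (missing, unexpected) column names as sorted lists."""
    actual = set(actual_columns)
    missing = sorted(set(expected) - actual)
    unexpected = sorted(actual - set(expected))
    return missing, unexpected


if __name__ == "__main__":
    # Hypothetical header standing in for the columns of a loaded shard.
    header = ["transaction_id", "ts_utc", "amount", "fraud_label", "extra_col"]
    missing, unexpected = compare_schema(header)
    print("missing:", missing)
    print("unexpected:", unexpected)
```

In practice the `header` list would come from the loaded data (for example, the column names of a DataFrame) rather than being typed by hand.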
Distribution
- Format: ZIP with Parquet shards (Snappy) + README.
- Volume: 100,000 rows, ~20–30 columns, 1–5 parts.
- Size: ~3–6 MB zipped.
- Schema stability: Consistent across focused CNP bundles; full datasets partitioned by date/merchant/MCC.
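Since the sample ships as a ZIP of Parquet shards, one way to load it is to read each shard straight from the archive and concatenate the results. This is a sketch under stated assumptions: the shard files are assumed to end in `.parquet`, the helper names are hypothetical, and `load_shards` needs pandas with a Parquet engine such as pyarrow installed.

```python
import io
import zipfile


def list_parquet_members(zip_path):
    """Return the names of .parquet files inside the archive, sorted."""
    with zipfile.ZipFile(zip_path) as zf:
        return sorted(n for n in zf.namelist() if n.lower().endswith(".parquet"))


def load_shards(zip_path):
    """Read every Parquet shard in the archive into one DataFrame."""
    import pandas as pd  # imported lazily; requires pandas plus pyarrow or fastparquet

    names = list_parquet_members(zip_path)
    if not names:
        raise ValueError("no Parquet shards found in archive")

    frames = []
    with zipfile.ZipFile(zip_path) as zf:
        for name in names:
            # Read the shard's bytes into memory and let pandas parse them.
            frames.append(pd.read_parquet(io.BytesIO(zf.read(name))))
    return pd.concat(frames, ignore_index=True)
```

A call such as `df = load_shards("sample.zip")` (the filename here is a placeholder) should give a single table; at 100,000 rows it fits comfortably in memory.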
Usage
- Fraud/Risk modelling: baselines, uplift vs rule-sets, threshold tuning.
- Authorization optimisation: AVS/3DS policy experiments.
- Scenario testing: velocity, geo-mismatch, first-use patterns.
- MLOps QA: schema/drift checks, dashboard demos.
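As an illustration of the threshold-tuning use case, the sketch below computes precision and recall when rows are flagged at or above a score cut-off. It uses only the standard library. The function name is hypothetical, and the tiny input lists are made-up stand-ins for the `risk_score_0_1` and `fraud_label` columns, not values from the dataset.

```python
def precision_recall_at(scores, labels, threshold):
    """Precision and recall when flagging rows with score >= threshold."""
    tp = fp = fn = 0
    for score, label in zip(scores, labels):
        flagged = score >= threshold
        if flagged and label == 1:
            tp += 1  # flagged and actually fraud
        elif flagged:
            fp += 1  # flagged but legitimate
        elif label == 1:
            fn += 1  # fraud that slipped under the threshold
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    # Made-up values standing in for risk_score_0_1 and fraud_label.
    scores = [0.05, 0.20, 0.55, 0.70, 0.90, 0.95]
    labels = [0, 0, 1, 0, 1, 1]
    for threshold in (0.3, 0.5, 0.8):
        p, r = precision_recall_at(scores, labels, threshold)
        print(f"threshold={threshold:.1f} precision={p:.2f} recall={r:.2f}")
```

Sweeping the threshold this way shows the usual trade-off: a higher cut-off tends to raise precision and lower recall. Because the labels are synthetic, the resulting numbers are suitable for comparing approaches, not for estimating real loss rates.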
Coverage
- Geographic: Multi-country synthetic coverage (ISO codes).
- Time Range: Recent multi-year synthetic window with weekly/seasonal patterns.
- PII: None — fully synthetic; not re-identifiable.
License
Proprietary — internal use; no redistribution/resale.
Who Can Use It
- Risk/Data Science — feature engineering & model iteration.
- Payments/FinOps — auth strategy & loss-rate simulations.
- Product/Analytics — KPI sandboxes without sensitive data.
Notes / Disclaimers
- Not real cardholder data. Not for production credit decisions.
- Labels and rates follow synthetic, calibrated distributions and do not represent any specific issuer/acquirer.