Zalingo Synthetic Finance — Fraud & Chargebacks (CNP) — 100k
Synthetic Tabular Data
£749
About
Zalingo Synthetic Finance — Fraud & Chargebacks (CNP) — 100k Focused Signals (Parquet)
A 100,000-row privacy-safe focused sample for card-not-present (CNP) fraud & chargebacks. This pack enriches core transactions with authorization signals (3DS/AVS/CVV), device/IP/geo consistency, and multi-window velocity features—ideal for feature engineering, rule testing, cost-curve tuning, and baseline model development, without handling any real cardholder data (no PII).
Need more volume or notebooks? Upgrade to the Premium Evaluation Kit (~1M rows) or ask for a weekly/daily enterprise feed.
Dataset Features (representative)
- Core: transaction_id, account_id, ts_utc, amount, currency, channel (ecommerce | wallet | mail/phone), merchant_id, mcc, merchant_country, user_agent
- Auth Signals: three_ds_result, avs_result, cvv_result, auth_result, decline_reason_code
- Device/IP/Geo: device_fingerprint, ip_country, distance_km_billing_shipping, first_time_merchant_flag, recurring_flag, coupon_used
- Velocity (enriched): txn_ct_15m/1h/24h/7d, amount_sum_1h/24h, unique_merchant_ct_7d
- Labels & Scores: fraud_label (0/1), chargeback_flag (0/1), risk_score_0_1
(Columns may vary slightly; see the included preview for exact schema.)
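To illustrate what a multi-window velocity feature such as txn_ct_15m represents, here is a minimal sketch that counts each account's earlier transactions inside a trailing window. It is not the vendor's method: the function name trailing_txn_counts, the sample rows, and the choice to exclude the current transaction from its own count are all assumptions made for illustration.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def trailing_txn_counts(rows, window):
    """Count each account's prior transactions inside a trailing time window.

    rows: iterable of (account_id, ts) tuples sorted by ts ascending.
    Returns one count per row. The current transaction is excluded
    from its own count (an assumption; the pack may define it differently).
    """
    recent = defaultdict(deque)  # account_id -> timestamps still in the window
    counts = []
    for account_id, ts in rows:
        q = recent[account_id]
        # Drop timestamps that have aged out of the window.
        while q and ts - q[0] > window:
            q.popleft()
        counts.append(len(q))
        q.append(ts)
    return counts


# Made-up rows for illustration only.
t0 = datetime(2024, 1, 1, 12, 0)
rows = [
    ("acct_1", t0),
    ("acct_1", t0 + timedelta(minutes=5)),
    ("acct_2", t0 + timedelta(minutes=6)),
    ("acct_1", t0 + timedelta(minutes=30)),
]
counts_15m = trailing_txn_counts(rows, timedelta(minutes=15))
counts_1h = trailing_txn_counts(rows, timedelta(hours=1))
print(counts_15m, counts_1h)
```

Recomputing a feature this way and comparing it with the shipped column is one way to check how the windows in the pack are actually defined.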
Distribution
- Format: ZIP containing Parquet data (100k_sample.parquet), sample_100.csv (preview), and schema.json
- Volume: 100,000 rows, ~22–35 columns
- Approx Size: 3–6 MB zipped (mix-dependent)
- Structure: Single Parquet (or a few shards); schema stable across focused fraud packs
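Because the exact schema can vary between packs, a quick header check on the preview CSV can catch missing columns before a pipeline depends on them. The sketch below uses only the Python standard library. The helper missing_columns and the EXPECTED list are hypothetical, and the inline text stands in for the real sample_100.csv, whose contents are not reproduced here.

```python
import csv
import io


def missing_columns(csv_text_file, expected):
    """Return the expected column names that are absent from the CSV header."""
    header = next(csv.reader(csv_text_file), [])
    present = {name.strip() for name in header}
    return [name for name in expected if name not in present]


# A subset of the listed columns; the shipped schema may differ slightly.
EXPECTED = ["transaction_id", "account_id", "ts_utc", "amount", "fraud_label"]

# Stand-in for open("sample_100.csv", newline=""); the row values are made up.
preview = io.StringIO(
    "transaction_id,account_id,ts_utc,amount,currency\n"
    "t1,a1,2024-01-01T00:00:00Z,10.0,GBP\n"
)
missing = missing_columns(preview, EXPECTED)
print(missing)
```

The same idea extends to the Parquet file with any Parquet reader; the CSV preview is used here only because it needs no third-party library.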
Usage
- Fraud/Risk modelling: baseline models, feature ablations, threshold & policy tuning
- Authorization optimisation: AVS/3DS strategy experiments and trade-off analysis
- Scenario testing: velocity, geo-mismatch, first-use & recurring patterns
- MLOps QA: schema contracts, drift monitors, dashboard demos
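As one example of threshold and policy tuning, the sketch below sweeps candidate cut-offs on risk_score_0_1 and picks the one with the lowest total cost. The cost model is an assumption made for illustration: missed fraud costs its full amount, and each blocked legitimate transaction costs a flat, hypothetical review_cost. The rows are made up and do not come from the dataset.

```python
def cost_at_threshold(rows, threshold, review_cost=2.0):
    """Total cost of blocking every transaction scored at or above threshold.

    rows: iterable of (risk_score, fraud_label, amount).
    Missed fraud costs its full amount; each blocked legitimate
    transaction costs a flat review_cost (hypothetical value).
    """
    total = 0.0
    for score, is_fraud, amount in rows:
        blocked = score >= threshold
        if is_fraud and not blocked:
            total += amount        # fraud that slipped through
        elif blocked and not is_fraud:
            total += review_cost   # false positive
    return total


def best_threshold(rows, candidates):
    """Pick the candidate threshold with the lowest total cost."""
    rows = list(rows)
    return min(candidates, key=lambda t: cost_at_threshold(rows, t))


# Made-up (risk_score, fraud_label, amount) rows for illustration only.
rows = [
    (0.95, 1, 120.0),
    (0.80, 0, 40.0),
    (0.60, 1, 75.0),
    (0.30, 0, 15.0),
    (0.10, 0, 22.0),
]
candidates = [0.2, 0.5, 0.7, 0.9]
chosen = best_threshold(rows, candidates)
print(chosen, cost_at_threshold(rows, chosen))
```

Plotting cost_at_threshold across the candidates gives a simple cost curve; a real policy would likely also weigh chargeback fees and lost revenue from declined good customers.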
Coverage
- Geographic: Multi-country synthetic coverage (ISO codes)
- Time Range: Recent multi-year synthetic window with weekly/seasonal patterns
- PII: None — fully synthetic; not re-identifiable
Who Can Use It
- Risk/Data Science — feature engineering & model iteration
- Payments/FinOps — authorization strategy & loss-rate diagnostics
- Product/Analytics — KPI sandboxes & experiment design
- Vendors/SIs — demo environments & connector validation
Notes / Disclaimers
- Not real cardholder data. Not for production credit decisions.
- Rates, labels, and distributions are synthetic and calibrated; they do not represent any specific issuer/acquirer/PSP.