Opendatabay APP

Zalingo Synthetic Finance Transactions — 100k Sample (Parquet)

Synthetic Tabular Data

Tags and Keywords

Synthetic Data, Tabular, Financial Transactions, Payments, Merchant Category, Card, Time Series, Parquet, CSV, PII-safe, Anonymised, Machine Learning, Feature Engineering, Fraud Detection, Forecasting, Sample, 100k Rows, AWS S3, Behavioural

£249

About

This listing provides a 100,000-row, privacy-safe sample of synthetic transactions generated by Zalingo’s data refinery. It mimics realistic card and wallet spend patterns (amounts, currencies, merchant types, channels, geographies) without using any real person’s data. It is suited to prototyping ML features, pipeline testing, benchmarking, training demos, and analytics evaluations.
Looking for the full dataset or a monthly refresh? Message us after purchasing this sample; enterprise-scale bundles and subscriptions are available.

Dataset Features (representative)

  • transaction_id — Synthetic unique ID per event.
  • account_id — Synthetic payer/account identifier.
  • ts_utc — Event timestamp (ISO 8601, UTC).
  • amount — Numeric amount (float).
  • currency — ISO-4217 code (e.g., USD, ZAR, GBP).
  • merchant_category — High-level merchant type (e.g., groceries, fuel).
  • mcc — Merchant Category Code (synthetic 4-digit).
  • merchant_country — ISO-3166-1 alpha-2.
  • city — City name (synthetic).
  • channel — pos | ecommerce | atm | transfer.
  • payment_method — debit | credit | wallet | bank.
  • auth_result — approved | declined (synthetic logic).

Fields can vary slightly between bundles; see the included preview CSV for the exact columns in this sample.
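
The field list above can be turned into a quick sanity check before the data enters a pipeline. The following is a minimal sketch, not part of the dataset's README: it assumes each record has been loaded as a Python dict keyed by the column names listed above, and that `ts_utc` is an ISO 8601 string (in the Parquet files it may instead be a native timestamp type). The function name `validate_row` is hypothetical.

```python
from datetime import datetime

# Allowed values copied from the field list in this listing.
CHANNELS = {"pos", "ecommerce", "atm", "transfer"}
PAYMENT_METHODS = {"debit", "credit", "wallet", "bank"}
AUTH_RESULTS = {"approved", "declined"}


def validate_row(row: dict) -> list:
    """Return the names of fields that fail a basic check (empty list if none)."""
    problems = []
    if row.get("channel") not in CHANNELS:
        problems.append("channel")
    if row.get("payment_method") not in PAYMENT_METHODS:
        problems.append("payment_method")
    if row.get("auth_result") not in AUTH_RESULTS:
        problems.append("auth_result")

    # ISO 4217 codes are three upper-case ASCII letters.
    currency = str(row.get("currency", ""))
    if not (len(currency) == 3 and currency.isascii()
            and currency.isalpha() and currency.isupper()):
        problems.append("currency")

    # The listing describes mcc as a synthetic 4-digit code.
    mcc = str(row.get("mcc", ""))
    if not (len(mcc) == 4 and mcc.isascii() and mcc.isdigit()):
        problems.append("mcc")

    # Assumption: ts_utc is an ISO 8601 string. A trailing "Z" is rewritten
    # because datetime.fromisoformat rejects it before Python 3.11.
    try:
        datetime.fromisoformat(str(row.get("ts_utc", "")).replace("Z", "+00:00"))
    except ValueError:
        problems.append("ts_utc")
    return problems
```

Because columns can vary between bundles, a check like this is best driven by the columns actually present in the preview CSV rather than by this list alone.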

Distribution

  • Format: ZIP containing Parquet files (Snappy) + README.
  • Data Volume: 100,000 rows; ~10–18 columns; 1–5 Parquet shards.
  • File Size: ~3–5 MB zipped (category-dependent).
  • Schema Stability: Column names/types are stable across finance samples; full datasets are partitioned by date and merchant attributes.
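
Since the download is a ZIP holding one to five Parquet shards, a first step is usually to find the shards inside the archive. The sketch below uses only the Python standard library; the helper names `list_parquet_shards` and `read_shard_bytes` are hypothetical, and the internal file names of the archive are not documented in this listing, so the code only filters on the `.parquet` extension.

```python
import io
import zipfile


def list_parquet_shards(archive) -> list:
    """Return sorted names of .parquet members in a ZIP (path or file-like object)."""
    with zipfile.ZipFile(archive) as zf:
        return sorted(
            name for name in zf.namelist() if name.lower().endswith(".parquet")
        )


def read_shard_bytes(archive, member: str) -> io.BytesIO:
    """Load one shard into memory so a Parquet reader can consume it."""
    with zipfile.ZipFile(archive) as zf:
        return io.BytesIO(zf.read(member))


# Optional, assuming pandas and a Parquet engine such as pyarrow are installed:
#   import pandas as pd
#   shards = list_parquet_shards("sample.zip")  # "sample.zip" is a placeholder path
#   df = pd.concat(
#       [pd.read_parquet(read_shard_bytes("sample.zip", m)) for m in shards],
#       ignore_index=True,
#   )
```

Reading shards from memory avoids unpacking the archive to disk, which is reasonable at the stated size of a few megabytes.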

Usage

  • Feature engineering & model prototyping (fraud/risk, segmentation, CLV).
  • Anomaly/quality tests for pipelines and dashboards.
  • Time-series forecasting (daily/weekly spend patterns).
  • Education & enablement (reproducible ML exercises without PII).
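
Two of the uses above, daily spend series and fraud/risk features, can be sketched with the standard library alone. This is an illustration under stated assumptions, not a recommended modelling approach: records are assumed to be dicts using the column names from the feature list, `ts_utc` is assumed to be an ISO 8601 string, and excluding declined transactions from spend is a choice made here, not something the listing prescribes. Both function names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def daily_spend(rows):
    """Sum approved amounts per (UTC date, currency)."""
    totals = defaultdict(float)
    for row in rows:
        if row["auth_result"] != "approved":
            continue  # assumption: declined transactions are not counted as spend
        ts = datetime.fromisoformat(row["ts_utc"].replace("Z", "+00:00"))
        # Amounts are kept per currency; summing across currencies would mix units.
        totals[(ts.date().isoformat(), row["currency"])] += row["amount"]
    return dict(totals)


def decline_rate_by_account(rows):
    """Share of each account's transactions that were declined."""
    counts = defaultdict(lambda: [0, 0])  # account_id -> [declined, total]
    for row in rows:
        tally = counts[row["account_id"]]
        tally[0] += row["auth_result"] == "declined"
        tally[1] += 1
    return {account: declined / total for account, (declined, total) in counts.items()}
```

The daily totals can feed a forecasting baseline, and the per-account decline rate is one example of a feature that a fraud or risk prototype might start from.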

Coverage

  • Geographic: Multi-country synthetic coverage with ISO country codes.
  • Time Range: Synthetic timestamps spanning a recent multi-year window (not tied to real-world events).
  • PII: No PII. 100% synthetic; not re-identifiable.

License

Proprietary — purchase grants internal use rights; redistribution/resale not permitted.

Who Can Use It

  • Data Scientists/ML Engineers — rapid model iteration.
  • Researchers/Analysts — benchmarking and hypothesis testing.
  • Product & Risk Teams — experimentation without live data exposure.

Listing Stats

  • Views: 1
  • Downloads: 0
  • Listed: 08/09/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0