Zalingo Synthetic Finance — Market Ticks Premium Evaluation 1M Rows
Synthetic Tabular Data
£2,449
About
Zalingo Synthetic Finance — Market Ticks Premium Evaluation Kit — ~1M Rows + Notebooks
A premium, end-to-end kit for tick-level analytics, factor prototyping, and microstructure research. You get ~1,000,000 privacy-safe synthetic Level-1 ticks across multiple symbols/venues with derived microstructure signals (spreads, imbalance, short-horizon returns, realised volatility), plus Jupyter notebooks and a data dictionary. Teams can benchmark pipelines and models quickly without licensing exchange data, and the data contains no PII.
Need production-scale feeds? After purchase, ask about enterprise bundles (tens of millions of rows) and weekly/daily refresh subscriptions delivered via S3/API.
What’s Inside (kit contents)
- Data (Parquet, Snappy): ~1,000,000 rows, multi-symbol/venue; includes core L1 and derived signals.
- Notebooks (.ipynb):
  - EDA & Data Quality: schema checks, seasonality & drift probes
  - Microstructure & Signals: mid-price, spreads, imbalance, RV, short-horizon returns
  - Baseline Factors/Backtests: simple factors, slippage assumptions, evaluation metrics
- Docs: Data dictionary, column glossary, sampling notes, quick-start.
- Schema: JSON schema + example queries for Parquet readers (Spark/Pandas/Polars).
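As an illustration of how a schema contract might be used for ingestion checks, here is a minimal, library-free sketch. The `EXPECTED` mapping and the `check_row` helper are hypothetical and written for this example only; the JSON schema shipped in /schema may be structured differently, and the Parquet column types may not match the Python types assumed here.

```python
# Hypothetical contract: column name -> expected Python type.
# The real JSON schema in /schema may be laid out differently.
EXPECTED = {
    "symbol": str,
    "ts_utc": str,      # assumed ISO-8601 string; may be a timestamp type in Parquet
    "bid": float,
    "ask": float,
    "bid_size": int,
    "ask_size": int,
    "venue": str,
}


def check_row(row: dict) -> list:
    """Return a list of contract violations for one record (empty if clean)."""
    problems = []
    for col, typ in EXPECTED.items():
        if col not in row:
            problems.append(f"missing column: {col}")
        elif not isinstance(row[col], typ):
            problems.append(
                f"{col}: expected {typ.__name__}, got {type(row[col]).__name__}"
            )
    # Sanity rule for Level-1 quotes: the book should not be crossed.
    if isinstance(row.get("bid"), float) and isinstance(row.get("ask"), float):
        if row["bid"] > row["ask"]:
            problems.append("crossed quote: bid > ask")
    return problems
```

Running a check like this over a sample of rows before loading the full dataset is one way to catch schema drift early.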
Dataset Features (representative)
- Core L1: symbol, ts_utc, last_price, bid, ask, bid_size, ask_size, trade_size, trade_cond, venue, currency
- Derived/Signals: mid_price, spread_bp, quote_imbalance ((bid_size − ask_size)/(bid_size + ask_size)), ret_1s, ret_5s, ret_60s, rv_1m (realised volatility), micro_price, roll_mean_1m, roll_std_1m, vol_ewm_1m, trade_imbalance_1m
(Columns may vary slightly; see the included dictionary + preview for exact schema.)
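To make the quote-level signals concrete, here is a minimal sketch of how they could be derived from a single Level-1 quote. Only the quote_imbalance formula comes from this listing. The other definitions are common conventions assumed for illustration (mid as the bid/ask average, spread in basis points of the mid, micro-price as the size-weighted mid), and the `quote_signals` helper is hypothetical; the data dictionary in the kit is the reference for the actual definitions.

```python
def quote_signals(bid: float, ask: float, bid_size: float, ask_size: float) -> dict:
    """Derive common Level-1 signals from one quote (assumed conventions)."""
    depth = bid_size + ask_size
    if depth <= 0:
        raise ValueError("total quoted size must be positive")
    mid = (bid + ask) / 2.0
    return {
        "mid_price": mid,
        # Spread expressed in basis points of the mid (assumed convention).
        "spread_bp": (ask - bid) / mid * 1e4,
        # Formula as given in the feature list.
        "quote_imbalance": (bid_size - ask_size) / depth,
        # Size-weighted mid (assumed definition): heavier bid size
        # pulls the estimate towards the ask.
        "micro_price": (bid * ask_size + ask * bid_size) / depth,
    }
```

For example, `quote_signals(99.0, 101.0, 300, 100)` gives a mid of 100.0, an imbalance of 0.5, and a micro-price above the mid, reflecting the heavier bid side.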
Distribution
- Format: ZIP containing Parquet data, /notebooks, /docs, /schema
- Volume: ~1,000,000 rows, ~15–30 columns, multi-part Parquet
- Approx Size: 50–120 MB zipped (symbol/venue mix dependent)
- Partitioning: by trade_date/symbol/venue for efficient reads
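Partitioning helps because a reader can skip files whose partition values do not match a query. The sketch below illustrates the idea with the standard library only. It assumes a hive-style directory layout (trade_date=.../symbol=.../venue=...), which this listing does not confirm; the example paths and the `partition_keys` and `select_files` helpers are hypothetical.

```python
from pathlib import PurePosixPath


def partition_keys(path: str) -> dict:
    """Extract key=value directory segments from a hive-style path."""
    keys = {}
    for part in PurePosixPath(path).parts:
        if "=" in part:
            key, value = part.split("=", 1)
            keys[key] = value
    return keys


def select_files(paths, **wanted):
    """Keep only files whose partition keys match every requested value."""
    return [
        p for p in paths
        if all(partition_keys(p).get(k) == v for k, v in wanted.items())
    ]
```

In practice, Parquet readers such as Spark, pandas (via pyarrow), and Polars can usually apply this kind of partition pruning themselves when given a filter, so a helper like this is mainly useful for understanding or debugging the layout.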
Usage
- Factor research & prototyping — spreads, imbalance, short-horizon returns, RV
- Backtest scaffolding — pipelines, sanity checks, evaluation metrics
- Pipelines & MLOps — readers, schema contracts, drift monitors
- Education & demos — charts, dashboards, tutorial notebooks
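For the factor-research use case, short-horizon returns and realised volatility (RV) can be sketched as follows. This assumes regularly sampled prices (for example, one mid-price per second), log returns, and RV defined as the square root of the sum of squared one-step log returns over a trailing window. These are common conventions, not necessarily the ones used to build ret_1s, ret_5s, ret_60s, or rv_1m in the kit; the function names are hypothetical.

```python
import math


def log_returns(prices, horizon=1):
    """Log return over `horizon` steps; None where history is insufficient."""
    out = [None] * len(prices)
    for i in range(horizon, len(prices)):
        out[i] = math.log(prices[i] / prices[i - horizon])
    return out


def realised_vol(prices, window=60):
    """Sqrt of the sum of squared 1-step log returns over the trailing window."""
    r = log_returns(prices, 1)
    out = [None] * len(prices)
    for i in range(window, len(prices)):
        # r[0] is None, so the first full window ends at index `window`.
        out[i] = math.sqrt(sum(x * x for x in r[i - window + 1 : i + 1]))
    return out
```

With one-second samples, `log_returns(prices, 5)` would correspond to a 5-second return and `realised_vol(prices, 60)` to a one-minute RV under these assumptions. Real tick data is irregularly spaced, so a resampling step would normally come first.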
Coverage
- Symbols/Venues: Multi-symbol synthetic coverage with venue tags
- Time Range: Recent synthetic window with intraday seasonality
- PII: None — fully synthetic; not re-identifiable
Who Can Use It
- Quants/Data Scientists — microstructure signals & factor baselines
- Data/Platform Engineers — ingestion, validation, contract testing
- Product/Analytics — dashboards and KPI sandboxes
- Vendors/SIs — demo environments & connector validation
Notes / Disclaimers
- Synthetic market data; not sourced from any exchange/venue.
- Not for live trading or production risk decisions. Distributions are synthetic and do not represent any specific market.
Evaluation License (Non-Production, Internal Use Only)
Buyer is granted a non-exclusive, non-transferable license to use the data and included assets solely for internal evaluation, prototyping, and testing for 90 days from purchase. No production use, external distribution, resale, sublicensing, or sharing beyond Buyer’s employees and on-site contractors under NDA. Derived models/features may be retained for internal research; production deployment requires a separate enterprise license. All materials are provided “as is” without warranties; liability limited to the amount paid.
- Price Justification / Value: The kit bundles data, notebooks, and docs for faster time-to-value, avoids exchange-data licensing constraints, and provides calibrated synthetic signals for realistic benchmarks.
- Support & SLA: Email support with 1 business-day response; fixes for material schema/data issues within 5 business days; upgrade credits available if you move to an enterprise plan within 60 days.