Zalingo Synthetic User Behaviour (Web/Mobile/Omni-Channel Events)
Synthetic Tabular Data
£249
About
Zalingo Synthetic User Behaviour (Web/Mobile/Omni-Channel Events) — 100k Sample (Parquet)
A 100,000-row, privacy-safe synthetic clickstream and engagement dataset spanning web, mobile, and omni-channel interactions. It mimics realistic journeys (sessions, events, funnels, attribution, conversions) with a stable schema. It contains no PII and is not derived from real individuals. Use it to prototype features, validate pipelines, benchmark models, and demo analytics without data-access hurdles.
Need larger volumes or scheduled refresh? After purchasing this sample, message us about enterprise-scale bundles and monthly/weekly/daily subscriptions.
Dataset Features (representative)
- event_id — Synthetic unique event ID.
- ts_utc — Event timestamp (ISO-8601, UTC).
- user_id / session_id — Synthetic identifiers (non-linkable).
- channel — web | mobile | email | ads | in-store-kiosk | support.
- event_name — page_view | screen_view | click | search | add_to_cart | purchase | signup | unsubscribe | support_chat, etc.
- page_url / screen_name — Normalised location for web/mobile views.
- referrer / utm_source / utm_medium / utm_campaign — Synthetic attribution fields.
- device_os / app_version / browser / screen_resolution — Client context (when applicable).
- geo_country / geo_city — ISO 3166-1 alpha-2 country code + synthetic city name.
- dwell_time_ms / scroll_depth_pct — Engagement signals for views.
- click_x / click_y — Normalised coordinates for click events (optional).
- search_query / item_sku / item_category — When search or commerce events occur.
- cart_value / purchase_value / currency — Commerce signals for conversion events.
- conversion_flag / conversion_type — 0/1 and label (purchase, signup, etc.).
- session_duration_s / n_events_session / bounce_flag — Derived session metrics.
- cohort_week / cohort_source — Synthetic cohort tags for growth analysis.
- consent_flag — Synthetic consent snapshot (yes/no) for demo compliance logic.
(Exact columns may vary slightly by bundle; see the included preview CSV for this sample’s schema.)
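A few of the field descriptions above translate directly into row-level checks. The following is a minimal sketch, assuming each event is available as a plain dict keyed by the column names listed; the helper names (parse_ts_utc, validate_event) are hypothetical, and the allowed channel values are copied from the list above and may not be exhaustive for every bundle.

```python
from datetime import datetime

# Allowed values copied from the field list above; a given bundle may contain more.
CHANNELS = {"web", "mobile", "email", "ads", "in-store-kiosk", "support"}


def parse_ts_utc(value: str) -> datetime:
    """Parse an ISO-8601 timestamp, accepting a trailing 'Z' for UTC."""
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


def validate_event(row: dict) -> list:
    """Return a list of human-readable problems found in one event row."""
    problems = []
    if not row.get("event_id"):
        problems.append("missing event_id")
    try:
        parse_ts_utc(str(row.get("ts_utc", "")))
    except ValueError:
        problems.append("ts_utc is not ISO-8601")
    if row.get("channel") not in CHANNELS:
        problems.append("unknown channel")
    if row.get("conversion_flag") not in (0, 1):
        problems.append("conversion_flag must be 0 or 1")
    return problems
```

An empty list means the row passed these four checks; anything else is a candidate for a pipeline QA report.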
Distribution
- Format: ZIP containing Parquet shards (Snappy) + README.
- Volume: 100,000 rows, ~18–30 columns, 1–5 Parquet parts.
- Approx Size: ~2–4 MB zipped (category-dependent).
- Schema Stability: Names/types consistent across behaviour samples; full datasets partition by event_date / channel / session_id.
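Given the ZIP-of-Parquet-shards layout described above, loading the sample might look like the sketch below. It assumes the shards carry a .parquet extension somewhere inside the archive (the listing does not state file names) and that pandas with a Parquet engine such as pyarrow is installed; both function names are hypothetical.

```python
import zipfile
from pathlib import Path


def extract_parquet_parts(zip_path, out_dir):
    """Unpack the archive and return the Parquet shard paths in sorted order."""
    out = Path(out_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out)
    # Assumption: shards end in ".parquet"; the README is ignored by this filter.
    return sorted(out.rglob("*.parquet"))


def load_events(parts):
    """Concatenate all shards into one DataFrame (needs pandas plus pyarrow)."""
    import pandas as pd  # imported lazily so the helper above works without it

    return pd.concat([pd.read_parquet(p) for p in parts], ignore_index=True)
```

With 1 to 5 parts and roughly 100,000 rows, reading everything into memory at once should be unproblematic on a typical laptop.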
Usage
- Funnels & cohorts — activation, retention, resurrection analysis.
- Attribution & marketing mix — UTM-based experiments without PII.
- Recommenders & ranking — click/view→add-to-cart→purchase paths.
- Churn/propensity models — engagement, bounce, session signals.
- A/B testing sandboxes — feature flag impact with synthetic journeys.
- Pipeline QA / MLOps — schema checks, drift tests, dashboard demos.
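As an illustration of the funnel use case, the sketch below counts how many sessions reach each step of a view, add-to-cart, purchase path in order. It is a simplified example under stated assumptions: events are dicts with session_id, ts_utc, and event_name keys; timestamps share one ISO-8601 UTC format so they sort correctly as strings; and the chosen steps are illustrative (mobile journeys may start with screen_view instead of page_view).

```python
from collections import defaultdict

# Illustrative funnel; swap in screen_view or signup as the analysis requires.
FUNNEL = ("page_view", "add_to_cart", "purchase")


def funnel_counts(events, steps=FUNNEL):
    """Count sessions that reached each step, requiring the steps in order."""
    by_session = defaultdict(list)
    for e in events:
        by_session[e["session_id"]].append((e["ts_utc"], e["event_name"]))

    reached = [0] * len(steps)
    for session_events in by_session.values():
        depth = 0
        # Walk the session chronologically and advance one step at a time.
        for _, name in sorted(session_events):
            if depth < len(steps) and name == steps[depth]:
                depth += 1
        for i in range(depth):
            reached[i] += 1
    return dict(zip(steps, reached))
```

Dividing each count by the previous one gives step-to-step conversion rates; a purchase that occurs before any view in the same session is deliberately not counted.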
Coverage
- Geographic: Multi-country synthetic coverage with ISO country codes.
- Time Range: Recent multi-year synthetic window reflecting weekly/seasonal patterns.
- Demographics: Optional synthetic aggregates only (no PII).
- PII: None. Fully synthetic; not re-identifiable.
License
Proprietary — internal use rights; redistribution/resale not permitted.
Who Can Use It
- Data Scientists/ML Engineers — feature engineering, baselines, monitoring.
- Growth/Marketing/CRM — funnels, cohorts, attribution and journey analytics.
- Product & Analytics — activation/retention, experimentation, KPI sandboxes.
- BI/RevOps — conversion & LTV modelling with synthetic inputs.
Important Notes / Disclaimers
- Not real user data. Not for direct targeting of individuals.
- Event rates and conversion distributions are synthetic and calibrated to public benchmarks; they do not represent any specific business.