Zalingo Synthetic User Behaviour — Premium Evaluation Kit

ALITA Therapeutics Ltd

Licensed LLM Data Provider

£1,999

Name: Zalingo Synthetic User Behaviour — Premium Evaluation Kit
Creator: ALITA Therapeutics Ltd
Published: 2025-09-08T19:21:58.685Z
License: https://docs.opendatabay.com/ai-training-and-model-development-licenses/general-ai-training-and-fine-tuning-data-license

Synthetic Tabular Data

Tags and Keywords

Synthetic, Data, User, Behavior, Clickstream, Events, Sessions, Funnels, Cohorts, Attribution, Conversion, Propensity, Marketing, Mix, Recommendation, CLV, Parquet, Notebooks, 1m, Rows, PII-safe, Anonymised, 1million

About

Zalingo Synthetic User Behaviour — Premium Evaluation Kit (Engagement • Conversion • Attribution • CLV) — 1M Rows + Notebooks

A premium, end-to-end evaluation kit for growth & behavioural analytics. You get ~1,000,000 privacy-safe synthetic events across web, mobile, email, ads, and in-app channels, plus Jupyter notebooks and data dictionaries—so teams can benchmark funnels, attribution, conversion propensity, and CLV workflows quickly without handling real user data (no PII).

Scaling up? After purchase, message us about enterprise bundles (tens of millions of rows) and weekly/daily refresh subscriptions via S3/API.

What’s Inside

Data (Parquet, Snappy): ~1,000,000 rows, partitioned by event_date / channel / session_id; includes precomputed signals and labels.
Notebooks: EDA & quality, signals & features (RFM, rolling windows, dwell/scroll), propensity & attribution, funnels & cohorts, and an optional A/B & MMM demo.
Docs & Schema: Data dictionary, column glossary, sampling notes, JSON schema examples.
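
The listing does not define how the RFM signals are calculated, so the following is only a minimal sketch of how such features are commonly derived for one user. It assumes the windows implied by the column names (28 days for frequency, 90 days for monetary value); the function name `rfm_features` and the exact window boundaries are hypothetical and may differ from the included notebooks.

```python
from datetime import datetime, timedelta, timezone

def rfm_features(events, as_of):
    """Compute assumed RFM-style features for one user's events.

    events: list of dicts with 'ts_utc' (aware datetime), 'event_name',
            and an optional 'purchase_value'.
    as_of:  aware datetime marking the end of the observation window.
    """
    past = [e for e in events if e["ts_utc"] <= as_of]
    if not past:
        return {"rfm_recency_days": None, "rfm_frequency_28d": 0, "rfm_monetary_90d": 0.0}

    last_seen = max(e["ts_utc"] for e in past)
    freq_cutoff = as_of - timedelta(days=28)   # assumed window for frequency
    money_cutoff = as_of - timedelta(days=90)  # assumed window for monetary value

    frequency = sum(1 for e in past if e["ts_utc"] > freq_cutoff)
    monetary = float(sum(
        e.get("purchase_value", 0.0)
        for e in past
        if e["event_name"] == "purchase" and e["ts_utc"] > money_cutoff
    ))
    return {
        "rfm_recency_days": (as_of - last_seen).days,
        "rfm_frequency_28d": frequency,
        "rfm_monetary_90d": monetary,
    }

# Toy events for a single synthetic user (values are illustrative only).
utc = timezone.utc
as_of = datetime(2025, 3, 31, tzinfo=utc)
events = [
    {"ts_utc": datetime(2025, 3, 29, 12, tzinfo=utc), "event_name": "page_view"},
    {"ts_utc": datetime(2025, 3, 10, tzinfo=utc), "event_name": "purchase", "purchase_value": 40.0},
    {"ts_utc": datetime(2025, 1, 15, tzinfo=utc), "event_name": "purchase", "purchase_value": 25.0},
    {"ts_utc": datetime(2024, 11, 1, tzinfo=utc), "event_name": "purchase", "purchase_value": 100.0},
]
features = rfm_features(events, as_of)
```

Comparing values computed this way against the precomputed columns is one way to check which window definitions the kit actually uses.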

Key Fields (representative)

Events & Timing: event_id, ts_utc, event_name (page_view|screen_view|click|search|add_to_cart|purchase|signup|unsubscribe|support_chat).
Identity (synthetic): user_id, session_id (non-linkable).
Channels & Context: channel (web|mobile|email|ads|in-app|support), page_url/screen_name, referrer.
Attribution: utm_source, utm_medium, utm_campaign, campaign_id, first_touch_channel, last_touch_channel.
Geo & Device: geo_country, geo_city, device_os, app_version, browser.
Commerce Fields (when applicable): item_sku, item_category, cart_value, purchase_value, currency.
Session/Engagement: session_duration_s, n_events_session, bounce_flag, dwell_time_ms, scroll_depth_pct.
Signals & Labels: rfm_recency_days, rfm_frequency_28d, rfm_monetary_90d, roll_events_1d/7d/28d, time_to_convert_s, conversion_flag (0/1), conversion_type, propensity_score_0_1, next_best_action, cohort_week, cohort_source, days_since_last_active, repeat_visit_flag, clv_proxy, ltv_bucket, consent_flag. (Columns may vary slightly; see the included dictionary + preview for exact schema.)
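
Since the exact schema may vary, a lightweight contract check can catch surprises early. The sketch below validates one row against the enumerations listed above. Which fields are mandatory is an assumption (the listing does not say which columns are non-null), and `validate_event` is a hypothetical helper, not part of the kit.

```python
# Allowed values copied from the field list above.
EVENT_NAMES = {
    "page_view", "screen_view", "click", "search", "add_to_cart",
    "purchase", "signup", "unsubscribe", "support_chat",
}
CHANNELS = {"web", "mobile", "email", "ads", "in-app", "support"}

# Assumption: these identifiers are always populated.
REQUIRED = ("event_id", "ts_utc", "event_name", "user_id", "session_id", "channel")

def validate_event(row):
    """Return a list of contract violations for one event row (a dict)."""
    errors = []
    for field in REQUIRED:
        if row.get(field) in (None, ""):
            errors.append(f"missing:{field}")

    name = row.get("event_name")
    if name is not None and name not in EVENT_NAMES:
        errors.append(f"bad event_name:{name}")

    channel = row.get("channel")
    if channel is not None and channel not in CHANNELS:
        errors.append(f"bad channel:{channel}")

    flag = row.get("conversion_flag")
    if flag is not None and flag not in (0, 1):
        errors.append(f"bad conversion_flag:{flag}")

    score = row.get("propensity_score_0_1")
    if score is not None and not 0.0 <= score <= 1.0:
        errors.append(f"bad propensity_score_0_1:{score}")
    return errors

good_row = {
    "event_id": "e1", "ts_utc": "2025-01-01T00:00:00Z", "event_name": "purchase",
    "user_id": "u1", "session_id": "s1", "channel": "web",
    "conversion_flag": 1, "propensity_score_0_1": 0.82,
}
bad_row = dict(good_row, event_name="refund", session_id=None,
               conversion_flag=2, propensity_score_0_1=1.4)

good_errors = validate_event(good_row)
bad_errors = validate_event(bad_row)
```

The enumerations should be adjusted to match the included data dictionary before relying on a check like this.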

Distribution

Format: ZIP with /data (Parquet), /notebooks, /docs, /schema.
Volume: ~1,000,000 rows, 25–45 columns, multi-part Parquet.
Approx Size: 60–150 MB zipped (category-dependent).
Partitioning: by event_date / channel / session_id for efficient reads.
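
Partitioning helps most when a reader can skip files by path alone. The listing does not describe the directory layout, so the sketch below assumes Hive-style `key=value` folders (for example `event_date=2025-01-01/channel=web/`); the paths and the helper names `partition_values` and `select_files` are hypothetical, and the `session_id` level is left out for brevity.

```python
from pathlib import PurePosixPath

def partition_values(path):
    """Extract key=value directory segments from an assumed Hive-style path."""
    parts = PurePosixPath(path).parts
    return dict(p.split("=", 1) for p in parts if "=" in p)

def select_files(paths, start_date, end_date, channels):
    """Keep files whose partition values fall inside the date range and channel set.

    Dates are ISO strings (YYYY-MM-DD), so plain string comparison orders them correctly.
    """
    selected = []
    for path in paths:
        values = partition_values(path)
        date = values.get("event_date", "")
        if start_date <= date <= end_date and values.get("channel") in channels:
            selected.append(path)
    return selected

# Hypothetical file listing; the real archive layout may differ.
paths = [
    "data/event_date=2025-01-01/channel=web/part-0.parquet",
    "data/event_date=2025-01-02/channel=email/part-0.parquet",
    "data/event_date=2025-01-03/channel=web/part-0.parquet",
    "data/event_date=2025-02-01/channel=web/part-0.parquet",
]
january_web = select_files(paths, "2025-01-01", "2025-01-31", {"web"})
```

Parquet readers that understand partitioned datasets can usually apply the same kind of filter directly, so a manual step like this is mainly useful for inspecting what a filter would touch.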

Usage

Conversion propensity & uplift — treatment targeting, cost/benefit curves.
Funnels & cohorts — activation, retention, resurrection diagnostics.
Attribution & MMM inputs — last/first-touch snapshots, campaign features.
Personalisation & ranking — engagement-aware recommendations.
CLV & lifecycle — synthetic LTV proxies and segmenting.
MLOps QA — schema contracts, drift monitors, dashboards.
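
As an illustration of the funnel and attribution use cases, the sketch below counts users reaching each step of a strict sequential funnel and derives first-touch and last-touch channels per user. The step order, the strict-sequence rule, and the touch definition (earliest and latest event by timestamp) are assumptions; the listing does not state how its own `first_touch_channel` and `last_touch_channel` columns were computed.

```python
from collections import defaultdict

FUNNEL = ("page_view", "add_to_cart", "purchase")  # hypothetical step order

def funnel_counts(events, steps=FUNNEL):
    """Count users who reach each step, requiring steps to occur in order."""
    by_user = defaultdict(list)
    for e in sorted(events, key=lambda e: e["ts_utc"]):
        by_user[e["user_id"]].append(e["event_name"])

    reached = [0] * len(steps)
    for names in by_user.values():
        step = 0
        for name in names:
            if step < len(steps) and name == steps[step]:
                reached[step] += 1
                step += 1
    return dict(zip(steps, reached))

def touch_channels(events):
    """Map user_id to (first_touch_channel, last_touch_channel) by timestamp."""
    touches = {}
    for e in sorted(events, key=lambda e: e["ts_utc"]):
        first, _ = touches.get(e["user_id"], (e["channel"], None))
        touches[e["user_id"]] = (first, e["channel"])
    return touches

# Toy events; ISO-8601 UTC strings in one format sort chronologically.
events = [
    {"user_id": "u1", "ts_utc": "2025-01-01T10:00:00Z", "event_name": "page_view", "channel": "ads"},
    {"user_id": "u1", "ts_utc": "2025-01-01T10:05:00Z", "event_name": "add_to_cart", "channel": "web"},
    {"user_id": "u1", "ts_utc": "2025-01-02T09:00:00Z", "event_name": "purchase", "channel": "email"},
    {"user_id": "u2", "ts_utc": "2025-01-01T11:00:00Z", "event_name": "page_view", "channel": "email"},
    {"user_id": "u2", "ts_utc": "2025-01-01T11:30:00Z", "event_name": "purchase", "channel": "web"},
    {"user_id": "u3", "ts_utc": "2025-01-01T12:00:00Z", "event_name": "add_to_cart", "channel": "mobile"},
]
counts = funnel_counts(events)
touches = touch_channels(events)
```

Under the strict rule, a user who purchases without an earlier add_to_cart event is not counted at the purchase step; a looser "any order" funnel would give different numbers.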

Coverage

Geographic: Multi-country synthetic coverage (ISO codes).
Time Range: Recent multi-year synthetic window with weekly/seasonal patterns.
PII: None — fully synthetic; not re-identifiable.

Who Can Use It

Growth/Marketing/CRM, Product & Analytics, Data Science/ML, BI/RevOps, Vendors/SIs for demos and validation.

Notes / Disclaimers

Not real user data. Not for direct targeting of individuals.
Signals, scores, and rates are synthetic calibrated distributions and do not represent any specific business.

Evaluation License (Non-Production, Internal Use Only, 90 Days)

Buyer is granted a non-exclusive, non-transferable license to use the data and included assets solely for internal evaluation, prototyping, and testing for 90 days from purchase. Production use, external distribution, resale, sublicensing, and sharing beyond Buyer’s employees and on-site contractors under NDA are not permitted. Derived models/features may be retained for internal research; production deployment requires a separate enterprise license. All materials are provided “as is” without warranties; liability is limited to the amount paid.

Listing Stats

Delivery: Instant download
Listed: 08/09/2025
Updated: 12/09/2025
Region: Global
Quality: 5 / 5
