Opendatabay APP

Zalingo Synthetic Healthcare — 30-Day Readmission & LOS Risk — 100k Fo

Synthetic Tabular Data

Tags and Keywords

Synthetic Data, Healthcare, EMR, Encounters, Readmission, Length of Stay, Risk Scoring, Triage, ICD-10, CPT, Labs, Vitals, Parquet, CSV, PII-safe, Anonymised



£749

About

Zalingo Synthetic Healthcare — 30-Day Readmission & Length-of-Stay (LOS) Risk — 100k Focused Sample (Parquet)
A 100,000-row privacy-safe focused sample built for readmission (30-day) and LOS modelling. It emulates encounter-level EMR data with diagnoses, procedures, meds, labs, vitals, utilization history and ground-truth labels for readmission_30d and los_days/los_bucket. Ideal for feature engineering, benchmarking models, triage policy what-ifs, and pipeline QA—without handling PHI/PII.
Need larger volumes or scheduled refresh? After purchasing this focused sample, message us about enterprise bundles and monthly/weekly/daily subscriptions (readmission/LOS only or mixed clinical domains).

Dataset Features (representative)

  • patient_id — Synthetic, non-linkable identifier.
  • encounter_id / encounter_type — Inpatient | outpatient | ED | telemed.
  • ts_admit_utc / ts_discharge_utc — ISO-8601 timestamps (UTC).
  • los_days / los_bucket — Numeric LOS and bucket (e.g., 0–1, 2–3, 4–7, 8+).
  • readmission_30d — 0/1 label; readmission_ts_utc when 1.
  • age / sex — Demographics at encounter.
  • diagnosis_primary (ICD-10) — Code; diagnosis_primary_desc label.
  • diagnosis_ccsr_group — High-level diagnostic group (synthetic).
  • procedure_code (CPT-like) — Primary procedure code.
  • cci_score — Charlson-style comorbidity index (synthetic 0+).
  • prior_12m_visits / prior_12m_admits — Utilization history counts.
  • payer_type — public | private | self.
  • triage_acuity — ED triage 1–5 (if ED encounter).
  • vitals: vital_hr, vital_bp_sys, vital_bp_dia, vital_temp_c, vital_spo2.
  • labs: lab_name, lab_value, lab_units, lab_flag (normal/abnormal).
  • med_atc — Active medication ATC code(s) during encounter.
  • procedure_anesthesia_flag — 0/1.
  • icu_admit_flag / icu_hours — ICU utilization markers.
  • discharge_disposition — home | rehab | transfer | deceased (synthetic).
  • sdoH_index — Synthetic social determinants index (0–1).
  • country / city — ISO-2 country + synthetic city.
  • risk_score_0_1: Calibrated continuous score for readmission/LOS demo.

Columns can vary slightly by bundle; see the preview CSV for the exact schema.
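
The relationships between the label columns above can be checked with a few lines of Python. The following is a minimal sketch, not the vendor's validation logic. It assumes that los_days counts days, that the example bucket edges (0–1, 2–3, 4–7, 8+) apply, that bucket labels are written with plain hyphens, and that the 30-day window is measured from ts_discharge_utc. The helper names los_bucket, parse_utc and check_row are hypothetical.

```python
from datetime import datetime, timedelta


def los_bucket(los_days):
    """Map a length of stay in days to one of the listing's example buckets.

    Assumption: edges follow the example in the listing; fractional values
    fall into the next bucket up (for example 1.5 maps to "2-3").
    """
    if los_days < 0:
        raise ValueError("los_days must be non-negative")
    if los_days <= 1:
        return "0-1"
    if los_days <= 3:
        return "2-3"
    if los_days <= 7:
        return "4-7"
    return "8+"


def parse_utc(ts):
    """Parse an ISO-8601 timestamp string; a trailing 'Z' is read as UTC."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def check_row(row):
    """Return a list of consistency problems found in one encounter row (a dict)."""
    problems = []
    admit = parse_utc(row["ts_admit_utc"])
    discharge = parse_utc(row["ts_discharge_utc"])
    if discharge < admit:
        problems.append("discharge before admit")

    readmit_ts = row.get("readmission_ts_utc")
    if row["readmission_30d"] == 1:
        if not readmit_ts:
            problems.append("readmission_30d is 1 but readmission_ts_utc is missing")
        else:
            # Assumption: the 30-day window starts at discharge.
            window_end = discharge + timedelta(days=30)
            if not (discharge <= parse_utc(readmit_ts) <= window_end):
                problems.append("readmission_ts_utc outside 30 days after discharge")
    elif readmit_ts:
        problems.append("readmission_ts_utc set but readmission_30d is 0")
    return problems
```

Running check_row over every row and counting non-empty results gives a quick pipeline QA signal before any modelling starts. If the delivered los_bucket labels differ from the ones assumed here, adjust the returned strings to match the preview CSV.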

Distribution

  • Format: ZIP with Parquet shards (Snappy) + README.
  • Volume: 100,000 rows, ~22–32 columns, 1–5 parts.
  • Size: ~3–6 MB zipped.
  • Schema stability: Consistent across readmission/LOS focused bundles; full datasets partitioned by admit_date / facility / acuity.
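
Since the download is a ZIP of Parquet shards, loading it usually means reading each shard and concatenating the results. The sketch below shows one way to do that, assuming pandas with a Parquet engine such as pyarrow is installed. The function names and the archive file name are hypothetical; the actual file names inside the ZIP are not stated in the listing.

```python
import io
import zipfile


def list_parquet_members(zip_path):
    """Return the sorted names of the Parquet files inside the ZIP archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(
            name for name in archive.namelist()
            if name.lower().endswith(".parquet")
        )


def load_shards(zip_path):
    """Read every Parquet shard in the archive into a single DataFrame.

    pandas is imported lazily so that listing shards works without it.
    """
    import pandas as pd

    frames = []
    with zipfile.ZipFile(zip_path) as archive:
        for name in list_parquet_members(zip_path):
            # Read the shard into memory so the reader gets a seekable buffer.
            frames.append(pd.read_parquet(io.BytesIO(archive.read(name))))
    return pd.concat(frames, ignore_index=True)
```

A call such as load_shards("zalingo_sample.zip") (hypothetical file name) should return one frame covering all parts. At a few megabytes zipped, the whole sample fits comfortably in memory, so no streaming is needed.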

Usage

  • Readmission modelling: baselines, threshold tuning, feature ablations.
  • LOS prediction: case-mix adjustment, bed-day planning, discharge planning what-ifs.
  • Triage & capacity policy: acuity-based routing experiments.
  • Quality & monitoring: drift tests, KPI dashboards, alert simulations.
  • Education & enablement: reproducible exercises without PHI.
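
As an illustration of the threshold-tuning use case, the sketch below sweeps cut-offs over a continuous score and reports precision and recall against the readmission_30d label. It is a minimal example in plain Python, under the assumption that risk_score_0_1 (or a model's own predicted probability) serves as the score. The function names are hypothetical.

```python
def precision_recall(scores, labels, threshold):
    """Compute (precision, recall) when flagging rows with score >= threshold."""
    tp = fp = fn = 0
    for score, label in zip(scores, labels):
        flagged = score >= threshold
        if flagged and label == 1:
            tp += 1
        elif flagged and label == 0:
            fp += 1
        elif not flagged and label == 1:
            fn += 1
    # Return 0.0 instead of dividing by zero when nothing is flagged
    # or when there are no positive labels.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def sweep_thresholds(scores, labels, thresholds):
    """Return a list of (threshold, precision, recall) tuples."""
    return [(t, *precision_recall(scores, labels, t)) for t in thresholds]
```

With a loaded frame df, the inputs could be df["risk_score_0_1"].tolist() and df["readmission_30d"].tolist(). The operating threshold is then chosen from the sweep according to how costly a missed readmission is relative to a false alert.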

Coverage

  • Geographic: Multi-country synthetic coverage (ISO codes).
  • Time Range: Recent multi-year synthetic window with weekly/seasonal patterns.
  • PII/PHI: None — fully synthetic; not re-identifiable.

License

Proprietary — internal use rights; redistribution/resale not permitted.

Who Can Use It

  • Clinical Analytics & Data Science — feature engineering, baseline models.
  • Operations/Bed Management — LOS planning and scenario testing.
  • Payers/Providers — readmission risk stratification experiments.
  • Vendors/SIs — pipeline QA and demo environments.

Notes / Disclaimers

  • Not real patient data. Not for clinical decision-making.
  • Codes, rates, and scores follow synthetic calibrated distributions and do not represent any specific provider or population.

Listing Stats

  • Views: 1
  • Downloads: 0
  • Listed: 08/09/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
