Zalingo Synthetic Manufacturing Sensor & Production Events — 100k
Synthetic Tabular Data
£249
About
Zalingo Synthetic Manufacturing Sensor & Production Events — 100k Sample (Parquet)
This listing provides a 100,000-row synthetic factory dataset that combines machine sensor telemetry with production and quality events. It emulates realistic signals and shop-floor states (cycles, downtime, scrap, OEE components, energy) under a stable schema, and it contains no proprietary plant data and no PII/PHI. Use it to prototype analytics, validate pipelines, benchmark models, or demo digital-twin scenarios without the access restrictions that come with real plant data.
Need larger volumes or scheduled refreshes? After purchasing this sample, message us about enterprise-scale bundles and monthly, weekly, or daily subscriptions.
Dataset Features (representative)
- site — Synthetic site name / code.
- line_id — Production line identifier.
- machine_id — Identifier of the machine (asset) within its production line.
- asset_class — robot | press | cnc | conveyor | oven | compressor | sensor_gateway.
- ts_utc — Event/tick timestamp (ISO-8601, UTC).
- event_type — sensor_reading | cycle_complete | changeover | downtime | quality_event | maintenance.
- sensor_type — temperature | vibration | pressure | rpm | current | power | flow (if event_type = sensor_reading).
- reading_value — Numeric sensor value (float); the unit is given in reading_unit.
- reading_unit — °C | mm/s | bar | rpm | A | kW | L/min, etc.
- cycle_time_ms — Observed cycle time in milliseconds (for cycle_complete events).
- throughput_units — Units produced in the interval (int).
- scrap_count — Defective units in the interval (int).
- scrap_reason — Synthetic defect code/label (optional).
- downtime_start_utc / downtime_end_utc / downtime_minutes — Start and end timestamps (UTC) and duration in minutes; populated for downtime events.
- downtime_cause — mechanical | electrical | material | planned | changeover | other.
- maintenance_type — planned | unplanned (for maintenance events); the alert_code column is populated when applicable.
- oee_availability / oee_performance / oee_quality / oee_overall — Component & composite OEE (0–1).
- energy_kwh — Interval energy consumption estimate.
- product_code / lot_id / work_order_id — Synthetic production identifiers.
- country / city — ISO-2 country code + synthetic city.
(Columns can vary slightly by bundle; see the included preview CSV for the exact schema in this sample.)
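As a quick sanity check on the OEE columns, the sketch below compares a stored oee_overall against the textbook composite (availability × performance × quality). The listing does not state how oee_overall is derived, so treating it as the exact product is an assumption to verify against the sample; the function names and the example rows are hypothetical.

```python
import math

def oee_overall(availability: float, performance: float, quality: float) -> float:
    """Textbook composite OEE: the product of the three components."""
    return availability * performance * quality

def find_oee_mismatches(rows, tol=1e-6):
    """Return indices of rows whose stored oee_overall differs from the product.

    ASSUMPTION: oee_overall is meant to equal availability * performance * quality.
    Rows with missing OEE values are skipped rather than flagged.
    """
    bad = []
    for i, row in enumerate(rows):
        parts = (row["oee_availability"], row["oee_performance"], row["oee_quality"])
        stored = row["oee_overall"]
        if stored is None or any(v is None for v in parts):
            continue  # OEE fields may be empty on some event types
        in_range = all(0.0 <= v <= 1.0 for v in parts)
        if not in_range or not math.isclose(stored, oee_overall(*parts), abs_tol=tol):
            bad.append(i)
    return bad

# Hypothetical rows, not taken from the dataset.
rows = [
    {"oee_availability": 0.90, "oee_performance": 0.80, "oee_quality": 0.95, "oee_overall": 0.684},
    {"oee_availability": 0.85, "oee_performance": 0.90, "oee_quality": 0.99, "oee_overall": 0.80},
]
mismatches = find_oee_mismatches(rows, tol=1e-3)
print(mismatches)
```

If many rows are flagged, the sample probably derives or rounds oee_overall differently, and the tolerance or the assumption should be revisited.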
Distribution
- Format: ZIP containing Parquet shards (Snappy) + README.
- Data Volume: 100,000 rows, ~18–30 columns, 1–5 Parquet parts.
- Approx Size: ~3–5 MB zipped (category-dependent).
- Schema Stability: Names/types remain consistent across manufacturing samples; full datasets are partitioned by date, site, line, and machine.
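One way to read the delivery described above is to open the ZIP, pick out the Parquet parts, and concatenate them. This is a minimal sketch, not the vendor's loader: the archive path and the internal file layout are assumptions, and load_sample needs the third-party pandas package plus a Parquet engine such as pyarrow.

```python
import io
import zipfile

def list_parquet_members(zip_path):
    """Return the sorted names of all .parquet files inside the ZIP."""
    with zipfile.ZipFile(zip_path) as zf:
        return sorted(n for n in zf.namelist() if n.lower().endswith(".parquet"))

def load_sample(zip_path):
    """Read every Parquet part in the ZIP into one DataFrame.

    ASSUMPTION: all parts share the same schema, as the listing suggests.
    """
    import pandas as pd  # third-party; imported lazily so listing works without it

    names = list_parquet_members(zip_path)
    if not names:
        raise ValueError(f"no Parquet parts found in {zip_path}")
    with zipfile.ZipFile(zip_path) as zf:
        # Each part is read from memory; Snappy decompression is handled by the engine.
        frames = [pd.read_parquet(io.BytesIO(zf.read(name))) for name in names]
    return pd.concat(frames, ignore_index=True)
```

A call such as load_sample("zalingo_manufacturing_100k.zip") (hypothetical file name) should return roughly 100,000 rows if the archive matches the description.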
Usage
- Predictive maintenance & anomaly detection (vibration/temperature drift, early warnings).
- OEE analysis & bottleneck detection (availability, performance, quality).
- Quality analytics (scrap drivers, SPC-style monitoring).
- Throughput & takt-time optimisation (line balancing, changeover impact).
- Energy optimisation (kWh per unit, idle losses).
- Digital-twin demos & education without exposing real plant data.
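To make the quality and energy use cases concrete, the sketch below aggregates scrap rate and kWh per unit for each machine from interval rows. It assumes scrap_count is counted within throughput_units (so scrap rate = scrap / throughput), which the listing does not confirm; the function name and the example rows are hypothetical.

```python
from collections import defaultdict

def per_machine_kpis(rows):
    """Aggregate scrap rate and energy per unit for each machine_id.

    ASSUMPTION: scrap_count is a subset of throughput_units for the same interval.
    Missing (None) values are treated as zero.
    """
    totals = defaultdict(lambda: {"units": 0, "scrap": 0, "kwh": 0.0})
    for row in rows:
        t = totals[row["machine_id"]]
        t["units"] += row.get("throughput_units") or 0
        t["scrap"] += row.get("scrap_count") or 0
        t["kwh"] += row.get("energy_kwh") or 0.0

    kpis = {}
    for machine, t in totals.items():
        units = t["units"]
        kpis[machine] = {
            # None signals "no output", e.g. an idle machine that still drew power.
            "scrap_rate": t["scrap"] / units if units else None,
            "kwh_per_unit": t["kwh"] / units if units else None,
        }
    return kpis

# Hypothetical rows, not taken from the dataset.
rows = [
    {"machine_id": "M-01", "throughput_units": 100, "scrap_count": 4, "energy_kwh": 12.5},
    {"machine_id": "M-01", "throughput_units": 100, "scrap_count": 6, "energy_kwh": 12.5},
    {"machine_id": "M-02", "throughput_units": 0, "scrap_count": 0, "energy_kwh": 3.0},
]
kpis = per_machine_kpis(rows)
print(kpis)
```

Machines with energy use but no output (M-02 here) are the kind of idle loss the energy use case refers to; they report None rather than dividing by zero.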
Coverage
- Geographic: Multi-country synthetic coverage with ISO country codes.
- Time Range: Recent multi-year synthetic window with realistic seasonality and shifts.
- PII/Proprietary: None — fully synthetic; not derived from any single facility.
License
Proprietary — internal use rights; redistribution/resale not permitted.
Who Can Use It
- Data Scientists/ML Engineers — feature engineering, baselines, MLOps tests.
- Process/Quality/Reliability Engineers — root-cause, SPC, PdM prototypes.
- Ops Excellence & OT Teams — dashboards, alerting logic, KPI sandboxes.
- MES/SCADA Integrators — pipeline validation and demo environments.
Important Notes / Disclaimers
- Not real plant data. Not for safety-critical decision-making.
- Codes (defect/downtime/maintenance) follow synthetic distributions; they do not represent any specific manufacturer.