Opendatabay APP

Synthetic Industrial Failure Dataset

LLM Fine-Tuning Data

Tags and Keywords

Maintenance, Prediction, Failure, Machine, Industrial

Free

About

This dataset provides synthetic data for machine predictive maintenance, designed to help develop models that predict machine failure and classify the type of failure. Real predictive maintenance datasets are often difficult to acquire and publish, so this synthetic dataset offers a realistic alternative for research and development purposes, reflecting common industrial scenarios. It includes detailed operational parameters and two distinct target variables for classification tasks.

Columns

  • UID: A unique identifier for each data point, ranging from 1 to 10,000.
  • productID: Identifies the product quality variant (Low, Medium, or High) and includes a variant-specific serial number. Low quality products constitute 50% of the dataset, Medium 30%, and High 20%.
  • air temperature [K]: The air temperature in Kelvin, generated using a random walk process and normalised to a standard deviation of 2 K around 300 K.
  • process temperature [K]: The process temperature in Kelvin, calculated as the air temperature plus 10 K plus a random walk process normalised to a standard deviation of 1 K.
  • rotational speed [rpm]: The rotational speed in revolutions per minute, derived from a power of 2860 W and overlaid with normally distributed noise.
  • torque [Nm]: Torque values in newton-metres, normally distributed around 40 Nm with a standard deviation of 10 Nm, with no negative values.
  • tool wear [min]: The amount of tool wear in minutes. Different product quality variants (H/M/L) add specific amounts of wear: 5 minutes for High, 3 minutes for Medium, and 2 minutes for Low.
  • machine failure: A binary target label indicating whether the machine failed at that data point (0 for no failure, 1 for failure). Do not use it as an input feature when predicting Failure Type, as doing so leaks the answer into the model.
  • Failure Type: A multiclass target label specifying the type of failure, if one occurred (e.g., No Failure, Heat Dissipation Failure). It is also a target variable and, for the same reason, should not be used as a feature when predicting machine failure.
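The generation rules described above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the procedure actually used to build the dataset: the helper names (`normalised_walk`, `positive_normal`, `generate_rows`), the unit-variance step size of the random walk, and the redraw strategy for avoiding negative torque are all hypothetical choices. Rotational speed, tool wear, and the failure labels are left out because the listing does not give enough detail to reproduce them.

```python
import random
import statistics


def normalised_walk(n, target_std, rng):
    """Random walk of n steps, rescaled to zero mean and the given std."""
    position, walk = 0.0, []
    for _ in range(n):
        position += rng.gauss(0.0, 1.0)  # assumed step distribution
        walk.append(position)
    mean = statistics.fmean(walk)
    std = statistics.pstdev(walk)
    return [(x - mean) / std * target_std for x in walk]


def positive_normal(mu, sigma, rng):
    """Draw from N(mu, sigma), redrawing until the value is non-negative."""
    while True:
        value = rng.gauss(mu, sigma)
        if value >= 0.0:
            return value


def generate_rows(n=10_000, seed=0):
    """Build n synthetic rows following the column descriptions."""
    rng = random.Random(seed)
    # Air temperature: random walk with std 2 K, centred on 300 K.
    air = [300.0 + x for x in normalised_walk(n, 2.0, rng)]
    # Process temperature: air temperature + 10 K + walk with std 1 K.
    offset = normalised_walk(n, 1.0, rng)
    rows = []
    for i in range(n):
        rows.append({
            "UID": i + 1,
            # Quality variant: 50% Low, 30% Medium, 20% High.
            "variant": rng.choices("LMH", weights=[50, 30, 20])[0],
            "air temperature [K]": air[i],
            "process temperature [K]": air[i] + 10.0 + offset[i],
            "torque [Nm]": positive_normal(40.0, 10.0, rng),
        })
    return rows
```

Calling `generate_rows(n=10_000)` returns a list of dictionaries, one per data point, whose air temperature has a standard deviation of 2 K around a mean of 300 K by construction.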

Distribution

The dataset is provided as a data file, typically in CSV format, and comprises 10,000 rows, each representing a unique data point. The listing reports 14 features organised into columns, although only nine are described above. The sample file is updated separately on the platform.
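A quick sanity check after downloading is to confirm the row count, the column count, and the share of each quality variant. The sketch below uses only the standard library. It assumes that the CSV header uses the column name `productID` as written in this listing and that the quality variant is the first character of that field (for example `L`, `M`, or `H`); both are assumptions about the file, and the inline `SAMPLE` rows are made up purely for illustration.

```python
import csv
import io
from collections import Counter

# Made-up rows for illustration only; not taken from the real file.
SAMPLE = (
    "UID,productID,torque [Nm]\n"
    "1,L1001,41.2\n"
    "2,M2001,38.5\n"
    "3,L1002,44.0\n"
    "4,H3001,35.1\n"
)


def summarise(csv_text, id_column="productID"):
    """Return (row count, column count, share of each quality variant)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    if not rows:
        return 0, len(reader.fieldnames or []), {}
    # Assumption: the variant letter is the first character of the ID.
    counts = Counter(row[id_column][0] for row in rows)
    shares = {variant: count / len(rows) for variant, count in counts.items()}
    return len(rows), len(reader.fieldnames), shares


n_rows, n_cols, shares = summarise(SAMPLE)
print(n_rows, n_cols, shares)
```

Run against the full file's text, the row count should be 10,000 and the shares should be close to 0.5, 0.3, and 0.2 for Low, Medium, and High if the file matches the description above.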

Usage

This dataset is ideal for:
  • Developing and testing machine learning models for predictive maintenance.
  • Binary classification tasks to predict if a machine will fail.
  • Multiclass classification tasks to predict the specific type of machine failure.
  • Research into industrial anomaly detection and fault prediction.
  • Educational purposes for demonstrating predictive maintenance concepts.
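For either classification task, both target columns have to be kept out of the model inputs. The following is a minimal sketch of that split using the standard library, assuming the CSV header matches the column names given in this listing (the real file may name them differently). The function name `split_features_targets` and the two sample rows are hypothetical. Identifier columns are dropped here for simplicity, although the quality variant encoded in `productID` could reasonably be retained as a categorical feature.

```python
import csv
import io

TARGETS = ("machine failure", "Failure Type")
IDENTIFIERS = ("UID", "productID")

# Made-up rows for illustration only; not taken from the real file.
SAMPLE = (
    "UID,productID,air temperature [K],process temperature [K],"
    "rotational speed [rpm],torque [Nm],tool wear [min],"
    "machine failure,Failure Type\n"
    "1,M2001,298.1,308.6,1551,42.8,0,0,No Failure\n"
    "2,L1001,302.4,310.9,1380,55.3,210,1,Heat Dissipation Failure\n"
)


def split_features_targets(rows):
    """Separate numeric features from the binary and multiclass targets."""
    excluded = TARGETS + IDENTIFIERS
    features, binary, multiclass = [], [], []
    for row in rows:
        # Keep only operational parameters as model inputs.
        features.append(
            {name: float(value) for name, value in row.items() if name not in excluded}
        )
        binary.append(int(row["machine failure"]))
        multiclass.append(row["Failure Type"])
    return features, binary, multiclass


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
X, y_binary, y_multiclass = split_features_targets(rows)
```

`X` can then be paired with `y_binary` for failure prediction or with `y_multiclass` for failure-type prediction, with neither target ever appearing among the features.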

Coverage

This is a synthetic dataset designed to reflect real predictive maintenance scenarios encountered in industrial settings. As such, it does not have specific geographic or time range coverage in the traditional sense. The data points simulate realistic operational conditions and failure modes found in industry.

License

CC0: Public Domain

Who Can Use It

  • Data Scientists and Machine Learning Engineers building predictive models.
  • Researchers and Academics studying industrial IoT, anomaly detection, and machine reliability.
  • Students learning about classification algorithms and predictive analytics.
  • Anyone interested in applying data science to manufacturing and maintenance optimisation.

Dataset Name Suggestions

  • Machine Predictive Maintenance Classification
  • Synthetic Industrial Failure Dataset
  • AI4I Predictive Maintenance Dataset
  • Machine Failure Prediction Data

Attributes

Listing Stats

  • Views: 1
  • Downloads: 0
  • Listed: 08/07/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
