Plant Disease Environmental Predictor

Synthetic Data Generation

Tags and Keywords

Plant_disease

Temperature

Humidity

Rainfall

Soil_ph

Plant Disease Environmental Predictor Dataset on Opendatabay data marketplace

Free

About

This dataset simulates environmental factors critical for predicting fungal infections in plants. It provides 10,000 synthetic samples of environmental measurements from different farm locations, intended to aid early identification of potential disease outbreaks. Predicting disease presence from these measurements can support efforts to reduce agricultural losses. The relationships between the environmental inputs and disease presence are complex and non-linear, mimicking real biological systems.

Columns

  • temperature: Environmental temperature measured in degrees Celsius.
  • humidity: Moisture content measured as a percentage.
  • rainfall: Precipitation amount measured in millimeters.
  • soil_pH: Measurement of the soil's acidity or alkalinity.
  • disease_present: A binary output label indicating the presence of disease (0 = healthy, 1 = diseased).
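As a minimal sketch of how the columns above might be read into typed records, the following uses only the Python standard library. The column names come from the list above; the `Sample` class, the `parse_samples` helper, and the two example rows are hypothetical and made up for illustration, not taken from the dataset.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Sample:
    temperature: float     # degrees Celsius
    humidity: float        # percent
    rainfall: float        # millimeters
    soil_pH: float         # soil acidity or alkalinity
    disease_present: int   # 0 = healthy, 1 = diseased


def parse_samples(csv_text: str) -> list[Sample]:
    """Parse CSV text with the five documented columns into Sample records."""
    reader = csv.DictReader(io.StringIO(csv_text))
    samples = []
    for row in reader:
        label = int(row["disease_present"])
        if label not in (0, 1):
            raise ValueError(f"unexpected label: {label}")
        samples.append(
            Sample(
                temperature=float(row["temperature"]),
                humidity=float(row["humidity"]),
                rainfall=float(row["rainfall"]),
                soil_pH=float(row["soil_pH"]),
                disease_present=label,
            )
        )
    return samples


# Made-up rows for illustration only; these values are NOT from the dataset.
EXAMPLE_CSV = """temperature,humidity,rainfall,soil_pH,disease_present
24.5,81.0,12.3,6.4,1
18.2,55.5,0.0,7.1,0
"""

samples = parse_samples(EXAMPLE_CSV)
```

For the downloaded file, the same function could be given the file's contents as a string, assuming the header row uses exactly these column names.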

Distribution

The dataset contains 10,000 individual records and 5 columns: four environmental features and one binary label. The data is provided as a CSV file of approximately 755.02 kB. The listing reports a high usability score, as the data is clean, with zero mismatched or missing values across all records.
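The cleanliness claims above can be checked after download. The sketch below counts rows, empty cells, and non-numeric cells using only the standard library. The `quality_report` helper and the deliberately dirty example rows are hypothetical; the sketch also treats "mismatched" as "not parseable as a number", which is an assumption about what the listing means.

```python
import csv
import io

EXPECTED_COLUMNS = ["temperature", "humidity", "rainfall", "soil_pH", "disease_present"]


def quality_report(csv_text: str) -> dict:
    """Count rows, missing cells, and non-numeric cells in the expected columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    report = {"rows": 0, "missing": 0, "mismatched": 0}
    for row in reader:
        report["rows"] += 1
        for name in EXPECTED_COLUMNS:
            # DictReader yields None for cells absent from a short row.
            value = (row.get(name) or "").strip()
            if value == "":
                report["missing"] += 1
                continue
            try:
                float(value)
            except ValueError:
                report["mismatched"] += 1
    return report


# Made-up, deliberately dirty rows to exercise the checks; NOT from the dataset.
DIRTY_CSV = """temperature,humidity,rainfall,soil_pH,disease_present
24.5,81.0,12.3,6.4,1
18.2,,0.0,abc,0
"""

report = quality_report(DIRTY_CSV)
```

If the listing's description holds, running this on the real file should give 10,000 rows with zero missing and zero mismatched cells.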

Usage

The data is ideal for several applications within machine learning and analysis:
  • Practicing binary classification techniques.
  • Performing feature importance analysis to identify critical environmental drivers.
  • Understanding complex feature interactions.
  • Testing the robustness of predictive models.
  • Applying specialised imbalanced classification techniques, given the split between healthy (7,590, or 75.9%) and diseased (2,410, or 24.1%) labels.
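To make the imbalance in the label split above concrete, the sketch below computes the accuracy of a trivial "always healthy" baseline and one common set of class weights. The weighting rule used here, total / (number of classes × class count), is a widely used heuristic chosen for illustration; the listing does not prescribe any particular technique, and the `balanced_class_weights` name is hypothetical.

```python
def balanced_class_weights(counts: dict) -> dict:
    """Weight each class by total / (n_classes * class_count)."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


# Label counts as stated in the listing: 0 = healthy, 1 = diseased.
LABEL_COUNTS = {0: 7590, 1: 2410}

weights = balanced_class_weights(LABEL_COUNTS)

# Accuracy of a model that always predicts the majority class (healthy).
majority_baseline = max(LABEL_COUNTS.values()) / sum(LABEL_COUNTS.values())

print(weights)            # weights are roughly 0.66 for healthy, 2.07 for diseased
print(majority_baseline)  # 0.759
```

Because always predicting "healthy" already scores 75.9% accuracy, metrics such as precision, recall, or F1 on the diseased class are usually more informative than raw accuracy for this data.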

Coverage

The data covers four environmental measurements: temperature, humidity, rainfall, and soil pH. Because the data is synthetic and was generated for educational purposes, it has no specific geographic location or time range. It represents simulated environmental conditions that could lead to plant fungal infections in farm settings.

License

CC0: Public Domain

Who Can Use It

  • Data Scientists: For developing and evaluating machine learning models focused on early warning systems in agriculture.
  • Researchers: To study the impact of environmental variables on plant health and disease proliferation.
  • Students: For educational practice, particularly in binary classification and handling imbalanced datasets.
  • ML Engineers: To test model stability and robustness under varied environmental input conditions.

Dataset Name Suggestions

  • Plant Disease Environmental Predictor
  • Synthetic Crop Health Factors
  • Fungal Risk Classification Dataset
  • Agricultural Condition Simulation

Listing Stats

  • Views: 3
  • Downloads: 1
  • Listed: 17/11/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
