Plant Disease Environmental Predictor
Synthetic Data Generation
Free
About
This dataset simulates the environmental factors that matter most for predicting fungal infections in plants. It provides 10,000 synthetic samples of environmental measurements from different farm locations, intended to support early identification of potential disease outbreaks. Predicting disease presence from these inputs can support efforts to reduce significant agricultural losses worldwide. The relationships between the environmental inputs and disease presence are complex and non-linear, mimicking real biological systems.
Columns
- temperature: Environmental temperature measured in degrees Celsius.
- humidity: Moisture content measured as a percentage.
- rainfall: Precipitation amount measured in millimeters.
- soil_pH: Measurement of the soil's acidity or alkalinity.
- disease_present: A binary output label indicating the presence of disease (0 = healthy, 1 = diseased).
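The column list above can be turned into a quick schema check before any modelling. The sketch below uses only the standard library and the five column names from this listing; the in-memory sample rows, the function names, and the 0 to 100 humidity bound are illustrative assumptions, not values taken from the real file.

```python
import csv
import io

# Column names come from the listing above.
EXPECTED_COLUMNS = ["temperature", "humidity", "rainfall", "soil_pH", "disease_present"]


def load_rows(handle):
    """Parse CSV text into typed rows; raise ValueError on schema problems."""
    reader = csv.DictReader(handle)
    missing = set(EXPECTED_COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    rows = []
    for line_no, raw in enumerate(reader, start=2):  # line 1 is the header
        # float("") raises ValueError, so empty cells are rejected here.
        row = {name: float(raw[name]) for name in EXPECTED_COLUMNS[:-1]}
        label = int(raw["disease_present"])
        if label not in (0, 1):
            raise ValueError(f"line {line_no}: label must be 0 or 1, got {label}")
        row["disease_present"] = label
        rows.append(row)
    return rows


def count_humidity_out_of_range(rows, low=0.0, high=100.0):
    """Count rows whose humidity falls outside an assumed percentage range."""
    return sum(1 for row in rows if not low <= row["humidity"] <= high)


# Made-up sample standing in for the real CSV file.
SAMPLE = """temperature,humidity,rainfall,soil_pH,disease_present
24.5,81.0,12.3,6.4,1
18.2,55.5,0.0,7.1,0
"""

rows = load_rows(io.StringIO(SAMPLE))
print(len(rows), count_humidity_out_of_range(rows))
```

To run this against the downloaded file, pass an open file handle instead of the `io.StringIO` sample. The range check only counts rows rather than rejecting them, since synthetic data may include values outside physically typical bounds.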
Distribution
The dataset contains 10,000 individual records and 5 columns: four environmental features and one binary label. The typical data file format is CSV, with a size of approximately 755.02 kB. The usability score is high because the data is clean, with zero mismatched or missing values reported across all records.
Usage
The data is ideal for several applications within machine learning and analysis:
- Practicing binary classification techniques.
- Performing feature importance analysis to identify critical environmental drivers.
- Understanding complex feature interactions.
- Testing the robustness of predictive models.
- Applying specialised imbalanced classification techniques, given the split between healthy (7,590, or 75.9%) and diseased (2,410, or 24.1%) labels.
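The label split has two practical consequences, worked through in the sketch below: the accuracy a trivial model reaches by always predicting "healthy", and one common way to weight the classes. The weighting formula, total / (number of classes × class count), is the heuristic scikit-learn documents for `class_weight="balanced"`; using it here is an assumption about how one might handle the imbalance, not something the dataset prescribes.

```python
# Label counts as reported in this listing (0 = healthy, 1 = diseased).
counts = {0: 7590, 1: 2410}

total = sum(counts.values())
diseased_share = counts[1] / total

# Accuracy of a model that always predicts the majority class.
majority_baseline = max(counts.values()) / total

# "Balanced" weighting heuristic: total / (n_classes * class_count).
class_weights = {label: total / (len(counts) * n) for label, n in counts.items()}

print(f"diseased share: {diseased_share:.3f}")
print(f"majority-class baseline accuracy: {majority_baseline:.3f}")
print({label: round(weight, 3) for label, weight in class_weights.items()})
```

A classifier therefore needs to beat 75.9% accuracy to improve on the trivial baseline, which is why metrics such as recall or F1 on the diseased class are usually more informative here. Under this heuristic the diseased class is weighted roughly 2.075 against 0.659 for the healthy class.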
Coverage
The data encompasses environmental measurements such as temperature, humidity, rainfall, and soil pH. Since the data is synthetic, generated for educational purposes, it does not possess a specific geographic location or time range. It represents simulated environmental conditions that could lead to plant fungal infections in farm settings.
License
CC0: Public Domain
Who Can Use It
- Data Scientists: For developing and evaluating machine learning models focused on early warning systems in agriculture.
- Researchers: To study the impact of environmental variables on plant health and disease proliferation.
- Students: For educational practice, particularly in binary classification and handling imbalanced datasets.
- ML Engineers: To test model stability and robustness under varied environmental input conditions.
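For the imbalanced-classification and robustness use cases above, a stratified split keeps the healthy-to-diseased ratio the same in the training and test sets. The following is a minimal standard-library sketch; the function name `stratified_split`, the 20% test fraction, and the stand-in label list (built from the class counts reported in this listing) are illustrative assumptions.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), splitting each class at the same fraction."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Stand-in labels with the class counts reported in the listing.
labels = [0] * 7590 + [1] * 2410
train_idx, test_idx = stratified_split(labels)
test_positive = sum(labels[i] for i in test_idx)
print(len(train_idx), len(test_idx), test_positive)
```

With real data, `labels` would be the `disease_present` column, and the returned indices would select rows for training and evaluation. Repeating the split with different `seed` values is one simple way to check how stable a model's scores are.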
Dataset Name Suggestions
- Plant Disease Environmental Predictor
- Synthetic Crop Health Factors
- Fungal Risk Classification Dataset
- Agricultural Condition Simulation
Attributes
Original Data Source: Plant Disease Environmental Predictor
