Blueberry Pollination Impact Dataset
Data Science and Analytics
Free
About
This dataset provides simulated data for predicting wild blueberry yield with machine learning. It supports agricultural research on crop yield prediction, where a shortage of high-quality training data is often the limiting factor. The data was generated by the Wild Blueberry Pollination Model, a spatially explicit simulation model validated against field observations and experimental data collected in Maine, USA, over the last 30 years. It can be used to study how plant spatial traits, bee species composition, and weather conditions influence blueberry production.
Columns
- clonesize: The size of the blueberry plant clone.
- honeybee: A measurement representing the presence or activity of honeybees.
- bumbles: A measurement representing the presence or activity of bumblebees.
- andrena: A measurement representing the presence or activity of Andrena bees.
- osmia: A measurement representing the presence or activity of Osmia bees.
- MaxOfUpperTRange: The maximum temperature recorded in the upper temperature range.
- MinOfUpperTRange: The minimum temperature recorded in the upper temperature range.
- AverageOfUpperTRange: The average temperature recorded in the upper temperature range.
- MaxOfLowerTRange: The maximum temperature recorded in the lower temperature range.
- MinOfLowerTRange: The minimum temperature recorded in the lower temperature range.
- AverageOfLowerTRange: The average temperature recorded in the lower temperature range.
- RainingDays: The number of days on which rainfall occurred.
- AverageRainingDays: The average number of raining days.
- fruitset: The percentage of successful fruit setting.
- fruitmass: The mass of individual blueberry fruits.
- seeds: The number of seeds found per fruit.
- yield: The predicted or observed wild blueberry crop yield.
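As a minimal sketch of how the column list above might be used, the following reads a CSV with Python's standard `csv` module, checks that every listed column is present in the header, and converts the values to floats. The helper name `load_rows` is hypothetical, and the in-memory sample holds placeholder values rather than real records from the dataset.

```python
import csv
import io

# Column names exactly as listed in the Columns section.
EXPECTED_COLUMNS = [
    "clonesize", "honeybee", "bumbles", "andrena", "osmia",
    "MaxOfUpperTRange", "MinOfUpperTRange", "AverageOfUpperTRange",
    "MaxOfLowerTRange", "MinOfLowerTRange", "AverageOfLowerTRange",
    "RainingDays", "AverageRainingDays",
    "fruitset", "fruitmass", "seeds", "yield",
]


def load_rows(handle):
    """Read CSV rows from an open text handle and convert listed columns to float.

    Raises ValueError if any listed column is missing from the header.
    Extra columns (for example a row index) are ignored.
    """
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return [{name: float(row[name]) for name in EXPECTED_COLUMNS} for row in reader]


# Tiny in-memory stand-in for the real file. The values are placeholders,
# not taken from the dataset.
sample = io.StringIO(
    ",".join(EXPECTED_COLUMNS) + "\n"
    + ",".join(["1.0"] * len(EXPECTED_COLUMNS)) + "\n"
)
rows = load_rows(sample)
print(len(rows), "row(s) loaded")
```

With the downloaded file, the same function could be called on a handle opened with `open(path, newline="")`, where `path` is wherever the CSV was saved. Because the listing does not state the row count, `len(rows)` is also a quick way to find it.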
Distribution
The dataset is provided as tabular data in CSV format. The current version is approximately 427.73 kB in size. The listing does not state the number of rows; the records characterise various environmental and biological factors.
Usage
This dataset is ideally suited for developing and testing machine learning models aimed at predicting wild blueberry yields. It can be used for:
- Training and experimenting with diverse machine learning algorithms for crop yield forecasting.
- Analysing the impact of different features, such as plant traits, bee species, and weather conditions, on agricultural output.
- Feature selection and engineering for predictive modelling in agricultural science.
- Providing input for crop yield prediction models, particularly for those interested in the potential of combining real data responses with computer simulation modelling.
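The feature-analysis use above can be started with a simple screening step. The sketch below ranks features by the absolute Pearson correlation with `yield`, using only the standard library. The function names `pearson` and `rank_features` are hypothetical, the toy rows are invented placeholders rather than dataset values, and Pearson correlation is just one possible screening choice that captures linear association only.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant column
    return cov / math.sqrt(var_x * var_y)


def rank_features(rows, target="yield"):
    """Return (feature, correlation) pairs, strongest absolute correlation first."""
    ys = [row[target] for row in rows]
    features = [name for name in rows[0] if name != target]
    scores = {name: pearson([row[name] for row in rows], ys) for name in features}

    def sort_key(item):
        score = item[1]
        # Undefined (NaN) scores are placed after all defined ones.
        return (1, 0.0) if math.isnan(score) else (0, -abs(score))

    return sorted(scores.items(), key=sort_key)


# Toy rows with placeholder values (NOT from the dataset), using two of the
# listed column names plus the target.
toy_rows = [
    {"honeybee": 0.25, "RainingDays": 16.0, "yield": 4000.0},
    {"honeybee": 0.50, "RainingDays": 1.0, "yield": 6000.0},
    {"honeybee": 0.75, "RainingDays": 24.0, "yield": 8000.0},
]
ranked = rank_features(toy_rows)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

A ranking like this is only a first pass. Whether columns such as fruitset, fruitmass, and seeds should be treated as predictors or as additional outcomes is a modelling decision that the listing does not settle.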
Coverage
The data is relevant to Maine, USA. It was generated by a simulation model validated using field observations and experimental data collected over the last 30 years. Because the data concerns agricultural yield, no demographic scope applies.
License
Attribution 4.0 International (CC BY 4.0)
Who Can Use It
This dataset is intended for researchers who have their own field observation data, as well as those who want to experiment with machine learning algorithms for crop yield prediction. It is also useful to academics and practitioners working on agricultural forecasting, ecological modelling, and data science applications in environmental contexts.
Dataset Name Suggestions
- Wild Blueberry Yield Prediction Data
- Blueberry Pollination Impact Dataset
- Agricultural Yield Simulation Data
- Machine Learning Blueberry Dataset
- Pollinator and Weather Impact on Blueberry Yield
Attributes
Original Data Source: Blueberry Pollination Impact Dataset