Cereal Morphology Dataset
Data Science and Analytics
Free
About
The data consists of measurements of wheat kernels belonging to the Kama, Rosa, and Canadian varieties. It is designed for tasks such as classification and cluster analysis, relying on seven distinct geometric parameters derived from the kernels. All seven measured attributes are real-valued and continuous.
Columns
The dataset includes 8 columns detailing the physical structure of the kernels:
- area (A): The measurable area of the kernel.
- perimeter (P): The measurement around the boundary of the kernel.
- compactness (C): A calculated geometric ratio defined as $4\pi A/P^2$.
- length: The length of the kernel.
- width: The width of the kernel.
- asymmetry coefficient: A parameter describing the geometric asymmetry of the kernel shape.
- groove length: The measured length of the kernel groove.
- category: The labelled variety of the wheat (Kama, Rosa, or Canadian), used for classification tasks.
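The compactness definition above can be checked with a few lines of Python. This is a minimal sketch: the function name `compactness` and the example shapes are illustrative assumptions, and the numbers are not taken from the dataset.

```python
import math


def compactness(area: float, perimeter: float) -> float:
    """Return C = 4*pi*A / P^2, the ratio given in the column description."""
    if perimeter <= 0:
        raise ValueError("perimeter must be positive")
    return 4 * math.pi * area / perimeter ** 2


# Illustrative shapes (not dataset values).
# A circle of radius 2 has area 4*pi and perimeter 4*pi, so C is 1.
circle_c = compactness(math.pi * 2 ** 2, 2 * math.pi * 2)

# A unit square has area 1 and perimeter 4, so C is pi/4.
square_c = compactness(1.0, 4.0)
```

Under this definition a circle scores 1 and less round outlines score lower, which is why the ratio is useful as a shape descriptor alongside area and perimeter.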
Distribution
The data is supplied as a CSV file named wheat.csv, approximately 10.02 kB in size. It contains 210 records in total, 70 for each of the three wheat varieties. Data integrity is high: 100% of records are valid, with no missing or mismatched values in any of the 8 columns. The data is expected to be updated annually.
Usage
This dataset is suitable for projects involving machine learning and statistical analysis. Ideal applications include:
- Training classification algorithms to automatically distinguish between the three defined wheat varieties.
- Applying cluster analysis to understand natural groupings based on the physical features.
- Serving as a starting point for beginner-level data science practice and exploratory data analysis.
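As a rough illustration of the classification use case, the sketch below implements a nearest-centroid classifier using only the Python standard library. The technique is chosen here for brevity and is not prescribed by the dataset; the function names and the two-feature toy rows (area, perimeter) are made up for illustration and are not values from wheat.csv.

```python
import math
from collections import defaultdict


def fit_centroids(features, labels):
    """Average the feature vectors of each class to get one centroid per class."""
    sums = {}
    counts = defaultdict(int)
    for row, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(row)
        for i, value in enumerate(row):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict(centroids, row):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], row))


# Toy rows of (area, perimeter); invented for this example.
toy_features = [
    [15.0, 14.5], [14.5, 14.2],   # Kama
    [19.0, 16.3], [18.5, 16.0],   # Rosa
    [12.0, 13.2], [11.8, 13.1],   # Canadian
]
toy_labels = ["Kama", "Kama", "Rosa", "Rosa", "Canadian", "Canadian"]

centroids = fit_centroids(toy_features, toy_labels)
large_kernel = predict(centroids, [18.8, 16.2])
small_kernel = predict(centroids, [12.1, 13.3])
```

With the real file, the same functions could be applied to all seven numeric columns, ideally after scaling them to comparable ranges so that no single measurement dominates the distance.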
Coverage
The data scope is strictly limited to the physical, geometric characteristics of the Kama, Rosa, and Canadian wheat kernel varieties. There is uniform representation, with 70 elements for each variety. The specific geographic origin or time frame of the measurements is not specified in the data notes.
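The stated balance and completeness can be verified after download with a short check like the one below. It is a sketch under assumptions: the header names in `SAMPLE` (other than `category`, which the column list mentions) and all sample values are hypothetical, and the inline text stands in for the contents of wheat.csv.

```python
import csv
import io
from collections import Counter


def summarize(csv_text, label_column="category"):
    """Count records per label and count empty or absent cells."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter()
    missing = 0
    for row in reader:
        counts[row[label_column]] += 1
        for name in reader.fieldnames:
            value = row[name]
            # DictReader fills absent trailing fields with None.
            if value is None or value.strip() == "":
                missing += 1
    return counts, missing


# Hypothetical stand-in for the file contents; values are invented.
SAMPLE = """area,perimeter,category
15.0,14.5,Kama
14.5,14.2,Kama
19.0,16.3,Rosa
18.5,16.0,Rosa
12.0,13.2,Canadian
11.8,13.1,Canadian
"""

counts, missing = summarize(SAMPLE)
is_balanced = len(set(counts.values())) == 1
```

For the real file, one would read the text with `open("wheat.csv", newline="")` and expect 70 records per variety, 210 in total, and zero missing cells if the description above holds.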
License
CC0: Public Domain
Who Can Use It
- Data Scientists: To benchmark classification models on a well-structured, clean dataset.
- Agricultural Scientists: To quantitatively study physical differences between common wheat varieties.
- Students: For learning foundational techniques in data cleaning, classification, and statistical modelling.
Dataset Name Suggestions
- Wheat Kernel Geometric Parameters
- Kama, Rosa, Canadian Wheat Classification Data
- Cereal Morphology Dataset
- Wheat Variety Attribute Data
Attributes
Original Data Source: Cereal Morphology Dataset
