Opendatabay APP

Almond Image Feature Classification Data

Data Science and Analytics

Tags and Keywords

Almond

Classification

Food

Image

Features


Free

About

This data product focuses on the classification of almond varieties. Each almond type has distinct physical characteristics, so reliable identification matters for practical applications such as sorting and quality control. The features were extracted directly from almond images using image processing: converting the image from the RGB to the HSV colour space, creating a mask tuned to the brown colour of the almonds, detecting edges with the Canny method, and then finding and drawing contours. The dataset is a foundation for building models that classify almond types purely from morphological and derived geometric properties.

Columns

The dataset contains 14 columns; the 13 described below cover physical and calculated attributes:
  • Length (major axis): The maximum length of the almond, measured in pixels within the image.
  • Width (minor axis): The maximum width of the almond, measured in pixels within the image.
  • Thickness (depth): The depth or thickness of the almond, measured in pixels. Missing values in this column, as in Width and Roundness, may indicate that the almond was not laid flat when photographed (it was upright, on its side, or on its back).
  • Area: The size of the almond region detected in the image.
  • Perimeter: The total calculated length of the boundary surrounding the almond.
  • Roundness: A measurement of how round the almond is, calculated as 4 * Area / (π * Length**2).
  • Solidity: A measure of convexity, calculated as Area / area_hull.
  • Compactness: A shape descriptor, calculated as Perimeter**2 / (4 * π * Area).
  • Aspect Ratio: The ratio of the Length to the Width.
  • Eccentricity: A measure of how much the almond deviates from a circle, calculated as sqrt(1 - (Width / Length)**2).
  • Extent: Calculated as Area / area_bbox (the area of the bounding box).
  • Convex hull (convex area): The area of the convex hull, the smallest convex set that encloses all boundary points of the almond. This is presumably the area_hull term used in Solidity.
  • Type: The classified variety of the almond, such as SANORA, MAMRA, or other types.
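The derived columns above follow directly from the basic measurements, so they can be recomputed or checked. The sketch below implements the listed formulas in plain Python; the function name `derived_features`, the snake_case keys, and the use of `None` for a missing input are illustrative choices, not part of the dataset.

```python
import math


def derived_features(length, width, area, perimeter, area_hull, area_bbox):
    """Compute the listing's derived shape features; None marks a missing input."""

    def safe(fn, *inputs):
        # Propagate a missing input as a missing result instead of raising.
        if any(value is None for value in inputs):
            return None
        return fn(*inputs)

    return {
        "roundness": safe(lambda a, l: 4 * a / (math.pi * l ** 2), area, length),
        "solidity": safe(lambda a, h: a / h, area, area_hull),
        "compactness": safe(lambda p, a: p ** 2 / (4 * math.pi * a), perimeter, area),
        "aspect_ratio": safe(lambda l, w: l / w, length, width),
        # Assumes width <= length (Length is the major axis); otherwise sqrt fails.
        "eccentricity": safe(lambda l, w: math.sqrt(1 - (w / l) ** 2), length, width),
        "extent": safe(lambda a, b: a / b, area, area_bbox),
    }
```

As a sanity check, a perfect circle of diameter 10 (area 25π, perimeter 10π) gives roundness, compactness, and aspect ratio of 1 and eccentricity of 0. Passing `width=None` leaves roundness computable but makes aspect ratio and eccentricity missing.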

Distribution

The data is provided as a CSV file named Almond.csv, approximately 472.26 kB in size. It contains 14 columns and 2803 records in total. The almond types are fairly evenly distributed: SANORA makes up 34%, MAMRA 33%, and other varieties the remaining 33%. Several columns contain a substantial share of null values: Width (31% missing), Thickness (34% missing), Roundness (31% missing), Aspect Ratio (64% missing), and Eccentricity (64% missing). Aspect Ratio and Eccentricity each depend on both Length and Width, so they are null whenever either input is missing.
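A quick way to confirm the missing-value shares quoted above is to count empty cells per column. The sketch below uses only the standard library. The sample rows are made up for illustration and are not taken from Almond.csv, and the column headers and the assumption that missing values appear as empty cells may not match the real file.

```python
import csv
import io

# Made-up rows for illustration only; not taken from Almond.csv.
SAMPLE_CSV = """Length,Width,Thickness,Type
300.1,220.5,,MAMRA
,210.0,120.3,SANORA
310.7,,,SANORA
295.2,215.8,118.9,MAMRA
"""


def missing_share(csv_text):
    """Return {column: fraction of rows whose cell is empty} (hypothetical helper)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return {}
    return {
        column: sum(1 for row in rows if not (row[column] or "").strip()) / len(rows)
        for column in rows[0]
    }
```

On the sample above, `missing_share(SAMPLE_CSV)` reports 0.25 for Length and Width, 0.5 for Thickness, and 0.0 for Type. To audit the real file, the same function could be given the text of Almond.csv read from disk.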

Usage

Ideal applications for this dataset include machine learning classification problems aimed at identifying food varieties. It is suitable for projects focusing on feature engineering using geometric morphology, and for those learning fundamental data cleaning techniques due to the presence of missing values. It can also be applied in nutrition science contexts or agricultural technology development for automated quality control and sorting systems.

Coverage

The source does not specify the geographic origin of the almonds photographed or the time range over which the images were collected.

License

CC0: Public Domain

Who Can Use It

This data is excellent for beginners entering the data science field, providing a real-world classification challenge. It is also valuable for experienced data practitioners who are developing image-based classification algorithms for quality inspection of agricultural products, and researchers interested in the quantitative assessment of nut shape characteristics.

Dataset Name Suggestions

  • Almond Image Feature Classification Data
  • Nut Variety Morphological Metrics
  • Agricultural Product Quality Control Features
  • Image-Derived Almond Dimensions

Attributes

Listing Stats

VIEWS

0

DOWNLOADS

0

LISTED

07/10/2025

REGION

GLOBAL

UNIVERSAL DATA QUALITY SCORE (UDQS)

5 / 5

VERSION

1.0
