Orange vs. Grapefruit Classification Data

Food & Beverage Consumption

Tags and Keywords

Classification, Fruit, Orange, Grapefruit, Dataset

Free

About

This dataset is designed for binary classification: distinguishing oranges from grapefruits [1, 2]. Humans can tell the two fruits apart easily; the dataset poses the same task in a structured form suitable for computational analysis [1]. It contains a wide variety of generated values for diameter, weight, and colour, derived from the average characteristics of oranges and grapefruits [1], which makes it a convenient and engaging resource for teaching binary classification [2]. The data is mostly fictional: the samples were generated artificially from an initial set of fruit measurements [1].

Columns

  • name: This column serves as the label, indicating whether the fruit is an 'orange' or a 'grapefruit' [2]. It has two unique values and consists of 10,000 valid entries [2].
  • diameter: Represents the diameter of the citrus fruit, measured in centimetres [2]. The values range from approximately 2.96 to 16.4 centimetres, with an average of 9.98 cm [3].
  • weight: Indicates the weight of the citrus fruit in grams [3]. These values span from roughly 86.76 to 262 grams, with a mean of 175 grams [3, 4].
  • red: Displays the average red reading from an RGB scan, with values ranging from 0 to 255 [4]. The mean red value is 154 [5].
  • green: Shows the average green reading from an RGB scan, with values from 0 to 255 [5]. The mean green value is 76 [5].
  • blue: Presents the average blue reading from an RGB scan, with values between 0 and 255 [6]. The mean blue value is 11.4 [6].
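The column descriptions above can be turned into a small typed loader. The following is a minimal sketch, assuming the CSV has a header row whose names match the six columns listed (name, diameter, weight, red, green, blue). The `Fruit` class, the `parse_fruits` and `summarise` helpers, and the two sample rows are hypothetical and made up for illustration; they are not taken from the dataset.

```python
import csv
import io
import statistics
from dataclasses import dataclass


@dataclass
class Fruit:
    name: str        # label: 'orange' or 'grapefruit'
    diameter: float  # centimetres
    weight: float    # grams
    red: int         # average RGB readings, 0 to 255
    green: int
    blue: int


def parse_fruits(csv_text):
    """Parse CSV text with the six listed columns into Fruit records."""
    fruits = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        fruits.append(Fruit(
            name=record["name"].strip(),
            diameter=float(record["diameter"]),
            weight=float(record["weight"]),
            # int(float(...)) tolerates values written as "170" or "170.0"
            red=int(float(record["red"])),
            green=int(float(record["green"])),
            blue=int(float(record["blue"])),
        ))
    return fruits


def summarise(fruits, column):
    """Return (min, max, mean) for one numeric column."""
    values = [getattr(fruit, column) for fruit in fruits]
    return min(values), max(values), statistics.fmean(values)


# Made-up rows for illustration only; not values from the dataset.
SAMPLE = """name,diameter,weight,red,green,blue
orange,7.5,140.0,170,85,10
grapefruit,12.5,210.0,150,70,12
"""

fruits = parse_fruits(SAMPLE)
print(summarise(fruits, "diameter"))
```

To run this against the downloaded file, read its contents into a string and pass that to `parse_fruits`. The summaries can then be compared with the ranges and means quoted in the column descriptions.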

Distribution

The dataset is supplied in CSV (comma-separated values) format [2, 7]. It contains 10,000 records (rows) [2-6]. All six columns are fully populated, with 100% valid data and no mismatched or missing entries [2-6].
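The completeness claim can be checked after download. Here is a minimal sketch, assuming a header row with the six column names in the order listed above. The `quality_report` function and the sample rows, which are deliberately flawed and made up, are hypothetical. "Mismatched" is interpreted here as a non-numeric value in a numeric column, which may differ from the definition the listing uses.

```python
import csv
import io

EXPECTED_COLUMNS = ["name", "diameter", "weight", "red", "green", "blue"]
NUMERIC_COLUMNS = EXPECTED_COLUMNS[1:]


def quality_report(csv_text):
    """Count rows, empty cells, and non-numeric cells in numeric columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    report = {"rows": 0, "missing": 0, "mismatched": 0}
    for record in reader:
        report["rows"] += 1
        for column in EXPECTED_COLUMNS:
            # DictReader fills absent trailing cells with None
            value = (record[column] or "").strip()
            if not value:
                report["missing"] += 1
            elif column in NUMERIC_COLUMNS:
                try:
                    float(value)
                except ValueError:
                    report["mismatched"] += 1
    return report


# Made-up rows with one empty cell and one non-numeric cell.
SAMPLE = """name,diameter,weight,red,green,blue
orange,7.5,140.2,170,85,10
grapefruit,12.1,,150,70,12
orange,abc,150.0,160,80,9
"""

report = quality_report(SAMPLE)
print(report)
```

If the listing's description holds, running the same function on the full file should report 10,000 rows with zero missing and zero mismatched cells.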

Usage

This dataset is well-suited for developing and testing binary classification algorithms [2]. Its primary use is in educational settings to teach machine learning principles, particularly supervised learning for classification problems [2]. Specific applications include:
  • Building predictive models to distinguish between oranges and grapefruits based on their physical attributes and colour [1].
  • Providing a clean dataset for practising data preprocessing, feature engineering, and model training.
  • Exploring various data visualisation techniques for multi-dimensional data.
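As an illustration of the first application, the sketch below builds a nearest-centroid classifier over standardised features using only the Python standard library. This is one simple choice among many, not a method prescribed by the dataset. The `fit` and `predict` functions and the training rows are hypothetical, and the numbers are made up rather than drawn from the data. Features are ordered as diameter, weight, red, green, blue.

```python
import math
import statistics


def fit(rows):
    """rows: list of (label, features). Returns (scale, centroids)."""
    columns = list(zip(*(features for _, features in rows)))
    means = [statistics.fmean(column) for column in columns]
    # Fall back to 1.0 so a constant column does not divide by zero
    stdevs = [statistics.pstdev(column) or 1.0 for column in columns]

    def scale(features):
        # z-score each feature so weight (grams) does not dominate
        return [(x - m) / s for x, m, s in zip(features, means, stdevs)]

    centroids = {}
    for label in sorted({label for label, _ in rows}):
        scaled = [scale(f) for row_label, f in rows if row_label == label]
        centroids[label] = [statistics.fmean(c) for c in zip(*scaled)]
    return scale, centroids


def predict(model, features):
    """Return the label whose centroid is nearest in scaled space."""
    scale, centroids = model
    point = scale(features)
    return min(centroids, key=lambda label: math.dist(point, centroids[label]))


# Made-up training rows: (label, [diameter, weight, red, green, blue]).
TRAIN = [
    ("orange", [7.0, 130.0, 165, 85, 5]),
    ("orange", [8.0, 145.0, 160, 80, 8]),
    ("grapefruit", [12.0, 210.0, 150, 70, 12]),
    ("grapefruit", [13.0, 225.0, 145, 65, 15]),
]

model = fit(TRAIN)
print(predict(model, [7.5, 138.0, 162, 82, 6]))
print(predict(model, [12.5, 220.0, 148, 68, 14]))
```

In practice the 10,000 rows would be split into training and held-out sets, and accuracy measured on the held-out part. A library such as scikit-learn would be the more usual tool for comparing several classifiers.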

Coverage

The dataset is predominantly fictional and artificially generated, based on the typical characteristics of oranges and grapefruits [1]. Consequently, it does not possess real-world geographic, time-based, or demographic coverage [1]. It is intended as a synthetic dataset for educational and model development purposes [1, 2].

License

CC0: Public Domain

Who Can Use It

  • Students and educators: For learning and teaching machine learning concepts, especially binary classification [2].
  • Data scientists and analysts: For prototyping and experimenting with classification algorithms without the need for real-world data collection.
  • Researchers: As a straightforward, clean dataset for demonstrating new classification methodologies.

Dataset Name Suggestions

  • Citrus Fruit Classifier Dataset
  • Orange vs. Grapefruit Classification Data
  • Fruit Attributes for ML
  • Binary Citrus Identification
  • Simulated Fruit Data

Attributes

Listing Stats

  • Views: 0
  • Downloads: 0
  • Listed: 25/07/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
