Opendatabay APP

Wine Quality Machine Learning Model Input

Data Science and Analytics

Tags and Keywords

  • Wine
  • Quality
  • Chemistry
  • Alcohol
  • Regression

Free

About

This collection captures physicochemical measurements and a sensory quality score for red wine samples from the Portuguese "Vinho Verde" region. Its primary purpose is predictive modelling: users can forecast a wine's quality score from its chemical makeup. The data suits classification, regression, feature selection, and outlier detection tasks.

Columns (11 Features and 1 Target)

  • fixed acidity: Measures the amount of non-volatile acids present in the wine.
  • volatile acidity: Measures the amount of volatile acids in the sample.
  • citric acid: Details the concentration of citric acid in the wine.
  • residual sugar: Represents the level of sugar remaining after the fermentation process.
  • chlorides: The concentration of chlorides (salts) in the wine.
  • free sulfur dioxide: The quantity of free sulfur dioxide available.
  • total sulfur dioxide: The combined amount of free and bound sulfur dioxide in the sample.
  • density: The density of the wine.
  • pH: The pH value of the wine, where lower values indicate higher acidity.
  • sulphates: The concentration of sulphates in the wine.
  • alcohol: The measured alcohol content.
  • quality: The target variable, a sensory quality score on a scale from 0 to 10.
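
As a minimal sketch of how the column list above could be used in practice, the snippet below validates a CSV header against the twelve listed names and separates the features from the target. The helper name `split_columns` and the constants are hypothetical, introduced only for illustration, and the exact spelling of the headers in the downloaded file is an assumption based on this listing.

```python
# Column names as given in the listing; "quality" is the prediction target.
# Assumption: the CSV header uses exactly these spellings.
EXPECTED_COLUMNS = [
    "fixed acidity", "volatile acidity", "citric acid", "residual sugar",
    "chlorides", "free sulfur dioxide", "total sulfur dioxide", "density",
    "pH", "sulphates", "alcohol", "quality",
]
TARGET = "quality"


def split_columns(header):
    """Validate a CSV header and return (feature_names, target_name)."""
    cleaned = [name.strip() for name in header]
    missing = [c for c in EXPECTED_COLUMNS if c not in cleaned]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    # Every column other than the target is treated as a feature.
    features = [c for c in cleaned if c != TARGET]
    return features, TARGET


features, target = split_columns(EXPECTED_COLUMNS)
print(len(features), target)  # 11 quality
```

Failing early on a mismatched header is a deliberate choice here: a renamed or missing column would otherwise surface later as a confusing modelling error.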

Distribution

The data is provided in tabular form as a CSV file (red_wine_quality.csv). The red wine subset contains 12 columns (11 physicochemical features plus the quality target) and approximately 1,599 records. The dataset is static, meaning no further updates are expected.
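
The following sketch shows one way to read such a file with Python's standard `csv` module and confirm its shape. The function name `load_wine_csv` is hypothetical, the comma delimiter is an assumption (some copies of wine quality data use semicolons), and the two sample rows are made-up values used only so the example runs without the download.

```python
import csv
import io


def load_wine_csv(fileobj, delimiter=","):
    """Parse an open CSV file into (header, rows), converting values to float."""
    reader = csv.reader(fileobj, delimiter=delimiter)
    header = [name.strip() for name in next(reader)]
    rows = []
    for line_number, record in enumerate(reader, start=2):
        if not record:  # skip blank lines
            continue
        if len(record) != len(header):
            raise ValueError(f"line {line_number}: expected {len(header)} fields")
        rows.append([float(value) for value in record])
    return header, rows


# In-memory stand-in with made-up values, not real records from the dataset.
sample = io.StringIO(
    "fixed acidity,volatile acidity,citric acid,residual sugar,chlorides,"
    "free sulfur dioxide,total sulfur dioxide,density,pH,sulphates,alcohol,quality\n"
    "7.0,0.60,0.10,2.0,0.08,12,40,0.997,3.4,0.55,9.5,5\n"
    "8.0,0.40,0.30,2.5,0.07,15,50,0.996,3.3,0.65,10.5,6\n"
)
header, rows = load_wine_csv(sample)
print(len(rows), len(header))  # 2 12
```

With the real file, the same function could be called as `load_wine_csv(open("red_wine_quality.csv", newline=""))`; the printed shape should then show 12 columns and roughly 1,599 rows if the file matches this description.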

Usage

This data product is ideally suited for:
  • Training machine learning models to predict wine quality (regression or classification tasks).
  • Conducting feature selection analysis to determine the most influential chemical properties impacting quality.
  • Performing exploratory data analysis to understand the relationships between various physicochemical characteristics.
  • Detecting anomalies or outliers in wine chemistry.
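
Two of the uses listed above can be sketched with the standard library alone. The example ranks features by the absolute value of their Pearson correlation with quality, which is only a rough proxy for influence, and flags outliers with the 1.5 × IQR rule, which is one common heuristic among many. Neither technique is prescribed by the dataset; the function names are hypothetical and the toy values are made up rather than taken from the data.

```python
import math
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant column: correlation is undefined, report 0
    return cov / math.sqrt(var_x * var_y)


def rank_features(columns, target="quality"):
    """Sort features by absolute correlation with the target, strongest first."""
    scores = {
        name: pearson(values, columns[target])
        for name, values in columns.items()
        if name != target
    }
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


def iqr_outliers(values, k=1.5):
    """Return indices of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [i for i, v in enumerate(values) if v < low or v > high]


# Made-up toy values for illustration only.
toy = {
    "alcohol": [9.0, 9.5, 10.0, 10.5, 11.0, 11.5],
    "volatile acidity": [0.70, 0.65, 0.60, 0.50, 0.45, 0.40],
    "chlorides": [0.08, 0.07, 0.08, 0.07, 0.08, 0.60],
    "quality": [4, 5, 5, 6, 6, 7],
}

ranked = rank_features(toy)
for name, score in ranked:
    print(f"{name}: {score:+.3f}")

print(iqr_outliers(toy["chlorides"]))  # [5]
```

Correlation captures only linear, one-feature-at-a-time relationships, so a model-based importance measure may rank the real columns differently.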

Coverage

The dataset focuses exclusively on red wine originating from the Portuguese "Vinho Verde" region. The scope is limited to the chemical and sensory parameters collected at the time of sampling. No specific time range is detailed for the collection period.

License

Attribution 4.0 International (CC BY 4.0)

Who Can Use It

  • Data Scientists: For building and evaluating predictive models related to food and beverage quality control.
  • Academic Researchers: To study the impact of various chemical compounds (e.g., acidity, sulfur) on perceived sensory attributes.
  • Students: For educational purposes, learning standard practices in regression, classification, and statistical analysis using real-world data.

Dataset Name Suggestions

  • Red Wine Quality Predictor Data
  • Vinho Verde Physicochemical Properties
  • Wine Quality Machine Learning Model Input

Attributes

Listing Stats

  • Views: 1
  • Downloads: 0
  • Listed: 23/11/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
