Portuguese Wine Quality Dataset
Data Science and Analytics
About
This dataset provides physico-chemical properties of red and white variants of Portuguese "Vinho Verde" wine, alongside their quality scores. Its primary purpose is to facilitate machine learning tasks such as classification or regression, allowing users to build models that predict wine quality based on various chemical attributes. It offers a valuable resource for understanding the factors influencing wine quality and developing predictive analytical tools.
Columns
- fixed acidity: The amount of non-volatile acids present.
- volatile acidity: The amount of acetic acid, which at high levels can lead to an unpleasant vinegar taste.
- citric acid: A weak organic acid that can add "freshness" and flavour to wines.
- residual sugar: The amount of sugar remaining after fermentation has stopped.
- chlorides: The amount of salt in the wine.
- free sulfur dioxide: The free form of SO2, which exists in equilibrium between molecular SO2 (as an antimicrobial and antioxidant) and bisulphite form.
- total sulfur dioxide: The total amount of SO2, both free and bound forms.
- density: The density of the wine, which is close to that of water and varies with the alcohol and sugar content.
- pH: The level of acidity or alkalinity, a scale from 0 (very acidic) to 14 (very basic); most wines are between 3-4 on the pH scale.
- sulphates: A wine additive that can contribute to SO2 levels.
- alcohol: The alcohol content of the wine.
- quality: A sensory evaluation score for the wine, typically ranging from 0 to 10.
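As a minimal sketch of how the columns above might be read and sanity-checked, the following uses only the Python standard library. The exact header spelling, the comma delimiter, and the helper name `load_wine_rows` are assumptions for illustration; the sample rows are made up and are not taken from the dataset.

```python
import csv
import io

# Column names as listed above; the exact header spelling in the file is an assumption.
EXPECTED_COLUMNS = [
    "fixed acidity", "volatile acidity", "citric acid", "residual sugar",
    "chlorides", "free sulfur dioxide", "total sulfur dioxide", "density",
    "pH", "sulphates", "alcohol", "quality",
]


def load_wine_rows(text, delimiter=","):
    """Parse CSV text into a list of dicts of floats, checking header and quality range."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for record in reader:
        row = {name: float(record[name]) for name in EXPECTED_COLUMNS}
        # The quality score is described as ranging from 0 to 10.
        if not 0 <= row["quality"] <= 10:
            raise ValueError(f"quality out of range: {row['quality']}")
        rows.append(row)
    return rows


# Made-up illustrative rows, NOT real records from the dataset.
SAMPLE = (
    "fixed acidity,volatile acidity,citric acid,residual sugar,chlorides,"
    "free sulfur dioxide,total sulfur dioxide,density,pH,sulphates,alcohol,quality\n"
    "7.0,0.60,0.05,2.0,0.08,12,35,0.9970,3.40,0.55,9.5,5\n"
    "8.1,0.30,0.40,2.2,0.07,15,40,0.9965,3.25,0.70,11.2,7\n"
)

rows = load_wine_rows(SAMPLE)
print(len(rows), "rows parsed; first pH =", rows[0]["pH"])
```

If a copy of the file separates fields with semicolons rather than commas, passing `delimiter=";"` should be enough; to read from disk, the file contents can be passed in as text.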
Distribution
The dataset is typically provided in CSV format. It comprises two distinct datasets, one for red "Vinho Verde" wine and another for white "Vinho Verde" wine. The red wine dataset, winequality-red.csv, is approximately 84.2 kB in size and contains 1599 records, each with 12 attributes. Specific details regarding the number of rows for the white wine dataset are not included in the provided information.
Usage
This dataset is ideal for:
- Developing and testing machine learning models for wine quality prediction.
- Conducting classification tasks to categorise wine into quality tiers.
- Performing regression analysis to predict a continuous wine quality score.
- Educational purposes, particularly for beginners learning about data analysis and machine learning.
- Research into the chemical properties influencing wine characteristics.
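The classification and regression uses listed above can be illustrated with a small sketch in plain Python. The tier thresholds, the helper names `quality_tier` and `fit_line`, and the choice of alcohol as the single predictor are all assumptions for illustration, and the data points are made up rather than drawn from the dataset. A real model would likely use all eleven chemical attributes and a dedicated library.

```python
def quality_tier(score, low_max=4, high_min=7):
    """Map a quality score to a tier. The thresholds are illustrative assumptions."""
    if score <= low_max:
        return "low"
    if score >= high_min:
        return "high"
    return "medium"


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up (alcohol, quality) pairs, NOT real records from the dataset.
alcohol = [9.0, 10.0, 11.0, 12.0]
quality = [5, 5, 6, 7]

# Classification view: bucket each score into a tier.
tiers = [quality_tier(q) for q in quality]

# Regression view: treat the score as continuous and predict it from alcohol.
slope, intercept = fit_line(alcohol, quality)
predicted = slope * 10.5 + intercept

print(tiers)
print(round(slope, 3), round(intercept, 3), round(predicted, 3))
```

The same `quality` column serves both tasks: binned into tiers it becomes a classification target, and left numeric it becomes a regression target.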
Coverage
The dataset focuses on red and white variants of Portuguese "Vinho Verde" wine. It captures chemical attributes and quality scores specific to this region and wine type. The dataset is static, with no expected updates, so it represents a fixed snapshot. No specific time range or demographic scope is detailed beyond the wine's origin.
License
CC0: Public Domain
Who Can Use It
- Data Scientists and Machine Learning Engineers: For building and refining predictive models.
- Students and Educators: As a practical example for learning data science, classification, and regression techniques.
- Wine Industry Researchers: To gain insights into the physico-chemical drivers of wine quality.
- Academics: For studies in areas like chemistry, food science, and statistics.
Dataset Name Suggestions
- Vinho Verde Wine Quality Attributes
- Portuguese Wine Quality Dataset
- Red and White Vinho Verde Wine Properties
- Wine Physico-Chemical Quality Predictor
Attributes
Original Data Source: Portuguese Wine Quality Dataset