Height and Weight Classification Data
Data Science and Analytics
Free
About
This collection of records is intended for newcomers to machine learning and has a simple structure suited to foundational algorithms. It contains two features, height and weight, which are commonly used to predict the sex of an individual. The set is clean and small enough for exercises in predictive modelling and outlier detection.
Columns
There are three principal data fields available:
- Gender: The categorical label for prediction, containing only two unique values: male or female.
- Height: Measured in inches. The values range from a minimum of 54.3 to a maximum of 79 inches, with a mean of 66.4.
- Weight: Unit not specified (likely pounds). The values range from a minimum of 64.7 to a maximum of 270, with a mean of 161.
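The ranges and means listed above can be re-derived from the file with a few lines of standard-library Python. The following is a minimal sketch, not an official loader: it assumes the CSV header row uses the names Gender, Height and Weight exactly as in the list above, and the inline sample rows are made up for illustration rather than taken from the real file.

```python
import csv
import io
import statistics

# Made-up rows for illustration only; they are NOT taken from the real file.
# The header names are an assumption based on the column list.
SAMPLE_CSV = """Gender,Height,Weight
Male,70.0,180.0
Male,68.0,170.0
Female,64.0,130.0
Female,62.0,120.0
"""


def summarize(csv_file, column):
    """Return (min, max, mean) for one numeric column of an open CSV file."""
    values = [float(row[column]) for row in csv.DictReader(csv_file)]
    return min(values), max(values), statistics.fmean(values)


height_summary = summarize(io.StringIO(SAMPLE_CSV), "Height")
weight_summary = summarize(io.StringIO(SAMPLE_CSV), "Weight")
print("Height (min, max, mean):", height_summary)
print("Weight (min, max, mean):", weight_summary)
```

To run the same check against the real data, pass `open("weight-height.csv", newline="")` in place of the `io.StringIO` object, adjusting the column names if the actual header differs.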
Distribution
The collection contains 10,000 valid records in a single CSV file, weight-height.csv, of approximately 428.12 kB. None of the three columns has missing or mismatched values, so the data can be used without cleaning. The dataset is expected to be updated annually.
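The record count and the absence of missing or mismatched values are easy to confirm before use. The sketch below is one possible check under stated assumptions: the header names (Gender, Height, Weight) are inferred from the column list, `check_records` is a hypothetical helper, and the sample rows are invented and deliberately contain two bad cells so the check has something to report.

```python
import csv
import io
from collections import Counter

# Invented rows with two deliberately broken cells; not real data.
SAMPLE_CSV = """Gender,Height,Weight
Male,70.0,180.0
Female,64.0,130.0
Female,,120.0
Male,68.0,abc
"""


def check_records(csv_file, numeric_columns=("Height", "Weight"), label_column="Gender"):
    """Count rows, tally labels, and list (line_number, column) for bad numeric cells."""
    total = 0
    labels = Counter()
    problems = []
    # Data rows start on line 2 because line 1 is the header.
    for line_number, row in enumerate(csv.DictReader(csv_file), start=2):
        total += 1
        labels[row[label_column]] += 1
        for column in numeric_columns:
            try:
                float(row[column])
            except (TypeError, ValueError):
                # Empty, absent, or non-numeric cell.
                problems.append((line_number, column))
    return total, labels, problems


total, labels, problems = check_records(io.StringIO(SAMPLE_CSV))
print("rows:", total)
print("labels:", dict(labels))
print("problem cells:", problems)
```

On the real file, the description above implies a total of 10,000 rows and an empty problem list.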
Usage
The data suits basic binary classification: users can train a model to predict sex from height and weight. It also works well for descriptive statistics, other statistical analyses, and visual identification of outliers in the height and weight measurements.
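As an illustration of the classification use case, here is a minimal nearest-centroid classifier written with the standard library only. It is a sketch of one simple technique, not a recommended or official method for this dataset. The training rows are synthetic: the class means and spreads in `make_synthetic_rows` are invented for the example and are not statistics from the real file, so the printed accuracy says nothing about real-world performance.

```python
import random
import statistics


def make_synthetic_rows(n_per_class=200, seed=0):
    """Generate made-up (label, height, weight) rows; means and spreads are invented."""
    rng = random.Random(seed)
    rows = []
    for label, height_mean, weight_mean in (("Male", 70.0, 185.0), ("Female", 64.0, 135.0)):
        for _ in range(n_per_class):
            rows.append((label, rng.gauss(height_mean, 2.5), rng.gauss(weight_mean, 15.0)))
    rng.shuffle(rows)
    return rows


def standardize(height, weight, scale):
    """Rescale both features so neither dominates the distance."""
    h_mean, h_std, w_mean, w_std = scale
    return (height - h_mean) / h_std, (weight - w_mean) / w_std


def fit_centroids(rows):
    """Learn the feature scaling and one standardized centroid per label."""
    heights = [h for _, h, _ in rows]
    weights = [w for _, _, w in rows]
    scale = (statistics.fmean(heights), statistics.pstdev(heights),
             statistics.fmean(weights), statistics.pstdev(weights))
    centroids = {}
    for label in {label for label, _, _ in rows}:
        points = [standardize(h, w, scale) for l, h, w in rows if l == label]
        centroids[label] = (statistics.fmean([p[0] for p in points]),
                            statistics.fmean([p[1] for p in points]))
    return scale, centroids


def predict(height, weight, scale, centroids):
    """Return the label whose centroid is closest in standardized space."""
    x, y = standardize(height, weight, scale)
    return min(centroids,
               key=lambda label: (x - centroids[label][0]) ** 2 + (y - centroids[label][1]) ** 2)


rows = make_synthetic_rows()
split = int(0.75 * len(rows))  # 75% train, 25% held out
train, test = rows[:split], rows[split:]
scale, centroids = fit_centroids(train)
correct = sum(predict(h, w, scale, centroids) == label for label, h, w in test)
accuracy = correct / len(test)
print(f"held-out accuracy on synthetic rows: {accuracy:.2f}")
```

Standardizing first matters because weight spans a much wider numeric range than height; without it, the distance would be driven almost entirely by weight. To use the real file, replace `make_synthetic_rows()` with rows read from the CSV.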
Coverage
The scope covers measured height and weight figures paired with the corresponding sex label (male or female). The 'Male' label is reported as the most common value at 50% of entries; with only two labels and no missing values, the two classes are evenly balanced. No geographic or demographic information is included, but the measurements span a broad range of human heights and weights.
License
CC0: Public Domain
Who Can Use It
- Students and Educators: For demonstrating basic supervised machine learning concepts.
- Beginner Data Scientists: Seeking a simple, clean dataset for implementing and testing their first algorithms.
- Data Analysts: For performing quick statistical evaluations and visualisations of correlation and distribution.
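For the quick statistical evaluations mentioned above, the sketch below computes a Pearson correlation between height and weight and flags outliers with the common 1.5 x IQR rule. Both choices are assumptions made for illustration; the listing does not prescribe a particular method. The sample values are made up, with one deliberately extreme height.

```python
import math
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v for v in values if v < low or v > high]


# Made-up measurements for illustration; 95.0 is a deliberately extreme height.
heights = [62.0, 64.0, 66.0, 68.0, 70.0, 72.0, 95.0]
weights = [120.0, 135.0, 150.0, 165.0, 180.0, 195.0, 200.0]

r = pearson(heights, weights)
height_outliers = iqr_outliers(heights)
print(f"correlation: {r:.2f}")
print("height outliers:", height_outliers)
```

The multiplier `k` controls how aggressive the rule is; 1.5 is a conventional default, and a larger value flags fewer points.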
Dataset Name Suggestions
- Height and Weight Classification Data
- Gender Prediction Simple Dataset
- Human Measurement Data (10,000 Records)
- Weight and Height Outlier Detection Sample
Attributes
Original Data Source: Height and Weight Classification Data
