Human Feature Gender Classifier
Education & Learning Analytics
Free
About
This dataset is designed for beginners who want to practice machine learning classification tasks. It offers a straightforward yet realistic scenario on which simple models can yield good results. It was created to help new users gain confidence and move further into machine learning, and it makes a good starting point for classification projects.
Columns
The dataset comprises eight columns: seven features and one label column.
- long_hair: A binary indicator (0 or 1) representing whether an individual has long hair (1) or not (0). This column has 5001 valid entries, with a mean of 0.87.
- forehead_width_cm: The width of the forehead, measured in centimetres from right to left. This column has 5001 valid entries, with a mean of 13.2 cm.
- forehead_height_cm: The height of the forehead, measured in centimetres from where the hair grows to the eyebrows. This column has 5001 valid entries, with a mean of 5.95 cm.
- nose_wide: A binary indicator (0 or 1) denoting whether the nose is wide (1) or not (0). This column has 5001 valid entries, with a mean of 0.49.
- nose_long: A binary indicator (0 or 1) indicating whether the nose is long (1) or not (0). This column has 5001 valid entries, with a mean of 0.51.
- lips_thin: A binary indicator (0 or 1) signifying whether the lips are thin (1) or not (0). This column has 5001 valid entries, with a mean of 0.49.
- distance_nose_to_lip_long: A binary indicator (0 or 1) representing whether the distance between the nose and lips is long (1) or short (0). This column has 5001 valid entries, with a mean of 0.50.
- gender: The target label column, classifying individuals as either "Male" or "Female". This column has 5001 valid entries, split approximately evenly between Male and Female.
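The column descriptions above can be turned into a quick sanity check before any modelling. The following is a minimal sketch using only the Python standard library. The helper names (`load_rows`, `validate_rows`, `label_counts`) are hypothetical, and it assumes the binary columns are stored as the text values "0" and "1" in the CSV, which the listing does not confirm.

```python
import csv
from collections import Counter

# Column groups taken from the listing's column descriptions.
BINARY_COLUMNS = [
    "long_hair",
    "nose_wide",
    "nose_long",
    "lips_thin",
    "distance_nose_to_lip_long",
]
NUMERIC_COLUMNS = ["forehead_width_cm", "forehead_height_cm"]
LABEL_COLUMN = "gender"
LABELS = {"Male", "Female"}


def load_rows(path):
    """Read a CSV file into a list of dicts keyed by column name."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


def validate_rows(rows):
    """Return (row_index, column, value) for every value that breaks the schema."""
    problems = []
    for i, row in enumerate(rows):
        # Assumption: binary indicators appear as the strings "0" or "1".
        for col in BINARY_COLUMNS:
            if row.get(col) not in ("0", "1"):
                problems.append((i, col, row.get(col)))
        # Measurements in centimetres should parse as positive numbers.
        for col in NUMERIC_COLUMNS:
            try:
                if not float(row[col]) > 0:
                    problems.append((i, col, row[col]))
            except (KeyError, TypeError, ValueError):
                problems.append((i, col, row.get(col)))
        if row.get(LABEL_COLUMN) not in LABELS:
            problems.append((i, LABEL_COLUMN, row.get(LABEL_COLUMN)))
    return problems


def label_counts(rows):
    """Count how many rows carry each label, to check class balance."""
    return Counter(row.get(LABEL_COLUMN) for row in rows)
```

A typical use would be `rows = load_rows("gender_classification_v7.csv")`, followed by `validate_rows(rows)` and `label_counts(rows)`. An empty problem list and two roughly equal label counts would match the description above.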
Distribution
The dataset is provided in CSV (Comma Separated Values) format as gender_classification_v7.csv. It has a file size of 128.32 kB and contains 5001 records across 8 columns.
Usage
This dataset is ideal for machine learning classification tasks, particularly for beginners in the field. It is suitable for projects that aim to achieve reliable classification results using a simple yet effective dataset.
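As one possible starting point for such a project, the sketch below builds a simple "one rule" baseline: it picks the single binary feature whose values best predict the label on a training split, then measures accuracy on held-out rows. This is an illustrative baseline chosen for this example, not a method prescribed by the dataset's author. The function names are hypothetical, and rows are assumed to be dicts keyed by column name (for example, as produced by `csv.DictReader`).

```python
import random
from collections import Counter, defaultdict

# Binary feature columns from the listing's column descriptions.
BINARY_FEATURES = [
    "long_hair",
    "nose_wide",
    "nose_long",
    "lips_thin",
    "distance_nose_to_lip_long",
]


def split_rows(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and return (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def fit_one_rule(train, features=BINARY_FEATURES, label="gender"):
    """Pick the feature whose per-value majority label fits the training rows best.

    Returns (feature, {feature_value: predicted_label}, default_label).
    """
    default = Counter(r[label] for r in train).most_common(1)[0][0]
    best = None
    for feature in features:
        # Count labels separately for each value the feature takes.
        counts = defaultdict(Counter)
        for r in train:
            counts[r[feature]][r[label]] += 1
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        correct = sum(c.most_common(1)[0][1] for c in counts.values())
        if best is None or correct > best[0]:
            best = (correct, feature, rule)
    _, feature, rule = best
    return feature, rule, default


def predict(model, row):
    """Predict a label for one row; unseen feature values get the default label."""
    feature, rule, default = model
    return rule.get(row[feature], default)


def accuracy(model, rows, label="gender"):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(model, r) == r[label] for r in rows) / len(rows)
```

Comparing the baseline's held-out accuracy with that of a fuller model (for example, one that also uses the two forehead measurements) gives a sense of how much the extra features contribute.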
Coverage
This dataset consists of made-up (synthetic) data, so it has no specific geographic, temporal, or demographic scope. It is not based on real-world observations of particular groups or years.
License
CC0: Public Domain
Who Can Use It
This dataset is primarily intended for beginners in machine learning who are looking to:
- Try out and solve classification problems.
- Gain experience with a dataset that yields good results.
- Build confidence in their machine learning journey.
Dataset Name Suggestions
- Gender Classification Dataset
- Human Feature Gender Classifier
- Biometric Gender Prediction Data
- Machine Learning Gender Attributes
Attributes
Original Data Source: Human Feature Gender Classifier