Adult Income Classification Data
Government & Civic Records
Free
About
This dataset supports predicting whether an individual's annual income exceeds $50,000, using demographic and employment data derived from the 1994 Census database. It is widely known as the Adult dataset. The original records were filtered to produce a cleaner set; the conditions include that individuals be over 16 years of age and have recorded positive working hours. The task is binary classification: income above $50,000 or not.
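The two filtering conditions and the binary target described above can be sketched in Python. This is a minimal illustration, not the original extraction code: the function names are hypothetical, only the two conditions named above are checked, and the label spellings `>50K` and `<=50K` are an assumption about how the INCOME column is encoded in the file.

```python
def meets_stated_conditions(age: int, hours_per_week: int) -> bool:
    """Return True if a record satisfies the two filters named in the description."""
    return age > 16 and hours_per_week > 0


def is_high_income(income_label: str) -> bool:
    """Map an INCOME value to the binary target (True means above $50K).

    Assumes the labels are spelled '>50K' and '<=50K'; surrounding
    whitespace is ignored and anything else is rejected.
    """
    label = income_label.strip()
    if label not in (">50K", "<=50K"):
        raise ValueError(f"unexpected income label: {income_label!r}")
    return label == ">50K"
```

Rejecting unknown labels rather than silently treating them as low income makes encoding differences in the file visible early.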
Columns
The dataset comprises 15 columns detailing various personal and employment characteristics:
- AGE: The individual's age, spanning from 17 to 90 years old.
- WORKCLASS: The type of employer or employment sector (e.g., Private, Self-emp-not-inc).
- FNLWGT: The final weight, an estimate of the number of people in the population that the sampled record represents.
- EDUCATION: The highest level of education achieved (e.g., HS-grad, Some-college).
- EDUCATION-NUM: A numerical representation of the education level (ranging from 1 to 16).
- MARITAL-STATUS: The individual's marital status (e.g., Married-civ-spouse, Never-married).
- OCCUPATION: The specific field of employment (e.g., Prof-specialty, Craft-repair).
- RELATIONSHIP: The individual's relational status within the household (e.g., Husband, Not-in-family).
- RACE: The racial group of the individual (e.g., White, Black).
- SEX: The recorded gender (Male or Female).
- CAPITAL-GAIN: Income derived from capital gains, with values reaching up to about 100,000.
- CAPITAL-LOSS: Losses incurred from capital transactions, reaching up to 4,356.
- HOURS-PER-WEEK: The number of hours worked per week, with a mean of 40.4 hours.
- NATIVE-COUNTRY: The country of birth or origin.
- INCOME: The target variable, indicating whether the annual income is less than or equal to $50K or greater than $50K.
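As a rough sketch of how the 15 columns listed above might be read in Python using only the standard library: the lowercase, hyphenated header spellings, the `read_records` helper, and the two sample rows are all assumptions made for illustration, and the real header in the CSV may differ.

```python
import csv
import io

# Assumed header spellings for the numeric columns; the actual CSV may differ.
NUMERIC_COLUMNS = ("age", "fnlwgt", "education-num",
                   "capital-gain", "capital-loss", "hours-per-week")


def read_records(handle):
    """Parse CSV text into dicts, converting the numeric columns to int."""
    records = []
    for row in csv.DictReader(handle, skipinitialspace=True):
        record = {key.strip(): value.strip() for key, value in row.items()}
        for column in NUMERIC_COLUMNS:
            record[column] = int(record[column])
        records.append(record)
    return records


# Two made-up rows for illustration only; they are not taken from the file.
SAMPLE = """age,workclass,fnlwgt,education,education-num,marital-status,occupation,relationship,race,sex,capital-gain,capital-loss,hours-per-week,native-country,income
38,Private,200000,HS-grad,9,Married-civ-spouse,Craft-repair,Husband,White,Male,0,0,40,United-States,<=50K
45,Self-emp-not-inc,150000,Some-college,10,Never-married,Prof-specialty,Not-in-family,Black,Female,5000,0,50,United-States,>50K
"""

records = read_records(io.StringIO(SAMPLE))
```

With the real file, a handle from `open("CENSUS_INCOME.csv", newline="")` would take the place of the `io.StringIO` sample, provided the header matches the assumed names.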
Distribution
The data is provided as a CSV file named CENSUS_INCOME.csv, 3.84 MB in size. It contains 15 attributes and approximately 32,600 valid records. The listing reports zero mismatched or missing values in every column.
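The reported row count, column count, and absence of missing values can be checked before use. The sketch below is one way to do that, with a hypothetical `summarize` helper operating on rows already split into fields. Treating `?` as a missing-value placeholder is an assumption: some distributions of the Adult data use it, and whether this file does should be verified.

```python
def summarize(rows, expected_columns=15, missing_markers=("", "?")):
    """Count rows, rows with an unexpected field count, and missing-looking cells.

    `rows` is a list of lists of strings, one inner list per record.
    """
    wrong_width = 0
    missing_cells = 0
    for row in rows:
        if len(row) != expected_columns:
            wrong_width += 1
        missing_cells += sum(1 for cell in row if cell.strip() in missing_markers)
    return {
        "rows": len(rows),
        "wrong_width": wrong_width,
        "missing_cells": missing_cells,
    }
```

If `missing_cells` is nonzero on the real file, the placeholder values are being counted as valid categories in the listing's statistics.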
Usage
This data is well suited to classification tasks in machine learning, particularly training models to predict income levels. It is also useful for researchers investigating socio-economic trends, inequality, and the determinants of wage outcomes, and it serves as a common benchmark for comparing and validating new classification algorithms.
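When the dataset is used as a benchmark, a trivial reference point helps put model scores in context. The sketch below computes the accuracy of always predicting the most frequent label; the function name is hypothetical, and this is a generic baseline rather than anything prescribed by the dataset.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    if not labels:
        raise ValueError("labels must not be empty")
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)
```

Because the two income classes are unlikely to be equally common, a classifier is only informative to the extent that it beats this figure.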
Coverage
The scope is limited to demographic and financial records extracted from the 1994 US Census database. Geographically, it is heavily skewed toward individuals identified as being from the United States, who account for 90% of the observations. Demographically, 85% of individuals are identified as White, and 67% of the records are male.
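Proportions like those quoted above can be recomputed from any categorical column. The following is a minimal sketch with a hypothetical `category_shares` helper; it counts raw records and does not apply the FNLWGT weights.

```python
from collections import Counter


def category_shares(values):
    """Return each category's share of the records as a fraction of the total."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}
```

For example, passing the SEX column's values would give the fraction of records in each category, which can then be compared with the figures stated in this section.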
License
CC0: Public Domain
Who Can Use It
- Data Scientists: For developing and evaluating binary classification models.
- Academic Researchers: For studying the relationships between education, occupation, and financial outcomes in the US.
- Policy Analysts: To model factors contributing to wealth disparity and income brackets.
Dataset Name Suggestions
- Census Income Predictor 1994
- Adult Income Classification Data
- US Demographic Wage Analysis
- 1994 Socio-Economic Records
Attributes
Original Data Source: Adult Income Classification Data
