HR Employee Retention Study

Education & Learning Analytics

Tags and Keywords

Attrition

Employee

HR

Predict

Employment

Free

About

This dataset is designed to help organisations understand and predict employee attrition by identifying the factors that contribute to staff turnover. It supports detailed analysis of employee demographics, job satisfaction, performance, and work-life balance to uncover patterns associated with attrition. The dataset is a fictional collection created by IBM data scientists. It is suited to exploring questions about employee retention, such as how distance from home varies by job role and attrition status, or how monthly income and education relate to attrition.

Columns

The dataset contains 35 columns, providing a rich set of attributes for each employee record:
  • Age: Numerical, representing the employee's age.
  • Attrition: Categorical (Boolean). Indicates if an employee has left the company, with values 'true' (237 instances) or 'false' (1,233 instances).
  • BusinessTravel: Categorical. Describes the frequency of business travel: 'Travel_Rarely' (71%), 'Travel_Frequently' (19%), and 'Non-Travel' (10%).
  • DailyRate: Numerical, representing the daily rate of pay.
  • Department: Categorical. Specifies the employee's department, such as 'Research & Development' (65%) and 'Sales' (30%).
  • DistanceFromHome: Numerical, indicating the distance of the employee's home from work.
  • Education: Categorical. Represents the level of education, with values: 1 'Below College', 2 'College', 3 'Bachelor', 4 'Master', 5 'Doctor'.
  • EducationField: Categorical. The field of study, including 'Life Sciences' (41%) and 'Medical' (32%).
  • EmployeeCount: Numerical (constant value of 1 for all records).
  • EmployeeNumber: Numerical, a unique identifier for each employee.
  • EnvironmentSatisfaction: Categorical. Measures satisfaction with the work environment, with values: 1 'Low', 2 'Medium', 3 'High', 4 'Very High'.
  • Gender: Categorical. 'Male' (60%) or 'Female' (40%).
  • HourlyRate: Numerical, the employee's hourly pay rate.
  • JobInvolvement: Categorical. Describes job involvement, with values: 1 'Low', 2 'Medium', 3 'High', 4 'Very High'.
  • JobLevel: Categorical. The job level within the organisation.
  • JobRole: Categorical. The employee's specific job role, e.g., 'Sales Executive' (22%) or 'Research Scientist' (20%).
  • JobSatisfaction: Categorical. Measures job satisfaction, with values: 1 'Low', 2 'Medium', 3 'High', 4 'Very High'.
  • MaritalStatus: Categorical. Marital status, including 'Married' (46%) and 'Single' (32%).
  • MonthlyIncome: Numerical, the employee's monthly income.
  • MonthlyRate: Numerical, the monthly rate of pay.
  • NumCompaniesWorked: Numerical, the number of companies the employee has worked for previously.
  • Over18: Categorical (constant 'true' for all records).
  • OverTime: Categorical (Boolean). Indicates if the employee works overtime, with 'true' (28%) or 'false' (72%).
  • PercentSalaryHike: Numerical, the percentage increase in salary.
  • PerformanceRating: Categorical. Employee performance rating, with values: 1 'Low', 2 'Good', 3 'Excellent', 4 'Outstanding'.
  • RelationshipSatisfaction: Categorical. Measures relationship satisfaction at work, with values: 1 'Low', 2 'Medium', 3 'High', 4 'Very High'.
  • StandardHours: Numerical (constant value of 80 for all records).
  • StockOptionLevel: Categorical, the stock option level granted to the employee.
  • TotalWorkingYears: Numerical, the total number of years the employee has worked.
  • TrainingTimesLastYear: Numerical, the number of training sessions attended in the last year.
  • WorkLifeBalance: Categorical. Measures work-life balance, with values: 1 'Bad', 2 'Good', 3 'Better', 4 'Best'.
  • YearsAtCompany: Numerical, the number of years the employee has been with the current company.
  • YearsInCurrentRole: Numerical, the number of years in the current job role.
  • YearsSinceLastPromotion: Numerical, the number of years since the last promotion.
  • YearsWithCurrManager: Numerical, the number of years with the current manager.

Distribution

The dataset is provided as a CSV file, named "HR Employee Attrition.csv". It has a file size of 227.97 kB and contains 1,470 records across all 35 columns, with no missing values.
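
The figures stated above (1,470 records, 35 columns, no missing values) can be checked after download. The sketch below uses only the Python standard library; summarize_csv is a hypothetical helper, and the encoding shown in the comment is an assumption about the file rather than something the listing states.

```python
import csv
import io


def summarize_csv(handle):
    """Count records, columns, and empty cells in an open CSV file."""
    reader = csv.DictReader(handle)
    n_rows = 0
    n_missing = 0
    for row in reader:
        n_rows += 1
        for value in row.values():
            # Short rows yield None; blank cells yield empty strings.
            if value is None or (isinstance(value, str) and not value.strip()):
                n_missing += 1
    n_columns = len(reader.fieldnames or [])
    return {"rows": n_rows, "columns": n_columns, "missing": n_missing}


# In-memory demo with made-up values; the last cell is deliberately blank.
demo = io.StringIO("Age,Attrition,OverTime\n41,true,true\n49,false,\n")
summary = summarize_csv(demo)

# For the real file (encoding is an assumption; "utf-8-sig" tolerates a BOM):
# with open("HR Employee Attrition.csv", newline="", encoding="utf-8-sig") as f:
#     print(summarize_csv(f))
```

If the file matches this listing, the real call should report 1,470 rows, 35 columns, and 0 missing cells.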

Usage

This dataset is ideal for:
  • Predictive modelling: Building models to forecast employee attrition.
  • Data analysis and visualisation: Uncovering underlying factors and trends related to employee turnover.
  • Human Resources analytics: Gaining insights into employee behaviour, satisfaction, and retention strategies.
  • Hypothesis testing: Investigating specific relationships, such as the effect of distance from home on attrition, or income and education on attrition rates.
  • Machine learning applications: Training classification models to identify employees at risk of leaving.
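
As a starting point for the analyses listed above, the sketch below computes attrition rates per group and the accuracy of a trivial majority-class baseline. It is an illustration only: attrition_rate_by and majority_baseline_accuracy are hypothetical helper names, and the set of values treated as "left the company" is an assumption, since this listing shows 'true'/'false' while other copies of the file may encode the column differently.

```python
from collections import defaultdict

# Assumption: values that mean "the employee left"; adjust to the actual file.
POSITIVE = {"true", "yes", "1"}


def attrition_rate_by(rows, column, target="Attrition"):
    """Return {group value: share of rows in that group with positive attrition}."""
    totals = defaultdict(int)
    leavers = defaultdict(int)
    for row in rows:
        group = row[column]
        totals[group] += 1
        if row[target].strip().lower() in POSITIVE:
            leavers[group] += 1
    return {group: leavers[group] / totals[group] for group in totals}


def majority_baseline_accuracy(n_positive, n_negative):
    """Accuracy of a model that always predicts the larger class."""
    return max(n_positive, n_negative) / (n_positive + n_negative)


# Hypothetical rows for illustration only; not taken from the dataset.
rows = [
    {"OverTime": "true", "Attrition": "true"},
    {"OverTime": "true", "Attrition": "false"},
    {"OverTime": "false", "Attrition": "false"},
    {"OverTime": "false", "Attrition": "false"},
]
rates = attrition_rate_by(rows, "OverTime")

# Class counts as stated for the Attrition column: 237 left, 1,233 stayed.
baseline = majority_baseline_accuracy(237, 1233)
```

With the stated class counts, always predicting "stayed" is already correct for 1,233 of 1,470 records (about 83.9%), so plain accuracy is a weak yardstick for attrition classifiers; recall or precision on the minority class is usually more informative.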

Coverage

This is a fictional dataset created by IBM data scientists, so it does not represent specific real-world geographic locations, time ranges, or demographic groups. Its purpose is to simulate real-world employee data for analytical and predictive exercises.

License

CC0: Public Domain

Who Can Use It

This dataset is suitable for:
  • Data scientists and machine learning engineers for building predictive models.
  • HR analysts and business intelligence professionals seeking to understand and improve employee retention.
  • Researchers and students in fields such as human resources, organisational psychology, and data science for academic studies and projects.
  • Management consultants interested in workforce analytics and talent management strategies.

Dataset Name Suggestions

  • Employee Attrition Prediction Data
  • IBM HR Analytics Attrition Dataset
  • Workforce Turnover Factors Data
  • HR Employee Retention Study
  • Organisational Attrition Data

Attributes

Original Data Source: HR Employee Retention Study

Listing Stats

VIEWS

0

DOWNLOADS

0

LISTED

14/07/2025

REGION

GLOBAL

UNIVERSAL DATA QUALITY SCORE (UDQS)

5 / 5

VERSION

1.0
