Healthcare Employee Attrition Dataset
Patient Health Records & Digital Health
Free
About
This dataset covers employee attrition in the US healthcare system, a significant concern for hospitals, particularly those facing high turnover among nurses. It is intended as a resource for building machine learning models that predict and analyse employee departures from intuitive features. It includes both employee-specific and company-related data, which makes it suitable for supervised machine learning (with attrition as the target variable), unsupervised machine learning, and general analytics. The data is synthetic: it is derived from the IBM Watson attrition dataset, with roles and departments adjusted to reflect the healthcare domain. Certain known outcomes have also been modified to improve the performance of machine learning models.
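As a minimal sketch of the supervised setup mentioned above, the Attrition column can be mapped to a binary target and separated from the remaining features. The two sample records and the helper name `split_target` are hypothetical and made up for illustration; only the column names come from this listing.

```python
# Hypothetical sample records; real rows would come from the dataset's CSV file.
sample_rows = [
    {"EmployeeID": "1", "Age": "41", "OverTime": "Yes", "Attrition": "No"},
    {"EmployeeID": "2", "Age": "29", "OverTime": "No", "Attrition": "Yes"},
]


def split_target(rows, target="Attrition"):
    """Return (features, labels), encoding the Yes/No target as 1/0."""
    features, labels = [], []
    for row in rows:
        labels.append(1 if row[target].strip().lower() == "yes" else 0)
        # Everything except the target column is kept as a feature.
        features.append({k: v for k, v in row.items() if k != target})
    return features, labels


X, y = split_target(sample_rows)
print(y)
```

In practice an identifier such as EmployeeID would usually be dropped from the features before modelling, since it carries no predictive meaning.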
Columns
- EmployeeID: A unique identifier assigned to each employee.
- Age: The age of the employee.
- Attrition: Indicates whether the employee left the company (Yes/No).
- BusinessTravel: Frequency of business travel undertaken by the employee.
- DailyRate: The daily rate of pay for the employee.
- Department: The department in which the employee works.
- DistanceFromHome: The distance in miles from the employee's home to the workplace.
- Education: The level of education attained by the employee.
- EducationField: The field of study in which the employee's education was concentrated.
- EmployeeCount: A constant value (likely 1) for each employee, possibly for aggregation purposes.
- EnvironmentSatisfaction: Employee satisfaction with their work environment (rated).
- Gender: The gender of the employee.
- HourlyRate: The hourly rate of pay for the employee.
- JobInvolvement: Employee's level of job involvement (rated).
- JobLevel: The job level of the employee within the organisation.
- JobRole: The specific job role or title of the employee.
- JobSatisfaction: Employee satisfaction with their job (rated).
- MaritalStatus: The marital status of the employee.
- MonthlyIncome: The monthly income of the employee.
- MonthlyRate: The monthly rate of pay for the employee.
- NumCompaniesWorked: The number of companies the employee has worked for prior to this one.
- Over18: Indicates if the employee is over 18 years old (likely 'Y').
- OverTime: Indicates if the employee works overtime (Yes/No).
- PercentSalaryHike: The percentage increase in the employee's salary.
- PerformanceRating: The performance rating of the employee (rated).
- RelationshipSatisfaction: Employee satisfaction with their relationships at work (rated).
- StandardHours: The standard number of working hours (likely 80, which would correspond to a bi-weekly period).
- Shift: Not described in the source; likely indicates the employee's work shift.
- TotalWorkingYears: The total number of years the employee has worked.
- TrainingTimesLastYear: The number of times the employee underwent training last year.
- WorkLifeBalance: Employee's work-life balance satisfaction (rated).
- YearsAtCompany: The total number of years the employee has been with the current company.
- YearsInCurrentRole: The number of years the employee has been in their current role.
- YearsSinceLastPromotion: The number of years since the employee's last promotion.
- YearsWithCurrManager: The number of years the employee has been with their current manager.
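Several of the columns above (EmployeeCount, Over18, StandardHours) are described as likely constant. A quick check along these lines could confirm that before modelling, since constant columns carry no signal. This is a sketch only: the CSV excerpt is invented, and `constant_columns` is a hypothetical helper name.

```python
import csv
import io

# Hypothetical excerpt; values are invented, column names follow the list above.
SAMPLE_CSV = """EmployeeID,Age,EmployeeCount,Over18,StandardHours,Shift
1,41,1,Y,80,0
2,29,1,Y,80,1
3,35,1,Y,80,3
"""


def constant_columns(rows):
    """Return the names of columns that hold a single distinct value."""
    if not rows:
        return []
    return [
        name
        for name in rows[0]
        if len({row[name] for row in rows}) == 1
    ]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(constant_columns(rows))
```

On the real file, any column this reports could reasonably be excluded from the feature set.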
Distribution
The data is typically provided in a CSV file format. The exact number of rows or records for the full dataset is not specified in the available information.
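Because the row count is not stated, it may be worth checking once the file is downloaded. The sketch below uses only the standard library; `csv_shape` is a hypothetical helper, and the in-memory text stands in for the real file.

```python
import csv
import io


def csv_shape(handle):
    """Return (number of data rows, number of columns) for an open CSV file."""
    reader = csv.reader(handle)
    header = next(reader, [])
    n_rows = sum(1 for record in reader if record)  # skip blank lines
    return n_rows, len(header)


# Hypothetical in-memory stand-in for the real file.
demo = io.StringIO("EmployeeID,Age,Attrition\n1,41,No\n2,29,Yes\n")
print(csv_shape(demo))
```

With the downloaded file, the same function can be given a handle opened with `open(path, newline="")`, where `path` is wherever the CSV was saved.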
Usage
This dataset is well suited to developing machine learning classification models that predict employee attrition. Data scientists and HR professionals can use it to identify factors that contribute to employee turnover, so that organisations can implement targeted retention strategies. Specific applications include:
- Building predictive models for employee attrition.
- Performing detailed HR analytics to understand workforce dynamics.
- Identifying key indicators of job satisfaction and dissatisfaction.
- Supporting strategic workforce planning in healthcare institutions.
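One simple starting point for the analytics uses listed above is to compare attrition rates across groups, for example by OverTime status. The following is an illustrative sketch, not a prescribed method: the records are invented and `attrition_rate_by` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical records; values are invented for illustration.
sample_rows = [
    {"OverTime": "Yes", "Attrition": "Yes"},
    {"OverTime": "Yes", "Attrition": "No"},
    {"OverTime": "No", "Attrition": "No"},
    {"OverTime": "No", "Attrition": "No"},
]


def attrition_rate_by(rows, column):
    """Return {group value: share of employees with Attrition == 'Yes'}."""
    totals = defaultdict(int)
    leavers = defaultdict(int)
    for row in rows:
        totals[row[column]] += 1
        if row["Attrition"] == "Yes":
            leavers[row[column]] += 1
    return {group: leavers[group] / totals[group] for group in totals}


print(attrition_rate_by(sample_rows, "OverTime"))
```

The same function can be pointed at other categorical columns such as Department, JobRole, or Shift to see where departures concentrate.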
Coverage
The dataset covers employee attrition within the US healthcare system, with an emphasis on nursing roles. It includes demographic information for individual employees, such as age, gender, and marital status. No time range for data collection is provided.
License
CC0 Public Domain
Who Can Use It
This dataset is particularly useful for:
- Hospitals and healthcare organisations: To analyse and reduce employee attrition, especially among nurses.
- HR departments: For strategic planning, retention program development, and predictive analytics on workforce stability.
- Data scientists and machine learning engineers: To develop and evaluate attrition prediction models, as well as for general supervised and unsupervised learning tasks.
- Researchers: Studying human resources, organisational behaviour, and healthcare management.
Dataset Name Suggestions
- Healthcare Employee Attrition Dataset
- Healthcare Workforce Turnover Analytics
- Nurse Attrition Prediction Data
- US Healthcare Staff Retention Dataset
Attributes
Original Data Source: Healthcare Employee Attrition Dataset