Data Scientist Career Churn Prediction
Data Science and Analytics
About
This dataset is designed to predict whether a candidate will look for a new job or remain with a company after completing specific training programmes. It is a resource for HR analytics: it helps companies identify candidates who are genuinely interested in working for them, and so reduce the cost and time of training and recruitment. It also helps in interpreting the factors that influence an employee's decision to leave their current role, which matters for workforce planning and candidate categorisation.
Columns
- enrollee_id: A unique identifier for each candidate.
- city: The code identifying the city of the candidate.
- city_development_index: A scaled index representing the development level of the city.
- gender: The gender of the candidate, e.g., Male, Female, or Other.
- relevent_experience: Indicates whether the candidate has relevant work experience.
- enrolled_university: Details the type of university course, if any, the candidate is enrolled in.
- education_level: The highest education level achieved by the candidate, e.g., Graduate, Masters.
- major_discipline: The main academic discipline of the candidate's education, such as STEM.
- experience: The candidate's total work experience, measured in years.
- company_size: The number of employees in the candidate's current employer's company.
- company_type: The classification of the current employer's company, e.g., Pvt Ltd.
- last_new_job: The time difference in years between the candidate's previous job and their current one.
- training_hours: The total hours of training completed by the candidate.
- target: The dependent variable indicating whether the candidate is looking for a job change (1) or not (0).
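As a minimal sketch of how the column list above might be used, the following checks a CSV header against the listed names and counts empty cells per column. The function name `missing_counts` and the two sample rows are hypothetical, invented purely for illustration; only the column names come from the list above (including the dataset's own spelling `relevent_experience`). Columns may be absent without error, since the test set omits `target`.

```python
import csv
import io
from collections import Counter

# Column names as listed above; "relevent_experience" is the dataset's own spelling.
EXPECTED_COLUMNS = [
    "enrollee_id", "city", "city_development_index", "gender",
    "relevent_experience", "enrolled_university", "education_level",
    "major_discipline", "experience", "company_size", "company_type",
    "last_new_job", "training_hours", "target",
]


def missing_counts(csv_text):
    """Return {column: number of empty cells}, after validating the header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = reader.fieldnames or []
    # Reject unknown columns, but allow listed columns to be absent
    # (the test set has no "target" column).
    unexpected = set(fieldnames) - set(EXPECTED_COLUMNS)
    if unexpected:
        raise ValueError(f"unexpected columns: {sorted(unexpected)}")
    counts = Counter()
    for row in reader:
        for name in fieldnames:
            value = row.get(name)
            if value is None or value.strip() == "":
                counts[name] += 1
    return dict(counts)


# Hypothetical two-row sample; the values are placeholders, not real records.
SAMPLE = (
    "enrollee_id,city,city_development_index,gender,relevent_experience,"
    "enrolled_university,education_level,major_discipline,experience,"
    "company_size,company_type,last_new_job,training_hours,target\n"
    "1,city_1,0.9,Male,yes,none,Graduate,STEM,5,,Pvt Ltd,1,40,0\n"
    "2,city_2,0.6,,no,,Masters,STEM,2,,,1,12,1\n"
)

print(missing_counts(SAMPLE))
```

To run the same check on a real file, read its contents and pass the text to `missing_counts`.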
Distribution
The dataset is typically available in CSV format. A sample test file (aug_test.csv) is provided, with a size of approximately 210.5 kB. The data is structured into training and test sets. The target variable is excluded from the test set, but a separate file containing the test target values is available for related tasks. It is important to note that the dataset is imbalanced, and most features are categorical, some exhibiting high cardinality. Missing value imputation may be a necessary step in the data pipeline. The sample test file includes approximately 2,129 records.
Usage
This dataset is ideal for:
- Predicting job change probability: Forecasting which data science candidates are likely to seek new employment after training.
- HR analytics and research: Understanding the underlying factors that compel individuals to leave their jobs.
- Optimising recruitment and training strategies: Helping companies reduce costs and time associated with hiring and improving the quality of their training programmes.
- Candidate categorisation: Streamlining the process of classifying candidates based on their likelihood of staying with the company.
- Model interpretation: Gaining insights into which features most significantly affect a candidate's decision to look for a new job.
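Before any of these modelling tasks, the missing values and class imbalance noted under Distribution usually need handling. The sketch below shows one possible approach, not a method prescribed by the dataset: most-frequent-value imputation for a categorical column, and class weights computed as n_samples / (n_classes * class_count), a common "balanced" weighting heuristic. The function names and the small example lists are hypothetical.

```python
from collections import Counter


def impute_most_frequent(values):
    """Replace None or empty entries with the most common observed value."""
    observed = [v for v in values if v not in (None, "")]
    if not observed:
        # Nothing to learn a fill value from; return the column unchanged.
        return list(values)
    fill = Counter(observed).most_common(1)[0][0]
    return [fill if v in (None, "") else v for v in values]


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical values, for illustration only.
education = ["Graduate", "", "Masters", "Graduate", None]
target = [0, 0, 0, 1]

print(impute_most_frequent(education))
print(balanced_class_weights(target))
```

The rarer class receives the larger weight, so a classifier that accepts per-class weights penalises errors on job changers (1) more heavily than errors on the majority class. Most-frequent imputation is only a simple baseline; treating "missing" as its own category is another reasonable option for the categorical features.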
Coverage
The dataset includes demographic information such as gender, education level, and relevant experience. City codes are provided, so the geographical scope spans multiple cities, although the cities themselves are identified only by code. No time range is given for the data collection, and data availability for specific demographic groups or years is not described beyond the general feature descriptions.
License
CC0: Public Domain
Who Can Use It
This dataset is particularly useful for:
- HR professionals and recruiters in Big Data and Data Science companies.
- Data scientists and machine learning engineers building predictive models for employee churn.
- Academics and researchers focusing on workforce analytics, talent management, and career transitions.
- Organisations looking to improve their training efficacy and reduce recruitment overheads.
Dataset Name Suggestions
- HR Analytics: Data Scientist Job Change
- Data Scientist Career Churn Prediction
- Employee Attrition for Data Professionals
- Job Change Prediction for Data Scientists
Attributes
Original Data Source: Data Scientist Career Churn Prediction