Opendatabay APP

Data Scientist Career Churn Prediction

Data Science and Analytics

Tags and Keywords

Analytics, Job, Churn, Data, Scientists



Free

About

This dataset is designed to predict whether a candidate will look for a new job or remain with a company after completing specific training programmes. It is a resource for HR analytics: it helps companies identify candidates who are genuinely interested in working for them, which reduces the cost and time of training and recruitment. The dataset also supports interpreting the factors that influence an employee's decision to leave their current role, which matters for workforce planning and candidate categorisation.

Columns

  • enrollee_id: A unique identifier for each candidate.
  • city: The code identifying the city of the candidate.
  • city_development_index: A scaled index representing the development level of the city.
  • gender: The gender of the candidate, e.g., Male, Female, or Other.
  • relevent_experience: Indicates whether the candidate has relevant work experience.
  • enrolled_university: Details the type of university course, if any, the candidate is enrolled in.
  • education_level: The highest education level achieved by the candidate, e.g., Graduate, Masters.
  • major_discipline: The main academic discipline of the candidate's education, such as STEM.
  • experience: The candidate's total work experience, measured in years.
  • company_size: The number of employees in the candidate's current employer's company.
  • company_type: The classification of the current employer's company, e.g., Pvt Ltd.
  • last_new_job: The time difference in years between the candidate's previous job and their current one.
  • training_hours: The total hours of training completed by the candidate.
  • target: The dependent variable indicating whether the candidate is looking for a job change (1) or not (0).
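
As an illustration of how the columns above might be read into typed records, here is a minimal sketch using only the Python standard library. The `Candidate` class, the `parse_row` helper, and the two sample rows are hypothetical and synthetic; they are not taken from the dataset. The sketch assumes that blank cells denote missing values and that `target` may be absent (as in the test set). Categorical columns, including `experience` and `company_size`, are kept as text because the listing does not specify their exact encodings.

```python
import csv
import io
from dataclasses import dataclass
from typing import Optional

# Synthetic rows for illustration only; values are placeholders, not real data.
SAMPLE_CSV = """\
enrollee_id,city,city_development_index,gender,relevent_experience,enrolled_university,education_level,major_discipline,experience,company_size,company_type,last_new_job,training_hours,target
1,city_A,0.92,Male,yes,none,Graduate,STEM,5,,Pvt Ltd,1,36,1
2,city_B,0.62,,no,full_time,Masters,STEM,2,,,1,47,0
"""

# Columns kept as text (assumption: their encodings are not numeric-safe).
TEXT_FIELDS = [
    "enrollee_id", "city", "gender", "relevent_experience",
    "enrolled_university", "education_level", "major_discipline",
    "experience", "company_size", "company_type", "last_new_job",
]


@dataclass
class Candidate:
    enrollee_id: Optional[str]
    city: Optional[str]
    gender: Optional[str]
    relevent_experience: Optional[str]
    enrolled_university: Optional[str]
    education_level: Optional[str]
    major_discipline: Optional[str]
    experience: Optional[str]
    company_size: Optional[str]
    company_type: Optional[str]
    last_new_job: Optional[str]
    city_development_index: Optional[float]
    training_hours: Optional[int]
    target: Optional[int]  # None when the label is not provided


def _clean(value):
    """Return None for an absent or blank cell, else the stripped text."""
    if value is None:
        return None
    value = value.strip()
    return value or None


def parse_row(row):
    """Convert one csv.DictReader row into a Candidate."""
    values = {name: _clean(row.get(name)) for name in TEXT_FIELDS}
    cdi = _clean(row.get("city_development_index"))
    hours = _clean(row.get("training_hours"))
    target = _clean(row.get("target"))
    values["city_development_index"] = float(cdi) if cdi is not None else None
    values["training_hours"] = int(hours) if hours is not None else None
    # int(float(...)) accepts both "1" and "1.0" style labels.
    values["target"] = int(float(target)) if target is not None else None
    return Candidate(**values)


candidates = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
```

To read a real file instead of the inline sample, the `io.StringIO(SAMPLE_CSV)` argument could be replaced with a file object opened with `newline=""`.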

Distribution

The dataset is typically available in CSV format and is structured into training and test sets. A sample test file (aug_test.csv) is provided; it is approximately 210.5 kB and contains approximately 2,129 records. The target variable is excluded from the test set, but a separate file containing the test target values is available for related tasks. Note that the dataset is imbalanced and that most features are categorical, some with high cardinality. Missing values may need to be imputed as part of the data pipeline.
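
The properties mentioned above (class imbalance, high-cardinality categorical features, and missing values) can be checked with a few small helpers. The following is a minimal sketch using only the standard library. The function names and the sample rows are hypothetical, `None` is assumed to mark a missing value, and mode imputation is shown as one simple option rather than as the recommended method for this dataset.

```python
from collections import Counter

# Synthetic records for illustration only; None marks a missing value.
rows = [
    {"city": "city_A", "gender": "Male", "target": 0},
    {"city": "city_B", "gender": None, "target": 0},
    {"city": "city_A", "gender": "Female", "target": 1},
    {"city": "city_C", "gender": "Male", "target": 0},
]


def class_balance(rows, label="target"):
    """Share of rows per class; a skewed result indicates imbalance."""
    counts = Counter(r[label] for r in rows)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


def missing_counts(rows, columns):
    """Number of missing values in each listed column."""
    return {c: sum(1 for r in rows if r[c] is None) for c in columns}


def cardinality(rows, columns):
    """Number of distinct non-missing values in each listed column."""
    return {c: len({r[c] for r in rows if r[c] is not None}) for c in columns}


def impute_mode(rows, column):
    """Return copies of the rows with missing values in `column`
    replaced by that column's most frequent observed value."""
    observed = Counter(r[column] for r in rows if r[column] is not None)
    if not observed:
        return [dict(r) for r in rows]
    mode = observed.most_common(1)[0][0]
    return [{**r, column: mode if r[column] is None else r[column]} for r in rows]


balance = class_balance(rows)
missing = missing_counts(rows, ["city", "gender"])
distinct = cardinality(rows, ["city", "gender"])
imputed = impute_mode(rows, "gender")
```

In a real pipeline, the mode would normally be computed on the training set only and then reused for the test set, so that no information from the test data leaks into preprocessing.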

Usage

This dataset is ideal for:
  • Predicting job change probability: Forecasting which data science candidates are likely to seek new employment after training.
  • HR analytics and research: Understanding the underlying factors that compel individuals to leave their jobs.
  • Optimising recruitment and training strategies: Helping companies reduce the cost and time of hiring and improve the quality of their training programmes.
  • Candidate categorisation: Streamlining the process of classifying candidates based on their likelihood of staying with the company.
  • Model interpretation: Gaining insights into which features most significantly affect a candidate's decision to look for a new job.
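
As a first step towards the interpretation use case listed above, the job-change rate can be compared across the values of a single feature before any model is fitted. The sketch below is a simple descriptive baseline, not a predictive model. The `change_rate_by` helper and the sample rows are hypothetical, and the feature values `"yes"` and `"no"` are placeholders rather than the dataset's actual labels.

```python
from collections import defaultdict


def change_rate_by(rows, feature, label="target"):
    """Share of candidates with label 1 for each value of `feature`.
    Missing feature values are grouped under "(missing)"."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for r in rows:
        key = r[feature] if r[feature] is not None else "(missing)"
        totals[key] += 1
        positives[key] += int(r[label])
    return {key: positives[key] / totals[key] for key in totals}


# Synthetic records for illustration only.
rows = [
    {"relevent_experience": "yes", "target": 0},
    {"relevent_experience": "yes", "target": 0},
    {"relevent_experience": "no", "target": 1},
    {"relevent_experience": "no", "target": 0},
    {"relevent_experience": None, "target": 1},
]

rates = change_rate_by(rows, "relevent_experience")
```

Large differences between groups suggest features worth examining further, although such rates alone do not account for interactions between features or for small group sizes.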

Coverage

The dataset includes demographic information such as gender, education level, and relevant experience. City codes are provided, so the geographical scope spans multiple cities. No time range is given for the data collection, and coverage of specific demographic groups or years is not described beyond the general feature descriptions.

License

CC0: Public Domain

Who Can Use It

This dataset is particularly useful for:
  • HR professionals and recruiters in Big Data and Data Science companies.
  • Data scientists and machine learning engineers building predictive models for employee churn.
  • Academics and researchers focusing on workforce analytics, talent management, and career transitions.
  • Organisations looking to improve their training efficacy and reduce recruitment overheads.

Dataset Name Suggestions

  • HR Analytics: Data Scientist Job Change
  • Data Scientist Career Churn Prediction
  • Employee Attrition for Data Professionals
  • Job Change Prediction for Data Scientists

Attributes

Listing Stats

  • Views: 1
  • Downloads: 0
  • Listed: 08/07/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
