HR Analytics Simulation Data
About
This dataset is a simulated collection of employee records, designed to support data analysis and machine learning in human resources and employee management. It mirrors the structure and characteristics of real-world employee data, but all information is fictional and for illustrative purposes only. Beyond the core employee records, it covers three further areas: training activities and developmental programmes, for assessing the impact of professional development; the recruitment process from job posting to candidate selection, for analysing recruitment efficiency and candidate profiles; and employee engagement survey responses, for analysing satisfaction and sentiment across various workplace aspects to inform improvement strategies.
Columns
- Employee ID (EmpID): A unique identifier for each employee within the organisation.
- First Name: The employee's first name.
- Last Name: The employee's last name.
- Start Date: The date the employee began working for the organisation. Data ranges from 7 August 2018 to 6 August 2023.
- Exit Date: The date the employee left or exited the organisation, if applicable. Approximately 49% of records have missing exit dates. Data ranges from 19 November 2018 to 6 August 2023.
- Title: The employee's job title or position (e.g., Production Technician I, Production Technician II).
- Supervisor: The name of the employee's immediate supervisor or manager.
- Email (ADEmail): The employee's organisational email address.
- Business Unit: The specific business unit or department the employee belongs to (e.g., NEL, SVG).
- Employee Status: The employee's current employment status (e.g., Active, On Leave, Terminated). Approximately 82% of employees are Active.
- Employee Type: The type of employment the employee holds (e.g., Full-Time, Part-Time, Contract).
- Pay Zone: The pay zone or salary band for the employee's compensation (e.g., Zone A, Zone B).
- Employee Classification Type: The employee's classification type (e.g., Exempt, Non-exempt, Temporary).
- Termination Type: The type of termination if an employee has left the organisation (e.g., Resignation, Layoff, Retirement). Approximately 49% are 'Unk' (unknown).
- Termination Description: Additional details or reasons for the employee's termination, if applicable. Approximately 49% of records are null.
- Department Type: The broader category or type of department associated with the employee's work (e.g., Production, IT/IS). Production accounts for 67% of records.
- Division (Division Description): The division or branch of the organisation where the employee works (e.g., Field Operations, General - Con).
- DOB (Date of Birth): The employee's date of birth.
- State: The state or region where the employee is located. Massachusetts (MA) accounts for 88% of records.
- Job Function (JobFunctionDescription): A brief description of the employee's primary job function or role (e.g., Laborer, Technician).
- Gender (GenderCode): A code representing the employee's gender (e.g., M for Male, F for Female, N for Non-binary). Female accounts for 56% and Male for 44%.
- Location (LocationCode): A code representing the physical location or office where the employee is based.
- Race (or) Ethnicity (RaceDesc): A description of the employee's racial or ethnic background (e.g., Asian, Black).
- Marital Status (MaritalDesc): The employee's marital status (e.g., Single, Married, Divorced).
- Performance Score: A categorical rating of the employee's performance level (e.g., Excellent, Satisfactory, Needs Improvement). Approximately 79% of employees are rated as fully meeting expectations.
- Current Employee Rating: The current rating or evaluation of the employee's overall performance, typically on a scale of 1 to 5. The mean rating is 2.97.
- Training Date: The date on which a training session took place.
- Training Program Name: The name or title of the training programme attended by the employee.
- Training Type: The categorisation of the training, indicating its purpose or focus (e.g., Technical, Soft Skills, Safety).
- Training Outcome: The observed outcome or result of the training for the employee (e.g., Completed, Partial Completion, Not Completed).
- Trainer: The name of the trainer or instructor who facilitated the training.
- Training Duration (Days): The duration of the training programme in days.
- Training Cost: The cost associated with organising and conducting the training programme.
- Applicant ID: A unique identifier for each job applicant.
- Application Date: The date the applicant submitted their job application.
- Phone Number: The applicant's contact phone number.
- Address: The applicant's street address.
- City: The city where the applicant resides.
- Zip Code: The postal or ZIP code associated with the applicant's address.
- Country: The country where the applicant resides.
- Education Level: The highest level of education attained by the applicant.
- Years of Experience: The number of years of professional experience the applicant has.
- Desired Salary: The salary the applicant wishes to receive for the job.
- Status: The status of the applicant's application (e.g., Submitted, Under Review, Rejected, Selected).
- Survey Date: The date on which the engagement survey was administered to employees.
- Engagement Score: A calculated numerical score representing the level of employee engagement based on survey responses.
- Satisfaction Score: A numerical score indicating employee satisfaction with various aspects of their job and workplace.
- Work-Life Balance Score: A numerical score reflecting employee perceptions of the balance between work and personal life.
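The shares quoted in the column list (for example the proportion of missing exit dates, or of Active employees) can be re-checked with a few lines of Python. The following is a minimal sketch using only the standard library. The header names `ExitDate` and `EmployeeStatus` and the sample rows are assumptions made for illustration; only `EmpID` is taken from the description above, so adjust the names to the actual file headers.

```python
import csv
import io
from collections import Counter

# Tiny made-up sample standing in for employee_data.csv.
# Header names other than EmpID are assumed, not confirmed by the listing.
SAMPLE = """EmpID,ExitDate,EmployeeStatus
1001,,Active
1002,2021-03-15,Terminated
1003,,Active
1004,2022-07-01,Active
"""


def read_rows(handle):
    """Read CSV rows into a list of dicts keyed by header name."""
    return list(csv.DictReader(handle))


def missing_share(rows, column):
    """Fraction of rows whose value in `column` is empty or whitespace."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if not (row.get(column) or "").strip())
    return missing / len(rows)


def value_shares(rows, column):
    """Map each distinct value in `column` to its fraction of all rows."""
    counts = Counter(row[column] for row in rows)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


rows = read_rows(io.StringIO(SAMPLE))
print(f"Missing ExitDate: {missing_share(rows, 'ExitDate'):.0%}")
for status, share in value_shares(rows, "EmployeeStatus").items():
    print(f"{status}: {share:.0%}")
```

To run this against the real file, pass `open("employee_data.csv", newline="")` to `read_rows` instead of the in-memory sample.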
Distribution
The dataset is typically provided in CSV (Comma-Separated Values) format. The main employee data file, employee_data.csv, is approximately 780 kB in size and contains 26 columns. It holds 3,000 unique employee identifiers, giving a substantial base for analysis.
Usage
This dataset is ideal for:
- Human Resources Analytics: Analysing employee demographics, performance, and turnover trends.
- Machine Learning Applications: Developing models for predicting employee attrition, optimising recruitment processes, or forecasting training needs.
- Training Effectiveness Evaluation: Assessing the impact of professional development programmes and guiding future training investments.
- Recruitment Efficiency Studies: Gaining insights into the effectiveness of various sourcing channels and candidate profiles.
- Employee Engagement and Satisfaction Analysis: Identifying key drivers of engagement and areas for workplace improvement based on survey data.
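For the attrition use case, a common first step is to derive a binary label and a tenure feature from the Start Date and Exit Date columns. The sketch below shows one way to do this. The header names `StartDate` and `ExitDate`, the ISO date format, and the sample records are all assumptions; the real file may store dates differently, in which case `DATE_FORMAT` needs to be changed.

```python
from datetime import date, datetime

DATE_FORMAT = "%Y-%m-%d"  # assumed format; change to match the actual file


def parse_date(text, fmt=DATE_FORMAT):
    """Return a date, or None when the field is blank."""
    text = (text or "").strip()
    return datetime.strptime(text, fmt).date() if text else None


def add_attrition_features(record, as_of,
                           start_col="StartDate", exit_col="ExitDate"):
    """Return a copy of `record` with 'left' (0/1) and 'tenure_days' added.

    Tenure runs from the start date to the exit date, or to `as_of`
    for employees with no exit date. A start date is required.
    """
    start = parse_date(record[start_col])
    exit_date = parse_date(record.get(exit_col))
    end = exit_date if exit_date is not None else as_of
    enriched = dict(record)
    enriched["left"] = int(exit_date is not None)
    enriched["tenure_days"] = (end - start).days
    return enriched


# Hypothetical records, not taken from the dataset.
records = [
    {"EmpID": "1001", "StartDate": "2018-08-07", "ExitDate": ""},
    {"EmpID": "1002", "StartDate": "2019-01-01", "ExitDate": "2019-12-31"},
]
as_of = date(2023, 8, 6)  # last date covered by the dataset
features = [add_attrition_features(r, as_of) for r in records]
for row in features:
    print(row["EmpID"], row["left"], row["tenure_days"])
```

Using the dataset's final date as the cut-off for employees without an exit date keeps tenure comparable across records; any other reference date can be passed as `as_of`.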
Coverage
The dataset primarily covers employee activities from August 2018 to August 2023. Geographic coverage predominantly includes the United States, with a significant concentration of employees in Massachusetts (MA), though 28 unique states are represented. Demographic data includes Gender (Female 56%, Male 44%), Race/Ethnicity (including Asian and Black categories, with 5 unique types), and Marital Status (Single, Married, Divorced, and other categories).
License
CC0: Public Domain
Who Can Use It
- HR Analysts: To track key HR metrics, understand workforce dynamics, and inform strategic HR decisions.
- Data Scientists/Machine Learning Engineers: For building predictive models related to employee retention, performance, and recruitment.
- Organisational Development Specialists: To assess training programme outcomes and gauge employee engagement levels.
- Academics and Researchers: For studies on human capital, workplace behaviour, and talent management.
- Recruitment Managers: To analyse recruitment pipeline efficiency and candidate quality.
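Several of these analyses combine survey or training data with the employee records. As a rough illustration, the sketch below joins survey rows to employees and averages a score per group. It assumes the survey data carries an employee identifier column (here called `Employee ID`, a hypothetical name) that matches `EmpID`. The column names `DepartmentType` and `Engagement Score` and the sample rows are likewise assumptions.

```python
from collections import defaultdict
from statistics import mean


def mean_score_by_group(employees, surveys, group_col, score_col,
                        emp_key="EmpID", survey_key="Employee ID"):
    """Average `score_col` per `group_col` after joining on employee ID."""
    group_of = {e[emp_key]: e[group_col] for e in employees}
    scores = defaultdict(list)
    for response in surveys:
        group = group_of.get(response[survey_key])
        if group is None:
            continue  # survey row with no matching employee record
        scores[group].append(float(response[score_col]))
    return {group: mean(values) for group, values in scores.items()}


# Hypothetical rows, not taken from the dataset.
employees = [
    {"EmpID": "1001", "DepartmentType": "Production"},
    {"EmpID": "1002", "DepartmentType": "Production"},
    {"EmpID": "1003", "DepartmentType": "IT/IS"},
]
surveys = [
    {"Employee ID": "1001", "Engagement Score": "4"},
    {"Employee ID": "1002", "Engagement Score": "2"},
    {"Employee ID": "1003", "Engagement Score": "5"},
    {"Employee ID": "9999", "Engagement Score": "1"},  # no matching employee
]
result = mean_score_by_group(employees, surveys,
                             group_col="DepartmentType",
                             score_col="Engagement Score")
print(result)
```

The same function works for other pairings, such as Satisfaction Score by Business Unit, provided both inputs share an identifier.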
Dataset Name Suggestions
- Workforce Lifecycle Dataset
- HR Analytics Simulation Data
- Employee Management Data Hub
- Organisational Talent Insights
- Synthetic Employee Data Collection
Attributes
Original Data Source: HR Analytics Simulation Data