Opendatabay APP

Data Science Salary Prediction in India

Data Science and Analytics

Tags and Keywords

Salary, India, Data, Scientist, Jobs

Free

About

This dataset collects data scientist job postings and their salary information for the Indian market, with salary prediction as the intended task. It covers factors that influence salaries in the analytics industry, such as location and industry sector. The postings were collected over time, so they reflect trends in analytics hiring and allow the median analytics salary to be compared across experience levels and skill sets in India.

Columns

  • Name of the company (Encoded): An encoded representation of the company where the job is posted.
  • Years of experience: The required or desired years of experience for the job role.
  • Job description: A detailed textual description of the job responsibilities and requirements.
  • Job designation: The official title or designation of the job role.
  • Job Type: The nature of the employment (e.g., full-time, part-time).
  • Key skills: A list of essential skills required for the position.
  • Location: The geographic location of the job posting in India.
  • Salary in Rupees Lakhs (To be predicted): The salary range, expressed in lakhs of Indian Rupees (1 lakh = 100,000 rupees). This is the target variable for prediction.
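
The salary target is described as a range in lakhs, and experience may also be recorded as a range. The following Python sketch shows one way to turn such strings into numeric bounds. The string formats used here ("5-7 yrs", "10to15") and the helper names `parse_range` and `lakhs_to_rupees` are assumptions for illustration only; the listing does not document the actual encoding, so inspect the CSV before relying on it.

```python
import re

LAKH = 100_000  # 1 lakh = 100,000 rupees


def parse_range(text):
    """Return (low, high) from a string holding one or two numbers.

    Assumed (hypothetical) formats: "5-7 yrs", "10to15".
    A single number yields identical bounds; no number yields None.
    """
    numbers = [float(n) for n in re.findall(r"\d+(?:\.\d+)?", text)]
    if not numbers:
        return None
    return (numbers[0], numbers[-1])


def lakhs_to_rupees(lakhs):
    """Convert a salary expressed in lakhs to rupees."""
    return lakhs * LAKH
```

Keeping both bounds, rather than collapsing to a midpoint straight away, leaves the choice between a classification target (the range as a label) and a regression target open.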

Distribution

The dataset is distributed as CSV files, plus a sample submission file in XLSX format, and is split into training and test subsets. The training data comprises 19,802 samples and the test data 6,601 samples, for 26,403 in total (roughly a 75/25 split). Final_Train_Dataset.csv is approximately 4.59 MB, Final_Test_Dataset.csv around 1.45 MB, and sample_submission.xlsx about 55.52 kB.
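
After downloading, the stated sample counts can be checked against the files. The sketch below is a minimal example using only the Python standard library. The helper names `count_rows` and `train_fraction` are hypothetical, and the in-memory sample is made-up placeholder text, not data from the files.

```python
import csv
import io

EXPECTED_TRAIN = 19_802  # sample count stated in the listing
EXPECTED_TEST = 6_601    # sample count stated in the listing


def count_rows(handle):
    """Count data rows in an open CSV file, excluding the header."""
    reader = csv.reader(handle)
    next(reader, None)  # skip the header row
    return sum(1 for _ in reader)


def train_fraction(n_train, n_test):
    """Share of all samples that fall in the training subset."""
    return n_train / (n_train + n_test)


# Placeholder CSV with a quoted multi-line field (not real data).
sample = io.StringIO('title,description\nAnalyst,"line one\nline two"\nEngineer,short\n')
sample_rows = count_rows(sample)

total = EXPECTED_TRAIN + EXPECTED_TEST
share = round(train_fraction(EXPECTED_TRAIN, EXPECTED_TEST), 2)
print(sample_rows, total, share)
```

With the real files, pass a handle opened with `open("Final_Train_Dataset.csv", newline="")` and compare the result with `EXPECTED_TRAIN`. A CSV reader is used instead of counting text lines because free-text fields such as the job description may contain line breaks inside quoted values.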

Usage

The dataset is suited to training machine learning models that predict the salary range of data scientist and analytics job postings in India. It also supports exploratory data analysis of career progression, salary trends, and the factors that drive pay in the Indian data science market. Researchers can use it to study regional salary variations, industry-specific pay scales (for example, in the Telecom industry), and the effect of experience and skills on remuneration.
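
As an illustration of the regional analysis mentioned above, the sketch below computes a median salary per location using only the Python standard library. The function name `median_by_group`, the key names `location` and `salary_mid_lakhs`, and the three rows are all hypothetical placeholders, not values from the dataset; it also assumes each salary range has already been reduced to a single number.

```python
from collections import defaultdict
from statistics import median


def median_by_group(rows, group_key, value_key):
    """Median of value_key for each distinct value of group_key."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])
    return {name: median(values) for name, values in groups.items()}


# Made-up placeholder rows, NOT taken from the dataset.
rows = [
    {"location": "Mumbai", "salary_mid_lakhs": 12.5},
    {"location": "Mumbai", "salary_mid_lakhs": 8.0},
    {"location": "Bengaluru", "salary_mid_lakhs": 20.0},
]
medians = median_by_group(rows, "location", "salary_mid_lakhs")
print(medians)
```

The median is used rather than the mean because salary data tends to be skewed by a small number of very high values.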

Coverage

The dataset's geographic scope is India, including high-paying cities such as Mumbai and Bengaluru. It is based on salary and job postings published online in India and collected over several years, so it reflects changing trends. It covers professionals in data science and analytics. The listing refers to the median analytics salary for 2017, but the data itself was gathered over a longer period and no exact date range is given.

License

CC0: Public Domain

Who Can Use It

  • Aspiring Data Scientists: To gain insights into salary expectations and essential skills for career planning.
  • Machine Learning Practitioners: To develop and refine predictive models for salary estimation.
  • Recruiters and HR Professionals: To benchmark salaries, understand market rates, and make informed hiring decisions.
  • Researchers and Academics: To analyse trends in the Indian data science job market, study socio-economic factors influencing salaries, and contribute to salary prediction methodologies.

Dataset Name Suggestions

  • India Data Scientist Salary Predictor
  • Indian Analytics Job Salaries
  • Data Science Salary Prediction in India
  • India ML Salary Forecasting Dataset
  • Data Scientist Compensation India

Listing Stats

  • Views: 2
  • Downloads: 0
  • Listed: 30/08/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
