Machine Learning Community Profile Data

Data Science and Analytics

Tags and Keywords

Survey

Kaggle

Machine

Science

Trends

Free

About

This dataset gives a detailed view of the state of data science and machine learning across the industry. It combines responses from annual industry-wide surveys conducted over four consecutive years, so users can track how the profile of Kagglers changed from 2017 to 2020. The survey data was acquired and cleaned by the Kaggle Team and subsequently merged into a single file for longitudinal analysis.

Columns

The dataset contains 12 columns detailing respondent characteristics and professional context:

index: A unique identifier for each record.
Age: The age of the respondent (with the 25–29 bracket being the most frequently reported).
Gender: The self-identified gender of the respondent (81% of respondents are Male).
Country: The respondent’s country of residence (India is the most represented country, followed by the United States of America).
Degree: The highest degree attained by the respondent (Master’s degree is the most common, held by 42% of respondents).
Job Title: The professional occupation of the respondent (Student and Data Scientist are the most frequently listed titles).
Company Size: The number of employees in the respondent’s organisation.
Team Size: The size of the respondent’s immediate team.
ML Status in Company: A description of the status of Machine Learning adoption within the respondent’s company.
Compensation Status: Details regarding the salary or compensation of the respondent.
Money Spent: The amount of money spent on ML products by the respondent’s company.
Year: The survey year in which the response was collected (ranging from 2017 to 2020).

Distribution

The data is tabular and delivered in CSV format. The primary file, kaggle_survey_17_20_v2.csv, is approximately 10.43 MB in size and contains over 80,000 valid records covering the merged survey years.
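As a minimal sketch of reading the file, the Python snippet below parses CSV lines and checks that the twelve columns described in this listing are present. The column names are copied from the listing; whether the CSV header spells them exactly this way is an assumption, and `load_survey` is a hypothetical helper name.

```python
import csv
from typing import Iterable

# Column names as written in the listing. The exact header spelling in the
# CSV file is an assumption and may need adjusting.
EXPECTED_COLUMNS = [
    "index", "Age", "Gender", "Country", "Degree", "Job Title",
    "Company Size", "Team Size", "ML Status in Company",
    "Compensation Status", "Money Spent", "Year",
]


def load_survey(lines: Iterable[str]) -> list[dict[str, str]]:
    """Parse CSV lines into one dict per record, verifying the header first."""
    reader = csv.DictReader(lines)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing expected columns: {missing}")
    return list(reader)
```

With the file downloaded locally, an open file handle can be passed directly, for example `load_survey(open("kaggle_survey_17_20_v2.csv", newline="", encoding="utf-8"))`. The UTF-8 encoding is assumed here, not stated in the listing.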

Usage

This resource is well suited to data visualisation and exploratory data analysis. It can be used to study macro-level shifts within the data science job market, compare professional demographics over time, and analyse global participation in the Machine Learning community.
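One way to compare demographics over time is to compute, for each survey year, the share of respondents in each category of a column. The sketch below assumes records are dictionaries keyed by the column names in this listing (for example, as produced by `csv.DictReader`); `share_by_year` is a hypothetical helper, and empty cells are grouped under a "(missing)" label chosen for illustration.

```python
from collections import Counter, defaultdict
from typing import Iterable, Mapping


def share_by_year(
    rows: Iterable[Mapping[str, str]], column: str
) -> dict[str, dict[str, float]]:
    """Return, for each survey year, the share of each value in `column`."""
    counts: dict[str, Counter] = defaultdict(Counter)
    for row in rows:
        # Empty or absent cells are counted under an explicit label.
        value = row.get(column) or "(missing)"
        counts[row["Year"]][value] += 1

    shares: dict[str, dict[str, float]] = {}
    for year, counter in sorted(counts.items()):
        total = sum(counter.values())
        shares[year] = {value: n / total for value, n in counter.items()}
    return shares
```

Calling `share_by_year(rows, "Job Title")` would give one dictionary per year, which can then be plotted or tabulated to see how the mix of job titles shifts between 2017 and 2020.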

Coverage

The data spans 2017 to 2020. Geographically, responses come from 72 unique countries. Demographically, the scope includes the age, gender, educational background, and professional environment (including team size and ML adoption status) of survey participants.
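The coverage figures above can be checked against a downloaded copy with a short summary pass. This is a sketch under the assumption that records are dictionaries with "Year" and "Country" keys as named in this listing; `coverage_summary` is a hypothetical helper, and the tolerance for year values such as "2017.0" is a precaution, not a known property of the file.

```python
from typing import Iterable, Mapping, Optional


def coverage_summary(rows: Iterable[Mapping[str, str]]) -> dict[str, Optional[int]]:
    """Report the first and last survey year and the number of distinct countries."""
    years: set[int] = set()
    countries: set[str] = set()
    for row in rows:
        year = (row.get("Year") or "").strip()
        if year:
            # Accept both "2017" and "2017.0" style values.
            years.add(int(float(year)))
        country = (row.get("Country") or "").strip()
        if country:
            countries.add(country)
    return {
        "first_year": min(years) if years else None,
        "last_year": max(years) if years else None,
        "unique_countries": len(countries),
    }
```

If the file matches this listing, the summary should report 2017 as the first year, 2020 as the last, and 72 unique countries.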

License

CC0: Public Domain

Who Can Use It

Intended users include data scientists, academic researchers, and students interested in quantitative socio-economic trends. The dataset is particularly useful for those who need to benchmark skills, track compensation trends, or gain insight into the global adoption of machine learning practices.

Dataset Name Suggestions

Kaggle Survey 2017-2020 Merged Data
Global Data Science Industry Trends (2017-2020)
Machine Learning Community Profile Data
The Evolution of Kagglers

Attributes

Original Data Source: Machine Learning Community Profile Data

Listing Stats

Listed: 22/11/2025
Region: Global
Quality: 5 / 5
Version: 1.0
