Historical Global Data Science Rankings 2018–2021

Education & Learning Analytics

Tags and Keywords

Kaggle

Rankings

Achievement

Competition

Meta

Free

About

These historical snapshots of user milestones make it possible to reconstruct how data science practitioners have evolved competitively. Captured monthly from mid-2018 to late 2021, the data preserves a record of global rankings across competitions, scripts, and community discussions. It is a useful resource for understanding the trajectory of high-achieving individuals within the world's largest data science community.

Columns

  • Id: A unique identifier for the specific achievement record.
  • UserId: The reference number for the individual user, which can be linked to other metadata to obtain display names.
  • AchievementType: The category of activity, predominantly covering Competitions, but also including Discussions and other community interactions.
  • Tier: The proficiency level attained by the user, representing their professional status in a specific category.
  • TierAchievementDate: The specific timestamp indicating when a user reached their current performance tier.
  • Points: The total numerical score accumulated by the user for their contributions at the time of the snapshot.
  • CurrentRanking: The user's standing relative to others at the time the record was captured.
  • HighestRanking: A historical record of the best position the user has ever achieved on the global leaderboard.
  • TotalGold: The total count of top-tier medals earned for exceptional performance.
  • TotalSilver: The total count of second-tier medals earned for high-quality contributions.
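
As a rough illustration of how the columns above might map onto typed records, here is a minimal Python sketch using only the standard library. It assumes the CSV headers match the column names exactly; the sample rows are made up, and keeping Tier and TierAchievementDate as text is a deliberate choice because the listing does not specify their encoding or format. Treating an empty ranking cell as missing is also an assumption.

```python
import csv
import io
from dataclasses import dataclass
from typing import Optional


@dataclass
class AchievementRecord:
    id: int
    user_id: int
    achievement_type: str
    tier: str                   # kept as text: tier encoding is not specified
    tier_achievement_date: str  # kept as text: timestamp format is not specified
    points: float
    current_ranking: Optional[int]
    highest_ranking: Optional[int]
    total_gold: int
    total_silver: int


def _opt_int(value: str) -> Optional[int]:
    """Return an int, or None when the cell is empty (assumed to mean 'missing')."""
    value = value.strip()
    return int(value) if value else None


def parse_row(row: dict) -> AchievementRecord:
    """Convert one csv.DictReader row into a typed record."""
    return AchievementRecord(
        id=int(row["Id"]),
        user_id=int(row["UserId"]),
        achievement_type=row["AchievementType"],
        tier=row["Tier"],
        tier_achievement_date=row["TierAchievementDate"],
        points=float(row["Points"]),
        current_ranking=_opt_int(row["CurrentRanking"]),
        highest_ranking=_opt_int(row["HighestRanking"]),
        total_gold=int(row["TotalGold"]),
        total_silver=int(row["TotalSilver"]),
    )


def read_records(handle) -> list:
    """Read every row from an open CSV file or file-like object."""
    return [parse_row(row) for row in csv.DictReader(handle)]


# Made-up sample rows, for illustration only (not taken from the dataset).
sample = io.StringIO(
    "Id,UserId,AchievementType,Tier,TierAchievementDate,Points,"
    "CurrentRanking,HighestRanking,TotalGold,TotalSilver\n"
    "1,1001,Competitions,4,07/01/2018,1520,12,3,5,7\n"
    "2,1002,Discussion,2,,45,,,0,1\n"
)
records = read_records(sample)
```

With a real file, `read_records(open(path, newline=""))` would be the equivalent call, provided the header assumption holds.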

Distribution

The data is provided as a series of CSV files, such as UserAchievements_180701.csv, with file sizes around 4.89 MB. Each monthly snapshot contains approximately 96,000 valid records, with a 100% validity rate for core identifiers and point values. Only rows with positive point values are included, which keeps the data focused on active contributors.
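
A small sketch of how a snapshot file might be dated and sanity-checked follows. It assumes the six digits in a name like UserAchievements_180701.csv encode the snapshot date as YYMMDD (so 180701 would be 1 July 2018); the listing does not state this, so treat it as a guess. The helper names `snapshot_date` and `keep_positive` are hypothetical.

```python
import re
from datetime import date, datetime

# Assumed naming scheme: UserAchievements_<YYMMDD>.csv
_SNAPSHOT_NAME = re.compile(r"UserAchievements_(\d{6})\.csv$")


def snapshot_date(filename: str) -> date:
    """Extract the snapshot date from a file name, assuming a YYMMDD suffix."""
    match = _SNAPSHOT_NAME.search(filename)
    if match is None:
        raise ValueError(f"unexpected snapshot file name: {filename!r}")
    return datetime.strptime(match.group(1), "%y%m%d").date()


def keep_positive(rows: list) -> list:
    """Keep rows whose Points value is strictly positive.

    The files are described as already filtered this way, so on real data
    this should return every row; it is useful mainly as a sanity check.
    """
    return [row for row in rows if float(row["Points"]) > 0]


# Made-up rows, for illustration only.
example_rows = [{"UserId": "1001", "Points": "10"}, {"UserId": "1002", "Points": "0"}]
first_snapshot = snapshot_date("UserAchievements_180701.csv")
active_rows = keep_positive(example_rows)
```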

Usage

This resource is ideal for conducting longitudinal studies on how competitive rankings shift over time within a professional community. It is well-suited for training predictive models to identify future top-tier performers based on their early achievement patterns and point accumulation. Additionally, researchers can use the medal counts to benchmark community engagement and the difficulty of advancing through different proficiency tiers.
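
One possible starting point for such a longitudinal study is sketched below: it lines up CurrentRanking per user and achievement type across snapshots and measures the net change. It assumes at most one row per (UserId, AchievementType) pair in each snapshot and that a smaller ranking number is better; neither is confirmed by the listing. The function names and the sample values are made up.

```python
from collections import defaultdict
from datetime import date


def ranking_history(snapshots: dict) -> dict:
    """Map (UserId, AchievementType) to a list of (snapshot date, CurrentRanking).

    `snapshots` maps a snapshot date to a list of row dicts. Entries are
    emitted oldest first, and rows without a ranking are skipped.
    """
    history = defaultdict(list)
    for day in sorted(snapshots):
        for row in snapshots[day]:
            if row["CurrentRanking"] is None:
                continue
            key = (row["UserId"], row["AchievementType"])
            history[key].append((day, row["CurrentRanking"]))
    return dict(history)


def rank_change(series: list) -> int:
    """First ranking minus last ranking.

    A positive result means the user moved up, under the assumption that
    a smaller ranking number is better.
    """
    return series[0][1] - series[-1][1]


# Made-up snapshots, for illustration only.
snapshots = {
    date(2018, 8, 1): [
        {"UserId": 1001, "AchievementType": "Competitions", "CurrentRanking": 25},
    ],
    date(2018, 7, 1): [
        {"UserId": 1001, "AchievementType": "Competitions", "CurrentRanking": 40},
        {"UserId": 1002, "AchievementType": "Discussion", "CurrentRanking": None},
    ],
}
history = ranking_history(snapshots)
change = rank_change(history[(1001, "Competitions")])
```

Series built this way could then feed a predictive model, for example by using the change over a user's first few snapshots as a feature.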

Coverage

The temporal scope ranges from June 2018 to September 2021, with records captured at approximately monthly intervals. Geographically and demographically, the data represents a global user base of data science professionals and enthusiasts who have earned positive points in various platform activities.

License

CC0: Public Domain

Who Can Use It

Academic researchers can leverage these records to study the dynamics of online competitive communities and merit-based ranking systems. Data scientists might utilise the ranking history to build talent identification algorithms. Furthermore, community analysts can use the snapshots to track the growth of specific sectors, such as discussion or script contributions, over several years.

Dataset Name Suggestions

  • Meta Kaggle User Achievement Monthly Snapshots
  • Historical Global Data Science Rankings 2018–2021
  • Kaggle Achievement and Tier Evolution Archive
  • Longitudinal User Performance Metrics for Meta Kaggle
  • Competitive Data Science Achievement History

Attributes

Listing Stats

  • Views: 3
  • Downloads: 0
  • Listed: 29/12/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
