Gaming Recommender System Data

Data Science and Analytics

Tags and Keywords

Games

Steam

Recommender

Tags

Playtime

Gaming Recommender System Data Dataset on Opendatabay data marketplace

Free

About

This dataset features information on approximately 60,000 video games available on the Steam platform. It consists primarily of user-generated tags that describe each game's nature and core features, and it also incorporates data from howlongtobeat.com on the average time required to complete various aspects of a game. This makes it a valuable resource for developing recommendation systems that do not rely solely on peer preferences.

Columns

  • id: The unique identifier for the game on the Steam platform.
  • name: The title of the game as it appears on Steam.
  • year: The release year of the game.
  • metacritic_rating: The Metacritic aggregate score for the game, where a higher value indicates better reception.
  • reviewer_rating: A rating given by users on a 0-10 scale, with higher scores signifying better user satisfaction.
  • positivity_ratio: Calculated as the number of positive reviews divided by the number of negative reviews, so values above 1 indicate more positive than negative reviews.
  • to_beat_main: The estimated time needed to complete the primary storyline or objectives of the game.
  • to_beat_extra: The estimated time required to finish the main story along with optional objectives.
  • to_beat_completionist: The estimated time to fully complete all aspects of the game, including all objectives and collecting every item.
  • extra_content_length: Represents the difference in time between a completionist playthrough and a playthrough focused on main and optional objectives.
  • tags: User-defined descriptive tags or features of the game, separated by a vertical bar (|).
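
As a minimal sketch of how the columns above might be read, the snippet below parses two made-up rows with Python's standard `csv` module, splits the `tags` field on the vertical line, and recomputes `extra_content_length` as `to_beat_completionist` minus `to_beat_extra`, which is one reading of the column description. The sample values, the helper names `to_float`, `parse_row` and `derived_extra_content_length`, the treatment of empty cells as missing, and the assumption that the rating and to_beat columns are numeric are all illustrative and not confirmed by the listing.

```python
import csv
import io

# Made-up rows for illustration only; they are not taken from the dataset.
SAMPLE_CSV = """id,name,year,metacritic_rating,reviewer_rating,positivity_ratio,to_beat_main,to_beat_extra,to_beat_completionist,extra_content_length,tags
1,Example Game A,2015,80,8.1,4.5,10,15,25,10,Action|Indie|Puzzle
2,Example Game B,2020,,,2.0,,,,,Strategy|Indie
"""

# Columns assumed (not confirmed) to hold numeric values.
NUMERIC_COLUMNS = [
    "metacritic_rating", "reviewer_rating", "positivity_ratio",
    "to_beat_main", "to_beat_extra", "to_beat_completionist",
    "extra_content_length",
]


def to_float(value):
    """Return a float, or None when the cell is empty (assumed to mean missing)."""
    value = value.strip()
    return float(value) if value else None


def parse_row(row):
    """Convert numeric cells and split the tags field into a list."""
    parsed = dict(row)
    for column in NUMERIC_COLUMNS:
        parsed[column] = to_float(row[column])
    parsed["tags"] = [tag.strip() for tag in row["tags"].split("|") if tag.strip()]
    return parsed


def derived_extra_content_length(game):
    """Completionist time minus main-plus-optional time, or None if either is missing."""
    full, extra = game["to_beat_completionist"], game["to_beat_extra"]
    if full is None or extra is None:
        return None
    return full - extra


games = [parse_row(row) for row in csv.DictReader(io.StringIO(SAMPLE_CSV))]
for game in games:
    print(game["name"], game["tags"], derived_extra_content_length(game))
```

Returning `None` for missing cells, rather than 0, keeps games without playtime data from looking like games with no extra content.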

Distribution

The dataset is provided as a CSV file of approximately 11.87 MB, structured across 11 columns. Although it is described as covering about 60,000 Steam games, the per-column counts imply roughly 63,500 records. Data availability varies considerably by column:

  • id, name, tags: nearly 100% (roughly 63,500 valid records)
  • year: about 63,400 valid records
  • positivity_ratio: 97% (61,500 valid)
  • reviewer_rating: 70% (44,600 valid)
  • to_beat_main, to_beat_extra, to_beat_completionist, extra_content_length: 21-33%
  • metacritic_rating: 6% (3,916 valid)

The Metacritic rating and playtime columns in particular have a notable proportion of missing values.
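
Availability figures like those above can be checked for any copy of the file by counting non-empty cells per column. The following is a minimal sketch using only the standard library. The in-memory sample rows are made up, `column_availability` is a hypothetical helper name, and the sketch assumes that missing values appear as empty cells, which the listing does not state.

```python
import csv
import io

# Made-up rows for illustration only; a real run would read the downloaded CSV.
SAMPLE_CSV = """id,name,metacritic_rating,to_beat_main,tags
1,Example Game A,80,10,Action|Indie
2,Example Game B,,12,Strategy
3,Example Game C,,,Puzzle
4,Example Game D,,,Indie|Casual
"""


def column_availability(rows, columns):
    """Return the share of rows holding a non-empty value for each column."""
    total = len(rows)
    counts = {column: 0 for column in columns}
    for row in rows:
        for column in columns:
            # Assumption: an empty or whitespace-only cell means "missing".
            if (row.get(column) or "").strip():
                counts[column] += 1
    return {column: (counts[column] / total if total else 0.0) for column in columns}


reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
rows = list(reader)
availability = column_availability(rows, reader.fieldnames)
for column, share in availability.items():
    print(f"{column}: {share:.0%}")
```

If the real file encodes missing values differently (for example as `NaN` or `-1`), the emptiness check would need to be adapted accordingly.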

Usage

This dataset is ideally suited for developing and training recommendation systems, particularly those that aim to suggest games based on intrinsic game features and characteristics rather than solely on the preferences of similar users. It can aid in identifying game attributes that influence player engagement and game completion times.
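
To illustrate what a feature-based recommendation might look like, the sketch below ranks games by the overlap of their tag sets. Jaccard similarity is one simple choice made here for illustration, not a method prescribed by the dataset, and the catalogue entries and the function names `jaccard` and `recommend` are made up.

```python
def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Made-up catalogue mapping a game name to its raw tags string.
CATALOGUE = {
    "Example Game A": "Action|Indie|Puzzle",
    "Example Game B": "Strategy|Indie",
    "Example Game C": "Action|Puzzle|Platformer",
    "Example Game D": "Racing|Sports",
}


def recommend(title, catalogue, top_n=2):
    """Return the top_n games whose tags are most similar to those of title."""
    tag_sets = {name: set(tags.split("|")) for name, tags in catalogue.items()}
    target = tag_sets[title]
    scored = [
        (jaccard(target, tags), name)
        for name, tags in tag_sets.items()
        if name != title
    ]
    # Highest similarity first; ties are broken alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(name, round(score, 3)) for score, name in scored[:top_n]]


recommendations = recommend("Example Game A", CATALOGUE)
print(recommendations)
```

Because the score depends only on a game's own tags, it works for titles with few or no user interactions. A fuller system might also weight rare tags more heavily or blend in the rating and playtime columns where they are available.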

Coverage

The dataset covers video games available on the Steam platform, reflecting a global scope due to Steam's international user base. The temporal coverage for game releases spans from 1997 to 2023. While general user tags and ratings are included, specific demographic breakdowns are not explicitly detailed. It should be noted that data availability varies significantly across different columns, particularly for Metacritic ratings and play duration metrics, which have substantial missing values for many games.

License

CC0: Public Domain

Who Can Use It

This dataset is particularly useful for data scientists and machine learning engineers focused on building recommendation engines. Game developers and analysts could also use it to understand game features, user reception, and expected playtime. Researchers in fields like human-computer interaction or digital entertainment can utilise it for studying player behaviour and game attributes.

Dataset Name Suggestions

  • Steam Video Game Features and Playtime
  • Gaming Recommender System Data
  • HowLongToBeat Steam Games
  • Game Tag Analysis Dataset
  • Steam Game Metrics and User Tags

Attributes

Original Data Source: Gaming Recommender System Data

Listing Stats

  • Views: 1
  • Downloads: 0
  • Listed: 08/09/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
