Dota 2 Team Performance Data
Data Science and Analytics
Free
About
This dataset provides an analytical base table of professional Dota 2 match results from 2019 to 2021, designed for data science studies. Each row holds pre-processed, aggregated team statistics calculated one day before the match, based on individual player performance over the preceding six months. The data originated from the OpenDota API, was initially stored in MongoDB, and was then transformed extensively with Apache Spark in a data lake architecture [1-4]. Unlike raw game data, it offers summarised, non-normalised metrics for each team [2, 4].
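The aggregation described above can be illustrated with a minimal sketch. It is not the authors' Spark pipeline: the 182-day approximation of "six months", the shape of the player histories, and the function name `team_win_pct` are all assumptions made for illustration.

```python
from datetime import date, timedelta

# Assumption: "six months" is approximated as 182 days.
WINDOW_DAYS = 182


def team_win_pct(player_histories, match_day):
    """Average the per-player win rates over a window that ends the day
    before match_day, so the match itself never leaks into its features.

    player_histories: one list per player of (played_on, won) tuples
    (a hypothetical structure, not the dataset's actual layout).
    """
    window_end = match_day - timedelta(days=1)
    window_start = window_end - timedelta(days=WINDOW_DAYS)
    rates = []
    for history in player_histories:
        results = [won for played_on, won in history
                   if window_start <= played_on <= window_end]
        if results:  # skip players with no matches in the window
            rates.append(sum(results) / len(results))
    return sum(rates) / len(rates) if rates else None


# Made-up histories, for illustration only.
player_a = [(date(2020, 5, 1), True), (date(2020, 5, 10), False),
            (date(2020, 6, 1), True)]   # the 1 June game is excluded
player_b = [(date(2019, 1, 1), True),   # too old, outside the window
            (date(2020, 4, 1), True)]
team_rate = team_win_pct([player_a, player_b], date(2020, 6, 1))
```

Here player A contributes 0.5 and player B contributes 1.0, so the team value is their mean. A column such as win_pct_r would hold a figure of this kind, assuming the real pipeline averages per player before averaging per team.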
Columns
Each row records the match winner alongside summarised statistics for each team. Key columns are:
- match_id: A unique identifier for each professional match [5].
- dt_match: The start date and time of the match [6].
- radiant_win: A Boolean value indicating whether the Radiant team won the match (True for Radiant win, False otherwise) [7, 8].
- recencia_r: Recency, i.e. the average number of days between each Radiant player's previous match and the current one [8].
- freq_r: The average number of matches played by players on the Radiant team [9].
- win_pct_r: The average win rate of players on the Radiant team [10].
- duration_avg_win_r: The average duration of winning matches for players on the Radiant team [11, 12].
- duration_avg_lose_r: The average duration of losing matches for players on the Radiant team [12, 13].
- actions_per_min_avg_r: The average actions per minute performed by players on the Radiant team [14].
- ancient_kills_avg_r: The average number of ancient creeps (powerful neutral creeps) killed by players on the Radiant team [15].
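As a rough guide to reading these columns, the sketch below parses rows with Python's standard `csv` module and converts each field to a suitable type. The on-disk encoding is an assumption: timestamps are taken to be ISO-like strings and booleans to be the text "True" or "False". The sample rows are made up and only stand in for the real file.

```python
import csv
import io
from datetime import datetime

# Numeric columns documented in the list above; the full file has many more.
NUMERIC_COLUMNS = [
    "recencia_r", "freq_r", "win_pct_r",
    "duration_avg_win_r", "duration_avg_lose_r",
    "actions_per_min_avg_r", "ancient_kills_avg_r",
]


def parse_row(row):
    """Convert one raw CSV row (a dict of strings) into typed values."""
    parsed = {
        "match_id": int(row["match_id"]),
        # Assumption: timestamps look like "2019-03-02 14:00:00".
        "dt_match": datetime.fromisoformat(row["dt_match"]),
        # Assumption: booleans are serialised as "True" / "False".
        "radiant_win": row["radiant_win"].strip().lower() == "true",
    }
    for name in NUMERIC_COLUMNS:
        value = row.get(name)
        # Keep missing values as None rather than guessing a default.
        parsed[name] = float(value) if value not in ("", None) else None
    return parsed


def load_matches(handle):
    """Read every row from an open text handle into typed dicts."""
    return [parse_row(row) for row in csv.DictReader(handle)]


# Made-up sample rows, for illustration only (not taken from the dataset).
SAMPLE_CSV = "\n".join([
    "match_id,dt_match,radiant_win,recencia_r,freq_r,win_pct_r,"
    "duration_avg_win_r,duration_avg_lose_r,actions_per_min_avg_r,"
    "ancient_kills_avg_r",
    "1001,2019-03-02 14:00:00,True,3.2,120.0,0.55,2100.5,2300.0,310.4,1.8",
    "1002,2020-07-15 09:30:00,False,5.0,95.5,0.48,2000.0,2250.5,295.0,2.1",
])
matches = load_matches(io.StringIO(SAMPLE_CSV))
```

To read the real file, the same function could be called with `open("tb_pro_players_matches.csv", newline="", encoding="utf-8")`, assuming the file is UTF-8 encoded and the column formats match the assumptions above.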
Distribution
The dataset is provided as a CSV file named tb_pro_players_matches.csv, with a size of 189.62 MB [5, 16]. It contains approximately 47.1 thousand records (rows) [6-8]. The full dataset contains 317 columns, but only a subset of 10 is detailed in the provided sample [5].
Usage
This dataset suits data scientists who want to apply their skills to real-world analytical problems [1, 3]. Typical applications include:
- Developing predictive models for Dota 2 match outcomes [2, 4].
- Analysing team performance trends and player statistics [2, 4].
- Exploring relationships between pre-match statistics and victory rates.
- Studying player behaviour and its impact on game results.
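One simple way to start exploring the relationship between a pre-match statistic and victory rates is to split matches at the median of that statistic and compare Radiant win rates on each side. The sketch below uses only the standard library. The function name, the row structure (a list of dicts), and the sample values are assumptions for illustration, not results from the dataset.

```python
from statistics import median


def radiant_win_rate_by_split(rows, feature):
    """Split rows at the median of `feature` and return the cutoff plus
    the Radiant win rate in each half."""
    usable = [r for r in rows if r[feature] is not None]
    cutoff = median(r[feature] for r in usable)
    groups = {"at_or_below": [], "above": []}
    for r in usable:
        key = "above" if r[feature] > cutoff else "at_or_below"
        groups[key].append(r["radiant_win"])
    # True counts as 1 and False as 0, so the mean is the win rate.
    rates = {key: (sum(wins) / len(wins) if wins else None)
             for key, wins in groups.items()}
    return cutoff, rates


# Made-up rows, for illustration only.
sample_rows = [
    {"win_pct_r": 0.40, "radiant_win": False},
    {"win_pct_r": 0.45, "radiant_win": False},
    {"win_pct_r": 0.50, "radiant_win": True},
    {"win_pct_r": 0.55, "radiant_win": False},
    {"win_pct_r": 0.60, "radiant_win": True},
    {"win_pct_r": 0.65, "radiant_win": True},
]
cutoff, rates = radiant_win_rate_by_split(sample_rows, "win_pct_r")
```

A median split is only a first look. A fuller study would likely fit a proper classifier on many columns and validate it on held-out matches.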
Coverage
The dataset covers professional Dota 2 matches played between 1st January 2019 and 19th June 2021 [1, 6, 7]. The statistics are derived from the performance of professional players [1, 3]. There are no specific geographic or demographic notes beyond its focus on professional matches.
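The stated coverage window can double as a sanity check on loaded rows, and the match date gives a natural way to separate earlier matches from later ones when evaluating a model. The following is a minimal sketch; treating both boundary dates as inclusive, the helper names, and the sample rows are all assumptions.

```python
from datetime import date, datetime

# Coverage window as stated above; inclusive bounds are an assumption.
COVERAGE_START = date(2019, 1, 1)
COVERAGE_END = date(2021, 6, 19)


def in_coverage(dt_match):
    """Return True if a match datetime falls inside the stated window."""
    return COVERAGE_START <= dt_match.date() <= COVERAGE_END


def chronological_split(rows, cutoff):
    """Split rows into (earlier, later) around a cutoff datetime, so a
    model is never evaluated on matches older than its training data."""
    ordered = sorted(rows, key=lambda r: r["dt_match"])
    earlier = [r for r in ordered if r["dt_match"] < cutoff]
    later = [r for r in ordered if r["dt_match"] >= cutoff]
    return earlier, later


# Made-up rows, for illustration only.
sample_rows = [
    {"match_id": 3, "dt_match": datetime(2021, 6, 19, 23, 0)},
    {"match_id": 1, "dt_match": datetime(2019, 1, 1, 0, 30)},
    {"match_id": 2, "dt_match": datetime(2020, 8, 5, 18, 0)},
    {"match_id": 4, "dt_match": datetime(2021, 6, 20, 1, 0)},  # out of range
]
covered = [r for r in sample_rows if in_coverage(r["dt_match"])]
train, test = chronological_split(covered, datetime(2021, 1, 1))
```

With these sample rows, the match on 20 June 2021 is dropped, the two earlier matches land in `train`, and the 19 June 2021 match lands in `test`.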
License
Attribution 4.0 International (CC BY 4.0) License
Who Can Use It
This dataset is primarily intended for data scientists and statistics students [1, 3]. It is also suitable for:
- Researchers interested in competitive gaming analytics.
- Gaming enthusiasts and analysts keen on understanding professional Dota 2 matches.
- Individuals seeking practical projects to apply data analysis and machine learning skills [1, 3].
Dataset Name Suggestions
- Dota 2 Pro Match Stats 2019-2021
- Professional Dota 2 Game Analytics
- Dota 2 Team Performance Data
- Esports Match Outcome Predictor Data
- Dota 2 Player Statistics Base
Attributes
Original Data Source: Dota 2 Team Performance Data