Lichess Player Performance Dataset
Free
About
This dataset provides detailed information for over 20,000 online chess games played on Lichess. It includes crucial game elements such as all moves played, the victor of the match, player ratings, and detailed information about the opening sequences. This collection is ideal for analysing game outcomes, strategic patterns, and the impact of player skill on results.
Columns
- game_id: A unique identifier assigned to each individual chess game. There are 20,100 valid and unique game IDs.
- rated: A boolean indicator specifying whether the game was rated (81% true) or unrated (19% false).
- turns: The total number of moves made in a game. The average game length is 60.5 turns, with games ranging from 1 to 349 turns.
- victory_status: Describes how the game concluded, including 'Resign' (56%), 'Mate' (32%), and 'Other' (13%), where 'Other' groups the remaining statuses. There are 4 unique statuses in total.
- winner: Identifies the winner of the game. White won 50% of games, Black won 45%, and 5% had another outcome (e.g., draw). There are 3 unique winner types.
- time_increment: The time control setting for the game, with '10+0' (10 minutes per player, no increment) being the most frequent (38%). There are 400 unique time increments recorded.
- white_id: The unique identifier for the player controlling the white pieces. There are 9,438 unique white player IDs.
- white_rating: The Elo rating of the white player at the start of the game. Ratings range from 784 to 2700, with an average of 1,600.
- black_id: The unique identifier for the player controlling the black pieces. There are 9,331 unique black player IDs.
- black_rating: The Elo rating of the black player at the start of the game. Ratings range from 789 to 2723, with an average of 1,590.
- moves: The full sequence of chess moves played within the game. There are 18,920 unique move sequences recorded, with 'e4 e5' being the most common start.
- opening_code: The standard chess opening code (e.g., A00, C00). 'A00' is the most common code, representing 5% of games. There are 365 unique opening codes.
- opening_moves: The number of moves defining the opening sequence. The average is 4.82 moves, ranging from 1 to 28 moves.
- opening_fullname: The full name of the chess opening, such as 'Van't Kruijs Opening' or 'Sicilian Defense'. There are 1,477 unique full names, with 'Van't Kruijs Opening' being the most common (2%).
- opening_shortname: A common short name for the chess opening. 'Sicilian Defense' is the most frequent (13%). There are 128 unique short names.
- opening_response: Details any specific response to the opening. This column has a significant number of missing values (94%), with 'Declined' being a recorded response.
- opening_variation: Specifies the variation within an opening. This column is missing for 28% of games, with '#2' being the most common variation (4%) among recorded entries.
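As a quick check of the missing-value figures quoted for opening_response and opening_variation, the sketch below computes the share of empty cells per column using only the Python standard library. It runs on a small made-up sample rather than the real file. The sample rows, the value encodings (for example 'TRUE' and 'White'), and the helper name missing_share are assumptions for illustration, not part of the dataset documentation.

```python
import csv
import io

# Hypothetical three-row sample using a subset of the column names listed above.
# The value encodings here are assumed; check them against the real file.
SAMPLE = """game_id,rated,turns,winner,opening_response,opening_variation
1,TRUE,13,White,,Exchange Variation
2,TRUE,16,Black,Declined,
3,FALSE,61,White,,#2
"""

def missing_share(rows, column):
    """Return the fraction of rows whose value in `column` is empty."""
    if not rows:
        return 0.0
    empty = sum(1 for row in rows if not (row.get(column) or "").strip())
    return empty / len(rows)

# With the real data, replace io.StringIO(SAMPLE) with
# open("chess_games.csv", newline="", encoding="utf-8").
rows = list(csv.DictReader(io.StringIO(SAMPLE)))
for name in ("opening_response", "opening_variation"):
    print(f"{name}: {missing_share(rows, name):.0%} missing")
```

On the full file, the two printed shares should land close to the 94% and 28% stated above if empty cells are how missing values are stored, which is itself an assumption.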
Distribution
The dataset is provided as a CSV file, chess_games.csv, and is approximately 7.69 MB in size. It comprises 17 columns and contains data for 20,100 unique chess games.
Usage
This dataset is highly valuable for various analytical tasks, including:
- Determining the percentage of games won by white, black, or ending in a draw.
- Identifying the most frequently used opening moves based on who won the game.
- Investigating the correlation between player ratings and game outcomes, and whether this varies by piece colour.
- Pinpointing the user who won the most games and understanding their performance when rated higher than their opponent.
- Analysing popular opening strategies and their success rates.
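Two of the tasks above, outcome percentages and the link between ratings and results, can be sketched with the standard library alone. The example below is a minimal illustration on a made-up four-row sample. The winner labels ('White', 'Black', 'Draw') and the function names winner_shares and higher_rated_win_rate are assumptions, so the labels should be checked against the actual values in chess_games.csv.

```python
import csv
import io
from collections import Counter

# Hypothetical sample; the column names follow the list in the Columns section,
# but the winner labels are assumed.
SAMPLE = """winner,white_rating,black_rating,opening_shortname
White,1500,1400,Sicilian Defense
Black,1300,1450,Sicilian Defense
White,1600,1700,Queen's Pawn Game
Draw,1550,1550,French Defense
"""

def winner_shares(rows):
    """Return each winner label's share of all games."""
    counts = Counter(row["winner"] for row in rows)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}

def higher_rated_win_rate(rows):
    """Share of decisive games with unequal ratings won by the higher-rated player."""
    wins = games = 0
    for row in rows:
        white, black = int(row["white_rating"]), int(row["black_rating"])
        # Skip draws and games where neither player is the rating favourite.
        if white == black or row["winner"] not in ("White", "Black"):
            continue
        games += 1
        favourite = "White" if white > black else "Black"
        wins += row["winner"] == favourite
    return wins / games if games else 0.0

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
print(winner_shares(rows))
print(higher_rated_win_rate(rows))
```

Splitting higher_rated_win_rate by the favourite's colour would address the question of whether the rating effect differs between white and black.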
Coverage
The dataset covers online chess games played on Lichess, an international platform, which implies a global player base. The time range of the games is not stated. Players are characterised only by their Elo ratings, which range from 784 to 2723. Note that the opening_response and opening_variation columns have a notable amount of missing data (94% and 28% of games, respectively).
License
CC0: Public Domain
Who Can Use It
- Chess Enthusiasts and Analysts: To delve into game statistics, study opening theory, and understand player performance trends.
- Data Scientists and Machine Learning Engineers: For building predictive models for game outcomes, rating changes, or identifying optimal strategies.
- Researchers: For academic studies on game theory, player behaviour, and the impact of various factors in chess.
- Game Developers: To inform AI development for chess games or to design features based on real-world player data.
Dataset Name Suggestions
- Lichess Chess Game Analytics
- Online Chess Game Records
- Lichess Player Performance Dataset
- Chess Opening Statistics from Lichess
- Lichess Game Data Collection
Attributes
Original Data Source: Lichess Player Performance Dataset