Minecraft User Behaviour Dataset
Data Science and Analytics
Free
About
This dataset provides a classification of Minecraft players, categorising them as either pirated or legitimate. It has been compiled from the registration forms of a significant Minecraft event hosted on Discord. To ensure privacy, personal data columns within the dataset have been converted into dummy variables, meaning they do not reveal actual personal details of individuals. This dataset offers valuable insights for understanding user behaviour and piracy trends within the gaming community.
Columns
- Index column: A unique identifier for each player record, ranging from 0 to 1422.
- age: The age of the players, with values spanning from 1 to 43 years. The average age in the dataset is approximately 16.7 years.
- discordid: The Discord username associated with each player. This column contains 1336 unique entries.
- minecraftid: The Minecraft username for each player, featuring 1291 unique entries.
- client: Specifies the platform used by the player. The two categories are Computer (Java Edition), which accounts for 81% of players, and Mobile (Bedrock version), making up the remaining 19%.
- version: Indicates whether the player uses a cracked (pirated) or paid (official) game version. Cracked versions are used by 81% of players, while Paid Version users constitute 19%.
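As a rough illustration of working with these columns, the sketch below parses a few rows and computes the share of each category in the client and version columns. The sample rows are made up for the example, and the header layout (an unnamed index column first) and the exact label strings are assumptions rather than values confirmed by the file itself.

```python
import csv
import io
from collections import Counter

# Made-up sample rows for illustration only; the real data lives in
# piracydataset.csv. The header layout and label strings are assumptions.
SAMPLE_CSV = """,age,discordid,minecraftid,client,version
0,15,user_a,player_a,Computer (Java Edition),Cracked
1,17,user_b,player_b,Mobile (Bedrock version),Paid Version
2,14,user_c,player_c,Computer (Java Edition),Cracked
3,21,user_d,player_d,Computer (Java Edition),Paid Version
"""


def load_rows(handle):
    """Read CSV rows into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


def category_shares(rows, column):
    """Return the fraction of rows that fall under each label in `column`."""
    counts = Counter(row[column] for row in rows)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


rows = load_rows(io.StringIO(SAMPLE_CSV))
client_shares = category_shares(rows, "client")
version_shares = category_shares(rows, "version")
print(client_shares)
print(version_shares)
```

To run this against the actual file, replace `io.StringIO(SAMPLE_CSV)` with `open("piracydataset.csv", newline="")` and adjust the column names if they differ.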
Distribution
The dataset is provided as a CSV file named piracydataset.csv, with a file size of 105.09 kB. It contains 6 distinct columns and comprises approximately 1423 records.
Usage
This dataset is ideally suited for machine learning tasks focused on classification, specifically for distinguishing between pirated and legitimate game users. Potential applications include:
- Developing and testing piracy detection models for online gaming platforms.
- Analysing patterns in player behaviour related to game client and version usage.
- Researching the demographic characteristics of players who opt for pirated versus paid game versions.
- Informing strategies for digital rights management in the gaming industry.
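Because the two version classes are imbalanced, a useful first step before building any classifier is to compute the accuracy of always predicting the majority class, plus the cracked rate per client. The following is a minimal sketch on made-up (client, version) pairs; the shortened labels and the helper names are hypothetical and not part of the dataset.

```python
from collections import Counter, defaultdict

# Made-up (client, version) pairs for illustration; labels are shortened
# placeholders, not the exact strings used in the dataset.
records = [
    ("Computer", "Cracked"),
    ("Computer", "Cracked"),
    ("Computer", "Paid"),
    ("Mobile", "Paid"),
    ("Mobile", "Cracked"),
    ("Computer", "Cracked"),
]


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)


def cracked_rate_by_client(pairs, positive="Cracked"):
    """Fraction of players per client whose version equals `positive`."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for client, version in pairs:
        totals[client] += 1
        if version == positive:
            hits[client] += 1
    return {client: hits[client] / totals[client] for client in totals}


baseline = majority_baseline_accuracy([version for _, version in records])
rates = cracked_rate_by_client(records)
print(f"majority baseline accuracy: {baseline:.2f}")
print(rates)
```

With the class split reported for this dataset, a model that always predicts "cracked" would already be right about 81% of the time, so any piracy detection model should be judged against that baseline rather than against 50%.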
Coverage
The dataset primarily covers demographic information such as player age, ranging from 1 to 43 years, with a notable concentration around 13-17 years. It also details the type of gaming client (computer or mobile) and the game version (cracked or paid) used by players. Geographic or specific time range scope is not explicitly defined within the source material, but the data originates from a Discord event, suggesting a potentially broad reach depending on the event's participants. Personal identifying information has been anonymised.
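The age figures above can be checked with a short summary like the one below. It is a sketch on made-up ages, with the 13 to 17 band taken from the description; the function name and sample values are hypothetical.

```python
from statistics import mean

# Made-up ages for illustration; the real column spans 1 to 43.
ages = [13, 15, 16, 17, 21, 1, 43, 14]


def share_in_band(values, low, high):
    """Fraction of values that fall within [low, high], inclusive."""
    inside = sum(1 for value in values if low <= value <= high)
    return inside / len(values)


average_age = mean(ages)
teen_share = share_in_band(ages, 13, 17)
print(f"mean age: {average_age:.1f}, share aged 13-17: {teen_share:.1%}")
```

Since the ages come from self-reported registration forms, extreme values such as 1 may be entry errors, and it may be worth inspecting them before modelling.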
License
CC BY-SA 4.0
Who Can Use It
This dataset is highly valuable for:
- Data scientists and machine learning engineers working on classification problems and predictive analytics.
- Game developers and publishers seeking to understand player demographics and combat piracy.
- Researchers in fields such as cybersecurity, digital forensics, and user behaviour analysis.
- Organisers of online gaming events interested in participant demographics and platform usage.
Dataset Name Suggestions
- Minecraft Piracy Player Classification
- Gaming Client Version Analysis
- Minecraft User Behaviour Dataset
- Discord Event Player Data
- Legit vs. Cracked Minecraft Players
Attributes
Original Data Source: Minecraft User Behaviour Dataset