Anime Ratings, Status, and Recommendation Data
Free
About
Recommendation data detailing the viewing habits and preferences of over 74,000 users across more than 16,000 anime titles, scraped from Anime-Planet. This collection was gathered between June 4th and June 25th. The dataset includes rich metadata on each anime, such as tags, synopsis, studio details, and average scores, combined with specific user lists indicating watching status—including 'dropped,' 'watched,' 'want to watch,' 'currently watching,' 'stalled,' and 'won't watch.' It also contains information about anime recommendations and agreement votes for those recommendations. User IDs are randomly generated and non-identifiable. ⚠️ A critical warning: this dataset contains information regarding anime intended for adults (hentai).
Columns
The collection is structured into several core files:
- animelist.csv (20 million rows, 74,129 users):
  - user_id: A randomly generated, non-identifiable ID.
  - anime_id: The Anime-Planet ID for the title.
  - score: The rating assigned by the user, ranging from 1 to 5 in increments of 0.5, or 0 if no score was assigned.
  - watching_status: A state ID corresponding to the user's progress status for the anime.
  - watched_episodes: The total number of episodes viewed by the user.
- watching_status.csv: Provides descriptions for the state IDs found in the watching_status column of animelist.csv.
- rating_complete.csv (8 million ratings): A filtered subset of animelist.csv that only includes entries where the user watched the anime in full and assigned a score (not 0). Columns are user_id, anime_id, and rating.
- anime.csv (16,621 anime): General metadata for each title.
  - Anime-PlanetID: The Anime-Planet ID for the title.
  - Name: The full name of the anime (e.g., FLCL).
  - Alternative Name: Another name for the anime (e.g., Furi Kuri).
  - Rating Score: The average score given by all users in the Anime-Planet database.
  - Number Votes: The count of users who provided a score.
  - Tags: A comma-separated list of categories (e.g., Comedy, Sci Fi).
  - Content Warning: A comma-separated list of warning tags (e.g., Explicit Violence, Nudity).
  - Type: The format (e.g., TV, movie, OVA).
  - Episodes: The number of episodes.
  - Finished: Boolean indicating whether the anime had finished at the time of scraping.
  - Duration: Duration in minutes.
  - StartYear, EndYear: The years in which the anime started and finished airing.
  - Season: Season and year of release (e.g., Fall 2000).
  - Studios: A comma-separated list of production studios.
  - Synopsis: The summary of the anime.
  - Url: The URL of the main page on Anime-Planet.
- anime_recommendations.csv: Lists anime recommendations. Columns are Anime (ID), Recommendation (ID), and Agree Votes (the count of users who agreed with the suggestion).
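The score and watching_status columns described above interact in one way that is easy to miss: a score of 0 means "no score assigned", so it should be excluded before averaging. The following is a minimal sketch of that handling using only the Python standard library. The toy rows, the watching_status.csv header names (status, description), and the ID-to-label mapping are assumptions made for illustration; check the real files before relying on them.

```python
import csv
import io
from collections import defaultdict

# Toy stand-ins for animelist.csv and watching_status.csv.
# ASSUMPTION: the watching_status.csv header names ("status", "description")
# and the ID-to-label mapping below are invented for this example.
ANIMELIST_CSV = """user_id,anime_id,score,watching_status,watched_episodes
0,10,4.5,1,12
0,11,0,1,24
1,10,2.0,2,3
1,12,5.0,1,1
2,11,1.5,2,5
"""

WATCHING_STATUS_CSV = """status,description
1,watched
2,dropped
"""


def load_status_labels(handle):
    """Map each state ID to its human-readable description."""
    return {int(row["status"]): row["description"] for row in csv.DictReader(handle)}


def mean_score_by_status(handle, labels):
    """Average the non-zero scores per watching status, streaming row by row."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(handle):
        score = float(row["score"])
        if score == 0:
            continue  # 0 marks an unscored entry, not a rating of zero
        label = labels[int(row["watching_status"])]
        totals[label] += score
        counts[label] += 1
    return {label: totals[label] / counts[label] for label in totals}


labels = load_status_labels(io.StringIO(WATCHING_STATUS_CSV))
means = mean_score_by_status(io.StringIO(ANIMELIST_CSV), labels)
print(means)
```

Because the function reads one row at a time, the same approach should scale to the full 20 million row file by passing a handle from open("animelist.csv", newline="") instead of the in-memory toy data.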
Distribution
The core data files integrate feedback from 74,129 distinct users pertaining to 16,621 anime titles. The largest transaction file, animelist.csv, contains 20 million rows. A focused subset, rating_complete.csv, provides 8 million ratings applied to 15,681 anime by 68,199 users.
In addition to the CSV tables, the data includes an "html" folder. This folder contains one zip file per anime (16,621 zips), each holding HTML pages scraped from Anime-Planet, such as the main page, reviews, recommendations, characters, and staff information.
Certain metadata fields in anime.csv show gaps in collection; for instance, Content Warning is unknown in 90% of records, and Season is unknown in 75% of records.
Usage
This dataset is highly suitable for experiments aimed at understanding user preference and media consumption. Ideal applications include:
- Building Advanced Recommender Systems: Training and testing new recommendation methodologies, such as collaborative filtering or systems based on context features like tags and synopses.
- Feature Importance Identification: Determining which characteristics of an anime (like tags, content warnings, or studio) are most effective for improving recommendation accuracy.
- Behavioural Analysis: Studying how users interact with their lists, analysing the correlation between assigned scores and watching statuses (e.g., comparing scores for 'dropped' versus 'watched completely').
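As one concrete starting point for the collaborative filtering use case listed above, the sketch below ranks titles by cosine similarity between their rating vectors (a basic item-based approach). It is an illustration under stated assumptions, not a method prescribed by the dataset: the triples are hypothetical data shaped like rating_complete.csv, the function names are invented for this example, and a missing rating is treated as zero.

```python
import math
from collections import defaultdict

# Hypothetical (user_id, anime_id, rating) triples shaped like rating_complete.csv.
RATINGS = [
    (0, 10, 5.0), (0, 11, 4.0),
    (1, 10, 4.0), (1, 11, 5.0), (1, 12, 1.0),
    (2, 10, 1.0), (2, 12, 5.0),
]


def item_vectors(ratings):
    """Group ratings into one sparse {user_id: rating} vector per anime."""
    vectors = defaultdict(dict)
    for user_id, anime_id, rating in ratings:
        vectors[anime_id][user_id] = rating
    return dict(vectors)


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[u] * b[u] for u in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(anime_id, vectors, top_n=2):
    """Rank every other title by cosine similarity to anime_id, best first."""
    scores = [
        (other, cosine(vectors[anime_id], vectors[other]))
        for other in vectors
        if other != anime_id
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


vectors = item_vectors(RATINGS)
neighbours = most_similar(10, vectors)
print(neighbours)
```

In the toy data, title 11 ranks above title 12 as a neighbour of title 10 because the two users who rated both 10 and 11 scored them similarly. On the full dataset a sparse-matrix library would likely be needed for speed, and the Agree Votes in anime_recommendations.csv could serve as one external reference for the resulting rankings.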
Coverage
The data was collected over a narrow timeframe, between June 4th and June 25th. It covers the viewing habits of 74,129 users across 16,621 unique anime titles sourced from Anime-Planet. All users are anonymised using randomly generated IDs. The metadata allows for analysis spanning the entire recorded history of anime up to the scraping date, using the StartYear and EndYear fields.
License
CC0: Public Domain
Who Can Use It
- Machine Learning Developers: Seeking large-scale, real-world data to build and benchmark cutting-edge recommendation algorithms.
- Data Scientists and Analysts: Interested in user behaviour modelling related to media consumption and rating systems.
- Academic Researchers: Focused on longitudinal studies of media trends, feature analysis, and the mechanics of online fan communities.
Dataset Name Suggestions
- Anime-Planet User Preference Data 2020
- Anime Ratings, Status, and Recommendation Data
- Large-Scale Anime Watch List and Metadata Collection
Attributes
Original Data Source: Anime Ratings, Status, and Recommendation Data
