Film Popularity and Rating Analysis Data
Free
About
A detailed collection of metadata for 10,002 movies, sourced from a popular movie rating and review platform with Letterboxd-style data. The dataset offers a snapshot of movie attributes, including titles, directors, genres, ratings, runtime, language, and studio information, alongside user engagement metrics such as watches, likes, and list appearances. It is a useful resource for data scientists, machine learning enthusiasts, and film analysts who want to explore movie popularity dynamics, genre trends, user behaviour, and predictive modelling of film ratings or engagement levels. The data is structured and clean, ready to support a wide range of machine learning and statistical analyses.
Columns
The dataset includes 16 key attributes detailing movie characteristics and user interactions:
- Film_title: The title of the movie (9665 unique values).
- Director: The primary director(s) of the movie.
- Average_rating: The mean user rating for the movie, scaled from 1 to 5.
- Genres: A list detailing the genres associated with the movie (e.g., ['Horror', 'Drama']).
- Runtime: The length of the movie in minutes (average runtime is 103 minutes).
- Original_language: The official language of the movie (English is the most common at 81%).
- Description: A brief synopsis of the movie’s plot or theme.
- Studios: A list of production studios associated with the film.
- Watches: The total count of times the movie has been watched by users (average is 170k).
- List_appearances: The total number of times the movie features in user-curated lists.
- Likes: The overall number of likes the movie has received from users.
- Fans: The count of users who have marked themselves as fans of the movie.
- Lowest★: The volume of 1-star ratings received.
- Medium★★★: The volume of 3-star ratings received.
- Highest★★★★★: The volume of 5-star ratings received.
- Total_ratings: The aggregate number of ratings across all star levels (average is 106k).
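The list-valued columns (Genres, Studios) appear to be stored as text such as ['Horror', 'Drama'], so they usually need converting before analysis. Below is a minimal sketch using only the Python standard library. It assumes the cells hold Python-style list literals, and the two sample rows are invented purely for illustration, not taken from the dataset.

```python
import ast
import csv
import io

# Hypothetical two-row sample in the column layout described above;
# the titles and numbers are invented for illustration only.
SAMPLE_CSV = '''Film_title,Genres,Runtime,Average_rating
Example Film A,"['Horror', 'Drama']",95,3.4
Example Film B,"['Comedy']",110,3.9
'''


def load_films(handle):
    """Read rows from a CSV handle and convert selected columns to Python types."""
    films = []
    for row in csv.DictReader(handle):
        # Assumption: Genres is stored as the text of a Python list literal.
        row["Genres"] = ast.literal_eval(row["Genres"])
        # Numeric columns arrive as strings and are converted to floats.
        row["Runtime"] = float(row["Runtime"])
        row["Average_rating"] = float(row["Average_rating"])
        films.append(row)
    return films


films = load_films(io.StringIO(SAMPLE_CSV))
print(films[0]["Genres"])
```

With the real file, the same function could be given an open file handle instead of the in-memory sample, for example `open("films.csv", newline="", encoding="utf-8")`, where the filename is a placeholder.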
Distribution
This dataset contains 10,002 rows and 16 columns of metadata and engagement metrics. The file format is CSV, so it can be loaded directly into most analysis tools. Features include numerical metrics, categorical labels, and text-based data, suitable for varied statistical approaches.
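A quick sanity check on row and column counts can be useful after downloading the file. The sketch below uses only the standard library. The expected names are copied from the Columns section, and their exact spelling in the actual file (especially the star columns) is an assumption. The demo input is a made-up three-column fragment, not real data.

```python
import csv
import io

# Column names as listed in the Columns section; the exact spelling in the
# real file (especially the star columns) is an assumption.
EXPECTED_COLUMNS = [
    "Film_title", "Director", "Average_rating", "Genres", "Runtime",
    "Original_language", "Description", "Studios", "Watches",
    "List_appearances", "Likes", "Fans", "Lowest★", "Medium★★★",
    "Highest★★★★★", "Total_ratings",
]


def check_shape(handle):
    """Return (data row count, column count, expected columns that are missing)."""
    reader = csv.reader(handle)
    header = next(reader)
    row_count = sum(1 for _ in reader)
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    return row_count, len(header), missing


# Made-up fragment with one data row and three columns, for demonstration.
demo = io.StringIO("Film_title,Director,Runtime\nExample Film,Example Director,100\n")
rows, cols, missing = check_shape(demo)
print(rows, cols, len(missing))
```

On the full file, an empty `missing` list and matching counts would suggest the download is intact.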
Usage
The dataset is ideal for several advanced analytical and predictive tasks:
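- Classification and Regression: Building models to predict movie ratings or user engagement (such as likes or watches) from features like genres, runtime, or language. Predicting a numeric rating is a regression task; predicting a rating band or engagement tier is a classification task.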
- Recommendation Systems: Developing systems using collaborative filtering or content-based approaches to suggest movies based on user preferences or film attributes.
- Exploratory Data Analysis (EDA): Analysing shifts and trends in movie production, genre popularity over time, or the influence of specific studios.
- Sentiment Analysis and NLP: Utilising movie descriptions for natural language processing tasks, including topic modelling or sentiment extraction.
- Clustering: Grouping movies based on similarities in genres, average ratings, or overall user engagement patterns.
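As one illustration of the content-based approach mentioned above, films can be compared by the overlap of their genre lists. The following is a minimal sketch using Jaccard similarity. The function names and the tiny catalogue are hypothetical, and a real recommender would likely combine more attributes than genres alone.

```python
def jaccard(a, b):
    """Jaccard similarity of two genre lists: |intersection| / |union|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # Convention chosen here for two empty genre lists.
    return len(a & b) / len(a | b)


def most_similar(title, catalogue, top_n=2):
    """Rank the other films in the catalogue by genre overlap with `title`."""
    target = catalogue[title]
    scores = [
        (other, jaccard(target, genres))
        for other, genres in catalogue.items()
        if other != title
    ]
    # Highest similarity first; ties are broken alphabetically by title.
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return scores[:top_n]


# Hypothetical catalogue mapping invented titles to genre lists.
catalogue = {
    "Film A": ["Horror", "Drama"],
    "Film B": ["Horror", "Thriller"],
    "Film C": ["Comedy"],
    "Film D": ["Drama", "Horror", "Mystery"],
}
result = most_similar("Film A", catalogue)
print(result)
```

Here "Film D" ranks first because it shares two of three distinct genres with "Film A", while "Film B" shares only one of three.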
Coverage
The data covers metadata for exactly 10,002 films. It encompasses a wide selection of movies across diverse genres, languages, and production eras.
License
CC0 (Public Domain)
Who Can Use It
- Data Scientists: For developing prediction models for movie ratings and engagement.
- Machine Learning Enthusiasts: For practising classification, recommendation system building, and clustering algorithms.
- Film Analysts: For studying the relationship between movie attributes (e.g., genre, director) and critical reception or audience popularity.
- Researchers: For performing advanced statistical analysis on genre trends across time periods or languages.
Dataset Name Suggestions
- Letterboxd Movie Ratings & Classification Dataset
- Movie Metadata and Engagement Metrics
- Film Popularity and Rating Analysis Data
- Global Film Rating System Dataset
Attributes
Original Data Source: Film Popularity and Rating Analysis Data
