Film Recommendation Systems Data
Free
About
This dataset contains details of the 10,000 most popular movies according to TMDB ratings. It is suited to developing and testing recommendation systems, a technology used by platforms such as Netflix, Amazon Prime, and YouTube. The data was compiled from the official TMDB API and gives a structured view of popular films, making it a useful foundation for data analytics and machine learning projects.
Columns
- id: A unique identifier for each movie.
- original_language: The original language of the movie, represented by ISO 639-1 codes (e.g., 'en' for English, 'hi' for Hindi). This column contains 44 different languages, with 7,771 movies listed as English ('en').
- original_title: The title of the movie.
- popularity: An indicator of the movie's popularity, where a larger number signifies higher popularity.
- release_date: The release date of the movie. If a release date is absent, it indicates the movie has not yet been released.
- vote_average: The average rating or vote received for the movie.
- vote_count: The total number of ratings or votes recorded for the movie.
- genre: The genre of the movie.
- overview: A brief description of the movie in string format.
- revenue: The revenue generated by the movie.
- runtime: The running time of the movie in minutes.
- tagline: The tagline associated with the movie.
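The column descriptions above can be turned into a small typed loader. The following is a minimal sketch using only the Python standard library. It assumes the CSV header uses exactly the column names listed above and that release_date is ISO formatted (YYYY-MM-DD); neither is confirmed by this listing, and the function names are illustrative.

```python
import csv
from datetime import date

# Column names as documented in the listing above.
EXPECTED_COLUMNS = [
    "id", "original_language", "original_title", "popularity",
    "release_date", "vote_average", "vote_count", "genre",
    "overview", "revenue", "runtime", "tagline",
]


def _to_float(value):
    """Return the cell as a float, or None when the cell is blank."""
    value = (value or "").strip()
    return float(value) if value else None


def _to_int(value):
    """Return the cell as an int (tolerating '10.0'), or None when blank."""
    number = _to_float(value)
    return int(number) if number is not None else None


def parse_row(row):
    """Convert one csv.DictReader row of strings into typed values."""
    raw_date = (row.get("release_date") or "").strip()
    return {
        "id": _to_int(row.get("id")),
        "original_language": row.get("original_language") or "",
        "original_title": row.get("original_title") or "",
        "popularity": _to_float(row.get("popularity")),
        # Assumption: dates are ISO formatted; blank means no date recorded.
        "release_date": date.fromisoformat(raw_date) if raw_date else None,
        "vote_average": _to_float(row.get("vote_average")),
        "vote_count": _to_int(row.get("vote_count")),
        "genre": row.get("genre") or "",
        "overview": row.get("overview") or "",
        "revenue": _to_float(row.get("revenue")),
        "runtime": _to_float(row.get("runtime")),
        "tagline": row.get("tagline") or "",
    }


def load_movies(fileobj):
    """Read all rows from an open CSV file object, checking the header first."""
    reader = csv.DictReader(fileobj)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return [parse_row(row) for row in reader]
```

To use it, open the downloaded file with `open(path, newline="", encoding="utf-8")` and pass the file object to `load_movies`. The path is whatever name the file was saved under; this listing does not state one.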
Distribution
The dataset is provided in CSV format and is approximately 4.11 MB in size. It contains 10,000 records (rows) and 13 columns, 12 of which are described in the Columns section above.
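The figures quoted in this listing (row count, column count, number of languages) can be checked against a downloaded copy. The following is a rough sketch using the standard library; the function name `summarize` is illustrative, and the only column name it relies on is original_language.

```python
import csv
from collections import Counter


def summarize(fileobj):
    """Return (row count, column count, per-language row counts) for a CSV."""
    reader = csv.DictReader(fileobj)
    n_rows = 0
    languages = Counter()
    for row in reader:
        n_rows += 1
        languages[row.get("original_language")] += 1
    # fieldnames holds the header row once the reader has been used.
    n_cols = len(reader.fieldnames or [])
    return n_rows, n_cols, languages
```

Comparing the returned values with the numbers stated here (10,000 rows, 13 columns, 44 languages, 7,771 rows with 'en') is a quick way to confirm that the copy matches the description.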
Usage
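This dataset is suited to building and training movie recommendation algorithms and provides a starting point for exploring machine learning techniques for recommender systems. The text columns (overview, tagline, genre) lend themselves to content-based approaches, while popularity, vote_average, and vote_count support popularity-based baselines.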
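As one possible starting point, the sketch below ranks movies by the similarity of their overview text, using TF-IDF weights with a smoothed IDF and cosine similarity. It is an illustration of a content-based approach, not a method prescribed by this dataset; the function names are hypothetical, and it uses only the standard library so that it stays self-contained.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (a dict of term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        vectors.append({
            # Smoothed IDF: rare terms weigh more, shared terms weigh less.
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in term_freq.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(titles, overviews, query_title, k=3):
    """Return up to k titles whose overviews are most similar to the query's."""
    vectors = tfidf_vectors(overviews)
    query_index = titles.index(query_title)
    scored = [
        (cosine(vectors[query_index], vector), title)
        for index, (vector, title) in enumerate(zip(vectors, titles))
        if index != query_index
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scored[:k]]
```

With real data, `titles` and `overviews` would come from the original_title and overview columns. For 10,000 rows, a library implementation of TF-IDF is likely to be faster than this pure-Python version.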
Coverage
The dataset's coverage is global: the data comes from the TMDB API and includes films in a wide range of original languages. The movie release dates in the dataset span from 17th April 1902 to 20th December 2028. No specific demographic scope is noted.
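Because the date range extends to 2028 and a missing release_date marks an unreleased movie, it can be useful to separate released films from upcoming ones before training. The following is a minimal sketch; it assumes each row is a dict with release_date as an ISO formatted string (YYYY-MM-DD), which this listing does not confirm, and the function name is illustrative.

```python
from datetime import date


def split_by_release(movies, today):
    """Split rows into (released, upcoming) relative to the given date."""
    released, upcoming = [], []
    for movie in movies:
        raw = (movie.get("release_date") or "").strip()
        # A blank date is treated as "not yet released", per the column notes.
        if not raw or date.fromisoformat(raw) > today:
            upcoming.append(movie)
        else:
            released.append(movie)
    return released, upcoming
```

Passing `date.today()` as `today` gives the split as of the current day; passing a fixed date keeps the result reproducible.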
License
CC0: Public Domain
Who Can Use It
This dataset is aimed at beginner and intermediate data scientists, machine learning engineers, and students interested in developing recommendation systems. It works as a foundational dataset for both educational purposes and practical data analytics.
Dataset Name Suggestions
- TMDB Top 10,000 Movies
- Popular Movie Ratings Dataset
- Film Recommendation Systems Data
- Global Movie Popularity Dataset
Attributes
Original Data Source: Film Recommendation Systems Data