TMDB Top Films Metadata
About
This dataset offers a detailed collection of metadata for the top 10,000 most popular films on The Movie Database (TMDB). It includes key information such as film titles, release dates, runtime details, genres, production companies, budget figures, and revenue generated. The data has been gathered from TMDB's public API, then cleaned and preprocessed to ensure quality and ease of use. It is ideal for data analysts, researchers, and developers keen to examine the characteristics and popularity of films. Potential uses include exploring trends in film genres over time, identifying patterns in production budgets and revenues, and assessing how various attributes influence a film's appeal.
Columns
- id: A unique identifier assigned to each film within the TMDB database.
- title: The name of the film.
- release_date: The date when the film was released.
- genres: A list of categories or genres associated with the film.
- original_language: The language in which the film was originally produced.
- vote_average: The average rating given to the film by TMDB users.
- vote_count: The total number of votes cast for the film on TMDB.
- popularity: A score indicating the film's popularity, based on user engagement on TMDB.
- overview: A concise description or synopsis of the film.
- budget: The estimated financial budget for producing the film, in USD.
- production_companies: A list of companies involved in the film's production.
- revenue: The total income generated by the film, in USD.
- runtime: The total duration of the film, measured in minutes.
- tagline: A short, memorable phrase linked with the film, often used for promotional purposes.
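As an illustration of how these columns might be read into typed records, here is a minimal sketch using only the Python standard library. Several things are assumptions rather than facts about the file: the on-disk encoding of list-valued columns such as genres (the parser tries a Python-style list literal first, then falls back to comma-separated text), ISO formatting of release_date, and the helper names parse_list, parse_float, parse_date, parse_row and load_films, which are hypothetical. The sample rows are placeholders, not records from the dataset.

```python
import ast
import csv
import io
from datetime import date

# Placeholder rows for illustration only; these are not records from the dataset.
SAMPLE = '''id,title,release_date,genres,budget,revenue,runtime,tagline
1,Example Film A,2020-01-15,"['Action', 'Drama']",1000000,2500000,110,A placeholder tagline
2,Example Film B,,"Comedy, Romance",0,0,95,
'''


def parse_list(cell):
    """Parse a list-valued cell. The stored format is an assumption."""
    cell = (cell or "").strip()
    if not cell:
        return []
    try:
        value = ast.literal_eval(cell)  # handles "['Action', 'Drama']"
    except (ValueError, SyntaxError):
        value = None
    if isinstance(value, (list, tuple)):
        return [str(item).strip() for item in value]
    # Fallback: treat the cell as comma-separated text.
    return [part.strip() for part in cell.split(",") if part.strip()]


def parse_float(cell):
    """Return the cell as a float, or None when it is empty or malformed."""
    try:
        return float(cell)
    except (TypeError, ValueError):
        return None


def parse_date(cell):
    """Return an ISO date (YYYY-MM-DD) as datetime.date, or None."""
    try:
        return date.fromisoformat((cell or "").strip())
    except ValueError:
        return None


def parse_row(row):
    """Convert one csv.DictReader row into a typed dictionary."""
    return {
        "id": row.get("id"),
        "title": row.get("title"),
        "release_date": parse_date(row.get("release_date")),
        "genres": parse_list(row.get("genres")),
        "budget": parse_float(row.get("budget")),
        "revenue": parse_float(row.get("revenue")),
        "runtime": parse_float(row.get("runtime")),
        "tagline": (row.get("tagline") or "").strip() or None,
    }


def load_films(handle):
    """Read every row from an open CSV file handle."""
    return [parse_row(row) for row in csv.DictReader(handle)]


films = load_films(io.StringIO(SAMPLE))
```

To read the actual file, the same load_films function could be given a handle opened with open(path, newline="", encoding="utf-8"); the UTF-8 encoding is also an assumption.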
Distribution
The dataset is primarily available as a CSV data file named top_1000_popular_movies_tmdb.csv. It has a file size of 4.9 MB and consists of 15 columns and 10,001 records. Most columns are well-populated, but some have missing values, notably 'tagline' (approximately 26% missing) and 'overview' (about 1% missing).
Usage
This dataset can be effectively used for:
- Exploring trends in film genres across different time periods.
- Identifying patterns related to film budgets and the revenues they generate.
- Analysing the influence of various film attributes on their overall popularity.
- Developing predictive models for film success.
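The first two uses above can be sketched in a few lines of standard-library Python. This is an illustrative outline, not a prescribed method: it assumes each film is already a dictionary with a parsed genres list and an ISO-formatted release_date string, and it treats a budget or revenue of zero as unreported, which is an assumption about the data rather than something the listing states. The function names and the sample records are hypothetical placeholders.

```python
from collections import Counter, defaultdict

# Placeholder records for illustration only; not taken from the dataset.
FILMS = [
    {"title": "Example Film A", "release_date": "1994-05-01",
     "genres": ["Drama"], "budget": 1000000, "revenue": 2500000},
    {"title": "Example Film B", "release_date": "1999-11-20",
     "genres": ["Drama", "Comedy"], "budget": 0, "revenue": 500000},
    {"title": "Example Film C", "release_date": "2005-03-10",
     "genres": ["Comedy"], "budget": 2000000, "revenue": 1000000},
    {"title": "Example Film D", "release_date": "",
     "genres": ["Action"], "budget": 100, "revenue": 0},
]


def release_year(film):
    """Extract the year from an ISO date string, or None if unavailable."""
    text = (film.get("release_date") or "")[:4]
    return int(text) if len(text) == 4 and text.isdigit() else None


def genre_counts_by_decade(films):
    """Count how many films of each genre were released in each decade."""
    counts = defaultdict(Counter)
    for film in films:
        year = release_year(film)
        if year is None:
            continue  # skip films without a usable release date
        decade = (year // 10) * 10
        for genre in film.get("genres", []):
            counts[decade][genre] += 1
    return counts


def revenue_to_budget(films):
    """Map title to revenue/budget, skipping zero or missing figures."""
    ratios = {}
    for film in films:
        budget, revenue = film.get("budget"), film.get("revenue")
        if budget and revenue:  # assumption: 0 means "not reported"
            ratios[film["title"]] = revenue / budget
    return ratios


trend = genre_counts_by_decade(FILMS)
ratios = revenue_to_budget(FILMS)
```

Grouping by decade rather than by year keeps the counts stable when some years have few films; a finer grouping is a one-line change in genre_counts_by_decade.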
Coverage
This dataset covers the 10,000 most popular films listed on TMDB and is global in scope, given the nature of TMDB itself. No fixed collection period is specified; film release dates vary, with a significant number around 2023-06-11. The data reflects user engagement metrics from TMDB's audience. Some columns have gaps: 'tagline' has approximately 26% missing values and 'overview' about 1%, while the other columns have very few or no missing entries.
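The missing-value percentages quoted above could be re-checked with a sketch like the following, which computes the share of empty cells per column. It assumes that a missing value appears as an empty or whitespace-only cell in the CSV; other markers, such as a literal "NaN", would need extra handling. The function name missing_share is hypothetical and the sample rows are placeholders, not records from the dataset.

```python
import csv
import io

# Placeholder rows for illustration only; not taken from the dataset.
SAMPLE = '''title,overview,tagline
Example Film A,A placeholder synopsis,A placeholder tagline
Example Film B,Another placeholder synopsis,
Example Film C,,Another placeholder tagline
Example Film D,Yet another placeholder synopsis,Third placeholder tagline
'''


def missing_share(rows):
    """Return the fraction of empty cells for each column."""
    rows = list(rows)
    if not rows:
        return {}
    columns = list(rows[0].keys())
    return {
        column: sum(1 for row in rows
                    if not (row.get(column) or "").strip()) / len(rows)
        for column in columns
    }


shares = missing_share(csv.DictReader(io.StringIO(SAMPLE)))
```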
License
CC0: Public Domain
Who Can Use It
- Data analysts: To study film popularity, genre trends, and financial patterns.
- Researchers: For academic studies on film characteristics and factors affecting their success.
- Developers: To integrate film metadata into applications or build tools for film analysis.
Dataset Name Suggestions
- TMDB Top Films Metadata
- Popular Movies Dataset by TMDB
- Global Film Popularity Index
- Film Data for Analysis
- Top 10,000 TMDB Films
Attributes
Original Data Source: TMDB Top Films Metadata