Opendatabay APP

Global Cinematic Database

Entertainment & Media Consumption

Tags and Keywords

Arts

Entertainment

Movies

TV

Shows

Exploratory

Data

Analysis

NLP

Cleaning

Free

About

This dataset covers over 10,000 films from TMDB (The Movie Database), gathered using the TMDB API. It includes film identifiers, titles, release dates, average votes, vote counts, overviews, and popularity metrics. Fields may be null where the information was not available from the TMDB database. The dataset is particularly useful for new analysts who want to practise handling missing data, and for developing film recommendation systems.

Columns

  • id: Unique identifier for the film.
  • title: The name of the film.
  • overview: A brief summary or synopsis of the film.
  • release_date: The original release date of the film.
  • popularity: A numerical score indicating the film's popularity.
  • vote_average: The average vote score received by the film.
  • vote_count: The total number of votes cast for the film.
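
The column list above can be turned into an explicit schema when loading the file. The following is a minimal sketch, assuming the CSV header uses exactly the seven column names listed and that release dates are stored in ISO format (YYYY-MM-DD); neither is confirmed by the listing. The sample rows and the `load_films` helper are made up for illustration, and the real file name is not known.

```python
import io

import pandas as pd

# Tiny in-memory stand-in for the real file; these rows are invented for illustration.
SAMPLE_CSV = """id,title,overview,release_date,popularity,vote_average,vote_count
1,Example Film A,A short synopsis.,1999-03-31,45.2,8.1,21000
2,Example Film B,,2010-07-16,80.5,8.3,32000
3,Example Film C,Another synopsis.,,12.0,,0
"""


def load_films(source):
    """Read the seven listed columns with explicit dtypes (hypothetical helper)."""
    return pd.read_csv(
        source,
        dtype={
            "id": "Int64",            # nullable integer, tolerates missing ids
            "title": "string",
            "overview": "string",
            "popularity": "float64",
            "vote_average": "float64",
            "vote_count": "Int64",
        },
        parse_dates=["release_date"],  # missing dates become NaT
    )


films = load_films(io.StringIO(SAMPLE_CSV))
print(films.dtypes)
```

To use it on the downloaded file, pass the file path to `load_films` instead of the in-memory sample.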

Distribution

The dataset contains information on over 10,000 films and includes unique identifiers for nearly 10,000 films. The data is typically available as a CSV file, which loads directly into a pandas DataFrame. Release dates span from 17th April 1902 to 7th September 2022. Popularity scores vary widely: most fall in the lower ranges, but some reach high values. Vote counts are also broadly distributed, and average vote scores range from approximately 5.00 to 8.70. Some fields may contain null values.
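
The figures quoted above (row count versus unique identifiers, date span, vote range, nulls) are easy to re-check after download. A minimal sketch follows, assuming the same seven column names as the Columns section; the sample rows are invented, so the printed numbers describe the sample only, not the real dataset.

```python
import io

import pandas as pd

# Invented sample rows, including one repeated id and several empty fields.
SAMPLE_CSV = """id,title,overview,release_date,popularity,vote_average,vote_count
1,Example Film A,A short synopsis.,1999-03-31,45.2,8.1,21000
2,Example Film B,,2010-07-16,80.5,8.3,32000
3,Example Film C,Another synopsis.,,12.0,,0
3,Example Film C,Another synopsis.,,12.0,,0
"""

films = pd.read_csv(io.StringIO(SAMPLE_CSV), parse_dates=["release_date"])

# Nulls per column.
null_counts = films.isna().sum()

# Rows whose id already appeared earlier in the file.
duplicate_ids = int(films["id"].duplicated().sum())

# Observed spans; min() and max() skip missing values.
date_range = (films["release_date"].min(), films["release_date"].max())
vote_range = (films["vote_average"].min(), films["vote_average"].max())

print(null_counts)
print("rows:", len(films), "unique ids:", films["id"].nunique())
print("release dates:", date_range)
print("vote_average:", vote_range)
```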

Usage

This dataset is ideal for:
  • Developing and testing film recommendation systems.
  • Practising data cleaning and handling of missing values, particularly beneficial for new data analysts.
  • Exploratory data analysis of film trends and audience reception.
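
As one way to combine the first two uses, the sketch below drops rows without vote data, fills missing overviews, and ranks films by a vote-count-weighted average. The weighting formula is a common ranking heuristic chosen here as an assumption; the listing does not prescribe any method. The `clean` and `weighted_rating` helpers, the `min_votes` threshold, and the sample rows are all hypothetical.

```python
import io

import pandas as pd

# Invented sample rows for illustration only.
SAMPLE_CSV = """id,title,overview,release_date,popularity,vote_average,vote_count
1,Example Film A,A short synopsis.,1999-03-31,45.2,8.0,1000
2,Example Film B,,2010-07-16,80.5,9.0,10
3,Example Film C,Another synopsis.,,12.0,,0
4,Example Film D,Yet another synopsis.,2015-05-01,30.0,6.0,500
"""


def clean(films):
    """Drop rows lacking vote data and replace missing overviews with ''."""
    out = films.dropna(subset=["vote_average", "vote_count"]).copy()
    out["overview"] = out["overview"].fillna("")
    return out


def weighted_rating(films, min_votes):
    """Shrink each film's average towards the overall mean when it has few votes."""
    mean_vote = films["vote_average"].mean()
    v = films["vote_count"]
    r = films["vote_average"]
    return (v / (v + min_votes)) * r + (min_votes / (v + min_votes)) * mean_vote


films = clean(pd.read_csv(io.StringIO(SAMPLE_CSV), parse_dates=["release_date"]))
films["score"] = weighted_rating(films, min_votes=100)
ranked = films.sort_values("score", ascending=False)
print(ranked[["title", "vote_average", "vote_count", "score"]])
```

In this sample, Example Film B has the highest raw average but only 10 votes, so it ranks below Example Film A. Dropping rows is only one option for missing votes; imputation may suit other analyses better.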

Coverage

The dataset's coverage is global. It includes films released between 17th April 1902 and 7th September 2022. No specific demographic scope is noted; coverage is based on films available through the TMDB API.

License

CC0

Who Can Use It

  • Data Analysts: Especially those new to data analysis, to gain experience with data manipulation and missing value imputation.
  • Machine Learning Engineers: For building and evaluating film recommendation algorithms.
  • Researchers: Studying film industry trends, audience preferences, and cinematic history.
  • Developers: Creating applications that require film metadata.

Dataset Name Suggestions

  • TMDB Movies Data
  • Film Insights Collection
  • Global Cinematic Database
  • Movie Popularity and Ratings
  • Open Film Dataset

Attributes

Original Data Source: TMDB MOVIES DATASET

Listing Stats

  • Views: 0
  • Downloads: 0
  • Listed: 21/06/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0