IMDb Lightweight Movie & Crew Archive

Tags and Keywords

Movies

IMDb

Cinema

Actors

Directors

IMDb Lightweight Movie & Crew Archive Dataset on Opendatabay data marketplace

Free

About

This dataset is a streamlined version of the Internet Movie Database (IMDb) that links film titles to the professionals who worked on them. It merges and refines several of the original IMDb tables, including basics and principals, into a compact resource for exploring titles and their associated crew members. Only movies and TV movies are included, which keeps the archive focused for academic research and media application development.

Columns

  • ID_title: The unique identifier for each film title, also known as tconst.
  • titleType: The classification of the entry, either movie or tvMovie.
  • primaryTitle: The most commonly known title of the film.
  • originalTitle: The title in its original language.
  • startYear: The year the film was released, ranging from 1894 to 2020.
  • runtimeMinutes: The total duration of the film in minutes.
  • genres: The genre categories assigned to the film, such as Drama.
  • averageRating: The weighted average score provided by viewers.
  • numVotes: The total number of user votes that contribute to the rating.
  • ID_crew: The unique identifier for a principal crew member, also known as nconst.
  • category: The general professional role of the crew member, such as actor or director.
  • job: Specific professional titles for certain crew roles where applicable.
  • characters: The specific names of characters portrayed by actors in the film.
  • director: The identifier for the individual who directed the film.
  • writer: The identifier for the individual who wrote the film.
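
As an illustration of how the columns above might be read, here is a minimal sketch that parses rows with Python's standard csv module and converts the numeric columns. The sample values are invented, the column order is assumed, and the missing-value markers (an empty field or \N) are an assumption rather than something the listing confirms.

```python
import csv
import io

# Hypothetical sample rows; the values are invented for illustration only,
# and the column order is an assumption.
SAMPLE = """ID_title,titleType,primaryTitle,originalTitle,startYear,runtimeMinutes,genres,averageRating,numVotes,ID_crew,category,job,characters,director,writer
tt0000001,movie,Example Film,Example Film,1994,102,Drama,7.1,1200,nm0000001,actor,,"[""Sam""]",nm0000002,nm0000003
tt0000001,movie,Example Film,Example Film,1994,102,Drama,7.1,1200,nm0000002,director,,,nm0000002,nm0000003
"""

# Assumed missing-value markers: an empty field or the literal \N.
MISSING = {"", "\\N"}


def to_float(value):
    """Return a float, or None when the field is empty or marked missing."""
    return None if value in MISSING else float(value)


def to_int(value):
    """Return an int, or None when missing; tolerates values such as '102.0'."""
    number = to_float(value)
    return None if number is None else int(number)


def parse_row(row):
    """Convert the numeric columns of one CSV row to Python numbers."""
    parsed = dict(row)
    parsed["startYear"] = to_int(row["startYear"])
    parsed["runtimeMinutes"] = to_int(row["runtimeMinutes"])
    parsed["numVotes"] = to_int(row["numVotes"])
    parsed["averageRating"] = to_float(row["averageRating"])
    return parsed


rows = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
print(rows[0]["primaryTitle"], rows[0]["startYear"], rows[0]["averageRating"])
```

In this sample the same title appears once per crew member, which is one plausible reading of the merged layout; check the real file before relying on it.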

Distribution

The data is delivered as two CSV files. The movie-centric table, df_movies.csv, is 788.81 MB and contains approximately 5.68 million valid records. The accompanying df_names.csv file provides a trimmed registry of the professionals who appear in those productions. The dataset has a usability score of 10.00 and is provided as a static historical record with no future updates expected.
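
Because df_movies.csv is several hundred megabytes, it can be worth streaming it row by row rather than loading it all at once. The sketch below counts records and distinct titles this way. The function name count_rows_and_titles is hypothetical, the demo rows are invented, and the file path in the trailing comment is an assumption.

```python
import csv
import io


def count_rows_and_titles(lines):
    """Stream CSV rows and count records and distinct ID_title values."""
    n_rows = 0
    titles = set()
    for row in csv.DictReader(lines):
        n_rows += 1
        titles.add(row["ID_title"])
    return n_rows, len(titles)


# Tiny in-memory stand-in for df_movies.csv (values invented for illustration).
demo = io.StringIO(
    "ID_title,ID_crew,category\n"
    "tt0000001,nm0000001,actor\n"
    "tt0000001,nm0000002,director\n"
    "tt0000002,nm0000001,actor\n"
)
print(count_rows_and_titles(demo))  # (3, 2)

# With the real file (the path is an assumption):
# with open("df_movies.csv", newline="", encoding="utf-8") as f:
#     print(count_rows_and_titles(f))
```

Only the set of title identifiers is held in memory, so the approach stays light even for the full file.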

Usage

This resource is suited to building film recommendation engines and to network analysis of the collaborations between directors, writers, and actors. It also supports longitudinal studies of the cinema industry, such as tracking the evolution of film runtimes or genre popularity over the last century. Developers can use the data to create movie trivia applications or searchable registries for web-based media platforms.
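
The collaboration analysis mentioned above can be sketched as a co-occurrence count: two people are linked once for every title they share. This is a minimal illustration on invented (ID_title, ID_crew) pairs, not a full graph pipeline, and the function name collaboration_edges is hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical (ID_title, ID_crew) pairs, invented for illustration.
records = [
    ("tt0000001", "nm0000001"),
    ("tt0000001", "nm0000002"),
    ("tt0000002", "nm0000001"),
    ("tt0000002", "nm0000002"),
    ("tt0000002", "nm0000003"),
]


def collaboration_edges(records):
    """Count, for every pair of people, the number of titles they share."""
    crew_by_title = defaultdict(set)
    for title_id, crew_id in records:
        crew_by_title[title_id].add(crew_id)
    edges = Counter()
    for crew in crew_by_title.values():
        # Sorting makes each pair appear in one canonical order.
        for a, b in combinations(sorted(crew), 2):
            edges[(a, b)] += 1
    return edges


edges = collaboration_edges(records)
print(edges.most_common(1))  # [(('nm0000001', 'nm0000002'), 2)]
```

The resulting weighted edge list can be passed to a graph library of your choice for further analysis.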

Coverage

The temporal scope runs from 1894 to 2020. Geographically, it reflects global film production as recorded by IMDb. The dataset is restricted to entries classified as movies and TV movies and excludes other media formats. It contains approximately 5.68 million records, covering titles that have received user ratings and votes.
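
To show how the stated coverage might be applied in practice, the sketch below keeps only movie and tvMovie rows dated 1894 to 2020 and averages runtime per decade. The rows are invented, the function name mean_runtime_by_decade is hypothetical, and the de-duplication step assumes a title can repeat once per crew member.

```python
from collections import defaultdict

# Hypothetical (ID_title, titleType, startYear, runtimeMinutes) rows.
rows = [
    ("tt0000001", "movie", 1994, 102),
    ("tt0000001", "movie", 1994, 102),   # same title, second crew member
    ("tt0000002", "tvMovie", 1998, 88),
    ("tt0000003", "movie", 2005, 120),
    ("tt0000004", "movie", 2011, None),  # missing runtime, skipped
]


def mean_runtime_by_decade(rows, first_year=1894, last_year=2020):
    """Average runtime per decade, counting each title at most once."""
    seen = set()
    totals = defaultdict(lambda: [0, 0])  # decade -> [runtime sum, title count]
    for title_id, title_type, year, runtime in rows:
        if title_id in seen or year is None or runtime is None:
            continue
        if title_type not in {"movie", "tvMovie"}:
            continue
        if not first_year <= year <= last_year:
            continue
        seen.add(title_id)
        decade = year // 10 * 10
        totals[decade][0] += runtime
        totals[decade][1] += 1
    return {d: s / n for d, (s, n) in sorted(totals.items())}


print(mean_runtime_by_decade(rows))  # {1990: 95.0, 2000: 120.0}
```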

License

CC0: Public Domain

Who Can Use It

Data scientists can use these records to train machine learning models for popularity prediction and genre classification. Film historians may use the release years and crew metadata to map the careers of specific industry professionals. Software developers can also integrate the linked tables to populate databases for media-centric applications and websites.
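
For genre classification, the genres column usually needs to be turned into numeric targets first. The following is a minimal sketch of multi-hot encoding, assuming multiple genres are stored as a comma-separated string; that separator is an assumption and should be verified against the file. The function names are hypothetical.

```python
def genre_vocabulary(genre_strings):
    """Collect the sorted set of distinct genres across all titles."""
    return sorted({g for s in genre_strings for g in s.split(",") if g})


def multi_hot(genre_string, vocab):
    """Encode one title's genres as a 0/1 vector aligned with vocab."""
    present = set(genre_string.split(","))
    return [1 if g in present else 0 for g in vocab]


# Invented genre strings; the comma separator is an assumption.
sample = ["Drama", "Comedy,Drama", "Documentary"]
vocab = genre_vocabulary(sample)
print(vocab)                             # ['Comedy', 'Documentary', 'Drama']
print(multi_hot("Comedy,Drama", vocab))  # [1, 0, 1]
```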

Dataset Name Suggestions

  • IMDb Lightweight Movie & Crew Archive
  • Streamlined Cinema History: Titles and Talent
  • Global Film and TV Movie Registry (1894–2020)
  • IMDb Merged Titles and Crew Metadata
  • Lighter IMDb Movie Database for Researchers

Attributes

Listing Stats

  • VIEWS: 3
  • DOWNLOADS: 0
  • LISTED: 28/12/2025
  • REGION: GLOBAL
  • UDQS QUALITY (Universal Data Quality Score): 5 / 5
  • VERSION: 1.0
