Netflix Movie Catalogue for ML

Tags and Keywords

Movie, Netflix, Recommendation, System, Film

Free

About

This collection contains detailed information about Netflix movies, curated for developing and testing recommendation systems. It lets users examine the characteristics, such as genre, duration, and production details, that influence content preferences. It serves as foundational material for exploratory data analysis (EDA), data cleaning, visualisation, and the preparation of machine learning models that predict movie preferences and suggest relevant titles to users.

Columns

The dataset contains ten key fields describing the movie content and metadata:
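  • Index: A numerical identifier used for internal organisation, ranging from 1 to 3896. This range exceeds the 3323 records reported for the file, so the index values are presumably not contiguous.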
  • movie_name: The title of the film, with 3323 unique values recorded.
  • Duration: The running time of the film, measured in minutes, with a mean duration of 111 minutes.
  • year: The production year of the film. Years range from 1942 to 2020.
  • genre: The film's genre classification. There are 219 unique genre combinations; the most frequent is Stand-Up Comedy, at 8% of entries.
  • director: The credited director(s) of the movie, featuring 2605 unique director entries.
  • actors: A list of primary cast members; 3237 unique actor combinations are present.
  • country: The nation(s) responsible for the film’s production. The United States accounts for 33% of the entries, and India accounts for 20%.
  • rating: The user or critical rating applied to the movie, ranging from 0 to 9.1, with an average of 6.2.
  • enter_in_netflix: The date the film was made available on the Netflix platform.
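
As a minimal sketch of how these ten fields might be read and type-checked, the snippet below uses only the Python standard library. The header spellings are taken from the list above, but whether the real file uses exactly these names, stores Duration as a plain integer, or writes dates in ISO form are all assumptions. The sample rows are made up for illustration and are not taken from the dataset.

```python
import csv
import io

# Column names as listed above; the exact header spelling in the real file is an assumption.
EXPECTED_COLUMNS = [
    "Index", "movie_name", "Duration", "year", "genre",
    "director", "actors", "country", "rating", "enter_in_netflix",
]

# Made-up sample rows for illustration only; not taken from the dataset.
SAMPLE_CSV = """Index,movie_name,Duration,year,genre,director,actors,country,rating,enter_in_netflix
1,Example Film A,95,2015,"Comedy, Drama",Director One,"Actor A, Actor B",United States,6.5,2019-03-01
2,Example Film B,120,2001,Stand-Up Comedy,Director Two,Actor C,India,7.1,2020-01-15
"""


def load_movies(handle):
    """Read rows from an open CSV handle and convert the numeric fields."""
    reader = csv.DictReader(handle)
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for row in reader:
        # Assumes Duration holds plain integer minutes and rating a decimal number.
        row["Index"] = int(row["Index"])
        row["Duration"] = int(row["Duration"])
        row["year"] = int(row["year"])
        row["rating"] = float(row["rating"])
        rows.append(row)
    return rows


movies = load_movies(io.StringIO(SAMPLE_CSV))
print(len(movies), movies[0]["movie_name"], movies[0]["Duration"])
```

For the downloaded file, the same function could be called on a handle opened with `open("Netflix_movies.csv", newline="", encoding="utf-8")`; the UTF-8 encoding is likewise an assumption.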

Distribution

The data is provided as a CSV file named Netflix_movies.csv, 785.5 kB in size. It contains 3323 validated records, and all columns are fully populated with no missing values, so the file can be used directly in analytical environments. The collection is described as static, although the listing also indicates a quarterly update schedule.
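
The record count and the absence of missing values can be checked after download. The sketch below counts rows, unique titles, and empty cells per column using the standard library; the function name `summarise` is hypothetical, and the three sample rows are made up (one deliberately has an empty director cell) rather than drawn from the file.

```python
import csv
import io

# Made-up rows for illustration; the second row has an empty director cell.
SAMPLE_CSV = """movie_name,director,rating
Example Film A,Director One,6.5
Example Film B,,7.1
Example Film A,Director One,6.5
"""


def summarise(handle):
    """Return (row_count, unique_title_count, empty_cells_per_column)."""
    reader = csv.DictReader(handle)
    empty = {name: 0 for name in reader.fieldnames}
    titles = set()
    count = 0
    for row in reader:
        count += 1
        titles.add(row["movie_name"])
        for name in empty:
            # Treat missing or whitespace-only cells as empty.
            if not (row[name] or "").strip():
                empty[name] += 1
    return count, len(titles), empty


rows, unique_titles, empty_cells = summarise(io.StringIO(SAMPLE_CSV))
print(rows, unique_titles, empty_cells)
# 3 2 {'movie_name': 0, 'director': 1, 'rating': 0}
```

Run against the real file, the expectation from the description above would be 3323 rows and a zero count for every column.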

Usage

This data is ideally suited for applications in predictive modelling and analytical research, including:
  • Training and evaluating machine learning models for collaborative filtering and content-based recommendation systems.
  • Conducting deep exploratory analysis on film trends, production longevity, and genre popularity over time.
  • Developing visualisations to illustrate shifts in movie production country dominance or platform entry rates.
  • Studying the correlation between movie characteristics (e.g., duration, genre, actors) and subsequent audience ratings.
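
To illustrate the content-based case, here is a small sketch that ranks titles by Jaccard similarity of their genre sets, breaking ties by rating. It assumes the genre field separates multiple genres with commas, which the listing does not confirm. The helper names and the four-film catalogue are hypothetical. Collaborative filtering would additionally need user interaction data, which is not among the columns listed above.

```python
def genre_set(genre_field):
    """Split a genre string into a set of lower-cased genre names."""
    # Assumes genres are comma-separated within the field.
    return {g.strip().lower() for g in genre_field.split(",") if g.strip()}


def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(catalogue, title, top_n=2):
    """Return the top_n titles whose genres overlap most with the given title."""
    target = next(m for m in catalogue if m["movie_name"] == title)
    target_genres = genre_set(target["genre"])
    scored = [
        (jaccard(target_genres, genre_set(m["genre"])), m["rating"], m["movie_name"])
        for m in catalogue
        if m["movie_name"] != title
    ]
    # Highest similarity first, then highest rating, then title for stability.
    scored.sort(key=lambda s: (-s[0], -s[1], s[2]))
    return [name for _, _, name in scored[:top_n]]


# Made-up catalogue for illustration only.
CATALOGUE = [
    {"movie_name": "Film A", "genre": "Comedy, Drama", "rating": 6.5},
    {"movie_name": "Film B", "genre": "Comedy", "rating": 7.0},
    {"movie_name": "Film C", "genre": "Drama, Romance", "rating": 8.0},
    {"movie_name": "Film D", "genre": "Horror", "rating": 9.0},
]

print(recommend(CATALOGUE, "Film A"))  # ['Film B', 'Film C']
```

Genre overlap is only one possible signal; director, actors, and country could be folded into the same set-based comparison.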

Coverage

The dataset covers movie production years from 1942 to 2020. The records track content entry onto the Netflix platform from 1st January 2008 to 10th July 2020. Geographically, the data reflects production across numerous countries, with a strong emphasis on films originating from the United States and India.
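
The stated platform-entry window can be used to filter records and count entries per year. The sketch below assumes enter_in_netflix is stored as an ISO date (YYYY-MM-DD), which the listing does not specify; the parser would need adjusting for any other layout. The function names and sample rows are hypothetical.

```python
from datetime import date

# Platform-entry window as stated in the coverage description.
COVERAGE_START = date(2008, 1, 1)
COVERAGE_END = date(2020, 7, 10)


def parse_entry_date(text):
    """Parse an entry date; assumes ISO format (YYYY-MM-DD)."""
    return date.fromisoformat(text.strip())


def entries_per_year(rows):
    """Count rows per platform-entry year, skipping dates outside the window."""
    counts = {}
    for row in rows:
        entered = parse_entry_date(row["enter_in_netflix"])
        if COVERAGE_START <= entered <= COVERAGE_END:
            counts[entered.year] = counts.get(entered.year, 0) + 1
    return dict(sorted(counts.items()))


# Made-up rows for illustration only.
SAMPLE = [
    {"enter_in_netflix": "2008-01-01"},
    {"enter_in_netflix": "2019-03-01"},
    {"enter_in_netflix": "2019-11-20"},
    {"enter_in_netflix": "2020-07-10"},
    {"enter_in_netflix": "2021-02-02"},  # outside the stated window, so skipped
]

print(entries_per_year(SAMPLE))  # {2008: 1, 2019: 2, 2020: 1}
```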

License

CC0: Public Domain

Who Can Use It

  • Data Scientists: For feature engineering and algorithm development, particularly focusing on large-scale matrix factorisation or deep learning applied to media recommendations.
  • Machine Learning Engineers: To train, test, and benchmark various recommender system architectures.
  • Media and Film Analysts: For historical research into movie production trends, directorial output, and regional content distribution.

Dataset Name Suggestions

  • Netflix Movie Catalogue for ML
  • Global Streaming Film Metadata
  • Movie Recommendation System Input
  • Netflix Content Attributes

Attributes

Original Data Source: Netflix Movie Catalogue for ML

Listing Stats

  • Views: 5
  • Downloads: 1
  • Listed: 25/11/2025
  • Region: Global
  • UDQS quality score: 5 / 5
  • Version: 1.0
