Opendatabay APP

IMDb Netflix Movie & TV Show Data

Data Science and Analytics

Tags and Keywords

Movies

Netflix

IMDb

Genre

Recommendation

Free

About

This dataset is a collection of IMDb's top Netflix movies and TV shows, intended for data analysis and predictive modelling. It is well suited to feature extraction, exploratory data analysis (EDA), and building recommendation systems. Its main purpose is to support models that predict the genre of a movie or TV show from its other attributes. The data was scraped from the IMDb website as part of a beginner's project in web scraping with Python.

Columns

  • MOVIES: Name of the movie or TV show. 9999 valid entries, 6817 unique values.
  • YEAR: Year or year range of telecast, stored as text such as (2020– ) and (2021– ). 9355 valid entries, 438 unique values.
  • GENRE: Genre of the title, the most valuable field for recommendation systems. 9919 valid entries, 510 unique values; Comedy is the most common.
  • RATING: Audience rating for the movie or TV show, ranging from 1.1 to 9.9 with a mean of 6.92. 8179 valid entries.
  • ONE-LINE: Short one-line description of the movie or TV show. 9999 valid entries, 8688 unique values.
  • STARS: Main cast of the title. 9999 valid entries, 7877 unique values.
  • VOTES: Number of audience votes, useful for gauging the impact of the content. 8179 valid entries, with a mean of 15.1k votes.
  • RunTime: Running time of the content. 7041 valid entries, with a mean of 68.7 (the unit is not stated; presumably minutes).
  • Gross: Total amount earned worldwide. Only 460 valid entries; the large majority of values are missing.
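Several of these columns are stored as text and need light parsing before analysis. The following is a minimal sketch in Python, not the dataset author's method. It assumes that YEAR values look like the (2020– ) examples above, that VOTES may contain thousands separators, and that a GENRE value may hold several comma-separated labels; the helper names are hypothetical and the real file should be inspected before relying on any of this.

```python
import re


def parse_year(raw):
    """Return (start, end) from text such as '(2020– )'.

    end is None for an open-ended range; None is returned if no year is found.
    """
    years = [int(y) for y in re.findall(r"\d{4}", raw or "")]
    if not years:
        return None
    if len(years) > 1:
        return (years[0], years[1])
    # A dash with no second year is treated as an open-ended range (assumption).
    open_ended = "–" in raw or "-" in raw
    return (years[0], None if open_ended else years[0])


def parse_votes(raw):
    """Convert text such as '15,100' to an int; None if missing or malformed."""
    digits = (raw or "").replace(",", "").strip()
    return int(digits) if digits.isdigit() else None


def split_genres(raw):
    """Split a comma-separated genre string into a list of clean labels."""
    return [g.strip() for g in (raw or "").split(",") if g.strip()]
```

Returning None for missing values keeps the gaps noted in the column list (for example in VOTES and Gross) visible instead of silently turning them into zeros.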

Distribution

The dataset is provided as a single CSV file, movies.csv, with a size of 3.11 MB. It contains the 9 columns described in the Columns section and 9999 rows, counted from the fully populated columns.
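The valid and unique counts quoted for each column can be checked after download. The sketch below uses only Python's standard csv module; the function names are hypothetical, and the two-row sample is made up purely so the example runs without the real file. It assumes that empty cells represent missing values.

```python
import csv
import io


def load_rows(handle):
    """Read a CSV file object into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


def column_stats(rows):
    """Return {column: (valid_count, unique_count)}, ignoring empty cells."""
    stats = {}
    for row in rows:
        for name, value in row.items():
            if name is None:  # extra cells beyond the header row
                continue
            entry = stats.setdefault(name, {"valid": 0, "unique": set()})
            value = (value or "").strip()
            if value:
                entry["valid"] += 1
                entry["unique"].add(value)
    return {name: (s["valid"], len(s["unique"])) for name, s in stats.items()}


# Made-up sample for illustration only; these are not rows from the dataset.
SAMPLE = (
    "MOVIES,YEAR,GENRE,RATING,Gross\n"
    "Example Show A,(2020– ),Comedy,7.1,\n"
    "Example Show B,(2021– ),Comedy,,\n"
)
stats = column_stats(load_rows(io.StringIO(SAMPLE)))
print(stats["GENRE"])  # (2, 1): two valid entries, one unique value
```

For the real file, the same functions could be called on `open("movies.csv", newline="", encoding="utf-8")`, assuming the file is UTF-8 encoded.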

Usage

This dataset is ideal for several applications, including:
  • Feature Extraction: Deriving meaningful features from raw data for machine learning models.
  • Exploratory Data Analysis (EDA): Gaining insights into data patterns and distributions.
  • Recommendation Systems: Building models to suggest movies or TV shows to users.
  • Genre Prediction Models: Developing predictive models to determine the genre of unclassified content.
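As one illustration of the recommendation use case, titles can be ranked by how much their genre labels overlap. This is a minimal content-based sketch using Jaccard similarity, chosen here for simplicity and not taken from the dataset's author. The catalog entries are placeholders, and it assumes each title's GENRE value has already been split into a list of labels.

```python
def jaccard(a, b):
    """Jaccard similarity of two label collections: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(title, catalog, k=3):
    """Return up to k other titles with the most similar genre sets."""
    target = catalog[title]
    scored = [
        (jaccard(target, genres), other)
        for other, genres in catalog.items()
        if other != title
    ]
    # Highest similarity first; ties are broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for score, other in scored[:k] if score > 0]


# Placeholder catalog for illustration only; not rows from the dataset.
CATALOG = {
    "Title A": ["Comedy", "Drama"],
    "Title B": ["Comedy"],
    "Title C": ["Action", "Thriller"],
    "Title D": ["Comedy", "Drama", "Romance"],
}
print(recommend("Title A", CATALOG))  # ['Title D', 'Title B']
```

Genre overlap alone is a coarse signal. RATING and VOTES could be added as tie-breakers, keeping in mind that both are missing for part of the dataset.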

Coverage

The dataset covers top movies and TV shows available on Netflix, with data scraped from IMDb. Telecast years span 438 unique values, including open-ended ranges such as (2020– ) and (2021– ) alongside a wide variety of other years. The scope is global, reflecting the international reach of Netflix and IMDb.

License

CC0 Public Domain

Who Can Use It

This dataset is suitable for:
  • Data Scientists and Analysts: For conducting deep dives into movie and TV show data.
  • Machine Learning Engineers: For developing and training recommendation and genre prediction models.
  • Beginners in Data Science: To practice web scraping, feature engineering, and basic predictive modelling.
  • Researchers: Studying trends in entertainment content and audience reception.

Dataset Name Suggestions

  • IMDb Netflix Movie & TV Show Data
  • Netflix IMDb Entertainment Dataset
  • Movie & TV Show Analysis Data
  • IMDb Netflix Content Insights
  • Global Movie TV Dataset

Attributes

Original Data Source: IMDb Netflix Movie & TV Show Data

Listing Stats

VIEWS

1

DOWNLOADS

0

LISTED

20/07/2025

REGION

GLOBAL

UNIVERSAL DATA QUALITY SCORE (UDQS)

5 / 5

VERSION

1.0
