Global Netflix Catalogue Dataset
Natural Language Processing
Free
About
This dataset provides a detailed listing of movies and TV shows available on Netflix. It offers valuable insights into one of the most popular media and video streaming platforms globally, which reported over 282 million subscribers as of mid-2024. The tabular data includes key details for over 8,000 titles, such as cast, directors, content ratings, release year, and duration. It serves as an excellent resource for analysing Netflix's content catalogue.
Columns
The dataset contains 12 columns, each providing specific information about the Netflix titles:
- show_id: A unique identifier for each movie or TV show.
- type: Specifies whether the title is a "Movie" or "TV Show".
- title: The name of the Netflix title.
- director: The director(s) of the title.
- cast: The main actors involved in the title.
- country: The country where the title was produced.
- date_added: The date when the title was added to the Netflix platform.
- release_year: The year the title was originally released.
- rating: The content rating (e.g., "PG-13", "TV-MA").
- duration: The duration of a movie (in minutes) or the number of seasons for a TV show.
- listed_in: Categories or genres under which the title is listed (e.g., "Documentaries", "TV Dramas").
- description: A brief summary of the title.
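The column list above can be checked at load time. The following is a minimal sketch using only the Python standard library. The helper names read_titles and parse_duration are hypothetical, and the assumed text format of the duration field ("90 min", "1 Season", "2 Seasons") is an assumption not stated in this listing, so verify it against the file before relying on it.

```python
import csv
import re

# The 12 columns described in this listing.
EXPECTED_COLUMNS = [
    "show_id", "type", "title", "director", "cast", "country",
    "date_added", "release_year", "rating", "duration",
    "listed_in", "description",
]

# Assumed duration format: "<number> min" or "<number> Season(s)".
_DURATION = re.compile(r"^\s*(\d+)\s+(min|Seasons?)\s*$")


def read_titles(lines):
    """Parse CSV text (any iterable of lines) into a list of row dicts.

    Raises ValueError if any of the expected columns is absent.
    """
    reader = csv.DictReader(lines)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)


def parse_duration(value):
    """Return (amount, unit) with unit 'min' or 'season', or None if unparseable."""
    match = _DURATION.match(value or "")
    if match is None:
        return None
    amount, unit = match.groups()
    return int(amount), "min" if unit == "min" else "season"


# Possible usage with the file named in this listing:
# with open("netflix_titles.csv", newline="", encoding="utf-8") as f:
#     rows = read_titles(f)
```

Returning None for values that do not match keeps unexpected entries visible instead of silently coercing them.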
Distribution
The dataset is provided in a tabular format, typically a CSV file, named netflix_titles.csv. It has a file size of 3.4 MB and consists of 8,807 unique records (rows). All 12 columns contain valid data, though some columns, such as director, cast, and country, have a small percentage of missing values. The type column indicates that approximately 70% of the titles are Movies and 30% are TV Shows.
Usage
This dataset is ideal for various analytical applications, including:
- Data Visualisation: Creating visual representations of content trends, distribution over time, and genre popularity.
- Classification: Developing models to classify titles based on their attributes.
- Exploratory Data Analysis: Gaining insights into the characteristics of Netflix's content library, director and cast prominence, or country of origin trends.
- Content Strategy: Analysing content types, ratings, and release patterns to inform content acquisition or production strategies.
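As a starting point for the exploratory analysis use case, the sketch below measures how much of a column is missing and how values in a column are distributed, for example the Movie versus TV Show split in type. It assumes rows are dictionaries keyed by column name (as produced by csv.DictReader) and that missing values appear as empty strings. The function names are hypothetical.

```python
from collections import Counter


def missing_share(rows, column):
    """Fraction of rows whose value in `column` is empty or absent."""
    rows = list(rows)
    if not rows:
        return 0.0
    empty = sum(1 for row in rows if not (row.get(column) or "").strip())
    return empty / len(rows)


def value_shares(rows, column):
    """Map each non-empty value in `column` to its fraction of all rows,
    ordered from most to least frequent."""
    rows = list(rows)
    if not rows:
        return {}
    counts = Counter((row.get(column) or "").strip() for row in rows)
    counts.pop("", None)  # empty cells are reported by missing_share instead
    return {value: n / len(rows) for value, n in counts.most_common()}
```

Calling value_shares(rows, "type") or value_shares(rows, "rating") on the loaded rows gives shares that can be compared with the percentages quoted in this listing.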
Coverage
The dataset spans a wide time range, with titles originally released from 1925 up to 2021. Content was added to Netflix between 1 January 2008 and 25 September 2021. Geographically, titles come from many countries, with the United States the most frequent (32%), followed by India (11%). Content ratings are varied, with "TV-MA" (36%) and "TV-14" (25%) among them, which gives an indication of target audience and content suitability.
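The date ranges quoted above can be recomputed from the release_year and date_added columns. This is a minimal sketch under two assumptions that this listing does not confirm: date_added is stored as text like "September 25, 2021", and rows are dictionaries keyed by column name. The function names are hypothetical, and values that fail to parse are skipped.

```python
from datetime import datetime


def parse_date_added(value, fmt="%B %d, %Y"):
    """Parse a date_added string into a date, or return None if it does not match."""
    try:
        return datetime.strptime((value or "").strip(), fmt).date()
    except ValueError:
        return None


def coverage(rows):
    """Return the (earliest, latest) release year and date added across rows."""
    rows = list(rows)
    years = [
        int(row["release_year"])
        for row in rows
        if (row.get("release_year") or "").strip().isdigit()
    ]
    dates = [
        parsed
        for parsed in (parse_date_added(row.get("date_added")) for row in rows)
        if parsed is not None
    ]
    return {
        "release_year": (min(years), max(years)) if years else None,
        "date_added": (min(dates), max(dates)) if dates else None,
    }
```

If the file uses a different date format, pass it to parse_date_added through the fmt parameter.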
License
CC0: Public Domain
Who Can Use It
This dataset is suitable for:
- Data Analysts and Scientists: For performing statistical analysis, building predictive models, and extracting meaningful patterns from streaming data.
- Researchers: Studying media trends, global content production, and consumption habits on streaming platforms.
- Content Creators and Marketers: To understand popular genres, successful directors, and the demographics associated with different content ratings.
- Students and Educators: As a practical resource for learning data cleaning, analysis, and visualisation techniques.
Dataset Name Suggestions
- Netflix Content Listings
- Global Netflix Catalogue
- Netflix Movies & TV Series Data
- Netflix Originals & Licensed Titles
Attributes
Original Data Source: Global Netflix Catalogue Dataset