Netflix Shows, Movies, and Cast Data
Natural Language Processing
Free
About
This dataset lists the Netflix TV shows and movies available in the United States as of July 2022. It was created to catalogue all content available on the streaming platform and to support data analysis. It is useful for understanding the Netflix content library, developing recommender systems, and conducting exploratory data analysis on streaming trends and cast networks. It covers content types, release years, genres, and cast details, making it a resource for researchers, data scientists, and developers interested in the entertainment industry.
Columns
The dataset is split into two files: titles.csv and credits.csv.
titles.csv (over 5,000 unique titles, 15 columns):
- id: The unique identifier for the title on JustWatch.
- title: The name of the movie or TV show.
- show_type: Indicates whether the entry is a 'TV show' or a 'movie'.
- description: A brief summary or synopsis of the title.
- release_year: The year the title was originally released.
- age_certification: The age rating or certification for the content.
- runtime: The length of the movie in minutes or the average episode length for TV shows.
- genres: A list of genres associated with the title.
- production_countries: A list of countries where the title was produced.
- seasons: The number of seasons available if the title is a TV show.
- imdb_id: The unique identifier for the title on IMDb.
- imdb_score: The score given to the title on IMDb.
- imdb_votes: The number of votes received on IMDb.
- tmdb_popularity: The popularity metric from TMDB.
- tmdb_score: The score given to the title on TMDB.
credits.csv (over 77,000 credits, 5 columns):
- person_id: The unique identifier for the person on JustWatch.
- id: The title ID, linking back to the titles.csv file.
- name: The real name of the actor or director.
- character_name: The name of the character played by the actor.
- role: Indicates whether the person's role is 'ACTOR' or 'DIRECTOR'.
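Because credits.csv refers back to titles.csv through the shared id column, the two files can be joined on that key. The following is a minimal sketch using only the Python standard library. The sample rows, the in-memory CSV text, and the helper names read_rows and credits_by_title are hypothetical and stand in for the real files; the column names are taken from the listing above and may differ in the downloaded files.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows standing in for the real files; only a few of the
# documented columns are included, and the values are invented.
TITLES_CSV = """id,title,show_type,release_year
t1,Example Movie,movie,2019
t2,Example Series,TV show,2021
"""

CREDITS_CSV = """person_id,id,name,character_name,role
p1,t1,Alex Doe,Sam,ACTOR
p2,t1,Jo Roe,,DIRECTOR
p1,t2,Alex Doe,Lee,ACTOR
"""


def read_rows(source):
    """Parse CSV text from a file-like object into a list of dicts."""
    return list(csv.DictReader(source))


def credits_by_title(titles, credits):
    """Attach each credit row to its title, joining on the shared `id` column."""
    grouped = defaultdict(list)
    for credit in credits:
        grouped[credit["id"]].append(credit)
    return {
        t["id"]: {"title": t["title"], "credits": grouped.get(t["id"], [])}
        for t in titles
    }


titles = read_rows(io.StringIO(TITLES_CSV))
credits = read_rows(io.StringIO(CREDITS_CSV))
joined = credits_by_title(titles, credits)

for title_id, entry in joined.items():
    people = ", ".join(f"{c['name']} ({c['role']})" for c in entry["credits"])
    print(f"{title_id}: {entry['title']} -> {people}")
```

To run this against the downloaded files, the io.StringIO objects could be replaced with open("titles.csv", newline="", encoding="utf-8") and the equivalent call for credits.csv, assuming the files are UTF-8 encoded.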
Distribution
The dataset consists of two CSV files: titles.csv and credits.csv. The titles.csv file contains over 5,000 unique titles, while the credits.csv file includes over 77,000 individual credits for actors and directors. The credits.csv file has a size of approximately 3.82 MB. The data can be easily imported into various analytical tools due to its standard CSV format.
Usage
This dataset is ideally suited for a variety of applications, including:
- Developing content-based recommender systems using genres and descriptions.
- Identifying and analysing the main content types and trends available on Netflix.
- Performing network analysis to understand connections within the cast and crew of titles.
- Conducting exploratory data analysis to uncover interesting insights into the Netflix catalogue.
- Supporting academic research on streaming service content strategies.
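As an illustration of the first application above, a content-based recommender can rank titles by how much their genre lists overlap. The sketch below uses Jaccard similarity, which is one simple choice among many and not a method prescribed by the dataset. It assumes the genres column is stored as a Python-style list literal such as "['drama', 'crime']"; that storage format, the sample rows, and the function names parse_genres, jaccard, and recommend are all assumptions made for the example.

```python
import ast


def parse_genres(raw):
    """Turn a cell such as "['drama', 'crime']" into a set of genre strings.

    Assumes the cell holds a Python-style list literal; an empty cell gives
    an empty set.
    """
    return set(ast.literal_eval(raw)) if raw else set()


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|, or 0.0 if both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(titles, target_id, k=2):
    """Return the k titles whose genre sets are most similar to the target's."""
    genres = {t["id"]: parse_genres(t["genres"]) for t in titles}
    names = {t["id"]: t["title"] for t in titles}
    target = genres[target_id]
    scored = [
        (jaccard(target, g), names[i]) for i, g in genres.items() if i != target_id
    ]
    # Highest similarity first; ties are broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:k]


# Hypothetical rows shaped like titles.csv records (values invented).
sample = [
    {"id": "t1", "title": "Crime Drama A", "genres": "['drama', 'crime']"},
    {"id": "t2", "title": "Crime Drama B", "genres": "['drama', 'crime', 'thriller']"},
    {"id": "t3", "title": "Family Comedy", "genres": "['comedy', 'family']"},
    {"id": "t4", "title": "Pure Drama", "genres": "['drama']"},
]

top = recommend(sample, "t1")
for score, name in top:
    print(f"{name}: {score:.2f}")
```

Genre overlap alone is a coarse signal. A fuller system would likely also use the description text, for example through TF-IDF vectors, but that is beyond this sketch.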
Coverage
The dataset's geographic scope is limited to content available in the United States. The data was acquired in July 2022, making it a snapshot of the Netflix catalogue at that specific time. No specific demographic coverage notes are provided.
License
CC0: Public Domain
Who Can Use It
This dataset is particularly useful for:
- Data scientists and analysts performing exploratory data analysis or building machine learning models.
- Developers creating new recommender systems or content discovery applications.
- Researchers studying media consumption, entertainment trends, and content production.
- Students learning about data analysis, database management, and data visualisation.
Dataset Name Suggestions
- Netflix Content Catalogue (July 2022)
- Netflix Shows, Movies, and Cast Data
- US Netflix Streaming Data
- Netflix TV & Film Analytics Dataset
Attributes
Original Data Source: Netflix Shows, Movies, and Cast Data