Amazon Prime Titles and Cast Data
About
This dataset provides a detailed listing of Amazon Prime Video content, including both films and TV programmes, available for streaming in the United States as of May 2022. It offers an invaluable resource for analysing the streaming platform's offerings, developing content recommendation systems, and conducting network analysis on the cast and crew involved in various titles.
Columns
The dataset is split into two files: titles.csv and credits.csv.
Titles File Columns:
- id: The unique title identifier on JustWatch.
- title: The name of the film or TV programme.
- show_type: Indicates whether the content is a 'TV show' or a 'movie'.
- description: A short synopsis of the title.
- release_year: The year the title was originally released.
- age_certification: The age rating for the content.
- runtime: The duration of a film or an episode of a TV programme.
- genres: A list of categories the title belongs to.
- production_countries: A list of countries where the title was produced.
- seasons: The number of seasons available if the title is a TV show.
- imdb_id: The title identifier on IMDb.
- imdb_score: The user score for the title on IMDb.
- imdb_votes: The number of votes received on IMDb.
- tmdb_popularity: The popularity score on TMDB.
- tmdb_score: The user score for the title on TMDB.
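As a rough illustration of how these columns might be read, the sketch below parses a small inline sample with Python's standard library. The sample rows are invented, and two things are assumptions rather than facts from this listing: that list-valued columns such as genres are stored as Python-style list literals, and that empty cells mark missing values (for example, seasons for a film).

```python
import ast
import csv
import io

# Hypothetical two-row sample using a subset of the columns listed above.
# The real titles.csv may serialise values differently.
SAMPLE_TITLES = """id,title,show_type,release_year,runtime,genres,seasons,imdb_score
t1,Example Film,movie,1999,101,"['drama', 'comedy']",,7.1
t2,Example Series,TV show,2015,45,"['comedy']",3,
"""


def parse_list(cell):
    """Parse a cell assumed to hold a Python-style list literal."""
    return list(ast.literal_eval(cell)) if cell else []


def to_int(cell):
    """Return an int, or None for an empty cell; tolerates values like '3.0'."""
    return int(float(cell)) if cell else None


def to_float(cell):
    """Return a float, or None for an empty cell."""
    return float(cell) if cell else None


def load_titles(text):
    """Read CSV text into a list of dicts with typed fields."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        row["release_year"] = to_int(row["release_year"])
        row["runtime"] = to_int(row["runtime"])
        row["seasons"] = to_int(row["seasons"])
        row["imdb_score"] = to_float(row["imdb_score"])
        row["genres"] = parse_list(row["genres"])
        rows.append(row)
    return rows


titles = load_titles(SAMPLE_TITLES)
```

To try this against the real file, the inline string could be replaced with the contents of titles.csv, after checking how the list columns are actually encoded.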
Credits File Columns:
- person_ID: The unique identifier for the individual (actor or director) on JustWatch.
- id: The title identifier on JustWatch, linking to the titles.csv file.
- name: The name of the actor or director.
- character_name: The name of the character played by the actor (if applicable).
- role: Specifies the role of the individual, either 'ACTOR' or 'DIRECTOR'.
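Because the id column appears in both files, credits can be attached to their titles with a simple lookup. The following is a minimal sketch on invented sample rows; the column names are taken from the lists above, while the data and the helper names (read_rows, credits_by_title) are hypothetical.

```python
import csv
import io

# Hypothetical sample rows; only the column names come from the listing.
TITLES_CSV = """id,title,show_type
t1,Example Film,movie
t2,Example Series,TV show
"""

CREDITS_CSV = """person_ID,id,name,character_name,role
p1,t1,Actor One,Hero,ACTOR
p2,t1,Director One,,DIRECTOR
p1,t2,Actor One,Neighbour,ACTOR
"""


def read_rows(text):
    """Read CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def credits_by_title(titles, credits):
    """Group credited names by title and role, joining on the shared id column."""
    names = {t["id"]: t["title"] for t in titles}
    grouped = {}
    for credit in credits:
        title = names.get(credit["id"])
        if title is None:
            continue  # skip credits whose id has no matching title
        grouped.setdefault(title, {}).setdefault(credit["role"], []).append(credit["name"])
    return grouped


cast = credits_by_title(read_rows(TITLES_CSV), read_rows(CREDITS_CSV))
```

Here cast maps each title name to a dictionary of roles, each holding the list of people credited in that role.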
Distribution
The dataset comprises two CSV files. The titles.csv file contains over 9,000 unique titles. The credits.csv file features over 124,000 credit entries for actors and directors associated with these titles. The person_ID field in the credits file has more than 79,000 unique values across approximately 124,000 valid entries. The role field shows that actors account for 93% of the entries, while directors make up 7%. Approximately 13% of the character_name values are null.
Usage
This dataset is ideal for:
- Developing content-based recommender systems utilising genres and descriptions.
- Identifying the primary types of content available on the streaming platform.
- Conducting network analysis on the cast and director relationships within the titles.
- Performing exploratory data analysis to uncover interesting insights into Amazon Prime Video's catalogue.
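Two of the uses above can be sketched in a few lines. The example below ranks titles by Jaccard similarity of their genre sets, as a very simple stand-in for a content-based recommender, and counts how often each director and actor are credited on the same title, as a starting point for network analysis. The records are invented, and Jaccard similarity is an illustrative choice here, not a method prescribed by the dataset.

```python
from collections import Counter
from itertools import product

# Hypothetical in-memory records shaped like the columns described in this listing.
TITLE_GENRES = {
    "t1": {"drama", "comedy"},
    "t2": {"comedy"},
    "t3": {"documentary"},
}
CREDITS = [
    {"person_ID": "p1", "id": "t1", "role": "ACTOR"},
    {"person_ID": "p2", "id": "t1", "role": "DIRECTOR"},
    {"person_ID": "p1", "id": "t2", "role": "ACTOR"},
    {"person_ID": "p2", "id": "t2", "role": "DIRECTOR"},
    {"person_ID": "p3", "id": "t3", "role": "ACTOR"},
]


def jaccard(a, b):
    """Size of the intersection divided by size of the union of two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_titles(title_id, genres):
    """Rank all other titles by genre overlap with the given title, best first."""
    target = genres[title_id]
    scores = [(other, jaccard(target, g)) for other, g in genres.items() if other != title_id]
    return sorted(scores, key=lambda pair: -pair[1])


def director_actor_edges(credits):
    """Count, for each (director, actor) pair, the titles on which both are credited."""
    by_title = {}
    for credit in credits:
        by_title.setdefault(credit["id"], {}).setdefault(credit["role"], []).append(credit["person_ID"])
    edges = Counter()
    for roles in by_title.values():
        for director, actor in product(roles.get("DIRECTOR", []), roles.get("ACTOR", [])):
            edges[(director, actor)] += 1
    return edges


ranking = similar_titles("t1", TITLE_GENRES)
edges = director_actor_edges(CREDITS)
```

The weighted edge counts could be passed to a graph library for further analysis, and the genre similarity could be combined with text features from the description column for a richer recommender.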
Coverage
The data was acquired in May 2022 and lists content available specifically within the United States region.
License
CC0: Public Domain
Who Can Use It
This dataset is suitable for:
- Data Scientists and Machine Learning Engineers for building recommendation engines and analysing content patterns.
- Data Analysts keen on exploring trends, popular titles, and content demographics on streaming platforms.
- Researchers studying media consumption, entertainment industry trends, and network dynamics in film and television production.
- Developers looking for real-world data to test and refine data processing and visualisation tools.
Dataset Name Suggestions
- Amazon Prime Video Catalogue (May 2022)
- US Prime Video Films & TV Data
- Streaming Content Analysis Dataset - Amazon Prime
- Amazon Prime Titles and Cast Data
Attributes
Original Data Source: Amazon Prime Titles and Cast Data