Television Episode NLP Dataset
Entertainment & Media Consumption
Free
About
This dataset provides metadata and plot summaries for every episode across the twelve seasons of The Big Bang Theory. Each record pairs an episode's credits, original air date, and US viewership with a textual plot summary, which makes the dataset a useful resource for analysing episode content and viewership trends.
Columns
- Season: The season number of the episode.
- No.overall: The overall episode number in the series.
- No. inseason: The episode number within its specific season.
- Title: The title of the episode.
- Directed by: The director or directors of the episode. For example, Mark Cendrowski directed 87% of episodes, while Anthony Rich directed 4%.
- Written by: The writer or writers of the episode. Some entries list separate story and teleplay credits, for example Chuck Lorre, Eric Kaplan, and Jim Reynolds for story, with Steven Molaro, Steve Holland, and Maria Ferrari for teleplay. Each such combination accounts for only a small percentage of episodes.
- Original air date: The date the episode was first aired.
- Prod.code: The unique production code for the episode.
- U.S. viewers(millions): The viewership numbers in millions, specific to the United States. Viewership ranges from approximately 7.34 million to 20.4 million.
- plot: A textual summary of the episode's plot.
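The columns above can be read with Python's standard `csv` module. The following is a minimal sketch, assuming the CSV header uses exactly the column names listed here and that the viewership field is a plain decimal number; `parse_episodes` is a hypothetical helper name, not part of the dataset.

```python
import csv

# Assumption: the header spelling matches the column list above exactly.
SEASON_COL = "Season"
VIEWERS_COL = "U.S. viewers(millions)"


def parse_episodes(lines):
    """Parse CSV lines (an open text file or any iterable of strings) into dicts.

    Season is converted to int and viewership to float. Viewership values
    that cannot be parsed as a number are stored as None instead of
    raising an error.
    """
    episodes = []
    for row in csv.DictReader(lines):
        row[SEASON_COL] = int(row[SEASON_COL])
        try:
            row[VIEWERS_COL] = float(row[VIEWERS_COL])
        except (TypeError, ValueError):
            row[VIEWERS_COL] = None
        episodes.append(row)
    return episodes
```

To read from disk, pass a file opened with `open(path, newline="", encoding="utf-8")`, where `path` is wherever the downloaded CSV was saved.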
Distribution
The dataset is provided as a CSV file. It contains metadata and plot summaries for a total of 279 unique episodes. The number of episodes varies by season, with most seasons having 24 episodes. The original air dates span from 24th September 2007 to 16th May 2019, covering all 12 seasons of the show.
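The episode total and the per-season distribution described here can be checked after loading. This is a small sketch, assuming each row is a dict with a `Season` key (as `csv.DictReader` would produce); the function names are illustrative only.

```python
from collections import Counter


def episodes_per_season(rows):
    """Count how many rows belong to each season."""
    return Counter(int(row["Season"]) for row in rows)


def check_total(rows, expected_total=279):
    """Return (matches, counts): whether the row count equals the expected total."""
    counts = episodes_per_season(rows)
    return sum(counts.values()) == expected_total, counts
```

A mismatch against 279 would suggest duplicated or missing rows in the copy of the file being used.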
Usage
This dataset is ideal for various applications and use cases, including:
- Analysing similarity in plot styles of different episodes.
- Generating plots or narrative structures using the plots of previous seasons.
- Conducting analysis of viewership figures against episode content.
- Investigating plot style similarities under the direction of different directors.
- Natural Language Processing (NLP) tasks on episode summaries.
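As one illustration of the plot-similarity use case, the sketch below compares plot summaries with bag-of-words cosine similarity. This is a deliberately simple baseline chosen for the example, not a method prescribed by the dataset; the tokenizer and function names are assumptions, and a TF-IDF or embedding-based approach would likely give better results.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts."""
    a, b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty summary is treated as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(plots, index, top_n=3):
    """Return (score, position) pairs for the plots closest to plots[index]."""
    scores = [
        (cosine_similarity(plots[index], other), position)
        for position, other in enumerate(plots)
        if position != index
    ]
    return sorted(scores, reverse=True)[:top_n]
```

Here `plots` would be the list of values from the `plot` column, in file order, so the returned positions map back to rows of the CSV.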
Coverage
The dataset covers all episodes across 12 seasons of The Big Bang Theory.
- Geographic Scope: While the dataset is globally available, specific viewership figures are provided for the United States.
- Time Range: The original air dates range from 24th September 2007 to 16th May 2019.
- Demographic Scope: No specific demographic details are provided beyond the US viewership.
License
CC0
Who Can Use It
This dataset is suitable for data scientists, researchers, developers, and enthusiasts interested in television series analysis, media consumption trends, and natural language processing. Specific users who might find this valuable include:
- Data analysts looking to understand viewership patterns.
- Researchers studying narrative structures in television.
- Machine learning engineers developing models for text generation or similarity.
- Academics exploring the impact of directorial or writing contributions on content.
Dataset Name Suggestions
- The Big Bang Theory Episode Metadata
- TV Series Plot Summaries and Viewership
- Big Bang Theory Season Data
- Television Episode NLP Dataset
- BBT Episode Analytics
Attributes
Original Data Source: The Big Bang Theory - Plots (All Seasons)