Six Seasons and a Dataset
About
This collection features detailed episode metadata and viewership metrics for the acclaimed television series Community, created by Dan Harmon. The data covers all 110 episodes across six seasons, from the premiere on September 17, 2009, through June 2, 2015. The curator assembled the data after binge-watching the show, with the aim of sharing information about it. The source material was gathered from IMDb TV episode datasets and from information scraped from Wikipedia. The series' opening theme song is "At Least It Was Here" by The 88.
Columns
- season: The season number for the corresponding episode, ranging from 1 to 6.
- episode_num_in_season: The sequential number of the episode within its specific season (maximum value is 25).
- episode_num_overall: The overall episode number counting across the entire series (maximum value is 110).
- title: The unique name of the episode.
- directed_by: The name of the individual who directed the episode.
- written_by: The name(s) of the writer(s) credited for the episode.
- original_air_date: The date the episode first aired, stored as a DateTime value.
- prod_code: The unique production code assigned to the episode.
- us_viewers: The estimated number of US viewers (in millions) recorded on the original air date.
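The column list above can be turned into a quick schema check when loading the file. The following is a minimal sketch using pandas, not the curator's own loading code. The function name `load_episodes` and the sample rows are assumptions made for illustration (the rows are synthetic placeholders, not real episode data), and the date format of the actual file may differ from the ISO dates used here.

```python
import io

import pandas as pd

# Column names as listed in the Columns section.
EXPECTED_COLUMNS = [
    "season", "episode_num_in_season", "episode_num_overall", "title",
    "directed_by", "written_by", "original_air_date", "prod_code",
    "us_viewers",
]


def load_episodes(source):
    """Read the episode table from a path or file-like object.

    Hypothetical helper: parses original_air_date as a date and fails
    early if any documented column is absent.
    """
    df = pd.read_csv(source, parse_dates=["original_air_date"])
    missing = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return df[EXPECTED_COLUMNS]


# Synthetic placeholder rows, NOT taken from the real dataset.
SAMPLE_CSV = """\
season,episode_num_in_season,episode_num_overall,title,directed_by,written_by,original_air_date,prod_code,us_viewers
1,1,1,Episode A,Director X,Writer P,2009-09-17,101,5.0
1,2,2,Episode B,Director Y,Writer Q,2009-09-24,102,
2,1,26,Episode C,Director X,Writer P,2010-09-23,201,4.0
"""

sample = load_episodes(io.StringIO(SAMPLE_CSV))
print(sample.dtypes)
```

With the real file, the call would presumably be `load_episodes("community_episodes.csv")`.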
Distribution
The data is provided as a tabular CSV file named community_episodes.csv. It consists of 9 columns and 110 records. While columns such as season and title are fully populated, the us_viewers column has 13 missing values, so about 88% of the records (97 of 110) have valid viewership data. The mean of the recorded US viewer figures is approximately 3.93 million. This collection is structured as a Beginner-level dataset.
Usage
The dataset is rated highly usable (10.00) for analytical tasks related to television metrics. The table can be joined to external sources on fields such as title and original_air_date, for example to compare viewership with IMDb ratings. Potential applications include analysing how US viewer numbers fluctuate across seasons and quantifying the contributions of key crew members; for example, the most frequent director accounts for 22% of all episodes.
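The analyses mentioned above can be sketched with pandas as follows. This is an illustrative outline under stated assumptions, not a reference implementation: the function names are hypothetical, the `ratings` table and its `imdb_rating` column stand in for an external source that is not part of this dataset, and every row shown is a synthetic placeholder rather than real episode data.

```python
import pandas as pd

# Synthetic placeholder rows, NOT real episode data.
episodes = pd.DataFrame({
    "season": [1, 1, 2, 2],
    "title": ["Episode A", "Episode B", "Episode C", "Episode D"],
    "directed_by": ["Director X", "Director Y", "Director X", "Director X"],
    "original_air_date": pd.to_datetime(
        ["2009-09-17", "2009-09-24", "2010-09-23", "2010-09-30"]),
    "us_viewers": [4.0, None, 2.0, 6.0],
})


def viewer_coverage(df):
    """Return (missing count, share of rows that have a us_viewers value)."""
    missing = int(df["us_viewers"].isna().sum())
    return missing, 1 - missing / len(df)


def viewers_by_season(df):
    """Mean us_viewers per season; missing values are skipped by mean()."""
    return df.groupby("season")["us_viewers"].mean()


def top_director_share(df):
    """Return the most frequent director and their share of all episodes."""
    shares = df["directed_by"].value_counts(normalize=True)
    return shares.index[0], float(shares.iloc[0])


# Hypothetical external table keyed on title and air date.
ratings = pd.DataFrame({
    "title": ["Episode A", "Episode C"],
    "original_air_date": pd.to_datetime(["2009-09-17", "2010-09-23"]),
    "imdb_rating": [8.0, 9.0],
})

# A left join keeps every episode, including those without a rating.
merged = episodes.merge(ratings, on=["title", "original_air_date"], how="left")

print(viewer_coverage(episodes))
print(viewers_by_season(episodes))
print(top_director_share(episodes))
print(merged[["title", "imdb_rating"]])
```

On the real file, `viewer_coverage` would be expected to report 13 missing values and a valid share of roughly 0.88, in line with the Distribution section. Joining on title alone may be fragile if an external source spells titles differently, which is why the sketch uses the air date as a second key.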
Coverage
The data covers every episode across the six seasons of the series, spanning the time range from September 17, 2009, to June 2, 2015. The geographic scope focuses on metrics relevant to the United States, specifically the number of US viewers who tuned in on the original air date.
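The coverage claims above are easy to verify once the file is loaded. The sketch below is one possible check, assuming original_air_date has already been parsed as a date; `check_coverage` is a hypothetical helper and the demo rows are synthetic placeholders.

```python
import pandas as pd


def check_coverage(df):
    """Summarise season, date, and episode-number coverage of the table."""
    n = len(df)
    return {
        "seasons": sorted(df["season"].unique().tolist()),
        "first_air_date": df["original_air_date"].min(),
        "last_air_date": df["original_air_date"].max(),
        "episodes": n,
        # True when episode_num_overall runs 1..n with no gaps or repeats.
        "numbering_contiguous":
            sorted(df["episode_num_overall"].tolist()) == list(range(1, n + 1)),
    }


# Synthetic placeholder rows, NOT real episode data.
demo = pd.DataFrame({
    "season": [1, 1, 2],
    "episode_num_overall": [1, 2, 3],
    "original_air_date": pd.to_datetime(
        ["2009-09-17", "2009-09-24", "2010-09-23"]),
})

report = check_coverage(demo)
print(report)
```

For the full dataset, the description implies seasons 1 through 6, 110 episodes, and air dates running from 2009-09-17 to 2015-06-02.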
License
CC0: Public Domain
Who Can Use It
This dataset is suitable for academic researchers studying entertainment consumption and television production models. It is ideal for aspiring data scientists seeking a robust, beginner-friendly tabular dataset. It is also valuable for television aficionados who wish to perform quantitative analysis on the series created by Dan Harmon.
Dataset Name Suggestions
- Community Episodes Data
- Six Seasons and a Dataset
- TV Series Episode Metadata: Community
Attributes
Original Data Source: Six Seasons and a Dataset
