Podcast News Conversation Data
Entertainment & Media Consumption
Free
About
This dataset contains all publicly available transcripts from the Today Explained podcast by VOX. The original transcripts were converted into a CSV file, with full credit attributed to VOX. It is well-suited for analysing conversations and news sentiment.
Columns
- date: The date of the episode.
- episodeName: The name of the specific podcast episode.
- speakerName: The name of the individual speaker, as recorded in the transcript.
- text: The spoken content from the speaker.
- speech_nr: The unique number assigned to each speech segment within an episode.
- episode_nr: The unique number assigned to each podcast episode.
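As a minimal sketch of how the columns above might be read, the following uses Python's standard csv module. The function name load_transcripts, the integer conversion of speech_nr and episode_nr, and the sample rows (including their date format) are assumptions made for illustration; they are not taken from the dataset itself.

```python
import csv
import io

# Column names as listed in the Columns section.
EXPECTED_COLUMNS = ["date", "episodeName", "speakerName", "text", "speech_nr", "episode_nr"]


def load_transcripts(handle):
    """Read transcript rows from an open CSV file handle into a list of dicts."""
    reader = csv.DictReader(handle)
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"CSV is missing expected columns: {missing}")
    rows = []
    for row in reader:
        # Assumption: both numbering columns hold whole numbers,
        # possibly written as "1" or "1.0".
        row["speech_nr"] = int(float(row["speech_nr"]))
        row["episode_nr"] = int(float(row["episode_nr"]))
        rows.append(row)
    return rows


# Invented placeholder rows, only to show the expected shape of the file.
SAMPLE_CSV = """date,episodeName,speakerName,text,speech_nr,episode_nr
2020-04-29,Placeholder episode,SEAN,Placeholder opening line.,1,1
2020-04-29,Placeholder episode,GUEST,Placeholder reply.,2,1
"""

rows = load_transcripts(io.StringIO(SAMPLE_CSV))
print(len(rows), rows[0]["speakerName"])
```

With the real file, the same function could be called on a handle opened with open(path, newline="", encoding="utf-8"), where path is wherever the CSV was saved.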
Distribution
The dataset is provided as a CSV file and covers episodes from 29 April 2020 to 24 June 2022, with 494 unique episode numbers represented. By speaker, 'SEAN' accounts for 22% of entries and entries recorded as '[null]' for 18%; all other speakers combined make up the remaining 60% (26,499 instances). There are 39,804 unique speech entries in total. The number of speeches per episode varies, with the majority of speech numbers falling between 1.00 and 44.50.
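Figures like the speaker shares and the episode count could be checked with a few lines of standard-library Python. This is only a sketch: it assumes the percentages are shares of rows, that a missing speaker name appears as an empty value (shown here as '[null]'), and that rows are dicts keyed by the column names. The function names and the sample rows are hypothetical.

```python
from collections import Counter


def speaker_shares(rows):
    """Return each speaker's share of rows as a fraction of all rows."""
    # Assumption: an empty or missing speakerName is what the listing shows as '[null]'.
    counts = Counter((row.get("speakerName") or "[null]") for row in rows)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {speaker: n / total for speaker, n in counts.items()}


def count_episodes(rows):
    """Count distinct episode numbers."""
    return len({row["episode_nr"] for row in rows})


# Invented placeholder rows, not real transcript content.
rows = [
    {"speakerName": "SEAN", "episode_nr": 1},
    {"speakerName": "SEAN", "episode_nr": 1},
    {"speakerName": "", "episode_nr": 2},
    {"speakerName": "GUEST", "episode_nr": 2},
]

shares = speaker_shares(rows)
episodes = count_episodes(rows)
for speaker, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{speaker}: {share:.0%}")
print("episodes:", episodes)
```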
Usage
This dataset is ideal for studying patterns in conversations and performing analysis of news sentiment. Potential applications include:
- Text analysis of spoken content.
- Trend analysis in news topics over time.
- Understanding speaker contributions in discussions.
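As one hedged illustration of the speaker-contribution use case, the sketch below totals the words spoken by each speaker. Splitting on whitespace is a rough stand-in for proper tokenisation, and the function name words_per_speaker and the sample rows are assumptions made for this example.

```python
from collections import defaultdict


def words_per_speaker(rows):
    """Sum whitespace-separated word counts of the text column per speaker."""
    totals = defaultdict(int)
    for row in rows:
        # Assumption: rows without a speaker name are grouped under '[null]'.
        speaker = row.get("speakerName") or "[null]"
        totals[speaker] += len((row.get("text") or "").split())
    return dict(totals)


# Invented placeholder rows, not real transcript content.
rows = [
    {"speakerName": "SEAN", "text": "one two three"},
    {"speakerName": "GUEST", "text": "four five"},
    {"speakerName": "SEAN", "text": "six"},
    {"speakerName": "", "text": "seven eight"},
]

totals = words_per_speaker(rows)
for speaker, words in sorted(totals.items(), key=lambda item: -item[1]):
    print(speaker, words)
```

The same grouping could be keyed on (episode_nr, speakerName) to compare contributions episode by episode.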
Coverage
The dataset spans 29 April 2020 to 24 June 2022, and its regional coverage is global. No specific demographic scope is given; the podcast content focuses on explaining current news.
License
CC0
Who Can Use It
- Researchers: For academic studies on discourse analysis or media content.
- Data Scientists: For natural language processing (NLP) tasks and sentiment modelling.
- Journalists/Media Analysts: To examine news narratives and public sentiment trends.
- Developers: To train language models or create applications that summarise podcast content.
Dataset Name Suggestions
- VOX Today Explained Podcast Transcripts
- Podcast News Conversation Data
- Today Explained Audio Transcripts
- VOX Podcast Text Data
Attributes
podcast
Original Data Source: VOX - Today Explained (podcast) transcripts