US Democratic Debates Speech Data
Government & Civic Records
Free
About
This dataset contains transcripts of all Democratic primary debates held between 2019 and February 2020, leading up to the New Hampshire primary. The data was scraped from Rev.com and is presented "as is". Two columns were added to make the data easier to use: the speaker and the duration of each speech in seconds. The dataset can be used to explore how candidates' speech patterns change over time, to distinguish consistent from "people-pleasing" rhetorical styles, or to assess how directly answers address the questions posed during the debates. Thanks go to Rev.com for transcribing the debates and making this public information available.
Columns
- date: The specific date on which the debate took place.
- debate_name: The official name or title of the debate, such as "October Democratic Debate Transcript: 4th Debate in Ohio".
- debate_section: Identifies particular sections within a debate, for instance, "Part 1", "Gun Control", or "Healthcare". If no explicit section is noted, the default value is "Entire Debate".
- speaker: The name of the candidate or individual speaking.
- speech: The exact text spoken by the speaker.
- speaking_time_seconds: The calculated duration, in seconds, for which the speaker delivered their speech, determined by the difference between speech starting times.
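The description of speaking_time_seconds can be illustrated with a small sketch. It assumes each turn's start time is available as a number of seconds from the start of the debate, and that a turn's duration is the next turn's start minus its own start. The function name and sample values are hypothetical and are not taken from the dataset or from the original scraping code. Under this assumption the last turn has no following start time, so its duration is left empty.

```python
from typing import List, Optional


def speaking_times(start_seconds: List[float]) -> List[Optional[float]]:
    """Return one duration per turn: next start time minus this start time.

    The final turn has no successor, so its duration is None.
    """
    durations: List[Optional[float]] = []
    for i, start in enumerate(start_seconds):
        if i + 1 < len(start_seconds):
            durations.append(start_seconds[i + 1] - start)
        else:
            durations.append(None)  # no following turn to measure against
    return durations


# Hypothetical start times (in seconds) for four consecutive turns.
starts = [0.0, 12.0, 47.0, 55.0]
print(speaking_times(starts))  # [12.0, 35.0, 8.0, None]
```

Whether the published file was produced exactly this way is not stated; the sketch only shows what "difference between speech starting times" could look like in practice.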
Distribution
The dataset is typically available as a CSV file, with one example being debate_transcripts_v3_2020-02-26.csv. It has a file size of 2.26 MB and comprises 6 columns. The dataset includes approximately 5,911 valid records across most columns, with speaking_time_seconds having 5,395 valid entries.
Usage
This dataset suits a range of analytical applications, including:
- Analysing how speech patterns of candidates evolve over the course of the primary season.
- Identifying which candidates maintain consistent viewpoints versus those whose rhetoric might adapt to audiences.
- Investigating the correlation between questions asked and the directness or relevance of candidate responses.
- Linguistic analysis of political discourse.
- Research into public speaking dynamics in high-stakes environments.
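As a starting point for analyses like those listed above, the following sketch totals speaking time per speaker using only the Python standard library. It relies on the column names listed under Columns and skips rows where speaking_time_seconds is empty, since that column has fewer valid entries than the others. The function name and the in-memory sample rows are placeholders invented for illustration; they are not records from the dataset, and the date format shown is an assumption.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def total_speaking_time(rows: Iterable[Mapping[str, str]]) -> Dict[str, float]:
    """Sum speaking_time_seconds per speaker, ignoring rows with no duration."""
    totals: Dict[str, float] = defaultdict(float)
    for row in rows:
        raw = (row.get("speaking_time_seconds") or "").strip()
        if not raw:
            continue  # duration missing for this turn
        totals[row["speaker"]] += float(raw)
    return dict(totals)


# Placeholder rows that mimic the six documented columns (not real data).
SAMPLE_CSV = """date,debate_name,debate_section,speaker,speech,speaking_time_seconds
2020-01-01,Placeholder Debate,Entire Debate,Speaker A,Placeholder text one.,30
2020-01-01,Placeholder Debate,Entire Debate,Speaker B,Placeholder text two.,45
2020-01-01,Placeholder Debate,Entire Debate,Speaker A,Placeholder text three.,
"""

totals = total_speaking_time(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(totals)  # {'Speaker A': 30.0, 'Speaker B': 45.0}
```

To run this against the real file, the `io.StringIO(SAMPLE_CSV)` argument could be replaced with a file object opened via `open("debate_transcripts_v3_2020-02-26.csv", newline="", encoding="utf-8")`; the encoding is an assumption and may need adjusting.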
Coverage
The data spans Democratic primary debates from 26 June 2019 to 25 February 2020. The description states that it covers all primary debates up until the New Hampshire primary, although the stated end date falls after that primary, which was held on 11 February 2020. The scope is limited to US Democratic primary debates and the candidates who took part in them.
License
CC0: Public Domain
Who Can Use It
This dataset is highly beneficial for:
- Political Scientists and Researchers: For detailed analysis of political discourse, candidate strategies, and public speaking effectiveness during elections.
- Journalists: To support investigative reporting on candidate positions and debate performance.
- Data Analysts and Scientists: For applying natural language processing (NLP) techniques to understand political communication.
- Academics: As a valuable resource for studies in linguistics, communication studies, and electoral politics.
Dataset Name Suggestions
- Democratic Primary Debate Transcripts 2019-2020
- US Democratic Debates Speech Data
- New Hampshire Primary Democratic Transcripts
- Political Discourse Analytics: Democratic Debates
Attributes
Original Data Source: US Democratic Debates Speech Data