
US Democratic Debates Speech Data

Government & Civic Records

Tags and Keywords

  • Politics
  • Debates
  • Transcripts
  • Candidates
  • Speeches


Free

About

This dataset contains transcripts of the United States Democratic primary debates held between June 2019 and February 2020. The data was scraped from Rev.com and is presented "as is", apart from two columns added for usability: the speaker, and the duration of each speech in seconds. It is suited to exploring how candidates' speech patterns change over time, distinguishing consistent from "people-pleasing" rhetorical styles, and assessing how directly answers address the questions posed during the debates. Acknowledgement is extended to Rev.com for transcribing the debates and making this public information available.

Columns

  • date: The specific date on which the debate took place.
  • debate_name: The official name or title of the debate, such as "October Democratic Debate Transcript: 4th Debate in Ohio".
  • debate_section: Identifies particular sections within a debate, for instance, "Part 1", "Gun Control", or "Healthcare". If no explicit section is noted, the default value is "Entire Debate".
  • speaker: The name of the candidate or individual speaking.
  • speech: The exact text spoken by the speaker.
  • speaking_time_seconds: The duration of the speech in seconds, calculated as the difference between the start time of this speech and the start time of the next one.
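
The speaking_time_seconds derivation described above can be sketched as follows. This is an illustration only: the listing does not say what timestamp format Rev.com uses or how the last speech of a section is handled, so the "MM:SS" / "HH:MM:SS" input format, the function names, and the use of None for a speech with no following start time are all assumptions.

```python
from typing import List, Optional


def parse_timestamp(stamp: str) -> int:
    """Convert an 'MM:SS' or 'HH:MM:SS' string into seconds (assumed format)."""
    seconds = 0
    for part in stamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def speaking_times(start_stamps: List[str]) -> List[Optional[int]]:
    """Duration of each speech = next speech's start minus this speech's start."""
    starts = [parse_timestamp(s) for s in start_stamps]
    durations: List[Optional[int]] = [
        following - current for current, following in zip(starts, starts[1:])
    ]
    if starts:
        # Assumption: the final speech has no following start time,
        # so its duration cannot be computed this way.
        durations.append(None)
    return durations


example = speaking_times(["00:00", "00:18", "01:05", "01:12"])
print(example)  # [18, 47, 7, None]
```

Under this scheme a speech with no successor has no computable duration. Whether that accounts for any of the missing speaking_time_seconds values is not stated in the listing.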

Distribution

The dataset is distributed as a CSV file, for example debate_transcripts_v3_2020-02-26.csv. The file is 2.26 MB and has 6 columns. Most columns have roughly 5,911 valid records; speaking_time_seconds has 5,395 valid entries, so about 500 rows lack a duration.
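
A minimal sketch of how the per-column valid counts might be checked, using only Python's standard library. The inline sample stands in for the real file and uses made-up placeholder rows. Only the six column names come from the listing; the date format and the helper names load_rows and count_valid are assumptions.

```python
import csv
import io
from collections import Counter

# Placeholder rows, NOT real data; only the header matches the listing.
SAMPLE_CSV = """date,debate_name,debate_section,speaker,speech,speaking_time_seconds
2019-06-26,Example Debate,Part 1,Moderator,"Opening question, placeholder.",12.0
2019-06-26,Example Debate,Part 1,Speaker A,Placeholder answer.,45.0
2019-06-26,Example Debate,Part 1,Speaker B,Placeholder reply.,
"""


def load_rows(handle):
    """Read every CSV row into a dict keyed by column name."""
    return list(csv.DictReader(handle))


def count_valid(rows, columns):
    """Count non-empty values per column."""
    counts = Counter()
    for row in rows:
        for col in columns:
            if (row.get(col) or "").strip():
                counts[col] += 1
    return counts


rows = load_rows(io.StringIO(SAMPLE_CSV))
valid = count_valid(rows, list(rows[0].keys()))
print(len(rows), valid["speech"], valid["speaking_time_seconds"])  # 3 3 2
```

For the real file, the same functions could be called on open("debate_transcripts_v3_2020-02-26.csv", newline="", encoding="utf-8"); the UTF-8 encoding is an assumption.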

Usage

This dataset is ideally suited for a variety of analytical applications, including:
  • Analysing how speech patterns of candidates evolve over the course of the primary season.
  • Identifying which candidates maintain consistent viewpoints and which adapt their rhetoric to the audience.
  • Assessing how directly and relevantly candidates' responses address the questions asked.
  • Linguistic analysis of political discourse.
  • Research into public speaking dynamics in high-stakes environments.
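
As a starting point for analyses like those listed above, the following sketch aggregates speaking time, word count, and number of speeches per speaker per debate date. The rows are made-up placeholders shaped like the documented columns, and the summarise helper and the date format are assumptions rather than part of the dataset.

```python
from collections import defaultdict

# Placeholder rows, NOT real data; values are strings, as csv.DictReader would give.
sample_rows = [
    {"date": "2019-06-26", "speaker": "Speaker A", "speech": "one two three", "speaking_time_seconds": "30.0"},
    {"date": "2019-06-26", "speaker": "Speaker B", "speech": "one two", "speaking_time_seconds": "10.0"},
    {"date": "2019-06-26", "speaker": "Speaker A", "speech": "one", "speaking_time_seconds": ""},
    {"date": "2019-07-30", "speaker": "Speaker A", "speech": "one two three four", "speaking_time_seconds": "20.0"},
]


def summarise(rows):
    """Per (date, speaker): total seconds, total words, number of speeches."""
    summary = defaultdict(lambda: {"seconds": 0.0, "words": 0, "turns": 0})
    for row in rows:
        entry = summary[(row["date"], row["speaker"])]
        entry["turns"] += 1
        entry["words"] += len(row["speech"].split())
        raw = row["speaking_time_seconds"].strip()
        if raw:  # rows with a missing duration add nothing to the total
            entry["seconds"] += float(raw)
    return dict(summary)


result = summarise(sample_rows)
for key in sorted(result):
    print(key, result[key])
```

Comparing these per-date summaries across the season is one simple way to see how a candidate's share of speaking time or verbosity shifts between debates. Whitespace splitting is a crude word count and would likely be replaced by a proper tokeniser for linguistic work.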

Coverage

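The data covers US Democratic primary debates from 26 June 2019 to 25 February 2020 and features the candidates who took part in them. Although the collection is described as covering debates up to the New Hampshire primary (held on 11 February 2020), the stated end date and the example file name (dated 26 February 2020) indicate that debates from later in February are also included.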

License

CC0: Public Domain

Who Can Use It

This dataset is highly beneficial for:
  • Political Scientists and Researchers: For detailed analysis of political discourse, candidate strategies, and public speaking effectiveness during elections.
  • Journalists: To support investigative reporting on candidate positions and debate performance.
  • Data Analysts and Scientists: For applying natural language processing (NLP) techniques to understand political communication.
  • Academics: As a valuable resource for studies in linguistics, communication studies, and electoral politics.

Dataset Name Suggestions

  • Democratic Primary Debate Transcripts 2019-2020
  • US Democratic Debates Speech Data
  • New Hampshire Primary Democratic Transcripts
  • Political Discourse Analytics: Democratic Debates

Attributes

Listing Stats

  • Views: 0
  • Downloads: 0
  • Listed: 03/08/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
