Political Statement Truthfulness Ratings
Government & Civic Records
Free
About
This dataset pairs political statements with verified truthfulness verdicts parsed from Politifact.com. It contains over 14,000 statements collected up until late 2020. Its primary purpose is to support research in truth detection, deception detection, and fact-checking integration. Each statement is assigned one of six degrees of veracity: True, Mostly True, Half-True, Mostly False, False, and Pants on Fire!
Columns
The data file contains four essential columns:
- statement: The specific assertion made by a celebrity or politician. There are over 14,150 unique statement values recorded.
- source: Identifies the originator of the statement, which may be a person or another entity.
- link: The URL of the page on Politifact.com associated with the statement.
- veracity: The degree of truthfulness assigned to the statement by the Politifact.com team. Half-True statements currently represent approximately 20% of the observations.
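As a rough illustration of how the four columns might be read and the veracity distribution checked, the sketch below uses only the Python standard library. It assumes the CSV header uses exactly the column names listed above and that the file sits at the hypothetical path politifact.csv in the working directory; the helper names are illustrative and not part of the dataset.

```python
import csv
from collections import Counter

# Column names as listed above; that the header matches them exactly is an assumption.
EXPECTED_COLUMNS = ["statement", "source", "link", "veracity"]


def read_rows(handle):
    """Yield one dict per record from an open CSV file handle."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    yield from reader


def veracity_shares(rows):
    """Return {label: fraction of rows} for the veracity column."""
    counts = Counter(row["veracity"].strip() for row in rows)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


def load_shares(path="politifact.csv"):
    """Read the file at `path` (hypothetical location) and return label shares."""
    # newline="" is the mode the csv module recommends for files it reads.
    with open(path, newline="", encoding="utf-8") as handle:
        return veracity_shares(read_rows(handle))
```

Calling load_shares() on the real file should show whether Half-True sits near the 20% figure quoted above; if the header differs from the assumed names, read_rows raises a ValueError rather than silently returning wrong counts.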
Distribution
The data is contained in a single file named politifact.csv. This file is typically distributed in CSV format and has a size of approximately 4.03 MB. It has 4 columns and over 14,000 records, one per political statement. Other file variants, often used for machine learning purposes, may have certain classes removed or binarized (simplified into truths and lies).
Usage
This dataset is ideally suited for various machine learning and analytical applications, including:
- Building models to detect the truthfulness of text based on statement language.
- Developing and testing algorithms for automated lie detection.
- Integrating fact-checking capabilities into digital platforms.
- Performing Exploratory Data Analysis (EDA) focused on political trends and rhetoric.
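For the model-building uses listed above, the binarized variants mentioned under Distribution could be reproduced along the following lines. This is only a sketch: the listing does not say which classes the variants remove or how the six ratings are grouped, so the grouping below (and dropping Half-True) is an assumption, and the function names are hypothetical.

```python
# Assumed grouping: which ratings count as "truth" or "lie" is a modelling
# choice, not something specified by the dataset description.
TRUTH_LABELS = {"True", "Mostly True"}
LIE_LABELS = {"Mostly False", "False", "Pants on Fire!"}
DROPPED_LABELS = {"Half-True"}  # assumed to be the removed class


def binarize(label):
    """Map a veracity rating to 1 (truth), 0 (lie), or None (dropped)."""
    label = label.strip()
    if label in TRUTH_LABELS:
        return 1
    if label in LIE_LABELS:
        return 0
    if label in DROPPED_LABELS:
        return None
    raise ValueError(f"unknown veracity label: {label!r}")


def binarize_rows(rows):
    """Return (statement, target) pairs, skipping rows whose class is dropped."""
    pairs = []
    for row in rows:
        target = binarize(row["veracity"])
        if target is not None:
            pairs.append((row["statement"], target))
    return pairs
```

Raising on unknown labels is deliberate: a misspelled or newly introduced rating surfaces immediately instead of being silently assigned to one side.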
Coverage
The data focuses on US political statements and public affirmations from figures such as celebrities and politicians, spanning a period up to late 2020. The scope is defined by the statements and veracity verdicts issued by the Politifact.com platform.
License
CC0: Public Domain
Who Can Use It
- Data Scientists: For training Natural Language Processing (NLP) models focused on text classification and veracity scoring.
- Academics and Researchers: For studying political discourse, misinformation, and deception patterns.
- Journalists and Fact-Checkers: For building tools to verify public claims and integrate verification systems.
Dataset Name Suggestions
- Politifact Veracity Dataset
- Political Statement Truthfulness Ratings
- Deception Detection Text Data
- Fact-Checked Political Affirmations
Attributes
Original Data Source: Political Statement Truthfulness Ratings