Political Claim Verification Dataset

Government & Civic Records

Tags and Keywords

Text, Politics, NLP, Government

Free

About

This dataset is a collection of fact-checked claims scraped from Politifact.com. Each record pairs a claim made by an individual with the corresponding assessment by a Politifact curator. It is intended to support Natural Language Processing (NLP) work on analysing the integrity of information and determining the validity of claims, including research into misinformation, public discourse analysis, and the development of automated fact-checking systems.

Columns

sources: A string representing the individual associated with the quote or claim.
sources_dates: The date on which the information or quote was originally furnished by the source.
sources_post_location: The specific location or medium through which the source provided the information, such as a Facebook post.
sources_quote: The exact quote or statement made by the source under scrutiny.
curator_name: The name of the person from Politifact who curated, analysed, and assessed the source's quote.
curated_date: The date when the Politifact curator analysed and assessed the source's claim.
fact: The fact score or rating assigned to the source's quote by Politifact.
sources_url: The URL linking to the Politifact curator's article that discusses the source's quote.
curators_article_title: The title of the article written by the curator, which either supports or rejects the source's claim.
curator_complete_article: The full blog post or article written by the curator providing detailed reasoning for supporting or rejecting the source's claim.
curator_tags: Keywords or tags assigned by the curator to their blog post.
index: An identifier for the entry.
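
The column list above can be sketched as a typed record, which makes it easy to check that a downloaded file has the expected header. This is a minimal sketch, not an official schema: the names `ClaimRecord`, `EXPECTED_COLUMNS`, and `missing_columns` are hypothetical, and every field is assumed to be a plain string because the listing does not state column types.

```python
from dataclasses import dataclass, fields


@dataclass
class ClaimRecord:
    """One row of the dataset. All types are assumed to be str (not stated in the listing)."""
    sources: str
    sources_dates: str
    sources_post_location: str
    sources_quote: str
    curator_name: str
    curated_date: str
    fact: str
    sources_url: str
    curators_article_title: str
    curator_complete_article: str
    curator_tags: str
    index: str


# Column names in the order given in the listing.
EXPECTED_COLUMNS = [f.name for f in fields(ClaimRecord)]


def missing_columns(header):
    """Return the expected column names that are absent from a CSV header."""
    present = set(header)
    return [name for name in EXPECTED_COLUMNS if name not in present]
```

Calling `missing_columns` on the header row of a downloaded file returns an empty list when all twelve documented columns are present.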

Distribution

The dataset is typically provided as a CSV file. Row counts for individual files are updated separately, but the dataset contains approximately 19,400 unique records. Its columns cover source information, claim content, and curatorial analysis, so it can be used in data processing tasks with little preparation.
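
As a rough illustration of reading the CSV and checking the record count, the sketch below uses only the Python standard library. The inline sample rows are made-up placeholders, not rows from the dataset, and treating a (`sources`, `sources_quote`) pair as the definition of a unique record is an assumption, since the listing does not say how uniqueness is determined.

```python
import csv
import io


def load_rows(stream):
    """Read every CSV row from a text stream into a list of dicts keyed by column name."""
    return list(csv.DictReader(stream))


def count_unique(rows, key=("sources", "sources_quote")):
    """Count distinct rows, comparing only the columns named in `key` (an assumed definition)."""
    return len({tuple(row.get(k, "") for k in key) for row in rows})


# Placeholder data for illustration only; the real file has twelve columns.
SAMPLE = """sources,sources_quote,fact
Speaker A,Placeholder claim one,true
Speaker A,Placeholder claim one,true
Speaker B,Placeholder claim two,false
"""

rows = load_rows(io.StringIO(SAMPLE))
print(len(rows), count_unique(rows))  # prints: 3 2
```

For the real file, the stream would come from something like `open("politifact.csv", newline="", encoding="utf-8")`, where the file name is hypothetical. Passing `newline=""` lets the `csv` module handle line breaks inside quoted fields, which matters for the full-article column. If a very long article exceeds the parser's default field size, `csv.field_size_limit` can raise the cap.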

Usage

This dataset is ideally suited for researchers and developers working on:

Developing and testing NLP algorithms for fact-checking and truth detection.
Analysing patterns in misinformation and disinformation.
Studying the discourse around political claims and public statements.
Building models to predict the veracity of claims.
Training machine learning models for natural language understanding and text classification in the context of media integrity.
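
To make the veracity-prediction use case concrete, here is a minimal bag-of-words Naive Bayes baseline written with the standard library only. It is a sketch under stated assumptions, not a recommended model: the class name `NaiveBayes` is hypothetical, the training sentences are invented placeholders rather than dataset rows, and the labels `"true"` and `"false"` stand in for whatever values the `fact` column actually holds.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.label_counts = Counter()            # documents seen per label
        self.word_counts = defaultdict(Counter)  # token counts per label
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.label_counts[label] += 1
            for tok in tokenize(text):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            # Log prior plus the smoothed log likelihood of each known token.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for tok in tokenize(text):
                if tok not in self.vocab:
                    continue  # ignore tokens never seen in training
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented placeholder examples; in practice use sources_quote as text and fact as label.
texts = [
    "unemployment fell last year",
    "unemployment fell again this quarter",
    "crime tripled overnight",
    "crime tripled in every city",
]
labels = ["true", "true", "false", "false"]
model = NaiveBayes().fit(texts, labels)
```

With real data, the `sources_quote` column would supply the texts and the `fact` column the labels, and a held-out split would be needed to measure accuracy. A baseline like this mainly serves as a reference point for stronger text classifiers.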

Coverage

The dataset covers claims and fact-checks globally. The collected information spans 2 May 2007 to 20 April 2021, a significant period of public discourse. Sources vary widely; a notable share of claims are attributed to Donald Trump, and a notable share originate from Facebook posts.
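
The coverage figures above could be checked against a loaded file along these lines. This is a sketch with several assumptions: the stated date range is taken to apply to `sources_dates`, dates are assumed to be stored as `YYYY-MM-DD` (the listing does not give the format, so `fmt` may need changing), and the sample rows are invented placeholders. The helper names are hypothetical.

```python
from collections import Counter
from datetime import date, datetime

# Range stated in the listing: 2 May 2007 to 20 April 2021.
COVERAGE_START = date(2007, 5, 2)
COVERAGE_END = date(2021, 4, 20)


def parse_date(text, fmt="%Y-%m-%d"):
    """Parse a date string; the default format is an assumption about the file."""
    return datetime.strptime(text.strip(), fmt).date()


def in_coverage(row, column="sources_dates", fmt="%Y-%m-%d"):
    """Return True if the row's date falls inside the stated coverage window."""
    return COVERAGE_START <= parse_date(row[column], fmt) <= COVERAGE_END


def share_by(rows, column):
    """Map each distinct value of `column` to its share of rows, largest first."""
    counts = Counter(row[column] for row in rows)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.most_common()}


# Placeholder rows for illustration only.
sample_rows = [
    {"sources_dates": "2007-05-02", "sources_post_location": "a Facebook post"},
    {"sources_dates": "2015-08-10", "sources_post_location": "a speech"},
    {"sources_dates": "2021-04-20", "sources_post_location": "a Facebook post"},
    {"sources_dates": "2021-04-21", "sources_post_location": "a tweet"},
]

covered = [row for row in sample_rows if in_coverage(row)]
shares = share_by(sample_rows, "sources_post_location")
```

Applying `share_by` to the `sources` or `sources_post_location` column of the real data would show how concentrated the claims are in a few speakers or platforms, which is worth knowing before training a model on them.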

License

CC0

Who Can Use It

This dataset is particularly beneficial for:

Data Scientists and NLP Engineers: For training and evaluating models related to text classification, sentiment analysis, and claim verification.
Academics and Researchers: Studying political science, media studies, communication, and computational social science.
Journalists and Fact-Checkers: As a reference or for building tools to assist in verifying information.
Public Policy Analysts: To understand the spread of information and its impact.

Dataset Name Suggestions

Politifact Fact-Check Data
Political Claim Verification Dataset
Public Fact-Checking Corpus
Media Truthfulness Data
NLP Fact-Checking Dataset

Attributes

Original Data Source: Politifact Factcheck Data

Listing Stats

Listed: 26/06/2025
Region: Global
Quality: 5 / 5
Version: 1.0
