Wikipedia-Based Climate Change Fact-Checking Dataset
News & Media Articles
Free
About
Verifying environmental claims has become increasingly important as climate change discourse grows in global prominence. This resource adopts the FEVER methodology to provide a structured framework for verifying real-world claims collected from the internet. It consists of over 1,500 claims, each paired with five manually annotated evidence sentences retrieved from English Wikipedia. Each claim-evidence pair is labelled according to whether the sentence supports the claim, refutes it, or provides too little information, so the collection captures complex and disputed cases where conflicting evidence coexists. This makes it a useful resource for Natural Language Processing (NLP) work on fact-checking and information retrieval in the context of climate science.
Columns
- claim_id: A unique numeric identifier assigned to each specific climate-related claim.
- claim: The actual text of the real-world claim being investigated.
- claim_label: The final verdict assigned to the claim (e.g., SUPPORTS, REFUTES, or NOT_ENOUGH_INFO) based on a majority vote of the evidence.
- evidences: A collection containing the top five evidence sentences associated with the claim.
- evidence_id: A unique identifier for each individual piece of evidence.
- evidence_label: The micro-verdict assigned to a specific sentence regarding its relationship to the claim.
- article: The title of the specific Wikipedia page from which the evidence was extracted.
- evidence: The specific sentence used to validate or invalidate the claim.
- entropy: A metric reflecting the level of uncertainty or disagreement among the votes for a label.
- votes: An array documenting the individual votes cast during the manual annotation process.
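The relationship between the votes, evidence_label, and entropy fields can be illustrated with a small sketch. The listing does not state the exact formula behind the entropy column, so the version below assumes plain Shannon entropy with the natural logarithm over the votes that were actually cast, and treats missing votes as None. The helper names vote_entropy and majority_label are hypothetical, not part of the dataset.

```python
from collections import Counter
from math import log


def _cast_votes(votes):
    """Drop missing entries (assumed to be None) from a list of votes."""
    return [v for v in votes if v is not None]


def vote_entropy(votes):
    """Shannon entropy (natural log) of the label distribution in `votes`.

    Assumption: the dataset's own entropy column may use a different
    log base or normalisation; this is only an illustration.
    """
    cast = _cast_votes(votes)
    if not cast:
        return 0.0
    total = len(cast)
    counts = Counter(cast)
    return -sum((n / total) * log(n / total) for n in counts.values())


def majority_label(votes):
    """Most frequent label among the cast votes, or None if there are none.

    Ties are resolved in favour of the label that appears first in `votes`.
    """
    cast = _cast_votes(votes)
    if not cast:
        return None
    return Counter(cast).most_common(1)[0][0]
```

Unanimous votes give an entropy of zero, and the value grows as annotators disagree, which is why the entropy column can serve as a rough indicator of how contested a piece of evidence is.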
Distribution
The data is delivered in a CSV file titled climate-fever.csv, with a file size of approximately 2.31 MB. It contains 1,535 unique claims, which expand into 7,675 individual claim-evidence pairs. The records demonstrate high integrity, with 100% validity across core fields such as the claim text and labels. This is a static archive with a usability score of 10.00, and no future updates are expected.
Usage
This collection is ideally suited for training and evaluating automated fact-checking systems and natural language inference models. Researchers can use it to develop algorithms capable of navigating "challenging" claims that involve multiple facets of climate science. It also provides a robust foundation for sentiment analysis and studies on the spread of information regarding global warming. By examining the entropy and voting patterns, data scientists can further investigate human uncertainty in the labelling of controversial scientific topics.
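As one possible way to prepare the records for a natural language inference model, the sketch below flattens each claim into claim-evidence pairs, using the evidence sentence as the premise and the claim as the hypothesis. It assumes each record has already been parsed into a dictionary with a nested evidences list using the field names from the Columns section; the CSV itself may store these fields in flattened columns. The function to_nli_pairs, its max_entropy parameter, and the sample record are all hypothetical placeholders, not real dataset content.

```python
def to_nli_pairs(records, max_entropy=None):
    """Flatten claim records into premise/hypothesis pairs.

    If `max_entropy` is given, evidence whose entropy exceeds it is
    skipped, keeping only pairs the annotators largely agreed on.
    """
    pairs = []
    for record in records:
        for ev in record["evidences"]:
            if max_entropy is not None and ev["entropy"] > max_entropy:
                continue
            pairs.append({
                "claim_id": record["claim_id"],
                "premise": ev["evidence"],      # Wikipedia sentence
                "hypothesis": record["claim"],  # claim under test
                "label": ev["evidence_label"],
            })
    return pairs


# Placeholder record for illustration only; not taken from the dataset.
sample_records = [
    {
        "claim_id": 0,
        "claim": "Example claim text.",
        "claim_label": "SUPPORTS",
        "evidences": [
            {"evidence_id": "Example article:1", "evidence_label": "SUPPORTS",
             "article": "Example article", "evidence": "First example sentence.",
             "entropy": 0.0, "votes": ["SUPPORTS", "SUPPORTS"]},
            {"evidence_id": "Example article:2", "evidence_label": "NOT_ENOUGH_INFO",
             "article": "Example article", "evidence": "Second example sentence.",
             "entropy": 0.69, "votes": ["SUPPORTS", "NOT_ENOUGH_INFO"]},
        ],
    }
]

all_pairs = to_nli_pairs(sample_records)
confident_pairs = to_nli_pairs(sample_records, max_entropy=0.5)
```

Filtering on entropy is a design choice rather than a requirement: keeping the high-entropy pairs is equally reasonable when the goal is to study annotator disagreement instead of training a classifier on clean labels.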
Coverage
The scope encompasses real-world claims gathered from across the internet, providing a broad view of contemporary climate discourse. The evidence is sourced exclusively from English Wikipedia, ensuring a standardised baseline for verification. The claims are global in scope and are drawn from publicly available online content. The records represent a fixed point in time, around the dataset's publication in 2020, capturing the state of climate knowledge and internet claims up to that period.
License
CC0: Public Domain
Who Can Use It
NLP researchers can leverage these records to refine models for claim verification and evidence retrieval. Fact-checkers and journalists can utilise the annotated pairs to understand common climate myths and the evidence used to debunk them. Additionally, data science students can use the high-validity labels and voting data to practice classification and uncertainty modelling within a socially relevant context.
Dataset Name Suggestions
- CLIMATE-FEVER: Real-World Claim Verification Archive
- Wikipedia-Based Climate Change Fact-Checking Dataset
- Annotated Climate Claims and Evidence Pairs for NLP
- Global Warming Narrative Verification Registry
- FEVER Methodology Climate Claim and Evidence Collection
Attributes
Original Data Source: Wikipedia-Based Climate Change Fact-Checking Dataset
