News Authenticity Dataset
Entertainment & Media Consumption
Free
About
This dataset is designed to help detect and analyse misinformation patterns. It is intended for developing and evaluating machine learning models that distinguish authentic news from fabricated news. The dataset is divided into two parts: one containing 21,417 verified news articles and another containing 23,481 fabricated or misleading news articles, for 44,898 articles in total. It addresses the challenge of misinformation spreading rapidly in the digital age.
Columns
- title: The headline or title of the news article.
- text: The full content or body of the news article.
- subject: The category or topic of the article, such as politics, world news, tech, conspiracy, or political bias.
- date: The publication or assigned date of the article.
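As a rough illustration of the column layout above, the following sketch reads article rows with Python's standard `csv` module and checks that the four listed columns are present. The helper name `read_articles` and the in-memory sample row are hypothetical and only stand in for a real file; the listing does not name the actual data files.

```python
import csv
import io

# Column names as listed in the Columns section.
EXPECTED_COLUMNS = ["title", "text", "subject", "date"]

def read_articles(handle):
    """Read article rows from an open CSV handle, checking the header first."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [c for c in EXPECTED_COLUMNS if c not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    # Keep only the documented columns, in a fixed order.
    return [{c: row[c] for c in EXPECTED_COLUMNS} for row in reader]

# Made-up row for demonstration; it is not taken from the dataset.
sample = io.StringIO(
    "title,text,subject,date\n"
    '"Example headline","Example body, with a comma.",politics,2017-01-01\n'
)
articles = read_articles(sample)
print(articles[0]["subject"])  # politics
```

For a file on disk, the same function could be called with a handle from `open(path, newline="", encoding="utf-8")`, where `path` is whatever file name the download provides.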
Distribution
The data files are typically in CSV format. Row counts are given separately for each of the two parts: 23,481 fabricated news articles and 21,417 verified news articles, for 44,898 records overall. Each article entry comprises the title, text, subject, and date.
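Because the two parts are distributed separately, a common first step is to tag each row with its origin and merge them. The sketch below shows one way to do that. The function name `label_and_combine` and the encoding (1 for verified, 0 for fabricated) are assumptions made for illustration, and the rows are placeholders rather than real records.

```python
from collections import Counter

def label_and_combine(verified_rows, fabricated_rows):
    """Return one list of rows, each copied and tagged with a binary label.

    Label encoding is an assumption: 1 = verified, 0 = fabricated.
    """
    combined = [dict(row, label=1) for row in verified_rows]
    combined += [dict(row, label=0) for row in fabricated_rows]
    return combined

# Placeholder rows, not taken from the dataset.
verified = [{"title": "A", "text": "a", "subject": "world news", "date": ""}]
fabricated = [
    {"title": "B", "text": "b", "subject": "politics", "date": ""},
    {"title": "C", "text": "c", "subject": "conspiracy", "date": ""},
]

combined = label_and_combine(verified, fabricated)
counts = Counter(row["label"] for row in combined)
print(len(combined), counts[1], counts[0])  # 3 1 2
```

On the full dataset, the same check would be expected to report 21,417 rows with label 1 and 23,481 rows with label 0, which gives a quick way to confirm that both parts loaded completely.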
Usage
This dataset is ideal for training Natural Language Processing (NLP) models for binary classification (fake versus real news). It can also be used for sentiment and subject analysis of misinformation, as well as for exploring linguistic patterns that differentiate authentic news from deceptive content.
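To make the binary classification use case concrete, here is a minimal bag-of-words Naive Bayes classifier with add-one smoothing, written with the standard library only so that it stays self-contained. It is a sketch rather than a recommended pipeline: the class name `NaiveBayes`, the tokenizer, and the four toy sentences are all hypothetical, and in practice a library such as scikit-learn would likely be used on the real `text` column.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

class NaiveBayes:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.class_counts}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set().union(*self.word_counts.values())
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for c, doc_count in self.class_counts.items():
            total_words = sum(self.word_counts[c].values())
            score = math.log(doc_count / total_docs)  # log prior
            for word in tokenize(text):
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                smoothed = (self.word_counts[c][word] + 1) / (total_words + len(self.vocab))
                score += math.log(smoothed)
            scores[c] = score
        return max(scores, key=scores.get)

# Toy training sentences, invented for illustration only.
texts = [
    "officials confirm budget vote in parliament",
    "senate passes trade bill after debate",
    "shocking secret cure they hide from you",
    "you won't believe this shocking hoax",
]
labels = ["real", "real", "fake", "fake"]

model = NaiveBayes().fit(texts, labels)
print(model.predict("shocking secret hoax"))      # fake
print(model.predict("parliament budget debate"))  # real
```

With the actual data, `texts` would come from the `text` column (optionally joined with `title`) and `labels` from which of the two parts each row belongs to, with a held-out split used to measure accuracy.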
Coverage
The dataset is global in scope. The articles are dated from 31 March 2015 to 19 February 2018 and cover subjects such as politics, world news, and technology. Article counts vary across date ranges within this period.
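The stated date range can be used as a sanity check on the `date` column. The sketch below parses a date string and tests whether it falls inside the coverage window. The listing does not say how dates are formatted, so the two candidate formats in `DATE_FORMATS` are assumptions, as are the helper names `parse_date` and `in_coverage`.

```python
from datetime import date, datetime

# Coverage window as stated in the listing.
COVERAGE_START = date(2015, 3, 31)
COVERAGE_END = date(2018, 2, 19)

# Assumed formats; adjust to match the actual files.
# Month names in "%B" are matched using the current locale (English assumed).
DATE_FORMATS = ("%Y-%m-%d", "%B %d, %Y")

def parse_date(raw):
    """Return a date for the first matching format, or None if none match."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date()
        except ValueError:
            continue
    return None

def in_coverage(raw):
    """True if the string parses and lies within the stated coverage window."""
    parsed = parse_date(raw)
    return parsed is not None and COVERAGE_START <= parsed <= COVERAGE_END

print(in_coverage("2015-03-31"))         # True
print(in_coverage("February 19, 2018"))  # True
print(in_coverage("2018-02-20"))         # False
print(in_coverage("not a date"))         # False
```

Rows for which `parse_date` returns `None` are worth inspecting by hand rather than discarding silently, since they may point to a date format not covered by the assumed list.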
License
CC-BY-NC
Who Can Use It
This dataset is suitable for a variety of users, including data science and machine learning learners, researchers focused on information integrity, and developers building news verification tools.
Dataset Name Suggestions
- Fake News Articles Dataset
- Real and Fake News Classifier
- Misinformation Detection Corpus
- News Authenticity Dataset
- News Verification Data
Attributes
Original Data Source: Real & Fake News