Opendatabay

News Authenticity Dataset

Entertainment & Media Consumption

Tags and Keywords

News

Text

NLP

English

Binary

Free

About

This dataset is designed to help detect and analyse misinformation patterns, addressing the challenge of misinformation spreading rapidly in the digital age. It is a collection of news articles for developing and evaluating machine learning models that distinguish authentic news from fabricated news. The dataset is divided into two parts: one containing 21,417 verified news articles and another containing 23,481 fabricated or misleading news articles.

Columns

  • title: The headline or title of the news article.
  • text: The full content or body of the news article.
  • subject: The category or topic of the article, such as politics, world news, tech, conspiracy, or political bias.
  • date: The publication or assigned date of the article.

Distribution

The data files are typically in CSV format and come in two parts: 21,417 verified news articles and 23,481 fabricated news articles, for 44,898 records in total. Each record comprises four fields: title, text, subject, and date.
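As a rough illustration of combining the two parts into one labelled collection, the sketch below reads each part with Python's standard csv module and attaches a binary label. Several things here are assumptions rather than facts from the listing: that each CSV has a header row naming the four columns, the label convention (1 for verified, 0 for fabricated), the helper name load_labelled, and the in-memory sample rows, which are made up purely so the example runs without the real files.

```python
import csv
import io

COLUMNS = ("title", "text", "subject", "date")

def load_labelled(handle, label):
    """Read one part of the dataset and attach a binary label to every row.

    `handle` is any open text file object; `label` is 1 for verified
    articles and 0 for fabricated ones (an assumed convention).
    """
    rows = []
    for row in csv.DictReader(handle):
        # Keep only the four documented columns and trim stray whitespace.
        record = {name: (row.get(name) or "").strip() for name in COLUMNS}
        record["label"] = label
        rows.append(record)
    return rows

# Made-up stand-ins for the two CSV parts (not real dataset rows).
REAL_SAMPLE = """title,text,subject,date
Budget vote passes,Lawmakers approved the budget on Tuesday.,politics,"December 31, 2017"
"""
FAKE_SAMPLE = """title,text,subject,date
Shocking secret revealed,You will not believe this hoax.,conspiracy,"March 31, 2015"
"""

articles = (
    load_labelled(io.StringIO(REAL_SAMPLE), label=1)
    + load_labelled(io.StringIO(FAKE_SAMPLE), label=0)
)

for article in articles:
    print(article["label"], article["subject"], article["title"])
```

With the real files, the io.StringIO objects would be replaced by open(path, newline="", encoding="utf-8") handles; the file paths depend on how the download is packaged.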

Usage

This dataset is ideal for training Natural Language Processing (NLP) models for binary classification (fake versus real news). It can also be used for sentiment and subject analysis of misinformation, as well as for exploring linguistic patterns that differentiate authentic news from deceptive content.
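To make the binary classification use case concrete, here is a minimal sketch of a bag-of-words multinomial Naive Bayes classifier written with the standard library only. It is one simple baseline chosen for illustration, not a method prescribed by the dataset. The class name, the tokenizer, the label convention (1 for real, 0 for fake), and the four toy training sentences are all assumptions; in practice the text column of the dataset would supply the training data.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {0: Counter(), 1: Counter()}
        self.doc_counts = Counter()
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label in (0, 1):
            total_words = sum(self.word_counts[label].values())
            # Start from the log prior of the class.
            score = math.log(self.doc_counts[label] / total_docs)
            for token in tokenize(text):
                if token not in self.vocab:
                    continue  # ignore words never seen in training
                smoothed = self.word_counts[label][token] + 1
                score += math.log(smoothed / (total_words + len(self.vocab)))
            scores[label] = score
        return max(scores, key=scores.get)

# Toy training data (made up); 1 = real, 0 = fake.
train_texts = [
    "officials confirmed the budget vote on tuesday",
    "the ministry published its annual trade report",
    "shocking secret they do not want you to know",
    "you will not believe this shocking hoax",
]
train_labels = [1, 1, 0, 0]

model = NaiveBayes().fit(train_texts, train_labels)
pred_real = model.predict("officials published the trade report")
pred_fake = model.predict("shocking hoax they want you to believe")
print(pred_real, pred_fake)
```

A model trained on four sentences says nothing about real-world accuracy; the point is only the shape of the training and prediction steps.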

Coverage

The dataset is global in scope. The articles span 31st March 2015 to 19th February 2018 and cover subjects such as politics, world news, and technology. Article counts are not uniform across this period; they vary between date ranges.
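One way to inspect how article counts vary over the coverage window is to parse the date column and tally articles per month, as sketched below. The listing does not state how dates are written, so the format string ("Month day, year", for example "December 31, 2017") is an assumption, as are the function names and the sample values. Rows that fail to parse or fall outside the stated range are skipped rather than guessed at.

```python
from collections import Counter
from datetime import date, datetime

# Coverage window stated in the listing.
COVERAGE_START = date(2015, 3, 31)
COVERAGE_END = date(2018, 2, 19)

# Assumed date format; adjust to match the actual files.
DATE_FORMAT = "%B %d, %Y"

def parse_date(raw):
    """Return a date, or None when the value does not match DATE_FORMAT."""
    try:
        return datetime.strptime(raw.strip(), DATE_FORMAT).date()
    except ValueError:
        return None

def monthly_counts(raw_dates):
    """Count articles per (year, month) inside the coverage window."""
    counts = Counter()
    for raw in raw_dates:
        parsed = parse_date(raw)
        if parsed is not None and COVERAGE_START <= parsed <= COVERAGE_END:
            counts[(parsed.year, parsed.month)] += 1
    return counts

# Made-up sample values, including one out-of-range and one malformed entry.
sample_dates = [
    "March 31, 2015",
    "December 31, 2017",
    "December 1, 2017",
    "February 20, 2018",
    "not a date",
]

counts = monthly_counts(sample_dates)
for (year, month), n in sorted(counts.items()):
    print(f"{year}-{month:02d}: {n}")
```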

License

CC-BY-NC

Who Can Use It

This dataset is suitable for a variety of users including data science and machine learning learners, researchers focused on information integrity, and developers building news verification tools.

Dataset Name Suggestions

  • Fake News Articles Dataset
  • Real and Fake News Classifier
  • Misinformation Detection Corpus
  • News Authenticity Dataset
  • News Verification Data

Attributes

Original Data Source: Real & Fake News

Listing Stats

VIEWS

3

DOWNLOADS

1

LISTED

14/06/2025

REGION

GLOBAL

UNIVERSAL DATA QUALITY SCORE (UDQS)

5 / 5

VERSION

1.0