News Authenticity Dataset
Free
About
This dataset is designed for fake news detection. It is a structured collection of news articles, each categorised as either fake or genuine, and provides the title, body text, subject and publication date needed to train and evaluate machine learning models that identify misinformation. With 44,919 articles in total across both categories, it is suitable for a variety of natural language processing tasks.
Columns
- Title: The headline or title of the news article.
- Text: The main body content of the news article.
- Subject: The topic or category of the news article.
- Date: The publication date of the news article.
Distribution
The dataset is provided in CSV format, split into two distinct files. One file contains 23,502 fake news articles, whilst the other contains 21,417 genuine news articles, giving 44,919 articles in total. Each record in both files follows the structure outlined in the 'Columns' section.
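Because the fake and genuine articles arrive in separate files, a label usually has to be attached while loading. The following is a minimal sketch using only the Python standard library. The lower-case column names, the 1/0 label encoding and the helper names `read_articles` and `load_labelled` are assumptions made for illustration; the listing does not specify the exact header spelling or the file names.

```python
import csv
import io

# Assumed column names, lower-cased; the real header spelling may differ.
EXPECTED_COLUMNS = {"title", "text", "subject", "date"}


def read_articles(handle, label):
    """Read one CSV file handle and attach the given label to every row."""
    reader = csv.DictReader(handle)
    rows = []
    for raw in reader:
        # Normalise header case so "Title" and "title" are treated alike.
        row = {key.strip().lower(): value for key, value in raw.items() if key}
        missing = EXPECTED_COLUMNS - set(row)
        if missing:
            raise ValueError(f"missing columns: {sorted(missing)}")
        row["label"] = label
        rows.append(row)
    return rows


def load_labelled(fake_handle, real_handle):
    """Combine both files into one list (assumed encoding: 1 = fake, 0 = genuine)."""
    return read_articles(fake_handle, 1) + read_articles(real_handle, 0)


# Tiny in-memory stand-ins for the two files, for illustration only.
fake_csv = io.StringIO("title,text,subject,date\nA,made-up body,News,x\n")
real_csv = io.StringIO("Title,Text,Subject,Date\nB,real body,politics,y\n")
articles = load_labelled(fake_csv, real_csv)
print(len(articles), [a["label"] for a in articles])
```

With the downloaded files, the same functions can be given handles opened with `open(path, newline="", encoding="utf-8")`, where `path` is whatever name each file has in the download.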
Usage
This dataset is ideal for training and validating machine learning models for fake news detection, text classification, and natural language processing research. It can be utilised by researchers and data scientists to develop algorithms that identify and flag misinformation, contributing to efforts against the spread of inaccurate content.
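For training and validation, one common approach is a stratified split that keeps the fake-to-genuine ratio the same in both parts. The sketch below is one possible way to do this with the standard library, not a method prescribed by the dataset. The function name `stratified_split`, the 20% test fraction, the fixed seed and the `label` key are all assumptions; the synthetic records only mirror the class sizes given under 'Distribution'.

```python
import random


def stratified_split(records, test_fraction=0.2, seed=0):
    """Split records into train/test, keeping the label ratio in both parts."""
    rng = random.Random(seed)

    # Group records by their (assumed) "label" key.
    by_label = {}
    for record in records:
        by_label.setdefault(record["label"], []).append(record)

    train, test = [], []
    for group in by_label.values():
        shuffled = group[:]
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * test_fraction)
        test.extend(shuffled[:cut])
        train.extend(shuffled[cut:])

    # Shuffle again so the classes are interleaved within each part.
    rng.shuffle(train)
    rng.shuffle(test)
    return train, test


# Synthetic records mirroring the class sizes quoted in this listing.
records = [{"label": 1}] * 23502 + [{"label": 0}] * 21417
train, test = stratified_split(records)
print(len(train), len(test))
```

Fixing the seed keeps the split reproducible, which makes results from different models easier to compare.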
Coverage
The dataset's coverage is global, with a primary focus on events and topics from late 2017, particularly December. It offers a snapshot of news content from this period, including political and social subjects. No further notes are provided on the availability of data for specific groups or for years outside this timeframe.
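The stated timeframe can be checked by counting articles per month from the Date column. The sketch below is a hedged illustration: the listing does not say how dates are written, so the candidate formats are guesses, and values that match none of them are counted as "unparsed" rather than discarded. The names `parse_date` and `articles_per_month` are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Candidate formats are guesses; the listing does not specify how dates are written.
CANDIDATE_FORMATS = ("%B %d, %Y", "%b %d, %Y", "%Y-%m-%d")


def parse_date(value):
    """Return a date for the first matching format, or None if none match."""
    text = value.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None


def articles_per_month(date_strings):
    """Count articles per YYYY-MM bucket, tracking unparseable values separately."""
    counts = Counter()
    for value in date_strings:
        parsed = parse_date(value)
        key = parsed.strftime("%Y-%m") if parsed else "unparsed"
        counts[key] += 1
    return counts


# Made-up sample values, for illustration only.
sample = ["December 31, 2017", "Dec 30, 2017", "2017-11-02", "not a date"]
print(articles_per_month(sample))
```

A large "unparsed" count on the real data would suggest that the assumed formats need adjusting before drawing conclusions about coverage.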
License
CC-BY-SA
Who Can Use It
This dataset is intended for:
- Data Scientists and Machine Learning Engineers: For building and refining models to detect fake news.
- Researchers: Studying misinformation, propaganda, and text classification techniques.
- Students: Learning about natural language processing and data analysis in the context of real-world issues.
- Media Analysts: Investigating patterns in news reporting and content authenticity.
Dataset Name Suggestions
- Fake News Detection Dataset
- Genuine and Misinformation News Articles
- News Authenticity Dataset
- Textual Fake News Repository
Attributes
Original Data Source: fake-and-real-news-dataset