AG News Articles Classification

Entertainment & Media Consumption

Tags and Keywords

  • Business
  • News
  • NLP
  • Classification
  • Sentiment
  • Articles

Free

About

This dataset supports text classification research and suits a range of methods in the field. It is a large, well-balanced collection of news articles intended for studies on categorising articles, identifying sentiment, and analysing how different media outlets report news.

Columns

  • text: The actual content of the news article, provided as a string.
  • label: An integer representing the category or classification of the news article.
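
As a rough illustration of the two-column layout, the sketch below reads (text, label) pairs using only Python's standard library. It assumes the CSV files carry a header row naming the columns text and label, which the listing does not confirm. The read_examples helper and the two sample rows are hypothetical and are not taken from the dataset.

```python
import csv
import io


def read_examples(handle):
    """Yield (text, label) pairs from a CSV with 'text' and 'label' columns.

    Assumes a header row is present; adjust if the real files differ.
    """
    reader = csv.DictReader(handle)
    for row in reader:
        # The label column holds an integer category id.
        yield row["text"], int(row["label"])


# Hypothetical two-row sample standing in for the real file.
sample = io.StringIO(
    "text,label\n"
    '"Stocks rally as oil prices fall, analysts say",2\n'
    '"Team wins the final in extra time",1\n'
)
examples = list(read_examples(sample))
print(examples[0])
```

For a real file, the same function could be called with open("train.csv", newline="", encoding="utf-8") in place of the in-memory sample.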

Distribution

The dataset comprises a training set of 10,000 examples and a test set of 5,000 examples. The data is balanced: approximately 1,900 entries fall into each of the four label bins 0.00-0.30, 0.90-1.20, 1.80-2.10, and 2.70-3.00, which correspond to the integer labels 0, 1, 2, and 3. The data is provided as two CSV files, train.csv and test.csv.
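
One way to check the class balance described here is to count the labels directly. The sketch below is a minimal example under the assumption that the labels have already been loaded as integers. The label_distribution helper and the sample labels are hypothetical.

```python
from collections import Counter


def label_distribution(labels):
    """Return {label: (count, share)} sorted by label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: (n, n / total) for label, n in sorted(counts.items())}


# Hypothetical labels; with the real data, pass the label column instead.
labels = [0, 1, 2, 3, 0, 1, 2, 3]
dist = label_distribution(labels)
for label, (n, share) in dist.items():
    print(f"label {label}: {n} examples ({share:.1%})")
```

A balanced split should show near-equal shares across all four labels.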

Usage

This dataset can be used to:
  • Train a text classifier to automatically categorise news articles.
  • Develop systems capable of identifying positive and negative sentiment within news articles.
  • Conduct research into differences in how positive and negative news is reported by various media outlets.
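
To make the first use case concrete, here is a minimal sketch of a text classifier: a multinomial Naive Bayes model with add-one smoothing, written with the standard library only. This is one illustrative baseline, not a method the listing prescribes. The NaiveBayes class, the tokenizer, and the four training sentences are hypothetical, and the label ids 0 and 1 do not reflect the dataset's actual category mapping.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        self.totals = {k: sum(c.values()) for k, c in self.word_counts.items()}
        return self

    def predict(self, text):
        n_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        # Words never seen in training are ignored.
        tokens = [w for w in tokenize(text) if w in self.vocab]
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            score = math.log(n / n_docs)  # log prior
            for word in tokens:
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (self.totals[label] + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data; label ids here are arbitrary.
train_texts = [
    "stocks fall as markets react to rates",
    "shares rise after strong earnings report",
    "team wins championship game in overtime",
    "striker scores twice in cup final",
]
train_labels = [0, 0, 1, 1]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("markets rally as shares rise"))
print(model.predict("striker wins the game"))
```

With the real data, the text and label columns of train.csv would replace the toy lists, and test.csv could be used to measure accuracy.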

Coverage

The AG News dataset is a collection of over 1 million news articles, gathered from more than 2,000 news outlets by ComeToMyHead, an academic news search engine that has been active since July 2004. The collection period therefore begins around that date. Coverage is global, which makes the dataset a broad resource for news analysis.

License

CC0

Who Can Use It

This dataset is ideal for academic and research purposes. Intended users include researchers in:
  • Data mining (e.g., clustering, classification).
  • Information retrieval (e.g., ranking, search).
  • Applications involving XML, data compression, and data streaming.
  • Any other non-commercial activity related to text data analysis.

It is particularly suitable for those engaged in text classification research.

Dataset Name Suggestions

  • AG News Articles Classification
  • News Article Sentiment Dataset
  • Global News Text Corpus
  • Academic News Article Data

Attributes

Original Data Source: AG News (News articles)

Listing Stats

  • Views: 1
  • Downloads: 0
  • Listed: 16/06/2025
  • Region: Global
  • UDQS quality score: 5 / 5
  • Version: 1.0
