Opendatabay APP

Global Events Headline Dataset

Entertainment & Media Consumption

Tags and Keywords

News

Australia

NLP

Linguistics

Headlines

ABC

Free

About

This dataset is a large collection of news headlines published over nineteen years by the ABC (Australian Broadcasting Corporation), a reputable Australian news organisation. It serves as a historical record of significant events worldwide, with a particular emphasis on Australia, from early 2003 to the end of 2021. It covers the entire body of articles published on the ABC News website during this period, offering insight into major episodes such as the Afghanistan war, financial crises, multiple elections, ecological disasters, terrorism, the activities of famous individuals, and criminal activity.

Columns

  • publish_date: The date when the article was published, presented in yyyyMMdd format.
  • headline_text: The text of the headline, provided as lowercase English in ASCII encoding.
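
As a minimal sketch of reading these two columns, the snippet below parses the yyyyMMdd date into a Python date object. It assumes the CSV has a header row naming the columns exactly as listed above; the sample rows are made up for illustration and are not taken from the dataset.

```python
import csv
import io
from datetime import date, datetime


def parse_rows(csv_file):
    """Yield (date, headline) pairs from a CSV with the two columns above."""
    reader = csv.DictReader(csv_file)
    for row in reader:
        # publish_date is an 8-digit string such as "20030219" (yyyyMMdd).
        published = datetime.strptime(row["publish_date"], "%Y%m%d").date()
        yield published, row["headline_text"]


# Made-up sample rows for illustration only (assumed header row).
sample = io.StringIO(
    "publish_date,headline_text\n"
    "20030219,example headline about a council meeting\n"
    "20211231,another example headline\n"
)
rows = list(parse_rows(sample))
```

With the real file, the same function could be given an open file handle instead of the in-memory sample.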

Distribution

The dataset is provided in a single CSV file. It contains approximately 1.2 million unique headline entries. On average, around two hundred articles were published per day within the covered period.
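
One way to check the per-day average against a local copy is sketched below. The function name is hypothetical and the sample dates are made up. It averages only over days that have at least one headline, which is an assumption about how the figure is meant to be read.

```python
from collections import Counter


def average_per_day(publish_dates):
    """Mean number of headlines per day, over days that have any headline."""
    per_day = Counter(publish_dates)
    if not per_day:
        return 0.0
    return sum(per_day.values()) / len(per_day)


# Made-up yyyyMMdd values for illustration only.
sample_dates = ["20030219", "20030219", "20030220"]
average = average_per_day(sample_dates)
```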

Usage

This dataset is ideal for various applications, including:
  • Analysing historical news trends and event evolution.
  • Natural Language Processing (NLP) tasks such as text classification, sentiment analysis, and topic modelling.
  • Tracking the discourse around specific keywords and major global or Australian events.
  • Academic research into media representation and historical timelines.
  • Developing and testing news aggregation or summarisation algorithms.
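
Keyword tracking, for instance, could be sketched as a count of headlines per year that contain a given word. The function name and sample rows below are hypothetical. Matching is on whole whitespace-separated words, which is a simplifying assumption that works because the headlines are lowercase ASCII.

```python
from collections import Counter


def keyword_counts_by_year(rows, keyword):
    """Count headlines per year containing `keyword` as a whole word.

    `rows` is an iterable of (publish_date, headline_text) pairs, where
    publish_date is a yyyyMMdd string.
    """
    needle = keyword.lower()
    counts = Counter()
    for publish_date, headline in rows:
        if needle in headline.split():
            counts[int(publish_date[:4])] += 1
    return dict(sorted(counts.items()))


# Made-up rows for illustration only; not taken from the dataset.
sample_rows = [
    ("20030219", "war in example region continues"),
    ("20030301", "example war talks resume"),
    ("20040105", "warning issued for example town"),
    ("20040106", "example war ends"),
]
war_counts = keyword_counts_by_year(sample_rows, "war")
```

Splitting on whitespace keeps "warning" from being counted as "war". A real analysis would probably want stemming or phrase matching as well.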

Coverage

The dataset covers 19 February 2003 to 31 December 2021, just under nineteen years of headlines. Geographically, it focuses on Australia while also including a substantial amount of international news content.

License

CC0 (Creative Commons Zero)

Who Can Use It

  • Data Scientists and Machine Learning Engineers: For NLP model training and development.
  • Researchers and Academics: To study historical events, media coverage, and linguistic patterns over time.
  • Journalists and Media Analysts: For trend spotting and understanding past news narratives.
  • Students: For educational projects related to data analysis and current affairs.

Dataset Name Suggestions

  • Australian News Headline Archive
  • ABC News Headlines 2003-2021
  • Global Events Headline Dataset
  • Nineteen Years of News Headlines

Attributes

Original Data Source: A Million News Headlines

Listing Stats

  • Views: 2
  • Downloads: 0
  • Listed: 05/06/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0