Global Events Headline Dataset
Entertainment & Media Consumption
Free
About
This dataset provides a vast collection of news headlines published over a period of nineteen years, sourced from ABC (Australian Broadcasting Corporation), a reputable Australian news organisation. It serves as a historical record of significant events globally, with a particular emphasis on Australia, spanning from early 2003 to the end of 2021. The dataset captures the entire body of articles published on the abcnews website during this timeframe, offering insight into major episodes such as the Afghanistan war, financial crises, multiple elections, ecological disasters, terrorism, activities of famous individuals, and criminal activity.
Columns
- publish_date: The date when the article was published, presented in yyyyMMdd format.
- headline_text: The textual content of the headline, provided in ASCII, English, and lowercase.
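As a minimal sketch of how the two columns might be read, the snippet below parses the yyyyMMdd date into a Python date using only the standard library. The two sample rows are invented for illustration and are not taken from the dataset, and the helper name read_headlines is hypothetical. It also assumes the CSV has a header row naming the two columns.

```python
import csv
import io
from datetime import date, datetime

# Two made-up rows in the documented layout; they are NOT real dataset entries.
# Assumption: the file starts with a header row naming the two columns.
SAMPLE_CSV = """publish_date,headline_text
20030219,example headline about a council vote
20211231,example headline about new year celebrations
"""


def read_headlines(handle):
    """Yield (date, headline) pairs from a CSV with the two documented columns."""
    for row in csv.DictReader(handle):
        # yyyyMMdd corresponds to the strptime pattern %Y%m%d.
        published = datetime.strptime(row["publish_date"], "%Y%m%d").date()
        yield published, row["headline_text"]


rows = list(read_headlines(io.StringIO(SAMPLE_CSV)))
for published, headline in rows:
    print(published.isoformat(), headline)
```

To read the real file, the io.StringIO object could be swapped for an open file handle, for example one opened with newline="" as the csv module recommends.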
Distribution
The dataset is provided in a single CSV file. It contains approximately 1.2 million unique headline entries. On average, around two hundred articles were published per day within the covered period.
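The daily average can be sanity-checked with simple arithmetic. The sketch below assumes the coverage dates stated in this listing (19 February 2003 to 31 December 2021) and takes the approximate figure of 1.2 million entries at face value. Because that count is rounded, the result is only a rough estimate, and it lands somewhat below two hundred.

```python
from datetime import date

# Coverage dates as stated in this listing.
first_day = date(2003, 2, 19)
last_day = date(2021, 12, 31)

# Approximate entry count from this listing; the exact count is not given here.
approx_entries = 1_200_000

# Add 1 so that both the first and the last day are counted.
days_covered = (last_day - first_day).days + 1
average_per_day = approx_entries / days_covered

print(f"{days_covered} days covered, about {average_per_day:.0f} headlines per day")
```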
Usage
This dataset is ideal for various applications, including:
- Analysing historical news trends and event evolution.
- Natural Language Processing (NLP) tasks such as text classification, sentiment analysis, and topic modelling.
- Tracking the discourse around specific keywords and major global or Australian events.
- Academic research into media representation and historical timelines.
- Developing and testing news aggregation or summarisation algorithms.
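As an illustration of the keyword-tracking use case above, the following sketch counts how many headlines per year contain a given word. The sample rows are made up and the function name keyword_counts_by_year is hypothetical. Whole-word matching on whitespace is one simple choice among several, and it relies on the headlines already being lowercase.

```python
from collections import Counter

# Made-up (publish_date, headline_text) pairs for illustration only.
SAMPLE_ROWS = [
    ("20030219", "election date announced"),
    ("20030220", "storm damages homes"),
    ("20070312", "election campaign begins"),
    ("20071124", "voters head to polls in federal election"),
]


def keyword_counts_by_year(rows, keyword):
    """Count headlines per year that contain `keyword` as a whole word."""
    counts = Counter()
    for publish_date, headline in rows:
        # Headlines are documented as lowercase, so no case folding is done.
        if keyword in headline.split():
            # The first four characters of yyyyMMdd are the year.
            counts[int(publish_date[:4])] += 1
    return counts


counts = keyword_counts_by_year(SAMPLE_ROWS, "election")
for year in sorted(counts):
    print(year, counts[year])
```

The same function accepts any iterable of (publish_date, headline_text) pairs, so rows streamed from the CSV file could be passed in directly.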
Coverage
The dataset covers a time range from 19th February 2003 to 31st December 2021, accounting for nineteen years of headlines. Geographically, it focuses on Australia while also including a substantial amount of international news content.
License
CC0
Who Can Use It
- Data Scientists and Machine Learning Engineers: For NLP model training and development.
- Researchers and Academics: To study historical events, media coverage, and linguistic patterns over time.
- Journalists and Media Analysts: For trend spotting and understanding past news narratives.
- Students: For educational projects related to data analysis and current affairs.
Dataset Name Suggestions
- Australian News Headline Archive
- ABC News Headlines 2003-2021
- Global Events Headline Dataset
- Nineteen Years of News Headlines
Attributes
Original Data Source: A Million News Headlines