British Headlines Historical Dataset

Category: News & Media Articles

Tags and Keywords: News, UK, Politics, Headlines, Archive

Free

About

This dataset aggregates news headlines retrieved from Google News UK between 1 January 2010 and 5 May 2024. Its approximately 69,600 entries span fourteen topics, including Politics, Economy, Health, and Technology, and record how media coverage and public interest shifted over the period. The archive is a resource for historical analysis, sentiment monitoring, and studying the changing focus of British journalism over fourteen years.

Columns

  • title: The headline of the news article (String). There are roughly 65,400 unique titles, so roughly 4,200 entries repeat a headline found elsewhere in the data, through either duplication or several outlets covering the same story.
  • published: The publication date and time of the article (String). The data covers the period from 1 January 2010 to 5 May 2024.
  • source: The publisher or origin of the news item (String). Includes major outlets such as the BBC (18% of entries) and The Independent (8%), alongside a long tail of other sources.
  • category: The topical classification of the article (String). Contains 14 unique categories: Police (13%), Crime (10%), Politics, Travel, Sports, Education, Economy, Entertainment, Technology, Culture, International, Science, Health, and Environment.
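
The column descriptions above can be checked against the file with a short script. The following is a minimal sketch using only the Python standard library. The sample rows are made-up placeholders rather than real records, the timestamp layout shown in them is an assumption (the listing does not document it), and the helper names load_rows and share_by are hypothetical.

```python
import csv
import io
from collections import Counter

# Column names as given in the listing; their order in the real file is not documented.
EXPECTED_COLUMNS = ["title", "published", "source", "category"]

# Made-up placeholder rows standing in for news_uk_dataset.csv.
# The timestamp layout is an assumption, not taken from the listing.
SAMPLE_CSV = """title,published,source,category
Example headline A,2010-01-01 09:00:00,BBC,Politics
Example headline B,2015-06-12 14:30:00,The Independent,Crime
Example headline A,2015-06-13 08:15:00,BBC,Politics
"""


def load_rows(handle):
    """Read the CSV from an open text handle and check that all expected columns exist."""
    reader = csv.DictReader(handle)
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)


def share_by(rows, column):
    """Return {value: fraction of rows} for one column, most frequent value first."""
    counts = Counter(row[column] for row in rows)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.most_common()}


rows = load_rows(io.StringIO(SAMPLE_CSV))
unique_titles = len({row["title"] for row in rows})
source_share = share_by(rows, "source")
category_share = share_by(rows, "category")
print(unique_titles, source_share, category_share)
```

To run this on the real data, replace the io.StringIO sample with open("news_uk_dataset.csv", newline="", encoding="utf-8"); the UTF-8 encoding is also an assumption. If the listing is accurate, the BBC share should come out near 0.18 and the Police share near 0.13.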

Distribution

The dataset is provided as a single CSV file (news_uk_dataset.csv) of approximately 9.83 MB. It contains 69,600 valid records, with no missing or mismatched values in any column. Updates are expected annually.
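
The record count and the claim of zero missing values are easy to verify after download. This is a minimal sketch under stated assumptions: the helper count_missing is hypothetical, "missing" is taken to mean an empty or whitespace-only cell, and the two sample rows are placeholders (one deliberately incomplete) rather than real data.

```python
import csv
import io


def count_missing(rows, columns):
    """Count empty or whitespace-only cells per column."""
    missing = {column: 0 for column in columns}
    for row in rows:
        for column in columns:
            if not (row.get(column) or "").strip():
                missing[column] += 1
    return missing


# Placeholder rows; the second one deliberately lacks a 'published' value.
SAMPLE = io.StringIO(
    "title,published,source,category\n"
    "Placeholder headline,2012-03-04,BBC,Health\n"
    "Another placeholder,,BBC,Economy\n"
)

rows = list(csv.DictReader(SAMPLE))
report = count_missing(rows, ["title", "published", "source", "category"])
print(len(rows), report)
```

Run against the downloaded file, the listing implies a row count of about 69,600 and a report of zeros for every column.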

Usage

  • Sentiment Analysis: Evaluate the emotional tone of headlines regarding specific topics like the Economy or Health over time.
  • Trend Tracking: Analyse the frequency of keywords to determine when specific subjects (e.g., Politics, Crime) peaked in the news cycle.
  • Source Bias Research: Compare how different publishers (e.g., BBC vs. The Independent) frame similar events based on headline phrasing.
  • Media Landscape Mapping: Visualise the distribution of news categories to understand the thematic priorities of UK news outlets.
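
As an illustration of the trend-tracking use case, the sketch below counts headlines containing a keyword per year. It makes several assumptions: the published string contains a four-digit year starting with 20 somewhere in it (the exact layout is not documented), matching is a plain case-insensitive substring test, and the function names and sample rows are hypothetical placeholders.

```python
import re
from collections import Counter

# Assumption: the 'published' string contains a four-digit year such as 2019 somewhere in it.
YEAR_PATTERN = re.compile(r"\b(20\d{2})\b")


def year_of(published):
    """Extract the first four-digit 20xx year from a timestamp string, or None."""
    match = YEAR_PATTERN.search(published)
    return int(match.group(1)) if match else None


def keyword_counts_by_year(rows, keyword):
    """Count headlines containing the keyword (case-insensitive substring) per year."""
    counts = Counter()
    needle = keyword.lower()
    for row in rows:
        year = year_of(row["published"])
        if year is not None and needle in row["title"].lower():
            counts[year] += 1
    return dict(sorted(counts.items()))


# Placeholder rows, not real records; two timestamp layouts are shown on purpose.
rows = [
    {"title": "Placeholder: election date announced", "published": "2019-10-29 18:00:00"},
    {"title": "Placeholder: Election results declared", "published": "Fri, 13 Dec 2019 06:00:00 GMT"},
    {"title": "Placeholder: rail fares to rise", "published": "2021-01-02 09:30:00"},
]

trend = keyword_counts_by_year(rows, "election")
print(trend)
```

Because yearly volume in this dataset grows over time, raw counts are best divided by the total number of headlines in each year before comparing peaks.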

Coverage

  • Geographic Scope: United Kingdom (News sources and content focus).
  • Time Range: 1 January 2010 to 5 May 2024.
  • Demographic/Topical Scope: General public interest covering 14 distinct categories ranging from hard news (Politics, Crime, Economy) to lifestyle and specialised topics (Travel, Culture, Technology, Science).
  • Data Consistency: Record volume generally increases over time, from roughly 2,600 records in 2010 to over 5,400 in the 2023-2024 period.
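
The stated time range and the growth in yearly volume can be checked by parsing the published column. The sketch below is hedged accordingly: the candidate timestamp formats are guesses, since the listing describes the column only as a string, and the helper names and sample values are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Assumed candidate layouts; adjust after inspecting the real 'published' values.
CANDIDATE_FORMATS = (
    "%Y-%m-%d %H:%M:%S",
    "%Y-%m-%d",
    "%a, %d %b %Y %H:%M:%S %Z",
)


def parse_published(text):
    """Try each candidate format in turn; return a datetime or None."""
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
    return None


def yearly_volume(published_values):
    """Return (records per year, earliest, latest, number of unparsed values)."""
    per_year = Counter()
    unparsed = 0
    earliest = latest = None
    for text in published_values:
        moment = parse_published(text)
        if moment is None:
            unparsed += 1
            continue
        per_year[moment.year] += 1
        if earliest is None or moment < earliest:
            earliest = moment
        if latest is None or moment > latest:
            latest = moment
    return dict(sorted(per_year.items())), earliest, latest, unparsed


# Placeholder values, not real records.
sample = ["2010-01-01 00:05:00", "2010-07-19", "2024-05-05 21:00:00", "not a date"]
per_year, earliest, latest, unparsed = yearly_volume(sample)
print(per_year, earliest, latest, unparsed)
```

On the real file, a non-zero unparsed count would indicate that the assumed formats need adjusting before the yearly totals are compared with the figures above.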

License

CC0: Public Domain

Who Can Use It

  • Data Scientists: For training Natural Language Processing (NLP) models and text classification algorithms.
  • Journalists and Researchers: For historical retrospective studies and media auditing.
  • Sociologists: To study the prevalence of crime or political discourse in public media.
  • Students: As a clean dataset with a high usability rating (10.00) for learning data visualisation and text mining techniques.

Dataset Name Suggestions

  • UK Google News Archive (2010-2024)
  • British Headlines Historical Dataset
  • UK Media Topic & Source Tracker
  • Google News UK: 14-Year Chronicle
  • Comprehensive UK News Headlines Database

Attributes

Listing Stats

  • Views: 1
  • Downloads: 0
  • Listed: 05/12/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0