Daily BBC News Text Dataset

Entertainment & Media Consumption

Tags and Keywords

News

NLP

Multiclass

Text

Free

About

This dataset is designed for analysing news trends, performing sentiment analysis, and studying the impact of specific events over time. It offers valuable insights for those interested in media coverage, news propagation, and shifts in public interest across various topics. The dataset is particularly useful for tasks involving natural language processing (NLP), multiclass classification, and text pre-processing.

Columns

  • title: The headline or title of the news article.
  • pubDate: The date and time when the news article was published.
  • guid: A globally unique identifier for the news article, typically presented as a URL.
  • link: The direct URL link to access the full news article online.
  • description: A concise summary or brief overview of the news article content.
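
A minimal sketch of reading these five columns with Python's standard library. The sample row is an invented placeholder, not taken from the dataset, and the pubDate format (an RSS-style date string) is an assumption; the listing does not state how dates are encoded. The function name load_articles is hypothetical.

```python
import csv
import io
from email.utils import parsedate_to_datetime

# Placeholder row invented for illustration; it is not taken from bbc_news.csv.
SAMPLE_CSV = """title,pubDate,guid,link,description
Example headline,"Mon, 07 Mar 2022 08:01:56 GMT",https://example.com/a,https://example.com/a,Example summary.
"""

def load_articles(handle):
    """Read rows into dicts and parse pubDate.

    Assumes pubDate is an RSS-style (RFC 2822) date string; adjust the
    parsing step if the real file uses another format.
    """
    articles = []
    for row in csv.DictReader(handle):
        row["pubDate"] = parsedate_to_datetime(row["pubDate"])
        articles.append(row)
    return articles

articles = load_articles(io.StringIO(SAMPLE_CSV))
```

For the real file, the same function could be called on a handle opened with open("bbc_news.csv", newline="", encoding="utf-8"); the UTF-8 encoding is likewise an assumption.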

Distribution

The dataset is distributed as a single CSV file, bbc_news.csv, with 35,860 rows and 5 columns. It contains 33,889 unique descriptions, 32,335 unique links, 33,124 unique titles, and 33,081 unique GUIDs. Because each of these counts is lower than the row count, some values repeat across rows.
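
The row, column, and unique-value figures above could be checked with a short script along these lines. This is a sketch on invented placeholder rows; the function name summarise is hypothetical, and it simply counts distinct strings per column without any normalisation.

```python
import csv
import io

# Placeholder rows invented for illustration; they are not taken from bbc_news.csv.
SAMPLE_CSV = """title,pubDate,guid,link,description
Headline A,date-1,guid-1,link-1,Summary A
Headline A,date-2,guid-2,link-2,Summary A
Headline B,date-3,guid-3,link-2,Summary B
"""

def summarise(handle):
    """Return (row count, column count, {column: number of distinct values})."""
    reader = csv.DictReader(handle)
    rows = list(reader)
    columns = reader.fieldnames or []
    unique = {col: len({row[col] for row in rows}) for col in columns}
    return len(rows), len(columns), unique

n_rows, n_cols, unique_counts = summarise(io.StringIO(SAMPLE_CSV))
```

Run against the real file, a unique count below the row count for guid or link would point to repeated articles that may need de-duplicating before modelling.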

Usage

This dataset is ideally suited for:
  • Analysing patterns and shifts in news reporting.
  • Conducting sentiment analysis on news article content.
  • Investigating the influence of particular events over time.
  • Developing and testing models for multiclass classification.
  • Tasks requiring text pre-processing for machine learning applications.
  • Research into media coverage and public engagement with news.
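
As an illustration of the text pre-processing step mentioned above, the following sketch lowercases headlines or descriptions, splits them into simple word tokens, and counts term frequencies. The tokenisation rule and the names tokenize and term_counts are assumptions chosen for brevity; a real pipeline would likely add stop-word removal or a dedicated NLP library.

```python
import re
from collections import Counter

# Letters/digits, optionally followed by an apostrophe suffix (e.g. "uk's").
TOKEN_RE = re.compile(r"[a-z0-9]+(?:'[a-z]+)?")

def tokenize(text):
    """Lowercase the text and return its word-like tokens."""
    return TOKEN_RE.findall(text.lower())

def term_counts(texts):
    """Count token occurrences across an iterable of strings."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts
```

The resulting counts could feed a bag-of-words baseline for classification or a keyword-based trend analysis over pubDate.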

Coverage

The data primarily spans 07 March 2022 to 03 July 2024. However, the full collection includes a wider range of publication dates, with a handful of articles dating back to 2013. The distribution of articles by date range (dates in MM/DD/YYYY format) is as follows:
  • 08/30/2013 - 03/16/2014: 1 article
  • 06/16/2017 - 12/31/2017: 1 article
  • 12/31/2017 - 07/17/2018: 1 article
  • 08/17/2019 - 03/02/2020: 1 article
  • 09/16/2020 - 04/02/2021: 2 articles
  • 10/17/2021 - 05/03/2022: 2,477 articles
  • 05/03/2022 - 11/17/2022: 8,049 articles
  • 11/17/2022 - 06/03/2023: 7,334 articles
  • 06/03/2023 - 12/18/2023: 8,933 articles
  • 12/18/2023 - 07/04/2024: 9,061 articles

The dataset covers news articles on a global scale.
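
A date-range breakdown like the one above can be produced by splitting the span between the earliest and latest publication dates into equal-width bins. The sketch below assumes that is how the ranges were derived (the listing does not say) and uses the hypothetical function name bin_by_date.

```python
from datetime import datetime, timedelta

def bin_by_date(dates, n_bins=10):
    """Count dates in n_bins equal-width ranges between min and max.

    Returns (edges, counts); the latest date falls in the last bin.
    """
    lo, hi = min(dates), max(dates)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for d in dates:
        if width == timedelta(0):
            idx = 0  # all dates identical: put everything in the first bin
        else:
            idx = min(int((d - lo) / width), n_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts
```

With parsed pubDate values as input, the returned edges give the range boundaries and counts gives the number of articles per range.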

License

CC-BY

Who Can Use It

This dataset is particularly beneficial for:
  • Researchers: For academic studies on media, public opinion, and linguistic analysis.
  • Data Scientists: For developing predictive models, text analytics, and machine learning applications.
  • Journalists: For investigative reporting, trend analysis, and understanding news propagation.
  • Individuals interested in natural language processing (NLP) and text-based data projects.

Dataset Name Suggestions

  • BBC News Articles Collection
  • BBC News Headlines & Summaries Dataset
  • Daily BBC News Text Data
  • BBC News Article Archive

Attributes

Original Data Source: BBC News Articles

Listing Stats

VIEWS

3

DOWNLOADS

0

LISTED

11/06/2025

REGION

GLOBAL

UDQS QUALITY

5 / 5

VERSION

1.0
