Opendatabay APP

Arabic Soccer News Corpus

Entertainment & Media Consumption

Tags and Keywords

News

Sports

Text

Football

NLP

Free

About

This dataset contains Arabic-language news articles about the Saudi MBS football league, a competition that has seen heightened rivalry and significant investment in recent years. It is intended to support analytical tasks such as sentiment analysis, predictive modelling, and clustering of news content, so that users can examine how competition within the league is described and perceived in the media. The articles cover a defined period and report on events, team activities, and match-related discussions.

Columns

  • writer: The individual or organisation responsible for writing the news article.
  • location: The geographical location associated with the news piece, such as Riyadh or Jeddah.
  • date: The date the news was published, formatted as yyyy-mm-dd.
  • time: The time the news was published, formatted as hh:mm.
  • news: The full textual content of the news article.
  • title: The headline or title of the news article.
  • class: A categorical indicator for the type of news:
    • 0: Informative news specifically about teams.
    • 1: News directly related to a football match.
    • 2: General news not specific to particular teams.
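As a minimal sketch of how these columns might be read, the Python below parses rows with the standard library and combines date and time into one timestamp. It assumes the CSV header uses exactly the column names listed above; the two sample rows, the `load_articles` helper, and the short class labels are illustrative and not part of the dataset.

```python
import csv
import io
from datetime import datetime

# Short labels paraphrased from the class descriptions above (illustrative names).
CLASS_LABELS = {0: "team news", 1: "match news", 2: "general news"}

def load_articles(file_obj):
    """Read rows from a CSV file object and return a list of dicts with typed fields."""
    articles = []
    for row in csv.DictReader(file_obj):
        # date is yyyy-mm-dd and time is hh:mm, per the column descriptions.
        published = datetime.strptime(f"{row['date']} {row['time']}", "%Y-%m-%d %H:%M")
        class_id = int(row["class"])
        articles.append({
            "writer": row["writer"],
            "location": row["location"],
            "published": published,
            "title": row["title"],
            "news": row["news"],
            "class_id": class_id,
            "class_label": CLASS_LABELS[class_id],
        })
    return articles

# Two made-up rows standing in for the real file (assumption: header matches the column list).
SAMPLE_CSV = """writer,location,date,time,news,title,class
كاتب أول,الرياض,2020-12-12,18:30,نص الخبر الأول,عنوان أول,1
كاتب ثان,جدة,2021-01-25,09:05,نص الخبر الثاني,عنوان ثان,0
"""

articles = load_articles(io.StringIO(SAMPLE_CSV))
print(articles[0]["published"], articles[0]["class_label"])
```

For the downloaded file, the same function could be given a handle opened with `encoding="utf-8"` and `newline=""`; the actual file name and encoding are not stated in the listing and would need to be checked.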

Distribution

This dataset is provided as a CSV (comma-separated values) file. The listing does not state the exact number of rows. The articles date predominantly from late 2020 to early 2021, and the reported counts for individual days or multi-day periods range from 26 to 752 records. The listing also references a total of 1996 values, which gives a rough indication of the dataset's overall size.
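Because the row count is not stated, one way to establish it is to count records per publication date after download. The sketch below does this for a list of yyyy-mm-dd strings; the sample dates and the `records_per_day` helper are hypothetical and only illustrate the approach.

```python
from collections import Counter
from datetime import date

def records_per_day(date_strings):
    """Count how many articles fall on each yyyy-mm-dd date, sorted chronologically."""
    counts = Counter(date.fromisoformat(d) for d in date_strings)
    return dict(sorted(counts.items()))

# Made-up values standing in for the dataset's date column.
sample_dates = ["2020-12-13", "2020-12-12", "2020-12-12", "2021-01-25"]

per_day = records_per_day(sample_dates)
total = sum(per_day.values())

for day, n in per_day.items():
    print(day.isoformat(), n)
print("total", total)
```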

Usage

This dataset is ideal for:
  • Sentiment analysis to gauge public and media sentiment towards MBS league teams and events.
  • Predictive modelling to forecast the class or type of news.
  • Text clustering to identify common themes or narratives within Saudi football news.
  • Analysing news trends, such as identifying the most frequent words per month or determining locations with the highest news coverage.
  • Developing Natural Language Processing (NLP) applications for Arabic sports content.
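The trend analyses mentioned above (most frequent words per month, locations with the most coverage) can be sketched with the standard library. This is a rough illustration under stated assumptions: the tokenizer simply matches runs of Arabic letters and does no stop-word removal or normalisation, and the sample rows and helper names are invented for the example.

```python
import re
from collections import Counter, defaultdict

# Matches runs of Arabic letters (U+0621 to U+064A); a deliberately crude tokenizer.
ARABIC_WORD = re.compile(r"[\u0621-\u064A]+")

def top_words_per_month(rows, n=3):
    """Map 'yyyy-mm' to the n most common Arabic tokens in that month's articles."""
    by_month = defaultdict(Counter)
    for row in rows:
        month = row["date"][:7]  # 'yyyy-mm' prefix of a yyyy-mm-dd date
        by_month[month].update(ARABIC_WORD.findall(row["news"]))
    return {month: counter.most_common(n) for month, counter in by_month.items()}

def top_locations(rows, n=3):
    """Return the n locations with the most articles, as (location, count) pairs."""
    return Counter(row["location"] for row in rows).most_common(n)

# Made-up rows using the dataset's column names.
sample_rows = [
    {"date": "2020-12-12", "location": "الرياض", "news": "فاز الهلال على النصر"},
    {"date": "2020-12-20", "location": "الرياض", "news": "الهلال يستعد للمباراة"},
    {"date": "2021-01-05", "location": "جدة", "news": "الاتحاد يواجه الأهلي"},
]

print(top_words_per_month(sample_rows, n=1))
print(top_locations(sample_rows, n=1))
```

On real articles, the top tokens would likely be function words, so a stop-word list and some normalisation would be needed before the counts become informative.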

Coverage

The dataset primarily covers Saudi Arabian football news related to the MBS league. The news articles were published between 12 December 2020 and 25 January 2021. Locations include Riyadh and Jeddah, among others, and the articles come from a variety of writers. The listing includes no notes on data availability for particular demographic groups, since the focus is on news content.
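The stated publication window can be checked against the downloaded file. The sketch below flags dates outside 12 December 2020 to 25 January 2021; the sample values and the `outside_coverage` helper are hypothetical.

```python
from datetime import date

# Publication window as stated in the listing.
COVERAGE_START = date(2020, 12, 12)
COVERAGE_END = date(2021, 1, 25)

def outside_coverage(date_strings):
    """Return the yyyy-mm-dd strings that fall outside the stated coverage window."""
    return [
        d for d in date_strings
        if not COVERAGE_START <= date.fromisoformat(d) <= COVERAGE_END
    ]

# Made-up values: the first and last are deliberately out of range.
sample = ["2020-12-11", "2020-12-12", "2021-01-25", "2021-02-01"]

window_days = (COVERAGE_END - COVERAGE_START).days + 1  # inclusive length of the window
print(window_days)
print(outside_coverage(sample))  # prints ['2020-12-11', '2021-02-01']
```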

License

CC BY-NC-SA

Who Can Use It

This dataset is suitable for:
  • Data scientists and machine learning engineers working on NLP, sentiment analysis, or classification tasks in Arabic.
  • Sports analysts and researchers interested in media coverage and trends within Saudi football.
  • Media companies looking to understand content performance and audience engagement with sports news.
  • Academics studying Arabic text, media, or sports sociology.

Dataset Name Suggestions

  • Saudi Football News
  • MBS League News Articles
  • Arabic Soccer News Corpus
  • Saudi Pro League Media Data
  • Middle East Football News

Attributes

Original Data Source: Saudi Soccer News - Arabic

Listing Stats

  • Views: 1
  • Downloads: 0
  • Listed: 17/06/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
