Opendatabay APP

Daryo Uz News Classification Data

News & Media Articles

Tags and Keywords

News

Text

Classification

Daryo

Asia

Free

About

This tabular dataset compiles 175,217 news stories scraped from the Daryo.uz news portal. Each record holds the headline, the full article text, and the category assigned to the story. It is intended as training data for Natural Language Processing (NLP) models and also offers insight into regional news coverage and language patterns. The dataset is expected to receive regular updates.

Columns

The structure consists of three primary fields:
  • title: The headline of the news story (Uzbek label: "Sarlavhasi", meaning "headline").
  • content: The full body text of the news article (Uzbek label: "Yangilik matni", meaning "news text"). Note that 596 records currently have missing values in this field.
  • target: The classification category of the news item (Uzbek label: "Yangilik toifasi", meaning "news category").
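
Because some records lack body text, it can help to separate them out before training. The following is a minimal sketch using only the Python standard library. It assumes the CSV has a header row with the three column names listed above and that a missing value appears as an empty field; neither detail is confirmed by the listing. The helper name split_by_content and the sample rows are hypothetical placeholders, not real records.

```python
import csv
import io


def split_by_content(rows):
    """Separate records that have body text from records whose content is missing."""
    complete, missing = [], []
    for row in rows:
        # Assumption: a missing value shows up as an empty (or whitespace-only) field.
        text = (row.get("content") or "").strip()
        (complete if text else missing).append(row)
    return complete, missing


# Tiny in-memory stand-in for daryo_data.csv (placeholder values, not real records).
sample = io.StringIO(
    "title,content,target\n"
    "Headline A,Body text A,mahalliy\n"
    "Headline B,,dunyo\n"
    "Headline C,Body text C,dunyo\n"
)
complete, missing = split_by_content(csv.DictReader(sample))
print(len(complete), len(missing))
```

For the real file, the same function could be fed csv.DictReader(open("daryo_data.csv", newline="", encoding="utf-8")); the UTF-8 encoding is likewise an assumption.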

Distribution

The dataset is provided as a single CSV file named daryo_data.csv, approximately 271.94 MB in size. It contains 175,217 records across 3 columns. The target column has seven unique categories. The categories are unevenly distributed: 'mahalliy' (local) is the most common at 42%, followed by 'dunyo' (world) at 27%.
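
With the listed shares, 'mahalliy' would account for roughly 73,600 records and 'dunyo' for roughly 47,300, leaving about 31% spread across the other five categories. A short sketch of how the imbalance could be measured and offset is shown here. It uses the common inverse-frequency heuristic (total / (number of classes × class count)) as one possible weighting, not a method prescribed by the listing. The function names and the label "other_category" are hypothetical; the listing names only two of the seven categories.

```python
from collections import Counter


def class_shares(labels):
    """Return each label's share of the total, as a fraction between 0 and 1."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def balanced_weights(labels):
    """Inverse-frequency weights: rarer classes receive larger weights."""
    counts = Counter(labels)
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


# Toy labels for illustration only; "other_category" is a placeholder name.
labels = ["mahalliy"] * 4 + ["dunyo"] * 3 + ["other_category"] * 3
shares = class_shares(labels)
weights = balanced_weights(labels)
print(shares)
print(weights)
```

The resulting weights could be passed to a classifier's loss function, or the shares could simply be used to choose a majority-class baseline to compare against.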

Usage

This data is ideally suited for:
  • Developing and evaluating text classification algorithms.
  • Training NLP models for sentiment analysis or topic modeling in regional news.
  • Studying linguistic features and vocabulary common in contemporary news media.
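
When evaluating a classifier on unevenly distributed categories, a stratified split keeps each category's share roughly equal in the training and test sets. The following is a minimal standard-library sketch, assuming records are dictionaries with a "target" key as in the column list above. The function name stratified_split, the 20% test fraction, and the toy rows are illustrative assumptions.

```python
import random
from collections import Counter, defaultdict


def stratified_split(rows, label_key="target", test_fraction=0.2, seed=0):
    """Split rows into train and test lists while preserving per-label proportions."""
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for label in sorted(by_label):
        group = by_label[label][:]
        rng.shuffle(group)
        # Hold out at least one example per label, unless the label has only one.
        n_test = max(1, round(len(group) * test_fraction)) if len(group) > 1 else 0
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


# Toy rows for illustration only (placeholder titles, not real records).
rows = [{"title": f"local {i}", "target": "mahalliy"} for i in range(10)]
rows += [{"title": f"world {i}", "target": "dunyo"} for i in range(5)]
train, test = stratified_split(rows)
print(len(train), len(test), Counter(r["target"] for r in test))
```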

Coverage

The dataset covers only articles published on the Daryo.uz website. As the categories 'mahalliy' (local) and 'dunyo' (world) indicate, the articles cover both local events and global affairs, all reported by a single source based in Asia. The data is intended to be updated weekly to keep it topically relevant.

License

CC0: Public Domain

Who Can Use It

Intended users include data scientists who need large text corpora for deep learning, academic researchers studying media trends or language use, and intermediate to advanced data professionals looking for a text classification task.

Dataset Name Suggestions

  • Daryo Uz News Classification Data
  • Asian News Article Corpus
  • Tabular News Text for NLP
  • Weekly Daryo Uz News Archive

Attributes

Original Data Source: Daryo Uz News Classification Data

Listing Stats

VIEWS

0

DOWNLOADS

0

LISTED

23/11/2025

REGION

GLOBAL

UDQS QUALITY (Universal Data Quality Score)

5 / 5

VERSION

1.0
