Opendatabay APP

Russian News Clickbait Classification Data

News & Media Articles

Tags and Keywords

Russian

Clickbait

NLP

Headlines

Classification

Free

About

The dataset is a collection of Russian news headlines, each labelled as either clickbait or non-clickbait. It was compiled to support research and development in Natural Language Processing (NLP), in particular building and testing text classification models. The headlines were gathered over the several months preceding the dataset's creation.

Columns

  • titles: The raw text of the news headlines, primarily in Russian. This column contains 3198 entries.
  • target: The binary classification label indicating whether the headline is clickbait (1) or non-clickbait (0).

Distribution

The data is provided as a single CSV file named titles_data.csv. The file is approximately 397.96 kB and contains two columns, titles and target. The dataset is expected to be updated annually.
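
As a minimal sketch of reading a file with this layout, the snippet below parses the two columns and counts the labels using only the Python standard library. It assumes the CSV has a header row with the exact names titles and target, and that the file is UTF-8 encoded; neither is confirmed by this listing. The function name load_headlines and the three sample rows are hypothetical and invented for illustration; they are not taken from the dataset.

```python
import csv
import io
from collections import Counter


def load_headlines(handle):
    """Read (title, label) pairs from an open CSV handle.

    Assumes a header row with the columns `titles` and `target`.
    """
    reader = csv.DictReader(handle)
    rows = []
    for record in reader:
        title = (record.get("titles") or "").strip()
        label = (record.get("target") or "").strip()
        # The listing describes a binary label: 1 = clickbait, 0 = non-clickbait.
        if not title or label not in {"0", "1"}:
            raise ValueError(f"unexpected row near line {reader.line_num}: {record!r}")
        rows.append((title, int(label)))
    return rows


# In-memory stand-in for titles_data.csv; these rows are invented examples.
SAMPLE_CSV = (
    "titles,target\n"
    '"Вы не поверите, что произошло дальше",1\n'
    "Центробанк сохранил ключевую ставку,0\n"
    "Шок! Врачи скрывают этот простой способ,1\n"
)

rows = load_headlines(io.StringIO(SAMPLE_CSV))
label_counts = Counter(label for _, label in rows)
print(len(rows), dict(label_counts))

# With the real file (encoding is an assumption):
# with open("titles_data.csv", encoding="utf-8", newline="") as f:
#     rows = load_headlines(f)
```

Checking the label values while loading makes a malformed or mislabelled row fail early rather than silently skewing the class counts.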

Usage

This data is suited to training machine learning models for text analysis. Typical applications include building text classification models, experimenting with NLP techniques, and studying linguistic patterns associated with sensationalism in news media. It can also be used in projects built on Hugging Face models and in general data analytics.
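
To illustrate the text classification use case, here is a minimal baseline sketch: a word-level multinomial Naive Bayes classifier with add-one smoothing, written with the standard library only. This is one possible starting point chosen for brevity, not a method prescribed by the dataset's author. The names tokenize, train_naive_bayes and predict are hypothetical, and the six training headlines are invented for illustration rather than drawn from titles_data.csv.

```python
import math
import re
from collections import Counter

TOKEN_RE = re.compile(r"\w+")  # \w matches Cyrillic letters for str patterns


def tokenize(title):
    """Lowercase a headline and split it into word tokens."""
    return TOKEN_RE.findall(title.lower())


def train_naive_bayes(examples):
    """Collect per-class word counts from (title, label) pairs, labels 0 or 1.

    Assumes both classes appear at least once in `examples`.
    """
    word_counts = {0: Counter(), 1: Counter()}
    doc_counts = Counter()
    for title, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(title))
    vocabulary = set(word_counts[0]) | set(word_counts[1])
    return word_counts, doc_counts, vocabulary


def predict(model, title):
    """Return the label with the higher log-posterior (add-one smoothing)."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in (0, 1):
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)
        for token in tokenize(title):
            if token in vocabulary:  # skip words never seen in training
                smoothed = (word_counts[label][token] + 1) / (total_words + len(vocabulary))
                score += math.log(smoothed)
        scores[label] = score
    return max(scores, key=scores.get)


# Invented toy headlines (1 = clickbait, 0 = non-clickbait); not from the dataset.
TOY_HEADLINES = [
    ("Шок! Вы не поверите, что сделал этот актёр", 1),
    ("Врачи в шоке: простой способ похудеть", 1),
    ("Вы не поверите, сколько стоит эта квартира", 1),
    ("Центробанк сохранил ключевую ставку", 0),
    ("Правительство утвердило бюджет на следующий год", 0),
    ("В Москве открыли новую станцию метро", 0),
]

model = train_naive_bayes(TOY_HEADLINES)
print(predict(model, "Вы не поверите, что случилось в метро"))
print(predict(model, "Правительство сохранило бюджет"))
```

On the real data, the (title, label) pairs would come from the CSV, and a held-out split would be needed to measure accuracy; a baseline of this kind mainly gives a reference point against which stronger models can be compared.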

Coverage

The data covers headlines of news articles sourced from various Russian news websites. The collection period spans the few months before the dataset's creation. The focus is on Russian-language headlines.

License

Attribution-NoDerivatives 4.0 International (CC BY-ND 4.0)

Who Can Use It

  • Data Scientists: Building robust binary text classification models.
  • Academics/Researchers: Studying media bias and the characteristics of clickbait within non-English language contexts.
  • NLP Developers: Validating and refining parsing scripts (such as parser_titles.py) and text processing pipelines.

Dataset Name Suggestions

  1. Russian News Clickbait Classification Data
  2. RU Headline NLP Corpus
  3. Clickbait Titles Text Collection
  4. News Title Classification Resource

Attributes

Listing Stats

  • Views: 1
  • Downloads: 0
  • Listed: 02/11/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
