
Online Hate Speech Analyser Dataset

Telecommunications & Network Data

Tags and Keywords

Text

Intermediate

NLP

BERT

Online Hate Speech Analyser Dataset on Opendatabay data marketplace


Free

About

This dataset supports binary text classification: deciding whether a given text constitutes hate speech. Each record pairs a text with a label indicating its classification. Its primary purpose is to support the development and training of models that identify hate speech in social discourse, for use in natural language processing and content moderation.

Columns

  • Text_ID: A unique identifier for each individual text entry in the dataset.
  • Data Response: The actual text content, drawn from various contemporary social discourse sources. The texts span a range of topics and opinions. Some discuss social groups or express strong views, and some may contain racist or discriminatory remarks.
  • Label: A numerical indicator classifying the text. A value of '1' denotes hate speech, while '0' indicates that the text is not hate speech.
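The column layout above can be checked before any modelling. The following is a minimal sketch using only the Python standard library. It assumes the CSV header uses exactly the column names given in this listing (Text_ID, Data Response, Label); the helper name parse_records and the Record type are hypothetical, and the sample rows are synthetic placeholders, not dataset content.

```python
import csv
import io
from dataclasses import dataclass

# Column names as given in the listing. The exact header spelling in the
# actual CSV file is an assumption.
REQUIRED_COLUMNS = ("Text_ID", "Data Response", "Label")


@dataclass
class Record:
    text_id: str
    text: str
    label: int  # 1 = hate speech, 0 = not hate speech


def parse_records(handle):
    """Read rows from an open CSV file and validate the three columns."""
    reader = csv.DictReader(handle)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    seen_ids = set()
    records = []
    for row in reader:
        text_id = row["Text_ID"]
        if text_id in seen_ids:
            # The listing describes Text_ID as unique per entry.
            raise ValueError(f"duplicate Text_ID {text_id!r}")
        seen_ids.add(text_id)
        label = (row["Label"] or "").strip()
        if label not in ("0", "1"):
            raise ValueError(f"unexpected label {label!r} for Text_ID {text_id!r}")
        records.append(Record(text_id, row["Data Response"], int(label)))
    return records


# Synthetic placeholder rows, not taken from the dataset.
SAMPLE = io.StringIO(
    "Text_ID,Data Response,Label\n"
    "1,placeholder text a,0\n"
    "2,placeholder text b,1\n"
)
records = parse_records(SAMPLE)
print(len(records), records[1].label)
```

Rejecting unknown labels and duplicate identifiers early keeps silent data problems out of later training and evaluation steps.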

Distribution

The dataset is provided in CSV format. The listing does not specify the number of rows. Its tabular structure makes it suitable for direct input into data analysis and machine learning tools.
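Since the row count is not stated, it may be worth counting rows and checking the class balance after download. A minimal sketch, assuming a header row with a Label column as described under Columns; the function name summarize is hypothetical and the inline sample is synthetic.

```python
import csv
import io
from collections import Counter


def summarize(handle):
    """Return (row_count, label_counts) for an open CSV file."""
    reader = csv.DictReader(handle)
    label_counts = Counter()
    row_count = 0
    for row in reader:
        row_count += 1
        label_counts[row["Label"]] += 1
    return row_count, label_counts


# Synthetic stand-in for the real file. For the downloaded CSV, pass
# open(path, newline="", encoding="utf-8") instead, where path is wherever
# the file was saved (the file's encoding is an assumption).
sample = io.StringIO("Text_ID,Data Response,Label\n1,a,0\n2,b,1\n3,c,0\n")
row_count, label_counts = summarize(sample)
print(row_count, dict(label_counts))
```

A strongly skewed label distribution would suggest reporting per-class metrics rather than plain accuracy.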

Usage

This dataset is suited to applications including:
  • Building and evaluating machine learning models for hate speech detection and classification.
  • Conducting natural language processing (NLP) research focused on offensive language identification.
  • Developing and improving content moderation systems for online platforms and social media.
  • Academic studies on the prevalence and characteristics of hate speech in digital communications.
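For the model evaluation use case, precision, recall, and F1 on the hate speech class (Label = 1) are common choices. The sketch below computes them in plain Python. The function name binary_metrics is hypothetical, and the label lists are made-up illustrations, not results from this dataset or any model.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Precision, recall, and F1 for the positive class (1 = hate speech)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Illustrative labels only: gold labels versus hypothetical model predictions.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In content moderation, low precision means benign text gets flagged and low recall means hate speech slips through, so both are usually reported alongside F1.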

Coverage

Coverage is global. The listing date is 17/06/2025, but the time range of the original content is not specified. The texts reflect contemporary social discourse and include a range of views and expressions about different groups.

License

CC0

Who Can Use It

This dataset is valuable for:
  • Data scientists and machine learning engineers looking to train and test algorithms for hate speech detection.
  • AI/ML researchers interested in the nuances of offensive language and text classification.
  • Content moderators and platform administrators aiming to enhance automated content filtering.
  • Social scientists and academics studying online communication patterns and societal discourse.

Dataset Name Suggestions

  • Hate Speech Classification Dataset
  • Social Discourse Classification Texts
  • Online Hate Speech Analyser Data
  • Textual Offence Detector Dataset

Attributes

Listing Stats

VIEWS

2

DOWNLOADS

0

LISTED

17/06/2025

REGION

GLOBAL

UNIVERSAL DATA QUALITY SCORE (UDQS)

5 / 5

VERSION

1.0
