Online Hate Speech Analyser Dataset
Telecommunications & Network Data
Free
About
This dataset is designed for binary classification of text as hate speech or not hate speech. Each record pairs a text with a label indicating its classification. Its primary purpose is to support the development and training of models that identify hate speech in social discourse, making it a resource for natural language processing and content moderation work.
Columns
- Text_ID: A unique identifier for each individual text entry in the dataset.
- Data Response: This column contains the actual text content from various contemporary social discourse sources. The content spans a range of topics and opinions, some of which may include discussions of social groups or express strong views, including potentially racist or discriminatory remarks.
- Label: A numerical indicator classifying the text. A value of '1' denotes hate speech, while '0' indicates that the text is not hate speech.
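The column layout above can be checked when a file is loaded. The following is a minimal sketch using only the Python standard library. It assumes the CSV header uses exactly the column names listed above, and the sample rows are hypothetical placeholders, not records from the dataset. Text_ID is kept as a string because its type is not stated in the listing.

```python
import csv
import io

# Column names as listed above; assumed to match the CSV header exactly.
EXPECTED_COLUMNS = ["Text_ID", "Data Response", "Label"]


def read_records(handle):
    """Yield (text_id, text, label) tuples after validating the schema."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    for row in reader:
        label = int(row["Label"])
        # The listing defines only two label values: 1 and 0.
        if label not in (0, 1):
            raise ValueError(
                f"unexpected label {label!r} for Text_ID {row['Text_ID']}"
            )
        yield row["Text_ID"], row["Data Response"], label


# Hypothetical placeholder rows for illustration only.
SAMPLE_CSV = (
    "Text_ID,Data Response,Label\n"
    '1,"placeholder text, neutral",0\n'
    "2,placeholder text flagged,1\n"
)

records = list(read_records(io.StringIO(SAMPLE_CSV)))
print(records)
```

Failing early on an unexpected header or label value keeps schema problems from surfacing later as silent training errors.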
Distribution
The dataset is typically provided in CSV format. The number of rows is not specified in the available information. Its tabular structure makes it suitable for direct input into common data analysis and machine learning tools.
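Because the row count is not published, a first step after download might be to count the records and check the class balance. The sketch below assumes a header row containing a Label column, as described under Columns. The inline sample is a hypothetical placeholder.

```python
import csv
import io
from collections import Counter


def summarize_labels(handle, label_column="Label"):
    """Return (row_count, counts_per_label, share_per_label) for a CSV handle."""
    reader = csv.DictReader(handle)
    counts = Counter(row[label_column].strip() for row in reader)
    total = sum(counts.values())
    # Guard against an empty file to avoid dividing by zero.
    shares = {label: n / total for label, n in counts.items()} if total else {}
    return total, counts, shares


# Hypothetical placeholder rows for illustration only.
SAMPLE_CSV = (
    "Text_ID,Data Response,Label\n"
    "1,placeholder a,0\n"
    "2,placeholder b,0\n"
    "3,placeholder c,0\n"
    "4,placeholder d,1\n"
)

total, counts, shares = summarize_labels(io.StringIO(SAMPLE_CSV))
print(total, dict(counts), shares)
```

For a downloaded file, the same function can be called on a handle opened with `open(path, newline="", encoding="utf-8")`. Both the path and the UTF-8 encoding are assumptions, since the listing specifies neither.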
Usage
This dataset is ideally suited for a variety of applications, including:
- Building and evaluating machine learning models for hate speech detection and classification.
- Conducting natural language processing (NLP) research focused on offensive language identification.
- Developing and improving content moderation systems for online platforms and social media.
- Academic studies on the prevalence and characteristics of hate speech in digital communications.
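For the model evaluation use case, hate speech detection is often scored with precision, recall, and F1 on the positive class rather than accuracy alone, since class imbalance can make accuracy misleading. The following is a minimal, dependency-free sketch. The function name and the example label lists are hypothetical and are not results from this dataset.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the positive class (label 1)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Each ratio falls back to 0.0 when its denominator is zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical gold labels and model predictions for illustration only.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Here the positive class is the value '1' from the Label column, so precision reflects how often flagged texts are truly hate speech and recall reflects how much hate speech is caught.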
Coverage
The dataset's regional coverage is global. The listing date is 17/06/2025, but the time range of the original content is not specified. The texts reflect contemporary social discourse and encompass a range of views and expressions relating to different groups.
License
CC0
Who Can Use It
This dataset is valuable for:
- Data scientists and machine learning engineers looking to train and test algorithms for hate speech detection.
- AI/ML researchers interested in the nuances of offensive language and text classification.
- Content moderators and platform administrators aiming to enhance automated content filtering.
- Social scientists and academics studying online communication patterns and societal discourse.
Dataset Name Suggestions
- Hate Speech Classification Dataset
- Social Discourse Classification Texts
- Online Hate Speech Analyser Data
- Textual Offence Detector Dataset
Attributes
Original Data Source: Hate Speech Classification Dataset