FREE DATASET LIBRARY

Verified Data Provider

£0

Ethical Dialogue Dataset

Education & Learning Analytics

Tags and Keywords

Education

Social

NLP

Psychology

Mental

People

About

ProsocialDialog is a large-scale, multi-turn English dialogue dataset designed to teach conversational agents how to respond to problematic content in line with social norms. It addresses a variety of unethical, biased, toxic, and generally problematic situations. The dataset is notable for its focus on encouraging prosocial behaviour, which is guided by commonsense social rules, referred to as Rules-of-Thumb (RoTs). Developed through a human-AI collaborative framework, the dataset consists of 58,000 dialogues, comprising 331,000 utterances, 160,000 unique RoTs, and 497,000 dialogue safety labels, each accompanied by free-form rationales. The test.csv file within the ProsocialDialog dataset contains data specifically for evaluating the accuracy of a model in predicting conversation safety.

Columns

The dataset includes the following columns:

context: The context of the conversation. (String)
response: The response to the conversation. (String)
rots: Rules of thumb associated with the conversation. (String)
safety_label: The safety label associated with the conversation. (String)
safety_annotations: Annotations associated with the conversation. (String)
safety_annotation_reasons: Reasons for the safety annotations. (String)
source: The source of the conversation. (String)
etc: Any additional information associated with the conversation. (String)
dialogue_id: Unique identifier for each dialogue.
response_id: Unique identifier for each response.
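As a minimal sketch of how the columns listed above might be checked when reading the file, the snippet below parses CSV text with Python's standard `csv` module and fails early if a listed column is missing. The helper names `read_rows` and `make_sample` are hypothetical, and the one-row sample is invented for illustration; it is not taken from the dataset, and the label values in it are placeholders.

```python
import csv
import io

# Column names exactly as given in the Columns list.
EXPECTED_COLUMNS = [
    "context", "response", "rots", "safety_label",
    "safety_annotations", "safety_annotation_reasons",
    "source", "etc", "dialogue_id", "response_id",
]


def read_rows(handle):
    """Parse CSV text into dicts, failing early if a listed column is absent."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)


def make_sample():
    """Build a one-row in-memory CSV; every value here is invented."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=EXPECTED_COLUMNS)
    writer.writeheader()
    writer.writerow({
        "context": "I skipped class again today.",
        "response": "Is something making it hard to attend?",
        "rots": "It is good to attend your classes.",
        "safety_label": "placeholder_label",
        "safety_annotations": "placeholder_annotation",
        "safety_annotation_reasons": "Invented reason for illustration.",
        "source": "placeholder_source",
        "etc": "",
        "dialogue_id": "0",
        "response_id": "0",
    })
    buffer.seek(0)
    return buffer


rows = read_rows(make_sample())
print(rows[0]["context"], "->", rows[0]["safety_label"])
```

For the downloaded file, the same function could be called on a handle opened with `open("test.csv", newline="", encoding="utf-8")`; the UTF-8 encoding is an assumption, as the listing does not state it.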

Distribution

The dataset is provided in CSV format, for example as test.csv. The full ProsocialDialog dataset contains 58,000 dialogues with 331,000 utterances, 160,000 unique Rules-of-Thumb (RoTs), and 497,000 dialogue safety labels. The listing also reports 24,972 unique dialogue IDs and 24,903 unique response IDs; these are lower than the full-dataset totals and may describe only the provided file. The total row count is not stated.
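Counts such as the number of unique dialogue and response IDs can be recomputed from the parsed rows. The sketch below is one possible way to do that with the standard library; the `summarise` helper is a hypothetical name, and the three records and their label values are invented placeholders rather than real data.

```python
from collections import Counter

# Invented records standing in for parsed CSV rows (only the fields used here).
rows = [
    {"dialogue_id": "0", "response_id": "0", "safety_label": "label_a"},
    {"dialogue_id": "0", "response_id": "1", "safety_label": "label_b"},
    {"dialogue_id": "1", "response_id": "0", "safety_label": "label_a"},
]


def summarise(records):
    """Return row count, unique ID counts, and a per-label tally."""
    return {
        "rows": len(records),
        "unique_dialogue_ids": len({r["dialogue_id"] for r in records}),
        "unique_response_ids": len({r["response_id"] for r in records}),
        "label_counts": Counter(r["safety_label"] for r in records),
    }


summary = summarise(rows)
print(summary)
```

Running the same function on the real file would show whether the ID counts quoted in the listing match the file that was actually downloaded.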

Usage

This dataset is ideally suited for several applications:

Designing Conversational Agents: It can be used to build Natural Language Processing (NLP) models capable of recognising and classifying problematic content. The safety labels, rationales, and RoTs can train conversational agents to respond in socially acceptable ways.
Benchmark Systems: ProsocialDialog can serve as a benchmark for evaluating how well conversational systems identify, respond to, and prevent problematic content.
Automated Moderation: The dialogue safety labels and their associated free-form rationales are valuable for technology platforms implementing automated moderation tasks, such as flagging or banning offensive messages or users.
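To illustrate the benchmarking use, the sketch below scores a label predictor against the `safety_label` column using plain accuracy. It is an assumption-laden toy: `keyword_baseline` is a hypothetical stand-in for a real model, the `"flagged"` and `"ok"` labels are placeholders rather than the dataset's actual label values, and the three records are invented.

```python
def accuracy(records, predict):
    """Fraction of records whose predicted label equals safety_label."""
    if not records:
        raise ValueError("no records to evaluate")
    correct = sum(
        1 for r in records if predict(r["context"]) == r["safety_label"]
    )
    return correct / len(records)


def keyword_baseline(context):
    """Hypothetical stand-in for a real model: flags one hard-coded word."""
    return "flagged" if "steal" in context.lower() else "ok"


# Invented evaluation records; labels are placeholders, not dataset values.
rows = [
    {"context": "I want to steal a bike.", "safety_label": "flagged"},
    {"context": "I baked bread today.", "safety_label": "ok"},
    {"context": "I lied to my friend.", "safety_label": "flagged"},
]

score = accuracy(rows, keyword_baseline)
print(f"accuracy: {score:.2f}")
```

Any callable that maps a context string to a label can be passed as `predict`, so a trained classifier could replace the keyword baseline without changing the scoring function.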

Coverage

The ProsocialDialog dataset is in English and is listed with global coverage. It addresses general conversational scenarios involving social norms and problematic content. The demographic scope and the time range of data collection are not specified. The dataset was listed on 11/06/2025.

License

CC0

Who Can Use It

This dataset is beneficial for a range of users, including:

Researchers and Developers in AI and Machine Learning: Particularly those focused on Natural Language Processing (NLP) and building sophisticated conversational AI systems.
Organisations and Platforms: Especially those in need of automated moderation tools or aiming to ensure their conversational agents adhere to social norms and promote prosocial behaviour.
Academics and Students: Engaged in studying dialogue safety, social psychology, or ethical AI, who can explore the safety labels, annotations, RoTs, and data sources to gain deeper insights into human conversation dynamics.

Dataset Name Suggestions

ProsocialDialog - Problematic Content Dialogue
Conversational Safety Norms
Ethical Dialogue Dataset
Social Norms AI Conversations
Harmful Content Dialogue Dataset

Attributes

Original Data Source: ProsocialDialog - Problematic Content Dialogue

Listing Stats

Listed: 11/06/2025
Region: Global
Quality: 5 / 5
Version: 1.0
