Multi-Turn Dialogues with Emotion & Intent Labels
Data Science and Analytics
Free
About
The DailyDialog dataset is a curated collection of multi-turn dialogues that reflect everyday communication across a variety of daily-life topics. The conversations are human-written, so the language is natural and realistic and the data tends to contain less noise. Each dialogue involves two or more participants and is provided as text. Every utterance carries two labels, one for communication intention and one for emotion, which show how participants express their intentions and emotional states through speech. The dataset is intended for developing dialogue systems that can identify diverse intentions and recognise the emotional states that occur in daily exchanges.
Columns
- dialog: This column contains the actual conversation between two or more participants. It is presented in text format.
- act: The act column provides the communication intention labels for each utterance within the dialogue. These labels categorise the purpose behind a participant's speech, such as asking a question, making a statement, or making a request.
- emotion: This column holds categorical labels that represent the emotions expressed by each participant during their utterances, including examples like anger, happiness, or sadness.
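The per-utterance alignment between the three columns can be sketched as follows. This is a minimal illustration, assuming each dialogue has already been parsed into a list of utterance strings with two parallel lists of integer labels. The label names follow the encoding documented for the original DailyDialog release (acts 1 to 4, emotions 0 to 6); this listing does not confirm that the CSV files keep that encoding, so check it against the files. The names `ACT_NAMES`, `EMOTION_NAMES`, and `align_labels` are hypothetical.

```python
# Label names as documented for the original DailyDialog release.
# Assumption: the CSV files keep the same integer encoding; verify before use.
ACT_NAMES = {1: "inform", 2: "question", 3: "directive", 4: "commissive"}
EMOTION_NAMES = {
    0: "no emotion", 1: "anger", 2: "disgust", 3: "fear",
    4: "happiness", 5: "sadness", 6: "surprise",
}


def align_labels(dialog, act, emotion):
    """Pair each utterance with the names of its act and emotion labels."""
    if not (len(dialog) == len(act) == len(emotion)):
        raise ValueError("dialog, act, and emotion must have the same length")
    return [
        {
            "utterance": text,
            "act": ACT_NAMES.get(a, "unknown"),
            "emotion": EMOTION_NAMES.get(e, "unknown"),
        }
        for text, a, e in zip(dialog, act, emotion)
    ]


# Made-up two-turn example, not taken from the dataset.
rows = align_labels(
    ["Could you pass the salt?", "Sure, here you go."],
    [3, 4],
    [0, 4],
)
for row in rows:
    print(row)
```

The length check is deliberate: a mismatch between utterances and labels usually points to a parsing problem, and it is easier to catch here than after training.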
Distribution
The dataset is organised into three separate CSV files: train.csv, validation.csv, and test.csv, used for training, validation, and testing respectively. Row counts are not provided in the available information.
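Since row counts are not listed, one way to obtain them is to load each split and count the records. The sketch below uses only the Python standard library and assumes the three files sit in a local folder (the `data` path is hypothetical) and that each file has a header row with the `dialog`, `act`, and `emotion` columns. Cell values are kept as raw strings because the exact serialisation of the utterance and label lists is not described here.

```python
import csv
from pathlib import Path

# File names come from the listing; the folder they live in is an assumption.
SPLIT_FILES = {
    "train": "train.csv",
    "validation": "validation.csv",
    "test": "test.csv",
}
EXPECTED_COLUMNS = {"dialog", "act", "emotion"}


def load_split(path):
    """Read one split into a list of dicts, keeping cell values as raw strings."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        missing = EXPECTED_COLUMNS - set(reader.fieldnames or [])
        if missing:
            raise ValueError(f"{path} is missing columns: {sorted(missing)}")
        return list(reader)


def load_all_splits(data_dir):
    """Load every split file that exists under data_dir."""
    data_dir = Path(data_dir)
    return {
        name: load_split(data_dir / filename)
        for name, filename in SPLIT_FILES.items()
        if (data_dir / filename).exists()
    }


# Report the number of dialogues per split (prints nothing if no files are found).
splits = load_all_splits("data")  # hypothetical local folder
for name, rows in splits.items():
    print(name, len(rows))
```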
Usage
The dataset supports several applications:
- Natural Language Processing (NLP): Ideal for training NLP models to understand and generate more realistic and human-like dialogues. Communication intention labels help identify the purpose of utterances, while emotion labels add emotional context.
- Sentiment Analysis: With the emotion labels, the dataset can be used for sentiment analysis tasks, allowing classification of the overall sentiment of a conversation or individual utterances. This is useful for understanding customer feedback or social media discussions.
- Dialogue Generation: One can train dialogue generation models capable of creating engaging conversations on various daily life topics. Communication intention labels can guide the model in generating appropriate responses based on different expressed intents.
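Two of the uses above can be sketched in a few lines: building intent-conditioned training examples for dialogue generation, and deriving a conversation-level emotion from utterance-level labels. This is an illustrative sketch, not a method prescribed by the dataset. It assumes dialogues are already parsed into lists, that label `0` means "no emotion" (as in the original DailyDialog release), and that a simple majority vote is an acceptable aggregation. The function names and the `max_context` parameter are hypothetical.

```python
from collections import Counter


def response_pairs(dialog, act, max_context=3):
    """Build (context, target_act, response) examples for response generation.

    Each utterance after the first becomes a target; the preceding
    utterances (at most max_context of them) form its context.
    """
    examples = []
    for i in range(1, len(dialog)):
        context = dialog[max(0, i - max_context):i]
        examples.append((context, act[i], dialog[i]))
    return examples


def dominant_emotion(emotion, neutral=0):
    """Return the most frequent non-neutral label, or neutral if none occurs."""
    counts = Counter(e for e in emotion if e != neutral)
    if not counts:
        return neutral
    return counts.most_common(1)[0][0]


# Made-up three-turn example, not taken from the dataset.
dialog = ["How was the trip?", "Great, we loved it!", "Glad to hear that."]
act = [2, 1, 1]
emotion = [0, 4, 4]

pairs = response_pairs(dialog, act)
for context, target_act, response in pairs:
    print(context, target_act, response)

print(dominant_emotion(emotion))
```

Passing the target act alongside the context lets a generation model be conditioned on the intended purpose of the reply. Ignoring the neutral label in the vote is a design choice: if neutral utterances dominate a corpus, a plain majority vote would label almost every conversation as neutral.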
Coverage
The dataset is designed to represent daily life conversations and covers a wide range of everyday topics. All conversations are human-written. No geographic, temporal, or demographic scope beyond "daily life" is specified.
License
CC0
Who Can Use It
- AI/ML Developers: Especially those working on dialogue systems, conversational AI, and natural language understanding.
- NLP Researchers: Individuals focused on advancing NLP models for dialogue, intention recognition, and emotion detection.
- Data Scientists: Those interested in sentiment analysis, language modelling, and human communication patterns.
- Academics: Researchers and students studying human interaction, linguistics, and machine learning applications in text analysis.
Dataset Name Suggestions
- DailyDialog: Intent & Emotion Conversations
- Multi-Turn Dialogues with Emotion & Intent Labels
- Everyday Conversation Dataset
- Human-Written Dialogues: Intent & Sentiment
- DailyTalk: Annotated Conversations
Attributes
Original Data Source: DailyDialog: Multi-Turn Dialog+Intention+Emotion