Human-Annotated Emotion Regression Data
Data Science and Analytics
Free
About
EmoBank provides a large-scale collection of text manually annotated with emotional dimensions. Rather than relying only on basic labels like joy or sadness, it takes a multi-dimensional approach that also captures the nuanced intensity and power of emotions. The inclusion of valence, arousal, and dominance scales allows for a thorough quantitative analysis of emotional content in natural language. This three-dimensional annotation scheme makes it possible to map each text to a point in valence-arousal-dominance space, providing a detailed representation of human sentiment.
Columns
- id: A unique identifier for each specific record in the collection.
- split: Indicates the dataset partition, categorised as "train", "dev", or "test" for model development and evaluation.
- V (Valence): Measures the positivity or negativity of the emotion on a continuous scale, where higher values indicate positive emotions.
- A (Arousal): Represents the energy or intensity level of the emotional state, distinguishing between calm and intense emotions.
- D (Dominance): Quantifies the level of control or influence the expressed emotion exerts, distinguishing between powerful and submissive feelings.
- text: The raw textual samples from which the emotional metrics are derived.
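As a minimal sketch of how the six columns above might be read, the following uses only the Python standard library. The function names `load_records` and `group_by_split` are hypothetical helpers introduced for illustration, and the three sample rows are made-up placeholders rather than records from the dataset.

```python
import csv
import io

EXPECTED_COLUMNS = ("id", "split", "V", "A", "D", "text")
SCORE_COLUMNS = ("V", "A", "D")


def load_records(handle):
    """Parse EmoBank-style CSV rows from an open text handle."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"CSV is missing expected columns: {missing}")
    records = []
    for row in reader:
        record = {"id": row["id"], "split": row["split"], "text": row["text"]}
        for name in SCORE_COLUMNS:
            record[name] = float(row[name])  # scores are continuous values
        records.append(record)
    return records


def group_by_split(records):
    """Bucket records by the value of their split column."""
    groups = {}
    for record in records:
        groups.setdefault(record["split"], []).append(record)
    return groups


# Made-up placeholder rows for illustration only; NOT taken from the dataset.
SAMPLE_CSV = """id,split,V,A,D,text
example_1,train,3.5,3.0,3.2,"A placeholder sentence, with a comma."
example_2,dev,2.0,3.8,2.5,Another placeholder sentence.
example_3,test,3.0,2.9,3.0,A third placeholder sentence.
"""

records = load_records(io.StringIO(SAMPLE_CSV))
groups = group_by_split(records)
print({name: len(rows) for name, rows in groups.items()})
```

To read the real file instead of the sample, one would presumably pass a handle opened with `open("emobank.csv", newline="", encoding="utf-8")`; the UTF-8 encoding is an assumption, not something the listing states.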
Distribution
The data is supplied as a CSV file titled emobank.csv, with a file size of 1.34 MB. It contains 10,100 records and features a usability rating of 10.00. The structure is 100% valid with no missing or mismatched records across the six columns. Updates are not expected for this static resource.
Usage
This resource is ideal for sentiment analysis research and building emotional intelligence into AI systems. It supports the development and evaluation of NLP models that require a sophisticated understanding of human feelings within text. Applications include fine-tuning language models for emotion recognition and performing multi-dimensional regression analysis on linguistic data.
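For the multi-dimensional regression use case, predictions are typically scored separately for each of V, A, and D. The sketch below computes Pearson correlation and mean absolute error per dimension; the choice of these two metrics is an assumption rather than something this listing prescribes, and `evaluate`, `pearson_r`, and `mean_absolute_error` are hypothetical helper names. The gold and predicted values are made-up numbers.

```python
import math

DIMENSIONS = ("V", "A", "D")


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant series
    return cov / math.sqrt(var_x * var_y)


def mean_absolute_error(xs, ys):
    """Average absolute difference between paired values."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)


def evaluate(gold, predicted):
    """Score predictions per dimension; rows are dicts keyed by V, A, D."""
    report = {}
    for dim in DIMENSIONS:
        g = [row[dim] for row in gold]
        p = [row[dim] for row in predicted]
        report[dim] = {
            "pearson_r": pearson_r(g, p),
            "mae": mean_absolute_error(g, p),
        }
    return report


# Made-up values for illustration only; NOT taken from the dataset.
gold = [
    {"V": 1.0, "A": 2.0, "D": 3.0},
    {"V": 2.0, "A": 3.0, "D": 1.0},
    {"V": 3.0, "A": 4.0, "D": 2.0},
]
# A hypothetical model that overshoots every score by 0.5.
predicted = [{dim: row[dim] + 0.5 for dim in DIMENSIONS} for row in gold]
report = evaluate(gold, predicted)
print(report)
```

Reporting both metrics is useful because a constant offset, as in this toy example, leaves the correlation perfect while the absolute error is non-zero.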
Coverage
The scope includes diverse text types such as online articles, product reviews, and various forms of user-generated content. It provides roughly 10,100 unique entries split into training (80%), development (10%), and testing (10%) subsets, so that models can be tuned on the development subset and evaluated on held-out test data. The data reflects a wide range of emotional intensities and polarities found in digital communication.
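The stated 80/10/10 partition can be checked against the split column with a short script. This is a sketch under assumptions: `split_shares` and `shares_match` are hypothetical helpers, the tolerance of one percentage point is an arbitrary choice, and the label list is a synthetic stand-in for the real column.

```python
from collections import Counter

# Proportions as stated in the Coverage description.
EXPECTED_SHARES = {"train": 0.8, "dev": 0.1, "test": 0.1}


def split_shares(split_labels):
    """Return the fraction of records carrying each split label."""
    counts = Counter(split_labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no split labels given")
    return {name: count / total for name, count in counts.items()}


def shares_match(shares, expected, tolerance=0.01):
    """True if every expected share is within tolerance of the observed one."""
    return all(
        abs(shares.get(name, 0.0) - target) <= tolerance
        for name, target in expected.items()
    )


# Synthetic labels for illustration only; NOT the dataset's actual column.
labels = ["train"] * 8 + ["dev"] + ["test"]
shares = split_shares(labels)
print(shares, shares_match(shares, EXPECTED_SHARES))
```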
License
CC0: Public Domain
Who Can Use It
NLP researchers can use these records to benchmark emotion recognition algorithms. AI developers might integrate the multi-dimensional scales to create more empathetic chatbots or virtual assistants. Additionally, psychologists and social scientists can leverage the text to study digital emotional expression and behavioural patterns.
Dataset Name Suggestions
- EmoBank: Multi-Dimensional Textual Emotion Corpus
- VAD Emotion Recognition Dataset
- Text-Based Sentiment and Arousal Scales
- Human-Annotated Emotion Regression Data
Attributes
Original Data Source: Human-Annotated Emotion Regression Data
