Twitter Emotion Classification Dataset

Social Media and Posts

Tags and Keywords

Emotion, Tweets, NLP, Sentiment, Recognition

Free

About

This dataset supports emotion recognition on English Twitter messages [3]. Each tweet is labelled with one of six basic human emotions: anger, fear, joy, love, sadness, or surprise [3]. The original collection covered an extended set of eight emotions (including anticipation, disgust, and trust), and the data was preprocessed following the methodology described in the accompanying research paper [3]. The dataset is intended for studying and modelling how emotions are conveyed through text, which underpins contextualised affect representations [4].

Columns

text: A string feature holding the tweet content [5]. The column contains 2000 unique values, all of them valid [5]. Example entry: "im feeling quite sad and sorry for myself but ill snap out of it soon" [6].
label: An integer classification label for one of the six basic emotions [7]. The values are sadness (0), joy (1), love (2), anger (3), fear (4), and surprise (5, implied by the ordering; the source breakdown does not list its number explicitly) [7]. The column contains 2000 valid entries, with a mean label value of 1.53 and a standard deviation of 1.47 [7].
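
The label encoding above can be captured in a small lookup table. The following is a minimal Python sketch, not part of the dataset itself: the names LABEL_NAMES and decode_label are hypothetical, and the value 5 for surprise is assumed from the ordering rather than stated in the source breakdown.

```python
# Assumed mapping from integer label to emotion name, following the
# column description. The value 5 for "surprise" is inferred from the
# ordering, not listed explicitly in the source.
LABEL_NAMES = {
    0: "sadness",
    1: "joy",
    2: "love",
    3: "anger",
    4: "fear",
    5: "surprise",
}


def decode_label(label: int) -> str:
    """Return the emotion name for an integer label, or raise ValueError."""
    try:
        return LABEL_NAMES[label]
    except KeyError:
        raise ValueError(f"unknown emotion label: {label!r}") from None


print(decode_label(3))
```

Raising on unknown values, rather than returning a default, makes unexpected labels visible early when the mapping assumption turns out to be wrong.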

Distribution

The dataset consists of preprocessed English Twitter messages [3]. The distribution file format is not stated explicitly, but the source references a test.csv file, which suggests CSV [5]. The downloaded dataset files total 3.95 MB and the generated dataset 4.16 MB, for a combined disk usage of 8.11 MB [6]. The text and label columns each contain 2000 records [5, 7].
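
As a rough illustration of reading such a file, the sketch below parses CSV rows using only the Python standard library. It assumes the file has a header row with columns named text and label, which the listing does not confirm. The helper name load_rows is hypothetical, the in-memory sample stands in for test.csv, and the labels in the sample (and the whole second row) are invented for illustration.

```python
import csv
import io
from collections import Counter


def load_rows(handle):
    """Parse CSV rows with 'text' and 'label' columns into (text, label) pairs."""
    reader = csv.DictReader(handle)
    missing = {"text", "label"} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    return [(row["text"], int(row["label"])) for row in reader]


# Tiny in-memory stand-in for test.csv. The labels here are assumed,
# and the second row is invented for illustration.
sample = io.StringIO(
    "text,label\n"
    "im feeling quite sad and sorry for myself but ill snap out of it soon,0\n"
    "i feel happy today,1\n"
)
rows = load_rows(sample)
label_counts = Counter(label for _, label in rows)
print(len(rows), dict(label_counts))
```

For the real file, the same function could be called on a handle opened with open("test.csv", newline="", encoding="utf-8"). If the listing's figures hold, the result should contain 2000 rows.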

Usage

This dataset suits a range of applications in Natural Language Processing (NLP) and machine learning, in particular:

Emotion recognition and detection in textual data [3, 4].
Developing and evaluating sentiment analysis models [4].
Text classification tasks related to emotional states [4].
Contextualised affect representation research [4].
Building AI systems capable of understanding nuanced human emotions from text [4].

Coverage

The dataset consists exclusively of English Twitter messages [3]. No geographic scope or time range is specified, beyond the data having been sourced from the Twitter API [3]. The data covers general emotional expression in tweets and does not specify which demographic groups are represented [3, 4].

License

CC0: Public Domain

Who Can Use It

This dataset is suitable for:

AI and Machine Learning Researchers: For developing and testing new algorithms for emotion recognition and sentiment analysis [4].
Data Scientists: To build predictive models for understanding emotional content in social media data [4].
NLP Practitioners: For training and fine-tuning language models on emotional expression [4].
Students and Academics: As a valuable resource for projects and studies in computational linguistics and artificial intelligence [3].

Dataset Name Suggestions

Emotion Tweets for NLP
Twitter Emotion Classification Dataset
CARER Emotion Dataset
English Tweet Emotion Data
Social Media Emotion Recognition Corpus

Attributes

Original Data Source: Twitter Emotion Classification Dataset

Listing Stats

Listed: 20/07/2025
Region: Global
Quality: 5 / 5
Version: 1.0
