Twitter Mental Health Classification Data

Mental Health & Wellness

Tags and Keywords

Text, NLP, Healthcare, Binary, Depression, Twitter, Mental

Free

About

This dataset provides uncleaned, English-only Twitter data for mental health classification at the tweet level. It is intended for developing and evaluating models that identify mental health indicators in social media text, and it includes the raw tweet text along with associated user metrics. It can also be used to practise data cleaning and feature extraction, for example Topic Modelling Features, which use Latent Dirichlet Allocation (LDA) to summarise tweets into the top k topics, and Emoji Sentiment Features, which count the positive, negative, and neutral emojis present in each tweet.
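The emoji sentiment features described above could be computed along the following lines. This is a minimal sketch, not the original feature-extraction code: the `EMOJI_SENTIMENT` lexicon and the `emoji_sentiment_counts` helper are hypothetical names, and the six emojis are hand-picked for illustration only.

```python
from collections import Counter

# Hypothetical, hand-picked lexicon for illustration only; a real project
# would substitute a published emoji sentiment lexicon.
EMOJI_SENTIMENT = {
    "😀": "positive",
    "😍": "positive",
    "😢": "negative",
    "😡": "negative",
    "😐": "neutral",
    "🤔": "neutral",
}


def emoji_sentiment_counts(text):
    """Count positive, negative, and neutral emojis in one tweet."""
    counts = Counter({"positive": 0, "negative": 0, "neutral": 0})
    for char in text:
        sentiment = EMOJI_SENTIMENT.get(char)
        if sentiment is not None:
            counts[sentiment] += 1
    return dict(counts)


features = emoji_sentiment_counts("rough week 😢😢 but trying 😀 🤔")
print(features)
```

The sketch matches single code points, so emojis built from several code points (skin-tone variants, joined sequences) would need a dedicated emoji library or a regex-based matcher.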

Columns

post_id: The unique identification number for each Twitter post.
post_created: The timestamp indicating when the post was created.
post_text: The raw, uncleaned text content of the tweet.
user_id: The unique identification number for the user who posted the tweet.
followers: The number of followers the user had at the time of the post.
friends: The number of friends (accounts the user is following) the user had at the time of the post.
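favourites: The user's account-level favourites (likes) count. The listing describes this as likes received across all the user's tweets, though Twitter's own favourites count for a user records the tweets that user has liked.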
statuses: The total count of statuses (tweets) posted by the user.
retweets: The total number of retweets received by the current tweet.
Label: The classification label for mental health, intended for binary classification tasks.
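As an illustration of the column layout above, the following sketch reads rows into typed records using only the Python standard library. The two sample rows are made up to mirror the documented column names and are not taken from the dataset. The timestamp format and the label encoding are not documented in the listing, so both are kept as raw text. `Tweet` and `read_tweets` are hypothetical names.

```python
import csv
import io
from dataclasses import dataclass

# Two made-up rows that only mirror the documented column names;
# they are NOT taken from the dataset.
SAMPLE_CSV = """post_id,post_created,post_text,user_id,followers,friends,favourites,statuses,retweets,Label
1,Sun Aug 02 10:00:00 +0000 2015,"Feeling fine today, thanks",100,84,211,251,837,0,0
2,Sun Aug 02 11:30:00 +0000 2015,Can't sleep again,101,12,40,5,120,3,1
"""

COUNT_COLUMNS = ("followers", "friends", "favourites", "statuses", "retweets")


@dataclass
class Tweet:
    post_id: str       # IDs stay as text so large values are never rounded
    post_created: str  # raw text; the listing does not state the format
    post_text: str
    user_id: str
    followers: int
    friends: int
    favourites: int
    statuses: int
    retweets: int
    label: str         # raw text; the label encoding is not documented


def read_tweets(handle):
    """Parse CSV rows from an open text handle into Tweet records."""
    tweets = []
    for row in csv.DictReader(handle):
        counts = {name: int(row[name]) for name in COUNT_COLUMNS}
        tweets.append(Tweet(
            post_id=row["post_id"],
            post_created=row["post_created"],
            post_text=row["post_text"],
            user_id=row["user_id"],
            label=row["Label"],
            **counts,
        ))
    return tweets


tweets = read_tweets(io.StringIO(SAMPLE_CSV))
print(len(tweets), tweets[0].post_text, tweets[1].retweets)
```

For the downloaded file, the same function can be given a handle opened with `open(path, newline="", encoding="utf-8")`, where `path` is wherever the CSV was saved. Because the data is uncleaned, the `int(...)` conversions may need a fallback for blank or malformed values.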

Distribution

The data is provided in CSV format in an uncleaned state. The listing does not state a total row count, but reports approximately 19,102 unique post IDs and 19,488 unique user IDs. The dataset's meta-information gives value ranges and counts for followers, friends, favourites, statuses, and retweets.
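Since the listing gives no total row count, it may be worth checking the row and unique-ID counts directly after download. The sketch below shows one way to do that with the standard library. `id_summary` is a hypothetical helper, and the in-memory sample stands in for the real file.

```python
import csv
import io


def id_summary(handle):
    """Return the row count and the number of distinct post and user IDs."""
    rows = 0
    post_ids = set()
    user_ids = set()
    for row in csv.DictReader(handle):
        rows += 1
        post_ids.add(row["post_id"])
        user_ids.add(row["user_id"])
    return {
        "rows": rows,
        "unique_post_ids": len(post_ids),
        "unique_user_ids": len(user_ids),
    }


# Made-up sample with one duplicated post, standing in for the real CSV.
sample = io.StringIO("post_id,user_id\n1,100\n2,100\n2,100\n3,101\n")
summary = id_summary(sample)
print(summary)
```

A gap between `rows` and `unique_post_ids` would point to duplicated posts, which is one of the first things to resolve when cleaning the data.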

Usage

This dataset is ideal for:

Developing and testing mental health classification models using social media data.
Practising and demonstrating Natural Language Processing (NLP) techniques, including text analysis and feature engineering.
Exploring and applying data cleaning methodologies on raw social media text.
Implementing and evaluating Topic Modelling using algorithms like LDA.
Conducting sentiment analysis based on emoji usage in tweets.
Research in social media analytics, public health, and digital epidemiology.
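For the data cleaning use case in the list above, a first pass over `post_text` might look like the sketch below. The specific steps (unescaping HTML entities, dropping links and @mentions, keeping hashtag words, lowercasing) are assumptions about what a typical tweet-cleaning pipeline does, not steps prescribed by the dataset. `clean_tweet` is a hypothetical helper.

```python
import html
import re

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
WHITESPACE_RE = re.compile(r"\s+")


def clean_tweet(text):
    """Apply a simple, lossy normalisation to one raw tweet."""
    text = html.unescape(text)            # "&amp;" becomes "&"
    text = URL_RE.sub(" ", text)          # drop links
    text = MENTION_RE.sub(" ", text)      # drop @mentions
    text = text.replace("#", "")          # keep hashtag words, drop the symbol
    text = WHITESPACE_RE.sub(" ", text)   # collapse runs of whitespace
    return text.strip().lower()


raw = "@someone Can't sleep &amp; feeling low #insomnia https://t.co/abc123"
print(clean_tweet(raw))
```

Emojis are deliberately left in place so that emoji-based features can still be extracted from the cleaned text.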

Coverage

The dataset's coverage is global, restricted to English-language tweets. No collection period is given for the tweets; the dataset was listed on 05/06/2025.

License

CC0

Who Can Use It

This dataset is suitable for:

Data scientists and machine learning engineers working on text classification and NLP projects.
Researchers in mental health, social sciences, and computational linguistics.
Students and academics learning about social media data analysis, feature engineering, and model development for health applications.
Healthcare professionals interested in leveraging social media for insights into mental wellness trends.

Dataset Name Suggestions

Twitter Mental Health Classification Data
English Tweets Depression Classifier
Social Media Mental Health Indicators
Tweet-Level Mental Well-being Dataset
Depression Prediction from Twitter

Attributes

Original Data Source: Depression: Twitter Dataset + Feature Extraction

Listing Stats

Listed: 05/06/2025
Region: Global
Quality: 5 / 5
Version: 1.0
