Mental Health Discourse Toxicity Dataset
Mental Health & Wellness
Free
About
This dataset is a collection of text comments centered on people experiencing anxiety, depression, and other mental health challenges. It is intended to support the study of language and sentiment around mental health, and can be applied to tasks such as sentiment analysis, toxic language detection, and general mental health language analysis. The two classes are close to balanced: roughly half of the comments are labeled "poisonous" and half are not.
Columns
- text: The raw text of the comment.
- label: A binary class for each comment. '1' marks the comment as poisonous (toxic) in a mental health context; '0' marks it as not poisonous.
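A minimal sketch of reading the two columns described above, using only Python's standard library. The helper name `load_records` and the sample rows are illustrative assumptions; the rows are synthetic stand-ins rather than data from the corpus, and the real file name is not given in this listing.

```python
import csv
import io


def load_records(fileobj):
    """Read (text, label) pairs from a CSV with 'text' and 'label' columns."""
    reader = csv.DictReader(fileobj)
    missing = {"text", "label"} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    records = []
    for row in reader:
        label = int(row["label"])
        # The listing documents only two label values: 0 and 1.
        if label not in (0, 1):
            raise ValueError(f"unexpected label: {label}")
        records.append((row["text"], label))
    return records


# Synthetic stand-in rows (assumption), not taken from the dataset.
sample_csv = io.StringIO(
    "text,label\n"
    "example comment one,0\n"
    "\"example comment, with a comma\",1\n"
)
records = load_records(sample_csv)
```

To read the real file, pass an object opened with `open(path, newline="", encoding="utf-8")`, where `path` is wherever the CSV was saved.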
Distribution
The dataset is typically distributed as a CSV file. It contains 27,972 unique records. By label, 14,139 records are classified '0' (not poisonous) and 13,838 are classified '1' (poisonous), a split of roughly 50.5% to 49.5%. These two counts sum to 27,977, five more than the unique-record figure, which suggests the label counts include a few duplicate rows.
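The figures quoted in this section can be checked with a few lines of arithmetic. The sketch below takes the two label counts and the unique-record count exactly as stated here, then computes the class shares and the gap between the summed labels and the unique-record figure. The variable names are illustrative.

```python
# Counts as stated in this listing.
label_counts = {0: 14_139, 1: 13_838}
stated_unique_records = 27_972

total_labeled = sum(label_counts.values())
shares = {label: n / total_labeled for label, n in label_counts.items()}
gap = total_labeled - stated_unique_records

for label, share in shares.items():
    print(f"label {label}: {label_counts[label]} rows ({share:.2%})")
print(f"labeled rows: {total_labeled}, gap vs. unique records: {gap}")
```

The shares come out near 50.5% and 49.5%, which supports describing the dataset as balanced.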
Usage
This dataset suits training and evaluating machine learning models for sentiment analysis, particularly in mental health contexts. It can also support toxic language detection systems and linguistic research into patterns in mental health discourse.
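When training a model on a near-balanced corpus like this one, a common first step is a stratified train/test split, so that both parts keep the same label proportions. The sketch below is one way to do that with the standard library. The function name `stratified_split`, the 20% test fraction, and the placeholder records are assumptions for illustration, not part of the dataset or its documentation.

```python
import random
from collections import defaultdict


def stratified_split(records, test_fraction=0.2, seed=0):
    """Split (text, label) pairs so each label keeps its share in both parts."""
    by_label = defaultdict(list)
    for text, label in records:
        by_label[label].append((text, label))

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


# Synthetic placeholder records (assumption): 50 per label.
records = [(f"comment {i}", i % 2) for i in range(100)]
train, test = stratified_split(records)
```

With the placeholder data, each label contributes 10 rows to the test set and 40 to the training set. Library routines such as scikit-learn's `train_test_split` with its `stratify` argument do the same job.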
Coverage
The geographic scope of this dataset is global. It covers a wide range of text comments associated with mental health conditions such as anxiety and depression. The source does not specify a time range for the data or give any demographic breakdown.
License
CC-BY
Who Can Use It
The dataset is most useful to researchers studying mental health language, mental health professionals seeking insight into online discourse, and developers building AI models for content moderation, sentiment analysis, or support applications related to mental well-being.
Dataset Name Suggestions
- Mental Health Discourse Toxicity Dataset
- Mental Health Comments Corpus
- Toxic Mental Health Language Data
- Anxiety Depression Text Corpus
- Mental Wellness Language Dataset
Attributes
Original Data Source: Mental Health Corpus