Social Media Suicide Sentiment Dataset
Social Media and Posts
Free
About
This dataset, titled Suicidal Tweet Detection, is a collection of tweets annotated to indicate the presence or absence of suicidal sentiment. Its primary purpose is to support the development and evaluation of machine learning models that classify tweets as expressing suicidal thoughts or not. It was generated internally for a Natural Language Processing (NLP) project and is intended to help in understanding and addressing mental health concerns expressed on social media. Each tweet is categorised as either a "Not Suicide post" or a "Potential Suicide post," allowing for focused analysis and model training.
Columns
- Tweet: This column contains the full text content of tweets gathered from various sources. These tweets encompass a wide array of topics, emotions, and expressions, providing a rich linguistic context for analysis.
- Suicide: This column provides the classification label for each tweet, which takes one of two values:
  - Not Suicide post: Assigned to tweets that do not express any suicidal sentiments or intentions.
  - Potential Suicide post: Assigned to tweets that exhibit indications of suicidal thoughts, feelings, or intentions, signalling distress or self-harm concerns.
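Most classifiers expect numeric targets, so the two label strings usually need to be encoded before training. The following is a minimal sketch of one way to do that; the names `LABEL_TO_ID` and `encode_label`, and the choice of 1 for the positive class, are illustrative assumptions rather than part of the dataset.

```python
# Label strings as documented for the 'Suicide' column.
NEGATIVE_LABEL = "Not Suicide post"
POSITIVE_LABEL = "Potential Suicide post"

# Assumed convention: 1 marks the class a detector is meant to flag.
LABEL_TO_ID = {NEGATIVE_LABEL: 0, POSITIVE_LABEL: 1}


def encode_label(label: str) -> int:
    """Map a 'Suicide' column value to 0 or 1, ignoring surrounding whitespace."""
    cleaned = label.strip()
    if cleaned not in LABEL_TO_ID:
        # Fail loudly instead of silently mislabelling an unexpected value.
        raise ValueError(f"Unexpected label: {label!r}")
    return LABEL_TO_ID[cleaned]
```

Raising on unknown values is a deliberate choice here: a stray label would otherwise slip unnoticed into the training data.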
Distribution
The dataset is provided as a CSV file named "Suicide_Ideation_Dataset(Twitter-based).csv," with a file size of 227.92 kB. It comprises 2 columns and 1,787 records, of which 1,785 have a valid 'Tweet' entry. The 'Tweet' column has 1,778 unique values among those 1,785 valid entries, so a small number of tweets appear more than once. The 'Suicide' column has 2 unique values across 1,787 valid entries: 'Not Suicide post' is the most common (63%), and 'Potential Suicide post' makes up the remaining 37%.
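The figures above can be checked after download with a short script. The sketch below uses only the Python standard library and assumes the header row uses the column names 'Tweet' and 'Suicide' exactly as listed; the helper name `summarise` is hypothetical, and the sample rows are placeholders invented for illustration, not records from the dataset.

```python
import csv
import io
from collections import Counter


def summarise(csv_file) -> dict:
    """Count rows, missing and unique tweets, and label shares in an open CSV."""
    rows = list(csv.DictReader(csv_file))
    tweets = [(row.get("Tweet") or "").strip() for row in rows]
    non_empty = [t for t in tweets if t]
    labels = Counter((row.get("Suicide") or "").strip() for row in rows)
    total = len(rows)
    return {
        "rows": total,
        "missing_tweets": total - len(non_empty),
        "unique_tweets": len(set(non_empty)),
        # Share of each label, rounded to two decimals.
        "label_share": {k: round(v / total, 2) for k, v in labels.items()} if total else {},
    }


# Placeholder rows invented for illustration; they are not taken from the dataset.
sample = io.StringIO(
    "Tweet,Suicide\n"
    "placeholder text a,Not Suicide post\n"
    "placeholder text b,Potential Suicide post\n"
    ",Not Suicide post\n"
    "placeholder text a,Not Suicide post\n"
)
summary = summarise(sample)
print(summary)
```

To run it against the real file, pass an open handle instead of the sample, for example `open("Suicide_Ideation_Dataset(Twitter-based).csv", newline="", encoding="utf-8")`. The UTF-8 encoding is an assumption and may need adjusting.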
Usage
This dataset is well suited to Natural Language Processing (NLP) and sentiment analysis tasks, in particular training and evaluating machine learning models that automatically classify tweets as non-suicidal or potentially suicidal. Researchers, data scientists, and developers can use it to build systems that identify and flag concerning content on social media platforms, which may support early intervention and help for individuals in distress.
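Because roughly 63% of records are labelled 'Not Suicide post', a model that always predicts that label would reach about 63% accuracy while never flagging a single concerning tweet. When evaluating a classifier on this data it is therefore worth reporting precision and recall for the 'Potential Suicide post' class as well. The sketch below shows one way to compute them; the function name `precision_recall_f1` is hypothetical and the toy labels are invented for illustration.

```python
def precision_recall_f1(y_true, y_pred, positive="Potential Suicide post"):
    """Compute precision, recall and F1 for the positive class from two label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy labels invented for illustration only.
truth = [
    "Not Suicide post",
    "Potential Suicide post",
    "Potential Suicide post",
    "Not Suicide post",
]
always_negative = ["Not Suicide post"] * len(truth)
print(precision_recall_f1(truth, always_negative))  # (0.0, 0.0, 0.0)
```

The always-negative baseline scores zero on all three metrics, which is exactly the failure that accuracy alone would hide.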
Coverage
The tweets included in this dataset cover a wide range of topics and emotions. The dataset does not provide any personal or identifying information about the users who posted the tweets. The available sources do not detail any specific geographic, time range, or demographic scope; they focus instead on the textual content and its sentiment classification.
License
CC BY-NC-SA 4.0
Who Can Use It
- Researchers: To analyse linguistic patterns and sentiment in different emotional states, particularly concerning mental health.
- Data Scientists and Developers: To create suicidal ideation detection models for automatically identifying and flagging potential suicidal content on social media.
- Social Media Platforms: To implement tools that enable appropriate actions and interventions for users in distress.
- Mental Health Organisations: To develop tools that offer mental health resources or interventions to users showing signs of distress.
- Public Health Initiatives: To raise awareness about mental health issues and promote responsible social media usage.
Dataset Name Suggestions
- Suicidal Tweet Detection Dataset
- NLP Suicidal Ideation on Twitter
- Mental Health Tweet Classifier
- Social Media Suicide Sentiment Dataset
- Annotated Tweets for Suicide Detection
Attributes
Original Data Source: Social Media Suicide Sentiment Dataset