Social Media Sentiment Analysis Data
Free
About
This dataset is designed for Twitter Sentiment Analysis, providing insights into the sentiment expressed in tweets. It categorises sentiments into three distinct labels: negative (-1), neutral (0), and positive (+1). The dataset includes two key fields: the tweet content and its corresponding sentiment label.
Columns
- clean_text: This column contains the processed text of the tweets. It features 162,980 unique text entries, all of which are valid with no mismatched or missing values.
- category: This column holds the sentiment label for each tweet: -1 (negative), 0 (neutral), or 1 (positive). There are 35,509 negative labels, 55,212 neutral labels, and 72,249 positive labels, for 162,970 labelled records in total. Only 10 entries are missing out of roughly 163,000 records, so the column is effectively 100% valid after rounding. The mean sentiment score is 0.23, with a standard deviation of 0.78; values range from -1 (minimum) to 1 (maximum).
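The summary statistics quoted for the category column can be checked directly from the three label counts. The sketch below is illustrative only: the function name `label_stats` is hypothetical, and it assumes the reported standard deviation is the population form (at this sample size the sample form rounds to the same value).

```python
import math

# Label counts as reported in the category column description.
counts = {-1: 35_509, 0: 55_212, 1: 72_249}


def label_stats(counts):
    """Return (total, mean, population std) for a {label: count} mapping."""
    total = sum(counts.values())
    mean = sum(label * n for label, n in counts.items()) / total
    variance = sum(n * (label - mean) ** 2 for label, n in counts.items()) / total
    return total, mean, math.sqrt(variance)


total, mean, std = label_stats(counts)
# Rounded to two decimals, these should agree with the listing's 0.23 and 0.78.
print(total, round(mean, 2), round(std, 2))
```

The positive mean reflects the class imbalance: positive labels outnumber negative ones roughly two to one, which is worth keeping in mind when choosing evaluation metrics.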
Distribution
The dataset is provided as a CSV file named Twitter_Data.csv, with a file size of 20.9 MB. It contains approximately 163,000 records, each with values for both tweet text and sentiment label.
Usage
This dataset is ideal for developing and evaluating machine learning models focused on sentiment analysis, particularly within the context of social media data. It can be used for tasks such as text classification, natural language processing research, and building predictive models for social media sentiment.
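As a starting point for the modelling tasks above, the file can be read with Python's standard `csv` module. This is a minimal sketch, not a reference loader: the helper names `load_labelled_tweets` and `label_counts` are hypothetical, the file is assumed to be UTF-8 with a header row naming the `clean_text` and `category` columns, and labels are assumed to be written either as integers or as floats such as `1.0`.

```python
import csv
from collections import Counter


def load_labelled_tweets(path):
    """Read (text, label) pairs, skipping rows where either field is missing."""
    rows = []
    # newline="" lets the csv module handle quoted fields that span lines.
    with open(path, newline="", encoding="utf-8") as handle:
        for record in csv.DictReader(handle):
            text = (record.get("clean_text") or "").strip()
            raw_label = (record.get("category") or "").strip()
            if not text or not raw_label:
                continue  # drop the handful of incomplete rows
            # Assumption: labels may appear as "1" or "1.0"; accept both.
            rows.append((text, int(float(raw_label))))
    return rows


def label_counts(rows):
    """Count how many rows carry each sentiment label."""
    return Counter(label for _, label in rows)
```

Calling `label_counts(load_labelled_tweets("Twitter_Data.csv"))` gives a quick check of the class balance before any train/test split is made.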
Coverage
The dataset covers general Twitter sentiment and does not specify a geographic region, time range, or demographic scope. It is a static dataset with an expected update frequency of "Never", meaning its content will not change or be augmented over time.
License
Attribution 4.0 International (CC BY 4.0)
Who Can Use It
This dataset is suitable for data scientists, machine learning engineers, researchers in natural language processing (NLP), and academic institutions. It provides a valuable resource for those looking to build sentiment analysis models, conduct textual data analysis, or explore social media trends.
Dataset Name Suggestions
- Twitter Sentiment Labels Dataset
- Social Media Sentiment Analysis Data
- Tweet Sentiment Classification Dataset
- Public Tweet Sentiment Archive
Attributes
Original Data Source: Social Media Sentiment Analysis Data