Twitter Public Sentiment Dataset
Telecommunications & Network Data
Free
About
This dataset provides a collection of 1000 tweets intended for sentiment analysis. The tweets were sourced from Twitter using Python and generated systematically with several modules to balance tweet types, user behaviours, and sentiments: the random module for IDs and text, the faker module for usernames and dates, and the textblob module for assigning sentiment. The dataset is meant as a starting point for exploring, analysing, and visualising sentiment trends and patterns.
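The description does not say how the textblob output was turned into sentiment labels. TextBlob reports a polarity score between -1.0 and 1.0, so one plausible mapping, assumed here rather than taken from the source, is to split on the sign of that score. The helper name `label_from_polarity` and the zero threshold are hypothetical.

```python
def label_from_polarity(polarity: float) -> str:
    """Map a polarity score in [-1.0, 1.0] to a sentiment label.

    Assumption: the sign of the score decides the label. The dataset's
    actual labelling rule is not documented.
    """
    if not -1.0 <= polarity <= 1.0:
        raise ValueError(f"polarity out of range: {polarity}")
    if polarity > 0:
        return "positive"
    if polarity < 0:
        return "negative"
    return "neutral"


# With TextBlob installed, the score would come from
# TextBlob(text).sentiment.polarity; fixed numbers are used here instead.
labels = [label_from_polarity(score) for score in (0.6, 0.0, -0.25)]
print(labels)
```

A stricter rule could reserve a small band around zero for "neutral"; that choice would shift the class balance and is equally an assumption.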
Columns
- Tweet ID: A unique identifier assigned to each individual tweet.
- Text: The actual textual content of the tweet.
- User: The username of the individual who posted the tweet.
- Created At: The date and time when the tweet was originally published.
- Likes: The total number of likes or approvals the tweet received.
- Retweets: The total count of times the tweet was shared by other users.
- Sentiment: The categorised emotional tone of the tweet, typically labelled as positive, neutral, or negative.
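As a rough illustration of how these columns might be read, the sketch below parses CSV text with Python's standard `csv` module. It assumes the header names match the column names listed above exactly. The two sample rows are invented placeholders, not records from the dataset, and `Created At` is kept as a string because the file's timestamp format is not documented here.

```python
import csv
import io
from dataclasses import dataclass

# Invented placeholder rows; the real file's headers and values may differ.
SAMPLE_CSV = """Tweet ID,Text,User,Created At,Likes,Retweets,Sentiment
1,Example tweet text,example_user,2023-01-15 10:30:00,12,3,positive
2,Another example,other_user,2023-03-02 18:05:00,0,1,neutral
"""


@dataclass
class Tweet:
    tweet_id: str
    text: str
    user: str
    created_at: str  # left unparsed: the timestamp format is an assumption
    likes: int
    retweets: int
    sentiment: str


def read_tweets(handle):
    """Yield one Tweet per CSV row, converting the count columns to int."""
    for row in csv.DictReader(handle):
        yield Tweet(
            tweet_id=row["Tweet ID"],
            text=row["Text"],
            user=row["User"],
            created_at=row["Created At"],
            likes=int(row["Likes"]),
            retweets=int(row["Retweets"]),
            sentiment=row["Sentiment"],
        )


tweets = list(read_tweets(io.StringIO(SAMPLE_CSV)))
print(len(tweets), tweets[0].sentiment)
```

To read a real file, the same function could be given `open("tweets.csv", newline="", encoding="utf-8")`, where `tweets.csv` is a hypothetical filename.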
Distribution
The dataset is provided in a CSV file format. It consists of 1000 individual tweet records, structured in a tabular layout with the columns detailed above. A sample file will be made available separately on the platform.
Usage
This dataset is ideal for:
- Analysing and visualising sentiment trends and patterns in social media.
- Initial data exploration to uncover insights into tweet characteristics and user emotions.
- Identifying underlying patterns or trends within social media conversations.
- Developing and training machine learning models for sentiment classification.
- Academic research into Natural Language Processing (NLP) and social media dynamics.
- Educational purposes, allowing students to practise data analysis and visualisation techniques.
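A first exploratory pass might look like the sketch below, which counts tweets per sentiment label and averages likes within each label using only the standard library. The records are invented placeholders shaped like the columns described above, and the function names are hypothetical.

```python
from collections import Counter, defaultdict

# Invented placeholder records; real rows would come from the CSV file.
records = [
    {"Sentiment": "positive", "Likes": 10},
    {"Sentiment": "negative", "Likes": 4},
    {"Sentiment": "positive", "Likes": 20},
    {"Sentiment": "neutral", "Likes": 0},
]


def sentiment_counts(rows):
    """Count how many rows carry each sentiment label."""
    return Counter(row["Sentiment"] for row in rows)


def mean_likes_by_sentiment(rows):
    """Return the average number of likes for each sentiment label."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for row in rows:
        totals[row["Sentiment"]] += row["Likes"]
        counts[row["Sentiment"]] += 1
    return {label: totals[label] / counts[label] for label in totals}


counts = sentiment_counts(records)
means = mean_likes_by_sentiment(records)
print(counts.most_common())
print(means)
```

The same grouping could be applied to the Retweets column, or the counts passed to a plotting library to visualise the label balance.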
Coverage
The dataset spans tweets created between January and April 2023, as observed from the included data samples. It contains no geographic or demographic information about users, so it has no specific regional or population focus; given the nature of Twitter, a broadly global scope can be assumed.
License
CC0
Who Can Use It
This dataset is valuable for:
- Data Scientists and Machine Learning Engineers working on NLP tasks and model development.
- Researchers in fields such as Natural Language Processing, Machine Learning, Deep Learning, and Computer Science.
- Data Analysts looking to extract insights from social media content.
- Academics and Students undertaking projects related to sentiment analysis or social media studies.
- Anyone interested in understanding online sentiment and user behaviour on social media platforms.
Dataset Name Suggestions
- Twitter Public Sentiment Dataset
- Social Media Text Sentiment Analysis
- General Tweet Mood Data
- Twitter Sentiment Collection 2023
- Microblog Sentiment Dataset
Attributes
Original Data Source: Twitter Sentiment Analysis using Roberta and Vader