Stock Market Tweet Sentiment Data
About
This dataset provides tweets about the stock market, collected between April and July 2020 and labelled for sentiment. The tweets relate to the SPX500 index and its top 25 constituent companies, or carry the general hashtag "#stocks". The dataset offers a collection of manually classified tweets and is intended to support sentiment analysis studies, particularly in the financial domain and in natural language processing (NLP) research on public sentiment regarding market movements.
Columns
- ID: The identifier for each tweet.
- Date and time: The timestamp indicating when the tweet was posted.
- Tweet: The actual text content written by the user in the tweet.
- Sentiment: The classification of the tweet as either positive or negative.
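As a rough illustration of working with the four columns above, the sketch below reads CSV rows and counts how many rows carry a non-empty value in each column. The header names `id`, `created_at`, `text` and `sentiment`, the comma delimiter, and the sample rows are all assumptions made for the example; check the actual file header and delimiter before relying on them.

```python
import csv
import io


def read_rows(handle, delimiter=","):
    """Read a CSV file handle into a list of dicts keyed by header name."""
    return list(csv.DictReader(handle, delimiter=delimiter))


def count_valid(rows, columns):
    """Count, per column, the rows whose value is present and non-blank."""
    counts = {col: 0 for col in columns}
    for row in rows:
        for col in columns:
            value = row.get(col)
            if value is not None and value.strip():
                counts[col] += 1
    return counts


# Invented sample rows; the header names are assumed, not taken from the file.
sample = io.StringIO(
    "id,created_at,text,sentiment\n"
    "1,2020-04-09 10:00:00,Made-up tweet about #stocks,positive\n"
    "2,2020-04-09 10:05:00,Another made-up tweet,\n"
    "3,2020-04-09 10:10:00,,\n"
)
rows = read_rows(sample)
valid = count_valid(rows, ["id", "created_at", "text", "sentiment"])
print(valid)
```

To run this on the real file, open it with `open(path, newline="", encoding="utf-8")` and pass the handle to `read_rows`, adjusting `delimiter` if the file does not use commas.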
Distribution
The dataset is supplied as a CSV file named tweets_labelled_09042020_16072020.csv, which is approximately 959.89 kB in size. It consists of four columns. Although 1300 tweets were manually classified and reviewed to establish sentiment, the column statistics for the provided file report 527 valid entries in the 'text' column and 195 valid classifications in the 'sentiment' column, with a notable number of missing values in some columns.
Usage
This dataset is ideally suited for:
- Natural Language Processing (NLP) research, particularly in sentiment analysis.
- Financial market analysis to gauge public opinion on stocks and indices.
- Developing and testing machine learning models for sentiment prediction.
- Social media analysis to track discourse around financial topics.
- Educational purposes for those learning about NLP and financial data.
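For the model-development use case above, one possible starting point is a small multinomial Naive Bayes baseline written with the standard library only. This is a sketch, not a method described by the dataset authors: the `text` and `sentiment` keys, the tokenizer, and the training sentences are all invented for illustration. Rows without a label are dropped first, since the file is reported to contain missing values.

```python
import math
import re
from collections import Counter, defaultdict

# Keeps hashtags and cashtags such as "#stocks" or "$spx" as single tokens.
TOKEN_RE = re.compile(r"[#$]?\w+")


def tokenize(text):
    return TOKEN_RE.findall(text.lower())


def labelled_examples(rows):
    """Keep only rows that have both non-blank text and a non-blank label."""
    out = []
    for row in rows:
        text = (row.get("text") or "").strip()
        label = (row.get("sentiment") or "").strip()
        if text and label:
            out.append((text, label))
    return out


def train(examples):
    """Collect per-label document counts and per-label word counts."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return {"labels": label_counts, "words": word_counts, "vocab": vocab}


def predict(model, text):
    """Return the label with the highest log-probability (add-one smoothing)."""
    total_docs = sum(model["labels"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_docs in model["labels"].items():
        score = math.log(n_docs / total_docs)
        total_words = sum(model["words"][label].values())
        for tok in tokenize(text):
            if tok not in model["vocab"]:
                continue  # ignore words never seen in training
            count = model["words"][label][tok]
            score += math.log((count + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented training sentences, not taken from the dataset.
model = train([
    ("great earnings, stock rallies", "positive"),
    ("strong gains today", "positive"),
    ("shares crash on weak outlook", "negative"),
    ("heavy losses, market drops", "negative"),
])
print(predict(model, "strong gains"))
```

With only a few hundred labelled rows available, a baseline like this is mainly useful as a sanity check to compare stronger models against.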
Coverage
The data was collected between April 9 and July 16, 2020. The content focuses on tweets using the SPX500 tag, mentions of the top 25 companies within that index, and the "#stocks" hashtag. While no explicit geographic or demographic scope is detailed, the focus on stock market discourse implies a relevance to users interested in global financial markets, with tweets likely originating from a broad, English-speaking user base.
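To check which of the collection terms described above a given tweet matches, a simple case-insensitive search can be sketched as follows. Only "#stocks" and "SPX500" come from the description; the list of top 25 companies is not given here, so any company tags are passed in through the hypothetical `extra_terms` parameter.

```python
import re


def matched_terms(text, extra_terms=()):
    """Return the collection terms found in a tweet, ignoring case.

    A term matches only as a whole token, so "#stocksnews" does not count
    as "#stocks", while "#SPX500" and "$SPX500" both count as "SPX500".
    """
    terms = ["#stocks", "SPX500", *extra_terms]
    lowered = text.lower()
    found = []
    for term in terms:
        pattern = r"(?<!\w)" + re.escape(term.lower()) + r"(?!\w)"
        if re.search(pattern, lowered):
            found.append(term)
    return found


# Invented example tweets; "$AAPL" is a hypothetical extra term.
print(matched_terms("Watching #Stocks and $SPX500 today"))
print(matched_terms("Long $AAPL into earnings", extra_terms=["$AAPL"]))
```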
License
Attribution 4.0 International (CC BY 4.0)
Who Can Use It
- Data scientists and machine learning engineers working on NLP tasks or financial predictive models.
- Academics and researchers in fields such as computational finance, linguistics, and social media studies.
- Financial analysts seeking to integrate social media sentiment into their market insights.
- Students exploring topics related to sentiment analysis, NLP, or stock market data.
Attributes
Original Data Source: Stock Market Tweet Sentiment Data