Opendatabay APP

Stock Market Tweet Sentiment Data

Stock & Market Data

Tags and Keywords

Stocks

Tweets

Sentiment

NLP

Finance

Free

About

This dataset contains tweets about the stock market, collected between April and July 2020 and labelled for sentiment. It covers tweets tagged with the SPX500 index, tweets mentioning the index's top 25 constituent companies, and tweets using the general hashtag "#stocks". It offers a collection of manually classified tweets intended to support sentiment analysis studies in the financial domain and natural language processing (NLP) research into public sentiment regarding market movements.

Columns

  • ID: The identifier for each tweet.
  • Date and time: The timestamp indicating when the tweet was posted.
  • Tweet: The actual text content written by the user in the tweet.
  • Sentiment: The classification of the tweet as either positive or negative.

Distribution

The dataset is supplied as a single CSV file, tweets_labelled_09042020_16072020.csv, approximately 959.89 kB in size, with four columns. Although 1,300 tweets were manually classified and reviewed to establish sentiment, the listing's column statistics report only 527 valid entries in the 'text' column and 195 valid classifications in the 'sentiment' column (these appear to correspond to the Tweet and Sentiment columns above). Expect a notable number of missing values in some columns.
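Because the listing reports different counts of valid entries per column, it can help to measure completeness before using the file. The following is a minimal sketch using only the Python standard library. The header names "id" and "created_at", the comma delimiter, and the sample rows are assumptions made up for illustration; only 'text' and 'sentiment' are named in the listing.

```python
import csv
import io


def count_valid_entries(csv_file, delimiter=","):
    """Return (total_rows, {column: number of non-empty values})."""
    reader = csv.DictReader(csv_file, delimiter=delimiter)
    counts = {name: 0 for name in (reader.fieldnames or [])}
    total = 0
    for row in reader:
        total += 1
        for name in counts:
            value = row.get(name)  # None when the row is short
            if value is not None and value.strip():
                counts[name] += 1
    return total, counts


# Made-up sample rows, for illustration only (not taken from the dataset).
SAMPLE = """id,created_at,text,sentiment
1,2020-04-09 10:00:00,Markets look strong today,positive
2,2020-04-09 10:05:00,Not sure where this is heading,
3,2020-04-09 10:07:00,,
"""

total, counts = count_valid_entries(io.StringIO(SAMPLE))
print(total, counts)
```

To run the same check on the real file, pass a handle from open("tweets_labelled_09042020_16072020.csv", newline="", encoding="utf-8") and adjust the delimiter if needed. The encoding and delimiter are not stated in the listing, so both are guesses.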

Usage

This dataset is ideally suited for:
  • Natural Language Processing (NLP) research, particularly in sentiment analysis.
  • Financial market analysis to gauge public opinion on stocks and indices.
  • Developing and testing machine learning models for sentiment prediction.
  • Social media analysis to track discourse around financial topics.
  • Educational purposes for those learning about NLP and financial data.
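As a rough illustration of the sentiment-prediction use case, here is a small baseline sketch: a multinomial Naive Bayes classifier with add-one smoothing, written in plain Python. It is not a method described by the listing, and the training sentences are invented for the example. In practice the (text, label) pairs would come from the labelled rows of the CSV.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split into word-like tokens, keeping $ and # prefixes."""
    return re.findall(r"[$#]?[a-z0-9']+", text.lower())


def train_naive_bayes(examples):
    """Count documents per label and tokens per label from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for token in tokenize(text):
            word_counts[label][token] += 1
            vocab.add(token)
    return {"labels": label_counts, "words": word_counts, "vocab": vocab}


def predict(model, text):
    """Return the label with the highest log-posterior (add-one smoothing)."""
    total_docs = sum(model["labels"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, doc_count in model["labels"].items():
        score = math.log(doc_count / total_docs)
        label_total = sum(model["words"][label].values())
        for token in tokenize(text):
            if token not in model["vocab"]:
                continue  # ignore words never seen in training
            count = model["words"][label][token]
            score += math.log((count + 1) / (label_total + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up training examples, for illustration only.
TRAIN = [
    ("great earnings, stock is soaring", "positive"),
    ("strong rally, buying more $AAPL", "positive"),
    ("terrible guidance, stock is crashing", "negative"),
    ("weak demand, selling my shares", "negative"),
]

model = train_naive_bayes(TRAIN)
print(predict(model, "strong earnings and a great rally"))
print(predict(model, "weak guidance, shares crashing"))
```

A bag-of-words baseline like this is mainly useful as a reference point. Given the small number of labelled rows reported for the file, any accuracy estimate should be made on held-out data and treated with caution.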

Coverage

The data was collected between April 9 and July 16, 2020. The content focuses on tweets using the SPX500 tag, mentions of the top 25 companies within that index, and the "#stocks" hashtag. No geographic or demographic scope is specified. Given the subject matter, the tweets likely come from a broad, English-speaking user base interested in global financial markets.
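The stated collection window can be used as a sanity check on the timestamp column. The sketch below assumes the timestamps are ISO-8601 strings that datetime.fromisoformat can parse. The listing does not specify the timestamp format, so this is a guess, and the example values are invented.

```python
from datetime import date, datetime

# Collection window as stated in the listing.
COLLECTION_START = date(2020, 4, 9)
COLLECTION_END = date(2020, 7, 16)


def in_collection_window(timestamp):
    """True if an ISO-8601 timestamp string falls within the stated window."""
    posted = datetime.fromisoformat(timestamp).date()
    return COLLECTION_START <= posted <= COLLECTION_END


# Made-up timestamps, for illustration only.
print(in_collection_window("2020-04-09 23:59:51+00:00"))
print(in_collection_window("2020-07-17 00:00:00"))
```

Rows that fall outside the window, or that fail to parse, are worth inspecting by hand before any analysis.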

License

Attribution 4.0 International (CC BY 4.0)

Who Can Use It

  • Data scientists and machine learning engineers working on NLP tasks or financial predictive models.
  • Academics and researchers in fields such as computational finance, linguistics, and social media studies.
  • Financial analysts seeking to integrate social media sentiment into their market insights.
  • Students exploring topics related to sentiment analysis, NLP, or stock market data.

Dataset Name Suggestions

  • Stock Market Tweet Sentiment Data
  • Twitter Stock Sentiment Lexicon
  • Financial Tweet Sentiment Analysis Dataset
  • SPX500 & Stock Tweets Sentiment
  • Social Media Financial Sentiment

Attributes

Original Data Source: Stock Market Tweet Sentiment Data

Listing Stats

  • Views: 0
  • Downloads: 0
  • Listed: 22/08/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
