
Reddit WallStreetBets & SPY Market Sentiment

Finance & Banking Analytics

Tags and Keywords

Finance

Investing

Intermediate

NLP



Free

About

This dataset offers insight into the influence of the Reddit community, particularly WallStreetBets, on the stock market. It combines daily discussion threads scraped from Reddit and processed with Natural Language Processing (NLP) with financial data from the Yahoo Finance API. The aim is to let users gauge market sentiment, predict the movement of the SPY exchange-traded fund, and develop trading systems that draw on social media signals. The data is useful for studying how collective online sentiment can precede and potentially affect market trends.

Columns

  • index: A unique identifier for each record.
  • Date: The specific trading day for which the data is recorded.
  • id: The unique Reddit ID for each discussion thread.
  • title: The title of each daily discussion thread on Reddit.
  • url: The URL link where the original discussion thread can be found.
  • comments: A list containing all comments scraped from the discussion thread for that day.
  • most_hapenning_tickers_of_the_day: Identifies the most frequently discussed stock tickers on a given day.
  • SPY_close: The adjusted closing price of the 'SPY' ticker for the respective trading day.
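As a rough illustration of how these columns might be read, the sketch below parses a CSV with the column names listed above using only the Python standard library. The sample rows are invented, and the assumption that `comments` and `most_hapenning_tickers_of_the_day` are stored as Python-style list literals is a guess about the file layout, not something the listing confirms. The helper names `parse_list` and `load_rows` are hypothetical.

```python
import ast
import csv
import io
from datetime import date

# Tiny in-memory stand-in for the real CSV; these rows are invented for illustration.
SAMPLE_CSV = """index,Date,id,title,url,comments,most_hapenning_tickers_of_the_day,SPY_close
0,2018-01-02,abc123,Daily Discussion Thread,https://example.com/abc123,"['SPY calls printing', 'bearish on tech']","['SPY', 'AMD']",250.5
1,2018-01-03,def456,Daily Discussion Thread,https://example.com/def456,"['buy the dip']","['SPY']",252.0
"""


def parse_list(cell):
    """Parse a cell assumed to hold a list literal; return [] if it does not."""
    try:
        value = ast.literal_eval(cell)
    except (ValueError, SyntaxError):
        return []
    return list(value) if isinstance(value, (list, tuple)) else []


def load_rows(handle):
    """Convert each CSV record into a dict with typed fields."""
    rows = []
    for raw in csv.DictReader(handle):
        rows.append({
            "date": date.fromisoformat(raw["Date"]),
            "id": raw["id"],
            "title": raw["title"],
            "url": raw["url"],
            "comments": parse_list(raw["comments"]),
            "tickers": parse_list(raw["most_hapenning_tickers_of_the_day"]),
            "spy_close": float(raw["SPY_close"]),
        })
    return rows


rows = load_rows(io.StringIO(SAMPLE_CSV))
print(rows[0]["date"], len(rows[0]["comments"]), rows[0]["tickers"])
```

To read the downloaded file instead, `load_rows` could be given an open file handle in place of the `io.StringIO` sample. If the list columns turn out to use a different encoding, `parse_list` is the only piece that would need to change.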

Distribution

The dataset is typically provided as a CSV file. It contains daily discussion threads and associated financial data covering the period from 2018-01-02 to 2022-03-11. The total row count is not specified.

Usage

This dataset is ideally suited for:
  • Performing sentiment analysis on social media data to understand public mood towards the stock market.
  • Developing and back-testing trading systems that incorporate social media sentiment as a predictive factor.
  • Predicting the price movements of the SPY ETF based on Reddit discussions.
  • Gauging the sentiment of the markets before making trading decisions.
  • Researching the correlation between social media activity and stock market performance.
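The use cases above can be sketched end to end: score each day's comments, compute the next-day change in `SPY_close`, and measure how the two move together. This is a minimal sketch under stated assumptions, not a recommended method. The word lists are a hypothetical toy lexicon standing in for a real sentiment model, the sample days and prices are invented, and all function names are illustrative.

```python
import math

# Hypothetical mini-lexicon for illustration only; a real project would
# replace this with a proper sentiment model.
POSITIVE = {"bullish", "calls", "moon", "buy", "green"}
NEGATIVE = {"bearish", "puts", "crash", "sell", "red"}


def comment_score(text):
    """Positive word hits minus negative word hits for one comment."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def daily_sentiment(comments):
    """Average comment score for one day; 0.0 when there are no comments."""
    if not comments:
        return 0.0
    return sum(comment_score(c) for c in comments) / len(comments)


def next_day_returns(closes):
    """Simple return from each day's close to the following day's close."""
    return [(b - a) / a for a, b in zip(closes, closes[1:])]


def pearson(xs, ys):
    """Pearson correlation coefficient; NaN if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else float("nan")


# Invented (comments, SPY_close) pairs, one per trading day.
days = [
    (["bullish calls", "buy buy"], 100.0),
    (["bearish puts", "crash incoming"], 102.0),
    (["green day", "sell"], 99.0),
    (["moon"], 100.0),
]

sentiment = [daily_sentiment(comments) for comments, _ in days]
returns = next_day_returns([close for _, close in days])

# Pair day t's sentiment with the return from day t to day t+1, so the
# signal only uses information available before the move it is compared to.
corr = pearson(sentiment[:-1], returns)
print(f"sentiment vs next-day return correlation: {corr:.3f}")
```

Aligning each day's sentiment with the following day's return, rather than the same day's, avoids look-ahead when the result feeds a back-test. A correlation on a handful of invented rows says nothing about the real data; the point is only the shape of the pipeline.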

Coverage

The data encompasses discussions from the WallStreetBets community on Reddit, providing insights into sentiment primarily related to the US stock market, particularly the SPY ETF. The time range covered is from 2018-01-02 to 2022-03-11, offering several years of daily data. The geographic scope is global, reflecting the international reach of Reddit users and the global relevance of US market sentiment.

License

CC0

Who Can Use It

  • Financial Analysts and Researchers: To study market sentiment and its impact on stock performance.
  • Quantitative Traders: To develop algorithmic trading strategies leveraging social media insights.
  • Data Scientists: For projects involving NLP, time-series analysis, and predictive modelling in finance.
  • Academics: For research into behavioural economics, financial sociology, and the democratisation of market information.
  • Individual Investors: To gain additional insights and gauge market sentiment before making investment decisions.

Dataset Name Suggestions

  • Reddit WallStreetBets & SPY Market Sentiment
  • Daily Reddit Financial Discussions (2018-2022)
  • Social Media Impact on SPY Data
  • WallStreetBets Sentiment for Stock Prediction
  • Reddit Stock Market Pulse (SPY Focus)

Attributes

Listing Stats

VIEWS

0

DOWNLOADS

0

LISTED

27/06/2025

REGION

GLOBAL

UDQS (UNIVERSAL DATA QUALITY SCORE)

5 / 5

VERSION

1.0
