
WallStreetBets Market Sentiment Data

Stock & Market Data

Tags and Keywords

Reddit, Stocks, Sentiment, Finance, Wallstreetbets

Free

About

This dataset contains posts and comments collected from the r/wallstreetbets subreddit, primarily from 2022. It supports analysing trends that originate in the Reddit community, running sentiment analysis on user-generated content, and extracting the key topics discussed in the subreddit.

Columns

  • title: The title of a post.
  • score: The score (upvotes minus downvotes) of a post or comment, with values ranging from -152 to 105k.
  • id: A unique identifier for each post or comment, with approximately 1.1 million unique values.
  • url: The URL associated with a post; approximately 91% of values are null, with around 95.2k unique URLs.
  • comms_num: The number of comments a post has received, ranging from 0 to approximately 39.9k.
  • created: A timestamp indicating when the post or comment was created, represented as a Unix timestamp.
  • body: The main body content of a post or comment; approximately 4% of values are null or contain image emotes.
  • timestamp: A datetime representation of when the post or comment was created, ranging from 27th January 2021 to 28th March 2025.
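
As an illustration of how these columns might be read, the following is a minimal sketch using only the Python standard library. It assumes that empty cells represent null values and that `created` holds seconds since the Unix epoch (possibly written as a float); neither detail is confirmed by the listing. The helper names `none_if_empty`, `parse_row` and `read_rows` are hypothetical, and the two sample rows are synthetic.

```python
import csv
import io
from datetime import datetime, timezone


def none_if_empty(value):
    """Treat empty CSV cells as null (an assumption about the file)."""
    return None if value is None or value == "" else value


def parse_row(row):
    """Convert one raw CSV row (a dict of strings) into typed values."""
    score = none_if_empty(row.get("score"))
    comms_num = none_if_empty(row.get("comms_num"))
    created = none_if_empty(row.get("created"))
    return {
        "id": none_if_empty(row.get("id")),
        "title": none_if_empty(row.get("title")),
        "body": none_if_empty(row.get("body")),
        "url": none_if_empty(row.get("url")),
        # int(float(...)) tolerates values written as "12" or "12.0".
        "score": int(float(score)) if score is not None else None,
        "comms_num": int(float(comms_num)) if comms_num is not None else None,
        # Assumes seconds since the Unix epoch; converted to an aware UTC datetime.
        "created": (
            datetime.fromtimestamp(float(created), tz=timezone.utc)
            if created is not None
            else None
        ),
    }


def read_rows(handle):
    """Yield typed rows from an open CSV file or file-like object."""
    for row in csv.DictReader(handle):
        yield parse_row(row)


# Synthetic two-row sample for illustration only; not real dataset content.
SAMPLE = (
    "title,score,id,url,comms_num,created,body,timestamp\n"
    "Example post,12,abc123,,3,1611705600.0,Example body,2021-01-27 00:00:00\n"
    "Another post,-2,def456,,0,1640995200.0,,2022-01-01 00:00:00\n"
)

rows = list(read_rows(io.StringIO(SAMPLE)))
print(rows[0]["created"].isoformat(), rows[0]["score"], rows[1]["body"])
```

Because `read_rows` is a generator, the same function can be pointed at the full file without holding every record in memory.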

Distribution

The dataset is provided as a CSV file, named wallstreetbets_2022.csv, with a size of 221.47 MB. It consists of 8 columns and approximately 1.1 million records. The data is collected and merged daily.
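
Given the file size, it may be convenient to check the column set and record count by streaming rather than loading everything at once. The sketch below is one way to do that with the standard library. The column names come from the Columns section above, but their order in the file is an assumption, so the check compares them as a set. `summarise_csv` and `EXPECTED_COLUMNS` are hypothetical names, and the sample is synthetic.

```python
import csv
import io

# Column names as listed in the Columns section; file order is not confirmed.
EXPECTED_COLUMNS = {
    "title", "score", "id", "url", "comms_num", "created", "body", "timestamp",
}


def summarise_csv(handle):
    """Return (header, record_count) while reading one record at a time."""
    reader = csv.reader(handle)
    header = next(reader, [])
    # csv.reader keeps quoted multi-line fields together, so this counts
    # records rather than physical lines.
    record_count = sum(1 for _ in reader)
    return header, record_count


# Synthetic sample for illustration only; the body spans two physical lines.
SAMPLE = (
    "title,score,id,url,comms_num,created,body,timestamp\n"
    'Example post,1,abc123,,0,1640995200,"line one\nline two",2022-01-01 00:00:00\n'
)

header, record_count = summarise_csv(io.StringIO(SAMPLE))
print(set(header) == EXPECTED_COLUMNS, len(header), record_count)
```

To run this on the real file, it could be opened with `open("wallstreetbets_2022.csv", newline="", encoding="utf-8")`; the UTF-8 encoding is an assumption, and `newline=""` is what the `csv` module recommends so that line breaks inside quoted fields are preserved.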

Usage

This dataset is ideal for various applications, including:
  • Understanding Market Trends: Analyse the collective sentiment and discussions to identify emerging trends among the "Reddit educated crowd".
  • Sentiment Analysis: Perform sentiment analysis on posts and comments to gauge public mood towards specific stocks or market events.
  • Topic Modelling: Extract significant topics from the extensive text data to uncover key areas of interest and discussion.
  • Social Network Analysis: Investigate interactions and influence within the WallStreetBets community.
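
To make the sentiment use case concrete, here is a deliberately simple lexicon-based sketch that averages a per-text score for each cashtag (for example `$GME`) mentioned in a post or comment. The word lists are a toy, hypothetical lexicon invented for illustration, and the cashtag pattern is an assumption about how tickers appear in the text; a real analysis would more likely use a trained sentiment model and a proper ticker list.

```python
import re
from collections import defaultdict

# Hypothetical toy lexicon, for illustration only; not part of the dataset.
POSITIVE = {"moon", "bullish", "calls", "buy", "gain"}
NEGATIVE = {"crash", "bearish", "puts", "sell", "loss"}

# Assumes tickers are written as cashtags: "$" followed by 1 to 5 capitals.
CASHTAG = re.compile(r"\$([A-Z]{1,5})\b")
WORD = re.compile(r"[a-z']+")


def lexicon_score(text):
    """Positive-word count minus negative-word count for one text."""
    words = WORD.findall(text.lower())
    positives = sum(1 for word in words if word in POSITIVE)
    negatives = sum(1 for word in words if word in NEGATIVE)
    return positives - negatives


def sentiment_by_ticker(texts):
    """Average lexicon score of the texts that mention each cashtag."""
    scores = defaultdict(list)
    for text in texts:
        if not text:  # skip null or empty bodies
            continue
        score = lexicon_score(text)
        for ticker in set(CASHTAG.findall(text)):
            scores[ticker].append(score)
    return {ticker: sum(vals) / len(vals) for ticker, vals in scores.items()}


# Synthetic texts for illustration only.
texts = ["$GME to the moon, buy calls", "$GME will crash", None, "$AMC bearish puts"]
print(sentiment_by_ticker(texts))
```

The `texts` argument could be fed from the `title` and `body` columns; null bodies are skipped rather than scored as neutral.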

Coverage

This dataset primarily covers content posted to the r/wallstreetbets subreddit in 2022. However, the timestamps in the data span a wider collection period, from 27th January 2021 to 28th March 2025. All records are contributions from Reddit users within this one subreddit. The dataset is updated daily.
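
Since the timestamps extend beyond 2022, an analysis limited to that year needs an explicit filter. The sketch below counts records per calendar year and keeps only one year, assuming `created` is in seconds since the Unix epoch and that years are taken in UTC. `year_of`, `records_per_year` and `only_year` are hypothetical helper names.

```python
from collections import Counter
from datetime import datetime, timezone


def year_of(created):
    """Calendar year (UTC) of a Unix timestamp given as a string or number."""
    return datetime.fromtimestamp(float(created), tz=timezone.utc).year


def records_per_year(created_values):
    """Count records per year, ignoring null or empty timestamps."""
    return Counter(
        year_of(value) for value in created_values if value not in (None, "")
    )


def only_year(rows, year):
    """Keep the rows (dicts with a 'created' key) that fall in the given year."""
    return [
        row
        for row in rows
        if row.get("created") not in (None, "") and year_of(row["created"]) == year
    ]


# Synthetic timestamps for illustration only.
created_values = ["1611705600", "1640995200.0", "1672531199", "", "1672531200"]
print(records_per_year(created_values))
```

Running `records_per_year` over the whole file first is a quick way to see how much of the data actually falls outside 2022 before filtering.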

License

CC0: Public Domain

Who Can Use It

This dataset is suitable for:
  • Financial Analysts and Researchers: To study social media influence on market behaviour and investment trends.
  • Data Scientists and NLP Practitioners: For developing and testing sentiment analysis models, topic modelling algorithms, and text classification systems.
  • Academics: Conducting research on online communities, collective intelligence, and financial phenomena driven by social media.
  • Market Strategists: Gaining insights into retail investor sentiment and potential market movements.

Dataset Name Suggestions

  • WallStreetBets 2022 Posts and Comments
  • Reddit WallStreetBets Daily Feed
  • WSB Community Discussions (2021-2025)
  • WallStreetBets Market Sentiment Data

Attributes

Listing Stats

  • Views: 0
  • Downloads: 0
  • Listed: 26/08/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
