Opendatabay APP

Apple Stock Discussions on Reddit

Finance & Banking Analytics

Tags and Keywords

Business

Finance

Investing

NLP

Free

About

This dataset explores the impact of public opinion on the valuation of Apple Inc. (AAPL), one of the world's largest technology companies. It comprises Reddit posts and comments that mention AAPL, each labelled with its score. The aim is to investigate whether the collective "wisdom of crowds" can predict a company's performance and how public sentiment can influence its market value.

Columns

The dataset is split into two primary files, five-years-of-aapl-on-reddit-comments.csv and five-years-of-aapl-on-reddit-posts.csv, each containing distinct and shared columns:
Common Columns:
  • type: The type of Reddit entry, either a post or a comment. (String)
  • subreddit.name: The name of the subreddit where the entry was made. (String)
  • subreddit.nsfw: Indicates whether the subreddit is classified as Not Safe For Work. (Boolean)
  • created_utc: The Unix timestamp indicating when the post or comment was created. (Integer)
  • permalink: The direct link to the post or comment on Reddit. (String)
  • score: The score (upvotes minus downvotes) of the post or comment. (Integer)
Columns Unique to five-years-of-aapl-on-reddit-comments.csv:
  • body: The full text content of the comment. (String)
  • sentiment: The classified sentiment (e.g., positive, negative, neutral) of the comment. (String)
Columns Unique to five-years-of-aapl-on-reddit-posts.csv:
  • domain: The domain of the link shared in the post, if applicable. (String)
  • url: The URL linked within the post, if applicable. (String)
  • selftext: The self-text content of the post. (String)
  • title: The title of the Reddit post. (String)
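The column types above can be applied when reading the files. The following is a minimal sketch using only the Python standard library. The sample rows and permalinks are invented for illustration, and it assumes that subreddit.nsfw is serialised as the text "true" or "false", which the listing does not confirm. The helper names parse_row and load_rows are hypothetical.

```python
import csv
import io
from datetime import datetime, timezone

# Tiny in-memory stand-in for five-years-of-aapl-on-reddit-comments.csv.
# The rows are invented; only the column names come from the listing.
SAMPLE_COMMENTS_CSV = """type,subreddit.name,subreddit.nsfw,created_utc,permalink,score,body,sentiment
comment,stocks,false,1609459200,https://example.invalid/a,12,AAPL looks strong,positive
comment,investing,false,1609545600,https://example.invalid/b,-3,Not convinced by AAPL,negative
"""


def parse_row(row):
    """Convert the string fields of one CSV row to the listed column types."""
    parsed = dict(row)
    # Assumption: booleans are stored as the text "true" / "false".
    parsed["subreddit.nsfw"] = row["subreddit.nsfw"].strip().lower() == "true"
    parsed["score"] = int(row["score"])
    parsed["created_utc"] = int(row["created_utc"])
    # created_utc is a Unix timestamp, so interpret it explicitly as UTC.
    parsed["created_at"] = datetime.fromtimestamp(
        parsed["created_utc"], tz=timezone.utc
    )
    return parsed


def load_rows(file_obj):
    """Read every row from an open CSV file object and parse its fields."""
    return [parse_row(row) for row in csv.DictReader(file_obj)]


comments = load_rows(io.StringIO(SAMPLE_COMMENTS_CSV))
for comment in comments:
    print(
        comment["created_at"].date(),
        comment["subreddit.name"],
        comment["score"],
        comment["sentiment"],
    )
```

To read the real file instead of the sample, pass an open file handle, for example open("five-years-of-aapl-on-reddit-comments.csv", newline="", encoding="utf-8"), to load_rows.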

Distribution

The data is provided in CSV format, split into two files: one for Reddit comments and one for Reddit posts. Together the files contain approximately 297,000 individual records, a substantial volume of public discourse.

Usage

This dataset is ideal for various analytical applications:
  • Correlation Analysis: Examine the relationship between changes in public opinion expressed on Reddit and AAPL's stock price fluctuations.
  • Subreddit Sentiment Analysis: Identify which subreddits tend to exhibit a bullish or bearish outlook on AAPL.
  • Sentiment Analysis: Utilise the body text of comments and the title and selftext of posts to perform detailed sentiment analysis on public perceptions of Apple.
  • Market Prediction: Explore the potential of using social media sentiment to predict future company performance.
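The Subreddit Sentiment Analysis and Correlation Analysis ideas above could be approached along the following lines. This is a sketch under stated assumptions, not a recommended methodology. It assumes the sentiment column holds the labels positive, neutral and negative and maps them to 1, 0 and -1. The sample rows, the daily sentiment values and the daily returns are invented, and price data is not part of this dataset and would have to come from elsewhere. The function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Assumed label set; adjust if the sentiment column uses other values.
LABEL_TO_VALUE = {"positive": 1, "neutral": 0, "negative": -1}


def net_sentiment_by_subreddit(rows):
    """Average the mapped sentiment value of comments in each subreddit.

    Values above 0 lean bullish and values below 0 lean bearish.
    Rows with an unrecognised label are skipped.
    """
    values = defaultdict(list)
    for row in rows:
        label = row["sentiment"].strip().lower()
        if label in LABEL_TO_VALUE:
            values[row["subreddit.name"]].append(LABEL_TO_VALUE[label])
    return {name: mean(vals) for name, vals in values.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = mean(xs), mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (var_x * var_y) ** 0.5


# Invented sample rows using the listing's column names.
sample_rows = [
    {"subreddit.name": "stocks", "sentiment": "positive"},
    {"subreddit.name": "stocks", "sentiment": "neutral"},
    {"subreddit.name": "investing", "sentiment": "negative"},
]
print(net_sentiment_by_subreddit(sample_rows))

# Invented daily series: mean sentiment per day and same-day AAPL return.
daily_sentiment = [0.2, -0.1, 0.4, 0.0]
daily_return = [0.010, -0.004, 0.015, 0.001]
print(pearson(daily_sentiment, daily_return))
```

A correlation computed this way says nothing about causation or predictive power. For the Market Prediction use case, the sentiment series would need to be lagged against later returns and evaluated on held-out data.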

Coverage

The dataset focuses on public opinion surrounding Apple Inc. (AAPL) as expressed on Reddit. The data spans approximately five years, with timestamps ranging from late 2016 to late 2021, and is globally relevant. Earlier descriptions of the data mentioned a timeframe from 2005 to 2010, which does not match this timestamp range; check the created_utc values directly before relying on either range.
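Since two different timeframes have been quoted for this data, the actual coverage can be confirmed from the created_utc column. The sketch below uses invented timestamps, and coverage_range is a hypothetical helper name.

```python
from datetime import datetime, timezone


def coverage_range(timestamps):
    """Return the earliest and latest Unix timestamps as UTC datetimes."""
    values = [int(t) for t in timestamps]
    if not values:
        raise ValueError("no timestamps supplied")
    earliest = datetime.fromtimestamp(min(values), tz=timezone.utc)
    latest = datetime.fromtimestamp(max(values), tz=timezone.utc)
    return earliest, latest


# Invented created_utc values, given as strings as they would be read from CSV.
sample_created_utc = ["1609459200", "1480550400", "1635724800"]
start, end = coverage_range(sample_created_utc)
print(start.date(), "to", end.date())
```

Running the same function over the created_utc values of both files gives the true span of the data.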

License

CC0

Who Can Use It

  • Financial Analysts: To gauge public sentiment for investment strategies.
  • Data Scientists: For natural language processing (NLP) tasks and predictive modelling using social media data.
  • Academics and Researchers: To study the "wisdom of crowds" and the influence of online public opinion on corporate valuation.
  • Market Researchers: To understand consumer and investor perception of a major technology company.

Dataset Name Suggestions

  • Reddit Sentiment for Apple (AAPL)
  • AAPL Social Media Opinion Data
  • Apple Stock Discussions on Reddit
  • Public Sentiment on Apple Inc. (Reddit)

Attributes

Original Data Source: AAPL on Reddit

Listing Stats

Views: 0
Downloads: 0
Listed: 26/06/2025
Region: Global
Universal Data Quality Score (UDQS): 5 / 5
Version: 1.0
