Apple Stock Discussions on Reddit
Finance & Banking Analytics
Free
About
This dataset explores the impact of public opinion on the valuation of Apple Inc. (AAPL), one of the world's largest technology companies. It comprises a collection of Reddit posts and comments that mention AAPL, each labelled with its score. The aim is to investigate whether the collective "wisdom of crowds" can predict a company's performance and how public sentiment can influence its market value.
Columns
The dataset is split into two primary files, five-years-of-aapl-on-reddit-comments.csv and five-years-of-aapl-on-reddit-posts.csv, each containing distinct and shared columns:
Common Columns:
- type: The type of Reddit entry, either a post or a comment. (String)
- subreddit.name: The name of the subreddit where the entry was made. (String)
- subreddit.nsfw: Indicates whether the subreddit is classified as Not Safe For Work. (Boolean)
- created_utc: The Unix timestamp indicating when the post or comment was created. (Integer)
- permalink: The direct link to the post or comment on Reddit. (String)
- score: The score (upvotes minus downvotes) of the post or comment. (Integer)
Columns Unique to five-years-of-aapl-on-reddit-comments.csv:
- body: The full text content of the comment. (String)
- sentiment: The classified sentiment (e.g., positive, negative, neutral) of the comment. (String)
Columns Unique to five-years-of-aapl-on-reddit-posts.csv:
- domain: The domain of the link shared in the post, if applicable. (String)
- url: The URL linked within the post, if applicable. (String)
- selftext: The self-text content of the post. (String)
- title: The title of the Reddit post. (String)
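As a minimal sketch of how the two files could be read and stacked on their shared columns, the Python snippet below uses pandas. The helper name load_entries, the derived created_at column, and the inline sample rows are illustrative assumptions and not part of the dataset; in practice the paths of the two CSV files would be passed instead of the in-memory samples.

```python
import io

import pandas as pd

# Columns listed above as common to both files.
COMMON_COLUMNS = [
    "type", "subreddit.name", "subreddit.nsfw",
    "created_utc", "permalink", "score",
]


def load_entries(comments_src, posts_src):
    """Read both CSVs and stack them into one frame (hypothetical helper)."""
    comments = pd.read_csv(comments_src)
    posts = pd.read_csv(posts_src)
    # Columns unique to one file are filled with NaN for rows from the other.
    combined = pd.concat([comments, posts], ignore_index=True, sort=False)
    # created_utc holds Unix seconds; derive a timezone-aware timestamp.
    combined["created_at"] = pd.to_datetime(
        combined["created_utc"], unit="s", utc=True
    )
    return combined


# Placeholder rows for illustration only; these are NOT real dataset records.
sample_comments = io.StringIO(
    "type,subreddit.name,subreddit.nsfw,created_utc,permalink,score,body,sentiment\n"
    "comment,stocks,false,1609459200,/r/stocks/example1,5,Example comment text,positive\n"
)
sample_posts = io.StringIO(
    "type,subreddit.name,subreddit.nsfw,created_utc,permalink,score,domain,url,selftext,title\n"
    "post,investing,false,1609545600,/r/investing/example2,12,self.investing,,Example self-text,Example title\n"
)

entries = load_entries(sample_comments, sample_posts)
print(entries[COMMON_COLUMNS + ["created_at"]])
```

Keeping both entry types in one frame makes it easy to filter on the type column later, while the file-specific columns (body and sentiment for comments; domain, url, selftext and title for posts) stay available where they apply.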
Distribution
The data is provided in CSV format, structured as two separate files: one for Reddit comments and one for posts. Together, the files contain approximately 297,000 individual records.
Usage
This dataset is ideal for various analytical applications:
- Correlation Analysis: Examine the relationship between changes in public opinion expressed on Reddit and AAPL's stock price fluctuations.
- Subreddit Sentiment Analysis: Identify which subreddits tend to exhibit a bullish or bearish outlook on AAPL.
- Sentiment Analysis: Use the body text of comments and the title and selftext of posts to perform detailed sentiment analysis on public perceptions of Apple.
- Market Prediction: Explore the potential of using social media sentiment to predict future company performance.
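The subreddit and correlation analyses listed above could be sketched roughly as follows in Python with pandas. Several things here are assumptions: the sentiment column is taken to hold the labels "positive", "neutral" and "negative" (mapped to 1, 0 and -1), the function names are hypothetical, and the closing-price series is not part of this dataset and would have to come from a separate source. All sample values are placeholders, not real records or prices.

```python
import pandas as pd

# Assumed label-to-number mapping; adjust to the labels actually present.
SENTIMENT_VALUE = {"positive": 1, "neutral": 0, "negative": -1}


def subreddit_outlook(comments):
    """Mean sentiment per subreddit, most bullish first."""
    scored = comments.assign(
        sentiment_value=comments["sentiment"].map(SENTIMENT_VALUE)
    )
    by_sub = scored.groupby("subreddit.name")["sentiment_value"].mean()
    return by_sub.sort_values(ascending=False)


def daily_sentiment(comments):
    """Mean sentiment per UTC calendar day."""
    days = pd.to_datetime(comments["created_utc"], unit="s", utc=True).dt.floor("D")
    values = comments["sentiment"].map(SENTIMENT_VALUE)
    return values.groupby(days).mean()


def sentiment_return_correlation(daily, close):
    """Pearson correlation of daily sentiment with same-day price return."""
    returns = close.pct_change()
    aligned = pd.concat(
        [daily.rename("sentiment"), returns.rename("ret")],
        axis=1, join="inner",
    ).dropna()
    return aligned["sentiment"].corr(aligned["ret"])


# Placeholder comments (illustrative only).
day0 = 1609459200  # 2021-01-01 00:00:00 UTC
comments = pd.DataFrame({
    "subreddit.name": ["stocks", "investing", "stocks"],
    "created_utc": [day0 + 86400, day0 + 2 * 86400, day0 + 3 * 86400],
    "sentiment": ["positive", "negative", "positive"],
})

# Hypothetical closing prices indexed by UTC day (external data, made up here).
close = pd.Series(
    [100.0, 110.0, 99.0, 108.9],
    index=pd.to_datetime([day0 + 86400 * k for k in range(4)], unit="s", utc=True),
)

outlook = subreddit_outlook(comments)
corr = sentiment_return_correlation(daily_sentiment(comments), close)
print(outlook)
print(corr)
```

A same-day correlation says nothing about direction of influence; for the market prediction use case, one option is to shift the sentiment series by one or more days before aligning it with returns.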
Coverage
The dataset focuses on public opinion surrounding Apple Inc. (AAPL) as expressed on Reddit. The data spans approximately five years, with timestamps ranging from late 2016 to late 2021, and is globally relevant. Earlier descriptions of the data mentioned a timeframe from 2005 to 2010, which does not match these timestamps.
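Because the stated timeframes differ, it may be worth checking the actual range of created_utc after loading. The following Python sketch assumes the column holds Unix seconds, as described under Columns; the function name coverage and the two sample timestamps are placeholders, not values taken from the files.

```python
import pandas as pd


def coverage(created_utc):
    """Return the earliest and latest timestamps as UTC datetimes."""
    ts = pd.to_datetime(pd.Series(created_utc), unit="s", utc=True)
    return ts.min(), ts.max()


# Placeholder values for illustration; replace with the created_utc column.
first, last = coverage([1609459200, 1480550400])
print(first, last)
```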
License
CC0
Who Can Use It
- Financial Analysts: To gauge public sentiment for investment strategies.
- Data Scientists: For natural language processing (NLP) tasks and predictive modelling using social media data.
- Academics and Researchers: To study the "wisdom of crowds" and the influence of online public opinion on corporate valuation.
- Market Researchers: To understand consumer and investor perception of a major technology company.
Dataset Name Suggestions
- Reddit Sentiment for Apple (AAPL)
- AAPL Social Media Opinion Data
- Apple Stock Discussions on Reddit
- Public Sentiment on Apple Inc. (Reddit)
Attributes
Original Data Source: AAPL on Reddit