Online Bitcoin Discussion Dataset
Finance & Banking Analytics
Free
About
This dataset contains posts and comments from the /r/Bitcoin subreddit, collected during June 2022 and delivered as two CSV files. It offers a view of public discourse around cryptocurrency and is suitable for both enterprise and academic use, for example tracking shifts in public opinion that may help anticipate trends in the volatile cryptocurrency market. Usernames are omitted to preserve anonymity and prevent targeted harassment.
Columns
The dataset includes the following columns:
- type: Denotes the type of the data point.
- id: A unique Base-36 identifier for the comment.
- subreddit.id: The unique Base-36 identifier for the subreddit where the comment was posted.
- subreddit.name: The human-readable name of the subreddit.
- subreddit.nsfw: A boolean indicating if the subreddit is Not Safe For Work.
- created_utc: The UTC timestamp indicating when the comment was created.
- permalink: The permanent link to the comment on Reddit.
- body: The main text content of the comment.
- sentiment: The analysed sentiment score for the comment.
- score: The comment's Reddit score (upvotes minus downvotes).
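The column list above can be turned into typed values with a short parser. The following is a minimal sketch using only the Python standard library. It assumes, without confirmation from the listing, that created_utc holds Unix epoch seconds, that subreddit.nsfw is stored as the text "true" or "false", and that sentiment may be empty for some rows. The sample rows, the `parse_row` and `read_rows` helpers, and the example.invalid links are made up for illustration.

```python
import csv
import io
from datetime import datetime, timezone


def parse_row(row):
    """Convert one raw CSV row (all strings) into typed Python values."""
    sentiment = row["sentiment"].strip()
    return {
        "type": row["type"],
        "id": row["id"],
        # Base-36 identifiers can be decoded to integers with int(text, 36).
        "id_int": int(row["id"], 36),
        "subreddit.name": row["subreddit.name"],
        # Assumption: the boolean is serialised as "true" / "false".
        "subreddit.nsfw": row["subreddit.nsfw"].strip().lower() == "true",
        # Assumption: created_utc is Unix epoch seconds.
        "created": datetime.fromtimestamp(int(row["created_utc"]), tz=timezone.utc),
        "permalink": row["permalink"],
        "body": row["body"],
        # Assumption: an empty sentiment cell means "no score available".
        "sentiment": float(sentiment) if sentiment else None,
        "score": int(row["score"]),
    }


def read_rows(handle):
    """Parse every data row from an open CSV file or file-like object."""
    return [parse_row(row) for row in csv.DictReader(handle)]


# Made-up sample rows that follow the documented column names.
SAMPLE = """type,id,subreddit.id,subreddit.name,subreddit.nsfw,created_utc,permalink,body,sentiment,score
comment,zz,abc12,bitcoin,false,1654041600,https://example.invalid/c/zz,Made-up text,0.5,3
comment,10,abc12,bitcoin,false,1654128000,https://example.invalid/c/10,Another made-up row,,1
"""

rows = read_rows(io.StringIO(SAMPLE))
print(rows[0]["id_int"], rows[0]["created"].isoformat())
```

When reading the real files, open them with `newline=""` so that line breaks inside quoted body text are handled by the csv module rather than splitting a record.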
Distribution
The dataset is provided in CSV format, organised into two separate files, and contains approximately 170,000 records in total. An exact row count is not stated.
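Since only an approximate record count is given, it can be worth counting the rows in each file after download. The sketch below is one way to do that with the standard library. The file names posts.csv and comments.csv are hypothetical stand-ins (the listing does not name the files), and the demo writes tiny made-up files to a temporary directory so it runs without the real data.

```python
import csv
import tempfile
from pathlib import Path


def count_records(paths):
    """Return {file name: number of data rows}, excluding each header row."""
    counts = {}
    for path in paths:
        # newline="" lets the csv module handle line breaks inside quoted fields.
        with open(path, newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle)
            next(reader, None)  # skip the header row
            counts[Path(path).name] = sum(1 for _ in reader)
    return counts


# Demo with made-up files; the names are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    posts = Path(tmp) / "posts.csv"
    comments = Path(tmp) / "comments.csv"
    posts.write_text('id,body\np1,"first line\nsecond line"\n', encoding="utf-8")
    comments.write_text("id,body\nc1,hello\nc2,world\n", encoding="utf-8")
    counts = count_records([posts, comments])
    total = sum(counts.values())

print(counts, total)
```

Counting parsed records rather than raw lines matters here because post and comment bodies can span several lines.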
Usage
This dataset is ideal for:
- Cryptocurrency Market Analysis: Gaining insights into public sentiment and discussion trends related to Bitcoin.
- Social Media Analysis: Studying engagement patterns and content dynamics within a prominent online community.
- Sentiment Analysis: Leveraging the 'sentiment' column to perform in-depth analysis of opinions expressed in comments.
- Academic Research: Supporting studies on online communities, financial markets, and natural language processing.
- Enterprise Applications: Informing strategic decisions by understanding public perception and identifying emerging trends in the crypto space.
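As an illustration of the sentiment and trend use cases above, the following sketch averages sentiment per calendar day (UTC). It is a simplified example, not a recommended methodology: it assumes created_utc is Unix epoch seconds and that missing sentiment values are represented as None, and the `daily_mean_sentiment` helper and sample values are invented for the demonstration.

```python
from collections import defaultdict
from datetime import datetime, timezone
from statistics import mean


def daily_mean_sentiment(records):
    """Average sentiment per UTC day.

    records: iterable of (created_utc_seconds, sentiment or None) pairs.
    Rows without a sentiment value are skipped.
    """
    by_day = defaultdict(list)
    for created_utc, sentiment in records:
        if sentiment is None:
            continue
        day = datetime.fromtimestamp(created_utc, tz=timezone.utc).date()
        by_day[day].append(sentiment)
    # Sort by day so the result reads as a time series.
    return {day: mean(values) for day, values in sorted(by_day.items())}


# Made-up values: two comments on 1 June 2022, two on 2 June 2022.
SAMPLE = [
    (1654041600, 0.5),
    (1654045200, -0.5),
    (1654128000, 0.25),
    (1654131600, None),  # no sentiment available, ignored
]

trend = daily_mean_sentiment(SAMPLE)
for day, value in trend.items():
    print(day.isoformat(), value)
```

A plain mean treats every comment equally. Weighting by the score column is one possible variation, though whether that better reflects community opinion is a judgment call.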
Coverage
The dataset's geographic scope is global, reflecting the worldwide reach of Reddit and of cryptocurrency discussion. It covers June 2022 and is limited to content from the /r/Bitcoin subreddit, so it represents one particular segment of online users engaged with this topic. User-level demographic information is limited because usernames are excluded to preserve privacy and prevent targeted harassment.
License
CC-BY
Who Can Use It
Intended users include:
- Data Scientists & Analysts: For developing predictive models or exploring social media trends.
- Academics & Researchers: For conducting studies on internet culture, finance, or advanced text analysis.
- Financial Analysts: To obtain insights into public opinion that may influence cryptocurrency markets.
- Marketing Professionals: To understand online discussions and public perception within the cryptocurrency domain.
Dataset Name Suggestions
- Reddit Bitcoin Comments (June 2022)
- Cryptocurrency Social Sentiment Data
- Bitcoin Subreddit Activity June 2022
- Online Bitcoin Discussion Dataset
Attributes
Original Data Source: Reddit /r/Bitcoin Data for Jun 2022