Reddit r/Astrology Community Text Corpus
About
This dataset captures the unfiltered stream of posts and comments from the popular r/Astrology subreddit, a community dedicated to debating and sharing various facets of astrological belief systems. It is a resource for analysing real-world social media text related to religion and belief systems, allowing researchers to track evolving community sentiment and identify popular discussion trends in the domain. The data was collected programmatically, directly from the Reddit platform.
Columns
- title: The headline provided for original posts, which is not applicable to comments.
- score: The entry's Reddit score, a measure of community approval. On Reddit the score is the net of upvotes and downvotes, which is why values can be negative. Scores range from -58 to 601.
- id: A unique identifier assigned to every single entry, whether it is a post or a comment.
- url: The web address leading to the full discussion thread for the relevant post. This field is mostly missing for comments.
- comms_num: The total count of comments associated with a specific post.
- created: The date and time of creation, presented in a standard timestamp format.
- body: The main textual content of the contribution, relevant to both original posts and subsequent comments.
- timestamp: The date and time of content creation presented in a date/time format for readability.
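Because posts and comments share one table, a common first step is to separate them. The following is a minimal sketch using only the Python standard library. It assumes that comment rows carry either an empty title or a placeholder such as "Comment"; the listing only says the title is "not applicable" to comments, so check the real file before relying on this. The helper names (`split_posts_and_comments`, `load_rows`) and the sample values are illustrative, not part of the dataset.

```python
import csv
import io

# Assumption: comment rows have an empty title or a placeholder such as
# "Comment". Adjust this set after inspecting the real file.
COMMENT_TITLES = {"", "comment"}


def split_posts_and_comments(rows):
    """Separate rows into posts and comments using the title column."""
    posts, comments = [], []
    for row in rows:
        title = (row.get("title") or "").strip().lower()
        (comments if title in COMMENT_TITLES else posts).append(row)
    return posts, comments


def load_rows(path):
    """Read the CSV file into a list of dicts keyed by column name."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


# Small in-memory sample with the eight documented columns (values are made up).
SAMPLE = io.StringIO(
    "title,score,id,url,comms_num,created,body,timestamp\n"
    "Saturn return question,12,abc1,https://example.com/t/abc1,4,1620000000,"
    "Any advice?,2021-05-03 00:00:00\n"
    "Comment,3,abc2,,0,1620003600,It gets easier.,2021-05-03 01:00:00\n"
)
posts, comments = split_posts_and_comments(list(csv.DictReader(SAMPLE)))
print(len(posts), "post(s),", len(comments), "comment(s)")
```

For the real file, `split_posts_and_comments(load_rows("reddit_astrology.csv"))` would apply the same split. Note that a genuine post titled exactly "Comment" would be misclassified under this assumption.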
Distribution
The information is delivered in a single data file, reddit_astrology.csv, which is structured for easy analysis. The file size is 6.52 MB and contains roughly 14,500 individual records, encompassing both initial posts and associated user comments. The data structure includes 8 distinct columns.
Usage
This resource is ideal for various data science initiatives. It can be used to perform sentiment analysis to gauge community feelings toward specific astrological events like Mercury Retrograde. Researchers can identify key discussion topics and track how often certain terms or concepts appear over time. Furthermore, the dataset supports advanced text mining and natural language processing (NLP) model training.
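As an illustration of tracking how often a term appears over time, here is a minimal sketch that counts the entries mentioning a phrase per calendar month. It assumes the `timestamp` column holds ISO-style strings such as "2021-05-30 09:15:00", which the listing does not confirm. The function name `monthly_mentions` and the sample rows are hypothetical.

```python
from collections import Counter
from datetime import datetime


def monthly_mentions(rows, term):
    """Count rows per calendar month whose title or body mentions `term`."""
    needle = term.lower()
    counts = Counter()
    for row in rows:
        text = f"{row.get('title') or ''} {row.get('body') or ''}".lower()
        if needle not in text:
            continue
        # Assumption: timestamp looks like "2021-05-30 09:15:00".
        month = datetime.fromisoformat(row["timestamp"]).strftime("%Y-%m")
        counts[month] += 1
    return dict(sorted(counts.items()))


# Made-up rows using the documented column names.
sample_rows = [
    {"title": "Mercury Retrograde survival tips", "body": "",
     "timestamp": "2021-05-30 09:15:00"},
    {"title": "Comment", "body": "mercury retrograde wrecked my week",
     "timestamp": "2021-05-31 18:02:00"},
    {"title": "Comment", "body": "Venus in Leo looks great",
     "timestamp": "2021-06-02 11:00:00"},
    {"title": "Comment", "body": "Here comes Mercury retrograde again",
     "timestamp": "2021-10-01 08:30:00"},
]
print(monthly_mentions(sample_rows, "mercury retrograde"))
```

Matching is a plain case-insensitive substring test, so it will miss spelling variants and abbreviations; a tokenizer or regular expression may be a better fit for serious topic tracking.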
Coverage
The data collection period runs from May 2021 to December 2021, providing several months of continuous community input. The scope is limited to discussions within the r/Astrology subreddit; because it is a global online forum, the geographic coverage is worldwide. The dataset is expected to be updated daily.
License
CC0: Public Domain
Who Can Use It
- Social Scientists: To study online group dynamics, the spread of belief systems, and modern digital anthropology.
- Marketing Analysts: To understand niche community interests and language surrounding spiritual or self-help topics.
- NLP Engineers: To develop and refine algorithms focused on domain-specific text analysis, text classification, or generating summaries of discussion threads.
Dataset Name Suggestions
- Reddit r/Astrology Community Text Corpus
- Astrology Social Media Discourse Data
- Daily Astrology Reddit Post Log
Attributes
Original Data Source: Reddit r/Astrology Community Text Corpus
