Reddit US Politics Activity
Government & Civic Records
Free
About
This dataset captures posts and comments from the r/Politics subreddit on Reddit, which focuses on US politics. It was compiled using a Reddit application and the PRAW Python package. The subreddit is highly active, with millions of contributors and a constant stream of new posts and comments. The data is useful for identifying political topics, analysing trends, and understanding public sentiment as expressed within this online community.
Columns
- title: The title of a post.
- score: The score (upvotes minus downvotes) of a post or comment.
- id: A unique identifier for the post or comment.
- url: The URL of the post.
- comms_num: The number of comments associated with a post.
- created: The Unix timestamp indicating when the post or comment was created.
- body: The main text content of a comment or post.
- timestamp: A second timestamp field, likely equivalent to 'created' (possibly in a different format).
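The column layout above can be read with Python's standard library. The following is a minimal sketch, not the publisher's own loading code: the sample rows are invented for illustration, the human-readable format shown for 'timestamp' is an assumption, and `parse_rows` is a hypothetical helper name.

```python
import csv
import io
from datetime import datetime, timezone

# Invented sample rows that follow the documented column layout.
# The 'timestamp' format shown here is an assumption, not confirmed by the listing.
SAMPLE_CSV = """title,score,id,url,comms_num,created,body,timestamp
Example post title,120,abc123,https://example.com/a,45,1627400000.0,,2021-07-27 15:33:20
Comment,3,def456,,0,1627486400.0,Example comment text,2021-07-28 15:33:20
"""

def parse_rows(handle):
    """Yield one dict per CSV row with numeric and time fields converted."""
    for row in csv.DictReader(handle):
        row["score"] = int(row["score"])
        row["comms_num"] = int(row["comms_num"])
        # 'created' is a Unix timestamp; interpret it as seconds since the epoch in UTC.
        row["created_utc"] = datetime.fromtimestamp(float(row["created"]), tz=timezone.utc)
        yield row

rows = list(parse_rows(io.StringIO(SAMPLE_CSV)))
```

To read the real file, replace `io.StringIO(SAMPLE_CSV)` with an open file handle, for example `open(path, newline="", encoding="utf-8")`.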
Distribution
The dataset is typically provided in CSV format. Scores range from 0 to over 93,000, with a large proportion of entries in the lower score bands. The number of comments per post ranges from 0 to over 8,500, and many entries have low comment counts. The data covers late July 2021 to mid-December 2021, and the number of entries varies by week: some weeks have over 5,000 entries, while others have only a few hundred.
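The weekly counts and score bands described here can be recomputed from the 'created' and 'score' columns. The sketch below is one way to do it, assuming 'created' holds Unix seconds; the band edges and the function names `weekly_counts` and `score_bands` are illustrative choices, not part of the dataset.

```python
from bisect import bisect_right
from collections import Counter
from datetime import datetime, timezone

def weekly_counts(created_values):
    """Count entries per ISO (year, week) from Unix timestamps in seconds."""
    counts = Counter()
    for ts in created_values:
        moment = datetime.fromtimestamp(float(ts), tz=timezone.utc)
        iso = moment.isocalendar()
        counts[(iso[0], iso[1])] += 1
    return counts

def score_bands(scores, edges=(0, 10, 100, 1000, 10000)):
    """Count scores per band; each band is labelled by its lower edge.

    The edges are illustrative assumptions. Scores below the first edge
    are placed in the first band.
    """
    counts = Counter()
    for score in scores:
        index = max(bisect_right(edges, score) - 1, 0)
        counts[edges[index]] += 1
    return counts
```

Sorting the keys of the `weekly_counts` result gives a week-by-week series suitable for plotting.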
Usage
This dataset is ideal for:
- Identifying key topics in US political discourse.
- Plotting trends in public opinion or discussion volume over time.
- Understanding sentiment related to various political issues.
- Developing and testing Natural Language Processing (NLP) models on political text.
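As a rough starting point for topic identification, term frequencies over the 'title' or 'body' text can surface recurring subjects. This is a deliberately simple sketch under stated assumptions: the stopword list is a tiny illustrative set, `top_terms` is a hypothetical helper, and a real analysis would likely use a proper NLP toolkit.

```python
import re
from collections import Counter

# Small illustrative stopword set; a real analysis would use a fuller list.
STOPWORDS = {"the", "and", "for", "that", "with", "this", "from", "are", "was", "has"}

def top_terms(texts, n=10):
    """Return the n most frequent lowercase terms across the given texts."""
    counts = Counter()
    for text in texts:
        for token in re.findall(r"[a-z']+", text.lower()):
            # Skip very short tokens and stopwords.
            if len(token) > 2 and token not in STOPWORDS:
                counts[token] += 1
    return counts.most_common(n)
```

Running `top_terms` separately on each week's titles gives a crude view of how prominent terms shift over time.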
Coverage
The dataset focuses exclusively on US politics, drawing content from the r/Politics subreddit. It spans 27 July 2021 to 13 December 2021. The data reflects the activity and discussions of the millions of contributors in this Reddit community.
License
CC0
Who Can Use It
- Political scientists and researchers keen to study online political discourse.
- Data scientists and analysts building models for text analysis or sentiment detection.
- Journalists and media professionals seeking insights into public sentiment on political events.
- Sociologists examining online community dynamics and polarisation.
Dataset Name Suggestions
- Reddit US Politics Activity
- r/Politics Subreddit Data 2021
- US Political Reddit Discussions
- Online Political Discourse Dataset
- Reddit Political Posts & Comments
Attributes
Original Data Source: Politics on Reddit