Birds Aren't Real Reddit Dataset
Social Media and Networking
Free
About
This dataset captures social media activity from r/BirdsArentReal, the official subreddit for those who believe birds have been replaced by government drones [1]. The subreddit serves as a platform for Generation Z members, who may spread this myth either as a joke or in earnest, to share related content and support the "Birds Aren't Real" movement [1]. The data consists of Reddit posts and comments and offers insight into this unusual social phenomenon [1]. For example, it includes discussions of bird-like robots, such as one built by researchers with legs modelled on a peregrine falcon's, which can perch on complex surfaces and carry irregular objects [2].
Columns
- title: The title of the Reddit post or comment [2].
- score: The net score (upvotes minus downvotes) of the Reddit post or comment [2].
- id: A unique identifier for the Reddit post or comment [2].
- url: The URL associated with the Reddit post or comment [2].
- comms_num: The number of comments on a Reddit post [2].
- created: The creation timestamp of the Reddit post or comment [2].
- body: The main content or body text of the Reddit post or comment [2].
- timestamp: A numerical timestamp for the creation of the Reddit entry [2].
- Comment: This column appears to capture specific comments or comment-related data [2].
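The column list above can be turned into a small loading routine. The following is a minimal sketch, assuming the data is delivered as a CSV file with a header row (the listing does not state the format). The function name `load_rows`, the sample rows, and the value formats shown for `created` and `timestamp` are hypothetical and exist only for illustration.

```python
import csv
import io

# Column names taken from the list above; the CSV layout itself is an assumption.
COLUMNS = ["title", "score", "id", "url", "comms_num", "created", "body", "timestamp"]


def load_rows(stream):
    """Parse rows from a CSV stream, converting the numeric columns to int."""
    rows = []
    for raw in csv.DictReader(stream):
        # Keep only the documented columns; missing values become empty strings.
        row = {name: (raw.get(name) or "") for name in COLUMNS}
        row["score"] = int(row["score"] or 0)
        row["comms_num"] = int(row["comms_num"] or 0)
        rows.append(row)
    return rows


# Hypothetical sample rows, made up for illustration only.
SAMPLE_CSV = """title,score,id,url,comms_num,created,body,timestamp
Pigeons recharging on power lines,120,abc123,https://example.com/a,14,1637400000.0,,2021-11-20 09:20:00
,3,def456,,0,1637403600.0,They never blink,2021-11-20 10:20:00
"""

rows = load_rows(io.StringIO(SAMPLE_CSV))
for row in rows:
    print(row["id"], row["score"], row["title"] or row["body"])
```

With a real file, the same function could be called on an open file handle instead of the in-memory sample.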
Distribution
The dataset comprises Reddit posts and comments [1]; the file format is not specified. The total number of records is not stated, but the data contains over 12,000 unique score values [3]. Approximately 74% of the entries relate to a single URL [3], and a large number of entries fall within certain score ranges [2, 3]. The data is unfiltered [1]. The dataset version is 1.0, with a listed date of 22/06/2025 [4].
Usage
This dataset is ideal for [1]:
- Training text data analysis models.
- Performing sentiment analysis on social media content.
- Conducting topic modelling on the collected text corpus.
- Analysing trends and discourse within a unique online community.
- Combining with Twitter data on the same topic for broader social media analysis [1].
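As a starting point for the text analyses listed above, a simple term-frequency count over the `title` and `body` text can show which words dominate the corpus. This is a rough sketch rather than a topic model: the stopword list, the tokenizer, the function name `top_terms`, and the sample texts are all assumptions made for illustration.

```python
import re
from collections import Counter

# A tiny, hand-picked stopword list (an assumption; real analyses would use a fuller one).
STOPWORDS = {"the", "a", "an", "are", "is", "of", "to", "and", "in", "on", "they"}


def top_terms(texts, n=3):
    """Return the n most frequent non-stopword terms across the given texts."""
    counts = Counter()
    for text in texts:
        # Lowercase, then keep runs of letters and apostrophes as tokens.
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(n)


# Hypothetical texts standing in for post titles or comment bodies.
texts = [
    "Birds are drones",
    "The drones recharge on power lines",
    "Birds recharge",
]
top = top_terms(texts, n=3)
print(top)
```

The resulting counts could feed a later step such as a bag-of-words model, which is where sentiment analysis or topic modelling would begin.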
Coverage
The data covers a time range from 20th November 2021 to 15th March 2022 [5]. The dataset is global in scope [4]. It originates from the r/BirdsArentReal subreddit, focusing on content propagated by Generation Z members [1]. The dataset has a quality rating of 5 out of 5 [4].
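The stated time range can be used as a sanity check on loaded records. The sketch below assumes the `created` column holds Unix epoch seconds in UTC, which the listing does not confirm; the helper name `in_coverage` and the sample values are hypothetical.

```python
from datetime import datetime, timezone

# Coverage window from the listing, interpreted as UTC and inclusive of the final day.
START = datetime(2021, 11, 20, tzinfo=timezone.utc)
END = datetime(2022, 3, 15, 23, 59, 59, tzinfo=timezone.utc)


def in_coverage(created_epoch):
    """Return True if an epoch-seconds timestamp falls inside the stated window."""
    moment = datetime.fromtimestamp(float(created_epoch), tz=timezone.utc)
    return START <= moment <= END


# Hypothetical values: one inside the window, one well before it.
samples = [1637400000.0, 1609459200.0]
flags = [in_coverage(value) for value in samples]
print(flags)
```

Records that fall outside the window would suggest either a different timestamp encoding or a mismatch with the listed coverage.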
License
CC0
Who Can Use It
- Data Scientists: For text processing, natural language processing (NLP), and machine learning model training [1].
- Researchers: For studying online communities, social phenomena, and the spread of unusual beliefs or parody movements [1].
- Marketers/Trend Analysts: For understanding niche online trends and Gen Z engagement [1].
- Academics: For social sciences and digital humanities research.
Dataset Name Suggestions
- Birds Aren't Real Reddit Dataset
- Gen Z Conspiracy Theory Social Media Data
- r/BirdsArentReal Community Posts
- Fictional Avian Drone Social Data
- Birds Aren't Real Subreddit Activity
Attributes
Original Data Source: Birds Aren't Real