Reddit Tales From The Job Dataset
Social Media and Networking
Free
About
This dataset contains Reddit posts and comments collected from the r/talesfromthejob subreddit, an online community where people share stories and experiences from their jobs. The data was compiled using PRAW (the Python Reddit API Wrapper) and has not been filtered. It is a source of social media text for analyses such as sentiment analysis and topic identification in workplace narratives. Both posts and the comments made on them are included.
Columns
- title: The title of the Reddit post.
- score: The Reddit score of the post (upvotes minus downvotes), a rough indicator of its popularity.
- id: A unique identifier for each post or comment.
- url: The URL linking directly to the Reddit post thread.
- comms_num: The total number of comments associated with a given post.
- created: The date on which the post or comment was created.
- body: The main text content of the post or comment.
- timestamp: A numerical timestamp indicating the time of creation.
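As a rough illustration of the column layout above, the following Python sketch reads the CSV with the standard library and converts the two count-like fields to integers. The helper names `load_rows` and `_to_int` and the file name in the comments are hypothetical, and the treatment of blank numeric cells is an assumption, since this listing does not say how missing values are encoded.

```python
import csv

# Column names as listed in this dataset description.
COLUMNS = ["title", "score", "id", "url", "comms_num", "created", "body", "timestamp"]


def _to_int(value):
    """Convert a CSV cell to int; return None for blank or non-numeric cells."""
    try:
        return int(float(value))
    except (TypeError, ValueError, OverflowError):
        return None


def load_rows(handle):
    """Read rows from an open CSV file handle into a list of dicts."""
    rows = []
    for raw in csv.DictReader(handle):
        # Keep only the documented columns; missing ones become empty strings.
        row = {name: raw.get(name, "") for name in COLUMNS}
        row["score"] = _to_int(row["score"])
        row["comms_num"] = _to_int(row["comms_num"])
        rows.append(row)
    return rows


# Example usage (hypothetical file name, replace with the downloaded CSV path):
# with open("reddit_talesfromthejob.csv", newline="", encoding="utf-8") as f:
#     data = load_rows(f)
```

The `created` and `timestamp` fields are left as strings here because their exact formats are not specified in this description.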
Distribution
The dataset is provided in CSV format and includes both Reddit posts and their corresponding comments. Row and record counts are not stated. The data spans approximately March 2012 to January 2022. The dataset is intended for global use.
Usage
This dataset is ideally suited for:
- Performing sentiment analysis on workplace experiences.
- Identifying common discussion topics and themes related to jobs and employment.
- Natural Language Processing (NLP) research and model training.
- Studying online community behaviour and content patterns on social media platforms.
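As a minimal sketch of a first step toward the topic identification mentioned above, the following Python snippet counts the most frequent terms across a collection of `body` texts. It is plain term counting, not a topic model. The function name `top_terms` and the very small stopword list are illustrative assumptions; a real analysis would likely use a fuller stopword list or a dedicated NLP library.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not part of the dataset).
STOPWORDS = {
    "the", "and", "for", "that", "was", "with", "this", "had", "but", "not",
    "you", "she", "her", "his", "him", "they", "were", "have", "are",
}


def top_terms(texts, n=10):
    """Return the n most frequent non-stopword tokens across the given texts."""
    counts = Counter()
    for text in texts:
        # Treat missing bodies (None or empty) as empty strings.
        tokens = re.findall(r"[a-z']+", (text or "").lower())
        # Skip stopwords and very short tokens such as "at" or "my".
        counts.update(t for t in tokens if t not in STOPWORDS and len(t) > 2)
    return counts.most_common(n)
```

Passing the `body` values of all rows to `top_terms` gives a quick view of recurring vocabulary before moving on to heavier methods.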
Coverage
The geographic scope is global, reflecting Reddit's worldwide user base. The data covers March 2012 to January 2022. No demographic information is recorded, as the content comes from a public subreddit.
License
CC0
Who Can Use It
This dataset is valuable for:
- Data scientists and analysts interested in social media content.
- Researchers studying workplace culture, sentiment, or online communication.
- AI and LLM developers looking for text data to train models on conversational or narrative content.
- Organisations seeking insights into employee experiences or public sentiment regarding work environments.
Dataset Name Suggestions
- Reddit Tales From The Job Dataset
- Workplace Stories Reddit Data
- Job Experiences Text Collection
- Tales From The Job Reddit Posts & Comments
Attributes
Original Data Source: Reddit Tales From The Job