Opendatabay APP

Sarcastic Stories Dataset

Social Media and Networking

Tags and Keywords

  • Retail
  • Text
  • Social
  • NLP

Free

About

This dataset is a collection of posts and comments from the r/IDontWorkHereLady subreddit, a community that grew out of subreddits such as 'Tales From Retail'. It features humorous and sarcastic stories in which people are mistaken for employees at various businesses. The data is a rich resource for analysing creative English language usage and is particularly suited to testing advanced Natural Language Processing (NLP) skills. The data is unfiltered and provided as originally collected.

Columns

  • title: The title of the Reddit post.
  • score: The Reddit score of the post (upvotes minus downvotes), an indicator of how it was received. The number of comments is recorded separately in comms_num.
  • id: A unique identifier for each post or comment.
  • url: The URL of the post thread.
  • comms_num: The number of comments associated with a post.
  • created: The date of creation for the post or comment.
  • body: The main text content of the post or comment.
  • timestamp: A timestamp indicating when the post or comment was created.
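
As a minimal sketch of working with these columns, the snippet below reads a CSV export with Python's standard csv module and checks that the expected header is present. The helper name load_rows and the sample row are hypothetical. The listing does not document column types, so every value is kept as a string.

```python
import csv
import io

# Column names as given in the Columns list.
EXPECTED_COLUMNS = ["title", "score", "id", "url", "comms_num",
                    "created", "body", "timestamp"]

def load_rows(handle):
    """Read rows from an open CSV file object and verify the header.

    Hypothetical helper: values stay as strings because the listing
    does not say how each column is typed.
    """
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)

# Tiny in-memory sample standing in for the real file (made-up row).
sample = io.StringIO(
    "title,score,id,url,comms_num,created,body,timestamp\n"
    "Example post,10,abc123,https://example.com/t,2,0,Some text,2021-01-24 12:00:00\n"
)
rows = load_rows(sample)
print(len(rows), rows[0]["title"])
```

For a file on disk, the same helper can be given a handle opened with newline="" and encoding="utf-8". The file name is whatever the download is saved as.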

Distribution

The dataset is typically provided as a CSV file. It contains approximately 20,751 entries, covering both Reddit posts and comments. The listing does not give the breakdown of posts versus comments, or any structural detail beyond the columns listed above.
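
Because the post/comment breakdown is not stated, one way to estimate it is to tally rows with a rule of your own. The sketch below assumes, purely for illustration, that comment rows have an empty title. That rule is not documented in the listing and should be checked against the actual file. The function name count_kinds and the demo rows are hypothetical.

```python
from collections import Counter

def default_is_post(row):
    # ASSUMPTION: a non-empty title marks a post. This is a guess,
    # not something the dataset listing states.
    return bool((row.get("title") or "").strip())

def count_kinds(rows, is_post=default_is_post):
    """Tally rows as 'post' or 'comment' using a caller-supplied rule."""
    return Counter("post" if is_post(row) else "comment" for row in rows)

# Made-up rows for demonstration only.
demo_rows = [
    {"title": "A story", "body": "..."},
    {"title": "", "body": "a reply"},
    {"title": "", "body": "another reply"},
]
counts = count_kinds(demo_rows)
print(counts["post"], counts["comment"])
```

Passing a different is_post function swaps in another rule without changing the tally logic.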

Usage

This dataset is ideal for:
  • Performing sentiment analysis on the humorous and sarcastic content.
  • Identifying discussion topics and trends within the mistaken identity narratives.
  • Testing and developing advanced NLP models due to the nuanced language.
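
As a rough first pass at the topic identification use case, the sketch below counts the most frequent terms across body texts using only the standard library. It is a crude frequency count, not a topic model or a sentiment classifier. The stop-word list, the top_terms helper, and the demo sentences are all illustrative assumptions.

```python
import re
from collections import Counter

# Small illustrative stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "and", "was", "that", "she", "her", "his", "you",
              "for", "with", "this", "but", "not", "are", "had"}

def top_terms(texts, n=5):
    """Return the n most common lowercase terms across the given texts."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        # Skip stop words and very short tokens.
        counts.update(t for t in tokens if t not in STOP_WORDS and len(t) > 2)
    return counts.most_common(n)

# Made-up sentences standing in for values of the body column.
demo_bodies = [
    "She asked me where the shoes were. I don't work here, lady.",
    "The customer demanded a manager, but I was only a customer too.",
]
print(top_terms(demo_bodies, n=3))
```

Sarcasm tends to defeat simple word counts and lexicon-based sentiment scores, which is part of why the listing describes the text as a test for more advanced models.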

Coverage

The dataset's geographic coverage is global, as it is collected from a publicly accessible online platform. It spans a time range from 24th January 2021 to 31st March 2022. There are no specific notes on demographic scope beyond the general nature of Reddit contributors.
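
To check that entries fall inside the stated time range, a filter along the following lines could be used. The timestamp format is an assumption, since the listing does not specify how the timestamp column is written, and the helper name in_coverage is hypothetical.

```python
from datetime import datetime

# Date range stated in the listing (end of day on the final date).
START = datetime(2021, 1, 24)
END = datetime(2022, 3, 31, 23, 59, 59)

def in_coverage(timestamp_text, fmt="%Y-%m-%d %H:%M:%S"):
    """Return True if a timestamp string falls inside the stated range.

    ASSUMPTION: the timestamp column uses the format in `fmt`. Adjust it
    after inspecting the actual file.
    """
    moment = datetime.strptime(timestamp_text, fmt)
    return START <= moment <= END

print(in_coverage("2021-06-01 09:30:00"))
print(in_coverage("2022-04-01 00:00:00"))
```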

License

CC0

Who Can Use It

This dataset is intended for:
  • Data scientists and NLP practitioners looking for real-world text data to train and evaluate models.
  • Researchers in linguistics, sociology, or human-computer interaction studying online communities, humour, or social narratives.
  • Developers creating applications that involve text analysis, topic extraction, or sentiment understanding.

Dataset Name Suggestions

  • Reddit IDontWorkHereLady Stories
  • Mistaken Identity Retail Tales
  • Sarcastic Stories Dataset
  • Online Humour Text Collection
  • IDontWorkHereLady Reddit Corpus

Attributes

Original Data Source: I Don't Work Here Lady

Listing Stats

  • Views: 0
  • Downloads: 0
  • Listed: 17/06/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0