Retail sale deals [crawled July 17, 2020]
Data Science and Analytics
Free
About
Context
RedFlagDeals is a forum where users can post product sales that they come across. The "All Hot Deals" section of the forum was scraped for relevant information on July 17, 2020.
I supplied a kernel on how to clean the data and will follow up with some analyses for identifying promising deals. I will continue updating the dataset with new posts from the forum if there is sufficient interest, which I will gauge by the number of downloads and upvotes.
Content
Three tables are supplied.
Each row in the main table corresponds to a post. Columns hold post information such as the title, the net vote score (up-votes minus down-votes), a link to the referenced deal, and more.
The comments table stores all comments made in response to the scraped posts. Titles in the 'title' column serve as foreign keys and link comments to the corresponding posts found in the main table.
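As a rough illustration of that link, here is a minimal sketch that attaches comments to posts by matching on the title. The sample rows and every column name other than 'title' (`votes`, `comment`) are assumptions made up for the example, so check them against the actual files before reusing this.

```python
from collections import defaultdict

# Hypothetical sample rows; only the 'title' column is described in the text above.
posts = [
    {"title": "50% off headphones", "votes": 12},
    {"title": "Free shipping on laptops", "votes": 3},
]
comments = [
    {"title": "50% off headphones", "comment": "Great price"},
    {"title": "50% off headphones", "comment": "Sold out already"},
    {"title": "Free shipping on laptops", "comment": "Thanks"},
]


def attach_comments(posts, comments):
    """Return a copy of each post with a list of its comments, matched on title."""
    by_title = defaultdict(list)
    for row in comments:
        by_title[row["title"]].append(row["comment"])
    # Posts without any comments get an empty list.
    return [{**post, "comments": by_title.get(post["title"], [])} for post in posts]


joined = attach_comments(posts, comments)
for post in joined:
    print(post["title"], len(post["comments"]))
```

Because the key is the title text, two posts that happen to share a title would receive the same comments in this sketch, so it is worth checking the main table for duplicate titles first.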
Lastly, a cleaned version of the main table is supplied for those who do not want to deal with data wrangling. The corresponding code can be found in the Kernel section.
Inspiration
After data-wrangling of the main table, the set should be fairly simple to analyze and may contain some interesting deals. Since links to the sales are included, you may come across offerings that interest you.
The comments table can be used for natural language processing and more robust sentiment analysis. You may want to consider applying PCA.
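A possible first step toward text analysis of the comments is a simple word count. The sketch below uses made-up comment strings and a deliberately naive tokenizer (the `tokenize` helper is hypothetical, not part of the supplied kernel); real comments will likely need more careful cleaning.

```python
import re
from collections import Counter

# Made-up comments standing in for the comment text in the comments table.
comments = ["Great deal, thanks!", "Price is not great.", "Thanks, ordered one"]


def tokenize(text):
    """Lowercase the text and keep runs of letters and apostrophes as tokens."""
    return re.findall(r"[a-z']+", text.lower())


# Count how often each token appears across all comments.
word_counts = Counter(token for text in comments for token in tokenize(text))
print(word_counts.most_common(3))
```

Counts like these can feed a bag-of-words matrix, which is the kind of input that sentiment models or a dimensionality reduction such as PCA would work on.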
Happy sales hunting!
Some questions you may want to answer:
Which users generate the most discussed posts or the highest number of upvotes?
What type of products do top-users post?
What products offer the biggest savings?
What are the most popular product categories posted on the forum?
Which retailers are most frequently represented?
Which retailers generate the highest number of replies per post?
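Several of these questions come down to grouping posts and aggregating a numeric column. The following is a minimal sketch under assumed column names (`author`, `retailer`, `votes`, `replies`) and invented rows; the real tables may name or structure these fields differently.

```python
from collections import defaultdict

# Invented rows; column names are assumptions for illustration only.
posts = [
    {"author": "alice", "retailer": "Store A", "votes": 10, "replies": 4},
    {"author": "bob", "retailer": "Store B", "votes": 3, "replies": 9},
    {"author": "alice", "retailer": "Store B", "votes": 5, "replies": 1},
]


def total_by(rows, key, value):
    """Sum the `value` column for each distinct entry in the `key` column."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row[value]
    return dict(totals)


def mean_by(rows, key, value):
    """Average the `value` column for each distinct entry in the `key` column."""
    sums = defaultdict(int)
    counts = defaultdict(int)
    for row in rows:
        sums[row[key]] += row[value]
        counts[row[key]] += 1
    return {k: sums[k] / counts[k] for k in sums}


votes_by_author = total_by(posts, "author", "votes")
replies_per_post = mean_by(posts, "retailer", "replies")
top_author = max(votes_by_author, key=votes_by_author.get)
print(top_author, votes_by_author, replies_per_post)
```

The same two helpers cover the user and retailer questions by swapping the key and value columns. Questions about product categories or savings would first need those fields extracted from the post titles, which this sketch does not attempt.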
License
CC0
Original Data Source: Retail sale deals [crawled July 17, 2020]