E-commerce Customer Feedback Dataset
About
This dataset provides a collection of randomly selected customer reviews and ratings for various Amazon products. It comprises nearly 1,600 individual reviews, making it a useful resource for studying consumer feedback. Its primary purpose is to support identifying the main topics within these reviews, so that reviews can be classified more accurately and search functionality improved. It is particularly suited to developing algorithms that distinguish topics based on the body of the review text.
Columns
The dataset includes the following fields:
- id: A unique identifier for each entry.
- asins: The product identification number.
- brand: The manufacturer or brand of the product.
- categories: The categorisation of the product.
- colors: The colour of the product.
- dateAdded: The date when the product was first listed or added to the dataset.
- dateUpdated: The date when the product's information was last updated.
- dimension: The physical dimensions of the product.
- ean: The European Article Number (EAN) for the product.
- keys: A special assigned key associated with the product.
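As a minimal sketch of how the fields above might be checked after download, the snippet below reads CSV rows with Python's standard library and reports any listed column that is missing. The sample rows, their values, and the date format are hypothetical placeholders, since the actual file name and contents are not given here.

```python
import csv
import io

# Column names as listed in this description.
EXPECTED_COLUMNS = [
    "id", "asins", "brand", "categories", "colors",
    "dateAdded", "dateUpdated", "dimension", "ean", "keys",
]

# Hypothetical two-row sample standing in for the real file;
# the values and the date format are assumptions for illustration.
SAMPLE_CSV = (
    "id,asins,brand,categories,colors,dateAdded,dateUpdated,dimension,ean,keys\n"
    "row-1,ASIN-1,Amazon,Amazon Devices,Black,2015-01-17,2017-08-13,,,key-1\n"
    "row-2,ASIN-2,Moshi,Accessories,,2016-03-01,2017-08-13,,,key-2\n"
)


def load_rows(handle):
    """Read CSV rows as dicts and list any expected columns that are absent."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    return list(reader), missing


rows, missing = load_rows(io.StringIO(SAMPLE_CSV))
print(len(rows), "rows; missing columns:", missing)
```

To use this on a real download, pass an open file handle (opened with newline="") to load_rows instead of the in-memory sample.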
Distribution
The dataset contains approximately 1,600 reviews in a tabular format, with one row per entry and one column per field.
Key distributions observed within the dataset include:
- Brands: A significant majority of products (99%) are from Amazon, with a smaller portion (1%) from Moshi.
- Categories: A notable 34% of products fall under categories such as Amazon Devices, Smart Home, and Voice Assistants, with another 12% simply categorised as Amazon Devices. Other categories account for 54% of the data.
- Colours: About 52% of entries have null values for colour, while 42% are recorded as Black. Other colours make up the remaining 6%.
- Dates: The date range for products added or updated spans from 17 January 2015 to 13 August 2017, with varying counts of entries across different periods.
- Dimensions: 65% of the entries have null dimensions, while 34% specify a dimension of 4.8 inches by 6.6 inches by 3.2 inches.
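Percentages like the ones above can be recomputed from the rows themselves. The following is a rough sketch, assuming rows are held as dictionaries keyed by column name and that empty or absent values count as null. The function name value_shares and the toy rows are illustrative and are not taken from the dataset.

```python
from collections import Counter


def value_shares(rows, column):
    """Return {value: percentage of rows}, counting empty or absent values as 'null'."""
    counts = Counter((row.get(column) or "null") for row in rows)
    total = sum(counts.values())
    return {value: round(100 * n / total, 1) for value, n in counts.items()}


# Hypothetical rows for illustration only.
toy_rows = [
    {"brand": "Amazon", "colors": "Black"},
    {"brand": "Amazon", "colors": ""},
    {"brand": "Amazon", "colors": "Black"},
    {"brand": "Moshi", "colors": None},
]

print(value_shares(toy_rows, "brand"))
print(value_shares(toy_rows, "colors"))
```

Treating empty strings and missing values alike is a design choice made here for simplicity. A stricter analysis might report them separately.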
Usage
This dataset is ideal for a range of applications, including:
- Developing and evaluating Topic Modelling Algorithms to categorise customer reviews.
- Performing Natural Language Processing (NLP) tasks such as sentiment analysis or keyword extraction from product reviews.
- Gaining insights into consumer behaviour and product feedback in the e-commerce sector.
- Supporting data clean-up and exploratory data analysis for textual datasets.
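As one possible starting point for the keyword extraction task mentioned above, the sketch below counts the most frequent non-stop-word tokens across a handful of review texts. It is a simple frequency baseline, not a topic model. The stop-word list, the function name top_keywords, and the sample reviews are all assumptions made for illustration. The name of the review-text column is not listed in this description, so the example works on a plain list of strings.

```python
import re
from collections import Counter

# Small illustrative stop-word list; a real project would use a fuller one.
STOP_WORDS = {
    "the", "a", "an", "and", "is", "it", "to", "of",
    "for", "this", "my", "i", "very", "but",
}


def top_keywords(texts, k=3):
    """Return the k most frequent tokens that are not stop words."""
    counts = Counter()
    for text in texts:
        # Lowercase, then keep runs of letters and apostrophes as tokens.
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS and len(t) > 2)
    return [word for word, _ in counts.most_common(k)]


# Hypothetical review texts, not taken from the dataset.
reviews = [
    "The battery life is great and the screen is bright.",
    "Great screen, but the battery drains fast.",
    "Battery is fine for reading.",
]

print(top_keywords(reviews))
```

A baseline like this can help sanity-check the output of a fuller topic modelling approach, since the dominant terms it surfaces should usually appear among the discovered topics.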
Coverage
The dataset's coverage is global, encompassing reviews from various customers. The time range of the data spans from 17 January 2015 to 13 August 2017. No specific demographic details about the customers are provided.
License
CC0
Who Can Use It
This dataset is suitable for:
- Data Scientists and Machine Learning Engineers focused on NLP and topic modelling.
- Researchers in fields such as e-commerce, consumer studies, and computational linguistics.
- Students and beginners in data science looking for a practical dataset for learning and experimentation.
- Businesses aiming to understand customer feedback and improve product categorisation.
Dataset Name Suggestions
- Amazon Product Reviews Corpus
- E-commerce Customer Feedback Dataset
- Amazon Ratings and Reviews Data
- Product Review Topic Analysis Dataset
- Customer Review Dataset for E-commerce
Attributes
Original Data Source: Amazon Product Reviews Dataset