E-commerce Customer Reviews
Free
About
This dataset provides Amazon product review data for machine learning and data analytics applications, particularly in e-commerce. It pairs the text of each review with its rating, giving insight into customer feedback and product sentiment. It was contributed primarily to make it easier to apply machine learning to review data of this kind.
Columns
The dataset contains 10 columns in total, with the following 8 described in detail:
- Id: A unique identifier for each record within the dataset.
- ProductId: The identifier for the product being reviewed.
- HelpfulnessNumerator: The count of users who found the review helpful. Values range from 0 to 866, with a mean of 1.74.
- HelpfulnessDenominator: The total number of users who voted on the helpfulness of the review. Values range from 0 to 923, with a mean of 2.23.
- Score: The product rating provided by the reviewer. Scores range from 1 to 5, with a mean of 4.18. The majority of reviews have a score of 5.
- Time: The timestamp representing when the review was written, apparently stored as Unix epoch seconds. Values range from approximately 939 million to 1.35 billion, with a mean of about 1.3 billion.
- Summary: A concise summary of the review content. This column contains 295,744 unique summary entries.
- Text: The actual body of the product review. This column features 393,579 unique text entries, with one common entry starting, "This review will make me sound really stupid, but whatever."
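As a minimal sketch of how these columns might be read and combined, the following Python snippet parses rows with the standard csv module and derives a helpfulness ratio from HelpfulnessNumerator and HelpfulnessDenominator. The function names (parse_reviews, helpfulness_ratio) and the two sample rows are illustrative assumptions, not part of the dataset; only the column names come from the description above.

```python
import csv
import io

# Columns described above as numeric; everything else stays a string.
INT_COLUMNS = ("HelpfulnessNumerator", "HelpfulnessDenominator", "Score", "Time")


def parse_reviews(handle):
    """Yield one dict per review, converting the numeric columns to int."""
    for row in csv.DictReader(handle):
        for name in INT_COLUMNS:
            row[name] = int(row[name])
        yield row


def helpfulness_ratio(row):
    """Return the share of helpful votes, or None when nobody voted."""
    denominator = row["HelpfulnessDenominator"]
    if denominator == 0:
        return None  # the minimum denominator is 0, so avoid dividing by it
    return row["HelpfulnessNumerator"] / denominator


# Made-up sample rows standing in for the real file (hypothetical values).
SAMPLE = io.StringIO(
    "Id,ProductId,HelpfulnessNumerator,HelpfulnessDenominator,Score,Time,Summary,Text\n"
    "1,P001,3,4,5,1303862400,Great,Tasty and fresh\n"
    "2,P002,0,0,2,1346976000,Meh,Not what I expected\n"
)

reviews = list(parse_reviews(SAMPLE))
ratios = [helpfulness_ratio(review) for review in reviews]
print(ratios)  # [0.75, None]
```

To run this against the real file, the in-memory sample could be swapped for something like open("Reviews.csv", newline="", encoding="utf-8"); the UTF-8 encoding is an assumption, as the listing does not state one.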
Distribution
The data is provided in a tabular format, specifically as a CSV file named Reviews.csv, which has a size of 300.9 MB. The dataset comprises approximately 568,000 valid records, with no mismatched or missing values reported across the detailed columns.
Usage
This dataset is ideally suited for various applications, including:
- Machine learning model development, particularly for classification tasks.
- Text analysis and natural language processing (NLP), enabling insights into sentiment and review content.
- Data analytics to understand patterns in customer ratings and reviews.
- Developing recommendation systems based on user feedback.
- Identifying product trends and customer satisfaction in e-commerce.
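For the classification use case, one common convention (an assumption here, not something the dataset prescribes) is to turn Score into a coarse sentiment label: 1 to 2 as negative, 3 as neutral, 4 to 5 as positive. The sketch below uses a hypothetical helper, score_to_label, and made-up scores to show the mapping and the resulting class balance.

```python
from collections import Counter


def score_to_label(score):
    """Map a 1-5 star rating to a coarse sentiment label (assumed thresholds)."""
    if not 1 <= score <= 5:
        raise ValueError(f"score out of range: {score}")
    if score <= 2:
        return "negative"
    if score == 3:
        return "neutral"
    return "positive"


# Made-up scores for illustration only.
sample_scores = [5, 5, 4, 1, 3, 5, 2]
labels = [score_to_label(score) for score in sample_scores]
label_counts = Counter(labels)
print(label_counts)  # Counter({'positive': 4, 'negative': 2, 'neutral': 1})
```

Because the mean score is 4.18 and most reviews have a score of 5, labels derived this way are likely to be heavily skewed towards the positive class, which is worth accounting for when splitting and evaluating.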
Coverage
The dataset's coverage is primarily defined by its time range. Interpreting the Time values as Unix epoch seconds, it spans from October 1999 (minimum timestamp 939,340,800, which is 8 October 1999 UTC) to October 2012 (maximum timestamp 1,351,209,600, which is 26 October 2012 UTC). No specific geographic or demographic scope is mentioned in the available information.
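The date range can be checked by converting the minimum and maximum Time values, assuming they are Unix epoch seconds in UTC (the listing does not state the unit or time zone explicitly). The helper name to_utc_date is illustrative.

```python
from datetime import datetime, timezone


def to_utc_date(epoch_seconds):
    """Convert Unix epoch seconds to a calendar date in UTC."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).date()


earliest = to_utc_date(939_340_800)    # minimum Time value quoted above
latest = to_utc_date(1_351_209_600)    # maximum Time value quoted above
print(earliest.isoformat(), latest.isoformat())  # 1999-10-08 2012-10-26
```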
License
CC0: Public Domain license.
Who Can Use It
This dataset is beneficial for:
- Data scientists and machine learning engineers working on text classification, sentiment analysis, or recommendation engines.
- Business analysts in the e-commerce sector seeking to understand customer feedback and product performance.
- Researchers interested in consumer behaviour, online reviews, and large-scale text data.
- Anyone exploring data analytics with real-world ratings and review data.
Dataset Name Suggestions
- Amazon Product Review Dataset
- E-commerce Customer Reviews
- Product Ratings and Text Corpus
- Amazon Review Data for ML
Attributes
Original Data Source: E-commerce Customer Reviews