Amazon Customer Review Data

Product Reviews & Feedback

Tags and Keywords

Amazon, Reviews, Product, Customer, Sentiment

Free

About

This dataset provides information collected from customer reviews, which help shoppers learn about products and decide whether a product is right for them. The reviews are intended to offer genuine product feedback from fellow shoppers. A strict zero-tolerance policy applies to any review designed to mislead or manipulate customers, including reviews written as a form of promotion. Content deemed unacceptable and subject to removal includes submissions from individuals with a direct or indirect financial interest in the product, submissions from those perceived to have a close personal relationship with the product's owner, and reviews given in exchange for a monetary reward or bonus in-game credits. The data is intended to exclude reviews that result from manipulative practices, such as a product manufacturer posing as an unbiased shopper or a seller leaving negative reviews on a competitor's product.

Columns

  1. marketplace (string): The country code of the marketplace where the review was written (e.g., "US").
  2. customer_id (string): The unique identification number assigned to the customer who submitted the review.
  3. review_id (string): The unique identifier for the specific review record.
  4. product_id (string): The unique identification number of the product being reviewed.
  5. product_parent (string): The identifier linking the product to its parent grouping.
  6. product_title (string): The title of the product reviewed.
  7. product_category (string): The specific category to which the product belongs.
  8. star_rating (int): The numerical rating given to the product, measured out of 5 stars.
  9. helpful_votes (int): The count of votes indicating that readers found the review helpful.
  10. total_votes (int): The overall count of votes cast regarding the review.
  11. review_headline (string): The heading provided for the written review.
  12. review_body (string): The full textual content of the customer's review.
  13. review_date (string): The date on which the product review was submitted.
  14. vine (string/int): Indicates whether the review was written as part of the Amazon Vine programme.
  15. verified_purchase (int): Indicates whether the purchase associated with the review was verified (Y/1 or N/0).
  16. Sentiment_books (string): The determined sentiment of the review (e.g., positive or negative).
  17. review_month (string): The month extracted from the review date.
  18. review_day (string): The day of the week extracted from the review date.
  19. review_year (int): The year extracted from the review date.
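
The relationship between review_date and the three derived columns can be sketched with the Python standard library. This is a minimal illustration under stated assumptions, not the publisher's method: the sample rows are invented, the `YYYY-MM-DD` date format is assumed, and `parse_row` is a hypothetical helper. Whether the dataset stores month and weekday as names or as numbers is not stated in the listing.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample rows; every value here is invented for illustration.
SAMPLE_CSV = """marketplace,review_id,star_rating,helpful_votes,total_votes,verified_purchase,review_date
US,R1,5,3,4,Y,2015-08-31
US,R2,2,0,1,N,2015-09-01
"""

INT_COLUMNS = ("star_rating", "helpful_votes", "total_votes")


def parse_row(row, date_format="%Y-%m-%d"):
    """Convert one CSV row to typed values and derive the date columns.

    The date format is an assumption; adjust it to match the real file.
    """
    record = dict(row)
    for name in INT_COLUMNS:
        record[name] = int(record[name])
    # Normalise Y/1 and N/0 to an integer flag, as the column list describes.
    flag = str(row["verified_purchase"]).strip().upper()
    record["verified_purchase"] = 1 if flag in ("Y", "1") else 0
    date = datetime.strptime(row["review_date"], date_format)
    record["review_month"] = date.strftime("%B")  # month name
    record["review_day"] = date.strftime("%A")    # day of the week
    record["review_year"] = date.year
    return record


records = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
print(records[0]["review_month"], records[0]["review_day"], records[0]["review_year"])
```

For a real download, `io.StringIO(SAMPLE_CSV)` would be replaced by an open file handle, and the column names should be checked against the actual header row.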

Distribution

The data is tagged for Text Processing, Data Visualization, and Data Cleaning applications, and is often handled with tools such as pandas, which can manage its categorical columns. The usability score is 10.00. The dataset is not expected to receive updates. Validation of the sample data shows 100% valid records across all columns, with zero mismatched or missing values in the reviewed segments. The sample also contains a single unique product category ("Books") and a single marketplace location ("US").
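
A check of this kind (missing values per column, distinct values in a categorical column) could be reproduced roughly as follows. The rows are invented and deliberately include one empty cell so the counter has something to report; `missing_counts` and `distinct_values` are hypothetical helper names, not part of any library.

```python
from collections import Counter

# Hypothetical rows as they might come out of csv.DictReader (all strings).
rows = [
    {"marketplace": "US", "product_category": "Books", "star_rating": "5"},
    {"marketplace": "US", "product_category": "Books", "star_rating": ""},
    {"marketplace": "US", "product_category": "Books", "star_rating": "4"},
]


def missing_counts(rows):
    """Count empty or None cells per column, reporting 0 for clean columns."""
    counts = Counter()
    for row in rows:
        for column, value in row.items():
            if value is None or str(value).strip() == "":
                counts[column] += 1
    return {column: counts.get(column, 0) for column in rows[0]}


def distinct_values(rows, column):
    """Return the sorted set of non-empty values found in one column."""
    return sorted({row[column] for row in rows if row[column] not in (None, "")})


print(missing_counts(rows))
print(distinct_values(rows, "product_category"))
print(distinct_values(rows, "marketplace"))
```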

Usage

This data is well suited to training machine learning models for sentiment analysis, allowing developers to classify feedback as positive or negative. It is also suitable for analysing customer behaviour patterns, specifically how star ratings, helpful votes, and review text correlate with product success. Researchers can use the product hierarchy (the product_parent identifier) and the customer identifiers to study purchasing trends and product performance.
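
One simple way to start looking at how star ratings relate to helpful votes is to compute a helpfulness ratio per review and average it within each rating. The sketch below uses invented numbers. The ratio `helpful_votes / total_votes` is one plausible definition chosen for illustration, not a measure defined by the listing, and both function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical reviews; the vote counts are invented for illustration.
reviews = [
    {"star_rating": 5, "helpful_votes": 3, "total_votes": 4},
    {"star_rating": 5, "helpful_votes": 1, "total_votes": 4},
    {"star_rating": 1, "helpful_votes": 0, "total_votes": 2},
    {"star_rating": 3, "helpful_votes": 0, "total_votes": 0},
]


def helpful_ratio(review):
    """Share of votes that were 'helpful'; None when nobody voted."""
    total = review["total_votes"]
    return review["helpful_votes"] / total if total else None


def mean_ratio_by_rating(reviews):
    """Average helpfulness ratio for each star rating, skipping unvoted reviews."""
    buckets = defaultdict(list)
    for review in reviews:
        ratio = helpful_ratio(review)
        if ratio is not None:
            buckets[review["star_rating"]].append(ratio)
    return {rating: sum(v) / len(v) for rating, v in sorted(buckets.items())}


print(mean_ratio_by_rating(reviews))
```

Reviews with no votes are excluded from the average so that they are not counted as unhelpful. Whether that is the right choice depends on the question being asked.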

Coverage

The dataset spans multiple product categories, including Books, Jewellery, Digital Ebook Purchases, Grocery, and PC products. Geographic coverage is indicated by the marketplace column, with samples focusing on the "US" location. The temporal scope is defined by the review_date field, which allows time-series analysis of customer sentiment. The data also includes helpful and total vote counts, as well as verification status, which give granular insight into the credibility of the feedback.
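
A time-series view of sentiment can be approximated by grouping on review_year and computing the share of positive labels. The sketch assumes that Sentiment_books holds the strings "positive" and "negative", which the column list gives only as examples. The rows are invented and `positive_share_by_year` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical rows; label values follow the examples in the column list.
reviews = [
    {"review_year": 2014, "Sentiment_books": "positive"},
    {"review_year": 2014, "Sentiment_books": "negative"},
    {"review_year": 2015, "Sentiment_books": "positive"},
    {"review_year": 2015, "Sentiment_books": "positive"},
]


def positive_share_by_year(reviews, positive_label="positive"):
    """Fraction of reviews per year whose sentiment equals positive_label."""
    totals = Counter()
    positives = Counter()
    for review in reviews:
        year = review["review_year"]
        totals[year] += 1
        if review["Sentiment_books"].strip().lower() == positive_label:
            positives[year] += 1
    return {year: positives[year] / totals[year] for year in sorted(totals)}


print(positive_share_by_year(reviews))
```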

License

CC0: Public Domain

Who Can Use It

  • E-commerce Analysts: To monitor and benchmark product performance based on genuine customer feedback and star ratings.
  • Machine Learning Engineers: To build and refine natural language processing models for accurate sentiment classification.
  • Market Researchers: To study consumer reactions to new products and identify key drivers behind positive or negative reviews.
  • Data Visualization Specialists: To create dashboards illustrating trends in review helpfulness and overall product health.

Dataset Name Suggestions

  • Amazon Customer Review Data
  • E-commerce Consumer Feedback Repository
  • Global Product Ratings and Reviews
  • Genuine Shopper Feedback Log

Attributes

Original Data Source: Amazon Customer Review Data

Listing Stats

  • Views: 10
  • Downloads: 2
  • Listed: 19/10/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
