E-commerce Product Ratings & Sentiments
Free
About
This dataset comprises four million synthetic e-commerce product reviews spanning eight popular product categories. It provides realistic review data, including a sentiment label for each review, for a variety of analytical and machine learning applications. Its purpose is to support the development and testing of systems for customer feedback analysis, product understanding, and AI model fine-tuning in an e-commerce context.
Columns
- product_id: A unique, synthetic identifier for each product.
- product_title: The name of the product, for instance, "Wireless Bluetooth Earbuds".
- category: One of eight distinct product categories to which the product belongs.
- review_text: Synthetic review text written to read like a realistic user review of the product.
- rating: An integer value indicating the user's rating, ranging from 1 to 5 stars.
- sentiment: The sentiment derived from the review text, categorised as Positive, Neutral, or Negative.
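The column list above can be sketched as a small typed record with basic validation. This is a minimal illustration, not part of the dataset itself: the `Review` class and `parse_review` helper are hypothetical names, `product_id` is assumed to be a string because its format is not documented, and the sample rows (including the "Electronics" category) are made up.

```python
import csv
import io
from dataclasses import dataclass

# Column names and sentiment values as listed in the Columns section.
COLUMNS = ["product_id", "product_title", "category",
           "review_text", "rating", "sentiment"]
SENTIMENTS = {"Positive", "Neutral", "Negative"}


@dataclass(frozen=True)
class Review:
    product_id: str  # assumption: the identifier format is not documented
    product_title: str
    category: str
    review_text: str
    rating: int
    sentiment: str


def parse_review(row: dict) -> Review:
    """Convert one CSV row (a dict of strings) into a validated Review."""
    missing = [c for c in COLUMNS if row.get(c) is None]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rating = int(row["rating"])  # raises ValueError on non-integer text
    if not 1 <= rating <= 5:
        raise ValueError(f"rating out of range: {rating}")
    if row["sentiment"] not in SENTIMENTS:
        raise ValueError(f"unexpected sentiment: {row['sentiment']!r}")
    return Review(
        product_id=row["product_id"],
        product_title=row["product_title"],
        category=row["category"],
        review_text=row["review_text"],
        rating=rating,
        sentiment=row["sentiment"],
    )


# Hypothetical sample data, made up for illustration only.
SAMPLE_CSV = """product_id,product_title,category,review_text,rating,sentiment
P0001,Wireless Bluetooth Earbuds,Electronics,"Great sound, easy pairing.",5,Positive
P0002,Wireless Bluetooth Earbuds,Electronics,Stopped charging after a week.,1,Negative
"""

reviews = [parse_review(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
```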
Distribution
The dataset contains four million synthetic e-commerce product reviews, provided as a UTF-8 encoded CSV file. No exact row count is given beyond the four-million figure, but every review entry follows the same column structure.
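With around four million rows, it may be preferable to stream the file rather than load it all at once. The sketch below uses only the Python standard library and assumes the CSV has a header row matching the column names listed under Columns; the function name `rating_sentiment_counts` is hypothetical.

```python
import csv
from collections import Counter


def rating_sentiment_counts(path: str) -> Counter:
    """Stream a UTF-8 CSV and count (rating, sentiment) pairs row by row."""
    counts: Counter = Counter()
    # newline="" lets the csv module handle line breaks inside quoted review text.
    with open(path, encoding="utf-8", newline="") as f:
        for row in csv.DictReader(f):
            counts[(int(row["rating"]), row["sentiment"])] += 1
    return counts
```

Calling `rating_sentiment_counts("reviews.csv")` (the file name is an assumption) returns a counter whose values sum to the number of rows read, which gives a quick check of the row count and of how ratings line up with sentiment labels.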
Usage
Ideal applications and use cases for this dataset include:
- Natural Language Processing (NLP) sentiment analysis: Analysing the emotional tone of product reviews.
- Product review summarisation: Generating concise summaries from large volumes of review text.
- E-commerce recommender systems: Building systems that suggest products based on user preferences and review data.
- Fake review detection: Identifying fraudulent or inauthentic product reviews.
- LLM fine-tuning: Adapting Large Language Models (LLMs) to product-related tasks to improve their understanding and generation of e-commerce content.
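As one illustration of the fine-tuning use case, each review row could be turned into a JSON Lines record for a sentiment classification task. This is a sketch under stated assumptions: the `prompt`/`completion` layout and the `to_finetune_record` helper are hypothetical choices, not a format prescribed by the dataset or by any particular training framework, and the example row is made up.

```python
import json


def to_finetune_record(row: dict) -> str:
    """Serialise one review row as a single JSON line with prompt/completion keys."""
    prompt = (
        f"Product: {row['product_title']}\n"
        f"Category: {row['category']}\n"
        f"Review: {row['review_text']}\n"
        "Sentiment:"
    )
    record = {"prompt": prompt, "completion": " " + row["sentiment"]}
    # ensure_ascii=False keeps non-ASCII review text readable in UTF-8 output.
    return json.dumps(record, ensure_ascii=False)


# Hypothetical row, made up for illustration only.
example = {
    "product_id": "P0001",
    "product_title": "Wireless Bluetooth Earbuds",
    "category": "Electronics",
    "review_text": "Great sound, easy pairing.",
    "rating": "5",
    "sentiment": "Positive",
}
line = to_finetune_record(example)
```

Writing one such line per review produces a JSONL file; the same pattern could be adapted for summarisation or rating prediction by changing the prompt text and the target field.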
Coverage
The dataset's regional scope is global. Although the data is synthetic, it aims to represent a broad range of e-commerce review scenarios. The dataset was listed on 05/06/2025. Because the reviews are synthetically generated, no time range or demographic scope is provided for them.
License
CC0
Who Can Use It
This dataset is suitable for:
- Data scientists and machine learning engineers working on NLP, sentiment analysis, or recommender systems.
- Researchers in academia exploring e-commerce data trends, review analysis, or synthetic data generation.
- Developers building AI models for e-commerce platforms, particularly those involving customer feedback and product intelligence.
- Businesses seeking to understand customer sentiment or improve product offerings through data-driven insights.
Dataset Name Suggestions
- Synthetic E-commerce Product Reviews
- Digital Product Feedback Dataset
- Customer Review Analysis Data
- E-commerce Product Ratings & Sentiments
Attributes
Original Data Source: Synthetic E-commerce Product Reviews Dataset