Opendatabay APP

E-commerce Product Ratings & Sentiments

Reviews & Ratings

Tags and Keywords

  • Business
  • Classification
  • Synthetic
  • NLP

Free

About

This dataset comprises four million synthetic e-commerce product reviews spread across eight popular product categories. It provides realistic review text together with a star rating and a sentiment label for each review, for use in a variety of analytical and machine learning applications. Its purpose is to support the development and testing of systems for customer feedback analysis, product understanding, and AI model fine-tuning in an e-commerce context.

Columns

  • product_id: A unique, synthetic identifier for each product.
  • product_title: The name of the product, for instance, "Wireless Bluetooth Earbuds".
  • category: One of eight distinct product categories to which the product belongs.
  • review_text: A realistic user-generated review of the product.
  • rating: An integer value indicating the user's rating, ranging from 1 to 5 stars.
  • sentiment: The sentiment derived from the review text, categorised as Positive, Neutral, or Negative.
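The column descriptions above can be turned into a simple row check. The following is a minimal sketch, not part of the listing: the sample rows, the "Electronics" category name, and the `validate_row` helper are made up for illustration, and only the column names, the 1 to 5 rating range, and the three sentiment labels come from the descriptions above.

```python
import csv
import io

# Column names and allowed values as described in the Columns section.
EXPECTED_COLUMNS = ["product_id", "product_title", "category",
                    "review_text", "rating", "sentiment"]
ALLOWED_SENTIMENTS = {"Positive", "Neutral", "Negative"}


def validate_row(row):
    """Return a list of problems found in one review row (empty if none)."""
    missing = [c for c in EXPECTED_COLUMNS if row.get(c) is None]
    if missing:
        return [f"missing columns: {missing}"]

    problems = []
    try:
        rating_ok = 1 <= int(row["rating"]) <= 5
    except ValueError:
        rating_ok = False
    if not rating_ok:
        problems.append(f"rating not an integer from 1 to 5: {row['rating']!r}")
    if row["sentiment"].strip() not in ALLOWED_SENTIMENTS:
        problems.append(f"unexpected sentiment: {row['sentiment']!r}")
    return problems


# Hypothetical sample rows, invented for this example (not taken from the dataset).
SAMPLE_CSV = """product_id,product_title,category,review_text,rating,sentiment
P0001,Wireless Bluetooth Earbuds,Electronics,Great sound for the price.,5,Positive
P0002,Wireless Bluetooth Earbuds,Electronics,Stopped working after a week.,9,Negative
"""

reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
report = {row["product_id"]: validate_row(row) for row in reader}
for product_id, problems in report.items():
    print(product_id, problems or "ok")
```

The second sample row carries a deliberately invalid rating so that the check has something to flag.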

Distribution

The dataset contains four million synthetic e-commerce product reviews, provided as a UTF-8 encoded CSV file. No breakdown of row counts beyond the four-million total is given, but every review entry follows the same six-column structure.
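With four million rows in a single CSV file, it can be convenient to stream the file row by row rather than load it all at once. The sketch below is one way to do that with the Python standard library, assuming the column names listed under Columns. The function name, the sample rows, the category names, and the file name `reviews.csv` mentioned in the comment are hypothetical.

```python
import csv
import io
from collections import Counter


def sentiment_counts_by_category(lines):
    """Stream CSV lines and count sentiment labels per category.

    `lines` is any iterable of text lines, such as an open file,
    so only one row is held in memory at a time.
    """
    counts = {}
    for row in csv.DictReader(lines):
        counts.setdefault(row["category"], Counter())[row["sentiment"]] += 1
    return counts


# Hypothetical sample rows, invented for this example (not taken from the dataset).
SAMPLE_CSV = """product_id,product_title,category,review_text,rating,sentiment
P0001,Wireless Bluetooth Earbuds,Electronics,Great sound for the price.,5,Positive
P0002,Wireless Bluetooth Earbuds,Electronics,Stopped working after a week.,1,Negative
P0003,Ceramic Mug,Home,Does the job.,3,Neutral
"""

counts = sentiment_counts_by_category(io.StringIO(SAMPLE_CSV))
for category, counter in counts.items():
    print(category, dict(counter))

# For the real file, the same function could be called on an open file handle,
# assuming a local copy saved under the hypothetical name reviews.csv:
#   with open("reviews.csv", encoding="utf-8", newline="") as f:
#       counts = sentiment_counts_by_category(f)
```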

Usage

Ideal applications and use cases for this dataset include:
  • Natural Language Processing (NLP) sentiment analysis: Analysing the emotional tone of product reviews.
  • Product review summarisation: Generating concise summaries from large volumes of review text.
  • E-commerce recommender systems: Building systems that suggest products based on user preferences and review data.
  • Fake review detection: Identifying fraudulent or inauthentic product reviews.
  • LLM fine-tuning: Adapting Large Language Models (LLMs) to product-related tasks to improve their understanding and generation of e-commerce content.

Coverage

The dataset's regional scope is global. While the data is synthetic, it aims to represent a broad range of e-commerce review scenarios. The dataset was listed on 05/06/2025. Specific time ranges for the reviews themselves or demographic scope are not provided as the data is synthetically generated.

License

CC0

Who Can Use It

This dataset is suitable for:
  • Data scientists and machine learning engineers working on NLP, sentiment analysis, or recommender systems.
  • Researchers in academia exploring e-commerce data trends, review analysis, or synthetic data generation.
  • Developers building AI models for e-commerce platforms, particularly those involving customer feedback and product intelligence.
  • Businesses seeking to understand customer sentiment or improve product offerings through data-driven insights.

Dataset Name Suggestions

  • Synthetic E-commerce Product Reviews
  • Digital Product Feedback Dataset
  • Customer Review Analysis Data
  • E-commerce Product Ratings & Sentiments

Attributes

Listing Stats

  • Views: 1
  • Downloads: 0
  • Listed: 05/06/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
