Preprocessed Amazon Review Sentiment

Entertainment & Media Consumption

Tags and Keywords

Tabular

Retail

Intermediate

NLP

NLTK

Sentiment

Reviews

Preprocessed

Free

About

This dataset contains preprocessed Amazon product reviews for the Gen3EcoDot, scraped primarily from amazon.in. It is intended for training and testing classification models, particularly for sentiment analysis. The review text has been stemmed and lemmatised with NLTK, and the sentiment labels are derived from TextBlob polarity scores, so the data can be used directly in machine learning and natural language processing tasks.

Columns

  • Index: A unique identifier for each record within the dataset.
  • Review: The original, raw text of the customer review.
  • Stemmed and Lemmatised review using nltk: The preprocessed version of the review text, optimised for text analysis and model training.
  • Polarity: The numerical polarity score derived from the TextBlob analysis, indicating the sentiment expressed in the review (ranging from -1.00 for negative to 1.00 for positive sentiment).
  • Division: A categorical label generated based on the polarity score, providing discrete sentiment categories.
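
As a rough illustration of how these columns might be read, the sketch below parses a tiny in-memory CSV with Python's standard `csv` module and derives a coarse label from the polarity score. The sample rows are made up, the exact header strings in the real file may differ from the column names listed above, and the thresholds in `label_from_polarity` are assumptions: the listing does not state which polarity ranges define each Division value.

```python
import csv
import io

# Made-up sample rows for illustration only; header names follow the
# column list above but may not match the real file exactly.
SAMPLE_CSV = """Index,Review,Stemmed and Lemmatised review using nltk,Polarity,Division
0,"Great sound, love it",great sound love,0.65,positive
1,Stopped working in a week,stop work week,-0.40,negative
2,It is a speaker,speaker,0.00,neutral
"""


def label_from_polarity(polarity, neutral_band=0.0):
    """Map a polarity score in [-1, 1] to a coarse label.

    The cut-offs are assumptions, not the dataset's documented rule.
    """
    if polarity > neutral_band:
        return "positive"
    if polarity < -neutral_band:
        return "negative"
    return "neutral"


def load_rows(handle):
    """Read records and convert the numeric columns to Python types."""
    rows = []
    for record in csv.DictReader(handle):
        polarity = float(record["Polarity"])
        rows.append({
            "index": int(record["Index"]),
            "review": record["Review"],
            "processed": record["Stemmed and Lemmatised review using nltk"],
            "polarity": polarity,
            "division": record["Division"],
            "derived_label": label_from_polarity(polarity),
        })
    return rows


rows = load_rows(io.StringIO(SAMPLE_CSV))
for row in rows:
    print(row["index"], row["polarity"], row["derived_label"])
```

To read the real file, the `io.StringIO` object would be replaced by an open file handle; comparing `derived_label` with the stored Division value is one way to check how closely the assumed thresholds match the dataset's own categories.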

Distribution

The dataset is provided in tabular form, typically as a CSV file, and is preprocessed for immediate use. The listing does not state a single total row count. The 'division' column includes approximately 4156 records, categorised into various ranges, while the per-range counts for the 'polarity' column sum to approximately 4084 records covering positive, neutral, and negative sentiment.
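
Because the two column totals quoted above differ, it may be worth recomputing the counts after download. The following is a minimal sketch, assuming records are dictionaries keyed by column name (as `csv.DictReader` would produce); the sample records are invented, and empty cells are only one possible explanation for the difference, which the listing does not confirm.

```python
from collections import Counter


def summarise(records):
    """Count non-empty values per column and tally Division labels."""
    non_empty = Counter()
    divisions = Counter()
    for record in records:
        for column, value in record.items():
            # Treat None and whitespace-only strings as missing.
            if value is not None and str(value).strip() != "":
                non_empty[column] += 1
        label = (record.get("Division") or "").strip()
        if label:
            divisions[label] += 1
    return non_empty, divisions


# Invented records for illustration; one row has no polarity value.
records = [
    {"Polarity": "0.65", "Division": "positive"},
    {"Polarity": "", "Division": "neutral"},
    {"Polarity": "-0.40", "Division": "negative"},
    {"Polarity": "0.10", "Division": "positive"},
]

non_empty, divisions = summarise(records)
print(dict(non_empty))
print(dict(divisions))
```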

Usage

This dataset is ideal for a variety of applications, including:
  • Training and testing sentiment classification models.
  • Developing and evaluating Natural Language Processing (NLP) algorithms.
  • Conducting sentiment analysis on product reviews.
  • Academic research in text analytics and machine learning.
  • Building applications that require pre-classified text data.
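
For the training and testing use case, a common first step is to hold out part of the data while keeping the label proportions similar in both parts. The sketch below shows one way to do a stratified split using only the standard library; the function name `stratified_split`, the 25% test fraction, and the placeholder examples are all assumptions for illustration, not part of the dataset.

```python
import random
from collections import defaultdict


def stratified_split(examples, test_fraction=0.25, seed=0):
    """Split (text, label) pairs so each label keeps roughly the same share.

    Every label contributes at least one example to the test set, so a
    label with a single example ends up only in the test set.
    """
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        n_test = max(1, round(len(group) * test_fraction))
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


# Placeholder examples: 8 "positive" and 4 "negative" reviews.
examples = [(f"review {i}", "positive") for i in range(8)]
examples += [(f"review {i}", "negative") for i in range(8, 12)]

train, test = stratified_split(examples)
print(len(train), len(test))
```

With real data, the text would come from the preprocessed review column and the label from the Division column; libraries such as scikit-learn offer an equivalent split, but the standard-library version keeps the logic visible.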

Coverage

The reviews come primarily from amazon.in, although the listing's region is marked as Global. The dataset was listed on 16/06/2025. No time range for the reviews themselves and no demographic details are provided.

License

CC0

Who Can Use It

This dataset is suitable for:
  • Data scientists and machine learning engineers looking for preprocessed text data to train classification models.
  • Researchers and academics in NLP, sentiment analysis, and text mining.
  • Students learning about text preprocessing and sentiment modelling.
  • Developers building applications that require sentiment understanding from review data.

Dataset Name Suggestions

  • Preprocessed Amazon Review Sentiment
  • Gen3EcoDot Sentiment Analysis Dataset
  • E-commerce Product Review Polarity
  • NLTK TextBlob Sentiment Data

Attributes

Listing Stats

  • Views: 2
  • Downloads: 1
  • Listed: 16/06/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0