E-commerce Product Review Dataset
E-commerce & Online Transactions
Free
About
This dataset contains web-scraped customer reviews and product details from Amazon and Mercado Livre, originally gathered for a personal project. Each record pairs a search query (the 'search query' column) with the title and link of a product returned for that query and a customer review of that product. The same search queries were run on both Amazon and Mercado Livre. The dataset is particularly useful for Natural Language Processing (NLP) studies in Portuguese.
Columns
- Search Query: Represents the input string used to search for products. This column includes entries such as 'smartphone' and 'mouse', with an 'Other' category accounting for the majority of queries. There are 8542 total values in this column.
- Product Title: Provides the title of the product retrieved during the search. There are 909 unique product titles recorded.
- Link: Contains the URL directing to the product page on Amazon or Mercado Livre. This column features 908 unique links.
- Review: Holds the actual customer review or comment associated with the product. There are 7668 unique review entries.
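As a rough way to check a downloaded copy against the counts listed above, the following sketch tallies total and unique non-empty values per column. The header names in `COLUMNS` are assumptions based on the column descriptions, and the sample rows are made up for illustration; they are not taken from the dataset.

```python
import csv
import io

# Assumed header names; the real file may spell them differently.
COLUMNS = ["Search Query", "Product Title", "Link", "Review"]

def column_summary(csv_file):
    """Return {column: (non_empty_values, unique_values)} for the expected columns."""
    totals = {c: 0 for c in COLUMNS}
    uniques = {c: set() for c in COLUMNS}
    for row in csv.DictReader(csv_file):
        for c in COLUMNS:
            value = (row.get(c) or "").strip()
            if value:
                totals[c] += 1
                uniques[c].add(value)
    return {c: (totals[c], len(uniques[c])) for c in COLUMNS}

# Made-up rows for illustration only.
SAMPLE = """Search Query,Product Title,Link,Review
mouse,Mouse sem fio X,https://example.com/p/1,Muito bom
mouse,Mouse sem fio X,https://example.com/p/1,Chegou rápido
smartphone,Smartphone Y,https://example.com/p/2,Muito bom
"""

summary = column_summary(io.StringIO(SAMPLE))
for column, (total, unique) in summary.items():
    print(f"{column}: {total} values, {unique} unique")
```

For a real file, the same function could be called on a handle opened with `open(path, newline="", encoding="utf-8")`, where `path` is wherever the CSV was saved.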
Distribution
The data is tabular and is typically provided as a CSV file. The total row count is not stated explicitly, although the Search Query column is listed with 8542 values. Unique value counts are given for the other columns: 909 unique product titles, 908 unique links, and 7668 unique reviews. This structure makes it possible to map each search query to the products it returned and to the reviews of those products.
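The query-to-product-to-review mapping can be sketched as a nested dictionary. This is a minimal illustration, assuming the header names `Search Query`, `Product Title`, and `Review`; the rows are invented for the example.

```python
import csv
import io
from collections import defaultdict

def group_reviews(csv_file):
    """Build {search query: {product title: [reviews]}} from CSV rows."""
    grouped = defaultdict(lambda: defaultdict(list))
    for row in csv.DictReader(csv_file):
        # Header names are assumptions; adjust them to the actual file.
        grouped[row["Search Query"]][row["Product Title"]].append(row["Review"])
    # Convert to plain dicts so later lookups do not create empty entries.
    return {query: dict(products) for query, products in grouped.items()}

# Made-up rows for illustration only.
SAMPLE = """Search Query,Product Title,Link,Review
mouse,Mouse sem fio X,https://example.com/p/1,Muito bom
mouse,Mouse sem fio X,https://example.com/p/1,Chegou rápido
mouse,Mouse gamer Z,https://example.com/p/3,Ótimo custo-benefício
smartphone,Smartphone Y,https://example.com/p/2,Muito bom
"""

grouped = group_reviews(io.StringIO(SAMPLE))
for query, products in grouped.items():
    for title, reviews in products.items():
        print(f"{query} / {title}: {len(reviews)} review(s)")
```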
Usage
This dataset is best suited to NLP work in Portuguese. Typical uses include sentiment analysis, text classification, topic modelling, and building language models from e-commerce review data.
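A common first step for any of these tasks is tokenising the review text and counting word frequencies. The sketch below uses a deliberately simple regular-expression tokeniser that keeps accented Portuguese letters; it does no stop-word removal or stemming, and the reviews shown are invented examples rather than dataset rows.

```python
import re
from collections import Counter

# Matches runs of Unicode letters, so accented characters such as "ã" and "é" are kept.
WORD = re.compile(r"[^\W\d_]+")

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return WORD.findall(text.lower())

def word_counts(reviews):
    """Count token occurrences across an iterable of review strings."""
    counts = Counter()
    for review in reviews:
        counts.update(tokenize(review))
    return counts

# Made-up reviews for illustration only.
reviews = ["Muito bom, chegou rápido", "Não é bom!"]
counts = word_counts(reviews)
print(counts.most_common(3))
```

A real pipeline would likely swap this tokeniser for one from an NLP library with Portuguese support; the regular expression is only a dependency-free stand-in.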
Coverage
The dataset covers product reviews from Amazon and Mercado Livre, with a focus on Portuguese-language content. Although Mercado Livre primarily serves Brazil and other Latin American countries, the dataset's region is listed as global. It was listed as of 17th June 2025.
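No platform column is listed, so one way to separate Amazon rows from Mercado Livre rows is to inspect the host name in the Link column. The sketch below assumes that the host contains a label `amazon` or `mercadolivre`; those labels and the example URLs are assumptions, not values confirmed from the dataset.

```python
from urllib.parse import urlparse

# Assumed host labels for each platform; verify against the actual links.
PLATFORM_LABELS = {"amazon": "Amazon", "mercadolivre": "Mercado Livre"}

def platform_of(url):
    """Guess the platform from a product URL, or return 'Unknown'."""
    host = urlparse(url).hostname or ""
    for label in host.lower().split("."):
        if label in PLATFORM_LABELS:
            return PLATFORM_LABELS[label]
    return "Unknown"

# Made-up URLs for illustration only.
examples = [
    "https://www.amazon.com.br/dp/EXAMPLE",
    "https://produto.mercadolivre.com.br/EXAMPLE",
    "https://example.com/p/1",
]
platforms = [platform_of(u) for u in examples]
print(platforms)
```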
License
CC0
Who Can Use It
This dataset is suitable for:
- Data Scientists and Researchers: For developing and testing NLP models specifically for Portuguese language processing.
- Academics: For linguistic studies on e-commerce communication and consumer sentiment in Portuguese.
- Developers: For building applications that require understanding or generating Portuguese text, particularly in the context of product reviews.
Dataset Name Suggestions
- Amazon & Mercado Livre Portuguese Reviews
- E-commerce Product Review Dataset (Portuguese)
- Portuguese NLP E-commerce Comments
- Avaliações em Português - Amazon e Mercado Livre
Attributes
Original Data Source: Avaliações em Português - Amazon e Mercado Livre