Daraz Multilingual Sentiment Dataset
Free
About
This dataset comprises 16,990 product reviews collected from Daraz [4]. The reviews are code-mixed, combining English, Roman Urdu, and Urdu, and are categorised into three sentiment classes: positive, negative, and neutral [4, 5]. It is intended as a resource for sentiment analysis and other natural language processing tasks [5].
Columns
The specific column names for this dataset are not provided in the source material.
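Because the column names are not documented here, a reasonable first step after downloading is to read the header row. The following is a minimal sketch, assuming the data ships as a UTF-8 CSV file with a header row; the file format, the `read_header` helper, and the sample column names are all hypothetical and not confirmed by the source.

```python
import csv
import io


def read_header(stream):
    """Return the first row of a CSV stream as a list of column names."""
    reader = csv.reader(stream)
    # next() with a default avoids StopIteration on an empty file.
    return next(reader, [])


# Hypothetical in-memory sample; the real column names are unknown.
sample = io.StringIO("review_text,label\nacha product hai,positive\n")
columns = read_header(sample)
print(columns)
```

For a file on disk, the same helper can be given a handle opened with `open(path, newline="", encoding="utf-8")`, where `path` is whatever name the downloaded file has.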
Distribution
The dataset contains 16,990 unique product reviews [4]. Sentiment labels are distributed as follows: 60% positive, 26% negative, and 14% 'Other' (2,461 reviews) [4]. The reviews are code-mixed, combining English, Roman Urdu, and Urdu text [4, 5]. The data version is 1.0 [6].
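The stated figures can be cross-checked with a little arithmetic. The sketch below uses only the numbers given above; the per-class counts for positive and negative are estimates derived from the rounded percentages, not counts reported by the source, and the variable names are illustrative.

```python
TOTAL_REVIEWS = 16_990  # total stated in the description
OTHER_REVIEWS = 2_461   # the only class count stated explicitly

# Share of 'Other' computed from the stated count (about 14.5%,
# consistent with the rounded 14% figure).
other_share = OTHER_REVIEWS / TOTAL_REVIEWS
print(f"Other share: {other_share:.1%}")

# Reviews left over for the positive and negative classes combined.
remaining = TOTAL_REVIEWS - OTHER_REVIEWS
print(f"Positive + negative: {remaining}")

# Estimated counts implied by the rounded percentages. Because the
# percentages are rounded, these estimates need not sum to `remaining`.
reported_pct = {"positive": 60, "negative": 26}
approx_counts = {
    label: round(TOTAL_REVIEWS * pct / 100)
    for label, pct in reported_pct.items()
}
print(approx_counts)
```

The class imbalance this implies (roughly 10,000 positive against roughly 4,400 negative reviews) is worth keeping in mind when choosing evaluation metrics or splitting the data.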
Usage
This dataset is suited to training and evaluating machine learning models for sentiment analysis, particularly in code-mixed language settings [5]. It can also support text classification tasks, natural language processing research, and AI applications that need to interpret user sentiment in e-commerce reviews [5].
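When working with code-mixed text, it can help to profile how much of each review is written in Urdu script versus Latin script before choosing a tokenizer or model. The following is a minimal sketch using only the Python standard library; the helper names and the sample sentence are hypothetical. It checks the basic Arabic Unicode block (U+0600 to U+06FF), which covers common Urdu letters, and it cannot tell English from Roman Urdu, since both use Latin script.

```python
def token_script(token):
    """Label a token by writing system: 'arabic', 'latin', or 'other'."""
    # Urdu is written in Arabic script; check the basic Arabic block.
    if any("\u0600" <= ch <= "\u06FF" for ch in token):
        return "arabic"
    # English and Roman Urdu both land here.
    if any(ch.isascii() and ch.isalpha() for ch in token):
        return "latin"
    # Digits, punctuation, emoji, and anything else.
    return "other"


def script_profile(text):
    """Count whitespace-separated tokens per writing system."""
    counts = {"arabic": 0, "latin": 0, "other": 0}
    for token in text.split():
        counts[token_script(token)] += 1
    return counts


# Hypothetical code-mixed review, not taken from the dataset.
profile = script_profile("product بہت اچھا hai 10/10")
print(profile)
```

A profile like this only describes the script mix; separating English from Roman Urdu would need a word list or a language identification model, which is outside the scope of this sketch.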
Coverage
The data is sourced from Daraz product reviews [4] and is applicable globally [6]. It encompasses reviews written in English, Roman Urdu, and Urdu [4, 5].
License
CC BY
Who Can Use It
The dataset is intended for data scientists and AI/ML engineers working on natural language processing, sentiment analysis, or text classification [5, 6]. It is also relevant to researchers interested in code-mixed language analysis or e-commerce sentiment trends [4, 5].
Dataset Name Suggestions
- Daraz Code-Mixed Product Review Sentiments
- E-commerce Code-Mixed Review Sentiment Dataset
- Urdu-English Product Reviews for Sentiment Analysis
- Daraz Multilingual Sentiment Dataset
Attributes
Original Data Source: Daraz Code Mixed Product Reviews