Rotten Tomatoes Movie Review Sentiment Dataset
Entertainment & Media Consumption
Free
About
This is a sentiment classification dataset of 5,331 positive and 5,331 negative processed sentences derived from Rotten Tomatoes movie reviews. Its primary purpose is to serve as a benchmark for text classification. The data is distributed as simple CSV files with two columns, reviews and labels, and the reviews average 21 words. Shuffle the data before use: the first 5,331 rows are all negative samples and the remaining 5,331 are all positive.
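The loading and shuffling step can be sketched as follows. This is a minimal illustration, assuming the CSV has a header row naming the columns `reviews` and `labels`; the sample rows and the helper name `load_and_shuffle` are made up for the example, and the real file would be opened with `open(...)` in place of the in-memory string.

```python
import csv
import io
import random

# Tiny made-up stand-in for the real CSV, mimicking its layout:
# all negative rows first, then all positive rows.
SAMPLE_CSV = """reviews,labels
a dull and lifeless mess,0
the plot never comes together,0
a warm and funny surprise,1
beautifully acted and sharply written,1
"""

def load_and_shuffle(handle, seed=42):
    """Read (review, label) pairs from a CSV handle and shuffle them."""
    rows = [(r["reviews"], int(r["labels"])) for r in csv.DictReader(handle)]
    random.Random(seed).shuffle(rows)  # fixed seed keeps runs reproducible
    return rows

rows = load_and_shuffle(io.StringIO(SAMPLE_CSV))
```

Fixing the seed is a design choice for reproducibility; pass a different seed (or a fresh `random.Random()`) if a different ordering is wanted on each run.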
Columns
- reviews: Text of the review, a processed sentence derived from a Rotten Tomatoes movie review.
- labels: A binary indicator where '1' signifies a fresh (good) review and '0' signifies a rotten (bad) review.
Distribution
The dataset is provided as CSV files with two columns and 10,662 unique records in total: 5,331 labelled '0' (rotten) and 5,331 labelled '1' (fresh).
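The stated class balance and average length are easy to check once the rows are loaded. The sketch below assumes the data is already in memory as (text, label) pairs; the helper name `summarize` and the sample rows are illustrative, and words are counted by simple whitespace splitting, which may differ from how the 21-word average was computed.

```python
from collections import Counter

def summarize(rows):
    """Return (label counts, mean whitespace-token count) for (text, label) pairs."""
    counts = Counter(label for _, label in rows)
    mean_words = sum(len(text.split()) for text, _ in rows) / len(rows)
    return counts, mean_words

# Made-up rows for illustration only.
sample = [
    ("a dull mess", 0),
    ("never comes together", 0),
    ("a warm and funny surprise", 1),
    ("sharply written", 1),
]
counts, mean_words = summarize(sample)
```

On the full dataset, `counts` would be expected to show 5,331 rows for each label if the file matches the description above.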
Usage
This dataset is ideal for:
- Sentiment analysis applications.
- Developing and evaluating text classification models.
- Natural Language Processing (NLP) research and development.
- Training models for binary classification problems in text.
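For model development and evaluation, one common approach is to hold out a test set that preserves the 50/50 label balance. The sketch below is one possible way to do that with the standard library; it is not a split defined by the dataset itself, and the function name `stratified_split`, the 20% test fraction, and the placeholder rows are all assumptions.

```python
import random

def stratified_split(rows, test_fraction=0.2, seed=0):
    """Split (text, label) pairs so each label keeps the same test share."""
    rng = random.Random(seed)
    by_label = {}
    for text, label in rows:
        by_label.setdefault(label, []).append((text, label))
    train, test = [], []
    for group in by_label.values():
        rng.shuffle(group)                       # randomize within each label
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    rng.shuffle(train)                           # avoid label-ordered output
    rng.shuffle(test)
    return train, test

# Placeholder rows standing in for the real reviews.
rows = [(f"negative example {i}", 0) for i in range(10)]
rows += [(f"positive example {i}", 1) for i in range(10)]
train, test = stratified_split(rows)
```

Splitting per label before shuffling the combined sets keeps both partitions balanced, which matters here because the source file stores all negatives before all positives.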
Coverage
Coverage is global: the reviews are not restricted to any geographic region. The available information does not specify a time range or demographic scope.
License
CC0
Who Can Use It
This dataset is particularly useful for:
- AI and Machine Learning developers building sentiment analysis tools.
- Researchers in the field of Natural Language Processing (NLP).
- Data scientists looking for benchmark datasets for text classification.
- Academics studying text mining and opinion analysis.
Dataset Name Suggestions
- Rotten Tomatoes Movie Review Sentiment Dataset
- Film Review Sentiment Classification Data
- Rotten Tomatoes User Review Polarity
- Movie Review Sentiment Benchmark
Attributes
Original Data Source: Rotten Tomatoes Reviews Dataset