Binary Movie Sentiment Analysis
Entertainment & Media Consumption
Free
About
This is a large dataset designed for binary sentiment classification of movie reviews. It contains substantially more data than prior benchmark datasets, providing 25,000 highly polar movie reviews for training and an additional 25,000 for testing. Additional unlabelled data is also available. The data fields are the same across all segments of the dataset. Its primary purpose is to enable the classification of movie reviews as either positive or negative in sentiment.
Columns
- text: The actual text content of the movie review (String).
- label: The sentiment label assigned to the review, where 0 indicates a negative sentiment and 1 signifies a positive sentiment (Integer).
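As an illustration of the two columns above, the following minimal sketch reads rows with a `text` and a `label` field and checks that every label is 0 or 1. It uses only the Python standard library. The sample rows, the `load_reviews` helper, and the assumption that the CSV header uses exactly the names `text` and `label` are hypothetical and not taken from the dataset itself.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for the real file, which holds 25,000 rows.
SAMPLE_CSV = '''text,label
"A moving, beautifully acted film.",1
"Dull plot, wooden dialogue.",0
"Dull plot, wooden dialogue.",0
'''

def load_reviews(handle):
    """Read (text, label) pairs, rejecting labels other than 0 or 1."""
    rows = []
    for record in csv.DictReader(handle):
        label = int(record["label"])
        if label not in (0, 1):
            raise ValueError(f"unexpected label: {label}")
        rows.append((record["text"], label))
    return rows

rows = load_reviews(io.StringIO(SAMPLE_CSV))

# Tally reviews per label and count distinct review texts.
label_counts = Counter(label for _, label in rows)
unique_texts = len({text for text, _ in rows})
print(label_counts, unique_texts)
```

To run the same check against a downloaded file, pass an open file handle instead of the in-memory sample, for example one obtained with `open("test.csv", newline="", encoding="utf-8")`.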
Distribution
The dataset is typically provided in CSV format. It comprises 25,000 highly polar movie reviews for training and 25,000 for testing. Specifically, the test file, test.csv, lists movie reviews along with their sentiment labels. There are 24,801 unique text values within the dataset. The label distribution shows 12,500 reviews with a '0' label and 12,500 reviews with a '1' label.
Usage
This dataset is ideally suited for:
- Training binary sentiment classification models.
- Developing models to categorise movie reviews into positive and negative sentiment groups.
- Building a substantial movie review database for research initiatives.
To use this dataset, you need a machine learning algorithm capable of binary classification, such as logistic regression or a support vector machine.
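To make the logistic regression option concrete, here is a minimal sketch of a bag-of-words logistic regression trained by stochastic gradient descent, written with the Python standard library only. The toy reviews, the tokeniser, and the hyperparameters (`epochs`, `lr`) are assumptions chosen for illustration. A real project would more likely train on the full 25,000-review training split with an established library.

```python
import math
import re
from collections import defaultdict

# Hypothetical toy reviews; in practice these come from the training file.
TRAIN = [
    ("a wonderful heartfelt film with great acting", 1),
    ("great story and wonderful cast", 1),
    ("a boring dull film with terrible acting", 0),
    ("terrible story and boring cast", 0),
]

def tokenize(text):
    """Lowercase the review and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def predict_proba(weights, bias, tokens):
    """Probability that the review is positive (label 1)."""
    score = bias + sum(weights.get(t, 0.0) for t in tokens)
    return 1.0 / (1.0 + math.exp(-score))

def train(examples, epochs=200, lr=0.5):
    """Fit word-count logistic regression by stochastic gradient descent."""
    weights, bias = defaultdict(float), 0.0
    for _ in range(epochs):
        for text, label in examples:
            tokens = tokenize(text)
            # Gradient of the log loss with respect to the score.
            error = predict_proba(weights, bias, tokens) - label
            bias -= lr * error
            for t in tokens:
                weights[t] -= lr * error
    return weights, bias

weights, bias = train(TRAIN)
p_pos = predict_proba(weights, bias, tokenize("wonderful great acting"))
p_neg = predict_proba(weights, bias, tokenize("dull terrible story"))
print(round(p_pos, 3), round(p_neg, 3))
```

A probability above 0.5 maps to label 1 (positive) and below 0.5 to label 0 (negative), matching the label encoding described under Columns.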
Coverage
The data's regional coverage is global.
License
CC0
Who Can Use It
This dataset is intended for:
- Data scientists looking to build or improve sentiment analysis models.
- Machine learning engineers developing natural language processing (NLP) applications for text classification.
- Researchers studying public opinion or movie review dynamics.
Dataset Name Suggestions
- IMDB Movie Review Sentiment Dataset
- Binary Movie Sentiment Analysis
- Large Movie Review Sentiment Corpus
- Movie Review Polarity Data
Attributes
Original Data Source: IMDB Movie Reviews (Binary Sentiment)