Opendatabay APP

Binary Movie Sentiment Analysis

Entertainment & Media Consumption

Tags and Keywords

Movies

Text

NLP


Free

About

This is a large dataset designed for binary sentiment classification of movie reviews. It offers a substantial amount of data compared to prior benchmark datasets, providing 25,000 highly polar movie reviews for training and an additional 25,000 for testing. Unlabelled data is also available for use. The data fields remain consistent across all segments of the dataset. Its primary purpose is to enable the classification of movie reviews into either positive or negative sentiment categories.

Columns

  • text: The actual text content of the movie review (String).
  • label: The sentiment label assigned to the review, where 0 indicates a negative sentiment and 1 signifies a positive sentiment (Integer).
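A minimal sketch of reading these two columns with Python's standard csv module follows. It assumes the file has a header row naming the columns text and label, which the listing does not state. The function name load_reviews and the two sample reviews are invented for illustration and are not taken from the dataset.

```python
import csv
import io


def load_reviews(csv_file):
    """Read (text, label) pairs from an open CSV file object.

    Assumes a header row with 'text' and 'label' columns.
    """
    rows = []
    for record in csv.DictReader(csv_file):
        label = int(record["label"])
        if label not in (0, 1):
            raise ValueError(f"unexpected label: {label}")
        rows.append((record["text"], label))
    return rows


# Tiny in-memory stand-in for a real file; the reviews are made up.
sample = io.StringIO(
    "text,label\n"
    '"A dull, lifeless film.",0\n'
    '"Warm, funny, and beautifully acted.",1\n'
)
reviews = load_reviews(sample)
```

For a file on disk, the same function could be called as load_reviews(open("test.csv", newline="", encoding="utf-8")), assuming the file is UTF-8 encoded.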

Distribution

The dataset is typically provided in CSV format. It comprises 25,000 highly polar movie reviews for training and 25,000 for testing. The test file, test.csv, lists movie reviews along with their sentiment labels. The listing reports 24,801 unique text values and a balanced label distribution: 12,500 reviews labelled 0 and 12,500 labelled 1. These label counts sum to 25,000, the size of a single split.
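The figures above can be checked after loading a split. The sketch below assumes the reviews are already held as (text, label) pairs; the helper name summarise and the four sample rows are hypothetical.

```python
from collections import Counter


def summarise(rows):
    """Count rows, distinct review texts, and reviews per label."""
    label_counts = Counter(label for _, label in rows)
    unique_texts = len({text for text, _ in rows})
    return {
        "rows": len(rows),
        "unique_texts": unique_texts,
        "label_counts": dict(label_counts),
    }


# Made-up rows, including one duplicated review text.
rows = [
    ("Great film.", 1),
    ("Terrible.", 0),
    ("Great film.", 1),
    ("Not for me.", 0),
]
summary = summarise(rows)
```

If the reported figures describe the split being loaded, the summary would be expected to show 25,000 rows, 24,801 unique texts, and 12,500 reviews per label.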

Usage

This dataset is ideally suited for:
  • Training binary sentiment classification models.
  • Developing models to categorise movie reviews into positive and negative sentiment groups.
  • Building a substantial movie review database for research initiatives.

To use this dataset, apply a machine learning algorithm capable of binary classification, such as logistic regression or a support vector machine.
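To illustrate what binary classification with logistic regression involves, here is a small pure-Python sketch trained by stochastic gradient descent on word-presence features. It is not the listing's method, and in practice an established machine learning library would be the usual choice. The function names, the hyperparameters, and the four training reviews are all invented for illustration.

```python
import math
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the review and return its set of word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))


def probability_positive(weights, bias, tokens):
    """Sigmoid of the summed weights of the tokens present."""
    score = bias + sum(weights.get(tok, 0.0) for tok in tokens)
    score = max(-30.0, min(30.0, score))  # keep math.exp in a safe range
    return 1.0 / (1.0 + math.exp(-score))


def train(rows, epochs=200, learning_rate=0.5):
    """Fit one weight per word plus a bias with per-example gradient steps."""
    weights = defaultdict(float)
    bias = 0.0
    data = [(tokenize(text), label) for text, label in rows]
    for _ in range(epochs):
        for tokens, label in data:
            error = label - probability_positive(weights, bias, tokens)
            bias += learning_rate * error
            for tok in tokens:
                weights[tok] += learning_rate * error
    return dict(weights), bias


def predict(weights, bias, text):
    """Return 1 (positive) or 0 (negative) for a review."""
    return 1 if probability_positive(weights, bias, tokenize(text)) >= 0.5 else 0


# Made-up training reviews using the dataset's label convention (0 / 1).
training_rows = [
    ("A wonderful, moving film", 1),
    ("Brilliant acting and a wonderful script", 1),
    ("Dull plot and terrible acting", 0),
    ("A terrible, boring mess", 0),
]
weights, bias = train(training_rows)
```

Calling predict(weights, bias, "some review text") then returns 0 or 1. With 25,000 training reviews, a held-out evaluation on the test split would be the natural next step.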

Coverage

The data's regional coverage is global.

License

CC0

Who Can Use It

This dataset is intended for:
  • Data scientists looking to build or improve sentiment analysis models.
  • Machine learning engineers developing natural language processing (NLP) applications for text classification.
  • Researchers studying public opinion or movie review dynamics.

Dataset Name Suggestions

  • IMDB Movie Review Sentiment Dataset
  • Binary Movie Sentiment Analysis
  • Large Movie Review Sentiment Corpus
  • Movie Review Polarity Data

Attributes

Listing Stats

  • Views: 4
  • Downloads: 0
  • Listed: 17/06/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
