Text Classification Dataset
Education & Learning Analytics
Free
About
A curated dataset of 241,000+ English-language comments labeled for sentiment (negative, neutral, positive). Ideal for training and evaluating NLP models in sentiment analysis.
Dataset Features
1. text:
Contains individual English-language comments or posts sourced from various online platforms.
2. label:
Represents the sentiment classification assigned to each comment. It uses the following encoding:
0 — Negative sentiment
1 — Neutral sentiment
2 — Positive sentiment
Distribution
- Format: CSV (Comma-Separated Values)
- 2 Columns:
  - text: The comment content
  - label: Sentiment classification (0 = Negative, 1 = Neutral, 2 = Positive)
- File Size: Approximately 23.9 MB
- Structure: Each row contains a single comment and its corresponding sentiment label.
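The two-column layout described above can be read with Python's standard `csv` module. The following is a minimal sketch, not an official loader: it assumes the file has a header row naming the `text` and `label` columns, and it uses a small made-up in-memory sample in place of the real file.

```python
import csv
import io
from collections import Counter

# Label encoding as documented in the dataset description.
LABEL_NAMES = {0: "negative", 1: "neutral", 2: "positive"}


def read_rows(handle):
    """Yield (text, label) pairs from an open CSV file object.

    Assumption: the first row is a header with the columns `text` and `label`.
    """
    reader = csv.DictReader(handle)
    for row in reader:
        yield row["text"], int(row["label"])


# Tiny made-up sample standing in for the real file (not actual dataset rows).
SAMPLE_CSV = io.StringIO(
    "text,label\n"
    '"Terrible service, never again",0\n'
    '"The package arrived on Tuesday",1\n'
    '"Absolutely love it, thanks!",2\n'
    '"Works fine, would buy again",2\n'
)

rows = list(read_rows(SAMPLE_CSV))

# Count how many comments fall under each sentiment name.
label_counts = Counter(LABEL_NAMES[label] for _, label in rows)
print(label_counts)
```

To read the downloaded file instead, pass `open(path, newline="", encoding="utf-8")` to `read_rows`, where `path` is wherever the CSV was saved. The UTF-8 encoding is an assumption; the listing does not state one.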
Usage
This dataset is ideal for a variety of applications:
1. Sentiment Analysis Model Training: Train machine learning or deep learning models to classify text as positive, negative, or neutral.
2. Text Classification Projects: Use as a labeled dataset for supervised learning in text classification tasks.
3. Customer Feedback Analysis: Train models to automatically interpret user reviews, support tickets, or survey responses.
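For any of the supervised uses listed above, a common first step is to hold out a test set that preserves the balance of the three labels. The sketch below shows one way to do a stratified split with the standard library only. The function name `stratified_split`, the 20% test fraction, and the example rows are illustrative assumptions, not part of the dataset.

```python
import random
from collections import defaultdict


def stratified_split(rows, test_fraction=0.2, seed=0):
    """Split (text, label) pairs into train and test lists, label by label.

    Each label contributes roughly `test_fraction` of its rows to the test
    set, so both parts keep a similar class balance.
    """
    by_label = defaultdict(list)
    for text, label in rows:
        by_label[label].append((text, label))

    rng = random.Random(seed)  # fixed seed makes the split reproducible
    train, test = [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


# Made-up examples: 10 per class, using the 0/1/2 encoding from the dataset.
examples = [(f"comment {i}", i % 3) for i in range(30)]
train_rows, test_rows = stratified_split(examples, test_fraction=0.2)
print(len(train_rows), len(test_rows))
```

Splitting per label rather than over the whole file matters most when one sentiment class is much rarer than the others. The listing does not report the class distribution, so it is worth checking before training.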
Coverage
- Geographic Coverage: Primarily English-language content from global online platforms.
- Time Range: The exact time range of data collection is unspecified; however, the dataset reflects contemporary online language patterns and sentiment trends typically observed in the 2010s to early 2020s.
- Demographics: Specific demographic information (e.g., age, gender, location, industry) is not included in the dataset, as the focus is purely on textual sentiment rather than user profiling.
License
CC0
Who Can Use It
- Data Scientists: For training machine learning models.
- Researchers: For academic or scientific studies.
- Businesses: For analysis, insights, or AI development.