Twitter Reddit Election Sentiment Analysis Data
Reddit & Forum Data
Free
About
This dataset was created as part of a university project on sentiment analysis across multiple social media platforms, specifically Twitter and Reddit. It contains tweets and comments from these platforms about the 2019 Indian General Elections, in particular about Narendra Modi and other political leaders, and reflects public opinion on the next Prime Minister. The data was cleaned using Python's re module and Natural Language Processing (NLP) techniques. Each entry carries one of three sentiment labels: 1 for positive, 0 for neutral, and -1 for negative.
Columns
- Twitter.csv: This dataset comprises two columns. The first column contains the cleaned tweets, and the second column indicates the sentiment label for each tweet. Approximately 163,000 tweets are included.
- Reddit.csv: This dataset also features two columns. It contains approximately 37,000 comments.
  - clean_comment: This column holds comments extracted from various subreddits, primarily opinions about Modi and other prime ministerial candidates during the 2019 Indian elections.
  - category: This column gives the sentiment label of the respective comment: -1, 0, or 1.
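The two-column layout described above can be read with Python's standard csv module. The following is a minimal sketch, assuming Reddit.csv has a header row using the column names clean_comment and category as listed. The helper name load_labelled_rows and the inline sample rows are hypothetical and stand in for the real file.

```python
import csv
import io
from collections import Counter

VALID_LABELS = {-1, 0, 1}

def load_labelled_rows(handle, text_col="clean_comment", label_col="category"):
    """Yield (text, label) pairs, skipping rows with empty text or invalid labels."""
    for row in csv.DictReader(handle):
        text = (row.get(text_col) or "").strip()
        raw = (row.get(label_col) or "").strip()
        try:
            value = float(raw)  # tolerate "1" as well as "1.0"
        except ValueError:
            continue  # missing or non-numeric label
        if text and value in VALID_LABELS:
            yield text, int(value)

# Hypothetical sample rows; the column names are assumed from the listing.
sample = io.StringIO(
    "clean_comment,category\n"
    "great speech today,1\n"
    "no opinion on this,0\n"
    "terrible policy,-1\n"
    ",1\n"
    "label missing,\n"
)
rows = list(load_labelled_rows(sample))
counts = Counter(label for _, label in rows)
print(len(rows), dict(counts))
```

To run this against the actual file, the StringIO sample could be replaced with a handle from open("Reddit.csv", newline="", encoding="utf-8"), assuming the file is UTF-8 encoded. Checking the label counts first is a cheap way to spot class imbalance before training anything.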
Distribution
The dataset is provided in CSV format and consists of two primary files: Twitter.csv, which includes around 163,000 tweets, and Reddit.csv, which contains approximately 37,000 comments. The Reddit_Data.csv file, for example, is 6.89 MB in size. Each file is structured with two columns: one for the cleaned text (tweets or comments) and the other for its corresponding sentiment label.
Usage
This dataset is suited to sentiment analysis, especially when examining content from multiple social media sources. It can be used for academic projects, research into public opinion dynamics during elections, and developing or testing Natural Language Processing (NLP) models for sentiment classification.
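As one illustration of the sentiment classification use case, here is a minimal sketch of a bag-of-words multinomial Naive Bayes baseline using only the standard library. This is a generic technique chosen for illustration, not the method used by the dataset's authors. The class name NaiveBayesSentiment, the whitespace tokenisation, the add-one smoothing, and the toy training rows are all assumptions.

```python
import math
from collections import Counter, defaultdict

class NaiveBayesSentiment:
    """Multinomial Naive Bayes over whitespace tokens with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)       # documents per label
        self.word_counts = defaultdict(Counter)   # token counts per label
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        self.total_docs = len(labels)
        return self

    def predict(self, text):
        tokens = text.lower().split()
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # log prior plus the sum of smoothed log likelihoods
            score = math.log(n_docs / self.total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Toy rows for illustration only; real training would use the CSV contents.
train_texts = [
    "good leader great work",
    "great rally good speech",
    "bad policy terrible result",
    "terrible speech bad work",
    "election held today",
    "results announced today",
]
train_labels = [1, 1, -1, -1, 0, 0]

model = NaiveBayesSentiment().fit(train_texts, train_labels)
predictions = [model.predict(t) for t in ("great work", "terrible policy", "results today")]
print(predictions)  # [1, -1, 0]
```

A baseline like this mainly gives a reference score. With roughly 163,000 tweets and 37,000 comments available, a held-out test split would be needed to measure accuracy in a meaningful way.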
Coverage
The dataset's scope is primarily centred on the 2019 General Elections held in India. It includes tweets and comments made about Narendra Modi and other prominent leaders, capturing the sentiments and opinions of people towards the candidates for the Prime Minister role during that specific historical period.
License
CC BY-NC-SA 4.0
Who Can Use It
This dataset is particularly suitable for students and researchers undertaking university projects on social media sentiment analysis. Data scientists, political analysts, and developers working on NLP models for public opinion tracking, classification tasks, or social media research may also find it useful.
Dataset Name Suggestions
- Indian Election 2019 Social Media Sentiment
- Twitter Reddit Election Sentiment Analysis Data
- Modi 2019 Election Sentiment Dataset
- Multi-Source Social Media Election Opinion
- India General Election Tweets & Comments
Attributes
Original Data Source: Twitter Reddit Election Sentiment Analysis Data