Turkish Tweet Sentiment Analysis Dataset
Data Science and Analytics
Free
About
This dataset comprises over 11,000 tweets, primarily in Turkish, curated for sentiment analysis and cyberbullying detection in social media. Each tweet is pre-labelled with either a positive or a negative sentiment, which makes the dataset suitable for training and evaluating machine learning models. It was created for a project focused on identifying cyberbullying and can serve as a resource for similar research and development efforts.
Columns
- Tip: The sentiment label for each tweet, either 'Positive' or 'Negative'. ("Tip" is Turkish for "type".)
- Paylaşım: The full text of the tweet. ("Paylaşım" is Turkish for "post".)
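As a rough illustration of the two-column layout, the sketch below reads (label, text) pairs with Python's standard csv module. It assumes a comma-delimited, UTF-8 file with a header row naming the columns exactly 'Tip' and 'Paylaşım'. The sample rows and the file name `turkish_tweets.csv` are invented for illustration and are not taken from the dataset.

```python
import csv
import io

# Hypothetical two-row sample standing in for the real file.
# The tweet texts here are invented for illustration only.
SAMPLE_CSV = (
    "Tip,Paylaşım\n"
    "Positive,Bugün hava çok güzel\n"
    "Negative,Bu hiç hoş değil\n"
)


def read_tweets(handle):
    """Return (label, text) pairs from a CSV with 'Tip' and 'Paylaşım' columns."""
    reader = csv.DictReader(handle)
    return [(row["Tip"].strip(), row["Paylaşım"].strip()) for row in reader]


rows = read_tweets(io.StringIO(SAMPLE_CSV))
print(rows)

# For a file on disk (the path is an assumption, not the real file name):
# with open("turkish_tweets.csv", encoding="utf-8", newline="") as f:
#     rows = read_tweets(f)
```

If the actual file uses a different delimiter or encoding, the `csv.DictReader` and `open` arguments would need adjusting accordingly.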
Distribution
The dataset is a flat collection of individual social media posts, one per row. It contains 11,006 unique entries, of which approximately 55% are labelled positive and 45% negative. The data is typically provided in CSV format; a sample file will be made available separately on the platform.
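To make the stated split concrete, the sketch below converts the approximate percentages into row counts and shows one way to recompute the label shares from a loaded label column. The helper name `label_shares` is hypothetical, and the counts are only as precise as the rounded 55% / 45% figures.

```python
from collections import Counter


def label_shares(labels):
    """Return each label's share of the total as a fraction between 0 and 1."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Approximate class sizes implied by the stated 55% / 45% split.
TOTAL = 11006
approx_positive = round(TOTAL * 0.55)
approx_negative = TOTAL - approx_positive
print(approx_positive, approx_negative)  # prints: 6053 4953
```

Since the classes are only mildly imbalanced, plain accuracy is a usable first metric, though per-class precision and recall are still worth reporting for cyberbullying detection.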
Usage
This dataset is particularly well-suited for applications such as:
- Developing and testing algorithms for social media sentiment analysis.
- Building models for the detection and classification of online cyberbullying.
- Research in Natural Language Processing (NLP) and Deep Learning, especially concerning Turkish text.
- General data science and analytics projects requiring labelled social media data.
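For work on Turkish text specifically, one preprocessing detail is worth sketching: Turkish distinguishes dotted and dotless "i", and Python's default `str.lower()` maps "I" to "i" rather than "ı". The minimal example below applies Turkish casing before lowercasing and then splits a tweet into word tokens. The function names and the choice to drop URLs and @-mentions are assumptions made for illustration, not part of the dataset's documentation.

```python
import re

# Turkish casing differs from the default Unicode mapping:
# dotted capital İ lowercases to i, and dotless capital I lowercases to ı.
_TR_LOWER = str.maketrans({"İ": "i", "I": "ı"})

_URL = re.compile(r"https?://\S+")
_MENTION = re.compile(r"@\w+")
_TOKEN = re.compile(r"\w+")


def turkish_lower(text):
    """Lowercase text using Turkish rules for I/ı and İ/i."""
    return text.translate(_TR_LOWER).lower()


def tokenize_tweet(text):
    """Strip URLs and @-mentions, then return lowercased word tokens."""
    text = _URL.sub(" ", text)
    text = _MENTION.sub(" ", text)
    return _TOKEN.findall(turkish_lower(text))


print(turkish_lower("IŞIK İYİ"))
print(tokenize_tweet("@ali Bugün HAVA çok güzel! https://t.co/x"))
```

The resulting tokens could feed a bag-of-words baseline or a subword tokenizer. Whether mentions and URLs should be removed depends on the task, since they may carry signal for cyberbullying detection.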
Coverage
The dataset covers social media content written primarily in Turkish. No geographic scope or time range is specified for the tweets' origin or collection period, so its coverage is best described as Turkish-speaking online communities in general.
License
CC0
Who Can Use It
This dataset is designed for use by:
- Data Scientists: For developing and refining sentiment analysis and classification models.
- Machine Learning Engineers: To train and test deep learning models on text data.
- NLP Researchers: For studies on linguistic patterns, sentiment, and cyberbullying detection in Turkish.
- Academics and Students: For educational projects, research, and thesis work related to social media analysis and AI.
- Organisations: Looking to implement social media monitoring or content moderation systems.
Dataset Name Suggestions
- Turkish Tweet Sentiment Analysis Dataset
- Social Media Cyberbullying Tweets (Turkish)
- Turkish Sentiment Labelled Tweets
- Turkish Social Media Sentiment Dataset
Attributes
Original Data Source: Türkçe Sosyal Medya Paylaşımı Veri Seti (Turkish Social Media Post Dataset)