Opendatabay APP

Turkish Tweet Sentiment Analysis Dataset

Data Science and Analytics

Tags and Keywords

Software

NLP

Deep

Artificial

Twitter

Sentiment

Turkish

Cyberbullying

Free

About

This dataset comprises over 11,000 tweets, primarily in Turkish, curated for sentiment analysis and for detecting cyberbullying on social media. Each tweet is pre-labelled as either positive or negative, so the data can be used directly to train and evaluate supervised machine learning models. The dataset was created for a project focused on identifying cyberbullying, and it is a useful resource for similar research and development work.

Columns

  • Tip (Turkish for "type"): The sentiment label for each tweet, either 'Positive' or 'Negative'.
  • Paylaşım (Turkish for "post"): The full text of the tweet.
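
As a minimal sketch of how the two columns might be read, the snippet below uses Python's standard `csv` module and checks that both column names are present before returning (label, text) pairs. It assumes the CSV header uses exactly the names `Tip` and `Paylaşım` and that the file is UTF-8 encoded; neither is confirmed by the listing. The two sample rows are made up for illustration and are not taken from the dataset.

```python
import csv
import io

# Column names as given in the listing; the header of the real file may differ.
LABEL_COLUMN = "Tip"
TEXT_COLUMN = "Paylaşım"


def read_tweets(handle):
    """Return a list of (label, text) pairs from an open CSV file handle."""
    reader = csv.DictReader(handle)
    missing = {LABEL_COLUMN, TEXT_COLUMN} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing expected columns: {sorted(missing)}")
    return [(row[LABEL_COLUMN].strip(), row[TEXT_COLUMN].strip()) for row in reader]


# Two made-up rows standing in for the real file, for illustration only.
sample = io.StringIO(
    "Tip,Paylaşım\n"
    "Positive,Bugün hava çok güzel\n"
    "Negative,Bu hiç hoş değil\n"
)
tweets = read_tweets(sample)
print(tweets[0])
```

For the downloaded file, the same function could be called on a handle opened with `open(path, encoding="utf-8", newline="")`, where `path` is wherever the CSV was saved.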

Distribution

The dataset is structured as a collection of individual social media posts. It contains 11,006 unique entries, of which approximately 55% are labelled positive and 45% negative. The data is distributed as a CSV file, and a sample file will be made available separately on the platform.
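
To make the stated split concrete, the sketch below works out the approximate label counts implied by 55% and 45% of 11,006 entries, and defines a small helper that computes label shares from any list of labels. The counts are derived from the rounded percentages in this listing, so the exact figures in the file may differ slightly; the helper name `label_shares` is hypothetical.

```python
from collections import Counter


def label_shares(labels):
    """Map each label to its share of the total, as a fraction between 0 and 1."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}


# Approximate counts implied by the listing: 55% and 45% of 11,006 entries.
TOTAL_ENTRIES = 11_006
approx_positive = round(TOTAL_ENTRIES * 0.55)
approx_negative = TOTAL_ENTRIES - approx_positive

# Synthetic label list built from those approximate counts, not real data.
labels = ["Positive"] * approx_positive + ["Negative"] * approx_negative
shares = label_shares(labels)
print(approx_positive, approx_negative)
print({label: round(share, 2) for label, share in shares.items()})
```

Running the same helper on the labels from the real file is a quick way to check that the download matches the distribution described here.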

Usage

This dataset is particularly well-suited for applications such as:
  • Developing and testing algorithms for social media sentiment analysis.
  • Building models for the detection and classification of online cyberbullying.
  • Research in Natural Language Processing (NLP) and Deep Learning, especially concerning Turkish text.
  • General data science and analytics projects requiring labelled social media data.
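
Most of the applications above start with text normalisation, and Turkish needs one specific adjustment: the language distinguishes dotted İ/i from dotless I/ı, whereas Python's default `str.lower()` maps "I" to "i". The sketch below shows one possible preprocessing step under that assumption. The function names and the choice to strip URLs and @mentions are illustrative and are not part of the dataset or its original project.

```python
import re

# Matches http(s) links and @mentions, which carry little sentiment signal.
URL_OR_MENTION = re.compile(r"https?://\S+|@\w+")


def turkish_lower(text):
    """Lowercase text using Turkish casing rules for I/ı and İ/i."""
    # "\u0130" is the dotted capital İ; handle both special cases before lower().
    return text.replace("\u0130", "i").replace("I", "ı").lower()


def clean_tweet(text):
    """Strip URLs and @mentions, lowercase, and collapse whitespace."""
    without_links = URL_OR_MENTION.sub(" ", text)
    return " ".join(turkish_lower(without_links).split())


print(clean_tweet("@kullanici IŞIK çok güzel https://example.com"))
```

Because the tweets are only primarily in Turkish, any non-Turkish text would also have its capital "I" mapped to "ı"; whether that matters depends on the model being trained.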

Coverage

The dataset covers social media content written primarily in Turkish. No geographic range or collection period is specified for the tweets, but the focus on Turkish-language posts makes it most relevant to Turkish-speaking online communities.

License

CC0

Who Can Use It

This dataset is designed for use by:
  • Data Scientists: For developing and refining sentiment analysis and classification models.
  • Machine Learning Engineers: To train and test deep learning models on text data.
  • NLP Researchers: For studies on linguistic patterns, sentiment, and cyberbullying detection in Turkish.
  • Academics and Students: For educational projects, research, and thesis work related to social media analysis and AI.
  • Organisations: Looking to implement social media monitoring or content moderation systems.

Dataset Name Suggestions

  • Turkish Tweet Sentiment Analysis Dataset
  • Social Media Cyberbullying Tweets (Turkish)
  • Turkish Sentiment Labelled Tweets
  • Turkish Social Media Sentiment Dataset

Attributes

Listing Stats

  • Views: 0
  • Downloads: 0
  • Listed: 17/06/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
