Twitter Bot Classification Dataset

Social Media and Posts

Tags and Keywords

Bot, Twitter, Detection, Social, Accounts

Free

About

This dataset supports the analysis and detection of automated (bot) accounts on Twitter. Each record combines user profile attributes, tweet content, and interaction metrics with a binary bot label, so the data can be used for bot detection research and for studying patterns in social media interactions.

Columns

  • User ID: A unique identifier for each user within the dataset.
  • Username: The associated username for each user.
  • Tweet: The textual content of the tweet.
  • Retweet Count: The number of times a tweet has been retweeted.
  • Mention Count: The number of mentions included in the tweet.
  • Follower Count: The number of followers a user possesses.
  • Verified: A boolean value indicating whether the user account is verified.
  • Bot Label: A binary label (1 for bot, 0 for non-bot) indicating the user's classification.
  • Location: The geographical location associated with the user.
  • Created At: The date and time when the tweet was created.
  • Hashtags: The hashtags incorporated into the tweet.
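The column list above can be turned into typed records with a few lines of Python. The following is a minimal sketch using only the standard library. The header names, the "True"/"False" spelling of Verified, the "YYYY-MM-DD HH:MM:SS" timestamp format, and the space-separated hashtags are all assumptions modelled on the column descriptions, not confirmed properties of the real file, and the two sample rows are invented for illustration.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample: header names and value formats are assumptions
# modelled on the column list, not copied from the real file.
SAMPLE_CSV = """User ID,Username,Tweet,Retweet Count,Mention Count,Follower Count,Verified,Bot Label,Location,Created At,Hashtags
1001,alice,Hello world,3,1,250,False,0,Leeds,2021-03-14 09:30:00,greeting
1002,promo_bot,Win a prize now,40,5,12,False,1,Unknown,2021-03-15 22:05:00,win prize
"""


def parse_row(raw):
    """Convert one csv.DictReader row of strings into typed values."""
    return {
        "user_id": int(raw["User ID"]),
        "username": raw["Username"],
        "tweet": raw["Tweet"],
        "retweet_count": int(raw["Retweet Count"]),
        "mention_count": int(raw["Mention Count"]),
        "follower_count": int(raw["Follower Count"]),
        # Assumed encoding of the boolean column.
        "verified": raw["Verified"].strip().lower() == "true",
        "bot_label": int(raw["Bot Label"]),
        "location": raw["Location"],
        # Assumed timestamp format.
        "created_at": datetime.strptime(raw["Created At"], "%Y-%m-%d %H:%M:%S"),
        # Assumed to be space-separated within the cell.
        "hashtags": raw["Hashtags"].split(),
    }


def load_rows(fileobj):
    """Read every record from an open CSV file object."""
    return [parse_row(raw) for raw in csv.DictReader(fileobj)]


rows = load_rows(io.StringIO(SAMPLE_CSV))
```

For the downloaded file, the same function could be called on a file opened with open("bot_detection_dataset.csv", newline="", encoding="utf-8"), adjusting the header names and formats to whatever the file actually contains.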

Distribution

The dataset is distributed as a single CSV file named 'bot_detection_dataset.csv', containing user profiles and their associated tweet data. The number of rows is not specified; the file is described as version 2 and is 7.47 MB in size. Each record carries a binary label indicating whether the user is a bot.
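Because the row count is not stated, a first step after download is usually to count the records and check how the two classes are balanced. The sketch below does this with the standard library on a small invented sample. The header name "Bot Label" and the 0/1 encoding follow the column description but are assumptions about the actual file.

```python
import csv
import io
from collections import Counter

# Invented sample; only the columns needed for the count are included.
SAMPLE_CSV = """User ID,Bot Label
1,0
2,1
3,1
4,0
5,1
"""


def label_summary(fileobj, label_column="Bot Label"):
    """Return (row_count, Counter of label values) for a CSV file object."""
    counts = Counter(int(row[label_column]) for row in csv.DictReader(fileobj))
    return sum(counts.values()), counts


n_rows, counts = label_summary(io.StringIO(SAMPLE_CSV))
bot_share = counts[1] / n_rows  # fraction of records labelled as bots
```

A strongly imbalanced result would suggest reporting precision and recall rather than accuracy alone when evaluating a classifier.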

Usage

The dataset suits bot detection research, the analysis and identification of bot accounts on Twitter, and the training and evaluation of machine learning models for binary classification, that is, predicting whether a user is a bot. It can also be used to explore user profiles and tweet content for patterns in social media interactions.
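To illustrate the binary classification use case, the sketch below scores a deliberately simple rule against the bot label using precision, recall, and F1. The rule (unverified account with fewer than 100 followers) and the five records are hypothetical and chosen only to make the arithmetic easy to follow. They are not findings about this dataset, and a real study would use a trained model and a held-out test split.

```python
def rule_predict(record, max_followers=100):
    """Toy rule: flag unverified accounts with few followers as bots."""
    return int(not record["verified"] and record["follower_count"] < max_followers)


def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for the positive (bot = 1) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Invented records using field names derived from the column list.
records = [
    {"follower_count": 12, "verified": False, "bot_label": 1},
    {"follower_count": 5000, "verified": True, "bot_label": 0},
    {"follower_count": 40, "verified": False, "bot_label": 0},
    {"follower_count": 900, "verified": False, "bot_label": 1},
    {"follower_count": 30, "verified": False, "bot_label": 1},
]

y_true = [r["bot_label"] for r in records]
y_pred = [rule_predict(r) for r in records]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

A baseline of this kind gives a reference point against which models such as logistic regression or random forests can be compared.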

Coverage

The dataset includes a location for each user and the creation date and time of each tweet, which allows temporal and geographical analysis. It focuses on user profiles and tweet data relevant to bot identification on Twitter. No demographic breakdown or per-year availability is documented beyond the general user profile attributes.
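The temporal and geographical fields can be explored by grouping records and computing the share of bot-labelled rows in each group. The sketch below groups by calendar month and by location. The field names, the timestamp format, and the four records are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime


def bot_rate_by(records, key_func):
    """Return {group key: fraction of records in that group labelled bot}."""
    totals = defaultdict(int)
    bots = defaultdict(int)
    for record in records:
        key = key_func(record)
        totals[key] += 1
        bots[key] += record["bot_label"]
    return {key: bots[key] / totals[key] for key in totals}


def month_key(record):
    """Reduce an assumed 'YYYY-MM-DD HH:MM:SS' timestamp to 'YYYY-MM'."""
    ts = datetime.strptime(record["created_at"], "%Y-%m-%d %H:%M:%S")
    return ts.strftime("%Y-%m")


# Invented records for illustration only.
records = [
    {"created_at": "2021-03-14 09:30:00", "location": "Leeds", "bot_label": 0},
    {"created_at": "2021-03-15 22:05:00", "location": "Leeds", "bot_label": 1},
    {"created_at": "2021-04-01 08:00:00", "location": "Austin", "bot_label": 1},
    {"created_at": "2021-04-02 10:00:00", "location": "Austin", "bot_label": 1},
]

by_month = bot_rate_by(records, month_key)
by_location = bot_rate_by(records, lambda r: r["location"])
```

Free-text location values would likely need normalising before grouping, since the same place may be spelled in several ways.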

License

CC0: Public Domain

Who Can Use It

  • Researchers: To study Twitter bots, conduct bot detection research, and understand social media interactions.
  • Data Scientists and Machine Learning Practitioners: For training and evaluating various machine learning algorithms (such as Logistic Regression, Random Forest, Gradient Boosting, Support Vector Machines, and Neural Networks) for bot detection.
  • Developers: To build and enhance tools for identifying automated accounts on social media platforms.
  • Social Media Analysts: For gaining insights into the prevalence and characteristics of bot accounts.

Dataset Name Suggestions

  • Twitter Bot Detector Data
  • Social Media Bot Analysis Dataset
  • Automated Account Detection Data
  • Twitter Bot Classification Dataset
  • Social Bot Identification Set

Listing Stats

VIEWS

0

DOWNLOADS

0

LISTED

08/09/2025

REGION

GLOBAL

UDQS QUALITY

5 / 5

VERSION

1.0
