Opendatabay APP

Social Media Account Type Detection Dataset

Social Media and Posts

Tags and Keywords

  • Bot
  • Twitter
  • AI
  • Detection
  • Account

Free

About

This collection of Twitter User Accounts supports applied research dedicated to detecting and consequently preventing the spread of misinformation across social media platforms. Given the exponential growth of content, Artificial Intelligence plays a crucial role in enabling platforms to automatically notify or restrict access to accounts deemed suspicious. The necessity of this capability is underscored by historical events, such as Twitter’s removal of more than 26 thousand suspicious accounts in 2019. This resource provides labels essential for distinguishing between automated and human activity.

Columns

  • id: The specific Twitter ID assigned to the user account, recorded as an integer.
  • account_type: An indicator of the category of the account, detailing whether it is classified as a 'bot' or 'human', stored as a string.

Distribution

The data is provided as a single CSV file, twitter_human_bots_dataset.csv, of approximately 658.69 kB. It has two columns and about 37.4 thousand valid records, each corresponding to a Twitter user account. Human accounts make up 67% of the records and bot accounts the remaining 33%. The dataset is expected to be updated annually.
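As a rough illustration of the layout described above, the sketch below parses a two-column CSV with the header `id,account_type` and reports the share of each label. It assumes the header names match the column names listed in this page; the helper names `load_accounts` and `class_shares` are hypothetical, and the three sample rows are made-up stand-ins, not records from the dataset.

```python
import csv
import io
from collections import Counter


def load_accounts(handle):
    """Parse an open CSV handle into a list of (id, account_type) tuples."""
    reader = csv.DictReader(handle)
    # Assumption: the header row is exactly "id,account_type".
    return [(int(row["id"]), row["account_type"]) for row in reader]


def class_shares(rows):
    """Return each label's share of the rows as a fraction between 0 and 1."""
    counts = Counter(label for _, label in rows)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# Made-up stand-in for the real file; to read the download instead, use
# open("twitter_human_bots_dataset.csv", newline="") as the handle.
sample = io.StringIO("id,account_type\n11,human\n12,human\n13,bot\n")
rows = load_accounts(sample)
shares = class_shares(rows)
print(shares)
```

Run against the real file, the two shares should land near the 67% and 33% figures quoted above if the listing's description is accurate.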

Usage

This data is suited to training and evaluating supervised machine learning models for account classification and bot detection. It is useful for researchers and developers building automated support tools that help social media platforms address intrusion and the spread of inappropriate content. It can also support feature engineering studies that aim to identify the characteristics that best distinguish human from automated behaviour.
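One possible starting point for evaluation is sketched below: a stratified train/test split that keeps the human-to-bot ratio similar in both parts, plus the accuracy of a majority-class baseline that any trained model should beat. Since the file holds only an ID and a label, model features would presumably have to be joined from another source. The function names, the 20% test fraction, and the synthetic rows (built to mirror the 67/33 split) are all assumptions for illustration.

```python
import random
from collections import Counter, defaultdict


def stratified_split(rows, test_fraction=0.2, seed=0):
    """Split (id, label) rows so each label keeps a similar share in both parts."""
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[1]].append(row)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for label in sorted(by_label):
        group = by_label[label][:]
        rng.shuffle(group)
        cut = round(len(group) * test_fraction)
        test.extend(group[:cut])
        train.extend(group[cut:])
    return train, test


def majority_baseline_accuracy(train, test):
    """Accuracy on the test rows of always predicting the most common training label."""
    majority = Counter(label for _, label in train).most_common(1)[0][0]
    return sum(1 for _, label in test if label == majority) / len(test)


# Synthetic rows mirroring the 67% human / 33% bot ratio; not real accounts.
rows = [(i, "human") for i in range(67)] + [(100 + i, "bot") for i in range(33)]
train, test = stratified_split(rows)
baseline = majority_baseline_accuracy(train, test)
print(len(train), len(test), baseline)
```

With a class balance like this one, plain accuracy can look respectable for a model that never predicts "bot", so per-class metrics such as precision and recall are likely more informative.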

Coverage

The scope covers various Twitter user accounts and their assigned classifications. While the specific time range and geographic details of the data collection are not explicitly outlined, the context is centered on global social media networks. The classification focuses exclusively on the account type (bot or human).

License

CC0: Public Domain

Who Can Use It

Intended users include data scientists developing advanced classification models, AI researchers investigating network integrity, machine learning engineers creating tools for platform moderation, and social network analysts studying deceptive online practices.

Dataset Name Suggestions

  • Twitter Bot Account Detection Data
  • Social Media Account Categorisation
  • Human vs Bot Twitter Labels
  • Suspicious User Account Data

Attributes

Listing Stats

  • Views: 1
  • Downloads: 0
  • Listed: 07/10/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0