Twitter Account Profiling Dataset

Social Media and Networking

Twitter Account Profiling Dataset on the Opendatabay data marketplace

Free

About

This dataset profiles Twitter user accounts and labels each one as human or bot, alongside the account's profile settings and activity counts. It is intended for analyzing user behavior and account traits, and for classifying accounts as bots or humans. Researchers and developers can use it for social media analysis, bot detection, and machine learning.

Dataset Features

  • TH_ID: A unique identifier for each row in the dataset.
  • created_at: The date and time when the account was created.
  • default_profile: Indicates whether the account uses Twitter's default profile settings ('TRUE'/'FALSE').
  • default_profile_image: Indicates whether the account uses the default Twitter profile image ('TRUE'/'FALSE').
  • description: The bio or description of the account as written by the user.
  • favourites_count: Total number of tweets favorited by the user.
  • followers_count: Total number of followers the user has.
  • friends_count: Total number of accounts the user follows.
  • geo_enabled: Indicates whether the user has enabled geolocation for tweets ('TRUE'/'FALSE').
  • id: Unique numeric identifier for the account.
  • lang: Primary language of the user's tweets.
  • profile_background_image_url: URL of the account's profile background image.
  • profile_image_url: URL of the account's profile image.
  • screen_name: The username or handle of the account.
  • statuses_count: Total number of tweets posted by the user.
  • verified: Indicates whether the account is verified by Twitter ('TRUE'/'FALSE').
  • average_tweets_per_day: Average number of tweets the account posts per day.
  • account_age_days: The age of the account in days.
  • account_type: Classification of the account as human or bot.
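
The 'TRUE'/'FALSE' flags and the count columns listed above arrive as plain text when the file is read as CSV. The following is a minimal sketch of typed parsing using only the Python standard library. The helper name `parse_row` is hypothetical, the sample row is made up for illustration, and treating `account_age_days` as an integer and `average_tweets_per_day` as a float is an assumption. `created_at` is left as a string because the listing does not state its format.

```python
import csv
import io

# Columns the listing documents as 'TRUE'/'FALSE' flags.
BOOL_COLUMNS = ("default_profile", "default_profile_image", "geo_enabled", "verified")
# Columns assumed to hold whole numbers.
INT_COLUMNS = ("favourites_count", "followers_count", "friends_count",
               "statuses_count", "account_age_days")


def parse_row(row):
    """Convert one CSV row (a dict of strings) into typed Python values."""
    typed = dict(row)
    for name in BOOL_COLUMNS:
        typed[name] = row[name].strip().upper() == "TRUE"
    for name in INT_COLUMNS:
        typed[name] = int(row[name])
    # Assumption: the per-day average is stored as a decimal number.
    typed["average_tweets_per_day"] = float(row["average_tweets_per_day"])
    return typed


# Made-up sample row (not taken from the dataset), limited to a few columns.
SAMPLE = (
    "screen_name,default_profile,default_profile_image,geo_enabled,verified,"
    "favourites_count,followers_count,friends_count,statuses_count,"
    "average_tweets_per_day,account_age_days,account_type\n"
    "example_user,FALSE,FALSE,TRUE,FALSE,120,350,200,1500,1.5,1000,human\n"
)

rows = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
print(rows[0]["geo_enabled"], rows[0]["followers_count"])
```

Columns not named in `BOOL_COLUMNS` or `INT_COLUMNS`, such as `description` and `screen_name`, pass through unchanged as strings.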

Distribution

  • Data Volume: 25,877 rows, 19 columns.
  • Format: Tabular, with data types including dates, Booleans, integers, and strings.
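
A downloaded copy can be checked against the stated shape before use. The sketch below assumes the file is a CSV whose header uses exactly the 19 feature names listed on this page; the function name `check_shape` is hypothetical, and the demo inputs are made up.

```python
import csv
import io

# The 19 column names as listed under "Dataset Features".
EXPECTED_COLUMNS = [
    "TH_ID", "created_at", "default_profile", "default_profile_image",
    "description", "favourites_count", "followers_count", "friends_count",
    "geo_enabled", "id", "lang", "profile_background_image_url",
    "profile_image_url", "screen_name", "statuses_count", "verified",
    "average_tweets_per_day", "account_age_days", "account_type",
]
EXPECTED_ROWS = 25877  # row count stated in the listing


def check_shape(handle, expected_rows=EXPECTED_ROWS):
    """Return a list of problems; an empty list means the shape matches."""
    reader = csv.reader(handle)
    header = next(reader)
    problems = []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        problems.append(f"missing columns: {missing}")
    if len(header) != len(EXPECTED_COLUMNS):
        problems.append(f"expected {len(EXPECTED_COLUMNS)} columns, found {len(header)}")
    n_rows = sum(1 for _ in reader)
    if n_rows != expected_rows:
        problems.append(f"expected {expected_rows} rows, found {n_rows}")
    return problems


# Demo on in-memory stand-ins: a full header with one empty row, and a bad file.
good = io.StringIO(",".join(EXPECTED_COLUMNS) + "\n" + "," * 18 + "\n")
bad = io.StringIO("id,lang\n1,en\n")
good_problems = check_shape(good, expected_rows=1)
bad_problems = check_shape(bad, expected_rows=1)
print(good_problems, len(bad_problems))
```

For a real file, pass a handle opened with `newline=""` and an explicit encoding, and leave `expected_rows` at its default.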

Usage

Typical applications include:
  • Bot Detection: Train machine learning models to classify human and bot accounts.
  • User Behavior Analysis: Analyze characteristics and activities of different account types.
  • Social Media Studies: Research account verification, activity levels, and user engagement.
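
As a starting point for the behavior analysis described above, accounts can be grouped by `account_type` and compared on a few features. This is a minimal sketch, not a recommended methodology: the function name `summarize_by_type` is hypothetical, the four records are made up, and the label values "human" and "bot" are assumed from the column description.

```python
from collections import defaultdict
from statistics import median


def summarize_by_type(accounts):
    """Group typed account records by account_type and compute simple stats."""
    groups = defaultdict(list)
    for account in accounts:
        groups[account["account_type"]].append(account)
    summary = {}
    for label, members in groups.items():
        summary[label] = {
            "count": len(members),
            "median_followers": median(a["followers_count"] for a in members),
            # Share of accounts in the group still using the default image.
            "default_image_share": sum(a["default_profile_image"] for a in members)
            / len(members),
        }
    return summary


# Made-up records for illustration only (not taken from the dataset).
accounts = [
    {"account_type": "human", "followers_count": 300, "default_profile_image": False},
    {"account_type": "human", "followers_count": 500, "default_profile_image": False},
    {"account_type": "bot", "followers_count": 10, "default_profile_image": True},
    {"account_type": "bot", "followers_count": 30, "default_profile_image": False},
]

summary = summarize_by_type(accounts)
for label, stats in summary.items():
    print(label, stats)
```

The median is used rather than the mean because follower counts on social platforms tend to be heavily skewed by a small number of very large accounts.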

Coverage

  • Geographic Coverage: Global.
  • Time Range: Includes accounts created between 2009 and 2017.
  • Demographics: Covers diverse account types, including public figures, photographers, and bots.

License

CC0 (Public Domain)

Who Can Use It

  • Data Scientists: For training machine learning models to detect bots.
  • Researchers: For studying social media user behaviors and trends.
  • Businesses: For understanding engagement patterns and automating bot identification.

Listing Stats

  • Views: 10
  • Downloads: 2
  • Listed: 22/01/2025
  • Region: Global
  • UDQSS Quality: 5 / 5
  • Version: 1.0