Opendatabay APP

COVID-19 Twitter Engagement Data

Data Science and Analytics

Tags and Keywords

Data

Analytics

Visualization

Exploratory

Analysis

NLP

Cleaning


Free

About

This dataset captures Twitter engagement metrics for tweets about coronavirus disease (COVID-19), an infectious disease caused by the SARS-CoV-2 virus [1]. It contains the tweets themselves, including their text content, the accounts that posted them, any hashtags used, and the geographical locations associated with those accounts [1]. It is useful for studying public discourse, information dissemination, and engagement patterns on Twitter concerning COVID-19, including how people described experiencing mild to moderate symptoms and recovering, or requiring medical attention [1].

Columns

  • Datetime: Represents the exact date and time a tweet was posted [2].
  • Tweet Id: A unique identifier assigned to each tweet [2].
  • Text: The actual content of the tweet [2].
  • Username: The display name of the tweet author [2].
  • Permalink: The direct link to the tweet on Twitter [2].
  • User: A link to the author's Twitter account [2].
  • Outlinks: Any external links included within the tweet [2].
  • CountLinks: The number of links present in the tweet [2].
  • ReplyCount: The total number of replies to that specific tweet [2].
  • RetweetCount: The total number of retweets of that specific tweet [2].
  • DateTime Count: A daily count of tweets, aggregated by date ranges [2].
  • Label Count: A count associated with specific ranges of tweet IDs or other engagement metrics, indicating the distribution of tweets within those ranges [3-5].
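As a rough illustration of how the per-tweet columns above might be read, here is a minimal Python sketch using only the standard library. It assumes the CSV header names match the column names listed above exactly and that Datetime values are ISO 8601 strings; neither is confirmed by the listing. The `Tweet` class, the `parse_tweets` helper, and the sample rows are hypothetical and exist only for this example.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Tweet:
    """One row of the dataset, limited to a few of the listed columns."""
    posted_at: datetime
    tweet_id: str
    text: str
    username: str
    reply_count: int
    retweet_count: int


def parse_tweets(fileobj):
    """Read tweets from an open CSV file object into Tweet records."""
    tweets = []
    for row in csv.DictReader(fileobj):
        tweets.append(
            Tweet(
                # Assumption: ISO 8601 timestamps such as 2020-01-23T10:15:00+00:00.
                posted_at=datetime.fromisoformat(row["Datetime"]),
                # Kept as a string so long IDs are never rounded or reformatted.
                tweet_id=row["Tweet Id"],
                text=row["Text"],
                username=row["Username"],
                # Blank counts are treated as zero.
                reply_count=int(row["ReplyCount"] or 0),
                retweet_count=int(row["RetweetCount"] or 0),
            )
        )
    return tweets


# Made-up sample rows for illustration only; not taken from the dataset.
SAMPLE_CSV = """Datetime,Tweet Id,Text,Username,ReplyCount,RetweetCount
2020-01-23T10:15:00+00:00,1220000000000000001,Example tweet one,user_a,2,5
2020-01-24T08:00:00+00:00,1220000000000000002,Example tweet two,user_b,,0
"""

sample_tweets = parse_tweets(io.StringIO(SAMPLE_CSV))
```

With a real download, the same function could be called on `open(path, newline="", encoding="utf-8")`, adjusting the header names and timestamp parsing if the file differs from these assumptions.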

Distribution

The dataset covers the period from 10 January 2020 to 28 February 2020 and is summarised by daily tweet counts [2, 6, 7]. It contains approximately 179,040 tweets in this timeframe, a figure derived from the sum of the daily counts and the tweet ID counts [2, 3, 6-11]. Tweet activity shows distinct peaks: a notable increase in late January (e.g., 6,091 tweets between 23-24 January 2020) [2], and a much larger surge in late February, reaching 47,643 tweets between 26-27 February 2020, followed by 42,289 and 44,824 on subsequent days [7, 10, 11]. Engagement metrics such as replies and retweets are heavily skewed: a substantial majority of tweets (over 152,500 records) fall within the lowest engagement ranges (e.g., 0-43 or 0-1628.96), while very few tweets show very high engagement (e.g., only 1 record between 79819.04-81448.00) [4, 5]. The data file would typically be in CSV format [12].
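The daily counts and engagement ranges described above can be reproduced in spirit with two small aggregations. The sketch below is an illustration, not the method used to build the listing: `daily_counts` and `bucket_counts` are hypothetical helper names, the bucket width is a free parameter, and the input values are made up.

```python
from collections import Counter
from datetime import datetime


def daily_counts(timestamps):
    """Count tweets per calendar date."""
    return Counter(ts.date() for ts in timestamps)


def bucket_counts(values, width):
    """Count values in equal-width buckets, keyed by each bucket's lower bound."""
    return Counter((v // width) * width for v in values)


# Made-up values for illustration only; not taken from the dataset.
timestamps = [
    datetime(2020, 1, 23, 9, 0),
    datetime(2020, 1, 23, 17, 30),
    datetime(2020, 2, 26, 12, 0),
]
retweets = [0, 3, 12, 50, 1700]

per_day = daily_counts(timestamps)
per_bucket = bucket_counts(retweets, width=43)

for day, count in sorted(per_day.items()):
    print(day.isoformat(), count)
for lower, count in sorted(per_bucket.items()):
    print(f"{lower}-{lower + 43}: {count}")
```

On real data, a skew like the one reported above would show up as one very large count in the lowest bucket and a long tail of nearly empty buckets.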

Usage

This dataset is ideal for:
  • Data Science and Analytics projects focused on social media [1].
  • Visualization of tweet trends and engagement over time.
  • Exploratory data analysis to uncover patterns in COVID-19 related discussions [1].
  • Natural Language Processing (NLP) tasks, such as sentiment analysis or topic modelling on tweet content [1].
  • Data cleaning and preparation exercises for social media data [1].
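For the data cleaning and NLP uses listed above, a common first step is to normalise the tweet text before sentiment analysis or topic modelling. The following is a minimal sketch under stated assumptions: the regular expressions are simplified approximations of URLs, mentions, and hashtags, the function names are hypothetical, and the sample tweet is invented.

```python
import re

# Simplified patterns; real tweets may need more careful handling.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#(\w+)")


def extract_hashtags(text):
    """Return lower-cased hashtags, ignoring any '#' inside URLs."""
    without_urls = URL_RE.sub(" ", text)
    return [tag.lower() for tag in HASHTAG_RE.findall(without_urls)]


def clean_tweet_text(text):
    """Strip URLs and mentions, keep hashtag words, collapse whitespace, lower-case."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", "")
    return " ".join(text.split()).lower()


# Invented example tweet; not taken from the dataset.
sample = "Stay safe! #COVID19 update from @who https://example.com/info  #coronavirus"

hashtags = extract_hashtags(sample)
cleaned = clean_tweet_text(sample)
```

Keeping the hashtag words in the cleaned text, while also extracting them separately, lets the same tweet feed both a text model and a hashtag frequency count.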

Coverage

The dataset has a global geographic scope [13]. It covers tweet data from 10 January 2020 to 28 February 2020 [2, 6, 7]. The content is specific to the Coronavirus disease (COVID-19) [1].

License

CC0

Who Can Use It

This dataset is particularly useful for:
  • Data scientists and analysts interested in social media trends and public health discourse [1].
  • Researchers studying information spread and public sentiment during health crises.
  • Developers building AI and LLM data solutions [13].
  • Individuals interested in exploratory analysis and data visualization of real-world social media data [1].

Dataset Name Suggestions

  • COVID-19 Twitter Engagement Data
  • SARS-CoV-2 Tweet Activity Log
  • Pandemic Social Media Discourse
  • Coronavirus Tweets Analytics
  • Global COVID-19 Tweet Metrics

Attributes

Original Data Source: Covid_19 Tweets Dataset

Listing Stats

  • Views: 4
  • Downloads: 0
  • Listed: 27/06/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
