COVID-19 Vaccine Twitter Conversation Data
Patient Health Records & Digital Health
Free
About
A continually updated collection of trending Twitter conversations centred around the COVID-19 vaccine. The data provides social network activity scraped using the Twitter API and a Python script, specifically targeting the #covid vaccine hashtag. It captures public discourse regarding the pandemic and vaccination efforts, offering material for social listening, trend analysis, and social media dynamics studies. Collection of this social activity started on 22 October 2020.
Columns
The product file contains 36 columns detailing various attributes of the collected tweets. Key fields include:
- id: The unique identifier for the tweet.
- conversation_id: The identifier linking related tweets within a specific discussion.
- created_at, date, and time: Timestamp information showing when the tweet was posted.
- user_id, username, name: Identifiers and display names of the user who published the post.
- tweet: The primary text content of the social media post.
- language: The detected language of the tweet content, with English being highly prevalent (96%).
- mentions: A list of user handles included in the post.
- urls: Any external links or URLs embedded within the post.
- replies_count, retweets_count, likes_count: Metrics quantifying user engagement and interaction with the tweet.
- hashtags and cashtags: Supplementary tagging information.
- video and photos: Indicators for attached multimedia content.
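As a rough illustration of how the fields listed above might be read, the sketch below parses a few rows with Python's standard csv module and sums the three engagement counters per tweet. The column names come from the list above; the sample rows are invented for the example, and the handling of blank counters (treated as zero) is an assumption about the file, not something the listing states.

```python
import csv
import io

# Engagement columns named in the field list above.
ENGAGEMENT_FIELDS = ("replies_count", "retweets_count", "likes_count")


def total_engagement(row):
    """Sum replies, retweets and likes for one row, treating blanks as zero."""
    total = 0
    for field in ENGAGEMENT_FIELDS:
        value = (row.get(field) or "").strip()
        # int(float(...)) also accepts values written as "5.0" (assumption).
        total += int(float(value)) if value else 0
    return total


def read_tweets(handle):
    """Yield one dict per tweet from an open CSV file handle."""
    yield from csv.DictReader(handle)


# Invented sample rows; the real file has 36 columns.
SAMPLE = """id,username,tweet,language,replies_count,retweets_count,likes_count
1,alice,First vaccine trial results are out,en,2,5,10
2,bob,Vaccine news today,en,0,,3
"""

rows = list(read_tweets(io.StringIO(SAMPLE)))
totals = [total_engagement(row) for row in rows]
english_only = [row for row in rows if row["language"] == "en"]
```

With the real file, the same functions could be applied to a handle opened with `open(path, newline="", encoding="utf-8")`, where `path` is wherever the CSV was saved.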
Distribution
The raw data is available in a single CSV file of approximately 111.9 MB. It contains more than 2 lakh (roughly 210,000) collected tweets, each described by 36 attributes. Collection was originally run daily; future updates are expected annually.
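A file of this size can be checked without loading it fully into memory. The sketch below streams a CSV and reports the column and row counts so they can be compared with the figures quoted above. The function name `summarise_csv` and the sample data are hypothetical; only the expected count of 36 columns comes from this listing.

```python
import csv
import io


def summarise_csv(handle, expected_columns=36):
    """Stream a CSV and report its shape without holding rows in memory."""
    reader = csv.reader(handle)
    header = next(reader, None)
    if header is None:
        # Empty file: nothing to count.
        return {"columns": 0, "rows": 0, "matches_expected": False}
    row_count = sum(1 for _ in reader)
    return {
        "columns": len(header),
        "rows": row_count,
        "matches_expected": len(header) == expected_columns,
    }


# Invented sample; the quoted tweet spans two lines, which the csv
# module still counts as a single record.
SAMPLE = 'id,tweet,language\n1,"line one\nline two",en\n2,hello,en\n'

summary = summarise_csv(io.StringIO(SAMPLE), expected_columns=3)
```

Tweet text can contain line breaks, so counting raw lines would overstate the number of records; opening the real file with `newline=""` lets the csv module handle quoted line breaks correctly.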
Usage
This resource suits several kinds of analysis:
- Evaluating public sentiment towards the COVID-19 vaccine and related policies.
- Identifying emerging subjects and trending discussion topics linked to the main hashtag.
- Tracking temporal trends in social media activity surrounding vaccine development.
- Researching online communities and health-related communication strategies.
- Developing tools for geopolitical conversation analysis, provided techniques account for the scarcity of explicit geolocation data.
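Two of the uses listed above, tracking temporal trends and accounting for scarce geolocation data, can be sketched with the standard library alone. The example assumes the `date` column holds ISO-style dates (YYYY-MM-DD), so that sorting the strings gives chronological order, and that missing values appear as blank cells; neither is confirmed by the listing. The sample rows and function names are invented for illustration.

```python
import csv
import io
from collections import Counter


def tweets_per_day(rows, date_field="date"):
    """Count tweets per calendar day, skipping rows with a blank date."""
    counts = Counter((row.get(date_field) or "").strip() for row in rows)
    counts.pop("", None)
    # Lexicographic sort is chronological only for ISO dates (assumption).
    return dict(sorted(counts.items()))


def missing_share(rows, field):
    """Fraction of rows where `field` is absent or blank (0.0 for no rows)."""
    if not rows:
        return 0.0
    blank = sum(1 for row in rows if not (row.get(field) or "").strip())
    return blank / len(rows)


# Invented sample rows using column names from this listing.
SAMPLE = """id,date,place,tweet
1,2020-10-22,,vaccine trial update
2,2020-10-22,,another update
3,2020-10-23,Delhi,local news
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
daily = tweets_per_day(rows)
place_missing = missing_share(rows, "place")
```

Running `missing_share` on a location field before any location-based analysis gives a quick sense of how much of the data such an analysis could actually use.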
Coverage
The data captures tweets over a period ranging from 12 February 2020 up to 22 October 2020, which was the start date of continuous collection. The majority of the content is in English. While geolocation fields like place, geo, and near are largely missing, the data frequently indicates a single dominant timezone (530, potentially India Standard Time, UTC+05:30), which accounts for almost all included tweets.
License
CC0: Public Domain
Who Can Use It
- Social Scientists: To analyse discourse patterns and emotional valence (sentiment) in public health crises.
- Public Health Agencies: To monitor public reactions and identify potential areas of concern regarding vaccination campaigns.
- Technology Developers: For building and testing Natural Language Processing (NLP) models focused on identifying specific trends or user behaviours on social platforms.
- Academic Researchers: To study digital sociology, health informatics, and internet phenomena.
Dataset Name Suggestions
- COVID-19 Vaccine Twitter Conversation Data
- Social Media Discourse on Vaccine (Oct 2020)
- Global Vaccine Trending Tweets
- Online Health Dialogue Data Set
Attributes
Original Data Source: COVID-19 Vaccine Twitter Conversation Data
