Opendatabay APP

Public Sentiment on Vaccines

Public Health & Epidemiology

Tags and Keywords

Pfizer, BioNTech, Vaccine, Tweets, Health

Free

About

This dataset contains recent tweets about the Pfizer and BioNTech vaccine, collected using the tweepy Python package to access the Twitter API. Its primary purpose is to facilitate studies on the subjects of these tweets and enable various Natural Language Processing (NLP) tasks related to public discourse on the vaccine.

Columns

  • id: Unique identifier for each tweet.
  • user_name: The Twitter user's display name.
  • user_location: The geographical location specified by the user.
  • user_description: The biographical description provided by the user.
  • user_created: The date when the user's account was created.
  • user_followers: The number of followers the user has.
  • user_friends: The number of accounts the user is following.
  • user_favourites: The total number of tweets the user has marked as favourite.
  • user_verified: A boolean indicating if the user's account is verified.
  • date: The date when the tweet was posted.
  • text: The actual content of the tweet.
  • hashtags: A list of hashtags included in the tweet.
  • source: The application or platform used to post the tweet (e.g., Twitter for iPhone).
  • retweets: The number of times the tweet has been retweeted.
  • favorites: The number of times the tweet has been liked/favourited.
  • is_retweet: A boolean indicating if the tweet is a retweet (all tweets in this dataset are original, i.e., false).
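The column descriptions above can be turned into typed values when reading the file. The following is a minimal sketch using only the Python standard library. It assumes, without confirmation from the listing, that timestamps look like "2020-12-12 11:55:28", that booleans are stored as the strings "True" and "False", and that hashtags are stored as a Python-style list literal; parse_row, load_tweets and SAMPLE are hypothetical names introduced for illustration, and only a subset of the 16 columns is converted.

```python
import ast
import csv
import io
from datetime import datetime

# Assumed timestamp layout; adjust if the real file differs.
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_row(row):
    """Convert one csv.DictReader row (all strings) into typed values."""
    hashtags_raw = row.get("hashtags", "")
    return {
        "id": row["id"],
        "date": datetime.strptime(row["date"], DATE_FORMAT),
        "text": row["text"],
        # Assumed encoding: a list literal such as "['a', 'b']", empty if missing.
        "hashtags": ast.literal_eval(hashtags_raw) if hashtags_raw else [],
        "retweets": int(row["retweets"]),
        "favorites": int(row["favorites"]),
        "user_verified": row["user_verified"] == "True",
        "is_retweet": row["is_retweet"] == "True",
    }


def load_tweets(handle):
    """Read every record from an open CSV handle and return typed dicts."""
    return [parse_row(row) for row in csv.DictReader(handle)]


# Invented sample rows that only mimic the documented column names.
SAMPLE = """id,date,text,hashtags,retweets,favorites,user_verified,is_retweet
1,2020-12-12 11:55:28,First dose done,"['PfizerBioNTech', 'vaccine']",0,2,False,False
2,2020-12-13 09:10:00,Booked my appointment,,1,0,True,False
"""

tweets = load_tweets(io.StringIO(SAMPLE))
```

For the real file, the same function could be called with open("vaccination_tweets.csv", newline="", encoding="utf-8"), where the UTF-8 encoding is also an assumption.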

Distribution

The dataset is provided as a single CSV file, vaccination_tweets.csv (4.54 MB), comprising 16 columns and approximately 11,000 records (rows). A sample file will be added to the platform separately.
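After downloading, the stated shape can be checked quickly. The sketch below is one possible way to do it with the standard library: it compares the header against the 16 column names listed under Columns and counts the records. The helper name check_shape is hypothetical, and the demonstration data is invented.

```python
import csv
import io

# Column names as given in the Columns section of this listing.
EXPECTED_COLUMNS = [
    "id", "user_name", "user_location", "user_description",
    "user_created", "user_followers", "user_friends", "user_favourites",
    "user_verified", "date", "text", "hashtags",
    "source", "retweets", "favorites", "is_retweet",
]


def check_shape(handle):
    """Return the column count, record count, and any expected columns missing."""
    reader = csv.reader(handle)
    header = next(reader)
    # Iterating the reader yields records, so quoted multi-line tweets count once.
    row_count = sum(1 for _ in reader)
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    return {"columns": len(header), "rows": row_count, "missing": missing}


# Invented two-record file used only to demonstrate the helper.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(EXPECTED_COLUMNS)
writer.writerow(["x"] * len(EXPECTED_COLUMNS))
writer.writerow(["line one\nline two"] * len(EXPECTED_COLUMNS))
buffer.seek(0)

report = check_shape(buffer)
```

On the real file, the report would be expected to show 16 columns and roughly 11,000 rows if the download matches this description.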

Usage

This dataset is ideal for:
  • Analysing public sentiment and opinions regarding the Pfizer and BioNTech vaccine.
  • Tracking and understanding the discourse surrounding vaccine adoption and public health initiatives.
  • Performing various NLP tasks, such as topic modelling, sentiment analysis, and entity recognition on social media data related to vaccines.
  • Academic research into social media trends during public health crises.
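As an illustration of the sentiment analysis use case, here is a deliberately simple lexicon-based labeller. It is a toy sketch, not a recommended model: the POSITIVE and NEGATIVE word lists are invented for this example, the function names are hypothetical, and real work would more likely use a trained classifier or an established sentiment lexicon.

```python
import re
from collections import Counter

# Tiny illustrative word lists; these are assumptions, not part of the dataset.
POSITIVE = {"safe", "effective", "grateful", "relieved", "happy"}
NEGATIVE = {"unsafe", "scared", "pain", "worried", "angry"}

TOKEN_RE = re.compile(r"[a-z']+")


def sentiment_label(text):
    """Label a tweet by counting positive and negative lexicon hits."""
    tokens = TOKEN_RE.findall(text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def label_counts(texts):
    """Count how many texts fall under each sentiment label."""
    return Counter(sentiment_label(t) for t in texts)


# Invented example tweets.
examples = [
    "Got my first dose, feeling grateful and relieved!",
    "Arm pain and I am worried",
    "Second dose booked for Monday",
]
counts = label_counts(examples)
```

The same label_counts function could be applied to the text column once the CSV has been loaded.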

Coverage

The tweets span 12 December 2020 to 24 November 2021, and the accounts that posted them were created between 15 July 2006 and 19 November 2021. Coverage is global, although a notable share of user locations is either unspecified or concentrated in regions such as Malaysia. Approximately 21% of user_location values and 23% of hashtags values are missing.
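The coverage figures above can be recomputed from the file. The sketch below shows one way to derive the date range and the share of missing values per column from rows read with csv.DictReader. It assumes that missing values appear as empty strings and that timestamps follow the "%Y-%m-%d %H:%M:%S" layout; both are assumptions, and missing_share and date_range are hypothetical helper names.

```python
from datetime import datetime

# Assumed timestamp layout; adjust if the real file differs.
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def missing_share(rows, column):
    """Fraction of rows whose value in `column` is absent or blank."""
    if not rows:
        return 0.0
    empty = sum(1 for row in rows if not (row.get(column) or "").strip())
    return empty / len(rows)


def date_range(rows, column="date"):
    """Earliest and latest timestamp found in `column`."""
    stamps = [
        datetime.strptime(row[column], DATE_FORMAT)
        for row in rows
        if row.get(column)
    ]
    return min(stamps), max(stamps)


# Invented rows shaped like csv.DictReader output.
rows = [
    {"date": "2020-12-12 11:55:28", "user_location": "", "hashtags": "['vaccine']"},
    {"date": "2021-03-01 10:00:00", "user_location": "Kuala Lumpur", "hashtags": ""},
    {"date": "2021-11-24 08:00:00", "user_location": "London", "hashtags": "['Pfizer']"},
    {"date": "2021-06-15 17:30:00", "user_location": "Toronto", "hashtags": "['health']"},
]

first, last = date_range(rows)
location_gap = missing_share(rows, "user_location")
hashtag_gap = missing_share(rows, "hashtags")
```

Run against the full file, the two shares would be expected to land near 0.21 and 0.23 if the assumptions about how missing values are encoded hold.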

License

CC0: Public Domain

Who Can Use It

  • Data Scientists: For building and testing NLP models, sentiment analysis, and social media analytics.
  • Public Health Researchers: To understand public perception and concerns about vaccine programmes.
  • Social Media Analysts: To identify trends, key influencers, and emerging topics related to vaccine discussions.
  • Students and Academics: For research projects and learning purposes in areas such as public health, data science, and computational linguistics.

Dataset Name Suggestions

  • Pfizer BioNTech Vaccine Tweets
  • Social Media Vaccine Discourse
  • COVID-19 Vaccine Twitter Insights
  • Global Vaccine Tweet Analysis
  • Public Sentiment on Vaccines

Attributes

Original Data Source: Public Sentiment on Vaccines

Listing Stats

  • Views: 0
  • Downloads: 0
  • Listed: 08/07/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
