Public Sentiment on Vaccines
Public Health & Epidemiology
Free
About
This dataset contains recent tweets about the Pfizer and BioNTech vaccine, collected using the tweepy Python package to access the Twitter API. Its primary purpose is to support studies of the topics these tweets discuss and to enable various Natural Language Processing (NLP) tasks related to public discourse on the vaccine.
Columns
- id: Unique identifier for each tweet.
- user_name: The Twitter user's display name.
- user_location: The geographical location specified by the user.
- user_description: The biographical description provided by the user.
- user_created: The date when the user's account was created.
- user_followers: The number of followers the user has.
- user_friends: The number of accounts the user is following.
- user_favourites: The total number of tweets the user has marked as favourite.
- user_verified: A boolean indicating if the user's account is verified.
- date: The date when the tweet was posted.
- text: The actual content of the tweet.
- hashtags: A list of hashtags included in the tweet.
- source: The application or platform used to post the tweet (e.g., Twitter for iPhone).
- retweets: The number of times the tweet has been retweeted.
- favorites: The number of times the tweet has been liked/favourited.
- is_retweet: A boolean indicating if the tweet is a retweet (all tweets in this dataset are original, i.e., false).
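The column list above can be turned into a small typed loader. The following is a minimal sketch using only the Python standard library. It assumes that counts are stored as plain integers, that booleans are written as the text "True" or "False", and that hashtags are stored as a Python-style list literal; none of these encodings is confirmed by this listing. The helper names parse_row and load_tweets, and the in-memory sample row, are hypothetical.

```python
import ast
import csv
import io

INT_COLUMNS = ("user_followers", "user_friends", "user_favourites",
               "retweets", "favorites")
BOOL_COLUMNS = ("user_verified", "is_retweet")


def parse_row(row):
    """Convert one raw CSV row (all strings) into typed Python values."""
    parsed = dict(row)
    for name in INT_COLUMNS:
        # Assumption: counts are plain integers; blank cells become None.
        parsed[name] = int(row[name]) if row.get(name) else None
    for name in BOOL_COLUMNS:
        # Assumption: booleans are serialised as the text "True" / "False".
        parsed[name] = (row.get(name) or "").strip().lower() == "true"
    # Assumption: hashtags are a list literal such as "['a', 'b']";
    # a blank cell is treated as "no hashtags".
    raw_tags = row.get("hashtags") or ""
    parsed["hashtags"] = ast.literal_eval(raw_tags) if raw_tags else []
    return parsed


def load_tweets(handle):
    """Read every row from an open CSV file handle."""
    return [parse_row(row) for row in csv.DictReader(handle)]


# Made-up in-memory sample standing in for the real file (subset of columns).
sample = io.StringIO(
    "id,user_followers,user_friends,user_favourites,retweets,favorites,"
    "user_verified,is_retweet,hashtags\n"
    "1,10,5,,0,2,True,False,\"['PfizerBioNTech']\"\n"
)
tweets = load_tweets(sample)
print(tweets[0]["hashtags"])
```

To read the real file, the same function could be given a handle opened with open("vaccination_tweets.csv", newline="", encoding="utf-8"), provided the assumptions above hold for the actual data.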
Distribution
The dataset is provided as a single CSV file, vaccination_tweets.csv, with a file size of 4.54 MB. It comprises 16 columns and approximately 11,000 records (rows). A sample file will be added to the platform separately.
Usage
This dataset is ideal for:
- Analysing public sentiment and opinions regarding the Pfizer and BioNTech vaccine.
- Tracking and understanding the discourse surrounding vaccine adoption and public health initiatives.
- Performing various NLP tasks, such as topic modelling, sentiment analysis, and entity recognition on social media data related to vaccines.
- Academic research into social media trends during public health crises.
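As a rough illustration of the sentiment analysis use case listed above, the sketch below scores the text column against a tiny word list. The lexicon, the function names, and the example sentences are all hypothetical and chosen only for illustration; a real study would more likely use a validated lexicon or a trained model.

```python
import re

# Hypothetical toy lexicon, for illustration only.
POSITIVE = {"safe", "effective", "grateful", "relieved", "thankful"}
NEGATIVE = {"scared", "worried", "pain", "sick", "unsafe"}


def tokenize(text):
    """Lower-case the tweet and keep alphabetic tokens only."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment_score(text):
    """Count positive-lexicon matches minus negative-lexicon matches."""
    tokens = tokenize(text)
    positives = sum(token in POSITIVE for token in tokens)
    negatives = sum(token in NEGATIVE for token in tokens)
    return positives - negatives


def label(text):
    """Map the raw score to a coarse three-way label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Made-up example sentences, not taken from the dataset.
for example in (
    "Got my first dose today, feeling grateful and relieved!",
    "Arm pain and a bit worried",
    "Second dose booked for Monday",
):
    print(label(example), "<-", example)
```

A word-count approach like this ignores negation and sarcasm, so it is best read as a baseline to compare against stronger models.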
Coverage
The tweets in this dataset span 12 December 2020 to 24 November 2021. The user accounts that posted them were created between 15 July 2006 and 19 November 2021. Although the dataset is global in scope, a notable portion of user locations is either unspecified or concentrated in regions such as Malaysia. Approximately 21% of user locations and 23% of hashtags are missing.
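The coverage figures above can be re-checked after loading the file. The sketch below is one way to do that with the standard library. It assumes each row is a dictionary of strings (as produced by csv.DictReader) and that the date column uses the format "YYYY-MM-DD HH:MM:SS", which this listing does not confirm. The helper names and sample rows are hypothetical.

```python
from datetime import datetime

# Assumption: timestamps look like "2020-12-12 11:55:28".
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def missing_share(rows, column):
    """Fraction of rows whose value for `column` is absent or blank."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if not (row.get(column) or "").strip())
    return missing / len(rows)


def date_range(rows, column="date"):
    """Earliest and latest timestamp found in `column`, skipping blanks."""
    stamps = [
        datetime.strptime(row[column], DATE_FORMAT)
        for row in rows
        if (row.get(column) or "").strip()
    ]
    return (min(stamps), max(stamps)) if stamps else (None, None)


# Made-up rows standing in for the real data.
rows = [
    {"date": "2020-12-12 11:55:28", "user_location": "Kuala Lumpur", "hashtags": ""},
    {"date": "2021-03-01 09:30:00", "user_location": "", "hashtags": "['vaccine']"},
    {"date": "2021-11-24 08:00:00", "user_location": "London", "hashtags": "['Pfizer']"},
    {"date": "2021-06-15 17:45:10", "user_location": "Toronto", "hashtags": "['COVID19']"},
]
print(missing_share(rows, "user_location"))
print(date_range(rows))
```

On the full file, missing_share for user_location and hashtags would be expected to land near the 21% and 23% quoted above if blank cells are what the listing counts as missing.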
License
CC0: Public Domain
Who Can Use It
- Data Scientists: For building and testing NLP models, sentiment analysis, and social media analytics.
- Public Health Researchers: To understand public perception and concerns about vaccine programmes.
- Social Media Analysts: To identify trends, key influencers, and emerging topics related to vaccine discussions.
- Students and Academics: For research projects and learning purposes in areas such as public health, data science, and computational linguistics.
Dataset Name Suggestions
- Pfizer BioNTech Vaccine Tweets
- Social Media Vaccine Discourse
- COVID-19 Vaccine Twitter Insights
- Global Vaccine Tweet Analysis
- Public Sentiment on Vaccines
Attributes
Original Data Source: Public Sentiment on Vaccines