Global Monkeypox Twitter Data
Social Media and Posts
Free
About
This dataset contains tweets regarding Monkeypox, collected specifically for data analysis and Natural Language Processing (NLP) tasks. It offers a valuable resource for researchers and developers aiming to understand public discourse, track sentiment, or build algorithms capable of classifying Monkeypox-related tweets [1, 2].
Columns
The dataset comprises eight columns:
- date: The date when the tweet was posted [1].
- time: The time of the tweet [1].
- id: The unique identifier of the Twitter account that posted the tweet [1].
- tweet: The actual text content of the tweet related to Monkeypox [1].
- language: The language in which the tweet was written [1].
- replies_count: The total number of replies a tweet received [1].
- retweets_count: The total number of retweets for a particular tweet [1].
- likes_count: The total number of likes a tweet garnered [1].
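As a minimal sketch of how the column list above might be checked when loading the file, the following uses only the Python standard library. The function name `read_tweets`, the sample rows, the date and time formats, and the choice to treat blank counts as zero are all assumptions made for illustration; the source does not specify them.

```python
import csv
import io

# Column names as listed in this description.
EXPECTED_COLUMNS = [
    "date", "time", "id", "tweet", "language",
    "replies_count", "retweets_count", "likes_count",
]
COUNT_COLUMNS = ("replies_count", "retweets_count", "likes_count")


def read_tweets(handle):
    """Read tweet rows from an open CSV file handle, checking the header."""
    reader = csv.DictReader(handle)
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for row in reader:
        for col in COUNT_COLUMNS:
            # Assumption: a blank count means zero engagement.
            row[col] = int(row[col] or 0)
        rows.append(row)
    return rows


# Tiny made-up sample standing in for the real file.
SAMPLE = io.StringIO(
    "date,time,id,tweet,language,replies_count,retweets_count,likes_count\n"
    "2022-08-19,10:15:00,101,Monkeypox cases are rising,en,2,5,11\n"
    "2022-08-19,10:20:00,102,Stay safe everyone,en,0,1,\n"
)
sample_rows = read_tweets(SAMPLE)
```

With the real file, the same function could be called on a handle opened with `open(path, newline="", encoding="utf-8")`, where `path` is wherever the CSV was saved.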
Distribution
The dataset is provided as a CSV data file [1]. The file size is 2.01 MB [3]. It contains approximately 10,000 tweets [4-9].
Usage
This dataset suits a range of applications, including:
- Data analysis on social media trends related to health topics [1].
- Natural Language Processing (NLP) projects, such as sentiment analysis or topic modelling [1].
- Developing machine learning algorithms to predict or classify tweets about Monkeypox [2].
- Understanding public perception and discussion surrounding the Monkeypox disease [1].
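A common first exploratory step before sentiment analysis or topic modelling is a simple term-frequency count over the tweet text. The sketch below is one hedged way to do that with the standard library; the function name `top_terms`, the short stop-word list, and the sample tweets are illustrative assumptions, not part of the dataset.

```python
import re
from collections import Counter

# Small illustrative stop-word list; a real project would use a fuller one.
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "to", "and", "in", "about"}


def top_terms(tweets, n=3):
    """Return the n most common lowercase word tokens across the tweets."""
    counts = Counter()
    for text in tweets:
        # Drop URLs first so their fragments are not counted as words.
        text = re.sub(r"https?://\S+", " ", text.lower())
        tokens = re.findall(r"[a-z']+", text)
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(n)


# Made-up tweets standing in for the 'tweet' column.
sample_tweets = [
    "Monkeypox vaccine rollout starts today https://example.com/x",
    "Is the monkeypox vaccine safe?",
    "Monkeypox cases are rising",
]
print(top_terms(sample_tweets, n=2))
```

The tokeniser here only matches ASCII letters, so it is suited to the English-language majority of the tweets; other languages would need a different pattern.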
Coverage
According to the 'date' column, the tweets are primarily from 19th August 2022 [4]. However, time-related metadata within the dataset indicates activity spanning 3rd to 4th September 2022 [5], so the two sources do not agree on the collection period. The tweets are predominantly in English (88%), with Filipino at 2% and other languages making up the remaining 10%, suggesting a global scope with a strong focus on English-language discourse [7]. The source does not specify geographic or demographic coverage [10].
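The coverage figures can be re-derived from the 'language' and 'date' columns. The following is a rough sketch under stated assumptions: the function names are hypothetical, dates are assumed to be ISO-style `YYYY-MM-DD` strings, and the language codes `en` and `tl` in the sample are guesses at how English and Filipino might be encoded.

```python
from collections import Counter


def language_shares(rows):
    """Return {language: percentage of rows}, rounded to one decimal place."""
    counts = Counter(row["language"] for row in rows)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {lang: round(100 * n / total, 1) for lang, n in counts.items()}


def date_range(rows):
    """Return (earliest, latest) values of the 'date' column.

    Assumes ISO-style YYYY-MM-DD strings, which sort correctly as text.
    """
    dates = [row["date"] for row in rows]
    return min(dates), max(dates)


# Made-up rows; only the two columns used here are included.
sample_rows = [
    {"date": "2022-08-19", "language": "en"},
    {"date": "2022-08-19", "language": "en"},
    {"date": "2022-08-19", "language": "en"},
    {"date": "2022-08-19", "language": "tl"},
]
shares = language_shares(sample_rows)
first_day, last_day = date_range(sample_rows)
```

Running the same two functions over the full file would show whether the 'date' column really is concentrated on a single day and how the language split compares with the percentages quoted above.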
License
CC0: Public Domain
Who Can Use It
This dataset is particularly useful for:
- Data scientists and analysts engaged in social media research [1].
- Natural Language Processing practitioners and students learning NLP concepts and building text classification models [2].
- Public health researchers interested in digital epidemiology and understanding disease-related discussions online [1].
- Anyone planning to create an algorithm to predict whether a tweet is about Monkeypox [2].
Dataset Name Suggestions
- Monkeypox Tweets for NLP
- Global Monkeypox Twitter Data
- Social Media Discourse on Monkeypox
- Monkeypox Public Health Tweet Analysis
- Twitter Scrape: Monkeypox Discussions
Attributes
Original Data Source: Global Monkeypox Twitter Data