Blockchain Tweet Analysis Dataset
Crypto & Blockchain Transactions
Free
About
This dataset comprises tweets collected using the Twitter API, all featuring the #Blockchain hashtag. The collection process involves running a daily query for this high-frequency hashtag to gather a large sample of tweets. It is designed to enable in-depth analysis of topics associated with blockchain technology, geographical distribution of discussions, sentiment evaluation, and trend identification. Data collection commenced on 1 January 2021.
Columns
- user_name: The display name of the Twitter user.
- name: This appears to be synonymous with the user's display name.
- user_location: The location provided by the user in their profile. Note that a significant portion of this data is not available.
- user_description: The biographical text from the user's profile.
- user_created: The date and time when the user's Twitter account was created.
- user_followers: The total number of followers the user has.
- user_friends: The total number of users the user is following.
- user_favourites: The total number of tweets the user has liked or favourited.
- user_verified: A boolean indicator signifying whether the user's account is verified by Twitter.
- date: The timestamp indicating when the tweet was posted.
- text: The full content of the tweet.
- hashtags: A list of additional hashtags included within the tweet. This column has a notable number of missing values.
- source: The client application or platform from which the tweet was published (e.g., Twitter for Android, Twitter Web App).
- is_retweet: A boolean value indicating if the tweet is a retweet. All sampled tweets are original posts rather than retweets.
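The column descriptions above suggest a natural typing for each field. The following is a minimal loading sketch under several assumptions that the listing does not confirm: that counts are written as plain numbers, booleans as the text "True"/"False", timestamps in an ISO-like form, and hashtags as a Python-style list literal. The helper names parse_row and load_tweets are hypothetical, and the sample row is made up so the sketch runs without the real file.

```python
import ast
import csv
import io
from datetime import datetime

INT_COLUMNS = ("user_followers", "user_friends", "user_favourites")
BOOL_COLUMNS = ("user_verified", "is_retweet")
DATE_COLUMNS = ("user_created", "date")


def parse_row(row):
    """Convert one raw CSV row (all strings) into typed Python values."""
    out = dict(row)
    for col in INT_COLUMNS:
        # Assumed: counts may be written as "12" or "12.0"; empty means missing.
        out[col] = int(float(row[col])) if row.get(col) else None
    for col in BOOL_COLUMNS:
        # Assumed: booleans are serialised as the text "True" / "False".
        out[col] = (row.get(col) or "").strip().lower() == "true"
    for col in DATE_COLUMNS:
        # Assumed: ISO-like timestamps such as "2021-11-13 08:15:00".
        out[col] = datetime.fromisoformat(row[col]) if row.get(col) else None
    # Assumed: hashtags are stored as a list literal, e.g. "['Blockchain']".
    raw_tags = row.get("hashtags") or ""
    out["hashtags"] = ast.literal_eval(raw_tags) if raw_tags else []
    return out


def load_tweets(fileobj):
    """Read typed tweet records from an open CSV file object."""
    return [parse_row(row) for row in csv.DictReader(fileobj)]


# Made-up single-row sample (not taken from the dataset).
SAMPLE_CSV = io.StringIO(
    "user_name,user_location,user_followers,user_friends,user_favourites,"
    "user_verified,user_created,date,text,hashtags,source,is_retweet\n"
    'alice,,120,80,15,False,2019-05-01 09:00:00,2021-11-13 08:15:00,'
    '"Learning about #Blockchain","[\'Blockchain\']",Twitter Web App,False\n'
)
tweets = load_tweets(SAMPLE_CSV)
print(tweets[0]["date"], tweets[0]["hashtags"])
```

For the real file, the same function could be called on open("blockchain_tweets.csv", newline="", encoding="utf-8"), assuming the file is UTF-8 encoded. If the actual serialisation differs from the assumptions above, the per-column parsing would need adjusting.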
Distribution
The dataset is provided in CSV format as blockchain_tweets.csv, with a file size of 4.65 MB. It contains 13 columns with 10,000 records, providing a robust sample for analysis.
Usage
This dataset is ideal for:
- Investigating specific subjects and themes that utilise the #Blockchain hashtag.
- Analysing the geographical spread of blockchain-related discussions on Twitter.
- Assessing public sentiment towards blockchain technology and related topics.
- Identifying emerging trends and patterns within the blockchain discourse.
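Two of the uses listed above, theme discovery and trend spotting, can be approximated with simple counts. The sketch below assumes each tweet is a dictionary with a "hashtags" list and a "date" datetime; the function names top_hashtags and tweets_per_day are hypothetical and the sample records are invented for illustration.

```python
from collections import Counter
from datetime import datetime


def top_hashtags(tweets, n=10, exclude=("blockchain",)):
    """Count hashtags that co-occur with #Blockchain, ignoring case."""
    counts = Counter(
        tag.lower()
        for tweet in tweets
        for tag in tweet["hashtags"]
        if tag.lower() not in exclude
    )
    return counts.most_common(n)


def tweets_per_day(tweets):
    """Count tweets per calendar day as a rough activity trend."""
    return Counter(tweet["date"].date() for tweet in tweets)


# Invented records, shaped like typed rows of the dataset.
sample = [
    {"date": datetime(2021, 11, 13, 8, 15), "hashtags": ["Blockchain", "NFT"]},
    {"date": datetime(2021, 11, 13, 21, 40), "hashtags": ["blockchain", "nft", "Crypto"]},
    {"date": datetime(2021, 11, 14, 7, 5), "hashtags": []},
]

print(top_hashtags(sample))
print(tweets_per_day(sample))
```

The search hashtag itself is excluded by default because every tweet in the dataset carries it, so it would dominate the counts without adding information. Sentiment scoring is left out here since it would depend on a model or lexicon that the listing does not specify.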
Coverage
Data collection began on 1 January 2021. This file is a sample of tweets posted between 13 and 15 November 2021. User location is included, but 64% of records have no location value. Among the records that do, Dhaka, Bangladesh, is the most frequently occurring location. The data reflects public Twitter activity related to the #Blockchain hashtag.
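The coverage figures above (share of missing locations, most frequent location, date range) could be re-derived along the following lines. This is a sketch only: it assumes typed records with a "user_location" string and a "date" datetime, treats blank or whitespace-only locations as missing, and uses a hypothetical function name, coverage_summary. The sample records are invented and do not reproduce the dataset's actual proportions.

```python
from collections import Counter
from datetime import datetime


def coverage_summary(tweets):
    """Summarise location completeness and the time span of a tweet sample."""
    if not tweets:
        raise ValueError("coverage_summary needs at least one tweet")
    # Assumed: a missing location appears as an empty or whitespace-only string.
    locations = [(t.get("user_location") or "").strip() for t in tweets]
    known = [loc for loc in locations if loc]
    dates = [t["date"] for t in tweets if t.get("date")]
    return {
        "missing_location_share": (len(locations) - len(known)) / len(locations),
        "top_location": Counter(known).most_common(1)[0] if known else None,
        "first_tweet": min(dates, default=None),
        "last_tweet": max(dates, default=None),
    }


# Invented records for illustration only.
sample = [
    {"user_location": "", "date": datetime(2021, 11, 13, 8, 15)},
    {"user_location": "Dhaka, Bangladesh", "date": datetime(2021, 11, 14, 12, 0)},
    {"user_location": "  ", "date": datetime(2021, 11, 14, 18, 30)},
    {"user_location": "Dhaka, Bangladesh", "date": datetime(2021, 11, 15, 23, 59)},
]

summary = coverage_summary(sample)
print(summary)
```

Because user_location is free text from profiles, exact-match counting treats variants such as "Dhaka" and "Dhaka, Bangladesh" as different places; any serious geographic analysis would need a normalisation step first.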
License
CC0: Public Domain
Who Can Use It
- Researchers and Academics: To conduct studies on social media discourse, public opinion, and the evolution of technology-related discussions.
- Market Analysts: To monitor and understand public engagement, brand perception, and market trends within the blockchain and cryptocurrency sectors.
- Data Scientists: For developing models for natural language processing, sentiment analysis, user behaviour prediction, and trend forecasting.
- Policy Makers: To gain insights into public perception and the social impact of emerging technologies.
- Journalists: To gather background information and identify key narratives surrounding blockchain.
Dataset Name Suggestions
- Blockchain Twitter Data 2021
- #Blockchain Tweet Analysis Dataset
- Social Media Blockchain Trends
- Twitter Blockchain Engagement
- Daily Blockchain Hashtag Data
Attributes
Original Data Source: Blockchain Tweet Analysis Dataset