Daily Wordle Guess Distribution
About
This dataset captures a large collection of public tweets related to the viral Wordle game, offering insights into player behaviour and global trends. It was created to help users determine the distribution of guess attempts and evaluate how others are performing in the game. Analysts can use it to find the most common number of guess attempts, identify the most active geographical locations for playing the game, and track the platforms or sources from which users share their scores. The data was updated daily, with each update reflecting the scores for the immediately preceding Wordle ID.
Columns
The dataset includes 10 distinct columns detailing information about the tweet and the user:
- WordleID: The sequential identification number assigned to the specific daily Wordle puzzle (e.g., 254).
- ID: The unique identification number for the individual tweet.
- Created_At: The date and time the tweet was posted.
- Text: The textual content of the tweet itself.
- Source: The application or device used to post the tweet (e.g., Twitter for iPhone, Twitter for Android).
- UserID: The unique identification number associated with the user account that published the tweet.
- Username: The publicly visible username of the player who tweeted.
- User_ScreenName: The player's Twitter handle.
- Location: The geographical location associated with the tweet. Note that approximately 27% of entries lack location data.
- Truncated: A Boolean field indicating whether the tweet text was shortened upon collection.
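The column list above can be turned into a typed record for easier handling. The following is a minimal sketch using only the Python standard library. The column names come from the list above; the `WordleTweet` class, the `read_tweets` helper, and the sample rows are illustrative inventions, not part of the dataset. It assumes the Boolean column is stored as the text "True" or "False", and it leaves Created_At as raw text because the timestamp format is not documented here.

```python
import csv
import io
from dataclasses import dataclass
from typing import Iterator, Optional, TextIO

# Invented sample rows (NOT real data) that follow the documented column order.
SAMPLE_CSV = (
    "WordleID,ID,Created_At,Text,Source,UserID,Username,User_ScreenName,Location,Truncated\n"
    '254,1,2022-02-28 10:00:00,"Wordle 254 3/6\nline two",Twitter for iPhone,42,Ann,ann_plays,,False\n'
    "254,2,2022-02-28 11:30:00,Wordle 254 5/6,Twitter for Android,43,Bob,bob_b,London,True\n"
)


@dataclass
class WordleTweet:
    wordle_id: int
    tweet_id: str            # kept as text so large IDs are never rounded
    created_at: str          # raw text; the timestamp format is an unknown here
    text: str
    source: str
    user_id: str
    username: str
    screen_name: str
    location: Optional[str]  # None when the field is empty
    truncated: bool


def read_tweets(handle: TextIO) -> Iterator[WordleTweet]:
    """Yield one WordleTweet per CSV row, streaming rather than loading everything."""
    for row in csv.DictReader(handle):
        location = (row["Location"] or "").strip()
        yield WordleTweet(
            wordle_id=int(row["WordleID"]),
            tweet_id=row["ID"],
            created_at=row["Created_At"],
            text=row["Text"],
            source=row["Source"],
            user_id=row["UserID"],
            username=row["Username"],
            screen_name=row["User_ScreenName"],
            location=location or None,
            # Assumption: Booleans are serialised as "True"/"False" text.
            truncated=(row["Truncated"] or "").strip().lower() == "true",
        )


for tweet in read_tweets(io.StringIO(SAMPLE_CSV)):
    print(tweet.wordle_id, tweet.source, tweet.location, tweet.truncated)
```

For the real file, the handle would typically come from `open("WordleMegaData.csv", newline="", encoding="utf-8")`. The UTF-8 encoding is an assumption, and `newline=""` lets the csv module handle tweet text that spans several lines.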
Distribution
The data is provided in a single file, WordleMegaData.csv, which is 508.92 MB in size. It comprises 10 columns, with approximately 2.14 million valid entries across most fields; the Location field contains about 1.57 million valid entries. The data is structured for daily analysis and reflects the tweets scraped during the day each Wordle ID was played.
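The quoted figures are consistent with each other: 1.57 million out of 2.14 million is roughly 73%, which leaves about 27% of entries without a location. Because the file is around half a gigabyte, a streaming pass is one reasonable way to reproduce such counts without loading everything into memory. The sketch below is an illustration only; `count_valid_entries` and the sample rows are invented for this example, and "valid" is assumed to mean "not empty after stripping whitespace".

```python
import csv
import io
from collections import Counter
from typing import TextIO, Tuple

# Invented sample rows (NOT real data), reduced to three columns for brevity.
SAMPLE_CSV = (
    "WordleID,Text,Location\n"
    "254,Wordle 254 3/6,London\n"
    "254,Wordle 254 4/6,\n"
    "255,Wordle 255 2/6,Toronto\n"
)


def count_valid_entries(handle: TextIO) -> Tuple[int, Counter]:
    """Return (total rows, non-empty value count per column) in one streaming pass."""
    total = 0
    valid: Counter = Counter()
    for row in csv.DictReader(handle):
        total += 1
        for column, value in row.items():
            if column is None:
                continue  # surplus fields beyond the header are ignored
            if value is not None and value.strip():
                valid[column] += 1
    return total, valid


total, valid = count_valid_entries(io.StringIO(SAMPLE_CSV))
missing_share = 1 - valid["Location"] / total
print(f"{total} rows, {missing_share:.0%} without a location")
```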
Usage
Ideal applications for this data include analysing social media trends related to gaming and virality. It is highly suitable for statistical research aimed at understanding the difficulty of specific Wordle IDs based on guess attempt distributions. Analysts can also use this information to map out global player activity, determine demographic reach, or examine the preferences for social sharing platforms based on the provided source data.
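As an illustration of the difficulty analysis mentioned above, the sketch below pulls the score out of each tweet and tallies the attempts for one puzzle. It assumes the tweets use the game's standard share text, such as "Wordle 254 3/6", with "X/6" for a failed puzzle and an optional trailing "*" for hard mode; tweets in any other form are skipped. The names `SCORE_PATTERN`, `extract_score`, and `guess_distribution` are hypothetical helpers, not part of the dataset.

```python
import re
from collections import Counter
from typing import Iterable, Optional, Tuple

# Matches e.g. "Wordle 254 3/6" or "Wordle 254 X/6*" (assumed share format).
SCORE_PATTERN = re.compile(r"Wordle\s+(\d+)\s+([1-6X])/6")


def extract_score(text: str) -> Optional[Tuple[int, str]]:
    """Return (wordle_id, attempts) where attempts is '1'..'6' or 'X', else None."""
    match = SCORE_PATTERN.search(text)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


def guess_distribution(texts: Iterable[str], wordle_id: int) -> Counter:
    """Count how often each attempt value was shared for one Wordle ID."""
    counts: Counter = Counter()
    for text in texts:
        score = extract_score(text)
        if score is not None and score[0] == wordle_id:
            counts[score[1]] += 1
    return counts


# Invented example tweets (NOT real data).
texts = [
    "Wordle 254 3/6\n\n⬛🟨⬛⬛⬛",
    "Wordle 254 X/6*",
    "Wordle 254 3/6",
    "Wordle 255 4/6",
    "just chatting about Wordle",
]
print(guess_distribution(texts, 254))
```

A puzzle with a larger share of 5, 6, and X results than its neighbours could then be read as harder, subject to the usual caveat that players who share scores are a self-selected group.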
Coverage
The dataset covers 28th February 2022 through to 24th July 2022, starting with Wordle ID 254. Geographically, it reflects tweets collected globally. Location data is available, but a significant portion of the tweets (approximately 27%) lack a geographical tag. Among entries with a valid location, the United States is the most frequently noted.
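For time-series work it can help to convert between Wordle IDs and calendar dates. The sketch below anchors on the pairing stated above (Wordle ID 254 on 28th February 2022) and assumes exactly one puzzle per calendar day with no gaps. Under that assumption the final day of coverage corresponds to ID 400, which is a derived figure rather than one stated in this listing. The helper names are hypothetical, and because each daily update reflects the preceding Wordle ID, a tweet's Created_At date may not match the puzzle date exactly.

```python
from datetime import date, timedelta

# Anchor taken from the coverage statement: ID 254 <-> 28 February 2022.
FIRST_ID = 254
FIRST_DATE = date(2022, 2, 28)


def wordle_date(wordle_id: int) -> date:
    """Calendar date of a puzzle, assuming one puzzle per day from the anchor."""
    return FIRST_DATE + timedelta(days=wordle_id - FIRST_ID)


def wordle_id_for(day: date) -> int:
    """Inverse mapping: the puzzle ID for a given calendar date."""
    return FIRST_ID + (day - FIRST_DATE).days


print(wordle_date(254))                    # first day of coverage
print(wordle_id_for(date(2022, 7, 24)))    # last day of coverage
```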
License
CC0: Public Domain
Who Can Use It
- Data Scientists: For developing models based on large-scale text data and time-series analysis of gaming trends.
- Social Media Strategists: To understand viral game mechanics and user engagement on platforms like Twitter.
- Academics/Researchers: For studies on public engagement with digital games, data collection methodologies from social platforms, or geographical trend analysis.
- Game Developers: To benchmark public interest and difficulty levels of daily puzzles.
Dataset Name Suggestions
- Wordle Player Tweets
- Global Wordle Attempt Metrics
- Daily Wordle Guess Distribution
- Wordle Twitter Activity
Attributes
Original Data Source: Daily Wordle Guess Distribution
