Premier League Fan Engagement Dataset
Social Media and Networking
Free
About
This dataset captures online discussion of Fantasy Premier League (FPL), a popular online fantasy football game based on the English Premier League. FPL participants select real-life Premier League players and earn points based on those players' actual match performances. The game has over 9 million registered users worldwide, making it a prominent fantasy sports platform. For context, an FPL team is typically built within a budget of £100.0m, and top players can post very large totals, with Mohamed Salah scoring a record 303 points in the 2017/18 season. The tweets were gathered with the snscrape library by searching for keywords such as "Fantasy Premier League" and "FPL". The collection is a resource for understanding the FPL community, gauging interest levels, and analysing player experiences through social media chatter.
Columns
- ID: A unique identifier assigned to each tweet.
- Timestamp: The precise date and time when the tweet was originally posted.
- User: The username of the individual who posted the tweet.
- Text: The actual content or message of the tweet.
- Hashtag: Any hashtags that were included within the tweet's content.
- Retweets: The total number of times the tweet had been retweeted when the data was collected.
- Likes: The total number of likes the tweet had received when the data was collected.
- Replies: The total number of replies directed at the tweet when the data was collected.
- Source: The specific application or device used by the user to post the tweet, such as "Twitter for Android" or "Twitter for iPhone".
- Location: The geographical location optionally specified on the user's Twitter profile.
- Verified_Account: A true or false value indicating whether the user's Twitter account was verified at the time of scraping.
- Followers: The count of followers the user had when the tweet data was scraped.
- Following: The count of accounts the user was following when the tweet data was scraped.
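As a rough illustration of how these columns could be read into typed records, the sketch below uses only the Python standard library. It assumes the CSV header names match the column names listed above exactly and that timestamps are ISO 8601 strings; neither is confirmed by this listing. The `Tweet` class, the `read_tweets` helper, and the sample row are hypothetical and made up for the example.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime
from typing import Iterator, Optional, TextIO


@dataclass
class Tweet:
    id: str
    timestamp: datetime
    user: str
    text: str
    hashtag: Optional[str]
    retweets: int
    likes: int
    replies: int
    source: Optional[str]
    location: Optional[str]
    verified_account: bool
    followers: int
    following: int


def _opt(value: Optional[str]) -> Optional[str]:
    """Map empty or missing cells to None."""
    value = (value or "").strip()
    return value or None


def _int(value: Optional[str]) -> int:
    """Parse a count, treating empty cells as 0 (an assumption)."""
    value = (value or "").strip()
    return int(float(value)) if value else 0


def read_tweets(handle: TextIO) -> Iterator[Tweet]:
    """Yield one Tweet per CSV row; header names are assumed, not confirmed."""
    for row in csv.DictReader(handle):
        yield Tweet(
            id=row["ID"],
            timestamp=datetime.fromisoformat(row["Timestamp"]),
            user=row["User"],
            text=row["Text"],
            hashtag=_opt(row.get("Hashtag")),
            retweets=_int(row.get("Retweets")),
            likes=_int(row.get("Likes")),
            replies=_int(row.get("Replies")),
            source=_opt(row.get("Source")),
            location=_opt(row.get("Location")),
            verified_account=(row.get("Verified_Account") or "").strip().lower() == "true",
            followers=_int(row.get("Followers")),
            following=_int(row.get("Following")),
        )


# Made-up sample row, for illustration only (not taken from the dataset).
SAMPLE_CSV = (
    "ID,Timestamp,User,Text,Hashtag,Retweets,Likes,Replies,Source,Location,"
    "Verified_Account,Followers,Following\n"
    '1,2023-04-09 12:30:00+00:00,example_user,"Captaining Salah this week, wish me luck",'
    "FPL,2,15,1,Twitter for Android,,False,120,80\n"
)

tweets = list(read_tweets(io.StringIO(SAMPLE_CSV)))
```

For the real file, the same function could be called with `open(path, newline="", encoding="utf-8")`, adjusting the header names and timestamp parsing to whatever the CSV actually contains.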
Distribution
The dataset contains 114,466 tweets and is typically distributed as a CSV file. Summary statistics reported for the collection:
- Users: 112,217 distinct user accounts.
- Locations: 54,030 unique location values taken from user profiles.
- Source: 19% "Twitter for Android", 17% "Twitter for iPhone", and 65% other sources. A notable portion (29%) of source values is reported as null.
- Hashtag: 75% null, 4% 'FPL', and 21% other hashtags.
- Retweets: 114,394 tweets have between 0 and 156 retweets. The maximum for a single tweet is 7,805.
- Likes: 114,352 tweets have between 0 and 1,719 likes. The maximum is approximately 86,000.
- Replies: 114,168 tweets have between 0 and 26 replies. The maximum is 1,317.
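Category shares like the source and hashtag figures above can be recomputed from the raw column. The following is a minimal sketch, not the method used to produce the listed numbers: `category_shares` and its labels are hypothetical, and counting null values in the denominator is an assumption.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Set


def category_shares(
    values: Iterable[Optional[str]],
    keep: Set[str],
    other_label: str = "Other",
    null_label: str = "(null)",
) -> Dict[str, float]:
    """Return the percentage share of each kept category, 'Other', and nulls.

    Nulls (None or blank strings) are counted in the denominator here;
    that choice is an assumption and changes every percentage.
    """
    counts: Counter = Counter()
    for value in values:
        if value is None or not str(value).strip():
            counts[null_label] += 1
        elif value in keep:
            counts[value] += 1
        else:
            counts[other_label] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Tiny made-up example, not real dataset values.
example = category_shares(
    ["Twitter for Android", "Twitter for iPhone", None, "TweetDeck"],
    keep={"Twitter for Android", "Twitter for iPhone"},
)
```

Whether nulls are included in the total or reported separately affects how the resulting percentages add up, so it is worth stating that choice alongside any recomputed breakdown.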
Usage
This dataset is well-suited for various analytical tasks, particularly in the fields of natural language processing (NLP) and machine learning. Ideal applications include:
- Sentiment analysis: Determining the overall mood or emotional tone of tweets related to FPL.
- Topic modelling: Identifying key themes and subjects discussed within the FPL community on Twitter.
- Community analysis: Gaining insights into the dynamics and interests of FPL players and enthusiasts.
- Interest assessment: Measuring the level of online engagement and popularity of FPL over time.
- User experience studies: Understanding the challenges, successes, and general experiences of playing FPL as expressed by users.
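As a starting point for the sentiment analysis use case, the sketch below scores tweet text against a tiny word list. It is a toy lexicon-based scorer under stated assumptions: the `POSITIVE` and `NEGATIVE` word sets and the `lexicon_sentiment` function are invented for illustration, and a real study would more likely use a trained sentiment model or an established lexicon.

```python
import re

# Hypothetical mini-lexicons, chosen only for illustration.
POSITIVE = {"great", "haul", "win", "happy", "love", "brilliant"}
NEGATIVE = {"blank", "injured", "awful", "hate", "disaster", "benched"}


def lexicon_sentiment(text: str) -> str:
    """Label text as positive, negative, or neutral by counting lexicon hits."""
    tokens = re.findall(r"[a-z']+", text.lower())
    # Each positive word adds 1, each negative word subtracts 1.
    score = sum((tok in POSITIVE) - (tok in NEGATIVE) for tok in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Applied to the Text column, a function like this gives one label per tweet, which can then be aggregated by user, hashtag, or time period.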
Coverage
The dataset spans tweets from 29 October 2012 to 9 April 2023. Geographically, the data is global in scope, consistent with FPL's worldwide base of over 9 million registered users. The collection process aimed to gather a substantial number of tweets for each year within this period.
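One way to check the per-year coverage is to count tweets by calendar year within the stated date range. This is a minimal sketch that assumes the Timestamp column has already been parsed into `datetime` objects; `tweets_per_year` and the two constants are hypothetical names.

```python
from collections import Counter
from datetime import date, datetime
from typing import Dict, Iterable

# Coverage window as stated in this listing.
COVERAGE_START = date(2012, 10, 29)
COVERAGE_END = date(2023, 4, 9)


def tweets_per_year(timestamps: Iterable[datetime]) -> Dict[int, int]:
    """Count tweets per calendar year, ignoring any outside the coverage window."""
    counts: Counter = Counter()
    for ts in timestamps:
        if COVERAGE_START <= ts.date() <= COVERAGE_END:
            counts[ts.year] += 1
    return dict(sorted(counts.items()))
```

The resulting year-to-count mapping can also serve the interest assessment use case, as a simple view of how tweet volume changes over time.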
License
CC0
Who Can Use It
- Data Scientists and Machine Learning Engineers for developing NLP models and predictive analytics.
- Market Researchers and Brand Strategists interested in social media trends and community engagement in the sports sector.
- Academics and Students for research projects on social media analytics, fan behaviour, or fantasy sports.
- Sports Analysts and Fantasy Sports Enthusiasts looking to understand FPL discourse and sentiment.
Dataset Name Suggestions
- FPL Tweets Data 2012-2023
- Fantasy Premier League Twitter Archive
- Premier League Fan Engagement Dataset
- Social Media Chatter on FPL
Attributes
Original Data Source: FPL Tweets Dataset