Emma Raducanu Tweets Analysis
Entertainment & Media Consumption
Free
About
This dataset provides a collection of recent tweets about Emma Raducanu, the winner of the 2021 US Open women's singles [6]. Raducanu, a British teenager of Romanian and Chinese heritage who was born in Canada, made history by winning the US Open as a qualifier, taking 20 consecutive sets without dropping one [6]. She is the first British woman to win a Grand Slam since 1977 and the first woman in US Open history to win the event from the qualifiers [6]. Following her victory, her world ranking rose to 23rd [6]. She is also noted for her academic achievements, having excelled in mathematics and economics at her selective grammar school in south London [6]. The data is collected using the tweepy Python package to access the Twitter API, with the search term #EmmaRaducanu [6]. Collection runs continuously via a script that gathers a small number of recent tweets, merges them with the existing data, and saves the result in CSV format [6]. The accumulated dataset is uploaded to Kaggle as a new version once or several times a day [6].
Columns
- id: Unique identifier for the tweet [7].
- user_name: The name of the user who posted the tweet [7].
- user_location: The stated location of the user [7].
- user_description: The biographical description provided by the user [7].
- user_created: The date and time the user's Twitter account was created [7].
- user_followers: The number of followers the user has [7].
- user_friends: The number of accounts the user is following [7].
- user_favourites: The total number of tweets the user has favourited [7].
- user_verified: A boolean indicating if the user's account is verified [7].
- date: The date and time the tweet was posted [7].
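The merge step described under About (new tweets combined with the existing data and saved as CSV) could look roughly like the sketch below. It is not the collector's actual script: the function names are hypothetical, de-duplicating on id is an assumption, and the call to the Twitter API is left out entirely. Only the column names come from the list above.

```python
import csv
import io

# Column names as listed in this dataset description.
COLUMNS = [
    "id", "user_name", "user_location", "user_description", "user_created",
    "user_followers", "user_friends", "user_favourites", "user_verified", "date",
]


def merge_tweets(existing, new_batch):
    """Merge two lists of row dicts, keeping the first row seen for each id.

    Assumption: a tweet is identified by its "id" value, so a tweet fetched
    in two consecutive runs is stored only once.
    """
    merged = {}
    for row in list(existing) + list(new_batch):
        merged.setdefault(row["id"], row)
    return list(merged.values())


def to_csv(rows):
    """Serialise rows to CSV text; missing columns are written as empty."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample rows, for illustration only.
existing = [{"id": "1", "user_name": "a"}, {"id": "2", "user_name": "b"}]
new_batch = [{"id": "2", "user_name": "b"}, {"id": "3", "user_name": "c"}]
merged = merge_tweets(existing, new_batch)
csv_text = to_csv(merged)
```

Keeping the first occurrence means that an already stored tweet is never overwritten by a later fetch; a collector that wants refreshed follower counts would keep the last occurrence instead.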
Distribution
The dataset is typically provided in CSV format [1]. It contains approximately 12,450 records (tweets), a figure obtained by adding the true and false counts of the user_verified column [8]. Total row counts are not stated for each individual column, but value counts by range are given for user_followers, user_friends, user_favourites, and date [7-11].
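As a rough illustration of how a record count can be recovered from the user_verified column, the sketch below tallies true and false values and adds them. The function name and the sample rows are hypothetical, and it assumes the CSV stores the flag as the text "True" or "False".

```python
from collections import Counter


def verified_counts(rows):
    """Count verified and unverified rows and return both with their total.

    Assumption: user_verified holds "True"/"False" (any letter case).
    """
    counts = Counter(
        str(row["user_verified"]).strip().lower() == "true" for row in rows
    )
    return {
        "verified": counts[True],
        "unverified": counts[False],
        "total": counts[True] + counts[False],
    }


# Hypothetical sample rows, for illustration only.
sample = [
    {"user_verified": "True"},
    {"user_verified": "False"},
    {"user_verified": "False"},
]
summary = verified_counts(sample)
```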
Usage
This dataset is suitable for a variety of analyses related to social media, sports, and public sentiment [6]. Ideal applications and use cases include:
- Studying the subjects and topics present in recent tweets about the new US Open champion or about tennis [6].
- Performing various Natural Language Processing (NLP) tasks, such as topic modelling and sentiment analysis [6].
- Identifying specific types of tweets, for example, those about women's tennis, British athletes, or the US Open [6].
- Tracking and analysing trends in news related to tennis, the US Open, or Emma Raducanu [6].
- Conducting sentiment analysis on the tweet corpus, with the possibility of segmenting analysis by topics or countries [6].
- Investigating the distribution and popularity of hashtags associated with the tweets [6].
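For the hashtag use case, a minimal sketch is shown below. It assumes the tweet body is available in a column named text, which is not among the columns listed in this description, so treat that name as hypothetical. Hashtags are lower-cased so that #EmmaRaducanu and #emmaraducanu are counted together.

```python
import re
from collections import Counter

# Matches a "#" followed by letters, digits, or underscores.
HASHTAG_PATTERN = re.compile(r"#(\w+)")


def hashtag_counts(texts):
    """Return a Counter of lower-cased hashtags found in the given texts."""
    counts = Counter()
    for text in texts:
        counts.update(tag.lower() for tag in HASHTAG_PATTERN.findall(text))
    return counts


# Hypothetical tweet texts, for illustration only.
texts = [
    "Well done #EmmaRaducanu #USOpen",
    "#emmaraducanu wins in New York",
]
top = hashtag_counts(texts).most_common(2)
```

The same per-tweet loop is a natural place to attach a sentiment score or topic label if an NLP library is added later.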
Coverage
The dataset's coverage is global [12], although specific user locations such as "London" appear in the data [9]. The primary focus is on tweets about Emma Raducanu, a British athlete [6]. Tweets range from 12th September 2021 to 14th January 2022 [13]. User account creation dates range from 2nd November 2006 to 9th November 2021 [10]. Data collection is continuous, and the dataset is updated regularly to include recent tweets [6].
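To restrict an analysis to part of the covered period, rows can be filtered on the date column. The sketch below assumes the values are ISO-style strings such as "2021-09-12 08:30:00"; the actual format in the CSV is not stated here, and the function name is hypothetical.

```python
from datetime import date, datetime


def filter_by_date(rows, start, end):
    """Keep rows whose "date" falls between start and end, both inclusive.

    Assumption: "date" is an ISO-style string that
    datetime.fromisoformat can parse.
    """
    kept = []
    for row in rows:
        posted = datetime.fromisoformat(row["date"]).date()
        if start <= posted <= end:
            kept.append(row)
    return kept


# Hypothetical sample rows, for illustration only.
rows = [
    {"id": "1", "date": "2021-09-12 08:30:00"},
    {"id": "2", "date": "2021-11-01 12:00:00"},
    {"id": "3", "date": "2022-01-14 23:59:59"},
]
# Tweets posted in September 2021 only.
september = filter_by_date(rows, date(2021, 9, 1), date(2021, 9, 30))
```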
License
CC0
Who Can Use It
This dataset is ideal for:
- Data scientists and machine learning practitioners for NLP model training and social media data analysis [6].
- Sports analysts interested in public reception and fan engagement surrounding major sporting events and athletes [6].
- Media and communications researchers studying online communities, celebrity influence, and news trends [6].
- Students and academics for research projects in social sciences, computer science, and linguistics [6].
- Marketing and PR professionals for sentiment tracking and public opinion monitoring [6].
Dataset Name Suggestions
- Emma Raducanu Tweets Analysis
- US Open 2021 Tennis Tweets
- Emma Raducanu Social Media Data
- Celebrity Tennis Tweets
- Raducanu Twitter Mentions
Attributes
Original Data Source: Emma Raducanu