IPL 2020 Twitter Conversation Data
Sports & Recreation
Free
About
This dataset provides a collection of tweets about the Indian Premier League (IPL) 2020 season. Its primary aim is to offer insight into public discourse and sentiment surrounding this major cricket event. It was compiled to capture which topics were trending and being discussed on Twitter during the tournament. One inspiration behind its creation was a possible correlation between the start of the IPL and a dip in COVID-19 cases in India, which might suggest that more people were staying indoors to watch matches. This resource is valuable for analysing social media activity and public reactions during a major sporting spectacle.
Columns
The dataset includes the following columns:
- S.no: A serial number assigned to each record.
- id: A unique identifier generated for each individual tweet.
- user_id: The unique identification number for the Twitter user who posted the tweet.
- username: The Twitter handle (e.g., @exampleuser) of the account that published the tweet.
- name: The display name associated with the Twitter account.
- tweet: The full textual content of the tweet.
- language: The detected language in which the tweet was written.
- mentions: A list of other Twitter accounts that were mentioned within the tweet.
- urls: Any external links or URLs included in the tweet's content.
- photos: Links to any images that were attached to or referenced in the tweet.
Distribution
The dataset is provided as a CSV (Comma Separated Values) file. It contains approximately 25,239 records, each representing a distinct tweet posted during IPL 2020. The exact file size is not specified. The data is organised into the columns listed above, making it suitable for tabular data processing.
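As a minimal sketch of reading a CSV with this layout, the snippet below uses Python's standard `csv` module and checks that the columns listed above are present. The names `load_tweets` and `EXPECTED_COLUMNS` are hypothetical, and the header spellings are assumed to match the column names in this listing exactly. The actual file may differ. The sample row is invented for illustration only.

```python
import csv
import io

# Column names as given in this listing (assumed to match the file header exactly).
EXPECTED_COLUMNS = [
    "S.no", "id", "user_id", "username", "name",
    "tweet", "language", "mentions", "urls", "photos",
]

def load_tweets(handle):
    """Read rows from an open text file and verify the expected columns exist."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [col for col in EXPECTED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    # Every value is kept as a string, so large tweet ids are not altered.
    return list(reader)

# Invented one-row sample standing in for the real file.
sample = io.StringIO(
    "S.no,id,user_id,username,name,tweet,language,mentions,urls,photos\n"
    '1,1001,42,exampleuser,Example User,"What a match! #IPL2020",en,[],[],[]\n'
)
rows = load_tweets(sample)
print(rows[0]["username"], "-", rows[0]["tweet"])
```

To read a real file, an open handle can be passed instead, for example `open(path, newline="", encoding="utf-8")`, where `path` points at the downloaded CSV.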
Usage
This dataset is ideally suited for a variety of applications, including:
- Social media sentiment analysis: To gauge public opinion and emotional responses related to IPL 2020 teams, players, and matches.
- Trend and topic analysis: Identifying popular discussions, hashtags, and emerging themes during the cricket season.
- Natural Language Processing (NLP) research: For tasks such as text classification, entity recognition, and language modelling based on real-world social media text.
- Academic studies: Exploring the intersection of large sporting events, social behaviour, and public health (as inspired by the COVID-19 context).
- Sports analytics and fan engagement studies: Understanding how fans interact with and react to live sporting events on social platforms.
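The trend and topic analysis use case above could start from something as simple as counting hashtags. The following is a rough sketch under stated assumptions: `top_hashtags` is a hypothetical helper, the input is taken to be the text of the `tweet` column, and a hashtag is approximated as `#` followed by word characters, which will not cover every tag Twitter accepts.

```python
import re
from collections import Counter

# Approximation: a hashtag is '#' followed by one or more word characters.
HASHTAG = re.compile(r"#(\w+)")

def top_hashtags(tweets, n=10):
    """Return the n most frequent hashtags (lowercased) across tweet texts."""
    counts = Counter()
    for text in tweets:
        counts.update(tag.lower() for tag in HASHTAG.findall(text))
    return counts.most_common(n)

# Invented example texts, not taken from the dataset.
examples = ["Go #CSK! #IPL2020", "#ipl2020 tonight", "no tags here"]
print(top_hashtags(examples, n=2))
```

Lowercasing merges variants such as `#IPL2020` and `#ipl2020`. Whether that is desirable depends on the analysis.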
Coverage
The dataset focuses exclusively on tweets posted during the IPL 2020 season. Geographically, the data is most relevant to Asia, given that the Indian Premier League is a major sporting event in the region. No specific demographic scope is provided, as the dataset comprises tweets from a wide range of public Twitter users.
License
CC0
Who Can Use It
This dataset is particularly useful for:
- Data scientists and analysts: For conducting sentiment analysis, performing NLP tasks, and exploring social network structures.
- Researchers: Especially those in sociology, public health, media studies, and sports science, interested in social media trends and event impacts.
- Marketing and brand strategists: To understand audience engagement and public perception surrounding major sporting events.
- Developers: Building applications that require historical social media data for analysis or integration.
- Students: For academic projects involving text mining and social media analysis.
Dataset Name Suggestions
- IPL 2020 Twitter Conversation Data
- Indian Premier League 2020 Tweet Analysis
- Cricket 2020 Social Media Discourse
- IPL Season 13 Tweets
- Public Sentiment on IPL 2020
Attributes
Original Data Source: IPL 2020 🏏🏏Tweets