Uber Public Discourse Dataset
Social Media and Networking
Free
About
This dataset contains 10,000 recent tweets related to #uber or #Uber, collected up to 11 July 2022. It is suited to sentiment analysis research, other Natural Language Processing (NLP) tasks, and general exploration.
Columns
- Index: An index for the records.
- id: A unique identification number for each tweet.
- conversation_id: The unique conversation ID to which the tweet belongs.
- created_at: The date and time when the tweet was created.
- date: The date on which the tweet was posted.
- timezone: The time zone of the tweet's origin.
- place: Information about the geographical place associated with the tweet.
- tweet: The full text content of the tweet.
- language: The language in which the tweet is written.
- hashtags: Any hashtags used within the tweet.
- username: The username of the individual who posted the tweet.
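As a minimal sketch of how these columns might be read, the snippet below loads the CSV with Python's standard `csv` module and checks that the listed columns are present. The exact header spellings (for example `Index`) are assumed from the column list above and may differ in the actual file. The one-row sample is synthetic and only keeps the example self-contained.

```python
import csv
import io

# ASSUMPTION: header names match the column list in this description.
EXPECTED_COLUMNS = [
    "Index", "id", "conversation_id", "created_at", "date", "timezone",
    "place", "tweet", "language", "hashtags", "username",
]

def load_tweets(source):
    """Read tweets from an open text stream and return a list of dicts."""
    reader = csv.DictReader(source)
    header = reader.fieldnames or []
    missing = [col for col in EXPECTED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)

# Synthetic one-row sample (NOT real data), used only to make this runnable.
SAMPLE = io.StringIO(
    "Index,id,conversation_id,created_at,date,timezone,place,"
    "tweet,language,hashtags,username\n"
    "0,1,1,2022-07-10 12:00:00,2022-07-10,+0000,,example text,en,[],example_user\n"
)
rows = load_tweets(SAMPLE)
print(rows[0]["tweet"], rows[0]["language"])
```

With the real file, the same function could be called as `load_tweets(open("uber_tweets.csv", newline="", encoding="utf-8"))`, where `uber_tweets.csv` is a hypothetical filename.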
Distribution
The dataset is typically provided in CSV format and comprises 10,000 recent tweets posted from locations worldwide. By language, approximately 35% of tweets are in English ('en'), 29% are in French ('fr'), and 36% are in other languages. By hashtag, 82% of tweets contain no hashtags, 3% contain 'uberfiles', and 15% contain other hashtags.
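The language and hashtag percentages could be recomputed along the following lines. This is a sketch, not the method used to produce the figures above. It assumes each row is a dict keyed by column name (for example, from `csv.DictReader`) and that the `hashtags` cell holds a Python-style list literal such as `['uber', 'uberfiles']`. That cell format is an assumption about the file, and the helper names are illustrative.

```python
import ast
from collections import Counter

def share(counter):
    """Convert raw counts into percentages rounded to one decimal place."""
    total = sum(counter.values())
    if total == 0:
        return {}
    return {key: round(100 * n / total, 1) for key, n in counter.items()}

def language_share(rows):
    """Percentage of tweets per value of the 'language' column."""
    return share(Counter(row["language"] for row in rows))

def parse_hashtags(cell):
    """Parse a hashtags cell. ASSUMPTION: it is a list literal like "['uber']"."""
    try:
        value = ast.literal_eval(cell) if cell else []
    except (ValueError, SyntaxError):
        return []
    if not isinstance(value, (list, tuple)):
        return []
    return [str(tag).lower() for tag in value]

def hashtag_share(rows, tag="uberfiles"):
    """Bucket tweets into: no hashtags, contains `tag`, or other hashtags only."""
    counts = Counter()
    for row in rows:
        tags = parse_hashtags(row.get("hashtags", ""))
        if not tags:
            counts["none"] += 1
        elif tag in tags:
            counts[tag] += 1
        else:
            counts["other"] += 1
    return share(counts)
```

Calling `language_share(rows)` and `hashtag_share(rows)` on the loaded rows returns dictionaries of percentages that can be compared with the figures quoted above.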
Usage
This dataset is well-suited for:
- Conducting sentiment analysis research on public opinion regarding Uber.
- Developing and testing models for Natural Language Processing (NLP) tasks, such as text classification or named entity recognition.
- Exploring social media trends and user engagement related to Uber for personal projects or insights.
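For the sentiment analysis and NLP uses listed above, a common first step is to normalise the tweet text and restrict it to a single language, since most models are language-specific. The sketch below shows one possible preprocessing pass using only the standard library. The cleaning rules (drop URLs and @mentions, keep hashtag words, lowercase) are illustrative choices rather than part of the dataset, and each row is assumed to be a dict with `tweet` and `language` keys.

```python
import re

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
SPACE_RE = re.compile(r"\s+")

def clean_tweet(text):
    """Remove URLs and @mentions, keep hashtag words, collapse spaces, lowercase."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", "")
    return SPACE_RE.sub(" ", text).strip().lower()

def texts_for_language(rows, language="en"):
    """Return cleaned tweet texts whose 'language' column equals `language`."""
    return [
        clean_tweet(row["tweet"])
        for row in rows
        if row.get("language") == language
    ]
```

The resulting list of strings can then be passed to whichever sentiment or text classification model is being evaluated.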
Coverage
The dataset's geographic scope is global, with tweets from various locations. It covers tweets posted up to 11 July 2022. No demographic breakdown is available, since the dataset captures public tweets from Twitter users in general.
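The stated time range can be checked with a short script such as the one below. It is a sketch that assumes each row is a dict and that the `date` column begins with an ISO date (`YYYY-MM-DD`). The actual date format in the file is not documented here, so the parsing may need adjusting.

```python
from datetime import date

CUTOFF = date(2022, 7, 11)  # end of coverage stated in this description

def parse_day(cell):
    """Parse the leading YYYY-MM-DD part of a date cell (assumed format)."""
    return date.fromisoformat(cell[:10])

def date_range(rows):
    """Return (earliest, latest) tweet dates, or None if no dates are present."""
    days = [parse_day(row["date"]) for row in rows if row.get("date")]
    return (min(days), max(days)) if days else None

def rows_after_cutoff(rows, cutoff=CUTOFF):
    """Return rows dated later than the cutoff; expected to be empty."""
    return [
        row for row in rows
        if row.get("date") and parse_day(row["date"]) > cutoff
    ]
```

If `rows_after_cutoff(rows)` returns any rows, either the coverage statement or the assumed date format needs revisiting.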
License
CC0
Who Can Use It
This dataset is intended for a range of users, including:
- Data scientists and researchers focused on social media analysis and NLP.
- Academics studying public sentiment or discourse surrounding ride-sharing services.
- Developers looking for real-world text data to train and test algorithms.
- Students and hobbyists interested in exploring large social media datasets for personal learning or enjoyment.
Dataset Name Suggestions
- Uber Global Tweets
- Social Media Uber Sentiment
- Twitter Uber Activity
- Uber Public Discourse Dataset
Attributes
Original Data Source: Twitter Dataset: Uber