US Political Communication Dataset
Social Media and Posts
Free
About
This archive captures Donald Trump's complete output on the Twitter platform, covering posts from May 2009 to January 2021. It provides a chronological record of his public communications, comprising over 56,000 individual tweets. The data is useful for researchers and analysts tracking changes in political discourse, rhetoric, and public engagement during a significant era of modern US politics. Beyond the text itself, each record includes engagement counts, the posting device, and the status of the post: whether it was a retweet, and whether it was subsequently deleted or flagged.
Columns
- id: The unique identification number assigned to the individual tweet.
- text: The content of the tweet itself, containing over 56,000 unique values.
- isRetweet: A Boolean indicator specifying whether the post is a re-shared tweet (approximately 17% of records).
- isDeleted: A Boolean indicator noting if the tweet was deleted (approximately 2% of the archive).
- device: Specifies the posting source, such as ‘Twitter for iPhone’ (the most frequent device at 49%) or ‘Twitter for Android’ (26%). There are 20 unique devices recorded.
- favorites: The count of ‘likes’ or favorites the post received, with a maximum recorded value exceeding 1.8 million.
- retweets: The count of times the post was re-shared by other users, with a maximum value over 400,000.
- date: The date and time the tweet was posted, recorded in Eastern Time (EST).
- isFlagged: A Boolean field indicating if the tweet was flagged (representing 1% of the total posts).
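As a rough illustration of the schema above, the following sketch reads rows into typed records using only the Python standard library. Several details are assumptions, not documented facts: that the CSV header uses exactly the column names listed here, that Booleans are stored as tokens such as `t`/`f` or `true`/`false`, and that dates follow `YYYY-MM-DD HH:MM:SS`. The `Tweet` class, the helper names, and the two sample rows are hypothetical and made up for the example.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime

# Assumption: the Boolean encoding is not documented, so accept common tokens.
TRUE_TOKENS = {"t", "true", "1", "yes"}
# Assumption: the date layout is not documented; adjust if the file differs.
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


@dataclass
class Tweet:
    """Hypothetical typed view of one row of the nine listed columns."""
    id: str
    text: str
    is_retweet: bool
    is_deleted: bool
    device: str
    favorites: int
    retweets: int
    date: datetime
    is_flagged: bool


def parse_bool(value):
    """Map a raw Boolean token to True/False (anything unrecognised is False)."""
    return value.strip().lower() in TRUE_TOKENS


def read_tweets(handle):
    """Parse an open CSV file or file-like object into a list of Tweet records."""
    reader = csv.DictReader(handle)
    return [
        Tweet(
            id=row["id"],
            text=row["text"],
            is_retweet=parse_bool(row["isRetweet"]),
            is_deleted=parse_bool(row["isDeleted"]),
            device=row["device"],
            favorites=int(row["favorites"]),
            retweets=int(row["retweets"]),
            date=datetime.strptime(row["date"], DATE_FORMAT),
            is_flagged=parse_bool(row["isFlagged"]),
        )
        for row in reader
    ]


# Made-up rows for illustration only; they are not taken from the dataset.
SAMPLE = """id,text,isRetweet,isDeleted,device,favorites,retweets,date,isFlagged
1,"Hello, world",f,f,Twitter for iPhone,10,3,2020-01-02 09:15:00,f
2,RT example,t,f,Twitter for Android,0,7,2020-01-02 21:40:00,f
"""

tweets = read_tweets(io.StringIO(SAMPLE))
```

To try this on the real file, the same function could be given a handle from `open("trump_tweets.csv", newline="", encoding="utf-8")`; the UTF-8 encoding is likewise an assumption.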
Distribution
The entire collection is provided in a single data file, trump_tweets.csv, which is approximately 11.02 MB in size. The dataset is structured with 9 columns and contains 56,600 validated rows. The archive is a fixed historical collection with no expected future updates. Data fields show 100% validity across all records.
Usage
The data is well suited to studying the intersection of technology and political communication. Typical applications include:
- Performing large-scale natural language processing (NLP) for sentiment and topic modelling in political science.
- Investigating the effectiveness of communication strategies based on engagement metrics (favorites and retweets).
- Analysing temporal patterns in posting frequency and content shifts over the 12-year period.
- Mapping rhetorical patterns used in high-stakes political messaging.
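Two of the applications above, temporal posting patterns and engagement metrics, can be sketched with the Python standard library alone. This is a minimal illustration, not a prescribed method: it assumes the CSV header matches the listed column names and that dates follow `YYYY-MM-DD HH:MM:SS`, neither of which is documented here. The function names and the three sample rows are hypothetical.

```python
import csv
import io
from collections import Counter, defaultdict
from datetime import datetime
from statistics import mean

# Assumption: the date layout is not documented; adjust if the file differs.
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def tweets_per_year(rows):
    """Count posts per calendar year, returned in chronological order."""
    counts = Counter(
        datetime.strptime(row["date"], DATE_FORMAT).year for row in rows
    )
    return dict(sorted(counts.items()))


def mean_favorites_by_device(rows):
    """Average the favorites count for each posting device."""
    by_device = defaultdict(list)
    for row in rows:
        by_device[row["device"]].append(int(row["favorites"]))
    return {device: mean(values) for device, values in by_device.items()}


# Made-up rows for illustration only; they are not taken from the dataset.
SAMPLE = """id,text,isRetweet,isDeleted,device,favorites,retweets,date,isFlagged
1,first,f,f,Twitter for iPhone,10,3,2019-06-01 09:15:00,f
2,second,f,f,Twitter for iPhone,30,5,2020-01-02 21:40:00,f
3,third,f,f,Twitter for Android,4,1,2020-03-05 08:00:00,f
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
yearly = tweets_per_year(rows)
by_device = mean_favorites_by_device(rows)
```

The same two functions accept rows read from the full file, and the grouping key can be swapped (for example, month instead of year) to look at finer-grained posting patterns.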
Coverage
The temporal coverage spans from the earliest post on 5 May 2009 to the final included post on 8 January 2021. This provides a detailed historical view of a major US political figure’s digital presence.
License
CC0: Public Domain
Who Can Use It
- Data Scientists: Utilising the text field for advanced machine learning models and NLP projects.
- Political Analysts: Studying evolving political strategies and public response to controversial or key messaging.
- Students and Educators: The data is rated as beginner-friendly, making it excellent for introductory projects on text analysis and data visualisation.
Dataset Name Suggestions
- Donald Trump Twitter History (2009-2021)
- The Trump Twitter Archive
- US Political Communication Dataset
- Historical Tweets Archive
Attributes
Original Data Source: US Political Communication Dataset
