Senate Political Network and Discourse Archive

Social Media and Posts

Tags and Keywords

  • Senate
  • Twitter
  • Politics
  • Network
  • Topics

Free

About

Analysing the digital presence and interpersonal connections of the United States Senate provides a deep look into the social and political dynamics of American governance. This collection merges biographical data with social media metrics to map the network of senators on Twitter, offering insights into their interactions, public discourse, and social representation. By including demographic factors such as gender and race alongside 15 years of tweet history, the data supports critical research into fairness in algorithmic models and the geospatial representation of political networks.

Columns

  • senators.csv: Includes names, age, occupation, location, and state coordinates for geospatial mapping. It also contains Twitter metadata (screen_name, followers count, following count) and the demographic attributes gender and race.
  • relationships.csv: Details the connections between senators using person1, person2, and relationship status (mutual following, one-way following, or no connection).
  • dataset_with_topics.csv:
    • date: The publication date and time of the tweet.
    • id: The unique identifier for the tweet on the platform.
    • username: The handle of the senator who published the post.
    • text: The specific content of the tweet.
    • retweets: The number of times the post was shared.
    • likes: The number of likes received by the post.
    • topic_1500 / topic_500 / topic_150: Results of BERTopic clustering at different minimum cluster sizes.
  • topic_info_150: Contains topic details for the 150-size clusters, including manual labels for classification tasks.
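As a rough illustration of how relationships.csv could be turned into a follow network, the sketch below builds an adjacency map from person1, person2, and a status field. The header name `relationship`, the status labels `mutual` / `one-way` / `none`, and the direction of a one-way follow (person1 follows person2) are assumptions made for this example; check them against the actual file before use. The sample rows are invented.

```python
import csv
import io
from collections import defaultdict

# Invented rows for illustration; the real header and status labels may differ.
SAMPLE_RELATIONSHIPS = """person1,person2,relationship
A,B,mutual
A,C,one-way
B,C,none
"""


def build_follow_graph(csv_text):
    """Return {follower: set of followed accounts} from relationship rows."""
    graph = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        a, b, status = row["person1"], row["person2"], row["relationship"]
        if status == "mutual":
            # Mutual following becomes an edge in both directions.
            graph[a].add(b)
            graph[b].add(a)
        elif status == "one-way":
            # Assumption: person1 follows person2, not the reverse.
            graph[a].add(b)
        # Any other status (e.g. no connection) adds no edge.
    return dict(graph)


graph = build_follow_graph(SAMPLE_RELATIONSHIPS)
```

To run this against the real file, the same function could be fed the text of relationships.csv instead of the sample string.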

Distribution

The information is delivered across four related CSV tables that can be merged for deep analysis. The primary file, dataset_with_topics.csv, is approximately 61.2 MB and contains roughly 239,000 valid records. The data maintains a 100% validity rate for core fields such as dates and usernames. It holds a usability score of 10.00 and is maintained as a static archive with no future updates expected.
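One way the tables might be merged is sketched below: each tweet row is matched to a senator by comparing username with screen_name. That the two fields hold the same handle is an assumption, as is the case-insensitive comparison; the sample rows and handles are invented.

```python
import csv
import io

# Invented rows for illustration only.
SAMPLE_SENATORS = """screen_name,gender,race
SenExampleA,F,White
SenExampleB,M,Black
"""

SAMPLE_TWEETS = """id,username,likes
1,senexamplea,10
2,SenExampleB,5
3,unknown_handle,7
"""


def attach_demographics(tweets_text, senators_text):
    """Join tweet rows to senator rows on the Twitter handle (inner join)."""
    # Assumption: username in the tweet table equals screen_name in senators.csv.
    lookup = {
        row["screen_name"].lower(): row
        for row in csv.DictReader(io.StringIO(senators_text))
    }
    merged = []
    for tweet in csv.DictReader(io.StringIO(tweets_text)):
        senator = lookup.get(tweet["username"].lower())
        if senator is None:
            continue  # skip tweets whose handle has no matching senator row
        merged.append({**tweet, "gender": senator["gender"], "race": senator["race"]})
    return merged


merged = attach_demographics(SAMPLE_TWEETS, SAMPLE_SENATORS)
```

Given the size of the primary file, a dataframe library would likely be more convenient in practice; the standard-library version is used here only to keep the sketch self-contained.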

Usage

This resource is ideal for conducting exploratory data analysis on political communication and network topology. It is well-suited for training multi-class classification models using manual topic labels or performing feature extraction from political text. Additionally, researchers can use the demographic attributes to conduct fairness studies and evaluate social representation within automated systems.
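For the fairness studies mentioned above, a simple starting point is to compare a classifier's accuracy across demographic groups. The sketch below assumes each record is a (group, true label, predicted label) triple; the groups, labels, and predictions are invented, and accuracy gap is only one of many possible fairness measures.

```python
from collections import defaultdict

# Invented (group, true_label, predicted_label) triples for illustration.
SAMPLE_PREDICTIONS = [
    ("F", "health", "health"),
    ("F", "economy", "health"),
    ("M", "health", "health"),
    ("M", "economy", "economy"),
]


def accuracy_by_group(records):
    """Return {group: fraction of records where prediction equals truth}."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for group, true_label, predicted_label in records:
        totals[group] += 1
        if true_label == predicted_label:
            correct[group] += 1
    return {group: correct[group] / totals[group] for group in totals}


per_group = accuracy_by_group(SAMPLE_PREDICTIONS)
# Gap between the best and worst served groups.
accuracy_gap = max(per_group.values()) - min(per_group.values())
```

A large gap would suggest the model serves some groups worse than others and is worth investigating further.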

Coverage

The geographic scope is focused on the United States, specifically the members of the US Senate. Temporally, the tweet collection spans from March 2008 to March 2023, though it is intended as a representative sample rather than an exhaustive archive of every tweet published. The demographic data includes specific details on the age, gender, and race of the senators involved.
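Because the collection is a sample rather than a full archive, it can help to check how tweets are spread over the years before drawing conclusions. The sketch below counts tweets per year from the date column, assuming the values are ISO-style timestamps that `datetime.fromisoformat` can parse; the actual format in the file may differ, and the sample values are invented.

```python
from collections import Counter
from datetime import datetime

# Invented timestamps for illustration; the real date format is an assumption.
SAMPLE_DATES = [
    "2008-03-15 12:00:00",
    "2008-11-02 08:30:00",
    "2023-03-01 17:45:00",
]


def tweets_per_year(date_strings):
    """Count how many timestamps fall in each calendar year."""
    return Counter(datetime.fromisoformat(value).year for value in date_strings)


yearly_counts = tweets_per_year(SAMPLE_DATES)
```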

License

CC0: Public Domain

Who Can Use It

Political scientists can leverage these records to study the evolution of legislative discourse over the last decade and a half. Social researchers may utilise the demographic and relationship data to identify patterns in political networking. Furthermore, machine learning engineers can use the clustered topic data to benchmark natural language processing models and fairness metrics.

Dataset Name Suggestions

  • US Senate Twitter Network and Topic Analysis
  • Congressional Social Media Relations: 2008–2023
  • Senate Political Network and Discourse Archive
  • US Legislative Twitter Interactions and Demographics
  • Senate Topic Modelling and Fairness Study Dataset

Attributes

Listing Stats

  • Views: 1
  • Downloads: 0
  • Listed: 27/12/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
