Citibike User Demographics and Trip Metrics
Data Science and Analytics
Free
About
This dataset records bike trips made through a New York bike-sharing service during May 2018. It offers insight into rider behaviour, trip origins and destinations, trip duration, and user demographics. The information is useful for understanding transportation patterns, analysing operational efficiency, and conducting urban planning research.
Columns
- start_time (numeric): The precise time a bicycle trip began, recorded in New York City local time.
- stop_time (numeric): The time when the bicycle trip concluded, recorded in New York City local time.
- start_station_id (categorical): A unique identifier assigned to the station where the trip commenced.
- start_station_name (categorical): The formal name of the starting station. The most frequently used starting station is Pershing Square North.
- end_station_id (categorical): A unique code for the station where the trip ended.
- end_station_name (categorical): The formal name of the ending station. The most frequently used ending station is Pershing Square North.
- user_type (categorical): Classification of the bike user, distinguishing between Subscribers and Customers. Subscribers account for approximately 95% of the entries.
- bike_id (categorical): A unique code identifying the specific bike used for the trip.
- gender (categorical): The reported gender of the user. Male users represent about 74% of the recorded trips.
- age (numeric): The user's age at the time of the trip, ranging from a minimum of 16 to a maximum of 65.
- trip_duration (numeric): The elapsed time of the trip, measured in minutes.
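The column types listed above can be applied at load time. The following is a minimal sketch using pandas, not an official loader for this dataset. The in-memory sample rows, the `load_trips` helper name, and the timestamp format are all assumptions made for illustration; the real file may store `start_time` and `stop_time` differently, in which case the `parse_dates` argument would need adjusting.

```python
import io

import pandas as pd

# Hypothetical three-row sample that mimics the 11 documented columns.
# Values and the timestamp format are made up for illustration only.
SAMPLE_CSV = """\
start_time,stop_time,start_station_id,start_station_name,end_station_id,end_station_name,user_type,bike_id,gender,age,trip_duration
2018-05-01 08:00:00,2018-05-01 08:12:00,519,Pershing Square North,402,Broadway & E 22 St,Subscriber,30001,Male,34,12
2018-05-01 09:30:00,2018-05-01 09:50:00,402,Broadway & E 22 St,519,Pershing Square North,Subscriber,30002,Female,29,20
2018-05-02 07:45:00,2018-05-02 08:00:00,519,Pershing Square North,435,W 21 St & 6 Ave,Customer,30003,Male,23,15
"""

# Columns the listing describes as categorical.
CATEGORICAL_COLUMNS = [
    "start_station_id", "start_station_name",
    "end_station_id", "end_station_name",
    "user_type", "bike_id", "gender",
]


def load_trips(source):
    """Read trip records, storing identifier-like columns as categories."""
    return pd.read_csv(
        source,
        dtype={col: "category" for col in CATEGORICAL_COLUMNS},
        parse_dates=["start_time", "stop_time"],  # assumes parseable timestamps
    )


trips = load_trips(io.StringIO(SAMPLE_CSV))
print(trips.dtypes)
```

For the full file, a path to the downloaded CSV would be passed to `load_trips` in place of the in-memory sample. Storing station and bike identifiers as categories rather than integers avoids accidental arithmetic on them and reduces memory use on a table of this size.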
Distribution
The dataset covers bike trips for one calendar month and is typically provided in CSV format. It contains an estimated 1.6 million rows and 11 columns. The listing reports 100% validity across key fields, with no missing values for the primary attributes.
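The figures above (11 columns, no missing values, ages between 16 and 65) can be checked after download. The sketch below is one possible way to do so with pandas; the `summarize_quality` helper and the two sample rows are hypothetical and exist only to make the example runnable.

```python
import pandas as pd

# The 11 column names documented in this listing.
EXPECTED_COLUMNS = [
    "start_time", "stop_time", "start_station_id", "start_station_name",
    "end_station_id", "end_station_name", "user_type", "bike_id",
    "gender", "age", "trip_duration",
]


def summarize_quality(df):
    """Return basic checks against the figures stated in the listing."""
    null_counts = df.isna().sum()
    return {
        "n_rows": len(df),
        # Documented columns that are absent from the frame.
        "missing_columns": sorted(set(EXPECTED_COLUMNS) - set(df.columns)),
        # Only columns that actually contain missing values.
        "null_counts": {c: int(n) for c, n in null_counts.items() if n > 0},
        # Ages outside the documented 16-65 range (missing ages count too).
        "age_out_of_range": int((~df["age"].between(16, 65)).sum()),
    }


# Made-up rows for illustration; replace with the loaded dataset.
sample = pd.DataFrame({
    "start_time": ["2018-05-01 08:00:00", "2018-05-01 09:30:00"],
    "stop_time": ["2018-05-01 08:12:00", "2018-05-01 09:50:00"],
    "start_station_id": ["519", "402"],
    "start_station_name": ["Pershing Square North", "Broadway & E 22 St"],
    "end_station_id": ["402", "519"],
    "end_station_name": ["Broadway & E 22 St", "Pershing Square North"],
    "user_type": ["Subscriber", "Subscriber"],
    "bike_id": ["30001", "30002"],
    "gender": ["Male", "Female"],
    "age": [34, 29],
    "trip_duration": [12, 20],
})

report = summarize_quality(sample)
print(report)
```

On the real file, `n_rows` would be expected to be close to 1.6 million, and the other three entries would be empty or zero if the stated validity holds.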
Usage
This data is suitable for projects aimed at understanding daily trip trends and traffic flow in New York City. Typical questions it can answer include which groups of users are largest by age, gender, or user type, how the number of trips changes from day to day, and which stations users visit most. It also supports spatial analysis and demand forecasting.
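The questions above map onto a few short pandas aggregations. The following is an illustrative sketch rather than a prescribed method: the five sample trips are invented, and the age bin edges are an arbitrary choice within the documented 16 to 65 range.

```python
import pandas as pd

# Made-up trips using a subset of the documented columns.
trips = pd.DataFrame({
    "start_time": pd.to_datetime([
        "2018-05-01 08:00:00", "2018-05-01 09:30:00", "2018-05-01 18:10:00",
        "2018-05-02 07:45:00", "2018-05-02 12:00:00",
    ]),
    "start_station_name": [
        "Pershing Square North", "Pershing Square North", "W 21 St & 6 Ave",
        "Pershing Square North", "E 17 St & Broadway",
    ],
    "user_type": ["Subscriber", "Subscriber", "Subscriber", "Customer", "Subscriber"],
    "gender": ["Male", "Female", "Male", "Male", "Male"],
    "age": [34, 29, 41, 23, 52],
})

# Largest user groups by user type and gender.
group_sizes = (
    trips.groupby(["user_type", "gender"]).size().sort_values(ascending=False)
)

# Trips per age band; the bin edges are an assumption, not part of the dataset.
age_bands = pd.cut(trips["age"], bins=[15, 25, 35, 45, 55, 65])
age_group_sizes = age_bands.value_counts().sort_index()

# Daily trip volume, keyed by the calendar date on which each trip started.
daily_trips = trips.groupby(trips["start_time"].dt.date).size()

# Most frequently used starting stations.
top_stations = trips["start_station_name"].value_counts().head(3)

print(group_sizes, age_group_sizes, daily_trips, top_stations, sep="\n\n")
```

Counting by `end_station_name` in the same way gives the most common destinations, and grouping on `trips["start_time"].dt.hour` instead of the date would show the pattern within a day.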
Coverage
The scope of the data covers the New York City region serviced by Citibike. The temporal range is strictly limited to May 2018. Demographic coverage includes age data, with users spanning 16 to 65 years old, and recorded gender information.
License
CC0: Public Domain
Who Can Use It
- Analysts and Data Scientists: For predictive modelling and calculating key performance indicators (KPIs) related to ridership.
- Urban Researchers: To study population mobility and the efficacy of shared transport services.
- Visualisation Specialists: To map trip heatmaps, station usage fluctuations, and daily riding patterns.
- Beginners: The dataset is tagged as suitable for beginner data analytics projects.
Dataset Name Suggestions
- NYC Citibike Trip Data (May 2018)
- New York Bike-Sharing Records: May 2018
- Citibike User Demographics and Trip Metrics
Attributes
Original Data Source: Citibike User Demographics and Trip Metrics
