Ford GoBike February 2019 Trip Data
Data Science and Analytics
Free
About
This dataset provides detailed information on individual journeys made within a bike-sharing system operating across the greater San Francisco Bay area. The data focuses specifically on February 2019, offering a valuable resource for data exploration and analytical practice. It may require some data wrangling to prepare it for detailed analysis.
Columns
- duration_sec: The length of the trip in seconds. Ranges from 61 to 85,400 seconds (just under 24 hours), with a mean of 726 seconds (about 12 minutes).
- start_time: The precise start time of each bike trip. Features over 35,000 unique start times.
- end_time: The precise end time of each bike trip. Features over 35,000 unique end times.
- start_station_id: The unique identification code for the starting bike station. There are 330 unique start stations.
- start_station_name: The name of the starting bike station, such as 'Market St at 10th St' or 'San Francisco Caltrain Station 2 (Townsend St at 4th St)'.
- start_station_latitude: The latitude coordinate of the starting station, primarily within the 37.77 to 37.79 range.
- start_station_longitude: The longitude coordinate of the starting station, primarily within the -122.41 to -122.38 range.
- end_station_id: The unique identification code for the ending bike station. There are 330 unique end stations.
- end_station_name: The name of the ending bike station.
- end_station_latitude: The latitude coordinate of the ending station.
- end_station_longitude: The longitude coordinate of the ending station.
- bike_id: The unique identification code for the bicycle used in the trip. There are over 6,500 unique bike codes.
- user_type: Indicates the type of user, categorised as 'Subscriber' (approximately 89%) or 'Customer' (approximately 11%).
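- member_birth_year: The birth year of the member, ranging from 1878 to 2001. The earliest years are not plausible for active riders and are best treated as likely entry errors. Approximately 5% of entries are missing.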
- member_gender: The gender of the member, with 'Male' accounting for 71%, 'Female' 22%, and 'Other' 6%. Approximately 5% of entries are missing.
- bike_share_for_all_trip: A boolean value indicating whether the trip was part of a bike-share-for-all programme. About 9% of trips are 'true' and 91% are 'false'.
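The column descriptions above imply a few cleaning steps: fields arrive as text and need type conversion, some birth year and gender entries are missing, and very early birth years are suspect. The following is a minimal sketch of that wrangling using only the Python standard library. The sample rows are synthetic, and the timestamp format, the birth-year cutoff, and the helper names `clean_row` and `load_trips` are assumptions for illustration, not properties of the real file.

```python
import csv
import io
from datetime import datetime

# Synthetic rows for illustration only; they are NOT taken from the real file.
SAMPLE_CSV = """duration_sec,start_time,end_time,user_type,member_birth_year,member_gender,bike_share_for_all_trip
600,2019-02-01 08:00:00,2019-02-01 08:10:00,Subscriber,1985,Male,false
1200,2019-02-02 17:30:00,2019-02-02 17:50:00,Customer,,,false
300,2019-02-03 09:00:00,2019-02-03 09:05:00,Subscriber,1878,Other,true
"""

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed format; adjust to match the real file
MIN_PLAUSIBLE_BIRTH_YEAR = 1919    # assumed cutoff (a rider aged 100 in 2019)


def clean_row(row):
    """Convert one raw CSV row (a dict of strings) into typed values."""
    birth_raw = row["member_birth_year"].strip()
    # float() also tolerates values written like "1985.0".
    birth_year = int(float(birth_raw)) if birth_raw else None
    if birth_year is not None and birth_year < MIN_PLAUSIBLE_BIRTH_YEAR:
        birth_year = None  # treat implausible years as missing
    flag = row["bike_share_for_all_trip"].strip().lower()
    return {
        "duration_sec": int(row["duration_sec"]),
        "start_time": datetime.strptime(row["start_time"], TIME_FORMAT),
        "end_time": datetime.strptime(row["end_time"], TIME_FORMAT),
        "user_type": row["user_type"],
        "member_birth_year": birth_year,
        "member_gender": row["member_gender"] or None,
        "bike_share_for_all_trip": flag in {"true", "yes"},
    }


def load_trips(handle):
    """Read every row from an open CSV handle and clean it."""
    return [clean_row(row) for row in csv.DictReader(handle)]


trips = load_trips(io.StringIO(SAMPLE_CSV))
```

To run this against the downloaded file, pass an open file handle to `load_trips` instead of the in-memory sample and adjust `TIME_FORMAT` if the timestamps carry fractional seconds.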
Distribution
The dataset is provided in CSV format and includes 16 columns of information. It contains approximately 183,000 individual ride records and has a file size of 29.2 MB.
Usage
This dataset is ideal for exploring and practising data analysis techniques. Students and educators can use it to practise concepts on a substantial dataset. Potential applications include analysing ride patterns, station popularity, user demographics, and trip durations.
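As a rough illustration of the applications listed above, the sketch below computes station popularity, mean trip duration per user type, and trips per start hour. The trips are synthetic (only the station names come from the column descriptions), and the function names are hypothetical helpers chosen for this example.

```python
from collections import Counter, defaultdict
from datetime import datetime
from statistics import mean

# Synthetic trips for illustration only; values are not from the real data.
trips = [
    {"start_station_name": "Market St at 10th St", "user_type": "Subscriber",
     "duration_sec": 600, "start_time": datetime(2019, 2, 1, 8, 0)},
    {"start_station_name": "Market St at 10th St", "user_type": "Subscriber",
     "duration_sec": 300, "start_time": datetime(2019, 2, 1, 8, 30)},
    {"start_station_name": "San Francisco Caltrain Station 2 (Townsend St at 4th St)",
     "user_type": "Customer", "duration_sec": 1800,
     "start_time": datetime(2019, 2, 2, 17, 15)},
]


def station_popularity(trips):
    """Return (station name, trip count) pairs, most used first."""
    return Counter(t["start_station_name"] for t in trips).most_common()


def mean_duration_by_user_type(trips):
    """Return the mean trip duration in seconds for each user type."""
    groups = defaultdict(list)
    for t in trips:
        groups[t["user_type"]].append(t["duration_sec"])
    return {user_type: mean(durations) for user_type, durations in groups.items()}


def trips_per_start_hour(trips):
    """Count trips by the hour of day (0-23) in which they started."""
    return dict(Counter(t["start_time"].hour for t in trips))


print(station_popularity(trips))
print(mean_duration_by_user_type(trips))
print(trips_per_start_hour(trips))
```

The same functions work unchanged on a full list of cleaned trip records, as long as each record exposes the four keys used here.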
Coverage
The data covers individual rides within the greater San Francisco Bay area. The time range for this dataset is specifically the month of February 2019. Demographic information includes user type (subscriber/customer), birth year (spanning 1878-2001), and gender (Male, Female, Other), though some birth year and gender data may be unavailable.
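The coverage claims above can be checked directly once the file is loaded. The sketch below shows one way to test that a timestamp falls within February 2019 and to measure the share of missing values in a column. The helper names `in_february_2019` and `missing_share` are hypothetical, and treating empty strings as missing is an assumption about how gaps are encoded.

```python
from datetime import datetime

FEB_START = datetime(2019, 2, 1)
MAR_START = datetime(2019, 3, 1)


def in_february_2019(timestamp):
    """True if the timestamp lies in the half-open range [1 Feb 2019, 1 Mar 2019)."""
    return FEB_START <= timestamp < MAR_START


def missing_share(values):
    """Fraction of entries that are None or empty strings (assumed encoding)."""
    values = list(values)
    if not values:
        return 0.0
    missing = sum(1 for v in values if v is None or v == "")
    return missing / len(values)
```

For example, applying `missing_share` to the `member_birth_year` and `member_gender` columns should give values close to the roughly 5% quoted in the column descriptions, if empty cells are how the file marks unavailable data.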
License
CC BY 4.0 License and the CC BY-NC-SA 4.0 License
Who Can Use It
This dataset is particularly suitable for students learning data analysis, educators seeking a practical resource for their courses, and researchers interested in urban mobility patterns or bike-sharing systems. It can also be used by data scientists and analysts looking to practise data wrangling, visualisation, and statistical analysis.
Dataset Name Suggestions
- Ford GoBike February 2019 Trip Data
- San Francisco Bay Area Bike Share Trips (Feb 2019)
- GoBike Ride Activity Data (February 2019)
- SF Bike Share Monthly Trip Log 2019
Attributes
Original Data Source: Ford GoBike February 2019 Trip Data