Bike Share Demand Prediction Data
Data Science and Analytics
Free
About
This dataset supports predicting the number of bike share users and is framed as a regression problem. It provides the information needed to forecast bike share demand, which is crucial for operational planning and resource allocation. The dataset was originally sourced and then prepared into a competition-ready format, with y_test for the test data provided as a function.
Columns
- ID: A unique identifier for each entry.
- Count: The total number of bike rentals, representing the target variable for prediction. It ranges from 0.00 to 17377.00, with an average of approximately 8.68k.
- dteday: The date of the record, spanning from 1st January 2011 to 31st December 2012.
- hr: The hour of the day, from 0 to 23. The mean hour is 11.5.
- weathersit: Categorical data indicating the weather situation, coded from 1 to 4. The most frequent values fall in the 1.00 to 1.30 range, which for a whole-number code corresponds to category 1.
- temp: Normalised temperature in Celsius, with values between 0.02 and 0.96, and a mean of 0.49.
- atemp: Normalised feeling temperature in Celsius, ranging from 0.00 to 1.00, with an average of 0.47.
- hum: Normalised humidity, a percentage rescaled to the range 0.00 to 1.00. The mean humidity is 0.63.
- windspeed: Normalised wind speed, varying between 0.00 and 0.85, with an average of 0.19.
- casual: The number of casual bike users, with values ranging from 0 to 357. The mean is 35.2.
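A minimal loading sketch for the columns above, using only the Python standard library. It assumes the CSV header uses exactly the listed column names, that dteday is stored as YYYY-MM-DD, and that hr, weathersit and casual are whole numbers; none of these are confirmed by the listing. The sample rows are invented for illustration, and load_rows and check_ranges are hypothetical helper names.

```python
import csv
import io
from datetime import datetime, timedelta

# Invented sample rows; only the column names come from the listing.
SAMPLE_CSV = """ID,Count,dteday,hr,weathersit,temp,atemp,hum,windspeed,casual
1,16,2011-01-01,0,1,0.24,0.2879,0.81,0.0,3
2,40,2011-01-01,1,1,0.22,0.2727,0.80,0.0,8
3,110,2012-12-31,23,2,0.26,0.2576,0.65,0.1343,12
"""

# Columns described as normalised; checked here against the 0 to 1 interval.
UNIT_COLUMNS = ("temp", "atemp", "hum", "windspeed")
FIRST_HOUR = datetime(2011, 1, 1, 0)
LAST_HOUR = datetime(2012, 12, 31, 23)


def load_rows(handle):
    """Parse CSV rows into typed dicts and add a combined date+hour timestamp."""
    rows = []
    for raw in csv.DictReader(handle):
        day = datetime.strptime(raw["dteday"], "%Y-%m-%d")  # assumed date format
        hr = int(raw["hr"])
        row = {
            "ID": int(raw["ID"]),
            "Count": float(raw["Count"]),
            "hr": hr,
            "weathersit": int(raw["weathersit"]),
            "casual": int(raw["casual"]),
            "timestamp": day + timedelta(hours=hr),
        }
        for col in UNIT_COLUMNS:
            row[col] = float(raw[col])
        rows.append(row)
    return rows


def check_ranges(rows):
    """Return (ID, column) pairs whose values fall outside the documented ranges."""
    problems = []
    for row in rows:
        if not 0 <= row["hr"] <= 23:
            problems.append((row["ID"], "hr"))
        if not 1 <= row["weathersit"] <= 4:
            problems.append((row["ID"], "weathersit"))
        for col in UNIT_COLUMNS:
            if not 0.0 <= row[col] <= 1.0:
                problems.append((row["ID"], col))
        if not FIRST_HOUR <= row["timestamp"] <= LAST_HOUR:
            problems.append((row["ID"], "dteday"))
    return problems


rows = load_rows(io.StringIO(SAMPLE_CSV))
problems = check_ranges(rows)
```

To run the same checks on the real file, pass an open file handle to load_rows in place of the in-memory sample.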
Distribution
The dataset is structured as a regression problem data set, typically provided in a CSV format. The test.csv file has a size of 312.46 kB. For all observed columns (ID, Count, dteday, hr, weathersit, temp, atemp, hum, windspeed, casual), there are 6431 valid entries. Crucially, there are no mismatched or missing values across these columns. The dataset is anticipated to be updated annually.
Usage
This dataset is ideally suited for:
- Bike share demand forecasting, helping operators predict future rental volumes.
- Developing regression models to understand factors influencing bike usage.
- Performing time series analysis to identify patterns and trends in hourly and daily bike rentals.
- Business intelligence for bike share companies to optimise fleet management and distribution.
- Health-related studies exploring urban mobility and active transportation.
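As a starting point for the forecasting and regression uses listed above, the sketch below builds a simple hour-of-day baseline: it predicts Count as the mean Count observed for the same hr value and scores the predictions with root mean squared error. This is an illustrative baseline, not a method prescribed by the dataset. The (hr, Count) pairs are invented, and fit_hourly_means, predict and rmse are hypothetical helper names.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean


def fit_hourly_means(pairs):
    """Map each hour to the mean Count seen among (hr, Count) training pairs."""
    by_hour = defaultdict(list)
    for hr, count in pairs:
        by_hour[hr].append(count)
    return {hr: mean(values) for hr, values in by_hour.items()}


def predict(hourly_means, hr, fallback):
    """Predict the mean for this hour, or the fallback for an unseen hour."""
    return hourly_means.get(hr, fallback)


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    return sqrt(mean((a - p) ** 2 for a, p in zip(actual, predicted)))


# Invented (hr, Count) pairs standing in for a train/validation split.
train = [(8, 200.0), (8, 240.0), (17, 400.0), (3, 10.0)]
held_out = [(8, 230.0), (17, 380.0), (5, 20.0)]

means = fit_hourly_means(train)
overall = mean(count for _, count in train)  # fallback for hours not in train
preds = [predict(means, hr, overall) for hr, _ in held_out]
score = rmse([count for _, count in held_out], preds)
```

A model that uses weathersit, temp, hum and the other columns is only worth its added complexity if it beats this baseline on held-out data.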
Coverage
The dataset covers a time range from 1st January 2011 to 31st December 2012. While it focuses on bike users and bike share operations, specific geographic details (e.g., city or region) are not provided in the source material. Demographic information beyond "casual" users is also not detailed.
License
CC0: Public Domain
Who Can Use It
- Data scientists and machine learning engineers for developing and testing predictive models.
- Urban planners and transportation analysts to inform policy decisions and infrastructure development related to active transport.
- Bike share companies for operational decision-making, such as predicting peak hours or seasonal demand.
- Researchers studying environmental factors' impact on urban mobility.
Dataset Name Suggestions
- Bike Share Demand Prediction Data
- Hourly Bike Rental Forecasting
- Bike User Count Dataset
- Urban Bike Share Metrics
- Time Series Bike Demand
Attributes
Original Data Source: Bike Share Demand Prediction Data