Hotel Booking Cancellation Analysis
Data Science and Analytics
About
This dataset provides hotel reservations for both a city hotel and a resort hotel, with a primary focus on analysing booking cancellations. The analysis is intended to support a hotel chain in renegotiating its marketing agency contract, which is currently based on total bookings rather than actual stays. The data can be used to investigate hypotheses about why customers cancel, for example whether cancellations are related to the lead time before arrival, the number of booking changes, or the number of special requests. The ultimate goal is to understand the drivers of cancellations, reduce the cancellation rate, and mitigate the financial impact of last-minute cancellations.
Columns
- hotel: Type of hotel, either 'Resort Hotel' or 'City Hotel'.
- is_canceled: A binary indicator where '1' means the booking was cancelled and '0' means it was not.
- lead_time: The number of days between the booking date and the arrival date.
- arrival_date_year: The year of the scheduled arrival.
- arrival_date_month: The month of the scheduled arrival.
- arrival_date_week_number: The week number of the year for the scheduled arrival.
- arrival_date_day_of_month: The day of the month for the scheduled arrival.
- stays_in_weekend_nights: Number of weekend nights (Saturday or Sunday) booked.
- stays_in_week_nights: Number of weekday nights (Monday to Friday) booked.
- adults: The number of adults included in the booking.
- children: The number of children included in the booking.
- babies: The number of babies included in the booking.
- meal: Type of meal plan booked.
- country: The country of origin of the guest, represented by three-letter ISO 3166-1 alpha-3 codes (e.g., 'PRT' for Portugal).
- market_segment: The market segment designation (e.g., 'Online TA' for online travel agents).
- distribution_channel: The booking distribution channel (e.g., 'TA/TO' for Travel Agents/Tour Operators).
- is_repeated_guest: A binary indicator where '1' signifies a repeat guest and '0' signifies a new guest.
- previous_cancellations: Number of previous bookings cancelled by the customer.
- previous_bookings_not_canceled: Number of previous bookings not cancelled by the customer.
- reserved_room_type: The code for the type of room originally reserved.
- assigned_room_type: The code for the type of room assigned at check-in.
- booking_changes: The number of changes made to the booking.
- agent: ID of the travel agency that made the booking.
- company: ID of the company that made or paid for the booking.
- days_in_waiting_list: Number of days the booking was on a waiting list before being confirmed.
- customer_type: The type of booking (e.g., 'Transient', 'Group').
- adr: Average Daily Rate, calculated from the total accommodation cost divided by the number of nights.
- required_car_parking_spaces: Number of car parking spaces requested by the customer.
- total_of_special_requests: The total number of special requests made by the customer (e.g., high floor).
- reservation_status: The final status of the reservation ('Canceled', 'Check-Out', or 'No-Show').
- reservation_status_date: The date when the reservation status was last updated.
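As an illustration of how these columns combine, the sketch below reads rows in the shape that `csv.DictReader` yields and estimates the value of each booking as `adr` multiplied by the total number of nights, then sums that value over cancelled bookings. The function names `booking_value` and `cancelled_value` are hypothetical, the sample rows are made up, and treating `adr` times nights as the revenue at risk is an assumption rather than a definition supplied with the dataset.

```python
import csv
import io


def booking_value(row):
    """Estimate a booking's value as adr * (weekend nights + week nights)."""
    nights = int(row["stays_in_weekend_nights"]) + int(row["stays_in_week_nights"])
    return float(row["adr"]) * nights


def cancelled_value(rows):
    """Sum the estimated value of all bookings flagged is_canceled == '1'."""
    return sum(booking_value(r) for r in rows if r["is_canceled"] == "1")


# Tiny made-up sample using a subset of the dataset's columns (not real data).
SAMPLE = """hotel,is_canceled,stays_in_weekend_nights,stays_in_week_nights,adr
City Hotel,1,1,2,100.0
Resort Hotel,0,2,5,80.0
City Hotel,1,0,1,50.5
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
print(cancelled_value(rows))
```

To run the same functions on the real file, the in-memory sample could be replaced with `open("hotel_bookings.csv", newline="")`, assuming the file's header row matches the column names listed above.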
Distribution
The dataset is available as a single CSV file named hotel_bookings.csv with a size of 15.39 MB. It contains 119,000 rows and 31 columns.
Usage
This dataset is suitable for analysing hotel booking cancellations to understand their drivers and financial impact. It can be used to test hypotheses regarding customer behaviour, such as the relationship between lead time and cancellation probability. The data is designed to be analysed using tools like SQL (specifically BigQuery) for data manipulation and Power BI for creating interactive dashboards and visualisations. The findings can inform business decisions, such as renegotiating marketing contracts and developing strategies to reduce cancellations.
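The lead-time hypothesis mentioned above can be explored with a simple grouped cancellation rate. The following is a minimal sketch in Python rather than BigQuery SQL: it groups bookings into lead-time buckets and reports the share of cancelled bookings in each. The bucket edges in `BUCKETS`, the helper names, and the sample rows are all illustrative assumptions, not part of the dataset.

```python
from collections import defaultdict

# Hypothetical bucket edges in days; None marks an open-ended upper bound.
BUCKETS = [(0, 7), (8, 30), (31, 90), (91, None)]


def bucket_label(lead_time):
    """Return the label of the bucket that contains lead_time (in days)."""
    for low, high in BUCKETS:
        if high is None:
            if lead_time >= low:
                return f"{low}+"
        elif low <= lead_time <= high:
            return f"{low}-{high}"
    raise ValueError(f"lead_time out of range: {lead_time}")


def cancellation_rate_by_lead_time(rows):
    """Map each non-empty bucket label to cancelled bookings / all bookings."""
    totals = defaultdict(int)
    cancelled = defaultdict(int)
    for row in rows:
        label = bucket_label(int(row["lead_time"]))
        totals[label] += 1
        cancelled[label] += int(row["is_canceled"])
    return {label: cancelled[label] / totals[label] for label in totals}


# Made-up rows in the shape csv.DictReader would yield (not real data).
sample_rows = [
    {"lead_time": "3", "is_canceled": "0"},
    {"lead_time": "5", "is_canceled": "1"},
    {"lead_time": "45", "is_canceled": "1"},
    {"lead_time": "120", "is_canceled": "1"},
    {"lead_time": "200", "is_canceled": "0"},
]
rates = cancellation_rate_by_lead_time(sample_rows)
print(rates)
```

A rate that rises across buckets would be consistent with the hypothesis, though a grouped rate alone does not control for other factors such as hotel type or market segment.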
Coverage
The data covers hotel reservations for the years 2015, 2016, and 2017. It includes bookings for two types of hotels: a city hotel and a resort hotel. The geographic coverage is global, with guest countries of origin from 178 different nations, though a significant portion of guests are from Portugal (PRT).
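The geographic coverage described above can be checked by counting bookings per `country` code. The sketch below is one way to do that, assuming rows in the shape `csv.DictReader` yields; the function name `country_shares` and the sample rows are hypothetical. Rows with an empty `country` value are skipped, and how missing values are actually encoded in the file is not stated here, so that filter may need adjusting.

```python
from collections import Counter


def country_shares(rows):
    """Return {country code: share of bookings}, most frequent first."""
    counts = Counter(r["country"] for r in rows if r["country"])
    total = sum(counts.values())
    return {code: n / total for code, n in counts.most_common()}


# Made-up rows for illustration (not real data).
sample_rows = [
    {"country": "PRT"},
    {"country": "PRT"},
    {"country": "GBR"},
    {"country": "FRA"},
    {"country": ""},  # skipped: no country recorded
]
shares = country_shares(sample_rows)
print(len(shares), shares)
```

The number of keys in the result gives the count of distinct countries of origin, and the first entry shows the dominant one.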
License
CC0: Public Domain
Who Can Use It
- Data Analysts: Can perform exploratory data analysis to uncover patterns in cancellations and create reports for business stakeholders.
- Business Intelligence Developers: Can build dashboards in tools like Power BI to track key metrics such as cancellation rates and their financial costs.
- Hotel Managers: Can use the insights to understand customer segments and refine booking policies to minimise revenue loss from cancellations.
- Marketing Teams: Can analyse booking channels and market segments to optimise campaigns and potentially adjust contracts with agencies.
Dataset Name Suggestions
- Hotel Booking Cancellation Analysis
- City and Resort Hotel Reservations 2015-2017
- Hospitality Cancellation Drivers Dataset
- Hotel Reservation and Cancellation Data
- Customer Booking Behaviour in Hotels
Attributes
Original Data Source: Hotel Booking Cancellation Analysis