Predicting Hotel Booking Cancellations for INN Hotels

Using tree-based classifiers to flag at-risk bookings before they cancel

Overview

A large share of hotel bookings are cancelled, so I built a classification model to predict which bookings are at risk of cancellation.

Methodology

flowchart LR
  A[Raw Data] --> B[Clean & Encode]
  B --> C[EDA]
  C --> D[Train/Test Split]
  D --> E["Random Forest / Decision Tree"]
  E --> F["Tune (Cross-Validation)"]
  F --> G["Evaluate: Recall / F1 / ROC"]
  G --> H[Interpret]
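
The split, fit, and evaluate steps in the flow above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes scikit-learn is installed and uses synthetic data from `make_classification` in place of the real bookings. The sample size, class balance, seed, and hyperparameters are all arbitrary placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, recall_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the cleaned, encoded booking table (assumption).
# Label 1 plays the role of "cancelled"; the class balance is arbitrary.
X, y = make_classification(
    n_samples=2000, n_features=10, n_informative=5,
    weights=[0.67, 0.33], random_state=42,
)

# Stratify so the train and test sets keep the same share of positives.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)               # hard 0/1 labels
y_prob = model.predict_proba(X_test)[:, 1]   # probability of class 1

metrics = {
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
    "roc_auc": roc_auc_score(y_test, y_prob),
}
print(metrics)
```

Recall and F1 are computed from the hard labels, while ROC AUC uses the predicted probabilities, which is why both `predict` and `predict_proba` appear.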

The Data

I worked with about 36,000 past bookings, each described by 19 attributes such as lead time and price.

Exploratory Analysis

I explored how booking details relate to each other and to whether a booking was cancelled.

Key Drivers of Cancellation

A handful of factors did most of the work in separating cancelled bookings from kept ones.
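
One common way to rank drivers with a tree ensemble is its impurity-based feature importances. The sketch below assumes scikit-learn and a fitted `RandomForestClassifier`; the data is synthetic and the `feature_i` names are hypothetical placeholders, not the real booking columns.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Hypothetical feature names; the real columns are not reproduced here.
feature_names = [f"feature_{i}" for i in range(8)]

# Synthetic data in which only a few features carry signal (assumption).
X, y = make_classification(
    n_samples=1000, n_features=8, n_informative=3,
    n_redundant=0, random_state=0,
)

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Pair each name with its importance and sort from most to least important.
ranked = sorted(
    zip(feature_names, forest.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Impurity-based importances sum to 1 and can favor features with many distinct values, so they are best read as a rough ranking rather than a precise measure of effect.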

Modeling & Results

I trained and tuned decision tree and random forest models, with the random forest performing best.
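
Tuning both model families with cross-validation might look something like the sketch below. It assumes scikit-learn's `GridSearchCV`; the parameter grids, the choice of recall as the scoring metric, and the synthetic data are illustrative assumptions rather than the settings used in the project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Small synthetic dataset so the search runs quickly (assumption).
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=4, random_state=1
)

# Each candidate is an (estimator, parameter grid) pair; grids are placeholders.
candidates = {
    "decision_tree": (
        DecisionTreeClassifier(random_state=1),
        {"max_depth": [3, 5, None], "min_samples_leaf": [1, 10]},
    ),
    "random_forest": (
        RandomForestClassifier(n_estimators=50, random_state=1),
        {"max_depth": [5, None], "min_samples_leaf": [1, 10]},
    ),
}

results = {}
for name, (estimator, grid) in candidates.items():
    # 5-fold cross-validation, keeping the setting with the best mean recall.
    search = GridSearchCV(estimator, grid, scoring="recall", cv=5)
    search.fit(X, y)
    results[name] = (search.best_params_, search.best_score_)
    print(name, search.best_params_, round(search.best_score_, 3))
```

Scoring on recall reflects the idea that a missed cancellation is the costlier error; a different business trade-off would call for a different `scoring` value such as `"f1"` or `"roc_auc"`.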

Key Takeaways

The hotel can now flag risky bookings early and act to reduce cancellations.
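
Flagging in practice usually means applying a threshold to the model's predicted cancellation probability. The plain-Python sketch below shows the idea; the function names, probabilities, and outcomes are all made up for illustration and do not come from the project.

```python
def flag_at_risk(probabilities, threshold=0.5):
    """Return indices of bookings whose predicted probability meets the threshold."""
    return [i for i, p in enumerate(probabilities) if p >= threshold]


def recall_precision(flagged, actual_cancelled):
    """Compute (recall, precision) for a set of flagged booking indices."""
    flagged, actual = set(flagged), set(actual_cancelled)
    hits = len(flagged & actual)
    recall = hits / len(actual) if actual else 0.0
    precision = hits / len(flagged) if flagged else 0.0
    return recall, precision


# Made-up predicted probabilities and the indices that actually cancelled.
probs = [0.9, 0.2, 0.6, 0.4, 0.8, 0.35]
cancelled = [0, 2, 3]

for threshold in (0.5, 0.3):
    flagged = flag_at_risk(probs, threshold)
    print(threshold, flagged, recall_precision(flagged, cancelled))
```

With these toy numbers, lowering the threshold from 0.5 to 0.3 catches every cancellation but also flags more bookings that would have been kept, which is the trade-off the hotel would need to weigh when choosing a cutoff.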

Attribution

This project was completed as part of the MIT Applied Data Science Program (MIT IDSS / Great Learning). The program provided the case-study scaffolding; the analysis, code, and results are my own. Published with permission, for portfolio use only.