This repository contains an exploration of hotel reservation data, including Exploratory Data Analysis (EDA), clustering using KMeans, and predictive analysis utilizing various machine learning models.
This project explores hotel reservation data to extract insights and predict booking cancellations. The analysis covers data preprocessing, visualization, clustering, and predictive modeling to support decision-making in the hospitality industry.
- EDA: Exploratory Data Analysis offers insights into the dataset, identifying trends, patterns, and relationships.
- Clustering Analysis: Implementation of KMeans clustering for segmentation and customer profiling.
- Predictive Analysis: Utilization of machine learning models (Logistic Regression, Naive Bayes, SVM, KNN, Decision Tree, Random Forest, XGBoost) to predict booking cancellations.
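
As a rough illustration of the clustering step, here is a minimal KMeans segmentation sketch using scikit-learn. It is not the code from the notebooks: the data is synthetic, the two columns (lead time and average room price) are hypothetical stand-ins for whatever features the dataset provides, and the choice of three clusters and the `segment_bookings` helper are assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler


def segment_bookings(features, n_clusters=3, seed=0):
    """Standardize the features, then assign each row to a KMeans cluster."""
    # KMeans is distance-based, so features are scaled to comparable ranges first.
    scaled = StandardScaler().fit_transform(features)
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = model.fit_predict(scaled)
    return labels, model


# Synthetic stand-in data (assumption): three loose groups of bookings,
# columns are hypothetical "lead time in days" and "average room price".
rng = np.random.default_rng(0)
features = np.vstack([
    rng.normal([10, 80], [3, 5], size=(50, 2)),
    rng.normal([100, 120], [10, 8], size=(50, 2)),
    rng.normal([200, 60], [15, 5], size=(50, 2)),
])

labels, model = segment_bookings(features)
print(np.bincount(labels))  # number of bookings assigned to each cluster
```

With real data, the number of clusters would normally be chosen by inspecting something like an elbow plot or silhouette scores rather than fixed in advance.
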
Prediction Model | Accuracy Score |
---|---|
Logistic Regression | 80.74% |
Naive Bayes | 44.27% |
SVM Linear Kernel | 80.29% |
SVM Polynomial Kernel | 82.66% |
SVM RBF Kernel | 83.85% |
KNN | 83.61% |
Decision Tree | 85.11% |
Random Forest | 90.07% |
XGBoost | 88.83% |
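
The README does not state how these scores were measured. The sketch below shows one common way to produce a comparison of this kind, assuming accuracy on a held-out test split. It uses synthetic data from `make_classification` and only three of the listed models. The 80/20 split, the hyperparameters, and the `compare_models` helper are assumptions for illustration, so the numbers it prints will not match the table.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def compare_models(X, y, seed=0):
    """Fit several classifiers and return their accuracy on a held-out split."""
    # Assumption: an 80/20 stratified split; the original split is not documented.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed, stratify=y
    )
    models = {
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


# Synthetic binary labels standing in for "cancelled" vs. "not cancelled".
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
scores = compare_models(X, y)
for name, score in scores.items():
    print(f"{name}: {score:.2%}")
```

The other models in the table could be added to the `models` dictionary in the same way, provided each exposes the usual `fit` and `predict` methods.
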
- `/data`: Contains the dataset used for analysis.
- `/notebooks`: Jupyter notebooks for EDA, clustering, and predictive modeling.
- `/visualizations`: Visual outputs generated from analysis.
- `/scripts`: Useful scripts for data preprocessing and model training.