The final two charts in section 3.5 focus on cancellation behavior. Chart 7 compares the proportion of canceled bookings between Resort Hotel and City Hotel using a stacked proportional bar chart. Chart 8 examines whether guests who cancel tend to have booked further in advance, using a boxplot of lead time grouped by cancellation status. Together these charts identify which hotel type faces greater cancellation risk and whether early booking is a signal of eventual cancellation.
Lead time in the hotel industry is the number of days between the date a reservation is made and the scheduled arrival date. A high lead time means a guest booked far in advance; a low lead time means the booking was made close to arrival.
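As a small illustration of the lead time definition, the value can be computed as the difference between two dates. This is a minimal base R sketch with made-up dates; the names booking_date and arrival_date are hypothetical and do not come from the dataset, where lead_time is already available as a column.

```r
# Hypothetical dates, chosen only to illustrate the definition.
booking_date <- as.Date("2026-01-10")  # day the reservation was made
arrival_date <- as.Date("2026-03-01")  # scheduled arrival date

# Lead time = number of days between booking and arrival.
# Subtracting two Date objects yields a difference measured in days.
lead_time <- as.numeric(arrival_date - booking_date)
lead_time
```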
Chart 7 — Proporción de Cancelaciones por Hotel
Chart 7 uses position = "fill" inside geom_bar(), which rescales all bars to a common height of 1 (100%). Each bar is split by is_canceled (0 = kept, 1 = canceled), so the chart shows the proportion of bookings in each cancellation state rather than raw counts. This makes it straightforward to compare cancellation rates between Resort Hotel and City Hotel even if the two hotel types have very different total booking volumes.
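The script listing for this chart is not reproduced here. As a minimal sketch of the technique described above, the code below builds a proportional stacked bar chart on a toy data frame. The column name hotel and all data values are assumptions made for illustration; only is_canceled, geom_bar(), and position = "fill" come from the text.

```r
library(ggplot2)

# Toy stand-in for the bookings data; `hotel` is an assumed column name
# and the values are invented.
df <- data.frame(
  hotel = c(rep("City Hotel", 6), rep("Resort Hotel", 4)),
  is_canceled = c(1, 1, 1, 0, 0, 0, 1, 0, 0, 0)
)
df$is_canceled <- as.factor(df$is_canceled)

# position = "fill" rescales every bar to height 1, so each colored
# segment shows a proportion of bookings rather than a raw count.
chart7 <- ggplot(df, aes(x = hotel, fill = is_canceled)) +
  geom_bar(position = "fill") +
  labs(x = "Hotel", y = "Proportion of bookings")

print(chart7)

# The same row-wise proportions as a table, for checking against the chart.
rates <- prop.table(table(df$hotel, df$is_canceled), margin = 1)
rates
```

Because the bars are normalized, the six City Hotel rows and the four Resort Hotel rows in this toy data produce bars of equal height, and only the split inside each bar differs.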
The is_canceled column was converted to a factor earlier in the script (df$is_canceled <- as.factor(df$is_canceled)). This ensures ggplot2 maps it to a discrete fill palette with a legend showing 0 and 1, rather than treating it as a continuous numeric variable.
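To see what the conversion changes, the following base R sketch compares a toy vector before and after as.factor(). The values are made up and stand in for the real column.

```r
# Toy values standing in for the raw is_canceled column
# (0 = kept, 1 = canceled).
is_canceled_raw <- c(0, 1, 0, 0, 1)

# Before conversion the column is numeric, which ggplot2 would treat
# as a continuous variable.
is.numeric(is_canceled_raw)

is_canceled <- as.factor(is_canceled_raw)

# After conversion it is a factor with two levels, which become the
# two discrete legend entries.
is.factor(is_canceled)
levels(is_canceled)
```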
Chart 8 — Antelación de Reserva vs Cancelación
Chart 8 maps is_canceled to the x-axis and lead_time to the y-axis, with fill = is_canceled coloring each box. A boxplot per cancellation group shows the median lead time, interquartile range, and outliers, making it easy to see whether canceled bookings were made substantially earlier than kept ones.
The original script, upc-grupo5-tb1.R, contains a typo in the geom_boxplots() call: that function does not exist in ggplot2. The correct function name is geom_boxplot() (no trailing s).
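The script listing itself is not reproduced here. The sketch below shows what a call with the correct function name could look like, using the mappings described above (is_canceled on x, lead_time on y, fill = is_canceled). The data frame is a toy stand-in with invented values, and the axis labels are assumptions.

```r
library(ggplot2)

# Toy stand-in for the bookings data; values are invented for illustration.
df <- data.frame(
  is_canceled = c(0, 0, 0, 0, 1, 1, 1, 1),
  lead_time   = c(2, 10, 25, 40, 30, 90, 150, 210)
)
df$is_canceled <- as.factor(df$is_canceled)

# geom_boxplot(), not geom_boxplots(): draws one box per cancellation group.
chart8 <- ggplot(df, aes(x = is_canceled, y = lead_time, fill = is_canceled)) +
  geom_boxplot() +
  labs(x = "is_canceled (0 = kept, 1 = canceled)", y = "Lead time (days)")

print(chart8)

# Median lead time per group, which is the line drawn inside each box.
medians <- tapply(df$lead_time, df$is_canceled, median)
medians
```

With the misspelled name, R stops with a "could not find function" error before any plot is drawn, so the typo is easy to spot when the script is run.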