Forecasting emergency department boarding

This is a SAIL 2026 presentation on forecasting emergency department boarding using econometric, machine learning, and foundation-model approaches.

The main takeaway is straightforward: for this prospective, operational forecasting problem, classical multivariate forecasting was hard to beat. A well-specified vector autoregression (VAR) model consistently outperformed the operational moving-average baseline at both UC San Diego Health emergency departments, while the hybrid and foundation-model approaches added only modest or site-dependent gains.

Presentation

Link to download presentation.

Abstract

Background: Emergency department (ED) boarding is a major driver of crowding, delays, and downstream harm. Short-horizon forecasts of boarding volume can enable proactive operational actions, including discharge acceleration, intra-system transfers, and elective surgical adjustments. However, most published forecasting work targets ED arrivals and is evaluated retrospectively.

Methods: We prospectively compared six forecasting approaches spanning econometrics, machine learning, and time-series foundation models to predict daily ED boarding volume up to four days ahead (T+1 to T+4) across two UC San Diego Health EDs, La Jolla and Hillcrest. Daily covariates reflected operational drivers and included recent boarding volume, planned surgical volume, hospital census, and temporal indicators. Each morning at 07:10, models were retrained using all data through day T and generated forecasts for T+1 to T+4 under an identical rolling evaluation framework. We evaluated vector autoregression (VAR), XGBoost, Google TimesFM, and two hybrid extensions (VAR+XGBoost and TimesFM+XReg), against a two-week moving-average operational baseline.
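As a rough illustration of the rolling evaluation framework described in the Methods, here is a minimal sketch that refits a forecaster at each origin day T and scores its T+1 to T+4 predictions. It uses only the two-week moving-average baseline and synthetic data; the function names, the synthetic series, and the `start` parameter are assumptions made for illustration, not the study's code or data.

```python
import numpy as np

def moving_average_forecast(history, window=14, horizons=4):
    """Flat forecast: mean of the last `window` days, repeated for each horizon."""
    level = float(np.mean(history[-window:]))
    return np.full(horizons, level)

def rolling_origin_rmse(series, forecaster, start, horizons=4):
    """Re-run the forecaster at every origin and score T+1..T+horizons.

    Returns one RMSE per horizon, using only origins whose targets
    are all observed.
    """
    series = np.asarray(series, dtype=float)
    errors = []
    for t in range(start, len(series) - horizons + 1):
        history = series[:t]              # data through day T only
        forecast = forecaster(history)    # predictions for T+1..T+horizons
        actual = series[t:t + horizons]
        errors.append(forecast - actual)
    errors = np.array(errors)
    return np.sqrt(np.mean(errors ** 2, axis=0))

# Synthetic daily boarding counts (illustrative only, not study data).
rng = np.random.default_rng(0)
days = np.arange(200)
boarding = 20 + 3 * np.sin(2 * np.pi * days / 7) + rng.normal(0, 2, size=days.size)

rmse_by_horizon = rolling_origin_rmse(boarding, moving_average_forecast, start=60)
print(rmse_by_horizon)  # one RMSE value for each of T+1..T+4
```

Any other model can be compared under the same framework by passing a different `forecaster` callable with the same input and output shape.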

Results: During a four-month prospective validation period, VAR consistently outperformed the operational baseline at both sites. The VAR+XGBoost hybrid achieved the lowest RMSE for three of four horizons at La Jolla and two of four horizons at Hillcrest. Relative to the baseline, VAR+XGBoost reduced RMSE by 16% to 42% at La Jolla and 5% to 19% at Hillcrest but improved only modestly over VAR alone, underscoring the continued strength of interpretable econometric structure for multivariate operational time series. TimesFM performance was site-dependent; adding covariates via TimesFM+XReg improved accuracy at La Jolla but only at shorter horizons at Hillcrest. VAR was implemented in live operations, with forecasts emailed daily to Mission Control and reviewed in the morning huddle to inform same-day actions. Forecast uncertainty was communicated using 80% confidence intervals, which showed good coverage during prospective use.
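For readers less familiar with the metric, the percentage reductions quoted above can be read as (baseline RMSE minus model RMSE) divided by baseline RMSE. The sketch below shows that arithmetic on made-up numbers; the helper names and the toy values are assumptions for illustration and are unrelated to the study's results.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def percent_rmse_reduction(actual, baseline_pred, model_pred):
    """Reduction in RMSE relative to the baseline, as a percentage."""
    base = rmse(actual, baseline_pred)
    model = rmse(actual, model_pred)
    return 100.0 * (base - model) / base

# Toy values (not study data): baseline errors are all 2, model errors are all 1.
actual = [10, 12, 9, 11]
baseline_pred = [12, 10, 11, 9]
model_pred = [11, 11, 10, 10]

print(percent_rmse_reduction(actual, baseline_pred, model_pred))  # 50.0
```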

Conclusion: This work demonstrates a practical learning health system workflow for forecast-driven decision support, provides a prospective, real-world comparison of econometric, machine learning, and foundation-model paradigms for ED boarding prediction, and suggests that classical multivariate forecasting remains a strong baseline even when modern foundation models and hybrids are available.

What we studied

We compared six approaches for daily ED boarding forecasts:

1) Two-week moving average, used as the operational baseline

2) Vector autoregression (VAR), an econometric model

3) XGBoost, a machine learning model

4) TimesFM, a time-series foundation model

5) VAR + XGBoost, a hybrid model

6) TimesFM + XReg, a foundation-model extension with external regressors

The models were evaluated prospectively from July through October 2024 at two UC San Diego Health emergency departments, La Jolla and Hillcrest.
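To make the VAR entry in the list above more concrete, here is a minimal sketch of a first-order VAR fitted by ordinary least squares and iterated forward four days. The lag order of 1, the two-variable setup (boarding and census), the function names, and the simulated data are all assumptions for illustration; the deployed model's specification is not given on this page.

```python
import numpy as np

def fit_var1(data):
    """Least-squares fit of y_t = c + A @ y_{t-1} + e_t.

    data: array of shape (n_days, n_series). Returns (c, A).
    """
    data = np.asarray(data, dtype=float)
    X = np.hstack([np.ones((len(data) - 1, 1)), data[:-1]])  # intercept + lagged values
    Y = data[1:]
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    c = coef[0]       # intercepts, shape (n_series,)
    A = coef[1:].T    # lag-1 coefficient matrix, shape (n_series, n_series)
    return c, A

def forecast_var1(c, A, last_obs, horizons=4):
    """Iterate the fitted recursion to get forecasts for T+1..T+horizons."""
    preds = []
    y = np.asarray(last_obs, dtype=float)
    for _ in range(horizons):
        y = c + A @ y
        preds.append(y)
    return np.array(preds)

# Simulate two related daily series (hypothetical: boarding, census).
rng = np.random.default_rng(1)
true_c = np.array([5.0, 40.0])
true_A = np.array([[0.5, 0.1],
                   [0.0, 0.8]])
y = np.zeros((500, 2))
y[0] = [50.0, 200.0]
for t in range(1, len(y)):
    y[t] = true_c + true_A @ y[t - 1] + rng.normal(0, 1.0, size=2)

c_hat, A_hat = fit_var1(y)
forecasts = forecast_var1(c_hat, A_hat, y[-1], horizons=4)
print(forecasts[:, 0])  # boarding forecasts for T+1..T+4
```

The point of the multivariate form is that each series is predicted from the recent values of all series, so a covariate such as census can inform the boarding forecast.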

Operational implementation

The VAR model was implemented in live operations. Each morning at 07:10, forecasts were generated and emailed to Mission Control staff and health-system leaders. The forecasts were reviewed during the daily huddle and used to support proactive operational planning, including discharge planning, intra-system transfers, and surgical-case scheduling.

Key findings

  • VAR consistently beat the operational moving-average baseline at both sites.
  • VAR + XGBoost achieved the lowest RMSE for three of four horizons at La Jolla and two of four at Hillcrest, but only modestly improved on VAR alone.
  • TimesFM performance varied by site, performing worse than baseline at La Jolla and better than baseline at Hillcrest.
  • Adding external regressors improved TimesFM at La Jolla but only at shorter horizons at Hillcrest.
  • Uncertainty estimates from the deployed VAR model were communicated as 80% confidence intervals and showed good prospective coverage.
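One simple way to check the coverage mentioned in the last finding is to count how often the observed value fell inside the stated interval; for 80% intervals, a well-calibrated forecast should land near 0.8 over a long enough period. The sketch below is an assumed illustration with made-up numbers, not the study's evaluation code.

```python
import numpy as np

def empirical_coverage(actual, lower, upper):
    """Fraction of observations that fall within [lower, upper]."""
    actual, lower, upper = (np.asarray(a, dtype=float) for a in (actual, lower, upper))
    inside = (actual >= lower) & (actual <= upper)
    return float(np.mean(inside))

# Toy values (not study data): four of five observations fall inside the interval.
actual = [18, 22, 25, 30, 19]
lower = [15, 18, 20, 21, 16]
upper = [24, 27, 29, 28, 23]

print(empirical_coverage(actual, lower, upper))  # 0.8
```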

Citation

Poursoltan L, Ötleş E, Cao J, Clay B, Trimble B, Adrid L, Pan J, Chua A, Bell J, Longhurst CA, Zhu K, Singh K. Prospective comparison of econometric, machine learning, and foundation models for forecasting emergency department boarding patients. Presented at SAIL 2026.

Cheers,

Erkin