Clear Sky Science

A hierarchical conformal framework for uncertainty-aware length of stay prediction in multi-hospital settings

2026-01-29

Why hospital stay predictions matter

When someone is admitted to the hospital, one of the first questions families and staff ask is, “How long will they be here?” The answer is more than a matter of curiosity: it drives bed availability, staffing schedules, operating room plans, and even decisions about whether a patient can safely go home or needs extra support. This paper describes a new way to predict length of stay that gives not just a single number but also a realistic range reflecting how uncertain the prediction is, which is crucial for safe and efficient care.

The challenge of predicting time in the hospital

Predicting length of stay is harder than it looks. Hospitals treat a wide mix of patients, from routine cases to complex emergencies, and their practices vary by size, ownership, teaching status, and region. This means patients are “clustered” inside hospitals and regions, so their outcomes are not independent. Many current machine-learning models output a best guess but offer little trustworthy information about how wrong they might be. For hospital leaders who must avoid overcrowded wards or empty beds, that missing uncertainty can lead to unsafe discharges, unnecessary cancellations, or wasteful “just in case” buffers.

Combining two schools of thinking about uncertainty

The authors studied two popular ways to capture uncertainty and found that each has serious drawbacks on its own. Bayesian methods model uncertainty directly and can reflect complex structures such as hospitals nested within regions, but in practice their uncertainty ranges can be overconfident when model assumptions are even slightly off. Conformal prediction methods, by contrast, make almost no assumptions about the data and can guarantee that their ranges contain the true outcome a chosen percentage of the time, but they usually give intervals of the same width to every patient, ignoring how hard or easy a particular case is to predict. The key idea in this work is to create a hybrid that uses each approach for what it does best: Bayesian modeling to judge which patients are more or less uncertain, and conformal prediction to keep the overall reliability of the ranges in check.
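To make the conformal side of this comparison concrete, here is a minimal sketch of basic split conformal prediction in Python. It is not the authors' code. The data are synthetic, and the names `conformal_halfwidth`, `pred`, and `y` are made up for illustration. The point it shows is the drawback described above: one half-width, learned from past errors, is applied to every patient.

```python
import numpy as np

def conformal_halfwidth(y_cal, pred_cal, alpha=0.05):
    """Half-width q so that pred +/- q targets (1 - alpha) coverage on new cases."""
    # Nonconformity score: absolute error on held-out calibration cases.
    scores = np.sort(np.abs(np.asarray(y_cal, float) - np.asarray(pred_cal, float)))
    n = scores.size
    # Finite-sample corrected rank, ceil((n + 1) * (1 - alpha)).
    k = int(np.ceil((n + 1) * (1 - alpha)))
    return float(scores[k - 1]) if k <= n else float("inf")

# Synthetic stand-in data (assumption): predicted stays in days plus random error.
rng = np.random.default_rng(0)
pred = rng.uniform(2.0, 10.0, size=4000)
y = pred + rng.normal(0.0, 2.0, size=4000)

# First half calibrates the width, second half checks coverage.
q = conformal_halfwidth(y[:2000], pred[:2000], alpha=0.05)
covered = np.abs(y[2000:] - pred[2000:]) <= q
print(f"half-width = {q:.2f} days, test coverage = {covered.mean():.3f}")
```

Every test case receives the same interval, `pred ± q`, regardless of how difficult that case is to predict.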

How the hybrid system works in practice

The system begins with a “hierarchical random forest,” a tree-based machine-learning model that learns patterns at three levels: individual patients, their hospitals, and the broader regions those hospitals belong to. From this base, a Bayesian layer looks at the residual errors and estimates how uncertain each new prediction is, taking into account hospital and regional quirks. Separately, a conformal calibration step looks at past prediction errors across thousands of patients and determines how wide intervals must be to achieve a desired reliability level—about 95 percent in this study. The hybrid then scales these conformal adjustments up for cases the Bayesian layer judges as risky and down for cases it sees as straightforward, creating patient-specific intervals that are both cautious and efficiently sized.
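The scaling step can be illustrated with a standard technique called normalized (locally adaptive) conformal prediction, sketched below in Python. This is an assumption about the general shape of the idea, not the paper's exact procedure. The per-patient uncertainty score `sigma` is a hypothetical stand-in for whatever the Bayesian layer outputs, and all data are synthetic.

```python
import numpy as np

def conformal_quantile(scores, alpha=0.05):
    """Finite-sample corrected (1 - alpha) quantile of calibration scores."""
    scores = np.sort(np.asarray(scores, float))
    n = scores.size
    k = int(np.ceil((n + 1) * (1 - alpha)))
    return float(scores[k - 1]) if k <= n else float("inf")

# Synthetic stand-in data (assumption): each patient has a point prediction and
# an uncertainty score sigma; larger sigma means noisier outcomes.
rng = np.random.default_rng(1)
n = 4000
pred = rng.uniform(2.0, 10.0, size=n)
sigma = rng.uniform(0.5, 3.0, size=n)
y = pred + sigma * rng.normal(size=n)
cal, test = slice(0, 2000), slice(2000, n)

# Calibrate on errors divided by sigma, so the quantile is in "sigma units".
q = conformal_quantile(np.abs(y[cal] - pred[cal]) / sigma[cal], alpha=0.05)

# Rescale per patient: wider for uncertain cases, narrower for easy ones.
half = q * sigma[test]
lower, upper = pred[test] - half, pred[test] + half
cov = float(np.mean((y[test] >= lower) & (y[test] <= upper)))
easy = sigma[test] < np.median(sigma[test])
print(f"coverage = {cov:.3f}")
print(f"mean half-width: easy = {half[easy].mean():.2f}, hard = {half[~easy].mean():.2f}")
```

In this sketch the conformal step still sets the overall scale through `q`, while `sigma` only decides how that width is shared out among patients. In practice `sigma` would be an estimate rather than the true noise level used here.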

What the data say about performance

The authors tested their framework on more than 61,000 hospital stays from nearly 3,800 U.S. hospitals in a national inpatient database. Pure conformal prediction hit the 95 percent target almost exactly but used essentially the same wide range for everyone. A purely Bayesian add-on produced very narrow ranges but only captured the true length of stay about 14 percent of the time—far too low for safe use. The hybrid approach came close to the target, covering about 94.3 percent of cases, while modestly shrinking the average interval and, more importantly, redistributing width: about 21 percent narrower intervals for the least uncertain patients and about 6 percent wider for the most uncertain. These adaptive ranges remained stable across different types of hospitals and even when the model was tested on completely unseen institutions.
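Figures like these rest on two simple measurements: the share of true stays that fall inside their intervals, and how interval width changes within groups of patients ranked by uncertainty. The Python sketch below shows one plausible way to compute both. The function names, the grouping scheme, and the four-patient toy data are assumptions for illustration and do not reproduce the paper's numbers.

```python
import numpy as np

def coverage(y, lower, upper):
    """Fraction of true values that fall inside their intervals."""
    y, lower, upper = (np.asarray(a, float) for a in (y, lower, upper))
    return float(np.mean((y >= lower) & (y <= upper)))

def width_change_by_group(width_new, width_ref, uncertainty, n_groups=3):
    """Percent change in mean width vs. a reference, per uncertainty-ranked group."""
    width_new = np.asarray(width_new, float)
    width_ref = np.asarray(width_ref, float)
    order = np.argsort(uncertainty)          # least uncertain patients first
    groups = np.array_split(order, n_groups)
    return [float(100.0 * (width_new[g].mean() / width_ref[g].mean() - 1.0))
            for g in groups]

# Toy data (made up): four patients, adaptive intervals vs. a fixed 4-day width.
y     = [3, 5, 10, 2]
lower = [1, 4, 11, 0]
upper = [4, 8, 15, 3]
uncertainty = [0.2, 0.9, 1.5, 0.1]

cov = coverage(y, lower, upper)
changes = width_change_by_group(np.subtract(upper, lower), [4, 4, 4, 4],
                                uncertainty, n_groups=2)
print(cov)      # 0.75
print(changes)  # [-25.0, 0.0]
```

A negative entry means the adaptive intervals are narrower than the reference for that group, and a positive entry means they are wider.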

What this means for patients and hospitals

For non-specialists, the main takeaway is that this method turns black-box predictions into tools with understandable and trustworthy margins of error. Instead of one shaky number, hospitals gain ranges that are statistically backed and flex with case difficulty: tighter for routine patients, looser for those who might surprise clinicians. This makes it easier to plan beds and staffing realistically while flagging which patients deserve extra attention and contingency planning. Although the current ranges are still fairly wide in calendar days, the framework shows how careful statistics can move hospitals from guesswork toward more reliable, uncertainty-aware decisions that support both safety and efficiency.

Citation: Shahbazi, M.A., Baheri, A. & Azadeh-Fard, N. A hierarchical conformal framework for uncertainty-aware length of stay prediction in multi-hospital settings. Sci Rep 16, 6564 (2026). https://doi.org/10.1038/s41598-026-37450-w

Keywords: hospital length of stay, uncertainty quantification, conformal prediction, Bayesian modeling, healthcare analytics