Clear Sky Science
A fine-grained transformer combined with multimodal data for predicting hospital length of stay in acute coronary syndrome
Why hospital stay length matters
For people rushed to the hospital with chest pain or a heart attack, how long they will need to stay is more than a matter of curiosity. It affects how quickly they get a bed, how soon others can be admitted, how many nurses are needed on each shift, and even what costs families will face. This study introduces a new artificial intelligence system that estimates hospital length of stay for patients with acute coronary syndromes—dangerous heart conditions caused by blocked coronary arteries—by looking not only at medical records but also directly at detailed heart imaging.
Looking inside clogged heart arteries
Doctors already rely on computed tomography (CT) scans to view the heart’s blood vessels and spot narrowings or blockages that can trigger acute coronary syndromes. These vessels are long, branching tubes, and small diseased segments may occupy only a tiny fraction of each image. Traditional computer models often summarize such scans into broad measurements, which can gloss over the fine details of where and how severely an artery is damaged. The new approach, called FRAME, treats the CT images much more carefully, aiming to capture subtle changes in vessel shape that signal how sick a patient is—and how long they might need hospital care.

Teaching the computer to read vessel shapes
The researchers first devised a way for the computer to teach itself what healthy and diseased vessels look like, without needing a human to label every image. They gently altered each CT image using safe brightness and shape adjustments that never erase the actual lesion areas. By comparing the original and modified versions of the same scan, the system learned which features should stay constant and which should change when the image is transformed. This "self-supervised" training helped the model focus on the true three-dimensional patterns of the vessels—such as curves, narrowings, and irregular walls—rather than being distracted by noise or irrelevant background structures.
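The idea of altering an image without erasing its lesion can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' code: the toy image, the `lesion_mask`, the brightness range of 0.9 to 1.1, and the horizontal flip standing in for a "shape" adjustment are all hypothetical choices, since the summary does not specify the actual transforms.

```python
import random


def augment(image, lesion_mask, rng):
    """Return a brightness-scaled, possibly mirrored copy of image and mask.

    The same geometric change is applied to the image and the mask, so every
    lesion pixel in the original is still present in the augmented view.
    """
    scale = rng.uniform(0.9, 1.1)   # mild brightness change (assumed range)
    flip = rng.random() < 0.5       # horizontal flip as a simple shape change
    new_image, new_mask = [], []
    for img_row, mask_row in zip(image, lesion_mask):
        row = [min(1.0, px * scale) for px in img_row]  # keep values in [0, 1]
        mask = list(mask_row)
        if flip:
            row.reverse()
            mask.reverse()
        new_image.append(row)
        new_mask.append(mask)
    return new_image, new_mask


def lesion_pixels(mask):
    """Count the pixels marked as lesion."""
    return sum(sum(row) for row in mask)


# Hypothetical 3x4 "scan" with a two-pixel lesion in the middle row.
image = [
    [0.1, 0.2, 0.1, 0.0],
    [0.1, 0.8, 0.7, 0.0],
    [0.0, 0.2, 0.1, 0.0],
]
lesion_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

rng = random.Random(0)
view_a, mask_a = augment(image, lesion_mask, rng)
view_b, mask_b = augment(image, lesion_mask, rng)
```

In a self-supervised setup, two views such as `view_a` and `view_b` would be fed to the same network, which is then trained to produce matching representations for both.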
Zooming in on tiny problem spots
After this training, FRAME analyzes new CT scans in small tiles, or patches, instead of treating each image as a single block. An attention mechanism weighs each patch and highlights those that appear most related to disease, effectively zooming in on narrow, damaged vessel regions while downplaying healthy areas. In parallel, the system processes information from the electronic medical record, including age, heart rate, kidney function, inflammation markers, and blood cell counts. A fine-grained fusion step then lets the CT patches and record data influence one another, using a Transformer—the same type of architecture behind many modern language models—to connect patterns in vessel damage with patterns in lab tests and vital signs.
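How an attention mechanism "weighs" patches can be shown with a small sketch. It assumes a standard scaled dot-product score followed by a softmax, which may differ from FRAME's actual design; the two-number patch features and the `query` vector are hypothetical stand-ins for values a trained model would learn or derive from the record data.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)                       # subtract the max for stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(patch_features, query):
    """Score each patch against the query and return weights and a summary.

    A patch whose features align with the query gets a larger weight and so
    contributes more to the pooled summary vector.
    """
    dim = len(query)
    scores = [
        sum(f * q for f, q in zip(feat, query)) / math.sqrt(dim)
        for feat in patch_features
    ]
    weights = softmax(scores)
    pooled = [
        sum(w * feat[i] for w, feat in zip(weights, patch_features))
        for i in range(dim)
    ]
    return weights, pooled


# Three hypothetical patches; the second has the strongest "disease" signal.
patch_features = [[0.1, 0.0], [0.9, 0.8], [0.2, 0.1]]
query = [1.0, 1.0]

weights, pooled = attention_pool(patch_features, query)
```

Here the second patch receives the largest weight, which is the numerical counterpart of zooming in on a damaged vessel region while downplaying the rest.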

Putting the model to the test
To check how well FRAME works, the team applied it to data from 615 patients treated at a large chest hospital in Shanghai between 2015 and 2021. Every patient had both CT imaging and detailed medical records, along with a recorded hospital length of stay. The researchers compared FRAME with 16 leading alternatives, ranging from classic machine learning methods to advanced image networks and multimodal models. Across the full patient group and across four ranges of hospital stay, from under five days to longer than two weeks, the new system consistently made the most accurate predictions, with errors of about one day on average and a very strong match between predicted and actual stay lengths.
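The two kinds of results described here, an average error in days and a match between predicted and actual stays, can be computed as follows. This sketch assumes the error is a mean absolute error and the match is a Pearson correlation, which the summary does not state explicitly; the stay lengths below are made-up numbers for illustration, not data from the study.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average size of the prediction error, in the same unit as the inputs."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def pearson_r(x, y):
    """Pearson correlation: values near 1 mean predictions track reality."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical stays in days for five patients (not from the paper).
actual_days = [3, 5, 7, 10, 16]
predicted_days = [4, 5, 6, 11, 15]

mae = mean_absolute_error(actual_days, predicted_days)
r = pearson_r(actual_days, predicted_days)
```

Repeating the same calculation within each range of stay length would show whether a model remains accurate for the longest stays, which are often the hardest to predict.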
Seeing what the model sees
Beyond raw accuracy, the team examined where FRAME was “looking” in the images and which medical record features it relied on. The highlighted image patches almost always corresponded to vessel segments with visible lesions, suggesting the model had learned clinically meaningful patterns that might eventually help automate lesion outlining for radiologists. Among the record data, heart rate, kidney function markers, uric acid, inflammation proteins, and certain blood cell ratios carried the most weight—and each showed a clear statistical link with how long patients stayed in the hospital. This aligns with previous clinical studies, giving doctors additional confidence that the system’s reasoning is medically sensible.
What this means for patients and hospitals
In simple terms, the study shows that carefully combining rich heart imaging with standard bedside measurements allows computers to estimate hospital stay length for acute coronary syndrome patients more accurately than existing tools. By paying attention to the exact shapes of diseased vessels and the most telling lab results, FRAME could help hospitals plan beds and staff, flag high-risk patients earlier, and potentially speed up image review for busy clinicians. While the authors note that larger and more varied datasets will be needed before such systems guide everyday care, their work demonstrates a promising step toward smarter, more efficient heart emergency management.
Citation: You, L., Cen, X. & Wang, S. A fine-grained transformer combined with multimodal data for predicting hospital length of stay in acute coronary syndrome. Sci Rep 16, 11465 (2026). https://doi.org/10.1038/s41598-026-41279-8
Keywords: acute coronary syndrome, hospital length of stay, cardiac CT imaging, medical AI, multimodal prediction