Clear Sky Science

Clinically-guided models or foundation models? predicting cervical spondylotic myelopathy from electronic health records

2026-01-20

Why spotting this spinal problem sooner matters

Cervical spondylotic myelopathy (CSM) is a mouthful, but for many older adults it quietly threatens the spinal cord. It can start as clumsiness, a shuffling walk, or bathroom troubles and gradually progress to severe disability or even paralysis. Doctors often miss it for years because the signs are subtle and resemble more common problems like arthritis or carpal tunnel syndrome. This study asks a timely question: can patterns buried in electronic health records help flag people at risk for CSM years before a formal diagnosis, and what kind of artificial intelligence (AI) is best suited for the job?

A hidden condition in a graying population

CSM occurs when age-related wear and tear narrows the spinal canal in the neck and compresses the spinal cord. The condition is common in older adults; neck imaging shows spinal cord compression in about one-third of people over 60, and a substantial share of them will go on to develop symptoms. Yet studies suggest patients often wait two to six years between the first signs and a correct diagnosis, losing precious time when surgery or other interventions could prevent permanent damage. As populations age and primary care clinicians struggle with crowded clinics and limited exposure to spine disorders, the need for scalable ways to catch CSM early is growing.

Turning medical records into an early warning system

Modern electronic health records (EHRs) capture a detailed trail of diagnoses, lab tests, procedures, and clinic visits. The researchers reasoned that this trail likely contains clues to early CSM—such as repeated falls, nerve tests, or physical therapy—long before anyone orders specialized spinal imaging. They assembled data from roughly 2 million patients in two large U.S. datasets: a national insurance claims database and the records of a regional health system. Within these, they identified tens of thousands of people who eventually received a CSM diagnosis and matched them to similar patients who did not, creating a large-scale test bed to see whether AI could predict who would later be diagnosed with CSM at time points ranging from 6 to 30 months in advance.
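To make the idea of a prediction window concrete, here is a minimal sketch in Python of how records might be cut off a fixed number of months before the diagnosis date, so that a model only sees what was known at that earlier point. This is an illustration under stated assumptions, not the study's actual pipeline: the function name, the event codes, and the approximation of a month as 30.44 days are all hypothetical.

```python
from datetime import date, timedelta

# Assumption: one month is approximated as 30.44 days (the average month length).
DAYS_PER_MONTH = 30.44


def events_before_window(events, index_date, months_ahead):
    """Keep only the codes recorded at least `months_ahead` months before index_date.

    events: list of (event_date, code) tuples for one patient.
    index_date: date of the CSM diagnosis (or the matched date for a control).
    """
    cutoff = index_date - timedelta(days=round(months_ahead * DAYS_PER_MONTH))
    return [code for event_date, code in events if event_date <= cutoff]


# Hypothetical patient history; the codes are placeholders, not real billing codes.
history = [
    (date(2021, 5, 10), "FALL"),
    (date(2023, 3, 1), "NERVE_TEST"),
    (date(2023, 11, 15), "PHYS_THERAPY"),
]
diagnosis_date = date(2024, 1, 1)

six_month_view = events_before_window(history, diagnosis_date, 6)
thirty_month_view = events_before_window(history, diagnosis_date, 30)
print(six_month_view)
print(thirty_month_view)
```

The longer the window, the less of the record is visible to the model, which is one reason predicting 30 months ahead is harder than predicting 6 months ahead.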

Big general-purpose AI versus lean, clinically guided tools

The team compared several types of machine-learning models that process EHR data. On one end were large "foundation models"—powerful, transformer-based systems originally trained on millions of patient records to learn general patterns in healthcare data. On the other were smaller models built only from a concise list of 497 diagnosis, procedure, and medication codes handpicked by spine specialists as highly relevant to CSM. The researchers measured performance using statistics suited to rare diseases, asking how much better each model was than random guessing at identifying patients who would later develop CSM across different prediction windows.
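One common way to express "how much better than random guessing" for a rare outcome is to divide a model's average precision by the disease prevalence, since a random ranking is expected to achieve an average precision roughly equal to the prevalence. The sketch below illustrates that ratio in plain Python. It is an assumption that this resembles the statistic used in the study, as the summary does not name it; the function names are hypothetical, and ties between scores are not handled specially.

```python
def average_precision(labels, scores):
    """Mean of the precision values measured at each true positive, ranked by score."""
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / total_positives


def lift_over_chance(labels, scores):
    """Average precision divided by prevalence (the expected score of random guessing)."""
    prevalence = sum(labels) / len(labels)
    return average_precision(labels, scores) / prevalence


# Toy data: 1 future CSM case among 10 patients (prevalence 0.1).
labels = [0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
perfect_scores = [0.1, 0.9, 0.2, 0.1, 0.3, 0.2, 0.1, 0.4, 0.2, 0.1]
second_place_scores = [0.95, 0.9, 0.2, 0.1, 0.3, 0.2, 0.1, 0.4, 0.2, 0.1]

print(lift_over_chance(labels, perfect_scores))
print(lift_over_chance(labels, second_place_scores))
```

A ratio of 1 means the model ranks patients no better than chance, while larger values mean that true future cases are concentrated near the top of the ranking.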

Accuracy at home, reliability on the road

When models were trained and tested within the same large, diverse insurance dataset, the biggest foundation model usually performed best, scoring up to about six to seven times better than a non-informative classifier (one that guesses at random). However, the picture changed when the models were evaluated on the independent health system. There, the simpler, clinically guided models generally outperformed the complex transformers and, in some cases, performed up to 13 times better than random chance at predicting which patients would soon receive a CSM diagnosis. A reverse experiment, training on the single health system and testing on the national dataset, told a similar story: smaller, clinically focused models tended to travel better between institutions. Subgroup analyses also revealed that all models worked best for patients with more frequent healthcare visits, raising questions about fairness for those who see doctors less often.

What this means for patients and doctors

The findings suggest that AI could help flag people at high risk for CSM as much as two and a half years before diagnosis, potentially steering clinicians toward earlier neurological exams and spine imaging. Yet the study also highlights a trade-off: while large, sophisticated AI models can excel on the data they are trained on, smaller, carefully designed models grounded in clinical expertise may be more reliable when moved into new hospitals and patient populations. For patients, the takeaway is hopeful but nuanced: intelligent use of routine health data could shorten the long diagnostic odyssey many with CSM face, but success will depend not only on powerful algorithms, but also on thoughtful model design, careful testing across diverse settings, and attention to equity so that early detection benefits are shared widely.

Citation: Yakdan, S., Warner, B., Ghogawala, Z. et al. Clinically-guided models or foundation models? predicting cervical spondylotic myelopathy from electronic health records. npj Digit. Med. 9, 153 (2026). https://doi.org/10.1038/s41746-026-02337-7

Keywords: cervical spondylotic myelopathy, electronic health records, machine learning, foundation models, early diagnosis