Clear Sky Science

A test-time clinically adaptive framework for detecting multiple fundus diseases harnessing ophthalmic foundation models

Why Smarter Eye Screening Matters

Millions of people around the world are at risk of losing their sight from conditions such as diabetes, glaucoma, and age-related changes in the back of the eye. Doctors can spot many of these problems in photographs of the eye’s interior, called fundus images, but carefully checking every image is time‑consuming, especially when several diseases can appear at once. This study presents RetExpert, a new artificial intelligence (AI) system designed to act like a versatile eye specialist: it can screen for many retinal diseases simultaneously, cope with messy real‑world data, and adapt itself on the fly to different clinics and cameras.

Figure 1.

Seeing Many Eye Problems at Once

Typical AI tools for eye screening are trained to detect just one disease at a time and often assume that images will look very similar to those used during training. Real clinics are different. Patients frequently have more than one eye condition, disease labels in datasets are incomplete, and cameras and populations vary from site to site. RetExpert tackles this by building on “foundation models,” large vision systems first trained on huge collections of retinal images. These foundation models already understand common patterns in fundus photographs, such as blood vessels, the optic nerve, and the macula. RetExpert turns the blocks of such a model into “adaptive knowledge units” and adds small adapter layers that can be fine‑tuned for multi‑disease screening without disturbing the valuable knowledge already learned.
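The adapter idea can be illustrated with a toy example. The sketch below is not RetExpert's code: the class name `AdaptedBlock`, the bottleneck size, and the use of a single linear map as the "frozen block" are all assumptions made for illustration. It shows one common way to add a small trainable layer on top of frozen weights so that the pretrained behaviour is unchanged until the adapter is actually trained.

```python
import random

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def relu(v):
    return [max(0.0, a) for a in v]

class AdaptedBlock:
    """Hypothetical frozen square linear block wrapped with a residual bottleneck adapter."""

    def __init__(self, frozen_W, bottleneck, seed=0):
        rng = random.Random(seed)
        dim = len(frozen_W)
        self.frozen_W = frozen_W  # pretrained weights, never updated
        # Trainable adapter: small random down-projection, zero up-projection,
        # so the adapter contributes nothing before fine-tuning.
        self.down = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(bottleneck)]
        self.up = [[0.0] * bottleneck for _ in range(dim)]

    def forward(self, x):
        h = matvec(self.frozen_W, x)                         # frozen knowledge
        delta = matvec(self.up, relu(matvec(self.down, h)))  # learned correction
        return [a + b for a, b in zip(h, delta)]

    def trainable_count(self):
        return sum(len(r) for r in self.down) + sum(len(r) for r in self.up)

# Usage: an 8-dimensional identity "pretrained" block with a 2-unit adapter.
identity = [[1.0 if i == j else 0.0 for j in range(8)] for i in range(8)]
block = AdaptedBlock(identity, bottleneck=2)
features = [float(i) for i in range(8)]
output = block.forward(features)
```

Because the up-projection starts at zero, the wrapped block initially reproduces the frozen block exactly, and only the adapter's 32 parameters (versus 64 frozen ones in this toy case) would change during fine-tuning.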

Making Better Use of Uneven and Uncertain Data

Real medical data tend to be unbalanced: common diseases like diabetic retinopathy appear far more often than rare disorders. A model trained naively on such data tends to focus on the frequent conditions and neglect unusual but important ones. RetExpert uses a training approach that gives extra attention to under‑represented diseases so that performance does not collapse on rare findings. The system also recognizes that some predictions are naturally more uncertain than others. Instead of producing a single overall confidence score, RetExpert estimates separate uncertainty scores for each disease it is asked to detect. This disease‑by‑disease view is closer to how clinicians think and allows the model to treat fragile predictions cautiously during later adaptation steps.
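Both ideas in this section can be made concrete with a small sketch. The summary does not say which weighting scheme or uncertainty measure RetExpert uses, so the choices here are assumptions: a negatives-to-positives ratio as the weight for each disease, and binary entropy as the per-disease uncertainty. The function names are hypothetical.

```python
import math

def positive_weights(label_rows):
    """Weight per disease = negatives / positives, so rarer diseases weigh more."""
    n = len(label_rows)
    k = len(label_rows[0])
    weights = []
    for j in range(k):
        pos = sum(row[j] for row in label_rows)
        weights.append((n - pos) / max(pos, 1))
    return weights

def weighted_bce(probs, labels, weights, eps=1e-7):
    """Multi-label cross-entropy in which missed positives are scaled by the disease weight."""
    total = 0.0
    for p, y, w in zip(probs, labels, weights):
        p = min(max(p, eps), 1 - eps)
        total += -(w * y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(probs)

def per_disease_uncertainty(probs, eps=1e-7):
    """Binary entropy in bits for each disease: 0 is confident, 1 is maximally unsure."""
    out = []
    for p in probs:
        p = min(max(p, eps), 1 - eps)
        out.append(-(p * math.log2(p) + (1 - p) * math.log2(1 - p)))
    return out

# Usage: column 0 is a common disease, column 1 a rare one (made-up labels).
labels = [[1, 0], [1, 0], [1, 0], [0, 1]]
w = positive_weights(labels)
loss_rare_missed = weighted_bce([0.9, 0.1], [1, 1], w)
loss_common_missed = weighted_bce([0.1, 0.9], [1, 1], w)
uncertainty = per_disease_uncertainty([0.5, 0.99])
```

With these toy labels, missing the rare disease costs more than missing the common one, and the uncertainty list carries one value per disease rather than a single score for the whole image.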

Reducing Mix‑Ups Between Similar Diseases

In practice, some retinal diseases can look alike, and it can be especially harmful if an AI model confidently gives conflicting answers—such as calling the same image both “normal” and “macular degeneration.” To handle this, the authors built a fundus disease co‑occurrence matrix, a structured summary of how likely pairs of diseases are to appear together based on medical knowledge and statistics. During training, RetExpert learns to align its output probabilities with these medically sensible relationships. The team also introduced a “confusion score” that measures how often the model mixes up specific disease pairs. With the co‑occurrence information in place, confusion between tricky pairs, such as macular degeneration and high myopia, dropped by more than a third, making predictions more trustworthy for clinical use.
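A rough sketch of how such a matrix and score might be computed follows. It covers only the statistical part (the article says the real matrix also draws on medical knowledge), and the penalty and the mix-up measure are illustrative assumptions, not the definitions used in the paper. All function names and the toy labels are hypothetical.

```python
def cooccurrence_matrix(label_rows):
    """M[i][j] = fraction of images labelled with disease i that also carry disease j."""
    k = len(label_rows[0])
    M = [[0.0] * k for _ in range(k)]
    for i in range(k):
        with_i = [row for row in label_rows if row[i] == 1]
        for j in range(k):
            if with_i:
                M[i][j] = sum(row[j] for row in with_i) / len(with_i)
    return M

def cooccurrence_penalty(probs, M):
    """Assumed penalty: large when two labels that rarely co-occur are both predicted strongly."""
    k = len(probs)
    total = 0.0
    for i in range(k):
        for j in range(i + 1, k):
            compatibility = max(M[i][j], M[j][i])
            total += probs[i] * probs[j] * (1.0 - compatibility)
    return total

def pair_confusion(true_rows, pred_rows, i, j):
    """Assumed mix-up rate: among images that truly have i but not j,
    the share predicted as j but not i."""
    eligible = [(t, p) for t, p in zip(true_rows, pred_rows) if t[i] == 1 and t[j] == 0]
    if not eligible:
        return 0.0
    mixed = sum(1 for _, p in eligible if p[j] == 1 and p[i] == 0)
    return mixed / len(eligible)

# Usage with made-up labels; columns are [normal, macular degeneration, high myopia].
labels = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 1, 1], [0, 0, 1]]
M = cooccurrence_matrix(labels)
contradictory = cooccurrence_penalty([0.9, 0.9, 0.0], M)  # "normal" and AMD together
plausible = cooccurrence_penalty([0.0, 0.9, 0.9], M)      # AMD and myopia together
truth = [[0, 1, 0], [0, 1, 0], [0, 0, 1]]
preds = [[0, 0, 1], [0, 1, 0], [0, 0, 1]]
amd_as_myopia = pair_confusion(truth, preds, 1, 2)
```

In this toy data "normal" never appears alongside macular degeneration, so predicting both is penalized more heavily than predicting macular degeneration with high myopia, which do co-occur in one image.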

Figure 2.

Adapting Itself at Test Time

One of the biggest obstacles to deploying AI across hospitals is “domain shift”: images from a new clinic can differ because of different patient groups, camera types, or imaging settings. Conventional systems must be retrained or extensively fine‑tuned whenever they move to a new environment, which is costly and slow. RetExpert instead performs lightweight adaptation during use. When it encounters a batch of new images, it briefly adjusts only its small adapter layers and final decision head, guided first by how stable its own features are and then by pseudo‑labels weighted by the per‑disease uncertainty estimates. Crucially, these updates are temporary and reset after each batch, so the core model does not drift over time, preserving safety and reproducibility while still gaining short‑term flexibility.
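The "adjust, predict, reset" cycle can be sketched in a few lines. This is a simplified stand-in rather than the authors' procedure: it omits the first stage based on feature stability, uses a tiny logistic head in place of the adapters and decision head, and assumes binary entropy as the uncertainty weight. The names `Head` and `adapt_and_predict` and the learning rate are hypothetical.

```python
import copy
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def binary_entropy(p, eps=1e-7):
    p = min(max(p, eps), 1 - eps)
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

class Head:
    """Tiny multi-label head: one logistic unit per disease."""

    def __init__(self, weights, biases):
        self.weights = weights  # indexed as [disease][feature]
        self.biases = biases

    def predict(self, x):
        return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.weights, self.biases)]

def adapt_and_predict(head, batch, lr=0.1):
    """Adapt on one batch with uncertainty-weighted pseudo-labels, predict, then reset."""
    snapshot = copy.deepcopy((head.weights, head.biases))  # saved for the reset
    for x in batch:
        probs = head.predict(x)
        for d, p in enumerate(probs):
            pseudo = 1.0 if p >= 0.5 else 0.0   # the model's own hard guess
            trust = 1.0 - binary_entropy(p)     # uncertain diseases get small updates
            grad = trust * (p - pseudo)         # cross-entropy gradient w.r.t. the logit
            head.biases[d] -= lr * grad
            for f, xi in enumerate(x):
                head.weights[d][f] -= lr * grad * xi
    outputs = [head.predict(x) for x in batch]
    head.weights, head.biases = snapshot        # temporary updates are discarded
    return outputs

# Usage: one disease, two features, a batch of two "images" (made-up numbers).
head = Head([[1.0, -1.0]], [0.0])
out = adapt_and_predict(head, [[2.0, 0.0], [0.0, 2.0]])
```

After the call, the head's parameters are back to their starting values, so the next batch begins from the same model no matter what was seen before. That is the property the article credits with preventing long-term drift.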

How Well Does It Work in the Real World?

The authors tested RetExpert on two large multi‑disease datasets used for development and then on 15 additional public and private datasets collected from different countries, cameras, and clinical settings. Across most tasks—including detection of diabetic retinopathy, age‑related macular degeneration, glaucoma, myopic changes, and ocular toxoplasmosis—RetExpert matched or exceeded the performance of current leading retinal foundation models. It also showed better reliability, with lower confusion scores and stronger results on challenging “out‑of‑distribution” datasets that mimic new hospitals and devices. Although highly specialized single‑disease systems can still be slightly better for that one condition, RetExpert narrows the gap while offering broad, multi‑disease coverage in one unified tool.

A Step Toward Trustworthy Automated Eye Checks

In everyday terms, RetExpert is like a seasoned general eye doctor built on top of a large shared knowledge base, equipped with tools to handle rare diseases, acknowledge when it is unsure, and adjust quickly to new clinics without constant retraining. By combining these elements—adaptive modules, uncertainty‑aware learning, medical prior knowledge, and test‑time adaptation—the framework delivers more accurate and dependable multi‑disease screening from simple color photographs of the back of the eye. If developed and validated further, such systems could support earlier detection of sight‑threatening conditions at scale, especially in settings where access to eye specialists is limited.

Citation: Jiang, H., Liu, Z., Gao, M. et al. A test-time clinically adaptive framework for detecting multiple fundus diseases harnessing ophthalmic foundation models. npj Digit. Med. 9, 300 (2026). https://doi.org/10.1038/s41746-026-02480-1

Keywords: retinal imaging, medical AI, multi-disease screening, foundation models, domain adaptation