Clear Sky Science
An agentic system for rare disease diagnosis with traceable reasoning
Why finding rare illnesses faster matters
For families living with rare diseases, getting a name for their condition can take years of appointments, conflicting opinions and dead ends. This long “diagnostic odyssey” delays treatment, drains savings and leaves patients in limbo. This study introduces DeepRare, an artificial‑intelligence system designed to act as a digital teammate for doctors. It aims to spot rare diseases earlier and explain its thinking in a way that physicians can check and trust.
A new kind of digital assistant for doctors
DeepRare is built to work the way real clinics do. It can read free‑text clinic notes, structured symptom lists and results from modern gene tests, then turn all of that into a ranked list of possible rare diseases. Unlike many existing tools that simply return a list of names, DeepRare produces a step‑by‑step reasoning chain that links each suggestion to medical papers, expert databases and similar past cases. That means a clinician can see not just what the system recommends, but why.

How the system thinks through a case
Under the hood, DeepRare is organized as a “multi‑agent” system. At its center is a powerful language model that acts like a coordinator, deciding what to do next as new information arrives. Around it sit specialized helper modules: some clean up and standardize symptoms, others search medical literature and rare‑disease case banks, and others interpret long lists of gene variants. For each patient, DeepRare cycles through gathering evidence, proposing diagnoses and then re‑checking its own ideas. If it finds problems with a candidate explanation, it loops back to search for better evidence instead of pressing ahead with a shaky guess.
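The gather, propose and re-check cycle described above can be illustrated with a small toy program. This is only a sketch under stated assumptions, not DeepRare's actual code: the function names, the two-source acceptance rule and the tiny knowledge base are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    disease: str
    evidence: list = field(default_factory=list)  # references backing this candidate


# Invented toy sources standing in for literature and case-bank searches.
TOY_SOURCES = [
    {"disease": "Disease A", "symptoms": {"seizures"}, "ref": "toy-paper-1"},
    {"disease": "Disease B", "symptoms": {"seizures", "hearing loss"}, "ref": "toy-case-7"},
    {"disease": "Disease B", "symptoms": {"hearing loss"}, "ref": "toy-paper-2"},
]


def gather_evidence(symptoms, depth):
    """Hypothetical search helper: each round consults one more source."""
    return [s for s in TOY_SOURCES[:depth] if s["symptoms"] & symptoms]


def propose(evidence):
    """Hypothetical proposer: group evidence by disease, best supported first."""
    by_disease = {}
    for item in evidence:
        candidate = by_disease.setdefault(item["disease"], Candidate(item["disease"]))
        candidate.evidence.append(item["ref"])
    return sorted(by_disease.values(), key=lambda c: len(c.evidence), reverse=True)


def passes_check(candidate, min_sources=2):
    """Hypothetical self-check: a candidate needs at least two supporting sources."""
    return len(candidate.evidence) >= min_sources


def diagnose(symptoms, max_rounds=5):
    """Loop back for more evidence until some candidate survives the self-check."""
    for depth in range(1, max_rounds + 1):
        candidates = propose(gather_evidence(symptoms, depth))
        accepted = [c for c in candidates if passes_check(c)]
        if accepted:
            return accepted, depth
    return [], max_rounds  # nothing held up; report no confident answer


ranked, rounds = diagnose({"seizures", "hearing loss"})
print(rounds, ranked[0].disease, ranked[0].evidence)
```

In this toy run the first two rounds leave every candidate with a single source, so the loop searches again rather than accepting a weakly supported guess. Each accepted candidate carries the references that back it, which is the kind of traceability the article describes.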
Putting DeepRare to the test in the real world
The researchers tested DeepRare on 6,401 patient cases drawn from nine different sources, including large public datasets, online case reports and four major hospitals in Europe, North America and China. Together, these cases covered 2,919 distinct rare diseases across 14 organ systems, from brain disorders to kidney conditions. Using only structured symptom descriptions, DeepRare correctly placed the true disease in the top spot for about 57% of cases, more than 20 percentage points better than the next‑best method and far above widely used tools such as PhenoBrain and PubCaseFinder. When gene‑sequencing results were also available, DeepRare’s top‑rank accuracy climbed to around 69%, beating a leading genetics tool called Exomiser on two independent hospital cohorts.
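Figures such as "top‑rank accuracy" count how often the true disease appears in the first position of the ranked list, or more generally within the first k positions. The short sketch below shows one common way to compute such a score. The function name and the four toy cases are invented for illustration and are not data from the study.

```python
def top_k_accuracy(ranked_predictions, true_diagnoses, k):
    """Fraction of cases whose true diagnosis appears in the first k ranked guesses."""
    if not true_diagnoses:
        raise ValueError("need at least one case")
    hits = sum(
        truth in preds[:k]
        for preds, truth in zip(ranked_predictions, true_diagnoses)
    )
    return hits / len(true_diagnoses)


# Invented toy cases: each inner list is one patient's ranked candidate diseases.
predictions = [
    ["Disease A", "Disease B", "Disease C"],
    ["Disease B", "Disease A", "Disease C"],
    ["Disease C", "Disease A", "Disease B"],
    ["Disease A", "Disease C", "Disease B"],
]
truths = ["Disease A", "Disease A", "Disease B", "Disease D"]

top1 = top_k_accuracy(predictions, truths, 1)
top3 = top_k_accuracy(predictions, truths, 3)
print(top1, top3)
```

Allowing more guesses can only keep the score the same or raise it, which is why top‑1 and top‑5 results are reported separately when systems and clinicians are compared.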
Comparing AI to human specialists
To see whether DeepRare’s performance would hold up against human experts, the team ran a head‑to‑head comparison using 163 difficult cases from a children’s hospital. Five physicians with more than a decade of experience in rare diseases were asked to suggest up to five possible diagnoses per case, using search engines but no AI. DeepRare received the same symptom information. When allowed five guesses, the system matched the clinicians; on the single top prediction, it surpassed their accuracy. Another group of ten rare‑disease doctors examined the AI’s reasoning chains for 180 cases and judged over 95% of its cited evidence to be accurate and relevant, suggesting that its explanations are medically reliable rather than mere “AI storytelling.”
Why traceable reasoning changes the game
Beyond raw accuracy, the authors argue that DeepRare could reshape how rare diseases are investigated. Because the system links each step of its thinking to medical sources and similar patients, it could substantially reduce the time a clinician spends hunting through the literature or searching case databases. Its strong performance across many organ systems means it could help non‑specialist doctors recognize conditions they rarely see, especially in hospitals without dedicated rare‑disease centers. The study does note open challenges, such as confusion between look‑alike syndromes and the need to expand the system toward early screening and treatment planning. Still, the findings suggest that carefully designed AI teammates like DeepRare could shorten the diagnostic odyssey for thousands of families by making rare diseases easier to recognize and reason about in everyday clinical practice.

Citation: Zhao, W., Wu, C., Fan, Y. et al. An agentic system for rare disease diagnosis with traceable reasoning. Nature 651, 775–784 (2026). https://doi.org/10.1038/s41586-025-10097-9
Keywords: rare disease diagnosis, medical AI, large language models, clinical decision support, genomic medicine