Clear Sky Science
Impact of AI misinformation on diagnostic accuracy and confidence calibration in novice medical students
Why smart machines can still mislead beginners
Artificial intelligence is rapidly entering classrooms and clinics, promising faster learning and smarter decisions. But when students lean on AI to understand complex medical problems, what happens if the explanation sounds convincing yet is wrong? This study tests that real-world dilemma in junior medical students and finds a worrying answer: misleading AI explanations can actively harm learning, while perfectly correct explanations often help far less than we might hope.

Testing three kinds of AI help
Researchers in China ran a randomized trial with 111 junior medical students who had basic science training but little clinical experience. All students answered 25 challenging, board-style multiple-choice questions that mimicked real licensing exams. One group saw only the questions. A second group saw carefully checked, expert-approved AI explanations pointing them to the right answer. A third group saw AI-style explanations that were polished and plausible but deliberately supported a specific wrong choice. After each question, students picked an answer and rated how confident they felt.
When wrong guidance is worse than no help
The results showed a sharp imbalance between benefit and harm. Students who received the misleading explanations scored far worse than those who got no explanations at all: accuracy fell from about one in five questions correct in the question-only group to less than one in ten in the misleading group. In contrast, students who saw the correct AI explanations did only slightly better than the question-only control group, and the difference was not statistically significant. In other words, polished but wrong guidance pushed students decisively in the wrong direction, while polished and correct guidance did not reliably lift their performance above the baseline of working alone.

Confident mistakes and the “plausibility trap”
The picture became even more troubling when the researchers looked at confidence. Any AI explanation, right or wrong, left students more sure of themselves than peers who worked without help. However, only the group with correct explanations showed healthy “calibration,” where confidence was higher for right answers than for wrong ones. In the misleading group, confidence stayed high whether students were correct or incorrect, meaning they could not use their own sense of certainty to tell good reasoning from bad. Detailed analyses showed that the deceptive explanations often funneled students toward a specific incorrect choice: in the misleading group, more than 70% of wrong answers were the very option the AI explanation had endorsed. Some explanations worked as “half-truths,” using accurate details to support a faulty conclusion that novices struggled to challenge.
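For readers who want to see what these two measures look like in practice, here is a minimal Python sketch. It is an illustration only, not the authors' analysis: the records are invented toy data, the 1-to-5 confidence scale is an assumption, and the function names `calibration_gap` and `endorsed_error_share` are hypothetical. This summary does not say which calibration statistic the study actually used.

```python
from statistics import mean

# Hypothetical toy responses, NOT data from the study.
# Each record: (chosen option, correct option, AI-endorsed wrong option,
#               confidence on an assumed 1-5 scale)
responses = [
    ("B", "A", "B", 4),
    ("A", "A", "B", 4),
    ("B", "C", "B", 5),
    ("D", "C", "B", 3),
    ("C", "C", "B", 4),
]

def calibration_gap(records):
    """Mean confidence on correct answers minus mean confidence on wrong ones.

    A clearly positive gap suggests confidence tracks correctness;
    a gap near zero suggests it does not.
    """
    right = [conf for chosen, correct, _, conf in records if chosen == correct]
    wrong = [conf for chosen, correct, _, conf in records if chosen != correct]
    return mean(right) - mean(wrong)

def endorsed_error_share(records):
    """Fraction of wrong answers that match the option the explanation endorsed."""
    wrong = [(chosen, endorsed)
             for chosen, correct, endorsed, _ in records
             if chosen != correct]
    return sum(chosen == endorsed for chosen, endorsed in wrong) / len(wrong)

print(calibration_gap(responses))
print(endorsed_error_share(responses))
```

In this toy set the gap is zero, so confidence carries no signal about correctness, and two of the three wrong answers land on the endorsed option. That is the pattern the misleading group showed in kind, though the numbers here are made up.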
Why this matters for medical training
These findings echo concerns about “automation bias,” where people rely too heavily on computer output instead of carefully checking information. In a knowledge-heavy field like medicine, the danger is not just a wrong answer—it is a wrong answer that feels fully justified. The study suggests that simply dropping conversational AI into students’ study routines as a friendly tutor is risky, especially when learners are too inexperienced to spot subtle flaws. The authors argue that medical schools should shift from treating AI as an all-knowing teacher to using it as material for structured “AI-auditing” drills. In these exercises, students would practice picking apart AI explanations, verifying claims against trusted sources, and learning to recognize the difference between fluent reasoning and truly sound reasoning.
What this means for future doctors and their tools
In plain terms, the study’s conclusion is stark: for novice medical students, bad AI explanations do more damage than good AI explanations do good. Misleading guidance not only lowers their chances of getting the right answer, it also leaves them wrongly confident in their mistakes. To protect future patients, educators and AI designers will need to build systems and curricula that slow students down, expose common AI failure patterns, and encourage critical checking instead of blind trust. The goal is not to reject AI, but to train the next generation of doctors to question it thoughtfully, so that smart tools become partners in safe care rather than sources of convincing misinformation.
Citation: Teng, D., Tan, L., Cao, Q. et al. Impact of AI misinformation on diagnostic accuracy and confidence calibration in novice medical students. npj Digit. Med. 9, 356 (2026). https://doi.org/10.1038/s41746-026-02547-z
Keywords: AI in medical education, misinformation, diagnostic reasoning, student confidence, automation bias