Clear Sky Science
Large language models in systematic review and meta-analysis of surgical treatments for vaginal vault prolapse
Why this matters for everyday health
As women live longer, pelvic organ prolapse, in which pelvic organs sag and cause pressure or bulging, has become increasingly common. Many women need surgery for this condition after a hysterectomy, but doctors still debate which operation works best and lasts the longest. At the same time, the volume of published medical research is growing so quickly that clinicians struggle to keep up. This study tackles both problems at once: it compares leading surgical options for vaginal vault prolapse and tests whether a modern artificial intelligence tool, a large language model, can safely help experts sift through the medical evidence.

Understanding the condition and the surgical choices
Vaginal vault prolapse occurs when the top of the vagina drops after the uterus has been removed, often bringing a sense of heaviness, a visible bulge, or trouble with bladder and bowel control. Surgeons can correct this in several ways. Sacrocolpopexy (SC) lifts and attaches the top of the vagina to a strong ligament in the lower spine, usually through the abdomen using open, keyhole, or robotic techniques. Sacrospinous fixation (SSF) anchors the vagina to a ligament inside the pelvis through the vagina. Transvaginal mesh (TVM) once offered added support using synthetic material placed through the vagina, but concerns over mesh complications led regulators in some countries, including the United States, to withdraw these products. Despite decades of use, no single approach has clearly emerged as the best for every woman.
How the researchers used both people and machines
The authors carried out a systematic review and meta-analysis, often called the “gold standard” for summarizing medical evidence. They focused on randomized controlled trials—studies that compare treatments in a rigorous, head-to-head fashion—of surgeries for post-hysterectomy vaginal vault prolapse. What makes their work unusual is that every step after the database search was done twice: once by human experts and once with help from ChatGPT, a large language model. The AI screened study titles and summaries, checked full articles against the rules for inclusion, pulled out detailed numbers on surgical results and complications, and even helped generate the statistical code and graphs, while clinicians double-checked all outputs.
What the clinical evidence shows about surgery
The review included 18 randomized trials involving 1,668 women, with follow-up ranging from one to nine years. Overall, SC provided durable support of the vaginal apex, and open and laparoscopic versions performed similarly. When SC was compared with SSF, there was a hint that SC might lead to fewer repeat operations for prolapse, but the difference did not reach statistical significance, and the number of trials was small. TVM often achieved better anatomic correction than SSF, especially at three years, but this gain came with a price: higher rates of mesh-related problems and repeat operations. Across all techniques, most women reported marked relief of symptoms and better quality of life. Some, however, had anatomic “failures” that did not cause bothersome symptoms, underscoring that success is not only what doctors see on exam but also what women feel day to day.
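For readers curious how a "hint" of benefit can fall short of statistical significance, the sketch below pools a risk ratio across a few small trials. It is an illustration only: the trial counts are invented, and the choice of a risk ratio with inverse-variance fixed-effect pooling is an assumption, not the method or data reported by the authors.

```python
import math


def log_risk_ratio(events_a, total_a, events_b, total_b):
    """Return (log risk ratio, standard error) for arm A versus arm B.

    Assumes every arm has at least one event; zero-event arms would need
    a continuity correction that this sketch does not implement.
    """
    log_rr = math.log((events_a / total_a) / (events_b / total_b))
    se = math.sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
    return log_rr, se


def pool_fixed_effect(trials, z=1.96):
    """Inverse-variance fixed-effect pooling.

    Returns (pooled RR, lower bound, upper bound) for a 95% interval.
    """
    weights = []
    weighted_effects = []
    for trial in trials:
        log_rr, se = log_risk_ratio(*trial)
        weight = 1 / se ** 2  # more precise trials count for more
        weights.append(weight)
        weighted_effects.append(weight * log_rr)
    pooled = sum(weighted_effects) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return (
        math.exp(pooled),
        math.exp(pooled - z * pooled_se),
        math.exp(pooled + z * pooled_se),
    )


# Hypothetical trials: (reoperations in arm A, women in arm A,
#                       reoperations in arm B, women in arm B)
trials = [(4, 60, 7, 60), (3, 50, 5, 52), (6, 80, 8, 78)]
rr, lower, upper = pool_fixed_effect(trials)
# The result is "significant" only if the interval excludes 1 (no difference).
significant = not (lower <= 1.0 <= upper)
print(f"pooled RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}, "
      f"significant: {significant}")
```

With these invented counts the pooled risk ratio falls below 1, favoring arm A, yet the interval still spans 1. Few trials with few events leave the estimate too imprecise to rule out no difference.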

How well the AI performed beside human reviewers
In the evidence review itself, the AI proved fast and surprisingly reliable, but not flawless. When screening titles and summaries, it agreed substantially with the human reviewer and correctly rejected most irrelevant articles, yet it missed nearly 30 percent of relevant trials—too many to trust without oversight. For full-text decisions, agreement rose above 94 percent, and for many kinds of data extraction, accuracy reached about 99 percent, sometimes even catching a human mistake. Risk-of-bias assessments, which judge how trustworthy each trial is, showed good overall agreement but revealed that both people and AI can struggle with subtle issues like missing outcomes or selective reporting. Importantly, every statistical result produced with AI assistance matched those from traditional analyses, supporting the technical soundness of the workflow.
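The screening figures above rest on a few standard measures, and a short sketch shows how they relate. The code below compares AI include/exclude decisions with a human reference and computes raw agreement, Cohen's kappa, sensitivity, and specificity. The decision lists are made up for illustration, and the function and class names are hypothetical. The study's own data and exact statistics are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class ScreeningMetrics:
    agreement: float    # share of records where both decisions match
    kappa: float        # Cohen's kappa: agreement corrected for chance
    sensitivity: float  # share of human-included records the AI also included
    specificity: float  # share of human-excluded records the AI also excluded


def compare_screening(human, ai):
    """Compare AI include/exclude decisions against a human reference."""
    if len(human) != len(ai) or not human:
        raise ValueError("decision lists must be non-empty and equal in length")

    n = len(human)
    pairs = list(zip(human, ai))
    tp = sum(1 for h, a in pairs if h and a)          # both include
    tn = sum(1 for h, a in pairs if not h and not a)  # both exclude
    fp = sum(1 for h, a in pairs if not h and a)      # AI includes, human excludes
    fn = sum(1 for h, a in pairs if h and not a)      # AI misses a relevant record

    observed = (tp + tn) / n
    # Agreement expected by chance if the two reviewers decided independently.
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    kappa = (observed - expected) / (1 - expected) if expected < 1 else 1.0

    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return ScreeningMetrics(observed, kappa, sensitivity, specificity)


# Made-up example: 10 relevant and 90 irrelevant records.
human = [True] * 10 + [False] * 90
ai = [True] * 7 + [False] * 3 + [True] * 5 + [False] * 85
metrics = compare_screening(human, ai)
print(metrics)
```

In this invented example the AI matches the human on 92 of 100 records yet still misses 3 of the 10 relevant ones. Because irrelevant articles dominate a typical search, high overall agreement can coexist with a substantial miss rate, which is why sensitivity is worth reporting separately.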
What this means for patients and future research
For women facing vaginal vault prolapse surgery, this study reinforces a few key points. Sacrocolpopexy remains a strong, durable option, whether done through an open or laparoscopic approach, and may modestly reduce the chance of future prolapse surgery compared with sacrospinous fixation, though firm proof is lacking. Transvaginal mesh can provide excellent anatomical support but brings higher risks of mesh-specific complications, which helps explain why its role has receded in some countries. Crucially, no single operation emerged as clearly superior across all outcomes. Choices should therefore be tailored, balancing durability, risk of complications, surgical access, and what matters most to each patient.
On the digital side, carefully supervised AI tools show real promise for speeding up and clarifying complex evidence reviews, but they are not ready to replace human judgment. Instead, a partnership between clinicians and AI may become an important way to keep surgical decisions aligned with the best available science.
Citation: Park, Y., Zhang, HS. & Bai, S.W. Large language models in systematic review and meta-analysis of surgical treatments for vaginal vault prolapse. npj Digit. Med. 9, 262 (2026). https://doi.org/10.1038/s41746-026-02431-w
Keywords: vaginal vault prolapse surgery, sacrocolpopexy, transvaginal mesh, systematic review, artificial intelligence in medicine