Clear Sky Science · en

ChatTogoVar: a TogoVar-based retrieval-augmented generation system for precise genomic variant interpretation

Why smarter genetic answers matter

Genetic tests are becoming part of routine health care, but the raw results are hard to understand. Doctors and researchers need to know whether a tiny change in DNA is common and harmless or rare and linked to disease. Large language models, the same kind of AI that powers popular chatbots, can explain complex information in plain language, yet they sometimes sound confident while being wrong. This study introduces ChatTogoVar, a system that connects an AI chatbot to a trusted Japanese genetic database so it can give clearer, better-supported answers about human DNA changes.

Figure 1. How linking a genetic database to an AI helper can turn raw DNA codes into clearer answers for doctors and patients.

From raw DNA to useful answers

When a person has their genome analyzed, the result is a long list of small DNA differences called variants. On their own, these codes say little about health. Specialists rely on databases that track how often each variant appears in different populations, which genes it affects, and whether it has been linked to disease. The TogoVar database focuses on variants seen in people in Japan and pulls together information from many large studies and clinical resources. ChatTogoVar builds on this foundation, acting as a conversational layer that can answer natural language questions, such as whether a particular variant is connected to a disease or how common it is in certain groups.

How the new system works

ChatTogoVar follows a retrieval-augmented generation (RAG) approach. When a user asks a question about a specific variant, the system first detects the variant's identifier and queries the TogoVar application programming interface (API). TogoVar returns structured data describing the variant, including its position in the genome, the affected gene, observed frequencies in Japanese and other populations, predicted impact on the protein, and known clinical interpretations from sources such as ClinVar. ChatTogoVar then packs this information into a carefully designed prompt and sends it to an underlying language model. The model writes a readable answer that must cite the database evidence it used and state when no data are available.
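The flow described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes variants are named with dbSNP-style "rs" identifiers, and the functions `fetch_record` and `generate` are hypothetical stand-ins for the database lookup and the language model call, which the caller must supply.

```python
import re

# Assumption: variants are referred to by dbSNP-style IDs such as "rs123".
RS_ID = re.compile(r"\brs\d+\b", re.IGNORECASE)


def find_variant_id(question):
    """Return the first rs identifier in the question, or None."""
    match = RS_ID.search(question)
    return match.group(0).lower() if match else None


def build_prompt(question, record):
    """Pack retrieved evidence and the user's question into one prompt."""
    if record is None:
        evidence = "No database record was found for this variant."
    else:
        evidence = "\n".join(
            f"- {key}: {value}" for key, value in sorted(record.items())
        )
    return (
        "Answer using only the evidence below. Cite the fields you use, "
        "and say so when no data are available.\n\n"
        f"Evidence:\n{evidence}\n\nQuestion: {question}"
    )


def answer(question, fetch_record, generate):
    """Retrieve first, then generate.

    fetch_record and generate are hypothetical callables: one looks up a
    variant record by ID, the other sends a prompt to a language model.
    """
    variant_id = find_variant_id(question)
    record = fetch_record(variant_id) if variant_id else None
    return generate(build_prompt(question, record))
```

The key design point is the order of operations: the model only sees the question after the database evidence has been attached, and the prompt tells it explicitly what to do when the lookup comes back empty.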

Putting the system to the test

The authors compared ChatTogoVar with a general-purpose chatbot and with an existing variant-focused assistant called VarChat. They built 50 question types covering basic facts, population frequencies, disease links, drug response, functional impact, evolution, related variants, and available tools, then combined these with 30 real variants, creating 1500 question-and-variant pairs (50 × 30). Human experts manually scored answers from all three systems on a subset of 150 questions, judging accuracy, completeness, logic, clarity, and use of evidence. A separate large-scale evaluation used an AI-based scoring method on all 1500 questions to measure performance consistently across many variants and topics.
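The size of the test set follows directly from pairing every question type with every variant. The short sketch below reproduces that arithmetic with placeholder names. How the 150-question expert subset was chosen is not stated here, so the seeded random draw is purely an assumption for illustration.

```python
import itertools
import random

# Placeholder labels; the real question types and variants are not listed here.
question_types = [f"question_type_{i:02d}" for i in range(1, 51)]
variants = [f"variant_{i:02d}" for i in range(1, 31)]

# Every question type paired with every variant: 50 x 30 combinations.
pairs = list(itertools.product(question_types, variants))

# Assumption: draw the expert-review subset at random, without repeats.
rng = random.Random(0)
expert_subset = rng.sample(pairs, 150)

print(len(pairs), len(expert_subset))
```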

Figure 2. Step by step flow of genetic variant data into a database guided AI system that filters, evaluates, and refines more accurate answers.

What the comparisons revealed

Across nearly every question and scoring category, ChatTogoVar outperformed both the general chatbot and VarChat. In the expert review, it gave the best answer for 90 percent of questions, while the general chatbot came out on top for only a few. One telling example involved a variant with an established link to Parkinson disease. ChatTogoVar correctly identified the gene and disease and pointed to the relevant clinical record, whereas the general chatbot confused the variant with one in another gene and mentioned the wrong condition. The large AI-based evaluation, which covered ten times as many questions, showed the same pattern: grounding answers in current database records sharply reduced such mix-ups and unsupported claims.

Steps toward safer genomic advice

This work shows that pairing a chat-style AI with a curated genetic database can make variant explanations more accurate and better documented. ChatTogoVar does not replace expert judgment, and it is still limited by the coverage of the databases it uses, especially for areas like drug response and complex variant patterns. However, by highlighting what is known, what is uncertain, and where the supporting data come from, it offers a more dependable starting point for doctors, genetic counselors, and researchers who must interpret genomic test results in everyday practice.

Citation: Mitsuhashi, N., Fujiwara, T. & Yamaguchi, A. ChatTogoVar: a TogoVar-based retrieval-augmented generation system for precise genomic variant interpretation. Hum Genome Var 13, 12 (2026). https://doi.org/10.1038/s41439-026-00344-4

Keywords: genomic variants, retrieval-augmented generation, TogoVar, large language models, genomic medicine