Clear Sky Science · en

8,266 SARS-CoV-2 Genomic Assemblies from Asymptomatic Carriers in Japan


Why Hidden Infections Matter

During the COVID-19 pandemic, much attention focused on people who felt sick, but many others carried the virus without any symptoms at all. These silent infections can quietly fuel new waves of illness. This study describes one of the largest efforts in Japan to track the genetic makeup of the coronavirus in people who felt healthy when they were tested. By turning more than eight thousand virus samples from these individuals into detailed genetic blueprints, the researchers created a public resource that can help scientists worldwide better understand how the virus spreads and changes.

Figure 1.

A Nationwide Testing Effort

The work centers on the SB Coronavirus Inspection Center, a large screening service in Japan that mainly tested people without symptoms, often as part of workplace programs or local government campaigns. Between July 2020 and January 2023, the center processed over 4.5 million saliva samples. Of those, 18,475 tested positive for the coronavirus, giving an overall positivity rate of just 0.40 percent. These numbers rose and fell in step with Japan’s eight major pandemic waves, from early strains through Alpha, Delta, and multiple Omicron sublineages, showing that silent infections surged alongside the better-known symptomatic cases.
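As a rough check on the positivity figure, the arithmetic can be sketched in a few lines of Python. This is only an illustration, not the authors' calculation: it uses the exact count of 18,475 positives that the article reports, and it treats 4.5 million as a lower bound because the exact number of samples is not stated here. The variable and function names are assumptions made for this sketch.

```python
# Figures quoted in the article. The exact sample total is not given
# ("over 4.5 million"), so 4.5 million is treated as a lower bound.
positives = 18_475          # positive results reported in the article
min_samples = 4_500_000     # assumed lower bound on samples processed
reported_rate_pct = 0.40    # positivity rate stated in the article


def positivity_pct(pos: int, total: float) -> float:
    """Return the share of positive tests as a percentage."""
    return 100.0 * pos / total


# Rate if exactly 4.5 million samples had been processed (an upper bound,
# since the true total is larger than 4.5 million).
upper_bound_pct = positivity_pct(positives, min_samples)

# Sample total that would make the stated 0.40% rate exact.
implied_total = positives / (reported_rate_pct / 100.0)

print(f"Rate at exactly 4.5 million samples: {upper_bound_pct:.2f}%")
print(f"Sample total implied by a 0.40% rate: about {implied_total:,.0f}")
```

Under these assumptions, the stated rate is consistent with a sample total somewhat above 4.5 million, which matches the article's wording of "over 4.5 million."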

Who the Silent Carriers Were

Positive cases came from 45 of Japan’s prefectures, with especially large numbers from Shiga, Tokyo, Nagasaki, Osaka, and Hokkaido. Among the 18,475 people who tested positive, roughly one-third were men, one-third women, and for the remaining third gender was not recorded. For those with age information, the median age was 36 years, suggesting that working-age adults were strongly represented in this quietly infected group. Because individuals could opt out of research use, only anonymized samples and limited background data were passed to the research team, protecting personal privacy while still allowing broad patterns to be studied.

Turning Saliva into Genetic Blueprints

From the positive saliva samples, the researchers extracted viral genetic material and sequenced it using a standardized commercial test on Illumina machines. They focused on samples most likely to yield clear data, especially those with stronger virus signals. In total, they attempted sequencing on 14,201 of the 18,475 positive samples and successfully assembled 8,266 near-complete virus genomes. These genomes were sorted into two quality groups based on how long the sequence was and how much of it was confidently read, with almost 3,000 high-coverage genomes reserved for more detailed analyses of virus lineages and mutations.
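The sample counts in this section form a simple funnel, and the share retained at each step can be worked out directly. The sketch below uses only the three counts quoted above; the names `attempt_rate`, `success_rate`, and `overall_yield` are labels chosen for this illustration, not terms from the study.

```python
# Counts quoted in the article.
positives = 18_475   # saliva samples that tested positive
attempted = 14_201   # positive samples on which sequencing was attempted
assembled = 8_266    # near-complete genomes successfully assembled


def pct(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


attempt_rate = pct(attempted, positives)    # share of positives sequenced
success_rate = pct(assembled, attempted)    # share of attempts assembled
overall_yield = pct(assembled, positives)   # share of positives with a genome

print(f"Sequencing attempted on {attempt_rate:.1f}% of positive samples")
print(f"Assembly succeeded for {success_rate:.1f}% of attempted samples")
print(f"Overall, {overall_yield:.1f}% of positive samples yielded a genome")
```

In other words, roughly three-quarters of the positive samples were sequenced, a little under 60 percent of those attempts produced a near-complete genome, and somewhat under half of all positive samples ended up in the final dataset.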

Figure 2.

Checking Quality Against National Data

To make sure their sequences were trustworthy, the team compared the mix of virus variants in their dataset with hundreds of thousands of genomes from across Japan stored in a global database. The same succession of variants (early strains, then Alpha, then Delta, followed by Omicron waves) appeared in both, suggesting that the asymptomatic samples captured the broader national picture. The researchers also zoomed in on two specific changes in the virus’s nucleocapsid protein, known to affect how easily the virus spreads and how severe disease can be. They found that the frequency of these mutations in asymptomatic carriers was similar to that in symptomatic patients, and that their own dataset had fewer missing calls at these sites, indicating solid technical performance.

A Resource for Future Pandemic Insights

All raw sequencing reads and assembled genomes from this project have been deposited in major international databases, where researchers can access them freely under standard data-use terms. While the study does not claim to explain why some people stay symptom-free, it provides a high-quality foundation for others to explore such questions. Scientists can now search these genomes for mutation patterns linked to milder infections, train computer models to predict viral behavior, or refine estimates of how silent spread shapes future waves. For the public, the key takeaway is that behind every headline about case counts lies a large, often invisible pool of infections, and that carefully tracking the virus’s genetic code, even in people who are not sick, is crucial for staying ahead of the next phase of the pandemic.

Citation: Ohyanagi, H., Takeuchi, J.S., Kawanishi, Y. et al. 8,266 SARS-CoV-2 Genomic Assemblies from Asymptomatic Carriers in Japan. Sci Data 13, 512 (2026). https://doi.org/10.1038/s41597-026-06871-7

Keywords: asymptomatic COVID-19, SARS-CoV-2 genomics, Japan surveillance, viral variants, genome dataset