Clear Sky Science

In silico typing maps the natural diversity of Escherichia coli transporter-dependent capsules

Why the sugar coat on bacteria matters

Many strains of Escherichia coli, a common gut bacterium and frequent cause of serious infections, wear a sugary capsule on their surface. This slippery coat helps them evade the immune system and survive in different hosts and environments. For decades, scientists struggled to classify and track these capsules because traditional lab tests were slow and often unreliable. This study shows how modern DNA analysis can map the full variety of these capsules, revealing overlooked types and helping to guide future vaccines and targeted therapies.

From test tubes to computer-based typing

Earlier work classified E. coli capsules using antibodies that recognized surface structures, a process known as serotyping. These tests were laborious and imprecise, and they were especially difficult for capsules, which can mimic human molecules and trigger weak immune responses. As a result, capsule typing largely faded out by the late twentieth century, and only a subset of known capsule types was well studied. Meanwhile, genome sequencing became cheap and common, but there was no complete reference linking capsule DNA to known capsule types. This gap meant researchers could not reliably recognize new capsule variants or understand how they were distributed across patients, animals and the environment.

Figure 1. How DNA sequencing reveals the hidden variety of sugar coats on E. coli across people, animals and environments.

Building a genetic atlas of E. coli capsules

The authors focused on a major group of E. coli capsules that depend on a molecular transport system to move the sugar coat to the cell surface. First, they sequenced a historic reference collection of strains whose capsules had already been defined by classical methods. By matching capsule structures to their underlying DNA, they created a clean map from genotype to serotype for 35 established transporter-dependent capsules, which they refined to 30 genetically distinct types. Next, they combed through more than 37,000 publicly available E. coli genomes. Using a key capsule gene as a landmark, they extracted the surrounding DNA regions and grouped them into unique capsule loci based on shared gene content.
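The landmark-and-group idea can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the gene labels (`LANDMARK`, `core1`, `famX` and so on), the fixed flank size and the rule that two loci are "the same" only when their gene sets match exactly are all assumptions made for illustration.

```python
from collections import defaultdict

def extract_neighbourhood(contig_genes, landmark, flank=2):
    """Return the genes within `flank` positions of the landmark gene.

    `contig_genes` is an ordered list of gene-family labels on one contig.
    Returns None when the landmark is absent (no candidate capsule locus).
    """
    if landmark not in contig_genes:
        return None
    i = contig_genes.index(landmark)
    return contig_genes[max(0, i - flank): i + flank + 1]

def group_loci_by_gene_content(loci):
    """Group genomes whose loci carry exactly the same set of gene families.

    Gene order is ignored, so inverted or rearranged loci fall together.
    """
    groups = defaultdict(list)
    for genome_id, genes in loci.items():
        groups[frozenset(genes)].append(genome_id)
    # Deterministic order: largest groups first, then alphabetical.
    return sorted((sorted(ids) for ids in groups.values()),
                  key=lambda ids: (-len(ids), ids))

# Hypothetical toy contigs; all labels are placeholders, not real gene names.
contigs = {
    "genome_A": ["hk1", "famX", "famY", "LANDMARK", "core1", "core2", "hk2"],
    "genome_B": ["hk3", "core2", "core1", "LANDMARK", "famY", "famX", "hk4"],
    "genome_C": ["famZ", "famX", "LANDMARK", "core1", "core2"],
    "genome_D": ["hk1", "hk2"],  # no landmark gene, so no locus is extracted
}

loci = {}
for genome_id, genes in contigs.items():
    region = extract_neighbourhood(genes, "LANDMARK")
    if region is not None:
        loci[genome_id] = region

unique_loci = group_loci_by_gene_content(loci)
print(unique_loci)  # [['genome_A', 'genome_B'], ['genome_C']]
```

In this toy input, genome_A and genome_B carry the same genes in a different order and so collapse into one locus, while genome_C forms its own. A real analysis would need a similarity threshold rather than exact set equality, since gene calls and family assignments are noisy.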

Discovering new capsule families and functions

This large-scale search uncovered 85 distinct transporter-dependent capsule types, including 55 that were not part of the original reference collection. By analyzing the shared core genes that build and export the capsule, the team sorted these loci into four genetic lineages and even identified a previously unrecognized subgroup. To understand what structures these capsules might form, they combined domain searches, protein structure prediction and comparisons with known enzyme families. This approach allowed them to assign likely functions to over 90 percent of capsule-specific genes. In some cases, they used mass spectrometry on purified capsules to resolve mismatches between predicted genes and older chemical descriptions, updating the proposed structure for certain capsule types.

A new tool to read capsule types from genomes

With this catalogue in hand, the researchers developed kTYPr, a software tool that reads genome sequences and predicts capsule type. Instead of relying on simple sequence matches, kTYPr uses hidden Markov models, which capture patterns within protein families and tolerate natural variation. The tool first checks for the presence of the core capsule genes, then evaluates which specific set of capsule enzymes best fits the genome. This strategy can distinguish closely related capsules, recognize rearranged gene clusters and handle incomplete genomes assembled from metagenomic samples.
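The two-step logic described here (core genes first, then best-fitting enzyme set) can be sketched in a few lines of Python. This is an illustrative sketch only and does not reproduce kTYPr's code, models or scoring: it assumes the hidden Markov model search has already been run and reduced to a set of matched family names, and the family labels, type profiles, Jaccard scoring and thresholds are all hypothetical.

```python
# Hypothetical core gene families shared by all capsule loci in this sketch.
CORE_GENES = {"core1", "core2", "core3"}

# Hypothetical capsule-specific gene sets, one per capsule type.
TYPE_PROFILES = {
    "type_A": {"famX", "famY"},
    "type_B": {"famX", "famY", "famW"},  # closely related to type_A
    "type_C": {"famZ"},
}

def predict_capsule_type(hits, min_core_fraction=1.0, min_score=0.5):
    """Predict a capsule type from the set of gene families found in a genome.

    Step 1: require enough core genes to be present.
    Step 2: score each type by Jaccard similarity between its profile and
            the non-core hits, and return the best-scoring type.
    Returns (type_name or None, score).
    """
    core_fraction = len(CORE_GENES & hits) / len(CORE_GENES)
    if core_fraction < min_core_fraction:
        return None, 0.0

    specific_hits = hits - CORE_GENES
    best_type, best_score = None, 0.0
    for name, profile in sorted(TYPE_PROFILES.items()):
        union = profile | specific_hits
        score = len(profile & specific_hits) / len(union) if union else 0.0
        if score > best_score:
            best_type, best_score = name, score

    if best_score < min_score:
        return None, best_score
    return best_type, best_score

# Toy examples with made-up hit sets.
complete_b = CORE_GENES | {"famX", "famY", "famW"}
complete_a = CORE_GENES | {"famX", "famY"}
no_core = {"famX", "famY"}

print(predict_capsule_type(complete_b))
print(predict_capsule_type(complete_a))
print(predict_capsule_type(no_core))
```

Because the score penalizes both missing and extra genes, the sketch separates type_A from type_B even though they share two families. Lowering `min_core_fraction` is one way such a scheme could tolerate fragmented assemblies, at the cost of more ambiguous calls.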

Figure 2. How a stepwise genome-matching process reads E. coli capsule genes to sort bacteria into different capsule types.

Capsule diversity across hosts, habitats and disease

The team applied kTYPr to more than 24,000 carefully curated E. coli genomes from humans, animals, food and environmental sources, as well as nearly 3,000 genome fragments reconstructed from the stool of healthy people. They found that about a quarter of all genomes carried a complete transporter-dependent capsule locus, with such capsules especially common in strains from humans, pets and human-linked environments. New, previously uncharacterized capsule types were enriched in understudied settings such as wild animals, livestock and food. In humans, the same capsule types appeared both in healthy gut communities and in strains causing urinary tract infections, bloodstream infections and meningitis, although some capsule types were more strongly linked to invasive disease than others.
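Prevalence figures of this kind come down to a per-source tally. The Python sketch below shows that arithmetic under stated assumptions: the source categories and records are invented toy data, not results from the study, and the function name is hypothetical.

```python
from collections import Counter

def capsule_prevalence(records):
    """Return the fraction of genomes with a complete capsule locus per source.

    `records` is an iterable of (source, has_complete_locus) pairs.
    """
    totals, positives = Counter(), Counter()
    for source, has_complete_locus in records:
        totals[source] += 1
        if has_complete_locus:
            positives[source] += 1
    return {source: positives[source] / totals[source] for source in totals}

# Invented toy records for illustration only.
records = [
    ("human", True), ("human", False), ("human", True), ("human", False),
    ("livestock", False), ("livestock", True),
    ("food", False),
]

prevalence = capsule_prevalence(records)
for source, fraction in sorted(prevalence.items()):
    print(f"{source}: {fraction:.2f}")
```

Comparing such fractions across sources is only meaningful when the underlying collection is curated to limit sampling bias, which is why the curation step mentioned above matters.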

What this means for infection control and prevention

By drawing a detailed map from capsule genes to capsule types and wrapping it into user-friendly software, this study turns the once obscure sugar coat of E. coli into something that can be routinely tracked in genome data. The work reveals far more capsule diversity than previously recognized and shows that many disease-associated capsule types are also common in the healthy gut, where they may act as quiet colonizers that sometimes cause severe infections. This new genetic atlas and toolset will help researchers study how capsules shape E. coli ecology, how they interact with the immune system and phages, and how they might be targeted more precisely by future vaccines and therapies.

Citation: Miravet-Verde, S., Cacace, E., Mores, C.R. et al. In silico typing maps the natural diversity of Escherichia coli transporter-dependent capsules. Nat Microbiol 11, 1217–1232 (2026). https://doi.org/10.1038/s41564-026-02323-5

Keywords: Escherichia coli, bacterial capsules, genome typing, microbial diversity, vaccine targets