Clear Sky Science · en

High-Resolution Colony Images of Clinically Isolated Bacteria for Automated Detection and Deep Learning


Why tiny dots on a dish matter

When doctors test for infections, they often grow microbes from a patient sample on a nutrient plate and wait for tiny dots called colonies to appear. Each dot traces back to a single microbe, and its appearance and growth pattern help lab staff judge what is causing disease. But reading hundreds of plates by eye is slow and tiring. This study describes a large, carefully built collection of high-quality colony images designed to help computers learn to spot and count these dots quickly and consistently.

Figure 1. How standardized petri dish photos feed an automated system for spotting bacterial colonies at a glance.

From crowded lab bench to digital pictures

In hospital labs, staff routinely grow bacteria from blood, urine, or breathing passages. They then inspect the colonies for size, color, and shape to guide diagnosis and treatment. Traditional reading by eye is not only labor-intensive; it also varies from person to person, especially when many plates must be checked in a short time. As health systems test more patients and seek faster answers, there is a strong push to move from manual checking to digital tools that can handle large numbers of images to the same standard every time.

Building a cleaner window on bacteria

The authors set out to create a colony image collection that would avoid many flaws of older datasets, such as uneven lighting and small numbers of species. They gathered clinical strains from real patients over nearly two years, covering 19 important types of bacteria, including several that often resist antibiotics. For each species, they chose multiple distinct strains and grew them on solid media under closely controlled conditions, sometimes waiting 48 hours to let slow growers show their typical appearance. They then used a closed, light-proof imaging device with a fixed camera and two light sources, one shining through the plate and one reflecting off the top, to capture sharp, stable pictures of each dish.

Turning plates into numbers for computers

Once the images were captured, trained lab staff carefully drew boxes around every colony using labeling software. Two inspectors marked the colonies independently, and a third expert resolved the rare conflicts, so that the final markings closely matched the true colony edges. The team also flipped images horizontally or vertically to create extra examples, adjusting the markings to match. In the end, they produced 950 original images and their flipped versions, yielding 1,900 images with 118,442 marked colonies. The labels are stored in several common file types so that different artificial intelligence tools can use the same data with little extra work.
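The flipping step can be illustrated with a minimal sketch of how a box around a colony is remapped when its image is mirrored. The `Box` type, the `flip_box` function, and the pixel-corner convention are assumptions made for this illustration; the article does not say which labeling format or code the authors used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in pixel coordinates (hypothetical convention)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def flip_box(box: Box, width: float, height: float, horizontal: bool) -> Box:
    """Return the box as it appears after mirroring the whole image.

    horizontal=True mirrors left-right; horizontal=False mirrors top-bottom.
    """
    if horizontal:
        # x coordinates are reflected; the old right edge becomes the new left edge.
        return Box(width - box.x_max, box.y_min, width - box.x_min, box.y_max)
    # y coordinates are reflected; the old bottom edge becomes the new top edge.
    return Box(box.x_min, height - box.y_max, box.x_max, height - box.y_min)


# Example: one colony box on a 200 x 100 pixel image.
colony = Box(10, 20, 50, 60)
mirrored_lr = flip_box(colony, width=200, height=100, horizontal=True)
mirrored_tb = flip_box(colony, width=200, height=100, horizontal=False)
```

Here `mirrored_lr` is `Box(150, 20, 190, 60)` and `mirrored_tb` is `Box(10, 40, 50, 80)`. The box keeps its size in both cases, and applying the same flip twice returns the original box. The reported totals (950 originals, 1,900 images overall) are consistent with one flipped copy per original image.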

Figure 2. How detailed colony images pass through an AI model to separate different bacterial types into clear groups.

Putting the dataset to the test

To show what their dataset can support, the researchers trained a modern object detection model to find colonies on the plates and studied how different species appear in a compact map of image features. They used only the original images, splitting them into training and testing groups, and applied common augmentation techniques such as mixing images and masking parts of them during learning. On the held-out test plates, the model reached very high accuracy for most of the 19 species, with only a few problems on small or faint colonies that blur into the background. When they plotted how the model groups colonies in a two-dimensional map, each species formed a clear cluster, showing that the images carry strong, distinct visual cues that a computer can learn.
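A split of the original images into training and testing groups might look like the sketch below. The function name `split_by_species`, the 20% test fraction, the fixed seed, and the per-species grouping are all assumptions for illustration; the article does not report the ratio or the procedure the authors used.

```python
import random
from collections import defaultdict


def split_by_species(image_species, test_fraction=0.2, seed=0):
    """Split image ids into (train, test) so every species appears in both.

    image_species maps an image id to its species name.
    test_fraction and seed are illustrative defaults, not values from the paper.
    """
    by_species = defaultdict(list)
    for image_id, species in sorted(image_species.items()):
        by_species[species].append(image_id)

    rng = random.Random(seed)  # fixed seed makes the split reproducible
    train, test = [], []
    for species in sorted(by_species):
        ids = by_species[species]
        rng.shuffle(ids)
        # Hold out at least one image per species when there is more than one.
        n_test = max(1, round(len(ids) * test_fraction)) if len(ids) > 1 else 0
        test.extend(ids[:n_test])
        train.extend(ids[n_test:])
    return train, test


# Hypothetical catalogue: 950 image ids spread evenly over 19 placeholder species.
catalogue = {f"img_{i:03d}": f"species_{i % 19}" for i in range(950)}
train_ids, test_ids = split_by_species(catalogue)
```

With this made-up catalogue, the sketch puts 760 images in training and 190 in testing, with no image in both. Splitting only the originals, as the researchers did, also avoids a flipped copy of a training plate ending up in the test group.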

What this means for future lab work

This open dataset offers a clean, rich starting point for building and checking tools that can automatically detect and count colonies, and even tease apart subtle traits such as rings of red blood cell damage around some bacteria. While it focuses on single species grown alone rather than the messy mix of real patient samples, it captures the natural variety among strains and presents colonies under highly repeatable conditions. For a lay reader, the main takeaway is that by turning 1,900 well-lit petri dish images into a shared resource, this work makes it easier for many groups to design and compare smart systems that may one day help hospital labs deliver quicker and more reliable infection results.

Citation: Du, J., Yang, C., Sun, M. et al. High-Resolution Colony Images of Clinically Isolated Bacteria for Automated Detection and Deep Learning. Sci Data 13, 757 (2026). https://doi.org/10.1038/s41597-026-07095-5

Keywords: bacterial colonies, image dataset, automated detection, deep learning, clinical microbiology