Clear Sky Science

CTRNet: a lightweight and efficient deep learning model for field maize whorl identification


Why spotting a hidden leaf matters

On a summer cornfield, some of the most damaging insects go straight for a plant’s “heart” – the tight spiral of leaves at the top called the whorl. These pests are small, the target they attack is even smaller, and farmers often must spray large areas just in case. This study introduces a new computer-vision system, CTRNet, designed to reliably find tiny maize whorls in messy real‑world fields so that crop monitoring and pesticide use can become far more precise and less wasteful.

The challenge of seeing a small target in a big field

For pest control, it is crucial to know exactly where the whorl is, because it is the main site where caterpillars lay eggs and feed, reducing photosynthesis and yield. But in real fields, whorls are hard to see: they look small in images, are often hidden by overlapping leaves, and appear against backgrounds full of weeds, soil, and shadows. Earlier approaches either relied on people visually inspecting plants or on simple image tricks based on color and texture. These methods worked only in clean, controlled scenes and quickly failed when lighting changed, leaves overlapped, or multiple plant problems appeared at once.

Deep learning steps into the field

Recent years have seen deep-learning detectors, especially those in the YOLO family, greatly improve the ability of machines to spot objects in images in real time. Several versions have been adapted to crops and leaves, but standard models still struggle with very small targets like maize whorls, and with the constant changes in light and leaf arrangement outdoors. They often lose fine detail as images are compressed through the network and can be distracted by cluttered backgrounds. The authors therefore build on a modern YOLO11 model and redesign key parts of the network to better capture small structures, share information across image scales, and ignore irrelevant background patterns.

Figure 1.

What makes CTRNet different

The proposed CTRNet (Contextual and Texture‑enhanced Representation Network) keeps the speed and compact size of YOLO11, but adds several specialized modules. One module encourages different layers of the network to exchange information, so that broad context and fine detail reinforce each other even when whorls are partly hidden. Another module is tuned to both coarse, slowly changing patterns and fine, high‑frequency details, helping the system preserve the edges and textures that mark the center of the whorl. A gated fusion stage then combines signals from multiple scales while dampening redundant or noisy features. Finally, an attention mechanism re‑weights the incoming image features so that bright patches, shadows, and complex backgrounds are suppressed before they can confuse the detector.
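The gated fusion idea can be sketched in a few lines. This is an illustrative NumPy toy with random weights, not the authors' actual module: a sigmoid gate decides, position by position, how much of the fine‑scale versus the coarse‑scale feature map to keep, so neither scale can drown out the other.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fuse(fine, coarse, gate_weights):
    """Blend a fine-scale feature map with an (already upsampled)
    coarse-scale one. A gate in (0, 1), computed from both inputs,
    picks per position how much of each scale survives, which is one
    simple way to dampen redundant or noisy responses."""
    stacked = np.concatenate([fine, coarse], axis=-1)  # (H, W, 2C)
    gate = sigmoid(stacked @ gate_weights)             # (H, W, C)
    return gate * fine + (1.0 - gate) * coarse         # (H, W, C)

rng = np.random.default_rng(0)
H, W, C = 8, 8, 4
fine = rng.normal(size=(H, W, C))       # stand-in for a detail-rich map
coarse = rng.normal(size=(H, W, C))     # stand-in for a context map
gate_w = rng.normal(size=(2 * C, C)) * 0.1  # random, untrained gate
fused = gated_fuse(fine, coarse, gate_w)
print(fused.shape)  # (8, 8, 4)
```

Because the gate is a sigmoid, every fused value is a convex blend of the two inputs, so the output stays in the range the two scales already span rather than amplifying noise.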

Putting the system to the test

To train and test CTRNet, the team assembled a dataset of 2,816 images from both public sources and their own field surveys, spanning growth stages from seedlings to mature plants. Photos were captured from the viewpoint and height typical of an agricultural robot’s camera, under a wide range of light conditions and field layouts. In head‑to‑head comparisons with several YOLO variants and a transformer‑based detector, CTRNet achieved the highest accuracy for identifying whorls, raising a standard detection score (mAP@0.5) from 81.6% to 84.7% while actually using fewer model parameters than the baseline. Visual comparisons showed that CTRNet focused more tightly on the true whorl region and produced fewer false highlights on surrounding leaves or soil, especially in low‑light, harsh‑sunlight, or heavily occluded scenes.
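The mAP@0.5 score counts a predicted box as correct only when it overlaps the true whorl box by at least 50% intersection‑over‑union (IoU). A minimal sketch of that matching criterion, with made‑up boxes purely for illustration:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# At the mAP@0.5 threshold, a prediction is a true positive only if
# it overlaps the ground-truth box with IoU >= 0.5.
print(iou((0, 0, 10, 10), (2, 0, 12, 10)))  # 2/3 -> counted as a hit
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # 1/3 -> counted as a miss
```

The full metric then averages precision over recall levels for all such matches; the snippet shows only the overlap test at its core.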

Figure 2.

Fast enough for robots in the rows

Beyond accuracy, the authors tested whether CTRNet could run on a small edge‑AI computer similar to what a field robot would carry. On an NVIDIA Jetson Orin Nano device, the model maintained real‑time frame rates, especially when combined with an optimized inference engine and half‑precision arithmetic. This means CTRNet can realistically guide sprayers or scouting robots that must react quickly as they move down crop rows, rather than relying on slow offline analysis.
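Half‑precision arithmetic trades a sliver of numerical accuracy for half the memory traffic, which is often what makes real‑time rates achievable on a small edge board. A generic NumPy illustration of that trade‑off (not the paper's actual inference pipeline):

```python
import numpy as np

# Casting weights from 32-bit to 16-bit floats halves their memory
# footprint and bandwidth cost -- frequently the main speed lever on
# edge devices such as a Jetson-class board.
w32 = np.random.default_rng(1).normal(size=(256, 256)).astype(np.float32)
w16 = w32.astype(np.float16)
print(w32.nbytes, w16.nbytes)  # float16 uses exactly half the bytes

# The price is precision: float16 keeps roughly three decimal digits,
# which is usually acceptable for inference (though not for training).
err = np.max(np.abs(w32 - w16.astype(np.float32)))
print(f"max rounding error: {err:.2e}")
```

In practice the conversion is handled by the inference engine rather than by hand, but the underlying bytes‑versus‑digits trade is the same.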

What this means for smarter pest control

In simple terms, CTRNet gives machines sharper “eyes” for a tiny but important part of the corn plant. By reliably spotting whorls despite shadows, glare, and leaf clutter, it enables more targeted monitoring of pest damage and more precise application of pesticides. The work shows that carefully designed lightweight deep‑learning models can not only match but surpass heavier systems in both speed and accuracy, opening the way to smarter, less wasteful crop protection tools and, potentially, similar systems for other crops and diseases.

Citation: Tian, X., Zhang, J. & Li, Y. CTRNet: a lightweight and efficient deep learning model for field maize whorl identification. Sci Rep 16, 10570 (2026). https://doi.org/10.1038/s41598-026-45727-3

Keywords: maize pest detection, crop computer vision, precision agriculture, lightweight deep learning, field robotics