Clear Sky Science

MFDH-Net: defect detection network for multi-level feature fusion and cross-sensing decoupling head

Why tiny flaws matter in modern factories

From razor-thin steel sheets to densely packed circuit boards and glossy car body panels, today’s factories depend on surfaces that are almost perfect. Even a hairline crack or pinprick of corrosion can shorten a product’s life, trigger recalls, or halt a production line. For years, workers have stared at fast‑moving parts trying to spot such flaws by eye. This paper describes MFDH‑Net, a new artificial‑intelligence system that automatically finds hard‑to‑see defects on industrial surfaces, aiming to make inspection faster, more reliable, and easier to scale.

The challenge of spotting subtle defects

Industrial defects are deceptive. Scratches, pits, and stains can resemble ordinary texture or lighting changes; some flaws are tiny, while others span large areas; and many appear against busy, noisy backgrounds. Traditional computer vision systems struggle when different types of defects look very similar, when flaws are small and faint, or when objects in an image come in many sizes. The authors focus on surfaces such as steel plates, printed circuit boards, and automobile body parts, where these issues are especially severe. Their goal is to design a detector that can separate “normal” patterns from truly abnormal ones, even when the differences are subtle and occur across a wide range of scales.

Figure 1.

Looking close and far at the same time

MFDH‑Net starts with a new backbone called the Dual‑domain Feature Extraction Network. It is built to look at each image in two complementary ways. One branch, inspired by classic convolutional neural networks, zooms in on fine local details such as tiny edges and textures. The other branch, inspired by Transformer models, captures long‑range relationships across the entire image, helping the system understand the broader context around a suspected flaw. These two views are not kept separate: the network repeatedly lets local and global features interact, so that a small scratch is judged not only by its immediate pixels but also by how it contrasts with the overall surface pattern.
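To make the two views concrete, here is a minimal sketch on a single row of pixel intensities. It is not the authors' Dual-domain Feature Extraction Network: the function names (`local_view`, `global_view`, `fuse_views`) and the blending weight are hypothetical, the local branch is reduced to a comparison with immediate neighbours, and the global branch uses a plain whole-row average as a crude stand-in for Transformer attention.

```python
def local_view(row):
    """Convolution-like view: how far each pixel sits from the mean of its two neighbours."""
    last = len(row) - 1
    return [
        abs(v - 0.5 * (row[max(i - 1, 0)] + row[min(i + 1, last)]))
        for i, v in enumerate(row)
    ]


def global_view(row):
    """Crude stand-in for attention: how far each pixel sits from the whole-row average."""
    mean = sum(row) / len(row)
    return [abs(v - mean) for v in row]


def fuse_views(row, local_weight=0.5):
    """Blend the two views into one score per pixel (the weight is an assumed value)."""
    local = local_view(row)
    context = global_view(row)
    return [
        local_weight * l + (1.0 - local_weight) * g
        for l, g in zip(local, context)
    ]


# A flat surface with one brighter pixel standing in for a scratch.
row = [10, 10, 10, 13, 10, 10, 10, 10]
scores = fuse_views(row)
suspect = scores.index(max(scores))  # position with the highest fused score
```

In this toy case the scratch pixel scores highest because it stands out in both views, while its neighbours stand out only locally. A real backbone would learn both branches and exchange features between them at several depths.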

Weaving information across scales and positions

After extracting features, the model must reconcile information from small, medium, and large structures. The authors introduce a Multilevel Feature Aggregation Network that passes signals up and down between layers rather than in a single direction. This design encourages deep interaction between fine‑grained details and high‑level patterns, with adaptive weights that tell the model how much to trust each scale. A further component, the Spatial Semantic Fusion Module, aligns features from different resolutions so that a region denoting a scratch in one layer lines up exactly with the same region in another. This careful alignment helps prevent confusion, such as one layer calling an area a defect while another calls it background.
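The idea of aligning resolutions and then blending scales with adaptive weights can be sketched as follows. This is an illustration under stated assumptions, not the paper's Multilevel Feature Aggregation Network or Spatial Semantic Fusion Module: alignment is done by plain nearest-neighbour upsampling, the weighting borrows the normalized-fusion rule popularized by BiFPN, and the weights are fixed by hand here, whereas a real network would learn them. All function names are hypothetical.

```python
def upsample_nearest(grid, factor):
    """Repeat every cell factor x factor times so a coarse map lines up with a finer one."""
    out = []
    for row in grid:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def normalized_weights(raw, eps=1e-4):
    """Clamp raw weights at zero and rescale them so they sum to almost one."""
    positive = [max(w, 0.0) for w in raw]
    total = sum(positive) + eps
    return [w / total for w in positive]


def fuse(maps, raw_weights):
    """Weighted per-cell sum of feature maps that already share one resolution."""
    weights = normalized_weights(raw_weights)
    height, width = len(maps[0]), len(maps[0][0])
    assert all(len(m) == height and len(m[0]) == width for m in maps)
    return [
        [sum(wt * m[r][c] for wt, m in zip(weights, maps)) for c in range(width)]
        for r in range(height)
    ]


# Fine map: one sharp response. Coarse map: the same region, at half resolution.
fine = [[0.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0]]
coarse = [[1.0, 0.0],
          [0.0, 0.0]]

aligned = upsample_nearest(coarse, 2)            # now 4 x 4, like `fine`
fused = fuse([fine, aligned], raw_weights=[2.0, 1.0])
```

After fusion, the cell where both scales agree carries the strongest response, cells supported only by the coarse map carry a weaker one, and cells neither scale flags stay at zero.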

Figure 2.

Specialized heads for “what” and “where”

Identifying a defect involves two intertwined questions: what kind of flaw is it, and where exactly is it located? MFDH‑Net tackles this with a Cross‑aware Decoupling Head that splits the processing into branches tuned to classification (the “what”) and precise localization (the “where”). A cross‑perception attention mechanism further emphasizes small or faint defects by re‑weighting spatial regions and feature channels that are likely to contain flaws, while downplaying background clutter. This is particularly important for tiny imperfections on circuit boards or car panels, which might otherwise be lost amid complex textures and reflections.
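Re-weighting channels and spatial regions can be illustrated with a small sketch. The article does not spell out the cross-perception attention mechanism, so the code below substitutes a simplified scheme in the spirit of squeeze-and-excitation and CBAM-style gating, with no learned layers. The function names and the toy feature values are assumptions made for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_gates(features):
    """One gate per channel, computed from that channel's global average."""
    gates = []
    for channel in features:
        values = [v for row in channel for v in row]
        gates.append(sigmoid(sum(values) / len(values)))
    return gates


def spatial_gates(features):
    """One gate per location, computed from the mean response across channels."""
    n = len(features)
    height, width = len(features[0]), len(features[0][0])
    return [
        [sigmoid(sum(ch[r][c] for ch in features) / n) for c in range(width)]
        for r in range(height)
    ]


def reweight(features):
    """Scale every value by its channel gate and by its location gate."""
    cg = channel_gates(features)
    sg = spatial_gates(features)
    return [
        [[v * gate * sg[r][c] for c, v in enumerate(row)] for r, row in enumerate(ch)]
        for gate, ch in zip(cg, features)
    ]


# Channel 0 holds one strong defect-like response; channel 1 holds uniform clutter.
features = [
    [[0.0, 4.0], [0.0, 0.0]],
    [[0.2, 0.2], [0.2, 0.2]],
]
out = reweight(features)

contrast_before = features[0][0][1] / features[1][0][0]
contrast_after = out[0][0][1] / out[1][0][0]
```

In this toy input the gap between the defect response and the background clutter widens after re-weighting, which is the qualitative effect the paragraph describes. In a decoupled head, the classification and localization branches would each consume such re-weighted features with their own layers.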

How well does the system perform?

The researchers tested MFDH‑Net on several demanding public and real‑world datasets: steel surfaces, printed circuit boards, a multi‑type steel defect set, and automobile body parts collected from a production line. Across these, the network's accuracy in correctly identifying and localizing defects often exceeded 94%, while it still ran at real‑time speeds of around 52 frames per second. Ablation studies, in which individual components are removed one at a time, show that each piece of the design, from dual‑domain feature extraction to multi‑level fusion and the specialized detection head, contributes measurable gains. Compared with a range of popular detectors, including classic convolutional models and newer hybrid and Transformer‑based systems, MFDH‑Net consistently delivered a better balance of accuracy and speed.
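Two of these numbers can be unpacked with a short sketch. The article does not say which accuracy metric the paper reports, so the matching rule below follows a common object-detection convention rather than the authors' protocol: a prediction counts as correct when its class label matches and its box overlaps the ground truth by an intersection-over-union (IoU) of at least 0.5. The threshold, the function names, and the box format `(x1, y1, x2, y2)` are assumptions.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_correct(pred_label, pred_box, true_label, true_box, threshold=0.5):
    """A detection must get both the 'what' (label) and the 'where' (overlap) right."""
    return pred_label == true_label and iou(pred_box, true_box) >= threshold


def frame_budget_ms(fps):
    """Time available per image, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


hit = is_correct("scratch", (0, 0, 10, 10), "scratch", (1, 1, 10, 10))
wrong_class = is_correct("pit", (0, 0, 10, 10), "scratch", (1, 1, 10, 10))
budget = frame_budget_ms(52)  # roughly 19 ms per image
```

At about 52 frames per second, the whole pipeline has a little over 19 milliseconds to process each image, which is the practical meaning of "real-time" in this context.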

What this means for smart manufacturing

For non‑experts, the main takeaway is that MFDH‑Net offers a more reliable, automated way to spot minute defects that human inspectors might miss, without slowing down production. By combining close‑up detail analysis with a wide‑angle view of each surface, and by carefully knitting together information across scales and tasks, the system can flag flaws on diverse products with high confidence. While the approach still depends on labeled training data, which can be costly to obtain, it points toward future inspection systems that adapt quickly to new factories and products. In short, the work brings industry closer to surface quality checks that are as rigorous as a human expert’s eye, but faster, more consistent, and easier to deploy at scale.

Citation: Zhang, L., Yang, Z., Ma, Y. et al. MFDH-Net: defect detection network for multi-level feature fusion and cross-sensing decoupling head. Sci Rep 16, 9750 (2026). https://doi.org/10.1038/s41598-026-40568-6

Keywords: industrial defect detection, computer vision, deep learning, quality inspection, intelligent manufacturing