Clear Sky Science

Spatial-frequency complementary fusion network for dehazing with multi-scale and attention modules

2026-04-09

Why clearing hazy photos matters

Anyone who has tried to photograph a foggy city skyline or a smoggy highway knows how haze can wash out colors and blur details. Beyond holiday snapshots, this loss of clarity also affects safety systems in cars, environmental monitoring, and remote sensing from aircraft and satellites. This paper presents a new way to digitally clear haze from a single image, aiming to recover sharp, natural-looking scenes that are more useful to both people and machines.

From simple tricks to learning from data

Early haze removal methods relied on clever hand-crafted rules, such as assuming that at least some parts of the scene contain very dark pixels or that colors follow certain patterns. These approaches can work well in simple cases but struggle when lighting, weather, or scene layout becomes complex. With the rise of deep learning, researchers began training neural networks to learn how clear and hazy images differ, allowing them to recover the clean version directly from examples. Most of these learning-based methods, however, work only in the spatial (pixel) domain, adjusting pixels and local patterns without fully exploiting how haze changes the image when it is viewed as a mix of low and high visual frequencies.

Figure 1. How a dual-view network turns a single hazy photo into a clearer, more natural-looking image.

Looking at haze in two different ways

The authors point out that haze does not just dim an image evenly. When the image is converted into frequency space, which separates broad smooth areas from fine textures and edges, hazy pictures show a clear loss of medium- and high-frequency content and a buildup of low-frequency energy. In simple terms, fine details like leaves and building edges fade, while the overall brightness and color cast become dominant. Standard methods that operate only on pixel neighborhoods have trouble directly correcting this frequency imbalance. The paper argues that a better dehazing system should work in both spaces at once: the everyday pixel view and the frequency view that highlights lost details.
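The frequency shift described above can be reproduced on a toy example. The sketch below is not taken from the paper: it assumes the common haze model in which each pixel is blended with a bright "airlight" value, and the names `add_haze`, `transmission`, `airlight`, and `high_freq_share` are illustrative choices. It uses a single row of pixels and a plain discrete Fourier transform so that it runs with the Python standard library alone.

```python
import cmath


def dft_energy(signal):
    """Squared magnitude of each discrete Fourier coefficient of a 1D signal."""
    n = len(signal)
    return [
        abs(sum(v * cmath.exp(-2j * cmath.pi * k * i / n)
                for i, v in enumerate(signal))) ** 2
        for k in range(n)
    ]


def high_freq_share(signal, cutoff=1):
    """Fraction of spectral energy above `cutoff` cycles per row."""
    energy = dft_energy(signal)
    n = len(energy)
    # Indices k and n - k describe the same frequency, so fold them together.
    high = sum(e for k, e in enumerate(energy) if min(k, n - k) > cutoff)
    return high / sum(energy)


def add_haze(signal, transmission=0.5, airlight=0.9):
    """Assumed haze model: blend every pixel toward a bright airlight value."""
    return [transmission * v + airlight * (1.0 - transmission) for v in signal]


# A strongly textured row of pixel intensities in [0, 1].
clear_row = [0.1, 0.9, 0.1, 0.9, 0.2, 0.8, 0.2, 0.8]
hazy_row = add_haze(clear_row)

print("high-frequency share, clear:", high_freq_share(clear_row))
print("high-frequency share, hazy: ", high_freq_share(hazy_row))
```

Under this assumed model the texture is scaled down while the average brightness rises, so the hazy row reports a smaller high-frequency share than the clear one. Real images would use a two-dimensional transform, but the effect is the same in kind.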

A network that fuses shapes and textures

To realize this idea, the authors design SFC-Net, a neural network that combines spatial and frequency information at every important stage. Its core feature enhancement block, called the spatial-frequency multi-scale module, splits features into several branches. One branch focuses on broad patterns using average statistics, another emphasizes strong responses using maximum values, and a third analyzes the image in frequency space to capture textures and structure that are easily weakened by haze. These branches are then fused so that the network can jointly reason about what should be bright, what should be sharp, and where subtle detail needs to be restored, leading to clearer and more realistic dehazed images.
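To make the three-branch idea concrete, here is a deliberately simplified sketch. It is an illustration only and not the authors' module: the real block operates on learned two-dimensional feature maps, whereas this version works on one row of numbers with fixed settings. The function names, the window size `k`, the spectral `gain`, and the equal fusion weights are all assumptions made for the example.

```python
import cmath


def windows(x, k):
    """Sliding windows of width k, clipped at both ends of the row."""
    r = k // 2
    return [x[max(0, i - r): i + r + 1] for i in range(len(x))]


def avg_branch(x, k=3):
    """Average statistics: broad, smooth patterns."""
    return [sum(w) / len(w) for w in windows(x, k)]


def max_branch(x, k=3):
    """Maximum statistics: the strongest local responses."""
    return [max(w) for w in windows(x, k)]


def dft(x, inverse=False):
    """Plain discrete Fourier transform (or its inverse) of a 1D sequence."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(v * cmath.exp(sign * 2j * cmath.pi * k * i / n)
               for i, v in enumerate(x)) for k in range(n)]
    return [v / n for v in out] if inverse else out


def freq_branch(x, gain=2.0):
    """Frequency view: amplify every non-constant frequency by a fixed gain.

    A trained network would learn these spectral weights; the fixed gain
    here is a stand-in.
    """
    spec = dft(x)
    spec = [c if k == 0 else c * gain for k, c in enumerate(spec)]
    return [v.real for v in dft(spec, inverse=True)]


def fuse(x, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Assumed fusion rule: a weighted sum of the three branch outputs."""
    branches = [avg_branch(x), max_branch(x), freq_branch(x)]
    return [sum(w * b[i] for w, b in zip(weights, branches))
            for i in range(len(x))]


row = [0.4, 0.6, 0.4, 0.6]
print(freq_branch(row))  # texture around the mean is stretched
print(fuse(row))
```

The frequency branch leaves the average level of the row untouched and widens the swings around it, which is the kind of detail recovery the pixel-only branches cannot provide on their own.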

Figure 2. How separating smooth regions and fine textures helps a network strip away haze and recover lost details.

Guided attention to the most useful clues

Beyond feature extraction, the network uses a spatial-frequency complementary attention module to decide which regions and which types of information deserve the most focus. This module first builds separate attention maps over image locations and over channels, then passes these enhanced features through a frequency transform, allowing the system to highlight frequency components that matter for haze removal while downplaying less useful ones. An adaptive gate balances these contributions so that the network can treat different scenes differently, for example giving more weight to fine textures in a leafy forest than in a smooth sky. Additional residual blocks and a careful upsampling head help preserve details and avoid artificial patterns as the network reconstructs the final clear image.
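The attention and gating steps can also be sketched in a few lines. This is a toy illustration under stated assumptions, not the paper's module: features are a short list of channels, each holding one row of values; attention weights come from simple averages passed through a sigmoid instead of learned layers; the frequency transform is left out; and `texture_gate`, with its `scale` and `offset` settings, is a hypothetical stand-in for the learned adaptive gate.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def channel_attention(features):
    """Scale each channel by a sigmoid of its mean activation."""
    weights = [sigmoid(sum(ch) / len(ch)) for ch in features]
    return [[w * v for v in ch] for w, ch in zip(weights, features)]


def spatial_attention(features):
    """Scale each location by a sigmoid of its mean across channels."""
    n_ch = len(features)
    weights = [sigmoid(sum(col) / n_ch) for col in zip(*features)]
    return [[w * v for w, v in zip(weights, ch)] for ch in features]


def texture_gate(row, scale=10.0, offset=0.1):
    """Hypothetical gate: closer to 1 when neighboring values change a lot."""
    diffs = [abs(b - a) for a, b in zip(row, row[1:])]
    score = sum(diffs) / len(diffs)
    return sigmoid(scale * (score - offset))


def gated_blend(spatial_feat, freq_feat, g):
    """Mix two feature rows: g weights the frequency view, 1 - g the spatial view."""
    return [g * f + (1.0 - g) * s for s, f in zip(spatial_feat, freq_feat)]


sky = [0.5, 0.5, 0.5, 0.5]      # smooth region
forest = [0.1, 0.9, 0.1, 0.9]   # heavily textured region

print("gate for sky:   ", texture_gate(sky))
print("gate for forest:", texture_gate(forest))
```

In this sketch the textured row receives a gate value near one and the smooth row a much lower one, mirroring the forest-versus-sky example in the text. In the actual network the gate is learned from data, not computed from a hand-written rule.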

How well the method works in practice

The researchers train and test SFC-Net on widely used synthetic and real-world haze datasets. They evaluate image quality using standard measures of signal-to-noise, structural similarity, and a no-reference score that estimates how natural an image looks without needing a clean ground truth. Across indoor and outdoor test sets, SFC-Net matches or exceeds recent advanced dehazing methods, particularly improving sharpness and color faithfulness in outdoor scenes. It also performs strongly on real photographs and on independent benchmarks that simulate real haze, and ablation studies show that each of the new modules contributes meaningfully to the final performance rather than just increasing model size.
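For readers curious about the signal-to-noise measure, a minimal sketch follows. It assumes the measure is peak signal-to-noise ratio (PSNR), the usual choice in dehazing work, although the summary above does not name it. Pixel values are assumed to lie between 0 and `peak`, and images are flattened to plain lists. Structural similarity and the no-reference score are more involved and are not shown.

```python
import math


def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in decibels between two equal-length pixel lists."""
    squared_errors = [(r - e) ** 2 for r, e in zip(reference, estimate)]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


clean = [0.0, 1.0, 0.5, 0.5]
dehazed = [0.1, 0.9, 0.5, 0.5]
print(psnr(clean, dehazed))
```

Higher values mean the restored image is numerically closer to the clean reference. Because the score depends only on pixel-by-pixel error, it is normally reported alongside perceptual measures such as structural similarity.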

Clearer views through smarter fusion

In everyday terms, this work shows that cleaning up hazy images benefits from looking at them in two complementary ways: as ordinary pictures and as patterns of smooth regions and fine details. By building a network that fuses these views and learns where to focus its efforts, the authors achieve crisper, more natural-looking results than many existing systems. The approach could help improve visibility for autonomous driving, surveillance, and environmental observation, offering clearer digital windows onto scenes that would otherwise be dull and washed out by haze.

Citation: Yan, C., Liu, G. Spatial-frequency complementary fusion network for dehazing with multi-scale and attention modules. Sci Rep 16, 16412 (2026). https://doi.org/10.1038/s41598-026-47027-2

Keywords: image dehazing, deep learning, computer vision, image enhancement, frequency domain