Clear Sky Science

Soft actor critic-based performance optimization for IRS-aided cognitive radio systems


Smarter Airwaves for a Crowded Wireless World

Our phones, sensors, and smart homes are all competing for the same invisible resource: radio waves. As more devices connect, squeezing extra performance out of limited spectrum becomes vital. This paper explores a new way to boost the data rates of low‑priority users without harming high‑priority ones, by combining “smart walls” that bend radio waves with an artificial‑intelligence learning method that teaches the network how to configure itself.

Figure 1.

Sharing Without Shouting Over the Neighbors

Modern wireless systems often use a “primary” and “secondary” user model. Primary users, such as licensed services, have priority access to certain frequencies. Secondary users are allowed to reuse the same channels only if they keep their interference under strict limits. This is the core idea behind cognitive radio: radios that sense their environment and adapt so that spectrum is used more efficiently. The challenge is to give secondary users good data rates while staying almost invisible to primary users. Traditional approaches rely on clever signal processing at the base station alone, which quickly becomes complex as networks get denser and more antennas are added.
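The interference limit can be made concrete with a minimal sketch. It assumes a single-antenna secondary link with flat power gains and a Shannon-rate model. The function names, the numeric values, and the simple "cap the power" strategy are illustrative assumptions, not the system model used in the paper.

```python
import math


def max_secondary_power(p_max, gain_to_primary, interference_limit):
    """Largest transmit power (W) whose interference at the primary
    receiver, power * gain_to_primary, stays within interference_limit."""
    if gain_to_primary <= 0.0:
        return p_max  # no leakage toward the primary user
    return min(p_max, interference_limit / gain_to_primary)


def secondary_rate(power, gain_to_secondary, noise_power):
    """Shannon spectral efficiency of the secondary link in bit/s/Hz."""
    return math.log2(1.0 + power * gain_to_secondary / noise_power)


# Hypothetical numbers chosen only for illustration.
power = max_secondary_power(p_max=1.0, gain_to_primary=1e-3,
                            interference_limit=1e-4)
rate = secondary_rate(power, gain_to_secondary=1e-2, noise_power=1e-4)
```

In this toy case the interference limit, not the power budget, is what caps the secondary user's rate. That is the situation in which reshaping the propagation paths becomes attractive.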

Bending Signals with Smart Reflecting Surfaces

The authors add a powerful new tool to the mix: intelligent reflecting surfaces. These are thin panels made up of many tiny passive elements that can adjust how they reflect incoming radio waves, like a wall of controllable mirrors for wireless signals. By carefully choosing the reflection pattern, the surface can steer energy toward the intended secondary user and away from primary receivers, improving performance without spending extra transmit power. The paper analyzes a system where a base station with many antennas serves secondary users, while several reflecting panels help shape the paths that signals take through the environment, under realistic millimeter‑wave propagation conditions.
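A small sketch can show why the reflection pattern matters. It assumes a single-antenna transmitter, one user, one surface, and randomly drawn complex channel coefficients. It only maximizes the signal at the intended user and ignores the primary receivers and the millimeter-wave channel model that the paper considers. All names here are hypothetical.

```python
import cmath
import random


def effective_channel(h_direct, h_in, h_out, phases):
    """Direct path plus every path reflected by one surface element."""
    reflected = sum(o * cmath.exp(1j * t) * i
                    for i, o, t in zip(h_in, h_out, phases))
    return h_direct + reflected


def align_phases(h_direct, h_in, h_out):
    """Phase shift per element that rotates its reflected path onto the
    direct path, so all contributions add coherently at the user."""
    target = cmath.phase(h_direct)
    return [target - cmath.phase(i * o) for i, o in zip(h_in, h_out)]


def random_gain(rng):
    """Complex Gaussian coefficient, a stand-in for a real channel model."""
    return complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))


rng = random.Random(0)
n_elements = 32
h_direct = random_gain(rng)
h_in = [random_gain(rng) for _ in range(n_elements)]   # transmitter -> surface
h_out = [random_gain(rng) for _ in range(n_elements)]  # surface -> user

random_phases = [rng.uniform(0.0, 2.0 * cmath.pi) for _ in range(n_elements)]
aligned_phases = align_phases(h_direct, h_in, h_out)

gain_random = abs(effective_channel(h_direct, h_in, h_out, random_phases))
gain_aligned = abs(effective_channel(h_direct, h_in, h_out, aligned_phases))
```

With aligned phases the channel magnitude equals the sum of the magnitudes of all paths, which is the largest value any phase setting can reach in this simplified model. Once interference toward primary receivers has to be suppressed at the same time, no such closed form applies, which is what motivates the optimization discussed next.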

Teaching the Network to Tune Itself

Finding the best combination of base‑station beam patterns, transmit power, and millions of tiny reflection settings is a messy mathematical problem. Classic optimization methods, such as block coordinate descent, tackle it by alternating between one group of variables and another. These work but become slow and unwieldy as surfaces grow larger or the environment changes. Instead, the authors cast the task as a learning problem for a deep reinforcement learning agent using the soft actor‑critic (SAC) algorithm. In this setup, the agent observes the current channel conditions, past reflection phases, and transmit power, then proposes new reflection settings. It receives a reward based mainly on the achieved data rate of the secondary user, as long as the interference to primary users remains below an allowed threshold. Over many simulated interactions, the agent learns a policy that directly maps observations to near‑optimal configurations.
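The observation and reward described above might be sketched as follows. The summary does not state how the observation is encoded or what the agent receives when the interference limit is violated, so the vector layout, the penalty term, and `penalty_weight` are assumptions made for illustration.

```python
import math


def sac_reward(rate, interference, threshold, penalty_weight=10.0):
    """Reward for one step: the secondary user's rate while the primary
    user's interference limit holds, otherwise a penalty that grows with
    the relative size of the violation (the penalty is an assumption)."""
    if interference <= threshold:
        return rate
    return -penalty_weight * (interference - threshold) / threshold


def build_observation(channel_gains, previous_phases, transmit_power):
    """Flatten what the agent observes into one real-valued vector.
    Complex gains are split into real and imaginary parts, and phases are
    encoded as (cos, sin) pairs so that 0 and 2*pi look identical."""
    obs = []
    for g in channel_gains:
        obs.extend([g.real, g.imag])
    for p in previous_phases:
        obs.extend([math.cos(p), math.sin(p)])
    obs.append(transmit_power)
    return obs
```

A vector built this way could be passed to any off-the-shelf SAC implementation, with the action interpreted as the new set of reflection phases. The penalty branch is one common way to discourage constraint violations during training, not necessarily the authors' choice.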

Figure 2.

Results in Simulation and in Hardware

Through extensive simulations, the SAC‑based controller is compared against a traditional block coordinate descent benchmark on several fronts: achievable data rate for secondary users, impact of the number of reflecting elements and panels, transmit power limits, and interference constraints. The learned policy consistently matches or surpasses the benchmark in data rate, especially when intelligent surfaces have many elements, while requiring far less iterative computation once training is complete. The study also assesses runtime: for small surfaces, classic methods can be slightly faster, but as system size grows the learning‑based approach scales better. To support practical deployment, the authors design, fabricate, and test a 16‑element base‑station antenna array that operates from 3 to 7 GHz. Measurements show good matching, low correlation between antennas, and around 90% radiation efficiency, confirming the hardware platform can support demanding multi‑antenna operation.

What This Means for Future Wireless Networks

In everyday terms, this work shows how combining smart reflective panels with a learning algorithm can let lower‑priority devices share spectrum more aggressively without disturbing higher‑priority services. Instead of hand‑crafted formulas, the network learns how to aim and shape its signals on its own, even in complex environments and with many controllable elements. As wireless systems evolve beyond 5G, approaches like this could help deliver higher data rates, better coverage, and more efficient use of scarce spectrum, all while keeping interference in check.

Citation: Ghallab, R., Abdrabo, A. & Elashry, I. Soft actor critic-based performance optimization for IRS-aided cognitive radio systems. Sci Rep 16, 14283 (2026). https://doi.org/10.1038/s41598-026-49465-4

Keywords: cognitive radio, intelligent reflecting surfaces, deep reinforcement learning, wireless spectrum sharing, soft actor critic