Clear Sky Science
Deep reinforcement learning for resource allocation and scalable numerology in NR-U enabled multi-RAT HetNets
Why your future phone needs smarter airwaves
As we move toward 6G, our phones, cars, factory robots and VR headsets will all compete for the same invisible resource: wireless spectrum. Some gadgets need super‑fast video, others need split‑second reaction times, and the airwaves they share are already crowded. This paper explores how combining new 5G/6G radio technology with artificial intelligence can squeeze far more performance out of both licensed and unlicensed spectrum, keeping demanding apps smooth even in busy cities and factories.

Many services, one crowded wireless world
Tomorrow’s networks must serve very different needs at once. Enhanced Mobile Broadband (eMBB) powers high‑rate tasks like 4K streaming and virtual reality, while Ultra‑Reliable Low‑Latency Communication (URLLC) supports safety‑critical links such as self‑driving cars or industrial control, where milliseconds matter. Instead of building a separate physical network for each, operators can create "slices"—virtual lanes on the same radio hardware—each tuned to one type of service. The challenge is that all these slices still share limited spectrum and base stations, so deciding who gets which resources, and when, is a complex juggling act.
Putting unlicensed spectrum to work
To relieve pressure on licensed frequencies, 5G introduced New Radio in unlicensed bands (NR‑U), which lets cellular base stations operate alongside Wi‑Fi around 5 GHz and beyond. The authors study a heterogeneous network where a large macro base station and several small cells use both licensed NR and unlicensed NR‑U. Users can connect in three ways: to a traditional NR small cell, to an NR‑U small cell, or through carrier aggregation that combines both links. At the same time, each cell supports two slices: one focused on speed (eMBB) and the other on ultra‑low delay (URLLC). The system must also share the unlicensed band fairly with nearby Wi‑Fi access points, which contend for the channel using their own rules.
Flexible timing for different needs
A key tool in this design is "scalable numerology," a 5G feature that changes how radio signals are arranged in time and frequency. Lower numerology settings use narrow subcarrier spacing and longer time slots, which fit more subcarriers into a given bandwidth and suit high data rates but react slowly. Higher settings use wider spacing and much shorter slots, which respond quickly and suit delay‑sensitive traffic, but carry fewer bits per slot. The paper allows each slice, whether speed‑oriented or delay‑oriented, to pick its own numerology on both NR and NR‑U links. This flexibility greatly enlarges the space of possible configurations, but it also makes manual tuning almost impossible.
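To make the trade-off concrete, here is a minimal Python sketch of the standard 5G NR numerology relations: subcarrier spacing doubles and slot length halves with each step of the numerology index. The function names and the 20 MHz example bandwidth are illustrative assumptions, not taken from the paper, and the subcarrier count ignores guard bands.

```python
# Sketch of 5G NR scalable numerology (normal cyclic prefix assumed).
# Standard relations: subcarrier spacing = 15 kHz * 2**mu, and each
# 1 ms subframe is divided into 2**mu slots of 14 OFDM symbols.

SYMBOLS_PER_SLOT = 14  # normal cyclic prefix


def numerology(mu: int) -> dict:
    """Return subcarrier spacing (kHz) and slot duration (ms) for index mu."""
    if mu < 0:
        raise ValueError("mu must be non-negative")
    return {"mu": mu, "scs_khz": 15 * 2 ** mu, "slot_ms": 1.0 / 2 ** mu}


def resource_elements_per_slot(bandwidth_khz: float, mu: int) -> int:
    """Hypothetical helper: symbol positions available in one slot.

    Simplification: every subcarrier that fits in the bandwidth is usable
    (no guard bands, no control overhead).
    """
    subcarriers = int(bandwidth_khz // numerology(mu)["scs_khz"])
    return subcarriers * SYMBOLS_PER_SLOT


if __name__ == "__main__":
    for mu in range(4):
        n = numerology(mu)
        re = resource_elements_per_slot(20_000, mu)  # 20 MHz, illustrative
        print(f"mu={mu}: {n['scs_khz']} kHz spacing, "
              f"{n['slot_ms']} ms slot, {re} resource elements per slot")
```

Running the loop shows the pattern described above: as the index rises, slots get shorter (faster reaction) while each slot holds fewer resource elements in the same bandwidth.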
Teaching the network to adapt by itself
To navigate this complexity, the authors turn to artificial intelligence. They model user "satisfaction" with a simple index that rises when a user’s data rate exceeds a target or its delay falls below a threshold. A deep reinforcement learning method called a dueling deep Q‑network (DDQN) observes the current load on each slice and cell, then learns how to adjust the share of radio resources and the numerology choice per slice to maximize total satisfaction. On top of this, a regret‑based learning algorithm lets users "reconsider" which base station and mode (NR, NR‑U, or combined) they attach to, gradually steering them toward options that historically gave better satisfaction. The process repeats: resource settings influence user associations, which in turn feed back into the learning loop.
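The "reconsider" step can be pictured with regret matching, a standard learning rule in which each option is chosen with probability proportional to how much better it would have done in the past. The sketch below is not the authors' algorithm, whose exact update rule is not given in this summary. The class name, the option labels, and the assumption that a user can estimate the satisfaction every option would have delivered in each round are all illustrative.

```python
import random


class RegretMatcher:
    """Illustrative regret-matching chooser over association options."""

    def __init__(self, actions, seed=None):
        self.actions = list(actions)
        # Cumulative regret: how much better each option would have done
        # than the options actually chosen so far.
        self.regret = {a: 0.0 for a in self.actions}
        self.rng = random.Random(seed)

    def probabilities(self):
        """Choose in proportion to positive regret; uniform if there is none."""
        positive = {a: max(r, 0.0) for a, r in self.regret.items()}
        total = sum(positive.values())
        if total <= 0.0:
            return {a: 1.0 / len(self.actions) for a in self.actions}
        return {a: p / total for a, p in positive.items()}

    def choose(self):
        probs = self.probabilities()
        weights = [probs[a] for a in self.actions]
        return self.rng.choices(self.actions, weights=weights)[0]

    def update(self, chosen, satisfaction):
        """Add this round's regret.

        `satisfaction` maps every option to the satisfaction it would have
        given this round (assumed to be known or estimated by the user).
        """
        for a in self.actions:
            self.regret[a] += satisfaction[a] - satisfaction[chosen]


if __name__ == "__main__":
    # Hypothetical labels for the three connection modes in the article.
    user = RegretMatcher(["NR", "NR-U", "aggregated"], seed=1)
    for _ in range(50):
        pick = user.choose()
        # Made-up satisfaction values, for illustration only.
        user.update(pick, {"NR": 0.2, "NR-U": 0.5, "aggregated": 0.8})
    print(user.probabilities())
```

With fixed satisfaction values like these, the probabilities drift toward the option that has consistently paid off. In the system described above the values would instead change whenever the resource settings change, which is why the two learning loops feed each other.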

What the simulations reveal
Using detailed mathematical models of signal quality, interference, and Wi‑Fi channel sharing, the team simulates a dense indoor scenario with a macro cell, three small cells and coexisting Wi‑Fi networks. They compare their intelligent multi‑radio, multi‑slice system against three common baselines: NR‑only networks, mixed NR and Wi‑Fi without aggregation, and LTE‑Wi‑Fi aggregation (LWA). Across a wide range of user counts and service mixes, the proposed approach raises average user satisfaction by up to about 70% relative to simpler schemes. It remains robust even when many Wi‑Fi users contend on the same unlicensed channels, and it outperforms more traditional optimization techniques such as genetic algorithms or simpler learning methods.
What this means for everyday users
For non‑specialists, the message is straightforward: smarter, AI‑driven control of how our devices share both licensed and unlicensed spectrum can make future 6G networks feel faster and more responsive, even in busy environments. By flexibly splitting capacity between fast video and ultra‑reliable control signals, choosing radio settings on the fly, and deciding which base station and frequency each device should use, the proposed system keeps more users satisfied more of the time. If adopted in real deployments, such techniques could help your next‑generation phone, car, or headset work smoothly without needing vast new swaths of exclusive spectrum.
Citation: Elmosilhy, N.A., Elmesalawy, M.M., El-Haleem, A.M.A. et al. Deep reinforcement learning for resource allocation and scalable numerology in NR-U enabled multi-RAT HetNets. Sci Rep 16, 4768 (2026). https://doi.org/10.1038/s41598-026-36539-6
Keywords: 6G network slicing, NR-U and Wi-Fi coexistence, deep reinforcement learning, resource allocation, URLLC and eMBB