Clear Sky Science
Rate adaption of XP-HARQ assisted NOMA: a decentralized multi-agent DRL perspective
Why faster, more reliable wireless links matter
As everyday objects from factory robots to home sensors join the internet, our wireless networks must deliver tiny messages both extremely fast and with almost no failures. This paper explores a new way to push more data through crowded airwaves while still meeting strict reliability and delay targets, a challenge at the heart of future 6G and advanced Internet of Things (IoT) systems.

Sending many voices over the same air
Traditional wireless systems try to avoid interference by giving each device its own time or frequency slot, like callers taking turns on a shared line. A newer idea called non-orthogonal multiple access (NOMA) lets many devices transmit at once on the same resources, with the base station separating their overlapping signals. This boosts capacity but also makes careful control of data rates and power essential, especially when devices must meet ultra-reliable low-latency requirements such as one-millisecond delays and extremely low error rates.
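To make the idea of separating overlapping signals concrete, here is a minimal sketch of one common way a receiver can do it: decode the strongest signal first while treating the rest as noise, subtract it, and repeat. This decoding order, the Shannon-rate formula, and the names sic_rates, received_powers, and noise_power are assumptions made for illustration; the summary does not say which receiver the paper uses.

```python
import math

def sic_rates(received_powers, noise_power):
    """Achievable rates (bits per channel use) when signals are decoded
    strongest-first and each decoded signal is subtracted before the next.
    Illustrative model only, not the paper's exact receiver."""
    # Decode in order of decreasing received power.
    order = sorted(range(len(received_powers)),
                   key=lambda i: received_powers[i], reverse=True)
    remaining = sum(received_powers)  # power still present in the mix
    rates = [0.0] * len(received_powers)
    for i in order:
        # Signals not yet decoded act as interference for device i.
        remaining -= received_powers[i]
        sinr = received_powers[i] / (remaining + noise_power)
        rates[i] = math.log2(1.0 + sinr)
    return rates

# Two devices sharing the same resource at the same time.
rates = sic_rates([3.0, 1.0], noise_power=1.0)
print(rates, sum(rates))
```

In this toy model the first device is decoded against interference from the second, while the second is decoded interference-free, which is why each device's rate depends on what the others are doing.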
Making retries smarter, not just longer
To ensure messages arrive correctly, current networks often rely on automatic repeat requests: if a packet is garbled, it is sent again. While this improves reliability, simple repeats waste precious airtime and can cause queues to build up when many devices are active. A more efficient approach, known as cross-packet hybrid automatic repeat request (XP-HARQ), mixes new information with old during each retry. Instead of resending the same bits, each retransmission carries a blend of extra redundancy for the failed data plus fresh content, squeezing more useful information into every transmission attempt.
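The gain from mixing fresh content into a retry can be sketched with a toy calculation. It assumes a simplified decoding rule, namely that a packet group is decoded once the information accumulated from the channel covers the total rate sent so far, and it models channel information as log2(1 + SNR). The rule and the function names conventional_harq and cross_packet_harq are illustrative assumptions, not the paper's exact model.

```python
import math

def mutual_info(snr):
    """Information the channel can carry in one round (bits per channel use)."""
    return math.log2(1.0 + snr)

def conventional_harq(rate, snrs):
    """Retries carry only redundancy for the same packet.
    Returns (bits per channel use delivered, rounds used)."""
    acc_info = 0.0
    for k, snr in enumerate(snrs, start=1):
        acc_info += mutual_info(snr)
        if rate <= acc_info:  # assumed decoding rule
            return rate, k
    return 0.0, len(snrs)     # every round failed

def cross_packet_harq(rates, snrs):
    """Round k also introduces rates[k-1] fresh bits per channel use.
    Returns (bits per channel use delivered, rounds used)."""
    acc_info = 0.0
    acc_rate = 0.0
    for k, (rate, snr) in enumerate(zip(rates, snrs), start=1):
        acc_rate += rate
        acc_info += mutual_info(snr)
        if acc_rate <= acc_info:  # assumed decoding rule
            return acc_rate, k
    return 0.0, len(snrs)

snrs = [1.0, 3.0]  # poor first round, better second round
print(conventional_harq(1.5, snrs))           # (1.5, 2)
print(cross_packet_harq([1.5, 1.25], snrs))   # (2.75, 2)
```

Both schemes need two rounds here, but the cross-packet version delivers 2.75 bits per channel use instead of 1.5 because the second round was not spent only on repairing the first packet.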

Letting devices learn good behavior on their own
Combining shared-channel access with smart retransmissions creates a powerful but highly complex system. The base station sees signals that depend on many factors: changing wireless conditions, overlapping users, and multi-round packet mixing. Classic mathematical optimization struggles in this setting, particularly when devices only know outdated channel quality information. The authors instead treat each IoT device as a learning agent that adjusts its own sending rate over time. Using a branch of artificial intelligence called multi-agent deep reinforcement learning, these agents explore different rate choices, observe whether their packets succeed or fail, and gradually discover strategies that keep the network fast and reliable.
Competitive versus cooperative learning
The study compares two styles of learning. In the cooperative style, all devices share a common goal: maximize the total useful throughput for the whole network. In the competitive style, each device focuses mainly on its own long-term data rate while still respecting reliability rules. Both approaches use an advanced learning method that handles continuous rate choices and keeps the value estimates from becoming overly optimistic. Simulations show that in small networks, centralized learning, where a single controller decides for everyone, can work, but it quickly becomes unstable and inefficient as the number of devices grows. The decentralized multi-agent approach scales better, and the competitive version is the most stable and achieves the highest throughput across a range of signal conditions.
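The difference between the two styles comes down to what each device is rewarded for. The sketch below shows one plausible way to write the two reward signals. The penalty form used for the reliability rule and the names cooperative_rewards, competitive_rewards, outage_target, and penalty are hypothetical and are not taken from the paper.

```python
def cooperative_rewards(throughputs):
    """Every device receives the same reward: the network-wide total."""
    total = sum(throughputs)
    return [total for _ in throughputs]

def competitive_rewards(throughputs, outage_probs, outage_target, penalty=1.0):
    """Each device is rewarded for its own throughput, minus a penalty
    when its failure probability exceeds the reliability target.
    The penalty form is an assumption made for illustration."""
    rewards = []
    for throughput, outage in zip(throughputs, outage_probs):
        violation = max(0.0, outage - outage_target)
        rewards.append(throughput - penalty * violation)
    return rewards

throughputs = [1.0, 2.0, 0.5]
print(cooperative_rewards(throughputs))   # [3.5, 3.5, 3.5]
print(competitive_rewards([1.0, 2.0], [0.0, 0.5],
                          outage_target=0.25, penalty=2.0))  # [1.0, 1.5]
```

With a shared reward, a device cannot easily tell whether its own choice helped or hurt, whereas an individual reward gives each device direct feedback on its own rate choice.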
What this means for future connected things
For a general reader, the key message is that future IoT networks may not be run by fixed formulas but by swarms of small learning agents inside the devices themselves. By blending shared-channel access, smarter retransmissions, and decentralized learning, the proposed system moves more data with fewer delays while keeping the chance of failure extremely low. In practical terms, this means factories, vehicles, and medical sensors could rely on wireless links that react on the fly to changing conditions, staying fast and dependable without constant human tuning.
Citation: Wang, J., He, F., Shi, Z. et al. Rate adaption of XP-HARQ assisted NOMA: a decentralized multi-agent DRL perspective. npj Wirel. Technol. 2, 18 (2026). https://doi.org/10.1038/s44459-025-00024-9
Keywords: ultra-reliable low-latency communication, Internet of Things, non-orthogonal multiple access, hybrid automatic repeat request, multi-agent reinforcement learning