Clear Sky Science
Graph transformer Q-network for collaborative governance and decentralized decision-making in multi-intersection networks
Why smarter traffic lights matter
Anyone who drives in a city knows the frustration of hitting red light after red light, even when the road seems clear. Those stop-and-go waves are more than an annoyance: they waste time, burn fuel, and can gridlock whole corridors when queues spill back through several intersections. This study explores a new way to make traffic lights "talk" to each other so that green waves form more reliably across long stretches of road, even when traffic is unpredictable and the street network is large and complex.

How city streets become a network
The researchers begin by treating an urban road system as a network of connected points. Each intersection is a node and each road between them is a link. Every signal controller sees only what local sensors report: how many cars are waiting, how long they have been delayed, and which phase is currently green. No controller has a full picture of the city at once, yet each change of light affects traffic that will reach other junctions later. The challenge is to let these local controllers cooperate so that vehicles can travel along a corridor with as few unnecessary stops as possible, while still serving side streets and turning traffic.
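The node-and-link picture above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names `Intersection` and `RoadNetwork`, the field names, and the three-intersection corridor A, B, C are hypothetical and do not come from the study.

```python
from dataclasses import dataclass, field


@dataclass
class Intersection:
    """One node, holding only what its local sensors report (assumed fields)."""
    queue_length: int = 0        # vehicles currently waiting
    total_delay_s: float = 0.0   # accumulated delay of the waiting vehicles
    green_phase: int = 0         # index of the phase currently showing green


@dataclass
class RoadNetwork:
    nodes: dict = field(default_factory=dict)       # name -> Intersection
    downstream: dict = field(default_factory=dict)  # name -> names it feeds

    def add_road(self, src, dst):
        """Add a directed link: traffic leaving src arrives at dst."""
        for name in (src, dst):
            self.nodes.setdefault(name, Intersection())
            self.downstream.setdefault(name, [])
        self.downstream[src].append(dst)

    def upstream(self, name):
        """Intersections whose released traffic reaches this one."""
        return [s for s, dsts in self.downstream.items() if name in dsts]

    def local_observation(self, name):
        """The partial view one controller has: its own sensors, nothing else."""
        n = self.nodes[name]
        return (n.queue_length, n.total_delay_s, n.green_phase)


# Hypothetical three-intersection corridor A -> B -> C.
net = RoadNetwork()
net.add_road("A", "B")
net.add_road("B", "C")
net.nodes["B"].queue_length = 7
```

Here the controller at B can read only its own queue, delay, and phase, yet whatever A releases will reach B, and whatever B releases will reach C. That dependency is what the rest of the method tries to exploit.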
Teaching lights to cooperate step by step
Instead of hand-crafted timing plans, the authors use reinforcement learning, in which an algorithm learns by trying actions in a traffic simulator and observing the results. Each intersection acts as an agent that chooses which phase to show next and for how long. The key innovation is a method called the Graph Transformer Q-Network, or GTQN, which learns which neighboring intersections matter most at each moment. It works in two stages: it first selects a small set of influential upstream or downstream neighbors, then it assigns each of them a strength of influence based on the current traffic state. This keeps the controller from being overwhelmed by noisy information from faraway nodes that have little effect on its own traffic.
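The two-stage idea (pick a few neighbors, then weight them) can be illustrated with a generic top-k selection followed by a softmax. This is a minimal sketch, not the authors' implementation: the function name `select_and_weight`, the neighbor names, and the hand-written relevance scores are all assumptions. In a learned model the scores would be produced by the network from the current traffic state.

```python
import math


def select_and_weight(scores, k):
    """Stage 1: keep the k highest-scoring neighbors.
    Stage 2: turn their scores into influence weights that sum to 1.

    scores: dict mapping neighbor name -> relevance score (assumed given).
    """
    if k < 1 or not scores:
        return {}
    # Stage 1: sparse selection of the most relevant neighbors.
    top = sorted(scores, key=scores.get, reverse=True)[:k]
    # Stage 2: softmax over the selected neighbors only
    # (subtracting the max keeps the exponentials numerically stable).
    m = max(scores[n] for n in top)
    exps = {n: math.exp(scores[n] - m) for n in top}
    total = sum(exps.values())
    return {n: e / total for n, e in exps.items()}


# Hypothetical scores: two nearby upstream junctions matter, a distant one does not.
weights = select_and_weight(
    {"up1": 2.0, "up2": 2.0, "down1": 0.1, "far": -3.0}, k=2
)
```

With these made-up scores, only `up1` and `up2` survive the selection and they share the influence equally; the distant node `far` is dropped entirely rather than given a small weight, which is the point of selecting before weighting.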

Following traffic over space and time
To form a smooth green wave, a signal needs to anticipate cars that were released several junctions away and may take many seconds to arrive. GTQN addresses this by combining information about the layout of the network with a record of how conditions have changed over time. A transformer module, originally popularized in language models, is used to look back over recent history at each intersection and to pick out which past moments matter for the current decision. At the same time, a graph module reasons over the connections between intersections. By fusing space and time in one model, the system can learn how platoons of vehicles move along a corridor and how best to align greens with their arrival.
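As a rough illustration of combining time and space, the sketch below applies standard scaled dot-product attention over one intersection's recent history, then joins the result with a weighted sum of neighbor summaries. Everything specific here is an assumption made for the example: the two-number feature vectors, the function names, the neighbor weights, and the choice to fuse by simple concatenation. The study's actual architecture is not described at this level of detail.

```python
import math


def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    w = softmax(scores)
    dim = len(values[0])
    out = [sum(wi * v[d] for wi, v in zip(w, values)) for d in range(dim)]
    return out, w


def aggregate_neighbors(summaries, weights):
    """Weighted sum of neighbor summaries (the spatial step)."""
    dim = len(next(iter(summaries.values())))
    return [sum(weights[n] * s[d] for n, s in summaries.items())
            for d in range(dim)]


# Time: three past observations at one intersection (hypothetical features).
history = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
current = [4.0, 0.0]
own_summary, time_weights = attend(current, history, history)

# Space: summaries from two selected neighbors with assumed influence weights.
neighbor_summaries = {"up1": [1.0, 0.0], "up2": [0.0, 1.0]}
neighbor_part = aggregate_neighbors(neighbor_summaries, {"up1": 0.75, "up2": 0.25})

# Fuse by concatenation (an assumption; other fusion schemes are possible).
fused = own_summary + neighbor_part
```

In this toy case the first and third past moments resemble the current state, so they receive more attention than the second. That mirrors the idea of picking out which past moments matter for the present decision.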
Setting goals beyond a single corner
If each traffic light tried only to empty its own queue, the overall corridor could perform poorly. For example, a downstream signal might cut short a green phase that would have let a group of cars arriving from upstream pass through without stopping. To avoid this, the authors design a two-level objective. Each intersection is still rewarded for reducing its own queues and waiting times, but a central training signal also penalizes extra stops experienced by vehicles traveling along the main corridor after they have been released from the entry point. During training, a centralized "governance" module uses this corridor-wide score to guide learning. After training, the learned controllers act locally and share only sparse, targeted messages with their selected neighbors.
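The two-level objective can be written down as a local term plus a corridor term. The sketch below assumes a simple additive form with a hypothetical `corridor_weight`; the summary does not give the actual reward formula, so the exact terms and scaling are illustrative only.

```python
def local_reward(queue_length, waiting_time_s):
    """Per-intersection term: shorter queues and less waiting score higher."""
    return -(queue_length + waiting_time_s)


def corridor_penalty(stops_per_vehicle):
    """Corridor term: total stops made by vehicles after entering the corridor."""
    return -sum(stops_per_vehicle)


def training_reward(queue_length, waiting_time_s, stops_per_vehicle,
                    corridor_weight=0.5):
    """Combined signal used only during training (assumed additive form)."""
    return (local_reward(queue_length, waiting_time_s)
            + corridor_weight * corridor_penalty(stops_per_vehicle))


# Hypothetical numbers: 3 queued vehicles, 10 s of waiting,
# and three corridor vehicles that stopped 0, 2, and 1 times.
r = training_reward(3, 10.0, [0, 2, 1], corridor_weight=0.5)
```

Under this form, a signal that clears its own queue by cutting off an arriving platoon improves its local term but worsens the corridor term, so the combined reward discourages exactly the behavior described above.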
What the simulations show
The team tests GTQN in detailed simulations of both synthetic grids and a real city network from Chengdu, China, including a system with 100 intersections. Compared with several advanced multi-agent control methods, GTQN reduces how often vehicles must stop, shortens waiting times, and keeps queues from growing long enough to block upstream junctions. It also maintains reasonable performance when some messages between intersections are delayed or lost, an important property for real communication networks. Careful ablation studies show that each element of the design matters: learned sparsity, the combined space-time model, and the centralized training signal all contribute to robust coordination.
What this means for everyday travel
For drivers, cyclists, and bus riders, the idea behind this work is simple: instead of each traffic light working in isolation, the signals along a corridor learn to anticipate one another and to protect the movement of groups of vehicles. In high-fidelity simulations, this leads to fewer stops, shorter queues, and steadier travel speeds along busy routes. While the study is still limited to a virtual environment and does not yet handle pedestrians, transit priority, or all the quirks of real-world hardware, it demonstrates that carefully designed cooperation among many local controllers can turn a chaotic series of red lights into a more predictable and efficient journey.
Citation: Zhang, H. Graph transformer Q-network for collaborative governance and decentralized decision-making in multi-intersection networks. Sci Rep 16, 15549 (2026). https://doi.org/10.1038/s41598-026-45895-2
Keywords: traffic signal control, multi-agent reinforcement learning, graph transformer, corridor progression, intelligent transportation