Clear Sky Science
Molecular LEGION: incalculably large coverage of chemical space around the NLRP3 target
Why this matters for future medicines
Designing new medicines is like searching for a few special grains of sand on an endless beach. This paper describes a way to massively widen the search around a promising immune-system switch called NLRP3, which is linked to many chronic inflammatory diseases. By combining several kinds of artificial intelligence with clever chemistry tricks, the authors generate and share a gigantic collection of computer-designed molecules that could one day become starting points for new drugs.
The challenge of an endless molecule universe
Chemists talk about “chemical space” to describe all the small molecules that could, in principle, exist. That space is mind-bogglingly huge, far beyond anything we can store in a database or test in a lab. Existing catalogs of real or easily synthesizable molecules cover only a tiny speck of this universe, and fewer still have any known biological activity. Most drug discovery today still fishes in that small, well-used pond, which limits the chance of finding truly new and patentable treatments. The authors argue that the most exciting region sits between what is clearly makeable and what is clearly impossible to synthesize: molecules that look realistic but have never been made before.

A difficult but valuable inflammation switch
The team focuses on NLRP3, a protein complex that helps control inflammation. When misregulated, NLRP3 has been linked to a wide range of disorders, from autoimmune diseases to metabolic and neurodegenerative conditions. Several companies have already designed small-molecule blockers of NLRP3, but none have yet become approved medicines, partly because of issues such as selectivity, delivery to the right tissues, and complex biology. This makes NLRP3 both high-risk and high-reward: a perfect testing ground for methods that can explore much broader chemical territory than standard approaches allow.
How the LEGION workflow explores chemical space
The authors introduce LEGION, a multi-stage workflow built on an industrial AI platform called Chemistry42. First, they start from known 3D structures where small molecules sit inside the NLRP3 protein. Using these as templates, they run two independent searches: one screens huge collections of existing, synthetically feasible molecules, and the other uses generative AI models to invent new, plausible ones. Both are guided by computer simulations that check how well each molecule fits the protein, including its shape and key contact points. From the resulting “virtual hits,” the team automatically extracts core molecular backbones, or scaffolds, that appear crucial for binding.
From key backbones to billions of possibilities
Next, LEGION turns these 3D-derived scaffolds into 2D building blocks marked with positions where chemical groups can be attached. The researchers refine and expand this set to more than 34,000 unique scaffolds, then to about 94,000 favored ones that are especially suitable for further design. They use two complementary strategies to build out huge virtual libraries: a 2D generative pipeline that proposes new molecules around these scaffolds, and a simple but powerful combinatorial scheme that systematically plugs “left” and “right” fragments into central backbones. Carefully sampling this combinatorial explosion yields around 110 million distinct molecules in the shared datasets, and the accompanying code could in principle generate about 123 billion.
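To make the "left" and "right" plug-in idea concrete, here is a minimal sketch of combinatorial enumeration with sampling. It is not the authors' code: the `[L]` and `[R]` attachment markers, the toy scaffold and fragment strings, and all function names are assumptions chosen for illustration. The point it shows is that the library size is a simple product, and that a random subset can be drawn by index without ever building the full library.

```python
import random

# Hypothetical toy inputs. "[L]" and "[R]" are made-up markers for the two
# attachment points on each scaffold; they are not a format from the paper.
SCAFFOLDS = ["[L]c1ccc([R])cc1", "[L]c1ccnc([R])c1"]
LEFT_FRAGMENTS = ["C", "CC", "N"]
RIGHT_FRAGMENTS = ["O", "F"]


def library_size(scaffolds, lefts, rights):
    """Size of the full combinatorial library: a plain product."""
    return len(scaffolds) * len(lefts) * len(rights)


def molecule_at(index, scaffolds, lefts, rights):
    """Decode a flat index into one (scaffold, left, right) combination."""
    index, r = divmod(index, len(rights))
    s, l = divmod(index, len(lefts))
    return scaffolds[s].replace("[L]", lefts[l]).replace("[R]", rights[r])


def sample_library(k, scaffolds, lefts, rights, seed=0):
    """Draw k distinct combinations without materializing the whole library."""
    total = library_size(scaffolds, lefts, rights)
    rng = random.Random(seed)
    picks = rng.sample(range(total), min(k, total))
    return [molecule_at(i, scaffolds, lefts, rights) for i in picks]


if __name__ == "__main__":
    print(library_size(SCAFFOLDS, LEFT_FRAGMENTS, RIGHT_FRAGMENTS))
    for smiles in sample_library(5, SCAFFOLDS, LEFT_FRAGMENTS, RIGHT_FRAGMENTS):
        print(smiles)
```

Because `range` objects are lazy, the same index-based sampling works even when the product runs into the billions. A real pipeline would also need chemistry-aware steps this sketch leaves out, such as validating each structure and removing chemically identical duplicates.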

Checking that the virtual molecules still make sense
Creating dazzling numbers of structures is only useful if at least some of them have a good chance of working. To test this, the authors randomly pick subsets of their 2D-designed molecules and rerun full 3D docking and scoring on them, as if they were new candidates. They find that more than half of the generative-AI molecules and a healthy fraction of the combinatorial ones behave like promising “virtual hits” in these tests. In additional case studies, they show that LEGION can rediscover known families of NLRP3-blocking molecules without being told about them directly, and that its output even contained examples of a novel chemotype that another group later reported as a fresh NLRP3 inhibitor series.
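The random-subset check described above is, at heart, a sampling estimate. The sketch below illustrates that idea only: the library of placeholder names and the `toy_is_hit` rule are invented stand-ins for real docking and scoring, and nothing here reproduces the authors' thresholds or results. It shows how a hit fraction and a rough standard error could be computed from a random sample.

```python
import math
import random


def estimate_hit_rate(library, is_hit, sample_size, seed=0):
    """Estimate the fraction of hits in a library from a random subset.

    `is_hit` is any callable returning True or False for one molecule.
    In a real workflow it would wrap docking and scoring; here it is a
    placeholder supplied by the caller.
    """
    rng = random.Random(seed)
    subset = rng.sample(library, min(sample_size, len(library)))
    n = len(subset)
    hits = sum(1 for molecule in subset if is_hit(molecule))
    rate = hits / n
    # Binomial standard error of the estimated fraction.
    stderr = math.sqrt(rate * (1 - rate) / n)
    return rate, stderr, n


# Hypothetical stand-in data: placeholder names, not real molecules.
LIBRARY = [f"mol_{i}" for i in range(10_000)]


def toy_is_hit(molecule):
    """Made-up rule (even-numbered names count as hits), for illustration only."""
    return int(molecule.split("_")[1]) % 2 == 0


if __name__ == "__main__":
    rate, stderr, n = estimate_hit_rate(LIBRARY, toy_is_hit, sample_size=500)
    print(f"estimated hit rate from {n} molecules: {rate:.2f} +/- {stderr:.2f}")
```

The appeal of this design is cost: docking a few hundred or a few thousand sampled molecules gives a usable estimate for a library far too large to dock in full.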
What this means going forward
For non-specialists, the main message is that drug hunters are beginning to map their way through an almost unimaginably large space of possible medicines with the help of AI. Rather than offering a single new drug, this study provides a vast, target-focused landscape of computer-designed molecules around NLRP3, along with the tools to explore it. This landscape can speed up virtual screening, inspire new patent strategies, and help researchers leap from one chemical family to another in search of safer, more effective anti-inflammatory drugs. In short, LEGION turns a distant, abstract chemical universe into a structured playground for future NLRP3 drug discovery.
Citation: Zagribelnyy, B., Aladinskiy, V., Bondarev, N. et al. Molecular LEGION: incalculably large coverage of chemical space around the NLRP3 target. Sci Data 13, 576 (2026). https://doi.org/10.1038/s41597-026-06850-y
Keywords: NLRP3 inhibitors, generative chemistry, chemical space, AI-driven drug discovery, virtual screening