Clear Sky Science

Generative modeling enables molecular structure retrieval from Coulomb explosion imaging

Watching Molecules Move in Real Time

Chemists have long dreamed of filming molecules as they twist, break apart, and form new bonds during chemical reactions. Doing so would not only satisfy scientific curiosity; it could ultimately help design more efficient drugs, catalysts, and materials by revealing exactly how atoms rearrange on their natural, ultrafast timescales. This article reports a new way to turn a violent measurement, in which molecules are blasted apart by intense X‑ray pulses, into clear three‑dimensional pictures of their original shapes using a generative neural network.

Figure 1.

Blowing Molecules Apart to See Their Shape

A technique called Coulomb explosion imaging starts from a simple but dramatic idea: strip many electrons off a molecule in an instant so that its positively charged atoms repel one another and fly apart. Using intense femtosecond laser or X‑ray pulses, researchers trigger this explosion and record where and when each ion fragment hits a detector, which yields the momentum of each fragment. These momenta encode information about how the atoms were arranged just before the blast. In principle, if one could perfectly solve the physics of this explosion, one could work backwards from the outgoing fragments to the original molecular structure. In practice, however, this is an extremely hard inverse problem, especially for molecules with more than a few atoms, because the many‑body quantum dynamics are too complex and time‑consuming to simulate repeatedly in a traditional iterative reconstruction.
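
To make the forward direction of this problem concrete, here is a minimal sketch of a purely classical explosion: point charges that start at rest and repel one another through Coulomb's law, propagated with a velocity-Verlet integrator in atomic units. It is an illustration only, not the simulation used in the paper; the function names `coulomb_forces` and `explode`, the time step, and the example ion pair are all assumptions made here. It ignores the electronic processes that make real X-ray-induced explosions so hard to model.

```python
import numpy as np

def coulomb_forces(pos, charges):
    """Pairwise Coulomb forces on point charges (atomic units)."""
    diff = pos[:, None, :] - pos[None, :, :]      # r_i - r_j, shape (N, N, 3)
    dist = np.linalg.norm(diff, axis=-1)          # |r_i - r_j|, shape (N, N)
    np.fill_diagonal(dist, np.inf)                # exclude self-interaction
    qq = charges[:, None] * charges[None, :]      # q_i * q_j
    return (qq[..., None] * diff / dist[..., None] ** 3).sum(axis=1)

def explode(pos, charges, masses, dt=1.0, steps=5000):
    """Propagate ions from rest with velocity Verlet; return their momenta."""
    pos = np.array(pos, dtype=float)
    charges = np.asarray(charges, dtype=float)
    masses = np.asarray(masses, dtype=float)
    vel = np.zeros_like(pos)
    force = coulomb_forces(pos, charges)
    for _ in range(steps):
        vel += 0.5 * dt * force / masses[:, None]   # half kick
        pos += dt * vel                             # drift
        force = coulomb_forces(pos, charges)
        vel += 0.5 * dt * force / masses[:, None]   # half kick
    return masses[:, None] * vel

# Hypothetical example: two singly charged ions of mass 14 u, 2 bohr apart.
AMU = 1822.888  # electron masses per atomic mass unit
momenta = explode([[0, 0, -1], [0, 0, 1]], [1, 1], [14 * AMU, 14 * AMU])
print(momenta)
```

Running the model forward like this is cheap. The difficulty described above lies in going the other way, from the final momenta back to the starting positions.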

Teaching a Neural Network to Read Explosions

To overcome this bottleneck, the authors introduce MOLEXA, a generative neural network designed specifically to infer molecular geometry from measured ion momenta. MOLEXA builds on modern deep‑learning tools originally created for language and image generation: a Transformer architecture augmented with a custom “memory” mechanism, and a diffusion process that starts from a noisy guess of the atomic positions and gradually refines it. The network takes as input the identities, charge states, and three‑dimensional momenta of all ions from a Coulomb explosion event and outputs a predicted arrangement of atoms in space, effectively learning a direct shortcut from momentum space back to real space.
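
As a rough picture of what such an input might look like in code, the sketch below packs one explosion event into a numeric array with one row per ion. The layout (atomic number, charge, three momentum components), the names `Ion` and `encode_event`, and the momentum values are hypothetical placeholders chosen for illustration; the article does not describe MOLEXA's actual encoding.

```python
from dataclasses import dataclass
import numpy as np

# Small lookup table, enough for the illustrative event below.
ATOMIC_NUMBER = {"H": 1, "C": 6, "N": 7, "O": 8, "F": 9}

@dataclass(frozen=True)
class Ion:
    element: str                              # chemical symbol, e.g. "O"
    charge: int                               # charge state after ionization
    momentum: tuple[float, float, float]      # (px, py, pz), atomic units

def encode_event(ions):
    """Pack one event into an (N, 5) array: [Z, charge, px, py, pz]."""
    rows = [[ATOMIC_NUMBER[i.element], i.charge, *i.momentum] for i in ions]
    return np.asarray(rows, dtype=float)

# Placeholder numbers, not measured data: a water-like three-ion event.
event = [
    Ion("O", 2, (0.0, 0.0, -60.0)),
    Ion("H", 1, (0.0, 25.0, 30.0)),
    Ion("H", 1, (0.0, -25.0, 30.0)),
]
features = encode_event(event)
print(features.shape)
```

A model of the kind described would map an array like `features` to an array of the same length holding one predicted position per ion.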

Overcoming the Data Dilemma

Training such a model requires many examples of explosions where both the outgoing fragments and the original molecular structures are known. But high‑accuracy simulations of X‑ray–induced Coulomb explosions are so computationally demanding that only a modest dataset can be produced. The authors therefore adopt a two‑stage strategy. First, they train MOLEXA on a very large synthetic dataset generated with a simplified, classical explosion model that is fast but approximate. Then they fine‑tune the same network on a much smaller yet highly accurate dataset produced by ab initio simulations that include detailed electronic processes. This staged approach halves the typical structure‑prediction error compared with training on the small accurate set alone, allowing the model to reach a mean absolute position error of less than one atomic unit—about half the length of a typical chemical bond—for molecules of up to seven atoms, and only slightly worse for eight‑ and nine‑atom systems.
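
The logic of the two-stage strategy can be illustrated with a deliberately simple toy, assuming a linear model in place of the neural network. A "cheap" rule that is close to, but not equal to, the "accurate" rule supplies plenty of training data, while the accurate rule supplies only ten examples. Every name and number below is an assumption made for this sketch and has no connection to the paper's datasets or results.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 50
w_accurate = rng.normal(size=dim)                   # stand-in for the accurate physics
w_cheap = w_accurate + 0.1 * rng.normal(size=dim)   # stand-in for the fast, approximate model

def gradient_descent(x, y, w_init, lr=0.01, steps=2000):
    """Minimize mean squared error of the linear model x @ w, starting from w_init."""
    w = w_init.copy()
    for _ in range(steps):
        grad = x.T @ (x @ w - y) * (2.0 / len(y))
        w -= lr * grad
    return w

# Stage 1 data: large but approximate. Stage 2 data: small but accurate.
x_big = rng.normal(size=(5000, dim))
y_big = x_big @ w_cheap
x_small = rng.normal(size=(10, dim))
y_small = x_small @ w_accurate

w_pre = gradient_descent(x_big, y_big, np.zeros(dim))               # pretrain
w_two_stage = gradient_descent(x_small, y_small, w_pre, steps=200)  # fine-tune
w_scratch = gradient_descent(x_small, y_small, np.zeros(dim), steps=200)

err_two_stage = np.linalg.norm(w_two_stage - w_accurate)
err_scratch = np.linalg.norm(w_scratch - w_accurate)
print(f"two-stage error: {err_two_stage:.3f}, small-set-only error: {err_scratch:.3f}")
```

In this toy, ten examples cannot pin down fifty parameters, so the model trained only on the small set stays far from the accurate rule. The pretrained model starts close to it and only needs a small correction.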

Figure 2.

Checking Accuracy and Knowing When to Trust It

The researchers systematically test MOLEXA on thousands of simulated molecules and find that it not only reconstructs overall shapes but also yields reasonable bond distances and angles. For simple diatomic molecules, it outperforms classical reconstruction formulas based solely on kinetic energy release. Importantly, the network is equipped with an uncertainty estimation module: along with each predicted atomic coordinate, it outputs an internal estimate of how large the error is likely to be. Across many test cases, higher predicted uncertainty correlates strongly with larger actual errors, providing a practical confidence gauge for experimental users. The model can also generate multiple reconstructions for the same input, whose spread offers an independent measure of uncertainty.
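
For reference, the classical diatomic estimate mentioned above can be sketched in a few lines. It assumes the two ions start at rest and that all of the measured kinetic energy release (KER) comes from pure Coulomb repulsion, so that KER = q1·q2/R in atomic units. The function name `bond_length_from_ker` is hypothetical, and the paper's exact baseline formula may differ.

```python
HARTREE_EV = 27.211386     # electronvolts per Hartree
BOHR_ANGSTROM = 0.529177   # angstroms per bohr

def bond_length_from_ker(q1, q2, ker_ev):
    """Estimate the internuclear distance (angstroms) of a diatomic from its
    kinetic energy release (eV), assuming KER = q1 * q2 / R in atomic units."""
    if ker_ev <= 0:
        raise ValueError("kinetic energy release must be positive")
    ker_hartree = ker_ev / HARTREE_EV
    r_bohr = q1 * q2 / ker_hartree
    return r_bohr * BOHR_ANGSTROM

# Two singly charged ions releasing about 14.4 eV were roughly 1 angstrom apart.
print(bond_length_from_ker(1, 1, 14.4))
```

Because this estimate uses only a single number per event, it has no way to capture bond angles or deviations from ideal Coulomb repulsion. That is the gap a learned model operating on full momentum vectors aims to close.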

From Static Snapshots to Reaction Movies

To show that MOLEXA works on real data, the team applies it to Coulomb explosion measurements from the European X‑ray Free‑Electron Laser for familiar molecules such as water, tetrafluoromethane, and ethanol. Using only the measured ion momenta, with no molecule‑specific tuning, the network reconstructs equilibrium structures that agree well with trusted reference geometries. They further demonstrate how MOLEXA can capture distinct structural arrangements along a model reaction pathway of cyclobutene ring opening, including ring breaking and proton migration, when provided with idealized momentum inputs. While current experiments often average over many quantum states and geometries, future time‑resolved studies that better separate these contributions could use MOLEXA to assemble frame‑by‑frame “movies” of molecules reacting.

Why This Advance Matters

This work shows that generative AI can solve a long‑standing, highly nonlinear inverse problem that has resisted traditional methods. By learning from realistic simulations how explosions encode structure, MOLEXA enables researchers to extract three‑dimensional molecular shapes—and their changes over time—from data that previously seemed too complex to invert. The approach is general: similar two‑stage training could help tackle other problems where a detailed physical model exists but is too expensive to embed directly in reconstruction algorithms. If extended to larger molecules and partial detection scenarios, this strategy may turn Coulomb explosion imaging into a routine tool for watching chemical reactions unfold in real space and real time.

Citation: Li, X., Jahnke, T., Boll, R. et al. Generative modeling enables molecular structure retrieval from Coulomb explosion imaging. Nat Commun 17, 3430 (2026). https://doi.org/10.1038/s41467-026-70160-5

Keywords: Coulomb explosion imaging, molecular structure reconstruction, generative neural networks, ultrafast X-ray science, inverse problems in physics