Clear Sky Science

Platonic representation of foundation machine learning interatomic potentials

2026-05-07

Why many models can share one hidden map

Modern materials research relies on machine learning tools that can predict how atoms interact, letting scientists explore new crystals and compounds on a computer instead of only in the lab. Yet each powerful model tends to speak its own private "language" for describing atomic environments, making it hard to compare them or combine their strengths. This study asks whether there is a deeper common map beneath these different languages, and shows how to reveal and use it.

Figure 1. Different atom-based AI models funnel into one shared colorful map that organizes materials in a common hidden space.

Different tools, different private worlds

Machine learning interatomic potentials are models that rapidly estimate the energies and forces between atoms, based on training data from quantum mechanical calculations. Over the past decade a zoo of such models has appeared, from graph-based networks to designs that carefully respect the symmetries of physics. They are trained on overlapping but not identical databases of inorganic crystals, and each one encodes every atomic environment as a high-dimensional vector inside the model. Viewed directly, these internal vectors form very different patterns from one model to another, even when the models are trained on the same structures and asked to predict the same physical quantities. In other words, their hidden coordinate systems are incompatible.

Building a shared coordinate grid

The authors propose a way to translate these private coordinate systems into a single, shared space without opening up or retraining the models. They select a set of special reference atomic environments, called anchors, chosen so that they span a wide range of chemistries and structures. For any model and any atom, they measure how similar that atom’s internal vector is to each anchor and use the collection of similarities as the new coordinates. This trick replaces absolute positions inside a black box with positions relative to the same common landmarks. When applied to seven distinct interatomic potentials, ranging from symmetry-respecting to symmetry-breaking designs, the method produces a unified map where elements fall into coherent clusters that mirror the periodic table.
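The anchor idea can be illustrated with a minimal Python sketch. It assumes cosine similarity as the similarity measure and uses random toy vectors in place of real model embeddings; the function name relative_coordinates and all data below are hypothetical and are not taken from the study. Two toy "models" that differ only by a rotation of their internal coordinates end up with the same anchor-based coordinates.

```python
import numpy as np


def relative_coordinates(embeddings, anchor_embeddings):
    """Describe each embedding by its cosine similarity to every anchor.

    Cosine similarity is an assumption made for this sketch; the source
    only says that similarity to each anchor is measured.
    """
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    a = anchor_embeddings / np.linalg.norm(anchor_embeddings, axis=1, keepdims=True)
    return e @ a.T  # shape: (number of atoms, number of anchors)


rng = np.random.default_rng(0)

# Toy "model A": 5 atomic environments and 3 anchors in an 8-dimensional space.
atoms_a = rng.normal(size=(5, 8))
anchors_a = rng.normal(size=(3, 8))

# Toy "model B": the same information expressed in a rotated coordinate system.
rotation, _ = np.linalg.qr(rng.normal(size=(8, 8)))
atoms_b = atoms_a @ rotation
anchors_b = anchors_a @ rotation

rel_a = relative_coordinates(atoms_a, anchors_a)
rel_b = relative_coordinates(atoms_b, anchors_b)

# The raw vectors differ, but the anchor-relative coordinates agree.
print("raw vectors match:", np.allclose(atoms_a, atoms_b))
print("relative coordinates match:", np.allclose(rel_a, rel_b))
```

Real models differ by far more than a rotation, so exact agreement should not be expected in practice; the toy case only shows why relative coordinates remove one source of incompatibility.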

Figure 2. Atoms flow through layered anchor planes and emerge as neat colored clusters, revealing how a shared process organizes materials.

What the shared map reveals

Once the models have been placed in this platonic space, the authors can quantify how similarly they organize matter. Global comparisons show that different models agree on the broad layout of chemical space, while local comparisons reveal important differences in how they treat fine details. Symmetry-aware models group related atomic environments into compact, nearly spherical clouds, whereas models that ignore these symmetries produce skewed, stretched patterns. A generative model that has seen the same structures but has not been trained on energies or forces fails to reproduce the clear periodic patterns, demonstrating that the shared geometry reflects learned physics rather than just data statistics.
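One way to make the distinction between global and local agreement concrete is sketched below. The two metrics are assumptions chosen for illustration and may differ from those used in the study: linear centered kernel alignment (CKA) stands in for a global comparison, and the overlap between k-nearest-neighbor sets stands in for a local one. The function names and toy data are hypothetical.

```python
import numpy as np


def linear_cka(x, y):
    """Linear centered kernel alignment between two sets of coordinates.

    Rows are the same atomic environments seen by two models.
    A value of 1 means the two layouts agree up to rotation and scale.
    """
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    cross = np.linalg.norm(y.T @ x, "fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, "fro")
    norm_y = np.linalg.norm(y.T @ y, "fro")
    return float(cross / (norm_x * norm_y))


def knn_overlap(x, y, k=3):
    """Average fraction of shared k nearest neighbors between two layouts."""

    def neighbors(z):
        dist = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=-1)
        np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
        return np.argsort(dist, axis=1)[:, :k]

    nx, ny = neighbors(x), neighbors(y)
    shared = [len(set(a.tolist()) & set(b.tolist())) / k for a, b in zip(nx, ny)]
    return float(np.mean(shared))


rng = np.random.default_rng(1)
model_a = rng.normal(size=(20, 4))                   # toy shared-space coordinates
model_b = model_a + 0.05 * rng.normal(size=(20, 4))  # a slightly perturbed "second model"

print("global agreement (linear CKA):", linear_cka(model_a, model_b))
print("local agreement (kNN overlap):", knn_overlap(model_a, model_b))
```

A high global score with a lower local score would correspond to the pattern described above: agreement on the broad layout, disagreement on fine details.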

Doing arithmetic and health checks on materials

Because all models now live in a common coordinate system, the authors can perform simple vector arithmetic on entire materials and reactions and compare the results across models. For example, averaging the atomic points for a complex oxide yields a material-level vector that is nearly aligned between different models, and subtracting the vectors for two crystal forms of the same compound reveals how sensitive each model is to subtle structural changes. By mixing reactant vectors from one model with product vectors from another, they construct "stitched" reaction vectors that still behave sensibly, hinting at modular reuse of models trained on different datasets. The platonic map also acts as a diagnostic tool: it can track how embeddings drift during fine-tuning, expose when a model’s internal representation breaks expected symmetries under rotation, and flag atomic configurations that lie far from the manifold of known stable materials.
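The arithmetic described here can be sketched in a few lines of Python. Everything below is illustrative: the numbers are made-up toy coordinates, the helper names are hypothetical, and the convention "products minus reactants" without stoichiometric weights is an assumption rather than the authors' stated procedure.

```python
import numpy as np


def material_vector(atom_coords):
    """Average per-atom shared-space coordinates into one material-level vector."""
    return np.asarray(atom_coords, dtype=float).mean(axis=0)


def cosine(u, v):
    """Cosine of the angle between two vectors (1 means perfectly aligned)."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def reaction_vector(reactants, products):
    """Sum of product vectors minus sum of reactant vectors (assumed convention)."""
    return np.sum(products, axis=0) - np.sum(reactants, axis=0)


# Toy shared-space coordinates for the three atoms of one oxide, from two models.
oxide_model_a = np.array([[0.9, 0.1, 0.2],
                          [0.8, 0.2, 0.1],
                          [0.1, 0.9, 0.3]])
oxide_model_b = oxide_model_a + 0.01  # second model, slightly shifted

vec_a = material_vector(oxide_model_a)
vec_b = material_vector(oxide_model_b)
print("cross-model alignment:", cosine(vec_a, vec_b))

# A second crystal form of the same compound, as seen by model A.
polymorph_model_a = oxide_model_a + np.array([0.0, 0.05, -0.02])
polymorph_shift = material_vector(polymorph_model_a) - vec_a
print("polymorph difference vector:", polymorph_shift)

# "Stitched" reaction: reactant vectors from model A, product vector from model B.
reactants_model_a = np.array([[0.5, 0.3, 0.1],
                              [0.7, 0.5, 0.3]])
stitched = reaction_vector(reactants_model_a, vec_b[None, :])
print("stitched reaction vector:", stitched)
```

The stitched vector is only meaningful because both models have first been mapped into the same anchor-based space; subtracting raw internal vectors from two different models would mix incompatible coordinate systems.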

Why this matters for future materials discovery

This work supports the idea that, despite their surface differences, advanced physics-based machine learning models tend to converge on a shared internal picture of the atomic world when they are constrained by correct physical targets. By offering a practical recipe for uncovering that shared picture, the platonic representation provides a foundation for comparing, combining and interrogating models in a consistent way. For non-specialists, the key message is that smarter coordination between many specialized tools can make virtual materials discovery more reliable, more interpretable and better able to highlight when its own predictions should not be trusted.

Citation: Li, Z., Walsh, A. Platonic representation of foundation machine learning interatomic potentials. Nat Mach Intell 8, 830–840 (2026). https://doi.org/10.1038/s42256-026-01235-7

Keywords: interatomic potentials, materials informatics, latent space, representation learning, model interoperability