Clear Sky Science
A multimodal large language model for materials science
Why new materials matter to everyday life
From better phone batteries to more efficient solar panels and faster computers, many of tomorrow’s technologies depend on discovering and perfecting new materials. Yet figuring out which combinations of atoms will give just the right properties is like searching for needles in a cosmic-sized haystack. This article presents MatterChat, an artificial intelligence system designed to talk with scientists in everyday language while also “seeing” the detailed atomic structure of materials, helping to predict how they behave and how they might be made in the lab.

A digital assistant that sees atoms and reads words
MatterChat is built to combine two very different kinds of information: the precise 3D arrangement of atoms in a solid and the text-based questions and knowledge that scientists use. On one side, the system takes a crystal structure, which can be represented as a network of atoms and bonds. It runs this through powerful physics-inspired models that have been trained on hundreds of thousands of known materials to produce compact numerical fingerprints of each atom and its surroundings. On the other side, a large language model—based on the open-source Mistral 7B system—turns a user’s question, such as asking about stability or electronic behavior, into its own internal representation. A specially designed “bridge” module then learns how to align these two worlds so that the language model can reason directly about what the atoms are doing.
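The idea of a bridge between atom fingerprints and the language model can be sketched in a few lines. This is only an illustration under stated assumptions: the article does not describe the bridge's internals, so the sketch stands in a single linear projection for it, and the names `ATOM_DIM`, `LLM_DIM`, `make_projection`, `bridge` and `build_llm_input`, along with all numbers, are hypothetical.

```python
import random

ATOM_DIM = 4  # hypothetical size of each atom fingerprint
LLM_DIM = 6   # hypothetical size of the language model's token embeddings


def make_projection(in_dim, out_dim, seed=0):
    """Create a random (out_dim x in_dim) weight matrix; a real bridge would be trained."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]


def project(weights, vector):
    """Multiply the weight matrix by one vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def bridge(atom_embeddings, weights):
    """Map every atom fingerprint into the language model's embedding size."""
    return [project(weights, atom) for atom in atom_embeddings]


def build_llm_input(atom_embeddings, text_embeddings, weights):
    """Place the projected atom vectors in front of the question's token vectors."""
    return bridge(atom_embeddings, weights) + text_embeddings


# Two made-up atom fingerprints and a three-token question (placeholder values).
atoms = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.1, 0.0, 0.2]]
question = [[0.0] * LLM_DIM for _ in range(3)]

weights = make_projection(ATOM_DIM, LLM_DIM)
sequence = build_llm_input(atoms, question, weights)
print(len(sequence), "vectors of size", len(sequence[0]))
```

The point of the sketch is only the shape of the data flow: structural vectors are converted to the same size as word vectors so that one model can process both in a single sequence.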
Teaching the system with a rich catalog of materials
To train MatterChat, the authors drew on a large database containing 142,899 inorganic materials from the Materials Project. For each material, they used not only the full atomic structure but also twelve kinds of descriptive information. These ranged from basic identifiers such as the chemical formula, symmetry type and crystal family to nine key properties including whether the material is metallic or insulating, its electronic bandgap, magnetic behavior and several measures of thermodynamic stability. By pairing each structure with many text-style questions and answers about these properties, the system learned to connect patterns in atomic arrangements with the words scientists use to describe and analyze materials.
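Pairing one structure with several question-and-answer texts might look roughly like the sketch below. It is an assumption-laden illustration, not the authors' pipeline: the field names, the question wording, the `make_qa_pairs` helper and the example record are all invented here, and the values are placeholders rather than Materials Project data.

```python
# Hypothetical question templates, one per descriptive field.
QUESTION_TEMPLATES = {
    "formula": "What is the chemical formula of this material?",
    "crystal_system": "Which crystal system does this material belong to?",
    "is_metal": "Is this material metallic?",
    "bandgap_ev": "What is the electronic bandgap of this material in eV?",
}


def make_qa_pairs(record):
    """Turn one material record into a list of question/answer training pairs."""
    pairs = []
    for field, question in QUESTION_TEMPLATES.items():
        if field not in record:
            continue  # skip properties that are missing for this material
        value = record[field]
        if isinstance(value, bool):
            answer = "Yes" if value else "No"
        elif isinstance(value, float):
            answer = f"{value:.2f}"
        else:
            answer = str(value)
        pairs.append(
            {"structure_id": record["id"], "question": question, "answer": answer}
        )
    return pairs


# Placeholder record with invented values, used only to show the format.
example_record = {
    "id": "example-0001",
    "formula": "A2B",
    "crystal_system": "cubic",
    "is_metal": False,
    "bandgap_ev": 1.5,
}

qa_pairs = make_qa_pairs(example_record)
for pair in qa_pairs:
    print(pair["question"], "->", pair["answer"])
```

Each pair keeps a reference to the structure it came from, so the same atomic arrangement can be shown to the model alongside many different questions.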
From simple lookups to scientific-style reasoning
Once trained, MatterChat can do more than just repeat database entries. When a user supplies a structure, the system can answer a wide range of questions, such as identifying the formula, judging whether the material is likely to be stable enough to make in the lab, or estimating energies that normally require heavy quantum-mechanical calculations. The authors show examples where MatterChat produces detailed assessments of specific materials, including comments on stability, electronic gaps and magnetism. In some cases it goes further, generating plausible step-by-step laboratory recipes for making well-known compounds like gallium nitride and yttrium iron garnet, drawing on the general scientific knowledge stored in its language model while grounding its answers in the supplied crystal structure.

Outperforming general-purpose chatbots on hard numbers
A striking result is that MatterChat does better at predicting quantitative material properties than both traditional language models and several specialized physics-based machine-learning tools. Across nine benchmark tasks, it is more accurate at deciding, for example, whether a solid is metallic, stable or magnetic. For continuous quantities such as formation energy or bandgap, its predictions are closer to the values obtained from demanding computer simulations. This holds even when it is tested on a separate collection of newly discovered compounds from the GNoME project, which were not used during training. The authors also analyze the system’s internal representations and show that similar structures cluster together and that changes in stability track smoothly across this internal map, indicating that the model has learned chemically meaningful patterns.
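The two kinds of comparison described above, yes/no decisions and continuous values, are commonly scored with accuracy and mean absolute error. The sketch below assumes those two metrics for illustration; the article does not name the exact measures used, and the function names and sample numbers are made up.

```python
def accuracy(predicted, actual):
    """Fraction of yes/no predictions that match the reference labels."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


def mean_absolute_error(predicted, actual):
    """Average size of the gap between predicted and reference values."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Invented example: "is it metallic?" answers for four materials.
metal_accuracy = accuracy([True, False, True, True], [True, False, False, True])

# Invented example: predicted vs. reference bandgaps in eV for three materials.
bandgap_mae = mean_absolute_error([1.0, 2.0, 3.0], [1.5, 2.0, 2.0])

print("accuracy:", metal_accuracy)
print("mean absolute error (eV):", bandgap_mae)
```

A higher accuracy is better for the classification tasks, while a lower mean absolute error means predictions sit closer to the simulated reference values.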
What this could mean for discovering new materials
For nonspecialists, the main takeaway is that MatterChat acts like a conversation partner that both understands scientific language and has a detailed sense of how atoms are arranged in solids. By fusing these abilities, it can quickly answer questions that once required expert intuition and expensive calculations, and it can suggest realistic synthesis routes for promising candidates. Although the authors note that the system still struggles with extremely precise numbers and can occasionally guess with unwarranted confidence, they argue that its modular design and strong performance mark an important step toward AI tools that help scientists navigate the vast space of possible materials, speeding up the path from ideas to working technologies.
Citation: Tang, Y., Xu, W., Cao, J. et al. A multimodal large language model for materials science. Nat Mach Intell 8, 588–601 (2026). https://doi.org/10.1038/s42256-026-01214-y
Keywords: materials discovery, multimodal AI, crystal structures, property prediction, scientific chatbots