Clear Sky Science
Ancient Chinese glass heritage classification based on compositional data and machine learning
Why old glass still has new stories to tell
Ancient Chinese glass beads and vessels may look similar to treasures from Egypt or the Middle East, but inside they are chemically quite different. Over centuries, burial in soil and exposure to moisture also change their surfaces, making it hard for curators to tell where an object was made or how. This study shows how modern statistics and machine learning can read the hidden chemical “fingerprints” of weathered glass, giving museums a faster and more objective way to classify artifacts and trace the history of technology along the Silk Road.

Glass along the Silk Road
Early glass objects reached China via the Silk Road, mainly as imported beads. Craftspeople later learned to make glass locally with their own raw materials. As a result, Chinese glass could imitate foreign styles in color and decoration while having a distinct recipe. Two broad types emerged: high‑potassium glass, made with plant ash rich in potassium, and lead‑barium glass, made with ores containing lead and barium. These differences matter because they reflect changes in raw materials, trade, and technology. Yet centuries of weathering blur these signals, so experts have traditionally relied on what they see under the microscope—the color, pattern, and degree of surface decay—combined with personal experience, a practice that is time‑consuming and subjective.
Turning glass recipes into usable data
The authors started from a real contest dataset of ancient Chinese glass, which records each object’s type, color, decorative motif, degree of weathering, and detailed chemical composition. Because glass chemistry is measured in percentages that add up to a whole, a rise in one oxide automatically pushes the others down, which can create correlations that have nothing to do with the glass itself. To remove this effect, the team applied a centered log‑ratio transformation, which expresses each oxide relative to the geometric mean of all the oxides in the same sample. They also cleaned the data, filled in a few missing values in a controlled way, and checked that the transformed measurements were approximately normally distributed (bell‑shaped), an assumption behind many standard statistical methods.
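For readers who want to see the transformation itself, here is a minimal sketch in plain Python. The function name `clr` and the oxide values are illustrative assumptions, not taken from the study, and the sketch simply rejects zero values rather than reproducing whatever missing-value handling the authors used.

```python
import math

def clr(composition):
    """Centered log-ratio transform of one composition (all parts must be > 0)."""
    if any(part <= 0 for part in composition):
        raise ValueError("clr needs strictly positive parts; replace zeros first")
    logs = [math.log(part) for part in composition]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    # log(part / geometric mean) == log(part) - mean of the logs
    return [value - mean_log for value in logs]

# Hypothetical oxide percentages (SiO2, K2O, CaO, PbO), invented for illustration.
sample = [65.0, 10.0, 5.0, 20.0]
transformed = clr(sample)
print([round(value, 3) for value in transformed])
```

The transformed values always sum to zero, and rescaling the input (percentages versus fractions, for example) leaves the result unchanged, which is why the method suits data that only carry relative information.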
How weathering reshapes glass
Next, the researchers asked which visible features truly relate to weathering. Using chi‑square and Fisher’s exact tests on 56 artifacts, they found a clear link between glass type and degree of surface decay, but no meaningful connection with color or decorative motif. High‑potassium and lead‑barium glasses age differently because of their distinct internal structures, not because of how they look. By comparing chemical measurements taken before and after weathering on different parts of the same pieces, and by grouping many samples into five categories (such as “before‑weathering lead‑barium” or “severe‑weathering lead‑barium”), they showed that key components like silica and certain metal oxides shift systematically as glass decays. From these group differences, they built simple ratio‑based correction factors that can estimate a glass’s original composition from its altered surface, at least for many of the major ingredients.
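The two steps in this section can be sketched in plain Python: a Fisher's exact test on a 2x2 table, and a ratio-based correction. All counts and percentages below are made up, the function names are hypothetical, and the rescaling to 100 % after correction is an assumption of this sketch, not a detail confirmed by the study.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, total = a + b, c + d, a + c, a + b + c + d

    def table_probability(x):
        # Hypergeometric probability of x in the top-left cell
        # when the row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    observed = table_probability(a)
    smallest, largest = max(0, col1 - row2), min(row1, col1)
    # Add up every table that is no more likely than the observed one.
    return sum(
        p
        for p in map(table_probability, range(smallest, largest + 1))
        if p <= observed * (1 + 1e-9)
    )

def correction_factors(mean_unweathered, mean_weathered):
    """Per-oxide ratio of the unweathered group mean to the weathered group mean."""
    return {
        oxide: mean_unweathered[oxide] / mean_weathered[oxide]
        for oxide in mean_weathered
    }

def estimate_original(weathered_sample, factors):
    """Scale a weathered measurement by the factors, then rescale to sum to 100 %."""
    scaled = {oxide: value * factors[oxide] for oxide, value in weathered_sample.items()}
    total = sum(scaled.values())
    return {oxide: 100.0 * value / total for oxide, value in scaled.items()}

# Hypothetical counts: rows = glass type, columns = weathered / not weathered.
p_value = fisher_exact_two_sided(9, 3, 2, 10)
print(f"p = {p_value:.4f}")

# Hypothetical group means in percent, invented for illustration.
factors = correction_factors(
    {"SiO2": 60.0, "PbO": 25.0, "BaO": 15.0},
    {"SiO2": 30.0, "PbO": 50.0, "BaO": 20.0},
)
restored = estimate_original({"SiO2": 28.0, "PbO": 52.0, "BaO": 20.0}, factors)
print({oxide: round(value, 1) for oxide, value in restored.items()})
```

A small p-value would suggest that glass type and weathering are not independent. The correction is only as reliable as the group means behind it, which fits the article's caveat that it works for many, but not all, of the major ingredients.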

Teaching algorithms to recognize glass families
With corrected compositions in hand, the team trained several machine learning models—decision trees, logistic regression, support vector machines, and random forests—to sort samples into the two major families, high‑potassium and lead‑barium. Remarkably, a single ingredient, lead oxide (PbO), was enough for a decision tree to separate the two with perfect accuracy in their dataset: low lead meant high‑potassium glass, high lead meant lead‑barium glass. Other models reached similarly high performance and stayed reliable even when the researchers added artificial “noise” to mimic measurement uncertainty. They then went a step further, using clustering methods to discover natural subgroups within each main family. High‑potassium glass split into two subtypes—one richer in calcium and copper, another richer in barium and lead—while lead‑barium glass divided into three patterns emphasizing different supporting ingredients such as magnesium, sodium, or copper and barium. These fine‑grained groups hint at distinct recipes and workshops.
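To illustrate how a single ingredient can separate the two families, the sketch below fits a one-split decision tree (a "stump") on PbO alone and then repeats the prediction on noise-perturbed values. It is not the authors' model: the PbO values, the noise level, and the function names are all hypothetical.

```python
import random

def fit_threshold(values, labels):
    """Fit a one-split decision tree (a "stump") on a single feature.

    Returns (threshold, low_label, high_label): predict high_label when the
    value exceeds the threshold, otherwise low_label.
    """
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("expected exactly two classes")
    pairs = sorted(zip(values, labels))
    best = None
    for (left, _), (right, _) in zip(pairs, pairs[1:]):
        if left == right:
            continue  # no split possible between identical values
        threshold = (left + right) / 2
        # Try both ways of assigning the two classes to the two sides.
        for low_label, high_label in (classes, classes[::-1]):
            errors = sum(
                (high_label if value > threshold else low_label) != label
                for value, label in pairs
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, low_label, high_label)
    if best is None:
        raise ValueError("all feature values are identical")
    return best[1:]

def predict(value, threshold, low_label, high_label):
    """Apply the fitted split to one measurement."""
    return high_label if value > threshold else low_label

# Hypothetical PbO percentages, invented for illustration only.
pbo = [0.0, 0.2, 0.5, 1.1, 18.0, 25.3, 31.2, 40.5]
family = ["high-potassium"] * 4 + ["lead-barium"] * 4

threshold, low_label, high_label = fit_threshold(pbo, family)

# Crude robustness check: perturb each value with Gaussian noise and re-predict.
rng = random.Random(0)
noisy = [value + rng.gauss(0.0, 0.5) for value in pbo]
accuracy = sum(
    predict(value, threshold, low_label, high_label) == label
    for value, label in zip(noisy, family)
) / len(family)
print(threshold, low_label, high_label, accuracy)
```

When the gap between the two groups is much wider than the measurement noise, as in these invented numbers, the split stays stable under perturbation, which is the intuition behind the robustness result described above.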
What this means for museums and history
For non‑specialists, the key message is that ancient glass can now be classified less by eye and more by data. By combining careful chemical measurement, appropriate statistical treatment of percentage data, and robust machine learning, this study offers curators and archaeologists a repeatable way to identify weathered glass objects and link them to particular traditions of craft. Over time, applying such methods to larger collections could help map trade routes, pinpoint production centers, and track how Chinese glassmakers experimented with new fluxes like lead and plant ash. In short, algorithms trained on chemistry are becoming powerful new aids in telling the story of how a seemingly simple material, glass, connected cultures across continents.
Citation: Tang, P., Gan, X. & Tang, J. Ancient chinese glass heritage classification based on compositional data and machine learning. npj Herit. Sci. 14, 125 (2026). https://doi.org/10.1038/s40494-026-02370-5
Keywords: ancient Chinese glass, Silk Road trade, cultural heritage science, machine learning classification, glass weathering