Clear Sky Science
Explainable machine learning-based classification of traditional Korean ceramics using XRF chemical composition data
Ceramic Treasures Meet Modern Algorithms
For centuries, experts have classified Korea’s finest ceramics—soft green celadon, boldly decorated buncheong, and serene white porcelain—by eye and experience. But what happens when a fragment is damaged, discolored, or doesn’t quite fit the textbook look? This study shows how modern machine learning can read the chemical “fingerprints” of these wares to sort them objectively, and even explain what ingredients give each piece its distinctive beauty.
From Glaze Colors to Hidden Ingredients
Celadon, buncheong, and white porcelain are more than museum labels; they trace shifts in taste and technology from Korea’s Goryeo to Joseon dynasties. Celadon is famed for its jade-green glaze and intricate inlay, buncheong for its lively white-slip decorations on a darker body, and white porcelain for its pure, restrained elegance. Yet visually based sorting has limits: early or experimental pieces can look different, and weathering or breakage can mask key features. The authors turn instead to X-ray fluorescence (XRF), a technique that reveals how much of each major oxide—such as silica, alumina, iron, and titanium—is present in the ceramic body. Because these chemical recipes reflect raw materials and firing conditions, they provide a more stable basis for identifying what kind of ware a shard once was.

Teaching Computers to Recognize Old Clay
The team compiled XRF data for 624 ceramic samples from earlier scientific studies, evenly covering celadon, buncheong, and white porcelain. They then trained six different machine learning models to recognize the three types using only ten measured oxides. Some models, like decision trees and random forests, split the data into branches based on simple rules. Others, such as support vector machines, draw more flexible boundaries in a mathematical space. To avoid tailoring the models too closely to this particular dataset, the authors reserved part of the data for testing and probed performance on an entirely separate group of 59 samples drawn from independent research.
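To make the idea of a reserved test set concrete, here is a minimal sketch of a stratified hold-out split using only Python's standard library. It is not the authors' code. The 20% test fraction, the random seed, the exact count of 208 samples per class (inferred from "evenly covering" 624 samples), and the function name `stratified_split` are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), holding out the same share of every class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        idx = by_class[label][:]
        rng.shuffle(idx)                      # random order within each class
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])         # held out, never used for fitting
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Hypothetical label list: 208 samples per ware type (an assumption).
labels = ["celadon"] * 208 + ["buncheong"] * 208 + ["white porcelain"] * 208
train_idx, test_idx = stratified_split(labels, test_fraction=0.2, seed=42)
print(len(train_idx), len(test_idx))
```

Splitting class by class keeps the three ware types equally represented in both parts, so a good test score cannot come from one type simply being overrepresented. The separate group of 59 samples plays a different role: it comes from other studies altogether, so it checks whether the model still works on data it could not have been tuned to.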
How Well the Machines Performed
Two tree-based methods—random forest and extreme gradient boosting—came out on top, correctly classifying about 96% of the test samples. A support vector machine trailed only slightly, while simpler, more rigid methods lagged behind. A closer look at the errors revealed a telling pattern: white porcelain was almost always identified correctly, but celadon and buncheong were often confused with each other. This mirrors history and technology. Both celadon and buncheong share similar clays and high firing temperatures, and early buncheong often borrowed techniques from celadon, so their chemical signatures naturally overlap. White porcelain, made from unusually pure clay with very little color-causing material, stands apart as a distinct cluster in the data.
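The error pattern described above is usually read off a confusion matrix, a table that counts how often each true type was assigned to each predicted type. The sketch below builds one in plain Python. The twelve labels and predictions are invented for illustration and are not the study's results; they only imitate the reported pattern of celadon and buncheong being mixed up while white porcelain stays clean.

```python
from collections import Counter

CLASSES = ["celadon", "buncheong", "white porcelain"]

def confusion_matrix(y_true, y_pred, classes=CLASSES):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in classes] for t in classes]

def accuracy(matrix):
    """Share of samples on the diagonal, i.e. correctly classified."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Made-up toy data, NOT taken from the paper.
y_true = ["celadon"] * 4 + ["buncheong"] * 4 + ["white porcelain"] * 4
y_pred = (["celadon", "celadon", "celadon", "buncheong"]
          + ["buncheong", "buncheong", "celadon", "buncheong"]
          + ["white porcelain"] * 4)

matrix = confusion_matrix(y_true, y_pred)
for name, row in zip(CLASSES, matrix):
    print(f"{name:16s}", row)
print("accuracy:", round(accuracy(matrix), 3))
```

Off-diagonal counts in the first two rows are the celadon and buncheong mix-ups; a white porcelain row with entries only on the diagonal corresponds to the "distinct cluster" described in the text.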

Explaining the Decisions: Why Iron and Titanium Matter
Powerful models are not much use to historians if they behave like black boxes. To open the lid, the researchers used SHAP, a method that assigns each chemical a score for how strongly it pushes a sample toward one ceramic type or another. Across the best-performing models, two oxides dominated the story: iron oxide (Fe₂O₃) and titanium dioxide (TiO₂). These are already known to shape color in fired clay, shifting hues from yellowish to bluish-green depending on their amount and the kiln atmosphere. The machine learning analysis confirmed that low iron and titanium levels strongly favor white porcelain; intermediate levels tend to signal celadon; and higher iron, paired with moderate titanium, is characteristic of buncheong’s darker, earthier bodies. Other oxides, such as those containing phosphorus and sodium, played supporting roles in teasing celadon and buncheong apart when their main coloring ingredients overlapped.
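SHAP scores are based on Shapley values, which average each ingredient's contribution over every order in which the ingredients could be "revealed" to the model. The sketch below computes exact Shapley values by brute force for a tiny, hand-written scoring function. Everything in it is an assumption for illustration: the `white_porcelain_score` formula, its coefficients, the oxide percentages, and the baseline sample are invented, and the SHAP library itself uses faster algorithms rather than this enumeration.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, sample, baseline):
    """Exact Shapley values; features outside a subset take baseline values."""
    names = list(sample)
    n = len(names)

    def value(subset):
        x = {k: (sample[k] if k in subset else baseline[k]) for k in names}
        return model(x)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for r in range(n):
            for subset in combinations(others, r):
                s = set(subset)
                weight = factorial(r) * factorial(n - r - 1) / factorial(n)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi

def white_porcelain_score(x):
    # Hypothetical toy model: less iron and titanium gives a higher score.
    return 5.0 - 1.5 * x["Fe2O3"] - 2.0 * x["TiO2"] - 0.5 * x["Fe2O3"] * x["TiO2"]

# Invented oxide percentages, for illustration only.
sample = {"Fe2O3": 0.8, "TiO2": 0.1, "P2O5": 0.05}
baseline = {"Fe2O3": 2.0, "TiO2": 0.8, "P2O5": 0.1}

phi = shapley_values(white_porcelain_score, sample, baseline)
for oxide, contribution in phi.items():
    print(oxide, round(contribution, 3))
```

In this toy case the low iron and titanium values both receive positive scores, meaning they push the sample toward white porcelain, while the phosphorus oxide, which the toy formula ignores, receives zero. The scores also add up to the difference between the model's output for the sample and for the baseline, which is the property that makes SHAP-style explanations easy to read.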
What This Means for Reading the Past
In essence, the study shows that computers can sort traditional Korean ceramics with expert-level accuracy while clearly spelling out which ingredients matter most. Rather than replacing curators and archaeologists, this approach offers them a quantitative companion: a way to double-check visual judgments, resolve borderline cases, and better understand how subtle shifts in clay and firing helped drive the evolution from green celadon to bold buncheong to pure white porcelain. As more chemical data are gathered from different kilns and periods, such explainable machine learning tools could become standard aids for reconstructing the technological choices and cultural values embedded in even the smallest pottery shard.
Citation: Cho, Y.E., Sim, S., Choi, J. et al. Explainable machine learning-based classification of traditional Korean ceramics using XRF chemical composition data. npj Herit. Sci. 14, 28 (2026). https://doi.org/10.1038/s40494-026-02301-4
Keywords: Korean ceramics, machine learning, XRF analysis, cultural heritage, porcelain classification