Clear Sky Science
Toward enhanced unsupervised clustering of 20th century Korean paintings via multimodal features
Seeing Patterns in Korean Modern Art
What if a computer could help us understand how painters are alike—or completely different—just by looking at their works? This study uses artificial intelligence to examine twentieth-century Korean paintings, revealing hidden patterns in color, texture, and style. For museum visitors, art lovers, and curious readers, it offers a new way to see how distinctive artists are, and how their works quietly cluster into families of style that even experts sometimes debate.
Building a Carefully Chosen Art Collection
To give the computer something meaningful to learn from, the researchers first assembled a focused digital collection: 1,100 paintings by eleven major modern and contemporary Korean artists, from ink landscapists to abstract painters and realists. Each artist contributed 100 works, gathered mainly from the National Museum of Modern and Contemporary Art (MMCA) and other trusted institutions and foundations. The group includes key figures such as abstract pioneers, realist painters of everyday life, innovators in ink wash, and artists blending folk traditions with modern expression. Their prominence in landmark national exhibitions, including the famous Lee Kun-hee Collection, helped ensure that the dataset reflects the core of twentieth-century Korean art rather than a random assortment of images.
Translating Paintings into Numbers
Computers cannot “see” art as people do, so the team translated each painting into a bundle of numerical features. They captured basic color information in two different ways (RGB and HSV), measured fine-grained texture patterns using a gray-level co-occurrence matrix (GLCM), and added a powerful semantic snapshot from a pre-trained vision–language model known as CLIP. CLIP was originally trained on huge numbers of image–text pairs from the internet, so it carries a broad, language-aware sense of what images look like. For each painting, these four streams (color, color variation, texture, and semantic impression) were normalized and then combined into a single, balanced feature vector, creating a compact but rich fingerprint of the artwork’s visual character.
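For readers who want to see what "normalized and then combined" might look like in practice, here is a minimal sketch. It is not the authors' code. The choice of L2 normalization, simple concatenation, equal weights, the block widths, and the random stand-in data are all assumptions made for illustration; the function names `l2_normalize` and `fuse_features` are hypothetical.

```python
import numpy as np


def l2_normalize(block):
    """Scale each row (one painting) to unit Euclidean length; zero rows stay zero."""
    block = np.asarray(block, dtype=float)
    norms = np.linalg.norm(block, axis=1, keepdims=True)
    return block / np.where(norms == 0.0, 1.0, norms)


def fuse_features(blocks, weights=None):
    """Normalize each feature block per painting, weight it, and concatenate."""
    if weights is None:
        weights = [1.0] * len(blocks)  # assumption: every stream counts equally
    parts = [w * l2_normalize(b) for b, w in zip(blocks, weights)]
    return np.concatenate(parts, axis=1)


# Random stand-ins for real descriptors: 3 paintings, arbitrary block widths.
rng = np.random.default_rng(0)
rgb_hist = rng.random((3, 8))               # placeholder for an RGB color descriptor
hsv_hist = rng.random((3, 8))               # placeholder for an HSV color descriptor
glcm_stats = rng.random((3, 4))             # placeholder for GLCM texture statistics
clip_embedding = rng.normal(size=(3, 16))   # placeholder for a CLIP image embedding

fused = fuse_features([rgb_hist, hsv_hist, glcm_stats, clip_embedding])
print(fused.shape)  # one row per painting, all four blocks side by side
```

Normalizing each block before joining them keeps a long vector, such as a CLIP embedding, from drowning out a short one, such as a handful of texture statistics. That is one plausible reading of "balanced" here.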

Letting the Clusters Emerge on Their Own
Rather than telling the computer which painting belonged to which artist during training, the researchers used an unsupervised approach: they asked the algorithm to group similar paintings on its own. First, a technique called t-SNE squeezed the high-dimensional fingerprints down to two dimensions so that the overall structure could be visualized. Then K-means clustering divided the paintings into many small groups, later refined to focus on the most meaningful clusters. Only after this process did the team attach artist names, using simple majority voting within each group, to check how well the clusters lined up with real authorship. The best version of the method—equally blending CLIP, color, and texture—correctly matched paintings to their artists about 82% of the time, outperforming versions that relied on single cues such as color alone or texture alone.
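The majority-voting check described above can be sketched in a few lines. This is an illustrative sketch, not the study's code: the cluster ids are assumed to come from an earlier clustering step (for example, K-means on a two-dimensional embedding), the toy labels are invented, and the function name `majority_vote_accuracy` is hypothetical.

```python
from collections import Counter, defaultdict


def majority_vote_accuracy(cluster_ids, artist_labels):
    """Name each cluster after its most common artist, then score the match."""
    members = defaultdict(list)
    for cluster, artist in zip(cluster_ids, artist_labels):
        members[cluster].append(artist)

    # Each cluster takes the name of the artist who appears in it most often.
    cluster_to_artist = {
        cluster: Counter(artists).most_common(1)[0][0]
        for cluster, artists in members.items()
    }

    # A painting counts as correct if its cluster's name matches its true artist.
    correct = sum(
        cluster_to_artist[c] == a for c, a in zip(cluster_ids, artist_labels)
    )
    return correct / len(artist_labels), cluster_to_artist


# Toy example: 10 paintings, 3 clusters, 3 invented artists "A", "B", "C".
cluster_ids = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
artist_labels = ["A", "A", "B", "B", "B", "B", "C", "C", "C", "A"]

accuracy, mapping = majority_vote_accuracy(cluster_ids, artist_labels)
print(accuracy, mapping)
```

In this toy case, eight of the ten paintings sit in a cluster named after their own artist, so the score is 0.8. Artist names enter only at the scoring stage, which is what keeps the grouping itself unsupervised.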
What the Computer Saw in Color and Brushwork
The clustering results were not just numbers; they produced recognizable visual stories. When the team plotted the clusters, most artists formed tight, well-separated islands of points, each island filled with representative works that shared clear traits: monochrome ink landscapes with delicate brushwork, bold geometric abstractions in primary colors, or quiet still lifes with stable compositions and repeated textures. For artists whose work hinges on a signature palette, such as bright color fields or specific tonal harmonies, simple color cues already worked quite well. For others, like ink painters or expressionists with dramatic brushwork, texture and semantic information were crucial. Misclassifications often occurred where human experts would also hesitate: abstract painters with similar compositions, or artists sharing fluid lines and overlapping color choices. In these cases, errors turned into clues about genuine visual kinships across different names.

From Data to Deeper Art Understanding
For non-specialists, the key takeaway is that a computer, looking only at digital images, could recover much of what art historians already know about who painted what—and even hint at unexpected relationships. By combining color, texture, and learned semantic impressions, the framework offers a repeatable, objective way to group and compare works by modern and contemporary Korean painters. It does not replace human judgment or the rich cultural context that experts bring, but it provides a quantitative map that can guide the eye to clusters, border zones, and visual cousins worth a closer look. In this way, machine learning becomes a new companion for curators and viewers, helping them navigate large collections and discover how the many voices of Korean modern art weave together into a complex, but analyzable, visual landscape.
Citation: Baek, S., Park, SJ., Park, SE. et al. Toward enhanced unsupervised clustering of 20th century Korean paintings via multimodal features. npj Herit. Sci. 14, 76 (2026). https://doi.org/10.1038/s40494-026-02304-1
Keywords: Korean modern art, artificial intelligence, painting style analysis, image clustering, digital art history