Clear Sky Science

Tsinghua bamboo slip scribe verification using Siamese networks


Ancient books meet modern code

Long before paper and screens, Chinese thinkers wrote on slim strips of bamboo. Today, these fragile manuscripts are treasure troves for historians, but one basic question is surprisingly hard to answer: which slips were written by the same hand? This study blends archaeology and artificial intelligence to build a digital assistant that can help scholars sort out who wrote what on thousands of 2,300‑year‑old bamboo slips from Tsinghua University’s famous collection.

Figure 1.

Why the handwriting matters

The Tsinghua bamboo slips date to the Warring States period, just before China’s first empire. They preserve early versions of texts on politics, history and philosophy that either shaped, or were lost from, the later tradition. To truly understand these writings, researchers need to know how the slips were grouped, which parts belong to the same manuscript, and how many scribes worked on them. Traditionally, experts answer such questions by eye, weighing stroke smoothness, pressure, and layout. This craft is slow, subjective and difficult to scale as new finds appear.

Turning strokes into data

The authors set out to teach a computer to compare individual handwritten characters cut from high‑resolution photos of the slips. They first built a large image collection: 15,745 single characters from 11 previously identified scribes, based on consensus paleographic studies. Using professional image‑processing software, they removed background noise, isolated each character within a rectangle, and filtered out damaged or overlapping signs. They then expanded the smaller classes—scribes with only a few surviving characters—by simple tricks such as flipping, rotating, cropping and adding noise, so that the algorithm would not be biased toward more common styles.
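The class-balancing step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: images are modeled as small grids of grayscale values, the helper names (`flip_horizontal`, `crop_and_pad`, `add_noise`, `balance_classes`) and the background value are hypothetical, and rotation is left out for brevity.

```python
import random

BACKGROUND = 255  # assumed pixel value for blank background (hypothetical)

def flip_horizontal(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]

def crop_and_pad(img, margin=1):
    """Blank out a border of `margin` pixels while keeping the image size."""
    h, w = len(img), len(img[0])
    return [
        [img[r][c]
         if margin <= r < h - margin and margin <= c < w - margin
         else BACKGROUND
         for c in range(w)]
        for r in range(h)
    ]

def add_noise(img, rng, amount=10):
    """Shift each pixel by a small random offset, clamped to 0..255."""
    return [[min(255, max(0, px + rng.randint(-amount, amount))) for px in row]
            for row in img]

def augment(img, rng):
    """Apply one randomly chosen transformation to a copy of the image."""
    op = rng.choice(("flip", "crop", "noise"))
    if op == "flip":
        return flip_horizontal(img)
    if op == "crop":
        return crop_and_pad(img)
    return add_noise(img, rng)

def balance_classes(dataset, rng):
    """Pad every scribe's image list with augmented copies until all
    scribes have as many images as the best-represented one."""
    target = max(len(images) for images in dataset.values())
    balanced = {}
    for scribe, images in dataset.items():
        extended = list(images)
        while len(extended) < target:
            extended.append(augment(rng.choice(images), rng))
        balanced[scribe] = extended
    return balanced

# Toy data: scribe_1 has four characters, scribe_2 has only one.
rng = random.Random(0)
toy = {
    "scribe_1": [[[0, 255, 0], [255, 0, 255], [0, 255, 0]]] * 4,
    "scribe_2": [[[0, 0, 0], [255, 255, 255], [0, 0, 0]]],
}
balanced = balance_classes(toy, rng)
```

After balancing, both toy scribes contribute four images, so a model trained on them would see each hand equally often.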

A twin network that looks for likeness

Instead of asking the computer to name the writer of each character, the team posed an easier but more flexible question: do these two images come from the same hand? To do this, they used a “Siamese” neural network, a pair of identical image‑processing branches that share parameters. Each branch converts a character image into a compact numerical fingerprint. The system then measures the distance between the two fingerprints: small distances suggest the same scribe, larger ones suggest different scribes. At the heart of each branch is an upgraded lightweight model called MobileNet_V3+, enhanced with an attention mechanism that learns to emphasize the most telling visual features—subtle curves, stroke thickness, or preferred ways of forming parts of characters—while downplaying less useful details.
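The "same hand or not?" decision can be illustrated with a toy sketch. The key idea it shows is weight sharing: both images pass through the same function with the same parameters before their fingerprints are compared. Everything specific here is an assumption for illustration: the single linear layer in `embed` merely stands in for the MobileNet_V3+ branch, and the weights, the Euclidean distance and the threshold are hypothetical choices, since the summary does not state which the authors used.

```python
import math

def embed(pixels, weights):
    """Turn a flattened image into a short fingerprint vector.
    A single linear layer stands in for the real network branch."""
    return [sum(w * p for w, p in zip(row, pixels)) for row in weights]

def euclidean(a, b):
    """Straight-line distance between two fingerprints."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def same_scribe(img_a, img_b, weights, threshold):
    """Run both images through the SAME weights, then compare fingerprints.
    Returns (decision, distance)."""
    distance = euclidean(embed(img_a, weights), embed(img_b, weights))
    return distance < threshold, distance

# Hypothetical shared weights and toy 4-pixel "images".
shared_weights = [[0.5, -0.5, 0.5, -0.5],
                  [0.5, 0.5, -0.5, -0.5]]
char_a = [1.0, 0.0, 1.0, 0.0]
char_b = [1.0, 0.0, 0.9, 0.0]   # nearly identical to char_a
char_c = [0.0, 1.0, 0.0, 1.0]   # very different from char_a

match_ab, dist_ab = same_scribe(char_a, char_b, shared_weights, threshold=0.5)
match_ac, dist_ac = same_scribe(char_a, char_c, shared_weights, threshold=0.5)
```

Because the question is about pairs rather than named writers, a system built this way can in principle compare hands it never saw during training, which is what makes the approach flexible.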

Figure 2.

How well the system works

On the Tsinghua dataset, the best version of the model correctly judged whether pairs of characters came from the same scribe about 90% of the time, with a very high score on a standard test of two‑category discrimination. It outperformed several heavier‑weight image‑recognition systems, such as ResNet, VGG and Vision Transformers, which tended either to overfit the limited data or miss the fine stylistic cues needed for this task. Visual inspections of the network’s “attention maps” showed that, as training progressed, the model stopped looking at the overall silhouette and instead locked onto key stroke segments—much like a human expert.
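For readers curious how such pair-level scores are computed, here is a minimal sketch. It assumes the model outputs one distance per pair and that label 1 means "same scribe". The summary does not name the two-category test, so the area under the ROC curve used below is an assumption (it is one common choice), and the numbers are invented toy values rather than results from the study.

```python
def pair_accuracy(distances, labels, threshold):
    """Fraction of pairs judged correctly when 'distance < threshold'
    is read as 'same scribe'."""
    correct = sum((d < threshold) == bool(y) for d, y in zip(distances, labels))
    return correct / len(labels)

def roc_auc(distances, labels):
    """Probability that a randomly chosen same-scribe pair has a smaller
    distance than a randomly chosen different-scribe pair (ties count half)."""
    same = [d for d, y in zip(distances, labels) if y]
    different = [d for d, y in zip(distances, labels) if not y]
    wins = sum(1.0 if s < o else 0.5 if s == o else 0.0
               for s in same for o in different)
    return wins / (len(same) * len(different))

# Invented toy values: five pairs, three of them truly from the same scribe.
toy_distances = [0.1, 0.2, 0.9, 0.4, 0.6]
toy_labels = [1, 1, 0, 0, 1]

accuracy = pair_accuracy(toy_distances, toy_labels, threshold=0.5)
auc = roc_auc(toy_distances, toy_labels)
```

Accuracy depends on where the threshold is set, whereas a ranking score such as the one above summarizes performance across all possible thresholds, which is why the two are often reported together.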

Helping resolve real scholarly debates

To see whether the tool is useful beyond the lab, the authors applied it to several bamboo manuscripts whose scribal attribution has been debated for years. For three texts (“Ji Gong”, “Hou Fu” and “She Ming”), scholars had gradually come to believe that all were written by the same scribe within the broader “Yin Zhi” group. The model strongly supported this view, finding very high similarity across all pairings. For another pair of manuscripts, “Zhi Zheng” and “Zhi Bang”, researchers had argued over whether one scribe or several were involved. The network’s comparisons suggested that pages 1–42 of “Zhi Zheng” formed one distinct scribal style, while page 43 of “Zhi Zheng” closely matched “Zhi Bang” but not the earlier pages. That pattern is evidence for two separate scribes, neither of whom belongs to any previously defined category.
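One plausible way to move from single-character comparisons to a verdict about whole manuscripts is to compare every character of one manuscript with every character of another and count how often the pair falls below the distance threshold. The sketch below assumes exactly that; the summary does not describe the authors' aggregation rule, and the function name `match_rate`, the manuscript labels, the fingerprints and the threshold are all hypothetical.

```python
import math
from itertools import product

def match_rate(fingerprints_a, fingerprints_b, threshold):
    """Share of cross-manuscript character pairs whose fingerprints lie
    closer together than `threshold`."""
    pairs = list(product(fingerprints_a, fingerprints_b))
    hits = sum(math.dist(a, b) < threshold for a, b in pairs)
    return hits / len(pairs)

# Invented 2-D fingerprints for three unnamed manuscripts.
manuscripts = {
    "manuscript_A": [(0.0, 0.0), (0.1, 0.0)],
    "manuscript_B": [(0.05, 0.05)],
    "manuscript_C": [(1.0, 1.0), (0.9, 1.1)],
}

rate_ab = match_rate(manuscripts["manuscript_A"], manuscripts["manuscript_B"], 0.3)
rate_ac = match_rate(manuscripts["manuscript_A"], manuscripts["manuscript_C"], 0.3)
```

A high match rate between two manuscripts, together with a low rate against a third, is the kind of pattern that would point to a shared scribe for the first two and a different hand for the third.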

What this means for the past and the future

In plain terms, this work shows that a compact AI system can reliably tell when two tiny fragments of ancient handwriting likely come from the same person, even when it sees only single characters. It will not replace expert judgment, but it can rapidly scan large collections, flag likely matches, and provide quantitative backing for or against particular groupings of slips. Beyond the Tsinghua cache, the same approach could be adapted to other fragile records, from oracle bones to Silk Road scrolls, helping historians and linguists piece together how ideas moved across time and space.

Citation: Wang, H., Li, M., Liu, B. et al. Tsinghua bamboo slip scribe verification using Siamese networks. npj Herit. Sci. 14, 147 (2026). https://doi.org/10.1038/s40494-026-02416-8

Keywords: bamboo slips, handwriting analysis, deep learning, cultural heritage, Siamese network