Clear Sky Science

Semantic segmentation and spatial grid analysis of Chinese heritage landscape photographic compositions with cross-cultural perspectives


How Holiday Photos Reveal What We Love About Gardens

When we travel, we snap countless photos without thinking much about what they say. Yet the things we choose to frame – a quiet pond, a sweeping bridge, or an ornate pavilion – quietly reveal what we value and how we see a place. This study turns thousands of tourist photos from two world‑famous Chinese classical gardens into data, using advanced image analysis to uncover how visitors from different cultures visually experience these heritage landscapes, and how that knowledge can guide better conservation and visitor design.

Figure 1.

Turning Casual Snapshots into Useful Clues

The researchers focused on two iconic sites: the royal Summer Palace in Beijing, with its vast lake and commanding buildings, and the Humble Administrator’s Garden in Suzhou, known for intimate courtyards and literary charm. They collected more than 9,000 user‑uploaded photographs from two travel platforms. One platform mainly serves Chinese users and stands in for Eastern visitors; the other is popular with international travelers and stands in for Western visitors. Instead of asking people what they liked through surveys, the team treated each vacation snapshot as a record of what caught a visitor’s eye at that moment.

Teaching a Computer to See Garden Elements

To read these images systematically, the team used a deep‑learning method called semantic segmentation, which teaches a computer to color every pixel according to what it depicts. They reduced a long list of visual labels into ten easy‑to‑recognize garden ingredients such as trees and plants, water, buildings, enclosing walls, pathways and bridges, rocks, and decorative furnishings. For each photo, the system noted whether an element was present at all and how much of the frame it occupied. Then the authors overlaid a simple three‑by‑three grid – similar to the well‑known “Rule of Thirds” in photography – to see where in the picture each type of element tended to sit: top or bottom, center or side.
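The two measurements described above (how much of the frame each element fills, and where it sits in a three-by-three grid) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the label ids, the three element names, and the tiny 6x6 mask are hypothetical stand-ins, and the segmentation model that would produce a real mask is not shown.

```python
from collections import Counter

# Hypothetical label ids and names; the study's actual categories are not reproduced here.
LABELS = {0: "vegetation", 1: "water", 2: "building"}


def element_shares(mask):
    """Return {element name: fraction of the frame} for a 2D mask of label ids."""
    counts = Counter(label for row in mask for label in row)
    total = sum(counts.values())
    return {LABELS[k]: v / total for k, v in counts.items()}


def grid_shares(mask, label_id, rows=3, cols=3):
    """Return a rows x cols table giving the fraction of one element's pixels per cell."""
    h, w = len(mask), len(mask[0])
    cells = [[0] * cols for _ in range(rows)]
    for y, row in enumerate(mask):
        for x, label in enumerate(row):
            if label == label_id:
                # Integer arithmetic maps each pixel to its grid cell.
                cells[y * rows // h][x * cols // w] += 1
    total = sum(map(sum, cells))
    return [[c / total if total else 0.0 for c in r] for r in cells]


# Toy "photo": trees on top, a small building in the middle, water along the bottom.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 2, 2, 0, 0],
    [0, 0, 2, 2, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1],
]

shares = element_shares(mask)       # presence = key exists, size = its value
building_grid = grid_shares(mask, 2)  # all building pixels fall in the center cell
water_grid = grid_shares(mask, 1)     # water is spread evenly across the bottom row
```

In this toy frame, water covers one third of the image and sits entirely in the bottom row of the grid, while the building occupies only the center cell.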

Figure 2.

What Visitors Actually Point Their Cameras At

The pixel‑by‑pixel analysis showed clear patterns. Across both gardens and both tourist groups, greenery was almost universal, appearing in more than 95% of images and often filling the largest share of the frame. Water surfaces, though less frequent, tended to spread widely whenever they appeared, creating open, airy scenes. Certain ingredients were rarer but powerful when used – for example, large buildings or long bridges that, once included, often dominated the photo. By counting how many different element types appeared together, the team found that most photos contained between one and five components, a balance between simplicity and richness. Western visitors tended to capture more varied mixes of elements in some areas, while Chinese visitors showed steadier patterns across both gardens.
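The two tallies mentioned here, how often each element appears and how many element types share a frame, are simple counts once every photo has been reduced to a set of detected elements. The sketch below assumes that reduction has already happened; the five example photos and their element names are invented for illustration and are not data from the study.

```python
from collections import Counter


def presence_rates(photos):
    """Return {element: share of photos in which the element appears at least once}."""
    n = len(photos)
    counts = Counter(element for photo in photos for element in set(photo))
    return {element: c / n for element, c in counts.items()}


def richness_histogram(photos):
    """Return {number of distinct element types: number of photos with that count}."""
    return Counter(len(set(photo)) for photo in photos)


# Hypothetical photos, each described by the elements found in it.
photos = [
    {"vegetation", "water"},
    {"vegetation", "building", "water"},
    {"vegetation"},
    {"vegetation", "bridge", "water", "rock"},
    {"building", "wall"},
]

rates = presence_rates(photos)         # e.g. vegetation appears in 4 of 5 photos
richness = richness_histogram(photos)  # e.g. two photos contain exactly 2 element types
```

With real data, the first function would yield figures like the "more than 95%" reported for greenery, and the second would show how photos are distributed across the one-to-five-component range.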

Different Eyes, Different Compositions

Looking at where elements fell within the grid revealed cultural contrasts in how scenes were framed. Photos by Chinese tourists were more often centered and balanced, filling both the middle and upper parts of the image – a style that echoes traditional Chinese aesthetics emphasizing harmony among buildings, plants, and water. Western visitors, in contrast, favored “bottom‑heavy” compositions, with strong foreground features like paths and bridges leading into the scene and prominent structures sitting lower in the frame. They also showed a special attraction to doors, windows, and other openings that frame a view, especially in the Suzhou garden, whereas Chinese visitors foregrounded walls, plants, and furnishings tied to literati culture.
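One way to put a number on a label like "bottom-heavy" is to average the three-by-three grids of many photos and then add up each row. The sketch below assumes each photo has already been summarized as a grid of pixel shares for one element; the two example grids are made up, and the study may have quantified placement differently.

```python
def mean_grid(grids):
    """Average several 3x3 grids cell by cell."""
    n = len(grids)
    return [[sum(g[r][c] for g in grids) / n for c in range(3)] for r in range(3)]


def row_weights(grid):
    """Return the share of an element in the top, middle, and bottom rows."""
    return [sum(row) for row in grid]


# Hypothetical grids for one element (say, paths) in two photos from one visitor group.
group_grids = [
    [[0.0, 0.0, 0.0], [0.0, 0.5, 0.0], [0.25, 0.25, 0.0]],
    [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.5, 0.5, 0.0]],
]

average = mean_grid(group_grids)
top, middle, bottom = row_weights(average)
```

Here three quarters of the element's area falls in the bottom row, the kind of pattern the article describes for Western visitors' foreground paths and bridges.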

Hotspots, Heritage Stories, and Why It Matters

By clustering photos with similar combinations of elements, the study could infer popular shooting locations: lake shores, zigzag bridges, moon gates, temple complexes, and courtyard corridors. Chinese visitors were drawn to spaces that express imperial stories and themes of harmony, such as willow‑lined walks around Kunming Lake or literary pavilions. Western visitors gravitated toward striking architectural objects – the Marble Boat, temple towers, or framed views through arches – often paired with water. These tendencies, the authors argue, reflect deeper cultural habits of attention: some observers seek the overall mood and relationships within a scene, while others zero in on bold, isolated subjects.
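The idea of grouping photos by shared ingredients can be illustrated with a deliberately simple technique: greedy grouping by Jaccard similarity, where a photo joins the first group whose founding photo overlaps with it enough. This is a stand-in chosen for brevity, not the clustering method used in the study, and the element sets and the 0.5 threshold are assumptions.

```python
def jaccard(a, b):
    """Overlap between two element sets: shared elements divided by all elements."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def greedy_groups(photos, threshold=0.5):
    """Assign each photo to the first group whose founding photo is similar enough."""
    groups = []  # each group is a list of photo indices; index 0 is the founder
    for i, photo in enumerate(photos):
        for group in groups:
            if jaccard(photo, photos[group[0]]) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])  # no similar group found, so start a new one
    return groups


# Hypothetical photos: three lakeside bridge views and two doorway views.
photos = [
    {"water", "bridge", "vegetation"},
    {"water", "bridge"},
    {"building", "wall", "door"},
    {"building", "door"},
    {"water", "vegetation", "bridge", "rock"},
]

groups = greedy_groups(photos)  # [[0, 1, 4], [2, 3]]
```

The water-and-bridge photos end up together and the doorway photos form a second group, which is the kind of signal that lets a shooting location be inferred from image content alone. The result depends on photo order and on the threshold, so a real analysis would likely use a more robust clustering method.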

From Pixels to Better Garden Experiences

In everyday language, the study shows that holiday photos are more than souvenirs: they are a window into how different people “read” the same historic place. By decoding what tourists choose to include, how large they make it, and where they position it in the frame, managers of heritage gardens gain an objective map of what visitors actually notice and cherish. That knowledge can inform everything from where to place paths and viewing platforms to how to design marketing images that speak to different audiences, all while easing pressure on crowded spots. Although this work focused on two gardens, the approach – letting computers sift through thousands of casual photos to reveal shared patterns of taste – could help protect and enliven many heritage landscapes worldwide.

Citation: Chai, H., Lu, S., Ni, L. et al. Semantic segmentation and spatial grid analysis of Chinese heritage landscape photographic compositions with cross-cultural perspectives. npj Herit. Sci. 14, 176 (2026). https://doi.org/10.1038/s40494-026-02439-1

Keywords: Chinese classical gardens, tourist photography, heritage perception, deep learning, cross-cultural preferences