Clear Sky Science
A dataset of tourist mobility networks across China derived from online travel blogs
Why your holiday stories matter
Every time someone shares a travel tale online, they leave behind more than pretty pictures and memories. Hidden in those posts are clues about where people go, how they move from place to place, and what makes some destinations more connected than others. This study taps into thousands of Chinese travel blogs to build a detailed picture of how tourists actually move between attractions across China, offering new insight for curious readers, planners, and anyone who wonders how digital footprints can reshape our understanding of travel.

From personal trips to a national map of movement
The researchers focused on Qunar.com, a major Chinese travel platform whose blog tool gently nudges users to record their journeys day by day and attraction by attraction. Unlike free‑form social media posts, these blogs are laid out in a structured, chronological way and are tied to a built‑in database of tourist attractions. That design choice turned countless holiday diaries into a rich source of structured information. By collecting blogs written about trips within China during a ten‑year span, the team was able to read not the stories themselves, but the ordered lists of places that bloggers said they visited on their journeys.
Turning stories into networks of places
In the dataset the team built, each tourist attraction becomes a point in a vast web, and every move from one attraction to the next becomes a line between two points. If many bloggers reported going from a lakeside park to a nearby old town, that connection appears as a heavily used link in the web. By stringing together each blogger’s visit list in order, the researchers reconstructed tens of thousands of itineraries and then combined them into nationwide “mobility networks.” These networks are different from usual travel statistics: instead of showing how people go from their home city to a destination, they reveal how visitors weave their way from sight to sight once they have arrived.
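The idea of turning ordered visit lists into a web of weighted links can be illustrated with a minimal Python sketch. It is not the authors' released code: the function name, the attraction names, and the choice to treat each move as directed (from the earlier attraction to the later one) are assumptions made for illustration.

```python
from collections import Counter


def build_mobility_network(itineraries):
    """Count moves between consecutive attractions across all itineraries.

    Returns a Counter mapping (origin, destination) pairs to the number of
    itineraries steps that made that move. Treating moves as directed is an
    assumption of this sketch.
    """
    edge_weights = Counter()
    for visits in itineraries:
        # Pair each attraction with the one visited immediately after it.
        for origin, destination in zip(visits, visits[1:]):
            edge_weights[(origin, destination)] += 1
    return edge_weights


# Hypothetical itineraries, invented for illustration only.
itineraries = [
    ["Lakeside Park", "Old Town", "Hilltop Temple"],
    ["Lakeside Park", "Old Town"],
    ["Old Town", "Lakeside Park"],
]

network = build_mobility_network(itineraries)
for (origin, destination), weight in sorted(network.items()):
    print(f"{origin} -> {destination}: {weight}")
```

In this toy example the link from the lakeside park to the old town carries the most weight because two of the three itineraries include that move, which is the sense in which a connection becomes "heavily used."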

Peeking behind the data‑gathering curtain
To build a clean and reliable picture, the team had to make careful choices. They filtered out blogs that mostly described trips outside China, removed duplicate copies of the same blog, and ignored posts that mentioned only a single attraction, since those provide no information about movement. When bloggers listed the same attraction several times in a row, those repeats were collapsed into one, because no real movement took place. For each attraction mentioned, the researchers pulled in its approximate location, its hosting city, and both Chinese and English names, using map and translation services. Importantly, they did not keep expressive content such as narrative text or photos; only factual parts like dates, locations, and anonymous blog IDs were retained to respect both platform rules and user privacy.
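Three of the cleaning steps described above (dropping duplicate blogs, collapsing back-to-back repeats, and discarding single-attraction posts) might look roughly like the sketch below. The input layout, the function name, and the use of a blog ID to detect duplicates are assumptions for illustration; the filter for trips outside China is omitted because it depends on location data not shown here.

```python
from itertools import groupby


def clean_itineraries(blogs):
    """Clean raw visit lists given as (blog_id, [attraction, ...]) pairs.

    Assumption of this sketch: duplicate copies of a blog share the same ID.
    """
    seen_ids = set()
    cleaned = {}
    for blog_id, visits in blogs:
        # Skip duplicate copies of a blog that was already processed.
        if blog_id in seen_ids:
            continue
        seen_ids.add(blog_id)
        # Collapse consecutive repeats: A, A, B becomes A, B.
        collapsed = [name for name, _ in groupby(visits)]
        # A single attraction carries no information about movement.
        if len(collapsed) < 2:
            continue
        cleaned[blog_id] = collapsed
    return cleaned


# Hypothetical raw records, invented for illustration only.
raw_blogs = [
    ("b1", ["A", "A", "B"]),
    ("b1", ["A", "A", "B"]),   # duplicate copy of b1
    ("b2", ["C"]),             # only one attraction
    ("b3", ["D", "D", "D"]),   # one attraction repeated
    ("b4", ["A", "B", "A"]),   # a return visit is kept, not collapsed
]

cleaned = clean_itineraries(raw_blogs)
print(cleaned)
```

Note that only repeats listed in a row are merged. A blogger who returns to an attraction after visiting somewhere else has genuinely moved, so that sequence is left as it is.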
Different trips, different patterns
Because each blog on Qunar.com includes simple background details, the dataset can be sliced in several revealing ways. The team grouped trips by season (spring, summer, autumn, and winter) and also by travel companions, distinguishing solo trips from journeys with friends or with family. For each of these groups they built a separate network, so that future researchers can compare, for example, how winter visits connect ski resorts versus how summer trips link beaches and historic towns. When they examined the overall structure of these networks, they found patterns familiar from other large-scale travel studies: a few highly popular attractions dominate many routes, while most places take part in far fewer transitions. They also showed that clusters in the network line up well with China’s provincial boundaries, suggesting that tourists tend to move within recognizable regional circuits.
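Splitting trips into groups and building one network per group could be sketched as follows. The record fields, the function name, and the month-to-season mapping (March to May as spring, and so on) are assumptions for illustration; the summary does not say how the authors assigned seasons.

```python
from collections import Counter, defaultdict

# Assumed meteorological seasons; the source does not specify the mapping.
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def networks_by_group(trips, key):
    """Build one transition-count network per group.

    `key` is a function that takes a trip record and returns its group label.
    """
    networks = defaultdict(Counter)
    for trip in trips:
        group = key(trip)
        visits = trip["visits"]
        # Count each move between consecutive attractions within the group.
        for origin, destination in zip(visits, visits[1:]):
            networks[group][(origin, destination)] += 1
    return networks


# Hypothetical trip records, invented for illustration only.
trips = [
    {"month": 1, "companion": "solo", "visits": ["Ski Resort", "Hot Spring"]},
    {"month": 7, "companion": "family", "visits": ["Beach", "Old Town"]},
    {"month": 12, "companion": "friends", "visits": ["Ski Resort", "Hot Spring"]},
]

by_season = networks_by_group(trips, lambda t: SEASON_BY_MONTH[t["month"]])
by_companion = networks_by_group(trips, lambda t: t["companion"])

for season, network in sorted(by_season.items()):
    print(season, dict(network))
```

Passing the grouping rule in as a function keeps the network-building step identical for every slice, so the seasonal and companion networks remain directly comparable.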
Strengths, limits, and future uses
The authors are careful to stress that bloggers are not a perfect mirror of all tourists. People who write travel blogs tend to be enthusiastic, internet‑savvy travelers, often taking leisure or sightseeing trips rather than business journeys or family visits. The number of blogs on Qunar.com also rose and fell over the years, especially after a major corporate merger that likely changed how the platform was promoted. As a result, the dataset is best suited to exploring relative patterns—such as which attractions are strongly linked or how seasonal routes differ—rather than precise headcounts of visitors. Still, by releasing both the cleaned networks and the underlying visit sequences as open data, along with code to rebuild and adjust the networks, the study offers a powerful new lens for anyone interested in tourism, urban planning, transportation, or the broader question of how our online traces can illuminate the way we move through the world.
What it all means for everyday travel
For a lay reader, the takeaway is simple: the casual act of logging a trip online can be combined with thousands of other logs to reveal the hidden skeleton of a country’s tourism system. This work shows that personal travel diaries, when handled carefully and stripped of identifying details, can help map which attractions naturally group together into routes, which cities serve as hubs, and how seasons and travel companions shape our paths. In doing so, it lays the groundwork for smarter destination planning, more balanced promotion of lesser‑known sites, and richer comparisons between the experiences of “online tourists” and the wider traveling public.
Citation: Zheng, Y., Wang, J., Zhang, Y. et al. A dataset of tourist mobility networks across China derived from online travel blogs. Sci Data 13, 443 (2026). https://doi.org/10.1038/s41597-026-06780-9
Keywords: tourist mobility, user-generated travel data, China tourism, network analysis, online travel blogs