Clear Sky Science

Cross-language hotel review sentiment analysis via multi-agent federated learning with heterogeneous graph attention networks


Why Online Hotel Reviews Matter

Picking a hotel increasingly starts with scrolling through other travelers’ comments. Those reviews come in many languages and from many websites, and they can make or break a hotel’s reputation and revenue. This paper describes a new system that reads hotel reviews in several languages at once, determines whether they are positive or negative, and warns hotel managers early when their reputation is about to shift, all while keeping guests’ data on the platform where it was written.

Figure 1.

Many Languages, Many Platforms, One Problem

Today, travelers post reviews in English, Chinese, French, German and mixed-language styles across booking sites such as Booking.com, TripAdvisor, Agoda and Ctrip. Existing tools often translate everything into one language before analyzing it, which can mangle meaning, especially in smaller or culturally specific languages. Other tools send all review text to a single central server, raising privacy worries and clashing with strict data protection laws. As a result, hotels may miss subtle shifts in guest satisfaction or fail to notice coordinated fake-review attacks in time to respond.

Working Together Without Sharing Raw Data

The authors propose a “federated” setup in which each participating site–language pair runs its own local software agent. These agents learn from their own reviews, but instead of sending the raw text to a central hub, they share only mathematical updates to a shared model. A coordination layer combines these updates and sends back an improved model to all partners. Extra safeguards, like adding carefully calibrated noise and encrypting the updates, make it very hard for anyone to reconstruct what an individual guest actually wrote. This design allows the system to learn from 154,680 reviews across four languages while staying within modern privacy regulations.
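The update-sharing step described above can be illustrated with a minimal sketch. This summary does not state the paper's actual aggregation rule, so the code assumes a plain size-weighted average of client updates (in the style of federated averaging), with each update clipped and perturbed by Gaussian noise before it is combined. The function names, the `max_norm` and `noise_std` parameters, and the toy numbers are all hypothetical, and encryption of the updates is left out.

```python
import math
import random


def clip_update(update, max_norm):
    """Scale an update vector so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    if norm <= max_norm or norm == 0.0:
        return list(update)
    scale = max_norm / norm
    return [x * scale for x in update]


def federated_round(global_weights, client_updates, client_sizes,
                    max_norm=1.0, noise_std=0.0, rng=None):
    """Run one aggregation round (assumed scheme, not the paper's exact rule).

    Each client's update is clipped, perturbed with Gaussian noise, and then
    averaged with a weight proportional to that client's number of reviews.
    Only the updates are combined here; raw review text never appears.
    """
    rng = rng or random.Random(0)
    total = sum(client_sizes)
    aggregate = [0.0] * len(global_weights)
    for update, size in zip(client_updates, client_sizes):
        clipped = clip_update(update, max_norm)
        noisy = [x + rng.gauss(0.0, noise_std) for x in clipped]
        for i, x in enumerate(noisy):
            aggregate[i] += (size / total) * x
    return [w + a for w, a in zip(global_weights, aggregate)]


# Hypothetical example: two site-language agents with equal review counts.
new_weights = federated_round(
    global_weights=[0.0, 0.0],
    client_updates=[[3.0, 4.0], [0.0, 0.0]],
    client_sizes=[1, 1],
    max_norm=5.0,
    noise_std=0.1,
)
print(new_weights)
```

Clipping bounds how much any single agent can move the shared model, which is what makes the added noise meaningful as a privacy safeguard. A larger `noise_std` hides individual contributions better but slows learning.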

Seeing Reviews as a Living Network

Rather than treating each review as an isolated block of text, the system turns the entire review universe into a rich network. In this network, nodes represent guests, hotels, reviews, languages and time points, and links capture who stayed where, when they wrote, and in which language. A heterogeneous graph attention network, a type of neural network designed for graphs with several kinds of nodes and links, then “looks” across this web, paying more attention to the most informative connections, such as trusted reviewers or recurring patterns in a specific language. This approach allows the model to pick up both universal signals (“excellent location”, “unclean room”) and language-specific phrases or habits that ordinary multilingual models often miss.
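The idea of paying more attention to some connections than others can be sketched in a few lines. This is a deliberately simplified illustration, not the paper's architecture: a real heterogeneous graph attention network learns projection matrices per node and link type, whereas this sketch assumes fixed feature vectors and a single hypothetical scalar weight per link type. The link names (`written_by`, `in_language`) and all numbers are made up for the example.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(target, neighbors, type_weight):
    """Aggregate neighbor features for one node using attention.

    target:      feature vector of the node being updated (e.g. a review)
    neighbors:   list of (link_type, feature_vector) pairs
    type_weight: hypothetical scalar importance per link type
    """
    # Score each neighbor by similarity to the target, scaled by link type.
    scores = [
        type_weight[link_type] * sum(a * b for a, b in zip(target, feat))
        for link_type, feat in neighbors
    ]
    weights = softmax(scores)
    # The new representation is the attention-weighted sum of neighbors.
    combined = [
        sum(w * feat[i] for w, (_, feat) in zip(weights, neighbors))
        for i in range(len(target))
    ]
    return weights, combined


# Hypothetical review node linked to its author and to its language.
review = [1.0, 0.0]
links = [("written_by", [1.0, 0.0]), ("in_language", [0.0, 1.0])]
weights, combined = attend(review, links, {"written_by": 1.0, "in_language": 1.0})
print(weights, combined)
```

In this toy case the reviewer link receives the larger weight because its features align with the review's own, which mirrors how the model described above can emphasize trusted reviewers over less informative links.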

Figure 2.

From Sentiment Scores to Live Reputation and Fake Review Alerts

On top of this multilingual engine, the authors build a reputation monitor that tracks how a hotel’s standing changes over time. The system looks for sharp rises in negative comments, slow but steady improvements after service upgrades, and unusual bursts of activity that may signal organized manipulation. It analyzes writing style, timing and reviewer behavior to filter out fake or suspicious reviews, and it updates a hotel’s reputation score in near real time. In tests, the system identified fake reviews with over 93% accuracy and could flag major reputation shifts, on average, just over three days before they became obvious on public rating pages.
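One simple way to picture the detection of sharp rises in negative comments is a rolling comparison against recent history. The sketch below is an assumption for illustration only, since this summary does not describe the paper's actual detection method: it flags a day when the count of negative reviews sits far above the average of the preceding week. The function name, the `window` and `threshold` parameters, and the sample counts are hypothetical.

```python
from statistics import mean, pstdev


def flag_bursts(daily_negative, window=7, threshold=3.0):
    """Return indices of days whose negative-review count spikes.

    A day is flagged when its count exceeds the mean of the previous
    `window` days by at least `threshold` standard deviations. The
    deviation is floored at 1.0 so that very quiet periods do not
    turn a single extra complaint into an alert.
    """
    flagged = []
    for day in range(window, len(daily_negative)):
        history = daily_negative[day - window:day]
        baseline = mean(history)
        spread = max(pstdev(history), 1.0)
        if (daily_negative[day] - baseline) / spread >= threshold:
            flagged.append(day)
    return flagged


# Hypothetical daily counts of negative reviews for one hotel.
counts = [2, 3, 2, 3, 2, 3, 2, 12, 3]
print(flag_bursts(counts))  # prints [7]
```

A rule like this only covers timing. The system described above also weighs writing style and reviewer behavior, which a count-based check cannot capture.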

What This Means for Travelers and Hotels

For a general reader, the take‑home message is that it is now possible to combine reviews from multiple languages and platforms into a single, privacy‑aware picture of hotel quality. The proposed system classifies review sentiment more accurately than leading language models, treats data more carefully than centralized services, and spots both genuine problems and suspicious review patterns earlier than traditional tools. In practice, this means travelers can rely on more trustworthy ratings and summaries, while hotels gain an early‑warning dashboard that encourages them to fix issues quickly and respond to guests more effectively, all without exposing personal review data beyond where it was originally posted.

Citation: Han, X. Cross-language hotel review sentiment analysis via multi-agent federated learning with heterogeneous graph attention networks. Sci Rep 16, 12681 (2026). https://doi.org/10.1038/s41598-026-41500-8

Keywords: hotel reviews, sentiment analysis, federated learning, fake review detection, online reputation