Clear Sky Science · en

Evaluating the quality of AI-generated subtitle translations from a reception-oriented perspective: a comparative study of ChatGPT, human, and neural machine translations in sitcoms


Why subtitles for comedy matter

Streaming has turned foreign TV shows into everyday entertainment, but jokes can fall flat if subtitles miss the mark. This study looks at how well different kinds of Chinese–English subtitles work for the classic American sitcom Friends, and asks a simple question that matters to any viewer: do AI tools like ChatGPT make it easier or harder to enjoy the show?

Different ways to create subtitles

The researchers compared three kinds of subtitles for short clips from the first episode of Friends. The first set came from a professional fan group that carefully translated the dialogue. The second set used a familiar online translation engine. The third used ChatGPT, asked only to translate the English lines into natural Chinese. All subtitles were bilingual, with Chinese above English. The clips featured tricky moments involving wordplay, sarcasm, or emotional shifts, which are especially important in comedy.

To understand how viewers reacted, the team sent an online survey to hundreds of Chinese participants. Each person watched nine clips: three scenes, each shown three times, once per subtitle version, in random order so that viewers would not know which version was which. After each trio, they chose which subtitles best helped them follow the plot and rated their satisfaction on a simple five-point scale. A final question asked what they valued most in subtitles, such as accuracy, ease of understanding, or smooth flow with the video.

Figure 1. How different subtitle creators shape the viewing experience of a sitcom episode.

Measuring subtitle quality from two angles

The study did not stop at personal opinions. The authors also ran the three subtitle versions through a specialist rating system that scores how well subtitles match the original meaning, read smoothly, and fit the screen comfortably. This system tracks different kinds of mistakes, from clumsy wording to serious changes in meaning, and turns them into an overall quality score. By comparing these scores with viewers’ ratings, the researchers could see whether expert-style assessments lined up with everyday audience experience.
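To make the idea of turning mistakes into a score concrete, here is a minimal Python sketch of a penalty-based rating. It is not the system the authors used: the severity weights, the category labels, and the names `SEVERITY_PENALTY`, `SubtitleError`, and `quality_score` are all assumptions made for illustration, and the summary does not give the real values.

```python
from dataclasses import dataclass

# Hypothetical severity weights (assumption: not taken from the study).
SEVERITY_PENALTY = {"minor": 0.25, "standard": 0.5, "serious": 1.0}


@dataclass(frozen=True)
class SubtitleError:
    category: str  # illustrative label, e.g. "meaning", "wording", "readability"
    severity: str  # must be a key of SEVERITY_PENALTY


def quality_score(errors, subtitle_count):
    """Return a score in [0, 1]; 1.0 means no penalties were recorded.

    The total penalty is averaged over the number of subtitles, so a
    serious change in meaning costs more than clumsy wording.
    """
    if subtitle_count <= 0:
        raise ValueError("subtitle_count must be positive")
    total_penalty = sum(SEVERITY_PENALTY[e.severity] for e in errors)
    return max(0.0, 1.0 - total_penalty / subtitle_count)


# Made-up example: one clumsy phrase and one serious meaning change
# across a clip with ten subtitles.
example_errors = [
    SubtitleError("wording", "minor"),
    SubtitleError("meaning", "serious"),
]
score = quality_score(example_errors, subtitle_count=10)
```

Under these assumed weights, a version with fewer or milder errors ends up closer to 1.0, which is the kind of number that can then be set beside viewers' five-point satisfaction ratings.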

Across all three scenes, ChatGPT’s subtitles clearly beat those from the traditional machine translation engine in both expert scores and viewer satisfaction. In one clip in particular, ChatGPT’s version even scored higher than the professional subtitles in the technical assessment. Viewers often found its lines natural and easy to follow, and many could not reliably tell them apart from human work. On average, however, the human-made subtitles still came out slightly ahead in audience ratings, especially in capturing humor and culturally rich expressions.

Figure 2. How human, traditional machine, and AI translators each handle jokes and impact viewer enjoyment.

Who the viewers are changes what they see

The study found that people’s backgrounds shaped how sharply they judged the subtitles. High school students tended to rate all three versions similarly, and sometimes liked ChatGPT’s subtitles as much as or even more than the fan-made ones. University students and those with graduate degrees were more critical and better at spotting differences between versions. Viewers who had already watched Friends were also more sensitive to nuances and favored the human-made subtitles, while those new to the show had trouble telling the versions apart. Whether someone had studied languages mattered less than their general education level and their familiarity with the series.

Why AI still needs a human touch

Concrete examples in the paper show both the promise and the limits of AI subtitles. In some jokes, ChatGPT produced smoother and more vivid Chinese than the professional version, making the humor feel more immediate. In other moments, it translated word for word and missed hidden meanings or cultural hints, which could puzzle viewers. The survey confirmed that audiences care most about understanding the plot, with accuracy and smooth timing also ranking high. The authors conclude that AI tools like ChatGPT already offer better sitcom subtitles than older machine translation engines and can sometimes rival human work, but they still need careful post-editing and proofreading. For now, the best results come from combining AI speed with human judgment, helping more viewers enjoy foreign shows without losing the heart of the humor.

Citation: Chen, S., Hu, X. Evaluating the quality of AI-generated subtitle translations from a reception-oriented perspective: a comparative study of ChatGPT, human, and neural machine translations in sitcoms. Humanit Soc Sci Commun 13, 748 (2026). https://doi.org/10.1057/s41599-026-07414-6

Keywords: subtitles, audiovisual translation, ChatGPT, sitcoms, viewer reception