Clear Sky Science · en

Cuentos: A Large-Scale Eye-Tracking Reading Corpus on Spanish Narrative Texts


Why watching eyes can reveal how we read

Every time you read a story, your eyes dart, pause, and jump in ways you barely notice—but these tiny movements quietly track how your mind is working. Most of what we know about this comes from studies in English. This paper introduces “Cuentos,” the largest public collection of eye-movement data from people reading full stories in Spanish. It turns the invisible dance of the eyes into a rich resource for understanding how Spanish speakers read and for building smarter language technologies.

Stories, not isolated sentences

Instead of using short, artificial sentences, the researchers asked 113 native Spanish speakers to read complete, self-contained stories written in Latin American Spanish. The collection includes 30 different tales—some long, some short—spanning genres such as realism, horror, essays, and science writing. On average, long stories contain about 3,300 words, and short ones about 800, together covering nearly 40,000 words and 8,500 distinct terms. This design captures how people naturally read narrative texts, from beginning to end, rather than how they process isolated lines in a lab.

Figure 1.

Tracking every pause of the eyes

Participants sat in a darkened room and read stories on a computer screen while a high-speed eye tracker recorded where they looked a thousand times per second. The device captured two key behaviors: brief stops called fixations, when the eyes gather information from the page, and fast jumps called saccades, when the eyes move to a new spot. The texts were split across multiple screens, and readers could freely move back and forth using arrow keys, just as someone might flip between pages. After each story, they answered comprehension questions to ensure they had paid attention, and for the short stories they also did a brief word-association task to reset their focus before the next tale.

Turning raw gaze paths into structured data

Collecting raw eye-movement points is only the beginning. The team built custom software to clean and organize this information with great care. They removed unreliable data, such as extremely short or very long fixations and trials where the eye tracker had poor calibration. For each screen, human reviewers adjusted guide lines so that clusters of fixations lined up precisely with the proper line of text. Then, using the position of spaces between words, they assigned individual fixations to specific words. Special cases—like the eye’s big jump from the end of one line to the start of the next, or accidental returns to earlier screens—were detected and filtered out. The result is a meticulously curated map linking each word in the stories to how long, how often, and in what pattern it was viewed.
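
As a rough illustration of two of the steps described above, dropping fixations with implausible durations and assigning the rest to words from the positions of the spaces, here is a minimal Python sketch. It is not the team's software: the names `Fixation`, `drop_outliers`, and `assign_to_words`, the 50 ms and 1000 ms cutoffs, and the pixel coordinates are all assumptions made for the example.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Fixation:
    x: float          # horizontal gaze position on the line, in pixels
    duration_ms: int  # how long the eyes rested there, in milliseconds


# Hypothetical cutoffs; the corpus's actual thresholds are not given in this summary.
MIN_MS, MAX_MS = 50, 1000


def drop_outliers(fixations):
    """Keep only fixations whose duration lies within the assumed plausible range."""
    return [f for f in fixations if MIN_MS <= f.duration_ms <= MAX_MS]


def assign_to_words(fixations, space_positions):
    """Pair each retained fixation with the index of the word it landed on.

    space_positions holds the sorted x-coordinates of the spaces on one line,
    so a fixation to the right of k spaces belongs to word k (counting from 0).
    """
    return [(bisect_right(space_positions, f.x), f) for f in drop_outliers(fixations)]


# One line of three words, with spaces at x = 100 and x = 220 (made-up values).
spaces = [100.0, 220.0]
raw = [Fixation(40, 200), Fixation(150, 30), Fixation(230, 250), Fixation(90, 1500)]
assigned = assign_to_words(raw, spaces)
word_indices = [index for index, _ in assigned]
```

In this toy line, the 30 ms and 1500 ms fixations are discarded, and the two that remain land on the first and third words. The manual line-alignment step and the handling of line-to-line jumps are left out of the sketch.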

What the eye movements reveal

From these cleaned traces, the authors computed a rich set of measures for each word. Some reflect early, automatic processing, such as how long the first fixation lasts or how long a word is looked at before the eyes move on. Others capture later, more deliberate processing, such as time spent coming back to re-read earlier words. Using modern statistical models, they confirmed that well-known patterns from other languages also hold in Spanish: shorter and more frequent words are read more quickly, and readers are more likely to skip very short, familiar words altogether. Where a word appears in a sentence or on the screen also subtly shapes how long the eyes linger on it. These checks show that the new dataset behaves in a sensible, interpretable way and can serve as a reliable benchmark.
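
To make the early and late measures concrete, the Python sketch below derives a few of them from an ordered list of (word index, fixation duration) pairs. It uses simplified, assumed definitions rather than the corpus's own: "gaze duration" here is the sum of the first uninterrupted run of fixations on a word, and "skipped" means the word was never fixated at all. The function name `word_measures` and the sample numbers are hypothetical.

```python
def word_measures(sequence, n_words):
    """Compute simple per-word reading measures from ordered (word, duration_ms) pairs."""
    measures = [
        {"first_fixation": 0, "gaze_duration": 0, "total_time": 0, "skipped": True}
        for _ in range(n_words)
    ]
    first_pass_over = [False] * n_words  # True once the eyes have left the word
    previous = None
    for word, duration in sequence:
        m = measures[word]
        m["total_time"] += duration
        if m["skipped"]:
            # First time this word is fixated: start both early measures.
            m["skipped"] = False
            m["first_fixation"] = duration
            m["gaze_duration"] = duration
        elif word == previous and not first_pass_over[word]:
            # Still inside the first uninterrupted run on this word.
            m["gaze_duration"] += duration
        if previous is not None and previous != word:
            first_pass_over[previous] = True
        previous = word
    return measures


# Made-up scanpath: two fixations on word 0, a jump over word 1 to word 2,
# then a return to word 0 for re-reading.
scanpath = [(0, 200), (0, 100), (2, 250), (0, 150)]
result = word_measures(scanpath, n_words=3)
rereading_time_word0 = result[0]["total_time"] - result[0]["gaze_duration"]
```

For word 0 in this example, the first fixation lasts 200 ms, the gaze duration is 300 ms, and the later return adds 150 ms of re-reading time. Word 1 is marked as skipped.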

Figure 2.

A new tool for reading research and smart software

All of the data and code are freely available in standardized formats, making it easy for other scientists to explore. Linguists can use Cuentos to study Spanish-specific features such as word endings, word order, and style. Psychologists can examine how individuals differ in their reading strategies or how genre affects mental effort. Developers in artificial intelligence and natural language processing can feed this information into models that better mimic human reading, improving tasks like making texts easier to read or predicting which words are harder to understand. In simple terms, Cuentos turns the subtle movements of Spanish readers’ eyes into a powerful shared tool for both understanding the mind and building more human-like language technologies.

Citation: Travi, F., Bianchi, B., Slezak, D.F. et al. Cuentos: A Large-Scale Eye-Tracking Reading Corpus on Spanish Narrative Texts. Sci Data 13, 434 (2026). https://doi.org/10.1038/s41597-026-06798-z

Keywords: eye tracking, reading, Spanish language, natural language processing, cognitive science