Clear Sky Science

Refining the processing dynamics of English compound words in L2 learners: a psycholinguistic modeling approach

Why word puzzles in a second language matter

When people read in a second language, even simple-looking words can hide intricate mental work. This study asks how Chinese learners of English untangle compound nouns like “teapot” or “snowman” while reading full sentences. By tracking readers’ eye movements and using computer models, the researchers show which clues the brain leans on first—how common a word is, how clear its meaning is from its parts, and how it is built—and how this mix of clues shifts from the first glance at a word to the moment its meaning finally clicks.

Watching eyes to see the mind at work

To peer into this hidden process, the team recorded eye movements from 40 advanced university students in China as they read 123 English sentences, each containing one compound noun. Tiny shifts in the eyes reveal how long readers linger on each word. The authors focused on three measures: the duration of the first fixation on the compound, the time spent on it during the whole first pass, and the total time spent on it including rereading. These measures roughly map onto three stages: early recognition of the letter string, building up word structure and partial meaning, and finally fitting the word into the sentence. At the same time, each compound was described using ten numeric features that captured how often it and its parts appear in language, how clearly its overall meaning relates to its parts, and how its pieces are arranged.
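
As a rough illustration of how three such measures can be derived from a raw fixation record, here is a minimal Python sketch. It is not the authors' pipeline: the `Fixation` type, the `reading_measures` function, and the fixation values are all invented for this example, and the first-pass rule is simplified (it ignores the case where a reader skips the word and only returns to it later). The three outputs correspond to what eye-tracking research commonly calls first fixation duration, gaze duration, and total reading time.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Fixation:
    """One pause of the eyes (hypothetical record format)."""
    word_index: int   # position of the fixated word in the sentence
    duration_ms: int  # how long the eyes rested there


def reading_measures(fixations: List[Fixation], target: int) -> Dict[str, int]:
    """Compute three reading-time measures for the word at position `target`."""
    first_fixation = 0
    first_pass = 0
    total = 0
    entered = False          # have the eyes landed on the target yet?
    first_pass_over = False  # have the eyes left the target after landing?

    for fix in fixations:
        if fix.word_index == target:
            total += fix.duration_ms              # every visit counts here
            if not entered:
                entered = True
                first_fixation = fix.duration_ms  # very first landing only
            if not first_pass_over:
                first_pass += fix.duration_ms     # until the eyes first leave
        elif entered:
            first_pass_over = True                # any exit ends the first pass

    return {
        "first_fixation": first_fixation,
        "first_pass": first_pass,
        "total": total,
    }


# Invented example: the compound is word 3; the reader later looks back at it.
trial = [
    Fixation(2, 210),
    Fixation(3, 230),
    Fixation(3, 180),
    Fixation(4, 200),
    Fixation(3, 250),  # rereading
    Fixation(5, 190),
]
measures = reading_measures(trial, target=3)
print(measures)  # {'first_fixation': 230, 'first_pass': 410, 'total': 660}
```

In this made-up trial the rereading visit adds to the total time but not to the first-pass time, which is why the late measure can pick up difficulties that the early ones miss.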

Figure 1. How Chinese learners rely on word commonness and meaning to read English compound nouns step by step.

Letting data-driven models sift the clues

Rather than using only traditional statistics, the researchers turned to supervised machine learning. They trained four kinds of predictive models—decision trees, random forests, neural networks, and support vector regression—to estimate how long readers would fixate on each compound at each stage, based solely on the ten word features. By comparing how accurate these models were, and which features they relied on most, the team could infer which linguistic cues matter most in real time. This approach embraces the idea that reading is not a simple straight-line process: different factors can interact in complex, nonlinear ways that are difficult to capture with standard linear equations.
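
To make the idea of "which features a model relies on" concrete, the toy Python sketch below ranks features by how much a single tree-style split on each one reduces prediction error. This is a deliberately stripped-down stand-in for the importance scores that tree-based models produce, not the study's actual models or data. The three feature names, the simulated fixation durations, and the helper functions `sse` and `best_split_gain` are all assumptions made for illustration.

```python
import random
from typing import Dict, List


def sse(values: List[float]) -> float:
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split_gain(xs: List[float], ys: List[float]) -> float:
    """Largest error reduction obtainable from one threshold split on xs."""
    parent = sse(ys)
    best = 0.0
    for threshold in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < threshold]
        right = [y for x, y in zip(xs, ys) if x >= threshold]
        best = max(best, parent - sse(left) - sse(right))
    return best


# Invented data: 200 simulated compounds with three made-up features in [0, 1].
rng = random.Random(0)
names = ("compound_freq", "first_part_freq", "transparency")
features: Dict[str, List[float]] = {name: [] for name in names}
durations: List[float] = []
for _ in range(200):
    cf, fpf, tr = rng.random(), rng.random(), rng.random()
    features["compound_freq"].append(cf)
    features["first_part_freq"].append(fpf)
    features["transparency"].append(tr)
    # Assumed nonlinear rule: the first part matters only for rare compounds.
    # Transparency has no effect on this simulated early measure.
    fallback = 30.0 * (1.0 - fpf) if cf < 0.3 else 0.0
    durations.append(260.0 - 80.0 * cf + fallback + rng.gauss(0.0, 10.0))

# Rank features by the error reduction of their best single split.
ranking = sorted(
    ((name, best_split_gain(xs, durations)) for name, xs in features.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, gain in ranking:
    print(f"{name}: {gain:.0f}")
```

Because the simulated durations depend mainly on whole-compound frequency, that feature comes out on top here. Full decision trees and random forests stack many such splits, which is what lets them capture interactions like the "only when the compound is rare" rule built into this toy data.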

First fast guesses, then deeper meaning

The models revealed a clear time course. Early on, the overall frequency of the whole compound strongly dominated: common words were recognized quickly, leading to shorter first fixations. When the compound was rare, readers appeared to fall back on its parts, especially the first piece, hinting that they try to break the word into familiar building blocks. In the middle stage, as readers continued to look at the compound, frequency still mattered but the meaning of the second part—the head that often defines what kind of thing the compound is—became more important. By the late stage, when total reading time was considered, meaning-related measures rivaled frequency in influence. Compounds whose overall meaning closely matched the meanings of their parts were wrapped up more quickly than opaque ones whose meanings could not be easily guessed from their components.

Figure 2. How the influence of word frequency and meaning shifts across early, middle, and late stages of reading compound nouns.

A flexible system for handling complex words

Taken together, the eye-tracking and modeling results support a picture of the bilingual mind as adaptable rather than rigid. Chinese learners of English can store and retrieve frequent compounds as whole units, much like native speakers. Yet when words are unfamiliar or their meanings are hard to predict, readers switch to slower, part-by-part analysis, weighing how familiar each piece is and how well their meanings fit together. The authors describe this as a “multi-route” system that tries out several pathways in parallel and gravitates toward whichever combination of clues offers the best chance of understanding. For teachers and textbook writers, this suggests that second language learners benefit both from repeated exposure to common compounds and from guidance in spotting meaningful word parts, helping them tackle new word puzzles with greater confidence.

Citation: Peng, Y., Chen, S., Hou, R. et al. Refining the processing dynamics of English compound words in L2 learners: a psycholinguistic modeling approach. Humanit Soc Sci Commun 13, 672 (2026). https://doi.org/10.1057/s41599-026-06999-2

Keywords: compound words, second language reading, eye tracking, word frequency, semantic transparency