Clear Sky Science

Competing cognitive pressures on human exploration in the absence of trade-off with exploitation

Why we explore, even when nothing is at stake

Imagine scrolling restaurant reviews or wandering new streets in a city: you are exploring, but your clicks or steps do not immediately win or lose you anything. This study asks what our curiosity looks like in such low-stakes settings, and whether it differs from the way we explore when every choice pays out or costs us. By stripping away immediate rewards in a carefully controlled experiment, the authors reveal a hidden tug-of-war in our decision making between two kinds of information seeking.

Turning rewards into colors

Most lab studies of exploration use gambling-style games where each choice yields points or money. That makes it hard to tell whether people are genuinely curious or simply chasing payoffs. Here, the researchers designed a new task where the “rewards” were just shades of color, not points. On each trial, volunteers chose between two abstract shapes, each linked to a bag that produced mostly bluish or mostly orangish outcomes. Importantly, seeing a color did not immediately give or take away money; instead, it only revealed the statistical pattern behind that option, like learning how a slot machine tends to behave.

Figure 1

Three ways to ask the same question

The clever twist was to keep the sampling experience the same while changing only the instructions and the timing of rewards. In the MATCH condition, people were told to collect a target color, and each outcome that leaned toward the target color earned points right away, mimicking classic “explore–exploit” dilemmas. In the GUESS condition, there was no target during sampling; only at the end of the sequence were participants asked which option was mostly blue or mostly orange, and they were paid solely for that final answer. The FIND condition sat in between: the target color was known from the start, but rewards still depended only on a single final choice. Across several independent groups, the team showed that performance in all conditions was well above chance, confirming that participants learned the color–option pairings.

Chunking versus chasing uncertainty

When exploration was not competing with immediate reward, people behaved in a surprisingly structured way. In the GUESS condition, they began each new sequence by sampling the same option several times in a row, as if wanting to get a solid first impression of it. Only after this “chunk” of repeated choices did they switch and, later in the sequence, start favoring whichever option was currently more uncertain. The authors call the first tendency local uncertainty minimization: reduce doubt about the option you are currently sampling. The later tendency is global uncertainty minimization: deliberately sample the option whose behavior you know least about. In contrast, in the MATCH condition, where each outcome had clear value, people quickly gravitated toward the option that best matched the target color and showed far less of this initial chunking pattern.
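
For readers who like to see ideas in code, the two tendencies can be sketched as simple sampling rules. This is an illustrative toy, not the authors' model: it assumes each outcome is reduced to "blue" or "orange", that beliefs about each option follow a Beta distribution with a flat prior, and that "uncertainty" means the variance of that belief. The function names and the chunk length of 3 are hypothetical.

```python
def beta_variance(blue: int, orange: int) -> float:
    """Variance of a Beta(1 + blue, 1 + orange) belief (flat prior assumed)."""
    a, b = 1 + blue, 1 + orange
    return a * b / ((a + b) ** 2 * (a + b + 1))


def global_choice(counts: dict) -> str:
    """Global rule: sample the option whose belief is currently most uncertain."""
    return max(counts, key=lambda opt: beta_variance(*counts[opt]))


def chunk_then_global(counts: dict, history: list, chunk: int = 3) -> str:
    """Local-then-global rule: repeat the first option `chunk` times, then go global."""
    if not history:
        return sorted(counts)[0]  # arbitrary first pick (assumption)
    if len(history) < chunk:
        return history[0]  # keep building a first impression of the same option
    return global_choice(counts)


# counts maps each option to (blue outcomes seen, orange outcomes seen)
counts = {"A": (3, 0), "B": (0, 0)}
next_pick = chunk_then_global(counts, history=["A", "A", "A"])
```

In this toy setup, an agent following the global rule alone would alternate between options almost from the first sample, whereas the chunking agent commits to one option before switching, which is closer to the pattern described for the GUESS condition.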

Figure 2

Peering under the hood with computational models

To understand these patterns more deeply, the researchers built mathematical models that predict choices from the history of observed colors. An “optimal” sampler, unconcerned with mental effort, would always pick the most uncertain option to gain information as efficiently as possible. Human participants did not behave like this ideal agent. Model fits showed that, in addition to a modest tendency to chase uncertainty when rewards were delayed, people had a strong bias to repeat their previous choice and, in many cases, to keep repeating until they had reached a personal threshold of confidence about that option. Interestingly, individuals who showed stronger early chunking often also displayed more directed exploration later and performed better overall, suggesting that this seemingly suboptimal strategy may actually be a useful compromise given human cognitive limits.
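
One common way to express such competing pulls in a choice model is a logistic rule that adds an uncertainty-seeking term and a repetition term. The sketch below is a generic illustration under that assumption, not the model fitted in the paper: the weights `w_uncertainty` and `w_repeat` and the function name are hypothetical, and the confidence-threshold mechanism is left out.

```python
import math


def p_choose_a(unc_a, unc_b, prev_choice, w_uncertainty=1.0, w_repeat=1.0):
    """Probability of sampling option A on the next trial.

    unc_a, unc_b : current uncertainty about each option (any consistent scale)
    prev_choice  : "A", "B", or None on the first trial
    """
    # Repetition term: +1 pulls toward A, -1 toward B, 0 if there is no previous choice
    repeat = {"A": 1.0, "B": -1.0}.get(prev_choice, 0.0)
    # Positive drive favors A; the uncertainty term favors the less-known option
    drive = w_uncertainty * (unc_a - unc_b) + w_repeat * repeat
    return 1.0 / (1.0 + math.exp(-drive))


# A is better known than B, but A was just sampled and the repetition weight is large
p_stay = p_choose_a(0.02, 0.08, "A", w_uncertainty=1.0, w_repeat=2.0)
```

With a large repetition weight, the agent tends to stay with its previous option even when the other one is more uncertain; setting `w_repeat` to zero and raising `w_uncertainty` moves it toward the idealized uncertainty-chasing sampler.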

Why this matters for everyday curiosity

These findings suggest that when we explore without worrying about immediate payoffs, two forces shape our curiosity. One pushes us to stay with what we are currently examining, to make sure we really understand it; the other nudges us toward whatever we know least about overall. In real life, browsing reviews, learning a new city, or testing new tools likely reflects the same balance between local and global information seeking. The study shows that if we examine exploration only in reward-heavy tasks, we risk misunderstanding how people naturally seek knowledge for its own sake.

Citation: Alméras, C., Chambon, V. & Wyart, V. Competing cognitive pressures on human exploration in the absence of trade-off with exploitation. Nat Commun 17, 883 (2026). https://doi.org/10.1038/s41467-026-68639-2

Keywords: human exploration, decision making, uncertainty, information seeking, cognitive modeling