Clear Sky Science

Genetic and environmental influences on data missingness in developmental cognitive neuroscience

2026-04-22

Why It Matters When Infant Data Go Missing

When scientists study how babies’ brains and behavior develop, they often lose a surprising amount of data: a baby looks away, gets fussy, or a machine glitches. Most of the time this lost information is treated as random noise and simply discarded. But what if the very fact that data are missing tells us something real about the child or their environment? This study asks whether genetics and family circumstances help explain which infants end up with incomplete data in common lab tests of early brain and visual function.

Looking at Data Loss, Not Just Results

The researchers drew on a large twin project in Sweden that followed nearly 600 five‑month‑old babies. All were same‑sex twins, which allowed the team to compare identical pairs, who share essentially all of their genes, with fraternal pairs, who share only about half. On a single test day, each infant took part in three kinds of lab experiments that are standard in developmental brain research: a brain‑wave test using an EEG cap while watching moving patterns on a screen; an eye‑tracking task that measured whether babies looked more at a person’s eyes or mouth; and a pupillometry task that tracked how the pupils responded to brief flashes of light.

Two Ways Data Can Disappear

Instead of focusing on what the babies’ brains or eyes did, the researchers focused on what was missing. At the “experiment level,” they asked whether a child had to be excluded from an entire experiment because there was no usable data. At the “trial level,” they counted, within each experiment, how many individual trials produced valid readings after applying strict quality checks. Crucially, they treated missing data itself as a trait, and used twin methods to see how much of the variation in missingness could be traced to genes, to family‑wide influences shared by twins, or to individual experiences unique to each child.
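The logic of splitting a trait into genetic, shared family, and child-specific parts can be illustrated with a rough sketch. The summary does not say which statistical model the authors fitted, so the code below swaps in Falconer's classic formulas, a simple textbook approximation that works directly from twin-pair correlations. The function name `falconer_ace` and the two correlation values are hypothetical, chosen only for illustration; they are not results from the study.

```python
def falconer_ace(r_mz, r_dz):
    """Rough variance shares from twin correlations (Falconer's formulas).

    r_mz: correlation of the trait within identical (MZ) pairs
    r_dz: correlation of the trait within fraternal (DZ) pairs
    """
    a = 2.0 * (r_mz - r_dz)  # genetic share: MZ pairs share about twice as many genes as DZ pairs
    c = 2.0 * r_dz - r_mz    # shared family environment: similarity not explained by genes
    e = 1.0 - r_mz           # child-specific influences, including measurement error
    return {"A": a, "C": c, "E": e}


# Hypothetical correlations in "number of valid trials" (not values from the study).
estimates = falconer_ace(r_mz=0.60, r_dz=0.35)
print({name: round(share, 2) for name, share in estimates.items()})
```

With these made-up inputs, identical twins resemble each other more than fraternal twins do, so part of the variation is attributed to genes. If the two correlations were equal, the genetic share would drop to zero and the similarity would be credited to the shared environment instead.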

Genes, Family, and the Fate of a Data Point

Overall, about 40% of infants were missing from at least one of the three experiments, and 60% contributed good data to all three. For this broad yes‑or‑no measure of participation, differences between infants were best explained by environmental factors. Influences shared within a family, such as general routines, parent behavior, or features of the testing day that affected both twins, accounted for a sizeable portion of who ended up with missing experiments. Influences unique to one child, like a one‑off distraction or a small technical glitch, accounted for the rest. When the team zoomed in on individual experiments, they found that being excluded from the EEG task showed a moderate genetic component, while exclusion from the two eye‑tracking tasks (gaze and pupillometry) was mainly shaped by shared environmental factors.

Hidden Heritability in Data Quality

The picture changed when the researchers examined trial‑by‑trial data quality. Here, genetic influences were evident across all three experiments. For gaze tracking and EEG, genes explained a moderate share of differences in how many trials were usable. For the pupillometry task, more than half of the variation in trial‑level data quality was linked to genetic factors, with the remainder due to individual‑specific experiences. In contrast, shared family environment did not significantly shape these trial‑level measures. Interestingly, there was very little overlap in data quality across the three experiments: a baby who produced many good trials in one task was not necessarily more likely to do so in the others, even though all were run on the same day, often with the same tester.

What Missing Data Really Tells Us

To check for familiar sources of bias, the authors also tested whether missingness was tied to factors such as parental education, income, infant temperament, or genetic likelihood for autism and later autistic‑like traits. After rigorous correction for multiple tests, they found no strong evidence for such links in this general‑population sample, though they note that small effects could have gone undetected. Overall, the findings show that missing data in infant brain and behavior studies are not simply random noise: they reflect a mix of genetic influences and environmental experiences, and these influences differ by method and by level of analysis.

Why Researchers Should Care About the Gaps

For non‑specialists, the takeaway is that when infant data go missing, it is often for systematic reasons tied to the child or their context, not just to bad luck. That means common analysis choices that assume data are missing completely at random—such as simply dropping incomplete cases—can quietly distort study conclusions and limit how well findings generalize. The authors argue that developmental scientists should treat missingness itself as a meaningful signal, adopt more advanced statistical methods that explicitly handle non‑random loss, and refine testing procedures to reduce avoidable data gaps. In short, understanding why information is missing is an essential part of understanding how children’s brains and behavior truly develop.
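A small simulation can make the risk of dropping incomplete cases concrete. This is a minimal sketch under stated assumptions, not an analysis from the study: the standardized `trait`, the rule in `p_missing`, and the sample size are all hypothetical. It assumes that infants with higher trait values are more likely to lose their data, then compares the average trait in the full sample with the average among infants who remain.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible

n = 20000
# Hypothetical standardized infant trait; not a variable from the study.
trait = [random.gauss(0.0, 1.0) for _ in range(n)]


def p_missing(x):
    # Assumed rule: the higher the trait value, the more likely the data are lost.
    return 1.0 / (1.0 + math.exp(-2.0 * x))


# Keep only infants whose data survive (complete-case analysis).
observed = [x for x in trait if random.random() >= p_missing(x)]

full_mean = sum(trait) / len(trait)
complete_case_mean = sum(observed) / len(observed)

print(f"kept {len(observed)} of {n} infants")
print(f"mean trait, full sample:    {full_mean:.2f}")
print(f"mean trait, complete cases: {complete_case_mean:.2f}")
```

Under this assumed rule the retained infants are no longer representative: their average trait sits well below the full-sample average, even though roughly half the sample is kept. If missingness were unrelated to the trait, the two averages would agree apart from sampling noise.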

Citation: Bussu, G., Portugal, A.M., Viktorsson, C. et al. Genetic and environmental influences on data missingness in developmental cognitive neuroscience. Commun Psychol 4, 70 (2026). https://doi.org/10.1038/s44271-026-00457-0

Keywords: missing data, infant brain development, twin study, eye tracking, pupillometry