Clear Sky Science

A Multimodal Dataset for Neurophysiological and AI Applications


Why this matters for kids who struggle to focus

Many families, teachers, and clinicians know how hard it can be to tell whether a child’s restlessness or daydreaming is part of everyday life or a sign of Attention Deficit Hyperactivity Disorder (ADHD). Today’s diagnoses still rely mostly on interviews and questionnaires, which can be influenced by memory, expectations, or stress. This study introduces the BALLADEER ADHD Dataset, a large, open collection of brain and body measurements gathered while children and teens play attention-focused games. It is designed to help researchers build more objective tools for understanding and identifying ADHD — and to do so in a way that is transparent and shareable worldwide.

Figure 1.

From classroom behavior to brain and body signals

ADHD affects roughly one in twenty school‑age children, shaping how they pay attention, control impulses, and manage activity levels. Because its symptoms overlap with other conditions, diagnosis can be tricky. Over the past few decades, scientists have turned to brain recordings and other body signals to look for clearer biological clues. Electrical activity from the scalp (EEG) can reveal patterns linked to attention; eye‑tracking shows where and when a child looks at important details; and changes in skin conductance and heart rhythm reflect stress and alertness. However, most earlier studies used small, private datasets that could not be freely checked or reused. As a result, many promising findings could not be thoroughly tested or turned into reliable, everyday tools.

Building a rich, shared picture of attention

The BALLADEER project set out to change this by collecting a multimodal dataset — that is, a coordinated set of measurements from several sources at once. The team recorded data from 164 children and adolescents aged 6 to 18, including 62 with an ADHD diagnosis and 102 without. During sessions spread over two days, participants completed a battery of well‑known paper‑and‑pencil tests as well as computer‑based and virtual‑reality tasks that mimic everyday attention challenges. While they played and solved problems, the researchers recorded electrical activity from the brain using EEG headsets, eye movements using an eye‑tracking bar mounted under a monitor, and signals such as heart rate and skin conductance from a wrist‑worn device. All of this was paired with detailed logs of what was happening on screen second by second.

Attention games that feel more like play than testing

To make data collection engaging and child‑friendly, the team designed game‑like tasks. In “Attention Slackline,” children watch flags on two mountains and press a button when the patterns match; their brain waves, gaze, and heart signals are recorded continuously. In “Attention Robots,” they scan rows of cartoon robots, selecting only those with specific features, while the system logs exactly which robot they are looking at. A commercial platform called CogniFit presents a variety of short exercises to probe perception, coordination, and problem‑solving, and a virtual‑reality system called Nesplora places children in a simulated classroom or aquarium to measure how well they follow instructions amid realistic distractions. Together, these tasks aim to tap into sustained attention, impulse control, and mental flexibility — the very skills that are often challenging for people with ADHD.

Figure 2.

How the data are captured and organized

Behind the scenes, the researchers built a dedicated software and hardware setup to keep every device in sync. A central Python‑based server starts and stops recordings on the EEG headsets and wristbands at the same moment that a game level begins and ends. The games send time‑stamped messages whenever a child responds or a key event appears on screen. All of the raw signals and event logs are stored on a secure network drive in simple, widely used formats (CSV and JSON). The shared structure includes folders labeled by anonymous user ID, task, date, and device type, along with files that describe each participant’s age, sex, and ADHD status without revealing personal identities. The authors deliberately avoided heavy pre‑processing, so other scientists can apply their own cleaning methods and analysis techniques.
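To make the idea of time-stamped logs concrete, here is a minimal Python sketch of how a researcher might pair each game event with the nearest physiological sample. It is an illustration under stated assumptions, not the authors' pipeline: the column names (`timestamp`, `heart_rate`), the event fields (`timestamp`, `event`), and the sample values are all hypothetical, and the actual files in the dataset may be laid out differently.

```python
import bisect
import csv
import io
import json

# Hypothetical stand-ins for one recording. The real column and field
# names in the dataset may differ; these values are made up.
SIGNAL_CSV = """timestamp,heart_rate
0.0,82
0.5,83
1.0,85
1.5,90
2.0,88
"""

EVENTS_JSON = """[
  {"timestamp": 0.6, "event": "stimulus_shown"},
  {"timestamp": 1.4, "event": "button_press"}
]"""


def load_signal(csv_text):
    """Parse CSV text into parallel lists of timestamps and values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    times, values = [], []
    for row in reader:
        times.append(float(row["timestamp"]))
        values.append(float(row["heart_rate"]))
    return times, values


def nearest_sample(times, t):
    """Return the index of the sample whose timestamp is closest to t.

    Assumes `times` is sorted in ascending order.
    """
    i = bisect.bisect_left(times, t)
    if i == 0:
        return 0
    if i == len(times):
        return len(times) - 1
    # Pick whichever neighbour is closer in time.
    return i if times[i] - t < t - times[i - 1] else i - 1


def align_events(csv_text, json_text):
    """Attach the nearest signal sample to each logged event."""
    times, values = load_signal(csv_text)
    aligned = []
    for event in json.loads(json_text):
        idx = nearest_sample(times, event["timestamp"])
        aligned.append({
            "event": event["event"],
            "event_time": event["timestamp"],
            "sample_time": times[idx],
            "heart_rate": values[idx],
        })
    return aligned


if __name__ == "__main__":
    for row in align_events(SIGNAL_CSV, EVENTS_JSON):
        print(row)
```

Nearest-sample matching is only one possible choice. Because the recordings are shared without heavy pre-processing, users may prefer to interpolate between samples or to average over a window around each event instead.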

Strengths, caveats, and what comes next

The BALLADEER dataset stands out because it combines several types of measurements recorded simultaneously from 164 young people, and it is fully open for others to download and analyze. This makes it a valuable testing ground for new artificial‑intelligence methods that try to spot patterns linked to ADHD or discover new digital “biomarkers” that could complement clinical judgment. At the same time, the authors are clear about its limits: the sample comes from a single region, ADHD subtypes were not systematically labeled, and the size is still modest for training very large deep‑learning models. Some recordings contain movement‑related noise, and there is no separate resting‑state condition. Rather than hiding these issues, the team documents them so users can design careful analyses.

What this means for families and future care

In everyday terms, this dataset does not diagnose any child on its own. Instead, it offers researchers a powerful, shared microscope for studying how attention difficulties show up in the brain, eyes, and body during realistic tasks. Over time, work based on BALLADEER could help clinicians move beyond checklists and gut feeling by adding objective, data‑driven measures to the toolbox. That could lead to earlier, more accurate identification of ADHD, better tracking of how children respond to treatment, and fairer decisions in schools and clinics. By turning play‑like activities into precise measurements and sharing those data openly, the study lays groundwork for a new generation of science‑based support for children who struggle to focus.

Citation: Trujillo, J., Ferrer-Cascales, R., Teruel, M.A. et al. A Multimodal Dataset for Neurophysiological and AI Applications. Sci Data 13, 436 (2026). https://doi.org/10.1038/s41597-026-06758-7

Keywords: ADHD, EEG, eye tracking, physiological signals, machine learning