Clear Sky Science · en
Expert-level probabilistic breathing event detector informs phenotyping of sleep apnea
Why this matters for your sleep
Many people stop breathing briefly during sleep without realizing it, a condition known as sleep apnea. Diagnosing it today requires experts to watch hours of overnight recordings by hand, a slow and imperfect process. This study introduces a computer system that can spot and characterize these breathing pauses as well as human specialists, and even reveal more about why they happen. Such tools could make sleep apnea testing faster, cheaper, and more tailored to each person.

The challenge of spotting troubled sleep
Sleep apnea is usually described with a single number: how many times per hour your breathing significantly slows or stops. But arriving at that number is surprisingly messy. Different clinics use slightly different rules, and even experts at the same center disagree about where an event starts, where it ends, or what type it is. Some events occur when the throat is blocked (obstructive apnea), others arise when the brain fails to drive breathing (central apnea), and many are partial reductions in airflow known as hypopneas. There are also subtle breathing instabilities that do not cause clear dips in blood oxygen or obvious arousals, so they are often ignored in routine scoring. All of this makes the standard index of apnea severity less reliable and less informative than patients and doctors might expect.
Teaching a computer to read a night of sleep
The researchers built an automatic system called the Apneic Breathing Event Detector (ABED) to tackle this problem. ABED takes in a rich set of overnight signals: airflow at the nose and mouth, movement of the chest and abdomen, blood oxygen levels, and computer-estimated probabilities of brief brain arousals and wakefulness. It examines the night in overlapping four-minute windows and uses a modern deep learning architecture, combining convolutional and recurrent layers, to decide where breathing events occur and what type they are. In addition to the classic obstructive, central, and hypopnea events, ABED also detects “isolated respiratory events,” subtle airflow reductions without obvious arousals or oxygen drops that usually go uncounted in clinical reports.
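To make the idea of overlapping four-minute windows concrete, here is a minimal Python sketch that lists where each window would start in a recording. The function name `window_starts` and the two-minute step between windows are assumptions chosen for illustration; the summary above does not say how much the windows overlap or how ABED handles the end of a recording.

```python
def window_starts(total_seconds, window_seconds=240, stride_seconds=120):
    """Return start times (in seconds) of overlapping windows over a recording.

    window_seconds=240 matches the four-minute windows described in the text.
    stride_seconds=120 (50% overlap) is an assumed value, not taken from the study.
    """
    if total_seconds <= window_seconds:
        # A recording shorter than one window yields a single window at time 0.
        return [0]
    starts = list(range(0, total_seconds - window_seconds + 1, stride_seconds))
    # Add a final window flush with the end so the last seconds are not dropped.
    if starts[-1] + window_seconds < total_seconds:
        starts.append(total_seconds - window_seconds)
    return starts


if __name__ == "__main__":
    # A hypothetical 10-minute (600 s) recording.
    print(window_starts(600))
```

Because neighboring windows share half of their samples in this sketch, a breathing event that straddles the edge of one window falls well inside the next one.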
How well the detector matches human experts
To train and test ABED, the team used more than 6,500 overnight sleep studies from four large research cohorts and then evaluated it on over 1,100 unseen studies from those groups plus two additional multi-expert datasets. Across all data, the system’s estimate of the standard apnea–hypopnea index closely tracked expert scores, with a very strong correlation and correct assignment of severity group (none, mild, moderate, severe) in roughly three out of four people. At the level of individual events, ABED detected apneas and hypopneas with an overall F1 score of 0.78, and it distinguished obstructive, central, and hypopnea events comparably to or better than individual human scorers in the independent datasets. Importantly, the model handled recordings from many different centers, suggesting it is more generalizable than earlier, smaller systems trained at a single site.
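The three quantities mentioned here can be illustrated with a short Python sketch: the apnea–hypopnea index (events per hour of sleep), the severity group, and the F1 score. The cut-offs of 5, 15, and 30 events per hour are the commonly used adult thresholds and are an assumption here, since the summary does not state which thresholds the study applied. All function names and the example counts are hypothetical.

```python
def apnea_hypopnea_index(n_events, sleep_hours):
    """Apneas plus hypopneas per hour of sleep."""
    if sleep_hours <= 0:
        raise ValueError("sleep_hours must be positive")
    return n_events / sleep_hours


def severity_group(ahi):
    """Map an index value to a severity group.

    Thresholds 5 / 15 / 30 are the commonly used adult cut-offs (assumed,
    not quoted from the study).
    """
    if ahi < 5:
        return "none"
    if ahi < 15:
        return "mild"
    if ahi < 30:
        return "moderate"
    return "severe"


def f1_score(true_pos, false_pos, false_neg):
    """Harmonic mean of precision and recall, written in terms of counts."""
    denom = 2 * true_pos + false_pos + false_neg
    return 2 * true_pos / denom if denom else 0.0


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    ahi = apnea_hypopnea_index(n_events=84, sleep_hours=7)
    print(ahi, severity_group(ahi))
    print(f1_score(true_pos=8, false_pos=2, false_neg=2))
```

An F1 score rewards a detector only when it both finds most true events and raises few false alarms, which is why it is a common yardstick for event-level agreement.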
A probability view of breathing events
ABED does more than assign each event a single label. For every detected breathing disturbance, it produces probabilities that the event belongs to each category. The authors call this richer description “apnotyping.” An event that looks mostly obstructive might still carry a moderate probability of being central, or a hypopnea might lie halfway between a full obstruction and a milder irregularity. When the team summarized these probabilities across the night for each person, patterns emerged that lined up with deeper traits of their breathing control, such as how strongly their brain responds to changes in blood gases (loop gain), how well their throat muscles compensate during obstruction, and how easily they wake up in response to breathing trouble. In several cases, these probability-based features predicted such traits better than traditional hand-scored indexes.
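One simple way to picture summarizing event probabilities across a night is to average them per category, as in the Python sketch below. This is an assumption for illustration: the summary does not say which summary statistics the authors computed, and the category names, the function `summarize_event_probabilities`, and the example values are all hypothetical.

```python
CATEGORIES = ("obstructive", "central", "hypopnea", "isolated")


def summarize_event_probabilities(events):
    """Average per-event category probabilities over one night.

    `events` is a list of dicts mapping each category to a probability.
    Taking the mean is an assumed summary, not the study's stated method.
    """
    if not events:
        return {c: 0.0 for c in CATEGORIES}
    return {c: sum(e[c] for e in events) / len(events) for c in CATEGORIES}


if __name__ == "__main__":
    # Two made-up events: one mostly obstructive, one leaning central.
    night = [
        {"obstructive": 0.6, "central": 0.2, "hypopnea": 0.1, "isolated": 0.1},
        {"obstructive": 0.2, "central": 0.4, "hypopnea": 0.3, "isolated": 0.1},
    ]
    print(summarize_event_probabilities(night))
```

Unlike a count of hard labels, such a profile keeps the information that an event was, say, only 60 percent likely to be obstructive.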

What this could mean for patients
For someone wondering whether they have sleep apnea, or whether their current treatment is the right one, ABED points to a future where diagnosis is faster and more informative. Instead of relying on a single nightly average and the eyes of one tired scorer, automated tools could provide consistent event-by-event descriptions and a graded sense of uncertainty, while also hinting at why breathing fails in a given person. Although the system still has limits, such as lower accuracy in very mild cases and a lack of testing in children, it shows that expert-level automatic scoring can illuminate the full spectrum of sleep-related breathing problems. Ultimately, this may help doctors match patients not just to a diagnosis, but to the therapies most likely to work for their particular pattern of sleep apnea.
Citation: Kjaer, M.R., Hanif, U., Brink-Kjaer, A. et al. Expert-level probabilistic breathing event detector informs phenotyping of sleep apnea. Nat Commun 17, 2548 (2026). https://doi.org/10.1038/s41467-026-69163-z
Keywords: sleep apnea, deep learning, polysomnography, automatic diagnosis, respiratory events