Clear Sky Science
The association of a common HFE gene variant with stroke disability utilizing predictive machine learning and electronic health records
Why genes and stroke recovery matter
Stroke can change a person’s life in an instant, yet two people with similar brain bleeds may recover very differently. This study asks whether a common gene variant that affects how the brain handles iron might help some people bounce back better after a type of stroke caused by bleeding in the brain. By blending electronic health records, big genetic databases, and machine learning, the researchers explore whether this subtle genetic difference could tilt the odds toward a more independent life after intracerebral hemorrhage, one of the deadliest forms of stroke.

Different strokes and uneven recoveries
Not all strokes are the same. Intracerebral hemorrhage involves a blood vessel bursting inside the brain, flooding nearby tissue with blood and iron, and it carries a high risk of death and long-term disability. Only a minority of survivors regain full independence within months. Doctors know that age and other health problems shape recovery, but those factors do not tell the whole story. Past work has shown that certain gene variants influence how well people recover from stroke. In animals, one particular variant of the HFE gene, which helps regulate iron in the body, appeared to protect the brain after a hemorrhage. The human version of this variant, called H63D, is surprisingly common, yet its impact on recovery from brain bleeding had not been tested in large groups of patients.
Using hospital data to stand in for missing scores
To study this question at scale, the team first had to solve a practical problem. Large genetic biobanks contain DNA data and hospital diagnosis codes, but they usually do not record a standard stroke disability score known as the Modified Rankin Scale, which measures how independent a person is after stroke. In contrast, specialty stroke centers record this score but do not typically have genetic data. The researchers therefore trained a machine learning model on more than 6,000 stroke patients at a U.S. medical center, using hospital diagnosis codes, age, sex, and race to predict whether a patient left the hospital with mild disability (able to live independently) or with moderate to severe disability. Several types of models performed well, and a gradient boosting model provided the best balance of accuracy and reliability.
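As a rough illustration of how hospital records might be turned into inputs for such a model, the sketch below encodes each patient as a numeric feature vector and a binary disability label. The `Patient` fields, the example diagnosis codes, and the Modified Rankin Scale cutoff of 2 are assumptions made for illustration; the summary above does not state the exact threshold or encoding the researchers used.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # All field names are hypothetical, chosen for this sketch only.
    age: int
    sex: str                 # "F" or "M"
    race: str
    diagnosis_codes: frozenset
    mrs_at_discharge: int    # Modified Rankin Scale, 0 (no symptoms) to 6 (death)


def build_vocabulary(patients):
    """Collect every diagnosis code seen in the cohort, in a stable order."""
    return sorted({code for p in patients for code in p.diagnosis_codes})


def encode(patient, vocabulary, races):
    """Age, a sex flag, one-hot race, then one 0/1 flag per diagnosis code."""
    features = [float(patient.age), 1.0 if patient.sex == "F" else 0.0]
    features += [1.0 if patient.race == r else 0.0 for r in races]
    features += [1.0 if c in patient.diagnosis_codes else 0.0 for c in vocabulary]
    return features


def disability_label(patient, cutoff=2):
    """1 = moderate to severe disability, 0 = mild. The cutoff is an assumption."""
    return 1 if patient.mrs_at_discharge > cutoff else 0


# Made-up toy cohort, not data from the study.
cohort = [
    Patient(71, "F", "White", frozenset({"I61.9", "I10"}), 4),
    Patient(58, "M", "Black", frozenset({"I63.9"}), 1),
]
vocabulary = build_vocabulary(cohort)
races = sorted({p.race for p in cohort})
X = [encode(p, vocabulary, races) for p in cohort]
y = [disability_label(p) for p in cohort]
```

Feature rows like `X` and labels like `y` are the kind of input a gradient boosting classifier would be trained on; the choice of library and tuning is left open here.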
Checking the prediction tool in large populations
Once the prediction tool was built, the team applied it to two huge genetic resources, the UK Biobank and the All of Us Research Program, focusing on participants of European ancestry, where the H63D variant is most frequent. In these biobanks, stroke patients were identified from diagnosis codes, and the model estimated each person’s likely disability level after stroke. People the model classified as more disabled had higher five-year death rates and more serious medical complications after stroke than those classified as less disabled. This pattern matched what doctors see in real clinics, suggesting that the predicted scores were meaningfully tracking stroke severity even though they were not measured directly.
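The check described above can be pictured as a simple comparison of death rates between the two predicted groups. The following is a minimal sketch under stated assumptions: the function name, the record layout, and the toy numbers are all invented for illustration and are not taken from the study.

```python
def mortality_by_predicted_group(records):
    """Five-year death rate per predicted group.

    records: iterable of (predicted_disabled, died_within_5y) boolean pairs.
    Returns a dict mapping each group that has members to its death rate.
    """
    counts = {True: [0, 0], False: [0, 0]}  # group -> [deaths, total]
    for predicted_disabled, died in records:
        counts[predicted_disabled][0] += int(died)
        counts[predicted_disabled][1] += 1
    return {
        group: deaths / total
        for group, (deaths, total) in counts.items()
        if total > 0
    }


# Made-up toy records, not results from the biobanks.
toy_records = (
    [(True, True)] * 30 + [(True, False)] * 70
    + [(False, True)] * 10 + [(False, False)] * 90
)
rates = mortality_by_predicted_group(toy_records)
risk_ratio = rates[True] / rates[False]
```

If the predicted labels track real severity, the rate for the predicted-disabled group should be clearly higher, so the ratio should sit well above 1.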

A subtle gene effect in brain bleeding, but not other strokes
With predicted disability in hand, the researchers then tested whether the H63D variant was linked to better or worse outcomes across different kinds of stroke. When they combined results from both biobanks, people with intracerebral hemorrhage who did not carry the H63D variant had higher odds of being predicted to leave the hospital with significant disability compared with carriers of the variant. In other words, having H63D was associated with a modest but meaningful shift toward better functional status after this type of brain bleed. The same pattern did not appear for other stroke types, such as ischemic stroke or transient ischemic attack, which involve blocked arteries rather than sudden bleeding.
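The summary does not say how results from the two biobanks were combined. One common approach, shown here purely as an assumption, is a fixed-effect inverse-variance meta-analysis of log odds ratios. The counts below are invented, and the sketch ignores the covariate adjustment that a real analysis would likely include.

```python
import math


def log_odds_ratio(a, b, c, d):
    """Log odds ratio and its standard error (Woolf method) from a 2x2 table.

    a: non-carriers predicted to have significant disability
    b: non-carriers predicted to have mild disability
    c: H63D carriers predicted to have significant disability
    d: H63D carriers predicted to have mild disability
    """
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se


def pool_fixed_effect(estimates):
    """Inverse-variance weighted average of (log_or, se) pairs."""
    weights = [1 / se ** 2 for _, se in estimates]
    pooled = sum(w * lo for w, (lo, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return pooled, pooled_se


# Made-up counts for two hypothetical cohorts, not data from the study.
cohort_a = log_odds_ratio(60, 40, 40, 60)
cohort_b = log_odds_ratio(30, 20, 20, 30)
pooled_log_or, pooled_se = pool_fixed_effect([cohort_a, cohort_b])
pooled_or = math.exp(pooled_log_or)
ci_low = math.exp(pooled_log_or - 1.96 * pooled_se)
ci_high = math.exp(pooled_log_or + 1.96 * pooled_se)
```

A pooled odds ratio above 1 with a confidence interval that excludes 1 would correspond to non-carriers having higher odds of predicted significant disability, which is the direction reported for intracerebral hemorrhage.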
What this could mean for future care
The key message is that a common gene variant that slightly changes how the brain handles iron may help cushion the damage from bleeding strokes, making it less likely that survivors will remain severely disabled. The study does not prove cause and effect, and it relies on predicted rather than directly measured disability scores, so more work is needed to confirm the link. Still, the findings suggest that simple genetic differences may partly explain why some people recover better than others after similar brain injuries. In the long run, such insights could help identify which patients are most likely to benefit from new treatments that target iron-related damage in the brain.
Citation: Markus, H., Helmuth, T.B., Connor, J.R. et al. The association of a common HFE gene variant with stroke disability utilizing predictive machine learning and electronic health records. Sci Rep 16, 15294 (2026). https://doi.org/10.1038/s41598-026-46129-1
Keywords: stroke recovery, intracerebral hemorrhage, HFE gene variant, machine learning, electronic health records