Clear Sky Science

A vision transformer model for the detection of glaucoma from optic disc photographs

2026-03-24

Why this matters for everyday sight

Glaucoma is one of the leading causes of permanent blindness worldwide, yet it often creeps in without symptoms until vision is already badly damaged. The good news is that simple photographs of the back of the eye are widely available, even in clinics with limited equipment. This study explores whether an advanced computer program can look at those photos and reliably flag early signs of glaucoma, long before most people would notice problems.

A quiet threat to vision

Glaucoma slowly harms the optic nerve, the bundle of fibers that carries visual signals from the eye to the brain. Doctors look for changes in the optic disc, the point where this nerve exits the eye, but interpreting these subtle shapes is hard, and even experts often disagree. Many regions of the world also lack enough eye specialists to screen large populations. As a result, about half of glaucoma cases worldwide are believed to go undiagnosed, especially in low- and middle-income countries, and many people only learn they have the disease after considerable sight has been lost.

Teaching a computer to read eye photos

The researchers gathered more than a thousand optic disc photographs from people with early glaucoma treated at a U.S. eye center, along with hundreds of photos of healthy eyes from two public image databases. Glaucoma specialists graded each picture as either glaucomatous or healthy based only on how the optic nerve looked, excluding eyes with other retinal problems. The team cropped every image so the optic disc occupied a similar portion of the frame. They also applied careful quality checks and realistic image tweaks, such as slight rotations, zooms, and blur, to expand the training set while keeping it true to real-world conditions.
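
The image tweaks described above can be illustrated with a minimal sketch. It assumes a grayscale image stored as a list of rows, and every range and name in it (MAX_ROTATION_DEG, ZOOM_RANGE, MAX_BLUR_RADIUS, sample_augmentation, box_blur) is hypothetical; the summary does not report the settings or the blur method the study used. The sketch only draws random parameters and applies a simple box blur, leaving rotation and zoom to an imaging library.

```python
import random

# Hypothetical ranges chosen for illustration only; the study's values are not given here.
MAX_ROTATION_DEG = 10.0
ZOOM_RANGE = (0.9, 1.1)
MAX_BLUR_RADIUS = 1  # in pixels


def sample_augmentation(rng):
    """Draw one random set of augmentation parameters."""
    return {
        "rotation_deg": rng.uniform(-MAX_ROTATION_DEG, MAX_ROTATION_DEG),
        "zoom": rng.uniform(*ZOOM_RANGE),
        "blur_radius": rng.randint(0, MAX_BLUR_RADIUS),
    }


def box_blur(image, radius):
    """Blur a grayscale image by averaging each pixel with its neighbors."""
    if radius == 0:
        return [row[:] for row in image]
    height, width = len(image), len(image[0])
    blurred = []
    for y in range(height):
        row = []
        for x in range(width):
            # Clip the neighborhood at the image border.
            ys = range(max(0, y - radius), min(height, y + radius + 1))
            xs = range(max(0, x - radius), min(width, x + radius + 1))
            values = [image[j][i] for j in ys for i in xs]
            row.append(sum(values) / len(values))
        blurred.append(row)
    return blurred


if __name__ == "__main__":
    rng = random.Random(0)
    params = sample_augmentation(rng)
    toy_image = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
    print(params)
    print(box_blur(toy_image, params["blur_radius"]))
```

Keeping the ranges small reflects the goal stated above: the altered photos should still look like photos a clinic camera could plausibly produce.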

Figure 1. An AI system reviews simple eye photos to tell healthy optic nerves from glaucomatous ones at a glance.

A new kind of neural network

Instead of relying on more traditional image analysis systems, the team built its model on a "vision transformer," a newer family of deep learning tools originally developed for recognizing objects in everyday photographs. This model slices each optic disc image into many small patches, represents each patch as a data token, and then uses layers of attention blocks to weigh how different regions of the disc relate to one another. The network outputs a score between 0 and 1 that reflects how likely the eye is to have glaucoma, with scores at or above 0.5 counted as positive. To make the most of the available data, the researchers used balanced sampling, weighted loss functions, and cross-validation, and they compared the transformer to a strong convolutional network called EfficientNet.
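
The patch, token, and attention steps described above can be sketched in a few lines. This is a simplified illustration, not the study's model: it assumes a square grayscale image whose side is divisible by the patch size, it uses raw pixel values as tokens, and it omits the learned projections, position information, and stacked layers that a real vision transformer has. The function names (split_into_patches, attention_weights, is_glaucoma_positive) are hypothetical; only the 0.5 cutoff comes from the text.

```python
import math


def split_into_patches(image, patch_size):
    """Cut a square grayscale image (list of rows) into flattened patch tokens."""
    size = len(image)
    tokens = []
    for top in range(0, size, patch_size):
        for left in range(0, size, patch_size):
            token = [
                image[top + dy][left + dx]
                for dy in range(patch_size)
                for dx in range(patch_size)
            ]
            tokens.append(token)
    return tokens


def attention_weights(query, keys):
    """Scaled dot-product attention weights of one query token over all key tokens."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def is_glaucoma_positive(score, threshold=0.5):
    """Scores at or above the threshold count as positive, as stated in the text."""
    return score >= threshold


if __name__ == "__main__":
    # A toy 4x4 "image" with pixel values 0..15, split into four 2x2 patches.
    image = [[4 * r + c for c in range(4)] for r in range(4)]
    tokens = split_into_patches(image, 2)
    print(len(tokens), tokens[0])
    print(attention_weights(tokens[0], tokens))
    print(is_glaucoma_positive(0.5))
```

The attention weights sum to 1, so each patch's updated representation is a weighted blend of all patches; this is what lets the model relate distant regions of the optic disc to one another.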

Figure 2. An eye photo is split into patches that flow through layered processing to separate healthy from damaged optic discs.

How well the system saw early disease

When tested on images it had not seen before, the vision transformer model separated glaucomatous from healthy eyes almost perfectly. Its main performance measure, the area under the receiver operating characteristic curve, was 1.00 on the test set, with accuracy around 99 percent, very high sensitivity, and very high specificity. In practical terms, the system missed almost no glaucoma cases and labeled very few healthy eyes as diseased. When the researchers later challenged the model with nearly a thousand eyes that had moderate to advanced glaucoma, it correctly identified all but one of them. The transformer also outperformed the EfficientNet-based approach, which had lower accuracy and more false alarms and misses.
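
For readers unfamiliar with these measures, the sketch below shows how sensitivity, specificity, and the area under the curve are commonly computed from labels and model scores. The labels and scores are invented toy values, not data from the study, and the function names are hypothetical. The area under the curve is computed here as the fraction of glaucoma/healthy pairs in which the glaucoma eye receives the higher score, which is a standard equivalent definition.

```python
def confusion_counts(labels, scores, threshold=0.5):
    """Count true/false positives and negatives (label 1 = glaucoma, 0 = healthy)."""
    tp = fp = tn = fn = 0
    for label, score in zip(labels, scores):
        predicted_positive = score >= threshold
        if label == 1 and predicted_positive:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def sensitivity(tp, fn):
    """Share of glaucoma eyes the model catches."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Share of healthy eyes the model correctly clears."""
    return tn / (tn + fp)


def area_under_curve(labels, scores):
    """Probability that a random glaucoma eye scores above a random healthy eye."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Invented toy data for illustration only.
    labels = [1, 1, 1, 0, 0, 0]
    scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
    tp, fp, tn, fn = confusion_counts(labels, scores)
    print(sensitivity(tp, fn), specificity(tn, fp), area_under_curve(labels, scores))
```

An area under the curve of 1.00, as reported above, means every glaucoma eye in the test set scored higher than every healthy eye. Accuracy can still fall slightly below 100 percent because it depends on the fixed 0.5 cutoff rather than on the ranking alone.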

What this could mean for eye care

Because the model works with standard optic disc photographs and was trained on a racially diverse group of patients, it offers a realistic glimpse of how artificial intelligence might help screen for glaucoma in many parts of the world. The authors caution that their study used a smaller overall sample size than some others and leaned on external datasets for many of the healthy controls, which could introduce hidden biases. They argue that larger, more varied image collections, images captured with portable cameras, and inclusion of basic patient information like age or degree of nearsightedness will be important next steps. Still, their findings suggest that smart analysis of simple eye photos could become a cost-effective way to catch glaucoma early and reduce avoidable blindness, especially where specialists are scarce.

Take-home message for readers

This work shows that an advanced computer model can learn to spot the early fingerprints of glaucoma from routine photographs of the optic nerve with very high accuracy. While more testing in real-world clinics is needed, such tools could one day help doctors quickly sort large numbers of patients into those who need urgent eye care and those who do not, making early protection of sight more accessible around the globe.

Citation: Bouris, E., Leyva, B.K., Odugbo, O.P. et al. A vision transformer model for the detection of glaucoma from optic disc photographs. Sci Rep 16, 14831 (2026). https://doi.org/10.1038/s41598-026-44662-7

Keywords: glaucoma screening, optic disc photography, deep learning, vision transformer, eye health