Clear Sky Science

The current state of polygenic scores for the development of lung cancer: a systematic review and validation in UK Biobank

Why Our Genes Matter for Lung Cancer

Lung cancer is often linked to smoking, but not everyone who smokes gets lung cancer, and some people who never smoke do. This puzzle has led scientists to ask how much of lung cancer risk is hidden in our DNA. The article explores whether combining many small genetic clues into what are called polygenic scores can help identify who is most likely to develop lung cancer, beyond what we already know from tobacco use alone.

Figure 1.

Looking for Genetic Clues

Instead of a single “lung cancer gene,” researchers now know that risk comes from the combined effect of many subtle DNA differences. These tiny changes, scattered across the genome, each nudge risk up or down by a small amount. By adding up hundreds or even thousands of these changes into a single number—a polygenic score—scientists hope to estimate a person’s inherited tendency to develop lung cancer. If such scores worked well, they might one day help decide who should receive early screening scans, even among people who have never smoked.
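The "adding up" described above can be sketched as a weighted sum: each DNA variant has a weight, and a person's score is the sum of each weight times the number of copies of that variant they carry (0, 1, or 2). The variant names, weights, and genotype below are hypothetical placeholders chosen for illustration and do not come from any published lung cancer score.

```python
from typing import Dict


def polygenic_score(dosages: Dict[str, float], weights: Dict[str, float]) -> float:
    """Return the weighted sum of effect-allele dosages for one person.

    dosages: copies of the effect allele carried at each variant (0, 1, or 2).
    weights: per-variant effect sizes, for example log odds ratios.
    """
    # Assumption for this sketch: a variant missing from the person's data
    # contributes nothing to the score.
    return sum(w * dosages.get(variant, 0.0) for variant, w in weights.items())


# Hypothetical weights; positive values raise risk, negative values lower it.
weights = {"variant_a": 0.10, "variant_b": -0.05, "variant_c": 0.20}

# Hypothetical genotype for one person.
person = {"variant_a": 2, "variant_b": 1, "variant_c": 0}

score = polygenic_score(person, weights)
print(round(score, 2))
```

Real scores differ mainly in which variants they include and how the weights are estimated, which is why the published scores reviewed here vary so widely in size and construction.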

Gathering All the Existing Scores

The authors first carried out a broad search of the scientific literature and a public database of genetic risk scores. They found 60 different polygenic scores for lung cancer that had been created since 2012, mostly using large genetic studies in European and East Asian populations. These scores differed in how many DNA changes they included, how they were built, and whether they tried to account for smoking. Some had been tested only in the groups they were created from, and only a few had ever been checked in completely independent populations.

Putting the Scores to the Test

To compare these scores fairly, the team turned to the UK Biobank, a large health study that has genetic data and long-term health records for about half a million adults. After excluding people who already had cancer, they followed over 429,000 participants, including more than 3,500 who later developed lung cancer. The researchers were able to reconstruct and test 39 of the published scores in this group. For each person, they calculated a polygenic score and then examined how well it separated those who went on to develop lung cancer from those who did not, using standard measures of prediction performance.
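The summary does not name the measures used. One widely used measure of how well a score separates future cases from non-cases is the area under the ROC curve (AUC): the probability that a randomly chosen case has a higher score than a randomly chosen non-case, where 0.5 means no better than chance and 1.0 means perfect separation. A minimal sketch, using made-up scores and outcomes rather than UK Biobank data:

```python
from typing import Sequence


def auc(scores: Sequence[float], outcomes: Sequence[int]) -> float:
    """Compute AUC by comparing every case with every non-case.

    outcomes: 1 for people who developed the disease, 0 otherwise.
    """
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    controls = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")

    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0   # case ranked above non-case
            elif case_score == control_score:
                concordant += 0.5   # ties count as half
    return concordant / (len(cases) * len(controls))


# Hypothetical data: five people, two of whom developed the disease.
example_scores = [0.9, 0.4, 0.3, 0.2, 0.5]
example_outcomes = [1, 0, 1, 0, 0]

example_auc = auc(example_scores, example_outcomes)
print(round(example_auc, 3))
```

The pairwise loop is easy to follow but slow for a cohort of hundreds of thousands; in practice a rank-based formula or a statistics library would be used instead.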

Figure 2.

What the Results Really Show

Most of the tested scores did show some link with future lung cancer, meaning people with higher scores tended to be diagnosed more often. However, the strength of this prediction was modest. In technical terms, nearly all scores performed better than chance but fell well short of the accuracy seen for similar scores in cancers such as breast or colorectal cancer. Even the best-performing lung cancer scores could not concentrate more than about 2% of future cases into the top 1% of the genetic risk distribution. Making the scores more complex by adding more DNA markers, or using newer methods, did not noticeably improve their performance.
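To put the "2% of cases in the top 1%" figure in context: a score with no predictive value would place about 1% of future cases in the top 1% of people, so 2% is roughly a twofold enrichment. The calculation can be sketched as follows, with a tiny invented cohort and a top 20% cutoff so the numbers are easy to check by hand.

```python
from typing import Sequence


def share_of_cases_in_top(scores: Sequence[float],
                          outcomes: Sequence[int],
                          fraction: float) -> float:
    """Return the share of all cases found in the top `fraction` of scores."""
    total_cases = sum(outcomes)
    if total_cases == 0:
        raise ValueError("no cases in the data")

    # Rank people from highest to lowest score.
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)

    # Number of people in the top group (at least one person).
    top_n = max(1, round(len(ranked) * fraction))
    cases_in_top = sum(outcome for _, outcome in ranked[:top_n])
    return cases_in_top / total_cases


# Hypothetical cohort of ten people; four develop the disease.
cohort_scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
cohort_outcomes = [0, 0, 1, 0, 0, 0, 1, 0, 1, 1]

top_share = share_of_cases_in_top(cohort_scores, cohort_outcomes, fraction=0.2)
print(top_share)
```

Dividing the returned share by `fraction` gives the enrichment relative to chance, which for the best lung cancer scores described above works out to about 2.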

Differences by Smoking and Ancestry

Because smoking is such a powerful risk factor, the researchers also asked how well the scores worked in people with different smoking histories. For most scores, prediction was slightly better in current and former smokers than in people who had never used tobacco, suggesting that many genetic markers may partly reflect a tendency toward smoking behavior. Interestingly, a small subset of scores performed somewhat better in people who had never smoked, hinting that those particular DNA patterns might capture more of the underlying biological tendency to develop lung cancer itself. The study also highlighted a serious imbalance: most original genetic studies were based on people of European or East Asian ancestry, leaving very little solid information about how well these scores perform in other ethnic groups.

What This Means for Future Screening

For a layperson, the core message is that current genetic scores for lung cancer are not yet strong enough to stand alone as a screening tool. They can modestly separate higher- and lower-risk people, especially among smokers, but the differences are too small to reliably pick out who will get lung cancer. The authors conclude that, for now, these scores might be most useful as one ingredient in broader risk models that also include age, smoking history, and other health or environmental factors. They also emphasize the need for more diverse genetic research and for better understanding of how genes and smoking interact, before genetic risk can meaningfully change who gets screened and when.

Citation: Galal, B., Dennis, J., Antoniou, A.C. et al. The current state of polygenic scores for the development of lung cancer: a systematic review and validation in UK Biobank. Br J Cancer 134, 939–948 (2026). https://doi.org/10.1038/s41416-025-03330-9

Keywords: lung cancer risk, polygenic scores, genetic susceptibility, smoking and genetics, cancer screening