Clear Sky Science
Towards the tumor microenvironment scoring methods for immune checkpoint inhibitor response
Why predicting cancer drug response matters
Immune checkpoint inhibitors are a new class of cancer drugs that can unleash the body’s own defenses against tumors, sometimes leading to dramatic, long‑lasting remissions. Yet only a fraction of patients benefit, while others endure side effects without meaningful improvement. This paper asks a practical question with life‑and‑death consequences: can we use the molecular "fingerprint" of a tumor and its surrounding tissue to score how likely a patient is to respond to these treatments before therapy even begins?
Taking the pulse of the tumor’s neighborhood
Every tumor sits in a bustling neighborhood of cancer cells, blood vessels, immune cells, and supporting tissue, collectively called the tumor microenvironment. The authors focus on methods that convert this complex environment into numerical "scores" using gene activity measurements from bulk RNA sequencing, a technology that measures how active each gene is across all the cells in a piece of tumor. They review and re‑analyze 17 such scoring systems, many of which capture different aspects of the immune landscape: how many killer T cells are present, whether immune cells are active or exhausted, or how much scar‑like stromal tissue surrounds the cancer. These scores aim to forecast who will respond to immune checkpoint inhibitors across several cancers, including melanoma, lung, bladder, head and neck, and kidney cancers.
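To make the idea of a gene-based "score" concrete, here is a minimal Python sketch of one generic approach: standardize each signature gene across patients, then average those values per patient. It is an illustration only and is not taken from the paper; the gene list, the toy numbers, and the function names are assumptions, and the 17 published scores each use their own gene sets and formulas.

```python
from statistics import mean, pstdev


def zscore_across_samples(values):
    """Standardize one gene's expression values across samples."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        # A gene that never varies carries no information; score it as 0.
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


def signature_score(expression, signature_genes):
    """Average the z-scored expression of the signature genes per sample.

    expression: dict mapping gene name -> list of values, one per sample.
    Signature genes that are absent from the data are skipped.
    """
    present = [g for g in signature_genes if g in expression]
    if not present:
        raise ValueError("none of the signature genes are in the data")
    z = [zscore_across_samples(expression[g]) for g in present]
    n_samples = len(z[0])
    return [mean(gene_z[i] for gene_z in z) for i in range(n_samples)]


# Toy log-scale expression for three hypothetical tumor samples.
toy_expression = {
    "CD8A": [1.0, 3.0, 5.0],
    "GZMB": [2.0, 4.0, 6.0],
    "ACTB": [9.0, 9.0, 9.0],  # not part of the illustrative signature
}

# Illustrative two-gene "T-cell" signature (an assumption, not a published score).
scores = signature_score(toy_expression, ["CD8A", "GZMB"])
print(scores)
```

In this toy data the third sample has the highest expression of both signature genes, so it receives the highest score.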

How the study put these scores to the test
To fairly compare methods developed by many different research groups, the authors gathered data from multiple clinical studies in which patients received immune checkpoint inhibitors and had tumor RNA sequencing performed. They built large combined datasets—for example, merging four melanoma studies and several mixed‑cancer cohorts—and also examined individual cancer types on their own. Because each study used slightly different lab protocols, they first corrected for "batch effects" so that technical differences would not masquerade as biology. They then asked two main questions for each score: how well did it distinguish responders from non‑responders, and how well did it predict how long patients lived after treatment?
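One common way to quantify how well a score separates responders from non-responders is the area under the ROC curve (AUC): the chance that a randomly chosen responder has a higher score than a randomly chosen non-responder. This summary does not state which metric the authors used, so the sketch below is an assumption for illustration, with made-up patient values.

```python
def roc_auc(scores, responded):
    """Area under the ROC curve, computed by comparing every
    responder/non-responder pair (ties count as half a win)."""
    pos = [s for s, r in zip(scores, responded) if r]
    neg = [s for s, r in zip(scores, responded) if not r]
    if not pos or not neg:
        raise ValueError("need at least one responder and one non-responder")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores for five patients and whether each responded.
toy_scores = [0.9, 0.8, 0.4, 0.35, 0.1]
toy_response = [True, False, True, False, False]

auc = roc_auc(toy_scores, toy_response)
print(round(auc, 3))
```

An AUC of 0.5 means the score does no better than a coin flip, while 1.0 means perfect separation. Here the responders win five of the six pairwise comparisons, giving an AUC of about 0.83.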
What worked, what helped, and what fell short
The analysis revealed a sobering but informative picture. Some scores performed reasonably well in specific settings: for example, measures of cytolytic, or cell‑killing, T‑cell activity (called CYT1 and CYT2) and a dysfunction‑focused score (TIDE) were especially informative in melanoma. A simple two‑gene ratio capturing macrophage behavior, known as CS Polarity, rose to the top in the large mixed‑cancer group, while a "hot tumor" gene signature named TIP Hot was consistently useful in several cancers, particularly bladder, lung, and head and neck tumors. Another score, IS_immune, reflecting overall immune activity, predicted survival well both in bladder cancer and in The Cancer Genome Atlas (TCGA), a broader dataset of patients who were not treated with immunotherapy. However, when all scores were compared side by side, their ability to predict response or survival was generally modest, and no single method was reliably strong across every cancer type.
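A two-gene ratio score can be very compact. The sketch below shows the general shape of such a calculation as a log ratio with a small pseudocount to avoid dividing by zero. The gene names GENE_A and GENE_B are placeholders, and the pseudocount and log base are assumptions; this summary does not give the actual genes or formula behind CS Polarity.

```python
import math


def two_gene_log_ratio(expr_a, expr_b, pseudocount=1.0):
    """Log2 ratio of two genes' expression values.

    The pseudocount keeps the ratio defined when a gene has zero reads.
    Positive values mean gene A dominates; negative values mean gene B does.
    """
    return math.log2((expr_a + pseudocount) / (expr_b + pseudocount))


# Hypothetical expression values; GENE_A and GENE_B are placeholder names.
toy_samples = {
    "patient_1": {"GENE_A": 31.0, "GENE_B": 7.0},
    "patient_2": {"GENE_A": 3.0, "GENE_B": 15.0},
    "patient_3": {"GENE_A": 5.0, "GENE_B": 5.0},
}

ratios = {
    name: two_gene_log_ratio(genes["GENE_A"], genes["GENE_B"])
    for name, genes in toy_samples.items()
}
print(ratios)
```

Because it relies on only two measurements per patient, a score of this kind has few parameters to tune.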

Hot tumors, cold tumors, and the limits of current scores
The authors found that scores tended to work best in so‑called "hot" tumors—those already infiltrated by many active immune cells, such as melanoma, certain lung cancers, head and neck cancers, and urothelial (bladder) cancers. In contrast, in "cold" tumors like many kidney cancers, where immune cells are sparse or suppressed, all existing scores struggled. Even when some measures showed statistical differences between responders and non‑responders, their real‑world predictive power remained weak. The study also highlights why narrow signatures can fail: scores built around a single cell type or pathway may miss important contextual factors, such as whether T cells are exhausted or whether the tumor has evolved ways to shut them down. On the other hand, extremely large, complex models risk overfitting and can perform poorly when applied to new patient groups.
Where this leaves patients and future research
For patients and clinicians, the key message is cautious optimism. Tumor microenvironment scores already capture meaningful biological signals, and a few—like TIP Hot, CS Polarity, TIDE, and IS_immune—show promise in particular cancers. But they are not yet accurate enough, or universal enough, to serve as stand‑alone tests for deciding who should receive immune checkpoint inhibitors. The authors argue that future progress will require larger and more diverse datasets, smarter ways to reduce the complexity of gene data, better integration of clinical factors and other biomarkers (such as blood tests and microbiome data), and models that account for how tumors evolve over time. With these advances, tumor microenvironment scoring could become a powerful tool to match patients to the right immunotherapy and spare others from ineffective treatment.
Citation: Zhou, Q., Kirshtein, A. & Shahriyari, L. Towards the tumor microenvironment scoring methods for immune checkpoint inhibitor response. npj Precis. Onc. 10, 88 (2026). https://doi.org/10.1038/s41698-025-01221-z
Keywords: tumor microenvironment, immunotherapy response, immune checkpoint inhibitors, gene expression scores, hot and cold tumors