Clear Sky Science
Interpretable machine learning models using peripheral blood biomarkers for diagnosis and prognosis of glottic laryngeal squamous cell carcinoma
Why a simple blood test could help protect your voice
Glottic laryngeal cancer affects the vocal cords and can threaten both speech and life, yet it is often hard to distinguish from harmless voice problems before surgery. This study explores whether the same routine blood tests many people receive before an operation or checkup can be combined with modern computer algorithms to spot dangerous tumors earlier and estimate how a patient is likely to fare after treatment—all without extra scans or invasive procedures.

Looking for cancer clues in everyday blood tests
The researchers studied three groups of men: 124 patients with cancer of the vocal folds, 124 patients with noncancerous vocal cord lesions, and 124 healthy volunteers. For every participant, they collected standard pre-surgery blood measurements that reflect inflammation (such as counts of white blood cells), clotting tendency (such as fibrinogen and clotting times), and nutritional status (such as albumin, a key blood protein). Because these tests are already part of routine hospital care, any discoveries would be easy to apply widely and at low cost.
Teaching machines to tell harmful from harmless
To turn this sea of numbers into practical guidance, the team used two popular machine-learning methods, known as Random Forest and XGBoost. These programs learn patterns from data much like a spam filter learns to separate junk mail from genuine messages. Here, the goal was to distinguish cancer from benign voice problems using only blood test results. After training and cross-checking on most of the patients, the models were tested on a separate group. The XGBoost model in particular performed very well, correctly telling cancer from noncancer in most cases. Its area under the ROC curve (AUC), a score on which 1.0 means perfect separation and 0.5 means no better than chance, was 0.93, which is high for a non-invasive test based purely on routine lab work.
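To make the AUC figure more concrete, here is a minimal sketch in plain Python. It relies on the fact that AUC equals the probability that a randomly chosen cancer case gets a higher risk score than a randomly chosen benign case. The labels and scores are made up for illustration and are not data from the study; the function name `auc` is likewise just a choice for this example.

```python
def auc(labels, scores):
    """Area under the ROC curve, computed by pairwise comparison.

    Returns the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical risk scores for eight imaginary patients (1 = cancer, 0 = benign).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.91, 0.78, 0.64, 0.40, 0.55, 0.33, 0.21, 0.10]

example_auc = auc(labels, scores)
print(example_auc)  # 15 of the 16 cancer/benign pairs are ranked correctly: 0.9375
```

In this toy example only one pair is ranked the wrong way round (the cancer case scored 0.40 against the benign case scored 0.55), which is why the result falls just short of 1.0.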
Making the black box understandable to doctors
Computer models are often criticized for being black boxes, but this work used a method called SHAP to show which blood markers were driving the predictions. The most important signals were measures linked to blood clotting and immune activity: the international normalized ratio (INR), fibrinogen, thrombin time, and ratios that compare different types of white blood cells (neutrophil-to-monocyte and lymphocyte-to-monocyte ratios). In general, patients with cancer tended to have more signs of inflammation and a body primed to form clots, along with shifts in the balance of immune cells. The researchers even built a simple visual scoring tool, based on the top markers, so clinicians could estimate an individual patient’s cancer risk at the bedside.
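SHAP is built on Shapley values, an idea from game theory that splits a prediction fairly among the inputs that produced it. The sketch below computes exact Shapley values for a tiny, hypothetical risk score by averaging each marker's contribution over every order in which the markers could be "switched on". The score formula, the marker values, and the baseline are all invented for illustration and are not the study's model; real SHAP software uses much faster algorithms to handle models such as XGBoost.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values for one prediction.

    Starting from the baseline inputs, features are switched to the
    patient's values one at a time; each feature's change in model output
    is averaged over all possible orderings.
    """
    names = list(x)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous = model(current)
        for name in order:
            current[name] = x[name]
            updated = model(current)
            totals[name] += updated - previous
            previous = updated
    return {name: totals[name] / len(orderings) for name in names}


def toy_risk(m):
    # Hypothetical score (NOT the study's model): two additive terms
    # plus one interaction between fibrinogen and the NMR ratio.
    return 0.5 * m["fibrinogen"] + 2.0 * m["inr"] + 0.1 * m["fibrinogen"] * m["nmr"]


# Invented values: a reference "typical" patient and the patient being explained.
baseline = {"fibrinogen": 3.0, "inr": 1.0, "nmr": 8.0}
patient = {"fibrinogen": 4.0, "inr": 1.1, "nmr": 10.0}

contributions = shapley_values(toy_risk, patient, baseline)
for name, value in contributions.items():
    print(f"{name}: {value:+.2f}")
```

The contributions always add up to the difference between the patient's score and the baseline score, so a clinician can see how much each marker moved the prediction up or down.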
Blood signals that track how aggressive the cancer is
Beyond diagnosis, the study asked whether blood markers reflect how dangerous a tumor is. By linking blood results to details from surgical pathology reports, the team found that certain combined indices—especially the systemic immune-inflammation index (SII) and several cell-count ratios—rose in step with larger tumors, lymph node spread, and higher overall stage. One marker, the neutrophil-to-platelet ratio, was strongly associated with cancer cells invading along nerves, a worrisome feature linked to recurrence. Over a median follow-up of about four and a half years, patients with higher neutrophil counts, a higher neutrophil-to-lymphocyte ratio, and higher SII fared worse, with more relapses and deaths.
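The combined indices mentioned here are simple arithmetic on a standard blood count. The sketch below uses the commonly cited definitions (for example, SII as platelets times neutrophils divided by lymphocytes); the article itself does not spell out the formulas, so treat them as assumptions. The `BloodCount` class, the function name, and the example values are hypothetical and not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class BloodCount:
    # All counts in 10^9 cells per litre (illustrative units).
    neutrophils: float
    lymphocytes: float
    monocytes: float
    platelets: float


def derived_indices(c: BloodCount) -> dict:
    """Ratios and indices derived from a routine blood count.

    Definitions follow common usage and are assumed here, not quoted
    from the study.
    """
    return {
        "NLR": c.neutrophils / c.lymphocytes,  # neutrophil-to-lymphocyte ratio
        "NMR": c.neutrophils / c.monocytes,    # neutrophil-to-monocyte ratio
        "LMR": c.lymphocytes / c.monocytes,    # lymphocyte-to-monocyte ratio
        "NPR": c.neutrophils / c.platelets,    # neutrophil-to-platelet ratio
        "SII": c.platelets * c.neutrophils / c.lymphocytes,  # systemic immune-inflammation index
    }


# Made-up blood count for one imaginary patient.
example = BloodCount(neutrophils=4.0, lymphocytes=2.0, monocytes=0.5, platelets=250.0)
indices = derived_indices(example)
for name, value in indices.items():
    print(f"{name}: {value:.3f}")
```

Because every input comes from the same routine blood draw, these indices cost nothing extra to compute once the lab results are available.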

What this could mean for patients and clinicians
Put simply, this research shows that a thoughtfully analyzed “snapshot” of a patient’s blood can reveal much more than routine lab reports suggest. By combining familiar tests with interpretable machine-learning tools, doctors may soon be able to better decide which hoarse patients urgently need biopsy, which cancers are likely to behave aggressively, and who might benefit from closer follow-up or additional therapy. While the study was retrospective and limited to men from a single region—meaning it needs confirmation in broader groups—it outlines a practical, low-cost path toward more personalized, data-informed care for people with suspected or confirmed cancer of the vocal cords.
Citation: Zhang, Y., Yan, X., Li, X. et al. Interpretable machine learning models using peripheral blood biomarkers for diagnosis and prognosis of glottic laryngeal squamous cell carcinoma. Sci Rep 16, 10451 (2026). https://doi.org/10.1038/s41598-026-40074-9
Keywords: laryngeal cancer, blood biomarkers, machine learning, cancer prognosis, immune inflammation