Clear Sky Science

Machine learning application in colon cancer treatment outcome prediction

Why predicting colon cancer outcomes matters

Colon cancer is one of the most common cancers worldwide, and many patients and families want to know a simple, urgent thing: “What are my chances, and what can be done to improve them?” This study from Iran explores how modern computer techniques, known as machine learning, can sift through detailed medical records to better predict which patients are at higher risk after surgery. By sharpening these predictions, doctors may be able to tailor treatment and follow-up care more precisely, giving vulnerable patients a better shot at long-term survival.

Turning hospital records into helpful patterns

The researchers drew on 10 years of data from 764 people who had surgery for colon cancer at a major center in Shiraz, Iran. For each patient, they collected 44 pieces of information, including age, blood tests, tumor size, cancer stage, symptoms, and details of the operation and treatments such as chemotherapy. These records were cleaned and checked carefully: impossible lab values were corrected, patients who could not be followed were removed, and missing answers were filled in with reasonable estimates. The team then split the data so that most of it trained the computer models, while a separate portion was held back to test how well those models could predict who would be alive or dead at follow-up.
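
The preparation steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's actual pipeline: the field names, the toy records, the use of the median as the "reasonable estimate," and the 80/20 split are all assumptions made for the example.

```python
import random
import statistics

def impute_median(records, field):
    """Return copies of the records with missing (None) values of `field`
    replaced by the median of the observed values."""
    observed = [r[field] for r in records if r[field] is not None]
    median = statistics.median(observed)
    return [dict(r, **{field: median}) if r[field] is None else dict(r)
            for r in records]

def stratified_split(records, label, test_fraction=0.2, seed=0):
    """Split records into train and test sets while keeping the share of
    each outcome roughly the same in both."""
    rng = random.Random(seed)
    train, test = [], []
    for value in sorted({r[label] for r in records}):
        group = [r for r in records if r[label] == value]
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test

# Hypothetical toy records; names and values are illustrative only.
# died: 1 = dead at follow-up, 0 = alive, None = could not be followed.
patients = [
    {"age": 61, "tumor_size_cm": 4.0, "died": 0},
    {"age": 72, "tumor_size_cm": None, "died": 1},
    {"age": 55, "tumor_size_cm": 3.0, "died": 0},
    {"age": 68, "tumor_size_cm": 6.5, "died": 1},
    {"age": 49, "tumor_size_cm": 2.5, "died": 0},
    {"age": 77, "tumor_size_cm": 5.0, "died": None},
    {"age": 63, "tumor_size_cm": 3.5, "died": 0},
    {"age": 70, "tumor_size_cm": 7.0, "died": 1},
    {"age": 58, "tumor_size_cm": None, "died": 0},
    {"age": 66, "tumor_size_cm": 4.5, "died": 0},
    {"age": 74, "tumor_size_cm": 5.5, "died": 1},
]

# Step 1: remove patients who could not be followed.
followed = [p for p in patients if p["died"] is not None]

# Step 2: fill in missing tumor sizes with the median of the known ones.
cleaned = impute_median(followed, "tumor_size_cm")

# Step 3: hold back a portion of the data for testing.
train, test = stratified_split(cleaned, "died", test_fraction=0.2)
print(len(train), "training records,", len(test), "test records")
```

The sketch follows the order given in the article (clean first, then split). In practice, fill-in values are often computed from the training portion alone so that no information from the test patients leaks into the model.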

Figure 1.

How smart algorithms learn from patients

Instead of relying only on traditional statistics, the study compared several modern computer approaches side by side. These included different “forest” and “boosting” methods, which combine many simple decision rules, as well as neural networks, which loosely mimic how brain cells connect. The goal for every method was the same: use the patients’ information to guess whether each person would survive, and then compare those guesses with what actually happened. The models were judged on how often they were right overall, how good they were at catching patients who died, and how well they avoided false alarms for those who lived. The best-performing methods reached around 80% overall accuracy, a strong result given the complexity of cancer outcomes.
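
The three yardsticks mentioned above correspond to standard measures usually called accuracy, sensitivity, and specificity. The sketch below shows how they are computed from a list of true outcomes and a list of model guesses. The outcome and prediction lists are made-up toy values, not data from the study, and the coding (1 = died, 0 = survived) is an assumption for the example.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives, where 1 = died, 0 = survived."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # deaths correctly flagged
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # survivors correctly cleared
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false alarms
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # missed deaths
    return tp, tn, fp, fn

def summarize(actual, predicted):
    """Return accuracy, sensitivity, and specificity as fractions.
    Assumes both outcomes appear at least once in `actual`."""
    tp, tn, fp, fn = confusion_counts(actual, predicted)
    return {
        "accuracy": (tp + tn) / len(actual),   # how often the model was right overall
        "sensitivity": tp / (tp + fn),         # share of deaths the model caught
        "specificity": tn / (tn + fp),         # share of survivors without a false alarm
    }

# Toy example: 4 patients died, 6 survived.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

metrics = summarize(actual, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

In this toy case the model is right for 8 of 10 patients (accuracy 0.80), catches 3 of the 4 deaths (sensitivity 0.75), and avoids a false alarm for 5 of the 6 survivors (specificity about 0.83). A model can score well on one measure and poorly on another, which is why the study reports all three.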

Which models and factors mattered most

Among all of the approaches, a method called CatBoost gave the highest overall accuracy, while a random forest model showed the best balance between correctly flagging high-risk patients and not over-calling risk in those who did well. To make the results more understandable to doctors, the team used an explanation tool that ranks which pieces of information influenced the computer’s decisions the most. Cancer stage (a summary of how large the tumor is, whether it has reached lymph nodes, and whether it has spread) was the single strongest factor. Tumor size, how deeply the tumor invaded the colon wall, the presence of spread to other organs, type of treatment, tumor grade (how abnormal the cells looked), involvement of lymph and blood vessels, patient age, and weight loss also played important roles in shaping survival predictions.
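
To give a feel for how a feature ranking can be produced, here is a small sketch. The article does not name the explanation tool the team used, so this example swaps in a different, simpler technique called permutation importance: shuffle one column of the data at a time and measure how much the model's accuracy drops. The toy model, the two features, and the patient rows are all hypothetical.

```python
import random

def accuracy(model, rows, labels):
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(model, rows, labels, features, repeats=20, seed=0):
    """Average drop in accuracy when each feature's values are shuffled
    across patients. A larger drop means the model leans on that feature more."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    scores = {}
    for feature in features:
        drops = []
        for _ in range(repeats):
            column = [r[feature] for r in rows]
            rng.shuffle(column)
            shuffled = [dict(r, **{feature: v}) for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        scores[feature] = sum(drops) / repeats
    return scores

def toy_model(row):
    # Hypothetical stand-in for a trained model: predicts death (1)
    # for stage III or IV and survival (0) otherwise. It ignores age.
    return 1 if row["stage"] >= 3 else 0

# Made-up rows and outcomes (1 = died, 0 = survived).
rows = [
    {"stage": 1, "age": 70}, {"stage": 2, "age": 52},
    {"stage": 3, "age": 64}, {"stage": 4, "age": 58},
    {"stage": 1, "age": 75}, {"stage": 4, "age": 49},
    {"stage": 2, "age": 66}, {"stage": 3, "age": 71},
]
labels = [0, 0, 1, 1, 0, 1, 0, 1]

scores = permutation_importance(toy_model, rows, labels, ["stage", "age"])
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Because the toy model looks only at stage, shuffling the age column never changes its predictions, so age scores exactly zero and stage ranks first. A real model trained on all 44 variables would spread importance across many features, as the study's ranking shows.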

Figure 2.

From numbers to bedside decisions

These findings suggest that a carefully trained computer model, fed with routine clinical information, can help doctors spot patients who are quietly at high risk after colon cancer surgery. In day-to-day practice, such a tool could sit inside an electronic health record, instantly combining details about a patient’s tumor and general health into a simple risk estimate. That number would not replace a doctor’s judgment, but it could guide choices such as how often a patient should be checked, whether additional treatments are worth the side effects, or when a second opinion is needed. Because the most important factors identified by the computer match what cancer specialists already consider critical, the system is easier to trust and explain to patients.

What this means for patients and the future

For patients and families, the key message is that computers can now use ordinary medical data to support more personalized care for colon cancer. While the study was done at a single center in Iran and still needs to be tested in other hospitals and with richer data, such as genetic and imaging information, it shows that machine learning can highlight who needs extra attention and why. Over time, as more data are added and models are refined, these tools could help doctors around the world deliver treatment that is not only evidence-based, but also finely tuned to each person’s particular cancer and circumstances.

Citation: Ghasemi, H., Hosseini, S.V., Rezaianzadeh, A. et al. Machine learning application in colon cancer treatment outcome prediction. Sci Rep 16, 6159 (2026). https://doi.org/10.1038/s41598-026-36917-0

Keywords: colon cancer, machine learning, treatment outcomes, risk prediction, clinical data