Clear Sky Science · en

A predictive model for PICC-related thrombosis in sepsis patients using XGBoost algorithm

Why this matters for patients and families

Many people with life-threatening infections, such as sepsis, rely on long, flexible tubes placed in their veins to receive medicine and nutrition. These lines, called PICC lines, are essential but can sometimes trigger dangerous blood clots. This study shows how doctors used a modern computer technique to predict which sepsis patients are most likely to develop clots around these lines, with the goal of preventing complications before they happen.

Catheters that help and sometimes harm

When patients have sepsis, they often need weeks of intravenous antibiotics, fluids, and nutrition. Peripherally inserted central catheters, or PICC lines, are threaded from a vein in the arm toward the heart to make this possible. While convenient, PICC lines can also disturb blood flow and irritate the vessel wall, allowing clots to form. These clots may block veins, prolong intensive care stays, and in severe cases break off and travel to the lungs. Until now, it has been hard for clinicians to know in advance which patients are most at risk, because many different health and blood factors are involved at the same time.

Figure 1.

Using big data and smart algorithms

The researchers drew on a large open critical care database from a major U.S. hospital, containing detailed records from more than 94,000 intensive care stays between 2008 and 2022. From this resource, they identified 8,128 adults with sepsis who had a PICC line in place for at least two days and did not have certain complicating conditions such as blood cancers. Among these patients, 538 (about 6.6%) developed a clot related to the catheter, most of them within the first month after it was inserted. The team split the patients into two groups: one to train their computer model and another to test how well it would work on new cases.

What the model learned about clot risk

To spot patterns too complex for simple rules, the team used an approach called XGBoost, a type of machine learning that combines many small decision trees into a single strong predictor. They fed the model information that clinicians routinely collect, including age, sex, blood test results, other illnesses, and how long the PICC had been in place. The model produced a probability that a particular patient would develop a clot. Its ability to separate patients who developed clots from those who did not was judged using the area under the receiver operating characteristic curve (AUC), a standard yardstick for prediction tools on which 0.5 is no better than chance and 1.0 is perfect. Scores around 0.76 in both the training and test groups indicated that the model distinguished higher-risk from lower-risk patients reasonably well, and the similar scores in the two groups suggest it had not simply memorized the training data.
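To make the AUC yardstick concrete, here is a minimal Python sketch of one common way to compute it: as the fraction of (clot, no-clot) patient pairs in which the patient who developed a clot received the higher predicted risk. The function name `roc_auc` and the toy labels and scores are hypothetical illustrations, not the study's code or data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed pair by pair.

    labels: 1 if the patient developed a clot, 0 otherwise.
    scores: predicted clot probability for each patient.
    A tie between a positive and a negative case counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 patients with clots, 5 without.
labels = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.80, 0.30, 0.45, 0.50, 0.10, 0.70, 0.20, 0.40]

print(f"AUC = {roc_auc(labels, scores):.3f}")
```

Read this way, an AUC of 0.76 means that in roughly three out of four such pairs the model ranks the patient who went on to develop a clot above the one who did not.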

Peeking inside the black box

A common worry with machine learning is that it can feel like a black box. To address this, the researchers used an interpretability method called SHAP to estimate how much each factor pushed a prediction toward higher or lower risk. The most influential features included white blood cell count, platelet count, prior heart attack, hemoglobin, kidney function, time the PICC had been in place, age, mild liver disease, a blood clotting measure called prothrombin time, and diabetes without severe complications. Together, these paint a picture of clot risk rising when inflammation is high, blood and clotting are out of balance, and the line remains in the vein longer. A decision curve analysis, which weighs the benefit of treating patients who would go on to develop a clot against the harm of treating patients who would not, further suggested that using the model to guide preventive treatment would do more good than simply treating everyone or no one.
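The idea behind decision curve analysis can be sketched in a few lines of Python. The standard quantity is net benefit: the share of patients correctly treated, minus the share treated unnecessarily, with the latter weighted by how much risk a clinician is willing to accept before acting (the threshold probability). The function names and the ten toy patients below are hypothetical illustrations and are not taken from the study.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of treating only patients whose predicted risk >= threshold."""
    n = len(labels)
    true_pos = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Net benefit of giving preventive treatment to every patient."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Treating no one has a net benefit of zero by definition.
NET_BENEFIT_TREAT_NONE = 0.0

# Hypothetical toy cohort: 10 patients, 2 of whom develop a clot.
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
probs = [0.60, 0.30, 0.40, 0.10, 0.05, 0.10, 0.20, 0.05, 0.10, 0.15]
threshold = 0.25  # assumed: treat when predicted risk is at least 25%

print("model:     ", round(net_benefit(labels, probs, threshold), 3))
print("treat all: ", round(net_benefit_treat_all(labels, threshold), 3))
print("treat none:", NET_BENEFIT_TREAT_NONE)
```

In this toy cohort the model-guided strategy has a positive net benefit at the chosen threshold while treating everyone has a negative one. A full decision curve repeats the comparison across a range of thresholds.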

Figure 2.

What this could mean at the bedside

For everyday care, this work suggests that doctors could use routinely available information to estimate a sepsis patient’s chance of developing a PICC-related clot before trouble starts. Those flagged as high risk might receive closer monitoring, shorter catheter durations, special catheter types, or early preventive blood thinners, while low-risk patients might avoid unnecessary treatment. The study has limits: it comes from a single hospital system and analyzes past records rather than testing the model in a live clinical trial. Even so, it offers a concrete step toward more personalized and safer use of life-saving intravenous lines in some of the sickest patients.

Citation: Hao, W., She, Ty., Yuan, Zn. et al. A predictive model for PICC-related thrombosis in sepsis patients using XGBoost algorithm. Sci Rep 16, 14378 (2026). https://doi.org/10.1038/s41598-026-44999-z

Keywords: sepsis, PICC line, blood clot, machine learning, intensive care