Predicting future kidney function in type 2 diabetes mellitus using machine learning and baseline health information
Why this matters for people with diabetes
For many people living with type 2 diabetes, a major long‑term worry is whether their kidneys will gradually fail, eventually requiring dialysis. Doctors can measure current kidney health, but it is much harder to know years in advance whose kidneys are most at risk. This study explores whether computer‑based pattern recognition, known as machine learning, can use a single routine checkup to forecast how well a person’s kidneys will work many years into the future.

A common but silent danger
Diabetic kidney disease affects about four in ten people with type 2 diabetes and is a leading cause of chronic kidney failure worldwide. Kidney function is often tracked using an index called estimated glomerular filtration rate, or eGFR, which reflects how efficiently the kidneys filter the blood. Some patients lose this function very quickly, while others decline slowly or remain stable for many years. Because the early stages usually have no symptoms, doctors need better ways to spot those “rapid decliners” early, when extra monitoring and treatment can still slow or prevent serious damage.
Turning checkup data into a crystal ball
The researchers followed 974 adults with type 2 diabetes treated at three hospitals in Japan for a median of just over five years. At the start, they collected 54 pieces of information that are routinely measured at diabetes visits, including age, blood pressure, weight, blood tests, urine tests, and current kidney function. Importantly, they did not rely on future test results or specialized genetic or protein markers—only what would typically be available at an initial clinic visit. They then asked a set of computer models to learn the relationship between these starting measurements and the eGFR values observed at yearly follow‑up visits, up to nine years later.
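One way to picture this setup is as a table in which each patient contributes one row per follow-up year: the baseline measurements are repeated, the number of years ahead is added as an extra input, and the eGFR observed at that visit is the value to predict. The sketch below is only an illustration of that idea under assumed names (Patient, build_training_rows, egfr_baseline, and years_ahead are hypothetical, and the numbers are made up); it is not the authors' code.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    baseline: dict        # measurements from the first visit (hypothetical names)
    followup_egfr: dict   # years after baseline -> eGFR observed at that visit


def build_training_rows(patients):
    """Return (features, target) pairs, one per patient per follow-up year."""
    rows = []
    for patient in patients:
        for years_ahead, egfr in sorted(patient.followup_egfr.items()):
            features = dict(patient.baseline)         # copy, keep baseline untouched
            features["years_ahead"] = float(years_ahead)
            rows.append((features, egfr))
    return rows


# Invented example values, for illustration only.
patients = [
    Patient({"age": 62.0, "egfr_baseline": 78.0}, {1: 76.0, 2: 73.5}),
    Patient({"age": 55.0, "egfr_baseline": 91.0}, {1: 90.0}),
]
rows = build_training_rows(patients)
for features, target in rows:
    print(features, "->", target)
```

Arranged this way, a single model can be asked about any horizon simply by changing the years_ahead input, without ever seeing later test results as inputs.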
How the smart models performed
The team compared three modern machine learning methods—Light Gradient Boosting Machine, Random Forest, and Support Vector Machine—with a traditional statistical approach known as multiple linear regression. All approaches did reasonably well overall, but the support vector machine stood out. It provided the most accurate predictions across the full range of kidney function, especially for people whose kidneys were still working very well at the start. Using only the initial checkup data plus the number of years into the future, this model could maintain moderate accuracy for up to about six years, whereas the traditional method’s predictions became unreliable beyond four years.
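To see how accuracy can fade as forecasts reach further ahead, predictions can be scored separately for each horizon. The sketch below uses the coefficient of determination (R squared) as the score; that choice is an assumption made for illustration, since this summary does not say which accuracy measure the study used, and the numbers are invented.

```python
from collections import defaultdict


def r_squared(observed, predicted):
    """1 minus (residual sum of squares / total sum of squares).

    Assumes the observed values are not all identical.
    """
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def r_squared_by_horizon(records):
    """records: iterable of (years_ahead, observed_egfr, predicted_egfr)."""
    grouped = defaultdict(lambda: ([], []))
    for years_ahead, observed, predicted in records:
        grouped[years_ahead][0].append(observed)
        grouped[years_ahead][1].append(predicted)
    return {h: r_squared(obs, pred) for h, (obs, pred) in sorted(grouped.items())}


# Invented values: close predictions at year 1, looser ones at year 6.
records = [
    (1, 80.0, 78.0), (1, 60.0, 62.0), (1, 40.0, 41.0),
    (6, 70.0, 60.0), (6, 50.0, 55.0), (6, 30.0, 45.0),
]
scores = r_squared_by_horizon(records)
print(scores)
```

Scoring each horizon on its own makes it possible to say at which point a model's forecasts stop being useful, rather than reporting one blended number.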

What drives the predictions
To peek inside these “black box” models, the researchers used an explanation tool that ranks which baseline features mattered most. As expected, current kidney measures—eGFR itself and blood creatinine—were influential, along with age and a urine marker of kidney damage. Blood fats, markers of anemia, and certain liver‑related enzymes also played important roles. In different subgroups, such as older adults or women, slightly different patterns emerged, hinting that some risk factors may carry more weight for certain patients. Even when the model was simplified to include only the most informative factors, its performance changed little, which is encouraging for eventual use in real clinics.
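This summary does not name the explanation tool, so the sketch below swaps in a simpler and widely used idea, permutation importance: shuffle one input at a time and measure how much worse the predictions become. Everything in it is an assumption for illustration, including the toy model, the feature names, and the values; it is not the method or the data from the study.

```python
import random


def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, rows, targets, feature_names,
                           n_repeats=5, seed=0):
    """Average rise in error when one feature's values are shuffled across rows."""
    rng = random.Random(seed)
    baseline_error = mean_squared_error(targets, [predict(r) for r in rows])
    importance = {}
    for name in feature_names:
        total_increase = 0.0
        for _ in range(n_repeats):
            shuffled = [r[name] for r in rows]
            rng.shuffle(shuffled)
            # Replace only this feature; all other values stay with their row.
            permuted = [dict(r, **{name: v}) for r, v in zip(rows, shuffled)]
            error = mean_squared_error(targets, [predict(r) for r in permuted])
            total_increase += error - baseline_error
        importance[name] = total_increase / n_repeats
    return importance


def toy_model(row):
    # Hypothetical stand-in for a trained model; it ignores "unused_marker".
    return row["egfr_baseline"] - 0.2 * (row["age"] - 50.0)


rows = [
    {"egfr_baseline": 95.0, "age": 48.0, "unused_marker": 1.0},
    {"egfr_baseline": 82.0, "age": 55.0, "unused_marker": 4.0},
    {"egfr_baseline": 70.0, "age": 61.0, "unused_marker": 2.0},
    {"egfr_baseline": 64.0, "age": 67.0, "unused_marker": 6.0},
    {"egfr_baseline": 51.0, "age": 72.0, "unused_marker": 3.0},
    {"egfr_baseline": 43.0, "age": 79.0, "unused_marker": 5.0},
]
targets = [toy_model(r) for r in rows]  # toy targets the model fits exactly

importance = permutation_importance(
    toy_model, rows, targets, ["egfr_baseline", "age", "unused_marker"]
)
ranking = sorted(importance, key=importance.get, reverse=True)
print(ranking)
```

A feature the model never uses scores zero here, which mirrors the observation that dropping the least informative inputs barely changes performance.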
From prediction to practical action
The work has limits: the study involved a relatively small number of Japanese outpatients, it has not yet been tested in other countries or healthcare systems, and it assumes that many aspects of a patient’s life and treatment remain similar over time. Still, the findings suggest that it is feasible to turn a single, ordinary diabetes visit into a personalized forecast of kidney health years ahead. In the future, such a tool could be built into electronic health records to highlight patients whose projected kidney function drops sharply, prompting closer follow‑up, earlier referrals to kidney specialists, and more timely use of protective therapies. In short, smarter use of existing data could give patients and clinicians a valuable head start in protecting kidney health.
Citation: Unoki-Kubota, H., Nakajima, K., Shimizu, Y. et al. Predicting future kidney function in type 2 diabetes mellitus using machine learning and baseline health information. Sci Rep 16, 10890 (2026). https://doi.org/10.1038/s41598-026-45500-6
Keywords: diabetic kidney disease, type 2 diabetes, kidney function decline, machine learning prediction, eGFR