Clear Sky Science
A length-biased Sujit distribution framework: properties, simulation-based inference, and application to clinical remission data
Why measuring time to remission is tricky
When doctors track how long cancer patients stay in remission, the data are not as simple as they look. Some patients are observed for many years, others for only a short spell, and longer remissions tend to be noticed more often. This study develops a new mathematical tool designed to handle such uneven observations and give a clearer picture of how remission times are distributed in a patient population.

A new way to weigh survival times
The authors build on an existing simple probability model, called the Sujit distribution, and modify it to account for the fact that longer-lasting cases are more likely to appear in real-world records. This adjustment, known as length bias, effectively gives greater weight to longer durations when describing the overall pattern of survival or remission times. The resulting model, called the Length-Biased Sujit (LBSJT) distribution, keeps the convenience of having just one key parameter while gaining the flexibility to match a wider variety of real datasets.
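As a sketch of the idea, the generic length-biased construction can be written as follows. The symbols are assumptions for illustration: f stands for the base density (here it would be the Sujit density, which this summary does not reproduce), X for a random duration drawn from it, and f_LB for the length-biased version. The formula is the textbook definition of length bias, not necessarily the exact notation used by the authors.

```latex
% Length-biased version of a base density f on x > 0,
% assuming the base mean E[X] is finite.
f_{LB}(x) = \frac{x \, f(x)}{\mathbb{E}[X]},
\qquad
\mathbb{E}[X] = \int_0^\infty u \, f(u) \, du .
```

Multiplying by x is what gives longer durations more weight, and dividing by the mean keeps the total probability equal to one.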
Capturing how risk grows over time
A central question in survival analysis is how the risk of failure or relapse changes as time passes. Using the new LBSJT model, the researchers derive formulas for core quantities such as the chance of surviving beyond a given time, the instantaneous risk of failure, and related measures that describe aging and wear-out. They show that, depending on the value of its single parameter, the model can represent situations where risk steadily increases and then settles to a stable level. This pattern fits many practical scenarios, such as medical conditions where relapse becomes more likely up to a point and then stops accelerating.
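For orientation, the standard definitions behind these quantities are sketched below. The notation is an assumption for illustration: T is the time to event, with density f and cumulative distribution function F. The mean residual life m(t) is included as one common example of an aging measure; this summary does not say which related measures the paper derives, and the closed-form LBSJT expressions are not reproduced here.

```latex
% Survival function: chance of lasting beyond time t
S(t) = 1 - F(t) = \int_t^\infty f(u) \, du

% Hazard function: instantaneous risk at t, given survival up to t
h(t) = \frac{f(t)}{S(t)}

% Mean residual life: expected remaining time, given survival up to t
m(t) = \mathbb{E}[\, T - t \mid T > t \,] = \frac{1}{S(t)} \int_t^\infty S(u) \, du
```

A hazard that rises and then levels off corresponds to h(t) increasing toward a finite limit as t grows.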

Putting the model through its paces
To check how well their approach behaves in practice, the team conducts large computer experiments. They generate many artificial datasets from the LBSJT distribution and then try to recover the underlying parameter using standard maximum likelihood methods. Across a wide range of sample sizes and parameter settings, the estimated values become more accurate and less variable as the number of observations increases. The uncertainty ranges around the estimates also shrink in a predictable way. These results indicate that the proposed method is statistically reliable, especially when moderate to large datasets are available.
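The logic of such a simulation study can be sketched in a few lines of Python. Because this summary does not give the LBSJT density, the sketch swaps in a stand-in model: the length-biased exponential distribution, which is a gamma distribution with shape 2 and has the closed-form maximum likelihood estimate 2 divided by the sample mean. All function names, the parameter value 1.5, and the sample sizes are hypothetical choices for illustration, not the authors' settings.

```python
import random
import statistics


def sample_length_biased_exponential(theta, n, rng):
    """Draw n values from the stand-in model, Gamma(shape=2, rate=theta)."""
    # random.gammavariate takes (shape, scale), so scale = 1 / theta.
    return [rng.gammavariate(2.0, 1.0 / theta) for _ in range(n)]


def mle_theta(sample):
    """Closed-form MLE for the stand-in model: 2 / sample mean."""
    return 2.0 / statistics.fmean(sample)


def simulate(theta, n, replications, seed=0):
    """Return (bias, mse) of the MLE over repeated synthetic samples."""
    rng = random.Random(seed)
    estimates = [
        mle_theta(sample_length_biased_exponential(theta, n, rng))
        for _ in range(replications)
    ]
    bias = statistics.fmean(estimates) - theta
    mse = statistics.fmean([(e - theta) ** 2 for e in estimates])
    return bias, mse


if __name__ == "__main__":
    # Hypothetical settings: true theta = 1.5, four sample sizes.
    for n in (20, 50, 100, 500):
        bias, mse = simulate(theta=1.5, n=n, replications=1000)
        print(f"n={n:4d}  bias={bias:+.4f}  mse={mse:.5f}")
```

If the estimator behaves well, the printed bias and mean squared error should both shrink as n grows, which is the pattern the study reports for the LBSJT model.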
Testing on real remission data
The researchers then apply the LBSJT model to two real datasets from leukemia patients. One records overall survival times for 40 patients, and the other records how long 20 patients remain in remission after treatment with a single drug. In both cases, the data show clear asymmetry and irregular tails that are difficult for many familiar models to capture. By comparing a range of competing distributions using several goodness-of-fit measures, the authors find that LBSJT consistently provides one of the best matches to the observed patterns, especially in the tails where rare but important outcomes occur.
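A comparison of this kind can be sketched as follows. This summary does not name the goodness-of-fit measures used, so the sketch assumes two common ones, AIC and BIC, where lower values indicate a better trade-off between fit and complexity. The two candidate models (exponential and length-biased exponential) are stand-ins, the data are synthetic rather than the leukemia datasets, and all function names are hypothetical.

```python
import math
import random


def loglik_exponential(data, rate):
    """Log-likelihood under density rate * exp(-rate * x)."""
    return sum(math.log(rate) - rate * x for x in data)


def loglik_length_biased_exponential(data, rate):
    """Log-likelihood under density rate^2 * x * exp(-rate * x)."""
    return sum(2.0 * math.log(rate) + math.log(x) - rate * x for x in data)


def aic(loglik, k):
    """Akaike information criterion for k fitted parameters."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * loglik


def compare(data):
    """Fit both one-parameter stand-in models by MLE and score them."""
    n = len(data)
    mean = sum(data) / n
    fits = {
        # MLE of the exponential rate is 1 / mean.
        "exponential": loglik_exponential(data, 1.0 / mean),
        # MLE of the length-biased exponential rate is 2 / mean.
        "length-biased exponential": loglik_length_biased_exponential(data, 2.0 / mean),
    }
    return {name: {"aic": aic(ll, 1), "bic": bic(ll, 1, n)} for name, ll in fits.items()}


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic durations only; not the patient data from the study.
    synthetic = [rng.gammavariate(2.0, 10.0) for _ in range(200)]
    for name, scores in compare(synthetic).items():
        print(f"{name:28s} AIC={scores['aic']:.1f}  BIC={scores['bic']:.1f}")
```

The same pattern extends to more candidates: fit each model, record its maximized log-likelihood and parameter count, and rank the models by the chosen criteria.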
What this means for medical and reliability studies
For readers, the main takeaway is that the way we summarize time-to-event data strongly influences the stories we tell about patient outcomes and system reliability. The LBSJT model offers a compact yet flexible way to account for the natural tendency to observe longer durations more often, while still remaining simple enough for routine use. In the remission datasets studied, it describes the spread and skew of times better than several standard alternatives, suggesting it can help clinicians and engineers obtain more faithful summaries of how long systems and patients last under real conditions.
Citation: Sindhu, T.N., Shafiq, A., Khatib, Y.E. et al. A length-biased Sujit distribution framework: properties, simulation-based inference, and application to clinical remission data. Sci Rep 16, 14857 (2026). https://doi.org/10.1038/s41598-026-42402-5
Keywords: survival analysis, length-biased distribution, remission time, lifetime modeling, statistical simulation