Competing and Noncompeting Risk Models for Predicting Kidney Allograft Failure

Key Points

Prediction models are becoming increasingly relevant in precision medicine. These models should be highly performant and not negatively affected by competing risk events. We thus aimed to carefully assess the effect of competing risks in allograft failure prediction.

Background

Prognostic models are becoming increasingly relevant in clinical trials as potential surrogate end points and for patient management as clinical decision support tools. However, the effect of competing risks on model performance remains poorly investigated. We aimed to carefully assess the performance of competing risk and noncompeting risk models in the context of kidney transplantation, where allograft failure and death with a functioning graft are two competing outcomes.

Methods

We included 11,046 kidney transplant recipients enrolled in ten countries. We developed models to predict long-term kidney graft failure both without accounting for the competing risk of death with a functioning graft (i.e., censoring such deaths) and with accounting for it, using Cox, Fine–Gray, and cause-specific Cox regression models. To this end, we followed a detailed and transparent analytical framework for competing and noncompeting risk modeling and carefully assessed the models' development, stability, discrimination, calibration, overall fit, clinical utility, and generalizability in external validation cohorts and subpopulations. More than 15 metrics were used to provide an exhaustive assessment of model performance.

Results

Among 11,046 recipients in the derivation and validation cohorts, 1497 (14%) lost their graft and 1003 (9%) died with a functioning graft after a median follow-up postrisk evaluation of 4.7 years (interquartile range, 2.7–7.0). The Kaplan–Meier and Aalen–Johansen methods gave similar estimates of the cumulative incidence of graft loss (17% versus 16% in the derivation cohort). Cox and competing risk models showed similar and stable risk estimates for predicting long-term graft failure (average mean absolute prediction error of 0.0140, 0.0138, and 0.0135 for Cox, Fine–Gray, and cause-specific Cox models, respectively). Discrimination and overall fit were comparable in the validation cohorts, with concordance indices ranging from 0.76 to 0.87. Across various subpopulations and clinical scenarios, the models performed well and similarly, although in some high-risk groups (such as donors older than 65 years), the findings suggest a trend toward moderately improved calibration when using a competing risk approach.
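
To illustrate how the two cumulative incidence estimators compared above differ, the following is a minimal sketch, not the study's analysis code. It computes the Kaplan–Meier complement (competing deaths treated as censored) and the Aalen–Johansen estimate for one cause of failure. The function name, the event coding, and the six-patient toy data are hypothetical and chosen only for illustration.

```python
def cumulative_incidence(times, events, cause=1):
    """Return (1 - Kaplan-Meier, Aalen-Johansen) at the last event time.

    Assumed coding (hypothetical): 0 = censored, 1 = graft failure,
    2 = death with a functioning graft.
    """
    # Distinct times at which any event (of either cause) occurs.
    event_times = sorted({t for t, e in zip(times, events) if e != 0})

    overall_surv = 1.0  # all-cause event-free survival just before t
    km_surv = 1.0       # KM survival with competing events censored
    aj_cif = 0.0        # Aalen-Johansen cumulative incidence for `cause`

    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        d_cause = sum(1 for u, e in zip(times, events) if u == t and e == cause)
        d_any = sum(1 for u, e in zip(times, events) if u == t and e != 0)

        # Aalen-Johansen: cause-specific hazard weighted by the
        # probability of being free of ALL events just before t.
        aj_cif += overall_surv * d_cause / at_risk
        overall_surv *= 1.0 - d_any / at_risk

        # Kaplan-Meier: competing events only shrink the risk set.
        km_surv *= 1.0 - d_cause / at_risk

    return 1.0 - km_surv, aj_cif


# Toy follow-up times (years) and event codes; not study data.
toy_times = [1, 2, 3, 4, 5, 6]
toy_events = [1, 2, 1, 0, 2, 1]
km_estimate, aj_estimate = cumulative_incidence(toy_times, toy_events)
print(f"1 - KM: {km_estimate:.3f}, Aalen-Johansen: {aj_estimate:.3f}")
```

In this sketch the Kaplan–Meier complement can never fall below the Aalen–Johansen estimate, and the two coincide when no competing events occur. The gap widens as the competing event becomes more frequent, which is why the toy data, where a third of patients die with a functioning graft, exaggerate a difference that was small in the derivation cohort.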

Conclusions

Competing and noncompeting risk models performed similarly in predicting long-term kidney graft failure.