Index of Prediction Accuracy (IPA)
1 Introduction
This vignette demonstrates how our software calculates the index of prediction accuracy1, 2. We distinguish three settings:
- uncensored binary outcome
- right censored survival outcome (no competing risks)
- right censored time to event outcome with competing risks
The Brier score is a loss type metric of prediction performance where lower values correspond to better prediction performance. The IPA formula for a model is very much the same as the formula for R2 in a standard linear regression model:
\begin{equation*} \operatorname{IPA} = 1-\frac{\text{BrierScore(Prediction model)}}{\text{BrierScore(Null model)}} \end{equation*}2 Package version
data.table: [1] ‘1.14.2’ survival: [1] ‘3.2.13’ riskRegression: [1] ‘2022.3.8’ Publish: [1] ‘2020.12.23’
3 Data
For the purpose of illustrating our software we simulate data alike the data of an active surveillance prostate cancer study 3. Specifically, we generate a learning set (n=278) and a validation set (n=208). In both data sets we define a binary outcome variable for the progression status after one year. Note that smallest censored event time is larger than 1 year, and hence the event status after one year is uncensored.
set.seed(18) astrain <- simActiveSurveillance(278) astest <- simActiveSurveillance(208) astrain[,Y1:=1*(event==1 & time<=1)] astest[,Y1:=1*(event==1 & time<=1)]
4 IPA for a binary outcome
To illustrate the binary outome setting we analyse the 1-year
progression status. We have complete 1-year followup, i.e., no dropout
or otherwise censored data before 1 year. We fit two logistic
regression models, one including and one excluding the biomarker
erg.status
:
lrfit.ex <- glm(Y1~age+lpsaden+ppb5+lmax+ct1+diaggs,data=astrain,family="binomial") lrfit.inc <- glm(Y1~age+lpsaden+ppb5+lmax+ct1+diaggs+erg.status,data=astrain,family="binomial") publish(lrfit.inc,org=TRUE)
Variable | Units | OddsRatio | CI.95 | p-value |
---|---|---|---|---|
age | 0.98 | [0.90;1.06] | 0.6459 | |
lpsaden | 0.95 | [0.66;1.36] | 0.7747 | |
ppb5 | 1.09 | [0.92;1.28] | 0.3224 | |
lmax | 1.08 | [0.83;1.41] | 0.5566 | |
ct1 | cT1 | Ref | ||
cT2 | 1.00 | [0.29;3.41] | 0.9994 | |
diaggs | GNA | Ref | ||
3/3 | 0.60 | [0.27;1.34] | 0.2091 | |
3/4 | 0.25 | [0.05;1.30] | 0.1006 | |
erg.status | neg | Ref | ||
pos | 3.66 | [1.90;7.02] | <0.0001 |
Based on these models we predict the risk of progression within one year in the validation set.
astest[,risk.ex:=100*predictRisk(lrfit.ex,newdata=astest)]
astest[,risk.inc:=100*predictRisk(lrfit.inc,newdata=astest)]
publish(head(astest[,-c(8,9)]),digits=1,org=TRUE)
age | lpsaden | ppb5 | lmax | ct1 | diaggs | erg.status | Y1 | risk.ex | risk.inc |
---|---|---|---|---|---|---|---|---|---|
62.6 | -3.2 | 4.9 | 4.6 | cT1 | 3/3 | pos | 0.0 | 23.2 | 36.3 |
66.9 | -1.7 | 0.7 | 4.1 | cT1 | 3/3 | pos | 1.0 | 14.0 | 24.7 |
65.4 | -1.5 | 4.0 | 3.9 | cT1 | 3/3 | neg | 0.0 | 17.4 | 10.6 |
59.0 | -2.8 | 6.8 | 3.3 | cT2 | 3/4 | pos | 1.0 | 10.7 | 21.1 |
55.6 | -3.5 | 2.8 | 3.0 | cT1 | 3/3 | neg | 0.0 | 21.9 | 11.8 |
71.1 | -2.6 | 3.3 | 3.7 | cT1 | 3/3 | neg | 0.0 | 15.0 | 9.5 |
To calculate the Index of Prediction Accuracy (IPA) we call the
Score
function as follows on a list which includes the two logistic
regression models.
X1 <- Score(list("Exclusive ERG"=lrfit.ex,"Inclusive ERG"=lrfit.inc),data=astest, formula=Y1~1,summary="ipa",se.fit=0L,metrics="brier",contrasts=FALSE) X1
Metric Brier: Results by model: model Brier IPA 1: Null model 15.2 0.0 2: Exclusive ERG 14.8 2.7 3: Inclusive ERG 14.1 7.3 NOTE: Values are multiplied by 100 and given in %. NOTE: The lower Brier the better, the higher IPA the better.
Both logistic regression models have a lower Brier score than the
Null model
which ignores all predictor variables. Hence, both models
have a positive IPA. The logistic regression model which excludes the
ERG biomarker scores IPA=2.68% and the logistic regression model which
includes the ERG biomarer scores IPA = 7.29%. The difference in IPA
between the two models is 4.62%. This means that when we omit
erg.status
from the model, then we loose 4.62% in IPA compared to
the full model. It is sometimes interesting to compare the predictor
variables according to how much they contribute to the prediction
performance. Generally, this is a non-trivial task which depends on
the order in which the variables are entered into the model, the
functional form and also on the type of model. However, we can drop
one variable at a time from the full model and for each variable
compute the loss in IPA as the difference between IPA of the full
model and IPA of the model where the variable is omitted.
IPA(lrfit.inc,newdata=astest)
Variable Brier IPA IPA.drop 1: Null model 15.2 0.0 7.3 2: Full model 14.1 7.3 0.0 3: age 14.1 7.4 -0.1 4: lpsaden 14.1 7.6 -0.3 5: ppb5 14.2 6.9 0.4 6: lmax 14.1 7.2 0.1 7: ct1 14.1 7.3 -0.0 8: diaggs 14.6 4.4 2.9 9: erg.status 14.8 2.7 4.6 NOTE: Values are multiplied by 100 and given in %. NOTE: The higher IPA the better. NOTE: IPA.drop = IPA(Full model) - IPA. The higher the drop the more important is the variable for the full model.
5 IPA for right censored survival outcome
To illustrate the survival outome setting we analyse the 3-year
progression-free survival probability. So, that the combined endpoint
is progression or death. We fit two Cox regression models, one
including and one excluding the biomarker erg.status
:
coxfit.ex <- coxph(Surv(time,event!=0)~age+lpsaden+ppb5+lmax+ct1+diaggs,data=astrain,x=TRUE) coxfit.inc <- coxph(Surv(time,event!=0)~age+lpsaden+ppb5+lmax+ct1+diaggs+erg.status,data=astrain,x=TRUE) publish(coxfit.inc,org=TRUE)
Variable | Units | HazardRatio | CI.95 | p-value |
---|---|---|---|---|
age | 1.03 | [0.99;1.07] | 0.124 | |
lpsaden | 1.10 | [0.94;1.29] | 0.230 | |
ppb5 | 1.21 | [1.12;1.30] | <0.001 | |
lmax | 1.06 | [0.94;1.19] | 0.359 | |
ct1 | cT1 | Ref | ||
cT2 | 0.97 | [0.57;1.66] | 0.916 | |
diaggs | GNA | Ref | ||
3/3 | 0.53 | [0.37;0.76] | <0.001 | |
3/4 | 0.32 | [0.18;0.58] | <0.001 | |
erg.status | neg | Ref | ||
pos | 1.80 | [1.35;2.38] | <0.001 |
Based on these models we predict the risk of progression or death within 3 years in the validation set.
astest[,risk.ex:=100*predictRisk(coxfit.ex,newdata=astest,times=3)]
astest[,risk.inc:=100*predictRisk(coxfit.inc,newdata=astest,times=3)]
publish(head(astest[,-c(8,9)]),digits=1,org=TRUE)
age | lpsaden | ppb5 | lmax | ct1 | diaggs | erg.status | Y1 | risk.ex | risk.inc |
---|---|---|---|---|---|---|---|---|---|
62.6 | -3.2 | 4.9 | 4.6 | cT1 | 3/3 | pos | 0.0 | 67.5 | 80.7 |
66.9 | -1.7 | 0.7 | 4.1 | cT1 | 3/3 | pos | 1.0 | 48.5 | 60.3 |
65.4 | -1.5 | 4.0 | 3.9 | cT1 | 3/3 | neg | 0.0 | 67.4 | 60.8 |
59.0 | -2.8 | 6.8 | 3.3 | cT2 | 3/4 | pos | 1.0 | 51.1 | 70.1 |
55.6 | -3.5 | 2.8 | 3.0 | cT1 | 3/3 | neg | 0.0 | 41.5 | 35.5 |
71.1 | -2.6 | 3.3 | 3.7 | cT1 | 3/3 | neg | 0.0 | 65.5 | 57.5 |
To calculate the Index of Prediction Accuracy (IPA) we call the
Score
function as follows on a list which includes the two Cox
regression models.
X2 <- Score(list("Exclusive ERG"=coxfit.ex,"Inclusive ERG"=coxfit.inc),data=astest, formula=Surv(time,event!=0)~1,summary="ipa",se.fit=0L,metrics="brier",contrasts=FALSE,times=3) X2
Metric Brier: Results by model: model times Brier IPA 1: Null model 3 24.0 0.0 2: Exclusive ERG 3 22.4 6.4 3: Inclusive ERG 3 19.9 17.1 NOTE: Values are multiplied by 100 and given in %. NOTE: The lower Brier the better, the higher IPA the better.
It is sometimes interesting to compare the predictor variables according to how much they contribute to the prediction performance. Generally, this is a non-trivial task which depends on the order in which the variables are entered into the model, the functional form and also on the type of model. However, we can drop one variable at a time from the full model and for each variable compute the loss in IPA as the difference between IPA of the full model and IPA of the model where the variable is omitted.
IPA(coxfit.inc,newdata=astest,times=3)
Variable times Brier IPA IPA.drop 1: Null model 3 24.0 0.0 17.1 2: Full model 3 19.9 17.1 0.0 3: age 3 19.7 17.6 -0.6 4: lpsaden 3 20.1 16.2 0.8 5: ppb5 3 21.3 11.2 5.9 6: lmax 3 19.9 16.7 0.4 7: ct1 3 19.9 17.0 0.1 8: diaggs 3 20.8 13.0 4.1 9: erg.status 3 22.4 6.4 10.7 NOTE: Values are multiplied by 100 and given in %. NOTE: The higher IPA the better. NOTE: IPA.drop = IPA(Full model) - IPA. The higher the drop the more important is the variable for the full model. Warning message: In `[.data.table`(r2, , `:=`(IPA.drop, IPA[Variable == "Full model"] - : Invalid .internal.selfref detected and fixed by taking a (shallow) copy of the data.table so that := can add this new column by reference. At an earlier point, this data.table has been copied by R (or was created manually using structure() or similar). Avoid names<- and attr<- which in R currently (and oddly) may copy the whole data.table. Use set* syntax instead to avoid copying: ?set, ?setnames and ?setattr. If this message doesn't help, please report your use case to the data.table issue tracker so the root cause can be fixed or this message improved.
6 IPA for right censored time to event outcome with competing risks
To illustrate the competing risk setting we analyse the 3-year risk of
progression in presence of the competing risk of death without
progression. We fit two sets of cause-specific Cox regression models 4,
one including and one excluding the biomarker erg.status
:
cscfit.ex <- CSC(Hist(time,event)~age+lpsaden+ppb5+lmax+ct1+diaggs,data=astrain) cscfit.inc <- CSC(Hist(time,event)~age+lpsaden+ppb5+lmax+ct1+diaggs+erg.status,data=astrain) publish(cscfit.inc)
Variable Units 1 2 age 1.04 [1.00;1.09] 1.01 [0.95;1.07] lpsaden 1.13 [0.92;1.38] 1.09 [0.83;1.42] ppb5 1.14 [1.04;1.24] 1.39 [1.22;1.58] lmax 1.19 [1.03;1.39] 0.82 [0.67;1.00] ct1 cT1 Ref Ref cT2 1.31 [0.73;2.36] 0.31 [0.07;1.28] diaggs GNA Ref Ref 3/3 0.54 [0.35;0.84] 0.56 [0.29;1.10] 3/4 0.44 [0.22;0.88] 0.19 [0.06;0.60] erg.status neg Ref Ref pos 2.20 [1.56;3.11] 1.20 [0.71;2.04]
Based on these models we predict the risk of progression in presence of the competing risk of death within 3 years in the validation set.
astest[,risk.ex:=100*predictRisk(cscfit.ex,newdata=astest,times=3,cause=1)]
astest[,risk.inc:=100*predictRisk(cscfit.inc,newdata=astest,times=3,cause=1)]
publish(head(astest[,-c(8,9)]),digits=1,org=TRUE)
age | lpsaden | ppb5 | lmax | ct1 | diaggs | erg.status | Y1 | risk.ex | risk.inc |
---|---|---|---|---|---|---|---|---|---|
62.6 | -3.2 | 4.9 | 4.6 | cT1 | 3/3 | pos | 0.0 | 49.7 | 65.5 |
66.9 | -1.7 | 0.7 | 4.1 | cT1 | 3/3 | pos | 1.0 | 45.2 | 60.1 |
65.4 | -1.5 | 4.0 | 3.9 | cT1 | 3/3 | neg | 0.0 | 50.6 | 42.3 |
59.0 | -2.8 | 6.8 | 3.3 | cT2 | 3/4 | pos | 1.0 | 46.0 | 69.0 |
55.6 | -3.5 | 2.8 | 3.0 | cT1 | 3/3 | neg | 0.0 | 26.3 | 19.9 |
71.1 | -2.6 | 3.3 | 3.7 | cT1 | 3/3 | neg | 0.0 | 51.8 | 42.2 |
To calculate the Index of Prediction Accuracy (IPA) we call the
Score
function as follows on a list which includes the two sets of
cause-specific Cox regression models.
X3 <- Score(list("Exclusive ERG"=cscfit.ex, "Inclusive ERG"=cscfit.inc), data=astest, formula=Hist(time,event)~1, summary="ipa",se.fit=0L,metrics="brier", contrasts=FALSE,times=3,cause=1) X3
Metric Brier: Results by model: model times Brier IPA 1: Null model 3 24.5 0.0 2: Exclusive ERG 3 23.2 5.0 3: Inclusive ERG 3 20.2 17.5 NOTE: Values are multiplied by 100 and given in %. NOTE: The lower Brier the better, the higher IPA the better.
It is sometimes interesting to compare the predictor variables according to how much they contribute to the prediction performance. Generally, this is a non-trivial task which depends on the order in which the variables are entered into the model, the functional form and also on the type of model. However, we can drop one variable at a time from the full model (here from both cause-specific Cox regression models) and for each variable compute the loss in IPA as the difference between IPA of the full model and IPA of the model where the variable is omitted.
IPA(cscfit.inc,newdata=astest,times=3)
Variable times Brier IPA IPA.drop 1: Null model 3 24.5 0.0 17.5 2: Full model 3 20.2 17.5 0.0 3: age 3 20.1 18.0 -0.5 4: lpsaden 3 20.4 16.8 0.8 5: ppb5 3 20.4 16.5 1.1 6: lmax 3 21.4 12.6 4.9 7: ct1 3 19.8 18.9 -1.4 8: diaggs 3 20.8 14.8 2.8 9: erg.status 3 23.2 5.0 12.5 NOTE: Values are multiplied by 100 and given in %. NOTE: The higher IPA the better. NOTE: IPA.drop = IPA(Full model) - IPA. The higher the drop the more important is the variable for the full model.
Footnotes:
Michael W Kattan and Thomas A Gerds. The index of prediction accuracy: An intuitive measure useful for evaluating risk prediction models. Diagnostic and Prognostic Research, 2(1):7, 2018.
Thomas A Gerds and Michael W Kattan. Medical Risk Prediction: With Ties to Machine Learning. Chapman and Hall/CRC, 2021.
Berg KD, Vainer B, Thomsen FB, Roeder MA, Gerds TA, Toft BG, Brasso K, and Iversen P. Erg protein expression in diagnostic specimens is associated with increased risk of progression during active surveillance for prostate cancer. European urology, 66(5):851–860, 2014.
Brice Ozenne, Anne Lyngholm S{\o }rensen, Thomas Scheike, Christian Torp-Pedersen, and Thomas Alexander Gerds. riskregression: Predicting the risk of an event using Cox regression models. R Journal, 9(2):440–460, 2017.