The package morse
is devoted to the analysis of data
from standard toxicity tests. It provides a simple workflow to
explore/visualize a data set, and compute estimations of risk assessment
indicators. This document illustrates a typical use of
morse
on survival and reproduction data, which can be
followed step-by-step to analyze new data sets.
The steps for a TKTD data analysis are absolutely analogous to what we described for the analysis at target time. Here the goal is to estimate the relationship between chemical compound concentration, time and survival rate using the GUTS models. GUTS, for General Unified Threshold models of Survival, is a TKTD models generalising most of existing mechanistic models for survival description. For details about GUTS models, see the vignette Models in ‘morse’ package, and the included references.
Here is a typical session to analyse concentration-dependent time-course data using the so-called “Stochastic Death” (SD) model:
# (1) load data set
data(propiconazole)
# (2) check structure and integrity of the data set
survDataCheck(propiconazole)
## Correct formatting
## [1] msg
## <0 rows> (or 0-length row.names)
# (3) create a `survData` object
PRZcst <- survData(propiconazole)
# (4) represent the number of survivors as a function of time
plot(PRZcst)
To fit the Stochastic Death model, we have to specify the
model_type
as "SD"
:
A function allows to simulated within priors density function:
## kd hb z kk
## 1 0.42021033 0.000655978 14.64122 5.659141e-04
## 2 0.04699319 0.035627682 27.24372 1.453847e-02
## 3 0.14945752 0.003648068 20.95387 6.194312e-03
## 4 0.20575672 0.006783051 22.72157 6.160040e-04
## 5 0.03084036 1.032718904 26.27874 3.798713e-06
## 6 0.34192629 0.005371578 15.03421 2.093326e-03
Once fitting is done, we can compute posteriors vs. priors
distribution with the function plot()
on a
PriorPosterior
object as follow:
## kd hb z kk pp
## 1 0.003824626 0.011839870 9.76479 8.248152e-04 prior
## 2 0.008129851 0.001644949 10.35455 7.227431e-05 prior
## 3 0.045149663 0.011564429 11.01445 1.012687e-04 prior
## 4 0.011138893 0.015557551 15.66961 5.813275e-02 prior
## 5 0.003895477 0.036785674 29.12877 1.165233e-03 prior
## 6 0.004161302 0.192601430 11.07244 1.328534e-06 prior
Or having the plot:
The plot()
function provides a representation of the
fitting for each replicates
Original data can be removed by using the option
adddata = FALSE
A posterior predictive check is also possible using function
ppc()
:
Nsurv
We can extract the number of survival Nsurv
estimated by
the inference process.
To do so, there is two function, either
extract_Nsurv_ppc
or extract_Nsurv_sim
.
## qinf95 q50 qsup95 time replicate Nsurv
## 1 20 20 20 0 0 20
## 2 17 19 20 1 0 19
## 3 19 19 19 2 0 19
## 4 19 19 19 3 0 19
## 5 19 19 19 4 0 19
## 6 20 20 20 0 11.91 20
## qinf95 q50 qsup95 time replicate Nsurv
## 1 20 20 20 0 0 20
## 2 17 20 20 1 0 19
## 3 17 20 20 2 0 19
## 4 17 20 20 3 0 19
## 5 17 20 20 4 0 19
## 6 20 20 20 0 11.91 20
You can fix the background mortality, parameter hb
.
# fit the TKTD model SD with fixed hb value
fit_cstSDFIXhb <- fit(PRZcst, model_type = "SD", hb_value = 0.2, refresh = 0)
Using the priorPosterior
function, you can recover the
summary of each parameters:
pp_cstSDFIXhb = priorPosterior(fit_cstSDFIXhb)
head(pp_cstSDFIXhb[pp_cstSDFIXhb$pp == "posterior", ])
## kd hb z kk pp
## 1001 0.7109887 0.2 10.260949 0.1456833 posterior
## 1002 0.7988118 0.2 10.967499 0.1331460 posterior
## 1003 0.6856149 0.2 10.141955 0.1497547 posterior
## 1004 0.6318863 0.2 10.247165 0.1515833 posterior
## 1005 0.5343595 0.2 9.631093 0.1762897 posterior
## 1006 0.6531892 0.2 10.414790 0.1512549 posterior
The Individual Tolerance (IT) model is a variant of TKTD
survival analysis. It can also be used with morse
as
demonstrated hereafter. For the IT model, we have to specify
the model_type
as "IT"
:
And the plot of posteriors vs. priors distributions:
## qinf95 q50 qsup95 time replicate Nsurv color response
## 1 20 20 20 0 0 20 TRUE 20
## 2 18 20 20 1 0 19 TRUE 19
## 3 17 19 19 2 0 19 TRUE 19
## 4 17 19 19 3 0 19 TRUE 19
## 5 17 19 19 4 0 19 TRUE 19
## 6 20 20 20 0 11.91 20 TRUE 20
So we can plot the PPC:
GUTS models can be used to simulate the survival of the organisms
under any exposure pattern, using the calibration done with function
survFit()
from observed data. The function for prediction
is called predict()
and returns an object of class
SurvFitPredict
.
# (1) upload or build a data frame with the exposure profile
# argument `replicate` is used to provide several profiles of exposure
data_4prediction <- data.frame(time = c(1:10, 1:10),
conc = c(c(0,0,40,0,0,0,40,0,0,0),
c(21,19,18,23,20,14,25,8,13,5)),
replicate = c(rep("pulse", 10), rep("random", 10)))
# (2) Use the fit on constant exposure propiconazole with model SD (see previously)
predict_PRZ_cstSD_4pred <- predict(fit_cstSD, data_4prediction)
If NA
are produced an error
message is
returned.
From an object survFitPredict
, results can ben plotted
with function plot()
:
# (3) Plot the predicted survival rate under the new exposure profiles.
plot(predict_PRZ_cstSD_4pred)
# (3) Plot the predicted survival rate under the new exposure profiles.
plot(predict_PRZ_cstSD_4pred, background_concentration = TRUE)
# (3) Plot the predicted survival rate under the new exposure profiles.
plot(predict_PRZ_cstSD_4pred, background_concentration = TRUE, add_legend = TRUE)
Nsurv
on predictionWe can compute the number of survival based on the initial number of
individuals by using the df_mcmc
return by a
SurvPredict
object.
To do so, you have to give the initial number of intidivuals.
For instance:
Nsurv_predict_PRZ_cst4pred = compute_Nsurv(predict_PRZ_cstSD_4pred, Ninit = 10)
Nsurv_predict_PRZ_cst4pred$df_quantile
## conc time replicate q50 qinf95 qsup95
## 1 0 1 pulse 10 10 10
## 2 0 2 pulse 10 8 10
## 3 40 3 pulse 9 6 10
## 4 0 4 pulse 3 1 7
## 5 0 5 pulse 1 0 3
## 6 0 6 pulse 0 0 2
## 7 40 7 pulse 0 0 1
## 8 0 8 pulse 0 0 0
## 9 0 9 pulse 0 0 0
## 10 0 10 pulse 0 0 0
## 11 21 1 random 10 10 10
## 12 19 2 random 9 7 10
## 13 18 3 random 5 2 8
## 14 23 4 random 1 0 3
## 15 20 5 random 0 0 1
## 16 14 6 random 0 0 0
## 17 25 7 random 0 0 0
## 18 8 8 random 0 0 0
## 19 13 9 random 0 0 0
## 20 5 10 random 0 0 0
While the model has been estimated using the background mortality
parameter hb
, it can be interesting to see the prediction
without it. This is possible with the argument hb_value
. If
TRUE
, the background mortality is taken into account, and
if FALSE
, the background mortality is set to \(0\) in the prediction.
# Use the same data set profile to predict without 'hb'
predict_PRZ_cstSD_4pred_hbOUT <- predict(
fit = fit_cstSD,
display.exposure = data_4prediction,
hb_value = 0)
# Plot the prediction:
plot(predict_PRZ_cstSD_4pred_hbOUT)
# Use the same data set profile to predict without 'hb'
predict_PRZ_cstSD_4pred_hbFIX2 <- predict(
fit = fit_cstSD,
display.exposure = data_4prediction,
hb_value = 2)
# Plot the prediction:
plot(predict_PRZ_cstSD_4pred_hbFIX2)
Following EFSA recommendations, the next functions compute qualitative and quantitative model performance criteria suitable for GUTS, and TKTD modelling in general: the percentage of observations within the 95% credible interval of the Posterior Prediction Check (PPC), the Normalised Root Mean Square Error (NRMSE) and the Survival-Probability Prediction Error (SPPE).
PPC
The PPC compares the predicted median numbers of survivors associated to their uncertainty limits with the observed numbers of survivors. This can be visualised by plotting the predicted versus the observed values and counting how frequently the confidence/credible limits intersect with the 1:1 prediction line [see previous plot]. Based on experience, PPC resulting in less than 50% of the observations within the uncertainty limits indicate poor model performance.
Normalised Root Mean Square Error NRMSE
NRMSE criterion is also based on the expectation that predicted and observed survival numbers matches the 1:1 line in a scatter plot. The criterion is based on the classical root-mean-square error (RMSE), used to aggregate the magnitudes of the errors in predictions for various time-points into a single measure of predictive power. In order to provide a criterion expressed as a percentage, it is suggested using a normalised RMSE by the mean of the observations.
\[ NRMSE = \frac{RMSE}{\overline{Y}} = \frac{1}{\overline{Y}} \sqrt{\frac{1}{n} \sum_{i=1}^{n} (Y_{obs,i} - Y_{pred,i})^2} \times 100 \]
Survival Probability Prediction Error (SPPE)
The SPPE indicator is negative (between 0 and -100%) for an underestimation of effects, and positive (between 0 and 100%) for an overestimation of effects. An SPPE value of 0% means an exact prediction of the observed survival probability at the end of the experiment.
\[ SPPE = \left( \frac{Y_{obs, t_{end}}}{Y_{init}} - \frac{Y_{pred, t_{end}}}{Y_{init}} \right) \times 100 = \frac{Y_{obs, t_{end}} - Y_{pred, t_{end}}}{Y_{init}} \times 100 \]
For NRMSE and SPPE, we need to compute the number
of survivors. To do so, we use the function predict_Nsurv()
where two arguments are required: the first argument is a
survFit
object, and the other is a data set with four
columns (time
, conc
, replicate
and Nsurv
). Contrary to the function
predict()
, here the column Nsurv
is
necessary.
predict_psurv_PRZ_SD_cstTOcst <- predict(fit_cstSD, propiconazole)
compute_Nsurv_PRZ_SD_cstTOcst <- compute_Nsurv(
predict_psurv_PRZ_SD_cstTOcst,
Ninit = compute_Ninit(propiconazole)
)
head(predict_psurv_PRZ_SD_cstTOcst
$df_quantile)
## conc time replicate qinf95 q50 qsup95
## 1 35.92 0 A 1.000000000 1.000000000 1.000000000
## 2 35.92 1 A 0.562888029 0.663980794 0.750036058
## 3 35.92 2 A 0.081593662 0.142659091 0.227215648
## 4 35.92 3 A 0.005380550 0.016174751 0.041036148
## 5 35.92 4 A 0.000234217 0.001328144 0.005904622
## 6 28.93 0 B 1.000000000 1.000000000 1.000000000
data(propiconazole_pulse_exposure)
predict_psurv_PRZ_SD_cstTOvar <- predict(fit_cstSD, propiconazole_pulse_exposure)
predict_Nsurv_PRZ_SD_cstTOvar <- compute_Nsurv(
predict_psurv_PRZ_SD_cstTOvar,
Ninit = compute_Ninit(propiconazole)
)
head(predict_Nsurv_PRZ_SD_cstTOvar$df_quantile)
## conc time replicate q50 qinf95 qsup95
## 1 30.560 0.00 varA 20 20 21
## 2 27.930 0.96 varA 16 11 19
## 3 0.000 1.00 varA 12 7 17
## 4 0.260 1.96 varA 7 2 12
## 5 0.258 2.00 varA 4 0 8
## 6 0.210 2.96 varA 2 0 5
Then, using object produce with the function
compute_Nsurv()
we can compute PPC, NRMSE
and SPPE for all models.
When ploting a PPC for a survFitPredict_Nsurv
object, 3
types of lines are represented (following EFSA recommendations). - A
plain line corresponding to the 1:1 line (\(y=x\)): prediction match perfectly with
observation when dots are on this line. - A band of dashed lines
corresponding to the range of 25% deviation. - A band of dotted lines
corresponding to the range of 50% deviation.
Following the naming of parameters in the EFSA Scientific Opinion (2018), which differs from our naming of parameters, we add an option to be in agreement with EFSA.
Several names of parameters are used in the TKTD GUTS models. The
‘R-package’ morse
, and more specifically since the GUTS
implementation, several name of parameters have been used.
For stability reason of algorithms and package, we do not change
parameters name in implemented algorithms. However, we added argument
EFSA_name
to use EFSA naming in the functions
priors_distribution()
providing the distributions of priors
(note: distributions of posteriors are obtained with $mcmc
element of a survFit
object) and
plot_prior_post()
plotting priors distributions versus
posteriors distributions.
For instance:
Compared to the target time analysis, TKTD modelling allows to
compute and plot the lethal concentration for any x percentage
and at any time-point. The chosen time-point can be specified with
time_LCx
, by default the maximal time-point in the data set
is used.
## Note the use of the pipe operator, `|>`, which is a powerful tool for clearly expressing a sequence of multiple operations.
## For more information on pipes, see: http://r4ds.had.co.nz/pipes.html
Warning messages are returned when the range of concentrations is not appropriated for one or more LCx calculation(s).
Using prediction functions, GUTS models can be used to simulate the survival rate of organisms exposed to a given exposure pattern. In general, this realistic exposure profile does not result in any related mortality, but a critical question is to know how far the exposure profile is from adverse effect, that is a “margin of safety”.
This idea is then to multiply the concentration in the realistic exposure profile by a “multiplication factor”, denoted \(MF_x\), resulting in \(x\%\) (classically \(10\%\) or \(50\%\)) of additional death at a specified time (by default, at the end of the exposure period).
The multiplication factor \(MF_x\) then informs the “margin of safety” that could be used to assess if the risk should be considered as acceptable or not.
Computing an \(MF_x\) is easy with
function MFx()
. It only requires object
survFit
and the exposure profile, argument
data_predict
in the function. The chosen percentage of
survival reduction is specified with argument X
, the
default is \(50\), and the chosen
time-point can be specified with time_MFx
, by default the
maximal time-point in the data set is used.
There is no explicit formulation of \(MF_x\) (at least for the GUTS-SD model), so
the accuracy
argument can be used to change the accuracy of
the convergence level.
# (1) upload or build a data frame with the exposure profile
data_4MFx <- data.frame(time = 1:10,
conc = c(0,0.5,8,3,0,0,0.5,8,3.5,0),
replicate = "A")
# (2) Use the fit on constant exposure propiconazole with model SD (see previously)
MFx_PRZ_cstSD_4MFx <- lpxt(fit = fit_cstSD, display.exposure = data_4MFx)
As the computing time can be long, the function prints the
accuracy
for each step of the tested multiplication factor,
for the median and the 95% credible interval.
To compare the initial survival rate (corresponding to a
multiplication factor set to 1) with the survival rate at the asked
multiplication factor leading to a reduction of \(x\%\) of survival (provided with argument
x
).
Then, we can plot the survival rate as a function of the tested multiplication factors. Note that it is a linear interpolation between tested multiplication factor (cross dots on the graph).
As indicated, the warning message just remind you how multiplication factors and linear interpolations between them have been computed to obtain the graph.
What is provided with the function plot()
is direclty
accessible within the object of class MFx
:
## Psurv_1 conc time replicate
## 1 1.0000000 0.000000 1 A
## 2 0.9655516 1.132629 2 A
## 3 0.9322898 18.122070 3 A
## 4 0.8265749 6.795776 4 A
## 5 0.7565589 0.000000 5 A
## 6 0.7304966 0.000000 6 A
## 7 0.7053322 1.132629 7 A
## 8 0.6810346 18.122070 8 A
## 9 0.5734038 7.928406 9 A
## 10 0.4975760 0.000000 10 A
Here is an other example with a 10 percent Multiplication Factor:
# (2 bis) fit on constant exposure propiconazole with model SD (see previously)
MFx_PRZ_cstSD_4MFx_x10 <- lpxt(fit = fit_cstSD, x = 0.1, display.exposure = data_4MFx)
plot(MFx_PRZ_cstSD_4MFx_x10)
Multiplication factor is also available for the GUTS IT model (option
quiet = TRUE
remove the output):
# (2) Use the fit on constant exposure propiconazole with model IT. No print of run messages.
MFx_PRZ_cstIT_4pred <- lpxt(
fit = fit_cstIT, x = 0.5, t = 4,
display.exposure = data_4MFx)
# (3) Plot the survival rate versus multiplication factors.
plot(MFx_PRZ_cstIT_4pred)
This last example set a reduction of \(10\%\) of the survival rate, remove the
background mortality by setting hb_value = FALSE
and is
computed at time time_MFx = 4
.
# (2) Use the fit on constant exposure propiconazole with model IT. No print of run messages.
MFx_PRZ_cstIT_4pred <- lpxt(fit = fit_cstIT, x = 0.1, t = 4, display.exposure = data_4MFx)
#
plot(MFx_PRZ_cstIT_4pred)