--- title: "KL-Optimality: designs for model discrimination" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{KL-Optimality: designs for model discrimination} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 6, fig.height = 4, out.width = "90%" ) ``` ```{r setup} library(optedr) ``` ## Background Standard optimality criteria (D, A, I, …) assume the model is correctly specified and aim to estimate its parameters as precisely as possible. KL-Optimality addresses a different question: *given two competing models, where should observations be taken to tell them apart most easily?* The criterion is based on the Kullback-Leibler divergence between the reference model $f_1$ and the best-fitting version of the rival $f_2$: $$\Phi_{\text{KL}}(\xi) = \int \min_{\beta_2} \text{KL}(f_1(y \mid x) \,\|\, f_2(y \mid x, \beta_2))\, \xi(dx).$$ The design $\xi^*$ that maximises $\Phi_{\text{KL}}$ places observations where the two models are most distinguishable on average. **Reference:** López-Fidalgo, Tommasi & Trandafir (2007). *An optimal experimental design criterion for discriminating between non-normal models.* J. R. Stat. Soc. B, 69(2), 231–242. ## API overview ```r opt_des( "KL-Optimality", model = , parameters = , par_values = , design_space = , # rival specification: rival_model = , # default: same as model rival_params = , # default: same as parameters rival_pars = , rival_lower = , rival_upper = , # GLM family: family = "Normal", # "Poisson", "Binomial", "Gamma" phi = 1, # dispersion parameter # alternative: supply a custom KL function kl_fun = NULL ) ``` The internal optimisation over rival parameters is done automatically. The optimal rival values are stored as a hidden attribute and shown by `summary()`. ## Example 1: Michaelis-Menten vs linear rival (Normal, 1D) **Context.** Enzyme kinetics with one substrate. The reference model is Michaelis-Menten $\mu(x) = V_{\max} x / (K_m + x)$ with $V_{\max} = 2$, $K_m = 1$. The rival is a linear rate $\mu(x) = a x$, valid only when $x \ll K_m$. At high concentrations MM saturates while the linear model keeps growing — that is where the models diverge most. ```{r kl-mm} result_kl_mm <- opt_des( "KL-Optimality", model = y ~ Vmax * x / (Km + x), parameters = c("Vmax", "Km"), par_values = c(2, 1), design_space = c(0.1, 5), rival_model = y ~ a * x, rival_params = c("a"), rival_pars = c(1), rival_lower = c(0.2), rival_upper = c(2.5), family = "Normal", phi = 1 ) result_kl_mm ``` `summary()` shows the GLM family and the optimal rival parameters found by the internal optimisation: ```{r kl-mm-summary} summary(result_kl_mm) ``` The sensitivity plot shows $d_{\text{KL}}(x, \xi)$ with the Equivalence Theorem bound (dashed). Support points concentrate in the saturation region where MM and the linear rival differ most: ```{r kl-mm-plot, fig.cap = "KL sensitivity function: MM vs linear rival."} plot(result_kl_mm) ``` Efficiency of a five-point uniform design relative to the KL-optimal: ```{r kl-mm-eff} design_unif <- data.frame( Point = c(0.1, 1.3, 2.5, 3.8, 5.0), Weight = rep(1/5, 5) ) eff_kl <- design_efficiency(design_unif, result_kl_mm) cat("Efficiency of uniform design:", round(eff_kl * 100, 2), "%\n") ``` ## Example 2: exponential decay, Poisson family **Context.** A drug clearance study where the event count follows a Poisson process with rate $\mu(x) = \exp(a - bx)$. The reference model has nominal decay rate $b = 0.5$. The rival represents a faster-elimination regime ($b \in [0.8, 1.5]$), which could arise from drug interactions. The KL divergence for Poisson is $\text{KL} = \mu_2 - \mu_1 - \mu_1 \log(\mu_2/\mu_1)$; since the rival decays faster, the gap is largest at long follow-up times. ```{r kl-poisson} result_kl_pois <- opt_des( "KL-Optimality", model = y ~ exp(a - b * x), parameters = c("a", "b"), par_values = c(2, 0.5), design_space = c(0, 4), rival_pars = c(2, 1.0), rival_lower = c(1.5, 0.8), rival_upper = c(2.5, 1.5), family = "Poisson", phi = 1 ) result_kl_pois summary(result_kl_pois) ``` ```{r kl-poisson-plot, fig.cap = "KL sensitivity for Poisson decay model."} plot(result_kl_pois) ``` The optimal rival parameters found internally confirm the closest rival within the constraint: ```{r kl-poisson-rival} hv <- attr(result_kl_pois, "hidden_value") cat("Optimal rival: a =", round(hv$beta2_star[1], 3), " b =", round(hv$beta2_star[2], 3), "\n") ``` ## Using `make_kl_fun()` for custom KL functions For some discrimination problems the KL divergence does not reduce to the standard cumulant-generating-function form that `opt_des()` uses internally. `make_kl_fun()` automates the construction of the KL function from two model-family pairs. ### Supported pairs | Combination | Formula | |-------------|---------| | Same family, same $\phi$ | Standard cumulant form | | Normal vs Normal, $\phi_1 \neq \phi_2$ | $\frac{1}{2}\left[\log\frac{\phi_2}{\phi_1} + \frac{\phi_1}{\phi_2} + \frac{(\mu_1-\mu_2)^2}{\phi_2} - 1\right]$ | | Gamma vs Gamma, different shape | Closed form with digamma/lgamma | For other pairs, supply `kl_fun` directly as a function `(x, beta2) -> scalar`. ### Example 3: Normal with different variances (1D) **Context.** Same exponential decay model, but now the rival has variance $\phi_2 = 4$ (four times larger). The KL function includes a constant term $\frac{1}{2}[\log(\phi_2/\phi_1) + \phi_1/\phi_2 - 1]$ even when the means coincide, so the design must also account for the variance difference. ```{r kl-variances} kl_fn_var <- make_kl_fun( "Normal", model1 = y ~ a * exp(-b * x), params1 = c("a", "b"), par_values1 = c(1, 0.5), phi1 = 1, family2 = "Normal", model2 = y ~ c * exp(-d * x), params2 = c("c", "d"), phi2 = 4 ) result_kl_var <- opt_des( "KL-Optimality", model = y ~ a * exp(-b * x), parameters = c("a", "b"), par_values = c(1, 0.5), design_space = c(0, 4), kl_fun = kl_fn_var, rival_pars = c(1, 1.0), rival_lower = c(0.5, 0.8), rival_upper = c(2.0, 1.5) ) result_kl_var summary(result_kl_var) ``` ```{r kl-var-plot, fig.cap = "KL sensitivity: Normal phi=1 vs Normal phi=4."} plot(result_kl_var) ``` ### Example 4: two-factor MM vs linear rival (`make_kl_fun`, 2D) `make_kl_fun()` works with models that use different subsets of the design variables. Here the reference uses $(x_1, x_2)$ (bisubstrate MM) while the rival depends only on $x_1$ (single-substrate linear approximation): ```{r kl-2d} kl_fn_2d <- make_kl_fun( "Normal", model1 = y ~ Vmax * x1 * x2 / ((Km1 + x1) * (Km2 + x2)), params1 = c("Vmax", "Km1", "Km2"), par_values1 = c(1, 1, 1), model2 = y ~ alpha * x1, params2 = "alpha" ) result_kl_2d <- opt_des( "KL-Optimality", model = y ~ Vmax * x1 * x2 / ((Km1 + x1) * (Km2 + x2)), parameters = c("Vmax", "Km1", "Km2"), par_values = c(1, 1, 1), design_space = list(x1 = c(0.1, 5), x2 = c(0.1, 5)), kl_fun = kl_fn_2d, rival_pars = c(0.5), rival_lower = c(0.05), rival_upper = c(2.0) ) result_kl_2d ``` The heatmap concentrates support where the difference between the bisubstrate hyperbola and the linear rival is greatest (saturation regime in both $x_1$ and $x_2$): ```{r kl-2d-plot, fig.cap = "KL sensitivity heatmap: 2D MM vs linear rival."} plot(result_kl_2d) ``` ## Example 5: discriminating error structures — the Hill model **Reference:** De la Calle-Arroyo, Leorato, Rodríguez-Aragón & Tommasi (2025). *Augmented designs to choose between constant absolute and relative errors.* Chemometrics and Intelligent Laboratory Systems, doi:10.1016/j.chemolab.2025.105374. **Context.** Dose-response experiments with the Hill (4-parameter Emax) model: $$\eta(x, \beta) = (E_{\text{con}} - b) \cdot \frac{(x/\text{IC}_{50})^s}{1 + (x/\text{IC}_{50})^s} + b$$ with $E_{\text{con}} = 1.70$, $b = 0.137$, $\text{IC}_{50} = 111$, $s = -1.03$ and $x \in [0.01, 1500]$ nM. Two error structures are competing: - **Relative error** (reference): $\varepsilon \sim N(0,\, \sigma_{\text{rel}}^2 \cdot \eta^2)$ - **Absolute error** (rival): $\varepsilon \sim N(0,\, \sigma_{\text{abs}}^2)$ The KL divergence minimised over the rival variance $\sigma_{\text{abs}}^2$ is $$\text{KL}(x,\, \sigma_{\text{abs}}^2) = \frac{1}{2}\!\left(\frac{\eta^2(x)}{\sigma_{\text{abs}}^2} - 1 + \log\frac{\sigma_{\text{abs}}^2}{\eta^2(x)}\right).$$ We supply this directly as `kl_fun` since it is not covered by the standard Normal-vs-Normal formula (the reference variance is $x$-dependent): ```{r kl-hill} kl_fun_hill <- function(x, beta2) { sigma_sq <- beta2[1] eta <- (1.70 - 0.137) * (x / 111)^(-1.03) / (1 + (x / 111)^(-1.03)) + 0.137 0.5 * (eta^2 / sigma_sq - 1 + log(sigma_sq / eta^2)) } result_kl_hill <- opt_des( "KL-Optimality", model = y ~ (Econ - b) * (x / IC50)^s / (1 + (x / IC50)^s) + b, parameters = c("Econ", "b", "IC50", "s"), par_values = c(1.70, 0.137, 111, -1.03), design_space = c(0.01, 1500), kl_fun = kl_fun_hill, rival_pars = c(1.0), rival_lower = c(1e-4), rival_upper = c(1e6) ) result_kl_hill summary(result_kl_hill) ``` ```{r kl-hill-plot, fig.cap = "KL sensitivity for the Hill model (error structure discrimination)."} plot(result_kl_hill) ``` The KL-optimal design places all weight at the two extremes of the design space. This matches the result reported in the paper (support at $x = 0.01$ with weight 0.23 and $x = 1500$ with weight 0.77). The Atwood criterion at 100 % confirms the design is exactly optimal. The optimal rival variance recovered internally: ```{r kl-hill-rival} hv_hill <- attr(result_kl_hill, "hidden_value") cat("Optimal rival sigma_abs^2 =", round(hv_hill$beta2_star, 4), "\n") ``` Note that with only two support points this design cannot estimate all four parameters of the Hill model. The paper proposes combining it with a D-optimal design via the augmentation workflow described in `vignette("optedr-augment")`.