--- title: "Augmenting designs with controlled efficiency loss" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Augmenting designs with controlled efficiency loss} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 6, fig.height = 4, out.width = "90%" ) ``` ```{r setup} library(optedr) ``` ## Motivation In practice an experiment is rarely designed from scratch. A researcher may already have data collected at certain conditions and want to add new observations to improve estimation — without discarding what has already been measured. The key question is: *where* can new points be placed so that the efficiency of the augmented design stays above an acceptable threshold? **optedr** answers this question with two functions used in sequence: 1. `get_augment_region()` — computes the candidate region: the set of design points whose addition keeps the D-efficiency of the augmented design above a user-specified threshold `delta_val`. 2. `augment_design()` — adds a chosen point to the initial design and rescales the weights. Both functions support the same optimality criteria as `opt_des()` and work for any number of factors. ## Key parameters | Parameter | Role | |-----------|------| | `init_design` | Current design (data frame with `Point`/`Weight` in 1D, or factor columns + `Weight` in multi-D) | | `alpha` | Fraction of total weight assigned to the new point after augmentation | | `delta_val` | Minimum acceptable D-efficiency of the augmented design | | `calc_optimal_design` | If `TRUE`, also computes the optimal design and uses it as the reference for efficiency | | `new_points` | Data frame of points to add (non-interactive mode); omit for interactive mode | | `par_int` | Indices of parameters of interest (Ds-Optimality only) | | `n_lhs` | Number of Latin-Hypercube candidates for the region search (multi-D) | ## One-factor augmentation ### Step 1: compute the candidate region We start with a uniform three-point design for Antoine's equation and look for points that keep the D-efficiency of the augmented design above 85 %. ```{r region-1d} init_des <- data.frame( Point = c(30, 60, 90), Weight = c(1/3, 1/3, 1/3) ) region <- get_augment_region( criterion = "D-Optimality", init_design = init_des, alpha = 0.25, model = y ~ 10^(a - b / (c + x)), parameters = c("a", "b", "c"), par_values = c(8.07131, 1730.63, 233.426), design_space = c(1, 100), calc_optimal_design = FALSE, delta_val = 0.85 ) print(region) ``` `region$region` is a data frame of candidate intervals. Each row gives a lower and upper bound on the design space where the new point can be placed. ### Step 2: choose a point and augment ```{r augment-1d} new_pt <- mean(region$region[1:2]) augmented <- augment_design( criterion = "D-Optimality", init_design = init_des, alpha = 0.25, model = y ~ 10^(a - b / (c + x)), parameters = c("a", "b", "c"), par_values = c(8.07131, 1730.63, 233.426), design_space = c(1, 100), calc_optimal_design = FALSE, delta_val = 0.85, new_points = data.frame(Point = new_pt, Weight = 1) ) print(augmented) cat("Sum of weights:", sum(augmented$Weight), "\n") ``` ### Comparing efficiency before and after ```{r efficiency-1d} result_opt <- opt_des( "D-Optimality", y ~ 10^(a - b / (c + x)), c("a", "b", "c"), c(8.07131, 1730.63, 233.426), c(1, 100) ) eff_before <- design_efficiency(init_des, result_opt) eff_after <- design_efficiency(augmented, result_opt) cat("Efficiency before augmenting:", round(eff_before * 100, 2), "%\n") cat("Efficiency after augmenting: ", round(eff_after * 100, 2), "%\n") cat("Gain: ", round((eff_after - eff_before) * 100, 2), "percentage points\n") ``` ### Using the optimal design as reference (`calc_optimal_design = TRUE`) When `calc_optimal_design = TRUE`, the function internally computes the optimal design and uses it to define the efficiency threshold. This is the recommended mode when no optimal design has been computed yet: ```{r augment-with-opt, eval=FALSE} region_opt <- get_augment_region( criterion = "D-Optimality", init_design = init_des, alpha = 0.25, model = y ~ 10^(a - b / (c + x)), parameters = c("a", "b", "c"), par_values = c(8.07131, 1730.63, 233.426), design_space = c(1, 100), calc_optimal_design = TRUE, delta_val = 0.85 ) ``` ## Two-factor augmentation In multi-dimensional spaces `get_augment_region()` samples candidate points with a Latin Hypercube (controlled by `n_lhs`) and returns a data frame of candidates together with their estimated efficiency gain. A heatmap of the efficiency function is displayed automatically. ### Initial design and candidate region ```{r region-2d} init_2d <- data.frame( x1 = c(0.8, 10, 5), x2 = c(10, 0.8, 5), Weight = c(1/3, 1/3, 1/3) ) result_2D <- opt_des( criterion = "D-Optimality", model = y ~ Vmax * x1 * x2 / ((K1 + x1) * (K2 + x2)), parameters = c("Vmax", "K1", "K2"), par_values = c(1, 1, 1), design_space = list(x1 = c(0.1, 10), x2 = c(0.1, 10)) ) region_2d <- get_augment_region( criterion = "D-Optimality", init_design = init_2d, alpha = 0.25, model = y ~ Vmax * x1 * x2 / ((K1 + x1) * (K2 + x2)), parameters = c("Vmax", "K1", "K2"), par_values = c(1, 1, 1), design_space = list(x1 = c(0.1, 10), x2 = c(0.1, 10)), calc_optimal_design = FALSE, delta_val = 0.85 ) ``` `region_2d$region` is a data frame of sampled candidates, each with an `efficiency` column. Pick the candidate that maximises efficiency: ```{r augment-2d} best_2d <- region_2d$region[which.max(region_2d$region$efficiency), ] eff_antes <- suppressMessages(design_efficiency(init_2d, result_2D)) aug_2d <- augment_design( criterion = "D-Optimality", init_design = init_2d, alpha = 0.25, model = y ~ Vmax * x1 * x2 / ((K1 + x1) * (K2 + x2)), parameters = c("Vmax", "K1", "K2"), par_values = c(1, 1, 1), design_space = list(x1 = c(0.1, 10), x2 = c(0.1, 10)), calc_optimal_design = FALSE, delta_val = 0.85, new_points = data.frame(x1 = best_2d$x1, x2 = best_2d$x2, Weight = 1) ) eff_despues <- suppressMessages(design_efficiency(aug_2d, result_2D)) cat("Efficiency before:", round(eff_antes * 100, 2), "%\n") cat("Efficiency after: ", round(eff_despues * 100, 2), "%\n") print(aug_2d) ``` ## Three-factor augmentation For three or more factors the candidate region is displayed as a scatter-matrix coloured by candidate/non-candidate status, with the current design shown as triangles. ```{r region-3d} init_3d <- data.frame( x1 = c(0.8, 10, 10, 0.8, 10), x2 = c(10, 0.8, 10, 10, 0.8), x3 = c(10, 10, 0.8, 0.8, 10), Weight = rep(0.2, 5) ) region_3d <- get_augment_region( criterion = "D-Optimality", init_design = init_3d, alpha = 0.45, model = y ~ Vmax * x1 * x2 * x3 / ((K1+x1) * (K2+x2) * (K3+x3)), parameters = c("Vmax", "K1", "K2", "K3"), par_values = c(1, 1, 1, 1), design_space = list(x1 = c(0.1, 10), x2 = c(0.1, 10), x3 = c(0.1, 10)), calc_optimal_design = FALSE, delta_val = 0.93 ) cat("Number of candidate points:", nrow(region_3d$region), "\n") plot(region_3d$plot) ``` ## Augmenting with Ds-Optimality When the goal is to augment while preserving estimation quality for a *subset* of parameters, use `criterion = "Ds-Optimality"` and pass `par_int`: ```{r augment-ds} region_ds <- get_augment_region( criterion = "Ds-Optimality", init_design = init_2d, alpha = 0.25, model = y ~ Vmax * x1 * x2 / ((K1 + x1) * (K2 + x2)), parameters = c("Vmax", "K1", "K2"), par_values = c(1, 1, 1), design_space = list(x1 = c(0.1, 10), x2 = c(0.1, 10)), calc_optimal_design = FALSE, par_int = c(1), delta_val = 0.85, n_lhs = 5000 ) best_ds <- region_ds$region[which.max(region_ds$region$efficiency), ] aug_ds <- augment_design( criterion = "Ds-Optimality", init_design = init_2d, alpha = 0.25, model = y ~ Vmax * x1 * x2 / ((K1 + x1) * (K2 + x2)), parameters = c("Vmax", "K1", "K2"), par_values = c(1, 1, 1), design_space = list(x1 = c(0.1, 10), x2 = c(0.1, 10)), calc_optimal_design = FALSE, par_int = c(1), delta_val = 0.85, new_points = data.frame(x1 = best_ds$x1, x2 = best_ds$x2, Weight = 1), n_lhs = 5000 ) print(aug_ds) ``` ## Interactive mode Omitting `new_points` (and `delta_val`) from both functions triggers an interactive session where the package plots the candidate region and asks the user to type a point. This mode is documented in `?augment_design`.