--- title: "Power simulations using multiple approaches for internal validation" output: rmarkdown::html_vignette bibliography: "`r system.file('references.bib', package='graphicalMCP')`" vignette: > %\VignetteIndexEntry{Power simulations using multiple approaches for internal validation} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( fig.align = "center", collapse = TRUE, comment = "#>", cache.lazy = FALSE ) ``` ```{r setup, message = FALSE, warning = FALSE} library(graphicalMCP) ``` # Introduction Multiple approaches are implemented in `graphicalMCP` to reject a hypothesis for different purposes and/or considerations. One approach is to calculate the adjusted p-value of this hypothesis and compare it with `alpha`. This approach is implemented in `adjusted_p` functions and `graph_test_closure()` (when `test_values = FALSE`). Another approach is to calculate the adjusted significance level of this hypothesis and compare it with its p-value. This approach is implemented in `adjusted_weights` functions, `graph_test_closure()` (when `test_values = TRUE`), and `graph_calculate_power()`. To further tailor this approach for different outputs, a different way of coding are used for `graph_test_closure()` (when `test_values = TRUE`). When implementing these approaches in `graph_calculate_power()`, variations are added to optimize computing speed. Thus, these approaches could be compared with each other for internal validation. # Power simulations A random graph will be generated and used for the comparison. A set of marginal power (without multiplicity adjustment) is randomly generated. Local power (with multiplicity adjustment) is calculated using `graph_calculate_power()`. In addition, p-values simulated from `graph_calculate_power()` are saved. These p-values are used to generate local power via `graph_test_shortcut()` and `graph_test_closure()` as the proportion of times every hypothesis can be rejected. We expect to observe matching results for 1000 random graphs. ## Bonferroni tests We compare power simulations from `graph_calculate_power()` and those using `graph_test_shortcut()` via respectively the adjusted p-value approach and the adjusted significance level approach. ```{r bonferroni-results} out <- read.csv(here::here("vignettes/internal-validation_bonferroni.csv")) # Matching power using the adjusted p-value approach all.equal(out$adjusted_p, rep(TRUE, nrow(out))) # Matching power using the adjusted significance level approach all.equal(out$adjusted_significance_level, rep(TRUE, nrow(out))) ``` ## Hochberg tests We compare power simulations from `graph_calculate_power()` and those using `graph_test_closure()` via respectively the adjusted p-value approach and the adjusted significance level approach. Two test groups are used with randomly assigned hypotheses. ```{r hochberg-results} out <- read.csv(here::here("vignettes/internal-validation_hochberg.csv")) # Matching power using the adjusted p-value approach all.equal(out$adjusted_p, rep(TRUE, nrow(out))) # Matching power using the adjusted significance level approach all.equal(out$adjusted_significance_level, rep(TRUE, nrow(out))) ``` ## Simes tests We compare power simulations from `graph_calculate_power()` and those using `graph_test_closure()` via respectively the adjusted p-value approach and the adjusted significance level approach. Two test groups are used with randomly assigned hypotheses. 
# Power simulations

A random graph is generated and used for the comparison, together with a randomly generated set of marginal power values (without multiplicity adjustment). Local power (with multiplicity adjustment) is calculated using `graph_calculate_power()`. In addition, the p-values simulated by `graph_calculate_power()` are saved. These p-values are then tested using `graph_test_shortcut()` or `graph_test_closure()` to obtain local power as the proportion of simulations in which each hypothesis is rejected. We expect matching results for 1000 random graphs.
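To make this workflow concrete, the following sketch (not evaluated) condenses one validation replicate for the Bonferroni case; the settings are illustrative, and the complete simulation scripts are included as unevaluated chunks at the end of this vignette's source.

```{r validation-loop-sketch, eval = FALSE}
# A condensed sketch (not run) of one validation replicate;
# settings are illustrative
m <- 6
alpha <- 0.025
n_sim <- 100
sim_corr <- matrix(0.5, m, m)
diag(sim_corr) <- 1

set.seed(1234)
graph <- random_graph(m)
marginal_power <- runif(m, 0.5, 0.9)

# Local power and simulated p-values from graph_calculate_power()
power <- graph_calculate_power(
  graph,
  alpha = alpha,
  power_marginal = marginal_power,
  sim_corr = sim_corr,
  sim_n = n_sim,
  verbose = TRUE
)
p_sim <- power$details$p_sim

# Re-test each simulated p-value vector and record rejections
rejected <- t(apply(p_sim, 1, function(p) {
  graph_test_shortcut(graph, p, alpha = alpha)$outputs$rejected
}))

# Local power from repeated testing should match graph_calculate_power()
all.equal(power$power$power_local, colMeans(rejected), tolerance = 1 / n_sim)
```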
## Bonferroni tests

We compare the power simulated by `graph_calculate_power()` with the power obtained by applying `graph_test_shortcut()` to the simulated p-values, using both the adjusted p-value approach and the adjusted significance level approach.

```{r bonferroni-results}
out <- read.csv(here::here("vignettes/internal-validation_bonferroni.csv"))

# Matching power using the adjusted p-value approach
all.equal(out$adjusted_p, rep(TRUE, nrow(out)))

# Matching power using the adjusted significance level approach
all.equal(out$adjusted_significance_level, rep(TRUE, nrow(out)))
```

## Hochberg tests

We compare the power simulated by `graph_calculate_power()` with the power obtained by applying `graph_test_closure()` to the simulated p-values, using both the adjusted p-value approach and the adjusted significance level approach. Two test groups are used with randomly assigned hypotheses.

```{r hochberg-results}
out <- read.csv(here::here("vignettes/internal-validation_hochberg.csv"))

# Matching power using the adjusted p-value approach
all.equal(out$adjusted_p, rep(TRUE, nrow(out)))

# Matching power using the adjusted significance level approach
all.equal(out$adjusted_significance_level, rep(TRUE, nrow(out)))
```
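For closed tests, the adjusted significance level approach operates on intersection hypotheses: an intersection hypothesis is rejected if any inequality within it holds, and an individual hypothesis is rejected if all intersections containing it are rejected. The sketch below (not evaluated) shows how the validation scripts derive per-hypothesis rejections from the test values; `result` is assumed to come from `graph_test_closure(..., test_values = TRUE)` on `m` hypotheses.

```{r closure-rejections-sketch, eval = FALSE}
# A sketch (not run); `result` is assumed to come from
# graph_test_closure(..., test_values = TRUE) on m hypotheses
results <- result$test_values$results

# An intersection hypothesis is rejected if any inequality within it holds
intersections <- unique(results$Intersection)
intersection_rejected <- vapply(intersections, function(int) {
  any(results$Inequality_holds[results$Intersection == int])
}, logical(1))

# Hypothesis k is rejected if all intersections containing it (a "1" in
# position k of the intersection label) are rejected
rejected <- vapply(seq_len(m), function(k) {
  all(intersection_rejected[substr(intersections, k, k) == "1"])
}, logical(1))
```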
## Simes tests

We compare the power simulated by `graph_calculate_power()` with the power obtained by applying `graph_test_closure()` to the simulated p-values, using both the adjusted p-value approach and the adjusted significance level approach. Two test groups are used with randomly assigned hypotheses.

```{r simes-results}
out <- read.csv(here::here("vignettes/internal-validation_simes.csv"))

# Matching power using the adjusted p-value approach
all.equal(out$adjusted_p, rep(TRUE, nrow(out)))

# Matching power using the adjusted significance level approach
all.equal(out$adjusted_significance_level, rep(TRUE, nrow(out)))
```

## Parametric tests

We compare the power simulated by `graph_calculate_power()` with the power obtained by applying `graph_test_closure()` to the simulated p-values, using both the adjusted p-value approach and the adjusted significance level approach. Two test groups are used with randomly assigned hypotheses.

```{r parametric-results}
out <- read.csv(here::here("vignettes/internal-validation_parametric.csv"))

# Matching power using the adjusted p-value approach
all.equal(out$adjusted_p, rep(TRUE, nrow(out)))

# Matching power using the adjusted significance level approach
all.equal(out$adjusted_significance_level, rep(TRUE, nrow(out)))
```
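Parametric tests additionally require the joint distribution of test statistics within each test group, supplied via `test_corr`. In the validation scripts, these are taken as submatrices of the simulation correlation matrix, as the sketch below illustrates (the random group assignment is illustrative).

```{r test-corr-sketch, eval = FALSE}
# A sketch (not run); the random group assignment is illustrative
m <- 4
sim_corr <- matrix(0.5, m, m)
diag(sim_corr) <- 1

groups <- sample(1:m)
test_groups <- list(groups[1:(m / 2)], groups[(m / 2 + 1):m])

# Correlation within each test group, extracted from the simulation
# correlation matrix
test_corr <- list(
  sim_corr[test_groups[[1]], test_groups[[1]]],
  sim_corr[test_groups[[2]], test_groups[[2]]]
)
```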
## Mixed tests of Bonferroni, Hochberg and Simes

We compare the power simulated by `graph_calculate_power()` with the power obtained by applying `graph_test_closure()` to the simulated p-values, using both the adjusted p-value approach and the adjusted significance level approach. Two test groups are used with randomly assigned hypotheses, and the two test types are randomly picked among Bonferroni, Hochberg and Simes tests.

```{r mixed-results}
out <- read.csv(here::here("vignettes/internal-validation_mixed.csv"))

# Matching power using the adjusted p-value approach
all.equal(out$adjusted_p, rep(TRUE, nrow(out)))

# Matching power using the adjusted significance level approach
all.equal(out$adjusted_significance_level, rep(TRUE, nrow(out)))
```

## Mixed tests of parametric and one of Bonferroni, Hochberg and Simes

We compare the power simulated by `graph_calculate_power()` with the power obtained by applying `graph_test_closure()` to the simulated p-values, using both the adjusted p-value approach and the adjusted significance level approach. Two test groups are used with randomly assigned hypotheses. The parametric test type is assigned to the first test group, and the test type for the second test group is randomly picked among Bonferroni, Hochberg and Simes tests.

```{r parametric-mixed-results}
out <- read.csv(here::here("vignettes/internal-validation_parametric-mixed.csv"))

# Matching power using the adjusted p-value approach
all.equal(out$adjusted_p, rep(TRUE, nrow(out)))

# Matching power using the adjusted significance level approach
all.equal(out$adjusted_significance_level, rep(TRUE, nrow(out)))
```
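When only some test groups are parametric, a correlation matrix is needed only for the parametric groups. Following the validation scripts, the `test_corr` entry for the non-parametric group is set to `NA`, as sketched below (continuing the illustrative setup above).

```{r mixed-test-corr-sketch, eval = FALSE}
# A sketch (not run), continuing the illustrative setup above: only the
# first (parametric) test group needs a correlation matrix
test_corr <- list(sim_corr[test_groups[[1]], test_groups[[1]]], NA)
test_types <- c(
  "parametric",
  sample(c("bonferroni", "hochberg", "simes"), 1)
)
```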
# Conclusions

Multiple approaches are implemented in `graphicalMCP` to decide whether a hypothesis can be rejected. One approach calculates the adjusted p-value of a hypothesis and compares it with `alpha`; the other calculates the adjusted significance level of a hypothesis and compares it with the hypothesis's p-value. Based on 1000 random graphs, these two approaches produce matching power for all types of tests considered. Therefore, the internal validation is considered complete.

```{r bonferroni, include = FALSE, eval = FALSE}
n_graph <- 1000
m <- 6
alpha <- 0.025
n_sim <- 1e2
sim_corr <- matrix(0.5, m, m)
diag(sim_corr) <- 1
out_adjusted_p <- rep(NA, n_graph)
out_adjusted_significance_level <- rep(NA, n_graph)
for (i in 1:n_graph) {
  set.seed(1234 + i - 1)
  graph <- random_graph(m)
  marginal_power <- runif(m, 0.5, 0.9)
  results_calculate_power <- graph_calculate_power(
    graph,
    alpha = alpha,
    power_marginal = marginal_power,
    sim_corr = sim_corr,
    sim_n = n_sim,
    verbose = TRUE
  )
  p_sim <- results_calculate_power$details$p_sim
  adjusted_p_results <- p_sim
  adjusted_significance_level_results <- p_sim
  for (j in 1:n_sim) {
    temp <- graph_test_shortcut(
      graph,
      p_sim[j, ],
      alpha = alpha,
      test_values = TRUE
    )
    adjusted_p_results[j, ] <- temp$outputs$rejected
    temp_adjusted_significance_level <-
      temp$test_values$results$Inequality_holds
    names(temp_adjusted_significance_level) <-
      temp$test_values$results$Hypothesis
    adjusted_significance_level_results[j, ] <-
      temp_adjusted_significance_level[names(temp$outputs$rejected)]
  }
  results_adjusted_p <- colMeans(adjusted_p_results)
  names(results_adjusted_p) <- names(temp$outputs$rejected)
  # Matching power using the adjusted p-value approach
  out_adjusted_p[i] <- all.equal(
    results_calculate_power$power$power_local,
    results_adjusted_p,
    tolerance = 1 / n_sim
  )
  results_significance_level <- colMeans(adjusted_significance_level_results)
  names(results_significance_level) <- names(temp$outputs$rejected)
  # Matching power using the adjusted significance level approach
  out_adjusted_significance_level[i] <- all.equal(
    results_calculate_power$power$power_local,
    results_significance_level,
    tolerance = 1 / n_sim
  )
}
out <- cbind(out_adjusted_p, out_adjusted_significance_level)
colnames(out) <- c("adjusted_p", "adjusted_significance_level")
write.csv(
  out,
  here::here("vignettes/internal-validation_bonferroni.csv"),
  row.names = FALSE
)
```

```{r hochberg, include = FALSE, eval = FALSE}
n_graph <- 1000
m <- 4
alpha <- 0.025
n_sim <- 1e2
sim_corr <- matrix(0.5, m, m)
diag(sim_corr) <- 1
out_adjusted_p <- rep(NA, n_graph)
out_adjusted_significance_level <- rep(NA, n_graph)
for (i in 1:n_graph) {
  set.seed(1234 + i - 1)
  groups <- sample(1:m)
  test_groups <- list(groups[1:(m / 2)], groups[(m / 2 + 1):m])
  graph <- random_graph(m)
  graph$hypotheses[groups[1:(m / 2)]] <-
    sum(graph$hypotheses[groups[1:(m / 2)]]) / 3
  graph$hypotheses[groups[(m / 2 + 1):m]] <-
    sum(graph$hypotheses[groups[(m / 2 + 1):m]]) / 3
  graph$transitions <- matrix(1 / (m - 1), nrow = m, ncol = m)
  diag(graph$transitions) <- 0
  marginal_power <- runif(m, 0.5, 0.9)
  results_calculate_power <- graph_calculate_power(
    graph,
    alpha = alpha,
    power_marginal = marginal_power,
    sim_corr = sim_corr,
    sim_n = n_sim,
    test_types = c("hochberg", "hochberg"),
    test_groups = test_groups,
    verbose = TRUE
  )
  p_sim <- results_calculate_power$details$p_sim
  adjusted_p_results <- p_sim
  adjusted_significance_level_results <- p_sim
  for (j in 1:n_sim) {
    temp <- graph_test_closure(
      graph,
      p_sim[j, ],
      alpha = alpha,
      test_types = c("hochberg", "hochberg"),
      test_groups = test_groups,
      test_values = TRUE
    )
    adjusted_p_results[j, ] <- temp$outputs$rejected
    intersection <- unique(temp$test_values$results$Intersection)
    intersection_rej <- intersection
    for (k in 1:length(intersection)) {
      intersection_rej[k] <- any(subset(
        temp$test_values$results,
        Intersection == intersection[k]
      )$Inequality_holds)
    }
    for (k in 1:m) {
      id <- substr(intersection, start = k, stop = k) == "1"
      adjusted_significance_level_results[j, k] <-
        suppressWarnings(all(intersection_rej[id]))
    }
  }
  results_adjusted_p <- colMeans(adjusted_p_results)
  names(results_adjusted_p) <- names(temp$outputs$rejected)
  # Matching power using the adjusted p-value approach
  out_adjusted_p[i] <- all.equal(
    results_calculate_power$power$power_local,
    results_adjusted_p,
    tolerance = 1 / n_sim
  )
  results_significance_level <- colMeans(adjusted_significance_level_results)
  names(results_significance_level) <- names(temp$outputs$rejected)
  # Matching power using the adjusted significance level approach
  out_adjusted_significance_level[i] <- all.equal(
    results_calculate_power$power$power_local,
    results_significance_level,
    tolerance = 1 / n_sim
  )
}
out <- cbind(out_adjusted_p, out_adjusted_significance_level)
colnames(out) <- c("adjusted_p", "adjusted_significance_level")
write.csv(
  out,
  here::here("vignettes/internal-validation_hochberg.csv"),
  row.names = FALSE
)
```

```{r simes, include = FALSE, eval = FALSE}
n_graph <- 1000
m <- 4
alpha <- 0.025
n_sim <- 1e2
sim_corr <- matrix(0.5, m, m)
diag(sim_corr) <- 1
out_adjusted_p <- rep(NA, n_graph)
out_adjusted_significance_level <- rep(NA, n_graph)
for (i in 1:n_graph) {
  set.seed(1234 + i - 1)
  groups <- sample(1:m)
  test_groups <- list(groups[1:(m / 2)], groups[(m / 2 + 1):m])
  graph <- random_graph(m)
  marginal_power <- runif(m, 0.5, 0.9)
  results_calculate_power <- graph_calculate_power(
    graph,
    alpha = alpha,
    power_marginal = marginal_power,
    sim_corr = sim_corr,
    sim_n = n_sim,
    test_types = c("simes", "simes"),
    test_groups = test_groups,
    verbose = TRUE
  )
  p_sim <- results_calculate_power$details$p_sim
  adjusted_p_results <- p_sim
  adjusted_significance_level_results <- p_sim
  for (j in 1:n_sim) {
    temp <- graph_test_closure(
      graph,
      p_sim[j, ],
      alpha = alpha,
      test_types = c("simes", "simes"),
      test_groups = test_groups,
      test_values = TRUE
    )
    adjusted_p_results[j, ] <- temp$outputs$rejected
    intersection <- unique(temp$test_values$results$Intersection)
    intersection_rej <- intersection
    for (k in 1:length(intersection)) {
      intersection_rej[k] <- any(subset(
        temp$test_values$results,
        Intersection == intersection[k]
      )$Inequality_holds)
    }
    for (k in 1:m) {
      id <- substr(intersection, start = k, stop = k) == "1"
      adjusted_significance_level_results[j, k] <-
        suppressWarnings(all(intersection_rej[id]))
    }
  }
  results_adjusted_p <- colMeans(adjusted_p_results)
  names(results_adjusted_p) <- names(temp$outputs$rejected)
  # Matching power using the adjusted p-value approach
  out_adjusted_p[i] <- all.equal(
    results_calculate_power$power$power_local,
    results_adjusted_p,
    tolerance = 1 / n_sim
  )
  results_significance_level <- colMeans(adjusted_significance_level_results)
  names(results_significance_level) <- names(temp$outputs$rejected)
  # Matching power using the adjusted significance level approach
  out_adjusted_significance_level[i] <- all.equal(
    results_calculate_power$power$power_local,
    results_significance_level,
    tolerance = 1 / n_sim
  )
}
out <- cbind(out_adjusted_p, out_adjusted_significance_level)
colnames(out) <- c("adjusted_p", "adjusted_significance_level")
write.csv(
  out,
  here::here("vignettes/internal-validation_simes.csv"),
  row.names = FALSE
)
```
```{r parametric, include = FALSE, eval = FALSE}
n_graph <- 1000
m <- 4
alpha <- 0.025
n_sim <- 1e2
sim_corr <- matrix(0.5, m, m)
diag(sim_corr) <- 1
out_adjusted_p <- rep(NA, n_graph)
out_adjusted_significance_level <- rep(NA, n_graph)
for (i in 1:n_graph) {
  set.seed(1234 + i - 1)
  groups <- sample(1:m)
  test_groups <- list(groups[1:(m / 2)], groups[(m / 2 + 1):m])
  test_corr <- list(
    sim_corr[groups[1:(m / 2)], groups[1:(m / 2)]],
    sim_corr[groups[(m / 2 + 1):m], groups[(m / 2 + 1):m]]
  )
  graph <- random_graph(m)
  marginal_power <- runif(m, 0.5, 0.9)
  results_calculate_power <- graph_calculate_power(
    graph,
    alpha = alpha,
    power_marginal = marginal_power,
    sim_corr = sim_corr,
    sim_n = n_sim,
    test_types = c("parametric", "parametric"),
    test_groups = test_groups,
    test_corr = test_corr,
    verbose = TRUE
  )
  p_sim <- results_calculate_power$details$p_sim
  adjusted_p_results <- p_sim
  adjusted_significance_level_results <- p_sim
  for (j in 1:n_sim) {
    temp <- graph_test_closure(
      graph,
      p_sim[j, ],
      alpha = alpha,
      test_types = c("parametric", "parametric"),
      test_groups = test_groups,
      test_corr = test_corr,
      test_values = TRUE
    )
    adjusted_p_results[j, ] <- temp$outputs$rejected
    intersection <- unique(temp$test_values$results$Intersection)
    intersection_rej <- intersection
    for (k in 1:length(intersection)) {
      intersection_rej[k] <- any(subset(
        temp$test_values$results,
        Intersection == intersection[k]
      )$Inequality_holds)
    }
    for (k in 1:m) {
      id <- substr(intersection, start = k, stop = k) == "1"
      adjusted_significance_level_results[j, k] <-
        suppressWarnings(all(intersection_rej[id]))
    }
  }
  results_adjusted_p <- colMeans(adjusted_p_results)
  names(results_adjusted_p) <- names(temp$outputs$rejected)
  # Matching power using the adjusted p-value approach
  out_adjusted_p[i] <- all.equal(
    results_calculate_power$power$power_local,
    results_adjusted_p,
    tolerance = 1 / n_sim
  )
  results_significance_level <- colMeans(adjusted_significance_level_results)
  names(results_significance_level) <- names(temp$outputs$rejected)
  # Matching power using the adjusted significance level approach
  out_adjusted_significance_level[i] <- all.equal(
    results_calculate_power$power$power_local,
    results_significance_level,
    tolerance = 1 / n_sim
  )
}
out <- cbind(out_adjusted_p, out_adjusted_significance_level)
colnames(out) <- c("adjusted_p", "adjusted_significance_level")
write.csv(
  out,
  here::here("vignettes/internal-validation_parametric.csv"),
  row.names = FALSE
)
```

```{r mixed, include = FALSE, eval = FALSE}
n_graph <- 1000
m <- 4
alpha <- 0.025
n_sim <- 1e2
sim_corr <- matrix(0.5, m, m)
diag(sim_corr) <- 1
out_adjusted_p <- rep(NA, n_graph)
out_adjusted_significance_level <- rep(NA, n_graph)
for (i in 1:n_graph) {
  set.seed(1234 + i - 1)
  groups <- sample(1:m)
  test_groups <- list(groups[1:(m / 2)], groups[(m / 2 + 1):m])
  test_corr <- list(
    sim_corr[groups[1:(m / 2)], groups[1:(m / 2)]],
    sim_corr[groups[(m / 2 + 1):m], groups[(m / 2 + 1):m]]
  )
  test_types <- sample(c("bonferroni", "hochberg", "simes"), 2)
  graph <- random_graph(m)
  graph$hypotheses[groups[1:(m / 2)]] <-
    sum(graph$hypotheses[groups[1:(m / 2)]]) / 3
  graph$hypotheses[groups[(m / 2 + 1):m]] <-
    sum(graph$hypotheses[groups[(m / 2 + 1):m]]) / 3
  graph$transitions <- matrix(1 / (m - 1), nrow = m, ncol = m)
  diag(graph$transitions) <- 0
  marginal_power <- runif(m, 0.5, 0.9)
  results_calculate_power <- graph_calculate_power(
    graph,
    alpha = alpha,
    power_marginal = marginal_power,
    sim_corr = sim_corr,
    sim_n = n_sim,
    test_types = test_types,
    test_groups = test_groups,
    verbose = TRUE
  )
  p_sim <- results_calculate_power$details$p_sim
  adjusted_p_results <- p_sim
  adjusted_significance_level_results <- p_sim
  for (j in 1:n_sim) {
    temp <- graph_test_closure(
      graph,
      p_sim[j, ],
      alpha = alpha,
      test_types = test_types,
      test_groups = test_groups,
      test_values = TRUE
    )
    adjusted_p_results[j, ] <- temp$outputs$rejected
    intersection <- unique(temp$test_values$results$Intersection)
    intersection_rej <- intersection
    for (k in 1:length(intersection)) {
      intersection_rej[k] <- any(subset(
        temp$test_values$results,
        Intersection == intersection[k]
      )$Inequality_holds)
    }
    for (k in 1:m) {
      id <- substr(intersection, start = k, stop = k) == "1"
      adjusted_significance_level_results[j, k] <-
        suppressWarnings(all(intersection_rej[id]))
    }
  }
  results_adjusted_p <- colMeans(adjusted_p_results)
  names(results_adjusted_p) <- names(temp$outputs$rejected)
  # Matching power using the adjusted p-value approach
  out_adjusted_p[i] <- all.equal(
    results_calculate_power$power$power_local,
    results_adjusted_p,
    tolerance = 1 / n_sim
  )
  results_significance_level <- colMeans(adjusted_significance_level_results)
  names(results_significance_level) <- names(temp$outputs$rejected)
  # Matching power using the adjusted significance level approach
  out_adjusted_significance_level[i] <- all.equal(
    results_calculate_power$power$power_local,
    results_significance_level,
    tolerance = 1 / n_sim
  )
}
out <- cbind(out_adjusted_p, out_adjusted_significance_level)
colnames(out) <- c("adjusted_p", "adjusted_significance_level")
write.csv(
  out,
  here::here("vignettes/internal-validation_mixed.csv"),
  row.names = FALSE
)
```

```{r parametric-mixed, include = FALSE, eval = FALSE}
n_graph <- 1000
m <- 4
alpha <- 0.025
n_sim <- 1e2
sim_corr <- matrix(0.5, m, m)
diag(sim_corr) <- 1
out_adjusted_p <- rep(NA, n_graph)
out_adjusted_significance_level <- rep(NA, n_graph)
for (i in 1:n_graph) {
  set.seed(1234 + i - 1)
  groups <- sample(1:m)
  test_groups <- list(groups[1:(m / 2)], groups[(m / 2 + 1):m])
  test_corr <- list(sim_corr[groups[1:(m / 2)], groups[1:(m / 2)]], NA)
  test_types <- c(
    "parametric",
    sample(c("bonferroni", "hochberg", "simes"), 1)
  )
  graph <- random_graph(m)
  graph$hypotheses[groups[1:(m / 2)]] <-
    sum(graph$hypotheses[groups[1:(m / 2)]]) / 3
  graph$hypotheses[groups[(m / 2 + 1):m]] <-
    sum(graph$hypotheses[groups[(m / 2 + 1):m]]) / 3
  graph$transitions <- matrix(1 / (m - 1), nrow = m, ncol = m)
  diag(graph$transitions) <- 0
  marginal_power <- runif(m, 0.5, 0.9)
  results_calculate_power <- graph_calculate_power(
    graph,
    alpha = alpha,
    power_marginal = marginal_power,
    sim_corr = sim_corr,
    sim_n = n_sim,
    test_types = test_types,
    test_groups = test_groups,
    test_corr = test_corr,
    verbose = TRUE
  )
  p_sim <- results_calculate_power$details$p_sim
  adjusted_p_results <- p_sim
  adjusted_significance_level_results <- p_sim
  for (j in 1:n_sim) {
    temp <- graph_test_closure(
      graph,
      p_sim[j, ],
      alpha = alpha,
      test_types = test_types,
      test_groups = test_groups,
      test_corr = test_corr,
      test_values = TRUE
    )
    adjusted_p_results[j, ] <- temp$outputs$rejected
    intersection <- unique(temp$test_values$results$Intersection)
    intersection_rej <- intersection
    for (k in 1:length(intersection)) {
      intersection_rej[k] <- any(subset(
        temp$test_values$results,
        Intersection == intersection[k]
      )$Inequality_holds)
    }
    for (k in 1:m) {
      id <- substr(intersection, start = k, stop = k) == "1"
      adjusted_significance_level_results[j, k] <-
        suppressWarnings(all(intersection_rej[id]))
    }
  }
  results_adjusted_p <- colMeans(adjusted_p_results)
  names(results_adjusted_p) <- names(temp$outputs$rejected)
  # Matching power using the adjusted p-value approach
  out_adjusted_p[i] <- all.equal(
    results_calculate_power$power$power_local,
    results_adjusted_p,
    tolerance = 1 / n_sim
  )
  results_significance_level <- colMeans(adjusted_significance_level_results)
  names(results_significance_level) <- names(temp$outputs$rejected)
  # Matching power using the adjusted significance level approach
  out_adjusted_significance_level[i] <- all.equal(
    results_calculate_power$power$power_local,
    results_significance_level,
    tolerance = 1 / n_sim
  )
}
out <- cbind(out_adjusted_p, out_adjusted_significance_level)
colnames(out) <- c("adjusted_p", "adjusted_significance_level")
write.csv(
  out,
  here::here("vignettes/internal-validation_parametric-mixed.csv"),
  row.names = FALSE
)
```