--- title: "Getting Started with PSpower" output: rmarkdown::html_vignette: toc: true toc_depth: 3 vignette: > %\VignetteIndexEntry{Getting Started with PSpower} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include=FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 7, fig.height = 4.5 ) library(PSpower) ``` ## Overview **PSpower** is a design-stage tool for planning observational and randomized studies that estimate treatment effects using propensity score (PS) weighting. It answers: *how large does my study need to be?* Two functions cover the most common outcome types: | Function | Outcome | Default test | |---|---|---| | `power_ps()` | Continuous or binary | Two-sided | | `power_cox()` | Time-to-event (survival) | One-sided | > **This is a prospective planning tool.** > Computing power after data are collected using the observed sample size ("post-hoc power") is uninformative and should not be done. --- ### The sample size formula For all outcome types, the required sample size takes the form $$N = \left\lceil V \cdot \frac{(z_{1-\alpha/k} + z_\beta)^2}{\delta^2} \right\rceil,$$ where $\delta$ is the standardized effect size, $\alpha$ is the significance level, $\beta$ is the target power, and $k = 2$ for a two-sided test or $k = 1$ for a one-sided test. All reported sample sizes are ceiling integers (rounded up to the nearest whole number). The asymptotic variance $V$ is what distinguishes the two functions: - **Continuous/binary** (`power_ps`): $V$ is the variance of the Hájek estimator, determined by $r$, $\phi$, $\rho^2$, and the estimand. - **Survival** (`power_cox`): $V$ is the variance of the PS-weighted partial likelihood estimator, determined by $r$, $d_1$, $d_0$, $\tau_0 = \log(\text{HR})$, and (for observational studies) $\phi$ and the estimand. --- ### Inputs at a glance **Common inputs** *Effect size* — the standardized treatment effect $\delta = \tau / \text{SD}(Y)$ for continuous/binary outcomes, or the log hazard ratio $\tau_0 = \log(\text{HR})$ for survival outcomes. For continuous outcomes, $|\delta| = 0.2$ is considered small; 0.5 moderate; 0.8 large. For survival, $\text{HR} = 0.75$ gives $\tau_0 \approx -0.29$. *Treatment proportion r* — the fraction of participants who receive treatment, $r = \Pr(Z = 1)$. A balanced design uses $r = 0.5$; in observational data, $r$ is estimated from the study population. *Estimand* — the scientific quantity of interest: - **ATE** — average treatment effect across the entire study population (inverse probability weights) - **ATT** — average effect among the treated group (weights for group 1) - **ATC** — average effect among the control group (weights for group 0); operationally equivalent to ATT with group labels swapped ($r \to 1 - r$, group 1 $\leftrightarrow$ group 0). Supported directly by `power_ps()`; for `power_cox()`, re-label groups and use `estimand = "ATT"`. - **ATO** — average effect in the *overlap population* (overlap weights); the most efficient estimand when covariate overlap is limited In a randomized trial all four estimands coincide. **Continuous and binary outcomes** (`power_ps`) *Overlap coefficient φ* — a number in $(0, 1)$ measuring how similar the PS distributions of the two groups are. $\phi = 1$ in a randomized trial; $\phi < 1$ reflects confounding, and sample size increases as $\phi$ decreases. Rule of thumb: $\phi \geq 0.95$ good; $[0.90, 0.95)$ moderate; $[0.85, 0.90)$ poor; $< 0.85$ very poor. $\phi$ can be estimated from a pilot study or any dataset with fitted propensity scores using `overlap_coef()`. When it is not yet known, supplying a vector of plausible values (e.g., $\phi \in \{0.85, 0.90, 0.95\}$) allows $N$ to be computed across the plausible range. *Confounder coefficient ρ²* — the squared correlation between the potential outcome $Y(0)$ and the PS linear predictor. When $\rho^2 = 0$ (the default), the outcome is unrelated to the confounders beyond the treatment indicator. A sensitivity analysis over $\rho^2 \in [0, 0.05]$ is recommended; values above 0.05 are uncommon in practice. **Survival outcomes** (`power_cox`) *Event rates d₁ and d₀* — the probability of observing an event during follow-up in the treated group ($d_1$) and the control group ($d_0$). These can be estimated from registries, pilot data, or published survival curves. If $d_0$ is omitted it is set equal to $d_1$. *Study type* — `"rct"` for a randomized trial or `"obs"` for an observational study. *Overlap coefficient φ* — required when `study_type = "obs"`; the same quantity as in `power_ps()`. See the rule of thumb and estimation guidance under the continuous/binary section above. *Method* — `"robust"` (default) uses the sandwich variance, which accounts for the actual hazard ratio; `"schoenfeld"` is the classical formula derived under a null effect and may overestimate or underestimate the required $N$ relative to the robust method. Available only for randomized trials. --- ## 1. Continuous and binary outcomes: `power_ps()` ### Single scenario Suppose we plan a study with: - Standardized effect $\delta = 0.2$ (e.g., a 5-point difference in a symptom score with SD = 25) - 50% of participants receive treatment ($r = 0.5$) - Estimated overlap $\phi = 0.90$ (moderate) - Target: 80% power, two-sided $\alpha = 0.05$ ```{r ps-single} res <- power_ps( effect_size = 0.2, r = 0.5, phi = 0.9, estimand = "ATE", power = 0.80 ) res ``` ### Multiple configurations Vectors can be supplied to any input; all combinations are computed automatically. **Effect of treatment allocation** The required sample size depends on how evenly participants are split between treatment and control. The following evaluates five values of $r$ across three estimands ($5 \times 3 = 15$ scenarios): ```{r ps-r} res_r <- power_ps( effect_size = 0.2, r = c(0.30, 0.40, 0.50, 0.60, 0.70), phi = 0.90, estimand = c("ATE", "ATT", "ATO"), power = 0.80 ) plot(res_r) ``` $N$ is minimized at $r = 0.5$ for ATE; the ATT and ATC estimands are not symmetric in $r$ — ATT favors a larger treated fraction, and vice versa for ATC. **Effect of covariate overlap** ```{r ps-phi} res_phi <- power_ps( effect_size = 0.2, r = 0.5, phi = c(0.85, 0.90, 0.95), estimand = c("ATE", "ATT", "ATO"), power = 0.80 ) plot(res_phi) ``` ```{r ps-phi-summary} summary(res_phi) ``` The ATO estimand consistently requires fewer participants than ATE, especially under poor overlap. ### Role of the confounder coefficient ρ² The table below shows how $N$ changes across $\rho^2 \in [0, 0.05]$ for ATE with $\phi = 0.90$. A sensitivity analysis over this range is recommended; values above 0.05 are uncommon in practice. ```{r ps-rho2} power_ps( effect_size = 0.2, r = 0.5, phi = 0.90, rho2 = c(0, 0.01, 0.02, 0.05), estimand = "ATE", power = 0.80 ) ``` An upper bound on $\rho^2$ can be obtained from the $R^2$ of a regression of the outcome on the study covariates using historical data. --- ## 2. Survival outcomes: `power_cox()` ### Randomized trial #### Single scenario ```{r cox-rct-single} res_rct <- power_cox( effect_size = log(0.75), # HR = 0.75 r = 0.5, d1 = 0.40, study_type = "rct", power = 0.80 ) res_rct ``` The default method uses the robust sandwich variance, which accounts for the actual hazard ratio. The Schoenfeld formula (derived under a null effect) does not correct for the effect size and may give a larger or smaller $N$: ```{r cox-schoenfeld} power_cox( effect_size = log(0.75), r = 0.5, d1 = 0.40, study_type = "rct", method = "schoenfeld", power = 0.80 ) ``` #### Multiple configurations The following evaluates three hazard ratios and three treatment proportions ($3 \times 3 = 9$ scenarios): ```{r cox-rct-multi} res_rct_grid <- power_cox( effect_size = c(log(0.65), log(0.75), log(0.85)), r = c(0.40, 0.50, 0.60), d1 = 0.40, study_type = "rct", power = 0.80 ) plot(res_rct_grid) ``` ```{r cox-rct-summary} summary(res_rct_grid) ``` ### Observational study For observational studies, the overlap coefficient $\phi$ must also be provided. The ATE estimand uses inverse probability weights; ATO uses overlap weights and is more efficient when overlap is limited. #### Single scenario ```{r cox-obs-single} res_obs <- power_cox( effect_size = log(0.75), r = 0.5, d1 = 0.40, phi = 0.90, study_type = "obs", estimand = "ATE", power = 0.80 ) res_obs ``` #### Multiple configurations ```{r cox-obs-multi} res_obs_grid <- power_cox( effect_size = log(0.75), r = 0.5, d1 = 0.40, phi = c(0.85, 0.90, 0.95), study_type = "obs", estimand = c("ATE", "ATO"), power = 0.80, n_mc = 5e4 # use the default 1e6 in practice ) plot(res_obs_grid) ``` ```{r cox-obs-summary} summary(res_obs_grid) ``` --- ## 3. Estimating propensity score overlap from data If a pilot dataset or a similar published cohort is available, $\phi$ can be estimated directly from fitted propensity scores using `overlap_coef()`. ```{r overlap-empirical} set.seed(2026) n <- 500 X <- rnorm(n) ps <- plogis(0.5 * X) # fitted propensity scores from a PS model Z <- rbinom(n, 1, ps) overlap_coef(ps = ps, Z = Z) ``` When no data are available, use the rule of thumb above and compute $N$ across the plausible range (e.g., $\phi \in \{0.85, 0.90, 0.95\}$), as shown in the examples above. `overlap_coef()` also accepts Beta distribution parameters directly, which is useful when the PS distribution has been summarized in a published paper: ```{r overlap-analytical} overlap_coef(a = 3, b = 2) # r = a/(a+b) = 0.60 ```