---
title: "Non-Survival Endpoints: Continuous, Binary, and Count"
output:
  rmarkdown::html_vignette:
    toc: true
    toc_depth: 3
    number_sections: false
vignette: >
  %\VignetteIndexEntry{Non-Survival Endpoints: Continuous, Binary, and Count}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse   = TRUE,
  comment    = "#>",
  fig.width  = 7,
  fig.height = 4.5,
  out.width  = "100%",
  dpi        = 96
)
library(SingleArmMRCT)
```

This vignette describes Regional Consistency Probability (RCP) calculations for three non-survival endpoint types: **continuous**, **binary**, and **count** (negative binomial). For each endpoint, it gives the statistical model, the treatment effect scale, the closed-form or exact-enumeration formulae, and a worked example.

---

## 1. Continuous Endpoint

### Statistical model

Let $\hat{\mu}_j$ denote the sample mean for Region $j$. Under the assumption that individual observations are independently and identically distributed as $N(\mu, \sigma^2)$ within each region, the regional sample means are:

$$
\hat{\mu}_j \sim N\!\left(\mu,\; \frac{\sigma^2}{N_j}\right), \qquad j = 1, \ldots, J
$$

independently across regions. The treatment effect relative to a historical control mean $\mu_0$ is $\delta = \mu - \mu_0 > 0$.

### Consistency criteria

**Method 1 (Effect Retention):**

$$
\text{RCP}_1 = \Pr\!\left[\,(\hat{\mu}_1 - \mu_0) \geq \pi\,(\hat{\mu} - \mu_0)\,\right]
$$

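Here $\pi$ is the prespecified fraction of the overall effect that Region 1 must retain (argument `PI`). Let $N = \sum_{j} N_j$ be the total sample size, $f_1 = N_1 / N$ the allocation fraction of Region 1, and $\hat{\mu}_{-1}$ the sample mean pooled over regions $2, \ldots, J$, so that $\hat{\mu} = f_1\hat{\mu}_1 + (1 - f_1)\hat{\mu}_{-1}$. Defining $D = (\hat{\mu}_1 - \mu_0) - \pi(\hat{\mu} - \mu_0)$ and substituting for $\hat{\mu}$, the consistency condition becomes:

$$
D = (1 - \pi f_1)\,(\hat{\mu}_1 - \mu_0) - \pi(1 - f_1)\,(\hat{\mu}_{-1} - \mu_0) \geq 0
$$

Under homogeneity (a common mean $\mu$ in all regions), $\hat{\mu}_1$ and $\hat{\mu}_{-1}$ are independent, so: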

$$
E[D] = (1 - \pi)\,\delta, \qquad
\mathrm{Var}(D) = (1 - \pi f_1)^2\,\frac{\sigma^2}{N_1}
                + \bigl[\pi(1 - f_1)\bigr]^2\,\frac{\sigma^2}{N - N_1}
$$
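
The two moments of $D$ can be checked numerically. The following is a minimal base-R sketch, not part of the package: it draws the Region 1 mean and the pooled mean of the remaining regions directly from their normal distributions under homogeneity, using the example values from this vignette. The variable names and the number of replicates are illustrative choices.

```r
set.seed(1)

# Illustrative setting (same values as the worked example in this section)
mu <- 0.5; mu0 <- 0.1; sigma <- 1; PI <- 0.5
Nj <- c(20, 40, 40)
N  <- sum(Nj); N1 <- Nj[1]; f1 <- N1 / N

# Draw the Region 1 mean and the pooled mean of regions 2..J
nsim      <- 20000
mean_1    <- rnorm(nsim, mean = mu, sd = sigma / sqrt(N1))
mean_rest <- rnorm(nsim, mean = mu, sd = sigma / sqrt(N - N1))
mean_all  <- f1 * mean_1 + (1 - f1) * mean_rest

# D as defined in the text
D <- (mean_1 - mu0) - PI * (mean_all - mu0)

# Theoretical moments under homogeneity
mean_theory <- (1 - PI) * (mu - mu0)
var_theory  <- (1 - PI * f1)^2 * sigma^2 / N1 +
  (PI * (1 - f1))^2 * sigma^2 / (N - N1)

rbind(simulated = c(mean = mean(D), var = var(D)),
      theory    = c(mean = mean_theory, var = var_theory))
```

The simulated and theoretical rows should agree up to Monte Carlo error, which shrinks as `nsim` grows.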

Therefore:

$$
\text{RCP}_1 = \Phi\!\left(\frac{(1 - \pi)\,\delta}
  {\sqrt{(1 - \pi f_1)^2\,\sigma^2/N_1
        + \{\pi(1 - f_1)\}^2\,\sigma^2/(N - N_1)}}\right)
$$

**Method 2 (Simultaneous Positivity):**

$$
\text{RCP}_2 = \Pr\!\left[\,\hat{\mu}_j > \mu_0 \;\text{ for all } j\,\right]
= \prod_{j=1}^{J} \Phi\!\left(\frac{\delta\,\sqrt{N_j}}{\sigma}\right)
$$
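
As a hand check, both closed-form expressions can be evaluated with `pnorm()` alone. This is a sketch in base R under the example values used in this vignette; it is independent of the package and the variable names are illustrative.

```r
# Illustrative setting
mu <- 0.5; mu0 <- 0.1; sigma <- 1; PI <- 0.5
Nj <- c(20, 40, 40)
N  <- sum(Nj); N1 <- Nj[1]; f1 <- N1 / N
delta <- mu - mu0

# Method 1: Phi(E[D] / sqrt(Var(D)))
mean_D <- (1 - PI) * delta
var_D  <- (1 - PI * f1)^2 * sigma^2 / N1 +
  (PI * (1 - f1))^2 * sigma^2 / (N - N1)
rcp1 <- pnorm(mean_D / sqrt(var_D))

# Method 2: product over regions of Phi(delta * sqrt(N_j) / sigma)
rcp2 <- prod(pnorm(delta * sqrt(Nj) / sigma))

c(Method1 = rcp1, Method2 = rcp2)
```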

### Example

Setting: $\mu = 0.5$, $\mu_0 = 0.1$, $\sigma = 1$, $\pi = 0.5$, and $N = 100$ split over $J = 3$ regions as $(N_1, N_2, N_3) = (20, 40, 40)$.

```{r}
result_f <- rcp1armContinuous(
  mu       = 0.5,
  mu0      = 0.1,
  sd       = 1,
  Nj       = c(20, 40, 40),
  PI       = 0.5,
  approach = "formula"
)
print(result_f)
```

```{r}
result_s <- rcp1armContinuous(
  mu       = 0.5,
  mu0      = 0.1,
  sd       = 1,
  Nj       = c(20, 40, 40),
  PI       = 0.5,
  approach = "simulation",
  nsim     = 10000,
  seed     = 1
)
print(result_s)
```

### Visualisation

```{r fig.alt="Line plot of RCP versus f1 for a continuous endpoint with mu = 0.5, mu0 = 0.1, sigma = 1, showing Method 1 and Method 2 across N = 20, 40, 100"}
plot_rcp1armContinuous(
  mu        = 0.5,
  mu0       = 0.1,
  sd        = 1,
  PI        = 0.5,
  N_vec     = c(20, 40, 100),
  J         = 3,
  nsim      = 5000,
  seed      = 1,
  base_size = 8
)
```

---

## 2. Binary Endpoint

### Statistical model

Let $Y_j$ denote the number of responders in Region $j$. Under independent Bernoulli trials with a common response rate $p$:

$$
Y_j \sim \mathrm{Binomial}(N_j,\; p), \qquad j = 1, \ldots, J
$$

independently across regions. The regional response rate estimator is $\hat{p}_j = Y_j / N_j$, the overall estimator is $\hat{p} = \sum_j Y_j / N$ with $N = \sum_j N_j$, and the treatment effect relative to a historical control response rate $p_0$ is $\delta = p - p_0 > 0$.

### Consistency criteria

**Method 1 (Effect Retention) — Exact Enumeration:**

$$
\text{RCP}_1 = \Pr\!\left[\,(\hat{p}_1 - p_0) \geq \pi\,(\hat{p} - p_0)\,\right]
$$

By the additivity of independent binomials, $Y_{-1} = \sum_{j \geq 2} Y_j \sim \mathrm{Binomial}(N - N_1,\; p)$, independently of $Y_1$. The formula approach (`approach = "formula"`) enumerates all pairs $(y_1, y_{-1}) \in \{0, \ldots, N_1\} \times \{0, \ldots, N - N_1\}$ and sums the joint probabilities of those pairs that satisfy the consistency condition:

$$
\text{RCP}_1 = \sum_{y_1=0}^{N_1} \sum_{y_{-1}=0}^{N-N_1}
  b(y_1;\,N_1,\,p)\;b(y_{-1};\,N{-}N_1,\,p)
  \cdot \mathbf{1}\!\left[\frac{y_1}{N_1} - p_0 \geq \pi\!\left(\frac{y_1+y_{-1}}{N} - p_0\right)\right]
$$

where $b(y;\,n,\,p) = \binom{n}{y}p^y(1-p)^{n-y}$.
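
The double sum can be sketched in base R with `outer()` and `dbinom()`. This is an illustration under the example values of this vignette, not the package's internal code. The small tolerance `tol` is an assumption added here so that exact ties in the indicator are not lost to floating-point rounding; how the package treats such ties is not stated in this vignette.

```r
# Illustrative setting
p <- 0.5; p0 <- 0.2; PI <- 0.5
Nj <- c(20, 40, 40)
N  <- sum(Nj); N1 <- Nj[1]

y1     <- 0:N1         # responders in Region 1
y_rest <- 0:(N - N1)   # responders pooled over regions 2..J

# Joint probabilities b(y1; N1, p) * b(y_rest; N - N1, p)
joint <- outer(dbinom(y1, N1, p), dbinom(y_rest, N - N1, p))

# Indicator of the consistency condition; tol guards against rounding at ties
tol <- 1e-9
consistent <- outer(y1, y_rest, function(a, b) {
  (a / N1 - p0) >= PI * ((a + b) / N - p0) - tol
})

rcp1 <- sum(joint[consistent])
rcp1
```

For larger $N$ the grid has $(N_1 + 1)(N - N_1 + 1)$ cells, so the enumeration stays cheap for typical trial sizes.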

**Method 2 (Simultaneous Positivity):**

The condition $\hat{p}_j > p_0$ is equivalent to $Y_j \geq y_{j,\min}$ where $y_{j,\min} = \lfloor N_j p_0 \rfloor + 1$. Denoting by $F_{\mathrm{Bin}(n,\,p)}(k)$ the CDF of the binomial distribution with parameters $n$ and $p$ evaluated at $k$:

$$
\text{RCP}_2 = \prod_{j=1}^{J} \left[1 - F_{\mathrm{Bin}(N_j,\,p)}(y_{j,\min} - 1)\right]
$$
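
A base-R sketch of this product, again under the example values of this vignette and independent of the package, uses the upper tail of `pbinom()`. Note that `floor(Nj * p0)` is computed in floating point, so values of $N_j p_0$ that are integers in exact arithmetic deserve a quick check.

```r
# Illustrative setting
p <- 0.5; p0 <- 0.2
Nj <- c(20, 40, 40)

# Smallest responder count with p_hat_j > p0
y_min <- floor(Nj * p0) + 1

# Pr(Y_j >= y_min) = Pr(Y_j > y_min - 1), multiplied over regions
rcp2 <- prod(pbinom(y_min - 1, size = Nj, prob = p, lower.tail = FALSE))

y_min
rcp2
```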

### Example

Setting: $p = 0.5$, $p_0 = 0.2$, $\pi = 0.5$, and $N = 100$ split over $J = 3$ regions as $(N_1, N_2, N_3) = (20, 40, 40)$.

```{r}
result_f <- rcp1armBinary(
  p        = 0.5,
  p0       = 0.2,
  Nj       = c(20, 40, 40),
  PI       = 0.5,
  approach = "formula"
)
print(result_f)
```

```{r}
result_s <- rcp1armBinary(
  p        = 0.5,
  p0       = 0.2,
  Nj       = c(20, 40, 40),
  PI       = 0.5,
  approach = "simulation",
  nsim     = 10000,
  seed     = 1
)
print(result_s)
```

### Visualisation

```{r fig.alt="Line plot of RCP versus f1 for a binary endpoint with p = 0.5, p0 = 0.2, showing Method 1 and Method 2 across N = 20, 40, 100"}
plot_rcp1armBinary(
  p         = 0.5,
  p0        = 0.2,
  PI        = 0.5,
  N_vec     = c(20, 40, 100),
  J         = 3,
  nsim      = 5000,
  seed      = 1,
  base_size = 8
)
```

---

## 3. Count Endpoint (Negative Binomial)

### Statistical model

Count data are modelled by the negative binomial distribution. The total event count in Region $j$ is:

$$
Y_j \sim \mathrm{NB}\!\left(\mu = N_j\,\lambda,\;\; \mathrm{size} = N_j\,\phi\right), \qquad j = 1, \ldots, J
$$

independently across regions, where $\lambda > 0$ is the expected count per patient under the alternative and $\phi > 0$ is the per-patient dispersion (size) parameter, so that $\mathrm{Var}(Y_j) = N_j\lambda + N_j\lambda^2/\phi$. The regional rate estimator is $\hat{\lambda}_j = Y_j / N_j$, and the treatment effect relative to a historical control rate $\lambda_0$ is expressed as a **rate ratio**:

$$
\widehat{RR}_j = \frac{\hat{\lambda}_j}{\lambda_0}
$$

Benefit is indicated by $RR = \lambda / \lambda_0 < 1$.

By the **reproductive property** of the negative binomial (sums of independent negative binomial counts that share $\lambda$ and $\phi$ are again negative binomial), the pooled count for regions $2, \ldots, J$ follows $\mathrm{NB}(\mu = (N - N_1)\lambda,\; \mathrm{size} = (N - N_1)\phi)$, enabling exact enumeration analogous to the binary case.
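
This additivity can be verified numerically. The sketch below, in base R and independent of the package, convolves the PMFs of two regional counts and compares the result with the PMF of the pooled negative binomial. It uses the `mu`/`size` parameterisation of `dnbinom()`; the region sizes and the upper limit of 300 are illustrative choices.

```r
# Illustrative setting: two regions of 40 patients each
lambda <- 2; phi <- 1
n_a <- 40; n_b <- 40

k <- 0:300
pmf_a <- dnbinom(k, size = n_a * phi, mu = n_a * lambda)
pmf_b <- dnbinom(k, size = n_b * phi, mu = n_b * lambda)

# Convolution: Pr(Y_a + Y_b = m) = sum_i Pr(Y_a = i) * Pr(Y_b = m - i)
pmf_sum <- vapply(
  k,
  function(m) sum(pmf_a[1:(m + 1)] * pmf_b[(m + 1):1]),
  numeric(1)
)

# PMF of the pooled count according to the additivity property
pmf_pooled <- dnbinom(k, size = (n_a + n_b) * phi, mu = (n_a + n_b) * lambda)

max(abs(pmf_sum - pmf_pooled))
```

The maximum absolute difference should be at the level of floating-point rounding.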

### Consistency criteria

**Method 1 (log-RR scale):**

$$
\text{RCP}_{1,\log} = \Pr\!\left[\,\log(\widehat{RR}_1) \leq \pi\,\log(\widehat{RR})\,\right]
$$

Since $RR < 1$ (benefit), $\log(RR) < 0$, so the condition requires $\log(\widehat{RR}_1)$ to be sufficiently negative relative to the overall $\log(\widehat{RR})$.

**Method 1 (linear-RR scale):**

$$
\text{RCP}_{1,\text{lin}} = \Pr\!\left[\,(1 - \widehat{RR}_1) \geq \pi\,(1 - \widehat{RR})\,\right]
$$

Both Method 1 variants use exact enumeration over all $(y_1, y_{-1})$ combinations via the outer product of negative binomial PMFs.
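
A base-R sketch of this enumeration follows, under the example values of this vignette. It is not the package's implementation. Because the negative binomial has unbounded support, the grid must be cut off somewhere: the cut-off at the $1 - 10^{-12}$ quantile and the tolerance `tol` for ties are assumptions made for this illustration.

```r
# Illustrative setting
lambda <- 2; lambda0 <- 3; phi <- 1; PI <- 0.5
Nj <- c(20, 40, 40)
N  <- sum(Nj); N1 <- Nj[1]

# Truncate the unbounded support at an extreme quantile (assumption)
upper_1    <- qnbinom(1 - 1e-12, size = N1 * phi, mu = N1 * lambda)
upper_rest <- qnbinom(1 - 1e-12, size = (N - N1) * phi, mu = (N - N1) * lambda)
y1     <- 0:upper_1
y_rest <- 0:upper_rest

# Joint probabilities via the outer product of the two NB PMFs
joint <- outer(
  dnbinom(y1, size = N1 * phi, mu = N1 * lambda),
  dnbinom(y_rest, size = (N - N1) * phi, mu = (N - N1) * lambda)
)

# Estimated rate ratios on the (y1, y_rest) grid
rr_1   <- outer(y1, y_rest, function(a, b) (a / N1) / lambda0)
rr_all <- outer(y1, y_rest, function(a, b) ((a + b) / N) / lambda0)

tol <- 1e-9  # guards against rounding at exact ties
# log(0) = -Inf, so a zero count in Region 1 always satisfies the log-RR condition
rcp1_log <- sum(joint[log(rr_1) <= PI * log(rr_all) + tol])
rcp1_lin <- sum(joint[(1 - rr_1) >= PI * (1 - rr_all) - tol])

c(Method1_logRR = rcp1_log, Method1_linearRR = rcp1_lin)
```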

**Method 2:**

Let $F_{\mathrm{NB}(m,\,s)}(k)$ denote the CDF of the negative binomial distribution with mean $m$ and size $s$, evaluated at $k$. The condition $\widehat{RR}_j < 1$ is equivalent to $Y_j < N_j\lambda_0$. Because $Y_j$ is an integer, this is $Y_j \leq \lceil N_j\lambda_0 \rceil - 1$, which equals $\lfloor N_j\lambda_0 \rfloor$ when $N_j\lambda_0$ is not an integer and $N_j\lambda_0 - 1$ when it is. Therefore:

$$
\text{RCP}_2 = \prod_{j=1}^{J} \Pr\!\left(\widehat{RR}_j < 1\right)
= \prod_{j=1}^{J} F_{\mathrm{NB}(N_j\lambda,\,N_j\phi)}\!\left(\lceil N_j\lambda_0 \rceil - 1\right)
$$
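
The product can be sketched in base R with `pnbinom()`, under the example values of this vignette and independently of the package. The largest count with $Y_j < N_j\lambda_0$ is taken as $\lceil N_j\lambda_0 \rceil - 1$, which covers both integer and non-integer $N_j\lambda_0$.

```r
# Illustrative setting
lambda <- 2; lambda0 <- 3; phi <- 1
Nj <- c(20, 40, 40)

# Largest integer count strictly below N_j * lambda0
y_max <- ceiling(Nj * lambda0) - 1

# Pr(Y_j <= y_max) under NB(mu = N_j * lambda, size = N_j * phi), multiplied over regions
rcp2 <- prod(pnbinom(y_max, size = Nj * phi, mu = Nj * lambda))

y_max
rcp2
```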

### Example

Setting: $\lambda = 2$, $\lambda_0 = 3$, $\phi = 1$, $\pi = 0.5$, and $N = 100$ split over $J = 3$ regions as $(N_1, N_2, N_3) = (20, 40, 40)$.

```{r}
result_f <- rcp1armCount(
  lambda     = 2,
  lambda0    = 3,
  dispersion = 1,
  Nj         = c(20, 40, 40),
  PI         = 0.5,
  approach   = "formula"
)
print(result_f)
```

```{r}
result_s <- rcp1armCount(
  lambda     = 2,
  lambda0    = 3,
  dispersion = 1,
  Nj         = c(20, 40, 40),
  PI         = 0.5,
  approach   = "simulation",
  nsim       = 10000,
  seed       = 1
)
print(result_s)
```

The output reports three RCP values: Method 1 on the log-RR scale (`Method1_logRR`), Method 1 on the linear-RR scale (`Method1_linearRR`), and Method 2 (`Method2`).

### Visualisation

The count endpoint plot uses a grid layout: facet rows distinguish the two Method 1 scales (log-RR and linear-RR, i.e. $1 - RR$), and facet columns correspond to the total sample sizes in `N_vec`.

```{r fig.height=6, fig.alt="Grid plot of RCP versus f1 for a count endpoint with lambda = 2, lambda0 = 3, showing Method 1 on log-RR and linear-RR scales and Method 2 across N = 20, 40, 100"}
plot_rcp1armCount(
  lambda     = 2,
  lambda0    = 3,
  dispersion = 1,
  PI         = 0.5,
  N_vec      = c(20, 40, 100),
  J          = 3,
  nsim       = 5000,
  seed       = 1,
  base_size  = 11
)
```

---

## Summary

```{r echo=FALSE}
tbl <- data.frame(
  Endpoint   = c("Continuous", "Binary", "Count"),
  Model      = c("Normal", "Binomial", "Negative binomial"),
  `Effect parameter` = c(
    "$\\delta = \\mu - \\mu_0$",
    "$\\delta = p - p_0$",
    "$\\log(RR) = \\log(\\lambda/\\lambda_0)$ (Method 1, log-RR scale); $1 - RR = 1 - \\lambda/\\lambda_0$ (Method 1, linear-RR scale)"
  ),
  `Benefit direction` = c(
    "$\\hat{\\mu}_j > \\mu_0$",
    "$\\hat{p}_j > p_0$",
    "$\\widehat{RR}_j < 1$"
  ),
  `Method 1 computation` = c(
    "Closed-form (normal distribution)",
    "Exact enumeration (binomial)",
    "Exact enumeration (negative binomial)"
  ),
  `Method 2 computation` = c(
    "Product of normal tail probabilities",
    "Product of binomial tail probabilities",
    "Product of NB tail probabilities"
  ),
  check.names = FALSE
)
knitr::kable(tbl, align = "llllll")
```

---

## References

Homma G (2024). Cautionary note on regional consistency evaluation in multiregional clinical trials with binary outcomes. *Pharmaceutical Statistics*, 23(3):385--398. https://doi.org/10.1002/pst.2358