This vignette provides examples of some of the hypothesis tests that can be specified in simr. The function doTest can be used to apply a test to an input model, which lets you check that the test works before running a power simulation. Documentation for the test specification functions can be found in the help system at ?tests.
library(simr)
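As an orientation, the pattern used throughout this vignette is to check a test specification with doTest and then reuse the same specification in powerSim. The sketch below is illustrative only: myModel and x are hypothetical placeholders, not objects defined in this vignette.
# Sketch of the workflow (myModel and x are hypothetical placeholders):
# doTest(myModel, fixed("x", "z"))               # check that the test runs on the fitted model
# powerSim(myModel, fixed("x", "z"), nsim=100)   # reuse the same test to estimate power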
The first example comes from the help page for glmer. The data frame cbpp contains data on contagious bovine pleuropneumonia. An observation-level variable is added to allow for overdispersion. Note that the response is specified using cbind: simr expects a binomial model to be in this form.
cbpp$obs <- 1:nrow(cbpp)
gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd) + (1|obs), data=cbpp,
             family=binomial)
summary(gm1)$coef
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.500290 0.2967076 -5.056460 4.271108e-07
## period2 -1.226498 0.4802941 -2.553640 1.066033e-02
## period3 -1.328831 0.4938985 -2.690495 7.134610e-03
## period4 -1.866248 0.5935804 -3.144052 1.666258e-03
Note that period
is a categorical variable with 4
levels, which enters the model as 3 dummy variables. To test all 3 dummy
variables simultaneously, you can use a likelihood ratio test.
doTest(gm1, fixed("period", "lr"))
## boundary (singular) fit: see help('isSingular')
## p-value for predictor 'period': 0.003150249
## --------------------
## Test: Likelihood ratio
If you were (for some reason) especially interested in the significance of the dummy variable period2, you could use a z-test. This test uses the value Pr(>|z|) reported in the summary above.
doTest(gm1, fixed("period2", "z"))
## p-value for predictor 'period2': 0.01066033
## --------------------
## Test: z-test
## Effect size for period2 is -1.2
Suppose your model also has a continuous predictor. You can use
fixed
to choose which fixed effect to apply tests to.
gm2 <- glmer(cbind(incidence, size - incidence) ~ period + size + (1 | herd), data=cbpp,
             family=binomial)
doTest(gm2, fixed("size", "z"))
## p-value for predictor 'size': 0.8633872
## --------------------
## Test: z-test
## Effect size for size is 0.0045
Once you have chosen your tests, you can run a power analysis by replacing doTest with powerSim. Don't forget to specify an appropriate effect size.
fixef(gm2)["size"] <- 0.05
powerSim(gm2, fixed("size", "z"), nsim=50)
## Power for predictor 'size', (95% confidence interval):
## 52.00% (37.42, 66.34)
##
## Test: z-test
## Effect size for size is 0.050
##
## Based on 50 simulations, (0 warnings, 0 errors)
## alpha = 0.05, nrow = 112
##
## Time elapsed: 0 h 0 m 8 s
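The nsim argument controls how precise the power estimate is: 50 simulations give the fairly wide confidence interval shown above. As a hedged example (the exact interval and run time will differ), a larger nsim narrows the interval, and the seed argument makes the run reproducible.
# more simulations -> narrower confidence interval, but a longer run time;
# seed makes the simulation reproducible
powerSim(gm2, fixed("size", "z"), nsim=1000, seed=1)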
As your models become more complex, it can be easier to explicitly
specify your null hypothesis using the compare
functions.
This example uses the cake
dataset.
fm1 <- lmer(angle ~ recipe * temp + (1|recipe:replicate), data=cake, REML=FALSE)
Main effects should not be tested when they appear in an interaction
term. Using the fcompare
function, we can specify a
comparison with a simpler model (without having to re-type the random
effects specification).
doTest(fm1, fcompare(~ recipe + temp))
## p-value for model comparison: 0.9586321
## --------------------
## Test: Likelihood ratio
## Comparison to ~recipe + temp + [re]
This also works for polynomial terms:
fm2 <- lmer(angle ~ recipe + poly(temp, 2) + (1|recipe:replicate), data=cake, REML=FALSE)
summary(fm2)$coef
## Estimate Std. Error t value
## (Intercept) 33.122222 1.677940 19.7398074
## recipeB -1.477778 2.372966 -0.6227555
## recipeC -1.522222 2.372966 -0.6414850
## poly(temp, 2)1 44.347549 4.537142 9.7743351
## poly(temp, 2)2 -2.586135 4.537142 -0.5699921
doTest(fm2, fcompare(~ recipe + temp))
## p-value for model comparison: 0.5688225
## --------------------
## Test: Likelihood ratio
## Comparison to ~recipe + temp + [re]
We can do similar things with the budworm
data in the
pbkrtest
package.
data(budworm, package="pbkrtest")
bw1 <- glm(cbind(ndead, ntotal-ndead) ~ dose*sex, family="binomial", data=budworm)
summary(bw1)$coef
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.7157796 0.32233098 -5.3230368 1.020491e-07
## dose 0.1156765 0.02378865 4.8626745 1.158102e-06
## sexmale -0.2119351 0.51523286 -0.4113384 6.808244e-01
## dose:sexmale 0.1815579 0.06691613 2.7132151 6.663383e-03
Of course we don't want to retype the cbind boilerplate: the dot on the left-hand side of the comparison formula stands in for the original response.
doTest(bw1, compare(. ~ dose + sex))
## p-value for model comparison: 0.001741539
## --------------------
## Test: Likelihood ratio
## Comparison to . ~ dose + sex
Since dose is continuous and sex is binary, we could also use a z-test on the single interaction term.
doTest(bw1, fixed("dose:sexmale", "z"))
## p-value for predictor 'dose:sexmale': 0.006663383
## --------------------
## Test: z-test
The random
function gives you access to tests from the
RLRsim
package. No additional arguments are needed for a
single random effect.
re1 <- lmer(Yield ~ 1|Batch, data=Dyestuff)
doTest(re1, random())
## p-value for a single random effect: 0.0036
## --------------------
## Test: Exact restricted LRT (package RLRsim)
Where the model has multiple random effects, compare
can
be used to test alternative specifications.
fm1 <- lmer(Reaction ~ Days + (Days | Subject), data=sleepstudy)
doTest(fm1, compare( ~ Days + (1 | Subject)))
## p-value for model comparison: 7.072413e-10
## --------------------
## Test: Likelihood ratio
## Comparison to ~Days + (1 | Subject)
The LRT is fast but only approximate. In fact, when testing random effects, the test is conservative¹ because the null hypothesis is at a boundary of the parameter space. This means that you will underestimate power if you use the LRT. For more accurate results you can use compare with a parametric bootstrap test from the pbkrtest package. These tests can be quite slow, so you may want to use the LRT to explore designs, and then double-check with the parametric bootstrap.
doTest(fm1, compare( ~ Days + (1 | Subject), "pb"))
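As with the other tests, this comparison can be passed to powerSim once it has been checked. The call below is a sketch rather than an evaluated example: parametric bootstrap tests are slow, so a small nsim is used for a first look.
# parametric bootstrap power analysis (slow); increase nsim for a final run
powerSim(fm1, compare(~ Days + (1 | Subject), "pb"), nsim=10)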
Note that the shortcut rcompare
saves you from retyping
the fixed effect specification.
doTest(fm1, rcompare( ~ (1 | Subject), "pb"))
During a simulation study, some iterations may fail because an error is raised during fitting or testing. When this happens, simr treats that iteration as a failed (i.e. not significant) test. In the following example there are 50 simulations, with 14 successes, 34 failures, and 2 non-results. The power is calculated as 14/50, i.e. 28%:
binFit <- glm(formula = cbind(z, 10 - z) ~ x + g, family = binomial, data = simdata)

poisSim <- glm(formula = z ~ x + g, family = poisson, data = simdata)
coef(poisSim)[1:2] <- c(1, -0.05)
powerSim(binFit, sim=poisSim, nsim=50, seed=1)
## Power for predictor 'x', (95% confidence interval):
## 28.00% (16.23, 42.49)
##
## Test: z-test
##
## Based on 50 simulations, (0 warnings, 2 errors)
## alpha = 0.05, nrow = 30
##
## Time elapsed: 0 h 0 m 0 s
Rather than interrupting part-way through an analysis, simr traps and logs errors and warnings. You can access these logs using $warnings and $errors. If you didn't assign your analysis to a variable, you can recover it with the lastResult function.
ps <- lastResult()
ps$errors
## stage index message
## 1 Fitting 14 Value 1.04545 out of range (0, 1)
## 2 Fitting 33 Value 1.13636 out of range (0, 1)
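The $warnings log is accessed in the same way; in this run it is empty, since 0 warnings were reported above.
ps$warnings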
¹ See, e.g., Pinheiro, J.C. & Bates, D.M. (2000) Mixed-Effects Models in S and S-PLUS, Springer, New York (p. 84).