---
title: "unitizeR - Reproducible Tests"
author: "Brodie Gaslam"
output:
    rmarkdown::html_vignette:
        toc: true
        css: styles.css

vignette: >
  %\VignetteIndexEntry{4 - Reproducible Tests}
  %\VignetteEngine{knitr::rmarkdown}
  %\usepackage[utf8]{inputenc}
---

## Managing State

### Reproducibility

R's emphasis on avoiding side effects generally means that if you run the same R
code more than once you can be relatively certain that you will get the same
result each time.  While this is generally true, there are some exceptions.  If
you evaluate:
```
x <- x + 5
```
on the command line, the result will depend on what the value of `x` was in the
workspace prior to evaluation.  Since workspaces are littered with objects from
day to day R use tests are better run elsewhere to avoid conflicts with those
objects.

There are even more subtle factors that can affect test evaluation.  For
example, if `x` is an S3 object, the packages loaded on the search path could
affect the result of the command.  Global options could also affect the outcome.

Here is a non-exhaustive list of aspects of state that might affect test
outcomes:

1. Workspace / Evaluation Environment.
1. Random seed.
1. Working directory.
1. Search path.
1. Global options.
1. Loaded namespaces.
1. System time.
1. System variables.
1. Locale.
1. etc.

Ideally a unit testing framework would nullify these environmental factors such
that the only changes in test evaluation are caused by changes in the code that
is being tested.  `unitizer` provides functionality that sets session state to
known "clean" values ahead of the evaluation of each test.  Currently `unitizer`
attempts to manage the first six aspects of state listed above.

**In order to comply with CRAN policies state management is turned off by
default.**

### Batch Evaluation and Deferred Review

`unitizer` batch processes all the tests when it is first run before it breaks
into interactive mode.  It does this to:

1. Display useful summary data (how many tests passed/failed in which
   sections), which is often helpful to know before beginning to debug.
2. Allow time consuming process to run unattended so that the interactive test
   review process is not interrupted by slow tests.

The batch-evaluate-and-review-later creates the need for a mechanism to recreate
state for when we review the tests.  Imagine trying to figure out why a test
failed when all the variables may have been changed by subsequent tests.
`unitizer` will always recreate the state of the variables defined by the test
scripts, and can optionally recreate other aspects of state provided that is
enabled.

### Enabling State Management

You can turn on the "suggested" state management level to manage the first
four elements of state listed in the previous section.  To do so,
use `unitize(..., state='suggested')` or `options(unitizer.state='suggested')`.
Be sure to read `?unitizerState` before you enable this setting as there are
cases when state management may not work.

## Workspace And Evaluation Environments

### Test Environments

In order to allow review of each test in its original evaluation environment,
each test is evaluated in a separate environment. Each of these environments has
for parent the environment of the previous test.  This means that a test has
access to all the objects created/used by earlier tests, but not objects
created/used by subsequent tests.  When a later test "modifies" an existing
object, the existing object is not really modified; rather, the test creates a
new object of the same name in the child environment which masks the object in
the earlier test.  This is functionally equivalent to overwriting the object as
far as the later test is concerned.

For the most part this environment trickery should be transparent to the user.
An exception is the masking of `ls` and `traceback` with versions that account
for the special nature of the `unitizer` REPL.  Another is that you can not
remove an object created in an earlier test with `rm` (well, it is possible,
but the how isn't documented and you are advised not to attempt it).  Here is a
more complex exception:


    a <- function() b()
    NULL                 # Prevent `a` and `b` being part of the same test
    b <- function() TRUE
    a()

In this case, when we evaluate `a()` we must step back two environments to find
`a`, but that's okay.  The problem is that once inside `a`, we must now evaluate
`b()`, but `b` is defined in a child environment, not a parent environment so
R's object lookup fails.  If we remove the NULL this would work, but only
because neither the `a` or `b` assignments are tests, so both `a` and `b` would
be assigned to the environment of the `a()` call (see [details on tests
vignette](u2_tests.html)).

If you are getting weird "object not found" errors when you run your tests, but
the same code does not generate those errors when run directly in the command
line, this illusion could be failing you.  In those situations, make sure that
you assign all the variables necessary right ahead of the test so they will all
get stored in the same environment.

### The Parent Environment

In the "suggested" state tracking mode `unitize` will run tests in an
environment that has the same parent as `.GlobalEnv` (`UnitizerEnv` below):
```
             .GlobalEnv
                       \
                        +--> package:x --> ... --> Base
                       /
TestEnv --> UnitizerEnv
```
This means that objects in the global environment / workspace will not affect your tests.

Unfortunately implementing this structure is not trivial because we need to
ensure `UnitizerEnv` stays pointed at the environment just below `.GlobalEnv`
even as tests modify the search path by calling `library`/`attach`/`detach`,
etc.  To achieve this `unitizer` traces `base::library`, `base::attach`, and
`base::detach` **when state tracking is enabled** and **only when `unitizer` is
running**.  Any time any of those functions is called, `unitizer` updates the
parent of `UnitizerEnv` to be the second environment on the search path (i.e.
the parent of `.GlobalEnv`).  So, for example, if a test calls `library(z)`, the
new search path would look like so:

```
             .GlobalEnv
                       \
                        +--> package:y --> package:x --> ... --> Base
                       /
TestEnv --> UnitizerEnv
```

Clearly overriding such fundamental functions such as `library` / `attach` /
`detach` is not good form.  We recognize this, and try to do the overriding in
as lightweight a manner as possible by tracing them only to record the search
path while `unitizer` is evaluating.  This should be completely transparent to
the user.  The untracing is registered to the `on.exit` of `unitize` so the
functions should get untraced even if `unitize` fails.

Aside from the issues raised above, this method is not completely robust.  Any
tests that turn tracing off using `tracingState`, or themselves
`trace`/`untrace` any of `library` / `attach` / `detach` will interfere with
`unitizer`.  If you must do any of the above you should consider specifying a
parent environment for your tests through the `state` parameter to `unitize`
(see `?unitize`).

Some functions that expect to find `.GlobalEnv` on the search path may not work
as expected.  For example, `setClass` uses `topenv` by default to find an
environment to define classes in.  When `setClass` is called at the top level,
this normally results in the class being defined in `.GlobalEnv`, but if
`.GlobalEnv` is not available `setClass` will attempt to define the class in the
first environment on the search path, which will likely be a locked namespace.
You can work around this by specifying an environment in calls to `setClass`.

### Package Namespace as Parent Environment

Sometimes it is convenient to use the namespace of a package as the parent
environment.  This allows you to write tests that use internal package functions
without having to resort to `:::`.  You can set the parent evaluation
environment with the `state` argument to `unitize` / `unitize_dir`.  See
`?unitize` and `?unitizeState`.

If you do use this feature keep in mind that your tests will be directly exposed
to the global environment as well since R looks through the search path starting
at the global environment after looking in the package namespace and imports
(your package code is always exposed to this).

### Issues With Reference Objects

For the most part R is a copy-on-modify language, which allows us to employ the
trickery described above.  There are however "reference" objects that are not
copied when they are modified.  Notable examples include environments, reference
classes, and `data.table`.  Since our trickery requires us to keep copies of
each object in different environments as they are modified, it does not work
with reference objects since they are not automatically duplicated.

The main consequence of this is that when you are reviewing a test that involves
a reference object, the value of that reference object during review will be the
value after the last reference modification, which may have been made after the
test you are reviewing.  The tests will still work as they should, passing if
you did not introduce regressions, and failing otherwise.  However if you review
a failed test you may have a hard time making sense of what happened since the
objects you review will may not have the values they had when the test was
actually run.

### Patchwork Reference Environments

When we review `unitizer` tests, it is possible to end up in a situation where
we wish to update our store by keeping a mix of the new tests as well as some of
the old ones.  This leads to some complications because in order to faithfully
reproduce the environments associated with both the reference and the new tests
we would potentially have to store the entire set of environments produced by
the test script for both the new and reference tests.  Even worse, if we re-run
`unitizer` again, we run the risk of having to store yet another set of
environments (the old reference environments, what were new environments but
became reference ones on this additional run, and the new environments created
by this third run).  The problem continues to grow with as each incremental run
of the `unitizer` script potentially creates the need to store yet another set
of environments.

As a work-around to this problem `unitizer` only keeps the environment
associated with the actual reference tests you chose to keep (e.g. when you type
`N` at the `unitizer` prompt when reviewing a failed test).  `unitizer` then
grafts that test and its environment to the environment chain from the newly
evaluated tests (note that for all tests that pass, we keep the new version of
the tests, not the reference one).  This means that in future `unitizer` runs
where you examine this same reference test, the other "reference" objects
available for inspection may not be from the same evaluation that produced the
test.  The `ls` command will highlight which objects are from the same
evaluation vs which ones are not (see the [discussion on
`ls`](u3_interactive-env.html#ls)).

This is not an ideal outcome, but the compromise was necessary to avoid the
possibility of ever increasing `unitizer` stores.  For more details see
`?"healEnvs,unitizerItems,unitizer-method"`.

## Clean Search Paths

### Description / Implementation

One other way tests can change behavior unexpectedly is if the packages /
objects attached to the search path change.  A simple example is a test script
that relies on package "X", and the user attached that package at some point
during interactive use, but forgot to add the requisite `library` call to the
test script itself.  During testing, the scripts will work fine, but at some
future date if the test scripts are run again they are likely to fail due to the
dependency on the package that is not explicitly loaded in the test scripts.

In the "suggested" state tracking mode `unitizer` runs on a "trimmed" search
path that contains only the packages loaded by in a freshly loaded R session
(i.e. the packages between `package:base` and `package:stats`; see
`?unitizerState`).  You will need to explicitly load packages that your tests
depend on in your test file (e.g. by using `library()`).  `unitize` will restore
the search path to its original state once you complete review.

`unitizer` also relies on tracing `library`/`attach`/`detach` to implement this
feature, so the caveats described [above](#The-Parent-Environment) apply equally
here.  `unitizer` **does not modify the search path itself** other than by using
`library`, `attach`, and `detach`.

When search path tracking is enabled, `unitizer` tracks the versions of the
packages on the search path.  If tests fails and package versions on the search
path have changes since the reference test was stored, you will be alerted.

### Potential Issues

When `unitizer` manipulates the search path it restores the original one by
using `library`/`attach` on any previously detached objects or packages.  This
generally works fine, but detaching and re-attaching packages is not and cannot
be the same as loading a package or attaching an environment for the first time.
For example, S3 method registration is not undone when detaching a package, or
even unloading its namespace.  See discussion in `?detach` and in
`?unitizerState`.

One known problem is the use of `devtools::load_all` and similar which place a
pretend package environment on the search path.  Such packages cannot be
re-loaded with `library` so the re-attach process will fail (see
[#252](https://github.com/brodieG/unitizer/issues/252)).

Another issue is attached environments that contain references to themselves, as
the `tools:rstudio` environment attached by `Rstudio` does.  It contains
functions that have for environment the `tools:rstudio` environment.  The
problem is that once that environment is detached from the search path, those
functions no longer have access to the search path.  Re-attaching the
environment to the search path does not solve the problem because `attach`
attaches a _copy_ of the environment, not the environment itself.  This new
environment will contain the same objects as the original environment, but all
the functions therein will have for environment the original detached
environment, not the copy that is attached to the search path.

For the specific `tools::rstudio` problem we work around the issue by keeping it
on the search path even search path tracking is enabled (you can over-ride this
by changing `search.path.keep`, or, if you have environments on your search path
with similar properties, add their names to `search.path.keep`).  Other options
include re-attaching with `parent.env<-` instead of `attach`, but messing with
the search path in that way seems to be exactly what R core warns about in
`?parent.env`:

> The replacement function parent.env<- is extremely dangerous as it can be used to destructively change environments in ways that violate assumptions made by the internal C code. It may be removed in the near future.

## Global Options

`unitizer` can track and reset global options.  Because many packages set
options when their namespaces are attached, implementation of this feature must
be coordinated with a careful management of loaded namespaces.  For example, we
can reasonably easily set options to be what you would expect in a freshly
loaded vanilla R session, but if some namespaces as otherwise they would be in a
compromised set with their options wiped out.

`unitizer` can manage search paths and namespaces, but unfortunately some
package namespaces cannot be unloaded so options management can be problematic
when such packages are involved (one example is `data.table`).  Because of this
options management is not enabled in the "suggested" state management mode.

Note that no matter what tests are always run with `options(warn=1)` and
`options(error=NULL)`.

See `?unitizer.opts` for more details.

## Random Seed

See `?unitizerState`.

## Working Directory

See `?unitizerState`.