Here we will compare and contrast the two sampling backends that are available to users of this library. There are far more in-depth reviews of when to use which sampler and why (Beraha et al. 2021; Homan et al. 2014), but we can nevertheless add some guidance specific to this library.
I'm going to skip over a lot of the background on why measurement error adjustment is important in dietary exposure assessment, as there are many far more detailed tutorials on the topic. We begin by creating a custom dataset that can be used throughout the entirety of this example. Dietary consumption data is typically characterized by a gamma distribution (Passerelli et al. 2022), so that is the kind of data we will generate. It is very important to note that this library will standardize the data (i.e. transform the consumption distribution toward normality) by default, so please be careful when applying this library to your own dataset.
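For intuition, the z-scale variables (`zMu`, `zSigma`) and the `mean_orig` / `sd_orig` back-transformation in the pre-model code shown later suggest that this standardization is, at least in part, plain centering and scaling. A minimal sketch of that operation (an inference about the internals, not documented behavior):

``` r
# Illustrative only: center and scale a gamma-distributed sample
raw <- rgamma(100, shape = 1, rate = 0.01)
standardized <- (raw - mean(raw)) / sd(raw)
c(mean = mean(standardized), sd = sd(standardized))  # approximately 0 and 1
```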
With that caveat noted, we generate the consumption data:

``` r
x <- rgamma(100, shape = 1, rate = 0.01)
y <- rgamma(100, shape = 3, rate = 0.02)
z <- rgamma(100, shape = 3, rate = 0.3)
```

These will represent random consumption values, and accordingly, we will generate some example output data that is linked to these inputs.
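The code that generated `output` is omitted here. To keep the example reproducible end-to-end, a stand-in is sketched below; the coefficients and noise level are illustrative assumptions, not the values that produced the printed data frame:

``` r
# Hypothetical stand-in for the omitted generating code: a linear
# combination of the consumption values plus Gaussian noise
output <- 30 + 0.02 * x + 0.03 * y - 0.4 * z + rnorm(100, sd = 5)
```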
Thus, the full dataset looks like:
``` r
df <- data.frame(x = x, y = y, z = z, output = output)
head(df, 10)
#>              x         y         z   output
#> 1  193.929578  90.74957  8.662289 32.25831
#> 2   18.041910 338.32089  6.448633 46.03330
#> 3   53.443286  42.33018  3.058837 45.20608
#> 4    2.396523  86.38140  6.098603 25.39488
#> 5   42.777424 380.82160  4.813437 30.38507
#> 6   14.315278 104.60287 11.283213 38.96740
#> 7  294.574967 138.23559  7.384926 33.34672
#> 8   45.663770  97.96775  6.283557 30.57872
#> 9    2.690928 184.23220  5.053584 31.23650
#> 10 233.469657 340.39325  3.089829 37.17521
```
With the dataset in place, we also generate the validity coefficients for each of the consumption variables; these will be combined later into the attenuation matrix.

``` r
x_v_coef <- generate_coefficient(1000, 0.4, 0.7, 0.95)
y_v_coef <- generate_coefficient(1000, 0.5, 0.7, 0.95)
z_v_coef <- generate_coefficient(1000, 0.3, 0.6, 0.95)
```
### Stan vs. JAGS Pre-Model
With the data generated, we can now fit the pre-model to solve for the covariances in the observed data. [Muoka et al. 2021](#References) created a very clever setup in which the JAGS sampler converges quite quickly, because the inverse-Wishart distribution creates a semi-conjugate prior. This of course does not guarantee that the sampled posterior will be fully explored, but it does give the sampler that much better a chance of converging. From an API perspective, the same function used to fit the pre-model with the JAGS backend can be used to fit it with the Stan backend; the only thing that needs to change is the `stan = TRUE` argument.
``` r
jags_output <- acme_model(df, c("x", "y", "z"))
#> Compiling model graph
#> Resolving undeclared variables
#> Allocating nodes
#> Graph information:
#> Observed stochastic nodes: 100
#> Unobserved stochastic nodes: 4
#> Total graph size: 166
#>
#> Initializing model
stan_output <- acme_model(df, c("x", "y", "z"), stan = TRUE)
#>
#> SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
#> Chain 1:
#> Chain 1: Gradient evaluation took 9.9e-05 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.99 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1:
#> Chain 1:
#> Chain 1: Iteration: 1 / 10000 [ 0%] (Warmup)
#> Chain 1: Iteration: 1000 / 10000 [ 10%] (Warmup)
#> Chain 1: Iteration: 1501 / 10000 [ 15%] (Sampling)
#> Chain 1: Iteration: 2500 / 10000 [ 25%] (Sampling)
#> Chain 1: Iteration: 3500 / 10000 [ 35%] (Sampling)
#> Chain 1: Iteration: 4500 / 10000 [ 45%] (Sampling)
#> Chain 1: Iteration: 5500 / 10000 [ 55%] (Sampling)
#> Chain 1: Iteration: 6500 / 10000 [ 65%] (Sampling)
#> Chain 1: Iteration: 7500 / 10000 [ 75%] (Sampling)
#> Chain 1: Iteration: 8500 / 10000 [ 85%] (Sampling)
#> Chain 1: Iteration: 9500 / 10000 [ 95%] (Sampling)
#> Chain 1: Iteration: 10000 / 10000 [100%] (Sampling)
#> Chain 1:
#> Chain 1: Elapsed Time: 0.596 seconds (Warm-up)
#> Chain 1: 3.767 seconds (Sampling)
#> Chain 1: 4.363 seconds (Total)
#> Chain 1:
```

Both models can then be introspected to see the underlying JAGS / Stan model code, along with the samples, to get a better understanding of how the samplers can yield slightly different posteriors.

``` r
jags_output$model
#> JAGS model:
#>
#>
#> model {
#> for (i in 1:n) {
#> model_data[i, 1:p] ~ dmnorm( zMu[1:p] , zInvCovMat[1:p, 1:p] )
#> }
#>
#> for (varIdx in 1:p) {
#> # -- uninformative prior for zMu -- #
#> zMu[varIdx] ~ dnorm(0, 1000000)
#> }
#> zInvCovMat ~ dwish(zRmat[1:p, 1:p], zdf)
#>
#> # -- Convert invCovMat to SD and correlation -- #
#> zCovMat <- inverse(zInvCovMat)
#> for (varIdx in 1:p) {
#> zSigma[varIdx] <- sqrt(zCovMat[varIdx,varIdx])
#> }
#> for (varIdx1 in 1:p) {
#> for (varIdx2 in 1:p) {
#> zRho[varIdx1, varIdx2] <- (
#> zCovMat[varIdx1, varIdx2] / (zSigma[varIdx1] * zSigma[varIdx2])
#> )
#> }
#> }
#>
#> for (varIdx in 1:p) {
#> sigma[varIdx] <- zSigma[varIdx] * sd_orig[, varIdx]
#> mu[varIdx] <- zMu[varIdx] * sd_orig[, varIdx] + mean_orig[, varIdx]
#> }
#> for (varIdx1 in 1:p) {
#> for (varIdx2 in 1:p) {
#> rho[varIdx1, varIdx2] <- zRho[varIdx1, varIdx2]
#> }
#> }
#> }
#>
#> Fully observed variables:
#> mean_orig model_data n p sd_orig zRmat zdf
```

``` r
stan_output$model
#> S4 class stanmodel 'anon_model' coded as follows:
#> data {
#> int<lower=1> n;
#> int<lower=1> p;
#> matrix[n, p] model_data;
#> matrix[p, p] zRmat;
#> real<lower=0> zdf;
#> vector[p] sd_orig;
#> vector[p] mean_orig;
#> }
#> parameters {
#> vector[p] zMu;
#> cov_matrix[p] zInvCovMat;
#> }
#> transformed parameters {
#> matrix[p, p] zCovMat;
#> vector[p] zSigma;
#> matrix[p, p] zRho;
#> // -- Convert invCovMat to SD and correlation --
#> zCovMat = inv(zInvCovMat);
#> for (varIdx in 1:p) {
#> zSigma[varIdx] = sqrt(zCovMat[varIdx, varIdx]);
#> }
#> for (varIdx1 in 1:p) {
#> for (varIdx2 in 1:p) {
#> zRho[varIdx1, varIdx2] = zCovMat[varIdx1, varIdx2] / (zSigma[varIdx1] * zSigma[varIdx2]);
#> }
#> }
#> }
#> model {
#> for (i in 1:n) {
#> model_data[i, 1:p] ~ multi_normal(zMu[1:p], zInvCovMat[1:p, 1:p]);
#> }
#> // -- uninformative prior for zMu --
#> for (i in 1:p) {
#> zMu[i] ~ normal(0, 1000000);
#> }
#> zInvCovMat ~ wishart(zdf, zRmat[1:p, 1:p]);
#> }
#> generated quantities {
#> vector[p] mu;
#> vector[p] sigma;
#> matrix[p, p] rho;
#> // -- transform sd and means back to original scale --
#> for (varIdx in 1:p) {
#> sigma[varIdx] = zSigma[varIdx] * sd_orig[varIdx];
#> mu[varIdx] = zMu[varIdx] * sd_orig[varIdx] + mean_orig[varIdx];
#> }
#> rho = zRho;
#> }
```

To show exactly how the samplers can produce differing results, we can look directly at the summary metrics for the solved covariance structure, as well as at the resulting covariance matrix itself.

``` r
jags_output$covariance_matrix
#>
#> Iterations = 1001:11000
#> Thinning interval = 1
#> Number of chains = 1
#> Sample size per chain = 10000
#>
#> 1. Empirical mean and standard deviation for each variable,
#> plus standard error of the mean:
#>
#> Mean SD Naive SE Time-series SE
#> rho[1,1] 1.00000 1.409e-16 1.409e-18 0.0000000
#> rho[2,1] 0.08985 9.896e-02 9.896e-04 0.0009896
#> rho[3,1] -0.13364 9.668e-02 9.668e-04 0.0009668
#> rho[1,2] 0.08985 9.896e-02 9.896e-04 0.0009896
#> rho[2,2] 1.00000 1.422e-16 1.422e-18 0.0000000
#> rho[3,2] -0.13592 9.711e-02 9.711e-04 0.0009630
#> rho[1,3] -0.13364 9.668e-02 9.668e-04 0.0009668
#> rho[2,3] -0.13592 9.711e-02 9.711e-04 0.0009630
#> rho[3,3] 1.00000 0.000e+00 0.000e+00 0.0000000
#> sigma[1] 111.39497 7.959e+00 7.959e-02 0.0795902
#> sigma[2] 98.21803 7.045e+00 7.045e-02 0.0687691
#> sigma[3] 6.47476 4.622e-01 4.622e-03 0.0046991
#>
#> 2. Quantiles for each variable:
#>
#> 2.5% 25% 50% 75% 97.5%
#> rho[1,1] 1.0000 1.00000 1.00000 1.00000 1.00000
#> rho[2,1] -0.1025 0.02202 0.09061 0.15746 0.28025
#> rho[3,1] -0.3157 -0.20090 -0.13561 -0.06821 0.06043
#> rho[1,2] -0.1025 0.02202 0.09061 0.15746 0.28025
#> rho[2,2] 1.0000 1.00000 1.00000 1.00000 1.00000
#> rho[3,2] -0.3240 -0.20167 -0.13718 -0.06983 0.05588
#> rho[1,3] -0.3157 -0.20090 -0.13561 -0.06821 0.06043
#> rho[2,3] -0.3240 -0.20167 -0.13718 -0.06983 0.05588
#> rho[3,3] 1.0000 1.00000 1.00000 1.00000 1.00000
#> sigma[1] 97.3128 105.78669 110.95773 116.36249 128.58500
#> sigma[2] 85.6659 93.29242 97.70399 102.65356 113.34842
#> sigma[3] 5.6404 6.15341 6.44824 6.76547 7.43727
```

``` r
stan_output$covariance_matrix
#> $summary
#> mean se_mean sd 2.5% 25%
#> sigma[1] 108.821210 6.975823e-02 8.021457e+00 93.59244 103.248954
#> sigma[2] 95.714630 6.163810e-02 6.925195e+00 82.56554 91.014884
#> sigma[3] 6.313615 4.136898e-03 4.553638e-01 5.45204 6.004330
#> rho[1,1] 1.000000 2.539112e-18 1.362537e-16 1.00000 1.000000
#> rho[1,2] 32.033426 2.352995e+01 2.169541e+03 -98.56311 4.069593
#> rho[1,3] -7.156656 3.815892e+00 3.517537e+02 -69.80735 -10.871758
#> rho[2,1] 32.033426 2.352995e+01 2.169541e+03 -98.56311 4.069593
#> rho[2,2] 1.000000 1.807418e-18 1.359980e-16 1.00000 1.000000
#> rho[2,3] 1.447703 6.796757e+00 6.268534e+02 -65.91060 -10.668964
#> rho[3,1] -7.156656 3.815892e+00 3.517537e+02 -69.80735 -10.871758
#> rho[3,2] 1.447703 6.796757e+00 6.268534e+02 -65.91060 -10.668964
#> rho[3,3] 1.000000 1.770896e-18 1.363175e-16 1.00000 1.000000
#> lp__ -150.433843 3.878089e-02 2.191957e+00 -155.64937 -151.687541
#> 50% 75% 97.5% n_eff Rhat
#> sigma[1] 108.714693 114.221931 124.65692 13222.561 0.9998995
#> sigma[2] 95.548497 100.393867 109.50405 12623.086 0.9998825
#> sigma[3] 6.305834 6.615026 7.23390 12116.226 0.9999691
#> rho[1,1] 1.000000 1.000000 1.00000 2879.603 0.9998823
#> rho[1,2] 7.101119 13.628188 106.09897 8501.464 1.0000697
#> rho[1,3] -6.422826 -4.347302 55.68731 8497.380 0.9998825
#> rho[2,1] 7.101119 13.628188 106.09897 8501.464 1.0000697
#> rho[2,2] 1.000000 1.000000 1.00000 5661.715 0.9998823
#> rho[2,3] -6.261771 -4.174223 62.69786 8506.061 0.9998863
#> rho[3,1] -6.422826 -4.347302 55.68731 8497.380 0.9998825
#> rho[3,2] -6.261771 -4.174223 62.69786 8506.061 0.9998863
#> rho[3,3] 1.000000 1.000000 1.00000 5925.399 0.9998823
#> lp__ -150.063881 -148.835957 -147.24029 3194.689 1.0002539
#>
#> $c_summary
#> , , chains = chain:1
#>
#> stats
#> parameter mean sd 2.5% 25% 50%
#> sigma[1] 108.821210 8.021457e+00 93.59244 103.248954 108.714693
#> sigma[2] 95.714630 6.925195e+00 82.56554 91.014884 95.548497
#> sigma[3] 6.313615 4.553638e-01 5.45204 6.004330 6.305834
#> rho[1,1] 1.000000 1.362537e-16 1.00000 1.000000 1.000000
#> rho[1,2] 32.033426 2.169541e+03 -98.56311 4.069593 7.101119
#> rho[1,3] -7.156656 3.517537e+02 -69.80735 -10.871758 -6.422826
#> rho[2,1] 32.033426 2.169541e+03 -98.56311 4.069593 7.101119
#> rho[2,2] 1.000000 1.359980e-16 1.00000 1.000000 1.000000
#> rho[2,3] 1.447703 6.268534e+02 -65.91060 -10.668964 -6.261771
#> rho[3,1] -7.156656 3.517537e+02 -69.80735 -10.871758 -6.422826
#> rho[3,2] 1.447703 6.268534e+02 -65.91060 -10.668964 -6.261771
#> rho[3,3] 1.000000 1.363175e-16 1.00000 1.000000 1.000000
#> lp__ -150.433843 2.191957e+00 -155.64937 -151.687541 -150.063881
#> stats
#> parameter 75% 97.5%
#> sigma[1] 114.221931 124.65692
#> sigma[2] 100.393867 109.50405
#> sigma[3] 6.615026 7.23390
#> rho[1,1] 1.000000 1.00000
#> rho[1,2] 13.628188 106.09897
#> rho[1,3] -4.347302 55.68731
#> rho[2,1] 13.628188 106.09897
#> rho[2,2] 1.000000 1.00000
#> rho[2,3] -4.174223 62.69786
#> rho[3,1] -4.347302 55.68731
#> rho[3,2] -4.174223 62.69786
#> rho[3,3] 1.000000 1.00000
#> lp__ -148.835957 -147.24029
```

As you can see, the sigma values are quite similar, but the rho values vary considerably. This has direct implications for the structure of the resulting output coefficient adjustments. As referenced above, the two samplers explore the posterior space in somewhat different ways, which can lead to divergences between the two results. This should not be viewed as an error, since the posterior here cannot be estimated directly in closed form, but it is important to assess which result makes more sense for your use case. The research team behind this library believes the JAGS sampler performs best for this problem, so it is the default. All results from the Stan sampler should be viewed as experimental and not necessarily supported by the maintainers.
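To quantify the divergence, the posterior means can be compared directly. The accessors below are inferred from the printed output above (a `coda` summary for the JAGS fit, a `$summary` matrix for the Stan fit), so treat this as a sketch rather than documented API:

``` r
# Sketch: difference in posterior means for one off-diagonal correlation.
# Assumes jags_output$covariance_matrix behaves like a coda summary object
# and stan_output$covariance_matrix$summary is the matrix printed above.
jags_means <- jags_output$covariance_matrix$statistics[, "Mean"]
stan_means <- stan_output$covariance_matrix$summary[, "mean"]
jags_means["rho[2,1]"] - stan_means["rho[2,1]"]
```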
With the pre-model fit, we can now assess how the resulting coefficients differ depending on which sampler is used.
``` r
validity_coefficients <- c(x_v_coef, y_v_coef, z_v_coef)

jags_lambda <- attenuation_matrix(
  jags_output,
  c("x", "y", "z"),
  validity_coefficients
)
jags_model_output <- multivariate_model(
  "output ~ x + y + z",
  data = df,
  columns = c("x", "y", "z"),
  a_c_matrix = jags_lambda$matrix,
  sds = jags_lambda$sds,
  variances = jags_lambda$variances,
  univariate = TRUE
)
```

``` r
validity_coefficients <- c(x_v_coef, y_v_coef, z_v_coef)
stan_lambda <- attenuation_matrix(
  stan_output,
  c("x", "y", "z"),
  validity_coefficients,
  stan = TRUE
)
stan_model_output <- multivariate_model(
  "output ~ x + y + z",
  data = df,
  columns = c("x", "y", "z"),
  a_c_matrix = stan_lambda$matrix,
  sds = stan_lambda$sds,
  variances = stan_lambda$variances,
  univariate = TRUE
)
```

Please be aware that the `stan = TRUE` argument must be passed at each step of the adjustment pipeline, as the underlying libraries have slightly different APIs. This prevents unexpected errors if you accidentally pass an `rjags` object to a function expecting a Stan object.
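Because the flag is easy to forget at a single step, it can help to thread the backend choice through one small wrapper. The `run_adjustment` helper below is hypothetical (not part of the library) and only reuses the calls and arguments shown above:

``` r
# Hypothetical convenience wrapper (not part of this library): threads the
# backend flag through every step so JAGS and Stan objects never get mixed
run_adjustment <- function(df, cols, v_coefs, use_stan = FALSE) {
  fit <- acme_model(df, cols, stan = use_stan)
  lambda <- attenuation_matrix(fit, cols, v_coefs, stan = use_stan)
  multivariate_model(
    paste("output ~", paste(cols, collapse = " + ")),
    data = df,
    columns = cols,
    a_c_matrix = lambda$matrix,
    sds = lambda$sds,
    variances = lambda$variances,
    univariate = TRUE
  )
}

stan_model_output <- run_adjustment(df, c("x", "y", "z"), validity_coefficients, use_stan = TRUE)
```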
``` r
jags_plots <- plot_covariates(jags_model_output, c("x", "y", "z"))
#> No id variables; using all as measure variables
#> No id variables; using all as measure variables
#> No id variables; using all as measure variables
jags_plots$x
```

``` r
stan_plots <- plot_covariates(stan_model_output, c("x", "y", "z"))
#> No id variables; using all as measure variables
#> No id variables; using all as measure variables
#> No id variables; using all as measure variables
stan_plots$x
```

Again, the differing pre-model results yield different adjustments for our synthetic dataset. It is worth repeating that one is not necessarily more accurate than the other, but the divergences should be treated as important data points when assessing the measurement error structure of your own dataset.
Finally, the traceplots can be inspected to make sure that both samplers have converged.
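That final check is not shown in this article; a sketch is below, assuming the fitted objects expose their raw draws as a `coda` `mcmc` object and an `rstan` `stanfit` object respectively. Both that assumption and the `$samples` / `$fit` accessor names are hypothetical:

``` r
# Hedged sketch: trace plots for both backends. The $samples and $fit
# accessors are hypothetical; substitute whatever your fitted objects expose.
coda::traceplot(jags_output$samples)  # one trace per monitored parameter
rstan::traceplot(stan_output$fit, pars = c("zMu", "sigma"))
```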