This vignette walks through the core workflow of sssvcqr
on a small synthetic example: simulating a global–local quantile
regression problem, fitting the SS-SVCQR estimator, inspecting the
global-versus-local selection, querying the fitted object through the
standard R model methods, predicting at new locations, tuning the
penalties with spatially blocked cross-validation, and reading the
first-order KKT diagnostics.
The model. SS-SVCQR fits the conditional quantile
\[ Q_\tau(y_i \mid z_i, x_i, u_i) = z_i^\top \alpha + \sum_{j=1}^p x_{ij}\{\beta_{G,j} + \delta_j(u_i)\}, \]
with global controls Z, candidate spatially varying
covariates X, coordinates u, a group penalty
that may shrink an entire deviation vector \(\delta_j\) exactly to zero, and a
graph-Laplacian smoothness penalty over a \(k\)-nearest-neighbor proximity graph. The
companion JSS software paper covers the algorithm and case study in
detail; this vignette is the short interactive walkthrough.
simulate_sssvcqr_data() returns synthetic data on the
unit square with a known true active set, true global-baseline
coefficients, and known deviation fields. The default p = 3
design has x1 and x3 active and
x2 globally constant.
library("sssvcqr")
dat <- simulate_sssvcqr_data(n = 120, q = 2, p = 3, seed = 20260505)
str(dat[c("y", "Z", "X", "u", "active", "alpha_true", "beta_G_true")],
max.level = 1)
#> List of 7
#> $ y : num [1:120] -2.465 -4.413 -0.213 0.329 2.858 ...
#> $ Z : num [1:120, 1:2] -0.775 0.231 0.475 0.641 1.186 ...
#> ..- attr(*, "dimnames")=List of 2
#> $ X : num [1:120, 1:3] 0.952 -1.18 -0.41 0.375 0.789 ...
#> ..- attr(*, "dimnames")=List of 2
#> $ u : num [1:120, 1:2] 0.1745 0.8158 0.5108 0.0156 0.4219 ...
#> ..- attr(*, "dimnames")=List of 2
#> $ active : Named logi [1:3] TRUE FALSE TRUE
#> ..- attr(*, "names")= chr [1:3] "x1" "x2" "x3"
#> $ alpha_true : num [1:2] 1 0.6
#> $ beta_G_true: num [1:3] 1.5 1 0.5
dat$active
#> x1 x2 x3
#> TRUE FALSE TRUEss_svcqr() takes an explicit matrix interface so that
the analyst’s Z/X partition is unambiguous.
The penalties lambda1 and lambda2 control
sparsity (group selection) and graph smoothness, respectively.
fit <- ss_svcqr(
y = dat$y, Z = dat$Z, X = dat$X, u = dat$u,
tau = 0.5,
lambda1 = 5, lambda2 = 0.1, k_nn = 8,
control = list(max_iter = 180, warn_nonconvergence = FALSE)
)
fit
#> Sparse-smooth SVC quantile regression fit
#> n = 120 q = 2 p = 3 tau = 0.5
#> lambda1 = 5 lambda2 = 0.1
#> iterations = 77 converged = TRUEThe fitted summary prints the model size, tuning parameters, ADMM
convergence status, the global block, and the L2 norms of each estimated
deviation field. The deviation norms are the visual signature of
global-versus-local selection: the second deviation field is exactly
zero because the group penalty has correctly identified x2
as global.
summary(fit)
#> Sparse-smooth SVC quantile regression summary
#> n = 120 q = 2 p = 3 tau = 0.5
#> lambda1 = 5 lambda2 = 0.1
#> iterations = 77 converged = TRUE
#>
#> alpha:
#> [1] 1.0825977 0.8039034
#> beta_G:
#> [1] 1.3700378 0.9478852 0.2285181
#> delta L2 norms:
#> [1] 1.059795 0.000000 1.176414
selection_recovery_table(fit, dat)
#> covariate_index covariate true_active true_deviation_norm
#> 1 1 x1 TRUE 11.18681
#> 2 2 x2 FALSE 0.00000
#> 3 3 x3 TRUE 20.47177
#> estimated_deviation_norm selected_active
#> 1 1.059795 TRUE
#> 2 0.000000 FALSE
#> 3 1.176414 TRUEStandard R methods are wired through the sssvcqr class
so that fitted values and residuals can be extracted directly, and
coef() returns a list with the global block and the
deviation matrix.
The plot() method renders the fitted deviation field,
the local total coefficient surface, residuals, or ADMM convergence
traces. By default the spatial plots use an inverse-distance-weighted
gridded surface over the first two coordinate columns, with a diverging
color scale.
predict() either returns fitted conditional quantiles
(the default) or local coefficient surfaces
(type = "coefficients"). For new locations the deviation is
extrapolated by inverse-distance-weighted averaging of the \(k\) nearest training deviations.
unew <- dat$u[1:3, , drop = FALSE] + 0.01
round(predict(fit,
Znew = dat$Z[1:3, , drop = FALSE],
Xnew = dat$X[1:3, , drop = FALSE],
unew = unew), 4)
#> [1] -1.4124 -3.4994 -0.1799
round(predict(fit, type = "coefficients")[1:3, ], 4)
#> [,1] [,2] [,3]
#> [1,] 1.2779 0.9479 0.2632
#> [2,] 1.4987 0.9479 0.2482
#> [3,] 1.4155 0.9479 0.2167cv_ss_svcqr() evaluates a \((\lambda_1,\lambda_2)\) grid using spatial
folds (k-means clusters on the coordinate matrix by default), so that
nearby observations are not split between training and validation. The
example below uses a small grid and three folds for speed; empirical
applications should use a broader grid and more iterations.
dat_cv <- simulate_sssvcqr_data(n = 80, q = 2, p = 3, seed = 20260505)
cv <- cv_ss_svcqr(
y = dat_cv$y, Z = dat_cv$Z, X = dat_cv$X, u = dat_cv$u,
tau = 0.5,
lambda1_seq = c(1, 2),
lambda2_seq = c(0.5, 1),
K_folds = 3, adaptive_weights = FALSE,
control = list(max_iter = 25, warn_nonconvergence = FALSE)
)
cv
#> Spatially blocked CV for SS-SVCQR
#> tau = 0.5
#> best lambda1 = 2
#> best lambda2 = 0.5
#> best mean check loss = 0.9822423
cv$best
#> $lambda1
#> [1] 2
#>
#> $lambda2
#> [1] 0.5
#>
#> $cv_mean
#> [1] 0.9822423
#>
#> $cv_sd
#> [1] 0.7556301kkt_sssvcqr() reports the first-order conditions of the
SS-SVCQR objective: gradient norms for the unpenalized global block,
stationarity residuals for nonzero deviation groups, zero-group margins
for groups shrunk to zero, and the maximum violation of the
per-component degree-weighted centering constraints. Centering
violations should be at the level of solver round-off when the fit has
converged.
?ss_svcqr,
?cv_ss_svcqr, ?build_graph_laplacian,
?kkt_sssvcqr) document each argument and return value in
full.