This vignette shows the basic workflow of msPCA on the
built-in mtcars dataset. We compute sparse principal
components, inspect the solution with the print() and
summary() S3 methods, and compare the sparse result with
dense PCA.
We work with the correlation matrix of mtcars and ask
for two 4-sparse principal components under the default orthogonality
constraint.
print() shows the sparse loading matrix restricted to
the union of all active variables, together with the percentage of
variance explained and the number of non-zero loadings per
component.
print(res, Sigma)
#>
#> msPCA solution: 2 sparse PCs
#> Pct. variance explained: 32.45835 27.98031
#> Non-zero loadings per PC: 4 4
#>
#> Sparse PCs
#> [,1] [,2]
#> mpg 0.4994877 0.0000000
#> cyl -0.4952709 0.0000000
#> disp -0.5096594 0.0000000
#> hp 0.0000000 -0.5180512
#> wt -0.4954454 0.0000000
#> qsec 0.0000000 0.5056635
#> vs 0.0000000 0.4935970
#> carb 0.0000000 -0.4819641summary() gives a fuller breakdown: a per-PC table of
variance explained and sparsity, followed by the pairwise feasibility
violation matrix so you can see at a glance how well the orthogonality
constraint is satisfied between each pair of components.
summary(res, Sigma)
#>
#> msPCA summary: 2 sparse PC(s)
#> Input type : Sigma
#> Runtime (s) : 0.018
#>
#> Per-component statistics:
#> PC nonzero variance fve cumulative_fve
#> PC1 4 3.570419 0.3245835 0.3245835
#> PC2 4 3.077834 0.2798031 0.6043866
#>
#> Pairwise feasibility violations (upper triangle):
#> PC1 PC2
#> PC1 . 0
#> PC2 . .
#> Total: 0e+00By default (type = "Sigma") the first argument is a
covariance/correlation matrix. Set type = "X" to pass the
raw data matrix instead (rows are observations, columns are variables).
With type = "X", msPCA applies the algorithm to the data
directly: each matrix–vector product \(\Sigma
\beta = X^\top (X \beta) / (n - 1)\) is computed without ever
forming the \(p \times p\) matrix. This
is mathematically equivalent but more scalable when \(p \gg n\).
The preprocessing arguments control which matrix is implicitly used:
center (default TRUE) subtracts column
means.scale (default FALSE) divides by column
standard deviations; set to TRUE to operate on the
correlation matrix.divisor selects the normalization: "n-1"
(default, matching cov/cor) or
"n".With scale = TRUE and divisor = "n-1", the
raw-data call reproduces the correlation-matrix result above. When
type = "X", the variance figures are stored inside the
object, so C need not be passed to print() or
summary().
X <- as.matrix(mtcars)
set.seed(42)
res_X <- mspca(X, r = 2, ks = c(4, 4), type = "X", scale = TRUE, verbose = FALSE)
print(res_X)
#>
#> msPCA solution: 2 sparse PCs
#> Pct. variance explained: 32.45835 27.98031
#> Non-zero loadings per PC: 4 4
#>
#> Sparse PCs
#> [,1] [,2]
#> mpg 0.4994875 0.0000000
#> cyl -0.4952714 0.0000000
#> disp -0.5096593 0.0000000
#> hp 0.0000000 -0.5180511
#> wt -0.4954451 0.0000000
#> qsec 0.0000000 0.5056627
#> vs 0.0000000 0.4935983
#> carb 0.0000000 -0.4819636The same dual interface is available for the single-component
tpm().
Sparse PCA requires a constraint to prevent redundancy between
components. The default (feasibilityConstraintType = 0)
enforces orthogonality of the loading vectors. Setting
feasibilityConstraintType = 1 instead enforces zero
pairwise correlation between the resulting scores. The choice can lead
to different solutions when the variables are strongly correlated.
set.seed(42)
res_corr <- mspca(Sigma, r = 2, ks = c(4, 4),
feasibilityConstraintType = 1, verbose = FALSE)
print(res_corr, Sigma)
#>
#> msPCA solution: 2 sparse PCs
#> Pct. variance explained: 24.72306 22.81602
#> Non-zero loadings per PC: 4 4
#>
#> Sparse PCs
#> [,1] [,2]
#> hp -0.3117412 0.00000000
#> drat 0.0000000 0.33662426
#> wt 0.0000000 -0.08659835
#> qsec 0.6737896 0.00000000
#> vs 0.2785136 0.00000000
#> am 0.0000000 0.62421437
#> gear 0.0000000 0.69967225
#> carb -0.6093071 0.00000000
summary(res_corr, Sigma, feasibilityConstraintType = 1)
#>
#> msPCA summary: 2 sparse PC(s)
#> Input type : Sigma
#> Runtime (s) : 0.398
#>
#> Per-component statistics:
#> PC nonzero variance fve cumulative_fve
#> PC1 4 2.719537 0.2472306 0.2472306
#> PC2 4 2.509762 0.2281602 0.4753908
#>
#> Pairwise feasibility violations (upper triangle):
#> PC1 PC2
#> PC1 . 9.620782e-05
#> PC2 . .
#> Total: 9.620782e-05The utility functions feasibility_violation_off() and
fraction_variance_explained() can be called directly for
custom reporting or for comparing solutions across methods.
# Orthogonality and zero-correlation violations for the default solution
feasibility_violation_off(Sigma, res$x_best, feasibilityConstraintType = 0)
#> [1] 0
feasibility_violation_off(Sigma, res$x_best, feasibilityConstraintType = 1)
#> [1] 2.335601
# Total and per-PC fraction of variance explained
fraction_variance_explained(Sigma, res$x_best)
#> [1] 0.6043866
fraction_variance_explained_perPC(Sigma, res$x_best)
#> [1] 0.3245835 0.2798031The first two dense principal components explain more variance, but all variables receive non-zero loadings.
pca_res <- prcomp(mtcars, scale. = TRUE)
fraction_variance_explained(Sigma, pca_res$rotation[, 1:2])
#> [1] 0.8417153Sparse PCA trades a small reduction in explained variance for a much more interpretable loading pattern.