Worked Example: msPCA on mtcars

Overview

This vignette shows the basic workflow of msPCA on the built-in mtcars dataset. We compute sparse principal components, inspect the solution with the print() and summary() S3 methods, and compare the sparse result with dense PCA.

Install and load

Install the package directly from CRAN.

install.packages("msPCA")
library(msPCA)

Fit two sparse PCs

We work with the correlation matrix of mtcars and ask for two 4-sparse principal components under the default orthogonality constraint.

Sigma <- cor(mtcars)

set.seed(42)
res <- mspca(Sigma, r = 2, ks = c(4, 4), verbose = FALSE)

print() shows the sparse loading matrix restricted to the union of all active variables, together with the percentage of variance explained and the number of non-zero loadings per component.

print(res, Sigma)
#> 
#> msPCA solution: 2 sparse PCs
#> Pct. variance explained: 32.45835 27.98031 
#> Non-zero loadings per PC: 4 4 
#> 
#> Sparse PCs
#>            [,1]       [,2]
#> mpg   0.4994877  0.0000000
#> cyl  -0.4952709  0.0000000
#> disp -0.5096594  0.0000000
#> hp    0.0000000 -0.5180512
#> wt   -0.4954454  0.0000000
#> qsec  0.0000000  0.5056635
#> vs    0.0000000  0.4935970
#> carb  0.0000000 -0.4819641

summary() gives a fuller breakdown: a per-PC table of variance explained and sparsity, followed by the pairwise feasibility violation matrix so you can see at a glance how well the orthogonality constraint is satisfied between each pair of components.

summary(res, Sigma)
#> 
#> msPCA summary: 2 sparse PC(s)
#> Input type   : Sigma 
#> Runtime (s)  : 0.018 
#> 
#> Per-component statistics:
#>   PC nonzero variance       fve cumulative_fve
#>  PC1       4 3.570419 0.3245835      0.3245835
#>  PC2       4 3.077834 0.2798031      0.6043866
#> 
#> Pairwise feasibility violations (upper triangle):
#>     PC1 PC2
#> PC1   .   0
#> PC2   .   .
#> Total: 0e+00

Working from the raw data matrix

By default (type = "Sigma") the first argument is a covariance/correlation matrix. Set type = "X" to pass the raw data matrix instead (rows are observations, columns are variables). With type = "X", msPCA applies the algorithm to the data directly: each matrix–vector product \(\Sigma \beta = X^\top (X \beta) / (n - 1)\) is computed without ever forming the \(p \times p\) matrix. This is mathematically equivalent but more scalable when \(p \gg n\).

The preprocessing arguments control which matrix is implicitly used:

With scale = TRUE and divisor = "n-1", the raw-data call reproduces the correlation-matrix result above. When type = "X", the variance figures are stored inside the object, so C need not be passed to print() or summary().

X <- as.matrix(mtcars)

set.seed(42)
res_X <- mspca(X, r = 2, ks = c(4, 4), type = "X", scale = TRUE, verbose = FALSE)
print(res_X)
#> 
#> msPCA solution: 2 sparse PCs
#> Pct. variance explained: 32.45835 27.98031 
#> Non-zero loadings per PC: 4 4 
#> 
#> Sparse PCs
#>            [,1]       [,2]
#> mpg   0.4994875  0.0000000
#> cyl  -0.4952714  0.0000000
#> disp -0.5096593  0.0000000
#> hp    0.0000000 -0.5180511
#> wt   -0.4954451  0.0000000
#> qsec  0.0000000  0.5056627
#> vs    0.0000000  0.4935983
#> carb  0.0000000 -0.4819636

The same dual interface is available for the single-component tpm().

Orthogonality versus zero correlation

Sparse PCA requires a constraint to prevent redundancy between components. The default (feasibilityConstraintType = 0) enforces orthogonality of the loading vectors. Setting feasibilityConstraintType = 1 instead enforces zero pairwise correlation between the resulting scores. The choice can lead to different solutions when the variables are strongly correlated.

set.seed(42)
res_corr <- mspca(Sigma, r = 2, ks = c(4, 4),
                  feasibilityConstraintType = 1, verbose = FALSE)
print(res_corr, Sigma)
#> 
#> msPCA solution: 2 sparse PCs
#> Pct. variance explained: 24.72306 22.81602 
#> Non-zero loadings per PC: 4 4 
#> 
#> Sparse PCs
#>            [,1]        [,2]
#> hp   -0.3117412  0.00000000
#> drat  0.0000000  0.33662426
#> wt    0.0000000 -0.08659835
#> qsec  0.6737896  0.00000000
#> vs    0.2785136  0.00000000
#> am    0.0000000  0.62421437
#> gear  0.0000000  0.69967225
#> carb -0.6093071  0.00000000
summary(res_corr, Sigma, feasibilityConstraintType = 1)
#> 
#> msPCA summary: 2 sparse PC(s)
#> Input type   : Sigma 
#> Runtime (s)  : 0.398 
#> 
#> Per-component statistics:
#>   PC nonzero variance       fve cumulative_fve
#>  PC1       4 2.719537 0.2472306      0.2472306
#>  PC2       4 2.509762 0.2281602      0.4753908
#> 
#> Pairwise feasibility violations (upper triangle):
#>     PC1          PC2
#> PC1   . 9.620782e-05
#> PC2   .            .
#> Total: 9.620782e-05

Diagnostics

The utility functions feasibility_violation_off() and fraction_variance_explained() can be called directly for custom reporting or for comparing solutions across methods.

# Orthogonality and zero-correlation violations for the default solution
feasibility_violation_off(Sigma, res$x_best, feasibilityConstraintType = 0)
#> [1] 0
feasibility_violation_off(Sigma, res$x_best, feasibilityConstraintType = 1)
#> [1] 2.335601

# Total and per-PC fraction of variance explained
fraction_variance_explained(Sigma, res$x_best)
#> [1] 0.6043866
fraction_variance_explained_perPC(Sigma, res$x_best)
#> [1] 0.3245835 0.2798031

Comparison with dense PCA

The first two dense principal components explain more variance, but all variables receive non-zero loadings.

pca_res <- prcomp(mtcars, scale. = TRUE)
fraction_variance_explained(Sigma, pca_res$rotation[, 1:2])
#> [1] 0.8417153

Sparse PCA trades a small reduction in explained variance for a much more interpretable loading pattern.