MCStd Function Use Case 5: Difference of R-Squared in Multiple Groups
Ivan Jacob Agaloos Pesigan
2023-03-12
Source: vignettes/mcstd_5_difference_rsq_multigroup.Rmd
The `MCStd()` function is used to generate Monte Carlo confidence intervals for differences between \(R^{2}\) in multiple groups.
Data
In this example, we use data from Kwan and Chan (2014) with three groups (Hong Kong, Japan, and Korea), where child's reading ability (\(Y_{1}\)) is regressed on parental occupational status (\(X_{1}\)), parental educational level (\(X_{2}\)), and child's home possessions (\(X_{3}\)):
\[\begin{equation} Y_{1} = \alpha_{1} + \gamma_{1} X_{1} + \gamma_{2} X_{2} + \gamma_{3} X_{3} + \zeta_{1} . \end{equation}\]
Note that \(\zeta_{1}\) is the stochastic error term with expected value of zero and finite variance \(\psi_{1}\), \(\alpha_{1}\) is the intercept, and \(\gamma_{1}\), \(\gamma_{2}\), and \(\gamma_{3}\) are regression coefficients.

Figure: A Three-Regressor Multiple Regression Model (Covariance Structure)
varnames <- c("Y1", "X1", "X2", "X3")
nobs_hongkong <- 4625
covs_hongkong <- matrix(
  data = c(
    8176.0021, 27.3990, 28.2320, 31.2722,
    27.3990, 0.9451, 0.6006, 0.4326,
    28.2320, 0.6006, 0.7977, 0.3779,
    31.2722, 0.4326, 0.3779, 0.8956
  ),
  nrow = 4
)
colnames(covs_hongkong) <- rownames(covs_hongkong) <- varnames
knitr::kable(
  x = covs_hongkong, digits = 4,
  caption = "Covariance Matrix for Hong Kong"
)
| | Y1 | X1 | X2 | X3 |
|---|---|---|---|---|
| Y1 | 8176.0021 | 27.3990 | 28.2320 | 31.2722 |
| X1 | 27.3990 | 0.9451 | 0.6006 | 0.4326 |
| X2 | 28.2320 | 0.6006 | 0.7977 | 0.3779 |
| X3 | 31.2722 | 0.4326 | 0.3779 | 0.8956 |
nobs_japan <- 5943
covs_japan <- matrix(
  data = c(
    9666.8658, 34.2501, 35.2189, 30.6472,
    34.2501, 1.0453, 0.6926, 0.5027,
    35.2189, 0.6926, 1.0777, 0.4524,
    30.6472, 0.5027, 0.4524, 0.9583
  ),
  nrow = 4
)
colnames(covs_japan) <- rownames(covs_japan) <- varnames
knitr::kable(
  x = covs_japan, digits = 4,
  caption = "Covariance Matrix for Japan"
)
| | Y1 | X1 | X2 | X3 |
|---|---|---|---|---|
| Y1 | 9666.8658 | 34.2501 | 35.2189 | 30.6472 |
| X1 | 34.2501 | 1.0453 | 0.6926 | 0.5027 |
| X2 | 35.2189 | 0.6926 | 1.0777 | 0.4524 |
| X3 | 30.6472 | 0.5027 | 0.4524 | 0.9583 |
nobs_korea <- 5151
covs_korea <- matrix(
  data = c(
    8187.6921, 31.6266, 37.3062, 30.9021,
    31.6266, 0.9271, 0.6338, 0.4088,
    37.3062, 0.6338, 1.0007, 0.3902,
    30.9021, 0.4088, 0.3902, 0.8031
  ),
  nrow = 4
)
colnames(covs_korea) <- rownames(covs_korea) <- varnames
knitr::kable(
  x = covs_korea, digits = 4,
  caption = "Covariance Matrix for Korea"
)
| | Y1 | X1 | X2 | X3 |
|---|---|---|---|---|
| Y1 | 8187.6921 | 31.6266 | 37.3062 | 30.9021 |
| X1 | 31.6266 | 0.9271 | 0.6338 | 0.4088 |
| X2 | 37.3062 | 0.6338 | 1.0007 | 0.3902 |
| X3 | 30.9021 | 0.4088 | 0.3902 | 0.8031 |
Model Specification
We regress `Y1` on `X1`, `X2`, and `X3`. We label the variance \(\psi_{1}\) of the error term \(\zeta_{1}\) in the three groups as `psi1.g1`, `psi1.g2`, and `psi1.g3`.
\(R^{2}\) is defined with the `:=` operator in the `lavaan` model syntax, using the following equation
\[\begin{equation} R^{2} = 1 - \psi^{\ast} \end{equation}\]
where \(\psi^{\ast}\) is the standardized error variance.
model <- "
  Y1 ~ X1 + X2 + X3
  Y1 ~~ c(psi1.g1, psi1.g2, psi1.g3) * Y1
  rsq.g1 := 1 - psi1.g1
  rsq.g2 := 1 - psi1.g2
  rsq.g3 := 1 - psi1.g3
  rsq.g12 := rsq.g1 - rsq.g2
  rsq.g13 := rsq.g1 - rsq.g3
  rsq.g23 := rsq.g2 - rsq.g3
"
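The identity \(R^{2} = 1 - \psi^{\ast}\) can be checked by hand from a covariance matrix. The following base-R sketch is not part of the original vignette: the helper name `rsq_from_cov` is hypothetical, and the Hong Kong matrix is re-entered as `covs_check` only so that the block runs on its own. It computes the standardized error variance as residual variance divided by total variance. The result should land close to the `psi1.g1` and `rsq.g1` estimates (about 0.82 and 0.18) in the `MCStd()` output reported later.

```r
# Hypothetical helper (not part of semmcci or lavaan): R-squared of `y`
# regressed on all other variables, computed from a covariance matrix.
rsq_from_cov <- function(covs, y = "Y1") {
  x <- setdiff(colnames(covs), y)
  sigma_xx <- covs[x, x, drop = FALSE] # covariances among the regressors
  sigma_xy <- covs[x, y, drop = FALSE] # covariances of regressors with y
  # Unstandardized regression slopes
  slopes <- solve(sigma_xx, sigma_xy)
  # Residual (error) variance, then standardized by the variance of y
  psi <- covs[y, y] - drop(crossprod(sigma_xy, slopes))
  psi_std <- psi / covs[y, y]
  c(psi_std = psi_std, rsq = 1 - psi_std)
}

# Hong Kong covariance matrix, repeated here so the block is self-contained
varnames_check <- c("Y1", "X1", "X2", "X3")
covs_check <- matrix(
  data = c(
    8176.0021, 27.3990, 28.2320, 31.2722,
    27.3990, 0.9451, 0.6006, 0.4326,
    28.2320, 0.6006, 0.7977, 0.3779,
    31.2722, 0.4326, 0.3779, 0.8956
  ),
  nrow = 4,
  dimnames = list(varnames_check, varnames_check)
)

res <- rsq_from_cov(covs_check)
round(res, 4)
```

Because \(R^{2}\) is a ratio of variances, it does not depend on whether the covariance matrix was computed with a divisor of \(n\) or \(n - 1\).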
Model Fitting
We can now fit the model using the `sem()` function from `lavaan` with `mimic = "eqs"` to ensure compatibility with results from Kwan and Chan (2014).
Note: We recommend setting `fixed.x = FALSE` when generating standardized estimates and confidence intervals, so that the variances and covariances of the exogenous observed variables are modeled if they are assumed to be random. If `fixed.x = TRUE`, which is the default setting in `lavaan`, `MC()` will fix the variances and the covariances of the exogenous observed variables to the sample values.
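The fitting call itself does not appear in this rendering of the vignette, although the `fit` object is used in the next section. A plausible reconstruction might look like the sketch below. It assumes that the objects defined in the Data and Model Specification sections are in the workspace, that the groups are passed in the order Hong Kong, Japan, Korea (matching `g1`, `g2`, `g3`), and that `MC()` and `MCStd()` come from the `semmcci` package. The arguments are standard `lavaan::sem()` arguments, but the exact call used by the author is an assumption.

```r
library(lavaan)
library(semmcci) # assumed source of MC() and MCStd() used in the next section

# Assumed reconstruction: multigroup fit from summary statistics.
# Relies on `model`, `covs_*`, and `nobs_*` defined earlier in this vignette.
fit <- sem(
  model = model,
  sample.cov = list(covs_hongkong, covs_japan, covs_korea),
  sample.nobs = list(nobs_hongkong, nobs_japan, nobs_korea),
  mimic = "eqs",
  fixed.x = FALSE
)
```

Supplying `sample.cov` and `sample.nobs` as lists is how `lavaan` accepts summary statistics for several groups; the order of the list elements determines which group receives the labels `psi1.g1`, `psi1.g2`, and `psi1.g3`.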
Standardized Monte Carlo Confidence Intervals
Standardized Monte Carlo confidence intervals can be generated by passing the result of the `MC()` function to the `MCStd()` function.
Note: The \(R^{2}\) parameters defined above should only be interpreted using the output of the `MCStd()` function, since the functions defined with `:=` require standardized estimates as input.
unstd <- MC(fit, R = 20000L, alpha = 0.05)
MCStd(unstd)
#> Standardized Monte Carlo Confidence Intervals
#> est se R 2.5% 97.5%
#> Y1~X1 0.0568 0.0190 20000 0.0190 0.0941
#> Y1~X2 0.1985 0.0187 20000 0.1616 0.2351
#> Y1~X3 0.2500 0.0150 20000 0.2207 0.2790
#> psi1.g1 0.8215 0.0103 20000 0.8008 0.8411
#> X1~~X1 1.0000 0.0000 20000 1.0000 1.0000
#> X1~~X2 0.6917 0.0077 20000 0.6764 0.7064
#> X1~~X3 0.4702 0.0115 20000 0.4475 0.4925
#> X2~~X2 1.0000 0.0000 20000 1.0000 1.0000
#> X2~~X3 0.4471 0.0118 20000 0.4238 0.4698
#> X3~~X3 1.0000 0.0000 20000 1.0000 1.0000
#> Y1~X1.g2 0.1390 0.0164 20000 0.1071 0.1709
#> Y1~X2.g2 0.1792 0.0159 20000 0.1480 0.2100
#> Y1~X3.g2 0.1688 0.0138 20000 0.1416 0.1961
#> psi1.g2 0.8371 0.0087 20000 0.8193 0.8534
#> X1~~X1.g2 1.0000 0.0000 20000 1.0000 1.0000
#> X1~~X2.g2 0.6525 0.0074 20000 0.6379 0.6669
#> X1~~X3.g2 0.5023 0.0097 20000 0.4832 0.5211
#> X2~~X2.g2 1.0000 0.0000 20000 1.0000 1.0000
#> X2~~X3.g2 0.4452 0.0103 20000 0.4245 0.4651
#> X3~~X3.g2 1.0000 0.0000 20000 1.0000 1.0000
#> Y1~X1.g3 0.0863 0.0170 20000 0.0527 0.1194
#> Y1~X2.g3 0.2557 0.0164 20000 0.2235 0.2877
#> Y1~X3.g3 0.2289 0.0139 20000 0.2016 0.2564
#> psi1.g3 0.7761 0.0103 20000 0.7556 0.7959
#> X1~~X1.g3 1.0000 0.0000 20000 1.0000 1.0000
#> X1~~X2.g3 0.6580 0.0079 20000 0.6422 0.6734
#> X1~~X3.g3 0.4738 0.0108 20000 0.4526 0.4947
#> X2~~X2.g3 1.0000 0.0000 20000 1.0000 1.0000
#> X2~~X3.g3 0.4353 0.0113 20000 0.4132 0.4570
#> X3~~X3.g3 1.0000 0.0000 20000 1.0000 1.0000
#> rsq.g1 0.1785 0.0103 20000 0.1589 0.1992
#> rsq.g2 0.1629 0.0087 20000 0.1466 0.1807
#> rsq.g3 0.2239 0.0103 20000 0.2041 0.2444
#> rsq.g12 0.0155 0.0135 20000 -0.0110 0.0418
#> rsq.g13 -0.0455 0.0146 20000 -0.0742 -0.0169
#> rsq.g23 -0.0610 0.0135 20000 -0.0873 -0.0344
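To make the mechanics behind the last three rows more concrete, here is a minimal base-R sketch of a Monte Carlo percentile interval for a difference between two quantities. It is a simplification and not the `semmcci` implementation: it draws the two \(R^{2}\) values directly from normal distributions, whereas `MC()` simulates the model parameters jointly from their estimated sampling distribution. The estimates and standard errors are copied from the `rsq.g1` and `rsq.g2` rows above, and the zero covariance between the two groups is an assumption.

```r
set.seed(42)

# Illustrative inputs taken from the rsq.g1 and rsq.g2 rows of the output
est <- c(rsq_g1 = 0.1785, rsq_g2 = 0.1629)
# Assumed sampling covariance matrix: independent groups, SEs 0.0103 and 0.0087
vc <- diag(c(0.0103, 0.0087)^2)

R <- 20000L # number of Monte Carlo replications
# Draw from a multivariate normal: standard normals times the Cholesky factor
z <- matrix(rnorm(R * length(est)), nrow = R)
draws <- sweep(z %*% chol(vc), 2, est, "+")

# Apply the function of interest to every draw, then take percentiles
diff_draws <- draws[, 1] - draws[, 2]
ci <- quantile(diff_draws, probs = c(0.025, 0.975))
ci
```

Under these assumptions the interval should come out close to the `rsq.g12` row. As in that row, an interval that contains zero gives no clear evidence of a difference in \(R^{2}\) between the two groups.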
References
Kwan, J. L. Y., & Chan, W. (2014). Comparing squared multiple correlation coefficients using structural equation modeling. Structural Equation Modeling: A Multidisciplinary Journal, 21(2), 225-238. https://doi.org/10.1080/10705511.2014.882673