Example 4: The Simple Mediation Model with Missing Data
Ivan Jacob Agaloos Pesigan
2023-03-12
Source:vignettes/example_4_simple_miss.Rmd
In this example, the Monte Carlo method is used to generate
confidence intervals for the indirect effect in a simple mediation model
with missing data, where variable X has an effect on variable Y
through a mediating variable M.
Data
# Load the packages that provide MC()/MCStd() and sem().
library(semmcci)
library(lavaan)
n <- 1000
a <- 0.50
b <- 0.50
cp <- 0.25
s2_em <- 1 - a^2
s2_ey <- 1 - cp^2 - a^2 * b^2 - b^2 * s2_em - 2 * cp * a * b
em <- rnorm(n = n, mean = 0, sd = sqrt(s2_em))
ey <- rnorm(n = n, mean = 0, sd = sqrt(s2_ey))
X <- rnorm(n = n)
M <- a * X + em
Y <- cp * X + b * M + ey
df <- data.frame(X, M, Y)
# Create data set with missing values.
miss <- sample(seq_len(dim(df)[1]), 300)
df[miss[1:100], "X"] <- NA
df[miss[101:200], "M"] <- NA
df[miss[201:300], "Y"] <- NA
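As a quick sanity check on the setup above, the following base R sketch verifies that the chosen error variances give M and Y unit population variance when X has unit variance, and that the missingness scheme blanks out exactly one variable in each of 300 distinct rows. The seed and the placeholder data frame are assumptions added for illustration; they are not part of the original example.

```r
# Population values used in the example.
a <- 0.50
b <- 0.50
cp <- 0.25
s2_em <- 1 - a^2
s2_ey <- 1 - cp^2 - a^2 * b^2 - b^2 * s2_em - 2 * cp * a * b

# Implied population variances of M and Y when Var(X) = 1.
var_m <- a^2 + s2_em
var_y <- cp^2 + b^2 * var_m + 2 * cp * a * b + s2_ey

# Reproduce the missingness scheme on a placeholder data frame.
set.seed(42) # assumption: arbitrary seed, used only for reproducibility
n <- 1000
df <- data.frame(X = rnorm(n), M = rnorm(n), Y = rnorm(n))
miss <- sample(seq_len(nrow(df)), 300) # 300 distinct row indices
df[miss[1:100], "X"] <- NA
df[miss[101:200], "M"] <- NA
df[miss[201:300], "Y"] <- NA

n_missing <- colSums(is.na(df))       # missing values per variable
n_complete <- sum(complete.cases(df)) # rows with no missing values

print(c(var_m = var_m, var_y = var_y))
print(n_missing)
print(n_complete)
```

Because sample() draws without replacement, each variable is missing in 100 rows and no row has more than one missing value, so 700 rows are complete. Listwise deletion would therefore discard 30% of the sample.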
Model Specification
The indirect effect is defined by the product of the slopes of paths
X to M, labeled as a, and M to Y, labeled as b. In this
example, we are interested in the confidence intervals of
indirect, defined as the product of a and b using the :=
operator in the lavaan model syntax.
model <- "
Y ~ cp * X + b * M
M ~ a * X
indirect := a * b
direct := cp
total := cp + (a * b)
"
Model Fitting
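We can now fit the model using the sem() function from lavaan.
We use missing = "fiml" to handle the missing data with full
information maximum likelihood. Since there are missing values
in X, we also set fixed.x = FALSE so that X is treated as a
random variable and cases with missing X are not dropped.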
fit <- sem(data = df, model = model, missing = "fiml", fixed.x = FALSE)
Monte Carlo Confidence Intervals
The fit lavaan object can then be passed to the MC() function
from semmcci to generate Monte Carlo confidence intervals.
MC(fit, R = 20000L, alpha = c(0.001, 0.01, 0.05))
#> Monte Carlo Confidence Intervals
#> est se R 0.05% 0.5% 2.5% 97.5% 99.5% 99.95%
#> cp 0.2335 0.0296 20000 0.1350 0.1579 0.1765 0.2913 0.3094 0.3285
#> b 0.5112 0.0298 20000 0.4177 0.4351 0.4524 0.5703 0.5899 0.6116
#> a 0.4809 0.0286 20000 0.3872 0.4069 0.4256 0.5372 0.5542 0.5737
#> Y~~Y 0.5542 0.0268 20000 0.4695 0.4860 0.5022 0.6074 0.6240 0.6411
#> M~~M 0.7564 0.0358 20000 0.6442 0.6636 0.6860 0.8262 0.8474 0.8765
#> X~~X 1.0591 0.0499 20000 0.8985 0.9326 0.9619 1.1572 1.1874 1.2185
#> Y~1 -0.0127 0.0253 20000 -0.0960 -0.0776 -0.0618 0.0377 0.0536 0.0750
#> M~1 -0.0223 0.0292 20000 -0.1127 -0.0962 -0.0803 0.0342 0.0526 0.0697
#> X~1 0.0025 0.0338 20000 -0.1094 -0.0844 -0.0645 0.0690 0.0904 0.1135
#> indirect 0.2458 0.0203 20000 0.1835 0.1939 0.2073 0.2871 0.3011 0.3142
#> direct 0.2335 0.0296 20000 0.1350 0.1579 0.1765 0.2913 0.3094 0.3285
#> total 0.4794 0.0287 20000 0.3886 0.4054 0.4233 0.5369 0.5538 0.5723
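The general idea behind a Monte Carlo interval for indirect can be sketched in base R: draw a and b from an approximate sampling distribution, multiply each pair of draws, and take percentiles of the products. The sketch below plugs in the estimates and standard errors printed above and, as a simplifying assumption, treats a and b as independent normal variables. It does not call MC() and ignores any sampling covariance between the two estimates, so its limits only approximate the indirect row. The seed is also an assumption.

```r
set.seed(42) # assumption: arbitrary seed for reproducibility
R <- 20000L  # number of Monte Carlo draws, as in the example

# Estimates and standard errors copied from the output above.
a_est <- 0.4809
a_se <- 0.0286
b_est <- 0.5112
b_se <- 0.0298

# Assumption: independent normal draws (zero covariance between a and b).
a_draws <- rnorm(R, mean = a_est, sd = a_se)
b_draws <- rnorm(R, mean = b_est, sd = b_se)

# Indirect effect for each draw.
indirect_draws <- a_draws * b_draws

# Percentile limits for a 95% interval and the Monte Carlo standard error.
ci_95 <- quantile(indirect_draws, probs = c(0.025, 0.975))
se_mc <- sd(indirect_draws)

print(ci_95)
print(se_mc)
```

Because the product of two normal variables is not itself normal, the percentile limits need not be symmetric around the point estimate, which is one reason to prefer this approach over a symmetric normal-theory interval.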
Standardized Monte Carlo Confidence Intervals
Standardized Monte Carlo confidence intervals can be generated by
passing the result of the MC() function to the MCStd() function.
fit <- sem(data = df, model = model, missing = "fiml", fixed.x = FALSE)
unstd <- MC(fit, R = 20000L, alpha = c(0.001, 0.01, 0.05))
MCStd(unstd)
#> Standardized Monte Carlo Confidence Intervals
#> est se R 0.05% 0.5% 2.5% 97.5% 99.5% 99.95%
#> cp 0.2409 0.0297 20000 0.1463 0.1640 0.1815 0.2991 0.3170 0.3347
#> b 0.5128 0.0269 20000 0.4217 0.4421 0.4593 0.5649 0.5803 0.5997
#> a 0.4946 0.0255 20000 0.4085 0.4283 0.4439 0.5430 0.5583 0.5720
#> Y~~Y 0.5568 0.0251 20000 0.4703 0.4919 0.5075 0.6060 0.6208 0.6399
#> M~~M 0.7554 0.0251 20000 0.6729 0.6882 0.7051 0.8030 0.8166 0.8331
#> X~~X 1.0000 0.0000 20000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
#> indirect 0.2536 0.0188 20000 0.1927 0.2060 0.2172 0.2903 0.3030 0.3160
#> direct 0.2409 0.0297 20000 0.1463 0.1640 0.1815 0.2991 0.3170 0.3347
#> total 0.4945 0.0255 20000 0.4093 0.4262 0.4430 0.5430 0.5573 0.5757
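The point estimates in the standardized table can be reproduced by hand from the unstandardized estimates. The sketch below assumes the usual standardization, in which each slope is rescaled by the ratio of model-implied standard deviations. It covers point estimates only and does not reproduce the Monte Carlo intervals or call MCStd().

```r
# Unstandardized estimates copied from the MC() output.
a <- 0.4809
b <- 0.5112
cp <- 0.2335
var_x <- 1.0591 # X~~X
res_m <- 0.7564 # M~~M (residual variance of M)
res_y <- 0.5542 # Y~~Y (residual variance of Y)

# Model-implied variances of M and Y.
var_m <- a^2 * var_x + res_m
var_y <- cp^2 * var_x + b^2 * var_m + 2 * cp * b * a * var_x + res_y

# Rescale each slope by sd(predictor) / sd(outcome).
a_std <- a * sqrt(var_x) / sqrt(var_m)
b_std <- b * sqrt(var_m) / sqrt(var_y)
cp_std <- cp * sqrt(var_x) / sqrt(var_y)

std <- c(
  cp = cp_std,
  b = b_std,
  a = a_std,
  indirect = a_std * b_std,
  total = cp_std + a_std * b_std
)
print(round(std, 4))
```

Up to rounding of the inputs, these values agree with the est column of the standardized table to about three decimal places.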