Parametric Bootstrap (The State Space Model)

Model

The measurement model is given by $\begin{equation} \mathbf{y}_{i, t} = \boldsymbol{\nu} + \boldsymbol{\Lambda} \boldsymbol{\eta}_{i, t} + \boldsymbol{\varepsilon}_{i, t}, \quad \mathrm{with} \quad \boldsymbol{\varepsilon}_{i, t} \sim \mathcal{N} \left( \mathbf{0}, \boldsymbol{\Theta} \right) \end{equation}$ where $\mathbf{y}_{i, t}$, $\boldsymbol{\eta}_{i, t}$, and $\boldsymbol{\varepsilon}_{i, t}$ are random variables and $\boldsymbol{\nu}$, $\boldsymbol{\Lambda}$, and $\boldsymbol{\Theta}$ are model parameters. $\mathbf{y}_{i, t}$ represents a vector of observed random variables, $\boldsymbol{\eta}_{i, t}$ a vector of latent random variables, and $\boldsymbol{\varepsilon}_{i, t}$ a vector of random measurement errors, for individual $i$ at time $t$. $\boldsymbol{\nu}$ denotes a vector of intercepts, $\boldsymbol{\Lambda}$ a matrix of factor loadings, and $\boldsymbol{\Theta}$ the covariance matrix of $\boldsymbol{\varepsilon}_{i, t}$.

An alternative representation of the measurement error is given by $\begin{equation} \boldsymbol{\varepsilon}_{i, t} = \boldsymbol{\Theta}^{\frac{1}{2}} \mathbf{z}_{i, t}, \quad \mathrm{with} \quad \mathbf{z}_{i, t} \sim \mathcal{N} \left( \mathbf{0}, \mathbf{I} \right) \end{equation}$ where $\mathbf{z}_{i, t}$ is a vector of independent standard normal random variables and $\left( \boldsymbol{\Theta}^{\frac{1}{2}} \right) \left( \boldsymbol{\Theta}^{\frac{1}{2}} \right)^{\prime} = \boldsymbol{\Theta}$.
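
The factorization above can be checked numerically. The following is a minimal sketch in base R, assuming the lower-triangular Cholesky factor as the choice of $\boldsymbol{\Theta}^{\frac{1}{2}}$ (any factor satisfying the identity would serve) and borrowing the value of $\boldsymbol{\Theta}$ used later in this document. The seed and the number of draws (`n_draws`) are arbitrary choices made for illustration.

```r
# Measurement error covariance matrix (value used later in this document).
theta <- 0.2 * diag(3)

# chol() returns the upper-triangular factor, so transpose it to get
# a lower-triangular theta_l with theta_l %*% t(theta_l) == theta.
theta_l <- t(chol(theta))
reconstruction_error <- max(abs(theta_l %*% t(theta_l) - theta))

# Draw independent standard normal vectors (one per column) and
# transform them into measurement errors with covariance theta.
set.seed(42)      # arbitrary seed, for reproducibility only
n_draws <- 10000L # arbitrary number of draws
z <- matrix(rnorm(3 * n_draws), nrow = 3)
epsilon <- theta_l %*% z

# The sample covariance of the draws should be close to theta.
sample_cov <- cov(t(epsilon))
print(reconstruction_error)
print(round(sample_cov, 3))
```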

The dynamic structure is given by $\begin{equation} \boldsymbol{\eta}_{i, t} = \boldsymbol{\alpha} + \boldsymbol{\beta} \boldsymbol{\eta}_{i, t - 1} + \boldsymbol{\zeta}_{i, t}, \quad \mathrm{with} \quad \boldsymbol{\zeta}_{i, t} \sim \mathcal{N} \left( \mathbf{0}, \boldsymbol{\Psi} \right) \end{equation}$ where $\boldsymbol{\eta}_{i, t}$, $\boldsymbol{\eta}_{i, t - 1}$, and $\boldsymbol{\zeta}_{i, t}$ are random variables, and $\boldsymbol{\alpha}$, $\boldsymbol{\beta}$, and $\boldsymbol{\Psi}$ are model parameters. Here, $\boldsymbol{\eta}_{i, t}$ is a vector of latent variables for individual $i$ at time $t$, $\boldsymbol{\eta}_{i, t - 1}$ is the vector of latent variables for individual $i$ at time $t - 1$, and $\boldsymbol{\zeta}_{i, t}$ is a vector of dynamic noise for individual $i$ at time $t$. $\boldsymbol{\alpha}$ denotes a vector of intercepts, $\boldsymbol{\beta}$ a matrix of autoregression and cross-regression coefficients, and $\boldsymbol{\Psi}$ the covariance matrix of $\boldsymbol{\zeta}_{i, t}$.

An alternative representation of the dynamic noise is given by $\begin{equation} \boldsymbol{\zeta}_{i, t} = \boldsymbol{\Psi}^{\frac{1}{2}} \mathbf{z}_{i, t}, \quad \mathrm{with} \quad \mathbf{z}_{i, t} \sim \mathcal{N} \left( \mathbf{0}, \mathbf{I} \right) \end{equation}$ where $\left( \boldsymbol{\Psi}^{\frac{1}{2}} \right) \left( \boldsymbol{\Psi}^{\frac{1}{2}} \right)^{\prime} = \boldsymbol{\Psi}$. Here $\mathbf{z}_{i, t}$ denotes a standard normal vector drawn independently of the one used for the measurement error.
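
To make the two equations concrete, the sketch below simulates the latent process and the observed series for a single individual in base R. It is an illustration only, not the data-generating code used by the package. It assumes Cholesky factors for $\boldsymbol{\Psi}^{\frac{1}{2}}$ and $\boldsymbol{\Theta}^{\frac{1}{2}}$, takes the parameter values given later in this document, and draws a separate standard normal vector for each noise source. The loop variable `step` and the seed are arbitrary.

```r
set.seed(42) # arbitrary seed, for reproducibility only
k <- 3L      # number of latent (and observed) variables
time <- 100L # number of time points

# Parameter values as given later in this document.
mu0 <- rep(0, k)
sigma0 <- matrix(0.2, nrow = k, ncol = k)
diag(sigma0) <- 1
alpha <- rep(0, k)
beta <- matrix(
  c(0.7, 0.5, -0.1, 0.0, 0.6, 0.4, 0.0, 0.0, 0.5), # filled column by column
  nrow = k
)
nu <- rep(0, k)
lambda <- diag(k)

# Lower-triangular Cholesky factors of the covariance matrices.
sigma0_l <- t(chol(sigma0))
psi_l <- t(chol(0.1 * diag(k)))
theta_l <- t(chol(0.2 * diag(k)))

eta <- matrix(NA_real_, nrow = time, ncol = k)
y <- matrix(NA_real_, nrow = time, ncol = k)

# Initial condition eta_0 ~ N(mu0, sigma0).
eta_prev <- as.vector(mu0 + sigma0_l %*% rnorm(k))
for (step in seq_len(time)) {
  # Dynamic structure: eta_t = alpha + beta eta_{t - 1} + zeta_t.
  eta[step, ] <- as.vector(alpha + beta %*% eta_prev + psi_l %*% rnorm(k))
  # Measurement model: y_t = nu + lambda eta_t + epsilon_t.
  y[step, ] <- as.vector(nu + lambda %*% eta[step, ] + theta_l %*% rnorm(k))
  eta_prev <- eta[step, ]
}
print(head(round(y, 3)))
```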

Parameters

Notation

Let $t = 100$ be the number of time points and $n = 5$ be the number of individuals.

Let the measurement model intercept vector $\boldsymbol{\nu}$ be given by

$\begin{equation} \boldsymbol{\nu} = \left( \begin{array}{c} 0 \\ 0 \\ 0 \\ \end{array} \right) . \end{equation}$

Let the factor loadings matrix $\boldsymbol{\Lambda}$ be given by

$\begin{equation} \boldsymbol{\Lambda} = \left( \begin{array}{ccc} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{array} \right) . \end{equation}$

Let the measurement error covariance matrix $\boldsymbol{\Theta}$ be given by

$\begin{equation} \boldsymbol{\Theta} = \left( \begin{array}{ccc} 0.2 & 0 & 0 \\ 0 & 0.2 & 0 \\ 0 & 0 & 0.2 \\ \end{array} \right) . \end{equation}$

Let the initial condition $\boldsymbol{\eta}_{0}$ be given by

$\begin{equation} \boldsymbol{\eta}_{0} \sim \mathcal{N} \left( \boldsymbol{\mu}_{\boldsymbol{\eta} \mid 0}, \boldsymbol{\Sigma}_{\boldsymbol{\eta} \mid 0} \right) \end{equation}$

$\begin{equation} \boldsymbol{\mu}_{\boldsymbol{\eta} \mid 0} = \left( \begin{array}{c} 0 \\ 0 \\ 0 \\ \end{array} \right) \end{equation}$

$\begin{equation} \boldsymbol{\Sigma}_{\boldsymbol{\eta} \mid 0} = \left( \begin{array}{ccc} 1 & 0.2 & 0.2 \\ 0.2 & 1 & 0.2 \\ 0.2 & 0.2 & 1 \\ \end{array} \right) . \end{equation}$

Let the constant vector $\boldsymbol{\alpha}$ be given by

$\begin{equation} \boldsymbol{\alpha} = \left( \begin{array}{c} 0 \\ 0 \\ 0 \\ \end{array} \right) . \end{equation}$

Let the transition matrix $\boldsymbol{\beta}$ be given by

$\begin{equation} \boldsymbol{\beta} = \left( \begin{array}{ccc} 0.7 & 0 & 0 \\ 0.5 & 0.6 & 0 \\ -0.1 & 0.4 & 0.5 \\ \end{array} \right) . \end{equation}$

Let the dynamic process noise covariance matrix $\boldsymbol{\Psi}$ be given by

$\begin{equation} \boldsymbol{\Psi} = \left( \begin{array}{ccc} 0.1 & 0 & 0 \\ 0 & 0.1 & 0 \\ 0 & 0 & 0.1 \\ \end{array} \right) . \end{equation}$
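
As a side check that is not part of the original workflow, one can verify that the transition matrix above defines a stable process and compute the implied stationary covariance of the latent variables. Because $\boldsymbol{\beta}$ is lower triangular, its eigenvalues are its diagonal entries (0.7, 0.6, and 0.5), all inside the unit circle. The sketch below uses base R only and relies on the standard identity $\mathrm{vec} \left( \boldsymbol{\Sigma} \right) = \left( \mathbf{I} - \boldsymbol{\beta} \otimes \boldsymbol{\beta} \right)^{-1} \mathrm{vec} \left( \boldsymbol{\Psi} \right)$; the name `sigma_inf` is introduced here for illustration.

```r
# Transition matrix and process noise covariance matrix from above.
beta <- matrix(
  c(0.7, 0.5, -0.1, 0.0, 0.6, 0.4, 0.0, 0.0, 0.5), # filled column by column
  nrow = 3
)
psi <- 0.1 * diag(3)

# The process is stable when all eigenvalues of beta have modulus below 1.
eig <- eigen(beta, only.values = TRUE)$values
stable <- all(Mod(eig) < 1)

# Stationary covariance: the solution of
# sigma_inf = beta sigma_inf beta' + psi, obtained via vectorization.
k <- nrow(beta)
sigma_inf <- matrix(
  solve(diag(k^2) - kronecker(beta, beta), as.vector(psi)),
  nrow = k
)
print(stable)
print(round(sigma_inf, 4))
```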

R Function Arguments

n
#> [1] 5
time
#> [1] 100
mu0
#> [1] 0 0 0
sigma0
#>      [,1] [,2] [,3]
#> [1,]  1.0  0.2  0.2
#> [2,]  0.2  1.0  0.2
#> [3,]  0.2  0.2  1.0
sigma0_l # sigma0_l <- t(chol(sigma0))
#>      [,1]      [,2]      [,3]
#> [1,]  1.0 0.0000000 0.0000000
#> [2,]  0.2 0.9797959 0.0000000
#> [3,]  0.2 0.1632993 0.9660918
alpha
#> [1] 0 0 0
beta
#>      [,1] [,2] [,3]
#> [1,]  0.7  0.0  0.0
#> [2,]  0.5  0.6  0.0
#> [3,] -0.1  0.4  0.5
psi
#>      [,1] [,2] [,3]
#> [1,]  0.1  0.0  0.0
#> [2,]  0.0  0.1  0.0
#> [3,]  0.0  0.0  0.1
psi_l # psi_l <- t(chol(psi))
#>           [,1]      [,2]      [,3]
#> [1,] 0.3162278 0.0000000 0.0000000
#> [2,] 0.0000000 0.3162278 0.0000000
#> [3,] 0.0000000 0.0000000 0.3162278
nu
#> [1] 0 0 0
lambda
#>      [,1] [,2] [,3]
#> [1,]    1    0    0
#> [2,]    0    1    0
#> [3,]    0    0    1
theta
#>      [,1] [,2] [,3]
#> [1,]  0.2  0.0  0.0
#> [2,]  0.0  0.2  0.0
#> [3,]  0.0  0.0  0.2
theta_l # theta_l <- t(chol(theta))
#>           [,1]      [,2]      [,3]
#> [1,] 0.4472136 0.0000000 0.0000000
#> [2,] 0.0000000 0.4472136 0.0000000
#> [3,] 0.0000000 0.0000000 0.4472136
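
The objects printed above are not constructed in the text. One way to build them in base R is sketched below. The `t(chol(...))` calls come from the comments in the listing above; the rest of the construction (for example, the helper variable `k`) is an assumption about how the values were created.

```r
n <- 5      # number of individuals
time <- 100 # number of time points
k <- 3      # number of latent variables (helper, not a function argument)

# Initial condition: mean vector, covariance matrix, and its Cholesky factor.
mu0 <- rep(0, k)
sigma0 <- matrix(0.2, nrow = k, ncol = k)
diag(sigma0) <- 1
sigma0_l <- t(chol(sigma0))

# Dynamic structure: intercepts, transition matrix, and process noise.
alpha <- rep(0, k)
beta <- matrix(
  c(0.7, 0.5, -0.1, 0.0, 0.6, 0.4, 0.0, 0.0, 0.5), # filled column by column
  nrow = k
)
psi <- 0.1 * diag(k)
psi_l <- t(chol(psi))

# Measurement model: intercepts, factor loadings, and measurement error.
nu <- rep(0, k)
lambda <- diag(k)
theta <- 0.2 * diag(k)
theta_l <- t(chol(theta))
```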

Parametric Bootstrap

R <- 5L # use at least 1000 in actual research
path <- getwd()
prefix <- "ssm"

We use the PBSSMFixed function from the bootStateSpace package to perform a parametric bootstrap using the parameters described above. The argument R specifies the number of bootstrap replications. The generated data and model estimates are stored in path, with prefix used as the prefix of the file names. Setting ncores = parallel::detectCores() instructs the function to use all available CPU cores on the system.

NOTE: Fitting the state space model multiple times is computationally intensive.
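
For readers unfamiliar with the idea, the general logic of a parametric bootstrap (simulate data from fixed parameter values, refit the model, and collect the estimates) can be sketched with a much simpler stand-in. The example below uses a univariate AR(1) model fitted with `stats::arima()`. It is not the state space model from this document and does not reflect the internals of PBSSMFixed; the names `phi`, `sigma2`, and `phi_star` and the choice of 20 replications are assumptions made for illustration.

```r
set.seed(42)  # arbitrary seed, for reproducibility only
R <- 20L      # small number of replications, for illustration only
phi <- 0.7    # fixed autoregressive coefficient used to generate data
sigma2 <- 0.1 # fixed innovation variance used to generate data
time <- 100L  # length of each simulated series

# For each replication: simulate a series from the fixed parameters,
# refit the AR(1) model, and keep the estimated coefficient.
phi_star <- vapply(
  seq_len(R),
  function(r) {
    x <- arima.sim(model = list(ar = phi), n = time, sd = sqrt(sigma2))
    fit <- arima(x, order = c(1L, 0L, 0L), include.mean = FALSE, method = "ML")
    unname(coef(fit)["ar1"])
  },
  numeric(1)
)

# The spread of the replicate estimates approximates the standard error.
se_phi <- sd(phi_star)
print(round(se_phi, 4))
```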

library(bootStateSpace)
pb <- PBSSMFixed(
  R = R,
  path = path,
  prefix = prefix,
  n = n,
  time = time,
  mu0 = mu0,
  sigma0_l = sigma0_l,
  alpha = alpha,
  beta = beta,
  psi_l = psi_l,
  nu = nu,
  lambda = lambda,
  theta_l = theta_l,
  ncores = parallel::detectCores(),
  seed = 42
)
summary(pb)
#> Call:
#> PBSSMFixed(R = R, path = path, prefix = prefix, n = n, time = time, 
#>     mu0 = mu0, sigma0_l = sigma0_l, alpha = alpha, beta = beta, 
#>     psi_l = psi_l, nu = nu, lambda = lambda, theta_l = theta_l, 
#>     ncores = parallel::detectCores(), seed = 42)
#>             est     se R    2.5%   97.5%
#> beta_1_1    0.7 0.2137 5  0.6352  1.1627
#> beta_2_1    0.5 0.1216 5  0.3774  0.6776
#> beta_3_1   -0.1 0.1015 5 -0.2633 -0.0162
#> beta_1_2    0.0 0.1750 5 -0.3746  0.0529
#> beta_2_2    0.6 0.0982 5  0.4398  0.6675
#> beta_3_2    0.4 0.1152 5  0.4173  0.6753
#> beta_1_3    0.0 0.0978 5 -0.0237  0.1955
#> beta_2_3    0.0 0.0826 5 -0.0862  0.1163
#> beta_3_3    0.5 0.1048 5  0.2878  0.5547
#> psi_1_1     0.1 0.0558 5  0.0206  0.1617
#> psi_2_2     0.1 0.0210 5  0.0536  0.1070
#> psi_3_3     0.1 0.0431 5  0.0806  0.1773
#> theta_1_1   0.2 0.0479 5  0.1349  0.2525
#> theta_2_2   0.2 0.0190 5  0.1893  0.2277
#> theta_3_3   0.2 0.0336 5  0.1352  0.2099
#> mu0_1_1     0.0 0.5477 5 -0.5069  0.8142
#> mu0_2_1     0.0 0.4270 5 -0.6078  0.3782
#> mu0_3_1     0.0 0.6169 5 -0.3866  1.1014
#> sigma0_1_1  1.0 1.0726 5  0.0098  2.2670
#> sigma0_2_1  0.2 0.6871 5 -0.8369  0.9222
#> sigma0_3_1  0.2 0.0925 5  0.0116  0.2442
#> sigma0_2_2  1.0 0.5726 5  0.1734  1.4921
#> sigma0_3_2  0.2 0.1702 5  0.2221  0.5919
#> sigma0_3_3  1.0 0.4650 5  0.2353  1.1719
summary(pb, type = "bc")
#> Call:
#> PBSSMFixed(R = R, path = path, prefix = prefix, n = n, time = time, 
#>     mu0 = mu0, sigma0_l = sigma0_l, alpha = alpha, beta = beta, 
#>     psi_l = psi_l, nu = nu, lambda = lambda, theta_l = theta_l, 
#>     ncores = parallel::detectCores(), seed = 42)
#>             est     se R    2.5%   97.5%
#> beta_1_1    0.7 0.2137 5  0.6204  0.8211
#> beta_2_1    0.5 0.1216 5  0.3938  0.6883
#> beta_3_1   -0.1 0.1015 5 -0.2822 -0.0893
#> beta_1_2    0.0 0.1750 5 -0.1072  0.0614
#> beta_2_2    0.6 0.0982 5  0.4293  0.6618
#> beta_3_2    0.4 0.1152 5  0.4138  0.4138
#> beta_1_3    0.0 0.0978 5 -0.0248  0.1760
#> beta_2_3    0.0 0.0826 5 -0.0321  0.1300
#> beta_3_3    0.5 0.1048 5  0.4003  0.5704
#> psi_1_1     0.1 0.0558 5  0.0300  0.1665
#> psi_2_2     0.1 0.0210 5  0.0783  0.1092
#> psi_3_3     0.1 0.0431 5  0.0792  0.1751
#> theta_1_1   0.2 0.0479 5  0.1267  0.2193
#> theta_2_2   0.2 0.0190 5  0.1892  0.2274
#> theta_3_3   0.2 0.0336 5  0.1589  0.2113
#> mu0_1_1     0.0 0.5477 5 -0.4523  0.8502
#> mu0_2_1     0.0 0.4270 5 -0.5356  0.3868
#> mu0_3_1     0.0 0.6169 5 -0.4170  1.0268
#> sigma0_1_1  1.0 1.0726 5  0.0027  2.2109
#> sigma0_2_1  0.2 0.6871 5 -0.0354  1.0075
#> sigma0_3_1  0.2 0.0925 5  0.1330  0.2512
#> sigma0_2_2  1.0 0.5726 5  0.1956  1.5233
#> sigma0_3_2  0.2 0.1702 5  0.2171  0.2171
#> sigma0_3_3  1.0 0.4650 5  0.2359  1.1752
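
The two summaries above report different kinds of confidence limits. The sketch below shows textbook versions of the percentile interval and the bias-corrected (BC) interval computed from a vector of bootstrap estimates. It assumes that `type = "bc"` refers to the usual bias-corrected percentile interval and is not taken from the package source; the function names `percentile_ci` and `bc_ci` and the replicate values are hypothetical.

```r
# Percentile interval: empirical quantiles of the bootstrap estimates.
percentile_ci <- function(thetahatstar, alpha = 0.05) {
  unname(quantile(thetahatstar, probs = c(alpha / 2, 1 - alpha / 2)))
}

# Bias-corrected interval: shift the quantile levels by z0, the normal
# quantile of the proportion of replicates falling below the estimate.
bc_ci <- function(thetahatstar, est, alpha = 0.05) {
  z0 <- qnorm(mean(thetahatstar < est))
  probs <- pnorm(2 * z0 + qnorm(c(alpha / 2, 1 - alpha / 2)))
  unname(quantile(thetahatstar, probs = probs))
}

# Hypothetical replicates where exactly half fall below the estimate:
# z0 is zero and the BC interval matches the percentile interval.
balanced <- 1:10
pc_balanced <- percentile_ci(balanced)
bc_balanced <- bc_ci(balanced, est = 5.5)

# Hypothetical replicates that all lie above the estimate:
# z0 is -Inf and both BC limits collapse to the smallest replicate.
one_sided <- c(0.42, 0.45, 0.50, 0.60, 0.65)
bc_one_sided <- bc_ci(one_sided, est = 0.4)
print(bc_one_sided)
```

Under this definition, when every replicate falls on the same side of the estimate, both limits collapse to a single extreme replicate. That would be consistent with rows such as beta_3_2 and sigma0_3_2 in the BC output above, where the two limits are identical, and it illustrates why R = 5 is far too small for actual research.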

References

Chow, S.-M., Ho, M. R., Hamaker, E. L., & Dolan, C. V. (2010). Equivalence and differences between structural equation modeling and state-space modeling techniques. Structural Equation Modeling: A Multidisciplinary Journal, 17(2), 303–332. https://doi.org/10.1080/10705511003661553

Ou, L., Hunter, M. D., & Chow, S.-M. (2019). What’s for dynr: A package for linear and nonlinear dynamic modeling in R. The R Journal, 11(1), 91. https://doi.org/10.32614/rj-2019-012

R Core Team. (2024). R: A language and environment for statistical computing. R Foundation for Statistical Computing. https://www.R-project.org/