├── .gitignore
├── README.md
├── cfa-example
│   ├── cfa-example.html
│   ├── cfa-example.md
│   ├── cfa-example.rmd
│   └── figure
│       └── unnamed-chunk-19.png
├── cheat-sheet-lavaan
│   ├── cheat-sheet-lavaan.html
│   ├── cheat-sheet-lavaan.md
│   └── cheat-sheet-lavaan.rmd
├── convert.r
├── ex1-paper
│   ├── ex1-paper.html
│   ├── ex1-paper.md
│   └── ex1-paper.rmd
├── ex2-paper
│   ├── ex2-paper.html
│   ├── ex2-paper.md
│   └── ex2-paper.rmd
├── makefile
└── path-analysis
    ├── figure
    │   └── unnamed-chunk-5.png
    ├── path-analysis.html
    ├── path-analysis.md
    └── path-analysis.rmd
/.gitignore:
--------------------------------------------------------------------------------
1 | *cache*
2 | .build*
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | This repository shares a few example analyses using `lavaan`, an R package for structural equation modelling.
2 |
3 | I've just been creating these examples to teach myself how to use the software. Feel free to re-use the code, but I make no guarantee as to the accuracy or validity of these analyses.
--------------------------------------------------------------------------------
/cfa-example/cfa-example.md:
--------------------------------------------------------------------------------
1 | # CFA Example
2 |
3 |
4 |
5 | ```r
6 | library(psych)
7 | library(lavaan)
8 | Data <- bfi
9 | item_names <- names(Data)[1:25]
10 | ```
11 |
12 |
13 |
14 |
15 | ## Check data
16 |
17 |
18 |
19 | ```r
20 | sapply(Data[, item_names], function(X) sum(is.na(X)))
21 | ```
22 |
23 | ```
24 | ## A1 A2 A3 A4 A5 C1 C2 C3 C4 C5 E1 E2 E3 E4 E5 N1 N2 N3 N4 N5 O1 O2 O3 O4 O5
25 | ## 16 27 26 19 16 21 24 20 26 16 23 16 25 9 21 22 21 11 36 29 22 0 28 14 20
26 | ```
27 |
28 | ```r
29 |
30 | Data$item_na <- apply(Data[, item_names], 1, function(X) sum(is.na(X)) >
31 | 0)
32 |
33 | table(Data$item_na)
34 | ```
35 |
36 | ```
37 | ##
38 | ## FALSE TRUE
39 | ## 2436 364
40 | ```
41 |
42 | ```r
43 | Data <- Data[!Data$item_na, ]
44 | ```
45 |
46 |
47 |
48 |
49 | * I decided to remove cases with missing data to simplify subsequent exploration of the features of the lavaan software.
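* An alternative worth noting (a sketch only, not run in this document): lavaan can retain the incomplete cases using full-information maximum likelihood via its `missing` argument, so listwise deletion is a convenience here rather than a necessity.

```r
# Sketch (assumption: the five-factor model syntax `m1_model` defined in
# the next section). `missing = "ml"` requests case-wise (full-information)
# ML, keeping rows with missing item responses in the estimation.
m1_fit_fiml <- cfa(m1_model, data = bfi[, item_names], missing = "ml")
```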
50 | 51 | 52 | ## Basic CFA 53 | 54 | 55 | ```r 56 | m1_model <- ' N =~ N1 + N2 + N3 + N4 + N5 57 | E =~ E1 + E2 + E3 + E4 + E5 58 | O =~ O1 + O2 + O3 + O4 + O5 59 | A =~ A1 + A2 + A3 + A4 + A5 60 | C =~ C1 + C2 + C3 + C4 + C5 61 | ' 62 | 63 | m1_fit <- cfa(m1_model, data=Data[, item_names]) 64 | summary(m1_fit, standardized=TRUE) 65 | ``` 66 | 67 | ``` 68 | ## lavaan (0.4-14) converged normally after 63 iterations 69 | ## 70 | ## Number of observations 2436 71 | ## 72 | ## Estimator ML 73 | ## Minimum Function Chi-square 4165.467 74 | ## Degrees of freedom 265 75 | ## P-value 0.000 76 | ## 77 | ## Parameter estimates: 78 | ## 79 | ## Information Expected 80 | ## Standard Errors Standard 81 | ## 82 | ## Estimate Std.err Z-value P(>|z|) Std.lv Std.all 83 | ## Latent variables: 84 | ## N =~ 85 | ## N1 1.000 1.300 0.825 86 | ## N2 0.947 0.024 39.899 0.000 1.230 0.803 87 | ## N3 0.884 0.025 35.919 0.000 1.149 0.721 88 | ## N4 0.692 0.025 27.753 0.000 0.899 0.573 89 | ## N5 0.628 0.026 24.027 0.000 0.816 0.503 90 | ## E =~ 91 | ## E1 1.000 0.920 0.564 92 | ## E2 1.226 0.051 23.899 0.000 1.128 0.699 93 | ## E3 -0.921 0.041 -22.431 0.000 -0.847 -0.627 94 | ## E4 -1.121 0.047 -23.977 0.000 -1.031 -0.703 95 | ## E5 -0.808 0.039 -20.648 0.000 -0.743 -0.553 96 | ## O =~ 97 | ## O1 1.000 0.635 0.564 98 | ## O2 -1.020 0.068 -14.962 0.000 -0.648 -0.418 99 | ## O3 1.373 0.072 18.942 0.000 0.872 0.724 100 | ## O4 0.437 0.048 9.160 0.000 0.277 0.233 101 | ## O5 -0.960 0.060 -16.056 0.000 -0.610 -0.461 102 | ## A =~ 103 | ## A1 1.000 0.484 0.344 104 | ## A2 -1.579 0.108 -14.650 0.000 -0.764 -0.648 105 | ## A3 -2.030 0.134 -15.093 0.000 -0.983 -0.749 106 | ## A4 -1.564 0.115 -13.616 0.000 -0.757 -0.510 107 | ## A5 -1.804 0.121 -14.852 0.000 -0.873 -0.687 108 | ## C =~ 109 | ## C1 1.000 0.680 0.551 110 | ## C2 1.148 0.057 20.152 0.000 0.781 0.592 111 | ## C3 1.036 0.054 19.172 0.000 0.705 0.546 112 | ## C4 -1.421 0.065 -21.924 0.000 -0.967 -0.702 113 | ## C5 -1.489 0.072 -20.694 0.000 -1.013 -0.620 114 | ## 115 | ## Covariances: 116 | ## N ~~ 117 | ## E 0.292 0.032 9.131 0.000 0.244 0.244 118 | ## O -0.093 0.022 -4.138 0.000 -0.112 -0.112 119 | ## A 0.141 0.018 7.713 0.000 0.223 0.223 120 | ## C -0.250 0.025 -10.118 0.000 -0.283 -0.283 121 | ## E ~~ 122 | ## O -0.265 0.021 -12.347 0.000 -0.453 -0.453 123 | ## A 0.304 0.025 12.293 0.000 0.683 0.683 124 | ## C -0.224 0.020 -11.121 0.000 -0.357 -0.357 125 | ## O ~~ 126 | ## A -0.093 0.011 -8.446 0.000 -0.303 -0.303 127 | ## C 0.130 0.014 9.190 0.000 0.301 0.301 128 | ## A ~~ 129 | ## C -0.110 0.012 -9.254 0.000 -0.334 -0.334 130 | ## 131 | ## Variances: 132 | ## N1 0.793 0.037 0.793 0.320 133 | ## N2 0.836 0.036 0.836 0.356 134 | ## N3 1.222 0.043 1.222 0.481 135 | ## N4 1.654 0.052 1.654 0.672 136 | ## N5 1.969 0.060 1.969 0.747 137 | ## E1 1.814 0.058 1.814 0.682 138 | ## E2 1.332 0.049 1.332 0.512 139 | ## E3 1.108 0.038 1.108 0.607 140 | ## E4 1.088 0.041 1.088 0.506 141 | ## E5 1.251 0.040 1.251 0.694 142 | ## O1 0.865 0.032 0.865 0.682 143 | ## O2 1.990 0.063 1.990 0.826 144 | ## O3 0.691 0.039 0.691 0.476 145 | ## O4 1.346 0.040 1.346 0.946 146 | ## O5 1.380 0.045 1.380 0.788 147 | ## A1 1.745 0.052 1.745 0.882 148 | ## A2 0.807 0.028 0.807 0.580 149 | ## A3 0.754 0.032 0.754 0.438 150 | ## A4 1.632 0.051 1.632 0.740 151 | ## A5 0.852 0.032 0.852 0.528 152 | ## C1 1.063 0.035 1.063 0.697 153 | ## C2 1.130 0.039 1.130 0.650 154 | ## C3 1.170 0.039 1.170 0.702 155 | ## C4 0.960 0.040 0.960 0.507 156 | ## C5 1.640 0.059 1.640 0.615 157 | ## N 1.689 0.073 
1.000 1.000
158 | ## E 0.846 0.062 1.000 1.000
159 | ## O 0.404 0.033 1.000 1.000
160 | ## A 0.234 0.030 1.000 1.000
161 | ## C 0.463 0.036 1.000 1.000
162 | ##
163 | ```
164 |
165 |
166 |
167 |
168 | * **`Std.lv`**: Only latent variables have been standardized.
169 | * **`Std.all`**: Observed and latent variables have been standardized.
170 | * **Factor loadings**: Under the `latent variables` section, the `Std.all` column provides standardised factor loadings.
171 | * **Factor correlations**: Under the `Covariances` section, the `Std.all` column provides standardised factor correlations.
172 | * **`Variances`**: Latent factor variances can be constrained to 1 for identifiability, but in this case the first loading on each factor was constrained to one instead. Variances for items represent the variance not explained by the latent factor.
173 |
174 |
175 |
176 |
177 |
178 | ```r
179 | variances <- c(unique = subset(inspect(m1_fit, "standardizedsolution"),
180 | lhs == "N1" & rhs == "N1")[, "est.std"], common = subset(inspect(m1_fit,
181 | "standardizedsolution"), lhs == "N" & rhs == "N1")[, "est.std"]^2)
182 | (variances <- c(variances, total = sum(variances)))
183 | ```
184 |
185 | ```
186 | ## unique common total
187 | ## 0.3195 0.6805 1.0000
188 | ```
189 |
190 |
191 |
192 |
193 | * The output above illustrates the point about variances. Variance for each item is explained by either the common factor or by error variance. As there is just one latent factor loading on the item, the squared standardised coefficient is the variance explained by the common factor. The sum of the unique and common standardised variances is one, which naturally corresponds to the variance of a standardised variable.
194 | * The code also demonstrates ideas about how to extract specific information from the lavaan model fit object. Specifically, the `inspect` method provides access to a wide range of specific information. See help for further details.
195 | * I used the `subset` method to provide an easy one-liner for extracting elements from the data frame returned by the `inspect` method.
196 |
197 |
198 |
199 | ```r
200 | variances <- c(N1_N1 = subset(parameterestimates(m1_fit), lhs ==
201 | "N1" & rhs == "N1")[, "est"], N_N = subset(parameterestimates(m1_fit), lhs ==
202 | "N" & rhs == "N")[, "est"], N_N1 = subset(parameterestimates(m1_fit), lhs ==
203 | "N" & rhs == "N1")[, "est"])
204 |
205 | cbind(parameters = c(variances, total = variances["N_N1"] * variances["N_N"] +
206 | variances["N1_N1"], raw_divide_by_n_minus_1 = var(Data[, "N1"]), raw_divide_by_n = mean((Data[,
207 | "N1"] - mean(Data[, "N1"]))^2)))
208 | ```
209 |
210 | ```
211 | ## parameters
212 | ## N1_N1 0.7932
213 | ## N_N 1.6893
214 | ## N_N1 1.0000
215 | ## total.N_N1 2.4825
216 | ## raw_divide_by_n_minus_1 2.4835
217 | ## raw_divide_by_n 2.4825
218 | ```
219 |
220 |
221 |
222 |
223 | * The output above shows the unstandardised parameters related to the item `N1`.
224 | * `N1_N1` corresponds to the unstandardised unique variance for the item.
225 | * `N_N` times `N_N1` represents the unstandardised common variance (strictly, the loading squared times the factor variance; this works here because the `N_N1` loading is fixed to one).
226 | * Thus, the sum of the unique and common variance represents the total variance.
227 | * When I calculated this on the raw data using the standard $n-1$ denominator, the value was slightly larger, but when I used $n$ as the denominator, the estimate was very close.
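To make the last point concrete, the $n$ versus $n-1$ relationship can be checked directly (a small sketch using base R only):

```r
# ML estimation divides by n rather than n - 1, so rescaling the
# unbiased var() estimate should reproduce the model-based total variance
n <- nrow(Data)
var(Data[, "N1"]) * (n - 1) / n
```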
228 | 229 | 230 | 231 | ## Compare with a single factor model 232 | 233 | 234 | ```r 235 | m2_model <- ' G =~ N1 + N2 + N3 + N4 + N5 236 | + E1 + E2 + E3 + E4 + E5 237 | + O1 + O2 + O3 + O4 + O5 238 | + A1 + A2 + A3 + A4 + A5 239 | + C1 + C2 + C3 + C4 + C5 240 | ' 241 | 242 | m2_fit <- cfa(m2_model, data=Data[, item_names]) 243 | summary(m2_fit, standardized=TRUE) 244 | ``` 245 | 246 | ``` 247 | ## lavaan (0.4-14) converged normally after 55 iterations 248 | ## 249 | ## Number of observations 2436 250 | ## 251 | ## Estimator ML 252 | ## Minimum Function Chi-square 10673.239 253 | ## Degrees of freedom 275 254 | ## P-value 0.000 255 | ## 256 | ## Parameter estimates: 257 | ## 258 | ## Information Expected 259 | ## Standard Errors Standard 260 | ## 261 | ## Estimate Std.err Z-value P(>|z|) Std.lv Std.all 262 | ## Latent variables: 263 | ## G =~ 264 | ## N1 1.000 0.547 0.347 265 | ## N2 0.959 0.081 11.809 0.000 0.524 0.342 266 | ## N3 0.960 0.083 11.547 0.000 0.525 0.329 267 | ## N4 1.375 0.099 13.919 0.000 0.752 0.479 268 | ## N5 0.884 0.081 10.860 0.000 0.484 0.298 269 | ## E1 1.332 0.099 13.509 0.000 0.728 0.447 270 | ## E2 1.868 0.122 15.297 0.000 1.022 0.633 271 | ## E3 -1.382 0.094 -14.730 0.000 -0.756 -0.559 272 | ## E4 -1.702 0.111 -15.307 0.000 -0.931 -0.635 273 | ## E5 -1.292 0.090 -14.425 0.000 -0.707 -0.526 274 | ## O1 -0.656 0.058 -11.321 0.000 -0.359 -0.318 275 | ## O2 0.444 0.067 6.641 0.000 0.243 0.156 276 | ## O3 -0.877 0.068 -12.801 0.000 -0.479 -0.398 277 | ## O4 0.142 0.048 2.930 0.003 0.078 0.065 278 | ## O5 0.416 0.058 7.196 0.000 0.228 0.172 279 | ## A1 0.568 0.065 8.797 0.000 0.311 0.221 280 | ## A2 -1.032 0.074 -13.913 0.000 -0.565 -0.479 281 | ## A3 -1.322 0.090 -14.663 0.000 -0.723 -0.552 282 | ## A4 -1.172 0.088 -13.307 0.000 -0.641 -0.432 283 | ## A5 -1.413 0.093 -15.123 0.000 -0.773 -0.608 284 | ## C1 -0.705 0.063 -11.188 0.000 -0.386 -0.312 285 | ## C2 -0.725 0.066 -10.923 0.000 -0.396 -0.301 286 | ## C3 -0.682 0.064 -10.645 0.000 -0.373 -0.289 287 | ## C4 1.009 0.079 12.852 0.000 0.552 0.401 288 | ## C5 1.332 0.099 13.505 0.000 0.728 0.446 289 | ## 290 | ## Variances: 291 | ## N1 2.183 0.064 2.183 0.880 292 | ## N2 2.075 0.061 2.075 0.883 293 | ## N3 2.267 0.066 2.267 0.892 294 | ## N4 1.897 0.057 1.897 0.770 295 | ## N5 2.401 0.070 2.401 0.911 296 | ## E1 2.130 0.064 2.130 0.801 297 | ## E2 1.560 0.050 1.560 0.599 298 | ## E3 1.255 0.039 1.255 0.687 299 | ## E4 1.284 0.042 1.284 0.597 300 | ## E5 1.304 0.040 1.304 0.723 301 | ## O1 1.140 0.033 1.140 0.899 302 | ## O2 2.351 0.068 2.351 0.976 303 | ## O3 1.222 0.036 1.222 0.842 304 | ## O4 1.417 0.041 1.417 0.996 305 | ## O5 1.701 0.049 1.701 0.970 306 | ## A1 1.883 0.054 1.883 0.951 307 | ## A2 1.072 0.032 1.072 0.771 308 | ## A3 1.196 0.037 1.196 0.696 309 | ## A4 1.794 0.053 1.794 0.814 310 | ## A5 1.017 0.032 1.017 0.630 311 | ## C1 1.376 0.040 1.376 0.902 312 | ## C2 1.582 0.046 1.582 0.910 313 | ## C3 1.528 0.044 1.528 0.917 314 | ## C4 1.590 0.047 1.590 0.839 315 | ## C5 2.134 0.064 2.134 0.801 316 | ## G 0.299 0.037 1.000 1.000 317 | ## 318 | ``` 319 | 320 | 321 | 322 | 323 | 324 | 325 | ```r 326 | round(cbind(m1 = inspect(m1_fit, "fit.measures"), m2 = inspect(m2_fit, 327 | "fit.measures")), 3) 328 | ``` 329 | 330 | ``` 331 | ## m1 m2 332 | ## chisq 4165.467 1.067e+04 333 | ## df 265.000 2.750e+02 334 | ## pvalue 0.000 0.000e+00 335 | ## baseline.chisq 18222.116 1.822e+04 336 | ## baseline.df 300.000 3.000e+02 337 | ## baseline.pvalue 0.000 0.000e+00 338 | ## cfi 0.782 4.200e-01 339 | ## tli 0.754 
3.670e-01
340 | ## logl -99840.238 -1.031e+05
341 | ## unrestricted.logl -97757.504 -9.776e+04
342 | ## npar 60.000 5.000e+01
343 | ## aic 199800.476 2.063e+05
344 | ## bic 200148.363 2.066e+05
345 | ## ntotal 2436.000 2.436e+03
346 | ## bic2 199957.729 2.064e+05
347 | ## rmsea 0.078 1.250e-01
348 | ## rmsea.ci.lower 0.076 1.230e-01
349 | ## rmsea.ci.upper 0.080 1.270e-01
350 | ## rmsea.pvalue 0.000 0.000e+00
351 | ## srmr 0.075 1.160e-01
352 | ```
353 |
354 | ```r
355 | anova(m1_fit, m2_fit)
356 | ```
357 |
358 | ```
359 | ## Chi Square Difference Test
360 | ##
361 | ## Df AIC BIC Chisq Chisq diff Df diff Pr(>Chisq)
362 | ## m1_fit 265 199800 200148 4165
363 | ## m2_fit 275 206288 206578 10673 6508 10 <2e-16 ***
364 | ## ---
365 | ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
366 | ```
367 |
368 |
369 |
370 |
371 | * The output compares the model fit statistics for the two models.
372 | * It also performs a chi-square difference test, which shows that the one-factor model has significantly worse fit than the five-factor model.
373 |
374 |
375 | ## Modification indices
376 |
377 |
378 | ```r
379 | m1_mod <- modificationindices(m1_fit)
380 | m1_mod_summary <- subset(m1_mod, mi > 100)
381 | m1_mod_summary[order(m1_mod_summary$mi, decreasing = TRUE), ]
382 | ```
383 |
384 | ```
385 | ## lhs op rhs mi epc sepc.lv sepc.all sepc.nox
386 | ## 1 N1 ~~ N2 418.8 0.841 0.841 0.348 0.348
387 | ## 2 E =~ N4 200.8 0.487 0.448 0.285 0.285
388 | ## 3 O =~ E3 153.7 0.672 0.427 0.316 0.316
389 | ## 4 N3 ~~ N4 134.1 0.403 0.403 0.161 0.161
390 | ## 5 O =~ E4 122.6 -0.636 -0.404 -0.276 -0.276
391 | ## 6 C =~ E5 121.5 0.504 0.343 0.255 0.255
392 | ## 7 E =~ O3 114.2 -0.429 -0.395 -0.328 -0.328
393 | ## 8 E =~ O4 113.9 0.372 0.343 0.287 0.287
394 | ## 9 N =~ C5 108.8 0.271 0.352 0.216 0.216
395 | ## 10 E =~ A5 108.6 -0.488 -0.449 -0.354 -0.354
396 | ## 11 N =~ C2 107.0 0.219 0.285 0.216 0.216
397 | ## 12 C1 ~~ C2 107.0 0.288 0.288 0.177 0.177
398 | ## 13 E2 ~~ O4 104.7 0.310 0.310 0.161 0.161
399 | ## 14 A1 ~~ A2 101.4 -0.276 -0.276 -0.166 -0.166
400 | ```
401 |
402 |
403 |
404 |
405 | * `modificationindices` suggests several ad hoc modifications that could be made to improve the fit of the model.
406 | * The largest index suggests that items `N1` and `N2` share common variance. If we look at the help file on the bfi dataset `?bfi`, we see that the text for `N1` ("Get angry easily") and `N2` ("Get irritated easily") are very similar.
407 |
408 |
409 |
410 | ```r
411 | (N_cors <- round(cor(Data[, paste0("N", 1:5)]), 2))
412 | ```
413 |
414 | ```
415 | ## N1 N2 N3 N4 N5
416 | ## N1 1.00 0.72 0.57 0.41 0.38
417 | ## N2 0.72 1.00 0.55 0.39 0.35
418 | ## N3 0.57 0.55 1.00 0.52 0.43
419 | ## N4 0.41 0.39 0.52 1.00 0.40
420 | ## N5 0.38 0.35 0.43 0.40 1.00
421 | ```
422 |
423 | ```r
424 | N1_N2_corr <- N_cors["N1", "N2"]
425 | other_N_corrs <- round(mean(abs(N_cors[lower.tri(N_cors)][-1])),
426 | 2)
427 | ```
428 |
429 |
430 |
431 |
432 | * The correlation matrix also shows that the correlation between N1 and N2 ($r = 0.72$) is much larger than it is for the other variables ($\text{mean}(|r|) = 0.44$).
433 |
434 | ## Various matrices
435 | ### Observed, fitted, and residual covariance matrices
436 | The following analysis extracts observed, fitted, and residual covariances and checks that they are consistent with expectations. I only do this for the five neuroticism items rather than the full 25-item set, to make their meaning easier to demonstrate.
437 |
438 |
439 |
440 | ```r
441 | N_names <- paste0("N", 1:5)
442 | N_matrices <- list(observed = inspect(m1_fit, "sampstat")$cov[N_names,
443 | N_names], fitted = fitted(m1_fit)$cov[N_names, N_names], residual = resid(m1_fit)$cov[N_names,
444 | N_names])
445 |
446 | N_matrices$check <- N_matrices$observed - (N_matrices$fitted + N_matrices$residual)
447 | lapply(N_matrices, function(X) round(X, 3))
448 | ```
449 |
450 | ```
451 | ## $observed
452 | ## N1 N2 N3 N4 N5
453 | ## N1 2.482 1.735 1.425 1.013 0.973
454 | ## N2 1.735 2.350 1.344 0.950 0.873
455 | ## N3 1.425 1.344 2.542 1.309 1.114
456 | ## N4 1.013 0.950 1.309 2.463 1.026
457 | ## N5 0.973 0.873 1.114 1.026 2.635
458 | ##
459 | ## $fitted
460 | ## N1 N2 N3 N4 N5
461 | ## N1 2.482 1.599 1.493 1.169 1.061
462 | ## N2 1.599 2.350 1.414 1.106 1.004
463 | ## N3 1.493 1.414 2.542 1.033 0.937
464 | ## N4 1.169 1.106 1.033 2.463 0.734
465 | ## N5 1.061 1.004 0.937 0.734 2.635
466 | ##
467 | ## $residual
468 | ## N1 N2 N3 N4 N5
469 | ## N1 0.000 0.135 -0.068 -0.155 -0.087
470 | ## N2 0.135 0.000 -0.069 -0.157 -0.131
471 | ## N3 -0.068 -0.069 0.000 0.276 0.177
472 | ## N4 -0.155 -0.157 0.276 0.000 0.293
473 | ## N5 -0.087 -0.131 0.177 0.293 0.000
474 | ##
475 | ## $check
476 | ## N1 N2 N3 N4 N5
477 | ## N1 0 0 0 0 0
478 | ## N2 0 0 0 0 0
479 | ## N3 0 0 0 0 0
480 | ## N4 0 0 0 0 0
481 | ## N5 0 0 0 0 0
482 | ##
483 | ```
484 |
485 |
486 |
487 |
488 | * The observed covariance matrix was extracted using the `cov` function on the sample data.
489 | * The fitted covariance matrix can be extracted using the `fitted` method on the model fit object and then extracting the `cov` element.
490 | * Many symmetric matrices in lavaan are of class `lavaan.matrix.symmetric`. This hides the upper triangle of the matrix and formats the matrix to `nd` decimal places.
491 | Run `getAnywhere(print.lavaan.matrix.symmetric)` to see more details.
492 | * The `sampstat` option in the `inspect` method can be used to extract the sample covariance matrix. This is similar, but not exactly the same as running `cov` on the sample data.
493 | * The `resid` method can be used to extract the residual covariance matrix.
494 | * I then create a `check` that `observed = fitted + residual`, which it does (the `check` matrix is all zeros).
495 |
496 | ### Observed, fitted, and residual correlation matrices
497 | I often find it more meaningful to examine observed, fitted, and residual correlation matrices. Standardisation often makes it easier to understand the real magnitude of any residual.
498 |
499 |
500 |
501 | ```r
502 | N_names <- paste0("N", 1:5)
503 | N_cov <- list(observed = inspect(m1_fit, "sampstat")$cov[N_names,
504 | N_names], fitted = fitted(m1_fit)$cov[N_names, N_names])
505 |
506 | N_cor <- list(observed = cov2cor(N_cov$observed), fitted = cov2cor(N_cov$fitted))
507 |
508 | N_cor$residual <- N_cor$observed - N_cor$fitted
509 |
510 | lapply(N_cor, function(X) round(X, 2))
511 | ```
512 |
513 | ```
514 | ## $observed
515 | ## N1 N2 N3 N4 N5
516 | ## N1 1.00 0.72 0.57 0.41 0.38
517 | ## N2 0.72 1.00 0.55 0.39 0.35
518 | ## N3 0.57 0.55 1.00 0.52 0.43
519 | ## N4 0.41 0.39 0.52 1.00 0.40
520 | ## N5 0.38 0.35 0.43 0.40 1.00
521 | ##
522 | ## $fitted
523 | ## N1 N2 N3 N4 N5
524 | ## N1 1.00 0.66 0.59 0.47 0.41
525 | ## N2 0.66 1.00 0.58 0.46 0.40
526 | ## N3 0.59 0.58 1.00 0.41 0.36
527 | ## N4 0.47 0.46 0.41 1.00 0.29
528 | ## N5 0.41 0.40 0.36 0.29 1.00
529 | ##
530 | ## $residual
531 | ## N1 N2 N3 N4 N5
532 | ## N1 0.00 0.06 -0.03 -0.06 -0.03
533 | ## N2 0.06 0.00 -0.03 -0.07 -0.05
534 | ## N3 -0.03 -0.03 0.00 0.11 0.07
535 | ## N4 -0.06 -0.07 0.11 0.00 0.11
536 | ## N5 -0.03 -0.05 0.07 0.11 0.00
537 | ##
538 | ```
539 |
540 |
541 |
542 |
543 | * `cov2cor` is a `base` R function that scales a covariance matrix into a correlation matrix.
544 | * Fitted and observed correlation matrices can be obtained by running `cov2cor` on the corresponding covariance matrices.
545 | * The residual correlation matrix can be obtained by subtracting the fitted correlation matrix from the observed correlation matrix.
546 | * In this case we can see that certain pairs of items correlate more or less than other pairs. In particular `N1-N2`, `N3-N4`, `N4-N5` have positive correlation residuals. An examination of the items below may suggest some added degree of similarity between these pairs of items. For example, N1 and N2 both concern anger and irritation, whereas N3 and N4 both concern mood and affect.
547 |
548 |
549 | > N1: Get angry easily. (q_952)
550 | > N2: Get irritated easily. (q_974)
551 | > N3: Have frequent mood swings. (q_1099)
552 | > N4: Often feel blue. (q_1479)
553 | > N5: Panic easily. (q_1505)
554 |
555 | ## Uncorrelated factors
556 | ### All uncorrelated factors
557 | The following examines a model with uncorrelated factors.
558 |
559 |
560 |
561 | ```r
562 | m3_model <- ' N =~ N1 + N2 + N3 + N4 + N5
563 | E =~ E1 + E2 + E3 + E4 + E5
564 | O =~ O1 + O2 + O3 + O4 + O5
565 | A =~ A1 + A2 + A3 + A4 + A5
566 | C =~ C1 + C2 + C3 + C4 + C5
567 | '
568 |
569 | m3_fit <- cfa(m3_model, data=Data[, item_names], orthogonal=TRUE)
570 |
571 | round(cbind(m1=inspect(m1_fit, 'fit.measures'),
572 | m3=inspect(m3_fit, 'fit.measures')), 3)
573 | ```
574 |
575 | ```
576 | ## m1 m3
577 | ## chisq 4165.467 5.640e+03
578 | ## df 265.000 2.750e+02
579 | ## pvalue 0.000 0.000e+00
580 | ## baseline.chisq 18222.116 1.822e+04
581 | ## baseline.df 300.000 3.000e+02
582 | ## baseline.pvalue 0.000 0.000e+00
583 | ## cfi 0.782 7.010e-01
584 | ## tli 0.754 6.730e-01
585 | ## logl -99840.238 -1.006e+05
586 | ## unrestricted.logl -97757.504 -9.776e+04
587 | ## npar 60.000 5.000e+01
588 | ## aic 199800.476 2.013e+05
589 | ## bic 200148.363 2.015e+05
590 | ## ntotal 2436.000 2.436e+03
591 | ## bic2 199957.729 2.014e+05
592 | ## rmsea 0.078 8.900e-02
593 | ## rmsea.ci.lower 0.076 8.700e-02
594 | ## rmsea.ci.upper 0.080 9.200e-02
595 | ## rmsea.pvalue 0.000 0.000e+00
596 | ## srmr 0.075 1.380e-01
597 | ```
598 |
599 | ```r
600 | anova(m1_fit, m3_fit)
601 | ```
602 |
603 | ```
604 | ## Chi Square Difference Test
605 | ##
606 | ## Df AIC BIC Chisq Chisq diff Df diff Pr(>Chisq)
607 | ## m1_fit 265 199800 200148 4165
608 | ## m3_fit 275 201255 201545 5640 1474 10 <2e-16 ***
609 | ## ---
610 | ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
611 | ```
612 |
613 | ```r
614 |
615 | rmsea_m1 <- round(inspect(m1_fit, 'fit.measures')['rmsea'], 3)
616 | rmsea_m3 <- round(inspect(m3_fit, 'fit.measures')['rmsea'], 3)
617 | ```
618 |
619 |
620 |
621 |
622 | * To convert a `cfa` model from one that permits factors to be correlated to one that constrains factors to be uncorrelated, just specify `orthogonal=TRUE`.
623 | * In this case constraining the factor covariances to all be zero led to a significant reduction in fit. This poorer fit can also be seen in measures like RMSEA (m1 =
624 | `0.078`; m3 = `0.089`).
625 |
626 |
627 | ### Correlations and covariances between factors
628 | It is useful to be able to extract correlations and covariances between factors.
629 |
630 |
631 |
632 | ```r
633 | inspect(m1_fit, "coefficients")$psi
634 | ```
635 |
636 | ```
637 | ## N E O A C
638 | ## N 1.689
639 | ## E 0.292 0.846
640 | ## O -0.093 -0.265 0.404
641 | ## A 0.141 0.304 -0.093 0.234
642 | ## C -0.250 -0.224 0.130 -0.110 0.463
643 | ```
644 |
645 | ```r
646 | cov2cor(inspect(m1_fit, "coefficients")$psi)
647 | ```
648 |
649 | ```
650 | ## N E O A C
651 | ## N 1.000
652 | ## E 0.244 1.000
653 | ## O -0.112 -0.453 1.000
654 | ## A 0.223 0.683 -0.303 1.000
655 | ## C -0.283 -0.357 0.301 -0.334 1.000
656 | ```
657 |
658 | ```r
659 | A_E_r <- cov2cor(inspect(m1_fit, "coefficients")$psi)["A", "E"]
660 | ```
661 |
662 |
663 |
664 |
665 | * This code first extracts the factor variances and covariances.
666 | * I assume that naming the element `psi` (i.e., $\psi$) is a reference to LISREL matrix notation (see this discussion from [USP 655 SEM](http://www.upa.pdx.edu/IOA/newsom/semclass/ho_lisrel%20notation.pdf)).
667 | * Once again `cov2cor` is used to convert the covariance matrix to a correlation matrix.
668 | * An inspection of the values shows that there are some substantive correlations, which helps to explain why constraining them to zero in an orthogonal model would have substantially damaged fit.
For example, the correlation between extraversion (`E`) and agreeableness (`A`) was quite high ($r = 0.68$).
669 |
670 |
671 |
672 |
673 | ```r
674 | # c('O', 'C', 'E', 'A', 'N') # set of factor names
675 | # lhs != rhs # excludes factor variances
676 | subset(inspect(m1_fit, "standardized"), rhs %in% c("O", "C", "E",
677 | "A", "N") & lhs != rhs)
678 | ```
679 |
680 | ```
681 | ## lhs op rhs est.std se z pvalue
682 | ## 1 N ~~ E 0.244 NA NA NA
683 | ## 2 N ~~ O -0.112 NA NA NA
684 | ## 3 N ~~ A 0.223 NA NA NA
685 | ## 4 N ~~ C -0.283 NA NA NA
686 | ## 5 E ~~ O -0.453 NA NA NA
687 | ## 6 E ~~ A 0.683 NA NA NA
688 | ## 7 E ~~ C -0.357 NA NA NA
689 | ## 8 O ~~ A -0.303 NA NA NA
690 | ## 9 O ~~ C 0.301 NA NA NA
691 | ## 10 A ~~ C -0.334 NA NA NA
692 | ```
693 |
694 |
695 |
696 |
697 | * The same values can be extracted from the `standardized` coefficients table using the `inspect` method.
698 |
699 | We can also confirm that for the orthogonal model (`m3`) the correlations are zero.
700 |
701 |
702 |
703 | ```r
704 | cov2cor(inspect(m3_fit, "coefficients")$psi)
705 | ```
706 |
707 | ```
708 | ## N E O A C
709 | ## N 1
710 | ## E 0 1
711 | ## O 0 0 1
712 | ## A 0 0 0 1
713 | ## C 0 0 0 0 1
714 | ```
715 |
716 |
717 |
718 |
719 |
720 | ## Constrain factor correlations to be equal
721 | ### Change constraints so that factor variances are one
722 |
723 |
724 |
725 | ```r
726 | m4_model <- ' N =~ N1 + N2 + N3 + N4 + N5
727 | E =~ E1 + E2 + E3 + E4 + E5
728 | O =~ O1 + O2 + O3 + O4 + O5
729 | A =~ A1 + A2 + A3 + A4 + A5
730 | C =~ C1 + C2 + C3 + C4 + C5
731 | '
732 |
733 | m4_fit <- cfa(m4_model, data=Data[, item_names], std.lv=TRUE)
734 |
735 | inspect(m4_fit, 'coefficients')$psi
736 | ```
737 |
738 | ```
739 | ## N E O A C
740 | ## N 1.000
741 | ## E -0.244 1.000
742 | ## O -0.112 0.453 1.000
743 | ## A -0.223 0.683 0.303 1.000
744 | ## C -0.283 0.357 0.301 0.334 1.000
745 | ```
746 |
747 | ```r
748 | cov2cor(inspect(m4_fit, 'coefficients')$psi)
749 | ```
750 |
751 | ```
752 | ## N E O A C
753 | ## N 1.000
754 | ## E -0.244 1.000
755 | ## O -0.112 0.453 1.000
756 | ## A -0.223 0.683 0.303 1.000
757 | ## C -0.283 0.357 0.301 0.334 1.000
758 | ```
759 |
760 |
761 |
762 |
763 | * `std.lv` is an argument that when `TRUE` standardises latent variables by fixing their variance to 1.0. The default is `FALSE`, which instead constrains the first factor loading to 1.0.
764 | * This makes the covariance and the correlation matrix of the factors the same.
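A quick way to verify this point (a sketch):

```r
# if psi is already a correlation matrix, cov2cor() leaves it unchanged
psi <- inspect(m4_fit, "coefficients")$psi
max(abs(psi - cov2cor(psi)))  # should be (near) zero
```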
765 |
766 | We can see the differences in the loadings by comparing the loadings for the neuroticism factor:
767 |
768 |
769 |
770 | ```r
771 | head(parameterestimates(m4_fit), 5)
772 | ```
773 |
774 | ```
775 | ## lhs op rhs est se z pvalue ci.lower ci.upper
776 | ## 1 N =~ N1 1.300 0.028 46.07 0 1.244 1.355
777 | ## 2 N =~ N2 1.230 0.028 44.38 0 1.176 1.285
778 | ## 3 N =~ N3 1.149 0.030 38.41 0 1.090 1.207
779 | ## 4 N =~ N4 0.899 0.031 28.75 0 0.838 0.960
780 | ## 5 N =~ N5 0.816 0.033 24.65 0 0.751 0.881
781 | ```
782 |
783 | ```r
784 | head(parameterestimates(m1_fit), 5)
785 | ```
786 |
787 | ```
788 | ## lhs op rhs est se z pvalue ci.lower ci.upper
789 | ## 1 N =~ N1 1.000 0.000 NA NA 1.000 1.000
790 | ## 2 N =~ N2 0.947 0.024 39.90 0 0.900 0.993
791 | ## 3 N =~ N3 0.884 0.025 35.92 0 0.836 0.932
792 | ## 4 N =~ N4 0.692 0.025 27.75 0 0.643 0.741
793 | ## 5 N =~ N5 0.628 0.026 24.03 0 0.577 0.679
794 | ```
795 |
796 | ```r
797 |
798 | # shows how ratio of loadings has not changed
799 | head(parameterestimates(m4_fit), 5)$est/head(parameterestimates(m4_fit),
800 | 5)$est[1]
801 | ```
802 |
803 | ```
804 | ## [1] 1.0000 0.9467 0.8839 0.6918 0.6278
805 | ```
806 |
807 |
808 |
809 |
810 |
811 |
812 | ### Add equality constraints
813 |
814 |
815 | ```r
816 | m5_model <- ' N =~ N1 + N2 + N3 + N4 + N5
817 | E =~ E1 + E2 + E3 + E4 + E5
818 | O =~ O1 + O2 + O3 + O4 + O5
819 | A =~ A1 + A2 + A3 + A4 + A5
820 | C =~ C1 + C2 + C3 + C4 + C5
821 | N ~~ R*E + R*O + R*A + R*C
822 | E ~~ R*O + R*A + R*C
823 | O ~~ R*A + R*C
824 | A ~~ R*C
825 | '
826 |
827 | Data_reversed <- Data
828 | Data_reversed[, paste0('N', 1:5)] <- 7 - Data[, paste0('N', 1:5)]
829 |
830 | m5_fit <- cfa(m5_model, data=Data_reversed[, item_names], std.lv=TRUE)
831 | ```
832 |
833 |
834 |
835 |
836 | * Equality constraints were added by labelling all the covariance parameters with a common label (i.e., `R`).
837 | * `~~` stands for covariance.
838 | * `R*E` attaches the label `R` to the covariance parameter involving the `E` variable.
839 | * I reversed the neuroticism items and hence the factor to ensure that all the inter-item correlations were positive.
840 |
841 | The following output shows that the correlation/covariance is the same for all factor inter-correlations.
842 |
843 |
844 |
845 | ```r
846 | inspect(m5_fit, "coefficients")$psi
847 | ```
848 |
849 | ```
850 | ## N E O A C
851 | ## N 1.000
852 | ## E 0.323 1.000
853 | ## O 0.323 0.323 1.000
854 | ## A 0.323 0.323 0.323 1.000
855 | ## C 0.323 0.323 0.323 0.323 1.000
856 | ```
857 |
858 |
859 |
860 |
861 | The following analysis compares the fit of the unconstrained model with the equal-covariance model.
862 |
863 |
864 |
865 | ```r
866 | round(cbind(m1 = inspect(m1_fit, "fit.measures"), m5 = inspect(m5_fit,
867 | "fit.measures")), 3)
868 | ```
869 |
870 | ```
871 | ## m1 m5
872 | ## chisq 4165.467 4.576e+03
873 | ## df 265.000 2.740e+02
874 | ## pvalue 0.000 0.000e+00
875 | ## baseline.chisq 18222.116 1.822e+04
876 | ## baseline.df 300.000 3.000e+02
877 | ## baseline.pvalue 0.000 0.000e+00
878 | ## cfi 0.782 7.600e-01
879 | ## tli 0.754 7.370e-01
880 | ## logl -99840.238 -1.000e+05
881 | ## unrestricted.logl -97757.504 -9.776e+04
882 | ## npar 60.000 5.100e+01
883 | ## aic 199800.476 2.002e+05
884 | ## bic 200148.363 2.005e+05
885 | ## ntotal 2436.000 2.436e+03
886 | ## bic2 199957.729 2.003e+05
887 | ## rmsea 0.078 8.000e-02
888 | ## rmsea.ci.lower 0.076 7.800e-02
889 | ## rmsea.ci.upper 0.080 8.200e-02
890 | ## rmsea.pvalue 0.000 0.000e+00
891 | ## srmr 0.075 8.900e-02
892 | ```
893 |
894 | ```r
895 | anova(m1_fit, m5_fit)
896 | ```
897 |
898 | ```
899 | ## Chi Square Difference Test
900 | ##
901 | ## Df AIC BIC Chisq Chisq diff Df diff Pr(>Chisq)
902 | ## m1_fit 265 2e+05 2e+05 4165
903 | ## m5_fit 274 2e+05 2e+05 4576 411 9 <2e-16 ***
904 | ## ---
905 | ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
906 | ```
907 |
908 |
909 |
910 |
911 | * The unconstrained model provides a better fit both in terms of the chi-square difference test and when comparing various parsimony-adjusted fit indices such as RMSEA.
912 | * The difference is relatively small.
913 |
914 | The following summarises the correlations between the factors (correlations with Neuroticism reversed).
915 |
916 |
917 |
918 | ```r
919 | rs <- abs(inspect(m4_fit, "coefficients")$psi)
920 | summary(rs[lower.tri(rs)])
921 | ```
922 |
923 | ```
924 | ## Min. 1st Qu. Median Mean 3rd Qu. Max.
925 | ## 0.112 0.254 0.302 0.329 0.352 0.683
926 | ```
927 |
928 | ```r
929 | hist(rs[lower.tri(rs)])
930 | ```
931 |
932 | 
933 |
934 | ```r
935 |
936 | round(rs, 2)
937 | ```
938 |
939 | ```
940 | ## N E O A C
941 | ## N 1.00
942 | ## E 0.24 1.00
943 | ## O 0.11 0.45 1.00
944 | ## A 0.22 0.68 0.30 1.00
945 | ## C 0.28 0.36 0.30 0.33 1.00
946 | ```
947 |
948 |
949 |
950 |
951 | * Given the very large sample size, even small variations in sample correlations likely reflect true variation.
952 | * However, in particular, the correlation between E and A is much larger than the average correlation, and the correlation between O and N is much smaller than the average correlation.
953 |
954 | ### Add equality constraints with some post hoc modifications
955 |
956 |
957 | ```r
958 | m6_model <- ' N =~ N1 + N2 + N3 + N4 + N5
959 | E =~ E1 + E2 + E3 + E4 + E5
960 | O =~ O1 + O2 + O3 + O4 + O5
961 | A =~ A1 + A2 + A3 + A4 + A5
962 | C =~ C1 + C2 + C3 + C4 + C5
963 | N ~~ R*E + R*A + R*C
964 | E ~~ R*O + R*C
965 | O ~~ R*A + R*C
966 | A ~~ R*C
967 | '
968 |
969 | Data_reversed <- Data
970 | Data_reversed[, paste0('N', 1:5)] <- 7 - Data[, paste0('N', 1:5)]
971 |
972 | m6_fit <- cfa(m6_model, data=Data_reversed[, item_names], std.lv=TRUE)
973 | ```
974 |
975 |
976 |
977 |
978 | The above model frees up the correlation between E and A, and between O and N.
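Before comparing fit, the labelling can be confirmed directly (a sketch using `parTable`; the freed pairs `E ~~ A` and `N ~~ O` should show an empty label):

```r
# list the factor covariances together with their equality labels
subset(parTable(m6_fit),
       op == "~~" & lhs != rhs &
         lhs %in% c("N", "E", "O", "A", "C") &
         rhs %in% c("N", "E", "O", "A", "C"),
       select = c(lhs, op, rhs, label))
```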
979 |
980 |
981 |
982 | ```r
983 | round(cbind(m1 = inspect(m1_fit, "fit.measures"), m5 = inspect(m1_fit,
984 | "fit.measures"), m6 = inspect(m6_fit, "fit.measures")), 3)
985 | ```
986 |
987 | ```
988 | ## m1 m5 m6
989 | ## chisq 4165.467 4165.467 4223.250
990 | ## df 265.000 265.000 272.000
991 | ## pvalue 0.000 0.000 0.000
992 | ## baseline.chisq 18222.116 18222.116 18222.116
993 | ## baseline.df 300.000 300.000 300.000
994 | ## baseline.pvalue 0.000 0.000 0.000
995 | ## cfi 0.782 0.782 0.780
996 | ## tli 0.754 0.754 0.757
997 | ## logl -99840.238 -99840.238 -99869.130
998 | ## unrestricted.logl -97757.504 -97757.504 -97757.504
999 | ## npar 60.000 60.000 53.000
1000 | ## aic 199800.476 199800.476 199844.259
1001 | ## bic 200148.363 200148.363 200151.559
1002 | ## ntotal 2436.000 2436.000 2436.000
1003 | ## bic2 199957.729 199957.729 199983.166
1004 | ## rmsea 0.078 0.078 0.077
1005 | ## rmsea.ci.lower 0.076 0.076 0.075
1006 | ## rmsea.ci.upper 0.080 0.080 0.079
1007 | ## rmsea.pvalue 0.000 0.000 0.000
1008 | ## srmr 0.075 0.075 0.077
1009 | ```
1010 |
1011 | ```r
1012 | anova(m1_fit, m6_fit)
1013 | ```
1014 |
1015 | ```
1016 | ## Chi Square Difference Test
1017 | ##
1018 | ## Df AIC BIC Chisq Chisq diff Df diff Pr(>Chisq)
1019 | ## m1_fit 265 2e+05 2e+05 4165
1020 | ## m6_fit 272 2e+05 2e+05 4223 57.8 7 4.2e-10 ***
1021 | ## ---
1022 | ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
1023 | ```
1024 |
1025 | ```r
1026 | anova(m5_fit, m6_fit)
1027 | ```
1028 |
1029 | ```
1030 | ## Chi Square Difference Test
1031 | ##
1032 | ## Df AIC BIC Chisq Chisq diff Df diff Pr(>Chisq)
1033 | ## m6_fit 272 2e+05 2e+05 4223
1034 | ## m5_fit 274 2e+05 2e+05 4576 353 2 <2e-16 ***
1035 | ## ---
1036 | ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
1037 | ```
1038 |
1039 |
1040 |
1041 |
1042 | * Freeing up these two correlations improved the model relative to the equality model. By most fit statistics, this model still provided a worse fit than the unconstrained model. However, interestingly, the RMSEA was slightly lower (i.e., better).
1043 |
1044 | ### Add equality constraints without reversal
1045 | Section 5.5 of the [Lavaan introductory guide 0.4-13](http://users.ugent.be/~yrosseel/lavaan/lavaanIntroduction.pdf) discusses various types of equality constraints. Thus, instead of reversing the neuroticism factor, it is possible to directly constrain the covariances of neuroticism with each other factor to be the negative of the covariances among the remaining factors.
1046 |
1047 |
1048 |
1049 | ```r
1050 | m7_model <- ' N =~ N1 + N2 + N3 + N4 + N5
1051 | E =~ E1 + E2 + E3 + E4 + E5
1052 | O =~ O1 + O2 + O3 + O4 + O5
1053 | A =~ A1 + A2 + A3 + A4 + A5
1054 | C =~ C1 + C2 + C3 + C4 + C5
1055 | # covariances
1056 | N ~~ R1*E + R1*O + R1*A + R1*C
1057 | E ~~ R2*O + R2*A + R2*C
1058 | O ~~ R2*A + R2*C
1059 | A ~~ R2*C
1060 |
1061 | # constraints
1062 | R1 == 0 - R2
1063 | '
1064 |
1065 | m7_fit <- cfa(m7_model, data=Data[, item_names], std.lv=TRUE)
1066 | ```
1067 |
1068 |
1069 |
1070 |
1071 | Let's check that the results are the same whether we reverse data or set negative constraints.
1072 |
1073 |
1074 |
1075 |
1076 |
1077 | ```r
1078 | m5_fit
1079 | ```
1080 |
1081 | ```
1082 | ## lavaan (0.4-14) converged normally after 43 iterations
1083 | ##
1084 | ## Number of observations 2436
1085 | ##
1086 | ## Estimator ML
1087 | ## Minimum Function Chi-square 4576.170
1088 | ## Degrees of freedom 274
1089 | ## P-value 0.000
1090 | ##
1091 | ```
1092 |
1093 | ```r
1094 | m7_fit
1095 | ```
1096 |
1097 | ```
1098 | ## lavaan (0.4-14) converged normally after 283 iterations
1099 | ##
1100 | ## Number of observations 2436
1101 | ##
1102 | ## Estimator ML
1103 | ## Minimum Function Chi-square 4576.170
1104 | ## Degrees of freedom 274
1105 | ## P-value 0.000
1106 | ##
1107 | ```
1108 |
1109 |
1110 |
--------------------------------------------------------------------------------
/cfa-example/cfa-example.rmd:
--------------------------------------------------------------------------------
1 | # CFA Example
2 |
3 | ```{r get_data, message=FALSE}
4 | library(psych)
5 | library(lavaan)
6 | Data <- bfi
7 | item_names <- names(Data)[1:25]
8 | ```
9 |
10 | ## Check data
11 |
12 | ```{r }
13 | sapply(Data[,item_names], function(X) sum(is.na(X)))
14 |
15 | Data$item_na <- apply(Data[,item_names], 1, function(X) sum(is.na(X)) > 0)
16 |
17 | table(Data$item_na)
18 | Data <- Data[!Data$item_na, ]
19 | ```
20 |
21 | * I decided to remove cases with missing data to simplify subsequent exploration of the features of the lavaan software.
22 |
23 |
24 | ## Basic CFA
25 | ```{r, tidy=FALSE}
26 | m1_model <- ' N =~ N1 + N2 + N3 + N4 + N5
27 | E =~ E1 + E2 + E3 + E4 + E5
28 | O =~ O1 + O2 + O3 + O4 + O5
29 | A =~ A1 + A2 + A3 + A4 + A5
30 | C =~ C1 + C2 + C3 + C4 + C5
31 | '
32 |
33 | m1_fit <- cfa(m1_model, data=Data[, item_names])
34 | summary(m1_fit, standardized=TRUE)
35 | ```
36 |
37 | * **`Std.lv`**: Only latent variables have been standardized.
38 | * **`Std.all`**: Observed and latent variables have been standardized.
39 | * **Factor loadings**: Under the `latent variables` section, the `Std.all` column provides standardised factor loadings.
40 | * **Factor correlations**: Under the `Covariances` section, the `Std.all` column provides standardised factor correlations.
41 | * **`Variances`**: Latent factor variances can be constrained to 1 for identifiability, but in this case the first loading on each factor was constrained to one instead. Variances for items represent the variance not explained by the latent factor.
42 |
43 |
44 |
45 | ```{r demonstrate_variance_point}
46 | variances <- c(unique=subset(inspect(m1_fit, "standardizedsolution"),
47 | lhs == 'N1' & rhs == 'N1')[, 'est.std'],
48 | common=subset(inspect(m1_fit, "standardizedsolution"),
49 | lhs == 'N' & rhs == 'N1')[, 'est.std']^2)
50 | (variances <- c(variances, total=sum(variances)))
51 | ```
52 |
53 | * The output above illustrates the point about variances. Variance for each item is explained by either the common factor or by error variance. As there is just one latent factor loading on the item, the squared standardised coefficient is the variance explained by the common factor. The sum of the unique and common standardised variances is one, which naturally corresponds to the variance of a standardised variable.
54 | * The code also demonstrates ideas about how to extract specific information from the lavaan model fit object. Specifically, the `inspect` method provides access to a wide range of specific information. See help for further details.
* I used the `subset` method to provide an easy one-liner for extracting elements from the data frame returned by the `inspect` method.
56 |
57 | ```{r}
58 | variances <- c(N1_N1=subset(parameterestimates(m1_fit),
59 | lhs == 'N1' & rhs == 'N1')[, 'est'],
60 | N_N=subset(parameterestimates(m1_fit),
61 | lhs == 'N' & rhs == 'N')[, 'est'],
62 | N_N1=subset(parameterestimates(m1_fit),
63 | lhs == 'N' & rhs == 'N1')[, 'est'])
64 |
65 | cbind(parameters = c(variances,
66 | total=variances['N_N1'] * variances['N_N'] + variances['N1_N1'],
67 | raw_divide_by_n_minus_1=var(Data[,'N1']),
68 | raw_divide_by_n=mean((Data[,'N1'] - mean(Data[,'N1']))^2)))
69 | ```
70 |
71 | * The output above shows the unstandardised parameters related to the item `N1`.
72 | * `N1_N1` corresponds to the unstandardised unique variance for the item.
73 | * `N_N` times `N_N1` represents the unstandardised common variance (strictly, the loading squared times the factor variance; this works here because the `N_N1` loading is fixed to one).
74 | * Thus, the sum of the unique and common variance represents the total variance.
75 | * When I calculated this on the raw data using the standard $n-1$ denominator, the value was slightly larger, but when I used $n$ as the denominator, the estimate was very close.
76 |
77 |
78 |
79 | ## Compare with a single factor model
80 | ```{r, tidy=FALSE}
81 | m2_model <- ' G =~ N1 + N2 + N3 + N4 + N5
82 | + E1 + E2 + E3 + E4 + E5
83 | + O1 + O2 + O3 + O4 + O5
84 | + A1 + A2 + A3 + A4 + A5
85 | + C1 + C2 + C3 + C4 + C5
86 | '
87 |
88 | m2_fit <- cfa(m2_model, data=Data[, item_names])
89 | summary(m2_fit, standardized=TRUE)
90 | ```
91 |
92 | ```{r}
93 | round(cbind(m1=inspect(m1_fit, 'fit.measures'),
94 | m2=inspect(m2_fit, 'fit.measures')), 3)
95 | anova(m1_fit, m2_fit)
96 | ```
97 |
98 | * The output compares the model fit statistics for the two models.
99 | * It also performs a chi-square difference test, which shows that the one-factor model has significantly worse fit than the five-factor model.
100 |
101 |
102 | ## Modification indices
103 | ```{r}
104 | m1_mod <- modificationindices(m1_fit)
105 | m1_mod_summary <- subset(m1_mod, mi > 100)
106 | m1_mod_summary[order(m1_mod_summary$mi, decreasing=TRUE), ]
107 | ```
108 |
109 | * `modificationindices` suggests several ad hoc modifications that could be made to improve the fit of the model.
110 | * The largest index suggests that items `N1` and `N2` share common variance. If we look at the help file on the bfi dataset `?bfi`, we see that the text for `N1` ("Get angry easily") and `N2` ("Get irritated easily") are very similar.
111 |
112 | ```{r}
113 | (N_cors <- round(cor(Data[, paste0('N', 1:5)]), 2))
114 | N1_N2_corr <- N_cors['N1', 'N2']
115 | other_N_corrs <- round(mean(abs(N_cors[lower.tri(N_cors)][-1])), 2)
116 |
117 | ```
118 |
119 | * The correlation matrix also shows that the correlation between N1 and N2 ($r = `r I(N1_N2_corr)`$) is much larger than it is for the other variables ($\text{mean}(|r|) = `r I(other_N_corrs)`$).
120 |
121 | ## Various matrices
122 | ### Observed, fitted, and residual covariance matrices
123 | The following analysis extracts observed, fitted, and residual covariances and checks that they are consistent with expectations. I only do this for the five neuroticism items rather than the full 25-item set, to make their meaning easier to demonstrate.
124 |
125 | ```{r}
126 | N_names <- paste0('N', 1:5)
127 | N_matrices <- list(
128 | observed=inspect(m1_fit, 'sampstat')$cov[N_names, N_names],
129 | fitted=fitted(m1_fit)$cov[N_names, N_names],
130 | residual=resid(m1_fit)$cov[N_names, N_names])
131 |
132 | N_matrices$check <- N_matrices$observed - (N_matrices$fitted + N_matrices$residual)
133 | lapply(N_matrices, function(X) round(X, 3))
134 | ```
135 |
136 | * The observed covariance matrix was extracted using the `cov` function on the sample data.
137 | * The fitted covariance matrix can be extracted using the `fitted` method on the model fit object and then extracting the `cov` element.
138 | * Many symmetric matrices in lavaan are of class `lavaan.matrix.symmetric`. This hides the upper triangle of the matrix and formats the matrix to `nd` decimal places.
139 | Run `getAnywhere(print.lavaan.matrix.symmetric)` to see more details.
140 | * The `sampstat` option in the `inspect` method can be used to extract the sample covariance matrix. This is similar, but not exactly the same as running `cov` on the sample data.
141 | * The `resid` method can be used to extract the residual covariance matrix.
142 | * I then create a `check` that `observed = fitted + residual`, which it does (the `check` matrix is all zeros).
143 |
144 | ### Observed, fitted, and residual correlation matrices
145 | I often find it more meaningful to examine observed, fitted, and residual correlation matrices. Standardisation often makes it easier to understand the real magnitude of any residual.
146 |
147 | ```{r}
148 | N_names <- paste0('N', 1:5)
149 | N_cov <- list(
150 | observed=inspect(m1_fit, 'sampstat')$cov[N_names, N_names],
151 | fitted=fitted(m1_fit)$cov[N_names, N_names])
152 |
153 | N_cor <- list(
154 | observed = cov2cor(N_cov$observed),
155 | fitted = cov2cor(N_cov$fitted) )
156 |
157 | N_cor$residual <- N_cor$observed - N_cor$fitted
158 |
159 | lapply(N_cor, function(X) round(X, 2))
160 | ```
161 |
162 | * `cov2cor` is a `base` R function that scales a covariance matrix into a correlation matrix.
163 | * Fitted and observed correlation matrices can be obtained by running `cov2cor` on the corresponding covariance matrices.
164 | * The residual correlation matrix can be obtained by subtracting the fitted correlation matrix from the observed correlation matrix.
165 | * In this case we can see that certain pairs of items correlate more or less than other pairs. In particular `N1-N2`, `N3-N4`, `N4-N5` have positive correlation residuals. An examination of the items below may suggest some added degree of similarity between these pairs of items. For example, N1 and N2 both concern anger and irritation, whereas N3 and N4 both concern mood and affect.
166 |
167 |
168 | > N1: Get angry easily. (q_952)
169 | > N2: Get irritated easily. (q_974)
170 | > N3: Have frequent mood swings. (q_1099)
171 | > N4: Often feel blue. (q_1479)
172 | > N5: Panic easily. (q_1505)
173 |
174 | ## Uncorrelated factors
175 | ### All uncorrelated factors
176 | The following examines a model with uncorrelated factors.
177 |
178 | ```{r tidy=FALSE}
179 | m3_model <- ' N =~ N1 + N2 + N3 + N4 + N5
180 | E =~ E1 + E2 + E3 + E4 + E5
181 | O =~ O1 + O2 + O3 + O4 + O5
182 | A =~ A1 + A2 + A3 + A4 + A5
183 | C =~ C1 + C2 + C3 + C4 + C5
184 | '
185 |
186 | m3_fit <- cfa(m3_model, data=Data[, item_names], orthogonal=TRUE)
187 |
188 | round(cbind(m1=inspect(m1_fit, 'fit.measures'),
189 | m3=inspect(m3_fit, 'fit.measures')), 3)
190 | anova(m1_fit, m3_fit)
191 |
192 | rmsea_m1 <- round(inspect(m1_fit, 'fit.measures')['rmsea'], 3)
193 | rmsea_m3 <- round(inspect(m3_fit, 'fit.measures')['rmsea'], 3)
194 | ```
195 |
196 | * To convert a `cfa` model from one that permits factors to be correlated to one that constrains factors to be uncorrelated, just specify `orthogonal=TRUE`.
197 | * In this case constraining the factor covariances to all be zero led to a significant reduction in fit. This poorer fit can also be seen in measures like RMSEA (m1 =
198 | `r rmsea_m1`; m3 = `r rmsea_m3` ).
199 |
200 |
201 | ### Correlations and covariances between factors
202 | It is useful to be able to extract correlations and covariances between factors.
203 |
204 | ```{r}
205 | inspect(m1_fit, 'coefficients')$psi
206 | cov2cor(inspect(m1_fit, 'coefficients')$psi)
207 | A_E_r <- cov2cor(inspect(m1_fit, 'coefficients')$psi)['A', 'E']
208 | ```
209 |
210 | * This code first extracts the factor variances and covariances.
211 | * I assume that naming the element `psi` (i.e., $\psi$) is a reference to LISREL matrix notation (see this discussion from [USP 655 SEM](http://www.upa.pdx.edu/IOA/newsom/semclass/ho_lisrel%20notation.pdf)).
212 | * Once again `cov2cor` is used to convert the covariance matrix to a correlation matrix.
213 | * An inspection of the values shows that there are some substantive correlations, which helps to explain why constraining them to zero in an orthogonal model would have substantially damaged fit. For example, the correlation between extraversion (`E`) and agreeableness (`A`) was quite high ($r = `r I(round(A_E_r, 2))`$).
214 |
215 |
216 | ```{r}
217 | # c('O', 'C', 'E', 'A', 'N') # set of factor names
218 | # lhs != rhs # excludes factor variances
219 | subset(inspect(m1_fit, 'standardized'),
220 | rhs %in% c('O', 'C', 'E', 'A', 'N') & lhs != rhs)
221 | ```
222 |
223 | * The same values can be extracted from the `standardized` coefficients table using the `inspect` method.
224 |
225 | We can also confirm that for the orthogonal model (`m3`) the correlations are zero.
226 |
227 | ```{r}
228 | cov2cor(inspect(m3_fit, 'coefficients')$psi)
229 | ```
230 |
231 |
232 | ## Constrain factor correlations to be equal
233 | ### Change constraints so that factor variances are one
234 |
235 | ```{r tidy=FALSE}
236 | m4_model <- ' N =~ N1 + N2 + N3 + N4 + N5
237 | E =~ E1 + E2 + E3 + E4 + E5
238 | O =~ O1 + O2 + O3 + O4 + O5
239 | A =~ A1 + A2 + A3 + A4 + A5
240 | C =~ C1 + C2 + C3 + C4 + C5
241 | '
242 |
243 | m4_fit <- cfa(m4_model, data=Data[, item_names], std.lv=TRUE)
244 |
245 | inspect(m4_fit, 'coefficients')$psi
246 | cov2cor(inspect(m4_fit, 'coefficients')$psi)
247 | ```
248 |
249 | * `std.lv` is an argument that when `TRUE` standardises latent variables by fixing their variance to 1.0. The default is `FALSE`, which instead constrains the first factor loading to 1.0.
250 | * This makes the covariance and the correlation matrix of the factors the same, as the check below shows.
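A quick check of this point (a sketch):

```{r}
# if psi is already a correlation matrix, cov2cor() leaves it unchanged
psi <- inspect(m4_fit, 'coefficients')$psi
max(abs(psi - cov2cor(psi)))  # should be (near) zero
```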
251 |
252 | We can see the differences in the loadings by comparing the loadings for the neuroticism factor:
253 |
254 | ```{r}
255 | head(parameterestimates(m4_fit), 5)
256 | head(parameterestimates(m1_fit), 5)
257 |
258 | # shows how ratio of loadings has not changed
259 | head(parameterestimates(m4_fit), 5)$est / head(parameterestimates(m4_fit), 5)$est[1]
260 | ```
261 |
262 |
263 |
264 | ### Add equality constraints
265 | ```{r tidy=FALSE}
266 | m5_model <- ' N =~ N1 + N2 + N3 + N4 + N5
267 | E =~ E1 + E2 + E3 + E4 + E5
268 | O =~ O1 + O2 + O3 + O4 + O5
269 | A =~ A1 + A2 + A3 + A4 + A5
270 | C =~ C1 + C2 + C3 + C4 + C5
271 | N ~~ R*E + R*O + R*A + R*C
272 | E ~~ R*O + R*A + R*C
273 | O ~~ R*A + R*C
274 | A ~~ R*C
275 | '
276 |
277 | Data_reversed <- Data
278 | Data_reversed[, paste0('N', 1:5)] <- 7 - Data[, paste0('N', 1:5)]
279 |
280 | m5_fit <- cfa(m5_model, data=Data_reversed[, item_names], std.lv=TRUE)
281 | ```
282 |
283 | * Equality constraints were added by labelling all the covariance parameters with a common label (i.e., `R`).
284 | * `~~` stands for covariance.
285 | * `R*E` attaches the label `R` to the covariance parameter involving the `E` variable.
286 | * I reversed the neuroticism items and hence the factor to ensure that all the inter-item correlations were positive.
287 |
288 | The following output shows that the correlation/covariance is the same for all factor inter-correlations.
289 |
290 | ```{r}
291 | inspect(m5_fit, 'coefficients')$psi
292 | ```
293 |
294 | The following analysis compares the fit of the unconstrained model with the equal-covariance model.
295 |
296 | ```{r}
297 | round(cbind(m1=inspect(m1_fit, 'fit.measures'),
298 | m5=inspect(m5_fit, 'fit.measures')), 3)
299 | anova(m1_fit, m5_fit)
300 | ```
301 |
302 | * The unconstrained model provides a better fit both in terms of the chi-square difference test and when comparing various parsimony-adjusted fit indices such as RMSEA.
303 | * The difference is relatively small.
304 |
305 | The following summarises the correlations between the factors (correlations with Neuroticism reversed).
306 |
307 | ```{r }
308 | rs <- abs(inspect(m4_fit, 'coefficients')$psi)
309 | summary(rs[lower.tri(rs)])
310 | hist(rs[lower.tri(rs)])
311 |
312 | round(rs, 2)
313 | ```
314 |
315 | * Given the very large sample size, even small variations in sample correlations likely reflect true variation.
316 | * However, in particular, the correlation between E and A is much larger than the average correlation, and the correlation between O and N is much smaller than the average correlation.
317 |
318 | ### Add equality constraints with some post hoc modifications
319 | ```{r tidy=FALSE}
320 | m6_model <- ' N =~ N1 + N2 + N3 + N4 + N5
321 | E =~ E1 + E2 + E3 + E4 + E5
322 | O =~ O1 + O2 + O3 + O4 + O5
323 | A =~ A1 + A2 + A3 + A4 + A5
324 | C =~ C1 + C2 + C3 + C4 + C5
325 | N ~~ R*E + R*A + R*C
326 | E ~~ R*O + R*C
327 | O ~~ R*A + R*C
328 | A ~~ R*C
329 | '
330 |
331 | Data_reversed <- Data
332 | Data_reversed[, paste0('N', 1:5)] <- 7 - Data[, paste0('N', 1:5)]
333 |
334 | m6_fit <- cfa(m6_model, data=Data_reversed[, item_names], std.lv=TRUE)
335 | ```
336 |
337 | The above model frees up the correlation between E and A, and between O and N.
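The labelling can be confirmed directly (a sketch using `parTable`; the freed pairs `E ~~ A` and `N ~~ O` should show an empty label):

```{r}
# list the factor covariances together with their equality labels
subset(parTable(m6_fit),
       op == '~~' & lhs != rhs &
         lhs %in% c('N', 'E', 'O', 'A', 'C') &
         rhs %in% c('N', 'E', 'O', 'A', 'C'),
       select = c(lhs, op, rhs, label))
```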
338 |
339 | ```{r}
340 | round(cbind(m1=inspect(m1_fit, 'fit.measures'),
341 | m5=inspect(m5_fit, 'fit.measures'),
342 | m6=inspect(m6_fit, 'fit.measures')), 3)
343 | anova(m1_fit, m6_fit)
344 | anova(m5_fit, m6_fit)
345 | ```
346 |
347 | * Freeing up these two correlations improved the model relative to the equality model. By most fit statistics, this model still provided a worse fit than the unconstrained model. However, interestingly, the RMSEA was slightly lower (i.e., better).
348 |
349 | ### Add equality constraints without reversal
350 | Section 5.5 of the [Lavaan introductory guide 0.4-13](http://users.ugent.be/~yrosseel/lavaan/lavaanIntroduction.pdf) discusses various types of equality constraints. Thus, instead of reversing the neuroticism factor, it is possible to directly constrain the covariances of neuroticism with each other factor to be the negative of the covariances among the remaining factors.
351 |
352 | ```{r tidy=FALSE}
353 | m7_model <- ' N =~ N1 + N2 + N3 + N4 + N5
354 | E =~ E1 + E2 + E3 + E4 + E5
355 | O =~ O1 + O2 + O3 + O4 + O5
356 | A =~ A1 + A2 + A3 + A4 + A5
357 | C =~ C1 + C2 + C3 + C4 + C5
358 | # covariances
359 | N ~~ R1*E + R1*O + R1*A + R1*C
360 | E ~~ R2*O + R2*A + R2*C
361 | O ~~ R2*A + R2*C
362 | A ~~ R2*C
363 |
364 | # constraints
365 | R1 == 0 - R2
366 | '
367 |
368 | m7_fit <- cfa(m7_model, data=Data[, item_names], std.lv=TRUE)
369 | ```
370 |
371 | Let's check that the results are the same whether we reverse data or set negative constraints.
372 |
373 |
374 | ```{r}
375 | m5_fit
376 | m7_fit
377 |
378 | inspect(m5_fit, 'coefficients')$psi
379 | inspect(m7_fit, 'coefficients')$psi
380 | ```
--------------------------------------------------------------------------------
/cfa-example/figure/unnamed-chunk-19.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jeromyanglim/lavaan-examples/d7f5cbdc7fe14ffd039512bae6aa140c2a0ca5e6/cfa-example/figure/unnamed-chunk-19.png
--------------------------------------------------------------------------------
/cheat-sheet-lavaan/cheat-sheet-lavaan.html:
--------------------------------------------------------------------------------
* `Data` is a data frame
* `model` is the lavaan model syntax character variable
* `fit` is an object of class `lavaan`, typically returned from the functions `cfa`, `sem`, `growth`, and `lavaan`
* `m1_fit` and `m2_fit` are used for showing model comparison of `lavaan` objects

Getting help: `?cfa ?sem ?lavaan`; `?inspect`

Name | Command
---|---
fit CFA to data | `cfa(model, data=Data)`
fit SEM to data | `sem(model, data=Data)`
standardised solution | `sem(model, data=Data, std.ov=TRUE)`
orthogonal factors | `cfa(model, data=Data, orthogonal=TRUE)`

Name | Command
---|---
Factor covariance matrix | `inspect(fit, "coefficients")$psi`
Fitted covariance matrix | `fitted(fit)$cov`
Observed covariance matrix | `inspect(fit, 'sampstat')$cov`
Residual covariance matrix | `resid(fit)$cov`
Factor correlation matrix | `cov2cor(inspect(fit, "coefficients")$psi)` or use the covariance command with a standardised solution, e.g., `cfa(..., std.ov=TRUE)`

Name | Command
---|---
Fit measures | `fitMeasures(fit)`
Specific fit measures, e.g. | `fitMeasures(fit)[c('chisq', 'df', 'pvalue', 'cfi', 'rmsea', 'srmr')]`

Name | Command
---|---
Parameter information | `parTable(fit)`
Standardised estimates | `standardizedSolution(fit)` or `summary(fit, standardized=TRUE)`
R-squared | `inspect(fit, 'r2')`

Name | Command
---|---
Compare fit measures | `cbind(m1=inspect(m1_fit, 'fit.measures'), m2=inspect(m2_fit, 'fit.measures'))`
Chi-square difference test | `anova(m1_fit, m2_fit)`

Name | Command
---|---
Modification indices | `mod_ind <- modificationindices(fit)`
10 greatest | `head(mod_ind[order(mod_ind$mi, decreasing=TRUE), ], 10)`
mi > 5 | `subset(mod_ind[order(mod_ind$mi, decreasing=TRUE), ], mi > 5)`
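The commands above chain together naturally. A minimal end-to-end sketch (`HolzingerSwineford1939` is a dataset bundled with lavaan; the two-factor model is purely illustrative):

```r
library(lavaan)

# illustrative two-factor CFA on the bundled Holzinger-Swineford data
model <- ' visual  =~ x1 + x2 + x3
           textual =~ x4 + x5 + x6 '
fit <- cfa(model, data = HolzingerSwineford1939)

fitMeasures(fit)[c('chisq', 'df', 'pvalue', 'cfi', 'rmsea', 'srmr')]
cov2cor(inspect(fit, "coefficients")$psi)  # factor correlation matrix
```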
--------------------------------------------------------------------------------
/ex1-paper/ex1-paper.html:
--------------------------------------------------------------------------------
This exercise examines the first example shown in
http://www.jstatsoft.org/v48/i02/paper.
It's a three-factor confirmatory factor analysis example with three items per factor.
All three latent factors are permitted to correlate.

* `x1` to `x3` load on a `visual` factor
* `x4` to `x6` load on a `textual` factor
* `x7` to `x9` load on a `speed` factor

```r
library('lavaan')
library('Hmisc')
cases <- HolzingerSwineford1939

str(cases)
```
```
## 'data.frame': 301 obs. of 15 variables:
##  $ id    : int 1 2 3 4 5 6 7 8 9 11 ...
##  $ sex   : int 1 2 2 1 2 2 1 2 2 2 ...
##  $ ageyr : int 13 13 13 13 12 14 12 12 13 12 ...
##  $ agemo : int 1 7 1 2 2 1 1 2 0 5 ...
##  $ school: Factor w/ 2 levels "Grant-White",..: 2 2 2 2 2 2 2 2 2 2 ...
##  $ grade : int 7 7 7 7 7 7 7 7 7 7 ...
##  $ x1    : num 3.33 5.33 4.5 5.33 4.83 ...
##  $ x2    : num 7.75 5.25 5.25 7.75 4.75 5 6 6.25 5.75 5.25 ...
##  $ x3    : num 0.375 2.125 1.875 3 0.875 ...
##  $ x4    : num 2.33 1.67 1 2.67 2.67 ...
##  $ x5    : num 5.75 3 1.75 4.5 4 3 6 4.25 5.75 5 ...
##  $ x6    : num 1.286 1.286 0.429 2.429 2.571 ...
##  $ x7    : num 3.39 3.78 3.26 3 3.7 ...
##  $ x8    : num 5.75 6.25 3.9 5.3 6.3 6.65 6.2 5.15 4.65 4.55 ...
##  $ x9    : num 6.36 7.92 4.42 4.86 5.92 ...
```

```r
Hmisc::describe(cases)
```

```
## cases
##
## 15 Variables 301 Observations
## ---------------------------------------------------------------------------
## id
##       n missing  unique    Mean     .05     .10     .25     .50     .75
##     301       0     301   176.6      17      33      82     163     272
##     .90     .95
##     318     335
##
## lowest : 1 2 3 4 5, highest: 346 347 348 349 351
## ---------------------------------------------------------------------------
## sex
##       n missing  unique    Mean
##     301       0       2   1.515
##
## 1 (146, 49%), 2 (155, 51%)
## ---------------------------------------------------------------------------
## ageyr
##       n missing  unique    Mean
##     301       0       6      13
##
##           11  12  13 14 15 16
## Frequency  8 101 110 55 20  7
## %          3  34  37 18  7  2
## ---------------------------------------------------------------------------
## agemo
##       n missing  unique    Mean     .05     .10     .25     .50     .75
##     301       0      12   5.375       0       1       2       5       8
##     .90     .95
##      10      11
##
##            0  1  2  3  4  5  6  7  8  9 10 11
## Frequency 22 31 26 26 27 27 21 25 26 23 19 28
## %          7 10  9  9  9  9  7  8  9  8  6  9
## ---------------------------------------------------------------------------
## school
##       n missing  unique
##     301       0       2
##
## Grant-White (145, 48%), Pasteur (156, 52%)
## ---------------------------------------------------------------------------
## grade
##       n missing  unique    Mean
##     300       1       2   7.477
##
## 7 (157, 52%), 8 (143, 48%)
## ---------------------------------------------------------------------------
## x1
##       n missing  unique    Mean     .05     .10     .25     .50     .75
##     301       0      35   4.936   3.000   3.333   4.167   5.000   5.667
##     .90     .95
##   6.333   6.667
##
## lowest : 0.6667 1.6667 1.8333 2.0000 2.6667
## highest: 7.0000 7.1667 7.3333 7.5000 8.5000
## ---------------------------------------------------------------------------
## x2
##       n missing  unique    Mean     .05     .10     .25     .50     .75
##     301       0      25   6.088    4.50    4.75    5.25    6.00    6.75
##     .90     .95
##    7.75    8.50
##
## lowest : 2.25 3.50 3.75 4.00 4.25, highest: 8.25 8.50 8.75 9.00 9.25
## ---------------------------------------------------------------------------
## x3
##       n missing  unique    Mean     .05     .10     .25     .50     .75
##     301       0      35    2.25   0.625   0.875   1.375   2.125   3.125
##     .90     .95
##   4.000   4.250
##
## lowest : 0.250 0.375 0.500 0.625 0.750
## highest: 4.000 4.125 4.250 4.375 4.500
## ---------------------------------------------------------------------------
305 | ## x4
306 | ## n missing unique Mean .05 .10 .25 .50 .75
307 | ## 301 0 20 3.061 1.333 1.667 2.333 3.000 3.667
308 | ## .90 .95
309 | ## 4.667 5.000
310 | ##
311 | ## lowest : 0.0000 0.3333 0.6667 1.0000 1.3333
312 | ## highest: 5.0000 5.3333 5.6667 6.0000 6.3333
313 | ## ---------------------------------------------------------------------------
314 | ## x5
315 | ## n missing unique Mean .05 .10 .25 .50 .75
316 | ## 301 0 25 4.341 2.00 2.50 3.50 4.50 5.25
317 | ## .90 .95
318 | ## 6.00 6.25
319 | ##
320 | ## lowest : 1.00 1.25 1.50 1.75 2.00, highest: 6.00 6.25 6.50 6.75 7.00
321 | ## ---------------------------------------------------------------------------
322 | ## x6
323 | ## n missing unique Mean .05 .10 .25 .50 .75
324 | ## 301 0 40 2.186 0.7143 1.0000 1.4286 2.0000 2.7143
325 | ## .90 .95
326 | ## 3.7143 4.2857
327 | ##
328 | ## lowest : 0.1429 0.2857 0.4286 0.5714 0.7143
329 | ## highest: 5.1429 5.4286 5.5714 5.8571 6.1429
330 | ## ---------------------------------------------------------------------------
331 | ## x7
332 | ## n missing unique Mean .05 .10 .25 .50 .75
333 | ## 301 0 97 4.186 2.435 2.826 3.478 4.087 4.913
334 | ## .90 .95
335 | ## 5.696 5.870
336 | ##
337 | ## lowest : 1.304 1.870 2.000 2.043 2.130
338 | ## highest: 6.652 6.826 6.957 7.261 7.435
339 | ## ---------------------------------------------------------------------------
340 | ## x8
341 | ## n missing unique Mean .05 .10 .25 .50 .75
342 | ## 301 0 84 5.527 3.90 4.20 4.85 5.50 6.10
343 | ## .90 .95
344 | ## 6.80 7.20
345 | ##
346 | ## lowest : 3.05 3.50 3.60 3.65 3.70
347 | ## highest: 8.00 8.05 8.30 9.10 10.00
348 | ## ---------------------------------------------------------------------------
349 | ## x9
350 | ## n missing unique Mean .05 .10 .25 .50 .75
351 | ## 301 0 129 5.374 3.750 4.111 4.750 5.417 6.083
352 | ## .90 .95
353 | ## 6.667 7.000
354 | ##
355 | ## lowest : 2.778 3.111 3.222 3.278 3.306
356 | ## highest: 7.528 7.611 7.917 8.611 9.250
357 | ## ---------------------------------------------------------------------------
358 |
359 |
The data set includes 301 observations. It contains a few demographic variables (sex, age in years and months, school, and grade) along with the nine observed test scores used in the subsequent CFA.
m1_model <- ' visual =~ x1 + x2 + x3
365 | textual =~ x4 + x5 + x6
366 | speed =~ x7 + x8 + x9
367 | '
368 |
369 | m1_fit <- cfa(m1_model, data=cases)
370 |
371 |
372 | cfa
is one of the model-fitting functions in `lavaan`. The command includes many options. Data can be specified as a data frame, as is done here using the `data` argument; alternatively, a covariance matrix, a vector of means, and a sample size can be supplied.
`lavaan` is the parent model-fitting function, which takes a `model.type` argument of `'cfa'`, `'sem'`, or `'growth'`. Thus, `cfa`, `sem`, and `growth` are wrapper functions that call `lavaan` with particular argument values.

parTable(m1_fit)
381 |
382 |
383 | ## id lhs op rhs user group free ustart exo label eq.id unco
384 | ## 1 1 visual =~ x1 1 1 0 1 0 0 0
385 | ## 2 2 visual =~ x2 1 1 1 NA 0 0 1
386 | ## 3 3 visual =~ x3 1 1 2 NA 0 0 2
387 | ## 4 4 textual =~ x4 1 1 0 1 0 0 0
388 | ## 5 5 textual =~ x5 1 1 3 NA 0 0 3
389 | ## 6 6 textual =~ x6 1 1 4 NA 0 0 4
390 | ## 7 7 speed =~ x7 1 1 0 1 0 0 0
391 | ## 8 8 speed =~ x8 1 1 5 NA 0 0 5
392 | ## 9 9 speed =~ x9 1 1 6 NA 0 0 6
393 | ## 10 10 x1 ~~ x1 0 1 7 NA 0 0 7
394 | ## 11 11 x2 ~~ x2 0 1 8 NA 0 0 8
395 | ## 12 12 x3 ~~ x3 0 1 9 NA 0 0 9
396 | ## 13 13 x4 ~~ x4 0 1 10 NA 0 0 10
397 | ## 14 14 x5 ~~ x5 0 1 11 NA 0 0 11
398 | ## 15 15 x6 ~~ x6 0 1 12 NA 0 0 12
399 | ## 16 16 x7 ~~ x7 0 1 13 NA 0 0 13
400 | ## 17 17 x8 ~~ x8 0 1 14 NA 0 0 14
401 | ## 18 18 x9 ~~ x9 0 1 15 NA 0 0 15
402 | ## 19 19 visual ~~ visual 0 1 16 NA 0 0 16
403 | ## 20 20 textual ~~ textual 0 1 17 NA 0 0 17
404 | ## 21 21 speed ~~ speed 0 1 18 NA 0 0 18
405 | ## 22 22 visual ~~ textual 0 1 19 NA 0 0 19
406 | ## 23 23 visual ~~ speed 0 1 20 NA 0 0 20
407 | ## 24 24 textual ~~ speed 0 1 21 NA 0 0 21
408 |
409 |
410 | What do the columns mean?
* `id`: numeric identifier for the parameter
* `lhs`: left-hand side variable name
* `op`: operator (see page 7 of http://www.jstatsoft.org/v48/i02/paper); `=~` means "is manifested by"; `~~` means "is correlated with"
* `rhs`: right-hand side variable name
* `user`: 1 if the parameter was specified by the user, 0 otherwise
* `group`: presumably used in multiple-group analysis
* `free`: nonzero elements are free parameters in the model
* `ustart`: the value specified for fixed parameters
* `exo`: presumably flags exogenous variables
* `label`: probably just an optional label
* `eq.id`: presumably an identifier for sets of parameters constrained to be equal
* `unco`: presumably a counter for the unconstrained parameters

The model syntax used in `lavaan`
incorporates a lot of parameters by default to permit a tidy model syntax. The exact nature of these parameters is also determined by options to `cfa`, `sem`, and the other model-fitting functions.
`parTable` is a method for extracting the parameter table of a fitted model. Here it shows that the latent factors are allowed to intercorrelate: the `cfa` function has an argument `orthogonal`, which defaults to FALSE and so permits correlated factors.
parTable(cfa(m1_model, data=cases, orthogonal=TRUE))[22:24, ]
435 |
436 |
437 | ## id lhs op rhs user group free ustart exo label eq.id unco
438 | ## 22 22 visual ~~ textual 0 1 0 0 0 0 0
439 | ## 23 23 visual ~~ speed 0 1 0 0 0 0 0
440 | ## 24 24 textual ~~ speed 0 1 0 0 0 0 0
441 |
442 |
443 | When orthogonal=TRUE
is specified, the covariances among the latent factors are constrained to zero. This is reflected in `free=0` (i.e., the parameter is not free to vary) and `ustart=0` (the constrained value is zero) in the parameter table.
Returning to the original parameter table:
* Variances (`op` of `~~` where `lhs` is the same as `rhs`) are included for all observed and latent variables.

summary(m1_fit)
454 |
455 |
456 | ## lavaan (0.4-14) converged normally after 41 iterations
457 | ##
458 | ## Number of observations 301
459 | ##
460 | ## Estimator ML
461 | ## Minimum Function Chi-square 85.306
462 | ## Degrees of freedom 24
463 | ## P-value 0.000
464 | ##
465 | ## Parameter estimates:
466 | ##
467 | ## Information Expected
468 | ## Standard Errors Standard
469 | ##
470 | ## Estimate Std.err Z-value P(>|z|)
471 | ## Latent variables:
472 | ## visual =~
473 | ## x1 1.000
474 | ## x2 0.553 0.100 5.554 0.000
475 | ## x3 0.729 0.109 6.685 0.000
476 | ## textual =~
477 | ## x4 1.000
478 | ## x5 1.113 0.065 17.014 0.000
479 | ## x6 0.926 0.055 16.703 0.000
480 | ## speed =~
481 | ## x7 1.000
482 | ## x8 1.180 0.165 7.152 0.000
483 | ## x9 1.082 0.151 7.155 0.000
484 | ##
485 | ## Covariances:
486 | ## visual ~~
487 | ## textual 0.408 0.074 5.552 0.000
488 | ## speed 0.262 0.056 4.660 0.000
489 | ## textual ~~
490 | ## speed 0.173 0.049 3.518 0.000
491 | ##
492 | ## Variances:
493 | ## x1 0.549 0.114
494 | ## x2 1.134 0.102
495 | ## x3 0.844 0.091
496 | ## x4 0.371 0.048
497 | ## x5 0.446 0.058
498 | ## x6 0.356 0.043
499 | ## x7 0.799 0.081
500 | ## x8 0.488 0.074
501 | ## x9 0.566 0.071
502 | ## visual 0.809 0.145
503 | ## textual 0.979 0.112
504 | ## speed 0.384 0.086
505 | ##
506 |
507 |
508 | The default summary
method shows the \( \chi^2 \), \( df \), and p-value for the overall model, together with unstandardised parameter estimates, in some cases with significance tests.

There are multiple ways of getting fit statistics:
513 | 514 |fitMeasures(m1_fit)
515 |
516 |
517 | ## chisq df pvalue baseline.chisq
518 | ## 85.306 24.000 0.000 918.852
519 | ## baseline.df baseline.pvalue cfi tli
520 | ## 36.000 0.000 0.931 0.896
521 | ## logl unrestricted.logl npar aic
522 | ## -3737.745 -3695.092 21.000 7517.490
523 | ## bic ntotal bic2 rmsea
524 | ## 7595.339 301.000 7528.739 0.092
525 | ## rmsea.ci.lower rmsea.ci.upper rmsea.pvalue srmr
526 | ## 0.071 0.114 0.001 0.065
527 |
528 |
529 | # equivalent to:
530 | # inspect(m1_fit, 'fit.measures')
531 |
532 | fitMeasures(m1_fit)['rmsea']
533 |
534 |
535 | ## rmsea
536 | ## 0.09212
537 |
538 |
539 | fitMeasures(m1_fit, c('rmsea', 'rmsea.ci.lower', 'rmsea.ci.upper'))
540 |
541 |
542 | ## rmsea rmsea.ci.lower rmsea.ci.upper
543 | ## 0.092 0.071 0.114
544 |
545 |
546 |
547 |
548 | summary(m1_fit, fit.measures=TRUE)
549 |
550 |
551 | ## lavaan (0.4-14) converged normally after 41 iterations
552 | ##
553 | ## Number of observations 301
554 | ##
555 | ## Estimator ML
556 | ## Minimum Function Chi-square 85.306
557 | ## Degrees of freedom 24
558 | ## P-value 0.000
559 | ##
560 | ## Chi-square test baseline model:
561 | ##
562 | ## Minimum Function Chi-square 918.852
563 | ## Degrees of freedom 36
564 | ## P-value 0.000
565 | ##
566 | ## Full model versus baseline model:
567 | ##
568 | ## Comparative Fit Index (CFI) 0.931
569 | ## Tucker-Lewis Index (TLI) 0.896
570 | ##
571 | ## Loglikelihood and Information Criteria:
572 | ##
573 | ## Loglikelihood user model (H0) -3737.745
574 | ## Loglikelihood unrestricted model (H1) -3695.092
575 | ##
576 | ## Number of free parameters 21
577 | ## Akaike (AIC) 7517.490
578 | ## Bayesian (BIC) 7595.339
579 | ## Sample-size adjusted Bayesian (BIC) 7528.739
580 | ##
581 | ## Root Mean Square Error of Approximation:
582 | ##
583 | ## RMSEA 0.092
584 | ## 90 Percent Confidence Interval 0.071 0.114
585 | ## P-value RMSEA <= 0.05 0.001
586 | ##
587 | ## Standardized Root Mean Square Residual:
588 | ##
589 | ## SRMR 0.065
590 | ##
591 | ## Parameter estimates:
592 | ##
593 | ## Information Expected
594 | ## Standard Errors Standard
595 | ##
596 | ## Estimate Std.err Z-value P(>|z|)
597 | ## Latent variables:
598 | ## visual =~
599 | ## x1 1.000
600 | ## x2 0.553 0.100 5.554 0.000
601 | ## x3 0.729 0.109 6.685 0.000
602 | ## textual =~
603 | ## x4 1.000
604 | ## x5 1.113 0.065 17.014 0.000
605 | ## x6 0.926 0.055 16.703 0.000
606 | ## speed =~
607 | ## x7 1.000
608 | ## x8 1.180 0.165 7.152 0.000
609 | ## x9 1.082 0.151 7.155 0.000
610 | ##
611 | ## Covariances:
612 | ## visual ~~
613 | ## textual 0.408 0.074 5.552 0.000
614 | ## speed 0.262 0.056 4.660 0.000
615 | ## textual ~~
616 | ## speed 0.173 0.049 3.518 0.000
617 | ##
618 | ## Variances:
619 | ## x1 0.549 0.114
620 | ## x2 1.134 0.102
621 | ## x3 0.844 0.091
622 | ## x4 0.371 0.048
623 | ## x5 0.446 0.058
624 | ## x6 0.356 0.043
625 | ## x7 0.799 0.081
626 | ## x8 0.488 0.074
627 | ## x9 0.566 0.071
628 | ## visual 0.809 0.145
629 | ## textual 0.979 0.112
630 | ## speed 0.384 0.086
631 | ##
632 |
633 |
634 | rmsea.ci.lower
and `rmsea.ci.upper` refer to the lower and upper bounds of the 90% confidence interval for the RMSEA.
`fit.measures=TRUE` provides a way of displaying the main fit indices as part of the summary output.

m1_mod <- modificationIndices(m1_fit)
644 | head(m1_mod[order(m1_mod$mi, decreasing=TRUE), ], 10)
645 |
646 |
647 | ## lhs op rhs mi epc sepc.lv sepc.all sepc.nox
648 | ## 1 visual =~ x9 36.411 0.577 0.519 0.515 0.515
649 | ## 2 x7 ~~ x8 34.145 0.536 0.536 0.488 0.488
650 | ## 3 visual =~ x7 18.631 -0.422 -0.380 -0.349 -0.349
651 | ## 4 x8 ~~ x9 14.946 -0.423 -0.423 -0.415 -0.415
652 | ## 5 textual =~ x3 9.151 -0.272 -0.269 -0.238 -0.238
653 | ## 6 x2 ~~ x7 8.918 -0.183 -0.183 -0.143 -0.143
654 | ## 7 textual =~ x1 8.903 0.350 0.347 0.297 0.297
655 | ## 8 x2 ~~ x3 8.532 0.218 0.218 0.164 0.164
656 | ## 9 x3 ~~ x5 7.858 -0.130 -0.130 -0.089 -0.089
657 | ## 10 visual =~ x5 7.441 -0.210 -0.189 -0.147 -0.147
658 |
659 |
660 | modificationIndices
function returns modification indices and expected parameter changes (EPCs). The largest index suggests letting `x9` also load on the `visual` factor, so the next model adds that loading.

m2_model <- ' visual =~ x1 + x2 + x3 + x9
666 | textual =~ x4 + x5 + x6
667 | speed =~ x7 + x8 + x9
668 | '
669 |
670 | m2_fit <- cfa(m2_model, data=cases)
671 | anova(m1_fit, m2_fit)
672 |
673 |
674 | ## Chi Square Difference Test
675 | ##
676 | ## Df AIC BIC Chisq Chisq diff Df diff Pr(>Chisq)
677 | ## m2_fit 23 7487 7568 52.4
678 | ## m1_fit 24 7517 7595 85.3 32.9 1 9.6e-09 ***
679 | ## ---
680 | ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
681 |
682 |
* The chi-square difference of 32.9234 differs from the value of the modification index, 36.411. This is expected: the modification index is only an approximation to the change in chi-square obtained by actually freeing the parameter and refitting the model.

summary(m1_fit)
694 |
695 |
696 | ## lavaan (0.4-14) converged normally after 41 iterations
697 | ##
698 | ## Number of observations 301
699 | ##
700 | ## Estimator ML
701 | ## Minimum Function Chi-square 85.306
702 | ## Degrees of freedom 24
703 | ## P-value 0.000
704 | ##
705 | ## Parameter estimates:
706 | ##
707 | ## Information Expected
708 | ## Standard Errors Standard
709 | ##
710 | ## Estimate Std.err Z-value P(>|z|)
711 | ## Latent variables:
712 | ## visual =~
713 | ## x1 1.000
714 | ## x2 0.553 0.100 5.554 0.000
715 | ## x3 0.729 0.109 6.685 0.000
716 | ## textual =~
717 | ## x4 1.000
718 | ## x5 1.113 0.065 17.014 0.000
719 | ## x6 0.926 0.055 16.703 0.000
720 | ## speed =~
721 | ## x7 1.000
722 | ## x8 1.180 0.165 7.152 0.000
723 | ## x9 1.082 0.151 7.155 0.000
724 | ##
725 | ## Covariances:
726 | ## visual ~~
727 | ## textual 0.408 0.074 5.552 0.000
728 | ## speed 0.262 0.056 4.660 0.000
729 | ## textual ~~
730 | ## speed 0.173 0.049 3.518 0.000
731 | ##
732 | ## Variances:
733 | ## x1 0.549 0.114
734 | ## x2 1.134 0.102
735 | ## x3 0.844 0.091
736 | ## x4 0.371 0.048
737 | ## x5 0.446 0.058
738 | ## x6 0.356 0.043
739 | ## x7 0.799 0.081
740 | ## x8 0.488 0.074
741 | ## x9 0.566 0.071
742 | ## visual 0.809 0.145
743 | ## textual 0.979 0.112
744 | ## speed 0.384 0.086
745 | ##
746 |
747 |
748 | standardizedSolution(m1_fit)
749 |
750 |
751 | ## lhs op rhs est.std se z pvalue
752 | ## 1 visual =~ x1 0.772 NA NA NA
753 | ## 2 visual =~ x2 0.424 NA NA NA
754 | ## 3 visual =~ x3 0.581 NA NA NA
755 | ## 4 textual =~ x4 0.852 NA NA NA
756 | ## 5 textual =~ x5 0.855 NA NA NA
757 | ## 6 textual =~ x6 0.838 NA NA NA
758 | ## 7 speed =~ x7 0.570 NA NA NA
759 | ## 8 speed =~ x8 0.723 NA NA NA
760 | ## 9 speed =~ x9 0.665 NA NA NA
761 | ## 10 x1 ~~ x1 0.404 NA NA NA
762 | ## 11 x2 ~~ x2 0.821 NA NA NA
763 | ## 12 x3 ~~ x3 0.662 NA NA NA
764 | ## 13 x4 ~~ x4 0.275 NA NA NA
765 | ## 14 x5 ~~ x5 0.269 NA NA NA
766 | ## 15 x6 ~~ x6 0.298 NA NA NA
767 | ## 16 x7 ~~ x7 0.676 NA NA NA
768 | ## 17 x8 ~~ x8 0.477 NA NA NA
769 | ## 18 x9 ~~ x9 0.558 NA NA NA
770 | ## 19 visual ~~ visual 1.000 NA NA NA
771 | ## 20 textual ~~ textual 1.000 NA NA NA
772 | ## 21 speed ~~ speed 1.000 NA NA NA
773 | ## 22 visual ~~ textual 0.459 NA NA NA
774 | ## 23 visual ~~ speed 0.471 NA NA NA
775 | ## 24 textual ~~ speed 0.283 NA NA NA
776 |
777 |
778 |
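As one further extraction (cf. the cheat sheet above), the factor correlation matrix can be obtained by converting the latent covariance matrix; a small sketch:

```r
# correlations among the visual, textual, and speed factors
cov2cor(inspect(m1_fit, 'coefficients')$psi)
```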
779 |
780 |
781 |
782 |
--------------------------------------------------------------------------------
/ex1-paper/ex1-paper.md:
--------------------------------------------------------------------------------
1 | # Example 1 from Lavaan
2 |
3 | This exercise examines the first example shown in
4 | http://www.jstatsoft.org/v48/i02/paper.
--------------------------------------------------------------------------------
/ex2-paper/ex2-paper.md:
--------------------------------------------------------------------------------
1 |
2 | # Example 2 from Rosseel's Paper on lavaan
3 |
4 |
5 | ```r
6 | library(lavaan)
7 | Data <- PoliticalDemocracy
8 | ```
9 |
10 |
11 |
12 |
13 | This example is an elaboration on Example 2 from Yves Rosseel's Journal of Statistical Software Article (see [here](http://www.jstatsoft.org/v48/i02/paper)).
14 |
15 | ## M0: Basic Measurement model
16 |
17 |
18 | ```r
19 | m0_model <- '
20 | # measurement model
21 | ind60 =~ x1 + x2 + x3
22 | dem60 =~ y1 + y2 + y3 + y4
23 | dem65 =~ y5 + y6 + y7 + y8
24 | '
25 |
26 | m0_fit <- cfa(m0_model, data=Data)
27 | ```
28 |
29 |
30 |
31 |
32 | * `m0` defines a basic measurement model that permits correlated factors. Note that it does not have correlations between corresponding democracy indicator measures over time.
33 |
34 | **Questions:**
35 |
36 | * Is it a good model?
37 |
38 |
39 |
40 | ```r
41 | fitmeasures(m0_fit)
42 | ```
43 |
44 | ```
45 | ## chisq df pvalue baseline.chisq
46 | ## 72.462 41.000 0.002 730.654
47 | ## baseline.df baseline.pvalue cfi tli
48 | ## 55.000 0.000 0.953 0.938
49 | ## logl unrestricted.logl npar aic
50 | ## -1564.959 -1528.728 25.000 3179.918
51 | ## bic ntotal bic2 rmsea
52 | ## 3237.855 75.000 3159.062 0.101
53 | ## rmsea.ci.lower rmsea.ci.upper rmsea.pvalue srmr
54 | ## 0.061 0.139 0.021 0.055
55 | ```
56 |
57 |
58 |
59 |
60 | * CFI suggests a reasonable model, but the RMSEA is quite large.
61 |
62 |
63 |
64 | ```r
65 | inspect(m0_fit, 'standardized')
66 | ```
67 |
68 | ```
69 | ## lhs op rhs est.std se z pvalue
70 | ## 1 ind60 =~ x1 0.920 NA NA NA
71 | ## 2 ind60 =~ x2 0.973 NA NA NA
72 | ## 3 ind60 =~ x3 0.872 NA NA NA
73 | ## 4 dem60 =~ y1 0.845 NA NA NA
74 | ## 5 dem60 =~ y2 0.760 NA NA NA
75 | ## 6 dem60 =~ y3 0.705 NA NA NA
76 | ## 7 dem60 =~ y4 0.860 NA NA NA
77 | ## 8 dem65 =~ y5 0.803 NA NA NA
78 | ## 9 dem65 =~ y6 0.783 NA NA NA
79 | ## 10 dem65 =~ y7 0.819 NA NA NA
80 | ## 11 dem65 =~ y8 0.847 NA NA NA
81 | ## 12 x1 ~~ x1 0.154 NA NA NA
82 | ## 13 x2 ~~ x2 0.053 NA NA NA
83 | ## 14 x3 ~~ x3 0.240 NA NA NA
84 | ## 15 y1 ~~ y1 0.286 NA NA NA
85 | ## 16 y2 ~~ y2 0.422 NA NA NA
86 | ## 17 y3 ~~ y3 0.503 NA NA NA
87 | ## 18 y4 ~~ y4 0.261 NA NA NA
88 | ## 19 y5 ~~ y5 0.355 NA NA NA
89 | ## 20 y6 ~~ y6 0.387 NA NA NA
90 | ## 21 y7 ~~ y7 0.329 NA NA NA
91 | ## 22 y8 ~~ y8 0.283 NA NA NA
92 | ## 23 ind60 ~~ ind60 1.000 NA NA NA
93 | ## 24 dem60 ~~ dem60 1.000 NA NA NA
94 | ## 25 dem65 ~~ dem65 1.000 NA NA NA
95 | ## 26 ind60 ~~ dem60 0.448 NA NA NA
96 | ## 27 ind60 ~~ dem65 0.555 NA NA NA
97 | ## 28 dem60 ~~ dem65 0.978 NA NA NA
98 | ```
99 |
100 |
101 |
102 |
103 | * The table of standardised loadings shows all factor loadings to be large.
104 |
105 |
106 |
107 | ```r
108 | m0_mod <- modificationindices(m0_fit)
109 | head(m0_mod[order(m0_mod$mi, decreasing=TRUE), ], 12)
110 | ```
111 |
112 | ```
113 | ## lhs op rhs mi epc sepc.lv sepc.all sepc.nox
114 | ## 1 y2 ~~ y6 9.279 2.129 2.129 0.162 0.162
115 | ## 2 y6 ~~ y8 8.668 1.513 1.513 0.140 0.140
116 | ## 3 y1 ~~ y5 8.183 0.884 0.884 0.131 0.131
117 | ## 4 y3 ~~ y6 6.574 -1.590 -1.590 -0.146 -0.146
118 | ## 5 y1 ~~ y3 5.204 1.024 1.024 0.121 0.121
119 | ## 6 y2 ~~ y4 4.911 1.432 1.432 0.110 0.110
120 | ## 7 y3 ~~ y7 4.088 1.152 1.152 0.108 0.108
121 | ## 8 ind60 =~ y5 4.007 0.762 0.510 0.197 0.197
122 | ## 9 x1 ~~ y2 3.785 -0.192 -0.192 -0.067 -0.067
123 | ## 10 ind60 =~ y4 3.568 0.811 0.543 0.163 0.163
124 | ## 11 y2 ~~ y3 3.215 -1.365 -1.365 -0.107 -0.107
125 | ## 12 y5 ~~ y6 3.116 -0.774 -0.774 -0.089 -0.089
126 | ```
127 |
128 |
129 |
130 |
131 | * The table of the largest modification indices suggests a range of ways that the model could be improved. Because the sample size is small, particular caution is needed with these.
132 | * Several of these modifications concern the expected requirement to permit indicator variables at different time points to correlate (e.g., `y2` with `y6`, `y3` with `y7`).
133 | * It may also be that some pairs of items are correlated more than others. For example, the following correlation matrix shows how `y6` and `y8` have a particularly large correlation.
134 |
135 |
136 |
137 | ```r
138 | round(cor(Data[,c('y5', 'y6', 'y7', 'y8')]), 2)
139 | ```
140 |
141 | ```
142 | ## y5 y6 y7 y8
143 | ## y5 1.00 0.56 0.68 0.63
144 | ## y6 0.56 1.00 0.61 0.75
145 | ## y7 0.68 0.61 1.00 0.71
146 | ## y8 0.63 0.75 0.71 1.00
147 | ```
148 |
149 |
150 |
151 |
152 |
153 | * What are the correlations between the factors?
154 |
155 |
156 |
157 | ```r
158 | cov2cor(inspect(m0_fit, "coefficients")$psi)
159 | ```
160 |
161 | ```
162 | ## ind60 dem60 dem65
163 | ## ind60 1.000
164 | ## dem60 0.448 1.000
165 | ## dem65 0.555 0.978 1.000
166 | ```
167 |
168 |
169 |
170 |
171 | This certainly suggests that the factors are strongly related, especially the two democracy measures.
172 |
173 |
174 | ## M1: Correlated item measurement model
175 | This next model permits corresponding democracy measures from the two time points to be correlated.
176 |
177 |
178 |
179 | ```r
180 | m1_model <- '
181 | # measurement model
182 | ind60 =~ x1 + x2 + x3
183 | dem60 =~ y1 + y2 + y3 + y4
184 | dem65 =~ y5 + y6 + y7 + y8
185 |
186 | # correlated residuals
187 | y1 ~~ y5
188 | y2 ~~ y6
189 | y3 ~~ y7
190 | y4 ~~ y8
191 | '
192 |
193 | m1_fit <- cfa(m1_model, data=Data)
194 | ```
195 |
196 |
197 |
198 |
199 | * Is this an improvement over `m0` with uncorrelated indicators?
200 | * Does `m1` have good fit in and of itself?
201 |
202 |
203 |
204 | ```r
205 | anova(m0_fit, m1_fit)
206 | ```
207 |
208 | ```
209 | ## Chi Square Difference Test
210 | ##
211 | ## Df AIC BIC Chisq Chisq diff Df diff Pr(>Chisq)
212 | ## m1_fit 37 3166 3233 50.8
213 | ## m0_fit 41 3180 3238 72.5 21.6 4 0.00024 ***
214 | ## ---
215 | ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
216 | ```
217 |
218 | ```r
219 | round(cbind(m0=inspect(m0_fit, 'fit.measures'),
220 | m1=inspect(m1_fit, 'fit.measures')), 3)
221 | ```
222 |
223 | ```
224 | ## m0 m1
225 | ## chisq 72.462 50.835
226 | ## df 41.000 37.000
227 | ## pvalue 0.002 0.064
228 | ## baseline.chisq 730.654 730.654
229 | ## baseline.df 55.000 55.000
230 | ## baseline.pvalue 0.000 0.000
231 | ## cfi 0.953 0.980
232 | ## tli 0.938 0.970
233 | ## logl -1564.959 -1554.146
234 | ## unrestricted.logl -1528.728 -1528.728
235 | ## npar 25.000 29.000
236 | ## aic 3179.918 3166.292
237 | ## bic 3237.855 3233.499
238 | ## ntotal 75.000 75.000
239 | ## bic2 3159.062 3142.099
240 | ## rmsea 0.101 0.071
241 | ## rmsea.ci.lower 0.061 0.000
242 | ## rmsea.ci.upper 0.139 0.115
243 | ## rmsea.pvalue 0.021 0.234
244 | ## srmr 0.055 0.050
245 | ```
246 |
247 |
248 |
249 |
250 | * It is a significant improvement.
251 | * RMSEA and other fit measures are substantially improved.
252 | * The relatively small sample size makes it difficult to judge how far model refinement should continue. In general, the RMSEA suggests that further improvements are possible, but it is less clear how to proceed in a principled way.
253 |
254 |
255 |
256 |
257 | # M2: Basic SEM
258 |
259 |
260 | ```r
261 | m2_model <- '
262 | # measurement model
263 | ind60 =~ x1 + x2 + x3
264 | dem60 =~ y1 + y2 + y3 + y4
265 | dem65 =~ y5 + y6 + y7 + y8
266 |
267 | # correlated residuals
268 | y1 ~~ y5
269 | y2 ~~ y6
270 | y3 ~~ y7
271 | y4 ~~ y8
272 |
273 | # regressions
274 | dem60 ~ ind60
275 | dem65 ~ ind60 + dem60
276 | '
277 |
278 | m2_fit <- sem(m2_model, data=Data)
279 | ```
280 |
281 |
282 |
283 |
284 | * Is the fit the same as for model 1, as I would expect?
285 |
286 |
287 |
288 | ```r
289 | rbind(m1 = fitMeasures(m1_fit)[c('chisq', 'rmsea')],
290 | m2 = fitMeasures(m2_fit)[c('chisq', 'rmsea')])
291 | ```
292 |
293 | ```
294 | ## chisq rmsea
295 | ## m1 50.84 0.07061
296 | ## m2 50.84 0.07061
297 | ```
298 |
299 |
300 |
301 | Yes, it is: the regressions among the three latent variables are a just-identified re-expression of the free factor covariances in `m1`, so the fit is unchanged.
302 |
303 | * Assuming democracy 1965 is the dependent variable, how can we get the information typically available in multiple regression output?
304 | * R-squared?
305 | * Unstandardised regression coefficients?
306 | * Standardised regression coefficients?
307 | * Standard errors, p-values, and confidence intervals on unstandardised coefficients?
308 |
309 |
310 |
311 | ```r
312 | # m2_fit <- sem(m2_model, data=Data)
313 |
314 | # r-square for dem-65
315 | inspect(m2_fit, 'r2')['dem65']
316 | ```
317 |
318 | ```
319 | ## dem65
320 | ## 0.9139
321 | ```
322 |
323 | ```r
324 |
325 | # Unstandardised regression coefficients
326 | inspect(m2_fit, 'coef')$beta['dem65', ]
327 | ```
328 |
329 | ```
330 | ## ind60 dem60 dem65
331 | ## 0.5069 0.8157 0.0000
332 | ```
333 |
334 | ```r
335 |
336 | # Standardised regression coefficients
337 | subset(inspect(m2_fit, 'standardized'), lhs == 'dem65' & op == '~')
338 | ```
339 |
340 | ```
341 | ## lhs op rhs est.std se z pvalue
342 | ## 1 dem65 ~ ind60 0.168 NA NA NA
343 | ## 2 dem65 ~ dem60 0.869 NA NA NA
344 | ```
345 |
346 | ```r
347 |
348 | # Just a guess, may not be correct:
349 | # coefs <- data.frame(coef=inspect(m2_fit, 'coef')$beta['dem65', ],
350 | # se=inspect(m2_fit, 'se')$beta['dem65', ])
351 | # coefs$low95ci <- coefs$coef - coefs$se * 1.96
352 | # coefs$high95ci <- coefs$coef + coefs$se * 1.96
353 | ```
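In fact, the commented-out guess above shouldn't be needed: `parameterestimates()` (used in the path-analysis example in this repository) already returns standard errors, p-values, and 95% confidence intervals for the unstandardised estimates. A minimal sketch:

```r
# SEs, p-values, and 95% CIs for the structural paths predicting dem65
est <- parameterestimates(m2_fit)
subset(est, lhs == 'dem65' & op == '~')
```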
354 |
355 |
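A natural extension, borrowing the labelled-parameter idiom from the path-analysis example (the model name `m3_model` and the labels `a`, `b`, `c` are mine), is to define the indirect and total effects of `ind60` on `dem65` directly in the model syntax:

```r
m3_model <- '
  # measurement model
  ind60 =~ x1 + x2 + x3
  dem60 =~ y1 + y2 + y3 + y4
  dem65 =~ y5 + y6 + y7 + y8

  # correlated residuals
  y1 ~~ y5
  y2 ~~ y6
  y3 ~~ y7
  y4 ~~ y8

  # regressions with labelled coefficients
  dem60 ~ a*ind60
  dem65 ~ c*ind60 + b*dem60

  # indirect and total effects of ind60 on dem65
  indirect := a*b
  total    := c + a*b
'

m3_fit <- sem(m3_model, data=Data)
parameterestimates(m3_fit)
```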
356 |
357 |
358 |
359 |
360 |
--------------------------------------------------------------------------------
/ex2-paper/ex2-paper.rmd:
--------------------------------------------------------------------------------
1 | `r opts_chunk$set(cache=TRUE, tidy=FALSE)`
2 | # Example 2 from Rosseel's Paper on lavaan
3 | ```{r setup, message=FALSE}
4 | library(lavaan)
5 | Data <- PoliticalDemocracy
6 | ```
7 |
8 | This example is an elaboration on Example 2 from Yves Rosseel's Journal of Statistical Software Article (see [here](http://www.jstatsoft.org/v48/i02/paper)).
9 |
10 | ## M0: Basic Measurement model
11 | ```{r basic_measurement_model}
12 | m0_model <- '
13 | # measurement model
14 | ind60 =~ x1 + x2 + x3
15 | dem60 =~ y1 + y2 + y3 + y4
16 | dem65 =~ y5 + y6 + y7 + y8
17 | '
18 |
19 | m0_fit <- cfa(m0_model, data=Data)
20 | ```
21 |
22 | * `m0` defines a basic measurement model that permits correlated factors. Note that it does not have correlations between corresponding democracy indicator measures over time.
23 |
24 | **Questions:**
25 |
26 | * Is it a good model?
27 |
28 | ```{r m0_fit_measures}
29 | fitmeasures(m0_fit)
30 | ```
31 |
32 | * CFI suggests a reasonable model, but the RMSEA is quite large.
33 |
34 | ```{r m0_standardised_parameters}
35 | inspect(m0_fit, 'standardized')
36 | ```
37 |
38 | * The table of standardised loadings shows all factor loadings to be large.
39 |
40 | ```{r m0_mod_indices}
41 | m0_mod <- modificationindices(m0_fit)
42 | head(m0_mod[order(m0_mod$mi, decreasing=TRUE), ], 12)
43 | ```
44 |
45 | * The table of the largest modification indices suggests a range of ways that the model could be improved. Because the sample size is small, particular caution is needed with these.
46 | * Several of these modifications concern the expected requirement to permit indicator variables at different time points to correlate (e.g., `y2` with `y6`, `y3` with `y7`).
47 | * It may also be that some pairs of items are correlated more than others. For example, the following correlation matrix shows how `y6` and `y8` have a particularly large correlation.
48 |
49 | ```{r}
50 | round(cor(Data[,c('y5', 'y6', 'y7', 'y8')]), 2)
51 | ```
52 |
53 |
54 | * What are the correlations between the factors?
55 |
56 | ```{r}
57 | cov2cor(inspect(m0_fit, "coefficients")$psi)
58 | ```
59 |
60 | This certainly suggests that the factors are strongly related, especially the two democracy measures.
61 |
62 |
63 | ## M1: Correlated item measurement model
64 | This next model permits corresponding democracy measures from the two time points to be correlated.
65 |
66 | ```{r correlated_measurement_model}
67 | m1_model <- '
68 | # measurement model
69 | ind60 =~ x1 + x2 + x3
70 | dem60 =~ y1 + y2 + y3 + y4
71 | dem65 =~ y5 + y6 + y7 + y8
72 |
73 | # correlated residuals
74 | y1 ~~ y5
75 | y2 ~~ y6
76 | y3 ~~ y7
77 | y4 ~~ y8
78 | '
79 |
80 | m1_fit <- cfa(m1_model, data=Data)
81 | ```
82 |
83 | * Is this an improvement over `m0` with uncorrelated indicators?
84 | * Does `m1` have good fit in and of itself?
85 |
86 | ```{r}
87 | anova(m0_fit, m1_fit)
88 | round(cbind(m0=inspect(m0_fit, 'fit.measures'),
89 | m1=inspect(m1_fit, 'fit.measures')), 3)
90 | ```
91 |
92 | * It is a significant improvement.
93 | * RMSEA and other fit measures are substantially improved.
94 | * The relatively small sample size makes it difficult to judge how far model refinement should continue. In general, the RMSEA suggests that further improvements are possible, but it is less clear how to proceed in a principled way.
95 |
96 |
97 |
98 |
99 | # M2: Basic SEM
100 | ```{r m2_model}
101 | m2_model <- '
102 | # measurement model
103 | ind60 =~ x1 + x2 + x3
104 | dem60 =~ y1 + y2 + y3 + y4
105 | dem65 =~ y5 + y6 + y7 + y8
106 |
107 | # correlated residuals
108 | y1 ~~ y5
109 | y2 ~~ y6
110 | y3 ~~ y7
111 | y4 ~~ y8
112 |
113 | # regressions
114 | dem60 ~ ind60
115 | dem65 ~ ind60 + dem60
116 | '
117 |
118 | m2_fit <- sem(m2_model, data=Data)
119 | ```
120 |
121 | * Is the fit the same as for model 1, as I would expect?
122 |
123 | ```{r m2_chi_square_check}
124 | rbind(m1 = fitMeasures(m1_fit)[c('chisq', 'rmsea')],
125 | m2 = fitMeasures(m2_fit)[c('chisq', 'rmsea')])
126 | ```
127 | Yes, it is: the regressions among the three latent variables are a just-identified re-expression of the free factor covariances in `m1`, so the fit is unchanged.
128 |
129 | * Assuming democracy 1965 is the dependent variable, how can we get the information typically available in multiple regression output?
130 | * R-squared?
131 | * Unstandardised regression coefficients?
132 | * Standardised regression coefficients?
133 | * Standard errors, p-values, and confidence intervals on unstandardised coefficients?
134 |
135 | ```{r}
136 | # m2_fit <- sem(m2_model, data=Data)
137 |
138 | # r-square for dem-65
139 | inspect(m2_fit, 'r2')['dem65']
140 |
141 | # Unstandardised regression coefficients
142 | inspect(m2_fit, 'coef')$beta['dem65', ]
143 |
144 | # Standardised regression coefficients
145 | subset(inspect(m2_fit, 'standardized'), lhs == 'dem65' & op == '~')
146 |
147 | # Just a guess, may not be correct:
148 | # coefs <- data.frame(coef=inspect(m2_fit, 'coef')$beta['dem65', ],
149 | # se=inspect(m2_fit, 'se')$beta['dem65', ])
150 | # coefs$low95ci <- coefs$coef - coefs$se * 1.96
151 | # coefs$high95ci <- coefs$coef + coefs$se * 1.96
152 | ```
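In fact, the commented-out guess above shouldn't be needed: `parameterestimates()` (used in the path-analysis example in this repository) already returns standard errors, p-values, and 95% confidence intervals for the unstandardised estimates. A minimal sketch:

```{r}
# SEs, p-values, and 95% CIs for the structural paths predicting dem65
est <- parameterestimates(m2_fit)
subset(est, lhs == 'dem65' & op == '~')
```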
153 |
154 |
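A natural extension, borrowing the labelled-parameter idiom from the path-analysis example (the model name `m3_model` and the labels `a`, `b`, `c` are mine), is to define the indirect and total effects of `ind60` on `dem65` directly in the model syntax:

```{r}
m3_model <- '
  # measurement model
  ind60 =~ x1 + x2 + x3
  dem60 =~ y1 + y2 + y3 + y4
  dem65 =~ y5 + y6 + y7 + y8

  # correlated residuals
  y1 ~~ y5
  y2 ~~ y6
  y3 ~~ y7
  y4 ~~ y8

  # regressions with labelled coefficients
  dem60 ~ a*ind60
  dem65 ~ c*ind60 + b*dem60

  # indirect and total effects of ind60 on dem65
  indirect := a*b
  total    := c + a*b
'

m3_fit <- sem(m3_model, data=Data)
parameterestimates(m3_fit)
```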
155 |
156 |
--------------------------------------------------------------------------------
/makefile:
--------------------------------------------------------------------------------
1 |
2 | pdf-all:
3 | Rscript 'convert.r'
4 |
--------------------------------------------------------------------------------
/path-analysis/figure/unnamed-chunk-5.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jeromyanglim/lavaan-examples/d7f5cbdc7fe14ffd039512bae6aa140c2a0ca5e6/path-analysis/figure/unnamed-chunk-5.png
--------------------------------------------------------------------------------
/path-analysis/path-analysis.md:
--------------------------------------------------------------------------------
1 |
2 |
3 | # Path Analysis Example
4 |
5 |
6 | ```r
7 | library(psych)
8 | library(lavaan)
9 | ```
10 |
11 |
12 |
13 |
14 |
15 | ## Simulate data
16 | Let's simulate some data:
17 |
18 | * three orthogonal predictor variables
19 | * one mediator variable
20 | * one dependent variable
21 |
22 |
23 |
24 | ```r
25 | set.seed(1234)
26 | N <- 1000
27 | iv1 <- rnorm(N, 0, 1)
28 | iv2 <- rnorm(N, 0, 1)
29 | iv3 <- rnorm(N, 0, 1)
30 | mv <- rnorm(N, .2 * iv1 + -.2 * iv2 + .3 * iv3, 1)
31 | dv <- rnorm(N, .8 * mv, 1)
32 | data_1 <- data.frame(iv1, iv2, iv3, mv, dv)
33 | ```
34 |
35 |
36 |
37 |
38 | ## Traditional examination of dataset
39 | * Is a regression consistent with the model?
40 |
41 |
42 |
43 | ```r
44 | summary(lm(mv ~ iv1 + iv2 + iv3, data_1))
45 | ```
46 |
47 | ```
48 | ##
49 | ## Call:
50 | ## lm(formula = mv ~ iv1 + iv2 + iv3, data = data_1)
51 | ##
52 | ## Residuals:
53 | ## Min 1Q Median 3Q Max
54 | ## -3.0281 -0.6863 0.0114 0.6697 3.1412
55 | ##
56 | ## Coefficients:
57 | ## Estimate Std. Error t value Pr(>|t|)
58 | ## (Intercept) -0.00945 0.03150 -0.30 0.76
59 | ## iv1 0.19737 0.03163 6.24 6.4e-10 ***
60 | ## iv2 -0.19978 0.03216 -6.21 7.7e-10 ***
61 | ## iv3 0.29183 0.03113 9.38 < 2e-16 ***
62 | ## ---
63 | ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
64 | ##
65 | ## Residual standard error: 0.995 on 996 degrees of freedom
66 | ## Multiple R-squared: 0.144, Adjusted R-squared: 0.141
67 | ## F-statistic: 55.8 on 3 and 996 DF, p-value: <2e-16
68 | ##
69 | ```
70 |
71 |
72 |
73 |
74 | These estimates are broadly similar to the coefficients in the equation used to simulate `mv` (0.2, -0.2, and 0.3).
75 |
76 |
77 |
78 | ```r
79 | summary(lm(dv ~ iv1 + iv2 + iv3 + mv, data_1))
80 | ```
81 |
82 | ```
83 | ##
84 | ## Call:
85 | ## lm(formula = dv ~ iv1 + iv2 + iv3 + mv, data = data_1)
86 | ##
87 | ## Residuals:
88 | ## Min 1Q Median 3Q Max
89 | ## -2.7484 -0.6547 -0.0359 0.6947 2.7185
90 | ##
91 | ## Coefficients:
92 | ## Estimate Std. Error t value Pr(>|t|)
93 | ## (Intercept) -0.0410 0.0308 -1.33 0.18
94 | ## iv1 -0.0449 0.0315 -1.43 0.15
95 | ## iv2 0.0400 0.0320 1.25 0.21
96 | ## iv3 0.0162 0.0317 0.51 0.61
97 | ## mv 0.8250 0.0309 26.66 <2e-16 ***
98 | ## ---
99 | ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
100 | ##
101 | ## Residual standard error: 0.972 on 995 degrees of freedom
102 | ## Multiple R-squared: 0.45, Adjusted R-squared: 0.448
103 | ## F-statistic: 204 on 4 and 995 DF, p-value: <2e-16
104 | ##
105 | ```
106 |
107 |
108 |
109 | Given that the simulation is based on complete mediation, the true regression coefficients for the ivs are zero. The results of the multiple regression predicting the `dv` from the `iv`s and `mv` are consistent with this.
110 |
111 | What are the basic descriptive statistics and intercorrelations?
112 |
113 |
114 |
115 | ```r
116 | psych::describe(data_1)
117 | ```
118 |
119 | ```
120 | ## var n mean sd median trimmed mad min max range skew
121 | ## iv1 1 1000 -0.03 1.00 -0.04 -0.03 0.95 -3.40 3.20 6.59 -0.01
122 | ## iv2 2 1000 0.01 0.98 0.01 0.02 0.97 -3.12 3.17 6.29 -0.07
123 | ## iv3 3 1000 0.03 1.01 0.06 0.03 1.06 -3.09 3.02 6.12 0.01
124 | ## mv 4 1000 -0.01 1.07 0.02 -0.02 1.03 -3.12 3.82 6.94 0.07
125 | ## dv 5 1000 -0.05 1.31 -0.03 -0.04 1.37 -3.99 3.53 7.53 -0.03
126 | ## kurtosis se
127 | ## iv1 0.25 0.03
128 | ## iv2 -0.07 0.03
129 | ## iv3 -0.21 0.03
130 | ## mv 0.14 0.03
131 | ## dv -0.17 0.04
132 | ```
133 |
134 | ```r
135 | pairs.panels(data_1, pch='.')
136 | ```
137 |
138 | 
139 |
140 |
141 |
142 | ## M1 Fit Path Analysis model
143 |
144 |
145 |
146 | ```r
147 | m1_model <- '
148 | dv ~ mv
149 | mv ~ iv1 + iv2 + iv3
150 | '
151 |
152 | m1_fit <- sem(m1_model, data=data_1)
153 | ```
154 |
155 |
156 |
157 |
158 | Are the regression coefficients the same?
159 |
160 |
161 |
162 | ```r
163 | parameterestimates(m1_fit)
164 | ```
165 |
166 | ```
167 | ## lhs op rhs est se z pvalue ci.lower ci.upper
168 | ## 1 dv ~ mv 0.815 0.029 28.490 0 0.759 0.871
169 | ## 2 mv ~ iv1 0.197 0.032 6.253 0 0.136 0.259
170 | ## 3 mv ~ iv2 -0.200 0.032 -6.224 0 -0.263 -0.137
171 | ## 4 mv ~ iv3 0.292 0.031 9.394 0 0.231 0.353
172 | ## 5 dv ~~ dv 0.944 0.042 22.361 0 0.861 1.026
173 | ## 6 mv ~~ mv 0.986 0.044 22.361 0 0.900 1.073
174 | ## 7 iv1 ~~ iv1 0.994 0.000 NA NA 0.994 0.994
175 | ## 8 iv1 ~~ iv2 0.055 0.000 NA NA 0.055 0.055
176 | ## 9 iv1 ~~ iv3 0.016 0.000 NA NA 0.016 0.016
177 | ## 10 iv2 ~~ iv2 0.962 0.000 NA NA 0.962 0.962
178 | ## 11 iv2 ~~ iv3 -0.035 0.000 NA NA -0.035 -0.035
179 | ## 12 iv3 ~~ iv3 1.024 0.000 NA NA 1.024 1.024
180 | ```
181 |
182 |
183 |
184 |
185 | All the coefficients are in the ballpark of what is expected.
186 |
187 | Does the model provide a good fit?
188 |
189 |
190 | ```r
191 | fitmeasures(m1_fit)
192 | ```
193 |
194 | ```
195 | ## chisq df pvalue baseline.chisq
196 | ## 3.654 3.000 0.301 753.212
197 | ## baseline.df baseline.pvalue cfi tli
198 | ## 7.000 0.000 0.999 0.998
199 | ## logl unrestricted.logl npar aic
200 | ## -7045.596 -7043.769 6.000 14103.191
201 | ## bic ntotal bic2 rmsea
202 | ## 14132.638 1000.000 14113.582 0.015
203 | ## rmsea.ci.lower rmsea.ci.upper rmsea.pvalue srmr
204 | ## 0.000 0.057 0.899 0.011
205 | ```
206 |
207 |
208 |
209 |
210 | * The fitted model should provide a good fit because it is identical to the model used to simulate the data.
211 | * In this case, the p-value and the fit measures are consistent with the data being generated from the model specified.
212 |
213 |
214 | ## Calculate and test indirect effects
215 |
216 |
217 | ```r
218 | m2_model <- '
219 | dv ~ b1*mv
220 | mv ~ a1*iv1 + a2*iv2 + a3*iv3
221 |
222 | # indirect effects
223 | iv1_mv := a1*b1
224 | iv2_mv := a2*b1
225 | iv3_mv := a3*b1
226 | '
227 |
228 | m2_fit <- sem(m2_model, data=data_1)
229 | ```
230 |
231 |
232 |
233 |
234 | * Note that I needed to label the effects (e.g., `a1`, `b1`) before I could define each indirect effect as the product of two effects using the `:=` operator.
235 |
236 |
237 |
238 |
239 | ```r
240 | parameterestimates(m2_fit, standardize=TRUE)
241 | ```
242 |
243 | ```
244 | ## lhs op rhs label est se z pvalue ci.lower ci.upper
245 | ## 1 dv ~ mv b1 0.815 0.029 28.490 0 0.759 0.871
246 | ## 2 mv ~ iv1 a1 0.197 0.032 6.253 0 0.136 0.259
247 | ## 3 mv ~ iv2 a2 -0.200 0.032 -6.224 0 -0.263 -0.137
248 | ## 4 mv ~ iv3 a3 0.292 0.031 9.394 0 0.231 0.353
249 | ## 5 dv ~~ dv 0.944 0.042 22.361 0 0.861 1.026
250 | ## 6 mv ~~ mv 0.986 0.044 22.361 0 0.900 1.073
251 | ## 7 iv1 ~~ iv1 0.994 0.000 NA NA 0.994 0.994
252 | ## 8 iv1 ~~ iv2 0.055 0.000 NA NA 0.055 0.055
253 | ## 9 iv1 ~~ iv3 0.016 0.000 NA NA 0.016 0.016
254 | ## 10 iv2 ~~ iv2 0.962 0.000 NA NA 0.962 0.962
255 | ## 11 iv2 ~~ iv3 -0.035 0.000 NA NA -0.035 -0.035
256 | ## 12 iv3 ~~ iv3 1.024 0.000 NA NA 1.024 1.024
257 | ## 13 iv1_mv := a1*b1 iv1_mv 0.161 0.026 6.108 0 0.109 0.213
258 | ## 14 iv2_mv := a2*b1 iv2_mv -0.163 0.027 -6.081 0 -0.215 -0.110
259 | ## 15 iv3_mv := a3*b1 iv3_mv 0.238 0.027 8.922 0 0.186 0.290
260 | ## std.lv std.all std.nox
261 | ## 1 0.815 0.669 0.669
262 | ## 2 0.197 0.183 0.184
263 | ## 3 -0.200 -0.183 -0.186
264 | ## 4 0.292 0.275 0.272
265 | ## 5 0.944 0.552 0.552
266 | ## 6 0.986 0.856 0.856
267 | ## 7 0.994 1.000 0.994
268 | ## 8 0.055 0.057 0.055
269 | ## 9 0.016 0.015 0.016
270 | ## 10 0.962 1.000 0.962
271 | ## 11 -0.035 -0.035 -0.035
272 | ## 12 1.024 1.000 1.024
273 | ## 13 0.161 0.161 0.161
274 | ## 14 -0.163 -0.163 -0.163
275 | ## 15 0.238 0.238 0.238
276 | ```
277 |
278 |
279 |
280 |
281 | The above output provides significance tests and confidence intervals for the indirect effects, and includes standardised estimates.
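The confidence intervals above are normal-theory (delta method) intervals. For indirect effects, bootstrap intervals are often preferred; a minimal sketch using lavaan's built-in bootstrap options (argument names as in recent lavaan versions; reduce `bootstrap` for speed):

```r
# refit with bootstrapped standard errors
m2_boot <- sem(m2_model, data=data_1, se="bootstrap", bootstrap=500)

# percentile bootstrap CIs, including for the := indirect effects
parameterestimates(m2_boot, boot.ci.type="perc")
```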
282 |
283 |
284 |
285 |
--------------------------------------------------------------------------------
/path-analysis/path-analysis.rmd:
--------------------------------------------------------------------------------
1 | `r opts_chunk$set(cache=TRUE, tidy=FALSE)`
2 |
3 | # Path Analysis Example
4 | ```{r, message=FALSE}
5 | library(psych)
6 | library(lavaan)
7 |
8 | ```
9 |
10 |
11 | ## Simulate data
12 | Let's simulate some data:
13 |
14 | * three orthogonal predictor variables
15 | * one mediator variable
16 | * one dependent variable
17 |
18 | ```{r}
19 | set.seed(1234)
20 | N <- 1000
21 | iv1 <- rnorm(N, 0, 1)
22 | iv2 <- rnorm(N, 0, 1)
23 | iv3 <- rnorm(N, 0, 1)
24 | mv <- rnorm(N, .2 * iv1 + -.2 * iv2 + .3 * iv3, 1)
25 | dv <- rnorm(N, .8 * mv, 1)
26 | data_1 <- data.frame(iv1, iv2, iv3, mv, dv)
27 | ```
28 |
29 | ## Traditional examination of dataset
30 | * Is a regression consistent with the model?
31 |
32 | ```{r}
33 | summary(lm(mv ~ iv1 + iv2 + iv3, data_1))
34 | ```
35 |
36 | These estimates are broadly similar to the coefficients in the equation used to simulate `mv` (0.2, -0.2, and 0.3).
37 |
38 | ```{r}
39 | summary(lm(dv ~ iv1 + iv2 + iv3 + mv, data_1))
40 | ```
41 | Given that the simulation is based on complete mediation, the true regression coefficients for the ivs are zero. The results of the multiple regression predicting the `dv` from the `iv`s and `mv` are consistent with this.
42 |
43 | What are the basic descriptive statistics and intercorrelations?
44 |
45 | ```{r}
46 | psych::describe(data_1)
47 | pairs.panels(data_1, pch='.')
48 | ```
49 |
50 |
51 | ## M1 Fit Path Analysis model
52 |
53 | ```{r}
54 | m1_model <- '
55 | dv ~ mv
56 | mv ~ iv1 + iv2 + iv3
57 | '
58 |
59 | m1_fit <- sem(m1_model, data=data_1)
60 | ```
61 |
62 | Are the regression coefficients the same?
63 |
64 | ```{r}
65 | parameterestimates(m1_fit)
66 | ```
67 |
68 | All the coefficients are in the ballpark of what is expected.
69 |
70 | Does the model provide a good fit?
71 | ```{r}
72 | fitmeasures(m1_fit)
73 | ```
74 |
75 | * The fitted model should provide a good fit because it is identical to the model used to simulate the data.
76 | * In this case, the p-value and the fit measures are consistent with the data being generated from the model specified.
77 |
78 |
79 | ## Calculate and test indirect effects
80 | ```{r}
81 | m2_model <- '
82 | dv ~ b1*mv
83 | mv ~ a1*iv1 + a2*iv2 + a3*iv3
84 |
85 | # indirect effects
86 | iv1_mv := a1*b1
87 | iv2_mv := a2*b1
88 | iv3_mv := a3*b1
89 | '
90 |
91 | m2_fit <- sem(m2_model, data=data_1)
92 | ```
93 |
94 | * Note that I needed to label the effects (e.g., `a1`, `b1`) before I could define each indirect effect as the product of two effects using the `:=` operator.
95 |
96 |
97 | ```{r}
98 | parameterestimates(m2_fit, standardize=TRUE)
99 |
100 | ```
101 |
102 | The above output provides significance tests and confidence intervals for the indirect effects, and includes standardised estimates.
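The confidence intervals above are normal-theory (delta method) intervals. For indirect effects, bootstrap intervals are often preferred; a minimal sketch using lavaan's built-in bootstrap options (argument names as in recent lavaan versions; reduce `bootstrap` for speed):

```{r}
# refit with bootstrapped standard errors
m2_boot <- sem(m2_model, data=data_1, se="bootstrap", bootstrap=500)

# percentile bootstrap CIs, including for the := indirect effects
parameterestimates(m2_boot, boot.ci.type="perc")
```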
103 |
104 |
105 |
106 |
--------------------------------------------------------------------------------