Adventures In Culture, Mind & Space: Cross-cultural measurement invariance testing in R in 5 simple steps

Here I am presenting a quick crash course to run invariance tests for cross-cultural research in R. R is a free programme and has an expanding list of awesome features that should be of interest to people doing cross-cultural work.

I am working with RStudio, but there are other options for running R. Here are the basic steps to get you started and run a cross-cultural equivalence test.

1. Set your working directory

This step is important because it will allow you to call your data file later on repeatedly without listing the whole path of where it is saved. For example, I saved the file that I am working with on my dropbox folder in a folder called 'Stats' that is in my 'PDF' folder.
I need to type this command:

setwd("C:\\Users\\Ron\\Dropbox\\PDF\\Stats")

Two important points:
a) for some strange reason you need double \\ to set your directory paths with windows.
b) make sure that there are no spaces in any of your file or directory paths. R does not like it and will throw a tantrum if you have a space somewhere.

2. Read your data into R

The most convenient way to read data into R is using .csv files. Any programme like SPSS or Excel will allow you to save your data as a .csv file. There are a few more things that we need to discuss about saving your data, but I will discuss this below.

Type:
justice <- read.csv("justice.csv", header = TRUE)

R is an object oriented language, which means we will constantly create objects by calling on functions: object <- function. This may seem weird at first, but will allow you to do lots of cool stuff in a very efficient way.

I am using a data that tested a justice scale, so I am calling my object 'justice'.

3. Deal with missing data

If you have absolutely no missing data in your data file, skip this step. However most mortal researcher souls will have some missing data in their spreadsheet. R is very temperamental with missing data and we need to tell it what missing data is and how to deal with it. Some people (including myself) who are used to SPSS typically leave missing data as a blank cell in the spreadsheet. This will create problems. The best option is to write a little syntax command in spss to recode all blank cells to a constant number.
For example, I could write something like this in SPSS:

recode variable1 to variable12
(sysmis=9999) (else=copy) into
pj1 pj2 pj3 pj4 ij1 ij2 ij3 ij4 dj1 dj2 dj3 dj4.
Execute.

Now I can save those new variables as a .csv file and read into R using the step above.

Once you have executed step 2, you need to define these annoying 9999s as missing values.
The simplest and straightforward option is to write this short command that converts all these offending values into NA - the R form of missing data.

justice[justice==9999] <- NA

Note the square brackets and double ==. If you want to treat only a selected variable, you could write:

justice$pj1[justice$pj1==9999] <- NA

This tells R that you want only the pj1 variable in the dataframe justice to be treated in this way.

To check that all worked well, type:

summary(justice)

You should see something like this:

If all went well, now your minimum and maximum values are within the bounds of your original data and you have a row of NA's a the bottom of each variable column.

4. Loading the analysis packages

R is a very powerful tool because it is constantly expanding. Researchers from around the world are uploading tools and packages that allow you to run fancy new stats all the time. However, the base installation of R does not include them. So we need to tell R which packages we want to use.
For measurement invariance tests, these three are particularly useful: lavaan (the key one) and semTools (important for the invariance tests).

Write this code to download and install the packages on your machine:

install.packages(c("semTools", "lavaan"))

Make sure you have good internet connectivity and you are not blocked by an institutional firewall. I had some problems recently trying to download R packages when accessing it from a university campus with a strong firewall.

Once all packages are downloaded, you need to call them before you can run any analyses:

library("lavaan")

library("semTools")

Important: You need to call these packages each time that you want to run some analyses, if you have restarted R or RStudio.

5. Running the analyses

R is an object oriented language, as I mentioned before. The analysis can be executed in a couple of steps. First we need to specify the model that we run by creating a new object that contains all the information. Then we tell R what to do with that model. Finally, we have various options for obtaining the results, that is the fit indices and parameter estimates as well as other diagnostic information.

Let's create the model first:
cfa.justice<-'
pj=~pj1+pj2+pj3+pj4
ij=~ij1+ij2+ij3+ij4
dj=~dj1+dj2+dj3+dj4'

This creates the model that we can then work with. The '=~' denotes that the items are loading on the latent factor. This is what it looks like:

The next set of commands sets specifies what should be done with the model. In the case of a simple CFA, we can call this function:

fit.cfa <-cfa(cfa.justice, data=justice)

To identify the model, lavaan sets the loading of the first item on each latent variable to 1. This is convenient, but may be problematic if the item is not a good indicator. An alternative strategy is to set the variance of the latent variable to 1. This can be done by adding std.lv=TRUE to the fit statement.

fit.cfa <-cfa(cfa.justice, data=justice, std.lv=TRUE)

This statement runs the analysis, but we still need to request the output.

The simplest way is to use summary again. Here is an option that prints both the fit indices and the standardized parameters.

summary(fit.cfa, fit.measures= TRUE, standardized=TRUE)

Here is a truncated and annotated version of the output:

lavaan (0.5-17) converged normally after 39 iterations
Used Total

Number of observations 2518 2634

# The following shows the estimator and the Chi square stats. As can be seen, we have 51 df's, but the model does not fit that well.

Estimator ML

Minimum Function Test Statistic 686.657

Degrees of freedom 51

P-value (Chi-square) 0.000

Model test baseline model:
Minimum Function Test Statistic 20601.993

Degrees of freedom 66

P-value 0.000

#The incremental fit indices are in contrast quite good. The CFI and TLI should be ideally be above .95 (or at least .90). So this model does look good.

User model versus baseline model:
Comparative Fit Index (CFI) 0.969

Tucker-Lewis Index (TLI) 0.960

Loglikelihood and Information Criteria:
Loglikelihood user model (H0) -44312.277

Loglikelihood unrestricted model (H1) -43968.949
Number of free parameters 27

#The AIC and BIC are useful for comparing models, especially non-nested models. Not the case right now.

Akaike (AIC) 88678.555

Bayesian (BIC) 88835.998

Sample-size adjusted Bayesian (BIC) 88750.212

#The RMSEA should be small. Values smaller than .08 are deemed acceptable, below .05 are good. We are doing ok-ish with this one here.

Root Mean Square Error of Approximation:
RMSEA 0.070

90 Percent Confidence Interval 0.066 0.075

P-value RMSEA <= 0.05 0.000

#Another useful lack of fit index. The SRMR should be below .05 if possible. This is looking good.

Standardized Root Mean Square Residual:
SRMR 0.037

#Now we have a print out of all the parameter estimates, including the standardized loadings, variances and covariances. Here it is important to check whether loadings are relatively even and strong, and that the variances and covariances are reasonable (e.g., we want to avoid very high correlations between latent variables). It is looking ok overall. Item ij4 may need some careful attention.

Parameter estimates:
Information Expected Standard Errors Standard
Estimate Std.err Z-value P(>|z|) Std.lv Std.all

Latent variables:

pj =~ pj1 1.000 1.202 0.789

pj2 1.009 0.027 38.005 0.000 1.213 0.790

pj3 0.769 0.025 31.003 0.000 0.924 0.643

pj4 0.802 0.024 33.244 0.000 0.963 0.687

ij =~ ij1 1.000 1.256 0.879

ij2 1.064 0.014 75.039 0.000 1.335 0.957

ij3 1.015 0.015 68.090 0.000 1.274 0.910

ij4 0.808 0.022 37.412 0.000 1.014 0.644

dj =~ dj1 1.000 1.211 0.821

dj2 1.013 0.019 51.973 0.000 1.227 0.871

dj3 1.056 0.020 52.509 0.000 1.279 0.877

dj4 1.014 0.021 48.848 0.000 1.228 0.834
Covariances:

pj ~~ ij 0.828 0.041 20.435 0.000 0.549 0.549

dj 0.817 0.040 20.187 0.000 0.562 0.562

ij ~~ dj 0.711 0.037 19.005 0.000 0.467 0.467

Variances:

pj1 0.874 0.036 0.874 0.377

pj2 0.889 0.036 0.889 0.377

pj3 1.213 0.039 1.213 0.587

pj4 1.040 0.035 1.040 0.528

ij1 0.464 0.016 0.464 0.227

ij2 0.164 0.011 0.164 0.084

ij3 0.336 0.013 0.336 0.172

ij4 1.448 0.042 1.448 0.585

dj1 0.709 0.025 0.709 0.326

dj2 0.479 0.019 0.479 0.242

dj3 0.489 0.020 0.489 0.230

dj4 0.662 0.024 0.662 0.305

pj 1.444 0.066 1.000 1.000

ij 1.576 0.057 1.000 1.000

dj 1.466 0.060 1.000 1.000

However, we want to do an invariance analysis. Right now we collapsed the samples and ran an analysis across all groups. This can create problems, especially if the samples have different means (see my earlier blogpost for an explanation of this problem).

The grouping variable should be a factor, that is a string variable that has the labels. You can also use continuous variables, but then you will need to remember what each number means. In this example, I have data from three samples:

> summary(justice$nation)
Brazil NZ Philippines

794 1146 694

It would be informative to see whether the item loadings are similar in each group. To do this, we only need to add group="nation" to our cfa statement.

fit.cfa.separate <-cfa(cfa.justice, data=justice, group="nation")

We can then print the results by using the summary statement again (remember that we have to call the new object for this analysis):

summary(fit.cfa.separate, fit.measures= TRUE, standardized=TRUE)

I am not printing the output.

Of course, this is not giving us the info that we want, namely whether the model really fits. In addition, we could ask for equal loadings, intercepts, unique variances, etc. I can't go into details about the theory and importance of each of these parameters. I hope to find some time soon to describe this. In the meantime, have a look at this earlier article.

In R, running these analyses is really straightforward and easy. A single command line will give us all the relevant stats. Pretty amazing!!!!!

To run a full-blown invariance analysis, all you need is to type this simple command:

measurementInvariance(cfa.justice,
data=justice,
group="nation",
strict=TRUE)

You can write it as a single line. I just put it on separate lines to show what it actually entails. First, we call the model that we specified above, then we link it to the data that we want to analyze. After that, we specify the grouping variable (nation). The final line requests strict invariance, that is we want to get estimates for a model where loadings, intercepts and unique variances are constrained as well as a model in which we constrain the latent means to be equal. If we don't specify the last line, we will not get the constraints in unique variances.

Here is the output, but without the strict invariance lines:

Measurement invariance tests:
Model 1: configural invariance:
chisq df pvalue cfi rmsea bic
992.438 153.000 0.000 0.962 0.081 86921.364
Model 2: weak invariance (equal loadings):
chisq df pvalue cfi rmsea bic
1094.436 171.000 0.000 0.958 0.080 86882.400
[Model 1 versus model 2]
delta.chisq delta.df delta.p.value delta.cfi
101.998 18.000 0.000 0.004
Model 3: strong invariance (equal loadings + intercepts):
chisq df pvalue cfi rmsea bic
1253.943 189.000 0.000 0.952 0.082 86900.945
[Model 1 versus model 3]
delta.chisq delta.df delta.p.value delta.cfi
261.505 36.000 0.000 0.010
[Model 2 versus model 3]
delta.chisq delta.df delta.p.value delta.cfi
159.507 18.000 0.000 0.006
Model 4: equal loadings + intercepts + means:
chisq df pvalue cfi rmsea bic
1467.119 195.000 0.000 0.942 0.088 87067.134
[Model 1 versus model 4]
delta.chisq delta.df delta.p.value delta.cfi
474.681 42.000 0.000 0.020
[Model 3 versus model 4]
delta.chisq delta.df delta.p.value delta.cfi
213.176 6.000 0.000 0.009

How do we make sense of this?

Model 1 is the most lenient model, no constraints are imposed on the model and separate CFA's are estimated in each group. The CFI is pretty decent. The RMSEA is borderline.
Model 2 constraints the factor loadings to be equal. The CFI is still pretty decent, the RMSEA actually improves slightly. This is due to the fact that we have now more df's and RMSEA punishes models with lots of free parameters. The important info comes in the line entitled Model 1 versus model 2. Here we find the difference stats. The X2 difference test is significant and we would need to reject model 2 as significant worse. However, due to the problems with the X2 difference test, many researchers treat this index with caution and examine other fit indices. One commonly examined fit index of the difference is Delta CFI, that is the difference in CFI fit from one model to the next. It should not be larger than .01. In our case, it is borderline - the delta CFI is .01.
We can then compare the other models. The next model constraints both loadings and intercepts (strong invariance). The model fit is pretty decent, we can probably assume that both loadings and intercepts are invariant across these three groups.
In contrast, constraining the latent means shows some larger problems. The latent means are likely to be different.

6. Further statistics

In this particular case, the model fits pretty well. However, often we run into problems. If there is misfit, we either trim the parameter (drop parameters or variables from the model) or we can add parameters. To see which parameters would be useful to add, we can request modification indices.
This can be done using this command:
mi <- modificationIndices(fit.cfa)
mi

The second line (mi) will print the modification indices. It gives you the expected drop in X2 as well as what the parameter estimates would be like if they were freed.

If we want to print only those modification indices above a certain threshold, let's say 10, we could add the following line:
mi<-modificationIndices(fit.cfa)
subset (mi, mi>10)
mi

This will give us modification indices for the overall model. If we want to see modification indices for any of the constrained models, we can request them after estimating the respective model.

For example, if we want to see the modification indices after constraining the loadings to be similar, we can run the following line:

metric <-cfa(cfa.justice,
data=justice,
group="nation",
group.equal=c("loadings"))
mi.metric<-modificationIndices(metric)
mi.metric

This will now give us the modification indices for this particular model.

There are more options for running constrained models. For example, this line gives the scalar invariant model:

scalar <-cfa(cfa1,
data=justice,
group="nation",
group.equal=c("loadings", "intercepts"))

As you can see, these models are replicating the models implied in the overall analysis that we got with the measurementInvariance command above.

Summary

I hope I have convinced you that measurement invariance in R using lavaan and semTools is a piece of cake. It is an awesome resource, allows you to run lots of models in no time whatsoever and of course it is free!!!!! Once you get into R, you can do even more fancy stuff and run everything from simple stats to complex SEM and ML models in a single programme.

More info on lavaan can be found here (including a pdf tutorial).

I am still in the process of learning how to navigate this awesome programme. If you have some suggestions for simplifying any of the steps or if you spot some mistakes or have any other suggestions... please get in touch and let me know :)
If there are some issues that are unclear or confusing, let me know too and I will try and clarify!
Look forward to hearing from you and hope you find this useful!

Adventures In Culture, Mind & Space

Sunday, December 21, 2014

Cross-cultural measurement invariance testing in R in 5 simple steps