Monday, April 2, 2012

How to do Procrustean Factor Rotation with more than 2 groups

Today, I am continuing the torture with a bit more detail on options for comparing factor loadings across three or more groups within SPSS. This is a crucial issue for cross-cultural research and is becoming increasingly important, because researchers are starting to study more than two groups. More complex designs are more powerful in uncovering processes that can explain emerging behavioural differences, so this research should be strongly encouraged!

Aim: Compare the factor structure when you have more than two cultural groups, and get an estimate of factor similarity

Why are we concerned with Procrustean Rotation? Factor rotation is arbitrary, so factor structures that look dissimilar might be more similar than we think. Procrustean rotation is necessary to judge structural and metric equivalence.
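To make the "rotation is arbitrary" point concrete, here is a small sketch in plain Python. The loadings are made up for illustration, and the helper names `rotate` and `reproduced` are my own, not part of any SPSS syntax. It rotates a two-factor loading matrix by an arbitrary angle and checks that both versions imply the same correlations between items, even though the loadings themselves look quite different.

```python
import math

def rotate(loadings, degrees):
    """Orthogonally rotate a two-factor loading matrix (rows are [f1, f2])."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [[a * c - b * s, a * s + b * c] for a, b in loadings]

def reproduced(loadings):
    """Item correlations implied by the loadings (L times L-transpose)."""
    return [[sum(x * y for x, y in zip(r1, r2)) for r2 in loadings]
            for r1 in loadings]

# Made-up loadings for four items on two factors (illustrative only).
group_a = [[0.80, 0.10], [0.70, 0.20], [0.10, 0.80], [0.20, 0.70]]
# The same structure, seen through a different (arbitrary) rotation.
group_b = rotate(group_a, 40)

# Side by side, the two loading matrices look dissimilar.
for row_a, row_b in zip(group_a, group_b):
    print([round(v, 2) for v in row_a], [round(v, 2) for v in row_b])

# Yet both imply the same correlations between the items.
ra, rb = reproduced(group_a), reproduced(group_b)
max_diff = max(abs(x - y) for r1, r2 in zip(ra, rb) for x, y in zip(r1, r2))
print("largest difference in implied correlations:", max_diff)
```

This is why the loadings of one group have to be rotated towards a target before any similarity coefficient is computed.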

Statistical Procedure:

The same syntax as for the two-group case (see previous post) can be run with SPSS, but the greater number of countries adds additional problems. You have various options:

  1. Run all pairwise comparisons. However, this will lead to a substantial number of comparisons (especially if you have many samples). This also leads to a number of statistical problems (remember family-wise error rate and increased Type I errors).
  2. Select one country as your target group. For example, if an instrument was developed in the US, you may want to compare each group to the US.
  3. Compute the average correlation matrix and use it for your factor analysis. The average is sometimes called the pooled-within matrix. You would then compare each sample with the average structure across all samples (this can be done via discriminant function analysis in SPSS; you can then read the resulting correlation matrix into SPSS and use it as input for your factor analysis - see my discussion of how to do this here). This is highly appealing if you have many samples.

     This procedure of computing the average correlation matrix as input to the factor analysis can be simplified, by running the overall factor analysis on the combined data, if (a) you have samples with similar sample size (no sample is dominating others; e.g., if you have one sample of 10,000 and three samples of 50 participants each, the large sample is driving the factor structure) and (b) you mean centre each item within each sample prior to the overall factor analysis. The centring is necessary to account for any group mean differences that might obscure relationships if the samples are pooled.

     See below for a graphical explanation of why this might be a problem. In that example, the relationship within each sample is negative: more sleep problems within each sample are associated with less laughter by participants. However, one group is consistently higher, for both the reported sleep problems and the laughing. There may be reasons why this is the case (I will come back to this example when talking about multilevel analysis), but for our analysis, combining the two samples would mean that we have a positive relationship across both samples combined (compared to negative relationships within both samples separately). This effect is due to the mean differences across both groups (I will post something soon on the beautiful complexity of these multi-level problems in psychology - very fascinating stuff).

     As a consequence of this confounding of group differences with individual differences, we need to take any such mean differences into account before we can combine the samples. This can easily be done using the z-transformation option in SPSS (‘Save standardized values as variables’ under ‘Analyze’ -> ‘Descriptive Statistics’ -> ‘Descriptives’). Make sure the standardization is done separately for each sample (for example, with Split File switched on for the sample variable), otherwise the group mean differences stay in the data.
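The sleep and laughter example can be reproduced with a few made-up numbers. This is only a sketch in plain Python: the data, the variable names and the helpers `pearson` and `centre` are invented for illustration. It shows the correlation within each sample, the correlation after naively pooling the raw scores, and the correlation after centring each item within each sample first.

```python
def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def centre(values):
    """Subtract the sample mean from every score (mean centring)."""
    m = sum(values) / len(values)
    return [v - m for v in values]

# Invented scores: sample_2 is higher on BOTH variables than sample_1.
samples = {
    "sample_1": {"sleep": [1, 2, 3, 4, 5], "laugh": [5, 4, 4, 2, 1]},
    "sample_2": {"sleep": [11, 12, 13, 14, 15], "laugh": [15, 14, 14, 12, 11]},
}

# Within each sample the relationship is negative.
r_within = {name: pearson(d["sleep"], d["laugh"]) for name, d in samples.items()}

# Pooling the raw scores lets the mean difference dominate.
pooled_sleep = [v for d in samples.values() for v in d["sleep"]]
pooled_laugh = [v for d in samples.values() for v in d["laugh"]]
r_pooled = pearson(pooled_sleep, pooled_laugh)

# Centring within each sample before pooling removes the mean difference.
centred_sleep = [v for d in samples.values() for v in centre(d["sleep"])]
centred_laugh = [v for d in samples.values() for v in centre(d["laugh"])]
r_centred = pearson(centred_sleep, centred_laugh)

print("within samples:", {k: round(v, 2) for k, v in r_within.items()})
print("pooled, raw scores:", round(r_pooled, 2))
print("pooled, centred within sample:", round(r_centred, 2))
```

The pooled raw correlation comes out positive, while the within-sample and the centred correlations are negative. Note that this sketch only centres the scores; the z-transformation in SPSS additionally rescales each item to a standard deviation of one.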

I believe the last option is the most appealing with large data sets.
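To see how quickly the pairwise route (option 1) gets out of hand, here is a rough back-of-the-envelope sketch. It assumes that each pairwise comparison is treated as a separate test at alpha = .05 and that the tests are independent, which is a simplification; the function name `pairwise_burden` is my own.

```python
from math import comb

def pairwise_burden(n_samples, alpha=0.05):
    """Return the number of pairwise comparisons and the family-wise error rate.

    Assumes independent tests, each run at the same alpha level.
    """
    n_pairs = comb(n_samples, 2)          # k * (k - 1) / 2
    fwer = 1 - (1 - alpha) ** n_pairs     # chance of at least one false alarm
    return n_pairs, fwer

for k in (3, 5, 10, 20):
    pairs, fwer = pairwise_burden(k)
    print(f"{k:>2} samples: {pairs:>3} comparisons, "
          f"family-wise error rate about {fwer:.2f}")
```

With ten samples you are already looking at 45 comparisons, whereas comparing each sample with the average structure needs only ten.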

However, cross-cultural psych never stops being complicated. What happens if you find that some samples show good factor congruence with the average factor structure and others do not? Ideally, you would exclude the poorly fitting samples from the average factor structure and re-run the analysis. Proceed iteratively until no remaining sample shows any problems with factor similarity.
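The iterative logic can be sketched as a simple loop. This is only an outline in Python, not something that talks to SPSS: `congruence_with_average` is a hypothetical stand-in for "recompute the average structure for the remaining samples, rotate each sample towards it and return its congruence coefficient", and the cut-off of .90 is an assumption that you should replace with whatever criterion you use.

```python
def prune_samples(samples, congruence_with_average, threshold=0.90):
    """Drop poorly fitting samples until all remaining ones meet the threshold.

    congruence_with_average(kept) must return {sample_name: congruence} for
    the samples in `kept`, based on an average structure of those samples only.
    """
    kept = list(samples)
    dropped = []
    while kept:
        phis = congruence_with_average(kept)
        misfits = [s for s in kept if phis[s] < threshold]
        if not misfits:
            break                      # everyone fits: stop iterating
        dropped.extend(misfits)
        kept = [s for s in kept if s not in misfits]
    return kept, dropped

# Hypothetical numbers standing in for real SPSS output.
def fake_congruence(kept):
    with_d = {"A": 0.94, "B": 0.93, "C": 0.92, "D": 0.70}
    without_d = {"A": 0.97, "B": 0.96, "C": 0.95}
    table = with_d if "D" in kept else without_d
    return {name: table[name] for name in kept}

kept, dropped = prune_samples(["A", "B", "C", "D"], fake_congruence)
print("kept:", kept)
print("dropped:", dropped)
```

In this made-up run, sample D is removed in the first round and the remaining three samples all pass in the second round. The sketch drops all misfitting samples at once; removing only the worst sample per round is an equally defensible variant.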
If you have lots of cultural samples, are really curious (and stats savvy) and want to find out what is happening in the strange worlds of culture, you may want to run a cluster analysis on the congruence coefficients to identify clusters of samples that show greater similarity with each other. This might provide some interesting insights from a cross-cultural perspective. However, it is computationally demanding and relies on purely statistical criteria. There is a neat paper discussing various options and strategies, written by Welkenhuysen-Gybels and van de Vijver (2001, published in the Proceedings of the Annual Meeting of the American Statistical Association; I think this gives you an idea about what level of analysis we are talking about[1]). You can also download a SAS macro (the link is in the paper) that does much of the computational work for you. I have never worked with SAS. It seems a parallel universe to me and I am fascinated by it, but scared of it. But there are people who think it is easy. Conceptually, it is a nice tool.
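To give a flavour of the clustering idea, here is a deliberately simple sketch in plain Python. It is not the procedure from the paper or the SAS macro. It computes Tucker's congruence coefficient between every pair of samples for one factor and then groups samples that are linked by a coefficient of at least .90 (a crude single-linkage grouping). The loadings, the sample names, the cut-off and the function `threshold_clusters` are all assumptions for illustration, and the loadings are assumed to have been rotated to a common target already.

```python
from math import sqrt

def tucker_phi(x, y):
    """Tucker's congruence coefficient between two loading vectors."""
    num = sum(a * b for a, b in zip(x, y))
    den = sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den

def threshold_clusters(loadings, threshold=0.90):
    """Group samples that are connected by congruence >= threshold."""
    names = list(loadings)
    unassigned = set(names)
    clusters = []
    for name in names:
        if name not in unassigned:
            continue
        unassigned.discard(name)
        cluster, frontier = [name], [name]
        while frontier:
            current = frontier.pop()
            for other in names:
                if other in unassigned and tucker_phi(
                        loadings[current], loadings[other]) >= threshold:
                    unassigned.discard(other)
                    cluster.append(other)
                    frontier.append(other)
        clusters.append(sorted(cluster))
    return clusters

# Made-up loadings of five items on one factor in four samples.
loadings = {
    "S1": [0.80, 0.70, 0.75, 0.10, 0.20],
    "S2": [0.75, 0.72, 0.80, 0.15, 0.10],
    "S3": [0.20, 0.10, 0.15, 0.80, 0.70],
    "S4": [0.10, 0.20, 0.10, 0.75, 0.80],
}

clusters = threshold_clusters(loadings)
print(clusters)
```

With these invented loadings, S1 and S2 end up in one cluster and S3 and S4 in another. With real data you would want a proper clustering method and more than one factor, which is where the computational burden comes from.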

[1] You can download the paper at:

1 comment:

  1. " This can easily be done using the z-transformation option in SPSS (‘Save standardized values as variables’ under the ‘Analysis’ -> ‘Descriptives’ option). "

    Bloody brilliant. Thanks for that! I've been searching for this for a few hours now :-)