Welcome!

Here are links to our lecture notes on the different course topics...

Intro: The Pyramid of Success

Correlation, Least-Squares Principle, and Multiple Regression

Path Analysis

Exploratory Factor Analysis (here, here, and here)

Confirmatory Factor Analysis (here, here, and here)...

...and Associated Basic Concepts (degrees of freedom [here and here]; model fit; reporting fit)

ONYX Program

Writing Up SEM/CFA Results

Full Structural Models (here, here, and here); also see the following article for discussion of what a "model" represents:

Rodgers, J. L. (2010). The epistemology of mathematical and statistical modeling: A quiet methodological revolution. American Psychologist, 65, 1-12.

Video clip of legendary physicist Richard Feynman discussing conclusions one can draw from tests of theoretical models.

Comparative Model Testing and Nestedness

Beyond the Basics of SEM (contains all our topics for roughly the second half of the course)

Diagram for Assignment 2

Refresher Diagram on SEM Terminology

SEM The Musical: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10

Graphic arts programs

SEM The Musical 11


(Updated May 5, 2017)

The eleventh annual SEM The Musical was held Thursday, May 4, during our class. We had two new songs this year, one by Dr. Reifman and one by student Derrick Holland. Derrick's song keeps alive our streak of having at least one student-written song every year. We also, of course, sang a bunch of favorites from the previous ten musicals (links: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10). See below for this year's new songs...

Why Won’t It Run? 
Lyrics by Alan Reifman
May be sung to the tune of “On the Run” (Lukather/Paich/Waybill for Toto)

SEM involves, some complex math,
Before you go, you need to check, you’ve added each path,
Lots of little details, for you to keep in sight,
Formatting the data, and making sure, your syntax is right,

Have you verified, what you’ll fix to one?
Otherwise, your troubles, have just begun,

Why won’t it run, why all these error signs?
Why won’t it run? Only hints, of what could’ve, gone wrong,
Why won’t it run? Check your steps, line-by-line,
You can put, all your angst, into song!

(Instrumental)

You never know, what a new model, can bring,
Punctuation, constraints, it could be anything,
Maybe what you have, is a Heywood Case?
You’ll need a sharp eye, to keep things in place,

Maximum likelihood, seeks a minimum to achieve,
Gonna take a miracle, for you to receive,

Oh, oh, oh, why won’t it run, why won’t the steps converge?
Why won’t it run, are the magnitude scales far apart?
Why won’t it run, when will I be, on the verge?
Doing this, can tax your heart!

(Instrumental and Guitar Solo)

Hungry Like a Low Chi-Square
Lyrics by Derrick Holland
May be sung to the tune of "Hungry Like a Wolf" (Duran Duran)

Open the connection, get ready to run,
Make sure variables, are in a dot-dat file,
Do do do do do do do dodo dododo dodo,

List, all your variables, that you will use,
Make sure all missing variables, are -99*,
Do do do do do do do dodo dododo dodo,

All pathways are free,
Unless you fix a path to 1,
The very end goal,
It’s important you know,
And I'm hungry, like a low chi-square,

Straddle the line,
With comparative models,
I'm on the hunt, for a good CFI,
Check your TLI, and RMSEA,
And I'm hungry, like a low chi-square,

You get an error, so you start to freak out,
Mplus tells you, that variables are not defined,
Do do do do do do do dodo dododo dodo,

You get a low CFI, important paths are behind,
You search in theory, for paths that are not benign,
Do do do do do do do dodo dododo dodo,

All pathways are free,
Unless you fix one path to 1,
The very end goal,
It’s important you know,
And I'm hungry, like a low chi-square,

Straddle the line,
With comparative models,
I'm on the hunt, for a good CFI,
Check your TLI, and RMSEA,
And I'm hungry, like a low chi-square,

Searching for paths,
I break from theory,
I'm on the hunt,
But I won’t get pubbed,

Latent constructs, made up of manifest
And I'm hungry like low chi-squares,

Draw many lines,
If you use ONYX,
I'm on the hunt, for a good CFI,
Check your TLI, and RMSEA,
And I'm hungry, like a low chi-square,

---

*This is a specification in the Mplus program

Partial Least Squares (Small-Sample Alternative to Conventional SEM)

Partial Least Squares (PLS) is a variation on Structural Equation Modeling (SEM). Riou, Guyon, and Falissard (2016) state that, relative to conventional SEM, PLS “is more suitable to … work with smaller sample sizes.” PLS is recommended for exploratory purposes, and is often used with single-indicator constructs. The technique seems to be used predominantly within the field of Management Information Systems (MIS).

Significance testing is done through bootstrapping, with 100 random variations of the original data set being generated and the model rerun in each random data set. An actual path coefficient from one’s model can then be evaluated for extremity, relative to the distribution of the same coefficient estimated 100 times from the bootstrap.
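Here's a minimal sketch of the bootstrap logic just described, in Python with NumPy. It is not the algorithm of WarpPLS or any other PLS program: a simple correlation stands in for a single standardized path, the data are simulated, and the function name bootstrap_path is made up for illustration.

```python
import numpy as np

def bootstrap_path(x, y, n_boot=100, seed=0):
    """Resample cases with replacement and re-estimate one path each time."""
    rng = np.random.default_rng(seed)
    n = len(x)
    estimate = np.corrcoef(x, y)[0, 1]           # path from the original data
    boot = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)         # draw n cases with replacement
        boot[b] = np.corrcoef(x[idx], y[idx])[0, 1]
    se = boot.std(ddof=1)                        # spread of the bootstrap estimates
    return estimate, se, estimate / se           # ratio is judged like a t or z value

# Simulated small sample (N = 40), purely for illustration
rng = np.random.default_rng(1)
x = rng.normal(size=40)
y = 0.5 * x + rng.normal(size=40)
estimate, se, ratio = bootstrap_path(x, y)
print(f"path = {estimate:.2f}, bootstrap SE = {se:.2f}, ratio = {ratio:.2f}")
```

The default of 100 resamples mirrors the number mentioned above; raising n_boot gives a smoother bootstrap distribution at the cost of run time.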

Though PLS may have a reputation for making it easier to obtain significant results, this view appears overstated; a simulation study found that “for N = 40, PLS had 3% and 1% higher power than regression for strong and medium effect sizes [and…] the same power as regression at weak effect size” (Goodhue, Lewis, & Thompson, 2006).

Fit indices, such as NFI, CFI, and RMSEA, are not available.

WarpPLS (Kock, 2015) is a program I've found useful; it has a three-month free trial version. Note that the probabilities given in WarpPLS output are one-tailed, so if you want to report two-tailed p-values, you must double the printed value (e.g., p = .02 one-tailed represents p = .04 two-tailed).
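The doubling rule can be written as a one-line helper. This is only a sketch: the function name two_tailed is hypothetical, capping the result at 1 is my own assumption, and the rule presumes a symmetric test statistic.

```python
def two_tailed(p_one_tailed):
    """Double a one-tailed p-value, capping the result at 1 (assumed convention)."""
    return min(1.0, 2 * p_one_tailed)

# The example from the text: p = .02 one-tailed
print(two_tailed(0.02))
```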

Discussion of the pros and cons of PLS, and of the circumstances for which it may -- or may not -- be appropriate, is available in Goodhue, Thompson, and Lewis (2013); Marcoulides, Chin, and Saunders (2009); McIntosh, Edwards, and Antonakis (2014); and other sources. See also this discussion piece by Kock.

References

Goodhue, D., Lewis, W., & Thompson, R. (2006). PLS, small sample size and statistical power in MIS research. In R. Sprague Jr. (Ed.), Proceedings of the 39th Hawaii International Conference on System Sciences. Los Alamitos, CA: IEEE Computer Society Press. (link)

Goodhue, D. L., Thompson, R. L., & Lewis, W. (2013). Why you shouldn’t use PLS: Four reasons to be uneasy about using PLS in analyzing path models. In 46th Hawaii International Conference on System Sciences (pp. 4739–4748). Wailea, HI: HICSS.

Kock, N. (2015). WarpPLS 5.0 User Manual. Laredo, TX: ScriptWarp Systems. (link)

Marcoulides, G. A., Chin, W. W., & Saunders, C. (2009). A critical look at partial least squares modeling. MIS Quarterly, 33(1), 171-175. (link)

McIntosh, C. N., Edwards, J. R., & Antonakis, J. (2014). Reflections on partial least squares path modeling. Organizational Research Methods, 17, 210-251. (abstract)

Riou, J., Guyon, H., & Falissard, B. (2016). An introduction to the partial least squares approach to structural equation modelling: A method for exploratory psychiatric research. International Journal of Methods in Psychiatric Research, 25, 220-231. Published online first at doi: 10.1002/mpr.1497.

Reminder of Terminology in an SEM

To the beginning SEM practitioner, terms such as "parameter," "factor loading," and "directional path" may be confusing. Here's a drawing on the whiteboard (with some touch-ups in PowerPoint) to help clarify proper usage. Thanks to the students who photographed the board!


Introduction to ONYX

ONYX is a free SEM package developed in Germany. We will use it for Assignment 1 in our Spring 2017 class, a CFA on the Hendrick and Hendrick love styles. Here are some tips I have come up with for using ONYX, given its differences from other SEM programs:

1. Everything is done through right-clicking to bring up menus.

2. You can use a plain-text (tab-delimited) ".dat" file saved from an SPSS data file. The ONYX user manual lists available options for designating missing data. Once you've drawn your model, you can use "Load Data" to connect to the .dat file, yielding what's called a "Data Panel."

3. Use the "Create Variable" option to generate either latent or observed variables.

4. You should name latent variables (in ALL CAPITALS) via right-clicking. However, you’ll have to drag in the measured variables from the "Data Panel" to the variables' respective boxes in the model. By hovering over the measured-variable boxes, you can verify that the data have been linked.

5. By right-clicking on top of a variable, you can use the "Add Path" tool (the default is to draw unidirectional "causal" paths, whereas holding down the Shift key while using "Add Path" yields dual-headed correlational arrows).

6. All unstandardized factor loadings start out fixed at 1; you should free all of them (i.e., let them take on freely estimated values). To identify the model (i.e., make sure you're not estimating more quantities than you have information for), construct variances should be fixed to 1.*

7. The default settings yield an unstandardized solution, whereas usually we're interested in a standardized one. You can obtain a standardized solution by right-clicking on each indicator’s box and selecting “z-score Transform.”

8. Covariances (correlations between factors) are also fixed and should be freed.

9. Unlike other programs, where you submit a run, ONYX is constantly running in the background and responds to changes you make in model specifications. Right-clicking and selecting “Show Estimate Summary” will show current results.

---
*Fixing (or constraining) variables, underidentification, and degrees of freedom are discussed here. I have edited the linked document to distinguish between suggestions for when the AMOS program is used, vs. ONYX.
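To make the identification idea in tip 6 concrete, here is a rough sketch that counts "knowns" (unique variances and covariances among the measured variables) against "unknowns" (freely estimated parameters). The function name cfa_degrees_of_freedom and the two-factor example are hypothetical, not taken from Assignment 1, and the count assumes each indicator loads on only one factor, error terms are uncorrelated, and all factors are allowed to correlate.

```python
def cfa_degrees_of_freedom(indicators_per_factor, factor_variances_fixed=True):
    """Count knowns, free parameters, and df for a simple CFA.

    Assumes each indicator loads on one factor, error terms are
    uncorrelated, and all factors are allowed to correlate.
    """
    p = sum(indicators_per_factor)               # number of measured variables
    k = len(indicators_per_factor)               # number of factors
    knowns = p * (p + 1) // 2                    # unique variances and covariances
    # If factor variances are not fixed, one loading per factor is fixed to 1 instead
    loadings = p if factor_variances_fixed else p - k
    factor_variances = 0 if factor_variances_fixed else k
    factor_covariances = k * (k - 1) // 2
    error_variances = p
    unknowns = loadings + factor_variances + factor_covariances + error_variances
    return knowns, unknowns, knowns - unknowns

# Hypothetical model: 2 factors with 3 indicators each
print(cfa_degrees_of_freedom([3, 3]))
```

For this hypothetical model the function returns 21 knowns, 13 unknowns, and 8 degrees of freedom. Fixing one loading per factor instead of the factor variances (factor_variances_fixed=False) gives the same count, which is why either approach identifies the model.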

SEM The Musical 10


The tenth annual SEM The Musical was held on Thursday, May 5. We performed a few new songs this year, as shown below. We also performed songs from previous SEM Musicals. Three older songs we performed this year are available on YouTube (thanks to SH for filming). These songs are "Common Model Mistakes" (originally from SEM The Musical 9), "Saturated Your Model" and "If You Wanna Join My Construct (You've Gotta Load with My Friends)," the latter two from SEM The Musical 8. To see the lyrics from these (and other) older songs, just click on the year number of the musical: 1, 2, 3, 4, 5, 6, 7, 8, 9.


SEM Musical TEN!
Lyrics by Alan Reifman (retread from previous years)
(May be sung to the tune of “Let’s Get it Started,” Will Adams et al. for the Black Eyed Peas)

(Softly) The models keep runnin-runnin, and runnin-runnin, and runnin-runnin, and runnin-runnin, and runnin-runnin, and runnin-runnin, and runnin-runnin, and runnin-runnin, and...

We’re back again, to have some fun,
We’re gonna bust some rhyme, have a good time,
We’re gonna sing some songs, about SEM technique,
Access your inner geek, let your voices speak,

SEM is different, your measurement model’s explicit,
The whole model, gets tested for fit,
Is it identified? We know how hard you’ve tried,
Knowns and unknowns, side by side,
It takes you on a ride, finally you’re satisfied,
Your output’s now just fine, you’ve arrived, you can take pride…

NFI, TLI, CFI,
Calculate estimates, let it run, have some fun, yeah…
SEM Musical (TEN!), SEM Musical (HERE!),
SEM Musical (TEN!), SEM Musical (HERE!),
SEM Musical (TEN!), SEM Musical (HERE!),
SEM Musical (TEN!), SEM Musical (HERE!),
Yeah...

Build your constructs, get this straight,
Make sure the indicators, correlate,
Draw your pathways, residuals too,
Don’t leave out, the fixed 1 value,

Take your time, think it through,
Don’t worry if you’re new, we’ll walk with you,
Step by step, right up the pyramid,
For SEM, we’re really groovin,’
Hope you get an acceptable solution,
Submit your model and get it movin,’

NFI, TLI, CFI,
Calculate estimates, let it run, have some fun, yeah…
SEM Musical (TEN!), SEM Musical (HERE!),
SEM Musical (TEN!), SEM Musical (HERE!),
SEM Musical (TEN!), SEM Musical (HERE!),
SEM Musical (TEN!), SEM Musical (HERE!),
Yeah...


The Part That’s Error-Free 
Lyrics by Alan Reifman
(May be sung to the tune of “Biggest Part of Me,” David Pack for Ambrosia)

Boxes, they hold the manifestations,
Bubbles, are error locations,
Constructs, house the shared variation,
They're the part, that’s error-free,

Loadings, show measures, are correlated,
That makes, indicators validated,
Errors, in the bubbles, they are gated,
So constructs, are error-free,

Well...
You remove error,
And the paths, become more true*,
This is such, a key thing,
Latent constructs, do for you,

So draw it now,
Tell measurement error, to shoo.
You can estimate, the paths,
Without error, troubling you,

Sometimes, you have just, total-scale measures,
Of those, certain constructs, that you treasure,
Alpha, gives a way to block displeasure,
Controls, unreliability,

Parcels, a technique, that can’t be plainer,
Items, placed into, random containers,
These sets, can then serve, as indicators,
Constructs now, are error-free,

Well-l-l-l-l...
You remove error,
And the paths, become more true,
This is such, a key thing,
Latent constructs, do for you,

So draw it now,
Tell the measurement error, to shoo,
You can estimate, the paths,
Without error, troubling you,

(Instrumentals)

It’s an SEM hallmark,
Going back to CFA,
It’s a major advantage, of using LV’s,

Not all techniques, give you this,
Measurement error, doesn’t go away,
So use latent constructs, to be error-free,
Be error-free,
Be error-free...

*Stephenson, M. T., & Holbert, R. L. (2003). A Monte Carlo simulation of observable versus latent variable structural equation modeling techniques. Communication Research, 30, 332-354.

See also previous lecture modules here and here.


Those Kinds of Paths (Are Autoregressive)
Lyrics by Alan Reifman
(May be sung to the tune of “Because the Night,” Springsteen/Smith)

Panel models, longitudinally,
Follow the same people, over time,
Each major construct, we include repeatedly,
It gets us the time-ordering, of causality,

So, come on now, no hand-calculated math,
In cross-lagged models, we run paths,
From Construct A at one time, to B at the next,
We also have paths, from the same construct,
Time 1 to Time 2, and Time 2 to Time 3,

Those kinds of paths, are auto-regressive,
Those kinds of paths, test stability,
Those kinds of paths, are auto-regressive,
Those kinds of paths, test stability,

Autoregressive paths, play a crucial role,
They control for earlier levels, of a later DV,
So when a cross-lagged path, is significant,
It shows association, beyond stability,

So, come on now, no hand-calculated math,
In cross-lagged models, we run paths,
From Construct A at one time, to B at the next,
We also have paths, from the same construct,
Time 1 to Time 2, and Time 2 to Time 3,

Those kinds of paths, are auto-regressive,
Those kinds of paths, test stability,
Those kinds of paths, are auto-regressive,
Those kinds of paths, test stability,

(Guitar solo)

These kinds, of paths,
Predict, to later versions, of themselves,
Without them, analyses would lack rigor,
So include them...

Time 1 to Time 2, Time 2 to Time 3,
Time 1 to Time 2, Time 2 to Time 3,
Time 1 to Time 2, Time 2 to Time 3,

Those kinds of paths, are auto-regressive,
Those kinds of paths, test stability,
Those kinds of paths, are auto-regressive,
Those kinds of paths, test stability,

Those kinds of paths, are auto-regressive,
Those kinds of paths, test stability,
Those kinds of paths, are auto-regressive,
Those kinds of paths, test stability,

Those kinds of paths, are auto-regressive,
Those kinds of paths, test stability,
Those kinds of paths, are auto-regressive,
Those kinds of paths, test stability,

Those kinds of paths, are auto-regressive,
Those kinds of paths, test stability,
Those kinds of paths, are auto-regressive,
Those kinds of paths, test stability...


Constructs (Don’t be Afraid of Changing!)
Lyrics by Diane Wittie
(May be sung to the tune of “Landslide,” Stevie Nicks)

Gathered the data, and they abound,
I cleaned them up, then I went to town,
And I saw some variables, that looked interesting,
And now my sleep, would be sound,

Oh, yes I can begin, naming latent constructs,
But will those, constructs make any sense?
Will they adequately represent,
What I envision?
Can I implement, my central concepts?

Uh-hum, I do think so,

Well, don’t be, afraid of changing,
’Cause your constructs, need to make sense,
Think through, your decisions,
You may need, revisions,
Don't do anything, you'll rue,

(Brief guitar)

So, don’t be, afraid of changing,
’Cause your constructs, need to make sense,
Think through, your decisions,
You may need, revisions,
Don't do anything, you'll rue,
To your theory, be true,

So, analyze your data, see what you've found,
Your model, may earn great renown!
If you see factor loadings, at plus or minus .4,
Well maybe, high points you will score,
If you see structural paths, that are significant,
Yes, high points, you will score!

Mediational Models

As we saw in the journal-assignment presentations, many SEM-based studies examine mediation between variables. To mediate is to go in the middle, much as a negotiation mediator comes between a labor union and management.

In statistical analysis, we often start out with a relationship between two variables. Using an example from one of my grad-school mentors, Patricia Gurin, cigarette smoking and lung cancer are positively associated.

Cigarette Smoking ==> Lung Cancer

Why does this relationship exist? A more fine-grained understanding would be that smoking leads to lung tissue damage, and tissue damage leads to cancer. Tissue damage would thus be considered the mediator or mechanism.

Cigarette Smoking ==> Tissue Damage ==> Lung Cancer

Reuben Baron and David Kenny published an article in 1986 on mediation that has been cited over 58,000 times! Kenny summarizes the process in a nutshell here. In the following figure, I apply Baron and Kenny's "old school" method to Gurin's example. Note that one would run the model twice.



(Illustration of Baron and Kenny's, 1986, logic. Example from Patricia Gurin, University of Michigan, circa 2002-2003, link)

The above diagram presents the scenario of full mediation (i.e., the initially significant direct path from antecedent to outcome becomes nonsignificant). One can then say that the mediator accounts fully for the antecedent-outcome relationship. If the initial direct path from antecedent to outcome remains significant after addition of the two mediational paths, but the initial direct path is reduced in magnitude, one can claim partial mediation (see Huselid and Cooper, 1994, "Gender roles as mediators of sex differences in expressions of pathology").
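The "run the model twice" logic can be sketched with ordinary least-squares regression in Python (NumPy), swapped in here for an SEM program. Everything in the sketch is an assumption for illustration: the data are simulated so that the mediator carries the whole effect, the helper ols_slopes is a made-up name, and significance tests are omitted (only the path estimates are shown).

```python
import numpy as np

def ols_slopes(y, *predictors):
    """Return OLS slopes (intercept dropped) for y regressed on the predictors."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefs[1:]

# Simulated data built so that tissue damage fully mediates the effect
rng = np.random.default_rng(42)
n = 5000
smoking = rng.normal(size=n)
damage = 0.5 * smoking + rng.normal(size=n)      # antecedent -> mediator
cancer = 0.4 * damage + rng.normal(size=n)       # mediator -> outcome, no direct path

# Run 1: outcome on antecedent only
(c_total,) = ols_slopes(cancer, smoking)

# Run 2: add the mediator and its two paths
(a,) = ols_slopes(damage, smoking)
c_direct, b = ols_slopes(cancer, smoking, damage)

print(f"Run 1 direct path: {c_total:.2f}")
print(f"Run 2 paths: a = {a:.2f}, b = {b:.2f}, remaining direct = {c_direct:.2f}")
```

With data generated this way, the direct path in Run 2 should shrink toward zero once the mediator is included, which is the full-mediation pattern; a direct path that shrinks but stays clearly nonzero would correspond to partial mediation.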

As Kenny writes on his website, "More contemporary analyses focus on the indirect effect." The leading names associated with contemporary mediational analysis are Andrew Hayes and Kristopher Preacher, who indeed emphasize indirect effects. The indirect effect can be calculated by multiplying the standardized paths from antecedent to mediator, and from mediator to outcome (think back to our unit on path-analysis tracing rules).


The indirect effect is .15 in the above example. If each of the two segments of the indirect effect (A to M, and M to O) is statistically significant (i.e., different from zero), we would be confident that the indirect effect is also significant. As Hayes (2009, "Beyond Baron and Kenny: Statistical mediation analysis in the new millennium") notes, however, "it is possible for an indirect effect to be detectably different from zero even though one of its constituent paths is not." What is called for is a significance test of the indirect effect of .15 (or whatever value one has).
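As a quick arithmetic check of the multiplication rule, here is a tiny sketch. The two path values (.50 and .30) are hypothetical, chosen only because their product is .15; they are not necessarily the values in the diagram.

```python
def indirect_effect(a_to_m, m_to_o):
    """Multiply the antecedent->mediator and mediator->outcome paths."""
    return a_to_m * m_to_o

# Hypothetical standardized paths, chosen only so the product matches .15
print(round(indirect_effect(0.50, 0.30), 2))
```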

The problem is that there is no existing theoretical distribution such as the z, t, F, or chi-square distribution to judge the statistical significance of an indirect effect (i.e., whether or not one's obtained indirect effect falls in the upper or lower 2.5% of the distribution for a two-tailed p < .05 significance level). Therefore, researchers use a "synthetic" statistical distribution for testing the significance of indirect effects, known as a "bootstrap" distribution. Kenny discusses this on his website and it is also illustrated in slide 6 of this slideshow.
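Here is a rough sketch of how a bootstrap distribution for an indirect effect can be built, using a percentile confidence interval. It is not the procedure of any particular program: the data are simulated, the paths are unstandardized regression slopes rather than SEM estimates, and the function names (slopes, indirect, bootstrap_indirect) are made up for illustration.

```python
import numpy as np

def slopes(y, *predictors):
    """OLS slopes (intercept dropped) for y regressed on the predictors."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefs[1:]

def indirect(ant, med, out):
    """a * b: antecedent->mediator path times mediator->outcome path."""
    (a,) = slopes(med, ant)
    _, b = slopes(out, ant, med)                 # b controls for the antecedent
    return a * b

def bootstrap_indirect(ant, med, out, n_boot=2000, seed=0):
    """Percentile bootstrap interval for the indirect effect."""
    rng = np.random.default_rng(seed)
    n = len(out)
    boot = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)         # resample cases with replacement
        boot[i] = indirect(ant[idx], med[idx], out[idx])
    lower, upper = np.percentile(boot, [2.5, 97.5])
    return indirect(ant, med, out), lower, upper

# Simulated data with a true indirect effect of .5 * .3 = .15
rng = np.random.default_rng(7)
n = 500
ant = rng.normal(size=n)
med = 0.5 * ant + rng.normal(size=n)
out = 0.3 * med + rng.normal(size=n)

est, lower, upper = bootstrap_indirect(ant, med, out)
print(f"indirect = {est:.3f}, 95% bootstrap CI = [{lower:.3f}, {upper:.3f}]")
```

If the interval between the 2.5th and 97.5th percentiles of the bootstrap distribution excludes zero, the indirect effect is treated as significant at the two-tailed .05 level.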

An additional source for studying mediation in SEM is:

Li, S. (2011). Testing mediation using multiple regression and structural modeling analyses in secondary data. Evaluation Review, 35, 240-268.