Exploratory factor analysis lecture notes

About these notes
These lecture notes:
 * 1) were converted from .odp slides to MediaWiki syntax
 * 2) were subsequently copyedited and further wikified (ongoing)
 * 3) do not yet include all images, additional notes (i.e., they provide slide text only), or notes from updated slides

To see an embedded version of this presentation:
 * Exploratory factor analysis (slideshare)

What is factor analysis?

 * 1) A multivariate statistical technique for identifying clusters of inter-correlated variables (or 'factors').
 * 2) A family of techniques to examine linear correlations amongst variables.
 * 3) Aim is to identify groups of variables which are relatively homogeneous.
 * 4) Groups of related variables are called 'factors'.
 * 5) Involves empirical testing of theoretical data structures

Purposes
There are two main applications of factor analytic techniques, namely to:


 * 1) reduce the number of variables, and
 * 2) detect structure in the relationships between variables, that is, to classify variables.

Factor analysis is commonly used in psychometric instrument development.

History

 * 1) Invented by Spearman (1904)
 * 2) Usage was hampered by onerousness of hand calculation
 * 3) Since the advent of computers, usage has thrived, especially to develop:
 * theory, e.g., determining the structure of personality
 * practice, e.g., development of tens of thousands of psychological screening & measurement tests
 * 4) A classic question is the structure of IQ: is it multiple separate but related factors (e.g., verbal, mathematical, interpersonal, etc.)?
 * ...or is it one global factor (g)?
 * ...or is it hierarchically structured?


 * 1) "Introduced and established by Pearson in 1901 and Spearman three years thereafter, factor analysis is a process by which large clusters and grouping of data are replaced and represented by factors in the equation. As variables are reduced to factors, relationships between the factors begin to define the relationships in the variables they represent (Goldberg & Digman, 1994). In the early stages of the process' development, there was little widespread use due largely in part to the immense amount of hand calculations required to determine accurate results, often spanning periods of several months. Later on a mathematical foundation would be developed aiding in the process and contributing to the later popularity of the methodology. In present day, the power of super computers makes the use of factor analysis a simplistic process compared to the 1900's when only the devoted researchers could use it to accurately attain results (Goldberg & Digman, 1994)." (from http://www.personalityresearch.org/papers/fehringer.html)
 * 2) Terminology was introduced by Thurstone (1931) - http://www.statsoft.com/textbook/stfacan.html

Conceptual models
Here are some visual ways of conceptualising factor analytic models:

Hierarchical

 * 1) [[Image:FactorAnalysis SimpleModel.png|center|thumb|700px|A simple factor analytic model (2-d) ... e.g., 12 items testing might actually tap only 3 underlying factors]]
 * 2) Figure 2.1 (DeCoster, 1998) (pdf)

Cluster

 * 1) [[Image:FactorAnalysis ConceptualModel DotsRings.png|center|thumb|450px|Factor analysis uses correlations among many items to search for common clusters...Exploratory factor analysis is a tool to help a researcher ‘throw a hoop’ around clusters of related items, to distinguish between clusters, and to identify and eliminate irrelevant or indistinct (overlapping) items.]]

3-d

 * 1) Figure 3 (Clemants & Moore, 2003) (gif/html).

Factor analysis process

 * 1) FA can be conceived of as a method for examining a matrix of correlations in search of clusters of highly correlated variables.
 * 2) A major purpose of factor analysis is data reduction, i.e., to reduce complexity in the data, by identifying underlying (latent) clusters of association.

Intelligence
IQ – does intelligence consist of separate factors, e.g.,
 * 1) Verbal
 * 2) Mathematical
 * 3) Interpersonal, etc.?

...or is it one global factor (g)?

...or is it hierarchically structured?

Personality
Personality – does it consist of 2, 3, 5, 16, etc. factors, e.g., the “Big 5”?


 * 1) Neuroticism
 * 2) Extraversion
 * 3) Agreeableness
 * 4) Openness
 * 5) Conscientiousness

Essential facial features
Six orthogonal factors represent 76.5% of the total variability in facial recognition (in order of importance):


 * 1) upper-lip
 * 2) eyebrow-position
 * 3) nose-width
 * 4) eye-position
 * 5) eye/eyebrow-length
 * 6) face-width.

Problems
Problems with factor analysis include:


 * 1) Mathematically complicated
 * 2) Technical vocabulary
 * 3) Results usually absorb a dozen or so pages
 * 4) Students do not ordinarily learn factor analysis
 * 5) Lay people find the results incomprehensible

“The very power of FA to create apparent order from real chaos contributes to its somewhat tarnished reputation as a scientific tool” - Tabachnick & Fidell (2001)

“It is mathematically complicated and entails diverse and numerous considerations in application. Its technical vocabulary includes strange terms such as eigenvalues, rotate, simple structure, orthogonal, loadings, and communality. Its results usually absorb a dozen or so pages in a given report, leaving little room for a methodological introduction or explanation of terms. Add to this the fact that students do not ordinarily learn factor analysis in their formal training, and the sum is the major cost of factor analysis: most laymen, social scientists, and policy-makers find the nature and significance of the results incomprehensible.” - Rummel (http://www.hawaii.edu/powerkills/UFA.HTM)

EFA = Exploratory Factor Analysis

 * 1) explores & summarises underlying correlational structure for a data set

CFA = Confirmatory Factor Analysis

 * 1) tests the correlational structure of a data set against a hypothesised structure and rates the 'goodness of fit'

Data reduction

 * 1) Simplifies data by revealing a smaller number of underlying factors
 * 2) Helps to eliminate:
 * redundant variables (e.g., items which are highly correlated with each other are unnecessary)
 * unclear variables (e.g., items which don’t load cleanly on a single factor)
 * irrelevant variables (e.g., variables which have low loadings)

Steps

 * 1) Test assumptions
 * 2) Select type of analysis
 * 3) Extraction (PC/PAF)
 * 4) Rotation (Orthogonal/Oblique)
 * 5) Determine no. of factors
 * 6) Identify which items belong in each factor
 * 7) Drop items as necessary and repeat steps 3 to 4
 * 8) Name and define factors
 * 9) Examine correlations amongst factors
 * 10) Analyse internal reliability

Sample size
Here are some guidelines about sample size for exploratory factor analysis. Factor analysis requires a reasonable sample size in order to be effective, and it requires a larger sample size than analyses such as multiple linear regression and ANOVA. Also note that factor analysis is based on correlations amongst the items, so a good estimate of each pair-wise correlation is needed (e.g., check the scatterplots).

Typical guidelines for factor analysis sample size requirements reported in the research method literature are:
 * 1) A total N > 200 is recommended. Comrey and Lee (1992) provide this description of the adequacy of total sample sizes for factor analysis:
 * 50 = very poor
 * 100 = poor
 * 200 = fair
 * 300 = good
 * 500 = very good
 * 1000+ = excellent
 * 2) Min./ideal sample size based on the variable:case ratio:
 * Min. N > 5 cases per variable (item), e.g., if I have 30 variables, I should have at least 150 cases (i.e., 1:5)
 * Ideal N > 20 cases per variable, e.g., if I have 30 variables, I would ideally have at least 600 cases (i.e., 1:20)
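
The rules of thumb above can be turned into a quick check. This is a minimal sketch only; the function name and the returned keys are illustrative assumptions, not part of any statistics package.

```python
def sample_size_guidelines(n_variables, n_cases):
    """Summarise the sample size rules of thumb for a planned factor analysis."""
    # Comrey and Lee (1992) descriptions of total sample size, largest first
    comrey_lee = [(1000, "excellent"), (500, "very good"), (300, "good"),
                  (200, "fair"), (100, "poor"), (50, "very poor")]
    adequacy = "below the lowest guideline (50)"
    for threshold, label in comrey_lee:
        if n_cases >= threshold:
            adequacy = label
            break
    return {
        "minimum_n": 5 * n_variables,    # at least 5 cases per variable
        "ideal_n": 20 * n_variables,     # ideally 20 cases per variable
        "cases_per_variable": n_cases / n_variables,
        "comrey_lee_adequacy": adequacy,
    }


# 30 items answered by 250 respondents
result = sample_size_guidelines(n_variables=30, n_cases=250)
print(result)
```

In this example the sample clears the minimum of 150 cases but falls well short of the ideal of 600.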

For more information, see EFA assumptions.

Example factor analysis - Classroom behaviour

 * 1) Based on Francis Section 5.6, which is based on the Victorian Quality Schools Project.
 * 2) 15 classroom behaviours of high-school children were rated by teachers using a 5-point scale.
 * 3) Task: Identify groups of variables (behaviours) that are strongly inter-related & represent underlying factors.

Items

 * 1) Cannot concentrate ↔ can concentrate
 * 2) Curious & enquiring ↔ little curiosity
 * 3) Perseveres ↔ lacks perseverance
 * 4) Irritable ↔ even-tempered
 * 5) Easily excited ↔ not easily excited
 * 6) Patient ↔ demanding
 * 7) Easily upset ↔ contented
 * 8) Control ↔ no control
 * 9) Relates warmly to others ↔ provocative, disruptive
 * 10) Persistent ↔ easily frustrated
 * 11) Difficult ↔ easy
 * 12) Restless ↔ relaxed
 * 13) Lively ↔ settled
 * 14) Purposeful ↔ aimless
 * 15) Cooperative ↔ disputes

Level of measurement (LOM)

 * 1) All variables must be suitable for correlational analysis, i.e., they should be ratio/metric data or at least Likert data with several interval levels.

Normality

 * 1) FA is robust to violations of the assumption of normality (but if the variables are normally distributed then the solution is enhanced)

Linearity

 * 1) Because FA is based on correlations between variables, it is important to check there are linear relations amongst the variables (i.e., check scatterplots)

Outliers

 * 1) FA is sensitive to outlying cases:
 * bivariate outliers (e.g., check scatterplots)
 * multivariate outliers (e.g., Mahalanobis' distance)
 * 2) Identify outliers, then remove or transform
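
As an illustration of the multivariate outlier check, the sketch below computes each case's squared Mahalanobis distance from the centroid with NumPy. The data are simulated, one outlier is planted deliberately, and the cutoff of 16.27 is the chi-square critical value for 3 variables at p = .001. Comparing against a chi-square critical value at p = .001 is a common convention, not something these notes prescribe.

```python
import numpy as np


def mahalanobis_squared(data):
    """Squared Mahalanobis distance of each case (row) from the centroid."""
    data = np.asarray(data, dtype=float)
    centred = data - data.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(data, rowvar=False))
    # d2[i] = centred[i] @ cov_inv @ centred[i]
    return np.einsum("ij,jk,ik->i", centred, cov_inv, centred)


rng = np.random.default_rng(0)
scores = rng.normal(size=(200, 3))   # 200 hypothetical cases, 3 variables
scores[0] = [6.0, -6.0, 6.0]         # plant one obvious multivariate outlier

d2 = mahalanobis_squared(scores)
cutoff = 16.27                       # chi-square critical value, df = 3, p = .001
outliers = np.flatnonzero(d2 > cutoff)
print("Cases flagged as multivariate outliers:", outliers)
```

Flagged cases should be inspected before deciding whether to remove or transform them.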

Factorability

 * 1) It is important to check the factorability of the correlation matrix (i.e., how suitable is the data for factor analysis?)
 * 2) Check correlation matrix for correlations over .3
 * 3) Check the anti-image matrix for diagonals over .5
 * 4) Check global measures of sampling adequacy (MSAs):
 * Bartlett's test of sphericity
 * Kaiser-Meyer-Olkin (KMO)

Correlations

 * 1) The most manual and time-consuming, but also the most thorough and accurate, way to examine the factorability of a correlation matrix is simply to examine each correlation in the correlation matrix
 * 2) Take note of whether there are SOME correlations over .30; if not, reconsider doing an FA (remember: garbage in, garbage out)
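
For a large matrix, the eyeballing step can be supported by a simple count. The following is a minimal sketch with a made-up correlation matrix for four items; the helper name is an assumption for illustration.

```python
import numpy as np


def correlations_over(R, threshold=0.30):
    """Return (count, total) of unique off-diagonal |r| values above threshold."""
    R = np.asarray(R, dtype=float)
    upper = R[np.triu_indices_from(R, k=1)]   # each pair once, diagonal excluded
    return int((np.abs(upper) > threshold).sum()), upper.size


# Hypothetical correlation matrix for four items
R = np.array([
    [1.00, 0.56, 0.48, 0.10],
    [0.56, 1.00, 0.42, 0.05],
    [0.48, 0.42, 1.00, 0.12],
    [0.10, 0.05, 0.12, 1.00],
])

n_over, n_pairs = correlations_over(R)
print(f"{n_over} of {n_pairs} correlations exceed .30")
```

With real data, `np.corrcoef(data, rowvar=False)` would produce the matrix from a cases-by-items array.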

Anti-image correlation matrix

 * 1) Anti-image: Medium effort, reasonably accurate
 * 2) Examine the diagonals on the anti-image correlation matrix to assess the sampling adequacy of each variable
 * 3) Variables with diagonal anti-image correlations of less than .5 should be excluded from the analysis because they lack sufficient correlation with other variables

Measures of sampling adequacy

 * 1) Quickest method, but least reliable
 * 2) Global diagnostic indicators: the correlation matrix is factorable if:
 * Bartlett's test of sphericity is significant, and/or
 * the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy is > .5
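
These diagnostics can be computed directly from a correlation matrix. The sketch below uses the standard formulas, applied to a made-up four-item matrix with an assumed 200 cases. KMO compares squared correlations with squared partial correlations, and Bartlett's statistic is -(N - 1 - (2p + 5)/6) ln|R| with p(p - 1)/2 degrees of freedom. The per-variable values are what is usually reported on the diagonal of the anti-image correlation matrix. The function names are illustrative, and the p-value step is left out (compare the statistic with a chi-square table).

```python
import numpy as np


def kmo_and_msa(R):
    """Overall KMO and per-variable MSA from a correlation matrix R."""
    R = np.asarray(R, dtype=float)
    inv = np.linalg.inv(R)
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale                  # partial (anti-image) correlations
    off = ~np.eye(R.shape[0], dtype=bool)   # mask selecting off-diagonal cells
    r2 = np.where(off, R ** 2, 0.0)
    p2 = np.where(off, partial ** 2, 0.0)
    msa = r2.sum(axis=0) / (r2.sum(axis=0) + p2.sum(axis=0))
    kmo = r2.sum() / (r2.sum() + p2.sum())
    return kmo, msa


def bartlett_sphericity(R, n_cases):
    """Chi-square statistic and degrees of freedom for Bartlett's test."""
    R = np.asarray(R, dtype=float)
    p = R.shape[0]
    _, logdet = np.linalg.slogdet(R)
    chi_square = -(n_cases - 1 - (2 * p + 5) / 6) * logdet
    df = p * (p - 1) // 2
    return chi_square, df


# Hypothetical correlation matrix for four items, N = 200 (assumed)
R = np.array([
    [1.00, 0.56, 0.48, 0.10],
    [0.56, 1.00, 0.42, 0.05],
    [0.48, 0.42, 1.00, 0.12],
    [0.10, 0.05, 0.12, 1.00],
])

kmo, msa = kmo_and_msa(R)
chi_square, df = bartlett_sphericity(R, n_cases=200)
print(f"KMO = {kmo:.2f}; per-variable MSA = {np.round(msa, 2)}")
print(f"Bartlett chi-square = {chi_square:.1f} on {df} df")
```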

Summary

 * 1) Are there several correlations > .3?
 * 2) Are the anti-image matrix diagonals > .5?
 * 3) Is Bartlett's test significant?
 * 4) Is KMO > .5 to .6? (depends on whose rule of thumb)

Extraction method
There are two main approaches to EFA based on:


 * 1) Analysing only shared variance: Principal Axis Factoring (PAF)
 * 2) Analysing all variance: Principal Components (PC)

Principal components (PC)

 * 1) More common
 * 2) More practical
 * 3) Used to reduce data to a set of factor scores for use in other analyses
 * 4) Analyses all the variance in each variable

Principal axis factoring (PAF)

 * 1) Used to uncover the structure of an underlying set of p original variables
 * 2) More theoretical
 * 3) Analyses only shared variance (i.e., leaves out unique variance)

Total variance explained

 * 1) Often there is little difference in the solutions for the two procedures.
 * 2) It's a good idea to check your solution using both techniques
 * 3) If you get a different solution between the two methods try to work out why and decide on which solution is more appropriate
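
To make the difference between the two extraction methods concrete, here is a minimal NumPy sketch. PC works on the full correlation matrix (1s on the diagonal), whereas PAF replaces the diagonal with communality estimates and iterates. Starting PAF from squared multiple correlations is one common choice and is an assumption here, as are the function names and the made-up one-factor correlation matrix; statistical packages may differ in the details.

```python
import numpy as np


def pc_loadings(R, n_factors):
    """Principal components: analyses all variance (1s on the diagonal)."""
    eigvals, eigvecs = np.linalg.eigh(np.asarray(R, dtype=float))
    order = np.argsort(eigvals)[::-1][:n_factors]
    return eigvecs[:, order] * np.sqrt(eigvals[order])


def paf_loadings(R, n_factors, max_iter=200, tol=1e-6):
    """Principal axis factoring: analyses shared variance only."""
    R = np.asarray(R, dtype=float)
    # Initial communality estimates: squared multiple correlations (assumption)
    h2 = 1 - 1 / np.diag(np.linalg.inv(R))
    for _ in range(max_iter):
        reduced = R.copy()
        np.fill_diagonal(reduced, h2)        # communalities replace the 1s
        eigvals, eigvecs = np.linalg.eigh(reduced)
        order = np.argsort(eigvals)[::-1][:n_factors]
        loadings = eigvecs[:, order] * np.sqrt(np.clip(eigvals[order], 0, None))
        new_h2 = (loadings ** 2).sum(axis=1)
        converged = np.max(np.abs(new_h2 - h2)) < tol
        h2 = new_h2
        if converged:
            break
    return loadings


# Hypothetical one-factor data with true loadings .8, .7, .6, .5
true = np.array([0.8, 0.7, 0.6, 0.5])
R = np.outer(true, true)
np.fill_diagonal(R, 1.0)

pc_h2 = (pc_loadings(R, 1) ** 2).sum(axis=1)
paf_h2 = (paf_loadings(R, 1) ** 2).sum(axis=1)
print("PC communalities: ", np.round(pc_h2, 2))
print("PAF communalities:", np.round(paf_h2, 2))
```

In this constructed example PAF recovers the shared variance (the squared true loadings), while PC gives higher values because it also absorbs unique variance.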

Communalities

 * 1) The proportion of variance in each variable which can be explained by the factors
 * 2) Communality for a variable = sum of the squared loadings for the variable on each of the factors
 * 3) Communalities range between 0 and 1
 * 4) High communalities (> .5) show that the factors extracted explain most of the variance in the variables being analysed
 * 5) Low communalities (< .5) mean there is considerable variance unexplained by the factors extracted
 * 6) May then need to extract MORE factors to explain the variance
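
A worked example of the communality calculation, using made-up loadings for four items on two factors (the item names and values are purely illustrative):

```python
# Hypothetical loadings: one row per item, one value per factor
loadings = {
    "item1": [0.80, 0.10],
    "item2": [0.70, 0.20],
    "item3": [0.15, 0.75],
    "item4": [0.30, 0.35],
}

# Communality = sum of the squared loadings for the item across the factors
communalities = {
    item: sum(value ** 2 for value in row) for item, row in loadings.items()
}

low_items = [item for item, h2 in communalities.items() if h2 < 0.5]

for item, h2 in communalities.items():
    note = "  <- low: considerable variance unexplained" if item in low_items else ""
    print(f"{item}: {h2:.2f}{note}")
```

For item1, for example, .80² + .10² = .65, so the two factors explain about 65% of its variance.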

Eigen values

 * 1) EV = sum of the squared correlations (loadings) between a factor and each of the variables
 * 2) EV = overall strength of relationship between a factor and the variables
 * 3) Successive EVs have lower values
 * 4) Eigen values over 1 are 'stable'
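
The relationship between eigen values and loadings can be checked numerically. In this minimal sketch (principal components extraction on a made-up four-item correlation matrix), each eigen value equals the column sum of squared loadings for that component, and the eigen values sum to the number of variables.

```python
import numpy as np

# Hypothetical correlation matrix for four items
R = np.array([
    [1.00, 0.56, 0.48, 0.10],
    [0.56, 1.00, 0.42, 0.05],
    [0.48, 0.42, 1.00, 0.12],
    [0.10, 0.05, 0.12, 1.00],
])

eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]            # largest eigen value first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Unrotated loadings: correlation of each variable with each component
loadings = eigvecs * np.sqrt(eigvals)

# Sum of squared loadings down each column reproduces the eigen values
column_ss = (loadings ** 2).sum(axis=0)
print("Eigen values:            ", np.round(eigvals, 3))
print("Sum of squared loadings: ", np.round(column_ss, 3))
```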

Explained variance

 * 1) A good factor solution is one that explains the most variance with the fewest factors
 * 2) Realistically, be happy with 50-75% of the variance explained

How many factors?
A subjective process ... Seek to explain maximum variance using fewest factors, considering:


 * 1) Theory: what is predicted/expected?
 * 2) Eigen Values > 1? (Kaiser's criterion)
 * 3) Scree Plot: where does it drop off?
 * 4) Interpretability of last factor?
 * 5) Try several different solutions?
 * 6) Factors must be able to be meaningfully interpreted & make theoretical sense?


 * 1) Aim for 50-75% of variance explained with 1/4 to 1/3 as many factors as variables/items.
 * 2) Stop extracting factors when they no longer represent useful/meaningful clusters of variables
 * 3) Keep checking/clarifying the meaning of each factor and its items.
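
The numerical parts of this decision (Kaiser's criterion and the percentage of variance explained) are easy to tabulate. Below is a minimal sketch using a made-up six-item correlation matrix with two clusters of three items; interpretability and theory still need to be judged by the researcher.

```python
import numpy as np


def summarise_eigenvalues(R):
    """Eigen values (descending), % of variance and cumulative % of variance."""
    eigvals = np.sort(np.linalg.eigvalsh(np.asarray(R, dtype=float)))[::-1]
    pct = 100 * eigvals / eigvals.sum()
    return eigvals, pct, np.cumsum(pct)


# Hypothetical data: items 1-3 correlate .6, items 4-6 correlate .5,
# and items from different clusters correlate .1
R = np.full((6, 6), 0.1)
R[:3, :3] = 0.6
R[3:, 3:] = 0.5
np.fill_diagonal(R, 1.0)

eigvals, pct, cum_pct = summarise_eigenvalues(R)
n_kaiser = int((eigvals > 1).sum())          # Kaiser's criterion: EV > 1

for i, (ev, p, c) in enumerate(zip(eigvals, pct, cum_pct), start=1):
    print(f"Factor {i}: EV = {ev:.2f}, variance = {p:.1f}%, cumulative = {c:.1f}%")
print("Factors with EV > 1:", n_kaiser)
```

Here two factors have eigen values over 1 and together explain about 70% of the variance, using one-third as many factors as items.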

Scree plot

 * 1) A plot of the Eigen Values in descending order
 * 2) Depicts the amount of variance explained by each factor.
 * 3) Look for point where additional factors fail to add appreciably to the cumulative explained variance.
 * 4) 1st factor explains the most variance
 * 5) Last factor explains the least amount of variance
 * 6) Factor loadings (FLs) indicate the relative importance of each item to each factor.
 * 7) In the initial solution, each factor tries “selfishly” to grab maximum unexplained variance.
 * 8) All variables will tend to load strongly on the 1st factor
 * 9) Factors are made up of linear combinations of the variables (max. poss. sum of squared rs for each variable)

Initial solution - Unrotated factor structure
1st factor extracted:
 * 1) best possible line of best fit through the original variables
 * 2) seeks to explain maximum overall variance
 * 3) a single summary of the main variance in set of items

Second factor extracted:
 * 1) orthogonal to first factor - seeks to maximise its own eigen value (i.e., tries to gobble up as much of the remaining unexplained variance as possible)
 * 2) Each subsequent factor tries to maximise the amount of unexplained variance which it can explain.

 * 1) Vectors = lines of best fit
 * 2) Seldom see a simple unrotated factor structure
 * 3) Many variables will load on 2 or more factors
 * 4) Some variables may not load highly on any factors
 * 5) Until the FLs are rotated, they are difficult to interpret.
 * 6) Rotation of the FL matrix helps to find a more interpretable factor structure.

Two basic types of rotation

 * 1) Orthogonal (Varimax): Minimises factor covariation, produces factors which are uncorrelated
 * 2) Oblique (Oblimin): allows factors to covary, allows correlations between factors.

Why rotate a factor loading matrix?

 * 1) After rotation, the vectors (lines of best fit) are rearranged to optimally go through clusters of shared variance
 * 2) Then the FLs and the factor they represent can be more readily interpreted
 * 3) A rotated factor structure is simpler & more easily interpretable:
 * each variable loads strongly on only one factor
 * each factor shows at least 3 strong loadings
 * all loadings are either strong or weak, with no intermediate loadings
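
To show what rotation does, the sketch below implements raw varimax (no row normalisation) using the classic pairwise planar rotation scheme. It then applies the rotation to a made-up loading matrix that was created by spinning a perfectly simple structure by 30 degrees. The function name and data are illustrative assumptions, and statistical packages may use different algorithms and normalisation options.

```python
import numpy as np


def varimax(loadings, max_sweeps=50, tol=1e-8):
    """Raw varimax via pairwise planar rotations; returns rotated loadings."""
    L = np.array(loadings, dtype=float)
    n_items, n_factors = L.shape
    for _ in range(max_sweeps):
        largest_angle = 0.0
        for j in range(n_factors - 1):
            for k in range(j + 1, n_factors):
                x, y = L[:, j].copy(), L[:, k].copy()
                u, v = x ** 2 - y ** 2, 2 * x * y
                num = 2 * (u @ v) - 2 * u.sum() * v.sum() / n_items
                den = (u @ u - v @ v) - (u.sum() ** 2 - v.sum() ** 2) / n_items
                angle = 0.25 * np.arctan2(num, den)   # angle maximising varimax
                L[:, j] = x * np.cos(angle) + y * np.sin(angle)
                L[:, k] = -x * np.sin(angle) + y * np.cos(angle)
                largest_angle = max(largest_angle, abs(angle))
        if largest_angle < tol:                       # no pair needs rotating
            break
    return L


# Hypothetical simple structure, then spun by 30 degrees to mimic an
# unrotated solution in which every item loads on both factors
simple = np.array([[0.80, 0.0], [0.70, 0.0], [0.60, 0.0],
                   [0.0, 0.75], [0.0, 0.65], [0.0, 0.55]])
theta = np.deg2rad(30)
spin = np.array([[np.cos(theta), -np.sin(theta)],
                 [np.sin(theta), np.cos(theta)]])
unrotated = simple @ spin

rotated = varimax(unrotated)
print(np.round(unrotated, 2))
print(np.round(rotated, 2))
```

After rotation each item loads on only one factor, while each item's communality (row sum of squared loadings) is unchanged, because an orthogonal rotation only redistributes variance between factors.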

Orthogonal versus oblique rotations

 * 1) Think about purpose of factor analysis
 * 2) Try both
 * 3) Consider interpretability
 * 4) Look at correlations between factors in the oblique solution - if > .32 then go with oblique rotation (r = .32 gives r² ≈ .10, i.e., > 10% shared variance between factors)

Factor structure
Factor structure is most interpretable when:
 * 1) Each variable loads strongly on only one factor
 * 2) Each factor has three or more strong loadings
 * 3) Most factor loadings are either high or low with few of intermediate value

(Loadings of > |.40| are generally OK)

How do I eliminate items?
Eliminating items from an EFA is a subjective process, but consider:
 * 1) Communalities (each ideally > .5)
 * 2) Size of main loading (bare min > |.4|, preferably > |.5|, ideally > |.6|)
 * 3) Meaning of item (face validity)
 * 4) Contribution it makes to the factor (i.e., is a better measure of the latent factor achieved by including or not including this item?)
 * 5) Number of items already in the factor (i.e., if there are already many items (e.g., > 6) in the factor, then the researcher can be more selective about which ones to include and which ones to drop)
 * 6) Eliminate 1 variable at a time, then re-run, before deciding which/if any items to eliminate next
 * 7) Size of cross-loadings (max ~ |.3|)
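
The numerical criteria in this list can be screened automatically, leaving the judgement calls (meaning of the item, contribution to the factor) to the researcher. The following is a minimal sketch with made-up loadings; the function name, default thresholds and wording of the flags are assumptions that simply mirror the rules of thumb above.

```python
def screen_items(loadings, min_main=0.4, max_cross=0.3, min_communality=0.5):
    """Flag items that miss the rules of thumb; the decision stays subjective."""
    report = {}
    for item, row in loadings.items():
        sizes = sorted((abs(value) for value in row), reverse=True)
        main = sizes[0]
        cross = sizes[1] if len(sizes) > 1 else 0.0
        communality = sum(value ** 2 for value in row)
        problems = []
        if main < min_main:
            problems.append("weak main loading")
        if cross > max_cross:
            problems.append("cross-loading")
        if communality < min_communality:
            problems.append("low communality")
        report[item] = problems
    return report


# Hypothetical rotated loadings on two factors
loadings = {
    "item1": [0.80, 0.10],
    "item2": [0.55, 0.45],
    "item3": [0.15, 0.75],
    "item4": [0.30, 0.25],
}

report = screen_items(loadings)
for item, problems in report.items():
    print(item, "->", ", ".join(problems) if problems else "OK")
```

Following the advice above, only one flagged item would be dropped at a time before re-running the analysis.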

How many items per factor?

 * 1) Bare min. = 2
 * 2) Recommended min. = 3
 * 3) Max. = unlimited
 * 4) More items -> greater reliability and more 'roundedness', but subject to the law of diminishing returns
 * 5) Typically, 4 to 10 is reasonable

Interpretability

 * 1) The researcher must be able to understand and interpret a factor if it is going to be extracted.
 * 2) Be guided by theory and common sense in selecting factor structure.
 * 3) However, watch out for 'seeing what you want to see' when factor analysis evidence might suggest a different solution.
 * 4) There may be more than one good solution! e.g.,
 * 2 factor model of personality
 * 5 factor model of personality
 * 16 factor model of personality

Factor loadings & item selection
A factor structure is most interpretable when:
 * 1) Each variable loads strongly on only one factor (strong is > |.40|).
 * 2) Each factor shows 3 or more strong loadings, more = greater reliability.
 * 3) Most loadings are either high or low, few intermediate values.
 * 4) These elements give a 'simple' factor structure.

Factor loading guidelines (Comrey & Lee, 1992)
Loadings:
 * 1) > .70 - excellent
 * 2) > .63 - very good
 * 3) > .55 - good
 * 4) > .45 - fair
 * 5) > .32 - poor
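
These cut-offs can be applied as a simple lookup. In this minimal sketch the function name is illustrative, and using the absolute value (so that negative loadings are judged by their size) is an assumption.

```python
def describe_loading(loading):
    """Label a factor loading using the Comrey & Lee (1992) guidelines."""
    guidelines = [(0.70, "excellent"), (0.63, "very good"), (0.55, "good"),
                  (0.45, "fair"), (0.32, "poor")]
    size = abs(loading)   # the sign shows direction, not strength
    for cutoff, label in guidelines:
        if size > cutoff:
            return label
    return "below the lowest guideline"


for value in [0.75, -0.60, 0.50, 0.40, 0.20]:
    print(f"{value:+.2f}: {describe_loading(value)}")
```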

Example - Condom use

 * 1) The Condom Use Self-Efficacy Scale (CUSES) was administered to 447 multicultural college students.
 * 2) PC FA with a varimax rotation.
 * 3) Three distinct factors were extracted:
 * 'Appropriation'
 * 'Sexually Transmitted Diseases'
 * 'Partners' Disapproval'
 * 4) Barkley, T. W. Jr., & Burns, J. L. (2000). Factor analysis of the Condom Use Self-Efficacy Scale among multicultural college students. Health Education Research, 15(4), 485-489.
 * 5) Condom Use Self-Efficacy Scale (CUSES)

Summary

 * 1) Factor analysis is a family of multivariate correlational data analysis methods for summarising clusters of covariance.
 * 2) FA analyses and summarises correlations amongst items
 * 3) These common clusters (the factors) can be used as summary indicators of the underlying construct

Assumptions

 * 1) 5+ cases per variable (ideal is 20 per variable)
 * 2) N > 200
 * 3) Outliers
 * 4) Factorability of correlation matrix
 * 5) Normality enhances the solution

Steps

 * 1) Communalities
 * 2) Eigen Values & % variance
 * 3) Scree Plot
 * 4) Number of factors extracted
 * 5) Rotated factor loadings
 * 6) Theoretical underpinning

Type of FA

 * 1) PC vs. PAF:
 * PC for data reduction, e.g., computing composite scores for another analysis (uses all variance)
 * PAF for theoretical data exploration (uses shared variance)
 * 2) Choose technique depending on the goal of your analysis.

Rotation

 * 1) Orthogonal: perpendicular vectors
 * 2) Oblique: angled vectors
 * 3) Try both ways: are the solutions different?

Factor analysis in practice
To find a good solution, most researchers try out each combination of:
 * 1) PC-varimax
 * 2) PC-oblimin
 * 3) PAF-varimax
 * 4) PAF-oblimin

The above methods would then be commonly tried out on a range of possible/likely factors, e.g., for 2, 3, 4, 5, 6, and 7 factors.

 * 1) Try different numbers of factors
 * 2) Try orthogonal & oblique solutions
 * 3) Try eliminating poor items
 * 4) Conduct reliability analysis
 * 5) Check factor structure across sub-groups if sufficient data
 * 6) You will probably come up with a different solution from someone else!

No. of factors to extract?
 * 1) Inspect EVs - look for > 1
 * 2) % of variance explained
 * 3) Inspect scree plot
 * 4) Communalities
 * 5) Interpretability
 * 6) Theoretical reason
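
The combinations described in this section (extraction method by rotation method by number of factors) add up quickly, so it can help to list the planned runs before starting. This is a minimal sketch; the dictionary keys are illustrative, and each run would still need to be carried out and inspected in your statistics package.

```python
from itertools import product

extractions = ["PC", "PAF"]
rotations = ["varimax", "oblimin"]
factor_counts = range(2, 8)          # 2, 3, 4, 5, 6 and 7 factors

# One entry per analysis to run and compare
runs = [
    {"extraction": extraction, "rotation": rotation, "n_factors": n_factors}
    for extraction, rotation, n_factors in product(extractions, rotations, factor_counts)
]

print(len(runs), "candidate solutions to compare")
for run in runs[:3]:
    print(run)
```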