1. Introduction
The emergence and development of the SEM (Structural Equation Modeling, a second-generation multivariate analysis), has benefited a wide range of areas including social science, where it has been used to develop, confirm, and empirically test theories (Xiong et al., 2015).
There are two types of SEM: CB-SEM (covariance-based SEM) and VB-PLS (variance-based partial least squares) (Hair et al., 2012b). The two methods share the same roots; however, the PLS method offers advantages that allow it to serve as an alternative to covariance-based SEM (Hair et al., 2012b).
PLS-SEM (Partial Least Squares Structural Equation Modeling), first introduced by Herman Wold in 1975, was known as NIPALS (nonlinear iterative partial least squares) in its early years. It was further developed by Lohmöller in 1989 and has since been popularized as a core tool for multivariate analysis in several disciplines (Hair et al., 2012b; Ringle et al., 2012b; Wold, 1973; Wold, 1975).
PLS path modeling, also known as PLS-SEM, is a soft modeling method and a component-based (or variance-based) modeling technique (Ferrer et al., 2012; Xiong et al., 2015). A path analysis model is a special form of structural equation model that analyzes structural models with only observable variables (Xiong et al., 2015). The difference between this method, which is used in theoretical exploration, and the covariance-based technique is that PLS-SEM imposes no strict assumptions on the model, although its parameter estimates can be biased (Xiong et al., 2015). For instance, estimation remains possible for a complicated model with many parameters or heavily right-skewed variables (Ferrer et al., 2012). Moreover, since the method is not very sensitive to small sample sizes (samples are also called subjects or observations) or to distributional assumptions, the model can be estimated with non-normal data (Ferrer et al., 2012; Xiong et al., 2015). Because of these characteristics, PLS-SEM, like CB-SEM, is extensively used by researchers to establish cause-and-effect relationships between variables. According to Ringle et al. (2012a), who investigated 65 papers published in MIS Quarterly, the PLS-SEM technique is selected for various reasons: the sample size is small (24), the collected data are non-normal (22), formative indicators are used (20), the purpose is prediction (10), the complexity of the model is high (9), the research is exploratory (7), or the purpose is theoretical development (6).
SEM is basically made up of two main components: the structural model and the measurement model. In the measurement model, a latent variable is composed of several observable variables, each of which contains measurement error of its own. The structural model, in turn, represents the relationships among all the latent variables (Xiong et al., 2015). With SEM, the two models can be tested simultaneously: SEM evaluates latent variables at the level of observable variables while, at the same time, measuring relations between latent variables on the theoretical level (Hair et al., 2012). Concerning the order of analysis, the measurement model is analyzed before the structural model. Although SEM is widely known for giving reasonable results in the analysis of measurement and structural models, it has also been criticized for misuse and for producing irrelevant results. When authors or reviewers lack experience with SEM, the authenticity of the SEM results may be questioned (Xiong et al., 2015) for the following reasons.
First, many researchers tend to rely on the features of the statistical tool they use instead of conducting a factor analysis based on a theoretical foundation (Yim, 2015). When the analysis method is learned through software, the analysis is often performed using only the default settings. Too often, researchers fail to consider why or when a technique should be used, in an attempt to derive and interpret results as quickly and easily as possible. For instance, principal component analysis (hereafter PCA), the default setting of SPSS, has been used for factor analysis in many cases. This is frequently observed in prior studies, as in Cho (2016), Lee (2018), Phuong and Dat (2017), and Phuong et al. (2018). Granted, this is not to say that PCA cannot be used for factor analysis; basic conditions need to be satisfied to use PCA for this purpose. However, in too many cases, these conditions are not met or even mentioned in the first place. Second, as in general SEM, PLS-SEM is composed of a structural model and a measurement model. In the measurement model, the reliability and validity of observable and latent variables are analyzed. If an error occurs in the measurement model analysis, subsequent analysis of the structural model will naturally produce erroneous results. Thus, a precise analysis of the measurement model is a prerequisite for raising the accuracy of the structural model analysis. However, although the procedure of factor analysis is relatively well known compared with other analysis processes, its complexity and process flexibility generate a substantial amount of ambiguity in conducting research (Floyd & Widaman, 1995). This may cause a number of errors in the analysis of the measurement model.
This study proposes an exploratory factor analysis (hereafter EFA) method for a more robust PLS-SEM analysis. EFA is an important technique that has a substantial effect on overall results, especially since the measurement model analysis is the starting point of a PLS-SEM analysis. EFA is also used for the analysis of unidimensionality, reliability, and validity, and even in the process of checking for common method bias (CMB). An accurate EFA is therefore very important for a robust and accurate analysis, and it is necessary to discuss accurate EFA techniques for PLS-SEM.
Against this backdrop, this study presents a factor analysis process for PLS-SEM based on related prior studies rather than on statistical or methodological textbooks. In particular, we will find and suggest a common procedural denominator based on various arguments in the literature rather than on statistical tools. In many cases, outside a set of rules of thumb acknowledged and followed by many researchers, no absolute standards have been suggested for conducting an empirical analysis. Granted that there are no absolute criteria for factor analysis, this paper proposes empirical standards accepted by a substantial number of studies. At the same time, we provide results and implications reflecting the features of easy-to-use statistical tools to help researchers in the empirical analysis.
2. Literature Review
2.1. Structural Equation Modeling (SEM)
SEM is used to describe and verify the relationships between two kinds of variables: latent variables and manifest variables. While latent variables cannot be observed directly due to their abstract character, manifest variables (or observed variables) are easy to measure because they contain objective facts. Multiple observable variables reflect a latent variable (Xiong et al., 2015).
2.2. PLS-SEM (Partial Least Squares Structural Equation Modeling)
PLS-SEM maximizes the explained variance of endogenous latent variables by estimating partial model relationships with OLS (Ordinary Least Squares) regression. By contrast, CB-SEM estimates model parameters by minimizing the discrepancy between the estimated covariance matrix and the sample covariance matrix. If the conditions for selecting PLS-SEM are not satisfied, the likelihood of inaccurate results, interpretations, and conclusions increases (Hair et al., 2012).
2.3. EFA (Exploratory Factor Analysis)
Ever since Charles Spearman popularized it in the early 1900s, factor analysis has become one of the most widely used statistical techniques in psychological research (Treiblmaier & Filzmoser, 2010). The term “factor” (also called a latent variable, hypothetical construct, unobserved variable, or theory) denotes what helps one interpret the consistency of a dataset. Thus, factor analysis refers to an analytic technique that permits the reduction of a large number of interrelated variables to a small number of latent or hidden dimensions (Tinsley & Tinsley, 1987).
The first six decades of the roughly 100-year history of factor analysis were dedicated to the development of the EFA technique. The method has since made statistical progress, and EFA techniques have become popular. Despite this popularity, there is an ongoing debate over the fundamental nature and value of EFA. Some researchers consider factor analysis a method that assumes latent variables and highlights patterns of correlation. Others see it as a technique to reduce data complexity and provide an economical explanation of the correlated data (Haig, 2005).
The term, “exploratory,” is used because either prior research on the combination of sub-indicators is not available or the researcher does not have a solid theoretical foundation (Floyd & Widaman, 1995). Thus, EFA is used to achieve parsimony using the optimal and smallest number of exploratory concepts to explain the maximum common variance in the correlation matrix (Tinsley & Tinsley, 1987).
Table 1: The Primary Goals of Factor Analysis
2.4. Purposes of EFA
According to Conway and Huffcutt (2003), the objectives of EFA can be divided into “more consequential” and “less consequential” purposes. Following this distinction, the less consequential purposes fall into three types. First, EFA is used to verify the unidimensionality of well-constructed multi-item measurement tools. Second, it is used to find the dimensionality of newly developed measurement tools at the baseline evaluation stage (Conway & Huffcutt, 2003); in other words, it is used to search for relationships that exist within the set of observed variables measured through the questionnaire (Beavers et al., 2013). Third, it is used to assess the severity of SMB (same method bias) or CMB (common method bias) that can occur in self-report instruments before hypothesis testing (Conway & Huffcutt, 2003).
There are three main purposes of EFA. First, it is used for the development and validation of measurement tools (Conway & Huffcutt, 2003). EFA is frequently used to develop psychological instruments, to evaluate the validity of those instruments (such as construct validity), to test theories related to them, and to explore new potential constructs (Floyd & Widaman, 1995; Hayton et al., 2004; Tinsley & Tinsley, 1987). It is also used for the development and refinement of measurement items. Second, it is used for hypothesis testing: EFA can be used to develop hypothesized models before validating new models using confirmatory factor analysis (CFA). Third, it can be used when research conditions change. For instance, EFA can be performed when the number of factors (such as the number of personality factors) changes (Conway & Huffcutt, 2003).
2.5. Wrong Approach to EFA
Researchers frequently make mistakes in EFA-related choices. For instance, they may select PCA when a common factor analysis is in fact the better technique, lack clear reasons for the number of factors extracted (in general, “eigenvalue > 1” is applied), or use an orthogonal rotation instead of an oblique rotation without considering various aspects (Conway & Huffcutt, 2003).
Indeed, the inaccurate use of factor analysis methods is rampant across disciplines (Conway & Huffcutt, 2003; Park et al., 2002). After investigating a total of 119 papers in Human Communication Research (n=36), Communication Research (n=53), and Communication Monographs (n=30), Park et al. (2002) concluded that over 70% of studies followed flawed factor analysis procedures. Around 30% of the total did not even report the factor analysis procedure. Moreover, inaccurate choices can be observed throughout the factor analysis process, including the selection of a factor extraction model (such as PCA or common factor analysis (EFA)), the criterion for determining how many factors to retain (such as eigenvalue > 1), and the rotation method (oblique or orthogonal rotation). According to the results of a study on published empirical research, about 80% of studies still employ orthogonal rotations even though a growing body of literature shows that oblique rotations fare better than orthogonal rotations in EFA (Conway & Huffcutt, 2003).
3. Factor Analysis Procedure for PLS-SEM
The quality and capability of EFA depend on decisions regarding the number of samples, the number of observable variables, the level of communality, the level of model error contributing to the discovery of factors (the lack of model fit in a population, or the degree to which a model accurately represents the relationships between variables in a population, such as the RM(S)R, Root Mean Squared Residual, which is independent of sampling error), the factor extraction method, the number of factors retained, and the method used to rotate factors (Conway & Huffcutt, 2003; Preacher & MacCallum, 2002).
3.1. Data Adequacy
3.1.1. Qualitative Conditions: Adequacy of sampling quality
Bartlett test of sphericity: interpretable factors can be extracted only if the data matrix contains meaningful information. One method to assess this quality is the chi-square test of the correlation matrix suggested by Bartlett (Tinsley & Tinsley, 1987). It is represented as:
\(\chi^{2}=-\left[n-1-\frac{2 p+5}{6}\right] \log _{e}|R|, \text { where } d f=\frac{p(p-1)}{2}\) (1)
where n is the number of samples, p is the number of variables, and |R| is the determinant of the correlation matrix. The Bartlett test measures the amount of information in the correlation matrix. Specifically, the resulting p-value represents the probability of error in rejecting the null hypothesis that the correlation matrix does not deviate from an identity matrix. A statistically significant chi-square value (p≤0.05) is the minimum requirement for conducting factor analysis (Tinsley & Tinsley, 1987).
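As a concrete illustration, Eq. (1) can be computed directly from a data matrix. The following sketch uses simulated data; the function and variable names are ours, not from the cited sources.

```python
import numpy as np
from scipy.stats import chi2 as chi2_dist

def bartlett_sphericity(X):
    """Bartlett's test of sphericity for an (n x p) data matrix X,
    following Eq. (1): the statistic tests whether the correlation
    matrix R deviates from an identity matrix."""
    n, p = X.shape
    R = np.corrcoef(X, rowvar=False)
    chi2 = -(n - 1 - (2 * p + 5) / 6.0) * np.log(np.linalg.det(R))
    df = p * (p - 1) / 2
    p_value = chi2_dist.sf(chi2, df)
    return chi2, df, p_value

# hypothetical data: four indicators driven by one latent factor
rng = np.random.default_rng(1)
f = rng.normal(size=(200, 1))
X = f @ np.array([[0.8, 0.7, 0.6, 0.5]]) + 0.5 * rng.normal(size=(200, 4))
chi2, df, p = bartlett_sphericity(X)  # p <= 0.05 supports factorability
```

With correlated indicators such as these, the p-value is far below 0.05, so the minimum requirement for factor analysis is met.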
Kaiser-Meyer-Olkin Measure of Sampling Adequacy (KMO MSA): this is another measure of the appropriateness of the data for factor analysis and of the intercorrelation between variables. Its value ranges between 0 and 1, and as the KMO value draws closer to one, it is more likely that each variable can be predicted by the other variables without error. In general, an MSA value of 0.6 or higher is considered mediocre, 0.7 or higher middling, and 0.8 or higher meritorious. A limitation of this measure, however, is that the MSA value increases as the sample size increases, as the average correlations increase, as the number of variables increases, or as the number of factors decreases (Hair et al., 2009).
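The overall KMO statistic can be sketched with numpy as the ratio of summed squared correlations to summed squared correlations plus summed squared partial correlations (the anti-image approach); the function name and simulated data are ours.

```python
import numpy as np

def kmo_msa(X):
    """Overall KMO measure of sampling adequacy: sum of squared
    off-diagonal correlations divided by that sum plus the sum of
    squared off-diagonal partial correlations."""
    R = np.corrcoef(X, rowvar=False)
    Rinv = np.linalg.inv(R)
    # partial correlation of i and j, controlling for all other variables
    d = np.sqrt(np.outer(np.diag(Rinv), np.diag(Rinv)))
    P = -Rinv / d
    np.fill_diagonal(R, 0.0)
    np.fill_diagonal(P, 0.0)
    return (R ** 2).sum() / ((R ** 2).sum() + (P ** 2).sum())

# hypothetical single-factor data
rng = np.random.default_rng(2)
f = rng.normal(size=(300, 1))
X = f @ np.array([[0.8, 0.7, 0.7, 0.6]]) + 0.5 * rng.normal(size=(300, 4))
msa = kmo_msa(X)  # 0.6+ mediocre, 0.7+ middling, 0.8+ meritorious
```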
3.1.2. Quantitative Conditions: Adequacy of sample size
There are general criteria to follow to derive an appropriate factor structure through factor analysis. One of these criteria is the sample size. Since the number of samples has a large influence on statistical power, it should be treated with great care (Preacher & MacCallum, 2002). However, deciding on an adequate sample size is a complicated matter (Conway & Huffcutt, 2003). Mundfrom et al. (2005) suggested four factors influencing the minimum number of samples: the number of observable variables, the number of factors, the number of observable variables per factor, and the level of communality.
Needless to say, the larger the sample, the better the statistical power. However, this fact raises two problems. First, there is no consensus over how large “large” is (Hogarty et al., 2005); the answer varies depending on the researcher. Second, when the analysis unit is a corporation, it is difficult to collect samples, and time and cost constraints make it even harder in many cases. Therefore, it is not always realistic to have a large sample.
Thus, many studies propose criteria focused on a minimum number of samples rather than a maximum. Within this context, this section examines the criteria for a minimum sample size. These criteria can generally be divided into two types: absolute and relative sample size criteria. An absolute sample size criterion specifies the number of samples required without considering the number of variables. A relative sample size criterion specifies the number of samples required in proportion to the number of variables.
Absolute Sample Size Criteria: no single quantitative criterion is widely agreed upon among scholars; each proposes somewhat different criteria. Gorsuch (1983) and Kline (1994) argue that a minimum of 100 samples is required for factor analysis. Preacher and MacCallum (2002) claimed that the minimum sample size for factor analysis ranges from 100 to 250. Conway and Huffcutt (2003) argued that more than 400 samples are needed to avoid distorted results. In a comprehensive review, Tinsley and Tinsley (1987) considered 100 as “poor,” 200 as “fair,” 300 as “good,” 500 as “very good,” and 1,000 as “excellent.” Similarly, Comrey and Lee (1992) provided the following scale of sample size adequacy: 50 – very poor, 100 – poor, 200 – fair, 300 – good, 500 – very good, and 1,000 or more – excellent.
Relative Sample Size Criteria: Kline (1994) argued that relative criteria are more appropriate than absolute criteria for determining the minimum sample size. However, whether absolute or relative, the criteria vary depending on the researcher. Conway and Huffcutt (2003) suggested a 5:1 ratio at minimum for empirical research. Cattell (1978) considered 3:1 to 6:1 adequate for each observable variable. For Everitt (1975), the minimum ratio was 10:1. Tinsley and Tinsley (1987) integrated both viewpoints and suggested 5:1 (5 subjects per variable) or 10:1 (10 subjects per variable) as minimum relative criteria; that is, a minimum of 5 to 10 samples is needed for each variable. Hair et al. (2009) proposed a 20:1 criterion for factor analysis. Arrindell and van der Ende (1985) saw problems with these criteria; nevertheless, relative criteria can be considered more reasonable, since there is no agreed-upon absolute recommendation among researchers.
Combined Sample Size Criteria: as the sample size increases, the random errors of measures (also called (questionnaire) items, scales, indicators (of the latent variable), observed variables, or manifest variables) decrease and the test parameters of the variables stabilize. Therefore, once the sample size reaches a certain stabilization level, adding a variable makes little difference. According to some scholars, the number of samples needed to reach stabilization is 300; once the sample size reaches this point, the relative criterion (the number of samples per variable) is no longer important (Tinsley & Tinsley, 1987). Based on this, Tinsley and Tinsley (1987) concluded that the number of samples per variable should be five to ten at minimum and that the total should exceed 300 to find a stable factor structure. However, Hogarty et al. (2005) pointed out that the minimum number of samples required to detect population factors and achieve appropriate stability, or the number of samples per observable variable (the relative criterion), may vary by study. In other words, a relatively small number of samples may be appropriate in some situations, while a fairly large sample may not be adequate in others. They found that an appropriate sample for common factor analysis is greatly influenced by the level of communality of the (observable) variables and the level of overdetermination of the (latent) factors. Communality refers to the proportion of the variance of an (observable) variable explained by the common factors. Overdetermination is the degree to which a factor is clearly expressed by a sufficient number of (observable) variables (Hogarty et al., 2005).
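The combined rule of thumb above can be expressed as a simple check. This is only a sketch of the cited conventions, not an absolute standard, and the function name and default thresholds are illustrative.

```python
def sample_size_adequate(n_samples, n_variables, per_variable=5, stabilization=300):
    """Rule-of-thumb check combining the relative criterion (at least
    5-10 samples per variable) with the stabilization point (~300),
    beyond which the relative criterion matters little (Tinsley &
    Tinsley, 1987). Thresholds are cited conventions, not standards."""
    if n_samples >= stabilization:
        return True  # past stabilization, ratio no longer decisive
    return n_samples >= per_variable * n_variables

# e.g., 150 samples for 20 variables meets the 5:1 relative criterion,
# while 60 samples for 20 variables (3:1, below 300) does not
```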
That is, an overdetermined factor has at least three or four observable variables with high factor loadings and a good simple structure, while weakly determined factors consist of observable variables with relatively low factor loadings, indicating an unclear factor structure. Hogarty et al. (2005) empirically found that communality is closely related to the discovery of better factors. According to their findings, the sample size has little effect on the quality of the factor solution when communality is high; however, its effect increases when communality is low. In addition, overdetermined factors have been shown to improve the quality of the factor solution, especially when the number of factors is relatively small (Hogarty et al., 2005).
When the number of samples is small: Preacher and MacCallum (2002) argued that even if the number of samples is small (for instance, 30-50), appropriate factors can be found if there are no problems with the other conditions, namely high communalities, a small number of factors to be found, and a low level of model error. If these conditions are satisfied, reviewers should not worry too much about the number of samples. However, problems can arise from a small sample size, such as extracting far more factors than predicted. An appropriate way to solve this is to reduce the number of potential factors rather than to add observable variables. However, if the number of factors is reduced rather than items added, the communality is likely to drop. In other words, having too few factors adversely affects communality, while too many factors are not easily interpretable. Therefore, the researcher should sequentially eliminate the factors with little explanatory power over the model, using appropriate factor retention techniques (Preacher & MacCallum, 2002). Mundfrom et al. (2005) also found that a small minimum sample size poses no problem as long as communality is high (h²=0.6-0.8) and the ratio of observable variables per factor is high; moreover, the minimum sample size stabilizes regardless of the number of factors and the level of communality when the number of variables per factor exceeds 6 (that is, 7 or more) (Table 2).
Table 2: Minimum Necessary Sample Sizes for Factor Analysis
* Basic Conditions: when the number of variables per factor exceeds 6 (or 7 or more)
* source: Mundfrom et al. (2005)
3.2. Factor-to-Variable Ratio
In general, the minimum number of measurement variables required per latent variable ranges from three to ten (Preacher & MacCallum, 2002). Fabrigar et al. (1999) claimed that there should be at least four observable variables (4:1) for each factor.
3.3. Factor Extraction Method
Factor extraction refers to a method of identifying the factors that characterize a set of variables. Various extraction methods have been developed to date; however, the most frequently used techniques are Maximum Likelihood (ML), Principal Axis Factoring (PAF), and Principal Component Analysis (PCA) (Kahn, 2006). PCA is the most popular technique for component analysis, while in common factor models, ML and PAF are used to provide communality estimates (Conway & Huffcutt, 2003).
ML (Maximum Likelihood): ML evaluates the likelihood that the correlation matrix was drawn from a population in which the obtained factor structure underlies the scores of the variables. The advantage of this technique is that it provides significance tests and confidence intervals during the analysis. The disadvantage is that it requires multivariate normal data. For this reason, some researchers prefer PAF or PCA over ML (Kahn, 2006).
PAF (Principal Axis Factoring): PAF analyzes the common variance between variables. Common variance (communality) is the variance that a variable shares with at least one other variable; variance not shared with other variables is not considered in PAF, since it is unrelated to the common factors. PAF uses the SMC (squared multiple correlation, R²) as an estimate of the variance to be analyzed; unique variance and error variance—the other two of the three variance components (common, unique, and error variance)—are not reflected. Thus, the factors derived from PAF explain the variance shared by more than one variable. Statistically, the SMCs are placed on the diagonal of the correlation matrix between variables (Kahn, 2006).
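The SMC values that PAF places on the diagonal can be computed directly from the inverse of the correlation matrix. The following is a minimal numpy sketch on simulated data; the names are ours.

```python
import numpy as np

def squared_multiple_correlations(X):
    """SMC for each variable: the R^2 from regressing it on all the
    other variables, obtained from the diagonal of the inverse
    correlation matrix. PAF uses these as initial communality
    estimates on the diagonal of R."""
    R = np.corrcoef(X, rowvar=False)
    return 1.0 - 1.0 / np.diag(np.linalg.inv(R))

# hypothetical single-factor data
rng = np.random.default_rng(3)
f = rng.normal(size=(400, 1))
X = f @ np.array([[0.8, 0.7, 0.6, 0.5]]) + 0.5 * rng.normal(size=(400, 4))
smc = squared_multiple_correlations(X)  # one value per variable, in [0, 1)
```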
PCA (Principal Component Analysis): unlike PAF, PCA analyzes the total variance (including both the common variance and the unique variance) of the variables. Therefore, statistically, the diagonal values of the correlation matrix between variables are “1.00” (unities), reflecting all variance. Theoretically, the purpose of factor analysis is to identify common factors; in this sense, PCA is not a true factor analysis. Instead, the purpose of PCA is to find linear combinations (that is, principal components) that retain as much information as possible about the observable variables (Kahn, 2006).
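The contrast between the two extraction logics can be seen side by side with scikit-learn. Note that sklearn's FactorAnalysis estimates the common factor model by maximum likelihood (not PAF), and the simulated data are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA, FactorAnalysis

# hypothetical data: one latent factor driving four indicators
rng = np.random.default_rng(0)
factor = rng.normal(size=(500, 1))
X = factor @ np.array([[0.8, 0.7, 0.6, 0.5]]) + 0.5 * rng.normal(size=(500, 4))

# PCA models total variance (diagonal of R treated as unities) ...
pca = PCA(n_components=1).fit(X)
# ... while a common factor model targets only the shared variance
fa = FactorAnalysis(n_components=1).fit(X)

print(pca.explained_variance_ratio_[0])  # share of total variance captured
print(fa.components_[0])                 # estimated factor loadings
```

On data like these, the first principal component absorbs unique variance as well as common variance, which is why component loadings tend to run larger than common factor loadings.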
Although there are various model extraction methods, they are mostly classified into two types: the common factor model and the principal component model (Conway & Huffcutt, 2003).
3.3.1. Similarities between Component Analysis and Factor Analysis
Among the various techniques for factor analysis, the most commonly observed technique used in information systems research is the PCA (Gefen & Straub, 2005).
Some researchers argue that component analysis and common factor analysis actually yield the same results (Conway & Huffcutt, 2003; Snook & Gorsuch, 1989). The difference between the techniques is not significant when the two approaches derive convergent solutions. The only difference between PCA and common factor analysis (EFA), they claim, is that whatever rotation technique is used, the factor loadings derived through principal component analysis are relatively larger than those of common factor analysis, because PCA includes the total variance (common variance + unique variance + error variance) (Snook & Gorsuch, 1989). However, such equivalence can conceptually occur only when the variable error is close to zero (Park et al., 2002).
According to Snook and Gorsuch's (1989) empirical analysis of the accuracy of the two procedures in terms of bias and variability, PCA may produce a considerably different result from the population pattern when the factor loadings and the number of variables are small (Snook & Gorsuch, 1989). Park et al. (2002) argued that PCA and EFA are likely to produce different results when the number of observable variables per factor is small (such as three observable variables per factor) and when the communality is low (such as 0.4 or below). Park et al. (2002) also argued that, in many studies, PCA has been used incorrectly to find the latent variables underlying observable variables (that is, as if it were EFA). They added that PCA should instead be used to reduce the observable variables to a small set of components that retain as much of the total variance of the observable variables as possible (Park et al., 2002).
3.3.2. Differences between PCA and EFA
The biggest difference between PCA and EFA lies in their purpose, which leads to a difference in how they conceptualize the source of variance for observed variables. EFA assumes that observable variables reflect factors imperfectly; thus, it divides the variance into common variance (variance due to common factors—factors influencing more than one variable) and unique variance (variance due to unique factors—factors influencing only one variable). PCA, by contrast, includes both common variance and unique variance in the components without distinguishing between the two (Conway & Huffcutt, 2003). Common variance is variance that one variable shares with other variables (Park et al., 2002). Unique variance is variance specific to each variable (Park et al., 2002). All factor loadings obtained through factor analysis involve all three variances (Tinsley & Tinsley, 1987). The three types of variance are as follows:
Total Variance = Common Variance + Unique(Specific) Variance + Random(Error) Variance (2)
First, EFA estimates errors, whereas PCA does not reflect errors and assumes that the observable variables contain no error. In other words, while PCA reflects the total variance in the analysis, EFA reflects only the common (shared) variance by distinguishing the variance components (Park et al., 2002). In PCA, the diagonal of the correlation matrix is set to 1.0 (unities) (Snook & Gorsuch, 1989; Tinsley & Tinsley, 1987), meaning that 100% of the variance—that is, all variance—is included. Therefore, PCA technically transforms the data into a set of orthogonal variables (Tinsley & Tinsley, 1987). In common factor analysis, by contrast, the diagonal values contain the communalities (Snook & Gorsuch, 1989). Second, the observable variables in EFA are functions of the factors, whereas the components in PCA are functions of the observable variables. Third, EFA attempts to qualitatively and quantitatively explain the correlations between variables, while PCA attempts to explain the variation of the variables by holding as much information as possible about the original variables (Park et al., 2002).
Therefore, if researchers want to reduce income, education level, and debt level to a composite component of socioeconomic status, PCA suits the purpose. If, however, researchers want to test a latent variable of “verbal aggressiveness” through various observable variables, EFA is the appropriate method (Park et al., 2002).
3.3.3. Summary
Various extraction techniques have so far been introduced by various theories; however, many of them are still, and too often, used interchangeably for the same purpose in the field. Although PCA and PAF provide similar information in many cases, using the two techniques for the same purpose can cause problems. In addition, according to prior research, PAF produces more accurate results than PCA, especially when communalities are low. Therefore, if the purpose of the analysis is to determine the underlying factors behind the indicators, then PAF or ML should be used rather than PCA. Moreover, researchers need to take extra care because PCA is set as the default extraction technique in popular statistical software such as SPSS and SAS (Kahn, 2006).
3.4. Factor Rotation Method
Factor extraction provides only one aspect of the ideal factor solution; factor rotation clarifies the factor structure by distributing the variance more evenly across the factors (Tinsley & Tinsley, 1987). Fabrigar et al. (1999) argued that the factors should be rotated in a multidimensional space (n = number of factors) to derive the solution with the best simple structure (Park et al., 2002). Here, a simple structure is a solution in which each factor consists of a subset of measured variables with high factor loadings on their own factor and low loadings on the other factors (no cross-loadings) (Conway & Huffcutt, 2003; Fabrigar et al., 1999). A simple structure is psychologically meaningful, interpretable, and easier to replicate or generalize (Conway & Huffcutt, 2003; Fabrigar et al., 1999; Park et al., 2002; Tinsley & Tinsley, 1987).
3.4.1. Orthogonal Rotation (90°)
Orthogonal rotation (including the varimax, equimax, and quartimax methods) constrains the factors to be independent, or uncorrelated (Conway & Huffcutt, 2003; Fabrigar et al., 1999; Park et al., 2002; Tinsley & Tinsley, 1987). It thus assumes no correlation between factors, and this rotation method is appropriate if the correlation coefficients between factors are statistically low. By contrast, if the factors are correlated and this rotation is used, the factor loadings of the simple structure are highly likely to be distorted (Conway & Huffcutt, 2003). Supporters of this technique argue that orthogonal rotation makes it easier to derive simplicity, enhances conceptual clarity, and better represents psychological reality (Fabrigar et al., 1999; Tinsley & Tinsley, 1987). Park et al. (2002) retort that this argument is inappropriate and unrealistic; they claim that the solution derived from orthogonal rotation is no simpler than that of oblique rotation and, moreover, overlooks significant correlations between factors (Park et al., 2002). Among the orthogonal rotation techniques, varimax is the most common and, in general, an optimal choice (Conway & Huffcutt, 2003; Fabrigar et al., 1999). Varimax aims to maximize the variance of the squared loadings on a factor (Conway & Huffcutt, 2003). The varimax procedure is therefore appropriate if the underlying dimensions are assumed to be uncorrelated, or if no assumption about their correlation is made, since it does not reflect assumptions about the level of correlation between factors (Tinsley & Tinsley, 1987). However, it should be noted that varimax orthogonal rotation can lead to false results due to its non-correlation constraint between factors (Fabrigar et al., 1999).
The meaning of the factors revealed by the factor analysis depends entirely on the meaning of the variables; if meaningless variables are analyzed, meaningless factors will be derived. Thus, ideally, variables should form a logical set. If variables are not sufficiently correlated, it may be difficult to find common factors. Finally, variables that do not have sufficient reliability are unlikely to form meaningful factors (Kahn, 2006).
3.4.2. Oblique Rotation (less than 90°)
EFA is used to identify clusters of correlated variables based on their mutual dependence on underlying latent variables (Preacher & MacCallum, 2002). That is, EFA explores how many factors exist among sets of variables and the level of association between these factors (Kahn, 2006). Many scholars have argued that the oblique rotation technique, theoretically, yields better results than the orthogonal rotation technique (Tinsley & Tinsley, 1987). Conway and Huffcutt (2003), also, claimed that oblique rotation better represents reality than orthogonal rotation and creates a simpler structure. Factors derived from oblique rotation are correlated with each other (Tinsley & Tinsley, 1987). That is, oblique rotation (including promax, oblimin, and quartimin rotations) allows for correlations between factors (Conway & Huffcutt, 2003; Fabrigar et al., 1999; Park et al., 2002). According to Fabrigar et al. (1999), various latent constructs are used in psychological research, and these latent concepts are correlated with each other, theoretically and empirically. Therefore, they argue that oblique rotation can, more accurately and realistically, represent the association between one construct and another (Fabrigar et al., 1999). However, to use oblique rotation, complex exploratory hypotheses are required, along with an explanation of the latent dimension of each factor and of the correlations between the factors (Tinsley & Tinsley, 1987). Unlike orthogonal rotation, there is no single oblique rotation technique preferred by researchers; commonly used techniques are direct quartimin rotation, promax rotation, and Harris-Kaiser oblique rotation (Fabrigar et al., 1999).
3.4.3. Controversy
Previous studies have reached a consensus on the following conclusions. First, if there is a clear and simple factor structure, both techniques yield similar analysis results. Second, varimax derives the optimal solution in the orthogonal rotation while promax and the Harris-Kaiser criteria yield optimal results in oblique rotation. Third, whatever rotation scheme is chosen, the varimax-promax rotation procedure is the basis for structure determination (Tinsley & Tinsley, 1987).
In orthogonal rotation, one factor in the multidimensional space is oriented at 90° to other factors. However, oblique rotation is oriented at an angle of less than 90°. Therefore, orthogonal rotation increases the likelihood of producing a poor simple structure when clusters of variables form an angle of less than 90° (Fabrigar et al., 1999). Experiments by Fabrigar et al. (1999) show that direct oblimin produces fewer cross loadings than varimax rotation technique.
Oblique rotation provides more information than orthogonal rotation and produces a correlation estimate between common factors. Knowing the level of correlation between factors is useful for understanding the conceptual nature of common factors. For instance, the presence of significant correlations between factors indicates that higher order factors may exist. However, since orthogonal rotation does not provide inter-factor correlation, it is difficult to determine if one or more higher-order factors are present in the data (Fabrigar et al., 1999).
3.5. Determining the Number of Factors to Retain
Choosing a criterion for the number of factors to retain is one of the most important decisions for researchers (Conway & Huffcutt, 2003), because different criteria can lead a researcher to retain different numbers of factors. Hayton et al. (2004) emphasized that the factor retention decision is more important than the other stages of factor analysis. They argued that this decision has the greatest impact on the balance to be maintained between parsimony and an adequate representation of the underlying correlations, which is what allows EFA to produce optimal results. They also claimed that estimating too few or too many factors introduces substantial errors that affect the outcome of EFA (of the two misspecifications, estimating too few factors is ultimately the more serious, as it leads to the loss of important information). However, there has been no consensus on an appropriate standard (Hayton et al., 2004).
3.5.1. Kaiser's criterion or K1 rule by Kaiser (1958)
Kaiser's criterion is the most commonly used rule for determining how many factors to retain. It bases the decision solely on the factors' eigenvalues (the sum of the squared factor loadings) (Tinsley & Tinsley, 1987). A factor's eigenvalue measures how much of the variance in the set of observed variables the factor explains. An eigenvalue of 1 means that the factor explains as much variance as a single variable does (the variables are normalized to have a variance of 1) (Kahn, 2006).
\(\text { Eigenvalue }=\sum(\text { factor loadings })^{2}\) (3)
According to this standard, a factor is retained if its eigenvalue is 1 or higher (Tinsley & Tinsley, 1987). This means that the variance of the factor or component must be greater than the variance of one observable variable (Park et al., 2002). That is, a factor must explain more variance than a single variable can. Kahn (2006) argues that the K1 criterion is only suitable for PCA, because it compares eigenvalues against the total variance of one variable (equal to 1), whereas the PAF and ML techniques assume that the common variance of one variable is less than one (Kahn, 2006).
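In code, the K1 rule reduces to counting the eigenvalues of the correlation matrix that exceed 1; the following Python/NumPy sketch (the function name is ours) illustrates this:

```python
import numpy as np

def k1_retain(R):
    # Kaiser's K1 rule: count the eigenvalues of the correlation matrix R
    # that exceed 1, i.e. the factors explaining more variance than a
    # single standardized variable.
    eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]   # descending order
    return int((eigvals > 1.0).sum()), eigvals
```

For a hypothetical correlation matrix with two uncorrelated pairs of variables (r = 0.6 within each pair), the eigenvalues are 1.6, 1.6, 0.4, and 0.4, so the K1 rule retains two factors.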
However, Tinsley and Tinsley (1987) argued that this technique was mainly used for alpha factor analysis. Therefore, when using it with other factor extraction techniques, there is a possibility of underestimating the number of significant factors in the matrix. Underestimation is more problematic than overestimation. In the case of overextraction, additional factors can be eliminated as they become more difficult to analyze or less reliable with analysis. In the case of under-extraction, however, they, ultimately, can be a barrier to the development of a theory or a constraint on discovering new latent variables (Tinsley & Tinsley, 1987).
Meanwhile, Park et al. (2002) argue that this criterion tends not only to underestimate but also to overestimate the number of factors. Hayton et al. (2004) argue that the K1 criterion is inaccurate and often leads to over-factoring. Kaiser (1958) builds on the argument of Guttman (1954), who showed that the number of eigenvalues greater than 1 is a lower bound on the rank of the population correlation matrix. Kaiser (1958) asserts that the reliability of a component is always nonnegative when its eigenvalue is greater than 1. However, Guttman's (1954) claim is based on the population correlation matrix, whereas Kaiser applies it to finite samples, which introduces sampling error and leads to the overestimation of factors. That is, the K1 criterion is appropriate when the sample size approaches infinity, so that the sample correlation matrix approaches the population correlation matrix; in a finite sample, some factors may show eigenvalues larger than 1 purely because of sampling error. Another problem with this criterion is that the cutoff of 1 is somewhat arbitrary (Hayton et al., 2004): if we accept this standard, a factor with an eigenvalue of 1.01 is kept, while one with 0.99 is excluded. Fabrigar et al. (1999) noted that not only is there much evidence that K1 is inaccurate, but there is also no research showing that this criterion works properly.
As examined so far, previous studies have criticized the criterion for not consistently determining the exact number of factors; researchers do not recommend relying on this criterion. Some, even, considered it unsuitable, compared to other criteria (Conway & Huffcutt, 2003). Nonetheless, many researchers still, mistakenly, use this criterion (Hayton et al., 2004).
3.5.2. Cattell’s (1966) Scree Test
A better method than K1 for deciding on factor retention when performing common factor analysis (EFA) is to evaluate the plot of eigenvalues against factors, called the "scree plot" (Kahn, 2006). The basic assumption of this technique is that, once the true factors have been extracted, the matrix is residual; thus, the factors extracted successively thereafter represent only error variance. Because measurement error is random, the amount of variance extracted by these successive factors varies at random (Tinsley & Tinsley, 1987).
Scree tests show the differences between the factors by plotting the eigenvalues in descending order (Park et al., 2002; Tinsley & Tinsley, 1987). Thus, the difference between the eigenvalues appears as a curve (Tinsley & Tinsley, 1987). Factors located on the left side (cliff) of the break-point are considered as real factors and those located on the right side (scree) are regarded as error factors (Hayton et al., 2004; Park et al., 2002; Tinsley & Tinsley, 1987). This technique is known to be, particularly, accurate when the number of samples is large and the factor or component is strong (Park et al., 2002). According to Hayton et al. (2004), this criterion produces better results than the K1 standard.
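The break-point is ordinarily a visual judgement, but a rough numerical proxy (our heuristic illustration, not part of Cattell's test) is to place the break at the single largest drop between successive eigenvalues:

```python
import numpy as np

def scree_break(eigvals):
    # Heuristic stand-in for visual scree inspection: retain the factors
    # that precede the single largest drop between successive eigenvalues.
    # Like the visual test, it cannot handle multiple or ambiguous breaks.
    ev = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    drops = ev[:-1] - ev[1:]            # successive eigenvalue drops
    return int(np.argmax(drops)) + 1    # factors left of the break
```

For eigenvalues 3.2, 1.9, 0.5, 0.4, 0.35, 0.3 the largest drop lies between the second and third eigenvalues, so two factors sit on the "cliff" side of the break.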
However, the problems of this technique are as follows. First, the test is criticized for its dependence on researchers’ subjectivity (Hayton et al., 2004; Tinsley & Tinsley, 1987); that is, there is no absolute standard. Kahn (2006) argued that since a scree test does not over-extract factors the way K1 does, it is suitable for use with PAF and ML extraction, but when there is no clear scree line, the decision depends on the subjectivity of the researcher. Second, unlike the basic assumption of the scree test, there can be multiple breaks (Hayton et al., 2004; Park et al., 2002; Tinsley & Tinsley, 1987). Third, there are cases where no clear break-point exists (ambiguity) (Hayton et al., 2004; Park et al., 2002). This is particularly common when the sample size is small or the number of variables per factor is small; thus, this criterion has varying inter-rater reliability (Hayton et al., 2004). Hence, according to Kahn (2006), both K1 and Cattell’s scree test are suitable for determining the number of factors; however, the two techniques do not always produce consistent results.
3.5.3. PA (Parallel Analysis by Horn (1965))
Many researchers have been, constantly, curious about which of the various factor retention criteria is most accurate. Results from previous studies show that PA is highly accurate in comparison with K1 and ML. For example, while the K1 criterion is derived for the population, PA is sample-based and is, therefore, free of the sampling-error problem inherent in K1 (Hayton et al., 2004). Kahn (2006) also suggested that the PA technique is superior to the K1 standard or the scree test. Thus, many scholars consider PA to be the most accurate technique for determining the number of factors. PA generates eigenvalues from random data sets with the same number of variables and the same number of cases as the real data. The analyst plots these randomly generated eigenvalues along with the actual eigenvalues on the scree plot, and factors whose real eigenvalues are higher than the random eigenvalues are retained (Kahn, 2006).
PA assumes that the nontrivial components extracted from real data based on valid factor structures have higher eigenvalues than the eigenvalues of parallel components extracted from random data with the same sample number and variable number. Thus, PA has a structure of correlation matrices of several random variables based on the number of observable variables in the actual data set (Hayton et al., 2004).
The factor extraction method compares the mean eigenvalues of the random correlation matrices with the eigenvalues of the actual data correlation matrix. Thus, the first observed eigenvalue is compared with the first mean random eigenvalue, the second observed eigenvalue with the second, and so on. In this process, factors whose actual eigenvalues are greater than the parallel average random eigenvalues are retained, whereas a factor is attributed to sampling error and excluded when its actual eigenvalue is less than or equal to the parallel mean random eigenvalue (Hayton et al., 2004).
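The procedure described above can be sketched in a few lines of Python/NumPy (comparing against the mean random eigenvalue follows Horn's original proposal; some later variants use the 95th percentile instead, and the function name and defaults here are our choices):

```python
import numpy as np

def parallel_analysis(data, n_sims=100, seed=0):
    # Horn's parallel analysis: compare each observed eigenvalue of the
    # sample correlation matrix with the mean eigenvalue obtained from
    # random normal data of the same shape (n cases x p variables).
    rng = np.random.default_rng(seed)
    n, p = data.shape
    obs = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]
    rand = np.empty((n_sims, p))
    for i in range(n_sims):
        x = rng.standard_normal((n, p))
        rand[i] = np.sort(np.linalg.eigvalsh(np.corrcoef(x, rowvar=False)))[::-1]
    mean_rand = rand.mean(axis=0)
    # retain leading factors whose observed eigenvalue exceeds the random mean
    k = 0
    while k < p and obs[k] > mean_rand[k]:
        k += 1
    return k, obs, mean_rand
```

On simulated data with two strong factors, the first two observed eigenvalues clearly exceed their random counterparts while the remaining ones fall below them, so two factors are retained.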
Awareness of PA is low, and it is rarely selected by researchers (although it is one of the most accurate methods for determining the factors to retain) owing to the lack of support in software packages (it is not supported in SPSS or SAS), the lack of related education in college, and the lack of coverage in textbooks. The disadvantage of this criterion is a tendency to over-factor, as revealed in some studies, although its error rate is lower than that of other criteria (Hayton et al., 2004).
3.5.4. ML (Maximum Likelihood factor extraction method)
The ML factor extraction method is another factor retention decision technique. The advantage of ML is that it provides significance testing of the number of factors. Using this technique, researchers can compare models with varying numbers of factors and determine the number of final factors based on various model fit indexes such as root mean square error of approximation (RMSEA) and expected cross-validation index (ECVI) (Park et al., 2002).
However, this technique tends to over-factor. Moreover, it is overly sensitive to the number of samples, and the accuracy tends to decrease as the number of samples increases (Hayton et al., 2004).
3.5.5. PVA-FS (the percentage of variance accounted for by the factor solution)
The most intuitive way to determine the number of factors is to use the proportion of the variance among the variables explained by each factor (Kahn, 2006). PVA-FS is a way to determine the number of factors that provides a high percentage of explained variance or an optimal solution that is easiest to interpret (Conway & Huffcutt, 2003). That is, factors that explain a sufficient level of variance are retained, and factors explaining a low level of variance are discarded (Kahn, 2006). Hair et al. (2009) argued that the percentage of variance explained should be 60% or higher. The biggest problem with this approach is that the final decision on the number of factors depends on the subjectivity of the researcher, since there is no common criterion for what counts as a sufficient “level.” A more objective variant is to determine the number of factors to retain according to the amount of common variance between items; specifically, it retains the number of factors needed to explain all of the common variance between variables. PVA-FS is provided as a default setting when PAF is used in SAS software (Kahn, 2006).
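As an illustration, the cumulative-variance decision can be sketched as follows (the 60% default follows Hair et al.; the function name is ours, and for standardized variables the total variance equals the sum of the eigenvalues):

```python
import numpy as np

def retain_by_variance(eigvals, threshold=0.60):
    # Retain the smallest number of factors whose cumulative share of the
    # total variance reaches the threshold (Hair et al. suggest 60%).
    ev = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    cum = np.cumsum(ev) / ev.sum()          # cumulative variance shares
    return int(np.searchsorted(cum, threshold) + 1)
```

For eigenvalues 2.4, 1.8, 0.9, 0.5, 0.4 (total 6.0), the first factor explains 40% and the first two explain 70%, so two factors satisfy the 60% rule.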
3.5.6. PVA-LF (the percentage of variance accounted for by the last factor)
This criterion bases the retention decision on the percentage of variance accounted for by the last factor extracted. However, there are no absolute standards for this method, nor discussions of which level is proper (Tinsley & Tinsley, 1987). Therefore, it is inappropriate to apply this technique.
3.5.7. MAP (the minimum average partial method by Velicer (1976))
MAP, which has the second-highest factoring accuracy after PA, is a criterion designed for principal component analysis (Hayton et al., 2004; Velicer, 1976). After partialling out the first m components, MAP computes the average of the squared partial correlations among the remaining variables. The number of components to retain is the m at which this average reaches its minimum, the point at which the residual matrix most closely resembles an identity matrix; no further components are extracted. Unlike most other criteria, the disadvantage of this approach is a tendency toward under-factoring (Hayton et al., 2004).
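A sketch of the criterion in Python/NumPy is below (this follows Velicer's original average of squared partial correlations; later revisions of MAP use fourth powers, and the loop bounds and function name here are our implementation choices):

```python
import numpy as np

def velicer_map(R):
    # Velicer's MAP sketch: for each m, partial the first m principal
    # components out of the correlation matrix R, average the squared
    # off-diagonal partial correlations, and retain the m minimizing it.
    R = np.asarray(R, dtype=float)
    p = R.shape[0]
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]               # descending eigenvalues
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    off = ~np.eye(p, dtype=bool)                    # off-diagonal mask
    avg_sq = {}
    for m in range(1, p - 1):
        A = eigvecs[:, :m] * np.sqrt(eigvals[:m])   # component loadings
        C = R - A @ A.T                             # partialled covariance
        d = np.sqrt(np.diag(C))
        partial = C / np.outer(d, d)                # partial correlations
        avg_sq[m] = float((partial[off] ** 2).mean())
    best = min(avg_sq, key=avg_sq.get)
    return best, avg_sq
```

On a sample correlation matrix computed from data with two strong factors, the average squared partial correlation drops until both components are partialled out and rises thereafter.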
3.5.8. Bartlett’s (1950) Chi-square Test
This technique tests the hypothesis that the remaining eigenvalues are equal. Each eigenvalue is evaluated sequentially via a chi-square statistic until the null hypothesis can no longer be rejected (Hayton et al., 2004).
3.5.9. A priori theory
This technique is a non-quantitative method to select the most interpretable solution based on theoretical expectations (Conway & Huffcutt, 2003; Hayton et al., 2004). In other words, it bases the decision of how many factors to retain solely on precedent theories. For instance, when one uses the Big Five personality factors, one probably ends up retaining five factors (Kahn, 2006).
3.5.10. Summary
It is worth remembering that deciding on factors to retain with only one criterion is not appropriate (Tinsley & Tinsley, 1987). Thus, some scholars suggested the use of a combination of techniques as there is no single technique with consistently high accuracy under various conditions (Conway & Huffcutt, 2003). As a matter of fact, many researchers use multiple decision criteria to decide the number of factors to retain (Hayton et al., 2004).
3.6. Interpretation of factor analysis
The most widely used factor interpretation criterion is the factor loading. According to one rule of thumb, a factor loading of 0.3 or more should be considered in factor analysis; a loading of 0.3 or more means that the factor explains about 10% of the variance in the variable. However, Comrey and Lee (1992) concluded that this criterion is problematic and proposed the following labels: 0.32 or higher = poor (0.32² = 0.1024, about 10%), 0.45 or higher = fair (0.45² = 0.2025, about 20%), 0.55 or higher = good (0.55² = 0.3025, about 30%), 0.63 or higher = very good (0.63² = 0.3969, about 40%), and 0.71 or higher = excellent (0.71² = 0.5041, about 50%). Another important thing to consider is the presence of negative loadings. Positive loading values help to understand the nature of the latent dimension emphasized by a factor, while negative loading values sharpen the interpretation by indicating what the factor is not (Tinsley & Tinsley, 1987).
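Comrey and Lee's labels translate directly into a small lookup, sketched here in Python (the function name and the fallback label are our choices):

```python
def loading_label(loading):
    # Comrey and Lee's (1992) rule-of-thumb labels for the absolute size of
    # a factor loading; the squared loading is the share of a variable's
    # variance that the factor explains.
    a = abs(loading)
    for cutoff, label in [(0.71, "excellent"), (0.63, "very good"),
                          (0.55, "good"), (0.45, "fair"), (0.32, "poor")]:
        if a >= cutoff:
            return label
    return "below 0.32"   # conventionally not interpreted
```

The absolute value handles negative loadings, whose magnitude is judged by the same thresholds even though their interpretation differs.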
3.7. Reporting EFA information
Conway and Huffcutt (2003) argued that a substantial number of researchers often skip relevant information which should be provided when conducting EFA. They insist that the researchers need to present descriptive statistics (including sample number, mean, and standard deviation), factor extraction techniques used, number of extracted factors, factor rotation techniques, how the extracted factors are interpreted, how factor scores are calculated, the eigenvalue, commonality, the variance explanatory power of factors, the load factor matrix of all factors, and the correlation matrix of factors (Conway & Huffcutt, 2003).
The table below summarizes the recommended techniques for each step of the EFA.
3.8. Food for thought
The table below shows the answers to the questions that many researchers have had in the process of factor analysis. All answers were derived from various previous studies.
Table 3: The Recommended Techniques for Each Step of the EFA
Table 4: Some Practical Questions and Answers for Conducting EFA
4. Conclusions
In analyzing the causality between factors using SEM, it is important to derive conceptually and theoretically accurate factors through empirical analysis and to identify the relationships between them. Thus, deriving the factors clearly is an important prerequisite.
Despite the many available statistical methods, books, and proposed methodologies, there are still many cases where the analysis is not performed according to a precise factor identification procedure with a solid theoretical basis. One fundamental reason is that the analytical process is driven by software while the theoretical grounds for the factor analysis are, often, neglected. Against this backdrop, this study sought to provide a factor analysis procedure for PLS-SEM based on factor analysis-related studies. In particular, we tried to find the common procedural denominators in the various claims made in the literature (rather than in statistical tools).
According to the result of the literature review, the factor analysis procedure for PLS-SEM consists of the following five stages. The first stage evaluates the adequacy of the collected data and consists of qualitative and quantitative criteria. It is important to examine whether the Bartlett test of sphericity and the KMO MSA satisfy the qualitative criteria; since each technique has advantages and disadvantages, evaluating data adequacy with a single indicator can lead to biased conclusions. The quantitative standards include relative and absolute criteria. The relative criterion refers to the number of samples relative to the number of measurement items, with a minimum requirement of 5:1. The absolute criterion requires more than 100 samples, regardless of the number of observable variables.
The second step is factor extraction. If the collected data forms a normal distribution, it is appropriate to use ML. However, when data forms a non-normal distribution, PAF is a better choice of method.
The third step is factor rotation. The techniques used for rotation are orthogonal rotation and oblique rotation. If the causality between latent variables is to be studied, it is appropriate to use oblique rotation, which assumes inter-correlation between constructs. In particular, the oblique technique is a suitable method for PLS-SEM, which assumes that there is a correlation between latent variables.
Fourth, it is necessary to make a factor retention decision, and deciding how many factors to retain is very important. Extracting too few factors increases the likelihood that the researcher will not be able to identify the intended causal relationships; conversely, extracting too many factors may increase the complexity of the model. It is, therefore, important to retain an appropriate number of factors. To date, various criteria for factor retention exist, but each technique has both advantages and disadvantages, so a combined approach is necessary. First, PA should be used at the onset because it is considered to be highly accurate. Next, it is recommended to use the K1 criterion, which represents the sum of the explanatory power of the measurements used for each factor. In addition, as the purpose of PLS-SEM is to maximize the explained variance, it is necessary to retain factors that, similarly, increase the total explained variance through PVA-FS.
Finally, as a criterion for factor interpretation, it is appropriate to select an item with a factor loading of 0.5 or higher and communality of 0.5 or higher while, selecting a variable without a cross-loading greater than 0.4. It is expected that the accurate factor analysis process for PLS-SEM, as previously presented, will help us to extract more precise factors for the structural model.
References
- Arrindell, W. A., & van der Ende, J. (1985). An Empirical Test of the Utility of the Observations-to-Variables Ratio in Factor & Components Analysis. Applied Psychological Measurement, 9, 165-178. https://doi.org/10.1177/014662168500900205
- Bartlett, M. S. (1950). Tests of Significance in Factor Analysis. British Journal of Psychology, 3, 77-85.
- Beavers, A. S., Lounsbury, J. W., Richards, J. K., Huck, S. W., Skolits, G. J., & Esquivel, S. L. (2013). Practical Considerations for Using Exploratory Factor Analysis in Educational Research. Practical Assessment, Research & Evaluation, 18(6), 1-13.
- Behrens, J. T. (1997). Principles and Procedures of Exploratory Data Analysis. Psychological Methods, 2(2), 131-160. https://doi.org/10.1037/1082-989X.2.2.131
- Browne, M. W. (2001). An Overview of Analytic Rotation in Exploratory Factor Analysis. Multivariate Behavioral Research, 36(1), 111-150. https://doi.org/10.1207/S15327906MBR3601_05
- Bryant, F. B., & Yarnold, P. R. (1997). Principal-Components Analysis and Exploratory and Confirmatory Factor Analysis, pp.99-136., In Reading and Understanding Multivariate Statistics, ed. L. G. Grimm & P. R. Yarnold, American Psychological Association, Washington, DC.
- Budaev, S. V. (2010). Using Principal Components and Factor Analysis in Animal Behaviour Research: Caveats and Guidelines. Ethology: International Journal of Behavioural Biology, 116(5), 472-480. https://doi.org/10.1111/j.1439-0310.2010.01758.x
- Cattell, R. B. (1966) The Scree Test for the Number of Factors. Multivariate Behavioral Research, 1, 245-276. https://doi.org/10.1207/s15327906mbr0102_10
- Cattell, R. B. (1978). The Scientific Use of Factor Analysis. New York: Plenum.
- Cho, Y-S. (2016). The Moderating Effects of Specificity of Technology in the Knowledge Transfer of Distributive Manufacturing MNEs. Journal of Distribution Science, 14(9), 121-132. https://doi.org/10.15722/jds.14.9.201609.121
- Comrey, A. L., & Lee, H. B. (1992). A First Course in Factor Analysis. Hillsdale, NJ: Erlbaum.
- Conway, J. M., & Huffcutt, A. I. (2003). A Review & Evaluation of Exploratory Factor Analysis Practices in Organizational Research. Organizational Research Methods, 6(2), 147-168. https://doi.org/10.1177/1094428103251541
- Costello, A. B., & Osborne, J. W. (2005). Best Practices in Exploratory Factor Analysis: Four Recommendations for Getting the Most from Your Analysis. Practical Assessment, Research and Evaluation, 10(7), 1-9.
- Cronbach, L. (1951). Coefficient Alpha and the Internal Structure of Tests. Psychometrika, 16, 297-334. https://doi.org/10.1007/BF02310555
- de Winter, J. C. F., Dodou, D., & Wieringa, P. A. (2009). Exploratory Factor Analysis with Small Sample Sizes. Multivariate Behavioral Research, 44, 147-181. https://doi.org/10.1080/00273170902794206
- Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the Use of Exploratory Factor Analysis in Psychological Research. Psychological Methods, 4(3), 272-299. https://doi.org/10.1037/1082-989X.4.3.272
- Floyd, F. J., & Widaman, K. F. F. (1995). Factor Analysis in the Development and Refinement of Clinical Assessment Instruments. Psychological Assessment, 7(3), 286-299. https://doi.org/10.1037/1040-3590.7.3.286
- Ford, J. K., MacCallum, R. C., & Tait, M. (1986). The Application of Exploratory Factor Analysis in Applied Psychology: A Critical Review & Analysis. Personnel Psychology, 39, 291-314. https://doi.org/10.1111/j.1744-6570.1986.tb00583.x
- Fouladi, R. T., & Steiger, J. H. (1993). Tests of Multivariate Independence: A Critical Analysis of "Monte Carlo Study of Testing the Significance of Correlation Matrices": Comment. Educational and Psychological Measurement, 53, 927-932. https://doi.org/10.1177/0013164493053004005
- Gefen, D., & Straub, D. A. (2005). Practical Guide to Factorial Validity Using PLS-Graph: Tutorial & Annotated Example. Communications of the Association for Information Systems, 16, 91-109.
- Gorsuch, R. L. (1983). Factor Analysis (2nd ed.). Hillsdale, NJ: Erlbaum.
- Guadagnoli, E., & Velicer, W. F. (1988). Relation of Sample Size to the Stability of Component Patterns. Psychological Bulletin, 103, 265-275. https://doi.org/10.1037/0033-2909.103.2.265
- Guilford, J. P. (1954). Psychometric Methods (2nd ed.). New York: McGraw-Hill.
- Guttman, L. (1954). Some Necessary Conditions for Common-Factor Analysis. Psychometrika, 19(2), 149-161.
- Haig, B. D. (2005). Exploratory Factor Analysis, Theory Generation, and Scientific Method. Multivariate Behavioral Research, 40(3), 303-329. https://doi.org/10.1207/s15327906mbr4003_2
- Hair, J. F. Jr., Black, W. C., Babin, B. J., Anderson, R. E. (2010). Multivariate Data Analysis (7th ed.). Upper Saddle River, NJ: Prentice Hall.
- Hayton, J. C., Allen, D. G., & Scarpello, V. (2004). Factor Retention Decisions in Exploratory Factor Analysis: A Tutorial on Parallel Analysis. Organizational Research Methods, 7(2), 191-205. https://doi.org/10.1177/1094428104263675
- Henson, R. K., & Roberts, J. K. (2006). Use of Exploratory Factor Analysis in Published Research: Common Errors & Some Comment on Improved Practice. Educational & Psychological Measurement, 66(3), 393-416. https://doi.org/10.1177/0013164405282485
- Hogarty, K. Y., Hines, C. V., Kromrey, J. D., Ferron, J. M., & Mumford, K. R. (2005). The Quality of Factor Solutions in Exploratory Factor Analysis: The Influence of Sample Size, Communality, and Overdetermination. Educational & Psychological Measurement, 65(2), 202-226. https://doi.org/10.1177/0013164404267287
- Horn, J. L. (1965). A Rationale and Test for the Number of Factors in Factor Analysis. Psychometrika, 30, 179-185. https://doi.org/10.1007/BF02289447
- Hutcheson, G., & Sofroniou, N. (1999). The Multivariate Social Scientist: Introductory Statistics Using Generalized Linear Models. Thousand Oaks, CA: Sage Publications.
- Kahn, J. H. (2006). Factor Analysis in Counseling Psychology Research, Training, and Practice: Principles, Advances, & Applications. Counseling Psychologist, 34(5), 684-718. https://doi.org/10.1177/0011000006286347
- Kaiser, H. F. (1960). The Application of Electronic Computers to Factor Analysis. Educational and Psychological Measurement, 20, 141-151. https://doi.org/10.1177/001316446002000116
- Joreskog, K. G., & Sorbom, D. (1989). LISREL 7: User's Reference Guide. Mooresville, IN: Scientific Software.
- Joreskog, K. G., & Sorbom, D. (1996). LISREL 8: User's Reference Guide. Mooresville, IN: Scientific Software.
- Kass, R. A., & Tinsley, H. E. A. (1979). Factor Analysis. Journal of Leisure Research, 11, 120-138.
- Lawley, D. N., & Maxwell, A. E. (1971). Factor Analysis in a Statistical Method. Butter-worth, London.
- Ledesma, R. D., & Valero-Mora, P. (2007). Determining the Number of Factors to Retain in EFA: An Easy-to-Use Computer Program for Carrying Out Parallel Analysis. Practical Assessment, Research & Evaluation, 12(2), 1-11.
- Lee, H-S. (2018). Two Factors of Overseas Online Shopping: Self-Efficacy and Impulsivity. Journal of Distribution Science, 16(8), 79-89. https://doi.org/10.15722/JDS.16.8.201808.79
- Leonard, K. A. (2010). Sample Size and Subject to Item Ratio in Principal Components Analysis and Exploratory Factor Analysis. Journal of Biometrics & Biostatistics, 1(2), 2-6.
- MacCallum, R. C., & Widaman, K. F. (1999). Sample Size in Factor Analysis. Psychological Methods, 4(1), 84-99. https://doi.org/10.1037/1082-989X.4.1.84
- MacCallum, R. C., Widaman, K. F., Preacher, K. J., & Hong, S. (2001). Sample Size in Factor Analysis: The Role of Model Error. Multivariate Behavioral Research, 36(4), 611-637. https://doi.org/10.1207/S15327906MBR3604_06
- Meyers, L. S., Gamst, G., & Guarino, A. J. (2006). Applied Multivariate Research: Design and Interpretation. Thousand Oaks, California: SAGE Publications.
- Mundfrom, D. J., Shaw, D. G., & Ke, T. L. (2005). Minimum Sample Size Recommendations for Conducting Factor Analysis. International Journal of Testing, 5(2), 159-168. https://doi.org/10.1207/s15327574ijt0502_4
- Norris, M., & Lecavalier, L. (2010). Evaluating the Use of Exploratory Factor Analysis in Developmental disability Psychological Research. Journal of Autism & Developmental Disorders, 40(1), 8-20. https://doi.org/10.1007/s10803-009-0816-2
- Norusis, M., (2005). Advanced Statistical Procedures Companion. Prentice Hall, Upper Saddle River, NJ.
- Nunnally, J. C. (1978). Psychometric Theory (2nd ed.). McGraw Hill, New York.
- Osborne, J. W., & Fitzpatrick, D. C. (2012). Replication Analysis in Exploratory Factor Analysis: What It Is and Why It Makes Your Analysis Better. Practical Assessment, Research & Evaluation, 17(15), 1-8.
- Park, H. S., Dailey, R., & Lemus, D. (2002). The Use of Exploratory Factor Analysis and Principal Components Analysis in Communication Research. Human Communication Research, 28(4), 562-577. https://doi.org/10.1111/j.1468-2958.2002.tb00824.x
- Phuong, N. N. D., & Dat, N. T. (2017). The Effect of Country-of-Origin on Customer Purchase Intention: A Study of Functional Products in Vietnam. Journal of Asian Finance, Economics and Business, 4(3), 75-83. https://doi.org/10.13106/JAFEB.2017.VOL4.NO3.75
- Phuong, N. N. D., Khuong, M. N., Phuc, L. H., & Dong, L. N. T. (2018). The Effect of Two-Dimensional Factor on Municipal Civil Servants' Job Satisfaction and Public Policy Implications. Journal of Asian Finance, Economics and Business, 5(3), 133-142. https://doi.org/10.13106/jafeb.2018.vol5.no3.133
- Preacher, K. J., & MacCallum, R. C. (2002). Exploratory Factor Analysis in Behavior Genetics Research: Factor Recovery with Small Sample Sizes. Behavior Genetics, 32(2), 153-161. https://doi.org/10.1023/A:1015210025234
- Snook, S. C., & Gorsuch, R. L. (1989). Component Analysis Versus Common Factor Analysis: A Monte Carlo Study. Psychological Bulletin, 106(1), 148-154. https://doi.org/10.1037/0033-2909.106.1.148
- Streiner, D. L. (1994). Figuring Out Factors: The Use and Misuse of Factor Analysis. Canadian Journal of Psychiatry, 39, 135-140. https://doi.org/10.1177/070674379403900303
- Suhr, D. (2006). Exploratory or Confirmatory Factor Analysis. SAS Users Group International Conference, Cary: SAS Institute, Inc., 1-17.
- Tabachnick, B., & Fidell, L. (2001). Using Multivariate Statistics. Needham Heights, MA: Allyn and Bacon.
- Tavakol, M., & Dennick, R. (2011). Making Sense of Cronbach's Alpha. International Journal of Medical Education, 2, 53-55. https://doi.org/10.5116/ijme.4dfb.8dfd
- Tinsley, H. E. A., & Tinsley, D. J. (1987). Uses of Factor Analysis in Counseling Psychology Research. Journal of Counseling Psychology, 34(4), 414-424. https://doi.org/10.1037/0022-0167.34.4.414
- Treiblmaier, H., & Filzmoser, P. (2010). Exploratory Factor Analysis Revisited: How Robust Methods Support the Detection of Hidden Multivariate Data Structures in IS Research. Information & Management, 47(4), 197-207.