The approach represents correlated variables together as a new, smaller set of derived variables with minimum loss of information. It is therefore a data reduction tool that removes redundancy or duplication from a set of correlated variables.
The factors that are formed are also relatively independent of one another. Because the method requires the data to be correlated, all assumptions that apply to correlation are relevant here.
There are two main types of factor analysis:
Principal component analysis - this method provides a unique solution, so the original data can be reconstructed from the results. In other words, it works in both directions: the solution is derived from the data, and the data can be recovered from the solution. The solution includes as many factors as there are variables.
Common factor analysis - this technique uses an estimate of the common variance shared among the original variables to generate the solution. Because of this, the number of factors will always be less than the number of original variables. Strictly speaking, the term factor analysis refers to common factor analysis.
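The reconstruction property of principal component analysis can be checked numerically. The following is a minimal sketch, not a definitive implementation: the data are simulated, the variable names and the loading values are assumptions made for illustration, and the components are obtained by an eigen-decomposition of the correlation matrix using NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 observations of 4 correlated variables.
# The weights 0.9, 0.8, 0.7, 0.6 are invented for illustration.
latent = rng.normal(size=(200, 1))
X = latent @ np.array([[0.9, 0.8, 0.7, 0.6]]) + 0.5 * rng.normal(size=(200, 4))

# Standardize each variable (zero mean, unit variance).
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# Principal components from the eigen-decomposition of the correlation matrix.
R = np.corrcoef(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]          # sort from largest to smallest
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Component scores, then the reverse step: data rebuilt from the solution.
scores = Z @ eigvecs
Z_rebuilt = scores @ eigvecs.T

n_components = eigvecs.shape[1]
reconstruction_error = np.max(np.abs(Z - Z_rebuilt))

print("variables:", X.shape[1], "components:", n_components)
print("largest reconstruction error:", reconstruction_error)
```

When all components are kept, the rebuilt data match the standardized data up to rounding error, and the number of components equals the number of variables. Dropping the smaller components is what turns this into data reduction, at the cost of some information.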
The main uses of factor analysis are summarized below. It helps us with:
Identification of underlying factors - the aspects common to many variables can be identified, and the variables can be clustered into homogeneous sets. Thus, new sets of variables can be created, which gives us insight into categories.
Screening of variables - it helps us identify groupings so that we can select one variable to represent many.
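Both uses can be sketched from a table of factor loadings. The example below is only an illustration under stated assumptions: the loading matrix and the variable names v1 to v6 are invented, and the rule used (assign each variable to the factor on which it loads most strongly, then pick the strongest member of each group as its representative) is one simple convention, not the only possible one.

```python
import numpy as np

# Hypothetical loading matrix (rows = variables, columns = factors).
# These numbers are invented for illustration only.
variable_names = ["v1", "v2", "v3", "v4", "v5", "v6"]
loadings = np.array([
    [0.82, 0.10],
    [0.75, 0.05],
    [0.68, 0.20],
    [0.12, 0.79],
    [0.08, 0.71],
    [0.15, 0.64],
])

# Identification: assign each variable to the factor with the
# largest absolute loading, which clusters the variables into sets.
assigned_factor = np.argmax(np.abs(loadings), axis=1)

groups = {}
representatives = {}
for factor in range(loadings.shape[1]):
    members = np.flatnonzero(assigned_factor == factor)
    groups[factor] = [variable_names[i] for i in members]
    # Screening: the member with the largest absolute loading
    # stands in for the whole group.
    best = members[np.argmax(np.abs(loadings[members, factor]))]
    representatives[factor] = variable_names[best]

print("groups:", groups)
print("representatives:", representatives)
```

With these invented loadings, v1 to v3 form one set and v4 to v6 the other, and one variable from each set is selected to represent it.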
Let us consider an example to understand the use of factor analysis.
Suppose we want to know whether certain aspects such as “task skills” and “communication skills” contribute to the quality of “leadership” or not. We prepare a questionnaire with 20 items, 10 of them pertaining to task elements and 10 to communication elements.
Before using the questionnaire on the sample, we try it on a small group of people who are similar to those in the survey. When we analyze the data, we check whether there are really two factors and whether those factors represent the aspects of task and communication skills.
In this way, factors can be found to represent variables with similar aspects.
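The questionnaire check can be imitated with simulated responses. This is a rough sketch under several assumptions that are not part of the example itself: the responses are generated artificially, the two skills are taken to be independent, the sample size and the loading values 0.8 and 0.6 are invented, factors are extracted by the principal component method, and the number of factors is chosen by counting eigenvalues greater than 1. No rotation is applied here, although in practice a rotation is commonly used before interpreting loadings.

```python
import numpy as np

rng = np.random.default_rng(42)
n_respondents = 500        # hypothetical sample size
n_task, n_comm = 10, 10    # 10 task items and 10 communication items

# Two independent latent skills per respondent (a simulation assumption).
task_skill = rng.normal(size=(n_respondents, 1))
comm_skill = rng.normal(size=(n_respondents, 1))

# Items 0-9 depend on task skill, items 10-19 on communication skill.
# The loadings 0.8 and 0.6 are invented for illustration.
task_items = 0.8 * task_skill + 0.6 * rng.normal(size=(n_respondents, n_task))
comm_items = 0.6 * comm_skill + 0.8 * rng.normal(size=(n_respondents, n_comm))
responses = np.hstack([task_items, comm_items])

# Eigen-decomposition of the item correlation matrix.
R = np.corrcoef(responses, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Are there really two factors? Count eigenvalues above 1.
n_factors = int(np.sum(eigvals > 1.0))

# Do the factors match the intended item groups?
# Loadings = eigenvectors scaled by the square root of the eigenvalues.
loadings = eigvecs[:, :n_factors] * np.sqrt(eigvals[:n_factors])
assigned = np.argmax(np.abs(loadings), axis=1)

print("factors retained:", n_factors)
print("factor for task items:", assigned[:n_task])
print("factor for communication items:", assigned[n_task:])
```

If the items behave as intended, two factors are retained, all task items load most strongly on one of them, and all communication items load most strongly on the other. Items that load on the wrong factor, or on neither, would be candidates for revision before the full survey.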