
Factor analysis based on classical and robust estimators

Muthukrishnan R1, E D Boobalan2*

1,2Department of Statistics, Bharathiar University, Coimbatore-46 Tamil Nadu, INDIA.

Research Article

Abstract: Introduction: Multivariate methods such as principal component analysis, discriminant analysis, cluster analysis and multivariate regression are mainly based on empirical measures: the mean vector and the covariance and correlation matrices. All these measures are strongly affected by even a single outlier in a multivariate data set. Robust alternatives have been established to overcome this limitation, and many multivariate robust procedures are available to estimate these measures. These procedures work by selecting a subset of the best observations (those that represent the original data), usually about half of the data points. Among these, the minimum covariance determinant (MCD) estimator proposed by Rousseeuw (1984) is one of the most highly robust estimators of multivariate location and scatter. This paper explores such robust procedures and their application in factor analysis. Further, it proposes constructing robust factor analysis with the help of the widely used robust methods MVE, S and MM, which can resist the effect of outliers. The efficiency of these estimators is compared with that of the classical one in an empirical study carried out with MATLAB software.

INTRODUCTION

Multivariate analysis is a statistical technique for the simultaneous analysis of two or more variables observed on one or more sample objects. The objective of the analysis is to estimate the extent of the relationship among the variables. When working with p-dimensional multivariate normal data, both the location and the scatter are of interest. The location is described by a mean vector, which represents a point in the multidimensional space, and the scatter is described by a variance-covariance matrix. The sample mean vector and the sample covariance matrix are the cornerstones of classical multivariate analysis. They are optimal when the underlying data are normal, but they are notoriously sensitive to outliers and heavy-tailed data. Robust alternatives to these classical location and scatter estimators are available, and they are far more resistant to outliers and contaminated data. This paper gives a brief description of the robust estimators MCD, MVE, S and MM. It proposes constructing factor analysis from these robust estimators and compares their efficiency with that of classical factor analysis. Section 2 gives a brief introduction to factor analysis and its classical and robust versions. Section 3 describes the classical and robust estimators. The performance of the proposed method is studied with numerical experiments, and the results are given in Section 4. The findings and discussion are presented in the last section.

CLASSICAL AND ROBUST FACTOR ANALYSIS

Factor analysis is a popular multivariate technique. Its goal is to approximate the p original variables of the dataset by linear combinations of a smaller number k of latent variables, called factors. This must be done in such a way that the covariance matrix or the correlation matrix of the p original variables is fitted well. The factor analysis model contains many parameters, including the specific variances of the error components. Classical factor analysis (FA) starts by computing the usual sample covariance (or correlation) matrix, followed by a second step which decomposes this matrix according to the model; the eigenvectors and eigenvalues of the matrix are employed for estimating the loading matrix. This approach is not robust, since outliers already have a large effect on the covariance (or correlation) matrix in the first step, and the results obtained may be misleading or unreliable. A straightforward way to robustify classical FA is to replace the sample covariance (or correlation) matrix with a robust one. A robust factor analysis method is therefore constructed which, in the first step, computes a highly resistant scatter matrix such as the minimum covariance determinant (MCD) estimator (Rousseeuw 1985; Rousseeuw and Van Driessen 1999), Rousseeuw's minimum volume ellipsoid (MVE) estimator, Rousseeuw and Yohai's S-estimators and Huber's M-estimators [Campbell (1980, 1982); Davies (1987); Hampel, Ronchetti, Rousseeuw and Stahel (1986); Huber (1981); Kent and Tyler (1991); Lopuhaa (1989); Lopuhaa and Rousseeuw (1991); Maronna (1976); Rousseeuw (1985); Rousseeuw and Leroy (1987); Rousseeuw and Yohai (1984); Rousseeuw and van Zomeren (1990a, 1990b, 1991); Tyler (1983, 1988, 1991)]. For the second step several methods are available, such as maximum likelihood estimation and the principal factor analysis method.
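The two-step construction described above can be illustrated with a minimal sketch. It assumes a principal-component style extraction, in which each loading column is an eigenvector scaled by the square root of its eigenvalue, and it finds the eigenpairs by power iteration with deflation so that only the Python standard library is needed. The function names and the 3 x 3 matrix are hypothetical; this is not the MATLAB code used in the study. A robust scatter or correlation matrix could be passed in place of the classical one without changing the second step.

```python
import math

def dominant_eigenpair(a, iters=500):
    """Power iteration for the largest eigenvalue of a symmetric matrix."""
    n = len(a)
    v = [float(i + 1) for i in range(n)]  # non-uniform start vector
    for _ in range(iters):
        w = [sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    av = [sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]
    lam = sum(v[i] * av[i] for i in range(n))  # Rayleigh quotient
    return lam, v

def pc_loadings(scatter, k):
    """Return k loading columns and the cumulative % of variance explained."""
    n = len(scatter)
    a = [row[:] for row in scatter]  # work on a copy
    eigenvalues, loadings = [], []
    for _ in range(k):
        lam, v = dominant_eigenpair(a)
        eigenvalues.append(lam)
        loadings.append([math.sqrt(lam) * x for x in v])
        # Deflation: remove the component just found.
        for i in range(n):
            for j in range(n):
                a[i][j] -= lam * v[i] * v[j]
    total = sum(scatter[i][i] for i in range(n))  # trace = total variance
    cum_var, running = [], 0.0
    for lam in eigenvalues:
        running += lam
        cum_var.append(100.0 * running / total)
    return loadings, cum_var

# Hypothetical correlation matrix: variables 1 and 2 are strongly related,
# variable 3 is unrelated to both.
R = [[1.0, 0.8, 0.0],
     [0.8, 1.0, 0.0],
     [0.0, 0.0, 1.0]]

loadings, cum_var = pc_loadings(R, k=2)
for f, column in enumerate(loadings, start=1):
    print("Factor", f, [round(x, 4) for x in column])
print("Cumulative variance explained (%):", [round(c, 2) for c in cum_var])
```

In this toy case the first factor loads on variables 1 and 2 and the second on variable 3. Power iteration is chosen only for brevity; a full eigendecomposition routine would normally be used.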

CLASSICAL AND ROBUST ESTIMATORS

Maximum Likelihood Estimator (MLE)

The principle of maximum likelihood estimation (MLE) was originally developed by R. A. Fisher in the 1920s. Assuming that the data are drawn from a multivariate normal population, the optimal estimators of location and dispersion are, respectively, the sample mean vector,

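$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \qquad (1)$$

and the sample covariance matrix

$$S = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})^{T} \qquad (2)$$

where $x_1, \dots, x_n$ are the $p$-dimensional observations. The divisor $n$ is the maximum likelihood form; the divisor $n - 1$ gives the unbiased version.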

These are, obviously, mean-based estimators, so any unusual or extreme observation can arbitrarily inflate either of them.
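The sensitivity of these estimators can be seen in a small sketch. The five bivariate observations are made up for illustration, and the covariance uses the maximum likelihood divisor n by default (the `ddof` argument is a hypothetical convenience for switching to n - 1). Replacing a single observation by an extreme one moves the mean and inflates the variance, while the coordinate-wise median, shown only for contrast, stays where it was.

```python
import statistics

def mean_vector(rows):
    """Sample mean vector of a list of equal-length observations."""
    n, p = len(rows), len(rows[0])
    return [sum(r[j] for r in rows) / n for j in range(p)]

def covariance(rows, ddof=0):
    """Covariance matrix; ddof=0 is the ML form, ddof=1 the unbiased form."""
    n, p = len(rows), len(rows[0])
    m = mean_vector(rows)
    return [[sum((r[i] - m[i]) * (r[j] - m[j]) for r in rows) / (n - ddof)
             for j in range(p)] for i in range(p)]

# Made-up data: five clean points, then the same set with one gross outlier.
clean = [(1.0, 2.0), (2.0, 3.0), (3.0, 4.0), (4.0, 5.0), (5.0, 6.0)]
contaminated = clean[:-1] + [(5.0, 600.0)]

mean_clean = mean_vector(clean)
mean_bad = mean_vector(contaminated)
var_y_clean = covariance(clean)[1][1]
var_y_bad = covariance(contaminated)[1][1]
median_y_bad = statistics.median(r[1] for r in contaminated)

print("mean (clean):       ", mean_clean)
print("mean (contaminated):", mean_bad)
print("var of y (clean):       ", var_y_clean)
print("var of y (contaminated):", var_y_bad)
print("median of y (contaminated):", median_y_bad)
```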

Robust Estimator

The Minimum Volume Ellipsoid (MVE) estimator was first proposed by Rousseeuw (1984). It has been frequently used in the detection of multivariate outliers. The estimator seeks the ellipsoid of minimum volume that covers a subset of at least h data points. Subsets of approximately 50% of the observations are examined to find the subset that minimizes the volume occupied by the data. The best subset (smallest volume) is then used to calculate the covariance matrix and the Mahalanobis distances of all the data points. An appropriate cut-off value is then estimated, and the observations whose distances exceed that cut-off are declared to be outliers. To reduce computation time, Rousseeuw and Leroy (1987) proposed a resampling algorithm that initially draws subsamples of p+1 observations (p is the number of variables), the minimum needed to determine an ellipsoid in p-dimensional space.

Another robust estimator, the minimum covariance determinant (MCD) estimator proposed by Rousseeuw (1984, 1985), is a highly robust estimator of multivariate location and scatter. When Rousseeuw introduced it in 1984, it was rarely used because no efficient computing procedure was available and exact computation was time consuming, so in practice one resorted to approximate algorithms. To overcome this limitation, Rousseeuw and Van Driessen (1999) introduced the FAST-MCD algorithm, which uses a concentration step (C-step) to simplify the computation. The key fact behind the algorithm is that, starting from any approximation to the MCD, it is possible to compute another approximation with an even lower determinant. The FAST-MCD method can handle large data sets within a reasonable amount of time, and Rousseeuw and Van Driessen (1999) applied it successfully to large data sets.

Rousseeuw and Yohai (1984) introduced the S-estimator, which differs slightly from the other robust estimators, and studied its existence, consistency, asymptotic normality and breakdown point. Davies (1987) investigated some properties of S-estimators of multivariate location and covariance. An S-estimator of multivariate location and scale minimizes the determinant of the covariance matrix, subject to a constraint on the magnitudes of the corresponding Mahalanobis distances.

The M-estimator was originally constructed by Huber (1964) for the estimation of a one-dimensional location parameter, and Maronna (1976) was the first to define M-estimators for multivariate location and covariance. The multivariate MM-estimator was introduced by Tatsuoka and Tyler (2000) as a member of a broad class of estimators, namely multivariate M-estimators with auxiliary scale. The idea of the MM-estimator is to estimate the scale by means of a very robust S-estimator, and then to estimate the location and shape using a different ρ-function that yields better efficiency at the central model. The location and shape estimates inherit the breakdown point of the auxiliary scale and can be seen as a generalization of the regression MM-estimators of Yohai (1987).
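The concentration step used in FAST-MCD can be sketched as follows. This is a minimal illustration restricted to bivariate data, so that the determinant and the Mahalanobis distance have closed forms. It runs C-steps from a single hand-picked starting subset, whereas the published algorithm uses many random starts and further refinements. The data, the function names and the choice h = 6 are assumptions made for the example.

```python
def mean_cov(points):
    """Mean and covariance (divisor = subset size) of bivariate points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return (mx, my), (sxx, sxy, syy)

def det2(cov):
    sxx, sxy, syy = cov
    return sxx * syy - sxy * sxy

def mahalanobis_sq(p, center, cov):
    """Squared Mahalanobis distance using the closed-form 2 x 2 inverse."""
    sxx, sxy, syy = cov
    dx, dy = p[0] - center[0], p[1] - center[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det2(cov)

def c_step(data, subset, h):
    """One C-step: keep the h points closest to the current subset's fit."""
    center, cov = mean_cov([data[i] for i in subset])
    order = sorted(range(len(data)),
                   key=lambda i: mahalanobis_sq(data[i], center, cov))
    return sorted(order[:h])

def concentrate(data, start, h, max_iter=50):
    """Repeat C-steps until the subset stops changing."""
    current = sorted(start)
    dets = [det2(mean_cov([data[i] for i in current])[1])]
    for _ in range(max_iter):
        nxt = c_step(data, current, h)
        if nxt == current:
            break
        current = nxt
        dets.append(det2(mean_cov([data[i] for i in current])[1]))
    return current, dets

# Made-up data: indices 0-7 form one cluster, 8 and 9 are outliers.
data = [(1.0, 1.0), (2.0, 1.5), (1.5, 2.5), (2.5, 2.0), (3.0, 3.0),
        (2.0, 3.0), (3.0, 1.5), (1.0, 2.0), (10.0, 12.0), (11.0, 10.0)]

# The starting subset deliberately contains the outlier with index 8.
subset, dets = concentrate(data, start=[0, 1, 2, 3, 4, 8], h=6)
print("final subset:", subset)
print("determinants:", [round(d, 4) for d in dets])
```

In this toy run the determinant falls from about 3.10 to about 0.146 and the outlier is dropped after one step. A single start can still end in a poor local solution, which is why multiple starts are used in practice.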

Numerical Study

This section presents the performance of the classical procedure and of various robust procedures, particularly MCD, MVE, S and MM, in the construction of factor analysis. The factor loadings of each variable on each factor under the various procedures are discussed, along with plots. The numerical study is carried out in MATLAB with two packages, Forward Search Data Analysis (FSDA) and the Library for Robust Analysis (LIBRA). The study also reports results under different levels of data contamination.

Experiment 1

Factor analysis is performed on a real dataset under the classical and robust procedures. The carbig dataset contains various measured variables for about 392 automobiles. The p = 5 variables are acceleration (X1), displacement (X2), horsepower (X3), MPG (X4) and weight (X5). The factor loadings and variance explained under the various procedures are listed in Table 1, and the factor loadings with 2% contamination are given in Table 2; both tables are in the appendix. For the given data, two factors are extracted by the classical and all the robust procedures. Table 1 shows that the robust procedures produce the same results as the classical one. For the contaminated data, the deviation of the factor loadings is very low for the robust procedures but not for the classical procedure. The bi-plots of the factor loadings under the various procedures, without and with contamination, are displayed in Figures 1 and 2, respectively. The bi-plots based on the robust procedures are almost the same with and without contamination, whereas the bi-plot of the classical procedure changes.
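The size of the deviation can be checked directly from the appendix. The sketch below copies the first loading column reported for each procedure in Tables 1 and 2 and computes the largest absolute change per procedure. It assumes that the first column under each procedure is Factor 1 and that Table 2 is the 2% contaminated counterpart of Table 1; the helper name is hypothetical. Only the first factor is compared, because the sign of a factor is arbitrary and the second classical factor changes sign between the two tables.

```python
# First-factor loadings for X1..X5, copied from Table 1 (original data).
clean_f1 = {
    "Classical": [-0.2432, 0.8773, 0.7618, -0.7978, 0.9692],
    "MCD":       [-0.1042, 0.8469, 0.7758, -0.8705, 0.9635],
    "MVE":       [-0.1365, 0.9434, 0.8019, -0.8491, 0.9728],
    "S":         [-0.2193, 0.9301, 0.8424, -0.8678, 0.9724],
    "MM":        [-0.2298, 0.9213, 0.8266, -0.8487, 0.9698],
}

# First-factor loadings copied from Table 2 (2% contamination).
contaminated_f1 = {
    "Classical": [-0.1915, 0.8014, 0.5682, -0.1316, 0.9399],
    "MCD":       [-0.1123, 0.8445, 0.7763, -0.8725, 0.9693],
    "MVE":       [-0.1363, 0.9423, 0.8013, -0.8489, 0.9725],
    "S":         [-0.2247, 0.9284, 0.8448, -0.8692, 0.9724],
    "MM":        [-0.2394, 0.9190, 0.8284, -0.8497, 0.9690],
}

def max_abs_shift(before, after):
    """Largest absolute change in any loading between two columns."""
    return max(abs(b - a) for b, a in zip(before, after))

shifts = {name: max_abs_shift(clean_f1[name], contaminated_f1[name])
          for name in clean_f1}

for name, shift in shifts.items():
    print(f"{name:10s} max |change| in Factor 1 loadings = {shift:.4f}")
```

Under these assumptions the classical loadings move by as much as 0.67 (the MPG loading, X4), while every robust procedure stays within 0.01.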

Figure 1: Bi-Plot, panels (a) to (e)

Figure 2: Bi-Plot (With Contamination): (a) Classical (b) MCD (c) MVE (d) S (e) MM

Experiment 2

The Olympic decathlon dataset (see Linden 1987) is considered for this experiment. It contains the performances of 33 competitors in the men's decathlon at the 1988 Olympic Games across the ten events: 100 meters (Y1), long jump (Y2), shot-put (Y3), high jump (Y4), 400 meters (Y5), 110-meter hurdles (Y6), discus throw (Y7), pole vault (Y8), javelin (Y9) and 1500 meters (Y10). The factor analysis results for the original dataset and under various levels of contamination (2%, 5%, 10% and 20%) are displayed in Tables 3 to 7 in the appendix. For the original dataset, three factors are extracted by the classical and the robust procedures. Table 3 indicates that almost all the procedures assign the variables to the factors in the same way, so the robust procedures give the same results as the classical one. Factor 1 contains three variables: 100 meters (Y1), 110-meter hurdles (Y6) and 400 meters (Y5). Factor 2 contains six variables: long jump (Y2), shot-put (Y3), high jump (Y4), discus throw (Y7), pole vault (Y8) and javelin (Y9). Factor 3 has only one variable, 1500 meters (Y10). The three factors can be named sprints, field events and middle distance, respectively. The results under the various levels of contamination are displayed in Tables 4 to 7. The classical procedure does not assign the same variables to the factors, and as the contamination level increases it fails to classify the variables correctly. The robust procedures MCD and MVE classify the variables into the factors in a meaningful way up to a contamination level of 35%, since these two procedures are based on robust distances. The S and MM procedures, however, tolerate only a lower level of contamination, because they are based on the magnitude of the Mahalanobis distance.

CONCLUSION

Robust location and scatter estimators find numerous applications in multivariate data analysis and inference, and in turn play an important role in areas such as pattern recognition, telecommunications, signal processing and computer vision. In this context, this paper proposed constructing factor analysis with the help of the widely used robust estimators MVE, S and MM, which can resist the effect of contaminated data. The results show that, on the original data, the classical and robust procedures assign the same variables to the factors. As the contamination level increases, the classical procedure no longer classifies the variables correctly, whereas the robust procedures tolerate some level of contaminated data.

ACKNOWLEDGEMENT

The first author conveys his sincere thanks to the University Grants Commission, New Delhi, India, for providing financial assistance under the major research project scheme [F.N.40-247/2011 (SR)] awarded to the Department of Statistics, Bharathiar University, Coimbatore - 641046, Tamil Nadu, India.

REFERENCES

1. Campbell, N.A., “Robust Procedures in Multivariate Analysis I: Robust Covariance Estimation”, Applied Statistics, 29, 231-237, 1980.
2. Davies, P.L., “Asymptotic Behavior of S-Estimates of Multivariate Location Parameters and Dispersion Matrices”, Annals of Statistics, 15, 3, 1269-1292, 1987.
3. Flury, B. and Riedwyl, H., “Multivariate Statistics: A Practical Approach”, Cambridge University Press, 1988.
4. Hampel, F.R., Ronchetti, E.M., Rousseeuw, P.J. and Stahel, W.A., “Robust Statistics: The Approach Based on Influence Functions”, John Wiley and Sons, New York, 1986.
5. Huber, P.J., “Robust Statistics”, John Wiley and Sons, New York, 1981.
6. Kent, J.T. and Tyler, D.E., “Constrained M-Estimation for Multivariate Location and Scatter”, Annals of Statistics, 24, 1346-1370, 1996.
7. Kosfeld, R., “Robust Exploratory Factor Analysis”, Statistical Papers, 37, 105-122, 1996.
8. Lopuhaa, H.P., “On the Relation between S-Estimators and M-Estimators of Multivariate Location and Covariance”, Annals of Statistics, 17, 1662-1683, 1989.
9. Lopuhaa, H.P. and Rousseeuw, P.J., “Breakdown Properties of Affine Equivariant Estimators of Multivariate Location and Covariance Matrices”, Annals of Statistics, 19, 229-248, 1991.
10. Maronna, R.A., “Robust M-Estimation of Multivariate Location and Scatter”, Annals of Statistics, 4, 51-67, 1976.
11. Rousseeuw, P.J., “Least Median of Squares Regression”, Journal of the American Statistical Association, 79, 871-880, 1984.
12. Rousseeuw, P.J., “Multivariate Estimation with High Breakdown Point”, Mathematical Statistics and Applications, 283-297, 1985.
13. Rousseeuw, P.J. and Leroy, A., “Robust Regression and Outlier Detection”, John Wiley and Sons, New York, 1987.
14. Rousseeuw, P.J. and Van Zomeren, B.C., “Unmasking Multivariate Outliers and Leverage Points”, Journal of the American Statistical Association, 85, 633-639, 1990a.
15. Rousseeuw, P.J. and Van Zomeren, B.C., “Unmasking Multivariate Outliers and Leverage Points (With Discussion)”, Journal of the American Statistical Association, 85, 633-651, 1990b.
16. Rousseeuw, P.J. and Van Zomeren, B.C., “Robust Distance: Simulation and Cutoff Values”, in Directions in Robust Statistics and Diagnostics, Part II, Stahel, W. and Weisberg, S. (eds.), The IMA Volumes in Mathematics and its Applications, 34, 195-203, 1991.
17. Rousseeuw, P.J. and Van Driessen, K., “A Fast Algorithm for the Minimum Covariance Determinant Estimator”, Technometrics, 41, 212-223, 1999.
18. Rousseeuw, P.J. and Yohai, V.J., “Robust Regression by Means of S-Estimators”, Robust and Nonlinear Time Series Analysis, Lecture Notes in Statistics, 26, 256-272, 1984.
19. Salibian-Barrera, M. and Yohai, V.J., “A Fast Algorithm for S-Regression Estimates”, Journal of Computational and Graphical Statistics, 15, 414-427, 2006.
20. Tyler, D.E., “A Class of Asymptotic Tests for Principal Component Vectors”, Annals of Statistics, 11(4), 1243-1250, 1983.
21. Tyler, D.E., “Some Results on the Existence and Computation of the M-Estimates of Multivariate Location and Scatter”, SIAM J. Sci. Stat. Comput., 9, 2, 354-362, 1988.
22. Tyler, D.E., “Some Issues in the Robust Estimation of Multivariate Location and Scatter”, in Directions in Robust Statistics and Diagnostics, Part II, Stahel, W. and Weisberg, S. (eds.), The IMA Volumes in Mathematics and its Applications, Springer-Verlag: New York, 34, 327-336, 1991.

Appendix

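Table 1: Factor loadings and variance explained under the classical and robust procedures (carbig data)

| Variables | Classical F1 | Classical F2 | MCD F1 | MCD F2 | MVE F1 | MVE F2 | S F1 | S F2 | MM F1 | MM F2 |
|---|---|---|---|---|---|---|---|---|---|---|
| X1 | -0.2432 | -0.8500 | -0.1042 | 0.9920 | -0.1365 | 0.8653 | -0.2193 | 0.9731 | -0.2298 | 0.9707 |
| X2 | 0.8773 | 0.3871 | 0.8469 | -0.2348 | 0.9434 | -0.1374 | 0.9301 | -0.2825 | 0.9213 | -0.3005 |
| X3 | 0.7618 | 0.5930 | 0.7758 | -0.4101 | 0.8019 | -0.5933 | 0.8424 | -0.4682 | 0.8266 | -0.4922 |
| X4 | -0.7978 | -0.2786 | -0.8705 | 0.1262 | -0.8491 | 0.1706 | -0.8678 | 0.1489 | -0.8487 | 0.1777 |
| X5 | 0.9692 | 0.2129 | 0.9635 | -0.0847 | 0.9728 | -0.2210 | 0.9724 | -0.1864 | 0.9698 | -0.1829 |
| Variance Explained | 99.7554 | 99.9616 | 99.7670 | 99.9084 | 99.9165 | 99.9835 | 99.8652 | 99.9769 | 99.8300 | 99.9727 |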

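Table 2: Factor loadings and variance explained with 2% contamination (carbig data)

| Variables | Classical F1 | Classical F2 | MCD F1 | MCD F2 | MVE F1 | MVE F2 | S F1 | S F2 | MM F1 | MM F2 |
|---|---|---|---|---|---|---|---|---|---|---|
| X1 | -0.1915 | 0.9789 | -0.1123 | 0.9875 | -0.1363 | 0.8650 | -0.2247 | 0.9719 | -0.2394 | 0.9684 |
| X2 | 0.8014 | -0.1691 | 0.8445 | -0.2352 | 0.9423 | -0.1376 | 0.9284 | -0.2887 | 0.9190 | -0.3103 |
| X3 | 0.5682 | -0.2115 | 0.7763 | -0.4098 | 0.8013 | -0.5929 | 0.8448 | -0.4649 | 0.8284 | -0.4893 |
| X4 | -0.1316 | 0.0236 | -0.8725 | 0.1260 | -0.8489 | 0.1700 | -0.8692 | 0.1607 | -0.8497 | 0.1917 |
| X5 | 0.9399 | -0.1128 | 0.9693 | -0.0823 | 0.9725 | -0.2213 | 0.9724 | -0.1815 | 0.9690 | -0.1840 |
| Variance Explained | 98.6157 | 99.3428 | 99.8607 | 99.9719 | 99.9006 | 99.9908 | 99.8593 | 99.9773 | 99.8263 | 99.9734 |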

Table 3: Factor loadings and variance explained (decathlon data). FA = classical factor analysis; F1 to F3 = Factors 1 to 3.

| Events | FA F1 | FA F2 | FA F3 | MCD F1 | MCD F2 | MCD F3 | MVE F1 | MVE F2 | MVE F3 | S F1 | S F2 | S F3 | MM F1 | MM F2 | MM F3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 100 meters | 0.7838 | -0.0559 | 0.0708 | 0.7758 | -0.1569 | 0.0462 | 0.7821 | -0.0855 | 0.0624 | 0.7758 | -0.157 | 0.0462 | 0.7821 | -0.0855 | 0.0624 |
| Long jump | -0.6091 | 0.0502 | -0.2622 | -0.5035 | 0.2072 | -0.1241 | -0.6132 | 0.089 | -0.2022 | -0.5035 | 0.2072 | -0.1241 | -0.6132 | 0.089 | -0.2022 |
| Shot-put | -0.2062 | 0.9687 | 0.1189 | -0.1557 | 0.9732 | 0.1541 | -0.1808 | 0.9684 | 0.1566 | -0.1557 | 0.9732 | 0.1541 | -0.1808 | 0.9684 | 0.1566 |
| High jump | -0.2525 | 0.0827 | -0.0691 | -0.288 | 0.0204 | -0.0313 | -0.2629 | 0.0217 | -0.0341 | -0.2881 | 0.0204 | -0.0313 | -0.2629 | 0.0217 | -0.0341 |
| 400 meters | 0.7236 | 0.205 | 0.3746 | 0.7047 | 0.085 | 0.3106 | 0.7156 | 0.1922 | 0.3559 | 0.7047 | 0.085 | 0.3106 | 0.7156 | 0.1922 | 0.3559 |
| 110 hurdles | 0.826 | -0.1223 | -0.0515 | 0.7286 | -0.3996 | -0.1412 | 0.8069 | -0.1939 | -0.0462 | 0.7286 | -0.3996 | -0.1412 | 0.8069 | -0.1939 | -0.0462 |
| Discus throw | -0.0674 | 0.7852 | 0.2645 | -0.1944 | 0.734 | 0.3245 | -0.0928 | 0.7492 | 0.34 | -0.1944 | 0.734 | 0.3245 | -0.0928 | 0.7492 | 0.34 |
| Pole vault | -0.5437 | 0.376 | 0.0319 | -0.5566 | 0.4249 | 0.012 | -0.5645 | 0.3869 | 0.0003 | -0.5566 | 0.425 | 0.012 | -0.5645 | 0.3869 | 0.0003 |
| Javelin | -0.0305 | 0.6143 | -0.0324 | -0.0901 | 0.5883 | -0.1457 | -0.0311 | 0.6273 | -0.0775 | -0.0901 | 0.5883 | -0.1457 | -0.0311 | 0.6273 | -0.0775 |
| 1500 meters | 0.2644 | 0.2189 | 0.9366 | 0.2197 | 0.0977 | 0.9681 | 0.2613 | 0.1712 | 0.9473 | 0.2197 | 0.0977 | 0.9681 | 0.2613 | 0.1712 | 0.9473 |
| Variance | 81.1567 | 95.5540 | 99.2999 | 84.4668 | 96.3167 | 99.4672 | 81.8294 | 93.2335 | 99.3417 | 81.5071 | 96.2013 | 99.3272 | 81.3661 | 96.0465 | 99.3277 |

Table 4: Factor loadings and variance explained with contamination (decathlon data). FA = classical factor analysis; F1 to F3 = Factors 1 to 3.

| Events | FA F1 | FA F2 | FA F3 | MCD F1 | MCD F2 | MCD F3 | MVE F1 | MVE F2 | MVE F3 | S F1 | S F2 | S F3 | MM F1 | MM F2 | MM F3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 100 meters | 0.8205 | 0.3949 | -0.4014 | -0.2414 | 0.4735 | -0.3606 | -0.0595 | 0.765 | -0.3529 | 0.7841 | -0.1998 | 0.0159 | 0.7815 | -0.1475 | 0.0391 |
| Long jump | 0.5049 | 0.7401 | -0.1745 | 0.2186 | -0.5142 | -0.1136 | 0.3154 | -0.164 | 0.8423 | -0.5138 | 0.2044 | -0.1532 | -0.6111 | 0.1021 | -0.2169 |
| Shot-put | -0.0226 | -0.1777 | 0.9813 | 0.8414 | -0.0454 | 0.0635 | 0.9836 | 0.0689 | 0.151 | -0.1103 | 0.9829 | 0.1291 | -0.135 | 0.9746 | 0.1641 |
| High jump | 0.7844 | 0.5068 | -0.3511 | -0.0118 | 0.0053 | 0.9974 | -0.0414 | -0.4912 | 0.0982 | -0.3905 | -0.0428 | -0.0718 | -0.3562 | -0.0716 | -0.0718 |
| 400 meters | -0.5246 | -0.7583 | 0.3015 | -0.0905 | 0.7889 | -0.0167 | 0.2292 | 0.4972 | -0.4495 | 0.701 | 0.0346 | 0.2976 | 0.7227 | 0.1188 | 0.345 |
| 110 hurdles | -0.7314 | -0.6086 | 0.2744 | -0.6019 | 0.4717 | -0.3033 | -0.4414 | 0.8872 | 0.1142 | 0.7165 | -0.4397 | -0.161 | 0.7994 | -0.2643 | -0.0698 |
| Discus throw | 0.8785 | 0.3994 | 0.0079 | 0.9777 | -0.0391 | -0.1277 | 0.8114 | -0.1323 | 0.0499 | -0.1399 | 0.7396 | 0.306 | -0.0546 | 0.7337 | 0.3559 |
| Pole vault | 0.64 | 0.6301 | 0.006 | 0.697 | -0.6181 | 0.2597 | 0.4841 | 0.0191 | 0.5702 | -0.5296 | 0.4562 | 0.016 | -0.5406 | 0.4273 | 0.0136 |
| Javelin | 0.7517 | 0.3797 | 0.0729 | 0.3714 | -0.279 | 0.0313 | 0.3568 | -0.078 | 0.4333 | -0.0699 | 0.5745 | -0.2109 | -0.0136 | 0.6099 | -0.1348 |
| 1500 meters | -0.3972 | -0.6727 | 0.3737 | 0.4136 | 0.5829 | -0.2243 | 0.0911 | 0.2005 | -0.5277 | 0.2513 | 0.0873 | 0.9614 | 0.2803 | 0.121 | 0.9496 |
| Variance | 86.1194 | 97.1504 | 99.4150 | 83.4165 | 96.1903 | 99.4219 | 77.1010 | 93.8564 | 99.2590 | 82.6629 | 95.6706 | 99.2971 | 82.6728 | 96.0101 | 99.2980 |

Table 5: Factor loadings and variance explained with contamination (decathlon data). FA = classical factor analysis; F1 to F3 = Factors 1 to 3.

| Events | FA F1 | FA F2 | FA F3 | MCD F1 | MCD F2 | MCD F3 | MVE F1 | MVE F2 | MVE F3 | S F1 | S F2 | S F3 | MM F1 | MM F2 | MM F3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 100 meters | 0.8691 | 0.4904 | -0.0368 | -0.1134 | 0.6706 | -0.446 | 0.7009 | 0.0745 | -0.1521 | 0.714 | -0.1453 | 0.1199 | 0.746 | -0.1148 | 0.0475 |
| Long jump | 0.034 | -0.9376 | -0.0061 | 0.0085 | -0.0837 | 0.9939 | -0.9683 | 0.2066 | 0.121 | -0.5537 | 0.0722 | -0.2461 | -0.6033 | 0.0232 | -0.1823 |
| Shot-put | 0.2043 | 0.6938 | 0.687 | 0.7946 | -0.207 | 0.1527 | 0.0792 | 0.9217 | 0.2438 | -0.16 | 0.9496 | 0.2603 | -0.1346 | 0.9674 | 0.2028 |
| High jump | 0.9909 | 0.1102 | -0.0458 | -0.1698 | -0.5626 | -0.2637 | -0.1573 | -0.0794 | 0.3571 | -0.4296 | -0.1194 | -0.022 | -0.3625 | -0.0728 | -0.0572 |
| 400 meters | -0.9393 | -0.2168 | 0.058 | -0.1127 | 0.5899 | -0.238 | 0.772 | 0.3198 | 0.112 | 0.6378 | 0.091 | 0.4087 | 0.6843 | 0.1404 | 0.3583 |
| 110 hurdles | 0.3713 | 0.926 | -0.0059 | -0.499 | 0.7301 | -0.199 | 0.7299 | -0.0608 | -0.5644 | 0.896 | -0.274 | -0.07 | 0.8725 | -0.2488 | -0.0418 |
| Discus throw | 0.9514 | 0.0341 | 0.2143 | 0.9811 | 0.0066 | 0.1799 | 0.3661 | 0.7957 | 0.002 | -0.1005 | 0.6783 | 0.4514 | -0.044 | 0.7214 | 0.3867 |
| Pole vault | 0.9208 | 0.2292 | 0.1209 | 0.5999 | -0.5205 | 0.3723 | 0.0137 | 0.2088 | 0.7707 | -0.5742 | 0.3746 | 0.0696 | -0.5612 | 0.3732 | 0.0861 |
| Javelin | 0.7529 | -0.2732 | 0.3037 | 0.2314 | -0.1391 | 0.495 | -0.1848 | 0.711 | -0.0354 | 0.0041 | 0.6482 | -0.1747 | 0.0104 | 0.6254 | -0.1195 |
| 1500 meters | 0.0679 | 0.9026 | 0.1618 | 0.4474 | 0.3793 | -0.0955 | 0.5254 | 0.3973 | 0.5088 | 0.2044 | 0.0952 | 0.849 | 0.2432 | 0.1428 | 0.9568 |
| Variance | 80.4564 | 97.0798 | 98.8565 | 83.4165 | 96.1903 | 99.4219 | 84.7222 | 97.2415 | 99.3698 | 82.3514 | 96.0679 | 99.2817 | 82.1740 | 96.0542 | 99.2839 |

Table 6: Factor loadings and variance explained with contamination (decathlon data). FA = classical factor analysis; F1 to F3 = Factors 1 to 3.

| Events | FA F1 | FA F2 | FA F3 | MCD F1 | MCD F2 | MCD F3 | MVE F1 | MVE F2 | MVE F3 | S F1 | S F2 | S F3 | MM F1 | MM F2 | MM F3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 100 meters | 0.7361 | -0.3822 | 0.5524 | -0.368 | 0.5799 | -0.277 | -0.2318 | 0.5912 | -0.1555 | -0.2225 | 0.5179 | -0.5 | -0.1912 | 0.5074 | 0.5032 |
| Long jump | 0.7453 | 0.6567 | -0.0964 | 0.3318 | -0.3094 | -0.1297 | 0.2108 | -0.342 | 0.4974 | 0.0479 | -0.3829 | 0.4195 | 0.0512 | -0.5132 | -0.376 |
| Shot-put | 0.5685 | 0.3777 | 0.5458 | 0.9156 | 0.0848 | 0.045 | 0.8488 | 0.1056 | -0.0811 | 0.9814 | 0.0923 | 0.1529 | 0.9697 | 0.1317 | -0.1931 |
| High jump | 0.7939 | 0.5988 | 0.0904 | 0.0358 | -0.0801 | 0.9936 | -0.361 | -0.0137 | 0.9298 | -0.1654 | 0.0762 | 0.5981 | -0.087 | -0.0244 | -0.3984 |
| 400 meters | 0.0213 | 0.9655 | -0.1713 | -0.0789 | 0.8983 | 0.0934 | 0.1624 | 0.9667 | -0.1848 | 0.0746 | 0.9735 | -0.2041 | 0.1087 | 0.8037 | 0.3187 |
| 110 hurdles | 0.1814 | -0.2353 | 0.9522 | -0.7241 | 0.4288 | -0.2334 | -0.2806 | 0.4521 | -0.7347 | -0.3069 | 0.3708 | -0.8195 | -0.2071 | 0.319 | 0.9222 |
| Discus throw | 0.9281 | -0.219 | 0.135 | 0.8924 | 0.0512 | -0.155 | 0.8601 | 0.1804 | -0.1059 | 0.7578 | 0.1253 | 0.0599 | 0.7476 | 0.2438 | -0.0943 |
| Pole vault | 0.2253 | -0.9054 | 0.2403 | 0.7077 | -0.491 | 0.2327 | 0.353 | -0.2705 | 0.2487 | 0.3807 | -0.2575 | 0.4866 | 0.3626 | -0.2238 | -0.4876 |
| Javelin | 0.8603 | -0.0587 | -0.1298 | 0.5753 | -0.0169 | 0.0461 | 0.6793 | -0.2356 | 0.2279 | 0.5972 | 0.0034 | -0.0177 | 0.6408 | -0.0837 | 0.0811 |
| 1500 meters | -0.1942 | -0.168 | 0.8788 | 0.2406 | 0.5842 | -0.1573 | 0.1494 | 0.7846 | -0.1483 | 0.1976 | 0.5636 | -0.0196 | 0.1798 | 0.6937 | -0.0106 |
| Variance | 61.8370 | 90.3361 | 97.9152 | 82.3887 | 96.4301 | 99.3959 | 83.0026 | 94.8756 | 99.2213 | 81.4511 | 96.1185 | 99.2509 | 81.1121 | 95.9612 | 99.2627 |

Table 7: Factor loadings and variance explained with contamination (decathlon data). FA = classical factor analysis; F1 to F3 = Factors 1 to 3.

| Events | FA F1 | FA F2 | FA F3 | MCD F1 | MCD F2 | MCD F3 | MVE F1 | MVE F2 | MVE F3 | S F1 | S F2 | S F3 | MM F1 | MM F2 | MM F3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 100 meters | 0.2271 | 0.8423 | -0.4837 | -0.2234 | 0.9354 | -0.2649 | 0.8059 | 0.3505 | -0.0737 | -0.2663 | 0.5838 | 0.4644 | -0.2674 | 0.5714 | 0.4694 |
| Long jump | 0.9853 | -0.1188 | -0.1014 | 0.137 | -0.1943 | 0.9688 | -0.6051 | 0.1176 | -0.6585 | 0.1217 | -0.4522 | -0.3132 | 0.1145 | -0.4662 | -0.3251 |
| Shot-put | 0.1059 | 0.5656 | 0.0908 | 0.9565 | -0.009 | 0.0839 | 0.1025 | 0.9868 | -0.1037 | 0.9926 | 0.0527 | -0.083 | 0.9926 | 0.0557 | -0.0818 |
| High jump | 0.9876 | 0.1088 | 0.0917 | 0.0787 | -0.3484 | -0.2228 | -0.4041 | 0.3754 | -0.1362 | -0.1511 | -0.0443 | -0.6251 | -0.1543 | -0.0524 | -0.6231 |
| 400 meters | 0.2 | -0.2014 | 0.9312 | -0.0237 | 0.6634 | -0.0491 | 0.6612 | 0.1519 | 0.3208 | 0.0114 | 0.8166 | 0.2644 | 0.0099 | 0.7997 | 0.283 |
| 110 hurdles | -0.0665 | 0.9208 | 0.1358 | -0.7479 | 0.5535 | -0.0401 | 0.7661 | -0.3027 | -0.0439 | -0.4866 | 0.2557 | 0.8324 | -0.4821 | 0.2486 | 0.8372 |
| Discus throw | 0.5356 | 0.4535 | -0.2864 | 0.7961 | 0.1123 | 0.0851 | 0.0544 | 0.7432 | 0.0315 | 0.7406 | 0.1834 | -0.0854 | 0.7396 | 0.1996 | -0.0792 |
| Pole vault | -0.3136 | 0.5604 | -0.008 | 0.6185 | -0.204 | 0.1709 | -0.1932 | 0.3917 | 0.3471 | 0.4819 | -0.1615 | -0.3935 | 0.4814 | -0.1468 | -0.3973 |
| Javelin | 0.5483 | -0.0373 | 0.0398 | 0.4957 | 0.0084 | 0.563 | -0.112 | 0.2834 | -0.5158 | 0.5498 | -0.1753 | 0.0652 | 0.5479 | -0.1896 | 0.0607 |
| 1500 meters | -0.2032 | 0.4102 | 0.8863 | 0.1865 | 0.2656 | -0.1161 | 0.0358 | 0.1903 | 0.967 | 0.0996 | 0.6348 | -0.1138 | 0.1026 | 0.655 | -0.109 |
| Variance | 83.7221 | 91.3759 | 96.6906 | 79.8832 | 95.2993 | 99.2132 | 66.8015 | 89.2205 | 98.8539 | 81.1217 | 95.4153 | 99.1854 | 81.1973 | 95.3950 | 99.1796 |