6 Principal Component Analysis and Whitening

Principal component analysis (PCA) and the closely related Karhunen-Loève transform, or the Hotelling transform, are classic techniques in statistical data analysis, feature extraction, and data compression, stemming from the early work of Pearson [364]. Given a set of multivariate measurements, the purpose is to find a smaller set of variables with less redundancy, that would give as good a representation as possible. This goal is related to the goal of independent component analysis (ICA). However, in PCA the redundancy is measured by correlations between data elements, while in ICA the much richer concept of independence is used, and in ICA the reduction of the number of variables is given less emphasis. Using only the correlations as in PCA has the advantage that the analysis can be based on second-order statistics only. In connection with ICA, PCA is a useful preprocessing step.

The basic PCA problem is outlined in this chapter. Both the closed-form solution and on-line learning algorithms for PCA are reviewed. Next, the related linear statistical technique of factor analysis is discussed. The chapter is concluded by presenting how data can be preprocessed by whitening, removing the effect of first- and second-order statistics, which is very helpful as the first step in ICA.

PRINCIPAL COMPONENTS

The starting point for PCA is a random vector x with n elements. There is available a sample x(1), ..., x(T) from this random vector. No explicit assumptions on the probability density of the vectors are made in PCA, as long as the first- and second-order statistics are known or can be estimated from the sample. Also, no generative model is assumed for vector x. Typically the elements of x are measurements like pixel gray levels or values of