Independent Component Analysis. Aapo Hyvärinen, Juha Karhunen, Erkki Oja. Copyright 2001 John Wiley & Sons, Inc. ISBNs: 0-471-40540-X (Hardback); 0-471-22131-7 (Electronic)

1 Introduction

Independent component analysis (ICA) is a method for finding underlying factors or components from multivariate (multidimensional) statistical data. What distinguishes ICA from other methods is that it looks for components that are both statistically independent and nongaussian. Here we briefly introduce the basic concepts, applications, and estimation principles of ICA.

LINEAR REPRESENTATION OF MULTIVARIATE DATA

The general statistical setting

A long-standing problem in statistics and related areas is how to find a suitable representation of multivariate data. Representation here means that we somehow transform the data so that its essential structure is made more visible or accessible.

In neural computation, this fundamental problem belongs to the area of unsupervised learning, since the representation must be learned from the data itself without any external input from a supervising "teacher". A good representation is also a central goal of many techniques in data mining and exploratory data analysis. In signal processing, the same problem can be found in feature extraction, and also in the source separation problem that will be considered below.

Let us assume that the data consists of a number of variables that we have observed together. Let us denote the number of variables by $m$ and the number of observations by $T$. We can then denote the data by $x_i(t)$, where the indices take the values $i = 1, \ldots, m$ and $t = 1, \ldots, T$.
The dimensions $m$ and $T$ can be very large.

A very general formulation of the problem can be stated as follows: What could be a function from an $m$-dimensional space to an $n$-dimensional space such that the transformed variables give information on the data that is otherwise hidden in the large data set? That is, the transformed variables should be the underlying factors or components.
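To make the notation concrete, the following is a minimal Python sketch of this setting, not a method taken from the text. It stores the data as an m-by-T table whose entry `x[i][t]` plays the role of $x_i(t)$, and applies a function from the m-dimensional space to an n-dimensional space to each observation. The helper names `transform` and `make_linear_map`, the weight table `W`, and the random data are all hypothetical choices made for illustration. The linear map is only one possible form of the function, and its weights here are arbitrary rather than estimated from the data.

```python
import random


def transform(data, f):
    """Apply f to every observation (column) of an m x T data table.

    data[i][t] holds x_i(t). The result is an n x T table with one row
    per transformed variable.
    """
    m, T = len(data), len(data[0])
    # Each observation is the m-dimensional vector (x_1(t), ..., x_m(t)).
    columns = [f([data[i][t] for i in range(m)]) for t in range(T)]
    n = len(columns[0])
    return [[columns[t][j] for t in range(T)] for j in range(n)]


def make_linear_map(weights):
    """Return f(x) = W x for an n x m weight table (illustrative choice of f)."""
    def f(x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return f


random.seed(0)
m, T, n = 3, 5, 2

# Synthetic data: m variables observed together T times (assumption).
x = [[random.gauss(0.0, 1.0) for _ in range(T)] for _ in range(m)]

# Arbitrary weights: y_1 copies x_1, y_2 averages x_2 and x_3.
W = [[1.0, 0.0, 0.0],
     [0.0, 0.5, 0.5]]

y = transform(x, make_linear_map(W))
print(len(y), len(y[0]))  # number of transformed variables, number of observations
```

Here the weights are fixed by hand only to show the shapes involved. Finding a function whose outputs actually reveal hidden structure in the data is the problem the text poses.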