EURASIP Journal on Applied Signal Processing 2004:17, 2626-2639
© 2004 Hindawi Publishing Corporation

A Model-Selection-Based Self-Splitting Gaussian Mixture Learning with Application to Speaker Identification

Shih-Sian Cheng
Institute of Information Science, Academia Sinica, Taipei 115, Taiwan
Email: sscheng@
Department of Computer Science and Information Engineering, National Chiao-Tung University, Hsinchu 300, Taiwan

Hsin-Min Wang
Institute of Information Science, Academia Sinica, Taipei 115, Taiwan
Email: whm@

Hsin-Chia Fu
Department of Computer Science and Information Engineering, National Chiao-Tung University, Hsinchu 300, Taiwan
Email: hcfu@

Received 3 December 2003; Revised 2 July 2004; Recommended for Publication by Kenneth Barner

We propose a self-splitting Gaussian mixture learning (SGML) algorithm for Gaussian mixture modelling. The SGML algorithm is deterministic and is able to find an appropriate number of components of the Gaussian mixture model (GMM) based on a self-splitting validity measure, the Bayesian information criterion (BIC). It starts with a single component in the feature space and splits adaptively during the learning process until the most appropriate number of components is found. The SGML algorithm also performs well in learning the GMM with a given component number. In our experiments on clustering of a synthetic data set and on the text-independent speaker identification task, we have observed the ability of the SGML algorithm to perform model-based clustering and to determine automatically the model complexity of the speaker GMMs for speaker identification.

Keywords and phrases: unsupervised learning, Gaussian mixture modelling, Bayesian information criterion, speaker identification.

1. INTRODUCTION

In many applications, data clustering techniques have been applied to discover and extract the hidden structure in a data set, and thus the structural relationships between individual data points can be detected. Data clustering is also known