Hindawi Publishing Corporation
EURASIP Journal on Audio, Speech, and Music Processing
Volume 2009, Article ID 174192, 13 pages
doi 2009 174192

Research Article
Optimization of an Image-Based Talking Head System

Kang Liu and Joern Ostermann

Institut für Informationsverarbeitung, Leibniz Universität Hannover, Appelstr. 9A, 30167 Hannover, Germany

Correspondence should be addressed to Kang Liu, kang@

Received 25 February 2009; Accepted 3 July 2009

Recommended by Gerard Bailly

This paper presents an image-based talking head system that consists of two parts: analysis and synthesis. The audiovisual analysis part creates a face model of a recorded human subject, which is composed of a personalized 3D mask as well as a large database of mouth images and their related information. The synthesis part generates natural-looking facial animations from phonetic transcripts of text. A critical issue of the synthesis is the unit selection, which selects and concatenates the appropriate mouth images from the database such that they match the spoken words of the talking head. Selection is based on lip synchronization and the similarity of consecutive images. The unit selection is refined in this paper, and Pareto optimization is used to train the unit selection. Experimental results of subjective tests show that most people cannot distinguish our facial animations from real videos.

Copyright © 2009 K. Liu and J. Ostermann. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. Introduction

The development of modern human-computer interfaces [1-3], such as Web-based information services, E-commerce, and E-learning, will use facial animation techniques combined with dialog systems extensively in the future. Figure 1 shows a typical application of a talking head for E-commerce. If the E-commerce Website is visited by a user

