Multimodal Translation System Using Texture-Mapped Lip-Sync Images for Video Mail and Automatic Dubbing Applications


EURASIP Journal on Applied Signal Processing 2004:11, 1637-1647
© 2004 Hindawi Publishing Corporation

Shigeo Morishima
School of Science and Engineering, Waseda University, Tokyo 169-8555, Japan
Email: shigeo@waseda.jp
ATR Spoken Language Translation Research Laboratories, Kyoto 619-0288, Japan

Satoshi Nakamura
ATR Spoken Language Translation Research Laboratories, Kyoto 619-0288, Japan
Email: satoshi.nakamura@atr.jp

Received 25 November 2002; Revised 16 January 2004

We introduce a multimodal English-to-Japanese and Japanese-to-English translation system that also translates the speaker's speech motion by synchronizing it to the translated speech. The system introduces both a face synthesis technique that can generate any viseme lip shape and a face tracking technique that can estimate the original position and rotation of a speaker's face in an image sequence. To retain the speaker's facial expression, we substitute only the speech organ's image with the synthesized one, which is made by a 3D wire-frame model that is adaptable to any speaker. Our approach provides translated image synthesis with an extremely small database. The motion of the face is tracked from a video image by template matching. In this system, the translation and rotation of the face are detected by using a 3D personal face model whose texture is captured from a video frame. We also propose a method to customize the personal face model by using our GUI tool. By combining these techniques with the translated voice synthesis technique, an automatic multimodal translation can be achieved that is suitable for video mail or for automatic dubbing into other languages.

Keywords and phrases: audio-visual speech translation, lip-sync, talking head, face tracking with 3D template, video mail and automatic dubbing, texture-mapped facial animation, personal face model.

1. INTRODUCTION

The facial
