EURASIP Journal on Applied Signal Processing 2003:2, 128-139
© 2003 Hindawi Publishing Corporation

A Statistical Approach to Automatic Speech Summarization

Chiori Hori
Department of Computer Science, Tokyo Institute of Technology, 2-12-1 O-okayama, Meguro-ku, Tokyo 152-8552, Japan
Email: chiori@

Sadaoki Furui
Department of Computer Science, Tokyo Institute of Technology, 2-12-1 O-okayama, Meguro-ku, Tokyo 152-8552, Japan
Email: furui@

Rob Malkin
Interactive Systems Labs, Carnegie Mellon University, Pittsburgh, PA 15213, USA
Email: malkin@

Hua Yu
Interactive Systems Labs, Carnegie Mellon University, Pittsburgh, PA 15213, USA
Email: hua@

Alex Waibel
Interactive Systems Labs, Carnegie Mellon University, Pittsburgh, PA 15213, USA
Email: ahw@

Received 20 March 2002 and in revised form 11 November 2002

This paper proposes a statistical approach to automatic speech summarization. In our method, a set of words maximizing a summarization score, which indicates the appropriateness of the summarization, is extracted from automatically transcribed speech and then concatenated to create a summary. The extraction process is performed using a dynamic programming (DP) technique based on a target compression ratio. In this paper, we demonstrate how an English news broadcast transcribed by a speech recognizer is automatically summarized. We adapted our method, which was originally proposed for Japanese, to English by modifying the model for estimating word concatenation probabilities based on a dependency structure in the original speech given by a stochastic dependency context free grammar (SDCFG). We also propose a method of summarizing multiple utterances using a two-level DP technique.
The automatically summarized sentences are evaluated by summarization accuracy based on a comparison with a manual summary of speech that has been correctly transcribed by human subjects. Our experimental results indicate that the method we propose can effectively extract .
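
To make the extraction step concrete, the following is a minimal sketch of DP-based word extraction under a target compression ratio. It is an illustration under stated assumptions, not the authors' implementation: the function name `summarize`, the per-word scores `word_score`, and the pairwise function `pair_score` are hypothetical stand-ins for the summarization score and the word concatenation model described above, and the additive form of the total score is assumed for simplicity.

```python
from typing import Callable, List, Sequence


def summarize(
    words: Sequence[str],
    word_score: Sequence[float],
    pair_score: Callable[[int, int], float],
    ratio: float,
) -> List[str]:
    """Pick round(ratio * len(words)) words, in original order, that
    maximize the sum of per-word scores plus pairwise scores between
    consecutive selected words. All scores are hypothetical inputs."""
    n = len(words)
    if n == 0:
        return []
    m = min(n, max(1, round(n * ratio)))  # target summary length
    neg_inf = float("-inf")

    # best[k][j]: best score of a (k + 1)-word summary whose last word is j.
    best = [[neg_inf] * n for _ in range(m)]
    back = [[-1] * n for _ in range(m)]
    for j in range(n):
        best[0][j] = word_score[j]

    for k in range(1, m):
        for j in range(k, n):          # word j needs k earlier words
            for i in range(k - 1, j):  # i is the previous selected word
                if best[k - 1][i] == neg_inf:
                    continue
                s = best[k - 1][i] + pair_score(i, j) + word_score[j]
                if s > best[k][j]:
                    best[k][j] = s
                    back[k][j] = i

    # Trace back from the best final word.
    j = max(range(m - 1, n), key=lambda idx: best[m - 1][idx])
    picked = []
    for k in range(m - 1, -1, -1):
        picked.append(j)
        j = back[k][j]
    picked.reverse()
    return [words[i] for i in picked]
```

Calling `summarize` with a recognized word sequence, one score per word, a pairwise scoring function, and a ratio such as 0.4 returns the selected words in their original order. The table has one row per summary position and one column per input word, so the search costs on the order of M·N² score evaluations for N input words and M output words.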