MPEG-7 Audio and Beyond: Audio Content Indexing and Retrieval, part 5


4 SPOKEN CONTENT

In this chapter we use the well-defined MPEG-7 Spoken Content description standard as an example to illustrate challenges in this domain. The audio part of MPEG-7 contains a SpokenContent high-level tool targeted at spoken data management applications. The MPEG-7 SpokenContent tool provides a standardized representation of the output of an automatic speech recognition (ASR) system, i.e. of the semantic information (the spoken content) extracted by an ASR system from a spoken signal. The SpokenContent description attempts to be memory efficient and flexible enough to make currently unforeseen applications possible in the future. It consists of a compact representation of multiple word and/or sub-word hypotheses produced by an ASR engine. It also includes a header that contains information about the recognizer itself and the speaker's identity.

How the SpokenContent description should be extracted and used is not part of the standard. However, this chapter begins with a short introduction to ASR systems. The structure of the MPEG-7 SpokenContent description itself is presented in detail in the second section. The third section deals with the main field of application of the SpokenContent tool, called spoken document retrieval (SDR), which aims at retrieving information in speech signals based on their extracted contents. The contribution of the MPEG-7 SpokenContent tool to the standardization and development of future SDR applications is discussed at the end of the chapter.

4.2 AUTOMATIC SPEECH RECOGNITION

The MPEG-7 SpokenContent description is a normalized representation of the output of an ASR system. A detailed presentation of the ASR field is beyond the scope of this book.
This section provides a basic overview of the main speech recognition principles. A large amount of literature has been published on the subject in the past decades. An excellent overview of ASR is given in Rabiner and Juang (1993). Although the extraction of the MPEG-7 SpokenContent description is non-normative, this ...
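
The SpokenContent description introduced earlier in this chapter (a header with information about the recognizer and the speaker, plus a compact set of competing word hypotheses) can be pictured with a small sketch. The Python below is only an illustration under stated assumptions: the class names `Header`, `Link`, and `Lattice`, their fields, and the `best_path` helper are hypothetical and are not the MPEG-7 schema or any standardized API. It stores hypotheses as links between numbered nodes and recovers the most probable word sequence.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Header:
    """Hypothetical header: who produced the hypotheses and who spoke."""
    recognizer: str  # free-text description of the ASR engine (assumed field)
    speaker: str     # speaker identity label (assumed field)


@dataclass
class Link:
    """One word hypothesis spanning two lattice nodes."""
    start: int   # node where the hypothesis begins
    end: int     # node where it ends; assumed to be greater than start
    word: str
    prob: float  # probability assigned by the recognizer


@dataclass
class Lattice:
    header: Header
    links: List[Link] = field(default_factory=list)

    def best_path(self, first: int, last: int) -> List[str]:
        """Return the most probable word sequence from node first to node last.

        Assumes every link runs from a lower to a higher node number, so
        processing links in order of their start node visits the graph in
        topological order.
        """
        best: Dict[int, Tuple[float, List[str]]] = {first: (1.0, [])}
        for link in sorted(self.links, key=lambda l: l.start):
            if link.start not in best:
                continue  # start node is unreachable from `first`
            score, words = best[link.start]
            candidate = (score * link.prob, words + [link.word])
            if link.end not in best or candidate[0] > best[link.end][0]:
                best[link.end] = candidate
        return best[last][1]


# Toy example with two competing readings of the same signal.
lattice = Lattice(
    header=Header(recognizer="demo-asr", speaker="speaker-1"),
    links=[
        Link(0, 2, "recognize", 0.6),
        Link(0, 1, "wreck", 0.4),
        Link(1, 2, "a", 0.9),
        Link(2, 3, "speech", 0.7),
        Link(2, 3, "beach", 0.3),
    ],
)
print(lattice.best_path(0, 3))
```

Keeping the alternative links instead of a single transcript is what makes such a representation useful for retrieval: a query term can still match a hypothesis that was not the recognizer's first choice.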
