SPEECH CODING ALGORITHMS P2

Figure: Example of a speech waveform of the word "problems" uttered by a male subject. Expanded views of a voiced frame and an unvoiced frame are shown, with the magnitude of the Fourier transform plotted. The frame is 256 samples in length.

... to duplicate many of the behaviors and characteristics of real-life phenomena. However, it is incorrect to assume that the model and the real world that it represents are identical in every way. For the model to be successful, it must be able to replicate, partially or completely, the behaviors of the particular object or fact that it intends to capture or simulate. The model may be a physical one, such as a model airplane, or a mathematical one, such as a formula.

The human speech production system can be modeled using a rather simple structure: the lungs, which generate the air or energy that excites the vocal tract, are represented by a white noise source. The acoustic path inside the body, with all its components, is associated with a time-varying filter. The concept is illustrated in the figure below. This simple model is indeed the core structure of many speech coding algorithms, as can be seen later in this book.

Figure: Correspondence between the human speech production system and a simplified system based on a time-varying filter. (Diagram labels: lungs, trachea, pharyngeal cavity, nasal cavity, oral cavity, nostril, mouth, output speech.)

By using a system identification technique called linear prediction (Chapter 4), it is possible to estimate the parameters of the time-varying filter from the observed signal.

The assumption of the model is that the energy distribution of the speech signal in the frequency domain is due entirely to the time-varying filter, with the lungs producing an excitation signal that has a flat spectrum (white noise). This model is rather efficient, and many analytical tools have already been developed around the concept. The idea is the well-known autoregressive model, reviewed in Chapter 3.

A Glimpse of Parametric Speech Coding

Consider the speech frame ...
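As an aside on the source-filter model and linear prediction described earlier in this section, the following is a minimal Python sketch, not the book's implementation. It assumes the filter is held constant over one frame (the text's filter is time-varying), an all-pole form s[t] = x[t] - a1*s[t-1] - ... - aM*s[t-M] with that sign convention, and the autocorrelation method with the Levinson-Durbin recursion as the estimator. The function names, the coefficient values and the frame length are illustrative choices, not taken from the source.

```python
import random


def synthesize(a, n, seed=0):
    """Drive an all-pole filter with unit-variance white Gaussian noise.

    Assumed convention: s[t] = x[t] - sum_{i=1..M} a[i-1] * s[t-i].
    """
    rng = random.Random(seed)
    s = []
    for t in range(n):
        acc = rng.gauss(0.0, 1.0)  # white-noise excitation sample x[t]
        for i, ai in enumerate(a, start=1):
            if t - i >= 0:
                acc -= ai * s[t - i]
        s.append(acc)
    return s


def autocorrelation(s, max_lag):
    """Biased autocorrelation estimates R[0..max_lag] of the signal s."""
    n = len(s)
    return [
        sum(s[t] * s[t - lag] for t in range(lag, n)) / n
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the normal equations for the predictor coefficients.

    Returns (a, err), where a uses the same sign convention as synthesize()
    and err is the estimated variance of the prediction error.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m]
        for i in range(1, m):
            acc += a[i] * r[m - i]
        k = -acc / err  # reflection coefficient for stage m
        new_a = a[:]
        for i in range(1, m):
            new_a[i] = a[i] + k * a[m - i]
        new_a[m] = k
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


if __name__ == "__main__":
    true_a = [-1.5, 0.7]  # hypothetical stable second-order filter
    frame = synthesize(true_a, 20000, seed=1)
    est_a, err = levinson_durbin(autocorrelation(frame, 2), 2)
    print("true coefficients:     ", true_a)
    print("estimated coefficients:", est_a)
    print("prediction error variance:", err)
```

With a long enough frame, the estimated coefficients should approach the ones used for synthesis and the prediction error variance should approach that of the excitation. This reflects the model's assumption that the spectral shape comes entirely from the filter while the source is spectrally flat.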
