Mathematics report: "Voice activity detection based on conjugate subspace matching pursuit and likelihood ratio test"

A collection of scientific research reports in mathematics published in international mathematics journals. Topic: Voice activity detection based on conjugate subspace matching pursuit and likelihood ratio test

Deng and Han, EURASIP Journal on Audio, Speech, and Music Processing 2011, 2011:12 (a SpringerOpen Journal). RESEARCH, Open Access.

Shiwen Deng (1,2) and Jiqing Han (1)

Abstract

Most voice activity detection (VAD) schemes operate in the discrete Fourier transform (DFT) domain, classifying each sound frame as speech or noise based on its DFT coefficients. These coefficients serve as the features for VAD, so their robustness strongly affects the performance of the VAD scheme. However, some shortcomings of modeling a signal in the DFT domain can easily degrade the performance of a VAD in a noisy environment. Instead of using the DFT coefficients, this article presents a novel approach that uses the complex coefficients derived from a complex exponential atomic decomposition of the signal. Using a goodness-of-fit test, we show that these coefficients are suitable to be modeled by a Gaussian probability distribution. A statistical model is then employed to derive the decision rule from the likelihood ratio test. According to the experimental results, the proposed VAD method performs better than the VAD based on the DFT coefficients in various noise environments.
Keywords: voice activity detection, matching pursuit, likelihood ratio test, complex exponential dictionary

1 Introduction

Voice activity detection (VAD) refers to the problem of distinguishing active speech from non-speech regions in a given audio stream. It has become an indispensable component of many speech processing applications and modern speech communication systems [1-3], such as robust speech recognition, speech enhancement, and coding systems. Various traditional VAD algorithms based on energy, zero-crossing rate, and spectral difference were proposed in the earlier literature [1,4,5].
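To make the traditional features mentioned above concrete, the following is a minimal Python sketch of per-frame short-time energy and zero-crossing rate, with a simple energy threshold as the speech/non-speech decision. It is an illustration only and is not the method proposed in this article; the function names (frame_features, energy_vad), the frame length, the hop size, and the threshold_ratio rule are all assumptions chosen for the example.

```python
def frame_features(signal, frame_len=256, hop=128):
    """Return (energies, zcrs): short-time energy and zero-crossing rate per frame.

    frame_len and hop are illustrative defaults, not values from the article.
    """
    energies, zcrs = [], []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        # Short-time energy: mean squared amplitude of the frame.
        energies.append(sum(s * s for s in frame) / frame_len)
        # Zero-crossing rate: fraction of adjacent sample pairs whose signs differ.
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
        zcrs.append(crossings / (frame_len - 1))
    return energies, zcrs


def energy_vad(signal, frame_len=256, hop=128, threshold_ratio=0.1):
    """Flag a frame as active when its energy exceeds a fraction of the peak energy.

    The relative threshold is a hypothetical rule chosen for simplicity.
    """
    energies, _ = frame_features(signal, frame_len, hop)
    if not energies:
        return []
    threshold = threshold_ratio * max(energies)
    return [e > threshold for e in energies]


if __name__ == "__main__":
    # Synthetic input: 512 samples of silence followed by 512 alternating samples.
    sig = [0.0] * 512 + [1.0, -1.0] * 256
    print(energy_vad(sig, frame_len=256, hop=256))
```

A fixed threshold of this kind is fragile when the background noise level changes, which is one reason statistical decision rules such as the likelihood ratio test are used instead.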
