Blind multi-channel speech separation using spatial estimation in two-speaker environments

Journal of Science and Technology, Volume 48, Issue 4, 2010, pp. 109-119

HAI QUANG DAM

ABSTRACT

This paper investigates the problem of separating two speech signals from their mixture in a room environment without source localization information. Because no source information is available, the use of a spatial detector comes at the expense of permutation ambiguity. To resolve this ambiguity, a correlation-based permutation alignment algorithm is employed to group the beamformer outputs into the correct sources. Evaluations using recordings from a real room environment show that the proposed beamformer offers a good interference suppression level whilst maintaining a low distortion level of the desired source.

1. INTRODUCTION

In recent years, microphone arrays have seen increasing application for the acquisition of speech in hands-free, distant-talker scenarios. Microphone arrays based on beamforming are especially promising in terms of interference reduction. These systems can be used to reduce noise in hearing aids, teleconferencing systems, hands-free microphones in automobiles, computer terminals, speaker phones, and speech recognition systems. Multichannel optimum filtering requires knowledge of the noise statistics, the environment, and the source statistics.
The beamformer coefficients are optimized so that a focused beam is steered toward the desired source direction whilst contributions from other directions are suppressed [1, 2]. The filter weights are designed using information about the location of the target signal and the array geometry. From those parameters, spatial, spectral and temporal filters are formed to meet the beamforming requirements [3, 4]. Most of the beamformers considered so far require information about the spatial correlation matrix of the desired source. This information, however, may not be readily available.
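To make the conventional setting concrete, the following is a minimal sketch of a narrowband delay-and-sum beamformer, the simplest case in which the weights follow directly from a known source direction and array geometry. It is an illustration only and is not the blind beamformer proposed in this paper. The uniform linear array, the far-field plane-wave model, the speed of sound, and all function names are assumptions introduced for the example.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def steering_vector(num_mics, spacing_m, freq_hz, angle_deg):
    """Far-field steering vector of a uniform linear array (assumed geometry)."""
    # Delay between adjacent microphones for a plane wave arriving from
    # angle_deg, measured from broadside (0 degrees = perpendicular to the array).
    delay = spacing_m * math.sin(math.radians(angle_deg)) / SPEED_OF_SOUND
    return [cmath.exp(-2j * math.pi * freq_hz * m * delay) for m in range(num_mics)]


def delay_and_sum_weights(num_mics, spacing_m, freq_hz, look_deg):
    """Weights w = d(look) / M, which give unit gain toward the look direction."""
    d = steering_vector(num_mics, spacing_m, freq_hz, look_deg)
    return [x / num_mics for x in d]


def response(weights, num_mics, spacing_m, freq_hz, angle_deg):
    """Magnitude of the beamformer output w^H d(angle) for a unit plane wave."""
    d = steering_vector(num_mics, spacing_m, freq_hz, angle_deg)
    return abs(sum(w.conjugate() * x for w, x in zip(weights, d)))


if __name__ == "__main__":
    # Hypothetical setup: 4 microphones, 5 cm apart, evaluated at 1715 Hz,
    # with the beam steered to broadside (0 degrees).
    w = delay_and_sum_weights(4, 0.05, 1715.0, 0.0)
    for angle in (0.0, 30.0, 60.0, 90.0):
        print(angle, round(response(w, 4, 0.05, 1715.0, angle), 3))
```

By construction the response toward the look direction is 1, while signals from other directions are attenuated by an amount that depends on frequency and microphone spacing. The sketch relies on knowing `look_deg` in advance, which is exactly the localization information that is unavailable in the blind setting considered here.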
