Summary: ROBUST SPEECH SEPARATION USING TIME-FREQUENCY MASKING
Parham Aarabi, Guangji Shi, and Omid Jahromi
The Artificial Perception Laboratory
University of Toronto
10 Kings College Road, Ontario, Canada, M5S 3G4
{parham@ecf, guangji@comm, omidj@control}.utoronto.ca
ABSTRACT
A multi-microphone time-frequency speech masking technique
is proposed. This technique utilizes both the time-frequency
magnitude and phase information in order to estimate the
Signal-to-Noise Ratio (SNR) maximizing masking coefficients
for each time-frequency block, given that the direction (or
alternatively, the time-delay of arrival) of the speaker of
interest is known. Using this masking algorithm, speech
features (such as formants) from the direction of interest are
preserved while features from other directions are severely
degraded. Digit recognition experiments indicate that the
proposed technique can result in a substantial increase in
the digit recognition accuracy rate. At 0 dB, for example,
the proposed technique results in a digit recog
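
As a rough illustration of masking time-frequency blocks from a known time-delay of arrival, the sketch below weights each frequency bin of one frame by how well the inter-microphone phase difference matches the delay expected for the speaker of interest. It is a simplified phase-only heuristic and not the SNR-maximizing estimator described in the abstract. The function names, the two-microphone setup, the mask form 1 / (1 + gamma * theta^2), and the value of gamma are all assumptions made for illustration.

```python
import cmath


def phase_error_mask(X1, X2, omegas, tau, gamma=5.0):
    """Return one mask value in (0, 1] per frequency bin of a single frame.

    X1, X2 : complex spectra of the same frame at microphones 1 and 2
    omegas : angular frequency of each bin in rad/s
    tau    : assumed delay (s) of the target at microphone 2 relative to 1
    gamma  : hypothetical sharpness parameter (larger = more selective)
    """
    mask = []
    for x1, x2, w in zip(X1, X2, omegas):
        # Observed phase difference minus the phase shift expected for the
        # target delay; cmath.phase wraps the result to [-pi, pi].
        theta = cmath.phase(x1 * x2.conjugate() * cmath.exp(-1j * w * tau))
        mask.append(1.0 / (1.0 + gamma * theta * theta))
    return mask


def apply_mask(X, mask):
    """Scale each time-frequency coefficient by its mask value."""
    return [m * x for m, x in zip(mask, X)]


# Toy frame with three bins (all values are made up for the example).
omegas = [1000.0, 2000.0, 3000.0]
tau = 2.5e-4
X1 = [1 + 1j, 0.5 - 2j, -1.5 + 0.3j]

# A source arriving with the assumed delay, and one from another direction.
target_X2 = [x * cmath.exp(-1j * w * tau) for x, w in zip(X1, omegas)]
interferer_X2 = [x * cmath.exp(1j * w * 4e-4) for x, w in zip(X1, omegas)]

on_target = phase_error_mask(X1, target_X2, omegas, tau)
off_target = phase_error_mask(X1, interferer_X2, omegas, tau)
masked = apply_mask(X1, off_target)
```

In this toy case the bins consistent with the assumed delay keep a mask value close to 1, while bins dominated by a source from another direction are attenuated. A practical system would compute the spectra with a short-time Fourier transform over many frames and would also use magnitude information, as the abstract indicates.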