E-print Network


Frequency Warping for VTLN and Speaker Adaptation by Linear Transformation of Standard MFCC

Sankaran Panchapagesan, Abeer Alwan
Department of Electrical Engineering, The Henry Samueli School of Engineering
and Applied Science, 66-147E Engr. IV, 405 Hilgard Avenue, Box 951594,
University of California, Los Angeles, CA 90095-1594, USA

Summary:
Vocal Tract Length Normalization (VTLN) for standard filterbank-based Mel
Frequency Cepstral Coefficient (MFCC) features is usually implemented by warping
the center frequencies of the Mel filterbank, and the warping factor is estimated
using the maximum likelihood score (MLS) criterion (Lee and Rose, 1998). A linear
transform (LT) equivalent for frequency warping (FW) would enable more efficient
MLS estimation (Umesh et al., 2005). We recently proposed a novel LT to perform
FW for VTLN and model adaptation with standard MFCC features (Panchapagesan,
2006). In this paper, we present the mathematical derivation of the LT and give
a compact formula to calculate it for any FW function. We also show that our LT is
very closely related to previously proposed LTs for FW (McDonough, 2000; Pitz et
al., 2001; Umesh et al., 2005), and these LTs for FW are all found to be numerically
almost identical for the sine-log all-pass transform (SLAPT) warping functions. Our
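
As an illustration of the conventional approach the abstract starts from (warping the center frequencies of the Mel filterbank), the following is a minimal sketch and not the authors' linear-transform method. Everything specific in it is an assumption: the Mel formula 2595 log10(1 + f/700), the piecewise-linear warping function with a 0.85 knee, the 26-filter bank, and all function names are chosen for illustration only and do not come from the paper.

```python
import math


def hz_to_mel(f_hz):
    # Assumed Mel-scale convention: 2595 * log10(1 + f / 700).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(num_filters, f_low, f_high):
    """Center frequencies (Hz) of filters equally spaced on the Mel scale."""
    mel_low, mel_high = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (mel_high - mel_low) / (num_filters + 1)
    return [mel_to_hz(mel_low + step * (i + 1)) for i in range(num_filters)]


def piecewise_linear_warp(f_hz, alpha, f_max, knee=0.85):
    """Hypothetical warping function: scale by alpha up to a knee frequency,
    then continue linearly so that f_max still maps to f_max."""
    # Shrink the knee when alpha > 1 so the first segment stays below f_max.
    f0 = knee * f_max * min(1.0, 1.0 / alpha)
    if f_hz <= f0:
        return alpha * f_hz
    slope = (f_max - alpha * f0) / (f_max - f0)
    return alpha * f0 + slope * (f_hz - f0)


def warped_center_frequencies(num_filters, f_low, f_high, alpha):
    """Apply the warping function to each Mel filter center frequency."""
    centers = mel_center_frequencies(num_filters, f_low, f_high)
    return [piecewise_linear_warp(f, alpha, f_high) for f in centers]


if __name__ == "__main__":
    # Compare unwarped and warped centers for one illustrative warping factor.
    plain = mel_center_frequencies(26, 0.0, 8000.0)
    warped = warped_center_frequencies(26, 0.0, 8000.0, alpha=1.1)
    for original, moved in zip(plain, warped):
        print(f"{original:8.1f} Hz -> {moved:8.1f} Hz")
```

In a maximum-likelihood setup of the kind the abstract cites, one would typically recompute features for each candidate `alpha` on a grid and keep the factor with the best acoustic score. That repeated feature extraction is the cost a linear-transform equivalent is meant to avoid.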


Source: Alwan, Abeer - Electrical Engineering Department, University of California at Los Angeles


Collections: Computer Technologies and Information Sciences