 
Speaker Adaptation with Limited Data using Regression-Tree based Spectral Peak Alignment
 

Summary:
Shizhen Wang, Xiaodong Cui, Member, IEEE, and Abeer Alwan, Senior Member, IEEE
Abstract-- Spectral mismatch between training and testing utterances
can cause significant degradation in the performance of automatic
speech recognition (ASR) systems. Speaker adaptation and speaker
normalization techniques are usually applied to address this issue.
One way to reduce spectral mismatch is to reshape the spectrum by
aligning corresponding formant peaks. There are various levels of
mismatch in formant structures. In this paper, regression-tree based
phoneme- and state-level spectral peak alignment is proposed for rapid
speaker adaptation using linearization of the vocal tract length
normalization (VTLN) technique. This method is investigated in a
maximum likelihood linear regression (MLLR)-like framework, taking
advantage of both the efficiency of frequency warping (VTLN) and the
reliability of statistical estimations (MLLR). Two different regression
classes are investigated: one based on phonetic classes (using
combined knowledge and data-driven techniques) and the other
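
To make the idea of aligning formant peaks by frequency warping concrete, the following is a minimal sketch of a single global linear warp applied to a toy spectrum. It is not the regression-tree method of the paper, which aligns peaks per phoneme or per state; the function names, the 10 Hz grid, and the Gaussian "formant" are assumptions made purely for illustration.

```python
import math


def interpolate(x, xs, ys):
    """Linear interpolation on an evenly spaced, ascending grid xs, clamped at both ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    step = xs[1] - xs[0]
    i = min(int((x - xs[0]) / step), len(xs) - 2)
    t = (x - xs[i]) / step
    return ys[i] * (1.0 - t) + ys[i + 1] * t


def peak_alignment_factor(reference_peak_hz, test_peak_hz):
    """Hypothetical helper: warping factor that maps the test peak onto the reference peak."""
    return reference_peak_hz / test_peak_hz


def linear_warp_spectrum(magnitudes, freqs, alpha):
    """Move spectral content at frequency f to alpha * f.

    The warped value at f is read from the original spectrum at f / alpha.
    """
    return [interpolate(f / alpha, freqs, magnitudes) for f in freqs]


# Toy data (assumed): one Gaussian-shaped peak at 1100 Hz on a 0-4000 Hz grid.
freqs = [10.0 * k for k in range(401)]
test_spectrum = [math.exp(-0.5 * ((f - 1100.0) / 80.0) ** 2) for f in freqs]

# Align the test peak (1100 Hz) with an assumed reference peak (1000 Hz).
alpha = peak_alignment_factor(1000.0, 1100.0)
warped = linear_warp_spectrum(test_spectrum, freqs, alpha)
peak_hz = freqs[warped.index(max(warped))]
print(peak_hz)
```

A single factor like `alpha` can align only one peak exactly; the abstract's point that mismatch occurs at various levels is the motivation for estimating alignments per regression class instead.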

  

Source: Alwan, Abeer - Electrical Engineering Department, University of California at Los Angeles

 

Collections: Computer Technologies and Information Sciences