Evaluation of Pronunciation Variants in the ASR Lexicon for Different Speaking Styles

Ingunn Amdal and Torbjørn Svendsen
Department of Telecommunications
Norwegian University of Science and Technology,
N-7491 Trondheim, Norway
{amdal,torbjorn}@tele.ntnu.no
One of the challenges in automatic speech recognition (ASR) is how to handle pronunciation variation. The main causes of pronunciation
variation are the speaker (voice characteristics, accent, non-nativeness, etc.) and the speaking style (reading, spontaneous responses,
conversation, etc.). An ASR system has essentially two options for modelling variation at the word and sub-word level: lexical
modelling of the pronunciation variation, or adaptation, i.e. re-training of the acoustic models. Which technique to choose, or how to
combine them, may depend on the speaking style. We have therefore investigated the effects of using
pronunciation variants for recognition of read speech, spontaneous dictation, and non-native speech. The variants in the standard purpose
lexicon tested gave modest improvements, with the best results for read speech, which is the speaking style of the acoustic model training set.
1. Introduction
An important issue in pronunciation modelling is to
know what variation is better modelled at the lexical level
and what can be handled by the acoustic models (Strik,
2001). Segmental variation, such as allophonic variation,


Source: Amdal, Ingunn - Department of Electronics and Telecommunications, Norwegian University of Science and Technology


Collections: Engineering