Modeling Word Durations


Gadde, V. R. R. (2000, October). Modeling word durations. In INTERSPEECH (pp. 601-604).


We describe a new method of modeling duration at the word level. These duration models are easily trained from the acoustic training data and can be used to rescore N-best lists of recognition hypotheses. The models capture some well-known durational effects, such as prepausal lengthening, and incorporate a simple back-off mechanism to handle unseen words during rescoring. Experiments on various large-vocabulary conversational speech recognition (LVCSR) evaluation sets showed consistent improvements of 0.7-1.0% in word error rate (WER).
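
To make the idea concrete, here is a minimal sketch of word-level duration rescoring. It is not the paper's actual formulation, which the abstract does not specify. It assumes a single Gaussian over log word duration per (word, prepausal) pair, a back-off chain from word to context to a global model, and a log-linear combination with the recognizer score. The names `train`, `lookup`, `rescore`, `MIN_COUNT`, `VAR_FLOOR`, and the `weight` parameter are all hypothetical.

```python
import math
from collections import defaultdict

MIN_COUNT = 2     # hypothetical: minimum samples needed for a word-specific model
VAR_FLOOR = 1e-2  # hypothetical: variance floor to avoid degenerate Gaussians


def fit_gaussian(values):
    """Return (mean, variance) of the values, with the variance floored."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, max(var, VAR_FLOOR)


def train(samples):
    """samples: iterable of (word, prepausal, duration_in_seconds)."""
    by_word, by_context, everything = defaultdict(list), defaultdict(list), []
    for word, prepausal, dur in samples:
        log_d = math.log(dur)
        by_word[(word, prepausal)].append(log_d)
        by_context[prepausal].append(log_d)
        everything.append(log_d)
    word_models = {k: fit_gaussian(v) for k, v in by_word.items()
                   if len(v) >= MIN_COUNT}
    context_models = {k: fit_gaussian(v) for k, v in by_context.items()}
    return word_models, context_models, fit_gaussian(everything)


def lookup(model, word, prepausal):
    """Back off: word-specific model, then context model, then global model."""
    word_models, context_models, global_model = model
    return (word_models.get((word, prepausal))
            or context_models.get(prepausal)
            or global_model)


def duration_log_likelihood(model, word, prepausal, dur):
    """Gaussian log-density of log(dur) under the selected model."""
    mean, var = lookup(model, word, prepausal)
    x = math.log(dur)
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def rescore(nbest, model, weight):
    """nbest: list of (recognizer_score, [(word, prepausal, dur), ...]).

    Returns (combined_score, words) pairs sorted best-first.
    """
    rescored = []
    for score, words in nbest:
        dur_score = sum(duration_log_likelihood(model, w, p, d)
                        for w, p, d in words)
        rescored.append((score + weight * dur_score, words))
    return sorted(rescored, key=lambda item: item[0], reverse=True)
```

Keeping prepausal and non-prepausal tokens in separate models is one simple way to let the model reflect prepausal lengthening. In practice the duration weight would be tuned on held-out data.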
