Mitra, V., Nam, H., Espy-Wilson, C., Saltzman, E., & Goldstein, L. (2011). Robust speech recognition with articulatory features using dynamic Bayesian networks. The Journal of the Acoustical Society of America, 130(4), 2408-2408.
Previous studies have proposed ways to estimate articulatory information from the acoustic speech signal and have shown that, when used with standard cepstral features, this information helps to improve word recognition performance in noise for a connected digit recognition task. In this paper, we present results from word recognition and phone recognition experiments in noise that use two sets of articulatory representations, continuous (tract variable trajectories) and discrete (articulatory gestures), along with standard mel cepstral features for acoustic modeling. The acoustic model is a dynamic Bayesian network (DBN) that treats the continuous articulatory information as observed and the discrete articulatory representation as hidden random variables. Our results indicate that the use of articulatory information substantially improved noise robustness for both the word recognition and phone recognition tasks.
© 2011 Acoustical Society of America.