Modeling Consistency in a Speaker Independent Continuous Speech Recognition System


Konig, Y., Morgan, N., Wooters, C., Abrash, V., Cohen, M., & Franco, H. (1992). Modeling consistency in a speaker independent continuous speech recognition system. Advances in Neural Information Processing Systems, 5.


We would like to incorporate speaker-dependent consistencies, such as gender, into an otherwise speaker-independent speech recognition system. In this paper we discuss a Gender Dependent Neural Network (GDNN) that can be tuned for each gender while sharing most of the speaker-independent parameters. We use a classification network to help generate gender-dependent phonetic probabilities for a statistical (HMM) recognition system. The gender classification net predicts gender with high accuracy: 98.3% on a Resource Management test set. However, integrating the GDNN into our hybrid HMM-neural network recognizer yielded an improvement in the recognition score that is not statistically significant on a Resource Management test set.
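
The abstract does not say how the gender classifier's output is combined with the gender-dependent phonetic probabilities. As a rough illustration only, the sketch below shows two plausible options for a single acoustic frame: a soft combination that marginalizes over gender, P(phone | x) = Σ_g P(g | x) · P(phone | x, g), and a hard decision that uses only the most probable gender. The function names, the three phone classes, and all numbers are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: the combination rules below are assumptions,
# not the method described in the paper.

def combine_gender_dependent_posteriors(p_gender, p_phone_given_gender):
    """Soft combination: marginalize phone posteriors over gender.

    p_gender: dict mapping gender -> P(gender | x) for one frame
    p_phone_given_gender: dict mapping gender -> list of P(phone | x, gender)
    """
    genders = list(p_gender)
    n_phones = len(p_phone_given_gender[genders[0]])
    combined = [0.0] * n_phones
    for g in genders:
        for k, p in enumerate(p_phone_given_gender[g]):
            combined[k] += p_gender[g] * p
    return combined


def select_by_gender(p_gender, p_phone_given_gender):
    """Hard decision: keep only the most probable gender's phone posteriors."""
    best = max(p_gender, key=p_gender.get)
    return list(p_phone_given_gender[best])


# Made-up values for one frame and three hypothetical phone classes.
p_gender = {"female": 0.9, "male": 0.1}
p_phone = {
    "female": [0.7, 0.2, 0.1],
    "male": [0.4, 0.5, 0.1],
}

soft = combine_gender_dependent_posteriors(p_gender, p_phone)
hard = select_by_gender(p_gender, p_phone)
print("soft:", soft)
print("hard:", hard)
```

With a gender classifier as accurate as the one reported here, the two options would usually produce similar outputs, since P(gender | x) would tend to be close to 0 or 1 for most utterances.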
