Normalized amplitude modulation features for large vocabulary noise-robust speech recognition


Citation

V. Mitra, H. Franco, M. Graciarena, and A. Mandal, “Normalized amplitude modulation features for large vocabulary noise-robust speech recognition,” in Proc. 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2012), pp. 4117–4120.

Abstract

Background noise and channel degradations seriously constrain the performance of state-of-the-art speech recognition systems. Studies comparing human speech recognition performance with automatic speech recognition systems indicate that the human auditory system is far more robust to background noise and channel variability than automated systems. A traditional way to add robustness to a speech recognition system is to construct a robust feature set for the speech recognition model. In this work, we present an amplitude modulation feature derived from Teager’s nonlinear energy operator that is power normalized and cosine transformed to produce normalized modulation cepstral coefficient (NMCC) features. The proposed NMCC features are compared with state-of-the-art noise-robust features on Aurora-2 and a renoised Wall Street Journal (WSJ) corpus. The WSJ word-recognition experiments were performed on both the clean and the artificially renoised WSJ corpus using SRI’s DECIPHER large vocabulary speech recognition system. The experiments were performed under three train-test conditions: (a) matched, (b) mismatched, and (c) multi-conditioned. The Aurora-2 digit recognition task was performed using the standard HTK recognizer distributed with Aurora-2. Our results indicate that the proposed NMCC features demonstrated noise robustness in almost all the train-test conditions of the renoised WSJ data and also improved digit recognition accuracies for Aurora-2 compared to MFCCs and state-of-the-art noise-robust features.
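
The abstract names three ingredients: Teager’s nonlinear energy operator, power normalization, and a cosine transform. The sketch below illustrates each in isolation and is not the authors’ implementation. The abstract does not describe the filterbank, the amplitude modulation estimation, or the normalization, so the function names, the 1/15 compression exponent, and the unnormalized DCT-II are all assumptions made only for illustration.

```python
import numpy as np

def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    The output is two samples shorter than the input because the first
    and last samples have no neighbor on one side.
    """
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]

def dct2(v):
    """Unnormalized DCT-II of a 1-D vector, written with NumPy only."""
    v = np.asarray(v, dtype=float)
    n = len(v)
    k = np.arange(n)[:, None]   # output (cepstral) index
    m = np.arange(n)[None, :]   # input (band) index
    return np.cos(np.pi * k * (2 * m + 1) / (2 * n)) @ v

def toy_cepstrum(band_powers, root=1.0 / 15.0):
    """Power-law compress per-band powers, then apply a cosine transform.

    `root` is an assumed placeholder exponent; the abstract gives no value.
    """
    band_powers = np.asarray(band_powers, dtype=float)
    return dct2(band_powers ** root)

if __name__ == "__main__":
    # For a pure tone A*cos(w*n), the operator returns the constant
    # A^2 * sin(w)^2, which tracks both amplitude and frequency.
    n = np.arange(100)
    psi = teager_energy(2.0 * np.cos(0.3 * n))
    print(psi[:3], 4.0 * np.sin(0.3) ** 2)
```

For a single sinusoid the operator’s output depends only on amplitude and frequency, which is why it is commonly used as a starting point for estimating amplitude modulation in band-limited signals.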

