Evaluating Robust Features on Deep Neural Networks for Speech Recognition in Noisy and Channel Mismatched Conditions

Citation

Mitra, V., Wang, W., Franco, H., Lei, Y., Bartels, C., & Graciarena, M. (2014). Evaluating robust features on deep neural networks for speech recognition in noisy and channel mismatched conditions. In Fifteenth Annual Conference of the International Speech Communication Association.

Abstract

Deep Neural Network (DNN) based acoustic models have shown significant improvement over their Gaussian Mixture Model (GMM) counterparts in the last few years. While several studies have evaluated the performance of GMM systems under noisy and channel-degraded conditions, noise robustness studies on DNN systems have been far fewer. In this work we present a study exploring both conventional DNNs and deep Convolutional Neural Networks (CNNs) for noise- and channel-degraded speech recognition tasks using the Aurora4 dataset. We compare the baseline mel-filterbank energies with noise-robust features that we proposed earlier and show that the robust features improve the performance of both DNNs and CNNs relative to mel-filterbank energies. We also show that vocal tract length normalization helps improve the performance of the robust acoustic features. Finally, we show that combining multiple systems yields further improvement in recognition accuracy.

Index Terms: deep neural networks, convolutional neural networks, noise-robust speech recognition, continuous speech recognition, modulation features, damped oscillators.

