Recent Improvements in SRI’s Keyword Detection System for Noisy Audio

Citation

van Hout, J., Mitra, V., Lei, Y., Vergyri, D., Graciarena, M., Mandal, A., & Franco, H. (2014). Recent improvements in SRI’s keyword detection system for noisy audio. In INTERSPEECH (pp. 1727-1731).

Abstract

We present improvements to a keyword spotting (KWS) system that operates in highly adverse channel conditions with very low signal-to-noise ratios. We take a system combination approach, merging the outputs of multiple large vocabulary continuous speech recognition (LVCSR) systems. These systems are complementary thanks to different design decisions across all levels of information: three speech activity detection systems; a wide range of front-end signal processing features (standard cepstral and filter-bank features, noise-robust features, and multi-layer perceptron features); three statistical acoustic model types (Gaussian mixture models, deep neural networks, and convolutional neural networks); and two keyword search strategies (word-based and phone-based). We explore the scenario where the keywords are known in advance by adding them to the language model and assigning higher weights to n-grams that contain keywords. The scores of the individual systems are fused by a logistic-regression-based classifier to produce the final system combination output. We present the performance of our system in the Phase III evaluations of DARPA's Robust Automatic Transcription of Speech (RATS) program for Levantine Arabic and Farsi conversational speech corpora.
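
The score-fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three per-system scores, the toy training data, the learning rate, and the function names `train_fusion` and `fuse` are all assumptions made for illustration. It fits a plain logistic regression by gradient descent so that the fused score approximates the probability that a candidate keyword detection is correct.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fuse(w, b, scores):
    # Fused score: sigmoid of a weighted sum of the per-system scores.
    return sigmoid(sum(wi * si for wi, si in zip(w, scores)) + b)


def train_fusion(score_rows, labels, lr=0.5, epochs=2000):
    # Batch gradient descent on the logistic (cross-entropy) loss.
    n_sys = len(score_rows[0])
    n = len(score_rows)
    w = [0.0] * n_sys
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_sys
        grad_b = 0.0
        for row, y in zip(score_rows, labels):
            err = fuse(w, b, row) - y
            for j in range(n_sys):
                grad_w[j] += err * row[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Hypothetical data: each row holds the scores that three systems
# assigned to one candidate detection; label 1 marks a true hit.
train_scores = [
    [0.9, 0.8, 0.7],
    [0.8, 0.6, 0.9],
    [0.7, 0.9, 0.6],
    [0.2, 0.3, 0.1],
    [0.1, 0.4, 0.2],
    [0.3, 0.1, 0.3],
]
train_labels = [1, 1, 1, 0, 0, 0]

w, b = train_fusion(train_scores, train_labels)
fused_hit = fuse(w, b, [0.9, 0.9, 0.9])   # all systems confident
fused_miss = fuse(w, b, [0.1, 0.1, 0.1])  # all systems doubtful
print(f"fused hit score:  {fused_hit:.3f}")
print(f"fused miss score: {fused_miss:.3f}")
```

In a real combination system, the inputs would also have to handle detections that only some systems produce, and the fused scores would be thresholded per keyword; this sketch leaves both out.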

