The MERL/SRI System for the 3rd CHiME challenge using beamforming, robust feature extraction and advanced speech recognition

Citation

T. Hori, Z. Chen, H. Erdogan, J.R. Hershey, J. Le Roux, V. Mitra and S. Watanabe, “The MERL/SRI System for the 3rd CHiME challenge using beamforming, robust feature extraction and advanced speech recognition,” p. 475.

Abstract

This paper introduces the MERL/SRI system designed for the 3rd CHiME speech separation and recognition challenge (CHiME-3). Our proposed system takes advantage of recurrent neural networks (RNNs) throughout the model, from front-end speech enhancement to language modeling. Two different types of beamforming are used to combine multi-microphone signals into a single higher-quality signal. The beamformed signal is further processed by a single-channel bi-directional long short-term memory (LSTM) enhancement network, whose output is used to extract stacked mel-frequency cepstral coefficient (MFCC) features. In addition, two proposed noise-robust feature extraction methods are applied to the beamformed signal. The features are used for decoding in speech recognition systems with deep neural network (DNN) based acoustic models and large-scale RNN language models to achieve high recognition accuracy in noisy environments. Our training methodology includes data augmentation and speaker adaptive training, whereas at test time model combination is used to improve generalization. Results on the CHiME-3 benchmark show that the full set of techniques substantially reduced the word error rate (WER). Combining hypotheses from different robust-feature systems ultimately achieved 9.10% WER on the real test data, a 72.4% reduction relative to the baseline of 32.99% WER.

Index Terms— CHiME-3, robust speech recognition, beamforming, noise robust feature, system combination
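The relative reduction quoted in the abstract can be checked directly from the two WER figures. The short sketch below is only an arithmetic check and not part of the described system; the function name `relative_wer_reduction` is a hypothetical helper introduced here for illustration.

```python
def relative_wer_reduction(baseline_wer: float, system_wer: float) -> float:
    """Return the relative WER reduction, in percent, of a system over a baseline.

    Both arguments are word error rates expressed in percent.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return 100.0 * (baseline_wer - system_wer) / baseline_wer


# Figures taken from the abstract: 32.99% baseline WER, 9.10% combined-system WER.
reduction = relative_wer_reduction(32.99, 9.10)
print(f"{reduction:.1f}% relative WER reduction")  # prints "72.4% relative WER reduction"
```

The absolute improvement is 32.99 - 9.10 = 23.89 percentage points; dividing by the baseline gives the relative figure reported above.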

