Deep convolutional nets and robust features for reverberation-robust speech recognition

Citation

Mitra, V., Wang, W. and Franco, H., “Deep convolutional nets and robust features for reverberation-robust speech recognition,” In Proc. IEEE Spoken Language Technology Workshop ’14, 2014, pp. 548–553.

Abstract

While human listeners can understand speech in reverberant conditions, indicating that the auditory system is robust to such degradations, reverberation leads to high word error rates for automatic speech recognition (ASR) systems. In this work, we present robust acoustic features motivated by human speech perception for use in a convolutional deep neural network (CDNN)-based acoustic model for recognizing continuous speech in reverberant conditions. Using a single-feature system trained with the single-channel data distributed through the REVERB 2014 challenge on ASR in reverberant conditions, we show a substantial relative reduction in word error rates (WERs) compared to conventional filterbank energy-based features for single-channel simulated and real reverberation conditions. The reduction is more pronounced when multiple features and systems are combined. The proposed system outperforms the best system reported in the REVERB 2014 challenge on the single-channel full-batch processing task.

Index Terms—deep convolutional networks, feature combination, robust speech recognition, reverberation robustness, robust features.

