• Skip to primary navigation
  • Skip to main content
SRI logo
  • About
    • Press room
    • Our history
  • Expertise
    • Advanced imaging systems
    • Artificial intelligence
    • Biomedical R&D services
    • Biomedical sciences
    • Computer vision
    • Cyber & formal methods
    • Education and learning
    • Innovation strategy and policy
    • National security
    • Ocean & space
    • Quantum
    • QED-C
    • Robotics, sensors & devices
    • Speech & natural language
    • Video test & measurement
  • Ventures
  • NSIC
  • Careers
  • Contact
  • 日本支社
Search
Close
Speech & natural language publications March 1, 2017

Speech recognition in unseen and noisy channel conditions

Horacio Franco, Martin Graciarena, Dimitra Vergyri

Citation

Copy to clipboard


V. Mitra, H. Franco, C. Bartels, J. van Hout, M. Graciarena, D. Vergyri, “Speech recognition in unseen and noisy channel conditions,” in Proc. ICASSP 2017, pp. 5215-5219, March 2017.

Abstract

Speech recognition in varying background conditions is a challenging problem. Acoustic condition mismatch between training and evaluation data can significantly reduce recognition performance. For mismatched conditions, data-adaptation techniques are typically found to be useful, as they expose the acoustic model to the new data condition(s). Supervised adaptation techniques usually provide substantial performance improvement, but such gain is contingent on having labeled or transcribed data, which is often unavailable. The alternative is unsupervised adaptation, where feature-transform methods and model-adaptation techniques are typically explored. This work investigates robust features, feature-space maximum likelihood linear regression (fMLLR) transform, and deep convolutional nets to address the problem of unseen channel and noise conditions. In addition, the work investigates bottleneck (BN) features extracted from deep autoencoder (DAE) networks trained by using acoustic features extracted from the speech signal. We demonstrate that such representations not only produce robust systems but also that they can be used to perform data selection for unsupervised model adaptation. Our results indicate that the techniques presented in this paper significantly improve performance of speech recognition systems in unseen channel and noise conditions.

↓ Download

Share this

How can we help?

Once you hit send…

We’ll match your inquiry to the person who can best help you.

Expect a response within 48 hours.

Career call to action image

Make your own mark.

Search jobs

Our work

Case studies

Publications

Timeline of innovation

Areas of expertise

Institute

Leadership

Press room

Media inquiries

Compliance

Careers

Job listings

Contact

SRI Ventures

Our locations

Headquarters

333 Ravenswood Ave
Menlo Park, CA 94025 USA

+1 (650) 859-2000

Subscribe to our newsletter


日本支社
SRI International
  • Contact us
  • Privacy Policy
  • Cookies
  • DMCA
  • Copyright © 2022 SRI International