Speech & natural language publications May 1, 1996

A Maximum-Likelihood Approach to Stochastic Matching for Robust Speech Recognition

Citation

Sankar, A., & Lee, C. H. (1996). A maximum-likelihood approach to stochastic matching for robust speech recognition. IEEE Transactions on Speech and Audio Processing, 4(3), 190-202.

Abstract

We present a maximum-likelihood (ML) stochastic matching approach to decrease the acoustic mismatch between a test utterance and a given set of speech models so as to reduce the recognition performance degradation caused by distortions in the test utterance and/or the model set. We assume that the speech signal is modeled by a set of subword hidden Markov models (HMMs) Λ_X. The mismatch between the observed test utterance Y and the models Λ_X can be reduced in two ways: 1) by an inverse distortion function F_ν(·) that maps Y into an utterance X that matches better with the models Λ_X, and 2) by a model transformation function G_η(·) that maps Λ_X to the transformed model Λ_Y that matches better with the utterance Y. We assume the functional form of the transformations F_ν(·) or G_η(·) and estimate the parameters ν or η in a maximum-likelihood manner using the expectation-maximization (EM) algorithm. The choice of the form of F_ν(·) or G_η(·) is based on our prior knowledge of the nature of the acoustic mismatch. The stochastic matching algorithm operates only on the given test utterance and the given set of speech models, and no additional training data is required for the estimation of the mismatch prior to actual testing.
Experimental results are presented to study the properties of the proposed algorithm and to verify the efficacy of the approach in improving the performance of an HMM-based continuous speech recognition system in the presence of mismatch due to different transducers and transmission channels. The proposed stochastic matching algorithm is found to converge fast. Further, the recognition performance in mismatched conditions is greatly improved while the performance in matched conditions is well maintained. The stochastic matching algorithm was able to reduce the word error rate by about 70% in mismatched conditions.
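
The feature-space idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes the simplest form of the inverse distortion function, an additive bias F(y) = y - b, and it stands in a diagonal-covariance Gaussian mixture for the HMM set, so there is no state sequence or decoding. The names estimate_bias and sample_biased_frames are hypothetical helpers introduced only for this illustration. Each EM iteration computes component posteriors for the compensated frames (E-step) and then re-estimates the bias as a variance-weighted average of the residuals y - mean (M-step).

```python
import math
import random


def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def estimate_bias(frames, weights, means, variances, n_iter=10):
    """EM estimate of an additive bias b so that (y - b) fits the mixture.

    Assumption: the mixture plays the role of the speech models; a real
    system would use HMM state/mixture posteriors instead.
    """
    dim = len(frames[0])
    bias = [0.0] * dim
    for _ in range(n_iter):
        num = [0.0] * dim
        den = [0.0] * dim
        for y in frames:
            x = [yi - bi for yi, bi in zip(y, bias)]
            # E-step: posterior of each component given the compensated frame.
            logp = [
                math.log(w) + log_gauss(x, m, v)
                for w, m, v in zip(weights, means, variances)
            ]
            peak = max(logp)
            post = [math.exp(lp - peak) for lp in logp]
            total = sum(post)
            # Accumulate M-step statistics, weighted by inverse variance.
            for g, m, v in zip(post, means, variances):
                g /= total
                for i in range(dim):
                    num[i] += g * (y[i] - m[i]) / v[i]
                    den[i] += g / v[i]
        # M-step: closed-form update of the bias.
        bias = [n / d for n, d in zip(num, den)]
    return bias


def sample_biased_frames(n, weights, means, variances, bias, seed=0):
    """Draw n frames from the mixture, then add a fixed bias (synthetic mismatch)."""
    rng = random.Random(seed)
    frames = []
    for _ in range(n):
        k = rng.choices(range(len(weights)), weights=weights)[0]
        frames.append([
            rng.gauss(m, math.sqrt(v)) + b
            for m, v, b in zip(means[k], variances[k], bias)
        ])
    return frames


if __name__ == "__main__":
    weights = [0.5, 0.5]
    means = [[-3.0, 0.0], [3.0, 2.0]]
    variances = [[1.0, 1.0], [1.0, 1.0]]
    frames = sample_biased_frames(2000, weights, means, variances, [0.8, -0.5])
    print(estimate_bias(frames, weights, means, variances))
```

As in the abstract, the estimate uses only the test frames and the existing model parameters; no separate adaptation data enters the computation. The synthetic mixture and bias values above are arbitrary and chosen only to make the demo well conditioned.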
