• Skip to primary navigation
  • Skip to main content
SRI InternationalSRI mobile logo

SRI International

SRI International - American Nonprofit Research Institute

  • About
    • Blog
    • Press room
  • Expertise
    • Advanced imaging systems
    • Artificial intelligence
    • Biomedical R&D services
    • Biomedical sciences
    • Computer vision
    • Cyber & formal methods
    • Education and learning
    • Innovation strategy and policy
    • National security
    • Ocean & space
    • Quantum
    • QED-C
    • Robotics, sensors & devices
    • Speech & natural language
    • Video test & measurement
  • Ventures
  • NSIC
  • Careers
  • Contact
  • 日本支社
Show Search
Hide Search
Speech & natural language publications September 1, 1997 Conference Paper

Mixture Input Transformations for Adaptation of Hybrid Connectionist Speech Recognizers

SRI International September 1, 1997

Abstract

We extend the input transformation approach for adapting hybrid connectionist speech recognizers to allow multiple transformations to be trained. Previous work has shown the efficacy of the linear input transformation approach for speaker adaptation [1] [2] [3], but has focused only on training global transformations. This approach is clearly suboptimal since it assumes that a single transformation is appropriate for every region in the acoutic feature input space, that is, for every phonetic class, microphone, and noise level. In this paper, we propose a new algorithm to train mixtures of transformation networks (MTNs) in the hybrid connectionist recognition framework. This approach is based on the idea of partitioning the acoustic feature space into R regions and training an input transformation for each region. The transformations are combined probabilistically according to the degree to which the acoustic features belong to each region, where the combination weights are derived from a separate acoustic gating network (AGN). We apply the new algorithm to nonnative speaker adaptation, and present recognition results for the 1994 WSJ Spoke 3 development set. The MTN technique can also be used for noise or microphone robust recognition or for other nonspeech neural network pattern recognition problems.

↓ Download

↓ Download

Share this

Facebooktwitterlinkedinmail

Publication, Speech & natural language publications Conference Paper

How can we help?

Once you hit send…

We’ll match your inquiry to the person who can best help you.

Expect a response within 48 hours.

Career call to action image

Make your own mark.

Search jobs
Our work

Case studies

Publications

Timeline of innovation

Areas of expertise

Blog

Institute

Leadership

Press room

Media inquiries

Compliance

Privacy policy

Careers

Job listings

Contact

SRI Ventures

Our locations

Headquarters

333 Ravenswood Ave
Menlo Park, CA 94025 USA

+1 (650) 859-2000

Subscribe to our newsletter

日本支社

SRI International

  • Contact us
  • Privacy Policy
  • Cookies
  • DMCA
  • Copyright © 2022 SRI International