• Skip to primary navigation
  • Skip to main content
SRI logo
  • About
    • Press room
    • Our history
  • Expertise
    • Advanced imaging systems
    • Artificial intelligence
    • Biomedical R&D services
    • Biomedical sciences
    • Computer vision
    • Cyber & formal methods
    • Education and learning
    • Innovation strategy and policy
    • National security
    • Ocean & space
    • Quantum
    • QED-C
    • Robotics, sensors & devices
    • Speech & natural language
    • Video test & measurement
  • Ventures
  • NSIC
  • Careers
  • Contact
  • 日本支社
Search
Close
Speech & natural language publications February 1, 1997

Acoustic Modeling for the SRI Hub4 Partitioned Evaluation Continuous Speech Recognition System

Citation

Copy to clipboard


Sankar, A., Heck, L., & Stolcke, A. (1997, February). Acoustic modeling for the SRI Hub4 partitioned evaluation continuous speech recognition system. In Proceedings of the 1997 DARPA Speech Recognition Workshop.

Abstract

We describe the development of the SRI system evaluated in the 1996 DARPA continuous speech recognition (CSR) Hub4 partitioned evaluation (PE). The task for the Hub4 evaluation was to recognition speech from broadcast television and radio shows. Recognizing such speech by machines poses many challenges. First, the segments to be recognized could be very long. This introduces a problem in training and recognition because of the consequent increased system memory requirement. A simple segmentation technique is used to break long segments into shorter, more manageable lengths. The speech from broadcast news sources exhibits a variety of difficult acoustic conditions, such as spontaneous speech, band-limited speech, and speech in the presence of noise, music, or background speakers. Such background conditions lead to significant degradation in performance. We describe techniques, based on acoustic adaptation, that adapt recognition models to the different acoustic background conditions, so as to improve recognition performance. We also present a novel algorithm that clusters the test data segments so that the resulting clusters are homogeneous with respect to speakers. This is followed by acoustic adaptation to the individual clusters, resulting in a significant performance improvement. Finally, we briefly describe our studies in language modeling for the Hub4 evaluation which is detailed further in another paper in these proceedings.

↓ Download

Share this

How can we help?

Once you hit send…

We’ll match your inquiry to the person who can best help you.

Expect a response within 48 hours.

Career call to action image

Make your own mark.

Search jobs

Our work

Case studies

Publications

Timeline of innovation

Areas of expertise

Institute

Leadership

Press room

Media inquiries

Compliance

Careers

Job listings

Contact

SRI Ventures

Our locations

Headquarters

333 Ravenswood Ave
Menlo Park, CA 94025 USA

+1 (650) 859-2000

Subscribe to our newsletter


日本支社
SRI International
  • Contact us
  • Privacy Policy
  • Cookies
  • DMCA
  • Copyright © 2022 SRI International