Speech & natural language publications June 1, 2000

Word-Level Rate of Speech Modeling Using Rate-Specific Phones and Pronunciations

Citation


Jing Zheng, H. Franco, Fuliang Weng, A. Sankar and H. Bratt, “Word-level rate of speech modeling using rate-specific phones and pronunciations,” 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100), 2000, pp. 1775-1778 vol.3, doi: 10.1109/ICASSP.2000.862097.

Abstract

Variations in rate of speech (ROS) produce changes in both spectral features and word pronunciations that affect automatic speech recognition (ASR) systems. To cope with these effects, we propose using rate-specific phone models and pronunciations to model ROS at the word level. Each word is given three types of pronunciation (fast, slow, and medium), each built from the phone models specific to that rate. This approach allows us to model within-sentence rate variation. To better model coarticulation effects, we introduce the concept of zero-length phones, which enables short phones to be skipped without having to change their neighboring phones’ contexts. A data-driven approach is used to prune the pronunciation dictionary derived from rules for phone reduction. We tested these approaches on the Hub 4 database and achieved a relative improvement of 2.0% over the baseline, an evaluation-quality version of SRI’s DECIPHER™ continuous speech recognition system, for clean native speech in the 1996 development set.
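
The dictionary structure described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the `(phone, tag)` representation, the `"zero"` tag, the `"sil"` boundary padding, and the choice to offer reduced variants only at the fast rate are all assumptions made for illustration, and the data-driven pruning step is not shown. The sketch gives a word one pronunciation per rate, adds a fast variant in which a reducible phone becomes a zero-length phone, and checks that the neighboring phones keep the same contexts.

```python
RATES = ("fast", "medium", "slow")


def rate_variants(phones, reducible=frozenset()):
    """Build rate-specific pronunciations for one word.

    Each pronunciation is a list of (base_phone, tag) pairs, where tag is a
    rate name or the hypothetical marker "zero" for a zero-length phone.
    """
    # One full pronunciation per rate, using that rate's phone models.
    variants = {rate: [[(p, rate) for p in phones]] for rate in RATES}
    # Assumed reduction rule: reducible phones may be skipped in fast speech.
    # The skipped phone stays in the sequence as a zero-length phone.
    if any(p in reducible for p in phones):
        variants["fast"].append(
            [(p, "zero") if p in reducible else (p, "fast") for p in phones]
        )
    return variants


def neighbor_contexts(pron, boundary="sil"):
    """Return (left, center, right) base-phone contexts of realized phones.

    Zero-length phones are not realized themselves, but they still count as
    context for their neighbors, so those neighbors' contexts do not change.
    """
    bases = [p for p, _ in pron]
    padded = [boundary] + bases + [boundary]
    return [
        (padded[i], padded[i + 1], padded[i + 2])
        for i, (_, tag) in enumerate(pron)
        if tag != "zero"
    ]


if __name__ == "__main__":
    variants = rate_variants(["ae", "n", "d"], reducible={"d"})
    full, reduced = variants["fast"]
    print(neighbor_contexts(full))
    print(neighbor_contexts(reduced))
```

In this toy example the phone `n` keeps `d` as its right context in the reduced variant even though `d` itself is skipped, which is the property the abstract attributes to zero-length phones.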
