Speech & natural language publications September 1, 2015 Conference Paper

Mitigating the effects of non-stationary unseen noises on language recognition performance

Mitchell McLaren, Aaron Lawson, Martin Graciarena September 1, 2015

SRI Authors: Mitchell McLaren, Aaron Lawson, Martin Graciarena

Citation

L. Ferrer, M. McLaren, A. Lawson and M. Graciarena, “Mitigating the Effects of Non-Stationary Unseen Noises on Language Recognition Performance,” in Proc. Interspeech 2015, pp. 3446-3450.

Abstract

We introduce a new dataset for the study of the effect of highly non-stationary noises on language recognition (LR) performance. The dataset is based on the data from the 2009 Language Recognition Evaluation organized by the National Institute of Standards and Technology (NIST). Randomly selected noises are added to these signals to achieve a chosen signal-to-noise ratio and percentage of corruption. We study the effect of these noises on LR performance as a function of these parameters and present some initial methods to mitigate the degradation, focusing on the speech activity detection (SAD) step. These methods include discarding the C0 coefficient from the features used for SAD, using a more stringent threshold on the SAD scores, thresholding the speech likelihoods returned by the model as an additional way of detecting noise, and a final model adaptation step. We show that a system optimized for clean speech is clearly suboptimal on this new dataset since the proposed methods lead to gains of up to 35% on the corrupted data, without knowledge of the test noises and with very little effect on clean data performance.
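
The abstract's corruption step, adding a noise to a signal at a chosen signal-to-noise ratio over a chosen percentage of the signal, can be illustrated with a minimal sketch. The abstract does not give the exact procedure, so everything below is an assumption made for illustration: the function name `mix_noise`, the choice of a single contiguous corrupted region, the looping of short noise clips, and measuring the SNR over the corrupted region only rather than over the whole signal.

```python
import math
import random


def mix_noise(speech, noise, snr_db, corrupt_fraction, rng=None):
    """Return (mixed, (start, end)): `speech` with `noise` added over one region.

    Hypothetical helper, not the paper's procedure. The region is contiguous,
    covers `corrupt_fraction` of the samples, and the noise is scaled so that
    the speech-to-noise power ratio inside the region equals `snr_db`.
    """
    rng = rng or random.Random(0)
    n = len(speech)
    span = int(round(corrupt_fraction * n))
    mixed = list(speech)
    if span == 0:
        return mixed, (0, 0)

    # Pick where the corrupted region starts (assumption: uniformly at random).
    start = rng.randrange(0, n - span + 1)

    # Loop the noise clip if it is shorter than the region.
    segment = [noise[i % len(noise)] for i in range(span)]

    # Average power of speech and noise inside the region.
    p_speech = sum(x * x for x in speech[start:start + span]) / span
    p_noise = sum(x * x for x in segment) / span
    if p_speech == 0.0 or p_noise == 0.0:
        raise ValueError("speech and noise must both have non-zero power")

    # Gain g such that 10 * log10(p_speech / (g**2 * p_noise)) == snr_db.
    gain = math.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))

    for i in range(span):
        mixed[start + i] += gain * segment[i]
    return mixed, (start, start + span)


if __name__ == "__main__":
    demo_rng = random.Random(1)
    speech = [math.sin(0.05 * i) for i in range(1000)]      # stand-in "speech"
    noise = [demo_rng.gauss(0.0, 1.0) for _ in range(300)]  # stand-in noise
    mixed, region = mix_noise(speech, noise, snr_db=10.0, corrupt_fraction=0.5)
    print("corrupted region:", region)
```

Sweeping `snr_db` and `corrupt_fraction` over a grid would give the kind of parameterized test conditions the abstract describes, although the actual values and noise types used in the dataset are not stated here.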
