• Skip to primary navigation
  • Skip to main content
SRI InternationalSRI mobile logo

SRI International

SRI International - American Nonprofit Research Institute

  • About
    • Blog
    • Press room
  • Expertise
    • Advanced imaging systems
    • Artificial intelligence
    • Biomedical R&D services
    • Biomedical sciences
    • Computer vision
    • Cyber & formal methods
    • Education and learning
    • Innovation strategy and policy
    • National security
    • Ocean & space
    • Quantum
    • QED-C
    • Robotics, sensors & devices
    • Speech & natural language
    • Video test & measurement
  • Ventures
  • NSIC
  • Careers
  • Contact
  • 日本支社
Show Search
Hide Search
Speech & natural language publications June 1, 2016 Conference Paper

Analyzing the effect of channel mismatch on the SRI language recognition evaluation 2015 system

Mitchell McLaren, Diego Castán June 1, 2016

SRI Authors: Mitchell McLaren, Diego Castán

Citation

Copy to clipboard


M. McLaren, D. Castan and L. Ferrer, “Analyzing the effect of channel mismatch on the SRI language recognition evaluation 2015 system,” in Proc. Odyssey 2016, pp. 188-195

Abstract

We present the work done by our group for the 2015 language recognition evaluation (LRE) organized by the National Institute of Standards and Technology (NIST). The focus of this evaluation was the development of language recognition systems for clusters of closely related languages using training data released by NIST. This training data contained a highly imbalanced sample from the languages of interest. The SRI team submitted several systems to LRE’15. Major components included (1) bottleneck features extracted from Deep Neural Networks (DNNs) trained to predict English senones, with multiple DNNs trained using a variety of acoustic features; (2) data-driven Discrete Cosine Transform (DCT) contextualization of features for traditional Universal Background Model (UBM) i-vector extraction and for input to a DNN for bottleneck feature extraction; (3) adaptive Gaussian backend scoring; (4) a newly developed multiresolution neural network backend; and (5) cluster-specific N-way fusion of scores. We compare results on our development dataset with those on the evaluation data and find significantly different conclusions about which techniques were useful for each dataset. This difference was due mostly to a large unexpected mismatch in acoustic conditions between the two datasets. We provide a post-evaluation analysis that reveals that the successful approaches for this evaluation included the use of bottleneck features, and a well-defined development dataset appropriate for mismatched conditions.

↓ Download

Share this

Facebooktwitterlinkedinmail

Publication, Speech & natural language publications Conference Paper

How can we help?

Once you hit send…

We’ll match your inquiry to the person who can best help you.

Expect a response within 48 hours.

Our privacy policy
Career call to action image

Make your own mark.

Search jobs
Our work

Case studies

Publications

Timeline of innovation

Areas of expertise

Blog

Institute

Leadership

Press room

Media inquiries

Compliance

Privacy policy

Careers

Job listings

Contact

SRI Ventures

Our locations

Headquarters

333 Ravenswood Ave
Menlo Park, CA 94025 USA

+1 (650) 859-2000

Subscribe to our newsletter

日本支社

SRI International

  • Privacy Policy
  • Cookies
  • DMCA
  • Copyright © 2022 SRI International