Speech & natural language publications April 1, 2007

NAP and WCCN: Comparison of Approaches Using MLLR-SVM Speaker Verification System

Citation

S. S. Kajarekar and A. Stolcke, “NAP and WCCN: Comparison of Approaches using MLLR-SVM Speaker Verification System,” 2007 IEEE International Conference on Acoustics, Speech and Signal Processing – ICASSP ’07, 2007, pp. IV-249-IV-252, doi: 10.1109/ICASSP.2007.366896.

Abstract

We compare two recently proposed techniques, within-class covariance normalization (WCCN) [1] and nuisance attribute projection (NAP) [2], for intersession variability compensation in speaker verification. The comparison is performed using an MLLR-SVM speaker verification system. Both techniques model intersession variability using a within-speaker covariance matrix (WSCM), but they manipulate the eigenvectors of this matrix differently. We compare them on the 2005 and 2006 NIST speaker recognition evaluation (SRE) tasks. Results show that WCCN is more sensitive to the choice of background speakers, whereas NAP is more sensitive to the choice of data for WSCM estimation. WCCN gives the best performance on the 2005 SRE. On the 2006 SRE, both techniques give similar performance under matched conditions. Further experiments with a simple combination of the two techniques show slight improvements over the best performance of either technique alone. Overall, the results show that an MLLR-SVM system with either NAP or WCCN performs comparably to the best single systems in the 2006 NIST SRE.

Index Terms: Speaker recognition, Intersession variability, MLLR transforms, SVM
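
The abstract notes that NAP and WCCN both start from a within-speaker covariance matrix but treat its eigenvectors differently. The NumPy sketch below illustrates that contrast using the commonly cited textbook forms of the two techniques, not the configuration used in the paper: NAP projects out the top-k eigenvectors of the matrix, while WCCN rescales all directions by a factor of its inverse. The function names, the `k` parameter, and the `ridge` regularizer are illustrative assumptions; the paper's actual feature dimensions, smoothing, and training data are not reproduced here.

```python
import numpy as np

def within_speaker_covariance(X, speakers):
    """Pooled covariance of the rows of X around each speaker's own mean."""
    X = np.asarray(X, dtype=float)
    speakers = np.asarray(speakers)
    centered = np.empty_like(X)
    for s in np.unique(speakers):
        mask = speakers == s
        centered[mask] = X[mask] - X[mask].mean(axis=0)
    return centered.T @ centered / len(X)

def nap_projection(W, k):
    """NAP-style projection P = I - U_k U_k^T.

    U_k holds the k eigenvectors of W with the largest eigenvalues,
    so P removes those directions entirely and leaves the rest untouched.
    """
    _, eigvecs = np.linalg.eigh(W)  # eigenvalues in ascending order
    U = eigvecs[:, -k:]
    return np.eye(W.shape[0]) - U @ U.T

def wccn_transform(W, ridge=1e-6):
    """WCCN-style factor B with B B^T = (W + ridge * I)^-1.

    Mapping x to B^T x rescales every direction so that the within-speaker
    covariance becomes (approximately) the identity. The ridge term is an
    assumption added here for numerical stability.
    """
    d = W.shape[0]
    return np.linalg.cholesky(np.linalg.inv(W + ridge * np.eye(d)))
```

With feature vectors stored as rows of `X`, the NAP-compensated features would be `X @ P` and the WCCN-compensated features `X @ B`. The design difference is visible in the two return values: NAP makes a hard keep-or-discard decision per eigenvector, whereas WCCN keeps every direction and down-weights those with high within-speaker variance.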
