• Skip to primary navigation
  • Skip to main content
SRI logo
  • About
    • Press room
    • Our history
  • Expertise
    • Advanced imaging systems
    • Artificial intelligence
    • Biomedical R&D services
    • Biomedical sciences
    • Computer vision
    • Cyber & formal methods
    • Education and learning
    • Innovation strategy and policy
    • National security
    • Ocean & space
    • Quantum
    • QED-C
    • Robotics, sensors & devices
    • Speech & natural language
    • Video test & measurement
  • Ventures
  • NSIC
  • Careers
  • Contact
  • 日本支社
Search
Close
Speech & natural language publications September 1, 2018

Analysis of Complementary Information Sources in the Speaker Embeddings Framework

Mitchell McLaren, Aaron Lawson

Citation

Copy to clipboard


M. Kumar Nandwana, M. McLaren, D. Castan, Julien van Hout and A. Lawson, “Analysis of complementary information sources in the speaker embeddings framework,” Interspeech 2018, Hyderabad, Telangana, India. Forthcoming September 2018.

Abstract

Deep neural network (DNN)-based speaker embeddings have resulted in new, state-of-the-art text-independent speaker recognition technology. However, very limited effort has been made to understand DNN speaker embeddings. In this study, our aim is analyzing the behavior of the speaker recognition systems based on speaker embeddings toward different front-end features, including the standard Mel frequency cepstral coefficients (MFCC), as well as power normalized cepstral coefficients (PNCC), and perceptual linear prediction (PLP). Using a speaker recognition system based on DNN speaker embeddings and probabilistic linear discriminant analysis (PLDA), we compared different approaches to leveraging complementary information using score-, embeddings-, and feature-level combination. We report our results for Speakers in the Wild (SITW) and NIST SRE 2016 datasets. We found that first and second embeddings layers are complementary in nature. By applying score and embedding-level fusion we demonstrate relative improvements in equal error rate of 17% on NIST SRE 2016 and 10% on SITW over the baseline system.

↓ Download

Share this

How can we help?

Once you hit send…

We’ll match your inquiry to the person who can best help you.

Expect a response within 48 hours.

Career call to action image

Make your own mark.

Search jobs

Our work

Case studies

Publications

Timeline of innovation

Areas of expertise

Institute

Leadership

Press room

Media inquiries

Compliance

Careers

Job listings

Contact

SRI Ventures

Our locations

Headquarters

333 Ravenswood Ave
Menlo Park, CA 94025 USA

+1 (650) 859-2000

Subscribe to our newsletter


日本支社
SRI International
  • Contact us
  • Privacy Policy
  • Cookies
  • DMCA
  • Copyright © 2022 SRI International