A Noise-Robust System for NIST 2012 Speaker Recognition Evaluation

,

Citation

Ferrer, L., McLaren, M., Scheffer, N., Lei, Y., Graciarena, M., & Mitra, V. (2013). A noise-robust system for NIST 2012 speaker recognition evaluation. SRI INTERNATIONAL MENLO PARK CA SPEECH TECHNOLOGY AND RESEARCH LAB.

Abstract

The National Institute of Standards and Technology (NIST) 2012 speaker recognition evaluation posed several new challenges including noisy data, varying test-sample length and number of enrollment samples, and a new metric. Target speakers were known during system development and could be used for model training and score normalization. For the evaluation, SRI International (SRI) submitted a system consisting of six subsystems that use different low- and high-level features, some specifically designed for noise robustness, fused at the score and iVector levels. This paper presents SRI’s submission along with a careful analysis of the approaches that provided gains for this challenging evaluation including a multiclass voice-activity detection system, the use of noisy data in system training, and the fusion of subsystems using acoustic characterization metadata. 

Index Terms: Speaker recognition, noise-robustness, PLDA, iVector


Read more from SRI

  • Banner and attendees at the IEEE Hard Tech Venture Summit

    Cultivating hard tech startups that scale

    IEEE’s Hard Tech Venture Summit convened innovators at SRI to refine strategies and build new networks.

  • Patient going into a MRI

    Bringing surgical tools inside the MRI

    Drawing on SRI’s unique innovation ecosystem, the startup Medical Devices Corner is seeking to improve cancer surgery by advancing MRI-safe teleoperation.

  • Christopher Mims and Susan Patrick

    PARC Forum: How to AI

    The Wall Street Journal tech columnist Christopher Mims and SRI Education’s Susan Patrick discuss how AI can strengthen human agency.