How to train your speaker embedding extractor

Citation

M. McLaren, D. Castan, M. Kumar Nandwana, L. Ferrer and E. Yilmaz. How to train your speaker embedding extractor. Speaker Odyssey 2018. Forthcoming June 2018.

Abstract

With the recent introduction of speaker embeddings for text-independent speaker recognition, many fundamental questions require addressing in order to fast-track the development of this new era of technology. Of particular interest is the ability of the speaker embeddings network to leverage artificially degraded data at a far greater rate than prior technologies, even in the evaluation of naturally degraded data. In this study, we aim to explore some of the fundamental requirements for building a good speaker embeddings extractor. We analyze the impact of voice activity detection, types of degradation, the amount of degraded data, and the number of speakers required for a good network. These aspects are analyzed over a large set of 11 conditions from 7 evaluation datasets. We lay out a set of recommendations for training the network based on the observed trends. By applying these recommendations to enhance the default recipe provided in the Kaldi toolkit, a significant gain of 13-21% on the Speakers in the Wild and NIST SRE’16 datasets is achieved.
