In this work, we extend the TBC method, proposing a new similarity metric for selecting training data that results in significant gains over the one proposed in the original work.
Analysis of Complementary Information Sources in the Speaker Embeddings Framework
In this study, our aim is analyzing the behavior of the speaker recognition systems based on speaker embeddings toward different front-end features, including the standard MFCC, as well as PNCC, and PLP.
Robust Speaker Recognition from Distant Speech under Real Reverberant Environments Using Speaker Embeddings
This article focuses on speaker recognition using speech acquired using a single distant or far-field microphone in an indoors environment.
How to train your speaker embedding extractor
In this study, we aim to explore some of the fundamental requirements for building a good speaker embeddings extractor. We analyze the impact of voice activity detection, types of degradation, the amount of degraded data, and number of speakers required for a good network.
Approaches to multi-domain language recognition
Approaches found to provide robustness in multi-domain LID include a domain-and-language-weighted Gaussian backend classifier, duration-aware calibration, and a source normalized multi-resolution neural network backend.
Language Diarization for Semi-supervised Bilingual Acoustic Model Training
In this paper, we investigate several automatic transcription schemes for using raw bilingual broadcast news data in semi-supervised bilingual acoustic model training.
Calibration Approaches for Language Detection
In this paper, we focus on situations in which either (1) the system-modeled languages are not observed during use or (2) the test data contains OOS languages that are unseen during modeling or calibration.
Improving Robustness of Speaker Recognition to New Conditions Using Unlabeled Data
We benchmark these approaches on several distinctly different databases, after we describe our SRICON-UAM team system submission for the NIST 2016 SRE.
On the Issue of Calibration in DNN-Based Speaker Recognition Systems
This article is concerned with the issue of calibration in the context of Deep Neural Network (DNN) based approaches to speaker recognition. We propose a hybrid alignment framework, which stems from our previous work in DNN senone alignment, that uses the bottleneck features only for the alignment of features during statistics calculation.