Senior Computer Scientist, Speech Technology and Research Laboratory (STAR)
Mitchell McLaren, Ph.D., is an advanced research engineer in SRI International’s Speech Technology and Research (STAR) Laboratory. His research interests include speaker and language identification, as well as other biometrics such as face recognition.
At SRI, McLaren is task lead of the Speaker Identification task for the Defense Advanced Research Projects Agency (DARPA) Robust Automatic Transcription of Speech (RATS) program. In past projects, he was heavily involved in the Threshold Calibration for Speaker Identification project from the Federal Bureau of Investigation and in the production of SRI’s submission to the 2012 Speaker Recognition Evaluation held by the National Institute of Standards and Technology.
Prior to joining SRI, McLaren was a postdoctoral researcher at the University of Nijmegen, The Netherlands, where he focused on speaker and face identification for the Bayesian Biometrics for Forensics (BBfor2) project, funded by the Marie Curie Actions Research Fellowship Programme.
McLaren has published more than 20 papers in the field of speaker, language, and face recognition. View selected publications on Google Scholar.
His Ph.D. in speaker identification is from the Queensland University of Technology (QUT), Brisbane, Australia.
View Dr. McLaren’s LinkedIn profile.
Recent publications
In this work, we extend the TBC method, proposing a new similarity metric for selecting training data that results in significant gains over the one proposed in the original work.
In this study, we aim to analyze the behavior of speaker recognition systems based on speaker embeddings with different front-end features, including standard MFCCs as well as PNCC and PLP.
Robust Speaker Recognition from Distant Speech under Real Reverberant Environments Using Speaker Embeddings
This article focuses on speaker recognition using speech acquired with a single distant or far-field microphone in an indoor environment.
Approaches found to provide robustness in multi-domain LID include a domain-and-language-weighted Gaussian backend classifier, duration-aware calibration, and a source normalized multi-resolution neural network backend.
In this study, we aim to explore some of the fundamental requirements for building a good speaker embeddings extractor. We analyze the impact of voice activity detection, types of degradation, the amount of degraded data, and the number of speakers required for a good network.
In this paper, we investigate several automatic transcription schemes for using raw bilingual broadcast news data in semi-supervised bilingual acoustic model training.