Parameterization of Prosodic Feature Distributions for SVM Modeling in Speaker Recognition


L. Ferrer, E. Shriberg, S. Kajarekar and K. Sonmez, “Parameterization of Prosodic Feature Distributions for SVM Modeling in Speaker Recognition,” 2007 IEEE International Conference on Acoustics, Speech and Signal Processing – ICASSP ’07, 2007, pp. IV-233-IV-236, doi: 10.1109/ICASSP.2007.366892.


Multiple recent studies have shown that speaker recognition performance using frame-based cepstral features is improved by adding higher-level information, including prosodic and lexical features. This paper explores the question of finding a good kernel for a system that models syllable-based prosodic features using support vector machines (SVMs). The system has been the best performing of our high-level systems in the last two NIST evaluations, and gives significant improvements when combined with cepstral-based systems. We introduce two new methods for transforming the syllable-level features into a single high-dimensional vector that can be well modeled by SVMs, resulting in significant gains in speaker recognition performance.
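
The abstract does not specify the two transformation methods, so the following is only a generic sketch of the underlying idea: turning a variable-length sequence of syllable-level features into one fixed-length vector that an SVM could consume. It assumes a simple per-feature binning into normalized counts. The function name, the two example features, and the bin edges are all hypothetical and are not taken from the paper.

```python
from bisect import bisect_right


def syllables_to_vector(syllables, bin_edges):
    """Map a variable-length list of per-syllable feature tuples to one
    fixed-length vector of normalized bin counts (one block per feature).

    Assumption: plain per-feature binning, used here for illustration only;
    this is not claimed to be either method introduced in the paper.
    """
    total = len(syllables) or 1  # avoid division by zero on empty input
    vector = []
    for f, edges in enumerate(bin_edges):
        # len(edges) boundaries define len(edges) + 1 bins
        counts = [0] * (len(edges) + 1)
        for syl in syllables:
            counts[bisect_right(edges, syl[f])] += 1
        vector.extend(c / total for c in counts)
    return vector


# Hypothetical per-syllable features: (pitch slope, duration in seconds)
conversation = [(-0.5, 0.12), (0.3, 0.20), (1.2, 0.31), (0.1, 0.18)]
# Hypothetical bin boundaries, one sorted list per feature
edges = [[0.0, 1.0], [0.15, 0.25]]

vec = syllables_to_vector(conversation, edges)
empty_vec = syllables_to_vector([], edges)
```

Under these assumptions every conversation side, whatever its number of syllables, yields a vector of the same length (here 6), which is the property a standard SVM kernel needs.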
