Shriberg, E., Ferrer, L., Venkataraman, A., & Kajarekar, S. (2004). SVM Modeling of “SNERF-Grams” for Speaker Recognition. In Eighth International Conference on Spoken Language Processing.
We describe a new approach to modeling idiosyncratic prosodic behavior for automatic speaker recognition. The approach computes prosodic features by syllable (syllable-based nonuniform extraction region features, or “SNERFs”), and models the syllable-feature sequences (“SNERF-grams”) using support vector machines (SVMs). We evaluate performance on development data for a system submitted to the NIST 2004 Speaker Recognition Evaluation. Results show that SNERF-grams provide significant performance gains when combined with a state-of-the-art baseline system, as well as with both prosodic and word-based noncepstral systems.
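
The abstract does not specify how the syllable-level features are turned into SVM inputs, so the following is only a minimal sketch of one plausible reading: a per-syllable feature is discretized into bins, and the relative frequencies of the resulting bin N-grams form a fixed-length vector that a classifier such as an SVM could consume. The bin edges, the choice of bigrams, the example durations, and all function names are assumptions made for illustration, not details taken from the paper.

```python
from bisect import bisect_right
from collections import Counter


def discretize(values, edges):
    """Map each value to a bin index, given sorted bin edges (assumed, not from the paper)."""
    return [bisect_right(edges, v) for v in values]


def ngram_frequencies(tokens, n):
    """Return the relative frequency of every length-n token sequence."""
    grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not grams:
        return {}
    counts = Counter(grams)
    return {gram: count / len(grams) for gram, count in counts.items()}


def to_vector(freqs, vocabulary):
    """Lay the frequencies out in a fixed order so every speaker gets the same dimensions."""
    return [freqs.get(gram, 0.0) for gram in vocabulary]


# Hypothetical per-syllable durations in seconds for one conversation side.
durations = [0.12, 0.30, 0.08, 0.25, 0.31, 0.10]
edges = [0.10, 0.20]  # hypothetical boundaries: short / medium / long

tokens = discretize(durations, edges)
freqs = ngram_frequencies(tokens, n=2)
vocabulary = sorted(freqs)  # in practice this would be fixed from training data
vector = to_vector(freqs, vocabulary)
```

In a setup like this, one such vector per conversation side would be the input to the SVM; the SVM training step itself is omitted here because the abstract gives no details about kernels or parameters.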