Hatch, A. O., Stolcke, A., & Peskin, B. (2005, November). Combining feature sets with support vector machines: Application to speaker recognition. In IEEE Workshop on Automatic Speech Recognition and Understanding, 2005. (pp. 75-79). IEEE.
In this paper, we describe a general technique for optimizing the relative weights of feature sets in a support vector machine (SVM) and show how it can be applied to speaker recognition. Our training procedure uses an objective function that maps the relative weights of the feature sets directly to a classification metric, such as the equal-error rate (EER), measured on a set of training data. The objective function is optimized iteratively with respect to both the feature weights and the SVM parameters (i.e., the support vector weights and the bias values). We use this procedure to optimize the relative weights of various subsets of features in two SVM-based speaker recognition systems: one that uses transform coefficients obtained from maximum likelihood linear regression (MLLR) as features, and another that uses relative frequencies of phone n-grams. In all cases, the training procedure yields significant improvements in both EER and minimum decision cost function (DCF), as measured on various test corpora.
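As a rough illustration of the idea, and not the authors' implementation, the sketch below alternates between training a linear SVM for fixed feature-set weights and adjusting those weights to lower the EER measured on the training data. Several pieces are assumptions made for the example: the plain subgradient SVM trainer, the coordinate search over a small weight grid (swapped in for whatever optimizer the paper uses), the toy data, and all function names. The only fact relied on is that scaling a block of features by the square root of a weight is equivalent to multiplying that block's linear kernel by the weight.

```python
import math
import random


def equal_error_rate(scores, labels):
    """Approximate EER: sweep thresholds, return the point where FAR and FRR are closest."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y != 1]
    best_gap, eer = float("inf"), 1.0
    for t in sorted(set(scores)) + [float("inf")]:
        frr = sum(s < t for s in pos) / len(pos)   # targets rejected
        far = sum(s >= t for s in neg) / len(neg)  # impostors accepted
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer


def train_linear_svm(X, y, lam=0.01, eta=0.01, epochs=20):
    """Hypothetical stand-in trainer: subgradient descent on the regularized hinge loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, yi in zip(X, y):
            margin = yi * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            w = [(1 - eta * lam) * wi for wi in w]
            if margin < 1:
                w = [wi + eta * yi * xi for wi, xi in zip(w, x)]
                b += eta * yi
    return w, b


def scale_blocks(X, blocks, weights):
    """Scale each feature set (a column range) by sqrt(weight)."""
    out = []
    for x in X:
        row = list(x)
        for (lo, hi), wgt in zip(blocks, weights):
            for j in range(lo, hi):
                row[j] = math.sqrt(wgt) * x[j]
        out.append(row)
    return out


def train_eer(X, y, blocks, weights):
    """Train the SVM for the given feature-set weights and return the training-set EER."""
    Xs = scale_blocks(X, blocks, weights)
    w, b = train_linear_svm(Xs, y)
    scores = [sum(wi * xi for wi, xi in zip(w, x)) + b for x in Xs]
    return equal_error_rate(scores, y)


def optimize_weights(X, y, blocks, grid=(0.0, 0.25, 1.0, 4.0), rounds=2):
    """Coordinate search (an assumption): retrain per candidate weight, keep EER improvements."""
    weights = [1.0] * len(blocks)
    best = train_eer(X, y, blocks, weights)
    for _ in range(rounds):
        for k in range(len(blocks)):
            for g in grid:
                trial = weights[:k] + [g] + weights[k + 1:]
                e = train_eer(X, y, blocks, trial)
                if e < best:
                    best, weights = e, trial
    return weights, best


# Toy data (made up): feature set 0 is informative, feature set 1 is noise.
rng = random.Random(0)
y = [1 if i % 2 == 0 else -1 for i in range(40)]
X = [[yi + rng.gauss(0, 1), rng.gauss(0, 3), rng.gauss(0, 3)] for yi in y]
blocks = [(0, 1), (1, 3)]
weights, eer = optimize_weights(X, y, blocks)
print("feature-set weights:", weights, "training EER:", eer)
```

Because the search only accepts a new weight vector when the training EER drops, the returned EER is never worse than the equal-weight starting point. A realistic system would replace the grid with a finer or gradient-based search and would evaluate on held-out data as well.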