Abstract
We investigate the incorporation of longer time-scale information, such as prosody, into standard speaker ID systems. Our study is based on the Extended Data Task of the NIST 2001 Speaker ID evaluation, which provides much more test and training data than has traditionally been available to similar speaker ID investigations. In addition, we had access to a detailed prosodic feature database of Switchboard-I conversations, including data not previously applied to speaker ID. We describe two baseline acoustic systems: an approach using Gaussian Mixture Models and an LVCSR-based speaker ID system. […]