A. Tsiartas, A. Kathol, E. Shriberg, M. de Zambotti, and A. Willoughby, “Prediction of heart rate changes from speech features during interaction with a misbehaving dialog system,” in Proc. Interspeech 2015, pp. 3715-3719.
Most research on detecting a speaker’s cognitive state when interacting with a dialog system has been based on self-reports or on hand-coded subjective judgments based on audio or audio-visual observations. This study examines two questions: (1) how do undesirable system responses affect people physiologically, and (2) to what extent can we predict physiological changes from the speech signal alone? To address these questions, we use a new corpus of simultaneous speech and high-quality physiological recordings in the product returns domain (the SRI BioFrustration Corpus). “Triggers” were used to frustrate users at specific times during the interaction to produce emotional responses at similar times during the experiment across participants. For each of eight return tasks per participant, we compared speaker-normalized pre-trigger (cooperative system behavior) regions to post-trigger (uncooperative system behavior) regions. Results using random forest classifiers show that changes in spectral and temporal features of speech can predict heart rate changes with an accuracy of ~70%. Implications for future research and applications are discussed.
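
As a rough illustration of the speaker-normalized pre-trigger versus post-trigger comparison described above, the following minimal sketch z-scores one speech feature within a single speaker and takes the difference between the two regions. The abstract does not specify the normalization scheme, so per-speaker z-scoring is an assumption here, and the function names and pitch values are hypothetical.

```python
from statistics import mean, pstdev


def speaker_normalize(values):
    """Z-score feature values using this speaker's own mean and std (assumed scheme)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature carries no within-speaker variation.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def pre_post_delta(pre, post):
    """Mean post-trigger minus mean pre-trigger value, after normalizing
    over both regions of the same speaker together."""
    normalized = speaker_normalize(pre + post)
    pre_n = normalized[:len(pre)]
    post_n = normalized[len(pre):]
    return mean(post_n) - mean(pre_n)


# Hypothetical pitch values (Hz) for one speaker in one return task.
pre_pitch = [118.0, 121.0, 119.5, 120.5]
post_pitch = [127.0, 131.0, 129.5, 130.5]
delta = pre_post_delta(pre_pitch, post_pitch)
```

A vector of such deltas, one per spectral or temporal feature, could then serve as the input to a classifier such as a random forest, with the direction of the heart rate change as the label.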