Dialog Act Modeling for Conversational Speech

Citation

Stolcke, A., Shriberg, E., Bates, R., Coccaro, N., Jurafsky, D., Martin, R., … & Van Ess-Dykema, C. (1998, March). Dialog act modeling for conversational speech. In AAAI spring symposium on applying machine learning to discourse processing (pp. 98-105).

Abstract

We describe an integrated approach for statistical modeling of discourse structure for natural conversational speech. Our model is based on 42 ‘dialog acts’ (e.g., Statement, Question, Backchannel, Agreement, Disagreement, Apology), which were hand-labeled in 1155 conversations from the Switchboard corpus of spontaneous human-to-human telephone speech. We developed several models and algorithms to automatically detect dialog acts from transcribed or automatically recognized words and from prosodic properties of the speech signal, and by using a statistical discourse grammar. All of these components were probabilistic in nature and estimated from data, employing a variety of techniques (hidden Markov models, N-gram language models, maximum entropy estimation, decision tree classifiers, and neural networks).

In preliminary studies, we achieved a dialog act labeling accuracy of 65% based on recognized words and prosody, and an accuracy of 72% based on word transcripts. Since humans achieve 84% on this task (with chance performance at 35%), we find these results encouraging.
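
As an illustration of how a hidden Markov model and a statistical discourse grammar can be combined, the sketch below treats dialog acts as hidden states, a bigram discourse grammar as the transition model, and per-utterance likelihoods (which could come from words, prosody, or both) as the observation model, then finds the most likely act sequence with Viterbi decoding. The abstract does not spell out this exact formulation, so the function name `viterbi_dialog_acts`, the bigram assumption, and all probabilities shown are hypothetical choices made for illustration only.

```python
import math


def viterbi_dialog_acts(utterance_loglik, log_trans, log_init):
    """Return the most likely dialog act sequence for a conversation.

    utterance_loglik: list of dicts, act -> log P(evidence for utterance t | act)
    log_trans:        dict, prev_act -> dict, act -> log P(act | prev_act)
                      (a bigram discourse grammar; an assumption of this sketch)
    log_init:         dict, act -> log P(act) for the first utterance
    """
    if not utterance_loglik:
        return []
    acts = list(log_init)
    # best[t][a]: best log score of any act sequence ending in act a at utterance t
    best = [{a: log_init[a] + utterance_loglik[0][a] for a in acts}]
    back = []  # back[t-1][a]: best predecessor of act a at utterance t
    for obs in utterance_loglik[1:]:
        scores, pointers = {}, {}
        for a in acts:
            prev = max(acts, key=lambda p: best[-1][p] + log_trans[p][a])
            scores[a] = best[-1][prev] + log_trans[prev][a] + obs[a]
            pointers[a] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final act to recover the full sequence.
    path = [max(acts, key=lambda a: best[-1][a])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


def logs(probs):
    """Convert a dict of probabilities to log probabilities."""
    return {k: math.log(v) for k, v in probs.items()}


# Toy, made-up numbers: two acts and two utterances.
toy_init = logs({"Question": 0.5, "Statement": 0.5})
toy_trans = {
    "Question": logs({"Question": 0.1, "Statement": 0.9}),
    "Statement": logs({"Question": 0.4, "Statement": 0.6}),
}
toy_evidence = [
    logs({"Question": 0.9, "Statement": 0.1}),    # clearly a question
    logs({"Question": 0.55, "Statement": 0.45}),  # ambiguous on its own
]
toy_labels = viterbi_dialog_acts(toy_evidence, toy_trans, toy_init)
```

In this toy setup the second utterance slightly favors Question when scored in isolation, but the discourse grammar makes a Statement after a Question far more probable, so the decoded sequence is Question followed by Statement. This is the kind of contextual correction a discourse grammar is meant to supply.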

