Speech: A Privileged Modality

Citation

Julia, L. E., & Cheyer, A. (1997). Speech: a privileged modality. In EUROSPEECH.

Abstract

Ever since the publication of Bolt’s ground-breaking “Put-That-There” paper, providing multiple modalities as a means of easing the interaction between humans and computers has been a desirable attribute of user interface design. In Bolt’s early approach, the style of modality combination required the user to conform to a rigid order when entering spoken and gestural commands. In the early 1990s, the idea of synergistic multimodal combination began to emerge, although actual implemented systems (generally using keyboard and mouse) remained far from synergistic. Next-generation approaches used time-stamped events to reason about the fusion of multimodal input arriving in a given time window, but these systems were hindered by time-consuming matching algorithms. To overcome this limitation, we proposed [JULIA 93] a truly synergistic application and a distributed architecture for flexible interaction that reduces the need for explicit time stamping. Our slot-based approach is command directed, making it suitable for applications that use speech as a primary modality. In this article, we use our interaction model to demonstrate that during multimodal fusion, speech should be a privileged modality, driving the interpretation of a query, and that in certain cases, speech has even more power to override and modify the combination of other modalities than previously believed.
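As a rough illustration of what a command-directed, slot-based fusion might look like, the sketch below lets the spoken command choose which slots must be filled, fills them from gestures in arrival order without comparing time stamps, and lets any value stated in speech take precedence over a gesture. It is not the architecture described in the paper: the names `COMMAND_SLOTS`, `Gesture`, and `fuse`, the command vocabulary, and the override rule are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical command frames: each spoken verb names the typed slots it needs.
COMMAND_SLOTS = {
    "move": ["object", "location"],
    "delete": ["object"],
}


@dataclass
class Gesture:
    kind: str   # the slot type this gesture can fill, e.g. "object"
    value: str  # the referent selected by the gesture


def fuse(verb, spoken_slots, gestures):
    """Fill the slots required by the spoken verb.

    Assumed rule: a value given in speech always wins; otherwise the first
    unused gesture of the matching kind is taken, in arrival order.
    """
    frame = {}
    pending = list(gestures)  # arrival order, no explicit time stamps
    for slot in COMMAND_SLOTS[verb]:
        if slot in spoken_slots:
            # Speech is privileged: it overrides any gesture for this slot.
            frame[slot] = spoken_slots[slot]
            continue
        match = next((g for g in pending if g.kind == slot), None)
        if match is not None:
            pending.remove(match)
            frame[slot] = match.value
        else:
            frame[slot] = None  # slot still open; more input is needed
    return frame
```

For example, under these assumptions, saying "move this downtown" while pointing at an object and at a map position would keep the gestured object but take the location from speech, so `fuse("move", {"location": "downtown"}, [...])` ignores the gestured position.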

