A Multimodal Learning Interface for Sketch, Speak and Point Creation of a Schedule Chart

Citation

Kaiser Ed, Demirdjian David, Gruenstein Alexander, Li Xiaoguang, Niekrasz John, Wesson Matt, Kumar Sanjeev. A Multimodal Learning Interface for Sketch, Speak and Point Creation of a Schedule Chart, in Proceedings of the 6th International Conference on Multimodal Interfaces, ACM, pp. 329-330, 2004.

Abstract

We present a video demonstration of an agent-based test bed application for ongoing research into multi-user, multimodal, computer-assisted meetings. The system tracks a two person scheduling meeting: one person standing at a touch sensitive whiteboard creating a Gantt chart, while another person looks on in view of a calibrated stereo camera. The stereo camera performs real-time, untethered, vision-based tracking of the onlooker’s head, torso and limb movements, which in turn are routed to a 3D-gesture recognition agent. Using speech, 3D deictic gesture and 2D object de-referencing the system is able to track the onlooker’s suggestion to move a specific milestone. The system also has a speech recognition agent capable of recognizing out-of-vocabulary (OOV) words as phonetic sequences. Thus when a user at the whiteboard speaks an OOV label name for a chart constituent while also writing it, the OOV speech is combined with letter sequences hypothesized by the handwriting recognizer to yield an orthography, pronunciation and semantics for the new label. These are then learned dynamically by the system and become immediately available for future recognition.


Read more from SRI

  • Banner and attendees at the IEEE Hard Tech Venture Summit

    Cultivating hard tech startups that scale

    IEEE’s Hard Tech Venture Summit convened innovators at SRI to refine strategies and build new networks.

  • Patient going into a MRI

    Bringing surgical tools inside the MRI

    Drawing on SRI’s unique innovation ecosystem, the startup Medical Devices Corner is seeking to improve cancer surgery by advancing MRI-safe teleoperation.

  • Christopher Mims and Susan Patrick

    PARC Forum: How to AI

    The Wall Street Journal tech columnist Christopher Mims and SRI Education’s Susan Patrick discuss how AI can strengthen human agency.