Integrating Several Annotation Layers for Statistical Information Distillation

Citation

M. Levit, D. Hakkani-Tur, G. Tur and D. Gillick, “Integrating several annotation layers for statistical information distillation,” 2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU), 2007, pp. 671-676, doi: 10.1109/ASRU.2007.4430192.

Abstract

We present a sentence extraction algorithm for Information Distillation, a task in which, for a given templated query, relevant passages must be extracted from massive audio and textual document sources. For each sentence of the relevant documents (which are assumed to be known from the upstream stages), we employ statistical classification methods to estimate the extent of its relevance to the query. Two aspects of relevance are taken into account: the template (type) of the query and its slots (free-text descriptions of names, organizations, topics, events and so on, around which templates are centered). The distinguishing characteristic of the presented method is the choice of features used for classification. We extract our features from charts, which are compilations of elements from various annotation levels, such as word transcriptions, syntactic and semantic parses, and Information Extraction annotations. In our experiments we show that this integrated approach outperforms a purely lexical baseline by as much as 30% relative in terms of F-measure. We also investigate the algorithm’s behavior under noisy conditions by comparing its performance on ASR output and on the corresponding manual transcriptions.
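To make the "30% relative in terms of F-measure" figure concrete, the following minimal sketch shows how an F-measure and a relative improvement over a baseline are commonly computed. It is an illustration only: the abstract does not state which F-measure weighting was used, so the balanced form (beta = 1) is an assumption, and the function names and the scores 0.40 and 0.52 are hypothetical values chosen for the example, not results from the paper.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall.

    beta = 1.0 gives the balanced F1 score (an assumption here; the
    abstract does not specify the weighting).
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing is correct
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


def relative_improvement(baseline: float, system: float) -> float:
    """Improvement of `system` over `baseline` as a fraction of `baseline`."""
    if baseline <= 0.0:
        raise ValueError("baseline score must be positive")
    return (system - baseline) / baseline


if __name__ == "__main__":
    # Hypothetical scores, not taken from the paper.
    baseline_f = 0.40
    system_f = 0.52
    gain = relative_improvement(baseline_f, system_f)
    print(f"relative improvement: {gain:.0%}")
```

Note that a relative improvement is measured against the baseline score itself, so in this hypothetical case an absolute gain of 0.12 on a baseline of 0.40 corresponds to a 30% relative gain.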

