Author: SRI International

  • Integrating MAP, Marginals, and Unsupervised Language Model Adaptation

    We investigate the integration of various language model adaptation approaches for a cross-genre adaptation task to improve Mandarin ASR system performance on a recently introduced genre, broadcast conversation (BC).

  • Exploiting Information Extraction Annotations for Document Retrieval in Distillation Tasks

    In this paper, we present our approach for using information extraction annotations to augment document retrieval for distillation. We take advantage of the fact that some of the distillation queries can be associated with annotation elements introduced for the NIST Automatic Content Extraction (ACE) task.

  • fMPE-MAP: Improved Discriminative Adaptation for Modeling New Domains

    This paper introduces a new adaptation approach, fMPE-MAP, which extends the original fMPE (feature minimum phone error) algorithm with an enhanced ability to port Gaussian models and fMPE transforms to a new domain.

  • Build IT: Girls Developing Information Technology Fluency Through Design. Annual Report Year 2

    BuildIT is an after-school and summer youth-based curriculum for low-income middle school girls to develop IT fluency, interest in mathematics, and knowledge of IT careers.

  • A Semi-Supervised Learning Approach for Morpheme Segmentation for an Arabic Dialect

    We evaluate our approach by applying morpheme segmentation to the training data of a statistical machine translation (SMT) system. Experiments show that our approach is less sensitive to the availability of annotated stems than a previous rule-based approach and learns 12% more segmentations on our Iraqi Arabic data.

  • Detecting Deception Using Critical Segments

    We present an investigation of segments that map to GLOBAL LIES, that is, the intent to deceive with respect to salient topics of the discourse. We propose that identifying the truth or falsity of these CRITICAL SEGMENTS may be important in determining a speaker’s veracity over the larger topic of discourse.

  • Duration and Pronunciation Conditioned Lexical Modeling for Speaker Verification

    We propose a method to improve speaker recognition lexical model performance using acoustic-prosodic information. More specifically, the lexical model is trained using duration- and pronunciation-conditioned word N-grams, simultaneously modeling lexical information along with their acoustic and prosodic characteristics.

  • Leveraging graph locality via abstraction

    The use of abstraction to speed up problem solving is ubiquitous in AI, especially in the field of heuristic search, where abstraction has proven a crucial technique for creating highly accurate memory-based heuristics known as pattern databases (PDBs).

  • Regression testing for grammar-based systems

    This paper describes best practices in two closely related regression testing frameworks used in grammar-based systems: MedSLT, a spoken language translation system based on the Regulus platform, and a search and question answering system based on PARC’s XLE syntax-semantics parser.

  • An LFG Chinese grammar for machine use

    This paper describes the Chinese grammar developed at PARC, including its three basic components: the tokenizer and tagger, the lexicon, and the syntactic rules.

  • PARC’s Bridge question answering system

    This paper describes a system designed to robustly map from natural language sentences to logical, abstract knowledge representations (the Bridge system).

  • Overlay mechanisms for multi-level deep processing applications

    This paper discusses some engineering tools that are used in the XLE grammar development platform to allow for domain specialization.