Author: SRI International

August 1, 2007

Integrating MAP, Marginals, and Unsupervised Language Model Adaptation

We investigate the integration of various language model adaptation approaches for a cross-genre adaptation task to improve Mandarin ASR system performance on a recently introduced genre, broadcast conversation (BC).
August 1, 2007

Exploiting Information Extraction Annotations for Document Retrieval in Distillation Tasks

In this paper, we present our approach for using information extraction annotations to augment document retrieval for distillation. We take advantage of the fact that some of the distillation queries can be associated with annotation elements introduced for the NIST Automatic Content Extraction (ACE) task.
August 1, 2007

fMPE-MAP: Improved Discriminative Adaptation for Modeling New Domains

This paper introduces a new adaptation approach, fMPE-MAP, an extension of the original fMPE (feature minimum phone error) algorithm with an enhanced ability to port Gaussian models and fMPE transforms to a new domain.
August 1, 2007

Build IT: Girls Developing Information Technology Fluency Through Design. Annual Report Year 2

Build IT is an after-school and summer youth-based curriculum for low-income middle school girls to develop IT fluency, interest in mathematics, and knowledge of IT careers.
August 1, 2007

A Semi-Supervised Learning Approach for Morpheme Segmentation for an Arabic Dialect

We evaluate our approach by applying morpheme segmentation to the training data of a statistical machine translation (SMT) system. Experiments show that our approach is less sensitive to the availability of annotated stems than a previous rule-based approach and learns 12% more segmentations on our Iraqi Arabic data.
August 1, 2007

Detecting Deception Using Critical Segments

We present an investigation of segments that map to GLOBAL LIES, that is, the intent to deceive with respect to salient topics of the discourse. We propose that identifying the truth or falsity of these CRITICAL SEGMENTS may be important in determining a speaker’s veracity over the larger topic of discourse.
August 1, 2007

Duration and Pronunciation Conditioned Lexical Modeling for Speaker Verification

We propose a method to improve speaker recognition lexical model performance using acoustic-prosodic information. More specifically, the lexical model is trained using duration- and pronunciation-conditioned word N-grams, simultaneously modeling lexical information together with its acoustic and prosodic characteristics.
July 21, 2007

Leveraging graph locality via abstraction

The use of abstraction to speed up problem solving is ubiquitous in AI, especially in the field of heuristic search, where abstraction has proven a crucial technique for creating highly accurate memory-based heuristics known as pattern databases (PDBs).
July 15, 2007

Regression testing for grammar-based systems

This paper describes best practices in two closely related regression testing frameworks used in grammar-based systems: MedSLT, a spoken language translation system based on the Regulus platform, and a search and question answering system based on PARC’s XLE syntax-semantics parser.
July 15, 2007

An LFG Chinese grammar for machine use

This paper describes the Chinese grammar developed at PARC, including its three basic components: the tokenizer and tagger, the lexicon, and the syntactic rules.
July 15, 2007

PARC’s Bridge question answering system

This paper describes a system designed to robustly map from natural language sentences to logical, abstract knowledge representations (the Bridge system).
July 15, 2007

Overlay mechanisms for multi-level deep processing applications

This paper discusses some engineering tools that are used in the XLE grammar development platform to allow for domain specialization.