Publications
-
Integrating Several Annotation Layers for Statistical Information Distillation
We present a sentence extraction algorithm for Information Distillation, a task where for a given templated query, relevant passages must be extracted from massive audio and textual document sources.
-
OOV Detection by Joint Word/Phone Lattice Alignment
We propose a new method for detecting out-of-vocabulary (OOV) words for large vocabulary continuous speech recognition (LVCSR) systems. Our method is based on performing a joint alignment between independently generated…
-
Building A Highly Accurate Mandarin Speech Recognizer
We describe a highly accurate large-vocabulary continuous Mandarin speech recognizer, a collaborative effort among four research organizations. In particular, we build two acoustic models (AMs) with significant differences but with similar…
-
Morph-Based Speech Recognition and Modeling of Out-of-Vocabulary Words Across Languages
We explore the use of morph-based language models in large-vocabulary continuous speech recognition systems across four so-called “morphologically rich” languages: Finnish, Estonian, Turkish, and Egyptian Colloquial Arabic. The morphs are…
-
Web 2.0 in the enterprise
Based on extensive studies of social systems such as del.icio.us and Wikipedia, we have identified a number of factors that need to be managed in order to realize the full…
-
Sampling Stable Properties of Massive Track Datasets
In this paper, we explore ways in which stable properties of sensor observations can be extracted and visualized using a statistical sampling of features from a very large track dataset,…
-
Thermal cure effects on electrical performance of nanoparticle silver inks
Physical, electrical, and morphological properties of thermally annealed silver nanoparticle thin films are described.
-
An exploratory study of the effect of domain knowledge on Internet search behavior: The case of diabetes
This study investigated how domain knowledge about diabetes influences the process and outcome of answering complex questions using the Internet.
-
Extending Boosting for Large Scale Spoken Language Understanding
We propose three methods for extending the Boosting family of classifiers motivated by the real-life problems we have encountered. Our results indicate that it is possible to obtain the same…
-
Radio and Meteor Science Outcomes from Comparisons of Meteor Radar Observations at AMISR Poker Flat, Sondrestrom, and Arecibo
We address these meteor “head-echo” observation issues via the first-ever use and analysis of meteor observations from the Poker Flat AMISR (PFISR: 449.3 MHz), Sondrestrom (SRF: 1,290 MHz), and Arecibo (AO:…
-
Capturing a Taxonomy of Failures During Automatic Interpretation of Questions Posed in Natural Language
In this paper, we present a study – conducted in the context of the Halo Project – cataloging the types of failures that occur when capturing knowledge from natural language.
-
Comparison of US, EPO, and PCT Patent Citations for Citation Analysis
This paper summarizes the results of a statistical comparison of citations on patent documents from the U.S. Patent & Trademark Office (USPTO), the EPO, and the PCT to determine whether…