Search results for: “stolcke”
-
Efficient Lattice Representation and Generation
We describe two new techniques for reducing word lattice sizes without eliminating hypotheses.
-
How Far Do Speakers Back Up in Repairs? A Quantitative Model
We propose a quantitative model that predicts the overall distribution of retrace lengths in a large corpus of spontaneous speech, based only on word position. Results have implications for modeling repairs in speech applications and constrain explanatory models in psycholinguistics.
-
Automatic Detection of Sentence Boundaries and Disfluencies based on Recognized Words
We study the problem of detecting linguistic events at interword boundaries, such as sentence boundaries and disfluency locations, in speech transcribed by an automatic recognizer. Several model combination approaches are investigated.
-
Dialog Act Modeling for Conversational Speech
We describe an integrated approach for statistical modeling of discourse structure for natural conversational speech. Our model is based on 42 ‘dialog acts’, which were hand-labeled in 1155 conversations from the Switchboard corpus of spontaneous human-to-human telephone speech.
-
New Developments in Lattice-Based Search Strategies in SRI’s Hub4 System
We describe new developments in SRI’s lattice-based progressive search strategy. These developments include the implementation of a new bigram lattice algorithm, lattice optimization techniques, and expansion of bigram lattices to trigram lattices.
-
The Development of SRI’s 1997 Broadcast News Transcription System
This paper describes SRI’s 1997 broadcast news transcription system used for the 1997 DARPA H4 evaluations. Our system had several novel components. We briefly describe these features and give comparative experimental results.
-
Entropy-based Pruning of Backoff Language Models
A criterion for pruning parameters from N-gram backoff language models is developed, based on the relative entropy between the original and the pruned model. It is shown that the relative entropy resulting from pruning a single N-gram can be computed exactly and efficiently for backoff models.
-
Can Prosody Aid the Automatic Classification of Dialog Acts in Conversational Speech?
This study asks whether current approaches, which use mainly word information, could be improved by adding prosodic information. The study is based on more than 1000 conversations from the Switchboard corpus.
-
A Prosody-Only Decision-Tree Model for Disfluency Detection
We have developed a disfluency detection method using decision tree classifiers that use only local and automatically extracted prosodic features. Because the model doesn’t rely on lexical information, it is widely applicable even when word recognition is unreliable.
-
Explicit Word Error Minimization in N-best List Rescoring
We show that the standard hypothesis scoring paradigm used in maximum-likelihood-based speech recognition systems is not optimal with regard to minimizing the word error rate, the commonly used performance metric in speech recognition.
-
Modeling Linguistic Segment and Turn Boundaries for N-best Rescoring of Spontaneous Speech
We present an N-best rescoring algorithm that removes the effect of segmentation mismatch. Furthermore, we show that explicit language modeling of hidden linguistic segment boundaries is improved by including turn-boundary events in the model.
-
A Study of Multilingual Speech Recognition
This paper describes our work in developing multilingual (Swedish and English) speech recognition systems in the ATIS domain. The acoustic component of the multilingual systems is realized through sharing Gaussian codebooks across Swedish and English allophones.
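
The entry "Explicit Word Error Minimization in N-best List Rescoring" above notes that picking the highest-scoring hypothesis is not optimal for word error rate. A minimal sketch of that idea, assuming an N-best list with posterior-like weights: pick the hypothesis whose expected word edit distance to the other hypotheses is smallest. The function names and the toy probabilities are illustrative assumptions, not taken from the paper, and this is not claimed to be the paper's exact algorithm.

```python
from typing import List, Tuple


def word_edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Levenshtein distance over words (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def min_expected_word_error(nbest: List[Tuple[List[str], float]]) -> List[str]:
    """Return the hypothesis with the lowest expected word error.

    Each N-best entry is (words, weight); weights are normalized here and
    treated as posterior probabilities (an assumption of this sketch).
    """
    total = sum(p for _, p in nbest)
    best_words, best_risk = None, float("inf")
    for cand, _ in nbest:
        # Expected number of word errors if `cand` is output and each
        # hypothesis is the truth with its posterior probability.
        risk = sum(p / total * word_edit_distance(ref, cand) for ref, p in nbest)
        if risk < best_risk:
            best_words, best_risk = cand, risk
    return best_words


# Toy N-best list with made-up probabilities.
nbest = [
    (["the", "cat", "sat"], 0.40),  # highest single posterior
    (["a", "dog", "ran"], 0.35),
    (["a", "dog", "run"], 0.25),
]
max_posterior = max(nbest, key=lambda entry: entry[1])[0]
min_risk = min_expected_word_error(nbest)
print(max_posterior)
print(min_risk)
```

In this toy list the top-scoring hypothesis stands alone, while the two lower-scoring hypotheses agree on most words, so the expected-error criterion selects one of them instead of the top-scoring one.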