An Efficient Repair Procedure For Quick Transcriptions

Citation

Venkataraman, A., Stolcke, A., Wang, W., Vergyri, D., Zheng, J., & Gadde, V. R. R. (2004). An efficient repair procedure for quick transcriptions. In Eighth International Conference on Spoken Language Processing.

Abstract

We describe an efficient procedure for automatic repair of quickly transcribed (QT) speech. QT speech, typically closed-captioned data from television broadcasts, usually has a significant number of deletions and misspellings, and has a characteristic absence of disfluencies such as filled pauses (for example, um, uh). Errors of these kinds often throw an acoustic model training program out of alignment and make it hard for it to resynchronize. At best the erroneous utterance is discarded and does not benefit the training procedure. At worst, it could misalign and end up sabotaging the training data. The procedure we propose in this paper aims to cleanse such quick transcriptions so that they align better with the acoustic evidence and thus provide for better acoustic models for automatic speech recognition (ASR). Results from comparing our transcripts with those from careful transcriptions on the same corpus, and from comparable state-of-the-art methods, are also presented and discussed.

