Novel Approaches to Arabic Speech Recognition: Report from the 2002 Johns Hopkins Summer Workshop

Citation

Kirchhoff, K., Bilmes, J., Das, S., Duta, N., Egan, M., Ji, G., … & Vergyri, D. (2003, April). Novel approaches to Arabic speech recognition: report from the 2002 Johns-Hopkins summer workshop. In 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings.(ICASSP’03). (Vol. 1, pp. I-I). IEEE.

Abstract

Although Arabic is currently one of the most widely spoken languages in the world, there has been relatively little speech recognition research on Arabic compared to other languages. Moreover, most previous work has concentrated on the recognition of formal rather than dialectal Arabic. This paper reports on our project at the 2002 Johns Hopkins Summer Workshop, which focused on the recognition of dialectal Arabic. Three problems were addressed: (a) the lack of short vowels and other pronunciation information in Arabic texts; (b) the morphological complexity of Arabic; and (c) the discrepancies between dialectal and formal Arabic. We present novel approaches to automatic vowel restoration, morphology-based language modeling and the integration of out-of-corpus language model data, and report significant word error rate improvements on the LDC Arabic CallHome task.


Read more from SRI

  • Banner and attendees at the IEEE Hard Tech Venture Summit

    Cultivating hard tech startups that scale

    IEEE’s Hard Tech Venture Summit convened innovators at SRI to refine strategies and build new networks.

  • Patient going into a MRI

    Bringing surgical tools inside the MRI

    Drawing on SRI’s unique innovation ecosystem, the startup Medical Devices Corner is seeking to improve cancer surgery by advancing MRI-safe teleoperation.

  • Christopher Mims and Susan Patrick

    PARC Forum: How to AI

    The Wall Street Journal tech columnist Christopher Mims and SRI Education’s Susan Patrick discuss how AI can strengthen human agency.