A comparative large scale study of MLP features for Mandarin ASR

Citation

F. Valente, M. M. Doss, C. Plahl, S. Ravuri and W. Wang, “A comparative large scale study of MLP features for Mandarin ASR,” in Proc. 11th Annual Conference of the International Speech Communication Association 2010 (INTERSPEECH 2010), pp. 2630–2633.

Abstract

MLP-based front-ends have shown significant complementary properties to conventional spectral features. As part of the DARPA GALE program, different MLP features were developed for Mandarin ASR. In this paper, all the proposed front-ends are compared in a systematic manner, and we extensively investigate the scalability of these features in terms of the amount of training data (from 100 hours to 1600 hours) and system complexity (maximum likelihood training, SAT, lattice-level combination, and discriminative training). Results on 5 hours of evaluation data from the GALE project reveal that the MLP features consistently produce relative improvements in the range of 15% to 23% at the different steps of a multipass system when compared to conventional short-term spectral-based features like MFCC and PLP. The largest improvement is obtained using a hierarchical MLP approach.

Keywords: TANDEM features, Multi-Layer Perceptron, Acoustic features, GALE project, LVCSR.

