Vergyri, D., Lamel, L., & Gauvain, J. L. (2010). Automatic speech recognition of multiple accented English data. In Eleventh Annual Conference of the International Speech Communication Association.
Accent variability is an important factor in speech that can significantly degrade automatic speech recognition performance. We investigate the effect of multiple accents on an English broadcast news recognition system. A multi-accented English corpus is used for the task, including broadcast news segments from six geographic regions: the US, Great Britain, Australia, North Africa, the Middle East and India. A baseline system trained only on US data degrades significantly when confronted with shows from the other regions. Results improve significantly when data from all regions are included in accent-independent acoustic model training. Further improvements are achieved when MAP-adapted accent-dependent models are used in conjunction with a GMM accent classifier.
Index Terms: accented speech recognition, accent adaptation
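
The pairing of MAP adaptation with a GMM accent classifier mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' system: it assumes diagonal-covariance GMMs, mean-only MAP adaptation with a hypothetical relevance factor `tau`, and synthetic 2-D features in place of real acoustic frames. All function names, accent labels and data are invented for illustration.

```python
import numpy as np

def component_logprob(X, weights, means, variances):
    """Log of weight_k * N(x | mean_k, diag(var_k)); shape (n_frames, n_components)."""
    diff = X[:, None, :] - means[None, :, :]
    log_norm = np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
    mahal = np.sum(diff ** 2 / variances[None, :, :], axis=2)
    return np.log(weights)[None, :] - 0.5 * (log_norm + mahal)

def frame_loglik(X, weights, means, variances):
    """Per-frame log-likelihood under the mixture (log-sum-exp over components)."""
    lp = component_logprob(X, weights, means, variances)
    m = lp.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(lp - m).sum(axis=1, keepdims=True))).ravel()

def map_adapt_means(X, weights, means, variances, tau=10.0):
    """Mean-only MAP update: interpolate prior means with accent-specific data.
    tau is an assumed relevance factor, not a value taken from the paper."""
    lp = component_logprob(X, weights, means, variances)
    post = np.exp(lp - lp.max(axis=1, keepdims=True))
    post /= post.sum(axis=1, keepdims=True)   # component responsibilities
    n_k = post.sum(axis=0)                    # soft frame counts per component
    weighted_sum = post.T @ X                 # (n_components, n_dims)
    return (tau * means + weighted_sum) / (tau + n_k)[:, None]

def classify_accent(X, accent_gmms):
    """Pick the accent whose GMM gives the highest average frame log-likelihood."""
    scores = {name: frame_loglik(X, *params).mean()
              for name, params in accent_gmms.items()}
    return max(scores, key=scores.get)

# Synthetic stand-in for an accent-independent model and accent-specific data.
rng = np.random.default_rng(0)
weights = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0], [4.0, 4.0]])
variances = np.ones((2, 2))

def sample(shift, n=200):
    """Draw frames from the base mixture with every mean shifted by `shift`."""
    comp = rng.integers(0, 2, size=n)
    return means[comp] + shift + rng.normal(size=(n, 2))

accent_gmms = {}
for name, shift in {"accent_a": 1.0, "accent_b": -1.0}.items():
    adapted = map_adapt_means(sample(shift), weights, means, variances)
    accent_gmms[name] = (weights, adapted, variances)

# Classify a held-out "utterance" drawn from the accent_a distribution; in a
# full system the chosen label would select which adapted model decodes it.
predicted = classify_accent(sample(1.0, n=50), accent_gmms)
print(predicted)
```

In this sketch the classifier reuses the adapted GMMs themselves for accent identification. A real recognizer would more likely keep a separate, smaller accent-identification GMM alongside the adapted HMM acoustic models.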