Belvin, R. S., Riehemann, S. Z., & Precoda, K. (2004, May). A Fine-Grained Evaluation Method for Speech-to-Speech Machine Translation Using Concept Annotations. In LREC.
In this paper we report on a method for evaluating spoken language translation systems that builds on a task-based evaluation method developed by CMU. Rather than relying on a predefined database of Interchange Format representations of spoken utterances, our method relies on a set of explicitly defined conventions for creating these interlingual representations. It also departs from CMU’s scoring conventions by taking a finer-grained approach to scoring, especially the scoring of predicates. We have attempted to validate this approach to speech-to-speech MT evaluation by looking for a relationship between the scores it generates and the scores obtained in a series of experiments using naïve human judgements of the meaning and quality of MT systems’ output.