December 1, 2010

Unbiased discourse segmentation evaluation

Citation

Niekrasz, John, and Moore, Johanna D. Unbiased discourse segmentation evaluation. In 2010 IEEE Spoken Language Technology Workshop, pp. 43-48, Dec. 2010.

Abstract

In this paper, we show that the performance measures Pk and WindowDiff, commonly used for discourse, topic, and story segmentation evaluation, are biased in favor of segmentations with fewer or adjacent segment boundaries. By analytical and empirical means, we show how this results in a failure to penalize substantially defective segmentations. Our novel unbiased measure k-κ corrects this, providing a single score that accounts for chance agreement. We also propose additional statistics that may be used to characterize important properties of segmentations such as boundary clumping. We go on to replicate a recent spoken-language topic segmentation experiment, drawing conclusions that are substantially different from previous studies concerning the effectiveness of state-of-the-art topic segmentation algorithms.

Keywords: Histograms, Equations, Proposals, Indexes, Length measurement, Image edge detection, Mathematical model
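
As a rough illustration of the bias described in the abstract, the sketch below implements Pk and WindowDiff as they are commonly defined and scores two hypothetical segmentations of a toy 20-unit document. The boundary-indicator representation, the names `boundaries`, `pk`, and `window_diff`, the window size k = 2, and the toy data are all assumptions made for this example. None of them are taken from the paper, and the paper's own measure k-κ is not implemented here.

```python
def boundaries(n_units, positions):
    """Return a 0/1 list of length n_units - 1.

    Entry i is 1 if a segment boundary falls between unit i and unit i + 1.
    (Hypothetical helper for this example.)
    """
    return [1 if i in positions else 0 for i in range(n_units - 1)]


def pk(ref, hyp, k):
    """Pk: fraction of windows in which ref and hyp disagree on whether
    the two units k apart lie in the same segment."""
    assert len(ref) == len(hyp)
    n_windows = len(ref) - k + 1
    errors = 0
    for i in range(n_windows):
        ref_same = sum(ref[i:i + k]) == 0  # no reference boundary in window
        hyp_same = sum(hyp[i:i + k]) == 0  # no hypothesis boundary in window
        if ref_same != hyp_same:
            errors += 1
    return errors / n_windows


def window_diff(ref, hyp, k):
    """WindowDiff: fraction of windows in which ref and hyp contain a
    different number of boundaries."""
    assert len(ref) == len(hyp)
    n_windows = len(ref) - k + 1
    errors = 0
    for i in range(n_windows):
        if sum(ref[i:i + k]) != sum(hyp[i:i + k]):
            errors += 1
    return errors / n_windows


# Toy data (assumed): 20 units in four reference segments of 5 units each.
K = 2  # roughly half the mean reference segment length
ref = boundaries(20, {4, 9, 14})
empty = boundaries(20, set())          # hypothesis with no boundaries at all
shifted = boundaries(20, {6, 11, 16})  # right count, each boundary 2 units late

for name, hyp in [("empty", empty), ("shifted", shifted)]:
    print(name, round(pk(ref, hyp, K), 3), round(window_diff(ref, hyp, K), 3))
```

On this toy input, both measures assign a lower (better) error to the hypothesis with no boundaries than to the one with the correct number of boundaries that are each misplaced by two units. This is only a small example of the kind of behavior the abstract refers to, not a reproduction of the paper's analysis.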

