January 1, 2008

Error-Driven Generalist+Experts (EDGE): a Multi-Stage Ensemble Framework for Text Categorization

Citation

Huang, J., Madani, O., Giles, C. L., CIKM ’08: Proceedings of the 17th ACM Conference on Information and Knowledge Management, October 2008, Pages 83–92. https://doi.org/10.1145/1458082.1458097

Abstract

We introduce a multi-stage ensemble framework, Error-Driven Generalist+Experts, or Edge, for improved classification on large-scale text categorization problems. Edge first trains a generalist, capable of classifying under all classes, to deliver a reasonably accurate initial category ranking for a given instance. Edge then computes a confusion graph for the generalist and allocates learning resources to train experts on relatively small groups of classes that the generalist tends to confuse systematically with one another. The experts’ votes, when invoked on a given instance, yield a reranking of the classes, thereby correcting the errors of the generalist. Our evaluations show improved classification and ranking performance on several large-scale text categorization datasets. Edge is particularly efficient when the underlying learners are efficient. Our study of confusion graphs is also of independent interest.
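The three stages described in the abstract (build a confusion graph from the generalist's errors, group the classes it confuses, rerank with expert votes) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the `min_count` threshold, the use of connected components to form groups, and the additive combination of scores and votes are all assumptions made for illustration, since the abstract does not specify these details.

```python
from collections import Counter, defaultdict


def confusion_graph(true_labels, predicted_labels):
    """Count directed confusions (true class -> predicted class) over errors only."""
    edges = Counter()
    for truth, guess in zip(true_labels, predicted_labels):
        if truth != guess:
            edges[(truth, guess)] += 1
    return edges


def confused_groups(edges, min_count=2):
    """Group classes linked by frequent confusions.

    Assumption: groups are connected components of the graph after dropping
    edges seen fewer than `min_count` times, ignoring edge direction.
    """
    neighbors = defaultdict(set)
    for (a, b), count in edges.items():
        if count >= min_count:
            neighbors[a].add(b)
            neighbors[b].add(a)

    groups, seen = [], set()
    for start in sorted(neighbors):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        groups.append(sorted(component))
    return groups


def rerank(generalist_scores, expert_votes):
    """Rerank classes by generalist score plus expert vote (assumed additive)."""
    combined = dict(generalist_scores)
    for cls, vote in expert_votes.items():
        combined[cls] = combined.get(cls, 0.0) + vote
    # Highest combined score first; ties broken alphabetically.
    return sorted(combined, key=lambda c: (-combined[c], c))


# Toy data: the hypothetical generalist mixes up "cat" and "dog" repeatedly.
truth = ["cat", "cat", "dog", "dog", "car", "bus", "bus"]
guess = ["dog", "dog", "cat", "cat", "car", "car", "bus"]

edges = confusion_graph(truth, guess)
groups = confused_groups(edges, min_count=2)
ranking = rerank({"dog": 0.5, "cat": 0.4, "car": 0.1}, {"cat": 0.3})
```

In this toy run, the only group that clears the threshold is the cat/dog pair, so an expert would be trained on just those two classes, and its vote for "cat" moves that class ahead of "dog" in the final ranking.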
