KDD ’08: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 2008, Pages 515–523. https://doi.org/10.1145/1401890.1401954
In many multi-class learning scenarios, the number of classes is relatively large (thousands, …), or the space and time efficiency of the learning system is crucial. We investigate two online update techniques especially suited to such problems. Both updates preserve sparsity: they allow constraining the number of prediction connections that each feature can make. We show that one method, the exponential moving average, solves a “discrete” regression problem for each feature, changing the weights in the direction that minimizes the quadratic loss. We design the other method to improve a hinge loss subject to constraints, for better accuracy. We empirically explore the methods and compare their performance to previous indexing techniques developed with the same goals, as well as to other online algorithms based on prototype learning. We observe that the classification accuracies are very promising, improving over previous indexing techniques, while the scalability benefits are preserved.
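To make the exponential moving average idea concrete, here is a minimal sketch of what such a per-feature update might look like. It is not the paper's exact algorithm: the names `ema_update` and `predict`, the learning rate `beta`, and the top-k pruning rule `max_connections` are all assumptions introduced for illustration. The sketch relies only on the fact that moving a weight w toward a 0/1 target y by w ← (1 − β)w + βy is a gradient step on the quadratic loss ½(w − y)², and it caps the number of classes each feature connects to in order to preserve sparsity.

```python
from collections import defaultdict


def ema_update(weights, active_features, true_class, beta=0.2, max_connections=3):
    """Illustrative EMA update (assumed form, not the paper's exact rule).

    weights: dict mapping feature -> {class: weight}.
    For each active feature, every existing connection is decayed by (1 - beta)
    and the connection to the true class is boosted by beta. This equals
    w <- w - beta * (w - y) with y = 1 for the true class and y = 0 otherwise,
    i.e. a gradient step on the quadratic loss 0.5 * (w - y) ** 2.
    """
    for f in active_features:
        row = weights.setdefault(f, {})
        # Decay all existing connections of this feature (target y = 0).
        for c in list(row):
            row[c] *= 1.0 - beta
        # Move the true-class connection toward its target y = 1.
        row[true_class] = row.get(true_class, 0.0) + beta
        # Hypothetical sparsity constraint: keep only the strongest connections.
        if len(row) > max_connections:
            keep = sorted(row, key=row.get, reverse=True)[:max_connections]
            weights[f] = {c: row[c] for c in keep}


def predict(weights, active_features):
    """Score classes by summing the weights of active features; return the top class."""
    scores = defaultdict(float)
    for f in active_features:
        for c, w in weights.get(f, {}).items():
            scores[c] += w
    return max(scores, key=scores.get) if scores else None
```

Because each feature stores at most `max_connections` classes, both memory and prediction time depend on the number of active features rather than on the total number of classes, which is the kind of scalability benefit the abstract refers to.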