Yu, Q., Liu, J., Cheng, H., Divakaran, A., & Sawhney, H. (2013, 21-25 October). Semantic pooling for complex event detection. Paper presented at the ACM International Conference on Multimedia, Barcelona, Spain.
Complex event detection is very challenging in open-source videos such as those on YouTube, which usually comprise highly diverse visual content involving various object, scene, and action concepts. Not all of these concepts, however, are relevant to the event. In other words, a video may contain a lot of “junk” information that is harmful to recognition. Hence, we propose a semantic pooling approach to tackle this issue. Unlike conventional pooling over the entire video or over specific spatial regions of a video, we employ a discriminative approach to acquire abstract semantic “regions” for pooling. For this purpose, we first associate low-level visual words with semantic concepts via their co-occurrence relationship. We then pool the low-level features separately according to their semantic information. The proposed semantic pooling strategy also provides a new mechanism for incorporating semantic concepts into low-level-feature-based event recognition. We evaluate our approach on the TRECVID MED dataset, and the results show that semantic pooling consistently improves performance compared with conventional pooling strategies.
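The two steps named in the abstract (associating visual words with concepts through co-occurrence, then pooling features separately per concept) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract describes a discriminative approach whose details are not given here, so the sketch substitutes a simple row-normalized co-occurrence matrix with soft assignment. The function names `cooccurrence_association` and `semantic_pool`, the input shapes, and the toy data are all hypothetical.

```python
import numpy as np


def cooccurrence_association(word_hists, concept_scores):
    """Estimate word-to-concept association weights from co-occurrence.

    word_hists:     (n_segments, n_words) visual-word counts per video segment.
    concept_scores: (n_segments, n_concepts) non-negative concept responses.
    Returns an (n_words, n_concepts) matrix whose rows sum to 1.
    Assumption: plain co-occurrence counts stand in for the paper's
    discriminative association step.
    """
    cooc = word_hists.T @ concept_scores  # (n_words, n_concepts)
    row_sums = cooc.sum(axis=1, keepdims=True)
    n_concepts = concept_scores.shape[1]
    # Words that never co-occur with any concept fall back to uniform weights.
    return np.where(row_sums > 0,
                    cooc / np.maximum(row_sums, 1e-12),
                    1.0 / n_concepts)


def semantic_pool(word_ids, assoc):
    """Pool a video's features into one histogram per concept.

    word_ids: (n_features,) visual-word index of each low-level feature.
    assoc:    (n_words, n_concepts) association weights.
    Returns the concatenated per-concept histograms, length n_concepts * n_words.
    """
    n_words, n_concepts = assoc.shape
    pooled = np.zeros((n_words, n_concepts))
    # Each feature splits its unit vote across concepts by its word's weights.
    np.add.at(pooled, word_ids, assoc[word_ids])
    return pooled.T.ravel()


# Toy data: 2 segments, 3 visual words, 2 concepts (all values made up).
word_hists = np.array([[2.0, 0.0, 1.0],
                       [0.0, 3.0, 1.0]])
concept_scores = np.array([[1.0, 0.0],
                           [0.0, 1.0]])
assoc = cooccurrence_association(word_hists, concept_scores)
video_descriptor = semantic_pool(np.array([0, 0, 1, 2]), assoc)
```

Conventional pooling would yield a single histogram over the three words; here the same features are split into one histogram per concept, so a classifier trained on the concatenated descriptor can down-weight concepts that are irrelevant to the event.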