Uchida, Y., Agrawal, M., & Sakazawa, S. (2011, April). Accurate content-based video copy detection with efficient feature indexing. In Proceedings of the 1st ACM International Conference on Multimedia Retrieval (pp. 1-8).
We describe an accurate content-based copy detection system that uses both local and global visual features to ensure robustness. Our system advances state-of-the-art techniques in four key directions. (1) Multiple-codebook-based product quantization: conventional product quantization methods encode feature vectors using a single codebook, resulting in large quantization error. We propose a novel codebook generation method for an arbitrary number of codebooks. (2) Handling of temporal burstiness: in a stationary scene, once a query feature matches incorrectly, the incorrect match persists across successive frames, resulting in a high false-alarm rate. We present a temporal-burstiness-aware scoring method that reduces the impact of similar features, thereby reducing false alarms. (3) Densely sampled SIFT descriptors: conventional global features suffer from a lack of distinctiveness and invariance to non-photometric transformations. Our densely sampled global SIFT features are more discriminative and more robust against logo or pattern insertions. (4) Bigram- and multiple-assignment-based indexing for global features: we extract two SIFT descriptors from each location, which makes the resulting features more distinctive. To improve recall, we propose multiple assignments on both the query and reference sides. Performance evaluation on the TRECVID 2009 dataset indicates that both the local and global approaches outperform conventional schemes. Furthermore, the integration of these two approaches achieves a three-fold reduction in the error rate compared with the best performance reported in the TRECVID 2009 workshop.
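To make point (1) concrete, the following is a minimal sketch of standard product quantization and of the quantization error that a single codebook leaves behind. It is an illustration under stated assumptions, not the authors' method: the abstract does not specify how the multiple codebooks are generated or how one is selected per vector, so the `encode_best` helper (which simply keeps the codebook with the lowest reconstruction error) is a hypothetical stand-in, and the function names, toy codebooks, and 4-dimensional vector are invented for the example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pq_encode(vec, codebook):
    """Encode vec with one product-quantization codebook.

    codebook is a list of sub-codebooks, one per subvector; each
    sub-codebook is a list of centroids. The vector is split into
    len(codebook) equal parts, and each part is replaced by the index
    of its nearest centroid.
    """
    m = len(codebook)
    d = len(vec) // m  # dimensionality of each subvector
    codes = []
    for i, sub in enumerate(codebook):
        part = vec[i * d:(i + 1) * d]
        nearest = min(range(len(sub)), key=lambda k: sq_dist(part, sub[k]))
        codes.append(nearest)
    return codes


def pq_decode(codes, codebook):
    """Reconstruct an approximate vector by concatenating the chosen centroids."""
    out = []
    for code, sub in zip(codes, codebook):
        out.extend(sub[code])
    return out


def encode_best(vec, codebooks):
    """ASSUMPTION (not from the paper): try every codebook and keep the one
    with the smallest quantization error.

    Returns (codebook_index, codes, squared_error).
    """
    best = None
    for idx, cb in enumerate(codebooks):
        codes = pq_encode(vec, cb)
        err = sq_dist(vec, pq_decode(codes, cb))
        if best is None or err < best[2]:
            best = (idx, codes, err)
    return best


# Toy data: 4-dimensional vectors, 2 subvectors, 2 centroids per sub-codebook.
codebook_a = [[[0.0, 0.0], [1.0, 1.0]], [[0.0, 0.0], [1.0, 1.0]]]
codebook_b = [[[0.5, 0.5], [2.0, 2.0]], [[0.5, 0.0], [0.0, 2.0]]]
vec = [0.5, 0.5, 0.0, 2.0]

single = pq_encode(vec, codebook_a)
single_err = sq_dist(vec, pq_decode(single, codebook_a))
multi = encode_best(vec, [codebook_a, codebook_b])
print(single, single_err, multi)
```

In this toy setup, encoding with `codebook_a` alone leaves a squared error of 2.5, whereas letting the encoder choose between the two codebooks brings the error to 0 for this vector. The price is that the index of the chosen codebook must be stored alongside the codes.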