
Recognition of 3-D Scene Text
Video is an increasingly important source of information to the intelligence analyst, and the volume of collected multimedia data is expanding at a tremendous rate. The capability to recognize text that appears in real-world scenery can be useful for characterizing the contents of video imagery for intelligence use. Previous research in text recognition, for both printed documents and other sources of imagery, has generally assumed that the text lies in a plane roughly perpendicular to the optical axis of the camera. However, text on objects such as street signs, nameplates, and billboards in captured video imagery often lies in a plane oriented at an oblique angle to the camera, and the text can be quite small relative to the field of view. In this project, SRI International (SRI) developed several techniques for improving the recognition of scene text:
- Two techniques (one using cues derived from a single video frame; the other using information from multiple video frames) to rectify the image of obliquely viewed scene text so that it can be successfully recognized by a conventional optical character recognition (OCR) engine.
- A new super-resolution technique for small text. This technique incorporates a constraint based on the fact that a region of text is generally bilevel, that is, made up of two intensity levels (text and background). SRI implemented it in C++ and demonstrated its success on real video imagery.
- A new technique for combining (agglomerating) OCR results from multiple video frames.
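To make the rectification idea in the first item concrete, here is a minimal sketch of the final warping step only. It assumes the four corners of the text region in the image are already known, which is the part the techniques above estimate from single-frame cues or from multiple frames. The sketch uses the standard planar homography model; the function names and the sample corner coordinates are hypothetical and are not taken from SRI's software.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography_from_corners(src, dst):
    """Return the 3x3 homography mapping each src (x, y) to dst (u, v).

    The bottom-right entry is fixed to 1, leaving eight unknowns that
    four point correspondences determine exactly.
    """
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(hm, point):
    """Map an image point into the rectified, fronto-parallel frame."""
    x, y = point
    w = hm[2][0] * x + hm[2][1] * y + hm[2][2]
    return ((hm[0][0] * x + hm[0][1] * y + hm[0][2]) / w,
            (hm[1][0] * x + hm[1][1] * y + hm[1][2]) / w)


# Hypothetical corners of an obliquely viewed sign (image pixels),
# ordered top-left, top-right, bottom-right, bottom-left.
sign_corners = [(120, 80), (300, 60), (310, 150), (115, 140)]
# Target: an upright 200 x 50 rectangle suitable for a conventional OCR engine.
upright = [(0, 0), (200, 0), (200, 50), (0, 50)]
H = homography_from_corners(sign_corners, upright)
```

In practice each pixel of the upright rectangle would be filled by sampling the source image through the inverse mapping, and the rectified patch would then be passed to the OCR engine.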
The performance of these new techniques was evaluated on a variety of video data sets collected by both SRI and other organizations involved in the Video Analysis and Content Extraction (VACE) program.
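The idea of combining OCR results from multiple video frames can be illustrated with a simple voting scheme. This is only a sketch under stated assumptions: it uses confidence-weighted voting per character position among readings of the most common length, which is a stand-in chosen for brevity and not necessarily the agglomeration method SRI developed. The function name, the (text, confidence) input format, and the sample readings are all hypothetical.

```python
from collections import Counter, defaultdict


def combine_ocr_readings(readings):
    """Combine per-frame OCR readings of one tracked text region.

    readings: list of (text, confidence) pairs, one per video frame.
    Returns the voted string, or "" if there are no readings.
    """
    if not readings:
        return ""
    # Keep only readings of the most common length so that character
    # positions line up without a separate alignment step (an assumption).
    modal_length = Counter(len(t) for t, _ in readings).most_common(1)[0][0]
    aligned = [(t, c) for t, c in readings if len(t) == modal_length]

    result = []
    for i in range(modal_length):
        # Each frame votes for its character with weight = its confidence.
        votes = defaultdict(float)
        for text, conf in aligned:
            votes[text[i]] += conf
        result.append(max(votes, key=votes.get))
    return "".join(result)


# Hypothetical readings of the same sign from five consecutive frames.
frames = [("ST0P", 0.6), ("STOP", 0.9), ("STOP", 0.7), ("5TOP", 0.5), ("STO", 0.4)]
combined = combine_ocr_readings(frames)
```

Individual frames may misread different characters, so a per-position vote can recover a string that no single low-quality frame would yield reliably. A fuller version would align readings of different lengths before voting instead of discarding them.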
In addition, SRI implemented and delivered a stand-alone demonstration program that detects, tracks, and recognizes text in video imagery, including a version that runs in real time on a laptop computer. SRI also developed a set of metrics for the end-to-end evaluation of video text recognition systems and used those metrics to evaluate the performance of this software.
For More Information
Gregory K. Myers, Program Director
(650) 859-4091