We introduce Salience-Based Compression (SBC), a vision-guided pre-filtering technology, coupled with standards-based video coding. SBC works by detecting and tracking salient features and keeping them sharp; non-salient features are lowpass filtered, causing an automatic and beneficial drop in bit rate. Because salience-based pre-filtering is performed as a pre-processing step, it can interface to any COTS video encoder, thus enabling use in existing infrastructures and ensuring the compliance of the video bitstream that is produced. For typical aerial surveillance video, SBC can reduce bit rate by up to a factor of four, yet still provide full motion video (FMV) and preserve salient visual information.
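To make the mechanism concrete, the following is a minimal sketch of saliency-weighted pre-filtering in Python, using smoothed gradient magnitude as a stand-in saliency measure; the actual SBC detector/tracker is not described in the abstract, and the `salience_prefilter` helper and its parameters are illustrative only:

```python
import cv2
import numpy as np

def salience_prefilter(frame_bgr, blur_ksize=21):
    """Keep salient regions sharp; low-pass filter the rest.

    Saliency is approximated here by smoothed gradient magnitude,
    a stand-in for SBC's salient-feature detection and tracking.
    """
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY).astype(np.float32)
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    sal = cv2.GaussianBlur(cv2.magnitude(gx, gy), (31, 31), 0)
    sal = sal / (sal.max() + 1e-6)                    # normalize to [0, 1]
    blurred = cv2.GaussianBlur(frame_bgr, (blur_ksize, blur_ksize), 0)
    alpha = sal[..., None]                            # per-pixel blend weight
    out = alpha * frame_bgr.astype(np.float32) + (1.0 - alpha) * blurred
    # The output goes to any COTS encoder (e.g., H.264); the blurred
    # non-salient regions compress to fewer bits automatically.
    return out.astype(np.uint8)
```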
Computational sensing and low-power processing publications
When the conventional fixed smoothing factor is used to display stabilized video, large undefined black border regions (BBR) appear when the camera pans or zooms quickly. To minimize the size of the BBR while still providing smooth visualization on the display, this paper discusses several novel methods that have been demonstrated on a real-time platform. These methods include an IIR filter, a single Kalman filter, and an interacting multiple model filter. The common idea behind these methods is to adapt the smoothing factor to the motion change over time, ensuring a small BBR with minimal jitter. To remove the residual BBR, the pixels inside it are composited from previous frames. To do this, we first store the previous images and their corresponding frame-to-frame (F2F) motions in a FIFO queue, and then fill each black pixel from the valid pixels of the nearest-neighbor frame based on the F2F motion. If a match is found, the search stops and moves on to the next pixel; if the search is exhausted, the pixel remains black. These algorithms have been implemented and tested on a TI DM6437 processor.
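A simplified sketch of the border-compositing step is shown below, assuming purely translational F2F motion and a hypothetical `BorderFiller` class; the paper's implementation targets the TI DM6437 and uses richer motion models, so this is orientation only:

```python
from collections import deque
import numpy as np

class BorderFiller:
    """Fill undefined black-border pixels from a FIFO of past frames.

    Sketch only: F2F motion is modeled as integer translation (dx, dy),
    composed by addition as frames age in the queue.
    """
    def __init__(self, depth=8):
        self.queue = deque(maxlen=depth)  # (frame, cumulative (dx, dy))

    def push(self, prev_frame, f2f):
        """Call when a new frame arrives; f2f maps new-frame coordinates
        to previous-frame coordinates (translation only in this sketch)."""
        dx, dy = f2f
        # Compose stored cumulative motions with the new F2F motion so
        # every stored frame maps from current-frame coordinates.
        self.queue = deque(
            [(f, (cx + dx, cy + dy)) for f, (cx, cy) in self.queue],
            maxlen=self.queue.maxlen)
        self.queue.appendleft((prev_frame.copy(), (dx, dy)))

    def fill(self, stabilized, border_mask):
        """Composite black-border pixels from the nearest valid frame.
        Plain Python loops for clarity, not speed."""
        h, w = stabilized.shape[:2]
        out = stabilized.copy()
        for y, x in zip(*np.nonzero(border_mask)):
            for frame, (dx, dy) in self.queue:       # nearest frame first
                sx, sy = int(x + dx), int(y + dy)
                if 0 <= sx < w and 0 <= sy < h:
                    out[y, x] = frame[sy, sx]
                    break                            # match found: stop search
            # If the search is exhausted, the pixel stays black.
        return out
```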
Providing high-quality, low-latency video from unmanned vehicles through bandwidth-limited communications channels remains a formidable challenge for modern vision system designers. SRI has developed a number of enabling technologies to address this challenge, including SWaP-optimized Systems-on-a-Chip that provide Multispectral Fusion and Contrast Enhancement as well as H.264 video compression. Further, salience-based image prefiltering prior to compression greatly reduces output video bandwidth by selectively blurring non-important scene regions. Combining these technologies with our customization of the open-source VLC video viewer for low-latency decoding, SRI developed a prototype high-performance, high-quality vision system for UxV applications that supports very demanding system latency requirements and user CONOPS.
We propose a Motion Adaptive Signal Integration (MASI) algorithm that operates the sensor at a high frame rate, with real-time alignment of individual image frames to form an enhanced-quality video output.
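One hedged reading of the integration step follows, assuming grayscale frames, global translational alignment via OpenCV phase correlation, and a simple running average; the abstract does not specify MASI's actual alignment model or integration weights:

```python
import cv2
import numpy as np

def masi_integrate(frames, weight=0.25):
    """Align high-frame-rate grayscale captures to the first frame and
    integrate them into one enhanced output.

    Sketch only: uses global phase correlation for alignment and a fixed
    running-average weight, both stand-ins for MASI's real-time method.
    """
    ref = frames[0].astype(np.float32)
    acc = ref.copy()
    for frame in frames[1:]:
        cur = frame.astype(np.float32)
        (dx, dy), _ = cv2.phaseCorrelate(ref, cur)   # shift of cur vs. ref
        m = np.float32([[1, 0, -dx], [0, 1, -dy]])   # warp cur onto ref
        aligned = cv2.warpAffine(cur, m, ref.shape[::-1])
        acc = (1 - weight) * acc + weight * aligned  # temporal integration
    return acc.astype(frames[0].dtype)
```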
In scenes with significantly varying lighting conditions, under- and over-exposed regions can suffer a loss of information. Similarly, spatial depth within a scene can leave some image regions out of focus. Several methods address these issues: tone mapping for true high dynamic range representation and exposure fusion for combining varied-exposure low dynamic range images address the former, while image fusion and segmentation address the latter. This paper proposes an overhauled exposure-fusion method that solves the exposure and focus problems simultaneously, achieving a well-exposed, all-in-focus result. Smart, scene-based data acquisition techniques for reducing both the required input data and the computational resources are discussed. A platform for a real-time system implementation is also presented.
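For orientation, a baseline exposure-fusion call using OpenCV's standard Mertens merge is sketched below; note that this off-the-shelf routine addresses exposure only, whereas the proposed method also resolves focus:

```python
import cv2
import numpy as np

def fuse_exposures(images_bgr):
    """Fuse a bracketed low-dynamic-range stack into one well-exposed image.

    Standard Mertens exposure fusion, weighting pixels by contrast,
    saturation, and well-exposedness; a baseline, not the paper's method.
    """
    merge = cv2.createMergeMertens()
    fused = merge.process([img.astype(np.float32) / 255.0
                           for img in images_bgr])
    return np.clip(fused * 255.0, 0, 255).astype(np.uint8)
```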
Image fusion is an important visualization technique for integrating coherent spatial and temporal information into a compact form. Laplacian fusion combines regions of images from different sources into a single fused image based on a salience selection rule for each region. In this paper, we propose an algorithmic approach that uses a mask pyramid to better localize the selection process. The mask pyramid operates at different scales of the image to improve fused-image quality beyond a global selection rule. Several examples of this mask pyramid method are provided to demonstrate its performance in a variety of applications. A new embedded system architecture built upon the Acadia® II Vision Processor is proposed.
Image fusion is a process that combines regions of images from different sources into a single fused image based on a salience selection rule for each region. In this paper, we propose an algorithmic approach that uses a mask pyramid to better localize the selection process. The mask pyramid operates at different scales of the image to improve fused-image quality beyond a global selection rule. The proposed approach offers a generic methodology for applications in image enhancement, high dynamic range compression, depth-of-field extension, and image blending. The mask pyramid can also be encoded for intelligent analysis of the source imagery. Several examples of this mask pyramid method are provided to demonstrate its performance in a variety of applications. A new embedded system architecture built upon the Acadia® II Vision Processor is proposed.
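A minimal sketch of per-level masked Laplacian fusion for two grayscale sources follows, using local energy as a stand-in salience rule; the `fuse_with_mask_pyramid` helper and its parameters are illustrative, and the encoding of the mask pyramid for analysis is not shown:

```python
import cv2
import numpy as np

def laplacian_pyramid(img, levels):
    """Decompose an image into band-pass levels plus a low-pass residual."""
    pyr, cur = [], img.astype(np.float32)
    for _ in range(levels):
        down = cv2.pyrDown(cur)
        up = cv2.pyrUp(down, dstsize=cur.shape[1::-1])
        pyr.append(cur - up)        # band-pass detail at this scale
        cur = down
    pyr.append(cur)                 # low-frequency residual
    return pyr

def fuse_with_mask_pyramid(a, b, levels=4):
    """Fuse two 8-bit grayscale images, selecting per pixel and per scale
    the source with higher local energy (a simple stand-in salience rule)."""
    pa, pb = laplacian_pyramid(a, levels), laplacian_pyramid(b, levels)
    fused = []
    for la, lb in zip(pa, pb):
        # Mask pyramid level: 1 where source A is more salient at this scale.
        ea = cv2.GaussianBlur(la * la, (5, 5), 0)
        eb = cv2.GaussianBlur(lb * lb, (5, 5), 0)
        mask = (ea >= eb).astype(np.float32)
        fused.append(mask * la + (1 - mask) * lb)
    out = fused[-1]
    for lap in reversed(fused[:-1]):
        out = cv2.pyrUp(out, dstsize=lap.shape[1::-1]) + lap
    return np.clip(out, 0, 255).astype(np.uint8)
```

Keeping the selection mask per level, rather than applying one global rule, is what lets coarse-scale content come from one source while fine detail in the same region comes from another.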
The Sarnoff Acadia® II is a powerful vision processing SoC (System-on-a-Chip) that was specifically developed to support advanced vision applications where system size, weight, and/or power are severely constrained. This paper, targeted at vision system developers, presents a detailed technical overview of the Acadia® II, highlighting its architecture, processing capabilities, memory, and peripheral interfaces. All major subsystems are covered, including video preprocessing and specialized vision processing cores for multi-spectral image fusion, multi-resolution contrast normalization, noise coring, image warping, and motion estimation. Application processing via the MPCore®, an integrated set of four ARM®11 floating-point processors with associated peripheral interfaces, is presented in detail. The paper emphasizes the programmability of the Acadia® II while describing its ability to provide state-of-the-art real-time image processing in a small, power-optimized package.
Maximizing transmitted video quality at the highest resolution and highest frame rate is desirable, but multiple approaches can be employed to maximize transmission quality for the video at a given bitrate.
We present an on-the-move LIDAR-based object detection system for autonomous and semi-autonomous unmanned vehicle systems. In this paper we make several contributions: (i) we describe an algorithm for real-time detection of objects such as doors and stairs in indoor environments; (ii) we describe efficient data structures and algorithms for processing 3D point clouds acquired by laser scanners in a streaming manner, which minimize memory copying and access. We show qualitative results demonstrating the effectiveness of our approach on runs in an indoor office environment.
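A hedged sketch of the streaming-processing idea is given below, assuming a fixed-capacity ring buffer of points and a voxel hash of point indices so neighborhood queries avoid copying the cloud; the `StreamingVoxelGrid` class and its parameters are illustrative, not the paper's actual data structures:

```python
from collections import defaultdict
import numpy as np

class StreamingVoxelGrid:
    """Bin incoming LIDAR points into voxels as scans stream in.

    Sketch only: stores point indices per voxel so neighborhood queries
    (e.g., for door/stair detection) index into one shared point buffer
    instead of copying subsets of the cloud. Eviction of indices whose
    ring-buffer slots get overwritten is omitted for brevity.
    """
    def __init__(self, capacity=200_000, voxel=0.1):
        self.points = np.zeros((capacity, 3), np.float32)  # ring buffer
        self.voxel = voxel
        self.head = 0
        self.grid = defaultdict(list)  # voxel key -> point indices

    def add_scan(self, scan):
        """Insert an (N, 3) array of points from one laser scan."""
        n = len(scan)
        idx = (self.head + np.arange(n)) % len(self.points)
        self.points[idx] = scan
        self.head = (self.head + n) % len(self.points)
        keys = np.floor(scan / self.voxel).astype(np.int32)
        for i, k in zip(idx, map(tuple, keys)):
            self.grid[k].append(int(i))

    def neighbors(self, point, radius=1):
        """Indices of stored points in voxels around `point`."""
        cx, cy, cz = np.floor(np.asarray(point) / self.voxel).astype(int)
        out = []
        for dx in range(-radius, radius + 1):
            for dy in range(-radius, radius + 1):
                for dz in range(-radius, radius + 1):
                    out.extend(self.grid.get((cx + dx, cy + dy, cz + dz), ()))
        return out
```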