Citation
Han-Pang Chiu, Supun Samarasekera, Niluthpol Mithun, Abhinav Rajvanshi, Kevin Kaighn, Glenn Murray, Taragay Oskiper, Mikhail Sizintsev, Rakesh Kumar, Night-Time GPS-Denied Navigation and Situational Understanding Using Vision-Enhanced Low-Light Imager, 2023 Joint Navigation Conference, Institute of Navigation
Abstract
Accurate navigation and situational understanding in GPS-denied environments are key capabilities for military platforms and warfighters. Vision-aided navigation systems with machine learning (ML) techniques using small low-cost electro-optical (EO) cameras provide these capabilities. In addition to estimating platform locations, these systems can analyze the perceived scene and identify targets of interest in real time during navigation by utilizing a pre-trained deep neural network. However, all these capabilities degrade dramatically in night-time operations and in dark environments such as tunnels and mines, which are common settings for military missions. Vision-based navigation methods are unreliable in these perceptually challenging cases. The quality of image-based machine learning techniques is also poor in visually degraded situations.
In this presentation, we describe and demonstrate a novel vision-enhanced low-light imager system to provide GPS-denied navigation and ML-based visual scene understanding capabilities for both day and night operations. Our system uses SRI's DomiNite imager, an advanced low-SWAP (size, weight, and power) low-light sensor that provides both day and night imaging capability without the need for expensive and bulky image intensifiers or infrared imagers. The DomiNite imager, based on the fourth generation of SRI's NV-CMOS® technology, is the first digital night vision imager that sees into the shadows during the day and through the darkness of the night. Our system adapts and extends SRI's state-of-the-art vision-aided navigation methods and machine learning techniques to work with the DomiNite imager. It enables enhanced augmented reality (AR) features for aided target recognition and situational awareness for mobile platforms or warfighters in all conditions (day and night) without the use of any external illumination.
Unlike conventional global-shutter EO cameras, the low-light imager uses a rolling shutter: each image line is captured at a different time. To adapt our vision-based navigation methods to low-light cameras, we perform real-time motion compensation across image rows using high-rate motion measurements from a small low-cost IMU (inertial measurement unit). The 6-DoF (degree of freedom) navigation pose (3D position and 3D orientation) can then be estimated by tracking and fusing visual features across video frames with an IMU-based motion mechanism. Our system uses an error-state Kalman filter to integrate measurements from sensors and produce a 6-DoF platform pose at 30 Hz. High-precision GPS-denied navigation accuracy (<1 meter after >1 km of navigation) is achieved using our system in night-time dark environments.
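As a rough illustration of row-wise rolling-shutter compensation (not the system's actual implementation), the sketch below assigns each image row a capture-time offset and rotates a feature's bearing back to the row-0 capture time using a gyro rate. It assumes a rotation-only motion model, a constant angular rate during readout, a pinhole camera, and a gyro rate expressed in the camera frame; all function names, parameters, and the sign convention are hypothetical.

```python
import math

def row_time_offset(row, num_rows, readout_time_s):
    """Capture-time offset (seconds) of an image row relative to row 0."""
    return readout_time_s * row / num_rows

def rotation_from_rotvec(rv):
    """Rodrigues formula: 3x3 rotation matrix for a rotation vector (radians)."""
    theta = math.sqrt(sum(c * c for c in rv))
    if theta < 1e-12:
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = (c / theta for c in rv)
    c, s = math.cos(theta), math.sin(theta)
    w = 1.0 - c
    return [
        [c + kx * kx * w, kx * ky * w - kz * s, kx * kz * w + ky * s],
        [ky * kx * w + kz * s, c + ky * ky * w, ky * kz * w - kx * s],
        [kz * kx * w - ky * s, kz * ky * w + kx * s, c + kz * kz * w],
    ]

def compensate_pixel(u, v, gyro_rad_s, fx, fy, cx, cy, num_rows, readout_time_s):
    """Map pixel (u, v) to where it would appear if its row had been
    captured at the row-0 time (rotation-only approximation)."""
    dt = row_time_offset(v, num_rows, readout_time_s)
    # Assumed convention: camera orientation advances by Exp(gyro * dt),
    # so the bearing at the reference time is Exp(gyro * dt) * bearing.
    R = rotation_from_rotvec([g * dt for g in gyro_rad_s])
    b = [(u - cx) / fx, (v - cy) / fy, 1.0]
    x, y, z = (sum(R[i][j] * b[j] for j in range(3)) for i in range(3))
    return fx * x / z + cx, fy * y / z + cy
```

With a zero gyro rate, or for a pixel on row 0, the function returns the input pixel unchanged; rows read out later receive progressively larger corrections.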
For vision-based navigation using semantic cues and for situational understanding in darkness, we develop a novel unsupervised transfer learning framework to adapt existing deep neural networks from EO cameras to low-light images. The traditional supervised learning approach requires extensive data collection and manual labeling to train a deep neural network from scratch for low-light sensors, which is expensive and time consuming. Our unsupervised framework avoids the human labeling effort by using existing EO-camera-based deep networks (teacher) to supervise the training of a new deep neural network (student) for the low-light sensor within a teacher-student self-training architecture. The deep neural network trained with our framework outperforms state-of-the-art ML methods by 10.5% in semantic segmentation accuracy. Dynamic objects such as people and vehicles detected by our semantic segmentation network can be cued for situational awareness. The accuracy of our estimated platform pose is also improved by filtering out pixels associated with these detected dynamic targets from the pose estimation process. The segmented static regions such as roads and buildings can also be used for semantic geo-registration, which generates absolute pose measurements by matching the segmented regions from a perceived camera frame to a geo-referenced imagery database.
Our entire system is integrated into a low-SWAP hand-held hardware unit (< 8.1 x 7.0 x 3.2 in., < 1.7 kg / 3.75 lbs, < 15 W) with the DomiNite imager that can be used by warfighters or small military platforms. In this presentation, we describe the details of the hardware and our methods for each system module. We show experimental results achieved by our system operating at night under starlight conditions with no external illumination. The results include 6-DoF GPS-denied navigation pose estimation and real-time semantic scene analysis, with AR applications.
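The idea of excluding dynamic-object pixels from pose estimation, described above, can be sketched as a simple mask over tracked features. This is a minimal illustration under stated assumptions, not the authors' pipeline: the class names, the per-pixel label map layout, and the function name are all hypothetical.

```python
# Assumed label names for dynamic classes; a real segmentation network
# would define its own label set.
DYNAMIC_CLASSES = {"person", "vehicle"}

def filter_static_features(features, label_map, dynamic_classes=DYNAMIC_CLASSES):
    """Keep only features whose pixel lies on a non-dynamic class.

    features:  list of (u, v) pixel coordinates (column, row).
    label_map: 2D list of class names indexed as label_map[row][col].
    Features that fall outside the label map are discarded.
    """
    kept = []
    num_rows = len(label_map)
    num_cols = len(label_map[0]) if num_rows else 0
    for u, v in features:
        row, col = int(round(v)), int(round(u))
        if not (0 <= row < num_rows and 0 <= col < num_cols):
            continue
        if label_map[row][col] not in dynamic_classes:
            kept.append((u, v))
    return kept

# Usage: a 2x3 label map with one "person" pixel and one "vehicle" pixel.
labels = [
    ["road", "person", "building"],
    ["road", "road", "vehicle"],
]
tracked = [(0.0, 0.0), (1.0, 0.0), (2.0, 1.0), (2.0, 0.0)]
static_only = filter_static_features(tracked, labels)
```

Only the surviving features would then be passed to the pose estimator, so that independently moving objects do not bias the motion estimate.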