2D-3D reasoning and augmented reality publications, March 27, 2023

Vision based Navigation using Cross-View Geo-registration for Outdoor Augmented Reality and Navigation Applications

Rakesh Kumar, Supun Samarasekera, Han-Pang Chiu

Citation


Rakesh Kumar, Supun Samarasekera, Niluthpol Mithun, Kshitij Minhas, Taragay Oskiper, Kevin Kaighn, Mikhail Sizintsev, Han-Pang Chiu, Vision based Navigation using Cross-View Geo-Registration for Outdoor Augmented Reality and Navigation Applications, 2023 Joint Navigation Conference, Institute of Navigation

Abstract

Estimating the precise geo-location of ground imagery and video streams is crucial to many applications, including wide-area augmented reality, dismount tracking, navigation for autonomous vehicles, and robotics. Visual-Inertial Odometry and SLAM (simultaneous localization and mapping) systems are often used for navigation in these applications. Visual odometry systems have become increasingly accurate and can achieve 0.1% drift with respect to distance travelled. To reset the drift, the input images are matched to a geo-referenced landmark database. Most prior work treats the problem as matching input image queries against a pre-built database of geo-referenced ground images or video streams collected from similar ground viewpoints and the same sensor modality. However, collecting ground images over a large area is time-consuming and may not be feasible in many cases. To overcome this limitation, there is significant recent interest in geo-localizing ground imagery against an overhead reference image database. Due to its wide availability, ease of acquisition, and dense coverage, 2D satellite data has become a very attractive reference data source.
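The cross-view retrieval idea described above can be sketched as a nearest-neighbor search over learned embeddings. This is a minimal illustration, not the paper's network: the function names and the assumption that ground-image and satellite-tile embeddings already live in a shared space (produced by some cross-view model) are hypothetical.

```python
import numpy as np

def coarse_geosearch(query_emb, tile_embs, tile_latlons):
    """Return the geo-location of the best-matching satellite tile.

    query_emb   : (D,) embedding of the ground image query
    tile_embs   : (N, D) embeddings of geo-referenced satellite tiles
    tile_latlons: (N, 2) latitude/longitude of each tile center
    """
    # Cosine similarity = dot product of L2-normalized embeddings
    q = query_emb / np.linalg.norm(query_emb)
    t = tile_embs / np.linalg.norm(tile_embs, axis=1, keepdims=True)
    scores = t @ q
    best = int(np.argmax(scores))
    return tile_latlons[best], float(scores[best])
```

In a real system the embeddings would come from the trained cross-view matching network, and the linear scan would be replaced by an approximate nearest-neighbor index for large tile databases.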

In this work, we present a new vision-based cross-view geo-localization solution that matches camera images to a 2D satellite/overhead reference image database. We present solutions for both coarse search (for cold start) and fine alignment (for continuous refinement). The geo-localization solution is based on a neural-network framework that performs both location and orientation estimation through cross-view matching. We have developed solutions using both convolutional neural networks and transformers, and we compare the results for each case. We also present an approach that extends single-image cross-view geo-localization by exploiting temporal information across video frames for continuous and consistent geo-localization, which fits the demanding requirements of navigation applications. Our cross-view geo-localization approach can augment existing navigation methods as an additional sensor measurement.

The cross-view matching neural network model is optimized to run on low-cost embedded smartphone processors. We also present methods to optimize the neural network and compare the performance achieved on an MSI VR backpack with a powerful GTX 1070 GPU versus a Qualcomm RB5 smartphone processor with an embedded GPU.

Specifically, we develop and present a navigation system that continuously estimates 6-DoF (Degrees of Freedom) camera geo-poses for outdoor navigation applications. The system consists of a helmet-mounted sensor platform (including cameras, an IMU, a magnetometer, and GPS when available), an embedded computer (e.g., Qualcomm RB5) mounted on the backpack, and a video see-through head-mounted display (HMD). The backbone of our navigation system is a tightly coupled error-state Extended Kalman Filter (EKF) for visual-inertial sensor fusion. The tightly coupled visual-inertial odometry module produces 6-DoF platform pose updates at 15-30 Hz. However, these pose updates drift over time.

To correct the drift, our novel cross-view visual geo-localization solution estimates the 3-DoF (latitude, longitude, and heading) camera pose by matching camera images to satellite images. In addition to the relative measurements from frame-to-frame feature tracks used for odometry, our error-state EKF framework can fuse the estimates from the cross-view geo-registration model with global measurements from GPS, correcting heading and location to counter visual odometry drift accumulation over time. The visual geo-localization solution is used both to provide the initial global heading and location (cold-start procedure) and to continuously refine the global heading over time. We show experimental videos demonstrating navigation and augmented reality performance, along with accuracy results with and without GPS.
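The drift-correction step described above amounts to a Kalman measurement update: a 3-DoF (latitude, longitude, heading) estimate from geo-registration is fused into the filter state. This is a generic textbook EKF update as a sketch, not the paper's tightly coupled error-state filter; the function name and the choice of a linear measurement model are illustrative assumptions.

```python
import numpy as np

def ekf_update(x, P, z, R, H):
    """Standard (E)KF measurement update.

    x: state estimate (n,), P: state covariance (n, n),
    z: measurement (m,), R: measurement noise covariance (m, m),
    H: measurement Jacobian (m, n).
    """
    y = z - H @ x                       # innovation (measurement residual)
    S = H @ P @ H.T + R                 # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
    x_new = x + K @ y                   # corrected state
    P_new = (np.eye(len(x)) - K @ H) @ P  # corrected covariance
    return x_new, P_new
```

With a low-noise geo-registration fix (small R relative to P), the gain pulls the drifted state strongly toward the measured latitude, longitude, and heading; a noisy fix is weighted down accordingly.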

