We propose a method for training an autonomous agent to accumulate a 3D scene graph representation of its environment while simultaneously learning to navigate through that environment.
Computer vision publications
SASRA: Semantically-aware Spatio-temporal Reasoning Agent for Vision-and-Language Navigation in Continuous Environments
This paper presents a novel approach for the Vision-and-Language Navigation (VLN) task in continuous 3D environments.
Broadening AI Ethics Narratives: An Indic Arts View
We investigate uncovering the unique socio-cultural perspectives embedded in human-made art, which, in turn, can be valuable in expanding the horizon of AI ethics.
Real-Time Hyper-Dimensional Reconfiguration at the Edge using Hardware Accelerators
In this paper, we present Hyper-Dimensional Reconfigurable Analytics at the Tactical Edge, using low-SWaP embedded hardware that can perform real-time reconfiguration at the edge by leveraging non-MAC deep neural nets (DNNs) combined with hyperdimensional (HD) computing accelerators.
Model-Free Generative Replay For Lifelong Reinforcement Learning: Application To Starcraft-2
We evaluate our proposed algorithms on three different scenarios comprising tasks from the Starcraft 2 and Minigrid domains.
Head-Worn Markerless Augmented Reality Inside a Moving Vehicle
This paper describes a system that provides general head-worn outdoor AR capability for the user inside a moving vehicle.
SIGNAV: Semantically-Informed GPS-Denied Navigation and Mapping in Visually-Degraded Environments
We present SIGNAV, a real-time semantic SLAM system designed to operate in perceptually challenging situations.
Generating and Evaluating Explanations of Attended and Error-Inducing Input Regions for VQA Models
Error maps can indicate when a correctly attended region may be processed incorrectly, leading to an incorrect answer, and hence improve users’ understanding of those cases.
Challenges in Procedural Multimodal Machine Comprehension: A Novel Way to Benchmark
We identify three critical biases stemming from the question-answer generation process and memorization capabilities of large deep models.