Computer vision publications
-
Striking the Right Balance: Recall Loss for Semantic Segmentation
We propose a hard-class mining loss that reshapes the vanilla cross-entropy loss so that it dynamically weights the loss for each class based on instantaneous recall performance.
-
Graph Mapper: Efficient Visual Navigation by Scene Graph Generation
We propose a method to train an autonomous agent to accumulate a 3D scene graph representation of its environment while simultaneously learning to navigate through that environment.
-
SASRA: Semantically-aware Spatio-temporal Reasoning Agent for Vision-and-Language Navigation in Continuous Environments
This paper presents a novel approach for the Vision-and-Language Navigation (VLN) task in continuous 3D environments.
-
Broadening AI Ethics Narratives: An Indic Arts View
We investigate the unique socio-cultural perspectives embedded in human-made art, which in turn can be valuable in expanding the horizon of AI ethics.
-
Real-Time Hyper-Dimensional Reconfiguration at the Edge using Hardware Accelerators
In this paper we present Hyper-Dimensional Reconfigurable Analytics at the Tactical Edge using low-SWaP embedded hardware that can perform real-time reconfiguration at the edge leveraging non-MAC deep neural nets (DNN)…
-
Model-Free Generative Replay For Lifelong Reinforcement Learning: Application To Starcraft-2
We evaluate our proposed algorithms on three different scenarios comprising tasks from the Starcraft 2 and Minigrid domains.
-
Head-Worn Markerless Augmented Reality Inside a Moving Vehicle
This paper describes a system that provides general head-worn outdoor AR capability for the user inside a moving vehicle.
-
SIGNAV: Semantically-Informed GPS-Denied Navigation and Mapping in Visually-Degraded Environments
We present SIGNAV, a real-time semantic SLAM system to operate in perceptually-challenging situations.
-
Generating and Evaluating Explanations of Attended and Error-Inducing Input Regions for VQA Models
Error maps can indicate when a correctly attended region may be processed incorrectly, leading to an incorrect answer, and hence improve users’ understanding of those cases.
-
Challenges in Procedural Multimodal Machine Comprehension: A Novel Way to Benchmark
We identify three critical biases stemming from the question-answer generation process and memorization capabilities of large deep models.
-
Global Heading Estimation for Wide Area Augmented Reality Using Road Semantics for Geo-referencing
We present a method to estimate global camera heading by associating directional information from road segments in the camera view with annotated satellite imagery.
-
Long-Range Augmented Reality with Dynamic Occlusion Rendering
This paper addresses the problem of fast and accurate dynamic occlusion reasoning by real objects in the scene for large scale outdoor AR applications.
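
The recall loss entry above describes weighting the cross-entropy term for each class by its instantaneous recall. A minimal sketch of that idea follows, assuming a weight of (1 - recall) per class with recall measured on the current batch. The function name, the NumPy formulation, and the batch-level recall estimate are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def recall_weighted_cross_entropy(logits, labels, num_classes):
    """Cross entropy with per-class weights (1 - recall), a hypothetical sketch.

    logits: (N, C) float array of unnormalized scores.
    labels: (N,) int array of ground-truth class indices.
    Returns (scalar loss, per-class recall on this batch).
    """
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    # Per-class recall on this batch: true positives / ground-truth count.
    preds = logits.argmax(axis=1)
    support = np.bincount(labels, minlength=num_classes).astype(float)
    true_pos = np.bincount(labels[preds == labels], minlength=num_classes).astype(float)
    # Classes absent from the batch get recall 1.0 (weight 0); they contribute
    # no samples to the loss either way.
    recall = np.where(support > 0, true_pos / np.maximum(support, 1.0), 1.0)

    # Classes with low recall receive a larger weight.
    weights = 1.0 - recall
    per_sample = -log_probs[np.arange(len(labels)), labels]
    loss = float((weights[labels] * per_sample).mean())
    return loss, recall


if __name__ == "__main__":
    logits = np.array([[2.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 0.0]])
    labels = np.array([0, 0, 1, 1])
    loss, recall = recall_weighted_cross_entropy(logits, labels, num_classes=2)
    print("per-class recall:", recall)
    print("loss:", loss)
```

In this toy batch, class 0 is always recalled and so contributes nothing to the loss, while class 1 is recalled half the time and carries all of the weight. That is the hard-class mining behavior the abstract describes.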