Featured innovator: Rakesh “Teddy” Kumar

This scientific trailblazer is on a quest to shape the digital age at SRI

Rakesh “Teddy” Kumar’s professional life has been a scientific adventure, an exciting quest to blaze new trails and shape the digital age.

One example: Kumar was part of a team at SRI International that invented extended blended panoramas, a technique that uses the camera as a paintbrush to build panoramas (Video Brush). This technology automatically combines several images into a spherical panorama through which viewers can visually travel and see things from different angles, all stitched together without visible seams or misalignment. In 2011, SRI licensed the technology to Google, which incorporated it into the cameras of its Android smartphones.

The project was just one of the interesting innovations Kumar has helped pioneer during his trailblazing career. “I’ve been fortunate to work on exciting projects with great people,” he says.

Innovations with real-world applications

Kumar came to the institute nearly 28 years ago, joining the SRI subsidiary Sarnoff Corporation after earning a Ph.D. in Computer Science from the University of Massachusetts at Amherst and working briefly as a programmer at IBM.

The doctorate was the culmination of academic training that included an M.S. in Electrical and Computer Engineering from the State University of New York at Buffalo and a B.Tech in Electrical Engineering from the Indian Institute of Technology, Kanpur, India.

Over time, Kumar steadily rose through the ranks at SRI. Today, he serves as vice president of the Information and Computing Sciences Division and director of the Center for Vision Technologies Laboratory. In this leadership role, he spearheads research and development in computer vision, robotics, image processing, computer graphics and visualization algorithms and systems for government and commercial clients.

“It’s thrilling to work in this field because of all the different areas where your work can apply,” says Kumar. “From an intellectual perspective alone, it’s very interesting, but that’s made even better by the fact that there are many real-world applications.”

As with extended blended panoramas, some of those applications have become prominent features of modern society. Take virtual advertising insertion, for instance.

Television viewers of sporting and news events are familiar with digitally inserted on-screen advertising or graphics — for example, ads appear behind home plate during baseball games, or a colored line denotes where a first down is in a televised football game. This capability is built on vision-based, match-moving technology that inserts images and video into broadcast signals with the correct scale, orientation and motion relative to the photographed objects in the scene. Kumar and his colleagues helped create this system in the mid-1990s.
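To make the idea of "correct scale, orientation and motion" concrete, here is a minimal sketch of how a flat graphic can be mapped into a camera frame with a planar homography. It is an illustration only, not the system Kumar's team built: the function names and the example matrix are hypothetical, and a real match-moving pipeline would estimate a new homography for every frame by tracking the scene.

```python
def project_point(H, x, y):
    """Map a 2D point through a 3x3 homography H (nested lists, row-major)."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    # Dividing by w is what produces perspective foreshortening.
    return (u / w, v / w)


def place_graphic(H, width, height):
    """Return where the four corners of a width x height graphic land in the frame."""
    corners = [(0, 0), (width, 0), (width, height), (0, height)]
    return [project_point(H, x, y) for x, y in corners]


# Hypothetical homography (an assumption for illustration):
# shrink the graphic to half size and shift it to pixel (100, 50).
H_EXAMPLE = [
    [0.5, 0.0, 100.0],
    [0.0, 0.5, 50.0],
    [0.0, 0.0, 1.0],
]

frame_corners = place_graphic(H_EXAMPLE, 200, 100)
```

With the example matrix, a 200 x 100 graphic lands on the rectangle spanning (100, 50) to (200, 100) in the frame. Re-estimating the matrix as the camera pans or zooms is what would keep the inserted graphic locked to the scene.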

“When it came out, it was the first actual, real application of augmented reality,” shares Kumar.

Kumar also led the development of Video Flashlights, a system for rendering multiple live videos in real time over a 3D model. The system illuminates a static 3D model with live video textures from static and moving cameras in the same way that a flashlight lights an environment. It has real-world applications as an immersive visualization solution for security and monitoring systems that deploy hundreds of cameras to monitor a large-scale campus or an urban site. The multiple camera feeds are fused together to form a coherent display, which provides a live, dynamic world with a bird's-eye view that can zoom in to see activities up close.

The technology was developed around the time of the 9/11 attacks and deployed in various airports to help enhance security. SRI later sold the technology to a company that used it to win a substantial contract aimed at helping to modernize training methods used by the Department of Defense.

Military training applications are also central to another innovative solution in which Kumar played a key role: Augmented Reality Binoculars. This system allows long-range, high-precision augmentation of live telescopic imagery with aerial and terrain-based synthetic objects, vehicles, people and effects. The system’s design uses two cameras, one with a wide-field-of-view lens and one with a narrow-field-of-view lens, enclosed in a binocular-shaped shell. The wide field of view provides context and makes recovery of the 3D location and orientation of the binoculars much more robust, whereas the narrow field of view is used for the actual augmentation and to increase tracking precision.

The bottom line is that the system provides jitter-free, robust and real-time pose estimation for precise augmentation. “We have demonstrated successful use of our system as part of a live simulated training system for observer training, in which fixed- and rotary-wing aircraft, ground vehicles and weapon effects are combined with real-world scenes,” Kumar and a co-author note in a paper on the system.

The system allows soldiers, for instance, to see augmented reality threats — for example, an enemy tank — in the natural world landscape and train in combatting those threats. “This saves the military money on training costs because otherwise, they would have to put real targets out there and fly real planes, and that gets expensive,” says Kumar.

Award-winning work

During his distinguished career, Kumar has co-authored more than 60 research publications, received more than 50 patents, and been the principal founder of multiple spin-off companies from SRI, including VideoBrush, LifeClips and SSG.

In roles that build on his SRI experience, Kumar has served on a panel for the Defense Advanced Research Projects Agency (DARPA) and has been an associate editor for the Institute of Electrical and Electronics Engineers (IEEE) journal Transactions on Pattern Analysis and Machine Intelligence.

In 2013, the University of Massachusetts Amherst School of Computer Science honored Kumar with its Outstanding Achievement in Technology Development Award. He has also won a Sarnoff President’s Award and Sarnoff Technical Achievement Awards for his work in registration of multi-sensor, multi-dimensional medical images and alignment of video to three-dimensional scene models.

In 2011, a paper that Kumar co-authored, “Stable Vision-Aided Navigation for Large-Area Augmented Reality,” earned the best paper award at the IEEE Virtual Reality Conference. The paper presented a unified approach for a drift-free and jitter-reduced vision-aided navigation system. Another paper he co-authored, “Augmented Reality Binoculars,” received the best paper award at the 2013 IEEE International Symposium on Mixed and Augmented Reality (ISMAR).

Kumar continues to stay busy, focused on new scientific innovations. That includes work at the forefront of artificial intelligence (AI) autonomy — a research area focused on enhancing mapping and use of semantic information to enable robots to learn faster and operate better within environments.

Kumar co-authored a paper published in October 2020 that studied the important yet largely unexplored problem of large-scale cross-modal visual localization by matching ground RGB images to a geo-referenced aerial LIDAR 3D point cloud (rendered as depth images). The paper introduces a new dataset containing over 550,000 pairs of RGB and aerial LIDAR depth images and proposes a novel joint embedding-based method that effectively combines the appearance and semantic cues from both RGB and LIDAR to handle drastic cross-modal variations. The work provides a foundation for further research in cross-modal visual localization.
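The general idea behind a joint embedding can be sketched in a few lines: once images from both modalities are mapped into a shared vector space, localization becomes a nearest-neighbor lookup. The sketch below is an assumption-laden illustration, not the paper's method. The vectors and tile names are made up, and cosine similarity stands in for whatever matching score the authors actually use.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_match(query_embedding, candidate_embeddings):
    """Return the name of the candidate whose embedding is closest to the query."""
    scores = {
        name: cosine_similarity(query_embedding, emb)
        for name, emb in candidate_embeddings.items()
    }
    return max(scores, key=scores.get)


# Hypothetical embeddings: in practice a trained network would produce these
# from a ground RGB image and from geo-referenced LIDAR depth renderings.
rgb_query = [1.0, 0.0, 0.2]
lidar_tiles = {
    "tile_a": [0.0, 1.0, 0.0],
    "tile_b": [0.9, 0.1, 0.3],
}

located_tile = best_match(rgb_query, lidar_tiles)
```

Here the query lands on "tile_b", the depth tile whose embedding points in nearly the same direction. The hard part, which this sketch skips entirely, is training the networks so that matching RGB and LIDAR views end up close together despite looking very different.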

“SRI creates world-changing solutions to make people safer, healthier and more productive,” says Kumar. “It’s pretty cool to be part of that.”
