SRI builds an AI-enabled system that guides users through complex physical tasks

Headsets help human-machine collaboration efforts in the physical world.

Wearable cameras and microphones enable the system to provide feedback through voice, text, and graphics.

Few people even know what a pinwheel tortilla is, much less how to make one, but researchers at SRI have developed an AI-enabled system that will walk anyone, even a neophyte, through the process, ingredient by ingredient, step by step. SRI’s AMIGOS (Autonomous Multimodal Ingestion for Goal Oriented Support) is something to behold. While making a tortilla might seem a rather benign task, it is a first glimpse of a future that is anything but.

“We are all very sick of making pinwheel tortillas,” said Bob Price, research engineer in robotics, perception, assistance applications, and machine learning at SRI on the AMIGOS project. “The goal is to move to military uses for vehicle maintenance, field medicine, and even helicopter co-piloting.”

AMIGOS is supported by the Defense Advanced Research Projects Agency (DARPA) and is the latest entry in a field DARPA calls Perceptual Task Guidance (PTG). Such devices have wearable cameras and microphones that allow the AI to see and hear what the wearer sees and hears. They provide feedback through voice, text, and graphics projected on a lens before the wearer’s eyes, leaving the hands free to perform tasks.

Teach your children well

“You can think of AMIGOS as an assistant that looks over your shoulder while you are doing a task and helps you do things right and, when necessary, to spot mistakes and correct them,” says Charles Ortiz, associate director of collaborative and conversational systems at SRI and principal investigator on AMIGOS. “A physical task assistant must be able to watch, understand, and track what a user is doing and also provide useful instruction on precisely how to do it and what to do next.”

AMIGOS learns standard operating procedures, parts, and manual maneuvers for a range of tasks by poring over technical manuals, checklists, illustrations, training videos, and other sources of information. “You can imagine,” Price says, “this is a very hard thing to do with a simple recipe, but the difficulty grows much more so with the complexity and consequence of the task at hand.”

Some tasks in AMIGOS’s future, such as field medicine, could involve great risk. Others seem simple and perhaps even harmless but can have irreversible outcomes: a single missing screw could ground a multi-million-dollar fighter jet. Then there is the user. Not every person does things the same way. Some users are right-handed, others are left-handed. Not all have the same level of dexterity, experience, and expertise. They might have to learn to use a new tool or work with an unfamiliar part. They might even be called upon to multitask, monitoring certain things while performing others, as in cooking.

AMIGOS must anticipate and adapt to it all.

On the shoulders of giants

The cooking assistant was the nominal proof-of-concept challenge in DARPA’s PTG competition. AMIGOS was SRI’s answer. It can see and recognize various ingredients arrayed before it. It knows what utensils, measuring cups, and spoons are at hand. It distinguishes bowls and plates from a frying pan. It can even see the cook’s hands and “read” various gestures and motions, from grasping and pouring to stirring and spreading.

All the while, the chef under the headpiece is privy to additional written and graphical information displayed on the inside of the lens, a technique known as augmented reality. Most notable of all, however, the user can ask questions along the way: “I’ve stirred the batter. Now what?” AMIGOS responds appropriately: “Spread a thin layer with a knife.”

Although training AMIGOS in the nuances of a tortilla is challenging, Price and Ortiz are shifting its attention to the maintenance and repair of physical devices such as gas engines, tasks that depend on highly specialized knowledge rather than everyday cooking skills.

“AMIGOS is a promising prototype, but it will only get better and more sophisticated over time,” says Ortiz. “The wearers will become more versatile, more proficient, and grow their skillsets even when working in new, high-stress, or changing environments. AMIGOS is a very unique AI technology.”

This material is based upon work supported by the Defense Advanced Research Projects Agency (DARPA) under Contract No. HR001122C0009. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Defense Advanced Research Projects Agency (DARPA).
