
Working in SRI’s AI Center, Sequeira creates autonomous systems that learn, reason, and adapt under uncertainty.
As an advanced computer scientist in SRI’s Artificial Intelligence Center, Pedro Sequeira focuses his research on bridging the gap between AI and human behavior. Before joining SRI, he earned his PhD in Information Systems and Computer Engineering with a specialization in Artificial Intelligence at Instituto Superior Técnico in Portugal.
Here, he explores the frontier of AI-human collaboration and explains how some recent SRI projects are advancing this fast-moving field.
A lot of your work is focused on human-machine collaboration. How did you find your way to that research area?
My background is mostly in machine learning, in particular reinforcement learning, which deals with autonomous sequential decision-making.
My work focuses on the intersection between machine learning and humans. And what really interests me is learning from humans.
A lot of my work deals with learning from demonstration, trying to help computers model human behaviors, understand people’s goals, and create models to predict their future behavior. That’s especially important if we want to make AI systems that can be personalized to a particular user.
When many people think about machine learning, their minds immediately jump to generative AI and large language models. But that’s just one of many forms of machine learning. How big of a role do large language models play in your own work?
The initial work I did in the autonomy space had literally nothing to do with LLMs. Today, it’s important to recognize that there are other types of machine learning foundation models that are not “large language models”: vision language models, for example, and, more recently, vision language action models. That’s often closer to what I do, because these models help a computer system interpret the environment in light of its own goals and needs, and then come up with plans and sequences of behavior to achieve those goals autonomously.
Now, LLMs can be used for reasoning, which can drive the sequential behavior. A vision model helps a system understand what it’s seeing, for example, and then an LLM can help it reason about how what it sees connects to its goals. That’s because LLMs have strong implicit knowledge. But LLMs are notoriously bad at long-term planning. That’s why we want to integrate LLMs with more traditional AI methods such as planning tools, which complement LLM capabilities.
But overall, when we’re talking about human-machine teaming, LLMs are rarely the entire picture. Multimodal models (not just language models) and tool use (access to external software like action planners and internet search services) will be the key to solving most of these problems around human-machine teaming.
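The division of labor described above can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in, not SRI's system: `perceive` stands in for a vision model, `reason` for an LLM, and a breadth-first search plays the role of the classical planner that handles the long-horizon sequencing LLMs struggle with.

```python
from collections import deque

def perceive(scene):
    """Vision-model stand-in: map raw observations to symbolic facts."""
    return {"robot_at": scene["robot"], "goal_at": scene["goal"]}

def reason(facts):
    """LLM stand-in: turn symbolic facts into a concrete planning goal."""
    return facts["goal_at"]

def plan(start, goal, graph):
    """Classical planner (BFS) produces the long-horizon action sequence."""
    frontier, seen = deque([(start, [])]), {start}
    while frontier:
        node, path = frontier.popleft()
        if node == goal:
            return path
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + [f"move {node}->{nxt}"]))
    return None

# A toy environment: rooms the robot can move between.
rooms = {"kitchen": ["hall"], "hall": ["kitchen", "office"], "office": ["hall"]}
scene = {"robot": "kitchen", "goal": "office"}

facts = perceive(scene)
steps = plan(facts["robot_at"], reason(facts), rooms)
print(steps)  # ['move kitchen->hall', 'move hall->office']
```

The point of the sketch is the separation of concerns: perception and reasoning interpret the situation, while a dedicated planner, not the LLM, sequences the actions.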
What are some of SRI’s specific areas of strength in applying AI to human-machine teaming?
One of them is certainly LLM personalization. That was the focus of our work on DARPA KMASS. If you’ve interacted with LLMs, you know that they can often give pretty generic responses. Particularly if you’re working in a very specialized domain of knowledge, they really don’t understand what you need and what you want. So how can these systems better understand individual users? How can a system take information that’s available about a user and leverage that to tailor a response that better addresses that user’s knowledge and needs? That’s one important area of focus for us. We’re not building these big foundation models — the ChatGPTs and Llamas of the world — but rather trying to build on top of them to make them more effective for specific kinds of work.
“How can we create models that quickly understand the intentions and the goals of humans, predict their future behavior, and adapt to that behavior?” — Pedro Sequeira
Another focus has to do with agentic workflows. Essentially, trying to orchestrate LLM-based agents. Think of it like automated teamwork, like a “divide and conquer” approach. Across the board, that’s a huge area of interest. And a focus of ours, particularly in projects like COLLEAGUE, is: How can we adapt those workflows for real-time user interaction and collaboration? It’s not very easy to control those workflows for real-time interaction using existing off-the-shelf tools. There are interesting theoretical models out there for how one could chain together various agents. But as of early 2026, those models are not very functional for real-time use.
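The "divide and conquer" orchestration idea can be sketched in a few lines. These "agents" are plain functions standing in for LLM-backed workers, and the names (`orchestrate`, `summarize_agent`, `keyword_agent`) are invented for illustration only.

```python
def summarize_agent(text):
    """Worker stand-in: crude first-sentence 'summary'."""
    return text.split(".")[0].strip() + "."

def keyword_agent(text):
    """Worker stand-in: pick out longer words as 'keywords'."""
    return sorted({w.lower().strip(".,") for w in text.split() if len(w) > 7})

def orchestrate(task, agents):
    """Orchestrator: farm the task out to specialist agents
    and merge their partial results into one response."""
    return {name: agent(task) for name, agent in agents.items()}

report = orchestrate(
    "Autonomous systems must interpret their environment. They also plan.",
    {"summary": summarize_agent, "keywords": keyword_agent},
)
print(report["summary"])  # Autonomous systems must interpret their environment.
```

Real agentic frameworks add the hard parts this sketch omits, such as streaming partial results back to a user in real time, which is exactly the gap the COLLEAGUE work addresses.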
The third focus I would highlight is “theory of mind” reasoning. How can we create models that quickly understand the intentions and the goals of humans, predict their future behavior, and adapt to that behavior? That’s the focus of a recent internally funded R&D effort we’re calling ToMCAT, or “Theory of Mind for Cooperative Agents in Teams,” which can be applied in both cooperative and adversarial settings.
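One common way to formalize this kind of goal inference is Bayesian inverse planning: maintain a belief over possible goals and update it as actions are observed. The sketch below is a generic illustration of that idea with made-up numbers, not SRI's ToMCAT implementation.

```python
goals = ["coffee", "printer"]
prior = {"coffee": 0.5, "printer": 0.5}

# Assumed likelihoods: how probable each observed move is under each goal.
likelihood = {
    ("left", "coffee"): 0.9, ("left", "printer"): 0.2,
    ("right", "coffee"): 0.1, ("right", "printer"): 0.8,
}

def infer(observed_moves):
    """Bayesian update of the belief over goals after each observed move."""
    post = dict(prior)
    for move in observed_moves:
        post = {g: post[g] * likelihood[(move, g)] for g in goals}
        z = sum(post.values())            # normalize so beliefs sum to 1
        post = {g: p / z for g, p in post.items()}
    return post

belief = infer(["left", "left"])
print(belief)  # belief in "coffee" rises above 0.9 after two left moves
```

After only two observations the model is already confident about the person's goal, which is the "quickly understand intentions" property that theory-of-mind reasoning targets.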
SRI is celebrating its 80th anniversary this year, which means that in just 20 years, we’ll be turning 100. As we look ahead to 2046, what do you think the future of human-machine interaction will look like?
I think the fundamental challenge is around equipping machines with what we call “general capabilities.” We’re still far from having robots that live in our homes and do a lot of different things robustly, as humans do. There are still major issues there. Not just around their interactions and communications with humans, but also their fundamental autonomous capabilities, right? Today’s robots still need to be optimized toward highly specific tasks and use cases and are not very robust to changes in the environment.
But in 20 years, I would envision that we do have robots that live in your home and can do chores. At the very least, a wide range of simple tasks.
Already, we have LLMs that can do most general-purpose, mundane text-processing tasks, like summarization or generating a list of topics. In fact, they’re really good at that. You don’t see that yet in robots that interact with people and act in the real world. But I think that is something that we’ll see in 20 years. In fact, in less than 20 years.
Learn how SRI is inventing the future of artificial intelligence.


