Story May 4, 2023

Does artificial intelligence really understand us?

In developing a new metric, known as Conceptual Consistency, SRI researchers measure how much AI truly knows.

Computers that see. Chatbots that chat. Algorithms that paint on command. The world is finally getting a glimpse of the true promise of artificial intelligence (AI). And yet an argument rages as to whether these applications are truly intelligent. It is a question that goes to the very heart of what it means to be a human being: comprehending the world around us, creating ideas, words, and other new things, and possessing self-awareness.

Now a team of researchers led by SRI International’s Ajay Divakaran, technical director of the Vision and Learning Laboratory at SRI’s Center for Vision Technologies, has set out to answer a provocative question: How much does AI really “understand” about the world? Divakaran, SRI colleagues Michael Cogswell and Yunye Gong, former intern Pritish Sahu, and Professor Yogesh Rawat and his doctoral student Madeleine Schiappa at the University of Central Florida have developed a way to calculate just how much artificial intelligence knows. They call it conceptual consistency.

“Deep learning models, like ChatGPT, DALL-E, and others, have demonstrated fairly remarkable performance in many humanlike tasks, but it is not clear if they do so by mere rote memory or possess true conceptual models of the way the world works,” Divakaran says.

He provides an example from one of the team’s papers of a visual and language (V+L) model trained to evaluate and describe images. A conceptually consistent model should know that the description “snow garnished with a man” is not only implausible but impossible. By the same token, Divakaran says, a similar model should be able to positively assert that a chair is not just a chair but a beach chair by taking contextual clues from the image—for instance, that the chair in question is situated on a beach.

Seemingly simple but creative leaps in logic and reasoning like these are hallmarks of human intelligence, Divakaran says, and are critical to the sort of truly intelligent AI used in life-and-death applications like autonomous cars and airplanes. In these uses, AI must understand the world rather than fall back on mere memory. The researchers hope the conceptual consistency metric can help developers of AI improve the reliability of their applications.

“We have developed a way to test this key distinction, and we can use it to evaluate when we can have faith in AI’s capabilities and when we need to be more skeptical of AI and more conservative in our use of these still-new technologies,” he explains.

Conceptual consistency works whether the output being judged is language, as with ChatGPT, or images, as with DALL-E and other algorithms that can “see” and identify objects in photographs. Divakaran and colleagues refer to models that combine language and images as multimodal models. A computer vision algorithm used in an autonomous vehicle, for example, must be able to see objects in the world, know what they are, and reason about how to respond to those objects.

“In its most basic level, conceptual consistency measures whether AI’s knowledge of relevant background information is consistent with its ability to answer follow-on questions correctly,” Divakaran says. “Conceptual consistency measures AI’s depth of understanding.”

In one paper, Divakaran and his co-authors provide the example query, “Is a mountain tall?” A large language model (LLM) is likely to answer correctly, with a simple “Yes.” While that is all well and good, it is hardly remarkable, Divakaran would argue. What’s more important, and more indicative of true intelligence, is the generalizability of the model’s understanding about mountains—its conceptual consistency. A conceptually consistent model should also be able to answer more difficult queries about mountains correctly. But often the deeper one probes, the less conceptually consistent the models become.

The great fear among skeptics, and a still-open question, is whether an LLM can answer only from its existing knowledge base and therefore cannot make the sort of creative or tangential leaps of the best human minds.

“LLMs’ memory is limited to the data they have at their disposal and is therefore only mimicking the data used to train them, using probability to assemble words and ideas through pattern recognition in ways that other humans have in the past,” Divakaran explains.

To put it simply, AI does not have a mind of its own; it is repeating or perhaps reorganizing what human minds have already produced. The SRI team measures a model’s background knowledge on a topic and uses it to predict whether the model will answer questions on that topic correctly. The resulting conceptual consistency score quantifies how well the model’s knowledge of relevant background agrees with its ability to perform a given task.
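For readers who want a concrete picture, the idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SRI team’s implementation: the article does not give the formula, so the sketch assumes that background knowledge is scored as the fraction of background facts a model gets right, and that consistency is summarized as the average precision of ranking questions by that score. The `Probe` class, the function names, and the toy data are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Probe:
    """One follow-on question and the model's results on related background facts.

    Hypothetical structure for illustration; not taken from the SRI papers.
    """
    background_correct: List[bool]  # was each background fact answered correctly?
    answer_correct: bool            # was the follow-on question answered correctly?


def background_score(probe: Probe) -> float:
    """Fraction of background facts the model got right (assumed scoring rule)."""
    return sum(probe.background_correct) / len(probe.background_correct)


def average_precision(scores: List[float], labels: List[bool]) -> float:
    """Average precision of ranking items by score, highest first."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / hits if hits else 0.0


def conceptual_consistency(probes: List[Probe]) -> float:
    """How well background knowledge predicts correct follow-on answers."""
    scores = [background_score(p) for p in probes]
    labels = [p.answer_correct for p in probes]
    return average_precision(scores, labels)


# Toy data (made up): a model whose answers track its background knowledge...
consistent = [
    Probe([True, True, True], True),
    Probe([True, False, True], True),
    Probe([False, False, True], False),
    Probe([False, False, False], False),
]
# ...and one whose correct answers come where its background knowledge is weakest.
inconsistent = [
    Probe([True, True, True], False),
    Probe([True, False, True], False),
    Probe([False, False, True], True),
    Probe([False, False, False], True),
]

print(conceptual_consistency(consistent))
print(conceptual_consistency(inconsistent))
```

In this toy setup, the first model scores higher because its correct answers coincide with strong background knowledge, while the second model’s correct answers look more like lucky guesses or memorized facts.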

In experiments, Divakaran and colleagues have arrived at several interesting conclusions. A model’s knowledge of background information can be used to predict when it will answer questions correctly, and conceptual consistency generally grows with the scale of the model. “Bigger models are not just more accurate, but also more consistent,” Divakaran and co-authors wrote in one of their recent papers. GPT-3, a predecessor of the LLM behind ChatGPT, does show a moderate amount of conceptual consistency. Multimodal models, however, have not yet been investigated as rigorously.

“At the very least, conceptual consistency can help us know when it’s safe to trust AI and when a go-slow approach is warranted,” Divakaran says.
