Building and Using Scene Representations in Image Understanding

Citation

Baker, H. H. (1993). Building and Using Scene Representations in Image Understanding. SRI International, Artificial Intelligence Center, Menlo Park, CA.

Abstract

The task of having computers able to understand their environments through direct imaging has proved to be formidable. With its beginnings about 30 years ago, the field of computer vision has grown into a major part of the pursuit of artificial intelligence. Most elements of this pursuit – language understanding, reasoning and planning, speech – are very difficult challenges, but vision, with its high dimensionality of space, time, scale, color, dynamics, and so forth, may be the most challenging. Early attempts to develop computer vision focused on restricted situations in which it was feasible to provide the computer with fairly complete descriptions of what it would encounter. In such cases, single images provided the sensory information for analysis. As the domains of application grew, the requirements for more competent descriptions of the world increased. Dealing with three-dimensional (3D) dynamic structures (the real world) from 3D dynamic platforms (we humans) calls for greater capabilities on both the analysis and synthesis sides of the issue. The analysis side is the processing of sensory data for such tasks as recognition and navigation, and a number of techniques are discussed here for dealing with these two-, three-, and higher-dimensional data. The synthesis side is the construction of “internal” descriptions that may be used subsequently for the above tasks. This latter issue is the underlying theme we pose in this paper – developing representations from vision that will later enable effective automated operation in our 3D dynamic environments.

