Perspective: Machine Learning

Understanding spatial environments from images


Science  15 Jun 2018:
Vol. 360, Issue 6394, pp. 1188
DOI: 10.1126/science.aat9641

The ability to understand spatial environments on the basis of visual perception is arguably a key function of the cognitive system of many animals, including mammals. A common presumption about artificial intelligence is that its goal is to build machines with a similar capacity for “understanding.” The artificial intelligence research community, however, has settled on a more pragmatic approach: instead of attempting to model or quantify understanding directly, the objective is to construct machines that merely solve tasks that seem to require understanding. Understanding can then be measured only indirectly, for example, by analyzing a system's ability to generalize to new tasks, which is sometimes called transfer learning (1). Transfer learning is particularly appealing in an unsupervised setting, in which the objective of the original task is defined in terms of the input data itself, without requiring additional, task-specific information (see the figure). On page 1204 of this issue, Eslami et al. (2) present an important step toward building machines that learn to understand spatial environments through unsupervised transfer learning. Remarkably, their system relies only on inputs from its own image sensors and learns autonomously, without human supervision.
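The key idea of such an unsupervised objective can be sketched in miniature. The toy example below is only an illustration of the setup, not the neural architecture of Eslami et al.: each "scene" is a hidden latent vector, each "view" is a fixed projection of that latent, and the learning target (a held-out query view) is itself part of the input data, so no external labels are needed. A simple linear predictor fit on some scenes is then evaluated on unseen scenes, which is the indirect, generalization-based measure of understanding discussed above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for "scenes": each scene is a hidden 2-D latent z.
# A "view" from fixed camera k is a linear projection of z plus noise.
n_scenes, latent_dim, view_dim = 200, 2, 3
cams = [rng.normal(size=(view_dim, latent_dim)) for _ in range(3)]  # 2 context cameras + 1 query camera

Z = rng.normal(size=(n_scenes, latent_dim))
views = [Z @ C.T + 0.01 * rng.normal(size=(n_scenes, view_dim)) for C in cams]

# Unsupervised objective: predict the held-out query view (views[2])
# from the two context views -- the target comes from the input data itself.
X = np.hstack([views[0], views[1]])
y = views[2]

# Fit a linear predictor (ordinary least squares) on the first 150 scenes.
W, *_ = np.linalg.lstsq(X[:150], y[:150], rcond=None)

# Evaluate on 50 unseen scenes: low error means the predictor generalized
# beyond the scenes it was trained on.
test_err = np.mean((X[150:] @ W - y[150:]) ** 2)
print(f"held-out query-view prediction error: {test_err:.4f}")
```

In this sketch the relation between views is linear, so least squares suffices; the system of Eslami et al. learns a far richer, nonlinear mapping from context images and viewpoints to novel rendered views, but the supervisory signal has the same form.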