Report

Categorical Representation of Visual Stimuli in the Primate Prefrontal Cortex


Science 12 Jan 2001:
Vol. 291, Issue 5502, pp. 312-316
DOI: 10.1126/science.291.5502.312

Abstract

The ability to group stimuli into meaningful categories is a fundamental cognitive process. To explore its neural basis, we trained monkeys to categorize computer-generated stimuli as “cats” and “dogs.” A morphing system was used to systematically vary stimulus shape and precisely define the category boundary. Neural activity in the lateral prefrontal cortex reflected the category of visual stimuli, even when a monkey was retrained with the stimuli assigned to new categories.

Categorization refers to the ability to react similarly to stimuli when they are physically distinct, and to react differently to stimuli that may be physically similar (1). For example, we recognize an apple and a banana to be in the same category (food) even though they are dissimilar in appearance, and we consider an apple and a billiard ball to be in different categories even though they are similar in shape and sometimes color. Categorization is fundamental; our raw perceptions would be useless without our classification of items as furniture or food. Although a great deal is known about the neural analysis of visual features, little is known about the neural basis of the categorical information that gives them meaning.

In advanced animals, most categories are learned. Monkeys can learn to categorize stimuli as animal or non-animal (2), food or non-food (3), tree or non-tree, fish or non-fish (4), and by ordinal number (5). The neural correlate of such perceptual categories might be found in brain areas that process visual form. The inferior temporal (IT) and prefrontal (PF) cortices are likely candidates; their neurons are sensitive to form (6–9) and they are important for a wide range of visual behaviors (10–12).

The hallmark of perceptual categorization is a sharp “boundary” (13). That is, stimuli from different categories that are similar in appearance (e.g., apple/billiard ball) are treated as different, whereas distinct stimuli within the same category (e.g., apple/banana) are treated alike. Presumably, there are neurons that also represent such sharp distinctions. This is difficult to assess with a small subset of a large, amorphous category (e.g., food or human): because the category boundary is unknown, it is unclear whether neural activity reflects category membership or physical similarity.

We used a three-dimensional morphing system to generate stimuli that spanned two categories, “cats” and “dogs.” Three species of cats and three breeds of dogs served as prototypes (14–16); the morphed images were linear combinations of all possible arrangements between them (Fig. 1). By blending different amounts of “cat” and “dog,” we could continuously vary the shape and precisely define the category boundary (17). Thus, stimuli that were close to but on opposite sides of the boundary could be similar, whereas stimuli that belonged to the same category could be dissimilar (e.g., “cheetah” and “housecat”) (18).
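The blending idea can be sketched in a few lines of Python. This is only an illustration, not the authors' morphing software: the prototype labels follow Fig. 1, while the function names and the rule that a morph belongs to whichever category contributes the larger share are assumptions made for the example.

```python
# Illustrative sketch: a morph is described by its blend weights over the
# six prototypes. Names and the majority-share rule are assumptions.
PROTOTYPES = ["C1", "C2", "C3", "D1", "D2", "D3"]


def morph_weights(proto_a, proto_b, fraction_a):
    """Blend weights for a morph on the line between two prototypes."""
    if not 0.0 <= fraction_a <= 1.0:
        raise ValueError("fraction_a must lie between 0 and 1")
    weights = {name: 0.0 for name in PROTOTYPES}
    weights[proto_a] += fraction_a
    weights[proto_b] += 1.0 - fraction_a
    return weights


def category(weights):
    """Assign the category that contributes the larger share of the blend."""
    cat_share = sum(w for name, w in weights.items() if name.startswith("C"))
    dog_share = sum(w for name, w in weights.items() if name.startswith("D"))
    if cat_share == dog_share:
        return None  # exactly on the boundary: no category is assigned
    return "cat" if cat_share > dog_share else "dog"


dog_like_cat = morph_weights("C1", "D1", 0.6)  # 60:40 cat:dog
cat_like_dog = morph_weights("C1", "D1", 0.4)  # 60:40 dog:cat
print(category(dog_like_cat), category(cat_like_dog))
```

Under this rule the 60:40 and 40:60 morphs differ by only 20% of the blend yet fall in different categories, whereas two cat prototypes share a category despite being fully distinct blends.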

Figure 1

The stimuli. (A) Monkeys learned to categorize randomly generated “morphs” from the vast number of possible blends of six prototypes. For neurophysiological recording, 54 sample stimuli were constructed along the 15 morph lines illustrated here. The placement of the prototypes in this diagram does not reflect their similarity. (B) Morphs along the C1-D1 line.
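The count of 15 morph lines follows from pairing the six prototypes. A short Python check, using the prototype labels from the figure (the variable names are ours), also separates the lines that cross the category boundary from those that stay within one category:

```python
from itertools import combinations

prototypes = ["C1", "C2", "C3", "D1", "D2", "D3"]

# Every unordered pair of prototypes defines one morph line.
lines = list(combinations(prototypes, 2))

# A line crosses the category boundary when its endpoints differ in category
# (the first letter of the label: C for cat, D for dog).
between = [pair for pair in lines if pair[0][0] != pair[1][0]]
within = [pair for pair in lines if pair[0][0] == pair[1][0]]

print(len(lines), len(between), len(within))
```

The nine between-category lines are the ones along which a morph passes from cat to dog.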

Two monkeys performed a delayed match-to-category (DMC) task (Fig. 2A) that required judging whether a sample and test stimulus were from the same category (19). Performance was high (about 90% correct), even when the samples were close to the category boundary (Fig. 2B). The monkeys classified dog-like cats (60:40 cat:dog) correctly about 90% of the time, and misclassified them as dogs only 10% of the time; they did as well with cat-like dogs (60:40 dog:cat).

Figure 2

Task design and behavior. (A) A sample was followed by a delay and a test stimulus. If the sample and test stimulus were the same category (a match), monkeys were required to release a lever before the test disappeared. If they were not, there was another delay followed by a match. Equal numbers of match and nonmatch trials were randomly interleaved. (B) Average performance of both monkeys. Red and blue bars indicate percentages of samples classified as “dog” and “cat,” respectively.

We made recordings from 395 neurons from the lateral PF cortices of two monkeys (20) (Fig. 3A). The majority of neurons were activated during the sample and/or delay interval (253/395 or 64%) (21). They often reflected the sample's category. Nearly one-third of responsive neurons (82/253) were category-selective in that they exhibited an overall difference in activity during the sample and/or the delay interval to cats versus dogs. Similar numbers of neurons preferred cats (sample interval, 35/65; delay interval, 21/44) and dogs (sample, 30/65; delay, 23/44).
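The proportions in this paragraph can be checked with simple arithmetic. The Python snippet below uses only the counts reported above; the variable names are ours.

```python
recorded = 395
responsive = 253
category_selective = 82

# Share of recorded neurons that were responsive, as a whole percent.
pct_responsive = round(100 * responsive / recorded)

# Share of responsive neurons that were category-selective.
frac_selective = category_selective / responsive

# Cat- and dog-preferring counts should add up to each interval's total.
sample_total = 35 + 30  # sample interval
delay_total = 21 + 23   # delay interval

print(pct_responsive, round(frac_selective, 2), sample_total, delay_total)
```

The sample and delay totals (65 and 44) sum to more than 82, which is expected if some neurons were selective in both intervals.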

Figure 3

Recording locations and single neuron example. (A) Recording locations in both monkeys. A, anterior; P, posterior; D, dorsal; V, ventral. There was no obvious topography to task-related neurons. (B) The average activity of a single neuron in response to stimuli at the six morph blends. The vertical lines correspond (from left to right) to sample onset, offset, and test stimulus onset. The inset shows the neuron's delay activity in response to stimuli along each of the nine between-class morph lines (see Fig. 1). The prototypes (C1, C2, C3, D1, D2, and D3) are represented in the outermost columns; each appears in three morph lines. A color scale indicates the activity level.

Figure 3B shows an example of a single neuron that exhibited greater activity in response to dogs than to cats and responded similarly to samples from the same category, regardless of their degree of dogness or catness. Its activity was different in response to stimuli near the category boundary, the cat-like dogs (60:40 dog:cat) versus the dog-like cats (60:40 cat:dog) (22), but there was no difference in activity elicited by these stimuli and by their respective prototypes (the 100% cats or dogs) (23). The inset in Fig. 3B shows the neuron's activity in response to each of the 54 samples. It exhibited overall greater activity in response to dogs than to cats, but there were small differences within categories. Just a few stimuli elicited activity that was similar to that from the other category. These stimuli were not consistent across different neurons, however. Across the population of neurons, category activity appeared at the start of neural responses to the sample, about 100 ms after sample onset (24).

We examined all stimulus-selective neurons, irrespective of whether they were category-selective per se (25). For each neuron, we computed the difference in activity between pairs of samples at different positions along each between-category morph line (Fig. 1A). In Fig. 4, A and B, each neuron's average difference in response to pairs of samples from the same category (within-category difference, WCD) is plotted against its difference in response to samples from different categories (between-category difference, BCD). If neurons were not sensitive to categories, these measures should be similar (i.e., BCD/WCD ratios should be close to 1 and the points should cluster around the diagonal). Instead, the BCD values are significantly higher than the WCD values. This indicates greater activity differences in response to samples from different categories, especially during the delay (26).
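A minimal Python sketch of this kind of comparison is given here. It is not the authors' exact procedure: it assumes six equally spaced morph levels along one between-category line, compares only adjacent levels so that every pair is the same distance apart in morph space, and uses made-up firing rates.

```python
from statistics import mean

# Assumed morph levels along one between-category line, as percent "cat".
# The category boundary lies between the 60% and 40% levels.
LEVELS = [100, 80, 60, 40, 20, 0]
BOUNDARY_STEP = 2  # index of the 60-to-40 step among the adjacent pairs


def wcd_bcd(rates):
    """Return (WCD, BCD) for one neuron and one morph line.

    rates: mean firing rate at each morph level, in LEVELS order.
    """
    if len(rates) != len(LEVELS):
        raise ValueError("expected one rate per morph level")
    # Absolute activity difference for each pair of adjacent morph levels.
    steps = [abs(a - b) for a, b in zip(rates, rates[1:])]
    bcd = steps[BOUNDARY_STEP]
    wcd = mean(s for i, s in enumerate(steps) if i != BOUNDARY_STEP)
    return wcd, bcd


# Hypothetical neuron whose activity tracks physical similarity only.
graded = wcd_bcd([2, 4, 6, 8, 10, 12])
# Hypothetical neuron whose activity jumps at the category boundary.
categorical = wcd_bcd([5.0, 5.5, 5.0, 12.0, 12.5, 12.0])
print(graded, categorical)
```

The graded neuron changes by the same amount at every step, so it would fall on the diagonal, while the categorical neuron changes mostly at the boundary and would fall well off it.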

Figure 4

Category effects in a neural population. (A and B) Average differences in activity in response to samples from the same (WCD) and different (BCD) categories for the sample (A) and delay interval (B). Each point represents one neuron. The dotted line indicates equal differences irrespective of category. The solid line indicates the regression line. (C and D) Average activity of the neural population (and standard error) in response to stimuli at different morph levels of their preferred and nonpreferred categories for the sample (C) and delay (D) intervals.

The average activity of all stimulus-selective neurons at different morph levels is shown in Fig. 4, C and D (27). There was a significant difference in activity between the categories (28), but activity was similar at the different morph levels within each category (29), indicating greater sensitivity to stimulus category than to identity. Few category-selective neurons conveyed significant identity information (sample interval, 20/65 or 31%; delay interval, 10/44 or 23%) (30). Also, PF neural responses to the test stimulus seemed to reflect category evaluation. Many PF neurons showed enhanced or suppressed activity when the test stimulus matched the category of the sample (112/395 or 28%) (31). Similar effects were reported for identity matches in the PF and IT cortex (32).

Because our monkeys had no experience with cats or dogs before training, it seemed likely that the categories were learned. We thus retrained one monkey on the DMC task after defining two new category boundaries that were orthogonal to the original boundary (Fig. 1A). This created three new classes, each containing morphs centered around one cat prototype and one dog prototype (e.g., the cheetah and the “doberman”). After training, the monkey was able to perform the new three-category DMC task at >85% correct. We then recorded from 103 PF neurons from the same depths and locations in the PF cortex, using the same samples as in the original two-category task.

Neural responsiveness (58% or 60/103) (33) and stimulus-selectivity (35% or 21/60) (34) during the three-category task were similar to those during the two-category task (64% or 253/395, and 28% or 73/253, respectively), but the original categories were no longer reflected in activity (35). Instead, the three new categories were evident in delay activity (36). As during the two-category task, category information was stronger during the delay (37), possibly because it is relevant for the judgment after the delay. “Prospective activity” is stronger nearer the relevant event (38, 39) and appears earlier within a trial as task proficiency increases (40). The monkey was not as proficient at the three-category task, and its reaction times were significantly longer (41).

Categorization of sensory inputs is the nexus between perception and cognition; thoughts and behaviors depend on knowledge of the types of things around us. The sharp transition in neural activity we observed is consistent with a “classical,” perceptual category boundary. More conceptual categories can have “fuzzy” boundaries and are unlikely to exhibit such properties (42). Perceptual categorization relies on extraction of the combinations of features defining a category. These features were not explicitly instructed, were acquired by training, and were necessarily multivariate abstractions; the categories differed by more than a few simple features. PF activity could have reflected, and/or resulted in, a shifting of attention to those features (43).

These results fit well with studies suggesting that PF neural circuitry is malleable. Experience has been shown to induce and modify the sensitivity of PF neurons to specific stimuli (44, 45), and PF activity reflects learned associations and rules (40, 46, 47). Of course, the PF cortex is not likely to be the only brain area involved in categorization. The PF cortex is interconnected with temporal lobe structures important for long-term memory (48), including the IT cortex, whose neurons have stimulus specificities that could contribute to categorization (7, 49). Interactions between the PF and IT cortices underlie the storage and/or recall of visual memories and associations (50–52), but not necessarily visual short-term memory (53). The storage and recall of categories may also require such collaboration.

* To whom correspondence should be addressed. E-mail: ekm@ai.mit.edu

REFERENCES AND NOTES
