Report

Anterior Cingulate: Single Neuronal Signals Related to Degree of Reward Expectancy


Science  31 May 2002:
Vol. 296, Issue 5573, pp. 1709-1711
DOI: 10.1126/science.1069504

Abstract

As monkeys perform schedules containing several trials with a visual cue indicating reward proximity, their error rates decrease as the number of remaining trials decreases, suggesting that their motivation and/or reward expectancy increases as the reward approaches. About one-third of single neurons recorded in the anterior cingulate cortex of monkeys during these reward schedules had responses that progressively changed strength with reward expectancy, an effect that disappeared when the cue was random. Alterations of this progression could be the basis for the changes from normal that are reported in anterior cingulate population activity for obsessive-compulsive disorder and drug abuse, conditions characterized by disturbances in reward expectancy.

During normal activity, we continually compare our current status against our expectation for reaching a goal, with expectation increasing over the course of the activity. That implies that there are neural signals underlying this increasing expectation.

Over the past several years, we have used visually cued multitrial reward schedules in monkeys. In this task, monkeys change their error rates according to reward expectancy (1–4). To obtain a reward, monkeys must successfully complete a set (or schedule) of visual color-discrimination trials (Fig. 1A) [(2); see (5) for details of experimental procedures]. In the schedule task, the monkey has to complete between one and four color-discrimination trials successfully to obtain the reward (Fig. 1B). An unsuccessful trial is not explicitly punished, but the monkey only progresses to the next stage of a schedule when a trial is completed successfully. A second set of visual stimuli, used as cues, indicates progress through the schedule. The cues become brighter as the schedule progresses (cued condition). The only information available about the schedule and trial is provided by the cue. As in all of the previous studies making use of this task (5), the monkeys here made progressively fewer errors as the rewarded trial approached, with the fewest errors occurring in the rewarded trials (Fig. 2A), showing that the monkey actually uses the cue to regulate its behavior. When we randomized the cues with respect to the schedule so that the cues were no longer related to the schedule (random condition) (5), the monkey's error rate was always low, regardless of cue brightness (Fig. 2B). Thus, there is a substantial behavioral difference between knowing for certain what will happen in each successfully completed trial (cued condition) versus knowing the overall reward rate without knowing the outcome of each trial for certain (random condition).

Figure 1

Behavioral task. (A) Visual color-discrimination trial. When the monkey touches a bar, the fixation point (FP) comes on (800 ms must have elapsed after the cue comes on) in the center of the computer video monitor in front of the monkey. A red target then appears. After a variable waiting period lasting between 400 and 1200 ms, the target color becomes green, which instructs the monkey to release the bar. If the bar is released within 1 s after the onset of the green target, the target turns blue to signal to the monkey that the trial is correct. (B) Cued multitrial reward schedule. A four-trial schedule is shown. The rectangular cue varied from black to white in direct proportion to the schedule fraction (trial number/schedule length). The schedule fraction quantifies progress toward the rewarded trial, i.e., 1/4, 2/4, 3/4, 4/4, 1/3, 2/3, 3/3, 1/2, 2/2, 1/1. The brightness of the visual cue was changed at the onset of each trial. Thus, the monkeys could interpret the meaning of the cue before responding to the target in the forthcoming trial (cued condition). The monkeys had to complete each schedule before beginning a new one, regardless of how many errors they made.
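The schedule fraction described in this caption can be enumerated with a short sketch. The function names and the 0-to-1 brightness scale are assumptions made for illustration; the source states only that cue brightness varied from black to white in direct proportion to trial number/schedule length.

```python
from fractions import Fraction

def schedule_states(max_length=4):
    """Enumerate (trial, length) pairs for schedules of max_length down to 1 trials."""
    return [(trial, length)
            for length in range(max_length, 0, -1)
            for trial in range(1, length + 1)]

def schedule_fraction(trial, length):
    """Progress toward the rewarded trial: trial number / schedule length."""
    return Fraction(trial, length)

def cue_brightness(trial, length):
    """Assumed scale: 0.0 is black, 1.0 is white, proportional to the fraction."""
    return float(schedule_fraction(trial, length))

def is_rewarded(trial, length):
    """The reward follows the last trial of a schedule (fraction equal to 1)."""
    return trial == length

states = schedule_states()
labels = [f"{trial}/{length}" for trial, length in states]
print(labels)
```

Note that different states can share a brightness under this scale (for example 2/4 and 1/2), so the cue alone conveys the fraction rather than the schedule length.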

Figure 2

Error rate plotted against schedule fraction. (A) The error rate decreased as the monkey progressed through the schedule (cued condition). (B) The error rate was always low when the schedule sequence was randomized (random condition). The error rates were averaged over all recording sessions.
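As a minimal sketch of how an error rate per schedule fraction could be tabulated, the following groups trial records by fraction label and divides errors by attempts. The record format and the toy records are invented for illustration and are not data from the study; the pooling across sessions used for the figure is not specified here.

```python
from collections import defaultdict

def error_rate_by_fraction(trials):
    """Return {fraction label: errors / attempts} from (label, correct) records."""
    attempts = defaultdict(int)
    errors = defaultdict(int)
    for label, correct in trials:
        attempts[label] += 1
        if not correct:
            errors[label] += 1
    return {label: errors[label] / attempts[label] for label in attempts}

# Toy records, made up only to exercise the function (not data from the study).
toy_trials = [
    ("1/2", False), ("1/2", True), ("1/2", True), ("1/2", True),
    ("2/2", True), ("2/2", True), ("2/2", True), ("2/2", True),
]
rates = error_rate_by_fraction(toy_trials)
print(rates)
```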

For neurons in ventral striatum (2) and perirhinal cortex (4), responses occurred in specific trials of the reward schedules, with the response strengths being similar in all trials showing responses. The trials in which responses occurred appeared idiosyncratic. Thus, although the populations of neurons in either ventral striatum or perirhinal cortex could be used to decode progress through reward schedules, no single neuron carried a signal that varied directly with schedule progress or reward expectancy.

We hypothesized that within the brain's reward system, there should be a signal related to the degree of reward expectancy. For several reasons, the anterior cingulate cortex (6–10) seemed a promising site for such a signal. It appears to have a role in performance monitoring and error detection, conflict monitoring, and response selection, all of which depend on assessing reward proximity or likelihood (11–18). Several neuronal recording studies have shown associations between sensory stimuli and the expectation of various outcomes, such as reward or pain (19–24). Finally, in several imaging studies of patients with disturbances in motivation and reward expectation, such as obsessive-compulsive disorder and drug abuse, the anterior cingulate has shown increased activation compared with the anterior cingulate in normal subjects (25–38).

We recorded from 106 single neurons in area 24c of anterior cingulate cortex [ventral bank of anterior cingulate sulcus, a part of rostral cingulate motor area (39), confirmed by magnetic resonance imaging (40)] of monkeys performing the cued multitrial reward schedule task. A substantial number of neurons (94/106) showed selective responses during the reward schedule task. For 69 neurons, activity was idiosyncratically related to the schedule. There were responses related to the cue, the bar release, and/or the reward. This activity was similar to the activity previously reported in the ventral striatum (2), and hence we do not focus on it here.

The 33 neurons forming the basis of this study showed responses that progressively increased or decreased through the schedules as the reward expectancy increased (Fig. 3). For 18 of these 33 neurons, we were able to test the random condition. For all 18, the responses either disappeared or lost modulation against the schedules when the schedule sequence was randomized (Fig. 3). Because the only factor that changes (each trial is the same as every other trial, and the cues are still present) is whether or not the cue is a valid predictor of the current schedule state, the differences can only be related to the cognitive state of the monkey. These differences are not related to differences in motor activity because all trials are the same. Furthermore, there were responses even in error trials and subsequent correct trials for the neuron in Fig. 3A, which indicates that the responses are not explained by a simple motor sequence.

Figure 3

Neural responses across the entire four-trial schedule. Black lines and dots indicate the cued condition; red indicates the random condition. For rasters and spike densities, responses are aligned to the event in the rewarded trial (vertical lines) and thus are somewhat misaligned in earlier trials (refer to event timing relations in Fig. 1). Boxplots show spike count distributions for all trials in the 500 ms following events (see color code); “bkgrd” indicates activity between trials. The top and bottom limits of each box show the range from the 75th to 25th percentile responses, and the line at the center of the notch is the median. Boxplot notches show the 95% confidence interval for the median. If the notches for two boxes do not overlap, the medians are significantly different (P < 0.05), thus allowing for visual estimates of statistical significance. Statistics quoted in the text were calculated explicitly. (A) Phasic response at the onset of the cue. Rasters and spike densities are aligned to the cue in the rewarded trial. Response strength increased through unrewarded trials and ceased in the rewarded trials. There was no response in the random condition. (B) Phasic response to bar release. The rasters and spike densities are aligned to bar release in the rewarded trial. Response strength to bar release increases through the schedule in the cued condition. Only every second raster dot is shown. (C) Sustained response with superimposed phasic response. Rasters and spike densities are aligned to activation of the reward apparatus in the rewarded trial. The response disappeared before the reward in the last trials. Only every fourth raster dot is shown. (D) Response increasing through the schedule (cued condition). The rasters and spike densities are aligned to activation of the reward apparatus in the rewarded trial. In the random condition there was continuing activity that was interrupted between trials (bkgrd).
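The notch comparison described in this caption can be sketched as follows. The source does not state how the notches were computed; this sketch assumes the commonly used convention of median ± 1.57 × IQR / √n, and the function names and spike counts are hypothetical.

```python
import math
import statistics

NOTCH_CONSTANT = 1.57  # assumption: conventional notch factor, not stated in the source

def notch_interval(counts):
    """Approximate 95% confidence interval for the median of spike counts."""
    q1, median, q3 = statistics.quantiles(counts, n=4, method="inclusive")
    half_width = NOTCH_CONSTANT * (q3 - q1) / math.sqrt(len(counts))
    return median - half_width, median + half_width

def notches_overlap(counts_a, counts_b):
    """True if the two notch intervals share any value."""
    low_a, high_a = notch_interval(counts_a)
    low_b, high_b = notch_interval(counts_b)
    return low_a <= high_b and low_b <= high_a

# Made-up spike counts for two trial types (illustration only).
early_trial = [0, 1, 1, 2, 2, 2, 3, 3, 4]
late_trial = [10, 11, 11, 12, 12, 12, 13, 13, 14]
print(notch_interval(early_trial), notches_overlap(early_trial, late_trial))
```

Non-overlapping notches are only a visual guide to a difference in medians, which is consistent with the caption's note that the statistics quoted in the text were calculated explicitly.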

Twenty-seven of these 33 neurons had phasic components related to one or more of the trial events (i.e., cue on, wait on, go on, bar release, or reward). Ten of these 27 neurons showed activity that increased significantly [analysis of variance (ANOVA) with repeated measures for effect of schedule progress in cued condition, P < 0.05, for each neuron; see (5) for a description of ANOVA with repeated measures] as the schedule progressed through the unrewarded trials, and diminished in the rewarded trial. An example of a neuron that responded phasically to the cue in the unrewarded trials, but did not respond in the rewarded trial, is shown in Fig. 3A. Thus, these responses disappeared when the monkey knew that the reward was immediately forthcoming, which suggests that an aspect of expectancy related to the outcome resolved when the outcome was certain and immediate. This type of activity could be related to conflict monitoring because there were responses when the monkey had to perform trials correctly while faced with the knowledge that no reward would be forthcoming.

Eight other phasic neurons showed the strongest responses in the rewarded trial (Fig. 3B). These may be related to the progress of the schedule or proximity to the reward. The neuron in Fig. 3B also shows strong phasic activity at the time of bar release in all trials in the random condition. Nine neurons showed phasic responses that became progressively smaller through the course of the schedule.

Nine neurons (some of these overlapped with the phasic neurons; see, e.g., Fig. 3C) showed sustained activity lasting through most of the schedule. In Fig. 3C, the neuron showed sustained activity that persisted from the onset of the first trial and ended just before the reward was given in the last trial. The activity dropped to a low level when the schedule sequence was randomized. Seven neurons belong to this group. Two neurons showed activity that increased tonically through the course of the schedule; the activity disappeared only after the reward was delivered (Fig. 3D). In the random condition, the activity varied up and down with each trial. Neurons of the types shown in Fig. 3, B and D, showed responses of an intermediate strength in the random condition; nonetheless, the responses are greatest in trials wherein the monkey is certain that correct trial performance will be rewarded.

The findings are consistent with our hypothesis that the anterior cingulate cortex activity reflects, or perhaps regulates, the degree of reward expectancy during progress through a multistage task. Many of the functions suggested previously for the anterior cingulate—performance monitoring and error detection, conflict monitoring, and response selection (11–18)—depend on reaching some expected result, such as reward, other goal, or even a painful stimulus.

Previous single-neuron recordings in cingulate that made use of a single-trial task have shown several types of signals. Neurons in the anterior cingulate have been related to stimulus-reward associations, some of which show increasing activity over the course of a trial (19), associations with objects related to reward in single trials (20, 21), and painful stimuli (22). Neurons in the cingulate motor area increase their activity as a reward is reduced over the course of trials when the animal comes closer and closer to voluntarily switching to an alternative motor response that will increase reward. The authors concluded that neurons in the cingulate motor area are related to “reward-based motor selection” (18). In light of our findings, these switching results could be related to increasing expectation for switching behavior to obtain increased reward, thus giving a single interpretation for their results and ours.

Our task involving multitrial reward schedules was designed to keep motor and sensory stimulation the same in cued and random conditions while systematically manipulating the level of motivation and expectancy. Through the use of reward schedules, neuronal responses related to sensory and motor events could be separated from associatively derived signals. Based on our finding that about one-third of anterior cingulate neurons show response modulation that covaries with level of expectancy, and in light of the results from other studies, we suggest that the functions of the anterior cingulate are connected through their dependence on reward or goal expectancy. When certainty about outcome is removed from expectation (our random condition), the progressive modulation disappears. The neuronal signals that directly encode the degree of reward expectancy might be related to the feelings of increasing anticipation that are experienced over stages toward a predicted outcome. We thus speculate that these signals in particular will be disturbed in disorders of motivation and reward expectation such as obsessive-compulsive disorder and drug abuse.

* To whom correspondence should be addressed. E-mail: m.shidara@aist.go.jp
