Report

Encoding Predictive Reward Value in Human Amygdala and Orbitofrontal Cortex

Science  22 Aug 2003:
Vol. 301, Issue 5636, pp. 1104-1107
DOI: 10.1126/science.1087919

Abstract

Adaptive behavior is optimized in organisms that maintain flexible representations of the value of sensory-predictive cues. To identify central representations of predictive reward value in humans, we used reinforcer devaluation while measuring neural activity with functional magnetic resonance imaging. We presented two arbitrary visual stimuli, both before and after olfactory devaluation, in a paradigm of appetitive conditioning. In amygdala and orbitofrontal cortex, responses evoked by a predictive target stimulus were decreased after devaluation, whereas responses to the nondevalued stimulus were maintained. Thus, differential activity in amygdala and orbitofrontal cortex encodes the current value of reward representations accessible to predictive cues.

An organism's ability to predict future events, such as food or danger, on the basis of relevant sensory cues is emblematic of associative learning. This phenomenon can be studied with classical conditioning, whereby a previously neutral item (the conditioned stimulus, CS+) acquires importance after being paired with a biologically salient reinforcer (the unconditioned stimulus, UCS). The efficacy of conditioning depends on establishing CS-UCS links, but evidence suggests that a CS+ can invoke multiple, unique UCS representations, including sensory properties, reward value, or associated affective states (1). Clarifying the neural substrates that support these associative links has important implications for biological models of reinforcement learning (2, 3).

Neuroimaging studies emphasize the roles of amygdala and orbitofrontal cortex (OFC) in human classical conditioning (4–7), but no experiment has characterized the psychological underpinnings of these activations. Reinforcer devaluation offers a means of dissociating among the various central representations that a CS+ may engage. This approach has been applied to animal studies of appetitive learning, which show that damage to amygdala and OFC interferes with the effects of reinforcer devaluation (8–11). However, it remains unclear from lesion studies what precise information is encoded within these structures.

We used functional magnetic resonance imaging (fMRI) to determine the impact of reinforcer devaluation on responses evoked by predictive cues. If amygdala and OFC maintain representations of predictive reward value, then neural responses evoked by a CS+ should be sensitive to experimental manipulations that devalue predicted reward. On the other hand, insensitivity to devaluation would indicate that the role of these areas in associative learning is independent of, or precedes linkage to, central representations of their reward value. Thirteen hungry subjects were scanned during learning and anticipation of two food-based olfactory rewards, both before and after selective satiation (12). One odor was destined for reinforcer devaluation (target UCS), whereas the other underwent no motivational shift (nontarget UCS). Arbitrary visual images comprised target and nontarget CS+ stimuli, which were either paired (CS+p) with the corresponding UCS or unpaired (CS+u) (4). Another image was never paired with odor (CS–) (Fig. 1).

Fig. 1.

Experimental task. In a paradigm of appetitive olfactory learning, arbitrary pictures comprised the target (Tgt) and nontarget (nTgt) CS+ stimuli. These were coupled with their corresponding odor UCS on 50% of all trials, resulting in paired (CS+p) and unpaired (CS+u) event types. A nonconditioned stimulus (CS–) was never paired with odor. During each trial, subjects indicated whether the picture appeared on the left or right side, and they sniffed upon delivery of a sniff cue (red asterisk). The same contingencies and task were repeated for all sessions.
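The event structure in Fig. 1 can be sketched as a trial schedule in which each CS+ picture is followed by its odor on exactly half of its trials. This is a minimal illustration only: the function name, the trial counts, and the random seed are hypothetical, since the number of trials per session is not given in the text.

```python
import random


def build_schedule(trials_per_cs_plus=20, cs_minus_trials=20, seed=0):
    """Return a shuffled list of event labels for one session.

    Trial counts are placeholder assumptions, not values from the study.
    """
    if trials_per_cs_plus % 2:
        raise ValueError("trials_per_cs_plus must be even for an exact 50% split")
    half = trials_per_cs_plus // 2
    events = []
    for cue in ("Tgt", "nTgt"):
        events += [f"{cue} CS+p"] * half  # picture followed by its odor UCS
        events += [f"{cue} CS+u"] * half  # same picture, no odor delivered
    events += ["CS-"] * cs_minus_trials   # picture never paired with odor
    random.Random(seed).shuffle(events)   # fixed seed keeps the order reproducible
    return events


schedule = build_schedule()
```

With these placeholder counts, the schedule holds 60 events, and each CS+ picture contributes equal numbers of paired and unpaired trials.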

During initial training, subjects learned the picture-odor associations while performing a visuospatial discrimination task. Reaction times (RTs) provided independent evidence for conditioning (6, 13) and confirmed that subjects responded significantly faster to the target and nontarget CS+, compared with CS–, during the first half-session (Fig. 2A). Significant neural responses in posterior amygdala, rostromedial OFC, ventral midbrain, primary olfactory (piriform) cortex, insula, and hypothalamus (Fig. 2B and table S1) highlighted regions that participate in the acquisition of picture-odor contingencies. This network is similar to activation patterns evoked in previous imaging studies of appetitive conditioning (6, 7).

Fig. 2.

(A) Mean RTs provided an objective index of learning. Subjects responded significantly faster to the CS+ stimuli than to the CS– in the first half of training. Error bars in all figures reflect SEM. *, P < 0.05. (B to D) Neural responses detected in a random-effects group analysis of olfactory learning (training session). Significant incremental activations were evoked by the CS+u stimuli (relative to CS–) in amygdala (amy), orbitofrontal cortex (ofc), piriform cortex (pir), insula (ins), and ventral midbrain (vmb). Statistical maps (threshold, P < 0.001) are overlaid on coronal (B), sagittal (C), and axial (D) sections from the mean T1-weighted structural scan. See table S1 for activation peaks.

After training, subjects received the same contingencies in two further sessions. Between sessions, subjects were removed from the scanner and fed until sated with a meal corresponding to the target odor identity, thereby reducing the motivational value of the target UCS without affecting nontarget UCS value. Hunger level and food pleasantness both significantly declined at the end of the meal (12) (Fig. 3, A and B). Subjects reported no significant differences in odor pleasantness between target (4.42 ± 0.85; mean ± SEM) and nontarget (4.21 ± 1.11) UCS in the pre-satiety state (P = 0.75). Critically, ratings from pre- to post-feeding significantly decreased for the target UCS (P < 0.01) in the absence of changes for the nontarget UCS (P = 0.46), reflecting the efficacy of selective satiation in lessening the value of the target UCS (Fig. 3C). Ratings of odor intensity did not differ between target and nontarget UCS or as an effect of satiety. There were no significant differences in sniff amplitude or latency between target and nontarget conditions in pre- or post-satiety sessions.
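The rating summaries above are reported as mean ± SEM, and the pre- to post-feeding comparisons are within-subject. A minimal sketch of both computations follows. It assumes a paired t statistic, which the main text does not specify (the statistical procedures are described in the supporting material), and the function names and example ratings are hypothetical.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of ratings."""
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return statistics.mean(values), sem


def paired_t(pre, post):
    """Paired t statistic for pre-minus-post differences (df = n - 1).

    Assumes the two lists hold ratings from the same subjects in the same order.
    """
    diffs = [a - b for a, b in zip(pre, post)]
    mean_diff, sem_diff = mean_sem(diffs)
    return mean_diff / sem_diff


# Hypothetical pleasantness ratings for four subjects, for illustration only.
pre_ratings = [5, 6, 7, 8]
post_ratings = [3, 3, 5, 5]
t_value = paired_t(pre_ratings, post_ratings)
```

A large positive `t_value` here would correspond to a consistent decline in pleasantness across subjects after feeding.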

Fig. 3.

(A to C) Mean behavioral ratings of hunger level (A), food pleasantness (B), and odor pleasantness (C) illustrating selective devaluation of the target UCS. (D to F) Satiety-sensitive neural activations that paralleled the behavioral effect. (D) In dorsomedial amygdala (at x = –15, y = –6, z = –18), neural responses elicited by the target CS+u declined from pre- to post-satiety, whereas nontarget CS+u activity was unchanged. The middle panel is magnified from the left. Activations are superimposed on coronal sections (P < 0.01 for display). On the right, amygdala signal change is plotted as contrasts of parameter estimates (betas) for both target and nontarget CS+u, after adjusting for CS– baselines. (E) In OFC (at 24, 33, –12), a similar activation pattern was observed. Neural responses are displayed in axial and coronal formats. (F) By comparison, significant satiety-related effects in ventral striatum (vst), insula (ins), and anterior cingulate (cg) also reflected response increases to the nontarget CS+u. Peak coordinates are listed in table S2.

Our aim was to identify brain regions showing differential responses to the target CS+u from pre- to post-satiety (relative to nontarget CS+u responses) (12). We found significant response decrements in left dorsomedial amygdala, which spanned regions adjacent to posterior cortical and basomedial amygdala nuclei (Fig. 3D). Significant differential activity was also detected in multiple areas of OFC (Fig. 3E). Contrast estimates of signal change from amygdala and OFC demonstrated satiety-related declines in target CS+u activity with preserved nontarget CS+u activity (Fig. 3, D and E), paralleling behavioral effects of satiation. Conversely, satiety-sensitive neural responses in ventral striatum, insula, and anterior cingulate exhibited a different pattern of activity, reflecting both decreases to the target CS+u and increases to the nontarget CS+u (Fig. 3F and table S2).
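The comparison described above amounts to a stimulus-by-session interaction on parameter estimates. A minimal sketch follows, assuming one beta per condition and session for a single subject and region. The function name, dictionary keys, and numbers are hypothetical and are not values or code from the study.

```python
def satiety_contrasts(betas):
    """Compute CS- adjusted declines and their interaction for CS+u responses.

    `betas` maps (condition, session) to a parameter estimate, with
    conditions 'tgt', 'ntgt', 'csm' (CS-) and sessions 'pre', 'post'.
    """
    def adjusted(cond, sess):
        # Express each CS+u response relative to the CS- baseline of that session.
        return betas[(cond, sess)] - betas[("csm", sess)]

    target_decline = adjusted("tgt", "pre") - adjusted("tgt", "post")
    nontarget_decline = adjusted("ntgt", "pre") - adjusted("ntgt", "post")
    return {
        "target_decline": target_decline,
        "nontarget_decline": nontarget_decline,
        # Positive values mean the target response fell more than the nontarget one.
        "interaction": target_decline - nontarget_decline,
    }


# Hypothetical betas (arbitrary units), chosen only to illustrate the arithmetic.
example = {
    ("tgt", "pre"): 1.0, ("tgt", "post"): 0.25,
    ("ntgt", "pre"): 1.0, ("ntgt", "post"): 1.0,
    ("csm", "pre"): 0.5, ("csm", "post"): 0.5,
}
result = satiety_contrasts(example)
```

In this formulation the CS– terms cancel in the interaction itself, so the baseline adjustment affects only the per-stimulus declines. A pattern like that reported for amygdala and OFC would show a positive target decline with a nontarget decline near zero, whereas the ventral striatum pattern would also show a negative nontarget decline (a response increase).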

Numerous studies have documented the effect of hunger states on food-related processing in OFC, amygdala, and insula (14–18), but none has investigated the influence of selective satiation on evoked activity patterns in the context of associative learning. Our results underscore the selective impact of hunger and satiety on the neural correlates of reward prediction: When a food changes from delectable to distasteful, the brain responses evoked by a predictive cue are attenuated in areas that maintain responses to predictors of other palatable stimuli. Thus, amygdala and OFC activity evoked by the target CS+u decreased from pre- to post-satiety, parallel to the reward value of the target UCS, whereas activity did not change to the nontarget CS+u, indicating that the concurrent reward value of predictive stimuli may be represented in amygdala and OFC. This proposal accords with animal studies of reinforcer devaluation (8–11) and olfactory reversal learning (19, 20), demonstrating the importance of amygdala-OFC networks to motivational and incentive processes.

In contrast, other brain regions, including ventral striatum, insula, and cingulate cortex, exhibited both response decreases to the target CS+u and response increases to the nontarget CS+u. As an effect of selective satiation, the relative reward value of the nontarget (nondevalued) CS+u might be enhanced. Human physiological studies of satiety indicate that the pleasantness of an unsated food can actually increase at the same time that sated items become less appetizing (21). Indeed, shifting the balance of reward representation from consumed to unconsumed foods may optimize the motivational basis for food selection.

Do the satiety-related effects we observed reflect an ability of a CS+ to access representations of UCS reward value? We determined whether activations modulated by reinforcer devaluation are expressed in the same regions that encode representations of odor UCS (12). Significant overlapping responses in dorsal amygdala and caudal OFC (table S2) suggest that the CS+ gains access to UCS representations of reward value encoded in these areas. Satiety-sensitive regions not detected in this analysis (Fig. 3F) are also critical to associative learning but appear to mediate different aspects of reward prediction (1, 22). These structures could help assign reward value to the CS+ itself or discriminate relative values among predictive stimuli.

Finally, we examined whether brain regions that encode predictive reward value also participate in the initial acquisition of stimulus-reward contingencies (12). We detected significant responses in medial temporal lobe, with peaks in amygdala posteriorly and piriform cortex anteriorly (Fig. 4). Responses were also observed in OFC, insula, hypothalamus, and cingulate cortex (table S3), indicating that many of the brain regions maintaining representations of predictive reward may be a subset of those engaged in associative learning. Animal (23) and human (6, 24) studies indicate that primary olfactory structures are not mere sensory relays, but participate in higher order computations related to learning and motivation. The same considerations can be applied to taste processing: The anterior portion of insula activated here falls within a zone defined as human gustatory cortex (25), suggesting that aspects of food-based reward are also updated in these structures.

Fig. 4.

Location of satiety-related responses that are a subset of those evoked during olfactory learning. Significant activations (P < 0.001) in amygdala (amy), piriform cortex (pir), and OFC are depicted in axial orientation (left), with serial cross sections where indicated along the coronal axis (right). See table S3 for peak activation coordinates.

In animal studies of devaluation, behavior is typically assessed with procedures uncontaminated by new learning (8–11). In our design, because picture-odor pairs were repeatedly presented after devaluation, it is possible that the satiety-specific effects could reflect new associative learning between the target CS+ and the devalued UCS. Consequently, we performed a supplementary analysis modeling condition by time interactions for the pre- and post-satiety phases. Satiety-related decrements (of mean target CS+u activity) were still detected in identical regions of amygdala and OFC, but there was no significant time-dependent (learning-related) response decline in these brain areas (26). Nevertheless, given the rapidity of initial learning (during the training phase), the possibility remains that new learning could have taken place even faster and contributed to the satiety-related responses described here.

Patients with damage to medial temporal and basal frontal lobes commonly engage in maladaptive behaviors, tending to choose immediate rewards without regard for future consequences (27). Defective encoding of (or access to) updated reward value in amygdala and OFC could explain the inability of such patients to modify responses when expected outcomes change (20). Our findings also have implications for the feeding derangements described in Kluver-Bucy syndrome (28) and frontotemporal dementia (29). These patients display symptoms ranging from increased appetite and changes in food preference to hyperorality and consumption of non-foods. Such phenomena may arise out of a disabled network involving OFC and amygdala, whereby routine (learned) food cues no longer recruit motivationally appropriate representations of food-based reward value.

Theoretical and computational models of reward learning postulate the concept of motivational “gates” that traffic information flow between internal representations of the CS+ and the UCS (2, 30). These gates are the targets of motivational signals (likely a combination of sensory, visceral, autonomic, and interoceptive factors) and determine the likelihood that stimulus-reward associations activate appetitive systems. The neural mechanisms that support these processes are not well characterized. Our data show that neural responses evoked by a CS+ in amygdala, OFC, ventral striatum, insula, cingulate, and hypothalamus are directly modulated by hunger states, indicating that this structural network underpins Pavlovian incentive behavior in a manner that meets the requirements of a motivational gate.

Supporting Online Material

www.sciencemag.org/cgi/content/full/301/5636/1104/DC1

Materials and Methods

Tables S1 to S3

References and Notes
