Report

# Separate Neural Systems Value Immediate and Delayed Monetary Rewards


Science  15 Oct 2004:
Vol. 306, Issue 5695, pp. 503-507
DOI: 10.1126/science.1100907

## Abstract

When humans are offered the choice between rewards available at different points in time, the relative values of the options are discounted according to their expected delays until delivery. Using functional magnetic resonance imaging, we examined the neural correlates of time discounting while subjects made a series of choices between monetary reward options that varied by delay to delivery. We demonstrate that two separate systems are involved in such decisions. Parts of the limbic system associated with the midbrain dopamine system, including paralimbic cortex, are preferentially activated by decisions involving immediately available rewards. In contrast, regions of the lateral prefrontal cortex and posterior parietal cortex are engaged uniformly by intertemporal choices irrespective of delay. Furthermore, the relative engagement of the two systems is directly associated with subjects' choices, with greater relative fronto-parietal activity when subjects choose longer term options.

In Aesop's classic fable, the ant and the grasshopper are used to illustrate two familiar, but disparate, approaches to human intertemporal decision making. The grasshopper luxuriates during a warm summer day, inattentive to the future. The ant, in contrast, stores food for the upcoming winter. Human decision makers seem to be torn between an impulse to act like the indulgent grasshopper and an awareness that the patient ant often gets ahead in the long run. An active line of research in both psychology and economics has explored this tension. This research is unified by the idea that consumers behave impatiently today but prefer/plan to act patiently in the future (1, 2). For example, someone offered the choice between $10 today and $11 tomorrow might be tempted to choose the immediate option. However, if asked today to choose between $10 in a year and $11 in a year and a day, the same person is likely to prefer the slightly delayed but larger amount.

Economists and psychologists have theorized about the underlying cause of these dynamically inconsistent choices. It is well accepted that rationality entails treating each moment of delay equally, thereby discounting according to an exponential function (1–3). Impulsive preference reversals are believed to be indicative of disproportionate valuation of rewards available in the immediate future (4–6). Some authors have argued that such dynamic inconsistency in preference is driven by a single decision-making system that generates the temporal inconsistency (7–9), while other authors have argued that the inconsistency is driven by an interaction between two different decision-making systems (5, 10, 11). We hypothesize that the discrepancy between short-run and long-run preferences reflects the differential activation of distinguishable neural systems. Specifically, we hypothesize that short-run impatience is driven by the limbic system, which responds preferentially to immediate rewards and is less sensitive to the value of future rewards, whereas long-run patience is mediated by the lateral prefrontal cortex and associated structures, which are able to evaluate trade-offs between abstract rewards, including rewards in the more distant future.

A variety of hints in the literature suggest that this might be the case. First, there is the large discrepancy between time discounting in humans and in other species (12, 13). Humans routinely trade off immediate costs/benefits against costs/benefits that are delayed by as much as decades. In contrast, even the most advanced primates, which differ from humans dramatically in the size of their prefrontal cortexes, have not been observed to engage in unpreprogrammed delay of gratification involving more than a few minutes (12, 13). Although some animal behavior appears to weigh trade-offs over longer horizons (e.g., seasonal food storage), such behavior appears invariably to be stereotyped and instinctive, and hence unlike the generalizable nature of human planning. Second, studies of brain damage caused by surgery, accidents, or strokes consistently point to the conclusion that prefrontal damage often leads to behavior that is more heavily influenced by the availability of immediate rewards, as well as failures in the ability to plan (14, 15). Third, a “quasi-hyperbolic” time-discounting function (16) that splices together two different discounting functions (one that distinguishes sharply between present and future, and another that discounts exponentially and more shallowly) has been found to provide a good fit to experimental data and to shed light on a wide range of behaviors, such as retirement saving, credit-card borrowing, and procrastination (17, 18). However, despite these and many other hints that time discounting may result from distinct processes, little research to date has attempted to directly identify the source of the tension between short-run and long-run preferences.

The quasi-hyperbolic time-discounting function, sometimes referred to as beta-delta preference, was first proposed by Phelps and Pollack (19) to model the planning of wealth transfers across generations and applied to the individual's time scale by Elster (20) and Laibson (16). It posits that the present discounted value of a reward of value u received at delay t is equal to u for t = 0 and to βδ^t u for t > 0, where 0 < β ≤ 1 and δ ≤ 1. The β parameter (actually its inverse) represents the special value placed on immediate rewards relative to rewards received at any other point in time. When β < 1, all future rewards are uniformly downweighted relative to immediate rewards. The δ parameter is simply the discount rate in the standard exponential formula, which treats a given delay equivalently regardless of when it occurs.
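The rule above can be sketched in a few lines of Python. This is only an illustration: the function names and the parameter values (β = 0.8, δ = 0.999 per day) are assumptions chosen to make the preference reversal visible, not estimates from this study.

```python
def discounted_value(u, t, beta, delta):
    """Present value of a reward u received after delay t (in days)
    under quasi-hyperbolic (beta-delta) discounting."""
    if t == 0:
        return u  # immediate rewards are not discounted
    return beta * delta**t * u  # every delayed reward is scaled by beta


def prefers_later(early, later, beta=0.8, delta=0.999):
    """True if the later (amount, delay) option has the higher present value.
    The default beta and delta are hypothetical, for illustration only."""
    (u_early, t_early), (u_later, t_later) = early, later
    return (discounted_value(u_later, t_later, beta, delta)
            > discounted_value(u_early, t_early, beta, delta))


# $10 today vs. $11 tomorrow: the immediate option wins when beta < 1.
print(prefers_later((10, 0), (11, 1)))
# $10 in a year vs. $11 in a year and a day: beta cancels, the later option wins.
print(prefers_later((10, 365), (11, 366)))
# With beta = 1 (pure exponential discounting) the choice no longer reverses.
print(prefers_later((10, 0), (11, 1), beta=1.0))
```

When both options are delayed, β multiplies both sides of the comparison and drops out, so only δ matters; this is why the same one-day wait is tolerated a year from now but not today.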

Our key hypothesis is that the pattern of behavior that these two parameters summarize—β, which reflects the special weight placed on outcomes that are immediate, and δ, which reflects a more consistent weighting of time periods—stems from the joint influence of distinct neural processes, with β mediated by limbic structures and δ by the lateral prefrontal cortex and associated structures supporting higher cognitive functions.

To test this hypothesis, we measured the brain activity of participants as they made a series of intertemporal choices between early monetary rewards ($R available at delay d) and later monetary rewards ($R′ available at delay d′; d′ > d). The early option always had a lower (undiscounted) value than the later option (i.e., $R < $R′). The two options were separated by a minimum time delay of 2 weeks. In some choice pairs, the early option was available “immediately” (i.e., at the end of the scanning session; d = 0). In other choice pairs, even the early option was available only after a delay (d > 0). Our hypotheses led us to make three critical predictions: (i) choice pairs that include a reward today (i.e., d = 0) will preferentially engage limbic structures relative to choice pairs that do not include a reward today (i.e., d > 0); (ii) lateral prefrontal areas will exhibit similar activity for all choices, as compared with rest, irrespective of reward delay; (iii) trials in which the later reward is selected will be associated with relatively higher levels of lateral prefrontal activation, reflecting the ability of this system to value greater rewards even when they are delayed.

Participants made a series of binary choices between smaller/earlier and larger/later money amounts while their brains were scanned using functional magnetic resonance imaging. The specific amounts (ranging from $5 to $40) and times of availability (ranging from the day of the experiment to 6 weeks later) were varied across choices. At the end of the experiment, one of the participant's choices was randomly selected to count; that is, they received one of the rewards they had selected at the designated time of delivery.

To test our hypotheses, we estimated a general linear model (GLM) using standard regression techniques (21). We included two primary regressors in the model, one that modeled decision epochs with an immediacy option in the choice set (the “immediacy” variable) and another that modeled all decision epochs (the “all decisions” variable).
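To make the role of the two regressors concrete, the following is a minimal sketch of such a model fitted to one synthetic voxel with ordinary least squares. It is not the authors' analysis pipeline: the trial timing, the synthetic signal, and all variable names are assumptions, and the boxcar regressors omit steps that a real fMRI analysis would normally include (for example, convolution with a hemodynamic response function).

```python
import numpy as np

rng = np.random.default_rng(0)
n_scans = 200

# Hypothetical design: one 3-scan decision epoch every 10 scans.
all_decisions = np.zeros(n_scans)  # 1 during every decision epoch
immediacy = np.zeros(n_scans)      # 1 only when the choice set includes d = 0
for i, onset in enumerate(range(5, n_scans - 5, 10)):
    all_decisions[onset:onset + 3] = 1.0
    if i % 2 == 0:  # assumption: every other trial offers a reward today
        immediacy[onset:onset + 3] = 1.0

# Design matrix: immediacy, all decisions, and a constant baseline term.
X = np.column_stack([immediacy, all_decisions, np.ones(n_scans)])

# Synthetic voxel that responds to all decisions and more so to immediacy.
true_weights = np.array([1.5, 0.5, 100.0])
y = X @ true_weights + rng.normal(scale=0.2, size=n_scans)

# Ordinary least-squares estimate of the regression weights.
weights, *_ = np.linalg.lstsq(X, y, rcond=None)
print(dict(zip(["immediacy", "all_decisions", "baseline"], weights.round(2))))
```

Because the "all decisions" regressor is active on every trial, the "immediacy" weight captures only the additional response on trials that include an immediate option, which matches the distinction between the two variables drawn in the text.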

We defined β areas as voxels that loaded on the “immediacy” variable. These are preferentially activated by experimental choices that included an option for a reward today (d = 0) as compared with choices involving only delayed outcomes (d > 0). As shown in Fig. 1, brain areas disproportionately activated by choices involving an immediate outcome (β areas) include the ventral striatum, medial orbitofrontal cortex, and medial prefrontal cortex. As predicted, these are classic limbic structures and closely associated paralimbic cortical projections. These areas are all also heavily innervated by the midbrain dopamine system and have been shown to be responsive to reward expectation and delivery by the use of direct neuronal recordings in nonhuman species (22–24) and brain-imaging techniques in humans (25–27) (Fig. 1). The time courses of activity for these areas are shown in Fig. 1B (28, 29).

We considered voxels that loaded on the “all decisions” variable in our GLM to be candidate δ areas. These were activated by all decision epochs and were not preferentially activated by experimental choices that included an option for a reward today. This criterion identified several areas (Fig. 2), some of which are consistent with our predictions about the δ system (such as lateral prefrontal cortex). However, others (including primary visual and motor cortices) more likely reflect nonspecific aspects of task performance engaged during the decision-making epoch, such as visual processing and motor response. Therefore, we carried out an additional analysis designed to identify areas among these candidate δ regions that were more specifically associated with the decision process.

Specifically, we examined the relationship of activity to decision difficulty, under the assumption that areas involved in decision making would be engaged to a greater degree (and therefore exhibit greater activity) by more difficult decisions (30). As expected, the areas of activity observed in visual, premotor, and supplementary motor cortex were not influenced by difficulty, consistent with their role in non–decision-related processes. In contrast, all of the other regions in prefrontal and parietal cortex identified in our initial screen for δ areas showed a significant effect of difficulty, with greater activity associated with more difficult decisions (Fig. 3) (31). These findings are consistent with a large number of neurophysiological and neuroimaging studies that have implicated these areas in higher level cognitive functions (32, 33). Furthermore, the areas identified in inferior parietal cortex are similar to those that have been implicated in numerical processing, both in humans and in nonhuman species (34). Therefore, our findings are consistent with the hypothesis that lateral prefrontal (and associated parietal) areas are activated by all types of intertemporal choices, not just by those involving immediate rewards.

If this hypothesis is correct, then it makes an additional strong prediction: For choices between immediate and delayed outcomes (d = 0), decisions should be determined by the relative activation of the β and δ systems (35). More specifically, we assume that when the β system is engaged, it almost always favors the earlier option. Therefore, choices for the later option should reflect a greater influence of the δ system. This implies that choices for the later option should be associated with greater activity in the δ system than in the β system. To test this prediction, we examined activity in β and δ areas for all choices involving the opportunity for a reward today (d = 0) to ensure some engagement of the β system. Figure 4 shows that our prediction is confirmed: δ areas were significantly more active than were β areas when participants chose the later option, whereas activity was comparable (with a trend toward greater β-system activity) when participants chose the earlier option.

In economics, intertemporal choice has long been recognized as a domain in which “the passions” can have large sway in affecting our choices (36). Our findings lend support to this intuition. Our analysis shows that the β areas, which are activated disproportionately when choices involve an opportunity for near-term reward, are associated with limbic and paralimbic cortical structures, known to be rich in dopaminergic innervation. These structures have consistently been implicated in impulsive behavior (37), and drug addiction is commonly thought to involve disturbances of dopaminergic neurotransmission in these systems (38).

Our results help to explain why many factors other than temporal proximity, such as the sight or smell or touch of a desired object, are associated with impulsive behavior. If impatient behavior is driven by limbic activation, it follows that any factor that produces such activation may have effects similar to that of immediacy (10). Thus, for example, heroin addicts temporally discount not only heroin but also money more steeply when they are in a drug-craving state (immediately before receiving treatment with an opioid agonist) than when they are not in a drug-craving state (immediately after treatment) (39). Immediacy, it seems, may be only one of many factors that, by producing limbic activation, engenders impatience. An important question for future research will be to consider how the steep discounting exhibited by limbic structures in our study of intertemporal preferences relates to the involvement of these structures (and the striatum in particular) in other time-processing tasks, such as interval timing (40) and temporal discounting in reinforcement learning paradigms (41).

Our analysis shows that the δ areas, which are activated uniformly during all decision epochs, are associated with lateral prefrontal and parietal areas commonly implicated in higher level deliberative processes and cognitive control, including numerical computation (34). Such processes are likely to be engaged by the quantitative analysis of economic options and the valuation of future opportunities for reward. The degree of engagement of the δ areas predicts deferral of gratification, consistent with a key role in future planning (32, 33, 42).

More generally, our present results converge with those of a series of recent imaging studies that have examined the role of limbic structures in valuation and decision making (26, 43, 44) and interactions between prefrontal cortex and limbic mechanisms in a variety of behavioral contexts, ranging from economic and moral decision making to more visceral responses, such as pain and disgust (45–48). Collectively, these studies suggest that human behavior is often governed by a competition between lower level, automatic processes that may reflect evolutionary adaptations to particular environments, and the more recently evolved, uniquely human capacity for abstract, domain-general reasoning and future planning. Within the domain of intertemporal choice, the idiosyncrasies of human preferences seem to reflect a competition between the impetuous limbic grasshopper and the provident prefrontal ant within each of us.

Supporting Online Material

Materials and Methods

Figs. S1 and S2

Tables S1 and S2

References
