Neural Mechanisms of Foraging

Science  06 Apr 2012:
Vol. 336, Issue 6077, pp. 95-98
DOI: 10.1126/science.1216930


Behavioral economic studies involving limited numbers of choices have provided key insights into neural decision-making mechanisms. By contrast, animals’ foraging choices arise in the context of sequences of encounters with prey or food. On each encounter, the animal chooses whether to engage or, if the environment is sufficiently rich, to search elsewhere. The cost of foraging is also critical. We demonstrate that humans can alternate between two modes of choice, comparative decision-making and foraging, depending on distinct neural mechanisms in ventromedial prefrontal cortex (vmPFC) and anterior cingulate cortex (ACC) using distinct reference frames; in ACC, choice variables are represented in invariant reference to foraging or searching for alternatives. Whereas vmPFC encodes values of specific well-defined options, ACC encodes the average value of the foraging environment and cost of foraging.

Recent insights into the neural mechanisms of decision-making have come from investigations in behavioral economics. Participants typically decide between limited numbers of options differing in probability, risk, and amount of reward (1). Despite their success in explaining the choices animals make (2, 3), the optimal foraging models of ecology have had little impact on cognitive neuroscience (4) or economics (5). The key foraging choice is usually not a binary one between currently available options; instead, it is whether or not to engage with options as they are encountered (2, 3, 5). It depends not just on (i) the value of the option encountered (encounter value) but also on estimates of (ii) the environment’s average value (search value), and (iii) the cost of leaving to forage for alternatives (search cost) (24). We used functional magnetic resonance imaging to examine the neural mechanisms mediating foraging.
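The three quantities named above can be combined into a simple engage-or-search rule. The following is a minimal sketch only: it assumes the search value is the plain mean of the alternatives and that the search cost is an expected loss in the same units. The report does not state this rule, and the function and parameter names are hypothetical.

```python
def should_search(encounter_value, alternative_values, search_cost):
    """Return True if searching looks better than engaging under a simple rule."""
    # Assumption: search value is the unweighted mean of the alternatives.
    search_value = sum(alternative_values) / len(alternative_values)
    # Search only if the expected gain, net of the cost, beats the current option.
    return search_value - search_cost > encounter_value


alternatives = [50, 60, 70, 40, 55, 65]  # hypothetical point values
poor_encounter = should_search(30, alternatives, search_cost=10)
good_encounter = should_search(50, alternatives, search_cost=10)
```

With these illustrative numbers the rule favors searching when the encounter is worth 30 points and engaging when it is worth 50.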

Human participants made foraging-style choices (forages) to either engage with current options of known value or search among a set of potential alternatives also of known value. All the stimuli were drawn with replacement from a set of 12 that had values learned in a previous session (supplementary material 1.2). Pre- and postscanning checks and analyses of choices during scanning confirmed value retention (fig. S7). Two visual stimuli indicated reward magnitudes potentially available if the subject engaged (their weighted combination constituted the encounter value) (supplementary equations 2 to 4 and fig. S2). Rewards were points that translated into money when the experiment was completed. Six additional boxed stimuli indicated the values of the potential alternatives (search value). Choosing to search entailed a risk of paying a search cost (high, mid, or low) in loss of points indicated by box color. If the subjects engaged, they went on to make a comparative decision between the two components that constituted the encounter option, after being informed about their associated reward probabilities (Fig. 1A). The introduction of probability information ensured that decisions could only be made at this point and that forages and decisions were separated in time. When participants chose to search, new options drawn at random from the boxed alternatives were encountered. Participants searched as often as they wished but risked the same costs each time.

Fig. 1

(A) Trials started with two central stimuli (encounter value) and six alternative stimuli (search value) in a box at the top (drawn from a set of 12 learned in a previous session); box color indicated current potential search cost. The horizontal bar indicated previously collected points. The first choice was a forage: to engage with the encounter value or search for an alternative. Searching led back to the initial screen with a new encounter value drawn from the previous set of alternatives. Engaging led to the second type of choice, the decision, between the two component stimuli that constituted the encounter value. The pseudorandomly determined reward probabilities were now revealed. After the decision, feedback indicated reward delivery. Factors (β weights from logistic regressions) influencing likelihoods of search during forages (B) and picking the right stimulus during decisions (C).

Logistic regression identified factors weighing on forages and decisions. Engaging was promoted by search costs and encounter values but retarded by all components of search values (Fig. 1B and fig. S7). Participants were biased against search and required objectively more value gain for searching than engaging (the constant from the regression reflects subjects’ biases against searching; we call this parameter forage readiness). Decisions were influenced by reward probability and magnitude differences between options (Fig. 1C).
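The kind of logistic regression described here can be sketched on toy data. Everything in this example is an assumption made for illustration: the predictors are centred to levels of -1, 0, and 1, the choices come from a hypothetical deterministic rule with a built-in bias against searching, and the fit uses plain batch gradient ascent rather than whatever procedure the authors used.

```python
import math
from itertools import product


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, learning_rate=0.1, steps=500):
    """Fit an intercept and weights by batch gradient ascent on the log-likelihood."""
    n_features = len(rows[0])
    intercept = 0.0
    weights = [0.0] * n_features
    for _ in range(steps):
        grad_intercept = 0.0
        grad_weights = [0.0] * n_features
        for x, y in zip(rows, labels):
            p = sigmoid(intercept + sum(w * xi for w, xi in zip(weights, x)))
            error = y - p  # gradient of the log-likelihood w.r.t. the linear term
            grad_intercept += error
            for j, xi in enumerate(x):
                grad_weights[j] += error * xi
        intercept += learning_rate * grad_intercept / len(rows)
        weights = [w + learning_rate * g / len(rows)
                   for w, g in zip(weights, grad_weights)]
    return intercept, weights


# Toy design: every combination of (search_value, encounter_value, search_cost).
levels = (-1, 0, 1)
rows = list(product(levels, repeat=3))
# Hypothetical choice rule: 1 = search, 0 = engage; the -0.5 biases against search.
labels = [1 if sv - ev - cost - 0.5 > 0 else 0 for sv, ev, cost in rows]

intercept, (b_search, b_encounter, b_cost) = fit_logistic(rows, labels)
```

On this toy data the fitted weight on search value is positive, the weights on encounter value and search cost are negative, and the intercept is negative. This mirrors the direction of the reported effects, with the intercept playing the role of the constant that the text calls forage readiness.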

Comparison of average activity during foraging and decisions identified ACC among other regions (Fig. 2A). Usually, in decisions, the most common signal observed in ACC is inversely related to the value difference between chosen and unchosen options. Such inverse value difference effects have been interpreted as indicating that ACC or dorsomedial frontal cortex is a “comparator” comparing choice values. According to this theory, the region is more active when unchosen values are larger, because a smaller difference between chosen and unchosen values means comparison takes longer before a choice is made (6, 7) (fig. S3). Related accounts emphasize an ACC role in monitoring for conflict between responses (8).

Fig. 2

ACC activity was higher in forages than decisions (A), better related to the inverse value difference (VD) during decisions than foraging (B), reflected the main effect of search value during foraging (C), and related better to search VD than decision VD (D). ACC time courses during engage (E) and search (F). (G) Individual peak ACC BOLD β weights 5 to 10 s after forage stimulus onset correlated with behavioral effects of the search value on search behavior (bottom), whereas ACC β weights of best search value component predicted repeated searching (top). VmPFC exhibited no such correlations. (H) Time course for engage forages and the subsequent decision phase: The search value (red) signal continued into the decision phase. Reward magnitudes associated with chosen (green) and unchosen (orange) components of encounter value (left) were represented from their onset in the forage phase and into the decision phase. The reward probabilities of the chosen and unchosen options were only revealed after engaging, and their BOLD effects therefore appear later (right).

However, our task also allowed us to test whether the ACC signal reflects the relative benefit of the alternative course of action or the value of exploring the environment. This hypothesis predicts that ACC, during forages, will stop reflecting the value of the unchosen option and will always represent the value of searching. We therefore refined the analysis (supplementary text 1.5) and tested for a region that demonstrated both of these effects: coding for the unchosen–chosen value difference during decisions but not forages (Fig. 2B), and, on forages, instead coding for the search value (Fig. 2C). Both tests identified overlapping ACC regions. When these two effects were combined into a compound test [forage(search value–encounter value)–decision(chosen value–unchosen value)], the same ACC region was implicated (Fig. 2D).

We analyzed foraging signal time courses in a region centered on the overlap between foraging search value and decision value difference effects (Fig. 2, C and D). The blood-oxygen-level–dependent (BOLD) contrast for ACC was positively correlated with the value of searching the environment and negatively correlated with the value of engaging with the current encounter option, regardless of the choice participants ultimately made (Fig. 2, E and F). The frame of reference in which values are encoded in ACC is thus fixed in relation to response strategy, that is, searching or engaging. This contrasts with vmPFC and other regions where value is encoded in a flexible reference frame tied to the choice taken or attended (9, 10). Comparing search value signals in ACC, we found a more rapid increase (greater slope) on search than engage choices [t(17) = –2.54, P = 0.021] consistent with earlier, stronger signals in search decisions (fig. S8) and faster accumulation of search evidence in ACC on search choices (4). In search choices, there was also an effect of search cost (Fig. 2F).

We next examined whether individual differences in ACC activity reflected differences in foraging. Behavioral variation in the influence of search value in promoting searches was correlated with neural variation in ACC search value effects (Fig. 2G, bottom), and behavioral differences in the influence of the lowest and highest alternative values were correlated with ACC activity (fig. S5). Although average search value determined search choices (Fig. 1B), it did not predict the rate at which participants searched repeatedly in pursuit of the best alternative on each trial. Such perseverative search rates were, however, predicted by ACC responses to best alternatives (Fig. 2G, top). Finally, we looked at the decision phase; ACC activity still reflected the search value from the prior forage, as if still encoding how good it would be to search for alternatives (Fig. 2H). Brain activity conveyed knowledge of environmental richness even during simultaneous binary decision-making when the signal was no longer relevant. Knowledge of environmental richness, which is normally pertinent to foraging but irrelevant to binary decision-making, impinges on, and impairs, simultaneous binary decision-making in behavioral experiments (5).

Despite their limitations (11) and alternative explanations of reward- and error-related activity in ACC (8, 12), conflict and comparator-based theories remain the most influential accounts of decision-related activity in ACC. However, the presence of an average reward signal (search value), a negative effect of search cost, anchoring of value representations with respect to search versus engage strategies, differential rates of search signal accumulation on search and engage trials, and correlation, across subjects, between ACC signal variance and search choice variance (Fig. 2 and fig. S5) cannot be accommodated within comparator- and conflict-based ACC theories. Instead, we suggest that ACC codes the value of switching to a course of action alternative to that which is taken or is the default. ACC supplies such a signal even when subjects are not asked to forage but to make decisions. As soon as the subject switches to the alternative, the signal dissipates, but it is maintained if the course of behavior is maintained (compare red lines in Fig. 2F versus Fig. 2, E and H).

VmPFC encodes the value of chosen or attended options in comparison with unchosen or unattended options (9, 10, 13). During foraging, however, vmPFC activity only reflected the chosen option value when participants engaged, and there was no representation of search value (Fig. 3A). When subjects searched, the chosen search value was actually negatively correlated with vmPFC activity, and there was no representation of encounter value. The absence of any representation of search value—the average value of the environment—and of search cost (Fig. 3A) restricts any role vmPFC might play in foraging.

Fig. 3

(A) VmPFC time courses during forages (conventions as in Fig. 2E). (B) Activity better related to decision VD than to forage VD. (C) VmPFC time course for engage forages and the subsequent decision phase (conventions as in Fig. 2H). (D) Individual peak vmPFC BOLD β weights 5 to 10 s after decision onset correlated with estimates of decision accuracy (softmax temperature).

In contrast, seconds after foraging, vmPFC played an important role in decisions. Comparison of average activity during decisions and forages and between decision and forage value differences [decision(chosen value–unchosen value) versus forage(chosen value–unchosen value)] identified vmPFC (Fig. 3B). It coded, negatively and positively, for values of unchosen and chosen options, respectively. It effectively encoded the value difference between options. During the transition from foraging to decisions, vmPFC rapidly changed from positively encoding both components of encounter value, weighting both in the same way as participants did behaviorally (fig. S4), to representing the value difference between chosen and unchosen components in decisions (Fig. 3, A and C). The reference frame in which values are encoded in vmPFC is thus flexible and concerned with the value dimensions and contrasts most pertinent to decision-making. Such a reference frame makes vmPFC suitable for goal-based (14) and multiattribute (15) decision-making. Its importance during decisions was underlined by individual variation in vmPFC reward magnitude effects, which were correlated with decision accuracy (Fig. 3D).
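The Fig. 3 legend indexes decision accuracy by a softmax temperature. As a hedged illustration of what such a parameter does, the sketch below assumes that each option's value is its reward probability times its reward magnitude and that choices follow a two-option softmax on the value difference. The exact model used by the authors is in the supplementary material and is not reproduced here; all names are hypothetical.

```python
import math


def p_choose_first(prob_a, mag_a, prob_b, mag_b, temperature=1.0):
    """Softmax (logistic) probability of choosing option A over option B."""
    # Assumption: subjective value is reward probability times reward magnitude.
    value_a = prob_a * mag_a
    value_b = prob_b * mag_b
    # Lower temperature makes choices track the value difference more sharply.
    return 1.0 / (1.0 + math.exp(-(value_a - value_b) / temperature))


indifferent = p_choose_first(0.5, 10, 0.5, 10)
precise = p_choose_first(0.8, 10, 0.2, 10, temperature=1.0)
noisy = p_choose_first(0.8, 10, 0.2, 10, temperature=5.0)
```

Under these assumptions a participant with a lower temperature picks the higher-valued option more reliably, which is the sense in which temperature can serve as an estimate of decision accuracy.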

Reward prediction error signals associated with the ventral striatum and its interactions with orbitofrontal cortex (16) allow decision-making to change with experience. They occur even when there is little opportunity for learning (17), as in our task. We therefore examined whether forage prediction errors were also encoded by the striatum (fig. S3) and its interactions with the ACC. Despite its weak activation with search value, the striatum exhibited post-search prediction error-like signals (positive effect of new encounter value, negative effect of previous search value) (Fig. 4A). It also responded to search costs (Fig. 4B). The prediction error response had higher positive peaks in people who searched less (as if they had expected less) (Fig. 4C, top). Across subjects, search costs activated striatum in proportion to the degree that they deterred searching (Fig. 4C, bottom).
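The sign pattern described for the post-search signal (positive for the new encounter value, negative for the previous search value) matches a simple outcome-minus-expectation difference. The sketch below is only that, under the assumption that both quantities are on the same point scale; it is not the regression model used for the imaging analysis, and the function name is hypothetical.

```python
def search_prediction_error(new_encounter_value, previous_search_value):
    """Outcome of a search minus what the environment had led one to expect."""
    return new_encounter_value - previous_search_value


# Hypothetical values: the same outcome after a higher or a lower expectation.
after_high_expectation = search_prediction_error(70, 50)
after_low_expectation = search_prediction_error(70, 40)
```

The same new encounter yields a larger positive error when less was expected, in line with the interpretation offered for participants who searched less.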

Fig. 4

Ventral striatal time courses after feedback following search forages (A). Effect of search costs when search is chosen (B). Individual peak BOLD β weights for new encounter value (C, top) and peak BOLD β weights for new search costs on searching (C, bottom) 5 to 10 s after event onset both correlated with the proportion of forages on which participants searched. Increased coupling with left ventral striatum as a function of search cost during searches (D) and individual differences in foraging readiness (E) both revealed an ACC region anterior to, but overlapping with, that in Fig. 2D.

An ACC region overlapping with, but anterior to, the search value effect (Fig. 2C) was more coupled with left ventral striatum when search costs increased and search was chosen (Fig. 4D). The coupling appeared related to disinhibition of effortful choices because the same ACC region was also more active in subjects more willing to overcome costs; individual differences in foraging readiness were associated with increased anterior ACC activation (Fig. 4E).

VmPFC and ACC have been thought to operate in sequence during choice (6, 16), but our results suggest that ACC represents choice in a manner at odds with intuitions of how comparative decisions are made. Because ACC value representations are anchored to response strategy (engage or search), our results confirm that it is well placed to guide response selection. However, the different signals in ACC and vmPFC attest to independent roles in forages and decisions. The implication of ACC in foraging and encoding of the average value of the foraging environment may facilitate understanding of the reward signal it carries (12, 18, 19), its prominence during exertion of effort (20, 21), in go–no go decisions (22), in exploration (23, 24), and in representing alternative and counterfactual choice values (25, 26). Some action value learning tasks previously used to investigate ACC (12) may have been treated as foraging tasks, and animals may have been choosing whether to stay with the current choice or switch to an alternative. Such a perspective also makes it possible to reinterpret ACC activation recorded during exploration tasks (24) as reflecting estimates of richness of alternatives in the environment. ACC activity is frequently recorded (27) and might reflect the value of alternative choices in other tasks and the inclination to refrain from engaging in the currently offered choice (28). Foraging entails energetic costs, and we found that ACC activity also reflected the cost of foraging. ACC neurons have been shown to encode value signals that integrate both cost and reward (29). By contrast, vmPFC, a primate specialization (30), may underpin fine-grained, accurate, and flexible decision-making (6, 14).

Supplementary Materials

Materials and Methods

Supplementary Text

Figs. S1 to S9

Table S1

References (31–37)

References and Notes

  1. Acknowledgments: Funded by U.K. Medical Research Council and the Wellcome Trust.