Report

Rats and Humans Can Optimally Accumulate Evidence for Decision-Making

Science  05 Apr 2013:
Vol. 340, Issue 6128, pp. 95-98
DOI: 10.1126/science.1233912

How to Make Decisions

Recently, a number of methods to probe internal properties of nonlinear neural systems have been developed. In these methods, highly variable stimuli are used to explore the input space of the system. Neural responses are then studied using models that take advantage of the known trial-by-trial stimulus information. Brunton et al. (p. 95) adapted this combined approach to decision-making. Both in rats and humans, the diffusion constant of the drift-diffusion model of decision-making was zero, implying that the noise is all in the processing of sensory input and not in the evidence accumulator. In addition, rats gradually accumulated evidence for decision-making, with strong effects of sensory adaptation on gradual accumulation of evidence.

Abstract

The gradual and noisy accumulation of evidence is a fundamental component of decision-making, with noise playing a key role as the source of variability and errors. However, the origins of this noise have never been determined. We developed decision-making tasks in which sensory evidence is delivered in randomly timed pulses, and analyzed the resulting data with models that use the richly detailed information of each trial’s pulse timing to distinguish between different decision-making mechanisms. This analysis allowed measurement of the magnitude of noise in the accumulator’s memory, separately from noise associated with incoming sensory evidence. In our tasks, the accumulator’s memory was noiseless, for both rats and humans. In contrast, the addition of new sensory evidence was the primary source of variability. We suggest our task and modeling approach as a powerful method for revealing internal properties of decision-making processes.

Decisions in real life often need to be made based on noisy or unreliable evidence. Accumulating evidence from a set of noisy observations made over time makes it possible to average over different noise samples, thus improving estimates of the underlying signal. This principle is the basis for the influential class of “drift-diffusion” models (1–5), which have been broadly applied to explain a variety of phenomena in biology (6–8). Accumulation involves both maintaining a memory of evidence accrued so far and adding new evidence to the memory. Yet no test to date has distinguished between noise associated with each of these two components.

We developed tasks in which subjects (humans and rats) were concurrently presented with two trains of pulses, one train representing “left”-labeled pulses and the other, “right”-labeled pulses. At the end of each trial, the subjects had to report which of the two trains had the greater total number of pulses. The timing of pulses was random and varied widely, both within and across individual trials (9, 10). We reasoned that the precisely known pulse timing would enable detailed modeling of the subjects’ choices on each individual trial, whereas its variability would allow exploration of the stimulus space and would thus provide statistical power.

In an auditory version of the task, performed by three humans and 19 rats, left pulse trains were clicks presented on a speaker to the left of the subject, and right pulse trains were clicks presented on a speaker to the right of the subject (Fig. 1A; free-field speakers for rats, headphones for humans). In a visual version of the task, performed by four humans, left pulses were flashed white bars, tilted anticlockwise from the vertical, and right pulses were flashed white bars tilted clockwise (3) (Fig. 1B). On each trial, the stimulus was presented for a duration controlled by the experimenter. The sum of the two pulse rates was kept fixed within each task, and discrimination difficulty was controlled on each trial by the ratio of the two rates (Fig. 1, C to E).
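The stimulus design described above can be sketched in a few lines. This is only an illustration: the text says the pulse timing was random but does not specify the process here, so Poisson timing is an assumption, as are the function name `make_click_trains`, the example rate of 40 clicks/s, and the choice to give the right side the higher rate.

```python
import random

def make_click_trains(total_rate, ratio, duration, rng):
    """Return (left_times, right_times) in seconds for one trial.

    total_rate: sum of the two click rates (clicks/s), held fixed.
    ratio: higher rate divided by lower rate; sets the difficulty.
    Assumption: Poisson clicks, with the right side given the higher rate.
    """
    low = total_rate / (1.0 + ratio)
    high = total_rate - low

    def poisson_times(rate):
        # Exponential inter-click intervals produce a Poisson process.
        times = []
        t = rng.expovariate(rate)
        while t < duration:
            times.append(t)
            t += rng.expovariate(rate)
        return times

    return poisson_times(low), poisson_times(high)

rng = random.Random(0)
left_times, right_times = make_click_trains(40.0, 3.0, 1.0, rng)
```

A ratio close to 1 gives nearly equal rates and a hard trial; a large ratio gives an easy one, while the summed rate stays at `total_rate`.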

Fig. 1 Psychophysical tasks and summary of behavior.

(A) Sequence of events in each trial of the rat auditory task. After light onset from a light-emitting diode (LED) in a center port, trained rats placed their nose into the port and “fixated” their nose there for a fixed amount of time until the light was turned off (1 to 2 s). Trains of randomly timed clicks were played concurrently from left and right free-field speakers during the last portion of the fixation time. After nose fixation and sounds ended, the rat made a choice, poking in the left or the right port to indicate which side played more clicks. Humans performed an analogous version of the task on a computer while wearing headphones. (B) Schematic diagram of a stimulus in the visual pulses version of the task, performed by humans on a computer. (C) Psychometric curves (fits to a four-parameter logistic function for each subject; see methods) for rat subjects. (D) Psychometric curves, as in (C), for human subjects. (E) Chronometric curves for an example rat. Difficulty is labeled by the ratio of click rates played on the two sides. For each difficulty, performance improves with longer stimulus durations. Dashed lines show the best-fit model predictions for this rat, as described in the text. The vertical axis shows mean accuracy and 95% confidence interval (CI).

To examine a large variety of possible mechanisms consistent with the performance improvement at longer stimulus durations seen in Fig. 1E [(11, 12); see also fig. S4], we expanded on the drift-diffusion framework and implemented a flexible model (Box 1) in which different regimes of model parameter values represent widely different mechanisms (three examples in Fig. 2A), with mixtures of mechanisms represented by intermediate parameter values.

Box 1

At each time point, the accumulator memory a (black trace) represents an estimate of the right versus left evidence accrued so far. At stimulus end, the model decides right if a > bias and left otherwise, where bias is a free parameter. The light gray traces indicate alternate runs with different instantiations of model noise.

Right ↑ (left ↓) pulses change the value of a by positive (negative) impulses of magnitude C.

σ2i parameterizes noise in the initial value of a.

σ2a is a diffusion constant, parameterizing noise in a.

σ2s parameterizes noise when adding the evidence from a right or left pulse: For each click, variance σ2s is scaled by the amplitude of C and then added to the evidence contributed by the click.

λ parameterizes consistent drift in the memory a. In the “leaky” or forgetful case (λ < 0, illustrated), drift is toward a = 0, and later pulses affect the decision more than earlier pulses. In the “unstable” or impulsive case (λ > 0), drift is away from a = 0, and earlier pulses affect the decision more than later pulses. The memory’s time constant τ = 1/λ.

B is the height of the sticky decision bounds and parameterizes the amount of evidence necessary to commit to a decision.

ϕ and τϕ parameterize sensory adaptation by defining the dynamics of C. Immediately after a click, the magnitude C is multiplied by ϕ. C then recovers toward an unadapted value of 1 with time constant τϕ. Facilitation is thus represented by ϕ > 1, whereas depression is represented by ϕ < 1 (inset).

These properties are implemented by the following equations

if |a| ≥ B then da/dt = 0; else

\[ da = \sigma_a \, dW + \left( \delta_{t,t_R} \, \eta_R \, C - \delta_{t,t_L} \, \eta_L \, C \right) dt + \lambda a \, dt \tag{1} \]

where δt,tR and δt,tL are delta functions at the times of the right and left pulses; ηR and ηL are i.i.d. Gaussian variables drawn from N(1, σs); and dW is a white-noise Wiener process. The initial condition a(t = 0) is drawn from the Gaussian N(0, σi).

Adaptation dynamics are given by

\[ \frac{dC}{dt} = \frac{1 - C}{\tau_\phi} + (\phi - 1) \, C \, \left( \delta_{t,t_R} + \delta_{t,t_L} \right) \tag{2} \]

In addition, a lapse rate parameterizes the fraction of trials on which a random response is made. Ideal performance (a = #right clicks − #left clicks) would be achieved by

\[ \lambda = 0, \quad B = \infty, \quad \sigma_a^2 = \sigma_s^2 = \sigma_i^2 = 0, \quad \phi = 1, \quad \text{bias} = 0 \tag{3} \]
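To make the Box 1 dynamics concrete, the following is a minimal forward simulation of Eqs. 1 and 2 for a single trial, not the authors' fitting code. Several choices are assumptions made for illustration: the function name `simulate_accumulator`, the fixed time step `dt` (Euler-Maruyama integration), the default parameter values, and the reading of the per-click noise as an additive Gaussian term with variance σ2s scaled by the current magnitude C. The lapse rate is left out.

```python
import math
import random

def simulate_accumulator(left_times, right_times, duration, *, lam=0.0,
                         bound=math.inf, sigma2_a=0.0, sigma2_s=0.0,
                         sigma2_i=0.0, phi=1.0, tau_phi=0.1, bias=0.0,
                         dt=0.001, rng=None):
    """Return (a_final, choice) for one trial; choice is 'right' or 'left'.

    Assumes phi >= 0 and dt much smaller than tau_phi.
    """
    rng = rng or random.Random()
    # Map each click to a time-step index; +1 marks right, -1 marks left.
    clicks = {}
    for t in right_times:
        clicks.setdefault(int(t / dt), []).append(+1)
    for t in left_times:
        clicks.setdefault(int(t / dt), []).append(-1)

    a = rng.gauss(0.0, math.sqrt(sigma2_i))  # noisy initial value of a
    c = 1.0                                   # unadapted click magnitude C
    for step in range(int(round(duration / dt)) + 1):
        if abs(a) >= bound:                   # sticky bound: a stops changing
            break
        # Memory drift (lambda * a) plus diffusion noise of variance sigma2_a * dt.
        a += lam * a * dt + rng.gauss(0.0, math.sqrt(sigma2_a * dt))
        for sign in clicks.get(step, []):
            # Assumed form of per-click noise: variance sigma2_s scaled by C.
            a += sign * c + rng.gauss(0.0, math.sqrt(sigma2_s * c))
            c *= phi                          # adaptation right after a click
        c += (1.0 - c) / tau_phi * dt         # recovery toward C = 1
    return a, ("right" if a > bias else "left")

# With the ideal parameters of Eq. 3 (the defaults), a equals
# (#right clicks - #left clicks): three right, two left clicks give a = 1.
a_final, choice = simulate_accumulator([0.1, 0.5], [0.2, 0.3, 0.7], 1.0)
```

Running many trials with nonzero `sigma2_a` or `sigma2_s` and different seeds reproduces the kind of alternate noisy traces described for the box illustration.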

Fig. 2 The model can fit a variety of mechanisms, but the data consistently fit to a pulse-accumulating mechanism with zero noise in the accumulator’s memory.

(A) Three examples of the mechanisms that the model can represent. Top: The ideal, a pulse accumulator that weights all pulses equally. Middle: A burst detector. If three or more pulses from the same side arrive within 50 ms, the sticky bounds are reached, meaning that a commitment to orient to that side is made. Bottom: A precedence detector. If pulses from one side tend to arrive shortly before pulses from the other side, the adaptation minimizes the second side’s pulses, and the decision tends toward the preceding side. (B to D) Parameter-likelihood landscapes that result from the data of one example rat. Panels are two-dimensional slices, cut through the full nine-dimensional parameter space, around the best-fit values. The blue curves represent CIs [2 SD of the multidimensional normal distribution fit to the likelihood landscape (28)]. The best-fit parameter values found, which are λ ≈ 0, B >> 1, σ2a ≈ 0, σ2s >> 0, and ϕ << 1, correspond to pulse accumulation [(A), top] with a perfect memory but imperfect processing of sensory inputs. (E to J) Summaries of best-fit parameters over all subjects and tasks. Black ticks are best-fit values; gray bars span the CIs. Each panel has been divided by task (yellow highlight for human auditory task, green highlight for human visual task) and then sorted independently in order of parameter value. (E) All subjects, in all tasks, had long accumulator memories. (H) Most subjects were best fit with large bounds (B → ∞). (F) Thirteen of 19 rats and all humans in both auditory and visual tasks were best fit with σ2a = 0. (I) A wide range of values, all large compared to σ2a, were found for σ2s. (G) and (J) All rats showed strong, rapidly recovering depression (ϕ < 1, mean τϕ = 0.040 s). Humans showed weak depression in the auditory task and weak facilitation in the visual task.

Given a trial’s specific pulse times and a set of parameter values, the model produces the probability of observing a left versus a right response on that trial. Methods to compute the gradient of this probability with respect to model parameters (see the supplementary materials) were critical for efficiently finding the parameter values that gave the maximum likelihood of observing the complete set of a subject’s responses. Numerical tests always found only one maximum (fig. S6), suggesting that we always found the global maximum. Consistent with this observation, a mathematically related model has been proven to have a concave log likelihood (6), suggesting that our model may also be provably concave and have a single maximum.
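The quantity being maximized can be sketched as follows, assuming the per-trial model probabilities of a right response have already been computed. The function name `log_likelihood` is hypothetical, and treating a lapse as a 50/50 guess is an assumption; the text only states that the lapse rate is the fraction of trials with a random response.

```python
import math

def log_likelihood(p_right_model, went_right, lapse):
    """Log-likelihood of the observed choices across trials.

    p_right_model: model probability of a right choice on each trial.
    went_right: observed choices (True for right, False for left).
    lapse: fraction of trials with a random response (assumed 50/50).
    """
    total = 0.0
    for p_model, right in zip(p_right_model, went_right):
        # Mix the model prediction with random guessing on lapse trials.
        p_right = lapse * 0.5 + (1.0 - lapse) * p_model
        total += math.log(p_right if right else 1.0 - p_right)
    return total

print(log_likelihood([0.9, 0.2, 0.7], [True, False, True], lapse=0.05))
```

A fitting procedure would search parameter space for the values that maximize this sum; the gradient-based method the authors describe is in their supplementary materials and is not reproduced here.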

Figure 2, B to D, shows the likelihood landscape around best-fit (i.e., maximum-likelihood) parameters, given the data of a representative rat subject. Confidence intervals are given by the parameter width of the maximum (blue contours). Figure 2B shows λ (= 1/τ, where τ is the memory time constant), which represents accumulator memory leak (if λ < 0) or instability (if λ > 0), and B, the height of the decision-commitment evidence bounds. λ was statistically indistinguishable from zero. That is, the decision dynamics were neither leaky nor unstable, suggesting that sensory evidence from throughout the entire stimulus period was given equal weight. The best-fit B was large enough that it produced model fits indistinguishable from those produced by B = ∞. Across subjects (Fig. 2, E and H), species, sensory modalities, and task parameters, the accumulator’s memory time constant |τ| = 1/|λ| was long (|λ| = 0.91 ± 0.15 s⁻¹ mean ± SE across rats; |λ| = 0.23 ± 0.071 s⁻¹ across humans), in the sense that |τ| was comparable to or greater than the longest stimulus duration used [1 s for rats, 4 s for humans (13)]. The best-fit values of B and λ were thus in the gradual accumulation regime (Fig. 2A, top).

In our tasks, noise in the sensory evidence (σ2s) adds total variance proportional to the sum of the amplitudes of the clicks, whereas the memory diffusion noise (σ2a) adds total variance proportional to the stimulus duration. This separability allows us to isolate the magnitude of the diffusion noise σ2a that gives the drift-diffusion model its name (2, 5). To our surprise, in 13 out of 19 rats and in all three humans performing the auditory task and all four humans performing the visual task, the value that best fit the data was the ideal σ2a = 0 (Fig. 2C for an example subject, Fig. 2, F and I, for all subjects). Consistent with the easily distinguishable right versus left pulses used with humans, the best-fitting values of sensory evidence noise σ2s for humans were substantially lower than those for the rats (Fig. 2I). Again in this much lower σ2s regime, the best-fitting memory diffusion noise σ2a was zero. The dominant source of variability was thus noise in the evidence associated with each incoming pulse (σ2s = 1.90 ± 0.28 pulses² per incoming pulse for rats, 0.50 ± 0.059 pulses² in the human auditory task, and 0.24 ± 0.10 pulses² in the human visual task). This variability could be introduced by sensory uncertainty in the left-versus-right classification of each individual pulse, or by noise in the process of adding new sensory evidence to the accumulator memory.
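The separability argument can be written out explicitly. As a sketch, assume λ = 0 and B → ∞ (the regime the fits point to), independent noise sources, and per-click noise of variance σ2s scaled by the adapted magnitude C_k of the k-th click, as described in Box 1. The symbols T (stimulus duration), N (total number of clicks), and C_k are introduced here for illustration. The variance of the accumulator at stimulus end is then

```latex
\mathrm{Var}\left[ a(T) \right]
  = \sigma_i^2
  + \sigma_a^2 \, T
  + \sigma_s^2 \sum_{k=1}^{N} C_k
```

Only the middle term grows with stimulus duration when the click counts are held fixed, and only the last term grows with the number and amplitude of clicks, which is why varying both across trials lets the two noise sources be estimated separately.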

The pulsatile nature of our task made it straightforward to parameterize sensory adaptation [(14); Eq. 2]. We found strong, quickly-recovering depression for rats (Fig. 2, D, G, and J; adaptation magnitude ϕ = 0.17 ± 0.021, recovery time constant τϕ = 0.080 ± 0.024 s, mean ± SE across rats). This is consistent with the depression observed in auditory cortex neural responses to click train stimuli (15, 16). Stimuli in the human tasks were constructed with a minimum interpulse interval (30 ms in the auditory task, 150 ms in the visual task), and this greatly reduced the adaptation effects as compared to those in rats (Fig. 2, G and J). Across all adaptation regimes, we found long accumulator time constants and zero memory diffusion noise.

If our subjects’ behavior depends on a process that cannot be approximated by the model [such as collapsing bounds (17), variability in attention (18), or other possibilities not yet formalized in a model], our interpretation of the best-fit values may be problematic. We therefore tested the model-derived conclusions. To assess the memory time constant τ, we calculated the “psychophysical reverse correlation” (17, 19–21), which estimates the extent to which click rates at each point in time influence left and right decisions. This analysis indicated that all periods of the trial have similar influence on the decision (approximately constant separation between the two traces in Fig. 3A), which is consistent with the long τ found in the model-based analysis. To assess our estimates of the single-pulse noise σ2s and the starting variability σ2i, we fit the model to data from trials with multiple clicks on each of the two sides, and used those best-fit parameters to predict performance on trials in which only one single-side pulse happened to be presented (for which performance is dominated by σ2s and σ2i). The prediction was accurate, even on an individual subject-by-subject basis (Fig. 3B). To assess the memory diffusion noise σ2a, we controlled for sensory evidence by dividing trials into groups, so that all trials within a group had the same number of right clicks and the same number of left clicks; assuming large |τ|, performance within each group is then dominated by σ2a and the click depression parameters ϕ and τϕ. Large σ2a would predict decreasing within-group performance at longer stimulus durations. The data showed the opposite trend and were precisely predicted by the best-fit model, where σ2a = 0 and clicks are depressing (Fig. 3C and fig. S13). The tests of Fig. 3 thus provided model-independent confirmation of the model-fit parameter values.
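The reverse-correlation idea can be illustrated with a simplified sketch. It is not the authors' analysis code: the function name `reverse_correlation`, the fixed-width time bins, the trial tuple layout, and the assumption that all trials share one duration are choices made here for illustration. For each bin it computes the right-minus-left click rate in excess of the rate difference expected from the trial's generative rates, then averages separately over trials that ended in right and left choices.

```python
def reverse_correlation(trials, duration, bin_width):
    """Average excess click-rate difference per time bin, split by choice.

    trials: list of (left_times, right_times, expected_diff, went_right),
        where expected_diff is the generative right-minus-left rate
        (clicks/s) and went_right is the observed choice.
    Returns {'right': [...], 'left': [...]}; a side with no trials is None.
    """
    n_bins = int(round(duration / bin_width))
    sums = {"right": [0.0] * n_bins, "left": [0.0] * n_bins}
    counts = {"right": 0, "left": 0}
    for left_times, right_times, expected_diff, went_right in trials:
        # Start from minus the expected difference, then add observed clicks.
        excess = [-expected_diff] * n_bins
        for times, sign in ((right_times, +1), (left_times, -1)):
            for t in times:
                b = min(int(t / bin_width), n_bins - 1)
                excess[b] += sign / bin_width
        key = "right" if went_right else "left"
        for b in range(n_bins):
            sums[key][b] += excess[b]
        counts[key] += 1
    return {key: ([s / counts[key] for s in sums[key]] if counts[key] else None)
            for key in ("right", "left")}

# Two toy trials: one right click early (right choice), one left click late.
toy = [([], [0.05], 0.0, True), ([0.15], [], 0.0, False)]
curves = reverse_correlation(toy, duration=0.2, bin_width=0.1)
```

A roughly constant gap between the `'right'` and `'left'` curves across bins would correspond to clicks at all times in the trial influencing the decision about equally.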

Fig. 3 Model-independent analyses support model-fitting results.

(A) Long τ: psychophysical reverse correlation for an example rat (longest quarter of trials only). For each time point in each trial, we computed the excess pulse rate difference (right pulses/s – left pulses/s, relative to the value expected given the random processes used to generate the trial) and then obtained an average for trials resulting in a right (red) and an average for trials resulting in a left (green) decision. The separation between the two indicates how strongly clicks from each time point influenced the final decision. Thick solid lines were obtained from the rat’s responses; the thickness of the line represents the standard error. Narrow shaded lines were predicted by the best-fit model. (B) Accurate estimates of σ2s and σ2i: actual performance (fraction correct) on short-duration trials in which one side had one pulse and the other side had two pulses (i.e., trials for which performance was dominated by σ2s and σ2i), versus performance predicted by fitting the model to trials with multiple pulses on each of the two sides. Too few trials of the short type to perform this analysis were presented for one of the human visual data sets and all human auditory data sets. Solid circles are individual rats, and open diamonds are individual humans. The red cross shows the mean and standard error across subjects. The accurate estimates of σ2s and σ2i suggested by the good predictions also suggest that σ2a was estimated accurately, because the sum σ2s + σ2i + σ2a is tightly constrained by the data (fig. S12). (C) Assessing σ2a, ϕ, and τϕ: Trials were divided into groups, with sensory evidence and sensory noise controlled by keeping the total number of right clicks and the total number of left clicks fixed within each group. Performance within each group was then dominated by σ2a, ϕ, and τϕ. (C) shows the performance of an example rat, averaged across trial groups and relative to the overall mean of each group, as a function of stimulus duration. 
The red line is the prediction from the best-fit model, with σ2a = 0.

Using highly variable yet precisely known stimuli, together with a trial-by-trial model that uses the full information about each trial’s richly detailed stimulus (22), is a powerful approach for precisely quantifying multiple properties of decision-making processes. The approach provided strong evidence that rats can indeed gradually accumulate evidence for decision-making (23–26), thus establishing that this important cognitive phenomenon can be studied in a widely available animal model that is amenable to a rapidly growing arsenal of molecular tools. With its capacity to provide moment-by-moment estimates of the temporal evolution of the accumulator, the approach will combine particularly well with neurobiological measurements. The model used for analysis can be readily expanded to consider and quantify further decision-making parameters, and the approach is easily generalized to different species, sensory modalities, and types of decision-making, including value-based decision-making (8, 27).

Supplementary Materials

www.sciencemag.org/cgi/content/full/340/6128/95/DC1

Methods

Supplementary Text

Modeling Methods

References and Notes

  1. Acknowledgments: We thank Y. Niv for suggesting trial-by-trial analysis; D. Buonomano for suggesting we examine sensory adaptation; and A. Akrami, J. Erlich, C. Kopec, J. Kubanek, T. Hanks, B. Scott, M. Shadlen, D. Tank, and M. Yartsev for comments on the manuscript.