
Midbrain dopamine neurons control judgment of time


Science  09 Dec 2016:
Vol. 354, Issue 6317, pp. 1273-1277
DOI: 10.1126/science.aah5234

Time is a subjective experience

Time, like space, is one of the fundamental dimensions of all our experiences. However, organisms do not work like clocks, and our judgment about the passage of time is variable, depending on circumstances. Soares et al. systematically investigated midbrain dopaminergic neurons during timing behavior in mice (see the Perspective by Simen and Matell). When measuring and manipulating mouse activity, the authors observed that dopaminergic neurons controlled temporal judgments on a time scale of seconds.

Science, this issue p. 1273; see also p. 1231

Abstract

Our sense of time is far from constant. For instance, time flies when we are having fun, and it slows to a trickle when we are bored. Midbrain dopamine neurons have been implicated in variable time estimation. However, a direct link between signals carried by dopamine neurons and temporal judgments is lacking. We measured and manipulated the activity of dopamine neurons as mice judged the duration of time intervals. We found that pharmacogenetic suppression of dopamine neurons decreased behavioral sensitivity to time and that dopamine neurons encoded information about trial-to-trial variability in time estimates. Last, we found that transient activation or inhibition of dopamine neurons was sufficient to slow down or speed up time estimation, respectively. Dopamine neuron activity thus reflects and can directly control the judgment of time.

Our ability to accurately estimate and reproduce time intervals is variable and depends on many factors, including motivation (1), attention (2), sensory change (3), novelty (4), and emotions (5). In addition, several neurological and neuropsychiatric disorders (6–9) are accompanied by changes in timing behavior. Midbrain dopamine (DA) neurons are implicated in many of the psychological factors (10) and disorders (6, 8, 11) associated with changes in time estimation.

Midbrain DA neurons also encode reward prediction errors (RPEs) (12–15), an important teaching signal in reinforcement learning (16). Phasic DA responses to reward-predicting cues reflect the magnitude of (17, 18), probability of (19), and expected time delay until the reward (20, 21). When expectation varies over time, DA neuron responses are smaller at times when rewards and reward-predicting cues are more expected (21, 22), indicating that DA neurons receive temporal information. Manipulations of the DAergic system by pharmacological (23) or genetic (24) approaches disrupt timing behavior, suggesting that DA neurons may directly modulate timing. However, the data from pharmacological and genetic manipulations are inconsistent: In some cases, DA seems to speed up timekeeping (23, 25), and in others, DA seems to slow down or not affect timekeeping (26, 27).

To determine (i) what signals are encoded by midbrain DA neurons during timing behavior and (ii) how DA neurons contribute to variability in temporal judgments, we measured and manipulated the activity of DA neurons in mice as they performed categorical decisions about duration (28). We first trained mice to perform a temporal discrimination task (Fig. 1A, left). Mice initiated trials at a central nose port, immediately triggering the delivery of two identical tones separated by a variable delay. Mice reported the delay between tones as shorter or longer than 1.5 s at one of two lateral nose ports for water reward. Incorrect choices were not rewarded. Performance was nearly perfect for the easiest intervals but more variable for intervals near 1.5 s (the boundary between the “short” and “long” categories) and was well described by a sigmoid psychometric function (Fig. 1A, middle).
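The sigmoid psychometric function mentioned above can be sketched as a logistic curve centered on the category boundary. This is a minimal illustration, not the authors' fitting procedure: the function name `p_long`, the slope value, and the listed interval values are assumptions chosen for the example (the text itself specifies only the 1.5-s boundary and the use of a logistic fit).

```python
import math

def p_long(interval_s, threshold_s=1.5, slope_s=0.25):
    """Probability of a 'long' choice under a logistic psychometric function.

    threshold_s is the interval judged 'long' half of the time;
    slope_s (hypothetical value) sets how sharply choices change around it.
    """
    return 1.0 / (1.0 + math.exp(-(interval_s - threshold_s) / slope_s))

# Hypothetical stimulus set, three intervals on each side of the 1.5-s boundary.
intervals = [0.6, 1.05, 1.26, 1.74, 1.95, 2.4]

for t in intervals:
    print(f"{t:.2f} s -> P(long) = {p_long(t):.2f}")
```

In this form, a change in `threshold_s` moves the curve horizontally along the time axis, whereas a constant choice bias would instead move it vertically.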

Fig. 1 Dopaminergic (DAergic) signaling is required and precisely aligned to temporal cues, not movement, during performance of a temporal categorization task.

(A) Shown on the left is the task schematic and order of events (circles in the upper panel, nose-ports; gray shading in the lower panel, interval period). A logistic function fit to the daily (gray) and average (black) performance of an example mouse (10 sessions) is shown in the middle. Pharmacogenetic suppression (hM4D) was targeted to midbrain DAergic neurons, and mice were injected with either CNO or saline on adjacent days; shown on the right is mean psychometric performance on days with saline or CNO treatment (black or red, respectively; n = 3 mice). Error bars, SEM. The inset shows the percent of correct trials on days before and after CNO treatment in mice expressing hM4D (filled circles, n = 3; *P < 0.005) or non–hM4D-expressing controls (open circles, n = 4). Error bars, SEM. (B) Schematic of the photometry apparatus and viral and surgical procedure. (C) Image of the substantia nigra pars compacta (SNc) histology. (D) On the left, all trials of DA neuronal activity recorded from a single subject are shown, split by interval duration and aligned on trial initiation (first tone delivery; white vertical line). Each row represents a trial, and within each interval, trials are sorted from fast (top) to slow (bottom) response time (RT, time from the second tone to choice; 3759 trials). Shown on the right are mean DAergic neuron responses, split by interval duration (n = 5 mice; intervals are color-coded as throughout). Shading, SEM across mice. z, z-score, ΔF/F, see the methods. (E) Example photometric traces recorded during a single correct and incorrect trial of the 1.74-s interval. (F) Photometric recordings of DA neuronal activity from a single subject, split by outcome (correct choices, top; incorrect choices, bottom) and aligned on choice (white). Within each outcome, trials were sorted by RTs [slow (top) to fast (bottom)]. Red dots mark the time of second-tone presentation (2426 trials). 
(G) Mean DAergic responses of incorrect trials aligned on the three main task events (first tone, second tone, and choice; n = 5 mice). Shading, SEM across mice.

We then pharmacogenetically suppressed DAergic neuronal activity and observed impaired temporal judgments on treatment days as compared with adjacent nontreatment days (P < 0.004, n = 3 mice; Fig. 1A, right). We also observed a tendency to perform fewer trials [control group, 177 ± 15 trials; clozapine N-oxide (CNO)–treated group, 115 ± 54 trials; mean ± SD; P = 0.05], suggesting that the animals’ motivation was affected by DAergic suppression. To test whether fluctuations in endogenous DA neuron activity predicted systematic changes in temporal judgments, we used fiber photometry (29) to measure Ca2+ activity in DAergic neurons, targeting the substantia nigra pars compacta (SNc) (Fig. 1, B and C, and figs. S1 and S2).

We observed DAergic responses locked to the three main task events on single trials: the first tone, the second tone, and reward delivery (or omission thereof) (Fig. 1E). Activity increased after reward delivery and decreased when the reward was omitted in the case of incorrect choices (Fig. 1F) (30). DAergic signaling has also been implicated in movement; however, DA neuron activity in this task did not reflect movement per se (Fig. 1, F and G, and fig. S3).

In this task, the second tone marks the end of the interval to be discriminated and is a sensory cue that predicts reward. The amplitude of an RPE at the time of the second tone should be modulated by two factors: the subject's expectation of reward at tone delivery and their temporal expectation of the second tone itself. First, expectation of reward varies as a function of stimulus difficulty, where the more difficult the interval to be discriminated, the lower the probability of reward (Fig. 2A). Second, because delay intervals were randomly selected from the stimulus set on each trial, occurrence of the second tone becomes less surprising with time (Fig. 2A). Indeed, animals were sensitive to changing temporal expectation, as indicated by a systematic decrease in response time (RT, the delay between second-tone delivery and choice execution) with increasing interval duration (RT for the shortest interval greater than RT for the longest interval; P < 0.005 in each of five mice). To test whether second-tone responses reflected an RPE that integrated information about temporal expectation and expected reward, we asked how well the pattern of average responses to all six second tones could be explained by a linear combination of temporal expectation (i.e., surprise, the inverse of the subjective hazard function; fig. S4) and performance (the probability of reward for each stimulus). On average, 90% of variance in mean responses could be explained by a relatively equal contribution of these two factors (range, 58 to 99%; n = 5 mice; Fig. 2A). Reward responses were also consistent with RPE coding: Within a given choice category, they tended to be larger for intervals that animals miscategorized more often (fig. S5).
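The two-factor account above can be illustrated with a short sketch. It rests on several simplifying assumptions that are not taken from the paper: it uses the objective hazard of six equiprobable intervals rather than the subjective hazard function of fig. S4, it reads "inverse" as the reciprocal (rescaled to peak at 1), and the performance values and weights are hypothetical placeholders rather than measured data.

```python
def objective_hazard(n_intervals):
    """Hazard of the second tone at each of n equiprobable intervals:
    P(tone occurs now | tone has not occurred yet)."""
    return [1.0 / (n_intervals - i) for i in range(n_intervals)]

def surprise_from_hazard(hazard):
    """Assumption: surprise is the reciprocal of the hazard, rescaled to peak at 1."""
    reciprocal = [1.0 / h for h in hazard]
    peak = max(reciprocal)
    return [r / peak for r in reciprocal]

def predicted_response(p_reward, surprise, w_p, w_s):
    """Linear combination of reward expectation (P) and temporal surprise (S)."""
    return [w_p * p + w_s * s for p, s in zip(p_reward, surprise)]

hazard = objective_hazard(6)
surprise = surprise_from_hazard(hazard)

# Hypothetical probability of reward per interval (lower near the boundary).
p_reward = [0.95, 0.80, 0.65, 0.65, 0.80, 0.95]

# Hypothetical equal weights; in practice these would be fit to the recordings.
prediction = predicted_response(p_reward, surprise, w_p=0.5, w_s=0.5)
```

With equiprobable intervals the hazard rises from 1/6 at the shortest interval to 1 at the longest, so the surprise term falls steadily with elapsed time, matching the statement that the second tone becomes less surprising as the trial goes on.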

Fig. 2 DAergic responses correlate with temporal judgments and are explained by a simple model of reward prediction error (RPE).

(A) Linear model (left) including RPE components: expectation of reward P (subject performance, top left) and temporal expectation S (surprise, the inverse of the subjective hazard function; bottom left). w, weight; a.u., arbitrary units. In the middle panel, measured second-tone DAergic response for six time intervals (black traces; n = 5 mice) are compared to predicted DA response (red dots). The graph on the right shows model predictions versus measured DAergic activity (gray symbols, individual mice; mean responses across mice, black filled circles). (B) Average measured DA response for all intervals during correct and incorrect trials. (C) Mean DA response to the second tone when an interval was judged as long versus short. Each shape represents a different mouse. Black symbols represent responses averaged across all interval stimuli.

On average, DA neuron responses to the second tone contained information about elapsed time through their encoding of temporal expectation. Do these responses relate to variations in judgments of time? When animals correctly judged intervals, the response to the second tone was, on average, larger for intervals in the short category (Fig. 2B). However, on incorrect trials, the pattern was reversed: The response to the second tone was larger for intervals in the long category. Thus, DA response magnitude reflected the animals’ assessment of the interval duration, not the actual interval duration. Over all intervals, the second-tone response for a given interval was significantly larger when that interval was judged as short (P < 0.001; Fig. 2, B and C). How do these results relate to the underlying decision and motor processes that guide choice during the task?

In principle, the trial-to-trial variations in DA neuron activity could be related to a time-dependent component of the decision, such as the speed of internal timekeeping or the location of the decision boundary in time. Alternatively, variations in DA activity might reflect a time-independent component of the behavior, such as a constant action bias. To quantitatively evaluate these two possibilities, we performed a logistic regression to assess the degree to which the magnitude of the DA neuron response to the second tone predicted animals’ choices on single trials. We found that activity predicted choice to a lesser extent in the case of easy stimuli than in the case of difficult stimuli (Fig. 3A). These data suggest that the DA neuron response was systematically related to the horizontal position of the psychometric curve along the time axis and not the vertical position along the choice axis (Fig. 3B). To test this, we split trials into high, medium, and low terciles of the distribution of responses to the second tone [Fig. 3, A (histograms) and C]. While the second-tone response amplitude was used to group trials, the systematic ordering of DA neuron responses emerged toward the beginning of the trial and persisted throughout an interval (Fig. 3, D and F). We next constructed psychometric curves for trials in each tercile and compared a range of models for the psychometric curve. The model that best explained the behavioral data collected from high-, medium-, and low-tercile trials consisted of three sigmoid curves that differed only in their horizontal location along the time axis (Fig. 3E). We observed a shift toward long choices when DAergic activity was low, and the opposite shift when activity was high. Specifically, as DA activity varied from the lower to the upper tercile, the psychometric threshold shifted by ~340 ms (i.e., ~20% of the 1.5-s category boundary; range, 90 to 620 ms; 6 to 42%; n = 5 mice). 
The relationship between DAergic response and psychometric shift was observed for recordings in either hemisphere (fig. S6), thus ruling out an explanation based on the laterality of short versus long choices. Instead, these results indicate that higher or lower midbrain DAergic activity is correlated with a change in a time-dependent component of the decision.
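The tercile analysis described above can be sketched as follows. This is a simplified illustration under stated assumptions: the helper names `tercile_labels` and `fraction_long` are hypothetical, the trial values are invented for the example, and the grouping is done over a single list of trials, whereas the study grouped responses within each session and interval.

```python
def tercile_labels(responses):
    """Label each trial 'low', 'medium', or 'high' by the rank of its response."""
    n = len(responses)
    order = sorted(range(n), key=lambda i: responses[i])
    labels = [None] * n
    for rank, i in enumerate(order):
        if rank < n / 3:
            labels[i] = "low"
        elif rank < 2 * n / 3:
            labels[i] = "medium"
        else:
            labels[i] = "high"
    return labels

def fraction_long(choices, labels, group):
    """Fraction of 'long' choices (coded 1) among trials in one tercile."""
    picked = [c for c, g in zip(choices, labels) if g == group]
    return sum(picked) / len(picked)

# Hypothetical second-tone response amplitudes and choices (1 = long, 0 = short)
# for six trials of one interval; not data from the study.
responses = [0.2, 1.5, 0.9, -0.3, 1.1, 0.4]
choices = [1, 0, 1, 1, 0, 0]

labels = tercile_labels(responses)
for group in ("low", "medium", "high"):
    print(group, fraction_long(choices, labels, group))
```

Repeating this per interval gives one psychometric curve per tercile, and comparing the thresholds of those curves yields the horizontal shift reported in the text.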

Fig. 3 Changes in a time-dependent component of choice behavior are predicted by DAergic activity.

(A) Trial-by-trial logistic regression (black) that predicts choice from the amplitude of the second-tone DA response (gray), for each of the six time intervals (left to right). The top and bottom histograms illustrate the number of trials, as a function of DA response, in which the subject made long and short choices, respectively (n = 8533 trials, 5 mice). For each session and interval, DA responses are grouped into terciles, high (blue), medium (gray), and low (red), throughout the figure. (B) Distinct patterns of temporal judgments are expected depending on the nature of the relationship between DA response and choice. (C) Three individual trials illustrating low, medium, and high second-tone DA responses (quantified as the mean response in the gray-shaded box) and grouped by tercile within the entire second-tone response distribution, depicted at right. (D) Average DA response in each tercile for the 1.74-s interval stimulus (n = 1868 trials, 5 mice). Shading, SEM. (E) Psychometric curves constructed using trials from each tercile of DA response. Curves are the maximum-likelihood fits of logistic functions with the lowest Bayesian information criterion scores (n = 8533 trials, 5 mice). Error bars, 95% confidence interval (CI). The inset shows the difference in the probability of making a long choice between medium and low or high (red or blue) DA response trials. Error bars, SEM. (F) The top row is as in (D) but for all six interval durations; data shown in (D) are outlined in gray. The bottom row shows the area under the curve (auc), distinguishing high- and low-tercile DA responses. This difference in DA response increased during the course of the trial (red linear regression; coefficient of determination r² ranging from 0.72 to 0.98; P < 0.0001).

How might this correlation between DA neuron activity and the location of the psychometric curve along the time axis relate to our initial finding that temporal expectation contributed to the average second-tone response? The theory of DAergic RPE coding predicts that slower (faster) timekeeping, by stretching (contracting) temporal surprise along the time axis, should increase (decrease) DAergic responses to the second tone (fig. S7). We observed a pattern of DAergic response to the second tone that was consistent with this (Fig. 2, B and C, and fig. S7). Furthermore, if DAergic activity reflects RPE continuously throughout a trial, differences in activity associated with slower or faster timekeeping (i.e., the separation between low- and high-activity terciles) should also grow continuously over time, and indeed, this is the case in our data (Fig. 3F and fig. S7). In contrast to the expected impact of variability in the speed of timekeeping on RPE coding, it is not apparent to us how changes in the location of the decision boundary along an animal’s internal notion of time should change RPEs arising at the presentation of the second tone. The most parsimonious explanation of the data is that DA neuron activity reflects variability in the speed of internal timekeeping.

These results demonstrate a correlation between temporal judgments and DA neuron activity. However, it is unclear whether DA neuron activity simply reflects, or whether it is sufficient to cause changes in, time judgments. We mimicked the observed variability in DAergic responses by optogenetically activating or inhibiting DA neurons (Fig. 4, A to E) on a minority of randomly chosen trials. Notably, we found that increasing or decreasing DA activity resulted in a horizontal shift in the psychometric curve in the directions predicted by the photometry data, albeit more modestly in the case of photoinhibition (excitation, 140 ± 20 ms, n = 4 mice; inhibition, –68 ± 23 ms, n = 4 mice; Fig. 4, F and G, and fig. S8). These effects were transient, occurring only on stimulated trials, and thus could not be explained as resulting from learning (Fig. 4, F and G), nor were they observed in control animals (fig. S9). In addition, as was the case when sorting trials on the basis of DA response to the second tone, we observed no systematic effect on RTs, arguing against DAergic neuron activity affecting the subjects’ movement toward or incentive salience of choice options during the task (fig. S10).

Fig. 4 Optogenetic manipulation of dopamine neurons is sufficient to change judgment of time.

(A) Schematics illustrating viral strategy and subsequent fiber implantation (left) and stimulation protocol (right). (B and C) Histology confirming membrane expression of ChR2-YFP or NpHR-YFP (both green) in neurons of the SNc expressing tyrosine hydroxylase (TH, red). (D and E) Single-trial (top panels) and peri-stimulus time histogram (bottom panels) of in vivo electrophysiological measurement of two DA neurons reliably activated and inactivated by light (n = 53 and 8 trials, respectively). (F) Choice behavior and psychometric curves during control trials (black), photoactivated trials (blue), and unstimulated trials immediately after photoactivation (gray) (n = 4 mice). Error bars, 95% CI. Insets show the mean difference in the probability of a long choice between photoactivated and control trials (top, one bar per animal; bottom, one data point per stimulus). Error bars, SEM. (G) Same as (F) but for animals whose DA neurons were inhibited (n = 4).

Here we demonstrate a direct link between signals carried by midbrain DA neurons and judgments of elapsed time. Higher or lower levels of DAergic activity not only correlated with but could directly control timekeeping. These data are in agreement with some results of pharmacological manipulations of the DAergic system during timing tasks (26), but appear at odds with some others that showed accelerated timekeeping with increased DAergic tone (23, 25). However, recent studies demonstrate that many of the pharmacological effects on timing behavior can be explained by the changes in motivation (27, 31) that accompany DAergic drug administration (32). Indeed, pharmacogenetic DAergic manipulation in our task affected motivated behavior. Variability in the effects of pharmacology on timing may result from its relatively slow time course, which allows for compensation and/or the superposition of multiple distinct behavioral effects. Our approach circumvents these issues with genetically targeted, transient manipulations of DA neuron activity. Additionally, we focused on DA neurons in the SNc because many project to a dorsocentral region of the striatum where removal of DA input can cause a selective deficit in timing (33); however, whether DA neurons in other regions, such as the ventral tegmental area, contribute to timing variability is unknown. Last, we monitored and manipulated the activity of midbrain DA neurons, and not the levels of released DA. The relationship between tonic and phasic firing of DA neurons and DA release is not entirely clear, and it is complicated by feedback mechanisms by which released DA can affect the firing of DA neurons (34).

Although unexpected, the data presented here may explain existing behavioral data. Situations in which DAergic activity is elevated naturally, such as states of high approach motivation (35), response uncertainty (36), or cognitive engagement (37), are associated with underestimation of time (1, 2, 38). Conversely, situations that decrease DAergic activity, such as when fearful or aversive stimuli are presented (39), are associated with overestimation of time (40). These observations, together with our data, suggest that flexibility in time estimation may confer an adaptive advantage on the individual. For example, underestimating duration in better-than-expected situations may lead to longer engagement in those situations, resulting in even greater reward than if time estimation were not flexible. In other words, there may be a normative explanation for why “time flies when we are having fun” underlying our observation that DA neurons, which are so central to reward processing, exert control over time estimation.

Supplementary Materials

www.sciencemag.org/content/354/6317/1273/suppl/DC1

Materials and Methods

Figs. S1 to S10

References (41–43)

References and Notes

  1. Acknowledgments: We thank A. Braga for assistance with behavioral training; M. Duarte for assistance with mouse colonies; G. Lopes for assistance with Bonsai; T. Monteiro, T. Gouvêa, other members of the Paton laboratory, B. Lau, E. Lottem, M. Murakami, C. Poo, A. Renart, and T. Akam for discussions and/or comments on the manuscript; Z. Mainen for support; platforms at the Champalimaud Centre for histology support and animal care; and V. Jayaraman, R. A. Kerr, D. S. Kim, L. L. Looger, and K. Svoboda from the GENIE (Genetically-Encoded Neuronal Indicator and Effector) Project at the Howard Hughes Medical Institute’s Janelia Farm Research Campus for providing the AAV-GCaMP6f through the University of Pennsylvania Vector Core. Viruses for expression of NpHR3.0 and EYFP are available from the University of North Carolina Vector Core under a material transfer agreement with K. Deisseroth. Viruses for expression of GCaMP6f and TdTomato are available from the University of Pennsylvania Vector Core under a material transfer agreement with the trustees of the University of Pennsylvania on behalf of J. Wilson. The work was funded by the Bial Foundation (188/12 to J.J.P.), the Simons Foundation (Simons Collaboration on the Global Brain award 325476 to J.J.P.), Fundação para Ciência e Tecnologia (SFRH/BD/51895/2012 to S.S.), the European Molecular Biology Organization (Advanced Long Term Fellowship 983-2012 to B.V.A.), Marie Curie Actions (FP7-PEOPLE-2012-IIF 326398 to B.V.A.), and the Champalimaud Foundation (internal funding to J.J.P.). Data presented in this paper can be found at www.dropbox.com/sh/ip6forddl84028j/AAAsa3ry41bu4acYk1Bl3KDra?dl=0.