Reports | Neuroscience

Dopamine neurons encode performance error in singing birds

Science  09 Dec 2016:
Vol. 354, Issue 6317, pp. 1278-1282
DOI: 10.1126/science.aah6837

Birds of a feather sing together

How do birds know that a song that they hear is from a member of their own species, and how do they learn their songs in the first place? Araki et al. identified two types of brain cells involved in how finches learn their songs (see the Perspective by Tchernichovski and Lipkind). When zebra finches were raised by Bengalese finch foster parents, they learned a song whose morphology resembled that of their foster father. However, the temporal structure remained zebra finch–specific, suggesting that it is innate. Gadagkar et al. recorded activity in specific dopamine neurons in singing zebra finches while controlling perceived song quality with distorted auditory feedback. This distorted feedback represented worse performance than predicted and resulted in negative prediction errors. These findings suggest again that finches have an innate internal goal for their learned songs.

Science, this issue p. 1282, p. 1234; see also p. 1278

Abstract

Many behaviors are learned through trial and error by matching performance to internal goals. Yet neural mechanisms of performance evaluation remain poorly understood. We recorded basal ganglia–projecting dopamine neurons in singing zebra finches as we controlled perceived song quality with distorted auditory feedback. Dopamine activity was phasically suppressed after distorted syllables, consistent with a worse-than-predicted outcome, and was phasically activated at the precise moment of the song when a predicted distortion did not occur, consistent with a better-than-predicted outcome. Error response magnitude depended on distortion probability. Thus, dopaminergic error signals can evaluate behaviors that are not learned for reward and are instead learned by matching performance outcomes to internal goals.

When practicing piano, how do you know if you struck the right or wrong note? The problem is that there is nothing intrinsically “good” or “bad” about the sound of A-sharp. It depends entirely on whether that’s the note you wanted to strike at that time step of the song. Performance evaluation requires sensory feedback to be compared with internal benchmarks that change from moment to moment in a sequence. Performance errors during musical performance (1, 2) and speech production (3) are associated with a frontal error-related negativity in the electroencephalogram that may relate to activity in ventral tegmental area (VTA) dopamine neurons (4). Yet, although dopamine neurons are known to encode reward prediction error in tasks where animals seek primary rewards such as food or juice (5–7), it is not known if dopamine activity also encodes error in tasks that are not learned for primary reward and are instead learned by matching sensory feedback to internal performance benchmarks (8, 9).

Songbirds use auditory feedback to learn to sing and have a dopaminergic projection from VTA to Area X, a nucleus required for song learning (10–13). It is hypothesized that a singing bird evaluates its own song to compute an auditory-error–based reinforcement signal that guides learning, i.e., a neural signal that “tells” vocal motor circuits if the recent vocalization was “good” and should be reinforced or “bad” and should be eliminated (14, 15) (Fig. 1A). The neural correlates of song evaluation remain unknown (16–18), leading to alternative models of learning that do not require online error signals (19).

Fig. 1 Experimental test of performance error signals in birdsong.

(A) Evaluation of auditory feedback during singing is hypothesized to result in “error” signals that reach the song system. (B) Strategy for antidromic identification of VTAx dopamine neurons. (C) Antidromic spikes (black) and spike collisions (red) of a VTAx neuron. (D) VTAx neurons labeled by injection of retrograde tracer into Area X (green, top) and colabeled dopamine neurons stained with antibody against tyrosine hydroxylase (TH) (purple, bottom). White arrows point to the visible path of the electrode that recorded the VTAx unit shown in Fig. 2A (scale bar, 100 μm; anterior-right, dorsal-top). (E) Example of displaced-syllable DAF. A snippet of syllable “c” was played back during production of the target syllable “b” (target time, black triangles and white dashed lines). Randomly interleaved target renditions were left undistorted (undistorted trials, blue dashed line). (F) Expanded view of the target syllable. (G) Pitch-contingent displaced-syllable DAF drives learning. Gray dots denote mean pitch of 49,716 target syllable renditions sung over 23 days for one bird. Shading demarcates distorted renditions; green, low-pitch variants distorted (up days); blue, high-pitch variants distorted (down days). (H) Histogram of pitch changes learned during each day (n = 4 birds).

To test if dopamine activity encodes performance error, we recorded songbird VTA neurons while controlling perceived song quality with distorted auditory feedback (DAF) (18, 20–24) (Fig. 1, B to F). Beginning days before recordings, a specific song syllable was either distorted with DAF or, on randomly interleaved trials, left undistorted altogether (distortion rate 44 ± 8%, n = 26 birds; Fig. 1, E and F). DAF was a 50-ms snippet of sound with the same amplitude and spectral content as normal zebra finch song (see supplementary text). The snippet was either a segment of one of the bird’s own syllables displaced in time (displaced-syllable DAF, n = 10 birds; Fig. 1E) or a synthesized sound designed to mimic broadband portions of the bird’s own song (broadband DAF, n = 16 birds) (20, 24). Operant broadband DAF drives dopamine and Area X–dependent reinforcement of undistorted syllable variants (13, 23). Displaced-syllable DAF, when operantly delivered contingent on the pitch of a harmonic target syllable, resulted in similar learning (Fig. 1, G and H) (20).

To test for online error responses, we compared the activity between randomly interleaved renditions of distorted and undistorted songs. We computed the z-scored difference between target onset–aligned distorted and undistorted rate histograms (Fig. 2, A to D; target onset defined as the median DAF onset time relative to distorted syllable onset, n = 125 neurons in 26 birds) (24). We defined the error response as the average z-scored difference in firing in a 50- to 125-ms interval following target onset (24). We plotted the distribution of error responses across the 125 VTA neurons and observed two distinct groups: one that did not exhibit significant error responses (n = 108 neurons, error response 0.1 ± 0.9) and a group of error-responding neurons (n = 17 neurons, error response 3.3 ± 0.5; Fig. 2, E and F) that formed a distinct cluster (P < 0.001, bootstrap) (24). These two groups, defined as VTAerror (n = 17) and VTAother (n = 108), were spatially intermingled (fig. S1).
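The error-response measure described above can be illustrated with a minimal Python sketch. It is not the authors' analysis code, which is described in the supplementary methods (24). The function names, the 25-ms bin width, the ±200-ms analysis range, the use of the pre-target bins as the z-scoring baseline, and the sign convention (undistorted minus distorted, chosen so that error-responding neurons yield positive values) are all assumptions made for illustration. Only the 50- to 125-ms averaging window comes from the text.

```python
from statistics import mean, stdev


def rate_histogram(trials, t_start, t_stop, bin_ms):
    """Trial-averaged firing rate (Hz) per bin.

    `trials` is a list of trials; each trial is a list of spike times
    in ms relative to target onset (hypothetical input format).
    """
    n_bins = (t_stop - t_start) // bin_ms
    counts = [0] * n_bins
    for spikes in trials:
        for t in spikes:
            idx = int((t - t_start) // bin_ms)
            if 0 <= idx < n_bins:
                counts[idx] += 1
    # counts -> spikes per second, averaged over trials
    return [c * 1000 / (len(trials) * bin_ms) for c in counts]


def error_response(distorted, undistorted, t_start=-200, t_stop=200,
                   bin_ms=25, window=(50, 125)):
    """Mean z-scored rate difference in `window` after target onset.

    Assumption: the difference histogram is z-scored against its own
    pre-target bins (t < 0); the published baseline may differ.
    """
    h_dist = rate_histogram(distorted, t_start, t_stop, bin_ms)
    h_undist = rate_histogram(undistorted, t_start, t_stop, bin_ms)
    diff = [u - d for u, d in zip(h_undist, h_dist)]

    n_base = (0 - t_start) // bin_ms      # number of pre-target bins
    baseline = diff[:n_base]
    mu, sd = mean(baseline), stdev(baseline)
    if sd == 0:
        raise ValueError("baseline difference has zero variance")

    z = [(x - mu) / sd for x in diff]
    lo = (window[0] - t_start) // bin_ms  # first bin of the window
    hi = (window[1] - t_start) // bin_ms  # one past the last bin
    return mean(z[lo:hi])
```

Under these assumptions, a neuron that is suppressed after distorted targets and activated after undistorted targets gives a positive value, and swapping the two trial sets flips the sign.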

Fig. 2 VTA neurons encode performance error during singing.

(A) Spectrogram, voltage trace, and the instantaneous firing rate of a VTAx neuron (DAF, red shading; undistorted targets, blue lines). (B) Top to bottom: spectrograms, spiking activity during undistorted and distorted trials, corresponding spike raster plots and rate histograms, and z-scored difference between undistorted and distorted rate histograms (plots aligned to target onset). Horizontal bars in histograms indicate significant deviations from baseline (P < 0.05, z test) (24). (C and D) Two additional VTAerror neurons as in (B). (E) Each row plots the z-scored difference between undistorted and distorted target-aligned rate histograms. VTAx neurons (top, n = 14) and nonantidromic neurons (bottom, n = 111) are independently sorted by maximal z score. (F) Top, distribution of error responses (24). Bottom, spike width versus error response (triangles: antidromic; circles: nonantidromic neurons). (G) Normalized response to distorted and undistorted targets (mean ± SEM) for VTAother (top) and VTAerror neurons (middle). Bottom, scatterplot of normalized rate in the 50 to 125 ms following distorted and undistorted trials (solid fills indicate P < 0.05, bootstrap). (H) Distributions of phasic response durations (top) and latencies (bottom). (I) For each VTAerror neuron, the time of maximal firing rate relative to motif onset is plotted against target time.

All VTAerror neurons were phasically suppressed by DAF during singing (Fig. 2, A to D, G; P < 0.05 in 17 out of 17 VTAerror neurons, bootstrap). Suppressions followed DAF onset with a latency of 58 ± 13 ms, lasted 86 ± 35 ms, and resulted on average in a 75% reduction in firing rate (range: 45 to 100%) (24, 25). DAF-induced suppressions during singing were highly reliable, occurring on an average of 94% of distorted trials (range: 82 to 100%). VTAerror neurons also exhibited phasic activations following the precise time-step of undistorted songs where DAF would have occurred but did not occur (Fig. 2, A to D, G, and I; P < 0.05 in the same 17 neurons that exhibited suppressions on distorted trials, bootstrap). Phasic activations mirrored the phasic suppressions: They followed target onsets with a latency of 51 ± 20 ms, lasted 62 ± 27 ms, and resulted on average in a 77% (range: 42 to 214%) increase in firing rate (24) (Fig. 2H).

These precisely timed phasic activations suggest that undistorted target syllables are signaled as better than predicted, as if they are evaluated against an estimate of syllable quality that is diminished by a memory of errors (i.e., a flexible performance benchmark; see supplementary text). To test if error signals are scaled by error history, we trained 10 birds in a two-target paradigm in which one syllable was distorted with a high probability (target-1, 49 ± 4%) and a second syllable with low probability (target-2, 20 ± 4%) (Fig. 3, A to C) (24). The magnitude and reliability of phasic suppressions did not depend on error probability (percentage of suppression: target-1: 59%, range 45 to 77%; target-2: 63%, range 20 to 100%; reliability: target-1: 90%, range 82 to 100%; target-2: 86%, range 71 to 100%, P > 0.4, rank sum tests; Fig. 3D), consistent with weak scaling of dopaminergic negative reward prediction error responses (6, 7). In contrast, phasic activations were significantly larger following (the more surprising) undistorted renditions of the high-probability target (increase in firing rate, target-1: 67%, range 42 to 159%; target-2: 22%, range –3 to 48%, P < 0.001, rank sum test; Fig. 3E). Error responses to target-2 did not depend on whether or not the preceding target-1 was distorted and vice versa, indicating that song time steps are independently evaluated against temporally aligned performance benchmarks (P > 0.05, rank sum tests and fig. S2).
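The logic of probability-dependent error scaling can be made concrete with a textbook-style prediction error sketch in Python. This is an illustrative model only, not one fitted or proposed in the report: the outcome coding (1 for an undistorted rendition, 0 for a distorted one), the function names, and the learning rate are assumptions.

```python
def prediction_error(distorted, p_distort):
    """Signed error: actual outcome minus predicted outcome.

    Assumed coding: outcome is 1.0 for an undistorted rendition and
    0.0 for a distorted one; the prediction is the expected outcome
    given the distortion probability at that target.
    """
    outcome = 0.0 if distorted else 1.0
    predicted = 1.0 - p_distort
    return outcome - predicted


def update_distortion_estimate(p_distort, distorted, learning_rate=0.05):
    """Running estimate of distortion probability (a memory of errors).

    A simple delta rule with a hypothetical learning rate.
    """
    target = 1.0 if distorted else 0.0
    return p_distort + learning_rate * (target - p_distort)


if __name__ == "__main__":
    # Distortion probabilities taken from the two-target paradigm.
    for name, p in (("target-1", 0.49), ("target-2", 0.20)):
        print(name,
              "undistorted:", prediction_error(False, p),
              "distorted:", prediction_error(True, p))
```

In this symmetric model an undistorted rendition of target-1 produces a larger positive error than one of target-2, in line with the larger activations reported for the high-probability target. The model also predicts a stronger negative error for distorted target-2 renditions, which the recordings did not show; the text attributes this to weak scaling of negative prediction error responses.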

Fig. 3 VTAerror responses depend on error probability.

(A) Displaced-syllable DAF scheme with two targets per motif (syllable b: target-1, distortion rate 50%; syllable d, target-2, distortion rate 20%; target times marked with dashed white line and black triangle). The distorted versions of the two target syllables are shown at right (color scheme as in Fig. 1E). (B) Target-1 and (C) target-2 error responses for the same neuron. Top to bottom: spectrograms, spiking activity during undistorted and distorted trials, corresponding spike raster plots and rate histograms, and z-scored difference between undistorted and distorted rate histograms (all plots aligned to target onset). Horizontal bars in histograms indicate significant deviations from baseline (P < 0.05, z test) (24). (D) Top, normalized responses to distorted targets (mean ± SEM) for VTAerror neurons. Bottom, scatterplot of normalized rate in the 50 to 125 ms following target time (solid fills indicate P < 0.05, bootstrap). (E) Same as (D) but for undistorted targets.

More than 95% of Area X–projecting VTA neurons are dopaminergic (11). Fourteen of 125 VTA neurons were antidromically identified as projecting to Area X (Fig. 1, B to D), and 13 out of 14 VTAx neurons encoded performance error (Fig. 2, E and F). Firing patterns of VTAerror neurons were like those of mammalian dopamine neurons (see supplementary text and figs. S3 to S5).

Dopamine activity correlates with movement (26, 27). We quantified movement with microdrive-mounted accelerometers (fig. S6 and movie S1). The activity of many VTA neurons was modulated by movement, which was in turn correlated with singing. But movement patterns during singing were not affected by DAF, and error responses were not affected by movement (n = 26 out of 26 birds, P > 0.05, bootstrapped d’ analysis, see supplementary text, tables S1 and S2, and figs. S6 to S10).

VTAerror neurons might encode not performance error but simply the presence or absence of DAF as if it were an aversive stimulus (see supplementary text). An aversive response should persist in birds during nonsinging periods, whereas performance error should be restricted to singing. During nonsinging periods, VTAerror neurons did not differentially respond to playback of distorted and undistorted renditions of the bird’s own song (normalized firing rate, distorted: 1.0 ± 0.2; undistorted: 1.1 ± 0.1; P > 0.3, unpaired t test) (Fig. 4) and did not exhibit pauses in response to DAF (fig. S11). Confinement of VTAerror responses to singing is consistent with performance error.

Fig. 4 Response of VTAerror neurons to birdsong during nonsinging.

(A) Distorted and undistorted renditions of the bird’s own song were played back during nonsinging periods. (B) Top to bottom: spectrograms, spiking activity of the VTAx neuron shown in Fig. 3 during playback of undistorted and distorted songs, corresponding spike raster plots and rate histograms, and z-scored difference between undistorted and distorted rate histograms (all plots aligned to target onset). (C) Normalized responses to distorted and undistorted targets (mean ± SEM) for VTAerror neurons during passive playback (top). Bottom, scatterplot of normalized rate in the 50 to 125 ms following target time (empty fills indicate no significant response, P > 0.05, bootstrap) (24).

Performance error signals during singing are similar to prediction error signals during reward seeking (5). Suppression of VTAerror activity after distorted syllables resembles the dopamine response to worse-than-predicted reward outcomes. Activation of VTAerror neurons after undistorted syllables resembles the dopamine response to better-than-predicted reward outcomes. The scaling of positive VTAerror responses according to error history suggests that song is evaluated against flexible performance benchmarks. Positive reward prediction error signals are also scaled by reward prediction (6, 7). Finally, performance and reward prediction error signals could underlie similar learning mechanisms. Dopamine-modulated corticostriatal plasticity links external stimuli to reward-maximizing responses (14). Dopamine-modulated corticostriatal plasticity also exists inside Area X (28) and could similarly link each time step in the song to the specific vocalization that produces a favorable outcome when produced at that time step (supplementary text and fig. S12). Such a mechanism would explain the reinforcement of undistorted syllable variants in operant DAF paradigms (Fig. 1, G and H) (18, 20, 21, 23) and could contribute to natural song learning (14).

Yet, unlike reward prediction error, performance error during singing is not derived from sensory feedback of intrinsic reward or reward-predicting value. The absence of error responses in birds passively hearing distorted or undistorted syllables suggests that there is nothing intrinsically “good” or “bad” about these sounds according to the performance-monitoring system. Performance error might instead derive from evaluation of auditory feedback against internal performance benchmarks that require, at each time step of the song sequence, information about the desired outcome, the actual outcome, and also the predicted probability of achieving the desired outcome. It remains unknown how upstream circuits construct the VTAerror signal. Multiple auditory cortical areas, including one that projects to VTA, respond to DAF specifically during singing (22, 25), providing a candidate pathway for auditory mismatch signals to reach VTA. A newly identified Area X–basal forebrain–VTA pathway (29) might additionally provide a temporally precise and syllable-specific memory of errors required to compute a benchmark against which mismatch error signals are scaled.

Supplementary Materials

www.sciencemag.org/content/354/6317/1278/suppl/DC1

Materials and Methods

Supplementary Text

Figs. S1 to S12

Tables S1 to S2

Movie S1

References (30–68)

References and Notes

  1. Materials and methods are available as supplementary materials on Science Online.
Acknowledgments: We thank J. Fetcho, M. Warden, M. Long, A. Andalman, and D. Aronov for comments on the manuscript; J. Cohen for mouse VTA recording data; T. Bollu and D. Murdoch for technical support; J. Wu and K. Maher for histology; and A. Treska for art. Funding support was provided to J.H.G. by NIH (grant R01NS094667), Pew Charitable Trusts, and Klingenstein Neuroscience Foundation and to V.G. by Simons Foundation. V.G. and J.H.G. designed the research, analyzed the data, and wrote the paper. V.G., P.A.P., R.C., A.R.F., E.B.-D., and J.H.G. performed experiments. The authors declare no competing financial interests. Data can be accessed at www.nbb.cornell.edu/goldberg/.