Report

Building Neural Representations of Habits


Science  26 Nov 1999:
Vol. 286, Issue 5445, pp. 1745-1749
DOI: 10.1126/science.286.5445.1745

Abstract

Memories for habits and skills (“implicit or procedural memory”) and memories for facts (“explicit or episodic memory”) are built up in different brain systems and are vulnerable to different neurodegenerative disorders in humans. So that the striatum-based mechanisms underlying habit formation could be studied, chronic recordings from ensembles of striatal neurons were made with multiple tetrodes as rats learned a T-maze procedural task. Large and widely distributed changes in the neuronal activity patterns occurred in the sensorimotor striatum during behavioral acquisition, culminating in task-related activity emphasizing the beginning and end of the automatized procedure. The new ensemble patterns remained stable during weeks of subsequent performance of the same task. These results suggest that the encoding of action in the sensorimotor striatum undergoes dynamic reorganization as habit learning proceeds.

Many acts that we perform regularly become so routine that we carry them out almost without conscious effort. We depend on these habits to free us to think and to react to new events in the environment (1). Clinical and behavioral evidence suggests that the basal ganglia are centrally involved in the procedural learning that leads to habit formation and in the performance of routinized behaviors once they are learned (2–4). The development of multiple-electrode chronic recording techniques in freely moving rodents (5,6) now makes it feasible to investigate what forms of neuronal representation are built up in the basal ganglia during the acquisition of habits. We used multiple tetrodes (7) to record chronically from ensembles of neurons in the striatum of rats as they underwent training on a procedural learning task and as they performed the habitual behavior after learning (8).

We used a conditioned T-maze paradigm (Fig. 1A) that allowed us to analyze neuronal activity related to key epochs in the behavioral task. Rats were trained to run down the maze once a start gate opened and to respond to auditory instruction cues by turning right or left to receive reward. A daily training session consisted of 40 trials (20 for each tone presented pseudorandomly) (9). Rats reached the 72.5% correct criterion for acquisition (10) in an average of five sessions and then received seven or more additional (overtraining) sessions. During acquisition, the speed of performance increased, as is typical of procedural learning (Fig. 1, B and C). Behavioral data were normalized by defining nine stages of training in order to examine learning-related changes in unit activity profiles (10).
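
As a minimal illustration of the acquisition criterion described above, the sketch below scores 40-trial sessions against the 72.5% threshold (29 of 40 trials correct). The function name and the example counts are hypothetical and are not data from this study; the full criterion and the definition of training stages are given in (10).

```python
TRIALS_PER_SESSION = 40   # from the text: 20 trials for each of the two tones
CRITERION = 0.725         # from the text: 72.5% correct


def first_criterion_session(correct_counts, trials=TRIALS_PER_SESSION,
                            criterion=CRITERION):
    """Return the 1-based index of the first session at or above criterion.

    Returns None if no session reaches criterion.
    """
    for session, n_correct in enumerate(correct_counts, start=1):
        if n_correct / trials >= criterion:
            return session
    return None


# Hypothetical numbers of correct trials per session (not data from the study).
example_counts = [19, 22, 25, 27, 30, 33]
print(first_criterion_session(example_counts))  # prints 5 (30/40 = 75%)
```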

Figure 1

Experimental paradigm for assessing habit learning. (A) Rats were trained to run a T-maze to obtain reward (9). Event-related neuronal firing patterns were recorded in relation to start, tone, turn, and goal reaching from tetrodes placed in the sensorimotor striatum (dorsolateral caudoputamen, representative track reconstruction shown in brain section). The peri-event time histograms show the average number of spikes in consecutive 10-ms intervals during 2-s periods around individual task events (indicated by the vertical line at time 0 in each histogram) for four representative units. Analyses were carried out on ±500-ms peri-event periods (13). Horizontal line indicates 95% confidence level above the mean baseline firing rate (13). (B) Average percent-correct behavioral responses (solid line) and trial duration (dashed line) in nine training stages. Stages were defined as described in (10), with behavioral criterion achieved in stage 3. (C) Cumulative plot showing, in different shades of gray (from bottom to top), mean running times for the four segments of the T-maze detected by the photobeams [ticks in (A)]: from start to tone onset, from tone onset to beginning of turn, from beginning to end of turn, and from end of turn to goal. (D) Autocorrelogram representative of those used in assessing sorted single units.

During the acquisition and overtraining periods, we recorded from a daily average of 30 units per rat in rats with six tetrodes chronically implanted in the sensorimotor striatum (Fig. 1A) (11, 12). These tetrodes never moved or moved a few hundred micrometers at most throughout the 2- to 3-week-long experiments (13). Neuronal “start responses” were defined as changes in firing rate that were time locked to the start-gate opening or the initiation of locomotion. Similarly, “tone responses” were related to the onset of the auditory instruction cues, “turn responses” occurred during the right and left turns, and “goal responses” occurred at the end of the trial, at the point of reward delivery (13). Nearly all of the units recorded had action potential waveforms typical of the predominant cell type in the striatum, the medium spiny projection neuron (14).
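
The event-locked responses defined above can be illustrated with a simple peri-event time histogram. The sketch below uses the bin width (10 ms), display window (2 s), and analysis window (±500 ms) stated in the Fig. 1 legend. The threshold rule (baseline mean plus `z` standard deviations, with `z = 1.96`), the source of the baseline values, and all function names are assumptions made for illustration; the criterion actually used is specified in (13).

```python
import statistics

BIN_S = 0.010      # 10-ms bins (Fig. 1 legend)
WINDOW_S = 1.0     # 1 s on each side of the event, giving a 2-s window
ANALYSIS_S = 0.5   # analyses restricted to +/-500 ms around the event


def peri_event_histogram(spike_times, event_times,
                         bin_s=BIN_S, window_s=WINDOW_S):
    """Average spike count per bin across trials, with each event at time 0."""
    n_bins = int(round(2 * window_s / bin_s))
    counts = [0] * n_bins
    for event in event_times:
        for spike in spike_times:
            offset = spike - event
            if -window_s <= offset < window_s:
                index = int((offset + window_s) / bin_s)
                counts[min(index, n_bins - 1)] += 1  # guard against rounding
    return [c / len(event_times) for c in counts]


def bins_above_threshold(histogram, baseline, z=1.96, bin_s=BIN_S,
                         window_s=WINDOW_S, analysis_s=ANALYSIS_S):
    """Indices of bins inside the analysis window that exceed the threshold.

    Assumption: threshold = mean(baseline) + z * sd(baseline), where
    `baseline` holds per-bin averages from a period outside the task events.
    """
    threshold = statistics.mean(baseline) + z * statistics.stdev(baseline)
    first = int(round((window_s - analysis_s) / bin_s))
    last = int(round((window_s + analysis_s) / bin_s))
    return [i for i in range(first, last) if histogram[i] > threshold]
```

With these defaults the histogram has 200 bins, the event falls at the boundary between bins 99 and 100, and the analysis window covers bins 50 through 149.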

Striking changes occurred in the ensemble pattern of task-related responses of striatal neurons during learning (Fig. 2A). At the beginning of training, when the animals had received only maze-familiarization sessions (9), 56% of the units recorded had significant task-related activity (13) for at least one of the task events. The most common response (shown by 65% of the task-related units) was during turning, consistent with the location of the recording sites in the sensorimotor striatum (11, 12). During learning, as the rats' performance improved, the percentage of task-related units increased to a maximum of 92% (P < 0.001), but the numbers of units with turn responses fell dramatically. By behavioral asymptote (stage 5), only 28% of task-related units responded during the turns (P < 0.001), and by the end of training, only 13% had significant turn responses.

Figure 2

Changes in task-related activity patterns of units in the sensorimotor striatum during T-maze learning. (A) Percentage of all task-related units recorded that exhibited start-, tone-, turn-, and goal-related activity during successive stages of acquisition and overtraining, as described in (10). Values are averages for the three rats. (B) Individual averages for the three rats (rat V, solid lines; rat Y, dashed lines; and rat Z, dotted lines) of percentage of task-related neurons with start, tone, turn, and goal responses, plotted for successive stages. (C) Changes in percentages of units recorded on a single tetrode in the sensorimotor striatum of rat Z that was not moved during 16 daily sessions. Behavioral criterion was achieved on day 11 (indicated by underline).

By contrast, units with start responses, which made up 44% of the task-related units at the beginning of training, rose to 88% by behavioral asymptote (P < 0.001), and goal-responsive units also increased greatly, from 29 to 67% (P < 0.001). All rats exhibited these changes in proportionate response to start, turn, and goal reaching, despite individual variation in the rates of change (Fig. 2B). Units responding in relation to the tones were rare, except in one rat (15), and increased from 18 to 36% of all task-related units (P < 0.025). These acquired ensemble activity patterns were remarkably stable during the overtraining period, except for the tone-related units, whose numbers fell back to pretraining levels. Concomitant with these changes, the number of units that responded to more than one task event rose from 44 to 70% by behavioral asymptote (P < 0.005). Thus, as a consequence of the T-maze training, large numbers of units in the sensorimotor striatum were newly recruited to respond to some aspect of the task, most of the units came to have multievent response profiles, and the entire pattern of responsivity of the active striatal neurons changed.

The dramatic increase in start- and goal-related ensemble activity reflected an increase, from 3 to 32% (P < 0.001), in units with dual-event responses representing start and goal and smaller increases (P > 0.05) in units with single-event responses representing either start (from 14 to 22%) or goal (from 9 to 17%) (Fig. 3B). Because of the dominance of neuronal responses at start and goal, the response patterns of simultaneously recorded units became more similar during learning (Fig. 3C) (16). These observations suggest a fundamental learning-related change in neural correlates of the task in the sensorimotor striatum, at both single-unit and ensemble levels.

Figure 3

Task-related activity profiles of striatal units. (A) Examples of the consistent responses of units recorded at single sites on two unmoved tetrodes during overtraining sessions (session days shown at left). Conventions as in Fig. 1A. Upper set of plots show data from the same tetrode as that illustrated in (B) and in Fig. 2C. (B) Task-space activity maps illustrating task-related activity profiles of units recorded on a single tetrode during four of the sessions (stages 1, 2, 3, and 9). Units with turn responses were prominent in stage 1, those with start responses increased in stage 2, turn-related units declined by stage 3, and units with dual start and goal responses increased by stage 9. (C) Tuning of response properties of striatal neurons during T-maze learning indicated by average similarity index scores (16) for pairs of units recorded on single tetrodes (solid line) and on different tetrodes (dashed line).

To determine whether populations of neurons recorded at a single site in the sensorimotor striatum would show changes in activity patterns similar to those found for the entire population of striatal units recorded during the procedural learning process, we made detailed analyses of units recorded from tetrodes that were not moved at all during training. The data shown in Figs. 2C and 3A were acquired from a single tetrode that was not moved over 16 sessions (20 calendar days). As was true for the entire data set (Fig. 2A), turn-related responses in the local population declined sharply, responses at start and at goal rose sharply, and there was an initial rise and then a decline in tone responses during learning. The newly acquired patterns of activity were maintained consistently during the overlearning period, as in the entire population (Fig. 3A). Thus, local tuning occurs in the sensorimotor striatum, begins early during procedural learning, and represents a long-term change that can persist at least for weeks.

Procedural learning involves a complex of behavioral changes, including decreases in reaction time and movement time as well as increases in performance accuracy (Fig. 1, B and C). We compared the changes in each of these behavioral parameters to the changes in unit activity session by session (17) (Fig. 4). In all three rats, the changes in reaction time and trial duration (as an estimate of movement time) occurred more rapidly than the changes in percent-correct performance. The lag in percent correct was maximal in the slowest learner (16 days to criterion) and minimal in the fastest learner (3 days to criterion) (Fig. 4A). Correlations between the changes in neural activity and changes in behavioral parameters showed clear patterns (Fig. 4B). They were highest for the fastest learner (r = 0.760) and lowest for the slowest learner (r = 0.332). Among parameters, the correlations were lower for turn responses (r = 0.325) than for start and goal responses (r = 0.573 and r = 0.627, respectively). The most consistent high correlation was between increases in goal responses and increases in percent-correct performance. In the fastest learner, high correlations were evident for all but turn responses (Fig. 4, B and C). These data suggest that the changes in neuronal activity patterns during the maze learning were not exclusively related to any single behavioral parameter.

Figure 4

Correlations between changes in neural firing patterns and changes in performance speed and accuracy. (A) Plots of average reaction times (dashed lines), trial duration (solid lines), and percent-correct performance (dotted lines) for each individual rat across days of training. The vertical axis represents changes in the behavioral measures on a scale of 0 to 1, with 0 being the minimum value and 1 being the maximum value of each measure. Reaction time and trial duration were plotted on inverted scales on the right. Underlines indicate the day on which behavioral criterion (72.5% correct) was first reached. (B) Plots of Pearson correlation coefficients calculated for the relation between the percentages of units exhibiting responses related to start (open circles), turn (solid squares), and goal (crosses) and the three behavioral measures [reaction time (RT), trial duration (TD), and percentage of correct responses (%)]. Absolute values of correlation coefficients were used for the vertical axis. Correlations were highest for rat Y (the fastest learner) and lowest (not significant) for rat V (the slowest learner). (C) Individual plots of changes in neural activity (solid lines) and changes in behavioral parameters (dotted lines) for rat Y, the fastest learner. Vertical scales are as described in (A). Separate plots show how changes found in proportions of task-related neurons with start, turn, and goal responses during the nine training stages were related to changes in trial duration (TD) and percent-correct performance (%).
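
The rescaling and correlation described in this legend can be sketched as follows. The 0-to-1 scaling and the use of absolute Pearson coefficients come from the legend; the function names and the example series are hypothetical and are not data from this study.

```python
import math


def min_max_scale(values):
    """Rescale a series so that its minimum maps to 0 and its maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("values are constant; cannot rescale")
    return [(v - lo) / (hi - lo) for v in values]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-session values (not data from the study).
pct_goal_units = [30.0, 35.0, 48.0, 60.0, 66.0]
pct_correct = [50.0, 58.0, 70.0, 80.0, 85.0]
trial_duration_s = [9.0, 6.5, 5.0, 4.2, 4.0]

print(abs(pearson_r(pct_goal_units, pct_correct)))
print(abs(pearson_r(pct_goal_units, trial_duration_s)))
```

Because min-max scaling is a positive linear transformation, it changes how the curves are plotted but not the value of r. Measures that fall with learning, such as trial duration, give negative coefficients, which is why absolute values are needed to compare them on one axis.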

To look for other movement patterns that might have accounted for the changes in activity at start, turn, and goal reaching, we examined videotapes for one rat (rat Z) and also measured changes in velocity at start and goal (18). We found no detectable stereotypic responses emerging over time and no consistent relation between acceleration or deceleration and neuronal activity levels. We cannot completely exclude the possibility that some motor activity accounted for at least some of the changes in neuronal firing patterns that we found. However, we favor the view that aspects of performance other than purely motoric ones (such as context or cue saliency, motor readiness, or changing response strategy) contributed to the large-scale neuronal changes that we found (18).

Because we recorded simultaneously from multiple striatal neurons, it was possible to test whether the proportions of task-related neurons firing in relation to different task events were reflected in the total population firing and whether the changes in neuronal proportions during learning were mirrored by changes in the total population firing (19). We found that they were (Fig. 5). There was a striking similarity in the shifting patterns of response to start, tone, turn, and goal shown by the measures of percentage of task-related neurons discriminating particular events (Fig. 5A) and the percentage of total spikes time locked to those events (Fig. 5B). These changes in the patterns of total population firing occurred in the absence of consistent changes in average firing rates for the corresponding sessions. These results suggest that the changes we found were true learning-dependent changes in the striatal population response (6, 20). Preliminary results of an ongoing cross-correlational analysis (21) indicate that, during the maze learning, the degree of temporally correlated activity in the ensemble response also increased.
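
The two population measures compared above can be contrasted in a short sketch. It assumes that the spike-based measure is each event's share of all spikes counted in the peri-event windows of the simultaneously recorded units; the procedure actually used is described in (19). The data structures, function names, and example values are hypothetical.

```python
EVENTS = ("start", "tone", "turn", "goal")


def percent_units_responsive(responsive):
    """Percentage of task-related units responding to each event.

    responsive: {unit_id: set of events with a significant response}.
    Units with an empty set are not task related and are excluded.
    """
    task_related = [events for events in responsive.values() if events]
    if not task_related:
        raise ValueError("no task-related units")
    return {e: 100.0 * sum(e in events for events in task_related)
            / len(task_related) for e in EVENTS}


def percent_population_spikes(window_counts):
    """Each event's share of all spikes counted in peri-event windows.

    window_counts: {unit_id: {event: spike count in that event's window}}.
    """
    totals = {e: sum(c.get(e, 0) for c in window_counts.values())
              for e in EVENTS}
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("no spikes in any peri-event window")
    return {e: 100.0 * totals[e] / grand_total for e in EVENTS}


# Hypothetical ensemble of four units (not data from the study).
responsive = {"u1": {"start", "goal"}, "u2": {"start"},
              "u3": set(), "u4": {"turn"}}
window_counts = {"u1": {"start": 30, "goal": 30},
                 "u2": {"start": 20}, "u4": {"turn": 20}}

print(percent_units_responsive(responsive))
print(percent_population_spikes(window_counts))
```

The unit-based percentages can sum to more than 100 because a unit with a multievent profile is counted once for each event, whereas the spike-based shares sum to 100 by construction.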

Figure 5

Reorganization of neuronal activity in the sensorimotor striatum during habit learning. (A) Schematic activity maps representing the average proportion of task-related units responsive to different parts of the task from early to late in training. The maps are based on data from Fig. 2A and show proportions according to the color scale shown to the right (red is high and blue is low). From left to right, the maps show task-space activity patterns for training stage 1 (day 1), stage 3 (first criterial performance day), stage 5 (average stage of behavioral asymptote), and stage 9 (seventh day with above-criterion performance). (B) Schematic activity maps representing the population spike discharge as the percentage of total spike discharges of striatal neurons simultaneously recorded during the same four training stages illustrated in (A) (19). The activity maps were made as those in (A), and the color scales used for (A) and (B) were identical.

Our findings demonstrate that an overall restructuring of neuronal response patterns occurs in the sensorimotor striatum as procedural learning proceeds. The newly acquired neuronal responses emphasized the beginning and the end of the learned procedure (the maze run) and de-emphasized the turning phase of the procedure, despite the initial strong representation of this sensorimotor behavior at the beginning of training. This pattern is consistent with the hypothesis that the sensorimotor striatum develops an action template for triggering the procedure as a behavioral unit (with a beginning and an end and, if present, salient intermediates). Rapid tuning of responses to salient sensory cues has been found in the striatum in several reward-based paradigms (22), and anticipatory and reward-related responses have been found in the striatum in overtrained animals, mostly in the caudate nucleus (23). Our observation of a gradual and prolonged change in striatal responses suggests a neural correlate of the slow acquisition characteristic of habit learning (24). In the only other chronic recording study of striatal neurons during behavioral acquisition, decreased response-related activity was found in the sensorimotor striatum as rats learned an instrumental lever-press task (12). Our findings suggest that such decreases may be part of a general reshaping of neuronal responses emphasizing the context that can trigger the learned habit.

The changes that we have observed could represent plasticity in striatal networks or in their afferent structures or in both. Spiking in striatal projection neurons depends on coherent activation of excitatory afferent fibers, mainly originating in the neocortex and thalamus, and these structures are in loop circuits with the striatum. Sensorimotor and premotor cortical responsivity does change with procedural learning (25). Human brain scanning experiments suggest that different cortical areas become activated in succession, from prefrontal to premotor and parietal, and that heightened metabolic activity moves in parallel from rostral to caudal (sensorimotor) striatum (26). Our experiments also show that activity (number of task-related units) increases in the sensorimotor striatum during procedural learning, but they demonstrate that the activity reflects dramatically different task-related response profiles at different stages of learning.

The ensemble activity changes that we observed in the striatum contrast with those reported for the dopamine-containing neurons of the midbrain, which are thought to provide a reward-based teaching signal to the striatum (27). After conditioning, these midbrain neurons shift their responses to the earliest indicator of reward. These predictive responses may be similar to the “start” responses found here, in that late in training, trial start becomes predictive of reward. However, in the striatum, we detected no sign of such a shift during the training periods that we used. Instead, many neurons in the striatum acquired a response at start and goal reaching, and nearly a third of these had a double response at both start and goal (28).

Basal ganglia circuits affect motor and cognitive behavior through output pathways originating from subclasses of striatal projection neurons (3, 14, 29). Our results demonstrate large-scale changes in the recruitment and firing patterns of these neurons. This suggests that a dynamic reorganization of basal ganglia outputs may occur as a result of procedural learning. The greatly increased anticipatory or start responsivity in these neurons may form the basis for the behavioral “motor readiness” and “release” functions that have been ascribed to the basal ganglia on the basis of clinical and experimental studies (3, 29, 30). The accentuation of the beginning and end of the procedure that we found in the acquired neural activity patterns may relate to the form of striatal processing disordered in Parkinson's disease, in which patients have particular difficulty in starting and stopping movement sequences or in breaking into one sequence with another (31). This form of neural encoding could also relate to the triggering of repetitive, stereotyped behaviors in some neuropsychiatric disorders and addictive states (32). This view is compatible with the idea that, during habit learning, the striatum comes to code whole sequences of behavior as performance units that can be triggered by specific contexts (3).

  • * These authors contributed equally to this work.

  • To whom correspondence should be addressed. E-mail: graybiel{at}mit.edu

REFERENCES AND NOTES
