Learning and attention reveal a general relationship between population activity and behavior

Science  26 Jan 2018:
Vol. 359, Issue 6374, pp. 463-465
DOI: 10.1126/science.aao0284

The neuronal population is the key unit

The responses of pairs of neurons to repeated presentations of the same stimulus are typically correlated, and an identical neuronal population can perform many functions. This suggests that the relevant units of computation are not single neurons but subspaces of the complete population activity. To test this idea, Ni et al. measured the relationship between neuronal population activity and performance in monkeys. They investigated attention, which improves perception of attended stimuli, and perceptual learning, which improves perception of well-practiced stimuli. These two processes operate on different time scales and are usually studied using different perceptual tasks. Manipulation of attention and learning in the same behavioral trials and the same neuronal populations revealed the dimensions of population activity that matter most for behavior.

Science, this issue p. 463


Prior studies have demonstrated that correlated variability changes with cognitive processes that improve perceptual performance. We tested whether correlated variability covaries with subjects’ performance—whether performance improves quickly with attention or slowly with perceptual learning. We found a single, consistent relationship between correlated variability and behavioral performance, regardless of the time frame of correlated variability change. This correlated variability was oriented along the dimensions in population space used by the animal on a trial-by-trial basis to make decisions. That subjects’ choices were predicted by specific dimensions that were aligned with the correlated variability axis clarifies long-standing paradoxes about the relationship between shared variability and behavior.

The responses of pairs of neurons to repeated presentations of the same stimulus are typically correlated [quantified as noise correlations, or spike count correlations (rSC)] (1, 2). Prior electrophysiological studies have shown that these correlations change with cognitive processes that affect perceptual performance (2–4). However, theoretical work has suggested that this correlated variability may not affect the information encoded by a neuronal population in a manner that influences a subject’s decisions (5, 6).
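As an illustration of the quantity described above, the following is a minimal sketch of a mean spike count correlation, assuming the responses are stored as a trials-by-units array of spike counts for one repeated stimulus. The function name, the array layout, and the simulated data are assumptions made for this example and are not taken from the study's analysis code.

```python
import numpy as np

def mean_spike_count_correlation(counts):
    """Mean pairwise Pearson correlation of spike counts.

    counts: array of shape (n_trials, n_units), responses to repeated
    presentations of the same stimulus (hypothetical layout).
    """
    counts = np.asarray(counts, dtype=float)
    # Correlation matrix across units; columns are treated as variables.
    corr = np.corrcoef(counts, rowvar=False)
    # Average over distinct pairs only (upper triangle, no diagonal).
    upper = np.triu_indices_from(corr, k=1)
    return float(np.mean(corr[upper]))

# Simulated example: a fluctuation shared by all units plus private noise.
rng = np.random.default_rng(0)
n_trials, n_units = 500, 20
shared = rng.normal(size=(n_trials, 1))          # variance 1, common to all units
private = rng.normal(size=(n_trials, n_units))   # independent per unit
counts = 10 + shared + 2.0 * private             # private variance 4
r_sc = mean_spike_count_correlation(counts)      # expected near 1 / (1 + 4) = 0.2
```

In this toy model, shrinking the shared term lowers the mean correlation while leaving each unit's private variability untouched.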

We therefore measured the relationship between neuronal population activity and performance by studying two processes that both improve visual performance but on very different time scales: attention (7) and perceptual learning (8). By observing attention and learning in the same behavioral trials and neuronal populations, we identified the dimensions of population activity that matter most for behavior.

We recorded from neuronal populations in V4 (3, 4, 7–9) in two rhesus monkeys with chronically implanted microelectrode arrays (3). The monkeys detected changes in the orientation of either of two Gabor stimuli (Fig. 1A): one placed within the receptive fields (RFs) of the recorded neurons and one in the opposite hemifield (Fig. 1B). We measured attention effects within a single session and learning effects across sessions (Fig. 1C).

Fig. 1 Methods and behavior.

(A) Orientation change-detection task with cued attention (3). (B) RF centers of recorded units from example session (black circles). Gray circles illustrate Gabor locations; the red circle illustrates representative RF size. (C) Methodology for quantifying attention- and learning-related changes in detection sensitivity (d′). Best-fitting exponential functions plotted with SEM. Heat map illustrates session number. (Insets) Psychometric curves for two example sessions.
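For readers unfamiliar with detection sensitivity, the sketch below computes d′ with the standard signal-detection formula, the difference between the z-transformed hit rate and false-alarm rate. This formula and the clipping constant `eps` are assumptions for illustration; the exact procedure used in the study is described in its Materials and Methods.

```python
from statistics import NormalDist

def d_prime(hit_rate, false_alarm_rate, eps=1e-3):
    """Detection sensitivity d' = z(hit rate) - z(false-alarm rate).

    Rates are clipped to [eps, 1 - eps] so that perfect or empty
    performance does not produce an infinite z-score (eps is an
    arbitrary choice for this example).
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    h = min(max(hit_rate, eps), 1 - eps)
    f = min(max(false_alarm_rate, eps), 1 - eps)
    return z(h) - z(f)

chance = d_prime(0.5, 0.5)       # equal hit and false-alarm rates
better = d_prime(0.9, 0.1)       # more hits, fewer false alarms
```

A higher d′ reflects better separation between responses to changed and unchanged stimuli, independent of the subject's overall tendency to respond.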

Attention and perceptual learning improved performance and affected neuronal population responses in similar ways (Fig. 2 and figs. S1 and S2). Both processes were associated with decreases in the mean-normalized trial-to-trial variance (Fano factor) of individual units and the correlated variability between pairs of units (Fig. 2, C, D, J, and K) in response to repeated presentations of the same stimulus (figs. S3 and S4). These variability changes occurred only in the context of the task (variability measured during passive fixation was constant throughout training) (Fig. 2, F, G, M, and N).
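The Fano factor mentioned above can be sketched as follows, assuming a trials-by-units array of spike counts for one repeated stimulus. The function name and the Poisson test data are hypothetical and serve only to illustrate the definition (variance divided by mean, per unit).

```python
import numpy as np

def fano_factor(counts):
    """Per-unit Fano factor: trial-to-trial variance divided by mean count.

    counts: array of shape (n_trials, n_units). Units with a zero mean
    count return NaN, because the ratio is undefined for them.
    """
    counts = np.asarray(counts, dtype=float)
    mean = counts.mean(axis=0)
    var = counts.var(axis=0, ddof=1)  # unbiased sample variance
    out = np.full(mean.shape, np.nan)
    valid = mean > 0
    out[valid] = var[valid] / mean[valid]
    return out

# Poisson spike counts have variance equal to their mean,
# so their Fano factor should be close to 1.
rng = np.random.default_rng(0)
poisson_counts = rng.poisson(lam=8.0, size=(2000, 5))
ff = fano_factor(poisson_counts)
```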

Fig. 2 Summary of behavioral and neuronal effects of attention and perceptual learning.

All changes were significantly different from 0 except where indicated (t tests; P < 10⁻³). Conventions are as in Fig. 1C. (A and H) Sensitivity (d′) increased with both attention and learning. (B, E, I, and L) Evoked response (firing – baseline rate) increased with attention but did not change consistently with learning or passive fixation (P > 0.05). (C, D, J, and K) Fano factor and correlated variability decreased with attention and learning, but (F, G, M, and N) not during passive fixation (P > 0.05).

Recent theoretical work suggests that only correlated variability along the dimensions in neuronal population space that encode task-relevant stimulus information can limit information coding (5, 6). Determining whether correlated variability lies along these dimensions is experimentally unfeasible because it would require recordings from a very large number of neurons over an even larger number of trials.

Instead, we assessed the importance of attention- and perceptual learning–related changes in correlated variability by investigating their relationship to behavior. There was a single, robust relationship between correlated variability and perceptual performance, whether changes in performance happened quickly (attention) (Fig. 3, A and B) or slowly (learning) (Fig. 3, C and D). This relationship was robust even when we removed the main effects of attention and learning (Fig. 3, E and F).

Fig. 3 The relationship between correlated variability and performance is the same for attention and perceptual learning.

Mean rSC and d′ were significantly correlated across sessions (P < 10⁻³). (A and B) Relationship between rSC and d′ was indistinguishable between attention conditions (Fisher z Pearson-Filon tests; P > 0.05). (C and D) Relationship between rSC and d′ was indistinguishable for the first versus second half of learning (P > 0.05). (E and F) Relationships persisted after removing attention and learning effects (residuals of exponential fits in Fig. 2; P < 10⁻³; analyses of variance, P > 0.05).

We analyzed the responses of V1 neurons (7, 8) in animals performing the same attention task. Unlike in V4, correlated variability in V1 was not correlated with performance (fig. S5).

Both attention and perceptual learning improved the performance of a cross-validated, optimal linear stimulus decoder (fig. S6). However, the relationship between correlated variability in V4 and performance (Fig. 3) seems at odds with theoretical work that suggests most correlated variability should not affect the stimulus information that can be gleaned from an optimal decoder (6).

To examine the relationship between correlated variability and performance more directly, we developed a single-trial measure of correlated variability. We performed principal component analysis (PCA) on population responses to the same repeated stimuli used to compute spike count correlations (fig. S3), meaning that the first PC is by definition the axis that explains more of the correlated variability than any other dimension (Fig. 4, A and B, x axis). Consistent with the recent observation that correlated variability is typically low dimensional (10–12), the variance explained by the first PC was strongly related to the magnitude of correlated variability in each session, even when we accounted for the changes caused by attention and learning (Fig. 4, C and D, and fig. S7) and trial-averaged firing rates (figs. S8 to S11). Like correlated variability (Fig. 3), the proportion of variance explained by the first PC was correlated with behavioral performance (d′) across all sessions [Monkey 1, correlation coefficient (R) = −0.42, P < 10⁻¹³; Monkey 2, R = −0.62, P < 10⁻¹⁵].
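The single-trial measure described above can be sketched with a PCA computed by singular value decomposition. This is an illustrative sketch only: the function name, the array layout, the choice to mean-center without z-scoring, and the simulated data are all assumptions, and the study's own preprocessing may differ.

```python
import numpy as np

def first_pc(counts):
    """First principal component of responses to a repeated stimulus.

    counts: array of shape (n_trials, n_units).
    Returns the unit-norm first PC axis, the proportion of variance it
    explains, and the single-trial projection onto that axis.
    """
    counts = np.asarray(counts, dtype=float)
    centered = counts - counts.mean(axis=0)
    # Rows of vt are the PCs; squared singular values are proportional
    # to the variance along each PC.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    var_explained = s ** 2 / np.sum(s ** 2)
    axis = vt[0]
    projection = centered @ axis  # one value per trial
    return axis, float(var_explained[0]), projection

def simulate(shared_gain, rng, n_trials=500, n_units=20):
    """Toy population: one shared fluctuation plus private noise."""
    shared = rng.normal(size=(n_trials, 1))
    private = rng.normal(size=(n_trials, n_units))
    return shared_gain * shared + private

rng = np.random.default_rng(0)
axis_hi, frac_hi, proj_hi = first_pc(simulate(2.0, rng))   # strong shared variability
axis_lo, frac_lo, proj_lo = first_pc(simulate(0.5, rng))   # weak shared variability
```

In this toy model, the proportion of variance explained by the first PC rises and falls with the strength of the shared fluctuation, which is the sense in which it can stand in for pairwise correlations.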

Fig. 4 Correlated variability is more closely aligned with choices than would be expected from an optimal stimulus decoder.

(A) Schematic showing how we obtained our single-trial measure of correlated variability. We performed PCA on responses to the stimulus before the change (gray) and to the changed stimulus (black). The stimulus decoder detects differences between the responses to the previous and changed stimuli. The choice decoder detects differences between responses to misses and hits. (B) Example data set showing that the animal’s choices are more aligned with the first PC (x axis) than the difference between the previous and changed stimuli, which depends on both the first and second PC. (C and D) Mean rSC is highly correlated with proportion of variance explained by the first PC (P < 10⁻¹¹; residuals, P < 10⁻⁹). (E and F) Performance of the choice and stimulus decoders normalized to their maximum performance (with the full data set), with SEM. (Inset) Raw decoder performance.

These analyses show that projection on this first PC is a suitable proxy for pairwise spike count correlations. We used this measure to assess the importance of correlated variability to the monkey by determining whether population activity along this first PC can predict the monkey’s choices on a trial-by-trial basis.

Activity along this first PC (and therefore correlated variability) had a much stronger relationship with the monkey’s behavior than it would if the monkey used an optimal stimulus decoder. A linear, cross-validated choice decoder (Fig. 4A) could detect differences in hit versus miss trial responses to the changed stimulus from V4 population activity along the first PC alone as well as it could from our full data set (Fig. 4, E and F, and fig. S12). By contrast, the stimulus decoder (Fig. 4A), which detects differences in V4 neuronal population responses to the previous stimulus (the stimulus before the change) versus the changed stimulus, unsurprisingly performed better overall than the choice decoder (Fig. 4, E and F, insets), but the relative influence of the first PC was weaker. The performance of the stimulus decoder was much worse when based on the first PC alone than when based on our full data set (Fig. 4, E and F).
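The comparison above can be illustrated with a toy simulation. The sketch below uses a cross-validated Fisher linear discriminant as a stand-in for the linear decoders; the decoder type, the ridge term, the fold count, and every simulation parameter are assumptions for this example and do not reproduce the study's analysis. The simulated choices are constructed to depend on the shared-variability axis, so the example shows the logic of the comparison rather than evidence for the result.

```python
import numpy as np

def lda_cv_accuracy(X, y, n_folds=5, ridge=1e-3, seed=0):
    """Cross-validated accuracy of a Fisher linear discriminant (labels 0/1)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    if X.ndim == 1:
        X = X[:, None]
    order = np.random.default_rng(seed).permutation(len(y))
    correct = 0
    for test_idx in np.array_split(order, n_folds):
        train = np.setdiff1d(order, test_idx)
        Xtr, ytr = X[train], y[train]
        mu0, mu1 = Xtr[ytr == 0].mean(axis=0), Xtr[ytr == 1].mean(axis=0)
        resid = np.vstack([Xtr[ytr == 0] - mu0, Xtr[ytr == 1] - mu1])
        cov = resid.T @ resid / (len(resid) - 2) + ridge * np.eye(X.shape[1])
        w = np.linalg.solve(cov, mu1 - mu0)       # discriminant weights
        threshold = w @ (mu0 + mu1) / 2
        pred = (X[test_idx] @ w > threshold).astype(int)
        correct += int(np.sum(pred == y[test_idx]))
    return correct / len(y)

rng = np.random.default_rng(1)
n_trials, n_units = 400, 12
u1 = np.ones(n_units) / np.sqrt(n_units)          # shared-variability axis
u2 = np.ones(n_units) / np.sqrt(n_units)
u2[n_units // 2:] *= -1                           # axis orthogonal to u1

def simulate(mean_shift):
    shared = rng.normal(scale=3.0, size=(n_trials, 1))
    private = rng.normal(size=(n_trials, n_units))
    return shared * u1 + private + mean_shift

previous = simulate(np.zeros(n_units))            # stimulus before the change
changed = simulate(1.0 * u1 + 2.0 * u2)           # change shifts both axes

# Toy assumption: hits depend on activity along the shared axis.
drive = changed @ u1 + rng.normal(size=n_trials)
hit = (drive > np.median(drive)).astype(int)

# First PC from responses to the repeated (previous) stimulus.
center = previous.mean(axis=0)
pc1 = np.linalg.svd(previous - center, full_matrices=False)[2][0]

stim_X = np.vstack([previous, changed]) - center
stim_y = np.repeat([0, 1], n_trials)
choice_X = changed - center

stim_full = lda_cv_accuracy(stim_X, stim_y)
stim_pc1 = lda_cv_accuracy(stim_X @ pc1, stim_y)
choice_full = lda_cv_accuracy(choice_X, hit)
choice_pc1 = lda_cv_accuracy(choice_X @ pc1, hit)
```

Dividing each first-PC accuracy by the corresponding full-population accuracy gives a normalized measure comparable in spirit to the normalization described for Fig. 4, E and F.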

It is difficult to determine from extracellular recordings whether choice-predictive signals come from a bottom-up, causal relationship between sensory responses and decisions or from trial-to-trial variability from cognitive factors or post-decision signals (13). A recent study identifying the directionality of choice-predictive signals in mouse sensory cortex found that they are both bottom-up and top-down in origin (14). However, the time course of the choice-predictive activity in our data suggests that it occurs before the decision is made. We based our choice decoder on the first 70 ms of the evoked responses (after accounting for the response latency of V4 neurons). Choice-predictive activity was as strong in the first half of this time frame (60 to 95 ms) as in the second half (96 to 130 ms; paired t test per monkey, P > 0.05). That the choice-predictive activity described here was present during the full decision-making period suggests that it did not reflect post-decision feedback.

Our results, combined with functional imaging in humans (8) and other multielectrode recording studies (15, 16), suggest that learning is best studied by focusing on populations of neurons. Functional imaging studies, which use measures that are related to the activity of large neural populations, find consistent learning-related changes in both V1 and V4 (8, 17), as opposed to single-unit studies (8). Similarly, attention studies suggest that changes in population sensitivity are largely explained by cross-neuron correlations as opposed to single-neuron effects (3, 4).

The robust relationship between correlated variability and perceptual performance suggests that although attention and learning mechanisms act on different time scales (fig. S13), they share a common computation. Some characteristics of this computation are informed by recent studies showing that changes in a low rank modulator can account for the attention-related changes in rate, Fano factor, and correlated variability (11, 12). Attention and learning may decrease the strength of such a modulator by changing the balance of inhibition and excitation (10), which may improve information coding and the information that is communicated downstream (18).

Our most puzzling finding is that the attention- and learning-related changes in average noise correlation were so closely linked to performance but would likely have a minimal effect on performance if the monkeys read out visual information optimally. Similarly, a prior study found that correlations depend on training experience but did not find a relationship between shared variability and information coding (19). Correlated variability should only affect the performance of an optimal decoder when it is aligned with the stimulus dimension being decoded (6). Therefore, the relationship between correlated variability and performance suggests that our monkeys performed suboptimally.

We thus hypothesize that sensory information is decoded in a way that is optimal for the large number of stimuli and tasks that the animals encounter in their natural environment rather than the particular set of stimuli in our task. Traditionally, optimal decoders are trained to discriminate a particular set of stimuli that vary only in one stimulus dimension. This scenario implies a two-step decision process: identifying the stimulus (to optimize the decoder) and then decoding it. If animals could successfully identify the stimulus, they would perform perfectly on our change-detection task.

Instead, animals may use a more general decoder that could, for example, identify the orientation of any stimulus in any task, meaning that the optimal weights would be tuned to, and noise correlations related to, all stimulus features for which the neurons are selective. Noise correlations depend on tuning similarity for all stimulus features (6). Therefore, correlated variability is likely aligned with the dimension that is decoded by a general decoder, meaning that noise correlation decreases would improve performance. Several of the studies that suggest monkeys do behave optimally are those that used multisensory stimuli (20). Determining whether there is evidence that monkeys use decoders that are optimized for diverse stimuli and tasks will be an important avenue for future work. Our results suggest that the relationship between behavior and population activity is a powerful tool for understanding neural computation.

Supplementary Materials

Materials and Methods

Figs. S1 to S13

References (21, 22)

References and Notes

Acknowledgments: M.R.C. is supported by U.S. NIH grants 4R00EY020844-03, R01 EY022930, and Core Grant P30 EY008098s; a Whitehall Foundation grant; a Klingenstein-Simons Fellowship; a Sloan Research Fellowship; a McKnight Scholar Award; and a grant from the Simons Foundation. A.M.N. is supported by a fellowship from the Simons Foundation. We thank K. McCracken for technical assistance and J. H. R. Maunsell and A. Kohn for comments on a previous version of this manuscript. A.M.N., D.A.R., and M.R.C. designed the experiments; A.M.N., D.A.R., J.J.A., and J.S. collected the data; A.M.N. performed the analyses; and A.M.N. and M.R.C. wrote the paper. The authors declare no competing financial interests. Data analyzed in this manuscript are available at
