Technical Comments

Do Neurons Predict the Future?


Science  11 Jan 2002:
Vol. 295, Issue 5553, p. 227
DOI: 10.1126/science.295.5553.227a

Hasegawa et al. (1) claimed that neuronal responses in the monkey's prefrontal cortex track the animal's past performance and even predict its future performance. These conclusions were derived from an analysis of the correlation between the animal's performance and the neuronal response at various time lags. This correlation analysis, in my view, does not prove that neurons track the past or predict the future.

The analysis of Hasegawa et al. is incomplete in at least three respects. (i) Hasegawa et al. did not test whether the locations of peaks and troughs in the correlation functions deviated significantly from zero time lag, but the conclusions of the study rely on the significance of these deviations. (ii) The employed permutation test likely overestimated the number of significant correlations. If, for example, there were gradual fluctuations in the monkeys' performance, and there were also (independent) gradual fluctuations in the firing rate of the neurons, such that there was a nonzero correlation for some time shift, then random permutations of the behavioral data would tend to destroy the gradual fluctuations and to replace them with more random fluctuations. Therefore, random permutations would be expected to have a lower correlation with the neuronal firing rates, which implies that the permutation test is inadequate. (iii) The behavioral data and the neuronal responses were smoothed with a Gaussian kernel, which would tend to result in an overestimation of the magnitude of the correlation. Moreover, the shape of the correlation function would be determined mainly by the kernel.
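Point (ii) can be illustrated numerically. The following is a minimal sketch, not the analysis used in either study: it assumes that gradual fluctuations can be approximated by Gaussian-smoothed white noise, tests only the zero-lag correlation, and uses hypothetical parameter values (100 trials, a kernel of 5 trials, 200 shuffles). Two independent slow series are generated repeatedly, and each pair is submitted to a trial-shuffling permutation test.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def slow_series(n, sigma, rng):
    """White noise smoothed with a Gaussian kernel (a stand-in for gradual drift)."""
    radius = int(3 * sigma)
    noise = [rng.gauss(0.0, 1.0) for _ in range(n + 2 * radius)]
    kernel = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    return [sum(w * noise[t + k] for k, w in enumerate(kernel)) for t in range(n)]

def permutation_p(x, y, n_shuffles, rng):
    """Two-sided p-value for r(x, y) against trial-shuffled copies of y."""
    observed = abs(pearson(x, y))
    y_perm = list(y)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(y_perm)  # shuffling destroys the slow structure of y
        if abs(pearson(x, y_perm)) >= observed:
            hits += 1
    return (hits + 1) / (n_shuffles + 1)

def false_positive_rate(n_pairs=60, n_trials=100, sigma=5.0, n_shuffles=200, seed=0):
    """Fraction of independent slow pairs that the shuffle test calls significant."""
    rng = random.Random(seed)
    significant = 0
    for _ in range(n_pairs):
        x = slow_series(n_trials, sigma, rng)  # hypothetical "firing rate" drift
        y = slow_series(n_trials, sigma, rng)  # independent "performance" drift
        if permutation_p(x, y, n_shuffles, rng) < 0.05:
            significant += 1
    return significant / n_pairs

rate = false_positive_rate()
print(f"fraction of independent pairs judged significant: {rate:.2f}")
```

If the permutation test were adequate for such data, the printed fraction would be close to the nominal 0.05. With slowly varying series it should come out far higher, because the shuffled surrogates have lost the autocorrelation that the real series share.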

A simulation substantiates these criticisms. Suppose that there is a gradual fluctuation in the monkey's motivation over trials (Fig. 1A, dotted curve). This can be modeled as low-pass filtered noise, n_1(t). The monkey's trial-to-trial performance is not identical to the dotted curve, because it is either 0 (error) or 1 (correct), which should be modeled as a Bernoulli process. Convolution of the performance function in a simulated experiment with a Gaussian kernel (standard deviation of 2 trials) yields the dashed curve in Fig. 1A. It was assumed that the responses of the neurons are 35% dependent on the motivation for the current trial and 65% dependent on some other random process n_2(t) (2). Thus, the mean firing rate of the neuron on trial t, f(t), was set to

f(t) = 0.35 n_1(t) + 0.65 n_2(t)    (1)

However, the actual firing rate on any trial has a distribution (i.e., is noisy), which was modeled as a Gaussian distribution with mean equal to f(t) and variance equal to 1.3 f(t)^1.1, in accordance with cortical physiology (3, 4). The neuronal response across trials was smoothed with a Gaussian kernel (standard deviation of 2 trials), and is shown as the thin continuous curve in Fig. 1A.
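One simulated case of this kind can be sketched as follows. This is an illustration under stated assumptions rather than the code behind Fig. 1: the low-pass filter is approximated by heavy Gaussian smoothing (8 trials), both slow processes are rescaled to the range 0.2 to 0.9 so that the motivation can serve as the probability of a correct trial, and the gain of 20 spikes/s is hypothetical. Only the 35%/65% weighting, the variance law, and the 2-trial smoothing kernel come from the text.

```python
import math
import random

def gaussian_smooth(x, sigma):
    """Convolve x with a Gaussian kernel, renormalizing the weights at the edges."""
    radius = int(math.ceil(3 * sigma))
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    out = []
    for t in range(len(x)):
        acc = norm = 0.0
        for k, w in zip(offsets, kernel):
            if 0 <= t + k < len(x):
                acc += w * x[t + k]
                norm += w
        out.append(acc / norm)
    return out

def rescale(x, lo, hi):
    """Linearly map x onto the interval [lo, hi]."""
    xmin, xmax = min(x), max(x)
    return [lo + (hi - lo) * (v - xmin) / (xmax - xmin) for v in x]

def simulate_case(n_trials=200, seed=1):
    rng = random.Random(seed)
    # Two independent slow processes: n1 is "motivation", n2 is unrelated drift.
    # Assumption: low-pass filtering is approximated by Gaussian smoothing.
    n1 = rescale(gaussian_smooth([rng.gauss(0, 1) for _ in range(n_trials)], 8.0), 0.2, 0.9)
    n2 = rescale(gaussian_smooth([rng.gauss(0, 1) for _ in range(n_trials)], 8.0), 0.2, 0.9)
    # Bernoulli performance: 1 (correct) with probability n1(t), else 0 (error).
    performance = [1 if rng.random() < p else 0 for p in n1]
    # Mean rate: 35% motivation, 65% other process (gain is a hypothetical scale).
    gain = 20.0
    f = [gain * (0.35 * a + 0.65 * b) for a, b in zip(n1, n2)]
    # Gaussian rate noise with variance 1.3 * f**1.1, clipped at zero.
    rate = [max(0.0, rng.gauss(m, math.sqrt(1.3 * m ** 1.1))) for m in f]
    # Both series are smoothed with a Gaussian kernel of 2 trials.
    return gaussian_smooth(performance, 2.0), gaussian_smooth(rate, 2.0)

perf_s, rate_s = simulate_case()
print(len(perf_s), len(rate_s))
```

Note that performance and firing rate are coupled only through the value of n1 on the same trial, so any time shift found in their correlation arises from noise and not from the model.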

Figure 1

Simulated case that appears to predict future performance. (A) Dotted curve reflects the motivation, n_1(t). Dashed curve shows actual performance after smoothing with a Gaussian kernel, and thin continuous curve shows smoothed neuronal response. Thick curve is performance shifted 8 trials back in time. (B) Correlation between performance and neuronal response before smoothing (dashed curve) and after smoothing (continuous curve). (C and D) Distribution of time lags yielding a (C) maximal or (D) minimal correlation that passed the permutation test. Positive shifts correspond to cases that appear to predict the future.

After simulation of 191 cases, 28 cases with a positive correlation were obtained that passed the permutation test. The example of Fig. 1A was chosen from these 28 cases. The correlation between performance and neuronal activity for this case is shown in Fig. 1B. The maximal correlation was obtained with a shift of 8 trials in the future direction (see also thick curve in Fig. 1A). Smoothing with a Gaussian kernel changes the shape of the correlation function and increases its maximal amplitude (Fig. 1B). The shifts of all cases with a significant positive and negative correlation are shown in Fig. 1, C and D. There are more significant positive correlations than negative ones. This is caused by n_1(t), which determines performance and also influences the neuronal response. The ratio between the number of significant positive and negative correlations can be adjusted by changing the contribution of the two stochastic processes n_1(t) and n_2(t) to f(t). Higher contributions of n_1(t) give rise to a larger proportion of positive correlations. The real correlation caused by n_1(t) is without time shift, but the peaks in most correlation functions are shifted (5). These shifts are caused by the stochastic relationship between the motivation and the monkey's performance, as well as by the stochastic relationship between the motivation and the neuronal response.
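The shift at which a correlation function peaks can be computed as in the sketch below. The function name, the sign convention (positive lag pairs the response with later performance), and the toy data are assumptions made for illustration; the toy response is built to equal performance three trials later, so the peak lies at +3 by construction.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def lagged_correlation(performance, response, max_lag):
    """Return {lag: r} between response(t) and performance(t + lag).

    Positive lag pairs the response with later performance ("predicting the
    future"); negative lag pairs it with earlier performance.
    """
    n = len(performance)
    out = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            resp, perf = response[:n - lag], performance[lag:]
        else:
            resp, perf = response[-lag:], performance[:n + lag]
        out[lag] = pearson(resp, perf)
    return out

# Toy data: response(t) is an exact copy of performance(t + 3).
rng = random.Random(0)
base = [rng.gauss(0.0, 1.0) for _ in range(103)]
performance = base[:100]
response = base[3:]
corr = lagged_correlation(performance, response, max_lag=10)
peak_lag = max(corr, key=corr.get)
print(peak_lag, round(corr[peak_lag], 3))
```

Applied to noisy, smoothed series whose true coupling is at zero lag, the same function will often report a peak away from zero, which is the effect summarized in Fig. 1, C and D.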

I conclude that the data of Hasegawa et al. (1) are consistent with a dependence of neuronal activity on the motivation for the present trial. The study did not show that prefrontal neurons track the monkey's past performance and predict its future performance.


Response: Roelfsema makes three main arguments: (i) that the analytical method of our study (1) was invalid; (ii) that the Gaussian smoothing we employed exaggerated the significance of our results; and (iii) that our data can be simulated with a zero-shift stochastic method. All of these points are incorrect.

Roelfsema argues that a significant low-frequency component in the behavioral function would be destroyed by our permutation method, which would render questionable the validity of our results. However, our data did not have a dominant low-frequency component. A spectral analysis of our real and shuffled data revealed that in most cases the peak frequency of the real performance data fell in the middle of the peak frequencies of the 500 shuffled data sets (Fig. 1A) and that the real and shuffled data clustered, for the most part, in the same range, with a few low-frequency exceptions for the real data (Fig. 1B). Across the sample, the median of the peak frequencies of the real performance data (32 sets per cycle) was not significantly different (Wilcoxon signed-rank test, P > 0.05) from the median of the peak frequencies of the shuffled data (25.6 sets per cycle). In addition, the median of the peak frequencies of the real neuronal data was identical to that of the real performance data. Because the permutation left the power spectra unchanged, the permutation method was valid.
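A comparison of spectral peaks between real and shuffled series can be sketched as follows. This is an illustrative reconstruction, not the analysis code behind Fig. 1: it assumes a plain discrete Fourier transform with the mean removed, reports the peak as a period in samples per cycle (analogous to "sets per cycle"), and uses a toy sinusoid with 50 shuffles in place of real performance data with 500 shuffles.

```python
import cmath
import math
import random
import statistics

def peak_period(x):
    """Period (samples per cycle) of the largest non-DC peak in the power spectrum."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    best_k, best_power = 1, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                    for t, v in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return n / best_k

# Toy series with one dominant component: 4 full cycles in 64 samples.
n = 64
signal = [math.sin(2 * math.pi * t / 16) for t in range(n)]
real_peak = peak_period(signal)

# Peak periods of shuffled copies of the same values.
rng = random.Random(0)
shuffled_peaks = []
for _ in range(50):
    perm = signal[:]
    rng.shuffle(perm)
    shuffled_peaks.append(peak_period(perm))

print(real_peak, statistics.median(shuffled_peaks))
```

In this toy case the series has a single dominant slow component, so shuffling scatters the peak across periods. The argument in the text is that the real performance data lacked such a component, which is why the real and shuffled peaks were found to be similar.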

Figure 1

(A) Spectral analysis for a case that produced a significant positive correlation between performance and neuronal activity with a time lag. The peaks of the power spectra for the real data were 16.0 sets per cycle for neuronal activity (top row) and 21.3 sets per cycle for performance (middle row). Five hundred shuffles of the performance data produced widely ranging peaks, but the median of the 500 peaks was similar to the original peak (bottom row). (B) Scattergrams of the peak of the power spectra (sets per cycle) in the original performance against the median of 500 peaks from the shuffled data for all significant neurons. It is apparent that permuting the performance did not massively change the power spectra.

We smoothed the impulse function to estimate the continuous performance function that underlay the noisy original data. This did increase the absolute values of the correlation coefficients, and therefore the estimate of how much of the variance was explained by the correlation with performance. However, it did not change the significance of those correlations, because the r-values of the permuted data also increased. Increasing the sigma of the Gaussian kernel makes the maximum r-value larger, but the r-values of the permuted data also increase; at higher sigma values, significance in fact decreases.
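The dependence of r-values on kernel width can be checked numerically for series that are unrelated by construction. The sketch below is illustrative only and is not the analysis code of either party; the series length, number of pairs, and sigma values are hypothetical. It averages |r| over pairs of independent white-noise series after smoothing both with the same Gaussian kernel.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def gaussian_smooth(x, sigma):
    """Convolve x with a Gaussian kernel, renormalizing the weights at the edges."""
    radius = int(math.ceil(3 * sigma))
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    out = []
    for t in range(len(x)):
        acc = norm = 0.0
        for k, w in zip(offsets, kernel):
            if 0 <= t + k < len(x):
                acc += w * x[t + k]
                norm += w
        out.append(acc / norm)
    return out

def mean_abs_r(sigma, n_pairs=100, n_trials=100, seed=0):
    """Average |r| between pairs of independent white-noise series after smoothing."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        x = [rng.gauss(0, 1) for _ in range(n_trials)]
        y = [rng.gauss(0, 1) for _ in range(n_trials)]
        if sigma > 0:  # sigma = 0 means no smoothing
            x, y = gaussian_smooth(x, sigma), gaussian_smooth(y, sigma)
        total += abs(pearson(x, y))
    return total / n_pairs

for sigma in (0.0, 1.0, 2.0, 4.0):
    print(f"sigma={sigma}: mean |r| = {mean_abs_r(sigma):.3f}")
```

Because smoothing reduces the effective number of independent samples, the average |r| between unrelated series should grow with sigma. Whether significance is preserved then depends on whether the surrogate data are smoothed in the same way before the comparison.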

Roelfsema's behavioral model fails to simulate our behavioral data, so the fact that the model's neural correlate simulates our data without a time shift is meaningless. The motivation function that Roelfsema used to simulate our behavioral data was filtered at 32 sets per cycle to remove high-frequency noise. This filter resulted in a much larger low-frequency component (median = 51 sets per cycle in 100 simulations that produced significantly positive correlations) than our data (median = 32 sets per cycle). Given this unrealistic emphasis on low-frequency components, it is not surprising that Roelfsema's simulation worked without requiring shifts. When we duplicated Roelfsema's simulation using a motivation function with the filter set at 20 sets per cycle, we were able to generate a frequency power spectrum (median = 36.4 sets per cycle) that more nearly resembled our behavioral data. This motivation function, using his original weighting coefficients, failed to generate the same proportion of significantly positive correlations that we found (Fig. 2A). Instead, we had to use an n_1 weight of 0.75, which in our simulation did produce an average of 28.5 significant correlations (in 100 runs of 171 neurons each). However, the distribution of the time shifts of these maximum correlations was a Gaussian distribution with its peak at 0 (Fig. 2B). A major point of our paper was that we did not find a Gaussian distribution of shifts; instead, few (2 in 28, or 7%) of our neurons had shifts between –1 and +1, which indicated a bimodal rather than a Gaussian distribution. Using the motivational function and weight coefficients that simulated our behavioral data, we were unable to generate any set of simulated neurons that had maximum correlations with 7% or fewer neurons in the –1 to +1 range, in 100 simulations (Fig. 2C).

Figure 2

(A) The number of significant positive (left) and negative (right) neurons at different n_1 weights for two high-cut filtering methods (threshold 32 and 20 sets per cycle). Each plot is the average of 6 sessions of 171 simulations. Horizontal dashed line is the number of neurons of original data, indicating that the appropriate n_1 weight ranges between 0.65 and 0.8 for the positive neurons. (B) Average and SEM of the number of significant positive (left) and negative (right) neurons from 100 sessions of 171 simulations with n_1 weight = 0.75. (C) Distribution of percent of cases that had 7% (vertical line) or fewer neurons with maximal correlations at shifts of −1 to +1.

To summarize, we have shown that the permutation method is valid when applied to our data, that the Gaussian kernel does not render the correlations falsely significant, and that a zero-shift model does not simulate both our behavioral and neuronal data. Our methods do indeed demonstrate that the background activity of prefrontal neurons reflects neuronal processes that track general aspects of behavior with significant time lags or leads.

