Research Article

Spiking neurons can discover predictive features by aggregate-label learning


Science  04 Mar 2016:
Vol. 351, Issue 6277, aab4113
DOI: 10.1126/science.aab4113


Credit assignment in the brain

To discover relevant clues for survival, an organism must bridge the gap between the short time periods when a clue occurs and the potentially long waiting times after which feedback arrives. This so-called temporal credit-assignment problem is also a major challenge in machine learning. Gütig developed a representation of the responses of spiking neurons whose derivative defines the direction along which a neuron's response changes most rapidly. By using a learning rule that follows this derivative, the temporal credit-assignment problem can be solved by training a neuron to match its number of output spikes to the number of clues. The same learning rule endows unsupervised neural networks with powerful learning capabilities.

Science, this issue p. 10.1126/science.aab4113

Structured Abstract


INTRODUCTION

Opportunities and dangers can often be predicted on the basis of sensory clues. The attack of a predator, for example, may be preceded by the sounds of breaking twigs or whiffs of odor. Life is easier if one learns these clues. However, this is difficult when clues are hidden within distracting streams of unrelated sensory activity. Even worse, they can be separated from the events that they predict by long and variable delays. To discover those clues, a learning procedure must bridge the gap between the short epochs within which clues occur and the time when feedback arrives. This “temporal credit-assignment problem” is a core challenge in biological and machine learning.


RATIONALE

A neural detector of a sensory clue should fire whenever the clue occurs but remain silent otherwise. Hence, the number of output spikes of this neuron should be proportional to the number of times that the clue occurred. The reversal of this observation is the core hypothesis of this study: A neuron can identify an unknown clue when it is trained to fire in proportion to the clue’s number of occurrences. This “aggregate-label” hypothesis entails that when a neuron is trained to match its number of output spikes to the magnitude of a feedback signal, it will identify a set of clues within its input activity whose occurrences predict the feedback. This learning requires neither knowledge of the times nor of the absolute number of individual clues.
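The aggregate-label setting can be made concrete with a toy data generator: a fixed clue pattern is embedded an unknown number of times in distracting background activity, and the only supervision is the occurrence count, not the times. The sketch below is a minimal illustration of this setup, not the paper's benchmark; all sizes, rates, and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

N_AFFERENTS = 50   # input channels (hypothetical)
T = 500            # trial duration in 1-ms bins (hypothetical)
CLUE_LEN = 50      # duration of the embedded clue pattern

# A fixed "clue": a sparse 50-ms spike pattern across all afferents.
clue = rng.random((N_AFFERENTS, CLUE_LEN)) < 0.02

def make_trial(n_clues):
    """Background spikes with n_clues copies of the clue embedded at
    random, non-overlapping times. The aggregate label is n_clues: it
    reports how often the clue occurred, but not when."""
    x = rng.random((N_AFFERENTS, T)) < 0.005   # distractor activity
    slots = rng.choice(T // CLUE_LEN, size=n_clues, replace=False)
    for s in slots * CLUE_LEN:
        x[:, s:s + CLUE_LEN] |= clue
    return x, n_clues  # (input spike raster, aggregate label)

x, label = make_trial(3)
print(x.shape, label)
```

A learner that succeeds at matching its output spike count to such labels has, implicitly, localized the clue in time without ever being told where it was.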


RESULTS

To implement aggregate-label learning, I calculated how neurons should modify their synaptic efficacies in order to most effectively adjust their number of output spikes. Because a neuron’s discrete number of spikes does not provide a direction of gradual improvement, I derived the multi-spike tempotron learning rule in an abstract space of continuous spike threshold variables. In this space, changes in synaptic efficacies are directed along the steepest path, reducing the discrepancy between a neuron’s fixed biological spike threshold and the closest hypothetical threshold at which the neuron would fire a desired number of spikes. With the resulting synaptic learning rule, aggregate-label learning enabled simple neuron models to solve the temporal credit-assignment problem. Neurons reliably identified all clues whose occurrences contributed to a delayed feedback signal. For instance, a neuron could learn to respond with different numbers of spikes to individual clues without being told how many different clues existed, when they occurred, or how much each one of them contributed to the feedback. This learning was robust to high levels of feedback and input noise and performed well on a connected speech-recognition task.
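The flavor of this rule can be caricatured in a few lines. The sketch below is a deliberate simplification under assumed parameters: the actual multi-spike tempotron differentiates the continuous spike-threshold variables described above, whereas this toy version merely potentiates the inputs behind the highest subthreshold voltage peak when spikes are missing, and depresses the inputs behind the last spike when there are too many. All names, time constants, and learning rates are hypothetical.

```python
import numpy as np

def simulate(w, x, tau=20.0, thresh=1.0):
    """Leaky integrator over 1-ms bins; returns output spike times and
    the (post-reset) voltage trace."""
    decay = np.exp(-1.0 / tau)
    v, V, spikes = 0.0, [], []
    for t in range(x.shape[1]):
        v = v * decay + w @ x[:, t]
        if v >= thresh:
            spikes.append(t)
            v = 0.0          # reset after each output spike
        V.append(v)
    return spikes, np.array(V)

def eligibility(x, t_star, tau=20.0):
    """Exponentially filtered input up to t_star: roughly each synapse's
    contribution to the voltage around that time."""
    ts = np.arange(t_star + 1)
    return x[:, :t_star + 1] @ np.exp(-(t_star - ts) / tau)

def multispike_update(w, x, target, lr=1e-2):
    """One crude step toward the target spike count (a stand-in for the
    spike-threshold-space gradient). Modifies w in place, returns the
    spike-count error before the update."""
    spikes, V = simulate(w, x)
    err = target - len(spikes)
    if err > 0:     # too few spikes: push the highest subthreshold peak up
        w += lr * eligibility(x, int(np.argmax(V)))
    elif err < 0:   # too many spikes: pull the last crossing down
        w -= lr * eligibility(x, spikes[-1])
    return err
```

Iterating such updates over trials drives the neuron's output spike count toward the aggregate label, which is the operational core of the learning scheme described above.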

Aggregate-label learning enabled populations of neurons to solve unsupervised learning tasks by relying on internally generated feedback signals that amplified correlations between the neurons’ output spike counts. These self-supervised networks discovered recurring constellations of input patterns even if they were rare and distributed over spatial and temporal scales that exceeded the receptive fields of individual neurons. Because learning in self-supervised networks is driven by aggregate numbers of feature occurrences, it does not require temporal alignment of the input activities of individual neurons. When competitive interactions between individual neurons were mediated through the internal feedback circuit, the formation of feature maps was possible even when the features’ asynchrony incapacitated lateral inhibition.
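The self-supervision idea only requires that the internal feedback amplify correlations between the neurons' output spike counts. The fragment below is a toy stand-in for such a circuit, not the paper's actual feedback rule: each neuron's target count is scaled up when the rest of the population is also active, so co-active neurons reinforce one another while silent neurons receive no drive. The functional form and the gain are assumptions.

```python
import numpy as np

def internal_feedback(counts, gain=0.25):
    """Toy internal feedback: each neuron's target spike count equals its
    own count scaled by the mean activity of the other neurons, so
    correlated firing is amplified and uncorrelated silence is not."""
    counts = np.asarray(counts, dtype=float)
    mean_others = (counts.sum() - counts) / (len(counts) - 1)
    return counts * (1.0 + gain * mean_others)

# Two co-active neurons and one silent neuron: the active pair's targets
# grow, the silent neuron's target stays at zero.
print(internal_feedback([4, 4, 0]))
```

Feeding such targets back into an aggregate-label update closes the loop: no external teacher is needed, because the population's own spike-count statistics supply the feedback signal.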


CONCLUSION

Aggregate-label learning solves the long-standing question of how neural systems can identify features within their input activity that predict a delayed feedback. This solution strongly enhances the known learning capabilities of simple neural circuit models. Because the feedback can be external or internal, these enhancements apply to supervised and unsupervised learning. In this framework, both forms of learning converge onto the same rule of synaptic plasticity, inviting future research on how they cooperate when brains learn.

Membrane potential traces of a model neuron before and after learning to detect reward-predictive sensory clues.

Before learning, top trace; after learning, second through fifth traces from top; clues, colored squares. Each clue occurrence is represented as a spike pattern within the neuron’s input activity (raster plot). After learning, the number of output spikes (vertical deflections) elicited by each clue encodes the clue’s contribution to a delayed reward.


Abstract

The brain routinely discovers sensory clues that predict opportunities or dangers. However, it is unclear how neural learning processes can bridge the typically long delays between sensory clues and behavioral outcomes. Here, I introduce a learning concept, aggregate-label learning, that enables biologically plausible model neurons to solve this temporal credit-assignment problem. Aggregate-label learning matches a neuron’s number of output spikes to a feedback signal that is proportional to the number of clues but carries no information about their timing. Aggregate-label learning outperforms stochastic reinforcement learning at identifying predictive clues and is able to solve unsegmented speech-recognition tasks. Furthermore, it allows unsupervised neural networks to discover recurring constellations of sensory features even when they are widely dispersed across space and time.
