Coding Reward Versus Punishment
Reinforcement learning is driven by reward prediction error, and an influential theory proposes that dopamine neurons supply this signal as the brain's teaching signal for value. Although the signal is named a reward prediction error, it has been assumed to represent aversiveness as well, so that the dopamine signal alone could suffice for learning total value. Fiorillo (p. 546) found that dopamine neurons are insensitive to aversiveness and therefore cannot encode total value by themselves, implying that an analogous, separate signal for aversiveness must exist.
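The reward prediction error at the heart of this theory is usually formalized as the temporal-difference (TD) error. The following minimal sketch is illustrative only, not from the paper: the discount factor, learning rate, and the specific values are assumptions chosen to show how an unexpected reward yields a positive error (a dopamine burst) and an omitted expected reward a negative one (a pause).

```python
# Minimal temporal-difference (TD) learning sketch. The prediction error
# "delta" is the signal the dopamine theory attributes to dopamine neurons.
# All constants and values below are illustrative assumptions.

GAMMA = 0.9   # discount factor (assumed)
ALPHA = 0.1   # learning rate (assumed)

def td_error(reward, value_next, value_current, gamma=GAMMA):
    """Reward prediction error: observed outcome minus expectation."""
    return reward + gamma * value_next - value_current

def td_update(value_current, delta, alpha=ALPHA):
    """Nudge the current value estimate toward the observed outcome."""
    return value_current + alpha * delta

# Unexpected reward -> positive error (burst); omitted expected reward ->
# negative error (pause).
surprise = td_error(reward=1.0, value_next=0.0, value_current=0.0)  # +1.0
omission = td_error(reward=0.0, value_next=0.0, value_current=1.0)  # -1.0
new_value = td_update(0.0, surprise)                                #  0.1
print(surprise, omission, new_value)
```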
Abstract
Whereas reward (appetitiveness) and aversiveness (punishment) have been distinguished as two discrete dimensions within psychology and behavior, physiological and computational models of their neural representation have treated them as opposite sides of a single continuous dimension of “value.” Here, I show that although dopamine neurons of the primate ventral midbrain are activated by evidence for reward and suppressed by evidence against reward, they are insensitive to aversiveness. This indicates that reward and aversiveness are represented independently as two dimensions, even by neurons that are closely related to motor function. Because theory and experiment support the existence of opponent neural representations for value, the present results imply four types of value-sensitive neurons corresponding to reward-ON (dopamine), reward-OFF, aversive-ON, and aversive-OFF.
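To make the contrast concrete, the sketch below is my illustration rather than the paper's model: the baseline firing rate and sign conventions are assumptions. It contrasts a one-dimensional "net value" code, in which punishment is simply negative reward, with the four opponent channels the abstract proposes. For a purely aversive event, the one-dimensional code predicts a dopamine pause, whereas the reported finding is that the dopamine (reward-ON) response is unchanged.

```python
# Illustrative contrast between a one-dimensional "net value" code and the
# two-dimensional, four-channel scheme described in the abstract. The
# baseline rate and unit gains are assumed for illustration only.

BASELINE = 5.0  # spikes/s, assumed baseline firing rate

def one_dimensional_code(reward, aversiveness):
    # Single continuous "value" axis: aversiveness is just negative reward.
    return BASELINE + (reward - aversiveness)

def four_channel_code(reward, aversiveness):
    # Reward and aversiveness coded independently, each by an opponent
    # ON/OFF pair, matching the four cell types the abstract proposes.
    return {
        "reward_ON (dopamine)": BASELINE + reward,        # activated by reward evidence
        "reward_OFF":           BASELINE - reward,        # suppressed by reward evidence
        "aversive_ON":          BASELINE + aversiveness,  # activated by aversive evidence
        "aversive_OFF":         BASELINE - aversiveness,  # suppressed by aversive evidence
    }

# Purely aversive event: the single-axis code drops below baseline (a
# predicted dopamine pause), while in the four-channel code the dopamine
# (reward-ON) channel stays at baseline.
print(one_dimensional_code(reward=0.0, aversiveness=2.0))  # 3.0
print(four_channel_code(reward=0.0, aversiveness=2.0))     # reward_ON stays at 5.0
```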