Dopamine in the brain is essential for learning and motor control. Regarding its role in learning, Wolfram Schultz and colleagues at the University of Cambridge proposed the "reward prediction error" theory in 1997, which holds that dopamine encodes stimulus-reward prediction errors during Pavlovian conditioning, i.e., the difference between the actual reward received and the expected reward signaled by the stimulus. Although this theory has been extended and widely applied to action decisions in the field of reinforcement learning, whether dopamine actually encodes action-reward prediction errors, and how it controls sequential motor behaviors, remain largely unknown. Most previous research, including that of Schultz and colleagues, used discrete external stimuli as reward predictors in Pavlovian conditioning. In real life, however, rewards are often obtained through continuous self-exploration and trial-and-error learning. What role does dopamine play in this kind of exploratory self-learning and behavioral improvement?
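The core of the reward prediction error idea can be sketched in a few lines of code. This is a minimal Rescorla-Wagner-style illustration of the concept, not the model used in the studies discussed here; the learning rate and reward magnitude are arbitrary illustrative values.

```python
def pavlovian_rpe(n_trials, reward=1.0, alpha=0.2):
    """Track the prediction error for a stimulus repeatedly paired with reward.

    The prediction error (delta) stands in for the modeled dopamine signal:
    delta = actual reward - expected reward.
    """
    value = 0.0      # expected reward signaled by the stimulus
    deltas = []
    for _ in range(n_trials):
        delta = reward - value   # prediction error on this trial
        value += alpha * delta   # expectation moves toward the actual reward
        deltas.append(delta)
    return deltas

deltas = pavlovian_rpe(20)
# Early trials: large positive error (reward is unexpected).
# Late trials: error shrinks toward zero (reward is fully predicted),
# mirroring the classic finding that dopamine responses to a
# well-predicted reward diminish with learning.
```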
Figure 1. Reward-evoked dopamine release was suppressed by goal-directed actions.
As shown in Figure 1, in the latest experiments, Professor Jin Xin, Director of the Center for Motor Control and Disease at East China Normal University, Affiliated Professor of Neuroscience at NYU Shanghai, and member of the NYU-ECNU Institute of Brain and Cognitive Science at NYU Shanghai, and his team trained mice to perform intracranial self-stimulation via optogenetic stimulation of dopamine neurons, and used fast-scan cyclic voltammetry to record dopamine release in the dorsal striatum (Figure 1A). They found that, compared with passive rewards of the same intensity, dopamine release evoked by rewards earned through the animals' own actions was significantly reduced, whether the reward followed a single action (Figure 1B-C) or an action sequence (Figure 1D-E). Further experiments showed that nigrostriatal dopamine encodes sequence-specific action-reward prediction errors and may also participate in the hierarchical control of sequential behavior. These results suggest that during self-exploration, dopamine provides a feedback signal for action-outcome learning whenever the external outcome of an action is inconsistent with the animal's internal expectation. Once the animals become familiar with the environment through repeated trial-and-error learning, the learned goal-directed action sends a feedforward efference-copy signal that inhibits the dopamine response and stops learning.
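The interpretation above can be captured in a toy simulation: once an action-outcome expectation has been learned, the goal-directed action supplies a prediction that subtracts from the reward-evoked dopamine response, so an earned reward evokes less dopamine than an identical passive one. This is a hedged sketch of the prediction-error logic, with illustrative numbers that are not taken from the paper.

```python
def dopamine_response(reward, expectation):
    """Modeled dopamine = action-reward prediction error."""
    return reward - expectation

def train_action_outcome(n_trials, reward=1.0, alpha=0.3):
    """Learn the expected outcome of a goal-directed action by trial and error."""
    expectation = 0.0
    for _ in range(n_trials):
        expectation += alpha * (reward - expectation)
    return expectation

# A passive reward arrives unpredicted (no action-based expectation),
# while the same reward earned by a well-learned action is predicted
# by the action's efference copy.
passive = dopamine_response(1.0, expectation=0.0)
earned = dopamine_response(1.0, expectation=train_action_outcome(30))
# earned < passive: the learned action suppresses the reward-evoked response.
```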
Previously, regarding dopamine and motor control, Professor Jin Xin and his collaborators found that nigrostriatal dopamine signals the start and stop of action sequences (Nature, 2010, 466: 457-462) and that dynamic dopamine biases online decision-making and action selection (Neuron, 2017, 93: 1436-1450). Together with the current study, this series of work from Professor Jin Xin's group underscores the importance of dopamine for action selection, operating both on a short timescale by influencing online motor control and on a long timescale by mediating trial-and-error learning. These findings have important implications not only for treating the motor symptoms of Parkinson's disease, but also for improving reinforcement learning algorithms in artificial intelligence. This work was published in Current Biology (2021, 31: 1-14) under the title "Nigrostriatal Dopamine Signals Sequence-Specific Action-Outcome Prediction Errors".
Hollon, N.G., Williams, E.W., Howard, C.D., Li, H., Traut, T.I., & Jin, X. (2021). Nigrostriatal dopamine signals sequence-specific action-outcome prediction errors. Current Biology. https://doi.org/10.1016/j.cub.2021.09.040