F. Wörgötter and B. Porr (2005)

Temporal sequence learning, prediction, and control: A review of different models and their relation to biological mechanisms

Neural Comput, 17(2):245-319.

In this review, we compare methods for temporal sequence learning (TSL) across the disciplines of machine control, classical conditioning, neuronal models for TSL, as well as spike-timing-dependent plasticity (STDP). This review introduces the most influential models and focuses on two questions: To what degree are reward-based (e.g., TD learning) and correlation-based (Hebbian) learning related? And how do the different models correspond to possibly underlying biological mechanisms of synaptic plasticity? We first compare the different models in an open-loop condition, where behavioral feedback does not alter the learning. Here we observe that reward-based and correlation-based learning are indeed very similar. Machine control is then used to introduce the problem of closed-loop control (e.g., actor-critic architectures). Here the problem of evaluative (rewards) versus nonevaluative (correlations) feedback from the environment is discussed, showing that both learning approaches are fundamentally different in the closed-loop condition. In trying to answer the second question, we compare neuronal versions of the different learning architectures to the anatomy of the involved brain structures (basal ganglia, thalamus, and cortex) and the molecular biophysics of glutamatergic and dopaminergic synapses. Finally, we discuss the different algorithms used to model STDP and compare them to reward-based learning rules. Certain similarities are found in spite of the strongly different timescales. Here we focus on the biophysics of the different calcium-release mechanisms known to be involved in STDP.
Relevant for: WP6 hierarchical architectures. Provides a comprehensive review of algorithms for reward-based and correlation-based learning (differential Hebb rules).
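The two rule families the review contrasts can be illustrated side by side. The following is a minimal, hypothetical sketch (all function and variable names are invented for illustration, not taken from the paper): a reward-based TD(0) update driven by a prediction error, next to a correlation-based differential Hebb update driven by presynaptic activity times the temporal derivative of the postsynaptic signal.

```python
import numpy as np

def td_update(v, s, s_next, reward, alpha=0.1, gamma=0.9):
    """Reward-based TD(0): change the value estimate of state s
    in proportion to the reward-prediction error delta."""
    delta = reward + gamma * v[s_next] - v[s]
    v = v.copy()
    v[s] += alpha * delta
    return v

def differential_hebb_update(w, x_pre, dy_post, mu=0.01):
    """Correlation-based differential Hebb rule: weight change is
    proportional to presynaptic activity times the temporal
    derivative of the postsynaptic signal (no reward term)."""
    return w + mu * x_pre * dy_post

# One TD step on a 3-state value table: reward 1.0 on the 0 -> 1 transition.
v = np.zeros(3)
v = td_update(v, s=0, s_next=1, reward=1.0)

# One differential-Hebb step: pre activity 1.0, rising postsynaptic signal.
w = differential_hebb_update(0.0, x_pre=1.0, dy_post=0.5)
```

Both updates are products of a learning rate with a temporally structured error or correlation term, which is one face of the similarity the review establishes in the open-loop condition; the decisive difference appears only when the environment closes the loop and feedback becomes evaluative.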