S. Singh, R. L. Lewis, and A. G. Barto (2009)

Where Do Rewards Come From?

In: Proceedings of the 31st Annual Conference of the Cognitive Science Society, ed. by N. A. Taatgen and H. van Rijn, pp. 2601-2606.

Reinforcement learning has achieved broad and successful application in cognitive science in part because of its general formulation of the adaptive control problem as the maximization of a scalar reward function. The computational reinforcement learning framework is motivated by correspondences to animal reward processes, but it leaves the source and nature of the rewards unspecified. This paper advances a general computational framework for reward that places it in an evolutionary context, formulating a notion of an optimal reward function given a fitness function and some distribution of environments. Novel results from computational experiments show how traditional notions of extrinsically and intrinsically motivated behaviors may emerge from such optimal reward functions. In the experiments these rewards are discovered through automated search rather than crafted by hand. The precise form of the optimal reward functions need not bear a direct relationship to the fitness function, but may nonetheless confer significant advantages over rewards based only on fitness.
Keywords: Reinforcement Learning; Markov Decision Processes; Semi-Markov Decision Processes; Hierarchy; Temporal Abstraction
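
The abstract describes a search over candidate reward functions for the one whose trained agents achieve the highest expected fitness across a distribution of environments. The sketch below illustrates that loop under toy assumptions (two-armed bandit environments, an epsilon-greedy learner, and a novelty-bonus candidate reward); it is not the paper's actual experimental setup, and every name and parameter in it is hypothetical. Note that the agent learns only from its internal reward, while fitness is scored separately, mirroring the paper's distinction between reward and fitness.

```python
import random

# Illustrative sketch only: a toy instance of searching for an "optimal
# reward function" given a fitness measure and a distribution of
# environments. Not the authors' implementation.

def sample_env():
    """Draw one environment: a two-armed bandit with random payoff probabilities."""
    return [random.random(), random.random()]

def run_agent(reward_fn, env, steps=200, eps=0.1):
    """Epsilon-greedy learner driven by the *internal* reward signal.
    Returns the total external payoff (fitness), which the agent never sees."""
    q = [0.0, 0.0]        # value estimates of internal reward per arm
    counts = [0, 0]       # visit counts per arm
    total_fitness = 0.0
    for _ in range(steps):
        arm = random.randrange(2) if random.random() < eps else q.index(max(q))
        payoff = 1.0 if random.random() < env[arm] else 0.0
        total_fitness += payoff
        counts[arm] += 1
        r = reward_fn(payoff, counts[arm])    # internal reward, not fitness
        q[arm] += (r - q[arm]) / counts[arm]  # incremental mean update
    return total_fitness

def expected_fitness(reward_fn, n_envs=50):
    """Average fitness of agents trained with reward_fn over sampled environments."""
    return sum(run_agent(reward_fn, sample_env()) for _ in range(n_envs)) / n_envs

# Candidate reward functions: one equal to fitness, one adding an
# "intrinsic" novelty bonus that decays with visit count.
candidates = {
    "extrinsic only": lambda payoff, n: payoff,
    "extrinsic + novelty bonus": lambda payoff, n: payoff + 1.0 / n,
}

best = max(candidates, key=lambda name: expected_fitness(candidates[name]))
print("reward function with highest expected fitness:", best)
```

In this toy setup the novelty-bonus reward can outperform the fitness-only reward by driving extra exploration, echoing the abstract's point that the optimal reward function need not mirror the fitness function directly.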