Intrinsically Motivated
Cumulative Learning
Versatile Robots

Abstracts of IM-CLeVeR Workshop 1

The abstracts of presentations

People from IM-CLeVeR

Baldassarre, Gianluca

(Istituto di Scienze e Tecnologie della Cognizione, CNR)
Piaget processes and action learning.
This presentation aims at illustrating a research agenda, based on computational models, directed to investigate the mechanisms which might underlie the acquisition of a hierarchically organised repertoire of actions in organisms and children and support their astonishing capabilities for cumulative learning. The presentation will start by reconsidering the psychological theory of Piaget on action assimilation and accommodation processes in children, and then it will continue by reviewing some selected neuroscientific facts which might be relevant for understanding the machinery underlying the hierarchical organisation of actions, in particular in relation to the striato-prefrontal and striato-motor loops involving basal ganglia and cortex. The presentation will then highlight some challenges emerging from this analysis, for example related to the learning mechanisms suitable for the acquisition of action hierarchies. Finally, it will present the lines of research which are being followed to tackle such open challenges within IM-CLeVeR, based on modular neural-networks, reinforcement-learning algorithms, and robotic arm control testbeds.

Barto, Andrew

(University of Massachusetts Amherst)
Where do rewards come from?
I describe a series of computational experiments recently carried out by Satinder Singh, Rick Lewis, and me that elucidate aspects of the relationship between ultimate goals (e.g., reproductive success for an animal) and the primary rewards that drive learning. Among the lessons provided by these experiments are a clarification of the traditional notions of extrinsically and intrinsically motivated behavior, and the finding that the precise form of an optimal reward function need not bear a transparent relationship to an agent's ultimate goal.

Gatsoulis, Yiannis

(University of Ulster)
Novelty detection in autonomous learning robots
This presentation will look at previous work in the area of autonomous robots whose task was to highlight novel features in their environment using a range of sensors, including camera images. The ability to detect novelty, that is, to recognise and respond to stimuli that do not fit into the class of expected perceptions, is very useful for animals and robots alike. For animals, novelty detection is an important survival trait: an unexpected perception could signify a potential predator. For robots, a novel stimulus could be some important feature of an environment, a potential problem, or something that has to be learnt.

Guglielmelli, Eugenio

(Università Campus Bio-Medico)
A mechatronic platform for research on intrinsically motivated cumulative learning
This talk will present the design and development of the IM-CLeVeR Board Experiment platform, a modular mechatronic system for investigating intrinsically motivated cumulative learning in ecological conditions in children and monkeys. The basic idea behind the development of this technological tool is to provide a standardised experimental environment where both intrinsic (e.g. curiosity-driven) and extrinsic (e.g. rewards) stimuli can be delivered to an experimental subject in a finely controlled way during the execution of a set of predefined tasks. The sequence of tasks that can be performed using this platform is organized so as to promote a cumulative learning process for developing the ability to manipulate objects with different affordances. Here, the term “affordance” is used in its most general meaning: objects with different grasping patterns are associated with different possible meaningful actions (e.g. rotating or pushing an object shaped as a ball or a cube can control the opening of a food container or other types of rewarding actions). A set of ‘smart’ objects, embedding a variety of sensors and featuring different shapes, dimensions, weights, colours, and real and perceived affordances, is placed on an ‘activity board’ and used to attract the attention of the subject. Quantitative kinematic and dynamic data (e.g. position, orientation, acceleration, grasping force, tactile contact patterns, etc.) are recorded by the system during free interaction in ecological settings, so as to later derive a quantitative, objective behavioural analysis. The technologies embedded in the platform are non-intrusive, usable in ecological scenarios, and small and light enough to fit into common objects that can be manipulated by children (1-10 years old) and by non-human primates (e.g. capuchin monkeys).
The overall platform is a highly modular, flexible research tool that can be easily re-arranged so as to meet the requirements of a wide variety of experimental scenarios in neuroscience, motor control and bioengineering. Its application to early diagnosis and quantitative assessment of neurodevelopmental disorders is also envisaged, based on very promising results deriving from ongoing clinical trials carried out with early prototypes of such technologies. The current status of development of the novel platform will be presented and discussed along with preliminary ideas on the protocols for the experimental trials that are expected to be performed throughout the project, starting in 2010. Finally, the rationale for the future application of the same platform to enable a direct comparative analysis of the learning capabilities of an iCub robotic platform with respect to the performance of children and monkeys will be briefly introduced.

Gurney, Kevin

(University of Sheffield, Department of Psychology)
Learning novel action-outcome associations: computational analysis and biologically plausible modelling
The learning of ‘agency’ in animals requires the development of internal models of novel action-outcome contingencies, presumably stored in associative neuronal networks in the brain. Learning in such networks demands that neural representations of the ‘action’ (including motoric and contextual components), together with representations of ‘outcome’, be presented to the relevant brain circuits. The signal representations constitute ‘training patterns’ for the neural circuits which, after a process of plastic change, are able to store the association. In general, successful learning only takes place if the training data is presented repeatedly and, ideally, within a limited time window. In an action-outcome scenario, this will be accomplished by repeatedly performing the action and observing the outcome until the association has been learned, after which a normal action selection policy is resumed. This repetition bias must also be effective only when new or unexpected outcomes result from the agent’s actions; otherwise learning instability ensues. An additional dimension of the learning process concerns the discovery of the precise (contextual and motoric) components that comprise the action which is most effective in eliciting the novel outcome. This action discovery will be most apparent in situations where the agent is not performing a deliberate investigative behaviour.
In the vertebrate brain, action selection is thought to be mediated by a set of sub-cortical structures, the basal ganglia (BG). Thus, the BG ‘listen’ to action requests from other brain areas and allow only the most salient of these requests to be translated into behaviour. We present agent-based models of repetition bias which make use of plastic change in BG. Further, the plasticity is coupled to a neuronal signalling of novelty mediated by phasic release of the neurotransmitter dopamine. In another model of single neurons in the BG, the groundwork is laid for more complete models of action discovery, by showing pattern matching on neuronal afferents under the influence of phasic dopamine. The models raise several questions about the biological implementation of learning agency which will be discussed.

Law, James (Lee, Mark)

(Aberystwyth University)
Rearing an infant robot the LCAS way
Development in the human infant is restricted by a series of constraints, which limit the infant's action repertoire and sensing capabilities.
Initially, these constraints reduce the perceived complexity of the environment and limit interaction, helping the infant to make sense of the world by preventing an overload of stimuli. Over time, as the infant learns about itself and the environment, these constraints are gradually lifted, allowing the infant more sensor and motor abilities. In this way the infant's development is restrained, giving it time to gain understanding and competence over its abilities before they have a chance to become too complex to comprehend.
The LCAS framework takes this idea of staged development as a basis for a robotic learning architecture, in which the robot is initially allowed access only to a restricted sensor and motor set. These restrictions may be on the access to sensors and motors, or be limitations imposed on them, such as resolution, torque, etc. Using a measure of habituation, constraints are lifted, allowing the robot access to more functionality, only when it has sufficiently learnt to deal with that which it currently has. We use neurologically inspired mechanisms to structure learning and store the resulting information, but these are content neutral, and can support many learning methods.
In this presentation we will review the LCAS approach, show how we are implementing it to support an infant-like development sequence in a robot, and how this might combine with other elements being developed within the IM-CLeVeR project.
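The habituation-gated lifting of constraints described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the LCAS implementation: the class name `StagedLearner`, the stage names, the exponential-average habituation measure, and the threshold and decay values are all hypothetical.

```python
class StagedLearner:
    """Toy staged-development scheme: capabilities unlock as novelty habituates."""

    def __init__(self, stages, threshold=0.1, decay=0.5):
        self.stages = stages        # ordered capability names (hypothetical)
        self.unlocked = 1           # the first stage is available from the start
        self.threshold = threshold  # habituation level that triggers a lift
        self.decay = decay          # weight given to the previous novelty value
        self.novelty = 1.0          # everything is novel at the beginning

    def available(self):
        """Capabilities the robot may currently use."""
        return self.stages[:self.unlocked]

    def observe(self, surprise):
        """Fold one surprise value (0 = fully expected) into the novelty estimate.

        Returns True when a constraint is lifted on this observation.
        """
        self.novelty = self.decay * self.novelty + (1 - self.decay) * surprise
        if self.novelty < self.threshold and self.unlocked < len(self.stages):
            self.unlocked += 1
            self.novelty = 1.0  # the new capability makes the world novel again
            return True
        return False


learner = StagedLearner(["eye", "arm", "hand"])
# Feeding fully predictable experience: novelty halves on every observation.
lifted = [learner.observe(0.0) for _ in range(8)]
```

With these assumed values a constraint is lifted on every fourth fully predictable observation, so all three stages are available after eight observations. Any other habituation measure could be substituted for the running average.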

Mirolli, Marco

(Istituto di Scienze e Tecnologie della Cognizione, CNR)
Integrating knowledge-based and competence-based intrinsic motivations
Since the 1950s researchers in psychology have identified several phenomena that demonstrate the presence of 'intrinsic motivations' in humans and other mammals and have proposed different hypotheses on what these intrinsic motivations might consist of. More recently, the recognition that intrinsically motivated learning is at the root of much of the flexibility and adaptability of real organisms has led researchers in machine learning and autonomous robotics to propose models of intrinsically motivated learning. In both these fields a useful distinction can be made between knowledge-based (e.g. novelty, prediction errors...) and competence-based (e.g. effectance) intrinsic motivations. I present a hierarchical and modular neural network architecture that integrates both kinds of intrinsic motivations. At the low level, genetically evolved internal reinforcers based on current perceptions drive the learning of basic skills. At the high level, signals related to the low-level skill improvements drive the system in deciding which skill to train at each moment. I present the results of a test of our system in which the proposed architecture controls the behavior of a simulated robot that learns to accomplish several different navigation tasks by exploiting the skills it has acquired through intrinsically motivated learning. I discuss the relevance of this work with respect to both the psychological and neuroscientific knowledge related to intrinsic motivations.

Redgrave, Peter

(University of Sheffield, Department of Psychology)
What aspects of reinforcement learning are reinforced by phasic dopamine signals?
In the neuroscience community there is general agreement that the basal ganglia (one of the brain’s ancient and fundamental processing units) play an important role in behavioural selection and reinforcement learning. It is also agreed that within the basal ganglia, the sensory response of midbrain dopaminergic neurones to biologically salient stimuli acts as a reinforcement signal. From this point there is less agreement. The majority view is that the dopamine neurones signal a reward prediction error that is used to reinforce the maximisation of future reward acquisition. For various reasons, which will be covered in the talk, our view is that reinforcement learning can be split into independent processes that have been recognised by biological evolution, and separate mechanisms have been developed accordingly:
   1. A mechanism to determine agency (events in the world for which the agent is responsible), largely independent of any detailed assessment of value. A subsidiary process of agency determination is the development of novel actions; i.e.  identification of the causal aspects of behavioural output by means of trial and error.  It appears the basal ganglia and phasic dopamine reinforcement signal are ideally configured to perform this function, which is intrinsically motivated.
   2. A mechanism to bias future action selections based on outcome value.  Detailed determinations of outcome value are used to maximise future reward acquisition via a mechanism that applies selective bias to the looped inputs of the basal ganglia.  This enables behavioural outcomes associated with high value to have the competitive edge.  
We suggest the IM-CLeVeR project could be advised to adopt the strategy that has been tried and tested in vertebrates for several hundred million years.

Ring, Mark

(Scuola Univ. Profes. della Svizzera Italiana)
Building up
Continual learning is the constant development of increasingly complex behaviors. The dream is to build more complex skills on top of simpler ones such that what is learned now can be built upon and modified later. In 1992 I designed the first continual learners for reinforcement environments. These agents could learn context-dependent skills incrementally and hierarchically without human intervention and without external specification of subgoals. From the development of these early continual-learning agents came several important lessons, which I will describe in this talk. Perhaps the most important lesson is that the concept of hierarchical behavior is an abstraction of far greater descriptive than computational convenience. Reproducing hierarchies of behavior does not necessarily require corresponding hierarchical structures. I will show how to avoid the pitfalls of explicitly represented behavior hierarchies (such as macro operators) by focusing on the transitions between and within behaviors. Temporal Transition Hierarchies (Ring, 1992) represent future expectations as contingencies of sensation and action in ways very similar to Predictive State Representations (though they predate PSRs by a decade). Temporal Transition Hierarchies encode and reproduce hierarchical behaviors in complex, non-Markov environments, and they enable robust, continual learning through the incremental composition of behaviors of increasing complexity.

Schmidhuber, Juergen

(Scuola Univ. Profes. della Svizzera Italiana)
The basic algorithmic principles of intrinsic motivation - Overview of work 1990-2009
In 1990 I built the first artificial agents with intrinsic motivation.
Several additional systems followed in the next two decades. Crucial ingredients of such systems are:
(1) A predictor or compressor of the continually growing data history, reflecting what's currently known about sequences of actions and sensory inputs;
(2) A learning algorithm that continually improves the predictor or compressor (detecting novel spatio-temporal patterns that subsequently become known patterns),
(3) Intrinsic rewards measuring the predictor's or compressor's improvements due to the learning algorithm,
(4) A reward optimizer, which translates those rewards into the action sequences expected to optimize future reward, thus motivating the agent to create additional novel patterns predictable or compressible in previously unknown ways.
I will discuss the following variants:
(A) Intrinsic reward as measured by improvement in mean squared error (1991),
(B) Intrinsic reward as measured by relative entropies between the agent's priors and posteriors (1995),
(C) Learning of probabilistic, hierarchical programs and skills through zero-sum intrinsic reward games (1997-2002),
(D) Mathematically optimal, intrinsically motivated systems driven by compression progress (2006-2009).
I will further argue that science, art, music, comedy, etc. are just by-products of our simple algorithmic framework.
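Ingredients (1) to (3) and variant (A) can be illustrated with a toy example. This is a sketch under stated assumptions, not the author's implementation: the predictor is a per-state running estimate, the names `TablePredictor` and `intrinsic_reward` are hypothetical, and the reward optimizer of ingredient (4) is omitted.

```python
class TablePredictor:
    """Hypothetical predictor: one running estimate per discrete state."""

    def __init__(self, learning_rate=0.5):
        self.learning_rate = learning_rate
        self.estimates = {}

    def predict(self, state):
        return self.estimates.get(state, 0.0)

    def update(self, state, observation):
        # Move the stored estimate part of the way towards the observation.
        old = self.predict(state)
        self.estimates[state] = old + self.learning_rate * (observation - old)


def intrinsic_reward(predictor, state, observation):
    """Reduction in squared prediction error achieved by one learning step."""
    error_before = (observation - predictor.predict(state)) ** 2
    predictor.update(state, observation)
    error_after = (observation - predictor.predict(state)) ** 2
    return error_before - error_after


predictor = TablePredictor()
# A learnable regularity: state "a" always produces the observation 1.0.
rewards = [intrinsic_reward(predictor, "a", 1.0) for _ in range(5)]
```

In this toy setting the reward is largest while the regularity is still being learned and shrinks towards zero once it is known, which is the property that would push a reward-maximising agent on to patterns it has not yet learned.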

Siddique, Mia

(University of Ulster)
Hierarchical/modular structures for accumulative learning
A central issue in current robotics is how to scale up to more complex cognitive abilities, such as prediction and learning in changing environments. Animals, including some insects, are capable of learning, integration of multisensory cues, real-world navigation, and flexible behavioural choice with relatively small resources. Therefore, understanding these mechanisms should lead to efficient robot applications. This assumes that the human learning process is not strictly error minimization, but an accumulation of learning. The present research will introduce frameworks of cognitive models of learning that take granules of information into account and represent the accumulated learning in hierarchical/modular forms so that a new skill can be acquired with minimum effort. The research proposes a modularised neural architecture which is suitable for incremental learning and the addition of modules to the neural network structure. Modules can be pre-trained individually for specific subtasks and then integrated via an integration unit. Individual neural modules can be simple, smaller in size, and trained with smaller data sets; they will thus require less computation and will eliminate crosstalk phenomena (loss of learned skills) such as temporal crosstalk and spatial crosstalk.

Triesch, Jochen

(Goethe University, Frankfurt Institute for Advanced Studies)
Self-organization and reward-modulated learning in recurrent networks
We have recently shown how self-organizing recurrent neural networks learning with a combination of different local plasticity rules can discover temporal structure in their inputs and vastly outperform comparable reservoir computing networks on prediction tasks. Here we extend these networks to reward-modulated learning using reward-modulated spike-timing-dependent plasticity (r-STDP). We apply this idea in two contexts. First, we discuss a model of a sequence prediction task in infants. Second, we present a model for the development of working memory properties in recurrent networks.

People from IM-CLeVeR ISAB - International Scientific Advisory Board

Balkenius, Christian

(University of Lund)
Learning compositional structures by observing dynamical systems
A new way to segment visual scenes is proposed. Instead of focusing on visual properties on their own, visual features are seen as cues about how the objects containing the features will interact with other objects.
By observing how different visual features influence the motion of a moving object, such as a ball, it becomes possible to find compositional structures in the environment. Similar visual features with similar effects on the ball will be considered as instances of the same object category. Once the scene has been segmented into such parts, it becomes possible to predict what will happen when the parts are rearranged.
The method depends on an ability to predict the motion of objects when they do not interact, and any deviations from the expected behavior can be seen as instances of interaction. According to this view, objects are defined according to how they can influence the world, rather than as static entities.

Oudeyer, Pierre-Yves (14th Nov, Epirob)

(INRIA)
Why language acquisition and intrinsic motivation should go hand in hand
Language acquisition and intrinsic motivation are two topics which have mainly been studied separately both in developmental robotics and psychology. In this talk, I will show that they should in fact be studied together, especially if one wants to build developmental robots that may learn language in real complex environments. I will begin by outlining the big challenges of language acquisition in humans and robots, especially those related to the acquisition of meaning. In this context, I will explain that many essential meanings learnt at the onset of language are rooted in sensorimotor representations, and affordances in particular. Thus, learning linguistic meanings implies the ability to learn motor affordances. While social learning mechanisms are essential in this process, I will explain why they are not sufficient in real complex sensorimotor spaces in which it is essential that the robot/human infant learns affordances by self-experimentation. Besides, self-experimentation through motor babbling can only be efficient if exploration is guided and organized, which is one of the main functions of intrinsic motivation. I will illustrate this point by describing several experiments in which a robot efficiently learns low-level motor skills and affordances, driven by a computational model of intrinsic motivation used as an active learning heuristic. Furthermore, I will argue that intrinsic motivation conceptualized as active learning can also be essential to allow true interactive social language learning, where it allows both the teacher and the learner to control the growth of complexity in linguistic interactions. I will conclude by outlining a number of challenges implied by this joint study of language and intrinsic motivation.

Oudeyer, Pierre-Yves (IM-CLeVeR presentation)

(INRIA)

The challenges of active learning and intrinsic motivation for learning motor control in high-dimensional robots

Learning motor control in robots, such as learning visual reaching or object manipulation in humanoid robots, is becoming a central topic both in "traditional" robotics and in developmental robotics.
A major obstacle is that learning can become extremely slow or even impossible without adequate exploration strategies. Active learning and intrinsic motivation are two converging approaches, but they differ in their underlying assumptions. In this talk, I will try to articulate these two approaches in the context of developmental robotics, and show that important challenges remain to be addressed to achieve efficient exploration and motor learning in high-dimensional sensorimotor spaces.

Sutton, Richard

(University of Alberta)
Core learning algorithms for intrinsically motivated robots

This talk will present recent progress in the development of core learning algorithms that may be useful in creating systems with nontrivial cognitive-developmental trajectories. The most distinctive feature of such systems is that their learning is continual and cumulative. They never stop learning, and new learning builds upon the old. For such learning, the algorithms must be incremental and operate in real time, and in the past the most suitable algorithms for such cases have been gradient-descent algorithms. Our new algorithms extend temporal-difference learning so that it becomes a true gradient-descent method, which greatly extends its robustness and generality. In particular, we have obtained for the first time temporal-difference methods for off-policy learning with function approximation, including nonlinear function approximation, and for intra-option learning. This talk will not present these algorithms in technical detail, but instead stress the several natural roles that they could play in systems that set their own goals and explore the world in a structured, intrinsically motivated way.
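For readers unfamiliar with this family of methods, the following is a minimal sketch of one published gradient-based temporal-difference update, TDC with linear function approximation. The abstract does not say which algorithms the talk covers, so the choice of TDC is an assumption, as are the two-state toy chain, the step sizes and all function names.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def tdc_update(theta, w, phi, reward, phi_next, alpha, beta, gamma):
    """One TDC step: a TD(0) update plus a gradient-correction term.

    theta holds the value-function weights; w is a second weight vector
    that estimates the expected TD error for the current features.
    """
    delta = reward + gamma * dot(theta, phi_next) - dot(theta, phi)
    phi_w = dot(phi, w)
    new_theta = [t + alpha * (delta * p - gamma * pn * phi_w)
                 for t, p, pn in zip(theta, phi, phi_next)]
    new_w = [wi + beta * (delta - phi_w) * p for wi, p in zip(w, phi)]
    return new_theta, new_w


def run_chain(steps=200_000, alpha=0.05, beta=0.1, gamma=0.9):
    """Toy problem (assumed): two states that alternate deterministically."""
    features = [[1.0, 0.0], [0.0, 1.0]]  # one-hot features per state
    rewards = [0.0, 1.0]                 # reward received on leaving each state
    theta, w = [0.0, 0.0], [0.0, 0.0]
    state = 0
    for _ in range(steps):
        next_state = 1 - state
        theta, w = tdc_update(theta, w, features[state], rewards[state],
                              features[next_state], alpha, beta, gamma)
        state = next_state
    return theta


theta = run_chain()
```

On this chain the true values solve V0 = 0.9 V1 and V1 = 1 + 0.9 V0, and the learned weights approach them. The example is on-policy and tabular for simplicity; it does not demonstrate the off-policy or nonlinear settings mentioned in the abstract.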

Schlesinger, Matthew

(Southern Illinois University, Psychology Department)
Getting "value" from vision: investigating where infants look, and why
Visual exploration is a critical aspect of interaction with the environment, and more importantly, the primary mode available to young infants (who do not yet manipulate their environment by reaching, crawling, etc.). In this talk I describe ongoing work with Dima Amso and Scott Johnson, in which we have been studying how 3-month-old infants begin to understand basic properties of objects. In particular, we are interested in how parallel developments in attention and oculomotor skill help infants learn to perceive partially-occluded objects. Our modeling approach borrows several fundamental properties of the mammalian visual system, and builds on these in order to simulate how young infants use vision to explore. I highlight the complementary roles of visual salience and oculomotor control, and discuss how an error or learning signal from an oculomotor control system can serve as a bootstrap for reinforcement learning in the context of intrinsically-motivated visual exploration.

Verschure, Paul

(Universitat Pompeu Fabra, Laboratory for Synthetic Perceptive, Emotive and Cognitive Systems - SPECS)
The multi-level neuronal organization of perception, cognition and action in a Synthetic Forager
We aim at building an autonomous synthetic foraging robot (SF) that is based on the neuronal, cognitive and behavioural principles underlying optimal foraging in rodents. The perceptual, cognitive and behavioural control systems of SF are based on the Distributed Adaptive Control (DAC) cognitive architecture. In this presentation I will define the DAC neuromorphic architecture (1), which proposes how different levels of the neuraxis, from the brainstem to the neocortex, interact to give rise to perception, cognition and action. I will describe our current mobile system, which integrates chemosensing, vision and proximity sensing to achieve adaptive foraging in chemical environments. As particular examples of how we model the control systems of SF, I will discuss the reactive regulation of stereotyped behaviors, the integration in the hippocampus of spatial and cue information originating in the medial and lateral entorhinal cortex respectively, and the acquisition and execution of rules in the pre-frontal cortex.

Other outstanding scientists working on topics relevant for IM-CLeVeR

Ballard, Dana

(University of Texas, Department of Computer Sciences)
Modular reinforcement learning as a model of embodied cognition
To make progress in understanding human visuo-motor behavior, we will need to understand its basic components at an abstract level. One way to achieve such an understanding would be to create a model of a human that has a sufficient amount of complexity so as to be capable of generating such behaviors. Technological advances in VR allow significant progress to be made in this direction. Graphics models that simulate extensive human capabilities can be used as platforms from which to develop synthetic models of visuo-motor behavior. Currently such models can capture only a small portion of a full behavioral repertoire, but for the behaviors that they do model, they can describe complete visuo-motor subsystems at a useful level of detail. The value in doing so is that the body’s elaborate visuo-motor structures greatly simplify the specification of the abstract behaviors that guide them. Essentially, one is faced with proposing an embodied “operating system” model for picking the right set of abstract behaviors at each instant. We outline one such model. Its centerpiece uses MDP reinforcement learning modules to guide behavior.

Grupen, Rod

(University of Massachusetts Amherst)
The developmental organization of dexterous robot behavior

"Dexterity" is the bridge between a physical and an intellectual relationship with the world. Humans have evolved to be bipedal with hands free for manipulation and defense. This evolutionary choice has had a tremendous impact on the co-evolving human brain as well as the perceptual categories, motor synergies, control knowledge, and representations that support interactions with the world. This talk will start by reviewing how human infants exploit a variety of developmental processes involving growth and maturation to defeat computational complexity and write their own behavioral programs.
We will discuss:
     * Kinematic, Dynamic, and Maturational Structure
     * Dynamical Systems Approach
     * Neurological Structure - Developmental Reflexes and Composability
     * A Developmental Assembler
     * Intrinsic Motivation for Affordance Discovery
     * Schemata and Programming by Demonstration
The talk will include experimental demonstrations of the ideas using our bimanual humanoid Dexter and a new mobile manipulator concept, uBot-5, that has been designed as a personal robot that can learn and accommodate the individual needs of healthcare clients. We will illustrate computational analogs of accommodation and assimilation, and demonstrate simple behavioral hierarchies that are acquired using developmental principles.

Botvinick, Matthew

(University of Massachusetts Amherst)
Hierarchical structure in brain and behavior
Recent research in psychology and neuroscience has begun to consider how task/subtask hierarchies are learned and exploited, and how such hierarchies are represented in the brain.  I'll review some of the relevant work, looking both at empirical findings and some theoretical interpretations, with attention to the question of how intrinsic motivation might be involved in constructing hierarchical action representations.

Lisman, John

(Brandeis University)
Towards an integrated view of memory and action selection; the role of dopamine in these processes
Dopamine has been implicated in reward and action selection, but more recent work implicates dopamine in memory processes, in particular the processes that make memory persistent. This work comes from both studies of memory behavior and studies of LTP, the synaptic modification thought to underlie behavior. One conclusion of this work is that LTP is short-lasting unless the synapses receive a second type of message coming from the dopamine-containing axons of the midbrain structure called the VTA. This raises the question of what causes dopamine cells to fire. There appear to be several different inputs to the system, but one comes from the hippocampus itself, which signals unexpected events. This hippocampal-VTA loop may itself be influenced by motivational factors. The information recorded in the hippocampus is commonly thought of as coming from the external world; however, the massive PFC input indicates that goals and actions are also recorded. This suggests that the hippocampal input to the accumbens could be the basis for repeating successful actions. I will present a model of action selection by the PFC, hippocampal, VTA, basal ganglia system.

Merrick, Kathryn

(University of New South Wales)
Curious characters: from virtual worlds to sentient homes
In natural systems such as animals and humans, curiosity motivates individuals to explore and seek out interesting stimuli on which to focus their attention. Embedding computational models of curiosity in artificial systems permits the design of agents that can focus their attention autonomously and choose their own goals. The result is a new kind of self-adaptive agent with the capacity for emergent, creative behaviour not envisaged by system engineers. This talk will discuss three models of curious agents using reflexes, reinforcement learning and supervised learning. Applications of these models will be presented including non-player characters in computer games, anomaly detection, curious robots and intelligent environments. The talk will conclude with a discussion of open questions regarding the design and evaluation of curious agents.

Sirois, Sylvain

(University of Manchester)
Learning dynamics in habituation
Research on cognitive abilities in young infants has, over the past two decades, suggested a wealth of complex cognitive abilities that, if taken seriously, would make the very idea of developmental psychology a vacuous enterprise. Whether deliberate or not, this work endorses a form of nativism that would make the study of cognition the realm of molecular biology and transform psychologists into mere behavioural taxonomists. In this talk, I review the methods behind such work, as well as their shortcomings. I then discuss a formal process model of infant habituation, and look at how well it reproduces behaviour when embedded in a robot. I also review recent empirical support for predictions from the model. I then return to the issue of advanced cognition, and report on new studies that examine whether and how infants “understand” impossible events. The studies make use of a proper factorial design that independently and jointly examines the roles of perception and cognition. I further examine the role of pupil dilation data as a complement to looking time measures. The results suggest that perception is what drives apparent “conceptual” behaviour. This is good news for developmental psychologists, and has the added benefit of being consistent with genetics and neuroscience. Implications for computational models / robotics are discussed.

Yamashita, Yuichi

(RIKEN Brain Science Institute, Lab. for Behavior and Dynam. Cognition)
Self-organized functional hierarchy in a multiple timescale neural network model: synthetic neuro-robotic approach
It is generally thought that skilled behavior in human beings results from functional hierarchy of the motor control system, within which reusable motor primitives are flexibly integrated into various complex sensori-motor sequence patterns. The underlying neural mechanisms governing the way in which continuous sensori-motor flows are segmented into primitives, and the way in which series of primitives are integrated into complex sequential behavior, have however not yet been clarified. In earlier studies, this functional hierarchy has been realized through the use of explicit hierarchical structure, with local modules representing motor primitives in the lower level and a higher module representing sequences of primitives switched via additional mechanisms such as gate-selecting. Complex sensori-motor sequences, however, are not easily handled in such earlier models due to a conflict, induced by this explicitly separated modular structure, between generalization and segmentation. To address this issue, we propose a different type of neural network model. The current model neither makes use of separate local modules to represent primitives, nor introduces explicit hierarchical structure. Rather than forcing architectural hierarchy onto the system, functional hierarchy emerges through a form of self-organization that is based on two distinct types of neurons, each with different time properties ("multiple timescales"). Such multiple timescales lead to complex sensori-motor flows of skilled behavior being segmented into reusable primitives, and the primitives, in turn, are flexibly integrated into novel sequences. In experiments, the proposed network model, coordinating the physical body of a humanoid robot through high-dimensional sensori-motor control, also successfully situated itself within a noisy environment. The idea proposed here of a functional hierarchy which self-organizes through multiple timescales in neural activity, in addition to contributing to further neurophysiological investigations of the motor control system, could also contribute to the study of various regions of the brain where similar mechanisms may be observed.
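The effect of giving units different time properties can be illustrated with a leaky-integrator sketch. This is only an assumed simplification, not the network described in the abstract: the units here are not interconnected or trained, and the update rule, the time constants (2 and 50) and the function names are chosen for illustration.

```python
def leaky_step(u, drive, tau):
    """One update of a leaky integrator; a larger tau means slower dynamics."""
    return (1.0 - 1.0 / tau) * u + drive / tau


def simulate(drive_sequence, tau):
    """Return the unit's internal state after each input in the sequence."""
    u = 0.0
    trace = []
    for drive in drive_sequence:
        u = leaky_step(u, drive, tau)
        trace.append(u)
    return trace


# A rapidly alternating input: five steps at +1, then five steps at -1, repeated.
drive = ([1.0] * 5 + [-1.0] * 5) * 20
fast = simulate(drive, tau=2.0)   # follows each swing of the input
slow = simulate(drive, tau=50.0)  # averages over the swings and barely moves
```

The fast unit tracks every switch of the input while the slow unit stays close to zero. This separation of timescales is the kind of property that, in a full recurrent model, could let slow units carry sequence-level context while fast units carry moment-to-moment detail.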