How (Not) to Model Autonomous Behaviour
Ezequiel A. Di Paolo, Hiroyuki Iizuka
CSRP 588
April 2007
ISSN 1350-3162
Cognitive Science Research Papers
How (not) to model autonomous behaviour
Ezequiel A. Di Paolo (1) & Hiroyuki Iizuka (1, 2)
1. Centre for Computational Neuroscience and Robotics (CCNR)
Centre for Research in Cognitive Science (COGS)
University of Sussex
Brighton, BN1 9QH, UK
[email protected]
2. Department of Media Architecture, Future University-Hakodate
116-2 Kamedanakano-cho, Hakodate, Hokkaido, 041-8655, Japan
[email protected]
To appear in BioSystems, Special issue on Modelling Autonomy.
Abstract.
Autonomous systems are the result of self-sustaining processes of constitution of
an identity under precarious circumstances. They may transit through different
modes of dynamical engagement with their environment, from committed
ongoing coping to open susceptibility to external demands. This paper discusses
these two statements and presents examples of models of autonomous behaviour
using methods in evolutionary robotics. A model of an agent capable of issuing
self-instructions demonstrates the fragility of modelling autonomy as a function
rather than as a property of a system’s organization. An alternative model of
behavioural preference based on homeostatic adaptation avoids this problem by
establishing a mutual constraining between lower-level processes (neural
dynamics and sensorimotor interaction) and higher-level metadynamics
(experience-dependent, homeostatic triggering of local plasticity and reorganization). The results of these models are lessons about how strong
autonomy should be approached: neither as a function, nor as a matter of
external vs. internal determination.
Keywords: biological autonomy, modelling autonomous behaviour, evolutionary
robotics, self-setting of goals, behavioural preference.
1. Introduction
In this paper I would like to establish two important points about autonomy that
stem from a careful analysis of the continuity between life and cognition, and a
third point by implication. The two main messages are: 1) autonomous systems always originate in self-sustaining
processes of constitution of an identity under precarious circumstances and 2)
such processes can be dynamically manifested in different modes of engaging
with the world ranging from committed coping to open susceptibility. The
implication of these two points will be that 3) current work in "autonomous"
robotics based on ideas of automated synthesis of design (e.g., evolutionary
robotics) and dynamical systems approaches to cognition, is still far from
achieving or even modelling autonomy in the strong sense advocated here, but
that this work may be at the same time the surest route to this goal. I will
concentrate for the most part of the paper on discussing examples of recent work
in evolutionary robotics. One case illustrates the insufficiency of thinking about
autonomy in terms of functions and another example shows that at least some
interesting aspects of the organization of autonomous behaviour can be
modelled fruitfully once we take points 1 and 2 more seriously. Both cases,
however, constitute “good” examples of the role of modelling in clarifying
complex concepts such as autonomy.
2. Why should autonomous systems generate their own identity?
I will work under the assumption that autonomous systems, i.e., systems capable
in some non-trivial sense of setting their own laws, exist, and that living systems
provide the clearest, least controversial examples of such autonomy (even if it
may still be possible to discuss autonomous systems that are non-living; or let's
say, remain agnostic about the possibility). That autonomy is not an illusion is far
from evident for Western thought. This is in fact because it is often suspected to
be a purely ascriptional property – one that will simply vanish upon closer
inspection. Autonomy remains such a slippery concept if examined under the
magnifying glass of reductionist physicalism. If we are to avoid mysteries, an
autonomous system must follow only the laws of physics, hence it cannot set its own laws; therefore autonomous systems do not really exist, they are just convenient ways of
talking. For Kant, in his Critique of Judgment, the intrinsic teleology of organisms
was similarly unreachable by pure reason and yet it was so evident that he
proposed it should remain as a regulative concept, i.e., we may talk about
organisms as if they had purposes of their own, but only as a convenient shorthand for
(not quite well-known) physical events. With autonomy the situation is
analogous (and this is no accident). However, the above argument is rather
absolutist in its terminological interpretation (what is a law? what is a system?)
and its ignoring of the complex possibilities of self-organization of multi-scale
physical processes of formation of constraints and structures. More gravely, the
argument just too quickly takes sides in the conflict between two kinds of very
real experiences: the experience of the physical world as regular and describable
in terms of laws and the experience of our perceived teleology and autonomous
behaviour in others and, most importantly, in ourselves. On what basis are two
reliable and repeatable experiences to be discriminated as real or unreal? History
tells us that this is a naive formulation and that conflict breeds novel
understanding by dialectical synthesis rather than by decreeing a winning position. This is Hans Jonas's rebuttal of the lukewarm Kantian recognition of the importance, but not the properly ontological status, of intrinsic teleology. We
can know life because we ourselves are alive (Jonas, 1966; Weber & Varela, 2002;
Di Paolo, 2005; Di Paolo, Rohde & De Jaegher, forthcoming).
Let's just boldly state that living organisms are autonomous – they follow laws
set up by their own activity. Fundamentally, they can only be autonomous by
virtue of their self-generated identity as distinct entities. A system whose identity
is fully specified by a designer and cannot, by means of its own actions,
regenerate its own constitution, can only follow the laws contained in its design,
no matter how plastic, adaptive, or life-like its performance. In order for a system
to generate its own laws it must be able to build itself at some level of identity. If a
system ‘has no say’ in defining its own organization, then it is condemned to
follow an externally given design like a laid-down rail track. It may be endowed
with ways of changing its behaviour depending on history, but at some level it
will encounter an externally imposed functional (as opposed to physical)
limitation to the extent to which it can change. This can only be avoided if the
system's limitations result partly from its own dynamics.
Here we find already a point to be taken seriously by those who pursue the goal
of building an autonomous system artificially. It would be wrong to think that
the quest for artificial autonomy is futile by definition (to design what cannot be
externally constructed). In fact, a subtle change of attitude should take place to
start recasting the job of a designer of artificial systems. Once this attitude has
changed, there is no contradiction in the idea of strong artificial autonomy. A
design process is now transformed into the design of the right conditions
(appropriate material substrate and organization) for an autonomous identity to
constitute itself. Evolutionary robotics, as we shall see, has made important steps
in this novel methodological direction.
The autonomy of a self-constituted system is by no means unconstrained (being
able to influence one's own limitations does not imply being able to fully remove
them; on the contrary it means being able to set up new ways of constraining
one's own actions). Hans Jonas (1966) speaks of life as sustaining a relation of
needful freedom with respect to its environment. Matter and energy are needed to
fuel metabolism. In turn, metabolism sustains its form (its identity) by
dynamically disassociating itself from specific material configurations.
Let’s provide a definition of an autonomous system.
An autonomous system is defined as a system composed of several processes that
actively generate and sustain an identity under precarious conditions. By identity we
mean the joint properties of self-distinction and operational closure. The two
properties go hand in hand. Operational closure, in a non-trivial sense, indicates
the property that among the enabling conditions for any constituent process in
the system one will always find one or more other processes in the system (i.e.,
there are no component processes that are not conditioned by other processes in
the network, which does not mean, of course, that other conditions external to
the system are not necessary as well for such processes to exist). Self-distinction
therefore means the property of a process/component of belonging to such
a network of enabling conditions (i.e., it is the relation of closure that defines
whether a process/component belongs or not to the system), and more strongly,
of actively affirming the identity of the system by its own operation. By precarious
we mean the fact that in the absence of the organization of the system as a
network of processes, under otherwise equal physical conditions, isolated
component processes would tend to run down or extinguish.
The above definition makes the concept of autonomy operational. It should be
clear that by expressions like ‘self-constitution’ and ‘generating its own laws’ no
mysterious vitalism is intended. By saying that a system is self-constituted, we
mean that its dynamics generate and sustain an identity. An identity is generated
whenever a precarious network of dynamical processes becomes operationally
closed. This means that at some level of description, the conditions that sustain
any given process in such a network are provided by the operation of the other
processes in the network, and that the result of their global activity is an
identifiable unity in the same domain or level of description. Autonomy as
operational closure is intended to describe self-generated identities at many
possible levels (Varela, 1979; 1991; 1997).
For instance, autocatalytic cycles are an example of an operationally closed
system in the domain of chemical reactions: by definition, the cycle is capable of
sustaining and regenerating itself (given enough supplies) and, at a formal level
of description, it defines its own identity: a chemical reaction either belongs or
does not belong to an autocatalytic cycle. This identity defines the interactive
properties of the system but the history of interaction may also alter the process
of continuous identity generation; hence the sense of self-law.
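To make the membership criterion concrete, here is a deliberately toy sketch (our own illustration, not part of the formal definition above): enabling relations between processes are treated as a directed graph, and a process is taken to belong to the closed network only if it is both enabled by, and itself enables, other processes in the network.

```python
# Toy sketch of the closure-based membership criterion. The network,
# the function name and the pruning rule are illustrative assumptions,
# not a formal reconstruction of operational closure.

def closed_core(enablers):
    """enablers: dict mapping each process to the set of processes that
    enable it (enabling conditions external to the candidate system are
    simply omitted). Returns the largest subset in which every process
    is enabled by, and itself enables, some other process in the subset."""
    core = set(enablers)
    changed = True
    while changed:
        changed = False
        for p in list(core):
            enabled = bool(enablers[p] & (core - {p}))
            enabling = any(p in enablers[q] for q in core if q != p)
            if not (enabled and enabling):
                core.discard(p)
                changed = True
    return core

# A, B and C enable one another in a cycle (autocatalysis-like); D is
# enabled by C but contributes nothing back, so it does not belong.
network = {"A": {"C"}, "B": {"A"}, "C": {"B"}, "D": {"C"}}
print(closed_core(network))   # the set {'A', 'B', 'C'}
```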
For a living system, the self-identifying processes are themselves processes of
material construction and transformation resulting in a self-distinct physical
form. The constraint of physical construction seems to carry some non-trivial implications for the condition of operational closure. Having a distinct, self-built
unity allows us to ground notions of behaviour and agency consistently, in ways
that autocatalytic cycles do not permit. The implication of this is that our
definition of autonomy may have to be refined in the future to better grasp the
implications of physical self-construction. The notion of precariousness does part
of this job, but this issue is not pursued further here.
The definition provided above fits well the case of the constitutive autonomy of
living systems: their metabolic organization. However, the definition is carefully
worded so as to avoid the conclusion that this is the only possible instantiation of
an autonomous system. Indeed, we find that several layers of behaviour up to
the case of social interactions are able to meet the operational requirements of
autonomy (or at least there is no reason of principle why they should not).
Robotics (the tool for modelling autonomy discussed in this paper) can therefore
aim at “catching” the constitutive dynamics of identity generation at some of
these higher levels in order to capture forms of non-metabolic generation of
values and self-determination (forms that are enabled but underdetermined by
metabolism, for detailed discussions on this topic see Jonas, 1966, Di Paolo, 2003,
2005). In other words, we suggest that there are ways of modelling and maybe
even instantiating artificial autonomy that do not require building a fully
autopoietic artificial system.
In this respect, it is important to indicate that cognitive systems are also
autonomous in an interactive sense in terms of their engagement with their
environment as agents and not simply as systems coupled to other systems
(Moreno & Etxeberria, 2005; Di Paolo, 2005). As such, they not only respond to
external perturbations in the traditional sense of producing the appropriate
action for a given situation, but do in fact actively regulate the conditions of their
exchange with the environment, and in doing so, they enact a world or cognitive
domain.
Viewing cognitive systems as autonomous is to reject the traditional poles of
seeing cognition as responding to an environmental stimulus on the one hand,
and as satisfying internal demands on the other – both of which subordinate the
agent to a role of obedience. It is also to recognize the ‘ongoingness’ of
sensorimotor couplings that lead to patterns of perception and action twinned to
the point that the distinction is often dissolved. Autonomous agency goes even
further than the recognition of ongoing sensorimotor couplings as dynamical
and emphasizes the role of the agent in constructing, organizing, maintaining,
and regulating those closed sensorimotor loops. In doing so, the cognizer plays a
role in determining what are the laws that it will follow, what is the ‘game’ that
is being played.
The focus on biological autonomy and agency is a radical departure from
decades of theories that subordinate cognition to the demands and instructions
of either the environment or internal sub-agential modules meant to represent
theoretical constructs such as instincts or drives. And it is only made more
radical by the connection between the constitutive and interactional aspects of
autonomy that is the basis of the idea of sense-making (Varela 1997; Thompson,
2007; Di Paolo, 2005), the bringing forth of a world of significance.
3. A fable about the dynamics of everyday life
When trying to understand autonomous behaviour it may be instructive to take a
look at the ongoing cycles of activity in normal everyday life and how they are
often very different from the performances that are studied in psychology,
neuroscience, cognitive science and AI/robotics. The locus of study in the
majority of work in these disciplines is in general the single performance of an
act – the recognition of a pattern, the enactment of a choice, the attainment of a
goal, etc. – and the factors and mechanisms involved. It is only rarely the
ongoing flow of behaviour that is of concern, i.e., the different modes of
engaging with the environment and the autonomous constitution of future
engagement such as the emergence of novel goals. In new AI and robotics we
find a strong, almost exclusive, emphasis on situated action and ongoing coping.
This is typically a mode of performance rich in sensorimotor couplings and
focused engagement with the task at hand (navigating towards a goal,
hammering down a nail, etc). This mode is highly robust. There are very few
distractors that will break down the flow of coping activity. But coping does not
always run smoothly. There may be breakdowns of different kinds that demand
some effort of re-adaptation in order to return to the goal-driven activity. Most of
the current work in robotics is about ongoing coping and a fraction of it (dealing
with adaptation and learning) is about facing and resolving breakdowns in
coping.
Do these two concepts cover all the possibilities that we may encounter in the
flow of an animal's (or a human's) everyday activity? This would imply that we
lead very busy lives going constantly from one well-defined action to the next,
that we are always only coping or dealing with some breakdown and that there
is a clear purpose at each moment. If we pay more attention to the temporal
organization of goal-seeking coping, we will find that there is one more possible
mode of activity. Coping behaviour starts with an intentional demand to fulfil an
objective. It is by definition motivated. This motivation or goal is not necessarily
fixed or independent of the activity that ensues which is driven by an initial
intention or solicitation from the current situation (i.e., a demand or need either
external or internal to the agent). However, the fate of all coping, goal-oriented
activity is, by its intentional nature, success, abandonment, or frustration
(irrecoverable breakdown or simply unattainability). In all cases, coping ceases.
We may describe this as the self-extinction of all well-formed behaviour. If self-extinction does not occur, then we are dealing with compulsive, possibly
pathological action (obsessive repetition, moths attracted to the candle flame,
etc.).
What happens after self-extinguished coping? It is simply contrary to everyday
experience to assume that new goals will immediately follow from the
attainment or frustration of previous ones (we are of course not ignoring the
possibility of hierarchical organization of tasks into sub-tasks in which case the
next set of activities is generally well-defined, but this is not the only possibility).
In fact, our experience tells us that there are moments of certain openness to the
possibilities afforded by our situation (such openness can clearly be very
different depending on the affective outcome of the previous coping task). While
distractors were robustly ignored during coping, now in an open state with
undefined goals, an agent may be drawn by environmental or internal events
into forming a novel intention and retroactively investing such a “distracting”
event with meaning (for instance, I decide to put down the page I’m reading, I
take a deep breath and look around my desk aimlessly, the sound of a car horn in
the distance makes me look out of the window and on seeing the garden I notice
that some maintenance is now long overdue, I decide to go and do some work
there, it was one of the things I was previously intending to do, but not just now,
the car horn “reminded” me of it). So there are durable states of dynamical
openness and susceptibility to micro-events that are qualitatively very different
from intentional, goal-driven coping. Openness does not self-extinguish by the
logical structure of an intentional act, but it is bound to be extinguished by its
very nature of high susceptibility.
The different modes are represented in figure 1. Dynamically speaking, we could
venture the hypothesis that coping relates to low-dimensional, highly robust,
coordinated body/environment dynamics, whereas openness relates to high-dimensional, typically unstable, uncorrelated dynamical modes, and that switching between one and the other is not symmetrical. Yet the emerging
picture is one of transiting between very different dynamical modes, between
stability and instability. Maybe, in order to be able to synthesize artificial
autonomous systems, we must first understand better what sort of dynamical
system can generate such different regimes.
[Figure 1 about here]
There is a further step in the story of transiting between different modes of
activity that is well supported by phenomenological analysis for the case of
human agency. This is the active and regulated constitution of goals. The path
from a state of openness to new coping simply happens for most animals.
The new goal doesn’t wait long before it appears. But humans can bring about a
recursive constitutive skill to this passage by the act of simply asking: And now
what? What was it that I had planned to do? Human agency is therefore marked by a further level of autonomy, one supported by socio-linguistic skills, which down-regulates behaviour by actively constituting new goals. However, this higher form
of autonomy is beyond the scope of the present paper.
4. Limitations of pure evolutionary robotics
Let us now turn to the problem of modelling autonomous behaviour. Let us
immediately clarify the obvious but easily forgotten fact that models and
instantiations are very different things. An instantiation is supposed to be a
proper member of the class of phenomena under study. By contrast, a model
need not be; it can be crude – ignoring most interesting aspects of the phenomenon – and yet be extremely useful. In general, simple
models tend to be scientifically very powerful. This is because the purpose of a
model is not to replicate a phenomenon, but to help explain it. There are lots of
ways in which this can happen that do not involve producing an instantiation:
models can show us the mistakes in our assumptions, they can be explanatorily rich in the way they actually fail to capture the phenomenon of interest, they can
act as proofs of concept, they can generate novel hypotheses, and generally they
can help re-organize complex ideas by exercising and questioning our intuitions.
The models on self-generation of goals discussed later in this paper help us think
about the concept of strong autonomy discussed in the previous section without
either of them coming close to being an instantiation of this concept.
Modelling (and instantiating) autonomous behaviour is the goal of robotics.
However, in robotics the term autonomous is often used very loosely. It can
mean anything from mobile, un-tethered, adaptive, to self-recharging or self-powered. In the sense of not being constantly controlled from the outside and
being able to cope with a noisy, real-world environment, mobile robots imitating
simple lifeforms have been seriously investigated for the last two decades
(Brooks, 1991) and have noble ancestors such as W. Grey Walter's tortoises
(Walter, 1950). Such robots move about using simple but powerful principles of
engaged interaction and achieve robust performance in the absence of explicit control at the level of attaining a certain goal. Robust performance emerges
from the interaction of simple mechanisms with body and environmental
dynamics. These robots exploit loose couplings with the environment to achieve
sustained behaviour. But what about autonomous agents in a sense that is closer
to the autonomy of living systems, agents capable of setting their own laws? As
argued above, as long as a system is externally designed (even in terms of
eventual changes that it may undergo in its organization) and not allowed to
constitute itself, it cannot really be autonomous in the strong sense. Its goals are
not set by itself but by the designer; they are extrinsic to it. However, interesting
behaviour that approaches different aspects of autonomy is often observed once
the designer starts constraining the process of design at increasingly removed
levels.
Evolutionary robotics (ER) is still proving a useful and open-ended method for
exploring this increasingly less constraining role of the designer that may be
required to achieve strong artificial autonomy. ER hands over the task of filling in
design specifications pertaining to mechanisms, morphology, structural and
functional organization to an automatic process of artificial evolution (Harvey et
al. 1997, Nolfi & Floreano, 2000). Thus, instead of designing a robot that must
explore the environment but should go to the green light when the battery is
down, one can attempt to design a robot that more generally must keep the
battery up during its explorations, or more implicitly, a robot that explores
indefinitely. In principle, there may be different ways of achieving this broader
goal, and artificial evolution can find many of these ways and select for robots
that opportunistically choose the most convenient route towards the general goal
of maintaining the battery charged (it may imply taking advantage of a different source of energy than the one we intended as designers). A proper, strongly
autonomous agent (in a sufficiently complex environment affording many
alternative routes towards a goal) would certainly maximize the selection
criteria. So, one could hope, all that needs to be done is evolve such robots for
sufficiently long times and such autonomous agents will emerge eventually.
Unfortunately, this is too optimistic a view and relies on a misunderstanding
about artificial evolution. Unlike the open-endedness of natural evolution
(operating on systems that are already autonomous), artificial evolution tends to
be conservative rather than innovative. Or, rather, its innovation resides in that it
often finds simpler, cleverer solutions than the ones we expect as designers. This
is what makes it a powerful scientific tool to debunk myths and clarify preconceptions (Harvey et al. 2005). Artificial evolution is capable of producing such
results because it works outside the box of design constraints that limit the way
we think about the system and the task it must achieve. But as a process
operating on statistical information about a set of tested solutions to a problem, it
will always run the risk of getting stuck on solutions that are statistically
mediocre and thus finding it hard to explore elements of design that are initially
neutral but that allow novel possibilities if they are reliably present.
The best way to illustrate this is with an example. Tuci, Quinn, and Harvey
(2002) have investigated landmark learning behaviour in a mobile, Khepera-like,
robot controlled by a continuous-time, recurrent neural network (CTRNN)
without synaptic plasticity. That learning is afforded without changes to a
network’s connectivity is already one major demonstration of the power of ER to
break pre-conceptions. This work intended to reproduce previous work by
Yamauchi and Beer (1994) on landmark learning in one dimension, but with an
agent moving freely in a two dimensional arena. The task is simple: find the
location of a goal that cannot be seen from the distance using information
provided by a fixed light. In half the cases, the light is next to the goal, in the
other half the light is far from the goal. Approaching the light and remembering
the relation to the goal would enable an agent to learn in which of the conditions
it finds itself, and then on later trials move either towards the light or away from
it, but always towards the goal. Yamauchi and Beer had to use a modular decomposition of functions into sub-networks to solve this task. It was the
suspicion of Tuci and colleagues that this was unnecessary. However, if the
fitness function were simply to count how many times the agent found the goal
after the first trial (minimising designer involvement), no learning would evolve.
Effectively, evolution settles on a fixed strategy (e.g., always move left) and finds
the goal 50% of the time – a result that will not be unfamiliar to practitioners of ER. Why is the robot not using the light? Because the light is on average
uncorrelated with the goal position, and fitness is measured as the average of
many trials with the same number of presentations in each of the two possible
situations (landmark far, landmark near). Hence, the safest bet given this lack of
correlation is that the light is a distractor and hence it should be ignored.
Tuci et al., realizing this, solved the problem by introducing an artificial bias in
the selection process (effectively becoming more involved in it as designers).
During the initial phase of the evolutionary process, they gave extra rewards to
robots that approached the light (on top of whether they also approached the
goal or not). This forced the evolutionary process to select neurocontrollers that
responded to the light as a relevant stimulus. When this extra fitness criterion
was removed in later generations (and only goal seeking remained) the
population consisted of neurocontrollers that firstly sought the light, and from
that situation they had to work out what to do next. The light, because of the initial bias, has ceased to be irrelevant, and in such circumstances (standing on the same spot as the light after having approached it) the evolutionary process can now uncover the proper correlation between light and goal. As a result of
this, robots capable of learning the landmark correlation to a goal evolved.
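As a rough illustration of this kind of staged fitness shaping, consider the sketch below. The trial structure, the weights and the simulator call `run_trial` are assumptions made for illustration; the actual fitness functions used by Tuci et al. (2002) differ in detail.

```python
import random

def evaluate(genotype, generation, run_trial, n_trials=20, bias_until=500):
    """Average fitness over randomized trials. During early generations
    an extra term rewards merely approaching the light, which biases
    evolution to treat the light as relevant; once the term is removed,
    only goal-seeking contributes to fitness."""
    total = 0.0
    for _ in range(n_trials):
        landmark_near = random.random() < 0.5          # light near or far from goal
        trial = run_trial(genotype, landmark_near)     # assumed simulator call
        score = trial.goal_score                       # graded reward for reaching the goal
        if generation < bias_until:
            score += 0.3 * trial.light_approach_score  # shaping term, later removed
        total += score
    return total / n_trials
```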
So we may conclude that more sophisticated levels of behaviour in general (as well as more sophisticated models of autonomy) may demand more, not less, design
intervention in the evolutionary process. This is the apparent paradox of artificial
autonomy. The system should in some sense build itself and the designer should intervene less, but the designer should at the same time be more intelligently involved in
setting the right processes in motion. Contrary to the uninformed perceptions at
the time when ER was born, one cannot treat artificial evolution as a magic box
capable of solving any problem one poses to it (and all one must do is just wait).
Fortunately, failures to evolve a desired behaviour, if followed by some analysis of the behaviours that do evolve, often reveal the problems one must overcome as a designer of an evolutionary regime.
In addition, by its very nature, ER proceeds by testing candidate solutions under
a set of varying circumstances in order to select robot controllers capable of
latching onto the significant interactions with the environment that will lead to
achieving the desired goal efficiently and robustly. Finding a target cannot
depend on the initial position of the agent, or the initial internal state, and so
these parameters must be randomized from trial to trial to ensure a level of
stability of the solutions that evolve. But this very basic element of the ER
methodology may play against the design of autonomous agents, at least if we
consider the different dynamical regimes of activity described in the previous
section. If evolution is to produce stable and robust dynamical controllers, it will
avoid being strongly influenced by irrelevant environmental factors, but at the
same time it will avoid internal sources of instability. Hence it will produce
robust coping, but not necessarily dynamical states of openness after coping
activity is self-extinguished. That is why evolved goal-seeking robots tend to hover around their targets like moths attracted to a flame. Their behaviour is almost
pathological. The lack of self-extinction of behaviour should perhaps be taken as
a sign of bad design (cf., work by Ian Macinnes on functional circles and practical
ways of dealing with this problem, e.g., Macinnes and Di Paolo, 2006). So
evolving autonomous robots will have to overcome this problem by either
selecting the right building blocks, or including sensorimotor interactions and
internal elements that inevitably will sometimes lead to transitions between low
and high dimensionality in the dynamical flow as suggested in the previous
section.
5. A "self-instructing" agent. How not to model autonomy.
Let us consider an example of an agent capable of generating its own instructions
and following them. In some loose sense of autonomy (but not necessarily in the
operational sense that we have offered above), this agent would be setting up its
own goals. I present the following agent as a computer-enhanced thought
experiment but also as a demonstration of why certain tempting methodologies
for designing autonomous agents are conceptually flawed. In the next section, I
will show an agent that is not yet fully autonomous but which demonstrates
what I consider a better methodology. Both these models demonstrate how we
can learn about autonomy without yet producing proper instantiations.
In his well-known discrimination experiments, Randall Beer (2003) has shown
how minimally cognitive behaviour can be 1) easily modelled and analysed
using a combined evolutionary robotics/dynamical systems approach, and 2)
how such models, albeit minimal, demonstrate interesting general principles and
provide extendable vocabularies to discuss cognition in dynamical terms. The
basic discrimination experiment consists of a visually-guided agent moving in 1
dimension (left-right) whose task is to catch a falling object if it is a circle and
avoid it if it is a diamond; the agent receives input from an array of linear visual
sensors (rays that activate when intersected by the falling object) and this input is
fed into a recurrent, symmetrical CTRNN controller. The output of the network determines the velocity of the agent (Beer, 2003). Although dynamical analysis
has shown that agents use the absolute radius of the falling shape to perform
their discrimination, extensions of the setup to shapes of variable size result in agents capable of discrimination based on shape (Di Paolo and Harvey, 2003).
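For readers unfamiliar with this class of controller, the sketch below gives the standard CTRNN update of Beer (1995) in the form usually Euler-integrated in such experiments; the motor readout and the parameter shapes are illustrative assumptions rather than the exact published architecture.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ctrnn_step(y, I, w, theta, tau, dt=0.1):
    """One Euler step of the standard CTRNN (Beer, 1995):
        tau_i dy_i/dt = -y_i + sum_j w_ji * sigmoid(y_j + theta_j) + I_i
    y: neuron states, I: external input (e.g. the ray sensors),
    w[j, i]: weight from neuron j to neuron i, theta: biases,
    tau: time constants."""
    dydt = (-y + w.T @ sigmoid(y + theta) + I) / tau
    return y + dt * dydt

def horizontal_velocity(y, theta, gain=5.0):
    """Illustrative motor readout: the difference between the firing
    rates of two designated motor neurons (assumed to be the last two)."""
    out = sigmoid(y + theta)
    return gain * (out[-1] - out[-2])
```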
Let us consider a variant of this model: an agent that performs a circle/diamond shape discrimination, but whose choice of which object to catch can be altered by an external binary signal. So if the external signal (ES) is set to 0, the
agent is a circle-catcher and if the signal is 1, the agent is a diamond-catcher.
The setup is otherwise similar to Beer’s experiments, with the difference that
sensors are binary (to increase sensory ambiguity and encourage more active
solutions). Additionally, a focus control is added to the array of sensor rays. This is an effector neuron that simply opens and closes the angle of the sensor rays in a linear way. Interestingly, this extra level of sensory control is important
to evolve agents capable of changing their behaviour depending on the external
instruction. Figure 2 shows the average fitness of 10 independent runs with and
without focus control. The best focus-controlling agents can perform either circle-catching or diamond-catching on demand for a relatively large range of sizes,
using ambiguous noisy sensors with success rates of over 85%.
[Figure 2 about here]
These agents now have a well-defined signal that alters the goal they pursue.
Couldn’t such a signal be somehow provided internally? Ideally, could such a
signal be generated in a way that is jointly dependent on internal and
environmental factors? Exclusive dependence on either class of factors would not
generate an agent that we would be happy to call autonomous, as we could suspect that the agent is either following instructions external to it or blindly taking no account of its situation. Autonomy, even in an intuitive
sense, is ruled out by either of these two conditions. Why? Because both
conditions negate the idea of self-determination. The case of constant reactive
response to the environment is clear. No system that is simply driven externally
can ever be autonomous. But, and this is less intuitive, the same may be said
about a system that is “driven internally”. If a subset of a system exerts control
on the whole, then the situation remains that of a system that is controlled, not
self-determined. If a system is controlled only by internal dynamics making it
blind to the current environmental situation (what sometimes in mathematical
terms is indeed called an “autonomous” system due to the lack of parametric
and time-dependent driving), the system has nothing to determine itself against.
It simply endures in its dynamics because it’s closed to environmental
challenges.
So taking these intuitive constraints into account let us conceive of a sub-system
capable of generating a stable on-off signal depending on internal state and
environmental circumstances. There are many options. One would be a central
pattern generator (CPG) that oscillates with a certain frequency in the absence of
input currents and settles into either a high or a low value stable attractor in the
presence of input. Such a circuit can easily be hand-designed using a fully
connected 2-node CTRNN (Beer, 1995) and is shown in figure 3. The CPG
receives input from the visual sensors. Depending on the phase value of the
oscillation orbit, the presence of input will drive the CPG to one of two possible
stable fixed points (new intersections of nullclines). For one of the nodes the
two fixed points correspond to high and to low firing rates respectively. This
node is then connected to ES in the pre-evolved discriminator network. The
agent will now produce behaviours such as those shown in figure 4. Upon
repeated presentation of a circle the agent will sometimes approach it and other
times avoid it. Similarly for diamonds. In a very simplistic way, the agent is
setting up its own “goals” by instructing itself to go for one object type or the
other. It does so in a way that depends on internal conditions (phase in the CPG
cycle) and external factors (e.g., position and timing of the falling object). An
external observer could describe the agent’s behaviour as “capricious”.
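The following is a hedged sketch of the kind of hand-designed two-node CPG described here. The parameter values are illustrative (in the neighbourhood of Beer's (1995) two-neuron examples) and the readout threshold is an assumption; the exact values used in the model are not given in the text.

```python
import numpy as np

w     = np.array([[ 4.5, 1.0],
                  [-1.0, 4.5]])      # w[j, i]: weight from node j to node i
theta = np.array([-2.75, -1.75])     # biases (illustrative values)
tau   = np.array([1.0, 1.0])         # time constants

def cpg_step(y, sensor_drive, dt=0.05):
    """Euler step of the 2-node CTRNN. With sensor_drive == 0, parameters
    of this kind (strong self-excitation, cross-connections of opposite
    sign, near centre-crossing biases) can yield a limit cycle; in the
    described circuit, a sustained drive from the visual sensors shifts
    the nullclines so that the state settles on one of two fixed points,
    depending on the phase at which the input arrives."""
    s = 1.0 / (1.0 + np.exp(-(y + theta)))
    dydt = (-y + w.T @ s + sensor_drive) / tau
    return y + dt * dydt

def external_signal(y, node=0, threshold=0.5):
    """Binary self-instruction ES read from one node's firing rate
    (the 0.5 threshold is an assumed readout, not from the model)."""
    firing = 1.0 / (1.0 + np.exp(-(y[node] + theta[node])))
    return 1 if firing > threshold else 0
```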
[Figures 3 and 4 about here]
Why is this a problematic way of approaching artificial autonomy? Even though
this model tries to capture an intuitive notion of autonomy as the setting of a
system’s own goals by the system itself, the integrity of what we take to be the
system ultimately relies on the designer’s viewpoint. It is (like most artificial
agents to this day) a system by convention. It is only too clear that the add-on of a
CPG to an already evolved neural network does not result in a system that is
integrated in other than a nominal sense. In fact, because the system’s identity is
given externally, we could just as well visualize the CPG circuit as located in
another room and communicating with the system through remote control. And
this is in effect what a method aiming at integrating a controller into the system
itself (see Smithers, 1997) will always achieve: a homuncular solution whereby
the human controller is replaced by a module telling the rest of the system what
to do. The advantage of this example is the clarity with which this problem
presents itself, but more sophisticated versions of the same idea (when clear
modular separation is not so easy to perform) will be conceptually no different.
Hence, proper autonomy must address internal goal generation as a result of the
system’s own organization, rather than as a function that the system produces
(independently of whether such function is achieved in a modular or distributed
manner). In short, autonomy is not a function. A similar point is made by Rohde
and Di Paolo (2006) with respect to value generation and value systems (see also
Di Paolo, Rohde & De Jaegher forthcoming, and Rutkowska, 1997). We are faced
with an important consequence of the view on biological autonomy sketched at
the beginning which is not necessarily obvious at the moment of starting to
develop it into a workable model: autonomy is not something that a system does,
it is a property of how the system is organized and re-organizes itself so as to
channel its functionalities towards newly generated intentions.
6. A better way: evolving behavioural preferences.
For ease of comparison, let us stay with the problem of an agent producing an
autonomous behavioural choice, but let us approach the problem in terms of the
dynamical organization of the agent. What would such a model look like? What
sort of method can produce a system that by its very structure and interactions
would by itself select and regulate a sensorimotor flow based on the effects it has
on the maintenance of an internal organization?
Consider a dynamical model of the formation and sustaining of a behavioural
preference. How does an embodied agent develop a stable preference such as a
habit of movement, a certain posture, or a predilection for spicy dishes? Is this
development largely driven by a history of environmental contingencies or is it
endogenously generated? Kurt Goldstein (1934) described preferred behaviour
as the realization of a reduced subset of all the possible performances available to
an organism (in motility, perception, posture, etc.) that are characterized by a
feeling of comfort and correctness as a contrast to non-preferred behaviour
which is often difficult and clumsy. In this view, the fact that a preferred
behaviour is observed more often would be derivative of these properties and
not central to its definition (preferred behaviour is often efficient but not
necessarily optimal in any objective sense). Following this idea, a preference can
be defined as the enacting of a behavioural choice that is sustained through time
without necessarily being fully invariant, i.e., with time it may develop or it may
be transformed into a different preference.
Behavioural preferences and their changes lie between the two scales typically
addressed by dynamical systems approaches to cognition: the behavioural (e.g.,
Beer, 2003; Kelso, 1995) and the developmental (Thelen & Smith, 1994) and share
properties with both of them. Goldstein has argued that we cannot really find the
originating factors of a preferred behaviour purely in central or purely in
peripheral processes, but that both the organism’s internal dynamics and its
whole situation participate in determining preferences (Goldstein, 1934). In this
view, it becomes clear that a preference is never going to be captured if it is
modelled as an internal variable (typically a module called “Motivation”) as in
traditional and many modern approaches, but that a dynamical model needs to
encapsulate the mutual constraining between higher levels of sensorimotor
performance, and lower processes, such as neural dynamics (Varela &
Thompson, 2003). This is what the above model does not achieve. A preference is
not “located” anywhere in the agent’s cognitive architecture, but it is rather a
constraining of behaviour (through internal and external conditions) that is in
turn shaped by behaviour.
Iizuka and Di Paolo (forthcoming) present an exploratory model of behavioural
preference with the objective of exploring Goldstein’s assertion of multicausality. This model illustrates a potentially fruitful method for modelling
autonomy as well. The minimal requirement to capture the phenomenon of
preference is a situation with two mutually exclusive options of behavioural
choice. An agent should be able to perform either of these options, but the choice should not be random; it should be stable and durable. Neither should the choice be invariant: it should eventually switch (in order to study the factors that
contribute to switching). There should be a correspondence between internal
dynamical modes and different aspects of behaviour (from commitment to a
choice to switching between choices). For this the methods of homeostatic
adaptation (first introduced in Di Paolo, 2000) are used to design not only the
agent’s performance but to put additional requirements on the corresponding
internal dynamics.
In the original model of homeostatic adaptation an agent is evolved to
simultaneously perform a task and to maintain its internal variables (e.g.,
neuronal firing rates) within some homeostatic bounds (Di Paolo, 2000). When such variables cross the boundary, local internal plasticity is activated and remains active until homeostasis is recovered (Ashby, 1960). Some of the agents that
evolve under these conditions show a dynamical link between the two objectives
(performance and internal homeostasis) such that disruption to performance
(e.g., changes to motors or sensors) results in internal instability, which provokes
plastic internal changes until stability is regained. Because of the dynamical link
established during evolution, regaining internal stability involves a behavioural
adaptation to the original disruption. The result is that the agent is able to adapt to
severe disruptions that have not been presented to it before.
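A minimal sketch of the gating logic is given below, assuming a single homeostatic band of firing rates per neuron; the actual plasticity rules in Di Paolo (2000) are parameterized and evolved, and the Hebbian-style direction term here is only a placeholder.

```python
import numpy as np

def facilitation(firing, low=0.2, high=0.8):
    """Per-neuron plastic facilitation: zero inside the homeostatic band
    [low, high], growing with the deviation outside it (clipped to 1).
    Band edges are illustrative, not the evolved values."""
    below = np.clip(low - firing, 0.0, None) / low
    above = np.clip(firing - high, 0.0, None) / (1.0 - high)
    return np.clip(below + above, 0.0, 1.0)

def plastic_update(w, firing, eta=0.01):
    """Facilitation-gated local weight change: incoming weights of
    neuron i (column i of w) change only while neuron i is outside its
    band. The direction term is a placeholder for the evolved rules."""
    p = facilitation(firing)                       # per-neuron gate in [0, 1]
    direction = np.outer(firing, firing - 0.5)     # placeholder Hebbian-like term
    return w + eta * direction * p[np.newaxis, :]  # scale column i by p[i]
```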
This idea can be naturally extended to a situation where an agent must choose
between alternative behaviours: A and B. Instead of a single homeostatic region
for the internal variables (neuronal firing rates), there are two regions. If the
neural dynamics are contained within either of these high-dimensional boxes, the
network remains unchanged. But if the flow of internal states moves out of the
boxes, local plastic synaptic changes are applied in the general direction of
reverting the flow to move again inside the box. Now, agents are evolved to
perform each behaviour A and B and to behave homeostatically so that the
internal state is inside box A while performing behaviour A and inside box B
while performing behaviour B. The evolutionary regime is designed so that
behaviour is not biased to either of the choices. In the case presented in (Iizuka
and Di Paolo, forthcoming) the behaviour is simply approaching a light of type A
or B (the wheeled Khepera-like agent has two pairs of sensors left and right
towards the front, one for each type of light). Each homeostatic box is
implemented for each node in the CTRNN neurocontroller as 2 bands within the
range of firing rates (figure 5): a low-firing and a high-firing homeostatic region
(to reduce bias, the type of each region, A or B, is assigned randomly at the
beginning of the evolutionary run).
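A hedged sketch of this two-region setup follows; the band edges and the random A/B assignment are illustrative assumptions, and the bands used in the published model may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_regions(n_neurons, low_band=(0.05, 0.30), high_band=(0.70, 0.95)):
    """For each neuron, randomly assign the low-firing band to box A and
    the high-firing band to box B, or vice versa, so that neither
    behaviour is biased at the start of an evolutionary run."""
    regions = {"A": [], "B": []}
    for _ in range(n_neurons):
        if rng.random() < 0.5:
            regions["A"].append(low_band)
            regions["B"].append(high_band)
        else:
            regions["A"].append(high_band)
            regions["B"].append(low_band)
    return {k: np.array(v) for k, v in regions.items()}

def inside_box(firing, bands):
    """True if every neuron's firing rate lies inside its band."""
    return bool(np.all((firing >= bands[:, 0]) & (firing <= bands[:, 1])))

def plasticity_active(firing, regions):
    """Local plastic changes are applied only while the neural state is
    outside both box A and box B."""
    return not (inside_box(firing, regions["A"]) or
                inside_box(firing, regions["B"]))
```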
[Figure 5 about here]
The idea is that if the system holds two separate (fixed) high-dimensional boxes
in the space of neural dynamics which are associated with performing different
behaviours, a preference could be formed by the dynamical transitions that select
which box the trajectories go into and stay in. This provides a first requirement
for talking about preference, that of durability (bottom-up construction of
stability). Once a behaviour is formed, due to the stability in a box, the system
keeps doing the behaviour while ignoring other behavioural possibilities. It is
like a spontaneous top-down constraint that regulates the sensorimotor flow.
However, some disturbances might eventually cause a breakdown of the stability
and then another behaviour can be reconstructed through the homeostatic
adaptive mechanisms. Since by design, the system has another region of high
stability, it will be possible in the right circumstances to switch into it and then
start enacting the other behavioural option. In this way, behaviour can switch
due to the corresponding transitions between two boxes. One can expect to see
both spontaneous and externally induced transitions from the viewpoint of the
top-down and bottom-up construction or destruction of durable but
impermanent dynamical modes. Here we find a second requirement, that of the
possibility of transformation, or change in preference.
The evolved agents show interesting behaviour when two lights (A and B) are
presented simultaneously at random positions. They always “choose” to go to one of the two lights; they never stay in the middle or move away from them. Moreover, when the lights are replaced at two new distant
positions, the agents seem to maintain a preference for the light they have visited
previously, but not indefinitely. Figure 6 shows a sequence of 100 consecutive presentations of lights (A and B) with randomized positions. The
plot on the left shows the final distance to each type of light. We can see that the
agent approaches light B for the first 25 presentations, but then switches to light
A and maintains this behaviour for about 35 presentations before switching
again. The plot on the right shows the corresponding proportion of time that the
neural dynamics is inside boxes A and B. It is clear that the proportion tends to
be high while the agent is performing the corresponding behaviour.
[Figure 6 about here]
Many tests have been performed to assess what makes an agent change its
preference (more details in Iizuka and Di Paolo, forthcoming). For instance,
while approaching light A, the lights are swapped in position to see whether the
agent changes its behaviour. The result depends on the timing of the swap. If the agent is still far enough away, it alters its trajectory after the swap and moves towards the new position of light A. If the swap is made later, when the agent is already close to light B, the agent switches and finishes its approach to light B, as if its presence were now too strong a stimulus to ignore. This and similar tests indicate that a
preference is maintained or changed as a combined effect of environmental
factors and endogenous dynamics.
In an attempt to measure the development of a preference, agents are tested at
different times during the sequence of presentations shown in figure 6 in order to
find out if their choice would have been the same at that point in time if the
position of the lights had been different. The distinction between a spontaneous
or externally driven “decision” is made operational by observing the agent’s behaviour in different situations departing from the same initial state. If the agent “decided” to go to one of the lights endogenously, its behaviour should be robust and not depend too much on environmental factors. On the contrary, if the
selection were externally driven it would be affected by changes to
environmental factors such as light positions (as if the agent were not
“committed” enough).
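The sketch below illustrates this operational test; `run_from_state` is an assumed simulator call (restarting the agent from the saved neural and bodily state with the lights at the specified angles and returning the light finally approached), not an interface from the published model.

```python
import numpy as np

def commitment(saved_state, run_from_state, n_angles=36):
    """Restart the agent many times from the same saved state, varying
    only the angular positions of the two lights, and record which light
    it finally approaches. A committed agent reaches the same light for
    nearly all placements; an 'open' agent's choice tracks the placements."""
    angles = np.linspace(-np.pi, np.pi, n_angles, endpoint=False)
    outcomes = []
    for a_angle in angles:
        for b_angle in angles:
            if np.isclose(a_angle, b_angle):
                continue
            outcomes.append(run_from_state(saved_state,
                                           light_A=a_angle,
                                           light_B=b_angle))  # returns 'A' or 'B'
    counts = {t: outcomes.count(t) for t in set(outcomes)}
    return max(counts.values()) / len(outcomes), counts
```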
Figure 7 shows the results. Each plot indicates in shades of grey the final
destination approached by the agent as a function of the angle relative to the
body in which each light is presented. The neural and bodily states are picked
from those corresponding to a given time in the sequence shown in figure 6. In
the case of (a), in which the agent originally has a preference for light B, the
“decision” is stable against the various alternative positions of the lights. The
agent robustly approaches light B for practically all the angular positions tested.
Therefore, the “decision” to approach B does not strongly depend on
environmental factors in this case. This is also true in the case of (c). Except for
the small region where it selects light B, the agent approaches light A wherever
else the lights are placed. By contrast, in cases (b) and (d) the agent is rather
“uncommitted” because the approached target changes depending on the lights’
position.
The agent thus alternates between periods of high and low preference for each of
the options presented to it. During periods of high preference it could be said
that the agent’s behaviour is committed in the sense that it is less dependent on
distracting factors (as implied in Figure 7, the agent will under these
circumstances ignore the “wrong” light even if placed directly in front of it, and
look for the “right” light even if it is out of its visual range). While the preference
is changing from one option to the other the agent does not show a strong
commitment to which light should be selected as target. There is ample scope for
distracting factors to alter the agent’s behaviour. Then, the new preference
develops towards more stable (or committed) dynamics. It is this alternation
between these modes – resembling the alternation between the modes of coping
and openness described in section 3 – that makes this behaviour closer to
something we might call autonomous.
The implication of these results is that the emergence of a new goal happens by
an interplay between internal and interactive dynamics and that it consists, not
of a function that performs certain decisions and instructs how the agent should
behave, but of the regulation of periods of openness to external and internal
variations and periods of commitment to a goal. Even during periods of weak
environmental dependence, the endogenous dynamics are not solely responsible
for the agent’s performance. In all cases, behaviour is the outcome of a tightly
coupled sensorimotor loop. It is clear that the mode of environmental influence,
whether weak or strong, changes over time and that this is a property of the
agent’s own internal dynamics as well as its history of interaction. During the
periods of high susceptibility to external variations, the agent is highly
responsive to environmental variability resulting in less commitment towards a
given target. By contrast, during periods of weak susceptibility, the consistent
selection of a target is a consequence of low responsiveness to environmental
variability.
The important point is that the autonomy of the agent’s behaviour can be seen as
the flow of alternating high and low susceptibility as suggested in section 3,
which is an emergent property of the homeostatic mechanism in this case (but
might be the result of other mechanisms in general). There is nothing apart from
the flow of neural and sensorimotor dynamics that stands for a mode of
commitment to one preference or another. It should be made clear that this picture contrasts sharply with the idea that autonomy may simply be measured as how much of behaviour is determined internally vs. how much is externally driven. Strong autonomy is orthogonal to this issue, since all of behaviour is
conditioned by both internal and external factors at all times. It is the mode of
responsiveness to variations in such factors that can be described as committed
or open, and it would be a property of strong autonomous systems that they can
transit between these modes (maybe in less contingent ways than this agent does).
[Figure 7 about here]
This agent comes closer to some properties of autonomous behaviour described
above, especially in addressing the constitution of a new goal as an emerging
property of the internal and interactive dynamics in relation to organizational
constraints. We cannot really claim, however, that the agent is fully autonomous.
The fixed internal constraints (homeostatic boxes) are rather arbitrary in their
definition and their lack of contingent development over time. It seems that
having homeostatic regions that are somehow themselves constituted by a
history of interactions would be a much better way of modelling autonomy. In
addition, there is only a contingent link between internal “requirements” and
external interaction. Light is made relevant to the agent by a selective pressure
and it is linked to an internal condition to be satisfied (homeostasis) also by
evolutionary history. Organisms present much tighter double causal links
between internal needs (e.g., metabolism) and sensorimotor interactions (e.g.,
foraging). This is again something that should be improved for a closer approach
to behavioural autonomy (Di Paolo, 2003).
7. Conclusions
What do we learn from these models? The two examples of modelling aspects of
autonomous behaviour presented above allow us to draw some clear conclusions
that are implicit but not well articulated in the more conceptual view on
biological autonomy proposed in section 2: Autonomy (and its implications such
as identity generation, value-generation, goal-setting, etc.) is an organizational
property of a system, not a function, a state or a mechanism. Any attempt at
approaching it purely in functional terms will miss something fundamental.
In other words, autonomy is a property pertaining to what a system is, rather
than what it does. This ontological property will have very clear consequences for
how the system behaves, i.e., what it does. But to start a model from those
consequences will always run the risk of trivializing autonomy or even
explaining it away. In our two cases, we have focused on the self-setting of goals
as an example of an autonomous performance. Implementing this in an intuitive
manner results in a model that is not satisfactory because it misses the
ontological dimension of autonomy. It treats it as a function. A more
sophisticated implementation is able to capture the underlying dialectics
between dynamics and meta-dynamics (homeostatic constraints and plasticity),
between organization and performance, and goal setting is achieved in an
emergent manner and not as a function of the system requiring some dedicated
computational module. There is simply nothing in the neural controller of the
agent that sets out new goals, and yet this is what the agent does as a whole.
The preference model also suggests a second lesson. It is not very fruitful to ask
of a putative autonomous system whether its behaviour is caused by internal or
external factors. This approach breeds confusion because all of behaviour is
always determined by both internal and external conditions. Autonomous
behaviour is, like the preferences shown by the agent, always caused by a
multiplicity of internal and external factors. It is the response of the system to
variability in such factors that gives an idea of the particular mode of
commitment to a goal, and it is to be expected of autonomous systems that they
would also show transitions between these modes as shown in the preference
model. The study of this model has so far examined the commitment to a
preferred behaviour by altering environmental conditions (the positions of the
lights), but we could equally study internal variability (e.g., noise or lesions)
under which the behaviour remains achievable, and similarly measure the
different modes of engagement, expecting analogous periods of coping and
switching (this study has not yet been performed).
Are we dealing with an autonomous system in the model of preference
formation? Not yet. We may ask (as suggested at the end of the last section) to
what extent an identity is self-generated in this system. This definitional aspect
of autonomy is not captured by this model. It is clear that much of the system is
rather stable and not precarious (e.g., the agent’s body, sensor and motor
response, and neural connectivity) and that if an identity could be self-generated
anywhere in this case, it would have to be at the level of the combined neural
and sensorimotor dynamics. Still, there is much arbitrariness in the design of
this setup (such as the location and static nature of the homeostatic regions). In
fact, whether higher and recursive levels of identity are possible without the
grounding in physical self-construction is still an open question.
The complex dialectics between different dynamical levels is at the root of
several intentional aspects of autonomous behaviour, from the generation and
appreciation of values, norms and affect in a situation, to the emergence of a
sense of agency and self. We expect that, by producing models that either make
confused ways of thinking more manifest or indicate more clearly the relation
between complex ideas, this kind of methodology will help to advance research
into these related questions. At the moment, functional modelling (attempting to
capture concepts such as autonomy or agency in terms of functions) is still
prevalent (Di Paolo, Rohde, De Jaegher, forthcoming; Rohde & Di Paolo, 2006).
This is in part due to the technical limitations of traditional approaches to
cognitive modelling (which we are beginning to overcome) but also partly due to
some conceptual Cartesian baggage that hides itself under apparently innocuous
assumptions, especially in terms of deriving mechanisms for externally observed
functionality.
However, we must at all times remember the point of producing models such as
the ones described here. They are not meant to be implementations of the
properties under study. The formation of an identity may well be modelled at the
neural level (e.g., the formation of a dynamic pattern that is self-sustaining under
precarious conditions and whose maintenance requires certain sensorimotor
interactions with the environment), even though implementing such a process
may still not be enough to implement a proper autonomous agent. A model is
supposed to expose gaps in our understanding, not to produce fancy
performances. For this purpose, full implementations are in most cases optional,
and models such as those presented here do their job.
References
Ashby, W. R., 1960. Design for a brain: The origin of adaptive behaviour (Second
edition). London: Chapman and Hall.
Beer, R. D., 1995. On the dynamics of small continuous-time recurrent neural
networks. Adaptive Behavior, 3, 471 – 511.
Beer, R. D., 2003. The dynamics of active categorical perception in an evolved
model agent. Adaptive Behavior, 11, 209–243.
Brooks, R. A., 1991. Intelligence without representation. Artificial Intelligence, 47,
139– 159.
Di Paolo, E. A., 2000. Homeostatic adaptation to inversion of the visual field and
other sensorimotor disruptions. In Meyer, J.-A., Berthoz, A., Floreano, D.,
Roitblat, H., & Wilson, S. (Eds.), From Animals to Animats 6: Proceedings of the
Sixth International Conference on the Simulation of Adaptive Behavior.
Cambridge MA: MIT Press.
Di Paolo, E. A., 2003. Organismically-inspired robotics: Homeostatic adaptation
and natural teleology beyond the closed sensorimotor loop. In Murase, K., &
Asakura, T. (Eds.), Dynamical Systems Approach to Embodiment and Sociality,
pp. 19–42. Advanced Knowledge International, Adelaide.
Di Paolo, E. A., 2005. Autopoiesis, adaptivity, teleology, agency. Phenomenology
and the Cognitive Sciences, 4, 429–452.
Di Paolo, E. A., & Harvey, I., 2003. Decisions and noise: The scope of evolutionary
synthesis and dynamical analysis. Adaptive Behavior, 11, 284 - 288.
Di Paolo, E. A., Rohde, M., & De Jaegher, H., Forthcoming. Horizons for the
enactive mind: Values, social interaction and play. In O. Gapenne, J. Stewart and
E. A. Di Paolo (eds) Enaction: towards a new paradigm in cognitive science.
Cambridge MA: MIT Press.
Goldstein, K., 1995/1934. The organism. New York: Zone Books.
Harvey, I., Di Paolo, E. A., Wood, R., Quinn, M., & Tuci, E., 2005. Evolutionary
robotics: A new scientific tool for studying cognition. Artificial Life, 11, 79–98.
Harvey, I., Husbands, P., Cliff, D., Thompson, A., & Jakobi, N., 1997. Evolutionary
robotics: The Sussex approach. Robotics and Autonomous Systems, 20, 207 – 224.
Iizuka, H., & Di Paolo, E. A., Forthcoming. Towards Spinozist robotics: Exploring
the minimal dynamics of behavioural preference. Adaptive Behavior.
Jonas, H., 1966. The phenomenon of life: Towards a philosophical biology.
Evanston, IL: Northwestern University Press.
Kelso, J. A. S., 1995. Dynamic patterns: The self-organization of brain and
behavior. Cambridge MA: MIT Press.
Macinnes, I., & Di Paolo, E. A., 2006. The advantages of evolving perceptual cues.
Adaptive Behavior, 14(2), 147-156.
Moreno, A., & Etxeberria, A., 2005. Agency in natural and artificial systems.
Artificial Life, 11, 161–176.
Nolfi, S., & Floreano, D., 2000. Evolutionary robotics. The biology, intelligence,
and technology of self–organizing machines. Cambridge MA: MIT Press.
Rohde, M., & Di Paolo, E. A., 2006. ‘Value signals’: An exploration in
evolutionary robotics. Cognitive Science Research Paper 584, COGS, University of
Sussex.
Rutkowska, J. C., 1997. What’s value worth? Constraints on unsupervised
behaviour acquisition. In Husbands, P., & Harvey, I. (Eds.), Proceedings of the
Fourth European Conference on Artificial Life, pp. 290 – 298. Cambridge, MA:
MIT Press.
Smithers, T., 1997. Autonomy in robots and other agents. Brain and Cognition,
34, 88 – 106.
Thelen, E., & Smith, L. B., 1994. A dynamic systems approach to the development
of cognition and action. Cambridge, MA: MIT Press.
Thompson, E., 2007. Mind in life: Biology, phenomenology, and the sciences of
mind. Cambridge, MA: Harvard University Press.
Tuci, E., Quinn, M., & Harvey, I., 2003. An evolutionary ecological approach to
the study of learning behavior using a robot based model. Adaptive Behavior, 10,
201 – 222.
Varela, F. J., 1979. Principles of biological autonomy. New York: Elsevier, North
Holland.
Varela, F. J., 1991. Organism: A meshwork of selfless selves. In Tauber, A. I. (Ed.),
Organism and the origin of the self, pp. 79 – 107. Netherlands: Kluwer Academic.
Varela, F. J., 1997. Patterns of life: Intertwining identity and cognition. Brain and
Cognition, 34, 72 – 87.
Walter, W. G., 1950. An imitation of life. Scientific American, 182(5), 42 – 45.
Weber, A., & Varela, F. J., 2002. Life after Kant: Natural purposes and the
autopoietic foundations of biological individuality. Phenomenology and the
Cognitive Sciences, 1, 97 – 125.
Yamauchi, B., & Beer, R. D., 1994. Sequential behavior and learning in evolved
dynamical neural networks. Adaptive Behavior, 2, 219 – 246.
List of figures
Figure 1. Dynamical modes describing the flow of everyday activity.
Figure 2. Fitness achieved for agents evolved to catch circles or diamonds under
external instruction with and without focus control of sensor rays. Average of 10
independent runs.
Figure 3. Top left: CTRNN neurocontroller for self-instructing agent. ES: external
signal, F: focus effector, ML and MR, motor neurons driving left and right
respectively. Top right: nullclines corresponding to fully connected 2-node CPG
in the absence of input. Bottom left and right: nullclines in the presence of input,
trajectory ends in a low firing fixed point for neuron 1 (left) or in a high firing
fixed point (right) depending on phase. Output of neuron 1 is fed into ES.
Figure 4. Repeated presentation of falling circles (left) and diamonds (right) for
self-instructing agent. Plots show the horizontal displacement of the agent over
time and the position where the objects fall. Agent sometimes approaches the
target, other times avoids it.
Figure 5. Left: schematic representation of two high-dimensional homeostatic
regions in the space of neural firing rates. Right: how the homeostatic regions are
implemented for each node in the network. The plot indicates the plasticity
function (pj) as a function of the neural firing rate (zj). Changes to incoming
weights are calculated as a function of pre- and post-synaptic activation
multiplied by pj: Δwji = ηji zi pj(zj), where ηji is an evolved constant. Whenever
the post-synaptic firing rate is in one of the two flat regions, pj = 0 and local
plasticity is inhibited (a minimal sketch of this rule is given after this list).
Figure 6. Left: Final distance to each light at the end of trials on serial
presentations of 100 pairs of lights. Right: Proportion of neurons that have stayed
within the homeostatic region for each light in correspondence to trials on the
left. Adapted from Iizuka and Di Paolo (forthcoming).
Figure 7. Light preference of the agent corresponding to the states of (a) 20, (b)
25, (c) 50 or (d) 95 in Fig. 6, against different light positions. Horizontal and
vertical axes indicate the initial angles of lights A and B relative to the agent’s
orientation, respectively. The positions of lights whose angular difference is less than π/2
are removed in order to better determine which light the agent is approaching.
The dark grey circles show that the agent approaches light A, the light grey
circles correspond to light B, and black shows that the agent approaches neither
light. Adapted from Iizuka and Di Paolo (forthcoming).
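As a purely illustrative aside (not part of the original model description), the
plasticity rule of Figure 5 could be sketched in Python as follows. The band
boundaries, the linear shape of pj outside the homeostatic bands, the learning-rate
matrix eta and the step dt are all assumed values standing in for the evolved
parameters of the actual agent:

import numpy as np

# Two assumed homeostatic bands of firing rate in which plasticity is off.
HOMEOSTATIC_BANDS = ((0.10, 0.30), (0.70, 0.90))

def plasticity(z_post):
    # p_j(z_j) = 0 whenever the post-synaptic firing rate lies inside either
    # homeostatic band; outside, it grows linearly with the distance to the
    # nearest band boundary (an assumed shape, capped at 1).
    for lo, hi in HOMEOSTATIC_BANDS:
        if lo <= z_post <= hi:
            return 0.0
    d = min(abs(z_post - edge) for band in HOMEOSTATIC_BANDS for edge in band)
    return min(1.0, d / 0.1)

def weight_update(W, z, eta, dt=0.01):
    # Delta w_ji = eta_ji * z_i * p_j(z_j): pre-synaptic rate z_i times the
    # plasticity function of the post-synaptic rate z_j, scaled by an
    # "evolved" constant eta_ji (here an arbitrary matrix). Rows index the
    # post-synaptic neuron j, columns the pre-synaptic neuron i.
    p = np.array([plasticity(zj) for zj in z])
    return W + dt * eta * np.outer(p, z)

# Example usage with arbitrary values.
rng = np.random.default_rng(1)
z = rng.uniform(0.0, 1.0, 4)          # current firing rates
W = rng.normal(0.0, 1.0, (4, 4))      # incoming weights, W[j, i]
eta = rng.normal(0.0, 0.1, (4, 4))    # assumed learning-rate constants
W = weight_update(W, z, eta)

The point of the sketch is only to make concrete how plasticity is gated by the
post-synaptic firing rate; the actual shape of pj in the model is the one plotted
in Figure 5.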
Notes
i. The state equation for a CTRNN neuron is:
τi (dyi/dt) = −yi + Σj wji zj + Ii,
where i indexes all neurons, j indexes all links inputting to neuron i (which may
be an empty set), τi is a time constant, yi is the neuron state (analogous to a
membrane potential), Ii is an input current, wji is the link weight from neuron j
into neuron i, and zj is the activation of the pre-synaptic neuron attached to link j.
For a neuron, the firing rate is given by the logistic function:
zj = σ(yj + bj) = 1/{1 + exp[−(yj + bj)]},
where bj is a bias parameter.
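As a concrete illustration of this state equation (not of the evolved controllers
themselves), a simple forward-Euler integration step in Python might look as
follows; the network size, parameter values and step size are arbitrary
assumptions:

import numpy as np

def ctrnn_step(y, W, tau, bias, I, dt=0.01):
    # Firing rates z_j = sigma(y_j + b_j), the logistic function above.
    z = 1.0 / (1.0 + np.exp(-(y + bias)))
    # tau_i dy_i/dt = -y_i + sum_j w_ji z_j + I_i, with W[i, j] = w_ji.
    dydt = (-y + W @ z + I) / tau
    return y + dt * dydt

# Example usage with arbitrary parameters for a 2-node circuit.
rng = np.random.default_rng(0)
y = np.zeros(2)
W = rng.normal(0.0, 2.0, (2, 2))
tau = np.ones(2)
bias = np.zeros(2)
I = np.zeros(2)
for _ in range(1000):
    y = ctrnn_step(y, W, tau, bias, I)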
ii. If such blind action were to be the paradigmatic case of autonomy, we should
think of mountains as being alive since they endure much longer than living
systems. But life is not about enduring and autonomy is not about blindly
ignoring the environment. Self-determination becomes an empty concept if the
system is detached from sources of uncertainty and solicitations that would tend
to induce in it outcomes alternative to the one that the system itself is
struggling to achieve. In this view, autonomy is always a dialectical concept.
iii. Nullclines in a 2-node CTRNN circuit are calculated by setting the derivatives
of the states y1 and y2 equal to zero in the CTRNN state equation. The y1 nullcline
is given by:
J2 = ln[J1/(w21 − J1)] − (w22/w21) J1 − I2 − b2.
The y2 nullcline is given by:
J1 = ln[J2/(w12 − J2)] − (w11/w12) J2 − I1 − b1,
where J1 ≡ w21 σ(y2 + b2) and J2 ≡ w12 σ(y1 + b1). See Beer (1995).
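These expressions can be evaluated numerically to reproduce plots like those in
Figure 3. In the following Python sketch the weights, biases and inputs are
placeholder values, not those of the evolved controller:

import numpy as np

def y1_nullcline(J1, w21, w22, I2, b2):
    # J2 as a function of J1, as given above.
    return np.log(J1 / (w21 - J1)) - (w22 / w21) * J1 - I2 - b2

def y2_nullcline(J2, w12, w11, I1, b1):
    # J1 as a function of J2, as given above.
    return np.log(J2 / (w12 - J2)) - (w11 / w12) * J2 - I1 - b1

# Placeholder parameters for a fully connected 2-node circuit.
w11, w12, w21, w22 = 5.0, -10.0, 10.0, 5.0
b1, b2, I1, I2 = -2.5, -2.5, 0.0, 0.0

# J1 = w21*sigma(y2 + b2) ranges over (0, w21); J2 = w12*sigma(y1 + b1) over (w12, 0).
J1 = np.linspace(1e-3, w21 - 1e-3, 500)
J2 = np.linspace(w12 + 1e-3, -1e-3, 500)
curve_y1 = y1_nullcline(J1, w21, w22, I2, b2)   # J2 values along the y1 nullcline
curve_y2 = y2_nullcline(J2, w12, w11, I1, b1)   # J1 values along the y2 nullcline
# Plot (J1, curve_y1) and (curve_y2, J2) in the (J1, J2) plane, e.g. with
# matplotlib; the intersections of the two curves are the circuit's fixed points.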