Data Assimilation as Artificial Perception
and Supermodeling as Artificial Consciousness
Gregory S. Duane
Macedonian Academy of Sciences and Arts, University of Colorado, Boulder, CO, USA
e-mail: [email protected]
Abstract Data assimilation is naturally conceived as the synchronization of two
systems, “truth” and “model”, coupled through a limited exchange of information
(observed data) in one direction. Though investigated most thoroughly in meteorology, the task of data assimilation arises in any situation where a predictive
computational model is updated in run time by new observations of the target
system, including the case where that model is a perceiving biological mind.
In accordance with a view of a semi-autonomous mind evolving in synchrony
with the material world, but not slaved to it, the goal is to prescribe a coupling
between truth and model for maximal synchronization. It is shown that optimization
leads to the usual algorithms for assimilation via Kalman Filtering under a weak
linearity assumption. For nonlinear systems with model error and sampling error,
the synchronization view gives a recipe for calculating covariance inflation factors
that are usually introduced on an ad hoc basis. Consciousness can be framed as
self-perception, and represented as a collection of models that assimilate data from
one another and collectively synchronize. The combination of internal and external
synchronization is examined in an array of models of spiking neurons, coupled
to each other and to a stimulus, so as to segment a visual field. The inter-neuron
coupling appears to enhance the overall synchronization of the model with reality.
1 Data Assimilation as Synchronization of Truth and Model
A computational model of a physical process, updated in run time by a stream of new observations of that process, must include a scheme to combine the new data with the
model's prediction of the current state of the process. The goal of any such scheme
is the optimal prediction of the future behavior of the physical process. While the
relevance of the data assimilation problem is thus quite broad, techniques have been
investigated most extensively for weather modeling, because of the high dimensionality of the fluid dynamical state space, and the frequency of potentially useful new
observational input. Existing data assimilation techniques (3DVar, 4DVar, Kalman
Filtering, and Ensemble Kalman Filtering) combine observed data with the most
recent forecast of the current state to form a best estimate of the true state of the
atmosphere, each approach making different assumptions about the nature of the
errors in the model and the observations.
An alternative view of the data assimilation problem is suggested here. The
objective of the process is not to “nowcast” the current state of reality, but to make
the model converge to reality in the future. Recognizing also that a predictive model,
especially a large one, is a semi-autonomous dynamical system in its own right,
influenced but not determined by observational input from a coexisting reality, it is
seen that the guiding principle that is needed is one of synchronism. That is, we seek
to introduce a one-way coupling between reality and model, such that the two tend to
be in the same state, or in states that in some way correspond, at each instant of time.
The problem of data assimilation thus reduces to the problem of synchronization of
a pair of dynamical systems, unidirectionally coupled through a noisy channel that
passes a limited number of “observed” variables.
While the synchronization of loosely coupled regular oscillators with limit cycle
attractors is ubiquitous in nature [23], synchronization of chaotic oscillators has only
been explored in the last two decades, in a wave of research spurred by the seminal
work of Pecora and Carroll [17]. Chaos synchronization can be surprising because
it implies that two systems, each effectively unpredictable, connected by a signal
that can be virtually indistinguishable from noise, nevertheless exhibit a predictable
relationship. Chaos synchronization has indeed been used to predict new kinds of
weak teleconnection patterns relating different sectors of the global climate system [2, 5, 8].
It is now clear that chaos synchronization is surprisingly easy to arrange, in
both ODE and PDE systems [4, 5, 14]. A pair of spatially extended chaotic systems
such as two quasi-2D fluid models, if coupled at only a discrete set of points and
intermittently in time, can be made to synchronize completely. The application of
chaos synchronization to the tracking of one dynamical system by another was
proposed by So et al. [22], so the synchronization of the fluid models suggests a
natural extension to meteorological data assimilation [7, 26].
Since the problem of data assimilation arises in any situation requiring a computational model of a parallel physical process to track that process as accurately
as possible based on limited input, it is suggested here that the broadest view of
data assimilation is that of machine perception by an artificially intelligent system.
In this context, the role of synchronism is reminiscent of the psychologist Carl
Jung’s notion of synchronicity in his view of the relationship between mind and
the material world. Like a data assimilation system, mind forms a model of reality
that functions well, despite limited sensory input. Jung, working in collaboration
with Wolfgang Pauli [13], noted uncanny coincidences or “synchronicities” between
mental and physical phenomena, which he took to reflect a new kind of order
connecting the two realms. It was important to Jung and Pauli that synchronicities
themselves were distinct, isolated events, but such phenomena can emerge naturally
as a degraded form of chaos synchronization. In the artificial intelligence view of
data assimilation, the additional issue of model error can be approached naturally
as a problem of machine learning, which can indeed be addressed by extending the
methodology of synchronization, as we show here.

1.1 Standard Data Assimilation vs. Synchronization

Standard data assimilation, unlike synchronization, estimates the current state x_T ∈ R^n of one system, "truth," from the state of a model system x_B ∈ R^n, combined with noisy observations of truth. The best estimate of truth is the "analysis" x_A, which is the state that minimizes error as compared to all possible linear combinations of observations and model. That is,

$x_A = x_B + K(x_{obs} - x_B)$   (1)

minimizes the analysis error $\langle (x_A - x_T)^2 \rangle$ for a stochastic distribution given by x_obs = x_T + ξ, where ξ is observational noise, for a properly chosen n × n gain matrix K. The standard methods to be considered in this paper correspond to specific forms for the generally time-dependent matrix K.

The simple method known as 3DVar uses a time-independent K that is based on the time-averaged statistical properties of the observational noise and the resulting forecast error. Let the matrix

$R \equiv \langle \xi \xi^T \rangle = \langle (x_{obs} - x_T)(x_{obs} - x_T)^T \rangle$   (2)

be the observation error covariance, and the matrix

$B \equiv \langle (x_B - x_T)(x_B - x_T)^T \rangle$   (3)

be the "background" error covariance, describing the deviation of the model state from the true state. If both covariance matrices are assumed to be constant in time, then the optimal linear combination of background and observations is:

$x_A = R(R + B)^{-1} x_B + B(R + B)^{-1} x_{obs}$   (4)

The formula (4), which simply states that observations are weighted more heavily when background error is greater and conversely, defines the 3DVar method in practical data assimilation, based on empirical estimates of R and B. The 4DVar method, which will not be considered here, generalizes (4) to estimate a short history of true states from a corresponding short history of observations.
The Kalman filtering method, which is popular for a variety of tracking problems, uses the dynamics of the model to update the background error covariance B sequentially. The analysis at each assimilation cycle i is:

$x_A^i = R(R + B^i)^{-1} x_B^i + B^i (R + B^i)^{-1} x_{obs}^i$   (5)

where the background x_B^i is formed from the previous analysis x_A^{i-1} simply by running the model M: R^n → R^n,

$x_B^i = \mathcal{M}_{i-1 \to i}(x_A^{i-1})$   (6)

as is done in 3DVar. But now the background error is updated according to

$B^i = \mathbf{M}_{i-1 \to i} A^{i-1} \mathbf{M}_{i-1 \to i}^T + Q$   (7)

where A is the analysis error covariance $A \equiv \langle (x_A - x_T)(x_A - x_T)^T \rangle$, given conveniently by $A^{-1} = B^{-1} + R^{-1}$. The matrix $\mathbf{M}$ is the tangent linear model given by

$\mathbf{M}_{ab} \equiv \left. \frac{\partial \mathcal{M}_b}{\partial x_a} \right|_{x = x_A}$   (8)

The update formula (7) gives the minimum analysis error $\langle (x_A - x_T)^2 \rangle = \mathrm{Tr}\, A$ at each cycle. The term Q is the covariance of the error in the model itself.
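For concreteness, the cycle (5)-(7) can be written in a few lines. The following is a minimal sketch for a linear model; the matrices M, R, Q and the initial covariance are illustrative assumptions rather than values from the text, and 3DVar is recovered by simply holding B fixed:

```python
import numpy as np

# One Kalman-filtering assimilation cycle per loop iteration, cf. (5)-(7).
# M, R, Q, A and the noise levels are illustrative assumptions.
rng = np.random.default_rng(0)
M = np.array([[1.0, 0.1], [-0.1, 0.9]])   # tangent linear model, cf. (8)
R = 0.25 * np.eye(2)                      # observation error covariance (2)
Q = 0.01 * np.eye(2)                      # model error covariance
A = 0.10 * np.eye(2)                      # analysis error covariance

x_true = np.array([1.0, -1.0])
x_anal = x_true + rng.multivariate_normal(np.zeros(2), A)

for cycle in range(10):
    # forecast step: propagate state and background covariance, (6)-(7)
    x_true = M @ x_true
    x_bg = M @ x_anal
    B = M @ A @ M.T + Q                   # for 3DVar, keep B constant instead
    # analysis step: optimal linear combination of background and obs, (5)
    x_obs = x_true + rng.multivariate_normal(np.zeros(2), R)
    x_anal = x_bg + B @ np.linalg.inv(B + R) @ (x_obs - x_bg)
    A = np.linalg.inv(np.linalg.inv(B) + np.linalg.inv(R))  # A^-1 = B^-1 + R^-1

print("analysis error:", np.linalg.norm(x_anal - x_true))
```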
To compare synchronization to standard data assimilation, we inquire as to the coupling that is optimal for synchronization, so that this coupling can be compared to the gain matrix used in the standard 3DVar and Kalman filtering schemes. The general form of coupling of truth to model that we consider in this section is given by a system of stochastic differential equations:

$\dot{x}_T = f(x_T)$
$\dot{x}_B = f(x_B) + C(x_T - x_B + \xi)$   (9)

where the true state x_T ∈ R^n and the model state x_B ∈ R^n evolve according to the same dynamics, given by f, and where the noise ξ in the coupling (observation) channel is the only source of stochasticity. The form (9) is meant to include dynamics f described by partial differential equations, as in the last section. The system is assumed to reach an equilibrium probability distribution, centered on the synchronization manifold x_B = x_T. The goal is to choose a time-dependent matrix C so as to minimize the spread of the distribution.

Note that if C is a projection matrix, or a multiple of the identity, then (9) effects a form of nudging. But for arbitrary C, the scheme is much more general. Indeed, continuous-time generalizations of 3DVar and Kalman filtering can be put in the form (9).
Let us assume that the dynamics vary slowly in state space, so that the Jacobian F ≡ Df, at a given instant, is the same for the two systems:

$Df(x_B) = Df(x_T)$   (10)

where terms of O(x_B − x_T) are ignored. Then the difference between the two equations (9), in a linearized approximation, is

$\dot{e} = Fe - Ce + C\xi$   (11)

where e ≡ x_B − x_T is the synchronization error.

The stochastic differential equation (11) implies a deterministic partial differential equation, the Fokker–Planck equation, for the probability distribution ρ(e):

$\frac{\partial \rho}{\partial t} + \nabla_e \cdot [(F - C)e\,\rho] = \frac{\delta}{2} \nabla_e \cdot (CRC^T \nabla_e \rho)$   (12)

where R ≡ ⟨ξξ^T⟩ is the observation error covariance matrix, and δ is a time-scale characteristic of the noise, analogous to the discrete time between molecular kicks in a Brownian motion process that is represented as a continuous process in Einstein's well-known treatment. Equation (12) states that the local change in ρ is given by the divergence of a probability current (F − C)eρ, except for random "kicks" due to the stochastic term.

The PDF can be taken to have the Gaussian form $\rho = N \exp(-e^T K e)$, where the matrix K is the inverse spread, and N is a normalization factor, chosen so that $\int \rho\, d^n e = 1$. For background error covariance B, $K = (2B)^{-1}$. In the one-dimensional case, n = 1, where C and K are scalars, substitution of the Gaussian form in (12), for the stationary case where ∂ρ/∂t = 0, yields:

$C - F = \delta R C^2 K$   (13)

or

$2B(C - F) = \delta R C^2$   (14)

Solving dB/dC = 0, it is readily seen that B is minimized (K is maximized) when C = 2F = (1/δ)B/R.

In the multidimensional case, n > 1, the relation (14) generalizes to the fluctuation–dissipation relation

$B(C - F)^T + (C - F)B = \delta C R C^T$   (15)

that can be obtained directly from the stochastic differential equation (11) by a standard proof [7]. B can then be minimized element-wise. Differentiating the matrix equation (15) with respect to the elements of C, we find

$dB\,(C - F)^T + B\,(dC)^T + (dC)\,B + (C - F)\,dB = \delta[(dC)RC^T + CR(dC)^T]$   (16)

where the matrix dC represents a set of arbitrary increments in the elements of C, and the matrix dB represents the resulting increments in the elements of B. Setting dB = 0, we have

$[B - \delta C R](dC)^T + (dC)[B - \delta R C^T] = 0$   (17)

Since the matrices B and R are each symmetric, the two terms in (17) are transposes of one another. It is easily shown that the vanishing of their sum, for arbitrary dC, implies the vanishing of the factors in brackets in (17). Therefore C = (1/δ)BR^{-1}, as in the 1D case.
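The one-dimensional optimum is easy to verify numerically: (14) gives B(C) = δRC²/(2(C − F)), and a scan over C locates the minimum at C = 2F, where B = 2δRF, so that indeed C = (1/δ)B/R. A minimal sketch, with illustrative values of F, R, and δ:

```python
import numpy as np

F, R, delta = 1.0, 0.5, 0.1                 # illustrative values
C = np.linspace(F + 1e-4, 10.0 * F, 200001)
B = delta * R * C**2 / (2.0 * (C - F))      # B(C), solving (14) for B
i = np.argmin(B)
print("optimal C:", C[i], "  2F:", 2 * F)               # the two agree
print("B/(delta*R) at optimum:", B[i] / (delta * R))    # equals 2F = C_opt
```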
Turning now to the standard methods, so that a comparison can be made, it is recalled that the analysis x_A after each cycle is given by:

$x_A = R(R+B)^{-1} x_B + B(R+B)^{-1} x_{obs} = x_B + B(R+B)^{-1}(x_{obs} - x_B)$   (18)

In 3DVar, the background error covariance matrix B is fixed; in Kalman filtering it is updated after each cycle using the linearized dynamics. The background for the next cycle is computed from the previous analysis by integrating the dynamical equations:

$x_B^{n+1} = x_A^n + \tau f(x_A^n)$   (19)

where τ is the time interval between successive analyses. Thus the forecasts satisfy a difference equation:

$x_B^{n+1} = x_B^n + B(R+B)^{-1}(x_{obs}^n - x_B^n) + \tau f(x_A^n)$   (20)

We model the discrete process as a continuous process in which analysis and forecast are the same:

$\dot{x}_B = f(x_B) + \frac{1}{\tau} B(B+R)^{-1}(x_T - x_B + \xi) + O[(B(B+R)^{-1})^2]$   (21)

as Einstein modeled Brownian motion as a continuous process, using the white noise ξ to represent the difference between observation x_obs and truth x_T. The continuous approximation is valid so long as f varies slowly on the time-scale τ.

It is seen that when background error is small compared to observation error, the higher order terms O[(B(B+R)^{-1})^2] can be neglected and the optimal coupling C = (1/δ)BR^{-1} is just the form that appears in the continuous data assimilation equation (21), for δ = τ. Thus under the linear assumption that Df(x_B) = Df(x_T), the synchronization approach is equivalent to 3DVar in the case of constant background error, and to Kalman filtering if background error is dynamically updated over time.

The equivalence can also be shown for an exact description of the discrete analysis cycle. That is, one can leave the analysis cycle intact and compare it to a discrete-time version of optimal synchronization, i.e. to optimally synchronized
maps. We rely on a fluctuation–dissipation relation (FDR) for stochastic difference equations. Consider the stochastic difference equation with additive noise,

$x(n+1) = F x(n) + \xi(n), \qquad \langle \xi(n)\xi(m)^T \rangle = R\, \delta_{n,m}$   (22)

where x, ξ ∈ R^n, F, R are n × n matrices, F is assumed to be stable, and ξ is Gaussian white noise. One can show that the equilibrium covariance matrix Σ ≡ ⟨xx^T⟩ satisfies the matrix FDR

$F \Sigma F^T - \Sigma + R = 0$   (23)

Now consider a model that takes the analysis at step n to a new background at step n+1, given by a linear matrix M. That is, x_B(n+1) = M x_A(n). Also, x_T(n+1) = M x_T(n). Since x_A(n) = x_B(n) + B(B+R)^{-1}(x_{obs}(n) - x_B(n)), where x_obs = x_T + ξ, we derive a difference equation for e ≡ x_B − x_T:

$e(n+1) = M(I - B(B+R)^{-1})e(n) + MB(B+R)^{-1}\xi$   (24)

For synchronously coupled maps, on the other hand, we have

$e(n+1) = (M - C)e(n) + C\xi$   (25)

and with the FDR as derived above:

$(M - C)B(M - C)^T - B + CRC^T = 0$   (26)

Differentiating the matrix equation (26) with respect to the elements of C, as in the continuous-time analysis, we find

$0 = (M-C)\,dB\,(M-C)^T - (dC)B(M-C)^T - (M-C)B(dC)^T - dB + (dC)RC^T + CR(dC)^T$   (27)

We seek a matrix C for which dB = 0 for arbitrary dC, and thus

$(dC)[B(M-C)^T - RC^T] + [(M-C)B - CR](dC)^T = 0$   (28)

for arbitrary dC. The two terms are transposes of one another, and it is easily shown, as in the continuous-time case, that the quantities in brackets must vanish. This gives the optimal matrix

$C = MB(B+R)^{-1}$   (29)

which upon substitution in (25) reproduces the standard data assimilation form (24), confirming the equivalence.
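The FDR (23) can be checked by direct Monte-Carlo iteration of the map (22); the matrices F and R below are illustrative (F must be stable):

```python
import numpy as np

# Monte-Carlo check of the matrix FDR (23) for the stochastic map (22).
rng = np.random.default_rng(1)
F = np.array([[0.9, 0.2], [0.0, 0.5]])    # stable: eigenvalues inside unit circle
R = np.array([[0.3, 0.1], [0.1, 0.2]])
noise = rng.multivariate_normal(np.zeros(2), R, size=200000)
x = np.zeros(2)
samples = []
for xi in noise:
    x = F @ x + xi                        # x(n+1) = F x(n) + xi(n), cf. (22)
    samples.append(x)
S = np.cov(np.array(samples[1000:]).T)    # equilibrium covariance <x x^T>
print(F @ S @ F.T - S + R)                # approximately the zero matrix, cf. (23)
```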
1.2 Synchronization vs. Data Assimilation for Strongly
Nonlinear Dynamics

1.2.1 The Perfect Model Case

In a region of state space where nonlinearities are strong and (10) fails, the prognostic equation (11) for the error is replaced by:

$\dot{e} = (F - C)e + Ge^2 + He^3 + C\xi$   (30)

where we have included terms up to cubic order in e, with H < 0 to prevent divergent error growth for large positive or negative e. In the multidimensional case, (30) is shorthand for a tensor equation in which G and H are tensors of rank three and rank four (and the restrictions on H are more complex). In the one-dimensional case, which we shall analyze here, G and H are scalars.

The Fokker–Planck equation is now:

$\frac{\partial \rho}{\partial t} + \nabla_e \cdot \{[(F-C)e + Ge^2 + He^3]\rho\} = \frac{\delta}{2} \nabla_e \cdot (CRC^T \nabla_e \rho)$   (31)

Using the ansatz for the PDF ρ:

$\rho(e) = N \exp(-Ke^2 + Le^3 + Me^4)$   (32)

with a normalization factor $N = [\int_{-\infty}^{\infty} \exp(-Ke^2 + Le^3 + Me^4)\, de]^{-1}$, we obtain from (31) the following relations between the dynamical parameters and the PDF parameters:

$F - C = -\frac{\delta}{2} C^2 R\, (2K)$
$G = \frac{\delta}{2} C^2 R\, (3L)$
$H = \frac{\delta}{2} C^2 R\, (4M)$   (33)

The goal is to minimize the background error:

$B(K, L, M) = \frac{\int_{-\infty}^{\infty} e^2 \exp(-Ke^2 + Le^3 + Me^4)\, de}{\int_{-\infty}^{\infty} \exp(-Ke^2 + Le^3 + Me^4)\, de}$   (34)

Using (33) to express the arguments of B in terms of the dynamical parameters, we find B(K, L, M) = B(K(C), L(C), M(C)) ≡ B(C) and can seek the value of C that minimizes B, for fixed dynamical parameters F, G, H.
For grounding in choosing appropriate parameter values, the nonlinearities of typical geophysical fluid systems were considered in a previous study [7]. The coupling that gives optimal synchronization can be compared with the coupling used in standard data assimilation, as for the linear case. In particular, one can ask whether the "covariance inflation" scheme that is used as an ad hoc adjustment in Kalman filtering [1] can reproduce the C values found to be optimal for synchronization. In the continuous assimilation case, the form C = (1/δ)BR^{-1}¹ is replaced by the adjusted form

$C = \frac{\mathcal{F}}{\delta} B R^{-1}$   (35)

where $\mathcal{F}$ is the covariance inflation factor.

¹ The use of C = (1/δ)B(B+R)^{-1} in [7] was misplaced. That form applies in the case of a discrete assimilation cycle.
1.2.2 Nonlinear Dynamics with Stochastic Model Error

In practice, the need for covariance inflation is thought to arise more from model error and from sampling error than from the nonlinearity of a hypothesized perfect model. The reasoning of the previous subsection can be readily extended to incorporate model error arising from processes on small scales that escape the digital representation. While errors in the parameters or the equations for the explicit degrees of freedom require deterministic corrections, the unresolved scales, assumed dynamically independent, can only be represented stochastically. The physical system is governed by:

$\dot{x}_T = f(x_T) - \xi_M$   (36)

in place of (9a), where ξ_M is model error, with covariance Q ≡ ⟨ξ_M ξ_M^T⟩. The error equation (30) becomes

$\dot{e} = (F - C)e + Ge^2 + He^3 + C\xi + \xi_M$   (37)

The Fokker–Planck equation becomes:

$\frac{\partial \rho}{\partial t} + \nabla_e \cdot \{[(F-C)e + Ge^2 + He^3]\rho\} = \frac{\delta}{2} \nabla_e \cdot [(CRC^T + Q)\nabla_e \rho]$   (38)

leading, as before, to:

$F - C = -\frac{\delta}{2}(C^2 R + Q)(2K)$
$G = \frac{\delta}{2}(C^2 R + Q)(3L)$
$H = \frac{\delta}{2}(C^2 R + Q)(4M)$   (39)

The background error B given by (34) is now expressed in terms of C by substituting expressions for K, L, and M derived from (39). The value of C that gives the minimum B for fixed dynamical parameters F, G, H can then be found numerically as in the perfect model case.

The optimization problem was solved numerically with results as shown in Table 1 for a range of parameter values. Results are shown in terms of length scales d1 and d2, for dynamics described by a two-well potential with two stable fixed points at respective distances d1, d2 from a central unstable fixed point. For that configuration it is found that G = 1/d2 − 1/d1 and H = −1/(d1 d2). The covariance inflation factors are remarkably constant over a wide range of parameters and agree with typical values used in operational practice. Results are displayed for the case where the amplitude of model error in (36) is 50% of the resolved tendency ẋ_T, with the resulting model error covariance Q approximately one-fourth of the background error covariance B.

Table 1 Covariance inflation factor vs. bimodality parameters d1, d2

            d1=0.75  d1=1.0  d1=1.25  d1=1.5  d1=1.75  d1=2.0
  d2=0.75    1.26     1.26    1.28     1.30    1.32     1.34
  d2=1.0     1.26     1.23    1.23     1.25    1.27     1.29
  d2=1.25    1.28     1.23    1.22     1.23    1.24     1.25
  d2=1.5     1.30     1.25    1.23     1.22    1.23     1.23
  d2=1.75    1.32     1.27    1.24     1.22    1.22     1.23
  d2=2.0     1.34     1.29    1.25     1.23    1.23     1.23
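The recipe just described is straightforward to implement: fix F, G, H from d1 and d2, invert (39) for K, L, M as functions of C, evaluate the quadratures in (34), and scan for the minimizing C. A minimal sketch follows; the values of R, Q, and δ are illustrative assumptions, not the operational settings behind Table 1:

```python
import numpy as np

def B_of_C(C, F, G, H, R, Q, delta):
    # invert (39) for the PDF parameters, then evaluate (34) by quadrature
    D = delta * (C**2 * R + Q)            # diffusion strength in (38)
    K = (C - F) / D
    L = 2.0 * G / (3.0 * D)
    M = H / (2.0 * D)                     # H < 0 makes the PDF normalizable
    e = np.linspace(-4.0, 4.0, 8001)
    rho = np.exp(-K * e**2 + L * e**3 + M * e**4)
    return np.sum(e**2 * rho) / np.sum(rho)   # uniform grid: spacings cancel

# two-well potential: stable fixed points at distances d1, d2 from the central
# unstable fixed point, so F = -H*d1*d2 = 1, G = 1/d2 - 1/d1, H = -1/(d1*d2)
d1, d2 = 1.0, 1.5
F, G, H = 1.0, 1.0 / d2 - 1.0 / d1, -1.0 / (d1 * d2)
R, Q, delta = 0.1, 0.025, 0.1             # illustrative assumptions

Cs = np.linspace(F + 0.01, 20.0 * F, 2000)
Bs = np.array([B_of_C(C, F, G, H, R, Q, delta) for C in Cs])
i = int(np.argmin(Bs))
print("optimal C:", Cs[i])
print("covariance inflation factor:", Cs[i] * delta * R / Bs[i])  # F in (35)
```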
The third type of error that necessitates covariance inflation is sampling error, which affects estimates of the background covariance B, and thus of the coupling C. Background error is systematically underestimated due to undersampling, and this effect has been treated by others [19]. Here, we mention that there is additional uncertainty in the optimal coupling due to random sampling, which can be represented as an additional noise term ξ_S in a revised error equation:

$\dot{e} = (F - C + \xi_S)e + Ge^2 + He^3 + C\xi + \xi_M$   (40)

The multiplicative noise from sampling error, with covariance S ≡ ⟨ξ_S ξ_S^T⟩, combines with the (uncorrelated) additive noise from observation error and model error, giving an extended Fokker–Planck equation (see e.g. [20]). That equation is most easily presented in the one-dimensional case:

$\frac{\partial \rho}{\partial t} + \frac{\partial}{\partial e}\{[(F-C)e + Ge^2 + He^3 + \tfrac{1}{2}Se]\rho\} = \frac{\delta}{2} \frac{\partial^2}{\partial e^2}[(C^2 R + Q + Se^2)\rho]$   (41)

This approach to the treatment of sampling error is currently under development. The synchronization view thus explains the range of values typically used for covariance inflation factors in the presence of model nonlinearity and the various sources of error. A more detailed analysis should enable a comparison with operational values in specific cases.

2 Supermodeling as Artificial Consciousness
In a perceiving brain, synchronization of truth and model occurs alongside internal
synchronization—in patterns of neuronal firing that have been suggested as a
mechanism for grouping of different features belonging to the same physical
object [12, 21, 25]. It was argued previously that patterns of synchronized firing
of neurons provide a particularly natural and useful representation of objective
grouping relationships, with chaotic intermittency allowing the system to escape
locally optimal patterns in favor of global ones [3], following an early suggestion of
Freeman’s [11]. The observed, highly intermittent synchronization of 40 Hz neural
spike trains might play just such a role.
The role of spike train synchronization in perceptual grouping has led to
speculations about the role of synchronization in consciousness [15, 18, 23, 25].
Recent debates over the physiological basis of consciousness have centered on
the question of what groups or categories of neurons must fire in synchrony in a
mental process for that process to be a “conscious” one [15]. Here we suggest a
relationship between internal synchronization and consciousness on a more naive
basis: Consciousness can be framed as self-perception, and then placed on a
footing similar to perception of the objective world. In this view, there must be
semi-autonomous parts of a “conscious” mind that perceive one another. In the
interpretation of Sect. 1, these components of the mind synchronize with one
another, or in alternative language, they perform “data assimilation” from one
another, with a limited exchange of information. The scheme has actually been
proposed and is currently being investigated for fusion of alternative computational
models of the same objective process in a practical context [24].
Taking the proposed interpretation of consciousness seriously, again imagine that
the world is a 3-variable Lorenz system, perceived by three different components of
mind, also represented by Lorenz systems, but with different parameters. The three
Lorenz systems also “self-perceive” each other. Three imperfect “model” Lorenz
systems were generated by perturbing parameters in the differential equations for
a given "real" Lorenz system, ẋ = σ(y − x); ẏ = ρx − y − xz; ż = −βz + xy, and adding extra terms. The resulting suite is:

$\dot{x}_i = \sigma_i(y_i - x_i) + \sum_{j \neq i} C_{ij}^x (x_j - x_i) + K_x(x - x_i)$
$\dot{y}_i = \rho_i x_i - y_i - x_i z_i + \mu_i + \sum_{j \neq i} C_{ij}^y (y_j - y_i) + K_y(y - y_i)$
$\dot{z}_i = -\beta_i z_i + x_i y_i + \sum_{j \neq i} C_{ij}^z (z_j - z_i) + K_z(z - z_i)$   (42)

where (x, y, z) is the real Lorenz system and (x_i, y_i, z_i), i = 1, 2, 3 are the three models. An extra term μ_i is present in the models but not in the real system. Because of the relatively small number of variables available in this toy system, all possible directional couplings among corresponding variables in the three Lorenz systems were considered, giving 18 connection coefficients C_{ij}^A, A = x, y, z; i, j = 1, 2, 3; i ≠ j. The constants K_A, A = x, y, z are chosen arbitrarily so as to effect "data assimilation" from the "real" Lorenz system into the three coupled "model" systems. The configuration is schematized in Fig. 1.

Fig. 1 "Model" Lorenz systems are linked to each other, generally in both directions, and to "reality" in one direction. Separate links between models, with distinct values of the connection coefficients C_ℓ^{ij}, are introduced for different variables and for each direction of possible influence
The connections linking the three model systems were chosen using a general result on parameter adaptation in synchronously coupled systems with mismatched parameters: If two systems synchronize when their parameters match, then under some weak assumptions it is possible to prescribe a dynamical evolution law for general parameters in one of the systems so that the parameters of the two systems, as well as the states, will converge [9]. In the present case the tunable parameters are taken to be the connection coefficients (not the parameters of the separate Lorenz systems), and they are tuned under the peculiar assumption that reality itself is a similar suite of connected Lorenz systems. The general result [9] gives the following adaptation rule for the couplings:

$\dot{C}_{i,j}^x = a(x_j - x_i)\left(x - \frac{1}{3}\sum_k x_k\right) - \epsilon/(C_{i,j}^x - 100)^2 + \epsilon/(C_{i,j}^x + \delta)^2$   (43)

with analogous equations for $\dot{C}_{i,j}^y$ and $\dot{C}_{i,j}^z$, where the adaptation rate a is an arbitrary constant and the terms with coefficient ε dynamically constrain all couplings C_{i,j}^A to remain in the range (−δ, 100) for some small number δ. Without recourse to the general result on parameter adaptation, the rule (43) has a simple interpretation: Time integrals of the first terms on the right-hand side of each equation give correlations between truth-model synchronization error, x − (1/3)Σ_k x_k, and inter-model "nudging," x_j − x_i. We indeed want to increase or decrease the inter-model nudging, for a given pair of corresponding variables, depending on the sign and magnitude of this correlation. (The learning algorithm we have described resembles a supervised version of Hebbian learning. In that scheme "cells that fire together wire together." Here, corresponding model components "wire together" in a preferred direction, until they "fire" in concert with reality.) The procedure will produce a set of values for the connection coefficients that is at least locally optimal.
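A minimal sketch of the scheme (42)-(43) follows. The parameter perturbations follow the Fig. 2 caption, while the adaptation rate a, the barrier coefficient ε, the nudging strength K, the time step, and the initial conditions are illustrative assumptions:

```python
import numpy as np

def lorenz(s, rho, beta, sigma, mu=0.0):
    x, y, z = s
    return np.array([sigma * (y - x), rho * x - y - x * z + mu, -beta * z + x * y])

rho, beta, sigma = 28.0, 8.0 / 3.0, 10.0          # "real" system
rhos   = [rho, rho, rho]                          # model parameters, cf. Fig. 2
betas  = [beta, 1.0, 4.0]
sigmas = [15.0, sigma, 5.0]
mus    = [30.0, -30.0, 0.0]                       # extra terms in the y equations

dt, K, a, eps, delta = 0.001, 10.0, 1.0, 1e-3, 0.01   # assumed values
truth = np.array([1.0, 1.0, 25.0])
models = np.array([[1.1, 0.9, 24.0], [0.8, 1.2, 26.0], [1.0, 1.1, 25.5]])
C = np.ones((3, 3, 3))                            # C[A, i, j], A = x, y, z

for step in range(200000):
    truth = truth + dt * lorenz(truth, rho, beta, sigma)
    err = truth - models.mean(axis=0)             # truth-minus-ensemble-mean
    new = np.empty_like(models)
    for i in range(3):
        tend = lorenz(models[i], rhos[i], betas[i], sigmas[i], mus[i])
        for A in range(3):
            tend[A] += sum(C[A, i, j] * (models[j, A] - models[i, A])
                           for j in range(3) if j != i)
            tend[A] += K * (truth[A] - models[i, A])   # assimilation of "reality"
        new[i] = models[i] + dt * tend
    for A in range(3):                            # adaptation rule (43)
        for i in range(3):
            for j in range(3):
                if i != j:
                    C[A, i, j] += dt * (a * (models[j, A] - models[i, A]) * err[A]
                                        - eps / (C[A, i, j] - 100.0) ** 2
                                        + eps / (C[A, i, j] + delta) ** 2)
    models = new

print("synchronization error:", np.abs(truth - models.mean(axis=0)))
```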
A simple case is one in which each of the three model systems contains the
“correct” equation for only one of the three variables, and “incorrect” equations for
the other two. The “real” system could then be formed using large connections for
the three correct equations, with other connections vanishing. Other combinations
of model equations will also approximate reality.
In a numerical experiment (Fig. 2a), the couplings did not converge, but the
coupled suite of “models” did indeed synchronize with the “real” system, even
with the adaptation process turned off half-way through the simulation so that the
coupling coefficients C_{i,j}^A subsequently held fixed values. The difference between
corresponding variables in the “real” and coupled “model” systems was significantly
less than the difference using the average outputs of the same suite of models, not
coupled among themselves. (With the coupling turned on, the three models also
synchronized among themselves nearly identically, so the average was nearly the
same in that case as the output of any single model.) Further, without the model–
model coupling, the output of the single model with the best equation for the given
variable (in this case z, modeled best by system 1) differed even more from “reality”
than the average output of the three models. Therefore, it is unlikely that any ex post
facto weighting scheme applied to the three outputs would give results equalling
those of the synchronized suite. Internal synchronization within the multi-model
“mind” is essential. In a case where no model had the “correct” equation for any
variable, results were only slightly worse (Fig. 2d).
The above scheme for fusion of imperfect computational/mental models only
requires that the models come equipped with a procedure to assimilate new
measurements from an objective process in real time, and hence from one another.
The scheme has indeed been suggested for the combination of long-range climate
projection models, which differ significantly among themselves in regard to the
magnitude and regional characteristics of expected global warming [6]. (To project
twenty-first century climate, the models are disconnected from reality after training,
parameters are altered slightly to represent increased greenhouse gas levels, and
one assesses changes in the overall shape of the attractor.) In this context, previous
results with Lorenz systems were confirmed and extended using a more developed
machine learning method to determine inter-model connections [24]. The scheme could also be applied to financial, physiological, or ecological models.

Fig. 2 Difference z_m − z between "model" and "real" z vs. time for a Lorenz system with ρ = 28, β = 8/3, σ = 10.0 and a suite of models with ρ_{1,2,3} = ρ, β_1 = β, σ_1 = 15.0, μ_1 = 30.0, β_2 = 1.0, σ_2 = σ, μ_2 = −30.0, β_3 = 4.0, σ_3 = 5.0, μ_3 = 0. The synchronization error is shown for (a) the average of the coupled suite z_m = (z_1 + z_2 + z_3)/3 with couplings C_{ij}^A adapted according to (43) for 0 < t < 500 and held constant for 500 < t < 1,000; (b) the same average z_m but with all C_{ij}^A = 0; (c) z_m = z_1, the output of the model with the best z equation, with C_{ij}^A = 0; (d) as in (a) but with β_1 = 7/3, σ_2 = 13.0, and μ_3 = 8.0, so that no equation in any model is "correct." (Analogous comparisons for x and y give similar conclusions)
That the transition to synchronization among a suite of interconnected systems
is sharper than the transition for a pair of systems is taken here to bolster the
previous suggestions that synchronization plays a fundamental role in conscious
mental processing. It remains to integrate a theory of higher-level synchronization
with the known synchronization of 40 Hz spike trains. It is certainly plausible
that inter-scale interactions might allow synchronization at one level to rest on
and/or support synchronization at the other level. In a complex biological nervous
system, with a steady stream of new input data, it is also very plausible that natural
noise or chaos would give rise to very brief periods of widespread high-quality
synchronization across the system, and possibly between the system and reality.
Such “synchronicities” would appear subjectively as consciousness.
2.1 Supermodeling in a Simple Neural Model
To explore the possible connection between conscious perception and the type of "supermodel" described above, we examine a simple model of visual grouping exhibiting both internal and external synchronization.

We consider a 40 × 40 array of FitzHugh–Nagumo (FN) oscillators [10, 16], each oscillator a 2-variable system, connected via diffusive coupling:

$\dot{v}_{ij} = 3v_{ij} - v_{ij}^3 - v_{ij}^7 + 2 - w_{ij} + \sum_{(i,j) \neq (i',j')} k_{ij\,i'j'}(v_{i'j'} - v_{ij})$
$\dot{w}_{ij} = c[\alpha(1 + \tanh(\beta v_{ij})) - w_{ij}] + \sum_{(i,j) \neq (i',j')} k_{ij\,i'j'}(w_{i'j'} - w_{ij})$,   i, j = 1, ..., 40   (44)
For image segmentation, the connection coefficient k_{ij i'j'} links a pixel at position (i, j) to a pixel at (i', j'), with strength depending on the brightness values Image_{ij} and Image_{i'j'}. Specifically, we take

$k_{ij\,i'j'} = \begin{cases} H(\mathrm{Image}_{ij} - \gamma)\, \exp[-(\mathrm{Image}_{i'j'} - \mathrm{Image}_{ij})^2/\sigma^2] & \text{if } (i', j') \in N_{i,j} \\ 0 & \text{otherwise} \end{cases}$   (45)

where N_{ij} is a small neighborhood of (i, j), of radius 2.5, and H is a Heaviside step function, vanishing for negative arguments. The constants c, α, β, γ, and σ are defined in the figure captions.

The effect of the coupling (45) is to cause the spike trains of the oscillators at neighboring positions to synchronize if their input pixel gray levels are similar. But additionally, the step function negates the synchronizing effect unless there is a stimulus present, of brightness greater than some threshold γ. (One could imagine the connection scheme to have emerged from a process of fast Hebbian learning, which we do not represent explicitly here.) Upon presentation of a stimulus such as the bright square in Fig. 3, the neurons in the region of the square synchronize, while the neurons corresponding to the background field remain desynchronized. The synchronization pattern defines a segmentation of the visual field. If desired, that segmentation could be represented by another layer of neurons, at lower resolution, with potentials determined by the variances over local regions of our FN array, or proxies thereto that are more readily calculated by the neural network.

Fig. 3 Stimulus (a) presented to an array of FN neurons (44) induces a synchronization pattern (b) in which the FN v variables in (44), represented by gray level, closely agree within the bright square but desynchronize in the dark background region, in a snapshot at t = 0.15. The v-cycle for units in the stimulus region is shown in panel (c) for units at (15, 15) (purple) and (25, 25) (yellow), along with the cycle for a desynchronized unit in the background at (25, 2). (α = 12, β = 4, c = 0.04, γ = 10.) Time is in units of Δt/10
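A minimal sketch of the segmenting array, using the forms of (44) and (45) as reconstructed above: the grid uses periodic boundaries for brevity, and the stimulus image, gray-level scale σ, time step, and initial conditions are illustrative assumptions (α, β, c, γ follow the Fig. 3 caption):

```python
import numpy as np

N, dt, steps = 40, 0.01, 3000
alpha, beta, c, gamma, sig = 12.0, 4.0, 0.04, 10.0, 1.0   # sig is assumed

image = np.zeros((N, N))
image[12:28, 12:28] = 20.0                 # bright square on a dark background
rng = np.random.default_rng(2)
v = rng.uniform(-1.5, 1.5, (N, N))         # desynchronized initial state
w = np.zeros((N, N))

# neighborhood offsets of radius 2.5, cf. (45)
offsets = [(di, dj) for di in range(-2, 3) for dj in range(-2, 3)
           if 0 < di * di + dj * dj <= 6]
gate = (image > gamma).astype(float)       # Heaviside factor H(Image - gamma)

def coupling(field):
    # sum over the neighborhood of k_{ij,i'j'} * (field_{i'j'} - field_{ij})
    tot = np.zeros((N, N))
    for di, dj in offsets:
        nbr_f = np.roll(np.roll(field, di, 0), dj, 1)   # periodic boundaries
        nbr_I = np.roll(np.roll(image, di, 0), dj, 1)
        k = gate * np.exp(-(nbr_I - image) ** 2 / sig**2)
        tot += k * (nbr_f - field)
    return tot

for _ in range(steps):
    dv = 3 * v - v**3 - v**7 + 2 - w + coupling(v)      # cf. (44)
    dw = c * (alpha * (1 + np.tanh(beta * v)) - w) + coupling(w)
    v, w = v + dt * dv, w + dt * dw

print("v variance inside stimulus:", v[12:28, 12:28].var())
print("v variance in background:  ", v[:10, :10].var())
```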
We now imagine a time-varying stimulus. If the stimulus varies on a much slower time scale than that of the neural spike trains, as seems realistic, then one can expect the synchronization patterns to themselves synchronize with the slowly varying stimulus. The thresholding form in (45) would result in a crude, binarized form of synchronization with the external stimulus. But if one imagines a random distribution of thresholds among a large collection of neurons, then
there should be analogue synchronization between the incoming signal and the strength of the neural pattern. For convenience, we consider here the unrealistic case of stimulus modulation on a time scale of the same order as that of the spike trains. Even in this case, one observes a vestige of synchronization between the neural patterns, measured as variance, and the incoming signal, as seen in Fig. 4. Near the maximum signal phase in each stimulus cycle, there is a relatively sharp drop in the variance of the neural pattern. (There is a background component of steadily decreasing variance over most of the time window because of the shape of the recently synchronized waveforms, which are in a decreasing phase, as is their variance.) The behavior can be expected to generalize to chaotic inputs, especially for longer stimulus time scales.

Fig. 4 The stimulus pattern in Fig. 3a is temporally modulated by the waveform (a). The variance (b) within the inner square region drops sharply at times of high signal value (yellow background line). (The more gradual decreasing trend in variance over most of the time window results from the decreasing signal in the portion of the waveform considered, as shown in Fig. 3c.)
The role of “supermodeling” in the FN array is to make a pattern of activity
in a large collection of neurons synchronize more perfectly with a coherent,
spatially extended input pattern, where each neuron is only exposed to a restricted,
imperfect representation of the entire pattern. We consider a stimulus of nonuniform
brightness, as might be due to an illumination gradient, but with the same temporal
variation as before. Again, there is a weak vestige of synchronization between the
neural pattern and the signal (Fig. 5). It is conjectured that the coupled collection of
oscillators will synchronize better with the overall temporal pattern than a smaller
collection of oscillators responding to a small piece of the input. In one sense, that
conjecture is trivial, since variances computed over small sets will be noisy—the
advantage of supermodeling is logically tied to the system’s use of synchronization
patterns for representation of stimuli in the first place! But additionally, there is
typically a synergistic effect in coupled networks of nonlinear oscillators. It is
thought that the degree of synchronization with the external pattern will actually
be enhanced by internal synchronization within the network. Support for this
conjecture is provided by the results for a uniform increase in coupling strength
across the network: The dashed line in Fig. 5c shows enhancement of the stimulus-provoked drops in variance, as well as a decrease in overall variances.
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
Fig. 5 A ramp image stimulus, as might arise from a gradual trend in illumination (a), is input to the FN network, with the same overall time modulation as in Fig. 4, giving synchronized v-cycles within the stimulus region and desynchronized background, as seen in a snapshot at t = 25 (b). The variance (c) within the inner square region drops sharply at times of high signal value (yellow line) as before, effectively "discounting the illuminant". The temporal pattern of response to the external stimulus is enhanced if all internal couplings are increased by a factor of 32 (dashed line)
To complete the analogy with computational supermodeling, one may imagine that the "image" inputs are provided by lower levels of a neural hierarchy, and that there are differences in the "models" at those lower levels. Conversely, to reach the type of self-perceptive processing that was described above as conscious, the inter-model synchronization at higher levels should be at slower time scales, not those of spike trains. The visual segmentation example serves mostly to illustrate a fundamental interplay, in "neural" dynamics, between synchronization within a set of models or units on the one hand, and synchronizability with a system that is to be compactly represented on the other.

References
1. Anderson, J.L.: An ensemble adjustment Kalman filter for data assimilation. Mon. Wea. Rev.
129, 2884–2903 (2001)
2. Duane, G.S.: Synchronized chaos in extended systems and meteorological teleconnections.
Phys. Rev. E 56, 6475–6493 (1997)
3. Duane, G.S.: A ‘cellular neuronal’ approach to optimization problems. Chaos 19, Art. No.
033114 (2009)
4. Duane, G.S., Tribbia, J.J.: Synchronized chaos in geophysical fluid dynamics. Phys. Rev. Lett.
86, 4298–4301 (2001)
5. Duane, G.S., Tribbia, J.J.: Weak Atlantic-Pacific teleconnections as synchronized chaos.
J. Atmos. Sci. 61, 2149–2168 (2004)
6. Duane, G.S., Tribbia, J., Kirtman, B.: Consensus on long-range prediction by adaptive
synchronization of models. In: Paper presented at EGU General Assembly, No. 13324, Vienna,
Austria, April 2009
7. Duane, G.S., Tribbia, J.J., Weiss, J.B.: Synchronicity in predictive modelling: A new view of
data assimilation. Nonlinear Process. Geophys. 13, 601–612 (2006)
8. Duane, G.S., Webster, P.J., Weiss, J.B.: Co-occurrence of Northern and Southern Hemisphere
blocks as partially synchronized chaos. J. Atmos. Sci. 56, 4183–4205 (1999)
9. Duane, G.S., Yu, D.-C., Kocarev, L.: Identical synchronization, with translation invariance,
implies parameter estimation. Phys. Lett. A 371, 416–420 (2007)
10. FitzHugh, R.: Impulses and physiological states in theoretical models of nerve membrane.
Biophys. J. 1, 445–466 (1961)
11. Freeman, W.J.: Chaos in the brain – possible roles in biological intelligence. Int. J. Intell. Syst.
10, 71–88 (1995)
12. Gray, C.M., Konig, P., Engel, A.K., Singer, W.: Oscillatory responses in cat visual-cortex
exhibit inter-columnar synchronization which reflects global stimulus properties. Nature 338,
334–337 (1989)
13. Jung, C.G., Pauli, W.: The Interpretation of Nature and the Psyche. Pantheon, New York (1955)
14. Kocarev, L., Tasev, Z., Parlitz, U.: Synchronizing spatiotemporal chaos of partial differential
equations. Phys. Rev. Lett. 79, 51–54 (1997)
15. Koch, C., Greenfield, S.: How does consciousness happen? Sci. Am. 297, 76–83 (2007)
16. Nagumo, J., Arimoto, S., Yoshizawa, S.: An active pulse transmission line simulating nerve
axon. Proc. IRE 50, 2061–2070 (1962)
17. Pecora, L.M., Carroll, T.L.: Synchronization in chaotic systems. Phys. Rev. Lett. 64, 821–824
(1990)
18. Rodriguez, E., George, N., Lachaux, J.P., Martinerie, J., Renault, B., Varela, F.J.: Perception’s
shadow: Long-distance synchronization of human brain activity. Nature 397, 430–433 (1999)
19. Sacher, W., Bartello, P.: Sampling errors in ensemble Kalman filtering. Part I: Theory. Mon. Wea. Rev. 136, 3035–3049 (2008)
20. Sardeshmukh, P.D., Sura, P.: Reconciling non-Gaussian climate statistics with linear dynamics.
J. Climate 22, 1193–1207 (2009)
21. Schechter, B.: How the brain gets rhythm. Science 274, 339–340 (1996)
22. So, P., Ott, E., Dayawansa, W.P.: Observing chaos – deducing and tracking the state of a chaotic
system from limited observation. Phys. Rev. E 49, 2650–2660 (1994)
23. Strogatz, S.H.: Sync: The Emerging Science of Spontaneous Order, p. 338. Theia, New York
(2003)
24. van den Berge, L.A., Selten, F.M., Wiegerinck, W., Duane, G.S.: A multi-model ensemble
method that combines imperfect models through learning. Earth Syst. Dyn. 2, 161–177 (2011)
25. von der Malsburg, C., Schneider, W.: A neural cocktail-party processor. Biol. Cybern. 54, 29–40
(1986)
26. Yang, S.-C., Baker, D., Cordes, K., Huff, M., Nagpal, G., Okereke, E., Villafañe, J., Duane, G.: Data assimilation as synchronization of truth and model: Experiments with the three-variable Lorenz system. J. Atmos. Sci. 63, 2340–2354 (2006)