Data Assimilation as Artificial Perception and Supermodeling as Artificial Consciousness

Gregory S. Duane

Abstract  Data assimilation is naturally conceived as the synchronization of two systems, "truth" and "model", coupled through a limited exchange of information (observed data) in one direction. Though investigated most thoroughly in meteorology, the task of data assimilation arises in any situation where a predictive computational model is updated in run time by new observations of the target system, including the case where that model is a perceiving biological mind. In accordance with a view of a semi-autonomous mind evolving in synchrony with the material world, but not slaved to it, the goal is to prescribe a coupling between truth and model for maximal synchronization. It is shown that optimization leads to the usual algorithms for assimilation via Kalman filtering under a weak linearity assumption. For nonlinear systems with model error and sampling error, the synchronization view gives a recipe for calculating covariance inflation factors that are usually introduced on an ad hoc basis. Consciousness can be framed as self-perception, and represented as a collection of models that assimilate data from one another and collectively synchronize. The combination of internal and external synchronization is examined in an array of models of spiking neurons, coupled to each other and to a stimulus, so as to segment a visual field. The inter-neuron coupling appears to enhance the overall synchronization of the model with reality.

G.S. Duane
Macedonian Academy of Sciences and Arts; University of Colorado, Boulder, CO, USA
e-mail: [email protected]

L. Kocarev (ed.), Consensus and Synchronization in Complex Networks, Understanding Complex Systems, DOI 10.1007/978-3-642-33359-0_8, © Springer-Verlag Berlin Heidelberg 2013

1 Data Assimilation as Synchronization of Truth and Model

A computational model of a physical process that receives a stream of new data as it runs must include a scheme to combine the new data with the model's prediction of the current state of the process. The goal of any such scheme is the optimal prediction of the future behavior of the physical process. While the relevance of the data assimilation problem is thus quite broad, techniques have been investigated most extensively for weather modeling, because of the high dimensionality of the fluid dynamical state space and the frequency of potentially useful new observational input. Existing data assimilation techniques (3DVar, 4DVar, Kalman filtering, and ensemble Kalman filtering) combine observed data with the most recent forecast of the current state to form a best estimate of the true state of the atmosphere, each approach making different assumptions about the nature of the errors in the model and the observations.

An alternative view of the data assimilation problem is suggested here. The objective of the process is not to "nowcast" the current state of reality, but to make the model converge to reality in the future. Recognizing also that a predictive model, especially a large one, is a semi-autonomous dynamical system in its own right, influenced but not determined by observational input from a coexisting reality, it is seen that the guiding principle that is needed is one of synchronism. That is, we seek to introduce a one-way coupling between reality and model such that the two tend to be in the same state, or in states that in some way correspond, at each instant of time. The problem of data assimilation thus reduces to the problem of synchronization of a pair of dynamical systems, unidirectionally coupled through a noisy channel that passes a limited number of "observed" variables.
While the synchronization of loosely coupled regular oscillators with limit-cycle attractors is ubiquitous in nature [23], synchronization of chaotic oscillators has been explored only in the last two decades, in a wave of research spurred by the seminal work of Pecora and Carroll [17]. Chaos synchronization can be surprising because it implies that two systems, each effectively unpredictable, connected by a signal that can be virtually indistinguishable from noise, nevertheless exhibit a predictable relationship. Chaos synchronization has indeed been used to predict new kinds of weak teleconnection patterns relating different sectors of the global climate system [2, 5, 8]. It is now clear that chaos synchronization is surprisingly easy to arrange, in both ODE and PDE systems [4, 5, 14]. A pair of spatially extended chaotic systems, such as two quasi-2D fluid models, can be made to synchronize completely even if coupled at only a discrete set of points and only intermittently in time. The application of chaos synchronization to the tracking of one dynamical system by another was proposed by So et al. [22], so the synchronization of the fluid models suggests a natural extension to meteorological data assimilation [7, 26].

Since the problem of data assimilation arises in any situation requiring a computational model of a parallel physical process to track that process as accurately as possible based on limited input, it is suggested here that the broadest view of data assimilation is that of machine perception by an artificially intelligent system. In this context, the role of synchronism is reminiscent of the psychologist Carl Jung's notion of synchronicity in his view of the relationship between mind and the material world. Like a data assimilation system, mind forms a model of reality that functions well, despite limited sensory input.
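How easily chaos synchronization can be arranged with a limited one-way coupling can be illustrated by a minimal numerical sketch (the coupling strength, time step, and initial states below are illustrative choices, not taken from the chapter): a "model" Lorenz system nudged toward an identical "truth" system converges to it from a distant initial state.

```python
import numpy as np

def lorenz(s, sigma=10.0, rho=28.0, beta=8.0/3.0):
    """Standard Lorenz-63 tendencies."""
    x, y, z = s
    return np.array([sigma*(y - x), rho*x - y - x*z, -beta*z + x*y])

dt, K = 0.001, 50.0                   # Euler step and (strong) nudging coefficient
truth = np.array([1.0, 1.0, 1.0])
model = np.array([-8.0, 7.0, 27.0])   # very different initial state

for _ in range(100_000):
    # one-way "nudging" coupling: only truth -> model, never model -> truth
    model = model + dt*(lorenz(model) + K*(truth - model))
    truth = truth + dt*lorenz(truth)

err = float(np.linalg.norm(truth - model))
assert err < 1e-6   # complete synchronization despite chaotic dynamics
```

With weaker coupling, or coupling through only a subset of the variables, the same setup exhibits the partial or intermittent synchronization discussed in the references.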
Jung, working in collaboration with Wolfgang Pauli [13], noted uncanny coincidences or "synchronicities" between mental and physical phenomena, which he took to reflect a new kind of order connecting the two realms. It was important to Jung and Pauli that synchronicities themselves were distinct, isolated events, but such phenomena can emerge naturally as a degraded form of chaos synchronization. In the artificial intelligence view of data assimilation, the additional issue of model error can be approached naturally as a problem of machine learning, which can indeed be addressed by extending the methodology of synchronization, as we show here.

1.1 Standard Data Assimilation vs. Synchronization

Standard data assimilation, unlike synchronization, estimates the current state x_T ∈ R^n of one system, "truth," from the state of a model system x_B ∈ R^n, combined with noisy observations of truth. The best estimate of truth is the "analysis" x_A, which is the state that minimizes error as compared to all possible linear combinations of observations and model. That is,

    x_A = x_B + K(x_obs − x_B)    (1)

minimizes the analysis error ⟨(x_A − x_T)²⟩ for a stochastic distribution given by x_obs = x_T + ε, where ε is observational noise, for a properly chosen n × n gain matrix K. The standard methods to be considered in this paper correspond to specific forms for the generally time-dependent matrix K. The simple method known as 3DVar uses a time-independent K that is based on the time-averaged statistical properties of the observational noise and the resulting forecast error. Let the matrix

    R ≡ ⟨εε^T⟩ = ⟨(x_obs − x_T)(x_obs − x_T)^T⟩    (2)

be the observation error covariance, and the matrix

    B ≡ ⟨(x_B − x_T)(x_B − x_T)^T⟩    (3)

be the "background" error covariance, describing the deviation of the model state from the true state.
If both covariance matrices are assumed to be constant in time, then the optimal linear combination of background and observations is:

    x_A = R(R + B)⁻¹x_B + B(R + B)⁻¹x_obs    (4)

The formula (4), which simply states that observations are weighted more heavily when background error is greater, and conversely, defines the 3DVar method in practical data assimilation, based on empirical estimates of R and B. The 4DVar method, which will not be considered here, generalizes (4) to estimate a short history of true states from a corresponding short history of observations.

The Kalman filtering method, which is popular for a variety of tracking problems, uses the dynamics of the model to update the background error covariance B sequentially. The analysis at each assimilation cycle i is:

    x_A^i = R(R + B^i)⁻¹x_B^i + B^i(R + B^i)⁻¹x_obs^i    (5)

where the background x_B^i is formed from the previous analysis x_A^{i−1} simply by running the model M: R^n → R^n,

    x_B^i = M_{i−1→i}(x_A^{i−1})    (6)

as is done in 3DVar. But now the background error is updated according to

    B^i = M_{i−1→i} A^{i−1} M_{i−1→i}^T + Q    (7)

where A is the analysis error covariance A ≡ ⟨(x_A − x_T)(x_A − x_T)^T⟩, given conveniently by A⁻¹ = B⁻¹ + R⁻¹. The matrix M is the tangent linear model given by

    M_ab ≡ (∂M_b/∂x_a)|_{x = x_A}    (8)

The update formula (7) gives the minimum analysis error ⟨(x_A − x_T)²⟩ = Tr A at each cycle. The term Q is the covariance of the error in the model itself.

To compare synchronization to standard data assimilation, we inquire as to the coupling that is optimal for synchronization, so that this coupling can be compared to the gain matrix used in the standard 3DVar and Kalman filtering schemes.
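The two algebraically equivalent forms of the analysis update appearing in (1) and (4), and the stated relation A⁻¹ = B⁻¹ + R⁻¹ for the analysis error covariance, can be checked on a small randomly generated example (the matrices and states below are illustrative, not from the chapter):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3

def random_spd(scale=1.0):
    # a generic symmetric positive-definite covariance matrix
    A = rng.normal(size=(n, n))
    return scale*(A @ A.T) + n*np.eye(n)

B = random_spd()       # background error covariance
R = random_spd(0.5)    # observation error covariance
x_b = rng.normal(size=n)
x_obs = rng.normal(size=n)

W = np.linalg.inv(R + B)
x_a1 = R @ W @ x_b + B @ W @ x_obs    # weighted-combination form (4)
x_a2 = x_b + B @ W @ (x_obs - x_b)    # gain form (1), with K = B(R + B)^{-1}
assert np.allclose(x_a1, x_a2)

# analysis error covariance: (B^{-1} + R^{-1})^{-1} equals (I - K)B
A_cov = np.linalg.inv(np.linalg.inv(B) + np.linalg.inv(R))
assert np.allclose(A_cov, (np.eye(n) - B @ W) @ B)
```

The equivalence holds because R(R + B)⁻¹ + B(R + B)⁻¹ = I; the second identity is the standard relation between the precision-sum and gain forms of the Kalman analysis covariance.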
The general form of coupling of truth to model that we consider in this section is given by a system of stochastic differential equations:

    ẋ_T = f(x_T)
    ẋ_B = f(x_B) + C(x_T − x_B + ε)    (9)

where the true state x_T ∈ R^n and the model state x_B ∈ R^n evolve according to the same dynamics, given by f, and where the noise ε in the coupling (observation) channel is the only source of stochasticity. The form (9) is meant to include dynamics f described by partial differential equations, as in the last section. The system is assumed to reach an equilibrium probability distribution, centered on the synchronization manifold x_B = x_T. The goal is to choose a time-dependent matrix C so as to minimize the spread of the distribution. Note that if C is a projection matrix, or a multiple of the identity, then (9) effects a form of nudging. But for arbitrary C, the scheme is much more general. Indeed, continuous-time generalizations of 3DVar and Kalman filtering can be put in the form (9).

Let us assume that the dynamics vary slowly in state space, so that the Jacobian F ≡ Df, at a given instant, is the same for the two systems:

    Df(x_B) = Df(x_T)    (10)

where terms of O(x_B − x_T) are ignored. Then the difference between the two equations (9), in a linearized approximation, is

    ė = (F − C)e + Cε    (11)

where e ≡ x_B − x_T is the synchronization error. The stochastic differential equation (11) implies a deterministic partial differential equation, the Fokker–Planck equation, for the probability distribution ρ(e):

    ∂ρ/∂t + ∇_e · [(F − C)e ρ] = (δ/2) ∇_e · (CRC^T ∇_e ρ)    (12)

where R = ⟨εε^T⟩ is the observation error covariance matrix, and δ is a time-scale characteristic of the noise, analogous to the discrete time between molecular kicks in a Brownian motion process that is represented as a continuous process in Einstein's well-known treatment. Equation (12) states that the local change in ρ is given by the divergence of a probability current ρ(F − C)e, except for random "kicks" due to the stochastic term.

The PDF can be taken to have the Gaussian form ρ = N exp(−e^T K e), where the matrix K is the inverse spread, and N is a normalization factor, chosen so that ∫ ρ d^n e = 1. For background error covariance B, K = (2B)⁻¹. In the one-dimensional case, n = 1, where C and K are scalars, substitution of the Gaussian form in (12), for the stationary case where ∂ρ/∂t = 0, yields:

    C − F = δRC²K    (13)

or

    2B(C − F) = δRC²    (14)

Solving dB/dC = 0, it is readily seen that B is minimized (K is maximized) when C = 2F = (1/δ)B/R. In the multidimensional case, n > 1, the relation (14) generalizes to the fluctuation–dissipation relation

    B(C − F)^T + (C − F)B = δCRC^T    (15)

that can be obtained directly from the stochastic differential equation (11) by a standard proof [7]. B can then be minimized element-wise. Differentiating the matrix equation (15) with respect to the elements of C, we find

    dB(C − F)^T + B(dC)^T + (dC)B + (C − F)dB = δ[(dC)RC^T + CR(dC)^T]    (16)

where the matrix dC represents a set of arbitrary increments in the elements of C, and the matrix dB represents the resulting increments in the elements of B. Setting dB = 0, we have

    [B − δCR](dC)^T + (dC)[B − δRC^T] = 0    (17)

Since the matrices B and R are each symmetric, the two terms in (17) are transposes of one another. It is easily shown that the vanishing of their sum, for arbitrary dC, implies the vanishing of the factors in brackets in (17). Therefore C = (1/δ)BR⁻¹, as in the 1D case.
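In one dimension, (14) gives the background error explicitly as B(C) = δRC²/(2(C − F)), so the stated optimum C = 2F, with minimum B = 2δFR and, equivalently, C = (1/δ)B/R, can be verified by a direct scan (the parameter values below are arbitrary):

```python
import numpy as np

F, R, delta = 1.5, 0.4, 0.1                   # illustrative scalar parameters
C = np.linspace(F + 1e-3, 10.0*F, 200_000)    # couplings C > F (positive variance)
B = delta*R*C**2 / (2.0*(C - F))              # stationary error variance, from (14)

C_opt = C[np.argmin(B)]
assert abs(C_opt - 2.0*F) < 1e-2              # optimum at C = 2F
assert abs(B.min() - 2.0*delta*F*R) < 1e-6    # minimal background error 2*delta*F*R
assert abs(C_opt - B.min()/(delta*R)) < 1e-2  # equivalently C = (1/delta) B/R
```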
Turning now to the standard methods, so that a comparison can be made, it is recalled that the analysis x_A after each cycle is given by:

    x_A = R(R + B)⁻¹x_B + B(R + B)⁻¹x_obs = x_B + B(R + B)⁻¹(x_obs − x_B)    (18)

In 3DVar, the background error covariance matrix B is fixed; in Kalman filtering it is updated after each cycle using the linearized dynamics. The background for the next cycle is computed from the previous analysis by integrating the dynamical equations:

    x_B^{n+1} = x_A^n + f(x_A^n)τ    (19)

where τ is the time interval between successive analyses. Thus the forecasts satisfy a difference equation:

    x_B^{n+1} = x_B^n + B(R + B)⁻¹(x_obs^n − x_B^n) + f(x_A^n)τ    (20)

We model the discrete process as a continuous process in which analysis and forecast are the same:

    ẋ_B = f(x_B) + (1/τ)B(B + R)⁻¹(x_T − x_B + ε) + O[(B(B + R)⁻¹)²]    (21)

as Einstein modeled Brownian motion as a continuous process, using the white noise ε to represent the difference between observation x_obs and truth x_T. The continuous approximation is valid so long as f varies slowly on the time-scale τ. It is seen that, when background error is small compared to observation error, the higher-order terms O[(B(B + R)⁻¹)²] can be neglected, and the optimal coupling C = (1/δ)BR⁻¹ is just the form that appears in the continuous data assimilation equation (21), for δ = τ. Thus, under the linear assumption that Df(x_B) = Df(x_T), the synchronization approach is equivalent to 3DVar in the case of constant background error, and to Kalman filtering if background error is dynamically updated over time.

The equivalence can also be shown for an exact description of the discrete analysis cycle. That is, one can leave the analysis cycle intact and compare it to a discrete-time version of optimal synchronization, i.e. to optimally synchronized maps.
We rely on a fluctuation–dissipation relation (FDR) for stochastic difference equations. Consider the stochastic difference equation with additive noise,

    x(n + 1) = F x(n) + ξ(n),  ⟨ξ(n)ξ(m)^T⟩ = R δ_{n,m}    (22)

where x, ξ ∈ R^n, F and R are n × n matrices, F is assumed to be stable, and ξ is Gaussian white noise. One can show that the equilibrium covariance matrix Σ ≡ ⟨xx^T⟩ satisfies the matrix FDR

    FΣF^T − Σ + R = 0    (23)

Now consider a model that takes the analysis at step n to a new background at step n + 1, given by a linear matrix M. That is, x_B(n + 1) = M x_A(n). Also, x_T(n + 1) = M x_T(n). Since x_A(n) = x_B(n) + B(B + R)⁻¹(x_obs(n) − x_B(n)), where x_obs = x_T + ε, we derive a difference equation for e ≡ x_B − x_T:

    e(n + 1) = M(I − B(B + R)⁻¹)e(n) + MB(B + R)⁻¹ε    (24)

For synchronously coupled maps, on the other hand, we have

    e(n + 1) = (M − C)e(n) + Cε    (25)

and, with the FDR as derived above,

    (M − C)B(M − C)^T − B + CRC^T = 0    (26)

Differentiating the matrix equation (26) with respect to the elements of C, as in the continuous-time analysis, we find

    0 = (M − C)dB(M − C)^T − (dC)B(M − C)^T − (M − C)B(dC)^T − dB + (dC)RC^T + CR(dC)^T    (27)

We seek a matrix C for which dB = 0 for arbitrary dC, and thus

    (dC)[B(M − C)^T − RC^T] + [(M − C)B − CR](dC)^T = 0    (28)

for arbitrary dC. The two terms are transposes of one another, and it is easily shown, as in the continuous-time case, that the quantities in brackets must vanish. This gives the optimal matrix

    C = MB(B + R)⁻¹    (29)

which upon substitution in (25) reproduces the standard data assimilation form (24), confirming the equivalence.

1.2 Synchronization vs. Data Assimilation for Strongly Nonlinear Dynamics

1.2.1 The Perfect Model Case

In a region of state space where nonlinearities are strong and (10) fails, the prognostic equation (11) for the error e is replaced by:

    ė = (F − C)e + Ge² + He³ + Cε    (30)

where we have included terms up to cubic order in e, with H < 0 to prevent divergent error growth for large positive or negative e. In the multidimensional case, (30) is shorthand for a tensor equation in which G and H are tensors of rank three and rank four (and the restrictions on H are more complex). In the one-dimensional case, which we shall analyze here, G and H are scalars. The Fokker–Planck equation is now:

    ∂ρ/∂t + ∇_e · {[(F − C)e + Ge² + He³]ρ} = (δ/2)∇_e · (CRC^T ∇_e ρ)    (31)

Using the ansatz for the PDF ρ:

    ρ(e) = N exp(−Ke² + Le³ + Me⁴)    (32)

with a normalization factor N = [∫_{−∞}^{∞} exp(−Ke² + Le³ + Me⁴) de]⁻¹, we obtain from (31) the following relations between the dynamical parameters and the PDF parameters:

    F − C = −(δ/2)C²R(2K)
    G = (δ/2)C²R(3L)    (33)
    H = (δ/2)C²R(4M)

The goal is to minimize the background error:

    B(K, L, M) = ∫_{−∞}^{∞} e² exp(−Ke² + Le³ + Me⁴) de / ∫_{−∞}^{∞} exp(−Ke² + Le³ + Me⁴) de    (34)

Using (33) to express the arguments of B in terms of the dynamical parameters, we find B(K, L, M) = B(K(C), L(C), M(C)) ≡ B(C), and can seek the value of C that minimizes B, for fixed dynamical parameters F, G, H.

For grounding in choosing appropriate parameter values, the nonlinearities of typical geophysical fluid systems were considered in a previous study [7]. The coupling that gives optimal synchronization can be compared with the coupling used in standard data assimilation, as for the linear case.
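The minimization of (34) can be sketched numerically by inverting (33) to express (K, L, M) in terms of a trial coupling C. The values of F, δ, R, the integration grid, and the search interval below are illustrative assumptions, not the values used in the chapter's experiments:

```python
import numpy as np

F, R, delta = 1.0, 1.0, 1.0     # illustrative; F > 0 is the local instability rate
d1, d2 = 1.5, 1.0               # two-well geometry, as in Table 1's parameterization
G = 1.0/d2 - 1.0/d1             # quadratic coefficient of the error dynamics
H = -1.0/(d1*d2)                # cubic coefficient, H < 0

e = np.linspace(-10.0, 10.0, 4001)

def background_error(C):
    # invert (33) to get the PDF parameters for a trial coupling C
    K = (C - F)/(delta*C*C*R)
    L = 2.0*G/(3.0*delta*C*C*R)
    M = H/(2.0*delta*C*C*R)      # M < 0, so the integrals converge
    rho = np.exp(-K*e**2 + L*e**3 + M*e**4)
    return float((e**2 * rho).sum()/rho.sum())   # B = <e^2> under the ansatz (32)

Cs = np.linspace(1.1, 10.0, 400)
Bs = np.array([background_error(C) for C in Cs])
C_opt = Cs[np.argmin(Bs)]        # coupling minimizing the background error

assert np.all(np.isfinite(Bs)) and np.all(Bs > 0.0)
```

The quadrature here uses a simple Riemann sum on a fixed grid; the common factor of grid spacing cancels in the ratio (34).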
In particular, one can ask whether the "covariance inflation" scheme that is used as an ad hoc adjustment in Kalman filtering [1] can reproduce the C values found to be optimal for synchronization. In the continuous assimilation case, the form C = (1/δ)BR⁻¹ is replaced by the adjusted form¹

    C = (1/δ)𝓕BR⁻¹    (35)

where 𝓕 is the covariance inflation factor.

¹ The use of C = (1/τ)B(B + R)⁻¹ in [7] was misplaced. That form applies in the case of a discrete assimilation cycle.

1.2.2 Nonlinear Dynamics with Stochastic Model Error

In practice, the need for covariance inflation is thought to arise more from model error and from sampling error than from the nonlinearity of a hypothesized perfect model. The reasoning of the previous subsection can be readily extended to incorporate model error arising from processes on small scales that escape the digital representation. While errors in the parameters or the equations for the explicit degrees of freedom require deterministic corrections, the unresolved scales, assumed dynamically independent, can only be represented stochastically. The physical system is governed by:

    ẋ_T = f(x_T) − ξ_M    (36)

in place of the first equation of (9), where ξ_M is model error, with covariance Q ≡ ⟨ξ_M ξ_M^T⟩. The error equation (30) becomes

    ė = (F − C)e + Ge² + He³ + Cε + ξ_M    (37)

The Fokker–Planck equation becomes:

    ∂ρ/∂t + ∇_e · {[(F − C)e + Ge² + He³]ρ} = (δ/2)∇_e · [(CRC^T + Q)∇_e ρ]    (38)

leading, as before, to:

    F − C = −(δ/2)(C²R + Q)(2K)
    G = (δ/2)(C²R + Q)(3L)    (39)
    H = (δ/2)(C²R + Q)(4M)

The background error B given by (34) is now expressed in terms of C by substituting expressions for K, L, and M derived from (39). The value of C that gives the minimum B for fixed dynamical parameters F, G, H can then be found numerically, as in the perfect model case.

The optimization problem was solved numerically, with results as shown in Table 1 for a range of parameter values. Results are shown in terms of length scales d1 and d2, for dynamics described by a two-well potential with two stable fixed points at respective distances d1, d2 from a central unstable fixed point. For that configuration it is found that G = 1/d2 − 1/d1 and H = −1/(d1 d2). The covariance inflation factors are remarkably constant over a wide range of parameters and agree with typical values used in operational practice. Results are displayed for the case where the amplitude of model error in (36) is 50% of the resolved tendency ẋ_T, with the resulting model error covariance Q approximately one-fourth of the background error covariance B.

Table 1 Covariance inflation factor 𝓕 vs. bimodality parameters d1, d2

                               d1
    d2      0.75   1.0    1.25   1.5    1.75   2.0
    0.75    1.26   1.26   1.28   1.30   1.32   1.34
    1.0     1.26   1.23   1.23   1.25   1.27   1.29
    1.25    1.28   1.23   1.22   1.23   1.24   1.25
    1.5     1.30   1.25   1.23   1.22   1.22   1.23
    1.75    1.32   1.27   1.24   1.23   1.22   1.23
    2.0     1.34   1.29   1.25   1.23   1.23   1.23

The third type of error that necessitates covariance inflation is sampling error, which affects estimates of the background covariance B, and thus of the coupling C. Background error is systematically underestimated due to undersampling, and this effect has been treated by others [19].
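The quoted expressions G = 1/d2 − 1/d1 and H = −1/(d1·d2) for the two-well configuration can be verified by factoring the deterministic error tendency Fe + Ge² + He³: with F = −H·d1·d2 = 1, its zeros are the central unstable fixed point at 0 and stable fixed points at distances d1 and d2 on either side. A quick numerical check (d1, d2 chosen from the range in Table 1):

```python
import numpy as np

d1, d2 = 1.5, 0.75
F = 1.0                    # = -H*d1*d2, the instability rate at the central point
G = 1.0/d2 - 1.0/d1
H = -1.0/(d1*d2)

# fixed points of de/dt = F*e + G*e^2 + H*e^3
roots = np.sort(np.roots([H, G, F, 0.0]))
assert np.allclose(roots, [-d2, 0.0, d1])
```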
Here, we mention that there is additional uncertainty in the optimal coupling due to random sampling, which can be represented as an additional noise term S in a revised error equation:

    ė = (F − C + S)e + Ge² + He³ + Cε + ξ_M    (40)

The multiplicative noise from sampling error, with covariance 𝒮 ≡ ⟨SS^T⟩, combines with the (uncorrelated) additive noise from observation error and model error, giving an extended Fokker–Planck equation (see e.g. [20]). That equation is most easily presented in the one-dimensional case:

    ∂ρ/∂t + (∂/∂e){[(F − C)e + Ge² + He³ + 𝒮e]ρ} = (δ/2)(∂²/∂e²)[(C²R + Q + 𝒮e²)ρ]    (41)

This approach to the treatment of sampling error is currently under development. In the synchronization view, the range of values typically used for covariance inflation factors, in the presence of model nonlinearity and the various sources of error, can thus be explained. A more detailed analysis should enable a comparison with operational values in specific cases.

2 Supermodeling as Artificial Consciousness

In a perceiving brain, synchronization of truth and model occurs alongside internal synchronization, in patterns of neuronal firing that have been suggested as a mechanism for grouping of different features belonging to the same physical object [12, 21, 25]. It was argued previously that patterns of synchronized firing of neurons provide a particularly natural and useful representation of objective grouping relationships, with chaotic intermittency allowing the system to escape locally optimal patterns in favor of global ones [3], following an early suggestion of Freeman's [11]. The observed, highly intermittent synchronization of 40 Hz neural spike trains might play just such a role.
The role of spike train synchronization in perceptual grouping has led to speculations about the role of synchronization in consciousness [15, 18, 23, 25]. Recent debates over the physiological basis of consciousness have centered on the question of what groups or categories of neurons must fire in synchrony in a mental process for that process to be a "conscious" one [15]. Here we suggest a relationship between internal synchronization and consciousness on a more naive basis: Consciousness can be framed as self-perception, and then placed on a similar footing as perception of the objective world. In this view, there must be semi-autonomous parts of a "conscious" mind that perceive one another. In the interpretation of Sect. 1, these components of the mind synchronize with one another, or, in alternative language, they perform "data assimilation" from one another, with a limited exchange of information. The scheme has actually been proposed, and is currently being investigated, for fusion of alternative computational models of the same objective process in a practical context [24].

Taking the proposed interpretation of consciousness seriously, again imagine that the world is a 3-variable Lorenz system, perceived by three different components of mind, also represented by Lorenz systems, but with different parameters. The three Lorenz systems also "self-perceive" each other. Three imperfect "model" Lorenz systems were generated by perturbing parameters in the differential equations for a given "real" Lorenz system and adding extra terms. The resulting suite is:

    ẋ = σ(y − x),  ẏ = ρx − y − xz,  ż = −βz + xy

    ẋ_i = σ_i(y_i − x_i) + Σ_{j≠i} C^x_{ij}(x_j − x_i) + K_x(x − x_i)
    ẏ_i = ρ_i x_i − y_i − x_i z_i + μ_i + Σ_{j≠i} C^y_{ij}(y_j − y_i) + K_y(y − y_i)    (42)
    ż_i = −β_i z_i + x_i y_i + Σ_{j≠i} C^z_{ij}(z_j − z_i) + K_z(z − z_i)

where (x, y, z) is the real Lorenz system and (x_i, y_i, z_i), i = 1, 2, 3, are the three models. An extra term μ_i is present in the models but not in the real system. Because of the relatively small number of variables available in this toy system, all possible directional couplings among corresponding variables in the three Lorenz systems were considered, giving 18 connection coefficients C^A_{ij}, A = x, y, z, i, j = 1, 2, 3, i ≠ j. The constants K_A, A = x, y, z, are chosen arbitrarily so as to effect "data assimilation" from the "real" Lorenz system into the three coupled "model" systems. The configuration is schematized in Fig. 1.

Fig. 1 "Model" Lorenz systems are linked to each other, generally in both directions, and to "reality" in one direction. Separate links between models, with distinct values of the connection coefficients C_ℓ^{ij}, are introduced for different variables and for each direction of possible influence

The connections linking the three model systems were chosen using a general result on parameter adaptation in synchronously coupled systems with mismatched parameters: If two systems synchronize when their parameters match, then under some weak assumptions it is possible to prescribe a dynamical evolution law for general parameters in one of the systems so that the parameters of the two systems, as well as the states, will converge [9]. In the present case the tunable parameters are taken to be the connection coefficients (not the parameters of the separate Lorenz systems), and they are tuned under the peculiar assumption that reality itself is a similar suite of connected Lorenz systems.
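A minimal sketch of the suite (42), with the couplings adapted by a rule of the form (43) below; the adaptation constants, time step, initial states, and the particular perturbed model parameters are illustrative assumptions, not the chapter's experimental configuration:

```python
import numpy as np

def lorenz(s, sig, r, b, mu=0.0):
    x, y, z = s
    return np.array([sig*(y - x), r*x - y - x*z + mu, -b*z + x*y])

sigma, rho, beta = 10.0, 28.0, 8.0/3.0       # the "real" system
pars = [(sigma, 15.0, beta, 30.0),           # (sigma_i, rho_i, beta_i, mu_i):
        (sigma, rho,  1.0,  30.0),           # illustrative perturbed models
        (sigma, 5.0,  4.0,   0.0)]
K = 10.0                        # nudging from "reality" into every model variable
a, eps, delta = 0.02, 1e-3, 0.1 # adaptation rate and barrier constants (assumed)

dt, steps = 0.002, 20_000
truth = np.array([1.0, 1.0, 1.0])
models = np.array([[2.0, 3.0, 4.0], [-1.0, 0.5, 9.0], [5.0, -5.0, 20.0]])
C = np.ones((3, 3, 3))          # C[A, i, j]: coupling for variable A, model j -> i

for _ in range(steps):
    new = np.empty_like(models)
    for i in range(3):
        sig, r, b, mu = pars[i]
        inter = sum(C[:, i, j]*(models[j] - models[i]) for j in range(3) if j != i)
        new[i] = models[i] + dt*(lorenz(models[i], sig, r, b, mu)
                                 + inter + K*(truth - models[i]))
    # adaptation of the couplings: correlate inter-model nudging with
    # the truth-model synchronization error, plus barrier terms
    err = truth - models.mean(axis=0)
    for i in range(3):
        for j in range(3):
            if i != j:
                dC = (a*(models[j] - models[i])*err
                      - eps/(C[:, i, j] - 100.0)**2   # repels C from 100
                      + eps/(C[:, i, j] + delta)**2)  # repels C from -delta
                C[:, i, j] += dt*dC
    truth = truth + dt*lorenz(truth, sigma, rho, beta)
    models = new

assert np.all(np.isfinite(models)) and np.all(np.isfinite(C))
assert np.all(np.abs(models.mean(axis=0) - truth) < 50.0)
```

With the nudging turned on, the suite average tracks the "real" trajectory to within a bounded error despite every model having wrong parameters; setting K = 0 after a training period mimics the projection experiments described below.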
The general result [9] gives the following adaptation rule for the couplings: ! X 1 x x x CP i;j D a.xj xi / x xk =.Ci;j 100/2 C =.Ci;j C ı/2 (43) 3 k 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 Artificial Consciousness U N C O R R EC TE D PR O O F y z with analogous equations for CP i;j and CP i;j , where the adaptation rate a is an arbitrary constant and the terms with coefficient dynamically constrain all A couplings Ci;j to remain in the range .ı; 100/ for some small number ı. Without recourse to the general result on parameter adaptation, the rule (43) has a simple interpretation: Time integrals of the first terms on the right-hand side of P each equation give correlations between truth-model synchronization error, x 13 k xk , and inter-model “nudging,” xj xi . We indeed want to increase or decrease the inter-model nudging, for a given pair of corresponding variables, depending on the sign and magnitude of this correlation. (The learning algorithm we have described resembles a supervised version of Hebbian learning. In that scheme “cells that fire together wire together.” Here, corresponding model components “wire together” in a preferred direction, until they “fire” in concert with reality.) The procedure will produce a set of values for the connection coefficients that is at least locally optimal. A simple case is one in which each of the three model systems contains the “correct” equation for only one of the three variables, and “incorrect” equations for the other two. The “real” system could then be formed using large connections for the three correct equations, with other connections vanishing. Other combinations of model equations will also approximate reality. In a numerical experiment (Fig. 
2a), the couplings did not converge, but the coupled suite of “models” did indeed synchronize with the “real” system, even with the adaptation process turned off half-way through the simulation so that the A coupling coefficients Ci;j subsequently held fixed values. The difference between corresponding variables in the “real” and coupled “model” systems was significantly less than the difference using the average outputs of the same suite of models, not coupled among themselves. (With the coupling turned on, the three models also synchronized among themselves nearly identically, so the average was nearly the same in that case as the output of any single model.) Further, without the model– model coupling, the output of the single model with the best equation for the given variable (in this case z, modeled best by system 1) differed even more from “reality” than the average output of the three models. Therefore, it is unlikely that any ex post facto weighting scheme applied to the three outputs would give results equalling those of the synchronized suite. Internal synchronization within the multi-model “mind” is essential. In a case where no model had the “correct” equation for any variable, results were only slightly worse (Fig. 2d). The above scheme for fusion of imperfect computational/mental models only requires that the models come equipped with a procedure to assimilate new measurements from an objective process in real time, and hence from one another. The scheme has indeed been suggested for the combination of long-range climate projection models, which differ significantly among themselves in regard to the magnitude and regional characteristics of expected global warming [6]. (To project twenty-first century climate, the models are disconnected from reality after training, parameters are altered slightly to represent increased greenhouse gas levels, and one assesses changes in the overall shape of the attractor.) 
In this context, previous results with Lorenz systems were confirmed and extended using a more developed machine learning method to determine inter-model connections [24]. The scheme could also be applied to financial, physiological, or ecological models.

Fig. 2 Difference $z_m - z$ between "model" and "real" z vs. time for a Lorenz system with $\rho = 28$, $\beta = 8/3$, $\sigma = 10.0$ and a suite of models with $\beta_1 = \beta$, $\sigma_1 = 15.0$, $\rho_1 = 30.0$; $\beta_2 = 1.0$, $\sigma_2 = \sigma$, $\rho_2 = 30.0$; $\beta_3 = 4.0$, $\sigma_3 = 5.0$, $\rho_3 = \rho$. The synchronization error is shown for (a) the average of the coupled suite $z_m = (z_1 + z_2 + z_3)/3$, with couplings $C^A_{ij}$ adapted according to (43) for $0 < t < 500$ and held constant for $500 < t < 1{,}000$; (b) the same average $z_m$, but with all $C^A_{ij} = 0$; (c) $z_m = z_1$, the output of the model with the best z equation, with $C^A_{ij} = 0$; (d) as in (a), but with $\beta_1 = 7/3$, $\sigma_2 = 13.0$, and $\rho_3 = 8.0$, so that no equation in any model is "correct." (Analogous comparisons for x and y give similar conclusions)

That the transition to synchronization among a suite of interconnected systems is sharper than the transition for a pair of systems is taken here to bolster the previous suggestions that synchronization plays a fundamental role in conscious mental processing. It remains to integrate a theory of higher-level synchronization with the known synchronization of 40 Hz spike trains. It is certainly plausible that inter-scale interactions might allow synchronization at one level to rest on and/or support synchronization at the other level.
In a complex biological nervous system, with a steady stream of new input data, it is also very plausible that natural noise or chaos would give rise to very brief periods of widespread high-quality synchronization across the system, and possibly between the system and reality. Such "synchronicities" would appear subjectively as consciousness.

2.1 Supermodeling in a Simple Neural Model

To explore the possible connection between conscious perception and the type of "supermodel" described above, we examine a simple model of visual grouping exhibiting both internal and external synchronization. We consider a $40 \times 40$ array of FitzHugh–Nagumo (FN) oscillators [10, 16], each oscillator a 2-variable system, connected via diffusive coupling:

$$\dot{v}_{ij} = 3v_{ij} - v_{ij}^3 - v_{ij}^7 + 2 - w_{ij} + \sum_{(i',j')\neq(i,j)} k_{ij\,i'j'}\,(v_{i'j'} - v_{ij}), \qquad i,j = 1,\ldots,40$$
$$\dot{w}_{ij} = c\left[\alpha(1 + \tanh(\beta v_{ij})) - w_{ij}\right] + \sum_{(i',j')\neq(i,j)} k_{ij\,i'j'}\,(w_{i'j'} - w_{ij}) \qquad (44)$$

For image segmentation, the connection coefficient $k_{ij\,i'j'}$ links a pixel at position $(i,j)$ to a pixel at $(i',j')$, with strength depending on the brightness values $\mathrm{Image}_{ij}$ and $\mathrm{Image}_{i'j'}$. Specifically, we take

$$k_{ij\,i'j'} = \begin{cases} H(\mathrm{Image}_{ij} - \theta)\,\exp\left[-(\mathrm{Image}_{i'j'} - \mathrm{Image}_{ij})^2/\sigma^2\right] & \text{if } (i',j') \in N_{i,j} \\ 0 & \text{otherwise} \end{cases} \qquad (45)$$

where $N_{ij}$ is a small neighborhood of $(i,j)$, of radius 2.5, and $H$ is a Heaviside step function, vanishing for negative arguments. The constants $c$, $\alpha$, $\beta$, and $\sigma$ are defined in the figure captions.

Fig. 3 Stimulus (a) presented to an array of FN neurons (44) induces a synchronization pattern (b) in which the FN v variables in (44) (represented by gray level) closely agree within the bright square but desynchronize in the dark background region, in a snapshot at t = 0.15. The v-cycle for units in the stimulus region is shown in panel (c) for units at (15, 15) (purple) and (25, 25) (yellow), along with the cycle for a desynchronized unit in the background at (25, 2). (α = 12, β = 4, c = 0.04, σ = 10.) Time is in units of t/10

The effect of the coupling (45) is to cause the spike trains of the oscillators at neighboring positions to synchronize if their input pixel gray levels are similar. But additionally, the step function negates the synchronizing effect unless there is a stimulus present, of brightness greater than some threshold $\theta$. (One could imagine the connection scheme to have emerged from a process of fast Hebbian learning, which we do not represent explicitly here.) Upon presentation of a stimulus such as the bright square in Fig. 3, the neurons in the region of the square synchronize, while the neurons corresponding to the background field remain desynchronized. The synchronization pattern defines a segmentation of the visual field. If desired, that segmentation could be represented by another layer of neurons, at lower resolution, with potentials determined by the variances over local regions of our FN array, or proxies thereto that are more readily calculated by the neural network.

We now imagine a time-varying stimulus. If the stimulus varies on a much slower time scale than that of the neural spike trains, as seems realistic, then one can expect the synchronization patterns to themselves synchronize with the slowly varying stimulus. The thresholding form in (45) would result in a crude, binarized form of synchronization with the external stimulus. But if one imagines a random distribution of thresholds among a large collection of neurons, then

Fig. 4 The stimulus pattern in Fig. 3a is temporally modulated by the waveform (a).
The variance (b) within the inner square region drops sharply at times of high signal value (yellow background line). (The more gradual decreasing trend in variance over most of the time window results from the decreasing signal in the portion of the waveform considered, as shown in Fig. 3c.)

there should be analogue synchronization between the incoming signal and the strength of the neural pattern. For convenience, we consider here the unrealistic case of stimulus modulation on a time scale of the same order as that of the spike trains. Even in this case, one observes a vestige of synchronization between the neural patterns, measured as variance, and the incoming signal, as seen in Fig. 4. Near the maximum signal phase in each stimulus cycle, there is a relatively sharp drop in the variance of the neural pattern. (There is a background component of steadily decreasing variance over most of the time window because of the shape of the recently synchronized waveforms, which are in a decreasing phase, as is their variance.) The behavior can be expected to generalize to chaotic inputs, especially for longer stimulus time scales.

The role of "supermodeling" in the FN array is to make a pattern of activity in a large collection of neurons synchronize more perfectly with a coherent, spatially extended input pattern, where each neuron is only exposed to a restricted, imperfect representation of the entire pattern. We consider a stimulus of nonuniform brightness, as might be due to an illumination gradient, but with the same temporal variation as before. Again, there is a weak vestige of synchronization between the neural pattern and the signal (Fig. 5). It is conjectured that the coupled collection of oscillators will synchronize better with the overall temporal pattern than a smaller collection of oscillators responding to a small piece of the input.
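The segmentation-by-synchronization mechanism, with variance over a region as the readout, can be sketched numerically. The fragment below is a minimal illustration of the FN array (44) with image-gated connections (45), not the author's code: the grid size (20 × 20 rather than 40 × 40), time step, initial conditions, the stimulus (held static here), the threshold θ = 0.5, and the Gaussian width σ² = 0.1 are assumptions made for demonstration; α = 12, β = 4, and c = 0.04 follow the Fig. 3 caption.

```python
import numpy as np

N = 20
alpha, beta, c = 12.0, 4.0, 0.04      # constants from the Fig. 3 caption
theta, sig2 = 0.5, 0.1                # threshold and Gaussian width: illustrative assumptions
image = np.zeros((N, N))
image[5:15, 5:15] = 1.0               # bright square on a dark background

# Precompute the connection weights (45) for every neighbor offset within
# radius 2.5, zeroing connections that would cross the array boundary.
ii, jj = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
couplings = []
for di in range(-2, 3):
    for dj in range(-2, 3):
        if (di, dj) == (0, 0) or di*di + dj*dj > 2.5**2:
            continue
        nbr = np.roll(image, (-di, -dj), axis=(0, 1))       # Image at (i+di, j+dj)
        K = np.where(image - theta >= 0,                    # Heaviside gate H(Image - theta)
                     np.exp(-(nbr - image)**2 / sig2), 0.0)
        K = K * ((ii + di >= 0) & (ii + di < N) & (jj + dj >= 0) & (jj + dj < N))
        couplings.append(((di, dj), K))

rng = np.random.default_rng(1)
v = rng.uniform(-1.0, 1.0, (N, N))    # random initial phases (assumed)
w = rng.uniform(0.0, 24.0, (N, N))
dt = 0.005
for _ in range(10000):                # integrate (44) by forward Euler
    dv = 3*v - v**3 - v**7 + 2 - w
    dw = c*(alpha*(1 + np.tanh(beta*v)) - w)
    for (di, dj), K in couplings:
        dv += K*(np.roll(v, (-di, -dj), axis=(0, 1)) - v)   # diffusive v coupling
        dw += K*(np.roll(w, (-di, -dj), axis=(0, 1)) - w)   # diffusive w coupling
    v += dt*dv
    w += dt*dw

# Variance readout: low variance of v over a region signals synchronization there.
print("v variance inside stimulus :", v[6:14, 6:14].var())
print("v variance over background :", v[0:5, 0:5].var())
```

With these choices, units inside the bright square are pulled onto a common cycle while the ungated background units retain the phase spread of their initial conditions, so the regional variance separates stimulus from background as in Figs. 3–5.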
In one sense, that conjecture is trivial, since variances computed over small sets will be noisy; the advantage of supermodeling is logically tied to the system's use of synchronization patterns for representation of stimuli in the first place! But additionally, there is typically a synergistic effect in coupled networks of nonlinear oscillators. It is thought that the degree of synchronization with the external pattern will actually be enhanced by internal synchronization within the network. Support for this conjecture is provided by the results for a uniform increase in coupling strength across the network: the dashed line in Fig. 5c shows enhancement of the stimulus-provoked drops in variance, as well as a decrease in overall variances.

Fig. 5 A ramp image stimulus, as might arise from a gradual trend in illumination (a), is input to the FN network, with the same overall time modulation as in Fig. 4, giving synchronized v-cycles within the stimulus region and a desynchronized background, as seen in a snapshot at t = 25 (b). The variance (c) within the inner square region drops sharply at times of high signal value (yellow line) as before, effectively "discounting the illuminant". The temporal pattern of response to the external stimulus is enhanced if all internal couplings are increased by a factor of 32 (dashed line)

To complete the analogy with computational supermodeling, one may imagine that the "image" inputs are provided by lower levels of a neural hierarchy, and that there are differences in the "models" at those lower levels.
Conversely, to reach the type of self-perceptive processing that was described above as conscious, the inter-model synchronization at higher levels should be at slower time scales, not those of spike trains. The visual segmentation example serves mostly to illustrate a fundamental interplay, in "neural" dynamics, between synchronization within a set of models or units on the one hand, and synchronizability with a system that is to be compactly represented on the other.

References

1. Anderson, J.L.: An ensemble adjustment Kalman filter for data assimilation. Mon. Wea. Rev. 129, 2884–2903 (2001)
2. Duane, G.S.: Synchronized chaos in extended systems and meteorological teleconnections. Phys. Rev. E 56, 6475–6493 (1997)
3. Duane, G.S.: A 'cellular neuronal' approach to optimization problems. Chaos 19, Art. No. 033114 (2009)
4. Duane, G.S., Tribbia, J.J.: Synchronized chaos in geophysical fluid dynamics. Phys. Rev. Lett. 86, 4298–4301 (2001)
5. Duane, G.S., Tribbia, J.J.: Weak Atlantic–Pacific teleconnections as synchronized chaos. J. Atmos. Sci. 61, 2149–2168 (2004)
6. Duane, G.S., Tribbia, J., Kirtman, B.: Consensus on long-range prediction by adaptive synchronization of models. In: Paper presented at EGU General Assembly, No. 13324, Vienna, Austria, April 2009
7. Duane, G.S., Tribbia, J.J., Weiss, J.B.: Synchronicity in predictive modelling: A new view of data assimilation. Nonlinear Process. Geophys. 13, 601–612 (2006)
8. Duane, G.S., Webster, P.J., Weiss, J.B.: Co-occurrence of Northern and Southern Hemisphere blocks as partially synchronized chaos. J. Atmos. Sci. 56, 4183–4205 (1999)
9. Duane, G.S., Yu, D.-C., Kocarev, L.: Identical synchronization, with translation invariance, implies parameter estimation. Phys. Lett. A 371, 416–420 (2007)
10.
FitzHugh, R.: Impulses and physiological states in theoretical models of nerve membrane. Biophys. J. 1, 445–466 (1961)
11. Freeman, W.J.: Chaos in the brain – possible roles in biological intelligence. Int. J. Intell. Syst. 10, 71–88 (1995)
12. Gray, C.M., König, P., Engel, A.K., Singer, W.: Oscillatory responses in cat visual cortex exhibit inter-columnar synchronization which reflects global stimulus properties. Nature 338, 334–337 (1989)
13. Jung, C.G., Pauli, W.: The Interpretation of Nature and the Psyche. Pantheon, New York (1955)
14. Kocarev, L., Tasev, Z., Parlitz, U.: Synchronizing spatiotemporal chaos of partial differential equations. Phys. Rev. Lett. 79, 51–54 (1997)
15. Koch, C., Greenfield, S.: How does consciousness happen? Sci. Am. 297, 76–83 (2007)
16. Nagumo, J., Arimoto, S., Yoshizawa, S.: An active pulse transmission line simulating nerve axon. Proc. IRE 50, 2061–2070 (1962)
17. Pecora, L.M., Carroll, T.L.: Synchronization in chaotic systems. Phys. Rev. Lett. 64, 821–824 (1990)
18. Rodriguez, E., George, N., Lachaux, J.P., Martinerie, J., Renault, B., Varela, F.J.: Perception's shadow: Long-distance synchronization of human brain activity. Nature 397, 430–433 (1999)
19. Sacher, W., Bartello, P.: Sampling errors in ensemble Kalman filtering. Part I: Theory. Mon. Wea. Rev. 136, 3035–3049 (2008)
20. Sardeshmukh, P.D., Sura, P.: Reconciling non-Gaussian climate statistics with linear dynamics. J. Climate 22, 1193–1207 (2009)
21. Schechter, B.: How the brain gets rhythm. Science 274, 339–340 (1996)
22. So, P., Ott, E., Dayawansa, W.P.: Observing chaos – deducing and tracking the state of a chaotic system from limited observation. Phys. Rev. E 49, 2650–2660 (1994)
23. Strogatz, S.H.: Sync: The Emerging Science of Spontaneous Order, p. 338. Theia, New York (2003)
24. van den Berge, L.A., Selten, F.M., Wiegerinck, W., Duane, G.S.: A multi-model ensemble method that combines imperfect models through learning. Earth Syst. Dyn. 2, 161–177 (2011)
25.
von der Malsburg, C., Schneider, W.: A neural cocktail-party processor. Biol. Cybern. 54, 29–40 (1986)
26. Yang, S.-C., Baker, D., Cordes, K., Huff, M., Nagpal, G., Okereke, E., Villafañe, J., Duane, G.: Data assimilation as synchronization of truth and model: Experiments with the three-variable Lorenz system. J. Atmos. Sci. 63, 2340–2354 (2006)
