On Applications of New Soft and Evolutionary Computing Techniques to Direct and Inverse Modeling Problems

Thesis submitted in partial fulfillment of the requirements for the award of the degree of Doctor of Philosophy

by
Babita Majhi
Roll No. 50609003, Ph. D.

Under the guidance of
Prof. (Dr.) G. Panda, FNAE, FNASc.

Electronics & Communication Engineering
National Institute of Technology
Rourkela - 769008
CERTIFICATE
This is to certify that the thesis entitled “On Applications of New Soft and Evolutionary Computing Techniques to Direct and Inverse Modeling Problems” by Ms. Babita Majhi, submitted to the National Institute of Technology, Rourkela for the degree of Doctor of Philosophy, is a record of bona fide research work carried out by her in the Department of Electronics and Communication Engineering under my supervision. I believe that the thesis fulfills part of the requirements for the award of the degree of Doctor of Philosophy. The results embodied in the thesis have not been submitted for the award of any other degree.
Dr. Ganapati Panda, FNAE, FNASc.
Professor
Department of ECE
National Institute of Technology
Rourkela – 769008
India
ACKNOWLEDGEMENT
I am indebted to many people who contributed, through their support, knowledge and friendship, to this work and to my years at NIT Rourkela.
I am grateful to my supervisor, Prof. G. Panda, who gave me the opportunity to carry out this work in the laboratory. He encouraged, supported and motivated me with much kindness throughout the work. I always had the freedom to follow my own ideas, for which I am very grateful. I really admire his patience and staying power in carefully reading the whole thesis. It is because of his help that I stand where I am today.
I am also grateful to NIT Rourkela for providing adequate infrastructure to carry out the present investigations.
I am thankful to Prof. K. K. Mahapatra, Prof. S. K. Patra and Prof. S. Meher of the Electronics and Communication Engg. department and Prof. U. K. Mohanty of the Metallurgical and Materials Engg. department for extending their valuable suggestions and help whenever I approached them.
My special thanks to Dr. D. P. Acharya and Ajit Kumar Sahoo for their constant inspiration and encouragement during my research.
My hearty thanks to Sitanshu, Jagganath, Trilochan, Upendra, Sudhansu, Pyari, Nithin, Vikas, Pawan, Sasmita and Piter for their help, cooperation and encouragement.
I acknowledge all the staff, research scholars and juniors of the ECE department, NIT Rourkela for helping me.
I render my respect to all my family members for giving me mental support and inspiration for carrying out my research work.
(Babita Majhi)
Roll No. 50609003, Ph. D.
ABSTRACT
Adaptive direct modeling or system identification and adaptive inverse modeling or channel equalization find extensive applications in telecommunication, control systems, instrumentation, power system engineering and geophysics. If the plants or systems are of nonlinear, dynamic, Hammerstein or multiple-input multiple-output (MIMO) types, the identification task becomes very difficult.
Further, the existing conventional methods like the least mean square (LMS) and recursive least square (RLS) algorithms do not provide satisfactory training to develop accurate direct and inverse models. Very often these derivative based algorithms do not lead to optimal solutions in pole-zero and Hammerstein type system identification problems, as they have a tendency to be trapped in local minima.
In many practical situations the output data are contaminated with impulsive outliers in addition to measurement noise. The density of the outliers may be up to 50%, which means that about 50% of the available data are affected by outliers. The strength of these outliers may be two to five times the maximum amplitude of the signal. Under such adverse conditions the available learning algorithms are not effective in imparting satisfactory training to update the weights of the adaptive models. As a result the resulting direct and inverse models become inaccurate.
Hence there are three important issues which need attention. These are:
(i) Development of accurate direct and inverse models of complex plants using novel architectures and new learning techniques.
(ii) Development of new training rules which alleviate the local minima problem during training and thus help in generating improved adaptive models.
(iii) Development of robust training strategies which are less sensitive to outliers in the training data and thus create identification and equalization models which are robust against outliers.
These issues are addressed in this thesis and the corresponding contributions are outlined in seven chapters. In addition, one chapter on the introduction, another on the required architectures and algorithms, and a last chapter on conclusions and the scope for further research work are embodied in the thesis.
A new cascaded low complexity functional link artificial neural network (FLANN) structure is proposed and the corresponding learning algorithm is derived and used to identify nonlinear dynamic plants. In terms of identification performance this model is shown to outperform the multilayer perceptron and FLANN models. A novel method of identification of IIR plants is proposed using the comprehensive learning particle swarm optimization (CLPSO) algorithm. It is shown that the new approach is more accurate in identification and takes less CPU time compared to the existing recursive LMS (RLMS), genetic algorithm (GA) and PSO based approaches. The bacterial foraging optimization (BFO) and PSO techniques are used to develop efficient learning algorithms to train models that identify nonlinear dynamic and MIMO plants. The new scheme is more accurate, takes less computational effort and consumes fewer input samples for training. Robust identification and equalization of complex plants have been carried out in the presence of outliers in the training sets through minimization of robust norms using PSO and BFO based methods. This approach yields robust performance in both equalization and identification tasks. Identification of Hammerstein plants has been achieved successfully using PSO and the new clonal PSO (CPSO) and immunized PSO (IPSO) algorithms. Finally the thesis proposes a distributed approach to identification of plants by developing two distributed learning algorithms: incremental PSO and diffusion PSO. It is shown that the new approach is more efficient in terms of accuracy and training time compared to the centralized PSO based approach. In addition a robust distributed approach to identification is proposed and its performance has been evaluated.
In essence the thesis proposes many new and efficient algorithms and structures for identification and equalization tasks, such as distributed algorithms, robust algorithms, and algorithms for pole-zero and Hammerstein model identification. All these new methods are shown to be better in terms of performance, speed of computation or accuracy of results.
Contents

Particulars  Page No.

Certificate  i
Acknowledgement  ii
Abstract  iii-iv
Contents  v-ix
List of Figures  x-xiv
List of Tables  xv-xvi
Glossary  xvii-xviii

Chapters

1. Introduction  1-11
   1.1 Background  1
   1.2 Motivation  4
   1.3 Major contribution of the thesis  5
   1.4 Chapter wise contribution  6
   References  9

2. Selected adaptive architectures and bio-inspired techniques, principles and algorithms  12-45
   2.1 Introduction  12
   2.2 The adaptive filtering problem  14
       2.2.1 Adaptive FIR filter  15
       2.2.2 Adaptive IIR filter  15
   2.3 Artificial neural network (ANN)  17
       2.3.1 Single neuron structure  17
       2.3.2 Multilayer perceptron (MLP)  19
       2.3.3 Functional link artificial neural network (FLANN)  21
   2.4 Learning algorithms  23
       2.4.1 Derivative based algorithms  23
             2.4.1.1 LMS algorithm for adaptive FIR filters  24
             2.4.1.2 Adaptive IIR LMS (ILMS) algorithm  25
             2.4.1.3 Back propagation (BP) algorithm  25
             2.4.1.4 The FLANN algorithm  28
   2.5 Derivative free algorithms/evolutionary computing based algorithms  29
       2.5.1 Genetic algorithm (GA)  29
             2.5.1.1 Outline of the basic genetic algorithm  29
             2.5.1.2 Operators of GA  30
             2.5.1.3 Parameters of GA  32
             2.5.1.4 Selection  33
             2.5.1.5 GA for function optimization  33
       2.5.2 Particle swarm optimization (PSO)  36
             2.5.2.1 Basic method  36
             2.5.2.2 Particle swarm optimization algorithm  37
       2.5.3 Bacterial foraging optimization (BFO)  39
             2.5.3.1 Introduction  39
             2.5.3.2 Bacterial foraging  40
       2.5.4 Artificial immune system (AIS)  42
   References  44

3. Development of a new cascaded functional link artificial neural network (CFLANN) for nonlinear dynamic system identification  46-69
   3.1 Introduction  46
   3.2 Nonlinear dynamic system identification  49
   3.3 Cascaded functional link artificial neural network  51
       3.3.1 The FLANN  51
       3.3.2 The CFLANN  52
   3.4 Simulation study  54
   3.5 Conclusion  66
   References  66

4. Identification of IIR plants using comprehensive learning particle swarm optimization  70-92
   4.1 Introduction  70
   4.2 Related work  72
   4.3 Basics of modified PSO and CLPSO algorithms  74
   4.4 Adaptive system identification of IIR systems  76
   4.5 CLPSO based identification of IIR systems  78
   4.6 Simulation study  80
   4.7 Conclusion  88
   References  88

5. Dynamic system identification using FLANN structure and PSO and BFO based learning algorithms  93-115
   5.1 Introduction  93
   5.2 Dynamic system identification of nonlinear system  95
   5.3 A generalized FLANN structure based identification model  96
   5.4 BFO and PSO based nonlinear system identification  97
   5.5 Simulation study  101
   5.6 Conclusion  113
   References  113

6. Robust identification and prediction using particle swarm optimization technique  116-146
   6.1 Introduction  116
   6.2 Formulation of PSO based nonlinear system identification model  119
   6.3 Weight update of FLANN model by squared error minimization using PSO  123
   6.4 Development of robust identification and prediction models using PSO based training with robust norm minimization  124
   6.5 Simulation study  127
   6.6 Conclusion  142
   References  142

7. Robust adaptive inverse modeling using bacterial foraging optimization technique and applications  147-175
   7.1 Introduction  147
   7.2 Data recovery by adaptive channel equalization  151
   7.3 BFO based training of weights of inverse model  153
   7.4 Development of robust inverse modeling using BFO based training with robust norm minimization  155
   7.5 Simulation study  156
   7.6 Conclusion  172
   References  172

8. Identification of Hammerstein plants using clonal PSO and immunized PSO algorithms  176-200
   8.1 Introduction  176
   8.2 Identification of Hammerstein plants using FLANN  178
       8.2.1 Hammerstein model  178
       8.2.2 FLANN architecture for modeling nonlinear static part  179
   8.3 Proposed clonal PSO and immunized PSO algorithms  182
       8.3.1 The CPSO algorithm  183
       8.3.2 The IPSO algorithm  184
   8.4 Weight update of Hammerstein model  185
       8.4.1 Identification algorithm using FLANN structure and PSO based training  185
       8.4.2 Identification algorithm using FLANN structure and CPSO based training  187
       8.4.3 Identification algorithm using FLANN structure and IPSO based training  187
   8.5 Simulation study  188
   8.6 Conclusion  197
   References  197

9. Development of distributed particle swarm optimization algorithms for robust nonlinear system identification  201-225
   9.1 Introduction  201
   9.2 Distributed system identification  204
       9.2.1 INPSO based system identification  204
       9.2.2 DPSO based system identification  207
   9.3 Distributed robust identification of plants  209
   9.4 Stepwise distributed PSO algorithms  209
   9.5 Simulation study  210
   9.6 Conclusion  218
   References  223

10. Conclusion and scope for further work  226-233
    10.1 Conclusion  226
    10.2 Further research extension  228

Publications out of the thesis  229
LIST OF FIGURES

Fig. 2.1 The general adaptive filtering problem  14
Fig. 2.2 Adaptive filter using bio-inspired/derivative based algorithms  15
Fig. 2.3 Structure of an adaptive IIR filter  16
Fig. 2.4 Structure of a single neuron  17
Fig. 2.5 Different types of nonlinear activation function  18
Fig. 2.6 MLP structure  20
Fig. 2.7 Structure of the FLANN model  23
Fig. 2.8 Neural network using BP algorithm  25
Fig. 2.9 Chromosome  30
Fig. 2.10 Crossover  31
Fig. 2.11 Mutation  32
Fig. 2.12 Multimodal function of (2.47)  34
Fig. 2.13 Fitness curve of the function vs iteration  35
Fig. 2.14 General flow chart of PSO  38
Fig. 2.15 Swimming, tumbling and chemotactic behavior of E. coli  40
Fig. 2.16 The clonal selection principle  44
Fig. 3.1 Identification scheme of a dynamic system  49
Fig. 3.2 A FLANN model for identification of nonlinear dynamic systems  53
Fig. 3.3 A CFLANN model for identification of nonlinear dynamic systems  53
Fig. 3.4 Comparison of identification performance of nonlinear plants of Example 1  59
Fig. 3.5 Comparison of identification performance of nonlinear plant of Example 2  61
Fig. 3.6 Comparison of identification performance of nonlinear plant of Example 3  63
Fig. 3.7 Comparison of identification performance of nonlinear plant of Example 4  64
Fig. 4.1 Adaptive identification of IIR systems using output-error adaptive IIR filter as the model  76
Fig. 4.2(a) Comparison of convergence characteristics of different methods for an exact 2nd order IIR model  82
Fig. 4.2(b) Comparison of convergence characteristics of different methods for a reduced order (1st order) IIR model  82
Fig. 4.3(a) Comparison of convergence characteristics of different methods for an exact 3rd order IIR model  83
Fig. 4.3(b) Comparison of convergence characteristics of different methods for a reduced order (2nd order) IIR model  83
Fig. 4.4(a) Comparison of convergence characteristics of different methods for an exact 4th order IIR model  84
Fig. 4.4(b) Comparison of convergence characteristics of different methods for a reduced order (3rd order) IIR model  84
Fig. 4.5(a) Comparison of convergence characteristics of different methods for an exact 5th order IIR model  85
Fig. 4.5(b) Comparison of convergence characteristics of different methods for a reduced order (4th order) IIR model  86
Fig. 5.1 A generalized adaptive model of a complex dynamic nonlinear plant  97
Fig. 5.2 Response matching of static systems ((a), (b) for Example 1 and (c), (d) for Example 2)  104
Fig. 5.3 Comparison of response of the dynamic plant of Example 3 using nonlinearity defined in (5.15)  105
Fig. 5.4 Comparison of response of the dynamic plant of Example 3 using nonlinearity defined in (5.16)  106
Fig. 5.5 Comparison of response of the dynamic plant of Example 4  108
Fig. 5.6 Comparison of response of the dynamic plant of Example 5  109
Fig. 5.7 Block diagram of MIMO plant identification  110
Fig. 5.8 Response matching of MIMO system of Example 6  111
Fig. 6.1 Identification scheme of a dynamic system  119
Fig. 6.2 Identification of nonlinear dynamic plants using FLANN architecture and PSO based robust CF minimization  122
Fig. 6.3 Steps involved in the first generation weight update mechanism using PSO based CF minimization  126
Fig. 6.4 Steps involved in the second generation weight update mechanism using PSO based CF minimization  127
Fig. 6.5 Plot of desired signal with 50% outliers used in Example 1  129
Fig. 6.6 Response matching of static systems ((a), (b) for Example 1 and (c), (d) for Example 2)  130
Fig. 6.7 Comparison of response of the dynamic plant of Example 3 using nonlinearity defined in (6.22)  131
Fig. 6.8 Comparison of response of the dynamic plant of Example 3 using nonlinearity defined in (6.23)  132
Fig. 6.9 Comparison of response of the dynamic plant of Example 4  133
Fig. 6.10 Comparison of response of the dynamic plant of Example 5  135
Fig. 6.11 Comparison of response of the dynamic plant of Example 6  136
Fig. 6.12 Output response matching of Example 8  137
Fig. 6.13 Output response matching of Example 8  139
Fig. 6.14 Output response matching of Example 9  140
Fig. 7.1 Inverse modeling  148
Fig. 7.2 A digital communication system with BFO based adaptive inverse model  152
Fig. 7.3 Comparison of BER of four different CFs based nonlinear equalizers with [.209, .995, .209] as channel coefficients and NL1  159
Fig. 7.4 Comparison of BER of four different CFs based nonlinear equalizers with [.209, .995, .209] as channel coefficients and NL2  161
Fig. 7.5 Comparison of BER of four different CFs based nonlinear equalizers with [.260, .930, .260] as channel coefficients and NL1  163
Fig. 7.6 Comparison of BER of four different CFs based nonlinear equalizers with [.260, .930, .260] as channel coefficients and NL2  165
Fig. 7.7 Comparison of BER of four different CFs based nonlinear equalizers with [.304, .903, .304] as channel coefficients and NL1  167
Fig. 7.8 Comparison of BER of four different CFs based nonlinear equalizers with [.304, .903, .304] as channel coefficients and NL2  169
Fig. 7.9 Effect of EVR on the BER performance of the four CF-based equalizers in presence of 50% outliers  170
Fig. 7.10 Effect of EVR on the BER performance of the four CF-based equalizers in presence of 40% outliers  171
Fig. 8.1 The Hammerstein model  178
Fig. 8.2 Structure of FLANN model  179
Fig. 8.3 Adaptive identification model of the generalized Hammerstein plant  181
Fig. 8.4 Comparison of response at the output of nonlinear static part of the plant and the corresponding models of Example 1  190
Fig. 8.5 Comparison of response at the output of nonlinear static part of the plant and the corresponding models of Example 2  192
Fig. 8.6 Comparison of response at the output of nonlinear static part of the plant and the corresponding models of Example 3  194
Fig. 8.7 Comparison of response at the output of nonlinear static part of the plant and the corresponding models of Example 4  196
Fig. 9.1 Two modes of cooperation  202
Fig. 9.2 IPSO based nonlinear identification scheme  206
Fig. 9.3 DPSO based nonlinear identification scheme  208
Fig. 9.4(a) Convergence of System 1 with NL1 at -30 dB  214
Fig. 9.4(b) Convergence of System 1 with NL1 at -20 dB  214
Fig. 9.4(c) Convergence of System 2 with NL1 at -30 dB  214
Fig. 9.4(d) Convergence of System 2 with NL1 at -20 dB  215
Fig. 9.4(e) Convergence of System 1 with NL2 at -30 dB  215
Fig. 9.4(f) Convergence of System 1 with NL2 at -20 dB  215
Fig. 9.4(g) Convergence of System 2 with NL2 at -30 dB  216
Fig. 9.4(h) Convergence of System 2 with NL2 at -20 dB  216
Fig. 9.5(a) Response matching of System 1 with NL1 at -20 dB  216
Fig. 9.5(b) Response matching of System 2 with NL2 at -30 dB  217
LIST OF TABLES

Table 2.1 Initial generation C1  35
Table 2.2 C1 population after 400th iteration  36
Table 3.1 Comparison of the sum of squared errors (SSE) between the plant and the model outputs  65
Table 3.2 Comparison of computational complexity of various system identification models  65
Table 4.1 Comparison of performance between GA, PSO and CLPSO based training of weights  87
Table 4.2 Comparison between true and estimated pole-zero parameters obtained from RLMS, GA, PSO and CLPSO  87
Table 5.1 Comparison of NMSE (dB) computed for different examples of two different models  112
Table 5.2 Comparison of computational complexities of various system identification models  112
Table 6.1 Comparison of NMSE obtained in Examples 1 to 6 from models using three robust cost functions and the conventional MSE CF  141
Table 8.1 Comparison of true and estimated parameters of system for dynamic part of the model of Example 1  190
Table 8.2 Comparison of CPU time and SSE for identifying the plant of Example 1  191
Table 8.3 Comparative results of estimates of system parameters for dynamic part of the model of Example 2  192
Table 8.4 Comparison of CPU time and SSE for identifying the plant of Example 2  193
Table 8.5 Comparative results of estimates of system parameters for dynamic part of the model of Example 3  194
Table 8.6 Comparison of CPU time and SSE for identifying the plant of Example 3  195
Table 8.7 Comparative results of estimates of system parameters for dynamic part of the model of Example 4  196
Table 8.8 Comparison of CPU time and SSE for identifying the plant of Example 4  197
Table 9.1 Comparison of simulation parameters used in IPSO, DPSO and PSO based models  211
Table 9.2 Comparison of estimated parameters using IPSO, DPSO and PSO techniques  212
Table 9.3 Comparison of CPU time and sum of squared error obtained using PSO, IPSO and DPSO  217
Table 9.4 Comparison of sum of squared error (SSE) during testing for Example 4 with nonlinearity NL1  219
Table 9.5 Comparison of sum of squared error (SSE) during testing for Example 4 with nonlinearity NL2  220
Table 9.6 Comparison of sum of squared error (SSE) during testing for Example 5 with nonlinearity NL1  221
Table 9.7 Comparison of sum of squared error (SSE) during testing for Example 5 with nonlinearity NL2  222
GLOSSARY

ADSL  Adaptive digital subscriber loop
AIS  Artificial immune system
AWGN  Additive white Gaussian noise
BER  Bit error ratio
BFO  Bacterial foraging optimization
BIBO  Bounded input bounded output
BP  Back propagation
CF  Cost function
CFLANN  Cascaded functional link artificial neural network
CLPSO  Comprehensive learning particle swarm optimization
CNN  Chebyshev neural network
CPSO  Clonal particle swarm optimization
DPSO  Diffusion particle swarm optimization
DSP  Digital signal processing
EVR  Eigenvalue ratio
FE  Functional expansion
FIR  Finite impulse response
FLANN  Functional link artificial neural network
FPGA  Field programmable gate array
GA  Genetic algorithm
HVAC  Heating, ventilating and air conditioning
IIR  Infinite impulse response
ILMS  IIR LMS
INPSO  Incremental particle swarm optimization
IPSO  Immunized particle swarm optimization
ISI  Inter symbol interference
LMS  Least mean square
MGS  Mackey-Glass system
MIMO  Multiple input multiple output
MLANN  Multilayer artificial neural network
MLP  Multilayer perceptron
MLSE  Mean log squared error
MMSE  Minimum mean square error
MSE  Mean square error
NMSE  Normalized mean square error
PPN  Polynomial perceptron network
PSO  Particle swarm optimization
RBF  Radial basis function
RCF  Robust cost function
RLMS  Recursive least mean square
RLS  Recursive least square
RNN  Recurrent neural network
SI  Swarm intelligence
SISO  Single input single output
SSE  Sum squared error
VLSI  Very large scale integrated
WNN  Wavelet neural network
Chapter 1

Introduction
1.1 Background
Out of many applications of adaptive filtering, direct modeling and
inverse modeling are very important. The direct modeling or system
identification finds applications in control system engineering
including robotics [1.1], intelligent sensor design [1.2], process control [1.3],
power system engineering [1.4], image and speech processing [1.4], geophysics
[1.5], acoustic noise and vibration control [1.6] and biomedical engineering [1.7].
Similarly the inverse modeling technique is used in digital data reconstruction [1.8], channel equalization in digital communication [1.9], digital magnetic data recording [1.10], intelligent sensor design [1.2] and deconvolution of seismic data [1.11].
The direct modeling mainly refers to adaptive identification of unknown plants.
Simple static linear plants are easily identified through parameter estimation
using conventional derivative based least mean square (LMS) type algorithms
[1.12]. But most practical plants are dynamic, nonlinear or a combination of these two characteristics. In many applications Hammerstein and MIMO plants need identification. In addition the output of the plant is associated with measurement noise or additive white Gaussian noise (AWGN). Identification of such complex plants is a difficult task and poses many challenging problems. Similarly inverse modeling of telecommunication and magnetic medium channels is important for reducing the effect of inter symbol interference (ISI) and achieving faithful reconstruction of the original data. Likewise, adaptive inverse modeling of sensors is required to extend their linear range for direct digital readout and enhancement of dynamic range. If the channel or sensor characteristic is modeled as a nonlinear filter with a large eigenvalue ratio (EVR) together with AWGN, building an accurate inverse model is also a difficult and challenging task. These two important and complex issues are addressed in the thesis and attempts have been made to provide improved, efficient and promising alternative solutions.
The conventional LMS and recursive least square (RLS) [1.13] techniques work well for identification of static plants, but when the plants are of dynamic type, the existing forward-backward LMS [1.14] and RLS algorithms very often lead to non-optimal solutions due to premature convergence of the weights to local minima [1.15]. This is a major drawback of the existing derivative based techniques. To alleviate this issue, this thesis suggests the use of derivative free optimization techniques in place of the conventional techniques.
In the recent past, population based optimization techniques have been reported which fall under the category of evolutionary computing [1.16] or computational
intelligence [1.17]. These are also called bio-inspired techniques which include
genetic algorithm (GA) and its variants [1.18], particle swarm optimization
(PSO) and its variants [1.19], bacterial foraging optimization (BFO) and its
variants [1.20] and artificial immune system (AIS) and its variants [1.21]. These
techniques are suitably employed to obtain efficient iterative learning algorithms
for developing adaptive direct and inverse models of complex plants and
channels.
Development of direct and inverse adaptive models essentially consists of two
components. The first component is an adaptive network which may be linear
or nonlinear in nature. Use of a nonlinear network is preferable when nonlinear
plants or channels are to be identified or equalized. The linear networks used in
the thesis are adaptive linear combiner or all-zero or FIR structure [1.7] and
pole-zero or IIR structure [1.7]. Under the nonlinear category, the low complexity single layer functional link artificial neural network (FLANN) [1.22] and multilayer
perceptron network (MLP) [1.23] are used. The second component is the
training or learning algorithm used to train the parameters of the model. As
stated earlier the structures used are trained by bio-inspired techniques such as
GA, PSO and modified PSOs, BFO and modified BFOs. Depending upon the
complexity and nature of the plants to be identified, a proper combination of model network and corresponding bio-inspired learning rule is selected so that the combination yields the best possible performance in direct and inverse modeling tasks. This requires prior experience and knowledge of simulation results. One of the objectives of the present investigation is to choose models with an appropriate combination of structure and algorithm so as to provide the best possible performance in direct and inverse modeling. The bio-inspired optimization tools cannot be applied directly to develop direct and inverse models of plants, as they are not originally intended for training the parameters of models. Therefore another motivation of the investigation is to formulate the direct
and inverse modeling problems as optimization problems and then to introduce
bio-inspired techniques suitably to effectively optimize the cost function of the
models. In conventional identification and equalization problems, the mean
square error at the output is considered as the cost function to be minimized by
using bio-inspired techniques.
In many practical situations the available training signal is highly corrupted by outliers, whose density may be as high as 50%. Under such constraints the training of the models gets severely affected if the squared error is used as the cost function for minimization. This is because this conventional cost function is not robust against outliers [1.24]. In statistics a few cost functions have been defined which
are robust in nature and are not affected by outliers. These are the Wilcoxon norm, $\sigma\left(1 - \exp\left(-e^2/2\sigma\right)\right)$ and $\log\left(1 + e^2/2\right)$, where $\sigma$ is a parameter to be adjusted during training and $e^2$ is the mean square error.
In this thesis, robust identification, equalization and time series prediction schemes based on minimization of these robust norms using bio-inspired techniques have been proposed. This is a novel contribution of the thesis.
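For concreteness, the sketch below (a minimal Python/NumPy illustration; the function names and the convention of averaging per-sample costs over a batch are assumptions of this sketch, not details taken from the thesis) contrasts the two analytic robust norms above with the conventional squared error cost. The rank-based Wilcoxon norm is omitted here.

```python
import numpy as np

def mse_cf(e):
    # Conventional squared-error cost; grows without bound, so a few
    # large outliers can dominate the training signal.
    return np.mean(e ** 2)

def exp_cf(e, sigma=1.0):
    # Robust norm sigma * (1 - exp(-e^2 / (2*sigma))): saturates at
    # sigma for large |e|, limiting the influence of outliers.
    return np.mean(sigma * (1.0 - np.exp(-e ** 2 / (2.0 * sigma))))

def log_cf(e):
    # Robust norm log(1 + e^2 / 2): grows only logarithmically
    # with the error magnitude.
    return np.mean(np.log(1.0 + e ** 2 / 2.0))

# A batch of errors with one outlier: the MSE is dominated by it,
# while the robust costs change only modestly.
e = np.array([0.1, -0.2, 0.05, 8.0])
print(mse_cf(e), exp_cf(e), log_cf(e))
```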
In recent years distributed signal processing has played an important role in sensor networks, in which individual nodes collect local information but the objective is to compute a global solution. Some representative problems are parameter estimation using locally measured data and nonlinear identification using the local data in a cooperative manner. Attempts have been made to solve this interesting problem by using an approach based on the newly introduced distributed PSO. In this work two distributed versions, incremental and diffusion type PSO techniques, have been proposed and then used for robust identification of linear and nonlinear plants.
1.2 Motivation
In summary, the main motivations of the research work carried out in the present thesis are the following:
(i) To formulate the direct and inverse modeling problems as error-square optimization problems.
(ii) To introduce bio-inspired optimization tools such as PSO and BFO and their variants to efficiently minimize the squared error cost function of the models; in other words, to develop alternate identification schemes.
(iii) To achieve improved identification (direct modeling) of complex nonlinear all-zero, pole-zero, Hammerstein and MIMO plants and channel equalization (inverse modeling) of nonlinear noisy digital channels by introducing new and improved identification algorithms.
(iv) To devise new bio-inspired training strategies for robust identification of complex plants and robust equalization of complex channels.
(v) To suggest distributed incremental and diffusion type PSO algorithms and use them for identification of linear and nonlinear plants using the local data of each sensor node.
(vi) To introduce distributed robust algorithms for identification of nonlinear plants.
1.3 Major contribution of the thesis
The following novel contributions have been made in the thesis:

A low complexity functional link artificial neural network based nonlinear dynamic system identifier has been developed and its learning algorithm has been derived. Improved identification performance has been demonstrated through a simulation study.

A comprehensive learning particle swarm optimization technique has been used to effectively identify IIR plants. Further, an extensive simulation study has been made on the use of the proposed method to identify higher order plants with lower order models. The new approach has been shown to overcome the local minima problem in multimodal situations.

The bacterial foraging optimization and particle swarm optimization techniques have been used as learning tools in developing new models for identification of dynamic systems. Robust identification and prediction tasks have also been carried out using PSO. Similarly, a new approach to develop robust inverse models in the presence of outliers has been successfully implemented using BFO.

Identification of complex Hammerstein plants using two new PSO algorithms, clonal PSO and immunized PSO, has been proposed. It is shown that the immunized PSO model outperforms its counterpart on all counts.

Distributed incremental and diffusion type PSO algorithms have been suggested for identification of nonlinear plants with outliers in the training signal under a sensor network framework. The results are observed to be superior to those of conventional methods of identification.
1.4 Chapter wise contribution
The research work undertaken is embodied in 10 Chapters.
1. Introduction
2. Selected adaptive architectures and bio-inspired techniques, principles and
algorithms
3. Development of a new cascaded functional link artificial neural network (CFLANN) for nonlinear dynamic system identification
4. Identification of IIR plants using comprehensive learning particle swarm
optimization
5. Dynamic systems identification using PSO and BFO based learning
algorithms.
6. Robust identification and prediction using particle swarm optimization
technique
7. Robust adaptive inverse modeling using bacterial foraging optimization
technique and applications
8. Identification of Hammerstein plants using Clonal PSO and Immunized
PSO algorithms
9. Development of distributed PSO algorithms for robust nonlinear system
identification
10. Conclusion and scope for further work
Out of the 10 chapters, the research contribution is contained in Chapters 3 to 9. A brief outline of the chapter wise contributions is presented in the sequel. Chapter 1 outlines the introduction to the problem, the motivation of the research work and a condensed version of the chapter wise contributions made in the thesis.
Finally Chapter 10 deals with the overall conclusion of the investigation and
scope for further research work.
A brief outline of each of the linear and nonlinear networks used for
identification and equalization purpose is presented in Chapter 2. This
includes all-zero and pole-zero adaptive filters under linear category and MLP,
FLANN, CNN under nonlinear category. This Chapter also reviews the
existing derivative based algorithms such as the LMS, recursive LMS (RLMS),
FLANN, BP and evolutionary computing optimization algorithms like the
GA, PSO, BFO and AIS. Various combinations of the structure and the
learning algorithms are suitably used to obtain novel adaptive models for
identification and equalization of complex plants and channels respectively.
In Chapter 3, a new cascaded FLANN structure is proposed and the
corresponding learning algorithm is derived and used to identify nonlinear
dynamic plants. Four different identification models have been suggested.
Results of identification through simulation demonstrate that the new models
outperform the conventional MLP and FLANN based models in terms of
computational load and accuracy.
Identification of IIR plants or pole-zero systems finds extensive applications
in echo cancellation, channel estimation, process control, array processing and
speech recognition. In Chapter 4 a new adaptive IIR algorithm using
comprehensive learning particle swarm optimization (CLPSO) is proposed
which avoids the potential local minima problem and provides accurate estimates of the pole-zero coefficients of IIR plants. A simulation study of identification of
some benchmark IIR plants reveals that the proposed method outperforms
the existing recursive LMS (RLMS), GA and PSO based methods in terms of
mean square error (MSE), execution time and the product of population size and
number of input samples used during training.
Chapter 5 deals with the development of PSO and BFO based schemes to
identify nonlinear dynamic single-input single-output (SISO) and multiple-input multiple-output (MIMO) systems. The BFO and PSO based training of
the weights of the FLANN identification model have been newly introduced.
In both cases it is observed that the new training schemes work better in complex identification tasks in terms of speed of computation, accuracy and number of input samples used for training. However, both the new
schemes offer almost similar identification performance.
The problem of robust identification and prediction is studied in depth in
Chapter 6. Development of such an identification scheme is required when either the training or input signal samples are contaminated with outliers. The existing squared error cost function based learning schemes fail to offer satisfactory identification performance in such cases. Therefore the use of new robust norms which are insensitive to outliers has been suggested for the cost function. Such cost functions are minimized by the PSO method to develop robust identification models of nonlinear dynamic systems and prediction models of complex time series. Simulation results show that the Wilcoxon norm produces the best robust models for identification and prediction compared to those produced by the other norms used in the investigation.
Inverse modeling plays an important role in channel equalization, sensor
linearization and deconvolution operation in geophysics applications. In
practice it is difficult to develop an inverse model using squared error norm
when the training signal contains outliers. Therefore investigation has been
made in Chapter 7 to identify robust norms of errors and use them to
develop robust inverse models. To obtain such models the robust norms are
minimized using BFO scheme. The robustness of the new inverse models is
evaluated through simulation study using some benchmark channels and
different percentages of outliers in the training signal. The results indicate that the use of the squared error norm provides the least robust inverse models whereas the Wilcoxon norm generates the most robust models.
The Hammerstein plant contains all the features of complex plants: a static nonlinear part, a dynamic linear system and an additive coloured noise. Identification of such complex plants poses real difficulty. In Chapter 8, an attempt has been made to identify such plants using two new PSO algorithms: clonal PSO (CPSO) and immunized PSO (IPSO). These two new variants of the PSO algorithm have been proposed and then used to train the model parameters. The potentiality of the proposed method is
demonstrated through simulation of benchmark Hammerstein plants. The
quality of identification is evaluated by verifying three features: the response matching at the output of the static nonlinear part, comparison of
estimated parameters of linear dynamic part with the corresponding true
values and comparison of sum of squared errors (SSE) between true and
overall estimated responses. Comparing all these test results, it is observed
that the IPSO model outperforms its counterpart on all counts.
In applications like sensor networks, linear and nonlinear identification is required using the local data of each sensor. To achieve this objective a distributed signal processing algorithm for training is required. In Chapter 9 two such distributed algorithms, known as incremental and diffusion PSO (INPSO and DPSO), have been proposed to identify linear and nonlinear plants.
Further, the squared and Wilcoxon norms of the errors are minimized in a cooperative manner using the distributed PSO algorithms. Simulation results demonstrate that both distributed algorithms provide excellent identification performance when the conventional squared error norm is used. When outliers are present in the training samples, the Wilcoxon norm based distributed algorithms provide superior performance compared to the conventional norm based training.
The overall conclusion of the total investigation is listed in Chapter 10. This
Chapter also contains the details of further research work that can be carried
out in the same or the related field.
References
[1.1] K. S. Narendra and K. Parthasarathy, “Identification and control of dynamical systems
using neural networks”, IEEE Trans. on Neural Networks, vol. 1, pp. 4-26, January 1990.
[1.2] J. C. Patra, A. C. Kot and G. Panda, “An intelligent pressure sensor using neural
networks”, IEEE Trans. on Instrumentation and Measurement, vol. 49, issue 4, pp. 829-834, Aug. 2000.
[1.3] M. Pachter and O. R. Reynolds, “Identification of a discrete time dynamical system”,
IEEE Trans. Aerospace Electronic System, vol. 36, issue 1, pp. 212-225, 2000.
[1.4] G. B. Giannakis and E. Serpedin, “A bibliography on nonlinear system identification”,
Signal Processing, vol. 83, no. 3, pp. 533-580, 2001.
[1.5] E. A. Robinson and T. Durrani, Geophysical Signal Processing, Prentice-Hall, Englewood
Cliffs, NJ, 1986.
[1.6] D. P. Das and G. Panda, “Active mitigation of nonlinear noise processes using a novel
filtered-s LMS algorithm”, IEEE Trans. on Speech and Audio Processing, vol. 12, issue 3,
pp. 313-322, May 2004.
[1.7] B. Widrow and S.D. Sterns, “Adaptive Signal Processing” Prentice-Hall, Inc. Engle-wood
Cliffs, New Jersey, 1985.
[1.8] G. J. Gibson, S. Siu and C. F. N. Cowan, “The application of nonlinear structures to
the reconstruction of binary signals”, IEEE Trans. signal processing, vol. 39, no. 8, pp.
1877-1884, Aug. 1991.
[1.9] R. W. Lucky, Techniques for adaptive equalization of digital communication systems,
Bell Sys.Tech. J., 45, 255-286, Feb. 1966.
[1.10] H. Sun, G. Mathew and B. Farhang-Boroujeny, “Detection techniques for high
density magnetic recording”, IEEE Trans. on Magnetics, vol. 41, no. 3, pp. 1193-1199,
March 2005.
[1.11] L. J. Griffiths, F. R. Smolka and L. D. Trembly, “Adaptive deconvolution: a new technique for processing time varying seismic data”, Geophysics, June 1977.
[1.12] B. Widrow, J. M. McCool, M. G. Larimore and C. R. Johnson, Jr., “Stationary and nonstationary learning characteristics of the LMS adaptive filter”, Proc. IEEE, vol. 64, no. 8, pp. 1151-1162, Aug. 1976.
[1.13] B. Friedlander and M. Morf, “Least-squares algorithms for adaptive linear phase
filtering”, IEEE Trans., vol. ASSP-30, no. 3, pp. 381-390, June 1982.
[1.14] S. A. White, “An adaptive recursive digital filter”, Proc. 9th Asilomar Conf. Circuits
Syst. Comput., p. 21, Nov. 1975.
[1.15] John J. Shynk, “Adaptive IIR filtering”, IEEE ASSP Magazine, April 1989, pp. 4-21.
[1.16] A. E. Eiben and J. E. Smith, “Introduction to Evolutionary Computing”, Springer, 2003,
ISBN 3-540-40184-9.
[1.17] Andries Engelbrecht, “Computational Intelligence: An Introduction”, Wiley & Sons, ISBN 0-470-84870-7.
[1.18] D. E. Goldberg, “Genetic Algorithms in Search, Optimization and Machine Learning”, Addison-Wesley, 1989.
[1.19] J. Kennedy, R. C. Eberhart and Y. Shi, “Swarm intelligence”, San Francisco:
Morgan Kaufmann Publishers, 2001.
[1.20] K. M. Passino, “Biomimicry of Bacterial Foraging for distributed optimization and
control”, IEEE control system magazine, vol 22, issue 3, pp. 52-67, June 2002.
[1.21] D. Dasgupta, Artificial Immune Systems and their Applications, Springer-Verlag,
1999.
[1.22] Y.H. Pao, Adaptive Pattern Recognition and Neural Networks, Addison Wesley, Reading,
Massachusetts, 1989.
[1.23] S. Haykin, “Neural Networks: A comprehensive foundation” 2nd Edition, Pearson
Education Asia, 2002.
[1.24] Jer-Guang Hsieh, Yih-Lon Lin and Jyh-Horng Jeng, “Preliminary study on Wilcoxon
learning machines”, IEEE Trans. on neural networks, vol. 19, no. 2, pp. 201-211, Feb.
2008.
Chapter 2

Selected Adaptive Architectures and Bio-Inspired Techniques, Principles and Algorithms
2.1 Introduction
The main motive of the research work carried out in this thesis is to develop
elegant and efficient adaptive identification schemes for complex nonlinear
and dynamic plants, adaptive inverse models of nonlinear plants, equalization
of complex channels and prediction of nonlinear time series. All these adaptive
models inherently need suitable adaptive structures and appropriate learning rules to
train the parameters of these models. In the present investigation, some selected adaptive architectures such as the adaptive linear combiner, adaptive pole-zero filters, the functional link artificial neural network and the multilayer artificial neural network are briefly outlined. In addition, some recently developed population based bio-inspired derivative free techniques such as particle swarm optimization and its
variants, bacterial foraging optimization and its variants and artificial immune system
and its variants for training the parameters or coefficients of the adaptive structures.
An adaptive linear combiner or filter is feed-forward in structure. These are often realized [2.1, 2.2] either as a set of program instructions running on an arithmetic processing device such as a microprocessor or DSP chip, or as a set of logic operations implemented in a field-programmable gate array (FPGA) or in a semi-custom or custom very large scale integrated (VLSI) circuit. An adaptive linear combiner is characterized by
1. the input signal samples employed,
2. the structure that defines how the output signal of the combiner is
computed from its input samples,
3. the parameters of the structure which are iteratively changed based on some
learning rule and
4. the adaptive algorithms that guide how the parameters are to be adjusted
iteratively until the predefined objective is fulfilled.
By choosing a particular adaptive filter structure, one specifies the number and type of parameters that need adjustment. The adaptive algorithms used to update these parameters tend to minimize the cost function of the model. The cost function is normally the mean squared error of the model, which is not robust to outliers in the training or desired signal samples. The thesis also introduces the minimization of some robust cost functions of the error for updating the model parameters. Essentially the development of adaptive models for identification, equalization, prediction and function approximation is viewed as a minimization problem over some suitable cost function. The main contribution of the thesis is to solve these complex optimization problems using bio-inspired learning rules.
In the following section, the general adaptive filtering problem is presented and the mathematical notation for representing the form and operation of the adaptive filter is introduced.
2.2 The adaptive filtering problem
Figure 2.1 shows a block diagram of an adaptive FIR filter or an adaptive linear combiner, in which a sample from a digital input signal $x_k$ is fed into an adaptive filter that computes a corresponding output signal sample $y_k$ at time $k$. The output signal is compared to a second signal $d_k$, called the desired signal.
Fig. 2.1 The general adaptive filtering problem
The difference signal given by

$$e_k = d_k - y_k \qquad (2.1)$$
is known as the error signal. The error signal is used to adapt the parameters of the filter from time $k$ to time $(k+1)$ in a well-defined manner. This process of adaptation is represented by an oblique arrow. As the time index $k$ is incremented, the output of the adaptive filter matches the desired signal better and better, with the adaptation process driving the magnitude of $e_k$ down over time. In the adaptive filtering framework, adaptation refers to the mechanism by which the parameters of the system are changed from time index $k$ to time index $(k+1)$. The number and types of parameters within this system depend on the computational structure chosen for the system. Different filter structures that have been used for model development are presented below.
2.2.1 Adaptive FIR filter

The general architecture of an FIR adaptive filter or adaptive linear combiner [2.1, 2.2] is depicted in Fig. 2.2, in which the input pattern is propagated through a chain of unit delay elements. Let $X_n = [x(n)\; x(n-1)\; \ldots\; x(n-M+1)]^T$ be the $M$-by-1 tap input vector, where $M-1$ is the number of delay elements. The tap weights $W_n = [w_0(n)\; w_1(n)\; \ldots\; w_{M-1}(n)]^T$ form the elements of the $M$-by-1 tap weight vector. The output is represented as
$$y(n) = \sum_{m=0}^{M-1} w_m(n)\, x(n-m) \qquad (2.2)$$

The output can be represented in vector notation as

$$y(n) = X_n^T W_n = W_n^T X_n \qquad (2.3)$$
Fig. 2.2 Adaptive filter using bio-inspired/derivative based algorithms
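To make (2.1)-(2.3) concrete, the following sketch (Python/NumPy; the plant coefficients, step size and signals are hypothetical) computes the combiner output $y(n) = W_n^T X_n$ of (2.3) and adapts the weights with an LMS-type update of the kind reviewed in Section 2.4.1.1:

```python
import numpy as np

rng = np.random.default_rng(0)
M, mu, T = 4, 0.05, 200          # taps, LMS step size, number of samples
x = rng.standard_normal(T)       # input signal x(n)
d = np.convolve(x, [0.5, -0.3, 0.2, 0.1])[:T]  # desired signal from an assumed plant

w = np.zeros(M)                  # tap weight vector W_n
for n in range(M - 1, T):
    x_n = x[n - M + 1:n + 1][::-1]   # tap vector [x(n), ..., x(n-M+1)]
    y = w @ x_n                      # combiner output, (2.2)/(2.3)
    e = d[n] - y                     # error signal, (2.1)
    w += mu * e * x_n                # LMS weight update
print(w)                             # approaches the assumed plant coefficients
```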
2.2.2 Adaptive IIR filter

The structure of a direct-form adaptive IIR filter [2.1] is shown in Fig. 2.3. In this case, the output of the system is given by
$$y(n) = \sum_{m=1}^{N-1} a_m(n)\, y(n-m) + \sum_{m=0}^{M-1} b_m(n)\, x(n-m) \qquad (2.4)$$
The terms $a_m(n)$ and $b_m(n)$ represent the feedback and feedforward coefficients of the filter, respectively. In matrix form, $y(n)$ may be written as

$$y(n) = W_n^T S_n \qquad (2.5)$$

where the combined weight vector is

$$W_n = [b_0(n)\; b_1(n)\; \ldots\; b_{M-1}(n)\; a_1(n)\; a_2(n)\; \ldots\; a_{N-1}(n)]^T \qquad (2.6)$$

and the combined input and output signal vector is

$$S_n = [x(n)\; x(n-1)\; \ldots\; x(n-M+1)\; y(n-1)\; y(n-2)\; \ldots\; y(n-N+1)]^T \qquad (2.7)$$

Fig. 2.3 Structure of an adaptive IIR filter
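A one-sample evaluation of the adaptive IIR filter output can be written directly from (2.4), or equivalently as the inner product $W_n^T S_n$ of (2.5). The sketch below (Python; the helper name and example values are illustrative assumptions) shows both forms agree:

```python
import numpy as np

def iir_output(b, a, x_hist, y_hist):
    # b: feedforward coefficients [b_0(n), ..., b_{M-1}(n)]
    # a: feedback coefficients    [a_1(n), ..., a_{N-1}(n)]
    # x_hist: [x(n), x(n-1), ..., x(n-M+1)]
    # y_hist: [y(n-1), y(n-2), ..., y(n-N+1)]
    # Implements y(n) of (2.4): feedforward part plus feedback part.
    return b @ x_hist + a @ y_hist

# Equivalent form of (2.5): y(n) = W_n^T S_n with stacked vectors.
b = np.array([0.5, 0.2]); a = np.array([0.3])
x_hist = np.array([1.0, -0.5]); y_hist = np.array([0.25])
W = np.concatenate([b, a]); S = np.concatenate([x_hist, y_hist])
assert np.isclose(iir_output(b, a, x_hist, y_hist), W @ S)
```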
The weight update operation of the adaptive IIR filter is carried out using either conventional derivative based or derivative free learning algorithms. In addition to the linear structures, nonlinear structures can be used, for which the principle of superposition does not hold even when the parameter values are fixed. Such systems are useful when the relationship between $d(n)$ and $x(n)$ is not linear in nature. This class of nonlinear structures consists of the artificial neural network (ANN), the functional link artificial neural network (FLANN) and the radial basis function (RBF) network. These networks inherently contain distributed nonlinear elements in each path, such as the sigmoid function in the ANN, sine/cosine terms in the FLANN and the Gaussian function in the RBF network. The next section deals with the details of these nonlinear structures.
2.3 Artificial neural network (ANN)
The artificial neural network (ANN) takes its name from the network of nerve cells in the brain. In recent years, the ANN has proved to be an important technique for classification and optimization problems [2.3-2.5]. McCulloch and Pitts developed early neural network models for computing machines. There are extensive applications of various types of ANN in the fields of communication, control, instrumentation and forecasting. The ANN is capable of performing nonlinear mapping between the input and output space due to its large parallel interconnections between different layers and its nonlinear processing characteristics. An artificial neuron basically consists of a computing element that performs the weighted sum of the input signals and the connecting weights. The sum is added to the bias or threshold and the resultant signal is then passed through a nonlinear function of sigmoid or hyperbolic tangent type. Each neuron is associated with three adjustable parameters: the connecting weights, the bias and the slope of the nonlinear function. From the structural point of view, a neural network (NN) may be single layer or multilayer. In a multilayer structure, there are one or more artificial neurons in each layer, and in practical cases there may be a number of layers. Each neuron of one layer is connected to each neuron of the next layer. The functional link ANN is another type of single layer NN. In this type of network the input data are passed through a functional expansion block where they are nonlinearly mapped to a larger number of points. This is achieved by using trigonometric functions, tensor products or power terms of the input. The output of the functional expansion is then passed through a single neuron.
Two types of NN used in this thesis are discussed next.
2.3.1 Single neuron structure

Fig. 2.4 Structure of a single neuron
The basic structure of an artificial neuron is presented in Fig. 2.4. The operation in a neuron involves the computation of the weighted sum of the inputs and a threshold [2.3-2.5]. The resultant signal is then passed through a nonlinear activation function. This is also called a perceptron, which is built around a nonlinear neuron. The output of the neuron may be represented as

$$y(n) = \varphi\left[\sum_{j=1}^{N} w_j(n)\, x_j(n) + \alpha(n)\right] \qquad (2.8)$$

where $\alpha(n)$ is the threshold to the neurons at the first layer, $w_j(n)$ is the weight associated with the $j$th input, $N$ is the number of inputs to the neuron and $\varphi(\cdot)$ is the nonlinear activation function. Different types of nonlinear functions are shown in Fig. 2.5.
Fig. 2.5 Different types of nonlinear activation function: (a) signum function or hard limiter, (b) threshold function, (c) sigmoid function, (d) piecewise linear function
Signum function: For this type of activation function, we have

$$\varphi(v) = \begin{cases} 1 & \text{if } v > 0 \\ 0 & \text{if } v = 0 \\ -1 & \text{if } v < 0 \end{cases} \qquad (2.9)$$

Threshold function: This function is represented as

$$\varphi(v) = \begin{cases} 1 & \text{if } v \ge 0 \\ 0 & \text{if } v < 0 \end{cases} \qquad (2.10)$$
Sigmoid function: This function is S-shaped and is the most common form of activation function used in artificial neural networks. It exhibits a graceful balance between linear and nonlinear behaviour:

$$\varphi(v) = \frac{1}{1 + e^{-av}} \qquad (2.11)$$

where $v$ is the input to the sigmoid function and $a$ is the slope of the sigmoid function. For steady convergence a proper choice of $a$ is required.

Piecewise linear function: This function is

$$\varphi(v) = \begin{cases} 1 & v \ge +\frac{1}{2} \\ v & +\frac{1}{2} > v > -\frac{1}{2} \\ 0 & v \le -\frac{1}{2} \end{cases} \qquad (2.12)$$

where the amplification factor inside the linear region of operation is assumed to be unity. Out of these nonlinear functions, the sigmoid activation function is the most extensively used in the ANN.
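The four activation functions of (2.9)-(2.12) translate directly into code; the following sketch (Python/NumPy; the vectorized formulation is a choice of this sketch) implements them:

```python
import numpy as np

def signum(v):
    # Hard limiter of (2.9): returns +1, 0 or -1.
    return np.sign(v)

def threshold(v):
    # Threshold function of (2.10).
    return np.where(v >= 0, 1.0, 0.0)

def sigmoid(v, a=1.0):
    # Sigmoid of (2.11) with slope parameter a.
    return 1.0 / (1.0 + np.exp(-a * v))

def piecewise_linear(v):
    # Piecewise linear function of (2.12): unity gain in (-1/2, 1/2),
    # saturating at 0 and 1 outside that region.
    return np.where(v >= 0.5, 1.0, np.where(v <= -0.5, 0.0, v))
```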
2.3.2 Multilayer perceptron (MLP)

In the multilayer neural network or multilayer perceptron (MLP), the input signal propagates through the network in a forward direction, on a layer-by-layer basis. This network has been applied successfully to solve some difficult and diverse problems by training it in a supervised manner with the highly popular error back-propagation algorithm [2.3, 2.4]. The scheme of an MLP with four layers is shown in Fig. 2.6. $x_i(n)$ represents the input to the network, $f_j$ and $f_k$ represent the outputs of the two hidden layers and $y_l(n)$ represents the output of the final layer of the neural network. The connecting weights between the input and the first hidden layer, the first and second hidden layers, and the second hidden layer and the output layer are represented by $w_{ij}$, $w_{jk}$ and $w_{kl}$ respectively.
Fig. 2.6 MLP structure
If P1 is the number of neurons in the first hidden layer, each element of the output vector
of first hidden layer may be calculated as,
⎡N
⎤
f j = ϕ j ⎢ ∑ wij xi ( n ) + α j ⎥ , i = 1, 2,3,...N , j = 1, 2,3,...P1
⎣ i =1
⎦
(2.13)
where α j is the threshold to the neurons of the first hidden layer, N is the number of
inputs and ϕ (.) is the nonlinear activation function of the neurons of the first hidden layer
which is defined in (2.11). The time index n has been dropped to make the equations
simpler. Let P2 be the number of neurons in the second hidden layer. The output of this
layer is represented as, f k and may be written as
⎡ P1
⎤
f k = ϕ k ⎢ ∑ w jk f j + α k ⎥ , k=1, 2, 3, …, P2
⎣ j =1
⎦
(2.14)
where, α k is the threshold to the neurons of the second hidden layer. The output of the
final output layer can be calculated as
20
S E L E C T E D
⎡ P2
yl ( n ) = ϕl ⎢ ∑ wkl f k + α l
⎣ k =1
A D A P T I V E A R C H I T E C T U R E S A N D B I O - I N S P I R E D
T E C H N I Q U E S , P R I N C I P L E S A N D A L G O R I T H M S
⎤
⎥,
⎦
l=1, 2, 3, … , P3
(2.15)
where, α l is the threshold to the neuron of the final layer and P3 is the number of neurons
in the output layer. The output of the MLP may be expressed as
$$y_l(n) = \varphi_l \left[ \sum_{k=1}^{P_2} w_{kl}\, \varphi_k \left( \sum_{j=1}^{P_1} w_{jk}\, \varphi_j \left\{ \sum_{i=1}^{N} w_{ij}\, x_i(n) + \alpha_j \right\} + \alpha_k \right) + \alpha_l \right] \qquad (2.16)$$
The details of the BP algorithm used to train the weights of the various layers of the ANN are discussed in Section 2.4.1.3.
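The forward computation of (2.13)-(2.15) can be sketched in a few lines of NumPy; the matrix shapes, the function names and the example 2-3-2-1 dimensions are illustrative assumptions.

import numpy as np

def sigmoid(v, a=1.0):
    return 1.0 / (1.0 + np.exp(-a * v))

def mlp_forward(x, W1, a1, W2, a2, W3, a3):
    # Forward pass of the four-layer MLP of Fig. 2.6.
    # W1: N x P1 weights w_ij with thresholds a1 (alpha_j), and similarly
    # W2 (w_jk, alpha_k) and W3 (w_kl, alpha_l).
    f_j = sigmoid(x @ W1 + a1)    # first hidden layer, (2.13)
    f_k = sigmoid(f_j @ W2 + a2)  # second hidden layer, (2.14)
    y_l = sigmoid(f_k @ W3 + a3)  # output layer, (2.15)
    return y_l

# Example: the 2-3-2-1 network used later to illustrate the BP algorithm.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=2)
W1, a1 = rng.normal(scale=0.1, size=(2, 3)), np.zeros(3)
W2, a2 = rng.normal(scale=0.1, size=(3, 2)), np.zeros(2)
W3, a3 = rng.normal(scale=0.1, size=(2, 1)), np.zeros(1)
print(mlp_forward(x, W1, a1, W2, a2, W3, a3))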
2.3.3 Functional link artificial neural network (FLANN)

Pao originally proposed the FLANN, a novel single-layer ANN structure capable of forming arbitrarily complex decision regions by generating nonlinear decision boundaries [2.6, 2.7]. In this structure, the initial representation of a pattern is enhanced by using nonlinear functions, and thus the dimension of the pattern space is increased. The functional link acts on an element of a pattern, or on the entire pattern itself, by generating a set of linearly independent functions and then evaluating these functions with the pattern as the argument. Hence separation of the patterns becomes possible in the enhanced space. The use of the FLANN not only increases the learning rate but also involves less computational complexity [2.9]. Pao et al. [2.8] have investigated the learning and generalization characteristics of a random vector FLANN and compared them with those attainable with an MLP structure trained with the back-propagation algorithm on a few function approximation problems. A FLANN structure with two inputs is shown in Fig. 2.7.
Let $X$ be the input vector of size $N \times 1$ containing $N$ elements; the $n$th element is given by

$$X(n) = x_n, \quad 1 \le n \le N \qquad (2.17)$$

Each element undergoes nonlinear expansion to form $M$ elements such that the resultant matrix has the dimension $N \times M$.
The functional expansion of the element $x_n$ by power series expansion is carried out using the equation given in (2.18)

$$s_i = \begin{cases} x_n & \text{for } i = 1 \\ x_n^{\,l} & \text{for } i = 2, 3, 4, \dots, M \end{cases} \qquad (2.18)$$

where $l = 1, 2, \dots, M$.
For trigonometric expansion, the expanded elements are

$$s_i = \begin{cases} x_n & \text{for } i = 1 \\ \sin(l\pi x_n) & \text{for } i = 2, 4, \dots, M \\ \cos(l\pi x_n) & \text{for } i = 3, 5, \dots, M+1 \end{cases} \qquad (2.19)$$

where $l = 1, 2, \dots, M/2$.
For Chebyshev expansion the terms are given by

$$\begin{aligned} T_0(x_n) &= 1 && \text{for } n = 0 \\ T_1(x_n) &= x_n && \text{for } n = 1 \\ T_2(x_n) &= 2x_n^2 - 1 && \text{for } n = 2 \\ T_{n+1}(x_n) &= 2x_n T_n(x_n) - T_{n-1}(x_n) && \text{for } n > 2 \end{aligned} \qquad (2.20)$$
In matrix notation the expanded elements of the input vector are denoted by $S$ of size $N \times (M+1)$. The bias input is unity, so an extra unity value is padded to the $S$ matrix and its dimension becomes $N \times Q$, where $Q = M + 2$. Let the weight vector be represented as $W$ having $Q$ elements. The output $y$ is given as
$$y = \sum_{i=1}^{Q} s_i w_i \qquad (2.21)$$
In matrix notation the output is written as
Y = S ⋅ WT
(2.22)
Fig. 2.7 Structure of the FLANN model (functional expansion block $S$ followed by an adaptive linear combiner with weights $W$; the error $e(k) = d(k) - y(k)$ drives the adaptive algorithm)
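The following is a minimal sketch of the trigonometric expansion (2.19) followed by the linear combination of (2.21)-(2.22); the handling of the unity bias element follows one reading of the $N \times Q$ bookkeeping above and is an assumption, as are the function names.

import numpy as np

def trig_expand(x, M):
    # Trigonometric expansion of one element x following (2.19):
    # [x, sin(pi x), cos(pi x), ..., sin((M/2) pi x), cos((M/2) pi x)].
    terms = [x]
    for l in range(1, M // 2 + 1):
        terms.append(np.sin(l * np.pi * x))
        terms.append(np.cos(l * np.pi * x))
    return np.array(terms)

def flann_output(X, W, M):
    # Expand every element of X, append the unity bias input and
    # form the linear combination of (2.21)/(2.22).
    s = np.concatenate([trig_expand(x, M) for x in X] + [np.ones(1)])
    return s @ W, s

# Example: two inputs with M = 4 give 2*(4+1) + 1 = 11 expanded terms.
X = np.array([0.3, -0.7])
W = np.zeros(11)
y, s = flann_output(X, W, M=4)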
2.4 Learning algorithms

There are many learning algorithms which are employed to train various adaptive models. The performance of these models depends on the rate of convergence, the training time, the computational complexity involved and the minimum mean square error achieved after training. The learning algorithms may be broadly classified into two categories: (a) derivative based and (b) derivative free. The derivative based algorithms include the least mean square (LMS), IIR LMS (ILMS), back propagation (BP) and FLANN-LMS algorithms. Under the derivative free algorithms, the genetic algorithm (GA), particle swarm optimization (PSO), bacterial foraging optimization (BFO) and artificial immune system (AIS) have been employed. In this section the details of these algorithms are outlined in sequence.
2.4.1 Derivative based algorithms

These algorithms are gradient-search in nature and have been derived by taking the derivative of the squared error as the cost function. During the process of training these algorithms tend to drive the weights of the model to local minima, which leads to premature convergence of the weights. As
a result the mean square error does not attain the least possible value and hence the accuracy of prediction becomes inferior. However, these learning algorithms are simple to implement and can be expressed in closed-form equations. A brief description of each of them is presented below.
2.4.1.1 LMS algorithm for adaptive FIR filters
In an adaptive FIR filter, at any $k$th instant the error signal $e_k$ is computed as

$$e_k = d_k - y_k \qquad (2.23)$$

where $d_k$ is the desired or training signal at the $k$th instant and $y_k$ is the output of the filter at the $k$th instant.
The weights associated with the filter are then updated using the LMS algorithm [2.1]. The weight update equation for the $n$th instant is given by

$$w_k(n+1) = w_k(n) + \Delta w_k(n) \qquad (2.24)$$

where $\Delta w_k(n)$ is the change of the $k$th weight at the $n$th iteration. The change in weight of each path in each iteration is obtained by minimizing the mean squared error [2.1]. Using this value the weight update equation is given as

$$w_k(n+1) = w_k(n) + 2\eta\, e_k(n)\, X_k^T \qquad (2.25)$$

where $\eta$ is the learning rate parameter $(0 \le \eta \le 1)$. This procedure is repeated till the mean square error (MSE) of the network approaches a minimum value. The MSE at time index $k$ may be defined as $\xi = E[e_k^2]$, where $E[\cdot]$ is the expectation or average of the signal.
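A minimal sketch of (2.23)-(2.25) for identifying an unknown FIR plant follows; the plant coefficients, noise level and learning rate in the example are arbitrary illustrative values.

import numpy as np

def lms_identify(x, d, num_taps, eta=0.05):
    # LMS adaptation of an FIR filter following (2.23)-(2.25).
    w = np.zeros(num_taps)
    e = np.zeros(len(x))
    for n in range(num_taps - 1, len(x)):
        x_n = x[n - num_taps + 1:n + 1][::-1]  # tap-delay input vector X_k
        y_n = w @ x_n                          # filter output y_k
        e[n] = d[n] - y_n                      # error e_k, (2.23)
        w = w + 2.0 * eta * e[n] * x_n         # weight update, (2.25)
    return w, e

# Example: identify a hypothetical 3-tap FIR plant from noisy measurements.
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 5000)
plant = np.array([0.5, 0.25, -0.1])            # illustrative coefficients
d = np.convolve(x, plant)[:len(x)] + 0.01 * rng.normal(size=len(x))
w, e = lms_identify(x, d, num_taps=3)
print(w)  # approaches the plant coefficients as training proceeds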
2.4.1.2 Adaptive IIR LMS (ILMS) algorithm
Referring to Fig. 2.3, the error term is given by

$$e_k = d_k - y_k \qquad (2.26)$$

where $y_k$ is obtained from (2.4) and (2.5). The ILMS update rule is given as

$$W_{k+1} = W_k - M \hat{\nabla}_k \qquad (2.27)$$

where $\hat{\nabla}_k$ is the estimate of the gradient at the $k$th instant and is given by

$$\hat{\nabla}_k = -2e_k\, [\alpha_{0k} \cdots \alpha_{Nk} \;\; \beta_{1k} \cdots \beta_{Nk}]^T \qquad (2.28)$$

$M$ is a diagonal matrix of $N+1$ learning parameters for the zero coefficients and $N$ parameters for the pole coefficients, represented as

$$M = \mathrm{diag}[\mu \cdots \mu \;\; \eta \cdots \eta] \qquad (2.29)$$

The variables $\alpha_{nk}$ and $\beta_{nk}$ are given by

$$\alpha_{nk} = \frac{\partial y_k}{\partial a_n} = x_{k-n} + \sum_{l=1}^{L} b_l\, \alpha_{n,k-l}, \quad 0 \le n \le L \qquad (2.30)$$

$$\beta_{nk} = \frac{\partial y_k}{\partial b_n} = y_{k-n} + \sum_{l=1}^{L} b_l\, \beta_{n,k-l}, \quad 1 \le n \le L \qquad (2.31)$$
Equations (2.26) to (2.31) represent the key equations of ILMS algorithm.
2.4.1.3 Back propagation (BP) algorithm
Fig. 2.8 Neural network using the BP algorithm (a two-input MLP whose output $y_l(n)$ is compared with the desired signal $d(n)$ to form the error $e_l(n)$ that drives back-propagation)
The generalized structure of MLP is shown in Fig. 2.6. To derive the BP algorithm a
simplified neural network with two inputs and 2-3-2-1 neurons (2, 3, 2 and 1 denote the
number of neurons in the input layer, the first hidden layer, the second hidden layer and the
output layer respectively) is depicted in Fig. 2.8. The parameters of the neural network can
be updated in both sequential and batch mode of operation. In BP algorithm, the weights
and the thresholds are initialized as very small random values. The intermediate and the
final outputs of the MLP are calculated by using (2.13), (2.14), and (2.15) respectively.
The final output $y_l(n)$ at the output of neuron $l$ is compared with the desired output $d(n)$ and the resulting error signal $e_l(n)$ is obtained as

$$e_l(n) = d(n) - y_l(n) \qquad (2.32)$$
The instantaneous value of the total error energy is obtained by summing all errors squared
over all neurons in the output layer, that is
$$\xi(n) = \frac{1}{2} \sum_{l=1}^{P_3} e_l^2(n) \qquad (2.33)$$

where $P_3$ is the number of neurons in the output layer.
This error signal is used to update the weights and thresholds of the hidden layers as well as the output layer. The reflected error components at each hidden layer are computed using the errors of the last layer and the connecting weights between the hidden and the last layer, and the error obtained at this stage is used to update the weights between the input and the hidden layer. The thresholds are updated in a similar manner to the corresponding connecting weights. The weights and the thresholds are updated iteratively until the error signal becomes minimum.
The weights are updated according to the following equations

$$w_{kl}(n+1) = w_{kl}(n) + \Delta w_{kl}(n) \qquad (2.34)$$
$$w_{jk}(n+1) = w_{jk}(n) + \Delta w_{jk}(n) \qquad (2.35)$$

$$w_{ij}(n+1) = w_{ij}(n) + \Delta w_{ij}(n) \qquad (2.36)$$
where $\Delta w_{kl}(n)$, $\Delta w_{jk}(n)$ and $\Delta w_{ij}(n)$ are the changes in the weights between the second hidden layer and the output layer, between the first and second hidden layers and between the input layer and the first hidden layer respectively. These can be computed as
$$\Delta w_{kl}(n) = -2\mu \frac{d\xi(n)}{dw_{kl}(n)} = 2\mu\, e(n)\, \frac{dy_l(n)}{dw_{kl}(n)} = 2\mu\, e(n)\, \varphi_l' \left[ \sum_{k=1}^{P_2} w_{kl}\, f_k + \alpha_l \right] f_k \qquad (2.37)$$
where $\mu$ is the convergence coefficient $(0 \le \mu \le 1)$. $\Delta w_{jk}(n)$ and $\Delta w_{ij}(n)$ can be computed similarly.
The thresholds of each layer can be updated in a similar manner using the equations

$$\alpha_l(n+1) = \alpha_l(n) + \Delta\alpha_l(n) \qquad (2.38)$$

$$\alpha_k(n+1) = \alpha_k(n) + \Delta\alpha_k(n) \qquad (2.39)$$

$$\alpha_j(n+1) = \alpha_j(n) + \Delta\alpha_j(n) \qquad (2.40)$$
where $\Delta\alpha_l(n)$, $\Delta\alpha_k(n)$ and $\Delta\alpha_j(n)$ are the changes in the thresholds of the output layer, the second hidden layer and the first hidden layer respectively. The change in threshold is represented as
$$\Delta\alpha_l(n) = -2\mu \frac{d\xi(n)}{d\alpha_l(n)} = 2\mu\, e(n)\, \frac{dy_l(n)}{d\alpha_l(n)} = 2\mu\, e(n)\, \varphi_l' \left[ \sum_{k=1}^{P_2} w_{kl}\, f_k + \alpha_l \right] \qquad (2.41)$$
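The output-layer part of the BP update can be sketched as follows for a single output neuron, combining (2.32), (2.37) and (2.41) with the sigmoid of (2.11) at slope $a = 1$; the function and variable names are illustrative.

import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def output_layer_update(f_k, w_kl, alpha_l, d, mu=0.1):
    # One BP update of the output-layer weights and threshold.
    v = f_k @ w_kl + alpha_l
    y = sigmoid(v)
    e = d - y                      # error, (2.32)
    phi_prime = y * (1.0 - y)      # derivative of the sigmoid at v (a = 1)
    w_kl = w_kl + 2.0 * mu * e * phi_prime * f_k     # Delta w_kl, (2.37)
    alpha_l = alpha_l + 2.0 * mu * e * phi_prime     # Delta alpha_l, (2.41)
    return w_kl, alpha_l

The hidden-layer updates follow the same pattern, with the error reflected back through the connecting weights as described above.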
2.4.1.4 The FLANN algorithm
Referring to the structure of the FLANN in Fig. 2.7, the error signal $e(k)$ at the $k$th iteration can be computed as

$$e(k) = d(k) - y(k) \qquad (2.42)$$
Let $\xi(k)$ denote the cost function at iteration $k$, given by

$$\xi(k) = \frac{1}{2} \sum_{j=1}^{P} e_j^2(k) \qquad (2.43)$$
where P is the number of nodes at the output layer.
The weight vector can be updated by the least mean square (LMS) algorithm, as

$$w(k+1) = w(k) - \frac{\mu}{2}\, \hat{\nabla}(k) \qquad (2.44)$$

where $\hat{\nabla}(k)$ is an instantaneous estimate of the gradient of $\xi$ with respect to the weight vector $w(k)$. This gradient is computed as

$$\hat{\nabla}(k) = \frac{\partial \xi}{\partial w} = -2e(k)\, \frac{\partial y(k)}{\partial w} = -2e(k)\, \frac{\partial [w(k)\, s(k)]}{\partial w} = -2e(k)\, s(k) \qquad (2.45)$$

Substituting the value of $\hat{\nabla}(k)$ in (2.44) we get

$$w(k+1) = w(k) + \mu\, e(k)\, s(k) \qquad (2.46)$$
where μ denotes the step-size ( 0 ≤ μ ≤ 1) , which controls the convergence speed of the
algorithm.
2.5 Derivative free algorithms / evolutionary computing based algorithms
2.5.1 Genetic algorithm (GA)
Genetic algorithms are a class of evolutionary computing techniques, a rapidly growing area of artificial intelligence. Genetic algorithms are inspired by Darwin's theory of evolution. Simply put, problems are solved by an evolutionary process resulting in a best (fittest) solution (survivor); in other words, the solution is evolved.

Evolutionary computing was introduced in the 1960s by Rechenberg in his work "Evolution strategies" (Evolutionsstrategie in the original). His idea was then developed by other researchers. Genetic algorithms (GAs) were invented by John Holland and developed by him and his students and colleagues [2.10]. This led to Holland's book "Adaptation in Natural and Artificial Systems", published in 1975.

The algorithm begins with a set of solutions (represented by chromosomes) called the population. Solutions from one population are taken and used to form a new population, in the hope that the new population will be better than the old one. The solutions that are selected to form new solutions (offspring) are chosen according to their fitness: the more suitable they are, the more chances they have to reproduce. This is repeated until some condition (for example, the number of generations or an improvement of the best solution) is satisfied.
2.5.1.1 Outline of the basic genetic algorithm
1. [Start] Generate a random population of n chromosomes (suitable solutions for the problem)
2. [Fitness] Evaluate the fitness f(x) of each chromosome x in the population
3. [New population] Create a new population by repeating the following steps until the new population is complete
   a. [Selection] Select two parent chromosomes from the population according to their fitness (the better the fitness, the bigger the chance to be selected)
   b. [Crossover] With a crossover probability, cross over the parents to form new offspring (children). If no crossover is performed, the offspring is an exact copy of the parents.
   c. [Mutation] With a mutation probability, mutate the new offspring at each locus (position in the chromosome).
   d. [Accepting] Place the new offspring in the new population
4. [Replace] Use the newly generated population for a further run of the algorithm
5. [Test] If the end condition is satisfied, stop and return the best solution in the current population
6. [Loop] Go to step 2
The outline of the basic GA provided above is very general; there are many parameters and settings that can be implemented differently for various problems. Elitism is often used as a method of selection, which means that at least the best solution of a generation is copied without changes to the new population, so that the best solution can survive to the succeeding generation.
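As a concrete illustration of the outline above, the following Python sketch combines tournament selection, single-point crossover, per-locus bit-flip mutation and elitism; the function names, the fitness interface and the default parameter values are illustrative assumptions, not the exact implementation used in this thesis.

import numpy as np

def genetic_algorithm(fitness, n_bits=16, pop_size=20, p_cross=0.8,
                      p_mut=0.01, generations=100, rng=None):
    # Binary-coded GA; fitness maps a 0/1 array to a value to be maximised.
    rng = rng or np.random.default_rng()
    pop = rng.integers(0, 2, size=(pop_size, n_bits))
    for _ in range(generations):
        fit = np.array([fitness(c) for c in pop])
        new_pop = [pop[np.argmax(fit)].copy()]        # elitism
        while len(new_pop) < pop_size:
            # tournament selection of two parents
            i, j = rng.integers(0, pop_size, 2)
            p1 = pop[i] if fit[i] >= fit[j] else pop[j]
            i, j = rng.integers(0, pop_size, 2)
            p2 = pop[i] if fit[i] >= fit[j] else pop[j]
            # single-point crossover with probability p_cross
            if rng.random() < p_cross:
                cut = rng.integers(1, n_bits)
                child = np.concatenate([p1[:cut], p2[cut:]])
            else:
                child = p1.copy()
            # bit-flip mutation at each locus with probability p_mut
            flip = rng.random(n_bits) < p_mut
            new_pop.append(np.where(flip, 1 - child, child))
        pop = np.array(new_pop)
    fit = np.array([fitness(c) for c in pop])
    return pop[np.argmax(fit)], fit.max()

# Example: evolve a chromosome of all ones (a minimisation problem can be
# handled by maximising the negated cost).
best, val = genetic_algorithm(lambda c: c.sum())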
2.5.1.2 Operators of GA
Overview
The crossover and mutation are the most important parts of the genetic algorithm. The
performance is influenced mainly by these two operators.
Encoding of a Chromosome
A chromosome should in some way contain information about the solution it represents. The most commonly used encoding is a binary string. A chromosome could then look like this:
Chromosome 1 1101100100110110
Chromosome 2 1101111000011110
Fig. 2.9 Chromosome
Each chromosome is represented by a binary string. Each bit in the string can represent some characteristic of the solution. There are many other ways of encoding, depending mainly on the problem to be solved. For example, one can directly encode integer or real numbers; sometimes it is useful to encode permutations, and so on.
Crossover

Crossover operates on selected genes from the parent chromosomes and creates new offspring. The simplest way to do this is to choose a crossover point at random, copy everything before this point from the first parent and then copy everything after the crossover point from the other parent.
Crossover is illustrated in the following Fig. 2.10 ( | is the crossover point)
Chromosome 1   11011 | 00100110110
Chromosome 2   11011 | 11000011110
Offspring 1    11011 | 11000011110
Offspring 2    11011 | 00100110110

Fig. 2.10 Crossover
There are other ways to perform crossover; for example, more than one crossover point can be chosen.
Mutation

Mutation is intended to prevent all solutions in the population from falling into a local optimum of the solved problem. The mutation operation randomly changes the offspring resulting from crossover. In the case of binary encoding we can switch a few randomly chosen bits from 1 to 0 or from 0 to 1. Mutation can then be illustrated as follows (Fig. 2.11)
Original offspring 1 1101111000011110
Original offspring 2 1101100100110110
Mutated offspring 1 1100111000011110
Mutated offspring 2 1101101100110110
Fig. 2.11 Mutation
The technique of mutation (as well as crossover) depends mainly on the encoding of the chromosomes. For example, when encoding permutations, mutation could be performed as an exchange of two genes.
2.5.1.3 Parameters of GA
Crossover and Mutation Probability
There are two basic parameters of GA - crossover probability and mutation probability.
Crossover probability: This indicates how often crossover will be performed. If there is no crossover, offspring are exact copies of the parents. If there is crossover, offspring are made from parts of both parents' chromosomes. If the crossover probability is 100%, then all offspring are made by crossover. If it is 0%, the whole new generation is made from exact copies of chromosomes from the old population (but this does not mean that the new generation is the same!). Crossover is performed in the hope that the new chromosomes will contain the good parts of the old chromosomes and therefore be better. However, it is good to let some part of the old population survive to the next generation.

Mutation probability: This signifies how often parts of a chromosome will be mutated. If there is no mutation, offspring are generated immediately after crossover (or directly copied) without any change. If mutation is performed, one or more parts of a chromosome are changed. If the mutation probability is 100%, the whole chromosome is changed; if it is 0%, nothing is changed. Mutation generally prevents the GA from falling into local extremes.
Mutation should not occur very often, because the GA would then in fact degenerate into random search.
Other Parameters

There are also some other parameters of the GA. Another particularly important parameter is the population size.

Population size: This signifies how many chromosomes are present in the population (in one generation). If there are too few chromosomes, the GA has few possibilities to perform crossover and only a small part of the search space is explored. On the other hand, if there are too many chromosomes, the GA slows down.
2.5.1.4 Selection

Introduction

The chromosomes are selected from the population to be parents for crossover. The problem is how to select these chromosomes. According to Darwin's theory of evolution, the best ones survive to create new offspring. There are many methods of selecting the best chromosomes, for example roulette wheel selection, Boltzmann selection, tournament selection, rank selection and steady state selection. In this thesis we have used tournament selection, as it performs better than the others.
Tournament Selection

A selection strategy in a GA is simply a process that favours the selection of better individuals in the population for the mating pool. There are two important issues in the evolution process of genetic search: population diversity and selective pressure. Population diversity means that the genes from the already discovered good individuals are exploited while promising new areas of the search space continue to be explored. Selective pressure is the degree to which the better individuals are favoured. The tournament selection strategy provides selective pressure by holding a tournament competition among individuals [2.11].
2.5.1.5 GA for Function optimization
To understand the use of GA, minimization of a multimodal function given in (2.47) is
carried out through simulation study.
f(x, y) = 5*exp(-0.1*((x-15)^2+(y-20)^2)) ...
          -2*exp(-0.08*((x-20)^2+(y-15)^2)) ...
          +3*exp(-0.08*((x-25)^2+(y-10)^2)) ...
          +2*exp(-0.1*((x-10)^2+(y-10)^2)) ...
          -2*exp(-0.5*((x-5)^2+(y-10)^2)) ...
          -4*exp(-0.1*((x-15)^2+(y-5)^2)) ...
          -2*exp(-0.5*((x-8)^2+(y-25)^2)) ...
          -2*exp(-0.5*((x-21)^2+(y-25)^2)) ...
          +2*exp(-0.5*((x-25)^2+(y-16)^2)) ...
          +2*exp(-0.5*((x-5)^2+(y-14)^2))
(2.47)
The function has its global minimum at [15, 5]. Single-point crossover was applied and the 20 individuals with the highest fitness values were selected for the next generation. The mutation and crossover rates were taken as 0.25 and 0.8 respectively, the population size was set at 20, and each parameter was represented by eight bits.
Fig. 2.12 Multimodal function of (2.47)
The fitness value settles to the global minimum very quickly. The result given by the GA was [15.1373, 5.0980] with $g_{min} = -3.9757$.
Fig. 2.13 Fitness curve of the function vs iteration
At iteration 1 the result is

Table 2.1 Initial generation C1

        W                 Fitness
  7.1373    0.3922        -0.0009
-16.2353  -16.3922         0.0000
 20.0000   -7.7647        -0.0000
-16.0784   -5.4118         0.0000
  7.9216    3.8431         0.0060
  9.1765  -10.1176        -0.0000
-13.8824   -8.8627         0.0000
 -8.8627    0.5490         0.0000
  1.9608   12.4706         0.0069
-15.4510   16.5490         0.0000
  2.5882  -14.3529        -0.0000
 15.4510  -16.3922        -0.0000
  4.4706  -19.8431        -0.0000
 18.5882  -11.2157        -0.0000
-19.6863   -7.6078         0.0000
-10.5882    7.4510         0.0000
 13.7255   16.8627         1.5280
 15.1373    5.4118        -3.9079
-11.8431  -10.1176         0.0000
 19.3725   12.1569        -0.8527
After 400 iterations, all the W and fitness values become equal, which gives the desired result.
Table 2.2 C1 population after the 400th iteration

        W                 Fitness
 15.1373    5.0980        -3.9757

(all 20 individuals of the population have converged to these same values)
2.5.2 Particle swarm optimization (PSO)

2.5.2.1 Basic method

Natural creatures sometimes behave as a swarm. One of the main streams of artificial life research is to examine how natural creatures behave as a swarm and to reconfigure the swarm models inside a computer. Swarm behavior can be modeled with a few simple rules; schools of fish and flocks of birds can be modeled with such simple models.
Kennedy and Eberhart developed the PSO concept [2.12] through simulation of bird flocking in two-dimensional space. The position of each agent is represented by its XY position, and its velocity is expressed by $v_x$ (the velocity along the X axis) and $v_y$ (the velocity along the Y axis). Modification of the agent position is realized using the position and velocity information. Bird flocking optimizes a certain objective function. Each agent knows its best value so far (pbest) and its XY position. This information is an analogy of the personal experience of each agent. Moreover, each agent knows the best value so far in the group
(gbest) among the pbests. This information is an analogy of the knowledge of how the other agents around them have performed. Namely, each agent tries to modify its position using the following information:
1. the current position $(x, y)$,
2. the current velocity $(v_x, v_y)$,
3. the tendency to go to the center of the swarm,
4. the distance between the current position and pbest,
5. the distance between the current position and gbest.
2.5.2.2 Particle swarm optimization algorithm
This modification can be represented by the concept of velocity. Velocity of each agent can
be modified by the following equation [2.12] :
$$V_i^{k+1} = w V_i^k + c_1 \cdot rand_1 \cdot (pbest_i - s_i^k) + c_2 \cdot rand_2 \cdot (gbest - s_i^k) \qquad (2.48)$$
where
$V_i^k$ : velocity of agent $i$ at iteration $k$,
$w$ : weighting function,
$c_j$ : weighting factor,
$rand$ : random number between 0 and 1,
$s_i^k$ : current position of agent $i$ at iteration $k$,
$pbest_i$ : personal best of agent $i$,
$gbest$ : global best of the group.
The following weighting function [2.12] is usually utilized in (2.48)

$$w = w_{max} - \frac{w_{max} - w_{min}}{iter_{max}} \times iter \qquad (2.49)$$

where
$w_{max}$ : initial weight,
$w_{min}$ : final weight,
$iter_{max}$ : maximum iteration number,
$iter$ : current iteration number.
Using the above equation, a velocity which gradually moves the agent closer to pbest and gbest can be calculated. The current position (the searching point in the solution space) can then be modified by the following equation [2.12]:

$$s_i^{k+1} = s_i^k + V_i^{k+1} \qquad (2.50)$$
The general flow chart of PSO is shown in Fig. 2.14 and the stepwise procedure is detailed below.

Step 1: Generation of the initial condition of each agent. Initial searching points ($s_i^0$) and velocities ($v_i^0$) of each agent are usually generated randomly within the allowable range. The current searching point is set to pbest for each agent. The best evaluated value of pbest is set to gbest and the agent number with the best value is stored.
Step 2: Evaluation of the searching point of each agent. The objective function value is calculated for each agent. If the value is better than the current pbest of the agent, the pbest value is replaced by the current value. If the best value of pbest is better than the current gbest, gbest is replaced by the best value and the agent number with the best value is stored.
Fig. 2.14 General flow chart of PSO (Step 1: generation of the initial condition of each agent; Step 2: evaluation of the searching point of each agent; Step 3: modification of each searching point; the loop repeats until the maximum iteration is reached)
Step 3: Modification of each searching point.
The current search point of each agent is changed using (2.48), (2.49) and (2.50).
Step 4: Checking the exit condition. If the current iteration number reaches the predetermined maximum iteration number, then exit; otherwise, go to Step 2.
The features of the searching procedure of PSO can be summarized as follows:
1. As shown in (2.48), (2.49) and (2.50), PSO can essentially handle continuous optimization problems.
2. PSO utilizes several search points, like the genetic algorithm, and the search points gradually approach the optimal point using their pbest and the gbest information.
3. The first term of the right-hand side (RHS) of (2.48) corresponds to diversification in the search procedure, while the second and third terms correspond to intensification. The method thus has a well-balanced mechanism for utilizing diversification and intensification efficiently.
4. PSO can handle continuous optimization problems with continuous state variables in an n-dimensional solution space.
Feature (3) can be explained further as follows. The RHS of (2.48) consists of three terms. The first term is the previous velocity of the agent; the second and third terms are utilized to change the velocity of the agent. Without the second and third terms, the agent will keep on "flying" in the same direction until it hits the boundary; it tries to explore new areas, and therefore the first term corresponds to diversification in the search procedure. On the other hand, without the first term, the velocity of the "flying" agent is determined only by its current position and its best positions in history; the agents will try to converge to their pbests and/or the gbest, and therefore these terms correspond to intensification in the search procedure.
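A minimal sketch of (2.48)-(2.50) with the linearly decreasing inertia weight of (2.49) is given below; the default parameter values ($c_1 = c_2 = 2.0$, $w$ between 0.9 and 0.4, the bounds) are common choices in the PSO literature and not values taken from this thesis.

import numpy as np

def pso(objective, dim, n_agents=30, iters=200, c1=2.0, c2=2.0,
        w_max=0.9, w_min=0.4, bounds=(-10.0, 10.0), rng=None):
    # Minimal PSO: minimises `objective` over a dim-dimensional space.
    rng = rng or np.random.default_rng()
    lo, hi = bounds
    s = rng.uniform(lo, hi, (n_agents, dim))     # positions s_i
    v = np.zeros((n_agents, dim))                # velocities V_i
    pbest = s.copy()
    pbest_val = np.array([objective(p) for p in s])
    g = pbest[np.argmin(pbest_val)].copy()       # gbest
    for it in range(iters):
        w = w_max - (w_max - w_min) * it / iters # inertia weight, (2.49)
        r1, r2 = rng.random((2, n_agents, dim))
        v = w * v + c1 * r1 * (pbest - s) + c2 * r2 * (g - s)   # (2.48)
        s = s + v                                # position update, (2.50)
        vals = np.array([objective(p) for p in s])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = s[improved], vals[improved]
        g = pbest[np.argmin(pbest_val)].copy()
    return g, pbest_val.min()

# Example: minimise the sphere function in 5 dimensions.
best, val = pso(lambda p: np.sum(p ** 2), dim=5)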
2.5.3 Bacterial foraging optimization (BFO)
2.5.3.1 Introduction
Natural selection tends to eliminate animals with poor "foraging strategies" (methods for
locating, handling, and ingesting food) and favor the propagation of genes of those
animals that have successful foraging strategies since they are more likely to enjoy
reproductive success (they obtain enough food to enable them to reproduce). After
many generations, poor foraging strategies are either eliminated or shaped into good ones (redesigned). Logically, such evolutionary principles have led scientists in the field of "foraging theory" to hypothesize that it is appropriate to model the activity of foraging as an optimization process: a foraging animal takes actions to maximize the energy obtained per unit time spent foraging, in the face of constraints presented by its own physiology and environment.
2.5.3.2 Bacterial foraging

Bacteria have the tendency to gather in nutrient-rich areas by an activity called chemotaxis. It is known that bacteria swim by rotating whip-like flagella driven by a reversible motor embedded in the cell wall. E. coli has 8-10 flagella placed randomly on the cell body. When all the flagella rotate counterclockwise, they form a compact bundle that propels the cell along a helical trajectory, which is called a run. When the flagella rotate clockwise, they pull on the bacterium in different directions, which causes the bacterium to tumble. The four steps involved in bacterial foraging are briefly outlined next.
Fig. 2.15 Swimming, tumbling and chemotactic behavior of E. coli
(1) Chemotaxis: An E. coli bacterium can move in two different ways: it can run (swim for a period of time) or it can tumble, and it alternates between these two modes of operation throughout its entire lifetime. In the BFO, a unit walk in a random direction represents a tumble, and a unit walk in the same direction as the last step represents a run. After one step, the position of the $i$th bacterium can be represented [2.13] as
θ i ( j + 1, k , l ) = θ i ( j , k , l ) + C (i )φ ( j )
(2.51)
where $\theta^i(j,k,l)$ represents the $i$th bacterium at the $j$th chemotactic, $k$th reproductive and $l$th elimination-dispersal step. $C(i)$ is the length of a unit walk in the random direction and is assumed to be constant, and $\phi(j)$ is the direction angle of the $j$th step. During a run, $\phi(j)$ is the same as $\phi(j-1)$; otherwise, $\phi(j)$ is a random angle within the range $[0, 2\pi]$. If the cost at $\theta^i(j+1,k,l)$ is better than the cost at $\theta^i(j,k,l)$, the bacterium takes another step of size $C(i)$ in the same direction; otherwise it tumbles. This swim process continues as long as the cost keeps reducing, but only up to a maximum number of steps, $N_s$.
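The tumble-and-run behaviour of a single bacterium can be sketched as follows; this covers the chemotactic step of (2.51) only (swarming, reproduction and elimination-dispersal are omitted), and the function and parameter names are illustrative.

import numpy as np

def chemotaxis_step(theta, cost, C=0.1, Ns=4, rng=None):
    # One chemotactic step for a single bacterium, after (2.51):
    # tumble to a random direction, then swim for at most Ns further
    # steps in that direction while the cost keeps improving.
    rng = rng or np.random.default_rng()
    phi = rng.normal(size=theta.shape)     # random tumble direction
    phi /= np.linalg.norm(phi)
    theta = theta + C * phi                # unit walk, (2.51)
    J = cost(theta)
    for _ in range(Ns):
        candidate = theta + C * phi
        J_new = cost(candidate)
        if J_new < J:                      # run while the cost reduces
            theta, J = candidate, J_new
        else:
            break
    return theta, J

# Example: one chemotactic step on a quadratic cost surface.
theta, J = chemotaxis_step(np.array([2.0, -1.0]), lambda t: np.sum(t ** 2))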
(2) Swarming: In times of stress, bacteria release attractants to signal other bacteria to swarm together. Each bacterium, however, also releases a repellent to signal others to keep a minimum distance from it. Thus all of them have cell-to-cell attraction via attractants and cell-to-cell repulsion via repellents. The cell-to-cell signalling in an E. coli swarm may be represented [2.13] by the function
$$J_{cc}(\theta, P(j,k,l)) = \sum_{i=1}^{S} J_{cc}(\theta, \theta^i(j,k,l)) = \sum_{i=1}^{S} \left[ -d_a \exp\left( -w_a \sum_{m=1}^{p} (\theta_m - \theta_m^i)^2 \right) \right] + \sum_{i=1}^{S} \left[ h_r \exp\left( -w_r \sum_{m=1}^{p} (\theta_m - \theta_m^i)^2 \right) \right] \qquad (2.52)$$
where $J_{cc}(\theta, P(j,k,l))$ represents the objective function value to be added to the actual objective function, $S$ is the total number of bacteria, $p$ is the number of variables to be optimized, $\theta = [\theta_1, \theta_2, \dots, \theta_p]^T$ is a point in the $p$-dimensional search domain, $d_a$ is the depth of the attractant, $w_a$ the width of the attractant, $h_r$ the height of the repellent and $w_r$ the width of the repellent.
(3) Reproduction: A reproduction step is taken after $N_c$ chemotactic steps. Let $S_r = S/2$ be the number of population members who have had sufficient nutrients to reproduce, i.e. to split into two. For reproduction, the population is sorted in order of ascending fitness. The $S_r$ least healthy bacteria die and the other $S_r$ bacteria
each split into two identical ones which occupy the same positions in the environment.
This method keeps the population size constant.
(4) Elimination and Dispersal: Since bacteria may get stuck around their initial positions or local optima, the diversity of the BFO is allowed to change, either gradually or suddenly, to eliminate the possibility of being trapped in local minima. The dispersal event happens after a certain number of reproduction processes. A bacterium is chosen, according to a preset probability $p_{ed}$, to be dispersed and moved to another position within the environment. These events may effectively prevent trapping in local minima, but may also unexpectedly disturb the optimization process. The mathematical treatment of this concept is presented in [2.13].
2.5.4 Artificial immune system (AIS)
The human immune system is a very complex system formed by a large number of cells,
molecules and diverse mechanisms. The main function of immunity is to protect our
bodies from the invasion of external microorganisms. The cells and molecules responsible
for immunity constitute biological immune system (BIS). The AIS is developed by
following the principles of BIS. Bersini first used immune algorithms to solve practical
problems. The books [2.14, 2.15] provide excellent materials about the various principles
and algorithms of AIS. According to BIS theory our body immunity is composed of two
defense lines: innate and adaptive immunity. Innate immunity is nonspecific which means
that it is independent of the foreign antigen. The adaptive immunity has memory and
learning capabilities and it is antigen dependent, meaning that each different type of antigen
provokes a different immune response. The main components of the adaptive immunity
are the cells called B lymphocytes or simply B cells. When B lymphocytes are stimulated by
a specific antigen, they produce a large number of molecules called antibodies, which play a
major role in the adaptive immune response.
The clonal selection principle of AIS describes how the immune cells eliminate a foreign antigen and is a simple but efficient approximation algorithm for achieving an optimum solution. The steps involved in the clonal selection algorithm are as follows.
Step 1: Initialize a number of antibodies (immune cells), which represent the initial population.

Step 2: When an antigen or pathogen invades the organism, a number of antibodies that recognize the antigen survive. In Fig. 2.16, only antibody C is able to recognize the antigen, as its structure fits a portion of the pathogen, so the fitness of antibody C is higher than that of the others.

Step 3: The immune cells that recognize the antigen undergo cellular reproduction. During reproduction the somatic cells reproduce in an asexual form, i.e. there is no crossover of genetic material during cell mitosis. The new cells are copies (clones) of their parents, as shown for antibody C.

Step 4: A portion of the cloned cells undergo a mutation mechanism known as somatic hypermutation, as described in [2.15].

Step 5: The affinity of every cell with each other is a measure of the similarity between them, calculated as the distance between the two cells. The antibodies present in a memory response have, on average, a higher affinity than those of the early primary response; this phenomenon is referred to as maturation of the immune response. During the mutation process the fitness as well as the affinity of the antibodies changes. In each iteration, after cloning and mutation, those antibodies which have higher fitness and higher affinity are allowed to enter the pool of efficient cells; cells with low affinity or self-reactive receptors are eliminated.

Step 6: At each iteration, among the efficient immune cells some become effector cells (plasma cells), while others are maintained as memory cells. The effector cells secrete antibodies, while the memory cells have a longer life span so as to act faster or more effectively when the organism is exposed to the same or a similar pathogen in the future.

Step 7: The process continues until the termination condition is satisfied; otherwise steps 2 to 7 are repeated.
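A hypothetical real-valued sketch of Steps 1-7 is given below; the rank-proportional hypermutation rate and the selection rule are illustrative simplifications of the clonal selection algorithm, and all names are assumptions.

import numpy as np

def clonal_selection(affinity, dim, pop_size=20, n_clones=5,
                     beta=0.1, generations=100, rng=None):
    # affinity: fitness of an antibody (higher = better match to the antigen).
    # The best cells are cloned and hypermutated, with stronger mutation for
    # lower-ranked clones, and the fittest mutants re-enter the pool.
    rng = rng or np.random.default_rng()
    pop = rng.uniform(-1, 1, (pop_size, dim))
    for _ in range(generations):
        fit = np.array([affinity(a) for a in pop])
        order = np.argsort(fit)[::-1]          # best antibodies first
        pool = []
        for rank, idx in enumerate(order):
            # somatic hypermutation: rate grows with (normalised) rank
            rate = beta * (rank + 1) / pop_size
            clones = pop[idx] + rate * rng.normal(size=(n_clones, dim))
            pool.append(pop[idx])
            pool.extend(clones)
        pool = np.array(pool)
        fit = np.array([affinity(a) for a in pool])
        pop = pool[np.argsort(fit)[::-1][:pop_size]]   # keep the fittest
    return pop[0]

# Example: find the antibody closest to the origin (maximise -distance).
best = clonal_selection(lambda a: -np.sum(a ** 2), dim=3)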
Fig. 2.16 The Clonal Selection Principle
References

[2.1] B. Widrow and S. D. Stearns, "Adaptive Signal Processing", Prentice-Hall, Englewood Cliffs, New Jersey, 1985.
[2.2] S. Haykin, "Adaptive Filter Theory", 4th edition, Pearson Education Asia, 2002.
[2.3] S. Haykin, "Neural Networks: A Comprehensive Foundation", 2nd edition, Pearson Education Asia, 2002.
[2.4] J. E. Dayhoff, "Neural Network Architectures: An Introduction", Van Nostrand Reinhold, New York, 1990.
[2.5] N. K. Bose and P. Liang, "Neural Network Fundamentals with Graphs, Algorithms, Applications", TMH Publishing Company Ltd., 1998.
[2.6] R. O. Duda, P. E. Hart and D. G. Stork, "Pattern Classification", 2nd edition, John Wiley & Sons, Inc., 2001.
[2.7] Y. H. Pao, "Adaptive Pattern Recognition and Neural Networks", Addison-Wesley, Reading, Massachusetts, 1989.
[2.8] Y. H. Pao, G. H. Park and D. J. Sobajic, "Learning and generalization characteristics of the random vector functional link net", Neurocomputing, vol. 6, pp. 163-180, 1994.
[2.9] J. C. Patra and R. N. Pal, "A functional link artificial neural network for adaptive channel equalization", Signal Processing, vol. 43, no. 2, pp. 181-195, May 1995.
[2.10] D. E. Goldberg, "Genetic Algorithms in Search, Optimization and Machine Learning", Addison-Wesley, 1989.
[2.11] D. E. Goldberg and K. Deb, "A comparative analysis of selection schemes used in genetic algorithms", Foundations of Genetic Algorithms, vol. I, pp. 53-69, 1991.
[2.12] J. Kennedy, R. C. Eberhart and Y. Shi, "Swarm Intelligence", Morgan Kaufmann Publishers, San Francisco, 2001.
[2.13] K. M. Passino, "Biomimicry of bacterial foraging for distributed optimization and control", IEEE Control Systems Magazine, vol. 22, no. 3, pp. 52-67, June 2002.
[2.14] D. Dasgupta, "Artificial Immune Systems and their Applications", Springer-Verlag, 1999.
[2.15] L. N. de Castro and F. J. Von Zuben, "Learning and optimization using the clonal selection principle", IEEE Transactions on Evolutionary Computation, Special Issue on Artificial Immune Systems, vol. 6, no. 3, pp. 239-251, 2002.
Chapter 3

Development of a New Cascaded Functional Link Artificial Neural Network (CFLANN) for Nonlinear Dynamic System Identification
3.1 Introduction

The identification of nonlinear dynamic systems plays a key role in the control, communication and pattern recognition areas [3.1]. It finds wide applications in
communication and pattern recognition areas [3.1]. It finds wide applications in
many diverse fields such as electronic circuit design, environmental system analysis,
biological and medical systems and control system design [3.2]. The dynamic system
identification task is basically a model estimation process of capturing the dynamics of the
system using the measured data. This adaptive model can be used for prediction, system
design as well as control. Because of the emerging importance of identification problems
various attempts are currently underway to develop efficient nonlinear dynamic models of
complex real processes [3.38].
In recent years, the artificial neural network (ANN) has proved to be a potential tool for system identification in highly nonlinear dynamic environments. The ANN model has two distinct advantages: (i) an inherent ability to learn by optimizing an appropriate error function and (ii) an excellent approximation capability for matching a nonlinear function. The ANN architectures used for this purpose are (i) the multilayer artificial neural network (MLANN), (ii) the radial basis function (RBF) network, (iii) the recurrent neural network (RNN) and (iv) the functional link ANN (FLANN). The MLANN trained with the back propagation (BP) algorithm, and many variations of it, has been successfully employed for the system identification task [3.3]-[3.7]. The MLANN structure offers robustness and effective modeling and control capability for complex dynamic plants. Several real processes, such as the truck-backer-upper control problem [3.5] and the robot arm [3.6], have also employed the MLANN. However, it is associated with some inherent drawbacks, e.g. the multiple local minima problem, the difficulty of selecting the number of hidden units and the possibility of over-fitting.
Subsequently, RBF neural networks were introduced and have proved to be a useful alternative to the MLANN for effective identification of nonlinear dynamic systems [3.8]-[3.9]. The RBF neural network employs a simple structure, can learn functions with local variations and discontinuities effectively and also possesses good universal approximation capability [3.10]. However, the RBF has the drawback that both the number and the locations of its centers must be chosen appropriately. In recent years the wavelet neural network (WNN) [3.11], employing nonlinear wavelet basis functions, has been proposed as a useful approach to nonlinear system identification [3.12]-[3.16]. The use of β-spline and neuro-fuzzy functions in place of wavelet basis functions has also been suggested for system identification [3.16]. The advantage of using the WNN is that the local characteristics of the wavelets enable efficient estimation of regressive functions exhibiting localized regularities. Various forms of RNN and recurrent neuro-fuzzy models have also been successfully employed for identification and control tasks [3.17]-[3.22]. These networks are essentially dynamic systems whose internal states evolve according to certain nonlinear state equations. Because of their dynamic nature they have been successfully validated for the identification of complex dynamic systems. However, this approach does not provide any solution for obtaining minimal dimensions of the associated recurrent structure.
The learning capabilities of the various neural networks used for system identification have significant effects on the overall performance of the adaptive systems. If the information content of the data input to the network can be modified in a suitable manner, the network would be able to extract the hidden features of the data. This is the prime motivation behind the functional link mapping used in the FLANN. The functional link expansions map the original input space into higher dimensions, which helps to reduce the burden during the training phase of the network. The functional link acts on each element of the input vector and generates a set of linearly independent functions. They represent an enhanced version of the input information and thus help to improve the training time and learning accuracy.
The FLANN is a single-neuron, single-layer network first proposed by Pao [3.23]. The structure of the FLANN is simple, as it represents a flat net with no hidden layers; therefore the computation and learning algorithm used in the architecture are straightforward. It has already been applied to many diverse fields, including function approximation [3.25]-[3.26], pattern classification [3.23]-[3.24], intelligent control [3.27], nonlinear channel equalization [3.28]-[3.29], system identification [3.30]-[3.32], heating, ventilating and air conditioning (HVAC) systems [3.33], signal enhancement [3.34], nonlinearity compensation of sensors [3.35]-[3.36] and financial forecasting [3.37]. It is in general observed that in many of these applications the number of functional expansions involved in the FLANN structure is exceedingly large. This increases the computational burden and hence the overall learning time. Thus there is a need to introduce an alternative ANN structure for modeling nonlinear plants which combines structural simplicity and low computational load with performance equivalent to or better than that offered by either the MLANN or FLANN based models.
Therefore the main objective of the current work is to introduce an efficient new FLANN
model known as cascaded FLANN (CFLANN) with its learning algorithm for nonlinear
dynamic system identification task and evaluate its performance by conducting simulation
based identification experiments of typical nonlinear plants. The performance of the
proposed model is validated and compared by using the same identification examples in
MLANN and FLANN based approaches.
3.2 Nonlinear dynamic system identification
System identification is a key problem in system theory and is concerned with the mathematical representation of a system. A model of a system is represented by an operator $P$ from an input space $X$ into an output space $Y$, and the objective is to characterize the class $\mathcal{P}$ to which $P$ belongs. The problem of identification is to obtain a class $\hat{\mathcal{P}} \subset \mathcal{P}$ and an element $\hat{P} \in \hat{\mathcal{P}}$ so that $\hat{P}$ approximates $P$ in some predefined sense. In static systems $X$ and $Y$ represent subsets of $R^n$ and $R^m$ respectively, but in dynamical systems these are assumed to be bounded Lebesgue-integrable functions on the interval $[0, T]$ or $[0, \infty]$ [3.4]. The choice of the class of identification models and the specific methods used to determine $\hat{P}$ is related to the desired accuracy and the analytical tractability. In many practical situations these decisions depend upon prior information about the plant to be identified.
Fig. 3.1 shows the identification scheme of a dynamic nonlinear system.
Fig. 3.1 Identification scheme of a dynamic system (the model $\hat{P}$ receives the same input $x$ as the system $P$; the error $e$ between $P(x)$ and $\hat{P}(x)$ drives the update algorithm)
In this method compact sets $X_i \subset R^n$ are mapped into elements $y_i \in R^m$ $(i = 1, 2, \dots)$ in the output space by the operator $P$. The elements of $X_i$ denote the input patterns corresponding to class $y_i$. In dynamical systems the operator $P$ of a given plant is defined by the input-output pairs $\{x(t), y(t)\},\ t \in [0, T]$. The objective of identification is to obtain $\hat{P}$ so that
$$\| \hat{y} - y \| = \| \hat{P}(x) - P(x) \| \le \varepsilon, \quad x \in X \qquad (3.1)$$

where $\varepsilon$ is a pre-specified small value and $\| \cdot \|$ denotes the norm on the output space. In Fig. 3.1, $\hat{P}(x) = \hat{y}$, $P(x) = y$ and $e$ denote the output of the model, the output of the system and the error between the two respectively. Therefore $e$ is given by

$$e = y - \hat{y} \qquad (3.2)$$
The input $x$ is assumed to consist of zero-mean uniformly distributed random values lying between $-1$ and $+1$. The plant is assumed to be stable, with a known parameterization but with unknown parameter values. In the present study single-input single-output (SISO) plants of four different nonlinear models are considered. These plants are described by the nonlinear difference equations (3.3)-(3.6) given in [3.4].
Model 1:
$$y(k+1) = \sum_{i=0}^{n-1} \alpha_i\, y(k-i) + g[x(k), x(k-1), \dots, x(k-m+1)] \qquad (3.3)$$

Model 2:
$$y(k+1) = f[y(k), y(k-1), \dots, y(k-n+1)] + \sum_{i=0}^{m-1} \beta_i\, x(k-i) \qquad (3.4)$$

Model 3:
$$y(k+1) = f[y(k), y(k-1), \dots, y(k-n+1)] + g[x(k), x(k-1), \dots, x(k-m+1)] \qquad (3.5)$$

Model 4:
$$y(k+1) = f[y(k), y(k-1), \dots, y(k-n+1);\; x(k), x(k-1), \dots, x(k-m+1)] \qquad (3.6)$$
In these models x(k ) and y (k ) represent the input and the output of the SISO plant at the
k th time instant respectively and m ≤ n . α i and β i are coefficients of the linear combiner
part of the models. In this chapter the nonlinear part of the given plants, f (.) and g (.) are
modeled by Cascaded FLANN (CFLANN) structures instead of using the MLANN
structure as suggested in [3.4]. It is assumed that the plants under consideration are
bounded-input-bounded-output (BIBO) stable. In order to achieve stability and to ensure
that the parameters of the model converge, a series-parallel scheme [3.4] is employed. In
this scheme the output of the plant instead of that of the ANN models is fed back to the
models during the training operation.
3.3 Cascaded functional link artificial neural network
Before dealing with the CFLANN adaptive structure and its training algorithm this section first
introduces the FLANN model and its associated training rules.
3.3.1 The functional link artificial neural network (FLANN)
When the FLANN structure is employed for identification and control tasks, its learning capability significantly affects its performance. In this model the input is nonlinearly mapped so that the network can extract the hidden information in the data. This is the main motivation behind the functional link mapping [3.4] of the FLANN structure. Such mapping also reduces the need for additional layers in the network. The block diagram of a FLANN structure is shown in Fig. 3.2. Let the input signal vector be represented as
$$X(k) = [x(k)\;\; x(k-1)\; \dots\; x(k-m+1)]^T \qquad (3.7)$$

Then the functional expansion (FE) block maps each element $x(k)$ into $(2p+1)$ nonlinearly expanded independent components. For trigonometric expansion $\phi[x(k)]$ is given by

$$\phi[x(k)] = [x(k), \cos\{\pi x(k)\}, \sin\{\pi x(k)\}, \dots, \cos\{p\pi x(k)\}, \sin\{p\pi x(k)\}]^T = [\phi_1\{x(k)\}, \phi_2\{x(k)\}, \dots, \phi_{2p+1}\{x(k)\}]^T \qquad (3.8)$$

where $p$ is an integer. When each element of $X(k)$ is expanded, the expanded vector is represented as

$$\phi[X(k)] = [\phi_1\{x(k)\} \dots \phi_{2p+1}\{x(k)\}\;\; \phi_{2p+2}\{x(k-1)\} \dots \phi_{2(2p+1)}\{x(k-1)\}\; \dots\; \phi_N\{x(k-m)\}]^T \qquad (3.9)$$

where $N = m(2p+1)$ and $m$ is the number of signal samples fed into the FLANN model.
The output $\hat{y}(k)$ of Fig. 3.2 is then given by

$$\hat{y}(k) = f\{\phi^T(X(k))\, W(k) + b(k)\} \qquad (3.10)$$

where $b(k)$ is the bias weight and $f\{\cdot\}$ denotes the tanh function. $W(k)$ represents the weight vector and is defined as

$$W(k) = [w_1(k), w_2(k), \dots, w_N(k)]^T \qquad (3.11)$$

The weights of the FLANN model are trained using the algorithm

$$W(k+1) = W(k) + \mu\, \phi\{X(k)\}\, e(k)\, (1 - \hat{y}^2(k)) \qquad (3.12)$$

where the error term

$$e(k) = y(k) - \hat{y}(k) \qquad (3.13)$$

The convergence coefficient is represented by $\mu$ and its value lies between 0 and 1. Equations (3.8), (3.9), (3.10) and (3.12) represent the key equations of the FLANN algorithm.
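One training step of (3.10), (3.12) and (3.13) with the trigonometric expansion of (3.8) can be sketched as follows; the function names are illustrative, and updating the bias $b$ with the same gradient factor is an added assumption since (3.12) updates only $W$.

import numpy as np

def trig_expand(X, p):
    # Trigonometric expansion of (3.8)-(3.9): 2p+1 terms per element of X.
    out = []
    for x in X:
        out.append(x)
        for q in range(1, p + 1):
            out += [np.cos(q * np.pi * x), np.sin(q * np.pi * x)]
    return np.array(out)

def flann_step(X, y, W, b, mu=0.1, p=2):
    # One FLANN training step following (3.10), (3.12) and (3.13).
    phi = trig_expand(X, p)
    y_hat = np.tanh(phi @ W + b)          # model output, (3.10)
    e = y - y_hat                         # error, (3.13)
    g = mu * e * (1.0 - y_hat ** 2)       # common tanh-derivative factor of (3.12)
    # (3.12) updates W; updating b the same way is an assumption.
    return W + g * phi, b + g, y_hat

# Example: m = 3 input samples and p = 2 give N = 15 weights.
X = np.array([0.2, -0.1, 0.05])
W, b, y_hat = flann_step(X, y=0.3, W=np.zeros(15), b=0.0)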
3.3.2 Cascaded functional link artificial neural network (CFLANN)

For the identification of complex nonlinear static and dynamic systems the number of branches in the FLANN increases exponentially with the complexity of the identification problem. As a result the structural complexity of the FLANN model increases, and even then the performance is at times not improved. Keeping this in view, a structurally simple model known as a two-stage FLANN is proposed, which is expected to offer less computational load. In this case the output of the first FLANN stage undergoes another functional expansion. The weights of both stages of the cascaded FLANN are updated using the CFLANN algorithm. Referring to the proposed CFLANN identification model in Fig. 3.3, the output of stage 1 is given by

$$\hat{y}_2(k) = \tanh\{\hat{y}_1(k)\} \qquad (3.14)$$

where $\hat{y}_1(k)$ is given by (3.10).
In the second stage the output $\hat{y}_2(k)$ undergoes nonlinear expansion in the FE2 block. Its output is represented as

$$\psi\{\hat{y}_2(k)\} = [\psi_1\{\hat{y}_2(k)\}, \psi_2\{\hat{y}_2(k)\}, \dots, \psi_M\{\hat{y}_2(k)\}]^T \qquad (3.15)$$

where $\psi_i\{\hat{y}_2(k)\}$ denotes the $i$th trigonometric expansion of $\hat{y}_2(k)$, $1 \le i \le M$. The estimated output of the second stage, i.e. the output of the CFLANN model, is then given by

$$\hat{y}(k) = \tanh\{\hat{y}_3(k)\} \qquad (3.16)$$

where

$$\hat{y}_3(k) = \psi^T\{\hat{y}_2(k)\}\, H(k) + b_2(k) \qquad (3.17)$$

and

$$H(k) = [h_1(k), h_2(k), \dots, h_M(k)]^T \qquad (3.18)$$

The weights of the second stage are trained using (3.19)

$$H(k+1) = H(k) + \mu\, \psi\{\hat{y}_2(k)\}\, e(k)\, (1 - \hat{y}^2(k)) \qquad (3.19)$$

where

$$e(k) = y(k) - \hat{y}(k) \qquad (3.20)$$
and $y(k)$ is the output of the plant or system to be identified. Based on the principle of the back propagation algorithm, a weight update algorithm for the CFLANN model is derived. The corresponding weight-update equation of the first stage is given by

$$W(k+1) = W(k) + \mu\, e(k)\, (1 - \hat{y}^2(k))\, (1 - \hat{y}_2^2(k))\, [H^T(k+1)\, \psi'\{\hat{y}_2(k)\}]\, \phi\{X(k)\} \qquad (3.21)$$

where $\psi'\{\hat{y}_2(k)\}$ is the first-order derivative of $\psi\{\hat{y}_2(k)\}$.
Fig. 3.2 A FLANN model for identification of nonlinear dynamic systems (the input vector $X(k)$ is functionally expanded into $\phi_1\{x(k)\}, \dots, \phi_N\{x(k-m)\}$, combined by the weights $w_1(k), \dots, w_N(k)$ with bias $b(k)$, and the error $e(k) = y(k) - \hat{y}(k)$ drives the FLANN based algorithm)
Fig. 3.3 A CFLANN model for identification of nonlinear dynamic systems (the stage-1 output $\hat{y}_2(k) = \tanh\{\hat{y}_1(k)\}$ is expanded again in FE2 into $\psi_1\{\hat{y}_2(k)\}, \dots, \psi_M\{\hat{y}_2(k)\}$, combined by the weights $h_1(k), \dots, h_M(k)$ with bias $b_2(k)$, and the error $e(k)$ drives the CFLANN update algorithm)
Equations (3.10), (3.14), (3.16), (3.17), (3.19) and (3.21) represent the key equations relating
to the CFLANN algorithm.
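The forward computation of the two stages, after (3.10) and (3.14)-(3.17), can be sketched as follows; the updates (3.19) and (3.21) can then be applied using the returned intermediates. The function names and the choice of the same expansion order $p$ for both stages are illustrative assumptions.

import numpy as np

def trig_expand(v, p):
    # Trigonometric expansion as in (3.8): 2p+1 terms per element of v.
    out = []
    for x in v:
        out.append(x)
        for q in range(1, p + 1):
            out += [np.cos(q * np.pi * x), np.sin(q * np.pi * x)]
    return np.array(out)

def cflann_forward(X, W, b1, H, b2, p=2):
    # Forward pass of the two-stage CFLANN.
    phi = trig_expand(X, p)                 # FE1 block
    y1 = phi @ W + b1                       # stage-1 linear combiner, as in (3.10)
    y2 = np.tanh(y1)                        # (3.14)
    psi = trig_expand(np.array([y2]), p)    # FE2 block, (3.15)
    y3 = psi @ H + b2                       # (3.17)
    return np.tanh(y3), phi, psi, y2        # CFLANN output, (3.16)

# Example: m = 3 inputs with p = 2 give 15 stage-1 and 5 stage-2 weights.
X = np.array([0.2, -0.1, 0.05])
W, H = np.zeros(15), np.zeros(5)
y_hat, phi, psi, y2 = cflann_forward(X, W, 0.0, H, 0.0)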
3.4 Simulation study

To demonstrate the performance of the proposed CFLANN model, simulation results for the nonlinear identification of four typical dynamic plants are presented. In these examples, the series-parallel model is used to identify the given dynamic plants and the CFLANN algorithm is used to adjust the connecting weights of the CFLANN structure. To compare the performance of the proposed model, the same examples are also simulated using the MLANN and FLANN models. Keeping the best possible performance as the basis, the parameters of the FLANN and CFLANN were adjusted suitably. For training the MLANN and FLANN models, 50,000 iterations are carried out using a uniformly distributed random signal over the interval [-1, 1] as input. During the testing phase, the effectiveness of the proposed models is studied using the parallel scheme, where the input to the identified model [3.4] is
$$x(k) = \begin{cases} \sin\dfrac{2\pi k}{250} & \text{for } k \le 250 \\[2mm] 0.8\sin\dfrac{2\pi k}{250} + 0.2\sin\dfrac{2\pi k}{25} & \text{for } k > 250 \end{cases} \qquad (3.22)$$
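The test input of (3.22) is straightforward to generate; the function name and the sample count used below are illustrative choices.

import numpy as np

def test_input(K=600):
    # Test signal of (3.22) used during the parallel-scheme testing phase.
    k = np.arange(K)
    return np.where(
        k <= 250,
        np.sin(2 * np.pi * k / 250),
        0.8 * np.sin(2 * np.pi * k / 250) + 0.2 * np.sin(2 * np.pi * k / 25),
    )

x = test_input()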
Example 1: The difference equation of the nonlinear plant [3.4] to be identified is given as
y(k+1) = 0.3 y(k) + 0.6 y(k−1) + g[x(k)]        (3.23)
The linear parameters are 0.3 and 0.6 and the unknown function g(.) is considered as one of the nonlinear functions defined in (3.24)-(3.26).
g(x) = (4.0x³ − 1.2x² − 3.0x + 1.2) / (0.4x⁵ + 0.8x⁴ − 1.2x³ + 0.2x² − 3.0)        (3.24)
g(x) = 0.5 sin³(πx) − 2.0/(x³ + 2.0) − 0.1 cos(4πx) + 1.125        (3.25)
g(x) = 0.6 sin(πx) + 0.3 sin(3πx) + 0.1 sin(5πx)        (3.26)
To identify the plant, a series-parallel model described by the difference equation (3.27) is used
ŷ(k+1) = 0.3 y(k) + 0.6 y(k−1) + N[x(k)]        (3.27)
where N[x(k)] represents one of the MLANN, FLANN and CFLANN models. The MLANN model used for this purpose has a {1-20-10-1} structure. The FLANN input is expanded to 14 terms using trigonometric expansion. Both the convergence parameter μ and the momentum parameter η are chosen to be 0.1 for both of these models. In the CFLANN model the input is expanded into 10 terms (5 terms in the first stage and 5 terms in the second stage) for the example with the nonlinearity of (3.24), and into 12 terms (7 in the first stage and 5 in the second stage) for the examples with the nonlinearities of (3.25) and (3.26). The coefficient μ is chosen to be 0.1 in all examples. The responses of these dynamic systems are compared in Figs. 3.4(a)-3.4(i). From all these plots of Fig. 3.4 it is in general observed that the identification performance of the CFLANN model is better than that obtained from the MLANN and FLANN models. The sum of squared errors (SSE) computed between the actual and estimated responses of the various methods is shown in Table 3.1, which also indicates the improved performance of the proposed method compared to the other two.
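For reference, the plant dynamics of Example 1 can be stepped as in the brief sketch below; the function names are hypothetical, and g_324 implements the nonlinearity of (3.24) (either of the other two nonlinearities can be substituted for g).

import numpy as np

def g_324(x):
    """Rational nonlinearity of (3.24)."""
    num = 4.0 * x**3 - 1.2 * x**2 - 3.0 * x + 1.2
    den = 0.4 * x**5 + 0.8 * x**4 - 1.2 * x**3 + 0.2 * x**2 - 3.0
    return num / den

def plant_step(y_k, y_km1, x_k, g=g_324):
    """One step of the plant (3.23): y(k+1) = 0.3 y(k) + 0.6 y(k-1) + g[x(k)]."""
    return 0.3 * y_k + 0.6 * y_km1 + g(x_k)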
[Figs. 3.4(a)-(i): plant ("Plant") and model ("Model") outputs plotted against iteration (0-600) for each method and nonlinearity]
(a) Comparison of output response using CFLANN method (Example 1 with nonlinearity in (3.24))
(b) Comparison of output response using FLANN method (Example 1 with nonlinearity in (3.24))
(c) Comparison of output response using MLANN method (Example 1 with nonlinearity in (3.24))
(d) Comparison of output response using CFLANN method (Example 1 with nonlinearity in (3.25))
(e) Comparison of output response using FLANN method (Example 1 with nonlinearity in (3.25))
(f) Comparison of output response using MLANN method (Example 1 with nonlinearity in (3.25))
(g) Comparison of output response using CFLANN method (Example 1 with nonlinearity in (3.26))
(h) Comparison of output response using FLANN method (Example 1 with nonlinearity in (3.26))
(i) Comparison of output response using MLANN method (Example 1 with nonlinearity in (3.26))
Fig. 3.4 Comparison of identification performance of nonlinear plants (Example 1)
Example 2: In this example [3.4] the plant to be identified is of the Model-2 type and is represented by the difference equation
y(k+1) = f[y(k), y(k−1)] + x(k)        (3.28)
The unknown nonlinear function f is given by
f(y₁, y₂) = y₁ y₂ (y₁ + 2.5)(y₁ − 1.0) / (1.0 + y₁² + y₂²)        (3.29)
In this case the series-parallel scheme used to identify the plant is given as
ŷ(k+1) = N[y(k), y(k−1)] + x(k)        (3.30)
In the simulation study the structure used for the MLANN is {2-20-10-1}. In the FLANN, the two inputs are expanded into 24 terms, and the values of the convergence parameter μ and momentum parameter η are set at 0.05 and 0.1 respectively in both the MLANN and FLANN models. In the case of the CFLANN model the two-dimensional input is expanded into 17 terms (14 terms in the first stage and 3 terms in the second stage). The value of μ is chosen to be 0.1. The responses obtained from the plant and the various models are compared in Figs. 3.5(a)-(c). In this case also it is observed that the identification performance of the CFLANN is better than that obtained from the other two, as is also evident from the magnitude of the SSE shown in Table 3.1.
[Figs. 3.5(a)-(c): plant ("Plant") and model ("Model") outputs plotted against iteration (0-1000)]
(a) Comparison of output response using CFLANN method
(b) Comparison of output response using FLANN method
(c) Comparison of output response using MLANN method
Fig. 3.5 Comparison of identification performance of nonlinear plant of Example-2
Example 3: In this case the plant [3.4] belongs to the Model-3 type and is given by the difference equation
y(k+1) = f[y(k)] + g[x(k)]        (3.31)
where the unknown nonlinear functions f(.) and g(.) are represented as
f(y) = y(y + 0.3) / (1.0 + y²)        (3.32)
g(x) = x(x + 0.8)(x − 0.5)        (3.33)
The series-parallel scheme used is given by
ŷ(k+1) = N₁[y(k)] + N₂[x(k)]        (3.34)
where N₁ and N₂ each represent one of the MLANN, FLANN or CFLANN models. In the MLANN case both N₁ and N₂ have the structure {1-20-10-1}, whereas in the FLANN case trigonometric expansions of 14 and 24 terms respectively are used. The values of the convergence parameter μ and momentum parameter η are chosen to be 0.1 in both the MLANN and FLANN models. In the CFLANN model each of N₁ and N₂ is expanded into 14 terms (7 in the first stage and 7 in the second stage) and μ is taken as 0.1. The results of identification are plotted in Figs. 3.6(a)-(c). It may be seen that the CFLANN model estimates the plant response better than the MLANN and FLANN based methods. The SSE of Table 3.1 also indicates the same trend.
[Figs. 3.6(a)-(c): plant ("Plant") and model ("Model") outputs plotted against iteration (0-500)]
(a) Comparison of output response using CFLANN method
(b) Comparison of output response using FLANN method
(c) Comparison of output response using MLANN method
Fig. 3.6 Comparison of identification performance of nonlinear plant of Example-3
Example 4: The plant [3.4] in this case belongs to the Model-4 type and is described by the difference equation
y(k+1) = f[y(k), y(k−1), y(k−2), x(k), x(k−1)]        (3.35)
where the unknown nonlinear function f is given by
f[a₁, a₂, a₃, a₄, a₅] = (a₁ a₂ a₃ a₅ (a₃ − 1.0) + a₄) / (1.0 + a₂² + a₃²)        (3.36)
The series-parallel model used for the identification of this plant is given as
ŷ(k+1) = N[y(k), y(k−1), y(k−2), x(k), x(k−1)]        (3.37)
In the MLANN model N represents a {5-20-10-1} structure, and in the FLANN model the input and output are expanded to ten terms each using trigonometric expansion. Both the parameters μ and η are chosen to be 0.1 in these two cases. In the CFLANN model the input is expanded to 4 terms and the output to 6 terms in the first stage, and in the second stage the output is expanded to 3 terms. The value of μ chosen is 0.1. Figs. 3.7(a)-(c) show the comparative output responses of the three models. The simulation results also indicate that the identification performance is best in the proposed model, as is evident from the comparison of the SSE shown in Table 3.1.
[Figs. 3.7(a)-(c): plant ("Plant") and model ("Model") outputs plotted against iteration (0-800)]
(a) Comparison of output response using CFLANN model
(b) Comparison of output response using FLANN model
(c) Comparison of output response using MLANN model
Fig. 3.7 Comparison of identification performance of nonlinear plant of Example-4
The SSE computed by comparing the desired and estimated outputs for the four identification examples using the three different methods is shown in Table 3.1. It is observed that the proposed model in general yields the minimum SSE compared to the other two techniques.
Table 3.1
Comparison of the sum of squared errors (SSE) between the plant and the model outputs

Nonlinear plants       CFLANN    FLANN     MLANN
Ex-1 using (3.24)      5.52      5.76      7.74
Ex-1 using (3.25)      3.84      4.02      16.68
Ex-1 using (3.26)      0.1917    0.1846    291.18
Example-2              5.40      5.40      5.70
Example-3              1.15      1.70      1.20
Example-4              6.88      7.92      7.36
The computational complexity required in the identification of the plants given in Examples 1-4 is presented in Table 3.2. It is in general observed that the proposed CFLANN model involves the lowest computational complexity of the three models. Further, it is in general observed that the CFLANN offers the best identification performance compared to the existing MLANN and FLANN based models.
Table 3.2
Comparison of computational complexity of various system identification models

Types of     No. of     No. of      No. of      No. of    No. of
models       tanh()     Cos/Sin     weights     Adds.     Muls.
Example-1
MLANN        31         0           261         230       230
FLANN        1          14          15          14        15
CFLANN       2          10          12          11        12
Example-2
MLANN        31         0           281         250       250
FLANN        1          24          25          24        25
CFLANN       2          17          19          18        19
Example-3
MLANN        62         0           522         460       460
FLANN        2          38          40          39        40
CFLANN       2          14          16          15        16
Example-4
MLANN        31         0           341         310       310
FLANN        1          20          21          20        21
CFLANN       2          13          15          14        15
3.5 Conclusion
This chapter has introduced a new cascaded FLANN (CFLANN) model together with its learning algorithm. The adaptive structure is then used to identify nonlinear complex plants. Computer simulation based experiments have been carried out to validate its performance and to compare it with that obtained by other standard methods. The simulation results indicate that the proposed CFLANN structure involves the least computation and offers the best performance compared to the existing FLANN and MLANN based methods.
References
[3.1] Chia-Feng Juang, “A TSK-type recurrent fuzzy network for dynamic systems
processing by neural network and genetic algorithms”, IEEE Trans. on Fuzzy Systems, vol.
10, no. 2, pp. 155-170, April 2002.
[3.2] A. Lo Schiavo and A. M. Luciano, “Powerful and flexible fuzzy algorithm for
nonlinear dynamic system identification”, IEEE Trans. on Fuzzy Systems, vol. 9, no. 6, pp.
828-835, December 2001.
[3.3] N. V. Bhat, P. A. Minderman Jr., T. McAvoy and N. S. Wang, “Modeling chemical process systems via neural computation”, IEEE Contr. Syst. Mag., vol. 10, no. 3, pp. 24-29, April 1990.
[3.4] K. S. Narendra and K. Parthasarathy, “Identification and control of dynamical systems
using neural networks”, IEEE Trans. on Neural Networks, vol. 1, pp. 4-26, January 1990.
[3.5] D. N. Nguyen and B. Widrow, “Neural networks for self learning control system”,
Int. J. Contr., vol. 54, no. 6, pp. 1439-1451, 1991.
[3.6] G. Cembrano, G. Wells, J. Sarda and A. Ruggeri, “Dynamic control of a robot arm
based on neural networks”, Contr. Eng. Practice, vol. 5, no. 4, pp. 485-492, 1997.
[3.7] S. Lu and T. Basar, “Robust nonlinear system identification using neural network
models”, IEEE Trans. on Neural Networks, vol. 9, pp. 407-429, May 1998.
[3.8] S. Chen, S. A. Billings and P. M. Grant, “Recursive hybrid algorithm for nonlinear
system identification using radial basis function networks”, Int. J. Contr., vol. 55, no. 5, pp.
1051-1070, 1992.
[3.9] S. V. T. Elanayar and Y. C. Shin, “Radial basis function neural network for
approximation and estimation of nonlinear stochastic dynamic systems”, IEEE Trans. on
Neural Networks, vol. 5, pp. 594-603, July 1994.
[3.10] E. J. Hartman, J. D. Keeler and J. M. Kowalski, “Layered neural networks with
Gaussian hidden units as universal approximation”, Neural Comput., vol. 2, pp. 210-215,
1990.
[3.11] J. Zhang, G. G. Walter, Y. Miao and W. G. W. Lee, “Wavelet neural networks for
function learning”, IEEE Trans. Signal Processing, vol. 43, pp. 1485-1497, June 1995.
[3.12] Qinghua Zhang, “Using wavelet network in nonparametric estimation”, IEEE Trans.
on Neural Network, vol. 2, no. 2, pp. 227-236, 1997.
[3.13] M. K. Tsatsanis and G. B. Giannakis, “Time varying system identification and model validation using wavelets”, IEEE Trans. on Signal Processing, vol. 41, no. 12, pp. 3512-3523, 1993.
[3.14] Yonghong Tan, Xuanju Dang, Feng Liang and Chun-Yi Su, “Dynamic wavelet neural
network for nonlinear dynamic system identification”, Proc. of the 2000 IEEE
International Conf. on Control Applications, Anchorage, Alaska, USA, September 25-27,
2000, pp. 214-219.
[3.15] Y. Chen and S. Kawaji, “Evolving wavelet neural networks for system
identification “, Proc. of International Conf. on Electrical Engineering, Kitakyushu, Japan,
2000, pp. 279-282.
[3.16] Y. Chen and S. Kawaji, “Evolving the basis function neural networks for system
identification”, Int. J. Adv. Comput. Intell, vol. 5, no. 4, pp. 229-238, 2001.
[3.17] P. S. Sastry, G. Santharam and K. P. Unnikrishnan, “Memory neural networks for
identification and control of dynamical systems”, IEEE Trans. on Neural Networks, vol. 5,
pp. 306-319, March 1994.
[3.18] A. G. Parlos, K. T. Chong and A. F. Atiya, “Application of recurrent multilayer
perceptron in modeling of complex process dynamics”, IEEE Trans. on Neural Networks,
vol. 5, pp. 255-266, March 1994.
[3.19] P. A. Mastorocostas and John B. Theocharis, “A recurrent fuzzy-neural model for
dynamic system identification”, IEEE Trans. on Systems, Man and Cybernetics-Part B:
Cybernetics, vol. 32, no. 2, pp. 176-190, April 2002.
[3.20] Chen-Sen Ouyang and Shie-Jue Lee, “An improved TSK-type recurrent fuzzy
network for dynamic system identification”, Proc. of IEEE International Conf. on Systems,
Man and Cybernetics, Washington, USA, Oct. 5-8, 2003, vol. 4, pp. 3342-3347.
[3.21] Yen-Ping Chen and Jeen-Shing Wang, “A novel recurrent neural network with
minimal representation for dynamic system identification”, Proc. of IEEE International
Joint Conf. on Neural Networks, July 25-29, 2004, vol. 2, pp. 849-854.
[3.22] Jeen-Shing Wang and Yen-Ping Chen, “A fully automated neural network for
unknown dynamic system identification and control”, IEEE Trans. on Circuits and
Systems-I, vol. 53, no. 6, pp. 1363-1372, June 2006.
[3.23] Y. H. Pao, Adaptive Pattern Recognition and Neural Networks. Reading, MA: Addison-Wesley, 1989.
[3.24] A. Namatame and N. Ueda, “Pattern classification with Chebyshev neural network”,
Int. J. Neural Network, vol. 3, no. 4, pp. 23-31, March 1992.
[3.25] B. Igelnik and Y. H. Pao, “Stochastic choice of basis functions in adaptive function
approximation and the functional link net”, IEEE Trans. on Neural Network, vol. 6, no. 6,
pp. 1320-1329, Nov. 1995.
[3.26] T. T. Lee and J. T. Jeng, “The Chebyshev polynomial based unified model neural
networks for function approximations”, IEEE Trans. on System, man and Cybernetics-B,
vol. 28, pp. 925-935, December 1998.
[3.27] Y. H. Pao, S.M. Phillips and D. H. Sobajic, “Neural net computing and intelligent
control systems”, Int. J. Contr., vol. 56, no. 2, pp. 263-289, 1992.
[3.28] J. C. Patra, R. N. Pal, R. Baliarsingh and G. Panda, “Nonlinear channel equalization
for QAM signal constellation using artificial neural networks”, IEEE Trans. on Systems,
Man and Cybernetics- Part B, vol. 29, no.2, pp. 262-271, April 1999.
[3.29] J. C. Patra and R. N. Pal, “A functional link artificial neural network for adaptive
channel equalization”, Signal processing, vol. 43, pp. 181-195, May 1995.
[3.30] J. C. Patra, R. N. Pal, B. N. Chatterjee and G. Panda, “Identification of nonlinear
dynamic systems using functional link artificial neural networks”, IEEE Trans. on Systems,
Man and Cybernetics – Part B, vol. 29, no. 2, pp. 254-262, April 1999.
[3.31] J. C. Patra and A. C. Kot, “Nonlinear dynamic system identification using Chebyshev
functional link artificial neural networks”, IEEE Trans. on Systems, Man and Cybernetics –
Part B, vol. 32, no. 4, pp. 505-511, August 2002.
[3.32] S. Purwar, I. N. Kar and A. N. Jha, “Online system identification of complex systems
using Chebyshev neural networks”, Applied Soft Computing, 7, pp. 364-372, 2007.
[3.33] J. Teeter and M. Y. Chow, “Application of functional link neural network to HVAC thermal dynamic system identification”, IEEE Trans. Ind. Electron., vol. 45, no. 1, pp. 170-176, Feb. 1998.
[3.34] B. S. Lin, F. C. Chong and F. Lai, “A functional link network with higher order
statistics for signal enhancement”, IEEE Trans. on Signal Processing, vol. 54, no. 12, pp.
4821-4826, December 2006.
[3.35] J. C. Patra, G. Panda and R. Baliarsingh, “Artificial neural network based nonlinearity estimation of pressure sensor”, IEEE Trans. on Instrumentation and Measurement, vol. 43, no. 6, pp. 874-881, Dec. 1994.
[3.36] S. Mishra and G. Panda, “A novel method for designing LVDT and its comparison with conventional design”, IEEE Sensors Applications Symposium, Texas, USA, 7-9 Feb. 2006, pp. 129-134.
[3.37] R. Majhi, G. Panda and G. Sahoo, “Efficient prediction of foreign exchange using single layer artificial neural network”, Proc. of IEEE Conf. on Cybernetics and Intelligent Systems, Bangkok, June 2006, pp. 1-5.
[3.38] M. Pachter and O. R. Reynolds, “Identification of a discrete time dynamical system”,
IEEE Trans. Aerospace Electronic System, vol. 36, issue 1, pp. 212-225, 2000.
Chapter 4
Identification of IIR Plants using Comprehensive Learning Particle Swarm Optimization
4.1 Introduction
During the last two decades adaptive infinite-impulse response (IIR) filtering and identification have been an active field of research. They have been applied to linear prediction, adaptive differential pulse code modulation, channel equalization, process control, echo cancellation, adaptive array processing and intelligent instrumentation. Many real-world systems such as speech synthesis and recognition, acoustical modeling and the adaptive digital subscriber loop (ADSL) are recursive in nature, and it is advantageous to model such plants using IIR adaptive filters. The main advantage of an IIR system is that it provides significantly improved performance over an adaptive FIR filter having the same number of coefficients. This is due to the fact that the output feedback generates an infinite impulse response with a finite number of parameters. Further, the
output of an IIR filter approximates the desired response more effectively if both poles and zeros are present in the filter. Alternatively, to achieve a pre-specified performance, an IIR filter requires considerably fewer coefficients than the corresponding FIR filter. Adaptive IIR filters can thus effectively substitute the conventionally used adaptive FIR filters in many practical applications [4.1].
As the error surface of IIR filters is usually multimodal with respect to the filter coefficients, gradient based learning algorithms such as the least-mean-square (LMS) very often get stuck at local minima and their associated weights do not converge to the global optimum [4.2]. Such an algorithm tries to find the minimum point of the error surface by moving in the direction of the negative gradient. Like most learning algorithms, it may lead the filter to a local minimum when the error surface is multimodal. In addition, the convergence behavior depends on the choice of the step size and the initial values of the filter coefficients.
Adaptive IIR filtering has two distinct approaches which correspond to different formulations of the prediction error: the equation-error [4.3, 4.4] and output-error [4.5, 4.6] formulations. In the equation-error method the feedback coefficients are updated in an FIR form and are then copied to a second filter implemented in all-pole form. This formulation is essentially a type of adaptive FIR filtering; however, it may lead to biased estimates of the filter coefficients. The output-error formulation, on the other hand, updates the coefficients of the feedback path directly in a pole-zero recursive form and does not generate biased estimates of the coefficients. But the adaptive algorithm may converge to a local minimum of the mean square error (MSE), leading to an incorrect estimate of the coefficients, and its convergence properties are not easily predicted [4.10]. Of the two formulations, the output-error based approach provides improved performance in system identification if the problem of local minima associated with this algorithm is overcome.
Thus the main objective of this chapter is to propose a new adaptive algorithm, using a population based bio-inspired technique known as particle swarm optimization (PSO), for the identification of IIR or pole-zero systems, which is expected to overcome the local minima problem and to provide accurate estimates of the pole-zero coefficients. To improve performance on complex multimodal problems, a comprehensive learning PSO (CLPSO) algorithm has recently been proposed [4.9]. In this chapter the identification of IIR systems is also carried out using this structured stochastic search algorithm. Since the proposed technique is independent of the adaptive filter structure and is capable of converging to the global solution for multimodal problems, it is expected to be a potential candidate for the identification of IIR systems.
4.2 Related work
Swarm intelligence (SI) is an artificial intelligence technique which involves the study of collective behavior in decentralized and self-organized systems. SI systems are typically made up of a population of simple agents interacting locally with one another and with their environment. Although there is no centralized control structure dictating how individual agents should behave, local interactions between such agents lead to the emergence of global behavior. Natural examples of SI include ant colonies, bird flocking, animal herding, bacterial growth, honey bees and fish schooling. SI refers to the problem-solving behavior which emerges from the interaction between the individuals of such a system. Algorithmic models of such behaviors have been shown to adapt well to changing environments and are flexible and robust. The last decade has witnessed rapid growth of research interest in various SI paradigms, one of which is particle swarm optimization (PSO) [4.7, 4.8]. The PSO is a global optimization algorithm for dealing with problems in which the best solution can be represented as a point or surface in an n-dimensional space. Its main advantage over other optimization strategies such as simulated annealing [4.17] is that the large number of members that make up the particle swarm makes the technique resilient to the problem of local minima. The PSO algorithm is easy to implement and has been shown to perform well for many optimization problems.
In [4.6], Johnson presented a tutorial on adaptive IIR filtering techniques highlighting the common basis between filtering and system identification. Subsequent publications on adaptive IIR filtering [4.11-4.14] dealt with different algorithms, error formulations and realizations. These filters mostly use the direct-form realization. The direct form is a convenient and simple structure but cannot ensure stability of the adaptive filter. To overcome this problem, the parallel [4.15] and lattice [4.16] forms have been proposed. These structures offer simple stability monitoring with less complexity than the direct form. A review paper on adaptive IIR filtering algorithms for system identification using a unifying framework has also been reported [4.26]. For achieving efficient system identification, neural networks have also been introduced in the literature [4.55-4.56].
The drawback of all adaptive IIR filter structures is that they produce error surfaces that inherently tend to be multimodal. Therefore local optimization techniques, such as the gradient descent algorithms, are not suitable because they are likely to be trapped in a local minimum solution. An alternative to gradient based techniques is a structured stochastic search of the error surface. These types of global searches are structure independent because a gradient is not calculated and the filter structure does not directly influence the parameter updates. Due to this feature, these types of algorithms are potentially capable of globally optimizing IIR filter structures. Several structured stochastic search approaches have appeared in the literature, using simulated annealing [4.17], the genetic algorithm (GA) [4.18] and PSO [4.7, 4.8]. An adaptive genetic algorithm for determining the optimum filter coefficients in a recursive adaptive filter is presented in [4.19]. The GA has also been applied to optimize the parameters of adaptive IIR filters [4.20]. In another publication [4.21] a GA based approach for the system identification and control of both continuous and discrete time systems has been reported. For efficient adaptive IIR filtering, a fast genetic search algorithm has been introduced into the LMS algorithm [4.22]. In another paper [4.23] the nonlinear parameters of IIR filters have been estimated using the GA. A hierarchical GA based algorithm is proposed in [4.24] for the design and optimization of the IIR filter structure. In a recent article [4.25] a new learning algorithm is introduced which embeds the genetic search into the gradient descent algorithm to accelerate the learning process and to provide global search capability. It is reported that this method outperforms the LMS algorithm and the gradient lattice algorithm in terms of convergence speed and ability to locate the global optimal solution.
The basics of PSO are dealt with in Section 2.5.2. The research papers on PSO which have appeared in the literature have focused on two fronts: improving the performance of PSO [4.27-4.37] by incorporating a number of modifications in the algorithm, and applying these PSO algorithms in diverse fields such as minimization of functions of many variables [4.38], image segmentation [4.39, 4.40], design of antennas [4.41], stock market prediction [4.42], design of tree structures [4.43], learning to play games [4.44], multimodal biomedical image registration [4.45] and design of adaptive IIR filters [4.46]. Various variants of PSO are PSO with decreasing inertia weights [4.27], PSO with fuzzy adaptive inertia weights [4.28], self-organizing hierarchical PSO with time varying accelerating coefficients [4.29], PSO with linearly decreasing Vmax [4.30], PSO with
constriction factor [4.31], a local version of PSO with constriction factor [4.32], dynamic neighborhood PSO [4.33], unified PSO [4.34], fully informed PSO [4.35], fitness-distance-ratio based PSO [4.36] and cooperative PSO [4.37]. Some researchers have proposed hybridization, combining PSO with other search techniques to improve its performance. Evolutionary operators such as selection, crossover and mutation have also been suggested in PSO to retain the best particles [4.47] and to improve the ability to escape from local minima [4.48]. Two recently reported works in this direction are PSO with crossover [4.49] and PSO with mutation [4.50]. In order to maintain diversity and to escape from local optima, relocation of particles when they are too close to each other [4.51] or the use of collision-avoiding mechanisms [4.52] have been proposed in the literature. Negative entropy has been employed in PSO [4.53] to discourage premature convergence. Deflection, stretching and repulsion mechanisms have been introduced in [4.54] to find as many minima as possible by preventing particles from entering previously discovered minimal regions.
4.3 Basics of modified PSO and CLPSO algorithms
The introduction of a constriction factor K into the velocity equation (2.48) of conventional PSO has improved the convergence of the PSO algorithm [4.31]. Accordingly the velocity equation is modified as
V_i(d) = K * [V_i(d) + c₁ * rand1_i(d) * (P_i(d) − X_i(d)) + c₂ * rand2_i(d) * (P_g(d) − X_i(d))]        (4.1)
where
K = 2 / |2 − φ − √(φ² − 4φ)|        (4.2)
and φ = c₁ + c₂, φ > 4        (4.3)
Usually φ is set to 4.1, giving K = 0.729. As a result each of the (P_i − X_i) and (P_g − X_i) terms is effectively multiplied by 0.729 * 2.05 = 1.49445.
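The arithmetic of (4.1)-(4.3) is transcribed in the short sketch below; the function names are illustrative, and the bracketed (fully constricted) form of the update is assumed.

import numpy as np

def constriction_factor(c1=2.05, c2=2.05):
    """Constriction factor K of (4.2)-(4.3); phi = c1 + c2 must exceed 4."""
    phi = c1 + c2
    return 2.0 / abs(2.0 - phi - np.sqrt(phi ** 2 - 4.0 * phi))

def pso_velocity(v, x, pbest, gbest, c1=2.05, c2=2.05, rng=np.random):
    """Constricted velocity update of (4.1) for one particle (vector form)."""
    K = constriction_factor(c1, c2)
    return K * (v + c1 * rng.rand(*v.shape) * (pbest - x)
                  + c2 * rng.rand(*v.shape) * (gbest - x))

With phi = 4.1 this gives K of about 0.729, so K * 2.05 is about 1.494, matching the values quoted above.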
In the original PSO, each particle learns from its pbest and gbest simultaneously, so the social learning aspect is restricted to the gbest alone. This appears to be a somewhat arbitrary decision. In addition, all particles in the swarm learn from the gbest even if the current gbest is far from the global optimum. Under such situations, the particles are easily attracted to and trapped in an inferior local optimum. Moreover, as the fitness value of a particle is decided by all dimensions, a particle which has discovered the value corresponding to the global optimum in one dimension may still have a low fitness value because of poor solutions in the other dimensions. In order to prevent this, Liang et al. have proposed [4.9] a novel learning strategy which differs mainly in three aspects from the PSO reported in [4.7], [4.8]:
(i) Instead of using a particle's own pbest and the gbest as the exemplars, all particles' pbests can be used as exemplars to guide a particle's flying direction.
(ii) Each dimension of a particle may learn from the corresponding dimension of a different particle's pbest.
(iii) Instead of learning from two exemplars (gbest and pbest) at the same time in every generation as in the original PSO, each dimension of a particle learns from just one exemplar for a few generations.
In the CLPSO algorithm the velocity update equation [4.9] is given by
V_i(d) = w * V_i(d) + c * rand_i(d) * (pbest_{f_i(d)}(d) − X_i(d))        (4.4)
where f_i(d) indicates which particle's pbest the i-th particle should follow in dimension d. The term pbest_{f_i(d)}(d) represents the corresponding dimension of a particle's pbest, possibly including its own, and the decision depends on the learning probability Pc. For each dimension of particle i, a random number is generated. If this number is larger than Pc, the corresponding dimension learns from its own pbest; otherwise it learns from another particle's pbest. When the particle ceases improving for a certain number of generations, called the refreshing gap m, f_i is reassigned for the particle; in the latter case a tournament selection procedure is employed. The above operations increase the diversity of the swarm, which results in enhanced performance when solving complex multimodal problems. It is reported that assigning different values of Pc to different particles yields a solution different from that obtained by taking the same Pc for all particles, since the particles then have different levels of exploration and exploitation ability in the population. The empirical relation for the i-th particle is given by
Pc_i = 0.05 + 0.45 * (exp(10(i − 1)/(Ps − 1)) − 1) / (exp(10) − 1)        (4.5)
where Ps is the size of the population. In many practical problems bounds are imposed on the ranges of the variables. Two ways of handling the search range are suggested:
(i) restricting the search to [X_min, X_max], and
(ii) clamping each dimension as X_i(d) = min(X_max(d), max(X_min(d), X_i(d))).
The refreshing gap parameter m also influences the results as it affects the convergence velocity. The details of the CLPSO algorithm are available in [4.9].
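The learning probabilities of (4.5) can be computed as in the minimal sketch below, assuming NumPy; the function name is an assumption.

import numpy as np

def learning_probabilities(ps):
    """Pc_i of (4.5) for particles i = 1..ps, returned as an array."""
    i = np.arange(1, ps + 1)
    return 0.05 + 0.45 * (np.exp(10.0 * (i - 1) / (ps - 1)) - 1.0) / (np.exp(10.0) - 1.0)

For example, learning_probabilities(40) ranges from 0.05 for the first particle to 0.5 for the last, giving the swarm a spread of exploration and exploitation levels.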
4.4 Adaptive system identification of IIR systems
The block diagram of adaptive system identification of an IIR plant is shown in Fig. 4.1. The model is an output-error adaptive IIR filter and is characterized by the recursive difference equation given in (4.6)
ŷ(n) = Σ_{m=1}^{N−1} â_m(n) ŷ(n−m) + Σ_{m=0}^{M−1} b̂_m(n) x(n−m)        (4.6)
where x(n) and ŷ(n) represent the input and the estimated output at time n respectively. The present estimated output ŷ(n) depends on the past estimated output samples ŷ(n−m), m = 1, 2, ..., N−1. {â_m(n), b̂_m(n)} represent the adjustable coefficients which, at the end of the adaptation process, give the estimated pole-zero parameters of the IIR plant.
[Fig. 4.1: the input x(n) drives both the IIR plant, whose output y(n) plus measurement noise v(n) forms d(n), and the adaptive model, whose output ŷ(n) is subtracted from d(n) to form the error e(n) driving the learning algorithm.]
Fig. 4.1 Adaptive identification of IIR systems using output-error adaptive IIR filter as the model
The output d(n) of the plant is represented by (4.7)
d(n) = Σ_{m=1}^{N−1} a_m(n) y(n−m) + Σ_{m=0}^{M−1} b_m(n) x(n−m) + v(n)        (4.7)
where y(n) and v(n) denote the plant output and the measurement noise respectively. This noise is uncorrelated with the input x(n). The pole and zero parameters of the IIR plant are a_m(n) and b_m(n) respectively. In the system identification configuration of Fig. 4.1, the model is represented by an output-error adaptive IIR filter of the form
ŷ(n) = [ B̂(n,z) / (1 − Â(n,z)) ] x(n)        (4.8)
where the feedback and feed-forward transfer functions are given by
Â(n,z) = Σ_{m=1}^{N−1} â_m(n) z^{−m} and B̂(n,z) = Σ_{m=0}^{M−1} b̂_m(n) z^{−m}        (4.9)
respectively.
Unlike the equation-error formulation, the pole polynomial 1 − Â(n,z) is adapted directly in an IIR filter form. Such a formulation is a natural generalization of the adaptive FIR filter, where Â(n,z) = 0. Equation (4.6) may also be written in inner product form
ŷ(n) = θ̂^T(n) φ(n)        (4.10)
where the estimated coefficient vector θ̂(n) and the signal vector φ(n) are given by
θ̂(n) = [â₁(n), ..., â_{N−1}(n), b̂₀(n), ..., b̂_{M−1}(n)]^T
φ(n) = [ŷ(n−1), ..., ŷ(n−N+1), x(n), ..., x(n−M+1)]^T        (4.11)
The output ŷ(n) is a nonlinear function of θ̂(n) because the delayed output signals ŷ(n−k) of φ(n) depend on previous coefficient values. The output error e(n) = d(n) − ŷ(n) is generated by subtracting the estimated output in (4.6) from d(n). Since e(n) is a nonlinear function of θ̂, the mean square output error is not a quadratic function and can therefore have multiple minima. Gradient based adaptive algorithms like the LMS could converge to one of the local solutions, yielding inaccurate estimates of the pole-zero parameters. The CLPSO algorithm has therefore been employed in this chapter to update the parameters of the model so that the parameter estimates are optimal.
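As a concrete illustration of (4.6) and (4.7), a minimal sketch follows. The function names and the noise level are assumptions for demonstration; the recursion simply steps the difference equation sample by sample.

import numpy as np

def iir_output(a_hat, b_hat, x):
    """Recursive difference equation (4.6): a_hat are the feedback coefficients,
    b_hat the feed-forward coefficients."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        acc = sum(a_hat[m - 1] * y[n - m] for m in range(1, len(a_hat) + 1) if n - m >= 0)
        acc += sum(b_hat[m] * x[n - m] for m in range(len(b_hat)) if n - m >= 0)
        y[n] = acc
    return y

def desired_signal(a, b, x, noise_std=0.03, rng=np.random):
    """Desired signal of (4.7): plant output plus measurement noise v(n);
    the noise level shown is illustrative."""
    return iir_output(a, b, x) + noise_std * rng.randn(len(x))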
4.5 CLPSO based identification of IIR systems
The identification of an IIR system is formulated as an optimization problem. The steps involved in the CLPSO based algorithm for identification are the following:
Step-1 Input samples of a suitable window length (L) are selected from a zero mean uniform random sequence. They are represented by x(l), where l = 0, ..., L−1, and each sample lies between −0.5 and +0.5.
Step-2 D = (M + N − 1) random numbers are generated which represent the initial position X of a particle. The first (N − 1) random numbers denote the initial feedback parameters and the remaining M random numbers represent the feed-forward parameters of the model of the plant. Another set of (M + N − 1) random numbers is also generated to represent the corresponding velocity V of the particle. Here D denotes the dimension of the particle.
Step-3 The procedure in Step-2 is repeated for a specified population size Ps . The complete
set of population constitutes a swarm.
Step-4 Input samples of window size (L) are applied sample by sample to the plant of known coefficients, and noise is then added to generate the desired signal d(l).
Step-5 The same input is also applied to the model sequentially to get the output signal
yˆ i (l ) which is the estimated output of the model corresponding to the l th input sample
and for the i th particle.
Step-6 The mean square error (MSE) of the i-th particle, which represents the cost function, is computed using
J_i = (1/L) Σ_{l=0}^{L−1} (d(l) − ŷ_i(l))²        (4.12)
Step-7 In the same way the cost functions of all the other particles are also evaluated in every generation. The particle giving the minimum cost function provides the best possible representation of the unknown plant to be modeled.
Step-8 The pbest represents the best positions (i. e. set of model parameters that gives the
minimum cost function value) for a particular particle. It is initialized as the initial position
of the particle. The gbest represents the best position in the swarm. It is initialized as the
position of the particle which gives the minimum cost function value in the swarm.
Step-9 k is the generation counter, initialized to 1, and the first particle is chosen, i.e. i = 1, where i is the particle number. A linearly decreasing inertia weight within the range 0.9 to 0.4 is used for updating the velocity and position of the particles in each generation. The inertia weight at the k-th generation is given by
w_k = w₀ − (w₀ − w₁) * k / itr        (4.13)
where k = generation counter (from 1 to itr), itr = number of generations, w₀ = 0.9 and w₁ = 0.4.
Step-10 The refreshing gap parameter m is adjusted depending on the function or problem
to be optimized. The flag i represents the number of generations the i th particle has not
improved its own pbest . The flag is initialized to 0 for all particles.
Step-11 Another parameter Pc_i, called the learning probability of a particle, is initialized according to the empirical formula given in (4.5).
Step-12 If flag_i < m then go to Step-20.
Step-13 Starting with d = 1, a random number rand is generated for every dimension of the particle.
Step-14 If rand < Pc_i then go to Step-17, else go to Step-15.
Step-15 Set f i (d ) = i , which gives the number of the particle whose pbest will be used for
the present particle.
Step-16 Now if d < D then increment d and go to Step-13 or else go to Step-19.
Step-17 Two particles are selected randomly using
f1_i(d) = ⌈rand1_i(d) * Ps⌉
f2_i(d) = ⌈rand2_i(d) * Ps⌉        (4.14)
where Ps = population size and ⌈ ⌉ is the ceiling operator.
Step-18 The cost functions of the two randomly selected particles f1_i(d) and f2_i(d) are then computed using their respective pbests. If J(pbest_{f1_i(d)}) < J(pbest_{f2_i(d)}) then f_i(d) = f1_i(d), else f_i(d) = f2_i(d). Then go to Step-16.
Step-19 Reinitialize flag i to zero.
Step-20 d is set to 1 representing the dimension of the i th particle.
Step-21 New velocities and positions are calculated using
V_i(d) = w_k * V_i(d) + c * rand_i(d) * (pbest_{f_i(d)}(d) − X_i(d))        (4.15)
V_i(d) = min(V_max(d), max(V_min(d), V_i(d)))
X_i(d) = X_i(d) + V_i(d)
where c = acceleration coefficient.
Step-22 If d < D then increment d and go to Step-21 else go to Step-23.
Step-23 If X i (d ) ∈ [ X min , X max ] then go to Step-24 else go to Step-27.
Step-24 Calculate the cost function for new position of particle.
Step-25 If J_X(i) < J_pbest(i) for the i-th particle then go to Step-26, else increment flag_i and go to Step-27.
Step-26 Set pbest(i) = X(i) and reinitialize flag_i = 0. If J_X(i) < J_gbest then gbest = X(i).
Step-27 If i < Ps then increment i and go to Step-12 or else go to Step-28.
Step-28 If k < itr , then increment k and go to Step-9.
Step-29 The elements corresponding to gbest particle provide the estimated coefficients of
the IIR system.
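Pulling Steps 1-29 together, a compressed Python sketch of the loop is given below. It is an illustration of the flow under assumed helper names, not the exact experimental code: model_output(theta, x) is a user-supplied function evaluating the candidate model of (4.6) (for instance the iir_output sketch from Section 4.4 with theta split into feedback and feed-forward parts), and it folds the inertia weight (4.13), learning probabilities (4.5), tournament selection (4.14) and updates (4.15) into one routine.

import numpy as np

def clpso_identify(d, x, model_output, D, ps=40, iters=500, m_gap=5, c=1.042,
                   xmin=-1.3, xmax=1.3, w0=0.9, w1=0.4, rng=np.random):
    """Compressed sketch of the CLPSO identification loop (Steps 1-29)."""
    X = rng.uniform(xmin, xmax, (ps, D))                 # Steps 2-3: positions
    V = rng.uniform(-xmax, xmax, (ps, D))                # and velocities
    cost = lambda th: np.mean((d - model_output(th, x)) ** 2)   # MSE of (4.12)
    pbest = X.copy()
    pcost = np.array([cost(th) for th in X])             # Steps 6-8
    g = int(np.argmin(pcost))
    gbest, gcost = pbest[g].copy(), pcost[g]
    Pc = 0.05 + 0.45 * (np.exp(10 * np.arange(ps) / (ps - 1)) - 1) / (np.exp(10) - 1)  # (4.5)
    f = np.tile(np.arange(ps)[:, None], (1, D))          # exemplar index per dimension
    flag = np.zeros(ps, dtype=int)                       # Step-10
    for k in range(1, iters + 1):
        w = w0 - (w0 - w1) * k / iters                   # inertia weight (4.13)
        for i in range(ps):
            if flag[i] >= m_gap:                         # Steps 12-19: reassign exemplars
                for dim in range(D):
                    if rng.rand() < Pc[i]:               # tournament of two random pbests
                        r1, r2 = rng.randint(ps), rng.randint(ps)
                        f[i, dim] = r1 if pcost[r1] < pcost[r2] else r2
                    else:
                        f[i, dim] = i                    # learn from own pbest
                flag[i] = 0
            exemplar = pbest[f[i], np.arange(D)]
            V[i] = w * V[i] + c * rng.rand(D) * (exemplar - X[i])   # (4.15)
            V[i] = np.clip(V[i], -xmax, xmax)
            X[i] = X[i] + V[i]
            if np.all((X[i] >= xmin) & (X[i] <= xmax)):  # Step-23: range check
                ci = cost(X[i])
                if ci < pcost[i]:                        # Steps 25-26
                    pbest[i], pcost[i], flag[i] = X[i].copy(), ci, 0
                    if ci < gcost:
                        gbest, gcost = X[i].copy(), ci
                else:
                    flag[i] += 1
            else:
                flag[i] += 1
    return gbest, gcost                                  # Step-29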
4.6 Simulation study
By conducting simulation experiments on the identification of four benchmark IIR systems (2nd order to 5th order), the performance of the proposed CLPSO based method is compared with those obtained from three other standard methods: the recursive LMS (RLMS), GA and PSO. The new algorithm is used in adaptive IIR identification to improve the performance of the existing algorithms, especially when the error surface is multimodal. The block diagram of Fig. 4.1 is simulated using the output-error formulation. The plant (the unknown system) is a fixed IIR filter with transfer function H(z), while the adaptive system is an adaptive IIR filter with transfer function Ĥ(z) whose coefficients are updated by the different learning algorithms. The transfer function of the plant is represented by
H(z) = ( Σ_{j=0}^{M} b_j z^{−j} ) / ( 1 − Σ_{i=1}^{L} a_i z^{−i} )        (4.16)
In the present simulation both full-order and reduced-order modeling are considered. Local minima phenomena are observed in reduced-order modeling, while full-order modeling is used to demonstrate the fast convergence and global search ability of the proposed algorithm. The CLPSO based algorithm discussed in the previous section is used to compute the best estimate of the pole-zero parameters. The input is a zero mean white random signal with uniform distribution. The additive noise v(n) is a white random process uncorrelated with x(n) and with 30 dB SNR. The initial common parameters used for CLPSO, PSO, GA and LMS are listed below:
CLPSO : D = no. of weights to be optimized, Ps = population size = 40 to 150, X_min = lower bound of weights = −1.3, X_max = upper bound of weights = 1.3, V_max = maximum velocity = X_max, w₀ = maximum inertia weight = 0.9, w₁ = minimum inertia weight = 0.4, c = acceleration factor = 1.042, X = positions of particles, V = velocities of the particles, m = refreshing gap = 5 and L = 25.
PSO : D = no. of weights to be optimized, Ps = population size = 120 to 400, X = positions of particles, V = velocities of the particles, K = constriction factor = 0.729, c₁ = c₂ = acceleration coefficients = 1.49445 and L = 500.
GA : D = no. of weights to be optimized, Ps = population size = 80 to 120, N = no. of bits = 20 to 40, W_max = max. of weights, W_min = min. of weights, Pc = probability of crossover = 0.9, Pm = probability of mutation = 0.01 and L = 1000.
LMS : D = no. of weights to be optimized, μ = convergence coefficient = 0.1 and L = 10,000.
Example-1 The plant is a second order IIR filter [4.46] with M = 1, L = 2. The difference equations of the plant are
y(n) = Σ_{i=1}^{2} a_i y(n−i) + Σ_{j=0}^{1} b_j x(n−j)
d(n) = y(n) + v(n)
{a_i} = {0.3, −0.4}, {b_j} = {1.25, −0.25}
Full-order model: ŷ(n) = Σ_{i=1}^{2} â_i y(n−i) + Σ_{j=0}^{1} b̂_j x(n−j)
Reduced-order model: ŷ(n) = Σ_{i=1}^{1} â_i y(n−i) + Σ_{j=0}^{0} b̂_j x(n−j)
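To make the full- and reduced-order structures concrete, a brief sketch of the corresponding cost evaluation is given below. It reuses the illustrative iir_output helper from Section 4.4 and the parameter packing is an assumption; note that it evaluates the candidate recursively in output-error form for simplicity, whereas the model equations above regress on the measured plant output y(n−i).

import numpy as np

# Plant of Example-1: {a_i} = {0.3, -0.4}, {b_j} = {1.25, -0.25}
a_true, b_true = [0.3, -0.4], [1.25, -0.25]

def example1_cost(theta, x, d, reduced=False):
    """MSE of a candidate parameter vector theta under the full or reduced model."""
    if reduced:
        a_hat, b_hat = theta[:1], theta[1:2]   # 1st order model: one pole, one zero term
    else:
        a_hat, b_hat = theta[:2], theta[2:4]   # full 2nd order structure
    y_hat = iir_output(a_hat, b_hat, x)        # iir_output as sketched in Section 4.4
    return np.mean((d - y_hat) ** 2)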
The convergence characteristics of GA, PSO and CLPSO based training are shown in Fig. 4.2(a) for the full order and in Fig. 4.2(b) for the reduced order. Fig. 4.2(a) reveals that GA and PSO cannot reach the minimum MSE even after 500 generations, whereas the new algorithm converges to the optimal solution with an MSE level of 10⁻⁵ in only 200 generations. Similarly, the convergence plot of Fig. 4.2(b) for the reduced-order model exhibits superior MSE performance of CLPSO compared to the other two. The results clearly show that for the reduced order the multimodal situation does not affect the convergence performance of CLPSO, but it does so for the other two methods.
[Fig. 4.2(a): MSE (log scale) vs. number of generations (0-500) for PSO, CLPSO and GA]
Fig. 4.2(a) Comparison of convergence characteristics of different methods for an exact 2nd order IIR model
[Fig. 4.2(b): MSE (log scale) vs. number of generations (0-200) for PSO, CLPSO and GA]
Fig. 4.2(b) Comparison of convergence characteristics of different methods for a reduced order (1st order) IIR model
Example-2 The adaptive system is a third-order IIR filter [4.25] used to model a plant of the same order (M = 2, L = 3) with {a_i} = {0.6, −0.25, 0.2}, {b_j} = {−0.2, −0.4, 0.5}. The convergence plots of the full-order and reduced-order (M = 1, L = 2) models are shown in Fig. 4.3(a) and (b) respectively. The plots indicate that the convergence performance of the new method is superior to the GA and PSO based methods under the multimodal situation (Fig. 4.3(b)) as well as when full-order models are used. The CLPSO method clearly exhibits the best performance.
[Fig. 4.3(a): MSE (log scale) vs. number of generations (0-700) for PSO, CLPSO and GA]
Fig. 4.3(a) Comparison of convergence characteristics of different methods for an exact 3rd order IIR model
[Fig. 4.3(b): MSE (log scale) vs. number of generations (0-500) for PSO, CLPSO and GA]
Fig. 4.3(b) Comparison of convergence characteristics of different methods for a reduced order (2nd order) IIR model
Example-3 In this experiment the plant is a fourth order IIR system [4.15] with {a_i} = {−0.04, −0.2775, 0.2101, −0.14}, {b_j} = {1, −0.9, 0.81, −0.729}. The adaptive IIR system has M = 3, L = 4 for the full-order model and M = 2, L = 3 for the reduced-order model. The convergence characteristics for the two models are shown in Fig. 4.4(a) and (b). It is observed that the standard GA and PSO exhibit faster convergence initially, but they fail to improve further because the chromosomes and the swarm quickly become stagnant and hence lead to suboptimal solutions. However, the new algorithm does not stagnate, allowing it to reach the minimum noise floor level. The convergence plot of Fig. 4.4(b) for the reduced-order model also reveals that both the GA and PSO based algorithms get trapped in local solutions while the new algorithm clearly exhibits superior performance.
[Fig. 4.4(a): MSE (log scale) vs. number of generations (0-800) for GA, PSO and CLPSO]
Fig. 4.4(a) Comparison of convergence characteristics of different methods for an exact 4th order IIR model
[Fig. 4.4(b): MSE (log scale) vs. number of generations (0-600) for CLPSO, PSO and GA]
Fig. 4.4(b) Comparison of convergence characteristics of different methods for a reduced order (3rd order) IIR model
Example-4 In this example the plant is a fifth order low pass Butterworth IIR filter taken from [4.46]. The plant is represented by
y(n) = Σ_{i=1}^{5} a_i y(n−i) + Σ_{j=0}^{5} b_j x(n−j)
d(n) = y(n) + v(n)
{a_i} = {−0.9853, −0.9738, −0.3864, −0.1112, −0.0113}
{b_j} = {0.1084, 0.5419, 1.0837, 1.0837, 0.5419, 0.1084}
The full-order IIR model is given by
ŷ(n) = Σ_{i=1}^{5} â_i ŷ(n−i) + Σ_{j=0}^{5} b̂_j x(n−j)
The corresponding reduced-order model is also simulated. The convergence behaviors of the exact and reduced-order models are shown in Fig. 4.5(a) and (b) respectively. It is observed that both the full and reduced order GA and PSO based models fail to escape from local minima, whereas the proposed model is not affected and exhibits significant improvement in convergence performance in both the exact and reduced structures.
[Fig. 4.5(a): MSE (log scale) vs. number of generations (0-700) for PSO, CLPSO and GA]
Fig. 4.5(a) Comparison of convergence characteristics of different methods for an exact 5th order IIR model
[Fig. 4.5(b): MSE (log scale) vs. number of generations (0-500) for PSO, CLPSO and GA]
Fig. 4.5(b) Comparison of convergence characteristics of different methods for a reduced order (4th order) IIR model
Table 4.1 summarizes the comparative performance of the GA, PSO and CLPSO based methods in terms of the minimum MSE after convergence, the execution time in seconds, and the product of population size and number of input samples (Ps * L) used during training. All three parameters are indicators of the performance of a learning algorithm. In all the examples these parameters are observed to be substantially lower for the new algorithm. For example, the GA and PSO based approaches require 40 to 80 times and 20 to 150 times more fitness function computations respectively than the new method. The estimated feed-forward and feedback coefficients of the IIR systems obtained from the RLMS, GA, PSO and CLPSO based methods are also listed in Table 4.2 along with the corresponding plant parameters. In all the cases studied it is observed that the estimated coefficients obtained from the new method are in closer agreement with the true coefficients of the plant than those obtained by the other three methods.
The CLPSO is an improved version of the conventional PSO algorithm and is inherently a population based algorithm; as a result its initial rate of convergence is poor. In the IIR identification problem, however, the prime objective is the accuracy of the final solution rather than faster convergence. Being fully aware of this characteristic, the CLPSO was selected for training the IIR model to provide the best identification of IIR plants under multimodal situations. That this motivation was well founded is evident from the simulation results provided in the various examples.
Table 4.1
Comparison of performance between GA, PSO and CLPSO based training of weights

                        MSE (dB) at -30dB noise    Execution time (s)         No. of times (Ps * L) used
                        GA     PSO    CLPSO        GA       PSO     CLPSO     GA     PSO    CLPSO
2nd order IIR system    -21    -41    -50          18.03    2.47    0.27      53     40     1
3rd order IIR system    -30    -39    -50          40.28    17.42   0.68      80     155    1
4th order IIR system    -21    -41    -49          146.7    48.24   1.70      48     80     1
5th order IIR system    -31    -21    -48          62.64    7.06    1.03      40     20     1
Table 4.2
Comparison between true and estimated pole-zero parameters obtained from RLMS, GA, PSO and CLPSO

Actual            Estimated parameters at -30dB NSR
parameters        GA          PSO         CLPSO       RLMS
2nd order IIR system
 1.25              0.9787      1.2513      1.2514      1.2510
-0.25              0.0285     -0.2514     -0.2423     -0.2765
 0.3               0.1428      0.2996      0.2959      0.3177
-0.4              -0.4562     -0.4013     -0.4042     -0.4030
3rd order IIR system
-0.2              -0.1406     -0.1996     -0.1951     -0.2002
-0.4              -0.5012     -0.4110     -0.4051     -0.4005
 0.5               0.3222      0.5016      0.5000      0.4903
 0.6               0.2352      0.5569      0.5957      0.5897
-0.25             -0.2110     -0.2302     -0.2362     -0.2518
 0.2               0.0109      0.1761      0.1966      0.1940
4th order IIR system
 1                 0.9801      0.9948      1.0043      1.007
-0.9              -0.7998     -0.8964     -0.8887     -0.8794
 0.81              0.7939      0.8093      0.8032      0.7557
-0.729            -0.5838     -0.7290     -0.7277     -0.7301
-0.04             -0.2025     -0.0388     -0.0492     -0.0582
-0.2775           -0.4145     -0.2779     -0.2766     -0.2520
 0.2101            0.0755      0.2136      0.2122      0.2579
-0.14             -0.1685     -0.1385     -0.1354     -0.1156
4.7 Conclusion
This chapter has employed the recently developed CLPSO tool to identify the feed-forward and feedback coefficients of IIR systems. The new identification algorithm is outlined in detail and has been applied to a few benchmark systems. The simulation study reveals that the proposed method outperforms the existing standard RLMS, GA and PSO based methods in terms of minimum MSE after convergence, execution time, and the product of population size and number of input samples used in training. Further, the new method exhibits significant improvement in convergence behavior when reduced-order models are used, compared to the GA and PSO methods. This clearly indicates that the new method can converge to the optimal solution even in multimodal environments in which local minima problems are encountered. Therefore the proposed method provides the fastest convergence, the least training time and the best estimates of the feed-forward and feedback coefficients among the four methods.
References
[4.1] John J. Shynk, “Adaptive IIR filtering”, IEEE ASSP Magazine, April 1989, pp. 4-21.
[4.2] B. Widrow and S. D. Stearns, Adaptive Signal Processing, Englewood Cliffs, NJ :
Prentice-Hall, 1985.
[4.3] J. M. Mendel, Discrete Techniques of Parameter Estimation : The Equation Error
Formulation, Marcel Dekker, New York, 1973.
[4.4] R. P. Gooch, “Adaptive pole-zero filtering : the equation error approach”, Ph. D. diss.
Stanford University, 1983.
[4.5] I. D. Landau, Adaptive Control : The model reference approach, Marcel Dekker, New
York, 1979.
[4.6] C. R. Johnson, Jr., “Adaptive IIR filtering : current results and open issues”, IEEE
Trans. on Information Theory, vol. IT-30, no. 2, pp. 237-250, Mar. 1984.
[4.7] R. C. Eberhart and J. Kennedy, “A new optimizer using particle swarm theory”, in Proc. of 6th Int. Symp. on Micro Machine and Human Science, Nagoya, Japan, 1995, pp. 39-43.
[4.8] J. Kennedy and R. C. Eberhart, “Particle swarm optimization”, in Proc. of IEEE Int.
Conf. Neural Networks, 1995, pp. 1942-1948.
[4.9] J. J. Liang, A. K. Qin, Ponnuthurai Nagaratnam Suganthan and S. Baskar,
“Comprehensive learning particle swarm optimizer for global optimization of multimodal
functions”, IEEE Trans. on Evolutionary Computation, vol. 10, no. 3, pp. 281-295, June
2006.
[4.10] S. D. Stearns, “Error surfaces of recursive adaptive filters”, IEEE Trans. Circuits
Systems, vol. CAS-28, no. 6, pp. 603-606, June 1981.
[4.11] P. L. Feintuch, “An adaptive recursive LMS filter”, Proc. IEEE, vol. 64, pp. 1622-1624, Nov. 1976.
[4.12] M. G. Larimore, J. R. Treichler and C. R. Johnson, Jr., “SHARF : An algorithm for
adapting IIR digital filters”, IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-28,
pp. 428-440, Aug. 1980.
[4.13] B. Friedlander, “System identification techniques for adaptive signal processing”,
IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-30, pp. 240-246, April 1982.
[4.14] H. Fan and W. K. Jenkins, “A new adaptive IIR filter”, IEEE Trans. Circuits System,
vol. CAS-33, pp. 939-947, October 1986.
[4.15] John Shynk, “Adaptive IIR filtering using parallel form realization”, IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, no. 4, pp. 519-533, April 1989.
[4.16] D. Parikh, N. Ahmed and S. D. Stearns, “An adaptive lattice algorithm for recursive filters”, IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-28, pp. 110-111, Feb. 1980.
[4.17] R. Nambiar and P. Mars, “Genetic and Annealing Approaches to adaptive digital
filtering”, Proc. 26th Asilomar Conf. on Signals, Systems and Computers, vol. 2, Oct. 1992,
pp. 871-875.
[4.18] D. E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Reading, MA: Addison-Wesley, 1989.
[4.19] D. M. Etter et al., “Recursive adaptive filter design using an adaptive genetic algorithm”, Proc. of IEEE Conf. on ASSP, 1982, pp. 635-638.
[4.20] R. Nambiar, C. K. K. Tang and P. Mars, “Genetic and learning automata algorithms for adaptive digital filters”, Proc. of IEEE Conf. on ASSP, vol. 4, 1992, pp. 41-44.
[4.21] Kristinn Kristinsson and Guy A. Dumont, “System identification and control using
genetic algorithms”, IEEE Trans. on Systems, Man and Cybernetics, vol. 22, no. 5, pp.
1033-1046, September 1992.
[4.22] S. C. Ng, C. Y. Chung, S. H. Leung and Andrew Luk, “Fast convergent genetic
search for adaptive IIR filtering”, IEEE Int. Conf. on Acoustics, Speech and Signal
Processing, vol. 3, April 1994, pp. 105-108.
[4.23] Leehter Yao and William A. Sethares, “Nonlinear parameter estimation via the
Genetic algorithm”, IEEE Trans. on Signal Processing, vol. 42, no. 4, pp. 927-935, April
1994.
[4.24] Kit-sang Tang, Kim-fung Man, Sam Kwong and Zhi-feng Liu, “Design and
optimization of IIR filter structure using Hierarchical Genetic Algorithms”, IEEE Trans.
on Industrial Electronics, vol. 45, no. 3, pp. 481-487, June 1998.
[4.25] S. C. Ng, S. H. Leung, C. Y. Chung, A. Luk and W. H. Lau, “The genetic search approach”, IEEE Signal Processing Magazine, Nov. 1996, pp. 38-46.
[4.26] Sergio L. Netto, Paulo S. R. Diniz and Panajotis Agathoklis, “Adaptive IIR filtering algorithms for system identification: A general framework”, IEEE Trans. on Edu., vol. 38, no. 1, pp. 54-66, Feb. 1995.
[4.27] Y. Shi and R. C. Eberhart, “A modified particle swarm optimizer”, in Proc. IEEE
Congress on Evolutionary Computation, 1998, pp. 69-73.
[4.28] Y. Shi and R. C. Eberhart, “Particle swarm optimization with fuzzy adaptive inertia weight”, in Proc. Workshop Particle Swarm Optimization, Indianapolis, IN, 2001, pp. 101-106.
[4.29] A. Ratnaweera, S. Halgamuge and H. Watson, “Self-organizing hierarchical particle
swarm optimizer with time varying accelerating coefficients”, IEEE Trans. on Evolutionary
Computation, vol. 8, pp. 240-255, June 2004.
[4.30] H. Y. Fan and Y. Shi, “Study on Vmax of particle swarm optimization”, in Proc.
Workshop Particle Swarm Optimization, Indianapolis, IN, 2001.
[4.31] M. Clerc and J. Kennedy, “The particle swarm-explosion, stability and convergence
in a multidimensional complex space”, IEEE Trans. Evolutionary Computation, vol. 6, no.
1, pp. 58-73, Feb. 2002.
[4.32] J. Kennedy and R. Mendes, “Population structure and particle swarm performance”, in Proc. IEEE Congress on Evolutionary Computation, Honolulu, HI, 2002, pp. 1671-1676.
[4.33] X. Hu and R. C. Eberhart, “Multiobjective optimization using dynamic
neighborhood particle swarm optimization”, in Proc. Congr. Evol. Comput., Honolulu, HI,
2002, pp. 1677-1681.
[4.34] K. E. Parsopoulos and M. N. Vrahatis, “UPSO-A unified particle swarm
optimization scheme”, in Lecture series on Computational Sciences, 2004, pp. 868-873.
[4.35] R. Mendes, J. Kennedy and J. Neves, “The fully informed particle swarm: Simpler, maybe better”, IEEE Trans. Evolutionary Computation, vol. 8, pp. 204-210, June 2004.
[4.36] T. Peram, K. Veeramachaneni and C. K. Mohan, “Fitness distance ratio based
particle swarm optimization”, in Proc. Swarm Intelligence Symp., 2003, pp. 174-181.
[4.37] F. van den Bergh and A. P. Engelbrecht, “A cooperative approach to particle swarm optimization”, IEEE Trans. on Evolutionary Computation, vol. 8, pp. 225-239, June 2004.
[4.38] Yanping Lu, Shaozi Li and Changle Zhou, “Multipoint organizational evolutionary
algorithm for globally minimizing functions of many variables”, in Proc. of IEEE 2nd Int.
Conf. on Pervasive Computing and applications, July 2007, pp. 84-89.
[4.39] A. Borji, M. Hamidi and A. M. Eftekhari moghadam, “CLPSO based fuzzy color
image segmentation”, Annual meeting of the North American fuzzy information
processing society, June 2007, pp. 508-513.
[4.40] Weihua Liu, Qingmei Sui, Wei Zhang, Nan Lu and Zhengmin Liu, “Image
segmentation with 2-D maximum entropy based on comprehensive learning particle swarm
optimization”, in Proc. of IEEE Int. Conf. on Automation and Logistics, Aug. 2007, pp.
793-797.
[4.41] S. Baskar, A. Alphones, P. N. Suganthan and J. J. Liang, “Design of Yagi-Uda
antennas using comprehensive learning particle swarm optimization”, IEE proceeding
Microw. Antennas Propag., vol. 152, no. 5, pp. 340-346, October 2005.
[4.42] G. Panda, D. Mohanty, Babita Majhi and G. Sahoo, “Identification of Nonlinear
Systems using Particle Swarm Optimization Technique”, Proc. of IEEE International
Congress on Evolutionary Computation (CEC-2007), Singapore, 25-28 September 2007, pp. 3253-3257.
[4.43] J. F. Schutte and A. A. Groenwold, “Sizing design of truss structures using particle swarms”, Struct. Multidisc. Optim., vol. 25, no. 4, pp. 261-269, 2003.
[4.44] L. Messerschmidt and A. P. Engelbrecht, “Learning to play games using a PSO-based competitive learning approach”, IEEE Trans. Evolutionary Computation, vol. 8, pp. 280-288, June 2004.
[4.45] M. P. Wachowiak, R. Smolikova, Y. F. Zheng, J. M. Zurada and A. S. Elmaghraby,
“An approach to multimodal biomedical image registration utilizing particle swarm
optimization”, IEEE Trans. on Evolutionary Computation, vol. 8, pp. 289-301, June 2004.
[4.46] D. J. Krusienski and W. K. Jenkins, “Particle swarm optimization for adaptive IIR
filter structures”, in Proc. IEEE Congress on Evolutionary Computation, June 2004, pp.
965-970.
[4.47] P. J. Angeline, “Using selection to improve particle swarm optimization”, in Proc.
IEEE Congress Evolutionary Computation, Anchorage, AK, 1998, pp. 84-89.
[4.48] M. Lovbjerg, T. K. Rasmussen and T. Krink, “Hybrid particle swarm optimizer with
breeding and subpopulations”, in Proc. Genetic Evol. Comput. Conf., 2001, pp. 469-476.
[4.49] Zhi-feng Hao, Zhi Gang Wang and Han Huang, “A particle swarm optimization algorithm with crossover operator”, Proc. of the 6th Int. Conf. on Machine Learning and Cybernetics, Hong Kong, August 2007, pp. 1036-1040.
[4.50] Andrew Stacey, Mirjana Jancic and Ian Grundy, “Particle swarm optimization with
mutation”, in Proc. of IEEE congress on Evolutionary Computation, December 2003, pp.
1425-1430.
[4.51] M. Lovbjerg and T. Krink, “Extending particle swarm optimizers with self-organized criticality”, in Proc. of IEEE Congress on Evolutionary Computation, Honolulu, HI, 2002, pp. 1588-1593.
[4.52] T. M. Blackwell and P. J. Bentley, “Don’t push me! Collision avoiding swarms”, in
Proc. IEEE Congress on Evolutionary Computation, Honolulu, HI, 2002, pp. 1691-1696.
[4.53] X. Xie, W. Zhang and Z. Yang, “A dissipative particle swarm optimization”, in Proc.
IEEE Congress on Evolutionary Computation, Honolulu, HI, 2002, pp. 1456-1461.
[4.54] K. E. Parsopoulos and M. N. Vrahatis, “On the computation of all global minimizers through particle swarm optimization”, IEEE Trans. on Evolutionary Computation, vol. 8, pp. 211-224, June 2004.
[4.55] B. B. Murthy and G. Panda, “Efficient neural network algorithms for IIR system
identification”, in Proc. of International conf. on signal processing applications &
technology, Boston, USA, 7-10 October 1996, pp. 1343-1347.
[4.56] B. B. Murthy, C. F. N. Cowan and G. Panda, “Efficient scheme of pole-zero system
identification based on multilayer neural network”, Electronics Letters, vol. 29, issue 1, pp.
73, Jan. 1993.
Chapter 5
Dynamic System Identification using FLANN Structure and PSO and BFO Based Learning Algorithms
5.1 Introduction
NONLINEAR system identification of complex dynamic plants finds potential
applications in many areas of engineering such as control, communication, power
system and instrumentation. In recent years, modeling of real time processes has
gained significant importance in these areas. Many interesting papers have been reported in
the literature to identify both static and dynamic nonlinear systems. The Artificial Neural
Network (ANN) has been applied for many identification and control tasks [5.1-5.3] but at
the expense of large computational complexity. Narendra and Parthasarathy [5.4] have employed the multilayer perceptron (MLP) network for effective identification and
control of dynamic systems such as the truck backer-upper problem [5.5] and robot arm control [5.6]. Subsequently the Radial Basis Function (RBF) network was introduced [5.7] to develop system identification models of nonlinear dynamic systems [5.8-5.9]. One practical difficulty with this model is the selection of an appropriate set of RBF centres for effective learning. Further, wavelets in place of RBFs have been suggested in neural networks [5.10-5.11] to develop efficient identification models. The Functional Link Artificial Neural Network (FLANN), a computationally efficient single layer ANN, has been reported in the literature as a useful alternative to the MLP for many applications, and trigonometric [5.12] and Chebyshev [5.13-5.14] based FLANN architectures have been proposed for identification of nonlinear dynamic systems.
Swarm intelligence is the property of a system whereby the collective behavior of unsophisticated agents interacting locally with their environment creates coherent global functional patterns. This type of intelligence is described by five principles: proximity, quality, diverse response, stability and adaptability. Swarm intelligence provides a useful paradigm for complementing powerful adaptive systems. Both the particle swarm optimization (PSO) and bacterial foraging optimization (BFO) algorithms belong to the family of swarm intelligence and share a few computational attributes:
(i) individual elements are updated in parallel;
(ii) each new value depends on its previous value as well as on contributions from its neighbors;
(iii) all updates are performed according to the same rules.
In recent years, evolutionary computational methods belonging to the swarm intelligence category have proven to be promising tools for solving many engineering and financial problems. They are found to be powerful in domains where analytical solutions have not proved effective. The BFO [5.15] is one such evolutionary computing approach, based on the foraging behaviour of E. coli bacteria in the human intestine. In this case foraging is treated as an optimization process in which the bacterium tries to maximize the collected energy per unit foraging time. The BFO has been successfully applied to many real world problems such as harmonic estimation [5.16], transmission loss reduction [5.17], active power filters for load compensation [5.18], power networks [5.19], load forecasting [5.20], independent component analysis [5.21], identification of nonlinear dynamic systems [5.22-5.23], stock market prediction [5.24] and adaptive channel equalization [5.25].
The basics of the PSO algorithm are dealt with in Section 2.5.2. The BFO and PSO are derivative free optimization tools in the sense that they do not need the computation of derivatives during training of the weights of the adaptive structure, and therefore the solution is less likely to be trapped in local minima. On the other hand, the least mean square (LMS) and recursive least square (RLS) algorithms calculate the slope of the error surface at the current position in all directions, but move in the direction of the most negative slope. Such optimization methods work satisfactorily when the error surface contains no local minima, but most real life problems are multimodal and are also distorted by additive noise. In case of BFO a number of parameters are used jointly for searching the total solution space, and as a result the possibility of avoiding local minima is higher. The distinct advantages of BFO and PSO have motivated many researchers to use these tools for identification of complex nonlinear and dynamic systems. Here the connecting weights of the FLANN model are updated using the BFO and PSO techniques instead of a derivative based algorithm. To facilitate the development of the new models, efficient BFO and PSO based identification algorithms are proposed in this chapter.
5.2 Dynamic system identification of nonlinear systems
The basic principle of system identification is discussed in Chapter 3 and depicted in Fig. 3.1, which also covers four different identification models and the associated difference equations given in (3.3)-(3.6). In this chapter single-input single-output (SISO) and multi-input multi-output (MIMO) plants of these four nonlinear model types are considered. The nonlinear functions f(⋅) and g(⋅) associated with the plant are implemented using the FLANN-BP, FLANN-BFO and FLANN-PSO structures. It is assumed that the plant under consideration is bounded-input bounded-output (BIBO) stable. In order to achieve stability and to ensure that the parameters of the ANN model converge, a series-parallel scheme is employed. In this scheme the output of the plant, instead of that of the ANN model, is fed back to the model during the training operation [5.4].
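The distinction between the two schemes can be sketched in a few lines; `model_ff` and `model_fb` are hypothetical callables standing for the trained feed-forward and feedback maps of the model.

```python
# Series-parallel scheme (training): the one-step prediction is driven by the
# measured plant output y(k), which keeps the recursion bounded while the
# weights are still poorly adapted.
def predict_series_parallel(x_k, y_k, model_ff, model_fb):
    return model_ff(x_k) + model_fb(y_k)

# Parallel scheme (testing): the trained model feeds back its own estimate.
def predict_parallel(x_k, yhat_k, model_ff, model_fb):
    return model_ff(x_k) + model_fb(yhat_k)
```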
5.3 A generalized FLANN structure based identification model
A generalized adaptive identification model of a complex dynamic nonlinear plant is shown in Fig. 5.1. The output of the model, ŷ(k+1), at the (k+1)th instant is given by

$$\hat{y}(k+1) = N_1[x(k)] + N_2[y(k)] \qquad (5.1)$$

where N_1 and N_2 represent the low complexity FLANN structures of the feed forward and feed back paths respectively. The weights of these structures are updated using the PSO and BFO algorithms. Using functional expansion block-1, the input x(k) is nonlinearly expanded as

$$u(k) = [1,\ x(k),\ \sin\{\pi x(k)\},\ \cos\{\pi x(k)\},\ \ldots,\ \sin\{n\pi x(k)\},\ \cos\{n\pi x(k)\}]^T \qquad (5.2)$$
$$= [u_0(k),\ u_1(k),\ \ldots,\ u_{2n+1}(k)]^T \qquad (5.3)$$

There are n sine and an equal number of cosine expansions of the input sample. The first term, u_0(k), is a unity input; together with x(k) itself this gives (2n+2) terms in the input vector, indexed 0 to 2n+1. The weight vector corresponding to the kth input vector defined in (5.3) is given by

$$w(k) = [w_0(k),\ w_1(k),\ w_2(k),\ \ldots,\ w_{2n+1}(k)]^T \qquad (5.4)$$

The estimated output of the feed forward path is thus given by

$$\hat{y}_1(k+1) = u^T(k)\, w(k) \qquad (5.5)$$

In a similar way, the estimated output of the feedback path is computed as

$$\hat{y}_2(k+1) = v^T(k)\, h(k) \qquad (5.6)$$

where

$$v(k) = [v_0(k),\ v_1(k),\ \ldots,\ v_{2m+1}(k)]^T \qquad (5.7)$$
$$= [1,\ y(k),\ \sin\{\pi y(k)\},\ \cos\{\pi y(k)\},\ \ldots,\ \sin\{m\pi y(k)\},\ \cos\{m\pi y(k)\}]^T \qquad (5.8)$$

Here v_0(k) = 1. The net estimated output, ŷ(k+1), of the model is given by

$$\hat{y}(k+1) = \hat{y}_1(k+1) + \hat{y}_2(k+1) \qquad (5.9)$$
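The expansion (5.2) and the net model output (5.9) translate directly into code, as sketched below; the weight vectors `w` and `h` are assumed to come from the BFO or PSO training described in the next section.

```python
import numpy as np

def trig_expand(s, n):
    """Trigonometric functional expansion of a scalar sample, as in (5.2):
    [1, s, sin(pi s), cos(pi s), ..., sin(n pi s), cos(n pi s)], 2n+2 terms."""
    terms = [1.0, s]
    for j in range(1, n + 1):
        terms += [np.sin(j * np.pi * s), np.cos(j * np.pi * s)]
    return np.array(terms)

def flann_output(x_k, y_k, w, h, n, m):
    """Net model output (5.9): feed-forward part (5.5) plus feedback part (5.6)."""
    return trig_expand(x_k, n) @ w + trig_expand(y_k, m) @ h
```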
[Fig. 5.1 shows the nonlinear dynamic plant in parallel with the model: the input x(k) passes through Functional Expansion Block-1 and weights w_0(k), ..., w_{2n+1}(k) to produce ŷ_1(k+1); the delayed output y(k) passes through Functional Expansion Block-2 and weights h_0(k), ..., h_{2m+1}(k) to produce ŷ_2(k+1); their sum ŷ(k+1) is compared with the plant output y(k+1) to form the error e(k), which drives the BFO/PSO based training block]
Fig. 5.1 A generalized adaptive model of a complex dynamic nonlinear plant
5.4 BFO and PSO based nonlinear system identification
BFO based identification algorithm
The steps involved in the BFO based identification algorithm are presented here.
Step 1. Initialization of parameters
(i) S_b = Number of bacteria to be used for searching the total region
(ii) N_is = Number of input samples
(iii) p = Number of parameters of the FLANN model to be optimized
(iv) N_s = Swimming length after which tumbling of a bacterium is undertaken in a chemotactic loop.
(v) N_c = Number of iterations to be undertaken in a chemotactic loop; always N_c > N_s.
(vi) N_re = Maximum number of reproduction steps to be undertaken.
(vii) N_ed = Maximum number of elimination and dispersal events to be imposed on the bacteria.
(viii) P_ed = Probability with which the elimination and dispersal operation continues.
(ix) The location of each bacterium, θ(1:p, 1:S_b, 1), is specified by random numbers in [0, 1].
(x) The run length unit C(i) of the ith bacterium is assumed to be constant for all bacteria.
Step 2. Generation of desired signal for training
(i) A uniformly distributed random signal over the interval [-1, 1] is generated and simultaneously fed to the nonlinear dynamic plant and to the adaptive model which is to be trained by the BFO algorithm. A series-parallel identification scheme is used to achieve stability during training [5.4].
(ii) The output of the nonlinear plant acts as the desired signal for training.
Step 3. Iterative Identification Algorithm
In this step the chemotaxis, reproduction, and elimination and dispersal operations are performed on the bacterial population to train the weights of the model. Initially j = n = l = 0.
(i) Elimination-dispersal loop: l = l + 1
(ii) Reproduction loop: n = n + 1
(iii) Chemotaxis loop: j = j + 1
(a) For i = 1, 2, ..., S_b, the cost function (in this case the mean squared error) J(i, j, n, l) for the ith bacterium is calculated as follows:
(1) N_is signal samples are passed through the model.
(2) The output is then compared with the corresponding desired signal to calculate the error.
(3) The sum of squared errors averaged over N_is is finally stored in J(i, j, n, l). The cost function of the model is calculated over the N_is input samples as
$$J = \frac{1}{N_{is}} \sum_{k=1}^{N_{is}} e^2(k) \qquad (5.10)$$
where e(k) = y(k) − ŷ(k).
(4) End of For loop.
(b) For i = 1, 2, ..., S_b the tumbling/swimming decision is taken.
Tumble: A random vector Δ(i), with elements Δ_m(i), m = 1, 2, ..., p, is computed, where each element is a random number in the range [-1, 1].
Move: The move operation is implemented as

$$\theta^i(j+1, n, l) = \theta^i(j, n, l) + C(i)\, \frac{\Delta(i)}{\sqrt{\Delta^T(i)\, \Delta(i)}} \qquad (5.11)$$

The second term produces a step of run length C(i) in the direction of the tumble for bacterium i. The new cost function J(i, j+1, n, l) is computed at the new location of the bacterium.
Swim: (i) Let c = 0 (counter for swim length).
(ii) While c < N_s (the bacterium has not swum too long): let c = c + 1. If the new cost function J(i, j+1, n, l) is lower than the previous one, another step of the same size is taken in the same direction and the cost is recomputed using (5.10); else let c = N_s. This is the end of the while statement.
(c) The next bacterium is processed if i ≠ S_b.
(d) If min(J) (the minimum value of J achieved over all the bacteria) is less than the specified tolerance limit, then all the loops are broken.
Step 4. If j < N_c, the chemotaxis loop beginning from (iii) is continued.
Step 5. Reproduction
(a) For given n and l, and for each i = 1, 2, ..., S_b, let J be the health of the ith bacterium. The bacteria are sorted in ascending order of the cost function J (a higher cost means lower health).
(b) One half (S_r = S_b/2) of the bacteria, those with the higher J values, die, and the remaining S_r bacteria, which provide the minimum MSE values, each split into two and the copies are placed at the same location as their parents.
Step 6. If n < N_re the reproduction loop from (ii) is continued.
Step 7. Elimination-Dispersal
Bacteria are eliminated and dispersed with probability P_ed. This is achieved by eliminating a bacterium and dispersing it to a random location; the same number of new bacteria with random locations are added, which keeps the size of the bacterial population constant.
In the present study the swarming operation is not used; this keeps the algorithm simple at the cost of a small loss of accuracy in the identification task.
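For concreteness, a minimal sketch of the tumble, move (5.11) and swim operations of the chemotactic loop is given below. The `cost` callable stands for the MSE of (5.10); its interface, and the reuse of the tumble direction while swimming, follow the standard BFO description and are assumptions of this sketch.

```python
import numpy as np

def chemotactic_step(theta_i, C_i, cost, Ns):
    """One tumble followed by up to Ns swims for a single bacterium.
    theta_i: weight vector of the bacterium; C_i: run length unit C(i);
    cost: callable returning the MSE (5.10) at a given weight vector."""
    delta = np.random.uniform(-1.0, 1.0, size=theta_i.shape)  # tumble direction
    step = C_i * delta / np.sqrt(delta @ delta)               # move as in (5.11)
    J = cost(theta_i)
    for _ in range(Ns):                                       # swim loop
        candidate = theta_i + step
        Jc = cost(candidate)
        if Jc < J:                      # keep moving while the cost improves
            theta_i, J = candidate, Jc
        else:
            break
    return theta_i, J
```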
PSO based nonlinear system identification
The updating of the weights of the PSO based model is carried out using the training rule
as outlined in the following steps:
Step 1. K (K ≥ 500) samples of a uniformly distributed random signal in the interval [-1, 1] are generated and simultaneously fed to the actual nonlinear system and the adaptive model. A series-parallel identification scheme is used.
Step 2. The output of the plant provides the desired signal. Hence K desired samples and K estimated outputs obtained using (5.9) are produced by feeding all the K input samples.
Step 3. Each desired output is compared with the corresponding model output and K errors are produced.
Step 4. The mean square error (MSE) of the model corresponding to the ith particle is determined using the relation

$$MSE(i) = \frac{1}{K} \sum_{k=1}^{K} e_k^2$$
This is repeated I times, once for each particle.
Step 5. Since the objective is to minimize MSE(i), i = 1 to I, the PSO based optimization method is used.
Step 6. The velocity and position of each particle are updated using (4.1) and (4.2) respectively.
Step 7. In each iteration the minimum MSE (MMSE), which shows the learning behavior of the adaptive model from iteration to iteration, is stored.
Step 8. When the MMSE has reached the pre-specified level, the optimization process is stopped.
Step 9. At this step the particles attain the same position, which represents the desired solution, i.e. the estimated coefficients of the given dynamic plant.
Primarily the identification problem is an optimization problem in the sense that the mean squared error is to be iteratively minimized. BFO and PSO are popular population based optimization algorithms, and they have been selected here to adjust the parameters of the identification model in such a way that the squared error fitness function is minimized. This section has dealt with the BFO and PSO based nonlinear system identification procedures used in the simulation study.
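A compact sketch of Steps 1-9 as a single PSO training loop is given below. `fitness` stands for the MSE evaluation of Step 4; the velocity and position rules follow the usual forms referred to as (4.1) and (4.2), and the inertia weight `w` is an assumed constant, not a value fixed by this chapter.

```python
import numpy as np

def pso_train(fitness, P=30, D=10, iters=200, c1=1.042, c2=1.042,
              vmax=1.0, w=0.729):
    """Minimize fitness(weight_vector) over D weights with P particles."""
    pos = np.random.uniform(-1.0, 1.0, (P, D))
    vel = np.zeros((P, D))
    pbest = pos.copy()
    pcost = np.array([fitness(p) for p in pos])
    g = pcost.argmin()                               # index of the global best
    for _ in range(iters):
        r1, r2 = np.random.rand(P, D), np.random.rand(P, D)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (pbest[g] - pos)
        vel = np.clip(vel, -vmax, vmax)
        pos = pos + vel
        cost = np.array([fitness(p) for p in pos])
        improved = cost < pcost                      # update personal bests
        pbest[improved], pcost[improved] = pos[improved], cost[improved]
        g = pcost.argmin()
    return pbest[g], pcost[g]                        # estimated weights, MMSE
```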
5.5 Simulation study
In this section a simulation study is carried out to assess the performance of the proposed models in the nonlinear identification of the static and dynamic plants described by (3.3)-(3.6). In these examples, the series-parallel model is used to identify the plants, and the BFO and PSO algorithms are used to train the connecting weights of the FLANN structure of the models. The performance of the proposed (FLANN-BFO and FLANN-PSO) approaches is obtained from simulation and compared with that obtained by the FLANN-BP method [5.12]. For training the weights of the FLANN-BP model, 50,000 iterations are carried out using a uniformly distributed random signal over the interval [-1, 1] as input. During the test phase, the effectiveness of the proposed models is studied by using the parallel scheme, where the input to the identified model is given by
$$x(k) = \begin{cases} \sin\dfrac{2\pi k}{250}, & k \le 250 \\[6pt] 0.8\sin\dfrac{2\pi k}{250} + 0.2\sin\dfrac{2\pi k}{25}, & k > 250 \end{cases} \qquad (5.12)$$
A quantitative measure for performance evaluation used is the normalized mean square
error (NMSE) defined in [5.26] as
$$NMSE = \frac{1}{\sigma_{T_D}^{2}\, T_D} \sum_{k=1}^{T_D} \left[y(k) - \hat{y}(k)\right]^2 \qquad (5.13)$$

where y(k) and ŷ(k) represent the plant and model outputs at the kth discrete time, respectively, and σ²_{T_D} denotes the variance of the plant output sequence over the test duration T_D.
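Both the test input (5.12) and the NMSE measure (5.13) translate directly into code, as sketched below; reporting the NMSE in dB via 10 log10, as in Table 5.1 later in the chapter, is an assumption of this sketch.

```python
import numpy as np

def test_input(K=500):
    """Test-phase input of (5.12)."""
    k = np.arange(1, K + 1)
    return np.where(k <= 250,
                    np.sin(2 * np.pi * k / 250),
                    0.8 * np.sin(2 * np.pi * k / 250)
                    + 0.2 * np.sin(2 * np.pi * k / 25))

def nmse_db(y, y_hat):
    """NMSE of (5.13): mean squared output error normalized by the variance
    of the plant output, converted to dB."""
    nmse = np.mean((np.asarray(y) - np.asarray(y_hat)) ** 2) / np.var(y)
    return 10.0 * np.log10(nmse)
```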
Static Systems
Two examples of static systems [5.4, 5.12] used in the simulation study are
Example 1: $f_1(x) = x^3 + 0.3x^2 - 0.4x$
Example 2: $f_2(x) = 0.6\sin(\pi x) + 0.3\sin(3\pi x) + 0.1\sin(5\pi x)$
In case of FLANN-BFO and FLANN-PSO, nine input nodes for the first example and eleven input nodes for the second example are used to obtain the best possible identification results. The number of connecting weights including the threshold is ten and twelve respectively. These weights are updated using the bacterial foraging or particle swarm optimization algorithm. In case of FLANN-BP, however, fifteen input nodes including a bias input are needed to achieve similar performance. In both cases the input pattern is expanded using trigonometric expansion and the nonlinearity used is the tanh(.) function. The convergence coefficient is set to 0.1 in case of FLANN-BP, whereas the parameters used for FLANN-BFO are: S_b = 16, N_is = 100, p = 10, N_s = 3, N_c = 5, N_re = 100-130, N_ed = 5, P_ed = 0.25, C(i) = 0.0075. Similarly the parameters used in case of the PSO based simulation are: no. of particles = 30, no. of input samples = 200, c_1 = c_2 = 1.042, v_max = 1. The results of identification of Examples 1 and 2 are shown in Figs. 5.2(a)-(d). From these results it is clear that the BFO and PSO based FLANN models provide excellent agreement between the plant and model responses. Using the same number of expansions in both examples, the estimation errors provided by FLANN-BFO and FLANN-PSO are found (Table 5.1) to be lower than that of the FLANN-BP approach.
[Fig. 5.2 plots plant and model outputs against the input over [-1, 1]: (a) FLANN-BFO (nine expansions); (b) FLANN-PSO (nine expansions); (c) FLANN-BFO (eleven expansions); (d) FLANN-PSO (eleven expansions)]
Fig. 5.2 Response matching of static systems ((a), (b) for Example 1 and (c), (d) for Example 2)
Dynamic (SISO) Systems
Example 3: The difference equation of the plant [5.4, 5.12] to be identified is given as

$$y(k+1) = 0.3\,y(k) + 0.6\,y(k-1) + g[x(k)] \qquad (5.14)$$
The linear parameters are 0.3 and 0.6 and the unknown nonlinear functions g_i(.) are given by

$$g_1(x) = \frac{4.0x^3 - 1.2x^2 - 3.0x + 1.2}{0.4x^5 + 0.8x^4 - 1.2x^3 + 0.2x^2 - 3.0} \qquad (5.15)$$

$$g_2(x) = 0.5\sin^3(\pi x) - \frac{2.0}{x^3 + 2.0} - 0.1\cos(4\pi x) + 1.125 \qquad (5.16)$$
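For reference, a short sketch that generates the desired signal from the plant (5.14) with either of the two nonlinearities is given below; the zero initial conditions are an assumption of the sketch, as they are not stated in the text.

```python
import numpy as np

def g1(x):
    return ((4.0 * x**3 - 1.2 * x**2 - 3.0 * x + 1.2) /
            (0.4 * x**5 + 0.8 * x**4 - 1.2 * x**3 + 0.2 * x**2 - 3.0))

def g2(x):
    return (0.5 * np.sin(np.pi * x)**3 - 2.0 / (x**3 + 2.0)
            - 0.1 * np.cos(4 * np.pi * x) + 1.125)

def simulate_plant(x, g):
    """Desired signal of (5.14): y(k+1) = 0.3 y(k) + 0.6 y(k-1) + g[x(k)],
    assuming zero initial conditions y(0) = y(-1) = 0."""
    y = np.zeros(len(x) + 2)          # y[0], y[1] hold the initial conditions
    for k in range(len(x)):
        y[k + 2] = 0.3 * y[k + 1] + 0.6 * y[k] + g(x[k])
    return y[2:]
```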
To identify the plant, a series-parallel model is used whose difference equation is given as

$$\hat{y}(k+1) = 0.3\,y(k) + 0.6\,y(k-1) + N[x(k)] \qquad (5.17)$$

where N[x(k)] represents either the FLANN-BP, FLANN-BFO or FLANN-PSO model.
The FLANN input is expanded to nine terms by using trigonometric expansion and the BFO or PSO algorithm is used to update its connecting weights. The parameters used for the BFO based FLANN model are the same as those used in Example 1 except that N_re = 60. The parameters used for PSO are: no. of particles = 30, no. of input samples = 200, c_1 = c_2 = 1.042 and v_max = 1. In case of FLANN-BP the input is expanded to fourteen trigonometric terms and the delta rule is used to train the weights. Both the convergence
parameter μ and the momentum parameter η are chosen to be 0.1. The results of identification of (5.14) with the nonlinear functions defined in (5.15) and (5.16) are shown in Figs. 5.3(a) and (b) and Figs. 5.4(a) and (b) respectively.
[Fig. 5.3 plots plant and model outputs versus discrete time: (a) using FLANN-BFO (nine expansions); (b) using FLANN-PSO (nine expansions)]
Fig. 5.3 Comparison of response of the dynamic plant of Example 3 using the nonlinearity defined in (5.15)
[Fig. 5.4 plots plant and model outputs versus discrete time: (a) using FLANN-BFO (nine expansions); (b) using FLANN-PSO (nine expansions)]
Fig. 5.4 Comparison of response of the dynamic plant of Example 3 using the nonlinearity defined in (5.16)
From these results it is evident that the FLANN-BFO and FLANN-PSO methods provide accurate identification performance. Further, from Table 5.1 it is observed that the BFO and PSO based approaches yield lower NMSE than their FLANN-BP counterpart.
Example 4: In this example the plant [5.4, 5.12] to be identified is of Model-2 type and is represented by the difference equation

$$y(k+1) = f[y(k),\, y(k-1)] + x(k) \qquad (5.18)$$
The unknown nonlinear function f is given by

$$f(y_1, y_2) = \frac{y_1 y_2 (y_1 + 2.5)(y_1 - 1.0)}{1.0 + y_1^2 + y_2^2} \qquad (5.19)$$
In this case the series-parallel scheme of the model used is given by

$$\hat{y}(k+1) = N[y(k),\, y(k-1)] + x(k) \qquad (5.20)$$

where N represents the FLANN-BP, FLANN-BFO or FLANN-PSO model, as in the earlier examples. In FLANN-BFO and FLANN-PSO, each of the two inputs is expanded to six terms and the BFO or PSO algorithm is used to train the weights. The parameters used for the BFO based FLANN model are S_b = 16, N_is = 100, p = 13, N_s = 3, N_c = 5, N_re = 240, N_ed = 5, P_ed = 0.25 and C(i) = 0.0075. For PSO the parameters used are: no. of particles = 30, no. of input samples = 500, c_1 = c_2 = 1.042 and v_max = 1. In case of FLANN-BP the two inputs are expanded into 24 terms and the convergence and momentum parameters are set at 0.05 and 0.1 respectively. The responses obtained from the plant and the two models are shown in Figs. 5.5(a) and (b). In this case also it is observed that the response matching of FLANN-BFO and FLANN-PSO is excellent. The results presented in Table 5.1 also indicate improved performance of the two new models compared to the FLANN-BP model.
[Fig. 5.5 plots plant and model outputs versus discrete time: (a) using FLANN-BFO (twelve expansions); (b) using FLANN-PSO (twelve expansions)]
Fig. 5.5 Comparison of response of the dynamic plant of Example 4
Example 5: In this case the plant is of Model-3 type and is described by the difference equation

$$y(k+1) = f[y(k)] + g[x(k)] \qquad (5.21)$$
where the unknown nonlinear functions f(.) and g(.) are represented as

$$f(y) = \frac{y(y + 0.3)}{1.0 + y^2} \qquad (5.22)$$

$$g(x) = x(x + 0.8)(x - 0.5) \qquad (5.23)$$

The model is represented by the series-parallel scheme

$$\hat{y}(k+1) = N_1[y(k)] + N_2[x(k)] \qquad (5.24)$$
where N_1 and N_2 represent the FLANN-BP, FLANN-BFO or FLANN-PSO models. In the FLANN-BP model the N_1 and N_2 structures contain 14 and 24 trigonometric expansions respectively, whereas in case of FLANN-BFO and FLANN-PSO seven and five expansions are used. The convergence and momentum parameters are chosen to be 0.1 in case of the FLANN-BP model. The parameters used for the BFO based FLANN model are S_b = 16, N_is = 100, p = 14, N_s = 3, N_c = 5, N_re = 120, N_ed = 5, P_ed = 0.25 and C(i) = 0.0075. For PSO, the parameters used are: no. of particles = 30, no. of input samples = 500, c_1 = c_2 = 1.042, v_max = 1. The responses obtained from the plant and the various models are
compared in Figs. 5.6(a) and (b) and the computed NMSE is presented in Table 5.1. These results also indicate superior performance of the proposed techniques over their FLANN-BP counterpart.
[Fig. 5.6 plots plant and model outputs versus discrete time: (a) using FLANN-BFO (twelve expansions); (b) using FLANN-PSO (twelve expansions)]
Fig. 5.6 Comparison of response of the dynamic plant of Example 5
Example 6: MIMO System
The two-input two-output nonlinear discrete time plant [5.14] is described by
$$\begin{bmatrix} y_1(k+1) \\ y_2(k+1) \end{bmatrix} = \begin{bmatrix} \dfrac{y_2(k)}{1 + y_1^2(k)} \\[8pt] \dfrac{y_1(k)}{1 + y_1^2(k)} \end{bmatrix} + \begin{bmatrix} x_1(k) \\ x_2(k) \end{bmatrix} + \begin{bmatrix} n_1(k) \\ n_2(k) \end{bmatrix} \qquad (5.25)$$
where n_1(k) and n_2(k) are white Gaussian noise with zero mean and covariance of 0.0009.
The estimated outputs are given by

$$\hat{y}_1(k+1) = f_1[y_1(k), y_2(k), x_1(k), x_2(k)], \qquad \hat{y}_2(k+1) = f_2[y_1(k), y_2(k), x_1(k), x_2(k)] \qquad (5.26)$$
The inputs x_1(k) and x_2(k) are

$$x_1(k) = \cos\!\left(\frac{2\pi k}{100}\right), \qquad x_2(k) = \sin\!\left(\frac{2\pi k}{100}\right) \qquad (5.27)$$
The block diagram for the identification of the MIMO plant (5.25) is given in Fig. 5.7.
[Fig. 5.7 shows the inputs x1(k), x2(k) and the delayed outputs y1(k), y2(k) feeding both the nonlinear MIMO plant and two FLANN structures; the errors e1(k) and e2(k) between the plant outputs y1(k+1), y2(k+1) and the model outputs ŷ1(k+1), ŷ2(k+1) drive the BFO/PSO training algorithms]
Fig. 5.7 Block diagram of nonlinear MIMO plant identification
In case of the FLANN-BP model, each of the inputs y1(k), y2(k), x1(k) and x2(k) is expanded into three terms using Chebyshev polynomials [5.13]. The weights are updated using the delta rule. A parallel scheme is used for the identification purpose. The actual and model responses of the MIMO system are displayed in Figs. 5.8(a) and (b) for FLANN-BFO and Figs. 5.8(c) and (d) for FLANN-PSO. The proposed methods show improved agreement between the estimated and true responses. However the FLANN-BP method shows poor response matching capability, as is evident from Figs. 5.8(e) and (f).
[Fig. 5.8 plots plant and model outputs versus discrete time for the MIMO plant: (a) first output using FLANN-BFO; (b) second output using FLANN-BFO; (c) first output using FLANN-PSO; (d) second output using FLANN-PSO; (e) first output using FLANN-BP; (f) second output using FLANN-BP]
Fig. 5.8 Response matching of MIMO system of Example 6
Table 5.1
Comparison of NMSE(dB) computed for different examples of two different models

                        Higher input expansion         Same number of input expansions
Example No.             FLANN-BP      No. of           No. of        FLANN-    FLANN-    FLANN-
                        NMSE(dB)      expansions       expansions    BP        BFO       PSO
Ex-1                    -23.21        14               09            -20.66    -27.30    -29.77
Ex-2                    -34.19        14               11            -30.71    -38.98    -42.65
Ex-3 with (5.15)        -25.48        14               09            -18.51    -27.09    -20.90
Ex-3 with (5.16)        -32.82        14               09            -28.16    -33.77    -31.02
Ex-4                    -21.01        24               12            -20.57    -22.75    -22.88
Ex-5                    -19.24        38               12            -18.29    -21.09    -22.58
Table 5.2
Comparison of computational complexities of various system identification models

             Types of models     No. of    No. of     No. of     No. of    No. of
                                 tanh( )   Cos/Sin    weights    Adds.     Muls.
Example-1    FLANN-BP            1         14         15         14        15
             FLANN-BFO/PSO       0         09         10         9         10
Example-2    FLANN-BP            1         14         15         14        15
             FLANN-BFO/PSO       0         11         12         11        12
Example-3    FLANN-BP            1         14         15         14        15
             FLANN-BFO/PSO       0         09         10         9         10
Example-4    FLANN-BP            1         24         25         24        25
             FLANN-BFO/PSO       0         12         13         12        13
Example-5    FLANN-BP            2         38         40         39        40
             FLANN-BFO/PSO       0         12         14         13        14
Extensive simulation studies show that the FLANN structure with BFO and PSO based training provides better output response and requires a substantially smaller number (about 200 to 600) of iterations to converge, compared to the 50,000 iterations required by the FLANN-BP model. In BFO based training, only 100 input samples and a population of 16 bacteria are required for convergence, whereas in case of PSO 200-500 input
samples and 30 particles are required. The computational requirements per iteration of the algorithms, both for the proposed and the FLANN-BP methods, are evaluated and listed in Table 5.2. This comparison also indicates that the FLANN-BFO and FLANN-PSO methods involve fewer operations than FLANN-BP in all the examples simulated. As shown in Table 5.1, the NMSE obtained in each example by the proposed FLANN-BFO and FLANN-PSO models is also lower than that of the FLANN-BP approach.
5.6 Conclusion
This chapter introduces the problem and importance of adaptive nonlinear system identification and highlights the shortcomings of conventional identification methods. Two new approaches based on swarm intelligence are proposed to identify complex nonlinear dynamic plants, and the corresponding BFO and PSO based identification algorithms are presented. The performance of the proposed methods is assessed by simulating various standard nonlinear dynamic and MIMO systems, and the results have been compared with those obtained by the FLANN-BP based approach. The comparison reveals that the new identification methods are faster, more accurate and involve less computation than their FLANN-BP counterpart. Thus the proposed approaches are promising methods for achieving efficient nonlinear dynamic system identification.
References
[5.1] S. Haykin, Neural Networks, Ottawa, ON, Canada: Maxwell Macmillan, 1994.
[5.2] P. S. Sastry, G. Santharam and K. P. Unnikrishnan, “Memory neural networks for
identification and control of dynamical systems”, IEEE Trans. Neural Networks, vol. 5, pp.
306-319, 1994.
[5.3] A. G. Parlos, K. T. Chong and A. F. Atiya, “Application of recurrent multilayer
perceptron in modeling of complex process dynamics”, IEEE Trans. Neural Networks,
vol. 5, pp. 255-266, 1994.
[5.4] K. S. Narendra and K. Parthasarathy, “Identification and control of dynamical systems
using neural networks”, IEEE Trans. on neural networks, vol. 1, no. 1, pp. 4-27, 1990.
[5.5] D. H. Nguyen and B. Widrow, “Neural networks for self-learning control systems”, Int. J. Contr., vol. 54, no. 6, pp. 1439-1451, 1991.
[5.6] G. Cembrano, G. Wells, J. Sarda and A. Ruggeri, “Dynamic control of a robot arm
based on neural networks”, Contr. Eng. Practice, vol 5, no. 4, pp. 485-492, 1997.
[5.7] T. Poggio and F. Girosi, “Networks for approximation and learning”, Proceeding of
IEEE, vol. 78, no. 9, pp. 1481-1497, 1990.
[5.8] S. Chen, S. A. Billings and P. M. Grant, “Recursive hybrid algorithm for nonlinear
system identification using radial basis function networks”, Int. J. Contr., vol. 55, no. 5, pp.
1051-1070, 1992.
[5.9] S. V. T. Elanayar and Y. C. Shin, “Radial basis function neural network for
approximation and estimation of nonlinear stochastic dynamic systems”, IEEE Trans.
Neural Network, vol. 5, pp. 594-603, 1994.
[5.10] Q. Zhang and A. Benveniste, “Wavelet networks”, IEEE Trans. Neural Networks,
vol. 3, pp. 889-898, 1992.
[5.11] Y. C. Pati and P. S. Krishnaprasad, “Analysis and synthesis of feed forward neural
networks using discrete affine wavelet transforms”, IEEE Trans. Neural Networks, vol. 4,
pp. 73-85, 1993.
[5.12] J. C. Patra, R. N. Pal, B. N. Chatterji and G. Panda, “Identification of nonlinear
dynamic systems using functional link artificial neural networks”, IEEE Trans. on Systems,
Man and Cybernetics-Part B: Cybernetics, vol. 29, no. 2, pp. 254-262, 1999.
[5.13] J. C. Patra and A. C. Kot, “Nonlinear dynamic system identification using Chebyshev functional link artificial neural networks”, IEEE Trans. on Systems, Man and Cybernetics-Part B: Cybernetics, vol. 32, no. 4, pp. 505-511, 2002.
[5.14] S. Purwar, I. N. Kar and A. N. Jha, “Online system identification of complex systems using Chebyshev neural networks”, Applied Soft Computing, Elsevier, pp. 364-372, 2007.
[5.15] K. M. Passino, “Biomimicry of Bacterial Foraging for distributed optimization and
control”, IEEE control system magazine, vol. 22, no. 3, pp. 52-67, 2002.
[5.16] S. Mishra, “Hybrid least square adaptive bacterial foraging strategy for harmonic
estimation”, IEE Proc. on Gener., Transm., Distrib., vol. 152, no. 3, pp. 379-389, 2005.
[5.17] M. Tripathy, S. Mishra., L. L. Lai and Q. P. Zhang, “Transmission loss reduction
based on FACTS and Bacteria Foraging algorithm”, PPSN, pp. 222-231, 2006.
[5.18] S. Mishra and C. N. Bhende, “Bacterial Foraging technique based optimized active
power filter for load compensation”, IEEE Trans. on Power Delivery, vol. 22, no.1, pp.
457-465, 2007.
[5.19] M. Tripathy and S. Mishra, “Bacteria foraging based solution to optimize both real power loss and voltage stability limit”, IEEE Trans. on Power Systems, vol. 22, no. 1, pp. 240-248, 2007.
[5.20] L. Ulagammai, P. Venkatesh, S. P. Kannan and N. P. Padhy, “Application of bacterial foraging technique trained artificial and wavelet neural networks in load forecasting”, Neurocomputing, pp. 2659-2667, 2007.
[5.21] D. P. Acharya, G. Panda, S. Mishra and Y. V. S. Lakhshmi, “Bacteria foraging based
independent component analysis”. Proc. of Int. Conf. on Computational Intelligence and
Multimedia Applications, vol. 2, pp. 527-531, 2007.
[5.22] Babita Majhi and G. Panda, “Bacterial Foraging based Identification of Nonlinear
Dynamics System”, Proc. of IEEE International Congress on Evolutionary
Computation (CEC-2007), Singapore, 25-28 September 2007, pp. 1636-1641.
[5.23] G. Panda, Babita Majhi and S. Mishra, “Nonlinear System Identification using
Bacterial Foraging based Learning”, Proc. of IET 3rd International Conference on Artificial
Intelligence in Engineering Technology (ICAIET-2006), Kota Kinabalu, Malaysia, 22-24
November, 2006, pp. 120-125.
[5.24] R. Majhi, G. Panda, G. Sahoo and D. P. Das, “Stock market prediction of S&P 500 and DJIA using Bacterial Foraging Optimization technique”, IEEE Congress on Evolutionary Computation (CEC 2007), Singapore, 25-28 September 2007, pp. 2569-2575.
[5.25] Babita Majhi, G. Panda and A. Choubey, “On The Development of a new Adaptive
Channel Equalizer using Bacterial Foraging Optimization Technique”, Proc. of IEEE
Annual India Conference (INDICON-2006), New Delhi, India, 15th-17th September, 2006,
pp. 1-6.
[5.26] N. A. Gershenfeld and A. S. Weigend, “The future of time series: Learning and understanding”, in Time Series Prediction: Forecasting the Future and Understanding the Past, Reading, MA: Addison-Wesley, pp. 1-70, 1993.
Chapter 6
Robust Identification and Prediction using Particle Swarm Optimization Technique
6.1 Introduction
THE objective of identification is to determine a suitable mathematical model of a given system/process, useful for predicting the behavior of the system under different operating conditions. Another objective is to design a controller which allows the system to perform in a desired manner. Most practical plants and systems are nonlinear and dynamic in nature, and hence identification of such complex plants is a challenging task. Accurate and fast identification of real time nonlinear complex processes is still a difficult problem. Further, in many practical situations, building a proper model of a plant becomes difficult when outliers are present or a few data points are missing from the output samples of the plant or the training signal. Under such adverse conditions the training of models becomes ineffective when conventional learning algorithms such as the least mean square (LMS) or recursive least square (RLS) type algorithms are used. All these derivative based algorithms have been
derived by minimizing the square of the error as the cost function. In the recent past many bioinspired and evolutionary computing tools such as the genetic algorithm (GA), particle swarm optimization (PSO), bacterial foraging optimization (BFO) and ant colony optimization (ACO) have been reported and applied to optimization and identification tasks. In case of these derivative free algorithms the mean square error (MSE) is conventionally used as the fitness or cost function. Use of the MSE as cost function leads to improper training of the adaptive model when outliers are present in the desired signal. Therefore there is a need for robust identification of complex plants which are nonlinear and dynamic in nature. It is a fact that traditional regressors employ a least square fit, which minimizes the Euclidean norm, while the robust estimator is based on a fit which minimizes a rank based norm called the Wilcoxon norm [6.1]. It is known in statistics that linear regressors developed using the Wilcoxon norm are robust against outliers. Using this norm, new robust machines have recently been proposed for the approximation of nonlinear functions [6.2]. In the present investigation a new method of robust identification of nonlinear dynamic systems or plants is developed by minimizing a robust cost function (RCF) [6.1], [6.44], [6.47] of the errors of a functional link artificial neural network (FLANN) model using the derivative free PSO technique. The identification performance of the new method is evaluated through a simulation study and is compared with the results obtained from the corresponding Euclidean norm based PSO technique. Hence the main contribution of this chapter is the formulation of the complex identification task as a robust optimization problem over the RCF of the FLANN model. The second contribution is the effective minimization of this norm employing a population based derivative free PSO technique, which essentially adjusts the connecting weights of the feed forward and feed back paths of the model. The third contribution is the selection of an appropriate FLANN structure as the backbone of the model, which is a single layer ANN structure and offers low complexity. The robust identification performance in the presence of outliers in the training signal is then shown through simulation of some benchmark nonlinear dynamic identification problems. Many research papers have been reported in the literature to identify both static and dynamic nonlinear systems. The Artificial Neural Network (ANN) has been employed for many identification and control purposes [6.3-6.5] but at the expense of large computational complexity. Narendra and Parthasarathy [6.6] have used the MLP architecture with the back propagation learning algorithm for effective identification and control of dynamic systems [6.7] and robot arm control [6.8]. Similarly the Radial Basis Function (RBF) network has been introduced to develop system identification models of nonlinear dynamic systems [6.9-6.10]. In the recent past, wavelets in place of RBFs have been suggested in neural networks [6.11-6.12] to
develop efficient identification models. Further, the Functional Link Artificial Neural Network (FLANN), a computationally efficient single layer ANN, has been reported in the literature as a useful alternative to the MLP for many applications. This single layer ANN has also been successfully employed for identification of nonlinear systems [6.13]. Recently the Chebyshev-FLANN has been proposed for identification of nonlinear dynamic systems [6.14].
The basics of PSO are dealt with in Section 2.5.2. In addition to the applications of PSO described in Section 4.2, it has also been applied to reactive power and voltage control [6.15, 6.16], economic dispatch [6.17]-[6.20], power system reliability and security [6.21], generation expansion problems [6.22, 6.23], state estimation [6.24, 6.25], controller tuning [6.26, 6.27], system identification and control [6.28, 6.29], capacitor placement [6.30, 6.31], short term load forecasting [6.32], generator contribution to transmission systems [6.33], industrial applications [6.34], the task assignment problem [6.35], the solution of Sudoku puzzles [6.36], electromagnetic design [6.37], the unit commitment problem [6.38] and optimization of multimodal functions [6.39].
In the past many robust learning algorithms have been proposed for training different adaptive networks. A robust BP learning algorithm that is resistant to noise effects has been derived in [6.40]; however, the convergence of this algorithm is very slow. Another robust learning algorithm has been reported [6.41] for the recurrent neural network. This algorithm is based on filtering outliers from the data followed by estimating parameters from the filtered data, and the method makes better predictions of electrical demand than conventional methods. In a letter [6.42] a robust learning method has been proposed for the RBF network and applied to function approximation. Kadir Liano has reported [6.43] a mean log squared error (MLSE) cost function (CF) and has shown that this cost function yields an algorithm which is robust compared to the conventional squared error based cost function. In another publication the authors have used the fuzzy-neural network and β-spline membership functions for function approximation with outliers in the training data [6.44] and have shown that their algorithm is more flexible and efficient than the one reported in [6.38]. In 1998 a robust interval regression analysis was suggested which provides robust performance against outliers as well as an improvement in the rate of convergence [6.45]. A robust objective function is suggested in [6.46] for RBF networks to reduce the influence of outliers. The authors have shown that the proposed objective function yields better function approximation than its least square (LS) counterpart. In [6.47], the authors have proposed robust learning algorithms for the fuzzy neural
network to reduce the outlier effects during training. They have tested the robustness of their algorithm through simulation of various function approximation problems. Chuang et al have recently proposed [6.48] a robust TSK fuzzy modeling approach with improved performance for function approximation in the presence of outliers. A novel regression approach has recently been reported [6.49] to enhance the robustness of the support vector regressor. In that approach a cost function is used which works satisfactorily when at most 10% outliers are present in the training set. In 2004, a robust analysis of linear models using the Wilcoxon norm was presented [6.1] and it was shown that this norm is more robust to outliers than its least square counterpart. Recently Hsieh et al have proposed robust learning rules for the neural network, fuzzy neural network and kernel based regressor using the Wilcoxon norm [6.2], which is different from other reported norms, and have shown that the new norm based algorithms exhibit robust performance even when the percentage of outliers is as high as 40%.
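Since the Wilcoxon norm underlies the robust cost function developed in this chapter, a minimal numerical sketch is given below. It assumes the standard rank-based score $a(i) = \sqrt{12}\,(i/(N+1) - 0.5)$ used in the robust statistics literature [6.1], [6.2]; the exact score function adopted in the RCF of this thesis may differ in detail.

```python
import numpy as np

def wilcoxon_norm(e):
    """Rank-based Wilcoxon pseudo-norm of an error vector: each error is
    weighted by a bounded score of its rank, so a single huge (outlier)
    error cannot dominate the cost the way it does in the squared norm."""
    e = np.asarray(e, dtype=float)
    N = len(e)
    ranks = np.argsort(np.argsort(e)) + 1          # ranks 1..N of the errors
    a = np.sqrt(12.0) * (ranks / (N + 1.0) - 0.5)  # bounded score function
    return float(a @ e)

e = np.array([0.1, -0.2, 0.05, 0.15, -0.1])        # well-behaved errors
e_out = np.append(e, 10.0)                         # one gross outlier
print(np.sum(e**2), wilcoxon_norm(e))              # both costs are small
print(np.sum(e_out**2), wilcoxon_norm(e_out))      # the squared cost explodes;
                                                   # the Wilcoxon cost grows mildly
```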
6.2 Formulation of PSO based nonlinear system identification model
The identification scheme of a dynamic nonlinear system is shown in Fig. 6.1, in which x(n), ŷ(n), y(n) and e(n) denote the input, the output of the model, the output of the system and the error between the two at the nth time instant, respectively. The input x(n) is a uniformly distributed white signal. Therefore,

$$e(n) = y(n) - \hat{y}(n) \qquad (6.1)$$

In the present study, identification of single-input single-output (SISO) dynamic plants of four different nonlinear models [6.6], described by the difference equations given in (3.3)-(3.6), is considered.
Fig. 6.1 Identification scheme of a dynamic system (block diagram: the input x(n) drives both the system P and the adaptive model P̂; the error e(n) = y(n) − ŷ(n) between the system and model outputs drives the update algorithm)
The nonlinear functions associated with these plants are implemented using a single layer
nonlinear FLANN structure whose weights are trained by minimizing three different CFs
using the PSO algorithm. It is assumed that the plant under consideration is bounded-input
bounded-output (BIBO) stable. In order to achieve stability and to ensure that the
parameters of the FLANN model converge, a series-parallel scheme is employed. In this
scheme the output of the plant, instead of that of the ANN model, is fed back to the
model during the training phase [6.6]. Fig. 6.2 depicts a detailed diagram for identification
of a nonlinear dynamic plant using a PSO based robust norm minimization technique. The
architecture of the proposed adaptive model is taken to be two functional link artificial
neural networks (FLANNs) [6.50] as shown in the same figure. The output for the n-th input
sample at the k-th generation, y_n(k), is used as input to the feedback FLANN structure. In this
method, N input samples x(n), 1 ≤ n ≤ N, uniformly distributed in [−1, 1], are used
to develop the model. The same set of inputs is used for all particles. The output y_n(k) may
be computed as

y_n(k) = y′_n(k) + η(k)                                                   (6.2)

where η(k) represents outliers at randomly selected locations and zero magnitude at the
remaining samples at the k-th generation. For developing the proposed FLANN-PSO based
adaptive identification model, the combined feed-forward and feedback weights are
considered as one particle. In the beginning, P such particles, together called a swarm, are
chosen to represent a population of random solutions. Each weight-particle of the model,
which is updated using the PSO based technique, has a random velocity and flies within the
solution space. Each particle also has memory to keep track of its previous best position and
the corresponding fitness value. Similarly, the swarm remembers the best solution achieved so
far. The velocity and position of each weight-particle are updated using its personal best
position and the global best position of the swarm.
The output of the model for the p-th particle, n-th input sample and k-th generation is given
by

ŷ_{n,p}(k) = Ẑ_n^T(k) W_p(k)                                              (6.3)

where

Ẑ_n(k) = [z_{1,n}  z_{2,n}  …  z_{L1,n}  z_{L1+1,n}  …  z_{L,n}]^T

W_p(k) = [w_{1,p}(k)  w_{2,p}(k)  …  w_{L1,p}(k)  w_{L1+1,p}(k)  …  w_{L,p}(k)]^T      (6.4)
z_{l1,n} (1 ≤ l1 ≤ L1) and z_{l2,n} (1 ≤ l2 ≤ L − L1) represent the trigonometrically expanded values
of the input and output vectors respectively, and W_p(k) denotes the weight vector of the forward
and backward structures of the FLANN model. When the n-th input sample x(n) is applied,
the input and output shift register contents of Fig. 6.2 are [x(n), x(n−1), …, x(n−T1+1)]
and [y_n(k−1), y_n(k−2), …, y_n(k−T2)] respectively, where T1 and T2 represent the number of
input and output nodes. Each feed-forward input sample x(n) is nonlinearly expanded using a
trigonometric expansion scheme such as

[x(n), sin{2π x(n)}, cos{2π x(n)}, sin{3·2π x(n)}, cos{3·2π x(n)}, …, sin{(2s−1)·2π x(n)}, cos{(2s−1)·2π x(n)}]

The symbol s represents the number of sine or cosine expansions of each input sample.
The main motivation for employing nonlinear expansions of the input samples is to create a
nonlinear mapping using an adaptive linear combiner, which is expected to improve the
identification of nonlinear dynamic systems; this single layer adaptive architecture
effectively substitutes for a multilayer artificial neural network. In the same way,
each feedback sample y_n(k−1) is expanded into the same number of trigonometric terms. In
the present investigation the trigonometric expansion is chosen as it is observed to perform
better than power series based expansions [6.50]. L1 and L2 represent the number of
expanded terms for the feed-forward and feedback inputs respectively, where L1 = T1 × (2s + 1)
and L2 = T2 × (2s + 1). The total number of expanded terms thus becomes L = L1 + L2.
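As an aside, the expansion is straightforward to implement. The following is a minimal Python sketch (our own illustration, not thesis code; trig_expand and flann_input are hypothetical names) that builds the expanded vector; the model output of (6.3) is then just the inner product of this vector with the weight-particle W_p(k).

import numpy as np

def trig_expand(u, s):
    # Expand one sample u into 2s+1 terms:
    # [u, sin(2*pi*u), cos(2*pi*u), ..., sin((2s-1)*2*pi*u), cos((2s-1)*2*pi*u)]
    terms = [u]
    for m in range(1, s + 1):
        arg = (2 * m - 1) * 2 * np.pi * u
        terms += [np.sin(arg), np.cos(arg)]
    return np.array(terms)

def flann_input(x_taps, y_taps, s):
    # Build the expanded vector Z_n(k): T1 feed-forward taps and T2 feedback
    # taps, each expanded to (2s+1) terms, giving L = (T1+T2)(2s+1) terms.
    z = [trig_expand(u, s) for u in list(x_taps) + list(y_taps)]
    return np.concatenate(z)

With this sketch, the model output would be computed as np.dot(flann_input(x_taps, y_taps, s), W_p).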
The error produced at the n-th input sample, k-th generation and for the p-th particle is
given by

e_{n,p}(k) = y_n(k) − ŷ_{n,p}(k)                                          (6.5)

where y_n(k) represents the output of the nonlinear dynamic plant corresponding to the n-th
input sample at the k-th generation. This plant output serves as the training signal for all
particles of the proposed model.
In the first part of the investigation, the mean square error defined in (6.6) is computed for
each particle and PSO is used to minimize this CF by iteratively changing the
weight-particles of the adaptive identification model:

E_p(k) = (1/N) Σ_{n=1}^{N} e_{n,p}²(k),   for p = 1, 2, …, P and k = 1, 2, …, K      (6.6)
Fig. 6.2 Identification of nonlinear dynamic plants using FLANN architecture and PSO based robust CF minimization (block diagram: the feed-forward taps x(n), …, x(n−T1+1) and feedback taps y_n(k−1), …, y_n(k−T2) are trigonometrically expanded into z_{1,n}, …, z_{L,n}, weighted by w_{1,p}(k), …, w_{L,p}(k) and summed to give ŷ_{n,p}(k); the plant output y_n(k) = y′_n(k) plus outliers at random locations supplies the error e_{n,p}(k) to the PSO based weight update rule driven by robust CF minimization)
6.3 Weight update of FLANN model by squared error minimization using PSO
The initial (k = 0) position vector of each particle is a single weight vector W_p(0) as
defined in (6.4), formed by combining the feed-forward and feedback weight vectors of the
FLANN model. This initial set of vectors is obtained by generating random numbers lying
between −0.5 and +0.5. The corresponding initial velocity vector associated with the
position vector of each particle is likewise taken as random numbers between −0.5 and +0.5
and is denoted ΔW_p(0). For each particle, N input samples are applied successively and the
corresponding initial squared error norm E_p(0) is computed using (6.3), (6.5) and (6.6).
The initial best position vector of a particle is its own position vector, that is,
W_{bp}(0) = W_p(0), and the associated cost function of each particle is E_p(0), 1 ≤ p ≤ P.
The initial potentiality of the p-th particle is represented by the twin parameters
{W_{bp}(0), E_p(0)}. The initial global best position W_g(0) is obtained by comparing all
W_{bp}(0) and choosing the one which provides the minimum E_p(0). In the next generation
the velocity and position of each particle are updated as
ΔW_p(k+1) = α ΔW_p(k) + c1 ∗ R1(k)[W_{bp}(k) − W_p(k)] + c2 ∗ R2(k)[W_g(k) − W_{bp}(k)]      (6.7)

W_p(k+1) = W_p(k) + ΔW_p(k+1)                                             (6.8)
where
ΔW_p(k) = velocity (rate of change of position) of the p-th particle vector at the k-th generation of PSO,
W_p(k) = position vector of the p-th weight-particle at the k-th generation,
W_{bp}(k) = best position vector of the p-th particle, i.e. the one which yields the best fitness value up to the k-th generation,
W_g(k) = best position vector among all the particles in the population achieved up to the k-th generation.
c1 and c2 are positive constants, and R1 and R2 are two vectors of random numbers, each
element lying in the range 0 to 1. The second and third terms of (6.7) represent the self
thinking of a particle and the social collaboration among particles respectively. α is the
inertia weight, which balances between global and local
searches. It may be a fixed positive constant or a time varying one. By linearly decreasing α
from a relatively large value to a small value over succeeding generations, the particle has
greater global search ability at the beginning of the search and greater local search ability
towards the last generations [6.39].
To obtain the best position of a weight-particle at the k-th generation, the two successive
fitness values E_p(k−1) and E_p(k) are compared and the weight vector associated with the
minimum fitness value is selected as the personal best position vector W_{bp}(k). This process
is repeated for all particles. The global best weight vector W_g(k) is then selected as the
personal best weight vector associated with the minimum fitness value. The process is
repeated over many generations until the fitness function defined in (6.6) attains its lowest
possible value. The corresponding global best weight-particle W_g(k) provides the desired
solution; this weight vector of the FLANN model generates an output which is in close
agreement with the plant output.
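The update of (6.7)-(6.8) vectorizes naturally over the whole swarm. Below is a minimal Python sketch, not the thesis code; note that (6.7) as printed places W_{bp}(k) in the social term, whereas the sketch follows the conventional global-best form about W_p(k), which matches the "social collaboration among particles" description above. The inertia schedule in the closing comment implements the linear decrease just described; the endpoint values 0.9 and 0.4 are a common choice, not given in the text.

import numpy as np

def pso_step(W, V, W_pbest, W_gbest, alpha, c1=1.042, c2=1.042, v_max=1.0):
    # One generation of the velocity/position update, cf. (6.7)-(6.8).
    # W, V, W_pbest are (P, L) arrays (P particles, L weights); W_gbest is (L,).
    P, L = W.shape
    R1 = np.random.rand(P, L)
    R2 = np.random.rand(P, L)
    V = alpha * V + c1 * R1 * (W_pbest - W) + c2 * R2 * (W_gbest - W)
    V = np.clip(V, -v_max, v_max)      # velocity limiting with v_max
    return W + V, V

# Linearly decreasing inertia weight over K generations, e.g. from 0.9 to 0.4:
# alpha_k = 0.9 - (0.9 - 0.4) * k / K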
6.4 Development of robust identification and prediction models using PSO based training with robust norm minimization
Three robust cost functions reported in the literature are used in the development of the
identification model. PSO is then used to iteratively minimize these norms of the error
terms obtained from the model, and hence the resulting identification model is expected to
be robust. These cost functions are defined as follows.
(a) Robust Cost Function-1 (Wilcoxon Norm) [6.1, 6.2]
A score function is first defined as an increasing function φ(u): [0, 1] → ℜ such that

∫₀¹ φ²(u) du < ∞                                                          (6.9)

The score function has the characteristics

∫₀¹ φ(u) du = 0  and  ∫₀¹ φ²(u) du = 1                                    (6.10)

The score associated with the score function φ is defined as

a_φ(i) = φ(i/(l + 1)),   i = 1, 2, …, l                                   (6.11)

where l is a fixed positive integer. From (6.10) it may be observed that
a_φ(1) ≤ a_φ(2) ≤ … ≤ a_φ(l). The Wilcoxon norm [6.1, 6.2] is a pseudo-norm on ℜ^l and is
defined as

C1 = Σ_{i=1}^{l} a(R(v_i)) v_i = Σ_{i=1}^{l} a(i) v_(i),   v = [v1, v2, …, vl]^T ∈ ℜ^l      (6.12)

where R(v_i) denotes the rank of v_i among v1, v2, …, vl; v_(1) ≤ v_(2) ≤ … ≤ v_(l) are the
ordered values of v1, v2, …, vl; and a(i) = φ[i/(l + 1)]. Different types of score functions
have been dealt with in statistics, but the commonly used one is given by
φ(u) = √12 (u − 0.5).
(b) Robust Cost Function-2 [6.47]
It is defined as

C2 = σ(1 − exp(−e²/2σ))                                                   (6.13)

where σ is a parameter to be adjusted during training and e² is the mean square error
defined in (6.6).
(c) Robust Cost Function-3 (Mean Log Squared Error) [6.44]
The third cost function is defined as

C3 = log(1 + e²/2)                                                        (6.14)

where e² is defined in (6.6).
The weight update of the identification model of Fig. 6.2 is carried out by minimizing the
cost functions of the errors defined in (6.12), (6.13) and (6.14) using the PSO algorithm. In
this approach the steps outlined in (6.3) to (6.5) remain the same; the subsequent steps are
detailed as follows.
Let the error vector of the p-th particle at the k-th generation, due to the application of N
input samples to the model, be represented as [e_{1,p}(k), e_{2,p}(k), …, e_{N,p}(k)]^T. The
errors are arranged in increasing order, from which the rank R{e_{n,p}(k)} of each n-th error
term is obtained. The score associated with each rank of the error term is evaluated as
a(i) = √12 (i/(N + 1) − 0.5)                                              (6.15)
where i, 1 ≤ i ≤ N, denotes the rank associated with each error term. At the k-th generation
of each p-th particle the Wilcoxon norm is then calculated as

C_p(k) = Σ_{i=1}^{N} a(i) e_{i,p}(k)                                      (6.16)
Similarly, the other two CFs are computed using (6.13) and (6.14). The steps involved in the
first and second generations of the PSO based minimization of these CFs are detailed in
Figs. 6.3 and 6.4 respectively; the computations required in subsequent generations are
repetitions of the steps outlined in Fig. 6.4. The learning strategy described in Figs. 6.3 and
6.4 continues until the CF decreases to its minimum possible value. At this stage the
training is complete and the global best weight vector W_g represents the feed-forward and
feedback weights of the FLANN based model.
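For concreteness, the three cost functions can be evaluated from a particle's error vector as in the following minimal Python sketch (our own illustration; e is assumed to hold the N error terms e_{n,p}(k) of one particle, and sigma is the adjustable parameter of (6.13)):

import numpy as np

def wilcoxon_norm(e):
    # Robust CF-1, cf. (6.15)-(6.16): sort the N error terms in increasing
    # order, score the i-th rank with a(i) = sqrt(12)*(i/(N+1) - 0.5), and
    # sum score times ordered error.
    N = len(e)
    e_ordered = np.sort(e)
    i = np.arange(1, N + 1)
    a = np.sqrt(12.0) * (i / (N + 1) - 0.5)
    return np.sum(a * e_ordered)

def rcf2(e, sigma):
    # Robust CF-2, cf. (6.13); the error power is the MSE of (6.6).
    mse = np.mean(e ** 2)
    return sigma * (1.0 - np.exp(-mse / (2.0 * sigma)))

def rcf3(e):
    # Robust CF-3 (mean log squared error), cf. (6.14).
    return np.log(1.0 + np.mean(e ** 2) / 2.0)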
Fig. 6.3 Steps involved in the first generation weight update mechanism using PSO based CF minimization (block diagram: the N input samples drive the P FLANN weight-particles W_p(0), each producing errors e_{n,p}(0) and a cost C_p(0); the triples {W_p(0), W_{bp}(0), C_p(0)} are formed and the global best weight vector W_g(0) having the least C_p(0) is passed to the second generation)
Fig. 6.4 Steps involved in the 2nd generation weight-update mechanism using PSO based CF minimization (block diagram: starting from {W_p(0), W_{bp}(0), C_p(0)} and W_g(0), the position and velocity of each weight-particle are updated, the costs C_p(1) are recomputed from the errors e_{n,p}(1), the pbest pairs {W_{bp}(1), C_{bp}(1)} are selected for each particle by comparing cost functions, and the gbest W_g(1) is chosen on the basis of minimum cost)
6.5 Simulation study
In this section a simulation study is carried out to assess the identification performance of
the proposed algorithm, and results of nonlinear identification of static and dynamic plants
described by (3.3)-(3.6) are presented in the presence of 10% to 50% outliers in the desired
signal. The outliers are uniformly distributed random values within the range −1 to +1
and are added at random locations (10% to 50%) of the training samples. In these
examples, the series-parallel model is used to identify these plants. In scheme-1 the MSE is
used as the cost function, whereas in schemes-2, 3 and 4 the CFs used are C1, C2 and C3
respectively. The performance of the three proposed schemes is obtained from simulation
studies and compared with that obtained by scheme-1. For training the weights of the FLANN
model, a uniformly distributed random signal in the interval [−1, 1] is used as input. During
the testing phase, the effectiveness of the proposed models is evaluated using the test
signal

x(k) = sin(2πk/250)                                  for k ≤ 250
x(k) = 0.8 sin(2πk/250) + 0.2 sin(2πk/25)            for k > 250          (6.17)
A quantitative measure used for performance evaluation is the normalized mean square
error (NMSE) defined in [6.51] as

Γ = (1/(σ² S)) Σ_{k=1}^{S} [y(k) − ŷ(k)]²                                 (6.18)

where y(k) and ŷ(k) represent the plant and model outputs at the k-th discrete time,
respectively, and σ² denotes the variance of the plant output sequence over the S test
samples.
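Both the test signal (6.17) and the NMSE measure (6.18) translate directly into code. A minimal Python sketch (our own illustration, with function names of our choosing) is:

import numpy as np

def test_signal(S=500):
    # Test input of (6.17): one sinusoid up to k = 250, a two-tone mixture after.
    k = np.arange(1, S + 1)
    return np.where(k <= 250,
                    np.sin(2 * np.pi * k / 250),
                    0.8 * np.sin(2 * np.pi * k / 250)
                    + 0.2 * np.sin(2 * np.pi * k / 25))

def nmse_db(y, y_hat):
    # NMSE of (6.18) in dB: squared error normalized by S and the variance
    # of the plant output sequence (the table reports 10*log10 of Gamma).
    y, y_hat = np.asarray(y), np.asarray(y_hat)
    gamma = np.sum((y - y_hat) ** 2) / (len(y) * np.var(y))
    return 10.0 * np.log10(gamma)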
Identification of Static Systems
Identification of two different static plants is carried out through simulation experiments.
Example 1: f1(x) = x³ + 0.3x² − 0.4x                                      (6.19)

Example 2: f2(x) = 0.6 sin(πx) + 0.3 sin(3πx) + 0.1 sin(5πx)              (6.20)
Fig. 6.5 shows the desired signal of Example-1 with 50% outliers added to it within the
range (−1, 1). The input to the model is expanded into eleven trigonometric terms to get
the best identification results in all schemes. The number of connecting weights including
the threshold is twelve; these are updated using the four different schemes. The parameters
used in the study are: number of particles = 30, number of input samples = 200,
c1 = c2 = 1.042 and v_max = 1. Simulation is carried out using 10% to 50% outliers in
the training samples, but the results shown in Figs. 6.6(a)-(d) are for 50% outliers only. It is
evident from these plots that the scheme-2 based model provides accurate response
matching in the presence of 50% outliers whereas the scheme-1 based model exhibits poor
identification performance. In both examples, the NMSE obtained from scheme-2, listed in
Table 6.1, is much lower than that obtained from the scheme-1 model.
Fig. 6.5 Plot of desired signal with 50% outliers used in Example-1 (plot: actual signal and outliers versus discrete time, 0-200 samples)
(a) Scheme-2 learning with 50% outliers. (b) Scheme-1 learning with 50% outliers. (c) Scheme-2 learning with 50% outliers. (d) Scheme-1 learning with 50% outliers.
Fig. 6.6 Response matching of static systems ((a) and (b) for Example 1 and (c) and (d) for Example 2; each panel plots the plant and model outputs)
Identification of SISO Dynamic Systems
Example 3: The difference equation of the plant is

y(k+1) = 0.3 y(k) + 0.6 y(k−1) + g[x(k)]                                  (6.21)

The linear parameters are 0.3 and 0.6, and the two unknown nonlinearities g_i(·) used in the
study are

g1(x) = 0.5 sin³(πx) − 2.0/(x³ + 2.0) − 0.1 cos(4πx) + 1.125              (6.22)

g2(x) = 0.6 sin(πx) + 0.3 sin(3πx) + 0.1 sin(5πx)                         (6.23)
To identify the plant, a series-parallel model described by (6.24) is used:

ŷ(k+1) = 0.3 y(k) + 0.6 y(k−1) + N[x(k)]                                  (6.24)

The term N[x(k)] represents a FLANN model trained using the various schemes. The input is
expanded to 14 terms using trigonometric expansions and the PSO algorithm is used to
update the connecting weights. The parameters used are: number of particles = 30, number of
input samples = 200, c1 = c2 = 1.042 and v_max = 1. The results of identification of (6.21) with
the nonlinear functions defined in (6.22) and (6.23), in the presence of 50% and 40% outliers,
are shown in Figs. 6.7(a), (b) and Figs. 6.8(a), (b) respectively. It is observed that the FLANN
with scheme-2 learning exhibits robust performance compared to that offered by the model
using scheme-1. This is also supported by the NMSE listed in Table 6.1.
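To make the series-parallel arrangement concrete, a minimal Python sketch of (6.21) and (6.24) is given below (our own illustration, not the thesis code). Note that the model reuses the true plant outputs y(k) and y(k−1) during training, so only the nonlinearity g(·) has to be approximated.

import numpy as np

def example3_plant(x, g):
    # Plant of (6.21): y(k+1) = 0.3 y(k) + 0.6 y(k-1) + g(x(k)), zero initial state.
    y = np.zeros(len(x) + 1)
    for k in range(1, len(x)):
        y[k + 1] = 0.3 * y[k] + 0.6 * y[k - 1] + g(x[k])
    return y

def series_parallel_output(x, y, flann):
    # Series-parallel model of (6.24): the *plant* outputs y(k), y(k-1) are
    # fed back; only the nonlinearity is approximated by the callable flann.
    y_hat = np.zeros_like(y)
    for k in range(1, len(x)):
        y_hat[k + 1] = 0.3 * y[k] + 0.6 * y[k - 1] + flann(x[k])
    return y_hat

Here flann stands for the trained FLANN mapping (a hypothetical callable); for a quick sanity check one can pass the known g2 of (6.23) in its place.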
(a) Using Scheme-2 learning with 50% outliers. (b) Using Scheme-1 learning with 50% outliers.
Fig. 6.7 Comparison of response of the dynamic plant of Example 3 using the nonlinearity defined in (6.22)
(a) Using Scheme-2 learning with 40% outliers. (b) Using Scheme-1 learning with 40% outliers.
Fig. 6.8 Comparison of response of the dynamic plant of Example 3 using the nonlinearity defined in (6.23)
Example 4: In this example the plant to be identified is of Model-2 type and is represented
by the difference equation

y(k+1) = f[y(k), y(k−1)] + x(k)                                           (6.25)

The unknown nonlinearity associated with the plant is given by

f(y1, y2) = y1 y2 (y1 + 2.5)(y1 − 1.0) / (1.0 + y1² + y2²)                (6.26)

In this case the series-parallel scheme of the model is given by
ŷ(k+1) = N[y(k), y(k−1)] + x(k)                                           (6.27)

The two inputs are expanded into 12 terms, and scheme-1 and scheme-2 based training are
used. The parameters used are: number of particles = 30, number of input samples = 500,
c1 = c2 = 1.042 and v_max = 1. The responses obtained from the plant and the two models
are shown in Figs. 6.9(a) and (b). These figures and Table 6.1 indicate that scheme-2
provides robust identification compared to that offered by the scheme-1 method.
(a) Using Scheme-2 learning with 50% outliers. (b) Using Scheme-1 learning with 50% outliers.
Fig. 6.9 Comparison of response of the dynamic plant of Example 4
Example 5: In this case the plant is of Model-3 type and is given by the difference
equation
y(k+1) = f[y(k)] + g[x(k)]                                                (6.28)

where the unknown nonlinear functions f(·) and g(·) are represented as

f(y) = y(y + 0.3)/(1.0 + y²)                                              (6.29)

g(x) = x(x + 0.8)(x − 0.5)                                                (6.30)

The model is represented by a series-parallel scheme

ŷ(k+1) = N1[y(k)] + N2[x(k)]                                              (6.31)
where N1 and N2 represent FLANN models with scheme-1 or scheme-2 training; the N1 and N2
structures contain seven and five expansion terms respectively. For PSO the parameters
used are: number of particles = 30, number of input samples = 500, c1 = c2 = 1.042 and
v_max = 1. The responses obtained from the plant and the models are compared in
Figs. 6.10(a) and (b) and the computed NMSE is presented in Table 6.1. These results also
indicate the superior performance of the proposed scheme-2 technique over its scheme-1
counterpart.
(a) Using Scheme-2 learning with 50% outliers. (b) Using Scheme-1 learning with 50% outliers.
Fig. 6.10 Comparison of response of the dynamic plant of Example 5
Example 6: The plant in this case is of the Model-4 category and is described by the
difference equation

y(k+1) = f[y(k), y(k−1), y(k−2), x(k), x(k−1)]                            (6.32)

where the unknown nonlinear function f is given by

f[a1, a2, a3, a4, a5] = [a1 a2 a3 a5 (a3 − 1.0) + a4] / (1.0 + a2² + a3²)      (6.33)

The series-parallel model used for identification of this plant is given as

ŷ(k+1) = N[y(k), y(k−1), y(k−2), x(k), x(k−1)]                            (6.34)
In the case of the scheme-1 and scheme-2 models, the input is expanded to six terms and the
output to nine terms. Figs. 6.11(a)-(b) show the comparative output responses of the two
models. The simulation results again indicate that the identification performance is best in
the proposed model, as is evident from the comparison of NMSE shown in Table 6.1.
(a) Using Scheme-2 learning with 40% outliers. (b) Using Scheme-1 learning with 40% outliers.
Fig. 6.11 Comparison of response of the dynamic plant of Example 6
Example 7: Identification of Box-Jenkins System
A total of 296 input-output sample pairs are generated with a sampling period of 9 s. The
gas combustion process has one input variable, the gas flow x(k), and one output variable,
the concentration of CO2, y(k). The output y(k) is influenced by the past samples
y(k−1), y(k−2), y(k−3) and x(k−1). Uniformly distributed random values between
[−3, 3] are added at randomly chosen locations (10% to 50%) of the desired samples.
Figs. 6.12(a) and (b) display the actual and estimated values obtained by using the scheme-2
and scheme-1
methods of training, respectively. It is evident from these figures that scheme-2 provides
better identification performance than the scheme-1 based method in the presence of
strong outliers in the training signal.
(a) Using Scheme-2 learning with 50% outliers. (b) Using Scheme-1 learning with 50% outliers.
Fig. 6.12 Output response matching of Example 7
Example 8: Prediction of Mackey-Glass Time Series
The Mackey-Glass system (MGS) is a standard benchmark system for identification. It is a
chaotic time series generated by solving the time-delay differential equation

dx(t)/dt = −b x(t) + a x(t − τ)/(1 + x(t − τ)¹⁰)                          (6.35)
The MG series is periodic for τ < 17 and non-periodic otherwise. The initial values are taken
as random values and the differential equation is solved using Euler's method. A set of 1100
samples is generated with b = 0.9, a = 0.2 and τ = 30. The first 100 samples are discarded due
to their random nature. Out of the remaining 1000 samples, 800 are used as training data and
the remaining 200 as test samples. The model of the system can be represented by

x(t + p) = f{x(t), x(t − τ), x(t − 2τ), …, x(t − (N − 1)τ)}               (6.36)

where p = 4 and N = 4.
Thus x(t), x(t − τ), x(t − 2τ) and x(t − 3τ) are used as the inputs and x(t + p) as the output.
The training data set is corrupted by adding random values drawn from a uniform distribution
over [−15, 15] to the uncorrupted data set. Simulation is carried out in the presence of 10%
to 50% outliers in the training signal. The result of response matching is shown in
Figs. 6.13(a) and (b) for 50% outliers only. From these figures it is observed that the
scheme-2 based model identifies the system correctly in the presence of 50% outliers in the
training signal, whereas the scheme-1 based model fails to identify the system.
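The sample generation step can be sketched in Python as follows (our own illustration; the initial-history range and seed handling are assumptions, and the parameter defaults follow the values quoted above):

import numpy as np

def mackey_glass(n_samples=1100, a=0.2, b=0.9, tau=30, seed=0):
    # Euler integration (unit step) of (6.35):
    # dx/dt = -b*x(t) + a*x(t - tau) / (1 + x(t - tau)**10)
    rng = np.random.default_rng(seed)
    x = np.empty(n_samples + tau)
    x[:tau] = rng.uniform(0.1, 1.3, tau)   # random initial history (assumed range)
    for t in range(tau, n_samples + tau - 1):
        x_tau = x[t - tau]
        x[t + 1] = x[t] + (-b * x[t] + a * x_tau / (1.0 + x_tau ** 10))
    return x[tau:]   # the text then discards the first 100 of these 1100 samples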
(a) Using Scheme-2 learning with 50% outliers. (b) Using Scheme-1 learning with 50% outliers.
Fig. 6.13 Output response matching of Example 8
Example 9: Prediction of Sunspot Time Series
The series consists of 288 data points of yearly averaged sunspot numbers from the year
1700 to the year 1987. The sunspot problem is a typical time series prediction problem in
which the sunspot number for the following year is to be predicted from the data of past
years. Out of the 288 data points, the first 225 are used for training and the remaining 63
for testing. The training data set is corrupted by adding random values drawn from a uniform
distribution over [−15, 15] to the uncorrupted data. Simulation is carried out in the presence
of 10% to 50% outliers in the training signal. The response matching of the system with 40%
outliers and the NMSE obtained for 10% to 40% outliers are given in Figs. 6.14(a) and (b).
The identification performance of scheme-1 is severely degraded at 40% outliers. It is clearly
observed that the scheme-2 model performs better than the scheme-1 based model in all
cases in the presence of outliers.
(a) Using Scheme-2 learning with 40% outliers. (b) Using Scheme-1 learning with 40% outliers.
Fig. 6.14 Output response matching of Example 9
Table 6.1
Comparison of NMSE obtained in Example-1 to Example-6 from models using the three robust
cost functions and the conventional MSE CF

                                        NMSE (in dB)
Example No.     % of outliers    RCF-1     RCF-2     RCF-3     MSE
1               10              -19.85    -17.10    -14.05    -12.58
                20              -28.86    -18.23    -10.91     -9.64
                30              -26.43    -16.74    -11.63    -10.14
                40              -27.42    -15.51    -10.56     -9.33
                50              -24.03    -16.69    -11.59     -8.67
2               10              -34.52    -22.98    -19.38    -19.06
                20              -42.46    -23.49    -16.91    -15.88
                30              -36.64    -21.14    -17.89    -16.52
                40              -35.78    -23.80    -16.82    -15.77
                50              -37.10    -22.29    -15.70    -14.91
3 with (6.22)   10              -26.44    -21.88    -21.78    -20.80
                20              -32.87    -22.65    -21.50    -20.39
                30              -34.82    -21.62    -20.51    -19.70
                40              -30.56    -21.99    -20.98    -19.87
                50              -27.79    -20.83    -20.19    -19.23
3 with (6.23)   10              -38.51    -27.98    -19.92    -19.15
                20              -42.70    -27.80    -20.41    -19.27
                30              -38.45    -26.23    -20.51    -19.58
                40              -52.52    -25.84    -20.93    -20.55
                50              -37.92    -25.97    -20.63    -20.19
4               10              -23.82    -22.87    -22.00    -21.43
                20              -24.97    -24.14    -23.04    -20.48
                30              -28.55    -24.84    -24.09    -21.80
                40              -26.21    -21.49    -21.04    -19.71
                50              -22.56    -18.55    -18.01    -18.31
5               10              -14.73     -9.36     -8.37     -4.07
                20              -14.49    -13.77    -11.45     -9.74
                30              -15.65    -14.76    -10.58     -8.30
                40              -17.39    -14.97     -9.12     -4.87
                50              -23.82     -8.23     -5.11     -4.53
6               10              -16.37    -16.86    -16.53    -16.53
                20              -16.36    -11.91     -7.35     -5.07
                30              -13.18    -12.42     -9.74     -8.71
                40              -13.80    -13.26     -7.91     -5.53
                50              -13.53    -12.23    -11.26    -11.28
6.6 Conclusion
In this chapter a novel method for robust identification of nonlinear dynamic systems using
a low complexity single layer FLANN model has been proposed. The robust identification
task is formulated as the optimization of RCFs of the error terms of the model, and the
connecting weights of the FLANN model are iteratively adjusted using the PSO technique to
achieve this objective. The proposed technique is robust in the sense that it provides excellent
identification performance for complex plants even when the training signal of the model
contains up to 50% outliers. The robust performance of the model using the different RCFs
is demonstrated through simulation of a wide variety of benchmark examples. The
introduction of the new CFs in the model and the PSO based minimization of these CFs
contribute to robust and improved performance compared to the standard squared error
norm based model. Further, comparison of the identification performance between the models
using different RCFs indicates that the second scheme, which employs the Wilcoxon norm as
the CF, outperforms the other three schemes. Robust identification performance using the
Wilcoxon norm is also observed for the Mackey-Glass, Box-Jenkins and sunspot time
series when strong outliers are present in the training set.
References
[6.1] Joseph W. McKean, “Robust analysis of Linear models”, Statistical Science, vol. 19,
no. 4, pp. 562-570, 2004.
[6.2] Jer-Guang Hsieh, Yih-Lon Lin and Jyh-Horng Jeng, “Preliminary study on Wilcoxon
learning machines”, IEEE Trans. on neural networks, vol. 19, no. 2, pp. 201-211, Feb.
2008.
[6.3] S. Haykin, Neural Networks, Ottawa, ON, Canada: Maxwell Macmillan, 1994.
[6.4] P. S. Sastry, G. Santharam and K. P. Unnikrishnan, “Memory neural networks for
identification and control of dynamical systems”, IEEE Trans. Neural Networks, vol. 5, pp.
306-319, 1994.
[6.5] A. G. Parlos, K. T. Chong and A. F. Atiya, “Application of recurrent multilayer
perceptron in modeling of complex process dynamics”, IEEE Trans. Neural Networks,
vol. 5, pp. 255-266, 1994.
[6.6] K. S. Narendra and K. Parthasarathy, “Identification and control of dynamical systems
using neural networks”. IEEE Trans. on neural networks, vol. 1, no. 1, pp. 4-27, 1990.
[6.7] D. H. Nguyen and B. Widrow, “Neural networks for self-learning control system”.
Int. J. Contr., vol. 54, no. 6, pp. 1439-1451, 1991.
[6.8] G. Cembrano, G. Wells, J. Sarda and A. Ruggeri, “Dynamic control of a robot arm
based on neural networks”, Contr. Eng. Practice, vol 5, no. 4, pp. 485-492, 1997.
[6.9] S. Chen, S. A. Billings and P. M. Grant, “Recursive hybrid algorithm for nonlinear
system identification using radial basis function networks”, Int. J. Contr., vol. 55, no. 5, pp.
1051-1070, 1992.
[6.10] S. V. T. Elanayar and Y. C. Shin, “Radial basis function neural network for
approximation and estimation of nonlinear stochastic dynamic systems”, IEEE Trans.
Neural Network, vol. 5, pp. 594-603, 1994.
[6.11] Q. Zhang and A. Benveniste, “Wavelet networks”. IEEE Trans. Neural Networks,
vol. 3, pp. 889-898, 1992.
[6.12] Y. C. Pati and P. S. Krishnaprasad, “Analysis and synthesis of feed forward neural
networks using discrete affine wavelet transforms”, IEEE Trans. Neural Networks, vol. 4,
pp. 73-85, 1993.
[6.13] J. C. Patra, R. N. Pal, B. N. Chatterji and G. Panda, “Identification of nonlinear
dynamic systems using functional link artificial neural networks”, IEEE Trans. in Systems,
Man and Cybernetics-Part B, vol. 29, no. 2, pp. 254-262, 1999.
[6.14] J. C. Patra and A. C. Kot, “Nonlinear dynamic system identification using Chebyshev
functional link artificial neural networks”, IEEE Trans. in Systems, Man and Cybernetics-Part B, vol. 32, no. 4, pp. 505-511, 2002.
[6.15] H. Yoshida, K. Kawata, Y. Fukuyama, S. Takayama, and Y. Nakanishi, “A particle
swarm optimization for reactive power and voltage control considering voltage security
assessment,” IEEE Trans. Power Syst., vol. 15, no. 4, pp. 1232–1239, Nov. 2000.
[6.16] B. Zhao, C. Guo, and Y. J. Cao, “A multiagent-based particle swarm optimization
approach for optimal reactive power dispatch,” IEEE Trans. Power Syst., pp. 1070–1078,
May 2005.
[6.17] J. Park, K. Lee, J. Shin, and K. Y. Lee, “A particle swarm optimization for economic
dispatch with nonsmooth cost functions,” IEEE Trans.Power Syst., vol. 20, no. 1, pp. 34–
42, Feb. 2005.
[6.18] T. Aruldoss, A. Victoire, and A. Jeyakumar, “Reserve constrained dynamic dispatch
of units with valve-point effects,” IEEE Trans. Power Syst., vol. 20, no. 3, pp. 1273–1282,
Aug. 2005.
[6.19] Z. Gaing, “Particle swarm optimization to solving the economic dispatch considering
the generator constraints,” IEEE Trans. Power Syst., vol. 18, no. 3, pp. 1187–1195, Aug.
2003.
[6.20] Y-P Chen, W. C. Peng and M. C. Jian, “Particle swarm with recombination and
dynamic linkage discovery”, IEEE Trans. on System, Man and Cybernetics – Part B, vol.
37, no. 6, pp. 1460-1470, Dec. 2007.
[6.21] Kassabalidis, M. El-Sharkawi, R. Marks, L. Moulin, and A. Silva, “Dynamic security
border identification using enhanced particle swarm optimization,” IEEE Trans. Power
Syst., pp. 723–729, Aug.2002.
[6.22] S. Kannan, S. Slochanal, and N. Padhy, “Application of particle swarm optimization
technique and its variants to generation expansion problem,” Elsevier Electric Power
Syst. Res., vol. 70, no. 3, pp. 203–210, Aug. 2004.
[6.23] S. Kannan, S. Slochanal, and N. Padhy, “Application and comparison of
metaheuristic techniques to generation expansion planning problem,” IEEE Trans. Power
Syst., vol. 20, no. 1, pp. 466–475, Feb. 2005.
[6.24] A. Abido, “Optimal power flow using particle swarm optimization,” Int. J. Elect.
Power Energy Syst., vol. 24, no. 7, pp. 563–571, Oct. 2002.
[6.25] J. Vlachogiannis and K. Lee, “Determining generator contributions to transmission
system using parallel vector evaluated particle swarm optimization,” IEEE Trans. Power
Syst., vol. 20, no. 4, pp. 1765–1774, Nov. 2005.
[6.26] M. Abido, “Optimal design of power-system stabilizers using particle swarm
optimization,” IEEE Trans. Energy Conversion, vol. 17, no. 3, pp. 406–413, Sep. 2002.
[6.27] Z. Gaing, “A particle swarm optimization approach for optimum design of PID
controller in AVR system,” IEEE Trans. Energy Conversion, vol. 19, no. 2, pp. 384–391,
Jun. 2004.
[6.28] J. Chia-Feng, “A hybrid of genetic algorithm and particle swarm optimization for
recurrent network design,” IEEE Trans. Syst., Man, Cybern.,Part B: Cybern., vol. 34, no. 2,
pp. 997–1006, Apr. 2004.
[6.29] Y. Liu, X. Zhu, J. Zhang, and S.Wang, “Application of particle swarm optimization
algorithm for weighted fuzzy rule-based system,” in Proc. 30th Annu. Conf. IEEE Ind.
Electron. Soc., Nov. 2004, vol. 3, pp. 2188–2191.
[6.30] A. Esmin, G. Torres, and A. Zambroni, “A hybrid particle swarm optimization
applied to loss power minimization,” IEEE Trans. Power Syst., vol. 20, no. 2, pp. 859–866,
May 2005.
[6.31] X.Yu, X. Xiong, and Y.Wu, “A PSO-based approach to optimal capacitor placement
with harmonic distortion consideration,” Electric Power Syst. Res., vol. 71, pp. 27–33, Sep.
2004.
[6.32] C. Huang, C. J. Huang, and M. Wang, “A particle swarm optimization to identifying
the ARMAX model for short-term load forecasting,” IEEE Trans. Power Syst., vol. 20, no.
2, pp. 1126–1133, May 2005.
[6.33] J. Vlachogiannis and K. Lee, “Determining generator contributions to transmission
system using parallel vector evaluated particle swarm optimization,” IEEE Trans. Power
Syst., vol. 20, no. 4, pp. 1765–1774, Nov. 2005.
[6.34] S. H. Ling, H. H. C. Iu, K. Y. Chan, H. K. Lam, Benny C. W. Yeung and Frank H.
Leung, “Hybrid particle swarm optimization with wavelet mutation and its industrial
applications”, IEEE Trans. on System, Man and Cybernetics – Part B, vol. 38, no. 3, pp.
743-763, June 2008.
[6.35] Shinn-Ying Ho, Hung-Sui Lin, Weei-Hurng Liauh and Shinn-Jang Ho, “OPSO :
Orthogonal particle swarm optimization and its application to task assignment problems”,
IEEE Trans. on Systems, Man and Cybernetics – Part A, vol. 38, no. 2, pp. 288-298, March
2008.
[6.36] Alberto Moraglio, Cecilia Di Chio, Julian Togelius and Riccardo Poli, “Geometric
particle swarm optimization” Journal of Artificial Evolution and Applications, ID 143624,
pp. 1-14, 2008.
[6.37] L. S. Coelho, “Novel Gaussian quantum behaved particle swarm optimizer applied to
electromagnetic design”, IET Sci. Meas. Technol., vol. 1, no. 5, pp. 290-294, 2007.
[6.38] T. O. Ting, M. V. C Rao and C. K. Loo, “A novel approach for unit commitment
problem via an effective hybrid particle swarm optimization”, IEEE Trans. on Power
Systems, vol. 21, no. 1, pp. 411-418, Feb. 2006.
[6.39] J. J. Liang, A. K. Qin, Ponnuthurai Nagaratnam Suganthan and S. Baskar,
“Comprehensive learning particle swarm optimizer for global optimization of multimodal
functions”, IEEE Trans. on Evolutionary Computation, vol. 10, no. 3, pp. 281-295, June
2006.
[6.40] David S. Chen and Ramesh C. Jain, “A robust back propagation learning algorithm
for function approximation”, IEEE Trans. on Neural Networks, vol. 5, no. 3, pp. 467-479,
May 1994.
[6.41] Jerome T. Connor, R. Douglas Martin and L. E. Atlas, “Recurrent neural networks
and robust time series prediction”, IEEE Trans. on Neural Networks, vol. 5, no. 2, pp.
240-254, March 1994.
[6.42] V. David Sanchez A., “Robustization of a learning method for RBF networks”,
Neurocomputing, vol. 9, pp. 85-94, 1995.
[6.43] Kadir Liano, “Robust error measure for supervised neural network learning with
outliers”, IEEE Trans. On Neural Networks, vol. 7, no. 1, pp. 246-250, Jan. 1996.
[6.44] Wei-Yen Wang, Tsu-Tian Lee, Ching-Lang Liu and Chi-Hsu Wang, “Function
approximation using fuzzy neural networks with robust learning algorithm”, IEEE Trans.
on Systems, Man and Cybernetics-Part B : Cybernetics, vol. 27, no. 4, pp. 740-747, Aug.
1997.
[6.45] Lei Huang, Bai-Ling Zhang and Qian Huang, “Robust interval regression analysis
using neural networks”, Fuzzy Sets and Systems, vol. 97, pp. 337-347, 1998.
[6.46] Chien-Cheng Lee, Pau-Choo Chung, Jea-Rong Tsai and Chein-I Chang, “Robust
radial basis function neural networks”, IEEE Trans. on Systems, Man and Cybernetics –
Part B : Cybernetics, vol. 29, no. 6, pp. 674-685 , Dec., 1999.
[6.47] Hung-Hsu Tsai and Pao-Ta Yu, “On the optimal design of fuzzy neural networks
with robust learning for function approximation”, IEEE Trans. on Systems, Man and
Cybernetics-Part B : Cybernetics, vol. 30, no. 1, pp. 217-223, Feb. 2000.
[6.48] Chen-Chia Chuang, Shun-Feng Su and Song-Shyong Chen, “Robust TSK fuzzy
modeling for function approximation with outliers”, IEEE Trans. on Fuzzy Systems, vol. 9,
no. 6, pp. 810-821, Dec. 2001
[6.49] Chen-Chia Chuang, Shun-Feng Su, Jin-Tsong Jeng and Chih-Ching Hsiao, “Robust
support vector regression networks for function approximation with outliers”, IEEE
Trans. on Neural networks, vol. 13, no. 6, pp. 1322-1330, Nov. 2002.
[6.50] Y. H. Pao., Adaptive Pattern Recognition & Neural Networks, Reading, MA :
Addison-Wesley, 1989.
[6.51] N. A. Gershenfeld and A. S. Weigend, “The future of time series: Learning and
understanding”, in Time Series Prediction: Forecasting the future and past, Reading, MA:
Addison-Wesley, pp. 1-70, 1993.
Chapter 7
Robust Adaptive Inverse Modeling using Bacterial Foraging Optimization Technique and Applications
7.1 Introduction
The inverse model of a system having an unknown transfer function is itself a system whose
transfer function is, in some sense, a best fit to the reciprocal of the unknown transfer
function. Sometimes the inverse model response contains a delay which is deliberately
incorporated to improve the quality of the fit. In Fig. 7.1, a source signal s(n) is fed into an
unknown system that produces the input signal x(n) for the adaptive filter. The output of
the adaptive filter is subtracted from a desired response signal that is a delayed version of
the source signal, such that
d(n) = s(n − Δ)                                                           (7.1)
where Δ is a positive integer. The goal of the adaptive filter is to adjust its characteristics
such that the output signal is an accurate representation of the delayed source signal.
There are many applications of the adaptive inverse model of a system. If the system is a
communication channel, then the inverse model is an adaptive equalizer which compensates
for the effects of inter symbol interference (ISI) caused by the restriction of the channel
bandwidth [7.1]. Similarly, if the system is the model of a high density recording medium,
then its corresponding inverse model reconstructs the recorded data without distortion [7.5].
If the system represents a nonlinear sensor, then its inverse model is a compensator of
environmental as well as inherent nonlinearities [7.44]. The adaptive inverse model also
finds applications in adaptive control [7.4] as well as in deconvolution in geophysics
applications [7.3].
Fig. 7.1 Inverse modeling (block diagram: the source s(n) drives the system/plant/channel, whose output, with additive noise η(n), forms the adaptive filter input x(n); the filter output y(n) is compared with the delayed source d(n) to form the error e(n))
Channel equalization is a technique for decoding transmitted signals across nonideal
communication channels. The transmitter sends a sequence s(n) that is known to both the
transmitter and the receiver. However, in equalization, the received signal is used as the input
signal x(n) to an adaptive filter, which adjusts its characteristics so that its output closely
matches a delayed version s(n − Δ) of the known transmitted signal. After a suitable
adaptation period, the coefficients of the system either are fixed and used to decode future
transmitted messages, or are adapted using a crude estimate of the desired response signal
computed from y(n). The latter mode of operation is known as decision-directed adaptation.
Channel equalization is one of the first applications of adaptive filters and is described in the
pioneering work of Lucky [7.1]. Today it remains one of the most popular uses of an
adaptive filter: practically every computer telephone modem transmitting at rates of 9600
bits per second or greater contains an adaptive equalizer. Adaptive equalization is also useful
for wireless communication systems. Qureshi [7.2] has written an excellent tutorial on
adaptive equalization. A problem related to equalization is deconvolution, which appears in
the context of geophysical exploration [7.3].
In many control tasks, the frequency and phase characteristics of the plant hamper the
convergence behaviour and stability of the control system. An adaptive filter of the form
shown in Fig. 7.1 can be used to compensate for the nonideal characteristics of the plant and
as a method for adaptive control. In this case, the signal s(n) is the output of the controller
and the signal x(n) is the signal measured at the output of the plant. The coefficients of the
adaptive filter are then adjusted so that the cascade of the plant and the adaptive filter is
nearly represented by the pure delay z^(−Δ). Details of adaptive algorithms applied to control
tasks in this fashion can be found in [7.4].
Transmission and storage of high density digital information play an important role in the
present age of information technology. Digital information obtained from audio, video or
text sources needs high density storage or transmission through communication channels.
Communication channels and recording media are often modeled as band-limited
channels whose impulse response is that of an ideal low pass filter. When a
sequence of symbols is transmitted or recorded, the low pass filtering of the channel distorts
the transmitted symbols over successive time intervals, causing symbols to spread and
overlap with adjacent symbols. This linear distortion is known as inter symbol
interference. In addition, nonlinear distortion is caused by cross talk in the channel and the
use of amplifiers. In a data storage channel, the binary data are stored in the form of tiny
magnetized regions called bit cells, arranged along the recording track. At read back, noise
and nonlinear distortions (ISI) corrupt the signal. An ANN based equalization technique
has been proposed [7.5] to alleviate the ISI present during read back from the magnetic
storage channel. Recently, Sun et al. have reported [7.6] an improved Viterbi detector to
compensate for the nonlinearities and media noise. Thus adaptive channel equalizers play an
important role in recovering digital information from digital communication channels and
storage media. Preparata had suggested [7.7] a simple and attractive scheme for
dispersal recovery of digital information based on the discrete Fourier transform.
Subsequently, Gibson et al. have reported [7.8] an efficient nonlinear ANN structure for
reconstructing digital signals which have passed through a dispersive channel and been
corrupted by additive noise. In a recent publication [7.9] the authors have proposed optimal
preprocessing strategies for perfect reconstruction of binary signals from dispersive
communication channels. Touri et al. have developed [7.10] a deterministic worst-case
framework for perfect reconstruction of discrete data transmitted through a dispersive
communication channel. In the recent past, new adaptive equalizers have been suggested using
soft computing tools such as the artificial neural network (ANN), the polynomial perceptron
network (PPN) and the functional link artificial neural network (FLANN) [7.11]. It is
reported that these methods are well suited for nonlinear and complex channels. Recently,
the Chebyshev artificial neural network has also been proposed for nonlinear channel
equalization [7.12]. The drawback of these methods is that the estimated weights may be
trapped at local minima during training.
For this reason the genetic algorithm (GA) has been suggested for training adaptive channel
equalizers [7.13]. The main attraction of GA lies in the fact that it does not rely on
Newton-like gradient-descent methods, and hence there is no need for the calculation of
derivatives. This makes it less likely to be trapped in local minima. However, only two
operators of GA, crossover and mutation, help to avoid the local minima problem, and there
are still situations in which the weights in GA based optimization are trapped at local minima.
In recent years bacterial foraging optimization (BFO) has been proposed [7.14] and has been
applied to harmonic estimation of power system signals [7.15], adaptive inverse modeling [7.16],
image segmentation [7.17], image filtering [7.18], optimal power flow [7.19], economic load
dispatch [7.20-7.21], parameter estimation [7.22], independent component analysis [7.23],
recognition of handwriting [7.24], tuning of power system stabilizers [7.25], controller
optimization [7.26], design of multiple optimal power system stabilizers [7.27], optimization of
the coefficients of PI controllers [7.28-7.30], optimization in dynamic environments [7.31],
Hammerstein model identification [7.32], D-STATCOM [7.33], function minimization and
control [7.34], load compensation [7.35] and load forecasting [7.36]. A hybrid GA and BFO algorithm
has been developed for global optimization in [7.37]. Adaptation of the run length unit using a
Takagi-Sugeno fuzzy scheme based on the minimum value of the cost function has been reported in
[7.15]. The chemotactic step size has been made adaptive to accelerate the convergence speed near
the optima [7.38], and a mathematical analysis of the reproduction operator is reported in [7.39].
BFO is a useful alternative to GA and requires a smaller number of computations. In addition, BFO
is also a derivative free optimization technique. The number of parameters that are used for
searching the total solution space is higher in BFO than in GA; hence the
possibility of avoiding local minima is higher in BFO. In this scheme, the foraging behaviour
(the methods for locating, handling and ingesting food) of the E. coli bacteria present in our
intestines is mimicked.
In the case of derivative free algorithms, conventionally the mean square error (MSE) is used as
the fitness or cost function. The use of MSE as the cost function leads to improper training of the
adaptive model when outliers are present in the desired signal. Traditional regressors
employ a least squares fit which minimizes the Euclidean norm of the errors, while a robust
estimator is based on a fit which minimizes a rank based norm called the Wilcoxon norm [7.40].
It is known in statistics that linear regressors developed using the Wilcoxon norm are robust
against outliers. Using this norm, new robust machines have recently been proposed for the
approximation of nonlinear functions [7.41]. In the present investigation we develop a new method
of robust inverse modeling of complex nonlinear channels and systems by minimizing robust cost
functions (RCFs) [7.41], [7.42] and [7.43] of the errors of the model using the derivative free
BFO technique. The performance of the new method is evaluated through a simulation study and is
compared with the results obtained from the corresponding error square norm based BFO technique.
7.2 Data recovery by adaptive channel equalization
Reading high density data from a recording medium or recovering binary data
from a noisy digital channel needs ISI compensation. This is achieved by employing the
adaptive inverse model shown in Fig. 7.2. The transmitted symbols are represented as x(k)
at time instant k. They are then passed into the channel model, which may be linear or
nonlinear. An FIR filter is used to model a linear channel, whose output at time instant k
may be written as
y(k) = Σ_{i=0}^{N−1} w(i) x(k − i)                                        (7.2)
where w(i) are the channel tap values and N is the length of the FIR system or channel.
The “NL” block represents the nonlinear distortion of the symbols in the channel and its
output may be expressed as

z(k) = ψ(x(k), x(k − 1), …, x(k − N + 1); w(0), w(1), …, w(N − 1))        (7.3)
where ψ(·) is some nonlinear function generated by the “NL” block. The channel output
z(k) is corrupted by additive white Gaussian noise q(k) of variance σ², so that the received
or recorded signal is r(k) = z(k) + q(k). The received signal r(k) is then passed into the
digital channel equalizer to produce x̂(k), which recovers the transmitted symbol x(k).
Starting from initial tap values (at t = 0, w(i) = 0), the weights are updated until the cost
function Σ_{k=1}^{N} e²(k) is minimized, where N is the number of input samples used for
training and e(k) = x_d(k) − x̂(k). The minimization of this cost function is performed
iteratively by the BFO scheme dealt with in Chapter 2.
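As a concrete illustration of (7.2), (7.3) and the noise model, the following Python sketch generates the received signal of Fig. 7.2 (our own code; the tanh nonlinearity and the SNR-based noise scaling are assumptions, since ψ(·) is left general in the text):

import numpy as np

def channel_data(n, w, nl=np.tanh, snr_db=20, seed=0):
    # Bipolar symbols x(k), FIR channel (7.2), a memoryless nonlinearity
    # standing in for the 'NL' block of (7.3), and additive Gaussian noise q(k).
    rng = np.random.default_rng(seed)
    x = rng.choice([-1.0, 1.0], size=n)        # transmitted symbols
    y = np.convolve(x, w)[:n]                  # y(k) = sum_i w(i) x(k-i)
    z = nl(y)                                  # nonlinear channel distortion
    noise_var = np.mean(z ** 2) / 10 ** (snr_db / 10.0)
    r = z + rng.normal(0.0, np.sqrt(noise_var), n)
    return x, r

# For an equalizer of order M, the desired signal is the delayed input
# d(k) = x(k - M//2), and the training error is e(k) = d(k) - x_hat(k).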
Fig. 7.2 A digital communication system with BFO based adaptive inverse model (block diagram: x(k) passes through the linear system and the NL block to give z(k), noise q(k) is added to form r(k), the equalizer produces x̂(k), and the error e(k) between x(k − d) = x_d(k) and x̂(k) drives the BFO update)
7.3 BFO based training of weights of inverse model
The updating of the weights of the BFO based inverse model is carried out using the
training rule as outlined in the following steps:
Step-1 Initialization of various parameters
(i) Sb = number of bacteria to be used for searching the total region.
(ii) N_is = number of input samples.
(iii) p = number of parameters to be optimized.
(iv) N_s = swimming length, after which tumbling of a bacterium is undertaken in a
chemotactic loop.
(v) N_c = number of iterations to be undertaken in a chemotactic loop; always N_c > N_s.
(vi) N_re = maximum number of reproduction steps to be undertaken.
(vii) N_ed = maximum number of elimination and dispersal events to be imposed on the
bacteria.
(viii) P_ed = probability with which the elimination and dispersal continue.
(ix) The location of each bacterium, P(1:p, 1:Sb, 1), is specified by random numbers on
[0, 1].
(x) The value of C(i) (the run length unit), assumed to be constant for all bacteria.
Step-2: Generate the desired signal
(i) A random binary input {1, −1} is applied to the channel.
(ii) The output of the channel is contaminated with white Gaussian noise of known strength to produce the input signal for the equalizer.
(iii) The binary input is delayed by half the order of the equalizer to act as the desired signal d(k).
Step-3: Iterative algorithm for optimization
This step models the bacterial population: chemotaxis, reproduction, and elimination-dispersal. Initially j = k = l = 0.
(i) Elimination-dispersal loop: l = l + 1
(ii) Reproduction loop: k = k + 1
(iii) Chemotaxis loop: j = j + 1
(a) For i = 1, 2, …, S_b, the cost function (here the mean squared error) J(i, j, k, l) of each i-th bacterium is calculated as follows:
(1) N_is binary input samples are passed through the equalizer.
(2) The output is compared with the corresponding desired signal d(k) to calculate the error e(k).
(3) The sum of squared errors averaged over N_is is stored in J(i, j, k, l).
(4) End of the For loop.
(b) For i = 1, 2, …, S_b the tumbling/swimming decision is taken.
Tumble: Generate a random vector Δ(i) with each element Δ_m(i), m = 1, 2, …, p, a random number in the range [−1, 1].
Move: Let
$$P_i(j+1, k, l) = P_i(j, k, l) + C(i)\,\frac{\Delta(i)}{\sqrt{\Delta^T(i)\Delta(i)}}$$
This results in an adaptable step in the direction of the tumble for bacterium i. The cost function (mean squared error) J(i, j+1, k, l) is computed.
Swim: (i) Let c = 0 (counter for swim length).
(ii) While c < N_s (have not climbed down too long): let c = c + 1.
If J(i, j+1, k, l) < J(i, j, k, l), then let
$$P_i(j+1, k, l) = P_i(j+1, k, l) + C(i)\,\frac{\Delta(i)}{\sqrt{\Delta^T(i)\Delta(i)}}$$
and use this P_i(j+1, k, l) to compute the new J(i, j+1, k, l);
ELSE let c = N_s. This is the end of the WHILE statement.
(c) Go to the next bacterium (i + 1) if i ≠ S_b.
(d) If min(J) (the minimum value of J among all bacteria) is less than the tolerance limit, then break all loops.
Step-4: If j < N_c, go to (iii), i.e. continue the chemotaxis loop, since the life of the bacteria is not over.
Step-5: Reproduction
(a) For the given k and l, and for each i = 1, 2, …, S_b, let J_i be the health of the i-th bacterium. The bacteria are sorted in ascending order of cost J (higher cost means lower health).
(b) The S_r = S_b/2 bacteria with the highest J values die, and the other S_r bacteria with the best values each split into two; the copies are placed at the same location as their parents.
Step-6: If k < N_re, go to Step-3(ii). In this case the specified number of reproduction steps has not been reached, and the next generation of the chemotactic loop is started.
Step-7: Elimination-dispersal
Each bacterium is eliminated and dispersed to a random location with probability P_ed, and the new replacements are randomly initialized over the search space. In this way the total population is kept constant.
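Putting Steps 1-7 together, a minimal Python/NumPy sketch of the training loop is given below. The parameter values follow Section 7.5, the cost routine (e.g. the averaged squared equalizer error of Step-3(a)) is passed in as a function, and the loop organization is an illustrative reading of the steps above rather than the exact code used in this work:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameter values following Section 7.5 (Step-1)
Sb, p = 8, 8           # bacteria, weights of the 8-tap equalizer
Ns, Nc = 3, 5          # swim length and chemotactic steps
Nre, Ned = 40, 10      # reproduction and elimination-dispersal events
Ped, C = 0.25, 0.0075  # dispersal probability and run length unit

def bfo_minimize(cost):
    """Minimizes cost(w) over w in R^p following Steps 1-7 of Section 7.3."""
    P = rng.random((Sb, p))                    # Step-1(ix): random locations
    J = np.array([cost(w) for w in P])
    for _ in range(Ned):                       # elimination-dispersal loop
        for _ in range(Nre):                   # reproduction loop
            health = np.zeros(Sb)
            for _ in range(Nc):                # chemotactic loop
                for i in range(Sb):
                    delta = rng.uniform(-1.0, 1.0, p)          # tumble
                    step = C * delta / np.sqrt(delta @ delta)
                    P[i] += step                               # move
                    J[i] = cost(P[i])
                    m = 0
                    while m < Ns:                              # swim
                        m += 1
                        trial = P[i] + step
                        Jt = cost(trial)
                        if Jt < J[i]:
                            P[i], J[i] = trial, Jt
                        else:
                            m = Ns
                health += J                    # Step-5(a): accumulated cost
            order = np.argsort(health)         # healthiest (lowest cost) first
            P[order[Sb // 2:]] = P[order[:Sb // 2]]   # Step-5(b): best half splits
            J = np.array([cost(w) for w in P])
        move = rng.random(Sb) < Ped            # Step-7: elimination-dispersal
        P[move] = rng.random((move.sum(), p))
        J = np.array([cost(w) for w in P])
    return P[np.argmin(J)]
```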
7.4 Development of robust inverse modeling using BFO based training with robust norm minimization
Three robust cost functions defined in the literature [7.41]-[7.43] are used in the development of robust adaptive inverse models. The BFO algorithm is used to iteratively minimize these norms of the model error, and hence the resulting inverse model is expected to be robust against outliers. These cost functions are defined in Section 6.4 of Chapter 6. The weight update of the inverse model of Fig. 7.2 is carried out by minimizing the cost functions of the errors defined in (6.16), (6.17) and (6.18) using the BFO algorithm. In this approach the procedure outlined in Step-1 to Step-7 of Section 7.3 remains the same; the only exception is detailed as follows:
Let the error vector of the p-th bacterium at the k-th generation, due to the application of N input samples to the model, be represented as [e_{1,p}(k), e_{2,p}(k), …, e_{N,p}(k)]^T. The errors are arranged in increasing order, from which the rank R{e_{n,p}(k)} of each n-th error term is obtained. The score associated with each rank is evaluated as
$$a(i) = \sqrt{12}\left(\frac{i}{N+1} - 0.5\right) \qquad (7.4)$$
where i (1 ≤ i ≤ N) denotes the rank associated with each error term. At the k-th generation of each p-th bacterium the Wilcoxon norm is then calculated as
$$C_p(k) = \sum_{i=1}^{N} a(i)\, e_{i,p}(k) \qquad (7.5)$$
Similarly, the other two cost functions are computed using (6.16), (6.17) and (6.18). The learning process continues until the cost function decreases to its minimum possible value. At this stage the training is complete, and the resulting weight vector represents the final weights of the inverse model.
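As an illustration, the Wilcoxon norm of (7.4)-(7.5) can be computed as in the following Python/NumPy sketch, in which the ranking is realized by sorting the error vector; during training this value simply replaces the averaged squared error in Step-3(a) of Section 7.3:

```python
import numpy as np

def wilcoxon_norm(e):
    """Wilcoxon norm of an error vector e, following (7.4)-(7.5)."""
    e = np.sort(np.asarray(e, dtype=float))   # arrange errors in increasing order
    N = e.size
    i = np.arange(1, N + 1)                   # rank of each error term
    a = np.sqrt(12.0) * (i / (N + 1) - 0.5)   # score a(i), eq. (7.4)
    return float(np.sum(a * e))               # C_p(k), eq. (7.5)
```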
7.5 Simulation study
In this section the simulation study of the proposed inverse model in the presence of 10% to 50% outliers in the desired signal is carried out. Fig. 7.2 is simulated for various nonlinear sensors/systems/channels using the algorithm given in Section 7.4. Three standard linear systems are used in the simulation study:
$$S_1: 0.209 + 0.995 z^{-1} + 0.209 z^{-2}$$
$$S_2: 0.260 + 0.930 z^{-1} + 0.260 z^{-2} \qquad (7.6)$$
$$S_3: 0.304 + 0.903 z^{-1} + 0.304 z^{-2}$$
The eigenvalue ratios (EVR) of S1, S2 and S3 are 6.08, 11.12 and 21.71 respectively [7.11]. This ratio indicates the severity of distortion introduced by the system or channel. To study the effect of nonlinearity on the equalization performance, two different nonlinearities are introduced into the channel:
$$\text{NL1}: z(k) = \tanh(y(k))$$
$$\text{NL2}: z(k) = y(k) + 0.2y^2(k) - 0.1y^3(k) \qquad (7.7)$$
where y(k) is the output of each of the linear systems (S1 through S3). The additive noise in the channel, or the measurement noise in the sensor, is white Gaussian of −30 dB strength. In this study an 8-tap adaptive FIR filter is used as the inverse model. The desired signal is generated by delaying the input binary sequence by half the order of the inverse model (4 in this case). Outliers are added by simply flipping the bit value from 1 to −1, or −1 to 1, at randomly selected locations (10% to 50%) of the desired signal. In this simulation the following BFO parameters are used: S_b = 8, N_is = 100, p = 8, N_s = 3, N_c = 5, N_re = 40-60, N_ed = 10, P_ed = 0.25, C(i) = 0.0075. For the sake of clarity, the different cost functions of the inverse models used in the simulation are listed here:
CF1 - Wilcoxon norm
CF2 - mean square error, $e^2$
CF3 - $\sigma\left(1 - \exp\left(-\frac{e^2}{2\sigma}\right)\right)$, where $\sigma$ is a constant
CF4 - $\log\left(1 + \frac{e^2}{2}\right)$
The bit error ratio (BER) plots of the BFO trained inverse model for the different nonlinear channels/sensors, the different cost functions and 0%-50% outliers are obtained through simulation and are shown in Figs. 7.3(a)-(f) to 7.8(a)-(f). The BER of the nonlinear channels/systems with different EVR values at 15 dB SNR is also computed in the presence of outliers in the desired signal; the results are plotted in Fig. 7.9 (50% outliers) and Fig. 7.10 (40% outliers). A few notable observations from these plots are:
(a) Keeping the CF, SNR and percentage of outliers in the desired signal the same, the BER increases with increasing EVR of the channel. Similarly, under identical simulation conditions, the squared error cost function based model performs the worst, whereas the Wilcoxon norm based model provides the best performance (least BER).
(b) As the outliers in the desired signal increase, the Wilcoxon norm based model continues to provide the lowest BER compared to that provided by the other norms.
(c) With no outliers in the desired signal, the BER plots of all four CFs are almost the same (Figs. 7.3(a), 7.4(a), 7.5(a), 7.6(a), 7.7(a) and 7.8(a)).
(d) At high outlier levels the conventional CF2 based model performs the worst, followed by the CF4 based model. In all cases the Wilcoxon norm (CF1) based inverse model performs the best and hence is the most robust against low to high outliers in the training signal.
(e) The accuracy of the inverse models based on the CF3 and CF4 norms developed in the presence of outliers is almost identical.
(f) In addition, the plots of Figs. 7.9 and 7.10 indicate that at 50% outliers in the desired signal the BER increases with increasing EVR of the nonlinear channels or systems.
(g) Further, the BER of the inverse model under all channel and SNR conditions is highest for the squared error norm (CF2) based training compared to the other three norms, while the Wilcoxon norm (CF1) based inverse model yields the minimum BER among all cases studied.
[Fig. 7.3(a)-(f): plots of probability of error versus SNR in dB at 0%, 10%, 20%, 30%, 40% and 50% outliers]
Fig. 7.3 Comparison of BER of four different CFs based nonlinear equalizers with [.209, .995, .209] as channel coefficients and NL1
[Fig. 7.4(a)-(f): plots of probability of error versus SNR in dB at 0%, 10%, 20%, 30%, 40% and 50% outliers]
Fig. 7.4 Comparison of BER of four different CFs based nonlinear equalizers with [.209, .995, .209] as channel coefficients and NL2
[Fig. 7.5(a)-(f): plots of probability of error versus SNR in dB at 0%, 10%, 20%, 30%, 40% and 50% outliers]
Fig. 7.5 Comparison of BER of four different CFs based nonlinear equalizers with [.260, .930, .260] as channel coefficients and NL1
[Fig. 7.6(a)-(f): plots of probability of error versus SNR in dB at 0%, 10%, 20%, 30%, 40% and 50% outliers]
Fig. 7.6 Comparison of BER of four different CFs based nonlinear equalizers with [.260, .930, .260] as channel coefficients and NL2
[Fig. 7.7(a)-(f): plots of probability of error versus SNR in dB at 0%, 10%, 20%, 30%, 40% and 50% outliers]
Fig. 7.7 Comparison of BER of four different CFs based nonlinear equalizers with [.304, .903, .304] as channel coefficients and NL1
[Fig. 7.8(a)-(f): plots of probability of error versus SNR in dB at 0%, 10%, 20%, 30%, 40% and 50% outliers]
Fig. 7.8 Comparison of BER of four different CFs based nonlinear equalizers with [.304, .903, .304] as channel coefficients and NL2
[Fig. 7.9(a) NL1, (b) NL2: plots of bit error rate (BER) versus eigenvalue ratio (EVR)]
Fig. 7.9 Effect of EVR on the BER performance of the four CF-based equalizers in presence of 50% outliers
[Fig. 7.10(a) NL1, (b) NL2: plots of bit error rate (BER) versus eigenvalue ratio (EVR)]
Fig. 7.10 Effect of EVR on the BER performance of the four CF-based equalizers in presence of 40% outliers
7.6 Conclusion
This chapter examines and evaluates the learning capability of different error norms when the training signal of the inverse model is contaminated with strong outliers. To facilitate this evaluation, different nonlinear channels with varying EVRs are used. A population based BFO learning tool is developed to minimize four different norms, and the robustness of these norms is assessed through a simulation study. It is observed that, in general, the conventional squared error norm (CF2) is the least robust for developing inverse models of nonlinear systems under varying noise conditions, whereas the Wilcoxon norm (CF1) is the most robust. In decreasing order of robust performance, the norms rank as CF1, CF3, CF4 and CF2.
References
[7.1] R. W. Lucky, “Techniques for adaptive equalization of digital communication systems”, Bell Syst. Tech. J., vol. 45, pp. 255-286, Feb. 1966.
[7.2] S. U. H Qureshi, Adaptive equalization, Proc. IEEE, 73(9), 1349-1387, Sept. 1985.
[7.3] E. A. Robinson and T. Durrani, Geophysical Signal Processing, Prentice-Hall, Englewood Cliffs, NJ, 1986.
[7.4] B. Widrow and E. Walach, Adaptive Inverse Control, Prentice-Hall, Upper Saddle
River, NJ, 1996.
[7.5] S. K. Nair and Jaekyun Moon, “A theoretical study of linear and nonlinear equalization
in nonlinear magnetic storage channels”, IEEE Trans. on neural networks, vol. 8, no. 5, pp.
1106-1118, Sept. 1997.
[7.6] H. Sun, G. Mathew and B. Farhang-Boroujeny, “Detection techniques for high density
magnetic recording”, IEEE Trans. on Magnetics, vol. 41, no. 3, pp. 1193-1199, March
2005.
[7.7] F. Preparata, “Holographic dispersal and recovery of information”, IEEE Trans. Inform. Theory, vol. 35, no. 5, pp. 1123-1124, Sept. 1989.
[7.8] G. J. Gibson, S. Siu and C. F. N. Cowan, “The application of nonlinear structures to
the reconstruction of binary signals”, IEEE Trans. signal processing, vol. 39, no. 8, pp.
1877-1884, Aug. 1991.
[7.9] P. G. Voulgaris and C. N. Hadjicostis, “Optimal processing strategies for perfect reconstruction of binary signals under power-constrained transmission”, Proc. IEEE Conference on Decision and Control, Atlantis, Bahamas, vol. 4, pp. 4040-4045, Dec. 2004.
[7.10] R. Touri, P. G. Voulgaris and C. N. Hadjicostis, “Time varying power limited preprocessing for perfect reconstruction of binary signals”, Proc. of the 2006 American Control Conference, Minneapolis, USA, pp. 5722-5727, June 2006.
[7.11] J. C. Patra, R. N. Pal, R. Baliarsingh and G. Panda, “Nonlinear channel equalization for QAM signal constellation using artificial neural network”, IEEE Trans. on Systems, Man and Cybernetics-Part B: Cybernetics, vol. 29, no. 2, April 1999.
[7.12] J. C. Patra, Wei Beng Poh, N. S. Chaudhari and Amitabha Das,” Nonlinear channel
equalization with QAM signal using Chebyshev artificial neural network”, Proc. of
International joint conference on neural networks, Montreal, Canada, pp. 3214-3219,
August 2005.
[7.13] G. Panda, B. Majhi, D. Mohanty, A. Choubey and S. Mishra, “Development of
Novel Digital Channel Equalizers using Genetic
Algorithms”, Proc. of National
Conference on Communication (NCC-2006), IIT Delhi, pp.117-121, 27-29, January, 2006.
[7.14] K. M. Passino, “Biomimicry of Bacterial Foraging for distributed optimization and
control”, IEEE control system magazine, vol 22, issue 3, pp. 52-67, June 2002.
[7.15] S. Mishra, “A Hybrid least square Fuzzy bacterial foraging strategy for harmonic
estimation”, IEEE Trans. on Evolutionary Computation, vol 9, no. 1, pp. 61-73, Feb. 2005.
[7.16] Babita Majhi, G. Panda and A. Choubey, “On The Development of a new Adaptive
Channel Equalizer using Bacterial Foraging Optimization Technique”, Proc. of IEEE
Annual India Conference (INDICON-2006), New Delhi, India, 15th-17th September, 2006,
pp. 1-6.
[7.17] M. Maitra and A. Chatterjee, “A novel technique for multilevel optimal magnetic
resonance brain image thresholding using bacterial foraging”, Measurement, Elsevier, vol.
41, pp. 1124-1134, 2008.
[7.18] T. Y. Ji, M. S. Li, Z. Lu and Q. H. Wu, “Optimal morphological filter design using a
bacterial swarming algorithm”, Proc. of IEEE Congress on Evolutionary Computation
(CEC 2008), Hong Kong, pp 452- 458.
[7.19] W. J. Tang, M. S. Li, Q. H. Wu and J. R. Saunders, “Bacterial foraging algorithm for
optimal power flow in dynamic environments”, IEEE Trans. on Circuits and Systems, vol. 55, no. 8, pp. 2433-2442, Sep. 2008.
[7.20] B. K. Panigrahi and V. Ravikumar Pandi, “Bacterial foraging optimization: Nelder-Mead hybrid algorithm for economic load dispatch”, IET Gener. Transm. Distrib., vol. 2, no. 4, pp. 556-565, 2008.
[7.21] A Y. Saber and G. K. Venayagamoorthy, “Economic load dispatch using bacterial
foraging technique with particle swarm optimization biased evolution”, Proc. of IEEE
Swarm Intelligence Symposium, St. Louis MO USA, 21-23 September, 2008, pp. 1-8.
[7.22] W. Lin, R. Liu, P. X. Liu and Max. Q-H Meng, “Parameter estimation using
biologically inspired methods”, Proc. of IEEE Int. conf. on Robotics and Biomimetics,
Sanya, China, 15-18 Dec. 2007, pp. 1339-1343.
[7.23] D. P. Acharya, G. Panda and Y. V. S. Lakshmi, “Effect of finite register length on
bacterial foraging optimization based ICA and constrained genetic algorithm based ICA
algorithm”, Proc. of IEEE Int. Conf. on signal processing, communication and
networking, Anna Univ., Chennai, 4-6 Jan., 2008, pp. 244-249.
[7.24] M. Hanmandlu, A. V. Nath, A. C. Mishra and V. K. Madasu, “Fuzzy model based
recognition of handwritten hindi numerals using bacterial foraging”, Proc. of 6th IEEE Int.
Conf. on computer and information science (ICIS 2007), 11-13 July 2007, pp. 309-312.
[7.25] B. Samanbabu, S. Mishra, B. K. Panigrahi and G. K. Venayagamoorthy, “Robust
tuning of modern power system stabilizers using bacterial foraging algorithm”, Proc. of
IEEE Congress on Evolutionary Computation (CEC 2007), Singapore, 25-28 Sep. 2007,
pp. 2317-2324.
[7.26] Leandro dos Santos Coelho and Camila da Costa Silveira, “Improved bacterial
foraging strategy for controller optimization applied to robotic manipulator system”, Proc.
of IEEE Int. Symposium on intelligent control, Munich, Germany, 4-6 Oct. 2006, pp.
1276-1281.
[7.27] T. K. Das, G. K. Venayagamoorthy and U. O. Aliyu, “Bio-inspired algorithms for the
design of multiple optimal power system stabilizers: SPPSO and BFA”, IEEE Trans. on
Industry Applications, vol. 44, no. 5, pp. 1445-1457, Sep-Oct. 2008.
[7.28] S. Mishra, C. N. Bhende and L. L Lai, “Optimization of a distribution static
compensator by bacterial foraging technique”, Proc. of IEEE 5th Int. Conf. on machine
learning and cybernetics, Dalian, 13-16 August 2006, pp. 4075-4082.
[7.29] Ben Niu, Y Zhu, X He and X Zeng, “Optimum design of PID controllers using only
a germ of intelligence”, Proc. of IEEE 6th World congress on intelligent control and
automation, Dalian, China, 21-23 June 2006, pp. 3584 – 3588.
[7.30] A. Ali and S. Majhi, “Design of optimum PID controller by bacterial foraging strategy”, Proc. of IEEE Int. Conf. on Industrial Technology, 15-17 Dec. 2006, pp. 601-605.
[7.31] W. J. Tang, Q. H. Wu and J. R. Saunders, “Bacterial foraging algorithm for dynamic
environments”, Proc. of IEEE Congress on Evolutionary Computation, Canada, 16-21 July
2006, pp. 1324-1330.
[7.32] W. Lin and P. X. Liu, “Hammerstein model identification based on bacterial foraging”, Electronics Letters, vol. 42, no. 23, Nov. 2006.
[7.33] S. Mishra, B. K. Panigrahi and M. Tripathy, “A hybrid adaptive bacterial foraging and
feedback linearization scheme based D-STATCOM”, Proc. of IEEE Int. Conf. on Power
system technology, Singapore , 21-24 Nov. 2004, pp. 275-280.
[7.34] Dong Hwa Kim, Ajit Abraham and Jae Hoon Cho, “A hybrid genetic algorithm and
bacterial foraging approach for global optimization”, Information Sciences, Elsevier, vol
177, pp. 3918-3937, 2007.
[7.35] S. Mishra and C. N. Bhende, “Bacterial foraging technique based optimized active
power filter for load compensation”, IEEE Trans. on Power delivery, vol. 22, no. 1,
January 2007.
[7.36] M. Ulagammai, P. Venkatesh, P. S. Kannan and N. P. Padhy, “Application of
bacterial foraging technique trained artificial and wavelet neural networks in load
forecasting”, Neurocomputing, Elsevier, vol. 70, pp. 2659-2667, 2007.
[7.37] Tai-Chen Chen, Pei-Wei Tsai, Shu-Chuan Chu and Jeng-Shyang Pan, “A novel
optimization approach: Bacterial-GA foraging”, Proc. of IEEE 2nd Int. Conf. on Innovative
computing, information and control, 5-7 Sept. 2007, pp. 391-399.
[7.38] S. M. Dasgupta, A. Biswas, A. Abraham and S. Das, “Adaptive computational
chemotaxis in bacterial foraging algorithm”, Proc. of IEEE Int. Conf. on complex,
intelligent and software intensive systems,4-7 March 2008, pp. 64-71.
[7.39] A. Abraham, A. Biswas, S. Dasgupta and S. Das, “Analysis of reproduction operator
in bacterial foraging optimization algorithm”, Proc. of IEEE World Congress on
Evolutionary Computation (CEC 2008), Hong Kong, 1-6 June 2008, pp. 1476-1483.
[7.40] Joseph W. McKean, “Robust analysis of Linear models”, Statistical Science, vol. 19,
no. 4, pp. 562-570, 2004.
[7.41] Jer-Guang Hsieh, Yih-Lon Lin and Jyh-Horng Jeng, “Preliminary study on Wilcoxon
learning machines”, IEEE Trans. on neural networks, vol. 19, no. 2, pp. 201-211, Feb.
2008.
[7.42] Wei-Yen Wang, Tsu-Tian Lee, Ching-Lang Liu and Chi-Hsu Wang, “Function
approximation using fuzzy neural networks with robust learning algorithm”, IEEE Trans.
on Systems, Man and Cybernetics-Part B : Cybernetics, vol. 27, no. 4, pp. 740-747, Aug.
1997.
[7.43] Hung-Hsu Tsai and Pao-Ta Yu, “On the optimal design of fuzzy neural networks
with robust learning for function approximation”, IEEE Trans. on Systems, Man and
Cybernetics-Part B : Cybernetics, vol. 30, no. 1, pp. 217-223, Feb. 2000.
[7.44] J. C. Patra, A. C. Kot and G. Panda, “An intelligent pressure sensor using neural networks”, IEEE Trans. on Instrumentation and Measurement, vol. 49, issue 4, pp. 829-834, Aug. 2000.
Chapter 8
Identification of Hammerstein Plants using Clonal PSO and Immunized PSO Algorithms
8.1 Introduction
PRACTICALLY, it is difficult to model physical systems by mathematical analysis alone. Through system identification, however, a suitable model can be developed which is mathematically equivalent to a given physical system. Many practical systems possess inherent nonlinear characteristics due to harmonic generation, intermodulation, desensitization, gain expansion and chaos. Identification of such nonlinear complex plants plays a significant role in the analysis and design of control systems. The Hammerstein model is widely used because its structure describes the nonlinearity associated with practical dynamic systems. Several methods have been proposed in the literature for identification of the Hammerstein model using correlation theory [8.1], orthogonal functions [8.2], polynomial functions [8.3], piecewise linear models [8.4], artificial
neural networks [8.5], genetic algorithms [8.6], radial basis function (RBF) networks [8.7], particle swarm optimization (PSO) [8.8, 8.9] and bacterial foraging optimization (BFO) [8.10] techniques. Particle swarm optimization was developed by Eberhart and Kennedy in 1995 [8.11], inspired by swarm intelligence observed in bird flocking, fish schooling, etc. It has gained much attention in optimal control system applications because of its faster convergence [8.12], reduced memory requirement, lower computational complexity and easier implementation compared to other evolutionary algorithms. However, the basic PSO suffers from problems such as premature convergence and stagnation at local optima. In [8.13] it is shown that PSO performs better in early generations than other evolutionary algorithms, but its performance degrades as the number of generations increases; it therefore has a slow fine-tuning ability. Several studies have been made to improve the performance of PSO [8.14]-[8.17].
The biological immune system (BIS) is a multilayer protection system in which each layer provides a different type of defense mechanism for detection, recognition and response. It also resists infectious diseases and reacts to foreign substances. Following the principles of the BIS, a new computational intelligence tool known as the artificial immune system (AIS) [8.18]-[8.20] has evolved, which finds applications in optimization problems [8.21, 8.22], computer security [8.23, 8.24], fault detection [8.25, 8.26], job scheduling [8.27] and clustering. The four forms of AIS algorithm reported in the literature are the immune network model [8.19], negative selection [8.23, 8.28], clonal selection [8.21, 8.22] and danger theory [8.29].
In this chapter two new hybrid algorithms, known as clonal PSO (CPSO) and immunized PSO (IPSO), are proposed by suitably combining the good features of the PSO and AIS algorithms. The performance of these new algorithms is assessed by employing them in the identification of various standard Hammerstein models. The nonlinear static part of the model is represented by a single layer, low complexity functional link artificial neural network (FLANN) architecture [8.30, 8.31]. The weights of the FLANN structure and the dynamic part of the model are estimated by the proposed algorithms.
8.2 Identification of Hammerstein plants using FLANN
8.2.1 Hammerstein model
The nonlinear dynamic system described by the Hammerstein model is composed of a nonlinear static block in series with a linear dynamic block, as shown in Fig. 8.1.
[Fig. 8.1: the input u(k) passes through the nonlinearity F(·) to give x(k), which drives the linear dynamics B(z⁻¹)/A(z⁻¹) through a unit delay; the noise e(k) enters through 1/A(z⁻¹) and is summed to give the output y(k)]
Fig. 8.1 The Hammerstein model
The model in general is described by
$$A(z^{-1})\,y(k) = B(z^{-1})\,x(k-1) + e(k) \qquad (8.1)$$
$$x(k) = F(u(k)) \qquad (8.2)$$
$$A(z^{-1}) = 1 + a_1 z^{-1} + \cdots + a_n z^{-n} \qquad (8.3)$$
$$B(z^{-1}) = b_0 + b_1 z^{-1} + \cdots + b_r z^{-r} \qquad (8.4)$$
where z⁻¹ denotes a unit delay. In this model u(k), y(k) and e(k) represent the input, output and noise samples at instant k respectively. The intermediate signal x(k) is not accessible for measurement. The symbols n and r are the known degrees of the polynomials A(z⁻¹) and B(z⁻¹) respectively. The function F(·) is assumed to be nonlinear and unknown.
The objective of the identification task is to determine the system parameters {a_i}, {b_j} of the linear dynamic part and to match the response of the nonlinear static part and the output of the model, using the known input and output samples u(k) and y(k).
8.2.2 FLANN architecture for modeling the nonlinear static part
In this chapter the nonlinear static part of the Hammerstein model is represented by a FLANN structure. The input signal u(k) at the k-th instant is functionally expanded into a number of nonlinear values which feed an adaptive linear combiner whose weights are altered according to an iterative learning rule. The expansions suggested in the literature are trigonometric, power series or Chebyshev. For trigonometric expansion the expanded terms are given by
expansion the linear matrix is given by
⎧1
⎪u ( k )
⎪
Φ i {u ( k )} = ⎨
⎪sin( iπu ( k ))
⎪⎩ cos( iπu(k) )
for i = 0
for i = 1
for i = 2,4,....M
for i = 3,5,...M + 1
(8.5)
where the sine and cosine terms each contribute M values. As a result the total number of expanded values, including the unity bias input, becomes 2M + 2. Let the corresponding weight vector be represented as w_i(k), having 2M + 2 elements. The estimated output of the nonlinear static part, as shown in Fig. 8.2, is given by
$$F(u(k)) = \sum_{i=1}^{2M+2} w_i\,\Phi_i(u(k)) + \varepsilon(k) \qquad (8.6)$$
where ε(k) is the approximation error.
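A minimal Python/NumPy sketch of the expansion (8.5), under the index limits reconstructed above, is given below; the helper name `flann_expand` is illustrative:

```python
import numpy as np

def flann_expand(u, M):
    """Trigonometric functional expansion of one input sample u, eq. (8.5):
    a unity bias, the sample itself and 2M sin/cos terms (2M + 2 in total)."""
    phi = [1.0, u]
    for i in range(2, 2 * M + 2):             # i = 2, 3, ..., 2M + 1
        phi.append(np.sin(i * np.pi * u) if i % 2 == 0
                   else np.cos(i * np.pi * u))
    return np.array(phi)                      # feeds the linear combiner of (8.6)
```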
[Fig. 8.2: the input u(k) is functionally expanded into Φ₁(u(k)), Φ₂(u(k)), … plus a unity bias; the expanded terms, weighted by w₀(k), w₁(k), …, w_{2M+1}(k), are summed to give F(u(k))]
Fig. 8.2 Structure of the FLANN model
Substituting (8.2) and (8.4) in (8.1) gives
$$A(z^{-1})\,y(k) = [b_0 F(u(k-1)) + b_1 F(u(k-2)) + \cdots + b_r F(u(k-r-1))] + e(k) \qquad (8.7)$$
Similarly, from (8.6) and (8.7) we get
$$A(z^{-1})\,y(k) = \Big[b_0\Big(\sum_{i=1}^{2M+2} w_i\Phi_i(u(k-1)) + \varepsilon(k-1)\Big) + \cdots + b_r\Big(\sum_{i=1}^{2M+2} w_i\Phi_i(u(k-r-1)) + \varepsilon(k-r-1)\Big)\Big] + e(k) \qquad (8.8)$$
Rearrangement of (8.8) gives
$$A(z^{-1})\,y(k) = \Big[\sum_{i=0}^{r} b_i w_1\Phi_1(u(k-i-1)) + \cdots + \sum_{i=0}^{r} b_i w_{2M+2}\Phi_{2M+2}(u(k-i-1)) + \sum_{i=0}^{r} b_i\varepsilon(k-i-1)\Big] + e(k) \qquad (8.9)$$
The identification structure of the Hammerstein model corresponding to (8.9) is shown in Fig. 8.3. Here v(k) and θ are represented as
$$v(k) = e(k) + \sum_{i=0}^{r} b_i\,\varepsilon(k-i-1) \qquad (8.10)$$
$$\theta = [\theta_a^T, \theta_{w_1}^T, \ldots, \theta_{w_{2M+2}}^T]^T \qquad (8.11)$$
where
$$\theta_a = [a_1, a_2, \ldots, a_n]^T \qquad (8.12)$$
$$\theta_{w_i} = [\theta_{w_i}(1), \theta_{w_i}(2), \ldots, \theta_{w_i}(r+1)]^T = [b_0 w_i, b_1 w_i, \ldots, b_r w_i]^T \qquad (8.13)$$
At the k-th instant φ(k) is given by
$$\varphi(k) = [\varphi_a^T(k), \varphi_{w_1}^T(k), \ldots, \varphi_{w_{2M+2}}^T(k)]^T \qquad (8.14)$$
where
$$\varphi_a(k) = [-y(k-1), -y(k-2), \ldots, -y(k-n)]^T \qquad (8.15)$$
$$\varphi_{w_i}(k) = [\Phi_i(u(k-1)), \ldots, \Phi_i(u(k-r-1))]^T \qquad (8.16)$$
Using (8.10)-(8.11) and (8.14), (8.8) can be expressed as
[Fig. 8.3: block diagram of the identification model; the delayed inputs u(k−1), …, u(k−r−1) are expanded through Φ₁, …, Φ_{2M+2}, weighted by the products b_j w_i and summed, together with the noise path (ε(k), e(k) weighted by b₀, …, b_r) and the feedback taps a₁, …, a_n, to form y(k)]
Fig. 8.3 Adaptive identification model of the generalized Hammerstein plant
$$y(k) = \varphi^T(k)\,\theta + v(k) \qquad (8.17)$$
The objective here is to estimate the system parameters defined in (8.11). If a derivative based least squares method is adopted, the estimate of the system parameters is given by
$$\hat{\theta} = \Big[\sum_{k=N_s+1}^{N_s+N} \varphi(k)\varphi^T(k)\Big]^{-1} \Big[\sum_{k=N_s+1}^{N_s+N} \varphi(k)\,y(k)\Big] \qquad (8.18)$$
where N is the number of input-output data. Substituting ŵ₁ = 1, the parameters of the linear dynamic part are estimated as â₁, …, â_n, b̂₁, …, b̂_r. Equation (8.18) involves a matrix inversion and is hence computationally very expensive.
Further, the input in this case is obtained from a nonlinear model, and the conventional derivative based method of estimating the pole-zero parameters often leads to instability during training. Hence it is motivating to devise efficient and reliable methods to identify the Hammerstein plant using the two new population based CPSO and IPSO algorithms.
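For completeness, the least squares baseline of (8.18) can be written in a few lines, assuming the regressor matrix has already been assembled from (8.14)-(8.16); `np.linalg.lstsq` is used here in place of the explicit matrix inverse, which yields the same minimizer with better numerical behavior:

```python
import numpy as np

def ls_estimate(Phi, y):
    """Batch least squares estimate of theta, eq. (8.18).
    Phi: (N, D) matrix whose rows are phi(k)^T; y: (N,) output vector."""
    theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return theta
```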
8.3 Proposed clonal PSO and immunized PSO algorithms
In the PSO algorithm a swarm consists of a set of volume-less particles (points) moving in a D-dimensional search space, each representing a potential solution. The i-th particle is represented by the vector X_i = [x_{i1}, x_{i2}, …, x_{id}, …, x_{iD}]. The best previous position (the position giving the best fitness value) of the i-th particle is recorded and represented as P_i = [p_{i1}, p_{i2}, …, p_{id}, …, p_{iD}]. At each iteration the global best particle in the swarm is represented by P_g = [p_{g1}, p_{g2}, …, p_{gd}, …, p_{gD}]. The velocity of the i-th particle is represented as V_i = [v_{i1}, v_{i2}, …, v_{id}, …, v_{iD}]. The maximum velocity and the range of the particles are given by V_max = [v_{max,1}, v_{max,2}, …, v_{max,D}] and X_max = [x_{max,1}, x_{max,2}, …, x_{max,D}].
The velocity and position of the d-th element of the i-th particle at the (k+1)-th search are modified from the knowledge of the previous search according to (8.19)-(8.22):
$$V_{id}(k+1) = w(k)\,v_{id}(k) + c_1 r_1 (p_{id}(k) - x_{id}(k)) + c_2 r_2 (p_{gd}(k) - x_{id}(k)) \qquad (8.19)$$
$$V_{id}(k+1) = \begin{cases} v_{max,d}, & v_{id}(k+1) > v_{max,d} \\ -v_{max,d}, & v_{id}(k+1) < -v_{max,d} \end{cases} \qquad (8.20)$$
$$X_{id}(k+1) = X_{id}(k) + V_{id}(k+1) \qquad (8.21)$$
$$X_{id}(k+1) = \begin{cases} x_{max,d}, & x_{id}(k+1) > x_{max,d} \\ -x_{max,d}, & x_{id}(k+1) < -x_{max,d} \end{cases} \qquad (8.22)$$
where i = 1, 2, …, N₁, d = 1, 2, …, D and N₁ is the number of particles. The symbols r₁ and r₂ represent random numbers between 0 and 1, and c₁ and c₂ denote acceleration constants. The inertia weight w is employed to control the impact of the previous history of velocities on the current one, trading off global and local exploration, and is given by [8.16]
$$w(k) = w_0 - \frac{(w_0 - w_1)\,k}{itr} \qquad (8.23)$$
where k is the generation counter (from 1 to itr), itr is the number of iterations, w₀ = 0.9 and w₁ = 0.4.
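One PSO search step, (8.19)-(8.23), can be sketched as follows; the acceleration constants are illustrative values, while the velocity and position limits match those used later in Section 8.5:

```python
import numpy as np

rng = np.random.default_rng(1)

def pso_step(X, V, Pbest, Pg, k, itr, c1=2.0, c2=2.0,
             vmax=1.5, xmax=2.0, w0=0.9, w1=0.4):
    """One search step for a swarm X with velocities V, eqs. (8.19)-(8.23)."""
    w = w0 - (w0 - w1) * k / itr                            # inertia, (8.23)
    r1, r2 = rng.random(X.shape), rng.random(X.shape)
    V = w * V + c1 * r1 * (Pbest - X) + c2 * r2 * (Pg - X)  # (8.19)
    V = np.clip(V, -vmax, vmax)                             # (8.20)
    X = np.clip(X + V, -xmax, xmax)                         # (8.21)-(8.22)
    return X, V
```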
8.3.1 The CPSO algorithm
In conventional PSO the velocity of each particle in the next search is updated using the knowledge of its past velocity and its personal and global best positions. Since the global best position after a search is the best among all personal best positions, the personal best positions contribute little in moving to new positions. Therefore, in the present investigation the second term of the velocity update equation (8.19) of conventional PSO is not considered.
Further, according to the clonal selection principle, when an antigen or a pathogen invades the organism, a number of antibodies are produced by the immune cells. The fittest antibody undergoes a cloning operation to produce a number of new cells, which are used to eliminate the invading antigens. Employing this principle of AIS in PSO, it is proposed here that each particle moves to the global best position after a search is complete, wherefrom each individual starts its next search. This idea implies that during any k-th search the position of the d-th element of the i-th particle becomes equal to the global best position, i.e. x_{id}(k) = p_{gd}(k). As a result the third term of the velocity update equation (8.19) of conventional PSO becomes zero. Incorporating these two ideas into the conventional PSO algorithm leads to the simplified update equations
$$V'_{id}(k+1) = w(k)\,v'_{id}(k) \qquad (8.24)$$
$$X'_{id}(k+1) = X'_{id}(k) + V'_{id}(k+1) \qquad (8.25)$$
where i = 1, 2, …, N₁ and d = 1, 2, …, D. The inertia weight w(k) is computed according to (8.23). According to this algorithm, after every search all particles migrate to the global best position, wherefrom each particle disperses again according to the magnitude and direction of its individual velocity. The same process is repeated until the position of gbest finally represents the optimal solution of the problem.
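The corresponding CPSO step then reduces to cloning followed by inertia-weighted dispersal, as the following sketch (with the same conventions as the PSO sketch above) illustrates:

```python
import numpy as np

def cpso_step(X, V, Pg, k, itr, vmax=1.5, xmax=2.0, w0=0.9, w1=0.4):
    """One CPSO step, eqs. (8.24)-(8.25): every particle is first cloned to
    the global best position Pg, then dispersed by its own velocity."""
    w = w0 - (w0 - w1) * k / itr               # inertia weight, (8.23)
    X = np.broadcast_to(Pg, X.shape).copy()    # cloning to the gbest position
    V = np.clip(w * V, -vmax, vmax)            # (8.24), limited as in (8.20)
    X = np.clip(X + V, -xmax, xmax)            # (8.25), limited as in (8.22)
    return X, V
```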
8.3.2 The IPSO algorithm
The CPSO algorithm is simpler than conventional PSO and performs satisfactorily when applied to different optimization problems. However, in this new algorithm the computation of every new particle position depends on only two factors: the time varying inertia weight w(k) and the particle's initial velocity. As a result the diversification of the solution space after each search becomes limited, so there is a chance that the final solution might be a local one. To overcome this shortcoming, another new algorithm called immunized PSO is proposed by introducing a mutation process into the algorithm.
In this case, as in the CPSO algorithm, each particle occupies the global best position after a search. The mutation operation is then carried out on the position vectors of the particles to enable random diversification of their positions. Since the position of each particle is changed, unlike in CPSO the third term remains, but the second term, which contributes the change in velocity due to the local best, is not used. Thus the update equations become
$$V''_{id}(k+1) = w(k)\,v''_{id}(k) + c''_2 r''_2 (p_{gd}(k) - x_{id}(k)) \qquad (8.26)$$
$$X''_{id}(k+1) = X''_{id}(k) + V''_{id}(k+1) \qquad (8.27)$$
From among the updated positions of the particles the global best position is selected and then cloned. The cloned cells undergo a mutation mechanism following the hypermutation concept of AIS [8.21, 8.22]. The mutation operation has a fine-tuning capability which helps to achieve a better optimal solution. The single dimension mutation (SDM) operation [8.16] is defined as
$$x_{m_{id}}(k+1) = x_{T_1 d}(k+1) + 0.01\, x_{T_2 d}(k+1) \qquad (8.28)$$
$$x_{m_{(i+1)d}}(k+1) = x_{T_2 d}(k+1) + 0.01\, x_{T_1 d}(k+1) \qquad (8.29)$$
where T₁ and T₂ represent the particle positions to be mutated, chosen randomly from the set of cloned positions. In order to increase the efficiency of mutation an adaptive SDM (ASDM) is also proposed, in which the constant 0.01 is replaced by a parameter z whose value varies with the search number. The value of z(k) at the k-th search is given by
$$z(k) = (z_i - z_f)\left(\frac{I - k}{I}\right) + z_f \qquad (8.30)$$
where z_i and z_f are the initial and final values of z, selected within the range [0, 1], and I represents the maximum number of searches. The fitness values of the updated positions as well as the mutated positions of the particles are then evaluated and the overall best location is selected. In the next search the best location is again cloned and the process continues. The IPSO thus introduces an improved search of the particles in the D-dimensional space using mutation.
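The ASDM operation of (8.28)-(8.30) may be sketched as follows; the helper name `asdm_mutate` and the random pairing of T₁ and T₂ are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)

def asdm_mutate(clones, k, I, zi=0.9, zf=0.05):
    """Adaptive SDM, eqs. (8.28)-(8.30): two cloned positions T1 and T2 are
    chosen at random and cross-perturbed by the decaying factor z(k)."""
    z = (zi - zf) * (I - k) / I + zf           # (8.30)
    t1, t2 = rng.choice(len(clones), size=2, replace=False)
    m1 = clones[t1] + z * clones[t2]           # (8.28), with 0.01 replaced by z(k)
    m2 = clones[t2] + z * clones[t1]           # (8.29)
    return m1, m2
```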
8.4 Weight update of the Hammerstein model
8.4.1 Identification algorithm using FLANN structure and PSO based training
Step 1. Determination of the output of the Hammerstein model:
Uniformly distributed random input samples are generated to act as input during training. These are passed through the nonlinear static part and subsequently through the linear dynamic part of the Hammerstein model to produce the output y(k).
Step 2. Functional expansion of the input:
The same input samples are also passed through the model containing the FLANN structure. Each input sample undergoes either trigonometric expansion as illustrated in (8.5), power series expansion or square-cube expansion.
Step 3. Initialization of positions and velocities of the swarm:
The weight vector of the Hammerstein model of Fig. 8.3 is considered as a particle. As in other evolutionary algorithms, a set of particles representing a set of initial solutions is chosen. The weight vector comprises (M + 2) elements for the FLANN as well as (n + r) elements for the linear dynamic part; each weight vector thus consists of D = M + n + r + 2 weights, each initialized as a random number. For the i-th particle the position vector (which represents the weight vector) is given by
$$W_i = X_i = [w_{i1}\; w_{i2}\; \ldots\; w_{id}\; \ldots\; w_{iD}] \qquad (8.31)$$
where w_{id} is the d-th weight of the i-th particle. Similarly, the velocity assigned to the i-th particle is expressed as
$$V_i = [v_{i1}\; v_{i2}\; \ldots\; v_{id}\; \ldots\; v_{iD}] \qquad (8.32)$$
Initially the personal best position achieved by each i-th particle is the same as its initial weight vector W_i and is represented as
$$P_i = W_i = [w_{i1}\; w_{i2}\; \ldots\; w_{id}\; \ldots\; w_{iD}] \qquad (8.33)$$
Step 4. Calculation of the model output:
The output of the model is computed using the FLANN model according to (8.17) and the weight vector defined in (8.13).
Step 5. Fitness evaluation:
The output of the model ŷ_i(k) due to the k-th sample and the i-th particle is compared with the output of the plant to produce the error signal
$$e_i(k) = y_i(k) - \hat{y}_i(k) \qquad (8.34)$$
For each i-th weight vector the mean square error (MSE) is determined and used as the fitness function
$$E(i) = \frac{1}{K}\sum_{k=1}^{K} e_i^2(k) \qquad (8.35)$$
The identification task then reduces to minimization of the MSE defined in (8.35) using PSO and the new algorithms.
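In code, the fitness of one particle is a direct rendering of (8.34)-(8.35):

```python
import numpy as np

def fitness(y, y_hat):
    """Mean square error fitness of one particle over K samples."""
    e = y - y_hat            # error signal, (8.34)
    return np.mean(e ** 2)   # E(i), (8.35)
```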
Step 6. Updating:
The velocity and position of the d-th weight of each i-th particle for the next search are obtained by the update rules given in (8.19)-(8.23).
Step 7. Evaluation of the global best position:
The fitness values of all particles are evaluated following Step 5. The best fitness value, that is, the minimum MSE (MMSE), is obtained, and its corresponding D weights are identified and termed the global best, denoted by
$$P_g = W_g = [w_{g1}\; w_{g2}\; \ldots\; w_{gd}\; \ldots\; w_{gD}] \qquad (8.36)$$
Step 8. Stopping criterion:
The search process described in Steps 1 to 6 continues until all the particles in the swarm (the weight vectors) have achieved the global best position corresponding to a predefined mean square error.
8.4.2 Identification algorithm using FLANN structure and CPSO based training
In CPSO, Steps 1 to 5 of the basic PSO remain the same.
Step 6. Updating:
The position and velocity of the d-th weight of each i-th particle for the next search are obtained by the update rules given in (8.24)-(8.25). The maximum velocity and position of the particles after updating are controlled by (8.20) and (8.22).
Step 7. Evaluation of the global best position:
The global best position of the particles is evaluated in a similar way as described in Step 7 of PSO.
Step 8. Cloning operation:
The global best position is cloned in the sense that all particles of the swarm start their next search from this position.
Step 9. Stopping criterion:
The search process of Steps 4 to 8 continues until all the particles in the swarm (the weight vectors) have attained the global best corresponding to a predefined mean square error.
8.4.3 Identification algorithm using FLANN structure and IPSO based training
Steps 1 to 5 are the same as those of CPSO.
Step 6. Updating:
The position and velocity of the d-th weight of each i-th particle for the next search are obtained by the update rules given in (8.26)-(8.27). The maximum velocity and position of the particles after updating are governed by (8.20) and (8.22).
Step 7. Evaluation of the global best position:
The global best position of the particles is evaluated following Step 7 of the PSO algorithm.
Step 8. Cloning operation:
The global best particle position is cloned so that all particles occupy the same best position.
Step 9. Mutation:
A mutation process is incorporated to introduce variation in the cloned positions. The probability of mutation P_m is taken to be greater than 0.5. The mutated children produced are given by
$$x_{m_{id}}(k+1) = x_{T_1 d}(k+1) + z(k)\, x_{T_2 d}(k+1) \qquad (8.37)$$
$$x_{m_{(i+1)d}}(k+1) = x_{T_2 d}(k+1) + z(k)\, x_{T_1 d}(k+1) \qquad (8.38)$$
Step 10. Stopping criterion:
The search process described in Steps 4 to 9 continues until all the particles in the swarm (the weight vectors) have converged to the global best position yielding a predefined minimum mean square error.
8.5 Simulation study
To demonstrate the improved identification performance of the two new algorithms, a simulation study using MATLAB is carried out. Four standard Hammerstein plants are identified using the CPSO and IPSO algorithms. The accuracy of identification of the proposed models is assessed by comparing the following results:
1. True and estimated responses at the output of the nonlinear static part.
2. True and estimated coefficients of the linear dynamic part.
3. Sum of squared errors (SSE) between the true and overall estimated responses, defined as
$$SSE = \sum_{k=1}^{K} (y(k) - \hat{y}(k))^2 \qquad (8.39)$$
where y(k) is the true output and ŷ(k) is the estimated output during testing.
Example 1
The Hammerstein plant used for identification [8.32] is given by
$$A(z^{-1})\,y(k) = B(z^{-1})\,x(k-1) + e(k) \qquad (8.40)$$
$$x(k) = F(u(k)) = u(k) + 0.5u^3(k) \qquad (8.41)$$
$$A(z^{-1}) = 1 + 0.8z^{-1} + 0.6z^{-2} \qquad (8.42)$$
$$B(z^{-1}) = 0.4 + 0.2z^{-1} \qquad (8.43)$$
The input to the plant and to the identification model is uniformly distributed in [-3.0, 3.0]. Zero mean white Gaussian noise with standard deviation 0.01 is added to the plant. The number of input samples used to train the network is 300. In the model the nonlinear static part is represented by a FLANN structure, and each input sample is expanded to four terms by the power series expansion
$$\Phi_i\{u(k)\} = \begin{cases} 1 & i = 0 \\ u(k) & i = 1 \\ u^i(k) & i = 2, 3 \end{cases} \qquad (8.44)$$
These expanded inputs are weighted by the coefficients of the FLANN. The weights are updated using the GA, CLONAL, PSO, CPSO and IPSO algorithms. In all cases the initial population of particles is taken as 70 and the weights of the model are trained for 40 generations. The positions of the particles are confined to the range [-2, 2] and their velocities to the range [-1.5, 1.5]. In the case of IPSO the probability of mutation P_m is taken as 0.8, and the values of z_i and z_f are set to 0.9 and 0.05 respectively.
The true and estimated outputs of the nonlinear static part of the plant are compared in Fig. 8.4. The estimates of the system parameters of the linear dynamic part are compared in Table 8.1, with the percentage error given in brackets:
$$\text{Percentage error} = \left|\frac{\text{actual coefficient} - \text{estimated coefficient}}{\text{actual coefficient}}\right| \times 100 \qquad (8.45)$$
The CPU time required for training the model and the sum of squared errors (SSE) obtained during testing are presented in Table 8.2.
[Fig. 8.4: F(u(k)) versus u(k) for the true nonlinearity and the PSO, CLONAL, CPSO and IPSO models]
Fig. 8.4 Comparison of the response at the output of the nonlinear static part of the plant and the corresponding models of Example 1
Table 8.1
Comparison of true and estimated parameters of the linear dynamic part of the model of Example 1 (percentage error in brackets)

Parameters | True values | IPSO | CPSO | PSO | CLONAL | GA
a1 | 0.8 | 0.805 (0.6%) | 0.900 (12.5%) | 0.850 (6.2%) | 0.826 (3.2%) | 0.835 (4.3%)
a2 | 0.6 | 0.590 (1.6%) | 0.606 (1.0%) | 0.650 (8.3%) | 0.614 (2.3%) | 0.630 (5.0%)
b0 | 0.4 | 0.409 (2.2%) | 0.350 (12.5%) | 0.336 (16.0%) | 0.456 (14.0%) | 0.345 (13.7%)
b1 | 0.2 | 0.210 (5.0%) | 0.240 (20.0%) | 0.240 (20.0%) | 0.236 (18.0%) | 0.240 (20.0%)
Table 8.2
Comparison of CPU time and SSE for identifying the plant of Example 1

Algorithm | CPU time during training (sec) | SSE during testing
IPSO | 28.468 | 0.661
CPSO | 28.448 | 4.334
PSO | 27.813 | 10.683
CLONAL | 35.125 | 1.948
GA | 41.213 | 7.358
Example 2
In this example [8.32] the plant is described by
$$A(z^{-1})\,y(k) = B(z^{-1})\,x(k-1) + e(k) \qquad (8.46)$$
$$x(k) = F(u(k)) = \begin{cases} -2.0 & -3.0 \le u(k) < -1.8 \\ u(k)/0.6 + 1.0 & -1.8 \le u(k) < -0.6 \\ 0.0 & -0.6 \le u(k) < 0.6 \\ u(k)/0.6 - 1.0 & 0.6 \le u(k) < 1.8 \\ 2.0 & 1.8 \le u(k) \le 3.0 \end{cases} \qquad (8.47)$$
$$A(z^{-1}) = 1 + 0.8z^{-1} + 0.6z^{-2} \qquad (8.48)$$
$$B(z^{-1}) = 0.4 + 0.2z^{-1} \qquad (8.49)$$
This system has saturation and dead-zone nonlinearities. The input signal is uniformly distributed in [-3.0, 3.0], and a total of 100 such samples are used for training. The noise e(k) is zero mean white Gaussian with standard deviation 0.01. In the FLANN, each input sample is expanded as
$$\Phi_i\{u(k)\} = \begin{cases} 1 & i = 0 \\ u(k) & i = 1 \\ \sin(u(k)) & i = 2 \end{cases} \qquad (8.50)$$
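The piecewise nonlinearity (8.47) may be rendered compactly with `np.piecewise`, as in the following sketch:

```python
import numpy as np

def f_example2(u):
    """Saturation and dead-zone nonlinearity of Example 2, eq. (8.47)."""
    u = np.asarray(u, dtype=float)
    conds = [u < -1.8,
             (u >= -1.8) & (u < -0.6),
             (u >= -0.6) & (u < 0.6),
             (u >= 0.6) & (u < 1.8),
             u >= 1.8]
    funcs = [-2.0,
             lambda v: v / 0.6 + 1.0,
             0.0,
             lambda v: v / 0.6 - 1.0,
             2.0]
    return np.piecewise(u, conds, funcs)
```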
The initial population of particles is taken as 70 and the weights of the model are trained for 40 generations. The positions of the particles are confined to the range [-2, 2] and their velocities to the range [-1.5, 1.5]. In the case of IPSO the probability of mutation P_m is taken as 0.8, and the values of z_i and z_f are 0.9 and 0.05 respectively. The true and estimated outputs of the nonlinear static part of the model are presented in Fig. 8.5. The true and estimated system parameters of the linear dynamic part, along with the percentage errors, are compared in Table 8.3. Table 8.4 shows the comparative results of the CPU time required for training the model and the sum of squared errors (SSE) obtained during testing.
[Fig. 8.5: F(u(k)) versus u(k) for the true nonlinearity and the IPSO, CPSO, PSO and CLONAL models]
Fig. 8.5 Comparison of the response at the output of the nonlinear static part of the plant and the corresponding models of Example 2
Table 8.3
Comparative results of the estimates of the system parameters of the linear dynamic part of the model of Example 2

Parameters | True values | IPSO | CPSO | PSO | CLONAL | GA
a1 | 0.8 | 0.810 (1.2%) | 0.690 (13.7%) | 0.900 (12.5%) | 0.752 (6.0%) | 0.861 (7.62%)
a2 | 0.6 | 0.600 (0.0%) | 0.520 (13.3%) | 0.553 (7.83%) | 0.582 (3.0%) | 0.563 (6.17%)
b0 | 0.4 | 0.410 (2.5%) | 0.450 (12.5%) | 0.466 (16.5%) | 0.378 (5.5%) | 0.441 (10.2%)
b1 | 0.2 | 0.200 (0.0%) | 0.220 (10.0%) | 0.251 (25.5%) | 0.160 (20.0%) | 0.236 (18.0%)
Table 8.4
Comparison of CPU time and SSE for identifying the plant of Example 2

Algorithm | CPU time during training (sec) | SSE during testing
IPSO | 9.844 | 0.149
CPSO | 9.687 | 0.513
PSO | 9.578 | 2.410
CLONAL | 14.906 | 0.486
GA | 19.104 | 1.231
Example 3
The third Hammerstein model [8.10] chosen for identification is
$$A(z^{-1})\,y(k) = B(z^{-1})\,x(k-1) + e(k) \qquad (8.51)$$
$$x(k) = F(u(k)) = u(k) + 0.5u^2(k) + 0.3u^3(k) + 0.1u^4(k) \qquad (8.52)$$
$$A(z^{-1}) = 1 + 0.9z^{-1} + 0.15z^{-2} + 0.02z^{-3} \qquad (8.53)$$
$$B(z^{-1}) = 0.7 + 1.5z^{-1} \qquad (8.54)$$
The plant model of Fig. 8.3 is taken into consideration. The input to this model is uniformly distributed in [-1.0, 1.0], and 300 input samples are used during training. White Gaussian noise with zero mean and standard deviation 0.01 is added at the output of the plant. In the FLANN each input sample is expanded to five terms:
$$\Phi_i\{u(k)\} = \begin{cases} 1 & i = 0 \\ u(k) & i = 1 \\ \sin((i-1)\,u(k)) & i = 2, 4 \\ \cos((i-1)\,u(k)) & i = 3 \end{cases} \qquad (8.55)$$
The initial population of particles is taken as 90. The weights of the model are trained for
40 generations. The positions of the particles are considered within range [-2, 2] and their
velocities are taken in the range [-1.5, 1.5]. In case of IPSO the probability of mutation Pm is taken as 0.8. The values of z_i and z_f used are 1 and 0.01 respectively. The true and estimated outputs of the nonlinear static part of the given example are compared in Fig. 8.6. A comparison of the true and estimated system parameters of the linear dynamic part, along with the percentage error, is shown in Table 8.5. Table 8.6 presents the CPU time required for training the model together with the sum of squared errors (SSE) obtained during testing.
[Plot: F(u(k)) versus u(k), comparing the true nonlinearity with the IPSO, CPSO, PSO and CLONAL model estimates]
Fig. 8.6 Comparison of response at the output of nonlinear static part of the plant and the
corresponding models of Example 3
Table 8.5
Comparative results of estimates of system parameters for the dynamic part of the model of Example 3 (estimated values with percentage error in parentheses)

| Parameter | True value | IPSO | CPSO | PSO | CLONAL | GA |
| a1 | 0.9 | 0.898 (0.2 %) | 0.900 (0.0 %) | 0.841 (6.5 %) | 1.037 (15.2 %) | 1.011 (12.3 %) |
| a2 | 0.15 | 0.146 (2.6 %) | 0.069 (54.0 %) | 0.150 (0.0 %) | 0.522 (248 %) | 0.471 (214 %) |
| a3 | 0.02 | 0.020 (0.0 %) | 0.020 (0.0 %) | 0.020 (0.0 %) | 0.188 (940 %) | 0.195 (975 %) |
| b0 | 0.7 | 0.696 (0.5 %) | 0.750 (7.14 %) | 0.416 (40.5 %) | 0.830 (18.5 %) | 0.846 (20.8 %) |
| b1 | 1.5 | 1.494 (0.4 %) | 1.084 (27.7 %) | 0.892 (40.5 %) | 1.067 (28.8 %) | 0.992 (33.8 %) |
Table 8.6
Comparison of CPU time and SSE for identifying the plant of Example 3

| Algorithm | CPU time during training (sec.) | SSE during testing |
| IPSO | 53.79 | 3.587 |
| CPSO | 51.63 | 6.550 |
| PSO | 50.60 | 21.779 |
| CLONAL | 57.68 | 19.77 |
| GA | 64.71 | 19.96 |
Example 4
The Hammerstein plant used in this example is the same as that of Example 3, except that a different nonlinear static part, given in (8.56), is used:

x(k) = F(u(k)) = 0.5 sin^3(π u(k))    (8.56)
For modeling this plant the weights of the model are trained for 40 generations. In case of IPSO the values of z_i and z_f are set to 0.9 and 0.05 respectively. The remaining simulation conditions are the same as those used in Example 3. The true and estimated outputs of the nonlinear static part of the given example are compared in Fig. 8.7. A comparison of the estimates of the system parameters of the linear dynamic part is provided in Table 8.7. Table 8.8 compares the CPU time required for training the model and the sum of squared errors (SSE) obtained during testing.
[Plot: F(u(k)) versus u(k), comparing the true nonlinearity with the IPSO, CPSO, PSO and CLONAL model estimates]
Fig. 8.7 Comparison of response at the output of nonlinear static part of the plant and the
corresponding models of Example 4
Table 8.7
Comparative results of estimates of system parameters for the dynamic part of the model of Example 4 (estimated values with percentage error in parentheses)

| Parameter | True value | IPSO | CPSO | PSO | CLONAL | GA |
| a1 | 0.9 | 0.890 (1.1 %) | 0.910 (1.1 %) | 1.000 (11.1 %) | 0.935 (3.89 %) | 0.942 (4.67 %) |
| a2 | 0.15 | 0.150 (0.0 %) | 0.170 (13.3 %) | 0.310 (3.8 %) | 0.240 (60.0 %) | 0.203 (35.3 %) |
| a3 | 0.02 | 0.021 (5.0 %) | 0.022 (10.0 %) | 0.088 (340 %) | 0.075 (275 %) | 0.078 (290 %) |
| b0 | 0.7 | 0.690 (1.4 %) | 0.750 (7.1 %) | 0.597 (14.7 %) | 0.462 (34.0 %) | 0.475 (32.1 %) |
| b1 | 1.5 | 1.480 (1.3 %) | 1.471 (1.93 %) | 0.875 (41.6 %) | 1.201 (19.9 %) | 1.122 (25.2 %) |
Table 8.8
Comparison of CPU time and SSE for identifying the plant of Example 4

| Algorithm | CPU time during training (sec.) | SSE during testing |
| IPSO | 46.42 | 0.0016 |
| CPSO | 44.04 | 0.0066 |
| PSO | 44.00 | 0.9744 |
| CLONAL | 51.23 | 0.1252 |
| GA | 57.01 | 0.3517 |
8.6 Conclusion
In this chapter two modified PSO based algorithms have been proposed by incorporating modifications into the standard PSO algorithm, and these have been used to train the weights of the FLANN structure of the nonlinear static part and the parameters of the linear dynamic part of the Hammerstein plant identification models. Both proposed algorithms are relatively simple compared to the original PSO algorithm. The simulation study reveals that the IPSO algorithm offers the best identification performance of all the algorithms compared. Of the two proposed algorithms, the CPSO is computationally simpler and offers identification performance close to that of its PSO counterpart. Under identical conditions the IPSO requires the most CPU time, followed by the CPSO and the standard PSO. These observations have been arrived at by comparing the SSE, the output responses and the true and estimated parameters obtained from the simulation study of four benchmark examples.
References
[8.1] S. A. Billings and S. Y. Fakhouri, "Identification of Systems Containing Linear Dynamic and Static Nonlinear Elements", Automatica, vol. 18, no. 1, pp. 15-26, 1982.
[8.2] F. C. Kung and D. H. Shih, "Analysis and Identification of Hammerstein Model Non-linear Delay Systems Using Block-pulse Function Expansions", Int. J. Control, vol. 43, no. 1, pp. 139-147, 1986.
[8.3] S. Adachi and H. Murakami, "Generalized Predictive Control System Design Based on Non-linear Identification by Using Hammerstein Model", Trans. of the Institute of Systems, Control and Information Engineers, vol. 8, no. 3, pp. 115-121, 1995.
[8.4] H. Al-Duwaish and M. N. Karim, "A New Method for the Identification of Hammerstein Model", Automatica, vol. 33, no. 10, pp. 1871-1875, 1997.
[8.5] Y. Kobayashi, M. Oki and T. Okita, "Identification of Hammerstein Systems with Unknown Order by Neural Networks", Trans. on the IEE Japan, vol. 120-C, no. 6, pp. 871-878, 2000.
[8.6] H. X. Li, "Identification of Hammerstein models using genetic algorithms", IEE Proc. Control Theory Appl., vol. 146, no. 6, pp. 499-504, 1999.
[8.7] T. Hachino, K. Deguchi and H. Takata, "Identification of Hammerstein Model using Radial Basis Function and Genetic Algorithms", in Proc. of 5th Asian Control Conference, pp. 124-129, 2004.
[8.8] W. Lin, H. Zhang and P. X. Liu, "A New Identification Method for Hammerstein Model Based on PSO", in Proc. of IEEE International Conference on Mechatronics and Automation, pp. 2184-2188, 2006.
[8.9] W. Lin, C. G. Jiang and J. X. Qian, "The identification of Hammerstein model based on PSO with fuzzy adaptive inertia weight", J. Syst. Sci. Inf., vol. 3, no. 2, pp. 381-391, 2005.
[8.10] W. Lin and P. X. Liu, "Hammerstein model identification based on bacterial foraging", Electronics Letters, vol. 42, no. 23, pp. 1332-1333, 2006.
[8.11] R. Eberhart and J. Kennedy, "A new optimizer using particle swarm theory", in Proc. of 6th Int'l Symposium on Micro Machine and Human Science, Nagoya, Japan, pp. 39-43, 1995.
[8.12] G. Panda, D. Mohanty, B. Majhi and G. Sahoo, "Identification of nonlinear systems using particle swarm optimization technique", in Proc. of IEEE Congress on Evolutionary Computation, Singapore, pp. 3253-3257, Sept. 2007.
[8.13] P. Angeline, "Evolutionary optimization versus particle swarm optimization: philosophy and performance difference", in Proc. of the Evolutionary Programming Conference, San Diego, USA, pp. 169-173, 1998.
[8.14] J. Liu, W. Xu and J. Sun, "Quantum-behaved Particle Swarm Optimization with Mutation Operator", in Proc. of IEEE Int. Conf. on Tools with Artificial Intelligence (ICTAI '05), 2005.
[8.15] J. Liu, X. Fan and Z. Qu, "An Improved Particle Swarm Optimization with Mutation Based on Similarity", in Proc. of IEEE International Conference on Natural Computation (ICNC 07), pp. 824-828, 2007.
[8.16] M. Senthil Arumugam, A. Chandramohan and M. V. C. Rao, "Competitive Approaches to PSO Algorithms via New Acceleration Co-efficient Variant with Mutation Operators", in Proc. of IEEE Sixth Int. Conf. on Computational Intelligence and Multimedia Applications (ICCIMA '05), pp. 225-230, 2005.
[8.17] V. Katari, S. Malireddi, S. K. S. Bendapudi and G. Panda, "Adaptive Nonlinear System Identification using Comprehensive Learning PSO", in Proc. of IEEE 3rd International Symposium on Comm., Cont. and Signal Processing, pp. 434-439, Mar. 2008.
[8.18] D. Dasgupta, "Advances in Artificial Immune Systems", IEEE Computational Intelligence Magazine, vol. 1, issue 4, pp. 40-49, Nov. 2006.
[8.19] D. Dasgupta and N. Attoh-Okine, "Immunity-based systems: A survey", in Proc. IEEE Int. Conf. Syst., Man, Cybern., Orlando, FL, Oct. 12-15, 1997, pp. 369-374.
[8.20] P. S. Andrews and J. Timmis, "Inspiration for the next generation of artificial immune systems", in Proc. of Fourth International Conference on Artificial Immune Systems, Lecture Notes in Computer Science, vol. 3627, pp. 126-138, 2005.
[8.21] L. N. de Castro and F. J. Von Zuben, "Learning and Optimization using the Clonal Selection Principle", IEEE Trans. on Evolutionary Computation, Special Issue on Artificial Immune Systems, vol. 6, issue 3, pp. 239-251, 2002.
[8.22] L. N. de Castro and J. Timmis, "An Artificial Immune Network for Multimodal Function Optimization", in Proc. of IEEE Congress on Evolutionary Computation (CEC '02), vol. 1, pp. 699-704, Hawaii, May 2002.
[8.23] S. Forrest, S. A. Hofmeyr, A. Somayaji and T. A. Longstaff, "A sense of self for Unix processes", in Proc. of the IEEE Symposium on Computer Security and Privacy, 1996, pp. 120-128.
[8.24] S. Forrest and S. A. Hofmeyr, "Immunology as information processing: Design Principles for the Immune System and Other Distributed Autonomous Systems", Santa Fe Institute Studies in the Sciences of Complexity, pp. 361-387, Oxford Univ. Press, 2000.
[8.25] D. W. Bradley and A. M. Tyrrell, "Immunotronics - Hardware fault tolerance inspired by the immune system", in Proc. of the Third International Conference on Evolvable Systems, Lecture Notes in Computer Science, vol. 1801, pp. 11-20, Springer-Verlag, 2000.
[8.26] D. W. Bradley and A. M. Tyrrell, "Immunotronics - novel finite-state-machine architectures with built-in self-test using self-nonself differentiation", IEEE Transactions on Evolutionary Computation, vol. 6, no. 3, pp. 227-238, 2002.
[8.27] H.-W. Ge, L. Sun, Y. C. Liang and F. Qian, "An effective PSO and AIS-based hybrid intelligent algorithm for job shop scheduling", IEEE Transactions on Systems, Man and Cybernetics A, vol. 38, no. 2, pp. 358-368, 2008.
[8.28] F. Esponda, S. Forrest and P. Helman, "A formal framework for positive and negative detection", IEEE Transactions on Systems, Man and Cybernetics, vol. 34, pp. 357-373, 2004.
[8.29] P. Matzinger, "The Danger Model: A Renewed Sense of Self", Science, vol. 296, pp. 301-305, 2002.
[8.30] J. C. Patra, R. N. Pal, B. N. Chatterji and G. Panda, "Identification of nonlinear dynamic systems using functional link artificial neural networks", IEEE Transactions on Systems, Man and Cybernetics B, vol. 29, pp. 254-262, 1999.
[8.31] J. C. Patra and A. C. Kot, "Nonlinear dynamic system identification using Chebyshev functional link artificial neural networks", IEEE Transactions on Systems, Man and Cybernetics B, vol. 32, pp. 505-511, 2002.
[8.32] T. Hachino, K. Deguchi and H. Takata, "Identification of Hammerstein Model using radial basis function and genetic algorithms", in Proc. of the 5th Asian Control Conference, Melbourne, Australia, 2004, pp. 124-129.
Chapter 9
Development of Distributed Particle Swarm Optimization Algorithms for Robust Nonlinear System Identification
9.1 Introduction
The wireless sensor networks provide a smart interaction with the physical world and are deployed at low cost and in large numbers in remote environments. They yield autonomous and intelligent measurements, answer queries and also perform monitoring tasks. Application areas include environment monitoring, battlefield surveillance, health care, home automation and so on [9.1]. In a traditional centralized solution, the nodes in the network collect observations and send them to a central location for processing. The central processor then performs the required estimation tasks and broadcasts the result back to the individual nodes. This mode of operation requires a powerful central processor, in addition to extensive amounts of communication between
the nodes and the processor. Distributed processing extracts information from data collected at different points distributed over a geographical area. In the distributed mode of solution, each node uses its own local data and interacts with its neighbouring nodes to obtain a global solution, so that the amount of processing and communication is significantly reduced [9.2]-[9.4]. In a distributed scenario all the nodes coordinate with each other by some means, and the distributed algorithm guides the system towards a promising solution without being biased by the local information of any individual node. Though each sensor is characterized by a low power budget and limited computation and communication capabilities, powerful networks can be built to perform various high level tasks through sensor collaboration [9.5], such as distributed estimation, distributed detection, and target localization and tracking. Applications range from precision agriculture, environment monitoring and disaster relief management to smart spaces and medical applications [9.2], [9.6], [9.7]. Distributed parameter estimation has been suggested in [9.8]-[9.12] assuming that the joint distribution of the sensors' observations is known and that messages can be sent from the sensors to the fusion center without distortion.
The effectiveness of any distributed implementation depends on the modes of cooperation that are allowed among the nodes. Two such modes of cooperation are shown in Fig. 9.1. In the first approach the overall system parameters are estimated in a cooperative fashion by obtaining local estimates and sharing them with pre-identified neighbours. This mode of operation requires a cyclic pattern of collaboration among the nodes, and it tends to require the least amount of communication and power [9.3], [9.13], [9.14].
Fig. 9.1 Two modes of cooperation: (a) incremental, (b) diffusion
The second approach employs diffusion protocols for establishing cooperation among individual nodes [9.15]-[9.17]. Each node obtains local estimates of the parameters of interest and shares this information with its neighbours. The amount of communication in this case is higher than in the incremental case. The communication in the diffusion implementation can be reduced by allowing each node to communicate with only a subset of its neighbours. Both incremental and diffusion algorithms are distributed and cooperative in nature. They are also capable of responding in real time to environmental changes. Recently an adaptive distributed strategy based on incremental techniques has been proposed for linear estimation in a cooperative fashion [9.18]. A study of distributed estimation algorithms based on diffusion protocols to implement cooperation among individual adaptive nodes is reported in [9.19]. A distributed evolutionary optimization framework based on a swarm intelligence principle has recently been reported [9.20] for sensor networks. A different version of the particle swarm optimization algorithm has been reported for swarm robotic applications [9.21]. An RF IC optimization methodology based on an elitist distributed particle swarm optimization algorithm is presented in [9.22].
The literature review reveals that the recently reported distributed algorithms have been applied only to linear estimation of the parameters of interest in a region. On the other hand, in many practical situations the local input and output data collected at each node of the network are nonlinearly related, which means that each node functions as an individual nonlinear filter. The research work reported in this chapter is based on the following motivations:
(i) There is a need to develop novel distributed algorithms using an evolutionary computing strategy.
(ii) The performance of the new algorithms needs to be assessed when they are used for system identification.
(iii) A methodology needs to be evolved for robust identification of plants using distributed algorithms and a robust norm.
In this chapter two novel distributed PSO algorithms, incremental PSO (INPSO) and diffusion PSO (DPSO), are proposed and applied to system identification. In this method of identification the minimization of the mean square error is taken as the cost function. But when outliers are present in the training samples this conventional cost function does not provide satisfactory performance. Therefore, to achieve robust identification performance against outliers, a robust norm is introduced and used as the cost function. In this chapter the performance
of a new identification scheme using the INPSO and DPSO algorithms along with the robust norm has been assessed through a simulation study.
9.2 Distributed system identification
(a) INPSO based system identification
The basic concept of the incremental PSO (INPSO) algorithm is as follows.
Each node performs its own calculation using its local data, updates its particles' positions, velocities and personal best (pbest) positions, and transfers the global best position to its next neighbour. The adjacent node updates its own particles' positions, velocities and pbests using its local data and the gbest vector received from the previous node, and transmits the newly updated gbest to its neighbour. This process continues for several cycles through the network until the desired solution is obtained. Fig. 9.2 shows the schematic representation of the INPSO based identification scheme. In this scheme it is assumed that N sensor nodes (1 ≤ n ≤ N) present in a small region participate in the identification task using their local measurements {X_n, D_n}. In this case X_n = [x_{n1} x_{n2} ... x_{nM}]^T and D_n = [d_{n1} d_{n2} ... d_{nM}]^T represent the input and desired samples of the n-th node. Each desired sample is contaminated with white Gaussian noise v_m(n). At each node the swarm consists of K particles. In any l-th search (1 ≤ l ≤ L), the position vector, the best position vector and the velocity vector of the k-th particle of the first node are represented by P_{1k}(l), B_{1k}(l) and V_{1k}(l) respectively. During a specific search, a pair of input-output samples {x_m, d_m} produces an error, and hence a total of M errors are generated for a particle. The mean square error E_{1k}(l) of the k-th particle at the l-th search is thus computed and used as the fitness or cost function. The initial position vector of any k-th particle is taken to be its initial personal best position vector. From the set of B_{1k}(l), 1 ≤ k ≤ K, the best position vector corresponding to the least E_{1b}(l) is chosen as the global best position vector. The values of P_{1k}(l), B_{1k}(l), V_{1k}(l) associated with the k-th
particle population of the first node and the global best position vector G_N(l) of the N-th node are used to update the velocity vector and then the position vector. The update equations are

V_{1k}(l+1) = w(l) V_{1k}(l) + c_1 rand_{1k}(l) [B_{1k}(l) - P_{1k}(l)] + c_2 rand_{2k}(l) [G_1(l) - P_{1k}(l)]    (9.1)

P_{1k}(l+1) = P_{1k}(l) + V_{1k}(l+1)    (9.2)

where P_{1k}(l) and B_{1k}(l) are the position and local best position vectors at the l-th search of the k-th particle of node 1, G_1(l) represents the global best position vector at the l-th search of node 1, rand_{1k}(l) and rand_{2k}(l) are random numbers in the range [0, 1], c_1 and c_2 represent the acceleration coefficients, and w is the inertia weight which balances the global and local searches. The inertia weight at the l-th search is given by

w(l) = w_0 - (w_0 - w_1) l / L    (9.3)

where l is the search number, L is the total number of searches, w_0 = 0.9 and w_1 = 0.4.
The dotted portion of Fig. 9.2 indicated by A_1 corresponds to node 1 of the incremental sensor network. The contents of blocks A_2, A_{N-1} and A_N of Fig. 9.2 are identical to those of A_1 except that the initialized values of the positions and velocities of the particles are different. The updated values are used to obtain the new personal best and global best position vectors required for the next search. After completion of the l-th search at the n-th node, the global best position vector G_n(l) is evaluated and communicated to the (n+1)-th neighbour. This process is repeated until the average MSE of the cluster of nodes reduces to the lowest possible value. Under this situation, the final global best position vector gives the estimate of the weight vector of the model. Identification of the nonlinear system is achieved when the response of the model matches the plant output.
[Schematic: each block A_n uses its local data {X_n, D_n} (with additive noise v(m)) to evaluate the particle errors E_{nk}(l), selects the minimum E_{bnk}(l) to form G_n(l), updates the particle positions and velocities, and passes the resulting global best around the ring A_1 -> A_2 -> ... -> A_N]
Fig. 9.2 INPSO based nonlinear identification scheme
(b) DPSO based system identification
The diffusion based distributed mechanism using LMS is reported in [9.19]. The diffusion PSO (DPSO) algorithm uses a similar cooperation strategy. The main objective in this case is to develop an adaptive distributed procedure that approximates the solution and delivers a good estimate to every node in the network. In DPSO each node updates its particles' positions, velocities and pbests using its local data and transmits the gbest vector to all other neighbouring nodes. Every node then finds the best gbest and employs this gbest and its locally available data for computing its new positions, pbests and gbest. Such an update strategy reduces the amount of information that is communicated among the particles and thus increases the stability of the algorithm. The information exchange mechanism of DPSO is shown in Fig. 9.3.
As in the case of INPSO, the global best positions G_n at all nodes are evaluated using local data. Similarly the positions and velocities of all particles at the nodes are simultaneously evaluated. The computed global best information G_n is exchanged between all participating nodes. Each node then locally compares and selects the best of the global best positions (Gb_n) and uses it for updating the velocities and positions of the particles of that node. The velocity and position update equations remain the same as given in (9.1) and (9.2). The above procedure is repeated until the average MSE of all nodes attains the lowest possible value.
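The cooperation step itself is simple; a sketch is given below, where the neighbourhood map and the function name are assumptions made for illustration.

```python
def diffuse_gbest(gbests, fitnesses, neighbours, n):
    """DPSO cooperation step: node n compares the gbest vectors received from
    its neighbours (plus its own) and keeps the one with the lowest cost.
    gbests     : list of gbest vectors, one per node
    fitnesses  : list of the corresponding cost values (lower is better)
    neighbours : dict mapping each node index to the indices it hears from
    """
    candidates = [n] + list(neighbours[n])
    best = min(candidates, key=lambda j: fitnesses[j])
    return gbests[best]          # Gb_n, used in the update (9.1)
```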
[Schematic: as in Fig. 9.2, each block A_n evaluates its particle errors from local data and forms G_n(l); here every node broadcasts G_n(l) to all neighbouring nodes and selects the best received value Gb_n(l) for its own update]
Fig. 9.3 DPSO based nonlinear identification scheme
9.3 Distributed robust identification of plants
When the measured data contain outliers of different densities and strengths, the conventional learning algorithms based on squared error minimization provide poor identification performance. This is because of improper training of the identification model, and hence the scheme is not robust. To improve the identification performance under such adverse conditions the learning strategy needs improvement. An attempt has been made to fulfill this objective by selecting and employing a robust norm from the statistics literature. One such norm is the Wilcoxon norm [9.27], which has been proven to be robust against outliers. The details of this norm are available in Section 6.4 of Chapter 6.
The distributed training strategies used for identification with the INPSO and DPSO algorithms outlined in the previous sections are almost identical. The only exception in this case is a new cost function, namely the Wilcoxon norm evaluated from the error vector. This norm of the errors is minimized using the INPSO and DPSO algorithms as suggested in the previous section. The performance of these two algorithms in identification of plants is evaluated through a simulation study and is presented in a subsequent section.
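As a reference point, a sketch of the rank-based Wilcoxon norm in the form usual in the Wilcoxon learning literature [9.27] is given below. The exact definition used in the thesis is (6.12) of Chapter 6, which is not reproduced here, so this should be read as an assumption-laden illustration rather than the thesis's own code.

```python
import numpy as np

def wilcoxon_norm(e):
    """Wilcoxon (pseudo-)norm of an error vector e of length M:
    sum_i a(R(e_i)) e_i, with score a(i) = sqrt(12) (i/(M+1) - 1/2),
    where R(e_i) is the rank of e_i among the errors. Since the scores
    are bounded, a few very large errors cannot dominate the cost,
    which is what makes the norm insensitive to outliers."""
    M = len(e)
    ranks = np.argsort(np.argsort(e)) + 1              # ranks R(e_i), 1..M
    scores = np.sqrt(12.0) * (ranks / (M + 1) - 0.5)   # score function a(.)
    return float(np.sum(scores * e))
```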
9.4 Stepwise distributed PSO algorithms
The objective of an adaptive identification algorithm is to change the coefficients of the model iteratively so that the squared error e^2(k), or a robust norm such as the Wilcoxon norm, is minimized and subsequently reduced to the best possible minimum. In this case the architecture of the model is an adaptive linear combiner and the learning algorithm used to change the coefficient values is either INPSO or DPSO. The steps involved in these two algorithms are outlined as follows (a minimal sketch of the resulting training loop is given after the list):
1. M (M = 100) input signal samples, uniformly distributed between -0.5 and +0.5, are generated.
2. Ten of these input samples are assigned to each node. In this way all 100 samples are assigned to the 10 nodes.
3. The ten input samples of each node are passed through the nonlinear model of the corresponding node and measurement noise of known strength is added to the output. The resultant signal acts as the desired output at that node. This process is repeated at all nodes using their corresponding models, and the estimated output of each is obtained.
4. By comparing each desired output sample with the corresponding estimated output of the model, the error signal is obtained.
5. The mean square error (MSE) defined in (9.4), or the Wilcoxon norm defined in (6.12), of the k-th particle is determined:

   MSE(k) = (1/M) Σ_{m=1}^{M} e_m^2    (9.4)

   This is repeated for all particles.
6. The MSE/Wilcoxon norm is minimized using the INPSO and DPSO based techniques following the procedures outlined in Figs. 9.2 and 9.3 respectively.
7. The velocity and position of each particle are updated using (9.1) and (9.2).
8. In case of INPSO the gbest is transmitted to the next node, whereas in case of DPSO the same value is transmitted to all other neighbouring nodes. This process is repeated for each of the ten nodes.
9. In each iteration the average MSE/average Wilcoxon norm (expressed in dB) is stored and plotted against the iteration (cycle) number to show the learning characteristics of the distributed PSO algorithm.
10. When the MSE/Wilcoxon norm reaches the pre-specified level, the optimization process is stopped.
11. At this stage all the particles attain almost the same positions, which represent the estimated coefficient vector of the overall system.
To validate and compare the performance of the proposed INPSO and DPSO algorithms, the conventional PSO algorithm is also simulated to obtain the estimated parameters. To achieve this, all the data collected at the different nodes are accumulated at one node and used in the identification task [9.23].
9.5 Simulation study
Identification of standard linear and nonlinear plants is carried out in a distributed environment using the newly developed INPSO and DPSO algorithms. Zero mean white
Gaussian noise is added to the plant output to generate the training signal. To facilitate comparison, estimation of the parameters is also carried out using the centrally available measured data with the PSO algorithm for training the model.
A. Identification of linear plants
Three non-minimum phase all-zero systems [9.24], [9.25] used in this simulation are

1. x(n) = w(n) + 0.9w(n-1) + 0.385w(n-2) - 0.771w(n-3)
2. x(n) = w(n) - 0.8w(n-1) + 1.52w(n-2) - 0.64w(n-3) + 0.99w(n-4)
3. x(n) = w(n) - 2.33w(n-1) + 0.75w(n-2) + 0.5w(n-3) + 0.3w(n-4) - 1.4w(n-5)

The zeros of these systems are located at -0.75 ± j0.85 and 0.6; at 0.6 ± j0.8602 and -0.2 ± j0.9274; and at 1.851, -0.587 ± j0.563 and 0.827 ± j0.678, respectively. White Gaussian noise of strength -30 dB is added to the plant output to produce the measured output of the known plants. A uniform random signal within the range -0.5 to +0.5 is generated and used as the input signal. This is passed through both the plant and the adaptive model simultaneously. Table 9.1 shows the number of nodes and the number of input samples used in the simulation study. Note that the product of the number of nodes and the number of input samples is kept constant in all cases. The parameters estimated using the distributed PSO algorithms are listed in Table 9.2. This table also shows the percentage deviation of the coefficients, the MSE during testing and the training time in seconds obtained by the three different methods.
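The stated zero locations can be checked directly from the coefficients. The short sketch below is an illustrative check (not part of the thesis simulations): it uses numpy.roots and flags the zeros lying outside the unit circle, which confirms the non-minimum phase property of all three plants.

```python
import numpy as np

# FIR coefficient vectors of the three plants, highest power of z first
systems = {
    "Ex-1": [1, 0.9, 0.385, -0.771],
    "Ex-2": [1, -0.8, 1.52, -0.64, 0.99],
    "Ex-3": [1, -2.33, 0.75, 0.5, 0.3, -1.4],
}
for name, coeffs in systems.items():
    zeros = np.roots(coeffs)
    outside = np.abs(zeros) > 1.0   # non-minimum phase: at least one zero outside
    print(name, np.round(zeros, 4), outside.any())
```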
Table 9.1
Comparison of simulation parameters used in INPSO, DPSO and PSO based models

| Example no. | Parameter | INPSO | DPSO | PSO |
| Ex-1 | No. of nodes | 10 | 10 | 1 |
| Ex-1 | No. of input samples at each node | 50 | 50 | 500 |
| Ex-1 | No. of particles at each node | 10 | 10 | 100 |
| Ex-2 | No. of nodes | 10 | 10 | 1 |
| Ex-2 | No. of input samples at each node | 100 | 100 | 1000 |
| Ex-2 | No. of particles at each node | 100 | 100 | 1000 |
| Ex-3 | No. of nodes | 10 | 10 | 1 |
| Ex-3 | No. of input samples at each node | 100 | 100 | 1000 |
| Ex-3 | No. of particles at each node | 60 | 60 | 600 |
Table 9.2
Comparison of estimated parameters using INPSO, DPSO and PSO techniques (estimated values with percentage deviation in parentheses)

| Example | Actual parameter | INPSO | DPSO | PSO |
| Ex-1 | 1 | 0.999 (0.0100) | 1.0000 (0) | 0.999 (0.1000) |
| | 0.9 | 0.8988 (0.1333) | 0.9007 (0.0778) | 0.9015 (0.1667) |
| | 0.385 | 0.3842 (0.2078) | 0.3906 (1.4545) | 0.3840 (0.2597) |
| | -0.771 | -0.7682 (0.3632) | -0.7735 (0.3243) | -0.7714 (0.0519) |
| | MSE during testing | 9.13 x 10^-4 | 8.84 x 10^-4 | 9.16 x 10^-4 |
| | Execution time (s) | 0.1260 | 0.1143 | 1.0888 |
| Ex-2 | 1 | 0.9882 (1.1800) | 0.9998 (0.0200) | 0.9858 (1.4200) |
| | -0.80 | -0.8021 (0.2625) | -0.7965 (0.4375) | -0.8007 (0.0875) |
| | 1.52 | 1.4948 (1.6600) | 1.4983 (1.4276) | 1.4991 (1.3750) |
| | -0.64 | -0.6466 (1.0312) | -0.6369 (0.4844) | -0.6484 (1.3125) |
| | 0.99 | 0.9763 (1.3838) | 0.9855 (0.4545) | 0.9869 (0.3131) |
| | MSE during testing | 0.0050 | 0.0037 | 0.0013 |
| | Execution time (s) | 2.1755 | 2.1617 | 21.6401 |
| Ex-3 | 1 | 0.9923 (0.7700) | 0.9986 (0.1400) | 0.9994 (0.0600) |
| | -2.33 | -2.3216 (0.3605) | -2.3779 (2.0558) | -2.1757 (6.6223) |
| | 0.75 | 0.7381 (1.5867) | 0.6928 (7.6267) | 0.9921 (32.2800) |
| | 0.5 | 0.5272 (5.4400) | 0.5094 (1.8800) | 0.5232 (4.6400) |
| | 0.3 | 0.3207 (6.9000) | 0.2563 (14.5667) | 0.3154 (5.1333) |
| | -1.40 | -1.2800 (8.5700) | -1.2929 (7.6500) | -1.2543 (10.410) |
| | MSE during testing | 0.0476 | 0.0561 | 0.1798 |
| | Execution time (s) | 3.2648 | 3.2563 | 32.5122 |
These results indicate that both distributed methods provide identification performance comparable to, or even better than, that of the centralized PSO. Further, the training time of the new methods is much less than that of the conventional PSO based method.
B. Identification of nonlinear plants
The new algorithms are also used to estimate nonlinear parameters. Four nonlinear plants are simulated using combinations of two linear plants cascaded with two different nonlinearities [9.26].

Linear plants
4. H(z) = 0.3040 + 0.9030z^{-1} + 0.3040z^{-2} and
5. H(z) = 0.2600 + 0.9300z^{-1} + 0.2600z^{-2}

The two different nonlinearities used to construct the nonlinear plants are

NL1: b(k) = tanh{a(k)}
NL2: b(k) = a(k) + 0.2a^2(k) - 0.1a^3(k)

where a(k) is the output of the linear part H(z) of the node and b(k) is the output of the nonlinear part of the node. White Gaussian noise of strength -20 dB or -30 dB is added to the output of the system (a sketch of this data generation is given below). The convergence characteristics of INPSO, DPSO and the conventional PSO obtained from the simulation study of the nonlinear estimation problems of Examples 4 and 5 are shown in Figs. 9.4(a)-(h) for -30 dB and -20 dB additive noise. These characteristics indicate that the new distributed approaches converge to noise floor levels below that achieved by the PSO method. The convergence performance of DPSO and INPSO is observed to be almost identical.
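A minimal sketch of generating one node's desired signal for these plants follows. Two points in it are labeled assumptions: the operator signs of NL2 (garbled in the source and taken here from the cascaded-channel models of [9.26]), and the reading of "-30 dB noise strength" as noise power 30 dB below the plant-output power.

```python
import numpy as np

def nonlinear_plant(w, h, nl, noise_db, seed=0):
    """Desired signal of one node: FIR filter h, static nonlinearity nl,
    plus white Gaussian noise at the given relative strength in dB."""
    rng = np.random.default_rng(seed)
    a = np.convolve(w, h)[:len(w)]        # linear part a(k)
    b = nl(a)                             # nonlinear part b(k)
    std = np.sqrt(np.mean(b**2) * 10.0 ** (noise_db / 10.0))  # assumed dB convention
    return b + rng.normal(0.0, std, len(b))

h4 = np.array([0.3040, 0.9030, 0.3040])                 # plant 4
h5 = np.array([0.2600, 0.9300, 0.2600])                 # plant 5
nl1 = np.tanh                                           # NL1
nl2 = lambda a: a + 0.2 * a**2 - 0.1 * a**3             # NL2 (sign assumption)
w = np.random.default_rng(1).uniform(-0.5, 0.5, 100)    # node input samples
d = nonlinear_plant(w, h4, nl1, noise_db=-30)           # Ex-4 with NL1 at -30 dB
```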
The output responses obtained by these methods are compared during the test phase and are shown in Figs. 9.5(a) and (b). The comparison shows excellent agreement between the actual and estimated responses obtained by the new methods. In addition, the response achieved by the new methods is comparable with that of the conventional PSO based method. Table 9.3 gives the CPU time taken by the new algorithms during the training phase when they are implemented under identical conditions. It also lists the sum of squared errors (SSE) obtained from the actual and estimated responses for the different examples. This table indicates that at both noise levels the CPU time taken by the INPSO and DPSO based algorithms is less than that of the PSO model. Therefore, on all counts the INPSO and DPSO approaches are observed to be better distributed candidates for parameter estimation or response matching of complex plants.
[Plots in Figs. 9.4(a)-(h): MSE in dB versus number of iterations for INPSO, DPSO and PSO]
Fig 9.4 (a) Convergence of Ex-4 with NL1 at -30dB
Fig 9.4 (b) Convergence of Ex-4 with NL1 at -20dB
Fig. 9.4 (c) Convergence of Ex-5 with NL1 at -30dB
Fig. 9.4 (d) Convergence of Ex-5 with NL1 at -20dB
Fig. 9.4 (e) Convergence of Ex-4 with NL2 at -30dB
Fig. 9.4 (f) Convergence of Ex-4 with NL2 at -20dB
Fig. 9.4 (g) Convergence of Ex-5 with NL2 at -30dB
Fig. 9.4 (h) Convergence of Ex-5 with NL2 at -20dB
[Plots in Figs. 9.5(a)-(b): actual and estimated outputs versus input sample number for PSO, INPSO and DPSO]
Fig. 9.5 (a) Response matching of Ex-4 with NL1 at -20dB
Fig. 9.5 (b) Response matching of Ex-5 with NL2 at -30dB
Table 9.3
Comparison of CPU time and sum of squared error obtained using PSO, INPSO and DPSO

| Examples | PSO CPU (s) | INPSO CPU (s) | DPSO CPU (s) | PSO SSE | INPSO SSE | DPSO SSE |
| Ex-4 with NL1 at -30dB | .0200 | .0141 | .0133 | .0009 | .0013 | .0007 |
| Ex-4 with NL1 at -20dB | .0240 | .0156 | .0135 | .0077 | .0117 | .0134 |
| Ex-4 with NL2 at -30dB | .0216 | .0156 | .0136 | .0026 | .0022 | .0074 |
| Ex-4 with NL2 at -20dB | .0195 | .0159 | .0138 | .0067 | .0120 | .0127 |
| Ex-5 with NL1 at -30dB | .0203 | .0156 | .0136 | .0008 | .0012 | .0006 |
| Ex-5 with NL1 at -20dB | .0206 | .0154 | .0135 | .0076 | .0117 | .0068 |
| Ex-5 with NL2 at -30dB | .0219 | .0159 | .0138 | .0027 | .0034 | .0079 |
| Ex-5 with NL2 at -20dB | .0208 | .0159 | .0141 | .0064 | .0105 | .0129 |

(CPU time during training; SSE during testing.)
C. Robust distributed identification of nonlinear plants
Here the training signal is contaminated with outliers with density ranging from 10% to 50% of the sample locations. The magnitudes of these disturbances are varied between -5 and +5, -10 and +10, and -20 and +20. The Wilcoxon norm of the errors is used to train the model with the INPSO and DPSO algorithms. For comparison, the squared error norm based models are also simulated. In all cases the SSE is obtained as the performance measure to compare the two methods. The simulation results presented in Tables 9.4-9.7 clearly indicate that the Wilcoxon norm based distributed INPSO and DPSO algorithms show the least SSE in all cases compared to their squared error counterparts. This holds for all the variations of density and strength of outliers studied. In the presence of outliers both robust distributed PSOs are observed to provide almost identical identification performance.
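A sketch of the contamination step used in these experiments might look as follows; whether an outlier replaces the clean sample or is added to it is not stated in the text, so the replacement used below is an assumption.

```python
import numpy as np

def add_outliers(d, density, strength, seed=0):
    """Contaminate a desired signal d with outliers.
    density  : fraction of sample locations affected, e.g. 0.1 to 0.5
    strength : outliers drawn uniformly from [-strength, +strength]"""
    rng = np.random.default_rng(seed)
    d = d.copy()
    idx = rng.choice(len(d), size=int(density * len(d)), replace=False)
    d[idx] = rng.uniform(-strength, strength, len(idx))  # replace clean samples
    return d

d = np.random.default_rng(1).uniform(-0.5, 0.5, 100)    # example clean signal
d_out = add_outliers(d, density=0.3, strength=5.0)      # 30% outliers in (-5, +5)
```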
9.6 Conclusion
The chapter presents two distributed PSO algorithms : INPSO and DPSO and use them for
identification of nonlinear plants using locally measured data. In all cases it is observed that the
two algorithms provide almost similar performance. Further it is observed that the two
algorithms do not provide satisfactory identification performance when outliers are present in
the training signal. To enhance the robustness of these two algorithm Wilcoxon norm based
training is incorporated in this chapter. The results of identification using such training scheme
indicate that with high density outliers up to 50% and with strength of outliers as high as -20 to
20, the proposed distributed approach shows more robust performance in identification of
nonlinear plants.
Table 9.4
Comparison of sum of squared error (SSE) during testing for Ex-4 with nonlinearity NL1

| Percentage of outliers | INPSO: Wilcoxon norm | INPSO: Error square | DPSO: Wilcoxon norm | DPSO: Error square |
Outliers within the range (-5 to +5)
| 0% | .0009 | .0009 | .0009 | .0010 |
| 10% | .0012 | .1546 | .0013 | .2379 |
| 20% | .0011 | .4424 | .0009 | .2317 |
| 30% | .0024 | .2639 | .0009 | .1038 |
| 40% | .0022 | .2170 | .0023 | 1.8967 |
| 50% | .0049 | 1.4475 | .0065 | 1.1638 |
Outliers within the range (-10 to +10)
| 10% | .0011 | .9845 | .0013 | .8314 |
| 20% | .0013 | .4424 | .0009 | .5018 |
| 30% | .0014 | .2230 | .0009 | .1125 |
| 40% | .0029 | .7623 | .0030 | 2.4375 |
| 50% | .0032 | 2.5405 | .0114 | 3.0345 |
Outliers within the range (-20 to +20)
| 10% | .0011 | 1.7893 | .0013 | 1.5053 |
| 20% | .0011 | .8412 | .0009 | .6650 |
| 30% | .0014 | 1.4450 | .0009 | .0474 |
| 40% | .0025 | .9239 | .0038 | 2.4375 |
| 50% | .0066 | 2.3882 | .0114 | 2.9731 |
Table 9.5
Comparison of sum of squared error (SSE) during testing for Ex-4 with nonlinearity NL2

| Percentage of outliers | INPSO: Wilcoxon norm | INPSO: Error square | DPSO: Wilcoxon norm | DPSO: Error square |
Outliers within the range (-5 to +5)
| 0% | .0053 | .0055 | .0048 | .0049 |
| 10% | .0068 | .2426 | .0063 | .2423 |
| 20% | .0060 | .4347 | .0052 | .2234 |
| 30% | .0048 | .2520 | .0056 | .1087 |
| 40% | .0067 | .2003 | .0102 | 1.9191 |
| 50% | .0164 | 1.8002 | .0206 | 1.2163 |
Outliers within the range (-10 to +10)
| 10% | .0061 | 1.0322 | .0063 | .8194 |
| 20% | .0054 | .4347 | .0052 | .4813 |
| 30% | .0057 | .1941 | .0056 | .0183 |
| 40% | .0086 | .8069 | .0112 | 2.4849 |
| 50% | .0134 | 2.2508 | .0222 | 3.2130 |
Outliers within the range (-20 to +20)
| 10% | .0061 | 2.12 | .0063 | 1.4678 |
| 20% | .0062 | .8495 | .0052 | .6416 |
| 30% | .0051 | 1.3547 | .0056 | .1548 |
| 40% | .0087 | .9210 | .0136 | 2.4849 |
| 50% | .0160 | 2.5370 | .0222 | 3.1524 |
Table 9.6
Comparison of sum of squared error (SSE) during testing for Ex-5 with nonlinearity NL1

| Percentage of outliers | INPSO: Wilcoxon norm | INPSO: Error square | DPSO: Wilcoxon norm | DPSO: Error square |
Outliers within the range (-5 to +5)
| 0% | .0008 | .0067 | .0008 | .0071 |
| 10% | .0013 | .2434 | .0011 | .2937 |
| 20% | .0012 | .4424 | .0007 | .2778 |
| 30% | .0019 | .2639 | .0008 | .1022 |
| 40% | .0025 | .2841 | .0022 | 1.8401 |
| 50% | .0050 | 1.8163 | .0068 | 1.1896 |
Outliers within the range (-10 to +10)
| 10% | .0012 | .9888 | .0011 | .8712 |
| 20% | .0012 | .4424 | .0008 | .5170 |
| 30% | .0014 | .1728 | .0008 | .1550 |
| 40% | .0025 | .9108 | .0032 | 2.4375 |
| 50% | .0042 | 1.8111 | .0105 | 3.0536 |
Outliers within the range (-20 to +20)
| 10% | .0012 | 2.1028 | .0011 | 1.5606 |
| 20% | .0011 | .8412 | .0008 | .6981 |
| 30% | .0013 | 1.3217 | .0008 | .1296 |
| 40% | .0015 | .9397 | .0036 | 2.4375 |
| 50% | .0039 | 2.3882 | .0102 | 2.9796 |
Table 9.7
Comparison of sum of squared error (SSE) during testing for Ex-5 with nonlinearity NL2

| Percentage of outliers | INPSO: Wilcoxon norm | INPSO: Error square | DPSO: Wilcoxon norm | DPSO: Error square |
Outliers within the range (-5 to +5)
| 0% | .0051 | .0083 | .0043 | .0100 |
| 10% | .0057 | .3227 | .0058 | .2994 |
| 20% | .0053 | .4347 | .0047 | .2693 |
| 30% | .0048 | .2520 | .0052 | .1065 |
| 40% | .0060 | .2026 | .0097 | 1.8385 |
| 50% | .0149 | 2.2513 | .0236 | 1.2425 |
Outliers within the range (-10 to +10)
| 10% | .0073 | 1.0266 | .0058 | .8571 |
| 20% | .0049 | .4347 | .0047 | .4977 |
| 30% | .0049 | .1941 | .0052 | .1030 |
| 40% | .0060 | 1.0471 | .0112 | 2.4849 |
| 50% | .0106 | 1.9260 | .0243 | 3.2359 |
Outliers within the range (-20 to +20)
| 10% | .0073 | 2.2317 | .0058 | 1.5579 |
| 20% | .0051 | .8495 | .0047 | .6686 |
| 30% | .0050 | 1.3819 | .0052 | .1354 |
| 40% | .0071 | .9210 | .0130 | 2.4849 |
| 50% | .0171 | 2.5370 | .0257 | 3.1592 |
References
[9.1] I. F. Akyildiz, W. Su, Y. Sankarasubramaniam and E. Cayirci, "Wireless sensor networks: A survey", Computer Networks, vol. 38, pp. 393-422, March 2002.
[9.2] D. Estrin, G. Pottie and M. Srivastava, "Instrumenting the world with wireless sensor networks", in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (ICASSP), Salt Lake City, UT, May 2001, pp. 2033-2036.
[9.3] M. G. Rabbat and R. D. Nowak, "Quantized incremental algorithms for distributed optimization", IEEE J. Sel. Areas Commun., vol. 23, no. 4, pp. 798-808, April 2005.
[9.4] M. Wax and T. Kailath, "Decentralized processing in sensor arrays", IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-33, no. 4, pp. 1123-1129, October 1985.
[9.5] S. Kumar, F. Zhao and D. Shepherd, "Special issue on collaborative information processing", IEEE Signal Processing Mag., vol. 19, no. 2, pp. 13-14, March 2002.
[9.6] I. Akyildiz, W. Su, Y. Sankarasubramaniam and E. Cayirci, "A survey on sensor networks", IEEE Communications Magazine, vol. 40, no. 8, pp. 102-114, August 2002.
[9.7] D. Culler, D. Estrin and M. Srivastava, "Overview of sensor networks", Computer, vol. 37, no. 8, pp. 41-49, August 2004.
[9.8] D. A. Castanon and D. Teneketzis, "Distributed estimation algorithms for nonlinear systems", IEEE Trans. Autom. Control, vol. AC-30, pp. 418-425, 1985.
[9.9] A. S. Willsky, M. Bello, D. A. Castanon, B. C. Levy and G. Verghese, "Combining and updating of local estimates and regional maps along sets of one-dimensional tracks", IEEE Trans. Autom. Control, vol. AC-27, pp. 799-813, 1982.
[9.10] Z. Chair and P. K. Varshney, "Distributed Bayesian hypothesis testing with distributed data fusion", IEEE Trans. Syst., Man, Cybern., vol. 18, pp. 695-699, 1988.
[9.11] J. L. Speyer, "Computation and transmission requirements for a decentralized linear-quadratic Gaussian control system", IEEE Trans. Autom. Control, vol. AC-24, pp. 266-269, 1979.
[9.12] C. Y. Chong, "Hierarchical estimation", in 2nd MIT/ONR Workshop on C3, Monterey, CA, July 1979.
[9.13] M. G. Rabbat and R. D. Nowak, "Decentralized source localization and tracking", in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (ICASSP), Montreal, QC, Canada, May 2004, vol. 3, pp. 921-924.
[9.14] C. G. Lopes and A. H. Sayed, "Distributed adaptive incremental strategies: Formulation and performance analysis", in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (ICASSP), Toulouse, France, May 2006, vol. 3, pp. 584-587.
[9.15] L. Xiao, S. Boyd and S. Lall, "A scheme for robust distributed sensor fusion based on average consensus", in Proc. 4th Int. Symp. Information Processing in Sensor Networks, Los Angeles, CA, April 2005, pp. 63-70.
[9.16] L. Xiao, S. Boyd and S. Lall, "A space-time diffusion scheme for peer-to-peer least-squares estimation", in Proc. 5th Int. Symp. Information Processing in Sensor Networks, Nashville, TN, April 2006.
[9.17] C. G. Lopes and A. H. Sayed, "Diffusion least mean squares over adaptive networks", in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (ICASSP), Honolulu, HI, April 2007, pp. 917-920.
[9.18] C. G. Lopes and A. H. Sayed, "Incremental adaptive strategies over distributed networks", IEEE Trans. on Signal Processing, vol. 55, no. 8, pp. 4064-4077, August 2007.
[9.19] C. G. Lopes and A. H. Sayed, "Diffusion least mean squares over adaptive networks: Formulation and performance analysis", IEEE Trans. on Signal Processing, vol. 56, no. 7, pp. 3122-3136, July 2008.
[9.20] B. Wang and Z. He, "Distributed optimization over wireless sensor networks using swarm intelligence", in IEEE Int. Symposium on Circuits and Systems, 2007, pp. 2502-2505.
[9.21] J. M. Hereford, "A distributed particle swarm optimization algorithm for swarm robotic applications", in IEEE Congress on Evolutionary Computation, Canada, 2006, pp. 1678-1685.
[9.22] M. Chu and D. J. Allstot, "An elitist distributed particle swarm algorithm for RF IC optimization", in Asia and South Pacific Design Automation Conference, 2005, vol. 2, pp. 671-674.
[9.23] G. Panda, D. Mohanty, B. Majhi and G. Sahoo, "Identification of Nonlinear Systems using Particle Swarm Optimization Technique", in Proc. of IEEE Congress on Evolutionary Computation (CEC 2007), Singapore, 25-28 September 2007, pp. 3253-3257.
[9.24] Xian-Da Zhang and Yuan-Sheng, "FIR system identification using HOS alone", IEEE Trans. on Signal Processing, vol. 42, no. 10, pp. 2854-2858, October 1994.
[9.25] Wei Li and Wan-Chi Siu, "New approaches without post processing to FIR system identification using selected order cumulants", IEEE Trans. on Signal Processing, vol. 48, no. 4, pp. 1144-1153, April 2000.
[9.26] J. C. Patra, R. N. Pal, R. Baliarsingh and G. Panda, "Nonlinear channel equalization for QAM signal constellation using Artificial Neural Network", IEEE Trans. on Systems, Man and Cybernetics - Part B, vol. 29, no. 2, pp. 262-272, April 1999.
[9.27] J.-G. Hsieh, Y.-L. Lin and J.-H. Jeng, "Preliminary study on Wilcoxon learning machines", IEEE Trans. on Neural Networks, vol. 19, no. 2, pp. 201-211, Feb. 2008.
Chapter 10
Conclusion and Scope for Further Work
10.1 Conclusion
In this chapter the conclusion of the overall thesis is presented and some future research problems which may be attempted by interested readers are outlined. The thesis has investigated two key problems: efficient direct modeling or system identification of complex and noisy plants such as Hammerstein, dynamic SISO and MIMO plants, and inverse modeling (normal and robust) of nonlinear channels. The novelty of the present work is the introduction of bio-inspired techniques to direct and inverse modeling problems. These techniques have been essentially used for training the weights of the models. The bio-inspired techniques used are (i) PSO, (ii) modified PSO such as CLPSO, CPSO and IPSO, and (iii) BFO. In Chapter 3 a new cascaded FLANN (CFLANN) structure using a novel learning algorithm is employed for identification of nonlinear dynamic plants. It is shown through exhaustive simulation that the new model requires the least computation compared to the MLANN and FLANN models. In all cases studied, the proposed CFLANN has produced improved response matching and the least sum of squared errors between the actual and estimated outputs [10.3].
Identification of standard IIR plants has been carried out in Chapter 4 using an improved PSO (CLPSO) based algorithm. In terms of convergence behaviour, execution time and the product of population size and input samples, it is observed that the new IIR identification scheme offers improved performance compared to RLMS, GA and PSO based methods. Further, the proposed method shows better convergence characteristics than the GA and PSO based methods [10.5] when reduced order models are used. This shows that the new method is capable of offering a better optimal solution compared to GA and PSO based identification schemes.
In Chapter 5, two bio-inspired techniques, PSO and BFO, are introduced to develop efficient identification models for nonlinear dynamic SISO and MIMO plants. In all cases the structure used is essentially a low complexity FLANN. The population based PSO and BFO techniques provide improved learning of the models compared to that provided by the gradient descent algorithm. Further, the computation time of the new methods is less than that of the existing BP trained models. It is also observed that the identification performance of the BFO and PSO based models is almost similar [10.1, 10.4, 10.6, 10.9, 10.14], but the BFO based model is computationally faster than the PSO based model during training.
If the squared error cost function is used, training data contaminated with outliers degrade the identification performance. In Chapter 6, three robust norms are introduced as cost functions, and PSO based training to design robust identification models has been suggested. These new models produce improved identification of complex plants even when 50% outliers are present in the training samples. Of the three robust norms used, it is shown through simulation of many benchmark problems that the model developed using PSO based minimization of the Wilcoxon norm of errors performs the best compared to the standard squared error norm and the other two robust norms [10.2, 10.7, 10.10].
In Chapter 7 robust adaptive inverse models have been developed using robust cost functions and BFO based minimization of these cost functions. The inverse model is extensively used in designing equalizers for communication channels and magnetic media so that ISI is minimized. The BFO is shown to be an efficient learning candidate for designing robust inverse models. From the simulation study it is observed that the conventional squared error norm is the least robust and the Wilcoxon norm the most robust for developing inverse models of different types of linear and nonlinear channels.
The identification of Hammerstein plants is a challenging task. This is studied in Chapter 8 using the CPSO and IPSO algorithms. These two new algorithms have been suitably applied to the identification task, which in this case involves response matching of the nonlinear static part and matching of the parameters of the dynamic part of the plants. Identification of standard Hammerstein plants has been carried out by simulation study, and it is in general observed that the new identification scheme is superior to the GA, AIS and PSO based training schemes [10.8, 10.12, 10.13].
In sensor networks, distributed parameter estimation plays an important role in estimating the temperature, pressure and humidity of a region using local measurements. Distributed linear, nonlinear and robust identification has therefore gained importance in the sensor network environment. This problem has been studied in Chapter 9. We have assumed that the input-output data relationship is either linear or nonlinear. Further, the cost function used is either the squared error or a robust norm of the errors such as the Wilcoxon norm. Two new distributed PSO algorithms, incremental PSO and diffusion PSO, are developed to identify linear as well as nonlinear plants using local measurements. The new distributed algorithms are applied for parameter estimation of linear plants and response matching of nonlinear plants. It is in general observed that the diffusion PSO algorithm performs better than its counterpart. Further, when the training samples are corrupted with outliers, the identification performance of the Wilcoxon norm minimization based model is shown to be better than that of the squared error minimization based model. This is observed for identification of both linear and nonlinear systems [10.11].
10.2 Further research extension
The work carried out in the present thesis can be extended in many directions. The proposed identification schemes are population based and hence take more training time, and are therefore not suitable for online applications. Differential evolution or similarly fast bio-inspired techniques can be applied to reduce the training time, so that the schemes may be applied to online control applications. The bio-inspired training methodology can also be effectively applied to develop efficient forecasting models for prediction of complex financial and other time series such as stock markets, exchange rates, interest rates and oil prices. Improved learning algorithms using bio-inspired techniques can be developed to reduce the population size and the number of input samples used during training. Further, the identification problem can be viewed as a multi-objective
problem by considering the structure complexity and the mean square error as two objectives. Suitable multi-objective bio-inspired techniques may then be used to achieve an efficient reduced or pruned structure model for identification. Research work can also be carried out on the convergence analysis of bio-inspired optimization algorithms; little progress is reported in the literature in this direction. Similarly, the effect of finite register length on the performance of such algorithms is important when they are implemented on a DSP processor or in VHDL. Therefore investigations can also be made in this field using both fixed- and floating-point arithmetic.
Publications out of the thesis
International Journals
Published
[10.1] Babita Majhi and G. Panda, “Development of Efficient Identification Scheme for
Nonlinear Dynamic Systems using Swarm Intelligence Techniques”, Expert Systems with
Applications, Elsevier, vol. 37, issue 1, pp. 556-566, January 2010.
[10.2] Babita Majhi, G. Panda and B. Mulgrew, “Robust Identification using New
Wilcoxon Least Mean Square (WLMS) Algorithm”, IET Electronics Letters, vol.
45, issue 6, pp. 334-335, 12 March 2009.
[10.3] Babita Majhi and G. Panda, “Cascaded Functional Link Artificial Neural Network
Model for Nonlinear Dynamic System Identification”, International Journal of Artificial
Intelligence and Soft Computing, vol. 1, nos. 2/3/4, pp. 223-237, 2009.
[10.4] Babita Majhi and G. Panda, “A Hybrid Functional Link Neural Network and
Bacterial Foraging Approach for Efficient Identification of Dynamic Systems”,
International Journal of Applied Artificial Intelligence in Engineering Systems, vol.
1, no. 1, pp. 91-104, January-June 2009.
[10.5] Babita Majhi and G. Panda, “Identification of IIR Systems using Comprehensive
Learning Particle Swarm Optimization”, International Journal of Power and Energy
Conversion, vol. 1, no. 1, pp. 105-124, 2009.
[10.6] Babita Majhi and G. Panda, “Nonlinear System Identification based
on Bacterial Foraging Optimization Technique”, International Journal of Systemics,
Cybernetics and Informatics, ISSN 0973-4864, pp. 44-50, April 2008.
Communicated
[10.7] Babita Majhi and G. Panda, “Robust Identification of Nonlinear Complex Systems
using Low Complexity ANN and Particle Swarm Optimization Technique”, Expert
Systems with Applications, Elsevier, May 2009.
[10.8] G. Panda, S. J. Nanda and Babita Majhi, “Improved Identification of Hammerstein
Plants using new CPSO and IPSO algorithms”, Expert Systems with Applications,
Elsevier, April 2009.
Book Chapter (Accepted)
[10.9] Babita Majhi and G. Panda, “Efficient Identification of Nonlinear Dynamic Systems
using BFO and PSO Techniques”, Applied Swarm Intelligence, Springer Series in
Studies in Computational Intelligence, June 2009.
International Conferences
Published
[10.10] Babita Majhi, G. Panda and B. Mulgrew, “Robust Prediction and Identification
using Wilcoxon Norm and Particle Swarm Optimization”, 17th European Signal
Processing Conference (EUSIPCO 2009), Glasgow, UK, 24-28 August 2009.
[10.11] Babita Majhi, G. Panda and B. Mulgrew, “Nonlinear Identification over Adaptive
Networks using Distributed PSO Algorithms”, IEEE Congress on Evolutionary
Computation (CEC 2009), Norway, pp. 2076-2082, 18-21 May 2009.
[10.12] S. J. Nanda, G. Panda and Babita Majhi, “Development of Immunized PSO
Algorithm and Its Application to Hammerstein Model Identification”, IEEE Congress on
Evolutionary Computation (CEC 2009), Norway, pp. 3080-3086, 18-21 May 2009.
[10.13] S. J. Nanda, G. Panda and Babita Majhi, “Improved Identification of Hammerstein
Model based on Artificial Immune System”, Proc. of IEEE International Conference on
Emerging Trends in Computing (ICETiC-09), Tamil Nadu, pp. 193-198, 8-10 January 2009.
[10.14] Babita Majhi and G. Panda, “Bacterial Foraging based Identification of Nonlinear
Dynamic System”, Proc. of IEEE Congress on Evolutionary Computation (CEC 2007),
Singapore, 25-28 September 2007, pp. 1636-1641.
Related Publications
International Journal
Communicated
[10.15] S. J. Nanda, G. Panda and Babita Majhi, “Improved Identification of Nonlinear
Plants using Artificial Immune System based FLANN Model”, communicated to
Engineering Applications of Artificial Intelligence, Elsevier (revised and sent),
October 2009.
International Conferences
Published
[10.16] S. J. Nanda, G. Panda, Babita Majhi and P. Tah, “Improved Identification of
Nonlinear MIMO Plants using New Hybrid FLANN-AIS Model”, Proc. of IEEE
International Advance Computing Conference (IACC-09), Patiala, 6-7 March 2009,
pp. 141-146.
[10.17] S. J. Nanda, G. Panda, Babita Majhi and P. Tah, “Development of a New
Optimization Algorithm based on Artificial Immune System and Its Application”, Proc. of
IEEE ICIT 2008, Bhubaneswar, 17-20 December 2008, pp. 45-48.
[10.18] S. J. Nanda, G. Panda and Babita Majhi, “Development of Novel Digital
Equalizers for Noisy Nonlinear Channel using Artificial Immune System”, Proc. of IEEE
Region 10 Colloquium and 3rd International Conference on Industrial and
Information Systems, IIT Kharagpur, 8-10 December 2008, pp. 1-6.
[10.19] S. J. Nanda, G. Panda and Babita Majhi, “Improved Identification of Nonlinear
Dynamic Systems using Artificial Immune System”, Proc. of IEEE INDICON, IIT
Kanpur, 11-13 December 2008, pp. 268-273.
[10.20] Babita Majhi, G. Panda and A. Choubey, “Efficient Scheme of Identification of
Pole-Zero Systems using Particle Swarm Optimization Technique”, Proc. of IEEE
Congress on Evolutionary Computation (CEC 2008), Hong Kong, 1-6 June 2008,
pp. 446-451.
[10.21] Babita Majhi, G. Panda and H. Thethi, “Development of Efficient Adaptive
Nonlinear Channel Equalizers using Fuzzy-Bacterial Foraging Optimization Technique”,
Proc. of IEEE Conference on AI Tools in Engineering (ICAITE-2008), Pune, 6-8
March 2008.
[10.22] G. Panda, Babita Majhi, D. Mohanty and A. Sahoo, “A GA-based Pruning
Strategy and Weight Update Algorithm for Efficient Nonlinear System Identification”,
Proc. of International Conference on Pattern Recognition and Machine Intelligence
(PReMI'07), ISI Kolkata, 18-22 December 2007, pp. 244-251 (Springer-Verlag Berlin
Heidelberg 2007).
[10.23] Babita Majhi and G. Panda, “Particle Swarm Optimization based Efficient
Adaptive Channel Equalizer for Digital Communication”, Proc. of International
Conference on Trends in Intelligent Electronics Systems (TIES 2007), Chennai, 12-14
November 2007, pp. 23-28.
[10.24] G. Panda, D. Mohanty, Babita Majhi and G. Sahoo, “Identification of Nonlinear
Systems using Particle Swarm Optimization Technique”, Proc. of IEEE Congress on
Evolutionary Computation (CEC 2007), Singapore, 25-28 September 2007, pp. 3253-3257.
[10.25] Babita Majhi and G. Panda, “Recovery of Digital Information using Bacterial
Foraging Optimization based Nonlinear Channel Equalizers”, Proc. of IEEE 1st
International Conference on Digital Information Management (ICDIM 2006),
Bangalore, India, 6-8 December 2006, pp. 367-372.
[10.26] G. Panda, Babita Majhi and S. Mishra, “Nonlinear System Identification using
Bacterial Foraging based Learning”, Proc. of IET 3rd International Conference on
Artificial Intelligence in Engineering Technology (ICAIET-2006), Malaysia, 22-24
November 2006, pp. 120-125.
[10.27] Babita Majhi, G. Panda and A. Choubey, “On the Development of a New
Adaptive Channel Equalizer using Bacterial Foraging Optimization Technique”, Proc. of
IEEE Annual India Conference (INDICON-2006), New Delhi, India, 15-17
September 2006, pp. 1-6.
[10.28] G. Panda, Babita Majhi, D. Mohanty and A. Choubey, “Development of GA
Based Adaptive Techniques for Nonlinear System Identification”, Proc. of International
Conference on Information Technology (ICIT-2005), Bhubaneswar, India, 20-23
December 2005, pp. 198-204.
National Conferences
Published
[10.29] T. Panigrahi, G. Panda and Babita Majhi, “Collaborative Signal Processing in
Distributed Wireless Sensor Network”, Proc. of National Conference on Advances in
Computational Intelligence Applications (NCACI 2009), Bhubaneswar, 20-22 March
2009.
[10.30] U. K. Sahoo, G. Panda and Babita Majhi, “Distributed Regression: An Efficient
Framework for Modeling Sensor Network Data”, Proc. of National Conference on
Advances in Computational Intelligence Applications (NCACI 2009), Bhubaneswar,
20-22 March 2009.
[10.31] P. M. Pradhan, B. Mulgrew, G. Panda and Babita Majhi, “Layout Optimization of
Wireless Sensor Network using NSGA-II”, National Conference on Advances in
Computational Intelligence Applications (NCACI 2009), Bhubaneswar, 20-22 March
2009.
[10.32] D. Mohanty, Babita Majhi and G. Panda, “Recovery of Digital Data using Real
Coded Genetic Algorithm”, Proc. of National Seminar on Soft Computing Techniques
& Applications (SCTA-2007), Bhubaneswar, India, 7 April 2007, pp. 39-44.
[10.33] G. Panda, D. Mohanty and Babita Majhi, “Development of Novel Digital Nonlinear
Channel Equalizers using Particle Swarm Optimization Technique”, National Conference
on Data Mining and e-Governance, Bilaspur, India, 17 February 2007, pp. 25 (abstract).
[10.34] G. Panda, Babita Majhi and A. Choubey, “Nonlinear Adaptive Channel
Equalization using GA based Technique”, Proc. of National Conference on Soft
Computing and Machine Learning for Signal Processing, Control, Power and
Telecommunications (NCSC-2006), Bhubaneswar, India, 24-26 March 2006, pp.
32 (abstract).
[10.35] G. Panda, D. Mohanty and Babita Majhi, “An Adaptive Genetic Algorithm for
System Identification: A Faster Approach”, Proc. of National Conference on Soft
Computing Techniques for Engineering Applications (SCT-2006), NIT Rourkela,
India, 24-26 March 2006, pp. 329-336.
[10.36] G. Panda, Babita Majhi, D. Mohanty, A. Choubey and S. Mishra, “Development
of Novel Digital Channel Equalisers using Genetic Algorithms”, Proc. of National
Conference on Communication (NCC-2006), IIT Delhi, India, 27-29 January 2006,
pp. 117-121.
[10.37] G. Panda, Babita Majhi, D. Mohanty and B. N. Biswal, “Application of Adaptive
Genetic Algorithm in Nonlinear System Identification”, Souvenir of National Conference
on Cyber Security, Data Mining and ICT for Society, Bilaspur, India, 18-19 January
2006, pp. 6 (abstract).