Fault Detection in Nonlinear Systems Technical University of Crete

Fault Detection in Nonlinear Systems
by
Adam A. Papadimitropoulos,
Thesis
Presented to the Department of Electronic Engineering of
Technical University of Crete, GREECE
in Partial Fulfillment
of the Requirements
for the Degree of
Master in Electronic Engineering
Technical University of Crete, GREECE
August 2003
To my parents, Andreas and Cleo
Alla famiglia di Pica
Acknowledgments
A long list of friends and colleagues have helped me in the preparation of this thesis in
many different ways. I would like to express my gratitude to Professors George
Rovithakis and Thomas Parisini for their continuous guidance, encouragement and
support. Without their insightful discussions and contributions, this thesis would
never have been written. They both introduced me to the fascinating world of
Control and Fault Diagnosis and sharpened my skills in these
fields. Through their outstanding academic careers and as persons, they were and
always will be a source of inspiration for me.
Special thanks to Professor M. Christodoulou for accepting me as a postgraduate
student at the Technical University of Crete, for his cooperation in the preparation
of this thesis and for his useful suggestions in fine-tuning my research.
Thanks are due to the members of the DAMADICS research training network for accepting me as a member and for financing my work during my stay in Italy.
I would like to thank my parents and my brother Harry for all the encouragement
and support they have given me throughout my life. I would not be where I am today
without them. Special thanks also to my lifelong friends Sifis and Papadopoulos.
Finally, I would like to thank Stefania and her family as well as Parisini's family
for their love, endurance and the true emotions that supported me during the
tough years of my graduate studies in Italy. I would like to express my deepest and
most sincere appreciation to them and especially to Stefania.
Adam A. Papadimitropoulos
Technical University of Crete, GREECE
August 2003
Abstract
In this thesis, various non-smooth nonlinearities present in physical systems
and fault diagnosis methods are examined. To this end, a large number
of mathematical models and their identification and estimation techniques are presented. In parallel, an introduction to the fault diagnosis area and its up-to-date
methodology is also given.
The problem of actuator fault detection in mechanical systems with friction that
perform linear motion is discussed, and it is the main contribution of this dissertation. The dynamic LuGre model is used to model the effects of friction. The
proposed architecture is built upon an on-line neural network approximator which
requires only the system's position and velocity. The friction internal state is not assumed to be available for measurement. The developed fault detector is analyzed
with respect to its robustness and sensitivity. Rigorous fault detectability conditions
and upper bounds for the detection time are also derived. The proposed methodology is applied to the DAMADICS benchmark problem, which was developed in order
to approximate the industrial process in a sugar factory located in Lublin (Poland).
The neural network approximation scheme makes it possible to detect either incipient or abrupt faults regarding the friction and the spring models of the considered
actuator.
Contents
Acknowledgments
Abstract
List of Tables
List of Figures
Chapter 1 Introduction
Chapter 2 Non-smooth non-linearities
2.1 Friction
2.1.1 Mathematical Models of Friction
2.2 Backlash
2.2.1 Mathematical Models of Backlash
2.3 Hysteresis
2.3.1 Mathematical Models of Hysteresis
2.4 Summary
Chapter 3 Fault Diagnosis
3.1 What is Fault Detection and Diagnosis
3.2 Methods in Fault Detection and Diagnosis
3.2.1 Model-Free methods
3.2.2 Model-Based methods
3.3 Summary
Chapter 4 Fault detection in mechanical systems with friction phenomena: an on-line approximation approach
4.1 Problem Formulation
4.2 Nominal System On-line Approximation
4.3 Fault Detectability Analysis
4.4 Simulation results
4.5 Summary
Chapter 5 Fault Detection in DAMADICS benchmark problem
5.1 Plant description
5.2 Problem formulation for DAMADICS case
5.3 DAMADICS Simulation Results
Bibliography
Vita
List of Tables
2.1 Approximate ranges for the parameters of seven parameter friction model
2.2 Friction model capabilities
5.1 Explanation of the symbols of the pneumatic servomotor and its physical layout.
5.2 Fault specifications.
List of Figures
2.1 Inverting a backlash
2.2 Backlash response to a sine input.
2.3 A typical Hysteresis diagram with a major and minor loop
2.4 Hysteresis model
2.5 Construction procedure for obtaining $f$, $g$ and $h$
2.6 Possible $w$ functions and their effects on frequency behavior
2.7 Hysteresis Transducer
2.8 A simple hysteresis operator $\hat{\gamma}_{\alpha\beta}$
2.9 The Preisach phase plane
2.10 The formation of $S^+$ and $S^-$
2.11 The First- and Second-order transition reversal curves
2.12 The Preisach phase plane for the Moving model
3.1 Stages of model-based fault detection and diagnosis.
4.1 Behaviors of the: (a) position error $\tilde{\xi}_1 = x_1 - \hat{x}_1$; (b) velocity error $\tilde{\xi}_2 = x_2 - \hat{x}_2$; (c) detectability Condition
4.2 Behaviors of the: (a) position error $\tilde{\xi}_1 = x_1 - \hat{x}_1$; (b) velocity error $\tilde{\xi}_2 = x_2 - \hat{x}_2$; (c) detectability Condition
4.3 Dependence of detection time $t_d$ on the value of parameter $k_2$
5.1 A control valve-pneumatic servomotor-positioner device.
5.2 The pneumatic servomotor and its physical layout.
5.3 Architecture of the adaptive on-line approximation scheme.
5.4 Architecture used for DAMADICS simulation trials.
5.5 Behaviors of (a) position error $\tilde{\xi}_1$. (b) velocity error $\tilde{\xi}_2$. (c) absolute value of velocity error, $|\tilde{\xi}_2|$ threshold $\rho$. (d) A fault evolution. (e) detection decision. (f) control Value (CV).
5.6 Behaviors of (a) position error $\tilde{\xi}_1$. (b) velocity error $\tilde{\xi}_2$. (c) absolute value of velocity error, $|\tilde{\xi}_2|$ threshold $\rho$. (d) A fault evolution. (e) detection decision. (f) control Value (CV).
Chapter 1
Introduction
The need to design systems able to guarantee increased reliability, availability and
safety, triggered on-going research in the area of fault diagnosis, mainly through
the model-based analytical redundancy path. Analytical redundancy schemes have
received considerable attention in the last two decades mostly owing to the advances
in computer technology, as well as to the appearance of powerful signal processing
and learning methodologies. In general, the actual behavior of the plant is compared
to that expected, on the basis of a plant model. Deviations between the actual and
the estimated behavior are expressed in terms of residuals, which are indications
of faults. Detailed overviews of such schemes may be found in [26]-[29]. The majority of these methods are restricted to linear systems. Owing to the inherent
complexity, derivation of analytical results regarding robustness and sensitivity of
fault diagnosis schemes for nonlinear systems is difficult. Despite the difficulties,
works on nonlinear systems have recently appeared [51]-[53] and [54].
In a wide range of physical systems such as mechanical systems, electro-magnetic
systems, actuators, sensors etc., non-smooth nonlinear mechanisms such as friction,
backlash and hysteresis, severely limit their performance and reliability. Up to now,
previous works on nonlinear fault diagnosis were focused on developing general-purpose architectures. Thus, unavoidably, they were restricted to special classes
of nonlinear systems, assuming full state measurement and smooth nonlinearities.
Nonlinear observers have also been employed to relax the full state measurement
assumption. Unfortunately, the use of observers further restricts the class of nonlinear systems and the type of permissible faults. Moreover, nonlinear observer design
is not de-coupled from controller design. Hence, important theoretical questions
regarding even fault detection are raised, since practically all systems operate in a
closed loop.
The main contribution of this thesis is that of detecting faults in mechanical systems
with friction. Friction is present in any system that involves mechanical motion. It
may cause large steady state errors and oscillations generated by a combination
of friction, which counteracts motion, and an instability mechanism, thus making
friction a very complicated phenomenon. The aforementioned reasons impose extra
complexity on any scheme that is targeted at diagnosing faults in such systems.
Studies [3], [6] have shown that a friction model involving dynamics is necessary
to describe accurately the friction phenomena. Various dynamic models have been
proposed [9], [3], [15] and [18]. However, the unknown structure of the incoming
faults significantly magnifies the level of system uncertainty. Neural networks with
their massive parallelism, very fast adaptability and inherent approximation capabilities, have already been utilized mainly towards the friction compensation problem
[10].
In this work we present a novel approach to detect faults in mechanical systems
with friction that perform linear motion. The basic module in the proposed architecture is an on-line approximator which is based on linear-in-the-weights neural
network structures. To model the effects of friction, the dynamic LuGre model [9] is
used. However, we do not assume knowledge of the system nonlinearities. Furthermore,
the friction internal state is not assumed to be available for measurement. The on-line approximator requires the system's position and velocity as well as its input force.
The performance of the developed fault detector is analyzed with respect to its
robustness and sensitivity. Rigorous fault detectability conditions are also derived
based on the important results presented in [51]. We go beyond the theoretical
analysis and present simulation studies to clarify and verify the approach with emphasis on the application to the DAMADICS actuator benchmark problem. Under
the framework of the DAMADICS research network funded by the European Union,
a benchmark model was developed to approximate the behavior of the evaporation
stage of a sugar factory in Lublin (Poland). Actuators under consideration consist
of a control valve, a pneumatic linear servomotor and a positioner. In this kind of
electromechanical systems, the presence of friction phenomena is unavoidable and
significantly increases the complexity of the fault diagnosis problem.
The thesis is organized as follows. In Chapter 2, models widely used in the literature for friction, backlash and hysteresis are presented, together with their
limitations and applicability. In addition, the existing identification and estimation
techniques are investigated. The main purpose of this chapter is to explain
in depth the non-smooth non-linearities that often appear in physical systems. Furthermore, as explained above, the model constitutes the basis for
the development of model-based analytical redundancy methods. In Chapter 3, an
introduction to the fault diagnosis area is given and some basic definitions are stated.
This chapter presents some basic methods, such as observers, parity relations, etc.
It also presents up-to-date research oriented towards nonlinear systems and the application of qualitative and computational intelligence techniques.
In Chapter 4, which combines the two preceding chapters, our method
is presented. We formulate the problem and state the necessary assumptions.
Some definitions and preliminaries are also provided. This chapter also deals with
the design and the robustness analysis of the on-line approximation scheme. Moreover, the sensitivity analysis of the fault detection scheme is carried out, in which
fault detectability conditions are derived. In addition, upper bounds on the
detection time and a relationship between detection time and the values of certain
design parameters are established. Simulation studies are also presented. Finally, in
Chapter 5, a brief description of the DAMADICS benchmark problem is given, together with its
simulation results and the performance of the methodology proposed in Chapter 4.
The results clarify and verify the reliability of the proposed method.
Chapter 2
Non-smooth non-linearities
Non-smooth non-linearities are common in practical systems. Such non-linearities
are usually poorly known, may vary with time, and often severely limit system
performance. Especially in actuators installed in harsh environments,
non-linearities increase with wear and tear, and in mass production they change from
component to component. The objective is to design a system that is
able to accommodate such uncertainties. Typical non-smooth non-linearities
addressed in this chapter are friction, backlash and hysteresis. In the following,
various models used in the literature and some approximation techniques regarding
friction, backlash and hysteresis phenomena are presented.
2.1 Friction
Whenever there is motion or a tendency of motion between two elements, friction
forces exist. The frictional forces encountered in physical systems are usually of a
non-linear nature. The characteristics of the frictional forces between the surfaces
often depend on such factors as the composition of the surfaces, the pressure between the surfaces, their relative velocity and others, so that an exact mathematical
description of the frictional force is difficult to establish.
The types of friction commonly used in practical systems are viscous,
static, Coulomb and Stribeck friction. The friction model being used is a conglomerate of these friction components, from which a force balance may be obtained
for friction acting against a surface [5]. It is often assumed when studying friction
that there is no motion while in static friction, which is to say no motion without
sliding. However, Dahl [15], [16], [17], studying experimental observations of friction in
small rotations of ball bearings, concluded that for small motions a junction in static
friction behaves like a spring, and considered the implications for control. There is
a displacement (pre-sliding displacement) which is an approximately linear function
of the applied force, up to a critical force at which breakaway occurs. When forces
are applied, the asperities deform, but recover when the force is removed. In
this regime, the tangential force is governed by:
$$F_t(x) = -k_t x \tag{2.1}$$
where $F_t$ is the tangential force, $k_t$ is the tangential stiffness of the contact and $x$
is the displacement away from the equilibrium position. $F_t$ and $x$ refer to the force
and displacement in the contact before sliding begins. The tangential stiffness $k_t$
is a function of asperity geometry, material elasticity and applied normal force. To a
first approximation it is actually the breakaway displacement that is constant, and
the stiffness is then given by
$$k_t = \frac{F_b}{x_b} \tag{2.2}$$
where $F_b$ is the breakaway force and $x_b$ the maximum deformation. The transition
from elastic contact to sliding is not simple. Sliding is observed to originate first
at the boundary of a contact and to propagate toward the center. Thus there
is no abrupt transition to sliding. Pre-sliding displacement is of interest to the
control community in extremely high precision pointing applications, in dynamics
and in simulation, and may also be important in establishing that there are no
discontinuities in friction as a function of time.
Coulomb friction, on the other hand, is a retarding force that has a constant
amplitude with respect to changes in velocity, but the sign of the frictional force
changes with the reversal of the direction of velocity. The other friction component,
viscous friction, represents a retarding force that varies linearly with
velocity. Both of these phenomena oppose
the motion when the velocity is different from zero.
However, imperfections in the motor mechanics and unbalances on the motor shaft
yield asymmetric behavior of the motor dynamics. The model proposed in [7]
includes Coulomb and viscous friction and accounts for friction asymmetries. Some
experiments conducted with this model showed that the asymmetries in the Coulomb
friction components were dominant. This observation was also corroborated by
the results presented in [2].
Another effect takes place after the stiction force has been surmounted: the
friction force decreases exponentially, reaching approximately 60% of the breakaway
force ([8]), and then increases proportionally to the velocity. These bends occur at
velocities close to zero. This type of friction structure is sometimes known as stick-slip friction. It arises because static friction is greater than the level of Coulomb
friction at zero velocity. Stribeck friction can be explained as an inertial effect
occurring when trying to separate two objects which have been at rest for long periods
of time. The Stribeck friction force decreases as movement occurs. The phenomenon
of friction decreasing during a sliding period after movement begins is called stick-slip. To
capture this behavior an empirical Stribeck velocity parameter, $v_s$, is
used, as shown later in the mathematical models of friction. The
Cincinnati Milacron test procedure [14] indicates that when $F_s/F_c < 0.85$ stick-slip will be eliminated. It is also widely observed that stick-slip can be eliminated
by stiffening a mechanism. The following expressions summarize the main friction
components:
$$F_f(v) = a\,\mathrm{sgn}(v) \tag{2.3}$$
$$F_f(v) = a_i\,\mathrm{sgn}(v) + b_i v \tag{2.4}$$
$$F_f(v) = \left(a_0 + a_1 e^{-b|v|}\right)\mathrm{sgn}(v) \tag{2.5}$$
where $a$ in (2.3) represents the Coulomb friction, and $a_i$, $b_i$ in (2.4) are the coefficients of the asymmetric
model of Coulomb and viscous friction. In (2.5) the sum $a_0 + a_1$ represents the breakaway
force and $b$ the slip constant.
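The three static characteristics (2.3) to (2.5) can be written down directly as functions of velocity. The following is a minimal sketch, not code from the cited works: the function names are hypothetical, the convention sgn(0) = 0 is an assumption, and the index i in (2.4) is interpreted as selecting one coefficient pair per direction of motion.

```python
import math


def sgn(v):
    """Sign function; sgn(0) = 0 is a convention assumed for this sketch."""
    return (v > 0) - (v < 0)


def coulomb(v, a):
    """Eq. (2.3): constant magnitude a, sign follows the velocity."""
    return a * sgn(v)


def coulomb_viscous(v, a_pos, b_pos, a_neg, b_neg):
    """Eq. (2.4): the pair (a_i, b_i) is chosen by the direction of motion."""
    a, b = (a_pos, b_pos) if v >= 0 else (a_neg, b_neg)
    return a * sgn(v) + b * v


def stiction_decay(v, a0, a1, b):
    """Eq. (2.5): level a0 + a1 near zero velocity, decaying towards a0."""
    return (a0 + a1 * math.exp(-b * abs(v))) * sgn(v)
```

With these definitions, `stiction_decay` approaches the breakaway level `a0 + a1` as the velocity tends to zero and the level `a0` at high velocity, at a rate set by the slip constant `b`.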
We note, before presenting the dominant models, that when the velocity is not
constant, the dynamics of the model become very important and give rise to different
types of phenomena such as friction lag (frictional memory): a change in friction
lags changes in velocity or load. Also, the friction force is lower for decreasing
velocities than for increasing velocities, which corresponds to hysteresis
in the relation between friction and velocity. The hysteresis loop becomes wider
at higher rates of velocity change. Hess & Soom explained their experimental
results by a pure time delay in the relation between velocity and friction force.
Finally, another time-dependent property of friction is the rise of static friction
with increasing dwell time. Dwell time is the time spent in static friction. In [25]
an empirical model is proposed that incorporates the relation between static friction
and dwell time as:
$$F_s(t) = F_{s,\infty} - (F_{s,\infty} - F_c)\,e^{-\gamma t^{m}} \tag{2.6}$$
where $F_{s,\infty}$ is the ultimate static friction, $F_c$ is the Coulomb friction at the moment
of arrival in the stuck condition, and $\gamma$, $m$ are empirical parameters. In [3], [4] a
model of rising static friction is presented which is useful for analysis and solves some problems
associated with using $F_c$ as the starting point of the static friction rise. The model
is:
$$F_{s,b_n}(t_2) = F_{s,a_{n-1}} + (F_{s,\infty} - F_{s,a_{n-1}})\,\frac{t_2}{t_2 + \gamma} \tag{2.7}$$
where $F_{s,b_n}$ is the level of Stribeck friction at the beginning (breakaway) of the $n$th
interval of slip, and $F_{s,a_{n-1}}$ is the Stribeck friction at the end (arrival) of the previous interval of slip. Note that $\gamma$ is still an empirical factor, but its physical
dimension differs from that of $\gamma$ in equation (2.6).
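The two dwell-time laws (2.6) and (2.7) differ in how they saturate towards the ultimate static friction, which a short numerical sketch makes easy to compare. The function and argument names below are hypothetical, and any parameter values passed to them are illustrative rather than taken from [25] or [3], [4].

```python
import math


def rising_static_exp(t, fs_inf, fc, gamma, m):
    """Eq. (2.6): static friction after a dwell time t (exponential rise)."""
    return fs_inf - (fs_inf - fc) * math.exp(-gamma * t ** m)


def rising_static_rational(t2, fs_prev, fs_inf, gamma):
    """Eq. (2.7): Stribeck friction at breakaway after a dwell time t2.

    fs_prev plays the role of F_{s,a_{n-1}}, the level at the end of the
    previous slip interval.
    """
    return fs_prev + (fs_inf - fs_prev) * t2 / (t2 + gamma)
```

In (2.7) the parameter `gamma` is the dwell time at which half of the gap between `fs_prev` and `fs_inf` has been recovered, which is one way to see that it carries the dimension of time, unlike the exponent coefficient in (2.6).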
2.1.1 Mathematical Models of Friction
In [3] a seven-parameter model is proposed, in which the friction is given by:
• Not sliding (presliding displacement)
$$F_f(x) = -k_t x \tag{2.8}$$
• Sliding (Coulomb + viscous + Stribeck curve function with frictional memory)
$$F_f(\dot{x}, t) = -\left(F_c + F_v|\dot{x}| + F_s(\gamma, t_2)\,\frac{1}{1 + \left(\frac{\dot{x}(t-\tau_L)}{\dot{x}_s}\right)^{2}}\right)\mathrm{sgn}(\dot{x}) \tag{2.9}$$
• Rising static friction (friction level at breakaway)
$$F_s(\gamma, t_2) = F_{s,a} + (F_{s,\infty} - F_{s,a})\,\frac{t_2}{t_2 + \gamma} \tag{2.10}$$
where:
- $F_f(\cdot)$ is the instantaneous friction force
- $F_c$ (*) is the Coulomb friction force
- $F_v$ (*) is the viscous friction force
- $F_s$ is the magnitude of the Stribeck friction (frictional force at breakaway is $F_c + F_s$)
- $F_{s,a}$ is the magnitude of the Stribeck friction at the end of the previous sliding period
- $F_{s,\infty}$ (*) is the magnitude of the Stribeck friction after a long time at rest (with a slow application of force)
- $k_t$ (*) is the tangential stiffness of the static contact
- $\dot{x}_s$ (*) is the characteristic velocity of the Stribeck friction
- $\tau_L$ (*) is the time constant of frictional memory
- $\gamma$ (*) is the temporal parameter of the rising static friction
- $t_2$ is the dwell time, i.e. the time at zero velocity.
(*) marks friction model parameters; the other variables are state variables.
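The three regimes of the seven-parameter model translate into three small functions. This is a sketch under stated assumptions rather than the implementation of [3]: the names are hypothetical, and the delayed velocity $\dot{x}(t - \tau_L)$ is assumed to be supplied by the caller (for instance from a buffer of past samples), since how it is stored is not specified here.

```python
def sgn(v):
    """Sign function; sgn(0) = 0 is a convention assumed for this sketch."""
    return (v > 0) - (v < 0)


def presliding_friction(x, kt):
    """Eq. (2.8): spring-like force while the contact is not sliding."""
    return -kt * x


def stribeck_level(t2, fs_a, fs_inf, gamma):
    """Eq. (2.10): Stribeck friction level at breakaway after dwell time t2."""
    return fs_a + (fs_inf - fs_a) * t2 / (t2 + gamma)


def sliding_friction(v, v_delayed, fc, fv, fs_level, vs):
    """Eq. (2.9): Coulomb + viscous + Stribeck term with frictional memory.

    v_delayed stands for x_dot(t - tau_L); fs_level is F_s(gamma, t2).
    """
    stribeck = fs_level / (1.0 + (v_delayed / vs) ** 2)
    return -(fc + fv * abs(v) + stribeck) * sgn(v)
```

Passing the delayed velocity as an explicit argument keeps the frictional memory visible: the Stribeck term reacts to the velocity one time constant $\tau_L$ ago, while the Coulomb and viscous terms react to the current velocity.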
The magnitude of the seven friction parameters will naturally depend upon the
mechanism and lubrication, but typical values may be offered. These are summarized in Table 2.1. Each of the seven parameters of
the model represents a different friction phenomenon, and Table 2.2 presents the effect of these phenomena on sliding behavior. This model, however, does not combine the different
friction phenomena; it is in fact one model for stiction and another for sliding
friction. Another dynamic model, suggested by Rice and Ruina, has been used in
connection with control by Dupont. This model is not defined at zero velocity.
In [8] a friction model covering most of the friction components is expressed as
follows:
$$F_f(\dot{x}) = \left[a_0 + a_1 e^{-b_1|\dot{x}|} + a_2\left(1 - e^{-b_2|\dot{x}|}\right)\right]\mathrm{sgn}(\dot{x}) \tag{2.11}$$
where the $a_i$'s and $b_i$'s are positive constants. Asymmetries can be included in (2.11) by
letting the $a_i$'s be different for different velocity directions. The $b_i$'s can be kept
constant.

Parameter | Range | Parameter depends principally upon
$F_c$ | $0.001 - 0.1 \cdot F_n$ | Lubricant viscosity, contact geometry and loading
$F_v$ | 0 - very large | Lubricant viscosity, contact geometry and loading
$F_{s,\infty}$ | $0 - 0.1 \cdot F_n$ | Boundary lubrication, $F_c$
$k_t$ | $\frac{1}{\Delta x}(F_s + F_c)$; $\Delta x \simeq 1 - 50$ [µm] | Material properties and surface finish
$\dot{x}_s$ | $0.00001 - 0.1$ [meter/second] | Boundary lubrication, lubricant viscosity, contact geometry and loading
$\tau_L$ | $1 - 50$ [ms] | Lubricant viscosity, contact geometry and loading
$\gamma$ | $0 - 206$ [s] | Boundary lubrication
Table 2.1: Approximate ranges for the parameters of seven parameter friction model

Friction model | Predicted/Observed behavior
Viscous | Stability at all velocities and at velocity reversals
Coulomb | No stick-slip for PD control; no hunting for PID control
Static+Coulomb+Viscous | Predicts stick-slip for certain initial conditions under PD control; predicts hunting under PID control
Stribeck | Needed to correctly predict initial conditions leading to stick-slip
Rising static friction | Needed to correctly predict interaction of velocity and stick-slip amplitude
Frictional memory | Needed to correctly predict interaction of stiffness and stick-slip amplitude
Presliding | Needed to correctly predict small displacements while sticking (including velocity reversals)
Table 2.2: Friction model capabilities

The non-linearity of (2.11) in its parameters restricts its utility for on-line identification (linear predictors require a model expression that is linear in the
parameters). A simplified model is also presented in the same paper that captures
the asymmetries and stick-slip while remaining linear in the unknown parameters.
This model is:
$$F_f(\dot{x}) = \left[a_0 + a_1|\dot{x}|^{1/2} + a_2|\dot{x}|\right]\mathrm{sgn}(\dot{x}) \tag{2.12}$$
To evaluate the precision that can be achieved with this reduction, the parameters
$a_i$ of the model (2.12) are estimated by minimizing a least-squares criterion. The same paper also notes that the $a_i$'s are not unique:
several sets of parameters $a_i$ may exist that lead to equivalent approximations.
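Because (2.12) is linear in $a_0$, $a_1$, $a_2$, the least-squares estimate follows from the normal equations of the regressor $\mathrm{sgn}(\dot{x})\,[1, |\dot{x}|^{1/2}, |\dot{x}|]$. The sketch below is illustrative only: the batch formulation, the function names and the synthetic data (generated from (2.11) with made-up parameter values) are assumptions, not the procedure or data of [8].

```python
import math


def sgn(v):
    return (v > 0) - (v < 0)


def regressor(v):
    """Regressor of (2.12): F_f = regressor(v) . [a0, a1, a2]."""
    s = sgn(v)
    return [s, s * math.sqrt(abs(v)), s * abs(v)]


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_friction(velocities, forces):
    """Batch least squares for (2.12) via the normal equations."""
    G = [[0.0] * 3 for _ in range(3)]
    h = [0.0] * 3
    for v, f in zip(velocities, forces):
        phi = regressor(v)
        for i in range(3):
            h[i] += phi[i] * f
            for j in range(3):
                G[i][j] += phi[i] * phi[j]
    return solve_linear(G, h)


def model_2_11(v, a, b):
    """Eq. (2.11), used here only to generate synthetic data."""
    a0, a1, a2 = a
    b1, b2 = b
    mag = a0 + a1 * math.exp(-b1 * abs(v)) + a2 * (1 - math.exp(-b2 * abs(v)))
    return mag * sgn(v)


# Hypothetical experiment: fit (2.12) to data generated by (2.11).
velocities = [s * 0.01 * k for k in range(1, 101) for s in (1, -1)]
forces = [model_2_11(v, (1.0, 0.5, 0.8), (20.0, 2.0)) for v in velocities]
a_hat = fit_friction(velocities, forces)
residuals = [f - sum(p * q for p, q in zip(regressor(v), a_hat))
             for v, f in zip(velocities, forces)]
rms = math.sqrt(sum(r * r for r in residuals) / len(residuals))
print("estimated a:", a_hat, "RMS residual:", rms)
```

The RMS residual printed at the end is one way to quantify the precision lost by replacing (2.11) with the linear-in-parameters form (2.12) over the chosen velocity range.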
The LuGre model, its variants and approximation techniques
In [9] the friction interface between two surfaces is pictured as a
contact between bristles. The average deflection of the bristles is denoted by $z$ and
is modelled by:
$$\frac{dz}{dt} = v - \frac{|v|}{g(v)}\,z \tag{2.13}$$
where v is the relative velocity between the two surfaces. The first term gives a
deflection which is proportional to the integral of the relative velocity. The second
term asserts that the deflection z approaches the value:
$$z_{ss} = \frac{v}{|v|}\,g(v) = g(v)\,\mathrm{sgn}(v) \tag{2.14}$$
in steady state, i.e. when $v$ is constant. The function $g$ is positive and depends on
many factors such as material properties, lubrication and temperature. It need not be
symmetrical, so direction-dependent behavior can be captured. For typical
bearing friction, $g(v)$ decreases monotonically from $g(0)$ as $v$ increases. This
corresponds to the Stribeck effect. The friction force generated from the bending of the
bristles is described as:
$$F = \sigma_0 z + \sigma_1 \frac{dz}{dt} + \sigma_2 v \tag{2.15}$$
where $\sigma_0$ is the stiffness, $\sigma_1$ a damping coefficient and $\sigma_2 v$ a term which accounts for
viscous friction. The functions $\sigma_0 g(v)$ and $\sigma_2 v$ can be determined by measuring the
steady-state friction force when the velocity is held constant. A parameterization
of $g$ that has been proposed to describe the relation between velocity and friction
force for steady-state motion is given by:
$$F_{ss}(v) = \sigma_0 g(v)\,\mathrm{sgn}(v) + \sigma_2 v \tag{2.16}$$
$$= F_c\,\mathrm{sgn}(v) + (F_s - F_c)\,e^{-(v/v_s)^2}\,\mathrm{sgn}(v) + \sigma_2 v \tag{2.17}$$
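The relation between the dynamic model (2.13), (2.15) and the steady-state characteristic (2.16), (2.17) can be checked numerically by holding the velocity constant and integrating the bristle state until it settles. The sketch below uses explicit Euler integration and illustrative parameter values; the step size, the dictionary keys and the function names are assumptions made for this example, with $g(v)$ taken from (2.16) and (2.17) as $\sigma_0 g(v) = F_c + (F_s - F_c)e^{-(v/v_s)^2}$.

```python
import math


def g(v, p):
    """g(v) implied by (2.16)-(2.17)."""
    return (p["fc"] + (p["fs"] - p["fc"]) * math.exp(-(v / p["vs"]) ** 2)) / p["sigma0"]


def lugre_step(z, v, dt, p):
    """One explicit Euler step of (2.13); returns new z and the force (2.15)."""
    dz = v - abs(v) / g(v, p) * z
    force = p["sigma0"] * z + p["sigma1"] * dz + p["sigma2"] * v
    return z + dt * dz, force


def steady_state_force(v, p):
    """Eq. (2.16): friction force when v is held constant."""
    s = (v > 0) - (v < 0)
    return p["sigma0"] * g(v, p) * s + p["sigma2"] * v


def simulate_constant_velocity(v, p, dt=1e-5, steps=2000):
    """Integrate from z = 0 at constant velocity; dt must resolve |v|/g(v)."""
    z, force = 0.0, 0.0
    for _ in range(steps):
        z, force = lugre_step(z, v, dt, p)
    return z, force


# Illustrative parameter values (assumed for this sketch).
params = {"sigma0": 1e5, "sigma1": 316.0, "sigma2": 0.4,
          "fc": 1.0, "fs": 1.5, "vs": 0.001}
z_end, f_end = simulate_constant_velocity(0.1, params)
print(f_end, steady_state_force(0.1, params))
```

The bristle dynamics are fast (the rate $|v|/g(v)$ is of the order of $10^4$ per second for these values), so an explicit scheme needs a correspondingly small step; a stiff solver would be a reasonable alternative.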
In that paper it is assumed that the parameters $\sigma_0$, $\sigma_1$, $\sigma_2$ and the function $g(v)$ are
known, and a non-linear friction observer is used to estimate the unmeasurable state
$z$. The observer is given by:
$$\frac{d\hat{z}}{dt} = v - \frac{|v|}{g(v)}\,\hat{z} - Ke \tag{2.18}$$
$$\hat{F} = \sigma_0 \hat{z} + \sigma_1 \frac{d\hat{z}}{dt} + \sigma_2 v \tag{2.19}$$
where $K > 0$ and $e$ is the position error; with this observer, position control can be achieved. If $x_d$ is the
desired reference, assumed to be twice differentiable, then the position error is
defined as $e = x - x_d$, and the term $Ke$ in (2.18) is a correction term driven by the position
error.
Velocity control is proposed similarly, with $e = v - v_d$, where $v_d$ is the desired
velocity, which is assumed to be differentiable.
Assuming that the friction model and its parameters are known exactly
is of course a strong assumption. In addition, the accuracy required in the
velocity measurement poses a similar problem. The friction model given by (2.17)
is used in [1], where the sgn(·) function is approximated by tanh(σx), in which
σ defines the slope of the function: the larger the value, the steeper the slope.
They suggest that a value of σ = 30 provides a close fit while capturing most
of the friction effects. They also suggest that the parameters may be estimated, as a
structured disturbance, using an observer. A new system is then created in which the
parameters become additional states augmented to the original state-space system.
Using the non-linear Luenberger observer for parameter estimation can also
serve as an on-line method for detecting faults. However, extra attention needs
to be paid to the selection of the parameters to be estimated, since observability must
be maintained. In some applications the friction variation may also depend on
the actual value of the position or on a more complex combination of position and
velocity. As a consequence, in some applications friction must be
explicitly parameterized not only as a function of velocity but also as a function of
position (see [10]). Another form of the model described previously is given in [10] as:
$$\dot{z} = -\alpha(\dot{x})|\dot{x}|z + \dot{x} \tag{2.20}$$
$$F = \sigma_0 z + \sigma_1 \dot{z} + \sigma_2 \dot{x} \tag{2.21}$$
where $z$ denotes the average deflection of the bristles, which is not measurable, and $\alpha(\dot{x})$
is a finite positive function. One parameterization of $\alpha(\dot{x})$ which describes the Stribeck
effect is:
$$\alpha(\dot{x}) = \frac{\sigma_0}{f_c + (f_s - f_c)\,e^{-(\dot{x}/\dot{x}_s)^2}} \tag{2.22}$$
where $f_c$ is the Coulomb friction level, $f_s$ is the level of the stiction force and $\dot{x}_s$ the
Stribeck velocity. The parameters $\sigma_0$, $\sigma_1$ are assumed to be known. In this case,
$\sigma_0/f_s \le \alpha(\dot{x}) \le \sigma_0/f_c$ if it is assumed that $f_s \ge f_c$, since the denominator of (2.22) then lies between $f_c$ and $f_s$. As mentioned above, the
friction may be position dependent. In this case, [10] assumes that $\alpha(x, \dot{x})$ is a
positive smooth function of $x$ and $\dot{x}$ that is bounded from above and below. There is no need to
know the exact form of the function, as generalized basis functions are used to
emulate it. In this way it can capture properties related not only to velocity but
also to position. So the model can be:
$$\dot{z} = -\alpha(x, \dot{x})|\dot{x}|z + \dot{x} \tag{2.23}$$
$$F = \sigma_0 z + \sigma_1 \dot{z} + \sigma_2 \dot{x} \tag{2.24}$$
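Evaluating the parameterization (2.22) directly shows where its bounds come from: with $f_s \ge f_c$ the function equals $\sigma_0/f_s$ at zero velocity and rises towards $\sigma_0/f_c$ as the speed grows. The sketch below checks this on a few velocities and also writes one explicit Euler step of the model (2.20), (2.21), or (2.23), (2.24) when a position-dependent $\alpha$ is supplied. All names, the Euler discretization and the numerical values are assumptions for illustration only.

```python
import math


def alpha(v, sigma0, fc, fs, vs):
    """Eq. (2.22): velocity-dependent coefficient describing the Stribeck effect."""
    return sigma0 / (fc + (fs - fc) * math.exp(-(v / vs) ** 2))


def friction_step(z, v, dt, sigma0, sigma1, sigma2, alpha_value):
    """One explicit Euler step of z_dot = -alpha |v| z + v and F as in (2.21)/(2.24).

    alpha_value may come from alpha(v, ...) or from any bounded positive
    function of position and velocity.
    """
    z_dot = -alpha_value * abs(v) * z + v
    force = sigma0 * z + sigma1 * z_dot + sigma2 * v
    return z + dt * z_dot, force


# Illustrative values (assumed): check the bounds sigma0/fs <= alpha <= sigma0/fc.
sigma0, fc, fs, vs = 1e5, 1.0, 1.5, 0.001
lower, upper = sigma0 / fs, sigma0 / fc
samples = [alpha(v, sigma0, fc, fs, vs) for v in (0.0, 0.0005, 0.001, 0.01, 1.0)]
print(lower, samples, upper)
```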
If $\alpha(x, \dot{x})$ is assumed completely unknown, neural networks are a possible tool to
approximate the non-linear mapping. The approximation $\alpha_{\alpha}(W, x, \dot{x})$ is written as:
$$\alpha_{\alpha}(x, \dot{x}) = W^T S(x, \dot{x}) \tag{2.25}$$
where $W = [w_1, w_2, \ldots, w_l]^T \in \mathbb{R}^l$ is the parameter vector and $S(x, \dot{x}) = [s_1(x, \dot{x}),
s_2(x, \dot{x}), \ldots, s_l(x, \dot{x})]^T \in \mathbb{R}^l$ is the vector of bounded basis functions, and therefore we
have:
$$\alpha(x, \dot{x}) = W^T S(x, \dot{x}) + \epsilon \tag{2.26}$$
with $\epsilon$ being the modelling error, which is assumed to be bounded. If the non-linear
function $\alpha(x, \dot{x})$ is in the functional range of the approximation, then $\epsilon = 0$. For the
case where $\alpha(\dot{x})$ is only a function of $\dot{x}$, the rough form/shape of $\alpha(\dot{x})$ as a function of
velocity is infinitely smooth. This piece of information, as they suggest, helps to
find more appropriate basis functions for the approximation rather than constructing
the NN blindly, as is done in most cases. The function is:
$$\alpha(\dot{x}) = \frac{c_0}{1 + c_1 e^{-(\dot{x}/\dot{x}_s)^2}} \tag{2.27}$$
where $c_0 = \sigma_0/f_c$, $c_1 = (f_s - f_c)/f_c$ and $c_2 = \dot{x}_s^2$. The parameters $c_1 > 0$ and $c_2 < 1$
appear non-linearly. The suggested polynomial approximation expands
(2.27) in a Taylor series around $\dot{x}^2 = 0$. Then:
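$$\alpha(\dot{x}) = \sum_{k=0}^{m} \frac{1}{k!} \left.\frac{\partial^k \alpha}{\partial(\dot{x}^2)^k}\right|_{\dot{x}^2=0} (\dot{x}^2)^k + \epsilon_t = W^T S(\dot{x}) + \epsilon_t$$
where:
$$W = \left[\alpha|_{\dot{x}^2=0},\ \left.\frac{\partial \alpha}{\partial \dot{x}^2}\right|_{\dot{x}^2=0},\ \ldots,\ \frac{1}{m!}\left.\frac{\partial^m \alpha}{\partial(\dot{x}^2)^m}\right|_{\dot{x}^2=0}\right]^T \tag{2.28}$$
and
$$S(\dot{x}) = [1, \dot{x}^2, \ldots, (\dot{x}^2)^m]^T$$
The remainder $\epsilon_t$ is given by:
$$\epsilon_t = \frac{1}{(m+1)!} \left.\frac{\partial^{(m+1)} \alpha}{\partial(\dot{x}^2)^{(m+1)}}\right|_{\dot{x}^2=0} (\dot{x}^2)^{m+1}$$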
In theory, the remainder decreases as $m$ increases. Numerical calculation shows, however,
that $m$ needs to be very large in order to obtain an acceptable approximation
accuracy, because $c_2$, which is of the order of $10^{-4}$, appears in the denominators of the
terms.
Another approximation method is the non-linear functional approximation. In
this method, under the assumption that $c_2 = \dot{x}_s^2$ is known exactly, (2.27) can be
expanded around the nominal value $c_{1n}$ using a Taylor series as:
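\[ \alpha(\dot{x}) = \sum_{k=0}^{m} \frac{1}{k!} \left.\frac{\partial^k \alpha}{\partial c_1^k}\right|_{c_1=c_{1n}} (c_1 - c_{1n})^k + \epsilon_t = W^T S(\dot{x}) + \epsilon_t \tag{2.29} \]
where:
\[ W = [\,c_0,\ \ldots,\ (-1)^m c_0 (c_1 - c_{1n})^m\,]^T, \qquad S(\dot{x}) = \left[ \frac{1}{1 + c_{1n} e^{-\dot{x}^2/c_2}},\ \ldots,\ \frac{e^{-m\dot{x}^2/c_2}}{(1 + c_{1n} e^{-\dot{x}^2/c_2})^{m+1}} \right]^T \tag{2.30} \]
Here c1n is the nominal value of c1 and the remainder ε_t is given, with 0 ≤ ξ ≤ 1, by:
\[ \epsilon_t = \frac{(-1)^{(m+1)} c_0\, e^{-(m+1)\dot{x}^2/c_2}}{(1 + \xi e^{-\dot{x}^2/c_2})^{m+1}} (c_1 - c_{1n})^{(m+1)} \tag{2.31} \]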
It can be seen that |ε_t| ≤ c0 |c1 − c1n|^(m+1) for all c1 ∈ (0, 1). Because |c1 − c1n| < 1, the upper bound c0 |c1 − c1n|^(m+1) decreases as m increases. The result is global because the upper bound of the approximation error is independent of the operating range of ẋ, which can be very large. From (2.30), the primitives used to construct the basis function vector S are e^(−ẋ²/c2) and 1/(1 + c1n e^(−ẋ²/c2)).
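The functional approximation of (2.29) and (2.30) can be checked numerically. The sketch below builds W and S for a given order m and compares W^T S with the exact expression (2.27). The function names and all numerical values are assumptions chosen only for illustration.

```python
import math


def alpha_exact(v, c0, c1, c2):
    """Evaluate (2.27), with c2 equal to the squared Stribeck velocity."""
    return c0 / (1.0 + c1 * math.exp(-v * v / c2))


def basis(v, c1n, c2, m):
    """Basis vector S of (2.30): e^(-k v^2/c2) / (1 + c1n e^(-v^2/c2))^(k+1)."""
    e = math.exp(-v * v / c2)
    d = 1.0 + c1n * e
    return [e ** k / d ** (k + 1) for k in range(m + 1)]


def weights(c0, c1, c1n, m):
    """Weight vector W of (2.30): (-1)^k c0 (c1 - c1n)^k."""
    return [(-1) ** k * c0 * (c1 - c1n) ** k for k in range(m + 1)]


def alpha_approx(v, c0, c1, c1n, c2, m):
    """Truncated expansion W^T S of (2.29), without the remainder."""
    w = weights(c0, c1, c1n, m)
    s = basis(v, c1n, c2, m)
    return sum(wk * sk for wk, sk in zip(w, s))
```

With, for example, c0 = 100, c1 = 0.5, c1n = 0.3, c2 = 10^−4 and m = 5, the difference between `alpha_exact` and `alpha_approx` stays below the bound c0 |c1 − c1n|^(m+1) over the whole velocity range, in line with the global character of the result.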
Each element si of S can be written as:
\[ \left[ e^{-\dot{x}^2/c_2} \right]^i \left[ \frac{1}{1 + c_{1n} e^{-\dot{x}^2/c_2}} \right]^j = \frac{e^{-i\dot{x}^2/c_2}}{(1 + c_{1n} e^{-\dot{x}^2/c_2})^j} \]
with i + j ≥ 0, i, j ≥ 0. In general, the smaller i + j is, the more important the term is for the reconstruction; the higher i + j is, the better the approximation accuracy. If both c1 and c2 are unknown, the nonlinear function can be expanded with respect to both of them around their nominal values (c1n, c2n), and the basis function vector S(ẋ) and the corresponding unknown parameters W can then be found. The remainder can be quantified similarly. After some calculation, the primitives for this case are found to be ẋ², e^(−ẋ²/c2n) and 1/(1 + c1n e^(−ẋ²/c2n)). Each element si of S should be of the form
\[ \frac{\dot{x}^{2i}\, e^{-j\dot{x}^2/c_{2n}}}{(1 + c_{1n} e^{-\dot{x}^2/c_{2n}})^k} \]
with i + j + k ≥ 0, i, j, k ≥ 0. As before, the smaller i + j is, the more important the term is for the reconstruction, and the higher i + j is, the better the approximation accuracy.
If no knowledge is available, a neural network, as mentioned earlier, can be used to generate the I/O map, using the property that a multi-layer NN can approximate any function, under mild assumptions, to any desired accuracy. It has been proven that any continuous function, not necessarily infinitely smooth, can be uniformly approximated by linear combinations of Gaussians. The Gaussian RBF neural network is a particular network architecture which uses l Gaussian functions of the form:
\[ s_i(x,\dot{x}) = \exp\left[ -\frac{(x - \mu_{1i})^2 + (\dot{x} - \mu_{2i})^2}{\sigma^2} \right] \]
where [x, ẋ]^T ∈ ℝ² is the input variable, σ² ∈ ℝ is the variance and [µ1i, µ2i]^T ∈ ℝ² is the center vector. The si are the elements of the basis function vector S(x, ẋ) of the approximation:
\[ \alpha_\alpha(W, x, \dot{x}) = W^T S(x,\dot{x}) \]
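A minimal Python sketch of the Gaussian RBF approximation follows. It evaluates the basis vector S(x, ẋ) for a list of centers and forms W^T S. The helper names and the grid layout of the centers are assumptions for illustration; the source does not prescribe how the centers are chosen.

```python
import math


def rbf_basis(x, x_dot, centers, sigma2):
    """Basis vector S(x, x_dot): one Gaussian per center (mu1, mu2)."""
    return [math.exp(-((x - mu1) ** 2 + (x_dot - mu2) ** 2) / sigma2)
            for mu1, mu2 in centers]


def alpha_rbf(weights, x, x_dot, centers, sigma2):
    """Network output W^T S(x, x_dot)."""
    s = rbf_basis(x, x_dot, centers, sigma2)
    return sum(w * si for w, si in zip(weights, s))


def grid_centers(x_values, v_values):
    """Hypothetical helper: place the centers on a rectangular grid."""
    return [(mu1, mu2) for mu1 in x_values for mu2 in v_values]
```

Each basis function equals 1 at its own center and decays with the squared distance from it, so every si is bounded between 0 and 1, as required of the basis functions in (2.25).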
The shortcoming of the model of [9], [10], known as the LuGre model, lies in the inadequacy of its hysteresis part: it does not account for non-local memory and it cannot accommodate arbitrary displacement-force transition curves.
In [22] it is noted that, although the LuGre model gives a good description of the constant velocity behavior and offers a smooth transition at velocity reversal, its modeling capabilities in the presliding regime are restricted as follows:
⇒ The model is too dissipative in presliding.
⇒ The shape of the transition curve is fixed by the model and therefore cannot be adapted to actually measured values.
Regarding the latter, besides the parameter σ0, which models the initial stiffness at velocity reversal, no parameters are left for shaping the transition curves. These always have the same form, so the model is inadequate for fitting transition curves of arbitrary form. [22] also presents an improved friction model in which the friction force F is modelled by a set of two equations that, as in the case of the LuGre model, depend on a state variable z representing the average deformation of the asperities of the contacting surfaces. The first equation, the friction force equation, is:
\[ F = F_h(z) + \sigma_1 \frac{dz}{dt} + \sigma_2 v \tag{2.32} \]
where σ1 is a micro-viscous damping coefficient, σ2 is the viscous damping coefficient and v is the velocity of the moving object. Fh(z) is the hysteresis friction force, that is, the part of the friction force exhibiting hysteretic behavior. It is a static hysteresis nonlinearity with non-local memory. This hysteresis function consists of transition curves (curves between two reversal points or extrema). Each velocity reversal initiates a new transition curve, adds a new extremum to the hysteresis memory and resets the state variable z to zero. The transition curve which is active at a certain time is represented by Fd(z), and the value of Fh(z) at the beginning of a transition curve is represented by Fb. Then we have:
\[ F_h(z) = F_b + F_d(z) \tag{2.33} \]
Fd, as the authors mention, is a point-symmetrical, strictly increasing function of z.
The second equation, the nonlinear state equation, is based on the current hysteresis transition curve Fd(z) and the current velocity, that is:
\[ \frac{dz}{dt} = v \left( 1 - \operatorname{sign}\!\left( \frac{F_d(z)}{S(v) - F_b} \right) \left| \frac{F_d(z)}{S(v) - F_b} \right|^n \right) \tag{2.34} \]
where S(v) is the constant velocity behavior in sliding, which is the same as in (2.17) without the viscous friction term σ2 v. The parameter n allows one to modify the influence of Fd(z)/(S(v) − Fb) on the difference between dz/dt and v, so that the model behavior corresponds better to friction measurements in the transition from presliding to sliding. The constant velocity (sliding) behavior described by this model is exactly the same as in the LuGre model. Namely, the friction force is given by:
\[ F = F_b + F_d(z) + \sigma_2 v = S(v) + \sigma_2 v \tag{2.35} \]
which has the same form as (2.17). The difference with respect to the LuGre model lies in the zero velocity (presliding) behavior, where a hysteresis model with non-local memory is included. Consequently, from (2.32) and (2.34), for zero velocity we have:
\[ F = F_b + F_d(z) = F_h(z), \qquad \frac{dz}{dt} = 0 \]
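The two limiting cases discussed above, sliding and zero velocity, can be read directly from (2.32) to (2.34). The small Python sketch below evaluates the state equation and the friction force for given values of Fb, Fd and S(v). It deliberately leaves out the transition curve Fd(z) and the hysteresis memory, which [22] specifies and which are not reproduced here; all function names are assumptions for illustration.

```python
def hysteresis_force(f_b, f_d):
    """Fh = Fb + Fd, equation (2.33)."""
    return f_b + f_d


def state_rate(v, f_d, s_v, f_b, n):
    """dz/dt from (2.34); assumes S(v) differs from Fb."""
    ratio = f_d / (s_v - f_b)
    sign = (ratio > 0) - (ratio < 0)   # sign(0) is taken as 0
    return v * (1.0 - sign * abs(ratio) ** n)


def friction_force(f_b, f_d, dz_dt, v, sigma1, sigma2):
    """Friction force F from (2.32)."""
    return hysteresis_force(f_b, f_d) + sigma1 * dz_dt + sigma2 * v
```

When Fd(z) reaches S(v) − Fb, `state_rate` returns zero and the force reduces to S(v) + σ2 v as in (2.35); when v = 0 the state does not change and F = Fh(z).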
The hysteresis model relates the state variable z and the hysteresis force Fh. The implementation of the hysteresis model requires two memory stacks: one for the minima of Fh in ascending order (stack m) and one for the maxima of Fh (stack M). The stacks grow at velocity reversal and shrink when an internal hysteresis loop is closed. The stacks are reset when the system goes from presliding to sliding. The value of Fb equals the most recent element of stack M if the transition curve is descending and of stack m if the transition curve is ascending. The value of the state variable z is reset to zero at each velocity reversal and recalculated at the closing of an internal loop. [22] gives an analytical description of the mechanisms which govern the hysteresis model; these mechanisms also appear in [23] and in the hysteresis section of this report below.
The above model can account accurately for experimentally obtained friction characteristics, namely Stribeck friction in sliding, hysteretic behavior in presliding, frictional lag, varying break-away and stick-slip behavior.
2.2 Backlash
Actuator and sensor nonlinearities are among the key factors limiting both static
and dynamic performance of feedback control systems. Harmful effects of backlash
in gears are well known. Backlash prevents accurate positioning and may lead to
chattering and limit-cycle-type instabilities. This increases wear and tear of the
gears, which, in turn, increases backlash. This phenomenon has haunted the constructors of control systems for more than 50 years: from the servomechanisms in
the 1940s to the modern high precision robotic manipulators. Typically the concept
of backlash is associated with gear trains and similar mechanical couplings. Backlash can sometimes also be used as an approximate description of the delays in drives with elastic cables or in long pipes.
2.2.1 Mathematical Models of Backlash
Perhaps the simplest and most familiar model is the piecewise linear model of backlash hysteresis, in which the backlash characteristic u(t) = B(v(t)) is described by two parallel lines connected by horizontal line segments (see fig. 2.2). The methods presented in [19] and [21] need to construct an inverse model to mitigate the effects of the backlash (see fig. 2.1).
Figure 2.1: Inverting a backlash (the desired signal ud passes through the inverse BI(·) to give v, which passes through the backlash B(·) to give u = ud)
Mathematically, this phenomenon is modelled as:
\[
\dot{u}(t) =
\begin{cases}
m\dot{v}(t) & \text{if } \dot{v}(t) > 0 \text{ and } u(t) = m(v(t) - c_r), \text{ or} \\
 & \text{if } \dot{v}(t) < 0 \text{ and } u(t) = m(v(t) - c_l) \\
0 & \text{otherwise}
\end{cases}
\tag{2.36}
\]
with input v(t) and output u(t), and with cl, cr the left and right crossings respectively, where cr > cl. The signals v(t) and v̇(t) uniquely determine u(t), u̇(t), and knowledge of v̇(t) is necessary to specify whether the motion of the backlash characteristic B(·) is along one of the straight lines or along an inner (horizontal) segment. Further insight into the nature of backlash can be gained from the waveforms of the output u(t) when the input v(t) is a sine signal, shown in fig. 2.2. For this illustration the backlash parameters are m = 1, cr = 0.5, cl = −0.5, and three different initial conditions are used, namely u(0) = −0.5, u(0) = 0 and u(0) = 0.3. For the last two initial conditions there are initial “transients”, after which u(t) settles into its periodic steady state. For the initial condition u(0) = −0.5 the periodic steady state is reached without transients. The periodic steady state of u(t) reveals the two fundamental features of backlash. First, it introduces a phase delay. Second, it causes a loss of information by chopping the peaks of v(t).
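The sine-input experiment described above can be reproduced with a sampled version of (2.36). In the sketch below the output is held inside the band between the two lines u = m(v − cr) and u = m(v − cl), which is one common discrete-time reading of the model for m > 0. The function names and the sampling step are assumptions for illustration.

```python
import math


def backlash_step(u_prev, v, m=1.0, cr=0.5, cl=-0.5):
    """One sample of a discrete-time version of (2.36), assuming m > 0."""
    lower = m * (v - cr)   # right line, followed while v increases
    upper = m * (v - cl)   # left line, followed while v decreases
    return min(max(u_prev, lower), upper)


def backlash_response(v_samples, u0, m=1.0, cr=0.5, cl=-0.5):
    """Output sequence u for an input sequence v and initial output u0."""
    u = u0
    out = []
    for v in v_samples:
        u = backlash_step(u, v, m, cr, cl)
        out.append(u)
    return out


def sine_input(n, dt):
    """Unit-amplitude sine input sampled with step dt."""
    return [math.sin(k * dt) for k in range(n)]
```

With m = 1, cr = 0.5 and cl = −0.5, a unit sine input gives an output whose peaks are chopped at ±0.5 for all three initial conditions, and for u(0) = −0.5 the output follows v − 0.5 from the first sample, without a transient.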
Figure 2.2: Backlash response to a sine input (input and output waveforms for the three initial conditions).
The inverse model of (2.36), as proposed, is:
\[
\dot{v}(t) =
\begin{cases}
\frac{1}{m}\dot{u}_d(t) & \text{if } \dot{u}_d(t) > 0 \text{ and } v(t) = \frac{1}{m}u_d(t) + c_r, \text{ or} \\
 & \text{if } \dot{u}_d(t) < 0 \text{ and } v(t) = \frac{1}{m}u_d(t) + c_l \\
0 & \text{if } \dot{u}_d(t) = 0 \\
g(t,t) & \text{if } \dot{u}_d(t) > 0 \text{ and } v(t) = \frac{1}{m}u_d(t) + c_l \\
-g(t,t) & \text{if } \dot{u}_d(t) < 0 \text{ and } v(t) = \frac{1}{m}u_d(t) + c_r
\end{cases}
\tag{2.37}
\]
where g(τ, t) = δ(τ − t)(cr − cl), with δ(t) the Dirac δ-function. Here ud(t) is a given desired signal for u(t), and a backlash inverse BI(·) defined by (2.37) is such that ud(t) = B(BI(ud(t))).
In the above definition the inverse of a horizontal segment of the backlash characteristic is a vertical jump of distance (cr − cl). The following lemma is proved in [21]:
Lemma 1
The characteristic BI(·) defined by (2.37) is the right inverse of the characteristic (2.36) in the sense that
B(BI(ud(t0))) = ud(t0) ⇒ B(BI(ud(t))) = ud(t), ∀t ≥ t0
for any piecewise continuous ud(t).
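The right-inverse property of Lemma 1 can be illustrated in discrete time. The sketch below cascades a sampled backlash inverse with a sampled backlash and returns the resulting output, which should reproduce the desired signal. The jump of (2.37) appears as a switch between the two lines from one sample to the next. All names and the sampled formulation are assumptions for illustration, not the construction used in [21].

```python
def backlash_step(u_prev, v, m, cr, cl):
    """Sampled backlash: keep u between m(v - cr) and m(v - cl), m > 0."""
    return min(max(u_prev, m * (v - cr)), m * (v - cl))


def backlash_inverse_step(v_prev, ud_prev, ud, m, cr, cl):
    """Sampled counterpart of (2.37)."""
    if ud > ud_prev:          # desired output rising: use the right line
        return ud / m + cr
    if ud < ud_prev:          # desired output falling: use the left line
        return ud / m + cl
    return v_prev             # desired output constant: hold v


def cascade(ud_samples, m, cr, cl):
    """Return u = B(BI(ud)) for a sampled desired signal ud."""
    ud_prev = ud_samples[0]
    v = ud_prev / m + cr      # initialise on the right line so u(t0) = ud(t0)
    u = m * (v - cr)
    out = [u]
    for ud in ud_samples[1:]:
        v = backlash_inverse_step(v, ud_prev, ud, m, cr, cl)
        u = backlash_step(u, v, m, cr, cl)
        out.append(u)
        ud_prev = ud
    return out
```

Because the initialization places the pair (v, u) on one of the lines with u(t0) = ud(t0), the output matches the desired signal at every later sample, up to rounding.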
It is also reported that an initialization of the backlash inverse BI(·) such that u(t) = ud(t) is possible at any given time t0. When the parameters m, cr, cl are unknown, their estimates m̂(t), ĉr(t), ĉl(t) are used instead to design an adaptive feedback backlash inverse. As before, ud(t), u̇d(t) uniquely determine v(t), v̇(t), and knowledge of u̇d(t) is necessary to specify the motion of the backlash inverse.
In [13] a dynamic hysteresis model is defined to approximate backlash hysteresis; this approximation is called backlash-like hysteresis. As in [21], a backlash hysteresis nonlinearity can be described by:
\[
w(t) = P(v(t)) =
\begin{cases}
c(v(t) - B) & \text{if } \dot{v}(t) > 0 \text{ and } w(t) = c(v(t) - B) \\
c(v(t) + B) & \text{if } \dot{v}(t) < 0 \text{ and } w(t) = c(v(t) + B) \\
w(t_-) & \text{otherwise}
\end{cases}
\tag{2.38}
\]
where c > 0 is the slope of the lines and B > 0 is the backlash distance. The continuous-time dynamic model which describes a class of backlash-like hysteresis is given by the following equation:
\[ \frac{dw}{dt} = \alpha \left| \frac{dv}{dt} \right| (cv - w) + B_1 \frac{dv}{dt} \tag{2.39} \]
where α and B1 are constants satisfying c > B1. Equation (2.39) can be solved explicitly for piecewise monotone v:
\[ w(t) = cv(t) + d(v) \tag{2.40} \]
with:
\[ d(v) = [w_0 - cv_0]\, e^{-\alpha(v - v_0)\,\mathrm{sgn}\,\dot{v}} + e^{-\alpha v\,\mathrm{sgn}\,\dot{v}} \int_{v_0}^{v} [B_1 - c]\, e^{\alpha\zeta\,\mathrm{sgn}\,\dot{v}}\, d\zeta \tag{2.41} \]
for v̇ constant and w(v0) = w0. For d(v), it can be shown that if w(v; v0, w0) is the solution of (2.39) with initial values (v0, w0), then for v̇ > 0 (v̇ < 0) and v → ∞ (−∞) one has, with f(v) = cv in view of (2.40):
\[ \lim_{v\to\infty} d(v) = \lim_{v\to\infty} [w(v; v_0, w_0) - f(v)] = -\frac{c - B_1}{\alpha} \tag{2.42} \]
\[ \lim_{v\to-\infty} d(v) = \lim_{v\to-\infty} [w(v; v_0, w_0) - f(v)] = \frac{c - B_1}{\alpha} \tag{2.43} \]
The above convergence is exponential at the rate α. Solution (2.40) and properties (2.42), (2.43) show that w(t) eventually satisfies the first and second conditions of (2.38). Moreover, setting v̇ = 0 results in ẇ = 0, which satisfies the last condition of (2.38). This implies that the dynamic equation (2.39) can be used to model a class of backlash-like hysteresis and is an approximation of the backlash hysteresis (2.38). Equations (2.42) and (2.43) indeed show that w switches exponentially from the line cv(t) − (c − B1)/α to the line cv(t) + (c − B1)/α, generating a backlash-like hysteresis curve. The solutions of (2.39) can be obtained by numerical integration with v as the independent variable. The parameter α determines the rate at which w(t) switches between the offsets −(c − B1)/α and (c − B1)/α. The larger the parameter α is, the faster the transition in w(t) is going to be. However, the backlash distance is determined by (c − B1)/α and the parameters must satisfy c > B1. Consequently, a compromise must be made in choosing a suitable parameter set {α, c, B1} to model the required shape of backlash-like hysteresis. If the values of the backlash slope and distance are not exactly known, adaptation is used to estimate them.
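The limits (2.42) and (2.43) can be checked by integrating (2.39) numerically with v as the independent variable, as suggested above. The Python sketch below uses an explicit Euler step; the function name, the step size and the parameter values are illustrative assumptions.

```python
def simulate_backlash_like(v_samples, w0, alpha, c, b1):
    """Explicit Euler integration of (2.39) along a sampled input v."""
    w = w0
    out = [w]
    for v_prev, v in zip(v_samples, v_samples[1:]):
        dv = v - v_prev
        # dw = alpha |dv| (c v - w) + B1 dv
        w += alpha * abs(dv) * (c * v_prev - w) + b1 * dv
        out.append(w)
    return out


def offset(v_samples, w_samples, c):
    """d(v) = w - c v at the last sample, cf. (2.40)."""
    return w_samples[-1] - c * v_samples[-1]
```

For a long increasing ramp the offset w − cv approaches −(c − B1)/α, and for a long decreasing ramp it approaches +(c − B1)/α, so the width of the resulting loop is set by (c − B1)/α and its sharpness by α.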
The useful outcome of [13] is that backlash-like hysteresis is modelled by a dynamic
equation without the need to construct a backlash hysteresis inverse.
It is also reported that in the presence of actuator dynamics prior to the backlash,
the adaptive backlash inverse control problem is more difficult because it requires
that backlash be inverted through a dynamic block and this problem is currently
under investigation.
2.3 Hysteresis
As characterized in the literature, backlash is the simplest form of hysteresis. Hysteresis phenomena are even more numerous and diverse than those modeled by backlash characteristics. They generally involve nondifferentiable nonlinearities, which are usually unknown. While ferromagnetic hysteresis is the best known type of hysteresis, similar characteristics are common in plastic, piezoelectric and other materials. However, it is in general unrealistic to expect that a single hysteresis model can serve a vast variety of applications. In the remainder of this section various models of hysteresis are presented. A typical graph of the hysteresis phenomenon is given in fig. 2.3.
Figure 2.3: A typical Hysteresis diagram with a major and minor loop
2.3.1 Mathematical Models of Hysteresis
In [20] a simplified hysteresis model is used that captures most of the hysteresis characteristics and is useful for parameter adaptive control. It has been shown that the proposed model has a parameterizable right inverse which cancels the effect of the hysteresis when cascaded with it. Its main hysteresis loop and two minor loops are shown in fig. 2.4. It can be tuned by as many as eight parameters: four slopes ml, mr, mb, mt and four crossing parameters cl, cr, cb, ct, where the subscripts indicate “left”, “right”, “bottom” and “top” respectively. The difference between the slopes mb and mt allows for the appearance of local loops.
Figure 2.4: Hysteresis model (u versus v, with the slopes mt, mr, ml, mb and the crossings ct, cr, cl, cb marked)
Define:
\[ v_1 \triangleq \frac{c_t + m_l c_l}{m_l - m_t}, \qquad v_2 \triangleq \frac{c_b + m_r c_r}{m_r - m_b}, \qquad v_3 \triangleq \frac{c_t + m_r c_r}{m_r - m_t}, \qquad v_4 \triangleq \frac{c_b + m_l c_l}{m_l - m_b} \]
where v1, v2, v3, v4 are the values of v(t) at the upper-left, lower-right, upper-right and lower-left corners of the quadrilateral. Then the hysteresis u(t) = H(v(t)), representing the motion of u(t) and v(t), is fully described by:
\[
\dot{u}(t) =
\begin{cases}
m_t \dot{v}(t) & \text{if } v(t) \ge v_3 \text{ AND } u(t) = m_t v(t) + c_t, \text{ OR} \\
 & \text{if } v_4 < v(t) < v_3,\ \dot{v}(t) < 0,\ u(t) = m_t v(t) + c_d,\ u(t) \ne m_l(v(t) - c_l) \text{ AND } u(t) \ne m_b v(t) + c_b, \text{ OR} \\
 & \text{if } m_t < m_b,\ v_4 < v(t) < v_3,\ u(t) = m_b v(t) + c_b \text{ AND } \dot{v}(t) < 0, \text{ OR} \\
 & \text{if } m_t < m_b,\ v_4 < v(t) < v_3,\ u(t) = m_t v(t) + c_t \text{ AND } \dot{v}(t) > 0 \\[4pt]
m_b \dot{v}(t) & \text{if } v(t) \le v_4 \text{ AND } u(t) = m_b v(t) + c_b, \text{ OR} \\
 & \text{if } v_4 < v(t) < v_3,\ \dot{v}(t) > 0,\ u(t) = m_b v(t) + c_u,\ u(t) \ne m_r(v(t) - c_r) \text{ AND } u(t) \ne m_t v(t) + c_t, \text{ OR} \\
 & \text{if } m_t > m_b,\ v_4 < v(t) < v_3,\ u(t) = m_t v(t) + c_t \text{ AND } \dot{v}(t) > 0, \text{ OR} \\
 & \text{if } m_t > m_b,\ v_4 < v(t) < v_3,\ u(t) = m_b v(t) + c_b \text{ AND } \dot{v}(t) < 0 \\[4pt]
m_r \dot{v}(t) & \text{if } v_4 < v(t) < v_3,\ \dot{v}(t) > 0 \text{ AND } u(t) = m_r(v(t) - c_r) \\[4pt]
m_l \dot{v}(t) & \text{if } v_4 < v(t) < v_3,\ \dot{v}(t) < 0 \text{ AND } u(t) = m_l(v(t) - c_l) \\[4pt]
0 & \text{if } \dot{v}(t) = 0
\end{cases}
\tag{2.44}
\]
These expressions indicate that hysteresis is a complex nonlinear dynamic system defined by piecewise linear relationships between the input v(t), the output u(t) and their time derivatives.
With ud(t) a control signal to be designed, the inverse of the hysteresis, denoted HI(·), is given by the motion of ud(t) and v(t) and is described mathematically as:
\[
\dot{v}(t) =
\begin{cases}
\frac{1}{m_t}\dot{u}_d(t) & \text{if } u_d(t) \ge u_3, \text{ OR} \\
 & \text{if } u_4 < u_d(t) < u_3,\ \dot{u}_d(t) < 0,\ v(t) \ne \frac{1}{m_l}u_d(t) + c_l \text{ AND } v(t) \ne \frac{1}{m_b}(u_d(t) - c_b), \text{ OR} \\
 & \text{if } m_t < m_b,\ u_4 < u_d(t) < u_3,\ v(t) = \frac{1}{m_b}(u_d(t) - c_b) \text{ AND } \dot{u}_d(t) < 0, \text{ OR} \\
 & \text{if } m_t < m_b,\ u_4 < u_d(t) < u_3,\ v(t) = \frac{1}{m_t}(u_d(t) - c_t) \text{ AND } \dot{u}_d(t) > 0 \\[4pt]
\frac{1}{m_b}\dot{u}_d(t) & \text{if } u_d(t) \le u_4, \text{ OR} \\
 & \text{if } u_4 < u_d(t) < u_3,\ \dot{u}_d(t) > 0,\ v(t) \ne \frac{1}{m_r}u_d(t) + c_r \text{ AND } v(t) \ne \frac{1}{m_t}(u_d(t) - c_t), \text{ OR} \\
 & \text{if } m_t > m_b,\ u_4 < u_d(t) < u_3,\ v(t) = \frac{1}{m_t}(u_d(t) - c_t) \text{ AND } \dot{u}_d(t) > 0, \text{ OR} \\
 & \text{if } m_t > m_b,\ u_4 < u_d(t) < u_3,\ v(t) = \frac{1}{m_b}(u_d(t) - c_b) \text{ AND } \dot{u}_d(t) < 0 \\[4pt]
\frac{1}{m_r}\dot{u}_d(t) & \text{if } u_4 < u_d(t) < u_3,\ \dot{u}_d(t) > 0 \text{ AND } v(t) = \frac{1}{m_r}u_d(t) + c_r \\[4pt]
\frac{1}{m_l}\dot{u}_d(t) & \text{if } u_4 < u_d(t) < u_3,\ \dot{u}_d(t) < 0 \text{ AND } v(t) = \frac{1}{m_l}u_d(t) + c_l \\[4pt]
0 & \text{if } \dot{u}_d(t) = 0
\end{cases}
\tag{2.45}
\]
with:
\[ u_1 \triangleq \frac{m_l(m_t c_l + c_t)}{m_l - m_t}, \qquad u_2 \triangleq \frac{m_r(m_b c_r + c_b)}{m_r - m_b}, \qquad u_3 \triangleq \frac{m_r(m_t c_r + c_t)}{m_r - m_t}, \qquad u_4 \triangleq \frac{m_l(m_b c_l + c_b)}{m_l - m_b} \]
The proposed hysteresis inverse has the following property:
Proposition 1
The characteristic HI(·) given by (2.45) is the right inverse of the characteristic (2.44) in the sense that
H(HI(ud(t0))) = ud(t0) ⇒ H(HI(ud(t))) = ud(t), ∀t ≥ t0
for any piecewise continuous ud(t).
However, as ud(t) is a design signal of our choice, an initialization of the hysteresis inverse by an appropriate choice of ud(t0) can always make v(t) and u(t) leave the inside loop at t0, so that u(t0) = ud(t0) and then, from Proposition 1, u(t) = ud(t) for any t ≥ t0. When the parameters mt, ct, mb, cb, mr, cr, ml, cl are unknown, their estimates m̂t, ĉt, m̂b, ĉb, m̂r, ĉr, m̂l, ĉl are used instead to design an adaptive feedback hysteresis inverse.
In [11] a mathematical model for the hysteresis loop is defined. This model is a first order nonlinear differential equation and is capable, as the authors claim, of simulating exactly a given hysteresis loop; furthermore, the model exhibits many of the properties that are, in fact, observed in practice. The mathematical model is the following:
\[ \frac{dy}{dt} = h(y)\, g \circ [x(t) - f(y)] \tag{2.46} \]
where f, g, h are real-valued functions defined on the real line ℝ and ‘◦’ denotes the composition operation. The set of functions having k continuous derivatives on ℝ is denoted by C^k(ℝ). The three functions are assumed to satisfy the conditions:
(i) g, f, h ∈ C^1(ℝ)
(ii) g′ > 0, f′ > 0 on ℝ
(iii) f, g : ℝ → ℝ
(iv) 0 < a ≤ h < b < ∞ on ℝ
where a, b are finite positive constants. The function g is referred to as the dissipation function, the function f as the restoring function and h as the weighting function. This model, given by a nonlinear differential equation, represents a dynamic process. The model exhibits many of the observed properties, such as widening effects with increasing frequency and minor hysteresis loops when a periodic signal is superimposed upon a constant signal.
Some important properties that govern the model (2.46) are also reported; the proofs are omitted here for brevity. First, let I denote the interval [0, ∞) on the real line. A solution to (2.46) with initial condition y(t0) = y0, where t0 ∈ I, is a differentiable function φ defined on the interval I such that:
\[ \dot{\varphi}(t) = h(\varphi(t))\, g \circ [x(t) - f(\varphi(t))] \ \text{ for } t \ge 0 \quad \text{and} \quad \varphi(t_0) = y_0 \tag{2.47} \]
The six properties are the following:
• Property 1 If x(t) is bounded and continuous on I, then for all y0 ∈ ℝ and all t0 ≥ 0
(a) there exists a solution φ(t) satisfying (2.46) for all t ≥ 0
(b) φ(t) is uniformly bounded
(c) φ(t) is unique
• Property 2 If x(t) is bounded, continuous on I and periodic with fundamental period T, then there exists a unique periodic solution p(t) to equation (2.46) with the same fundamental period T.
• Property 3 With x(t) satisfying the same conditions as in property 2, any solution φ(t) to equation (2.46) with arbitrary initial conditions will approach asymptotically the unique periodic solution p(t).
• Property 4 If x(t) is bounded, continuous on I and periodic with fundamental period T, and if in addition x(t) has only one maximum and one minimum per cycle, with no points of inflection, then the unique periodic solution p(t) of equation (2.46) also has only one maximum and one minimum per cycle.
• Property 5 If x(t) satisfies the conditions of property 4 and p(t) is the unique periodic solution to equation (2.46), then the parameterized curve defined by:
Γ = {(x, y) : x = x(t), y = p(t), 0 ≤ t ≤ T}
is a simple closed curve.
• Property 6 If y2(t) = y1(αt) with 0 < α < 1 and y1(t) = y1(t + T1), consider the two associated simple closed curves defined by:
Γ1 = {(x, y) : x = x(t), y = y1(t), 0 ≤ t < T1}
Γ2 = {(x, y) : x = x(t), y = y2(t), 0 ≤ t < T1/α}
Then the area enclosed by the curve Γ1 is greater than the area enclosed by Γ2.
What remains is to present a procedure for constructing the dissipation function g, the restoring function f and the weighting function h. In order to identify these functions, a pair of waveforms {x(t), y(t)} must be measured. Because the system is nonlinear, there is no apparent advantage in using one set of measurements over another, unlike the linear case. The authors report that one procedure is to select a pair of waveforms {x(t), y(t)} where y(t) is a cosine function, that is, to apply a cosine waveform y(t) and measure the corresponding waveform x(t). With y(t) constrained to be a cosine waveform, it follows that ẏ(t) is known and (2.46) reduces to an algebraic relationship.
For a given closed hysteresis loop where y(t) is a cosine function, g can be chosen as an arbitrary odd function, provided it satisfies the necessary conditions (i)−(iv) mentioned above, and f, h can be constructed so that equation (2.46) represents exactly the given hysteresis loop. Since both y(t) and ẏ(t) are known and since g^−1 exists, we can rewrite equation (2.46) as:
\[ x(t) = g^{-1}\!\left( \frac{\dot{y}(t)}{h(y(t))} \right) + f(y(t)) \tag{2.48} \]
With y(t) a cosine function of period T, for each y in the range of y(t) (except for the extremal points) there exist two values of t ∈ [0, T], say t1, t2, where y(t1) = y(t2) = y. It is obvious that ẏ(t1) = −ẏ(t2). Thus, with g odd:
\[ g^{-1}\!\left( \frac{\dot{y}(t_1)}{h(y(t_1))} \right) = -g^{-1}\!\left( \frac{\dot{y}(t_2)}{h(y(t_2))} \right) = d \tag{2.49} \]
and since both (x(t1), y(t1)) and (x(t2), y(t2)) represent points on the hysteresis loop with the same ordinate, x(t1) − x(t2) = 2d. Moreover, in view of equations (2.48), (2.49), the midpoint of the two points on the hysteresis loop corresponding to t1 and t2 satisfies the equation:
\[ y(t_1) = y(t_2) = f^{-1}\!\left( \frac{x(t_1) + x(t_2)}{2} \right) \tag{2.50} \]
where (x(t1) + x(t2))/2 is the midpoint of the hysteresis loop taken along the abscissa direction. Thus the locus of midpoints of the hysteresis loop determines the function f^−1. Since g (and hence g^−1) is known and since the value of d can be measured directly from the hysteresis loop, h(y(t1)) can be obtained from equation (2.49). Continuing for all values of t ∈ [0, T), the function h can be determined. The points involved in the above description are shown in fig. 2.5.
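The construction procedure can be sketched in Python as follows. Given a measured waveform x(t) for a cosine y(t) and a chosen odd function g, the code recovers f and h at one value of y from the midpoint and the half-width of the loop, following (2.48) to (2.50). The function names and the synthetic data generator are assumptions used only to make the example self-contained.

```python
import math


def identify_f_h(x_of_t, y_amp, omega, g, t1):
    """Recover f(y) and h(y) at y = y(t1) from one loop, cf. (2.48)-(2.50)."""
    period = 2.0 * math.pi / omega
    t2 = period - t1                          # second instant with y(t2) = y(t1)
    y = y_amp * math.cos(omega * t1)
    y_dot1 = -y_amp * omega * math.sin(omega * t1)
    x1, x2 = x_of_t(t1), x_of_t(t2)
    f_y = 0.5 * (x1 + x2)                     # midpoint along the abscissa, (2.50)
    d = 0.5 * (x1 - x2)                       # half-width of the loop at height y
    h_y = y_dot1 / g(d)                       # from (2.49): g(d) = y_dot(t1) / h(y)
    return y, f_y, h_y


def make_synthetic_x(y_amp, omega, f, g_inv, h):
    """Synthetic 'measurement' of x(t) generated from (2.48)."""
    def x_of_t(t):
        y = y_amp * math.cos(omega * t)
        y_dot = -y_amp * omega * math.sin(omega * t)
        return g_inv(y_dot / h(y)) + f(y)
    return x_of_t
```

The instant t1 must not coincide with an extremum of y(t), since d = 0 there and h cannot be obtained from (2.49). Repeating the call over a grid of t1 values traces out f and h over the range of y.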
The functions f, h constructed by this procedure are unique. The arbitrary choice of g is an advantage which permits us to represent more closely a family of hysteresis loops. Hence, if we are attempting to model a family of hysteresis loops rather than a single loop, appropriate optimization techniques may be used to determine the functions f, g, h.
Figure 2.5: Construction procedure for obtaining f, g and h (the curve y = f^−1(x) passes through the midpoints of the loop; the points (x(t1), y(t1)) and (x(t2), y(t2)) lie at a horizontal distance d = g^−1(ẏ(t1)/h(y(t1))) on either side of it)
The main shortcoming of the model discussed above is that it does not predict the behavior when x is constant (called dc behavior), where y can assume more than one distinct state. To overcome this drawback, an improved model is presented in [12], in which dc hysteresis is included while the good characteristics of the model (2.46) are preserved, namely unusual versatility, loop widening with increasing frequency and the lack of any required special handling once the model is placed in the system. It can also exhibit loop narrowing with increasing frequency, a phenomenon seen in the i − v curves of fluorescent lamps, as well as a reduction in loop widening to insignificant amounts beyond an upper threshold frequency while including strong widening at intermediate frequencies and a dc loop (a phenomenon which is present in iron-core materials). This hysteresis model is the following:
\[ \frac{dy}{dt} = w \circ \left( \frac{dx}{dt} \right) h \circ (y(t))\, g \circ (x(t) - f(y(t))) \tag{2.51} \]
where, as before, ‘◦’ denotes functional composition. The new feature of this model is the function w, which is also assumed to belong to the class C^1(ℝ) and to satisfy w ◦ (dx/dt) ≥ 0. It is also assumed that w ◦ (dx/dt) = 0 ⇔ dx/dt = 0. Moreover, the model given by (2.51) has properties similar to those of the model given by (2.46). In fig. 2.6 several possible w functions are shown, along with brief descriptions of their effects upon the frequency behavior of (2.51).
Figure 2.6: Possible w functions and their effects on frequency behavior. Each panel plots a w function, w(dx/dt) against dx/dt, next to the frequency behavior of the model:
(a) Loop widening with frequency increases. Loop collapses as the frequency goes to zero. Result is a purely nonlinear "Lag" or "Delay" model.
(b) Completely eliminates the effect of frequency variation on the hysteresis loop. The loop as frequency goes to infinity is identical to the loop as frequency goes to zero. Can be used where loop widening is not needed.
(c) (w function with breakpoints at dx/dt = ±K) At low frequencies, a frequency insensitive loop appears. Loop is unchanged as frequency goes to zero. However, as frequency increases to the point where |dx/dt| begins to exceed K, loop widening with increased frequency appears. If x(t) is sinusoidal, say A cos(wt), then the threshold frequency of loop widening is K/A.
(d) (w function with breakpoints at dx/dt = ±K) At low frequencies, the loop is insensitive to frequency variations. As frequency increases such that |dx/dt| > K, loop narrowing with frequency increases occurs.
(e) (w function with breakpoints at dx/dt = ±K and ±K′) For low frequencies, frequency-insensitive loops are obtained. At moderate frequencies, loop widening with increased frequency occurs. At still higher frequencies, loop width becomes progressively frequency insensitive.
We should note that these piecewise linear w functions may be replaced with smooth versions if necessary. The basic drawback of (2.51) is, of course, the need for the introduction of parasitics under arbitrary excitation. At very low frequencies, this can drastically increase computer solution time.
Another approach which is prevalent in hysteresis modelling is that of Preisach-type models of hysteresis. All these models have a common generic feature: they are constructed as superpositions of the simplest hysteresis nonlinearities, namely rectangular loops. The following discussion, based on [23], reports various generalizations and extensions of the classical Preisach model, giving the necessary and sufficient conditions for the representation of actual hysteresis nonlinearities by various Preisach-type models, the solution of identification problems for these models and their numerical implementation.
Figure 2.7: Hysteresis Transducer (block HT with input u(t) and output f(t))
Starting with the definition of scalar hysteresis, we consider a transducer (see fig. 2.7), which is called a hysteresis transducer (HT) if its I/O relationship is a multibranch nonlinearity for which branch-to-branch transitions occur after input extrema. The term static hysteresis nonlinearity means that the branches are determined only by the past extremum values of the input, while the speed of input variations between extremum points has no influence on branching. It is worthwhile to keep in mind that, for very fast input variations, time effects become important and the given definition of static hysteresis fails. It is also important to mention that the given definition of hysteresis emphasizes the fact that branching constitutes the essence of hysteresis, while the formation of hysteresis loops (looping) is a particular case of branching. Indeed, looping occurs when the input varies back and forth between two consecutive extremum values, while branching takes place for arbitrary input variations. All static hysteresis nonlinearities fall into two general classes:
(a) Hysteresis nonlinearities with local memories. The value of output
f (t0 ) at some instant of time t0 and the values of input u(t) at all subsequent
instants of time t ≥ t0 uniquely predetermine the value of output f (t) for all
t > t0 . In other words, the past exerts its influence upon the future through
the current value of output.
(b) Hysteresis nonlinearities with non-local memories. In contrast, for hysteresis nonlinearities with non-local memories the future values of the output f(t) (t ≥ t0) depend not only on the current value of the output f(t0) but also on the past extremum values of the input.
It is clear from the above that all hysteresis nonlinearities with local memories have
the following common feature:
Every reachable point in the f − u diagram corresponds to a uniquely defined state.
This state predetermines the behavior of HT in exactly one way for increasing u(t)
and in exactly one way for decreasing u(t). Namely, at any point in the f − u
diagram there are only one or two curves that may represent the future behavior of
HT with local memory.
On the other hand, in the case of non-local memories, at any reachable point in the f − u diagram there is an infinity of curves that may represent the future behavior of the transducer. Each of these curves depends on a particular past history, namely on a particular sequence of past extremum values of the input.
To describe the mathematical Preisach model we consider an infinite set of simplest
hysteresis operators γ̂αβ . Each of these operators can be represented by a rectangular
loop on the I/O diagram (see fig. 2.8). Numbers α and β correspond to “up”
and “down” switching values of input respectively. It is assumed that α ≥ β. It
is apparent that these operators γ̂αβ represent hysteresis nonlinearities with local
memories. Along with the set of operators γ̂αβ consider an arbitrary weight function
µ(α, β) referred to as the Preisach function. Then the Preisach model can be written
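as:
\[
f(t) = \hat{\Gamma} u(t) = \iint_{\alpha \ge \beta} \mu(\alpha,\beta)\, \hat{\gamma}_{\alpha\beta} u(t)\, d\alpha\, d\beta \tag{2.52}
\]
Figure 2.8: A simple hysteresis operator γ̂αβ (rectangular loop with output levels +1 and −1;
β and α mark the "down" and "up" switching values on the input axis u)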
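The relay operator and the superposition (2.52) can be illustrated numerically. The following is a minimal sketch and not part of the original derivation: it assumes a uniform grid of switching thresholds on the triangle α0 ≥ α ≥ β ≥ β0, a constant weight µ = 1 by default, and the hypothetical names Relay and DiscretePreisach.

```python
class Relay:
    """Elementary operator: switches to +1 at u >= alpha and to -1 at u <= beta."""

    def __init__(self, alpha, beta):
        assert alpha >= beta
        self.alpha, self.beta = alpha, beta
        self.state = -1  # assumed initial state: negative saturation

    def update(self, u):
        if u >= self.alpha:
            self.state = 1
        elif u <= self.beta:
            self.state = -1
        # for beta < u < alpha the previous state is kept
        return self.state


class DiscretePreisach:
    """Riemann-sum approximation of (2.52) on a grid of n thresholds per axis."""

    def __init__(self, alpha0, beta0, n, mu=lambda a, b: 1.0):
        h = (alpha0 - beta0) / n
        grid = [beta0 + (i + 0.5) * h for i in range(n)]
        self.relays = []
        self.weights = []
        for i, a in enumerate(grid):
            for b in grid[: i + 1]:  # keep only points with alpha >= beta
                self.relays.append(Relay(a, b))
                self.weights.append(mu(a, b) * h * h)

    def apply(self, u):
        return sum(w * r.update(u) for w, r in zip(self.weights, self.relays))


model = DiscretePreisach(alpha0=1.0, beta0=-1.0, n=20)
model.apply(-2.0)               # drive into negative saturation
ascending = model.apply(0.0)    # u = 0 reached from below
model.apply(2.0)                # drive into positive saturation
descending = model.apply(0.0)   # u = 0 reached from above
```

The two outputs at u = 0 differ because the relays with β < 0 < α keep the state set by the preceding saturation, which is the hysteresis effect the model is meant to capture.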
Although the Preisach hysteresis nonlinearity (2.52) is constructed as a superposition
of elementary hysteresis nonlinearities γ̂αβ with local memories, it usually has non-local
memory. The geometric interpretation of the model rests on the one-to-one correspondence
between operators γ̂αβ and points (α, β) of the half-plane α ≥ β (see fig. 2.9). Each point
of the half-plane α ≥ β is identified with exactly one γ̂-operator whose "up" and "down"
switching values are equal to the α and β coordinates of the point, respectively.
It is also assumed that the Preisach function is zero outside the triangle T (see fig. 2.9).
Figure 2.9: The Preisach phase plane (limiting triangle T with vertex (α0, β0), bounded by the line α = β)
The following example illustrates the usefulness of the Preisach phase plane. Assume that
the input u(t) at some instant of time t0 has a value less than β0. Then the outputs of all
γ̂-operators which correspond to the points of the triangle T are equal to −1 (negative
saturation). Next, assume that the input increases monotonically until it reaches some
maximum value u1 at time t1, and then decreases monotonically until it reaches some minimum
value u2 at time t2. Geometrically, the triangle T is divided into two sets: S + (t),
consisting of points (α, β) for which the γ̂-operators are in the "up" position, and S − (t),
consisting of points (α, β) for which the γ̂-operators are in the "down" position (see
fig. 2.10). Let L(t) denote the interface between S + (t) and S − (t). The above
discussion reveals the mechanism of memory formation in the Preisach model. The
memory is formed as the result of two different rules for the modification of the interface
L(t). For a monotonically increasing input, the final link of L(t) is horizontal and moves
upward, while for a monotonically decreasing input the final link is vertical and moves
from right to left. These two rules result in the formation of the staircase interface L(t),
whose vertices have coordinates equal to past input extrema.
Figure 2.10: The formation of S + and S − (regions S+ and S− of the triangle T, with the vertex (u1, u2) marked)
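From (2.52) according to the above we have that:
\[
\begin{aligned}
f(t) &= \iint_{S^{+}(t)} \mu(\alpha,\beta)\, \hat{\gamma}_{\alpha\beta} u(t)\, d\alpha\, d\beta
      + \iint_{S^{-}(t)} \mu(\alpha,\beta)\, \hat{\gamma}_{\alpha\beta} u(t)\, d\alpha\, d\beta \\
     &= \iint_{S^{+}(t)} \mu(\alpha,\beta)\, d\alpha\, d\beta
      - \iint_{S^{-}(t)} \mu(\alpha,\beta)\, d\alpha\, d\beta
\end{aligned}
\tag{2.53}
\]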
The following properties hold for the Preisach model.
(a) Wiping-out Property Each local input maximum wipes out the vertices
of L(t) whose α-coordinates are below this maximum, and each local minimum
wipes out the vertices whose β-coordinates are above this minimum.
Namely, only the alternating series of dominant input extrema is stored by
the Preisach model. All other input extrema are wiped out.
(b) Congruency Property All minor hysteresis loops corresponding to back
and forth variations of input between the same consecutive extremum values
are congruent.
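The wiping-out property amounts to a simple bookkeeping rule on the stored extrema. The sketch below is an illustration under stated assumptions: the input history is supplied as a list of reversal values that alternates maximum, minimum, maximum, ... starting from negative saturation, and the function name dominant_extrema is hypothetical.

```python
def dominant_extrema(reversals):
    """Return the dominant maxima and minima that survive wiping-out.

    reversals: alternating local extrema [max, min, max, ...] of the input,
    recorded after the input has started from negative saturation.
    """
    maxima, minima = [], []
    for i, value in enumerate(reversals):
        if i % 2 == 0:
            # new local maximum: wipe out stored maxima that it reaches or exceeds
            while maxima and maxima[-1] <= value:
                maxima.pop()
            del minima[len(maxima):]      # minima that followed wiped maxima vanish too
            maxima.append(value)
        else:
            # new local minimum: wipe out stored minima that it reaches or goes below
            while minima and minima[-1] >= value:
                minima.pop()
            del maxima[len(minima) + 1:]  # maxima that followed wiped minima vanish too
            minima.append(value)
    return maxima, minima


stored = dominant_extrema([3, -2, 2, -1, 1, 0, 2.5])
```

In this history the last maximum 2.5 exceeds the earlier maxima 2 and 1, so those two maxima and the minima recorded after them are erased; only the maxima 3 and 2.5 and the minimum −2 remain stored.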
Next, we turn to the determination of µ(α, β), for which the set of first-order transition
(reversal) curves is needed. These curves can be found experimentally as follows. First, the
input u(t) is decreased to a value less than β0 (a situation called negative saturation).
Next, the input is monotonically increased until it reaches some value α′. As the input
increases, an ascending branch of a major loop is followed. This branch is also called
the limiting ascending branch, because usually there is no branch below it. The
notation fα′ will be used for the output value on this branch which corresponds to
the input value u = α′. The first-order transition (reversal) curves are attached to
the limiting ascending branch. Each of these curves is formed when the above monotonic
increase of the input is followed by a subsequent monotonic decrease. The notation fα′β′
denotes the output value on the curve attached at fα′ which corresponds to the input
value u = β′. The term first-order emphasizes the fact that each of these curves is formed
after the first reversal of input (see fig. 2.11).
Figure 2.11: The First- and Second-order transition reversal curves (output f(t) versus input u(t))
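Defining the function:
\[
F(\alpha',\beta') \triangleq \frac{1}{2}\left(f_{\alpha'} - f_{\alpha'\beta'}\right) \tag{2.54}
\]
after some calculations that contain differentiations, we have:
\[
\mu(\alpha',\beta') = -\frac{\partial^{2} F(\alpha',\beta')}{\partial\alpha'\,\partial\beta'} \tag{2.55}
\]
From (2.54), the above expression can be written in another equivalent form, which
is:
\[
\mu(\alpha',\beta') = \frac{1}{2}\,\frac{\partial^{2} f_{\alpha'\beta'}}{\partial\alpha'\,\partial\beta'} \tag{2.56}
\]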
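The step from (2.55) to (2.56) is a direct substitution, spelled out here for completeness. It uses only the fact that the output on the limiting ascending branch, written with a single subscript, does not depend on the second switching value.

```latex
\begin{aligned}
\mu(\alpha',\beta')
 &= -\frac{\partial^{2}}{\partial\alpha'\,\partial\beta'}
    \left[\tfrac{1}{2}\left(f_{\alpha'} - f_{\alpha'\beta'}\right)\right] \\
 &= -\frac{1}{2}\,\frac{\partial^{2} f_{\alpha'}}{\partial\alpha'\,\partial\beta'}
    + \frac{1}{2}\,\frac{\partial^{2} f_{\alpha'\beta'}}{\partial\alpha'\,\partial\beta'}
  = \frac{1}{2}\,\frac{\partial^{2} f_{\alpha'\beta'}}{\partial\alpha'\,\partial\beta'},
 \qquad \text{since } \frac{\partial f_{\alpha'}}{\partial\beta'} = 0 .
\end{aligned}
```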
These first-order curves can also be called first-order decreasing transition curves.
By repeating the previous reasoning almost literally, a similar expression can be
found using the first-order increasing transition curves. It is important to note
here that when the first-order transition curves are congruent, the mirror symmetry
of the functions F (α, β) and µ(α, β) with respect to the line α = −β is valid, i.e.:
F (−β, −α) = F (α, β)
µ(−β, −α) = µ(α, β)
respectively.
We next proceed to the formulation of the fundamental theorem that gives the
necessary and sufficient conditions for the representation of actual hysteresis nonlinearities by the Preisach model.
Representation Theorem
The wiping-out property and the congruency property constitute the necessary and
sufficient conditions for a hysteresis nonlinearity to be represented by the Preisach
model on the set of piecewise monotonic inputs.
We also mention that if the wiping-out and congruency properties are valid, then
it does not matter which transition curves are used for the determination of
µ(α, β). If the properties are not valid, the values of µ(α, β) and the accuracy of the
Preisach model will depend on the particular choice of transition curves employed for
the determination of µ(α, β).
The numerical implementation of the Preisach model circumvents both the evaluation of the
double integrals in (2.53) and the use of formula (2.56), whose differentiations may strongly
amplify the errors (noise) inherently present in any experimental data. It is given by:
\[
f(t) = -F(\alpha_0,\beta_0)
       + 2\sum_{k=1}^{n(t)-1}\left[F(M_k, m_{k-1}) - F(M_k, m_k)\right]
       + 2\left[F(M_n, m_{n-1}) - F(M_n, u(t))\right] \tag{2.57}
\]
where the above expression has been derived for a monotonically decreasing input,
that is, the final link of the interface L(t) is a vertical one. For a monotonically
increasing input we have:
\[
f(t) = -F(\alpha_0,\beta_0)
       + 2\sum_{k=1}^{n(t)-1}\left[F(M_k, m_{k-1}) - F(M_k, m_k)\right]
       + 2F(u(t), m_{n-1}) \tag{2.58}
\]
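Formulas (2.57) and (2.58) translate almost line by line into code. The sketch below is illustrative only: the function name preisach_output and the list-based storage of the extrema are assumptions, m_0 is taken equal to β0 (input starting from negative saturation), and the test function F_uniform assumes µ = 1 on the triangle T, for which F(α, β) = (α − β)²/2 is consistent with (2.55).

```python
def preisach_output(F, alpha0, beta0, maxima, minima, u, decreasing):
    """Evaluate (2.57) (decreasing input) or (2.58) (increasing input).

    minima = [m_0, ..., m_{n-1}] with m_0 = beta0.
    maxima = [M_1, ..., M_n] for a decreasing input,
             [M_1, ..., M_{n-1}] for an increasing input.
    """
    n = len(minima)
    f = -F(alpha0, beta0)
    for k in range(1, n):  # k = 1, ..., n-1
        M_k = maxima[k - 1]
        f += 2.0 * (F(M_k, minima[k - 1]) - F(M_k, minima[k]))
    if decreasing:
        M_n = maxima[n - 1]
        f += 2.0 * (F(M_n, minima[n - 1]) - F(M_n, u))
    else:
        f += 2.0 * F(u, minima[n - 1])
    return f


def F_uniform(alpha, beta):
    # Assumed test case: mu = 1 on T, so F is the area of the triangle
    # with vertex (alpha, beta), i.e. (alpha - beta)^2 / 2.
    return 0.5 * (alpha - beta) ** 2


# History: negative saturation, up to 0.8, down to -0.2, up to 0.4, down to 0.1.
f_now = preisach_output(F_uniform, 1.0, -1.0,
                        maxima=[0.8, 0.4], minima=[-1.0, -0.2],
                        u=0.1, decreasing=True)
```

Only function values of F are needed, so measured first-order transition curves can be tabulated and interpolated without any numerical differentiation.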
The function F (α, β) is related to the experimentally measured first-order transition
curves by formula (2.54). Mk and mk are the maximum and minimum values of the input,
respectively, at the k-th vertex of the interface in the Preisach phase plane, and
n(t) is the number of dominant extremum values. Because of the wiping-out property
of the Preisach model, this number may change with time.
The model discussed above is called the classical Preisach model. It forms the
basis for understanding the modified models that follow, which were developed because
the classical model has some intrinsic limitations:
1. The C.P. (classical Preisach) model describes hysteresis nonlinearities which
exhibit congruency of minor loops formed for the same reversal values of input.
In fact, actual hysteresis nonlinearities may deviate from this property.
2. The C.P. model is static in nature and does not account for dynamic properties
of hysteresis nonlinearities. For fast input variations these properties may be
essential.
3. The C.P. model describes hysteresis nonlinearities with the wiping-out property,
which implies the immediate formation of a hysteresis loop after one
cycle of back-and-forth variation of input between any two reversal values.
However, experiments show that hysteresis loop formation is often preceded
by some stabilization process which may require a large number of cycles to
achieve a stable minor loop.
4. The C.P. model deals only with scalar hysteresis nonlinearities. In many
applications, however, vector hysteresis¹ is encountered.
¹ Vector hysteresis is a vector nonlinearity with the property that past extremum values of input
projections along all possible directions may affect future values of output.
The model that we describe now is referred to as the Moving Preisach model.
We subdivide the triangle T (see fig. 2.12) into three sets \(S^{+}_{u(t)}\), \(R_{u(t)}\) and \(S^{-}_{u(t)}\),
which are defined as:
\[
\begin{aligned}
(\alpha,\beta) \in S^{+}_{u(t)} &\quad \text{if } \beta_0 \le \alpha \le u(t) \\
(\alpha,\beta) \in R_{u(t)} &\quad \text{if } \beta_0 \le \beta \le u(t),\; u(t) \le \alpha \le \alpha_0 \\
(\alpha,\beta) \in S^{-}_{u(t)} &\quad \text{if } u(t) \le \beta \le \alpha \le \alpha_0
\end{aligned}
\]
Figure 2.12: The Preisach phase plane for the Moving model (triangle T divided into the sets \(S^{+}_{u(t)}\), \(R_{u(t)}\) and \(S^{-}_{u(t)}\))
The Moving Preisach model is the following:
\[
f(t) = \iint_{R_{u(t)}} \mu(\alpha,\beta)\, \hat{\gamma}_{\alpha\beta} u(t)\, d\alpha\, d\beta
       + \frac{1}{2}\left(f^{+}_{u(t)} + f^{-}_{u(t)}\right) \tag{2.59}
\]
where \(f^{+}_{u(t)}\) and \(f^{-}_{u(t)}\) are the outputs along the limiting ascending and the limiting
descending branch, respectively. In expression (2.59) the integration is performed not
over the fixed limiting triangle T but over the rectangle \(R_{u(t)}\), which changes along
with the input variations. As before, the identification problem consists in determining
the µ-function by fitting the model (2.59) to some experimental data. To this end,
we introduce the function:
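\[
T(\alpha,\beta) \triangleq f^{-}_{\beta} - f_{\alpha\beta} \tag{2.60}
\]
where we assume that we start from the state of positive saturation and that the
input u(t) is monotonically decreased. Consequently, (2.60) is equal to the output
increment between the limiting descending branch and the first-order transition curves.
After some calculations we conclude that:
\[
\mu(\alpha,\beta) = -\frac{1}{2}\,\frac{\partial^{2} T(\alpha,\beta)}{\partial\alpha\,\partial\beta} \tag{2.61}
\]
and substituting (2.60) into (2.61) we have:
\[
\mu(\alpha,\beta) = \frac{1}{2}\,\frac{\partial^{2} f_{\alpha\beta}}{\partial\alpha\,\partial\beta} \tag{2.62}
\]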
It should be made clear that the moving model (2.59) is equivalent to the classical Preisach
model (2.52) as far as the description of purely hysteretic behavior is concerned. More
precisely, this equivalence holds only for input and output variations confined to the
region enclosed by a major hysteresis loop. Outside this region,
the C.P. model prescribes flat saturation values for the output, while the moving model
(2.59) prescribes the actual experimentally observed values \(f^{+}_{u(t)}\) and \(f^{-}_{u(t)}\) for the
states of negative and positive saturation, respectively. For this reason, the wiping-out
and congruency properties of minor loops are valid for the moving model. Again,
the numerical implementation is given by:
\[
f(t) = 2\sum_{k=1}^{n(t)}\left[T(M_{k+1}, m_k) - T(M_k, m_k)\right] + f^{+}_{u(t)} \tag{2.63}
\]
where (2.63) expresses the output f (t) explicitly in terms of the experimentally measured
function T .
Another modified version of the classical Preisach model, discussed next, is called
the nonlinear or input-dependent Preisach model.
The advantages of this model over the classical one are:
1. The congruency property of minor loops is relaxed and
2. The nonlinear model allows one to fit experimentally measured first- and
second-order reversal curves.
Since higher-order reversal curves are sandwiched between first- and second-order
ones, it is reasonable to expect that the nonlinear model will be more accurate than
the classical one. The nonlinear Preisach model can be mathematically defined as:
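\[
f(t) = \iint_{R_{u(t)}} \mu(\alpha,\beta,u(t))\, \hat{\gamma}_{\alpha\beta} u(t)\, d\alpha\, d\beta
       + \frac{f^{+}_{u(t)} + f^{-}_{u(t)}}{2} \tag{2.64}
\]
or using the geometric interpretation the equation (2.64) becomes:
\[
f(t) = \iint_{S^{+}(t)} \mu(\alpha,\beta,u(t))\, d\alpha\, d\beta
     - \iint_{S^{-}(t)} \mu(\alpha,\beta,u(t))\, d\alpha\, d\beta \tag{2.65}
\]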
It is clear that a new feature of this model in comparison with the moving one is
the dependence of the function µ on the current value of input u(t). This model has
the following two properties:
(a) Wiping-out Property Only the alternating series of past dominant extrema
Mk and mk are stored by the nonlinear Preisach model.
(b) Property of equal Vertical Chords All minor loops resulting from
back-and-forth variations between the same two consecutive extrema have equal
vertical chords (output increments) for the same input values.
We should note that for two consecutive extrema of the input, say u+ and u−, and
any input value u between them, the corresponding vertical chord does not depend on
the particular past history preceding the formation of the minor loop.
Now, for the solution of the identification problem the sets of first- and second-order
reversal curves are required. Assume that the input is first decreased to
reach the negative saturation state and then monotonically increased until it reaches
some value α. The first-order reversal curves are attached to the limiting ascending
branch and they are formed when the above monotonic increase of u(t) is followed
by a subsequent decrease. The notation fαu will be used for the output values on
the first-order reversal curve. The second-order reversal curves are attached to the
first-order reversal curves and they are formed when the above monotonic decrease is
followed by a monotonic increase (see fig. 2.11). Using fαβu as the notation for the
output values on the second-order reversal curve, we consider the function:
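\[
P(\alpha,\beta,u) \triangleq f_{\alpha u} - f_{\alpha\beta u} \tag{2.66}
\]
which has the physical meaning of the output increment between the first- and
second-order reversal curves. It is clear that:
\[
P(\alpha,u,u) = P(u,\beta,u) = P(u,u,u) = 0
\]
After some calculations we obtain:
\[
\mu(\alpha,\beta,u) = -\frac{1}{2}\,\frac{\partial^{2} P(\alpha,\beta,u)}{\partial\alpha\,\partial\beta} \tag{2.67}
\]
and using (2.66) in (2.67) we finally obtain:
\[
\mu(\alpha,\beta,u) = \frac{1}{2}\,\frac{\partial^{2} f_{\alpha\beta u}}{\partial\alpha\,\partial\beta} \tag{2.68}
\]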
Due to the mirror symmetry we also have that:
µ(−β, −α, −u) = µ(α, β, u)
The representation theorem for this model is the following one:
Representation Theorem
The wiping-out property and the property of equal vertical chords for minor loops
constitute the necessary and sufficient conditions for the representation of a hysteresis nonlinearity by the nonlinear Preisach model on the set of piecewise monotonic
inputs.
It should be made clear that the property of equal vertical chords is more general than
the congruency property. Indeed, if comparable minor loops are congruent, then
they have equal vertical chords, whereas loops with equal vertical chords are
not necessarily congruent. Likewise, under the congruency condition the nonlinear
Preisach model (2.64) or (2.65) coincides with the moving Preisach model (2.59).
The numerical implementation of the nonlinear model is the following:
\[
f(t) = f^{-}_{u(t)} + \sum_{k=1}^{n(t)}\left[P(M_{k+1}, m_k, u(t)) - P(M_k, m_k, u(t))\right] \tag{2.69}
\]
where \(f^{-}_{u(t)}\) is the output value obtained when the input decreases monotonically from
some value above α0 (positive saturation) to the current value u(t). The formula (2.69)
computes output values by using input values, a set of second-order reversal curves and
an input history, which are all specified by the user. Other models, very similar to the
above ones, can be found in [23].
Hysteresis modeling with the Preisach-model approach appears to be efficient, as
it comes closer to meeting the requirements of accuracy and adaptability. As a matter
of fact, the possibility of including dynamic and mean field effects and the ability
to be coupled with numerical solutions of the Maxwell equations justify its wide
use in many applications. On the other hand, the Preisach models include
integrodifferential operators, which makes them very complicated, and it is still not
clear how to incorporate them into the controller design. However, we should mention that
modeling a general type of hysteresis is itself still a research topic and the reader
may refer to [24] for a recent view.
2.4 Summary
In this chapter mathematical models of non-smooth nonlinearities, such as friction,
backlash and hysteresis, have been presented together with their limitations. Identification
and estimation techniques for these models have also been presented where feasible.
As this chapter makes clear, the better the mathematical description, the more
accurately the nonlinearity is represented. On the
other hand, the demand for high accuracy leads to complicated models that sometimes
cannot be incorporated into the controller design. The in-depth knowledge of
the above nonlinearities, the progress in nonlinear control theory achieved in
the last decade, and advances in technology orient
research toward a unified method for dealing with these nonlinearities.
As the following chapter, which describes the basic principles of fault diagnosis,
will make clear, successful fault diagnosis depends on the choice of model.
The better the model, the more accurate the fault diagnosis that can be achieved.
Chapter 3
Fault Diagnosis
The detection and isolation of faults (diagnosis) is of great importance in any
engineering system. Such systems cover a broad spectrum of human-made machinery,
including industrial production facilities (oil refineries, steel mills, chemical
plants, etc.), transportation vehicles (ships, airplanes, trains, etc.) and household
devices (air conditioning equipment, refrigerators, washing machines, etc.). The
early detection of a fault is critical in avoiding product deterioration,
performance degradation, major damage to the machinery itself and damage to
human health or even loss of lives. The quick and correct diagnosis of the faulty
component then facilitates proper and optimal decisions on emergency and corrective
actions and on repairs.
The traditional approaches to fault detection and diagnosis involve the limit checking
of some variables or the application of redundant sensors (physical redundancy).
More advanced methods rely on the spectral analysis of signals emanating from the
machinery or on the comparison of the actual plant behavior to that expected on
the basis of a mathematical model (analytical redundancy). The latter approach
includes methods which are more deterministically framed (parity relations, observers)
and those formulated more on a statistical basis (Kalman filtering and parameter
estimation). The boundaries between the various approaches are rather blurred and,
lately, several methods have been shown to be closely related to one another and even
to produce identical results under broad conditions.
3.1 What is Fault Detection and Diagnosis
Fault detection and diagnosis are concerned with faults in engineering systems,
whether they occur in the plant or in its measurement and control instruments.
In the sequel, we describe what is meant by faults and specify the tasks of detection
and diagnosis.
In general, faults are deviations from the normal behavior in the plant or its
instrumentation. The faults of interest belong to one of the following categories:
• Additive process faults.
These are unknown inputs acting on the plant, which are normally zero and
which, when present, cause a change in the plant outputs independent of the
known inputs. Such faults best describe plant leaks, loads, etc.
• Multiplicative process faults.
These are changes (abrupt or gradual) in some plant parameters. They cause
changes in the plant outputs which depend also on the magnitude of the known
inputs. Such faults best describe the deterioration of plant equipment, such
as surface contamination, clogging, or the partial or total loss of power.
• Sensor faults.
These are discrepancies between the measured and actual values of individual
plant variables. These faults are usually considered additive (independent of
the measured magnitude), though some sensor faults (sticking or complete
failure) may be better characterized as multiplicative.
• Actuator faults.
These are discrepancies between the input command of an actuator and its
actual output. Actuator faults are usually handled as additive though, again,
some kinds of them may be better characterized as multiplicative.
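The difference between additive and multiplicative faults can be made concrete with a toy example. The sketch below assumes a static plant y = gain · u, which is not taken from the text; the names plant_output and deviation and all numerical values are hypothetical.

```python
def plant_output(u, gain=2.0, additive_fault=0.0, gain_change=0.0):
    """Static plant y = (gain + gain_change) * u + additive_fault (illustrative only)."""
    return (gain + gain_change) * u + additive_fault


def deviation(u, **fault):
    """Output change caused by a fault at input level u."""
    return plant_output(u, **fault) - plant_output(u)


# The same two input levels are used for both fault types.
additive = [deviation(u, additive_fault=0.5) for u in (1.0, 3.0)]
multiplicative = [deviation(u, gain_change=-0.4) for u in (1.0, 3.0)]
```

The additive fault shifts the output by the same amount at both input levels, while the deviation caused by the parameter change grows in proportion to the input, which is the distinction drawn in the list above.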
The systems that perform fault detection and diagnosis implement the following
tasks:
- Fault detection, that is, the indication that something is going wrong in
the monitored system.
- Fault isolation, that is, the determination of the exact location of the
fault or, in other words, of which component is faulty.
- Fault identification, that is, the determination of the magnitude of the
fault.
The last two tasks together, that is, isolation and identification, are referred to as
fault diagnosis. While detection is absolutely necessary in any practical system and
isolation is almost equally important, fault identification may not justify the extra
effort it requires. For this reason, most practical systems contain only the fault
detection and isolation tasks and are referred to as FDI systems. Most of the time,
the fault detection and diagnosis activity takes place on-line, in real time. The
detection and diagnosis tasks can be performed either in parallel or sequentially.
In some systems, the detection task runs permanently while the diagnostic task is
triggered only upon the detection of a fault.
In particular, according to [26], in model-based fault detection and diagnosis
(described later) the following conventions are usually adopted:
(i) It is assumed that the faults are not present initially in the system but arrive
at some later time. The faults are generally described by deterministic time
functions which are unknown.
(ii) Additive disturbances are further deterministic and unknown inputs to the
system. The distinction between additive faults and disturbances is subjective;
the faults are those unknown inputs we wish to detect and isolate, while
disturbances are nuisances we wish to ignore.
(iii) The noise, which emanates from the plant or from the sensors and actuators,
is considered random with zero mean. Any nonzero mean is handled as a fault
or a disturbance.
(iv) Modeling errors are discrepancies between the model (model parameters) and
the true system. They may be present from the outset or arise from
changes of the operating point. They may be considered as multiplicative
disturbances, in contrast to multiplicative faults, which are also discrepancies
between the model and the true system, but which we wish to detect.
The detection performance of the diagnostic technique is characterized by a number
of important and quantifiable benchmarks, which are:
- Fault sensitivity, that is, the ability of the technique to detect faults of
reasonably small size.
- Reaction speed, that is, the ability of the technique to detect faults with
reasonably small delay after their arrival.
- Robustness, that is, the ability of the technique to operate in the presence of
noise, disturbances and modeling errors, with few false alarms¹.
¹ Erroneous fault detection.
It is worth noting that, in most cases, there are design trade-offs
between the various properties described above. The isolation performance, that
is, the ability of the diagnostic system to distinguish faults, depends on the physical
properties of the plant, on the size of the faults, noise, disturbances and modeling
error, and on the design of the algorithm. Multiple simultaneous faults are, in
general, more difficult to isolate than single faults. Moreover, the interplay
between faults and disturbances, noise and modeling error may lead to uncertain or
incorrect isolation decisions. In addition, some faults may be non-isolable from one
another because they act on the physical plant in an indistinguishable way.
3.2 Methods in Fault Detection and Diagnosis
The methods in fault detection and diagnosis (FDD), may be classified into two
main categories: those that do not utilize a mathematical model of the plant and
those that do.
3.2.1 Model-Free methods
These are the FDD methods that do not utilize a mathematical model of the plant:
Physical redundancy. In this method, multiple sensors are installed to measure
the same physical quantity. Any serious discrepancy between the measurements
indicates a sensor fault. With only two parallel sensors, fault isolation is not possible.
With three sensors, a voting scheme can be formed in order to isolate the faulty sensor.
Physical redundancy involves extra hardware cost and extra weight, the
latter being a serious factor in aerospace applications.
Special sensors. These sensors are installed explicitly for detection and diagnosis
purposes. They may be limit sensors (measuring e.g. temperature, pressure), which
perform limit checking. Other special sensors may measure some fault-indicating
physical quantity, such as sound, vibration, etc.
Limit checking. This approach is widely used in practice. Plant measurements are
compared by computer to preset limits. Exceeding the threshold indicates a fault
situation. While simple and straightforward, the limit checking approach has two
serious drawbacks:
- Since the plant variables may vary widely due to normal input variations, the
thresholds need to be set quite conservatively.
- The effect of a single component fault may propagate to many plant variables,
setting off a confusing multitude of alarms and making isolation extremely
difficult.
Spectrum analysis. Most plant variables exhibit a typical frequency spectrum
under normal operating conditions; any deviation from this is an indication of
abnormality. Certain types of faults may have a specific signature in the spectrum,
thus making isolation simpler.
Logic reasoning. These techniques are complementary to the methods discussed above,
in that they are aimed at evaluating the symptoms obtained by the detection
hardware and/or software. They consist of trees of logical rules of the
"IF symptom AND symptom THEN conclusion" type. Each conclusion can, in turn,
serve as a symptom in the next rule, until a final conclusion is reached.
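Two of the model-free methods above, the voting scheme of physical redundancy and limit checking, are simple enough to sketch in a few lines. The code below is a minimal illustration, assuming median voting as the voting rule; the function names, tolerance and limit values are hypothetical and not taken from the text.

```python
from statistics import median


def vote(readings, tolerance):
    """Median voting: return (voted value, indices of suspect sensors)."""
    reference = median(readings)
    suspects = [i for i, r in enumerate(readings) if abs(r - reference) > tolerance]
    return reference, suspects


def limit_check(value, low, high):
    """Return True when the measurement violates its preset limits."""
    return not (low <= value <= high)


# Three redundant sensors, the third one faulty (assumed readings).
voted, suspects = vote([20.1, 19.9, 27.5], tolerance=1.0)
alarm = limit_check(voted, low=15.0, high=25.0)

# With only two sensors the discrepancy is visible but both readings are suspect.
_, suspects_two = vote([20.1, 27.5], tolerance=1.0)
```

With three readings the outlier is isolated and the voted value stays within its limits; with two readings the disagreement is detected but cannot be attributed to a single sensor, as stated above.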
3.2.2 Model-Based methods
Model-based FDD methods utilize an explicit mathematical model of the monitored
plant. For continuous-time models, the natural mathematical description is in the form
of differential equations or equivalent transformed representations, while for
discrete-time models it is in the form of difference equations or their transformed
equivalents. Also, though most physical systems are nonlinear, their mathematical
descriptions usually rely on linear approximations.
Most of the model-based FDD methods rely on the concept of analytical redundancy.
In contrast with physical redundancy, where measurements from parallel sensors
are compared to each other, sensory measurements are now compared to analytically
computed values of the respective variable. The resulting differences, called
residuals, are indicative of the presence of faults in the system. The generation of
residuals needs to be followed by residual evaluation, in order to arrive at detection
and isolation decisions. The procedure is depicted schematically in fig. 3.1.
Figure 3.1: Stages of model-based fault detection and diagnosis (observations → residual generation → residuals → residual evaluation → decision).
Because of the presence of noise and modeling errors, the residuals are never zero,
even if there is no fault. Hence, the detection decision requires testing the residuals
against thresholds, obtained empirically or by theoretical considerations. To facilitate
fault isolation, the residual generators are usually designed for isolation-enhanced
residuals, exhibiting structural or directional properties. The isolation decisions can
then be obtained in a structural (boolean) or directional (geometrical) framework,
with or without the inclusion of statistical elements.
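The residual evaluation stage can be sketched as a threshold test. The example below is one simple possibility among many, assuming a moving average of the residual magnitude as the test statistic; the function name first_alarm, the window length, the threshold and the residual samples are all hypothetical.

```python
def first_alarm(residuals, threshold, window=5):
    """Index of the first sample whose moving average of |r| exceeds threshold, else None."""
    magnitudes = [abs(r) for r in residuals]
    for i in range(len(magnitudes)):
        recent = magnitudes[max(0, i - window + 1): i + 1]
        if sum(recent) / len(recent) > threshold:
            return i
    return None


noise = [0.05, -0.04, 0.03, -0.05, 0.04] * 4   # fault-free residual (assumed data)
faulty = noise + [0.6] * 10                    # fault appears at sample 20
alarm_index = first_alarm(faulty, threshold=0.3)
no_alarm = first_alarm(noise, threshold=0.3)
```

Averaging suppresses the noise at the price of a detection delay of a few samples, which is a small instance of the trade-off between robustness and reaction speed.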
Robustness issues
The residuals which are generated to indicate faults may also react to the presence of
noise, disturbances and modeling errors. Desensitizing the residuals to these sources
is the most important aspect in the design of the detection and diagnosis algorithm.
More precisely:
- To deal with the effects of noise, the residuals may be filtered and statistical
techniques may be applied to their evaluation. Insufficient information about
the statistical properties of the noise and the noise-transfer dynamics of the
plant may complicate and hamper the overall procedure.
- Disturbance decoupling may be built into the design of the residual generator,
but it competes with the isolation enhancement for the available design
freedom.
- Robustness in the face of modeling errors is the most fundamental problem in
model-based FDD schemes. Several methods are available, which usually rely
on some sort of optimization. Unfortunately, this problem does not lend itself
to easy solution and the known techniques are effective only under limited
circumstances.
Residual Generation Techniques
The generation of residual signals is a central issue in model-based fault diagnosis.
A rich variety of methods is available for residual generation, and some of the most
common approaches are discussed briefly here. It must be pointed out that
most residual generation approaches are applicable to both continuous and discrete
models; however, some approaches work only for discrete models. For example,
the parity relation approach was developed specifically for discrete models, although
there have been some studies into its use for continuous models.
Considering the general case, a system with all possible faults, i.e. sensor, actuator
and process faults, can be described, following the methodology and nomenclature of [27],
by the following state space model:
\[
\begin{aligned}
\dot{x}(t) &= Ax(t) + Bu(t) + R_1 f(t) \\
y(t) &= Cx(t) + Du(t) + R_2 f(t)
\end{aligned}
\tag{3.1}
\]
where \(f(t) \in \mathbb{R}^{g}\) is a fault vector, each element fi (t) (i = 1, 2, · · · , g) of which
corresponds to a specific fault and is considered as an unknown time function. The matrices
R1 and R2 are known as fault entry matrices, which represent the effects of faults on the
system. The vector \(u(t) \in \mathbb{R}^{r}\) is the input to the actuator or the measured actuation,
and the vector \(y(t) \in \mathbb{R}^{m}\) is the measured output; both vectors are known
for FDI purposes. The vector \(x(t) \in \mathbb{R}^{n}\) is the state vector and A, B, C, D are
known system matrices with appropriate dimensions. What follows are some
approaches that have been developed for residual generation.
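Before turning to these approaches, the role of the fault entry matrices in (3.1) can be illustrated with two special cases. The derivation below is a sketch under the assumption that an actuator fault adds to the commanded input and that a sensor fault adds to the measured output; the symbols f_a and f_s are introduced here for illustration only.

```latex
% Actuator fault: the plant receives u(t) + f_a(t) instead of u(t)
\dot{x}(t) = A x(t) + B\,[u(t) + f_a(t)], \qquad
y(t) = C x(t) + D\,[u(t) + f_a(t)]
\;\Longrightarrow\; R_1 = B,\quad R_2 = D .

% Sensor fault: the measurement is corrupted additively
\dot{x}(t) = A x(t) + B u(t), \qquad
y(t) = C x(t) + D u(t) + f_s(t)
\;\Longrightarrow\; R_1 = 0,\quad R_2 = I_m .
```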
• Observer-based approaches
The basic idea behind the observer or filter-based approaches is to estimate the
outputs of the system from the measurements (or a subset of measurements)
by using either Luenberger observer(s) in a deterministic setting or Kalman
filter(s) in a stochastic setting. Then, the output estimation error (or the
innovations in the stochastic case) is used as a residual. It should be pointed out that
we are interested in estimating the outputs using an observer, while the estimation
of the state vector is not necessary. Therefore, a functional observer
is suitable for this task. In practice, the order of the functional observer is less
than the order of a state observer. It is desired to estimate a linear function of
the state, i.e. Lx(t), using a functional (or generalized) Luenberger observer
with the following structure:
\[
\begin{aligned}
\dot{z}(t) &= Fz(t) + Ky(t) + Ju(t) \\
w(t) &= Gz(t) + Ry(t) + Su(t)
\end{aligned}
\tag{3.2}
\]
where \(z(t) \in \mathbb{R}^{q}\) is the state vector of this functional observer and F, K, J,
R, G, S are matrices with appropriate dimensions. The output w(t) of this observer
is said to be an estimate of Lx(t), for the system described in (3.1), in
an asymptotic sense if in the absence of faults:
\[
\lim_{t\to\infty}\,[w(t) - Lx(t)] = 0 \tag{3.3}
\]
Introducing a transformation matrix T , the observer in (3.2) generates the
estimate of Lx(t) in the asymptotic sense if and only if the following conditions
hold:
\[
\left\{
\begin{aligned}
& F \text{ has stable eigenvalues} \\
& TA - FT = KC \\
& J = TB - KD \\
& RC + GT = L \\
& S + RD = 0
\end{aligned}
\right.
\tag{3.4}
\]
The necessary and sufficient condition for the existence of the observer given
by (3.2) for the system (3.1) is that the pair (C, A) is observable. In order to
generate residuals, we need to estimate the system output. If we assign:
\[
L = C \tag{3.5}
\]
we have the output estimate:
\[
\hat{y}(t) = w(t) + Du(t) \tag{3.6}
\]
The residual vector r(t) is defined as:
\[
r(t) = Q[y(t) - \hat{y}(t)] = L_1 z(t) + L_2 y(t) + L_3 u(t) \tag{3.7}
\]
where:
\[
L_1 = -QG, \qquad L_2 = Q - QR, \qquad L_3 = -Q(S + D)
\]
Now, the residual generator based on the generalized Luenberger observer is given by:
\[
\begin{aligned}
\dot{z}(t) &= Fz(t) + Ky(t) + Ju(t) \\
r(t) &= L_1 z(t) + L_2 y(t) + L_3 u(t)
\end{aligned}
\tag{3.8}
\]
and the matrices in this equation should satisfy the following conditions:



F has stable eigenvalues







T A − F T = KC















J = T B − KD
(3.9)
L1 T + L2 C = 0
L3 + L2 D = 0
When we apply the residual generator described in (3.8) to the system described by (3.1), the estimation error e(t) and the residual r(t) are governed by:
ė(t) = F e(t) − T R1 f (t) + KR2 f (t)
r(t) = L1 e(t) + L2 R2 f (t)
(3.10)
where e(t) = z(t) − T x(t). Since F is stable, the effect of the initial error e(0)
decays to zero, so that the residual is driven solely by the faults.
The simplest method is that of the full-order observer. In this case (q = n) we
have:
T = I,   F = A − KC,   J = B − KD
L1 = QC,   L2 = −Q,   L3 = QD
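The following is a minimal numerical sketch of a full-order observer used as a residual generator. The plant matrices, the gain K, the input signal and the sensor fault are hypothetical values chosen only for illustration, and forward-Euler integration is an assumption made for brevity. The residual is computed directly as r = Q[y − ŷ], as in (3.7).

```python
import numpy as np

def simulate_residual(fault_time=5.0, fault_size=0.5, t_end=10.0, dt=1e-3):
    # Hypothetical second-order plant (not taken from the text)
    A = np.array([[0.0, 1.0], [-2.0, -3.0]])
    B = np.array([[0.0], [1.0]])
    C = np.array([[1.0, 0.0]])
    D = np.zeros((1, 1))
    K = np.array([[3.0], [2.0]])      # observer gain (assumed)
    Q = np.eye(1)                     # residual weighting matrix
    F = A - K @ C                     # full-order case: T = I
    J = B - K @ D
    assert np.all(np.linalg.eigvals(F).real < 0), "F must have stable eigenvalues"

    x = np.array([[1.0], [0.0]])
    z = x.copy()                      # e(0) = z(0) - x(0) = 0
    times = np.arange(0.0, t_end, dt)
    residuals = np.empty(len(times))
    for i, t in enumerate(times):
        u = np.array([[np.sin(t)]])
        f = fault_size if t >= fault_time else 0.0   # additive sensor fault (R1 = 0, R2 = 1)
        y = C @ x + D @ u + f
        y_hat = C @ z + D @ u
        residuals[i] = (Q @ (y - y_hat)).item()
        # Forward-Euler step of the plant and of the observer
        x = x + dt * (A @ x + B @ u)
        z = z + dt * (F @ z + K @ y + J @ u)
    return times, residuals

times, residuals = simulate_residual()
print("max |r| before the fault:", np.max(np.abs(residuals[times < 4.9])))
print("|r| at the end of the run:", abs(residuals[-1]))
```

In this sketch the residual stays at the level of rounding errors until the fault occurs and settles at a non-zero value afterwards, which is the behaviour predicted by (3.10).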
For any dynamic system, the observer-based residual generator always exists.
This is because any input-output transfer function matrix has an observable
realization. In other words, the output estimator always exists although a
suitable state observer cannot always be designed. The minimal order q0 of a
functional observer satisfies the inequality:
\[ q_0 \le \mu - 1 \tag{3.11} \]
where µ is the observability index of the system, which is defined as the minimum number for which:
\[ \operatorname{rank}\,[C^T, (CA)^T, \cdots, (CA^{\mu-1})^T] = n \]
For observable systems the observability index lies within the limits:
\[ \frac{n}{m} \le \mu \le n - m + 1 \]
Inequality (3.11) only bounds the minimum possible order of a functional observer. In practice, the observer order is normally chosen larger than this minimum,
in order to provide additional design freedom for achieving the required diagnostic performance.
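As an illustration, the observability index can be computed by stacking the blocks C, CA, CA², ... until the rank reaches n. The function name and the example matrices below are hypothetical.

```python
import numpy as np

def observability_index(A, C, tol=1e-9):
    """Smallest mu with rank [C; CA; ...; C A^(mu-1)] = n, or None if (C, A) is not observable."""
    n = A.shape[0]
    block = C.copy()
    stacked = C.copy()
    for mu in range(1, n + 1):
        if np.linalg.matrix_rank(stacked, tol=tol) == n:
            return mu
        block = block @ A                      # next block C A^mu
        stacked = np.vstack([stacked, block])
    return None

# Hypothetical third-order example in companion form
A = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [-1.0, -2.0, -3.0]])
C_one_sensor = np.array([[1.0, 0.0, 0.0]])
C_two_sensors = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
print(observability_index(A, C_one_sensor))    # one output: the index equals n here
print(observability_index(A, C_two_sensors))   # two outputs: a smaller index
```

For these matrices the computed values respect the limits n/m ≤ µ ≤ n − m + 1 given above.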
To isolate the faults, the observer-based approaches can be used to design
structured residual sets or fixed residual vectors. For sensor faults, this kind
of design is straightforward. If it is required that a residual be sensitive to
faults in all but one of the sensors, the observer used to generate this residual should be driven by all the outputs except that single sensor measurement.
However, the design of a structured residual set for actuator fault isolation is
more difficult. This problem can be solved via unknown input observers and
eigenstructure assignment [27], although the isolation of actuator faults is
not always possible.
• Parity vector (relation) methods
The basic idea of the parity relation approach is to provide a proper check of
the parity (consistency) of the measurements of the monitored system. The
parity relations are rearranged direct input-output model equations, subjected
to a linear dynamic transformation. The transformed residuals serve for detection and isolation. To begin with, let us consider the measurement
of an n-dimensional vector using m sensors, as in [27]. The measurement
equation is:
y(k) = Cx(k) + f (k) + ξ(k)
where $y(k) \in \mathbb{R}^m$ is the measurement vector, $x(k) \in \mathbb{R}^n$ the state vector, f (k)
the vector of sensor faults, ξ(k) the noise vector and C an m × n measurement
matrix. Furthermore, the number of measurements is larger than the dimension of
x(k), that is:
m > n and rank(C) = n
Inconsistency in the measurement data is then a metric that can be used
initially for detecting faults and, subsequently for fault isolation. For FDI
purposes, the vector y(k) can be combined into a set of linearly independent
parity equations to generate the parity vector (residual):
r(k) = V y(k)
To satisfy the usual requirement for a residual, i.e. that it is zero-valued in the fault-free case, the matrix V should satisfy the condition:
V C = 0
Under this condition, the parity vector contains only information on the faults
and noise:
r(k) = v1 [f1 (k) + ξ1 (k)] + · · · + vm [fm (k) + ξm (k)]
(3.12)
where v_i is the ith column of V and f_i(k) is the ith element of f (k), which denotes
the fault in the ith sensor. From (3.12) one can see that the parity vector
is independent of the unmeasured state x(k) and that it contains information
only about the faults and the noise (uncertainty). Moreover, the space spanned
by the columns of V is called the “parity space”. In addition, a fault in the
ith sensor implies a growth of the residual r(k) in the direction v_i. A fault
detection decision function is then defined as:
\[ DFD(k) \triangleq r(k)^T r(k) \]
If a fault occurs in the sensors, DFD(k) will be greater than a predetermined
threshold. For the fault isolation decision, another function is defined:
\[ DFI_i(k) \triangleq v_i^T r(k), \qquad i \in \{1, 2, \cdots, m\} \]
For a given r(k), a malfunctioning sensor is identified by computing the m
values of DFI_i(k). If DFI_j(k) is the largest of these values, then the
sensor that corresponds to DFI_j(k) is the one which is most likely to have
become faulty. From the parity space point of view, the columns of V define
m distinct fault signature directions. After a fault has been declared, it can
be isolated by comparing the orientation of the parity vector to each of these
signature directions. So, DFI_i(k) is a measure of the correlation of the residual
vector with the fault signature directions. For a reliable isolation, the generalized
angles between fault signature directions should be as large as possible, i.e.,
$v_i^T v_j$ ($i \neq j$) should be made as small as possible. Thus, optimal fault isolation
performance will be achieved when the v_i are determined by:
\[
\begin{cases}
\min\{v_i^T v_j\}, & i \neq j, \quad i, j \in \{1, 2, \cdots, m\} \\
\max\{v_i^T v_i\}, & i \in \{1, 2, \cdots, m\}
\end{cases}
\]
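The static parity check can be sketched numerically as follows. Here V is obtained from a singular value decomposition of C (one possible choice satisfying V C = 0), the three-sensor example and the threshold are hypothetical, and the absolute value of DFI_i is used so that negative faults are also handled, which is an assumption added for this illustration.

```python
import numpy as np

def parity_matrix(C, tol=1e-10):
    """Rows of V span the left null space of C, so that V @ C = 0."""
    U, sv, _ = np.linalg.svd(C)            # full SVD: U is m x m
    rank = int(np.sum(sv > tol))
    return U[:, rank:].T                   # (m - rank) x m

def detect_and_isolate(V, y, threshold):
    r = V @ y                              # parity vector r(k) = V y(k)
    dfd = float(r @ r)                     # DFD(k) = r^T r
    if dfd <= threshold:
        return dfd, None                   # no fault declared
    dfi = V.T @ r                          # DFI_i(k) = v_i^T r, v_i = i-th column of V
    return dfd, int(np.argmax(np.abs(dfi)))

# Hypothetical example: three sensors measuring the same scalar quantity
C = np.ones((3, 1))
V = parity_matrix(C)
x = np.array([4.2])
y_healthy = C @ x
y_faulty = y_healthy + np.array([0.0, 2.0, 0.0])   # bias on the second sensor

print(detect_and_isolate(V, y_healthy, threshold=0.1))
print(detect_and_isolate(V, y_faulty, threshold=0.1))
```

In the noise-free case the healthy measurement yields no alarm, while the biased measurement is flagged and attributed to the second sensor (index 1).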
For the case rank(C) = m < n, redundancy relations must be constructed, which can be done by collecting sensor outputs over a time interval, say
{y(k − s), y(k − s + 1), · · · , y(k)}. This is known as “temporal” or “serial”
redundancy. As in [27], we consider a system with the following discrete state
space equations:
x(k + 1) = Ax(k) + Bu(k) + R1 f (k)
y(k) = Cx(k) + Du(k) + R2 f (k)
(3.13)
where $y \in \mathbb{R}^m$ is the output vector, $x \in \mathbb{R}^n$ the state vector, $u \in \mathbb{R}^r$ the input
vector, $f \in \mathbb{R}^g$ the fault vector and A, B, C, D, R1, R2 are real matrices with
compatible dimensions. Combining (3.13) from time instant k − s to k yields
the following redundant relations:


\[
\underbrace{\begin{bmatrix} y(k-s) \\ y(k-s+1) \\ \vdots \\ y(k) \end{bmatrix}}_{Y(k)}
- H \underbrace{\begin{bmatrix} u(k-s) \\ u(k-s+1) \\ \vdots \\ u(k) \end{bmatrix}}_{U(k)}
= W x(k-s) + M \underbrace{\begin{bmatrix} f(k-s) \\ f(k-s+1) \\ \vdots \\ f(k) \end{bmatrix}}_{F(k)}
\tag{3.14}
\]
or in a condensed form:
\[ Y(k) - H U(k) = W x(k-s) + M F(k) \tag{3.15} \]
with:
\[
H = \begin{bmatrix}
D & 0 & \cdots & 0 \\
CB & D & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
CA^{s-1}B & CA^{s-2}B & \cdots & D
\end{bmatrix} \in \mathbb{R}^{(s+1)m \times (s+1)r},
\qquad
W = \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{s} \end{bmatrix} \in \mathbb{R}^{(s+1)m \times n}
\]
and the matrix M is constructed by replacing {D, B} with {R2, R1} in the
matrix H. Then a residual signal can be defined as:
\[ r(k) \triangleq V [Y(k) - H U(k)] \tag{3.16} \]
where $V \in \mathbb{R}^{p \times (s+1)m}$ and p is the residual vector dimension. Eq. (3.16) is the
computational form of the residual generator, which expresses the residual signal as
a function of the measured inputs and outputs of the monitored system. Using
(3.15) in (3.16) we obtain:
r(k) = V W x(k − s) + V M F (k)
(3.17)
This is the evaluation form of the residual. Again, to make the parity vector
insensitive to the system's inputs and states, the following equation should hold:
V W = 0
(3.18)
and to satisfy the fault detectability condition, the matrix V should also satisfy
the condition:
V M ≠ 0
(3.19)
Once the matrix V is available, the residual signal can be generated using (3.16). The
residual generator design depends on the solutions of (3.18). For an appropriately
large s, it follows from the Cayley-Hamilton theorem that a nonzero solution of (3.18)
exists, and hence so does the parity relation-based residual generator for fault
detection. It must be pointed out that the parity relation can also be constructed
using a z-transformed input-output model.
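A minimal sketch of the temporal parity relation (3.14)-(3.18) is given below. The second-order plant, the window length s = 2, the input signal and the sensor fault are hypothetical, and V is taken as an orthonormal basis of the left null space of W computed by a singular value decomposition, which is only one of the admissible solutions of (3.18).

```python
import numpy as np

def build_parity_matrices(A, B, C, D, s):
    """Return H and W of (3.15) for a window of s + 1 samples."""
    m, r = C.shape[0], B.shape[1]
    W = np.vstack([C @ np.linalg.matrix_power(A, i) for i in range(s + 1)])
    H = np.zeros(((s + 1) * m, (s + 1) * r))
    for i in range(s + 1):
        for j in range(i + 1):
            block = D if i == j else C @ np.linalg.matrix_power(A, i - j - 1) @ B
            H[i * m:(i + 1) * m, j * r:(j + 1) * r] = block
    return H, W

def left_null_space(W, tol=1e-10):
    U, sv, _ = np.linalg.svd(W)
    rank = int(np.sum(sv > tol))
    return U[:, rank:].T                    # rows v satisfy v W = 0

def residual_norms(fault_size, steps=30, s=2):
    # Hypothetical plant with one input and one output
    A = np.array([[0.9, 0.1], [0.0, 0.8]])
    B = np.array([[0.0], [1.0]])
    C = np.array([[1.0, 0.0]])
    D = np.array([[0.0]])
    H, W = build_parity_matrices(A, B, C, D, s)
    V = left_null_space(W)
    x = np.array([1.0, -1.0])
    ys, us, norms = [], [], []
    for k in range(steps):
        u = np.array([np.sin(0.3 * k)])
        f = fault_size if k >= steps // 2 else 0.0   # additive sensor fault
        ys.append(C @ x + D @ u + f)
        us.append(u)
        x = A @ x + B @ u
        if k >= s:                                   # a full window is available
            Y = np.concatenate(ys[-(s + 1):])
            U = np.concatenate(us[-(s + 1):])
            norms.append(np.linalg.norm(V @ (Y - H @ U)))   # r(k) of (3.16)
    return np.array(norms)

print("fault-free, max |r|:", residual_norms(0.0).max())
print("with fault, max |r|:", residual_norms(1.0).max())
```

The residual is computed from a finite window of past data only, which illustrates the nonrecursive nature of the parity relation implementation.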
The parity relation approach can be used to design structured residual sets for
fault isolation. As in the observer-based approach, isolating sensor faults is
straightforward. If we use c_i^T (the ith row of C) and y_i (the ith component of y)
instead of C and y, the parity relation will contain only the ith sensor's output
together with all the inputs. The residual generated by this relation is sensitive
only to the fault in the ith sensor. For the actuator isolation problem, the
structured residual set is more difficult to design and the isolation of actuator
faults is not always possible.
• FDI via Parameter Estimation
Parameter estimation is a natural approach to the detection and isolation of
parametric (multiplicative) faults. This approach is based on the assumption
that the faults are reflected in the physical system parameters such as friction,
mass, resistance, etc. The basic idea of the detection method is that the
parameters of the actual process are repeatedly estimated on-line using well-known estimation methods and the results are compared with the parameters of
the reference model obtained initially under fault-free conditions. Any
substantial discrepancy indicates a fault. This approach normally uses the
I/O mathematical model of a system in the following form:
y(t) = f (P, u(t))
where P is the model coefficient vector, which is directly related to the physical
parameters of the system. The function f can take either a linear or a nonlinear
form. The basic procedure followed for FDI purposes using the parameter estimation approach has the following steps:
(1) Establish the process model using physical relations.
(2) Determine the relationship between model coefficients and process physical parameters.
(3) Estimate the normal model coefficients.
(4) Calculate the normal process physical parameters.
(5) Determine the parameter changes which occur for the various fault cases.
By carrying out the last step for known faults, a database of faults and their
symptoms can be built up.
To generate residuals using this approach, an on-line parameter identification
algorithm should be used. It is not easy, however, to achieve fault isolation
using the parameter estimation method. This is because the parameters being
identified are model parameters, which cannot always be converted back to the
physical parameters of the system. Moreover, this method is also more demanding
in terms of on-line computation and input excitation requirements than the
other methods described previously.
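As an illustration of on-line parameter estimation, the sketch below applies recursive least squares with a forgetting factor to a hypothetical first-order model y(k) = a y(k−1) + b u(k−1). The model, the parameter values, the change of a at sample 300 and the forgetting factor are all assumptions made for this example; the text does not prescribe a particular identification algorithm.

```python
import numpy as np

def rls_update(theta, P, phi, y, lam=0.98):
    """One recursive least-squares step with forgetting factor lam."""
    gain = P @ phi / (lam + phi @ P @ phi)
    theta = theta + gain * (y - phi @ theta)
    P = (P - np.outer(gain, phi @ P)) / lam
    return theta, P

def estimate_history(steps=600, change_at=300):
    rng = np.random.default_rng(0)
    u = rng.standard_normal(steps)           # persistently exciting input
    theta = np.zeros(2)                      # estimates of (a, b)
    P = 1e3 * np.eye(2)
    y_prev = 0.0
    history = np.zeros((steps, 2))
    for k in range(1, steps):
        a = 0.8 if k < change_at else 0.6    # parametric fault: a changes
        y = a * y_prev + 0.5 * u[k - 1]
        phi = np.array([y_prev, u[k - 1]])   # regressor
        theta, P = rls_update(theta, P, phi, y)
        history[k] = theta
        y_prev = y
    return history

history = estimate_history()
nominal = np.array([0.8, 0.5])               # fault-free reference parameters
print("deviation before the change:", history[299] - nominal)
print("deviation after the change: ", history[-1] - nominal)
```

The deviation of the estimates from the fault-free reference values plays the role of the residual. The random input reflects the excitation requirement mentioned above.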
• Kalman filter
The innovation (prediction error) of the Kalman filter can be used as a fault
detection residual; its mean is zero if there is no fault (and no disturbance) and becomes non-zero in the presence of faults. However, fault isolation is somewhat
awkward with the Kalman filter; one needs to run a bank of “matched filters”, one
for each suspected fault and for each possible arrival time, and check which
filter output can be matched with the actual observation.
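The use of the innovation as a residual can be sketched with a scalar Kalman filter. The first-order model, the noise variances, the sensor bias and the random seed below are hypothetical choices for illustration only.

```python
import numpy as np

def simulate_measurements(steps=400, fault_at=200, bias=2.0, a=0.9, q=0.01, r=0.1, seed=1):
    """Scalar system x(k+1) = a x(k) + w(k), y(k) = x(k) + v(k), plus a sensor bias fault."""
    rng = np.random.default_rng(seed)
    x = 0.0
    y = np.empty(steps)
    for k in range(steps):
        fault = bias if k >= fault_at else 0.0
        y[k] = x + np.sqrt(r) * rng.standard_normal() + fault
        x = a * x + np.sqrt(q) * rng.standard_normal()
    return y

def normalized_innovations(y, a=0.9, q=0.01, r=0.1, x0=0.0, p0=1.0):
    """Innovations of a scalar Kalman filter, divided by their predicted standard deviation."""
    x_pred, p_pred = x0, p0
    nu = np.empty(len(y))
    for k, yk in enumerate(y):
        s = p_pred + r                        # innovation variance
        nu[k] = (yk - x_pred) / np.sqrt(s)
        gain = p_pred / s
        x_upd = x_pred + gain * (yk - x_pred) # measurement update
        p_upd = (1.0 - gain) * p_pred
        x_pred = a * x_upd                    # time update
        p_pred = a * a * p_upd + q
    return nu

nu = normalized_innovations(simulate_measurements())
print("mean innovation before the fault:", nu[:200].mean())
print("mean innovation after the fault: ", nu[200:].mean())
```

A simple detection test would compare a windowed mean of the normalized innovation with a threshold; the choice of window and threshold is left open here.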
Now that these approaches are well established, research attention has been devoted to
the interconnections among them, in particular between the parity relation approach
and the other three approaches. Equivalence between them has been demonstrated
from different viewpoints [30], [31]. More recently, [32] derived one-to-one relationships among the design parameters and revealed that the real difference between these
approaches lies in the fact that the on-line implementation of the parity relation
approach is nonrecursive, while the observer-based ones are implemented recursively. Making use of these results, the design of residual generators can be carried
out independently of the implementation form eventually used. We can use, for instance,
the parity space approach for the residual generator design, then transform the parameters obtained into the parameters needed for the construction of a diagnostic
observer and finally realize the diagnostic observer.
In the literature, all the approaches described above for residual generation are referred to as analytical approaches that make use of quantitative models.
Other approaches that make use of qualitative models, as well as approaches using
computational intelligence techniques, are referred to as knowledge-based approaches
and will be described in the sequel.
• Qualitative fault diagnosis
Since in most cases the available a priori knowledge about a process is hardly
complete or, even if it is complete, may be too complex to deal with directly,
an approximation has to be made, so that models become inaccurate.
In addition, measurements are subject to noise. Consequently, deviations between
the reality and its representation, i.e. modeling errors, are unavoidable. This
method has been extensively applied in science and engineering, e.g. when a
nonlinear differential equation is linearized or a complex system is represented
by a trained artificial neural network. These quantitative models are able to
predict the system behavior precisely, but more often than not inaccurately. Efforts have
to be made either to bring in more information (e.g. training data) to raise the
accuracy of the prediction in the modeling stage, or to decouple the modeling
errors in order to reduce their influence when applying the models to
fault diagnosis.
Alternatively, incomplete knowledge can be treated via abstraction. Instead of
the precise description by a quantitative model, a qualitative description of a
process can be applied. By allowing the existence of a tolerance, the resolution
of the representations is reduced, to emphasize primary distinctions and ignore
unimportant or unknown details. Although this description is imprecise it is
able to represent the system accurately. The qualitative approach, in contrast
with the quantitative one, requires only declarative information, e.g. the sign
of variables, the tendencies of variables (increasing, decreasing or constant),
order and/or relative magnitude, and hence can be robust with respect to
uncertainty in a well defined sense. The qualitative approach is motivated by
the following circumstances (see also [27]):
- Faults cannot be reasonably described by analytical methods, e.g. a valve
is blocked or a pipe is broken.
- The on-line information available is not given by quantitative assessments
of the current operating conditions, e.g. the statement “the water level is high” cannot
be unambiguously transformed into quantitative measurement data.
- If the system structure or parameters are not precisely known and diagnosis has to be based primarily on heuristic information (e.g. connection
of symptoms and faults, process history, fault statistics etc.), no quantitative model can be set up.
According to the available information about a plant, there are several different
possibilities to qualitatively represent the information of the dynamic process,
each of which is associated with an appropriate simulation method. Basically, a
qualitative simulation method should retain the accuracy
of the represented system behavior, so that the fault detection approaches based
on it can avoid false alarms. The representations that are relevant to the
FDI approaches are:
- qualitative differential equations ([34])
- envelope behaviors ([35], [36])
- stochastic qualitative behaviors ([37], [33])
The main disadvantages of the qualitative approach emerge when there is a possibility of ambiguity, for example when two or more declarative
variables are manipulated (the sum of a positive variable and a negative one can be either positive or negative). Moreover, because qualitative models are relatively crude, they usually
cannot be used to detect soft faults, as the diagnosis is symptom-based.
Quantitative and qualitative approaches have many complementary features
and can be suitably combined in order to increase the robustness of
the quantitative methods. This combination can also minimize the disadvantages of the two approaches. Hence, one of the aims of future research on
model-based FDI is to find a way to combine these two methods in order to
provide highly reliable diagnostic information.
• FDI using Computational Intelligence Techniques
In the case of fault diagnosis in complex systems, one is faced with the problem
that no, or insufficiently accurate, mathematical models are available. The use
of knowledge-model-based or data-model-based techniques, either in the framework of diagnosis expert systems or in combination with a human expert, is
then the only feasible way to proceed.
Fuzzy logic in fault diagnosis
The second stage of model-based FDI, decision making, is a logic decision process that transforms quantitative knowledge (residual signals) into qualitative
statements (faulty, normal, etc.). To outline the basic idea, let us consider the
case in which the residual due to faults is also contaminated by noise and by the effect of uncertainty due to incomplete decoupling, so that the residual will be
non-zero even in the absence of faults, i.e. the residual will fluctuate depending on the unknown time functions of the disturbances, noise and inputs of
the process. Given this limitation, the problem is to make the correct
decisions on the basis of uncertain information.
Contrary to classical logic, which allows a definite classification of fixed
values, fuzzy logic offers a form for the description of tolerances. Fuzzy
processing can be divided essentially into the following stages. In the first, the
residuals are compared with membership functions, which are often assumed
to be of triangular shape. In the second stage, the lower of the two antecedent
outputs is selected. Then the outputs of all rules are combined. Finally, the center of gravity (or another averaging method) is used to defuzzify the output
and to arrive at a definite decision. The introduction of
fuzzy logic can improve the decision-making, and in turn provide reliable
FDI schemes that are applicable to real industrial systems. One of the
latest developments in this area is the fuzzy observer-based approach. In [38], the
fuzzy observer concept represents a set of analytical linear observers
on whose outputs a fuzzy fusion is performed based on Takagi-Sugeno fuzzy
models. Using this approach, a nonlinear dynamic system is described by a
number of locally linearised models. For the fuzzy observer scheme, the linear
models are implemented in a bank of linear observers. The final state estimate is given by a fuzzy fusion of all local observer outputs. The difference
between the measured output and the estimated output provides the residual
for further diagnostic evaluation.
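The four fuzzy processing stages described above (triangular membership functions, minimum of the antecedents, combination of the rule outputs, center-of-gravity defuzzification) can be sketched as follows. The two-residual rule base, the membership function supports and the normalisation of the residuals to [0, 1] are hypothetical choices made only for this illustration.

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b (a < b < c)."""
    return np.maximum(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0)

def fuzzy_fault_degree(r1, r2):
    # Stage 1: fuzzify the (normalised) residual magnitudes
    a1, a2 = min(abs(r1), 1.0), min(abs(r2), 1.0)
    small1, large1 = tri(a1, -1.0, 0.0, 1.0), tri(a1, 0.0, 1.0, 2.0)
    small2, large2 = tri(a2, -1.0, 0.0, 1.0), tri(a2, 0.0, 1.0, 2.0)
    # Stage 2: rule strength = the lower of the two antecedent memberships
    w_normal = min(small1, small2)
    w_faulty = max(min(large1, small2), min(small1, large2), min(large1, large2))
    # Stage 3: clip the output sets and combine all rules (maximum)
    z = np.linspace(0.0, 1.0, 101)
    normal_out, faulty_out = tri(z, -1.0, 0.0, 1.0), tri(z, 0.0, 1.0, 2.0)
    aggregated = np.maximum(np.minimum(w_normal, normal_out),
                            np.minimum(w_faulty, faulty_out))
    # Stage 4: center-of-gravity defuzzification
    return float(np.sum(z * aggregated) / np.sum(aggregated))

print(fuzzy_fault_degree(0.05, 0.0))   # small residuals
print(fuzzy_fault_degree(0.9, 0.1))    # one large residual
```

With these particular output sets the defuzzified value lies between 1/3 and 2/3, so a crisp decision could be taken by comparing it with 0.5; this threshold is again only an assumption of the sketch.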
Neural Networks in fault diagnosis
Over the past two decades, artificial neural network (ANN) techniques
have matured into a data-driven method that provides a new
perspective on fault diagnosis. The ANN is a promising approach to FDI, owing to
its robustness and strong learning ability. The ability to learn means that, if a
causal relationship exists between the output and the input, the network can learn
it. If sufficient internal nodes and internal layers are available, the network
can also map any set of inputs to the corresponding output. The ultimate
result is a network which will faithfully reproduce the desired output for the
entire training set, including any noise. A very frequent application of ANNs
for FDI purposes is their use as classifiers, with training data for each fault.
However, the majority of the ANN-based FDI systems suffer from the lack
of universality, the dilemma of stability and the long training time, due to
the localization of the algorithm itself. One of the latest developments in this
area is the neural observer-based approach. In [39], [40] neural networks are
used as nonlinear multi-input single-output models of ARMA type to set up
different kinds of observer schemes. Thereby the neural networks replace the
analytical models which are usually necessary for observer-based FDI. Two
types of observer schemes are proposed by [39] for actuator, component and
instrument fault detection: the neural single observer scheme and the neural dedicated observer scheme. Whilst the first one is driven by all process
inputs and outputs the second one is only driven by the process inputs and
the output of the component to be supervised. Therefore, the first scheme
consists only of a single observer which is composed of a bank of multi-input
single-output neural nets each estimating one output in contrast to the second
scheme, which consists of a number of observers associated to each component
of the plant. These neural observers in turn consist of a number of multi-input
single-output neural nets each estimating one process output. In both cases
the training is based on fault free process data reflecting the normal behavior.
The residual evaluation part can then be performed by a well-known static
multi-layer perceptron neural network.
It should also be mentioned that a combination of the above intelligence techniques,
with the help of genetic algorithms, can be used in order to cope with the problems of nonlinear processes, lack of analytical knowledge and robustness. For
example, fuzzy neural networks (FNNs) (see [41]) combine the advantage of fuzzy
reasoning, which is the capability of handling uncertain and imprecise information,
with the advantage of neural networks, which is the capability of learning from examples. Genetic algorithms, on the other hand, can be used in order to find, for
example, optimal neural structures.
Thanks to the rapid progress of nonlinear observer theory during the last decade,
significant results in designing nonlinear residual generators have been achieved in
the last five years. Nevertheless, a general theory for the solution of nonlinear FDI
problems is still missing. Thus the development of nonlinear FDI approaches is one
of the current FDI topics that are receiving much attention. Despite the difficulties,
works on nonlinear systems have recently appeared [51]-[54].
3.3 Summary
This chapter has presented the basic principles of FDI, and especially of model-based FDI. The FDI problem has been formalized in a uniform framework by presenting mathematical descriptions and definitions. The residual generator, which is
identified as a central issue in model-based FDI, has been summarized in a generalized structure which can cover all residual generation methods. Other FDI methods
such as computational intelligence techniques and qualitative modeling have been
discussed briefly. In the following chapter, a novel approach for fault detection
in mechanical systems in which friction nonlinearities are present is presented. The
preceding chapters, including this one, constitute the basis for the
development of this novel approach.
Chapter 4
Fault detection in mechanical systems with friction phenomena: an on-line approximation approach
In this chapter we present a novel approach to detect faults in mechanical systems
with friction that perform linear motion. The basic module in the proposed architecture is an on-line approximator based on linear-in-the-weights neural
network structures. To model the effects of friction, the dynamic LuGre model [9] is
used. However, we do not assume knowledge of the system nonlinearities. Furthermore,
the friction internal state is not assumed to be available for measurement. The on-line approximator requires the system's position and velocity as well as its input force.
The performance of the developed fault detector is analyzed with respect to its
robustness and sensitivity. Rigorous fault detectability conditions are also derived
based on the important results presented in [51].
4.1 Problem Formulation
Consider the linear motion of a mass m driven by an input force u:
mẍ + Kx + F = u
(4.1)
where F represents the friction force, K > 0 denotes the spring constant, x the
mass position, and ẋ its velocity. To model the effects of friction, the dynamic
LuGre model [9] is used:
\[ F = \sigma_0 z + \sigma_1 \dot z + \sigma_2 \dot x + \omega(x, \dot x, z) \tag{4.2} \]
\[ \dot z = -\alpha(\dot x)|\dot x| z + \dot x \tag{4.3} \]
where the friction internal state z describes the average deflection of the contact
surfaces during the sticking phases and ω(x, ẋ, z) denotes a friction modelling error.
We assume that |ω(x, ẋ, z)| ≤ ω̄, where ω̄ ≥ 0 is an unknown but suitably small (see
Section 4.2) constant. Furthermore, the parameters σ0, σ1, σ2 that appear in (4.2)
are positive and are considered unknown, too. In [9], the function α(ẋ) is given by
\[ \alpha(\dot x) = \frac{\sigma_0}{f_c + (f_s - f_c)\, e^{-(\dot x / v_s)^2}} \]
where f_c is the Coulomb friction, f_s is the stiction force and v_s is the Stribeck
velocity. It is apparent that 0 < σ0/f_s ≤ α(ẋ) ≤ σ0/f_c. In practice, α(ẋ) depends
on several factors such as material properties, temperature etc.
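Defining $x_1 \triangleq x$, $x_2 \triangleq \dot x$, it follows that
\[ \dot x_1 = x_2 \tag{4.4} \]
\[ \dot x_2 = -\alpha_1 x_2 + [\alpha_3 \alpha(x_2)|x_2| - \alpha_2]\, z + \alpha_4 u - \alpha_4 \omega(x_1, x_2, z) - \alpha_5 x_1 \tag{4.5} \]
\[ \dot z = -\alpha(x_2)|x_2|\, z + x_2 \tag{4.6} \]
where $|x_2|\alpha(x_2) \ge 0$, $\alpha_1 \triangleq (\sigma_1 + \sigma_2)/m$, $\alpha_2 \triangleq \sigma_0/m$, $\alpha_3 \triangleq \sigma_1/m$, $\alpha_4 \triangleq 1/m$, and
$\alpha_5 \triangleq K/m$.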
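To make the model (4.4)-(4.6) concrete, the sketch below integrates it with a forward-Euler scheme. All numerical values (mass, spring constant, friction parameters, input force, step size) are hypothetical and chosen only so that the simulation is well behaved, the modelling error ω is set to zero, and α(·) is taken in the form reported from [9] above.

```python
import numpy as np

def alpha(v, sigma0, fc, fs, vs):
    """The function alpha(x2) of the LuGre model, in the form given in [9]."""
    return sigma0 / (fc + (fs - fc) * np.exp(-(v / vs) ** 2))

def simulate_lugre(t_end=2.0, dt=1e-4):
    # Hypothetical parameter values (illustration only)
    m, K = 1.0, 2.0
    sigma0, sigma1, sigma2 = 100.0, 1.0, 0.4
    fc, fs, vs = 1.0, 1.5, 0.1
    a1, a2, a3 = (sigma1 + sigma2) / m, sigma0 / m, sigma1 / m
    a4, a5 = 1.0 / m, K / m
    x1 = x2 = z = 0.0
    steps = int(t_end / dt)
    trajectory = np.empty((steps, 3))
    for k in range(steps):
        u = 3.0 * np.sin(2.0 * np.pi * k * dt)         # input force
        al = alpha(x2, sigma0, fc, fs, vs)
        dx1 = x2                                                    # (4.4)
        dx2 = -a1 * x2 + (a3 * al * abs(x2) - a2) * z + a4 * u - a5 * x1   # (4.5), omega = 0
        dz = -al * abs(x2) * z + x2                                 # (4.6)
        trajectory[k] = (x1, x2, z)
        x1, x2, z = x1 + dt * dx1, x2 + dt * dx2, z + dt * dz
    return trajectory

trajectory = simulate_lugre()
print("max |x1|, |x2|, |z|:", np.abs(trajectory).max(axis=0))
```

In such a run the internal state z remains bounded by f_s/σ0 (here 0.015) when starting from z = 0.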
For system (4.4)-(4.6), the following assumptions are introduced:
Assumption 1 The state variables x1 and x2 are available for measurement.
Assumption 2 Let U be the class of piecewise continuous and bounded signals.
Then, for any u ∈ U and any initial condition, the state trajectories x1 , x2 are
uniformly bounded.
It is worth noting that the internal friction state z is not assumed to be available for measurement, and the function α(x2 ) as well as the positive parameters
αi , i = 1, . . . , 5 are considered unknown.
As $\alpha(x_2)|x_2| \ge 0$, with $\alpha(x_2)|x_2| = 0$ only when $x_2 = 0$, it follows immediately
that the internal friction state z is input–to–state stable when x2 is considered as
the input. Hence, there exists a pair of functions β ∈ KL and γ ∈ K (these functions
are not assumed to be known) such that, for every essentially bounded input x2, we
have
\[ |z(t, z_0, x_2)| \le \beta(|z_0|, t) + \gamma(|x_2|), \qquad \forall t \ge 0 \tag{4.7} \]
where z(t, z0, x2) denotes the trajectory of (4.6) starting from z0 at time t = 0 with
input x2.
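As a hedged sanity check of the boundedness expressed by (4.7), assume that α(·) has the specific form of [9] reported above, so that α(x2) ≥ σ0/f_s. With the auxiliary function V_z = z²/2, introduced here only for illustration, one obtains:

```latex
\[
\dot V_z = z\dot z = -\alpha(x_2)|x_2|\,z^2 + x_2 z
\;\le\; -|x_2|\,|z|\,\bigl(\alpha(x_2)|z| - 1\bigr) \;\le\; 0
\qquad \text{whenever } |z| \ge \frac{f_s}{\sigma_0} \ge \frac{1}{\alpha(x_2)} ,
\]
\[
\text{so that}\qquad |z(t)| \le \max\Bigl\{ |z_0|,\ \frac{f_s}{\sigma_0} \Bigr\}, \qquad \forall t \ge 0 .
\]
```

This argument only illustrates that z remains bounded; it does not provide the functions β and γ of (4.7), which are not assumed to be known.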
The faults considered in this work are modelled as additive perturbations (occurring at some unknown time instant T) ∆F1(x1, x2, t) and ∆F2(x1, x2, t) to the
nominal F and K in (4.1), respectively. Then, after the occurrence of a fault (i.e., for
t ≥ T), the dynamics of the system become
\[ \dot x_1 = x_2 \tag{4.8} \]
\[
\begin{aligned}
\dot x_2 = {} & -\alpha_1 x_2 - \alpha_5 x_1 + \alpha_4 u + [\alpha_3 \alpha(x_2)|x_2| - \alpha_2]\, z - \alpha_4 \omega(x_1, x_2, z) \\
& + [(\alpha_3 \alpha(x_2)|x_2| - \alpha_2)\, z - \alpha_1 x_2 - \alpha_4 \omega(x_1, x_2, z)]\, \Delta F_1 - \alpha_5 x_1 \Delta F_2
\end{aligned}
\tag{4.9}
\]
\[ \dot z = -\alpha(x_2)|x_2|\, z + x_2 \tag{4.10} \]
The following further assumption is introduced (no multiple faults are considered in
this work).
Assumption 3 Only one single fault may occur at a given time T.
To end this section on the problem formulation, we would like to emphasize that the
additive perturbations ∆F1 (x1 , x2 , t) and ∆F2 (x1 , x2 , t) to the nominal values of F
and K, respectively, reflect variations in the normal forces in contact, temperature
changes and material wear, as well as spring’s stiffening and relaxation phenomena.
These malfunctions are typically encountered for instance in actuators installed in
harsh plant environments (see also the DAMADICS benchmark problem presented
in Chapter 5, where the problem of detecting faults in a sugar plant actuator is
addressed).
4.2 Nominal System On-line Approximation
In this section, we present an on–line approximation scheme for the nominal system
presented in Section 4.1. The approximator’s output will serve as the residual signal
for fault detection. In this respect, as will be seen later on, a key role is played
by the functional approximation scheme that in this work is implemented by one–
hidden–layer neural structures with a linear output layer. In the following, the basic
properties of such a class of neural approximators will be briefly reported for the
sake of completeness.
More specifically, the considered class of neural approximators can be characterized
as
\[ y^\top = W^\top S(v) \tag{4.11} \]
where $v \in \mathbb{R}^{n_2}$ and $y \in \mathbb{R}^{n_1}$ denote the approximator input and output, respectively,
W is an L-dimensional vector of synaptic weights, and S(v) is an $L \times n_1$ matrix of
regressor terms.
The regressor terms may contain high order connections of sigmoidal functions [47],
radial basis functions (RBFs) with fixed centers and widths [44], [48], [46], shifted
sigmoids [42], [43], thus forming High Order Neural Networks (HONNs), RBFs and
Shifted Sigmoidal Neural Networks, respectively.
An important and well known property shared by the aforementioned neural approximating structures is the following (see also the references above):
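Density Property. For every continuous function $f(v) : \mathbb{R}^{n_2} \to \mathbb{R}^{n_1}$ and for every $\epsilon > 0$, there exist
an integer L and optimal weight values $W^\star$ such that
\[ \sup_{v \in \Omega} \bigl| f(v)^\top - W^{\star\top} S(v) \bigr| \le \epsilon \]
where $\Omega \subset \mathbb{R}^{n_2}$ is a given compact set.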
In other terms, if the number of regressor terms L is sufficiently large, then there
exists a weight vector $W^\star$ such that $W^{\star\top} S(v)$ can approximate $f(v)^\top$ to any degree
of accuracy on a given compact set. This property allows us to focus on linear–in–
the–weights neural networks (LNN for short) without loss of generality in terms of
approximation error. This, in turn, will make it easier to prove basic system properties like stability and robustness. However, it is also important to mention that,
under suitable assumptions, neural networks are characterized by other interesting
properties related to their approximating capabilities (see the basic work [49] and
also the recent paper [50], where an extensive discussion on such properties in the
nonlinear optimal control context is reported).
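The density property can be illustrated numerically with a linear-in-the-weights approximator built from Gaussian radial basis functions with fixed centers and widths. The target function, the interval Ω = [−2, 2], the width rule and the off-line least-squares fit are hypothetical choices for this sketch; in the scheme of this chapter the weights are instead adapted on line.

```python
import numpy as np

def rbf_regressors(v, centers, width):
    """Row i holds S(v_i)^T: one Gaussian RBF per fixed center."""
    return np.exp(-((v[:, None] - centers[None, :]) / width) ** 2)

def fit_weights(v, target, centers, width):
    """Least-squares weights W such that S(v)^T W approximates the target."""
    S = rbf_regressors(v, centers, width)
    W, *_ = np.linalg.lstsq(S, target, rcond=None)
    return W

def sup_error(num_terms):
    v = np.linspace(-2.0, 2.0, 400)                 # samples of the compact set
    target = np.sin(2.0 * v) + 0.5 * v              # hypothetical f(v)
    centers = np.linspace(-2.0, 2.0, num_terms)     # fixed centers
    width = 2.0 * (centers[1] - centers[0])         # fixed width (assumed rule)
    W = fit_weights(v, target, centers, width)
    return np.max(np.abs(target - rbf_regressors(v, centers, width) @ W))

print("sup error with L = 5: ", sup_error(5))
print("sup error with L = 25:", sup_error(25))
```

Increasing the number of regressor terms L reduces the approximation error on the sampled set, in line with the density property.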
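Now, let us consider the following estimator
\[ \dot{\hat x}_1 = x_2 + k_1 \tilde\xi_1 - \hat\alpha_5 \tilde\xi_2 \tag{4.12} \]
\[ \dot{\hat x}_2 = -\hat\alpha_1 \hat x_2 - \hat\alpha_5 \hat x_1 + \hat\alpha_4 u + k_2 \tilde\xi_2 - \phi , \tag{4.13} \]
where $k_1, k_2 > 0$ denote design constants, $\phi$ denotes a function that will be defined
later on, and $\tilde\xi_1, \tilde\xi_2$ represent the state estimation errors defined as
\[ \tilde\xi_i \triangleq x_i - \hat x_i , \qquad i = 1, 2 , \tag{4.14} \]
and $\hat\alpha_i$, for i = 1, 4, 5, are parameters to be updated on line. Then, from (4.4), (4.5),
and (4.12)-(4.14), it follows that
\[ \dot{\tilde\xi}_1 = -k_1 \tilde\xi_1 + \hat\alpha_5 \tilde\xi_2 \tag{4.15} \]
\[
\begin{aligned}
\dot{\tilde\xi}_2 = {} & -\alpha_5 \tilde\xi_1 - (k_2 + \alpha_1)\tilde\xi_2 + \tilde\alpha_1 \hat x_2 + \tilde\alpha_5 \hat x_1 - \tilde\alpha_4 u + [\alpha_3 \alpha(x_2)|x_2| - \alpha_2]\, z \\
& - \alpha_4 \omega(x_1, x_2, z) + \phi
\end{aligned}
\tag{4.16}
\]
where $\tilde\alpha_i \triangleq \hat\alpha_i - \alpha_i$, for i = 1, 4, 5.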
The objective is to design an adaptive structure for $\phi$ that guarantees the
boundedness of the estimation errors and of all internal variables in the presence of the
unknown friction terms, which enter the dynamics of the system via the unmeasurable
state variable z, and of the modelling uncertainty.
To this end, we introduce the following form for the function $\phi$ in (4.13):
\[ \phi \triangleq -\bigl[\, |x_2|\, \hat w_1^\top S_1(x_2, |x_2|) + \hat w_2^\top S_2(|x_2|) + |x_2|\, \hat\epsilon_1 + \hat b_1 \bigr]\, \mathrm{sigm}(\tilde\xi_2) \tag{4.17} \]
where sigm(·) denotes a smooth sigmoidal approximation of the signum function
sgn(x), the terms $\hat w_1^\top S_1(x_2, |x_2|)$ and $\hat w_2^\top S_2(|x_2|)$ denote neural approximators of
the form (4.11), and $\hat\epsilon_1$, $\hat b_1$ are further parameters to be updated on line. The reasons
motivating the structure (4.17) for the term $\phi$ will become clear after the proof of Theorem 1
stated later on.
It is worth noting that
\[ \mathrm{sigm}(x) = \mathrm{sgn}(x) + \varepsilon_s(x) \tag{4.18} \]
where the error $\varepsilon_s(x)$ satisfies $|\varepsilon_s(x)| \le 1$ (as is well known, sigm(·) can be shaped
to make $\varepsilon_s(x)$ as small as desired). Hence, using (4.18),
(4.17) can be rewritten as
\[ \phi = -\bigl[\, |x_2|\, \hat w_1^\top S_1(x_2, |x_2|) + \hat w_2^\top S_2(|x_2|) + |x_2|\, \hat\epsilon_1 + \hat b_1 \bigr] \bigl( \mathrm{sgn}(\tilde\xi_2) + \varepsilon_s(\tilde\xi_2) \bigr) \tag{4.19} \]
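The parameters appearing in (4.12), (4.13), and (4.19) are provided by the following
adaptive laws:
\[
\begin{aligned}
\dot{\hat w}_1 &= P_a\{ |\tilde\xi_2|\, |x_2|\, S_1(x_2, |x_2|) \} \\
\dot{\hat w}_2 &= P_b\{ |\tilde\xi_2|\, S_2(|x_2|) \} \\
\dot{\hat\epsilon}_1 &= P_c\{ |\tilde\xi_2|\, |x_2| \} \\
\dot{\hat b}_1 &= P_d\{ |\tilde\xi_2| \} \\
\dot{\hat\alpha}_1 &= P_e\{ -\hat x_2 \tilde\xi_2 \} \\
\dot{\hat\alpha}_4 &= P_f\{ u \tilde\xi_2 \} \\
\dot{\hat\alpha}_5 &= P_g\{ -\tilde\xi_2 (\tilde\xi_1 + \hat x_1) \}
\end{aligned}
\tag{4.20}
\]
where $P_a$, $P_b$, $P_c$, $P_d$, $P_e$, $P_f$ and $P_g$ denote the projection operators with respect
to the convex sets
\[
\begin{aligned}
W_a &\triangleq \{ \hat w_1 \in \mathbb{R}^{L_1} : |\hat w_1| \le M_a \}, &
W_b &\triangleq \{ \hat w_2 \in \mathbb{R}^{L_2} : |\hat w_2| \le M_b \}, \\
W_c &\triangleq \{ \hat\epsilon_1 \in \mathbb{R} : |\hat\epsilon_1| \le M_c \}, &
W_d &\triangleq \{ \hat b_1 \in \mathbb{R} : |\hat b_1| \le M_d \}, \\
W_e &\triangleq \{ \hat\alpha_1 \in \mathbb{R} : 0 < \hat\alpha_1 \le M_e \}, &
W_f &\triangleq \{ \hat\alpha_4 \in \mathbb{R} : 0 < \hat\alpha_4 \le M_f \}, \\
W_g &\triangleq \{ \hat\alpha_5 \in \mathbb{R} : 0 < \hat\alpha_5 \le M_g \},
\end{aligned}
\]
where $M_a$, $M_b$, $M_c$, $M_d$, $M_e$, $M_f$, $M_g$ are suitably large positive scalars (the definition of the
projection operation with respect to a convex set can be found, for instance, in [45]).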
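A discrete-time sketch of two ingredients of the scheme is given below: a smooth approximation of the signum function and a projected parameter update. The choice sigm(x) = tanh(x/δ), the Euler step followed by a projection onto the constraint set, and the small positive lower bound used for the half-open intervals are assumptions of this illustration; the continuous-time projection operator actually used in (4.20) is the one defined in [45].

```python
import numpy as np

def sigm(x, delta=0.05):
    """One possible smooth approximation of sgn(x); a smaller delta gives a smaller error."""
    return np.tanh(x / delta)

def project_ball(w, radius):
    """Euclidean projection onto the set {w : |w| <= radius}."""
    norm = np.linalg.norm(w)
    return w if norm <= radius else w * (radius / norm)

def project_interval(a, lower, upper):
    """Projection of a scalar onto [lower, upper]; lower > 0 stands in for the open bound at 0."""
    return min(max(a, lower), upper)

def w1_step(w1, xi2, x2, S1, dt, Ma):
    """Euler step of the first law in (4.20), followed by projection onto W_a (sketch)."""
    return project_ball(w1 + dt * abs(xi2) * abs(x2) * S1, Ma)

def alpha1_step(alpha1_hat, x2_hat, xi2, dt, Me, lower=1e-6):
    """Euler step of the law for the estimate of alpha_1, followed by projection onto W_e (sketch)."""
    return project_interval(alpha1_hat + dt * (-x2_hat * xi2), lower, Me)

# Hypothetical numbers, chosen only to exercise the functions
w1 = w1_step(np.array([0.9, 0.0]), xi2=1.0, x2=1.0, S1=np.array([1.0, 1.0]), dt=1.0, Ma=1.0)
print(w1, np.linalg.norm(w1))
print(alpha1_step(0.5, x2_hat=2.0, xi2=1.0, dt=1.0, Me=5.0))
```

The projection keeps every estimate inside its prescribed set regardless of the size of the update, which is the property exploited in the boundedness analysis.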
Now, we are able to state and prove the following basic theorem:
Theorem 1 Consider the system (4.4)-(4.6). There exists a choice of the scalars
$M_a$, $M_b$, $M_c$, $M_d$, $M_e$, $M_f$, $M_g$ defining the projection sets $W_a$, $W_b$, $W_c$, $W_d$, $W_e$,
$W_f$, $W_g$ and of the initial conditions of the estimated parameters in the adaptive
laws (4.20) such that the on-line approximator (4.12), (4.13), (4.19) together with
the update laws (4.20) guarantee the uniform ultimate boundedness of $\tilde{\xi}_i$, $i = 1, 2$,
with respect to the sets
\[
\begin{aligned}
\Xi_1 &= \left\{\tilde{\xi}_1 \in \Re \;\middle|\; |\tilde{\xi}_1| \le \frac{\bar{\Phi} M_g}{(k_2+\alpha_1)k_1}\right\} \\
\Xi_2 &= \left\{\tilde{\xi}_2 \in \Re \;\middle|\; |\tilde{\xi}_2| \le \frac{|\Phi|}{k_2+\alpha_1} \le \frac{\bar{\Phi}}{k_2+\alpha_1}\right\}
\end{aligned}
\]
where $\bar{\Phi} > |\Phi|$, with $\Phi = [\,|x_2|\hat{w}_1^{\top} S_1(x_2,|x_2|) + \hat{w}_2^{\top} S_2(|x_2|) + |x_2|\hat{\epsilon}_1 + \hat{b}_1]$, is a suitable positive
scalar, as well as the boundedness of all parameter estimates $\hat{w}_i$, $i = 1, 2$, $\hat{\epsilon}_1$, $\hat{b}_1$, $\hat{\alpha}_1$,
$\hat{\alpha}_4$ and $\hat{\alpha}_5$.
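Proof: Consider the Lyapunov function candidate:
\[
V \triangleq \tfrac{1}{2}\tilde{\xi}_1^2 + \tfrac{1}{2}\tilde{\xi}_2^2 + \tfrac{1}{2}\tilde{w}_1^{\top}\tilde{w}_1 + \tfrac{1}{2}\tilde{w}_2^{\top}\tilde{w}_2 + \tfrac{1}{2}\tilde{\epsilon}_1^2 + \tfrac{1}{2}\tilde{b}_1^2 + \tfrac{1}{2}\tilde{\alpha}_1^2 + \tfrac{1}{2}\tilde{\alpha}_4^2 + \tfrac{1}{2}\tilde{\alpha}_5^2
\tag{4.21}
\]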
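Differentiating V with respect to time we obtain
\[
\begin{aligned}
\dot{V} ={}& -k_1\tilde{\xi}_1^2 + \hat{\alpha}_5\tilde{\xi}_1\tilde{\xi}_2 - (k_2+\alpha_1)\tilde{\xi}_2^2 - \alpha_5\tilde{\xi}_1\tilde{\xi}_2 + \tilde{\xi}_2(\alpha_3\alpha(x_2)|x_2| - \alpha_2)z + \phi\tilde{\xi}_2 + \tilde{w}_1^{\top}\dot{\tilde{w}}_1 + \tilde{w}_2^{\top}\dot{\tilde{w}}_2 \\
& - \tilde{\xi}_2\alpha_4\omega(x_1,x_2,z) + \tilde{\xi}_2\tilde{\alpha}_1\hat{x}_2 + \tilde{\xi}_2\tilde{\alpha}_5\hat{x}_1 - \tilde{\xi}_2\tilde{\alpha}_4 u + \tilde{\alpha}_1\dot{\hat{\alpha}}_1 + \tilde{\alpha}_4\dot{\hat{\alpha}}_4 + \tilde{\alpha}_5\dot{\hat{\alpha}}_5 + \tilde{\epsilon}_1\dot{\tilde{\epsilon}}_1 + \tilde{b}_1\dot{\tilde{b}}_1 \\
\le{}& -k_1\tilde{\xi}_1^2 - (k_2+\alpha_1)\tilde{\xi}_2^2 + \tilde{\xi}_1\tilde{\xi}_2\tilde{\alpha}_5 + |\tilde{\xi}_2||x_2|\alpha_3\alpha(x_2)|z| + \alpha_2|\tilde{\xi}_2||z| + \tilde{\xi}_2\phi + \tilde{w}_1^{\top}\dot{\tilde{w}}_1 + \tilde{w}_2^{\top}\dot{\tilde{w}}_2 \\
& + |\tilde{\xi}_2|\alpha_4|\omega(x_1,x_2,z)| + \tilde{\xi}_2\tilde{\alpha}_1\hat{x}_2 + \tilde{\xi}_2\tilde{\alpha}_5\hat{x}_1 - \tilde{\xi}_2\tilde{\alpha}_4 u + \tilde{\alpha}_1\dot{\hat{\alpha}}_1 + \tilde{\alpha}_4\dot{\hat{\alpha}}_4 + \tilde{\alpha}_5\dot{\hat{\alpha}}_5 + \tilde{\epsilon}_1\dot{\tilde{\epsilon}}_1 + \tilde{b}_1\dot{\tilde{b}}_1
\end{aligned}
\tag{4.22}
\]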
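From bounds (4.7) and $|\omega(x,\dot{x},z)| \le \bar{\omega}$, and introducing a positive scalar $d_0$ such
that $0 \le \beta(|z_0|,t) \le d_0$, it follows that
\[
\begin{aligned}
\dot{V} \le{}& -k_1\tilde{\xi}_1^2 - (k_2+\alpha_1)\tilde{\xi}_2^2 + |\tilde{\xi}_2||x_2|\alpha_3\alpha(x_2)\left[\beta(|z_0|,t) + \gamma(|x_2|)\right] + \alpha_2|\tilde{\xi}_2|\left[\beta(|z_0|,t) + \gamma(|x_2|)\right] \\
& + \tilde{\xi}_2\phi + \tilde{w}_1^{\top}\dot{\tilde{w}}_1 + \tilde{w}_2^{\top}\dot{\tilde{w}}_2 + |\tilde{\xi}_2|\alpha_4\bar{\omega} + \tilde{\xi}_2\tilde{\alpha}_1\hat{x}_2 + \tilde{\xi}_2\tilde{\alpha}_5(\tilde{\xi}_1+\hat{x}_1) - \tilde{\xi}_2\tilde{\alpha}_4 u + \tilde{\alpha}_1\dot{\hat{\alpha}}_1 + \tilde{\alpha}_4\dot{\hat{\alpha}}_4 \\
& + \tilde{\alpha}_5\dot{\hat{\alpha}}_5 + \tilde{\epsilon}_1\dot{\tilde{\epsilon}}_1 + \tilde{b}_1\dot{\tilde{b}}_1 \\
={}& -k_1\tilde{\xi}_1^2 - (k_2+\alpha_1)\tilde{\xi}_2^2 + |\tilde{\xi}_2||x_2|\beta(|z_0|,t)\alpha_3\alpha(x_2) + |\tilde{\xi}_2||x_2|\alpha_3\alpha(x_2)\gamma(|x_2|) + \alpha_2|\tilde{\xi}_2|\beta(|z_0|,t) \\
& + \alpha_2|\tilde{\xi}_2|\gamma(|x_2|) + \tilde{\xi}_2\phi + \tilde{w}_1^{\top}\dot{\tilde{w}}_1 + \tilde{w}_2^{\top}\dot{\tilde{w}}_2 + |\tilde{\xi}_2|\alpha_4\bar{\omega} + \tilde{\xi}_2\tilde{\alpha}_1\hat{x}_2 + \tilde{\xi}_2\tilde{\alpha}_5(\tilde{\xi}_1+\hat{x}_1) - \tilde{\xi}_2\tilde{\alpha}_4 u \\
& + \tilde{\alpha}_1\dot{\hat{\alpha}}_1 + \tilde{\alpha}_4\dot{\hat{\alpha}}_4 + \tilde{\alpha}_5\dot{\hat{\alpha}}_5 + \tilde{\epsilon}_1\dot{\tilde{\epsilon}}_1 + \tilde{b}_1\dot{\tilde{b}}_1 \\
\le{}& -k_1\tilde{\xi}_1^2 - (k_2+\alpha_1)\tilde{\xi}_2^2 + |\tilde{\xi}_2||x_2|\left[\alpha_3\alpha(x_2)\left(d_0 + \gamma(|x_2|)\right)\right] + |\tilde{\xi}_2|(\alpha_2 d_0 + \alpha_4\bar{\omega}) + |\tilde{\xi}_2|\alpha_2\gamma(|x_2|) \\
& + \tilde{\xi}_2\phi + \tilde{w}_1^{\top}\dot{\tilde{w}}_1 + \tilde{w}_2^{\top}\dot{\tilde{w}}_2 + \tilde{\xi}_2\tilde{\alpha}_1\hat{x}_2 + \tilde{\xi}_2\tilde{\alpha}_5(\tilde{\xi}_1+\hat{x}_1) - \tilde{\xi}_2\tilde{\alpha}_4 u + \tilde{\alpha}_1\dot{\hat{\alpha}}_1 + \tilde{\alpha}_4\dot{\hat{\alpha}}_4 + \tilde{\alpha}_5\dot{\hat{\alpha}}_5 \\
& + \tilde{\epsilon}_1\dot{\tilde{\epsilon}}_1 + \tilde{b}_1\dot{\tilde{b}}_1
\end{aligned}
\]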
The idea now is to approximate on line the unknown nonlinear terms $\alpha_3\alpha(x_2)(d_0 + \gamma(|x_2|))$
and $\alpha_2\gamma(|x_2|)$ by suitable neural approximators. More specifically, it turns
out that there exist continuous functions $\varepsilon_1(x_2)$, $\varepsilon_2(x_2)$ (denoting the approximation
errors) and constant but unknown weight vectors $w_1^{\star}$, $w_2^{\star}$, such that
\[
\begin{aligned}
\alpha_3\alpha(x_2)\left(d_0 + \gamma(|x_2|)\right) &= w_1^{\star\top} S_1(x_2,|x_2|) + \varepsilon_1(x_2) \\
\alpha_2\gamma(|x_2|) &= w_2^{\star\top} S_2(|x_2|) + \varepsilon_2(x_2)
\end{aligned}
\tag{4.23}
\]
From the density property, it also follows that, on a generic compact set $\Omega \subset \Re$, the
approximation errors can be suitably bounded as $|\varepsilon_1(x_2)| \le \epsilon_1$ and $|\varepsilon_2(x_2)| \le \epsilon_2$,
where $\epsilon_1 > 0$, $\epsilon_2 > 0$.
Now, letting $k_3^{\star} \triangleq \alpha_2 d_0 + \alpha_4\bar{\omega}$ and using (4.23), we have
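\[
\begin{aligned}
\dot{V} \le{}& -k_1\tilde{\xi}_1^2 - (k_2+\alpha_1)\tilde{\xi}_2^2 + |\tilde{\xi}_2||x_2|\left[w_1^{\star\top} S_1(x_2,|x_2|) + \varepsilon_1(x_2)\right] + k_3^{\star}|\tilde{\xi}_2| \\
& + |\tilde{\xi}_2|\left[w_2^{\star\top} S_2(|x_2|) + \varepsilon_2(x_2)\right] + \tilde{\xi}_2\phi + \tilde{w}_1^{\top}\dot{\tilde{w}}_1 + \tilde{w}_2^{\top}\dot{\tilde{w}}_2 + \tilde{\xi}_2\tilde{\alpha}_1\hat{x}_2 + \tilde{\xi}_2\tilde{\alpha}_5(\tilde{\xi}_1+\hat{x}_1) \\
& - \tilde{\xi}_2\tilde{\alpha}_4 u + \tilde{\alpha}_1\dot{\hat{\alpha}}_1 + \tilde{\alpha}_4\dot{\hat{\alpha}}_4 + \tilde{\alpha}_5\dot{\hat{\alpha}}_5 + \tilde{\epsilon}_1\dot{\tilde{\epsilon}}_1 + \tilde{b}_1\dot{\tilde{b}}_1 \\
={}& -k_1\tilde{\xi}_1^2 - (k_2+\alpha_1)\tilde{\xi}_2^2 + |\tilde{\xi}_2||x_2|w_1^{\star\top} S_1(x_2,|x_2|) + k_3^{\star}|\tilde{\xi}_2| + |\tilde{\xi}_2|w_2^{\star\top} S_2(|x_2|) \\
& + |\tilde{\xi}_2|\left[|x_2|\varepsilon_1(x_2) + \varepsilon_2(x_2)\right] + \tilde{\xi}_2\phi + \tilde{w}_1^{\top}\dot{\tilde{w}}_1 + \tilde{w}_2^{\top}\dot{\tilde{w}}_2 + \tilde{\xi}_2\tilde{\alpha}_1\hat{x}_2 + \tilde{\xi}_2\tilde{\alpha}_5(\tilde{\xi}_1+\hat{x}_1) \\
& - \tilde{\xi}_2\tilde{\alpha}_4 u + \tilde{\alpha}_1\dot{\hat{\alpha}}_1 + \tilde{\alpha}_4\dot{\hat{\alpha}}_4 + \tilde{\alpha}_5\dot{\hat{\alpha}}_5 + \tilde{\epsilon}_1\dot{\tilde{\epsilon}}_1 + \tilde{b}_1\dot{\tilde{b}}_1 \\
\le{}& -k_1\tilde{\xi}_1^2 - (k_2+\alpha_1)\tilde{\xi}_2^2 + |\tilde{\xi}_2||x_2|w_1^{\star\top} S_1(x_2,|x_2|) + |\tilde{\xi}_2|w_2^{\star\top} S_2(|x_2|) + |\tilde{\xi}_2|\left[|x_2|\epsilon_1 + k_3^{\star} + \epsilon_2\right] \\
& + \tilde{\xi}_2\phi + \tilde{w}_1^{\top}\dot{\tilde{w}}_1 + \tilde{w}_2^{\top}\dot{\tilde{w}}_2 + \tilde{\xi}_2\tilde{\alpha}_1\hat{x}_2 + \tilde{\xi}_2\tilde{\alpha}_5(\tilde{\xi}_1+\hat{x}_1) - \tilde{\xi}_2\tilde{\alpha}_4 u + \tilde{\alpha}_1\dot{\hat{\alpha}}_1 + \tilde{\alpha}_4\dot{\hat{\alpha}}_4 + \tilde{\alpha}_5\dot{\hat{\alpha}}_5 \\
& + \tilde{\epsilon}_1\dot{\tilde{\epsilon}}_1 + \tilde{b}_1\dot{\tilde{b}}_1
\end{aligned}
\]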
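After adding and subtracting the terms $|\tilde{\xi}_2||x_2|\hat{w}_1^{\top} S_1(x_2,|x_2|)$ and $|\tilde{\xi}_2|\hat{w}_2^{\top} S_2(|x_2|)$,
and writing $b_1$ for the constant $k_3^{\star} + \epsilon_2$ of the previous bound, we obtain
\[
\begin{aligned}
\dot{V} \le{}& -k_1\tilde{\xi}_1^2 - (k_2+\alpha_1)\tilde{\xi}_2^2 + |\tilde{\xi}_2||x_2|w_1^{\star\top} S_1(x_2,|x_2|) + |\tilde{\xi}_2||x_2|\hat{w}_1^{\top} S_1(x_2,|x_2|) \\
& - |\tilde{\xi}_2||x_2|\hat{w}_1^{\top} S_1(x_2,|x_2|) + |\tilde{\xi}_2|w_2^{\star\top} S_2(|x_2|) + |\tilde{\xi}_2|\hat{w}_2^{\top} S_2(|x_2|) - |\tilde{\xi}_2|\hat{w}_2^{\top} S_2(|x_2|) \\
& + |\tilde{\xi}_2|\left[|x_2|\epsilon_1 + b_1\right] + \tilde{\xi}_2\phi + \tilde{w}_1^{\top}\dot{\tilde{w}}_1 + \tilde{w}_2^{\top}\dot{\tilde{w}}_2 + \tilde{\xi}_2\tilde{\alpha}_1\hat{x}_2 + \tilde{\xi}_2\tilde{\alpha}_5(\tilde{\xi}_1+\hat{x}_1) \\
& - \tilde{\xi}_2\tilde{\alpha}_4 u + \tilde{\alpha}_1\dot{\hat{\alpha}}_1 + \tilde{\alpha}_4\dot{\hat{\alpha}}_4 + \tilde{\alpha}_5\dot{\hat{\alpha}}_5 + \tilde{\epsilon}_1\dot{\tilde{\epsilon}}_1 + \tilde{b}_1\dot{\tilde{b}}_1 \\
={}& -k_1\tilde{\xi}_1^2 - (k_2+\alpha_1)\tilde{\xi}_2^2 - |\tilde{\xi}_2||x_2|\tilde{w}_1^{\top} S_1(x_2,|x_2|) + |\tilde{\xi}_2||x_2|\hat{w}_1^{\top} S_1(x_2,|x_2|) - |\tilde{\xi}_2|\tilde{w}_2^{\top} S_2(|x_2|) \\
& + |\tilde{\xi}_2|\hat{w}_2^{\top} S_2(|x_2|) + |\tilde{\xi}_2|\left[|x_2|\epsilon_1 + b_1\right] + \tilde{\xi}_2\phi + \tilde{w}_1^{\top}\dot{\tilde{w}}_1 + \tilde{w}_2^{\top}\dot{\tilde{w}}_2 \\
& + \tilde{\xi}_2\tilde{\alpha}_1\hat{x}_2 + \tilde{\xi}_2\tilde{\alpha}_5(\tilde{\xi}_1+\hat{x}_1) - \tilde{\xi}_2\tilde{\alpha}_4 u + \tilde{\alpha}_1\dot{\hat{\alpha}}_1 + \tilde{\alpha}_4\dot{\hat{\alpha}}_4 + \tilde{\alpha}_5\dot{\hat{\alpha}}_5 + \tilde{\epsilon}_1\dot{\tilde{\epsilon}}_1 + \tilde{b}_1\dot{\tilde{b}}_1
\end{aligned}
\]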
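Using (4.19), (4.20) and defining
\[
\Phi \triangleq \left[\,|x_2|\hat{w}_1^{\top} S_1(x_2,|x_2|) + \hat{w}_2^{\top} S_2(|x_2|) + |x_2|\hat{\epsilon}_1 + \hat{b}_1\right]
\]
we obtain
\[
\begin{aligned}
\dot{V} &\le -k_1\tilde{\xi}_1^2 - (k_2+\alpha_1)\tilde{\xi}_2^2 - \Phi\,\varepsilon_s(\tilde{\xi}_2)\,\tilde{\xi}_2 \\
&\le -(k_2+\alpha_1)\tilde{\xi}_2^2 + |\Phi\,\varepsilon_s(\tilde{\xi}_2)\,\tilde{\xi}_2| \\
&\le -(k_2+\alpha_1)\tilde{\xi}_2^2 + |\Phi||\tilde{\xi}_2| = -|\tilde{\xi}_2|\left[(k_2+\alpha_1)|\tilde{\xi}_2| - |\Phi|\right] \\
&\le 0
\end{aligned}
\]
provided that
\[
|\tilde{\xi}_2| > \frac{|\Phi|}{k_2+\alpha_1}
\]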
Hence $\tilde{\xi}_2$ is uniformly ultimately bounded with respect to the set
\[
\Xi_2 = \left\{\tilde{\xi}_2 \in \Re \;\middle|\; |\tilde{\xi}_2| \le \frac{|\Phi|}{k_2+\alpha_1}\right\}
\]
Notice that
\[
|\Phi| \le |x_2||\hat{w}_1||S_1(x_2,|x_2|)| + |\hat{w}_2||S_2(|x_2|)| + |x_2||\hat{\epsilon}_1| + |\hat{b}_1|
\]
Moreover, $|S_1(x_2,|x_2|)|$, $|S_2(|x_2|)|$ are bounded too (let $s_1$, $s_2$ denote their known
upper bounds). Then, according to (4.20) and denoting by $\bar{x}_2$ the known upper
bound on the velocity (see Assumption 2), we have
\[
|\Phi| \le \bar{\Phi} \triangleq \bar{x}_2 M_a s_1 + M_b s_2 + \bar{x}_2 M_c + M_d
\]
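Furthermore, since
\[
\dot{\tilde{\xi}}_1 = -k_1\tilde{\xi}_1 + \hat{\alpha}_5\tilde{\xi}_2
\]
we obtain
\[
\tilde{\xi}_1(t) = \tilde{\xi}_1(0)e^{-k_1 t} + \int_0^t e^{-k_1(t-\tau)}\hat{\alpha}_5\tilde{\xi}_2(\tau)\,d\tau
\]
Thus, since $\hat{\alpha}_5 \le M_g$ by the projection,
\[
|\tilde{\xi}_1(t)| \le |\tilde{\xi}_1(0)|e^{-k_1 t} + M_g\int_0^t e^{-k_1(t-\tau)}|\tilde{\xi}_2(\tau)|\,d\tau
\]
which finally becomes
\[
|\tilde{\xi}_1(t)| \le |\tilde{\xi}_1(0)|e^{-k_1 t} + \frac{\bar{\Phi} M_g}{(k_2+\alpha_1)k_1}\left(1 - e^{-k_1 t}\right)
\]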
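Hence, $\tilde{\xi}_1(t)$ and $\tilde{\xi}_2(t)$ are uniformly ultimately bounded with respect to the sets
\[
\begin{aligned}
\Xi_1 &= \left\{\tilde{\xi}_1 \in \Re \;\middle|\; |\tilde{\xi}_1| \le \frac{\bar{\Phi} M_g}{(k_2+\alpha_1)k_1}\right\} \\
\Xi_2 &= \left\{\tilde{\xi}_2 \in \Re \;\middle|\; |\tilde{\xi}_2| \le \frac{|\Phi|}{k_2+\alpha_1} \le \frac{\bar{\Phi}}{k_2+\alpha_1}\right\}
\end{aligned}
\]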
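Now, if the sets $W_a$, $W_b$, $W_c$, $W_d$, $W_e$, $W_f$, $W_g$ are chosen in such a way that $w_1^{\star}$,
$\hat{w}_1(0) \in W_a$, $w_2^{\star}$, $\hat{w}_2(0) \in W_b$, $\epsilon_1$, $\hat{\epsilon}_1(0) \in W_c$, $b_1$, $\hat{b}_1(0) \in W_d$, $\alpha_1$, $\hat{\alpha}_1(0) \in W_e$,
$\alpha_4$, $\hat{\alpha}_4(0) \in W_f$ and $\alpha_5$, $\hat{\alpha}_5(0) \in W_g$, where $\hat{w}_1(0)$, $\hat{w}_2(0)$, $\hat{\epsilon}_1(0)$, $\hat{b}_1(0)$, $\hat{\alpha}_1(0)$, $\hat{\alpha}_4(0)$
and $\hat{\alpha}_5(0)$ denote the initial values of $\hat{w}_1$, $\hat{w}_2$, $\hat{\epsilon}_1$, $\hat{b}_1$, $\hat{\alpha}_1$, $\hat{\alpha}_4$ and $\hat{\alpha}_5$, respectively,
then the use of the projection modification to the update laws (4.20) guarantees the
boundedness of $\hat{w}_1$, $\hat{w}_2$, $\hat{\epsilon}_1$, $\hat{b}_1$, $\hat{\alpha}_1$, $\hat{\alpha}_4$ and $\hat{\alpha}_5$, thus ending the proof of the theorem.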
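As a quick numerical reading of the bounds in Theorem 1, the sketch below evaluates the bound on |Φ| and the radii of the sets Ξ1 and Ξ2. All numerical values used with it are hypothetical and serve only to show how the quantities combine; in particular α1 is unknown in practice, so these radii are analysis quantities rather than something the detector can compute.

```python
def phi_bar(x2_bar: float, M_a: float, M_b: float, M_c: float,
            M_d: float, s1: float, s2: float) -> float:
    """Upper bound on |Phi|: x2_bar*M_a*s1 + M_b*s2 + x2_bar*M_c + M_d."""
    return x2_bar * M_a * s1 + M_b * s2 + x2_bar * M_c + M_d


def xi2_radius(phi_bar_val: float, k2: float, alpha1: float) -> float:
    """Radius of the set Xi_2 expressed through the bound on |Phi|."""
    return phi_bar_val / (k2 + alpha1)


def xi1_radius(phi_bar_val: float, M_g: float, k1: float,
               k2: float, alpha1: float) -> float:
    """Radius of the set Xi_1: phi_bar * M_g / ((k2 + alpha1) * k1)."""
    return phi_bar_val * M_g / ((k2 + alpha1) * k1)
```

For instance, with the assumed values x2_bar = 1, M_a = 2, M_b = 3, M_c = M_d = 0.5 and s1 = s2 = 1 the bound on |Φ| is 6, and doubling k2 roughly halves the radius of Ξ2.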
Remark 1. It is worth noting that the size of the sets Ξ1 and Ξ2 depends on
several factors, and in general it is not easy to identify a clear way to reduce
this size, which cannot be made arbitrarily small. In this respect, a key role is
played by the modeling uncertainty and the number L of regressor terms on one side,
and by the design constants k1 and k2 on the other. In case of significant modeling
uncertainties, several issues come up. First of all, the projection algorithm may not
be able to guarantee the boundedness of the signals that appear in (4.20). More
specifically, the projection modification requires knowledge of the upper bounds on
the norms of the unknown weights, $b_1$, $\epsilon_1$, etc. If, for instance, we have $|b_1| > M_d$
due to the modeling error, the parameter $\hat{b}_1$ may drift to infinity since there is no
guarantee that $\hat{b}_1$ will be bounded. Moreover, in the presence of large modeling
error, large values of the design constants k1, k2 are needed in order to keep
the sets Ξ1, Ξ2 reasonably small. However, large values of k1, k2 may give rise to
high-gain feedback, which in turn may lead to instability. These important issues deserve
further investigation.
As has been shown by Theorem 1, the velocity error $\tilde{\xi}_2$ is guaranteed to be
uniformly ultimately bounded with respect to the set Ξ2. Consequently (the assumptions of the theorem being satisfied), if $\tilde{\xi}_2(0) \notin \Xi_2$, then there exists a finite
time $T_0 > 0$ such that $\tilde{\xi}_2(t) \in \Xi_2$, $\forall\, t \ge T_0$. This important, though not surprising,
result can be proved as follows. Recall that
\[
\dot{V} \le -(k_2+\alpha_1)\tilde{\xi}_2^2 - \Phi\,\varepsilon_s(\tilde{\xi}_2)\,\tilde{\xi}_2 \le -k_2\tilde{\xi}_2^2 - \Phi\,\varepsilon_s(\tilde{\xi}_2)\,\tilde{\xi}_2 \le -k_2|\tilde{\xi}_2|^2 + |\Phi||\tilde{\xi}_2|
\tag{4.24}
\]
Moreover, recall also that $\dot{V} \le 0$ whenever $|\tilde{\xi}_2| > \frac{|\Phi|}{k_2+\alpha_1}$. Then, integrating (4.24)
from 0 to $T_0$, we obtain
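\[
V(T_0) - V(0) \le \int_0^{T_0}\left(-k_2|\tilde{\xi}_2|^2 + |\Phi||\tilde{\xi}_2|\right)dt
\]
which becomes (according also to the Lyapunov function definition):
\[
\tfrac{1}{2}|\tilde{\xi}_2|^2 \le V(T_0) \le V(0) + \int_0^{T_0}\left(-k_2|\tilde{\xi}_2|^2 + |\Phi||\tilde{\xi}_2|\right)dt
\]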
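It is sufficient to show that there exists a finite $T_0$ such that
\[
V(0) + \int_0^{T_0}\left(-k_2|\tilde{\xi}_2|^2 + |\Phi||\tilde{\xi}_2|\right)dt \le \frac{1}{2}\left(\frac{|\Phi|}{k_2+\alpha_1}\right)^2
\]
or
\[
V(0) \le \left[\int_0^{T_0}\left(k_2|\tilde{\xi}_2|^2 - |\Phi||\tilde{\xi}_2|\right)dt + \frac{1}{2}\left(\frac{|\Phi|}{k_2+\alpha_1}\right)^2\right]
\tag{4.25}
\]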
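Because we assumed before that $\tilde{\xi}_2(0) \notin \Xi_2$, using (4.24) we have that
\[
-k_2|\tilde{\xi}_2|^2 + |\Phi||\tilde{\xi}_2| < 0 \qquad \forall t \in [0, T_0]
\]
Hence,
\[
\int_0^{T_0}\left(-k_2|\tilde{\xi}_2|^2 + |\Phi||\tilde{\xi}_2|\right)dt < 0
\tag{4.26}
\]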
Observe that by (4.26) the term in brackets in (4.25) is positive. Moreover, define
\[
\Gamma(T_0) = \int_0^{T_0}\left(k_2|\tilde{\xi}_2|^2 - |\Phi||\tilde{\xi}_2|\right)dt
\]
Then
\[
\frac{d\Gamma(T_0)}{dT_0} = k_2|\tilde{\xi}_2(T_0)|^2 - |\Phi(T_0)||\tilde{\xi}_2(T_0)| > 0
\]
Since $\Gamma(T_0)$ is monotonically increasing and positive, there exists a finite time $T_0$
which satisfies (4.25). Hence,
\[
|\tilde{\xi}_2(T_0)| \le \frac{|\Phi(T_0)|}{k_2+\alpha_1}
\]
which contradicts what we assumed before. If on the other hand we restrict $\tilde{\xi}_2(0) \in \Xi_2$,
then $T_0 = 0$.
Summing up, if Theorem 1 holds true, we have shown that the velocity error enters
the set Ξ2 in finite time. If such a set can be made sufficiently small, this result can
be exploited in the framework of fault detection as will be seen in the next section.
4.3 Fault Detectability Analysis
In the previous section, the robustness properties of the on-line approximation
scheme prior to the occurrence of a possible fault have been analyzed. Now, assume
that the conditions under which Theorem 1 holds true are satisfied and, accordingly,
let $T_0$ have the same meaning as before, that is, let it denote the time instant
at which the nominal trajectory of the velocity error $\tilde{\xi}_2(t)$ enters the set Ξ2 and
never leaves it, $\forall\, t \ge T_0$.
Now, consider the occurrence of a fault at time T in which case the dynamics
of the system is described by Eqs. (4.8)-(4.10). The following further assumption is
needed.
Assumption 4 The time instant T of fault occurrence satisfies T > T0 .
Clearly, if Assumption 4 is satisfied, no false alarm is generated prior to the occurrence of a fault, provided that $|\tilde{\xi}_2(t)|$ serves as the residual signal and the threshold
is selected as
\[
\rho \triangleq \frac{|\Phi|}{k_2} > \frac{|\Phi|}{k_2+\alpha_1}.
\tag{4.27}
\]
Remark 2. The threshold function that is used is the conservative $\rho = \frac{|\Phi|}{k_2}$ instead
of the uniform bound that appears in the definition of Ξ2, because the parameter
$\alpha_1$ is considered unknown.
The decision on the occurrence of a fault is made when the residual signal
exceeds the threshold, i.e., a fault has occurred if
\[
\exists\, T > T_0 \ \text{such that}\ |\tilde{\xi}_2(T)| > \rho
\]
This decision criterion reflects the intuitive fact that, to be detectable, a fault
should be large enough to make the residual exceed the threshold. In this respect,
it is thus very important to address the issue of fault detectability.
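The threshold (4.27) and the decision rule can be sketched in a few lines. This is a minimal illustration, not the implementation used in the thesis: the function names are hypothetical, and `phi_abs` stands for the current value of |Φ| computed from the parameter estimates.

```python
def threshold(phi_abs: float, k2: float) -> float:
    """Conservative threshold rho = |Phi| / k2 of (4.27); alpha1 is not needed."""
    return phi_abs / k2


def fault_detected(xi2_err: float, phi_abs: float, k2: float) -> bool:
    """Decision rule: declare a fault when the residual |xi2| exceeds rho."""
    return abs(xi2_err) > threshold(phi_abs, k2)
```

Because α1 > 0, this threshold is never smaller than the bound |Φ|/(k2 + α1) that defines Ξ2, which is why no false alarm is raised once the residual has entered Ξ2.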
The analysis in this section is deeply inspired by the basic work by Polycarpou
and Trunov [51] with two differences: the state vector is not assumed to be completely available for measurement and the on-line approximator operates ∀ t ≥ 0
and not only after detection of a fault.
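After occurrence of a fault (i.e., $t \ge T$), from Eqs. (4.9), (4.13), and (4.14) it
follows that
\[
\begin{aligned}
\dot{\tilde{\xi}}_2 ={}& -\alpha_5\tilde{\xi}_1 - (k_2+\alpha_1)\tilde{\xi}_2 + (\alpha_3\alpha(x_2)|x_2| - \alpha_2)z - \alpha_4\omega(x_1,x_2,z) + \phi + \tilde{\alpha}_1\hat{x}_2 \\
& + \tilde{\alpha}_5\hat{x}_1 - \tilde{\alpha}_4 u + \left((\alpha_3\alpha(x_2)|x_2| - \alpha_2)z - \alpha_1 x_2 - \alpha_4\omega(x_1,x_2,z)\right)\cdot\Delta F_1(x_1,x_2,t) \\
& - \alpha_5 x_1\Delta F_2(x_1,x_2,t)
\end{aligned}
\tag{4.28}
\]
(Recall that we assume a single fault scenario and thus $\Delta F_1$ and $\Delta F_2$ cannot be
simultaneously different from zero.) Moreover, let
\[
A \triangleq -\alpha_5\tilde{\xi}_1 + [\alpha_3\alpha(x_2)|x_2| - \alpha_2]z - \alpha_4\omega(x_1,x_2,z) + \phi + \tilde{\alpha}_1\hat{x}_2 + \tilde{\alpha}_5\hat{x}_1 - \tilde{\alpha}_4 u
\tag{4.29}
\]
\[
B_1 \triangleq [(\alpha_3\alpha(x_2)|x_2| - \alpha_2)z - \alpha_1 x_2 - \alpha_4\omega(x_1,x_2,z)]\cdot\Delta F_1(x_1,x_2,t)
\tag{4.30}
\]
\[
B_2 \triangleq -\alpha_5 x_1\Delta F_2(x_1,x_2,t)
\tag{4.31}
\]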
According to [51], the detectability analysis can be performed in both the abrupt
and the incipient fault cases. Specifically, an incipient time-profile for the fault
can be characterized by a multiplicative term $(1 - e^{-\pi(t-T)})$, where $\pi > 0$ is an
unknown constant that represents the rate of evolution of the fault. In case $\pi = \infty$
the fault becomes an abrupt one.
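The time profile described above can be sketched as follows. The function and argument names are hypothetical; `rate` plays the role of π, and returning 1 at the instant t = T in the abrupt case is a convention assumed here.

```python
import math


def fault_profile(t: float, T: float, rate: float) -> float:
    """Multiplicative time profile 1 - exp(-rate * (t - T)) for t >= T.

    Returns 0 before the fault time T. An infinite rate gives the abrupt
    (step) profile.
    """
    if t < T:
        return 0.0
    if math.isinf(rate):
        return 1.0
    return 1.0 - math.exp(-rate * (t - T))
```

A small `rate` gives a slowly developing (incipient) fault, while increasing it makes the profile approach a step at t = T.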
The following simple result (analogous to the one presented in [51] for generic nonlinear systems) characterizes, in an implicit way, the set of faults that can be detected
using the previously defined threshold.
Theorem 2 Assume that fault $\Delta F_i(x_1,x_2,t)$, for i = 1 or i = 2, occurs at time T.
If there exists a time interval $[T + t_1, T + t_2]$, with $t_2 > t_1 \ge 0$, such that
\[
\left|\int_{T+t_1}^{T+t_2} e^{-(k_2+\alpha_1)(T+t_2-\tau)}\left(1 - e^{-\pi(\tau-T)}\right)B_i\,d\tau\right|
\ge \rho + \rho e^{-(k_2+\alpha_1)(t_2-t_1)} + \left|\int_{T+t_1}^{T+t_2} e^{-(k_2+\alpha_1)(T+t_2-\tau)}A\,d\tau\right|
\tag{4.32}
\]
with $\rho = \frac{|\Phi|}{k_2}$ and Φ as defined in Theorem 1, then the fault is detected at time
$t = t_2$.
Proof. For any $t_2 > t_1$ the solution of (4.28), using (4.29) and (4.30) or (4.31), is
given by:
\[
\begin{aligned}
\tilde{\xi}_2(T+t_2) ={}& e^{-(k_2+\alpha_1)(T+t_2-T-t_1)}\tilde{\xi}_2(T+t_1) + \int_{T+t_1}^{T+t_2} e^{-(k_2+\alpha_1)(T+t_2-\tau)}A\,d\tau \\
& + \int_{T+t_1}^{T+t_2} e^{-(k_2+\alpha_1)(T+t_2-\tau)}\left(1 - e^{-\pi(\tau-T)}\right)B_i\,d\tau
\end{aligned}
\]
Using the triangle inequality and $|\tilde{\xi}_2(T+t_1)| \le \rho = \frac{|\Phi|}{k_2}$, we obtain:
\[
\begin{aligned}
|\tilde{\xi}_2(T+t_2)| \ge{}& -\rho e^{-(k_2+\alpha_1)(t_2-t_1)} - \left|\int_{T+t_1}^{T+t_2} e^{-(k_2+\alpha_1)(T+t_2-\tau)}A\,d\tau\right| \\
& + \left|\int_{T+t_1}^{T+t_2} e^{-(k_2+\alpha_1)(T+t_2-\tau)}\left(1 - e^{-\pi(\tau-T)}\right)B_i\,d\tau\right|
\end{aligned}
\]
Hence, if the fault function is such that (4.32) is satisfied, we obtain that
$|\tilde{\xi}_2(T+t_2)| \ge \rho$, which implies that the fault will be detected.
Estimation of the detection time
One of the most important characteristics of any fault diagnosis scheme is the time
(detection time) that elapses between the occurrence and the detection of a fault. Early
detection (i.e., a small detection time) is crucial to prevent the possibly catastrophic
consequences of a fault.
The following result (the proof is inspired again by [51]) gives an upper bound on
the detection time for abrupt and incipient faults.
Theorem 3 Assume that Theorem 2 holds. Moreover, suppose that there exist
lower bounds $B_{mi} \le B_i$, i = 1, 2, and an upper bound $\bar{A} > A$ such that, for i = 1, 2,
we have
\[
B_{mi} > \bar{A} + |\Phi|, \qquad \forall t \in [T + t_1,\; T + t_d]
\]
Then:
(a) incipient faults: an upper bound $t_d^+$ on the detection time $t_d$ is given by the
solution of the algebraic equation
\[
g_i(t_d^+, k_2+\alpha_1) - \left[g_i(t_1, k_2+\alpha_1) + |\Phi| - \bar{A}\right]e^{-(k_2+\alpha_1)(t_d^+ - t_1)} = \bar{A} + |\Phi|
\tag{4.33}
\]
where
\[
g_i(t, k_2+\alpha_1) = \frac{B_{mi}}{k_2+\alpha_1-\pi_i}\left(k_2+\alpha_1-\pi_i - (k_2+\alpha_1)e^{-\pi_i t}\right)
\tag{4.34}
\]
(b) abrupt faults: an upper bound $t_d^+$ on the detection time $t_d$ is given by
\[
t_d^+ = \frac{1}{k_2+\alpha_1}\ln\left[\frac{B_{mi} - \bar{A} + |\Phi|}{B_{mi} - \bar{A} - |\Phi|}\right] + t_1
\tag{4.35}
\]
Furthermore, in general, $t_d^+$ decreases monotonically as $k_2$ increases.
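The two bounds of Theorem 3 can be evaluated numerically. The sketch below is only illustrative: |Φ|, Ā and B_mi are treated as known constants, `k` stands for k2 + α1, the bisection solver for (4.33) is a choice made here rather than part of the thesis, and all names and numerical values are assumptions.

```python
import math


def g(t: float, k: float, rate: float, B_m: float) -> float:
    """g_i(t, k2 + alpha1) of (4.34); k = k2 + alpha1 must differ from rate."""
    return B_m / (k - rate) * (k - rate - k * math.exp(-rate * t))


def abrupt_bound(k: float, B_m: float, A_bar: float, phi_abs: float,
                 t1: float = 0.0) -> float:
    """Closed-form bound (4.35) for abrupt faults."""
    return math.log((B_m - A_bar + phi_abs) / (B_m - A_bar - phi_abs)) / k + t1


def incipient_bound(k: float, rate: float, B_m: float, A_bar: float,
                    phi_abs: float, t1: float = 0.0, iters: int = 200) -> float:
    """Root of (4.33) located by bisection (assumed solver)."""
    if B_m <= A_bar + phi_abs:
        raise ValueError("Theorem 3 requires B_m > A_bar + |Phi|")

    def residual(t: float) -> float:
        decay = math.exp(-k * (t - t1))
        return (g(t, k, rate, B_m)
                - (g(t1, k, rate, B_m) + phi_abs - A_bar) * decay
                - (A_bar + phi_abs))

    # residual(t1) = -2|Phi| < 0; expand the bracket until the sign changes.
    lo, hi = t1, t1 + 1.0 / k
    while residual(hi) <= 0.0:
        hi = t1 + 2.0 * (hi - t1)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return hi
```

With the assumed values k = 200.5, B_mi = 10, Ā = 2 and |Φ| = 1, the abrupt bound is roughly 1.25 ms after t1; a very large `rate` in `incipient_bound` approaches that value, and a slower fault yields a larger bound.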
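Proof. (a) As $\bar{A}$ is an upper bound on A, the following inequality holds:
\[
\left|\int_{T+t_1}^{T+t_2} e^{-(k_2+\alpha_1)(T+t_2-\tau)}A\,d\tau\right| \le \bar{A}\int_{T+t_1}^{T+t_2} e^{-(k_2+\alpha_1)(T+t_2-\tau)}\,d\tau = \frac{\bar{A}}{k_2+\alpha_1}\left(1 - e^{-(k_2+\alpha_1)(t_2-t_1)}\right)
\tag{4.36}
\]
Similarly, as $B_{mi}$ is a lower bound on $B_i$, we have:
\[
\begin{aligned}
&\left|\int_{T+t_1}^{T+t_2} e^{-(k_2+\alpha_1)(T+t_2-\tau)}\left(1 - e^{-\pi_i(\tau-T)}\right)B_i\,d\tau\right| \ge B_{mi}\int_{t_1}^{t_2} e^{-(k_2+\alpha_1)(t_2-\tau)}\left(1 - e^{-\pi_i\tau}\right)d\tau \\
&\quad= \frac{B_{mi}}{k_2+\alpha_1} - \frac{B_{mi}}{k_2+\alpha_1}e^{-(k_2+\alpha_1)(t_2-t_1)} - \frac{B_{mi}}{k_2+\alpha_1-\pi_i}e^{-\pi_i t_2} + \frac{B_{mi}}{k_2+\alpha_1-\pi_i}e^{-(k_2+\alpha_1)t_2 + (k_2+\alpha_1-\pi_i)t_1} \\
&\quad= \frac{B_{mi}}{(k_2+\alpha_1)(k_2+\alpha_1-\pi_i)}\Big[(k_2+\alpha_1-\pi_i) - (k_2+\alpha_1-\pi_i)e^{-(k_2+\alpha_1)(t_2-t_1)} - (k_2+\alpha_1)e^{-\pi_i t_2} \\
&\qquad\qquad + (k_2+\alpha_1)e^{-(k_2+\alpha_1)(t_2-t_1)}e^{-\pi_i t_1}\Big] \\
&\quad= \frac{B_{mi}}{(k_2+\alpha_1)(k_2+\alpha_1-\pi_i)}\left[(k_2+\alpha_1-\pi_i) - (k_2+\alpha_1)e^{-\pi_i t_2}\right] \\
&\qquad - \frac{B_{mi}}{(k_2+\alpha_1)(k_2+\alpha_1-\pi_i)}\left[(k_2+\alpha_1-\pi_i) - (k_2+\alpha_1)e^{-\pi_i t_1}\right]e^{-(k_2+\alpha_1)(t_2-t_1)} \\
&\quad= \frac{g_i(t_2, k_2+\alpha_1)}{k_2+\alpha_1} - \frac{g_i(t_1, k_2+\alpha_1)}{k_2+\alpha_1}e^{-(k_2+\alpha_1)(t_2-t_1)}
\end{aligned}
\tag{4.37}
\]
Hence, using (4.36) and (4.37), it follows that the detectability condition (4.32)
becomes
\[
\frac{g_i(t_2, k_2+\alpha_1)}{k_2+\alpha_1} - \frac{g_i(t_1, k_2+\alpha_1)}{k_2+\alpha_1}e^{-(k_2+\alpha_1)(t_2-t_1)}
\ge \rho e^{-(k_2+\alpha_1)(t_2-t_1)} + \rho + \frac{\bar{A}}{k_2+\alpha_1} - \frac{\bar{A}}{k_2+\alpha_1}e^{-(k_2+\alpha_1)(t_2-t_1)}
\tag{4.38}
\]
An upper bound on the detection time can thus be obtained by solving with respect
to the unknown $t_d^+$ the algebraic equation
\[
\frac{g_i(t_d^+, k_2+\alpha_1)}{k_2+\alpha_1} - \frac{g_i(t_1, k_2+\alpha_1)}{k_2+\alpha_1}e^{-(k_2+\alpha_1)(t_d^+-t_1)}
= \rho e^{-(k_2+\alpha_1)(t_d^+-t_1)} + \rho + \frac{\bar{A}}{k_2+\alpha_1} - \frac{\bar{A}}{k_2+\alpha_1}e^{-(k_2+\alpha_1)(t_d^+-t_1)}
\]
or, equivalently,
\[
g_i(t_d^+, k_2+\alpha_1) - \left[g_i(t_1, k_2+\alpha_1) + |\Phi| - \bar{A}\right]e^{-(k_2+\alpha_1)(t_d^+-t_1)} = \bar{A} + |\Phi|
\]
thus proving (4.33).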
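(b) Letting $\pi_i \to \infty$, it follows that (4.33) becomes
\[
B_{mi} - \left(B_{mi} + |\Phi| - \bar{A}\right)e^{-(k_2+\alpha_1)(t_d-t_1)} = \bar{A} + |\Phi|
\]
and hence
\[
e^{(k_2+\alpha_1)(t_d-t_1)} = \frac{B_{mi} - \bar{A} + |\Phi|}{B_{mi} - \bar{A} - |\Phi|}
\]
thus obtaining
\[
t_d = \frac{1}{k_2+\alpha_1}\ln\left[\frac{B_{mi} - \bar{A} + |\Phi|}{B_{mi} - \bar{A} - |\Phi|}\right] + t_1
\]
which proves (4.35).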
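Finally, let us show that $t_d^+$ decreases monotonically as $k_2$ increases. From (4.38)
and recalling that $\rho = \frac{|\Phi|}{k_2}$, we have
\[
\begin{aligned}
g_i(t_d, k_2+\alpha_1) - g_i(t_1, k_2+\alpha_1)e^{-(k_2+\alpha_1)(t_d-t_1)} &\ge \bar{A} + \frac{|\Phi|(k_2+\alpha_1)}{k_2} + (\bar{A} - |\Phi|)e^{-(k_2+\alpha_1)(t_d-t_1)} \\
&> \bar{A} + |\Phi| + (\bar{A} - |\Phi|)e^{-(k_2+\alpha_1)(t_d-t_1)}
\end{aligned}
\tag{4.39}
\]
It is useful to introduce the quantities
\[
\begin{aligned}
f &\triangleq g_i(t_d, k_2+\alpha_1) - g_i(t_1, k_2+\alpha_1)e^{-(k_2+\alpha_1)(t_d-t_1)} \\
z &\triangleq \bar{A} + |\Phi| + (\bar{A} - |\Phi|)e^{-(k_2+\alpha_1)(t_d-t_1)}
\end{aligned}
\]
The partial derivative of (4.34) with respect to $k_2$ gives
\[
\frac{\partial g_i}{\partial k_2} = \frac{B_{mi}\pi_i e^{-\pi_i t}}{(k_2+\alpha_1-\pi_i)^2}
\tag{4.40}
\]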
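Using (4.40), we obtain:
\[
\begin{aligned}
\frac{\partial f}{\partial k_2} ={}& \frac{B_{mi}\pi_i e^{-\pi_i t_d}}{(k_2+\alpha_1-\pi_i)^2} - \frac{B_{mi}\pi_i e^{-\pi_i t_1}}{(k_2+\alpha_1-\pi_i)^2}e^{-(k_2+\alpha_1)(t_d-t_1)} \\
& + \left[\frac{B_{mi}}{k_2+\alpha_1-\pi_i}\left(k_2+\alpha_1-\pi_i - (k_2+\alpha_1)e^{-\pi_i t_1}\right)\right](t_d-t_1)e^{-(k_2+\alpha_1)(t_d-t_1)} \\
={}& \frac{B_{mi}e^{-(k_2+\alpha_1)(t_d-t_1)}}{(k_2+\alpha_1-\pi_i)^2}\Big[\pi_i e^{(k_2+\alpha_1-\pi_i)(t_d-t_1)}e^{-\pi_i t_1} - \pi_i e^{-\pi_i t_1} + (k_2+\alpha_1-\pi_i)^2(t_d-t_1) \\
& - (k_2+\alpha_1-\pi_i)(k_2+\alpha_1)(t_d-t_1)e^{-\pi_i t_1} + \pi_i e^{-\pi_i t_1}(k_2+\alpha_1-\pi_i)(t_d-t_1) \\
& - \pi_i e^{-\pi_i t_1}(k_2+\alpha_1-\pi_i)(t_d-t_1)\Big] \\
={}& \frac{B_{mi}e^{-(k_2+\alpha_1)(t_d-t_1)}}{(k_2+\alpha_1-\pi_i)^2}\Big[\pi_i e^{-\pi_i t_1}\left(e^{(k_2+\alpha_1-\pi_i)(t_d-t_1)} - (k_2+\alpha_1-\pi_i)(t_d-t_1) - 1\right) \\
& + (k_2+\alpha_1-\pi_i)^2(t_d-t_1)\left(1 - e^{-\pi_i t_1}\right)\Big]
\end{aligned}
\]
As $e^m - m - 1 > 0$ for $m \ne 0$, it follows that $\frac{\partial f}{\partial k_2} > 0$ and that f increases monotonically as
$k_2$ increases. Moreover:
\[
\frac{\partial z}{\partial k_2} = -(\bar{A} - |\Phi|)(t_d-t_1)e^{-(k_2+\alpha_1)(t_d-t_1)}
\]
As $|A| \le \bar{A}$ (as an upper bound) and $|A| \ge |\Phi|$ (from A's definition), we have
$\bar{A} \ge |\Phi|$. All the above leads to the conclusion that $\frac{\partial z}{\partial k_2} < 0$ and that z decreases
monotonically as $k_2$ increases. Looking into (4.39) with the above results we note
that as $k_2$ increases, $t_d^+$ decreases.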
4.4 Simulation results
In this section, simulation results are given to show the potential and
possible limitations of the proposed methodology. Specifically, a simple example is
given to emphasize some of the key aspects of the technique.
Consider the nominal system with $m = 1$, $K = 1$, $\sigma_0 = 2$, $\sigma_1 = \sqrt{2}$, $\sigma_2 = 0.4$,
$f_c = 1$, $f_s = 1.5$, and $v_s = 0.001$. To implement the on-line approximator we have
employed HONNs, with sigmoid activation function $s(x) = \frac{m}{1+e^{-l(x-c)}} + \lambda$. Specifically,
for the term $\hat{w}_2^{\top} S_2(|x_2|)$ we have chosen a 5th-order HONN with $(m, l, c, \lambda) =
(0.8, -4, 2.119, -1.5)$, while for $\hat{w}_1^{\top} S_1(x_2,|x_2|)$ a 2nd-order HONN with $(m, l, c, \lambda) =
(1.41, -10.0225, 0.5974, -2.11)$. To highlight the fault detectability issue, we first
simulated the system with a fault of the form $\Delta F_1 = 20 + e^{10 x_2}$ occurring at
T = 60 sec (alteration in friction parameters). Then we simulated it for the
fault $\Delta F_2 = -1$ occurring at T = 60 sec, which represents the breaking of the spring.
In both cases the design parameters were $k_1 = 100$ and $k_2 = 200$. The input u
was $3\sin(0.2t)$. The results for faults $\Delta F_1$ and $\Delta F_2$ are depicted in Fig. 4.1 and
Fig. 4.2, respectively. The detection time at which $|\tilde{\xi}_2| \ge \rho$, where ρ is the threshold
defined in (4.27), was $t_d = 0.0076$ sec for the first one, while for the second one
$t_d = 0.0209$ sec. The subplots (4.1c-4.2c) depict the detectability condition ((4.32)), i.e. the quantity
\[
\left|\int_{T+t_1}^{T+t_2} e^{-(k_2+\alpha_1)(T+t_2-\tau)}B_i\,d\tau\right| - \rho - \rho e^{-(k_2+\alpha_1)(t_2-t_1)} - \left|\int_{T+t_1}^{T+t_2} e^{-(k_2+\alpha_1)(T+t_2-\tau)}A\,d\tau\right|,
\]
and confirm the occurrence of the faults when it becomes greater than zero. Fig. 4.3
shows the decreasing behavior of the detection time as a function of $k_2$.
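The activation function quoted above is easy to evaluate directly. The sketch below implements s(x) with the reported parameter tuples; the `regressor` helper, which stacks powers of s(x) up to the network order, is only an assumed stand-in, since the exact HONN structure (4.11) is not reproduced in this section.

```python
import math
from typing import List


def s(x: float, m: float, l: float, c: float, lam: float) -> float:
    """Sigmoid activation s(x) = m / (1 + exp(-l (x - c))) + lam.

    Written in a numerically stable form so that large |l (x - c)| does not
    overflow.
    """
    z = -l * (x - c)
    if z >= 0.0:
        e = math.exp(-z)
        return m * e / (1.0 + e) + lam
    return m / (1.0 + math.exp(z)) + lam


def regressor(x: float, order: int, m: float, l: float, c: float,
              lam: float) -> List[float]:
    """Hypothetical high-order regressor: [s(x), s(x)^2, ..., s(x)^order]."""
    base = s(x, m, l, c, lam)
    return [base ** p for p in range(1, order + 1)]


# Parameter tuples (m, l, c, lambda) reported in the text.
PARAMS_S2 = (0.8, -4.0, 2.119, -1.5)        # 5th-order HONN for S2(|x2|)
PARAMS_S1 = (1.41, -10.0225, 0.5974, -2.11)  # 2nd-order HONN for S1(x2, |x2|)
```

For example, `regressor(abs(x2), 5, *PARAMS_S2)` would give a five-term vector to be weighted by $\hat{w}_2$ under this assumed structure.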
4.5 Summary
In this chapter, we have presented an approach to detect faults in mechanical systems
with friction that perform linear motion. The friction is modeled with the aid of the
dynamic LuGre model. However all system nonlinearities and critical parameters
are assumed unknown. Moreover, the frictional internal state is not available for
measurement. The main contributions of this work are: 1) the development of an online neural network approximator for mechanical systems with friction that does not
require full state measurement; and 2) the derivation of fault detectability conditions
and the upper bounds of the detection time. Simulation results clarify and verify the
theoretical analysis. In the following chapter, the DAMADICS benchmark problem
is defined where the methodology developed here is applied yielding results that
clarify and verify, additionally, its reliability.
Figure 4.1: Behaviors of the: (a) position error $\tilde{\xi}_1 = x_1 - \hat{x}_1$; (b) velocity error
$\tilde{\xi}_2 = x_2 - \hat{x}_2$; (c) detectability condition. (Horizontal axes: time in sec; the fault instant is marked.)
Figure 4.2: Behaviors of the: (a) position error $\tilde{\xi}_1 = x_1 - \hat{x}_1$; (b) velocity error
$\tilde{\xi}_2 = x_2 - \hat{x}_2$; (c) detectability condition. (Horizontal axes: time in sec; the fault instant is marked.)
Figure 4.3: Dependence of detection time $t_d$ on the value of parameter $k_2$ (curves for $\Delta F_1$ and $\Delta F_2$; horizontal axis: design parameter $k_2$, vertical axis: detection time $t_d$).
Chapter 5
Fault Detection in the DAMADICS Benchmark Problem
In this chapter we concentrate on detecting faults in the DAMADICS¹
actuator² benchmark problem, applying the methodology developed in Chapter 4. In
the framework of the DAMADICS research network funded by the European Union,
a benchmark model was developed to approximate the behavior of the evaporation
stage of a sugar factory in Lublin (Poland). The actuators under consideration consist
of a control valve, a pneumatic linear servomotor and a positioner. In this kind of
electromechanical system, the presence of friction phenomena is unavoidable and
significantly increases the complexity of the FD problem.
¹ The author acknowledges funding support under the EC RTN contract (RTN-1999-00392)
DAMADICS. Thanks are expressed to the management and staff of the Lublin sugar factory,
Cukrownia Lublin SA, Poland, for their collaboration and provision of manpower and access to
their sugar plant.
² An actuator, or final control element, is a physical device, structure or assembly of devices acting
on the controlled process.
5.1 Plant description
The plant under consideration is the sugar factory Cukrownia Lublin S.A., located in
Lublin (Poland). Specifically, we consider the evaporation process, where the main
task is to thicken the beet juice coming from the cleaning and filtering stages with
minimum heat-energy consumption. The first three sections work with natural
juice circulation and the last two work with juice circulation forced by pumps. We
focus on the first section, consisting of one evaporator and containing an important
actuator, located on the inflow of thin juice and controlling its level in the first stage
of the evaporation station.
Figure 5.1: A control valve-pneumatic servomotor-positioner device.
As shown in Fig. 5.1, the actuator is made of three main components [55]:
• Control valve driven by a servomotor, which is used to prevent, to allow and/or
to limit the flow of fluids.
• Spring-and-diaphragm pneumatic servomotor; this is a compressible-fluid-powered device in which the fluid acts upon the flexible diaphragm, thus providing
linear motion of the servomotor stem.
• Positioner; this device is used to eliminate control-valve stem mispositions
due to external or internal sources such as friction, hydrodynamic forces, etc.
Table 5.1: Explanation of the symbols of the pneumatic servomotor and its physical
layout.
Symbol   Meaning                      Symbol   Meaning
ks       Spring constant              FfV      Viscosity friction force
kd       Diaphragm constant           FfC      Coulomb friction force
ps       Air pressure in chamber      Fvc      Vena-contracta force
Fn       Normal packing force         FdA      d'Alembert force
Fp       Active force                 x        Rod's displacement
Fg       Gravity force                x0       Initial spring compression
Fs       Spring compression force     m        Mass of rod, valve, diaphragm
Fig. 5.2 shows a more detailed overview of the servomotor as well as its physical
layout; the effects (forces) of the other two components are emphasized (the meaning
of the symbols is straightforward and is presented in Table 5.1).
Figure 5.2: The pneumatic servomotor and its physical layout.
A rather detailed dynamic model of the above evaporation process (and of the
actuator as well) has been developed and validated in the context of the DAMADICS
research training network. There, the unavoidable friction effects are modelled by means of
suitable hysteresis functions. In this work, instead, the frictional effects are described
by the already mentioned dynamic LuGre model. The reason for using this
friction model is that it is able to capture important phenomena such as presliding
displacement, frictional lag, stick-slip motion, etc. Another important reason is
that, in the considered actuator, the motion is a low-velocity motion.
In such a case, the friction nonlinearities dominate and the LuGre model is very
suitable to characterize these nonlinear effects.
5.2 Problem formulation for DAMADICS case
The linear motion provided by the servomotor device, the use of the LuGre model,
and the fault definition given by the DAMADICS benchmark motivated us
to apply the methodology developed here. It is important to clarify that the above-described dynamic model for a mechanical system with friction phenomena in both
nominal and faulty modes of operation has a different structure with respect to
the DAMADICS model. However, the complexity of the DAMADICS model rules
out the possibility of using it in the framework of a nonlinear model-based FD
algorithm. Therefore, the key idea is to determine a suitable LuGre model whose
behavior is very similar to the DAMADICS one from an input-output perspective.
This will allow us to use the LuGre model to design a model-based FD scheme as
described in the previous sections. In Fig. 5.3, this intuitive idea is shown in a
schematic way.³
In this respect, using the theoretical results, the approximator's output will serve as
the residual signal for fault detection. Owing to the convergence analysis presented
before, it follows that Φ can be used to define the threshold function ρ as:
\[
\rho = \frac{|\Phi|}{k_2}
\tag{5.1}
\]
Choosing now $\tilde{\xi}_2$ as the residual signal with its corresponding threshold (5.1), we can
say that a fault will be detected when $|\tilde{\xi}_2| \ge \rho$.
Figure 5.3: Architecture of the adaptive on-line approximation scheme.
³ The use of the LuGre model needs the velocity measurement, which is not available in the
DAMADICS actuator case. However, the velocity can be easily estimated using the position measurements by means of a suitably designed Kalman filter.
5.3 DAMADICS Simulation Results
In this part we present the simulation results regarding actuator faults introduced in
the friction and in the servomotor's spring. For the friction fault, an increase of valve
or bushing friction is considered. Mechanical wear, air pollution, corrosion products
and sedimentation are the causes and the physical interpretation of
the fault related to friction. For the servomotor's spring
fault, on the other hand, the harsh environment causes fatigue or corrosion of the spring material. The
results were obtained according to the scheme that is depicted in Fig. 5.4.
P1 and P2 represent the pressures before and after the control valve and were set
to $3.5 \cdot 10^6$ Pa and $2.6 \cdot 10^6$ Pa, respectively. T represents the water temperature
and was 20 °C. CV is the control value, which takes values in [0, 1]: a value of “1”
means that the valve is closed, whereas a value of “0” means a fully opened valve. The
output of the benchmark model, X1, represents the rod's displacement. To implement the on-line approximator we have employed High Order Neural Networks
(HONNs), with sigmoid activation function of the form $s(x) = \frac{m}{1+e^{-l(x-c)}} + \lambda$. Specifically,
for the term $\hat{w}_2^{\top} S_2(|x_2|)$ we have chosen a 5th-order HONN with $(m, l, c, \lambda) =
(0.8, -4, 2.119, -1.5)$, while for $\hat{w}_1^{\top} S_1(x_2,|x_2|)$ a 2nd-order HONN with $(m, l, c, \lambda) =
(1.41, -10.0225, 0.5974, -2.11)$. The design constants $k_1$ and $k_2$ were set to 100
and 400, respectively. The outputs X10 and X20 of the on-line approximator represent the estimated position and velocity, respectively. As simulations have been
carried out in a noise-free environment, the velocity was estimated by introducing
a high-pass filter. According also to the benchmark definition, the faults are standardized to the range [−1, 1]. The limiting values “−1” and “1” correspond to some
pre-defined states or physical values ($\Delta f_{min}$, $\Delta f_{max}$). Fault notations are given in
Table 5.2.
Figure 5.4: Architecture used for DAMADICS simulation trials.
The type of faults can be either abrupt or incipient. More specifically, the fault
concerning friction is an incipient one. The fault that we simulated occurs at t = 70
sec and takes its final value “1” after 20 sec. A detection decision (0: no fault, 1:
fault) is made when $|\tilde{\xi}_2| \ge \rho$ for more than one sample time. The simulation
results are depicted in Fig. 5.5.
Table 5.2: Fault specifications.
Fault value   Friction fault        Servomotor spring fault
−1            no friction           spring's perforation
0             unchanged friction    no fault
1             advanced friction     spring's tightness
The conclusion that can be drawn from Figs. 5.5(a)-(b) is that the adaptive
scheme is able to learn on line the behavior of the model with very small errors. In
Fig. 5.5(c), $|\tilde{\xi}_2|$ and the corresponding threshold ρ are plotted together.
As mentioned before, a fault decision is taken when $|\tilde{\xi}_2| \ge \rho$ for more than one
sample time. Specifically:
\[
|\tilde{\xi}_2(t)| \ge \rho(t) \quad \text{AND} \quad |\tilde{\xi}_2(t+\Delta t)| \ge \rho(t+\Delta t)
\]
where t is the time instant at which $|\tilde{\xi}_2| \ge \rho$ and Δt is the sampling step. This is the
reason why no fault indication is turned on before the actual occurrence of the fault
(see Fig. 5.5(e)), despite some spikes occurring before the time of fault occurrence
(see Fig. 5.5(c)).
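The two-sample persistence rule described above can be sketched as follows. This is an illustrative reading of the rule, not the code used for the simulations; the function name and the sample sequences are hypothetical, and `n_consecutive = 2` corresponds to the condition at t and t + Δt.

```python
from typing import Optional, Sequence


def first_detection(residual: Sequence[float], threshold: Sequence[float],
                    n_consecutive: int = 2) -> Optional[int]:
    """Index of the sample at which a fault decision is first made.

    A decision requires |residual| >= threshold on n_consecutive successive
    samples, so isolated one-sample spikes are ignored. Returns None if no
    decision is made.
    """
    run = 0
    for i, (r, rho) in enumerate(zip(residual, threshold)):
        run = run + 1 if abs(r) >= rho else 0
        if run >= n_consecutive:
            return i
    return None
```

For example, with a constant threshold of 0.01 the residual sequence [0.001, 0.02, 0.001, 0.02, 0.03] contains an isolated spike at index 1 that is ignored, and the decision is made at index 4.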
As can be noticed from Fig. 5.5(e), the fault is detected at t = 81.12 sec. The
fault strength at this time instant is about 50% of its final value (Fig. 5.5(d)), a
characteristic which can protect the overall system from serious damage in time.
Similar comments can be made when we simulate the system with the servomotor spring fault (see Fig. 5.6), which, according to the benchmark definition, is an
abrupt fault. In this case, the fault is detected at t = 70.005 sec.
Figure 5.5: Behaviors of (a) position error $\tilde{\xi}_1$; (b) velocity error $\tilde{\xi}_2$; (c) absolute
value of the velocity error $|\tilde{\xi}_2|$ and threshold ρ; (d) fault evolution; (e) detection decision;
(f) control value (CV). (Horizontal axes: time in sec.)
fault
5
(a)
6
4
2
0
x 10
2
3
0x 10
10
20
40
60
80
100
120
40
60
80
100
120
20
40
60
80
100
120
20
40
60
80
100
120
20
40
60
80
100
120
20
40
60
80
100
120
5
(b)
0
5
0
0.01
(c)
(d)
(e)
(f)
0.005
0
0
1.5
1
0.5
0
0.5
0
1.5
1
0.5
0
0.5
0
1.5
1
0.5
0
0.5
0
20
AVE of velocity error
Threshold
Time(sec)
Figure 5.6: Behaviors of (a) position error ξ˜1 . (b) velocity error ξ˜2 . (c) absolute
value of velocity error, |ξ˜2 | threshold ρ. (d) A fault evolution. (e) detection decision.
(f) control Value (CV).
Bibliography
[1] Arkan Kayihan, Francis J. Doyle III, “Friction compensation for a process
control valve,” Control Engineering Practice, 8, pp.799-812, 2000.
[2] B. Armstrong, “Dynamics for robot control: friction modeling and ensuring
excitation during parameter identification,” Ph.D thesis, Dept. of Electrical
Engineering, Stanford University, Stanford Computer Science Memo STAN-CS-88-1205, 1988.
[3] B. Armstrong-Hélouvry, Control of Machines with Friction. Kluwer Academic
Publishers, Norwell, MA, 1991.
[4] B. Armstrong-Hélouvry, “A perturbation analysis of stick-slip,” in R. A. Ibrahim
and A. Soom (Eds.), Friction-Induced Vibration, Chatter, Squeal, and Chaos,
Proc. ASME Winter Annual Meeting, Anaheim, DE-Vol. 49, ASME, NY,
pp.41-48.
[5] B. Armstrong-Hélouvry and P. Dupont, “Friction modeling for controls,
and Compensations Techniques for Servos with Friction,” in Proc. American
Control Conference, pp.1905-1915, 1993.
[6] B. Armstrong-Hélouvry, P. Dupont and Canudas de Wit, “A survey of models, analysis tools and compensation methods for the control of machines with
friction,” Automatica, vol. 30, No. 7, pp.1083-1138, 1994.
[7] Canudas de Wit, K.J. Åström, and K. Braun, “Adaptive friction compensation
in DC motor drives”, in IEEE Conference on Robotics and Automation, 3:1556-1561, December, 1986.
[8] Canudas de Wit, P. Noel, A. Auban and B. Brogliato, “Adaptive Friction
Compensation in Robot Manipulators: Low Velocities,” International Journal
of Robot Research, vol. 10, No. 3, pp.189-199, 1991.
[9] Canudas de Wit, H. Olsson, K.J. Åström, and P. Lischinsky, “A New Model
for Control of Systems with Friction,” IEEE Trans. on Automatic Control, vol.
40, No. 3, pp.419-425, March 1995.
[10] Canudas de Wit, S.S. Ge, “Adaptive Friction Compensation for Systems with
Generalized Velocity/Position Friction Dependency”, in Proc. 36th Conf Decis
Contr, pp.2465-2470, December, 1997.
[11] Chua L.O and Stromsmoe, “Mathematical model for dynamic hysteresis loops,”
Int.J.Eng.Sci., vol. 19, pp. 435-450, 1971
[12] Chua L.O and Bass S.C, “A generalized Hysteresis model,” Trans. on Circuit
Theory, vol. CT-19, No. 1, pp. 36-48, January 1972.
[13] Chun-Yi Su, Y. Tan, Y. Stepanenko, “Adaptive Control of a Class of Nonlinear
Systems Preceded by an Unknown Backlash-Like Hysteresis”, in Proc. 39th
Conf Decis Contr, 2000.
[14] Cincinnati Milacron, Revised Stick-Slip Test Procedure, 1986.
[15] Dahl, P.R “A solid friction model”, TOR-158(3107-18), The Aerospace Corporation, El Segundo, CA, 1968.
[16] Dahl, P.R “Solid friction damping of mechanical vibrations,” AIAA J., 14(12),
pp.1675-1682, 1976.
[17] Dahl, P.R “Measurement of solid friction parameters of ball bearings,” Proc
of 6th Annual Symp on Incremental Motion, Control Systems and Devices,
University of Illinois, ILO, 1977.
[18] J.R. Rice and A.L. Ruina, “Stability of steady frictional slipping,” J. Applied
Mechanics, vol. 50, No.2, 1983
[19] G. Tao and P.V. Kokotović, “Continuous-time Adaptive control of systems with
unknown backlash,” IEEE Trans. on Automatic Control, vol. 40, pp.1083-1087,
1995a.
[20] G. Tao and P.V. Kokotović, “Adaptive control of plants with unknown hystereses,” IEEE Trans. on Automatic Control, vol. 40, pp.200-212, 1995b.
[21] G. Tao and P.V. Kokotović, Adaptive Control of Systems with Actuator and
Sensor Nonlinearities, Wiley Interscience, NY, 1996.
[22] J. Swevers, Farid Al-Bender, Chris G. Ganseman and Tutuko Prajogo, “An Integrated friction model structure with improved presliding behavior for accurate
friction compensation,” IEEE Trans. on Automatic Control, vol. 45, No. 4, pp.
675-686, April 2000.
[23] Mayergoyz I.D., Mathematical models of Hysteresis. New York: Springer-Verlag, 1991.
[24] Macki J.W., Nistri P. and Zecca P., “Mathematical models for hysteresis,”
SIAM Review, vol. 35, pp.94-123, 1993.
[25] S. Kato, K. Yamaguchi and T. Matsubayashi, “Some considerations of characteristics of static friction of machine tool slideway,” J. of Lubrication Technology, 94(3), pp.234-247, 1972.
[26] Gertler J.J., Fault Detection and Diagnosis in Engineering Systems New York:
Marcel Dekker, 1998
[27] Chen J., Patton R.J., Robust Model-based Fault Diagnosis for Dynamic Systems
Massachusetts, USA: Kluwer Academic Publishers, 1999
[28] P.M. Frank, “Fault diagnosis in dynamic systems using analytical and
knowledge-based redundancy - a survey and some new results,” Automatica,
vol. 26, pp.459-474, 1990.
[29] J.J. Gertler, “Fault detection and isolation using parity relations,” Control Eng.
Practice, vol. 5, No. 5, pp.653-661, 1997.
[30] Gertler J.J., “Analytical redundancy methods in fault detection and isolation,”
Proc. of the IFAC/IMACS Symposium SAFEPROCESS, pp.9-21.
[31] Magni J.F. and Ph. Mouyon, “On the residual generation by observer and
parity space approaches,” IEEE Trans. on Automatic Control, vol. 39, pp.441-447, 1994.
[32] Ding S.X, E.L. Ding and T.Jeinsch, “An approach to analysis and design of
observer and parity relation based FDI systems,” Proc. The 14th IFAC World
Congr. Beijing, 1999.
[33] Zhuang Z. and P.M. Frank, “A fault detection scheme based on stochastic qualitative modeling,” Proc. The 14th IFAC World Congr. Beijing, 1999.
[34] Kuipers B. “Qualitative simulation,” Artificial Intelligence, vol 66, No. 29,
pp.289-338, 1986
[35] Kay H. and B. Kuipers, “Numerical behavior envelopes for qualitative models,”
Proceedings of the 7th National Conference on Artificial Intelligence, pp.606-613, 1993.
[36] Bonarini A. and G. Bontempi, “A qualitative simulation approach for fuzzy
dynamical models”, ACM transactions on Modeling and Computer Simulation,
vol 4, No. 4, pp.285-313, 1994.
[37] Lunze J. “Qualitative modeling of linear dynamical systems with quantized
state measurements”, Automatica, vol 30, No. 3, pp.417-431, 1994.
[38] Chen J., Lopez-Toribio C.J and Patton R.J., “Non-linear dynamic systems fault
detection and isolation using fuzzy observers”, Proc. Instn. Mech. Engrs. 213,
Part 1, 1999.
[39] Marcu T., Matcovschi M.H. and Frank P.M., “Neural observer-based approach
to fault detection and isolation of a three-tank system”, ECC’99, Karlsruhe,
1999.
[40] Marcu T., Matcovschi M.H. and Frank P.M., “Neural observer-based approach
to fault tolerant control of a three-tank system”, ECC’99, Karlsruhe, 1999.
[41] Calado J.M.F. and J.M.G. Sa da Costa, “On-line fault detection and diagnosis
based on a coupled system”, ECC’99, Karlsruhe, 1999.
[42] N.E. Cotter, “The Stone-Weierstrass theorem and its applications to neural
networks”, IEEE Trans. on Neural Networks, vol.1, pp.290-295, 1990.
[43] G.Cybenko, “Approximations by superpositions of a sigmoidal function”, Mathematics of Control, Signals, and Systems, vol. 2 pp.303-314, 1989.
[44] Neuro-Control Systems: Theory and Applications. M.M. Gupta and D.H. Rao
(Eds.), New York, IEEE Press, 1994.
[45] P.A. Ioannou and J. Sun. Robust Adaptive Control. Englewood Cliffs,
NJ:Prentice Hall, 1995.
[46] T. Poggio and F. Girosi, “Regularization algorithms for learning that are equivalent to multilayer networks”, Science, vol. 247, pp.978-982, 1990.
[47] G.A. Rovithakis and M.A. Christodoulou. Adaptive Control with Recurrent
High-Order Neural Networks. Springer-Verlag, London, 2000
[48] Handbook of Intelligent Control: Neural, Fuzzy and Adaptive Approaches. D.A.
White and D.A. Sofge (Eds.). New York, IEEE Press, 1994
[49] A. R. Barron, “Universal approximation bounds for superpositions of a sigmoidal function,” IEEE Trans. on Information Theory, vol. 39, pp. 930–945,
1993.
[50] R. Zoppoli, M. Sanguineti, and T. Parisini, “Approximating networks and extended Ritz method for the solution of functional optimization problems,” Journal of Optimization Theory and Applications, vol. 112, n. 2, pp. 403–439, 2002.
[51] M. M. Polycarpou and A. B. Trunov, “Learning Approach to Nonlinear Fault
Diagnosis: Detectability Analysis,” IEEE Trans. on Automatic Control, Vol.
45, No. 4, pp.806-812, 2000.
[52] M.A. Demetriou and M.M. Polycarpou, “Incipient fault diagnosis of dynamical
systems using on-line approximators,” IEEE Trans. on Automat. Contr., vol.
43, pp.1612-1617, 1998.
[53] M.M. Polycarpou and M.A. Helmicki, “Automated fault detection and accommodation: A learning approach,” IEEE Trans. Syst. Man, Cybern., vol. 25, pp.1447-1458, 1995.
[54] E. Alcorta Garcia and P.M. Frank, “Deterministic nonlinear observer-based
approaches to fault diagnosis: a survey,” Control Eng. Practice, vol. 5, No. 5,
pp.663-670, 1997.
[55] M.Z. Bartys, “Specification of actuators intended to use for benchmark definition”, Technical Report, Poland, 2002.
Vita
...
Permanent Address: Solonos 101 str.
P.O. 10678 Athens
GREECE
This thesis was typeset with LaTeX 2ε by the author.
Footnote 4: LaTeX 2ε is an extension of LaTeX. LaTeX is a collection of macros for TeX. TeX is a trademark
of the American Mathematical Society. The macros used in formatting this thesis were written by
Dinesh Das, Department of Computer Sciences, The University of Texas at Austin, and extended
by Bert Kay and James A. Bednar.