Noise Modeling, State Estimation and System Identification in Linear Differential-Algebraic Equations

Markus Gerdin, Thomas Schön
Control & Communication
Department of Electrical Engineering
Linköpings universitet, SE-581 83 Linköping, Sweden
E-mail: [email protected], [email protected]
1st November 2004
Report no.: LiTH-ISY-R-2639
Submitted to CCSSE 2004
This paper treats how parameter estimation and Kalman filtering can be performed using a Modelica model. The procedures for doing this have been developed earlier by the authors, and are here exemplified on a physical system. It is concluded that the parameter and state estimation problems can be solved using the Modelica model, and that the parameter estimation and observer construction could to a large extent be automated with relatively small changes to a Modelica environment.
Keywords: DAE, Differential-Algebraic Equations, Modelica, System Identification, Parameter Estimation, Observer, Kalman Filter.
Introduction

This paper is the result of a project within the PhD course Project-Oriented Study (POS), which is a part of the graduate school ECSEL. The aim of the project was to model a physical system in Modelica, estimate unknown parameters in the model using measurements from a real process, and then implement a Kalman filter to estimate unknown states. These steps are described in the paper. The work is based on theory which has been developed earlier by the authors [21, 10].
In the paper it is concluded that the estimation of unknown parameters and construction of observers (Kalman
filters) could be automated to a large extent, if relatively
small additions are introduced in the Modelica modeling
environment. This could significantly reduce the time for
modeling a system, estimating parameters, and constructing observers.
Process Description
The process we are studying in this project resembles the problem that occurs in power transfer through weak axles, such as the rear axle in trucks. This problem is studied within the area of power train control, which is an important research area for the automotive industry. The problem is that when large torques are generated by the engine, the drive shaft winds itself up like a spring. In the gear shifting process, when the engine torque is removed, this spring will affect the gearbox, since the drive shaft winds itself back again. This is an undesirable behavior, since it significantly slows down the gear shifting process, which is a severe problem for automatic gearboxes. Furthermore, this torque will expose parts in the gearbox to wear. Hence, it is desirable to eliminate the elastic behavior of the drive shafts. This can be done by closing the loop using feedback
control. In order to be able to construct a good controller
a model of the process is needed and the internal states of
this model have to be estimated. We will in this report study
the modeling and state estimation problems for this process.
However, since we do not have access to a real truck we will
use a laboratory scale process, which contains the same phenomena. The engine is substituted with a DC-motor and a
gearbox. The drive shafts and the load have been replaced
by a spring and a heavy metal disc. In Figure 1, a photo of
the laboratory process is given.
Object-Oriented Modeling
Simulation of dynamical systems has traditionally been
performed by writing equations describing the system as
ordinary differential equations (ODE), and then integrating them using an ODE solver. Tools such as SIMULINK have made this process easier by allowing graphical modeling of the ODE, but in principle the user must still transform the equations to ODE form manually. In recent years,
object-oriented modeling languages have become increasingly popular. The most promising example of such a
language is probably Modelica [8, 23], which is the language that will be used in this work. (More specifically, we
will use the Dymola implementation of Modelica.) Object-oriented languages have the advantage of being equation-based, which means that the user can enter the equations describing the system without having to transform them into ODE form. The equations will instead be in differential-algebraic equation (DAE) form, which means that modeling environments must be able to simulate such models. The term object-oriented means that equations describing a commonly used system can be packaged, stored and reused. It can for example be convenient to not have to enter the equations describing a resistor every time it is used in a model. Modeling environments for object-oriented modeling languages usually allow graphical modeling, which means that different sub-models are selected from component libraries and then connected using a graphical interface.

Figure 1. The examined process (DC motor, spring, metal disc, and angle sensor).

System Identification

In this section we will describe how the unknown parameters in the model have been estimated from measured data. First, we will discuss some basic theory for estimation of unknown parameters in DAE:s in Section 2.1.

Theory for Parameter Estimation in DAE:s

In the general case, a model from an object-oriented modeling tool is a nonlinear DAE, which can be written as

F(ξ̇(t), ξ(t), u(t), θ) = 0,    (1a)
y(t) = h(ξ(t), θ).    (1b)

Here, u(t) is a known input, y(t) is the measured output, ξ(t) is a vector of auxiliary variables used to model the system, and θ is a vector of unknown parameters. The problem is to estimate the constant values of the parameters θ using measurements of u(t) and y(t).

A straightforward way to solve this problem is to solve the optimization problem

min_θ Σ_t (ŷ(t|θ) − y(t))²,    (2)

where the predictor ŷ(t|θ) is selected as a simulated output from the model, and y(t) is the measured output. In system identification terms, this corresponds to prediction error identification with an output-error (OE) model. An output-error model corresponds to the assumptions that the measurement (1b) is corrupted by noise and that the DAE (1a) holds exactly. For a more thorough discussion on noise models in system identification see, e.g., [14]. In the case when the DAE is linear,

E(θ)ξ̇(t) = J(θ)ξ(t) + K(θ)u(t),    (3a)
y(t) = C(θ)ξ(t),    (3b)

other noise models than output-error can be used, which can give better estimates of the unknown parameters. With other noise models than output-error, (2) can still be used, but ŷ(t|θ) will then be a predicted output, which in the linear case is calculated by the Kalman filter. To implement a Kalman filter for (3), different transformations can be used. The method used in this work is to transform the DAE into state-space form using the transformation sketched below:

E ξ̇ = Jξ + Ku ⇒

[I 0; 0 N][ẇ1; ẇ2] + [−A 0; 0 I][w1; w2] = [B1; D1]u ⇒

ẇ1 = Aw1 + B1u,
w2 = Σ_{i=0}^{m−1} (−N)^i D1 u^(i) ⇒

ẋ = Ãx + B̃u^(m−1).

The first step is calculated using numerical software, and must therefore be repeated for every parameter value for which a state-space description is necessary. Note that in the last transformation, the input might be redefined as one of its derivatives. For a more thorough discussion on this see, e.g., [9] or [10].

Parameter Estimation for Laboratory Process

Since the laboratory process that was modeled in Modelica contains some unknown parameters, the technique described in the previous section was used to estimate them. The model is linear, so the equations can be written as

E(θ)ξ̇(t) = J(θ)ξ(t) + K(θ)u(t),    (8a)
y(t) = C(θ)ξ(t).    (8b)
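When E(θ) is non-singular, the transformation to state-space form reduces to a matrix inversion; a minimal numpy sketch of that special case (the matrices below are hypothetical placeholders, not the laboratory model; a singular E needs the canonical-form transformation described above):

```python
import numpy as np

def dae_to_ss(E, J, K):
    """Convert the linear DAE E*dx/dt = J*x + K*u to state-space form
    dx/dt = A*x + B*u, assuming E is non-singular. The singular case
    requires the canonical-form transformation sketched in the text."""
    A = np.linalg.solve(E, J)  # E^{-1} J
    B = np.linalg.solve(E, K)  # E^{-1} K
    return A, B

# Hypothetical 2x2 example
E = np.array([[2.0, 0.0], [0.0, 1.0]])
J = np.array([[0.0, 2.0], [-1.0, -0.5]])
K = np.array([[0.0], [1.0]])
A, B = dae_to_ss(E, J, K)
```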
During the identification experiment, the input was the voltage to the motor. Two outputs were measured, the angle of
the mass after the spring, and the angular velocity of the
mass before the spring.
The actual identification was performed using the System Identification Toolbox for MATLAB. It was therefore necessary to transform the Modelica model to the format (8) in MATLAB. This was performed manually, but the procedure could quite easily be automated if it were possible to specify inputs, outputs, and unknown parameters in Modelica. This is an important subject for future work, since grey-box identification could then be performed by first modeling the system using Modelica, and then estimating unknown parameters in MATLAB without having to manipulate any equations manually.
The data from the process was collected with the sampling interval 0.01 s. However, this was too fast for this process and would have required higher order models, so the signals were resampled at 0.05 s. The input voltage (a sum of sinusoids) was multiplied by 90, since the voltage is passed through an amplifier that amplifies it 90 times. The mean values were also removed from the data, and the data set was divided into estimation and validation data. The preprocessed data that was used for the identification is shown in Figures 2 and 3.
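The preprocessing chain (decimation from 0.01 s to 0.05 s, rescaling the input by the amplifier gain 90, mean removal, and an estimation/validation split) can be sketched as follows; the signals are synthetic placeholders, and a proper resampling would also low-pass filter before decimating:

```python
import numpy as np

def preprocess(u, y, decimate=5, gain=90.0, split_frac=0.5):
    """Decimate (0.01 s -> 0.05 s for decimate=5), scale the input by
    the amplifier gain, remove means, and split into estimation and
    validation parts. Note: no anti-alias filter, for brevity."""
    u = gain * u[::decimate]
    y = y[::decimate]
    u = u - u.mean()
    y = y - y.mean()
    n = int(split_frac * len(u))
    return (u[:n], y[:n]), (u[n:], y[n:])

# Placeholder signals standing in for the logged voltage and angle
t = np.arange(0, 10, 0.01)
u = np.sin(2 * np.pi * 0.5 * t) + np.sin(2 * np.pi * 1.3 * t)
y = np.cos(2 * np.pi * 0.5 * t)
est, val = preprocess(u, y)
```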
Figure 3. Input-output data from motor voltage to angular velocity before the spring. The
data to the left of the vertical line is estimation data, and the data to the right is validation data.
Figure 2. Input-output data from motor voltage to the angle after the spring. The data to the left of the vertical line is estimation data, and the data to the right is validation data.

To get a feeling for the system, two ARX models (from voltage to angular velocity before the spring and from voltage to angle after the spring, respectively) were estimated. The orders of the models were selected to be the same as for the physical Modelica model of the system (three continuous-time poles from voltage to angular velocity before the spring, and four continuous-time poles to angular velocity after the spring). As can be seen in Figures 4 and 5, the fit is quite good.

By noting that the transfer functions from voltage to angular velocity before the spring from the ARX model and from the physical model respectively should be the same, it is possible to quite easily estimate the value of k:

G_ARX(s) = (13.68s² + 621.9s + 6128) / (s³ + 24.95s² + 1394s + 11180)

G_physical(s) = −30k(J2s² + ds + c) / (900RJ1J2s³ + (900J2k² + 900RJ1d + J2Rd)s² + (900dk² + 900RJ1c + J2Rc)s + 900ck²)

We see that an approximate value of k is given by

k ≈ 11180 / (6128 · (−30)) ≈ −0.06.    (11)

This value will later be used as an initial value for k in the parameter estimation.
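The estimate (11) follows from matching the DC gains G_ARX(0) = 6128/11180 and G_physical(0) = −30k·c/(900ck²) = −1/(30k); as a quick numeric check:

```python
# Reproducing the initial estimate (11): equate the DC gains
# G_ARX(0) = 6128/11180 and G_physical(0) = -1/(30*k).
g_arx_dc = 6128 / 11180
k0 = -1 / (30 * g_arx_dc)  # equivalently 11180 / (6128 * (-30))
```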
Since the ARX models give such good fits, they might be all that is necessary for some applications. However, we are interested in the actual parameter values and in a physical model where the connection between the different outputs is clear, so we proceed with parameter estimation for the physical model.
The actual parameter estimation is performed using the idgrey command in MATLAB. For each parameter value for which idgrey needs a state-space model, the transformation that was briefly described in Section 2.1 is performed using numerical software. The selected initial values and estimated values for the unknown parameters are shown in Table 1. The initial values were selected from physical insight, except for k, which was estimated from the ARX models, see (11), and R, which was measured.
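Outside MATLAB, the same grey-box idea (simulate the model for a given θ, minimize the output error (2)) can be sketched as below; the first-order model and signals are purely illustrative stand-ins for the idgrey setup, not the laboratory model:

```python
import numpy as np
from scipy.optimize import least_squares

def simulate(theta, u):
    """Discrete-time simulation of a hypothetical first-order model
    x[k+1] = a*x[k] + b*u[k], y[k] = x[k], with theta = (a, b)."""
    a, b = theta
    x, y = 0.0, []
    for uk in u:
        y.append(x)
        x = a * x + b * uk
    return np.array(y)

# Synthetic data from "true" parameters, then an output-error fit (2)
rng = np.random.default_rng(0)
u = rng.standard_normal(200)
y_meas = simulate((0.9, 0.5), u) + 0.01 * rng.standard_normal(200)
res = least_squares(lambda th: simulate(th, u) - y_meas, x0=(0.5, 0.1))
```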
[Figure 4 plot: measured and simulated output, angle after spring; ARX model fit 87.96%.]
Parameter      Initial value   Estimated value
d (Nms/rad)    5.43 · 10^−4    5.87 · 10^−4
c (Nm/rad)     3.0 · 10^−6     6.30 · 10^−6
J1 (kg m²)     3.66 · 10^−4    4.17 · 10^−4
J2 (kg m²)
R (Ω)
k (Nm/A)

Table 1. Initial and estimated values for the unknown parameters.
Figure 4. Validation of ARX model from voltage to angle after spring using simulated output.
A comparison between the initial model and the validation data, and between the estimated model and the validation data, is shown in Figures 6 and 7 respectively. As can be seen, the initial parameters already describe the angular velocity quite well, while the estimation process improves the fit for the angle considerably. Compared to the ARX models discussed earlier, the estimated grey-box model has a somewhat worse fit for the angular velocity, and a comparable fit for the angle. The grey-box model has the advantage of describing the whole system in one model.
[Figure 5 plot: measured and simulated output, angular velocity before spring; ARX model fit 90.67%.]

Figure 5. Validation of ARX model from voltage to angular velocity before the spring using simulated output.
State Estimation
The state estimation problem is about finding the best
estimate of the system state using the information available
in the measurements from the system. In our case we need
the state estimates in order to design a controller, which removes as much of the oscillations in the spring as possible.
From a general control theoretical perspective the use of the
state estimates for controller design is quite common, and
many of the modern control algorithms rely on a model of
some kind.
The linear state estimation problem will be discussed in
quite general terms in Section 3.1. Section 3.2 describes
and solves the problems with introducing white noise in linear differential-algebraic equations. When the white noise
has been introduced it is quite straightforward to derive the
Kalman filter for linear DAE:s, which is done in Section 3.4.
Finally the results from the experiments are given.
[Figure 6 plots: measured and simulated output of the grey-box model with initial parameter values; fit 47.96% for the angle after the spring and 82.23% for the angular velocity before the spring.]

Figure 6. Validation of grey box model with initial parameter values using simulated output.

[Figure 7 plots: measured and simulated output of the grey-box model with estimated parameter values; fit 86.49% for the angle after the spring and 81.68% for the angular velocity before the spring.]

Figure 7. Validation of grey box model with estimated parameter values using simulated output.

Linear State Estimation

Since the system we are studying in this project can be modeled using linear dynamics, we are only concerned with linear state estimation. The problem we are faced with in linear state estimation is to estimate the states, x, given measurements of the input, u, and output, y, signals in the following model

ẋ = Ax + Bu + w,    (12a)
y = Cx + e,    (12b)

where the matrices A, B, and C are given. Furthermore, the process noise w and the measurement noise e are white, Gaussian, with

E[w(t)w^T(s)] = R1(t)δ(t − s),
E[e(t)e^T(s)] = R2(t)δ(t − s),
E[w(t)e^T(s)] = R12(t)δ(t − s).

This problem was solved by Kalman and Luenberger in [13, 16, 17]. The idea is to use an observer which simulates the dynamics and adapts the state according to its ability to describe the output, according to

x̂˙ = Ax̂ + Bu + K(y − Cx̂),   x̂(0) = x̂0.    (13)

Hence, the problem boils down to finding the K-matrix that gives the best estimates. By forming the error dynamics (x̃ = x − x̂)

x̃˙ = (A − KC)x̃ + w − Ke    (14)

we can conclude that the choice of the K-matrix constitutes a compromise between convergence speed and sensitivity to disturbances. The speed of convergence is given by the locations of the eigenvalues of the matrix (A − KC). If system (12) is observable we can choose arbitrary locations for these eigenvalues by assigning a certain K-matrix. Of course we would like the convergence to be as fast as possible, which implies large entries in the K-matrix. However, the downside of such a K-matrix is that the measurement noise is amplified by it; hence, a K-matrix with large entries implies that the measurement noise will have a significant influence on the estimate. This is of course undesirable, since we want to minimize the influence of the measurement noise on the state estimate. Hence, it is not obvious how to choose the observer gain K. Either we can use trial and error, by placing the eigenvalues of the error dynamics and checking the influence of the measurement noise, or we can formulate an optimization problem of some kind. In this way the estimates will be optimal in the sense of the optimization problem. One possible choice is to find the estimates that minimize the variance of the estimation error. This results in the following K-matrix

K = P C^T R2^{−1},    (15)

where P is given by the following Riccati equation

Ṗ = AP + PA^T + R1 − P C^T R2^{−1} C P,   P(0) = P0.    (16)

An observer using the K-matrix given in (15) is referred to as a Kalman filter [13]. The Kalman filter can also be motivated from a purely deterministic point of view, as the solution to a certain weighted least-squares problem [22].

The assumption that the observer has the form (13) might seem ad hoc. However, it can in fact be proven that this structure arises from the deterministic weighted least-squares formulation [18], from the maximum a posteriori formulation [20], etc. Furthermore, it can be proven that if the system is linear, subject to non-Gaussian noise, the Kalman filter provides the best linear unbiased estimator (BLUE) [1].

The discussion has until now been purely in continuous time, but since the measurements from the real world are inherently discrete, we have to consider the discrete time Kalman filter.

The problem is that the model obtained from Modelica is not in the form (12), but rather in the form

E ẋ = Ax + Bu,    (17)

where the E-matrix can be singular. A model of this type is referred to as a linear differential-algebraic equation (DAE). Other common names for this type of equation are implicit systems, descriptor systems, semi-state systems, generalized systems, and differential equations on a manifold [4]. If the E-matrix is in fact non-singular, we can multiply with E^{−1} from the left and obtain an ordinary differential equation (ODE), and standard Kalman filtering theory applies.

Our aim is to be able to estimate the internal variables, x, from (17) directly, without transforming the equation into a standard state-space form (12). The reason is that we want a theory that is directly applicable to the models generated by Modelica, so that as much as possible can be done automatically. Furthermore, this allows us to add noise where it makes physical sense, simply by inserting it at appropriate places in the model. In order to accomplish this, Modelica has to be extended with some kind of interaction mode and the possibility of generating random variables (noise). With these capabilities we could model and simulate stochastic, as well as deterministic, models in Modelica.

The problem is that in the general case we cannot add noise to all equations in (17). The reason is that derivatives of white noise might occur, which do not exist. This issue will be discussed in more detail later. Hence, we have to find a suitable subspace where the noise can live. More specifically, we have to find all the possible Bw-matrices in

E ẋ = Ax + Bu + Bw wt,    (18)

where wt is a white noise sequence (i.e., Bw-matrices that assure that white noise is not differentiated). The two main reasons for why we want to introduce white noise in (18) are:

• There are un-modeled dynamics and disturbances acting on the system, which can only be introduced as an unknown stochastic term.

• There is a practical need for tuning the filter in order to make the trade-off between tracking ability and sensor noise attenuation. In the Kalman filter this is accomplished by keeping the sensor noise covariance matrix constant and tuning the process noise covariance matrix, or the other way around. Often it is easier to describe the sensor noise in a stochastic setting, and then it is more natural to tune the process noise.

The problem of finding the subspace where the noise can live was solved in [21] and is briefly discussed in the subsequent section.
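The stationary gain in (15)-(16) (Ṗ = 0) can be computed with a standard Riccati solver; a sketch with hypothetical system matrices standing in for the real model:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Hypothetical observable pair (A, C) and noise covariances R1, R2
A = np.array([[0.0, 1.0], [-2.0, -0.3]])
C = np.array([[1.0, 0.0]])
R1 = 0.1 * np.eye(2)
R2 = np.array([[0.1]])

# The stationary P solves A P + P A^T + R1 - P C^T R2^{-1} C P = 0;
# by duality this is solve_continuous_are(A.T, C.T, R1, R2).
P = solve_continuous_are(A.T, C.T, R1, R2)
K = P @ C.T @ np.linalg.inv(R2)  # the gain (15)
```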
Noise Modeling
We will omit the deterministic input in this derivation for notational convenience, so the continuous time linear time-invariant differential-algebraic equation considered is

E ẋ(t) + F x(t) = B w(t),    (19a)
y(t) = Cx(t) + e(t).    (19b)
The reader is referred to [10] for details on how the noncausality with respect to the input signal, u(t), can be handled. For the purpose of this discussion we will assume that
w and e are continuous time white noises. (See e.g., [2]
for a thorough treatment of continuous time white noise).
If det(Es + F ) is not identically zero as a function of
s ∈ R, (19) can always be transformed into the standard
form (21) (see [3, 5]). Note that if det(Es + F ) is identically zero, then x(t) is not uniquely determined by w(t)
and the initial value x(0). This can be realized by Laplace
transforming (19). Therefore it is a reasonable assumption
that det(Es + F ) is not identically zero.
First, a transformation to the standard form is needed.
This is done by finding a suitable change of variables x =
Qz and a matrix P to multiply (19a) from the left. Both P
and Q are non-singular matrices. By doing this we get
P E Q ż(t) + P F Q z(t) = P B w(t),    (20)
which for suitably chosen P - and Q-matrices can be written
in the following standard form:
[I 0; 0 N][ż1(t); ż2(t)] + [−A 0; 0 I][z1(t); z2(t)] = [G1; G2] w(t),    (21)
where the N -matrix is nilpotent, i.e., N k = 0 for some
k. The matrices P and Q can be calculated using, e.g.,
ideas from [24] involving the generalized real Schur form
and the generalized Sylvester equation, the details can be
found in [9]. We can also write (21) on the form (22), see
e.g., [6] or [15].
ż1(t) = Az1(t) + G1w(t),    (22a)

z2(t) = Σ_{i=0}^{k−1} (−N)^i G2 d^i w(t)/dt^i.    (22b)
From a theoretical point of view G1 can be chosen arbitrarily, since it describes how white noise should enter an
ordinary differential equation. However, constraints on G1
can of course be imposed by the physics of the system that
is modeled. When it comes to G2 , the situation is different, here we have to find a suitable parameterization. The
problem is now that white noise cannot be differentiated, so
we proceed to find a condition on the B-matrix in (19a) under which there does not occur any derivatives in (22b), i.e.,
N^i G2 = 0 for all i ≥ 1. This is equivalent to
N G2 = 0.    (23)
The result is given in Theorem 3.1. To formulate the theorem, we need to consider the transformation (20) with matrices P and Q which gives a system in the form (21). Let
the matrix N have the singular value decomposition

N = U [Σ 0; 0 0] V^T,   V = [V1 V2],    (24)

where V2 contains the last k columns of V, those having zero singular values. Finally, define the matrix M as

M = P^{−1} [I 0; 0 V2].    (25)

It is now possible to derive a condition on B.
Theorem 3.1 The condition (23) is equivalent to

B ∈ R(M),    (26)

where B is defined in (19) and M is defined in (25). The expression (26) means that B is in the range of M, that is, the columns of B are linear combinations of the columns of M.

Proof 3.1 From the discussion above we know that there exist matrices P and Q such that

P E Q Q^{−1}ξ̇(t) = P J Q Q^{−1}ξ(t) + P B w(t)    (27)

gives the canonical form

[I 0; 0 N] Q^{−1}ξ̇(t) + [−A 0; 0 I] Q^{−1}ξ(t) = P B w(t).    (28)

Note that B can be written as

B = P^{−1} [G1; G2].    (29)

Let the matrix N have the singular value decomposition

N = U [Σ 0; 0 0] V^T,

where Σ is a diagonal matrix with nonzero elements. Since N is nilpotent it is also singular, so k singular values are zero. Partition V as

V = [V1 V2],

where V2 contains the last k columns of V, those having zero singular values. Then N V2 = 0.

We first prove the implication (26) ⇒ (23): Assume that (26) is fulfilled. B can then be written as

B = P^{−1} [I 0; 0 V2] [S; T] = P^{−1} [S; V2 T]

for some matrices S and T. Comparing with (29), we see that G1 = S and G2 = V2 T. This gives

N G2 = N V2 T = 0,

so (23) is fulfilled.

Now the implication (23) ⇒ (26) is proved: Assume that (23) is fulfilled. We then get

0 = N G2 = U [Σ 0; 0 0] [V1^T; V2^T] G2 = U [Σ V1^T G2; 0].    (34)

This gives that

V1^T G2 = 0,

so the columns of G2 are orthogonal to the columns of V1, and G2 can be written as

G2 = V2 T.

Equation (29) now gives

B = P^{−1} [G1; V2 T] = P^{−1} [I 0; 0 V2] [G1; T] ∈ R(M),

so (26) is fulfilled.
The B-matrices satisfying (26) will thus allow us to incorporate white noise without differentiating it. The design parameters to be specified are G1 and T, defined in the proof above. Also note that the requirement that the white noise should not be differentiated is related to the concept of impulse controllability discussed in [6].
R(M ). By projecting a b-vector onto N (M T ) using the
following projection matrix
I − MM†
we can check whether the corresponding vector can be part
of the B-matrix or not. Hence, b-vectors with only one
nonzero entry cannot be part of the B-matrix if
Constructing the Process Noise Subspace
(I − M M † )b 6= 0.
The proof for Theorem 3.1 is constructive. Hence, it can
be used to derive the process noise subspace for the laboratory example treated in this work. Forming the singular
value decomposition for the 50 × 50 matrix N in (21) we
obtain two nonzero singular values (1.41 and 1.00). Hence,
V2 ∈ R50×48 . We can parameterize all matrices G2 , satisfying
N G2 = 0,
Applying (41) to our model implies that we cannot add
noise to 6 of the 54 equations. These 6 equations involve
angles of various kinds. It is quite natural that we are not
allowed to add white noise to angles, since that would correspond to an infinite jump in the corresponding derivative,
which we cannot have. Instead the noise have to be added
to the derivative and then integrated to affect the angle.
using an arbitrary matrix T ∈ R48×48 , according to G2 =
V2 T . Furthermore we can, using the result (25), write the
M -matrix according to
−1 I4
M =P
0 V2
where I4 is a four dimensional identity matrix. The subspace where the noise can live is now given by R(M ) according to Theorem 3.1. Ultimately we want the user to be
able to, directly in Modelica, specify where in the model
the noise should enter. For instance, if a certain part of the
model corresponds to a significant simplification of the true
system we would like to add noise there in order to mathematically account for the fact that we are uncertain about
the dynamics at this part. Another example might be that
we are uncertain about a parameter value, and model this
uncertainty by adding noise to the corresponding equation.
If the user were allowed to add noise to all equations, some
of this noise would inevitably end up in the forbidden noise
subspace, i.e., the subspace where B ∈
/ R(M ). Hence, we
want to find the equations to which we cannot add noise.
We will do this by studying a vector b ∈ R54 , with only one
nonzero element. Think of the case where we only have
one noise source, B = b. It is not a restriction to limit
the treatment to this type of B-matrices, since generally we
cannot physically motivate the use of the same noise at two
places in the model. That would correspond to using exactly the same disturbance at for instance the resistor and
the spring in our model. If we want noise at both these
places we would add two separate noise sources, which of
course could have the same statistical properties.
For Theorem 3.1 to be valid the b-vectors constituting the
B-matrix cannot have any contribution in N (M T ), which
is the orthogonal complement to the range space of M ,
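The check in (41) is a one-liner with a pseudoinverse; a toy numpy sketch (the M here is an illustrative 3 × 2 matrix, not the 54 × 52 one from the model):

```python
import numpy as np

def forbidden(M, b, tol=1e-9):
    """Check condition (41): b cannot appear as a column of B if the
    residual (I - M M^+) b is nonzero, i.e. b has a component outside
    the range space R(M)."""
    Mpinv = np.linalg.pinv(M)
    r = b - M @ (Mpinv @ b)
    return np.linalg.norm(r) > tol

# Toy M whose range is spanned by the first two unit vectors of R^3
M = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])
e1 = np.array([1.0, 0.0, 0.0])  # inside R(M): noise allowed here
e3 = np.array([0.0, 0.0, 1.0])  # outside R(M): noise forbidden here
```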
Discrete Model
There are several reasons to derive a discrete model, such as:

• The measurements from systems are obtained in discrete time.

• A common use of the estimated states is for model-based control. These controllers are implemented using computers, which implies that the control signal will inherently be discrete.
If the noise enters the system according to a B-matrix satisfying Theorem 3.1, the original system (19) can be written as

ż1(t) = Az1(t) + G1w(t),    (42a)
z2(t) = G2w(t),    (42b)
y(t) = CQz(t) + e(t),    (42c)

where x = Qz. Furthermore, w(t) and e(t) are both assumed to be Gaussian white noise signals with covariances
R1 and R2 respectively, and zero cross-covariance (the case
of nonzero cross-covariance can be handled as well, the
only difference is that the expressions are more involved).
The system (42) can be discretized using standard techniques from linear systems theory, see e.g., [19]. If we assume that w(t) remains constant during one sample interval¹, we have (here it is assumed that the sampling interval is one to simplify the notation)

w(t) = w[k],   k ≤ t < (k + 1),    (43)

and we obtain

z1[k + 1] = Ãz1[k] + G̃1w[k],    (44a)
z2[k] = G2w[k],    (44b)
y[k] = CQz[k] + e[k],    (44c)

where

à = e^A,   G̃1 = ∫₀¹ e^{Aτ} dτ G1.    (45)

Hence, (44) and (45) constitute a discrete time model of (19).

¹ See e.g., [11] for a discussion on other possible assumptions on the stochastic process w(t) when it comes to discretization.

The Kalman Filter

In order to apply the Kalman filter to the discrete time model (44), we start out by rewriting (44c) as

y[k] = CQz[k] + e[k] = [C̃1 C̃2][z1[k]; z2[k]] + e[k]
     = C̃1z1[k] + C̃2z2[k] + e[k]
     = C̃1z1[k] + C̃2G2w[k] + e[k].    (46)

Combining (44a) and (46) we obtain

z1[k + 1] = Ãz1[k] + G̃1w[k],
y[k] = C̃1z1[k] + ẽ[k],    (47)

where ẽ[k] = C̃2G2w[k] + e[k]. Note that the measurement noise, ẽ[k], and the process noise, w[k], are correlated if C̃2G2 ≠ 0. Now, the Kalman filter can be applied to (47) in order to estimate the internal variables z1[k] and the process noise w[k]. An estimate of the internal variables z2[k] can then be found using the estimated process noise, since z2[k] = G2w[k] according to (44b). Finally, the internal variables, x[k], are obtained from x[k] = Qz[k]. For details regarding the Kalman filter see e.g., [12, 11, 1]. Since we did not find any good references on how to derive the estimate of the process noise, we include a derivation of this in the subsequent section.

Estimating the Process Noise

The problem is how to estimate the process noise wt given the innovations, νi, i ≤ t. We will use the following theorem from [12, p. 81]:

Theorem 3.2 Given two zero mean random variables x and y, the linear least-mean-squares estimator x̂ = K0 y of x given y is given by any solution to the so-called normal equations

K0 Ry = Rxy,

where Ry = E(yy^T) and Rxy = E(xy^T).

Now back to our problem. Estimating wt given νi, i ≤ t is equivalent to estimating wt given νt, since both processes are white. According to Theorem 3.2, our estimator ŵt|t = K0νt is given by the solution to

K0 Rνt = Rwtνt.

From the Kalman filter equations, we have

Rνt = Ct Pt|t−1 Ct^T + Rt.

Furthermore,

Rwtνt = E(wt νt^T) = E(wt (yt − Ct x̂t|t−1)^T).

Since wt is independent of x̂t|t−1 we get

Rwtνt = E(wt yt^T) = R12.

Finally, this gives

K0 = R12 (Ct Pt|t−1 Ct^T + Rt)^{−1},

and thus

ŵt|t = R12 (Ct Pt|t−1 Ct^T + Rt)^{−1} νt.
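The sampling formulas in (45) can be evaluated without explicit integration via the standard Van Loan block-matrix trick; a sketch with placeholder matrices (the real A and G1 come from the transformed DAE):

```python
import numpy as np
from scipy.linalg import expm

def discretize(A, G1, h=1.0):
    """Compute Atilde = e^{A h} and Gtilde1 = int_0^h e^{A tau} dtau G1
    from (45): the top row of expm([[A, G1], [0, 0]] * h) holds both."""
    n, m = G1.shape
    M = np.zeros((n + m, n + m))
    M[:n, :n] = A
    M[:n, n:] = G1
    Phi = expm(M * h)
    return Phi[:n, :n], Phi[:n, n:]

# Hypothetical matrices: a double integrator with identity noise gain
A = np.array([[0.0, 1.0], [0.0, 0.0]])
G1 = np.eye(2)
Ad, G1d = discretize(A, G1)
```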
Results

The theory derived in the previous sections will now be tested using real data from the spring servo system studied throughout this work. Both the time-varying and the time-invariant (stationary) Kalman filter will be used to estimate
the states. We have access to two output signals from our
process, the angular velocity of the motor and the angular
position of the mass after the spring. In order to validate our
theory we will use the angular position as output signal and
use the information available in this signal to estimate the
other states. The estimated motor angular velocity will be
compared with the true (measured) motor angular velocity
in order to validate the estimation algorithm.
The following noise covariances were used:

R1 = 0.1 · I4,   R2 = 0.1,

where I4 is the 4 × 4 identity matrix.
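The time-varying filter recursion used in the experiments follows the standard predict/update cycle; a self-contained sketch in which the 2-state system matrices are hypothetical stand-ins for the transformed model, and correlated noise (nonzero cross-covariance) is omitted for brevity:

```python
import numpy as np

def kalman_step(xh, P, y, A, C, R1, R2):
    """One update/predict cycle of the discrete time-varying Kalman
    filter, assuming uncorrelated process and measurement noise."""
    # Measurement update
    S = C @ P @ C.T + R2
    K = P @ C.T @ np.linalg.inv(S)
    xh = xh + K @ (y - C @ xh)
    P = P - K @ C @ P
    # Time update
    xh = A @ xh
    P = A @ P @ A.T + R1
    return xh, P

# Hypothetical 2-state system with one measured output
A = np.array([[1.0, 0.05], [0.0, 0.95]])
C = np.array([[1.0, 0.0]])
R1 = 0.1 * np.eye(2)
R2 = np.array([[0.1]])
xh, P = np.zeros(2), 10.0 * np.eye(2)  # uncertain initial state
for y in [np.array([1.0]), np.array([1.1]), np.array([1.2])]:
    xh, P = kalman_step(xh, P, y, A, C, R1, R2)
```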
Figure 8 shows the estimation performance of the stationary and the time-varying Kalman filter with correct initial conditions (zero). From this figure it is clear that there is nothing to gain in using the time-varying Kalman filter in this case, since the time-varying Kalman filter coincides with the stationary Kalman filter from the start. However, if we do not have access to the correct initial conditions, which we typically do not, the result will be different (see Figure 9). In the transient phase the time-varying Kalman filter is superior to the stationary Kalman filter. When the transient has died out, the time-varying and the stationary Kalman filter both yield the same result, see Figure 10.
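Why the two filters coincide once the transient has passed can be illustrated by iterating the Riccati recursion: from any initial covariance, Pt converges to the stationary solution, and with it the time-varying gain converges to the stationary gain. A small sketch with invented matrices (not the servo model of the paper):

```python
import numpy as np

# Riccati recursion for the predicted covariance:
#   P_{t+1} = A P_t A^T + R1 - A P_t C^T (C P_t C^T + R2)^{-1} C P_t A^T
A = np.array([[0.9, 0.1], [0.0, 0.8]])
C = np.array([[1.0, 0.0]])
R1 = 0.1 * np.eye(2)
R2 = np.array([[0.1]])

P = 10.0 * np.eye(2)  # deliberately wrong initial covariance
for _ in range(200):
    S = C @ P @ C.T + R2
    P = A @ P @ A.T + R1 - A @ P @ C.T @ np.linalg.inv(S) @ C @ P @ A.T

# After the iteration P (numerically) satisfies the stationary
# (algebraic) Riccati equation, i.e. the fixed point of the recursion:
residual = (A @ P @ A.T + R1
            - A @ P @ C.T @ np.linalg.inv(C @ P @ C.T + R2) @ C @ P @ A.T
            - P)
```

Since A is stable here, convergence is guaranteed regardless of the (wrong) initial covariance, which mirrors the experimental observation that the transient eventually disappears.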
Figure 8. Transient behavior of the motor velocity estimates with zero initial states. The red (dashed) and the blue (dash-dotted) curves show the estimates from the time-invariant and the time-varying Kalman filters, respectively. The measured motor velocity is the black (solid) curve. (Axes: motor angular velocity (deg/s) versus time (s).)

Figure 10. Stationary performance. The measured motor velocity is the black (solid) curve and the red (dashed) curve is the estimate from the Kalman filter. (Axes: motor angular velocity (deg/s) versus time (s).)
Concluding Remarks
In this work we have applied theory previously developed by the authors [21, 10] to perform system identification and state estimation using differential-algebraic equations (DAEs). The system was modeled in Modelica, and the resulting equations were then transferred to MATLAB, where the unknown parameters were estimated. Using this estimated model, the states could then be successfully estimated.
Figure 9. Transient behavior of the motor velocity estimates with nonzero initial states. The red (dashed) and the blue (dash-dotted) curves show the estimates from the time-invariant and the time-varying Kalman filters, respectively. The measured motor velocity is the black (solid) curve. (Axes: motor angular velocity (deg/s) versus time (s).)

Further Work

There are several ideas for further work; some of the most important directions are:

• Automatic translation of the Modelica model into the form

E(θ)ξ̇(t) = J(θ)ξ(t) + K(θ)e(t),
y(t) = C(θ)ξ(t), (56)

in MATLAB. This should be fairly straightforward if it were possible to specify inputs, outputs, and unknown parameters in Modelica. If this translation existed we could perform grey-box system identification by first modeling the system in Modelica and then translating the model into the form (56), where the matrices are readily imported into MATLAB. The unknown parameters could then be estimated without having to manually manipulate the equations.
• Simulation of noise in Modelica. In this way it would be possible to model disturbances as well. It would be necessary to include some kind of interface in Modelica that makes it possible to check, using Theorem 3.1, where the noise can be included.
• Investigate to what extent these results could be extended to nonlinear systems. The particle filter [7]
can probably be useful in the process of estimating the
states in a nonlinear DAE.
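The grey-box workflow in the first item above can be sketched on a toy model. Everything below (the scalar model, data, and parameter value) is invented for illustration and is not the paper's servo system:

```python
import numpy as np

# Toy grey-box identification: a scalar model x[k+1] = theta*x[k] + w[k]
# observed through y[k] = x[k].  The unknown parameter theta is
# recovered by least squares on the one-step predictions, which is the
# simplest instance of prediction-error parameter estimation.

rng = np.random.default_rng(0)
theta_true = 0.7
N = 500

x = np.zeros(N)
x[0] = 1.0
for k in range(N - 1):
    x[k + 1] = theta_true * x[k] + 0.01 * rng.standard_normal()
y = x  # noise-free measurement for simplicity

# least-squares estimate of theta from y[k+1] ~= theta * y[k]
theta_hat = np.dot(y[:-1], y[1:]) / np.dot(y[:-1], y[:-1])
```

In the envisioned workflow the scalar recursion would be replaced by the discretized DAE model (56) exported from Modelica, and the least-squares step by a numerical prediction-error minimization over the unknown parameters θ.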
References

[1] B. Anderson and J. Moore. Optimal Filtering. Information and System Sciences Series. Prentice Hall, Englewood Cliffs, New Jersey, 1979.
[2] K. Åström. Introduction to Stochastic Control Theory.
Mathematics in Science and Engineering. Academic
Press, New York and London, 1970.
[3] K. Brenan, S. Campbell, and L. Petzold. Numerical Solution of Initial-Value Problems in Differential-Algebraic Equations. SIAM's Classics in Applied Mathematics. SIAM, New York, 1996.
[4] S. Campbell. Descriptor systems in the 90's. In Proceedings of the 29th Conference on Decision and Control, pages 442–447, Honolulu, Hawaii, USA, December 1990.
[5] L. Dai. Filtering and LQG problems for discrete-time
stochastic singular systems. IEEE Transactions on Automatic Control, 34(10):1105–1108, Oct. 1989.
[6] L. Dai. Singular Control Systems. Lecture Notes
in Control and Information Sciences. Springer-Verlag,
Berlin, New York, 1989.
[7] A. Doucet, N. de Freitas, and N. Gordon, editors. Sequential Monte Carlo Methods in Practice. Springer
Verlag, 2001.
[8] P. Fritzson. Principles of Object-Oriented Modeling
and Simulation with Modelica 2.1. Wiley-IEEE, New
York, 2004.
[9] M. Gerdin. Parameter estimation in linear descriptor
systems. Licentiate Thesis No 1085, Linköping University, 2004.
[10] M. Gerdin, T. Glad, and L. Ljung. Parameter estimation in linear differential-algebraic equations. In Proceedings of the 13th IFAC Symposium on System Identification, Rotterdam, The Netherlands, Aug. 2003.
[11] F. Gustafsson. Adaptive Filtering and Change Detection. John Wiley & Sons, 2000.
[12] T. Kailath, A. Sayed, and B. Hassibi. Linear Estimation. Information and System Sciences Series. Prentice Hall, Upper Saddle River, New Jersey, 2000.
[13] R. E. Kalman. A new approach to linear filtering and prediction problems. Trans. ASME, J. Basic Engineering, 82:35–45, 1960.
[14] L. Ljung. System Identification - Theory for the User.
Information and System Sciences Series. Prentice Hall
PTR, Upper Saddle River, N.J., 2. edition, 1999.
[15] L. Ljung and T. Glad. Modellbygge och simulering. Studentlitteratur, 2003. In Swedish.
[16] D. Luenberger. Observers for multivariable systems. IEEE Transactions on Automatic Control, AC-11(2):190–197, Apr. 1966.
[17] D. Luenberger. An introduction to observers. IEEE
Transactions on Automatic Control, AC-16(6):596–
602, Dec. 1971.
[18] C. Rao. Moving Horizon Strategies for the Constrained Monitoring and Control of Nonlinear Discrete-Time Systems. PhD thesis, University of Wisconsin-Madison, 2000.
[19] W. Rugh. Linear System Theory. Information and
system sciences series. Prentice Hall, Upper Saddle
River, New Jersey, 2 edition, 1996.
[20] T. Schön. On Computational Methods for Nonlinear
Estimation. Licentiate thesis, Linköping University,
Oct. 2003. Thesis No. 1047.
[21] T. Schön, M. Gerdin, T. Glad, and F. Gustafsson. A modeling and filtering framework for linear differential-algebraic equations. In Proceedings of the 42nd Conference on Decision and Control, Maui, Hawaii, USA, Dec. 2003.
[22] H. Sorenson. Least-squares estimation: from Gauss to
Kalman. IEEE Spectrum, 7:63–68, July 1970.
[23] M. Tiller. Introduction to Physical Modeling with
Modelica. Kluwer, Boston, Mass., 2001.
[24] A. Varga. Numerical algorithms and software tools for analysis and modelling of descriptor systems. In Proceedings of the 2nd IFAC Workshop on System Structure and Control, pages 392–395, Prague, Czechoslovakia.