# Noise Modeling, State Estimation and System Identification in Linear Differential-Algebraic Equations

Noise Modeling, State Estimation and System Identification in Linear Differential-Algebraic Equations

Markus Gerdin and Thomas Schön
Control & Communication, Department of Electrical Engineering, Linköpings universitet, SE-581 83 Linköping, Sweden
WWW: http://www.control.isy.liu.se
E-mail: [email protected], [email protected]
1st November 2004
Report no.: LiTH-ISY-R-2639
Submitted to CCSSE 2004

Technical reports from the Control & Communication group in Linköping are available at http://www.control.isy.liu.se/publications.

Abstract

This paper treats how parameter estimation and Kalman filtering can be performed using a Modelica model. The procedures for doing this have been developed earlier by the authors, and are here exemplified on a physical system. It is concluded that the parameter and state estimation problems can be solved using the Modelica model, and that the parameter estimation and observer construction could to a large extent be automated with relatively small changes to a Modelica environment.
Keywords: DAE, Differential-Algebraic Equations, Modelica, System Identification, Parameter Estimation, Observer, Kalman Filter.

1 Introduction

This paper is the result of a project within the PhD course Project-Oriented Study (POS), which is a part of the graduate school ECSEL. The aim of the project was to model a physical system in Modelica, estimate unknown parameters in the model using measurements from a real process, and then implement a Kalman filter to estimate unknown states. These steps are described in the paper. The work is based on theory which has been developed earlier by the authors [21, 10]. In the paper it is concluded that the estimation of unknown parameters and the construction of observers (Kalman filters) could be automated to a large extent, if relatively small additions are introduced in the Modelica modeling environment. This could significantly reduce the time for modeling a system, estimating parameters, and constructing observers.

1.1 Process Description

The process we are studying in this project resembles the problem that occurs in power transfer through weak axles, such as the rear axle in trucks. This problem is studied within the area of power train control, which is an important research area for the automotive industry. The problem is that when large torques are generated by the engine, the drive shaft winds itself up like a spring. In the gear shifting process, when the engine torque is removed, this spring will affect the gearbox, since the drive shaft winds itself back again. This is an undesirable behavior, since the gear shifting process is significantly slowed down, which is a severe problem for automatic gearboxes. Furthermore, this torque will expose parts in the gearbox to wear. Hence, it is desirable to eliminate the elastic behavior of the drive shafts. This can be done by closing the loop using feedback control.
In order to be able to construct a good controller, a model of the process is needed and the internal states of this model have to be estimated. We will in this report study the modeling and state estimation problems for this process. However, since we do not have access to a real truck we will use a laboratory-scale process, which contains the same phenomena. The engine is substituted with a DC motor and a gearbox. The drive shafts and the load have been replaced by a spring and a heavy metal disc. In Figure 1, a photo of the laboratory process is given.

Figure 1. The examined process (DC motor, gearbox, spring, metal disc, and angle sensor).

1.2 Object-Oriented Modeling

Simulation of dynamical systems has traditionally been performed by writing equations describing the system as ordinary differential equations (ODE), and then integrating them using an ODE solver. Tools such as Simulink have made this process easier by allowing graphical modeling of the ODE, but in principle the user must still transform the equations to ODE form manually. In recent years, object-oriented modeling languages have become increasingly popular. The most promising example of such a language is probably Modelica [8, 23], which is the language that will be used in this work. (More specifically, we will use the Dymola implementation of Modelica.) Object-oriented languages have the advantage of being equation-based, which means that the user can enter the equations describing the system without having to transform them into ODE form. The equations will instead be in differential-algebraic equation (DAE) form, which means that modeling environments must be able to simulate such models. The term object-oriented means that equations describing a commonly used system can be packaged, stored and reused. It can for example be convenient to not have to enter the equations describing a resistor every time it is used in a model. Modeling environments for object-oriented modeling languages usually allow graphical modeling, which means that different sub-models are selected from component libraries and then connected using a graphical interface.

2 System Identification

In this section we will describe how the unknown parameters in the model have been estimated from measured data. First, we will discuss some basic theory for estimation of unknown parameters in DAEs in Section 2.1.

2.1 Theory for Parameter Estimation in DAEs

In the general case, a model from an object-oriented modeling tool is a nonlinear DAE, which can be written as

F(ξ̇(t), ξ(t), u(t), θ) = 0    (1a)
y(t) = h(ξ(t), θ).            (1b)

Here, u(t) is a known input, y(t) is the measured output, ξ(t) is a vector of auxiliary variables used to model the system, and θ is a vector of unknown parameters. The problem is to estimate the constant values of the parameters θ using measurements of u(t) and y(t). A straightforward way to solve this problem is to solve the optimization problem

min_θ Σ_t (ŷ(t|θ) − y(t))²    (2)

where the predictor ŷ(t|θ) is selected as a simulated output from the model, and y(t) is the measured output. In system identification terms, this corresponds to prediction error identification with an output-error (OE) model. An output-error model corresponds to the assumptions that the measurement (1b) is corrupted by noise and that the DAE (1a) holds exactly. For a more thorough discussion on noise models in system identification see, e.g., [14]. In the case when the DAE is linear,

E(θ)ξ̇(t) = J(θ)ξ(t) + K(θ)u(t)    (3a)
y(t) = C(θ)ξ(t),                    (3b)

other noise models than output-error can be used, which can give better estimates of the unknown parameters. With other noise models than output error, (2) can still be used, but ŷ(t|θ) will be a predicted output, which in the linear case is calculated by the Kalman filter. To implement a Kalman filter for (3), different transformations can be used. The method used in this work is to transform the DAE into state-space form using the transformation sketched below:

E ξ̇ = Jξ + Ku    (4)
⇒ [I 0; 0 N][ẇ1; ẇ2] + [−A 0; 0 I][w1; w2] = [B1; D1]u    (5)
⇒ ẇ1 = Aw1 + B1u,  w2 = Σ_{i=0}^{m−1} (−N)^i D1 u^(i)    (6)
⇒ ẋ = Ãx + B̃u^(m−1)    (7)

The first step is calculated using numerical software, and must therefore be repeated for every parameter value for which a state-space description is necessary. Note that in the last transformation, the input might be redefined as one of its derivatives. For a more thorough discussion on this see, e.g., [9] or [10].

2.2 Parameter Estimation for the Laboratory Process

Since the laboratory process that was modeled in Modelica contains some unknown parameters, the technique described in the previous section was used to estimate them. The model is linear, so the equations can be written as

E(θ)ξ̇(t) = J(θ)ξ(t) + K(θ)u(t)    (8a)
y(t) = C(θ)ξ(t).                    (8b)

During the identification experiment, the input was the voltage to the motor. Two outputs were measured: the angle of the mass after the spring, and the angular velocity of the mass before the spring. The actual identification was performed using the System Identification Toolbox for MATLAB. It was therefore necessary to transform the Modelica model to the format (8) in MATLAB. This was performed manually, but the procedure could quite easily be automated if it were possible to specify inputs, outputs, and unknown parameters in Modelica.
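As a minimal illustration of the transformation (4)-(7), consider the special case where the nilpotent part vanishes (N = 0), so that the algebraic equation can be eliminated by substitution. The sketch below uses an invented 2×2 example; the matrices are not from the laboratory process.

```python
# Hypothetical example of the transformation (4)-(7) with N = 0:
# the DAE  E*dx/dt = J*x + K*u  with
#   E = [[1, 0], [0, 0]],  J = [[-1, 1], [1, -2]],  K = [[1], [0]]
# has one differential variable x1 and one algebraic variable x2.
E = [[1.0, 0.0], [0.0, 0.0]]
J = [[-1.0, 1.0], [1.0, -2.0]]
K = [[1.0], [0.0]]

# Algebraic row: 0 = J[1][0]*x1 + J[1][1]*x2 + K[1][0]*u
#   => x2 = -(J[1][0]/J[1][1])*x1 - (K[1][0]/J[1][1])*u
alg_x1 = -J[1][0] / J[1][1]   # coefficient of x1 in x2
alg_u = -K[1][0] / J[1][1]    # coefficient of u in x2

# Differential row: dx1/dt = J[0][0]*x1 + J[0][1]*x2 + K[0][0]*u;
# substituting x2 gives the state-space model dx1/dt = A*x1 + B*u.
A_tilde = J[0][0] + J[0][1] * alg_x1   # -1 + 1*0.5 = -0.5
B_tilde = K[0][0] + J[0][1] * alg_u    # 1 + 1*0 = 1

print(A_tilde, B_tilde)  # -0.5 1.0
```

In the general case the elimination is done numerically via the decomposition (5), and a nonzero N brings in derivatives of the input as in (6).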
This is an important subject for future work, since grey-box identification could then be performed by first modeling the system using Modelica, and then estimating unknown parameters in MATLAB without having to manipulate any equations manually.

The data from the process was collected with the sampling interval 0.01 s. However, this was too fast for this process and would have required higher order models, so the signals were resampled at 0.05 s. The input voltage (a sum of sinusoids) was multiplied by 90 since the voltage is passed through an amplifier that amplifies it 90 times. The mean values were also removed from the data, and the data set was divided into estimation and validation data. The preprocessed data that was used for the identification is shown in Figures 2–3.

Figure 2. Input-output data from motor voltage to the angle after the spring. The data to the left of the vertical line is estimation data, and the data to the right is validation data.

Figure 3. Input-output data from motor voltage to angular velocity before the spring. The data to the left of the vertical line is estimation data, and the data to the right is validation data.

To get a feeling for the system, two ARX models (from voltage to angular velocity before the spring and from voltage to angle after the spring respectively) were estimated. The orders of the models were selected to be the same as in the physical Modelica model of the system (three continuous-time poles from voltage to angular velocity before the spring, and four continuous-time poles to angular velocity after the spring). As can be seen in Figures 4–5, the fit is quite good. By noting that the transfer functions from voltage to angular velocity before the spring from the ARX model and from the physical model respectively should be the same, it is possible to quite easily estimate the value of k:

G_ARX = (13.68s² + 621.9s + 6128) / (s³ + 24.95s² + 1394s + 11180)    (9)

G_physical = −30k (J2 s² + ds + c) / (900RJ1J2 s³ + (900J2k² + 900RJ1d + J2Rd)s² + (900dk² + 900RJ1c + J2Rc)s + 900ck²)    (10)

We see that an approximate value of k is given by

k ≈ 11180 / (6128 · (−30)) ≈ −0.06.    (11)

This value will later be used as an initial value for k in the parameter estimation. Since the ARX models give such good fits, they might be all that is necessary for some applications. However, we are interested in the actual parameter values and in having a physical model where the connection between the different outputs is clear, so we proceed with parameter estimation for the physical model.

The actual parameter estimation is performed using the idgrey command in MATLAB. For each parameter value for which idgrey needs a state-space model, the transformation that was briefly described in Section 2.1 is performed using numerical software. The selected initial values and estimated values for the unknown parameters are shown in Table 1. The initial values were selected from physical insight, except for k, which was estimated from the ARX models, see (11), and R, which was measured.

Parameter      Initial value  Estimated value
d (Nms/rad)    5.43 · 10⁻⁴    5.87 · 10⁻⁴
c (Nm/rad)     0.0362         0.0341
J1 (kg m²)     3.0 · 10⁻⁶     6.30 · 10⁻⁶
J2 (kg m²)     3.66 · 10⁻⁴    4.17 · 10⁻⁴
R (Ω)          4.50           4.60
k (Nm/A)       −0.060         −0.0599

Table 1. Initial and estimated values for the unknown parameters.

Figure 4. Validation of ARX model from voltage to angle after spring using simulated output (fit: 87.96%).

A comparison between the initial model and the validation data, and between the estimated model and the validation data, is shown in Figures 6 and 7 respectively. As can be seen, already the initial parameters describe the angular velocity quite well, while the estimation process improves the fit for the angle considerably.
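The estimation criterion (2) can be illustrated with a small sketch: a simulated (output-error) predictor combined with a scan over candidate parameter values. The first-order model and all numbers below are invented for the illustration; the actual estimation in this work uses idgrey in MATLAB on the physical model.

```python
import math

# Toy grey-box estimation in the spirit of (2): a discrete-time
# first-order system  y[k+1] = theta*y[k] + u[k]  with one unknown
# parameter theta (model and data are invented for this example).
theta_true = 0.8
u = [math.sin(0.3 * k) for k in range(100)]

y = [0.0]
for k in range(99):
    y.append(theta_true * y[k] + u[k])

def cost(theta):
    """Sum of squared output errors for a simulated (OE) predictor."""
    yhat = [0.0]
    for k in range(99):
        yhat.append(theta * yhat[k] + u[k])
    return sum((a - b) ** 2 for a, b in zip(yhat, y))

# Crude minimization of (2) by scanning a grid of parameter values;
# idgrey instead uses a gradient-based search.
grid = [i / 1000.0 for i in range(1000)]
theta_hat = min(grid, key=cost)
print(theta_hat)  # 0.8
```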
Compared to the ARX models discussed earlier, the estimated grey-box model has a somewhat worse fit for the angular velocity, and a comparable fit for the angle. The grey-box model has the advantage of describing the whole system in one model.

Figure 5. Validation of ARX model from voltage to angular velocity before the spring using simulated output (fit: 90.67%).

Figure 6. Validation of grey-box model with initial parameter values using simulated output (fits: 47.96% and 82.23%).

Figure 7. Validation of grey-box model with estimated parameter values using simulated output (fits: 86.49% and 81.68%).

3 State Estimation

The state estimation problem is about finding the best estimate of the system state using the information available in the measurements from the system. In our case we need the state estimates in order to design a controller which removes as much of the oscillations in the spring as possible. From a general control-theoretical perspective the use of state estimates for controller design is quite common, and many modern control algorithms rely on a model of some kind. The linear state estimation problem will be discussed in quite general terms in Section 3.1. Section 3.2 describes and solves the problems with introducing white noise in linear differential-algebraic equations. When the white noise has been introduced it is quite straightforward to derive the Kalman filter for linear DAEs, which is done in Section 3.4. Finally the results from the experiments are given.

3.1 Linear State Estimation

Since the system we are studying in this project can be modeled using linear dynamics we are only concerned with linear state estimation. The problem we are faced with in linear state estimation is to estimate the states, x, given measurements of the input, u, and output, y, signals in the following model

ẋ = Ax + Bu + w,    (12a)
y = Cx + e,          (12b)

where the matrices A, B, and C are given. Furthermore, the process noise w and the measurement noise e are white and Gaussian, with

E[w(t)wᵀ(s)] = R1(t)δ(t − s),    (12c)
E[e(t)eᵀ(s)] = R2(t)δ(t − s),    (12d)
E[w(t)eᵀ(s)] = R12(t)δ(t − s).   (12e)

This problem was solved by Kalman and Luenberger in [13, 16, 17]. The idea is to use an observer which simulates the dynamics and adapts the state according to its ability to describe the output, according to

x̂̇ = Ax̂ + Bu + K(y − Cx̂),    (13a)
x̂(0) = x̂0.                    (13b)

Hence, the problem boils down to finding the K-matrix that gives the best estimates. By forming the error dynamics (x̃ = x − x̂)

x̃̇ = (A − KC)x̃ + w − Ke    (14)

we can conclude that the choice of the K-matrix constitutes a compromise between convergence speed and sensitivity to disturbances. The speed of convergence is given by the locations of the eigenvalues of the matrix (A − KC). If the system (12) is observable we can choose arbitrary locations for these eigenvalues by assigning a certain K-matrix. Of course we would like the convergence to be as fast as possible, which implies large entries in the K-matrix. However, the downside of such a K-matrix is that the measurement noise is amplified by it. Hence, a K-matrix with large entries implies that the measurement noise will have a significant influence on the estimate. This is of course undesirable, since we want to minimize the influence of the measurement noise on the state estimate. Hence, it is not obvious how to choose the observer gain K.
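A scalar discrete-time analogue of the observer (13) and its error dynamics (14), with invented numbers, illustrates how convergence is governed by (A − KC).

```python
# Scalar discrete-time analogue of the observer (13): for the system
# x[k+1] = a*x[k], y[k] = c*x[k], the estimation error obeys
# e[k+1] = (a - K*c)*e[k], so any gain with |a - K*c| < 1 makes the
# error converge; a larger K gives faster convergence but amplifies
# measurement noise.  All numbers below are invented for illustration.
a, c = 0.9, 1.0
K = 0.5                      # observer gain, |a - K*c| = 0.4

x, xhat = 1.0, 0.0           # true state and estimate
for _ in range(20):
    y = c * x                # noise-free measurement
    xhat = a * xhat + K * (y - c * xhat)
    x = a * x

error = abs(x - xhat)        # shrinks like 0.4**k
print(error)
```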
Either we can use trial and error, placing the eigenvalues of the error dynamics and checking the influence of the measurement noise, or we can formulate an optimization problem of some kind. In the latter case the estimates will be optimal in the sense of the optimization problem. One possible choice is to find the estimates that minimize the variance of the estimation error. This results in the following K-matrix

K = PCᵀR2⁻¹,    (15)

where P is given by the following Riccati equation:

Ṗ = AP + PAᵀ + R1 − PCᵀR2⁻¹CP,    P(0) = P0.    (16)

An observer using the K-matrix given in (15) is referred to as a Kalman filter [13]. The Kalman filter can also be motivated from a purely deterministic point of view, as the solution to a certain weighted least-squares problem [22, 20]. The assumption that the observer has the form (13a) might seem ad hoc. However, it can in fact be proven that this structure arises from the deterministic weighted least-squares formulation [18], from the maximum a posteriori formulation [20], etc. Furthermore, it can be proven that if the system is linear and subject to non-Gaussian noise, the Kalman filter provides the best linear unbiased estimator (BLUE) [1]. The discussion has until now been purely in continuous time, but since measurements from the real world are inherently discrete we have to consider the discrete-time Kalman filter.

The problem is that the model obtained from Modelica is not in the form (12); rather it is in the form

E ẋ = Ax + Bu,    (17)

where the E-matrix can be singular. A model of this type is referred to as a linear differential-algebraic equation (DAE). Other common names for this type of equation are implicit systems, descriptor systems, semi-state systems, generalized systems, and differential equations on a manifold [4]. If the E-matrix is non-singular we can multiply with E⁻¹ from the left and obtain an ordinary differential equation (ODE), and standard Kalman filtering theory applies.

Our aim is to be able to estimate the internal variables, x, in (17) directly, without transforming the equation into a standard state-space form (12). The reason is that we want a theory that is directly applicable to the models generated by Modelica, so that as much as possible can be done automatically. Furthermore, this allows us to add noise where it makes physical sense, simply by inserting it at appropriate places in the model. In order to accomplish this, Modelica has to be extended with some kind of interaction model and the possibility of generating random variables (noise). With these capabilities we could model and simulate stochastic, as well as deterministic, models in Modelica.

The problem is that in the general case we cannot add noise to all equations in (17), since derivatives of white noise might then occur, and these do not exist. This issue will be discussed in more detail later. Hence, we have to find a suitable subspace where the noise can live. More specifically we have to find all the possible Bw-matrices in

E ẋ = Ax + Bu + Bw wt,    (18)

where wt is white noise, i.e., the Bw-matrices that ensure that white noise is not differentiated. The two main reasons for introducing white noise in (18) are:

• There are un-modeled dynamics and disturbances acting on the system, which can only be introduced as an unknown stochastic term.
• There is a practical need for tuning the filter in order to make the trade-off between tracking ability and sensor-noise attenuation. In the Kalman filter this is accomplished by keeping the sensor noise covariance matrix constant and tuning the process noise covariance matrix, or the other way around. Often it is easier to describe the sensor noise in a stochastic setting, and then it is more natural to tune the process noise.
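For a scalar system the stationary version of (15)-(16) can be solved in closed form, since the stationary Riccati equation becomes a quadratic. The sketch below uses arbitrary numbers; it only illustrates the computation of the stationary covariance and Kalman gain.

```python
import math

# Scalar illustration of (15)-(16): for dx/dt = a*x + w, y = c*x + e,
# the stationary Riccati equation  0 = 2*a*p + r1 - p^2*c^2/r2  is a
# quadratic in p, and the Kalman gain is K = p*c/r2.  With the invented
# values a = -1, c = 1, r1 = r2 = 1 the positive root is p = sqrt(2) - 1.
a, c, r1, r2 = -1.0, 1.0, 1.0, 1.0

# Positive root of (c^2/r2)*p^2 - 2*a*p - r1 = 0.
p = a * r2 / c**2 + math.sqrt((a * r2 / c**2) ** 2 + r1 * r2 / c**2)
K = p * c / r2

print(p, K)
```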
The problem of finding the subspace where the noise can live was solved in [21] and is briefly discussed in the subsequent section.

3.2 Noise Modeling

We will omit the deterministic input in this derivation for notational convenience, so the continuous-time linear time-invariant differential-algebraic equation considered is

E ẋ(t) + F x(t) = B w(t)    (19a)
y(t) = C x(t) + e(t).        (19b)

The reader is referred to [10] for details on how the non-causality with respect to the input signal, u(t), can be handled. For the purpose of this discussion we will assume that w and e are continuous-time white noises. (See, e.g., [2] for a thorough treatment of continuous-time white noise.)

If det(Es + F) is not identically zero as a function of s ∈ R, (19) can always be transformed into the standard form (21) (see [3, 5]). Note that if det(Es + F) is identically zero, then x(t) is not uniquely determined by w(t) and the initial value x(0). This can be realized by Laplace transforming (19). Therefore it is a reasonable assumption that det(Es + F) is not identically zero.

First, a transformation to the standard form is needed. This is done by finding a suitable change of variables x = Qz and a matrix P with which to multiply (19a) from the left. Both P and Q are non-singular matrices. By doing this we get

PEQ ż(t) + PFQ z(t) = PB w(t),    (20)

which for suitably chosen P- and Q-matrices can be written in the following standard form:

[I 0; 0 N][ż1(t); ż2(t)] + [−A 0; 0 I][z1(t); z2(t)] = [G1; G2] w(t),    (21)

where the N-matrix is nilpotent, i.e., N^k = 0 for some k. The matrices P and Q can be calculated using, e.g., ideas from [24] involving the generalized real Schur form and the generalized Sylvester equation; the details can be found in [9]. We can also write (21) on the form (22), see, e.g., [6] or [15]:

ż1(t) = A z1(t) + G1 w(t),    (22a)
z2(t) = Σ_{i=0}^{k−1} (−N)^i G2 d^i w(t)/dt^i.    (22b)

From a theoretical point of view G1 can be chosen arbitrarily, since it describes how white noise should enter an ordinary differential equation. However, constraints on G1 can of course be imposed by the physics of the system that is modeled. When it comes to G2 the situation is different; here we have to find a suitable parameterization. The problem is that white noise cannot be differentiated, so we proceed to find a condition on the B-matrix in (19a) under which no derivatives occur in (22b), i.e., N^i G2 = 0 for all i ≥ 1. This is equivalent to

NG2 = 0.    (23)

The result is given in Theorem 3.1. To formulate the theorem, we need to consider the transformation (20) with matrices P and Q which give a system in the form (21). Let the matrix N have the singular value decomposition

N = U [Σ 0; 0 0] Vᵀ = U [Σ 0; 0 0] [V1 V2]ᵀ,    (24)

where V2 contains the last k columns of V, those having zero singular values. Finally, define the matrix M as

M = P⁻¹ [I 0; 0 V2].    (25)

Theorem 3.1 The condition (23) is equivalent to

B ∈ R(M),    (26)

where B is defined in (19) and M is defined in (25).

The expression (26) means that B is in the range of M, that is, the columns of B are linear combinations of the columns of M.

Proof From the discussion above we know that there exist matrices P and Q such that

PEQ Q⁻¹ẋ(t) + PFQ Q⁻¹x(t) = PB w(t)    (27)

gives the canonical form

[I 0; 0 N] Q⁻¹ẋ(t) + [−A 0; 0 I] Q⁻¹x(t) = [G1; G2] w(t).    (28)

Note that B can be written as

B = P⁻¹ [G1; G2].    (29)

Let the matrix N have the singular value decomposition

N = U [Σ 0; 0 0] Vᵀ,    (30)

where Σ is a diagonal matrix with nonzero elements. Since N is nilpotent it is also singular, so k singular values are zero. Partition V as

V = [V1 V2],    (31)

where V2 contains the last k columns of V, those having zero singular values. Then NV2 = 0.

We first prove the implication (26) ⇒ (23). Assume that (26) is fulfilled. B can then be written as

B = M [S; T] = P⁻¹ [I 0; 0 V2][S; T] = P⁻¹ [S; V2T]    (32)

for some matrices S and T. Comparing with (29), we see that G1 = S and G2 = V2T. This gives

NG2 = NV2T = 0,    (33)

so (23) is fulfilled.

Now the implication (23) ⇒ (26) is proved. Assume that (23) is fulfilled. We then get

0 = NG2 = U [Σ 0; 0 0][V1ᵀ; V2ᵀ] G2 = U [ΣV1ᵀG2; 0].    (34)

This gives that

V1ᵀG2 = 0,    (35)

so the columns of G2 are orthogonal to the columns of V1, and G2 can be written as

G2 = V2T.    (36)

Equation (29) now gives

B = P⁻¹ [G1; G2] = P⁻¹ [G1; V2T] = P⁻¹ [I 0; 0 V2][G1; T] = M [G1; T] ∈ R(M),    (37)

so (26) is fulfilled. □

The B-matrices satisfying (26) will thus allow us to incorporate white noise without having a problem with differentiation of white noise. The design parameters to be specified are G1 and T defined in the proof above. Also note that the requirement that the white noise should not be differentiated is related to the concept of impulse controllability discussed in [6].

3.2.1 Constructing the Process Noise Subspace

The proof of Theorem 3.1 is constructive. Hence, it can be used to derive the process noise subspace for the laboratory example treated in this work. Forming the singular value decomposition of the 50 × 50 matrix N in (21) we obtain two nonzero singular values (1.41 and 1.00). Hence, V2 ∈ R^{50×48}. We can parameterize all matrices G2 satisfying

NG2 = 0    (38)

using an arbitrary matrix T ∈ R^{48×48}, according to G2 = V2T. Furthermore we can, using the result (25), write the M-matrix according to

M = P⁻¹ [I4 0; 0 V2],    (39)

where I4 is a four-dimensional identity matrix. The subspace where the noise can live is now given by R(M) according to Theorem 3.1.

Ultimately we want the user to be able to specify, directly in Modelica, where in the model the noise should enter. For instance, if a certain part of the model corresponds to a significant simplification of the true system we would like to add noise there in order to mathematically account for the fact that we are uncertain about the dynamics at this part. Another example might be that we are uncertain about a parameter value, and model this uncertainty by adding noise to the corresponding equation. If the user were allowed to add noise to all equations, some of this noise would inevitably end up in the forbidden noise subspace, i.e., the subspace where B ∉ R(M). Hence, we want to find the equations to which we cannot add noise. We will do this by studying a vector b ∈ R^54 with only one nonzero element. Think of the case where we only have one noise source, B = b. It is not a restriction to limit the treatment to this type of B-matrices, since generally we cannot physically motivate the use of the same noise at two places in the model. That would correspond to using exactly the same disturbance at, for instance, the resistor and the spring in our model. If we want noise at both these places we would add two separate noise sources, which of course could have the same statistical properties.

Since, for Theorem 3.1 to hold, the b-vectors constituting the B-matrix cannot have any contribution in N(Mᵀ), the orthogonal complement of R(M), we can check whether a given b-vector can be part of the B-matrix by projecting it onto N(Mᵀ) using the projection matrix

I − MM†.    (40)

Hence, b-vectors with only one nonzero entry cannot be part of the B-matrix if

(I − MM†)b ≠ 0.    (41)

Applying (41) to our model implies that we cannot add noise to 6 of the 54 equations. These 6 equations involve angles of various kinds. It is quite natural that we are not allowed to add white noise to angles, since that would correspond to an infinite jump in the corresponding derivative, which we cannot have. Instead the noise has to be added to the derivative and then integrated to affect the angle.
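The test (41) can be illustrated in miniature: with a single-column M, the projector I − MM† is easy to form by hand. The 2-dimensional example below is invented (the real M for the laboratory model is built from P and V2 and is much larger).

```python
# Toy version of the test (41): M is a single column spanning the
# allowed noise subspace R(M); the projector I - M*M_dagger maps onto
# its orthogonal complement N(M^T).  A candidate b-vector may carry
# noise only if its projection is zero.
M = [2.0, 0.0]                  # one column, so M_dagger = M^T / (M^T M)
mtm = sum(m * m for m in M)

def residual(b):
    """(I - M M^dagger) b for the single-column M above."""
    coef = sum(m * x for m, x in zip(M, b)) / mtm
    return [x - coef * m for m, x in zip(M, b)]

b_ok = [1.0, 0.0]               # lies in R(M): noise allowed here
b_bad = [0.0, 1.0]              # orthogonal to R(M): noise forbidden

print(residual(b_ok))           # [0.0, 0.0]
print(residual(b_bad))          # [0.0, 1.0]
```

For a general M the pseudoinverse M† would be computed from a singular value decomposition, but the pass/fail logic of (41) is the same.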
3.3 Discrete-Time Model

There are several reasons to derive a discrete-time model:

• The measurements from the system are obtained in discrete time.
• A common use of the estimated states is model-based control. These controllers are implemented using computers, which implies that the control signal will inherently be discrete.

If the noise enters the system according to a B-matrix satisfying Theorem 3.1, the original system (19) can be written as

ż1(t) = A z1(t) + G1 w(t),    (42a)
z2(t) = G2 w(t),               (42b)
y(t) = CQ z(t) + e(t),         (42c)

where x = Qz. Furthermore, w(t) and e(t) are both assumed to be Gaussian white noise signals with covariances R1 and R2 respectively, and zero cross-covariance (the case of nonzero cross-covariance can be handled as well; the only difference is that the expressions are more involved). The system (42) can be discretized using standard techniques from linear systems theory, see, e.g., [19]. If we assume that w(t) remains constant during one sampling interval (see, e.g., [11] for a discussion on other possible assumptions on the stochastic process w(t) when it comes to discretization), i.e., assuming a unit sampling interval to simplify the notation,

w(t) = w[k],  k ≤ t < k + 1,    (43)

we obtain

z1[k + 1] = Ã z1[k] + G̃1 w[k],    (44a)
z2[k] = G2 w[k],                   (44b)
y[k] = CQ z[k] + e[k],             (44c)

where

Ã = e^A,  G̃1 = ∫₀¹ e^{Aτ} dτ G1.    (45)

Hence, (44) and (45) constitute a discrete-time model of (19).

3.4 The Kalman Filter

In order to apply the Kalman filter to the discrete-time model (44) we start out by rewriting (44c) as

y[k] = CQ z[k] + e[k] = [C̃1 C̃2][z1[k]; z2[k]] + e[k]
     = C̃1 z1[k] + C̃2 z2[k] + e[k]
     = C̃1 z1[k] + C̃2 G2 w[k] + e[k],    (46)

where we define ẽ[k] = C̃2 G2 w[k] + e[k]. Combining (44a) and (46) we obtain

z1[k + 1] = Ã z1[k] + G̃1 w[k],    (47a)
y[k] = C̃1 z1[k] + ẽ[k].           (47b)

Note that the measurement noise, ẽ[k], and the process noise, w[k], are correlated if C̃2G2 ≠ 0. Now, the Kalman filter can be applied to (47) in order to estimate the internal variables z1[k] and the process noise w[k]. An estimate of the internal variables z2[k] can then be found using the estimated process noise, since z2[k] = G2 w[k] according to (44b), and finally the internal variables x[k] are obtained as x[k] = Q z[k]. For details regarding the Kalman filter see, e.g., [12, 11, 1]. Since we did not find any good references on how to derive the estimate of the process noise, we include a derivation of this in the subsequent section.

3.4.1 Estimating the Process Noise

The problem is how to estimate the process noise wt given the innovations νi, i ≤ t. We will use the following theorem from [12, p. 81]:

Theorem 3.2 Given two zero-mean random variables x and y, the linear least-mean-squares estimator x̂ = K0 y of x given y is given by any solution to the so-called normal equations

K0 Ry = Rxy,    (48)

where Ry = E(yyᵀ) and Rxy = E(xyᵀ).

Now back to our problem. Estimating wt given νi, i ≤ t, is equivalent to estimating wt given νt, since both processes are white. According to Theorem 3.2, our estimator ŵt|t = K0 νt is given by the solution to

K0 Rνt = Rwtνt.    (49)

From the Kalman filter equations we have

Rνt = Ct Pt|t−1 Ctᵀ + Rt.    (50)

Furthermore,

Rwtνt = E(wt νtᵀ) = E(wt (yt − Ct x̂t|t−1)ᵀ).    (51)

Since wt is independent of x̂t|t−1 we get

Rwtνt = E(wt ytᵀ) = R12.    (52)

Finally, this gives

K0 = R12 (Ct Pt|t−1 Ctᵀ + Rt)⁻¹,    (53)

and thus

ŵt|t = R12 (Ct Pt|t−1 Ctᵀ + Rt)⁻¹ νt.    (54)

3.4.2 Experiments

The theory derived in the previous sections will now be tested using real data from the spring-servo system studied throughout this work. Both the time-varying and the time-invariant (stationary) Kalman filter will be used to estimate the states.
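Before turning to the experiments, note that the estimator (53)-(54) amounts to a one-line computation once the innovation is available. A scalar sketch, with all numbers invented for the illustration:

```python
# Scalar sketch of the process-noise estimate (54): given the
# innovation nu = y - c*xpred, the estimate is
#   w_hat = R12 * (c^2 * P + R2)^(-1) * nu,
# where P is the one-step prediction covariance and R12 is the cross
# covariance between w and the (correlated) measurement noise.
c = 1.0
P = 2.0          # prediction covariance P_{t|t-1}
R2 = 0.5         # measurement noise covariance
R12 = 0.3        # cross covariance between w and the measurement noise

xpred = 1.0      # predicted state
y = 1.8          # received measurement

nu = y - c * xpred        # innovation
S = c * P * c + R2        # innovation covariance, cf. (50)
w_hat = R12 / S * nu      # process-noise estimate, cf. (54)

print(nu, S, w_hat)
```

With these numbers the innovation is 0.8, the innovation covariance 2.5, and the process-noise estimate 0.096; the vector case replaces the divisions with a matrix inverse.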
We have access to two output signals from our process, the angular velocity of the motor and the angular position of the mass after the spring. In order to validate our theory we will use the angular position as output signal and use the information available in this signal to estimate the other states. The estimated motor angular velocity will be compared with the true (measured) motor angular velocity in order to validate the estimation algorithm. The following noise covariances were used:

    R1 = 0.1·I4,    R2 = 0.1,    (55)

where I4 denotes the 4 × 4 identity matrix. In Figure 8 the estimation performance is shown using the stationary and the time-varying Kalman filter with correct (zero) initial conditions. From this figure it is clear that there is nothing to gain in using the time-varying Kalman filter in this case, since the time-varying Kalman filter coincides with the stationary Kalman filter directly. However, if we do not have access to the correct initial conditions, which we typically do not, the result will be different (see Figure 9). In the transient phase the time-varying Kalman filter is superior to the stationary Kalman filter. When the transient has disappeared, the time-varying and the stationary Kalman filter both yield the same result, see Figure 10.

[Figure 8. Transient behavior of the motor velocity estimates with zero initial states. The red (dashed) and the blue (dash-dotted) curves show the estimates from the time-invariant and the time-varying Kalman filters respectively. The measured motor velocity is the black (solid) curve.]

[Figure 10. Stationary performance. The measured motor velocity is the black (solid) curve and the red (dashed) curve is the estimate from the Kalman filter.]
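The observation that the time-varying filter coincides with the stationary one after the transient can be checked numerically: iterating the Riccati recursion for the predicted covariance drives it to the solution of the discrete-time algebraic Riccati equation, whose gain is the stationary Kalman gain. A sketch, using placeholder matrices rather than the identified servo model:

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Placeholder stable, observable system (not the servo model)
A = np.array([[0.9, 0.1], [0.0, 0.8]])
C = np.array([[1.0, 0.0]])
Q = 0.1 * np.eye(2)      # process noise covariance
R = np.array([[0.1]])    # measurement noise covariance

# Stationary predicted covariance and gain from the DARE
Pbar = solve_discrete_are(A.T, C.T, Q, R)
Kbar = Pbar @ C.T @ np.linalg.inv(C @ Pbar @ C.T + R)

# Time-varying Riccati recursion started from a (wrong) initial covariance
P = 10.0 * np.eye(2)
for _ in range(200):
    S = C @ P @ C.T + R
    K = P @ C.T @ np.linalg.inv(S)
    P = A @ (P - K @ C @ P) @ A.T + Q
```

After the recursion has converged, P equals the stationary covariance Pbar, so the time-varying gain equals Kbar, which is exactly the behavior seen in the experiments once the transient has died out.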
4 Concluding Remarks

In this work we have applied theory previously developed by the authors [21, 10] to perform system identification and state estimation using differential-algebraic equations (DAEs). The system was modeled in Modelica, and the resulting equations were then transferred to MATLAB, where the unknown parameters were estimated. Using this estimated model, the states could then be successfully estimated.

[Figure 9. Transient behavior of the motor velocity estimates with nonzero initial states. The red (dashed) and the blue (dash-dotted) curves show the estimates from the time-invariant and the time-varying Kalman filters respectively. The measured motor velocity is the black (solid) curve.]

4.1 Further Work

There are several ideas for further work; some of the most important directions are:

• Automatic translation of the Modelica model into the form

      E(θ)ξ̇(t) = J(θ)ξ(t) + K(θ)e(t),    (56a)
      y(t)      = C(θ)ξ(t),               (56b)

  in MATLAB. This should be fairly straightforward if it were possible to specify inputs, outputs, and unknown parameters in Modelica. If this translation existed, we could perform grey-box system identification by first modeling the system in Modelica and then translating the model into the form (56), where the matrices are readily imported into MATLAB. The unknown parameters can then be estimated without having to manually manipulate the equations.

• Simulation of noise in Modelica. In this way it would be possible to model disturbances as well. It would be necessary to include some kind of interface in Modelica where we can check where it is possible to include the noise using Theorem 3.1.

• Investigate to what extent these results can be extended to nonlinear systems. The particle filter [7] can probably be useful in the process of estimating the states in a nonlinear DAE.

References

[1] B. Anderson and J. Moore. Optimal Filtering.
Information and System Science Series. Prentice Hall, Englewood Cliffs, New Jersey, 1979.

[2] K. Åström. Introduction to Stochastic Control Theory. Mathematics in Science and Engineering. Academic Press, New York and London, 1970.

[3] K. Brenan, S. Campbell, and L. Petzold. Numerical Solution of Initial-Value Problems in Differential-Algebraic Equations. SIAM's Classics in Applied Mathematics. SIAM, New York, 1996.

[4] S. Campbell. Descriptor systems in the 90's. In Proceedings of the 29th Conference on Decision and Control, pages 442–447, Honolulu, Hawaii, USA, December 1990.

[5] L. Dai. Filtering and LQG problems for discrete-time stochastic singular systems. IEEE Transactions on Automatic Control, 34(10):1105–1108, Oct. 1989.

[6] L. Dai. Singular Control Systems. Lecture Notes in Control and Information Sciences. Springer-Verlag, Berlin, New York, 1989.

[7] A. Doucet, N. de Freitas, and N. Gordon, editors. Sequential Monte Carlo Methods in Practice. Springer Verlag, 2001.

[8] P. Fritzson. Principles of Object-Oriented Modeling and Simulation with Modelica 2.1. Wiley-IEEE, New York, 2004.

[9] M. Gerdin. Parameter estimation in linear descriptor systems. Licentiate Thesis No. 1085, Linköping University, 2004.

[10] M. Gerdin, T. Glad, and L. Ljung. Parameter estimation in linear differential-algebraic equations. In Proceedings of the 13th IFAC Symposium on System Identification, Rotterdam, The Netherlands, Aug. 2003. IFAC.

[11] F. Gustafsson. Adaptive Filtering and Change Detection. John Wiley & Sons, 2000.

[12] T. Kailath, A. Sayed, and B. Hassibi. Linear Estimation. Information and System Sciences Series. Prentice Hall, Upper Saddle River, New Jersey, 2000.

[13] R. E. Kalman. A new approach to linear filtering and prediction problems. Trans. ASME, J. Basic Engineering, 82:35–45, 1960.

[14] L. Ljung. System Identification – Theory for the User. Information and System Sciences Series. Prentice Hall PTR, Upper Saddle River, N.J., 2nd edition, 1999.

[15] L. Ljung and T.
Glad. Modellbygge och simulering. Studentlitteratur, 2003. In Swedish.

[16] D. Luenberger. Observers for multivariable systems. IEEE Transactions on Automatic Control, AC-11(2):190–197, Apr. 1966.

[17] D. Luenberger. An introduction to observers. IEEE Transactions on Automatic Control, AC-16(6):596–602, Dec. 1971.

[18] C. Rao. Moving Horizon Strategies for the Constrained Monitoring and Control of Nonlinear Discrete-Time Systems. PhD thesis, University of Wisconsin–Madison, 2000.

[19] W. Rugh. Linear System Theory. Information and System Sciences Series. Prentice Hall, Upper Saddle River, New Jersey, 2nd edition, 1996.

[20] T. Schön. On Computational Methods for Nonlinear Estimation. Licentiate thesis, Linköping University, Oct. 2003. Thesis No. 1047.

[21] T. Schön, M. Gerdin, T. Glad, and F. Gustafsson. A modeling and filtering framework for linear differential-algebraic equations. In Proceedings of the 42nd Conference on Decision and Control, Maui, Hawaii, USA, Dec. 2003.

[22] H. Sorenson. Least-squares estimation: from Gauss to Kalman. IEEE Spectrum, 7:63–68, July 1970.

[23] M. Tiller. Introduction to Physical Modeling with Modelica. Kluwer, Boston, Mass., 2001.

[24] A. Varga. Numerical algorithms and software tools for analysis and modelling of descriptor systems. In Proceedings of the 2nd IFAC Workshop on System Structure and Control, Prague, Czechoslovakia, pages 392–395, 1992.
