# Numerical Methods for Differential-Algebraic Equations

Delft University of Technology
Faculty of Electrical Engineering, Mathematics and Computer Science
Delft Institute of Applied Mathematics

**Numerical Methods for Differential-Algebraic Equations**

A thesis submitted to the Delft Institute of Applied Mathematics in partial fulfillment of the requirements for the degree Master of Science in Applied Mathematics.

Kristin Altmann, Delft, The Netherlands, February 6, 2015.
© 2015 by Kristin Altmann. All rights reserved.

Daily supervisors: Dr. Matthias Möller, Prof. dr. ir. Cornelis Vuik
Committee members: Prof. dr. ir. Cornelis Vuik, Dr. Henk Schuttelaars, Dr. Matthias Möller, Dr. ir. Jan Schuurmans

## Abstract

In recent years, the use of differential equations in connection with algebraic constraints on the variables has become a widely accepted tool for modeling the dynamical behaviour of physical processes. Compared to ordinary differential equations (ODEs), it has been observed that a number of difficulties can arise when numerical methods are used to solve differential-algebraic equations (DAEs), for instance order reduction phenomena, drift-off effects or instabilities. DAEs arise naturally and have to be solved in a variety of applications such as the mathematical pendulum and a reheat furnace model, both of which are used in this thesis to demonstrate the application of numerical methods and the arising difficulties. Working towards the prospective development of an applicable NMPC algorithm with DAEs as system models, this thesis is mainly concerned with the analysis and numerical treatment of DAEs. A particular focus is put on the topics of the strangeness index, an iterative procedure for determining all hidden constraints, regularisation techniques and numerical methods for solving DAEs.
This includes the implementation of the RadauIIa and BDF methods. Due to a multitude of examples, this thesis may serve as an accessible introduction to DAEs and as a foundation for future research into this field.

Key words: differential-algebraic equations, strangeness index, numerical methods, regularisation, RadauIIa methods, BDF methods, mathematical pendulum, reheat furnace model

CONFIDENTIAL

## Contents

- Abstract
- Contents
- List of Figures
- List of Tables
- Abbreviations
- Notation
- 1 Introduction
- 2 Analysis of differential-algebraic equations
  - 2.1 Introductory examples
  - 2.2 Basic definitions
  - 2.3 Linear differential-algebraic equations with constant coefficients
    - 2.3.1 Equivalence and regularity
    - 2.3.2 Solvability of regular linear differential-algebraic equations with constant coefficients
    - 2.3.3 Solvability of singular linear differential-algebraic equations with constant coefficients
    - 2.3.4 Characteristic quantities
  - 2.4 Linear differential-algebraic equations with variable coefficients
    - 2.4.1 Equivalence and regularity
    - 2.4.2 Strangeness index approach
    - 2.4.3 Derivative array approach
    - 2.4.4 Differentiation index
  - 2.5 Nonlinear differential-algebraic equations
    - 2.5.1 Strangeness index approach
    - 2.5.2 Structured problems
- 3 Numerical solution of differential-algebraic equations
  - 3.1 Differential-algebraic equations are not ordinary differential equations
  - 3.2 Index reduction and regularisation
  - 3.3 Methods for strangeness-free differential-algebraic equations
    - 3.3.1 One-step methods
    - 3.3.2 Multi-step methods
  - 3.4 Numerical solution of systems of nonlinear equations
- 4 Test problems
  - 4.1 Mathematical pendulum
  - 4.2 Reheat furnace model
- 5 Numerical results
  - 5.1 Numerical solution of systems of nonlinear equations
  - 5.2 Index reduction and regularisation
  - 5.3 Computing time and accuracy
    - 5.3.1 Mathematical pendulum
    - 5.3.2 Reheat furnace model
- 6 Conclusion and future research
- Bibliography

## List of Figures

- 4.1 The mathematical pendulum
- 4.2 Schematic view of a reheat furnace
- 4.3 Space discretisation of the furnace
- 4.4 Heat balance in section m of the furnace
- 5.1 Different solvers for systems of nonlinear equations for the pendulum
- 5.2 Different solvers for systems of nonlinear equations for the furnace
- 5.3 Accuracy of different methods with respect to the step size h for the regularised s-index 0 formulation
- 5.4 Accuracy of different methods with respect to the step size h for the s-index 2 formulation
- 5.5 Accuracy of different methods with respect to the step size h for the s-index 1 formulation
- 5.6 Accuracy of different methods with respect to the step size h for the s-index 0 formulation
- 5.7 Residuals of the constraints for the regularised s-index 0 formulation
- 5.8 Residuals of the constraints for the s-index 2 formulation
- 5.9 Residuals of the constraints for the s-index 1 formulation
- 5.10 Residuals of the constraints for the s-index 0 formulation
- 5.11 Computing time of each of the different methods with respect to the step size h for the regularised s-index 0 formulation
- 5.12 Computing time of each of the different methods with respect to the reached accuracy for the regularised s-index 0 formulation
- 5.13 Accuracy of different methods with respect to the step size h for the furnace
- 5.14 Computing time of each of the different methods with respect to the step size h for the furnace
- 5.15 Computing time of each of the different methods with respect to the reached accuracy for the furnace
- 5.16 Computing time of each of the different methods with respect to the step size h for the furnace
- 5.17 Computing time of each of the different methods with respect to the reached accuracy for the furnace

## List of Tables

- 3.1 Properties of Runge-Kutta methods applied to strangeness-free quasi-linear DAEs
- 3.2 Butcher tableaus of the first three Gauß methods
- 3.3 Butcher tableaus of the first three RadauIa methods
- 3.4 Butcher tableaus of the first three RadauIIa methods
- 3.5 Butcher tableaus of the first two LobattoIIIa methods
- 3.6 Butcher tableaus of the first two LobattoIIIb methods
- 3.7 Butcher tableaus of the first two LobattoIIIc methods
- 3.8 Coefficients of the BDF methods
- 4.1 Parameter values of the reheat furnace model
- 5.1 Parameter values of the base solutions
- 5.2 Analysis of the computing time of the BDF methods for the furnace (tol = $10^{-10}$, h = 60 s, T = 5 h)
- 5.3 Analysis of the computing time of the RadauIIa methods for the furnace (tol = $10^{-10}$, h = 60 s, T = 5 h)
- 5.4 Computing times of generating the function files and corresponding files for the Jacobians for the BDF methods
- 5.5 Computing times of generating the function files and corresponding files for the Jacobians for the RadauIIa methods

## Abbreviations

- BDF: backward differentiation formula
- DAE: differential-algebraic equation
- DE-step: differentiation-elimination step
- d-index: differentiation index
- IVP: initial value problem
- JCF: Jordan canonical form
- KCF: Kronecker canonical form
- NMPC: nonlinear model predictive control
- ODE: ordinary differential equation
- PDE: partial differential equation
- s-index: strangeness index
- SVD: singular value decomposition
- WCF: Weierstraß canonical form

## Notation

- $\dot{x}$: total derivative of $x$ with respect to $t$, i.e. $\dot{x}(t) = \frac{d}{dt}x(t)$
- $\ddot{x}$: second total derivative of $x$ with respect to $t$, i.e. $\ddot{x}(t) = \frac{d^2}{dt^2}x(t)$
- $x^{(i)}$: $i$-th total derivative of $x$ with respect to $t$, i.e. $x^{(i)}(t) = \frac{d^i}{dt^i}x(t)$
- $g_x(x, y)$: partial derivative of $g(x, y)$ with respect to $x$, i.e. $g_x(x, y) = \frac{\partial}{\partial x}g(x, y)$
- $E^*$: Hermitian conjugate (conjugate transpose) of $E$
- $T'$: matrix that completes $T$ to a nonsingular matrix $[T\ T']$
- $\oplus$: direct sum
- $\|\cdot\|$: norm
- $\emptyset_{n,0}$: empty matrix $\in \mathbb{C}^{n,0}$
- $a_{ij}, b_i, c_i$: coefficients of a Runge-Kutta method
- $b$: weight vector of a Runge-Kutta method
- $c$: node vector of a Runge-Kutta method
- $c_{air}$: heat capacity of air
- $c_f$: heat capacity of fuel
- $c_{fuel}$: cost of fuel
- $c_g$: heat capacity of waste gas
- $c_{prod}$: production profit
- $c_s$: heat capacity of steel products
- $c_w$: heat capacity of furnace wall
- $e$: Euler's number
- $f(t)$: inhomogeneity of a linear DAE
- $\tilde{f}(t)$: transformed inhomogeneity of a linear DAE
- $f(x(t), t)$: right-hand side of a quasi-linear DAE
- $f(t, x)$: right-hand side of an ODE
- $g$: gravitational acceleration
- $g(x, t)$: set of constraints
- $h$: step size
- $h_i(x, t)$: hidden constraint of level $i$
- $k_j$: stage of a Runge-Kutta method for ODEs
- $l$: length of the pendulum
- $m$: number of equations
- $m$: mass of the pendulum
- $m_c$: number of constraints
- $m_i$: size of the Jordan block $J_i$
- $m_1, m_2$: number of equations of the differential part and of the constraint part of a semi-implicit DAE, respectively
- $n$: number of unknowns
- $p$: order of consistency or convergence
- $r, a, s, d, u, v$: characteristic values of a linear DAE with constant coefficients
- $s$: number of stages of a Runge-Kutta method
- $t$: independent variable of a DAE
- $t_0$: initial time
- $t_i$: grid points
- tol: prescribed tolerance
- $u_1, u_2, u_3$: fuel flows to burners 1, 2, 3, respectively
- $v$: speed of the steel products
- $v$: index of nilpotency of a matrix
- $v_c$: maximal constraint level of a DAE
- $v_d$: differentiation index of a DAE
- $v_{max}$: maximal speed of the steel products
- $v_n$: index of nilpotency of a DAE
- $v_s$: strangeness index of a DAE
- $x(t)$: state variable of a DAE depending on $t$
- $\tilde{x}$: transformed state variable of a DAE depending on $t$
- $\dot{x} = \Phi(x, t)$: underlying ODE
- $x^0$: initial guess of the Newton method
- $x_0$: initial value of an IVP
- $x_i$: approximation of $x(t_i)$
- $x^k$: Newton iterates
- $\Delta x$: length of a section $m$ of the furnace
- $\Delta x^k$: correction term of the Newton iteration
- $A$: coefficient matrix of a linear DAE with constant coefficients
- $A$: Runge-Kutta matrix
- $\tilde{A}$: transformed coefficient matrix of a linear DAE with constant coefficients
- $A(t)$: coefficient matrix of a linear DAE with variable coefficients
- $A_{s,m}$: surface area of the steel products in section $m$
- $A_{w,m}$: surface area of the furnace wall in section $m$
- $B$: width of the furnace
- $\mathbb{C}$: set of complex numbers
- $C$: set of continuous functions
- $C^i$: set of $i$-times continuously differentiable functions
- $D_s$: thickness of steel products
- $D_w$: thickness of furnace wall
- $D_x$: domain of the state variable $x$
- $D_{\dot{x}}$: domain of the derivative $\dot{x}$
- $E$: leading matrix of a linear DAE with constant coefficients
- $\tilde{E}$: transformed leading matrix of a linear DAE with constant coefficients
- $E_k$: kinetic energy
- $E_p$: potential energy
- $E(t)$: leading matrix of a linear DAE with variable coefficients
- $E(x(t), t)$: leading matrix of a quasi-linear DAE
- $(E_{mod}, A_{mod})$: modified matrix pair of $(E, A)$
- $F(\dot{x}(t), x(t), t)$: right-hand side of a DAE
- $\hat{F}(\dot{x}(t), x(t), t)$: right-hand side of a regularisation of a DAE
- $F_l$: derivative array of order $l$
- $H$: height of the furnace
- $H_0$: lower calorific value of fuel
- $I, I_d$: identity matrix ($\in \mathbb{R}^{d,d}$)
- $\mathbb{I}$: domain of the independent variable $t$
- $J$: Jordan block of the WCF
- $J$: objective function
- $J_f$: Jacobian of the function $f$
- $J_i$: Jordan block of the JCF
- $J_{\rho_j}$: Jordan block of the KCF
- $L$: Lipschitz constant
- $L$: length of the furnace
- $L_s$: length of steel products
- $L_l$: set of solutions of the derivative array of order $l$
- $\mathcal{L}$: Lagrange function
- $L_{\varepsilon_j}$: Jordan block of the KCF
- $L_*$: underdetermined block of a DAE in KCF
- $\mathbb{M}$: solution manifold or set of consistency
- $M_{\eta_j}$: Jordan block of the KCF
- $M_*$: overdetermined block of a DAE in KCF
- $(M_l, N_l)$: inflated pair of order $l$
- $N$: nilpotent Jordan block of the WCF
- $N$: number of discrete grid points
- $N$: number of sections of the furnace
- $\mathbb{N}$: set of natural numbers
- $\mathbb{N}_0$: set of natural numbers (including 0)
- $N_{\sigma_j}$: Jordan block of the KCF
- $P$: left transformation matrix
- $P(t)$: left transformation matrix function
- $Q$: right transformation matrix
- $Q(t)$: right transformation matrix function
- $Q_{air,m}$: heat brought in by air in section $m$
- $Q_{c,m}$: heat brought in by combustion in section $m$
- $Q_{f,m}$: heat brought in by fuel in section $m$
- $Q_{g,m+1}$: heat brought in by waste gas of section $m + 1$
- $Q_{o,m}$: heat leaving the furnace wall to outside air in section $m$
- $Q_{s,m}$: heat entering steel products in section $m$
- $Q_{w,m}$: heat entering the furnace wall in section $m$
- $R_{af}$: ratio of air to fuel
- $\mathbb{R}$: set of real numbers
- $S(x, t)$: selector matrix function
- $T$: end point of the domain $\mathbb{I} = [t_0, T]$
- $T_o$: temperature of air outside the furnace
- $T_{air}$: temperature of air used in combustion
- $T_f$: temperature of fuel
- $T_{g,m}$: temperature of the waste gas in section $m$
- $T_{s,m}$: temperature of the steel products in section $m$
- $V_{s,m}$: volume of the steel products in section $m$
- $X_j, X_j'$: stages of a Runge-Kutta method for DAEs
- $\alpha_i, \beta_i$: coefficients of a linear multi-step method
- $\varepsilon_j$: size of the Jordan block $L_{\varepsilon_j}$
- $\varepsilon_s$: emissivity of steel products
- $\varepsilon_w$: emissivity of furnace wall
- $\eta_j$: size of the Jordan block $M_{\eta_j}$
- $\lambda$: Lagrange multiplier
- $\varrho$: first characteristic polynomial of a linear multi-step method
- $\rho$: density of steel products
- $\rho_j$: size of the Jordan block $J_{\rho_j}$
- $\rho_w$: density of furnace wall
- $\sigma$: Stefan-Boltzmann constant
- $\sigma_j$: size of the Jordan block $N_{\sigma_j}$
- $\Phi$: increment function of a one-step method
- $\Phi_{f,m}$: fuel flow into section $m$
- $\Phi_{g,m}$: waste gas flow from section $m$ into $m - 1$

## 1 Introduction

In recent years, the use of differential equations in connection with algebraic constraints on the variables, for example due to laws of conservation or position constraints, has become a widely accepted tool for modeling the dynamical behaviour of physical processes. Such combinations of both differential and algebraic equations are called differential-algebraic equations (DAEs).
DAEs arise naturally and have to be solved in a variety of applications, such as mechanical multibody systems (vehicle dynamics, aeronautics, biomechanics, robotics, etc.), chemical process simulation, control theory, simulation of electrical networks, fluid dynamics and many other areas (see for instance Steinbrecher 2006, section 4.1.3; Kunkel/Mehrmann 2006, section 1.3; Brenan et al. 1996, section 1.3). Furthermore, solving partial differential equations (PDEs) by using the method of lines and discretizing the spatial derivatives first, for example by finite element or finite difference methods, can also lead to DAEs (see Kunkel/Mehrmann 2006, p. 10; Brenan et al. 1996, pp. 10-12).

The present master thesis project is conducted in collaboration with DotX Control Solutions BV, a Dutch company based in Alkmaar. DotX Control Solutions BV is specialised in designing and implementing complex control software for industrial processes and has worked on projects in water management, steel production, wind turbine design and paper drying (see DotX Control Solutions BV 2014a, 2014c). DotX Control Solutions BV operates in a number of research fields, including nonlinear model predictive control (NMPC). NMPC is an advanced method of process control that typically computes a (sub-)optimal control based on predictions of the system dynamics. The predictions rely on a mathematical model of the complex dynamical process. As the name suggests, NMPC can handle nonlinearities in the underlying model. Nonlinear modeling realised by ordinary differential equations (ODEs) as system models has been an important and successful research field of DotX Control Solutions BV. However, advanced industrial applications require models of even greater realism and complexity. Thus, it is intended to use DAEs as system models in NMPC in order to improve the control performance.
While the numerical solution techniques and the theoretical analysis of ODEs have reached maturity, many difficult and complex questions concerning both the analytical and the numerical behaviour of DAEs remain untreated or unsolved despite an enormous increase in the research of DAEs during the past 40 years (see Steinbrecher 2006, p. 21; Kunkel/Mehrmann 2006, pp. 4-5). Compared to ODEs, it has been observed that a number of difficulties can arise when numerical methods are used to solve DAEs, for instance order reduction phenomena, drift-off effects or instabilities (see Steinbrecher 2006, p. 106). Therefore, the main purpose of this thesis is to investigate the analytical and numerical behaviour of DAEs. In particular, efficient numerical methods for solving DAEs are required in order to prospectively develop an applicable NMPC algorithm with DAEs as system models.

In order to achieve the desired goals of this thesis, we are interested in answering the following questions:

- What types of DAEs exist and how can DAEs be characterized with respect to their analytical and numerical behaviour?
- Which numerical methods exist for DAEs? What are their differences regarding accuracy and computation time?
- Which classes of DAEs create no or only slight difficulties while solving them numerically?
- What strategies exist for classes of DAEs which cause significant problems in their numerical treatment?

Therefore, this thesis is structured as follows: After this introduction, chapter 2 covers the analysis of DAEs by giving some introductory examples, introducing necessary basic definitions and treating first linear DAEs and afterwards general nonlinear DAEs. Chapter 3 is concerned with the numerical treatment of DAEs. This includes a brief discussion of the difficulties which can arise when numerical methods for ODEs are used to solve DAEs.
Subsequently, index reduction and regularisation techniques are surveyed, followed by different methods for the numerical solution of strangeness-free DAEs as well as numerical methods for solving systems of nonlinear equations. In chapter 4, two test problems, the mathematical pendulum and a reheat furnace model, are introduced. In chapter 5, the results of several implementations of numerical methods applied to the two test problems are presented and discussed. Finally, chapter 6 concludes this master thesis and gives an outlook on future research in this field.

## 2 Analysis of differential-algebraic equations

This chapter reviews important facts on the analysis of DAEs. In the first section 2.1, some introductory examples illustrate that DAEs can be seen as combining the properties of ODEs and purely algebraic equations. Necessary basic definitions are introduced in section 2.2. The following sections cover the range of DAEs from simple types to more general ones, starting with linear DAEs with constant coefficients in section 2.3. Linear DAEs with variable coefficients are considered in section 2.4, while nonlinear DAEs are treated in section 2.5.

In each of these sections, respectively, existence and uniqueness results are discussed. This includes the development of several canonical forms and the determination of characteristic quantities of a DAE, in particular different concepts of the so-called index. Generally, the idea of all these index concepts is to classify DAEs with respect to their difficulty in the analytical as well as the numerical solution. In particular, they measure the degree of smoothness of the problem that is needed to obtain existence and uniqueness results. In this thesis, the focus is set on the index of nilpotency for linear DAEs with constant coefficients, the most widely used differentiation index (d-index) and the recently developed strangeness index (s-index).
Other index concepts, such as the perturbation index (see Hairer et al. 1989), the geometric index (see Rheinboldt 1984), the tractability index (see Griepentrog/März 1986) or the structural index (see Pantelides 1988), will not be discussed. The contents of the present chapter are covered in a more detailed way in Kunkel/Mehrmann (2006) and partially in Steinbrecher (2006).

### 2.1 Introductory examples

**Example 1: Differential equations.** Consider the linear (ordinary) differential equations

$$\dot{x}_1 = 3x_1 - x_2, \qquad \dot{x}_2 = 4x_1 - 2x_2.$$

The coefficient matrix

$$A = \begin{pmatrix} 3 & -1 \\ 4 & -2 \end{pmatrix}$$

has eigenvalues $2$ and $-1$, and corresponding eigenvectors $\begin{pmatrix}1\\1\end{pmatrix}$ and $\begin{pmatrix}1\\4\end{pmatrix}$, respectively. Thus, the general solution of the system is given by

$$x(t) = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = c_1 \begin{pmatrix} 1 \\ 1 \end{pmatrix} e^{2t} + c_2 \begin{pmatrix} 1 \\ 4 \end{pmatrix} e^{-t}$$

with $c_1, c_2 \in \mathbb{R}$. By prescribing initial values, the parameters $c_1, c_2$ can be determined, e.g. $x(0) = \begin{pmatrix}3\\6\end{pmatrix}$ yields $c_1 = 2$, $c_2 = 1$. Hence, the solution of the initial value problem is

$$x(t) = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 2 \\ 2 \end{pmatrix} e^{2t} + \begin{pmatrix} 1 \\ 4 \end{pmatrix} e^{-t}.$$

In summary, the initial values can freely be chosen in $\mathbb{R}^2$, i.e. there are 2 degrees of freedom, and the solution manifold, i.e. the space every solution trajectory $(x(t), t)$ lies in, is the whole $\mathbb{R}^3$ (3-dimensional).

**Example 2: Algebraic equations.** Consider the algebraic equations

$$0 = 2x_1 + x_2 + 3\sin(t), \qquad 0 = x_1 - x_2 + 6\cos(t).$$

Adding both equations yields

$$0 = 3x_1 + 3\sin(t) + 6\cos(t) \quad\Leftrightarrow\quad x_1 = -\sin(t) - 2\cos(t).$$

By inserting this into the second equation we obtain

$$0 = -\sin(t) - 2\cos(t) - x_2 + 6\cos(t) \quad\Leftrightarrow\quad x_2 = -\sin(t) + 4\cos(t).$$

Thus, the unique solution of the system is given by

$$x(t) = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} -1 \\ -1 \end{pmatrix} \sin(t) + \begin{pmatrix} -2 \\ 4 \end{pmatrix} \cos(t).$$

In summary, initial values cannot be chosen, i.e. there are 0 degrees of freedom, and the solution manifold is one trajectory (1-dimensional).
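Both examples are easy to check symbolically. The following sketch (illustrative only, not part of the thesis) verifies the eigenvalue computation of Example 1 and the solutions of both examples with SymPy:

```python
import sympy as sp

# Example 1: eigenvalues of the coefficient matrix A.
A = sp.Matrix([[3, -1], [4, -2]])
assert sorted(A.eigenvals().keys()) == [-1, 2]

# The IVP solution x(t) = (2,2)^T e^{2t} + (1,4)^T e^{-t}
# satisfies x' = A x and the initial condition x(0) = (3,6)^T.
t = sp.symbols('t')
x = sp.Matrix([2, 2]) * sp.exp(2 * t) + sp.Matrix([1, 4]) * sp.exp(-t)
assert sp.expand(x.diff(t) - A * x) == sp.zeros(2, 1)
assert x.subs(t, 0) == sp.Matrix([3, 6])

# Example 2: the purely algebraic system is satisfied by its unique solution.
x1 = -sp.sin(t) - 2 * sp.cos(t)
x2 = -sp.sin(t) + 4 * sp.cos(t)
assert sp.simplify(2 * x1 + x2 + 3 * sp.sin(t)) == 0
assert sp.simplify(x1 - x2 + 6 * sp.cos(t)) == 0
print("Examples 1 and 2 verified")
```

Note how the ODE of Example 1 admits a two-parameter solution family, while the algebraic system of Example 2 leaves nothing to choose.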
**Example 3: Differential-algebraic equations I.** Consider the differential-algebraic equations

$$\dot{x}_1 = 2x_1 + 2x_2, \qquad 0 = x_1 + x_2 - \cos(t).$$

From the second equation we get $x_2 = -x_1 + \cos(t)$. Inserting this into the first equation yields

$$\dot{x}_1 = 2x_1 - 2x_1 + 2\cos(t) = 2\cos(t) \quad\Rightarrow\quad x_1 = 2\sin(t) + c$$

with $c \in \mathbb{R}$. By inserting this into the second equation we obtain $x_2 = -2\sin(t) + \cos(t) - c$. Thus, the general solution of the system is given by

$$x(t) = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 2 \\ -2 \end{pmatrix}\sin(t) + \begin{pmatrix} 0 \\ 1 \end{pmatrix}\cos(t) + \begin{pmatrix} 1 \\ -1 \end{pmatrix} c$$

with $c \in \mathbb{R}$. The parameter $c$ can be determined from prescribed initial values, but one has to be careful, since the initial values have to satisfy the algebraic equation. For example, $x(0) = \begin{pmatrix}1\\1\end{pmatrix}$ is not possible, since it yields $\begin{pmatrix}1\\1\end{pmatrix} = \begin{pmatrix}c\\1-c\end{pmatrix}$, which is contradictory. The initial values $x(0) = \begin{pmatrix}1\\0\end{pmatrix}$ are a possible choice and yield $c = 1$. Hence, the solution of the initial value problem is

$$x(t) = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 2 \\ -2 \end{pmatrix}\sin(t) + \begin{pmatrix} 0 \\ 1 \end{pmatrix}\cos(t) + \begin{pmatrix} 1 \\ -1 \end{pmatrix}.$$

In summary, the initial values cannot be chosen freely, i.e. there is 1 degree of freedom, and the solution manifold is a surface (2-dimensional).

The previous examples show that DAEs reflect the behaviour of differential equations as well as the behaviour of algebraic equations. In particular, the degrees of freedom and the dimension of the solution manifold of DAEs lie in between those of differential equations and algebraic equations. The conjecture that the degrees of freedom of a DAE are equal to the number of unknowns minus the number of algebraic equations is not true in general. It is possible that, besides the explicitly given algebraic equations, there are further algebraic constraints hidden in the DAE. These are so-called hidden constraints. This situation is illustrated in the following example.
**Example 4: Differential-algebraic equations II.** Consider the differential-algebraic equations

$$\dot{x}_1 = x_1 + 2x_2 + 3x_3 + \sin(t) + \cos(t),$$
$$\dot{x}_2 = x_1 + x_2 + 2x_3 + \cos(t),$$
$$0 = x_1 - x_2 + \cos(t).$$

Subtracting the second from the first equation yields

$$\dot{x}_1 - \dot{x}_2 = x_2 + x_3 + \sin(t).$$

By differentiating the third equation we get $\dot{x}_1 - \dot{x}_2 = \sin(t)$. Thus, it follows that

$$0 = x_2 + x_3,$$

which is the hidden constraint. Inserting the algebraic constraints into the second equation yields

$$\dot{x}_2 = (x_2 - \cos(t)) + x_2 - 2x_2 + \cos(t) = 0 \quad\Rightarrow\quad x_2 = c$$

with $c \in \mathbb{R}$. Hence, it follows that $x_1 = x_2 - \cos(t) = c - \cos(t)$ and $x_3 = -x_2 = -c$. Therefore, the general solution of the system is given by

$$x(t) = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} -1 \\ 0 \\ 0 \end{pmatrix}\cos(t) + \begin{pmatrix} 1 \\ 1 \\ -1 \end{pmatrix} c$$

with $c \in \mathbb{R}$. Thus, there is only 1 degree of freedom, and possible initial values have to satisfy both algebraic equations: the explicitly given constraint $0 = x_1 - x_2 + \cos(t)$ and the hidden constraint $0 = x_2 + x_3$.

### 2.2 Basic definitions

First of all, it is necessary to formulate a general definition of DAEs.

**Definition 1: Differential-algebraic equations.** A set of equations of the form

$$0 = F(\dot{x}(t), x(t), t) \tag{2.1}$$

with $F : D_{\dot{x}} \times D_x \times \mathbb{I} \to \mathbb{C}^m$, where $\mathbb{I} \subseteq \mathbb{R}$ is a compact interval and $D_{\dot{x}}, D_x \subseteq \mathbb{C}^n$ are open, $n, m \in \mathbb{N}$, is called a set of differential-algebraic equations (DAEs). Furthermore, $x : \mathbb{I} \to \mathbb{C}^n$ are called state variables or unknown variables and $t \in \mathbb{I}$ is called the independent variable.

Note that this most general form of DAEs includes ODEs ($\dot{x} = f(t, x)$) as well as purely algebraic equations ($0 = f(t, x)$) as special cases. Other special forms of DAEs are:

- Linear DAEs with constant coefficients:
  $$E\dot{x}(t) = Ax(t) + f(t) \tag{2.2}$$
  with $E, A \in \mathbb{C}^{m,n}$ and $f : \mathbb{I} \to \mathbb{C}^m$.
- Linear DAEs with variable coefficients:
  $$E(t)\dot{x}(t) = A(t)x(t) + f(t) \tag{2.3}$$
  with $E, A : \mathbb{I} \to \mathbb{C}^{m,n}$ and $f : \mathbb{I} \to \mathbb{C}^m$.
- Quasi-linear DAEs:
  $$E(x(t), t)\dot{x}(t) = f(x(t), t) \tag{2.4}$$
  with $E : \mathbb{C}^n \times \mathbb{I} \to \mathbb{C}^{m,n}$ and $f : \mathbb{C}^n \times \mathbb{I} \to \mathbb{C}^m$.
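In code, the general form (2.1) corresponds to a residual function $F(\dot{x}, x, t)$. As a sketch (illustrative only, not thesis code), Example 4 can be encoded in this residual form, its hidden constraint re-derived mechanically, and its general solution verified with SymPy:

```python
import sympy as sp

t, c = sp.symbols('t c')
x1, x2, x3 = (sp.Function(f'x{i}')(t) for i in (1, 2, 3))

# Residual form 0 = F(x', x, t) of Example 4.
F = sp.Matrix([
    x1.diff(t) - (x1 + 2 * x2 + 3 * x3 + sp.sin(t) + sp.cos(t)),
    x2.diff(t) - (x1 + x2 + 2 * x3 + sp.cos(t)),
    x1 - x2 + sp.cos(t),
])

# Differentiate the algebraic equation F[2] and eliminate x1' - x2' using
# F[0] - F[1]: on a solution both terms vanish, so what remains must also
# vanish. That remainder is exactly the hidden constraint.
hidden = sp.expand(F[2].diff(t) - (F[0] - F[1]))
print(hidden)   # x2(t) + x3(t), i.e. the hidden constraint 0 = x2 + x3

# The general solution x = (c - cos t, c, -c) satisfies all three equations.
sol = {x1: c - sp.cos(t), x2: c, x3: -c}
assert all(sp.simplify(eq) == 0 for eq in F.subs(sol).doit())
```

This differentiate-and-eliminate step is the mechanical core of the index-reduction procedures discussed later in the thesis.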
Uniqueness of solutions is usually considered in the context of initial value problems.

**Definition 2: Initial value problem.** If in addition to the DAE (2.1) an initial condition

$$x(t_0) = x_0 \tag{2.5}$$

with given $t_0 \in \mathbb{I}$ and $x_0 \in \mathbb{C}^n$ is prescribed, we have an initial value problem (IVP), where $x_0$ is called the initial value.

In the following, only the classical solvability concept is considered.

**Definition 3: Solution of DAEs.** A function $x : \mathbb{I} \to \mathbb{C}^n$ is called a solution of the DAE (2.1) if $x$ is continuously differentiable and satisfies (2.1) pointwise for all $t \in \mathbb{I}$. A solution $x$ of the DAE (2.1) is called a solution of the IVP (2.1), (2.5) if $x$ furthermore satisfies the initial condition (2.5).

**Definition 4: Consistency.** The DAE (2.1) is called solvable if there exists a solution of the DAE (2.1). Values $y \in \mathbb{C}^n$ are called consistent to the DAE (2.1) if there exists a solution $x$ of the DAE (2.1) and a $t \in \mathbb{I}$ with $x(t) = y$. The initial values $x_0$ in (2.5) are called consistent if there exists a solution of the IVP (2.1), (2.5).

**Example 5: Consistent initial values.** In Example 1, we have seen that any initial value $x_0 \in \mathbb{R}^2$ is consistent. In Example 2, there is only one consistent initial value, $x_0 = \begin{pmatrix}-2\\4\end{pmatrix}$. In Example 3, we have seen that $x_0 = \begin{pmatrix}1\\1\end{pmatrix}$ is not consistent, and that $x_0 = \begin{pmatrix}1\\0\end{pmatrix}$ is consistent. In general, consistent initial values for this example are of the form $x_0 = \begin{pmatrix}c\\1-c\end{pmatrix}$ with $c \in \mathbb{R}$.

### 2.3 Linear differential-algebraic equations with constant coefficients

In this section, we consider linear DAEs with constant coefficients of the form

$$E\dot{x}(t) = Ax(t) + f(t) \tag{2.6}$$

where $E, A \in \mathbb{C}^{m,n}$ and $f \in C(\mathbb{I}, \mathbb{C}^m)$, possibly together with an initial condition

$$x(t_0) = x_0 \quad\text{with } t_0 \in \mathbb{I},\ x_0 \in \mathbb{C}^n. \tag{2.7}$$

#### 2.3.1 Equivalence and regularity

Let $P \in \mathbb{C}^{m,m}$ and $Q \in \mathbb{C}^{n,n}$ be nonsingular. Then, multiplication of the DAE (2.6) with $P$ from the left and a change of variables $\tilde{x} = Q^{-1}x$ yields

$$E\dot{x} = Ax + f(t) \ \Leftrightarrow\ PE\dot{x} = PAx + Pf(t) \ \Leftrightarrow\ PEQ\dot{\tilde{x}} = PAQ\tilde{x} + Pf(t). \tag{2.8}$$

Thus, the DAE (2.6) can be transformed into another linear DAE with constant coefficients of the form

$$\tilde{E}\dot{\tilde{x}} = \tilde{A}\tilde{x} + \tilde{f}(t) \quad\text{with}\quad \tilde{E} = PEQ,\ \tilde{A} = PAQ,\ \tilde{f} = Pf. \tag{2.9}$$

Note that the relation $x = Q\tilde{x}$ gives a 1-to-1 correspondence between the corresponding solution spaces. Hence, we can also investigate the transformed DAE (2.9) instead of the original DAE (2.6) with respect to existence and uniqueness of solutions. This motivates the following definition.

**Definition 5: Strong equivalence.** Two pairs of matrices $(E_i, A_i) \in \mathbb{C}^{m,n} \times \mathbb{C}^{m,n}$, $i = 1, 2$, are called strongly equivalent if there exist nonsingular matrices $P \in \mathbb{C}^{m,m}$ and $Q \in \mathbb{C}^{n,n}$ such that

$$E_2 = PE_1Q \quad\text{and}\quad A_2 = PA_1Q. \tag{2.10}$$

Remark: As the name suggests, it can be shown that the relation introduced in Definition 5 is indeed an equivalence relation (see Kunkel/Mehrmann 2006, Lemma 2.2).

One important property of matrix pairs concerning the solution behaviour of the corresponding linear DAE is the following.

**Definition 6: Regular and singular matrix pairs.** Let $E, A \in \mathbb{C}^{m,n}$. The matrix pair $(E, A)$ is called regular if $m = n$ and if the polynomial $\det(\lambda E - A)$ is not the zero polynomial. Otherwise, $(E, A)$ is called singular.

**Lemma 1.** If $(E, A)$ with $E, A \in \mathbb{C}^{m,n}$ is strongly equivalent to a regular matrix pair, then $(E, A)$ is regular.

Proof: See Kunkel/Mehrmann (2006), Lemma 2.6.
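Definition 6 can be checked mechanically: form the polynomial $\det(\lambda E - A)$ and test whether it vanishes identically. A small SymPy sketch (the function name and the test matrices are illustrative, not taken from the thesis):

```python
import sympy as sp

lam = sp.symbols('lambda')

def is_regular(E, A):
    """Pair (E, A) is regular iff it is square and det(lam*E - A)
    is not the zero polynomial (Definition 6)."""
    E, A = sp.Matrix(E), sp.Matrix(A)
    if E.shape != A.shape or E.rows != E.cols:
        return False            # non-square pairs are singular by definition
    return sp.expand((lam * E - A).det()) != 0

# Here det(lam*E - A) = -1: a constant, but not the zero polynomial.
assert is_regular([[1, 0], [0, 0]], [[0, 1], [1, 0]])
# Here the second row of lam*E - A vanishes identically, so the pair is singular.
assert not is_regular([[1, 0], [0, 0]], [[1, 0], [0, 0]])
print("regularity checks passed")
```

Note that regularity is a property of the pencil, not of $E$ or $A$ alone: in the first test case both matrices are singular, yet the pair is regular.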
With the help of the equivalence relation for matrix pairs defined above, the next step consists in finding an appropriate canonical form from which the properties and invariants of the corresponding DAE can be easily determined. Thus, in the following two subsections we discuss canonical forms of matrix pairs, which can be traced back to the fundamental works of Weierstraß (1858) and Kronecker (1890), and the resulting solvability statements for the corresponding DAEs, first for regular and then for singular matrix pairs.

2.3.2 Solvability of regular linear differential-algebraic equations with constant coefficients

Definition 7: Nilpotent matrix, index of nilpotency
A matrix A ∈ Cn,n is called nilpotent if there exists a v ∈ N with A^v = 0 and A^{v−1} ≠ 0. The number v is called the index of nilpotency of A.

Theorem 1: Jordan canonical form (JCF)
Let A ∈ Cn,n. Then there exists a nonsingular matrix P ∈ Cn,n such that P^{−1}AP = diag(J1, . . . , Jk), where
\[
J_i = \begin{pmatrix} \lambda_i & 1 & & \\ & \lambda_i & \ddots & \\ & & \ddots & 1 \\ & & & \lambda_i \end{pmatrix} \in \mathbb{C}^{m_i,m_i} \tag{2.11}
\]
are so-called Jordan blocks with \(\sum_{i=1}^{k} m_i = n\). The JCF is unique except for permutation of the Jordan blocks.

Proof: See Weintraub (2009).

Theorem 2: Weierstraß canonical form (WCF)
Let E, A ∈ Cn,n such that (E, A) is regular. Then there exist a matrix J ∈ Cd,d and a nilpotent matrix N ∈ Cn−d,n−d with d ∈ N, 0 ≤ d ≤ n, both in Jordan canonical form, such that (E, A) is strongly equivalent to
\[
\left( \begin{pmatrix} I_d & 0 \\ 0 & N \end{pmatrix}, \begin{pmatrix} J & 0 \\ 0 & I_{n-d} \end{pmatrix} \right). \tag{2.12}
\]
The WCF is unique except for permutation of the Jordan blocks in J and N.

Proof: See Kunkel/Mehrmann (2006), Theorem 2.7.

Example 6: Weierstraß canonical form
In Example 4, we have a linear DAE with constant coefficients with the corresponding matrix pair
\[
(E, A) = \left( \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \begin{pmatrix} 1 & 2 & 3 \\ 1 & 1 & 2 \\ 1 & -1 & 0 \end{pmatrix} \right).
\]
(E, A) is regular since
\[
\det(\lambda E - A) = \det \begin{pmatrix} \lambda - 1 & -2 & -3 \\ -1 & \lambda - 1 & -2 \\ -1 & 1 & 0 \end{pmatrix} = -\lambda
\]
is not the zero polynomial. With
\[
P = \begin{pmatrix} 1 & -\tfrac{3}{2} & \tfrac{1}{2} \\ 1 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad
Q = \begin{pmatrix} -2 & 0 & 3 \\ -2 & 0 & 2 \\ 2 & 1 & -2 \end{pmatrix}
\]
we get the following WCF of (E, A):
\[
PEQ = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix} = \begin{pmatrix} I & 0 \\ 0 & N \end{pmatrix}, \qquad
PAQ = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} J & 0 \\ 0 & I \end{pmatrix}.
\]

By transforming the regular matrix pair (E, A) into WCF, the corresponding DAE E ẋ = Ax + f can be cast into the form
ẋ1 = J x1 + f1, N ẋ2 = x2 + f2 (2.13)
with x1 and x2 separated. The first subproblem ẋ1 = J x1 + f1 is a linear ODE whose corresponding IVP is uniquely solvable for f ∈ C(I, Cn). Therefore, we only have to investigate DAEs of the form
N ẋ = x + f (2.14)
where N ∈ Cn,n is nilpotent.

Lemma 2
Consider the DAE (2.14). Let v ∈ N be the index of nilpotency of N. If f ∈ C^v(I, Cn), then (2.14) has the unique solution
\[
x = -\sum_{i=0}^{v-1} N^i f^{(i)}. \tag{2.15}
\]
Proof: See Kunkel/Mehrmann (2006), Lemma 2.8.

Remarks:
• The solution of the DAE (2.14) is unique without specifying initial values. Thus, the only consistent initial condition at t0 is given by the value of x from (2.15) at t0.
• Formally, the inhomogeneity f has to be v times continuously differentiable to obtain a continuously differentiable solution x, but in general this is an overly restrictive assumption.

The following theorem summarises the obtained results.

Theorem 3
Consider the DAE (2.6) with the initial condition (2.7). Let E, A ∈ Cn,n with (E, A) regular, and let P, Q ∈ Cn,n be nonsingular matrices which transform (E, A) into WCF, i.e.
\[
PEQ = \begin{pmatrix} I & 0 \\ 0 & N \end{pmatrix}, \qquad PAQ = \begin{pmatrix} J & 0 \\ 0 & I \end{pmatrix}. \tag{2.16}
\]
Set
\[
Q^{-1}x = \begin{pmatrix} \tilde x_1 \\ \tilde x_2 \end{pmatrix}, \qquad Q^{-1}x_0 = \begin{pmatrix} \tilde x_{1,0} \\ \tilde x_{2,0} \end{pmatrix}, \qquad Pf = \begin{pmatrix} \tilde f_1 \\ \tilde f_2 \end{pmatrix}. \tag{2.17}
\]
Furthermore, let v be the index of nilpotency of N and let f ∈ C^v(I, Cn). Then we have the following:
1) The DAE (2.6) is solvable with the general solution
\[
x = Q \begin{pmatrix} \tilde x_1 \\ -\sum_{i=0}^{v-1} N^i \tilde f_2^{(i)} \end{pmatrix} \tag{2.18}
\]
where x̃1 is a solution of x̃˙1 = J x̃1 + f̃1.
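The transformation in Example 6 and the solution formula of Lemma 2 can both be checked numerically. The following sketch (with numpy/sympy) verifies the WCF computed above; the nilpotent block N and the inhomogeneity f in the second part are illustrative choices, not taken from the text:

```python
# Numerical check of the WCF computed in Example 6 and of the solution formula
# from Lemma 2. E, A, P, Q are taken from the example; N and f in the second
# part are illustrative choices.
import numpy as np
import sympy as sp

E = np.array([[1., 0., 0.], [0., 1., 0.], [0., 0., 0.]])
A = np.array([[1., 2., 3.], [1., 1., 2.], [1., -1., 0.]])
P = np.array([[1., -1.5, 0.5], [1., -1., 0.], [0., 0., 1.]])
Q = np.array([[-2., 0., 3.], [-2., 0., 2.], [2., 1., -2.]])

# P E Q = diag(I_1, N) and P A Q = diag(J, I_2) with J = (0), N = [[0,1],[0,0]]
PEQ, PAQ = P @ E @ Q, P @ A @ Q
assert np.allclose(PEQ, [[1, 0, 0], [0, 0, 1], [0, 0, 0]])
assert np.allclose(PAQ, [[0, 0, 0], [0, 1, 0], [0, 0, 1]])

# regularity: det(lambda*E - A) = -lambda is not the zero polynomial
lam, t = sp.symbols('lambda t')
Es = sp.Matrix([[1, 0, 0], [0, 1, 0], [0, 0, 0]])
As = sp.Matrix([[1, 2, 3], [1, 1, 2], [1, -1, 0]])
assert sp.expand(sp.det(lam * Es - As)) == -lam

# Lemma 2: N xdot = x + f has the unique solution x = -sum_{i<v} N^i f^(i)
N = sp.Matrix([[0, 1], [0, 0]])        # nilpotent with v = 2
f = sp.Matrix([sp.sin(t), sp.cos(t)])  # illustrative inhomogeneity
x = -(f + N * f.diff(t))               # the two terms of the sum for v = 2
assert sp.simplify(N * x.diff(t) - x - f) == sp.zeros(2, 1)
```

The last assertion substitutes the closed-form solution back into N ẋ = x + f, confirming that no initial condition can be chosen freely for the nilpotent part.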
2) An initial condition (2.7) is consistent if and only if x̃2,0 = − v−1 X (i) N i f˜2 (t0 ). (2.19) i=0 In particular, the set of consistent initial values is nonempty. 3) Any IVP (2.6)-(2.7) with consistent initial values has the unique solution (2.18) where x̃1 satisfies the initial condition x̃1 (t0 ) = x̃1,0 . K. Altmann CONFIDENTIAL Master of Science Thesis 2.3 Linear differential-algebraic equations with constant coefficients 13 Example 7 Consider the linear DAE from Example 4: 1 0 0 0 1 0 1 1 ẋ = 0 1 0 0 sin(t) + cos(t) 3 cos(t) 1 2 x + cos(t) −1 0 2 ⇔ E ẋ = Ax + f. In the previous example, we had the WCF with 1 P = 1 0 − 23 1 2 0 , −1 0 −2 0 Q= −2 0 2 1 3 2 , 1 −2 Q−1 1 = 0 1 − 32 1 −1 0 1 0 such that we have the equivalent DAE in WCF 1 0 0 0 0 ˜ ˙ Ẽ x̃ = Ãx̃ + f with Ẽ = P EQ = 0 0 1 , Ã = P AQ = 0 1 0 0 0 0 0 x̃ x − 3x sin(t) 1 1 2 2 −1 f˜ = P f = sin(t) , x̃ = x̃2 = Q x = x2 + x3 , v = 2. x̃3 cos(t) x1 − x2 0 0 , 1 Solving the two subproblems yields: x̃˙ 1 = sin(t) x̃2 x̃3 ! =− v−1 X (i) N i f˜2 = − 0 i=0 x̃1 = k − cos(t) with k ∈ R, ! ! ! ! 0 cos(t) 0 1 sin(t) . = − − cos(t) − sin(t) 0 0 cos(t) ⇒ ! 1 0 1 Back-transformation yields the general solution −2 0 x = Qx̃ = −2 0 2 3 k − cos(t) −2k − cos(t) = 2 0 1 −2 − cos(t) −2k 2k c − cos(t) = c −c with c = −2k ∈ R which is the same result we obtained in Example 4. For initial values x0 with t0 = 0 we get the consistency condition x̃2,0 x̃3,0 ! ! =− Master of Science Thesis v−1 X i=0 (i) N i f˜2 (0) = 0 ! −1 ⇒ −2x̃1,0 − 3 x0 = Qx̃0 = −2x̃1,0 − 2 . 2x̃1,0 + 2 CONFIDENTIAL K. Altmann 14 Chapter 2. Analysis of differential-algebraic equations 2.3.3 Solvability of singular linear differential-algebraic equations with constant coefficients Consider linear DAEs with constant coefficients E ẋ = Ax + f with (E, A) being singular, i.e. m 6= n or det(λE − A) = 0. Then, we have the following existence and uniqueness results for the corresponding IVP: Theorem 4 Let E, A ∈ Cm,n such that (E, A) is singular. 
Then:
1) If rank(λE − A) < n for all λ ∈ C, then the homogeneous IVP
E ẋ = Ax, x(t0) = 0 (2.20)
has a nontrivial solution. Thus, the solution of the corresponding inhomogeneous IVP (2.6)-(2.7) is not unique.
2) If rank(λE − A) = n for some λ ∈ C (and hence m > n), then there exist arbitrarily smooth inhomogeneities f for which the corresponding DAE (2.6) is not solvable. If the DAE (2.6) is solvable for a given f, then the corresponding IVP (2.6)-(2.7) is uniquely solvable for every consistent initial value.

Proof: See Kunkel/Mehrmann (2006), Theorem 2.14.

Similar to the regular case, there exists a canonical form for general matrix pairs (E, A), which is an extended version of the WCF:

Theorem 5: Kronecker canonical form (KCF)
Let E, A ∈ Cm,n. Then there exist nonsingular matrices P ∈ Cm,m and Q ∈ Cn,n such that (for all λ ∈ C)
P(λE − A)Q = diag(Jρ1, . . . , Jρr, Nσ1, . . . , Nσs, Lε1, . . . , Lεp, Mη1, . . . , Mηq) (2.21)
with r, s, p, q ∈ N0, where the block entries on the diagonal have the following properties:

1) Every entry Jρj is a Jordan block of size ρj × ρj, ρj ∈ N, λj ∈ C, of the form
\[
\lambda \begin{pmatrix} 1 & & \\ & \ddots & \\ & & 1 \end{pmatrix}
- \begin{pmatrix} \lambda_j & 1 & & \\ & \ddots & \ddots & \\ & & \ddots & 1 \\ & & & \lambda_j \end{pmatrix}. \tag{2.22}
\]

2) Every entry Nσj is a nilpotent block of size σj × σj, σj ∈ N, of the form
\[
\lambda \begin{pmatrix} 0 & 1 & & \\ & \ddots & \ddots & \\ & & \ddots & 1 \\ & & & 0 \end{pmatrix}
- \begin{pmatrix} 1 & & \\ & \ddots & \\ & & 1 \end{pmatrix}. \tag{2.23}
\]

3) Every entry Lεj is a bidiagonal block of size εj × (εj + 1), εj ∈ N0, of the form
\[
\lambda \begin{pmatrix} 0 & 1 & & \\ & \ddots & \ddots & \\ & & 0 & 1 \end{pmatrix}
- \begin{pmatrix} 1 & 0 & & \\ & \ddots & \ddots & \\ & & 1 & 0 \end{pmatrix}. \tag{2.24}
\]

4) Every entry Mηj is a bidiagonal block of size (ηj + 1) × ηj, ηj ∈ N0, of the form
\[
\lambda \begin{pmatrix} 1 & & \\ 0 & \ddots & \\ & \ddots & 1 \\ & & 0 \end{pmatrix}
- \begin{pmatrix} 0 & & \\ 1 & \ddots & \\ & \ddots & 0 \\ & & 1 \end{pmatrix}. \tag{2.25}
\]

The KCF is unique except for permutation of the blocks, i.e. the kind, size and number of the blocks are characteristic for the matrix pair (E, A).

Proof: See Gantmacher (1959), Chapter XII.

Remarks:
• For regular matrix pairs (E, A) the blocks Lεj and Mηj do not exist.
Then the KCF corresponds to the WCF with Jρ1 λI − J = .. , . Nσ1 λN − I = Jρr .. . . (2.26) Nσs • Blocks Lεj of size 0 × 1 and blocks Mηj of size 1 × 0 are possible. • The block notation in KCF implies that a pair (0, 0) of size 1 × 1 actually consists of two blocks, L0 of size 0 × 1 and M0 of size 1 × 0. Master of Science Thesis CONFIDENTIAL K. Altmann 16 Chapter 2. Analysis of differential-algebraic equations By transforming the matrix pair (E, A) into KCF the corresponding DAE E ẋ = Ax + f becomes x˙1 = Jx1 + f1 N x˙2 = x2 + f2 ) regular part = ˆ WCF −−−−−−−−−−−−−−−−−−−−−−−−− | | 0 Iε1 ẋ3 = Iε1 x + f 0 3 3 | | .. . | | 0 I x + f ẋ = I 0 ε p+2 p+2 p+2 ε p p | | L∗ − − − − − − − − − − − − − − − − − − − − − − − − − ! ! Iη1 −0− ẋp+3 = xp+3 + fp+3 Iη1 −0− .. M∗ . ! ! Iηq −0− ẋq+p+2 = xq+p+2 + fq+p+2 −0− Iηq (2.27) singular part. (2.28) The regular part has already been investigated in the previous subsection. Hence, it suffices to investigate DAEs of the form | | I ẋ = I 0 | I and ! −0− ẋ = 0 x + f | ! −0− x+f I of size m × (m + 1) (2.29) of size (n + 1) × n. (2.30) Consider a DAE according to an L∗ block, i.e. a DAE of the form 0 1 .. . .. . 0 ẋ1 1 . . = . 1 ẋm+1 0 .. . .. . 1 x1 f1 . . . + . . . . 0 xm+1 fm (2.31) This is an underdetermined system. Therefore, xj can be chosen arbitrarily for a j ∈ {1, . . . , m+1}. Then xk , k 6= j, can be determined by ẋk+1 = xk + fk by solving differential equations for k = j, . . . , m and by solving algebraic equations for k = j − 1, . . . , 1. K. Altmann CONFIDENTIAL Master of Science Thesis 2.3 Linear differential-algebraic equations with constant coefficients 17 Consider a DAE according to an M∗ block, i.e. a DAE of the form 1 .. 0 .. . . 0 ẋ1 1 .. . = 1 ẋn 0 .. . .. . x1 f1 .. .. . + . . 0 xn fn+1 1 (2.32) This is an overdetermined system. We get xn = −fn+1 , xj = ẋj+1 − fj+1 for j = n − 1, . . . , 1 and in addition the consistency condition 0 = ẋ1 − f1 . 
In summary, a DAE in KCF is solvable if and only if the consistency condition for every M∗ block is satisfied. Furthermore, free parameters result from the J block (initial values) and from L∗ blocks (whole function xj ). Example 8: Kronecker canonical form Consider the linear DAE = 2x1 − x2 , ẋ1 = 2x1 + x2 , ẋ2 = −x1 + 3x2 + x3 − x4 + f3 , = x1 − x2 − x3 + x4 + f4 . 0 with ẋ1 − ẋ2 1 1 E= 0 0 −1 0 1 0 0 0 0 , 0 0 0 0 0 2 2 A= −1 1 −1 0 1 0 3 1 −1 −1 ⇔ E ẋ = Ax + f, 0 0 f = f . 3 f4 0 0 , −1 1 Checking the regularity: λ−2 λ − 2 det(λE − A) = det 1 −1 = det −λ + 1 0 −1 0 λ−3 −1 1 λ−2 −λ + 1 λ−2 −1 1 ! · det | 0 0 1 −1 −1 1 ! 1 −1 {z } =0 =0 Thus, the DAE is singular with rank(λE − A) < n. It follows from Theorem 4 that the solution of a corresponding IVP will not be unique. Master of Science Thesis CONFIDENTIAL K. Altmann 18 Chapter 2. Analysis of differential-algebraic equations With 0 −1 P = −1 1 1 0 0 1 0 1 −1 −1 1 0 , 0 1 1 0 Q= 1 0 0 0 0 = Q−1 1 1 0 1 0 −1 −1 0 0 we get the transformed DAE in KCF 1 0 0 0 0 1 0 0 0 0 0 0 2 x̃˙ 1 0 ˙ 0 x̃2 = 0 0 x̃˙ 3 0 0 x̃˙ 4 0 ( ⇔ x̃1 0 1 0 0 2 0 0 1 0 0 0 x̃2 + 0 0 x̃3 −f3 f3 + f4 x̃4 0 P EQx̃˙ = P AQx̃ + P f, x̃ = Q−1 x) with P (λE − A)Q = diag(J2 , N1 , L0 , M0 ). Block J2 : Solving the linear ODE yields: x̃˙ 1 x̃˙ 2 ! = 2 ! ! x̃1 1 0 2 x̃2 x̃1 ⇒ x̃2 ! = c1 te2t + c2 e2t ! c1 e2t with c1 , c2 ∈ R. Block N1 : x̃3 = f3 (consistency condition for initial values) Block L0 : underdetermined system: one variable for zero equations ⇒ x̃4 arbitrary Block M0 : overdetermined system: zero variables for one equation ⇒ consistency condition f4 = −f3 Back-transformation yields the general solution x1 x̃1 c1 te2t + c2 e2t x2 x̃2 c1 e2t , x = = Qx̃ = = 2t 2t 2t x3 x̃1 − x̃2 − x̃3 + x̃4 c1 te + c2 e − c1 e − f3 + x̃4 x4 x̃4 x̃4 with c1 , c2 ∈ R. An initial condition x(t0 ) = x0 is consistent if x0 satisfies the consistency condition x̃3 (t0 ) = f3 (t0 ). With x̃ = Q−1 x we get x1 (t0 ) − x2 (t0 ) − x3 (t0 ) + x4 (t0 ) = f3 (t0 ). K. 
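The general solution obtained from the KCF in Example 8 can be double-checked by substituting it back into the original DAE. In the sketch below, c1 and c2 remain symbolic, while f3 and the free component x4 are concrete illustrative choices; the consistency condition f4 = −f3 from the M0 block is imposed explicitly:

```python
# Symbolic check that the general solution from Example 8 satisfies the
# original DAE when the consistency condition f4 = -f3 holds.
import sympy as sp

t = sp.symbols('t')
c1, c2 = sp.symbols('c1 c2')
f3 = sp.sin(t)                 # illustrative inhomogeneity
f4 = -f3                       # consistency condition from the M0 block
x4 = t**2                      # free component (L0 block): any function works

E = sp.Matrix([[1, -1, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0]])
A = sp.Matrix([[2, -1, 0, 0], [2, 1, 0, 0], [-1, 3, 1, -1], [1, -1, -1, 1]])
f = sp.Matrix([0, 0, f3, f4])

x1 = c1*t*sp.exp(2*t) + c2*sp.exp(2*t)
x2 = c1*sp.exp(2*t)
x3 = x1 - x2 - f3 + x4
x = sp.Matrix([x1, x2, x3, x4])

residual = sp.simplify(E * x.diff(t) - A * x - f)
assert residual == sp.zeros(4, 1)   # the DAE is satisfied identically in t
```

Dropping the condition f4 = −f3 makes the fourth (purely algebraic) equation fail, which mirrors the role of the M∗ blocks described above.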
2.3.4 Characteristic quantities

Index of nilpotency

We have seen that the KCF of a matrix pair (E, A) is unique except for permutation of the blocks. It contains the block Ñ = diag(Nσ1, . . . , Nσs) = λN − I with a nilpotent matrix N with index of nilpotency v. This index is independent of block permutations and is therefore also a characteristic quantity of the associated linear DAE (2.6).

Definition 8: Index of nilpotency for linear DAEs
Let E, A ∈ Cm,n be given and let diag(J̃, Ñ, L̃, M̃) be the KCF of λE − A with Ñ = λN − I, where N is nilpotent with index of nilpotency vn. Then vn is also called the index of nilpotency of the DAE E ẋ = Ax + f. If Ñ is not present in the KCF, then vn = 0.

In Lemma 2 we have seen that the index of nilpotency determines the smoothness requirements on the inhomogeneity f needed to obtain a continuously differentiable solution x. The index of nilpotency vn also has another meaning. Consider one block of N ẋ = x + f:
\[
\begin{pmatrix} 0 & 1 & & \\ & \ddots & \ddots & \\ & & 0 & 1 \\ & & & 0 \end{pmatrix}
\begin{pmatrix} \dot x_1 \\ \vdots \\ \dot x_{n-1} \\ \dot x_n \end{pmatrix}
=
\begin{pmatrix} x_1 \\ \vdots \\ x_{n-1} \\ x_n \end{pmatrix}
+
\begin{pmatrix} f_1 \\ \vdots \\ f_{n-1} \\ f_n \end{pmatrix}. \tag{2.33}
\]
The first n − 1 rows are differential equations; the last row is the algebraic constraint 0 = h0(x, t) := x_n + f_n. Differentiating this algebraic constraint once yields
\[
\begin{pmatrix} 0 & 1 & & \\ & \ddots & \ddots & \\ & & 0 & 1 \\ & & & -1 \end{pmatrix}\dot x
=
\begin{pmatrix} x_1 \\ \vdots \\ x_{n-1} \\ 0 \end{pmatrix}
+
\begin{pmatrix} f_1 \\ \vdots \\ f_{n-1} \\ \dot f_n \end{pmatrix}. \tag{2.34}
\]
A transformation from the left (adding the (n − 1)-st equation to the last one) yields
\[
\begin{pmatrix} 0 & 1 & & \\ & \ddots & \ddots & \\ & & 0 & 1 \\ & & & 0 \end{pmatrix}\dot x
=
\begin{pmatrix} x_1 \\ \vdots \\ x_{n-1} \\ x_{n-1} \end{pmatrix}
+
\begin{pmatrix} f_1 \\ \vdots \\ f_{n-1} \\ f_{n-1} + \dot f_n \end{pmatrix}. \tag{2.35}
\]
Thus, a new algebraic constraint 0 = h1(x, t) := x_{n−1} + f_{n−1} + ḟ_n is obtained which is contained in the DAE N ẋ = x + f, but hidden; h1(x, t) is called the hidden constraint of level 1.

Continuing the differentiation and transformation procedure yields after i steps
\[
\begin{pmatrix} 0 & 1 & & \\ & \ddots & \ddots & \\ & & 0 & 1 \\ & & & 0 \end{pmatrix}\dot x
=
\begin{pmatrix} x_1 \\ \vdots \\ x_{n-1} \\ x_{n-i} \end{pmatrix}
+
\begin{pmatrix} f_1 \\ \vdots \\ f_{n-1} \\ \sum_{j=0}^{i} f_{n-j}^{(i-j)} \end{pmatrix}. \tag{2.36}
\]
Thus, a new algebraic constraint 0 = hi(x, t) := x_{n−i} + \(\sum_{j=0}^{i} f_{n-j}^{(i-j)}\) is obtained after i differentiations of (parts of) the DAE N ẋ = x + f, which is called the hidden constraint of level i. Finally, after n (= vn) differentiations and transformations we get
\[
\begin{pmatrix} 0 & 1 & & \\ & \ddots & \ddots & \\ & & 0 & 1 \\ -1 & & & 0 \end{pmatrix}\dot x
=
\begin{pmatrix} x_1 \\ \vdots \\ x_{n-1} \\ 0 \end{pmatrix}
+
\begin{pmatrix} f_1 \\ \vdots \\ f_{n-1} \\ \sum_{j=0}^{n-1} f_{n-j}^{(n-j)} \end{pmatrix}. \tag{2.37}
\]
By multiplying this DAE from the left with
\[
P = \begin{pmatrix} & & & -1 \\ 1 & & & \\ & \ddots & & \\ & & 1 & \end{pmatrix}
\]
we obtain an ODE
\[
\begin{pmatrix} \dot x_1 \\ \dot x_2 \\ \vdots \\ \dot x_n \end{pmatrix}
=
\begin{pmatrix} 0 \\ x_1 \\ \vdots \\ x_{n-1} \end{pmatrix}
+
\begin{pmatrix} -\sum_{j=0}^{n-1} f_{n-j}^{(n-j)} \\ f_1 \\ \vdots \\ f_{n-1} \end{pmatrix}, \tag{2.38}
\]
which is called the underlying ODE. Thus, we have obtained hidden constraints 0 = hi(x, t) of level i = 1, . . . , vn − 1. Since the solution x has to satisfy all (hidden) constraints 0 = hi(x, t) for all t ∈ I, the pair (x(t), t) must lie in the so-called solution manifold
{(x, t) ∈ Cn × I : 0 = h0(x, t), . . . , 0 = hvn−1(x, t)}.
Note that here x is not the state of the DAE (2.6) but the state of the corresponding KCF.

In summary: the larger the index of nilpotency vn, the stronger the smoothness requirements on f and the deeper some constraints are hidden. The hidden constraints are not explicitly stated in the DAE; thus they impose additional consistency conditions on the initial values and cause difficulties in the numerical solution of the DAE. Therefore, the index of nilpotency vn is a measure of the numerical difficulty of solving the DAE.

Differentiation index

Another index concept, which generalises to nonlinear DAEs of the form 0 = F(ẋ, x, t), is the most common one in the literature. It goes back to an idea of Campbell (1987; see also Campbell/Griepentrog 1995) to differentiate the original DAE.
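The differentiation-and-elimination mechanism that produces the hidden constraints can be sketched symbolically for a single 2 × 2 nilpotent block; the function names below are chosen for illustration:

```python
# Level-1 hidden constraint of N xdot = x + f for a 2x2 nilpotent block,
# obtained exactly as in the text: differentiate the algebraic constraint
# and eliminate the derivative using the differential equation.
import sympy as sp

t = sp.symbols('t')
x1, x2 = sp.Function('x1')(t), sp.Function('x2')(t)
f1, f2 = sp.Function('f1')(t), sp.Function('f2')(t)

# N = [[0,1],[0,0]]: the block DAE reads  x2' = x1 + f1,  0 = x2 + f2
diff_eq = sp.Eq(x2.diff(t), x1 + f1)       # differential equation
constraint = sp.Eq(0, x2 + f2)             # explicit constraint 0 = h0(x, t)

# differentiate the constraint: 0 = x2' + f2'
dconstraint = sp.Eq(0, x2.diff(t) + f2.diff(t))
# eliminate x2' with the differential equation -> hidden constraint of level 1
hidden = dconstraint.subs(x2.diff(t), diff_eq.rhs)
print(hidden)   # the level-1 hidden constraint: 0 = x1 + f1 + f2'
```

The printed relation involves only x1, f1 and a derivative of f2, so it is an algebraic condition that every solution (and every consistent initial value) must satisfy even though it appears nowhere in the original block.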
Thus, summarising the nonlinear DAE 0 = F (ẋ, x, t) and all its derivatives up to a certain order l ∈ N0 in one large system of equations yields the following definition: Definition 9: Derivative array, inflated DAEs Let F : (ẋ, x, t) 7→ F (ẋ, x, t) be a function in C s (Cn × Cn × I, Cm ), s ∈ N. Then, the associated derivative array of order l with l ∈ N0 , l ≤ s, has the form F (ẋ, x, t) d F (ẋ, x, t) dt Fl (x, ẋ, . . . , x(l+1) , t) = . .. . dl F ( ẋ, x, t) dtl (2.39) Then, the corresponding derivative array equations or inflated DAEs with respect to the DAE 0 = F (ẋ, x, t) are given by 0 = Fl (x, ẋ, . . . , x(l+1) , t). Example 9: Derivative array for linear DAEs with constant coefficients Differentiating a linear DAE with constant coefficients (E ẋ = Ax + f ) yields E 0 0 −A 0 .. . E 0 −A .. . E .. . A ẋ ··· ẍ 0 = .. . . 0 . .. . .. . . ··· 0 0 0 .. . 0 ... x f 0 . . . ẋ + f˙ , 0 . . . . . . .. .. . . . . . i.e. the derivative array equations of order l for linear DAEs with constant coefficients are given by Ml · żl = Nl · zl + gl where E (Ml )ij = −A 0 for i = j for i = j + 1 , else (zl )j = x(j) , A for i = j = 0 (Nl )ij = , 0 else (gl )j = f (j) , i, j = 0, . . . , l, j = 0, . . . , l. The matrix pair (Ml , Nl ) with Ml , Nl ∈ Cm(l+1),n(l+1) is called inflated pair. Master of Science Thesis CONFIDENTIAL K. Altmann 22 Chapter 2. Analysis of differential-algebraic equations Based on the derivative array, the differentiation index is defined as follows (see Gear 1988; Campbell/Gear 1995; Steinbrecher 2006): Definition 10: Differentiation index (d-index), underlying ODE Suppose that the DAE (2.1) is a solvable DAE. Let Fl (x, ẋ, w, t) be the corresponding derivative array of order l where w = (ẍ, . . . , x(l+1) ). Let ẋ be considered locally as an algebraic variable y. 
The smallest number vd ∈ N0 (if it exists) for which y is uniquely determined by (x, t) and 0 = Fvd (x, y, w, t) for all consistent values as y = Φ(x, t) is called the differentiation index (d-index) of the DAE (2.1). With y = ẋ we have the underlying ODE ẋ = Φ(x, t). In other words, the d-index can be seen as a measurement of how far the DAE is away from an ODE. Note that the concept of the d-index is only suited for square systems with unique solutions since a DAE with a free solution component cannot lead to an ODE which is uniquely solvable for consistent initial values (see Kunkel/Mehrmann 2006, p. 96). For linear DAEs with constant coefficients, a simpler characterisation of the d-index can be formulated with the help of the following definition: Definition 11: 1-full A block matrix M ∈ Ckn,ln is called 1-full (with respect to the block structure built from n × n matrices) if and only if there exist a nonsingular matrix R ∈ Ckn,kn and a matrix H ∈ C(k−1)n,(l−1)n such that RM = In 0 0 H ! (2.40) . Lemma 3 Let a matrix pair (E, A) be given with the inflated pair (Ml , Nl ). Then, the d-index vd of the DAE E ẋ = Ax + f is the smallest number vd ∈ N0 for which Mvd is 1-full. Example 10: Differentiation index In Example 1, we have a linear ODE. Hence, the d-index is 0 since the derivative array equations of order 0 are the ODE itself. Thus, it is uniquely solvable for y = ẋ: y= K. Altmann 3 ! −1 4 −2 x. CONFIDENTIAL Master of Science Thesis 2.3 Linear differential-algebraic equations with constant coefficients 23 Consider the differential-algebraic equations from Example 4: 1 0 0 1 1 ẋ = 0 1 0 0 0 1 0 sin(t) + cos(t) 3 cos(t) 1 2 x + cos(t) −1 0 2 ⇔ E ẋ = Ax + f. Obviously, the derivative array equations of order 0 are not uniquely solvable for y = ẋ since E is singular. The derivative array equations of order 1 with x, y = ẋ, w = ẍ are given by 1 0 0 −1 −1 −1 0 0 0 0 1 0 0 0 0 0 0 0 −2 −3 1 0 −1 −2 0 1 1 0 0 0 1 0 1 0 ! 
1 0 y = 0 0 w 0 0 0 0 2 3 0 0 1 2 0 0 −1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 sin(t) + cos(t) 0 cos(t) 0 ! cos(t) 0 x + . 0 y cos(t) − sin(t) 0 − sin(t) − sin(t) 0 They are not uniquely solvable for y only dependent on x, t without dependence on!w. Thus, ! w1 ẍ consider the derivative array equations of order 2 with x, y = ẋ, w = ... = : x w2 1 0 0 −1 −1 −1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 −2 −3 1 0 0 0 0 −1 −2 0 1 0 0 0 1 0 0 0 0 0 0 0 0 −1 −2 −3 1 0 0 0 −1 −1 −2 0 1 0 0 −1 0 0 1 0 1 2 1 1 1 −1 0 0 0 0 0 0 0 0 0 0 0 Master of Science Thesis 0 0 0 0 0 y w = 0 1 0 w2 0 0 0 3 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 sin(t) + cos(t) 0 cos(t) 0 cos(t) 0 x cos(t) − sin(t) + . 0 y − sin(t) 0 w1 − sin(t) 0 − sin(t) − cos(t) 0 − cos(t) 0 − cos(t) CONFIDENTIAL K. Altmann 24 Chapter 2. Analysis of differential-algebraic equations −2 −2 2 0 Multiplying with R = 1 0 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 −1 −1 −1 0 0 −1 0 0 −3 0 0 3 −1 0 0 −2 0 −3 1 −1 1 2 0 0 1 0 0 0 0 −1 0 0 0 1 0 0 0 −2 3 −1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 −1 0 0 0 0 from the left yields 0 −2 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 y w = 0 0 0 0 0 1 1 0 0 0 0 w2 −2 −3 1 0 0 −1 −2 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 −1 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 sin(t) 0 0 0 0 0 cos(t) 0 x + 0 y 0 0 0 w1 − sin(t) − cos(t) 0 − cos(t) 0 − cos(t) 0 which contains in the first three rows y = (sin(t), 0, 0)> . Therefore, we get the d-index vd = 2. Note that the resulting system contains the constraint 0 = x1 − x2 + cos(t) in the 4th row as well as the hidden constraint 0 = x2 + x3 in the 5th row. Remark: In principle, it is possible to determine the hidden constraints from the KCF by back-transformation into the original states of the DAE or from the derivative array. 
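Lemma 3 can be made computational for this example. The sketch below builds the inflated matrices Ml for the pair from Example 10 and tests 1-fullness with the rank criterion rank(M) = n + rank(M with its first n columns removed); this criterion is used here as a working assumption equivalent to Definition 11, not a formula taken from the text:

```python
# d-index of E x' = A x + f from Example 10 via Lemma 3: find the smallest l
# for which M_l is 1-full, using a rank test for 1-fullness (assumed here):
# M is 1-full iff rank(M) = n + rank(M without its first n columns).
import numpy as np

E = np.array([[1., 0., 0.], [0., 1., 0.], [0., 0., 0.]])
A = np.array([[1., 2., 3.], [1., 1., 2.], [1., -1., 0.]])
n = 3

def inflated_M(l):
    """M_l with blocks (M_l)_{ii} = E and (M_l)_{i+1,i} = -A, zeros elsewhere."""
    M = np.zeros(((l + 1) * n, (l + 1) * n))
    for i in range(l + 1):
        M[i*n:(i+1)*n, i*n:(i+1)*n] = E
        if i > 0:
            M[i*n:(i+1)*n, (i-1)*n:i*n] = -A
    return M

def is_one_full(M):
    r_all = np.linalg.matrix_rank(M)
    r_rest = np.linalg.matrix_rank(M[:, n:]) if M.shape[1] > n else 0
    return r_all == n + r_rest

vd = next(l for l in range(10) if is_one_full(inflated_M(l)))
print(vd)  # 2, matching the d-index found in Example 10
```

M0 = E fails the test (rank 2 < 3), M1 fails as well, and M2 is the first 1-full inflated matrix, reproducing vd = 2.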
But the determination of the KCF is very technical and involves the computation of JCFs which are very sensitive to perturbations, and thus this approach is numerically not stable. However, the determination of the hidden constraints from the whole derivative array is very intricate, and furthermore it is hard to distinguish the different levels. Therefore, we consider another index concept in the following, the so-called strangeness index. K. Altmann CONFIDENTIAL Master of Science Thesis 2.3 Linear differential-algebraic equations with constant coefficients 25 Strangeness index Definition 12: Corange, cokernel Let E ∈ Cm,n . Then, the corange and cokernel of E are defined as follows: corange(E) = ker(E ∗ ), coker(E) = range(E ∗ ). (2.41) Remark: It holds that coker(E) is the orthogonal complement of ker(E), i.e. ker(E) ⊕ coker(E) = Cn , and corange(E) is the orthogonal complement of range(E), i.e. range(E) ⊕ corange(E) = Cm . For convenience, the following conventions are used: 1) We say a matrix T ∈ Cn,k is a basis of a subspace U ⊆ Cn if this is valid for its columns, i.e. if rank(T ) = k and range(T ) = U. 2) The empty matrix ∅n,0 ∈ Cn,0 is the only basis of {0} ∈ Cn with rank(∅n,0 ) = 0 and det(∅0,0 ) = 1. 3) For a given matrix T ∈ Cn,k with rank(T ) = k, we use T 0 to denote a matrix from Cn,n−k that completes T to a nonsingular matrix [T T 0 ] ∈ Cn,n . Theorem 6: Strong canonical form Let E, A ∈ Cm,n and introduce the matrices T : basis of ker(E), i.e. ET = 0, Z : basis of corange(E), i.e. Z ∗ E = 0, V : basis of corange(Z ∗ AT ), i.e. V ∗ Z ∗ AT = 0. Then the quantities r = rank(E) (rank), d=r−s (differential part), a = rank(Z ∗ AT ) (algebraic part), u=n−r−a (undetermined variables), s = rank(V ∗ Z ∗ AT 0 ) (strangeness), v = m − r − a − s (vanishing equations), called characteristic values, are independent of the particular choice of T, V, Z and T 0 and invariant under strong equivalence. 
Furthermore, the matrix pair (E, A) is strongly equivalent to the canonical form s Is 0 ( 0 0 0 Master of Science Thesis d a 0 0 Id 0 0 0 0 0 0 0 u s 0 0 0 0 0 , 0 0 Is 0 0 d a u A12 0 A14 A22 0 0 Ia 0 0 0 0 CONFIDENTIAL s A24 d 0 a ). 0 s 0 v (2.42) K. Altmann 26 Chapter 2. Analysis of differential-algebraic equations Proof: The proof is similar to the one of Theorem 3.7 in Kunkel/Mehrmann (2006). In particular, the transformation matrices for the construction of the strong canonical form can be determined with the singular value decomposition (SVD) which is numerically stable. By transforming the matrix pair (E, A) into the strong canonical form (2.42), the corresponding DAE E ẋ = Ax + f becomes ẋ1 = A12 x2 + A14 x4 + f1 , s (2.43a) ẋ2 = A22 x2 + A24 x4 + f2 , d (2.43b) 0 = x3 + f3 , a (2.43c) 0 = x1 + f4 , s (2.43d) 0 = f5 . v (2.43e) Thus, there is an algebraic equation (2.43c) for x3 (algebraic part) and a consistency condition (2.43e) for the inhomogeneity f5 (vanishing equations). Furthermore, there is a differential equation (2.43b) for x2 (differential part) with a possible free choice in x4 (undetermined variables). However, there is a problematic coupling, which is called strangeness, due to an algebraic equation (2.43d) as well as a differential equation (2.43a) for x1 . In order to eliminate the strangeness (i.e. to eliminate ẋ1 ), the idea is to differentiate the algebraic equation (2.43d) and to add it to the differential equation (2.43a). Then, this differentiationelimination step (DE-step) corresponds to a conversion into the modified DAE 0 = A12 x2 + A14 x4 + f1 + f˙4 , s (2.44a) d (2.44b) 0 = x3 + f3 , a (2.44c) 0 = x1 + f4 , s (2.44d) 0 = f5 , v (2.44e) ẋ2 = A22 x2 + A24 x4 + f2 , with corresponding modified matrix pair 0 0 (0 0 0 0 Id 0 0 0 0 0 0 0 0 0 0 , 0 0 0 Is 0 0 0 0 A12 0 A14 A22 0 0 Ia 0 0 0 0 A24 0 ) = (Emod , Amod ). 
0 0 (2.45) Remarks: • The linear DAE corresponding to (E, A) and the modified DAE corresponding to (Emod , Amod ) have the same set of solutions since we can reverse the procedure by differentiating (2.44d) and subtracting it from (2.44a). • The rank of the modified matrix Emod is reduced by s, i.e. rank(Emod ) = r − s = d. K. Altmann CONFIDENTIAL Master of Science Thesis 2.3 Linear differential-algebraic equations with constant coefficients 27 Because of the non-uniqueness of the strong canonical form, the following theorem is necessary to ensure that the modified DAE is still characteristic for the original problem. Theorem 7 Let E, A, Ẽ, Ã ∈ Cm,n with (E, A) and (Ẽ, Ã) being strongly equivalent and in strong canonical form (2.42). Then, the matrix pairs (Emod , Amod ) and (Ẽmod , Ãmod ) obtained by the DE-step from (E, A) and (Ẽ, Ã), respectively, are also strongly equivalent. Proof: The proof follows later from the corresponding theorem for linear DAEs with variable coeffi cients. Now, Theorem 7 allows for the following iterative procedure of finding a strangeness-free formulation: Procedure 1: Reduction to strangeness-free form Consider a linear DAE of the form E ẋ = Ax + f with E, A ∈ Cm,n . Starting from (E0 , A0 ) = (E, A), f0 = f , iterate for i = 0, 1, . . . 1) Transform (Ei , Ai ) into the strong canonical form (Ẽi , Ãi ), fi into f˜i and xi into x̃i by using the nonsingular matrices Pi ∈ Cm,m and Qi ∈ Cn,n with Ẽi = Pi Ei Qi , Ãi = Pi Ai Qi , f˜i = Pi fi , x̃i = Q−1 i xi . (2.46) We get the characteristic values ri , ai , si , di , ui , vi . If si = 0, stop the procedure. 2) Transform (Ẽi , Ãi ) into (Ei+1 , Ai+1 ) and f˜i into fi+1 by the differentiation-elimination step (DE-step), i.e. Ei+1 Is i = Ẽi − 0 0 , fi+1 0 0 ˙ f˜i,4 0 ˜ = fi + 0 , Ai+1 = Ãi , xi+1 = x̃i . 
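Step 1) of Procedure 1 needs only ranks and null-space bases, which the singular value decomposition provides in a numerically stable way. The following sketch computes the characteristic values of Theorem 6 for the pair (E0, A0) that appears in Example 11; the helper names and the tolerance are ad hoc choices:

```python
# Characteristic values (r, d, a, s, u, v) of a matrix pair (E, A), computed
# from the definitions in Theorem 6 with SVD-based null-space bases.
import numpy as np

def null_basis(M):
    """Orthonormal basis of ker(M) as columns."""
    u, sv, vh = np.linalg.svd(M)
    rank = int(np.sum(sv > 1e-12))
    return vh[rank:].conj().T                  # shape: n x (n - rank)

def characteristic_values(E, A):
    m, n = E.shape
    T = null_basis(E)                          # basis of ker(E)
    Z = null_basis(E.conj().T)                 # basis of corange(E) = ker(E*)
    Tp = null_basis(T.conj().T)                # completion T' of T
    ZAT = Z.conj().T @ A @ T
    r = int(np.linalg.matrix_rank(E))
    a = int(np.linalg.matrix_rank(ZAT)) if ZAT.size else 0
    V = null_basis(ZAT.conj().T)               # basis of corange(Z* A T)
    VZATp = V.conj().T @ (Z.conj().T @ A @ Tp)
    s = int(np.linalg.matrix_rank(VZATp)) if VZATp.size else 0
    return (r, r - s, a, s, n - r - a, m - r - a - s)

E0 = np.array([[1., 0., 0.], [0., 1., 0.], [0., 0., 0.]])
A0 = np.array([[1., 2., 3.], [1., 1., 2.], [1., -1., 0.]])
print(characteristic_values(E0, A0))  # (2, 1, 0, 1, 1, 0)
```

The output reproduces r0 = 2, d0 = 1, a0 = 0, s0 = 1, u0 = 1, v0 = 0 of Example 11; since s0 ≠ 0, the pair is not strangeness-free and the DE-step must be performed.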
(2.47)

Procedure 1 determines a non-unique sequence of matrix pairs (Ei, Ai), due to the non-uniqueness of the strong canonical form, but a unique sequence of characteristic values (ri, si, ai), due to Theorem 7. Because of the relation ri+1 = ri − si, we have si = 0 after finitely many iterations i. Then the sequence (ri, si, ai) becomes stationary and the procedure stops. This gives rise to the definition of a new characteristic quantity of the matrix pair (E, A) and the corresponding DAE as follows:

Definition 13: Strangeness index
Let E, A ∈ Cm,n and let the sequence (ri, si, ai), i ∈ N0, be determined by Procedure 1. Then vs = min{i ∈ N0 : si = 0} is called the strangeness index (s-index) of the matrix pair (E, A) and of the DAE E ẋ = Ax + f. In the case vs = 0, both the matrix pair (E, A) and the DAE E ẋ = Ax + f are called strangeness-free. Furthermore, from the sequence (Ei, Ai, fi) we obtain a strangeness-free formulation Ẽvs ẋ = Ãvs x + f̃vs.

Example 11: Strangeness index
Consider the differential-algebraic equations from Example 4:
ẋ1 = x1 + 2x2 + 3x3 + sin(t) + cos(t)
ẋ2 = x1 + x2 + 2x3 + cos(t)
0 = x1 − x2 + cos(t)
\[
\Rightarrow \quad (E_0, A_0) = \left( \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \begin{pmatrix} 1 & 2 & 3 \\ 1 & 1 & 2 \\ 1 & -1 & 0 \end{pmatrix} \right).
\]
1) Transformation into strong canonical form with
\[
P_0 = \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{pmatrix}, \qquad
Q_0 = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\]
yields
x̃˙1 = x̃2 + x̃3 + sin(t)
x̃˙2 = 2x̃2 + 2x̃3
0 = x̃1 + cos(t)
\[
\Rightarrow \quad (\tilde E_0, \tilde A_0) = \left( \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 1 & 1 \\ 0 & 2 & 2 \\ 1 & 0 & 0 \end{pmatrix} \right)
\]
with characteristic values r0 = 2, a0 = 0, s0 = 1, d0 = 1, u0 = 1, v0 = 0. Thus, the DAE is not strangeness-free and we have to proceed.
2) The DE-step yields
0 = x̃2 + x̃3
x̃˙2 = 2x̃2 + 2x̃3
0 = x̃1 + cos(t)
\[
\Rightarrow \quad (E_1, A_1) = \left( \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 1 & 1 \\ 0 & 2 & 2 \\ 1 & 0 & 0 \end{pmatrix} \right).
\]
1) Transformation into strong canonical form with
−2 P1 = 1 0 K.
Altmann 0 0 0 , 0 1 1 0 Q1 = 1 −1 CONFIDENTIAL 1 0 0 1 0 0 Master of Science Thesis 2.3 Linear differential-algebraic equations with constant coefficients 29 yields x̂˙ 1 = 0 ⇒ 0 = x̂2 0 = x̂3 + cos(t) 1 ( 0 0 0 0 0 0 0 0 , 0 1 0 0 0 0 0 0 ) = (Ẽ1 , Ã1 ) 1 with characteristic values r1 = 1, a1 = 2, s1 = 0, d1 = 1, u1 = 0, v1 = 0. Therefore, we get the s-index vs = 1. Note that the state variables x are transformed in step 1). It holds that x̂1 0 −1 −1 x̂ = = Q Q x = x̂ 1 0 2 0 x̂3 1 1 1 −1 x1 x2 0 1 x2 = x2 + x3 . x3 x1 − x2 0 The following theorem summarises the obtained results. Theorem 8 Let vs be the s-index of the DAE (2.6) and let f ∈ C vs (I, Cm ). Then, the DAE (2.6) is equivalent (in the sense that there is a 1-to-1 correspondence between the solution spaces via a nonsingular matrix) to a DAE of the form x̂˙ 1 = Â11 x̂1 + Â13 x̂3 + fˆ1 , dvs (2.48a) 0 = x̂2 + fˆ2 , avs (2.48b) 0 = fˆ3 , vvs (2.48c) where Â11 ∈ Cdvs ,dvs and Â13 ∈ Cdvs ,uvs and the inhomogeneities fˆ1 , fˆ2 , fˆ3 are determined from (E, A) and f , f (1) , . . . , f (vs ) using Procedure 1. Corollary 1 Let vs be the s-index of the DAE (2.6) and let f ∈ C vs +1 (I, Cm ). Then we have the following: 1) The DAE (2.6) is solvable if and only if the vvs consistency conditions fˆ3 = 0 are fulfilled. 2) An initial condition (2.7) is consistent if and only if it implies the avs conditions x̂2 (t0 ) = −fˆ2 (t0 ). 3) Any IVP (2.6)-(2.7) with consistent initial values is uniquely solvable if and only if uvs = 0 holds. Master of Science Thesis CONFIDENTIAL K. Altmann 30 Chapter 2. Analysis of differential-algebraic equations Example 12 Consider the DAE from Example 8: ẋ1 − ẋ2 = 2x1 − x2 ẋ1 = 2x1 + x2 ⇔ ẋ2 = −x1 + 3x2 + x3 − x4 + f3 E ẋ = Ax + f (∗) 0 = x1 − x2 − x3 + x4 + f4 with 1 1 E0 = E = 0 0 −1 0 1 0 0 0 0 , 0 0 0 0 0 2 2 A0 = A = −1 1 Transformation into strong canonical 0 1 −1 1 P = 1 −1 1 −1 −1 0 1 0 3 1 −1 −1 form with 1 0 0 0 0 0 , Q= 1 1 0 0 1 1 0 0 , −1 1 0 0 f = f . 
3 f4 0 0 0 1 0 0 1 1 0 0 1 1 yields x̃˙ 1 = 2x̃1 + x̃2 x̃˙ 2 = 2x̃2 0 = x̃3 + f3 ⇒ 0 = f3 + f4 1 0 ( 0 0 0 0 1 0 0 0 0 0 2 0 , 0 0 0 0 0 0 1 0 0 2 0 0 1 0 0 0 ) = (Ẽ0 , Ã0 ) 0 0 with characteristic values r0 = 2, a0 = 1, s0 = 0, d0 = 2, u0 = 1, v0 = 1. Thus, the DAE is strangeness-free. Solvability statements (see Corollary 1): 1) The DAE (∗) is solvable if and only if 0 = f3 + f4 . 2) An initial condition x(t0 ) = x0 is consistent if and only if x̂2 (t0 ) = −fˆ2 (t0 ). With 1 x̂1 x̃2 0 −1 x̃ = x̃ = x̂2 = Q x = −1 3 x̂3 0 x̃4 x̃1 0 1 1 −1 x1 x1 0 0 x2 x2 = 1 −1 x3 −x1 + x2 + x3 − x4 0 1 x4 −x2 + x4 0 0 we get −x1 (t0 ) + x2 (t0 ) + x3 (t0 ) − x4 (t0 ) = −f3 (t0 ). 3) The corresponding IVP is not uniquely solvable since u0 = 1. In particular, x̃4 can be chosen arbitrarily. K. Altmann CONFIDENTIAL Master of Science Thesis 2.3 Linear differential-algebraic equations with constant coefficients 31 Solving the transformed strangeness-free formulation yields c1 te2t + c2 e2t x̃2 c1 e2t = x̃ −f 3 3 x̃4 x̃4 x̃1 with c1 , c2 ∈ R. Back-transformation yields the general solution 1 0 x2 x= x = Qx̃ = 1 3 0 x4 x1 0 0 1 0 0 1 0 x̃1 c1 te2t + c2 e2t c1 e2t x̃2 0 . = 2t 2t 1 1 x̃1 + x̃3 + x̃4 c1 te + c2 e − f3 + x̃4 2t c1 e + x̃4 x̃2 + x̃4 0 1 Concluding remarks The following summarises the obtained results in this section: 1) The index of nilpotency vn is based on the Kronecker canonical form (KCF) for general matrix pairs or on the Weierstraß canonical form (WCF) for regular matrix pairs. The solvability statements and a solution approach are developed from the different blocks of these canonical forms. But the determination of the KCF is very technical and involves the computation of JCFs which are very sensitive to perturbations, and thus this approach is numerically not stable. 2) The d-index vd is based on the derivative array equations of the DAE. This index approach is only applicable for square systems which are uniquely solvable, i.e. 
for regular matrix pairs, because otherwise the d-index does not exist. Furthermore, it is in principle possible to determine the hidden constraints from the derivative array, but this requires a large effort and leads to unnecessary smoothness requirements due to the use of the whole derivative array.

3) The s-index vs is based on an iterative procedure which in each step first determines a strong canonical form of the DAE and subsequently performs a differentiation-elimination step, resulting in an equivalent strangeness-free formulation. The corresponding solvability statements and a solution approach are then developed from this strangeness-free formulation. Only necessary smoothness requirements are needed, since only parts of the DAE are differentiated; this makes the procedure efficient compared to the other approaches.

4) Comparing the solvability statements of Theorems 3 and 4 with Corollary 1 yields the following alternative characterisation of regular matrix pairs: a matrix pair (E, A) is regular if and only if the characteristic values satisfy uvs = vvs = 0, where vs is the s-index. Furthermore, regularity of the matrix pair (E, A) implies that the d-index vd is well-defined, and the relation between the two index concepts can be expressed as vs = max{0, vd − 1}.

2.4 Linear differential-algebraic equations with variable coefficients

In this section, we consider linear DAEs with variable coefficients of the form

  E(t)ẋ(t) = A(t)x(t) + f(t)   (2.49)

where E, A ∈ C(I, C^{m,n}) and f ∈ C(I, C^m), possibly together with an initial condition

  x(t0) = x0,   t0 ∈ I, x0 ∈ C^n.   (2.50)

2.4.1 Equivalence and regularity

Considering matrix functions E, A ∈ C(I, C^{m,n}), one might assume that demanding regularity of the matrix pair (E(t), A(t)) for all t ∈ I leads to unique solvability of the IVP, similar to Theorem 3.
However, the concept of regularity does not guarantee unique solvability of the IVP, as the following two examples show (compare Examples 3.1 and 3.2 in Kunkel/Mehrmann 2006):

Example 13
Let E, A and f be given by

  E(t) = [−t t²; −1 t],   A(t) = [−1 0; 0 −1],   f(t) = [0; 0],   I = R.

Then it follows that

  det(λE(t) − A(t)) = det([1 − λt  λt²; −λ  1 + λt]) = (1 − λt)(1 + λt) + λ²t² = 1.

Thus, (E(t), A(t)) is regular for all t ∈ I. However, for any c ∈ C¹(I, C) with c(t0) = 0 the function

  x(t) = c(t)[t; 1]

is a solution of the IVP E(t)ẋ(t) = A(t)x(t), x(t0) = 0, because

  E(t)ẋ(t) = E(t)( ċ(t)[t; 1] + c(t)[1; 0] ) = c(t)[−t; −1] = A(t)x(t).

Hence, the IVP is not uniquely solvable despite regularity.

Example 14
Let E, A and f be given by

  E(t) = [0 0; 1 −t],   A(t) = [−1 t; 0 0],   f(t) = [f1(t); f2(t)] ∈ C²(I, C²),   I = R.

Then it follows that

  det(λE(t) − A(t)) = det([1 −t; λ −λt]) = −λt + λt = 0.

Thus, (E(t), A(t)) is singular for all t ∈ I. However, with x = [x1; x2] we obtain from E(t)ẋ(t) = A(t)x(t) + f(t) the two equations

  0 = −x1(t) + tx2(t) + f1(t),
  ẋ1(t) − tẋ2(t) = f2(t).

The first equation gives x1(t) = tx2(t) + f1(t). Differentiating this equation and inserting it into the second equation yields x2(t) = f2(t) − ḟ1(t). Inserting this into the first equation gives x1(t) = tf2(t) − tḟ1(t) + f1(t). Thus, we have the unique solution

  x(t) = [x1(t); x2(t)] = [tf2(t) − tḟ1(t) + f1(t); f2(t) − ḟ1(t)].

Hence, every IVP with consistent initial values is uniquely solvable despite singularity.

Thus, in contrast to the constant coefficient case, the two properties, regularity of the matrix pair (E(t), A(t)) for all t ∈ I and unique solvability of the corresponding IVP, are completely independent of each other.
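The closed-form solution of Example 14 can be spot-checked numerically. The sketch below picks an arbitrary smooth inhomogeneity (the choices f1(t) = sin t and f2(t) = t² are hypothetical, not from the text) and verifies that x1 = tf2 − tḟ1 + f1, x2 = f2 − ḟ1 satisfies both equations of the DAE:

```python
import math

# Hypothetical inhomogeneity: f1(t) = sin(t), f2(t) = t^2, so f1'(t) = cos(t).
f1, df1 = math.sin, math.cos
f2 = lambda t: t * t

def x1(t):  # x1 = t*f2 - t*f1' + f1
    return t * f2(t) - t * df1(t) + f1(t)

def x2(t):  # x2 = f2 - f1'
    return f2(t) - df1(t)

def deriv(g, t, h=1e-6):
    """Central finite difference, good enough to verify the residuals."""
    return (g(t + h) - g(t - h)) / (2 * h)

for t in [-1.5, -0.3, 0.0, 0.7, 2.0]:
    r1 = -x1(t) + t * x2(t) + f1(t)          # first equation: 0 = -x1 + t*x2 + f1
    r2 = deriv(x1, t) - t * deriv(x2, t) - f2(t)  # second: x1' - t*x2' = f2
    assert abs(r1) < 1e-10 and abs(r2) < 1e-6, (t, r1, r2)
```

Both residuals vanish at every sample point, confirming the elimination steps of the example.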
This behaviour can be explained by the inadequacy of the strong equivalence relation (2.10) for linear DAEs with variable coefficients. Instead of transformations with nonsingular constant matrices, we should consider pointwise nonsingular matrix functions P ∈ C(I, C^{m,m}) and Q ∈ C¹(I, C^{n,n}). Then, the DAE (2.49) can be transformed into another linear DAE with variable coefficients by multiplying with P from the left and a change of variables x = Qx̃, as in the case of constant coefficients. Here we get ẋ = Qx̃˙ + Q̇x̃, i.e. an additional term Q̇x̃ due to the product rule. Hence, transformation of the DAE (2.49) yields:

  Eẋ = Ax + f   ⇔   PEẋ = PAx + Pf   ⇔   PEQx̃˙ = (PAQ − PEQ̇)x̃ + Pf.   (2.51)

This results in a different kind of equivalence:

Definition 14: Global equivalence
Two pairs of matrix functions (Ei, Ai), Ei, Ai ∈ C(I, C^{m,n}), i = 1, 2, are called globally equivalent if there exist pointwise nonsingular matrix functions P ∈ C(I, C^{m,m}) and Q ∈ C¹(I, C^{n,n}) such that

  E2 = PE1Q   and   A2 = PA1Q − PE1Q̇.   (2.52)

Remarks:
• It can be shown that the relation introduced in Definition 14 is indeed an equivalence relation (see Kunkel/Mehrmann 2006, Lemma 3.4).
• For P and Q constant, and therefore Q̇ = 0, we obtain strong equivalence.

In contrast to strong equivalence (see Lemma 1), regularity of a matrix pair (E(t), A(t)) for fixed t ∈ I is not invariant under global equivalence, which is illustrated in the following example:

Example 15
In Example 13, we have a linear DAE with variable coefficients with corresponding pair of matrix functions

  (E(t), A(t)) = ( [−t t²; −1 t], [−1 0; 0 −1] ).

We have seen that (E(t), A(t)) is regular for all t ∈ I. With

  P(t) = [0 −1; −1 t],   Q(t) = [1 t; 0 1]

we obtain

  Ẽ = PEQ = [1 0; 0 0],   Ã = PAQ − PEQ̇ = [0 1; 1 0] − [0 1; 0 0] = [0 0; 1 0].

Thus, (E(t), A(t)) is globally equivalent to the singular matrix pair ( [1 0; 0 0], [0 0; 1 0] ).
For given matrices P̄ ∈ C^{m,m} and Q̄, R̄ ∈ C^{n,n}, and for a given t̄ ∈ I, we can define the matrix functions

  P(t) = P̄,   Q(t) = Q̄ + (t − t̄)R̄   (2.53)

satisfying

  P(t̄) = P̄,   Q(t̄) = Q̄,   Q̇(t̄) = R̄.   (2.54)

Therefore, we get some freedom in the equivalence relation considered locally at a fixed point t̄ ∈ I. This gives rise to the definition of the following local version of the equivalence relation:

Definition 15: Local equivalence
Two pairs of matrices (Ei, Ai) ∈ C^{m,n} × C^{m,n}, i = 1, 2, are called locally equivalent if there exist matrices P ∈ C^{m,m} and Q, R ∈ C^{n,n} with P and Q nonsingular such that

  E2 = PE1Q   and   A2 = PA1Q − PE1R.   (2.55)

Remarks:
• Again, it can be shown that the relation introduced in Definition 15 is indeed an equivalence relation (see Kunkel/Mehrmann 2006, Lemma 3.6).
• For R = 0 we obtain strong equivalence. Hence, compared to strong equivalence there are more transformations available under local equivalence to simplify the structure of a given matrix pair.
• For P = Im and Q = In we obtain E2 = E1 and A2 = A1 − E1R. Thus, we can subtract multiples of the columns of E1 from arbitrary columns of A1, i.e. we can eliminate parts of A2 with the help of E1.

2.4.2 Strangeness index approach

With the help of the local equivalence relation, we have the following theorem for matrix pairs in addition to Theorem 6:

Theorem 9: Local canonical form
Let E, A ∈ C^{m,n} and let r, a, s, d, u, v be the characteristic values of (E, A) as in Theorem 6. Then, the matrix pair (E, A) is locally equivalent to the canonical form

  ( [Is 0 0 0; 0 Id 0 0; 0 0 0 0; 0 0 0 0; 0 0 0 0], [0 0 0 0; 0 0 0 0; 0 0 Ia 0; Is 0 0 0; 0 0 0 0] )   (2.56)

where the block rows have sizes s, d, a, s, v and the block columns have sizes s, d, a, u.

Proof: See Kunkel/Mehrmann (2006), Theorem 3.7.

Definition 16: Local characteristic values
Let E, A ∈ C(I, C^{m,n}) be matrix functions and let t̄ ∈ I be fixed.
Then, the characteristic values r, a, s, d, u, v from Theorem 6 for (Ē, Ā) with Ē = E(t̄) and Ā = A(t̄) are called local characteristic values of (E, A) at t̄.

Thus, for a pair of matrix functions (E(t), A(t)) we obtain local characteristic values r, a, s for every t ∈ I, i.e. r, a, s form functions r, a, s : I → N0 of characteristic values. Now, we want to construct a canonical form under global equivalence by first assuming that these functions are constant on I, i.e. that the block sizes in the strong canonical form as well as in the local canonical form do not depend on t ∈ I. We will see later what happens if the local characteristic values are not constant and to what extent the requirement of constancy is actually a loss of generality. The restriction to constant local characteristic values allows, amongst others, the application of the following generalisation of the SVD to matrix functions:

Theorem 10: Smooth SVD
Let E ∈ C^l(I, C^{m,n}), l ∈ N0 ∪ {∞}, with rank(E(t)) = r ∈ N0 for all t ∈ I. Then, there exist pointwise unitary functions U ∈ C^l(I, C^{m,m}) and V ∈ C^l(I, C^{n,n}) such that

  U*EV = [Σ 0; 0 0]   (2.57)

where Σ ∈ C^l(I, C^{r,r}) is pointwise nonsingular.

Proof: See Kunkel/Mehrmann (2006), Theorem 3.9; or Steinbrecher (2006), Theorem 2.1.4.

Suitable matrix functions U and V in (2.57) can be determined numerically by using the following theorem:

Theorem 11
If E ∈ C¹(I, C^{m,n}), then suitable matrix functions U = [Z′ Z] and V = [T′ T] in (2.57) can be obtained as solutions of the ODEs

  [Z′*E; T*] Ṫ = −[Z′*ĖT; 0],   [T′*E*; Z*] Ż = −[T′*Ė*Z; 0],
  Ṫ′ = −T(Ṫ*T′),   Ż′ = −Z(Ż*Z′),   (2.58)

with initial values

  T(t0) = T0,   T′(t0) = T0′,   Z(t0) = Z0,   Z′(t0) = Z0′   (2.59)

satisfying

  [Z0′ Z0]* E(t0) [T0′ T0] = [Σ0 0; 0 0],   Σ0 nonsingular.   (2.60)

Proof: See Kunkel/Mehrmann (2006), Corollary 3.10.
By using the smooth SVD, it is possible to construct the following global canonical form for pairs of matrix functions under global equivalence. On the one hand, since R = Q̇ has to be satisfied in the local equivalence relation (2.55), we expect a more complicated canonical form than the local canonical form. On the other hand, since we can use time-dependent transformation matrices, we expect a simpler canonical form than the strong canonical form.

Theorem 12: Global canonical form
Let E, A ∈ C(I, C^{m,n}) be matrix functions and suppose that r(t) ≡ r, a(t) ≡ a, s(t) ≡ s for the local characteristic values r(t), a(t), s(t) of (E(t), A(t)). Then, (E, A) is globally equivalent to the canonical form

  ( [Is 0 0 0; 0 Id 0 0; 0 0 0 0; 0 0 0 0; 0 0 0 0], [0 A12 0 A14; 0 0 0 A24; 0 0 Ia 0; Is 0 0 0; 0 0 0 0] )   (2.61)

where the block rows have sizes s, d, a, s, v, the block columns have sizes s, d, a, u, and the Aij are (non-unique) matrix functions on I.

Proof: See Kunkel/Mehrmann (2006), Theorem 3.11.

Remark: The global canonical form (2.61) can also be obtained for linear DAEs with constant coefficients if variable transformation matrices are used, in particular in order to eliminate the block A22. In general, the global canonical form (2.61) cannot be obtained with constant transformation matrices.

By transforming the pair of matrix functions (E, A) into the global canonical form (2.61), the corresponding DAE E(t)ẋ = A(t)x + f(t) becomes

  ẋ1 = A12(t)x2 + A14(t)x4 + f1(t),   s   (2.62a)
  ẋ2 = A24(t)x4 + f2(t),              d   (2.62b)
  0 = x3 + f3(t),                     a   (2.62c)
  0 = x1 + f4(t),                     s   (2.62d)
  0 = f5(t).                          v   (2.62e)

Again, we have an algebraic equation (2.62c) for x3, a consistency condition (2.62e) for the inhomogeneity f5 and a differential equation (2.62b) for x2 with a possible free choice of x4. Furthermore, there is a problematic coupling between the algebraic equation (2.62d) and the differential equation (2.62a) for x1.
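The transformation rule (2.51) underlying global equivalence is easy to check numerically. The sketch below redoes the computation of Example 15 (P and Q as given there) at several sample points and confirms Ẽ = PEQ = [1 0; 0 0] and Ã = PAQ − PEQ̇ = [0 0; 1 0]; the helper functions matmul and sub are not part of the thesis:

```python
def matmul(X, Y):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sub(X, Y):
    """Entrywise difference of two 2x2 matrices."""
    return [[X[i][j] - Y[i][j] for j in range(2)] for i in range(2)]

for t in [-2.0, -0.5, 0.0, 1.0, 3.0]:
    E = [[-t, t * t], [-1.0, t]]
    A = [[-1.0, 0.0], [0.0, -1.0]]
    P = [[0.0, -1.0], [-1.0, t]]      # matrices of Example 15
    Q = [[1.0, t], [0.0, 1.0]]
    Qdot = [[0.0, 1.0], [0.0, 0.0]]   # derivative of Q(t)

    Etil = matmul(matmul(P, E), Q)                      # PEQ
    Atil = sub(matmul(matmul(P, A), Q),                 # PAQ - PE Qdot
               matmul(matmul(P, E), Qdot))

    assert Etil == [[1.0, 0.0], [0.0, 0.0]]
    assert Atil == [[0.0, 0.0], [1.0, 0.0]]
```

The pair ([1 0; 0 0], [0 0; 1 0]) is singular for every t, reproducing the loss of regularity observed in Example 15.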
In order to eliminate this problematic coupling, we can apply the differentiation-elimination step in the same way as in the constant coefficient case to obtain the modified DAE

  0 = A12(t)x2 + A14(t)x4 + f1(t) + ḟ4(t),   s   (2.63a)
  ẋ2 = A24(t)x4 + f2(t),                     d   (2.63b)
  0 = x3 + f3(t),                            a   (2.63c)
  0 = x1 + f4(t),                            s   (2.63d)
  0 = f5(t),                                 v   (2.63e)

with corresponding modified pair of matrix functions

  ( [0 0 0 0; 0 Id 0 0; 0 0 0 0; 0 0 0 0; 0 0 0 0], [0 A12 0 A14; 0 0 0 A24; 0 0 Ia 0; Is 0 0 0; 0 0 0 0] ) = (Emod, Amod).   (2.64)

Again, the linear DAEs corresponding to (E, A) and (Emod, Amod) have the same set of solutions since we can reverse the procedure of differentiating and eliminating. Because of the non-uniqueness of the global canonical form, the following theorem is necessary to ensure that the modified DAE is still characteristic for the original problem.

Theorem 13
Let E, A, Ẽ, Ã ∈ C(I, C^{m,n}) with (E, A) and (Ẽ, Ã) being globally equivalent and in global canonical form (2.61). Then, the pairs of matrix functions (Emod, Amod) and (Ẽmod, Ãmod) obtained by the DE-step from (E, A) and (Ẽ, Ã), respectively, are also globally equivalent.

Proof: See Kunkel/Mehrmann (2006), Theorem 3.14.

Now, Theorem 13 allows an extension of Procedure 1 to linear DAEs with variable coefficients:

Procedure 2: Reduction to strangeness-free form
Consider a linear DAE with variable coefficients of the form E(t)ẋ = A(t)x + f(t) with E, A ∈ C(I, C^{m,n}). Starting from (E0, A0) = (E, A), f0 = f, iterate for i = 0, 1, . . .

1) Compute the local characteristic values ri(t), ai(t), si(t) of (Ei, Ai) for all t ∈ I. If si(t) ≡ 0, stop the procedure.
2) If the local characteristic values are constant on I, transform (Ei, Ai) into the global canonical form (Ẽi, Ãi), fi into f̃i and xi into x̃i by using pointwise nonsingular matrix functions Pi ∈ C(I, C^{m,m}) and Qi ∈ C¹(I, C^{n,n}) with

  Ẽi = PiEiQi,   Ãi = PiAiQi − PiEiQ̇i,   f̃i = Pifi,   x̃i = Qi⁻¹xi.   (2.65)

3) Transform (Ẽi, Ãi) into (Ei+1, Ai+1) and f̃i into fi+1 by the differentiation-elimination step (DE-step), i.e.

  Ei+1 = Ẽi − [Isi 0; 0 0],   Ai+1 = Ãi,   fi+1 = f̃i + [f̃˙i,4; 0; 0; 0; 0],   xi+1 = x̃i,   (2.66)

i.e. the identity block Isi in the first block row of Ẽi is deleted and the differentiated inhomogeneity f̃˙i,4 is added to the first block of f̃i.

If Procedure 2 does not break down, i.e. all local characteristic values (ri, si, ai) are constant, then the procedure determines a non-unique sequence of pairs of matrix functions (Ei, Ai) and a unique sequence of characteristic values (ri, si, ai), which becomes stationary after finitely many iterations i. In accordance with the definition of the s-index for linear DAEs with constant coefficients, the s-index of the pair of matrix functions (E, A) and of the corresponding DAE is defined as follows:

Definition 17: Strangeness index
Let E, A ∈ C(I, C^{m,n}) and let the sequence (ri, si, ai), i ∈ N0, be determined by Procedure 2. In particular, let ri, si, ai be constant on I. Then,

  vs = min{i ∈ N0 : si = 0}

is called the strangeness index (s-index) of the pair of matrix functions (E, A) and of the DAE E(t)ẋ = A(t)x + f(t). In the case that vs = 0, both the pair of matrix functions (E, A) and the DAE are called strangeness-free. Furthermore, from the sequence (Ei, Ai, fi) we obtain a strangeness-free formulation Ẽvs(t)ẋ = Ãvs(t)x + f̃vs(t).

Theorem 14
Let the s-index vs of the DAE (2.49) be well-defined and let f ∈ C^vs(I, C^m).
Then, the DAE (2.49) is equivalent (in the sense that there is a 1-to-1 correspondence between the solution spaces via a pointwise nonsingular matrix function) to a DAE of the form

  x̂˙1 = Â13(t)x̂3 + f̂1(t),   dvs   (2.67a)
  0 = x̂2 + f̂2(t),           avs   (2.67b)
  0 = f̂3(t),                vvs   (2.67c)

where Â13 ∈ C(I, C^{dvs,uvs}) and the inhomogeneities f̂1, f̂2, f̂3 are determined from (E, A) and f, f(1), . . . , f(vs) using Procedure 2.

According to the strangeness-free form (2.67), the solvability statements for linear DAEs with variable coefficients are exactly the same as for the constant coefficient case:

Corollary 2
Let the s-index vs of the DAE (2.49) be well-defined and let f ∈ C^{vs+1}(I, C^m). Then we have the following:

1) The DAE (2.49) is solvable if and only if the vvs consistency conditions f̂3 = 0 are fulfilled.

2) An initial condition (2.50) is consistent if and only if it implies the avs conditions x̂2(t0) = −f̂2(t0).

3) Any IVP (2.49)-(2.50) with consistent initial values is uniquely solvable if and only if uvs = 0 holds.

Example 16
Consider the DAE E(t)ẋ = A(t)x + f(t) from Example 13 with

  E(t) = [−t t²; −1 t],   A(t) = [−1 0; 0 −1],   f(t) = [g1(t); g2(t)],   I = R.

Following Procedure 2 yields:

• i = 0: E0 = E, A0 = A, f0 = f.

1) Local characteristic values: r0 = rank(E0) = 1 for all t ∈ I. Choose T = [t; 1] as basis of ker(E0) and T′ = [1; −t]. Choose Z = [1; −t] as basis of corange(E0). Then it follows that

  a0 = rank(Z*A0T) = rank(0) = 0   for all t ∈ I.

Choose V = 1 as basis of corange(Z*A0T). Then it follows that

  s0 = rank(V*Z*A0T′) = rank(−1 − t²) = 1   for all t ∈ I.

Thus, we have (r0, a0, s0) = (1, 0, 1) constant on I. In particular, s0 ≠ 0, i.e. the DAE is not strangeness-free.
2) Global canonical form: With

  P0 = [0 −1; −1 t],   Q0 = [1 t; 0 1],   Q̇0 = [0 1; 0 0]

we obtain

  Ẽ0 = P0E0Q0 = [1 0; 0 0],   Ã0 = P0A0Q0 − P0E0Q̇0 = [0 0; 1 0],
  f̃0 = P0f0 = [−g2; −g1 + tg2],   x̃ = Q0⁻¹x = [x1 − tx2; x2].

3) DE-step:

  E1 = Ẽ0 − [1 0; 0 0] = [0 0; 0 0],   A1 = Ã0 = [0 0; 1 0],
  f1 = f̃0 + [f̃˙0,2; 0] = [−g2 − ġ1 + g2 + tġ2; −g1 + tg2] = [−ġ1 + tġ2; −g1 + tg2].

• i = 1:

1) Local characteristic values: r1 = rank(E1) = 0 for all t ∈ I. Choose T = [1 0; 0 1], T′ = ∅2,0 and Z = [1 0; 0 1]. Then it follows that

  a1 = rank(Z*A1T) = rank(A1) = 1   for all t ∈ I.

Choose V = [1; 0] as basis of corange(Z*A1T). Then it follows that

  s1 = rank(V*Z*A1T′) = rank(∅1,0) = 0   for all t ∈ I.

Thus, we have (r1, a1, s1, d1, u1, v1) = (0, 1, 0, 0, 1, 1) constant on I. In particular, s1 = 0, i.e. the procedure stops. Hence, the s-index is vs = 1, and the corresponding strangeness-free formulation is of the form

  0 = −ġ1 + tġ2,
  0 = x̃1 − g1 + tg2.

Hence, the DAE is solvable if and only if the consistency condition 0 = −ġ1 + tġ2 is fulfilled. An initial condition x(t0) = x0 is consistent if and only if x̃1(t0) = g1(t0) − t0g2(t0). With

  x̃ = [x̃1; x̃2] = Q0⁻¹x = [1 −t; 0 1][x1; x2] = [x1 − tx2; x2]

we get x1(t0) − t0x2(t0) = g1(t0) − t0g2(t0). Furthermore, the corresponding IVP is not uniquely solvable since u1 = 1. In particular, x̃2 can be chosen arbitrarily. Thus, the general solution of the DAE is given by

  x = [x1; x2] = Q0x̃ = [1 t; 0 1][g1 − tg2; x̃2] = [g1 − tg2 + tx̃2; x̃2].

Example 17
Consider the DAE E(t)ẋ = A(t)x + f(t) from Example 14 with

  E(t) = [0 0; 1 −t],   A(t) = [−1 t; 0 0],   f(t) = [g1(t); g2(t)],   I = R.

Following Procedure 2 yields:

• i = 0: E0 = E, A0 = A, f0 = f.
1) Local characteristic values: r0 = rank(E0) = 1 for all t ∈ I. Choose T = [t; 1] as basis of ker(E0) and T′ = [1; −t]. Choose Z = [1; 0] as basis of corange(E0). Then it follows that

  a0 = rank(Z*A0T) = rank(0) = 0   for all t ∈ I.

Choose V = 1 as basis of corange(Z*A0T). Then it follows that

  s0 = rank(V*Z*A0T′) = rank(−1 − t²) = 1   for all t ∈ I.

Thus, we have (r0, a0, s0) = (1, 0, 1) constant on I, i.e. the same local characteristic values as in the previous example. In particular, s0 ≠ 0, i.e. the DAE is not strangeness-free.

2) Global canonical form: With

  P0 = [0 1; −1 0],   Q0 = [1 t; 0 1],   Q̇0 = [0 1; 0 0]

we obtain

  Ẽ0 = P0E0Q0 = [1 0; 0 0],   Ã0 = P0A0Q0 − P0E0Q̇0 = [0 −1; 1 0],
  f̃0 = P0f0 = [g2; −g1],   x̃ = Q0⁻¹x = [x1 − tx2; x2].

3) DE-step:

  E1 = Ẽ0 − [1 0; 0 0] = [0 0; 0 0],   A1 = Ã0 = [0 −1; 1 0],
  f1 = f̃0 + [f̃˙0,2; 0] = [g2 − ġ1; −g1].

• i = 1:

1) Local characteristic values: r1 = rank(E1) = 0 for all t ∈ I. Choose T = [1 0; 0 1], T′ = ∅2,0 and Z = [1 0; 0 1]. Then it follows that

  a1 = rank(Z*A1T) = rank(A1) = 2   for all t ∈ I.

Choose V = ∅2,0 as basis of corange(Z*A1T). Then it follows that

  s1 = rank(V*Z*A1T′) = rank(∅0,0) = 0   for all t ∈ I.

Thus, we have (r1, a1, s1, d1, u1, v1) = (0, 2, 0, 0, 0, 0) constant on I. In particular, s1 = 0, i.e. the procedure stops. Hence, the s-index is vs = 1, and the corresponding strangeness-free formulation is of the form

  0 = −x̃2 + g2 − ġ1,
  0 = x̃1 − g1.

Hence, the DAE is solvable since there is no consistency condition. An initial condition x(t0) = x0 is consistent if and only if x̃1(t0) = g1(t0) and x̃2(t0) = g2(t0) − ġ1(t0). With

  x̃ = [x̃1; x̃2] = Q0⁻¹x = [1 −t; 0 1][x1; x2] = [x1 − tx2; x2]

we get x1(t0) − t0x2(t0) = g1(t0) and x2(t0) = g2(t0) − ġ1(t0).
Furthermore, the corresponding IVP is uniquely solvable since u1 = 0, with the general solution given by

  x = [x1; x2] = Q0x̃ = [1 t; 0 1][g1; g2 − ġ1] = [g1 + tg2 − tġ1; g2 − ġ1].

Remark: The results obtained so far are valid only for constant local characteristic values. The following theorem on the rank of continuous matrix functions gives a characterisation of how restrictive this constancy assumption is.

Theorem 15
Let I ⊆ R be a closed interval and M ∈ C(I, C^{m,n}). Then, there exist open intervals Ij ⊆ I, j ∈ N, with

  ⋃j Īj = I   and   Ij ∩ Ii = ∅ for i ≠ j,   (2.68)

and integers rj ∈ N0, j ∈ N, such that

  rank(M(t)) = rj   for all t ∈ Ij,   (2.69)

i.e. the rank of M(t) is constant on each subinterval Ij, j ∈ N.

Proof: See Campbell/Meyer (1979), Theorem 10.5.2.

The local characteristic values of a pair of matrix functions correspond to the ranks of several matrices (see Theorem 6). Furthermore, the s-index is well-defined if the local characteristic values of all pairs of matrix functions constructed within Procedure 2 are constant. Therefore, as a consequence of Theorem 15, the s-index of a pair of matrix functions (E, A) restricted to the subinterval Ij is well-defined for all j ∈ N. Thus, the s-index is well-defined on a dense subset of the given closed interval I. But at an exceptional point, i.e. where any of the ranks changes, we cannot associate an s-index.

2.4.3 Derivative array approach

In the previous subsection, an iterative procedure for linear DAEs was developed in order to determine an equivalent strangeness-free formulation of the considered DAE. In this subsection, we investigate an approach which is not iterative and which is based on the derivative array. The general definition of the derivative array and the corresponding inflated DAEs for nonlinear DAEs (2.39) is also valid here. For linear DAEs with variable coefficients it simplifies to:
Example 18: Derivative array for linear DAEs with variable coefficients
Differentiating a linear DAE with variable coefficients E(t)ẋ = A(t)x + f(t) l times yields

  [E 0 ⋯ 0; Ė−A E ⋯ 0; Ë−2Ȧ 2Ė−A E ⋯ 0; ⋮ ; E^(l)−lA^(l−1) ⋯ lĖ−A E] [ẋ; ẍ; ⋯; x^(l+1)]
    = [A 0 ⋯ 0; Ȧ 0 ⋯ 0; ⋮ ; A^(l) 0 ⋯ 0] [x; ẋ; ⋯; x^(l)] + [f; ḟ; ⋯; f^(l)],

i.e. the derivative array equations of order l for linear DAEs with variable coefficients are given by

  Ml(t)żl(t) = Nl(t)zl(t) + gl(t)

where

  (Ml)ij = (i over j) E^(i−j) − (i over j+1) A^(i−j−1),   i, j = 0, . . . , l,
  (Nl)ij = A^(i) for j = 0, and 0 else,                   i, j = 0, . . . , l,
  (zl)j = x^(j),   (gl)j = f^(j),                         j = 0, . . . , l,

using the convention that (i over j) = 0 for j > i. The pair of matrix functions (Ml, Nl) with Ml, Nl ∈ C(I, C^{m(l+1),n(l+1)}) is called the inflated pair.

Based on the inflated pair (Ml, Nl), l ∈ N0, the idea is to construct a DAE with the same solution set as the original DAE, but with better analytical properties, i.e. a strangeness-free DAE with a separated part which explicitly states all constraints of the problem. By assuming a well-defined s-index vs, the determination of an equivalent formulation of the DAE E(t)ẋ = A(t)x + f(t) with the same solution set and of the form (2.67), using only information from the inflated pair (Mvs, Nvs), is discussed in Kunkel/Mehrmann (2006, pp. 91-93). In particular, the construction of a matrix function Z2 of size (vs+1)m × avs with

  Z2*Mvs = 0   and   rank(Z2*Nvs[In 0 . . . 0]*) = avs,   (2.70)

a matrix function T2 of size n × (dvs + uvs) with

  Z2*Nvs[In 0 . . . 0]*T2 = 0   and   rank(ET2) = dvs,   (2.71)

and a matrix function Z1 of size m × dvs with

  rank(Z1*ET2) = dvs   (2.72)

is described.
Hence, Z2 extracts avs linearly independent algebraic constraints from the derivative array equations, and Z1 selects dvs differential equations such that the constructed pair of matrix functions

  (Ê, Â) = ( [Ê1; 0; 0], [Â1; Â2; 0] )   (2.73)

with block rows of sizes dvs, avs, vvs and entries

  Ê1 = Z1*E,   Â1 = Z1*A,   Â2 = Z2*Nvs[In 0 . . . 0]*   (2.74)

has the same size as the original pair (E, A). Moreover, it can be shown that this pair of matrix functions is indeed strangeness-free, independent of the choice of the transformations Z2, T2, Z1:

Theorem 16
Let the s-index vs of (E, A) with E, A ∈ C(I, C^{m,n}) be well-defined with characteristic values (ri, ai, si), i = 1, . . . , vs. Then, every pair of matrix functions (Ê, Â) constructed as in (2.73) has a well-defined s-index v̂s = 0 and the local characteristic values of (Ê(t), Â(t)) are given by

  (r̂, â, ŝ) = (dvs, avs, 0)   (2.75)

uniformly in t ∈ I.

Proof: See Kunkel/Mehrmann (2006), Theorem 3.32.

Remark: By setting f̂1 = Z1*f and f̂2 = Z2*gvs, we obtain the equations

  Ê1(t)ẋ = Â1(t)x + f̂1(t)   and   0 = Â2(t)x + f̂2(t)   (2.76)

from Mvs(t)żvs(t) = Nvs(t)zvs(t) + gvs(t). Furthermore, by setting f̂3 = 0 and

  f̂ = [f̂1; f̂2; f̂3],   Ê = [Ê1; 0; 0],   Â = [Â1; Â2; 0],   (2.77)

we get the system

  Ê(t)ẋ = Â(t)x + f̂(t).   (2.78)

Due to the construction of this system, every solution of the original DAE (2.49) is also a solution of (2.78); but by setting f̂3 = 0, we may convert an unsolvable problem into a solvable one.

Example 19
Consider the DAE E(t)ẋ = A(t)x + f(t) from Example 13 with

  E(t) = [−t t²; −1 t],   A(t) = [−1 0; 0 −1],   f(t) = [0; 0],   I = R.

In Example 16, we have seen that vs = 1, dvs = 0, avs = 1 and vvs = 1. Then, the inflated pair of level vs = 1 is given by

  M1(t) = [−t t² 0 0; −1 t 0 0; 0 2t −t t²; 0 2 −1 t],
  N1(t) = [−1 0 0 0; 0 −1 0 0; 0 0 0 0; 0 0 0 0],
  g1(t) = [0; 0; 0; 0].

We have rank(M1(t)) = 2 for all t ∈ I.
By choosing Z2* = [1 −t 0 0] it holds that

  Z2*M1 = 0,   rank(Z2*N1[In 0]*) = rank([−1 t]) = avs = 1.

Since dvs = 0, Z1* is of size 0 × 2 and we get

  Ê(t) = [0 0; 0 0],   Â(t) = [−1 t; 0 0],   f̂(t) = [0; 0].

Thus, the solution x1 = tx2 with arbitrary x2 can be read off directly.

Example 20
Consider the DAE E(t)ẋ = A(t)x + f(t) from Example 14 with

  E(t) = [0 0; 1 −t],   A(t) = [−1 t; 0 0],   f(t) = [f1(t); f2(t)],   I = R.

In Example 17, we have seen that vs = 1, dvs = 0, avs = 2 and vvs = 0. Then, the inflated pair of level vs = 1 is given by

  M1(t) = [0 0 0 0; 1 −t 0 0; 1 −t 0 0; 0 −1 1 −t],
  N1(t) = [−1 t 0 0; 0 0 0 0; 0 1 0 0; 0 0 0 0],
  g1(t) = [f1; f2; ḟ1; ḟ2].

We have rank(M1(t)) = 2 for all t ∈ I. By choosing

  Z2* = [1 0 0 0; 0 1 −1 0]

it holds that Z2*M1 = 0 and

  rank(Z2*N1[In 0]*) = rank([−1 t; 0 −1]) = avs = 2.

Since dvs = 0, Z1* is of size 0 × 2 and we get

  Ê(t) = [0 0; 0 0],   Â(t) = Z2*N1[In 0]* = [−1 t; 0 −1],   f̂(t) = Z2*g1 = [f1; f2 − ḟ1].

Thus, from Ê(t)ẋ = Â(t)x + f̂(t) it follows that

  x2 = f2 − ḟ1   and   x1 = tx2 + f1 = tf2 − tḟ1 + f1.

Now, we discuss under which conditions the original DAE (2.49) and the strangeness-free formulation (2.78) have the same solution set, i.e. when such a reformulation is possible without setting f̂3 = 0.
Therefore, we formulate the following hypothesis, which guarantees that the reformulation can be performed without the presence of a consistency condition for the inhomogeneity f or free solution components:

Hypothesis 1
There exist integers v̂, â and d̂ (= n − â) such that the inflated pair (Mv̂, Nv̂) of the given pair of matrix functions (E, A) with E, A ∈ C^v̂(I, C^{n,n}) satisfies:

1) For all t ∈ I we have rank(Mv̂(t)) = (v̂ + 1)n − â, such that there exists a smooth matrix function Z2 of size (v̂ + 1)n × â and pointwise full rank â satisfying Z2*Mv̂ = 0.

2) For all t ∈ I we have rank(Â2(t)) = â, where Â2 = Z2*Nv̂[In 0 . . . 0]*, such that there exists a smooth matrix function T2 of size n × d̂ and pointwise full rank d̂ satisfying Â2T2 = 0.

3) For all t ∈ I we have rank(E(t)T2(t)) = d̂, such that there exists a smooth matrix function Z1 of size n × d̂ and pointwise full rank d̂ satisfying rank(Z1*ET2) = d̂ for all t ∈ I.

Remarks:
• If the s-index vs is well-defined and uvs = vvs = 0, then Hypothesis 1 is satisfied with v̂ = vs, â = avs and d̂ = dvs.
• If Hypothesis 1 is satisfied, then we have a reduction of the derivative array equations Mv̂ż = Nv̂z + gv̂ to Êẋ = Âx + f̂ where

  Ê = [Ê1; 0] = [Z1*E; 0],   Â = [Â1; Â2] = [Z1*A; Z2*Nv̂[In 0 . . . 0]*],   f̂ = [f̂1; f̂2] = [Z1*f; Z2*gv̂].   (2.79)

• Note that there is no change of basis in the state space, i.e. only transformations from the left are performed, such that the state variables x remain the same.

By construction, it is immediately clear that if Hypothesis 1 is satisfied, any solution x of the original DAE (2.49) also solves the reduced system (2.79). The following theorem shows that the converse also holds:

Theorem 17
Let (E, A) be a pair of matrix functions satisfying Hypothesis 1. Then, x solves the DAE (2.49) if and only if x solves the DAE (2.79).
Proof: See Kunkel/Mehrmann (2006), Theorem 3.51.

As a consequence, we obtain the following characterisation of consistency of initial values as well as existence and uniqueness of solutions:

Theorem 18
Let (E, A) be a pair of matrix functions satisfying Hypothesis 1 with values v̂, d̂, â and let E, A ∈ C^{v̂+1}(I, C^{n,n}) and f ∈ C^{v̂+1}(I, C^n). Then we have the following:

1) An initial condition (2.50) is consistent if and only if it implies the â conditions

  0 = Â2(t0)x0 + f̂2(t0).   (2.80)

2) Any IVP (2.49)-(2.50) with consistent initial values is uniquely solvable.

Proof: The proof follows from the previous theorem.

2.4.4 Differentiation index

The general definition of the d-index for nonlinear DAEs (see subsection 2.3.4) is also valid here. Similar to linear DAEs with constant coefficients, a simpler characterisation of the d-index for linear DAEs with variable coefficients can be formulated with the help of the following definition:

Definition 18: Smoothly 1-full
A block matrix function M ∈ C(I, C^{kn,ln}) is called smoothly 1-full (with respect to the block structure built from n × n matrix functions) if there exists a pointwise nonsingular matrix function R ∈ C(I, C^{kn,kn}) such that

  RM = [In 0; 0 H]   (2.81)

for some H ∈ C(I, C^{(k−1)n,(l−1)n}).

Lemma 4
Let M ∈ C(I, C^{kn,ln}) have constant rank. Then, M is smoothly 1-full if and only if M is pointwise 1-full.

Proof: See Kunkel/Mehrmann (2006), Lemma 3.36.

Note that the constant rank assumption in Lemma 4 cannot be removed (otherwise Theorem 10 cannot be applied).

Lemma 5
Let a pair of matrix functions (E, A) be given with inflated pairs (Ml, Nl), l ∈ N. Then, the d-index vd of the DAE E(t)ẋ = A(t)x + f(t) is the smallest number vd ∈ N0 for which Mvd is pointwise 1-full and has constant rank.
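The criterion of Lemma 5 can be tested numerically. Pointwise 1-fullness admits a simple rank characterisation that is not stated in the text but is straightforward to verify: M(t) is 1-full if and only if rank(M(t)) = n + rank of the submatrix obtained by deleting the first n columns. The sketch below implements this with a small Gaussian-elimination rank routine and applies it to the inflated matrices M1 and M2 of the DAE of Example 21 below, with E = [0 t; 0 0] and A = I2 (helper names are not from the text):

```python
def rank(M, tol=1e-9):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    M = [row[:] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        piv = max(range(r, rows), key=lambda i: abs(M[i][c]), default=None)
        if piv is None or abs(M[piv][c]) < tol:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(r + 1, rows):
            factor = M[i][c] / M[r][c]
            M[i] = [M[i][j] - factor * M[r][j] for j in range(cols)]
        r += 1
    return r

def one_full(M, n, tol=1e-9):
    """M is 1-full iff its first n columns add n to the rank of the rest."""
    tail = [row[n:] for row in M]
    return rank(M, tol) == n + rank(tail, tol)

def M1(t):  # inflated matrix of order 1 for E = [[0,t],[0,0]], A = I2
    return [[0, t, 0, 0],
            [0, 0, 0, 0],
            [-1, 1, 0, t],   # block row (Edot - A, E)
            [0, -1, 0, 0]]

def M2(t):  # order 2; third block row is (Eddot - 2*Adot, 2*Edot - A, E)
    return [[0, t, 0, 0, 0, 0],
            [0, 0, 0, 0, 0, 0],
            [-1, 1, 0, t, 0, 0],
            [0, -1, 0, 0, 0, 0],
            [0, 0, -1, 2, 0, t],
            [0, 0, 0, -1, 0, 0]]

for t in [-1.0, -0.5, 0.0, 0.5, 1.0]:
    assert rank(M1(t)) == 2                  # constant rank, corank 2
    assert one_full(M1(t), 2) == (t == 0.0)  # M1 is not pointwise 1-full on I
    assert rank(M2(t)) == 4 and one_full(M2(t), 2)  # M2: 1-full, constant rank
```

M2 is the first inflated matrix that is pointwise 1-full with constant rank, which is consistent with the d-index vd = 2 obtained for this DAE.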
Now, the next aim is to show that (except for some technical smoothness assumptions) the d-index is well-defined if and only if Hypothesis 1 is satisfied.

Theorem 19
Let (E, A) be a pair of matrix functions with a well-defined d-index vd. Then, (E, A) satisfies Hypothesis 1 with v̂ = max{0, vd − 1}, d̂ = n − â and

  â = 0 for vd = 0,  â = corank M_{vd−1}(t) otherwise.   (2.82)

Proof: See Kunkel/Mehrmann (2006), Theorem 3.50.

Theorem 20
Let (E, A) be a pair of matrix functions that satisfies Hypothesis 1 with characteristic values v̂, d̂ and â. Then, the d-index vd is well-defined for (E, A). Furthermore, if v̂ is chosen minimally, then it holds that

  vd = 0 for â = 0,  vd = v̂ + 1 otherwise.   (2.83)

Proof: See Kunkel/Mehrmann (2006), Corollary 3.53.

Example 21
Consider the linear DAE

  [0 t; 0 0] ẋ = [1 0; 0 1] x + [f1(t); f2(t)],  I = [−1, 1],

(see Example 3.54 in Kunkel/Mehrmann 2006). Obviously, the matrix function E has a rank drop at t = 0. Therefore, the s-index is not well-defined. Examining the inflated pair of order 1,

  M1(t) = [0 t 0 0; 0 0 0 0; −1 1 0 t; 0 −1 0 0],  N1(t) = [1 0 0 0; 0 1 0 0; 0 0 0 0; 0 0 0 0],

we have that M1 has a constant corank â = 2. By choosing

  Z2*(t) = [1 0 0 t; 0 1 0 0]

with Z2* M1 = 0, we obtain

  Â2(t) = Z2* N1 [In 0 … 0]* = [1 0; 0 1],  f̂2 = Z2* g1 = [f1(t) + t ḟ2(t); f2(t)].

Thus, (E, A) satisfies Hypothesis 1 with v̂ = 1, â = 2 and d̂ = 0. Hence, the d-index is well-defined with vd = 2. The reduced system Ê ẋ = Â x + f̂ is given by

  [0 0; 0 0] ẋ = [1 0; 0 1] x + [f1(t) + t ḟ2(t); f2(t)].

Hence, the DAE has the unique solution x1(t) = −(f1(t) + t ḟ2(t)), x2(t) = −f2(t) in the entire interval I = [−1, 1].
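The unique solution claimed in Example 21 can be checked symbolically: substituting x1 = −(f1 + t ḟ2), x2 = −f2 into E(t)ẋ = A x + f must give a vanishing residual for arbitrary smooth f1, f2. A short sympy sketch (the variable names are our own):

```python
import sympy as sp

t = sp.symbols('t')
f1, f2 = sp.Function('f1')(t), sp.Function('f2')(t)

E = sp.Matrix([[0, t], [0, 0]])
A = sp.eye(2)
f = sp.Matrix([f1, f2])

x = sp.Matrix([-(f1 + t * sp.diff(f2, t)), -f2])   # claimed unique solution
residual = sp.simplify(E * sp.diff(x, t) - A * x - f)
print(residual.T)   # zero row vector
```

Since f1 and f2 are left fully symbolic, this verifies the solution formula on the whole interval, including the exceptional point t = 0 where the s-index is not defined.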
Concluding remarks

The following summarises the results obtained in this section:

1) By assuming constant local characteristic values, a global canonical form of a given pair of matrix functions is derived. Based on this canonical form, an extension of Procedure 1 to linear DAEs with variable coefficients is developed, which results in an equivalent strangeness-free formulation of the DAE. Starting from this procedure, the s-index is defined and the corresponding solvability statements and a solution approach are developed.

2) Furthermore, we have seen that the s-index is well-defined on a dense subset of the given closed interval I, but we cannot associate an s-index with an exceptional point where any of the ranks changes. Therefore, an alternative approach is investigated which is not iterative and which is based on the derivative array.

3) Based on the inflated pair, the idea is to construct a DAE with the same solution set as the original DAE, but with better analytical properties, i.e. a strangeness-free DAE with a separated part which explicitly states all constraints of the problem. To this end, Hypothesis 1 is formulated such that it guarantees that this reformulation can be performed without introducing a consistency condition for the inhomogeneity or free solution components. In particular, part 1 of Hypothesis 1 ensures that we have â constraints and that Z2 extracts these constraints from the derivative array. Part 2 then requires that these constraints are linearly independent and also excludes consistency conditions for the inhomogeneity f. The matrix function T2 points to the so-called differential part of the unknown. Finally, part 3 requires that there are d̂ equations in the original problem that yield an ODE for the differential part of the unknown if we eliminate the algebraic part, i.e. the other part of the unknown. These equations are selected by Z1.
From the obtained reduced system, a characterisation of consistency of initial values as well as of existence and uniqueness of solutions can be read off.

4) Furthermore, we have seen that the d-index is well-defined if and only if Hypothesis 1 is satisfied. Therefore, the concept of the d-index and the concept of Hypothesis 1 are equivalent approaches except for some smoothness assumptions. The main difference is that the d-index aims at a reformulation of the original problem as an ODE, whereas Hypothesis 1 is constructed with the intention to reformulate it as a strangeness-free DAE with the same solution set.

2.5 Nonlinear differential-algebraic equations

In this section, we consider general nonlinear DAEs, i.e. systems of the form

  F(ẋ, x, t) = 0   (2.84)

with F ∈ C(Dẋ × Dx × I, R^n), where Dẋ, Dx ⊆ R^n are open, possibly together with an initial condition

  x(t0) = x0 with t0 ∈ I, x0 ∈ R^n.   (2.85)

Note that in this section, we consider only real-valued and square systems (m = n), i.e. the case where the number of equations equals the number of unknowns.

2.5.1 Strangeness index approach

Definition 19: Jacobians of the derivative array
Let Fl(x^{(l+1)}, …, x^{(1)}, x, t) be the derivative array of order l ∈ N0 with respect to the DAE 0 = F(ẋ, x, t), i.e.

  Fl(x^{(l+1)}, …, x^{(1)}, x, t) = [F(ẋ, x, t); d/dt F(ẋ, x, t); …; d^l/dt^l F(ẋ, x, t)].   (2.86)

Then, the associated Jacobians (Ml, Nl) of the derivative array are defined by

  Ml(x^{(l+1)}, …, x^{(1)}, x, t) = Fl;ẋ,…,x^{(l+1)}(x^{(l+1)}, …, x^{(1)}, x, t) = [Fl;ẋ Fl;ẍ … Fl;x^{(l+1)}],
  Nl(x^{(l+1)}, …, x^{(1)}, x, t) = [−Fl;x 0 … 0].   (2.87)

Definition 20: Set of solutions of the derivative array
Let Fl(x^{(l+1)}, …, x^{(1)}, x, t) be the derivative array of order l ∈ N0 with respect to the DAE 0 = F(ẋ, x, t).
Then, the set of solutions of the derivative array is defined as

  Ll = {(z_{l+1}, …, z0, t) ∈ R^{ln} × Dẋ × Dx × I : Fl(z_{l+1}, …, z0, t) = 0}.   (2.88)

The following hypothesis generalises Hypothesis 1 to nonlinear DAEs (2.84):

Hypothesis 2
There exist integers v, a and d (= n − a) such that the set of solutions

  Lv = {(x^{(v+1)}, …, ẋ, x, t) ∈ R^{vn} × Dẋ × Dx × I : Fv(x^{(v+1)}, …, ẋ, x, t) = 0}   (2.89)

associated with F is nonempty and such that the following holds:
1) rank(Mv(x^{(v+1)}, …, ẋ, x, t)) = (v + 1)n − a on Lv, such that there exists a smooth matrix function Z2 of size (v + 1)n × a and pointwise full rank a satisfying Z2* Mv = 0 on Lv.
2) rank(Â2(x^{(v+1)}, …, ẋ, x, t)) = a on Lv, where Â2 = Z2* Nv [In 0 … 0]*, such that there exists a smooth matrix function T2 of size n × d and pointwise full rank d satisfying Â2 T2 = 0.
3) rank(Fẋ(ẋ, x, t) T2(x^{(v+1)}, …, ẋ, x, t)) = d on Lv, such that there exists a smooth matrix function Z1 of size n × d and pointwise full rank d satisfying rank(Z1* Fẋ T2) = d.

Definition 21: Strangeness index
Given a nonlinear DAE (2.84), the smallest number v such that F satisfies Hypothesis 2 is called the strangeness index (s-index) of (2.84). If v = 0, then the DAE (2.84) is called strangeness-free.

Remarks:
• The variables x^{(i)} in Lv are treated locally as independent algebraic variables, i.e. the differential relation is ignored.
• The matrix functions Z2, T2 and Z1 always exist locally under the constant rank assumptions. This follows from a generalisation of the smooth SVD (Theorem 10), see Theorem 4.3 in Kunkel/Mehrmann (2006).
• Hypothesis 2 does not require constant ranks in the whole space but only on the set of solutions Lv.
This is due to the fact that, in general, away from the solution, the constant rank assumptions in Hypothesis 1 do not hold for the linearisation of the nonlinear DAE (see Example 4.1 in Kunkel/Mehrmann 2006).
• Hypothesis 2 is invariant under the following equivalence transformations:
1) Change of variables: x = Q(t, x̃) and

  F̃(x̃̇, x̃, t) = F(ẋ, x, t) = F(Qt(t, x̃) + Qx̃(t, x̃) x̃̇, Q(t, x̃), t),

where Q ∈ C(I × R^n, R^n) is sufficiently smooth, Q(t, ·) is bijective for every t ∈ I and the Jacobian Qx̃(t, x̃) is nonsingular for every (t, x̃) ∈ I × R^n.
2) Combination of equations:

  F̃(ẋ, x, t) = P(ẋ, x, t, F(ẋ, x, t)),

where P ∈ C(R^n × R^n × I × R^n, R^n) is sufficiently smooth, P(ẋ, x, t, ·) is bijective with P(ẋ, x, t, 0) = 0 and the Jacobian Pw(ẋ, x, t, w) is nonsingular for every (ẋ, x, t, w) ∈ R^n × R^n × I × R^n.
3) Making the system autonomous:

  F̃(x̃̇, x̃) = [F(ẋ, x, t); ṫ − 1],  x̃ = [x; t],  x̃̇ = [ẋ; ṫ].

• If F satisfies Hypothesis 2, we obtain the reduced system

  F̂(ẋ, x, t) = [F̂1(ẋ, x, t); F̂2(x, t)] = 0   (2.90)

by setting

  F̂1(ẋ, x, t) = Z1ᵀ F(ẋ, x, t),  F̂2(x, t) = Z2ᵀ Fv(x^{(v+1)}, …, ẋ, x, t),   (2.91)

where F̂2 = Z2ᵀ Fv can be reformulated in such a way that it only depends on t and x (see Kunkel/Mehrmann 2006, p. 161f.). Furthermore, we have:
1) F̂ is strangeness-free.
2) Every sufficiently smooth solution x of F(ẋ, x, t) = 0 also solves the reduced system F̂(ẋ, x, t) = 0 (see Theorem 4.11 in Kunkel/Mehrmann 2006).

Without any further assumptions, it is not clear whether a solution of the reduced system also solves the original DAE (2.84). The following theorem gives sufficient conditions for this situation (at least locally):

Theorem 21
Consider the nonlinear DAE (2.84) with sufficiently smooth F satisfying Hypothesis 2 with values v, a, d and in addition with v + 1, a, d.
Then, for every z^0_{v+1} ∈ L_{v+1}, the reduced system (2.90) has a unique solution satisfying the initial values given in z^0_{v+1}. Moreover, this solution locally solves the original DAE (2.84).

Proof: See Kunkel/Mehrmann (2006), Theorem 4.13.

Remarks:
• The local result can be globalised in the same way as for ODEs (see Theorem I.7.4 in Hairer et al. 1993).
• We have z^0_{v+1} = (x0^{(v+2)}, …, ẋ0, x0, t0), where (t0, x0) are consistent initial values.

Example 22
Consider the nonlinear DAE

  F(ẋ, x, t) = [ẋ2 − x1; x2 − x1 ẋ1 + ẋ1 ẋ2],  I = R, Dẋ = Dx = R².

Checking Hypothesis 2 with v = 0 yields:

  F0 = F = [ẋ2 − x1; x2 − x1 ẋ1 + ẋ1 ẋ2],
  L0 = {(ẋ, x, t) ∈ R² × R² × R : ẋ2 − x1 = 0, x2 − x1 ẋ1 + ẋ1 ẋ2 = 0}
     = {(ẋ, x, t) ∈ R² × R² × R : x1 = ẋ2, x2 = 0},
  M0 = Fẋ = [0 1; ẋ2 − x1 ẋ1] = [0 1; 0 ẋ1] on L0,  N0 = −Fx = [1 0; ẋ1 −1].

1) rank(M0) = 1 = (v + 1)n − a on L0. Thus, a = 1, d = n − a = 1 and we can choose Z2 = [ẋ1 −1]ᵀ with full rank satisfying Z2ᵀ M0 = 0 on L0.
2) rank(Z2ᵀ N0) = rank([0 1]) = 1 = a. We can choose T2 = [1; 0] with full rank satisfying Z2ᵀ N0 T2 = 0.
3) rank(Fẋ T2) = rank([0; 0]) = 0 ≠ 1 = d on L0. Thus, Hypothesis 2 is not satisfied for v = 0.

Checking Hypothesis 2 with v = 1 yields:

  F1 = [ẋ2 − x1; x2 − x1 ẋ1 + ẋ1 ẋ2; ẍ2 − ẋ1; ẋ2 − ẋ1² − x1 ẍ1 + ẍ1 ẋ2 + ẋ1 ẍ2],
  N1 = [−F1;x 0] = [1 0 0 0; ẋ1 −1 0 0; 0 0 0 0; ẍ1 0 0 0],
  L1 = {(ẍ, ẋ, x, t) ∈ R⁷ : x1 = ẋ2, x2 = 0, ẍ2 − ẋ1 = 0, ẋ2 − ẋ1² − x1 ẍ1 + ẍ1 ẋ2 + ẋ1 ẍ2 = 0}
     = {(ẍ, ẋ, x, t) ∈ R⁷ : x1 = 0, x2 = 0, ẋ2 = 0, ẍ2 = ẋ1},
  M1 = F1;ẋ,ẍ = [0 1 0 0; ẋ2 − x1 ẋ1 0 0; −1 0 0 1; −2ẋ1 + ẍ2 1 + ẍ1 ẋ2 − x1 ẋ1]
     = [0 1 0 0; 0 ẋ1 0 0; −1 0 0 1; −ẋ1 1 + ẍ1 0 ẋ1] on L1.

1) rank(M1) = 2 = (v + 1)n − a on L1. Thus, a = 2, d = n − a = 0 and we can choose

  Z2ᵀ = [−ẋ1 1 0 0; −ẍ1 − 1 0 −ẋ1 1]

with full rank satisfying Z2ᵀ M1 = 0 on L1.

2) rank(Z2ᵀ N1 [In 0 … 0]*) = rank([0 −1; −1 0]) = 2 = a. Thus, T2 ∈ R^{2×0}.
3) rank(Fẋ T2) = 0 = d and Z1ᵀ ∈ R^{0×2}.

Thus, Hypothesis 2 is satisfied for v = 1, a = 2, d = 0. One can show that Hypothesis 2 is also satisfied for v = 2, a = 2, d = 0. Then, the reduced system is given by

  F̂(ẋ, x, t) = [F̂1(ẋ, x, t); F̂2(x, t)] = [Z1ᵀ F(ẋ, x, t); Z2ᵀ F1(ẍ, ẋ, x, t)] = Z2ᵀ F1(ẍ, ẋ, x, t) = [x2; x1] = 0.

Hence, the given DAE has the unique solution (x1, x2) = (0, 0).

2.5.2 Structured problems

In many industrial applications, the arising DAEs have a special structure. Additional information about the structure of the system can simplify the analysis of such systems, e.g. the application of Hypothesis 2. In the following, we review results obtained from Hypothesis 2 by exploiting the structure of semi-explicit, semi-implicit and quasi-linear DAEs.

Semi-explicit DAEs

Definition 22: Semi-explicit DAE
The DAE (2.84) is called semi-explicit if F has the form

  F(ẋ, x, t) = [ẏ − f(y, z); −g(y, z)] = 0   (2.92)

where x = (y, z) with y ∈ R^{ny}, z ∈ R^{nz}, n = ny + nz, f ∈ C(R^{ny} × R^{nz}, R^{ny}) and g ∈ C(R^{ny} × R^{nz}, R^{m−ny}). Then, (2.84) can be written in the form

  ẏ = f(y, z),
  0 = g(y, z).   (2.93)

Theorem 22
A uniquely solvable semi-explicit DAE (2.93) with m = n is strangeness-free if and only if gz(y, z) is nonsingular for all (y, z) satisfying g(y, z) = 0.

Remarks:
• The assumption of unique solvability guarantees that the set of solutions L0 is nonempty.
• The reduced system of the semi-explicit DAE (2.93) is then this DAE itself.
• The d-index of the semi-explicit DAE (2.93) is vd = 1 if nz > 0 and vd = 0 if nz = 0.
• The underlying ODE is obtained by differentiation of the constraints:

  ẏ = f(y, z),
  ż = −gz(y, z)⁻¹ gy(y, z) f(y, z).
Note that the underlying ODE does not contain any constraints.

Theorem 23
A uniquely solvable semi-explicit DAE with m = n of the form

  ẏ = f(y, z),
  0 = g(y)   (2.94)

has s-index vs = 1 if and only if gy(y) fz(y, z) is nonsingular for all (y, z) satisfying g(y) = 0 and gy(y) f(y, z) = 0.

Remarks:
• The equations 0 = gy(y) f(y, z) are the hidden constraints of the system.
• The d-index of the semi-explicit DAE (2.94) is vd = 2.
• The DAE obtained by differentiation of the constraints,

  ẏ = f(y, z),
  0 = gy(y) f(y, z),

is strangeness-free. Thus, the s-index is lowered by one, but the original constraints 0 = g(y) are lost.
• In general, one can prove that differentiating all constraints lowers the s-index by one, but these constraints are lost and the set of solutions is increased. In particular, the DAE obtained by differentiation of the constraints has solutions which do not solve the original DAE.

Theorem 24
A uniquely solvable semi-explicit DAE with m = n of the form

  ẏ = f(y, z),
  0 = g(y, z)   (2.95)

with rank(gz) = rz = const. has s-index vs = 1 if and only if rz < nz and there exists a smooth pointwise nonsingular matrix function [Z0; Z] of size nz × nz, where Z is of size (nz − rz) × nz and satisfies Z gz = 0, such that

  [Z0 gz; (Z gy f)z]

is nonsingular for all (y, z) satisfying g(y, z) = 0 and (Z gy f)(y, z) = 0.

Remarks:
• The equations 0 = (Z gy f)(y, z) are the hidden constraints of the system.
• The d-index of the semi-explicit DAE (2.95) is vd = 2.

Semi-implicit DAEs

Definition 23: Semi-implicit DAE
The DAE (2.84) is called semi-implicit if F has the form

  F(ẋ, x, t) = [E1(x, t)ẋ − f1(x, t); −f2(x, t)] = 0   (2.96)

where E1 ∈ C(Dx × I, R^{m1,n}), f1 ∈ C(Dx × I, R^{m1}), f2 ∈ C(Dx × I, R^{m2}), m1 + m2 = m. Then, (2.84) can be written in the form

  E1(x, t)ẋ = f1(x, t),
  0 = f2(x, t).   (2.97)
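For a semi-explicit DAE of the form (2.94), the hidden constraint 0 = gy(y) f(y, z) and the index condition of Theorem 23 can be computed mechanically with a computer algebra system. A sympy sketch on a toy system of our own (not from the thesis): ẏ1 = y2, ẏ2 = z, 0 = g(y) = y1 − y2.

```python
import sympy as sp

y1, y2, z = sp.symbols('y1 y2 z')
f = sp.Matrix([y2, z])          # right-hand side of the differential part
g = sp.Matrix([y1 - y2])        # constraint, depends on y only

gy = g.jacobian([y1, y2])
hidden = gy * f                 # hidden constraint g_y(y) f(y, z)
index_matrix = hidden.jacobian([z])   # g_y f_z from Theorem 23

print(hidden)        # the hidden constraint y2 - z = 0
print(index_matrix)  # constant and nonsingular, so vs = 1 and vd = 2
```

Since gy fz = [−1] is nonsingular everywhere, Theorem 23 gives s-index 1, and the hidden constraint z = y2 must be respected by consistent initial values.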
Theorem 25
A uniquely solvable semi-implicit DAE (2.97) with m = n is strangeness-free if and only if [E1(x, t); f2;x(x, t)] is nonsingular for all (x, t) ∈ M, where M = {(x, t) ∈ Dx × I : 0 = f2(x, t)}.

Remarks:
• The reduced system of the semi-implicit DAE (2.97) is then this DAE itself.
• The d-index of the semi-implicit DAE (2.97) is vd = 1 if m2 > 0 and vd = 0 if m2 = 0.
• The underlying ODE is obtained by differentiation of the constraints:

  ẋ = [E1; f2;x]⁻¹ [f1; −f2;t] (x, t).

Note that the underlying ODE does not contain any constraints.

Theorem 26
Let the semi-implicit DAE (2.97) be uniquely solvable with m = n and let x(t0) = x0 be consistent initial values. Then, the solution of the IVP is invariant under differentiation of the constraints, i.e.

  E1(x, t)ẋ = f1(x, t),          E1(x, t)ẋ = f1(x, t),
  0 = f2(x, t),            ⇔     f2;x(x, t)ẋ = −f2;t(x, t),   (2.98)
  x(t0) = x0                     x(t0) = x0.

Remark: Without consistent initial values, the dimension of the set of solutions increases under differentiation of constraints due to the loss of these constraints.

Quasi-linear DAEs

Definition 24: Quasi-linear DAE
The DAE (2.84) is called quasi-linear if F has the form

  F(ẋ, x, t) = E(x, t)ẋ − f(x, t) = 0   (2.99)

where E ∈ C(Dx × I, R^{m,n}) is called the leading matrix and f ∈ C(Dx × I, R^m) the right-hand side. Then, (2.84) can be written in the form

  E(x, t)ẋ = f(x, t).   (2.100)

Remark: Quasi-linear DAEs with a nonsingular leading matrix E(x, t) are called ODEs in implicit form.

Theorem 27
A uniquely solvable quasi-linear DAE (2.100) with m = n is strangeness-free if there exists a pointwise nonsingular matrix function S(x, t) = [S1(x, t); S2(x, t)] with S2(x, t)E(x, t) = 0 such that

  [S1(x, t)E(x, t); (S2(x, t)f(x, t))x]

is nonsingular for all (x, t) ∈ M, where M = {(x, t) ∈ Dx × I : 0 = S2(x, t)f(x, t)}.
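The underlying ODE from the remark to Theorem 25 can be computed symbolically for a small semi-implicit DAE. A sympy sketch on a toy example of our own (not from the thesis): ẋ1 = x2, 0 = f2(x, t) = x2 − t²; stacking E1 over f2;x and solving [E1; f2;x] ẋ = [f1; −f2;t] yields the constraint-free underlying ODE.

```python
import sympy as sp

t, x1, x2 = sp.symbols('t x1 x2')
E1 = sp.Matrix([[1, 0]])            # differential part: x1' = x2
f1 = sp.Matrix([x2])
f2 = sp.Matrix([x2 - t**2])         # algebraic part: 0 = x2 - t**2

stacked = E1.col_join(f2.jacobian([x1, x2]))   # [E1; f2_x], here nonsingular
rhs = f1.col_join(-sp.diff(f2, t))             # [f1; -f2_t]
xdot = stacked.solve(rhs)                      # underlying ODE: x' = (x2, 2t)
print(xdot.T)
```

The nonsingularity of `stacked` on M is exactly the condition of Theorem 25 (here vd = 1); the resulting ODE no longer carries the constraint, so its solutions only stay on M when started consistently.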
Remarks:
• Note that the transformation with a pointwise nonsingular S(x, t) from the left corresponds to a combination of equations, which changes neither the solution set nor the s-index.
• The reduced system of the quasi-linear DAE (2.100) is

  S1(x, t)E(x, t)ẋ = S1(x, t)f(x, t),
  0 = S2(x, t)f(x, t).   (2.101)

• The d-index of the quasi-linear DAE (2.100) is vd = 1 if E(x, t) is singular and vd = 0 if E(x, t) is nonsingular.
• The underlying ODE is obtained by differentiation of the constraints:

  ẋ = [S1 E; (S2 f)x]⁻¹ [S1 f; −(S2 f)t] (x, t).

Note that the underlying ODE does not contain any constraints.

An alternative approach for determining the hidden constraints and the s-index without using Hypothesis 2 is the following generalisation of Procedure 2 to quasi-linear DAEs:

Procedure 3
Consider a quasi-linear DAE E(x, t)ẋ = f(x, t) and assume that E and f are sufficiently smooth. Also, we restrict our focus to uniquely solvable systems with m = n. Starting from E⁰(x, t) = E(x, t), f⁰(x, t) = f(x, t), M−1 = Dx × I, iterate for i = 0, 1, …:

1) If E^i(x, t) is nonsingular for all (x, t) ∈ Mi−1, stop the procedure with v = i.

2) Transformation matrix function: Suppose there exists a sufficiently smooth matrix function Z^i(x, t) = [Z1^i(x, t); Z2^i(x, t)] that is nonsingular for all (x, t) ∈ Mi−1 such that

  Z^i(x, t)E^i(x, t) = [Ẽ^i(x, t); 0]  and  Z^i(x, t)f^i(x, t) = [f̃1^i(x, t); f̃2^i(x, t)]   (2.102)

with ri = rank(Ẽ^i(x, t)) = rank(E^i(x, t)) and Ẽ^i ∈ C(Mi, R^{ri,n}) for all (x, t) ∈ Mi, where

  Mi = {(x, t) ∈ Mi−1 : 0 = f̃2^i(x, t)}.

3) Separation: Multiply the quasi-linear DAE E^i(x, t)ẋ = f^i(x, t) with Z^i from the left to obtain the intermediate DAE

  [Ẽ^i(x, t); 0] ẋ = [f̃1^i(x, t); f̃2^i(x, t)].   (2.103)

Denote 0 = f̃2^i(x, t) as the hidden constraints of level i.
4) Differentiation of the constraints: Replace the constraints of level i by their derivatives with respect to t in order to obtain

  [Ẽ^i(x, t); f̃2;x^i(x, t)] ẋ = [f̃1^i(x, t); −f̃2;t^i(x, t)]  ⇔  E^{i+1}(x, t)ẋ = f^{i+1}(x, t).   (2.104)

Increase i by one and continue with 1).

Example 23
Consider the DAE E(x, t)ẋ = f(x, t) from Example 14 with

  E(x, t) = [0 0; 1 −t],  f(x, t) = [−1 t; 0 0] x + [g1(t); g2(t)],  I = R, Dx = Dẋ = R².

Following Procedure 3 yields:

• i = 0: E⁰(x, t) = E(x, t), f⁰(x, t) = f(x, t), M−1 = Dx × I = R² × R.
1) E⁰(x, t) is singular ⇒ continue.
2) Transformation matrix function: Z⁰ = [0 1; 1 0].
3) Separation: Multiplying the quasi-linear DAE E⁰(x, t)ẋ = f⁰(x, t) with Z⁰ from the left yields

  [1 −t; 0 0] ẋ = [g2; −x1 + t x2 + g1]

and we have M0 = {(x, t) ∈ R³ : x1 = t x2 + g1}.
4) Differentiation of the constraints:

  [1 −t; 1 −t] ẋ = [g2(t); x2 + ġ1(t)]  ⇔  E¹(x, t)ẋ = f¹(x, t).

• i = 1:
1) E¹(x, t) is singular ⇒ continue.
2) Transformation matrix function: Z¹ = [1 0; −1 1].
3) Separation: Multiplying the quasi-linear DAE E¹(x, t)ẋ = f¹(x, t) with Z¹ from the left yields

  [1 −t; 0 0] ẋ = [g2; x2 + ġ1 − g2]

and we have M1 = {(x, t) ∈ R³ : x1 = t x2 + g1, x2 = g2 − ġ1}.
4) Differentiation of the constraints:

  [1 −t; 0 1] ẋ = [g2(t); ġ2(t) − g̈1(t)]  ⇔  E²(x, t)ẋ = f²(x, t).

• i = 2:
1) E²(x, t) is nonsingular for all (x, t) ∈ M1 ⇒ stop with v = i = 2.

Remarks:
• If Procedure 3 stops in step 1) with v = i, then …
  … the quasi-linear DAE (2.100) has s-index vs = max{0, v − 1} and d-index vd = v (this holds only for uniquely solvable and square systems),
  … the intermediate DAE E^v(x, t)ẋ = f^v(x, t) corresponds to the underlying ODE in implicit form, …
  … the IVP E(x, t)ẋ = f(x, t), x(t0) = x0 has the same solution as the intermediately obtained IVPs E^i(x, t)ẋ = f^i(x, t), x(t0) = x0, i = 0, …, v.
• Procedure 3 as well as Hypothesis 2 can be extended to over- and underdetermined systems, i.e. m ≠ n (see Procedure 3.5.11 and Hypothesis 3.2.7 in Steinbrecher 2006).

Definition 25: Maximal constraint level
Let Procedure 3 stop with v = i. Then, the highest level of existing (hidden) constraints is vc = v − 1 and we call the quasi-linear DAE E(x, t)ẋ = f(x, t) of maximal constraint level vc.

Remarks:
• vc = −1 means that there are no constraints, i.e. we have an ODE in implicit form.
• In our case of uniquely solvable systems with m = n, we have vc = vs.
• In the more general case of over- or underdetermined DAEs, this equality no longer holds. In this case, we have vc ≤ vs.

Definition 26: Solution manifold
Let Procedure 3 stop with v = i. Then,

  M = M_{vc} = {(x, t) ∈ Dx × I : 0 = f̃2⁰(x, t), …, 0 = f̃2^{vc}(x, t)}   (2.105)

is called the solution manifold or set of consistency.

Remark: Every solution trajectory x lies in the solution manifold, i.e. (x(t), t) ∈ M.

Chapter 3

Numerical solution of differential-algebraic equations

This chapter deals with the numerical solution of DAEs. In the first section 3.1, the difficulties which can arise when numerical methods for ODEs are directly used to solve DAEs are discussed. In section 3.2, the idea of index reduction and regularisation of higher index DAEs is presented. Afterwards, in section 3.3, the two main classes of discretisation methods, namely one-step methods and multi-step methods, and their application to strangeness-free DAEs are described. Finally, in section 3.4, the Newton method for the numerical solution of systems of nonlinear equations arising in the integration process is presented.
3.1 Differential-algebraic equations are not ordinary differential equations

In principle, one can easily adapt numerical methods for ODEs in order to apply them directly to DAEs, for example by replacing derivatives by finite differences. But it has been observed that, in contrast to ODEs, several difficulties arise when numerical methods are used to solve DAEs, e.g. instabilities, inconsistencies, convergence problems, drift-off effects, order reduction phenomena, inaccurate error estimates or amplifications of perturbations (see Petzold 1982; Brenan et al. 1996; Steinbrecher 2006; Kunkel/Mehrmann 2006). These difficulties are caused by the algebraic constraints. In particular, in addition to the constraints explicitly occurring in the DAE, the solution of higher index DAEs is restricted by constraints which are hidden in the DAE, i.e. algebraic constraints that are not explicitly given in the system. Thus, a direct numerical integration of DAEs containing hidden constraints, i.e. of DAEs that are not strangeness-free, is not recommended. Furthermore, due to the algebraic constraints, explicit methods cannot be used to solve DAEs, since this would require the solution of a linear system with a typically singular matrix. The following example illustrates this:

Example 24: Explicit Euler method
We consider the DAE 0 = F(ẋ(t), x(t), t) with initial value x(t0) = x0 in I = [t0, T] ⊆ R. By introducing a grid t0 ≤ t1 ≤ … ≤ tN = T with hi = ti − ti−1, i = 1, …, N, we are interested in a sequence xi ∈ R^n, i = 0, …, N, which approximates the solution x at the points ti, i.e. xi ≈ x(ti). As in the ODE case, the idea of the explicit Euler method is to approximate ẋ(ti) by the forward difference quotient, i.e.

  ẋ(ti) ≈ (xi+1 − xi) / hi+1.

Then, we get from 0 = F(ẋ(t), x(t), t) the iteration

  x0 = x(t0),  0 = F((xi+1 − xi)/hi+1, xi, ti)  for i = 0, …, N − 1.
Thus, we have a nonlinear system for xi+1 (xi, ti, hi+1 are known) which has to be solved, e.g. with the Newton method:

  x⁰_{i+1} = xi,  x^{j+1}_{i+1} = x^j_{i+1} − J(x^j_{i+1})⁻¹ G(x^j_{i+1})  for j = 0, 1, …,

where

  G(x^j_{i+1}) = F((x^j_{i+1} − xi)/hi+1, xi, ti)  and  J(x^j_{i+1}) = ∂G/∂xi+1 (x^j_{i+1}) = Fẋ((x^j_{i+1} − xi)/hi+1, xi, ti) · 1/hi+1.

In general, Fẋ is singular for DAEs. Therefore, we cannot (uniquely) obtain xi+1 from 0 = F((xi+1 − xi)/hi+1, xi, ti). Thus, the explicit Euler method is not suitable for DAEs.

3.2 Index reduction and regularisation

Due to the difficulties of the numerical integration of higher index DAEs described above, one has to consider an alternative to a direct discretisation of higher index DAEs, namely discretising an equivalent formulation of the problem with s-index zero. A classical approach for index reduction of higher index DAEs is index reduction by differentiation, which turns the problem into a strangeness-free DAE: the constraints are replaced by their derivatives and differentiated unknowns are substituted as far as possible. One can prove that differentiating all constraints lowers the index by one. Although this was in fact the common approach until the early seventies (see Kunkel/Mehrmann 2006, p. 273), it has major disadvantages. The differentiated constraints are removed from the DAE, and thus they are no longer present during the numerical integration. Therefore, the reduced system has solutions which do not solve the original DAE, i.e. the set of solutions is increased. Thus, discretisation and round-off errors may lead to a so-called drift-off effect of the solution such that the removed constraints are violated, since the solution is no longer restricted to the set of consistency.
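In contrast to the explicit Euler method of Example 24, implicit methods evaluate F at xi+1, so the Newton Jacobian Fẋ/h + Fx is typically nonsingular even when Fẋ is singular. A self-contained sketch of the implicit Euler step 0 = F((xi+1 − xi)/h, xi+1, ti+1) with a hand-coded Newton iteration, applied to the strangeness-free DAE ẋ1 = x2, 0 = x1 + x2 (so x1(t) = x1(0) e^{−t}); the function names and the 2×2 analytic Jacobian are our own:

```python
import math

def F(xdot, x, t):
    """Strangeness-free test DAE: x1' - x2 = 0, x1 + x2 = 0."""
    return [xdot[0] - x[1], x[0] + x[1]]

def implicit_euler_step(x, t, h, newton_iters=10):
    xn = list(x)                        # Newton start value: previous step
    for _ in range(newton_iters):
        xdot = [(xn[k] - x[k]) / h for k in range(2)]
        G = F(xdot, xn, t + h)
        # Jacobian of G w.r.t. xn: Fxdot/h + Fx = [[1/h, -1], [1, 1]]
        a, b, c, d = 1.0 / h, -1.0, 1.0, 1.0
        det = a * d - b * c             # = 1/h + 1, nonsingular for h > 0
        dx = [(d * G[0] - b * G[1]) / det, (a * G[1] - c * G[0]) / det]
        xn = [xn[0] - dx[0], xn[1] - dx[1]]
    return xn

h, x = 0.001, [1.0, -1.0]               # consistent initial value: x2 = -x1
for i in range(1000):
    x = implicit_euler_step(x, i * h, h)
print(abs(x[0] - math.exp(-1.0)))       # O(h) error, roughly 2e-4 here
```

Because the algebraic equation 0 = x1 + x2 is enforced in every Newton solve, the numerical solution stays on the constraint manifold; this is exactly what is lost when constraints are differentiated away, leading to the drift-off effect described above.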
Hence, index reduction by differentiation lowers the index, but increases the set of solutions, which leads to undesired drift-off effects. Therefore, an index reduction technique which at the same time does not change the set of solutions is preferable. Consequently, a regularisation is defined as follows:

Definition 27: Regularisation
A DAE 0 = F̂(ẋ, x, t) is called a regularisation of a DAE 0 = F(ẋ, x, t) if both have the same set of solutions and the s-index of 0 = F̂(ẋ, x, t) is smaller than the s-index of the original DAE 0 = F(ẋ, x, t).

Remarks:
• If the DAE 0 = F(ẋ, x, t) satisfies Hypothesis 2 with vs = v, then the reduced system (2.90), i.e.

  0 = F̂(ẋ, x, t) = [F̂1(ẋ, x, t); F̂2(x, t)] = [Z1ᵀ F; Z2ᵀ Fv]   (3.1)

with Z1ᵀ and Z2ᵀ obtained from Hypothesis 2, corresponds to a regularisation of the DAE 0 = F(ẋ, x, t).
• For numerical integration, strangeness-free regularisations where all constraints are stated explicitly, i.e. as algebraic equations, are preferable.

Besides obtaining a regularisation from Hypothesis 2, which is very technical and requires the determination of the whole derivative array, there is another regularisation approach based on Procedure 3 for quasi-linear DAEs. The idea is to determine all constraints and not to replace them by their differentiated versions, but simply to add them to the system. This leads to an overdetermined system, which then has to be led back to a square system by selecting only certain differential equations.

Regularisation of quasi-linear DAEs

Consider a uniquely solvable quasi-linear DAE

  E(x, t)ẋ = f(x, t)   (3.2)

with m = n and assume that Procedure 3 applied to (3.2) stops with v = i ≥ 2. For v = 0 or v = 1 a regularisation is not necessary. Then, the maximal constraint level is vc = v − 1 (vs = v − 1, vd = v) and the hidden constraints are determined in step 2) of the procedure as

  0 = f̃2^i(x, t),  i = 0, …, vc.   (3.3)

Define the set of constraints as

  0 = g(x, t) = [f̃2⁰(x, t); …; f̃2^{vc}(x, t)] ∈ C(M, R^{mc})   (3.4)

where mc is the number of all constraints. Adding the set of constraints to the quasi-linear DAE (3.2) yields

  E(x, t)ẋ = f(x, t),
  0 = g(x, t).   (3.5)

This is an overdetermined system which is uniquely solvable with the same set of solutions as the original DAE (3.2). Furthermore, it is not strangeness-free since there are redundancies between the differential and algebraic equations, i.e. there are redundancies (rank deficiencies) in

  [(E(x, t)ẋ − f(x, t))ẋ; (−g(x, t))x] = [E(x, t); −gx(x, t)].   (3.6)

Eliminating these redundancies by scaling / selecting the differential equations with a so-called selector matrix function S(x, t) ∈ C(M, R^{m−mc,m}) such that

  [S(x, t)E(x, t); gx(x, t)]   (3.7)

is nonsingular for all (x, t) ∈ M yields a regularisation of the form

  S(x, t)E(x, t)ẋ = S(x, t)f(x, t),
  0 = g(x, t),   (3.8)

which is a square system where all redundancies are removed and which has the same set of solutions as the original DAE (3.2). The following theorem summarises the obtained approach:

Theorem 28
Let the quasi-linear DAE E(x, t)ẋ = f(x, t) be uniquely solvable with m = n. Furthermore, let Procedure 3 applied to the quasi-linear DAE terminate in iteration step v = i with maximal constraint level vc = v − 1 and the constraints 0 = f̃2^i(x, t), i = 0, …, vc. Then, the DAE

  S(x, t)E(x, t)ẋ = S(x, t)f(x, t),
  0 = g(x, t)   (3.9)

with the set of constraints g(x, t) = [f̃2⁰(x, t); …; f̃2^{vc}(x, t)] and [S(x, t)E(x, t); gx(x, t)] nonsingular for all (x, t) ∈ M = {(x, t) ∈ Dx × I : 0 = g(x, t)} is strangeness-free and has the same set of solutions as the quasi-linear DAE E(x, t)ẋ = f(x, t). Thus, (3.9) is a regularisation of the original system.
Proof: See Steinbrecher (2006), Lemma 3.5.47.

Remark: The regularisation is not unique since the selector S is not unique.

Example 25: Mathematical pendulum
The mathematical pendulum can be modeled as the movement of a mass point with mass m around the origin of a Cartesian coordinate system (x1, x2) at distance l under the influence of gravity. By introducing the velocity variables (v1, v2) = (ẋ1, ẋ2), the system can be described by the following quasi-linear DAE (for more details see section 4.1):

  ẋ1 = v1,
  ẋ2 = v2,
  m v̇1 = −2 x1 λ,
  m v̇2 = −2 x2 λ − m g,
  0 = x1² + x2² − l².

From Procedure 3 we obtain the maximal constraint level vc = 2 and the constraints

  0 = f̃2⁰(x, t) = x1² + x2² − l²,
  0 = f̃2¹(x, t) = x1 v1 + x2 v2,                                  ⇔ 0 = g(x, t).
  0 = f̃2²(x, t) = v1² + v2² − (2/m)(x1² + x2²) λ − g x2

Therefore, with the unknowns ordered as (x1, x2, v1, v2, λ), we have

  [E(x, t); gx(x, t)] =
  [1 0 0 0 0;
   0 1 0 0 0;
   0 0 m 0 0;
   0 0 0 m 0;
   0 0 0 0 0;
   2x1 2x2 0 0 0;
   v1 v2 x1 x2 0;
   −(4/m)x1λ −(4/m)x2λ − g 2v1 2v2 −(2/m)(x1² + x2²)].

With the selector

  S(x, t) = [x2 −x1 0 0 0; 0 0 x2 −x1 0]

we obtain

  [S(x, t)E(x, t); gx(x, t)] =
  [x2 −x1 0 0 0;
   0 0 m x2 −m x1 0;
   2x1 2x2 0 0 0;
   v1 v2 x1 x2 0;
   −(4/m)x1λ −(4/m)x2λ − g 2v1 2v2 −(2/m)(x1² + x2²)],

which is nonsingular since x1² + x2² = l² ≠ 0. Thus, we obtain the strangeness-free regularisation [S E; 0] ẋ = [S f; g]:

  x2 ẋ1 − x1 ẋ2 = x2 v1 − x1 v2,
  m x2 v̇1 − m x1 v̇2 = −2 x1 x2 λ + 2 x1 x2 λ + m g x1 = m g x1,
  0 = x1² + x2² − l²,
  0 = x1 v1 + x2 v2,
  0 = v1² + v2² − (2/m)(x1² + x2²) λ − g x2.

Remark: Note that if we use Procedure 3 for the determination of a strangeness-free regularisation, it is only necessary to differentiate the algebraic constraints. In contrast, if we use Hypothesis 2, it is necessary to determine the derivative array, i.e. to determine the derivatives of the whole DAE.
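The hidden constraints of Example 25 follow by differentiating each constraint along the vector field of the pendulum DAE: the level-(i+1) constraint is the derivative of the level-i constraint with ẋ1, ẋ2, v̇1, v̇2 replaced by the right-hand sides. A sympy sketch verifying f̃2¹ and f̃2² (we normalise f̃2¹ by the factor 1/2 so that it matches the form stated in the example; the helper name is our own):

```python
import sympy as sp

x1, x2, v1, v2, lam, m, g, l = sp.symbols('x1 x2 v1 v2 lam m g l')
state = [x1, x2, v1, v2]
rhs = [v1, v2, -2 * x1 * lam / m, -2 * x2 * lam / m - g]  # pendulum right-hand side

def differentiate_along_flow(c):
    """Total time derivative of c(x1, x2, v1, v2) along the DAE's flow."""
    return sum(sp.diff(c, s) * r for s, r in zip(state, rhs))

c0 = x1**2 + x2**2 - l**2                        # level-0 constraint
c1 = sp.expand(differentiate_along_flow(c0) / 2) # level 1: x1*v1 + x2*v2
c2 = sp.expand(differentiate_along_flow(c1))     # level 2, involves lambda

print(c1)
print(c2)  # equals v1**2 + v2**2 - (2/m)*(x1**2 + x2**2)*lam - g*x2
```

At level 2 the multiplier λ finally appears, which is why Procedure 3 terminates at v = 3 − 1 + … in general only once the leading matrix becomes nonsingular; for the pendulum this reproduces vc = 2 and the three constraint levels used in the regularisation above.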
Therefore, the use of Hypothesis 2 for the determination of a regularisation of a quasi-linear DAE is more involved than the use of Procedure 3. However, Procedure 3 is only applicable to quasi-linear DAEs, while Hypothesis 2 is suited for general nonlinear DAEs.

3.3 Methods for strangeness-free differential-algebraic equations

In this section, the two main classes of discretisation methods, namely one-step methods and linear multi-step methods, and their application to strangeness-free DAEs are discussed. In general, we are interested in a numerical solution of an initial value problem of a general nonlinear DAE of the form

F(ẋ, x, t) = 0,  x(t0) = x0    (3.10)

in the interval I = [t0, T] ⊆ R. We denote by t0 ≤ t1 ≤ … ≤ tN = T grid points in the interval I and by xi approximations to the solution x(ti). For convenience, we only consider a fixed step size h, i.e. it holds ti = t0 + ih, i = 0, …, N, and N = (T − t0)/h.

3.3.1 One-step methods

A one-step method for the numerical solution of (3.10) is defined as follows:

Definition 28: One-step method
A one-step method for the determination of discrete approximations xi, i = 0, …, N, to the values x(ti) of a solution x of (3.10) is given by an iteration

xi+1 = xi + hΦ(ti, xi, h)  for i = 0, …, N − 1,    (3.11)

where Φ is called the increment function.

We are then interested in conditions that guarantee convergence of the methods, i.e. that xN tends to x(tN) when h tends to zero. Therefore, we need the following definitions:

Definition 29: Consistent of order p
A one-step method (3.11) is called consistent of order p, p ∈ N \ {0}, if

‖x(ti+1) − x(ti) − hΦ(ti, x(ti), h)‖ ≤ Ch^(p+1)    (3.12)

for all i = 0, …, N − 1 with a constant C independent of h.
Definition 30: Convergent of order p
A one-step method (3.11) is called convergent of order p, p ∈ N \ {0}, if

‖x(tN) − xN‖ ≤ Ch^p    (3.13)

with a constant C independent of h.

Theorem 29: Convergence of one-step methods
Let a one-step method (3.11) with order of consistency p be given. Furthermore, let Φ be Lipschitz continuous with respect to its second argument, i.e.

‖Φ(t, x, h) − Φ(t, y, h)‖ ≤ L‖x − y‖    (3.14)

for all t ∈ [t0, T], 0 < h < T − t, x, y ∈ C^n with Lipschitz constant L > 0. Then, the one-step method is also convergent of order p.

Proof: See Bollhöfer/Mehrmann (2004), Satz 3.10.

In order to develop appropriate one-step methods for DAEs, the idea is to generalise Runge-Kutta methods for ODEs of the form ẋ = f(t, x) to DAEs of the form 0 = F(ẋ, x, t). The concept of Runge-Kutta methods for ODEs consists in using an increment function Φ(ti, xi, h) that is a linear combination of the values of the right-hand side f in discrete points. A general implicit Runge-Kutta method is defined as follows:

Definition 31: s-stage Runge-Kutta method for ODEs
Let s ∈ N \ {0} and aij, bi, ci ∈ R, i, j = 1, …, s. Then, an s-stage Runge-Kutta method for the solution of ẋ = f(t, x), x(t0) = x0 is given by

xi+1 = xi + h Σ_{j=1}^{s} bj kj  for i = 0, …, N − 1    (3.15)

with

kj = f(ti + cj h, xi + h Σ_{l=1}^{s} ajl kl),  j = 1, …, s,    (3.16)

where s is the number of stages, kj are called stages, A = [aij] ∈ R^(s,s) denotes the Runge-Kutta matrix, b = [bi] ∈ R^s denotes the weight vector and c = [ci] ∈ R^s denotes the node vector.

The coefficients aij, bi, ci, i, j = 1, …, s, determine the particular Runge-Kutta method and are conveniently expressed in a tableau, a so-called Butcher tableau:

c1 | a11 … a1s
 ⋮ |  ⋮  ⋱  ⋮            c | A
cs | as1 … ass     ⇔     --+----    (3.17)
   | b1  …  bs           | bᵀ

Note that the stages kj approximate the derivative ẋ at ti + cj h, and furthermore xi + h Σ_{l=1}^{s} ajl kl corresponds to an approximation for x at ti + cj h. For DAEs, an explicit approximation for ẋ at ti + cj h cannot be provided since Fẋ is in general singular. Therefore, set Xj′ = kj and Xj = xi + h Σ_{l=1}^{s} ajl Xl′. Then, we obtain an implicit Runge-Kutta method for DAEs as follows:

Definition 32: s-stage Runge-Kutta method for DAEs
Let s ∈ N \ {0} and aij, bi, ci ∈ R, i, j = 1, …, s. Then, an s-stage Runge-Kutta method for the solution of 0 = F(ẋ, x, t), x(t0) = x0 is given by

xi+1 = xi + h Σ_{j=1}^{s} bj Xj′  for i = 0, …, N − 1    (3.18)

with

Xj = xi + h Σ_{l=1}^{s} ajl Xl′,
0 = F(Xj′, Xj, ti + cj h)  for j = 1, …, s.    (3.19)

Note that (3.19) forms a system of nonlinear equations of size 2ns × 2ns for the determination of Xj′ and Xj for j = 1, …, s. The numerical solution of such systems of nonlinear equations is discussed in the next section.

Example 26: Implicit Euler method
The implicit Euler method is given by the Butcher tableau

1 | 1
  | 1

Thus, we get from (3.18) and (3.19):

xi+1 = xi + hX1′,
X1 = xi + hX1′,
0 = F(X1′, X1, ti + h).

The first equation yields X1′ = (xi+1 − xi)/h. Then, from the second equation it follows that X1 = xi + hX1′ = xi+1. Thus, by inserting X1′ and X1 into the last equation we obtain

0 = F((xi+1 − xi)/h, xi+1, ti+1).

Example 27: Midpoint method
The midpoint method is given by the Butcher tableau

1/2 | 1/2
    | 1

Thus, we get from (3.18) and (3.19):

xi+1 = xi + hX1′,
X1 = xi + (1/2)hX1′,
0 = F(X1′, X1, ti + (1/2)h).

The first equation yields X1′ = (xi+1 − xi)/h. Then, from the second equation it follows that X1 = xi + (1/2)hX1′ = (xi + xi+1)/2. Thus, by inserting X1′ and X1 into the last equation we obtain

0 = F((xi+1 − xi)/h, (xi + xi+1)/2, ti + (1/2)h).
Note that equations (3.18) and (3.19) only describe a well-defined discretisation method if they define a unique xi+1, at least for sufficiently small step sizes h. Furthermore, another important aspect in the numerical treatment of DAEs is the consistency of the numerical solution xi+1. In addition, we are interested in the convergence properties of the resulting methods compared to the classical order of convergence for ODEs.

Concerning the uniqueness of xi+1, Kunkel/Mehrmann (2006, pp. 227-228) show that already for the case of linear DAEs with constant coefficients it is necessary that the Runge-Kutta matrix A is nonsingular. Therefore, we are restricted to implicit Runge-Kutta methods since explicit methods have a singular A.

If we consider the consistency of xi+1, it is clear that in the case of strangeness-free DAEs where all constraints are given in an explicit way, the stages Xj, j = 1, …, s, are consistent due to equation (3.19), i.e. they satisfy all constraints. Unfortunately, this does not guarantee the consistency of the numerical solution xi+1 since xi+1 is given by equation (3.18). Hence, Runge-Kutta methods which automatically guarantee the consistency of xi+1 are to be preferred. In this context, an important class of Runge-Kutta methods are the so-called stiffly accurate Runge-Kutta methods. These are defined to satisfy the condition asj = bj, j = 1, …, s (see Prothero/Robinson 1974). From this it follows that the numerical solution xi+1 coincides with the last stage Xs, i.e. xi+1 = Xs. Therefore, the consistency of xi+1 can be guaranteed for strangeness-free DAEs where all constraints are given in an explicit way. However, for DAEs of higher index or for methods which are not stiffly accurate, the consistency of xi+1 cannot be guaranteed.
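The implicit Euler discretisation of Example 26 can be sketched in a few lines. The following Python fragment is illustrative only (the thesis implementation uses MATLAB); the semi-explicit index-1 test DAE ẋ1 = −x1, 0 = x2 − x1², the hand-coded Jacobian and all tolerances are assumptions chosen for this demonstration. In each step, 0 = F((xi+1 − xi)/h, xi+1, ti+1) is solved by a Newton iteration, so the algebraic constraint is satisfied by every accepted step, illustrating the consistency discussion above:

```python
import numpy as np

def F(xdot, x, t):
    # Semi-explicit index-1 DAE: x1' = -x1, 0 = x2 - x1**2.
    return np.array([xdot[0] + x[0], x[1] - x[0] ** 2])

def implicit_euler_step(x_old, t_new, h):
    # Solve F((X - x_old)/h, X, t_new) = 0 for X with Newton's method.
    X = x_old.copy()
    for _ in range(20):
        r = F((X - x_old) / h, X, t_new)
        # Hand-coded Jacobian of the residual with respect to X.
        J = np.array([[1.0 / h + 1.0, 0.0],
                      [-2.0 * X[0], 1.0]])
        dX = np.linalg.solve(J, -r)
        X += dX
        if np.linalg.norm(dX) < 1e-12:
            break
    return X

h, T = 0.01, 1.0
x = np.array([1.0, 1.0])        # consistent initial value: x2 = x1**2
t = 0.0
while t < T - 1e-12:
    t += h
    x = implicit_euler_step(x, t, h)

print(x[0], np.exp(-1.0))       # first-order accurate approximation of e^(-1)
```

The first component converges with order one to the exact value e^(−t), while the second component satisfies the constraint up to the Newton tolerance in every step.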
Concerning the convergence of Runge-Kutta methods, it has been shown that the order obtained by implicit Runge-Kutta methods applied to DAEs differs from the order obtained by these methods applied to ODEs, the so-called classical order. Besides an order reduction, it may even happen that the convergence is completely lost. For more details see e.g. Petzold (1986); Hairer et al. (1989); Steinbrecher (2006), section 3.5.4.5; and Kunkel/Mehrmann (2006), section 5.2. In Hairer et al. (1989), results on the order of implicit Runge-Kutta methods applied to semi-explicit DAEs of d-index one, two and three are investigated. Steinbrecher (2006) generalises these results to strangeness-free quasi-linear DAEs. His results for certain important classes of implicit Runge-Kutta methods are listed in Table 3.1.

Method      | Stages | Classical order | Order of convergence    | Stability properties
Gauß        | s      | 2s              | s+1 (s odd), s (s even) | A-stable
RadauIa     | s      | 2s−1            | s                       | A-, L-stable
RadauIIa    | s      | 2s−1            | 2s−1                    | A-, L-stable, stiffly accurate
LobattoIIIa | s      | 2s−2            | 2s−2*                   | A-stable, stiffly accurate
LobattoIIIb | s      | 2s−2            | **                      | A-stable
LobattoIIIc | s      | 2s−2            | 2s−2                    | A-, L-stable, stiffly accurate

*Results for LobattoIIIa are proven for s = 2, 3 and conjectured for larger s (see Steinbrecher 2006, p. 105)
**LobattoIIIb methods are not considered by Hairer et al. (1989) and consequently not generalised by Steinbrecher (2006)

Table 3.1: Properties of Runge-Kutta methods applied to strangeness-free quasi-linear DAEs

For the numerical solution of DAEs, there are several important classes of implicit Runge-Kutta methods which are well investigated and very popular, namely Gauß methods, Radau methods and Lobatto methods.

Gauß methods are collocation methods based on the Gauß quadrature formulas.
The s-stage Gauß method is of (maximally reachable) classical order 2s, while the numerical integration of strangeness-free quasi-linear DAEs in general leads to an order of convergence of only s + 1 for odd s and of s for even s. Furthermore, all Gauß methods are A-stable, but they are not L-stable and not stiffly accurate. Due to the choice of nodes, i.e. the parameters cj, there is no stage Xj, j = 1, …, s, which coincides with the numerical solution xi+1. Instead, the numerical solution xi+1 is just a linear combination of the stages Xj. Therefore, the consistency of xi+1 cannot be assured. Hence, due to the order reduction and the possible inconsistency of xi+1, the Gauß methods are impracticable for the numerical solution of DAEs.

s = 1:
1/2 | 1/2
    | 1

s = 2:
1/2 − √3/6 | 1/4          1/4 − √3/6
1/2 + √3/6 | 1/4 + √3/6   1/4
           | 1/2          1/2

s = 3:
1/2 − √15/10 | 5/36           2/9 − √15/15   5/36 − √15/30
1/2          | 5/36 + √15/24  2/9            5/36 − √15/24
1/2 + √15/10 | 5/36 + √15/30  2/9 + √15/15   5/36
             | 5/18           4/9            5/18

Table 3.2: Butcher tableaus of the first three Gauß methods

The s-stage Radau methods are of classical order 2s − 1, which is slightly lower than the classical order of the Gauß methods. But they are designed in such a way that they have excellent stability properties, i.e. Radau methods are A-stable as well as L-stable. The node vector c of the RadauIa methods satisfies the condition c1 = 0, whereas the node vector c of the RadauIIa methods satisfies the condition cs = 1, and furthermore it holds that asj = bj for j = 1, …, s, i.e. RadauIIa methods are stiffly accurate. Thus, the numerical solution xi+1 coincides with the last stage Xs for RadauIIa methods. Therefore, the consistency of xi+1 obtained from a RadauIIa method applied to strangeness-free DAEs with all constraints stated in an explicit way follows from the consistency of the stages.
The numerical integration of strangeness-free quasi-linear DAEs with RadauIa methods in general leads to an order of convergence of only s, whereas the order of convergence of RadauIIa methods applied to strangeness-free quasi-linear DAEs is still 2s − 1. The great stability properties, the guaranteed consistency of xi+1 and the absence of order reduction make RadauIIa methods excellent candidates for the numerical solution of initial value problems for strangeness-free quasi-linear DAEs.

s = 1:
0 | 1
  | 1

s = 2:
0   | 1/4   −1/4
2/3 | 1/4   5/12
    | 1/4   3/4

s = 3:
0           | 1/9   (−1 − √6)/18     (−1 + √6)/18
(6 − √6)/10 | 1/9   (88 + 7√6)/360   (88 − 43√6)/360
(6 + √6)/10 | 1/9   (88 + 43√6)/360  (88 − 7√6)/360
            | 1/9   (16 + √6)/36     (16 − √6)/36

Table 3.3: Butcher tableaus of the first three RadauIa methods

s = 1:
1 | 1
  | 1

s = 2:
1/3 | 5/12   −1/12
1   | 3/4    1/4
    | 3/4    1/4

s = 3:
(4 − √6)/10 | (88 − 7√6)/360     (296 − 169√6)/1800  (−2 + 3√6)/225
(4 + √6)/10 | (296 + 169√6)/1800 (88 + 7√6)/360      (−2 − 3√6)/225
1           | (16 − √6)/36       (16 + √6)/36        1/9
            | (16 − √6)/36       (16 + √6)/36        1/9

Table 3.4: Butcher tableaus of the first three RadauIIa methods

The s-stage Lobatto methods are of classical order 2s − 2. The numerical integration of strangeness-free quasi-linear DAEs with Lobatto methods in general leads to the same order of convergence of 2s − 2. There are three families of Lobatto methods, called LobattoIIIa, LobattoIIIb and LobattoIIIc. The node vector c of the Lobatto methods satisfies the conditions c1 = 0 and cs = 1. Furthermore, the Lobatto methods are A-stable, but only the LobattoIIIc methods are L-stable. In addition, LobattoIIIa and LobattoIIIc methods are stiffly accurate. For LobattoIIIa methods, the first row of the Runge-Kutta matrix A is identically 0, so that A is singular. Similarly, for LobattoIIIb methods, the last column of the Runge-Kutta matrix A is identically 0, so that A is singular.
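The entries of the 2-stage RadauIIa tableau from Table 3.4 can be checked against the standard Runge-Kutta order conditions up to the classical order 2s − 1 = 3, together with the stiff-accuracy condition asj = bj. A short Python sketch (illustrative; exact rational arithmetic via the standard library) verifies Σ bj = 1, b·c = 1/2, b·c² = 1/3 and b·A·c = 1/6:

```python
from fractions import Fraction as Fr

# 2-stage RadauIIa tableau (Table 3.4)
A = [[Fr(5, 12), Fr(-1, 12)],
     [Fr(3, 4), Fr(1, 4)]]
b = [Fr(3, 4), Fr(1, 4)]
c = [Fr(1, 3), Fr(1, 1)]

# Order conditions for classical order 3
cond1 = sum(b) == 1                                        # order 1
cond2 = sum(bj * cj for bj, cj in zip(b, c)) == Fr(1, 2)   # order 2
cond3 = sum(bj * cj**2 for bj, cj in zip(b, c)) == Fr(1, 3)
cond4 = sum(b[j] * A[j][l] * c[l]
            for j in range(2) for l in range(2)) == Fr(1, 6)

# Stiff accuracy: last row of A equals b, and c_s = 1
stiffly_accurate = A[-1] == b and c[-1] == 1

print(cond1, cond2, cond3, cond4, stiffly_accurate)  # → True True True True True
```

The same check applied to the Gauß tableaus confirms their classical order 2s but fails the stiff-accuracy condition, in line with the discussion above.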
Thus, LobattoIIIa and LobattoIIIb methods are not suited for the numerical solution of DAEs because they are not well-defined, i.e. the uniqueness of xi+1 is not assured, while LobattoIIIc methods are good candidates for the numerical solution of strangeness-free DAEs.

s = 2:
0 | 0     0
1 | 1/2   1/2
  | 1/2   1/2

s = 3:
0   | 0      0     0
1/2 | 5/24   1/3   −1/24
1   | 1/6    2/3   1/6
    | 1/6    2/3   1/6

Table 3.5: Butcher tableaus of the first two LobattoIIIa methods

s = 2:
0 | 1/2   0
1 | 1/2   0
  | 1/2   1/2

s = 3:
0   | 1/6   −1/6   0
1/2 | 1/6   1/3    0
1   | 1/6   5/6    0
    | 1/6   2/3    1/6

Table 3.6: Butcher tableaus of the first two LobattoIIIb methods

s = 2:
0 | 1/2   −1/2
1 | 1/2   1/2
  | 1/2   1/2

s = 3:
0   | 1/6   −1/3   1/6
1/2 | 1/6   5/12   −1/12
1   | 1/6   2/3    1/6
    | 1/6   2/3    1/6

Table 3.7: Butcher tableaus of the first two LobattoIIIc methods

3.3.2 Multi-step methods

The idea of multi-step methods is to not only use the previous approximation xi−1 for the determination of the approximation xi to the solution x(ti), but to use several of the previous approximations xi−1, …, xi−k for the computation of xi. In the case of linear multi-step methods, a linear combination of the previous approximations xj and of the right-hand sides f(tj, xj) is used. Thus, a linear multi-step method for the numerical solution of ODEs is defined as follows:

Definition 33: Linear multi-step method for ODEs
Let k ∈ N \ {0} and αi, βi ∈ R, i = 0, …, k. Then, a linear multi-step method for the solution of ẋ = f(t, x), x(t0) = x0 is given by

Σ_{l=0}^{k} αk−l xi−l = h Σ_{l=0}^{k} βk−l f(ti−l, xi−l)  for i = k, …, N    (3.20)

with αk ≠ 0 and α0² + β0² ≠ 0. The method (3.20) is also called a k-step method.

Remarks:
• In order to initialise the iteration, the values x0, …, xk−1 must be provided. The computation of these values is usually done by an appropriate one-step method.
• The linear multi-step method is called explicit if βk = 0, and implicit if βk ≠ 0.
• By multiplying (3.20) with a nonzero scalar, we may assume that αk = 1 or, in the case of implicit methods, that βk = 1.

Definition 34: Consistent of order p
A multi-step method (3.20) is called consistent of order p, p ∈ N \ {0}, if

‖Σ_{l=0}^{k} αk−l x(ti−l) − h Σ_{l=0}^{k} βk−l ẋ(ti−l)‖ ≤ Ch^(p+1)    (3.21)

for all sufficiently smooth functions x with a constant C independent of h.

Theorem 30: Consistency of multi-step methods
A multi-step method (3.20) is consistent if and only if

Σ_{l=0}^{k} αl = 0  and  Σ_{l=0}^{k} lαl = Σ_{l=0}^{k} βl.    (3.22)

Furthermore, (3.20) has order of consistency p if

Σ_{l=0}^{k} l^i αl = i Σ_{l=0}^{k} l^(i−1) βl  for i = 0, …, p.    (3.23)

Proof: See Kunkel/Mehrmann (2006), Theorem 5.20.

Definition 35: Stability
A multi-step method (3.20) is called stable if
1) all roots λ ∈ C of ρ(λ) = Σ_{l=0}^{k} αl λ^l satisfy |λ| ≤ 1 and
2) all roots λ ∈ C of ρ(λ) with |λ| = 1 are simple.

Definition 36: Convergent of order p
A multi-step method (3.20) is called convergent of order p, p ∈ N \ {0}, if for all tN ∈ I and all h = (tN − t0)/N it holds that

‖x(tN) − xN‖ ≤ Ch^p    (3.24)

with a constant C independent of h.

Theorem 31: Convergence of multi-step methods
If the multi-step method (3.20) is stable and consistent of order p, then it is convergent of order p.

Proof: See Kunkel/Mehrmann (2006), Theorem 5.22.

A very popular class of implicit multi-step methods are the so-called BDF methods. The abbreviation BDF stands for backward differentiation formula. BDF methods are constructed by setting β0 = β1 = … = βk−1 = 0, βk = 1 and determining the coefficients αi, i = 0, …, k, by backward differences, such that the order of consistency is as large as possible. Thus, BDF methods for ODEs have the following form:

(1/h) Σ_{l=0}^{k} αk−l xi−l = f(ti, xi).    (3.25)

The resulting coefficients αi, i = 0, …
, k, are stated in Table 3.8:

k = 1:  α0 = −1,  α1 = 1
k = 2:  α0 = 1/2,  α1 = −2,  α2 = 3/2
k = 3:  α0 = −1/3,  α1 = 3/2,  α2 = −3,  α3 = 11/6
k = 4:  α0 = 1/4,  α1 = −4/3,  α2 = 3,  α3 = −4,  α4 = 25/12
k = 5:  α0 = −1/5,  α1 = 5/4,  α2 = −10/3,  α3 = 5,  α4 = −5,  α5 = 137/60
k = 6:  α0 = 1/6,  α1 = −6/5,  α2 = 15/4,  α3 = −20/3,  α4 = 15/2,  α5 = −6,  α6 = 49/20

Table 3.8: Coefficients of the BDF methods

By construction, the BDF methods are consistent of order p = k. However, in order to achieve convergence, stability of the methods is required, see Theorem 31. In Hairer/Wanner (1983), it is shown that the BDF methods are stable, and thus convergent, for 1 ≤ k ≤ 6 and unstable for k ≥ 7. Furthermore, BDF methods are A-stable only for k ≤ 2 and A(α)-stable for 3 ≤ k ≤ 6.

In order to generalise the BDF methods for ODEs of the form ẋ = f(t, x) to DAEs of the form 0 = F(ẋ, x, t), it should be noted that in (3.25), the discretisation is obtained by simply replacing x(ti) by the approximation xi and ẋ(ti) = f(ti, x(ti)) by the approximation (1/h) Σ_{l=0}^{k} αk−l xi−l. Therefore, by using this discretisation in the same way we obtain the BDF methods for DAEs as follows:

0 = F((1/h) Σ_{l=0}^{k} αk−l xi−l, xi, ti).    (3.26)

Example 28: BDF method k = 1
For k = 1 we have α0 = −1 and α1 = 1. Thus, we get from (3.26):

0 = F((xi − xi−1)/h, xi, ti).

This is the implicit Euler method.

Example 29: BDF method k = 2
For k = 2 we have α0 = 1/2, α1 = −2, α2 = 3/2. Thus, we get from (3.26):

0 = F((1/h)((3/2)xi − 2xi−1 + (1/2)xi−2), xi, ti).

Note that in the case of strangeness-free DAEs where all constraints are given in an explicit way, the consistency of xi is obviously guaranteed due to equation (3.26).
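The coefficients in Table 3.8 can be verified mechanically against the order conditions (3.23) (with βk = 1 and all other βl = 0) and against the root condition of Definition 35. The following Python sketch is illustrative (exact rational arithmetic for the order conditions, floating-point root finding for stability):

```python
from fractions import Fraction as Fr
import numpy as np

# (alpha_0, ..., alpha_k) for BDF-k from Table 3.8
alpha = {
    1: [Fr(-1), Fr(1)],
    2: [Fr(1, 2), Fr(-2), Fr(3, 2)],
    3: [Fr(-1, 3), Fr(3, 2), Fr(-3), Fr(11, 6)],
    4: [Fr(1, 4), Fr(-4, 3), Fr(3), Fr(-4), Fr(25, 12)],
    5: [Fr(-1, 5), Fr(5, 4), Fr(-10, 3), Fr(5), Fr(-5), Fr(137, 60)],
    6: [Fr(1, 6), Fr(-6, 5), Fr(15, 4), Fr(-20, 3), Fr(15, 2), Fr(-6), Fr(49, 20)],
}

for k, a in alpha.items():
    # Order conditions (3.23) with beta_k = 1, all other beta_l = 0:
    # sum_l l^i alpha_l = i * k^(i-1) for i = 0, ..., p; BDF-k has p = k.
    for i in range(k + 1):
        lhs = sum(Fr(l) ** i * a[l] for l in range(k + 1))
        rhs = Fr(i) * Fr(k) ** (i - 1) if i > 0 else Fr(0)
        assert lhs == rhs, (k, i)
    # Root condition: all roots of rho(lambda) = sum_l alpha_l lambda^l
    # lie in the closed unit disk (true for k <= 6).
    roots = np.roots([float(v) for v in reversed(a)])  # highest degree first
    assert max(abs(roots)) <= 1.0 + 1e-7, k

print("BDF 1..6: consistent of order k and stable")
```

For k ≥ 7 the root condition fails, which reproduces the instability result of Hairer/Wanner (1983) cited above.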
Concerning the convergence of BDF methods, it has been shown that the order obtained by BDF methods applied to strangeness-free DAEs with all constraints given in an explicit way is equal to the order of these methods applied to ODEs, i.e. BDF methods applied to those DAEs are convergent of order p = k for 1 ≤ k ≤ 6 (see Kunkel/Mehrmann 2006, Theorem 5.27). However, for higher index DAEs, the order of convergence is reduced. Furthermore, according to Petzold (1986, p. 838), it is shown by März (1981) that general linear multi-step methods applied to strangeness-free DAEs have to satisfy an extra set of conditions for the method to be convergent of the classical order, i.e. not to suffer from order reduction. Fortunately, these conditions are satisfied for BDF methods, which makes them excellent candidates for the numerical solution of initial value problems for strangeness-free DAEs.

3.4 Numerical solution of systems of nonlinear equations

In this section, we consider systems of nonlinear equations of the form f(x) = 0 where f ∈ C(R^m, R^m) is a nonlinear system of functions mapping a vector x ∈ R^m to a vector y = f(x) ∈ R^m, defined by a system of m nonlinear equations

y = f(x) = [f1(x1, …, xm); … ; fm(x1, …, xm)].    (3.27)

In the previous section, we have seen that the numerical solution of DAEs always yields such systems of nonlinear equations. In particular, in each iteration step of a Runge-Kutta method a nonlinear system of size 2ns × 2ns for the determination of Xj′ and Xj for j = 1, …, s has to be solved, where s is the number of stages, see equation (3.19). The size of the system can be reduced to ns × ns by simply substituting the expression for Xj into the second part of (3.19). On the other hand, in each iteration step of a BDF method a nonlinear system of size n × n has to be solved, see equation (3.26).
Note that the size of the system does not depend on the order k. Thus, an efficient algorithm for the numerical solution of systems of nonlinear equations is necessary for the solution of DAEs. The numerical solution of systems of nonlinear equations has been investigated extensively in the literature, see e.g. Deuflhard (2011), but it is not the main topic of this thesis. Therefore, only the ordinary Newton method for the numerical solution of systems of nonlinear equations is presented here:

Definition 37: Newton method
Let f : X → R^n be a continuously differentiable function with X ⊂ R^n open and convex. Let x0 ∈ R^n be given. Suppose that the Jacobian Jf(x) is nonsingular for all x ∈ X. Then, the Newton method for the determination of a sequence {xk} in order to find a solution of f(x) = 0 is given by the iteration

Jf(xk)∆xk = −f(xk),
xk+1 = xk + ∆xk.    (3.28)

Remarks:
• Obviously, the Newton method treats the numerical solution of a nonlinear problem by solving a sequence of linear problems with the Jacobian as system matrix.
• Under certain conditions, the existence and uniqueness of a solution x as well as quadratic convergence of the Newton iterates xk can be shown (see e.g. Deuflhard 2011, Theorem 2.3).
• One of these conditions requires a sufficiently good initial guess x0 of the solution; otherwise, convergence cannot be guaranteed.
• The Newton method is carried out until a certain termination criterion is satisfied. There exist many different termination criteria in the literature. In the present work, the Newton iteration is terminated if the residual norm ‖f(xk)‖ or the correction norm ‖∆xk‖ falls below a prescribed tolerance, or if a maximum number of iterations is exceeded.
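A minimal Python sketch of iteration (3.28) with the termination criteria mentioned above (residual norm, correction norm, maximum number of iterations); the test function, the initial guess and the tolerances are assumptions chosen for illustration, not part of the thesis MATLAB code:

```python
import numpy as np

def newton(f, jac, x0, tol=1e-12, maxit=50):
    """Ordinary Newton method: solve J(x_k) dx = -f(x_k), x_{k+1} = x_k + dx."""
    x = np.asarray(x0, dtype=float)
    for _ in range(maxit):
        r = f(x)
        if np.linalg.norm(r) < tol:           # residual criterion
            return x
        dx = np.linalg.solve(jac(x), -r)
        x = x + dx
        if np.linalg.norm(dx) < tol:          # correction criterion
            return x
    return x                                  # maximum number of iterations reached

# Example: intersection of the circle x1**2 + x2**2 = 4 with the line x1 = x2.
f = lambda x: np.array([x[0] ** 2 + x[1] ** 2 - 4.0, x[0] - x[1]])
jac = lambda x: np.array([[2 * x[0], 2 * x[1]], [1.0, -1.0]])

sol = newton(f, jac, [1.0, 0.5])
print(sol)  # converges quadratically to (sqrt(2), sqrt(2))
```

A user-supplied Jacobian, as in this sketch, corresponds to the first of the Newton variants compared in section 5.1; the other variants differ only in how `jac` is obtained.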
Chapter 4
Test problems

4.1 Mathematical pendulum

One of the most famous elementary examples of DAEs is the mathematical pendulum, which is modeled by the movement of a mass point with mass m around the origin of a Cartesian coordinate system (x1, x2) at distance l under the influence of gravity, see Figure 4.1.

Figure 4.1: The mathematical pendulum

The kinetic energy of the system is given by

Ek = (1/2)m(ẋ1² + ẋ2²).    (4.1)

Furthermore, the potential energy is given by

Ep = mgx2    (4.2)

where g is the gravitational acceleration.

Together with the constraint equation 0 = x1² + x2² − l², the Lagrange function is defined by

L = Ek − Ep − λ(x1² + x2² − l²) = (1/2)m(ẋ1² + ẋ2²) − mgx2 − λ(x1² + x2² − l²)    (4.3)

where λ is the Lagrange multiplier. Then, the equations of motion are of the form

0 = d/dt (∂L/∂q̇) − ∂L/∂q    (4.4)

for the variables q = x1, x2, λ, i.e. we obtain

0 = mẍ1 + 2x1λ,
0 = mẍ2 + 2x2λ + mg,    (4.5)
0 = x1² + x2² − l².

This system can be transformed into a quasi-linear DAE by introducing the velocity variables (v1, v2) = (ẋ1, ẋ2) in order to eliminate the second time derivatives:

ẋ1 = v1,
ẋ2 = v2,
mv̇1 = −2x1λ,    (4.6)
mv̇2 = −2x2λ − mg,
0 = x1² + x2² − l².

Following Procedure 3 in order to determine the s-index, the d-index and the hidden constraints of the DAE (4.6) yields, after the first differentiation of the constraint,

ẋ1 = v1,
ẋ2 = v2,
mv̇1 = −2x1λ,    (4.7)
mv̇2 = −2x2λ − mg,
0 = x1v1 + x2v2.

By differentiating the constraint once more we obtain

ẋ1 = v1,
ẋ2 = v2,
mv̇1 = −2x1λ,    (4.8)
mv̇2 = −2x2λ − mg,
0 = v1² + v2² − (2/m)(x1² + x2²)λ − gx2.

Differentiating the constraint of (4.8) once more would lead to an ODE in implicit form. Thus, the s-index of the mathematical pendulum (4.6) is 2, and the d-index is 3. Additionally, (4.7) has s-index 1 and d-index 2, and (4.8) has s-index 0 and d-index 1.
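The differentiation steps leading from (4.6) to the hidden constraints in (4.7) and (4.8) can be reproduced symbolically. The following Python/SymPy sketch is illustrative (the thesis computations use MATLAB and its Symbolic Math Toolbox; SymPy is used here as a freely available stand-in):

```python
import sympy as sp

t = sp.symbols('t')
m, g, l, lam = sp.symbols('m g l lambda')
x1, x2, v1, v2 = (sp.Function(n)(t) for n in ('x1', 'x2', 'v1', 'v2'))

# Dynamic equations of (4.6), solved for the time derivatives
subs = {x1.diff(t): v1, x2.diff(t): v2,
        v1.diff(t): -2 * x1 * lam / m, v2.diff(t): -2 * x2 * lam / m - g}

c0 = x1**2 + x2**2 - l**2
c1 = sp.expand(c0.diff(t).subs(subs)) / 2   # first hidden constraint (4.7)
c2 = sp.expand(c1.diff(t).subs(subs))       # second hidden constraint (4.8)

print(c1)  # x1*v1 + x2*v2
print(c2)  # v1**2 + v2**2 - (2/m)*(x1**2 + x2**2)*lambda - g*x2
```

Differentiating c2 once more and substituting the dynamics yields an expression that determines λ̇, i.e. an implicit ODE, which is why the constraint level terminates at vc = 2.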
A strangeness-free regularisation of the mathematical pendulum is given by (see Example 25):

x2ẋ1 − x1ẋ2 = x2v1 − x1v2,
mx2v̇1 − mx1v̇2 = −2x1x2λ + 2x1x2λ + mx1g,
0 = x1² + x2² − l²,    (4.9)
0 = x1v1 + x2v2,
0 = v1² + v2² − (2/m)(x1² + x2²)λ − gx2.

4.2 Reheat furnace model

Introduction

In the steel production industry, many steel products are hot rolled in a hot mill. The products entering the hot mill are relatively cold and need to be heated up before they can be rolled, which is done in a so-called reheat furnace. Figure 4.2 shows a schematic side view of a slab reheat furnace for a hot mill. Burners are located at the sides, top and bottom in order to heat the steel slabs. The intensity of the burner flames can be adjusted by fuel flows where the air-to-fuel ratio is controlled automatically. The cold steel slabs enter the furnace on the left-hand side, move slowly through the furnace and are pushed out hot on the right-hand side. The flow direction of the waste gases is from the right to the left, where they are conveyed away through a chimney.

Figure 4.2: Schematic view of a reheat furnace (Source: DotX Control Solutions BV 2014b)

Since such reheat furnaces consume considerably large amounts of fuel, the objective is to heat up the steel products in the furnace in such a way that the consumption of fuel is minimised. Therefore, a simplified model of the reheat furnace is developed, which is described in the following. For other and more detailed models see e.g. Pike/Citron (1970), Ko et al. (2000), Zhang et al. (2002) and Chen et al. (2003).

Model equations

The simplified model of the temperatures in a reheat furnace is based on a single reference frame fixed to the furnace.
Variations across the width of the furnace are ignored since the furnace is loaded with slabs of a uniform length Ls which are pushed sideways. In order to discretise the furnace in space, it is divided into N sections in the length direction where each section has the same length ∆x, as shown in Figure 4.3. It is assumed that in each section, the temperature of the waste gases as well as the temperature of the steel products are uniformly distributed.

Figure 4.3: Space discretisation of the furnace (Source: DotX Control Solutions BV 2014b)

Figure 4.4 shows the total heat transfer in one section m where the following amounts of heat are considered:

Qg,m+1: heat brought in by waste gas of section m + 1,
Qc,m: heat brought in by combustion in section m,
Qair,m: heat brought in by air in section m,
Qf,m: heat brought in by fuel in section m,
Qs,m: heat entering the steel products in section m,
Qw,m: heat entering the furnace wall in section m,
Qo,m: heat leaving the furnace wall to outside air in section m.

Figure 4.4: Heat balance in section m of the furnace (Source: DotX Control Solutions BV 2014b)

Then, the heat balance of a section is given by

Qg,m+1 − Qg,m + Qc,m + Qair,m + Qf,m − Qw,m − Qs,m = 0  for m = 1, …, N    (4.10)

where Qg,N+1 is set to be zero, i.e. there is no heat brought in by waste gas in section N. The steel slabs are modeled as a flow of steel through the furnace with speed v where the gaps between the single slabs are neglected. Furthermore, it is assumed that the temperature of the steel products Ts only varies in the direction of the length of the furnace x.
Hence, the temperature variation of the steel products can be described by the convection equation

ρcs (∂Ts/∂t + v ∂Ts/∂x) = Qs/Vs    (4.11)

where cs is the heat capacity of the slabs, ρ is the density of steel, v is the speed of the products travelling through the furnace, Qs is the heat entering the steel products and Vs is the volume of the steel products. Discretising equation (4.11) in space with a first-order backward (upwind) difference yields

Ṫs,m = v (Ts,m−1 − Ts,m)/∆x + Qs,m/(ρcs Vs,m)  for m = 1, …, N    (4.12)

where Ts,0 = 293 K, i.e. the steel products entering the furnace are cold. The heat entering the steel products is assumed to be only due to radiation, and thus is given by

Qs,m = As,m σεs (Tg,m⁴ − Ts,m⁴)  for m = 1, …, N    (4.13)

where As,m is the surface area of the steel products in section m, σ is the Stefan-Boltzmann constant, εs is the emissivity of the steel products, Tg,m is the temperature of the waste gas in section m and Ts,m is the temperature of the steel products in section m.

The temperature variation of the furnace wall is modeled by

cw ρw Dw Aw,m Ṫw,m = Qw,m − Qo,m  for m = 1, …, N    (4.14)

where cw is the heat capacity of the material of the wall, ρw is the density of the wall, Dw is the thickness of the wall, Aw,m is the surface area of the wall in section m and Qo,m is the heat leaving the furnace wall to outside air in section m. Again, the heat transfer is assumed to be only due to radiation, and thus the heat entering and leaving the furnace wall, respectively, is given by

Qw,m = Aw,m σεw (Tg,m⁴ − Tw,m⁴)  for m = 1, …, N,    (4.15)
Qo,m = Aw,m σεw (Tw,m⁴ − To⁴)  for m = 1, …, N    (4.16)

where εw is the emissivity of the wall and To is the temperature of the air outside the furnace.

Furthermore, the heat produced by combustion in section m is given by

Qc,m = φf,m H0  for m = 1, …
, N    (4.17)

where φf,m is the fuel flow into section m and H0 is the lower calorific value of fuel. The specific amounts of heat of fuel, air and waste gas are computed by

Qf,m = φf,m cf Tf  for m = 1, …, N,    (4.18)
Qair,m = φf,m Raf cair Tair  for m = 1, …, N,    (4.19)
Qg,m = φg,m cg Tg,m  for m = 1, …, N    (4.20)

where cf is the heat capacity of fuel, cair is the heat capacity of air, cg is the heat capacity of waste gas, Raf is the ratio of air to fuel, φg,m is the waste gas flow from section m into section m − 1, Tf is the temperature of the fuel and Tair is the temperature of the air used in combustion. Under the assumption of incompressibility, the waste gas flow from section m follows from the conservation of volume

φg,m = φg,m+1 + φf,m + φf,m Raf  for m = 1, …, N    (4.21)

where φg,N+1 is set to be zero, i.e. there is no waste gas flow entering section N.

Finally, it is assumed that the furnace consists of three zones: a preheat zone from x = 0 m to 15 m, a heating zone from x = 15 m to 22.8 m and a soaking zone from x = 22.8 m to 30 m, where L = 30 m is the length of the furnace. In each zone, there is one burner and the heat of each burner is assumed to be equally distributed over all sections of the corresponding zone, such that for N = 50 we have

φf,m = (1/25)u1  for m = 1, …, 25,
φf,m = (1/13)u2  for m = 26, …, 38,    (4.22)
φf,m = (1/12)u3  for m = 39, …, 50,

where u1, u2, u3 are the fuel flows to burners 1, 2, 3, respectively.

Control criteria

The objective of the reheat furnace model is to maximise the production speed at minimal energy cost. This is to be achieved by adjusting the fuel flows u1, u2, u3 and the furnace speed v. Thus, the objective function J is defined as

J = ∫₀ᵀ (cprod v ρ Ls Ds − cfuel (u1 + u2 + u3)) dt    (4.23)

where cprod is the production profit, cfuel is the cost of fuel and T is the time horizon.
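The purely algebraic part of the furnace model can be sketched directly from equations (4.13), (4.21) and (4.22). The following Python fragment is illustrative only (the parameter values Raf, σ, εs come from Table 4.1; the fuel flows and the state values As, Tg, Ts are assumptions chosen for the example):

```python
N, R_af = 50, 10.0
sigma, eps_s = 5.67e-8, 0.8
u1, u2, u3 = 4.0, 2.0, 1.5          # fuel flows to the three burners [Nm^3/s]

# (4.22): fuel flow per section, equally distributed within each zone
phi_f = [u1 / 25] * 25 + [u2 / 13] * 13 + [u3 / 12] * 12

# (4.21): backward recursion phi_g,m = phi_g,m+1 + phi_f,m * (1 + R_af),
# starting from phi_g,N+1 = 0 at the hot end of the furnace
phi_g = [0.0] * (N + 1)
for m in range(N - 1, -1, -1):
    phi_g[m] = phi_g[m + 1] + phi_f[m] * (1.0 + R_af)

# (4.13): radiative heat into the steel of one section (illustrative state)
A_s, T_g, T_s = 4.8, 1200.0, 600.0  # [m^2], [K], [K]
Q_s = A_s * sigma * eps_s * (T_g**4 - T_s**4)

print(phi_g[0], Q_s)  # waste gas flow at the chimney; Q_s > 0 since T_g > T_s
```

By construction, the waste gas flow at the chimney equals (1 + Raf) times the total fuel flow, which provides a simple consistency check of the volume conservation (4.21).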
The inputs are constrained as follows:

0 < u_1 < 4 Nm³/s,  0 < u_2 < 2 Nm³/s,  0 < u_3 < 1.5 Nm³/s,  0 < v < v_max.   (4.24)

Furthermore, the dropout temperature of each steel slab must remain within a predefined window, such that we have the following output constraints:

1478 K < T_{s,N} < 1523 K.   (4.25)

In addition, the temperature of the waste gas is limited, such that we have

T_{g,m} < 1573 K   for m = 1, …, N.   (4.26)

As initial condition, it is supposed that the furnace starts up cold, i.e.

T_{g,m}(0) = T_{s,m}(0) = T_{w,m}(0) = 293 K   for m = 1, …, N.   (4.27)

Model parameters

The values of the parameters used in the model are given in the following table:

Parameter  Meaning                                   Value       Unit
L          length of the furnace                     30          m
N          number of sections                        50          –
B          width of the furnace                      10          m
H          height of the furnace                     2           m
ρ          density of steel products                 7800        kg/m³
ρ_w        density of furnace wall                   1000        kg/m³
c_s        heat capacity of steel products           650         J/(kg K)
c_w        heat capacity of furnace wall             840         J/(kg K)
c_f        heat capacity of fuel                     1000        J/(Nm³ K)
c_g        heat capacity of waste gas                1700        J/(Nm³ K)
c_air      heat capacity of air                      500         J/(Nm³ K)
L_s        length of steel products                  8           m
D_s        thickness of steel products               0.2         m
D_w        thickness of furnace wall                 0.4         m
T_o        temperature of air outside the furnace    373         K
T_f        temperature of fuel                       773         K
T_air      temperature of air used in combustion     773         K
σ          Stefan-Boltzmann constant                 5.67·10^-8  W/(m² K⁴)
H_0        lower calorific value of fuel             30          MJ/Nm³
ε_w        emissivity of furnace wall                0.8         –
ε_s        emissivity of steel products              0.8         –
R_af       ratio of air to fuel                      10          –
T          time horizon                              15          h
c_prod     production profit                         0.5         ¤/kg
c_fuel     cost of fuel                              0.2         ¤/Nm³

Table 4.1: Parameter values of the reheat furnace model

Chapter 5  Numerical results

In this chapter, the numerical results of several implementations of BDF methods and RadauIIa methods applied to the two test problems are presented.
Section 5.1 deals with the efficiency of different approaches for solving systems of nonlinear equations. In section 5.2, the results of the different formulations of the mathematical pendulum are compared and the advantages of a strangeness-free regularisation are shown. Finally, in section 5.3, a comparison of the implemented methods for the two test problems with respect to computing time and accuracy is presented and a best-practice method based on these results is suggested.

In order to compare the calculated numerical solutions in sections 5.2 and 5.3 with respect to accuracy, the following parameters are used for determining the base solutions:

                                               Method         Tolerance      Step size     End point
Pendulum (regularised s-index 0 formulation)   RadauIIa s = 3  tol = 10^-10  h = 0.0001 s  T = 100 s
Furnace                                        RadauIIa s = 3  tol = 10^-12  h = 3 s       T = 5 h

Table 5.1: Parameter values of the base solutions

The accuracy is then determined by the maximum norm of the difference between the calculated numerical solution and the base solution, restricted to the evaluated points of the numerical solution. All calculations are done with MATLAB 8.3 (R2014a) on a 32-bit Windows 7 system with an Intel Core i3 CPU (2.26 GHz) and 4 GB RAM.

5.1 Numerical solution of systems of nonlinear equations

In this section, various approaches for solving systems of nonlinear equations are analysed. To this end, the intrinsic MATLAB function fsolve and several versions of the Newton method, in which the required Jacobian is either supplied by the user, estimated at each discretisation point by central differences, or determined with the MATLAB Symbolic Math Toolbox, are compared. For all approaches, the same prescribed tolerance is used.

Figure 5.1 shows the resulting computing times as a function of the end point T for the different solvers for systems of nonlinear equations applied to the mathematical pendulum.
The implicit Euler method with step size h = 0.001 s is used, and the arising system of nonlinear equations of size 5 × 5 in each time step is solved by the different approaches, all with prescribed tolerance tol = 10^-10. As anticipated, the Newton method with given Jacobian yields the best results, while the MATLAB intrinsic function fsolve is the slowest; the Newton method with estimated Jacobian lies in between. Unexpectedly, the Newton method with symbolic Jacobian performs almost as well as the Newton method with given Jacobian. Since the given Jacobian must be derived by the user for every single time stepping method and for every test problem, this approach involves a large effort for the user. Therefore, the Newton method with symbolic Jacobian is preferred.

Figure 5.1: Different solvers for systems of nonlinear equations for the pendulum (implicit Euler method, h = 0.001 s, tol = 10^-10)

In Figure 5.2, the resulting computing times as a function of the end point T for the different solvers for systems of nonlinear equations applied to the reheat furnace model are depicted. The implicit Euler method with step size h = 60 s is used, and the arising system of nonlinear equations in each time step is solved by the different approaches, all with prescribed tolerance tol = 10^-10. For this problem, the Newton method with estimated Jacobian is as slow as the MATLAB intrinsic function fsolve. Furthermore, the Newton method with given Jacobian is not included, due to the large effort of deriving the Jacobian by hand for such a large system of size 150 × 150. Thus, the Newton method with symbolic Jacobian clearly yields the best results. In consequence of these results, the Newton method in combination with the symbolic Jacobian is used in the following.
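As an illustration of the compared solver variants, a minimal Newton iteration in which the Jacobian is either supplied analytically or estimated by central differences might look as follows. This is a Python sketch under assumed interfaces, not the thesis's MATLAB code; the toy system is chosen for illustration only.

```python
import numpy as np

# Newton variants: user-supplied Jacobian vs. central-difference estimate.
def newton(F, x0, jac=None, tol=1e-10, maxit=100, eps=1e-6):
    x = np.array(x0, dtype=float)
    for _ in range(maxit):
        Fx = F(x)
        if np.linalg.norm(Fx, np.inf) < tol:
            return x
        if jac is not None:
            J = jac(x)                       # analytic Jacobian supplied by the user
        else:
            n = x.size                       # central-difference estimate of the Jacobian
            J = np.empty((n, n))
            for j in range(n):
                e = np.zeros(n)
                e[j] = eps
                J[:, j] = (F(x + e) - F(x - e)) / (2.0 * eps)
        x = x - np.linalg.solve(J, Fx)
    raise RuntimeError("Newton iteration did not converge within maxit steps")

# Toy system: intersection of the unit circle with the line x1 = x2.
F = lambda x: np.array([x[0]**2 + x[1]**2 - 1.0, x[0] - x[1]])
root_est = newton(F, [1.0, 0.5])                         # estimated Jacobian
root_ana = newton(F, [1.0, 0.5],
                  jac=lambda x: np.array([[2.0*x[0], 2.0*x[1]],
                                          [1.0, -1.0]]))  # given Jacobian
```

The estimated variant trades two extra function evaluations per column for the convenience of not deriving the Jacobian by hand, which is exactly the trade-off observed in the timing results above.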
Figure 5.2: Different solvers for systems of nonlinear equations for the furnace (implicit Euler method, h = 60 s, tol = 10^-10)

5.2 Index reduction and regularisation

In this section, the influence of the different formulations of the mathematical pendulum is analysed. To this end, several implementations of BDF methods and RadauIIa methods are applied to the model equations, and the achieved accuracy with respect to the step size is determined for all formulations of the pendulum. Furthermore, the residuals of the constraints are calculated for the BDF k = 6 method. Four formulations of the mathematical pendulum are considered: the original s-index 2 formulation (4.6), the regularised s-index 0 formulation (4.9), as well as the s-index 1 and s-index 0 formulations (4.7) and (4.8) obtained by index reduction, i.e. by differentiation of the constraints.

In Figures 5.3-5.6, the accuracy of the different methods with respect to the step size is depicted for the four formulations. A closer look at Figure 5.4 shows that the original s-index 2 formulation of the pendulum seems to have a lower bound of 10^-4 for the accuracy, although the prescribed tolerance is 10^-10. Hence, this formulation is not suitable for numerical integration. The s-index 1 formulation (Figure 5.5) gives results similar to the regularised s-index 0 formulation (Figure 5.3), except that the accuracy for the RadauIIa s = 3 method does not fall below 10^-8. Furthermore, the s-index 0 formulation (Figure 5.6) also gives results similar to the regularised s-index 0 formulation (Figure 5.3), except that for the implicit Euler method (RadauIIa s = 1 and BDF k = 1) and relatively large step sizes h the measured error is much larger. To explain this behaviour, an analysis of the residuals of the constraints is in order.
Figure 5.3: Accuracy of the different methods with respect to the step size h for the regularised s-index 0 formulation (tol = 10^-10, T = 10 s)

Figure 5.4: Accuracy of the different methods with respect to the step size h for the s-index 2 formulation (tol = 10^-10, T = 10 s)

Figure 5.5: Accuracy of the different methods with respect to the step size h for the s-index 1 formulation (tol = 10^-10, T = 10 s)

Figures 5.7-5.10 show the residuals of the constraints for the BDF k = 6 method with h = 0.005 s, tol = 10^-10 and T = 1000 s. Compared to the accuracy analysis, a much larger end point T is chosen in order to show the numerical effects more clearly. All three constraints are considered: the constraint of level 0 (0 = x_1² + x_2² − l²), the constraint of level 1 (0 = x_1 v_1 + x_2 v_2) and the constraint of level 2 (0 = v_1² + v_2² − (2/m)(x_1² + x_2²)λ − g x_2). For the regularised s-index 0 formulation (Figure 5.7), the residuals of the constraints are all below the prescribed tolerance, as desired, whereas for the original s-index 2 formulation (Figure 5.8) only the residual of the constraint of level 0 is satisfactory. The residuals of the constraints of levels 1 and 2 oscillate around zero with an amplitude higher than the prescribed tolerance tol = 10^-10. This is caused by the fact that both constraints are hidden in the DAE and not explicitly given, as they are in the regularised s-index 0 formulation. The numerical method notices that the hidden constraints are violated and tries to correct this, but overreacts, which leads to the observed oscillations. Moreover, the deeper a constraint is hidden, the stronger the overreaction and therefore the amplitude of the oscillations.
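For reference, the three residuals can be evaluated directly from a state (x_1, x_2, v_1, v_2, λ). The following Python sketch uses assumed values l = 1 m, m = 1 kg and g = 9.81 m/s², together with a sign convention in which the pendulum hangs at rest at x_2 = −l; the thesis may use different constants and signs.

```python
# Residuals of the three constraint levels monitored in Figures 5.7-5.10.
l, m, g = 1.0, 1.0, 9.81  # assumed pendulum length, mass and gravity

def residuals(x1, x2, v1, v2, lam):
    r0 = x1**2 + x2**2 - l**2                                    # level 0: position constraint
    r1 = x1*v1 + x2*v2                                           # level 1: velocity constraint
    r2 = v1**2 + v2**2 - (2.0/m)*(x1**2 + x2**2)*lam - g*x2      # level 2: acceleration constraint
    return r0, r1, r2

# Consistent rest state: hanging straight down, where lambda = m*g/(2*l)
# balances gravity, so all three residuals vanish.
print(residuals(0.0, -l, 0.0, 0.0, m*g/(2.0*l)))  # prints (0.0, 0.0, 0.0)
```

Monitoring these three quantities along a numerical trajectory reproduces exactly the kind of drift and oscillation plots discussed above.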
In the s-index 1 formulation (Figure 5.9), the constraint of level 0 is no longer contained in the system of equations and the constraint of level 1 is explicitly given, whereas the constraint of level 2 is hidden. Therefore, a drift of the residual of the constraint of level 0 can be observed due to the discretisation error, and the residual of the constraint of level 2 oscillates with an amplitude higher than 10^-10. Finally, in the s-index 0 formulation (Figure 5.10), the constraints of levels 0 and 1 are not contained in the DAE and the constraint of level 2 is explicitly given. Thus, a drift of the residuals of the constraints of levels 0 and 1 can be observed, whereas the residual of the constraint of level 2 stays below the prescribed tolerance.

In summary, formulations which are not strangeness-free contain hidden constraints, which cause oscillations of the residuals with amplitudes higher than the prescribed tolerance. Furthermore, index reduction by differentiation of the constraints lowers the index, but also removes constraints from the system, such that a drift of the residuals can be observed. Therefore, the regularised s-index 0 formulation is the most suitable formulation, since all constraints are explicitly given in the system.

Figure 5.6: Accuracy of the different methods with respect to the step size h for the s-index 0 formulation (tol = 10^-10, T = 10 s)

Figure 5.7: Residuals of the constraints for the regularised s-index 0 formulation (BDF k = 6 method, h = 0.005 s, tol = 10^-10, T = 1000 s)
Figure 5.8: Residuals of the constraints for the s-index 2 formulation (BDF k = 6 method, h = 0.005 s, tol = 10^-10, T = 1000 s)

Figure 5.9: Residuals of the constraints for the s-index 1 formulation (BDF k = 6 method, h = 0.005 s, tol = 10^-10, T = 1000 s)

Figure 5.10: Residuals of the constraints for the s-index 0 formulation (BDF k = 6 method, h = 0.005 s, tol = 10^-10, T = 1000 s)

5.3 Computing time and accuracy

In this section, a comparison of the implemented methods for the two test problems with respect to computing time and accuracy is presented. The first subsection deals with the mathematical pendulum and the second subsection addresses the reheat furnace model.

5.3.1 Mathematical pendulum

In Figure 5.3, we have already seen the accuracy of the different methods with respect to the step size h for the regularised s-index 0 formulation. As expected, methods of higher order achieve good accuracy for relatively large step sizes h, whereas methods of lower order require relatively small step sizes h in order to achieve a reasonable accuracy.

Figure 5.11: Computing time of the different methods with respect to the step size h for the regularised s-index 0 formulation (tol = 10^-10, T = 10 s)

Figure 5.11 shows the computing time of the different methods with respect to the step size h for the regularised s-index 0 formulation. It can be observed that the RadauIIa methods require more computing time with increasing order due to the larger system of nonlinear equations which has to be solved in each time step, i.e. a system of size 5 × 5 for RadauIIa s = 1, of size 10 × 10 for RadauIIa s = 2 and of size 15 × 15 for RadauIIa s = 3.
In contrast, the BDF methods do not require significantly more computing time with increasing order, since the systems of nonlinear equations have the same size of 5 × 5.

Figure 5.12: Computing time of the different methods with respect to the reached accuracy for the regularised s-index 0 formulation (tol = 10^-10, T = 10 s)

In Figure 5.12, the computing time of the different methods with respect to the reached accuracy for the regularised s-index 0 formulation is depicted. It shows that the BDF k = 6 method is preferable for a desired accuracy better than 10^-4, and that the RadauIIa s = 3 method should be chosen for a desired accuracy between 10^-4 and 10^-2.

5.3.2 Reheat furnace model

In Figure 5.13, the accuracy of the different methods with respect to the step size h for the furnace is depicted. It is striking that the RadauIIa s = 3 method yields the best accuracy results. Furthermore, it can be observed that the BDF k = 6 method has two outliers, for the step sizes h = 120 s and h = 240 s. For these large step sizes, the Newton method does not converge within the maximum of 100 iterations, and the reached accuracy is therefore extremely poor.

Figure 5.14 shows the computing time of the different methods with respect to the step size h for the furnace. It can be observed that the RadauIIa methods require more computing time with increasing order due to the larger system of nonlinear equations which has to be solved in each time step, i.e. a system of size 150 × 150 for RadauIIa s = 1, of size 300 × 300 for RadauIIa s = 2 and of size 450 × 450 for RadauIIa s = 3. In contrast, the BDF methods do not require significantly more computing time with increasing order, since the systems of nonlinear equations have the same size of 150 × 150.
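The growth of the nonlinear system with the number of stages can be seen directly from the stage equations of an implicit Runge-Kutta method applied to F(t, x, ẋ) = 0: all s stage derivatives K_1, …, K_s are coupled, so the root-finding problem has size s·n. A Python sketch with assumed, simplified interfaces (for RadauIIa s = 1 this reduces to the implicit Euler method):

```python
import numpy as np

def stage_residual(F, t, x, h, K, A, c):
    """Stacked residual of the stage equations of an s-stage implicit
    Runge-Kutta method for F(t, x, xdot) = 0.
    K: (s, n) array of unknown stage derivatives; A, c: Butcher tableau.
    The root of this size-(s*n) system defines one time step."""
    s, n = K.shape
    R = np.empty((s, n))
    for i in range(s):
        X_i = x + h * (A[i] @ K)           # stage value X_i = x + h * sum_j a_ij K_j
        R[i] = F(t + c[i] * h, X_i, K[i])  # implicit form: F(t_i, X_i, K_i) = 0
    return R.ravel()

# RadauIIa s = 1 (implicit Euler): A = [[1]], c = [1].
A, c = np.array([[1.0]]), np.array([1.0])
# Implicit formulation of the scalar ODE xdot = -x:
F = lambda t, x, xdot: xdot + x
# For a step from x = 1 with h = 0.1, the stage derivative K = -1/1.1 solves
# the stage equation, so the residual (numerically) vanishes:
r = stage_residual(F, 0.0, np.array([1.0]), 0.1, np.array([[-1.0 / 1.1]]), A, c)
```

For s = 3 stages and the furnace's n = 150 unknowns, this residual has 450 components, which is exactly the 450 × 450 system mentioned above.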
Unexpectedly, the BDF methods require much more computing time than the RadauIIa s = 1 method, although the systems of nonlinear equations have the same size. We will investigate this further below.

Figure 5.13: Accuracy of the different methods with respect to the step size h for the furnace (tol = 10^-10, T = 5 h)

Figure 5.14: Computing time of the different methods with respect to the step size h for the furnace (tol = 10^-10, T = 5 h)

Figure 5.15: Computing time of the different methods with respect to the reached accuracy for the furnace (tol = 10^-10, T = 5 h)

In Figure 5.15, the computing time of the different methods with respect to the reached accuracy for the furnace is depicted. It shows that the RadauIIa s = 2 method is preferable for a desired accuracy above 10^-4. For a desired accuracy better than 10^-4, either the RadauIIa s = 3 method or one of the higher order BDF methods (k = 5 or k = 6) should be chosen. It is remarkable that the RadauIIa s = 2 method, and not one of the BDF methods, is the best method in the relevant accuracy range between 10^-4 and 10^-2, although most of the BDF methods have a higher order and a smaller system of nonlinear equations to be solved in each time step. Furthermore, it is striking that for the BDF methods the slope of the curves in Figure 5.14 is close to zero. This suggests that the time spent on the time integration itself is insignificant compared to the total computing time. Therefore, a detailed look at the different parts contributing to the computing time is in order. In Table 5.2, the analysis of the computing time for the BDF methods, in absolute values and in percent, is shown.
It becomes apparent that the time spent on the time integration contributes only 3% to the total time, while determining the symbolic Jacobian takes around 10% to 11%, and most of the time, around 86% to 87%, is spent on determining the initial values. For all BDF methods, the missing initial values are determined by the RadauIIa s = 3 method, since it has the highest order of the implemented one-step methods. Therefore, the analysis of the computing time for the RadauIIa methods, in absolute values and in percent, is shown in Table 5.3. For all RadauIIa methods, the major part of the total computing time, 77% to 79%, is spent on determining the symbolic Jacobians, while the time integration only takes around 21% to 23% of the total time. In particular, determining the symbolic Jacobians for the RadauIIa s = 3 method takes 301 s, which explains the high contribution of the time spent on determining the initial values for the BDF methods.
                                             BDF k = 1    BDF k = 2    BDF k = 3
Time for determining the symbolic Jacobian   38 s (11%)   35 s (10%)   36 s (10%)
Time for determining the initial values      300 s (86%)  307 s (87%)  302 s (87%)
Time for the time integration                12 s (3%)    11 s (3%)    10 s (3%)
Total time                                   350 s        353 s        348 s

                                             BDF k = 4    BDF k = 5    BDF k = 6
Time for determining the symbolic Jacobian   38 s (11%)   40 s (11%)   41 s (11%)
Time for determining the initial values      305 s (86%)  303 s (86%)  306 s (86%)
Time for the time integration                11 s (3%)    12 s (3%)    11 s (3%)
Total time                                   354 s        354 s        358 s

Table 5.2: Analysis of the computing time of the BDF methods for the furnace (tol = 10^-10, h = 60 s, T = 5 h)

                                             RadauIIa s = 1   RadauIIa s = 2   RadauIIa s = 3
Time for determining the symbolic Jacobian   36 s (79%)       130 s (77%)      301 s (77%)
Time for the time integration                9 s (21%)        38 s (23%)       88 s (23%)
Total time                                   45 s             168 s            390 s

Table 5.3: Analysis of the computing time of the RadauIIa methods for the furnace (tol = 10^-10, h = 60 s, T = 5 h)

In consequence of the above results, the computing time for the determination of the Jacobians should be decreased considerably in order to obtain an applicable and efficient numerical solver for DAEs. Therefore, the transformation of the symbolic functions into MATLAB function handles by the function matlabFunction is modified with the additional option 'file', which causes a function file with optimised code to be created instead of a function handle (see MathWorks 2014). This procedure only has to be carried out once for every method and problem, such that in this case nine files for the functions themselves and nine files for the Jacobians are required. Hence, the resulting computing time of a simulation using the generated function files is caused mainly by the time integration, i.e. the time integration takes more than 99% of the total time. It is remarkable, however, that the generation of the function files takes much more time than creating the MATLAB function handles.
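In Python terms, this precomputation corresponds to building the symbolic function and its Jacobian once and compiling them into fast numerical callables. The following sketch uses SymPy's lambdify as a rough analogue of MATLAB's matlabFunction; the toy 2 × 2 system G below is hypothetical and merely stands in for the furnace equations.

```python
import numpy as np
import sympy as sp

# Hypothetical 2x2 system G(x1, x2) standing in for the furnace system.
x1, x2 = sp.symbols('x1 x2')
G = sp.Matrix([x1**2 + x2**2 - 1, x1 - x2])
JG = G.jacobian([x1, x2])          # symbolic Jacobian, computed once

# Compile the symbolic expressions into numerical functions once, up front.
# (SymPy's code-generation utilities can also emit source files, mirroring
# the 'file' option of matlabFunction used in the thesis.)
G_num = sp.lambdify((x1, x2), G, 'numpy')
JG_num = sp.lambdify((x1, x2), JG, 'numpy')

print(JG_num(1.0, 1.0))  # Jacobian evaluated numerically, no symbolics at runtime
```

The point of the design is the same as in the thesis: all symbolic work happens once before the simulation, so each Newton iteration only evaluates compiled numerical code.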
In Tables 5.4 and 5.5, the computing times for generating the function files for the BDF methods and for the RadauIIa methods, respectively, are shown. For the BDF methods, the computing time for generating the function file for G grows slowly with the order, while the computing time for generating the function files for the Jacobians J_G remains constant with the order. Remember that the size of the system of nonlinear equations is the same for all BDF methods, namely 150 × 150. In contrast, the size of the system of nonlinear equations increases with the order for the RadauIIa methods. Accordingly, the computing time for generating the function files increases with the order, and the increase appears to grow exponentially with the size of the systems. In particular, generating the function file for the Jacobian J_G for the RadauIIa s = 3 method takes more than one day, while creating the corresponding function handle with the old procedure took only 301 s. Since the generation of the files has to be done only once, this amount of time is still acceptable. Particularly with regard to the planned use of the DAE solver for NMPC, where the simulation has to be carried out every time a new measurement is available, the generation of function files should be preferred to function handles, since the function handles have to be created anew for every simulation. Additionally, the computing time for generating the function files could possibly be reduced by using other symbolic differentiation codes which are more efficient than MATLAB.
              BDF k = 1   BDF k = 2   BDF k = 3   BDF k = 4   BDF k = 5   BDF k = 6
Function G    24 s        27 s        32 s        34 s        36 s        40 s
Jacobian J_G  211 s       214 s       217 s       214 s       213 s       212 s

Table 5.4: Computing times for generating the function files and the corresponding files for the Jacobians for the BDF methods

              RadauIIa s = 1   RadauIIa s = 2   RadauIIa s = 3
Function G    53 s             7 min            21 min
Jacobian J_G  316 s            203 min          37 h

Table 5.5: Computing times for generating the function files and the corresponding files for the Jacobians for the RadauIIa methods

By using the generated function files, new results regarding the computing time with respect to the step size and with respect to the reached accuracy can be obtained, while the results for the accuracy with respect to the step size remain the same (see Figure 5.13). Figure 5.16 shows the new computing time of the different methods with respect to the step size h for the furnace. Compared to the old computing times (see Figure 5.14), all values of the computing time are now much lower for all step sizes and methods. Furthermore, it can be observed that the BDF methods require almost the same amount of computing time as the RadauIIa s = 1 method, because the systems of nonlinear equations have the same size. In addition, the computing time of the BDF k = 6 method for the step size h = 240 s is much higher than expected due to the divergence of the Newton method, i.e. the maximum of 100 iterations is exhausted. In Figure 5.17, the new computing time of the different methods with respect to the reached accuracy for the furnace is depicted. It shows that the BDF k = 6 method is preferable for a desired accuracy better than 10^-2, and that the BDF k = 5 method should be chosen for a desired accuracy between 10^-2 and 10^-1.
Figure 5.16: Computing time of the different methods with respect to the step size h for the furnace (tol = 10^-10, T = 5 h)

Figure 5.17: Computing time of the different methods with respect to the reached accuracy for the furnace (tol = 10^-10, T = 5 h)

Chapter 6  Conclusion and future research

In this thesis, we have investigated three main topics. First, we have focused intensively on the analysis of DAEs. Second, we have investigated the numerical treatment of DAEs. And as a third topic, we have introduced two test examples and applied several implementations of numerical methods to them.

Concerning the analysis of DAEs, we considered in chapter 2 the range of DAEs from simple types to more general ones. Starting with linear DAEs with constant coefficients, we successively treated linear DAEs with variable coefficients and finally general nonlinear DAEs. In particular, we investigated different concepts of the so-called index, with a main focus on the strangeness index. We have seen that DAEs are not just simple combinations of differential and algebraic equations: further algebraic constraints can be hidden in the system. These hidden constraints are the main challenge of DAEs, since they impose additional consistency conditions on the initial values and stronger smoothness requirements on the problem in order to obtain existence and uniqueness results. Therefore, they provoke several difficulties in the analysis as well as in the numerical treatment of DAEs of higher index.

Chapter 3 was concerned with the numerical solution of DAEs. Due to the difficulties which can arise when numerical methods for ODEs are directly used to solve higher index DAEs, we first presented two regularisation techniques which transform the original DAE into a strangeness-free DAE with the same set of solutions.
The first approach was based on Hypothesis 2; it is rather technical and requires the determination of the whole derivative array. In contrast, the second approach is based on Procedure 3, which means that only the algebraic constraints, and not the whole DAE, need to be differentiated. However, Procedure 3 is only applicable to quasi-linear DAEs, while Hypothesis 2 is suited for general nonlinear DAEs. Afterwards, the two main classes of discretisation methods, namely one-step methods and linear multi-step methods, and their application to strangeness-free DAEs were described, and important aspects such as the uniqueness of the numerical solution x_{i+1}, the consistency of x_{i+1}, and the convergence and stability properties of certain methods were discussed. As a result, we have seen that RadauIIa methods and BDF methods are excellent candidates for the numerical solution of initial value problems for strangeness-free DAEs.

In chapter 5, the numerical results of several implementations of BDF methods and RadauIIa methods applied to the two test problems introduced in chapter 4, the mathematical pendulum and a reheat furnace model, were presented and discussed. First, the efficiency of different approaches for solving systems of nonlinear equations was analysed, leading to the conclusion that the Newton method with Jacobians determined by the MATLAB Symbolic Math Toolbox is preferable. Second, the different formulations of the mathematical pendulum were compared with respect to the accuracy of their results, and the advantages of a strangeness-free regularisation were shown. In particular, it was illustrated that formulations which are not strangeness-free lead to oscillations of the residuals of the constraints.
Furthermore, it was shown that index reduction by differentiation of the constraints lowers the index, but also removes the constraints from the system, such that a drift of the residuals was observed. Therefore, strangeness-free regularisations in which all constraints are stated explicitly are preferable. Third, a comparison of the implemented methods for the two test problems with respect to computing time and accuracy was presented. For the mathematical pendulum, it turned out that the best method for a desired accuracy better than 10^-4 is the BDF k = 6 method. For the reheat furnace model, the BDF k = 6 method is also best for a desired accuracy better than 10^-2. However, one should be careful with the conjecture that BDF methods are in general better than RadauIIa methods. The reader should keep in mind that RadauIIa methods have much better stability properties, i.e. they are A- and L-stable, so that for stiff problems they could be the better choice. We have already seen this in Figure 5.13, where the BDF k = 6 method has two outliers for the step sizes h = 120 s and h = 240 s, which can be caused by stability problems. Another conclusion of this chapter was that the generation of the Jacobians during the numerical integration is very time consuming compared to the time spent on the time integration itself, especially for the reheat furnace model. Thus, in order to reduce the computing time of the numerical methods, the required Jacobians for every method and problem were generated and stored in function files in advance.

Finally, we give some directions for future research that might build on this thesis. In particular, there are many possibilities to improve the current implementation of the DAE solver:

• It would be desirable to implement an automated detection that recognises whether the provided DAE is not strangeness-free.
Furthermore, an automated determination of the strangeness index as well as of a strangeness-free regularisation of the considered DAE would be desirable. Until now, this effort has to be made by the user in advance, i.e. the user has to determine the index of the considered problem and, if the index is not zero, to provide a regularisation of the problem in order to obtain reliable results from the numerical integration. Thus, an implementation of Procedure 3 and subsequently an automated regularisation of the DAE would be desirable.

• The performance of the DAE solver could be improved by implementing a step size control, i.e. by no longer using a fixed step size h.

• Furthermore, the method used for the numerical solution of systems of nonlinear equations could be improved. The implemented ordinary Newton method belongs to the so-called local Newton methods, which require sufficiently good initial guesses for convergence. In contrast, global Newton methods are able to compensate for bad initial guesses, for instance by damping or trust region strategies (see Deuflhard 2011).

• Another approach for the numerical solution of systems of nonlinear equations would be the implementation of the simplified Newton method, which uses a fixed approximation of the Jacobian for several or all steps of the Newton iteration. Especially for large systems, where the evaluation of the Jacobian at discrete values is costly, this could improve the overall computing time, although the method is then only linearly convergent instead of quadratically convergent.

• The Jacobians used in the Newton iteration are so far not treated as sparse, although in many applications they would be. By exploiting the sparsity of the Jacobians, the efficiency of the solution of the arising linear systems in the Newton iteration could be improved.
• In addition, for extremely large systems of nonlinear equations, it would be desirable to solve the arising linear systems in the Newton iteration with an iterative solver, e.g. GMRES. Until now, the backslash operator of MATLAB is used, so the linear systems are solved by a direct solver, namely the LU decomposition.

• We have seen that the computing time required for the determination of the Jacobians and the subsequent generation of the function files increases extremely with the size of the system; for a system of size 450 × 450, it already takes more than one day (see Table 5.5). Therefore, it is worth considering a more efficient program than MATLAB for the determination of the symbolic derivatives, in order to be able to treat problems with even larger system sizes.

In conclusion, we are confident that we have developed efficient numerical methods for solving DAEs, enabling a prospective follow-up thesis project to develop an applicable NMPC algorithm with DAEs as system models.

Bibliography

Bollhöfer, Matthias / Mehrmann, Volker (2004): Numerische Mathematik. Vieweg, Wiesbaden.

Brenan, K. E. / Campbell, Stephen L. / Petzold, Linda R. (1996): Numerical Solution of Initial-Value Problems in Differential-Algebraic Equations. Society for Industrial and Applied Mathematics, Philadelphia.

Campbell, Stephen L. (1987): A general form for solvable linear time varying singular systems of differential equations. SIAM Journal on Mathematical Analysis, Volume 18, Issue 4, pp. 1101–1115.

Campbell, Stephen L. / Gear, C. William (1995): The index of general nonlinear DAEs. Numerische Mathematik, Volume 72, Issue 2, pp. 173–196.

Campbell, Stephen L. / Griepentrog, Eberhard (1995): Solvability of general differential algebraic equations. SIAM Journal on Scientific Computing, Volume 16, Issue 2, pp. 257–270.

Campbell, Stephen L. / Meyer, Carl D.
(1979): Generalized Inverses of Linear Transformations. Pitman, San Francisco.
Chen, Zhigang / Xu, Chao / Zhang, Bin / Shao, Huihe / Zhang, Jianmin (2003): Advanced control of walking-beam reheating furnace. Journal of University of Science and Technology Beijing, Volume 10, Issue 4, pp. 69–74.
Deuflhard, Peter (2011): Newton Methods for Nonlinear Problems. Affine Invariance and Adaptive Algorithms. Springer Series in Computational Mathematics, Volume 35, Springer-Verlag, Berlin.
DotX Control Solutions BV (2014a): Process control projects. URL: http://www.dotxcontrol.com/en/projects/process-control.html.
DotX Control Solutions BV (2014b): Reheat Furnace Model. Internal project notes.
DotX Control Solutions BV (2014c): Wind turbine control projects. URL: http://www.dotxcontrol.com/en/projects/wind-turbine-control.html.
Gantmacher, F. R. (1959): The Theory of Matrices, Volume 2. Chelsea Publishing Company, New York.
Gear, C. William (1988): Differential-algebraic equation index transformations. SIAM Journal on Scientific and Statistical Computing, Volume 9, Issue 1, pp. 39–47.
Griepentrog, Eberhard / März, Roswitha (1986): Differential-algebraic equations and their numerical treatment. Teubner Verlag, Leipzig.
Hairer, Ernst / Lubich, Christian / Roche, Michel (1989): The Numerical Solution of Differential-Algebraic Systems by Runge-Kutta Methods. Springer-Verlag, Berlin.
Hairer, Ernst / Nørsett, Syvert P. / Wanner, Gerhard (1993): Solving Ordinary Differential Equations I. Springer-Verlag, Berlin.
Hairer, Ernst / Wanner, Gerhard (1983): On the Instability of the BDF Formulas. SIAM Journal on Numerical Analysis, Volume 20, Issue 6, pp. 1206–1209.
Ko, Hyun Suk / Kim, Jung-Su / Yoon, Tae-Woong / Lim, Mokeun / Yang, Dae Ryuk / Jun, Ik Soo (2000): Modeling and Predictive Control of a Reheating Furnace. Proceedings of the American Control Conference, Chicago, USA, pp. 2725–2729.
Kronecker, L.
(1890): Algebraische Reduction der Schaaren bilinearer Formen. Sitzungsberichte der Königlich Preußischen Akademie der Wissenschaften zu Berlin, zweiter Halbband, pp. 1225–1237.
Kunkel, Peter / Mehrmann, Volker (2006): Differential-Algebraic Equations - Analysis and Numerical Solution. European Mathematical Society, Zürich.
MathWorks (2014): MATLAB Documentation. URL: http://www.mathworks.de/help/symbolic/matlabfunction.html.
März, Roswitha (1981): Multistep methods for initial value problems in implicit differential-algebraic equations. Humboldt-Universität zu Berlin, Sektion Mathematik, Preprint No. 22.
Pantelides, Constantinos C. (1988): The Consistent Initialization of Differential-Algebraic Systems. SIAM Journal on Scientific and Statistical Computing, Volume 9, Issue 2, pp. 213–231.
Petzold, Linda R. (1982): Differential/Algebraic Equations are not ODE's. SIAM Journal on Scientific and Statistical Computing, Volume 3, Issue 3, pp. 367–384.
Petzold, Linda R. (1986): Order Results for Implicit Runge-Kutta Methods Applied to Differential/Algebraic Systems. SIAM Journal on Numerical Analysis, Volume 23, Issue 4, pp. 837–852.
Pike, H. E. Jr. / Citron, S. J. (1970): Optimization Studies of a Slab Reheating Furnace. Automatica, Volume 6, Issue 1, pp. 41–50.
Prothero, A. / Robinson, A. (1974): On the Stability and Accuracy of One-Step Methods for Solving Stiff Systems of Ordinary Differential Equations. Mathematics of Computation, Volume 28, Issue 125, pp. 145–162.
Rheinboldt, Werner C. (1984): Differential-Algebraic Systems as Differential Equations on Manifolds. Mathematics of Computation, Volume 43, Issue 168, pp. 473–482.
Steinbrecher, Andreas (2006): Numerical Solution of Quasi-Linear Differential-Algebraic Equations and Industrial Simulation of Multibody Systems. Ph.D. thesis, Technische Universität Berlin.
Weierstraß, K.
(1858): Über ein die homogenen Functionen betreffendes Theorem, nebst Anwendung desselben auf die Theorie der kleinen Schwingungen. Monatsberichte der Königlich Preußischen Akademie der Wissenschaften zu Berlin, pp. 207–220.
Weintraub, Steven H. (2009): Jordan Canonical Form: Theory and Practice. Morgan & Claypool.
Zhang, Bin / Chen, Zhigang / Xu, Liyun / Wang, Jingcheng / Zhang, Jianmin / Shao, Huihe (2002): The Modeling and Control of a Reheating Furnace. Proceedings of the American Control Conference, Anchorage, USA, pp. 3823–3828.
