# Differential-algebraic equations and matrix-valued singular perturbation

Linköping studies in science and technology. Dissertations. No. 1292

**Henrik Tidefelt**
[email protected], www.control.isy.liu.se
Division of Automatic Control, Department of Electrical Engineering
Linköping University, SE-581 83 Linköping, Sweden

ISBN 978-91-7393-479-4, ISSN 0345-7524
Copyright © 2009 Henrik Tidefelt. Printed by LiU-Tryck, Linköping, Sweden 2009.

Cover illustration: stereo pair showing the entries of a sampled uncertain lti dae of nominal index 2 in its canonical form, displayed as the matrix pair (E, A). The uniform distributions correspond to the intervals in table 7.3, signs were ignored, and values were transformed by the map x ↦ (|x| + |x|^{1/7})/2 to enhance resolution near 0 and 1. Sampled values are encoded both as the area of the markers and as height in the image. Left eye's view to the left, right eye's view to the right.

*To Nina*

## Abstract

With the arrival of modern component-based modeling tools for dynamic systems, the differential-algebraic equation form is increasing in popularity, as it is general enough to handle the resulting models. However, this thesis stresses that if uncertainty, no matter how small, is allowed in the equations, they generally become ill-posed. Rather than deeming the general differential-algebraic structure useless up front for this reason, the suggested approach to the problem is to ask what assumptions can be made in order to obtain well-posedness. Here, well-posedness is used in the sense that the uncertainty in the solutions should tend to zero as the uncertainty in the equations tends to zero.
The main theme of the thesis is to analyze how the uncertainty in the solution to a differential-algebraic equation depends on the uncertainty in the equation. In particular, uncertainty in the leading matrix of linear differential-algebraic equations leads to a new kind of singular perturbation, which is referred to as matrix-valued singular perturbation. Though a natural extension of existing types of singular perturbation problems, this topic has not been studied in the past. As it turns out that assumptions about the equations have to be made in order to obtain well-posedness, it is stressed that the assumptions should be selected carefully in order to be realistic to use in applications. Hence, it is suggested that any assumptions (not counting properties which can be checked by inspection of the uncertain equations) should be formulated in terms of coordinate-free system properties. In this thesis, the location of system poles has been the chosen target for assumptions.

Three chapters are devoted to the study of uncertain differential-algebraic equations and the associated matrix-valued singular perturbation problems. Only linear equations without forcing function are considered. For both time-invariant and time-varying equations of nominal differentiation index 1, the solutions are shown to converge as the uncertainties tend to zero. For time-invariant equations of nominal index 2, convergence has not been shown to occur except for an academic example. However, the thesis contains other results for this type of equations, including the derivation of a canonical form for the uncertain equations.

While uncertainty in differential-algebraic equations has been studied in depth, two related topics have been studied only in passing. One chapter considers the development of point-mass filters for state estimation on manifolds. The highlight is a novel framework for general algorithm development with manifold-valued variables.
The connection to differential-algebraic equations is that one of their characteristics is an underlying manifold structure imposed on the solution. One chapter presents a new index, closely related to the strangeness index of a differential-algebraic equation. Basic properties of the strangeness index are shown to be valid also for the new index. The definition of the new index is conceptually simpler than that of the strangeness index, hence making it potentially better suited for both practical applications and theoretical developments.

## Populärvetenskaplig sammanfattning (popular-science summary)

The thesis is mainly about computing how uncertainty in so-called differential-algebraic equations affects the uncertainty in the solutions of these equations. By studying problems that allow uncertainties with less structure than in previous research, the problem quickly leads on to a new class of singular perturbation problems, here called matrix-valued singular perturbation problems. Besides being a tool for understanding uncertainty in the solutions of equations with uncertainty, the analysis in the thesis aims to create tools that can also be used for problems without uncertainty. As a first example of such problems, consider equations formulated in symbolic software for differential-algebraic equations, where one cannot always trust the software to be able to prove that a certain expression will be zero along the solution trajectory of the equation. It can then be advantageous to be able to regard the expression as an uncertain value near zero. As a second example, consider time-dependent differential-algebraic equations where the leading matrix, depending continuously on time, loses rank at a certain point in time.
It can then be advantageous to be able to approximate the leading matrix by one that has the lower rank over a whole interval of times shortly before and shortly after the time where the actual rank is lower. Besides the results concerning uncertainty in differential-algebraic equations, the thesis contains a chapter with results on estimating, from uncertain measurements, an unknown variable that belongs to a manifold (a sphere is used as an example). The proposed method builds on partitioning the manifold into small pieces and computing the probability that the variable is in each piece. The method itself is not new; the focus is on proposing a framework for algorithms for this kind of problem. Problems with manifold structure regularly arise in connection with differential-algebraic equations. Another chapter of the thesis concerns a new so-called index concept for differential-algebraic equations. The new index is closely related to another well-established index, but is defined in a simpler way. The new index may be of value both in itself and as a way to shed light on the well-established one.

## Acknowledgments

My thanks to Professor Lennart Ljung, head of the Division of Automatic Control, for generously allowing me to conduct research in his group. Lennart has been my co-supervisor, and my work would only be half-finished by now were it not for his efforts to make me complete it. My thanks also to Professor Torkel Glad for being my supervisor; I'll get back to his name soon. Ulla Salaneck, secretary at the group, is a Swiss army knife capable of solving any practical issue you can think of. Everybody knows this, but what is probably less known is that she is also very good at poking Lennart when it's about time to prod slow students into finishing their theses.
Johan Sjöberg has been an important source of inspiration for the work on differential-algebraic equations, in particular the strangeness index. Markus Gerdin was also there with experienced advice when it all started. Gustaf Hendeby has served the group with technical expertise in many areas related to computer software, including being the LaTeX guru for many years. He developed the rtthesis class used to typeset this thesis, and taught me enough about LaTeX so that I was able to tweak the class to my own taste. Gustaf also helped out with proofreading. Martin Enqvist and Daniel Petersson contributed with outstandingly thorough proofreading. Thanks go to Thomas Schön for letting me work with him in the popular field of state estimation. As you, my dear reader, might already have guessed, Torkel has been involved in proofreading most of the chapters. Those of you who know him can easily imagine the value of this contribution to a thesis in the region of automatic control where there is only little connection to reality. Umut Orguner in the office on the opposite side of the corridor knows too much, so everyone asks him all the questions, and he never refuses to answer. Christian Lyzell brings his great attitude to work, is a hobby hard-core Guitar Hero guru, and has also provided valuable feedback on mpscatter, the Matlab toolbox used to create most plots in this thesis. I also like our discussions on numerical maths, even though it isn't really related to my research. Talking about good discussions, Daniel Petersson deserves a second mention for his interest in and many solutions to a long list of mathematical problems I've had during the last years here. There are many things I like to do in my spare time, and I'm very happy to have had so many nice persons in the group to share my interests with.
I'm thinking of surfing waves and wind, nightly work on Shapes, gathering people for eating and play, hiking and kayaking, disc golf, climbing, coffee and lunch breaks, and more. It's been great, and I hope that completing this thesis is not the end of it! I am indebted to the Swedish Research Council for financial support of this work. Nina, you make me ☺ and laugh. In addition, your contribution to this thesis as the excellent chef behind it all has been worth a lot, so many thanks to you too. I'm sad that all the writing has prevented me so much lately from sharing my time with you, but I'm yours now!

Linköping, November 2009
Henrik Tidefelt

## Contents

**Notation**

**I Background**

**1 Introduction**
- 1.1 Differential-algebraic equations in automatic control
- 1.2 Introduction to matrix-valued singular perturbation
  - 1.2.1 Linear time-invariant examples
  - 1.2.2 Application to quasilinear shuffling
  - 1.2.3 A missing piece
  - 1.2.4 How to approach the nominal equations
  - 1.2.5 Final remarks
- 1.3 Problem formulation
- 1.4 Contributions
- 1.5 Thesis outline
- 1.6 Notation
  - 1.6.1 Mathematical notation
  - 1.6.2 dae and ode terminology

**2 Theoretical background**
- 2.1 Models in automatic control
  - 2.1.1 Examples
  - 2.1.2 Use in estimation
  - 2.1.3 Use in control
  - 2.1.4 Model classes
  - 2.1.5 Model reduction
  - 2.1.6 Scaling
- 2.2 Differential-algebraic equations
  - 2.2.1 Motivation
  - 2.2.2 Common forms
  - 2.2.3 Indices and their deduction
  - 2.2.4 Transformation to quasilinear form
  - 2.2.5 Structure algorithm
  - 2.2.6 lti dae, matrix pencils, and matrix pairs
  - 2.2.7 Initial conditions
  - 2.2.8 Numerical integration
  - 2.2.9 Existing software
- 2.3 Initial condition response bounds
  - 2.3.1 lti ode
  - 2.3.2 ltv ode
  - 2.3.3 Uncertain lti ode
- 2.4 Regular perturbation theory
  - 2.4.1 lti ode
  - 2.4.2 ltv ode
  - 2.4.3 Nonlinear ode
- 2.5 Singular perturbation theory
  - 2.5.1 lti ode
  - 2.5.2 Generalizations of scalar singular perturbation
  - 2.5.3 Multiparameter singular perturbation
  - 2.5.4 Perturbation of dae
- 2.6 Contraction mappings
- 2.7 Interval analysis
- 2.8 Gaussian elimination
- 2.9 Miscellaneous results

**3 Shuffling quasilinear dae**
- 3.1 Index reduction by shuffling
  - 3.1.1 The structure algorithm
  - 3.1.2 Quasilinear shuffling
  - 3.1.3 Time-invariant input affine systems
  - 3.1.4 Quasilinear structure algorithm
- 3.2 Proposed algorithm
  - 3.2.1 Algorithm
  - 3.2.2 Zero tests
  - 3.2.3 Longevity
  - 3.2.4 Seminumerical twist
  - 3.2.5 Monitoring
  - 3.2.6 Sufficient conditions for correctness
- 3.3 Consistent initialization
  - 3.3.1 Motivating example
  - 3.3.2 A bootstrap approach
  - 3.3.3 Comment
- 3.4 Conclusions

**II Results**

**4 Point-mass filtering on manifolds**
- 4.1 Introduction
- 4.2 Background and related work
- 4.3 Dynamic systems on manifolds
- 4.4 Point-mass filter
  - 4.4.1 Point-mass distributions on a manifold
  - 4.4.2 Measurement update
  - 4.4.3 Time update in general
  - 4.4.4 Dynamics that simplify time update
- 4.5 Point estimates
  - 4.5.1 Intrinsic point estimates
  - 4.5.2 Extrinsic point estimates
- 4.6 Algorithm and implementation
  - 4.6.1 Base tessellations (of spheres)
  - 4.6.2 Software design
  - 4.6.3 Supporting software
- 4.7 Example
- 4.8 Conclusions and future work
- 4.A Populating the spheres

**5 A new index close to strangeness**
- 5.1 Two definitions
  - 5.1.1 Derivative array equations and the strangeness index
  - 5.1.2 Analysis based on the strangeness index
  - 5.1.3 The simplified strangeness index
- 5.2 Relations
- 5.3 Uniqueness and existence of solutions
- 5.4 Implementation
  - 5.4.1 Computational complexity
  - 5.4.2 Notes from experiments
- 5.5 Conclusions and future work
**6 lti ode of nominal index 1**
- 6.1 Introduction
- 6.2 Schematic overview of nominal index 1 analysis
- 6.3 Decoupling transforms and initial conditions
- 6.4 A matrix result
- 6.5 An lti ode result
- 6.6 The fast and uncertain subsystem
- 6.7 The coupled system
- 6.8 Extension to non-zero pointwise index
- 6.9 Examples
- 6.10 Conclusions
- 6.A Details of proof of lemma 6.8

**7 lti ode of nominal index 2**
- 7.1 Canonical form
  - 7.1.1 Derivation based on Weierstrass decomposition
  - 7.1.2 Derivation without use of Weierstrass decomposition
  - 7.1.3 Example
- 7.2 Initial conditions
- 7.3 Growth of eigenvalues
- 7.4 Case study: a small system
  - 7.4.1 Eigenvalues
  - 7.4.2 Transition matrix
  - 7.4.3 Simultaneous consideration of initial conditions and transition matrix bounds
- 7.5 Conclusions
- 7.A Decoupling transforms
  - 7.A.1 Eliminating slow variables from uncertain dynamics
  - 7.A.2 Eliminating uncertain variables from slow dynamics
  - 7.A.3 Remarks on duality
- 7.B Example data

**8 ltv ode of nominal index 1**
- 8.1 Slowly varying systems
- 8.2 Time-varying systems with timescale separation
  - 8.2.1 Overview
  - 8.2.2 Eliminating slow variables from uncertain dynamics
  - 8.2.3 Eliminating uncertain variables from slow dynamics
- 8.3 Comparison with scalar perturbation
- 8.4 The decoupled system
- 8.5 Conclusions
- 8.A Dynamics of related systems

**9 Concluding remarks**

**A Sampling perturbations**
- A.1 Time-invariant perturbations
- A.2 Time-varying perturbations

**Bibliography**

**Index**

## Notation

These tables are provided as quickly accessible complements to the lengthier explanations of notation in section 1.6.

### Some sets, manifolds, and groups

| Notation | Meaning |
|---|---|
| N | Set of natural numbers. |
| R | Set of real numbers. |
| C | Set of complex numbers. |
| Rⁿ | Set of n-tuples of real numbers, or n-dimensional Euclidean space. |
| Sⁿ | The n-sphere, that is, the sphere of dimension n. |
| SO(n) | Special orthogonal group of dimension n. SO(3) is the group of rigid body rotations. |
| M | Standard manifold in chapter 4. |
| L_ν | Solution set of F^S_ν( x, ẋ, …, ẋ^{(ν+1)}, t ) = 0, in chapter 5. |

### Matrix properties

| Notation | Meaning |
|---|---|
| λ(X) | The set of eigenvalues of the matrix X. |
| α(X) | max { Re λ : λ ∈ λ(X) } |
| λ_min(X) | min { \|λ\| : λ ∈ λ(X) } |
| λ_max(X) | max { \|λ\| : λ ∈ λ(X) } |
| max(X) | max_{i,j} X_{ij} |
| ‖X‖₂ | Induced 2-norm of X, sup_{u ≠ 0} \|X u\| / \|u\|. |
| ‖X‖_I | sup_{t ∈ I} ‖X(t)‖₂, where I is a given interval of time. |
| X ⪰ Y | X − Y is positive semidefinite. |
| n | Matrix dimension, see section 1.6.1. |

### Basic functions and operators

| Notation | Meaning |
|---|---|
| I | Identity matrix or the identity map. |
| δ | Dirac delta "function", in chapter 4. |
| e^x | Exponential function evaluated at x. |
| e_p^{t v} | Exponential map based in p, evaluated at t v. |
| \|x\| | Modulus (absolute value) of x if x is scalar, 2-norm of x if x is a vector. |
| ⌊x⌋ | Floor of x, that is, the largest integer not greater than x. |
| ⌈x⌉ | Ceiling of x, that is, the smallest integer not less than x. |
| d(x, y) | Distance between x and y in the induced Riemannian metric. |
| X \ Y | Set difference, set of elements in X that are not in Y. |
| ∂X | Boundary of the set X. |
| • | Argument of function in bullet notation. Example: f(x, •, z) = y ↦ f(x, y, z). |

### Differentiation and shift operators

| Notation | Meaning |
|---|---|
| x′ | Derivative of x, with x being a function of a single real argument. Avoid confusion with ẋ. |
| x′^{(i)} | Derivative of x of order i. Avoid confusion with ẋ^{(i)}. |
| x′^{(i+)} | Sequence or concatenation of x′^{(i)}, x′^{(i+1)}, …, x′^{(ν+1)}, for some ν determined by context. Analogous definition for ẋ^{(i+)}. |
| x′^{{i}} | Sequence or concatenation of x, x′, …, x′^{(i)}. Analogous definition for ẋ^{{i}}. |
| q | Shift operator for sequences: (q x)(n) = x(n + 1). |
| ∇x | Gradient of x, see section 1.6.1. |
| ∇_i x | Gradient of x with respect to argument number i, see section 1.6.1. |
| ∇_{i+} x | Concatenated gradients of x with respect to all arguments starting from number i, see section 1.6.1. |

### Probability theory and filtering

| Notation | Meaning |
|---|---|
| P(H) | Probability of the event H. |
| f_x | Probability density function for the stochastic variable x. |
| f_{x\|y} | f_{x\|Y=y}, with x and Y stochastic variables, and y a point. |
| N(m, C) | Gaussian distribution with mean m and covariance C. |
| Var_X(x) | Variance of X at the point x, see (4.7). |
| y_{0..t} | Measurements up to time t, in chapter 4. |
| x_{s\|t} | State estimate at time s given y_{0..t}, in chapter 4. |

### Ordo notation

| Notation | Meaning |
|---|---|
| y = O(x) | ∃ δ > 0, k < ∞ : \|x\| < δ ⟹ \|y\| / \|x\| < k |
| y = O(x⁰) | Exception! ∃ δ > 0, k < ∞ : \|x\| < δ ⟹ \|y\| < k |
| y = o(x) | lim_{x→0} \|y\| / \|x\| = 0 |

### Intervals

| Notation | Meaning |
|---|---|
| [a, b] | { x ∈ R : a ≤ x ≤ b } |
| (a, b) | { x ∈ R : a < x < b } |
| (a, b] | { x ∈ R : a < x ≤ b } |
| [a, b) | { x ∈ R : a ≤ x < b } |

### Logical operators

| Notation | Meaning |
|---|---|
| P ∧ Q | Logical conjunction of P and Q, "P and Q". |
| P ∨ Q | Logical disjunction of P and Q, "P or Q". |

### Abbreviations

| Abbreviation | Meaning |
|---|---|
| bdf | Backwards difference formula |
| dae | Differential-algebraic equation(s) |
| irk | Implicit Runge-Kutta |
| lti | Linear time-invariant |
| ltv | Linear time-varying |
| ode | Ordinary differential equation(s) |

## Part I: Background

## 1 Introduction

This chapter gives an introduction to the thesis by explaining very briefly the field in which it has been carried out, presenting the contributions in view of a problem formulation, and giving some reading directions and explanations of notation.

### 1.1 Differential-algebraic equations in automatic control

This thesis has been carried out at the Division of Automatic Control, Linköping University, Sweden, within the research area nonlinear and hybrid systems. Differential-algebraic equations is one of a small number of research topics in this area.
We shall not dwell on whether these equations are particularly nonlinear or related to hybrid systems; much of the research so far in this group has been on linear time-invariant differential-algebraic equations, although there has lately been some research also on differential-algebraic equations that are not linear. From here on, the abbreviation dae will be used for differential-algebraic equation(s).

In the field of automatic control, various kinds of mathematical descriptions are used to build models of the objects to be controlled. Sometimes the equations are used primarily to compute information about the object (estimation), sometimes primarily to compute control inputs to the object (control), and often both tasks are performed in combination. From the automatic control point of view, dae are thus of interest due to their ability to model objects. Not only are they able to model many objects, but in several situations they provide a very convenient way of modeling these objects, as is further discussed in section 2.2. In practice, dae generally contain parameters that need to be estimated using measurements on the object; this process is called identification.

In this thesis the concern is neither primarily with estimation, control, nor identification of objects modeled by dae. Rather, we focus on the more fundamental question of how the equations relate to their solution in so-called initial value problems, that is, the problems of computing the future trajectory of the variables given external inputs and sufficient information about the variables at the initial time. It is believed that this will be beneficial for future development of the other three tasks.

### 1.2 Introduction to matrix-valued singular perturbation

Section 2.5 will give some background on scalar and multi-parameter singular perturbation problems, and in chapters 6, 7 and 8 methods from scalar singular perturbation theory (Kokotović et al., 1986) will play a key role in the theoretical development.
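For readers new to singular perturbation, here is a minimal numeric sketch of the scalar case (a toy system of my own, not taken from the thesis): a stiff two-variable LTI system whose fast equation is scaled by a small parameter m, compared with its reduced m = 0 limit.

```python
import numpy as np

# Toy scalar singular perturbation (all coefficient values invented):
#     x'(t) = -(a_x x(t) + b_x z(t))        (slow)
#   m z'(t) = -(a_z x(t) + b_z z(t))        (fast; m > 0 small)
a_x, b_x = 1.0, 0.5
a_z, b_z = 0.3, 1.0

def x_perturbed(m, t):
    # Exact LTI solution via eigendecomposition of the combined system matrix.
    M = np.array([[-a_x, -b_x], [-a_z / m, -b_z / m]])
    w, V = np.linalg.eig(M)
    y0 = np.array([1.0, -a_z / b_z])       # consistent with the m = 0 solution
    c = np.linalg.solve(V, y0)
    return np.real(V @ (np.exp(np.outer(w, t)) * c[:, None]))[0]

def x_nominal(t):
    # m = 0 forces z = -(a_z / b_z) x, leaving the reduced ODE
    # x'(t) = -(a_x - b_x a_z / b_z) x(t).
    return np.exp(-(a_x - b_x * a_z / b_z) * t)

t = np.linspace(0.0, 5.0, 201)
errs = {m: float(np.max(np.abs(x_perturbed(m, t) - x_nominal(t))))
        for m in (1e-2, 1e-4)}
for m, err in errs.items():
    print(f"m = {m:g}: max |x_m(t) - x_0(t)| = {err:.2e}")
```

The deviation from the nominal solution shrinks roughly in proportion to m. In the thesis the scalar m is replaced by a whole matrix of small uncertain entries, which is what makes the analysis hard.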
In view of the problems encountered when analyzing dae under uncertainty, we have coined the term matrix-valued singular perturbation to denote the generalization of the singular perturbation problems to the case when the uncertainties form a whole matrix of small values. For nonlinear time-invariant systems, the basic concern is systems in the form

$$
\begin{aligned}
x'(t) + g_x(\, x(t),\, z(t) \,) &\overset{!}{=} 0 \\
E\, z'(t) + g_z(\, x(t),\, z(t) \,) &\overset{!}{=} 0
\end{aligned}
\tag{1.1}
$$

where E is an unknown small matrix; max(E) ≤ m. (The subscripts on g are just meaningful ornaments, and do not denote partial derivatives.) For time-varying systems, E is also allowed to be time-varying, and even more general nonlinear systems are obtained by allowing E to also depend on x(t). The problem is to analyze the solutions to the equations as m → 0. However, the nonlinear form (1.1) is much more general than the forms addressed in the thesis. Below, we give examples, clarifications, and further motivation for the study of matrix-valued singular perturbation.

#### 1.2.1 Linear time-invariant examples

The linear time-invariant (often abbreviated lti) examples below are typical in that the equations are not given in matrix-valued singular perturbation form. Instead, the form appears after some well-conditioned operations on the equations and changes of variables, which will allow the solution to the original problem to be reconstructed from the solution to the matrix-valued singular perturbation problem.

**Example 1.1.** Starting from an index 0 dae in two variables,

$$
\begin{bmatrix} 1. & 7. \\ 1. & 3. \end{bmatrix} x'(t) +
\begin{bmatrix} 3. & 2. \\ 2. & 1. \end{bmatrix} x(t) \overset{!}{=} 0
$$

an index 1 dae in three variables is formed by making a copy of the second variable; x̄₁ = x₁, x̄₂ = x₂, x̄₃ = x₂. In the leading matrix (in front of x̄′(t)), the second variable
is replaced to 70% by the new variable. In the trailing matrix (in front of x̄(t)) we add a row with the coefficients of the copy relation x̄₂(t) − x̄₃(t) = 0:

$$
\underbrace{\begin{bmatrix} 1. & 2.1 & 4.9 \\ 1. & 0.9 & 2.1 \\ 0 & 0 & 0 \end{bmatrix}}_{E}
\bar x'(t) +
\underbrace{\begin{bmatrix} 3. & 2. & 0 \\ 2. & 1. & 0 \\ 0 & 1 & -1 \end{bmatrix}}_{A}
\bar x(t) \overset{!}{=} 0
\tag{1.2}
$$

To analyze this dae, it is noted that the leading matrix E is row reduced (its non-zero rows are linearly independent), and the row where the leading matrix has only zeros can be differentiated without introducing higher order derivatives. This leads to

$$
\begin{bmatrix} 1. & 2.1 & 4.9 \\ 1. & 0.9 & 2.1 \\ 0 & 1 & -1 \end{bmatrix} \bar x'(t) +
\begin{bmatrix} 3. & 2. & 0 \\ 2. & 1. & 0 \\ 0 & 0 & 0 \end{bmatrix} \bar x(t) \overset{!}{=} 0
$$

where x̄′(t) can be solved for, so that an ode is obtained. In the terminology of dae, index reduction has successfully revealed an underlying (implicit) ode.

Now, instead of performing index reduction on (1.2) directly, consider first applying the well-conditioned change of equations given by the matrix

$$
T \coloneqq 4 \cdot \begin{bmatrix} 2. & -9. & 0. \\ 8. & -5. & 3. \\ 1. & -5. & 7. \end{bmatrix}^{-1}
$$

It is natural to expect that this should not make a big difference to the difficulty of solving the dae via an underlying ode, but when the computation is performed on a computer, the picture is not quite as clear. The new dae has the matrices T E and T A. By computing a QR factorization (using standard computer software) of the leading matrix, a structurally upper triangular leading matrix was obtained together with an orthogonal matrix Q associated with this form. The corresponding trailing matrix is obtained as Qᵀ (T A). This leads to

$$
\begin{bmatrix} -2.2 & -0.53 & -0.41 \\ 0 & 0.62 & 1.4 \\ 0 & 0 & 3.4 \cdot 10^{-16} \end{bmatrix} \bar x'(t) +
\begin{bmatrix} -0.62 & -0.95 & -1.6 \\ 0.51 & 0.56 & -0.048 \\ -7.2 \cdot 10^{-17} & 0.46 & -0.46 \end{bmatrix} \bar x(t) \overset{!}{=} 0
$$

where a well-conditioned change of variables can bring the equations into the linear time-invariant form of (1.1) with E ∈ R^{1×1}. (One can just as easily construct examples where E is of dimension larger than 1 × 1.) Although this looks like an implicit ode, that view is unacceptable for two reasons.
First, the system of equations is extremely stiff. (Even worse, the fast mode happens to be unstable this time, not at all like the original system.) Second, considering numerical precision in hardware, it would not make sense to compute a solution that depends so critically on a coefficient that is not distinctly non-zero.

The ad hoc solution to the problem in the example is to replace the tiny coefficient in the leading matrix by zero, and then proceed as usual. But suppose ad hoc is not good enough. How can one then determine whether 3.4 · 10⁻¹⁶ is sufficiently tiny, or just looks tiny due to equation and variable scalings? What is the theoretical excuse for the replacement of small numbers by zeros? What assumptions have to be made?

The next example suggests that the ill-posedness may be possible to deal with. The assumptions made here are chosen theoretically insufficient on purpose; the point is that making even the simplest assumptions seems to solve the problem. The example also contains some very preliminary observations regarding how to scale the equations in order to make it possible to make decisions based on the absolute size of the perturbations.

**Example 1.2.** Having equations in the form

$$
E\, x'(t) + A\, x(t) \overset{!}{=} 0
$$

modelling a two-timescale system (see section 2.5) where the slow dynamics is known to be stable, we now decide that unstable fast dynamics is unreasonable for the system at hand. In terms of assumptions, we assume that the fast dynamics of the system is stable. We then generate random perturbations in the equation coefficients that we need to replace by zero, discard any instances of the equations that disagree with our assumption, and use standard software to solve the remaining instances. Two trailing matrices are used, given by selecting δ from { 1, 10⁻² } in the pattern

$$
A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \coloneqq
\begin{bmatrix} 0.29 & 0.17 & 0.046 \\ 0.34\,\delta & 0.66\,\delta & 0.66 \\ 0.87\,\delta & 0.83\,\delta & 0.14 \end{bmatrix}
$$

and then scaling the last two rows so they get the same norm as the first row.
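The trailing-matrix pattern and row scaling just described can be sketched as follows, together with the perturbation sampling and stability screening used in the rest of the example. Function names are my own, and interpreting max(·) as the largest entry magnitude is an assumption on the thesis notation:

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, only for repeatability of the sketch

def make_A(delta):
    # Trailing matrix pattern from example 1.2, with the last two rows scaled
    # to the same norm as the first row.
    A = np.array([[0.29,         0.17,         0.046],
                  [0.34 * delta, 0.66 * delta, 0.66],
                  [0.87 * delta, 0.83 * delta, 0.14]])
    r0 = np.linalg.norm(A[0])
    for i in (1, 2):
        A[i] *= r0 / np.linalg.norm(A[i])
    return A

def sample_E(m):
    # E22 := (m / max(T)) T with T uniform on [-1, 1]^(2x2); here max(.) is
    # taken as the largest entry magnitude (my assumption).
    T = rng.uniform(-1.0, 1.0, size=(2, 2))
    E = np.zeros((3, 3))
    E[0, :] = 1.0                      # the certain, well-conditioned part
    E[1:, 1:] = m * T / np.max(np.abs(T))
    return E

def stable_fraction(delta, m, n_samples=50):
    # Fraction of sampled perturbations for which all modes of
    # E x'(t) + A x(t) = 0 (i.e. eigenvalues of -inv(E) A) are stable;
    # the unstable instances are the ones the example discards.
    A = make_A(delta)
    n_stable = sum(
        np.all(np.real(np.linalg.eigvals(-np.linalg.solve(sample_E(m), A))) < 0)
        for _ in range(n_samples)
    )
    return n_stable / n_samples

for m in (1e-1, 1e-3, 1e-5):
    print(f"delta = 1, m = {m:g}: stable fraction = {stable_fraction(1.0, m)}")
```

The screening step mirrors the example's logic: sample, test the stability assumption, and keep only the instances that satisfy it.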
In the leading matrix,

$$
E = \begin{bmatrix} E_{11} & E_{12} \\ 0 & E_{22} \end{bmatrix} \coloneqq
\begin{bmatrix} 1. & 1. & 1. \\ 0 & * & * \\ 0 & * & * \end{bmatrix}
$$

it is the block $E_{22}$ (the entries marked $*$) that will be instantiated with small random perturbations. As in the previous example, the form of $E$ is just a well-conditioned change of variables away from the lti form of (1.1). In order to illustrate what happens when the perturbations become smaller, the perturbations are generated such that $\max( E_{22} ) = m$, for a few values of $m$. To achieve this, an intermediate matrix $T$ of the same dimensions as $E_{22}$ is generated by sampling each entry from a uniform distribution over $[\,-1, 1\,]$, and then

$$E_{22} \coloneqq \frac{m}{\max( T )}\, T.$$

The example is chosen such that $m = 0$ yields a stable slow system. Thus the perturbations of interest are those that make all modes of the stiff system stable. The initial conditions are chosen with $x_1(0) = 1$ and consistent with $m = 0$.

Figure 1.1: Solutions for $x_1$ obtained by generating 50 random perturbations of given magnitudes. Details are given in the text. Left: $A$ defined by $\delta = 1$. Right: $A$ defined by $\delta = 10^{-2}$. Top: $m = 1.\cdot 10^{-1}$. Middle: $m = 1.\cdot 10^{-3}$. Bottom: $m = 1.\cdot 10^{-5}$.

Simulation results are shown in figure 1.1. By choosing a threshold for $m$ based on visual appearance, the threshold can be related to $\delta$. Finding that $1.\cdot 10^{-3}$ and $1.\cdot 10^{-5}$ could be reasonable thresholds for $\delta$ being $1$ and $10^{-2}$, respectively, it is tempting to conclude that it would be wise to base the scaling of the last two rows on $A_{22}$ alone.

1.2.2 Application to quasilinear shuffling

In theory, index reduction of equations in the quasilinear form

$$E( x(t), t )\, x'(t) + A( x(t), t ) \overset{!}{=} 0 \tag{1.3}$$

is simple.
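Returning to example 1.2 for a moment, its sampling scheme can be sketched as follows (our own reconstruction, not the author's code; the acceptance test via the eigenvalues of $-E^{-1}A$ and the scaling of $E_{22}$ by the largest entry magnitude are assumptions of the sketch):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_instance(delta, m):
    A = np.array([[0.29, 0.17, 0.046],
                  [0.34, 0.66 * delta, 0.66 * delta],
                  [0.87, 0.83 * delta, 0.14 * delta]])
    # Scale the last two rows to the same norm as the first row.
    for i in (1, 2):
        A[i] *= np.linalg.norm(A[0]) / np.linalg.norm(A[i])
    # Perturb E22 with uniform noise of magnitude m.
    T = rng.uniform(-1.0, 1.0, size=(2, 2))
    E = np.zeros((3, 3))
    E[0] = 1.0
    E[1:, 1:] = (m / np.abs(T).max()) * T
    return E, A

def stable(E, A):
    # All modes stable: eigenvalues of -E^{-1} A in the open left half plane.
    return bool(np.all(np.linalg.eigvals(-np.linalg.solve(E, A)).real < 0))

instances = [sample_instance(delta=1e-2, m=1e-5) for _ in range(50)]
kept = [(E, A) for E, A in instances if stable(E, A)]
print(f"{len(kept)} of {len(instances)} sampled instances satisfy the assumption")
```

Only the instances in `kept` would then be handed to a standard stiff solver.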
Similar to how the linear time-invariant equations were analyzed in example 1.1, the equations are manipulated using invertible row operations so that the leading matrix becomes separated into one block which is completely zeroed, and one block with linearly independent rows. The discovered non-differential equations are then differentiated, and the procedure is repeated until the leading matrix has full rank. As examples of why the qualifier in theory is needed in this description, consider the following list:

• It may be difficult to perform the row reduction in a numerically well-conditioned way.
• The produced equations may involve very big expressions.
• Testing whether an expression is zero is highly non-trivial.

The forthcoming discussion applies to the last of these ramifications. Typical examples in the literature have leading matrices whose rank is determined solely by a zero-pattern. For instance, if some variable does not appear differentiated in any equation, the corresponding column of the leading matrix will be structurally zero. It is then easy to see that this column will remain zero after arbitrarily complex row operations, so if the operations are chosen to create structural zeros in the other columns at some row, it will follow that the whole row is structurally zero. Thus a non-differential equation is revealed, and when differentiating this equation, the presence of variables in the equation determines the structural zero-pattern of the newly created row in the leading matrix, and so the index reduction may be continued.

Now, recall how the zero-pattern was lost by a seemingly harmless transform of the equations in example 1.1. Another situation when linear dependence between rows in the leading matrix is not visible in a zero-pattern is when a user happens to write down equations that are dependent up to available accuracy.
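One step of the shuffle procedure for an lti dae $E\, x'(t) + A\, x(t) \overset{!}{=} 0$ can be sketched as follows (our own illustration; the tolerance-based zero test used here is precisely the kind of decision whose justification is a topic of this thesis):

```python
import numpy as np

def shuffle_step(E, A, tol=1e-10):
    """Row-reduce E with orthogonal row operations; rows of E that are zero
    up to tol hold non-differential equations, which are differentiated.
    For an lti dae, differentiating A_i x = 0 gives A_i x' = 0, i.e. the
    corresponding row of A moves into E and is zeroed in A."""
    Q, R = np.linalg.qr(E)        # Q.T @ E = R: an invertible change of equations
    E1, A1 = R.copy(), Q.T @ A
    for i in range(E1.shape[0]):
        if np.abs(E1[i]).max() < tol:           # numerical zero test
            E1[i], A1[i] = A1[i].copy(), 0.0    # differentiate the equation
    return E1, A1

# The matrices of example 1.1: one step reveals an underlying implicit ode.
E = np.array([[1.0, 2.1, 4.9], [1.0, 0.9, 2.1], [0.0, 0.0, 0.0]])
A = np.array([[3.0, 2.0, 0.0], [2.0, 1.0, 0.0], [0.0, 1.0, -1.0]])
E1, A1 = shuffle_step(E, A)
print(np.linalg.matrix_rank(E1))  # 3: the leading matrix now has full rank
```

With the exact zero row of this example, the test is unproblematic; the difficulty discussed next arises when rows are only zero up to available accuracy.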
It must be emphasized here that available accuracy is often not a mere question of floating point number representation in numerical hardware (as in our example), but a consequence of uncertainties in estimated model parameters. In chapter 3, it is proposed that a numerical approach be taken to zero-testing whenever tracking of structural zeros does not hold the answer; an expression is then taken to be (rewritable to) zero if it evaluates to zero at some trial point. Clearly, a tolerance will have to be used in this test, and showing that a meaningful threshold even exists is one of the main topics of the thesis. When there are many entries in the leading matrix which need numerical evaluation at the same time, well-conditioned operations on the equations (row operations and a change of variables) lead to the form (1.1) where $E$ contains all the small expressions, and generally depends on both $x(t)$ and $t$.

1.2.3 A missing piece in singular perturbation of ODE

Our final attempt to convince the reader that our topic is interesting is to remark that matrix-valued singular perturbations are not only a delicate problem in the world of dae. These problems are also interesting in their own right when the leading matrix of a dae (or $E$ in (1.1)) is known to be non-singular, so that the dae is really just an implicit ode. Then the matrix-valued singular perturbation problem is a natural generalization of existing singular perturbation problems for ode (see section 2.5). In the language of the thesis, we say that these equations are dae of pointwise index 0, and most of the singular perturbation results in the thesis will actually be restricted to this type of equations.

1.2.4 How to approach the nominal equations

The perturbation results in the thesis are often formulated using $O( \max( E ) )$ (where $E$ is the matrix-valued perturbation, not the complete leading matrix) or formulated to apply as $m \rightarrow 0$, where $m$ is a bound on all uncertainties in the equations.
It is natural to ask what the practical implications of such bounds really are, and there are several answers.

In some situations, the uncertainties in the equations will be given, without any possibility of reducing them. In that case, our results make it possible to test whether the uncertainties are small enough for the perturbation analysis to apply, and if they are, there will be a bound on the uncertainty in the solutions (or whatever other property the result at hand concerns). On the other hand, there is always the alternative to artificially increase the uncertainty in the model. Increasing the uncertainty of a near-zero interval with large relative uncertainty, so that it includes zero, may sometimes be interpreted as model reduction, where very stiff equations are approximated by non-stiff equations.

In other situations, there may be a possibility to reduce the size of the uncertainties, typically at the cost of spending more resources of some kind. Then, our results may be interpreted as saying that by spending enough resources, such and such a property can be obtained. Examples of how spending more resources may reduce uncertainty are given below.

• If the equations contain parameters that are the result of a system identification procedure, uncertainty can often be reduced by using more estimation data or by investing in more accurate sensors.
• If the uncertainty in the equations is due to uncertainties in floating point arithmetic, a multi-precision library for floating point arithmetic (such as GMP (2009)) may be used to reduce uncertainty.
• In a time-varying system (and hopefully, in the future, non-linear systems), where the matrix-valued singular perturbation problem arises when the leading matrix loses rank at some point, the size of the matrix-valued perturbation can be reduced by integrating the increasingly stiff lower-index equations closer to the point where the rank drops. Due to the increasing stiffness, it will be computationally costly to integrate the lower-index equations with adequate precision.

1.2.5 Final remarks

There is also another application of results on matrix-valued singular perturbation, more closely related to the field of automatic control. This concerns the use of unstructured dae as models of dynamic systems; not until well-posedness of solutions to such equations has been established does it make sense to consider problems such as system identification or control for such models.

It should also be mentioned that we are not aware of any strong connection between physical models from any field and matrix-valued singular perturbation. Electrical systems, for instance, have scalar quantities, and the singular perturbation problems one encounters are generally of multiparameter type. A natural place to search for matrix-valued singular perturbations would be inertia matrices of rod-like objects. However, these matrices are normal, and a norm-preserving linear (although typically uncertain) change of variables can be used to make the inertia matrix diagonal. Hence, only if one is unwilling to make use of the uncertain change of variables, and ignores the normality constraint which is satisfied by all inertia matrices, will the matrix-valued singular perturbation have to be dealt with in its full generality. Furthermore, even if a single rod were handled as a matrix-valued singular perturbation, the dimension of the uncertain subsystem is just 1, so scalar singular perturbation techniques would apply.
The rotation of point-like objects, on the other hand, does not have a non-trivial nominal solution, making these objects unsuitable for demonstrations as well. While we are not aware of any carefully developed physical model, in any field, that would require the use of matrix-valued singular perturbation theory, we are aware that many models are not developed carefully enough to avoid matrix-valued singular perturbation problems. Matrix-valued singular perturbation theory needs to be developed to the rescue of these models, as well as of all algorithms and software which systematically produce such problems.

1.3 Problem formulation

The long term goal of the work in this thesis is a better understanding of uncertainty in differential-algebraic equations used as models in automatic control and related fields. While we were originally concerned with dae in the quasilinear form (1.3), the questions arising regarding uncertainty in this form turned out to be unanswered also for the much more restricted lti dae. In order to understand the solutions of a dae, one generally applies some kind of index reduction scheme involving differentiation of the equations with respect to time. One of the more recent approaches to the analysis of general nonlinear dae centers around the strangeness index, and one of the problems considered in the thesis is that a better understanding of this analysis is needed in order to even see the structure of the associated perturbation problems arising from the uncertainty in the equations. The main problem addressed in the thesis is related to the less sophisticated index reduction schemes associated with the differentiation index and the shuffle algorithm.
Here the perturbation problems turn out to be readily transformable into the matrix-valued singular perturbation form, and we ask how these problems can be approached, what qualitative properties they possess, and how the relation between uncertainty in the equations and uncertainty in the solutions may be quantified. Another problem considered in the thesis, related to differential-algebraic equations used as models in automatic control, is how to develop geometrically sound algorithms with manifold-valued variables. (More generally, this suggests that matrix-valued singular perturbation can be avoided in rigid body mechanics as long as the kinetic energy is a quadratic form in the time derivative of the generalized coordinates.)

1.4 Contributions

The main contributions in this thesis are, in approximate order of appearance:

• Introduction of the so-called matrix-valued singular perturbation problem as a natural generalization of existing singular perturbation problem classes, with applications to uncertainty and approximation in differential-algebraic equations.
• An application related to modeling with differential-algebraic equations: point-mass filtering on manifolds.
• The proposed simplified strangeness index, along with basic properties and its relation to the closely related strangeness index.
• Extension of previous perturbation results for linear time-invariant differential-algebraic equations of nominal index 1, introducing assumptions about eigenvalues as the main tool to obtain convergence.
• A canonical form for uncertain matrix pairs of nominal index 2.
• Generalizations of some of the linear time-invariant perturbation results from nominal index 1 to nominal index 2.
• Perturbation results for linear time-varying differential-algebraic equations of nominal index 1.

1.5 Thesis outline

The thesis is divided into two parts: theoretical background (first) and new results (second).
Some notation is explained in the next section, completing the first chapter of the first part. Most readers will probably find it worthwhile to skim through that section before proceeding to later chapters. The theoretical background of the thesis is, with very few exceptions, given in chapter 2. When exceptions are made in the second part of the thesis, this will be made clear, so that there is no risk of confusion with new results. Chapter 3 contains material from the author’s licentiate thesis Tidefelt (2007), and is included in the present thesis mainly to show the connection between nonlinear systems and the matrix-valued singular perturbation results for linear systems in the second part of the thesis. Readers interested in index reduction of quasilinear dae may find some of the ideas in the chapter interesting, but the chapter is placed in the first part of the thesis since the seminumerical schemes it proposes will remain incomplete until the related singular perturbation problems are better understood. Other readers may safely skip chapter 3.

Turning to the second part, the first two chapters are only loosely related to the title of the thesis. Chapter 4 presents a state estimation technique with potential application to systems described by differential-algebraic equations. Then, chapter 5 proposes a new index concept which is closely related to the strangeness index; but unlike the index reduction scheme of chapter 3, the structure of the perturbation problems associated with the strangeness-like indices is not yet analyzed. Hence, it is not clear whether the results on matrix-valued singular perturbation in the following three chapters will find applications in solution techniques related to the strangeness index. Chapter 6 extends the early results on matrix-valued singular perturbation that appeared in Tidefelt (2007). These results apply to lti dae of nominal index 1, and lti dae of nominal index 2 are considered in chapter 7.
In chapter 8, some of the nominal index 1 results are extended to time-varying equations. Chapter 9 contains conclusions and directions for future research.

1.6 Notation

The present section introduces basic terminology and notation used throughout the thesis. Not all the terminology is defined here, though. Abbreviations and some symbols are defined in the tables on page xv, and the reader will find references to all definitions (including those given here) in the subject index at the end of the thesis, after the bibliography.

1.6.1 Mathematical notation

The terms and factors of sums and products over index sets have unlimited extent to the right. For example, $\left( \prod_i |\lambda_i| \right) + 1 \neq \prod_i |\lambda_i| + 1 = \prod_i ( |\lambda_i| + 1 )$.

If $\alpha$ is a scalar, $\Sigma$ is a set of scalars, and $\sim$ is a relation between scalars, then $\alpha \sim \Sigma$ (or $\Sigma \sim \alpha$) means $\forall\, \sigma \in \Sigma : \alpha \sim \sigma$ (or $\forall\, \sigma \in \Sigma : \sigma \sim \alpha$). For instance, $\operatorname{Re} \lambda( X ) < 0$ means that all eigenvalues of $X$ have negative real parts. In the example, we also used that functions automatically map over sets (in Mathematica, the function is said to thread over its argument) if there is no ambiguity.

The symbol $\overset{!}{=}$ is used to indicate an equality that shall be thought of as an equation. Compare this to the plain $=$, which is used to indicate that expressions are equal in the sense that one can be rewritten as the other, possibly using context-dependent assumptions. For example, assuming $x \ge 0$, we may write $\sqrt{x^2} = x$.

The symbol $\coloneqq$ is used to introduce names for values or expressions. The meaning of expressions can be defined using the symbol $\overset{\Delta}{=}$. Note that the difference between $f \coloneqq x \mapsto x^2$ and $f( x ) \overset{\Delta}{=} x^2$ is mainly conceptual; in many contexts both would work equally well.

If $x$ is a function of one variable (typically thought of as time), the derivative of $x$ with respect to its only argument is written $x'$. The composed symbol $\dot{x}$ shall be used to denote a function which is independent of $x$, but intended to coincide with $x'$.
For example, in numeric integration of $x'' = u$, where $u$ is a forcing function, we write the ordinary differential equation as

$$\begin{cases} \dot{x}' = u \\ x' = \dot{x} \end{cases}$$

Higher order derivatives are denoted $x''$, $x'^{(3)}$, …, or $\ddot{x}$, $\dot{x}^{(3)}$, …. When the highest order of dots, say $\dot{x}^{(\nu+1)}$, is determined by context, $\dot{x}^{(i+)}$ is a shorthand for the sequence or concatenation of $\dot{x}^{(i)}, \dots, \dot{x}^{(\nu+1)}$. Conversely, the sequence or concatenation of $x, x', \dots, x'^{(i)}$ is denoted $x'^{\{i\}}$, and we define $\dot{x}^{\{i\}}$ analogously. Making the distinction between $x'$ and $\dot{x}$ this way — and not the other way around — is partly for consistency with the syntax of the Mathematica language, in which our algorithms are implemented.

Gradients (Jacobians) are written using the operator $\nabla$. For example, $\nabla f$ is the gradient (Jacobian) of $f$, assuming $f$ takes one vector-valued argument. If a function takes several arguments, a subscript on the operator is used to denote with respect to which argument the gradient is computed. For example, if $f$ is a function of 3 arguments, then

$$\nabla_2 f = ( x, y, z ) \mapsto \nabla\big( w \mapsto f( x, w, z ) \big)( y )$$

The notation $\nabla_{i+}$ is used to denote concatenated gradients with respect to all arguments starting from number $i$. For example, with $f$ as before,

$$\nabla_{2+} f = ( x, y, z ) \mapsto \begin{bmatrix} \nabla\big( w \mapsto f( x, w, z ) \big)( y ) & \nabla\big( w \mapsto f( x, y, w ) \big)( z ) \end{bmatrix}$$

Bullet notation is used for compact notation of functions of one unnamed argument. The expression which becomes the “body” of the function is the smallest complete expression containing the bullet. For example, let the first argument of $f$ be real; then

$$f( \bullet, y, z )'( x ) = \nabla_1 f( x, y, z )$$

For a time series $( x_n )_n$, the forward shift operator $q$ is defined as $q x_n \overset{\Delta}{=} x_{n+1}$.

Matrices are constructed within square brackets. Vectors are constructed by vertical alignment within parentheses. A single row of a matrix, thought of as an object with only one index, is constructed by horizontal alignment within parentheses.
If a column of a matrix is thought of as having only one index, it is constructed using the same notation as a vector. There is no distinction in notation between contravariant and covariant vectors. (Square brackets are, however, also sometimes used in the same way as parentheses are used to indicate grouping in equations.) Square brackets and parentheses are also used with two real scalars separated by a comma to denote intervals of real numbers; see the notation table on page xv. Tuples (lists) are also denoted using parentheses with elements separated by commas, but it will be clear from the context when $( a, b )$ is an open interval and when it is a two-tuple.

If the variable $n$ has no other meaning in the current context, but there is a square matrix that can be associated with the current context, then $n$ denotes the dimension of this matrix.

1.6.2 DAE and ODE terminology

In accordance with most literature on this subject, equations not involving differentiated variables will often be denoted algebraic equations, although non-differential equations — a better notation from a mathematical point of view — will also be used interchangeably. The quasilinear form of dae has already been introduced, and is repeated here:

$$E( x(t), t )\, x'(t) + A( x(t), t ) \overset{!}{=} 0 \tag{1.3}$$

The matrix-valued function $E$ which determines the coefficients for the differentiated variables, as well as the expression $E( x(t), t )$, will be referred to as the leading matrix. This terminology is also used for the important subtypes of quasilinear dae that are the linear dae, see below. The function $A$ as well as the expression $A( x(t), t )$ will be referred to as the algebraic term. This terminology will only be used when the algebraic term is not affine in $x(t)$, for otherwise the terminology of linear dae is more precise.

This brings us to the linear dae. An autonomous lti dae has the form

$$E\, x'(t) + A\, x(t) \overset{!}{=} 0 \tag{1.4}$$

where $E$ and $A$ are constant matrices.
By autonomous, we mean that there is no way external inputs can enter this equation, so the system evolves in a way completely determined by its initial conditions. Adding a forcing function (often representing external inputs) while maintaining the lti property leads to the general lti dae form

$$E\, x'(t) + A\, x(t) + B\, u(t) \overset{!}{=} 0 \tag{1.5}$$

where $u$ is a vector-valued function representing external inputs to the model, and $B$ is a constant matrix. In the linear dae (1.5) and (1.4), the matrix $A$ of coefficients for the non-differentiated variables is denoted the trailing matrix.† It may be a function of time, if the linear dae is time-varying.

Footnote: Seeking a notation which is both short and not misleading, the author would prefer static equations, but this notation is avoided to make the text more accessible.
Footnote: By this definition, the algebraic term with reversed sign is sometimes referred to as the right hand side of the quasilinear dae.
Footnote: The solution will be linear in initial conditions (regardless of the initial time) if the forcing function is zero, and linear in the forcing function (regardless of the initial time, if the forcing function is suitably time-shifted) if the initial conditions are zero.
Footnote: In the terminology of quasilinear dae, the expression $A\, x(t) + B\, u(t)$ would constitute the algebraic term here. However, it is affine in $x(t)$, so we prefer to use the more specific terminology of linear dae.
† This terminology may seem in analogy with the term leading matrix. However, the reason why the leading matrix has received its name is unknown to the author, and trailing matrix was invented for the thesis to avoid ambiguity with the state feedback matrix, so there is no common source of analogy. Rather, the term trailing matrix appears natural in view of the leading matrix being the matrix which is listed first in a matrix pair, see section 2.2.6.

To complement the terminology that has been introduced for dae, we shall introduce some corresponding terminology for ode. In the nonlinear ode (sometimes written with “$=$” instead of “$\overset{!}{=}$” to stress that the differentiated variables are trivial to solve for)

$$x'(t) \overset{!}{=} f( x(t), t ) \tag{1.6}$$

the function $f$ as well as the expression $f( x(t), t )$ are called the right hand side of the ode. If $f$ is affine in its first argument, that is,

$$f( x, t ) = M( t )\, x + b( t )$$

the matrix $M$ as well as the expression $M( t )$ are called the state feedback matrix. When an ode or a dae has a term which only depends on time, such as $b(t)$ here, this term will be denoted the forcing term of the equation. Often, the forcing term is in the form $b(t) = \beta( u(t) )$, where the function $\beta$ is considered fixed, while $u$ is considered an unknown external input to the equation. The function $u$ is then denoted the forcing function or input to the equation. If $b(t)$ is linear in $u(t)$, that is, $b(t) = B(t)\, u(t)$ where $B(t)$ is a matrix, then this matrix as well as $B$ are called the input matrix of the equation. In case of ode, this leads to the ltv ode form

$$x'(t) \overset{!}{=} M(t)\, x(t) + B(t)\, u(t) \tag{1.7}$$

and the lti ode form

$$x'(t) \overset{!}{=} M\, x(t) + B\, u(t) \tag{1.8}$$

When $f$ does not depend on its second argument, the ode is said to be time-invariant. The autonomous counterparts of (1.7) and (1.8) are hence obtained by setting $u \coloneqq 0$.

1.3 Example
As an example of the notation, note that if $E$ in (1.5) is non-singular, then there is a corresponding ode in $x$ with state feedback matrix $-E^{-1} A$. Since the term input matrix is used both for dae and ode, care must be taken when using the term in a context where a system is represented both as a dae and an ode; the input matrix of the ode here would be $-E^{-1} B$, while the input matrix of the dae (1.5) is $B$.

For dae, the autonomous ltv form is

$$E( t )\, x'(t) + A( t )\, x(t) \overset{!}{=} 0 \tag{1.9}$$

and the general ltv dae form with forcing function is
$$E( t )\, x'(t) + A( t )\, x(t) + B( t )\, u(t) \overset{!}{=} 0 \tag{1.10}$$

Footnote: This notation is borrowed from Kailath (1980). We hereby avoid the perhaps more commonly used notation system matrix, because of the other — yet related — meanings this term also bears.
Footnote: While this terminology is widely used in the automatic control community, mathematicians tend to denote the ode autonomous rather than time-invariant. Our use of autonomous indicates the absence of a forcing term in the equations, and is only used with equation forms where there is a natural counterpart with forcing function. Note that our definition of autonomous linear time-varying ode is not autonomous in the sense often used by mathematicians. For linear time-invariant ode, however, the two uses of autonomous are compatible.

While the solution $x$ to an ode is referred to as the state vector or just the state of the ode, the elements of the solution $x$ to a dae are referred to as the variables of the dae. A dae is denoted square if the numbers of equations and variables match. When a set of equations characterizing the solution manifold has been derived, these are sometimes completed with differential equations so that a square dae of strangeness index 0 is obtained. This dae will then be referred to as the reduced equation. By an initial value problem we refer to the problem of computing trajectories of the variables of a dae (or ode) over an interval $[\, t_0, t_1 \,]$, given sufficient information about the variables and their derivatives at time $t_0$.

2 Theoretical background

The intended audience of this thesis is not expected to have prior experience with both automatic control and differential-algebraic equations. For those without background in automatic control, we start the chapter in section 2.1 by providing general motivation for why we study equations, and dae in particular.
For those with background in automatic control, but with only very limited experience with dae, we try to fill that gap in section 2.2. The remaining sections of the chapter present other theoretical background material that will be used in later chapters. To keep it clear what the contributions of the thesis are, there are just a few exceptions (most notably in chapter 5, as is explained in the introduction to that chapter) to the rule that existing results used in the second part of the thesis are presented here.

2.1 Models in automatic control

Automatic control tasks are often solved by engineers without explicit mathematical models of the controlled or estimated object. For instance, a simple low pass filter may be used to get rid of measurement noise on the signal from a sensor, and this can work well even without saying Assume that the correct measurement is distorted by zero mean additive high frequency noise. Saying that phrase out loud would express the use of a simple model of the sensor (whether it could be called mathematical is a matter of taste). As another example, many processes in industry are controlled by a so-called pid controller, which has a small number of parameters that can be tuned to obtain good performance. Often, these parameters are set manually by a person with experience of how they relate to production performance, and this can be done without awareness of mathematical models. Most advances in control and estimation theory do, however, build on the assumption that a more or less accurate mathematical model of the object is available, and how such models may be used, simplified, and tuned for good numerical properties is the subject of this section.

2.1.1 Examples

The model of the sensor above was only expressed in words. Our first example of a mathematical model will be to say the same thing with equations.
Since equations are typically more precise than words, we will lose some of the generality, a price we are often willing to pay to get to the equations which we need in order to apply our favorite methods for estimation and/or control. Denote, at time $t$, the measurement by $y(t)$, the true value by $x(t)$, and let $e$ be a white noise source with variance $\sigma^2$. Let $v(t)$ be an internal variable of our model:

$$y(t) \overset{!}{=} x(t) + v(t) \tag{2.1a}$$
$$v(t) + v'(t) \overset{!}{=} e'(t) \tag{2.1b}$$

A drawback of using a precise model like this is that our methods may depend too heavily on this being the correct model; we need to be aware of how sensitive our methods are to errors in the mathematical model. Imagine, for instance, that we build a device that can remove disturbances at 50 Hz caused by the electric power supply. If this device is too good at this, it will be useless if we move to a country where the alternating current frequency is 60 Hz, and will even destroy information of good quality at 50 Hz. The model (2.1) is often written more conveniently in the Laplace transform domain, which is possible since the differential equations are linear:

$$Y(s) \overset{!}{=} X(s) + V(s) \tag{2.2a}$$
$$V(s) \overset{!}{=} \frac{s}{1+s}\, E(s) \tag{2.2b}$$

Here, $s / ( 1 + s )$ is often referred to as a filter; the white noise is turned into high frequency noise by sending it through the filter.

As a second example of a mathematical model we consider a laboratory process often used in basic courses in automatic control. The process consists of a cylindrical water tank with a drain at the bottom. Water can be pumped from a reservoir to the tank, and the drain leads water back to the reservoir. There is also a gauge that senses the level of water in the tank. The task for the student is to control the level of water in the tank, and what makes the task interesting is that the flow of water through the drain varies with the level of water; the larger the level of water, the higher the flow.
Limited performance can be achieved using, for instance, a manually tuned pid controller, but to get good performance at different desired levels of water, a model-based controller is the natural choice. Let $x$ denote the level of water, and $u$ the flow we demand from the pump. A common approximation is that the flow through the drain is proportional to the square root of the level of water. Denote the corresponding constant $c_d$, and let the constant relating the flow of water to the time derivative of $x$ (that is, this constant is the inverse of the bottom area of the tank) be denoted $c_a$. Then we get the following mathematical model with two parameters to be determined from some kind of experiment:

$$x'(t) = c_a \left( u(t) - c_d \sqrt{x(t)} \right) \tag{2.3}$$

The constant $c_a$ could be determined by plugging the drain, adding a known volume of water to the tank, and measuring the resulting level. The other constant can also be determined from simple experiments.

Footnote: White noise and how it is used in the example models is a non-trivial subject, but to read this chapter it should suffice to know that white noise is a concept which is often used as a building block of more sophisticated models of noise.

2.1.2 Use in estimation

The first model example above was introduced with a very easy estimation problem in mind. Let us instead consider the task of computing an accurate estimate of the level of water, given a sensor that is both noisy and slow. We will not go into details here, but just mention the basic idea of how the model can be used. Since the flow we demand from the pump, $u$, is something we choose, it is a known quantity in (2.3). Hence, if we were given a correct value of $x(0)$ and the model were correct, we could compute all future values of $x$ simply by integration of (2.3). However, our model will never be correct, so the estimate will only be good during a short period of time, before the estimate has drifted away from the true value.
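The integration of (2.3) mentioned above can be illustrated with a forward-Euler sketch (our own; the parameter values are invented for the illustration): with a constant demanded flow, the level settles where the inflow balances the drain flow.

```python
import numpy as np

c_a, c_d = 0.5, 2.0        # invented parameter values
u_const = 1.0              # constant demanded flow from the pump
x, h = 0.0, 1e-3           # start with an empty tank; Euler step size
for _ in range(200_000):   # simulate 200 time units
    # One Euler step of (2.3): x' = c_a (u - c_d sqrt(x)).
    x += h * c_a * (u_const - c_d * np.sqrt(max(x, 0.0)))

# Steady state of (2.3): u = c_d sqrt(x), i.e. x = (u / c_d)^2 = 0.25 here.
print(x)
```

Pure integration like this is exactly the estimate that drifts when the model is wrong, which is what the sensor is needed for.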
The errors in our model are not only due to the limited precision in the experiments used to determine the constants, but more importantly because the square root relation is a rather coarse approximation. In addition, it is unrealistic to assume that we get exactly the flow we want from the pump. This is where the sensor comes into play; even though it is slow and noisy, it is sufficient to take care of the drift. The best of both worlds can then be obtained by combining the simulation of (2.3) with use of the sensor in a clever way. A very popular method for this is the so-called extended Kalman filter (for instance, Jazwinski (1970, theorem 8.1)).

2.1.3 Use in control

Let us consider the laboratory process (2.3) again. The task was to control the level of water, and this time we assume that the errors in the measurements are negligible. There is a maximal flow, u_max, that can be obtained from the pump, and it is impossible to pump water backwards from the tank to the reservoir, so we shall demand a flow subject to the constraints 0 ≤ u(t) ≤ u_max. We denote the desired level of water the set point, symbolized by x_ref. The control law

    u(t) = { 0,       if x(t) ≥ x_ref(t)
           { u_max,   otherwise                                 (2.4)

will be optimal in theory (when changes in x_ref cannot be foreseen) in the sense that deviations from the set point are eliminated as quickly as possible. However, this type of control law will quickly wear out the pump, since it will be switching rapidly between off and full speed once the level gets to about the right level. Although still unrealistically naïve, at least the following control law somewhat reduces wear of the pump, at the price of allowing slow and bounded drift away from the set point.
It has three modes, called the drain mode, the fill mode, and the open-loop mode:

    Drain mode:     u(t) = 0
                    Switch to open-loop mode if x(t) < x_ref(t)

    Fill mode:      u(t) = u_max
                    Switch to open-loop mode if x(t) > x_ref(t)
                                                                (2.5)
    Open-loop mode: u(t) = c_d √( x_ref(t) )
                    Switch to drain mode if x(t) > ( 1 + δ ) x_ref(t)
                    Switch to fill mode if x(t) < ( 1 − δ ) x_ref(t)

where the parameter δ is a small parameter chosen by considering the trade-off between performance and wear of the pump. In the open-loop mode, the flow demanded from the pump is chosen to match the flow through the drain to the best of our knowledge. Note that if δ is sufficiently large, errors in the model will make the level of water settle at the wrong level; to each fixed flow there is a corresponding level where the level will settle, and errors in the model will make c_d √( x_ref(t) ) correspond to something slightly different from x_ref(t). More sophisticated controllers can remedy this.

2.1.4 Model classes

When developing theory, be it system identification, estimation or control, one has to specify the structure of the models to work with. We shall use the term model class to denote a set of models which can be easily characterized. A model class is thus a rather vague term; an example is a linear system with white noise on the measurements. Depending on the number of states in the linear system, and on how the linear system is parameterized, various model structures are obtained. When developing theory, a parameter such as the number of states is typically represented by a symbol in the calculations — this way, several model structures can be treated in parallel, and it is often possible to draw conclusions regarding how such a parameter affects some performance measure. In the language of system identification, one would thus say that theory is developed for a parameterized family of model structures.
Since such a family is a model class, we will often have such a family in mind when speaking of model classes. The concepts of models, model sets, and model structures are rigorously defined in the standard reference Ljung (1999, section 4.5) on system identification, but we shall allow these concepts to be used in a broader sense here. In system identification, the choice of model class affects the ability to approximate the true process as well as how efficiently or accurately the parameters of the model may be determined. In estimation and control, applicability of the results is related to how likely it is that a user will choose to work with the treated model structure, in light of the power of the results; a user may be willing to identify a model from a given class if that will enable the user to use a more powerful method. The choice of model class will also allow varying amounts of elaboration of the theory; a model class with much structural information will generally allow a more precise analysis, at the cost of increased complexity, both in terms of theory and implementation of the results. Before we turn to some examples of model classes, it should be mentioned that models often describe a system in discrete time. However, this thesis is predominantly concerned with continuous-time models, so the examples will all be of this kind. Continuing on our first example of a model class, in the sense of a parameterized family of model structures, it could be described as all systems in the linear state space form

    x'(t) = A x(t) + B u(t)
    y(t) = C x(t) + D u(t) + v(t)                               (2.6)

where u is the vector of system inputs, y the vector of measured outputs, v is a vector with white noise, and x is a finite-dimensional vector of states. For a given number of states, n, a model is obtained by instantiating the matrices A, B, C, and D with numerical values.
It turns out that the class (2.6) is over-parameterized in the sense that it contains many equivalent models. If the system has just one input and one output, it is well-known that it can be described by 2n + 1 parameters, and it is possible to restrict the structure of the matrices such that they only contain this number of unknown parameters without restricting the possible input–output relations. Our second and final example of a model class is obtained by allowing more freedom in the dynamics than in (2.6), while removing the part of the model that relates the system output to its states. In a model of this type, all states are considered outputs:

    x'(t) = A( x(t) ) + B u(t)                                  (2.7)

Here, we might pose various types of constraints on the function A. For instance, assuming Lipschitz continuity is very natural since it ensures that the model uniquely defines the trajectory of x as a function of u and initial conditions. Another interesting choice for A is the polynomials, and if the degree is at most 2 one obtains a small but natural extension of the linear case. Another important way of extending the model class (2.6) is to look into how the system inputs u are allowed to enter the dynamics.

2.1.5 Model reduction

Sophisticated methods in estimation and control may result in very computationally expensive implementations when applied to large models. By large models, we generally refer to models with many states. For this reason, methods and theory for approximating large models by smaller ones have emerged. This approximation process is referred to as model reduction. Our interest in model reduction owes to its relation to index reduction (explained in section 2.2), a relation which may not be widely recognized, but one which this thesis tries to bring attention to. This section provides a small background on some available methods.
In view of the dae for which index reduction is considered in detail in later chapters, we shall only look at model reduction of lti systems here, and we assume that the large model is given in state space form as in (2.6). If the states of the model have physical meaning, it might be desirable to produce a smaller model where the set of states is a subset of the original set of states. It then becomes a question of which states to remove, and how to choose the system matrices A, B, C, and D for the smaller system. Let the states and matrices be partitioned such that x2 contains the states to be removed (this requires the states to be reordered if the states to be removed are not the last components of x), and denote the blocks of the partitioned matrices according to

    [ x1'(t) ]   [ A11  A12 ] [ x1(t) ]   [ B1 ]
    [ x2'(t) ] = [ A21  A22 ] [ x2(t) ] + [ B2 ] u(t)
                                                                (2.8)
    y(t) = [ C1  C2 ] [ x1(t) ]
                      [ x2(t) ] + D u(t) + v(t)

If x2 is selected to consist of states that are expected to be unimportant due to the small values those states take under typical operating conditions, one conceivable approximation is to set x2 = 0 in the model. This results in the truncated model

    x1'(t) = A11 x1(t) + B1 u(t)
    y(t) = C1 x1(t) + D u(t) + v(t)                             (2.9)

Although — at first glance — this might seem like a reasonable strategy for model reduction, it is generally hard to tell how the reduced model relates to the original model. Also, selecting which states to remove based on the size of the values they typically take is in fact a meaningless criterion, since any state can be made small by scaling; see section 2.1.6. Another approximation is obtained by formally replacing x2'(t) by 0 in (2.8). The underlying assumption is that the dynamics of the states x2 are very fast compared to x1. A necessary condition for this to make sense is that A22 be Hurwitz, which also makes it possible to solve for x2 in the obtained equation A21 x1(t) + A22 x2(t) + B2 u(t) = 0.
Inserting the solution in (2.8) results in the residualized model

    x1'(t) = ( A11 − A12 A22^{-1} A21 ) x1(t) + ( B1 − A12 A22^{-1} B2 ) u(t)
    y(t) = ( C1 − C2 A22^{-1} A21 ) x1(t) + ( D − C2 A22^{-1} B2 ) u(t) + v(t)      (2.10)

It can be shown that this model gives the same output as (2.8) for constant inputs u. If the states of the original model do not have interpretations that we are keen to preserve, the above two methods for model reduction can produce an infinite number of approximations if combined with a change of variables applied to the states; applying the change of variables x = T ξ to (2.6) results in

    ξ'(t) = T^{-1} A T ξ(t) + T^{-1} B u(t)
    y(t) = C T ξ(t) + D u(t) + v(t)                             (2.11)

and the approximations will be better or worse depending on the choice of T. Conversely, for certain choices of T, it will be possible to say more regarding how close the approximations are to the original model. If T is chosen to bring the matrix A in Jordan form, truncation is referred to as modal truncation, and residualization is then equivalent to singular perturbation approximation (see section 2.5). (Skogestad and Postlethwaite, 1996) The most well developed change of variables T is that which brings the system into balanced form. When performing truncation or residualization on a system in this form, the difference between the approximation and the original system can be expressed in terms of the system's Hankel singular values. We shall not go into details about what these values are, but the largest of them defines the Hankel norm of a system. Neither shall we give interpretations of this norm, but it turns out that it is actually possible to compute the reduced model of a given order which minimizes the Hankel norm of the difference between the original system and the approximation.
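The two reductions can be compared numerically. The following sketch uses random placeholder matrices and checks the fact stated above for (2.10): residualization preserves the response to constant inputs, that is, the static gain D − C A^{-1} B of the original model:

```python
import numpy as np

rng = np.random.default_rng(0)
n, n1, m, p = 4, 2, 1, 1
A = rng.standard_normal((n, n)) - 3.0 * np.eye(n)   # placeholder dynamics
B = rng.standard_normal((n, m))
C = rng.standard_normal((p, n))
D = np.zeros((p, m))

A11, A12 = A[:n1, :n1], A[:n1, n1:]
A21, A22 = A[n1:, :n1], A[n1:, n1:]
B1, B2 = B[:n1], B[n1:]
C1, C2 = C[:, :n1], C[:, n1:]

# Truncation (2.9): simply drop x2.
At, Bt, Ct, Dt = A11, B1, C1, D

# Residualization (2.10): solve A21 x1 + A22 x2 + B2 u = 0 for x2.
A22i = np.linalg.inv(A22)
Ar = A11 - A12 @ A22i @ A21
Br = B1 - A12 @ A22i @ B2
Cr = C1 - C2 @ A22i @ A21
Dr = D - C2 @ A22i @ B2

# Static gain (steady-state response to constant u) is preserved by
# residualization, but in general not by truncation.
dc_full = D - C @ np.linalg.inv(A) @ B
dc_res = Dr - Cr @ np.linalg.inv(Ar) @ Br
print(np.allclose(dc_full, dc_res))   # True
```

The equality follows by eliminating x2 from the steady-state equations of (2.8), which gives exactly the steady state of (2.10).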
By now we have seen that there are many ways to compute smaller approximations of a system, ranging from rather arbitrary choices to those which are clearly defined as minimizers of a coordinate-independent objective function. Some model reduction techniques have been extended to lti dae. (Stykel, 2004) However, although the main question in this thesis is closely related to model reduction, these techniques cannot readily be applied in our framework since we are interested in defending a given model reduction (this view should become clear in later chapters) rather than finding one with good properties.

2.1.6 Scaling

In section 2.1.5, we mentioned that model reduction of a system in state space form, (2.6), is a rather arbitrary process unless one thinks in terms of some suitable coordinate system for the state space. The first example of this was selecting which states to truncate based on the size of the values that the state attains under typical operating conditions, and here we do the simple maths behind that statement. Partition the states such that x2 is a single state which is to be scaled by the factor a. This results in

    [ x1'(t) ]   [ A11      (1/a) A12 ] [ x1(t) ]   [ B1   ]
    [ x2'(t) ] = [ a A21    A22       ] [ x2(t) ] + [ a B2 ] u(t)
                                                                (2.12)
    y(t) = [ C1  (1/a) C2 ] [ x1(t) ]
                            [ x2(t) ] + D u(t) + v(t)

(not writing out that also the initial conditions have to be scaled accordingly). Note that the scalar A22 on the diagonal does not change (if it did, that would change the trace of A, but the trace is known to be invariant under similarity transforms). In the index reduction procedure studied in later chapters, the situation is reversed: it is not a question of which states are small, but of which coefficients are small. The situation is even worse for lti dae than for the state space systems considered so far, since in a dae there is also the possibility to scale the equations independently of the states.
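The effect in (2.12) is easy to reproduce numerically; the matrices and the factor a below are arbitrary illustration values:

```python
import numpy as np

a = 1e-3                        # scaling factor for the single state x2
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[5.0], [6.0]])
C = np.array([[7.0, 8.0]])

T = np.diag([1.0, 1.0 / a])     # x = T xi, so the new second state is a*x2
As = np.linalg.inv(T) @ A @ T   # similarity transform as in (2.11)
Bs = np.linalg.inv(T) @ B
Cs = C @ T

print(As)   # [[1, 2/a], [3a, 4]]: off-diagonal blocks rescaled as in (2.12)
print(np.isclose(np.trace(As), np.trace(A)))   # True: trace is invariant
```

The transformed state takes values a factor a smaller, while the diagonal entry A22 and the input–output behavior are unchanged, which is why "smallness" of a state says nothing by itself.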
Again, it becomes obvious that this cannot be answered in a meaningful way unless the coordinate systems for the state space and the equation residuals are chosen suitably. Just like in model reduction, the user may be keen to preserve the interpretation of the model states, and may hence be reluctant to use methods that apply variable transforms to the states. However, unlike model reduction of ordinary differential equations, the dae may still be transformed by changing coordinates of the equation residuals. In fact, changing the coordinate system of the equation residuals is the very core of the index reduction algorithm. Pure scaling of the equation residuals is also an important part of the numerical method for integration of dae that will be introduced in section 2.2.8. There, scaling is important not because it facilitates analysis, but because it simply improves the numeric quality of the solution. To see how this works, we use the well-known (see, for instance, Golub and Van Loan (1996)) bound on the relative error in the solution to a linear system of equations A x = b, which basically says that the relative errors in A and b are propagated to x by a factor bounded by the (infinity norm) condition number of A. Now consider the linear system of equations in the variable qx (that is, x is given)

    [ (1/ε) E1 + A1 ]        [ (1/ε) E1 ]
    [      A2       ] qx  =  [     0    ] x                     (2.13)

where ε is a small but exactly known parameter. If we assume that the relative errors in E and A are of similar magnitudes, smallness of ε implies both that the matrix on the left hand side is ill-conditioned, and that the relative error of this matrix is approximately the same as the relative error in E1 alone. Scaling the upper row of the equations by ε will hence make the matrix on the left hand side better conditioned, while not making the relative error significantly larger. On the right hand side, scaling of the upper block by ε is the same as scaling all of the right hand side by ε, and hence the relative error does not change.
Hence, scaling by ε will give a smaller bound on the relative error in the solution. Although the scaling by ε was performed for the sake of numerics, it should be mentioned that, generally, the form (2.13) is only obtained after choosing a suitable coordinate system for the dae residuals. Another important situation we would like to mention — where scaling matters — is when gradient-based methods are used in numerical optimization. (Numerical optimization in one form or another is the basic tool for system identification.) Generally, the issue is how the space of optimization variables is explored, not so much the numerical errors in the evaluation of the objective function and its derivatives. It turns out that the success of the optimization algorithm depends directly on how the optimization variables (that is, the model parameters to be identified) are scaled. One of the important advantages of optimization schemes that also make use of the Hessian of the objective function is that they are unaffected by linear changes of variables.

2.2 Differential-algebraic equations

Differential-algebraic equations (generally written just dae) are a rather general kind of equations, suitable for describing systems which evolve over time. The advantage they offer over the more often used ordinary differential equations is that they are generally easier to formulate. The price paid is that they are more difficult to deal with. The first topic of the background we give in this section is to try to clarify why dae can be a convenient way of modeling systems in automatic control. After looking at some common forms of dae, we then turn to the basic elements of analysis and solution of dae. Finally, we mention some existing software tools. For recent results on how to carry out applied tasks such as system identification and estimation for dae models, see Gerdin (2006), or for optimal control, see Sjöberg (2008).
2.2.1 Motivation

Nonlinear differential-algebraic equations are the natural outcome of component-based modeling of complex dynamic systems. Often, there is some known structure to the equations; for instance, it was mentioned in chapter 1 that we would like to understand a method that applies to equations in quasilinear form,

    E( x(t), t ) x'(t) + A( x(t), t ) = 0                       (2.14)

In the next section, we approach this form by looking at increasingly general types of equations. Within many fields, equations emerge in the form (2.14) without being recognized as such. The reason is that when x'(t) is sufficiently easy to solve for, the equation is converted to the state space form, which can formally be written as

    x'(t) = −E( x(t), t )^{-1} A( x(t), t )

Sometimes, the leading matrix may be well conditioned, but nevertheless non-trivial to invert. It may then be preferable to leave the equations in the form (2.14). In this case, the form (2.14) is referred to as an implicit ode or an index 0 dae. One reason for not converting to state space form is that one may lose sparsity patterns. Hence, the state space form may require much more storage than the implicit ode, and may also be a much more expensive way of obtaining x'(t). Besides, even when the inverse of a sparse symbolic matrix is also sparse, the expressions in the inverse matrix are generally of much higher complexity. Here is an example that shows that the inverse of a sparse matrix may be full:

    [ 1  1  0 ]^{-1}         [  1  −1   1 ]
    [ 0  1  1 ]      =  1/2  [  1   1  −1 ]
    [ 1  0  1 ]              [ −1   1   1 ]

If the example above is extended to a 5 by 5 matrix with unique symbolic constants at the non-zero positions, the memory required to store the original matrix in Mathematica (Wolfram Research, Inc., 2008) is 528 bytes. If the inverse is represented with the inverse of the determinant factored out, the memory requirement is 1648 bytes, and without the factorization the memory requirement is 6528 bytes.
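The 3-by-3 example can be checked numerically; this sketch verifies the inverse and counts non-zero entries to show how sparsity is lost:

```python
import numpy as np

M = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])
Minv = np.linalg.inv(M)

print(2 * Minv)   # the matrix of +/-1 entries shown above (inverse = 1/2 of it)
# The sparse matrix has 6 non-zero entries, but its inverse is full:
print(np.count_nonzero(M), np.count_nonzero(Minv))   # 6 9
```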
Although an interesting case in itself, the implicit ode form is not the focus of this thesis. What remains is the case when the leading matrix is singular. Such equations appear naturally in many fields, and we will finish this section by looking briefly at some examples. As was mentioned above, quasilinear equations are the natural outcome of component-based modeling, and these will generally have a singular leading matrix. This type of modeling refers to the bottom-up process, where one begins by making small models of simple components. The small models are then combined to form bigger models, and so on. Each component, be it small or large, has variables that are thought of as inputs and outputs, and when models are combined to make models at a higher level, this is done by connecting outputs with inputs. Each connection renders a trivial equation where two variables are "set" equal. These equations contain no differentiated variables, and will hence have a corresponding zero row in the leading matrix. The leading matrix must then be singular, but the problem has a prominent structure which is easily exploited. Our next example is models of electric networks. Here, many components (or subnetworks) may be connected in one node, where all electric potentials are equal and Kirchhoff's Current Law provides the glue for currents. While the equations for the potentials are trivial equalities between pairs of variables, the equations for the currents will generate linear equations involving several variables. Still, the corresponding part of the leading matrix is a zero row, and the coefficients of the currents are ±1, when present. This structure is also easy to exploit. The previous example is often recognized as one of the canonical applications of the so-called bond graph theory.
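As a toy illustration (the two components and their connection are invented for the purpose), stacking two first-order components and one connection equation gives exactly the kind of zero row, and hence singular leading matrix, described above:

```python
import numpy as np

# Two made-up components, x1' = -x1 + u1 (with output y1 = x1) and
# x2' = -x2 + u2, connected by u2 = y1. With variables z = (x1, x2, u2)
# and external input u1, the stacked model reads E z' + A z + B u1 = 0:
E = np.array([[1.0, 0.0, 0.0],     # x1' + x1 - u1 = 0
              [0.0, 1.0, 0.0],     # x2' + x2 - u2 = 0
              [0.0, 0.0, 0.0]])    # connection u2 - x1 = 0: no derivatives
A = np.array([[ 1.0, 0.0,  0.0],
              [ 0.0, 1.0, -1.0],
              [-1.0, 0.0,  1.0]])
B = np.array([[-1.0], [0.0], [0.0]])

print(np.linalg.matrix_rank(E))   # 2 -- the leading matrix is singular
```

Here the structure is trivial to exploit: the connection row can be used to eliminate u2 symbolically, recovering a state space model.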
Other domains where (one-dimensional) bond graphs are used are mechanical translation, mechanical rotation, hydraulics (pneumatics), some thermal systems, and some systems in chemistry. While the one-dimensional bond graphs are the most widely known, there is an extension which, among other applications, overcomes the limitation in mechanical systems to objects which either translate along a given line or rotate about a given axis. This generalization is known as multi-bond graphs or vector bond graphs; see Breedveld (1982), Ingrim and Y. (1991), and references therein. In the bond graph framework, the causality of a model needs to be determined in order to generate model equations in ode form. However, the most frequently used technique for assigning causality to the bond graph, named the Sequential Causality Assignment Procedure (Rosenberg and Karnopp, 1983, section 4.3), suffers from a potential problem with combinatorial blow-up. One way of avoiding this problem is to generate a dae instead. Although some chemical processes can be modeled using bond graphs, this framework is rarely mentioned in recent literature on dae modeling in the chemistry domain. Rather, equation-based formulations prevail, and according to Unger et al. (1995), most models have the quasilinear form. The amount of dae research within the field of chemistry is remarkable, which is likely due to their extensive applicability in a profitable business where high fidelity models are a key to better control strategies.

2.2.2 Common forms

Having presented the general idea of finding suitable model classes to work with in section 2.1.4, this section contains some common cases from the dae world. As we are moving our focus away from the automatic control applications that motivate our research, towards questions of a more generic mathematical kind, our notation changes; instead of model class, we will now speak of the form of an equation.
We begin with some repetition of notation defined in section 1.6. Beginning with the overly simple, an autonomous lti dae has the form

    E x'(t) + A x(t) = 0                                        (2.15)

where E and A are constant matrices. A large part of this thesis is devoted to the study of this form. Adding forcing functions (often representing external inputs) while maintaining the lti property leads to the general lti dae form

    E x'(t) + A x(t) + B u(t) = 0                               (2.16)

where u is a vector-valued function representing external inputs to the model, and B is a constant matrix. The function u may be subject to various assumptions.

2.1 Example
In automatic control, system inputs are often computed as functions of the system state or an estimate thereof — this is called feedback — but such inputs are not external. To see how such feedback loops may be conveniently modeled using dae models, let

    E_G x'(t) + A_G x(t) + [ B_G1  B_G2 ] [ u1(t) ]
                                          [ u2(t) ] = 0         (2.17)

be a model of the system without the feedback control. Here, the inputs to the system have been partitioned into one part, u1, which will later be given by feedback, and one part, u2, which will be the truly external inputs to the feedback loop. Let

    E_H x̂'(t) + A_H x̂(t) + [ B_H1  B_H2 ] [ u1(t) ]
                                           [ u2(t) ] = 0        (2.18)

be the equations of the observer, generating the estimate x̂ of the true state x. Finally, let a simple feedback be given by

    u1(t) = −L x̂(t)                                             (2.19)

Now, it is more a matter of taste whether to consider the three equations (2.17), (2.18), and (2.19) to be in the form (2.16) or not; if not, it just remains to note that if u1 is made an internal variable of the model, the equations can be written

    [ E_G          ] [ x'(t)  ]   [ A_G         B_G1 ] [ x(t)  ]   [ B_G2 ]
    [      E_H     ] [ x̂'(t) ] + [      A_H    B_H1 ] [ x̂(t) ] + [ B_H2 ] u2(t) = 0     (2.20)
    [            0 ] [ u1'(t) ]   [      L      I    ] [ u1(t) ]   [  0   ]

Of course, eliminating u1 from these equations would be trivial;

    [ E_G      ] [ x'(t)  ]   [ A_G   −B_G1 L       ] [ x(t)  ]   [ B_G2 ]
    [      E_H ] [ x̂'(t) ] + [       A_H − B_H1 L  ] [ x̂(t) ] + [ B_H2 ] u2(t) = 0

but the purpose of this example is to show how the model can be written in a form that is both a little easier to formulate and better at displaying the logical structure of the model.

One way to generalize the form (2.16) is to remove the restriction to time-invariant equations. This leads to the linear time-varying form of dae:

    E( t ) x'(t) + A( t ) x(t) + B( t ) u(t) = 0                (2.21)

While this form explicitly displays what part of the system's time variability is due to "external inputs", one can, without loss of generality, assume that the equations are in the form

    E( t ) x'(t) + A( t ) x(t) = 0                              (2.22)

This is seen by (rather awkwardly) writing (2.21) as

    [ E( t )    ] [ x'(t) ]   [ A( t )  B( t ) u(t) ] [ x(t) ]
    [         I ] [ α'(t) ] + [ 0       0           ] [ α(t) ] = 0

    α(t0) = 1

where the variable α has been included as an awkward way of denoting the constant 1. Still, the form (2.21) is interesting as it stands, since it can express logical structure in a model, and if algorithms exploit that structure one may obtain more efficient implementations or results that are easier to interpret. In addition, it should be noted that the model structures are not fully specified without telling what constraints the various parts of the equations must satisfy. If one can handle a larger class of functions representing external inputs in the form (2.21) than the class of functions in the algebraic term in (2.22), there are actually systems in the form (2.21) which cannot be represented in the form (2.22). The same kind of considerations should be made when considering the form

    E( t ) x'(t) + A( t ) x(t) + f( t ) = 0                     (2.23)

as a substitute for (2.21). A natural generalization of (2.23) is to allow dependence on all variables where (2.23) only allows dependence on t. At the risk of losing structure in problems with external inputs etc., the resulting equations are then in the quasilinear form, repeated here,
    E( x(t), t ) x'(t) + A( x(t), t ) = 0                       (2.14)

The most general form of dae is

    f( x'(t), x(t), t ) = 0                                     (2.24)

but it takes some analysis to realize why writing this equation as

    f( ẋ(t), x(t), t ) = 0
    ẋ(t) − x'(t) = 0                                            (2.25)

does not show that (2.14) is the most general form we need to consider. So far, we have considered increasingly general forms of dae without considering how the equations can be analyzed. For instance, modeling often leads to equations which are clearly separated into differential and non-differential equations, and this structure is often possible to exploit. Since discussion of the following forms requires the reader to be familiar with the contents of section 2.2.3, the forms will only be mentioned quickly to give some intuition about what forms with this type of structural property may look like. What follows is a small and rather arbitrary selection of the forms discussed in Brenan et al. (1996). The semi-explicit form looks like

    x1'(t) = f1( x1(t), x2(t), t )
    0 = f2( x1(t), x2(t), t )                                   (2.26)

and one often speaks of semi-explicit index 1 dae (the concept of an index will be discussed further in section 2.2.3), which means that the function f2 is such that x2 can be solved for:

    ∇2 f2 is square and non-singular                            (2.27)

Another often used form is the Hessenberg form of size r,

    x1'(t) = f1( x1(t), x2(t), …, xr(t), t )
    x2'(t) = f2( x1(t), x2(t), …, x_{r−1}(t), t )
    ⋮
    xi'(t) = fi( x_{i−1}(t), xi(t), …, x_{r−1}(t), t )          (2.28)
    ⋮
    0 = fr( x_{r−1}(t), t )

where it is required that

    ( ∂fr( x_{r−1}, t ) / ∂x_{r−1} ) ( ∂f_{r−1}( x_{r−2}, t ) / ∂x_{r−2} ) ··· ( ∂f2( x1, x2, …, x_{r−1}, t ) / ∂x1 ) ( ∂f1( x1, x2, …, xr, t ) / ∂xr )     (2.29)

is non-singular.

2.2.3 Indices and their deduction

In the previous sections, we have spoken of the index of a dae and of index reduction, and we have used these notions as if they were well defined.
This is not the case; there are many definitions of indices. In this section, we will mention some of these definitions, and define what shall be meant by just index (without qualification) in the remainder of the thesis. We shall do this at somewhat greater length than what is needed for the following chapters, since this is a good way of introducing readers with no or very limited experience with dae to typical dae issues. At least three categories of indices can be identified:

• For equations that relate forcing functions to the equation variables, there are indices that are equal for any two equivalent equations. In other words, these indices are not a property of the equations per se, but of the abstract system defined by the equations.

• For equations written in particular forms, one can introduce perturbations or forcing functions at predefined slots in the equations, and then define indices that tell how the introduced elements are propagated to the solution. Since equivalence of equations generally does not account for the slots, these indices are generally not the same for two equations considered equivalent. In other words, these indices are a property of the equations per se, but are still defined abstractly without reference to how they are computed.

• Analysis (for instance, revealing the underlying ordinary differential equation on a manifold) and solution of dae have given rise to many methods, and one can typically identify some natural number for each method as a measure of how involved the equations are. This defines indices based on methods. Basically, these are a property of the equations, but can generally not be defined abstractly without reference to how to compute them.

The above categorization is not a clear cut in every case. For instance, an index which was originally formulated in terms of a method may later be given an equivalent but more abstract definition.
Sometimes, when modeling follows certain patterns, the resulting equations may be of known index (of course, one has to specify which index is referred to). It may then be possible to design special-purpose algorithms for automatic control tasks such as simulation, system identification or state estimation. In this thesis, we regard the solution of initial value problems as a key to understanding other aspects of dae in automatic control. We are not so much interested in the mathematical questions of exactly when solutions exist or how the solutions may be described abstractly, but often think in terms of numerical implementation. For equations of unknown, higher index, all existing approaches to numerical solution of initial value problems that we know of perform index reduction so that one obtains equations of low index (typically 0 or 1), which can then be fed to one of the many available solvers for such equations. The index reduction algorithm used in the following chapters on singular perturbation (described in chapter 3) relates to the differentiation index, which we will define first in terms of this algorithm. We will then show an equivalent but more abstract definition. See Campbell and Gear (1995) for a survey (although incomplete today) of various index definitions and for examples of how different indices may be related. The algorithm that we use to reveal the differentiation index is a so-called elimination-differentiation approach. Such approaches have been in use for a long time, and as is often the case in the area of dynamic systems, the essence of the idea is best introduced by looking at linear time-invariant (lti) systems, while the extension to nonlinearities brings many subtleties to the surface. The linear case was considered in Luenberger (1978), and the algorithm is commonly known as the shuffle algorithm.
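For lti dae, the elimination-differentiation idea is compact enough to sketch directly. The following simplified sketch omits the constraint set C and the forcing-function bookkeeping of algorithm 2.1, treats only the autonomous form (2.15), and uses an SVD for the row operations; it is an illustration, not the algorithm as used later in the thesis:

```python
import numpy as np

def differentiation_index(E, A, tol=1e-9):
    """Simplified shuffle sketch for E x'(t) + A x(t) = 0.
    Returns the differentiation index and a pair (E, A) in which the
    leading matrix is non-singular."""
    E = np.asarray(E, float).copy()
    A = np.asarray(A, float).copy()
    n = E.shape[0]
    index = 0
    while np.linalg.matrix_rank(E, tol) < n:
        if index > n:
            raise ValueError("ill-posed")
        # Row operations (an orthogonal U from the SVD) compressing the
        # rows of E so that the rows below its rank are zero.
        U, s, _ = np.linalg.svd(E)
        r = int(np.sum(s > tol))
        E, A = U.T @ E, U.T @ A
        # Differentiate the algebraic rows: their A-part moves into E.
        E[r:], A[r:] = A[r:], 0.0
        index += 1
    return index, E, A

# x1'(t) + x2(t) = 0 together with x1(t) = 0 needs two differentiations
# to determine x2, so its differentiation index is 2:
idx, Ek, Ak = differentiation_index([[1.0, 0.0], [0.0, 0.0]],
                                    [[0.0, 1.0], [1.0, 0.0]])
print(idx)   # 2
```

Each pass through the loop corresponds to one "shuffle": row-compress the leading matrix, then differentiate the equations that carry no derivatives.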
For notational convenience in algorithm 2.1 (on page 30), we recall the following definition from section 1.6.1:
\[
u'^{\{i\}} \triangleq \begin{pmatrix} u \\ u' \\ \vdots \\ u^{(i)} \end{pmatrix}
\]
In the algorithm, there is a clear candidate for an index: the final value of i. We make this our definition of the differentiation index.

2.2 Definition (Differentiation index). The differentiation index of a square lti dae is given by the final value of i in algorithm 2.1.

While the compact representation of lti systems makes the translation of theory to computer programs rather straightforward, the implementation of nonlinear theory is not at all as straightforward. This seems, at least in part, to be explained by the fact that there are no widespread computer tools for working with the mathematical concepts from differential geometry. A theoretical counterpart (called the structure algorithm, see section 2.2.5) of the shuffle algorithm, applying to general nonlinear dae, was used in Rouchon et al. (1992). However, its implementation is nontrivial since it requires a computable representation of the function whose existence is granted by the implicit function theorem. For quasilinear dae, on the other hand, an implicit function can be computed explicitly, and our current interest in these methods owes to this fact. For references to implementation-oriented index reduction of quasilinear dae along these lines, see for example Visconti (1999) or Steinbrecher (2006). Instead of extending the above definition of the differentiation index of square lti dae to the quasilinear form, we shall make a more general definition, which we will prove is a generalization of the former. The following definition of the differentiation index of a general nonlinear dae can be found in Campbell and Gear (1995). It should be mentioned, though, that the authors of Campbell and Gear (1995) are not in favor of using this index to characterize a model, and define replacements.
On the other hand, in the context of particular algorithms, the differentiation index may nevertheless be a relevant characterization.

Algorithm 2.1 The shuffle algorithm.
Input: A square lti dae,
\[
E\, x'(t) + A\, x(t) + B\, u(t) = 0
\]
Output: An equivalent non-square dae consisting of a square lti dae with non-singular leading matrix (and redefined forcing function) and a set C = ⋃_i C_i of linear equality constraints involving x and u'^{\{i\}} for some i.
Algorithm:
  E_0 ≔ E,  A_0 ≔ A,  B_0 ≔ B,  i ≔ 0
  while E_i is singular
    Manipulate the equations by row operations so that E_i becomes partitioned as
    \[ \begin{pmatrix} \bar E_i \\ \tilde E_i \end{pmatrix} \]
    where Ē_i has full rank and Ẽ_i = 0. This can be done by, for instance, Gaussian elimination or QR factorization. Perform the same row operations on the other matrices, and partition the result similarly.
    C_i ≔ { Ã_i x + B̃_i u'^{\{i\}} = 0 }
    \[
    E_{i+1} \coloneqq \begin{pmatrix} \bar E_i \\ \tilde A_i \end{pmatrix}, \qquad
    A_{i+1} \coloneqq \begin{pmatrix} \bar A_i \\ 0 \end{pmatrix}, \qquad
    B_{i+1} \coloneqq \begin{pmatrix} \bar B_i & 0 \\ 0 & \tilde B_i \end{pmatrix}
    \]
    i ≔ i + 1
    if i > dim x, abort with “ill-posed”
  end
Remark: The new matrices computed in each iteration simply correspond to differentiating the equations from which the differentiated variables have been removed by the row operations. (This should clarify the notation used in the construction of the B_i.) Since the row operations generate equivalent equations, and the equations that get differentiated are also kept unaltered in C, it is seen that the output equations are equivalent to the input equations. See the notes in algorithm 2.2 regarding geometric differentiation, and note that assumptions about constant Jacobians are trivially satisfied in the lti case.

Consider the general nonlinear dae
\[
f( x'(t), x(t), t ) = 0 \tag{2.30}
\]
By using the notation
\[
\dot x^{\{i\}}(t) = \bigl( x(t), \dot x(t), \ldots, \dot x^{(i)}(t) \bigr) \tag{2.31}
\]
the general form can be written f_0( ẋ^{\{1\}}(t), t ) = 0. Note that differentiation with respect to t yields an equation which can be written f_1( ẋ^{\{2\}}(t), t ) = 0. Introducing the derivative array
\[
F_i( x'^{\{i+1\}}(t), t ) = \begin{pmatrix} f_0( x'^{\{1\}}(t), t ) \\ \vdots \\ f_i( x'^{\{i+1\}}(t), t ) \end{pmatrix} \tag{2.32}
\]
the implied equation
\[
F_i( \dot x^{\{i+1\}}(t), t ) = 0 \tag{2.33}
\]
is called the derivative array equations accordingly.

2.3 Definition (Differentiation index). Suppose (2.30) is solvable. If ẋ(t) is uniquely determined given x(t) and t in the non-differential equation (2.33), for all x(t) and t such that a solution exists, and ν_D is the smallest i for which this is possible, then ν_D is denoted the differentiation index of (2.30).

Next, we show that the two definitions of the differentiation index are compatible.

2.4 Theorem. Definition 2.3 generalizes definition 2.2.

Proof: Consider the derivative array equations F_i( ẋ^{\{i+1\}}, t ) = 0 for the square lti dae of definition 2.2:
\[
\begin{pmatrix}
A_0 & E_0 & & \\
& A_0 & E_0 & \\
& & \ddots & \ddots \\
& & A_0 & E_0
\end{pmatrix}
\begin{pmatrix} x \\ \dot x \\ \vdots \\ \dot x^{(i+1)} \end{pmatrix}
+
\begin{pmatrix} B\, u(t) \\ B\, u'(t) \\ \vdots \\ B\, u^{(i)}(t) \end{pmatrix}
= 0 \tag{2.34}
\]
Suppose definition 2.2 defines the index as i. Then E_i in algorithm 2.1 is non-singular by definition. The first row elimination of the shuffle algorithm on (2.34) yields
\[
\begin{pmatrix}
\bar A_0 & \bar E_0 & & \\
\tilde A_0 & 0 & & \\
& \bar A_0 & \bar E_0 & \\
& \tilde A_0 & 0 & \\
& & \ddots & \ddots \\
& & \bar A_0 & \bar E_0 \\
& & \tilde A_0 & 0
\end{pmatrix}
\begin{pmatrix} x \\ \dot x \\ \vdots \\ \dot x^{(i+1)} \end{pmatrix}
+
\begin{pmatrix} \bar B\, u(t) \\ \tilde B\, u(t) \\ \bar B\, u'(t) \\ \tilde B\, u'(t) \\ \vdots \\ \bar B\, u^{(i)}(t) \\ \tilde B\, u^{(i)}(t) \end{pmatrix}
= 0
\]
Reordering the rows as
\[
\begin{pmatrix}
\bar A_0 & \bar E_0 & & & \\
0 & \tilde A_0 & & & \\
& \bar A_0 & \bar E_0 & & \\
& 0 & \tilde A_0 & & \\
& & \ddots & \ddots & \\
& & & \bar A_0 & \bar E_0 \\
\tilde A_0 & & & &
\end{pmatrix}
\begin{pmatrix} x \\ \dot x \\ \vdots \\ \dot x^{(i+1)} \end{pmatrix}
+
\begin{pmatrix} \bar B\, u(t) \\ \tilde B\, u'(t) \\ \bar B\, u'(t) \\ \tilde B\, u''(t) \\ \vdots \\ \bar B\, u^{(i)}(t) \\ \tilde B\, u(t) \end{pmatrix}
= 0 \tag{2.35}
\]
and ignoring the last two rows, this can be written
\[
\begin{pmatrix}
A_1 & E_1 & & \\
& A_1 & E_1 & \\
& & \ddots & \ddots \\
& & A_1 & E_1
\end{pmatrix}
\begin{pmatrix} x \\ \dot x \\ \vdots \\ \dot x^{(i)} \end{pmatrix}
+ \cdots = 0
\]
using the notation in algorithm 2.1. The forcing function u has been suppressed for brevity. After repeating this procedure i times, one obtains
\[
\begin{pmatrix} A_i & E_i \end{pmatrix}
\begin{pmatrix} x \\ \dot x \end{pmatrix}
+ \cdots = 0
\]
which shows that definition 2.2 gives an upper bound on the index defined by definition 2.3.

Conversely, it suffices to show that the last two rows of (2.35) do not contribute to the determination of ẋ. The last row only restricts the feasible values for x, which is considered a given in the equation.
The second last row contains no information that can be propagated to ẋ, since it can be satisfied for any ẋ^{(i)} by a suitable choice of ẋ^{(i+1)} (which appears in no other equation). Since this shows that no information about ẋ was discarded, we have also found that if the index as defined by definition 2.2 is greater than i, then E_i is singular, and hence the index as defined by definition 2.3 must also be greater than i. That is, definition 2.2 gives a lower bound on the index defined by definition 2.3.

Many other variants of differentiation index definitions can be found in Campbell and Gear (1995), which also provides the relevant references. However, they avoid discussion of geometric definitions of the differentiation index. While not being important for lti dae, where the representation by numeric matrices successfully captures the geometry of the equations, geometric definitions turn out to be important for nonlinear dae. This is emphasized in Thomas (1996), as it summarizes results by other authors (Rabier and Rheinboldt, 1994; Reich, 1991; Szatkowski, 1990, 1992). It is noted that the geometrically defined differentiation index is bounded by the dimension of the equations, but cannot be computed reliably using numerical methods; the indices which can be computed numerically are not geometric and may not be bounded even for well-posed equations. The presentation in Thomas (1996) is further developed in Reid et al. (2001) to apply also to partial differential-algebraic equations.

Having discussed the differentiation index with its strong connection to algorithms, we now turn to an index concept of another kind, namely the perturbation index. The following definition is taken from Campbell and Gear (1995), which refers to Hairer et al. (1989).

2.5 Definition. The dae f( x'(t), x(t), t ) = 0 has perturbation index ν_P along a solution x on the interval I = [ 0, T ] if ν_P is the smallest integer such that if
f( x'(t), x(t), t ) = δ(t) for sufficiently smooth δ, then there is an estimate
\[
\| \hat x(t) - x(t) \| \le C \Bigl( \| \hat x(0) - x(0) \| + \| \delta \|_t^{\nu_P - 1} \Bigr)
\]
Clearly, one can define a whole range of perturbation indices by considering various “slots” in the equations, and each form of the equations may have its own natural slots. There are two aspects of these indices we would like to emphasize. First, they are defined completely without reference to a method for computing them, and in this sense they seem closer to capturing intrinsic features of the system described by the equations than indices that are defined by how they are computed. Second, and on the other hand, the following example shows that these indices may be strongly related to which set of equations is used to describe a system.

2.6 Example
Consider computing the perturbation index of the dae
\[
f( x'(t), x(t), t ) = 0
\]
We must then examine how the solution depends on the forcing perturbation function δ in
\[
f( x'(t), x(t), t ) = \delta(t)
\]
Now, let the matrix K( x(t), t ) define a smooth, non-singular transform of the equations, leading to
\[
K( x(t), t )\, f( x'(t), x(t), t ) = 0
\]
with perturbation index defined by examination of
\[
K( x(t), t )\, f( x'(t), x(t), t ) = \delta(t)
\]
Here, the norm with ornaments is defined by
\[
\| \delta \|_t^{m} = \sum_{i=0}^{m} \sup_{\tau \in [\,0,\, t\,]} \| \delta^{(i)}(\tau) \|, \quad m \ge 0,
\qquad
\| \delta \|_t^{-1} = \int_0^t \| \delta(\tau) \| \, d\tau
\]
Trying to relate this to the original perturbation index, we could try rewriting the equations as
\[
f( x'(t), x(t), t ) = K( x(t), t )^{-1} \delta(t)
\]
but this introduces x(t) on the right hand side, and is no good. Further, since the perturbation index does not give bounds on the derivative of the estimation error, there are no readily available bounds on the derivatives of the factor K( x(t), t )^{-1} that depend only on t. In the special case when the perturbation index is 0, however, a bound on K allows us to translate a bound in terms of \( \| K( x(t), t )^{-1} \delta(t) \|_t^{-1} \) into a bound in terms of \( \| \delta(t) \|_t^{-1} \).
This shows that, at least, this way of rewriting the equations does not change the perturbation index.

It would be interesting to relate the differentiation index to the perturbation index, but we have already seen an example of how different index definitions can be related, and shall not dwell more on this. Instead, there is one more index we would like to mention, since it is instrumental to a well developed theory and will be the starting point for chapter 5. This is the strangeness index, developed for time-varying linear dae in Kunkel and Mehrmann (1994); see also Kunkel and Mehrmann (2006). Perhaps as a consequence of its ability to reveal a more fine-grained characterization of a system than, for instance, the differentiation index, the strangeness index is somewhat expensive to compute. This becomes particularly evident in the associated method for solving initial value problems, where the index computations are performed at each step of the solution. This is addressed in the relatively recent Kunkel and Mehrmann (2004); see also Kunkel and Mehrmann (2006, remark 6.7 and remark 6.9). However, one caveat remains: the implications of determining ranks numerically are not understood, see for instance Kunkel and Mehrmann (2006, remark 6.7) or Kunkel et al. (2001, remark 8). The kind of results that are missing here is closely related to the matrix-valued perturbation problems considered in this thesis, although our analysis relates to the differentiation index rather than the strangeness index.

A quite different method which reduces the index is Pantelides' algorithm (Pantelides, 1988) and the dummy derivatives extension thereof (Mattsson and Söderlind, 1993). This technique is in extensive use in component-based modeling and simulation software for the Modelica language, such as Dymola (Mattsson et al., 2000; Brück et al., 2002) and OpenModelica (Fritzson et al., 2006a,b).
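To give a feel for what working with only the equation–variable graph means, the following sketch checks one necessary structural condition that matching-based methods in the Pantelides tradition build on: every equation must be assignable to a distinct variable occurring in it, that is, the bipartite equation–variable graph must admit a perfect matching. This is only one ingredient of the actual algorithm; the augmenting-path routine and the example incidence sets are our own illustration, not taken from the cited works.

```python
def perfect_matching_exists(incidence):
    """incidence[e] = set of variable indices occurring in equation e.
    Returns True iff every equation can be assigned its own variable
    (maximum bipartite matching via augmenting paths)."""
    match = {}  # variable -> equation currently assigned to it

    def augment(eq, seen):
        for var in incidence[eq]:
            if var in seen:
                continue
            seen.add(var)
            # Assign var to eq if var is free, or if the equation holding
            # var can be re-assigned to some other variable.
            if var not in match or augment(match[var], seen):
                match[var] = eq
                return True
        return False

    return all(augment(eq, set()) for eq in range(len(incidence)))

# Three equations, three variables, structurally sound:
print(perfect_matching_exists([{0, 2}, {1, 2}, {0, 1}]))  # True
# Two equations mentioning only variable 0: structurally singular.
print(perfect_matching_exists([{0}, {0}, {1, 2}]))        # False
```

A change of variables can of course make an entry of the incidence structure vanish numerically while remaining present structurally, which is precisely the sense in which graph-based techniques can be misled.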
A major difference between the previously discussed index reduction algorithms and Pantelides' algorithm is that the former use mathematical analysis to derive the new form, while the latter uses only the structure of the equations (the equation–variable graph). Since the equation–variable graph does not require the equations to be in any particular form, the technique is applicable to general nonlinear dae. While the graph-based technique can be expected to be misled by a change of variables and other manipulations of the equations (see section 1.2.1), it is well suited for the equations as they arise in the software systems mentioned above.

Hereafter, when speaking of just the index (without qualification), we refer to the differentiation index, often thinking of it as the number of steps required to shuffle the equations into an implicit ode.

In the presence of uncertainty, there are two more index concepts which we need to define. Thinking of the uncertain dae as some element (point) in a set of dae, we may use the term point dae to denote an exact dae. When a dae is uncertain, its index also becomes uncertain in the general case, which is emphasized by the next definition.

2.7 Definition (Pointwise index). Let just “index” refer to one of the notions of index for a point dae. The pointwise index of the uncertain dae f( x(t), x'(t), t ) = 0, where f ranges over F, is defined as the set
\[
\bigl\{ \text{index of } f( x(t), x'(t), t ) = 0 \;:\; f \in F \bigr\}
\]
If the set contains exactly one element by construction or by assumption, we will reuse notation and let the pointwise index refer to this element instead of the set containing it.

We will often make assumptions regarding the pointwise index of a dae, and it is important to see that this is a way of removing unwanted point dae from the uncertain dae.
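That the pointwise index is in general a set can be illustrated with a minimal sketch. Consider the hypothetical uncertain pair E = diag(1, ε), A = I with ε ∈ [ 0, 0.1 ]: every point dae with ε ≠ 0 is an implicit ode of index 0, while the point dae at ε = 0 has index 1. The snippet restates the elimination-differentiation idea in a few lines so that it is self-contained; the function and the sampled family are our own illustration.

```python
import numpy as np

def index_of_pair(E, A, tol=1e-9):
    """Differentiation index by shuffling: while E is singular, zero its
    rank-deficient rows by orthogonal row operations and replace them
    with the corresponding rows of A."""
    E, A = np.array(E, dtype=float), np.array(A, dtype=float)
    n, i = E.shape[0], 0
    while np.linalg.matrix_rank(E, tol) < n:
        U, s, _ = np.linalg.svd(E)
        E, A = U.T @ E, U.T @ A
        r = int(np.sum(s > tol))
        E[r:] = A[r:]
        A[r:] = 0.0
        i += 1
        if i > n:
            raise ValueError("ill-posed")
    return i

# Pointwise index of the uncertain pair (diag(1, eps), I), eps in [0, 0.1]:
samples = [0.0, 0.02, 0.05, 0.1]
pointwise = {index_of_pair(np.diag([1.0, e]), np.eye(2)) for e in samples}
print(pointwise)  # {0, 1}
```

Assuming pointwise index 0 for this family is exactly the act of removing the unwanted point dae at ε = 0 from the uncertain dae.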
The second index concept appears when the uncertain dae is being approximated by one with less (or no) uncertainty, typically of higher index than the dae being approximated.

2.8 Definition (Nominal index). When an uncertain dae is being approximated by another dae where some of the uncertainty is removed, the nominal index refers to the index of the latter dae.

The nominal index is thus something defined by how the uncertain dae is being approximated, and there will generally be a trade-off between stiffness in an approximation of low nominal index, and large uncertainty bounds on the solution of an approximation of higher nominal index.

2.2.4 Transformation to quasilinear form

In this section, the transformation of a general nonlinear dae to quasilinear form is considered. This may seem like a topic for section 2.2.2, but since we need to refer to the index concept, it was motivated to wait until after section 2.2.3. For ease of notation, we shall only deal with equations without explicit dependence on the time variable in this section. This way, it makes sense to write a time-invariant nonlinear dae as
\[
f( x, x', x'', \ldots ) = 0 \tag{2.36}
\]
The variable in this equation is the function x, and the zero on the right hand side must be interpreted as the mapping from all of the time domain to the constant real vector 0. We choose to interpret the equality relation of the equation pointwise, although other measure-zero interpretations could be made (we are not seeking new semantics, only a shorter notation compared to (2.24)). Including higher order derivatives in the form (2.36) may seem like just a minor convenience compared to using only first order derivatives in (2.24), but some authors remark that this is not always the case (see, for instance, Mehrmann and Shi (2006)), and this is a topic for the discussion below. The time-invariant quasilinear form looks like
\[
E( x )\, x' + A( x ) = 0 \tag{2.37}
\]
Assuming that (2.24) has index ν_D but is not in the form (2.37), can we say something about the index of the corresponding (2.37)? Not being in the form (2.37) can be for two reasons:

• There are higher-order derivatives.

• The residuals are not linear in the derivatives.

To remedy the first, one simply introduces new variables for the derivatives of order one and higher, except those of the highest order. Of course, one also adds the equations relating the introduced variables to the derivatives they represent; each new variable gets one associated equation. This procedure does not raise the index, since the derivatives which have to be solved for really have not changed. If the highest order derivatives could be solved for in terms of lower-order derivatives after ν_D differentiations of (2.24), they will be possible to solve for in terms of the augmented set of variables after ν_D differentiations of (2.37) (of course, there is no need to differentiate the introduced trivial equations). The introduced variables' derivatives that must also be solved for are trivial (that is why the definitions of the index do not have to mention solution of the lower-order derivatives).

Turning to the less trivial reason, nonlinearity in the derivatives, the fix is still easy: introduce new variables for the derivatives that appear nonlinearly, and add the linear (trivial) equations that relate the new variables to derivatives of the old variables; change
\[
f( x, x' ) = 0 \tag{2.38}
\]
to
\[
x' = \dot x, \qquad f( x, \dot x ) = 0
\]
Note the important difference to the previous case: this time we introduce new variables for some highest-order derivatives. This may have implications for the index. If the index was previously defined as the number of differentiations required to be able to solve for x', we must now be able to solve for ẋ' = x''. Clearly, this can be obtained by one more differentiation once x' has been solved for, as in the following example.
2.9 Example
Consider the index-0 dae
\[
e^{x_2'} = e^{x_1}, \qquad x_1' = -x_2
\]
Taking this into the form (2.37) brings us to
\[
x_2' = \dot x, \qquad e^{\dot x} = e^{x_1}, \qquad x_1' = -x_2
\]
where ẋ' cannot be solved for immediately, since it does not even appear. However, after differentiating the purely algebraic equation once, all derivatives can be solved for:
\[
x_2' = \dot x, \qquad e^{\dot x} \dot x' = e^{x_1} x_1', \qquad x_1' = -x_2
\]

However, the index is not raised in general; it is only in case the nonlinearly appearing derivatives could not be solved for in fewer than ν_D steps that the index will be raised. The following example shows a typical case where the index is not raised.

2.10 Example
By modifying the previous example we get a system that is originally index-1:
\[
e^{x_2'} = e^{x_1}, \qquad x_1' = -x_2, \qquad x_3 = 1
\]
Taking this into the form (2.37) brings us to
\[
x_2' = \dot x, \qquad e^{\dot x} = e^{x_1}, \qquad x_1' = -x_2, \qquad x_3 = 1
\]
which is still index-1, since all derivatives can be solved for after one differentiation of the algebraic equations:
\[
x_2' = \dot x, \qquad e^{\dot x} \dot x' = e^{x_1} x_1', \qquad x_1' = -x_2, \qquad x_3' = 0
\]

Although the transformation discussed here may raise the index, it may still be a useful tool in case the equations and forcing functions are sufficiently differentiable. The transformation has been implemented as a part of a tool for finding the quasilinear structure in equations represented in general form. However, even though automatic transformation to quasilinear form is possible, it should be noted that formulating equations in the quasilinear form is a critical part of the modeling process, and should be done carefully. This is emphasized in the works on dae with properly stated leading terms by März and coworkers (Higueras and März, 2004; März and Riaza, 2006, 2007, 2008).

2.2.5 Structure algorithm

The application of the structure algorithm to dae described in this section is due to Rouchon et al. (1992), which relies on results in Li and Feng (1987).
The structure algorithm was developed for the purpose of computing inverse systems; that is, to find the input signal that produces a desired output. It assumes that the system's state evolution is given by an ode and that the output is a function of the state and the current input. Since the desired output is a known function, it can be included in the output function; that is, it can be assumed without loss of generality that the desired output is zero. The algorithm thus provides a means to determine u in the setup
\[
x'(t) = h( x(t), u(t), t ), \qquad 0 = f( x(t), u(t), t )
\]
The algorithm produces a new function η such that u can be determined from 0 = η( x, u, t ). By taking h( x, u, t ) = u, this reduces to a means for determining the derivatives of the variables x in the dae
\[
0 = f( x(t), x'(t), t )
\]
In algorithm 2.2 we give the algorithm applied to the dae setup. It is assumed that dim f = dim x, that is, that the system is square.

Algorithm 2.2 The structure algorithm.
Input: A square dae,
\[
f( x(t), x'(t), t ) = 0
\]
Output: An equivalent non-square dae consisting of a square dae from which x' can be solved for, and a set of constraints C = ⋃_i { Φ_i( x(t), t, 0 ) = 0 }. Let α be the smallest integer such that ∇_2 f_α( x, ẋ, t ) has full rank, or ∞ if no such number exists.
Invariant: The sequence of f_k shall be such that the solution is always determined by f_k( x, ẋ, t ) = 0, which is fulfilled for f_0 by definition. Reversely, this will make f_k( x, ẋ, t ) = 0 along solutions.
Algorithm:
  f_0 ≔ f,  i ≔ 0
  while ∇_2 f_i( x, ẋ, t ) is singular
    Since the rank of ∇_2 f_i( x, ẋ, t ) is not full, it makes sense to split f_i into two parts: f̄_i being a selection of components of f_i such that ∇_2 f̄_i( x, ẋ, t ) has full and maximal rank (that is, the same rank as ∇_2 f_i( x, ẋ, t )), and f̃_i being the remaining components.
    Locally (and as all results of this kind are local anyway, this will not be further emphasized), this has the interpretation that the dependency of f̃_i on ẋ can be expressed in terms of f̄_i( x, ẋ, t ) instead of ẋ; there exists a function Φ_i such that f̃_i( x, ẋ, t ) = Φ_i( x, t, f̄_i( x, ẋ, t ) ). Since f̄_i( x, ẋ, t ) = 0 along solutions, we replace the equations given by f̃_i by the residuals obtained by differentiating Φ_i( x(t), t, 0 ) with respect to t and substituting ẋ for x':
    \[
    f_{i+1} = ( x, \dot x, t ) \mapsto
    \begin{pmatrix}
    \bar f_i( x, \dot x, t ) \\
    \nabla_1 \Phi_i( x, t, 0 )\, \dot x + \nabla_2 \Phi_i( x, t, 0 )
    \end{pmatrix}
    \]
    i ≔ i + 1
    if i > dim x, abort with “ill-posed”
  end
Remark: Assuming that all ranks of Jacobian matrices are constant, it is safe to abort after dim x iterations (Rouchon et al., 1992). Basically, this condition means that the equations are not used pointwise, but rather as geometrical (algebraic) objects. Hence, in the phrasing of Thomas (1996), differentiations are geometric, and α becomes analogous to the geometric differentiation index.

In Rouchon et al. (1992), additional assumptions on the selection of components to constitute f̄_i are made, but we will not use those here.

2.2.6 LTI DAE, matrix pencils, and matrix pairs

The linear time-invariant dae
\[
E\, x'(t) + A\, x(t) = 0 \tag{2.39}
\]
is closely connected to the concepts of matrix pencils and matrix pairs. To the equation we may associate the matrix pencil s ↦ s E + A, and a large amount of dae analysis in the literature is formulated using matrix pencil theory. The sign convention we use (with addition instead of subtraction in the expression for the pencil) differs from much of the literature on lti dae (for instance, Stewart and Sun (1990)), but is natural in view of how the dae (2.39) is written, and is also the convention which generalizes to higher order matrix polynomials; compare Higham et al. (2006).
We will not go deep into this theory in this background, since the theory is basically concerned with exactly known matrices, while the theme of this thesis is uncertain matrices. Just to show the close connection between (2.39) and the corresponding matrix pencil, note that the Laplace transform of the equation is
\[
E \bigl( s X(s) - x(0) \bigr) + A X(s) = 0
\]
or
\[
( s E + A )\, X(s) = E\, x(0)
\]
If the matrix pencil is invertible at some s, it will be invertible at almost every s; the pencil is then called regular, X(s) can be evaluated at almost every point, and it will be possible to find x(t) by inverse Laplace transform. In the other case, when the pencil is singular at every s, the pencil is called singular, and the next theorem explains why we will avoid singular pencils in this thesis.

2.11 Theorem. If the matrix pencil associated with the dae (2.39) is singular, the dae with initial conditions x(0) = 0 has an infinite number of solutions, including both bounded and exponentially increasing functions. Among the non-zero bounded solutions, there are solutions with arbitrarily slow exponential decay.

Proof: The following proof is similar to that of Kunkel and Mehrmann (2006, theorem 2.14). Since λ E + A is singular for every λ, we can take n + 1 distinct numbers { λ_i }_{i=1}^{n+1} and corresponding vectors { v_i ≠ 0 }_{i=1}^{n+1} such that ( λ_i E + A ) v_i = 0 for all i. To construct real solutions, we make the selection such that if Im λ_i ≠ 0, then the complex conjugate of λ_i appears as λ_j for some j, and the corresponding v_i and v_j are also related by complex conjugation. Since the number of elements in { v_i }_{i=1}^{n+1} exceeds the dimension of the space, they are linearly dependent, and there is a linear combination which vanishes,
\[
\sum_i \alpha_i v_i = 0
\]
where not all α_i are zero. It follows that the function
\[
x(t) \triangleq \sum_i \alpha_i v_i e^{\lambda_i t}
\]
is real-valued, satisfies x(0) = 0, is not identically zero, and solves (2.39).
Since the choice of { λ_i }_{i=1}^{n+1} was arbitrary up to the pairing of complex conjugates, and two disjoint such sets cannot produce the same solution, the number of solutions is infinite, and since the Re λ_i may be chosen all negative as well as all positive, there are both bounded and exponentially increasing solutions. Arbitrarily slow exponential decay is obtained by selecting all Re λ_i negative, but close to zero, and by setting one eigenvalue to zero, the solutions will have a non-zero constant asymptote.

From the linearity of the differential equation (2.39), it follows that if there exists a solution with non-zero initial conditions, it will not be unique, since all the solutions with zero initial conditions may be added to it to produce new solutions. In view of this, we call the dae singular (regular) if its matrix pencil is singular (regular).

2.12 Corollary. An lti dae with singular pencil does not have a finite index.

Proof: If it had, the definition of index would imply that the dae had a unique solution. This contradicts the singularity of its pencil.

The next lemma helps us detect singular pencils.

2.13 Lemma. If the respective right null spaces of E and A intersect non-trivially, or the respective left null spaces do, the pencil s ↦ s E + A is singular.

Proof: First, the case of intersecting right null spaces. Take a v ≠ 0 belonging to the intersection. Then v is a non-trivial solution to ( s E + A ) v = 0 for any s, and hence s E + A is not invertible for any s. For intersecting left null spaces, take v ≠ 0 from the intersection. Then ( s E + A )^T v = 0 has a non-trivial solution for every s. Hence det ( s E + A )^T = det ( s E + A ) = 0 for every s.

The consequence of intersecting right null spaces is easy to characterize. It means that a change of variables will reveal one or more transformed variables that do not appear in the equations at all.
Clearly, any differentiable functions passing through the origin will do as a solution for these variables, if the problem has a solution at all. However, the dae will generally have no solutions, since the remaining variables will be over-determined. The case of intersecting left null spaces is illustrated as an example of theorem 2.11 at the end of the section.

For many purposes, the asymmetric roles of E and A in the matrix pencil may be disturbing, and it may make more sense to speak of the matrix pair ( E, A ) instead. In accordance with lti dae, the first matrix in the pair (here, E) will be denoted the leading matrix of the pair, while the second matrix (here, A) will be denoted the trailing matrix of the pair. The notions of regular and singular carry over from matrix pencils to matrix pairs by the obvious association.

2.14 Definition (Eigenvalues of a matrix pair). The eigenvalues of the matrix pair ( E, A ) are defined as the equivalence classes of complex scalar pairs ( α, β ) ≠ ( 0, 0 ) such that there exists a vector x ≠ 0 for which
\[
\alpha E x + \beta A x = 0
\]
Here, equivalence is defined as two pairs being equivalent if one equals the other times some complex scalar. By identifying the eigenvalue [( α, 0 )] with ∞, and any eigenvalue [( α, β )] where β ≠ 0 with the common ratio α/β, we may also consider the eigenvalues as belonging to C ∪ { ∞ }.

Two matrix pairs ( E_1, A_1 ) and ( E_2, A_2 ) are said to be equivalent if there exist non-singular matrices T and V such that E_2 V = T E_1 and A_2 V = T A_1. From the definition of eigenvalues, it is easy to see that two equivalent matrix pairs have the same eigenvalues.

The symmetric view of an eigenvalue as an equivalence class of pairs of scalars is the natural choice when the symmetric relation between the two matrices in a matrix pair is to be maintained.
However, in our view of the matrix pair as a representation of an lti dae, the matrices in the pair do not have a truly symmetric relation; it is always the leading one which is multiplied by the scalar parameter in a matrix pencil. The following trivial theorem justifies the other view of matrix pair eigenvalues in this thesis.

2.15 Theorem. If E is non-singular in the matrix pair ( E, A ), then the eigenvalues of the pair are the same as the eigenvalues of the matrix −E^{−1} A.

Proof: The pair ( E, A ) is equivalent to ( I, E^{−1} A ). Clearly [( α, 0 )] is not an eigenvalue, so any eigenvalue is of the form [( α, 1 )], satisfying the equation
\[
\alpha\, x = -E^{-1} A\, x
\]
This shows that [( α, 1 )], identified with α/1 = α, is also a matrix eigenvalue of −E^{−1} A. The argument also works in the converse direction.

The following theorem generalizes the Jordan canonical form for matrices to matrix pairs. The definition follows Stewart and Sun (1990), although the sign conventions differ due to different sign conventions for matrix pencils.

2.16 Theorem (Weierstrass canonical form). Let ( E, A ) be a regular matrix pair. Then it is equivalent to a pair in the form
\[
\left(
\begin{pmatrix} I & \\ & N \end{pmatrix},\;
\begin{pmatrix} J & \\ & I \end{pmatrix}
\right) \tag{2.40}
\]
where J is in Jordan canonical form, and N is a nilpotent matrix in Jordan canonical form.

Proof: This result is easy to find in the literature, and we suggest Stewart and Sun (1990, chapter VI, theorem 1.13) since it has other connections to this thesis as well.

It is easy to see that the index of nilpotency of N in the canonical form coincides with the differentiation index of the corresponding dae, with the convention that the index of nilpotency of an empty matrix is 0.

2.17 Definition (Singular/regular uncertain matrix pair). An uncertain matrix pair is said to be singular if it admits at least one singular point matrix pair (compare singular interval matrix). Otherwise, it is said to be regular.
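Theorem 2.15 is easy to check numerically for a concrete pair: the finite eigenvalues [( α, β )] with β ≠ 0 should coincide with the matrix eigenvalues of −E^{−1}A, and by definition 2.14 each of them should make the pencil λE + A singular. A minimal sketch, with example matrices of our own choosing:

```python
import numpy as np

E = np.array([[2.0, 0.0],
              [1.0, 1.0]])   # non-singular leading matrix
A = np.array([[-2.0, 0.0],
              [0.0, -3.0]])  # trailing matrix

# Theorem 2.15: eigenvalues of the pair (E, A) equal those of -E^{-1} A.
lams = np.linalg.eigvals(-np.linalg.solve(E, A))

# Definition 2.14: alpha E x + beta A x = 0 with (alpha, beta) = (lam, 1)
# means det(lam E + A) = 0 for every eigenvalue lam of the pair.
for lam in lams:
    assert abs(np.linalg.det(lam * E + A)) < 1e-9
print(sorted(lams.real))
```

For this pair, −E^{−1}A = [[1, 0], [−1, 3]], so the eigenvalues of the pair are 1 and 3. Infinite eigenvalues [( α, 0 )] appear as soon as E is singular, which is exactly the situation the shuffle algorithm resolves.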
The definitions of singular/regular are also applied to uncertain lti dae in the obvious manner. The section ends with an example of a dae with a singular matrix pair.

2.18 Example
Row and column reductions of the matrix pair are often useful tools to discover structure in linear dae. Row reduction corresponds to replacing equations by equivalent ones, while column reduction corresponds to invertible changes of coordinates. Suppose row reduction of the leading matrix of some pair resulted in
\[
\left(
\begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix},\;
\begin{pmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 2 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{pmatrix}
\right) \tag{2.41}
\]
where the lower part of the trailing matrix does not have full rank, since it has linearly dependent rows. By also performing row reduction on the trailing matrix, a common left null space is revealed:
\[
\left(
\begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix},\;
\begin{pmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 2 \\ 1 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}
\right)
\]
That is, the vector ( 0  0  0  1 )^T proves that the matrix pencil is singular according to lemma 2.13, and next we will construct some of the solutions whose existence is given by theorem 2.11.

Before we start solving for the right null space, column operations (that is, a change of variables) are applied to the leading matrix to reveal as many zeros as possible:
\[
\left(
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix},\;
\begin{pmatrix} 1 & -1 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}
\right)
\]
The pencil, at some point λ,
\[
\begin{pmatrix}
1 + \lambda & -1 & 1 & 0 \\
0 & 1 + \lambda & 0 & 1 \\
1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}
\]
can now be column reduced using a change of variables that will depend on λ, revealing its right null space. Hence,
\[
v( \lambda ) \triangleq \begin{pmatrix} 0 \\ 1 \\ 1 \\ -( 1 + \lambda ) \end{pmatrix}
\]
shows how to find non-trivial elements in the right null space. Inspection shows that only one of the components of v( λ ) actually depends on λ, so any set of three or more such vectors will be linearly dependent (form a matrix with the v( λ ) as columns, and consider the row rank).
Denoting three values for $\lambda$ by $\{ \lambda_i \}_{i=1}^{3}$, it can be seen that for every $\alpha$,
\[
\alpha ( \lambda_2 - \lambda_3 )\, v( \lambda_1 ) + \alpha ( \lambda_3 - \lambda_1 )\, v( \lambda_2 ) + \alpha ( \lambda_1 - \lambda_2 )\, v( \lambda_3 ) = 0
\]
For the coefficients to be real, $\alpha$ must be purely imaginary if there is a pair of complex conjugates. Hence,
\[
x( t, \alpha, \lambda_1, \lambda_2, \lambda_3 ) \triangleq
\alpha \begin{bmatrix} 1 & -1 & 0 & 0 \\ 0 & 1 & -1 & -1 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\left( ( \lambda_2 - \lambda_3 )\, v( \lambda_1 )\, e^{\lambda_1 t} + ( \lambda_3 - \lambda_1 )\, v( \lambda_2 )\, e^{\lambda_2 t} + ( \lambda_1 - \lambda_2 )\, v( \lambda_3 )\, e^{\lambda_3 t} \right) \tag{2.42}
\]
is a family of nontrivial solutions (the matrix undoes the change of variables used to eliminate entries in the leading matrix). Figure 2.1 shows a random selection of bounded solutions.

Figure 2.1: First coordinate of randomly generated solutions to the singular dae (2.41), given by (2.42). Random numbers have been sampled uniformly from the interval $[ 0.1, 1 ]$. Random real parts of exponents have been chosen with negative sign to produce bounded solutions. The number $\alpha$ in (2.42) is chosen with modulus 1, and such that real solutions are produced. Upper left: one random real exponent and one pair of random complex conjugates. Upper right: one exponent at 0 and one pair of complex conjugates. Lower left: three random real exponents. Lower right: one exponent at 0 and two random real exponents.

The example shows that if a singular dae is within the set of dae defined by an uncertain dae, the infinite set of solutions produced by the singular dae will contain solutions which could be the output of a regular system with any eigenvalues. Hence, the type of assumptions used in the thesis to restrict the set of solutions to the uncertain system, formulated in terms of system poles, will not be capable of ruling out the solutions of the singular dae. Indeed, the eigenvalues of a singular pencil are not even defined.
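The singularity claimed in the example is easy to confirm numerically: for the pair (2.41), the determinant of the pencil vanishes identically in $\lambda$. The sketch below also checks a right null vector $w(\lambda)$ of the pencil for every $\lambda$ (our own transport of the example's $v(\lambda)$ back to the original coordinates, matching the solutions (2.42)):

```python
import numpy as np

# The matrix pair from the example: rows 3 and 4 of E are zero while the
# corresponding rows of A are equal, so det(lambda*E + A) == 0 for every lambda
E = np.array([[1, 1, 1, 1],
              [0, 1, 1, 1],
              [0, 0, 0, 0],
              [0, 0, 0, 0]], dtype=float)
A = np.array([[1, 0, 1, 0],
              [0, 1, 1, 2],
              [1, 1, 1, 1],
              [1, 1, 1, 1]], dtype=float)

for lam in np.linspace(-2.0, 2.0, 9):
    assert abs(np.linalg.det(lam * E + A)) < 1e-12

# A right null vector of the pencil for every lambda (cf. the solutions (2.42))
def w(lam):
    return np.array([-1.0, 1.0 + lam, 1.0, -(1.0 + lam)])

for lam in (0.3, -1.5, 2.0):
    assert np.linalg.norm((lam * E + A) @ w(lam)) < 1e-12
```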
On the other hand, as will be seen in section 7.1.2, the uncertain dae that admit singular ones can be detected without additional assumptions, and this allows us to disregard this case as one which is not covered by our theory. Conversely, when our methods (including assumptions we have to make) show that the solutions to the dae converge uniformly as the uncertainties tend to zero, this shows that the uncertain dae is regular, and hence that the singular uncertain dae that are excluded from our theory are exceptional in this sense.

2.2.7 Initial conditions

The reader might have noticed that the shuffle algorithm (on page 30) not only produces an index and an implicit ode, but also a set of constraints. These constrain the solution at any point in time, and the implicit ode is only to be used where the constraints are satisfied. The constraints are often referred to as the algebraic constraints, which emphasizes that they are non-differential equations. They can be explicit, as in the case of non-differential equations in the dae as it is posed, or implicit, as in the case of the output from the shuffle algorithm. Of course, the constraint equations are not unique, and it may well happen that some of the equations output from the shuffle algorithm were explicit in the original dae.

Making sure that numerical solutions to dae do not leave the manifold defined by the algebraic constraints is a problem in itself, and several methods to ensure this exist. However, in theory, no special methods are required, since the produced implicit ode is such that an exact solution starting on the manifold will remain on the manifold. This brings up another practical issue, namely that initial value problems are ill-posed if the initial conditions they specify are inconsistent with the algebraic constraints.
Knowing that a dae can contain implicit algebraic constraints, how can we know that all implicit constraints have been revealed at the end of the index reduction procedure? If the original dae is square, any algebraic constraints will be present in differentiated form in the index 0 square dae. This implies that the solution trajectory will be tangent to the manifold defined by the algebraic constraints, and hence it is sufficient that the initial conditions for an initial value problem are consistent with the algebraic constraints for the whole trajectory to remain consistent. In other words, there exist solutions to the dae starting at any point which is consistent with the algebraic constraints, and this shows that there can be no other implicit constraints.

We shall take a closer look at this problem in section 3.3. Until then, we just note that rather than rejecting initial value problems as ill-posed if the initial conditions they specify are inconsistent with algebraic constraints, one usually interprets the initial conditions as a guess, and then applies some scheme to find truly consistent initial conditions that are close to the guess in some sense. The importance of this task is suggested by the fact that the influential Pantelides (1988) addressed exactly this, and it is no surprise (Chow, 1998) since knowing where a dae can be initialized entails having a characterization of the manifold to which all of the solution must belong.

Another structural approach to system analysis is presented in Unger et al. (1995). Their approach is similar to the one we propose in chapter 3. However, just as Pantelides' algorithm, it considers only the equation-variable graph, although it is not presented as a graph theoretical approach. A later algorithm which is presented as graph theoretical is given in Leitold and Hangos (2001), although a comparison to Pantelides' algorithm seems to be missing. In Leimkuhler et al.
(1991), consistent initial conditions are computed using difference approximations of derivatives, assuming that the dae is quasilinear and of index 1. Later, Veiera and Biscaia Jr. (2000) give an overview of methods to compute consistent initial conditions. It is noted that several successful approaches have been developed for specific applications where the equations are in a well understood form, and among other approaches (including one of their own) they mention that the method in Leimkuhler et al. (1991) has been extended by combining it with Pantelides' algorithm to analyze the system structure rather than assuming the quasilinear index 1 form. Their own method, presented in some more detail in Veiera and Biscaia Jr. (2001), is used to find initial conditions for systems starting in steady state, but allows for a discontinuity in forcing functions at the initial time.

Of all previously presented methods for analysis of dae, the one which most resembles that proposed in chapter 3 is found in Chowdhry et al. (2004). They propose a method similar to that in Unger et al. (1995), but take it one step further by making a distinction between linear and nonlinear dependencies in the dae. This allows lti dae to be treated exactly, which is an improvement over Unger et al. (1995), while performing at least as well in the presence of nonlinearities. In view of our method, the partitioning into structural zeros, constant coefficients, and nonlinearities seems somewhat arbitrary. However, they suggest that even more categories could be added to extend the class of systems for which the method is exact. The need for a rigorous analysis of how tolerances affect the algorithm is not mentioned.

2.2.8 Numerical integration

There are several techniques in use for the solution of dae. In this section, we mention some of them briefly, and explain one in a bit more detail. A classic accessible introduction to this subject is Brenan et al.
(1996), which contains many references to original papers and further theory.

The method we focus on in this section is applicable to equations with differentiation index 1, and this is the one we describe first. It belongs to a family referred to as backward difference formulas or bdf methods. The formula of the method tells how to treat $x'(t)$ in
\[
f( x'(t), x(t), t ) \overset{!}{=} 0
\]
when the problem is discretized. By discretizing a problem we refer to replacing the infinite-dimensional problem of computing the value of $x$ at each point of an interval, with a finite-dimensional problem from which the solution to the original problem can be approximately reconstructed. The most common way of discretizing problems is to replace the continuous function $x$ by a time series which approximates $x$ at discrete points in time:
\[
x_i \approx x( t_i )
\]
Reconstruction can then be performed by interpolation. A common approach to the interpolation is to do linear interpolation between the samples, but this will give a function which is not even differentiable at the sample points. To remedy this, interpolating splines can be used. This suggests another way to discretize problems, namely to represent the discretized solution directly in spline coefficients, which makes both reconstruction and treatment of $x'$ trivial. However, solving for such a discretization is a much more intricate problem than solving for a pointwise approximation.

Before presenting the bdf methods, let us just mention how the simple (forward) Euler step for ode fits into this framework. The problem is discretized by pointwise approximation, and the ode $x'(t) \overset{!}{=} g( x(t), t )$ is written as a dae by defining $f( \dot x, x, t ) \triangleq -\dot x + g( x, t )$. Replacing $x'(t_n)$ by the approximation $( x_{n+1} - x_n )/( t_{n+1} - t_n )$ then yields the familiar integration method:
\[
0 \overset{!}{=} f\!\left( \frac{x_{n+1} - x_n}{t_{n+1} - t_n},\; x_n,\; t_n \right)
\iff
0 \overset{!}{=} -\frac{x_{n+1} - x_n}{t_{n+1} - t_n} + g( x_n, t_n )
\iff
\]
\[
x_{n+1} \overset{!}{=} x_n + ( t_{n+1} - t_n )\, g( x_n, t_n )
\]
The k-step bdf method also discretizes the problem by pointwise approximation, but replaces $x'(t_n)$ by the derivative at $t_n$ of the polynomial which interpolates the points $( t_n, x_n ), ( t_{n-1}, x_{n-1} ), \ldots, ( t_{n-k}, x_{n-k} )$ (Brenan et al., 1996, section 3.1). We shall take a closer look at the 1-step bdf method, which given the solution up to $( t_{n-1}, x_{n-1} )$ and a time $t_n > t_{n-1}$ solves the equation
\[
f\!\left( \frac{x_n - x_{n-1}}{t_n - t_{n-1}},\; x_n,\; t_n \right) \overset{!}{=} 0
\]
to obtain $x_n$. Of course, selecting how far from $t_{n-1}$ we may select $t_n$ without getting too large errors in the solution is a very important question, but it is outside the scope of this background to cover this. A related topic of great importance is to ensure that the discretized solution converges to the true solution as the step size tends to zero, and when it does, to investigate the order of this convergence. Such analyses reveal how the choice of $k$ affects the quality of the solution, and will generally also give results that depend on the index of the equations. The following example does not give any theoretical insights, but just shows the importance of the index when solving a dae by the 1-step bdf method.

2.19 Example
Consider applying the 1-step bdf method to the square index 1 lti dae
\[
E\, x'(t) + A\, x(t) + B\, u(t) \overset{!}{=} 0
\]
Discretization leads to
\[
E\, \frac{x_n - x_{n-1}}{h_n} + A\, x_n + B\, u(t_n) \overset{!}{=} 0
\]
where $h_n = t_n - t_{n-1}$. By writing this as
\[
( E + h_n A )\, x_n \overset{!}{=} E\, x_{n-1} - h_n B\, u(t_n)
\]
we see that the iteration matrix
\[
E + h_n A \tag{2.43}
\]
must be non-singular for the solution to be well defined. Recalling that the differentiation index is revealed by the shuffle algorithm, we know that there exists a non-singular matrix $K$ such that
\[
K ( E + h_n A )
= \begin{bmatrix} \bar E + h_n \bar A \\ h_n \tilde A \end{bmatrix}
= \begin{bmatrix} I & \\ & h_n I \end{bmatrix}
\left( \begin{bmatrix} \bar E \\ \tilde A \end{bmatrix} + h_n \begin{bmatrix} \bar A \\ 0 \end{bmatrix} \right)
\]
where the first term within the parentheses is non-singular.
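The bdf discretization of example 2.19 is straightforward to implement: each step amounts to solving a linear system with the iteration matrix $E + h_n A$. A minimal sketch, where the particular $E$, $A$, $B$, input, and step size are our own illustrative choices, not taken from the thesis:

```python
import numpy as np

# Index 1 LTI DAE  E x'(t) + A x(t) + B u(t) = 0:
#   x1' + x1 - u = 0   (differential equation)
#   -x1 + x2     = 0   (algebraic constraint)
E = np.array([[1.0, 0.0], [0.0, 0.0]])
A = np.array([[1.0, 0.0], [-1.0, 1.0]])
B = np.array([-1.0, 0.0])

def bdf1_step(x_prev, h, u_n):
    # (E + h A) x_n = E x_prev - h B u(t_n); the iteration matrix must be non-singular
    return np.linalg.solve(E + h * A, E @ x_prev - h * B * u_n)

# Integrate with constant input u = 1 from a consistent initial condition
h, x = 0.01, np.array([0.0, 0.0])
for n in range(1000):
    x = bdf1_step(x, h, 1.0)

# The solution approaches the steady state A x + B u = 0, i.e. x = (1, 1)
assert np.allclose(x, [1.0, 1.0], atol=1e-2)
```

Note that the algebraic constraint $x_2 = x_1$ is satisfied exactly at every step, since it is present undifferentiated among the equations being solved.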
This proves the non-singularity of the iteration matrix (2.43) in general: the last factor in the factorization above is non-singular for $h_n = 0$, and will hence only be singular for finitely many values of $h_n$. Had the index been higher than 1, interpretation of the index via the shuffle algorithm reveals that the iteration matrix is singular for $h_n = 0$, and hence ill-conditioned for small $h_n$. (It can be shown that it is precisely the dae where the iteration matrix is singular for all $h_n$ that are not solvable at all (Brenan et al., 1996, theorem 2.3.1).) This shows that this method is limited to systems of index no more than 1. Note that the row operations that revealed the non-singularity also have practical use, since if applied before solving the dae, the condition number of the iteration matrix is typically improved significantly, and this condition is directly related to how errors in the estimate $x_{n-1}$ are propagated to errors in $x_n$.

The following example shows how to combine the shuffle algorithm with the 1-step bdf method to solve lti dae of arbitrary index.

2.20 Example
Consider solving an initial value problem for the square higher-index (solvable) lti dae
\[
E\, x'(t) + A\, x(t) + B\, u(t) \overset{!}{=} 0
\]
After some iterations of the shuffle algorithm (it can be shown that the index is bounded by the dimension of $x$ for well-posed problems, see the remark in algorithm 2.1), we will obtain the square dae
\[
\begin{bmatrix} \bar E_{\nu_{\mathrm{D}}-1} \\ 0 \end{bmatrix} x'(t)
+ \begin{bmatrix} \bar A_{\nu_{\mathrm{D}}-1} \\ \tilde A_{\nu_{\mathrm{D}}-1} \end{bmatrix} x(t) + \cdots \overset{!}{=} 0
\]
where the dependence on $u$ and its derivatives has been omitted for brevity. At this stage, the full set of algebraic constraints has been revealed, which we write
\[
C_{\nu_{\mathrm{D}}}\, x(t) + \cdots \overset{!}{=} 0
\]
It is known that
\[
\begin{bmatrix} \bar E_{\nu_{\mathrm{D}}-1} \\ \tilde A_{\nu_{\mathrm{D}}-1} \end{bmatrix}
\]
is full rank, where the lower block is contained in $C_{\nu_{\mathrm{D}}}$.
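A prototype of the shuffle iteration used in example 2.20 is easy to write down for lti dae. The sketch below is our own illustration, not the thesis's implementation: the svd-based row reduction and the rank tolerance are implementation choices, and input terms are omitted. Equations whose leading rows vanish are replaced by their differentiated counterparts until the leading matrix is non-singular, and the number of differentiations is the index:

```python
import numpy as np

def shuffle_index(E, A, tol=1e-9):
    """Differentiation index of the square LTI DAE E x' + A x = 0 (shuffle algorithm)."""
    E, A = E.astype(float).copy(), A.astype(float).copy()
    n = E.shape[0]
    for index in range(n + 1):
        U, s, _ = np.linalg.svd(E)
        r = int(np.sum(s > tol))
        if r == n:
            return index           # E non-singular: an implicit ODE has been reached
        Er, Ar = U.T @ E, U.T @ A  # row reduce: bottom n-r rows of Er are (numerically) zero
        # differentiate the algebraic rows: their trailing rows move into the leading matrix
        E = np.vstack([Er[:r], Ar[r:]])
        A = np.vstack([Ar[:r], np.zeros((n - r, n))])
    raise ValueError("singular pencil: the index is not defined")

# x1' + x2 = 0 together with x1 = 0 has differentiation index 2
E = np.array([[1.0, 0.0], [0.0, 0.0]])
A = np.array([[0.0, 1.0], [1.0, 0.0]])
print(shuffle_index(E, A))
```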
This shows that it is possible to construct a square dae of index 1 which contains all the algebraic constraints, by selecting as many independent equations as possible from the algebraic constraints, and completing with differential equations from the upper block of the index 0 system. Note that the resulting index 1 system has a special structure; there is a clear separation into differential and non-differential equations. This is valuable when the equations are integrated, since it allows row scaling of the equations so as to improve the condition of the iteration matrix; compare the previous example.

In the previous example, a higher index dae was transformed to a square index 1 dae which contained all the algebraic constraints. Why not just compute the implicit ode and apply an ode solver, or apply a bdf method to the index 1 equations just before the last iteration of the shuffle algorithm? The reason is that there is no magic in the ode solvers or the bdf method; they cannot guarantee that algebraic constraints which are not present in the equations they see remain satisfied even though the initial conditions are consistent. Still, the algebraic constraints are not violated arbitrarily; for consistent initial conditions, the true solution will remain on the manifold defined by the algebraic constraints, and it is only due to numerical errors that the computed solution will drift away from this manifold. By including the algebraic constraints in the index 1 system, it is ensured that they will be satisfied at each sampling instant of the computed solution.

There is another approach to integration of dae which seems to be gradually replacing the bdf methods in many implementations. These are the implicit Runge–Kutta methods, and early work on their application to dae includes Petzold (1986) and Roche (1989). Although these methods are basically applicable to dae of higher index, poor convergence is prohibitive unless the index is low.
(Compare the 1-step bdf method, which is not at all applicable unless the index is at most 1.) The class of irk methods is large, and this is where the popular Radau IIa belongs.

Having seen that higher index dae require some kind of index-reducing treatment, we finish this section by recalling that index reduction and index deduction are closely related, and that both the shuffle algorithm (revealing the differentiation index) and the algorithm that is used to compute the strangeness index may be used to produce equations of low index. In the latter context, one speaks of producing strangeness-free equations.

2.2.9 Existing software

To round off our introductory background on dae topics, some existing software for the numerical integration of dae will be mentioned. However, as numerical integration is merely one of the applications of the work in this thesis, the methods will only be mentioned very briefly, just to give an idea of what sort of tools there are.

The first report on dassl (Brenan et al., 1996) was written by Linda Ruth Petzold in September 1982. It is probably the best known dae solver, but has been superseded by an extension called daspk (Brown et al., 1994). Both dassl and daspk use a bdf method with dynamic selection of order (1-step through 5-step) and step size, but the latter is better at handling large and sparse systems, and is also better at finding consistent initial conditions. The methods in daspk can also be found in the more recent ida (dating 2005) (Hindmarsh et al., 2004), which is part of the software package sundials (Hindmarsh et al., 2005). The name of this software package is an abbreviation of SUite of Nonlinear and DIfferential/Algebraic equation Solvers, and the emphasis is on the movement from Fortran source code to C. The ida solver is the dae solver used by the general-purpose scientific computing tool Mathematica.
While the bdf methods in the software mentioned so far require that the user ensures that the index is sufficiently reduced, the implementations built around the strangeness index perform index reduction on the fly. Another interesting difference is that the solvers we find here also implement irk methods besides bdf. In 1995, the first version of gelda (Kunkel et al., 1995) (a GEneral Linear Differential Algebraic equation solver) appeared. It applies to linear time-varying dae, and there is an extension called genda (Kunkel and Mehrmann, 2006) which applies to general nonlinear systems. The default choice for integration of the strangeness-free equations is the Radau IIa irk method implemented in radau5 (Hairer and Wanner, 1991). (Regarding the use of ida in Mathematica mentioned above: version 7 being the current version, see http://reference.wolfram.com/mathematica/tutorial/NDSolveIDAMethod.html, or the corresponding part of the on-line help.)

2.3 Initial condition response bounds

The initial condition response of a system is the solution to the dynamic equations of the system when all forcing functions have been set to zero, given an initial state. Since setting all forcing functions to zero yields an autonomous system, the study of initial condition responses is the study of autonomous systems. For linear systems, the output of a system with forcing functions is the sum of the initial condition response and the response to the forcing functions from a zero initial state. Hence, initial condition responses are also important for the understanding of systems which are not autonomous.

One of the key problems is to bound the largest possible gain from initial conditions to the state at any later time.
For linear systems, the state at time $t$ is given by the transition matrix (Rugh, 1996, chapter 3), sometimes known as the fundamental matrix,
\[
x(t) = \Phi( t, 0 )\, x(0)
\]
Hence, the gain to be bounded may be expressed as
\[
\sup_{t \ge 0}\; \| \Phi( t, 0 ) \|_2
\]
and we will often use language in terms of the transition matrix and initial condition responses interchangeably. We are interested in systems which are asymptotically stable (below, we will use stronger stability conditions, such as definition 2.37), for which it is meaningful to seek bounds that hold at all future times.

2.3.1 LTI ODE

For linear time-invariant systems,
\[
x'(t) = M\, x(t)
\]
the transition matrix is given by $\Phi( t, 0 ) = e^{M t}$. Bounding the matrix $e^{M t}$ is a fundamental problem which has been studied by many, and this section contains a selection of results from the literature. Before we start, however, we recall one of the basic results by Lyapunov.

2.21 Theorem (An inverse Lyapunov theorem). If $M$ is a Hurwitz matrix (that is, $\alpha( M ) < 0$), then there exists a symmetric positive definite matrix $P$ satisfying the (time-invariant) Lyapunov equation
\[
M^{\mathrm{T}} P + P M \overset{!}{=} -I \tag{2.44}
\]
(The matrix $I$ may be replaced by any positive definite matrix.) The solution is given by
\[
P = \int_0^\infty e^{M^{\mathrm{T}} t}\, e^{M t}\, \mathrm{d}t \tag{2.45}
\]
Proof: This is a well-known result; for instance, see Rugh (1996, theorem 7.11).

Generally speaking, a Lyapunov function is a function used to prove stability properties of a system. They are used for lti, ltv, as well as nonlinear systems. The idea is that the function shall be a continuously differentiable non-negative function of the state, being 0 only at the origin, and such that when it is composed with the state trajectory, it becomes a decreasing function of time. If a function is intended to be a Lyapunov function, but it hasn't been proven so yet (for instance, because it has some parameters to be determined first), it is often referred to as a Lyapunov function candidate.
See Khalil (2002, chapter 4) for an introduction in the general context of nonlinear systems. The primary purpose of theorem 2.21 is to use $x \mapsto x^{\mathrm{T}} P x$ as a Lyapunov function for the system $x' = M x$, and doing so it is easy to derive a constant bound on $e^{M t}$ for Hurwitz $M$.

2.22 Theorem. If $M$ is a Hurwitz matrix, then
\[
\| e^{M t} \|_2 \le \sqrt{ \| P \|_2\, \| P^{-1} \|_2 }, \quad \text{for all } t \ge 0
\]
where $P$ is the symmetric positive definite matrix whose existence is ensured by theorem 2.21.

Proof: Consider solutions to the differential equation $x'(t) = M x(t)$ with initial condition $x(0) = x^0$, and let $t \ge 0$. Then $| x(t) | = | e^{M t} x^0 |$. Since $x \mapsto x^{\mathrm{T}} P x$ is a Lyapunov function, it follows that $x(t)^{\mathrm{T}} P x(t) \le x(0)^{\mathrm{T}} P x(0)$. Knowing that $P$ is a symmetric positive definite matrix, we may conclude that $x(t)^{\mathrm{T}} P x(t) \ge \sigma_{\min}( P )\, | x(t) |^2$, and $x(0)^{\mathrm{T}} P x(0) \le \sigma_{\max}( P )\, | x(0) |^2$. Noting that $\sigma_{\min}( P ) = \| P^{-1} \|_2^{-1}$ and $\sigma_{\max}( P ) = \| P \|_2$, one obtains
\[
| e^{M t} x^0 | = | x(t) | \le \sqrt{ \| P \|_2\, \| P^{-1} \|_2 }\; | x^0 |
\]
Since $x^0$ was arbitrary, this implies the result.

In Gurfil and Jodorkovsky (2003), the Lyapunov method of theorem 2.22 was applied in combination with convex optimization techniques to find a matrix $P$ with small condition number. The beauty of the method of using Lyapunov functions is that it is not restricted to linear systems.

Theorem 2.22 is very coarse. For instance, at $t = 0$ it is clear that $\| e^{M \cdot 0} \|_2 = 1$, while the theorem completely fails to capture this. Additionally, it is well known that $M$ being Hurwitz implies that $\| e^{M t} \|_2 \to 0$ as $t \to \infty$, and the theorem fails to capture this too. A common technique is to obtain decaying bounds by using shifts.

2.23 Lemma. For any scalar $z$,
\[
\| e^{M t} \|_2 = e^{-\operatorname{Re} z}\, \| e^{M t + I z} \|_2 \tag{2.46}
\]
Proof: Since $M t$ and $I z$ commute, $e^{M t + I z} = e^{M t} e^{I z} = e^{z} e^{M t}$. Taking norms on both sides and solving for $\| e^{M t} \|_2$ gives the result.
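Theorem 2.22 lends itself to a direct numerical check. The sketch below solves the Lyapunov equation (2.44) by vectorization with Kronecker products, then compares $\| e^{M t} \|_2$ on a time grid against $\sqrt{ \| P \|_2 \| P^{-1} \|_2 }$. The non-normal matrix $M$ is an arbitrary illustration, and the eigendecomposition-based matrix exponential assumes $M$ is diagonalizable:

```python
import numpy as np

M = np.array([[-1.0, 5.0], [0.0, -2.0]])  # Hurwitz, deliberately non-normal
n = M.shape[0]

# Solve M^T P + P M = -I by vectorization:
# (kron(I, M^T) + kron(M^T, I)) vec(P) = -vec(I)
L = np.kron(np.eye(n), M.T) + np.kron(M.T, np.eye(n))
P = np.linalg.solve(L, -np.eye(n).reshape(-1)).reshape(n, n)
assert np.allclose(M.T @ P + P @ M, -np.eye(n))  # sanity check

def expm(Mt):
    # matrix exponential via eigendecomposition (assumes diagonalizability)
    w, V = np.linalg.eig(Mt)
    return (V @ np.diag(np.exp(w)) @ np.linalg.inv(V)).real

bound = np.sqrt(np.linalg.norm(P, 2) * np.linalg.norm(np.linalg.inv(P), 2))
for t in np.linspace(0.0, 20.0, 201):
    assert np.linalg.norm(expm(M * t), 2) <= bound + 1e-9
```

The bound is constant in $t$, which illustrates the coarseness discussed in the text: it neither equals 1 at $t = 0$ nor decays as $t \to \infty$.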
Applying lemma 2.23 to theorem 2.22 with $z = a\, t$ for $a \in ( 0, -\alpha( M ) )$ gives
\[
\| e^{M t} \|_2 \le e^{-a t} \sqrt{ \| P_a \|_2\, \| P_a^{-1} \|_2 } \tag{2.47}
\]
where $P_a$ is the solution to (2.44) with the Hurwitz $M + a I$ instead of $M$. Note the form of this bound; it is the product of one finite expression which depends only on $M$, and one expression which is exponentially decaying for Hurwitz $M$.

Better bounds were derived in Van Loan (1977). For instance, the next theorem gives a bound which is able to capture the exponential decay. In Veselić (2003, equation (13)) there is a reference to a similar result, where the same bound as above is multiplied with the exponentially decaying factor $e^{-t/( 2 \| P \|_2 )}$.

2.24 Theorem. If $M$ has Schur decomposition $M = Q ( D + N ) Q^{\mathrm{H}}$ ($Q$ will be unitary, $D$ diagonal, and $N$ strictly upper triangular), then
\[
\| e^{M t} \|_2 \le e^{\alpha( M )\, t} \sum_{k=0}^{n-1} \frac{ \| N t \|_2^{k} }{ k! } \tag{2.48}
\]
Proof: See the derivation of Van Loan (1977, equation (2.11)).

The theorem captures the fact that the problem is trivial if $M$ is normal, as this implies $N = 0$, but this will not be the case in this thesis; this is a matter of what we are willing to assume about $M$. More generally, if we were willing to make assumptions about $\| N \|_2$, being the departure from normality of $M$, this would also immediately yield a bound by theorem 2.24. However, we are inclined to only make assumptions about system features, and this measure's being invariant under norm-preserving changes of variables does not convince us that it could rightfully be considered a system feature; it is the restriction to norm-preserving transformations which bothers us.

Making shifts in (2.48) makes no difference, but a bound comparable with (2.47) is still easy to derive. The following two results (except for the shifted bound) appeared in Tidefelt and Glad (2008) and are much more simple than tight.

2.25 Corollary. The matrix exponential is bounded according to
\[
\| e^{M t} \|_2 \le e^{\alpha( M )\, t} \sum_{i=0}^{n-1} \frac{ ( 2 \| M \|_2 )^i\, t^i }{ i! } \tag{2.49}
\]
Proof: Let $Q^{\mathrm{H}} M Q = D + N$ be a Schur decomposition of $M$, and use $\| N \|_2 = \| Q^{\mathrm{H}} M Q - D \|_2 \le \| M \|_2 + \| M \|_2$ in theorem 2.24.

2.26 Lemma. If the map $M$ is Hurwitz, that is, $\alpha( M ) < 0$, then for $t \ge 0$,
\[
\| e^{M t} \|_2 \le e^{\, 2 e^{-1} n \frac{ \| M \|_2 }{ -\alpha( M ) } } \tag{2.50}
\]
Further, shifting with $a \in ( 0, -\alpha( M ) )$ results in
\[
\| e^{M t} \|_2 \le e^{-a t}\, e^{\, 2 e^{-1} n \frac{ \| M \|_2 }{ -( \alpha( M ) + a ) } } \tag{2.51}
\]
Proof: Let $f( t ) \triangleq \| e^{M t} \|_2$. From corollary 2.25 we have that
\[
f( t ) \le \sum_{i=0}^{n-1} e^{\alpha( M )\, t} \frac{ ( 2 \| M \|_2 )^i\, t^i }{ i! } \triangleq \sum_i f_i( t )
\]
Each $f_i( t )$ can easily be bounded globally since they are smooth, tend to 0 from above as $t \to \infty$, and the only stationary point is found via $f_i'( t )$. From
\[
f_i'( t ) = e^{\alpha( M )\, t} \frac{ ( 2 \| M \|_2 )^i\, t^{i-1} }{ i! } \left( t\, \alpha( M ) + i \right)
\]
it follows that the stationary point is $t = -\frac{i}{\alpha( M )}$. Hence,
\[
f_i( t ) \le f_i\!\left( -\frac{i}{\alpha( M )} \right) = e^{-i}\, \frac{ \left( \frac{ 2 \| M \|_2\, i }{ -\alpha( M ) } \right)^i }{ i! } \le \frac{ \left( 2 e^{-1} n \frac{ \| M \|_2 }{ -\alpha( M ) } \right)^i }{ i! }
\]
and it follows that
\[
f( t ) \le \sum_{i=0}^{n-1} \frac{ \left( 2 e^{-1} n \frac{ \| M \|_2 }{ -\alpha( M ) } \right)^i }{ i! } \le \sum_{i=0}^{\infty} \frac{ \left( 2 e^{-1} n \frac{ \| M \|_2 }{ -\alpha( M ) } \right)^i }{ i! } = e^{\, 2 e^{-1} n \frac{ \| M \|_2 }{ -\alpha( M ) } }
\]
The shifted result follows immediately from (2.50).

However, after the development of this result, the theorem below was found in the literature. The bounds provided by the two theorems are both functions of the same ratio between a matrix' norm and the smallest distance from any eigenvalue to the imaginary axis, and hence they are equivalent for our qualitative convergence results. However, for practical purposes, when quality must be turned into quantity, the theorem below offers a tremendous advantage.

2.27 Theorem. For a Hurwitz matrix $M \in \mathbf{R}^{n \times n}$ and $t \ge 0$, the matrix exponential is bounded as
\[
\| e^{M t} \|_2 \le \gamma( n )\, e^{\alpha( M )\, t / 2} \left( \frac{ \| M \|_2 }{ -\alpha( M ) } \right)^{n-1} \tag{2.52}
\]
where
\[
\gamma( n ) = 1 + \sum_{i=1}^{n-1} \frac{ 4^i\, ( i\, e^{-1} )^i }{ i! }
\]
Proof: This is Godunov (1997, proposition 3.3, p 20) extended with the expression for $\gamma( n )$, which can easily be extracted from the proof.
Comparing (2.47) with (2.51), we note that the former bound involves the non-trivial dependence on $M$ through the solution to the Lyapunov equation (2.44), while the latter often grossly over-estimates the norm it bounds, but uses only very elementary properties of the matrix. However, the condition number of the solution to the Lyapunov equation may be bounded without actually solving the equation, by application of bounds listed in the survey Kwon et al. (1996, equation (70) and equation (87)) (in their notation, $\| P_a \|_2 = \alpha_1$, $\| P_a^{-1} \|_2 = \alpha_n^{-1}$, and $\| M + a I \|_2 = \gamma_1$). The only upper bound they list for $\alpha_1$ makes use of twice the logarithmic norm (see Ström (1975) for properties of this norm and further references) of $M + a I$, being $\alpha( M + M^{\mathrm{T}} + 2 a I )$, and requires this to be negative. When this is the case, the following bound is obtained,
\[
\sqrt{ \| P_a \|_2\, \| P_a^{-1} \|_2 } \le \sqrt{ \frac{ \| M + a I \|_2 }{ -\alpha( M + M^{\mathrm{T}} + 2 a I ) } }
\]
but unfortunately the logarithmic norm may be positive even though $\alpha( M + a I )$ is negative. Hence, we are unable to derive from (2.47) a bound that is both exponentially decaying whenever $\alpha( M )$ is negative, and expressed without direct reference to the solution to the Lyapunov equation.

While theorem 2.24 both gives a tight estimate at $t = 0$ and exhibits the true rate of the exponential decay, the polynomial coefficient makes the estimate very conservative, even for small $t$. Since the tightness of bounds for the matrix exponential is directly related to how well our results in this thesis are connected to applications (by means of deriving useful quantitative bounds), we shall end our discussion of matrix exponential bounds with a recent result which appears in Veselić (2003). However, while the original presentation is concerned with exponentially stable semigroups (which may be infinite-dimensional), the results are stated here in terms of matrices to make the results more accessible to readers unfamiliar with the original framework.
The bounds are formulated using the following two scalar functions of a matrix:
\[
\delta( M ) \overset{!}{=} 2 \sup_{|x|=1} \operatorname{Re}\, x^{\mathrm{H}} M x
\qquad
\gamma( M ) \overset{!}{=} 2 \inf_{|x|=1} \operatorname{Re}\, x^{\mathrm{H}} M x
\]
For their forthcoming analysis, it is assumed that $\gamma( M ) < \delta( M )$. The first of these definitions, $\delta( M )$, may be recognized as twice the logarithmic norm of $M$. Among the properties for the logarithmic norm in Ström (1975, lemma 1c), we note
\[
\alpha( M ) \le \tfrac{1}{2}\, \delta( M ) \le \| M \|_2 \tag{2.53}
\]
and the following alternative formulation shows its close connection to the norm of the matrix exponential
\[
\delta( M ) = 2 \lim_{h \to 0^+} \frac{ \| e^{M h} \|_2 - 1 }{ h }
\]
Veselić (2003) also reminds that $\delta( M ) = \alpha( M + M^{\mathrm{T}} )$, and regarding the second of the definitions, it is shown that $\gamma( M ) \le -\| P \|_2^{-1}$, where $P$ is the solution to (2.44).

2.28 Theorem. For Hurwitz $M$,
\[
\| e^{M t} \|_2 \le
\begin{cases}
e^{\delta( M )\, t / 2}, & t \le h_0( M ) \\[1ex]
\left( \dfrac{ 1 + \delta( M )\, \| P \|_2 }{ 1 - \delta( M ) / \gamma( M ) } \right)^{\frac{1}{2} + \frac{1}{2\, \delta( M )\, \| P \|_2}} e^{-t / ( 2 \| P \|_2 )}, & h_0( M ) \le t
\end{cases} \tag{2.54}
\]
where
\[
h_0( M ) = \frac{1}{ \delta( M ) } \ln\!\left( \frac{ 1 + \delta( M )\, \| P \|_2 }{ 1 - \delta( M ) / \gamma( M ) } \right) \tag{2.55}
\]
and $P$ is the solution to (2.44).

Proof: See Veselić (2003, theorem 4).

2.3.2 LTV ODE

When we consider linear time-varying systems in chapter 8, we extend results from Kokotović et al. (1986, section 5.2) and we shall make use of some results from there.

2.29 Lemma. Let $\phi( t, s )$ denote the transition matrix of the time-scaled ltv system
\[
m\, z'(t) = M(t)\, z(t)
\]
Assume that there exist a time interval $I$ and constants $c_1 > 0$, $c_2$, $\beta$, such that
\[
\forall\, t \in I :\; \alpha( M(t) ) \le -c_1
\qquad
\forall\, t \in I :\; \| M(t) \|_2 \le c_2
\qquad
\forall\, t \in I :\; \| M'(t) \|_2 \le \beta
\]
Then there exist positive constants $m_0$, $a$, $K$, such that for all $m < m_0$, and $s$, $t$ in $I$,
\[
t \ge s \;\Rightarrow\; \left\| \phi( t, s ) - e^{M(s) ( t - s ) / m} \right\|_2 \le m\, K\, e^{-a ( t - s ) / m} \tag{2.56}
\]
Proof: See Kokotović et al. (1986, lemma 5:2.2), with further references to similar results given in Kokotović et al. (1986, section 5:10).
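Before moving on, we note that the quantity $\delta( M )$ above is directly computable as $\alpha( M + M^{\mathrm{T}} )$, and the first branch of (2.54), the classical logarithmic-norm bound $\| e^{M t} \|_2 \le e^{\delta( M ) t / 2}$, can be checked numerically. In the sketch, $M$ is an arbitrary non-normal illustration; its $\delta( M )$ turns out positive even though $M$ is Hurwitz, illustrating the caveat about the logarithmic norm mentioned earlier:

```python
import numpy as np

M = np.array([[-1.0, 5.0], [0.0, -2.0]])  # Hurwitz but non-normal (illustration)

# delta(M) = 2 * sup_{|x|=1} Re x^H M x = largest eigenvalue of M + M^T
delta = np.linalg.eigvalsh(M + M.T).max()
assert delta > 0  # positive despite alpha(M) = -1 < 0

def expm(Mt):
    # matrix exponential via eigendecomposition (assumes diagonalizability)
    w, V = np.linalg.eig(Mt)
    return (V @ np.diag(np.exp(w)) @ np.linalg.inv(V)).real

# The bound ||e^{Mt}||_2 <= e^{delta t / 2} holds for all t >= 0
for t in np.linspace(0.0, 5.0, 101):
    assert np.linalg.norm(expm(M * t), 2) <= np.exp(delta * t / 2) + 1e-9
```

For small $t$ the bound is nearly tight for this non-normal $M$, which is consistent with the limit formulation of $\delta( M )$ given above.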
While discussing linear time-varying systems, we take the opportunity to give a definition related to transformations of such systems, even though it is not particularly related to the bounding of initial condition responses. Consider the change of variables
\[
T(t)\, z(t) = x(t) \tag{2.57}
\]
in the system
\[
x'(t) = M(t)\, x(t) \tag{2.58}
\]
Via the intermediate dae,
\[
T'(t)\, z(t) + T(t)\, z'(t) = M(t)\, T(t)\, z(t)
\]
the ode in $z$ is found to be
\[
z'(t) = \left( T(t)^{-1} M(t)\, T(t) - T(t)^{-1} T'(t) \right) z(t) \tag{2.59}
\]
2.30 Definition (Lyapunov transformation). The square time-varying matrix $T$ is called a Lyapunov transformation if it is continuously differentiable, $T(t)$ is invertible for every $t$, and there are time-invariant constants bounding $\| T(t) \|_2$ and $\| T(t)^{-1} \|_2$ for all $t$.

Knowing that a transformation matrix is a Lyapunov transformation allows us to work with the transformed system instead of the original one, knowing that the qualitative properties will be the same, and when we are done, we apply the reverse transformation to obtain results for the original system. For a theoretical application of this definition, see for instance Rugh (1996, theorem 6.15).

2.3.3 Uncertain LTI ODE

Bounds on the initial condition response of an uncertain system are closely related to perturbation theory, so there is a strong connection between the present section and section 2.4.1 below. The bounds mentioned so far only apply to exactly known systems, while the applications in this thesis will concern uncertain systems. In Boyd et al. (1994), a bound for linear differential inclusions (see section 2.4.2) is given as a linear matrix inequality optimization problem. The technique is based on the idea to use Lyapunov functions as described above. However, the classes of uncertainty that can be handled cannot cater for the problems we encounter in later chapters.
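Returning briefly to the change of variables (2.57)–(2.59): it can be verified numerically. The sketch below uses a time-varying rotation as $T(t)$ (a Lyapunov transformation, since rotations and their inverses are bounded), integrates the transformed ode (2.59) with a classical Runge–Kutta scheme, and checks that $T(t) z(t)$ reproduces $x(t)$. All concrete choices of $M$, $T$, and tolerances are our own illustrations:

```python
import numpy as np

M = np.array([[-1.0, 2.0], [0.0, -2.0]])

def T(t):   # a rotation: bounded with bounded inverse, hence a Lyapunov transformation
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s], [s, c]])

def dT(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[-s, -c], [c, -s]])

def f(t, z):  # transformed ODE (2.59): z' = (T^{-1} M T - T^{-1} T') z
    Ti = np.linalg.inv(T(t))
    return (Ti @ M @ T(t) - Ti @ dT(t)) @ z

def rk4(f, z, t0, t1, steps):
    h = (t1 - t0) / steps
    for i in range(steps):
        t = t0 + i * h
        k1 = f(t, z); k2 = f(t + h/2, z + h/2*k1)
        k3 = f(t + h/2, z + h/2*k2); k4 = f(t + h, z + h*k3)
        z = z + h/6*(k1 + 2*k2 + 2*k3 + k4)
    return z

x0 = np.array([1.0, 1.0])
z1 = rk4(f, np.linalg.inv(T(0.0)) @ x0, 0.0, 1.0, 1000)

# x(1) from the closed-form solution of x' = M x (M is upper triangular here)
x1 = np.array([np.exp(-1.0) + 2*(np.exp(-1.0) - np.exp(-2.0)), np.exp(-2.0)])
assert np.linalg.norm(T(1.0) @ z1 - x1) < 1e-6
```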
An alternative to convex optimization might be to use the plethora of bounds on the eigenvalues (or, equivalently, the singular values) of the solution to the Lyapunov equation. The survey Kwon et al. (1996) contains many such bounds, including the following theorem.

2.31 Theorem. Let M be Hurwitz. The solution P to the Lyapunov equation (2.44) satisfies

‖P‖₂ ≥ 1 / ( 2 ‖M‖₂ )   (2.60)

Proof: This is a special case of the main result in Shapiro (1974) — the general case allows the right hand side of (2.44) to be an arbitrary negative definite matrix instead of just −I. However, the current case is trivial since

1 = ‖−I‖₂ = ‖P M + M^T P‖₂ ≤ 2 ‖P‖₂ ‖M‖₂

2.4 Regular perturbation theory

By a regular perturbation we refer to perturbations of the expression for the time derivative in an ode. In the literature, the perturbations often occur in just one small parameter, but we shall not restrict our notion to this case. Instead, we let perturbation theory refer to any theory which aims to describe how the solutions to equations depend on small parameters in the equations. The perturbation parameters may be used to model uncertainties, but the theory may also be useful for known quantities. In this and the following sections, we only consider perturbations of differential equations (compare lemma 2.46, which concerns the perturbation problem in matrix inversion). Like the perturbed equations themselves, the perturbation parameters may or may not be allowed to be time-varying.

2.4.1 LTI ODE

Since the solution to the initial value problem

x'(t) = M x(t),  x(0) = x₀   (2.61)

is given by x(t) = e^{Mt} x₀, understanding the perturbed problem

z'(t) = ( M + F ) z(t),  z(0) = x₀   (2.62)

becomes a matter of understanding the sensitivity of the matrix exponential with respect to perturbations.
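Theorem 2.31 can be checked numerically with SciPy's continuous Lyapunov solver; the matrix below is a hypothetical Hurwitz example. Calling solve_continuous_lyapunov with M^T solves M^T P + P M = −I, which is the form (2.44):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Hypothetical Hurwitz example matrix.
M = np.array([[-1.0, 5.0],
              [0.0, -2.0]])

# Solve P M + M^T P = -I, i.e. equation (2.44).
P = solve_continuous_lyapunov(M.T, -np.eye(2))
assert np.allclose(P @ M + M.T @ P, -np.eye(2))

# P is symmetric positive definite since M is Hurwitz.
assert np.all(np.linalg.eigvalsh(P) > 0)

# Theorem 2.31: ||P||_2 >= 1 / (2 ||M||_2).
assert np.linalg.norm(P, 2) >= 1.0 / (2 * np.linalg.norm(M, 2))
```

The inequality mirrors the one-line proof: taking norms in P M + M^T P = −I gives 1 ≤ 2 ‖P‖₂ ‖M‖₂.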
The sensitivity of the norm of the matrix exponential was the theme of Van Loan (1977), to which we have referred previously regarding bounds on the matrix exponential. It turns out that it is the bound on the matrix exponential (section 2.3.1) which is the key to the relative sensitivity, formalized by the following lemma.

2.32 Lemma. Assume there exist a monotonically increasing function γ on [ 0, ∞ ) and a constant β such that t ≥ 0 implies

‖e^{Mt}‖₂ ≤ γ(t) e^{β t}

Then

‖e^{( M+F )t} − e^{Mt}‖₂ ≤ ‖F‖₂ t γ(t)² e^{[ β − α( M ) + ‖F‖₂ γ(t) ] t} ‖e^{Mt}‖₂   (2.63)

Proof: This is Van Loan (1977, lemma 1).

The lemma should be compared with the outer approximations of the reachable sets in example 2.36 on page 61. For any choice of β and γ, the lemma implies that restriction to a finite time interval makes the perturbations in the solutions O( ‖F‖₂ ). While the lemma bounds |z(t)| by a factor times |x(t)|, we are often interested in a different bound, namely the absolute difference between the two, |z(t) − x(t)|.

2.33 Lemma. Assume that the nominal system (2.61) is stable, and that there exist a polynomial γ and a constant β < 0 such that t ≥ 0 implies

‖e^{( M+F )t}‖₂ ≤ γ(t) e^{β t}

Then there is a finite constant k such that the solution to the perturbed system (2.62) satisfies

sup_{t≥0} |z(t) − x(t)| = k ‖F‖₂

Proof: Introducing y = z − x, we find

y'(t) = ( M + F ) y(t) + F x(t),  y(0) = 0

with solution

y(t) = ∫₀ᵗ e^{( M+F )( t−τ )} F x(τ) dτ

Since the nominal system is stable, |x| will be a bounded function, say sup_{t≥0} |x(t)| ≤ x̄. Then we get the estimate

|y(t)| ≤ ‖F‖₂ x̄ ∫₀ᵗ ‖e^{( M+F )τ}‖₂ dτ ≤ ‖F‖₂ x̄ ∫₀ᵗ γ(τ) e^{β τ} dτ

Here, the integrand has a primitive function which is also a polynomial (with coefficients depending on β) times e^{β t}. Hence, the integral will be bounded independently of t, which completes the proof.
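The O( ‖F‖₂ ) behaviour in lemma 2.33 can be observed numerically. In the sketch below (hypothetical M, F, and x₀, chosen only for illustration), scaling the perturbation by 1/10 scales sup_t |z(t) − x(t)| by roughly 1/10:

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical stable nominal system and initial condition.
M = np.array([[0.0, 1.0], [-1.0, -1.0]])
x0 = np.array([0.0, 1.0])
ts = np.linspace(0.0, 15.0, 200)

def sup_dev(F):
    # sup_t |z(t) - x(t)| on a grid, for z' = (M+F) z, x' = M x, common x0.
    return max(np.linalg.norm((expm((M + F) * t) - expm(M * t)) @ x0)
               for t in ts)

# Hypothetical small perturbation matrix.
F = 1e-3 * np.array([[0.2, -0.5], [0.3, 0.1]])
d1, d2 = sup_dev(F), sup_dev(F / 10)

# Linear scaling in ||F||_2, as the lemma predicts (ratio close to 10).
assert 5 < d1 / d2 < 20
```

This only demonstrates the scaling on one example; the lemma's constant k is what makes the statement uniform over all sufficiently small F.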
Several possible choices of polynomials and exponents to use with lemma 2.33 are listed in Van Loan (1977), but we find theorem 2.27 particularly convenient, since it only relies on two basic properties of the perturbed matrix. Clearly, the number k provided by the lemma could be improved if we also used that x will satisfy an exponentially decaying bound, which may be important to utilise in applications where quantitative perturbation bounds need to be as tight as possible.

2.34 Lemma. Consider the perturbed solution over the finite interval [ 0, t_f ]. Then there is a finite constant k such that the solution to the perturbed system (2.62) satisfies

sup_{t∈[ 0, t_f ]} |z(t) − x(t)| = k ‖F‖₂

Proof: Compare the proof of lemma 2.33. Since the nominal solution x will be defined on a compact interval, it will be a bounded function. For the integral, we may use the coarse over-estimate

∫₀ᵗ ‖e^{( M+F )τ}‖₂ dτ ≤ ∫₀ᵗ e^{‖M+F‖₂ τ} dτ = ( e^{‖M+F‖₂ t} − 1 ) / ‖M + F‖₂

2.4.2 LTV ODE

We now turn to perturbations of the system

x'(t) = M(t) x(t),  x(t₀) = x₀   (2.64)

with transition matrix φ. Let the time interval I be defined as [ t₀, ∞ ), so that ‖M‖_I ≤ α is the same as saying that ‖M(t)‖₂ ≤ α for all t ≥ t₀. We will often consider t₀ = 0 without loss of generality. The following definition turns out to be useful.

2.35 Definition (Uniformly bounded-input, bounded-state stable). The ltv system

x'(t) = M(t) x(t) + B(t) u(t),  x(t₀) = 0

with input u is called uniformly bounded-input, bounded-state stable if there exists a finite constant γ such that for any t₀ ≥ 0,

sup_{t≥t₀} |x(t)| ≤ γ sup_{t≥t₀} |u(t)|

See Rugh (1996, note 12.1) regarding some subtleties of definition 2.35. Three types of results for the perturbation of (2.64) dominate the literature. The first type is confined to the stability properties of the perturbed system, and the amount of results shows that this is both important and non-trivial.
Some results which will be useful in later chapters are included below. The second type, well explained in Khalil (2002, chapter 10), addresses the effect that a scalar perturbation has on the solutions; here, the amount of literature is likely related to the many application areas where the corresponding methods have been successful. Since we are mainly interested in non-scalar perturbations in this thesis, we shall not give an account of the scalar perturbation results, but turn attention to the solutions of the perturbed equation

z'(t) = [ M(t) + F(t) ] z(t),  z(0) = x₀   (2.65)

where there is a bound ‖F‖_I ≤ f₀. Introducing y = z − x yields the system

y'(t) = [ M(t) + F(t) ] y(t) + F(t) x(t),  y(0) = 0   (2.66)

which can be handled by showing that (2.66) is uniformly bounded-input, bounded-state stable from the input F(t) x(t). Since the input to the system decays with the size of F, uniform convergence of y to zero follows (provided that the input-state relation is bounded uniformly not only in the input, but also in the perturbation). We refer to Rugh (1996, chapter 12) for the definitions and basic results. Clearly, using the gain provided by the uniformly bounded-input, bounded-state stability property will result in very conservative perturbation bounds, since they will only depend on the peak value of |x|, even though x is a known function.

The third way to analyze perturbations is to approximate (2.65) conservatively using differential inclusions (Filippov, 1985),

z'(t) ∈ { [ M(t) + ∆ ] z(t) : ‖∆‖₂ ≤ f₀ },  z(0) ∈ Z₀   (2.67)

That is, we include all the solutions obtained by letting F(t) vary arbitrarily from one t to another. Hence, the differential inclusion approximation corresponds to ignoring all differentiability and continuity properties that we may have for F. The solution to the problem is represented by a set-valued solution function, at each t giving the reachable set at that time.
If these sets can be computed conservatively (outer approximations), we have a means to deal with quite general perturbations of ltv systems. The concepts are illustrated by the following example, applied to a time-invariant system. In the book Boyd et al. (1994) on linear matrix inequalities, four out of ten chapters are devoted to the study of linear differential inclusions, and the text should be accessible to a broad audience.

2.36 Example
Consider the perturbed lti system

[ z₁'(t); z₂'(t) ] = ( [ 0  1; −1  −1 ] + F ) [ z₁(t); z₂(t) ],  [ z₁(0); z₂(0) ] = [ 0; 1 ]   (2.68)

where max_{i,j} |F_{i,j}| ≤ ε. Let ε ≜ 0.1. The set-valued function f defined by

f( [ z₁; z₂ ] ) ≜ [ [ −0.1, 0.1 ]  [ 0.9, 1.1 ]; [ −1.1, −0.9 ]  [ −1.1, −0.9 ] ] [ z₁; z₂ ]

maps any point z to a point with interval coordinates, that is, a (convex) rectangle in the ( z₁, z₂ ) plane. It is easy to see that the image of a convex set under f is also a convex set, and according to Kurzhanski and Vályi (1997, lemma 1.2.1) it follows that the reachable sets are also convex. We shall approach the perturbation problem in three ways, all illustrated in figure 2.2:

• Making a grid of points in the 4-dimensional uncertainty space, and generating the corresponding solutions. This is supposed to be a reasonably good inner approximation of the perturbation problem (2.68) we aim to solve.

• Computing an outer approximation of the reachable sets of the differential inclusion, by making an interval approximation of the reachable set at each time instant. This results in an ode in 4 variables, being the lower and upper interval bounds on z₁, z₂.

• Computing an inner approximation of the reachable sets by discretizing time and computing a set of points in the interior of the reachable set at each time instant. Since the reachable sets will be convex, the points will actually represent their convex hull.
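The second approach, the outer interval approximation, can be sketched as follows. The code propagates lower and upper bounds for z₁, z₂ using interval arithmetic and a plain Euler discretization; this is a non-rigorous sketch (the discretization error is not enclosed), and the resulting box must in particular contain the trajectory for any fixed admissible F, here checked for F = 0:

```python
import numpy as np
from scipy.linalg import expm

eps = 0.1
# Entrywise interval bounds on the system matrix of (2.68).
A_lo = np.array([[-eps, 1 - eps], [-1 - eps, -1 - eps]])
A_hi = np.array([[ eps, 1 + eps], [-1 + eps, -1 + eps]])

def iv_matvec(lo, hi, zl, zh):
    # Interval matrix times interval vector: per row, sum entrywise min/max
    # of products over the interval endpoints (extremes occur at corners).
    n = len(zl)
    out_l, out_h = np.zeros(n), np.zeros(n)
    for i in range(n):
        for j in range(n):
            prods = [a * z for a in (lo[i, j], hi[i, j]) for z in (zl[j], zh[j])]
            out_l[i] += min(prods)
            out_h[i] += max(prods)
    return out_l, out_h

# Euler integration of the 4-variable bound ode (lower/upper for z1, z2).
zl, zh = np.array([0.0, 1.0]), np.array([0.0, 1.0])
h, T = 1e-3, 1.0
for _ in range(int(T / h)):
    dl, dh = iv_matvec(A_lo, A_hi, zl, zh)
    zl, zh = zl + h * dl, zh + h * dh

# The box must contain the trajectory of any fixed admissible F, e.g. F = 0.
z_nom = expm(np.array([[0.0, 1.0], [-1.0, -1.0]]) * T) @ np.array([0.0, 1.0])
assert np.all(zl - 1e-3 <= z_nom) and np.all(z_nom <= zh + 1e-3)
```

As discussed around figure 2.3, boxes of this kind are only useful on short time intervals, since the interval bounds explode exponentially.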
To “integrate” from one time instant to the next, each vertex in the convex set is mapped to several new points by evaluating all possible combinations of minimum, mid, and maximum values in the uncertain intervals. When all vertices have been mapped, points that are not vertices of the new convex hull are first removed, and then a selection of the remaining vertices is made so that the number of vertices is never more than 10 at any time instant.

Additional outer approximations for smaller values of ε are shown in figure 2.3. It is seen that the outer approximation is useful during a short time interval, but that it explodes exponentially — this is typical for this type of interval analysis. Of course, other outer approximations could also be considered. For instance, in the context of linear matrix inequalities, ellipsoids are the obvious choice (see Boyd et al. (1994)), and ellipsoids were also used for hybrid systems in Jönsson (2002). The inner approximation seems to approximate the solution to the original problem (2.68) well.

Formalizing the differential inclusion idea gives a constructive method to prove that the solutions to the perturbed problem converge uniformly as the perturbation tends to zero; see Filippov (1985, theorem 8.2). Analogously to the method using the bounded-input, bounded-output stability above, application of Filippov (1985, theorem 8.2) to perturbed linear systems requires us to conservatively use a bound on the peak norm of the nominal solution. Unlike when uniform bounded-input, bounded-output stability is applied to (2.66), the cited theorem for differential inclusions applies only to bounded intervals of time and does not guarantee a rate of convergence. Hence, in view of the fundamental theorems that establish continuous dependence of

Figure 2.2: Inner and outer approximation of the reachable sets of the differential inclusion corresponding to (2.68) with ε = 0.1.
The converging trajectories (gray) were generated by deterministically replacing the uncertainty F by the 3⁴ matrices obtained by enumerating all matrices with entries in { −ε, 0, ε }. This provides a heuristic inner approximation of the original perturbation problem, to which the differential inclusion should be compared. The diverging trajectories are the bounds of the outer interval approximation of the reachable sets, found by integrating an ode. The vertical lines are the projections of the inner approximations of the reachable sets, obtained by replacing the uncertainty in the differential inclusion by fixed choices of F over short intervals of time. Here, the same 3⁴ matrices were used again, over time intervals of length 0.01 (to enhance readability in the plot, the projections are only shown at a sparse selection of time instants).

Figure 2.3: Outer approximations of the reachable sets of the differential inclusion corresponding to (2.68), for smaller values of ε ( 10⁻², 10⁻³, 10⁻⁴, 10⁻⁵ ) compared to figure 2.2.

solutions as functions of parameters in the system (uniformly in time), the strength of the theorem lies in its applicability to equations with discontinuous right hand side. Since we will not encounter such equations in this thesis, the method of differential inclusion is here mainly considered a computational tool for perturbed systems.

The results below are slightly more precise formulations of results in Rugh (1996). They are based on time-varying Lyapunov functions, and in their original form they provide conditions for uniform exponential stability, explained by the following definition according to Rugh (1996, definition 6.5).

2.37 Definition (Uniformly exponentially stable).
The system (2.64) is said to be uniformly exponentially stable if there exist finite positive constants γ, λ such that for any t₀, x₀, and t ≥ t₀ the solution x satisfies

|x(t)| ≤ γ e^{−λ ( t−t₀ )} |x₀|

For lti systems, uniform exponential stability is easily characterized in terms of eigenvalues.

2.38 Theorem. The lti system x'(t) = M x(t) is uniformly exponentially stable if and only if M is Hurwitz.

Proof: The time-invariance of the system allows us to use t₀ = 0 without loss of generality. If M is Hurwitz, then we may take λ ∈ ( 0, −α( M ) ), and it remains to show that

( |x(t)| / |x₀| ) e^{λ t}

can be bounded by some constant γ. Since |x(t)| ≤ ‖e^{Mt}‖₂ |x₀|, and theorem 2.24 shows that there exists a polynomial p such that

‖e^{Mt}‖₂ ≤ p(t) e^{α( M ) t}

it follows that

( |x(t)| / |x₀| ) e^{λ t} ≤ p(t) e^{( λ + α( M ) ) t}

where λ + α( M ) < 0. Since the exponential decay will dominate the polynomial growth as t → ∞, and the function to be bounded is continuous, the function is bounded.

Conversely, if M is not Hurwitz, then there exists at least one eigenvalue λ with Re λ ≥ 0. Taking x₀ as the corresponding eigenvector shows that |x(t)| does not tend to zero as t → ∞, showing that the system cannot be uniformly exponentially stable.

The additional precision needed in this thesis is captured by the next definition, which requires the constants of the exponential decay to be made visible. For systems without uncertainty, the difference compared to the usual uniform exponential stability above is minor, but for uncertain systems, the new definition means that the uniform exponential stability is uniform with respect to the uncertainty.

2.39 Definition (Uniformly [ γ e^{−λ•} ]-stable). The system (2.64) is said to be uniformly [ γ e^{−λ•} ]-stable if it is uniformly exponentially stable with the parameters γ, λ used in definition 2.37.

We now rephrase three theorems in Rugh (1996) using our new definition.

2.40 Theorem.
The system (2.64) is uniformly [ √(ρ/η) e^{−( ν/(2ρ) )•} ]-stable if there exists a symmetric matrix-valued, continuously differentiable function P and constants η > 0, ρ ≥ η, and ν > 0, for all t satisfying

η I ⪯ P(t) ⪯ ρ I   (2.69a)
M(t)^T P(t) + P(t) M(t) + P'(t) ⪯ −ν I   (2.69b)

Proof: The proof of Rugh (1996, theorem 7.4) applies.

2.41 Theorem. Suppose the system (2.64) is uniformly [ γ e^{−λ•} ]-stable and ‖M‖_I ≤ α. Then the matrix-valued function P defined by

P(t) = ∫_t^∞ φ( τ, t )^T φ( τ, t ) dτ   (2.70)

is symmetric for all t, continuously differentiable, and satisfies (2.69) with

η = 1/( 2α ),  ρ = γ²/( 2λ ),  ν = 1

Proof: The proof of Rugh (1996, theorem 7.8) applies.

For the perturbation

z'(t) = [ M(t) + F(t) ] z(t)   (2.71)

of (2.64) we now have the following theorem.

2.42 Theorem. If the system (2.64) satisfies the assumptions of theorem 2.41, then there exists a constant β > 0 such that ‖F‖_I ≤ β implies that the perturbed system (2.71) is uniformly [ γ̃ e^{−λ̃•} ]-stable with

γ̃ = γ √( α/λ ),  λ̃ = λ/( 2γ² )

Proof: Follows by the proof of Rugh (1996, theorem 8.6) with minor addition of detail. The main idea is to take the P which theorem 2.41 provides for the nominal system (2.64), and use it in theorem 2.40 applied to the perturbed system (2.71). Of the two conditions P must satisfy, (2.69a) is trivial since it does not involve the perturbation. For the other condition, (2.69b), the cited proof shows that ν = 1/2 does the job. The proof is completed by inserting the values for η, ρ, ν in theorem 2.40.

The strength of theorem 2.42 compared to Rugh (1996, theorem 8.6) is that the exponential convergence parameters of the perturbed system are expressed only in terms of the exponential convergence parameters of the nominal system and the norm bound on M. This will be useful in chapter 8, where the “nominal” system is unknown up to the specifications required by theorem 2.42.
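In the lti special case, φ( τ, t ) = e^{M( τ−t )} and (2.70) reduces to the constant solution of M^T P + P M = −I, so the constants of theorem 2.41 can be checked numerically. The sketch below uses a hypothetical Hurwitz M, takes α = ‖M‖₂, and estimates the decay parameters γ, λ of definition 2.37 on a finite grid (so γ is only a numerical estimate of the supremum):

```python
import numpy as np
from scipy.linalg import expm, solve_continuous_lyapunov

M = np.array([[0.0, 1.0], [-2.0, -3.0]])   # hypothetical Hurwitz matrix
alpha = np.linalg.norm(M, 2)               # norm bound ||M||_I <= alpha

# For lti systems, (2.70) is the constant matrix solving M^T P + P M = -I.
P = solve_continuous_lyapunov(M.T, -np.eye(2))

# Decay parameters with ||e^{Mt}||_2 <= gamma e^{-lam t}: pick lam inside
# (0, -alpha(M)) and estimate gamma as the grid maximum (finite by thm 2.38).
lam = -np.linalg.eigvals(M).real.max() / 2
ts = np.linspace(0.0, 40.0, 2000)
gamma = max(np.linalg.norm(expm(M * t), 2) * np.exp(lam * t) for t in ts)

# Theorem 2.41: eta I <= P <= rho I with eta = 1/(2 alpha), rho = gamma^2/(2 lam).
eigs = np.linalg.eigvalsh(P)
assert eigs.min() >= 1 / (2 * alpha) - 1e-9
assert eigs.max() <= gamma**2 / (2 * lam) + 1e-9
```

The same P, with ν = 1/2, is what the proof of theorem 2.42 feeds into theorem 2.40 to obtain γ̃ and λ̃.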
2.4.3 Nonlinear ODE

That a whole chapter in the classic Coddington and Levinson (1985, chapter 17) is devoted to the perturbations of a nonlinear system in two dimensions signals that perturbation of nonlinear systems is in general a very difficult problem. Nevertheless, Lyapunov-based stability results similar to those in the previous section exist; see, for instance, Khalil (2002, chapter 9). However, since perturbations of nonlinear systems will not be considered in later chapters, we will not present any of the Lyapunov-based results here. Instead, we will just quickly show how a standard perturbation form can be derived.

Consider adding a small perturbation g( x(t), t ) to the right hand side of the nominal system

x'(t) = f( x(t), t )   (2.72)

yielding

z'(t) = f( z(t), t ) + g( z(t), t )   (2.73)

Introducing y = z − x, and subtracting (2.72) from (2.73), results in

y'(t) = f( x(t) + y(t), t ) − f( x(t), t ) + g( x(t) + y(t), t )

Series expansion of the first term, regarding x(t) as a given function of t, shows that

y'(t) = M(t) y(t) + h( y(t), t ) + g( x(t) + y(t), t )   (2.74)

where h( y, t ) = o( y ) for each t. To help the analysis, it is typically assumed that h( y, t ) + g( x(t) + y, t ) = o( y ) uniformly in t. Additionally assuming that M is time-invariant helps even more, leading to the standard form

y'(t) = M y(t) + f̂( y(t), t )   (2.75)

where M is assumed Hurwitz and f̂( y, t ) = o( y ). Results regarding the solutions to (2.75) can be found in standard text books on ode, such as Coddington and Levinson (1985, chapter 13), Cesari (1971, chapter 6), or Khalil (2002).

2.5 Singular perturbation theory

Recall the model reduction technique called residualization (section 2.1.5). In singular perturbation theory, a similar reduction can be seen as the limiting system as some dynamics become arbitrarily fast
(Kokotović et al., 1986). However, some of the assumptions made in the singular perturbation framework are not always satisfied in the presence of matrix-valued singular perturbations, and this is a major concern in this thesis.

The connection to model reduction and singular perturbation theory is interesting also for another reason, namely that the classical motivation in those areas is that the underlying system being modeled is singularly perturbed in itself, and one is interested in studying how this can be handled in modeling and model-based techniques. Although that framework is built around ordinary differential equations, the situation is just as likely when dae are used to model the same systems. It is a goal of this thesis to highlight the relation between matrix-valued singular perturbations that are due to stiffness in the system being modeled, and the treatment of matrix-valued singular perturbations that are artifacts of numerical errors and the like. In view of this, this section not only provides background for forthcoming chapters, but also contains theory with which later developments are to be contrasted.

Singular perturbation theory has already been mentioned when speaking of singular perturbation approximation in section 2.1.5. However, singular perturbation theory is far more important for this thesis than just being an example of something reminiscent of index reduction in dae. First, it provides a theorem which is fundamental for the analysis in the second part of the thesis. Second, the way it is developed in Kokotović et al. (1986) contains the key ideas used in our development from chapter 6 on.

In this section, we begin by stating a main theorem for lti systems. We then briefly indicate how the lti scalar singular perturbation problem has been generalized, as some of these generalizations provide important directions for future developments of our work.
We then give a more detailed account of the work on so-called multiparameter singular perturbations, since this generalization relative to scalar singular perturbation is reminiscent of the generalization to matrix-valued singular perturbation initiated in this thesis. A fairly recent overview of singular perturbation problems and techniques is presented in Naidu (2002).

2.5.1 LTI ODE

The following (scalar) singular perturbation theorem, found in Kokotović et al. (1986, chapter 2, theorem 5.1), will be useful. Consider the singularly perturbed lti ordinary differential equation

x'(t) = M₁₁ x(t) + M₁₂ z(t),  ε z'(t) = M₂₁ x(t) + M₂₂ z(t),  x(t₀) = x⁰,  z(t₀) = z⁰   (2.76)

where we are interested in small ε > 0. Define M₀ ≜ M₁₁ − M₁₂ M₂₂⁻¹ M₂₁, denote by

xs'(t) = M₀ xs(t),  xs(t₀) = x⁰   (2.77)

the slow model (obtained by setting ε ≜ 0 and eliminating z using the thereby obtained non-differential equations), and denote by

zf'(τ) = M₂₂ zf(τ),  zf(0) = z⁰ + M₂₂⁻¹ M₂₁ x⁰   (2.78)

the fast model (which is expressed in the fast timescale given by τ ∼ ( t − t₀ )/ε).

2.43 Theorem. If α( M₂₂ ) < 0, there exists an ε* > 0 such that, for all ε ∈ ( 0, ε* ], the states of the original system (2.76), starting from any bounded initial conditions x⁰ and z⁰, |x⁰| < c₁, |z⁰| < c₂, where c₁ and c₂ are constants independent of ε, are approximated for all finite t ≥ t₀ by

x(t) = xs(t) + O( ε )
z(t) = −M₂₂⁻¹ M₂₁ xs(t) + zf(τ) + O( ε )   (2.79)

where xs(t) and zf(τ) are the respective states of the slow model (2.77) and the fast model (2.78). If also α( M₀ ) < 0, then (2.79) holds for all t ∈ [ t₀, ∞ ). Moreover, the boundary layer correction zf(τ) is significant only during an initial short interval [ t₀, t₁ ], t₁ − t₀ = O( ε ln ε⁻¹ ), after which

z(t) = −M₂₂⁻¹ M₂₁ xs(t) + O( ε )

Among the applications of this theorem, numerical integration of the equations is probably the simplest example.
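The structure of theorem 2.43 can be exercised numerically. The sketch below uses hypothetical scalar blocks (M₁₁ = 0, M₁₂ = 1, M₂₁ = M₂₂ = −1, chosen only for illustration), compares the full solution of (2.76) with the composite approximation (2.79), and checks that the error behaves like O( ε ):

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical data for (2.76): scalar slow and fast parts.
M11, M12 = np.array([[0.0]]), np.array([[1.0]])
M21, M22 = np.array([[-1.0]]), np.array([[-1.0]])
x0, z0 = np.array([1.0]), np.array([0.0])
M0 = M11 - M12 @ np.linalg.inv(M22) @ M21   # slow model matrix (2.77)

def approx_error(eps, t=1.0):
    # Full system in stacked variables (x, z); note the 1/eps in the fast rows.
    A = np.block([[M11, M12], [M21 / eps, M22 / eps]])
    w = expm(A * t) @ np.concatenate([x0, z0])
    # Composite approximation (2.79): slow state plus boundary layer correction.
    xs = expm(M0 * t) @ x0
    zf = expm(M22 * (t / eps)) @ (z0 + np.linalg.inv(M22) @ M21 @ x0)
    zc = -np.linalg.inv(M22) @ M21 @ xs + zf
    return np.linalg.norm(w - np.concatenate([xs, zc]))

e1, e2 = approx_error(1e-2), approx_error(1e-3)
# O(eps) error: reducing eps tenfold reduces the error roughly tenfold.
assert 3 < e1 / e2 < 30
```

Note that the experiment only exhibits the O( ε ) rate on one example; the constants hidden in the O-expressions are exactly what the theorem leaves implicit.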
The theorem says that, for every acceptable tolerance δ > 0 in the solution, there exists a threshold for ε such that for smaller ε, the contribution to the global error from the timescale separation is at most, say, δ/2. If the timescale separation is feasible, one can apply solvers for non-stiff problems to the fast and slow model separately, and then combine the results according to (2.79). This approach is likely to be much more efficient than applying a solver for stiff systems to the original problem. However, note that the theorem only states that certain constants exist (each O-expression has two inherent constants), as opposed to giving explicit expressions for these. That is, even though the constants are possible to compute, it requires a bit of calculation, and hence the theorem highlights the qualitative nature of the result. Similarly, the perturbation results to be presented in later chapters of this thesis are also formulated qualitatively, even though the constructive nature of the proofs allows error estimates to be computed.

2.5.2 Generalizations of scalar singular perturbation

As an indication of how our results in this thesis may be extended in the future, we devote some space here to listing a few directions in which theorem 2.43 has been extended. The extensions in this section are still concerned with the case of just one small perturbation parameter, and are all found in Kokotović et al. (1986).

The first extension is that the O( ε ) expressions in (2.79) can be refined so that the first order dependency on ε is explicit. Neglecting the higher order terms in ε, this makes it possible to approximate the thresholds which are needed to keep track of the global error when integrating the equations in separate timescales. However, it is still not clear when ε is sufficiently small for the O( ε² ) terms to be neglected.
The other extension we would like to mention is that of theorem 2.43 to time-varying linear systems. That such results exist may not be surprising, but it should be noted that time-varying systems have an additional source of timescale separation compared to time-invariant systems. This must be taken care of in the analysis, and is a potential difficulty if these ideas are used to analyze a general nonlinear system by linearizing the equations along a solution trajectory (because of the interactions between timescale separation in the solution itself and in the linearized equations that determine it). The decoupling transform for time-varying systems appears in Chang (1969, 1972). For the problem of finding a bound on the perturbation parameter such that the asymptotic stability of the coupled system is ensured, Abed (1985b) contains a relatively recent result.

In Khalil (1984), singularly perturbed systems are also derived from ode with singular trailing matrices. The so-called semisimple null structure assumption made there enables the formulation of a corresponding scalar singular perturbation problem. Another related problem class is obtained if stochastic forcing functions are added to the singularly perturbed systems (see, for instance, Ladde and Sirisaengtaksin (1989)). The properties of the resulting stochastic solution processes may then be studied using decoupling techniques similar to those used for deterministic systems. Looking at statistical properties such as mean and variance will give quite different results compared to the L∞ function measure often used for deterministic systems, and this relates to yet another related deterministic problem class, obtained by replacing the L∞ function measure by, for instance, the L2 function measure. For many applications such norms may be more relevant than the maximum-norm-over-time measure of L∞.
While the L∞ measure remains an important general-purpose measure that should be supported by general-purpose numerical integration software, it appears that both developers and users of numerical integration software would benefit from also allowing other error estimates for the computed solution. We note that both stochastic formulations and alternative function measures constitute relevant generalizations of the singular perturbation results derived in the thesis.

2.5.3 Multiparameter singular perturbation

The multiparameter singular perturbation problem is closely related to the subject of the present work in that it considers several small parameters at the same time. The multiparameter singular perturbation form arises when small parasitics are included in a physical model. For instance, a parasitic parameter may be the capacitance of a wire in an electric model where such capacitances are expected to be negligible. Since the parasitics have physical origin, they are known to be greater than zero, and this requirement is part of the multiparameter singular perturbation form.

In its linear formulation, the autonomous multiparameter singular perturbation form may be written

x' = A_xx x + A_xz z
diag( ε₁ I, …, ε_N I ) z' = A_zx x + A_zz z   (2.80)

Here, all the εᵢ > 0 are the small singular perturbation parameters, and the goal is to understand how the system behaves as maxᵢ εᵢ tends to zero. By introducing the parameter μ = maxᵢ εᵢ, the system may be written in a form which is closer to the scalar singular perturbation form,

x' = A_xx x + A_xz z
μ z' = D ( A_zx x + A_zz z ),  where D = diag( (μ/ε₁) I, …, (μ/ε_N) I )   (2.81)
µ N I | {z } (2.81) D In the early work on multiparameter singular perturbation in Khalil and Kokotović (1979), all singular perturbation parameters are assumed to be of the same order of µ magnitude, corresponding to assuming abound on the in (2.81) (another common i µ choice of µ is the geometric mean of all i , and then the ratios needs to be bounded i both from above and below to imply the equal order of magnitude condition). The condition about equal order of magnitudes was originally formulated as m ≤ i ≤ M for all i, j (2.82) j and this remains the most popular way to state the assumption. The condition should be seen in contrast to the case when the singular perturbation parameters are assumed of different orders of magnitudes in the sense that (2.83) lim i+1 = 0 for all i = 1, 2, . . . , N − 1 i →0 i Such problems can be analyzed by applying a sequence of scalar singular perturbation results, and are said to have multiple time scales (where multiple refers to the number of fast time scales). The main assumption used in Khalil and Kokotović (1979) is the so-called Dstability, which means that a system remains stable (with some positive, fixed, margin between poles and the imaginary axis) if the state feedback matrix is leftmultiplied by any D of (2.81). They remark that the condition (2.82) is not realistic in many applications. In Khalil (1981), the results are extended to the nonlinear setting using Lyapunov methods, and further refinement was made in Khorasani and Pai (1985). 2.5 Singular perturbation theory 71 Later, the condition (2.82) was removed for lti systems in Abed (1985a), and the socalled strong D-stability condition was then introduced in Abed (1986) to simplify the analysis. However, when ltv systems are treated in Abed (1986), the condition (2.82) is still used. In Khalil (1987), the condition (2.82) used in Khalil and Kokotović (1979), has been removed. Further, both slow and fast subsystems are allowed to be nonlinear. 
Instead of (2.82), assumptions are used to ensure the existence of Lyapunov functions with certain additional constraints. This technique was used for scalar singular perturbation in Saberi and Khalil (1984), where references to other authors' early work on singular perturbations based on Lyapunov functions can be found.

In Coumarbatch and Gajic (2000), the algebraic Riccati equation is analyzed for a multiparameter singularly perturbed system. The system under consideration has two small singular perturbation parameters of the same order of magnitude. Understanding properties of the solutions of Riccati equations for singularly perturbed systems seems a promising tool for future developments in singular perturbation theory, and the replacement of the equal order of magnitude assumption by something more realistic from an application point of view would be a valuable development by itself.

Within the class of multiparameter singularly perturbed problems, the class of multiple time scale singularly perturbed problems allows the small singular perturbation parameters to belong to different groups depending on order of magnitude. Within a group of singular perturbation parameters, all parameters satisfy a condition of the kind (2.82), while there is an ordering among the groups such that dividing a parameter from one group by a parameter from a succeeding group yields a ratio which tends to zero as the latter parameter tends to zero. This problem was studied for two fast time scales using partial decoupling in Ladde and S̆iljak (1989), and with full decoupling in Ladde and Rajalakshmi (1985). The generalization to more than two fast time scales was later presented with partial decoupling in Ladde and Rajalakshmi (1988), and with full decoupling and a flowchart for the decoupling procedure in Kathirkamanayagan and Ladde (1988). In Abed and Tits (1986), the strong D-stability property is extended to the context of multiple time scales.
They also highlight an example showing that asymptotic stability given (2.82) is a sufficient condition for asymptotic stability in the multiple time scale setting (2.83) if and only if N = 2 and dim z = 2. Comparing the multiparameter singular perturbation theory with the matrix-valued singular perturbation results in the second part of the thesis, there are a few things to mention here. Assumptions are often used to restrict the singular perturbation parameters within the basic multiparameter singular perturbation form, but authors agree that some of these assumptions are not realistic in view of typical applications for the theory. Hence, there is a constant drive to do away with such assumptions, replacing them by conditions that can be verified by inspection of properties of the unperturbed system. For lti systems, the (strong) D-stability definition is such a condition. We remark that the requirement that all singular perturbation parameters be positive adds important structure to the problem, and this is essential for the D-stability concept. When we consider matrix-valued singular perturbations, the lack of structure in the perturbation makes conditions such as D-stability much harder to come up with. The kind of properties which can be meaningfully verified are simple things such as norm bounds on matrices in the unperturbed system. Everything else will have to be assumed, and finding assumptions which are reasonable in view of applications will be key to a successful theory. Unlike multiparameter singular perturbation, however, matrix-valued singular perturbation cannot be handled without assumptions, and this is in line with the non-physical origin of the matrix-valued singular perturbation problems that we know of.
That is, it is primarily for problems of physical origin that imposing assumptions can be inappropriate; for perturbation problems that are due to modeling and software artifacts, it is less surprising that assumptions may be necessary to mitigate the effects of those artifacts.

2.5.4 Perturbation of DAE

The issue with perturbations in dae has been considered previously in Mattheij and Wijckmans (1998). While they consider perturbations of the trailing matrix and not in the leading matrix, we share many of their observations regarding the possibility of ill-posed problems. It is due to this similarity, and to the fact that the dae perturbations we study turn out to be of singular perturbation type, that the current section resides under section 2.5. The above-mentioned work on perturbations in the trailing matrix is referred to in Kunkel and Mehrmann (2006, remark 6.7), as the latter authors remark that a perturbation analysis is still lacking in their influential framework for numerical solution of dae based on the strangeness index. Although this thesis deals with perturbation related more to index reduction by shuffling than to the methods of Kunkel and Mehrmann (2006), it is hoped that our work will inspire the development of similar perturbation analyses in other contexts as well. When the study of matrix-valued singular perturbation was motivated in section 1.2.4, one of the applications was to handle a change of rank in the leading matrix of a time-varying system. For a recent alternative approach to singularities in time-varying systems, see März and Riaza (2007), where the assumed existence of smooth projector functions is used to mitigate isolated rank drops. An interesting topic in perturbation of dae is the study of how sensitive the eigenvalues are to perturbations. The eigenvalue problems generalize naturally from matrix pencils (or pairs) to matrix polynomials, with immediate application to higher order dae.
Some recent results on the conditioning of the eigenvalue problem appear in Higham et al. (2006). Although eigenvalues are very central to our treatment of perturbations in dae, the assumptions we use allow the "fast and uncertain" eigenvalues to be very sensitive to perturbation. This behavior is very different from the setting where eigenvalue perturbation results can be used to bound the sensitivity, and unfortunately this difference has hindered us from seeing applications of the eigenvalue perturbation theory in our work.

2.6 Contraction mappings

Contraction mappings can provide elegant proofs of existence and uniqueness of the solution to an equation. In this section, we state the fundamental theorem, which can be found in standard textbooks on real analysis. The theorem is illustrated with one example and one lemma which will be useful in later chapters.

2.44 Theorem (Contraction principle). Let X be a complete metric space, with metric d. Let T be a mapping from X into itself such that

d( T(x₂), T(x₁) ) ≤ c d( x₂, x₁ )

for some c < 1. Then there exists a unique x ∈ X such that T(x) = x.

Proof: See Rudin (1976, theorem 9.23).

2.45 Example
Consider the equation

x² = 4 + ε,  x > 0  (2.84)

for small values of the parameter ε, |ε| ≤ m. Although we know that the solution is given by x = √(4 + ε), we shall estimate the quality of the first order approximation to the solution. As the first order approximation is given by x₀(ε) = 2 + ε/4, we set x(ε) = x₀(ε) + ε² y(ε), where y(ε) shall be bounded independently of ε (for sufficiently small m) using a contraction mapping argument. Inserting this into (2.84) yields (dropping the argument ε to y)

4 + ε + ε²/16 + 2( 2 + ε/4 ) ε² y + ε⁴ y² = 4 + ε

which is rewritten with y alone on one side of the equation (but y may still appear on the other side of the equation as well)
y = − (1/16) / ( 2( 2 + ε/4 ) + ε² y )  (2.85)

Now assume that |y| ≤ ρ, where ρ is to be selected soon, and define the mapping

T y := − (1/16) / ( 2( 2 + ε/4 ) + ε² y )

so that a fixed point of T is a solution to (2.85). In view of

|T y| ≤ (1/16) / ( 4 − m/2 − m² |y| )

we take ρ = 1/16, so that m/2 < 2 and m²/16 < 1 ensure that |T y| ≤ ρ. Solving the requirements for m reveals that m shall be chosen less than the smallest of √16 and 4 (both equal to 4). To prove that T is a contraction, note that it is continuously differentiable with positive and decreasing derivative (since T y tends to zero from below as y grows). Hence, the modulus of the derivative at the low end of the domain is a valid Lipschitz constant in all of the domain. In view of this, the domain shall be selected such that the derivative at −ρ is less than 1. Solving the resulting inequality for m shows that m ≤ 2.9 suffices (the optimal bound is somewhere between 2.9 and 3.0). Using theorem 2.44 it may be concluded that for |ε| ≤ 2.9,

|x(ε) − x₀(ε)| ≤ (1/16) |ε|²

as illustrated in figure 2.4.

Figure 2.4: Approximating the square root near 4 using a contraction mapping argument. Using ε to denote the deviation from 4, and introducing the first order approximation x₀(ε) = 2 + ε/4, the square root is expressed in the variable y through the equation ( x₀(ε) + ε² y )² = 4 + ε. The computed upper and lower bounds on y are added to x₀ and shown in the figure as the curves x₀(ε) ± (1/16)|ε|², and are known to be valid for all |ε| ≤ 2.9.

The example deserves a few remarks. First, note that ρ could have been selected arbitrarily close to 1/64 at the price of obtaining very small bounds on m. Also note that other ways of isolating y would lead to other operators, and possibly to improved bounds.
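The fixed-point construction in example 2.45 is easy to check numerically. The following sketch (our illustration, not part of the thesis) iterates the operator T y = −(1/16)/(2(2 + ε/4) + ε²y) for the hypothetical choice ε = 1 and verifies that the resulting x(ε) = x₀(ε) + ε²y solves x² = 4 + ε while y stays within ρ = 1/16.

```python
# Numerical sketch of the contraction iteration in example 2.45 (illustration only).
# T is obtained by isolating y in 1/16 + 2(2 + eps/4) y + eps^2 y^2 = 0.

def T(y, eps):
    return -(1.0 / 16.0) / (2.0 * (2.0 + eps / 4.0) + eps**2 * y)

eps = 1.0
y = 0.0
for _ in range(50):              # the contraction converges very quickly
    y = T(y, eps)

x0 = 2.0 + eps / 4.0             # first order approximation x0(eps)
x = x0 + eps**2 * y              # corrected solution

assert abs(x**2 - (4.0 + eps)) < 1e-12   # x solves (2.84)
assert abs(y) <= 1.0 / 16.0              # y stays within rho = 1/16
assert abs(x - x0) <= eps**2 / 16.0      # the bound proved in the example
```

The iteration converges toward y ≈ −0.0139, so x ≈ 2.2361 ≈ √5, well within the proved bound.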
Another feature of the example is that the operator could be defined without reference to the square root function, which is the function we set out to approximate in the first place. The method can be contrasted with series expansion techniques. Using the contraction mapping principle we first guess the approximation x₀(ε), and then prove bounds on the rest term. In contrast, a Taylor expansion requires the existence of derivatives of the function being approximated. Here,

x″(ε) = − (1/4) ( 4 + ε )^{−3/2}

and bounding |x″(ε)| for |ε| ≤ 2 yields

|x(ε) − x₀(ε)| ≤ (1/2) · (1/(8√2)) |ε|² = (1/(16√2)) |ε|²

Note that this bound is stronger than that obtained in the example. Having tried the technique on the scalar example, we now turn to matrix equations, and the result is general enough to be put as a lemma.

2.46 Lemma. Let the non-singular matrix X have its inverse bounded as ‖X⁻¹‖₂ ≤ c. Then

‖F‖₂ ≤ ρ/(( ρ + c ) c)  ⟹  ‖( X + F )⁻¹ − X⁻¹‖₂ ≤ ρ  (2.86)

Proof: Assume ‖F‖₂ ≤ ρ/(( ρ + c ) c), define the operator

T G := − ( X⁻¹ + G ) F X⁻¹

and consider the set G = { G : ‖G‖₂ ≤ ρ }. Then

‖T G‖₂ ≤ ‖F‖₂ ( c + ρ ) c ≤ ρ

so T maps G into itself. Since

‖T G₂ − T G₁‖₂ = ‖G₂ F X⁻¹ − G₁ F X⁻¹‖₂ ≤ ‖F‖₂ c ‖G₂ − G₁‖₂ ≤ (ρ/(ρ + c)) ‖G₂ − G₁‖₂ < ‖G₂ − G₁‖₂

T is a contraction on G. By theorem 2.44 there is a unique solution G ∈ G. Hence, G satisfies

G = − ( X⁻¹ + G ) F X⁻¹

and multiplying by X from the right reveals

( X⁻¹ + G ) F + G X = 0

Adding I to both sides allows us to write

( X⁻¹ + G ) ( X + F ) = I

where it is seen that X + F indeed is invertible, and

G = ( X + F )⁻¹ − X⁻¹

shows that ( X + F )⁻¹ − X⁻¹ ∈ G.

2.47 Corollary. Let X be a non-singular matrix with ‖X⁻¹‖₂ ≤ c and let ρ > 0 be given. Then there exists a constant m > 0 such that

‖F‖₂ ≤ m  ⟹  ‖( X + F )⁻¹‖₂ ≤ c + ρ  (2.87)

Proof: Take m = ρ/(( ρ + c ) c) and use ‖( X + F )⁻¹‖₂ = ‖( X + F )⁻¹ − X⁻¹ + X⁻¹‖₂ ≤ ‖( X + F )⁻¹ − X⁻¹‖₂ + ‖X⁻¹‖₂.
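A scalar instance of lemma 2.46 can be checked directly, since for 1×1 matrices the 2-norm is the absolute value. The sketch below (our illustration, with hypothetical numbers x = 2, c = 0.5, ρ = 0.1) sweeps all admissible perturbations and verifies the implication in (2.86); in the scalar case the bound turns out to be attained at the extreme perturbation f = −ρ/((ρ+c)c).

```python
# Scalar check of lemma 2.46 (illustration only):
# |1/x| <= c and |f| <= rho/((rho+c) c) imply |1/(x+f) - 1/x| <= rho.

x = 2.0                        # |1/x| = 0.5
c = 0.5                        # bound on the inverse
rho = 0.1                      # desired bound on the perturbation of the inverse
fmax = rho / ((rho + c) * c)   # admissible perturbation size, here 1/3

for k in range(-100, 101):
    f = fmax * k / 100.0       # sweep the admissible perturbations
    err = abs(1.0 / (x + f) - 1.0 / x)
    assert err <= rho + 1e-12

# In the scalar case the bound is tight: it is attained at f = -fmax.
assert abs(abs(1.0 / (x - fmax) - 1.0 / x) - rho) < 1e-12
```

The tightness observation suggests that the admissible perturbation size ρ/((ρ+c)c) in the lemma cannot be enlarged without weakening the conclusion.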
In chapter 8, when contraction mapping arguments are applied to time-varying systems, the fixed-point equations will be integral equations. We end this section with an example that will come in handy when reading those arguments.

2.48 Example
Consider equations in time-varying matrices over the time interval [ 0, t_f ). Let φ be the transition matrix of the system x′(t) = M(t) x(t), so that

φ(t, t) = I
φ(•, τ)′(t) = M(t) φ(t, τ)
φ(τ, •)′(t) = −φ(τ, t) M(t)

Define the operators S and T according to

(S R)(t) := (1/a) ∫₀ᵗ φ(t, τ) P(τ) dτ
(T R)(t) := (1/a) ∫ₜ^{t_f} P(τ) φ(τ, t) dτ

where a is a constant and P is a matrix-valued function of time. The fixed-point equation S R = R then implies (swapping the sides, multiplying by a, and differentiating)

a R′(t) = P(t) + M(t) ∫₀ᵗ φ(t, τ) P(τ) dτ = P(t) + a M(t) (S R)(t) = P(t) + a M(t) R(t)  (2.88)

Similarly, the fixed-point equation T R = R implies

a R′(t) = −P(t) − ∫ₜ^{t_f} P(τ) φ(τ, t) dτ M(t) = −P(t) − a (T R)(t) M(t) = −P(t) − a R(t) M(t)  (2.89)

Hence, by identifying the forms of (2.88) or (2.89) with some equation in time-varying matrices, we will be able to formulate corresponding fixed-point integral equations.

2.7 Interval analysis

While perturbation analysis is the theoretical tool used to prove convergence results in this thesis, it relies on a too coarse uncertainty model to be successful in applications. In the much more fine-grained uncertainty model used in interval analysis, one uncertainty interval is used for each scalar quantity. A survey and quick introduction is given in Kearfott (1996). Another quick introduction, including many algorithms, is given in Jaulin et al. (2002), written jointly with two of the authors of the popular book Jaulin et al. (2001).
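For a scalar system the identity (2.88) can be verified numerically. In the sketch below (our illustration; the constants and the choice P(t) = cos t are hypothetical), M(t) ≡ m is constant, so φ(t, τ) = e^{m(t−τ)}; R = S R is evaluated by trapezoidal quadrature, and a R′(t) is compared with P(t) + a m R(t) via a central difference.

```python
import math

# Scalar check of (2.88): with R(t) = (1/a) * integral_0^t phi(t,tau) P(tau) dtau
# and phi(t,tau) = exp(m (t - tau)), we expect a R'(t) = P(t) + a m R(t).

a, m = 2.0, 0.5
P = math.cos                     # an arbitrary smooth forcing term

def R(t, n=4000):
    # trapezoidal quadrature of (S R)(t)
    h = t / n
    s = 0.5 * (math.exp(m * t) * P(0.0) + P(t))
    for i in range(1, n):
        tau = i * h
        s += math.exp(m * (t - tau)) * P(tau)
    return s * h / a

t, h = 0.8, 1e-4
lhs = a * (R(t + h) - R(t - h)) / (2 * h)   # a R'(t) by central difference
rhs = P(t) + a * m * R(t)
assert abs(lhs - rhs) < 1e-4
```

The same pattern, with the quadrature running from t to t_f and φ on the other side, checks (2.89).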
Though superior to "implemented perturbation analysis", interval analysis is often blamed for producing error bounds so pessimistic that they are useless. Unfortunately, pessimistic error bounds (which are not always useless) are the price one has to pay to be certain that the result of the uncertain computation really includes every possible outcome of the uncertain problem. Methods to improve performance include the use of preconditioning matrices, coordinate transformations, and uncertainty models which capture some of the correlation between uncertain quantities. For instance, even if the uncertainty in the quantity x is large, as long as x is known to be non-zero it holds that

x/x = 1    (x + x)/x = 2
x − x = 0    2x − x = x

but without support from symbolic math computations these relations may be difficult to maintain. This problem is addressed in Neumaier (1987), but in this thesis we shall only use the following trivial observation.

Notation. In the language of interval analysis, scalars, vectors, and matrices, represented by box constraints on each entry, are denoted intervals, interval vectors, and interval matrices, respectively. In contrast, when the difference needs to be emphasized, the corresponding exactly known objects are denoted real number, point vector, and point matrix. The notion of point objects is natural in view of the uncertain objects being technically defined as sets of point objects. Since this thesis is not in the field of interval analysis (we merely use the technique for illustration), we tend to use the terms uncertain and exact instead. A function containing uncertainty is also thought of as a set of exact functions. This enables us to define the solution set of the uncertain equation f(x) = 0 in the variable x ∈ Rⁿ as

{ x ∈ Rⁿ : ∃ f̂ ∈ f : f̂(x) = 0 }

Notation. For a general set we speak of inner and outer approximations, referring to its subsets and supersets.
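The correlation problem mentioned above is easy to demonstrate with naive interval arithmetic. The sketch below (our illustration) evaluates x − x and x/x by the usual endpoint rules, which treat the two occurrences of x as independent, and therefore lose the exact identities.

```python
# Naive interval arithmetic treats repeated occurrences of x as independent.

def isub(a, b):
    # [a] - [b] by endpoint rules
    return (a[0] - b[1], a[1] - b[0])

def idiv(a, b):
    # [a] / [b] by endpoint rules (b must not contain zero)
    ps = [a[0] / b[0], a[0] / b[1], a[1] / b[0], a[1] / b[1]]
    return (min(ps), max(ps))

x = (1.0, 2.0)                      # x is known to be non-zero

assert isub(x, x) == (-1.0, 1.0)    # not the exact result {0}
assert idiv(x, x) == (0.5, 2.0)     # not the exact result {1}
# Both enclosures are valid outer approximations, but far wider than the true sets.
```

This is the dependency problem: the enclosures are correct but unnecessarily wide, which is exactly what uncertainty models capturing correlation try to avoid.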
In the context of equations, we simply write inner/outer solution when referring to inner/outer approximations of the solution set. An interval matrix is called regular if it only admits non-singular point matrices; otherwise, it is called singular. The next definition is non-standard but will be convenient in the thesis.

2.49 Definition (Pointwise non-singular uncertain matrix). An uncertain matrix is said to have the additional property of being pointwise non-singular if it admits only non-singular point matrices.

Clearly, every regular interval matrix is pointwise non-singular, but for uncertain matrices whose interval enclosure is singular, pointwise non-singularity is the property which allows the inverse to be formed formally, even though the inverse is then a matrix with at least one unbounded interval of uncertainty. For the inverse to be useful, additional assumptions will have to be added, for instance, a bound on the norm of the inverse. It is important to note that an interval matrix with additional constraints, such as pointwise non-singularity or boundedness of the inverse, is generally not possible to represent as an interval matrix, and it then becomes more appropriate to use the more general notion of uncertain matrix.

If X is known to be a regular matrix, so that X⁻¹ X = I, the following column reduction is possible:

[ X  Y ] [ X⁻¹  −X⁻¹ Y ]  =  [ I  0 ]
         [ 0    I      ]

This also makes use of the trivial identity −X X⁻¹ Y + Y = −I Y + Y = 0. Thinking of the column reduction as a coordinate transformation, the uncertainty in the matrix [ X  Y ] has been transferred to uncertainty in the coordinate transform. Of course, row reduction can be carried out analogously. While approximate intervals for X⁻¹ are easily obtained by a first order expansion of X⁻¹ around some point matrix X₀ ∈ X, good bounds that are guaranteed to contain X⁻¹ are more demanding.
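The column reduction above is easy to verify for point matrices. The sketch below (a hypothetical 2×2 example of ours) builds the block product [X Y][X⁻¹ −X⁻¹Y; 0 I] explicitly and checks that it equals [I 0].

```python
# Verify [X Y] * [[inv(X), -inv(X) Y], [0, I]] == [I 0] for a point example.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inv2(M):
    # closed-form inverse of a 2x2 matrix
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

X = [[2.0, 1.0], [0.0, 3.0]]
Y = [[1.0], [2.0]]

Xi = inv2(X)
XiY = matmul(Xi, Y)
# the 3x3 transformation [[inv(X), -inv(X) Y], [0, 1]]
Tr = [[Xi[0][0], Xi[0][1], -XiY[0][0]],
      [Xi[1][0], Xi[1][1], -XiY[1][0]],
      [0.0,      0.0,       1.0]]
XY = [[2.0, 1.0, 1.0], [0.0, 3.0, 2.0]]   # the block row [X Y]

result = matmul(XY, Tr)                    # expect [I 0]
expected = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
for i in range(2):
    for j in range(3):
        assert abs(result[i][j] - expected[i][j]) < 1e-12
```

For interval-valued X and Y the same identity motivates transferring the uncertainty into the transformation, as described in the text.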
We shall not go into details, since we will rely on Mathematica to deliver accurate results; interested readers are referred to Neumaier (1990, theorem 4.1.11) for a theoretical result. The following theorem is characteristic of interval arithmetic. It is a simple form of constraint propagation, and proves itself.

2.50 Theorem. Consider the fixed-point equation

x = f(x)

where uncertainty in f implies uncertainty in the solution, and where the solution set x is known to be a subset of x₁. If conservative evaluation of f( x₁ ) results in x₂, that is, f( x₁ ) ⊂ x₂, then

x ⊂ x₁ ∩ x₂

The theorem immediately gives a technique for iterative refinement of outer solutions to a fixed-point equation. Hence, even very conservative outer solutions may be very valuable, as they can serve as a starting point for the iterative refinement.

2.51 Example
Consider the uncertain polynomial (the brackets denote intervals here, not matrices)

f( x ) := [ 0.9, 1.1 ] x² + [ 9.5, 10.5 ] x + [ −8.5, −7.5 ]

which has a solution near 0.75. The plot of f in figure 2.5 allows the solution to be read off as the interval where 0 ∈ f( x ), and it is easy to find that the true solution set is approximately [ 0.6676, 0.8295 ].

Figure 2.5: The uncertain function f in example 2.51.

Let x₀ = [ 0.0, 2.0 ] be a given outer solution, and rewrite the equation f( x ) = 0 as the fixed-point equation x = T( x ), where

T( x ) := − ( [ 0.9, 1.1 ] x² + [ −8.5, −7.5 ] ) / [ 9.5, 10.5 ]

Since no uncertain quantity appears more than once in the expression for f, it is straightforward to compute the set f( x₁ ) at any point x₁. The iterative refinement produces the following sequence of outer solutions.

Iterate    Outer solution
x₀         [ 0.00, 2.00 ]
x₁         [ 0.55, 0.89 ]
x₂         [ 0.63, 0.87 ]
x₃         [ 0.64, 0.86 ]

Further iteration gives little improvement; x₆ = [ 0.6375, 0.8562 ], sharing its four most significant digits with x₂₀₀.
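The refinement in example 2.51 can be reproduced with a minimal interval arithmetic. The sketch below (our illustration) implements interval multiplication and division via endpoint products, iterates x ← T(x) ∩ x from the outer solution [0, 2], and converges to [0.6375, 0.8562] as reported in the example.

```python
# Minimal interval arithmetic for the iterative refinement in example 2.51.

def imul(a, b):
    ps = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(ps), max(ps))

def idiv(a, b):               # b must not contain zero
    return imul(a, (1.0 / b[1], 1.0 / b[0]))

def T(x):
    sq = imul(x, x)           # exact square here, since x stays non-negative
    num = imul((0.9, 1.1), sq)
    num = (num[0] - 8.5, num[1] - 7.5)        # add the interval [-8.5, -7.5]
    neg = (-num[1], -num[0])
    return idiv(neg, (9.5, 10.5))

x = (0.0, 2.0)                # the given outer solution x0
for _ in range(30):
    t = T(x)
    x = (max(x[0], t[0]), min(x[1], t[1]))    # intersect, as in theorem 2.50

assert abs(x[0] - 0.6375) < 5e-4
assert abs(x[1] - 0.8562) < 5e-4
```

Note that the limit [0.6375, 0.8562] is still strictly larger than the true solution set [0.6676, 0.8295], in line with the remark that the refinement cannot converge to the true set.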
Hence, the iterative refinement produced a significant improvement of the initial outer solution in just a few iterations, but was unable to converge to the true solution set.

2.8 Gaussian elimination

Although it is assumed that the reader is familiar with Gaussian elimination, in this section some aspects of particular interest for the proposed algorithm in chapter 3 will be discussed. The shuffling algorithm in chapter 3 makes use of row reduction. The most well known row reduction method is perhaps Gaussian elimination, and although infamous for its numerical properties, it is sufficiently simple to be a realistic choice for implementations. In fact, the proposed algorithm makes this particular choice, and among the many variations of Gaussian elimination, a fraction-free scheme is used. This technique for taking a matrix to row echelon form uses only addition and multiplication operations. (A matrix is said to be in row echelon form if each non-zero row has more leading zeros than the previous row. Actually, in order to account for the outcome when full pivoting is used, one should really say that the matrix is in row echelon form after suitable reordering of variables.) In contrast, a fraction-producing scheme involves also division. The difference is explained by example. Consider performing row reduction on a matrix of integers of the same order of magnitude:

[ 5   7 ]
[ 3  −4 ]

A fraction-free scheme will produce a new matrix of integers,

[ 5              7              ]     [ 5    7  ]
[ 5·3 − 3·5   5·(−4) − 3·7 ]  =  [ 0  −41 ]

while a fraction-producing scheme generally will produce a matrix of rational numbers,

[ 5                  7                  ]     [ 5    7     ]
[ 3 − (3/5)·5   (−4) − (3/5)·7 ]  =  [ 0  −41/5 ]

The fraction-free scheme thus has the advantage that it is able to preserve the integer structure present in the original matrix.
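The two elimination steps above can be written out in code. The sketch below (our illustration) performs one elimination step on the example matrix in both variants, using exact rational arithmetic for the fraction-producing scheme.

```python
from fractions import Fraction

# One elimination step on [[5, 7], [3, -4]] in the two schemes.

A = [[5, 7], [3, -4]]

# Fraction-free: row2 <- pivot*row2 - A[1][0]*row1 (integers stay integers).
ff = [A[0][:],
      [A[0][0] * A[1][0] - A[1][0] * A[0][0],
       A[0][0] * A[1][1] - A[1][0] * A[0][1]]]
assert ff == [[5, 7], [0, -41]]

# Fraction-producing: row2 <- row2 - (A[1][0]/A[0][0]) * row1.
m = Fraction(A[1][0], A[0][0])
fp = [A[0][:],
      [A[1][0] - m * A[0][0], A[1][1] - m * A[0][1]]]
assert fp == [[5, 7], [0, Fraction(-41, 5)]]
```

The entry −41 versus −41/5 = −8.2 illustrates the blowup of entries discussed next.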
On the other hand, if the original matrix is a matrix of rational numbers, both schemes generally produce a new matrix of rational numbers, so there is no advantage in using the fraction-free scheme. Note that it is necessary not to allow the introduction of new scalar factors in order to keep the distinction clear: multiplying rows by integer scalars could otherwise convert any matrix of rational numbers to a matrix of integers, while introducing non-integer scalars would destroy the integer structure. The two schemes should also be compared by the numbers they produce. The number −41, in comparison with the original numbers, is a sign of the typical blowup of entries caused by the fraction-free scheme. The number −41/5 = −8.2 does not indicate the same tendency. When the matrix is interpreted as the coefficients of a linear system of equations to be solved in the floating point domain, the blowup of entries implies bad numerical conditioning, which in turn has negative implications for the quality of the computed solution. Unfortunately, this is not the only drawback of the fraction-free scheme, since the operations involved in the row reduction are ill-conditioned themselves. This means that there may be poor correspondence between the original equations and the row reduced equations, even before attempting to solve them. Fraction-free Gaussian elimination can also be applied to a matrix of polynomials, and will then preserve the polynomial structure. Note also that this structure is not destroyed by allowing the introduction of new scalars. This can be used locally to improve the numerical properties of the reduction scheme drastically, by making them approximately the same as those of the fraction-producing scheme. That is, multiplication by scalars is used to locally make the pivot polynomial approximately equal to 1, and then fraction-free operations are used to eliminate below the pivot as usual.
Finally, recall that Gaussian elimination also comes in different flavors in the pivoting dimension. However, this dimension is not explored when proposing the algorithm in chapter 3. (In the current setting of elimination, where it makes sense to speak of structural zeros, the reference to reordering of variables can be avoided by saying that the reduced matrix is such that each non-zero row has more structural zeros than the previous row.)

2.9 Miscellaneous results

The chapter ends with a collection of results that are included here only to be referenced from subsequent chapters. Please refer to the cited references for discussion of these results. The following theorem gives an upper bound on the roots of a polynomial, purely in terms of the moduli of the coefficients. Of course, there are many ways to obtain tighter estimates by using more knowledge about the coefficients, but we don't have that kind of information in this thesis, and the bounds we get from this theorem are tight enough for our purposes.

2.52 Theorem. The moduli of the roots of the polynomial

f(z) = Σ_{i=0}^{n} a_i z^i

are bounded by

2 max( max_{i=1,…,n−1} | a_{n−i} / a_n |^{1/i},  | a_0 / (2 a_n) |^{1/n} )

Proof: The result is included in the form of an exercise in the thorough Marden (1966, exercise 30.5).

Motivated by theorem 2.27, the bounding of matrices is very important in our work, but it turns out that we actually know more about the inverse of the matrix than about the matrix itself. Since bounding the norm of the inverse of a matrix is related to bounding the condition number, several useful results are presented with condition number bounds in mind. The survey Higham (1987) lists many such results. One of them which is useful to us and applies to upper triangular matrices is given as the next theorem.

2.53 Theorem.
For the upper triangular matrix U, it holds that

‖U‖₂ ≤ √( ( a + 1 )^{2n} + 2n( a + 2 ) − 1 ) / ( ( a + 2 ) b )  (2.90)

where, with J = U⁻¹,

a = max_{i<j} |J_ij| / |J_ii| ≤ ‖U⁻¹‖₂ |λ_max( U )|
b = min_i |J_ii| = min_i 1/|λ_i( U )| = 1/|λ_max( U )|

Proof: This is an immediate consequence of results in Lemeire (1975) and builds on the theory of M-matrices.

3 Shuffling quasilinear DAE

Methods for index reduction of general nonlinear differential-algebraic equations are generally difficult to implement due to the recurring use of functions defined only via the implicit function theorem. The problem is avoided in chapter 5, but instead of implementing the implicit functions, additional variables are introduced, and an under-determined system of equations needs to be solved each time the derivative (or the next iterate of a discretized solution) is computed. As an alternative, additional structure may be added to the equations in order to make the implicit functions possible to implement, thereby avoiding additional variables and under-determined systems of equations. In particular, this is possible for the quasilinear and linear time-invariant (lti) structures, and it turns out that there exists an algorithm for the quasilinear form that is a generalization of the shuffle algorithm for the lti form in the sense that, when applied to the lti form, it reduces to the shuffle algorithm. For this reason, the more general algorithm is referred to as a quasilinear shuffle algorithm. This chapter is devoted to quasilinear shuffling. It is included in the background part of the thesis since it was the application to quasilinear shuffling that was the original motivation behind the study of matrix-valued singular perturbations. This connection was mentioned in section 1.2.2 and will be discussed below when the seminumerical approach is presented in section 3.2.4.
This chapter also contains a discussion of how the quasilinear shuffle algorithm can be used to find consistent initial conditions, and touches upon the issue of algorithm complexity. The contents of the chapter present only minor improvements compared to the chapter with the same title in the author's licentiate thesis, Tidefelt (2007, chapter 3), except that the section on algorithm complexity has been removed due to its weak connection to the current thesis.

Notation. We use a star to mark that a symbol denotes a constant. For instance, the symbol E* denotes a constant matrix, while a symbol like E would in general refer to a matrix-valued function. A few times, we will encounter the gradient of a matrix-valued function. This object will be a function with 3 indices, but rather than adopting tensor notation with the Einstein summation convention, we shall permute indices using generalized transposes, denoted by decorated variants of (•)ᵀ. Since their operation will be clear from the context, they will not be defined formally in this thesis.

3.1 Index reduction by shuffling

In section 2.2.3, algorithm 2.1 provided a way of reducing the differentiation index of lti dae. The extension of that algorithm to the quasilinear form is immediate, but to put this extension in a broader context, we will take the view of it as a specialization instead. In this section, we mainly present the algorithm as it applies to equations which are known exactly, and are to be index reduced exactly.

3.1.1 The structure algorithm

In section 2.2.5 we presented the structure algorithm (algorithm 2.2) as a means for index reduction of general nonlinear dae,
f( x′(t), x(t), t ) = 0  (3.1)

This method is generally not possible to implement, since the recurring use of the implicit function theorem often leaves the user with functions whose existence is given by the theorem, but whose implementation is very involved (to the author's knowledge, there are to date no available implementations serving this need). However, it is possible to implement for the quasilinear form, as was done, for instance, using Gauss-Bareiss elimination (Bareiss, 1968) in Visconti (1999), or as outlined in Steinbrecher (2006).

3.1.2 Quasilinear shuffling

Even though algorithms for quasilinear dae exist, the results they produce may be computationally demanding, partly because the problems they apply to are still very general. This should be compared with the linear time-invariant (lti) case,

E x′(t) + A x(t) + B u(t) = 0  (3.2)

to which the very simple and certainly tractable shuffle algorithm (see section 2.2.3) applies, at least as long as there is no uncertainty in the equation coefficients. Interestingly, the algorithm for quasilinear dae described in Visconti (1999) uses the same idea, and it generalizes the shuffle algorithm in the sense that, when applied to the lti form, it reduces to the shuffle algorithm. For this reason, the algorithm in Visconti (1999) is referred to as a quasilinear shuffle algorithm. Note that it is not referred to as the quasilinear shuffle algorithm, since there are many options regarding how to do the generalization. There are also some variations on the theme of the lti shuffle algorithm, leading to slightly different generalizations. In the next two sections, the alternative view of quasilinear shuffling as a specialization of the structure algorithm is taken. Before doing so, we show by a small example what index reduction of quasilinear dae can look like.

3.1 Example
This example illustrates how row reduction can be performed for a quasilinear dae.
The aim is to present an idea rather than an algorithm; algorithms will be a later topic. Consider the dae (dropping the dependency of x on t from the notation)

[ x₂                   4t              2 + tan(x₁)                      ]        [ 5                  ]
[ 2 cos(x₁)            0               e^{x₃}                           ] x′ +  [ x₂ + x₃           ]  =  0
[ (x₂ − 2) cos(x₁)     4t cos(x₁)      2 cos(x₁) + sin(x₁) − e^{x₃}     ]        [ x₁ e^{x₃} + t³ x₂ ]

The leading matrix is singular at any point, since the first row times cos(x₁), less the second row, yields the third row. Adding the second row to the third, and then subtracting cos(x₁) times the first, is an invertible operation and thus yields the equivalent equations:

[ x₂          4t    2 + tan(x₁) ]        [ 5                                          ]
[ 2 cos(x₁)   0     e^{x₃}      ] x′ +  [ x₂ + x₃                                    ]  =  0
[ 0           0     0           ]        [ x₁ e^{x₃} + t³ x₂ + x₂ + x₃ − 5 cos(x₁)   ]

This reveals the implicit constraint of this iteration,

x₁(t) e^{x₃(t)} + t³ x₂(t) + x₂(t) + x₃(t) − 5 cos( x₁(t) ) = 0

Then, differentiation of the constraint yields the new dae

[ x₂                     4t         2 + tan(x₁)    ]        [ 5        ]
[ 2 cos(x₁)              0          e^{x₃}         ] x′ +  [ x₂ + x₃  ]  =  0
[ e^{x₃} + 5 sin(x₁)     t³ + 1     x₁ e^{x₃} + 1  ]        [ 3t² x₂   ]

Here, the leading matrix is generally non-singular, and the dae is essentially an ode bundled with the derived implicit constraint.

3.1.3 Time-invariant input affine systems

In this section, the structure algorithm is applied to equations 0 = f( x(t), x′(t), t ) where f is in the form

f( x, ẋ, t ) = E( x ) ẋ + A( x ) + B( x ) u(t)  (3.3)

with u being a given forcing function. This system is considered time-invariant since time only enters the equation via u. After one iteration of the structure algorithm, we will see what requirements (3.3) must fulfill in order for the equations after one iteration to be in the same form as the original equations. This will show that (3.3) is not a natural form for dae treated by the structure algorithm. In the next section, a more successful attempt will be made, starting from a more general form than (3.3). The system is rewritten in the form

x′(t) = ẋ(t)  (3.4a)
0 = f( x(t), ẋ(t), t )  (3.4b)

to match the setup in Rouchon et al.
(1992) (recall that the notation ẋ is not defined to denote the derivative of x; it is a composed symbol denoting a newly introduced function which is required to equal the derivative of x by (3.4a)). As is usual in the analysis of dae, the analysis is only valid locally, giving just a local solution. As is also customary, all matrix ranks that appear are assumed to be constant in a neighborhood of the initial point defining the meaning of local solution. We will now follow one iteration of the structure algorithm applied to this system (compare algorithm 2.2). Let f₀ := f, and introduce E₀, A₀ and B₀ accordingly. Let µ_k = rank E_k (that is, the rank of the "ẋ-gradient" of f_k, which by assumption may be evaluated at any point in the neighborhood), and let f̄_k denote µ_k components of f_k such that Ē_k denotes µ_k linearly independent rows of E_k. Let f̃_k denote the remaining components of f_k. By the constant rank assumption it follows that, locally, the rows of Ē_k( x ) span the rows of E_k( x ), and hence there exists a function ϕ_k such that

Ẽ_k( x ) = ϕ_k( x ) Ē_k( x )

Hence,

f̃_k( x, ẋ, t ) = Ẽ_k( x ) ẋ + Ã_k( x ) + B̃_k( x ) u(t)
  = ϕ_k( x ) ( Ē_k( x ) ẋ + Ā_k( x ) + B̄_k( x ) u(t) ) + Ã_k( x ) − ϕ_k( x ) Ā_k( x ) + ( B̃_k( x ) − ϕ_k( x ) B̄_k( x ) ) u(t)
  = ϕ_k( x ) f̄_k( x, ẋ, t ) + Ã_k( x ) − ϕ_k( x ) Ā_k( x ) + ( B̃_k( x ) − ϕ_k( x ) B̄_k( x ) ) u(t)

Define

Â_k( x ) := Ã_k( x ) − ϕ_k( x ) Ā_k( x )
B̂_k( x ) := B̃_k( x ) − ϕ_k( x ) B̄_k( x )  (3.5)
Φ_k( x, t, y ) := ϕ_k( x ) y + Â_k( x ) + B̂_k( x ) u(t)

and note that along solutions,

Φ_k( x, t, 0 ) = Φ_k( x, t, f̄_k( x, ẋ, t ) ) = f̃_k( x, ẋ, t ) = 0

In particular, the expression is constant over time, so it can be differentiated with respect to time to obtain a substitute for the (locally) uninformative equations given by f̃_k. Thus, let

f_{k+1}( x, ẋ, t ) := ( f̄_k( x, ẋ, t ) ;
\[ \frac{\partial}{\partial t}\Big( t \mapsto \Phi_k( x(t), t, 0 ) \Big)(t) \tag{3.6} \]

Expanding the differentiation using the chain rule, it follows that

\[
\frac{\partial}{\partial t}\Big( t \mapsto \Phi_k( x(t), t, 0 ) \Big)(t)
= \nabla_1 \Phi_k( x(t), t, 0 )\, x'(t) + \nabla_2 \Phi_k( x(t), t, 0 )
= \Big( \nabla \hat{A}_k( x(t) ) + \nabla\big( \hat{B}_k\, u(t) \big)( x(t) ) \Big)\, x'(t) + \hat{B}_k( x(t) )\, u'(t) \tag{3.7}
\]

However, since x' = ẋ along solutions, the following defines a valid replacement for f_k:

\[
f_{k+1}( x, \dot{x}, t ) =
\left( \begin{pmatrix} \bar{E}_k( x ) \\ \nabla \hat{A}_k( x ) \end{pmatrix}
+ \begin{pmatrix} 0 \\ \nabla\big( \hat{B}_k\, u(t) \big)( x ) \end{pmatrix} \right) \dot{x}
+ \begin{pmatrix} \bar{A}_k( x ) \\ 0 \end{pmatrix}
+ \begin{pmatrix} \bar{B}_k( x ) & 0 \\ 0 & \hat{B}_k( x ) \end{pmatrix}
\begin{pmatrix} u(t) \\ \dot{u}(t) \end{pmatrix} \tag{3.8}
\]

We have now completed one iteration of the structure algorithm, and turn to finding conditions on (3.3) that make (3.8) fulfill the same conditions. In (3.8), the product between u(t) and ẋ( t ) is unwanted, so the structure is restricted by requiring

\[ \nabla \hat{B}_k( x(t) ) = 0 \tag{3.9} \]

that is, B̂_k is constant; B̂_k( x ) = B̂_k^*. Unfortunately, the conflict has just been shifted to a new location by this requirement. The structure of f_{k+1} does not match the structure in (3.3) together with the requirement (3.9), since B̂_k( x ) includes the non-constant expression φ_k( x ). Hence it is also required that E_k is constant so that φ_k( x ) may be chosen constant. This is written as E_k( x ) = E_k^*. Then, if structure is to be maintained,

\[ \begin{pmatrix} \bar{E}_k^* \\ \nabla \hat{A}_k( x ) \end{pmatrix} \]

would have to be constant. Again, this condition is not met since ∇Â_k( x ) is generally not constant. Finally, we are led to also requiring that ∇Â_k( x ) be constant, in other words, that Â_k( x ) = Â_k^* x, so the structure of (3.3) is really

\[ f( x, \dot{x}, t ) = E^* \dot{x} + A^* x + B^* u(t) \]

which is a standard lti dae.

Note that another way to obtain conditions on (3.3) which become fulfilled also by (3.8) is to remove the forcing function u. The key point of this section, however, is that we have seen that in order to be able to run the structure algorithm repeatedly on equations in the form (3.3), an implementation that is designed for one iteration on (3.3) is insufficient.
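The surviving special case, the lti dae, makes the iteration easy to illustrate concretely. The sketch below (pure Python, with a hypothetical 2-by-2 system that is not taken from the thesis) performs one shuffle iteration on E* x' + A* x + B* u = 0: row-reduce E*, differentiate the resulting algebraic row, and observe that the result is again lti once u' is adjoined as an input.

```python
# One shuffle iteration for an LTI dae  E x' + A x + B u = 0  (2x2 sketch).
# Differentiating the algebraic row a.x + b.u = 0 gives a.x' + b.u' = 0, so the
# A-part of that row moves into the new leading matrix and the form stays LTI.
E = [[1.0, 2.0],
     [2.0, 4.0]]   # singular: row 2 = 2 * row 1
A = [[0.0, 1.0],
     [1.0, 0.0]]
B = [[1.0],
     [0.0]]

# Eliminate row 2 of E with the multiplier E[1][0] / E[0][0] = 2.
m = E[1][0] / E[0][0]
E1 = [E[1][j] - m * E[0][j] for j in range(2)]   # becomes [0.0, 0.0]
A1 = [A[1][j] - m * A[0][j] for j in range(2)]   # algebraic row: A1.x + B1.u = 0
B1 = [B[1][j] - m * B[0][j] for j in range(1)]

assert all(e == 0.0 for e in E1)
# Differentiate the algebraic row and keep row 1 of the original system.
E_new = [E[0], A1]                      # new leading matrix
A_new = [A[0], [0.0, 0.0]]
B_new = [B[0] + [0.0], [0.0] + B1]      # the inputs are now (u, u')
print(E_new)  # [[1.0, 2.0], [1.0, -2.0]] -> nonsingular, index reduced
```

The new leading matrix has determinant −4, so one iteration sufficed here; in general the iteration is repeated until the leading matrix is nonsingular.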
In other words, if an implementation that can be iterated exists, it must apply to a more general form than (3.3). 3.1.4 Quasilinear structure algorithm Seeking a replacement for (3.3) such that an implementation for one step of the structure algorithm can be iterated, a look at (3.8) suggests that the form should allow for dependency on time in the leading matrix. Further, since the forcing function u has entered the leading matrix, the feature of u entering the equations in a simple way has been lost. Hence it is no longer motivated to keep Ak ( x ) and Bk ( x )uk (t) separate, but we might as well turn to the quasilinear form in its full generality, fk ( x, ẋ, t ) = Ek ( x, t ) ẋ + Ak ( x, t ) The reader is referred to the previous section for the notation used below. This time, the constant rank assumption leads to the existence of a ϕk such that Ẽk ( x, t ) = ϕk ( x, t ) Ēk ( x, t ) Such a ϕk can be obtained from a row reduction of E, and corresponds to the row reduction performed in a quasilinear shuffle algorithm. It follows that f˜k ( x, ẋ, t ) = Ẽk ( x, t ) ẋ + Ãk ( x, t ) = ϕk ( x, t ) Ēk ( x, t ) ẋ + Ãk ( x, t ) = ϕk ( x, t ) Ēk ( x, t ) ẋ + Āk ( x, t ) − ϕk ( x, t ) Āk ( x, t ) + Ãk ( x, t ) = ϕk ( x, t ) f¯k ( x, ẋ, t ) + Ãk ( x, t ) − ϕk ( x, t ) Āk ( x, t ) Define 4 Âk ( x, t ) = Ãk ( x, t ) − ϕk ( x, t ) Āk ( x, t ) 4 (3.10) Φ k ( x, t, y ) = ϕk ( x, t ) y + Âk ( x, t ) and note that along solutions, Φ k ( x, t, 0 ) = Φ k ( x, t, f¯k ( x, ẋ, t ) ) = f˜k ( x, ẋ, t ) = 0 Taking a quasilinear shuffle algorithm perspective on this, we see that Φ k ( x, t, 0 ) = Âk ( x, t ) is computed by applying the same row operations to A as were used to find the function ϕk above. The expression Φ k ( x, t, 0 ) is constant over time, so it can be differentiated with respect to time to obtain a substitute for the (locally) uninformative equations given 3.2 91 Proposed algorithm by f˜k . Thus, let 4 fk+1 ( x, ẋ, t ) = f¯k ( x, ẋ, t ) ! 
\[ \frac{\partial}{\partial t}\Big( t \mapsto \Phi_k( x(t), t, 0 ) \Big)(t) \]

Expanding the differentiation using the chain rule, it follows that

\[
\frac{\partial}{\partial t}\Big( t \mapsto \Phi_k( x(t), t, 0 ) \Big)(t)
= \nabla_1 \Phi_k( x(t), t, 0 )\, x'(t) + \nabla_2 \Phi_k( x(t), t, 0 )
= \nabla_1 \hat{A}_k( x(t), t )\, x'(t) + \nabla_2 \hat{A}_k( x(t), t ) \tag{3.11}
\]

Again, since x' = ẋ along solutions, f_k may be replaced by

\[
f_{k+1}( x, \dot{x}, t ) =
\begin{pmatrix} \bar{E}_k( x, t ) \\ \nabla_1 \hat{A}_k( x, t ) \end{pmatrix} \dot{x}
+ \begin{pmatrix} \bar{A}_k( x, t ) \\ \nabla_2 \hat{A}_k( x, t ) \end{pmatrix} \tag{3.12}
\]

This completes one iteration of the structure algorithm, and it is clear that this can also be seen as the completion of one iteration of a quasilinear shuffle algorithm. As opposed to the outcome in the previous section, this time (3.12) is in the form we started with, so the procedure can be iterated.

3.2 Proposed algorithm

Having seen how the structure algorithm can be implemented as an index reduction method for (exact) quasilinear dae, and that this results in an immediate generalization of the shuffle algorithm for lti dae, we now turn to the task of detailing the algorithm to make it applicable in a practical setting. Issues to be dealt with include finding a suitable row reduction method and determining whether an expression is zero along solutions.

The problem of adapting algorithms for revealing hidden constraints in exact equations to a practical setting has previously been addressed in Reid et al. (2002). The geometrical awareness in their work is convincing, and the work was extended in Reid et al. (2005). For examples of other approaches to system analysis and/or index reduction which are reminiscent of ours, see for instance Unger et al. (1995) or Chowdhry et al. (2004).

3.2.1 Algorithm

The reason to do index reduction in the following particular way is that it is simple enough to make the analysis easy, and also that it does not rule out some of the candidate forms (Tidefelt, 2007, section 4.2) already in the row reduction step by producing a leading matrix outside the form.
If maintaining invariant forms were not a goal in itself, the algorithm could easily be given better numeric properties (compare section 2.8), and/or better performance in terms of computation time (by reuse of expressions and similar techniques).

Algorithm 3.1 Quasilinear shuffling iteration for invariant forms.

Input: A square dae,

\[ E( x(t), t )\, x'(t) + A( x(t), t ) \overset{!}{=} 0 \]

It is assumed that the leading matrix is singular (when the leading matrix is non-singular, the index is 0 and index reduction is neither needed nor possible).

Output: An equivalent square dae of lower index, and additional algebraic constraints.

Iteration:
1. Select a set of independent rows in E( x(t), t ).
2. Perform fraction-free row reduction of the equations such that exactly the rows that were not selected in the previous step are zeroed.
3. The produced algebraic terms, corresponding to the zero rows in the leading matrix, define algebraic equations restricting the solution manifold.
4. Differentiate the newly found algebraic equations with respect to time, and join the resulting equations with the ones selected in the first step to obtain the new square dae.

Remarks: The most important remark to make here is that the differentiation is not guaranteed to be geometric (recall the remark in algorithm 2.2). Hence, the termination criterion based on the number of iterations in algorithm 2.2 cannot be used safely in this context. If that termination criterion is met, our algorithm aborts with “non-geometric differentiation” instead of “ill-posed”, but no conclusion regarding the existence of solutions to the dae can be drawn.

Although there are choices regarding how to perform the fraction-free row reduction, a conservative approach is taken by not assuming anything more fancy than fraction-free Gaussian elimination, with pivoting used only when so required and done in the most naïve way.
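The row-reduction step of algorithm 3.1 can be sketched in a few lines, here with exact rational entries so that the zero tests are trivial; the function name and the example data are invented for illustration, and a symbolic implementation would apply the same fraction-free operations to symbolic expressions instead of numbers.

```python
from fractions import Fraction as F

def fraction_free_reduce(E, A):
    """Sketch of the row-reduction step of algorithm 3.1.
    Dependent rows of E are zeroed; the same fraction-free operations are
    applied to A, so the A-entries in the zero rows are the discovered
    algebraic equations. Pivoting is done in the most naive way."""
    E = [row[:] for row in E]; A = [row[:] for row in A]
    n, m = len(E), len(E[0])
    pivots, r = [], 0
    for c in range(m):
        # naive pivoting: first row at or below r with a nonzero entry in column c
        p = next((i for i in range(r, n) if E[i][c] != 0), None)
        if p is None:
            continue
        E[r], E[p] = E[p], E[r]; A[r], A[p] = A[p], A[r]
        for i in range(r + 1, n):
            piv, mul = E[r][c], E[i][c]
            # fraction-free: row_i <- piv*row_i - mul*row_r (no division)
            E[i] = [piv * E[i][j] - mul * E[r][j] for j in range(m)]
            A[i] = [piv * A[i][j] - mul * A[r][j] for j in range(len(A[i]))]
        pivots.append((r, c)); r += 1
    return E, A, pivots

# hypothetical 2x2 example: the second E row depends on the first
E0 = [[F(1), F(2)], [F(2), F(4)]]
A0 = [[F(0), F(1)], [F(3), F(0)]]
Er, Ar, piv = fraction_free_reduce(E0, A0)
print(Er[1], Ar[1])  # zero row in E; the algebraic row 1*[3,0] - 2*[0,1] = [3, -2]
```

Differentiating the algebraic rows of Ar and joining them with the pivot rows would complete the iteration.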
This way, it is ensured that the tailoring of the reduction algorithms is really just a tailoring rather than something requiring elaborate extension. As an alternative to the fraction-free row reduction, the same step may be seen as a matrix factorization. (Steinbrecher, 2006) This view hides the reduction process in the factorization abstraction, and may therefore be better suited for high-level reasoning about the algorithm, while current presentation may be more natural from an implementation point of view and easier to reason about on a lower level of abstraction. It would be of no consequence for the analysis in the current chapter to require that the set of equations chosen in the first step always include the equations selected in the previous iteration, as is done in Rouchon et al. (1992). 3.2 Proposed algorithm 93 We stress again that an index reduction algorithm is typically run repeatedly until a low index is obtained (compare, for instance, algorithm 2.2). Here, only one iteration is described, but this is sufficient since the algorithm output is in the same form as the algorithm input was assumed to be. Recall the discussion on fraction producing versus fraction-free row reduction schemes in section 2.8. The proposed algorithm uses a fraction-free scheme for two reasons. Most importantly in this chapter, it does so in order to hold more invariant forms (to be defined). Of subordinate importance is that it can be seen as a heuristics for producing simpler expressions. The body of the index reduction loop is given in algorithm 3.1. 3.2 Example Here, one iteration is performed on the following quasilinear dae: 0 3 x1 (t) x2 (t) sin(t) 0 x1 (t) x2 (t) ! 0 x1 (t) 0 x2 (t) + cos(t) = 0 ex3 (t) 0 4 t 1 0 x3 (t) The leading matrix is clearly singular, and has rank 2. For the first step in the algorithm, there is freedom to pick any two rows as the independent ones. For instance, the rows { 1, 3 } are chosen. 
The remaining row can then be eliminated using the following series of fraction-free row operations. First

0 3 x1 (t) x2 (t) − t sin(t) 0 0 x1 (t) x2 (t) − 4 sin(t) ! 0 0 x2′(t) + cos(t) − 4 x1 (t) = 0 ex3 (t) − t x1 (t) 0 4 t 1 0 x3 (t)

Then

0 3 x1 (t) x2 (t) − t sin(t) 0 0 x1 (t) x2 (t) − 4 sin(t) ! 0 0 0 x2′(t) + e( x(t), t ) = 0 0 t 1 0 x3 (t)

where the algebraic equation discovered is given by

\[ e( x, t ) = ( x_1 x_2 - t \sin(t) ) ( \cos(t) - 4 x_1 ) - ( e^{x_3} - t x_1 ) ( x_2^3 - 4 \sin(t) ) \]

Differentiating the derived equation with respect to time yields a new equation with residual in the form

\[ \begin{pmatrix} a_1( x(t), t ) & a_2( x(t), t ) & a_3( x(t), t ) \end{pmatrix} \begin{pmatrix} x_1'(t) \\ x_2'(t) \\ x_3'(t) \end{pmatrix} + b( x(t), t ) \]

where

\[ a_1( x, t ) = x_2 ( \cos(t) - 4 x_1 ) - 4 ( x_1 x_2 - t \sin(t) ) + t ( x_2^3 - 4 \sin(t) ) \]
\[ a_2( x, t ) = x_1 ( \cos(t) - 4 x_1 ) - 3 x_2^2 ( e^{x_3} - t x_1 ) \]
\[ a_3( x, t ) = -e^{x_3} ( x_2^3 - 4 \sin(t) ) \]
\[ b( x, t ) = -( \sin(t) + t \cos(t) ) ( \cos(t) - 4 x_1 ) - \sin(t) ( x_1 x_2 - t \sin(t) ) + x_1 ( x_2^3 - 4 \sin(t) ) + 4 \cos(t) ( e^{x_3} - t x_1 ) \]

Joining the new equation with the ones selected previously yields the following output from the algorithm (dropping some notation for brevity):

0 3 x1 (t) x2 (t) sin(t) 0 x1 (t) x2 (t) ! t 1 0 x2′(t) + 4 = 0 0 a1 a2 a3 x3 (t) b

Unfortunately, the expression swell seen in this example is typical for the investigated algorithm. Compare with the neat outcome in example 3.1, where some intelligence was used to find a parsimonious row reduction.

3.2.2 Zero tests

The crucial step in algorithm 3.1 is the row reduction, but exactly how this can be done has not been discussed yet. One of the important topics for the row reduction to consider is how it should detect when it is finished. For many symbolic matrices whose rank is determined by the zero-pattern, the question is easy; the matrix is row reduced when the rows which are not independent by construction consist of only structural zeros. This was the case in example 3.2.
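The seminumerical zero test proposed for the algorithm (evaluation at the expansion point, with a list of assumed zeros kept for later monitoring) can be sketched as follows; the names, the representation of expressions as callables, and the tolerance are all assumptions of this illustration, not the thesis implementation.

```python
import math

ASSUMED_ZERO = []   # expressions assumed zero; to be monitored during integration

def is_zero(expr, point, tol=1e-9, structural=False):
    """Seminumerical zero test (sketch). `expr` is a callable, `point` holds
    the (x0, t0) data, and `tol` is an assumed tolerance; the text stresses
    that choosing this tolerance is non-trivial."""
    if structural:
        return True                    # structural zeros need no evaluation
    if abs(expr(**point)) < tol:
        ASSUMED_ZERO.append(expr)      # record the assumption for monitoring
        return True
    return False

# example: sin(t)^2 + cos(t)^2 - 1 is identically zero, but not structurally so
p = {"t": 0.3}
print(is_zero(lambda t: math.sin(t)**2 + math.cos(t)**2 - 1, p))  # True (assumed)
print(is_zero(lambda t: math.cos(t), p))                          # False
print(len(ASSUMED_ZERO))                                          # 1
```

A numerical integrator would re-evaluate the entries of ASSUMED_ZERO along the solution and abandon the reduced equations when one of them ceases to be small.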
However, the termination criterion is generally more complicated since there may be expressions in the matrix which are identically zero, although this is hard to detect using symbolic software. It is proposed that structural zeros be tracked in the algorithm, making many of the zero tests trivially affirmative. An expression which is not a structural zero is tested against zero by evaluating it (and possibly its derivative with respect to time) at the point where the index reduction is being performed. If this test is passed, the expression is assumed rewritable to zero, but anticipating that this will be wrong occasionally, the expression is also kept in a list of expressions that are assumed to be zero for the index reduction to be valid. Numerical integrators and the like can then monitor this list of expressions, and take appropriate action when an expression no longer evaluates to zero.

Note that there are some classes of quasilinear dae where all expressions can be put in a canonical form in which expressions that are identically zero can be detected. For instance, this is the case when all expressions are polynomials.

Of course, some tolerance must be used when comparing the value of an expression against zero. Setting this tolerance is non-trivial, and at this stage we have no scientific guidelines to offer. This need was the original motivation for the research on matrix-valued singular perturbations.

The following example exhibits the weakness of the numerical evaluation approach. It will be commented on in section 3.2.5.

3.3 Example

Let us consider numerical integration of the inevitable pendulum, modeled by

\[ x'' \overset{!}{=} \lambda x \]
\[ y'' \overset{!}{=} \lambda y - g \]
\[ 1 \overset{!}{=} x^2 + y^2 \]

where we take g := 10.0. Index reduction will be performed at two points (the time part of these points is immaterial and will not be written out); one at rest, the other not at rest.
Note that it is quite common to begin a simulation of a pendulum (as well as many other systems) at rest. The following values give approximately an initial angle of 0.5 rad below the positive x axis:

x0,rest : { x(0) = 0.87, ẋ(0) = 0, y(0) = −0.50, ẏ(0) = 0, λ(0) = −5.0 }
x0,moving : { x(0) = 0.87, ẋ(0) = −0.055, y(0) = −0.50, ẏ(0) = −0.1, λ(0) = −4.8 }

Clearly, if ẋ or ẏ constitutes an entry of any of the intermediate leading matrices, the algorithm will be in trouble, since these values are not typically zero. After two reduction steps, which are equal for both points, the equations look as follows (not showing the already deduced algebraic constraints):

\[
\begin{pmatrix}
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
2 \dot{x} & 2 x & 2 \dot{y} & 2 y & 0
\end{pmatrix}
\begin{pmatrix} x' \\ \dot{x}' \\ y' \\ \dot{y}' \\ \lambda' \end{pmatrix}
+
\begin{pmatrix} -\dot{x} \\ -\lambda x \\ -\dot{y} \\ 10 - \lambda y \\ 0 \end{pmatrix}
\overset{!}{=} 0
\]

Reducing these equations at x0,rest, the algorithm produces the algebraic equation

\[ 20 y \overset{!}{=} 2 ( x^2 + y^2 ) \lambda \]

but the correct equation, produced at x0,moving, is

\[ 20 y \overset{!}{=} 2 ( x^2 + y^2 ) \lambda + 2 \dot{x}^2 + 2 \dot{y}^2 \]

The perturbation results in the second part of the thesis are limited to linear systems, but here a theory for non-linear systems is needed. The two-dimensional pendulum in Cartesian coordinates is used so often in the study of dae that it was given this nickname in Mattsson and Söderlind (1993). While their model contains parameters for the length and mass of the pendulum, our pendulum is of unit length and mass to simplify notation.

Our intuition about the mechanics of the problem gives immediately that ẋ′ and ẏ′ are non-zero at x0,rest. Hence, computing the derivatives of all variables using a reduction to index 0 would reveal the mistake. As a final note on the failure at x0,rest, note that ẋ and ẏ would be on the list of expressions that had been assumed zero. Checking these conditions after integrating the equations for a small period of time would detect the problem, so delivery of an erroneous result is avoided.
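The failure mode can be reproduced with plain arithmetic: the velocity-dependent term that distinguishes the two derived constraints passes a pointwise zero test at x0,rest but not at x0,moving. The variable names and the tolerance below are illustrative.

```python
# The two index-reduction points from example 3.3 (g = 10.0).
x0_rest   = {"x": 0.87, "dx": 0.0,    "y": -0.50, "dy": 0.0,  "lam": -5.0}
x0_moving = {"x": 0.87, "dx": -0.055, "y": -0.50, "dy": -0.1, "lam": -4.8}

def kinetic_term(p):
    # The velocity-dependent term missing from the constraint derived at rest.
    return p["dx"]**2 + p["dy"]**2

tol = 1e-9
print(kinetic_term(x0_rest) < tol)    # True: wrongly assumed rewritable to zero
print(kinetic_term(x0_moving) < tol)  # False: the term is about 0.013
```

At x0,rest the term evaluates exactly to zero, so no tolerance choice can save the pointwise test here; only monitoring during integration, or evaluating derivatives as suggested in the text, reveals the mistake.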
3.2.3 Longevity

At the point ( x(t0), t0 ), the proposed algorithm performs tasks such as row operations, index reduction, selection of independent equations, etc. Each of these may be valid at the point where they were computed, but fail to be valid at future points along the solution trajectory. By the longevity of such an operation, or the product thereof, we refer to the duration until validity is lost.

A row reduction operation becomes invalid when its pivot entry becomes zero. A selection of equations to be part of the square index 1 system becomes invalid when the iteration matrix loses rank. An index reduction becomes invalid if an expression which was assumed to be zero becomes non-zero. The importance of longevity considerations is shown by an example.

3.4 Example

A circular motion is described by the following equations (think of ζ as “zero”):

\[ \zeta \overset{!}{=} x\, x' + y\, y' \]
\[ 1 \overset{!}{=} ( x' )^2 + ( y' )^2 \]
\[ 1 \overset{!}{=} x^2 + y^2 \]

This system is square but not in quasilinear form. The trivial conversion to quasilinear form described in section 2.2.4 yields a square dae of size 5 with new variables introduced for the derivatives x' and y'. By the geometrical interpretation of the equations we know that the solution manifold is one-dimensional and equal to the two disjoint sets (distinguished by the sign choices below, of which one has been chosen to work with)

\[ \Big\{\, ( x, \dot{x}, y, \dot{y}, \zeta ) \in \mathbb{R}^5 : x = \cos(\beta),\ \dot{x} = -(+) \sin(\beta),\ y = \sin(\beta),\ \dot{y} = +(-) \cos(\beta),\ \zeta = 0,\ \beta \in [\, 0, 2\pi\, ] \,\Big\} \]

Let us consider the initial conditions given by β = 1.4 in the set characterization. The quasilinear equations are:

\[ \dot{x} \overset{!}{=} x' \]
\[ \dot{y} \overset{!}{=} y' \]
\[ \zeta \overset{!}{=} \dot{x}\, x + \dot{y}\, y \]
\[ 1 \overset{!}{=} \dot{x}^2 + \dot{y}^2 \]
\[ 1 \overset{!}{=} x^2 + y^2 \]

Note that there are three algebraic equations here.
The equations are already row reduced, and after performing one differentiation of the algebraic constraints and one row reduction, the dae looks like 0 −ẋ 1 x 0 y 1 −ẏ ẋ ẏ −1 x 0 y ζ + 0 0 ẋ 2 ẋ 2 ẏ 0 0 ẏ 2 x ẋ + 2 y ẏ Differentiation of the derived algebraic constraints will yield a full-rank leading matrix, so the index reduction algorithm terminates here. There are now four differential equations, x0 1 y 0 −ẋ 0 −ẏ 1 ζ + ẋ ẏ −1 x y 0 0 ẋ 2 ẋ 2ẏ 0 0 ẏ and four algebraic equations ! ζ = ẋ x + ẏ y ! 1 = ẋ2 + ẏ 2 ! 1 = x2 + y 2 ! 0 = 2 x ẋ + 2 y ẏ with Jacobian ẋ 2x 2 ẋ ẏ 2y 2 ẏ −1 x 2 ẋ 2x y 2 ẏ 2y ! Another quasilinear formulation would be obtained by replacing the third equation by ζ = x x0 + y y 0 , containing only two explicit algebraic equations. The corresponding leading matrix would not be row reduced, so row reduction would reveal an implicit algebraic equation, and the result would be the same in the end. 98 3 Shuffling quasilinear dae The algebraic equations are independent, so they shall be completed with one of the differential equations to form a square index 1 system. The last two differential equations are linearly dependent on the algebraic equations by construction, but either of the first two differential equations is a valid choice. Depending on the choice, the first row of the iteration matrix will be either of 1 0 0 −h 0 0 1 0 0 −h or After a short time, the four other rows of the iteration matrix (which are simply the Jacobian of the algebraic constraints) will approach π − cos( π2 ) −1 cos( π2 ) sin( π2 ) 1 −1 1 sin( 2 ) π π 2 sin( 2 ) −2 cos( 2 ) 2 = π π 2 2 cos( 2 ) 2 sin( 2 ) π π π π 2 sin( 2 ) −2 cos( 2 ) 2 cos( 2 ) 2 sin( 2 ) 2 2 In particular, the third row will be very aligned with 0 1 0 0 −h , which means ! ! that it is better to select the differential equation ẋ = x0 than ẏ = y 0 . 
This holds not only on paper, but numerical simulation using widespread index 1 solution software (Hindmarsh et al., 2004, through the Mathematica interface) demands that the former differential equation be chosen.

This example illustrated the fact that, if an implementation derives the reduced equations without really caring about the choices it makes, things such as the ordering of variables may influence the end result. Hence, the usefulness of the reduced equations would depend on implementation details in the algorithm, even though the result does not feature any numerically computed entities. Even though repeated testing of the numerical conditioning while the equations are integrated is sufficient to detect numerical ill-conditioning, the point made here is that at the point ( x0, t0 ) one wants to predict what will be the good ways of performing the row reduction and selecting equations to appear in the square index 1 form.

While it is difficult to foresee when the expressions which are assumed rewritable to zero cease to be zero (the optimistic longevity estimation is simply that they will remain zero forever), there is more to be done concerning the longevity of the row reduction operations. For each entry used as a pivot, it is possible to formulate scalar conditions that are to be satisfied as long as the pivot stays in use. For instance, it can be required that the pivot be no smaller in magnitude than half the magnitude of the largest value it is used to cancel. Using the longevity predictions, each selection of a pivot can be made to maximize longevity. Clearly, this is a non-optimal greedy strategy (since only one pivot selection is considered at a time, compared to considering all possible sequences of pivot selections at once), but it can be implemented with little effort and at a reasonable runtime cost.
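Such a pivot condition, and the greedy longevity-maximizing selection, can be sketched as follows; the function names are invented, and the longevity oracle is left abstract since its computation is discussed separately.

```python
def pivot_condition(pivot, cancelled):
    """Scalar condition from the text: the pivot should stay no smaller in
    magnitude than half the magnitude of the largest value it cancels.
    Returns the margin; the pivot choice is valid while the margin is > 0."""
    return abs(pivot) - 0.5 * max(abs(v) for v in cancelled)

def greedy_pivot(column, longevity):
    """Pick the pivot row maximizing an externally supplied longevity
    estimate; one pivot at a time, i.e. the non-optimal greedy strategy."""
    candidates = [r for r in range(len(column)) if column[r] != 0]
    return max(candidates, key=longevity)

col = [0.1, -2.0, 0.5]
# a hypothetical longevity oracle: here simply |entry| as a crude proxy
print(greedy_pivot(col, longevity=lambda r: abs(col[r])))  # 1
print(pivot_condition(col[1], [col[0], col[2]]) > 0)       # True: |-2.0| > 0.25
```

During integration, the margin returned by pivot_condition is one of the scalar expressions whose first zero-crossing the longevity machinery tries to predict.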
3.2.4 Seminumerical twist

In section 3.2.2 it was suggested that numerical evaluation of expressions (combined with tracking of structural zeros) should be used to determine whether an expression can be rewritten to zero or not. That added a bit of numerics to an otherwise symbolic index reduction algorithm, but this combination of symbolic and numeric techniques is more of a necessity than a nice twist. We now suggest that numerical evaluation should also be the basic tool when predicting longevity. While the zero tests are accompanied by difficult questions about tolerances but are otherwise rather clear how to perform, it is expected that the numeric decisions discussed in this section allow more sophistication while not requiring intricate analysis of how tolerances shall be set.

Without loss of generality, it is assumed that the scalar tests compare an expression, e, with the constant 0. The simplest way to estimate (predict) the longevity of e( x(t), t ) < 0 at the point ( x0, t0 ) is to first compute the derivatives x' at t0 using a method that does not care about longevity, and use linear extrapolation to find the longevity. In detail, the longevity, denoted L_e( x0, t0 ), may be estimated as

\[ \dot{e}( x_0, t_0 ) := \nabla_1 e( x_0, t_0 )\, x'(t_0) + \nabla_2 e( x_0, t_0 ) \]
\[ \hat{L}_e( x_0, t_0 ) := \begin{cases} -\dfrac{ e( x_0, t_0 ) }{ \dot{e}( x_0, t_0 ) } & \text{if } \dot{e} > 0 \\[1ex] \infty & \text{otherwise} \end{cases} \]

In case of several alternatives having infinite longevity estimates by the calculation above, the selection criterion needs to be refined. The natural extension of the above procedure would be to compute higher order derivatives to be able to estimate the first zero-crossing, but that would typically involve more differentiation of the equations than is needed otherwise, and is therefore not a good option. Rather, some other heuristic should be used.
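The linear-extrapolation estimate translates directly into code; the numbers below are hypothetical, and the derivative ė is assumed to have been computed from x'(t0) as in the formula above.

```python
import math

def longevity_estimate(e, e_dot):
    """Linear extrapolation of the time until the condition e < 0 is violated:
    -e/e_dot when e_dot > 0 (e increasing towards the zero-crossing),
    and infinity otherwise."""
    return -e / e_dot if e_dot > 0 else math.inf

# condition value and its time derivative at (x0, t0), hypothetical numbers:
print(longevity_estimate(-0.25, 0.125))  # 2.0: predicted to fail in 2 time units
print(longevity_estimate(-0.25, -0.2))   # inf: moving away from the boundary
```

The infinite branch is exactly the case the text mentions where several alternatives tie and some other heuristic must break the tie.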
One heuristic would be to disregard signs, but one could also altogether ignore derivatives when this situation occurs and fall back on the usual pivot selection based on magnitudes only.

A very simple way to select equations for the square index 1 system is to greedily add one equation at a time, picking the one which has the largest angle to its projection in the space spanned by the equations already selected. If the number of possible combinations is not overwhelmingly large, it may also be possible to check the condition number for each combination, possibly also taking into account the time derivative of the condition number.

3.2.5 Monitoring

Since the seminumerical algorithm may make false judgements regarding what expressions are identically zero, expressions which are not structurally zero but have passed the zero-test anyway need to be monitored. It may not be necessary to evaluate these expressions after each time step, but as was seen in example 3.3, it is wise to be alert during the first few iterations after the point of index reduction.

For the (extremely) basic bdf method applied to equations of index 1, the local integration accuracy is limited by the condition number of the iteration matrix for a time step of size h. In the quasilinear index 1 case and for small h, the matrix should have at least as good condition as

\[ \begin{pmatrix} \bar{E}( x, t ) \\ \nabla_1 \tilde{A}( x, t ) \end{pmatrix} \tag{3.13} \]

To see where this comes from, consider solving for x_n in

\[ \begin{pmatrix} \bar{E}( x_n, t_n ) \\ 0 \end{pmatrix} \frac{ x_n - x_{n-1} }{ h_n } + \begin{pmatrix} \bar{A}( x_n, t_n ) \\ \tilde{A}( x_n, t_n ) \end{pmatrix} \overset{!}{=} 0 \]

where x_{n−1} is the iterate at time t_n − h_n. The equations being index 1 guarantees that this system has a locally unique solution for h_n small enough. Any method of some sophistication will perform row (and possibly column) scaling at this stage to improve numerical conditioning.
(Brenan et al., 1996, section 5.4.3)(Golub and Van Loan, 1996) It is assumed that any implementation will achieve at least as good condition as is obtained by scaling the first group of equations by hn . For small hn the equations may be approximated by their linearized counterpart for which the numerical conditioning is simply given by the condition number of the coefficient matrix for xn . See for example Golub and Van Loan (1996) for a discussion of error analysis for linear equations. This coefficient of the linearized equation is " # ∇TĒ( xn , tn ) · (xn − xn−1 ) + Ē( xn , tn ) hn ∇1 Ā( xn , tn ) 1 + ∇1 Ã( xn , tn ) 0 T Using the approximation " xn − xn−1 ≈ hn x0 (tn ) ≈ hn Ē( xn , tn ) ∇1 Ã( xn , tn ) #−1 Ā( xn , tn ) ∇2 Ã( xn , tn ) ! gives the coefficient T " " #−1 ! T Ē( x , t ) Ā( x , t ) n n n n + ∇1 Ā( xn , tn ) Ē( xn , tn ) ∇1Ē( xn , tn ) + hn ∇1 Ã( xn , tn ) ∇2 Ã( xn , tn ) ∇1 Ã( xn , tn ) 0 # As hn approaches 0, the matrix tends to (3.13). This limit will be used to monitor numerical integration in examples, but rather than looking at the raw condition number κ(t) as a function of time t, a static transform, φ, will be applied to this value in order to facilitate prediction of when the iteration matrix becomes singular. If possible, φ should be chosen such that φ( κ(t) ) is approximately affine in t near a singularity. Note that the iteration matrix of example 2.19 was found for an lti dae, while we are currently considering the more general quasilinear form. The notation used is not widely accepted. Neither will it be explained here since the meaning should be quite intuitive, and the terms involving inverse transposes will be discarded in just a moment. 3.2 101 Proposed algorithm 0 (κ = ∞) 1 2 3 t 0 −0.05 1 − κ(t) −0.1 −0.15 Figure 3.1: The condition of the iteration matrix for the better choice of square index 1 system in example 3.4. The strictly increasing transform of the condition number makes it roughly linear over time near the singularity. 
Helper lines are drawn to show the longevity predictions at the times 1.7 (pessimistic), 2.0 (optimistic), and 2.5 (rather accurate). Since the ∞-norm and 2-norm condition numbers are essentially the same, the static transform will be heuristically developed for the 2-norm condition number. Approximating the singular values to first order as functions of time, it follows that, near a singularity, the condition number can be expected to grow as t 7→ t 1−t , where t1 is 1 the time of the singularity. The following observation is useful to see how φ can be chosen to match the behavior of the condition number near a singularity. Suppose φ is unbounded above, that is, φ(κ) → ∞ as κ → ∞. Then every affine approximation of φ( κ(t) ) will be bad near the singularity, since the affine approximation cannot tend to infinity. Hence, one should consider strictly increasing functions that map infinity to a finite value, and one may normalize the finite value to be 0 without loss of generality. Similarly, the slope of the affine approximation may be normalized to 1. Given the assumed growth of the condition number near a singularity, this leads to the following equation for φ( κ ). ! 1 1 ! ⇐⇒ φ = t − t1 = − 1 t1 − t t1 −t 1 φ( κ ) = − κ ! Since κ is always at least 1, this will squeeze the half open interval [ 1, ∞ ) onto [ −1, 0 ). As is seen in figure 3.1, the first order approximation is useful well in advance of the singularity. However, further away it is not. For example the prediction based on the linearization at time 2 would be rather optimistic. 102 3.2.6 3 Shuffling quasilinear dae Sufficient conditions for correctness It may not be obvious that the seminumerical row reduction algorithm above really does the desired job. After all, it may seem a bit simplistic to reduce a symbolic matrix based on its numeric values evaluated at a certain point. In this section, sufficient (while perhaps conservative) conditions for correctness will be presented. 
Some new nomenclature will be introduced, but only for the purpose of making the theorem below readable. Consider the quasilinear dae ! E( x(t), t ) x0 (t) + A( x(t), t ) = 0 Replacing a row ! ei ( x(t), t ) x0 (t) + ai ( x(t), t ) = 0 by (dropping some “(t)” in the notation for compactness) h i h i ! ω( x, t )ei ( x, t ) + η( x, t )ej ( x, t ) x0 + ω( x, t )ai ( x, t ) + η( x, t )aj ( x, t ) = 0 where ω and η are both continuous at ( x0 , t0 ) and ω is non-zero at this point, is called a non-singular row operation on the dae. Since the new dae is obtained by multiplication from the left by a non-singular matrix, the non-singular row operation on the quasilinear dae does not alter the rank of the leading matrix. Let x be a solution to the dae on the interval I, and assume that the rank of E( x(t), t ) is constant as a function of t on I. A valid row reduction at ( x0 , t0 ) of the original (quasilinear) dae ! E( x(t), t ) x0 (t) + A( x(t), t ) = 0 is a sequence of non-singular row operations such that the resulting (quasilinear) dae ! Err ( x(t), t ) x0 (t) + Arr ( x(t), t ) = 0 has the following properties: • A solution x is locally a solution to the original dae if and only if it is a solution to the resulting dae. • Err ( x, t ) has only as many non-zero rows as E( x, t ) has rank. 3.5 Theorem. Consider the time interval I with inf I = t0 , and the dae with initial ! condition x(t0 ) = x0 . Assume 1. The dae with initial condition is consistent, and the solution is unique and differentiable on I. 2. The dae is sufficiently differentiable for the purpose of running the row reduction algorithm. 3.2 Proposed algorithm 103 3. Entries of E( x0 , t0 ) that are zero, are zero in E( x(t), t ) for all t ∈ I. Further, this condition shall hold for intermediate results as well. 
Then there exists a time t1 ∈ I with t1 > t0 such that the row reduction of the symbolic matrix E( x, t ) based on the numeric guide E( x0 , t0 ) will compute a valid row reduction where the non-zero rows of the reduced leading matrix Err ( x(t), t ) are linearly independent for all t ∈ [ t0 , t1 ]. Proof: The first two assumptions ensure that each entry of E( x(t), t ) is continuous as a function of t at every iteration. Since the row reduction will produce no more intermediate matrices than there are entries in the matrix, the total number of entries in question is finite, and each of these will be non-zero for a positive amount of time. Further, the non-zero rows of Err ( x0 , t0 ) are independent by construction (as this is the reduced form of the guiding numeric matrix). Therefore they will contain a nonsingular sub-block. The determinant of this block will hence be non-zero at time t0 , and will be a continuous function of time. Hence, there exists a time t1 ∈ I with t1 > t0 such that all those expressions that are non-zero at time t0 remain non-zero for all t ∈ [ t0 , t1 ]. In particular, the determinant will remain non-zero in this interval, thus ensuring linear independence of the nonzero reduced rows. The last assumption ensures the constant rank condition required by the definition of valid row reduction, which is a consequence of each step in the row reduction preserving the original rank, and the rank revealed by the reduced form is already shown to be constant. Beginning with the part of the definition of valid row reduction concerning the number of zero-rows, note first that the number of non-zero rows will match the rank at time t0 since the row reduction of the numeric guide will precisely reveal its rank as the number of non-zero rows. It then suffices to show that the zero-pattern of the symbolic matrix contains that of the numeric guide during each iteration of the row reduction algorithm. 
However, this follows quite directly from the assumptions, since the zero-patterns match at E( x(t0), t0 ), and the assumption about zeros staying zero ensures that no spurious non-zeros appear in the symbolic matrix evaluated at later points in time.

It remains to show that a function x is a solution of the reduced dae if and only if it is a solution of the original dae. This is immediate, since the result of the complete row reduction process may be written as a multiplication from the left by a sequence of non-singular matrices. Hence, the equations are equivalent.

Note that, in example 3.3, the conditions of theorem 3.5 were not satisfied, since the expressions ẋ and ẏ were zero at ( x0, t0 ) but do not stay zero. Since their deviation from zero is continuous, they will stay close to zero during the beginning of the solution interval. Hence, it might be expected that the computed solution is approximately correct near t0, and this is confirmed by experiments. However, to show that this observation is generally valid, and to quantify the degree of approximation, we need a kind of perturbation theory which remains to be developed. This problem was part of the original motivation for studying matrix-valued singular perturbations, the main topic of the thesis.

3.3 Consistent initialization

The importance of being able to find a point on the solution manifold of a dae, which is in some sense close to a point suggested or guessed by a user, was explained in section 2.2.7. In this section, this problem is addressed using the proposed seminumerical quasilinear shuffle algorithm. While existing approaches (see section 2.2.7) separate the structural analysis from the determination of initial conditions, we note that the structural analysis may depend on where the dae is to be initialized.
The interconnection can readily be seen in the seminumerical approach to index reduction, and a simple bootstrap approach can be attempted to handle it.

3.3.1 Motivating example

Before turning to the relation between guessed initial conditions and the algebraic constraints derived by the seminumerical quasilinear shuffle algorithm, we give an illustration to keep in mind in the sections below.

3.6 Example
Let us return to the inevitable pendulum of example 3.3,

ẍ =! λ x
ÿ =! λ y − g
1 =! x² + y²

where g = 10.0, with guessed initial conditions given by

x0,guess : { x(0) = cos(−0.5), ẋ(0) = 0, y(0) = sin(−0.5), ẏ(0) = −0.1, λ(0) = 0 }

Running the seminumerical quasilinear shuffle algorithm at x0,guess produces the algebraic constraints

Cx0,guess :
1 =! x² + y²
0 =! 2 x ẋ + 2 y ẏ
0 =! 2 x² λ + 2 y ( y λ − g ) + 2 ẏ²

(The implementation used here does not make the effort to compute the derivatives needed to make better zero tests and longevity estimates.)

The residuals of these equations at x0,guess are

0.0
0.0959
9.61

so either the algorithm simply failed to produce the correct algebraic constraints although x0,guess was consistent, or x0,guess is simply not consistent. Assuming the latter, we try to find another point by modifying the initial conditions for the three variables ẋ, y, and λ to satisfy the equally many equations in Cx0,guess. This yields

x0,second : { x(0) = cos(−0.5), ẋ(0) = −0.055, y(0) = −0.48, ẏ(0) = −0.1, λ(0) = −4.804 }

(This point does satisfy Cx0,guess; solving the equations could be difficult, but in this case it was not.) At this point the algorithm produces another set of algebraic constraints:

Cx0,second :
1 =! x² + y²
0 =! 2 x ẋ + 2 y ẏ
0 =!
2 x² λ + 2 y ( y λ − g ) + 2 ẋ² + 2 ẏ²

with residuals at x0,second:

0.0
0.0
0.0060

By modifying the same components of the initial conditions again, we obtain

x0,final : { x(0) = cos(−0.5), ẋ(0) = −0.055, y(0) = −0.48, ẏ(0) = −0.1, λ(0) = −4.807 }

This point satisfies Cx0,second, and generates the same algebraic constraints as x0,second. Further, the algorithm encountered no non-trivial expressions which had to be assumed rewritable to zero, so the index reduction was performed without seminumerical decisions. Hence, the index reduction is locally valid, and the reduced equations provide a way to construct a solution starting at x0,final. In other words, x0,final is consistent.

3.3.2 A bootstrap approach

A seminumerical shuffle algorithm maps any guessed initial conditions to a set of algebraic constraints. Under certain circumstances, including that the initial conditions are truly consistent, the set of algebraic constraints will give a local description of the solution manifold. Hence, truly consistent initial conditions will be consistent with the derived algebraic constraints, and our primary objective is to find points with this property. Of course, if a correct characterization of the solution manifold is available, finding consistent initial conditions is easy given a reasonable guess; there are several ways to search for a point which minimizes some norm of the residuals of the algebraic constraints. If the minimization fails to find a point where all residuals are zero, the guess was simply not good enough, and an implementation may require a better guess from the user.

If the guessed initial conditions are not consistent with the derived algebraic constraints, the guess cannot be truly consistent either, and we are interested in finding a nearby point which is truly consistent.
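The numbers in the example can be checked with a few lines of code. The sketch below is ours, assuming the constraints arise by differentiating the length constraint 1 = x² + y² along the pendulum dynamics ẍ = λx, ÿ = λy − g (with the ẋ² term dropped, as the algorithm did at the guess where ẋ(0) = 0). It reproduces the residuals at x0,guess and then solves the three equations for ẋ, y, and λ in closed form:

```python
import math

g = 10.0
x, dy = math.cos(-0.5), -0.1   # components kept fixed in the example

def residuals(dx, y, lam):
    # Cx0,guess: the constraint 1 = x^2 + y^2 and its first two total
    # time derivatives along xdd = lam*x, ydd = lam*y - g (no xd^2
    # term, since xd(0) = 0 at the guess).
    return (x**2 + y**2 - 1.0,
            2*x*dx + 2*y*dy,
            2*x**2*lam + 2*y*(y*lam - g) + 2*dy**2)

r = residuals(0.0, math.sin(-0.5), 0.0)   # residuals at x0,guess
# -> approximately (0.0, 0.0959, 9.61), as reported in the example

# Adjust (xd, y, lam), keeping the other components fixed; for these
# particular equations this can be done in closed form.
y   = -math.sqrt(1.0 - x**2)              # from e1, keeping the guessed sign
dx  = -y*dy/x                             # from e2
lam = (2*y*g - 2*dy**2) / (2*x**2 + 2*y**2)   # from e3
# -> xd ≈ -0.055, y ≈ -0.479, lam ≈ -4.804, i.e. the point x0,second
```

The closed-form solve stands in for the generic residual minimization discussed below; in general a Newton-type iteration would be used instead.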
In the hope that the derived algebraic constraints could be a correct characterization of the solution manifold, even though they were derived at an inconsistent point, the proposed action is to find a nearby point which satisfies the derived constraints. What should be considered nearby is often very application-specific. Variables may be of different types, defying a natural metric. Instead, if the solution manifold is characterized by m independent equations, a user may prefer to keep all but m variables constant, and adjust the remaining ones to make the residuals zero. This avoids in a natural way the need to produce an artificial metric.

No matter how nearby is defined, we may assume that the definition implies a mapping from the guessed point to a point which satisfies the derived algebraic constraints (or fails to find such a point, see the remark above). Noting that a guessed point of initial conditions is mapped to a set of algebraic constraints, which in turn maps the guessed point to a new point, we propose that this procedure be iterated until convergence or cycling is either detected or suspected.

3.3.3 Comment

The algebraic constraints produced by the proposed seminumerical quasilinear shuffle algorithm are a function of the original equations and a number of decisions (pivot selections and termination criteria) that depend on the point at which index reduction is performed. Since the number of index reduction steps before the algorithm gives up is bounded given the number of equations, and the number of pivot selections and termination criterion evaluations is bounded given the number of index reduction steps, the total number of decisions that depend on the point of index reduction is bounded (although the algorithm has to give up for some problems). Hence, any given original equations can only give rise to finitely many sets of algebraic constraints.
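The iteration just described can be sketched as a small fixed-point driver. Here `derive` and `project` are hypothetical stand-ins for the seminumerical shuffle algorithm and for the application-specific "nearby point" search, and the toy demonstration at the end is ours, not from the thesis:

```python
def bootstrap(derive, project, x_guess, max_rounds=20):
    """Iterate guess -> derived constraints -> nearby satisfying point,
    until the point repeats (convergence) or cycling is suspected."""
    history = [x_guess]
    x = x_guess
    for _ in range(max_rounds):
        constraints = derive(x)        # point -> algebraic constraints
        x = project(x, constraints)    # constraints -> nearby satisfying point
        if x in history:               # fixed point reached, or cycling
            return x
        history.append(x)
    return x                           # suspected non-convergence: give up

# Toy demonstration: the "constraints" prescribe a single target value,
# and the projection rounds it so that exact repetition is detectable.
# The iteration is Newton's method for sqrt(2) in disguise.
x_star = bootstrap(lambda x: 0.5*(x + 2.0/x),
                   lambda x, target: round(target, 9),
                   x_guess=1.5)
```

A real implementation would additionally compare the derived constraint sets themselves between rounds, since two different points may generate the same constraints.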
The space of initial conditions (almost all of which are inconsistent) can thus be partitioned into finitely many regions according to what algebraic constraints the algorithm produces.

Defining the index of a dae without assuming that certain ranks are constant in the neighborhood of solutions can be difficult, and motivates the use of so-called uniform and maximum indices, see Campbell and Gear (1995). For the bootstrap approach above, constant ranks near solutions imply that the algorithm will produce the correct algebraic constraints if only the guessed initial conditions are close enough to the solution manifold. To see how the approach suffers if constant ranks near solutions are not granted, it suffices to note that even finding a point which generates constraints at level 0 which it satisfies can be hard. In other words, consistent initialization can then be hard even for problems of index 1.

3.4 Conclusions

The shuffle algorithm for lti dae can be generalized so that it applies to quasilinear dae. The current chapter highlights that numerical evaluation may be both necessary and convenient, and proposes several aspects of index reduction and numerical solution where seminumerical methods are useful. The numerical evaluation introduces uncertainty in the equations, and the related perturbation problems do not have the structure of any of the well-known perturbation problems in the literature. New perturbation problems also arise near points where there is a change in matrix ranks. Since the quasilinear form is a nonlinear form, the perturbation theory needed to complete the proposed algorithms will also be nonlinear. Although the matrix-valued perturbation problems studied in the second part of the thesis are all linear, it has always been a long-term goal to derive results for nonlinear systems so that the quasilinear shuffle algorithm can be theoretically justified.
Having emphasized the connection between nonlinear problems and the linear theory developed in the second part of the thesis, it must be mentioned that the results for linear systems in the second part also have an immediate application in the shuffle algorithm for linear systems.

Part II
Results

4 Point-mass filtering on manifolds

Life in differential-algebraic equations is full of constraints. What if the index reduction based on matrix-valued singular perturbations revealed a constraint showing that we might as well consider ourselves living on a sphere?! If not knowing where we are on this sphere is a problem we often have to deal with, we will surely write a lot of algorithms to find out. Writing algorithms in terms of coordinate tuples was so convenient when we thought the world was flat, but now what? Trying to use the same interpretation of the coordinate tuple at any point in the world causes singularities in our algorithms, and we get lost again. We need new tools to shield us from the traps of the curvature of space.

The current chapter is an extended version of

Henrik Tidefelt and Thomas B. Schön. Robust point-mass filters on manifolds. In Proceedings of the 15th IFAC Symposium on System Identification, pages 540–545, Saint-Malo, France, July 2009.

and presents a framework for algorithm development for objects on manifolds. The only connection with the rest of the thesis is that index reduction of nonlinear differential-algebraic equations may reveal a manifold structure in the problem. There are two main approaches for how to deal with this. The first approach is to keep the equations in their differential-algebraic form, and try to invent (or search the literature for) dae versions of all the theory and algorithms we need for the task at hand (for example, finding out where we are on the sphere).
The second approach is to use the ordinary differential equations that result from index reduction together with the manifold structure implied by the discovered non-differential constraints. We then need tools for working with ordinary differential equations on a manifold, and the present chapter is a contribution to this field.

4.1 Introduction

State estimation on manifolds is commonly performed by embedding the manifold in a linear space of higher dimension, combining estimation techniques for linear spaces with some projection scheme (Brun et al., 2007; Törnqvist et al., 2009; Crassidis et al., 2007; Lo and Eshleman, 1979). Obvious drawbacks of such schemes are that computations are carried out in the wrong space, and that the arbitrary choice of embedding has an undesirable effect on the projection operation. Another common approach is to let the filter run in a linear space of local coordinates on the manifold. Drawbacks include the local nature of coordinates, the nonlinearities introduced by the curved nature of the manifold, and the dependency on the choice of coordinates. Despite the drawbacks of these two approaches, it should be admitted that they work well for many "natural" choices of embeddings and local coordinates, as long as the uncertainty about the state is concentrated to a small (and hence approximately flat) part of the manifold. Still, the strong dependency on embeddings and local coordinates suggests that the estimation algorithms are not defined within the appropriate framework.

The Monte Carlo technique called the particle filter lends itself naturally to a coordinate-free formulation (as in Kwon et al. (2007)). However, the stochastic nature of the technique makes it unreliable, and addressing this problem motivates the word robust in the title of this work.
With a growing geometric awareness among state estimation practitioners, geometrically sound algorithms tailored for particular applications are emerging. A very common application is that of orientations of rigid bodies (for instance, Lee and Shin (2002)), and this is also a guiding application in our work.

Our interest in this work is to examine how robust state estimation on compact manifolds of low dimension can be performed while honoring the geometric nature of the problem. The robustness should be with respect to uncertainties which are not concentrated to a small part of the manifold, and is obtained by using a non-parametric representation of stochastic variables on the manifold. (The particle filter is unreliable in the sense that the produced estimate depends on samples from random variables inside the algorithm, and the estimate can at most be expected to be correct on average. If the random samples come out very unfortunately, the produced estimate may be very far from the correct result.) By honoring the geometric nature we mean that we intend to minimize references to embeddings and local coordinates in our algorithms. We say minimize since, under a layer of abstraction, we too will employ embeddings to implement the manifold structure, and local coordinates are the natural way for users to interact with the filter. Still, the proposed framework for state estimation can be characterized by the abstraction barrier that separates the details of the embedding from the filter algorithm. For example, in the context of estimation of orientations, rather than speaking of filters for unit quaternions or rotation matrices, this layer of abstraction enables us to simply speak of filters for SO(3); both unit quaternions and rotation matrices may be used to implement the low-level details of the manifold structure, but this is invisible to the higher-level estimation algorithm.

Pursuing non-parametric filtering in curved space comes at some computational
costs compared to the linear space setting. Most notably, equidistant meshes do not exist, but on the other hand our restriction to compact manifolds means that the whole manifold can be "covered" by a mesh with finitely many nodes. One of the practical benefits of the proposed non-parametric filter is the ability to dynamically adapt the mesh to enhance the degree of detail in regions of interest, for instance, where the probability density is high.

The proposed point-mass-based solution for filtering in curved space has three main components:
• Compute (and possibly update) a tessellation of the manifold. Each region of the tessellation is required to be associated with a point that will represent the location of the region in calculations, and the volume of each region must be known.
• Implement measurement and time updates. This requires a system model which, unlike when filtering in Euclidean space, cannot have additive noise on the state.
• Provide the user with a point estimate. There is always the option to compute a cheap extrinsic estimate (typically the extrinsic mean), but honoring geometric reasoning in this work, we also look into intrinsic estimates.

Each of these components will be considered in the following sections, including special treatment for the case of spheres where the general situation lacks detail. A more detailed, algorithmic, description of the proposed solution is given in section 4.6.

Terminology. By manifold, we refer to a differentiable, Riemannian manifold. Loosely speaking, a (contravariant) vector is a velocity on the manifold, belonging to the tangent space (which is a vector space) at some point on the manifold, and is basically valid only at that point.
A curve on the manifold which locally connects points along the shortest path between the points is called a geodesic, and the exponential map maps vectors to points on the manifold in such a way that, for a vector v at p, the curve t ↦ exp_p( t v ) has velocity v at t = 0 and is a geodesic. When needed, we shall assume that the manifold is geodesically complete, meaning that the exponential map shall be defined for all vectors. We recommend Frankel (2004) for an introduction to these concepts from differential geometry. A tessellation of the manifold is a set { R^i }_i of subsets of the manifold, such that there is "no overlap and no gap" between regions; the union of all regions shall be the whole manifold, and the intersection of any two regions shall have measure zero. We shall additionally require that each region R^i be simply connected.

Notation. The manifold on which the estimated state evolves is denoted M. We make no distinction in notation between ordinary and stochastic variables; x may refer both to a stochastic variable over the manifold and to a particular point on the manifold. The probability of a statement, such as x ∈ R, is written P( x ∈ R ). The probability density function for a stochastic variable x is written f_x. When conditioning on a variable taking on a particular value, we usually drop the stochastic variable from the notation; for instance, f_{x|y} is a shorthand for f_{x|Y=y}, where the distinction between the stochastic variable, Y, and the value it takes, y, had to be made clear. The distance in the induced Riemannian metric, between the points x and y, is written d( x, y ). The symbol δ is used to denote the Dirac delta "function". A Gaussian distribution over a vector space, with mean m and covariance C, is denoted N( m, C ), and if the variable x is distributed according to this distribution, we write x ∼ N( m, C ).
(The covariance is a symmetric, positive semidefinite, linear mapping of pairs of vectors to scalars, and it should be emphasized that a covariance is basically only compatible with vectors at a certain point on the manifold.) In relation to plans for future work, we should also mention that group structure on the manifold is not used in this work, although such manifolds, Lie groups, are often a suitable setting for estimation of dynamic systems.

4.2 Background and related work

For models with continuous-time dynamics, the evolution of the probability distribution of the state is given by the Fokker-Planck equation, and a great amount of research has been aimed at solving this partial differential equation under varying assumptions and approximation schemes. Daum (2005) gives a good overview that should be accessible to a broad audience. In the present discrete-time setting, the corresponding relation is the Chapman-Kolmogorov equation. It tells how the distribution of the state at the next time step (given all available measurements up till the present) depends on the distribution of the state at the current time step (given all available measurements up till the present) and the process noise in the model. Let y_{0..t} be the measurements up to time t, and x_{s|t} be the state estimate at time s given y_{0..t}. Conditioned on the measurements y_{0..t}, and using that x_{t+1} is conditionally independent of y_{0..t} given x_t, the Chapman-Kolmogorov equation states the familiar

f_{x_{t+1|t}}( x_{t+1} ) = ∫ f_{x_{t+1}|x_t}( x_{t+1} ) f_{x_{t|t}}( x_t ) dx_t    (4.1)

In combination with Bayes' rule for taking the information in new measurements into account,

f_{x_{t|t}}( x_t ) = f_{x_{t|t−1}}( x_t ) f_{y_t|x_t}( y_t ) / f_{y_{t|t−1}}( y_t )

this describes exactly the equations that the discrete-time filtering problem is all about.
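On a finite grid, (4.1) and the Bayes update become elementary array operations. A minimal sketch of one predict/update cycle (the transition kernel and likelihood values below are made up for illustration):

```python
import numpy as np

K = np.array([[0.8, 0.2, 0.0],      # K[i, j] = P(x_{t+1} = i | x_t = j)
              [0.2, 0.6, 0.2],
              [0.0, 0.2, 0.8]])
f_filt = np.array([0.5, 0.3, 0.2])  # f_{x_t|t} on the grid

f_pred = K @ f_filt                 # Chapman-Kolmogorov step (4.1)

lik = np.array([0.9, 0.5, 0.1])     # likelihood f_{y|x}(y) on the grid
f_post = f_pred * lik               # Bayes' rule, numerator
f_post /= f_post.sum()              # the normalizer plays the role of f_{y_{t|t-1}}(y)
```

The same two steps, with volumes of mesh regions entering the sums, reappear in the point-mass filter updates of section 4.4.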
To mention just a few references for the particular application of filtering on SO(3): a filter for random walks on the tangent bundle (with the only system noise being additive noise in the Lie algebra corresponding to velocities) was developed in Chiuso and Soatto (2000), a quaternion representation was used with projection and a Kalman filter adapted to the curved space in Choukroun et al. (2006), and Lee et al. (2008) propose a method to propagate uncertainty under continuous-time dynamics in a noise-free setting. The particle filter approach in Kwon et al. (2007) has already been mentioned. A solid account of the most commonly used methods for filtering on SO(3) is provided by Crassidis et al. (2007). In Lo and Eshleman (1979) the authors present an interesting representation of probability density functions on SO(3), making use of exponential Fourier densities.

4.3 Dynamic systems on manifolds

The filter is designed to track the discrete-time stochastic process x, evolving on some manifold of low dimension. That the dimension is low is instrumental to enabling the use of filter techniques that, in higher dimensions, break down performance-wise due to the curse of dimensionality (Bergman, 1999, section 5.1). We use discrete-time models in the form

q x ∼ W_{g( x, u )}
y ∼ V_x

where W_{g( x, u )} is the random distribution of process noise taking values on the manifold, u is a known external input, and the measurement y is distributed according to the random distribution V_x. Not being aware of a standard name for a distribution over the manifold, parameterized by a point on the same manifold, we shall use distribution field for W_• (here, the bullet indicates that there is a free parameter; for a fixed value of this parameter, we have an ordinary random distribution).
For example, the measurement equation could be given by

V_x = N( h( x ), C_y( x ) )

That is, we have additive Gaussian white noise added to the nominal measurements h( x ), and we allow the noise covariance to depend on the state. A less general example of the dynamic equation could be to combine Gaussian distributions with the exponential map,

q x ∼ exp( N( 0, C_{g( x, u )} ) )

Here, N( 0, C_{g( x, u )} ) is our way of denoting a zero mean Gaussian distribution of vectors at g( x, u ). However, (without the structure of a Lie group) the simplicity of this expression is misleading, since the Gaussian distributions at different points on the manifold are defined in different tangent spaces. Hence, a common matrix will not be sufficient to describe the covariance in all points. To really obtain simple equations for the dynamic equation, we may employ distributions that only depend on the distance,

f_{qX}( qx ) = f_d( d( qx, g( x, u ) ) )

(We avoid the term state space model here since this notion is so strongly associated with models in terms of a state vector which is just a coordinate tuple; our models shall be stated in a coordinate-free manner.)

4.4 Point-mass filter

The main idea of the point-mass filter is to model the probability distribution of the state x being estimated as a sum of weighted Dirac delta functions. The Dirac deltas are located at fixed positions in a uniform grid, and the idea dates back to the seminal work by Bucy and Senne (1971). When the filter is run, a sequence of such random variables will be produced and there is a need to distinguish between the variables before and after measurement and time updates; recall the notation introduced in section 4.2. Readers familiar with the particle filter will notice many similarities to the proposed filter, but should also pay attention to the differences.
To mention a few, the proposed filter is deterministic (and in this sense robust), does not require resampling, associates each probability term (compare particle) with a region in the domain of the estimated variable, and calculates with the volumes of these regions. One notable drawback compared to the particle filter is that when the estimated probability is concentrated to a small part of the domain, the particle filter will automatically adapt to provide estimates with smaller uncertainty, while the proposed filter would require a non-trivial extension to do so.

In this section, we first discuss the representation of stochastic variables, and then turn to deriving equations for the time and measurement updates, expressed using the proposed representation.

4.4.1 Point-mass distributions on a manifold

In this section, we consider how any random variable on the manifold may be represented, and omit time subscripts to keep the notation clear. That the idea is termed point-mass is due to the sometimes used assumption that the probability is distributed discretely at certain points. Written using the Dirac delta, the probability density function for x is then given by

f_X( x ) = Σ_i p^i δ( x − x^i )

where the sum is over some finite number of points, with probability p^i located at x^i. While this makes several operations on the distribution feasible, which would be extremely computationally demanding using other models, this is clearly very different from what we would expect the density function to look like. To be able to make other interpretations of the pairs ( p^i, x^i ), each such pair needs to be associated with a region R^i of the probability space, and we require that the set of regions, { R^i }, be a tessellation. Let µ^i = µ( R^i ), where µ( • ) measures volume.
(That our definition of tessellation did not require the overlaps between regions to be empty forces us to use only the interior of the regions for many purposes, that is, R^i \ ∂R^i instead of R^i. For the sake of brevity, however, we shall abuse notation and often write simply R^i when it is actually the interior that is referred to; the reader should be able to see where this applies.)

Given a tessellation { R^i } (of cardinality N), a more relaxed interpretation of the probabilities p^i is obviously

P( X ∈ R^i ) = p^i    (4.2)

and a more realistic model of the distribution is that it is piecewise constant;

f_X( x ) = Σ_{i : x ∈ R^i} p^i / µ^i

Note that the sum may expand to more than one term, but only on a set of measure zero. Given the tessellation, including the µ^i, it is clear that the numbers p^i may be replaced by f^i := p^i / µ^i. Since this is a more natural representation of piecewise constant functions in general, we choose to use this also for the probability density function estimate. For completeness, we state the above equations again, now using f^i instead of p^i:

P( X ∈ R^i ) = f^i µ^i    (4.3)

f_X( x ) = Σ_i f^i µ^i δ( x − x^i )    (point-mass)    (4.4)
f_X( x ) = Σ_{i : x ∈ R^i} f^i    (piecewise constant)

The point-mass filter is a meshless method in that it does not make use of a connection graph describing neighbor relations between the nodes x^i. (A connection graph is implicit in the tessellation, but it is not used.) While meshless methods in many finite element method applications would use interpolation (of, for instance, Sibson or Laplace type, see Sukumar (2003) for an overview of these) instead of the piecewise constant (4.4), our choice makes it easy to ensure that the density is non-negative and integrates to 1. Furthermore, both computation of the interpolation itself, and use of the interpolated density, would drastically increase the computational cost of the algorithm.
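The change of representation from probabilities p^i to density values f^i is a one-liner once the region volumes µ^i are known; a small sketch with made-up volumes and probabilities:

```python
import numpy as np

mu = np.array([0.2, 0.3, 0.5])   # region volumes mu^i (illustrative)
p  = np.array([0.1, 0.6, 0.3])   # probabilities P(X in R^i), summing to 1
f  = p / mu                       # density values f^i = p^i / mu^i

# The piecewise constant density integrates to one: sum_i f^i mu^i = 1,
# which is the representation-side counterpart of (4.3).
total = (f * mu).sum()
```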
It turns out that computing good tessellations is a major task of the implementation of point-mass filters on manifolds, just like mesh generation is a major task when using finite element methods. It may also be a time-consuming task, but a basic implementation may do this once and for all, offline. Since the number of regions greatly influences the runtime cost of the filter, a tessellation computed offline will have to be rather coarse. For models where large uncertainty is inherent in the filtering problem, this may be sufficient, but if noise levels are low and accurate estimation is theoretically achievable, the tessellation should be adapted to have smaller regions in areas where the probability density is high. (This statement is based on intuition; it is a topic for future research to provide a theoretical foundation for how to best adapt the tessellation.)

If each region R^i is given as the set of points being closer to x^i than to all other x^j, j ≠ i, the tessellation is called a Voronoi diagram of the manifold (in the case of the 2-sphere, see for instance Augenbaum and Peskin (1985); Na et al. (2002)). Since this will make the point-mass interpretation more reasonable, it seems to be a desirable property of the tessellation, although a formal investigation of this strategy remains a relevant topic for future research.

To make transitions between tessellations easy, we require that adaptation is performed by either splitting a region into smaller regions, or by recombining the parts of a split region. Following this scheme, two kinds of tessellation operations are needed: first one to compute a base tessellation of the whole manifold, and then one to split a region into smaller parts. When the base tessellation is computed, the curved shape of the manifold on a global scale will be necessary to consider. The base tessellation should be fine enough to make flat approximations of each region feasible.
Such approximations should be useful to the algorithm that then splits regions into smaller parts. How to compute good base tessellations will generally require some understanding of the particular manifold at hand, and will therefore require specialized algorithms, while the splitting of approximately flat regions should be possible in a general setting. Finally, a scheme for when to split and when to recombine will be required. This scheme shall ensure that the regions are small where appropriate, while keeping the total number of regions below some given bound.

4.4.2 Measurement update

Just as for particle filters, the measurement update is a straightforward application of Bayes' rule. To incorporate a new measurement of the random variable Y ∼ V_x modeling the output, we have

P( X ∈ R^i | y ) = f_{Y|X∈R^i}( y ) P( X ∈ R^i ) / f_Y( y )
                 ≈ f_{Y|X=x^i}( y ) P( X ∈ R^i ) / f_Y( y )

where the measurement prior f_Y( y ) need not be known since it is a common factor to all probabilities on the mesh, and will just act as a normalizing constant. Converting to our favorite representation f^i, adding time indices, conditioning on Y_{0..t−1}, and
To see this, let B_y(r) denote a ball of radius r centered at y.
The relation follows directly from

f_Y( y ) P( X ∈ R^i | y ) = lim_{r→0} P( X ∈ R^i | Y ∈ B_y(r) ) P( Y ∈ B_y(r) ) / µ( B_y(r) )
 = lim_{r→0} P( X ∈ R^i ∧ Y ∈ B_y(r) ) / µ( B_y(r) )
 = lim_{r→0} P( Y ∈ B_y(r) | X ∈ R^i ) P( X ∈ R^i ) / µ( B_y(r) )
 = f_{Y|X∈R^i}( y ) P( X ∈ R^i )

using conditional independence of Y and Y_{0..t−1} given X, this reads

f^i_{t|t} = P( X ∈ R^i | y_{0..t} ) / µ^i ≈ f_{Y|X=x^i}( y ) f^i_{t|t−1} / f_{Y_{t|t−1}}( y )

By defining

BayesRule( f, g ) := f g / ∫ f g

and noting that the result will always be a proper probability distribution (and hence integrate to 1, just as the result of the BayesRule operator) we can write:

f_{X_{t|t}} = BayesRule( f_{X_{t|t−1}}, f_{Y|X=•}( y ) )

Note how the volumes of regions enter the computation of the BayesRule operator:

BayesRule( f, g )( x^i ) ≈ f( x^i ) g( x^i ) / Σ_j f( x^j ) g( x^j ) µ^j    (4.5)

4.4.3 Time update in general

The time update can be described by the relation

P( qX ∈ R^i ) = ∫_{R^i} ∫_M f_{W_{g( x, u )}}( x̄ ) f_X( x ) dx dx̄

In the filtering application, the stochastic entities in this relation will be conditioned on y_{0..t}, but since the conditioning is the same on both sides, it may be dropped for the sake of a more compact notation in this section. By the mean value theorem, we find

P( qX ∈ R^i ) = µ^i ∫_M f_{W_{g( x, u )}}( x̄^i ) f_X( x ) dx , for some x̄^i ∈ R^i

and dividing both sides by µ^i and fitting the region in a shrinking ball centered at x^i, we obtain

P( qX ∈ R^i ) / µ^i → f_{qX}( x^i )

and

∫_M f_{W_{g( x, u )}}( x̄^i ) f_X( x ) dx → ∫_M f_{W_{g( x, u )}}( x^i ) f_X( x ) dx

Hence, we obtain the Chapman-Kolmogorov equation (4.1) in the limit,

f_{qX}( x^i ) = ∫_M f_X( x ) f_{W_{g( x, u )}}( x^i ) dx

and this we make the definition of the convolution:

f_{qX} = f_X ∗ f_{W_{g( •, u )}}

The convolution of a distribution field and a probability density function is a new probability density function. We shall think of the time update as implementing this relation.
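Both updates act on the vector of density values f^i. A sketch, under the assumption that the distribution field can be evaluated pointwise into a matrix W[i, j] = f_{W_{g(x^j, u)}}( x^i ) (precomputable when the system is autonomous):

```python
import numpy as np

def measurement_update(f, g, mu):
    # BayesRule operator of (4.5): pointwise product of prior density
    # values f^i and likelihood values g(x^i), normalized with the
    # region volumes mu^i so that the result again integrates to one.
    h = f * g
    return h / (h * mu).sum()

def time_update(f, W, mu):
    # Convolution (4.6): f_{t+1|t}^i = sum_j W[i, j] f_{t|t}^j mu^j.
    return W @ (f * mu)
```

With W "column-stochastic" in the weighted sense Σ_i W[i, j] µ^i = 1, the time update preserves total probability, mirroring the matrix-multiplication implementation suggested for autonomous systems.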
By approximating the probability density functions as constant over small regions (assuming all the regions R^i are small), we get the time update approximation

    P( qX ∈ R^i ) = ∫_M ∫_{R^i} f_{W_{g(x,u)}}( x̄ ) f_X( x ) dx̄ dx
                  ≈ μ^i ∫_M f_{W_{g(x,u)}}( x^i ) f_X( x ) dx
                  = μ^i Σ_j ∫_{R^j} f_{W_{g(x,u)}}( x^i ) f_X( x ) dx
                  ≈ μ^i Σ_j f_{W_{g(x^j,u)}}( x^i ) ∫_{R^j} f_X( x ) dx
                  = μ^i Σ_j f_{W_{g(x^j,u)}}( x^i ) P( X ∈ R^j )

This is readily converted to an implementation of the convolution (here, the conditioning is written out for future reference):

    f^i_{t+1|t} = P( qX ∈ R^i | y_{0..t} ) / μ^i
                ≈ Σ_j f_{W_{g(x^j,u)}}( x^i ) ( P( X ∈ R^j | y_{0..t} ) / μ^j ) μ^j
                = Σ_j f_{W_{g(x^j,u)}}( x^i ) f^j_{t|t} μ^j        (4.6)

4.4.4 Dynamics that simplify time update

Since the number of regions may be large, and computing the time update convolution involves N^2 lookups of the probability density f_{W_{g(x^j,u)}}( x^i ), we should consider means to keep the cost of each such lookup low. First, if the system is autonomous (that is, g( x^j, u ) does not depend on u), all transitions may be computed offline and stored in a stochastic matrix. The μ^j could also be included in this matrix, reducing the convolution computation to a matrix multiplication. (This matrix is often called the transition matrix, but this notion has a different meaning in the thesis.)

As was noted above, one class of distributions for the noise on the state which makes the expression simple is that where the density depends only on the distance from the nominal point;

    f^i_{t+1|t} ≈ Σ_j f_d( d( g( x^j, u ), x^i ) ) f^j_{t|t} μ^j

This will be the structure in our example.

4.5 Point estimates

The distinction between intrinsic and extrinsic was introduced in Srivastava and Klassen (2002), where a mean value of a distribution on a manifold was estimated by first estimating the mean of the distribution of the manifold embedded in Euclidean space, and then projecting the mean back to the manifold. This, they termed the extrinsic estimator.
In contrast, an intrinsic estimator was defined without reference to an embedding in Euclidean space. While this may seem a hard contrast at first, Brun et al. (2007) show that both kinds of estimates may be meaningful from a maximum likelihood point of view, for some manifolds with a "natural embedding".

4.5.1 Intrinsic point estimates

A common intrinsic generalization of the usual mean in Euclidean space is defined as a point where the variance attains a global minimum, where the variance "only" requires a distance to be defined:

    Var_X( x ) := ∫ d( x̄, x )^2 f_X( x̄ ) dx̄        (4.7)

Unfortunately, such a mean may not be unique, but if the support of the distribution is compact, there will be at least one. Other intrinsic point estimates may also be defined, but these alternatives will not be discussed further. The reason is that the motivation for the current discussion is just to illustrate that it is possible to define algorithms aimed at computing intrinsic point estimates based on the proposed probability density representation. Since distributions with a globally unique minimizer may be arbitrarily close to distributions with several distinct global minimizers, it is our understanding that schemes based on local search, devised to find one good local minimizer, are reasonable approximations of the definition. Hence, there are two tasks to consider: implementation of the local search, and a scheme that uses the local search in order to find a good local minimizer. Given an implementation of the local search, we propose that it be run just once, initiated at the region representative x^i with the least variance. Since the region representatives are assumed to be reasonably spread over the whole manifold, there is good hope that at least one of them is in the region of attraction of the global minimum.
However, even if this is the case, it may not include the x^i with least variance, which directly leads to more robust schemes where the local search is initiated at several (possibly all) x^i. A completely different approach to initialization of the local search is to use an extrinsic estimate of the mean, if available. Since the extrinsic mean may be extremely cheap to compute compared to even evaluating the variance at one point, and may at the same time be a good approximation of the intrinsic mean, it is very beneficial to use; the major drawback is that it requires us to go outside the geometric framework. To implement a local search, one must be able to compute search directions and to perform line searches. For this, we rely on the exponential map, which allows these tasks to be carried out in the tangent space of the current search iterate. The search direction used is steepest descent computed using finite difference approximation, although more sophisticated methods exist in the literature (Pennec, 2006).

4.5.2 Extrinsic point estimates

The extrinsic mean estimator proposed in Srivastava and Klassen (2002) is defined by replacing the distance d( x̄, x ) in (4.7) by the distance obtained by embedding the manifold in Euclidean space and measuring in this space instead. It is argued that if the support of the distribution is small, this should give results similar to the intrinsic estimate. However, considering how arbitrary the choice of embedding is, it is clear that the procedure as a whole is rather arbitrary as well. (Nevertheless, a good embedding seems likely to produce useful results; see for instance the examples in Srivastava and Klassen (2002).) Recall that the algorithm for computing the extrinsic mean is very efficient: first compute the mean in the embedding space, and then project back to the manifold.
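For the round sphere, the projection onto the closest point of the manifold is simply normalization, so the whole extrinsic estimator is a handful of lines. A sketch under that assumption (ours; the function name is hypothetical, and the computation breaks down only in the degenerate case where the embedded mean is the origin, so that the closest point is not unique):

```python
import math

def extrinsic_mean_sphere(points, weights):
    """Extrinsic mean on S^2: weighted mean in the embedding space R^3,
    followed by projection to the closest point on the round sphere,
    which is simply normalization. Assumes the mean is not the origin."""
    m = [sum(w * p[k] for p, w in zip(points, weights)) for k in range(3)]
    r = math.sqrt(sum(c * c for c in m))
    return [c / r for c in m]
```

For the magnified sphere discussed below, the closest-point projection is no longer a plain normalization, which is one way to see how the estimator depends on the embedding.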
The projection step is defined to yield the point on the manifold which is closest to the mean in the embedding space, and clearly assumes that this point will be unique. To give an example of how sensitive the extrinsic mean is to the selection of embedding, and why we find it worthwhile to spend effort on intrinsic estimates, consider embedding S^2 in R^3, but instead of identifying S^2 with the unit sphere in R^3, magnify the sphere in some direction.

4.6 Algorithm and implementation

The final component to discuss before putting the theory of the previous sections together in an algorithm is how tessellations are computed. In this section, we do this, present the algorithm in a compact form in algorithm 4.1 on page 123, and include some notes on the software design and implementation.

4.6.1 Base tessellations (of spheres)

To be more specific about how a base tessellation may be computed, we have considered how this can be done for spheres, but the technique we employ does not only work for spheres. The first step is to generate the set of points x^i. Here, the user is given the ability to affect the number of points generated, but precise control is sacrificed for the sake of more evenly spread points. The basic idea is to use knowledge of a sphere's total volume to compute a desired volume of each region. Then we use spherical coordinates in nested loops, with the number of steps in each loop being a function of the current coordinates of the loops outside. The details for the 2-sphere and 3-sphere are provided in section 4.A. The remaining steps are general and do not only apply to spheres. First, equations for the half-space containing the manifold and being bordered by the tangent space at each point x^i are computed. This comes down to finding a basis for the space orthogonal to the tangent space at x^i; for spheres, this is trivial.
The intersection of these half-spaces is a polytope with a one-to-one correspondence between facets and generating points. (We rely on existing software here; please refer to section 4.6.3.) Projecting the facets towards the origin will generate a tessellation, and for spheres this will be a Voronoi tessellation if the "natural" embedding is used. Each region is given by the set of projected vertices of the corresponding polytope facet. As part of the tessellation task, the volume of each region must also be computed. For the 2-sphere this can be done exactly thanks to the simple formula giving the area of the region bounded by the geodesics between three points on the sphere (Berger, 1978, p. 198). In the general case we approximate the volumes on the manifold by the volumes of the polytope facets. (Note that a facet can be reconstructed from the projected vertices by projecting back to the (embedded) tangent space at the generating point.) For spheres the ideal total volume is known, and any mismatch between the sum of the volumes of the regions and the ideal total volume is compensated by scaling all volumes by a normalizing constant.

4.6.2 Software design

Our implementation is written in C++ for fast execution. Still, there is a strong emphasis on careful representation of the concepts of geometry in the source code. Perhaps most notably, a manifold is implemented as a C++ type, and allows elements to be handled in a coordinate-free manner. By providing a framework for writing coordinate-free algorithms, we try to guide algorithm development in a direction that makes sense from a geometric point of view.
Quite obviously, there is an overhead associated with the use of our framework, but it is our understanding that if the developed algorithms are to be put in production units, they shall be rewritten directly in terms of the underlying embedding; our framework is aimed at research and development, and it is an attempt to increase awareness of geometry in the filtering community. Other concepts of geometric relevance that are represented in the software design are:

• Scalar functions, that is, mappings from a manifold to the set of real numbers.
• Coordinate maps, that is, invertible mappings from a part of the manifold to tuples of real numbers.
• Tangent spaces, that is, the linear spaces of directional derivatives at a certain point of the manifold. As with the manifold elements, elements of the tangent spaces are handled in a coordinate-free manner. The basic means for construction of tangents is to form the partial derivative with respect to a coordinate function.
• Euclidean spaces are implemented as special cases of manifolds.

4.6.3 Supporting software

A very important part of the tessellation procedure for spheres and other manifolds with a convex interior, seen in the embedding space, is the conversions between polytope representations. That is, given a set of bounding hyperplanes, we want a vertex representation of all the faces, and given a set of vertices, we want the corresponding set of hyperplanes. In our work, these tasks were carried out using cddlib (Fukuda, 2008), distributed under the GNU General Public License. Although several algorithms for computing the volume of polytopes of arbitrary dimension exist (Büeler et al., 2000), no freely available implementation compatible with C++ was found. We would like to encourage the development, the sharing, and the advertisement of such software. The author's implementation for this task is a very simple triangulation-based recursion scheme.
4.7 Example

To illustrate the proposed filtering technique, a manifold of dimension 2 was chosen so that the probability distributions are amenable to illustration. We consider the bearing-tracking problem in 3 dimensions; that is, the state evolves on the 2-sphere. This may be a robust alternative to tracking the position of an object when range information cannot be determined reliably. It is also a good example to mention when discussing models without dynamics (velocities are not part of the state), since the lack of (Lie) group structure makes the extension to dynamic models non-trivial. As an example of a bearing sensor in 3 dimensions, we may consider a camera and an object recognition algorithm, which returns image coordinates in each image frame, which are then converted to the three components of a unit vector in the corresponding direction. The example is about the higher-level considerations of the filtering problem, and not the low-level details of implementing the manifold at hand.

The deterministic part of the dynamic equation, g, does not depend on any external input, and just maps any state to itself. The noise in the equation is given by a von Mises–Fisher distribution field (see the overview Schaeben (1992)) with concentration parameter κ = 12 everywhere. The three scalar relations in the measurement equation are clearly dependent, as the manifold has only two dimensions. Also, the fact that the noise in the estimate from the object recognition has only two dimensions implies that the noise on the three components in the measurement equation will be correlated. Besides the dependencies and correlations, noise levels should be state-dependent, as the uncertainty for a given direction component is at a minimum (though not zero) when the tracked object is in line with the component, and at a maximum when at a right angle. Despite our awareness of this structure, we make the model as simple as possible by assuming independent and identically distributed Gaussian noise on the three components, hence parameterized by the single scalar σ = 0.4. Given an initial state, a simulation of the model equations (compare with simulating a moving object in 3 dimensions, with measurement noise entering in a simulated object recognition algorithm) is run, resulting in a sequence of measurements. The manifold is tessellated into N = 200 approximately equally sized regions, and the filter is initialized with a uniform probability density.

Algorithm 4.1 Summary of point-mass filter on a manifold.
Input:
• A model of a system with state belonging to a manifold.
• An a priori probability distribution for the state at the initial time.
• A sequence of measurement data.
Output: A sequence of probability density estimates for the filtered state, possibly along with or replaced by point estimates.
Notation: The numbers f^i_{t|t−1} are the (approximate) values of the probability density function at the point x^i, at time t, given the measurements from time 0 to time t − 1. The numbers f^i_{t|t} are the (approximate) values of the probability density function at time t, given also the measurements available at time t.
Initialization: Compute a tessellation with regions R^i of the manifold. Assign a representative point x^i to each region, and measure the volumes μ^i. In the case of spheres, see section 4.6.1. Let f^i_{0|−1} be the a priori distribution. That is, each f^i_{0|−1} is assigned a non-negative value, and all values jointly satisfy Σ_i f^i_{0|−1} μ^i = 1.
Process measurements:
for t = 0, 1, 2, . . .
    Compute a point prediction from f_{t|t−1}, for instance, by minimizing (4.7).
    Use the measurements y_t to compute f_{t|t} using BayesRule, see (4.5) for details.
    Compute a point estimate from f_{t|t}, for instance, by minimizing (4.7).
    Make a time update to compute f_{t+1|t} using (4.6).
    Possibly update the tessellation. (Details are a subject for future work.)
end
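To make the structure of algorithm 4.1 concrete, here is a self-contained toy version on the circle S^1 (our sketch in plain Python; the thesis implementation is C++ and the example manifold is S^2, and point estimation and tessellation updates are omitted here):

```python
import math

def wrap_dist(a, b):
    """Great-circle distance between two angles on the unit circle S^1."""
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

def run_filter(x, mu, f0, trans, lik, ys):
    """Toy point-mass filter loop in the spirit of algorithm 4.1:
    for each measurement, a Bayes measurement update (4.5) followed
    by a convolution time update (4.6).

    x     -- representatives x^i (here: angles on S^1)
    mu    -- region volumes mu^i
    f0    -- a priori density values f^i_{0|-1}
    trans -- trans[i][j] = f_{W_{g(x^j)}}(x^i), precomputed offline
             (the system is assumed autonomous)
    lik   -- lik(y, xi): measurement likelihood f_{Y|X=xi}(y)
    ys    -- the measurement sequence
    """
    n = len(x)
    f = list(f0)
    for y in ys:
        # Measurement update (4.5): pointwise product with the
        # likelihood, then normalize so the mesh density integrates to 1.
        fg = [f[i] * lik(y, x[i]) for i in range(n)]
        z = sum(fg[i] * mu[i] for i in range(n))
        f = [v / z for v in fg]
        # Time update (4.6): discretized convolution with the noise density.
        f = [sum(trans[i][j] * f[j] * mu[j] for j in range(n))
             for i in range(n)]
    return f
```

With `trans` stored as a matrix, the time update is exactly the matrix–vector product alluded to in section 4.4.4.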
The probability density estimate is then updated as measurements are made available to the filter. The result is illustrated in figure 4.1.

4.8 Conclusions and future work

We have shown that point-mass filters can be used to construct robust filters on compact manifolds. By separating the implementation of the low-level manifold structure from the higher-level filter algorithm, we are able to formulate and implement much of the algorithm without reference to a particular embedding. The technique has been demonstrated by considering a simple application on the 2-sphere. Future work includes application to SO(3), that is, the manifold of orientations, adaptation of the tessellation, and utilizing Lie group structure when available. In order to cope with the substantial increase of dimension that would result from augmenting the state of our models to also include physical quantities such as angular momentum, the filter should be tailored to tangent or cotangent bundles.

Figure 4.1: Estimated probability density function. Left: predictions before a measurement becomes available. Right: estimates after measurement update. Rows correspond to successive time steps. Patches are colored proportionally to the density in each region, and random samples are marked with dots. The color of the patches is scaled so that white corresponds to zero density, while black corresponds to the maximum density of the distribution (hence, the scale differs from one figure to another). It is seen how the uncertainty increases when time is incremented, and decreases when a measurement becomes available, and that the uncertainty decreases over time as the information from several measurements is fused.
Appendix 4.A Populating the spheres

This appendix contains the two algorithms we use to populate the spheres S^2 (algorithm 4.2) and S^3 (algorithm 4.3) with points such that the density of points is approximately constant over the whole space. The method contains a minor random element, but this is not crucial for the quality of the result, and could easily be replaced by a deterministic choice. The idea for populating spheres generalizes to higher dimensions. The number of steps to take in each loop is found by computing the length of the curve obtained by sweeping the corresponding coordinate over its range while the other coordinates are held fixed, and the curve length is divided by the side length of a hypercube of desired volume. The curve length is found as the width of the coordinate's span, times the product of the cosines of the other coordinates, and the hypercube volume times the desired number of points in the population should equal the total volume of the sphere (the formula for the volume can be found under the entry for sphere in Hazewinkel (1992)). Denoting the side of the hypercube δ_0, the dimension of the sphere N, and the desired number of points n, this corresponds to setting

    δ_0 := ( 2 π^{(N+1)/2} / ( n Γ( (N+1)/2 ) ) )^{1/N}

where Γ is the standard gamma function. (The volume of S^2 is often called the area of the sphere.)

Algorithm 4.2 Populating the 2-sphere.
Input: The desired number n of points in the population.
Output: A set P of points on S^2, approximately of cardinality n.
Notation: Let φ denote the usual polar coordinate map, mapping a point on S^2 to the tuple ( θ, ϕ ), where θ ∈ [ −π/2, π/2 ], ϕ ∈ [ 0, 2π ]. That is, embedding S^2 in R^3, the inverse map is identified with

    φ^{−1}( θ, ϕ ) = ( cos(θ) cos(ϕ), cos(θ) sin(ϕ), sin(θ) )

Algorithm body:
P ← {}
δ_0 := sqrt( 4π/n )  (Compute the desired volume belonging to each point, and compute an approximation of the angle which produces a square on the sphere with this volume.)
i_θ^max := ⌈ π/δ_0 ⌉  (Compute the number of steps to take in the θ coordinate.)
Δθ := −π/i_θ^max  (Compute the corresponding step size in the θ coordinate.)
θ_0 := π/2
for i_θ = 0, 1, . . . , (i_θ^max − 1)
    θ := θ_0 + i_θ Δθ
    i_ϕ^max := max{ 1, ⌊ 2π cos(θ)/δ_0 ⌋ }  (Compute the circumference of the sphere at the current θ coordinate, and find the number of ϕ steps by dividing by the desired step length.)
    Δϕ := 2π/i_ϕ^max
    ϕ_0 := x, where x is a random sample from [ 0, 2π ].
    for i_ϕ = 0, 1, . . . , (i_ϕ^max − 1)
        ϕ := ϕ_0 + i_ϕ Δϕ
        P ← P ∪ { φ^{−1}( θ, ϕ ) }
    end
end
Remark: A deterministic replacement for the random initialization of ϕ_0 in each i_θ iteration would be to add ϕ ← ϕ − ½ Δϕ just before the update of θ, and then use the final ϕ at the end of one i_θ iteration as the initial ϕ in the next i_θ iteration.

Algorithm 4.3 Populating the 3-sphere.
Input: The desired number n of points in the population.
Output: A set P of points on S^3, approximately of cardinality n.
Notation: Let φ denote the usual polar coordinate map, mapping a point on S^3 to the tuple ( θ, ϕ, γ ), where θ ∈ [ −π/2, π/2 ], ϕ ∈ [ −π/2, π/2 ], γ ∈ [ 0, 2π ]. That is, embedding S^3 in R^4, the inverse map is identified with

    φ^{−1}( θ, ϕ, γ ) = ( cos(θ) cos(ϕ) cos(γ), cos(θ) cos(ϕ) sin(γ), cos(θ) sin(ϕ), sin(θ) )

Algorithm body: Compare the body of algorithm 4.2. This algorithm has the same structure, and we shall only indicate how the important quantities are computed, namely the number of steps to take in the different loops.
δ_0 := ( 2π²/n )^{1/3}
i_θ^max := ⌈ π/δ_0 ⌉
Δθ := −π/i_θ^max
. . .
for i_θ . . .
    θ := . . .
    i_ϕ^max := max{ 1, ⌊ π cos(θ)/δ_0 ⌋ }
    Δϕ := π/i_ϕ^max
    . . .
    for i_ϕ . . .
        ϕ := . . .
        i_γ^max := max{ 1, ⌊ 2π cos(θ) cos(ϕ)/δ_0 ⌋ }
        Δγ := 2π/i_γ^max
        . . .
        for i_γ . . .
            γ := . . .
            P ← P ∪ { φ^{−1}( θ, ϕ, γ ) }
        end
    end
end
Remark: The random choices in this algorithm can be made deterministic in the same way as in algorithm 4.2.
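Algorithm 4.2 translates almost line by line into code. A sketch (ours, in plain Python rather than the thesis's C++; the random initialization of ϕ_0 is kept):

```python
import math
import random

def populate_sphere2(n):
    """Populate S^2 with approximately n evenly spread points, in the
    manner of algorithm 4.2: latitude bands of angular width delta0,
    each band filled with points spaced roughly delta0 along the
    circumference 2*pi*cos(theta)."""
    pts = []
    delta0 = math.sqrt(4 * math.pi / n)       # side of a "square" region
    i_theta_max = math.ceil(math.pi / delta0)
    d_theta = -math.pi / i_theta_max
    theta0 = math.pi / 2
    for i_theta in range(i_theta_max):
        theta = theta0 + i_theta * d_theta
        i_phi_max = max(1, math.floor(2 * math.pi * math.cos(theta) / delta0))
        d_phi = 2 * math.pi / i_phi_max
        phi0 = random.uniform(0, 2 * math.pi)  # random band offset
        for i_phi in range(i_phi_max):
            phi = phi0 + i_phi * d_phi
            # Inverse coordinate map: embed (theta, phi) into R^3.
            pts.append((math.cos(theta) * math.cos(phi),
                        math.cos(theta) * math.sin(phi),
                        math.sin(theta)))
    return pts
```

Note how the band at θ = π/2 degenerates to a single point near the pole, since `i_phi_max` bottoms out at 1 there.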
5 A new index close to strangeness

Kunkel and Mehrmann have developed a theory for analysis and numerical solution of differential-algebraic equations. The theory centers around the strangeness index, which differs from the differentiation index in that it does not consider the derivatives of the solution to be independent of the solution itself at each time instant. Instead, it takes the tangent space of the manifold of solutions into account, thereby reducing the number of dimensions in which the derivative has to be determined. The book Kunkel and Mehrmann (2006) covers the theory well and will be the predominant reference used in the current chapter. The numerical solution procedure applies to general nonlinear differential-algebraic equations of higher indices, and is currently the only one we know of that can handle such problems, although it does not provide a sensitivity analysis. Our interest in this matter is mostly due to this capability. Since the theme of the chapter is to relate a new index to the closely related strangeness index, parts of the background theory have been included in the present chapter instead of chapter 2, in order to put the two definitions side by side. Care has been taken to make it clear what the contributions of the chapter are.

5.1 Two definitions

In this section, two index definitions will be presented along with some basic properties of each. The one to be presented first is the strangeness index, found in Kunkel and Mehrmann (2006). The second, which is proposed as an alternative, is called the simplified strangeness index. Both are based on the derivative array equations.

5.1.1 Derivative array equations and the strangeness index

As always when working with dae, it is crucial to be aware that the solutions are restricted to a manifold.
In practice, one is interested in obtaining equations describing that manifold, and the way this is done in the present chapter is by using the derivative array introduced in Campbell (1993); see section 2.2.3. Consider the dae

    f( x(t), x'(t), t ) = 0        (5.1)

Assuming sufficient differentiability of f and of x, the idea is that the original dae is completed with derivatives of the equations with respect to time. This will introduce higher order derivatives of the solution, but the key idea is that, given values of x(t), it suffices to be able to determine x'(t) in order to compute a numerical solution to the equations. That is, higher order derivatives such as x''(t) may appear in the equations, but it is not necessary to determine them. Conversely, the choice of x(t) will affect the possibility to determine x'(t), and the set of points x(t) where the derivative array equations can be solved for x'(t) is the solution manifold. Hence, the derivative array equations can be used as a characterization of the solution manifold. If the completion procedure is continued until the derivative array equations are one-full with respect to x'(t), the procedure has revealed the differentiation index of the dae; see definition 2.3. The meaning of one-full is defined in terms of the equation considered pointwise in time, so that a variable and its derivative become independent variables. We emphasize the independence by using the variable ẋ(t) instead of x'(t), where the dot is just an ornament, while the prime is an operator. The equations are then said to be one-full if they determine ẋ(t) uniquely within some open ball, given x(t) and t.
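As a concrete illustration of the derivative array and one-fullness, consider a semi-explicit dae (a standard textbook-style example of ours, not taken from the text; the function names h and g are our own):

```latex
% Semi-explicit DAE  x_1' = h(x_1, x_2),  0 = g(x_1, x_2),
% and its derivative array F_1 (original equations plus their
% time derivatives, with dotted variables treated as independent).
\begin{align*}
  F_1( x, \dot x, \ddot x, t ) =
  \begin{pmatrix}
    \dot x_1 - h( x_1, x_2 ) \\
    g( x_1, x_2 ) \\
    \ddot x_1
      - \tfrac{\partial h}{\partial x_1}( x_1, x_2 )\, \dot x_1
      - \tfrac{\partial h}{\partial x_2}( x_1, x_2 )\, \dot x_2 \\
    \tfrac{\partial g}{\partial x_1}( x_1, x_2 )\, \dot x_1
      + \tfrac{\partial g}{\partial x_2}( x_1, x_2 )\, \dot x_2
  \end{pmatrix} = 0
\end{align*}
```

Given x(t) and t, the first row determines ẋ_1, and when ∂g/∂x_2 is non-singular the last row then determines ẋ_2; the row containing ẍ_1 need not be solved. Hence F_1 is one-full with respect to ẋ, and this dae has differentiation index 1.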
An equivalent characterization can be made in terms of the Jacobian of the derivative array with respect to its differentiated variables; then the equations are one-full if and only if row operations can bring the Jacobian into block diagonal form, with a non-singular block in the block column corresponding to derivatives with respect to ẋ(t) (clearly, this shows that it is possible to solve for ẋ(t) without knowing the variables corresponding to higher order derivatives of x at time t). However, instead of requiring that the completed equations be one-full, it turns out that there are good reasons for using the weaker requirement that the equations display the strangeness index instead. The definition of the strangeness index is the topic of the current chapter, and will soon be considered in detail. It turns out that equations displaying the strangeness index determine x'(t) uniquely if one takes into account the connection between x(t) and x'(t) imposed by the non-differential constraints which locally describe the solution manifold. Strangeness-free equations (strangeness index 0) are suitable for numerical integration (Kunkel and Mehrmann, 1996).

In the sequel, it will be convenient to speak of properties which hold on non-empty open balls inside the set

    L_{ν_S} := { ( t, x, ẋ, . . . , ẋ^{(ν_S+1)} ) : F_{ν_S}( x, ẋ, . . . , ẋ^{(ν_S+1)}, t ) = 0 }        (5.2)

5.1 Definition (Strangeness index). The strangeness index ν_S at ( t_0, x_0 ) is defined as the smallest number (or ∞ if no such number exists) such that the derivative array equations

    F_{ν_S}( t, x(t), x'(t), x'^{(2)}(t), . . . , x'^{(ν_S+1)}(t) ) = 0        (5.3)

satisfy the following properties on

    L_{ν_S} ∩ { ( t, x, ẋ, . . . , ẋ^{(ν_S+1)} ) : t ∈ B_{t_0}(δ) ∧ x ∈ B_{x_0}(δ) }

for some δ > 0.

• P1a–[5.1] There shall exist a constant number n_a such that the rank of

    M_{ν_S}( t, x, ẋ, . . . , ẋ^{(ν_S+1)} ) := [ ∂F_{ν_S}/∂ẋ  · · ·  ∂F_{ν_S}/∂ẋ^{(ν_S+1)} ]

is pointwise equal to ( ν_S + 1 ) n_x − n_a, and there shall exist a smooth matrix-valued function Z_2 with n_a pointwise linearly independent columns such that Z_2^T M_{ν_S} = 0.

• P1b–[5.1] Let n_d = n_x − n_a, and let A_{ν_S} = Z_2^T N_{ν_S} where

    N_{ν_S}( t, x, ẋ, . . . , ẋ^{(ν_S+1)} ) := ∂F_{ν_S}( t, x, ẋ, . . . , ẋ^{(ν_S+1)} )/∂x

Then the rank of A_{ν_S} shall equal n_a, and there shall exist a smooth matrix-valued function X with n_d pointwise linearly independent columns such that A_{ν_S} X = 0.

• P1c–[5.1] The rank of ∇_2 f X shall be full, and there shall exist a smooth matrix-valued function Z_1 with n_d pointwise linearly independent columns such that Z_1^T ∇_2 f X is non-singular.

In section 5.1.2 we present the well-known result that the derivative x'(t) is uniquely defined by F_{ν_S} = 0, and that we can construct a square strangeness-free dae with the same solution for x'(t). The square and strangeness-free equations are referred to as the reduced equation. Then, it is not surprising that the reduced equations can also be used for numerical integration, see Kunkel and Mehrmann (1996, 1998). In section 5.1.3 we propose the new index, seen directly from the viewpoint of discretized equations. In sections 5.2 and 5.3 it is shown that the two views are closely related. (The notation is defined such that x'^{(1)} = x'.)

When working with the strangeness index for nonlinear dae, the next theorem is important to keep in mind.

5.2 Theorem (Kunkel and Mehrmann (2006, theorem 4.13)). Let the strangeness index of (5.1) be ν_S. If the conditions of definition 5.1 also hold with ( ν_S + 1, n_a, n_d ), and there is a point ( t_0, x_0, ẋ_0, . . . , ẋ_0^{(ν_S+2)} ) ∈ L_{ν_S+1}, then the reduced square and strangeness-free dae obtained from F_{ν_S} = 0 has a unique solution passing through the given point, and this solution also solves (5.1).

Proof: This is Kunkel and Mehrmann (2006, theorem 4.13), and we shall just give a very brief overview of their proof.
A closely related result for the simplified strangeness index is proved in section 5.3. Since the reduced equation is implied by the original equation, the original equation cannot have more solutions than the reduced equation. It follows that it suffices to show that the reduced equation has a unique solution, and that this solution satisfies the original equation. In particular, it needs to be shown that the derivatives of the algebraic variables are consistent with the original equation.

The thing to notice about theorem 5.2 is that only knowing that (5.1) has strangeness index ν_S at the point ( t_0, x_0 ) is not enough to ensure that there is a solution passing through this point which also solves (5.1). This fact is well illustrated by Kunkel and Mehrmann (2006, exercise 4.11). However, if the reduced equation (or the full F_{ν_S} = 0) has a unique solution, an approximate alternative to using theorem 5.2 is simply to test the obtained solution (at a finite number of points along the trajectory) against the original equation. In view of this, we mainly consider definition 5.1 as a means for determining when the derivative array equations can be used to show uniqueness of a solution, if one exists at all. In the next section, we elaborate on what might be obvious, namely that definition 5.1 corresponds to a procedure to determine a solution uniquely, if it exists. Then, in section 5.1.3 we make an alternative characterization of ν_S.

5.1.2 Analysis based on the strangeness index

The first step in the analysis, relying on P1a–[5.1], is to determine the local nature of the non-differential constraints that can be deduced from the derivative array equations. By definition, these constraints do not involve any differentiated variables, so the local nature of these constraints is obtained as linear combinations of the derivative array equations such that the gradient with respect to derivatives vanishes.
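To see the objects of definition 5.1 in the simplest possible setting, take the toy equation f( x, ẋ, t ) = ( ẋ_1 + x_1, x_2 − t ) (our illustration, not an example from the text), which turns out to be strangeness-free, so that F_{ν_S} = f:

```latex
% The quantities of definition 5.1 for f(x, \dot x, t) = (\dot x_1 + x_1,\; x_2 - t),
% with \nu_S = 0, n_x = 2, n_a = 1, n_d = 1 (our toy illustration).
\begin{align*}
  M_0 &= \frac{\partial f}{\partial \dot x}
       = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix},
  &
  N_0 &= \frac{\partial f}{\partial x}
       = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},
  \\
  Z_2 &= \begin{pmatrix} 0 \\ 1 \end{pmatrix},
  \quad Z_2^{\mathrm T} M_0 = 0,
  &
  A_0 &= Z_2^{\mathrm T} N_0 = \begin{pmatrix} 0 & 1 \end{pmatrix},
  \quad X = \begin{pmatrix} 1 \\ 0 \end{pmatrix},
  \quad A_0 X = 0,
  \\
  Z_1 &= \begin{pmatrix} 1 \\ 0 \end{pmatrix},
  &
  Z_1^{\mathrm T}\, (\nabla_2 f)\, X &= 1 \quad \text{(non-singular)}
\end{align*}
```

Here the rank of M_0 is 1 = ( 0 + 1 ) n_x − n_a, the single linear combination Z_2^T f = x_2 − t is the pure non-differential constraint, and n_d = 1 dynamic variable remains, in agreement with P1a–P1c.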
P1a–[5.1] states that there are n_a such linear combinations, and that the linear combinations, the columns of Z_2, can be selected smoothly and linearly independent. Since the Gram–Schmidt procedure can be carried out smoothly, it follows that the columns of Z_2 can be selected of unit length and orthogonal to each other. Since the linear combinations Z_2^T F_{ν_S} are smooth functions on a non-empty open set, with zero derivative with respect to the differentiated variables everywhere, these linear combinations do not depend on the differentiated variables at all. Hence, they are pure non-differential constraints that give a local characterization of the solution manifold. To see the local nature of these constraints, their gradient with respect to x is computed, and P1b–[5.1] states that the normal directions to the solution manifold so obtained are linearly independent. Since they are linearly independent, the dimension of the solution manifold is n_d = n_x − n_a. P1b–[5.1] then states that it is possible to construct a local coordinate map x(t) = φ^{−1}( x_d(t), t ) with coordinates x_d, determined by the partial differential equation

    ∇_1 φ^{−1}( x_d(t), t ) = X( x(t), t )        (5.4)

where the columns of X are smooth functions and pointwise linearly independent. Again, they can be selected of unit length and orthogonal to each other. That is, the columns of X can be selected as an orthonormal basis for the right null space of the matrix A. The local coordinates x_d are denoted the dynamic variables. (If the requirement that X have orthonormal columns is dropped, the dynamic variables can be selected as a subset of the original variables x, but this may lead to numeric ill-conditioning.) The last property, P1c–[5.1], is finally there to ensure that the time derivatives of the local coordinates on the solution manifold are determined by the original equation (5.1).
Replacing (5.1) by an equation with residual expressed only through the dynamic variables,

$$ f_d( x_d, \dot{x}_d, t ) \triangleq f\bigl( \varphi^{-1}( x_d, t ),\; \nabla_1 \varphi^{-1}( x_d, t )\, \dot{x}_d + \nabla_2 \varphi^{-1}( x_d, t ),\; t \bigr) \tag{5.5} $$

property P1c–[5.1] states that the Jacobian with respect to ẋ_d,

$$ \nabla_2 f_d( x_d, \dot{x}_d, t ) = \nabla_2 f\bigl( \varphi^{-1}( x_d, t ),\; \nabla_1 \varphi^{-1}( x_d, t )\, \dot{x}_d + \nabla_2 \varphi^{-1}( x_d, t ),\; t \bigr)\, X \tag{5.6} $$

has full rank. Since there are only n_d derivatives to be determined, and there are n_x equations, there are n_a more equations than unknowns. The property P1c–[5.1] also states that n_d linear combinations of the equations in (5.1), given by the columns of Z_1, can be chosen smoothly and linearly independent (and hence orthonormal), so that these linear combinations are sufficient to determine the time derivatives of the dynamic variables. The reduced equations can now be constructed by joining the n_d residuals Z_1^T f with the n_a residuals Z_2^T F_{νS}. The resulting system has n_x equations and ( νS + 1 ) n_x unknowns (x and t being known variables), but νS n_x of these cannot and need not be solved for. Hence, the system may be considered square, and it is easy to see that it is strangeness-free (with trivial choices of Z_1 and Z_2, and the same X as when the strangeness index of (5.1) was determined). Unfortunately, despite the theoretical appeal of the reduced equations, and while it is sufficient to approximate Z_1 in numeric implementations, the practical implementation of Z_2 needs to make the non-differential equations truly independent of the differentiated variables. This presents severe difficulties unless it can be shown that Z_2 only depends on t, for then it can be computed pointwise in time. It follows that the reduced equations will generally not be suitable for numerical integration. We shall now show how the analysis above can be used for numerical integration without using the reduced equation.
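The matrices Z_1, Z_2, X above are all chosen as orthonormal bases of null spaces or range spaces. As a minimal numerical sketch (the function name is ours, not from the thesis), an orthonormal basis for a left null space — as used for Z_2 — can be obtained from an SVD:

```python
import numpy as np

def orthonormal_left_nullspace(M, tol=1e-10):
    """Orthonormal columns spanning the left null space of M (compare Z2)."""
    U, s, Vt = np.linalg.svd(M)
    rank = int(np.sum(s > tol))
    return U[:, rank:]

# Example: a rank-1 3x3 matrix has a 2-dimensional left null space.
M = np.outer([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
Z2 = orthonormal_left_nullspace(M)
print(Z2.shape)                     # (3, 2)
print(np.allclose(Z2.T @ M, 0.0))   # True
```

Note that an SVD gives an orthonormal basis pointwise; the smooth selection of Z_2 along a trajectory, as required in the text, is a separate matter (the smooth Gram–Schmidt procedure mentioned above).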
Although Kunkel and Mehrmann (2006, section 6.2) address this, we give another (although similar) argument for the case considered here. Consider numerical integration via a first order bdf method. We then need to show that this formula uniquely determines the next iterate, qx, given x. The equations to which the bdf method is applied are the full derivative array equations, where each derivative is considered an unknown, and without discretizing any of the derivatives. To these equations the n_d selected linear combinations of f are added, discretizing all derivatives. We know that the derivative array equations constrain qx to lie on a manifold which can be parameterized locally using x_d as coordinates. It remains to show that these coordinates are uniquely determined by the dynamic equations. However, thinking of the dynamic equations in terms of the dynamic variables, it is readily seen that being able to solve for the derivatives (which is possible by definition 5.1) is equivalent to being able to solve for qx_d for sufficiently small step lengths.

5.1.3 The simplified strangeness index

The final remarks on numerical integration in the previous section relate closely to the analysis in this section. Here we begin, not with reasoning about the ability to solve for derivatives, but by going directly to the topic of finding equations that uniquely determine the next iterate in a bdf method. Later, it will be shown in lemma 5.13 that the resulting index definition can also be interpreted as a condition making x′(t) uniquely determined by x(t) and t. By discretizing the derivatives (using a bdf method) in the original equation (5.1) (and scaling the equations by the step length), we get that the gradient of these equations with respect to x tends to ∇₂ f( x, ẋ, t ) as the step length tends to zero. Hence, joining these equations with the full derivative array equations (where no derivatives are discretized) yields a set of equations which (locally) shall determine x uniquely.
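The first order bdf discretization referred to above replaces ẋ by ( x − q⁻¹x ) / h. As a hedged illustration (the function and its finite-difference Newton iteration are ours, not the thesis implementation), one such step for a residual f( x, ẋ, t ) = 0 can be sketched as:

```python
import numpy as np

def bdf1_step(f, x_prev, t, h, newton_iters=20, tol=1e-12):
    """One first-order BDF (implicit Euler) step for f(x, xdot, t) = 0:
    solve f(x, (x - x_prev)/h, t + h) = 0 for the next iterate x."""
    x = x_prev.copy()
    for _ in range(newton_iters):
        r = f(x, (x - x_prev) / h, t + h)
        if np.linalg.norm(r) < tol:
            break
        # Finite-difference Jacobian of the discretized residual w.r.t. x.
        n = x.size
        J = np.empty((r.size, n))
        eps = 1e-7
        for j in range(n):
            xp = x.copy()
            xp[j] += eps
            J[:, j] = (f(xp, (xp - x_prev) / h, t + h) - r) / eps
        x = x - np.linalg.solve(J, r)
    return x

# Scalar test on the ODE x' = -x written as a residual: one step of size
# h = 0.1 from x(0) = 1 gives x1 = 1 / (1 + h) = 10/11.
f = lambda x, xdot, t: xdot + x
x1 = bdf1_step(f, np.array([1.0]), 0.0, 0.1)
```

For a dae, the Jacobian of the discretized residual tends (after scaling by h) to ∇₂f, which is singular in the non-trivial cases; this is exactly why the derivative array equations must be adjoined, as the text describes.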
This leads to the following definition.

5.3 Definition (Simplified strangeness index). The simplified strangeness index νq at ( t0, x0 ) is defined as the smallest number (or ∞ if no such number exists) such that the derivative array equations

$$ F_{\nu_q}\bigl( t,\, x(t),\, x'(t),\, x^{(2)}(t),\, \dots,\, x^{(\nu_q+1)}(t) \bigr) = 0 \tag{5.7} $$

satisfy the following property on

$$ L_{\nu_q} \cap \underbrace{\bigl\{\, ( t, x, \dot{x}, \dots, \dot{x}^{(\nu_q+1)} ) : t \in B_{t_0}(\delta) \wedge x \in B_{x_0}(\delta) \,\bigr\}}_{b_\delta} $$

for some δ > 0.

• P2–[5.3] Let

$$ H_{\nu_q} \triangleq \begin{bmatrix} \frac{\partial f}{\partial \dot{x}} & 0 & \cdots & 0 \\ \frac{\partial F_{\nu_q}}{\partial x} & \frac{\partial F_{\nu_q}}{\partial \dot{x}} & \cdots & \frac{\partial F_{\nu_q}}{\partial \dot{x}^{(\nu_q+1)}} \end{bmatrix} = \begin{bmatrix} \nabla_2 f & 0 \\ N_{\nu_q} & M_{\nu_q} \end{bmatrix} $$

where N_{νq} and M_{νq} are defined as in definition 5.1. Then it shall hold that

$$ \operatorname{rank} \begin{bmatrix} \nabla_2 f & 0 \\ N_{\nu_q} & M_{\nu_q} \end{bmatrix} = \operatorname{rank} \begin{bmatrix} I & 0 \\ \nabla_2 f & 0 \\ N_{\nu_q} & M_{\nu_q} \end{bmatrix} $$

That is, the basis vectors corresponding to x shall be in the span of the rows of H_{νq}, which may be recognized as the property of H_{νq} being one-full. The property P2–[5.3] can be interpreted as saying that there is no freedom in the x components of the solution to

$$ \begin{bmatrix} h\, f\bigl( x,\, \tfrac{1}{h}( x - q^{-1}x ),\, t \bigr) \\ F_{\nu_q}\bigl( t, x, \dot{x}, \dots, \dot{x}^{(\nu_q+1)} \bigr) \end{bmatrix} = 0 $$

since adding additional equations for the x variables alone does not decrease the solution space of the linearized equations. For theoretic considerations, however, the continuous-time interpretation provided by lemma 5.13 below is more relevant. Of course, we must show what the simplified strangeness index is for the inevitable pendulum.

5.4 Example. Let us once more consider the pendulum from example 3.3. To match the notation of the present chapter we define

$$ f\left( \begin{bmatrix} \xi \\ y \\ u \\ v \\ \lambda \end{bmatrix}, \begin{bmatrix} \dot{\xi} \\ \dot{y} \\ \dot{u} \\ \dot{v} \\ \dot{\lambda} \end{bmatrix}, t \right) \triangleq \begin{bmatrix} \lambda \xi - \dot{u} \\ \lambda y - g - \dot{v} \\ \xi^2 + y^2 - 1 \\ \dot{\xi} - u \\ \dot{y} - v \end{bmatrix} $$

We consider initial conditions where the pendulum is in motion and neither ξ nor y is zero. To check P2–[5.3] we look at the projection of a basis for the right null space of H_i onto the space spanned by the basis vectors corresponding to x, for i = 0, 1, . . . , νq.
(The projection is implemented by just keeping the five first entries of the vectors.) The basis for the null space is computed using Mathematica. (The display of the projected basis vectors for i = 0, 1, 2 is omitted here; the non-zero entries are rational in ξ, y, ξ̇, ẏ with denominator ẏ ξ − ξ̇ y, appearing in the λ component for i = 0 and i = 1, while for i = 2 all projected basis vectors vanish.) Assuming that the symbolic null space computations are valid in some neighborhood of the initial conditions inside L_i, it is seen that the λ component is undetermined for i = 0 and i = 1, and as all components are determined for i = 2 we get νq = 2. To verify that the symbolic computations of the null space are actually valid, it must be checked that the denominators are non-zero. The expressions which were removed by the projections are also rational with the denominator ẏ ξ − ξ̇ y. Since the length of the vector ( ξ, y ) is 1 on L_i, a geometric interpretation shows that the denominator expression is the scalar product of the vector ( ξ̇, ẏ ) and a unit vector which is tangent to the unit circle at the point ( ξ, y ). Our intuition about the problem gives that the initial conditions for ( u, v ) = ( ξ̇, ẏ ) may actually be chosen parallel with the tangent, and the restriction to a neighborhood of the initial conditions inside L_i gives that any ( u, v ) will at least be close to parallel with the tangent. Hence, the denominator expression is zero precisely when the velocity variables are zero. Since we chose to analyze the equations for initial conditions where the pendulum is in motion, the velocity will remain non-zero in a neighborhood of the initial conditions, proving the validity of the null space computations. Since our prior understanding of the problem makes it easy to compute points inside L_i for any i, the simplified strangeness index can also be computed numerically if we either assume or make sure that certain critical ranks are not sensitive to small perturbations of the variables.
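Such a numeric rank test of the one-fullness condition in P2–[5.3] is straightforward to state; the following sketch (function name and examples ours, not from the thesis) checks whether the rows corresponding to the x directions lie in the row span of a given H:

```python
import numpy as np

def is_one_full(H, nx, tol=1e-10):
    """Rank condition of P2-[5.3]: the x basis vectors (first nx columns)
    must lie in the row span of H, i.e. rank H == rank [[I 0]; H]."""
    E = np.hstack([np.eye(nx), np.zeros((nx, H.shape[1] - nx))])
    r1 = np.linalg.matrix_rank(H, tol)
    r2 = np.linalg.matrix_rank(np.vstack([E, H]), tol)
    return r1 == r2

# With the last row present, both x-directions can be recovered
# (e1 = row1 - 2*row3, e2 = row2 - 3*row3), so H is one-full:
H = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 3.0],
              [0.0, 0.0, 1.0]])
print(is_one_full(H, 2))        # True
# Dropping the last row leaves the x part coupled to the third unknown:
print(is_one_full(H[:2], 2))    # False
```

As the example text notes, such a test is only trustworthy if the critical ranks are insensitive to small perturbations; the tolerance plays the role of that robustness assumption.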
The method is not pursued in the example, in order to keep the focus of the current chapter on methods and theory for exact dae. The following lemma is an example of how easy the simplified strangeness index is to work with.

5.5 Lemma. If P2–[5.3] is satisfied for νq on (the obvious projection of) L_{νq+i} ∩ b_δ for some i ≥ 1, then P2–[5.3] is also satisfied for νq + i on the same set.

Proof: If the basis vectors corresponding to x are in the span of the rows of H_{νq}, they (extended to the appropriate dimension) will also be in the span of the rows of H_{νq+i}, since the upper part of this matrix equals [ H_{νq}  0 ].

While lemma 5.13 below quite intuitively will show that a finite simplified strangeness index implies uniqueness of solutions to the dae, we postpone until section 5.3 considering how P2–[5.3] may be used to also test existence of solutions. For now, we concentrate on how to compute the solution if it exists; recall that there is always the possibility to test any solution numerically (at a finite set of points along the trajectory) against the original equation, which should yield a good indication of true existence. Once νq has been determined, the next task is to select which equations to use (that is, pick a suitable Z_1), and which variables to discretize (possibly via a change of variables, using an approximation of X). It is important not to make the discretized equations over-determined by including too many independent columns in Z_1, as this may compromise the non-differential constraints of the dae. Since the discretized derivatives are approximations, as few variables as possible should be discretized. The procedure prescribed by definition 5.3 may become demanding if one tries to use it directly to find a subset of components of f which is sufficient to determine x, since a null space of a large map has to be computed for each candidate subset of components. The following constructive method remedies this.
First, a basis for the right null space of [ N_{νq}  M_{νq} ] (that is, the gradient of the derivative array equations with respect to all unknown variables) is computed. The basis vectors are then chopped to get the tangent space of the solution manifold seen in x-space. Note that the chopped vectors will span the tangent space, but always contain too many elements to be a basis (there are ( νq + 1 ) n_f equations and ( νq + 2 ) n_x variables, and the equations will generally be dependent). To simplify the argument, a basis X for the tangent space is constructed, and the number of elements in this basis is the number of dynamic variables. Hence, it is possible to locally parameterize x in x_d, and more equations are needed in order to determine x_d given q⁻¹x. As in section 5.1.2, we are led to rewriting the equations given by f in terms of x_d, and to requiring that the gradient of these equations (where derivatives have been discretized) with respect to x_d be non-singular. By the chain rule, this means that the product ∇₂f X shall have full column rank. Selecting a subset of components of f directly translates to selecting a subset of rows in this matrix product, and selecting a non-singular such subset is relatively cheap compared to computing the large null spaces in P2–[5.3]. It can be seen that computing the null space basis X is not necessary, but at least the dimension n_d of the null space has to be known, because instead of requiring that the product ∇₂f X be non-singular, we shall require that its rank agrees with n_d. Note that if a consistent point is given (with as many derivatives as we may need), νq can be determined using definition 5.3, and then the constructive method can be used to determine a sufficient subset of components of f (or, in general, determine the matrix Z_1). For lti dae, definition 5.3 is easily related to the differentiation index ν_D (definition 2.3), as the next theorem shows.

5.6 Theorem.
For the lti dae

$$ E\, x'(t) + A\, x(t) + B\, u(t) = 0 $$

it holds that νq = max{ 0, ν_D − 1 }.

Proof: The residual function of the dae is given by

$$ f( x, \dot{x}, t ) = E \dot{x} + A x + B u(t) $$

If ν_D = 0, ∇₂f( x, ẋ, t ) = E is non-singular, and it follows that νq = 0. It remains to consider ν_D > 0. Recall the special structure of the derivative array equations for lti dae, seen in (2.34),

$$ F_{\nu_D}\bigl( t, x, \dot{x}, \dots, \dot{x}^{(\nu_D+1)} \bigr) = \underbrace{\begin{bmatrix} A \\ 0 \\ \vdots \\ 0 \end{bmatrix}}_{N_{\nu_D}} x + \underbrace{\begin{bmatrix} E & & & \\ A & E & & \\ & \ddots & \ddots & \\ & & A & E \end{bmatrix}}_{M_{\nu_D}} \begin{bmatrix} \dot{x} \\ \vdots \\ \dot{x}^{(\nu_D+1)} \end{bmatrix} + \begin{bmatrix} B u(t) \\ B u'(t) \\ \vdots \\ B u^{(\nu_D)}(t) \end{bmatrix} $$

By definition of ν_D, ẋ is uniquely determined by F_{ν_D} = 0, which is a condition only in terms of M_{ν_D}. Partitioning this matrix as

$$ M_{\nu_D} = \begin{bmatrix} E & & & \\ A & E & & \\ & \ddots & \ddots & \\ & & A & E \end{bmatrix} = \begin{bmatrix} \nabla_2 f & 0 \\ N_{\nu_D-1} & M_{\nu_D-1} \end{bmatrix} $$

shows that νq = ν_D − 1.

The strangeness index νS is known to have the same relation to the differentiation index also for ltv dae (Kunkel and Mehrmann, 2006, section 3.3) and nonlinear dae in Hessenberg form (Kunkel and Mehrmann, 2006, theorem 4.23), but some of the proofs are lengthy, and instead of making more comparisons of νq versus ν_D we now turn to the direct relation between νq and νS.

5.2 Relations

In this section, the two indices νS and νq will be shown to be closely related. This is done by means of a matrix decomposition developed for this purpose. We first show the matrix decomposition, and then interpret the two definitions in terms of this decomposition.

5.7 Lemma. The matrix [ N  M ], where N ∈ R^{k×l}, M ∈ R^{k×k}, rank M = k − a, a ≥ 1, can be decomposed as

$$ \begin{bmatrix} N & M \end{bmatrix} = \begin{bmatrix} Q_{1,1} & Q_{1,2} \end{bmatrix} \begin{bmatrix} 0 & 0 & \Sigma & 0 \\ A & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} Q_{3,1}^T & 0 \\ Q_{3,2}^T & 0 \\ \Sigma^{-1} Q_{1,1}^T N & Q_{2,1}^T \\ 0 & Q_{2,2}^T \end{bmatrix} $$

In this decomposition, the left matrix is unitary, as are the diagonal blocks of the right matrix. The matrix Σ is a diagonal matrix of the non-zero singular values of M. The matrix A is square.
Proof: Introducing the singular value decomposition

$$ M = Q_1 \begin{bmatrix} \Sigma & 0 \\ 0 & 0 \end{bmatrix} Q_2^T = \begin{bmatrix} Q_{1,1} & Q_{1,2} \end{bmatrix} \begin{bmatrix} \Sigma & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} Q_{2,1}^T \\ Q_{2,2}^T \end{bmatrix} $$

we get

$$ \begin{bmatrix} N & M \end{bmatrix} = Q_1 \begin{bmatrix} Q_{1,1}^T N & \Sigma & 0 \\ Q_{1,2}^T N & 0 & 0 \end{bmatrix} \begin{bmatrix} I & 0 \\ 0 & Q_{2,1}^T \\ 0 & Q_{2,2}^T \end{bmatrix} $$

By the QR decomposition

$$ Q_{1,2}^T N = \begin{bmatrix} A & 0 \end{bmatrix} \begin{bmatrix} Q_{3,1}^T \\ Q_{3,2}^T \end{bmatrix} $$

where A is square and may contain dependent columns (in particular, some of the columns may be zero), we then get

$$ \begin{bmatrix} N & M \end{bmatrix} = Q_1 \begin{bmatrix} Q_{1,1}^T N Q_{3,1} & Q_{1,1}^T N Q_{3,2} & \Sigma & 0 \\ A & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} Q_{3,1}^T & 0 \\ Q_{3,2}^T & 0 \\ 0 & Q_{2,1}^T \\ 0 & Q_{2,2}^T \end{bmatrix} $$

Finally, the relation

$$ \begin{bmatrix} Q_{1,1}^T N Q_{3,1} & Q_{1,1}^T N Q_{3,2} & \Sigma \end{bmatrix} \begin{bmatrix} Q_{3,1}^T & 0 \\ Q_{3,2}^T & 0 \\ 0 & Q_{2,1}^T \end{bmatrix} = \begin{bmatrix} Q_{1,1}^T N & \Sigma Q_{2,1}^T \end{bmatrix} = \begin{bmatrix} 0 & 0 & \Sigma \end{bmatrix} \begin{bmatrix} Q_{3,1}^T & 0 \\ Q_{3,2}^T & 0 \\ \Sigma^{-1} Q_{1,1}^T N & Q_{2,1}^T \end{bmatrix} $$

enables us to write

$$ \begin{bmatrix} N & M \end{bmatrix} = \begin{bmatrix} Q_{1,1} & Q_{1,2} \end{bmatrix} \begin{bmatrix} 0 & 0 & \Sigma & 0 \\ A & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} Q_{3,1}^T & 0 \\ Q_{3,2}^T & 0 \\ \Sigma^{-1} Q_{1,1}^T N & Q_{2,1}^T \\ 0 & Q_{2,2}^T \end{bmatrix} $$

as desired.

5.8 Theorem. Definition 5.1 and definition 5.3 satisfy the relation νS ≥ νq.

Proof: Suppose that the strangeness index is νS and finite, as the infinite case is trivial. Let the matrices N and M in lemma 5.7 correspond to N_{νS} and M_{νS} as in definition 5.1. First, let us consider νS in view of this decomposition. The left null space of M is spanned by Q_{1,2}, and making these linear combinations of N results in

$$ Q_{1,2}^T N = \begin{bmatrix} A & 0 \end{bmatrix} \begin{bmatrix} Q_{3,1}^T \\ Q_{3,2}^T \end{bmatrix} = A\, Q_{3,1}^T $$

where A has full rank due to P1b–[5.1]. This matrix determines the tangent space of the non-differential constraints as being its null space, spanned by the independent columns of Q_{3,2}. Hence, we can parameterize x as x = Q_{3,2} x_d. Turning to νq, we follow the constructive interpretation of P2–[5.3] in section 5.1.3. The right null space of [ N  M ] is given by the second and fourth block rows of the right factor in the decomposition;

$$ \begin{bmatrix} N & M \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = 0 \iff \exists z_1, z_2 : \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} Q_{3,2} & 0 \\ 0 & Q_{2,2} \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} \tag{5.8} $$

Extracting the part of this equation which only involves x, we find that it can be parameterized in z_1 alone, and since the columns of Q_{3,2} are independent, we can use z_1 as dynamic variables; x = Q_{3,2} x_d.
Since the strangeness index is νS, ∇₂f Q_{3,2} has full column rank according to P1c–[5.1]. Hence,

$$ \begin{bmatrix} \nabla_2 f & 0 \\ N & M \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = 0 \iff \exists z_2 : \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ Q_{2,2} z_2 \end{bmatrix} \tag{5.9} $$

which is exactly the condition captured by P2–[5.3]. Since νq is the smallest index such that this condition is satisfied, it is no greater than νS.

5.9 Theorem. Given the property

• P3–[5.9] The matrix [ N_{νq}  M_{νq} ] has full row rank on the set L_{νq} ∩ b_δ in definition 5.3. That is,

$$ \operatorname{rank} \begin{bmatrix} N_{\nu_q} & M_{\nu_q} \end{bmatrix} = ( \nu_q + 1 )\, n_x \tag{5.10} $$

it holds that definition 5.1 and definition 5.3 satisfy the relation νS = νq.

Proof: Due to theorem 5.8 it suffices to show νS ≤ νq. To this end, suppose the equations have finite simplified strangeness index νq, as the infinite case is trivial. Let the matrices N and M in lemma 5.7 correspond to N_{νq} and M_{νq} as in definition 5.3. The rank condition (5.10) implies that A in lemma 5.7 is non-singular. Consider (5.8) and (5.9). Since adding the equation ∇₂f x = 0 is sufficient to conclude x = 0 given x = Q_{3,2} z_1, it is seen that ∇₂f Q_{3,2} z_1 = 0 must imply z_1 = 0. This is only true if ∇₂f Q_{3,2} has full column rank, which shows that P1c–[5.1] holds. Since νS is the smallest index such that this condition is satisfied, it is no greater than νq.

We now have an alternative to the procedure of definition 5.1 for computing the strangeness index νS. First, one computes νq according to definition 5.3. If P3–[5.9] is satisfied for νq and some selection of b_δ in definition 5.3, νS = νq according to theorem 5.9. If P3–[5.9] is not satisfied for any choice of b_δ, νS = ∞. In the remaining case, when P3–[5.9] holds on some set where P2–[5.3] does not hold, νS > νq may still be finite. According to lemma 5.5 it can then be found as the smallest number νS such that P3–[5.9] holds on the intersection of L_{νS} and a set b_δ ⊆ R^{n_x ( νS + 2 ) + 1}, while P2–[5.3] holds on the obvious projection of this set.
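The decomposition of lemma 5.7 is also easy to realize numerically, as an SVD of M followed by a QR factorization of Q_{1,2}ᵀN. The following NumPy sketch (function name and test matrices ours) builds the three factors and verifies the reconstruction:

```python
import numpy as np

def lemma_5_7(N, M, tol=1e-10):
    """Sketch of the decomposition in lemma 5.7: SVD of M, then a QR
    factorization realizing Q12' N = [A 0] Q3'. Returns the three factors."""
    k, l = M.shape[0], N.shape[1]
    U, s, Vt = np.linalg.svd(M)
    r = int(np.sum(s > tol))              # rank M = k - a
    a = k - r
    Q11, Q12 = U[:, :r], U[:, r:]
    Q21, Q22 = Vt.T[:, :r], Vt.T[:, r:]
    Sigma = np.diag(s[:r])
    # Q12' N = [A 0] Q3', obtained from a complete QR of (Q12' N)'.
    Q3, R = np.linalg.qr((Q12.T @ N).T, mode='complete')
    A = R[:a, :].T                        # square, possibly singular
    Q31, Q32 = Q3[:, :a], Q3[:, a:]
    left = np.hstack([Q11, Q12])
    mid = np.block([
        [np.zeros((r, a)), np.zeros((r, l - a)), Sigma, np.zeros((r, k - r))],
        [A, np.zeros((a, l - a)), np.zeros((a, r)), np.zeros((a, k - r))]])
    right = np.block([
        [Q31.T, np.zeros((a, k))],
        [Q32.T, np.zeros((l - a, k))],
        [np.linalg.inv(Sigma) @ Q11.T @ N, Q21.T],
        [np.zeros((k - r, l)), Q22.T]])
    return left, mid, right

# Random example with rank-deficient M (k = 4, a = 2, l = 3).
rng = np.random.default_rng(0)
M = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 4))   # rank 2
N = rng.standard_normal((4, 3))
left, mid, right = lemma_5_7(N, M)
print(np.allclose(left @ mid @ right, np.hstack([N, M])))        # True
```

Note that the left factor is the unitary matrix from the SVD of M, and that the one non-orthonormal block of the right factor, Σ⁻¹Q_{1,1}ᵀN, is exactly the block appearing in the lemma.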
5.3 Uniqueness and existence of solutions

The present section gives a result corresponding to what Kunkel and Mehrmann (2006, theorem 4.13) states for the strangeness index. As the difference between the two index definitions is basically a matter of whether P3–[5.9] is required or not, the main ideas in Kunkel and Mehrmann (2006) apply here as well.

5.10 Lemma. If the simplified strangeness index νq is finite, there exist matrix functions Z_1, Z_2, X, similar to those in definition 5.1. They are all smooth with pointwise linearly independent columns, satisfying

Z_2^T M_{νq} = 0, and the columns of Z_2 span the left null space of M_{νq}    (5.11a)
Z_2^T N_{νq} X = 0, and the columns of X span the right null space of Z_2^T N_{νq}    (5.11b)
Z_1^T ∇₂f X is non-singular    (5.11c)

Proof: Using the decomposition of lemma 5.7, we may take Z_2 ≜ Q_{1,2} and X = Q_{3,2}. As in the proof of theorem 5.9, (5.8) and (5.9) then imply that ∇₂f X has full column rank, and the existence of Z_1 follows.

Multiplying the relations in (5.11) by smooth pointwise non-singular matrix functions shows that the matrix functions Z_1, Z_2, X are not unique; they can be replaced by any smooth matrices with columns spanning the same linear spaces. For numerical purposes, the smooth Gram–Schmidt orthonormalization procedure may be used to obtain matrices with good numerical properties, while the theoretical argument of the present section benefits from another choice, to be derived next. Select the non-singular constant matrix P = [ P_d  P_a ] such that Z_2^T N_{νq} P_a is non-singular in a neighborhood of the initial conditions, and make a change of the undotted variables in L_{νq} according to

$$ x = \begin{bmatrix} P_d & P_a \end{bmatrix} \begin{bmatrix} x_d \\ x_a \end{bmatrix} \tag{5.12} $$

The following notation will turn out to be convenient later (note that N^a_{νq} is non-singular):

$$ N^d_{\nu_q} \triangleq Z_2^T N_{\nu_q} P_d, \qquad N^a_{\nu_q} \triangleq Z_2^T N_{\nu_q} P_a \tag{5.13} $$

The next result corresponds to Kunkel and Mehrmann (2006, corollary 4.10) for the strangeness index.

5.11 Lemma.
There exists a smooth function R such that

$$ x_a = R( x_d, t ) \tag{5.14} $$

inside L_{νq}, in a neighborhood of the initial conditions.

Proof: In L_{νq} it holds that F_{νq} = 0 and Z_2^T M_{νq} = 0, and it follows that

$$ \frac{\partial\, Z_2^T F_{\nu_q}}{\partial \dot{x}^{(1+)}} = Z_2^T \frac{\partial F_{\nu_q}}{\partial \dot{x}^{(1+)}} + \frac{\partial Z_2^T}{\partial \dot{x}^{(1+)}} F_{\nu_q} = 0 $$

Hence, the construction of Z_2 is such that Z_2^T F_{νq} only depends on t and x, and the change of variables (5.12) was selected so that the part of the Jacobian corresponding to x_a is non-singular. It follows that x_a can be expressed locally as a function of x_d and t.

We now introduce the function φ⁻¹ to describe the local parameterization of x using the coordinates x_d and t,

$$ x = \varphi^{-1}( x_d, t ) \triangleq P \begin{bmatrix} x_d \\ R( x_d, t ) \end{bmatrix} \tag{5.15} $$

and the next lemma shows an important coupling between φ⁻¹ and lemma 5.10.

5.12 Lemma. The matrix X in lemma 5.10 can be chosen in the form

$$ \hat{X} = P \begin{bmatrix} I \\ \nabla_1 R( x_d, t ) \end{bmatrix} = \nabla_1 \varphi^{-1}( x_d, t ) \tag{5.16} $$

Proof: Clearly, the columns are linearly independent and smooth. By verifying that the matrix is in the right null space of Z_2^T N_{νq} we will show that its columns span the same linear space as X. It will then follow that X and X̂ are related by a relation of the form X̂ = X W for some smooth non-singular matrix function W. Using the form X W then shows that (5.11c) is also satisfied. Hence, it remains to show that X̂ is in the right null space of Z_2^T N_{νq}. Using (5.14), and allowing also the dotted variables ẋ^{(1+)} to depend on x_d, in (suppressing arguments)

$$ \frac{\partial\, Z_2^T F_{\nu_q}}{\partial x_d} = 0 $$

it follows that

$$ \left( \frac{\partial Z_2^T}{\partial x_d} + \frac{\partial Z_2^T}{\partial x_a} \nabla_1 R + \frac{\partial Z_2^T}{\partial \dot{x}^{(1+)}} \frac{\partial \dot{x}^{(1+)}}{\partial x_d} \right) F_{\nu_q} + Z_2^T \left( \frac{\partial F_{\nu_q}}{\partial x_d} + \frac{\partial F_{\nu_q}}{\partial x_a} \nabla_1 R + \frac{\partial F_{\nu_q}}{\partial \dot{x}^{(1+)}} \frac{\partial \dot{x}^{(1+)}}{\partial x_d} \right) = 0 $$

Here, F_{νq} = 0 and Z_2^T ∂F_{νq}/∂ẋ^{(1+)} = Z_2^T M_{νq} = 0 imply that

$$ Z_2^T \left( \frac{\partial F_{\nu_q}}{\partial x_d} + \frac{\partial F_{\nu_q}}{\partial x_a} \nabla_1 R \right) = Z_2^T \nabla_2 F_{\nu_q} \begin{bmatrix} P_d & P_a \end{bmatrix} \begin{bmatrix} I \\ \nabla_1 R \end{bmatrix} = Z_2^T N_{\nu_q} \hat{X} = 0 $$

Back in section 5.1.3 it was indicated that we would be able to show that a finite simplified strangeness index implies local uniqueness of solutions.
With lemma 5.7 at our disposal this statement can now be shown rather easily.

5.13 Lemma. If the simplified strangeness index is finite and x is a solution to the dae for some initial conditions in L_{νq} ∩ b_δ, then the solution x is locally unique.

Proof: Using the parameterization of x given by (5.15), it suffices to show that the coordinates x_d are uniquely defined. By the smoothness assumptions and the analytic implicit function theorem, Hörmander (1966), showing that x_d′(t) is uniquely determined given x_d(t) and t will be sufficient, since then the corresponding ode will have a right hand side which is continuously differentiable, and hence locally Lipschitz on any compact set. One may then complete the argument by applying a basic local uniqueness theorem for ode, such as Coddington and Levinson (1985, theorem 2.2). Reusing (5.5) for the current context, x_d′(t) is seen to be uniquely determined if ∇₂f_d( x_d, ẋ_d, t ) is non-singular (in some neighborhood L_{νq} ∩ b_δ of the initial conditions). Identifying (5.6) in (5.11c), lemma 5.12 completes the proof.

With X̂ according to (5.16) it follows that

$$ Z_2^T N_{\nu_q} \hat{X} = N^d_{\nu_q} + N^a_{\nu_q} \nabla_1 R = 0 \tag{5.17} $$

using the notation (5.13). Before stating the main theorem of the section we derive one more equation. Using (5.14) and allowing also the dotted variables ẋ^{(1+)} to depend on t in (suppressing arguments)

$$ Z_2^T \frac{\partial F_{\nu_q}}{\partial t} = 0 $$

it follows that

$$ Z_2^T \left( \nabla_1 F_{\nu_q} + \nabla_2 F_{\nu_q} \nabla_2 \varphi^{-1} + M_{\nu_q} \frac{\partial \dot{x}^{(1+)}}{\partial t} \right) = Z_2^T \left( \nabla_1 F_{\nu_q} + \nabla_2 F_{\nu_q} \nabla_2 \varphi^{-1} \right) = 0 \tag{5.18} $$

5.14 Theorem. Consider a sufficiently smooth dae (5.1), repeated here,

$$ f( x(t), x'(t), t ) = 0 \tag{5.1} $$

with finite simplified strangeness index νq and where the un-dotted variables in L_{νq} form a manifold of dimension n_d. If the set where P2–[5.3] holds is the projection of a similar b_δ⁺ ∩ L_{νq+1}, and P2–[5.3] also holds on b_δ⁺ ∩ L_{νq+1} with the same dimension n_d, then there is a unique solution to (5.1) for any initial conditions in b_δ⁺ ∩ L_{νq+1}.
Proof: Considering how F_{νq+1} is obtained from F_{νq}, it is seen that the equality F_{νq+1} = 0 can be written

$$ \nabla_1 F_{\nu_q} + \nabla_2 F_{\nu_q} \dot{x} + \nabla_{3+} F_{\nu_q} \dot{x}^{(2+)} = 0 $$

Multiplying by Z_2^T from the left and identifying the expressions for N_{νq} and M_{νq}, one obtains

$$ Z_2^T \left( \nabla_1 F_{\nu_q} + \nabla_2 F_{\nu_q} \dot{x} \right) = 0 $$

Using (5.18) and the change of variables (compare (5.12))

$$ \dot{x} = \begin{bmatrix} P_d & P_a \end{bmatrix} \begin{bmatrix} \dot{x}_d \\ \dot{x}_a \end{bmatrix} \tag{5.19} $$

leads to (using the notation introduced in (5.13))

$$ \begin{bmatrix} N^d_{\nu_q} & N^a_{\nu_q} \end{bmatrix} \left( -P^{-1} \nabla_2 \varphi^{-1} + \begin{bmatrix} \dot{x}_d \\ \dot{x}_a \end{bmatrix} \right) = 0 $$

Using (5.15) and (5.17) yields

$$ -N^a_{\nu_q} \nabla_2 R( x_d, t ) - N^a_{\nu_q} \nabla_1 R( x_d, t ) \dot{x}_d + N^a_{\nu_q} \dot{x}_a = 0 $$

and since N^a_{νq} is non-singular, it must hold that

$$ \dot{x} = P \begin{bmatrix} \dot{x}_d \\ \dot{x}_a \end{bmatrix} = P \begin{bmatrix} I \\ \nabla_1 R( x_d, t ) \end{bmatrix} \dot{x}_d + P \begin{bmatrix} 0 \\ \nabla_2 R( x_d, t ) \end{bmatrix} = \nabla_1 \varphi^{-1}( x_d, t )\, \dot{x}_d + \nabla_2 \varphi^{-1}( x_d, t ) $$

Since f( x, ẋ, t ) = 0 holds by definition on L_{νq}, it follows that

$$ f\bigl( \varphi^{-1}( x_d, t ),\; \nabla_1 \varphi^{-1}( x_d, t )\, \dot{x}_d + \nabla_2 \varphi^{-1}( x_d, t ),\; t \bigr) = 0 $$

where ẋ_d is uniquely determined given x_d and t by (5.11c) with ∇₁φ⁻¹ = X̂ in place of X. Hence, the dae

$$ f\bigl( \varphi^{-1}( x_d(t), t ),\; \nabla_1 \varphi^{-1}( x_d(t), t )\, x_d'(t) + \nabla_2 \varphi^{-1}( x_d(t), t ),\; t \bigr) = 0 $$

has a (locally unique) solution, and the trajectory generated by x(t) = φ⁻¹( x_d(t), t ) is a solution to the original dae (5.1).

5.4 Implementation

The definition of the simplified strangeness index does not prescribe that a basis for the tangent space of x should be computed in the same way as the definition of the strangeness index does. We have seen, however, that this basis is an important intermediate object for the selection of equations to be discretized during numerical integration. Two ways to compute this basis will be presented in this section, and their computational complexity will be compared. We shall assume that the occurring matrices are full, so that there is no particular structure that can be utilized in the problem. This assumption may be highly questionable in many applications, but that just opens up for a refined analysis in the future, taking sparsity into account.
We assume that QR decomposition is used both for computing a well-conditioned basis for a null space, and for computing a well-conditioned basis for a range space. If sparsity is not utilized, the QR decomposition is preferably computed using Householder reflections, while Givens rotations would be used to take advantage of sparsity. A conceptually simple way to determine the basis is to compute a basis for the right null space of [ N_{νq}  M_{νq} ], and project these vectors onto the x-space; these are the directions in which x will be free to move under the algebraic constraints. The set of projected vectors will always contain at least n_x elements, which is typically too many for the set to be a basis, and the vectors may also be poorly conditioned. Hence, to obtain a well-conditioned basis one additional computation has to be performed. This method will be referred to as the projection method below. An alternative way to determine the basis is to follow the definition of the strangeness index, except that one does not require the matrix A_{νq} to have independent rows. This method requires computation of the left null space of M_{νq} and the right null space of A_{νq}. Lemma 5.7 was originally developed to show that the two ways of computing the basis are equivalent. This method will be referred to as the strangeness index method below.

5.4.1 Computational complexity

Both methods will perform two QR decompositions, one large and one small. The small one would not be required for the projection method unless a (well-conditioned) basis was sought in the end, but it is not here that the big difference in computational burden is to be sought. Note that computing a complete QR decomposition involves more than twice the number of multiplications compared to only computing the upper triangular factor.
Hence, for the strangeness index method, it will be more efficient to apply the same row operations to N_{νq} as are applied when row-reducing M_{νq}, than to first compute a matrix spanning the left null space and then apply it to N_{νq}. Similarly, for the projection method, where only the projection of a null space onto x-space is needed, it suffices to compute just the first columns of the unitary matrix. This can be implemented by row reducing the left part of the matrix

$$ \begin{bmatrix} N_{\nu_q}^T & I \\ M_{\nu_q}^T & 0 \end{bmatrix} $$

and then reading the lower right block of the resulting matrix. This will, however, always involve more computation than the row reduction of the left block of the matrix

$$ \begin{bmatrix} M_{\nu_q} & N_{\nu_q} \end{bmatrix} $$

since both reductions involve the same number of columns, but the former has n_x more rows to take care of and will not terminate early (hence requiring ( νq + 1 ) n_x − 1 Householder reflections), while the latter will terminate when the n_a last rows of the left block are found to be zero (thus requiring ( νq + 1 ) n_x − n_a − 1 Householder reflections). The comparison shows that the strangeness index method has an advantage. Still, we think that the conceptual simplicity of the projection method adds valuable insight.

5.4.2 Notes from experiments

Experimental results are not included in this section in the usual sense, the reason being that the two methods are equivalent. Therefore, we only include some brief remarks based on experience from tests with our experimental implementations of the two methods. In all examples, the simplified strangeness index has been equal to the strangeness index. Since the construction of Z_1 is not canonical, it will generally depend on X via ∇₂f X, or more generally, on the columns used to span the tangent space of the solution manifold. We have seen that the two methods yield the same linear space spanned by the columns of X, but the construction of X differs.
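The equivalence of the two methods is easy to confirm numerically on generic data. A NumPy sketch (function names ours; SVDs are used in place of the QR decompositions discussed above, purely for brevity) comparing the spans produced by the projection method and the strangeness index method:

```python
import numpy as np

def x_tangent_projection_method(N, M, tol=1e-10):
    """Projection method: right null space of [N M], chopped to the x part,
    then re-orthonormalized (the extra small factorization mentioned above)."""
    U, s, Vt = np.linalg.svd(np.hstack([N, M]))
    rank = int(np.sum(s > tol))
    chopped = Vt.T[:N.shape[1], rank:]     # x rows of the null-space basis
    U2, s2, _ = np.linalg.svd(chopped)
    return U2[:, :int(np.sum(s2 > tol))]

def x_tangent_strangeness_method(N, M, tol=1e-10):
    """Strangeness index method: left null space Z2 of M, then the right
    null space of Z2' N."""
    U, s, _ = np.linalg.svd(M)
    Z2 = U[:, int(np.sum(s > tol)):]
    _, s2, Vt2 = np.linalg.svd(Z2.T @ N)
    return Vt2.T[:, int(np.sum(s2 > tol)):]

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 4))   # rank 2
N = rng.standard_normal((4, 3))
X1 = x_tangent_projection_method(N, M)
X2 = x_tangent_strangeness_method(N, M)
# The two orthonormal bases span the same subspace:
print(np.allclose(X1 @ X1.T, X2 @ X2.T))   # True
```

Comparing the orthogonal projectors X Xᵀ rather than the bases themselves avoids the sign and ordering ambiguity of the individual columns, which is also why a greedy row selection based on ∇₂f X can differ between the two methods even though the spans agree.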
Hence, in our experimental setup, where Z_1 has been chosen using a greedy method to pick out a subset of the rows of ∇₂f X with good condition number, the selection of rows is not always the same for the two methods. However, comparing the resulting difference in the numerical solution is meaningless, since the differences will be due to the greedy algorithm and not due to conceptual differences between the two methods. We remark that consistent initialization (that is, finding a root of the residual F_{νq+1}, compare Kunkel and Mehrmann (2006, theorem 4.13)) has been a major concern for the numeric experiments. However, the Mathematica function FindMinimum has been a very useful tool, while finding a good enough initial guess for Mathematica's local search method FindRoot has turned out to be notoriously hard.

5.5 Conclusions and future work

In our view, a simpler way of computing the strangeness index has been proposed. It gives a lower bound on the strangeness index, and when the auxiliary property P3–[5.9] holds, it gives an equivalent test. While the original definition follows a three-step procedure, the proposed definition has just one step (except that P3–[5.9] needs to be verified separately). The new index definition is also appealing due to its immediate interpretation from a numerical integration perspective. Just like the strangeness index, the simplified strangeness index emphasizes the parameterization of all variables via a smaller number of "differential" variables, but the corresponding dimensions are not required to be visible by looking only at M_{νq}. Analogues of central results for the original strangeness index have been derived for the simplified strangeness index.
In particular, it has been shown that a finite simplified strangeness index implies that if a solution exists, it will be unique, and that existence of a solution can be established by checking the property that defines the index for two successive values of the index parameter. For the simplified strangeness index, the computational complexities of two methods for computing a basis for the x tangent space have been compared. The outcome was favorable for a method closely related to the definition of the original strangeness index. The other method offers superior conceptual simplicity, and adds insight to the more efficient method, and hence also to the original strangeness index. These observations are thought to be useful when the strangeness index concept is being taught. An important aspect of the analysis of the strangeness index provided in Kunkel and Mehrmann (2006, chapter 4) is that the strangeness index is shown to be invariant under some transformations of the equations which are known to yield equivalent formulations of the same problem. It is an important topic for future research to find out whether the simplified strangeness index is also invariant under these transformations. Another interesting topic for future work is to seek examples where νq ≠ νS, in order to get a better understanding of this exceptional case. Finally, in view of the emphasis that this thesis puts on the singular perturbation problems arising from uncertainties in dae, it would be very interesting to derive the singular perturbation problems related to the results of the present chapter.

6 LTI ODE of nominal index 1

This is the first chapter of three with results for uncertain dae and the related matrix-valued singular perturbation problems. The current chapter considers the same problem as was considered in Tidefelt (2007, chapter 6), and deals with the two major deficiencies discussed there.
At the same time, the next chapter will show that some of the central ideas here are limited to the nominal index 1 case, so this is the chapter where the strongest results appear. The chapter contains the results of the two papers

Henrik Tidefelt and Torkel Glad. Index reduction of index 1 dae under uncertainty. In Proceedings of the 17th IFAC World Congress, pages 5053–5058, Seoul, Korea, July 2008.

Henrik Tidefelt and Torkel Glad. On the well-posedness of numerical dae. In Proceedings of the European Control Conference 2009, pages 826–831, Budapest, Hungary, August 2009.

as well as some variations of results in Tidefelt (2007, chapter 6) that were omitted from Tidefelt and Glad (2008) due to space constraints. At the end, the chapter contains a new example which indicates better applicability of the theoretical results, adds important insights to the problem, and connects the current chapter with the next. The singular perturbation theory, as presented in Kokotović et al. (1986), provides very relevant background for this chapter. In particular, theorem 6.21 herein should be compared with Kokotović et al. (1986, chapter 2, theorem 5.1). The chapter is organized as follows (compare figure 6.1). Section 6.1 introduces the problem and derives the related matrix-valued singular perturbation problem. To begin with, it is assumed that the nominal index 1 dae is pointwise index 0. Then the new section 6.2 gives a schematic overview of the analysis and captures the essence of the current chapter as well as chapter 8. Section 6.3 considers the decoupling of nominal and fast uncertain dynamics. In section 6.4, we derive a matrix result which will be the main tool for the formulation of assumptions in terms of eigenvalues. It is applied in section 6.5, where we formulate results for ode which will then be used when we study the fast and uncertain subsystem in section 6.6.
In section 6.7 we draw conclusions regarding the original coupled system using results from previous sections. Then, section 6.8 considers what happens if the pointwise index 0 assumption is dropped. Two examples of the theory are given in section 6.9, before the chapter is concluded in section 6.10.

6.1 Introduction

We are interested in the utility of modeling dynamic systems as unstructured uncertain differential-algebraic equations. Consider the linear time-invariant dae
$$ \bar E\, \bar x'(t) + \bar A\, \bar x(t) = 0 \tag{6.1} $$
If $\bar E$ is a regular matrix, this equation is readily turned into an ordinary differential equation, but our interest is in the other case. By saying that this dae is unstructured, we mean that we cannot assume any of the matrix entries to be known exactly. Instead, we assume that there is an uncertainty model with independent uncertainty descriptions for the matrix entries, and then example 1.1 showed that additional assumptions are needed to turn the equation into an uncertain ode. To see what kind of assumptions we might find reasonable to add, we recall that the equations represent a dynamic system, and so we might be willing to make assumptions regarding system features; that is, properties of a dynamic system which do not depend on the particular equations used to describe the system. Invertibility of $\bar E$ is not a system feature. The poles are a system feature, and the poles of an ode are given by the finite eigenvalues of the matrix pair $( \bar E, \bar A )$. (Since all variables in the dae are considered outputs, the system is trivially observable; that is, the eigenvalues cannot correspond to unobservable modes without corresponding system poles.) One way to analyze the equations is to apply a row reduction procedure to the equations, trying to bring $\bar E$ into upper triangular form (variables may be reordered as needed). Such a procedure (we think of Gaussian elimination, or QR decomposition using Givens rotations or Householder reflections) can only proceed as long as the lower right block (which remains to be reduced to upper triangular form) contains an entry which can be distinguished from zero.
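The stalling row reduction described above can be sketched in a few lines. The following is an illustrative sketch, not the thesis' procedure: it uses full pivoting (rows and columns, since variables may be reordered) and a simple entry-wise threshold as the rule for "distinguishable from zero".

```python
# Sketch (not from the thesis): elimination on an uncertain matrix that stops
# when no remaining pivot can be distinguished from zero. The threshold rule
# and all names are illustrative choices.
def reduce_until_stuck(e, uncertainty):
    """Elimination with full pivoting; returns the partially reduced matrix and
    the number of rows that could be reduced before getting stuck."""
    e = [row[:] for row in e]
    n = len(e)
    for k in range(n):
        # Find the largest entry in the remaining lower right block.
        p, q = max(((i, j) for i in range(k, n) for j in range(k, n)),
                   key=lambda ij: abs(e[ij[0]][ij[1]]))
        if abs(e[p][q]) <= uncertainty:   # block not distinguishable from zero
            return e, k
        e[k], e[p] = e[p], e[k]           # row swap
        for row in e:                     # column swap (variable reordering)
            row[k], row[q] = row[q], row[k]
        for i in range(k + 1, n):
            f = e[i][k] / e[k][k]
            for j in range(k, n):
                e[i][j] -= f * e[k][j]
    return e, n

# A nominally rank-1 matrix with a tiny perturbation: the procedure stalls
# after one reduced row, since the residue is below the uncertainty level.
E_bar = [[1.0, 2.0, 3.0],
         [2.0, 4.0, 6.000000001],
         [3.0, 6.0, 9.0]]
reduced, rows_done = reduce_until_stuck(E_bar, uncertainty=1e-6)
print(rows_done)
```

With a regular, well-conditioned matrix the same routine runs to completion, returning the full dimension.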
If the uncertain matrix is not regular, the procedures will at some point fail to find an entry which can be distinguished from zero (or else the procedure would prove the regularity of the matrix), and will be unable to continue beyond that point:
$$ \begin{bmatrix} \tilde E_{11} & \tilde E_{12} \\ 0 & \tilde E_{22} \end{bmatrix} \begin{pmatrix} \bar x_1'(t) \\ \bar x_2'(t) \end{pmatrix} + \begin{bmatrix} \tilde A_{11} & \tilde A_{12} \\ \tilde A_{21} & \tilde A_{22} \end{bmatrix} \begin{pmatrix} \bar x_1(t) \\ \bar x_2(t) \end{pmatrix} = 0 \tag{6.2} $$
where $\tilde E_{11}$ is regular, and $\tilde E_{22}$ has no entries which can be distinguished from zero. The dae is considered ill-posed if the family of solutions does not converge as the uncertainty tends to zero, for any initial conditions in some set of interest. Hence, showing well-posedness in this sense is a first step towards replacing the ad hoc procedure of neglecting $\tilde E_{22}$ with an analysis that accounts for the error in the solution introduced by substituting zero for $\tilde E_{22}$.

The remaining part of this introduction contains just enough analysis to reach the fundamental matrix-valued singular perturbation problem. It is assumed that the equations are of nominal index 1, where nominal is taken to refer to $\max \tilde E_{22} = 0$. This means that the matrix
$$ \begin{bmatrix} \tilde E_{11} & \tilde E_{12} \\ \tilde A_{21} & \tilde A_{22} \end{bmatrix} \tag{6.3} $$
is regular. When $\tilde E_{22} = 0$, the second group of equations becomes a static relation between the variables. We also assume that the initial conditions $\bar x(0)$ satisfy this relation. Next, a change of variables leads to
$$ \begin{bmatrix} I & 0 \\ 0 & E \end{bmatrix} \begin{pmatrix} x'(t) \\ z'(t) \end{pmatrix} + \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \begin{pmatrix} x(t) \\ z(t) \end{pmatrix} = 0 \tag{6.4} $$
From the assumed regularity of (6.3), it follows that the new matrix
$$ \begin{bmatrix} I & 0 \\ A_{21} & A_{22} \end{bmatrix} $$
and hence $A_{22}$, is regular. Maintaining the style of Tidefelt and Glad (2008), the applicability of the results in this chapter is increased by generalizing (6.4) slightly, allowing the trailing matrix to depend on $E$.
Doing so leads to the following lti matrix-valued singular perturbation form (in chapter 7 a somewhat different form is used, given by (7.8)):
$$ x'(t) + A_{11}(E)\, x(t) + A_{12}(E)\, z(t) = 0 \tag{6.5x} $$
$$ E\, z'(t) + A_{21}(E)\, x(t) + A_{22}(E)\, z(t) = 0 \tag{6.5z} $$
where the following properties will be used throughout the chapter.

P1–[6.1] Property. The functions $A_{ij}$ shall be analytic functions, without uncertainty at 0, and with a known, finite Lipschitz constant.

P2–[6.2] Property. The (nominal) matrix $A_{22}(0)$ shall be non-singular.

That is, the uncertainty in the form (6.5) is in the matrix $E$ and in how the trailing matrices $A_{ij}$ depend on $E$ under the respective Lipschitz conditions. (The index referred to in "nominal index 1" is the differentiation index; see definition 2.2.)

Note that the Lipschitz condition for $A_{22}$ together with corollary 2.47 provide that the inverse of $A_{22}(E)$ can be bounded by a constant, if $E$ is required to be sufficiently small. To ease notation, we shall often not write out the dependency of the $A_{ij}$ on $E$. For reference to the published work behind this chapter, the notion of unstructured perturbation has to be explained. It refers to the lack of structure in the uncertainty $E$ compared to the singular perturbation problems studied in the past (see section 2.5), where $E$ has either been of the form $\epsilon I$ for a small scalar $\epsilon > 0$, or a diagonal matrix with small but positive entries on the diagonal. In this thesis, the lack of structure is marked by the use of matrix-valued uncertainty instead.

6.2 Schematic overview of nominal index 1 analysis

The analysis and convergence proofs (where present) in the present and subsequent chapters have much in common. To emphasize this, and to enhance the reading of these chapters, this section contains a schematic overview of the common structure. The schematic overview is given in figure 6.1, annotated below.

• (a) The solution to the uncertain part of the decoupled system will generally not be known beyond boundedness.
Hence, it is not necessary that the change of variables converges to a known transform as $\max(E) \to 0$; what matters is that the relation between the solution to the slow dynamics and the original variables converges to a known entity, and that the influence of the solution to the uncertain dynamics on the original variables is bounded independently of $\max(E)$. In chapter 8, where the pointwise index of the equations is assumed zero, showing the existence of a decoupling transform with the required properties is the main concern.

• (b) Showing that the eigenvalues of the uncertain dynamics must grow as $\max(E) \to 0$ is the main concern in chapter 7, as the rest of that chapter has much in common with the present chapter.

• (c) Since the solution $\eta$ will have a non-vanishing influence on some of the original variables, it is necessary for convergence in the original variables that $\eta$ converges uniformly to zero as $\max(E) \to 0$. In particular, it has to be shown that there is uniform convergence at the initial time.

• (d) Making assumptions about eigenvalues is a key ingredient in the convergence proofs. In the end, these assumptions will restrict the uncertainty sets of the uncertain entities in the equations, but we prefer making the restrictions indirectly via the eigenvalues, since these can be related to system properties. Besides stating the assumptions, it needs to be verified that the restricted uncertainty sets are non-empty, or else any further reasoning will be meaningless. (Had it not been for maintaining the style of Tidefelt and Glad (2008), the uncertainty model of chapter 7 would have been used instead. There, all deviations from some nominal matrices are assumed bounded by a common number $m$, and instead of considering $\max(E) \to 0$, one considers $m \to 0$.)
[Figure 6.1: Schematic overview of convergence proofs. The diagram nodes are: uncertain dae; decoupling transform; slow ode in ξ; uncertain dae in η; eigenvalue assumptions; show |λ| → ∞; λ ∈ fast region; bound ‖Φ(t, τ)‖₂; show η(0) → 0; η → 0 uniformly; combine solutions; convergence of solutions. The crucial steps have been marked with a thick border. The labels (a)–(g) link to annotations in the text.]

• (e) Using that the eigenvalues of the uncertain dynamics must tend to infinity as $\max(E) \to 0$, it can be concluded that for sufficiently small $\max(E)$ the eigenvalues of the uncertain dynamics must belong to a subset of the assumed region, strictly included in the left half of the complex plane.

• (f) The last crucial step is to show that the location of the eigenvalues of the uncertain dynamics implies that the initial condition response of the uncertain dynamics can be bounded independently of $E$, for sufficiently small $\max(E)$. (Then, uniform convergence of the initial conditions to zero implies that the whole solution converges uniformly to zero.) In the lti cases, we use results from the theory of M-matrices. In the ltv case we also need Lyapunov-based methods.

• (g) This is the final conclusion of the analysis. While we are mainly concerned with the qualitative property of convergence in this thesis, a review of the proofs will indicate how the convergence may be quantified. However, some steps in the analysis seem to give rise to gross over-estimation, making the quantitative results unpleasant to use in real applications (the alternative being to ignore the issue with matrix-valued singular perturbation altogether, or to try to avoid the issue by re-deriving the equations with more attention to structure; recall section 1.2.5).

6.3 Decoupling transforms and initial conditions

Following the scheme outlined in the previous section, we now start by deriving the decoupling transform.
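To make the convergence target of the scheme concrete before the transform is derived, the following sketch (not from the thesis; all coefficient values are illustrative) solves a scalar instance of the form (6.5) exactly and compares $x$ with the solution of the reduced problem obtained by setting $E = 0$.

```python
# Sketch (not from the thesis): scalar instance of (6.5) with known, constant
# trailing coefficients, solved exactly via the eigenvalues of the 2x2 system
# matrix. Illustrates x converging to the reduced (E = 0) solution as eps -> 0.
import math

def x_at(t, eps, a11=1.0, a12=1.0, a21=1.0, a22=2.0, x0=1.0):
    """Exact x(t) for x' = -a11 x - a12 z, eps z' = -a21 x - a22 z,
    with the consistent initial condition z(0) = -a21/a22 * x0."""
    # System matrix M = [[-a11, -a12], [-a21/eps, -a22/eps]]; for these values
    # it has two distinct real eigenvalues l1 (slow) and l2 (fast).
    tr = -a11 - a22 / eps
    det = (a11 * a22 - a12 * a21) / eps
    disc = math.sqrt(tr * tr - 4 * det)
    l1, l2 = (tr + disc) / 2, (tr - disc) / 2
    z0 = -a21 / a22 * x0
    # Eigenvector for eigenvalue l (from the first row of M - l I): (-a12, l + a11).
    v1, v2 = (-a12, l1 + a11), (-a12, l2 + a11)
    # Decompose (x0, z0) along the eigenvectors (Cramer's rule) and propagate.
    den = v1[0] * v2[1] - v1[1] * v2[0]
    c1 = (x0 * v2[1] - z0 * v2[0]) / den
    c2 = (v1[0] * z0 - v1[1] * x0) / den
    return c1 * v1[0] * math.exp(l1 * t) + c2 * v2[0] * math.exp(l2 * t)

# Reduced problem: x' = -(a11 - a12*a21/a22) x = -0.5 x, so x(1) = exp(-0.5).
for eps in (1e-2, 1e-4, 1e-6):
    print(eps, abs(x_at(1.0, eps) - math.exp(-0.5)))
```

The printed deviation shrinks proportionally to eps, which is the kind of statement the convergence proofs of this chapter make qualitative and uniform over the uncertainty set.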
Similarly to how this is done in most of the literature on other singular perturbation problems, the transform is divided into two steps. Since the initial conditions for the decoupled system are a direct consequence of the decoupling transform (and the initial conditions for the coupled system), results on initial conditions are also included in the present section. With this short introduction, we now begin with a lemma for the first decoupling step.

6.3 Lemma. There exists an analytic matrix-valued function $L$ such that, for sufficiently small $\max(E)$, the change of variables
$$ \begin{pmatrix} x \\ z \end{pmatrix} = \begin{bmatrix} I & 0 \\ L(E) & I \end{bmatrix} \begin{pmatrix} x \\ \eta \end{pmatrix} \tag{6.6} $$
transforms (6.5) into the system
$$ \begin{bmatrix} I & 0 \\ 0 & E \end{bmatrix} \begin{pmatrix} x'(t) \\ \eta'(t) \end{pmatrix} + \begin{bmatrix} A_{11} + A_{12}\, L(E) & A_{12} \\ 0 & A_{22} - E\, L(E)\, A_{12} \end{bmatrix} \begin{pmatrix} x(t) \\ \eta(t) \end{pmatrix} = 0 \tag{6.7} $$
The matrix $L(E)$ satisfies
$$ L(0) = -A_{22}(0)^{-1} A_{21}(0) \tag{6.8} $$
and a Lipschitz condition.

Proof: Applying the change of variables shows that $x$ is eliminated from the $\eta'$ equation provided $L(E)$ satisfies
$$ 0 = A_{21}(E) + A_{22}(E)\, L(E) - E\, L(E)\, \bigl( A_{11}(E) + A_{12}(E)\, L(E) \bigr) \tag{6.9} $$
For $E = 0$ there is the solution $L(0) = -A_{22}(0)^{-1} A_{21}(0)$. The derivative of the right hand side of (6.9) with respect to each column of $L$ at $E = 0$ is $A_{22}(0)$, which is non-singular. It follows from the analytic implicit function theorem (Hörmander, 1966) that the equation can be solved to give an analytic $L$ in some neighborhood of 0. On a closed ball of positive radius within that neighborhood, $L$ will be bounded due to its continuity. Since $L'$ will also be analytic (see, for instance, Krantz and Parks (2002, proposition 1.1.14)), it follows that $L'$ will also be bounded on the same closed ball, implying the Lipschitz condition.

Let the initial conditions for (6.5) be $x(0) = x^0$, $z(0) = z^0$.

6.4 Lemma. If the initial conditions $x^0$ and $z^0$ are chosen to make the dae consistent for $E = 0$, that is,
$$ 0 = A_{21}(0)\, x^0 + A_{22}(0)\, z^0 \tag{6.10} $$
then the initial condition $\eta(0) = \eta^0(E)$ for the lower part of (6.7) satisfies
$$ \eta^0(E) = \bigl( -A_{22}(0)^{-1} A_{21}(0) - L(E) \bigr)\, x^0 \tag{6.11} $$
In particular, $\eta^0$ is analytic with
$$ \eta^0(E) = O(\, \|E\|_2 \,) \tag{6.12} $$

In chapter 7, the decoupling transforms will be established using fixed-point methods rather than analytic calculus. Then the neighborhoods (whose existence is the only thing we care about in this chapter) will be balls with radii that are obtained constructively. This will bring theory much closer to application; the presentation in this chapter uses analytic calculus to maintain the flavor of the published works that the chapter builds upon. We shall briefly indicate the type of results that fixed-point methods provide by looking at how a bound on $L$ over a closed ball may be constructed. Let $a_1(E), \ldots, a_m(E)$ denote the columns of $A_{21}(E)$, and let $l_1(E), \ldots, l_m(E)$ be the columns of $L(E)$. Impose a bound on $E$ so that there exist constants $k_{11}$ and $k_{12}$ such that $\|A_{11}(E)\|_2 \le k_{11}$ and $\|A_{12}(E)\|_2 \le k_{12}$. Let $\rho > \| A_{22}(0)^{-1} A_{21}(0) \|_2$ denote the bound on $\|L(E)\|_2$ to be established, and write (6.9) as
$$ - \begin{pmatrix} a_1(E) \\ \vdots \\ a_m(E) \end{pmatrix} = \left( \begin{bmatrix} A_{22}(E) & & \\ & \ddots & \\ & & A_{22}(E) \end{bmatrix} + F(E) \right) \begin{pmatrix} l_1(E) \\ \vdots \\ l_m(E) \end{pmatrix} $$
where $\|F(E)\|_2 \le \|E\|_2\, \rho\, ( k_{11} + k_{12}\, \rho )$. According to corollary 2.47, the solution $L(E)$ will satisfy the bound $\|L(E)\|_2 \le \rho$ if $\|F(E)\|_2$ is made sufficiently small. In other words, for each $\rho > \| A_{22}(0)^{-1} A_{21}(0) \|_2$ we can construct an open ball for $E$, within which the bound $\|L(E)\|_2 \le \rho$ holds. The derivative can be bounded similarly.

Proof: From the definition of the change of variables, $z = L(E)\, x + \eta$, it follows that
$$ \eta^0(E) = z^0 - L(E)\, x^0 $$
Substituting $z^0$ from (6.10) gives (6.11), while (6.8) and the Lipschitz condition for $L$ give (6.12).
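The kind of fixed-point computation alluded to above can be illustrated in the scalar case, where all blocks in (6.9) are numbers. The iteration below is an illustrative sketch, not the construction used in chapter 7; the coefficient values are arbitrary.

```python
# Sketch (not from the thesis): solving the decoupling condition (6.9),
#   0 = A21 + A22*L - E*L*(A11 + A12*L),
# by fixed-point iteration in the scalar case. The rearranged iteration
# L <- -(A21 - E*L*(A11 + A12*L)) / A22 contracts for small E.
def decoupling_gain(a11, a12, a21, a22, e, iterations=60):
    """Scalar analogue of L(E) from lemma 6.3, by fixed-point iteration."""
    l = -a21 / a22          # start from the nominal solution L(0)
    for _ in range(iterations):
        l = -(a21 - e * l * (a11 + a12 * l)) / a22
    return l

a11, a12, a21, a22 = 1.0, 1.0, 1.0, 2.0
for e in (0.0, 1e-2, 1e-6):
    l = decoupling_gain(a11, a12, a21, a22, e)
    residual = a21 + a22 * l - e * l * (a11 + a12 * l)
    print(e, l, abs(residual))
```

For $e = 0$ the iteration reproduces $L(0) = -a_{21}/a_{22}$ exactly, and for small $e$ the residual of (6.9) vanishes to machine precision, with $L(e) \to L(0)$ as $e \to 0$.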
Introduce the notation
$$ A_\eta(E) \triangleq A_{22}(E) - E\, L(E)\, A_{12}(E) \tag{6.13} $$
where $A_\eta(0)$ is the non-singular matrix $A_{22}(0)$, and a Lipschitz condition for $A_\eta$ is obtained by taking $E$ sufficiently small. To emphasize the difference between uniform and "directional" convergence with respect to the uncertainty, the following result is to be contrasted with lemma 6.20.

6.5 Theorem. Assume $E = m\, E_*$, where $E_*$ is a non-singular matrix with $\max(E_*) = 1$, such that $-E_*^{-1} A_\eta(0)$ is Hurwitz. Then
$$ \sup_{t \ge 0}\, \| \eta(t) \| = O(m) \tag{6.14} $$

Proof: With the change of variables $t = m\tau$ we get
$$ \frac{\partial \eta}{\partial \tau} = -E_*^{-1} A_\eta( m E_* )\, \eta, \qquad \eta(0) = \eta^0 $$
with solution
$$ \eta(\tau) = e^{-E_*^{-1} A_\eta( m E_* )\, \tau}\, \eta^0 $$
Since $-E_*^{-1} A_\eta(0)$ is a Hurwitz point matrix, the time-scaled system with $m = 0$ can be shown to be uniformly $\gamma\, e^{\lambda \bullet}$-stable for some $\alpha( -E_*^{-1} A_\eta(0) ) < \lambda < 0$ according to theorem 2.38. Since the matrix in the exponent (as a function of $m$) satisfies a Lipschitz condition, and $\| -E_*^{-1} A_\eta(0) \|_2$ is bounded since it is a point matrix, theorem 2.41 shows that there exist positive constants $C_1$ and $m_0$ (ignoring the exponential decay rate) such that
$$ m \le m_0 \implies \left\| e^{-E_*^{-1} A_\eta( m E_* )\, \tau} \right\|_2 \le C_1 $$
Since $\eta^0 = O( \|E\|_2 ) = O(m)$, the result follows.

If $E_*$ were singular in theorem 6.5, other estimates would be possible, but note that the "directional" convergence is not the type of convergence we seek. Consequently, theorem 6.5 has no applications in the thesis. The following example gives a better picture of the problem we have to address.

6.6 Example
In this example, the bounding of $\eta$ over time is considered in case $\eta$ has two components. For simplicity, we shall assume that $\eta$ is given by $\eta'(t) = E^{-1} \eta(t)$, where
$$ E = \epsilon \begin{bmatrix} -\delta & 1 - \delta \\ 0 & -\delta \end{bmatrix} $$
where $\epsilon \in ( 0, m ]$ and $\delta > 0$ is a small uncertainty parameter so that $\max(E) \le m$. Since
$$ E^{-1} = \epsilon^{-1} \begin{bmatrix} -1/\delta & 1/\delta - 1/\delta^2 \\ 0 & -1/\delta \end{bmatrix} $$
it is seen that both eigenvalues are perfectly stable and far into the left half plane, while the off-diagonal entry is at the same time arbitrarily big. It is easy to verify using software that the maximum norm of the matrix exponential grows without bound as $\delta$ tends to zero, for any bound $m$. Hence, knowing that the initial conditions $\eta(0)$ must tend to zero with $\max(E)$ is not sufficient to obtain a uniform bound on $\eta$ which tends to zero with $\max(E)$.

In Tidefelt and Glad (2008), the unboundedness of $\sup_{t \ge 0} \| \eta(t) \|$ was remedied by assuming that the condition number of $E$ be bounded, and it is easy to see that this would imply a lower bound on $\delta$ in this example, thereby solving the problem. However, assuming a bound on the condition number of $E$ is very artificial and will not be done in the thesis. So far, the presentation in the present chapter has been rather close to Tidefelt and Glad (2008). However, we will now omit Tidefelt and Glad (2008, lemma 5) and take a different route thanks to the improved results in Tidefelt and Glad (2009). It is the topic of section 6.6 to find conditions that can be used to provide a uniform bound on the solution $\eta$ which tends to zero with $E$, when $\eta$ satisfies
$$ E\, \eta'(t) + A_\eta(E)\, \eta(t) = 0 $$
The vanishing function $\eta$ may then be regarded as an external input to the ode for $x$ in (6.7), and regular perturbation techniques may be used to show that the solutions $x$ converge as $E$ tends to zero. However, it is also illustrative to apply a second decoupling transform which isolates the slow dynamics; the transform is guaranteed to exist by the next lemma. Continuing on the result of lemma 6.3, the following lemma shows that the influence of $\eta$ on $x$ is small.

6.7 Lemma. There exists a matrix-valued function $H$ such that, for sufficiently small $\max(E)$, the change of variables
$$ \begin{pmatrix} x \\ \eta \end{pmatrix} = \begin{bmatrix} I & H(E)\, E \\ 0 & I \end{bmatrix} \begin{pmatrix} \xi \\ \eta \end{pmatrix} \tag{6.15} $$
transforms (6.7) into the system
$$ \begin{bmatrix} I & 0 \\ 0 & E \end{bmatrix} \begin{pmatrix} \xi'(t) \\ \eta'(t) \end{pmatrix} + \begin{bmatrix} A_{11}(E) + A_{12}(E)\, L(E) & 0 \\ 0 & A_\eta(E) \end{bmatrix} \begin{pmatrix} \xi(t) \\ \eta(t) \end{pmatrix} = 0 \tag{6.16} $$
where $\|H(E)\|_2$ is bounded by a constant independently of $E$.

Proof: Applying the change of variables and then performing row operations on the equations to eliminate $\eta'$ from the first group of equations leads to the condition defining $H(E)$ (dropping other dependencies on $E$ from the notation):
$$ 0 = \bigl[\, A_{11} + A_{12}\, L \,\bigr]\, H(E)\, E + A_{12} - H(E)\, \bigl[\, A_{22} - E\, L\, A_{12} \,\bigr] \tag{6.17} $$
It follows that $H(0) = A_{12}(0)\, A_{22}(0)^{-1}$, which is clearly bounded independently of $E$. The equation is linear in $H(E)$, invertible at $E = 0$, and the coefficients depend smoothly on $E$, so as for $L$ it follows that $H$ is analytic in some neighborhood of 0. Hence, restricting $H$ to a sufficiently small closed ball (via the selection of a sufficiently small bound on $\max(E)$) will make $\|H(E)\|_2$ bounded independently of $E$.

The results Tidefelt (2007, lemma 6.4, corollary 6.1) consider the derivatives of $\eta^0(E)$ with respect to $E$, and belong in the present section. However, since these have no counterparts in later chapters, we prefer to present them using the more constructive fixed-point methods of later chapters, rather than using the original framework of real analytic functions.

6.8 Lemma. Irrespective of the rank of $E$, the solution $L$ to (6.9) has the form
$$ L^0(E) = -A_{22}(E)^{-1} A_{21}(E) $$
$$ L(E) = L^0(E) + A_{22}(E)^{-1} \Bigl( E\, L^0(E)\, \bigl( A_{11}(E) + A_{12}(E)\, L^0(E) \bigr) + m\, R_L(E) \Bigr) \tag{6.18} $$
where $m$ is an upper bound on $\max(E)$, and $R_L(E)$ can be bounded independently of $E$, for $m$ sufficiently small. (As in lemma 6.3, more constructive formulations are easily obtained using corollary 2.47.)

Proof: The following proof never makes use of the rank or pointwise non-singularity of $E$, which makes it valid for any rank. Inserting (6.18) in (6.9), repeated here,
$$ 0 = A_{21}(E) + A_{22}(E)\, L(E) - E\, L(E)\, \bigl( A_{11}(E) + A_{12}(E)\, L(E) \bigr) $$
yields
$$ 0 = A_{21}(E) - A_{22}(E)\, A_{22}(E)^{-1} A_{21}(E) + E\, L^0(E)\, \bigl( A_{11}(E) + A_{12}(E)\, L^0(E) \bigr) + m\, R_L(E) - E\, L(E)\, \bigl( A_{11}(E) + A_{12}(E)\, L(E) \bigr) $$
and then
$$ 0 = m\, E\, R_L(E) + E\, L^0(E)\, \bigl( A_{11}(E) + A_{12}(E)\, L^0(E) \bigr) - E\, L^0(E)\, \bigl( A_{11}(E) + A_{12}(E)\, L^0(E) \bigr) - E\, L^0(E)\, A_{12}(E)\, \bigl( L(E) - L^0(E) \bigr) - E\, \bigl( L(E) - L^0(E) \bigr) \bigl( A_{11}(E) + A_{12}(E)\, L(E) \bigr) $$
Cancelling a factor of $E$ on the left means that the above equation is implied by the one below.
$$ 0 = m\, R_L(E) - L^0(E)\, A_{12}(E)\, \bigl( L(E) - L^0(E) \bigr) - \bigl( L(E) - L^0(E) \bigr) \bigl( A_{11}(E) + A_{12}(E)\, L(E) \bigr) $$
The rest of the proof is a contraction mapping argument showing that there is a $\rho_L > 0$ such that $\|R_L(E)\|_2 \le \rho_L$ if $\max(E)$ is required to be sufficiently small. The argument is given in section 6.A.

In addition to some details of the proof of lemma 6.8, section 6.A contains an example of the lemma.

6.9 Corollary. If the trailing matrices in lemma 6.4 are independent of $E$, then the following relation gives a more precise account of the relation (6.12):
$$ \eta^0(E) = -A_{22}^{-1}\, E\, L^0\, \bigl( A_{11} + A_{12}\, L^0 + O(m) \bigr)\, x^0 \tag{6.19} $$

Proof: Use lemma 6.8. That the trailing matrices are independent of $E$ implies that the difference $A_{22}(E)^{-1} A_{21}(E) - A_{22}(0)^{-1} A_{21}(0)$ vanishes.

6.4 A matrix result

Before we take on the problem of analyzing the dynamics of $\eta$, we need to switch context for a while and see how eigenvalue conditions can help to bound the inverse of a small matrix. The result we need is quickly derived, and we then turn to examples in an attempt to illustrate its qualities.

6.10 Lemma. For an invertible upper triangular matrix $U$, it holds that
$$ \| U \|_2 \le \lambda_{\max}(U)\, \frac{ \sqrt{ ( a + 1 )^{2n} + 2n( a + 2 ) - 1 } }{ a + 2 } \tag{6.20} $$
where $a = \max( U^{-1} )\, \lambda_{\max}(U)$.

Proof: The difference to theorem 2.53 is only minor. Here one uses that the bound (2.90) of theorem 2.53 is increasing with $a$, so the bound is overestimated by inserting the expression for $b$ and the upper bound for $a$.
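As a numerical sanity check of the bound (6.20) as reconstructed here, the following sketch evaluates both sides for 2 × 2 upper triangular matrices. The helper names are illustrative, and the closed-form 2 × 2 spectral norm avoids any linear algebra library.

```python
# Sketch (not from the thesis): checking the bound of lemma 6.10 for n = 2,
#   ||U||_2 <= lmax * sqrt((a+1)^(2n) + 2n(a+2) - 1) / (a+2),
# where lmax is the largest eigenvalue modulus and a = max|U^{-1}| * lmax.
import math

def norm2_2x2(mtx):
    """Spectral norm of a 2x2 matrix via the eigenvalues of M^T M."""
    (a, b), (c, d) = mtx
    t = a*a + b*b + c*c + d*d            # trace of M^T M
    det = (a*d - b*c) ** 2               # det of M^T M = det(M)^2
    return math.sqrt((t + math.sqrt(t*t - 4*det)) / 2)

def bound_6_20(u):
    """Right hand side of (6.20) for 2x2 upper triangular u."""
    n = 2
    lmax = max(abs(u[0][0]), abs(u[1][1]))   # triangular: eigenvalues on the diagonal
    uinv = [[1/u[0][0], -u[0][1]/(u[0][0]*u[1][1])],
            [0.0, 1/u[1][1]]]
    a = max(abs(e) for row in uinv for e in row) * lmax
    return lmax * math.sqrt((a + 1)**(2*n) + 2*n*(a + 2) - 1) / (a + 2)

for u in ([[-1.0, 5.0], [0.0, -2.0]], [[-3.0, 100.0], [0.0, -3.0]]):
    print(norm2_2x2(u), bound_6_20(u))
```

The second matrix, with a large off-diagonal entry and small equal diagonal entries, is close to the extremal structure used in example 6.12, and there the bound is nearly attained.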
Extending the result for upper triangular matrices to the general case is easy and relies on the Schur decomposition of a matrix. Unfortunately, the nice property of the Schur decomposition that all factors but one are unitary is not quite as beneficial as it would have been if our results had assumed bounds on the induced 2-norm of a matrix instead of the entry-wise maximum.

6.11 Theorem. For an invertible matrix $X \in \mathbb{R}^{n \times n}$, it holds that
$$ \| X \|_2 \le \lambda_{\max}(X)\, \frac{ \sqrt{ ( n\, a + 1 )^{2n} + 2n( n\, a + 2 ) - 1 } }{ n\, a + 2 } \tag{6.21} $$
where $a = \max( X^{-1} )\, \lambda_{\max}(X)$.

Proof: Let $Q U Q^{\mathsf H} = X$ be a Schur decomposition of $X$. Then $\| X \|_2 = \| U \|_2$ and $\lambda_{\max}(U) = \lambda_{\max}(X)$, so a bound on $\max( U^{-1} )$ will yield a bound on $\| X \|_2$ by lemma 6.10. From
$$ \max( U^{-1} ) \le \| U^{-1} \|_2 = \| Q\, U^{-1} Q^{\mathsf H} \|_2 = \| X^{-1} \|_2 \le n \max( X^{-1} ) \tag{6.22} $$
the result follows by substituting $n \max( X^{-1} )$ for $\max( U^{-1} )$ in (6.20).

We now turn to the examples. An exact treatment of the optimization problem bounded by (6.21),
$$ \begin{aligned} \underset{X \in \mathbb{R}^{n \times n}}{\text{maximize}} \quad & \| X \|_2 \\ \text{subject to} \quad & \max( X^{-1} ) \le m \\ & \lambda_{\max}(X) \le \hat\lambda \end{aligned} $$
appears difficult, even for a dimension as low as $n = 2$. Instead, in example 6.12 we turn to numeric (nonlinear, global) optimization in order to obtain examples which we provide as indications of how tight the bound may actually be. The section then ends with example 6.13, where we plot the true norm against the bound for a large number of randomly generated matrices.

6.12 Example
Since we are only interested in finding good examples here, we choose to consider the problem
$$ \begin{aligned} \underset{X \in \mathbb{R}^{n \times n}}{\text{minimize}} \quad & \max( X^{-1} ) \\ \text{subject to} \quad & \lambda_{\max}(X) \le \hat\lambda \\ & \| X \|_2 = \| X^0 \|_2 \end{aligned} $$
where $\| X^0 \|_2$ is a feasible objective function value in the original problem for bounds $m$ as low as indicated by solutions to this problem. The optimization problem is further simplified by a restrictive parameterization of $X^{-1}$ as $Q\, U^{-1} Q^{\mathsf T}$. (Orthogonal instead of unitary ensures that $X^{-1}$ is real.) Here, $U^{-1}$ is chosen as
$$ U^{-1} = \begin{bmatrix} \lambda^{-1} & \eta & 0 & \ldots & 0 \\ 0 & \lambda^{-1} & \eta & \ddots & \vdots \\ \vdots & & \ddots & \ddots & 0 \\ & & & \lambda^{-1} & \eta \\ 0 & \ldots & & 0 & \lambda^{-1} \end{bmatrix} \tag{6.23} $$
with $\lambda = -\hat\lambda$ and $\eta$ chosen as some small constant (this will define $X^0$), and the orthogonal $Q$ is parameterized as a composition of Givens rotations and reflections. Different combinations of reflections are enumerated, and for each enumerated combination, simulated annealing is applied to find rotations that yield a small $\max( Q\, U^{-1} Q^{\mathsf T} )$. Application of this scheme for some choices of $n$, $\hat\lambda$, and $\eta$ gives the results shown in table 6.1. The table shows that the bound is no or only a few orders of magnitude from the ideal bound in cases of practical importance. The following matrix manifests one of the rows of table 6.1. The reader is encouraged to verify this, but is warned that the precision in the printed matrix entries is insufficient to reproduce the $\lambda_{\max}(X)$ column with more than one accurate digit, causing the bound (6.21) to have zero accurate digits.
$$ X^{-1} = \begin{bmatrix} 0.1518712275 & 0.1524043923 & -0.1524043968 & 0.1465683972 \\ -0.1522399412 & -0.143406358 & -0.1487240742 & 0.1522222484 \\ -2.388603613 \cdot 10^{-3} & -2.337599229 \cdot 10^{-3} & -0.1475359243 & -0.152030243 \\ 2.33223507 \cdot 10^{-3} & 2.282425336 \cdot 10^{-3} & 0.1479232046 & 0.1524043881 \end{bmatrix} $$

Table 6.1: Some examples of the bound (6.21) of theorem 6.11 compared to the true norm. The parameters are explained in the text. All $\lambda_{\max}$ entries should be 3 times 10 to an integer power; exceptions are due to numeric ill-conditioning. Note that the bound is very tight where the inequality (6.22) is tight.

dim X | λmax(X)  | η       | max X⁻¹   | ‖X‖₂      | (6.21)
2     | 0.3      | 0.3     | 3.33      | 0.314     | 0.735
2     | 0.3      | 30.     | 18.       | 2.73      | 3.27
2     | 30.      | 0.3     | 0.18      | 2.73·10²  | 3.27·10²
2     | 30.      | 30.     | 15.       | 2.7·10⁴   | 2.71·10⁴
4     | 30.1     | 0.3     | 0.157     | 2.21·10⁴  | 2.23·10⁵
4     | 30.      | 3·10⁻²  | 3.33·10⁻² | 76.1      | 3.13·10³
4     | 3.03·10² | 0.3     | 0.152     | 2.19·10⁸  | 1.93·10⁹
4     | 3.01·10² | 3·10⁻²  | 1.61·10⁻² | 2.21·10⁵  | 2.4·10⁶
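The verification suggested in example 6.12 can be carried out without any linear algebra library. The sketch below (illustrative, stdlib only) checks the printed matrix against its table row: the largest entry should be about 0.152, and since $X^{-1}$ is nearly singular, $\|X\|_2$ should come out huge (the table lists $2.19 \cdot 10^8$).

```python
# Sketch (not from the thesis): estimate ||X||_2 from the printed X^{-1} using
# Gaussian elimination solves and power iteration on X^T X.
import math

XINV = [
    [0.1518712275, 0.1524043923, -0.1524043968, 0.1465683972],
    [-0.1522399412, -0.143406358, -0.1487240742, 0.1522222484],
    [-2.388603613e-3, -2.337599229e-3, -0.1475359243, -0.152030243],
    [2.33223507e-3, 2.282425336e-3, 0.1479232046, 0.1524043881],
]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(m[i][k]))
        m[k], m[p] = m[p], m[k]
        for i in range(k + 1, n):
            f = m[i][k] / m[k][k]
            for j in range(k, n + 1):
                m[i][j] -= f * m[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

# Power iteration: v <- (X^T X) v, where X = XINV^{-1}, computed as two solves.
XINV_T = [list(col) for col in zip(*XINV)]
v = [1.0, 1.0, 1.0, 1.0]
for _ in range(200):
    w = solve(XINV, v)        # w = X v
    v = solve(XINV_T, w)      # v = X^T w = (X^T X) v
    s = math.sqrt(sum(c * c for c in v))
    v = [c / s for c in v]
norm_X = math.sqrt(s)         # s converges to the largest eigenvalue of X^T X

print(max(abs(e) for row in XINV for e in row), norm_X)
```

The max-entry value reproduces the table exactly; the norm estimate comes out around the tabulated order of magnitude, subject to the precision caveat stated above.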
6.13 Example
Another way to illustrate the bound (6.21) is to generate a large number of random instantiations of $X^{-1}$ and plot the true norm of $X$ against the bound. Since a scaling of $X^{-1}$ results in the same scaling of the bound, the scaling should be fixed to make the two-dimensional nature of the plot meaningful (otherwise, a histogram over the ratios would be a better illustration). Here, the freedom to scale is used to fix the value of $\max( X^{-1} )$, making the x-values in the plot a monotone function of $\lambda_{\max}(X)$. Figures 6.2 and 6.3 show the result of this procedure for two different values of $n$. Looking at the figures, figure 6.3 in particular, it is tempting to conclude that there must be a possibility to tighten the bound by some factor which is independent of $\lambda_{\max}(X)$.

[Figure 6.2: The true norm in (6.21) plotted against the corresponding bound for $n = 2$ and a fixed value of $\max( X^{-1} )$, for 5000 random instantiations of $X^{-1}$. In order to obtain the fixed value (here 1) for $\max( X^{-1} )$, each matrix $X^{-1}$ was generated via an intermediate matrix $Y$ according to $X^{-1} = \max(Y)^{-1}\, Y$, where the entries of $Y$ were sampled from independent uniform distributions over the interval $[ -1, 1 ]$.]

[Figure 6.3: Analogous to figure 6.2, but with $n = 3$.]

[Figure 6.4: The region (white) in the complex plane where the eigenvalues are assumed to reside, illustrating the condition (A1–6.26). The slow dynamics are assumed to have eigenvalues smaller than $R_0$, while the fast and uncertain dynamics are assumed to have eigenvalues smaller than $\bar a\, m^{-1}$ and damping no less than $\cos(\phi_0)$. The dashed line emphasizes the upper bound on the real part of all eigenvalues larger than $R_0$.]
6.5 An LTI ODE result

To simplify matters and to obtain results that are not limited to dae, we shall begin by assuming that the invertible $-A_\eta$ is the identity matrix, leading to the fundamental question of bounding the initial condition response of the system
$$ E\, \eta'(t) = \eta(t) \tag{6.24} $$
given stability, a bound on $E$, and some additional assumptions which remain to be stated. Although a bound on the induced matrix norm $\| E \|_2$ would be convenient for the analysis, we observe that it is much more useful from an application point of view to assume a bound on the maximum size of any entry in $E$, and this is why theorem 6.11 was formulated accordingly. However, since $\max(E) \le \| E \|_2$, any result which assumes $\max(E) \le m$ immediately follows if $\| E \|_2 \le m$. Hence, the results below can readily be rewritten using an ordinary induced norm instead of $\max(\bullet)$, but doing so would introduce additional slack in the derived inequalities. In the rest of this section, (6.24) is written in ode form,
$$ \eta'(t) = M\, \eta(t) \tag{6.25} $$
and we let $m > 0$ be the known bound on $\max( M^{-1} )$; that is, $\max(E) \le m$. We are ultimately interested in giving conditions under which the solutions converge as the upper bound $m$ on $\max(E)$ tends to zero. We shall assume that the poles of the dynamic system being modeled satisfy the following condition, illustrated in figure 6.4.

A1–[6.14] Assumption. Let $\lambda$ denote any eigenvalue of $M$. Assume there exist constants $R_0 > 0$, $\phi_0 \in [\, 0, \pi/2 \,)$, and $\bar a > 1$ such that
$$ |\lambda| > R_0 \implies \bigl( \, |\arg(-\lambda)| \le \phi_0 \ \text{ and } \ |\lambda|\, m < \bar a \, \bigr) \tag{A1–6.26} $$
where $m$ is an upper bound for $\max( M^{-1} )$, and $\bar a$ presents a trade-off between generality of the assumption and the quantitative properties of the forthcoming convergence results.

Note that if A1–[6.14] is acceptable for the value of $m$ at hand, the assumption will only become weaker (the feasible region for $\lambda$ will grow) as we imagine smaller values of $m$.
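Assumption A1–[6.14] is easy to state as a membership test. The following sketch (not from the thesis; all names and numeric values are illustrative) checks sample eigenvalues against the region of figure 6.4: eigenvalues no larger than $R_0$ are unrestricted, while larger ones must be well damped and not faster than $\bar a\, m^{-1}$.

```python
# Sketch (not from the thesis): membership test for the eigenvalue region of
# assumption A1-[6.14]. Parameter names mirror the text: m, R0, phi0, abar.
import cmath, math

def satisfies_a1(lam, m, r0, phi0, abar):
    """True if eigenvalue lam is in the region allowed by (A1-6.26)."""
    if abs(lam) <= r0:
        return True                         # slow eigenvalue: unrestricted
    return abs(cmath.phase(-lam)) <= phi0 and abs(lam) * m < abar

m, r0, phi0, abar = 1e-3, 10.0, math.pi / 4, 2.0
print(satisfies_a1(complex(5.0, 8.0), m, r0, phi0, abar))      # slow eigenvalue
print(satisfies_a1(complex(-900.0, 100.0), m, r0, phi0, abar)) # fast, well damped
print(satisfies_a1(complex(-100.0, 900.0), m, r0, phi0, abar)) # fast, poorly damped
```

The third eigenvalue is rejected because its damping $\cos(|\arg(-\lambda)|)$ falls below $\cos(\phi_0)$, even though its magnitude is admissible.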
It would be a subtle mistake to propose assumptions which are not satisfied for any $M$, which motivates the following simple lemma.

6.15 Lemma. The condition $\bar a > 1$ for (A1–6.26) is sufficient (but not necessary) for the assumption to be possible to fulfill with some $M$, for arbitrarily small $m$. It is necessary that $\bar a \ge n^{-1}$.

Proof: Sufficiency follows by noting that $M = -m^{-1} I$ obviously satisfies $\max( M^{-1} ) \le m$ with $|\arg(-\lambda)| = 0$ and $|\lambda|\, m = 1$ for every eigenvalue $\lambda$. That the condition $\bar a > 1$ is not necessary (at least not for $n = 2$ and $\phi_0 \ge \pi/4$) is demonstrated by the following example,
$$ M = \begin{bmatrix} -m & m \\ -m & -m \end{bmatrix}^{-1} $$
with $|\lambda|\, m = 1/\sqrt{2}$ and $|\arg(-\lambda)| = \pi/4$ for both eigenvalues. Necessity of $\bar a \ge n^{-1}$ is a consequence of any eigenvalue being greater than
$$ \| M^{-1} \|_2^{-1} \ge \bigl( n \max( M^{-1} ) \bigr)^{-1} \ge ( n\, m )^{-1} $$
so for any eigenvalue $\lambda$ it holds that $|\lambda|\, m \ge n^{-1}$, and hence $\bar a < n^{-1}$ would be a contradiction.

The section now ends with a bound on the initial condition response of (6.25). It should be noted here that the uncertainty model $\max( M^{-1} ) \le m$ is often just a coarse over-estimation of a more fine-grained model with individual uncertainty bounds on each matrix entry. Hence, some of the best known uncertainty intervals for individual matrix entries may be much smaller than $m$, and the matrix used to prove sufficiency here may actually not fall within these bounds. Note then that the coarser uncertainty model has a family of solution trajectories which includes those of the more fine-grained uncertainty model. Hence, at some point, the fine-grained uncertainty model may be abandoned, and the coarser uncertainty model, which is more tractable, be used instead; it is in this situation that the present lemma is to be applied.

6.16 Theorem (Main theorem for ode). Consider the ordinary differential equation
$$ x'(t) = M\, x(t) \tag{6.27} $$
where $\max( M^{-1} ) \le m$.
Assuming A1–[6.14], there exist constants $m_0 > 0$ and $\gamma < \infty$ such that

    $m < m_0 \implies \sup_{t \ge 0} \| x(t) \|_2 \le \gamma\, \| x(0) \|_2$

Proof: In view of theorem 2.27, it is seen that the initial condition response of (6.27) gets bounded if

    $\frac{\| M \|_2}{-\alpha( M )} = \frac{m\, \| M \|_2}{-m\, \alpha( M )}$

can be bounded. To see that the denominator is bounded from below by a constant independent of $m$, if $m$ is sufficiently small, we use A1–[6.14] and lemma 6.15. Then, the modulus of any eigenvalue is greater than $( n \max( M^{-1} ) )^{-1} \ge ( n\, m )^{-1}$, showing that $m < ( n\, R_0 )^{-1}$ implies that the modulus of every eigenvalue is greater than $R_0$, and hence

    $-m\, \alpha( M ) > m\, ( n\, m )^{-1} \cos( \varphi_0 ) = n^{-1} \cos( \varphi_0 )$

That the numerator $m\, \| M \|_2$ can be bounded from above follows from theorem 6.11, as it shows that

    $m\, \| M \|_2 \le \bar a\, \frac{\sqrt{ ( n\, \bar a + 1 )^2\, n + 2\, n\, ( n\, \bar a + 2 ) - 1 }}{ n\, \bar a + 2 }$    (6.28)

Combining the bound for the denominator with the bound for the numerator, one obtains

    $\frac{\| M \|_2}{-\alpha( M )} \le \frac{ n\, \bar a }{ \cos( \varphi_0 ) } \, \frac{\sqrt{ ( n\, \bar a + 1 )^2\, n + 2\, n\, ( n\, \bar a + 2 ) - 1 }}{ n\, \bar a + 2 }$    (6.29)

6.17 Remark. Note the trade-off present in the selection of $\bar a$. If $\bar a$ is selected very large, (A1–6.26) is easier to justify, while the bound on the gain of the initial condition response becomes larger. At the other end, as $\bar a \to n^{-1}$, (A1–6.26) becomes increasingly restrictive (recall that lemma 6.15 does not even ensure consistency for $\bar a < 1$), while our bound for the gain of the initial condition response tends to (here, $\gamma$ refers to the notation of theorem 2.27)

    $\| e^{M t} \|_2 \le \gamma( n ) \left( \frac{\sqrt{ 2^2\, n + 6\, n - 1 }}{ 3\, \cos( \varphi_0 ) } \right)^{n-1} = \gamma( n ) \left( \frac{\sqrt{ 2^2\, n + 6\, n - 1 }}{ 3 } \right)^{n-1} \frac{1}{\cos( \varphi_0 )^{n-1}}$

The part of this expression that only depends on $n$ is 1 for $n = 1$, 4.3 for $n = 2$, and grows rapidly. For $n = 5$, it is $9.7 \cdot 10^5$, so even for very well damped systems, for which $1/\cos( \varphi_0 )^{n-1} \approx 1$, the bound will be huge. This highlights the qualitative nature of this work; the quantitative relation implied by the proof of theorem 6.16 gives such poor error bounds for the solution that they are rarely meaningful to apply.
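The right-hand side of (6.29) is easy to tabulate, which makes the trade-off in the choice of $\bar a$ concrete. The helper below is a direct transcription of (6.29) as reconstructed here, so it should be read as a sketch rather than an authoritative implementation.

```python
import numpy as np

def rhs_629(n, abar, phi0):
    """Right-hand side of (6.29): a bound on ||M||_2 / (-alpha(M))."""
    na = n * abar
    return (na / np.cos(phi0)) \
        * np.sqrt((na + 1) ** 2 * n + 2 * n * (na + 2) - 1) / (na + 2)

# Increasing abar weakens the assumption but inflates the bound (remark 6.17).
for abar in (1.1, 2.0, 10.0):
    print(abar, rhs_629(2, abar, 1.4))
```

For $n = 2$, $\bar a = 1.1$, $\varphi_0 = 1.4$ (values that appear in example 6.28) the ratio bound comes out at about 18.6; theorem 2.27 then converts such a ratio into a gain bound on the initial condition response.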
This problem of overly pessimistic bounds can be handled either by removing slack in the derivation of bounds under the current assumptions, or by taking advantage of stronger assumptions.

6.6 The fast and uncertain subsystem

As we now return to dae after the two preceding sections on matrices and ode, we make the assumption that the uncertain dae is pointwise index 0. This assumption will be removed in section 6.8.

Let us now give A1–[6.14] a new interpretation when considering the differential-algebraic equation

    $E\, x'(t) + A\, x(t) = 0$    (6.30)

where $A$ is known and non-singular (the unknown but regular case will be considered soon), while $E$ is unknown but assumed pointwise non-singular (by definition of pointwise index). For this system, the eigenvalues $\lambda$ in A1–[6.14] naturally refer to the eigenvalues of $( E, A )$ (recall the definition on page 35), and $m$ is an upper bound for $\max(E)$. In this setup, we require that $\bar a > \max(A)$.

6.18 Lemma. The condition $\bar a > \max(A)$ for (A1–6.26) in the context of (6.30) is sufficient for the assumption to be possible to fulfill with some $E$, for arbitrarily small $m$.

Proof: Just take $E = \frac{m}{\max(A)}\, A$. This satisfies $\max(E) \le m$, with all eigenvalues of $( E, A )$ equal to $-\max(A)\, m^{-1}$.

We now obtain a corollary to theorem 6.16 by making minor changes to its proof.

6.19 Corollary. Consider (6.30). Assuming (A1–6.26) with $\bar a > \max(A)$, there exist constants $m_0 > 0$ and $\gamma < \infty$ such that

    $m < m_0 \implies \sup_{t \ge 0} \| x(t) \|_2 \le \gamma\, \| x(0) \|_2$

Proof: Compare the proof of theorem 6.16. Writing the equation as an ode,

    $x'(t) = -E^{-1} A\, x(t)$

the ratio which needs to be bounded is seen to be

    $\frac{\| -E^{-1} A \|_2}{-\alpha( -E^{-1} A )} = \frac{m\, \| -E^{-1} A \|_2}{-m\, \alpha( -E^{-1} A )}$

For the denominator, the modulus of any eigenvalue is greater than

    $\| ( -E^{-1} A )^{-1} \|_2^{-1} \ge \| A^{-1} \|_2^{-1}\, ( m\, n )^{-1}$
so for sufficiently small $m$, the modulus of any eigenvalue will be greater than $R_0$, and hence

    $-m\, \alpha( -E^{-1} A ) \ge m\, \| A^{-1} \|_2^{-1}\, ( m\, n )^{-1} \cos( \varphi_0 ) = \| A^{-1} \|_2^{-1}\, n^{-1} \cos( \varphi_0 )$

In order to apply theorem 6.11 for the numerator, we note that

    $|\lambda| \max\!\left( ( -E^{-1} A )^{-1} \right) \le |\lambda|\, \| A^{-1} E \|_2 \le |\lambda|\, \| A^{-1} \|_2\, m\, n \le \bar a\, \| A^{-1} \|_2\, n \triangleq \bar a^*$

yielding

    $m\, \| -E^{-1} A \|_2 \le \bar a\, \frac{\sqrt{ ( n\, \bar a^* + 1 )^2\, n + 2\, n\, ( n\, \bar a^* + 2 ) - 1 }}{ n\, \bar a^* + 2 }$

Combining denominator and numerator bounds, one obtains

    $\frac{\| -E^{-1} A \|_2}{-\alpha( -E^{-1} A )} \le \frac{ \bar a^* }{ \cos( \varphi_0 ) } \, \frac{\sqrt{ ( n\, \bar a^* + 1 )^2\, n + 2\, n\, ( n\, \bar a^* + 2 ) - 1 }}{ n\, \bar a^* + 2 }$

6.20 Lemma. Compare the definition of $A_\eta$ in (6.13). The conclusion of corollary 6.19 still holds if (6.30) is replaced by

    $E\, x'(t) + A( E )\, x(t) = 0$    (6.31)

where the analytic $A$ satisfies a Lipschitz condition in a neighborhood of zero, and $A( 0 )$ is without uncertainty and non-singular.

Proof: By corollary 2.47 it follows that we can choose $m_0$ so small that $\| A( E )^{-1} \|_2$ can be bounded. The conclusion is now reached by following the steps of the proof of corollary 6.19.

6.7 The coupled system

In Kokotović et al. (1986), results come in two flavors: one where approximations are valid on any finite time interval, and one where stability of the slow dynamics in the system makes the approximations valid without restriction to finite time intervals. Compare lemma 2.34 and lemma 2.33 for the respective cases. Here, only finite time intervals are considered, but the other case is treated just as easily.

Recall from section 6.1 how transformations of bounded condition number were used to bring the original system (6.1) into the matrix-valued singular perturbation form (6.5), repeated here,

    $x'(t) + A_{11}(E)\, x(t) + A_{12}(E)\, z(t) = 0$    (6.5x)
    $E\, z'(t) + A_{21}(E)\, x(t) + A_{22}(E)\, z(t) = 0$    (6.5z)

Let $\mathring{x}$ be the solution to

    $\mathring{x}' = ( A_{11}(0) + A_{12}(0)\, L( 0 ) )\, \mathring{x}, \qquad \mathring{x}(0) = x_0$    (6.32)

where $L( 0 ) = -A_{22}(0)^{-1} A_{21}(0)$ according to (6.8), and let the solution to (6.5) at time $t$ be denoted $x( t, E )$, $z( t, E )$.
For $E = 0$ (the nominal system), we have

    $x( t, 0 ) = \mathring{x}(t)$    (6.33x)
    $z( t, 0 ) = L( 0 )\, \mathring{x}(t)$    (6.33z)

Summarizing the results of the previous sections leads to the following theorem.

6.21 Theorem (Main theorem for pointwise index 0). Consider the form (6.5), where $E$ is pointwise non-singular but otherwise unknown. The matrix expressions $A_{ij}(E)$ have to satisfy a Lipschitz condition with respect to $E$, and $A_{22}(0)$ is non-singular (that is, the nominal dae is index 1). Assume that the initial conditions are consistent with $E = 0$, and that A1–[6.14] holds with $\bar a > \max(A)$. Let $I = [\, 0, t_f \,]$ be a finite interval of time. Then

    $\sup_{t \in I} | x( t, E ) - x( t, 0 ) | = O( \max(E) )$    (6.34x)
    $\sup_{t \in I} | z( t, E ) - z( t, 0 ) | = O( \max(E) )$    (6.34z)

Proof: Define $L( E )$ and $H( E )$ as in section 6.3, and consider the solution in terms of $\xi$ and $\eta$ in (6.16). According to lemma 6.4, $\eta( 0, E ) = O( \| E \|_2 ) = O( \max(E) )$, and then lemma 6.20 shows that $\sup_{t \ge 0} | \eta( t, E ) | = O( \max(E) )$.

Note that $x( t, 0 )$ coincides with $\xi( t, 0 )$, so the left hand side of (6.34x) can be bounded as

    $\sup_{t \in I} | x( t, E ) - x( t, 0 ) | = \sup_{t \in I} | \xi( t, E ) + H( E )\, E\, \eta( t, E ) - \xi( t, 0 ) | \le \sup_{t \in I} | \xi( t, E ) - \xi( t, 0 ) | + O( \max(E)^2 )$

To see that the first of these terms is $O( \max(E) )$, note first that lemmas 6.4 and 6.7 give that the initial conditions for $\xi$ are only $O( \max(E)^2 )$ away from $x_0$. Hence, the restriction to a finite time interval gives that the contribution from initial conditions is negligible. The difference between the state feedback matrices of $\xi( \bullet, E )$ and $\xi( \bullet, 0 )$ in (6.16) is seen to be $O( \max(E) )$ by using the Lipschitz conditions in $A_{11}(E) + A_{12}(E)\, L( E )$. Hence, the contribution from the perturbation of the state feedback matrix for $\xi$ is $O( \max(E) )$ according to lemma 2.34.
Concerning $z$,

    $\sup_{t \in I} | z( t, E ) - z( t, 0 ) | = \sup_{t \in I} | z( t, E ) + A_{22}(0)^{-1} A_{21}(0)\, x( t, 0 ) |$
    $\le \sup_{t \in I} | z( t, E ) + A_{22}(0)^{-1} A_{21}(0)\, x( t, E ) | + \sup_{t \in I} | A_{22}(0)^{-1} A_{21}(0) \left( x( t, 0 ) - x( t, E ) \right) |$

The proof is completed by noting that

    $\sup_{t \in I} | A_{22}(0)^{-1} A_{21}(0) \left( x( t, 0 ) - x( t, E ) \right) | \le \| A_{22}(0)^{-1} A_{21}(0) \|_2\, O( \max(E) ) = O( \max(E) )$

and

    $\sup_{t \in I} | z( t, E ) + A_{22}(0)^{-1} A_{21}(0)\, x( t, E ) | \le \sup_{t \in I} | z( t, E ) - L( E )\, x( t, E ) | + O( \max(E) ) \sup_{t \in I} | x( t, E ) |$
    $= \sup_{t \in I} | \eta( t, E ) | + O( \max(E) ) \sup_{t \in I} | x( t, E ) | = O( \max(E) )$

since $| x( t, E ) |$ can be bounded over any finite time interval.

An immediate consequence of theorem 6.21 is the establishment of an $O( \max( \tilde E_{22} ) )$ bound for the error introduced by neglecting $\tilde E_{22}$ in (6.2). From a practical point of view, though, this observation appears to be only of minor interest, since determining $A_{22}(E)$ (or at least $A_{22}(0)$) seems necessary for any quantitative analysis of the fast and uncertain dynamics.

6.8 Extension to non-zero pointwise index

With the exception of some results (including lemmas 6.3, 6.4, 6.7, and 6.8), the results so far require, via lemma 6.20, that $E$ (or $\tilde E_{22}$ in (6.2)) be pointwise non-singular. However, we are able to obtain some results also when some singular values of $E$ are exactly zero. To that end, the results of the previous section will be extended to this situation by revisiting the relevant proofs. Since there are only finitely many choices of rank for $E$ (that is, how many non-zero singular values there are), showing convergence for an arbitrary value of the rank immediately implies convergence independently of the rank.

6.22 Lemma. (Compare lemma 6.20.) In addition to the assumptions of lemma 6.4, assume the perturbed dae has pointwise index no more than 1, and that its poles (that is, the finite eigenvalues of the associated matrix pair) satisfy A1–[6.14].
Then,

    $\sup_{t \ge 0} | E\, \eta( t, E ) | = O( \max(E)^2 )$

Proof: The case of pointwise index 0, when $E$ is full-rank, was treated in lemma 6.20, so it remains to consider the case of pointwise index 1. When the rank of $E$ is zero, $E = 0$, and it is immediately seen from (6.7) that $\eta$ must be identically zero, and the conclusion follows trivially. Hence, assume that the rank is neither full nor zero, and let

    $E = \begin{pmatrix} U_1(E) & U_2(E) \end{pmatrix} \begin{pmatrix} \Sigma(E) & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} V_1(E)^T \\ V_2(E)^T \end{pmatrix}$

be an svd of $E$, where $\Sigma(E)$ is pointwise non-singular and of known dimensions. Applying the unknown change of variables $\eta = V(E) \begin{pmatrix} \bar\eta_1 \\ \bar\eta_2 \end{pmatrix}$ and the row operations represented by $U(E)^T$, (6.7) turns into (dropping $E$ from the notation)

    $\begin{pmatrix} I & & \\ & \Sigma & 0 \\ & 0 & 0 \end{pmatrix} \begin{pmatrix} \xi'(t) \\ \bar\eta_1'(t) \\ \bar\eta_2'(t) \end{pmatrix} + \begin{pmatrix} A_{11} + A_{12} L & A_{12} V_1 & A_{12} V_2 \\ & K_{22} & K_{23} \\ & K_{32} & K_{33} \end{pmatrix} \begin{pmatrix} \xi(t) \\ \bar\eta_1(t) \\ \bar\eta_2(t) \end{pmatrix} = 0$

where, for instance and in particular,

    $K_{33} \triangleq U_2^T A_{22} V_2 - U_2^T E\, L\, A_{12} V_2 = U_2^T A_{22} V_2$

Since the dae is known to be pointwise index 1, differentiation of the last group of equations shows that $K_{33}(E)$ is non-singular, and hence the change of variables

    $\begin{pmatrix} \bar\eta_1(t) \\ \bar\eta_2(t) \end{pmatrix} = \begin{pmatrix} I & 0 \\ -K_{33}(E)^{-1} K_{32}(E) & I \end{pmatrix} \begin{pmatrix} \bar{\bar\eta}_1(t) \\ \bar{\bar\eta}_2(t) \end{pmatrix}$    (6.35)

leads to the dae in $( \xi, \bar{\bar\eta}_1, \bar{\bar\eta}_2 )$ with matrix pair

    $\left( \begin{pmatrix} I & & \\ & \Sigma & 0 \\ & 0 & 0 \end{pmatrix} ,\; \begin{pmatrix} A_{11} + A_{12} L & A_{12} V_1 - A_{12} V_2 K_{33}^{-1} K_{32} & A_{12} V_2 \\ & K_{22} - K_{23} K_{33}^{-1} K_{32} & K_{23} \\ & 0 & K_{33} \end{pmatrix} \right)$

It is seen that $\bar{\bar\eta}_2 = 0$ and that $\bar{\bar\eta}_1$ is given by an ode with state feedback matrix

    $M_{\bar{\bar\eta}_1}(E) \triangleq -\Sigma(E)^{-1} \left( K_{22}(E) - K_{23}(E)\, K_{33}(E)^{-1} K_{32}(E) \right)$

Just like in lemma 6.20, it needs to be shown that the eigenvalues of this matrix grow as $\max(E)^{-1}$, but here we need to recall that $E$ is not only present in $\Sigma(E)$, but also in the unknown unitary matrices $U(E)$ and $V(E)$. Let $m$ be a bound on $\max(E)$. Then $\| \Sigma(E) \|_2 = \| E \|_2 \le m\, n$. From the partial block matrix inversion formula

    $\begin{pmatrix} K_{22}(E) & K_{23}(E) \\ K_{32}(E) & K_{33}(E) \end{pmatrix}^{-1} = \begin{pmatrix} \left( K_{22}(E) - K_{23}(E)\, K_{33}(E)^{-1} K_{32}(E) \right)^{-1} & \star \\ \star & \star \end{pmatrix}$

it follows that

    $\left\| \begin{pmatrix} K_{22}(E) & K_{23}(E) \\ K_{32}(E) & K_{33}(E) \end{pmatrix}^{-1} \right\|_2 \ge \left\| \left( K_{22}(E) - K_{23}(E)\, K_{33}(E)^{-1} K_{32}(E) \right)^{-1} \right\|_2$

and hence

    $\left\| \left( K_{22}(E) - K_{23}(E)\, K_{33}(E)^{-1} K_{32}(E) \right)^{-1} \right\|_2 \le \left\| \left( U(E)^T \left( A_{22}(E) - E\, L( E )\, A_{12}(E) \right) V(E) \right)^{-1} \right\|_2 = \left\| \left( A_{22}(E) - E\, L( E )\, A_{12}(E) \right)^{-1} \right\|_2$    (6.36)

This means that the eigenvalues are bounded from below by

    $\left\| M_{\bar{\bar\eta}_1}(E)^{-1} \right\|_2^{-1} = \left\| \left( K_{22}(E) - K_{23}(E)\, K_{33}(E)^{-1} K_{32}(E) \right)^{-1} \Sigma(E) \right\|_2^{-1} \ge m^{-1} n^{-1} \left\| \left( A_{22}(E) - E\, L( E )\, A_{12}(E) \right)^{-1} \right\|_2^{-1}$

and just like in lemma 6.20, the expression gives that the eigenvalues of $M_{\bar{\bar\eta}_1}(E)$ grow as $m^{-1}$. Before reaching the same conclusion as in lemma 6.20, it remains to show that the constant $\bar a^*$ in the proof of corollary 6.19 can be chosen finite, but this also follows from (6.36). (The matrix $A$ in the proof of corollary 6.19 is the trailing matrix of (6.30), here corresponding to the trailing matrix of the dae for $\bar{\bar\eta}_1$.) Hence, $E$ can be chosen sufficiently small to make $\sup_{t \ge 0} | \bar{\bar\eta}_1(t) |$ bounded by some factor times $| \bar{\bar\eta}_1(0) |$. Further,

    $| \bar{\bar\eta}_1(0) | = | \bar\eta_1(0) | = \left| \begin{pmatrix} \bar\eta_1(0) \\ 0 \end{pmatrix} \right| \le \left| \begin{pmatrix} \bar\eta_1(0) \\ \bar\eta_2(0) \end{pmatrix} \right| = | \eta^0( E ) | = O( \max(E) )$    (6.37)

Using this, the conclusion finally follows by choosing the bound $m$ on $\max(E)$ sufficiently small,

    $| E\, \eta( t, E ) | = \left| E\, V \begin{pmatrix} I & 0 \\ -K_{33}^{-1} K_{32} & I \end{pmatrix} \begin{pmatrix} \bar{\bar\eta}_1(t) \\ 0 \end{pmatrix} \right| \le \left\| U \begin{pmatrix} \Sigma & 0 \\ 0 & 0 \end{pmatrix} V^T V \begin{pmatrix} I & 0 \\ -K_{33}^{-1} K_{32} & I \end{pmatrix} \right\|_2 O( \max(E) )$
    $= \left\| \begin{pmatrix} \Sigma & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} I & 0 \\ -K_{33}^{-1} K_{32} & I \end{pmatrix} \right\|_2 O( \max(E) ) = \left\| \begin{pmatrix} \Sigma & 0 \\ 0 & 0 \end{pmatrix} \right\|_2 O( \max(E) ) = O( \max(E)^2 )$

6.23 Corollary. Lemma 6.22 can be strengthened when $z$ has only two components. Then, just like in lemma 6.20, the conclusion is

    $\sup_{t \ge 0} | \eta( t, E ) | = O( \max(E) )$

Proof: The only rank of $E$ that needs to be considered is 1, and then $\bar{\bar\eta}_1$ will be a scalar and commute with the corresponding transition matrix $\phi_{\bar{\bar\eta}_1}$. From (6.35) and the last two equalities of (6.37) it follows that $K_{33}(E)^{-1} K_{32}(E)\, \bar{\bar\eta}_1(0) = O( \max(E) )$, and hence

    $\sup_{t \ge 0} \left| \begin{pmatrix} \bar\eta_1(t) \\ \bar\eta_2(t) \end{pmatrix} \right| = \sup_{t \ge 0} \left| \begin{pmatrix} \bar{\bar\eta}_1(t) \\ -K_{33}(E)^{-1} K_{32}(E)\, \phi_{\bar{\bar\eta}_1}( t, 0 )\, \bar{\bar\eta}_1(0) \end{pmatrix} \right|$
    $\le \sup_{t \ge 0} | \bar{\bar\eta}_1(t) | + \left| K_{33}(E)^{-1} K_{32}(E)\, \bar{\bar\eta}_1(0) \right| \sup_{t \ge 0} \| \phi_{\bar{\bar\eta}_1}( t, 0 ) \|_2 = O( \max(E) )$

It just remains to use $| \eta(t) | = | \bar\eta(t) |$. An alternative proof is included as a footnote.

(Footnote. The alternative proof uses that $K_{33}(E)^{-1} K_{32}(E)$ is a scalar, and hence of perfect condition number. It follows that

    $\sup_{t \ge 0} | \bar\eta_2(t) | \le \| K_{33}(E)^{-1} K_{32}(E) \|_2 \sup_{t \ge 0} | \bar{\bar\eta}_1(t) | \le \| K_{33}(E)^{-1} K_{32}(E) \|_2 \sup_{t \ge 0} \left| \phi_{\bar{\bar\eta}_1}( t, 0 )\, \bar{\bar\eta}_1(0) \right|$
    $\le \sup_{t \ge 0} \| \phi_{\bar{\bar\eta}_1}( t, 0 ) \|_2 \; \underbrace{\| K_{33}(E)^{-1} K_{32}(E) \|_2 \, \| K_{33}(E)^{-1} K_{32}(E) \|_2^{-1}}_{=1} \; | \bar\eta_2(0) |$ )

Theorem 6.21 can be extended as follows.

6.24 Theorem. Consider the setup (6.5), but rather than assuming that $E$ be pointwise non-singular, it is assumed that $E$ is a matrix with $\max(E) \le m$, and that the dae has pointwise index no more than 1. Except regarding $E$, the same assumptions that were made in theorem 6.21 are made here. Then

    $\sup_{t \in I} | x( t, E ) - x( t, 0 ) | = O( \max(E) )$    (6.38x)
    $\sup_{t \in I} | E\, [\, z( t, E ) - z( t, 0 ) \,] | = O( \max(E)^2 )$    (6.38z)

where the rather useless second equation is included for comparison with theorem 6.21.

Proof: Define $L(E)$ and $H(E)$ as above, and consider the solution expressed in the variables $\xi$ and $\eta$. Lemma 6.22 shows how $E\, \eta$ is bounded uniformly over time. Note that $x( t, 0 )$ coincides with $\xi( t, 0 )$, so the left hand side of (6.38x) can be bounded as

    $\sup_{t \in I} | x( t, E ) - x( t, 0 ) | = \sup_{t \in I} | \xi( t, E ) + H( E )\, E\, \eta( t, E ) - \xi( t, 0 ) | \le \sup_{t \in I} | \xi( t, E ) - \xi( t, 0 ) | + O( \max(E)^2 )$

The conclusion concerning $x$ then follows by an argument identical to that found in the proof of theorem 6.21. For the weak conclusion regarding $E\, z$, the relation $z = L( E )\, x + \eta$ together with (6.38x) and lemma 6.22 immediately yields the bound.

The following result reminds of the example concerning multiple time scale singular perturbations given in Abed and Tits (1986), see section 2.5.3.
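Returning for a moment to the key step (6.36) in the proof of lemma 6.22: since $U$ and $V$ are orthogonal and the inverse of the Schur complement $K_{22} - K_{23} K_{33}^{-1} K_{32}$ is a submatrix of $K^{-1}$, its norm is bounded by the norm of the inverse of the full trailing matrix. This is easy to check numerically; the matrix below is a random stand-in for $A_{22} - E\, L\, A_{12}$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 5, 2                                  # r plays the role of rank(E)
Atilde = rng.standard_normal((n, n))         # stand-in for A22 - E L A12
U, _ = np.linalg.qr(rng.standard_normal((n, n)))   # unknown orthogonal factors
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
K = U.T @ Atilde @ V

K22, K23 = K[:r, :r], K[:r, r:]
K32, K33 = K[r:, :r], K[r:, r:]
schur = K22 - K23 @ np.linalg.solve(K33, K32)

lhs = np.linalg.norm(np.linalg.inv(schur), 2)    # ||schur^{-1}||_2
rhs = np.linalg.norm(np.linalg.inv(Atilde), 2)   # ||Atilde^{-1}||_2
print(lhs <= rhs)   # the inequality behind (6.36) holds for this sample
```

The inequality holds for any orthogonal $U$, $V$ and any invertible $K_{33}$, which is exactly what makes (6.36) independent of the unknown svd factors.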
Thus, this is not the first time in the history of singular perturbations that a result has been shown only in the case when the fast dynamics has dimension two. Unlike Abed and Tits (1986), however, we have not been able to show that the result holds only in this case.

6.25 Corollary. Theorem 6.24 can be strengthened in case $z$ has only two components. Then (6.38z) is replaced by

    $\sup_{t \in I} | z( t, E ) - z( t, 0 ) | = O( \max(E) )$

Proof: Follows by repeating the argument of theorem 6.21, using corollary 6.23.

6.26 Remark. Regarding the failure to show convergence in $z$ unless it has at most two components: having excluded the possibility of bounding $\eta$ by looking at the matrix exponential alone, it remains to explore the fact that we are actually not interested in knowing the maximum gain from initial conditions to later states of the trajectory of $\eta$. That is, since the initial conditions are a function of $E$, it might be sufficient to maximize over a subset of initial conditions. Here, it is expected that lemma 6.8 will come to use. Compare section 7.4.

6.9 Examples

The examples given here are primarily meant to illustrate the convergence property established in this chapter. We shall consider an uncertain dae and plot trajectories of randomly selected instances of the dae which are in agreement with the assumptions proposed in this chapter. In order to make a close connection to theory, the spread of these trajectories should be related to the bounds which can be constructed from the proofs herein. However, as has been indicated above, these bounds will be overly pessimistic, and as we work them out it will be clear that they give bounds which do not fit into our plots. Again, this stresses the qualitative nature of our results.

The first example is constructed from the index 1 dae in the variable $\bar x$, with matrix pair (written $A \;\; E$)

    $\begin{pmatrix} 1.3 & 0.17 & 4.6 \cdot 10^{-2} \\ 0.34 & 0.66 & 0.66 \\ 0.87 & 0.83 & 0.14 \end{pmatrix} \qquad \begin{pmatrix} 1 & 1 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$

with the only finite eigenvalue $-2.32$.
By operating on the equation with random orthogonal matrices from both sides, an equally well behaved system of equations should be obtained. The rows and columns are ordered so that the best pivot entry is at the $(1,1)$ position; otherwise, numerics would come out worse than necessary. Finally, an interval uncertainty of $\pm 1.0 \cdot 10^{-2}$ is added to each matrix entry. This results in the pair

    $\begin{pmatrix} 0.21 \pm 10^{-2} & 1.3 \pm 10^{-2} & 3.2 \cdot 10^{-2} \pm 10^{-2} \\ 0.69 \pm 10^{-2} & 0.38 \pm 10^{-2} & 0.65 \pm 10^{-2} \\ 0.85 \pm 10^{-2} & 0.78 \pm 10^{-2} & 0.14 \pm 10^{-2} \end{pmatrix}$
    $\begin{pmatrix} 1.1 \pm 10^{-2} & 0.95 \pm 10^{-2} & 0.99 \pm 10^{-2} \\ 6.5 \cdot 10^{-2} \pm 10^{-2} & 5.9 \cdot 10^{-2} \pm 10^{-2} & 6.2 \cdot 10^{-2} \pm 10^{-2} \\ -5.2 \cdot 10^{-2} \pm 10^{-2} & -4.7 \cdot 10^{-2} \pm 10^{-2} & -4.9 \cdot 10^{-2} \pm 10^{-2} \end{pmatrix}$

6.27 Example. To illustrate theorem 6.21, we first take the equations into the form (6.5), propagating interval uncertainties,

    $\begin{pmatrix} 0.2 \pm 1.1 \cdot 10^{-2} & 1.1 \pm 3.5 \cdot 10^{-2} & -0.16 \pm 2.4 \cdot 10^{-2} \\ 0.68 \pm 1.3 \cdot 10^{-2} & -0.31 \pm 4.8 \cdot 10^{-2} & 8.7 \cdot 10^{-3} \pm 3.6 \cdot 10^{-2} \\ 0.86 \pm 1.3 \cdot 10^{-2} & 6.5 \cdot 10^{-2} \pm 5.1 \cdot 10^{-2} & -0.68 \pm 3.9 \cdot 10^{-2} \end{pmatrix}$
    $\begin{pmatrix} 1 & 0 & 0 \\ 0 & -1.9 \cdot 10^{-4} \pm 2 \cdot 10^{-2} & -2 \cdot 10^{-4} \pm 2.1 \cdot 10^{-2} \\ 0 & 1.9 \cdot 10^{-4} \pm 2 \cdot 10^{-2} & 1.9 \cdot 10^{-4} \pm 2 \cdot 10^{-2} \end{pmatrix}$    (6.39)

This gives us the bound $m$ for $\max(E)$, and noting that what is $A$ in (A1–6.26) will be close to $A_{22}(E)$ in (6.5), we get an approximate lower bound for $\bar a$. Picking values for $\bar a$ and $\varphi_0$, and with $R_0 = 5$ (this is big enough to encompass the slow eigenvalue, as will be seen soon), the region where eigenvalues are allowed has been determined. Ignoring the $O( \max(E) )$ terms in the transformation to the form (6.7), we still obtain an approximation of the interval uncertainties in (6.7), and are thus able to obtain an approximate interval for the eigenvalue of the slow dynamics. It turns out to be $-2.5 \pm 1.2$, so at least the uncertainties do not destroy the stability of the slow dynamics.

Next, the interval uncertainties in (6.5) are replaced by independent rectangular random variables, and the random equation is then sampled.
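This sampling step, together with the scaling and rejection described next, can be sketched as follows. Here $A_{22}$ is the nominal fast trailing block from (6.39); the factor 1.5 in $\bar a$ and the sample count are arbitrary illustration choices, not values taken from the example.

```python
import numpy as np

rng = np.random.default_rng(2)
m, R0, phi0 = 1e-2, 5.0, 1.4
A22 = np.array([[-0.31, 8.7e-3], [6.5e-2, -0.68]])
abar = 1.5 * np.max(np.abs(A22))          # any abar > max(A22) is admissible

def accept(E):
    """A1-[6.14] acceptance test for the fast pair (E, A22)."""
    lams = np.linalg.eigvals(-np.linalg.solve(E, A22))
    return all(abs(l) * m < abar and
               (abs(l) <= R0 or abs(np.angle(-l)) <= phi0)
               for l in lams)

def sample_E():
    E = rng.uniform(-m, m, size=(2, 2))
    return E * (m / np.max(np.abs(E)))    # activate the constraint max(E) = m

accepted = [E for E in (sample_E() for _ in range(500)) if accept(E)]
print(len(accepted), "of 500 samples accepted")
```

The rejection rate depends strongly on $\bar a$: near-singular draws of $E$ push some eigenvalue modulus above $\bar a / m$ and are discarded, which is exactly the mechanism that keeps the accepted trajectories well behaved.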
The samples in the leading matrix are scaled such that $\max(E) = m$ (expecting the constraint $\max(E) = m$ to be active in the worst case). Samples which do not fulfill the eigenvalue condition are rejected, while valid samples are used to compute and plot the trajectory of $\bar x_1$ to give an idea of the uncertainty in the solution, see figure 6.5.

For comparison, we outline how the bound based on the proofs herein can be approximated. Since the uncertain leading matrix of the fast dynamics will be of the same order of magnitude as the interval uncertainty, and since this will also be the order of magnitude of the initial conditions for the fast and uncertain dynamics, the crucial question is how the initial condition response gain of the fast and uncertain dynamics can be bounded. The tightest bound is obtained with $\bar a^* = \max(A)\, \| A^{-1} \|_2\, n$, where $A$ is the trailing matrix of the fast and uncertain dynamics. Conservatively over-estimating each of $\max(A)$ and $\| A^{-1} \|_2$ independently of the other (where the latter norm is approximated using local optimization), it is found that $\bar a^* < 17.5$. This corresponds to a gain of $4.2 \cdot 10^4$. The inverse of this gain is an upper bound for the order of magnitude that can be tolerated in the original uncertainty, if the computed uncertainty estimates shall be at all usable. Being two orders of magnitude smaller than the size of the uncertainties used to generate figure 6.5, this appears overly conservative.

[Figure 6.5: Random trajectories from systems in the form (6.5), satisfying the eigenvalue condition. The uncertainty intervals around each region are due to the uncertainty in the change of variables leading to (6.5).]

We shall now consider the same uncertain dae once more. This time, we will bring the equations into a form where theorem 6.16 can be applied instead of corollary 6.19. In reaching (6.39), uncertain row and column operations had to be performed on the equations.
Since the elimination operations are given by simple rational expressions in the uncertain entries of the matrices, it is straightforward to compute the uncertainty in the transforms. At this point, however, it is time to apply the decoupling transforms, which are given as the solutions to second order matrix equations, and accurately computing the resulting interval uncertainties is non-trivial. We note the following options (two approximate and one conservative):

• Compute the nominal solution, and then find an approximate interval solution by solving the first order approximations at the nominal solution.

• Compute the nominal solution, and use it as a starting point in a local optimization method which optimizes the lower and upper interval bounds for each entry in the solution. Though a theoretically sound approach, this method has the disadvantages that it is both time consuming and relies on local optimization in possibly non-convex problems.

• Derive $L$ and $H$ using the same technique as is done later in section 7.A. Then outer solutions for the uncertainties follow from the constructive style of the fixed-point argument, and iterative refinement may then be applied to decrease the uncertainties while maintaining the property of being outer approximations. This technique will be applied in example 7.3 on page 194.

Since the only conservative option relies on a technique which was not used in the present chapter, we opt for one of the approximations in the present section. For the particular problem of finding the interval solution to the matrix inverse problem, examples show that the two approximation approaches often produce very similar results, while the result as computed by Mathematica is distinctly more conservative. Even though we are just computing approximate solutions, we shall use the result computed by Mathematica, since it presumably uses techniques which have been more carefully developed than the two approximations listed above. In case a first order approximation leads to a problem which can be solved via matrix inversion, we will do so.

[Figure 6.6: Bounds on the uncertainty in the solution ($x_1$, $x_2$, $x_3$ versus $t$). That the bounds do not converge as the solution components tend to zero is a consequence of the emphasis being on uniform convergence. Converging bounds are easily obtained by using the time dependency of the bound in theorem 2.27.]

Now we have the tools to proceed with the decoupling transforms.

6.28 Example. In this example, we will be able to derive bounds on the uncertainty in the solution to the coupled system, which, if not tight, are at least possible to visualize in the same plots as the nominal solution. To this end we will use smaller interval uncertainties than in the previous example; instead of $\pm 1.0 \cdot 10^{-2}$, we add just $\pm 1.0 \cdot 10^{-6}$ to the unperturbed matrix entries. Instead of (6.39), we now obtain

    $\begin{pmatrix} 0.2 \pm 1.1 \cdot 10^{-6} & 1.1 \pm 3.5 \cdot 10^{-6} & -0.16 \pm 2.4 \cdot 10^{-6} \\ 0.68 \pm 1.3 \cdot 10^{-6} & -0.31 \pm 4.8 \cdot 10^{-6} & 9.1 \cdot 10^{-3} \pm 3.6 \cdot 10^{-6} \\ 0.86 \pm 1.3 \cdot 10^{-6} & 6.5 \cdot 10^{-2} \pm 5 \cdot 10^{-6} & -0.68 \pm 3.9 \cdot 10^{-6} \end{pmatrix}$
    $\begin{pmatrix} 1 & 0 & 0 \\ 0 & -1.9 \cdot 10^{-12} \pm 2 \cdot 10^{-6} & -2 \cdot 10^{-12} \pm 2.1 \cdot 10^{-6} \\ 0 & 1.9 \cdot 10^{-12} \pm 2 \cdot 10^{-6} & 1.9 \cdot 10^{-12} \pm 2 \cdot 10^{-6} \end{pmatrix}$    (6.40)

Recall the equations for the decoupling transforms, (6.9) and (6.17). In the notation of these equations, knowing that $L( E ) = -M_{22}^{-1} M_{21} + O( \max(E) )$ and $H( E ) = M_{12} M_{22}^{-1} + O( \max(E) )$, the first order approximations are seen to be

    $0 = A_{21} + A_{22}\, L + E\, M_{22}^{-1} M_{21} \left( A_{11} - A_{12}\, M_{22}^{-1} M_{21} \right)$
    $0 = \left( A_{11} + A_{12}\, L( 0 ) \right) M_{12}\, M_{22}^{-1}\, E + A_{12} - H \left( A_{22} - E\, L( 0 )\, A_{12} \right)$

These equations are solved using matrix inversion, and after application of the two decoupling transforms, the pair becomes
    $\begin{pmatrix} 2.3 \pm 1.8 \cdot 10^{-4} & 0 & 0 \\ 0 & -0.31 \pm 1.3 \cdot 10^{-5} & 9.1 \cdot 10^{-3} \pm 4.8 \cdot 10^{-6} \\ 0 & 6.5 \cdot 10^{-2} \pm 1.3 \cdot 10^{-5} & -0.68 \pm 5.1 \cdot 10^{-6} \end{pmatrix}$
    $\begin{pmatrix} 1 & 0 & 0 \\ 0 & -1.9 \cdot 10^{-12} \pm 2 \cdot 10^{-6} & -2 \cdot 10^{-12} \pm 2.1 \cdot 10^{-6} \\ 0 & 1.9 \cdot 10^{-12} \pm 2 \cdot 10^{-6} & 1.9 \cdot 10^{-12} \pm 2 \cdot 10^{-6} \end{pmatrix}$    (6.41)

As a last step, a matrix inverse is applied to the rows of the fast and uncertain dynamics to bring it into the form of theorem 6.16;

    $\begin{pmatrix} 2.3 \pm 1.8 \cdot 10^{-4} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$
    $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 6.1 \cdot 10^{-12} \pm 6.6 \cdot 10^{-6} & 6.2 \cdot 10^{-12} \pm 6.7 \cdot 10^{-6} \\ 0 & -2.2 \cdot 10^{-12} \pm 3.6 \cdot 10^{-6} & -2.3 \cdot 10^{-12} \pm 3.7 \cdot 10^{-6} \end{pmatrix}$    (6.42)

In the previous example, the initial conditions were never stated explicitly. Given the unperturbed system at hand, the set of initial conditions which are valid for arbitrarily small perturbations forms a one-dimensional linear space. Fixing the first component to 1 implies

    $\bar x^0 = \begin{pmatrix} 1 \\ -0.923 \\ -0.619 \end{pmatrix}$

Transforming to the variables of (6.42), the initial conditions are given by

    $\xi^0 = -0.415 \pm 7.87 \cdot 10^{-5}, \qquad \eta^0 = \begin{pmatrix} 1.95 \cdot 10^{-9} \pm 3.4 \cdot 10^{-4} \\ 1.59 \cdot 10^{-9} \pm 2.52 \cdot 10^{-4} \end{pmatrix}, \qquad | \eta^0 | \le 4.24 \cdot 10^{-4}$

With $n = 2$, $\bar a = 1.1$, and $\varphi_0 = 1.4$, (6.29) used in theorem 2.27 gives the bound 83.7 on the gain from $\eta^0$ to $\eta(t)$. This allows each of the components of $\eta(t)$ to be bounded by a small constant. Concerning $\xi(t)$, it is a scalar system, so upper and lower bounds are easy to compute given the interval for $\xi^0$ and the corresponding eigenvalue (seen in (6.42)). Inverting the variable transforms one by one, we are finally able to compute interval uncertainties in the original variables. The bounds are shown in figure 6.6.

6.10 Conclusions

The chapter has derived a matrix-valued singular perturbation problem related to the analysis of uncertain lti dae of nominal index 1. The perturbation problem has been solved using assumptions in terms of system features, namely its poles. Depending on whether we also made the assumption that the dae be pointwise index 0 or not, the convergence results come out a bit different, except for when the fast and uncertain dynamics has at most two states.
The decoupling transformations related to the matrix-valued singular perturbation problem have been analyzed using asymptotic arguments. The reason for not deploying the more constructive methods used in the following chapters has been to preserve the style of the original published work that the chapter builds upon. (Except for that, though, the style of presentation has been changed substantially for better compatibility with the following chapters.)

(Footnote to example 6.28: it is also possible to compute the aggregated variable transform by multiplying the individual transforms, and then compute just one inverse, but it turns out that this causes loss of precision compared to computing the inverses of the individual transforms.)

The problem of bounding the norm of a matrix, given a bound on the moduli of its eigenvalues and an entry-wise max bound on its inverse, has been motivated as a useful tool in the analysis of uncertain dae. A bound has been derived, and its quality has been addressed in examples.

An example was used to illustrate the usefulness of analyzing the equations in a form where the uncertainties of the fast and uncertain sub-system were brought entirely to the leading matrix. This idea will appear again in the next chapter, when we set out to understand perturbed equations of nominal index 2.

Appendix 6.A Details of proof of lemma 6.8

This section proves that there exists a $\rho_L > 0$ such that the equation

    $0 = m\, R_L(E) - L^0( E )\, A_{12}(E) \left( L( E ) - L^0( E ) \right) - \left( L( E ) - L^0( E ) \right) \left( A_{11}(E) + A_{12}(E)\, L( E ) \right)$

appearing in lemma 6.8 has a solution satisfying $\| R_L(E) \|_2 \le \rho_L$ if $\max(E)$ is required to be sufficiently small. The equation is written in the fixed-point form

    $R_L(E) = T_L( R_L(E), E )$

by defining

    $T_L( R_L(E), E ) \triangleq \frac{1}{m}\, L^0( E )\, A_{12}(E) \left( L( E ) - L^0( E ) \right) + \frac{1}{m} \left( L( E ) - L^0( E ) \right) \left( A_{11}(E) + A_{12}(E)\, L( E ) \right)$

where the dependence on $R_L(E)$ is through $L( E )$. From here on, the dependency of $R_L(E)$ and $T_L( R_L(E), E )$ on $E$ is dropped from the notation, so instead of $T_L( R_L(E), E )$ we just write $T_L R_L$.

Consider $R_L \in \mathcal{L} = \{\, R_L : \| R_L \|_2 \le \rho_L \,\}$. By the bounded derivatives of all matrices in the problem that depend on $E$, it follows that by requiring $\max(E)$ to be sufficiently small, it will be possible to find $c_3$, $c_0$, $c_{11}$, $c_{11}^+$, $c_{12}$ which fulfill

    $\| A_{22}(E)^{-1} \|_2 \le c_3$
    $\| L^0( E ) \|_2 \le c_0$
    $\| A_{11}(E) \|_2 \le c_{11}$
    $\| A_{11}(E) + A_{12}(E)\, L^0( E ) \|_2 \le c_{11}^+$
    $\| A_{12}(E) \|_2 \le c_{12}$

Applied to (6.18), these bounds yield

    $\| L( E ) - L^0( E ) \|_2 \le m\, c_3\, ( c_0\, c_{11}^+ + m\, \rho_L )$

To ensure that $T_L$ maps $\mathcal{L}$ into itself, first note that

    $\| T_L R_L \|_2 \le \frac{1}{m}\, \| L( E ) - L^0( E ) \|_2\, c_1$

where upper bounds on $m$ and $\rho_L$ are used to ensure that $c_1$ can be chosen to fulfill (dropping dependencies on $E$ from the notation)

    $\| L^0 A_{12} \|_2 + \| A_{11} + A_{12} L^0 \|_2 + m\, c_3\, ( c_0\, c_{11}^+ + m\, \rho_L )\, \| A_{12} \|_2 \le c_1$

for the lowest of all bounds imposed on $\max(E)$. Setting

    $\rho_L \triangleq ( 1 + \alpha_L )\, c_3\, c_0\, c_{11}^+\, c_1$    (6.43)

for some $\alpha_L > 0$, will then yield the condition

    $m \le \frac{ \alpha_L\, c_0\, c_{11}^+ }{ \rho_L } = \frac{ \alpha_L }{ 1 + \alpha_L } \, \frac{1}{ c_3\, c_1 }$    (6.44)

for $T_L$ to map $\mathcal{L}$ into itself.

For the contraction part of the argument, let $L_1$ and $L_2$ denote the expressions for $L( E )$ corresponding to $R_{L,1}$ and $R_{L,2}$, respectively.
Then

    $( L_2 - L^0 )( A_{11} + A_{12} L_2 ) - ( L_1 - L^0 )( A_{11} + A_{12} L_1 ) = ( L_1 - L^0 )\, A_{12}\, ( L_2 - L_1 ) + ( L_2 - L_1 )( A_{11} + A_{12} L_2 )$

As (6.18) shows that $L( E )$ is affine in $m\, R_L$ with the matrix coefficient $A_{22}(E)^{-1} E$ acting from the left, one obtains

    $T_L R_{L,2} - T_L R_{L,1} = \frac{1}{m}\, L^0 A_{12}\, A_{22}^{-1} E \left( m\, R_{L,2} - m\, R_{L,1} \right) + \frac{1}{m} ( L_1 - L^0 )\, A_{12}\, A_{22}^{-1} E \left( m\, R_{L,2} - m\, R_{L,1} \right) + \frac{1}{m}\, A_{22}^{-1} E \left( m\, R_{L,2} - m\, R_{L,1} \right) ( A_{11} + A_{12} L_2 )$
    $= L_1\, A_{12}\, A_{22}^{-1} E \left( R_{L,2} - R_{L,1} \right) + A_{22}^{-1} E \left( R_{L,2} - R_{L,1} \right) ( A_{11} + A_{12} L_2 )$

Using upper bounds on $m$ and $\rho_L$ to ensure that $c_L$ may be chosen to fulfill

    $\| L \|_2 \le c_0 + m\, c_3\, ( c_0\, c_{11}^+ + m\, \rho_L ) \le c_L$

one obtains (using $n_z$ to denote the dimension of $E$)

    $\| T_L R_{L,2} - T_L R_{L,1} \|_2 \le m\, n_z\, c_3\, ( c_{11} + 2\, c_{12}\, c_L )\, \| R_{L,2} - R_{L,1} \|_2$

Hence, the condition

    $m < \frac{1}{ n_z\, c_3\, ( c_{11} + 2\, c_{12}\, c_L ) }$    (6.45)

implies that $T_L$ is a contraction on $\mathcal{L}$, and the contraction principle (theorem 2.44) gives that there is a unique solution $R_L \in \mathcal{L}$. This concludes the argument, and the section ends with a small example.

6.29 Example. To illustrate lemma 6.8, let us consider a small example with constant matrices in (6.5) given by

    $A_{11} = \begin{pmatrix} 2 & 1 \\ 1 & 0.1 \end{pmatrix} \qquad A_{12} = \begin{pmatrix} 0.1 & 0.5 & 0.5 \\ 0.5 & 0.1 & 1 \end{pmatrix}$
    $A_{21} = \begin{pmatrix} -1 & 0.5 \\ 3 & 0.3 \\ 0.5 & -0.5 \end{pmatrix} \qquad A_{22} = \begin{pmatrix} 1 & 1 & 2 \\ 0 & 1 & 0.5 \\ -0.5 & 0.5 & 0 \end{pmatrix}$

Sampling the matrix $E$ randomly a large number of times indicates that $\rho_L$ might be as low as $3.5 \cdot 10^3$ for $m < 1.0 \cdot 10^{-3}$. (The entries of $E$ are sampled from independent uniform distributions over $[ -m, m ]$. For each $E$, we first compute $L$, and then solve for $R_L$ in (6.18). In case $E$ is not full rank, it is still possible to solve for $R_L$ by using pseudo-inverse techniques.) Supposing (guided by the Monte Carlo analysis) that the bounds to be derived will show that $\rho_L \le 10 \cdot 10^3$, and requiring $m$ to be less than 0.01, one obtains the following numeric values of the constants used in the proof of lemma 6.8.

    $c_3 = 3.561 \quad c_0 = 7.598 \quad c_{11} = 1.32 \quad c_{11}^+ = 6.31 \quad c_{12} = 2.311 \quad c_1 = 28.06 \quad c_L = 12.87$

Taking $\alpha_L = 0.5$ yields $\rho_L = 7.185 \cdot 10^3$ and the following two bounds on $m$:

    $m \le 3.336 \cdot 10^{-3} \quad \text{and} \quad m < 1.54 \cdot 10^{-3}$

These values are in accordance with the supposed bounds. For a particular value of $m$, we may now use interval arithmetic fixed-point iteration to improve the bound on $R_L$. In this example, $m = 1.0 \cdot 10^{-3}$, and five fixed-point iterations result in

    $R_L \in \begin{pmatrix} [ -6.476 \cdot 10^3, 6.476 \cdot 10^3 ] & [ -1.817 \cdot 10^2, 1.817 \cdot 10^2 ] \\ [ -5.322 \cdot 10^3, 5.322 \cdot 10^3 ] & [ -1.517 \cdot 10^2, 1.517 \cdot 10^2 ] \\ [ -4.716 \cdot 10^3, 4.716 \cdot 10^3 ] & [ -1.331 \cdot 10^2, 1.331 \cdot 10^2 ] \end{pmatrix}$

The improvement of the entry-wise bounds is larger for smaller values of $m$.

7 LTI ODE of nominal index 2

In chapter 6, the convergence of uncertain dae of true and nominal indices at most 1 was considered. In this chapter, the nominal index 2 case will be considered. As in the nominal index 1 case, the analysis depends on the true index, and to simplify matters, true indices higher than 0 will not be considered.

For many purposes, it turns out to be useful to distinguish between dae based on their index: lower index dae or higher index dae. Only index 0 and index 1 are considered low (recall that these have strangeness index 0), and these are generally considered easy to deal with in comparison with the higher indices. Hence, the current chapter opens the door to the analysis of equations which are expected to be difficult to deal with in comparison to the equations in previous chapters.

The chapter is organized as follows. In section 7.1 a canonical form for perturbed matrix pairs of nominal index 2 is proposed. Section 7.3 contains an analysis of the growth of the uncertain eigenvalues as the uncertainties tend to zero, and the initial conditions of the fast and uncertain subsystem are the topic of section 7.2.
Then, in section 7.4 we take a closer look at a very small index 2 system, and we will see both that it is possible to prove convergence of solutions in this case, and that the index 2 case really is a lot harder to deal with than the lower index systems. Section 7.5 summarizes our conclusions from the chapter.

7.1 Canonical form

In chapter 6, we were able to analyze the equations without bringing them into a form where we could really define the size of the uncertainties; scaling rows and columns in the equation could change the absolute size of the uncertainties arbitrarily. However, (6.16) is only a small step away from

[ I 0 ; 0 E ] [ ξ'(t) ; η'(t) ] + [ Mξξ + O( max(E) ) 0 ; 0 I ] [ ξ(t) ; η(t) ] = 0

where E is still O( m ), although different compared to (6.16). It would be possible to add some more structure by, for instance, bringing Mξξ + O( max(E) ) into the form Jξξ + O( max(E) ), where Jξξ would be the Jordan form of the nominal Mξξ, but since there are many choices, and the choice has no implications for our understanding of the pair (or dae) aspects of the matrix pair, we prefer to defer the choice of structure for Mξξ + O( max(E) ). As this form can be reached from the original, coupled, equation using row and column operations which are nominally non-singular, and with uncertainties of size O( m ), it may be considered a canonical form for perturbed nominal index 1 lti dae — recall how the use of this form was instrumental for the improvement in example 6.28 over example 6.27. In this section, a corresponding canonical form is derived for lti dae of nominal index 2. The theorem is formulated in terms of matrix pairs.

7.1 Theorem (Perturbed index 2 canonical form). Consider the parameterized set of uncertain matrix pairs

( E(m), A(m) ),  m ≥ 0

satisfying

E(m) − E^0 = O( m ),  A(m) − A^0 = O( m )

for some point matrix pair ( E^0, A^0 ) of index 2.
Then there exist a number m0 > 0 and uncertain regular matrices T (m) and V (m) with cond T (m) = O( m0 ) cond V (m) = O( m0 ) such that for all m ≤ m0 ( E(m), A(m) ) ⊂ I I E E T (m) , (m) (m) 33 34 E E 43 (m) 44 (m) A J + 11 (m) I I A I 44 (m) where J is a point matrix and Proof: See section 7.1.1. E ij = O( m ), i, j ∈ { 3, 4 } A ii = O( m ), i ∈ { 1, 4 } V (m)−1 7.1 Canonical form 187 The canonical form will first be derived by assuming that the Weierstrass form of the nominal matrix pair is known. Except for the initial step where the Weierstrass decomposition is used (see comment below), the form is derived constructively by prescribing a sequence of transformations which each bring some additional structure to the matrix pair representing the equations. Each transformation is nominally invertible, with O( m ) uncertainty, m as usual being the entry-wise bound on the matrix entries in the original matrix pair. The sequence ends at a stage where the perturbed nominal index 2 nature is obvious and we are unable to add more structure using nominally invertible transformations. What makes the Weierstrass decoupling step non-constructive is that the nominal matrix to be decomposed is typically not obvious in applications. Rather, the nominal matrix is something which is selected as a means to obtain as small perturbation bounds as possible, and it is not until the pair has been transformed into a form which reveals more of the structure in the pair that the selection should take place. Motivated by the practical shortcomings of the derivation based on the Weierstrass form, the same form will be derived again using a sequence of steps which is possible to use in applications. Before we start, we also remark that the proposed form — similar to the Weierstrass form — may be best suited for theoretical arguments regarding perturbed matrix pairs. 
Once their theory is better understood, other forms based on approximate orthogonal matrix decompositions may be both be more applicable (allowing for larger uncertainties) and able to deliver higher accuracy. We shall not explore such forms in this chapter, but the idea was present back in theorem 6.24. 7.1.1 Derivation based on Weierstrass decomposition The Weierstrass canonical form (recall theorem 2.16) allows us to identify any nominal index 2 dae with a pair in the form " # " # ! I J −1 0 −1 0 V +E , T V +F T N I where T and V are non-singular matrices, and E 0 and F 0 (here, non-negative superscripts are used as ornaments, while the superscript −1 denotes inverse) are matrix 0 0 the uncertain perturbations satisfying max E ≤ m and max F ≤ m. Note that while the nominal index is 2, the pointwise index is generally 0 since the perturbation E 0 generally makes the leading-matrix non-singular. By application of T −1 from the left and V from the right, the pair transforms into " # " #! 1 1 1 1 I + E11 E12 J + F11 F12 , 1 1 1 1 E21 N + E22 F21 I + F22 where E 1 = T −1 E V = O( m ) and F 1 = T −1 F V = O( m ) since T and V were assumed non-singular point matrices. 1 Since I +E11 = I +O( m ), taking m sufficiently small will make it invertible according to corollary 2.47, and applying the inverse as a small but uncertain row operation 188 7 lti ode of nominal index 2 produces " I 1 E21 # " 2 2 J + F11 E12 1 1 , F21 N + E22 2 F12 1 I + F22 #! where, for instance, 1 1 1 −1 1 1 −1 2 J = O( m ) − E11 F11 − J = I + E11 J + F11 = I + E11 F11 It will be necessary to apply corollary 2.47 in nearly every transformation we make, so from here on we take its use for granted at any point where needed. Since the number of applications will be finite, the smallest of all imposed bounds on m will still be positive. We are now only one near-identity row operation and one near-identity column operation from " # " #! 
2 3 I J + F11 F12 3 , 3 3 N + E22 F21 I + F22 The O( m ) property of E 3 and F 3 is maintained. Since the Jordan blocks of N are of size 1 or 2, with at least one of size 2 (or the nominal index would be less than 2), there are numbers n1 and n2 such that n1 + 2 n2 equals the size of N , with n2 being the total number of off-diagonal 1 entries. Let us consider permutations of rows and columns in the pair ( N , I ) for a while. By permuting rows and columns in the same way, it is possible to bring the n1 Jordan blocks of size 1 to the lower right part of N : #! " 1 # " I N2 , I 0 By permuting columns in N21 so that columns 2, 4, . . . appears before columns 1, 3, . . . , and permuting rows so that rows 1, 3, . . . appears before rows 2, 4, . . . , one obtains the form I 0 0 I I 0 0 0 , 0 I and we finally swap the second and third block rows and columns to obtain I I , I 0 0 I Using these permutations in the perturbed pair results in 2 4 4 F12 F13 I J + F11 4 4 4 4 4 4 I + E22 E23 E24 F21 F22 F23 , 4 4 4 4 4 4 E32 E33 E34 F31 F32 I + F33 4 4 4 4 4 4 E42 E43 E44 F41 I + F42 F43 4 F14 4 I + F24 4 F34 4 F44 7.1 189 Canonical form 4 Similar to the first transforms, we now use that I + E22 can be reduced to I using a matrix inverse for sufficiently small m, and then be used to eliminate below and to the right in the leading matrix. 2 5 5 5 F12 F13 F14 I J + F11 5 5 5 5 F21 I F F I + F 22 23 24 , F5 5 5 5 5 5 E E F I + F F 33 34 32 33 34 31 5 5 5 5 5 5 E43 E44 F41 I + F42 F43 F44 Yet three more rounds of two inversions and elimination results in 2 5 5 6 F12 F13 F14 I J + F11 5 5 5 6 F21 I F22 F23 I + F24 , 6 6 6 6 E33 E34 F31 F32 I 6 6 6 6 6 E43 E44 F41 I + F42 F44 7 7 7 7 7 I + E11 E12 J + F11 F12 F13 7 5 5 F 7 E21 I F F I 21 22 23 , 7 6 7 7 E33 E34 F32 I F31 7 7 7 E43 E44 I F44 8 8 8 I J + F11 F12 F13 F 8 8 8 I F22 F23 I 21 , 6 7 7 8 E33 E34 F31 F32 I 7 7 7 E43 E44 I F44 8 8 This is the form in which the decouplings of section 7.A apply. 
Due to F21 , F31 being O( m ), also the decoupling transforms will only be O( m ) away from identity transforms. After the transforms, the pair has the form J + F9 I 11 I 9 9 9 F22 F23 I + F24 6 7 (7.1) , E34 E33 9 9 9 F32 I + F33 F34 7 7 E43 E44 9 9 9 I + F42 F43 F44 9 The first block to eliminate is F22 , it takes two rounds. 9 J + F11 I 10 10 I E E 10 10 F I + F 23 24 23 24 , 6 7 9 9 9 E33 E34 F I + F F 32 33 34 7 7 9 9 9 E43 E44 I + F42 F43 F44 J + F9 I 11 I 10 10 F I + F 23 24 6 7 , E E 9 11 11 33 34 F I + F F 32 33 34 7 7 E43 E44 9 11 11 I + F42 F43 F44 190 7 lti ode of nominal index 2 Finally, four more O( m ) blocks are removed. 9 I J + F11 I I 12 12 , 12 12 E33 E34 I + F33 F34 12 12 12 12 E43 E44 I F43 F44 I I 13 13 E33 E34 , 13 13 E43 E44 9 J + F11 I I 13 I F44 (7.2) Equation (7.2) is the proposed canonical form. At the cost of a substantial increase 13 13 in the uncertainties, one could additionally obtain E33 and F44 in real Schur form. The reason they cannot be put in Jordan canonical form is that the condition number of the similarity transform must be possible to bound in order to maintain the O( m ) size of the uncertainties. 7.1.2 Derivation without use of Weierstrass decomposition When we now derive the same canonical form again, we will be able to make reuse of the latter part of the derivation in the previous section. The interesting part of the derivation is how to get started without knowing the nominal pair. The notation in this section is independent of that introduced in the previous section. While corollary 2.47 was used repeatedly in the previous section to invert perturbed identity matrices. We start by making a similar remark regarding perturbed full rank matrices. Consider the perturbed matrix X + F with no less rows than columns, where X being the nominal matrix has full column rank, and F is a perturbation of size O( m ). 
Then a QR decomposition brings the perturbed matrix into the form

[ R + Q1^T F ; Q2^T F ]

where R is an invertible point matrix, and Q^T F is still an O( m ) perturbation. It follows by corollary 2.47 that taking m sufficiently small will allow the upper block to be inverted using a column operation of bounded condition number, leading to

[ I ; Q2^T F ( R + Q1^T F )^{−1} ]

where the lower block is still of size O( m ). Hence, a row operation of bounded condition number brings the matrix into the final form

[ I ; 0 ]

The procedure of bounding m to be sufficiently small for this reduction to be possible will be applied several times in the following derivation, and we take its use for granted at any point where needed. The transposed case is analogous, and since the total number of applications is finite, the smallest of all imposed bounds on m will still be positive, and the respective products of all row and column operations will still have bounded condition numbers. Since the nominal matrices are unknown to us this time, we simply write the original matrix pair as

( E^0, A^0 )    (7.3)

where both E^0 and A^0 are uncertain matrices. As always, we start by applying row operations and column permutations until we reach the form

( [ E^1_{11} E^1_{12} ; 0 E^1_{22} ] , [ A^1_{11} A^1_{12} ; A^1_{21} A^1_{22} ] )

where E^1_{11} is non-singular while E^1_{22} = O( m ). Typically, E^1_{22} will be so small that it has no entries which can be distinguished from 0 — one proceeds with row operations as long as there exist non-zero entries to pivot on. When transforms are applied to given pair data, however, there is no m which tends to zero, and E^1_{22} may even contain non-zero intervals as long as they are sufficiently small — rather than pivoting on a very small entry with large relative uncertainty, it may be wiser to artificially increase the uncertainty so that zero is within the interval of uncertainty, in order to avoid very large uncertainties in other parts of the equations.
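The QR-based reduction just described can be sketched numerically; a minimal example with numpy (the particular matrices are invented for illustration — only the sequence QR factorization, column operation, row operation is the point):

```python
import numpy as np

rng = np.random.default_rng(0)
m = 1e-6
X = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [3.0, 1.0]])            # nominal matrix, full column rank
F = m * rng.standard_normal(X.shape)  # perturbation of size O(m)

# QR decomposition of the nominal matrix: Q^T (X + F) = [R + Q1^T F; Q2^T F].
Q, _ = np.linalg.qr(X, mode="complete")
T = Q.T @ (X + F)                     # upper 2x2 block invertible, lower row O(m)

# Column operation of bounded condition number: upper block -> I.
V = np.linalg.inv(T[:2, :])
M1 = T @ V                            # [I; O(m)]

# Row operation: eliminate the remaining O(m) row.
L = np.eye(3)
L[2, :2] = -M1[2, :2]
M2 = L @ M1                           # [I; 0]
print(np.round(M2, 9))
```

For small m the column operation V stays well conditioned, since its nominal value is R^{−1} for the invertible point matrix R.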
For further discussion of how to think of O( m ), see section 1.2.4. Next, column operations are applied to yield

( [ I 0 ; 0 E^1_{22} ] , [ A^2_{11} A^2_{12} ; A^2_{21} A^2_{22} ] )

and if A^2_{22} would be non-singular, we know from chapter 6 that the pair can be decoupled and that there is a natural choice of nominal equations of index 1. Since we are considering dae of nominal index 2 in this section, it follows that A^2_{22} is singular (that is, the uncertainties allow for an instantiation of A^2_{22} which is singular in the ordinary sense of point matrices). Since A^2_{22} is singular, it is possible to apply row and column operations which decompose A^2_{22} in the same way as E^0 was decomposed:

( [ I 0 0 ; 0 E^3_{22} E^3_{23} ; 0 E^3_{32} E^3_{33} ] , [ A^2_{11} A^3_{12} A^3_{13} ; A^3_{21} I 0 ; A^3_{31} 0 F^3_{33} ] )    (7.4)

where F^3_{33} = O( m ) (and the same remark regarding given data applies again). If A^3_{31} would not have full row rank, it would be possible to row reduce to reveal a row with only small and uncertain coefficients, and the corresponding row in the leading matrix would also contain only small and uncertain entries. Hence, the unit basis vector which corresponds to this row would be in the left null space of both the leading and the trailing matrix, showing that the matrix pair is singular. This would contradict the index 2 assumption according to corollary 2.12, and it follows that A^3_{31} must at least have full row rank in the nominal case. It follows that there exists a positive bound on m which will make the uncertain A^3_{31} have full row rank as well. In particular, this means that A^3_{31} must have at least as many columns as rows. By symmetry, it follows that A^3_{13} has the transposed size. If A^3_{13} would not have full column rank, column operations would reveal a vector in the right null space of both matrices, again showing that the pair is singular. However, we shall soon see that A^3_{13} having full column rank is implied by the property that the nominal index of the dae is 2.

7.2 Remark.
That the nominal A331 has full row rank is not particularly related to the index 2, but will be necessary for any finite index. We now know the existence of column operations on A331 which are applied together 4 4 with row operations which maintain the leading matrix (the matrices Eij and F44 introduced here are identical to matrices in the previous step, but the new notation is introduced to avoid confusing subscripts in disagreement with the block structure), yielding A4 A4 A4 A4 I 11 12 13 14 I 4 A A422 A423 A424 21 4 4 (7.5) , 4 E33 E34 A31 A432 I 4 4 E33 E44 4 I F44 Until now, we have not made use of the property that the nominal index of the pair be 2, but at this point the pair has enough structure to directly relate it to its index via the shuffle algorithm. Let us temporarily consider the following nominal equation. A411 A412 A413 A414 I 4 A21 A422 A423 A424 I , 0 0 4 A31 A432 I 0 0 I 0 Shuffling the last two rows leads to I I , 4 A31 A432 I I 0 A411 A412 A413 A414 4 4 4 4 A 21 A22 A23 A24 0 0 0 0 0 0 0 0 7.1 193 Canonical form which is row reduced to I I , 4 A31 A432 I 0 0 A411 4 A 21 0 −A4 21 A412 A413 A422 A423 0 0 −A422 −A423 A414 A424 0 −A4 24 Shuffling the last row shows that the nominal A424 will have to be non-singular for the index 2 property to hold, and we now return to the non-nominal equations. For sufficiently small m, A424 will be regular, enabling the following three steps to be carried out. A4 A4 A4 A5 I 11 12 13 14 4 I A A422 A423 I 21 , 4 5 4 E33 E34 A31 A432 I 4 5 E E 5 33 44 I F 44 I E 6 12 I , 4 5 E33 E34 4 5 E33 E44 I I 4 5 , E E 33 34 4 5 E33 E44 A611 A612 A613 4 4 4 A A A I 21 22 23 4 4 A 31 A32 I 5 I F44 A611 A712 A613 4 7 4 A A A I 21 22 23 4 7 A 31 A32 I 5 I F44 (7.6) Here, the form where the decoupling transforms of section 7.A apply has been reached, and the remaining steps towards the canonical form are exactly the same as in the previous section. 
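The shuffle steps performed above on the nominal equations follow a general pattern; a small sketch of the shuffle algorithm for a point pair ( E, A ) with the homogeneous equation E x' + A x = 0 (the function below is a plain textbook implementation written for this illustration, not code from the thesis): rows where E vanishes are differentiated, i.e. the A-row is moved into E, and the number of shuffles needed before E becomes non-singular is the index.

```python
import numpy as np

def shuffle_index(E, A, tol=1e-12, max_steps=10):
    """Index of a regular point pair (E, A) for E x' + A x = 0,
    determined by the shuffle algorithm."""
    E, A = E.astype(float).copy(), A.astype(float).copy()
    n = E.shape[0]
    for step in range(max_steps + 1):
        # Row reduce [E | A] so that the zero rows of E are revealed.
        EA = np.hstack([E, A])
        row = 0
        for col in range(n):
            piv = row + np.argmax(np.abs(EA[row:, col]))
            if abs(EA[piv, col]) < tol:
                continue
            EA[[row, piv]] = EA[[piv, row]]
            for r in range(n):
                if r != row:
                    EA[r] -= EA[r, col] / EA[row, col] * EA[row]
            row += 1
        E, A = EA[:, :n], EA[:, n:]
        if row == n:          # E has full rank: index found
            return step
        # Shuffle: differentiate the algebraic rows (zero rows of E).
        for r in range(row, n):
            E[r], A[r] = A[r].copy(), np.zeros(n)
    raise ValueError("index exceeds max_steps (pair may be singular)")

# Nominal index 2 example: N x' + x = 0 with one 2x2 nilpotent Jordan block.
N = np.array([[0.0, 1.0], [0.0, 0.0]])
print(shuffle_index(N, np.eye(2)))  # -> 2
```

On the nominal pair above, two shuffles are needed, in agreement with the two shuffle steps carried out in the derivation.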
Note that, in the previous section, both the regularity and the nominal index of the matrix pair were ensured by using the Weierstrass canonical form as a starting point. In the current section, these properties were added as requirements along the way in order to be able to proceed with the reduction towards the canonical form. The regularity property will always be necessary in order to rule out systems where the uncertainties have destroyed the nominal solution. The restriction to systems of nominal index at most 2, on the other hand, would make sense to relax.

7.1.3 Example

As an illustration of the two approaches to the canonical form, we shall use the Weierstrass form to construct an example of nominal index 2, and then use the constructive approach to rediscover the nominal structure. Among other things, the example will show that the O( m ) of the theoretical development can be turned into concrete quantities when our techniques are applied to data. Due to space constraints, we are unable to present matrix entries to sufficient precision to make it possible to repeat the computations. The data is given in section 7.B, but readers interested in the full precision will receive the complete computation as a Mathematica notebook upon request.

7.3 Example
In order to avoid trivial dimensions, let us take as example a matrix pair in Weierstrass form (recall theorem 2.16) with

Eigenvalue:            0   −0.5   ∞   ∞   ∞   ∞
Size of Jordan block:  1    2     1   1   2   2

That is, the slow dynamics has 3 states, the index of the pair is 2, there are 2 index 1 variables, and 2 index 2 subsystems with 2 variables each. The Weierstrass form is multiplied by random matrices from the left (condition number 2.7) and right (condition number 1.9), so that a pair with known eigenvalue structure, but no visible structure, is obtained. Finally, an interval uncertainty of ±1.0 · 10^{−9} is added to each entry.
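A construction of this kind is easy to imitate numerically. The sketch below (numpy; the perturbation is structured and much larger than the ±1.0 · 10^{−9} intervals of the example, so that double precision suffices) builds a Weierstrass form with the eigenvalue structure of the table and shows how a perturbation of size m turns the infinite eigenvalues into finite ones of sizes m^{−1} (index 1 blocks) and m^{−1/2} (index 2 blocks):

```python
import numpy as np

def blkdiag(*blocks):
    """Block diagonal matrix from square blocks."""
    n = sum(b.shape[0] for b in blocks)
    M = np.zeros((n, n)); i = 0
    for b in blocks:
        k = b.shape[0]; M[i:i+k, i:i+k] = b; i += k
    return M

m = 1e-6

# Slow part: pencil eigenvalues det(lambda*I + J) = 0 at {0, -0.5, -0.5}.
J = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.5, 1.0],
              [0.0, 0.0, 0.5]])

# Nilpotent part: two index 1 blocks (size 1) and two index 2 blocks (size 2).
n1 = np.zeros((1, 1))
n2 = np.array([[0.0, 1.0], [0.0, 0.0]])
E = blkdiag(np.eye(3), n1, n1, n2, n2)
A = blkdiag(J, np.eye(6))

# Structured perturbation of size m, making E invertible (pointwise index 0).
Ep = E.copy()
Ep[3, 3] = Ep[4, 4] = m   # index 1 blocks: eigenvalue -1/m
Ep[6, 5] = Ep[8, 7] = m   # index 2 blocks: eigenvalues +-m**-0.5

lam = np.linalg.eigvals(-np.linalg.solve(Ep, A))
mags = np.sort(np.abs(lam))
print(np.round(mags, 3))  # [0, 0.5, 0.5, 1e3 (x4), 1e6 (x2)]
```

Multiplying from the left and right by well-conditioned point matrices, as in the example, leaves these pencil eigenvalues unchanged, which is why the example can use random transforms without affecting the eigenvalue structure.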
The added uncertainty may be strikingly small, but this is necessitated by the conservative estimates we will make to obtain the initial outer solution in the first decoupling step. The resulting matrix pair is shown in table 7.2 on page 222. Carrying out the transformation steps given in section 7.1.2, the matrix pair in table 7.3 is obtained, along with two chains of uncertain transformation matrices (one operating from the left, and one from the right). Multiplying together the factors in each chain, the condition numbers can be bounded as 40.0 (left) and 69.5 (right). Regarding the two decoupling steps, implemented based on the derivation in section 7.A.1 (applied to the transposed matrices in the second step), it is the first one which turns out critical here, requiring the uncertainties to be very small to ensure that the transformation is valid. As the O( m ) expressions in the derivation are replaced by interval quantities in computations, there is no use of the parameter m, and it may — without loss of generality — be set to 1. The items below provide some insight into the computations.
• Bounding constants: cL0 ≔ 1.00 · 10^{−3}, cL ≔ 13.0 · 10^0, cE ≔ 9.03 · 10^{−7}, αL ≔ 8.36 · 10^{−2}, ρL ≔ 1.41 · 10^{−2}.
• Constraint on m: 1 ≤ 1.55.
• After a few rounds of iterative refinement (using (7.35) solved with respect to RL by inverting L), the uncertainty ‖RL‖2 comes down to 3.59 · 10^{−3}.
The rather large potential for improvement of the initial outer solution for RL indicates that there is also potential for relaxing the constraint on m. To verify that the decomposition is valid, the transformation chains are applied to the pair in the canonical form, which shall result in a matrix pair containing the original matrix pair.
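Products of interval matrices, such as the transformation chains above, are sensitive to the order of evaluation; a toy sketch (the matrices and radii are invented for illustration, only the effect of subdistributivity is the point) showing that two association orders of the same product give different, but both valid, enclosures:

```python
def imul(a, b):
    """Product of two scalar intervals (lo, hi)."""
    c = (a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1])
    return (min(c), max(c))

def iadd(a, b):
    return (a[0] + b[0], a[1] + b[1])

def imatmul(A, B):
    """Product of interval matrices stored as nested lists of (lo, hi) pairs."""
    rows, inner, cols = len(A), len(B), len(B[0])
    C = [[(0.0, 0.0)] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            s = (0.0, 0.0)
            for k in range(inner):
                s = iadd(s, imul(A[i][k], B[k][j]))
            C[i][j] = s
    return C

def widen(M, r):
    """Interval matrix centered at the point matrix M with entry-wise radius r."""
    return [[(x - r, x + r) for x in row] for row in M]

def total_width(C):
    return sum(hi - lo for row in C for (lo, hi) in row)

T1 = widen([[1.0, 0.5], [-0.3, 1.0]], 1e-3)
T2 = widen([[1.0, -0.2], [0.4, 1.0]], 1e-3)
E  = widen([[0.0, 1.0], [1.0, 0.0]], 1e-3)

collapsed  = imatmul(imatmul(T1, T2), E)  # collapse the transforms first
one_by_one = imatmul(T1, imatmul(T2, E))  # apply them one by one
print(total_width(collapsed), total_width(one_by_one))
```

Both results enclose the underlying point product; by subdistributivity of interval arithmetic the widths generally differ, which is the kind of effect that makes the evaluation order of the transformation chains matter.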
Consider the leading matrix, given by an expression like

T1 T2 · · · Tn E Vm · · · V2 V1

For point matrices, the order in which the multiplications are carried out would not matter, but as an illustration of the nature of interval arithmetic, we compare two different ways of carrying them out.
• Collapse the transformation matrices first, with multiplication associating to the left, that is

[ ( ( T1 T2 ) · · · Tn ) E ] ( ( Vm · · · V2 ) V1 )

The result is shown in table 7.4, and completely includes the original matrix pair.
• Apply the transformation matrices one by one, that is

[ [ [ T1 ( T2 · · · ( Tn E ) ) ] Vm ] · · · V2 ] V1

The result is shown in table 7.5, and completely includes the original matrix pair as well as the pair in table 7.4.
The difference is remarkable; the median interval width in table 7.5 is approximately 7 times that in table 7.4, and the ratio between the widest intervals is near 30. It is a topic for future research to investigate whether the collapsed transformation matrices can be given even higher accuracy by iterative refinement methods such as forward-backward propagation (Jaulin et al., 2001, section 4.2.4).

7.2 Initial conditions

We now turn to the question whether nominally consistent initial conditions of the original coupled system imply that the initial conditions of the fast and uncertain subsystem tend to zero with m. For the purposes of this section, the transformations leading to the canonical form are divided into three groups. The variables of the original, coupled, form (7.3) are denoted x̄,

Ex̄x̄ x̄' + Ax̄x̄ x̄ = 0    (7.7)

The first group of transformations brings us to the form (7.6), where the variables are denoted

( x ; v ),  v = ( v1 ; v2 ; v3 )

The corresponding dae manifests the lti matrix-valued singular perturbation form of the present chapter (compare (6.5)), written

[ I 0 ; 0 Evv ] [ x' ; v' ] + [ Axx Axv ; Avx Avv ] [ x ; v ] = 0    (7.8)

It is easy to check that the variable transforms have O( m^0 ) norm, so

( x ; v ) = O( |x̄| )

The second group of transforms are the decoupling transforms, leading to the form (7.1), with variables

( ξ ; η ),  η = ( η1 ; η2 ; η3 )

belonging to

[ I 0 ; 0 Eηη ] [ ξ' ; η' ] + [ Aξξ 0 ; 0 Aηη ] [ ξ ; η ] = 0

Here, v = L( m ) x + η relates η to the variable before the transforms, where L( m ) is the matrix analyzed in section 7.A.1. Finally, the last group of transforms, which only operates on the last three block rows and columns, leads to the form (7.2), with variables

( ξ ; η̄ ),  η̄ = ( η̄1 ; η̄2 ; η̄3 )

and

[ I 0 ; 0 Eη̄η̄ ] [ ξ' ; η̄' ] + [ Aξξ 0 ; 0 Aη̄η̄ ] [ ξ ; η̄ ] = 0    (7.9)

Again, it is easy to check that the variable transforms have O( m^0 ) norm, so η̄ = O( η ), and hence we shall focus on the initial conditions for η in the rest of this section. In section 7.A.1 it was shown that Evv L = O( m ), and it follows that

Aηη = Avv − Evv L Axv = Avv + O( m )

We now end this section with a variation of lemma 6.4, establishing that the initial conditions of the uncertain system tend to zero with m. The initial conditions for η, given by η(0) = v^0 − L x^0, depend on the choice of x^0 and v^0, and become uncertain due to the uncertainty in L. Note that it is generally not possible to establish convergence of the solutions if x^0 and v^0 are set to fixed values without consideration of the algebraic relations imposed by the dae. Rather, an integrator for dae needs freedom to select suitable initial conditions from some region or in the neighborhood of some “guess” that a user may provide.

7.4 Lemma. Consider the fast and uncertain subsystem in (7.9), obtained from (7.7) using the transformations of section 7.1. Allow for O( m ) uncertainty in equation coefficients as well as initial conditions. The initial conditions satisfy η̄^0 = O( m ) if and only if

Avx x^0 + Avv v^0 = O( m )    (7.10)

Proof: The statement may be proved for η instead of η̄ without loss of generality.
For the sufficiency, consider the expression for η 0 , 0 0 − m RL x0 (7.11) η 0 = v 0 − L x0 = A−1 vv Avx x + Avv v Here, A−1 vv 2 can be bounded by taking m sufficiently small, so the implication follows. For the necessity, rearranging (7.11), we find that Avx x0 + Avv v 0 = Avv η 0 + m RL x0 from which the O(m) size of (7.10) follows. 7.5 Corollary. The degrees of freedom in assignment of initial conditions under (7.10) can be expressed already at the stage of (7.4). Denoting the variables belong ing to this form x̆ v̆1 v̆2 , and writing Av̆2 x̆ in place of A331 , the condition may be expressed as ! Av̆2 x̆ x̆0 = O( m ) leaving no degrees of freedom for v̆1 and v̆2 . Equations to determine v̆1 and v̆2 in terms of x̆0 are available in (7.5) Proof: The statements can be verified by first checking that the involved block rows are not changed by row operations between the stage of use and (7.6). Then using 3 that F33 in (7.4), the condition on x̆0 follows. That there is no degrees of freedom for the remaining initial states, and that they can be determined from (7.5) follows immediately from the non-singularity of " 4 # A23 A424 I 198 7.3 7 lti ode of nominal index 2 Growth of eigenvalues This section amounts to some tedious bookkeeping of the characteristic polynomial of the pair belonging to the η̄ dynamics (compare (7.2)). Since we are only concerned with the pair in the final form in this section, we drop the superscripts on the symbols, I I ! 0 E33 E34 η̄ (t) + I (7.12) η̄(t) = 0 E43 E44 I F44 The proofs are not difficult to understand, but may take some time to read. We begin by stating some properties of the characteristic polynomial of (7.12) with the nominal trailing matrix; I I ! E33 E34 η̄ 0 (t) + I (7.13) η̄(t) = 0 E43 E44 I 0 | {z } CAη̄ η̄ (0) For future reference, the corresponding determinant is written separately in the variable λ I λ I λ E33 + I λ E34 det (7.14) I λ E43 λ E44 We now make three observations. 
First, the modulus of a product of k entries from E can be written as k for some ≤ max(E). Second, the term in the characteristic polynomial which is free of factors from λ and E, is the determinant of the trailing matrix. Third, with every factor from E follows a factor λ. Hence, the characteristic polynomial expanded as a sum of monomials can be written I X n + I det σi λmi i i (7.15) I 0 i | {z } CD where |D| = 1, |σi | = 1, |i | is some number smaller than max(E), and mi ≥ ni ≥ 1. (The number of terms in the sum over i will depend on the matrix block dimensions.) The ratios bound. mi ni turn out to be important, and in particular we will need a good upper 7.6 Lemma. The ratios attained. mi ni are bounded from above by 2, and this bound is generally Proof: Please refer to (7.14) during the following argument. 7.3 199 Growth of eigenvalues Since we are trying to maximize the power of λ relative to the power of i , we are interested in the terms in the determinant which contain one or more factors from the upper left block. From the structure of the matrix, with identity matrices in the lower left and upper right blocks, any selection of n factors from the diagonal upper left block will remove the corresponding rows and columns from the lower and right block rows and columns from the remaining determinant factor. The lowest order term in this remaining determinant will be a product containing all the positions along the I in the middle block, and hence all remaining factors must come from the lower right block. Adding up, the lowest order terms containing n factors from the upper left block will be in the form X Y λn (E33 )ii + 1 λn det( (E44 )ii ) i⊂Nn i where the last sum is over all minors with symmetric choice of rows and columns. Hence, the generality of the constructions depends on at least one of these sums to be non-zero. To see that 2 is an upper bound, note that the ratio is made big by including factors with a λ that do not come with an entry from E. 
Such factors only exist in the upper left block, but it is clear from the argument h above that ifor every such pure factor λ, one factor from the corresponding row of λ E43 λ E44 must also be in the product. The remaining factors in the term will be from the middle block row where there are no factors λ that do not come with an entry from E, and hence the power of will always be at least twice the power of λ. Next, the lemma is used to show that the eigenvalues of the uncertain subsystem must grow as max(E) → 0. 7.7 Theorem. There exists a constant k > 0 such that |λ| ≥ k max(E)−1/2 . Proof: Let nη be the dimension of η̄ (and hence also of η), and hence the degree of the characteristic polynomial (7.15). Lemma 7.6 allows us to write ni = mi /2 + ri for some ri > 0. Rearranging the characteristic polynomial as a sum of monomials in λ, we obtain nη X X m /2+ri D+ λd σi i i d=1 i : mi =d | {z } Cad r Using i ≤ max(E) and assuming max(E) ≤ 1 (so that i i ≤ 1), the coefficients may be bounded as X 1 |ad | ≤ max(E)d/2 i : mi =d | {z } Cαd 200 7 lti ode of nominal index 2 where αd ∈ N only depends on the matrix block dimensions. Dividing the polynomial by λnη , and writing it as nη−1 −1 nη D (λ ) + X anη−d (λ−1 )d d=0 an upper bound on λ−1 may be obtained using theorem 2.52. It gives that λ−1 is bounded by 2 times the maximum of the expressions a 1/d 1/d d ≤ max(E)1/2 αd , d = 1, . . . , nη − 1 D D a 1/nη α 1/nη nη nη ≤ max(E)1/2 2 D 2D Inverting the bound, the proof is completed by taking ( ) 2 D 1/nη D 1/d nη−1 [ 1 k = min αnη 2 αd d=1 This was for the case of the nominal trailing matrix. Note that typical perturbation theory (for instance, Stewart and Sun (1990, chapter vi)) may be hard to apply here, since we only have knowledge of the egienvalue magnitues so far, and only care about the magnitudes. 
Typical perturbation analyses will study the perturbations of the eigenvalues themselves, but these perturbations may actually be large in the present situation without conflicting with our needs. So, instead of trying to use existing eigenvalue perturbation theory, we consider the characteristic equation agian, this time with perturbed coefficients. The perturbed determinant det [ λ E − ( A + m F ) ] where max(F) = O( max(E)0 ), and be rewritten det [ λ E − ( A + m F ) ] h i = det ( A + m F ) A−1 A ( A + m F )−1 [ λ E − ( A + m F ) ] h i h i = det I + m F A−1 det λ A ( A + m F )−1 E − A h i h h i i = det I + m F A−1 det λ I − m F ( A + m F )−1 E − A Here, −F44 I ( A + m F )−1 = I = O( m0 ) 2 I 2 h i −1 Hence bounding det I + m F A from below makes it possible to regard h i I − m F ( A + m F )−1 E as a new unstructured uncertainty which is still O( m ). If the special structure of the leading matrix would not have been used in theorem 7.7, it would have been possible 7.3 201 Growth of eigenvalues to apply again. However, the use of the blocked structure of the leading matrix does not permit this, and even though the proof of the next theorem is far from as elegant as the idea that we just ruled out, it does the job. 7.8 Lemma. Let the uncertainties in (7.12) be bounded by m by setting m B max { max(E) , max(F) } The characteristic polynomial can then be written I X n I det σi0 λmi i i + I F44 i | {z } CD 0 where |D 0 | = 1 + O( m ), σi0 = 1, |i | < m, and mi ≥ ni ≥ 1. Just as for the case of nominal trailing (F44 = 0), it holds that mi ≤ 2 ni . Proof: Compare lemma 7.6. The characteristic polynomial is now given by the determinant I λ I λ E33 + I λ E34 det I λ E43 λ E44 + F44 Trying to construct monomials in the determinant with as high degree in λ relative to the degree in the entries of E and F, leads to the same reasoning as in lemma 7.6. That is, with any in the monomial, there must be one factor h pure factor λ included i from the block λ E43 λ E44 + F44 . 
The only difference to the previous case is that the included factor may be in the form λ e + f instead of just λ e this time. Hence, there will be more monomials than before, but the added ones will never be one of those which maximize the degree in λ relative to the degree in the entries of E and F. Hence, the old result of lemma 7.6 obtains. 7.9 Example For the matrix block dimensions corresponding to a nominal system with 3 index 1 subsystems and 4 index 2 subsystems, the dimension of η̄ is 1 · 3 + 2 · 4 = 11, and depending on whether F is included or not, the characteristic polynomial is characterized by the numbers in table 7.1. It is seen that the table is in agreement with lemma 7.8. 7.10 Corollary. Let the uncertainties in (7.12) satisfy max(E) = O( m ) max(F) = O( m ) Then there are constants k > 0 and m0 > 0 such that for m < m0 , |λ| ≥ k m−1/2 for every eigenvalue λ belonging to (7.12). 202 7 lti ode of nominal index 2 d 1 2 3 4 5 6 7 8 9 10 11 F=0 2 min { ni : mi = d } 2 2 4 4 6 6 8 8 10 12 14 αd 3 10 30 84 204 456 1008 1464 3240 2160 5040 d 1 2 3 4 5 6 7 8 9 10 11 F,0 2 min { ni : mi = d } 2 2 4 4 6 6 8 8 10 12 14 αd 7 34 138 492 1524 3984 8592 14424 17640 13680 5040 Table 7.1: Data for example 7.9. Compare the proof of theorem 7.7. Proof: The O( m ) size of the uncertainties implies the existence of the constant m0 > 0, and numbers lE and lF such that m ≤ m0 implies max(E) ≤ lE m max(F) ≤ lF m By defining m0 B max { lE , lF } m, and repeating the proof of theorem 7.7 with m0 in place of max(E), it is seen that there exists k 0 > 0 such that |λ| ≥ k 0 m0 = k 0 max { lE , lF } m. Hence, setting k B k 0 max { lE , lF } completes the proof. 7.4 Case study: a small system At this point, we have established some results for index 2 lti dae that remind of the results obtained for index 1 lti dae in chapter 6. Unfortunately, this route comes to an end here. 
We shall investigate the reasons for this in the present section, but as a first sign of the difficulties which arise for index 2 systems, note that the state feedback matrix of (7.12) written as an ode will inevitably grow as max(E)^{−1}, while the eigenvalues of this system have only been shown to grow at least as max(E)^{−1/2}. Hence, it appears hard to establish a bound like (6.29) which allows the transition matrix of the η̄ subsystem to be bounded by a constant. Indeed, we shall soon see that no such bound exists.

Consider the smallest possible perturbed nominal index 2 fast and uncertain subsystem,

[ 0  1 ; e  f ] η̄̇ = η̄   (7.16)

where both |e| and |f| are O( m ).

7.4.1 Eigenvalues

The characteristic polynomial in λ is

e λ² + f λ − 1 = e [ ( λ + f/(2e) )² − ( f/(2e) )² − e^{−1} ]   (7.17)

For both eigenvalues to be in the left half plane, it is required that e and f have equal signs. For complex conjugate eigenvalues, −e^{−1} = λ1 λ2 > 0, implying e < 0. For real poles, stability requires

−f/(2e) + √( ( f/(2e) )² + e^{−1} ) < 0

which also simplifies to e < 0. Hence, both e and f must always be negative for the eigenvalues to be in the left half plane. For complex conjugate poles,

|λ| = (−e)^{−1/2},  Re λ = −f/(2e)

Since the modulus does not depend on f, f will only affect the argument of complex conjugate eigenvalues, and when −f/(2e) ≤ −(−e)^{−1/2} the eigenvalues will become real. This condition simplifies to

−f ≥ 2 (−e)^{1/2}   (7.18)

In case the eigenvalues are complex, the m^{−1/2} growth is obvious, but in the case of real roots we resort to theorem 2.52. Applied to

−( λ^{−1} )² + f ( λ^{−1} ) + e = 0

the theorem immediately provides

|λ| ≥ min{ 1/( 2 |f| ), 1/|e|^{1/2} }

which proves the m^{−1/2} growth regardless of the poles being complex conjugates or not. As usual, we make the assumption that there is a φ0 < π/2 such that arg( −λ ) ≤ φ0.
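The m^{−1/2} growth of the eigenvalue moduli can be illustrated with a small numeric sketch. The roots of e λ² + f λ − 1 are computed directly; the choice e = f = −m and the helper names are ours, for illustration only:

```python
import cmath

def eigenvalues(e, f):
    # Roots of the characteristic polynomial e*l**2 + f*l - 1, cf. (7.17).
    d = cmath.sqrt(f * f + 4 * e)
    return (-f + d) / (2 * e), (-f - d) / (2 * e)

def min_modulus(e, f):
    return min(abs(l) for l in eigenvalues(e, f))

# With |e| and |f| of size m, the smallest eigenvalue modulus grows as
# m**-0.5; for e = f = -m the scaled product below is exactly 1.
for m in (1e-2, 1e-4, 1e-6):
    print(m, min_modulus(-m, -m) * m ** 0.5)
```

For real eigenvalues, the bound |λ| ≥ min{ 1/(2|f|), 1/|e|^{1/2} } from theorem 2.52 can be checked in the same way.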
Since the argument is given by the ratio between imaginary and real parts, this effectively puts a lower bound on |f| in relation to |e| via

cos( φ0 ) ≤ ( −Re λ )/|λ| = ( −f )/( 2 (−e)^{1/2} )

That is,

|f| = −f ≥ 2 (−e)^{1/2} cos( φ0 )

In case they are real, the characteristic polynomial is written

e ( λ + (−e)^{−1/2} r ) ( λ + (−e)^{−1/2} r^{−1} )

where r ≥ 1 must satisfy

r + r^{−1} = −f/(−e)^{1/2} ≥ 2   (7.19)

where the left hand side is increasing for r ≥ 1, but r can also be found directly from the expression for the smaller eigenvalue,

r = −(1/2) (−e)^{−1/2} f + (1/2) √( ( f² + 4 e )/(−e) )   (7.20)

In chapter 6, A1–[6.14] also imposed the bound

|λ| m < ā   (7.21)

on the eigenvalue moduli. We do not make this assumption yet in the current case study, but we just note what it would imply. For complex eigenvalues, it would imply

max |λ| = (−e)^{−1/2} ≤ ā m^{−1}   (7.22)

(which is equivalent to a lower bound on |e| proportional to m²). For real eigenvalues, it would imply

max |λ| = (−e)^{−1/2} r ≤ ā m^{−1}   (7.23)

and hence

r ≤ ā m^{−1} (−e)^{1/2} = O( m^{−1/2} )   (7.24a)
min |λ| = (−e)^{−1/2} r^{−1} ≥ ā^{−1} m (−e)^{−1} = O( m⁰ )   (7.24b)

As an alternative to the bound on moduli in A1–[6.14], we shall also consider bounding of r here. An upper bound on r can be expressed in terms of the fast eigenvalues,

max{ |λᵢ| : |λᵢ| ≥ R0 } / min{ |λᵢ| : |λᵢ| ≥ R0 } ≤ r0²   (7.25)

where the known growth of eigenvalues ensures that all the eigenvalues of the η̄ subsystem will satisfy |λ| ≥ R0 for sufficiently small m. Of course, (7.25) would imply r ≤ r0 here.
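The two expressions for r can be cross-checked numerically. The following sketch (our own illustration, with hypothetical values for e and f) verifies that (7.20) satisfies the defining relation (7.19):

```python
import math

def r_of(e, f):
    # r from (7.20); requires e < 0, f < 0 and real eigenvalues,
    # i.e. -f >= 2*(-e)**0.5, which is condition (7.18).
    return -0.5 * (-e) ** -0.5 * f + 0.5 * math.sqrt((f * f + 4 * e) / -e)

e, f = -1e-4, -3e-2            # real, stable eigenvalues: -f = 0.03 >= 0.02
r = r_of(e, f)
print(r, r + 1 / r, -f / math.sqrt(-e))   # last two values agree, per (7.19)
```

The eigenvalue moduli are then (−e)^{−1/2} r and (−e)^{−1/2} r^{−1}, so r² is indeed the ratio appearing in (7.25).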
While φ ≤ φ0 imposes a lower bound on |f| relative to |e|, the bound on r imposes an upper bound on |f| relative to |e| via (7.19) (the eigenvalues will always be real near this limit),

−f ≤ ( r0 + r0^{−1} ) (−e)^{1/2}   (7.26)

While the bound r ≤ r0 is a much stronger assumption than (7.21) for real eigenvalues (compare (7.24a)), the bound adds no information given that the eigenvalues form a single complex conjugate pair (compare (7.22)). A lower bound on r would be quite artificial, and will not be considered an option.

7.4.2 Transition matrix

The transition matrix is computed using the Laplace transform. In the Laplace variable s, the transition matrix is given by

( 1/( s² + e^{−1} f s − e^{−1} ) ) [ e^{−1} f + s   −1 ; e^{−1}   s ]   (7.27)

The gain of the transition matrix will be no smaller than the entries in the second row. The form of the corresponding time functions expressed in the real domain will depend on whether the eigenvalues are complex or not. Let us first consider the case of complex conjugate eigenvalues. Using φ ∈ [ 0, π/2 ) to denote the common value of |arg( −λ )|, we introduce

α ≔ −(−e)^{−1/2} cos( φ )
β ≔ (−e)^{−1/2} sin( φ )

The characteristic polynomial can now be expressed using

α² + β² = −e^{−1},  −2 α = e^{−1} f

Then the last row of the transition matrix is

[ e^{αt} sin( β t )/( e β )   e^{αt} ( β cos( β t ) + α sin( β t ) )/β ]

Optimizing out t, the largest gains are found to be (these values are attained, not just upper bounds)

[ e^{−φ cot( φ )} (−e)^{−1/2}   2 e^{−φ cot( φ )} cos( φ ) ]

The expression e^{−φ cot( φ )} is a monotonically increasing function of φ, tending to e^{−1} as φ → 0, and equal to 1 at φ = π/2. The second entry is a monotonically decreasing function of φ which tends to 2 e^{−1} as φ → 0, and equals 0 at φ = π/2.
Since

e^{−φ cot( φ )} (−e)^{−1/2} ≥ e^{−1} (−e)^{−1/2} ≥ e^{−1} m^{−1/2}

the maximum entry implies that the transition matrix is bounded from below,

sup_{t ≥ 0} ‖Φ( t )‖2 ≥ e^{−1} m^{−1/2}   (7.28)

Using (7.22), we would obtain the upper bound

e^{−φ cot( φ )} (−e)^{−1/2} ≤ e^{−φ cot( φ )} ā m^{−1} ≤ ā m^{−1}

but this bound grows too fast as m → 0 to be successful in combination with the O( m ) convergence of initial conditions.

Let us see if it helps to constrain the eigenvalues to be real. Again, optimizing out t from the result yields expressions where the dependency on (−e)^{−1/2} can be factored out, but this time the remaining factor depends on r ≥ 1 instead of φ, r² being the ratio between the larger and smaller eigenvalues,

[ g1( r ) (−e)^{−1/2}   g2( r ) ]

Here, both g1 and g2 are monotonically decreasing functions tending to zero, with |g1( 1 )| = e^{−1} and |g2( 1 )| = 2 e^{−1}. As we are unwilling to assume a lower bound on r, it is seen that restriction to real eigenvalues would not lower the lower bound (7.28) on the transition matrix.

Without an upper bound on eigenvalue magnitudes or r, (7.20) allows us to evaluate the limit

lim_{e → 0−} g1( r ) (−e)^{−1/2} = 1/|f|

which shows that the transition matrix will grow as

sup_{t ≥ 0} ‖Φ( t )‖2 ≥ m^{−1}

However, letting e → 0− alone would violate (7.23), since the eigenvalues would tend to infinity while the fixed f prevents m from approaching zero. On the other hand, if the constraint (7.21) is added, (7.23) yields

g1( r ) (−e)^{−1/2} = ( g1( r )/r ) r (−e)^{−1/2} ≤ ( g1( r )/r ) ā m^{−1} ≤ e^{−1} ā m^{−1}

which is the same bound as for complex eigenvalues. An upper bound on r would merely produce the same kind of lower bound as (7.28), but with g1( r0 ) replacing e^{−1}. Since the transition matrix is bounded from below when the tight upper bounds in this section are expressed in terms of m, and the upper bounds grow as m^{−1}, it is not possible to conclude that sup_{t ≥ 0} ‖η̄( t )‖ → 0 as m → 0, even though η̄( 0 ) → 0.
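The lower bound (7.28) can also be observed numerically. The sketch below is our own illustration: it assumes a companion-type state matrix whose resolvent reproduces (7.27) up to entry signs, evaluates the transition matrix by Lagrange interpolation on the two eigenvalues, and takes a crude grid maximum over t:

```python
import cmath

def peak_gain(e, f, n=4000):
    # Largest matrix entry of expm(M*t) over a grid of t-values, for
    # M = [[0, -1], [-1/e, -f/e]]; gains are unaffected by entry signs.
    l1 = (-f + cmath.sqrt(f * f + 4 * e)) / (2 * e)
    l2 = (-f - cmath.sqrt(f * f + 4 * e)) / (2 * e)
    M = ((0.0, -1.0), (-1.0 / e, -f / e))
    I = ((1.0, 0.0), (0.0, 1.0))
    t_end = 10.0 / abs(l1.real)
    best = 0.0
    for k in range(1, n + 1):
        t = t_end * k / n
        c1 = cmath.exp(l1 * t) / (l1 - l2)
        c2 = cmath.exp(l2 * t) / (l1 - l2)
        for i in range(2):
            for j in range(2):
                # expm(M*t) = (e^{l1 t}(M - l2 I) - e^{l2 t}(M - l1 I))/(l1 - l2)
                best = max(best, abs(c1 * (M[i][j] - l2 * I[i][j])
                                     - c2 * (M[i][j] - l1 * I[i][j])))
    return best

# With e = f = -m, the peak gain scales as m**-0.5, in line with (7.28).
for m in (1e-2, 1e-4):
    print(m, peak_gain(-m, -m) * m ** 0.5)
```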
The difficulty in separating the bounding of initial conditions from the bounding of the initial condition response gain is a major obstacle for the analysis of nominal index 2 dae.

7.4.3 Simultaneous consideration of initial conditions and transition matrix bounds

We now end this case study by proving that sup_{t ≥ 0} ‖η̄( t )‖ → 0 as m → 0 in a very simple case, despite how difficult this appears to us in the general case due to the inability to consider bounds on initial conditions and initial condition response gain in terms of m. The key to the problem is to consider how initial conditions and the tight transition matrix upper bounds depend directly on e and f. Noting that it is the gain from the initial condition of the first state which cannot be bounded in terms of m, and that this gain grows as (−e)^{−1/2}, it will suffice to show that this initial condition tends to 0 as |e|. It is the utter lack of generality that makes us present this result as an example, even though the whole case study is a kind of example in itself.

7.11 Example
Consider the coupled nominal index 2 system (7.29), consisting of a scalar slow part with parameter a coupled to a fast part with parameters e and f as in the rest of this case study. The system is decoupled using the matrix L = L0 + m RL, with exact solutions given by

L0 = [ a f ; −a ],  L = ( a/( e − f ) ) [ e ; −1 ]

and thanks to the structure of the original pair, the decoupling will lead directly to the canonical form, with the same parameters as in the rest of the current case study. That is, η̄ = η in this case. If the initial conditions x0 and v0 of (7.29) are chosen nominally consistent, (7.11) gives that the initial conditions η0 are given by

η0 = −m RL x0

Computing m RL as L − L0 yields

η0 = −( a/( e − f ) ) [ e ( 1 − f ) + f² ; e − f − 1 ] x0

Hence, for the first state to be O( |e| ), it is required that f² = O( |e| ).
If the eigenvalues are not real, (7.18) directly gives

f² ≤ 4 |e|

In the case of real eigenvalues, (7.19) gives

f² = ( r + r^{−1} )² (−e)

Now, the bound (7.25) on r is the most natural assumption to add in order to obtain f² = O( |e| ). However, by taking m = |f|, the other bound (7.21) is still a valid alternative, since

r ≤ ā m^{−1} (−e)^{1/2} = ā (−e)^{1/2}/|f|

where (−e)^{1/2}/|f| ≤ 1/2 for real eigenvalues. (Clearly, one would have to take ā > 2, but it is expected that there will be a lower bound on how small ā can be chosen; compare the sufficient condition for index 1 systems given in lemma 6.15.) We argue that it is more elegant to use the assumption on r rather than the one on m |λ|, since the former does not involve the parameter m. Also note that it was due to the special structure of the system under consideration that we were able to derive a bound on r from the bound on m |λ|; the example does not show that this can be done for general systems of nominal index 2.

From the simple example, we learnt that for systems of nominal index 2, it may be necessary to bound the ratio between the moduli of the fast and uncertain eigenvalues (in one way or another) in order to obtain converging solutions of the fast and uncertain subsystem. We find it more elegant to assume this directly, rather than via the bound on m |λ| currently used for systems of nominal index 1. To assume a bound on r may have many other applications as well; for instance, it might be a possible and more elegant replacement for the m |λ| bound also for systems of nominal index 1.

7.5 Conclusions

This chapter was devoted to the analysis of autonomous index 0 lti dae of nominal index 2. As the case study in section 7.4 shows in theory, and as numerical experiments (not included in the thesis) indicate in other cases, there can be, and generally seems to be, uniform convergence of solutions, just as we were able to prove generally for nominal index 1 in chapter 6.
Though we have not been able to prove this generally (the case study of a particular form of small system being the exception), the chapter has contributed several findings which we think will help future research on these systems.

First, the existence of a canonical form for perturbed lti dae of nominal index 2 has been proposed. It is closely related to the Weierstrass form for exact matrix pairs, and can be seen as a statement of where non-trivial perturbations of this form need to be considered. The derivation includes existence proofs for the decoupling transforms which isolate the fast and uncertain dynamics from the slow and regularly perturbed dynamics.

Second, it was shown that the eigenvalues of the fast and uncertain subsystem must grow as m → 0. This makes it possible to formulate assumptions about the eigenvalues of the fast and uncertain subsystem.

Third, the case study of a small system has contributed three findings. On the one hand, the study shows that at least there can be uniform convergence of solutions, which should inspire research on how to prove this also in the general case. On the other hand, the study revealed some drastic differences between the nominal index 1 and nominal index 2 cases. In particular, the basic idea of limiting the initial condition response gain of the fast and uncertain subsystem by a constant independent of the size of the perturbations turned out to be useless in this case. Finally, the example in the case study showed that while the assumed bound on m |λ| used in chapter 6 was still sufficient to obtain convergence, bounding the ratio of eigenvalue moduli for the fast and uncertain subsystem can be an elegant replacement.

Appendix

7.A Decoupling transforms

Inspection of the equations that L and H must satisfy to yield a decoupling transform reveals that the results from chapter 6 are not readily applicable.
Most notably, the leading matrix of the lower part of the dae does not vanish with max(E). In this section, we shall use notation which is unrelated to other sections in this chapter, considering a matrix pair P which is in the form (block matrix rows are separated by semicolons, blank blocks being zero)

P ≔ ( [ I 0 0 0 ; 0 I 0 0 ; 0 0 E33 E34 ; 0 0 E43 E44 ] ,
      [ A11 A12 A13 0 ; A21 A22 A23 I24 ; A31 A32 I33 0 ; 0 I42 0 F44 ] )   (7.30)

where

• max( Eij ) and max( F44 ) are both O( m ).
• I24, I33, and I42 each have a corresponding non-singular nominal matrix Iij⁰ such that max( Iij − Iij⁰ ) = O( m ).
• The Aij are bounded independently of m.

In preceding sections of this chapter, it has been shown how this form can be reached using non-singular transforms from more general starting points. Combining those transforms with the transforms of the present section would result in decoupling transforms applicable to more general forms than (7.30). However, we prefer the notion of decoupling transforms applying to (7.30), to avoid unnecessary clutter in this section, which is already lengthy. For an application of these decoupling transforms to a concrete example, see example 7.3.

The idea to use a fixed-point theorem to prove the existence of decoupling transforms for lti systems appears in Kokotović (1975), where tighter estimates are provided compared to the more general ltv results in Chang (1972).

Notation. For brevity, we shall often omit the "for m sufficiently small" in this section. For instance, when we say that there exists a constant which gives a bound on something, this typically means that there is an m0 > 0 such that the bound is valid for all m < m0.

7.A.1 Eliminating slow variables from uncertain dynamics

In the first decoupling step we seek a matrix L, partitioned as

L = [ L1 ; L2 ; L3 ]

such that the blocks of

PL ≔ [ I  0 ; −[ I 0 0 ; 0 E33 E34 ; 0 E43 E44 ] L  I ] P [ I  0 ; L  I ]

(which is a pair with the same leading matrix as P) below the "11-position" in the trailing matrix are zero.
Writing L = L0 + m RL, with L0 denoting the nominal solution corresponding to m = 0, we shall prove uniqueness of L0 and that ‖RL‖2 = O( m⁰ ).

Recall (6.9), the equation that L must satisfy. In the current context given by (7.30), a corresponding residual function is defined by

δL( L ) ≔ [ A21 ; A31 ; 0 ] + [ A22 A23 I24 ; A32 I33 0 ; I42 0 F44 ] L − [ I 0 0 ; 0 E33 E34 ; 0 E43 E44 ] L ( A11 + [ A12 A13 0 ] L )   (7.31)

so that the equation is written

δL( L ) = 0   (7.32)

Let us first consider the nominal solution to this equation, in which case the last of the three groups of equations reads

0 = I42⁰ L1⁰

and the known non-singularity of I42⁰ gives that L1⁰ = 0. Hence, the third term in (7.32) vanishes in the nominal case, and L0 is obtained by either of the following expressions

L0 = [ 0 ; −[ A23⁰ I24⁰ ; I33⁰ 0 ]^{−1} [ A21⁰ ; A31⁰ ] ]
   = −[ A22⁰ A23⁰ I24⁰ ; A32⁰ I33⁰ 0 ; I42⁰ 0 0 ]^{−1} [ A21⁰ ; A31⁰ ; 0 ]   (7.33)

Note that the second form is exactly the same as in the index 1 case. Since L0 only solves the nominal equation, the residual δL( L0 ) is generally non-zero. However, using that

[ I 0 0 ; 0 E33 E34 ; 0 E43 E44 ] L0 = [ 0 ; [ E33 E34 ; E43 E44 ] [ L2⁰ ; L3⁰ ] ]

it is seen that δL( L0 ) = O( m ), so by taking m sufficiently small, we know the existence of

∃ cL⁰ < ∞ : ‖δL( L0 )‖2 ≤ cL⁰ m

Leaving the nominal case, we seek an O( m⁰ ) bound (that is, a bound which is independent of m for m sufficiently small) on ‖RL‖2. Inserting the decomposed L in (7.32) and cancelling a factor of m in the equation, it reads
0 = [ A22 A23 I24 ; A32 I33 0 ; I42 0 F44 ] RL − [ I 0 0 ; 0 0 0 ; 0 0 0 ] RL ( A11 + [ A12 A13 0 ] L0 )
    − m^{−1} δL( L0 )
    − [ 0 0 0 ; 0 E33 E34 ; 0 E43 E44 ] L0 [ A12 A13 0 ] RL
    − [ 0 0 0 ; 0 E33 E34 ; 0 E43 E44 ] RL ( A11 + [ A12 A13 0 ] L0 )
    − m [ I 0 0 ; 0 E33 E34 ; 0 E43 E44 ] RL [ A12 A13 0 ] RL   (7.34)

To simplify notation, we introduce the linear function L given by

L( RL ) ≔ [ A22 A23 I24 ; A32 I33 0 ; I42 0 F44 ] RL − [ I 0 0 ; 0 0 0 ; 0 0 0 ] RL ( A11 + [ A12 A13 0 ] L0 )

and name the remaining terms according to

g1( RL ) ≔ [ 0 0 0 ; 0 E33 E34 ; 0 E43 E44 ] L0 [ A12 A13 0 ] RL + [ 0 0 0 ; 0 E33 E34 ; 0 E43 E44 ] RL ( A11 + [ A12 A13 0 ] L0 )
g2( RL ) ≔ [ I 0 0 ; 0 E33 E34 ; 0 E43 E44 ] RL [ A12 A13 0 ] RL

This allows us to write (7.34) as

L( RL ) = m^{−1} δL( L0 ) + g1( RL ) + m g2( RL )   (7.35)

As it was possible to solve for L0 in the nominal equation, it is seen that the matrix of L is only O( m ) away from an invertible matrix, so taking m sufficiently small allows the induced 2-norm of the operator's inverse to be bounded.

P1–[7.12] Property. The constant cL < ∞ shall be chosen so that

‖L^{−1}‖2 ≤ cL   (P1–7.36)

For instance, the bound may be computed using

‖L^{−1}‖2 = sup_{R ≠ 0} ‖L^{−1} R‖2 / ‖R‖2 ≤ sup_{R ≠ 0} ‖L^{−1} R‖F / ( n^{−1/2} ‖R‖F ) = √n ‖( vec L )^{−1}‖2

where n = min{ nη, nξ } (nη by nξ being the dimensions of L), and vec L refers to a vectorized version of L. The following constants are also readily available

∃ c1 < ∞ : ‖A11 + [ A12 A13 0 ] L0‖2 ≤ c1
∃ c2 < ∞ : ‖[ A12 A13 0 ]‖2 ≤ c2

Further, the O( m ) property of the Eij implies the existence of

∃ cE < ∞ : ‖[ m^{−1} E33  m^{−1} E34 ; m^{−1} E43  m^{−1} E44 ]‖2 ≤ cE

Now consider RL ∈ L ≔ { RL : ‖RL‖2 ≤ ρL }, where ρL is to be selected later.
The following bounds are obtained for m small enough

‖m^{−1} δL( L0 )‖2 ≤ cL⁰
‖g1( RL )‖2 ≤ m cE ( ‖L0‖2 c2 + c1 ) ρL
‖g2( RL )‖2 ≤ c2 ρL²

Using the matrix equality

X2 Q X2 − X1 Q X1 = ( X2 − X1 ) Q X2 + X1 Q ( X2 − X1 )   (7.37)

(used already in Chang (1969)), one also obtains

‖g1( RL,2 ) − g1( RL,1 )‖2 ≤ m cE ( ‖L0‖2 c2 + c1 ) ‖RL,2 − RL,1‖2
‖g2( RL,2 ) − g2( RL,1 )‖2 ≤ 2 c2 ρL ‖RL,2 − RL,1‖2

In search of a contraction mapping to prove the existence of a bounded solution RL ∈ L, the operator TL is defined by

TL RL ≔ L^{−1}( m^{−1} δL( L0 ) + g1( RL ) + m g2( RL ) )   (7.38)

Setting

ρL = ( 1 + αL ) cL cL⁰   (7.39)

for some αL > 0, and considering

‖TL RL‖2 ≤ cL cL⁰ [ 1 + m ( 1 + αL ) ( cE ( ‖L0‖2 c2 + c1 ) + c2 ρL ) ]

it is seen that TL maps L to itself if

m ( 1 + αL ) ( cE ( ‖L0‖2 c2 + c1 ) + c2 ρL ) ≤ αL

or, equivalently,

m ≤ [ αL/( 1 + αL ) ] / ( cE ( ‖L0‖2 c2 + c1 ) + c2 ρL )   (7.40)

From

‖TL RL,2 − TL RL,1‖2 ≤ m cL ( cE ( ‖L0‖2 c2 + c1 ) + 2 c2 ρL ) ‖RL,2 − RL,1‖2

it is seen that TL is a contraction if

m ≤ [ 1/cL ] / ( cE ( ‖L0‖2 c2 + c1 ) + 2 c2 ρL )   (7.41)

and the conjunction of the two conditions (7.40) and (7.41) is equivalent to

m ≤ min{ αL/( 1 + αL ), 1/cL } / ( cE ( ‖L0‖2 c2 + c1 ) + 2 c2 ( 1 + αL ) cL cL⁰ )   (7.42)

If the parameter αL is tuned for maximizing the bound on m, the optimal choice in (7.42) can be given in closed form. In case c2 = 0, the best choice is that which makes the two bounds equal (larger values will only worsen the bound on ρL without improving the bound on m), so we consider the more interesting case when c2 ≠ 0. One first computes the optimum of (7.40) alone,

αL¹ ≔ √( 1 + cE ( ‖L0‖2 c2 + c1 ) / ( 2 c2 cL cL⁰ ) )

The objective function in (7.40) is increasing in the range [ 0, αL¹ ], but the combined objective in (7.42) is only increasing up to the point where

αL/( 1 + αL ) = 1/cL

This can only happen if cL > 1, with solution αL = 1/( cL − 1 ).
Hence, the bound (7.42) is maximized by

αL = αL¹,  if c2 ≠ 0 and αL¹/( 1 + αL¹ ) ≤ 1/cL
αL = 1/( cL − 1 ),  otherwise   (7.43)

The choice (7.43) should be used with care, since if cL < 1 and c2 approaches zero, the rule will assign arbitrarily large values to αL, ignoring the consequences for the bound on ρL. For future reference, we note that taking c2 = 0 and maximizing the bound in (7.42) with respect to αL yields the bound

m ≤ 1/( cE c1 cL )   (7.44)

This concludes the proof of existence of the approximation L = L0 + m RL, valid for sufficiently small m. Given a choice of the tuning parameter αL > 0, the bound on m is given in (7.42), while the bound on ‖RL‖2 in (7.39) is no less than cL cL⁰. If the bound on ‖RL‖2 is not critical, (7.43) may be used to set the tuning parameter.

Now that the approximation has been proved, we may additionally conclude that

‖[ I 0 0 ; 0 E33 E34 ; 0 E43 E44 ] L‖2 = ‖[ I 0 0 ; 0 E33 E34 ; 0 E43 E44 ] L0 + m [ I 0 0 ; 0 E33 E34 ; 0 E43 E44 ] RL‖2 ≤ ( ‖L0‖2 cE + ρL ) m   (7.45)

7.A.2 Eliminating uncertain variables from slow dynamics

In the second decoupling step we seek a matrix H partitioned as

H = [ H1  H2  H3 ]

such that the blocks of

[ I  −H ; 0  I ] PL [ I  H [ I 0 0 ; 0 E33 E34 ; 0 E43 E44 ] ; 0  I ]

(which again is a pair with the same leading matrix as P) below and to the right of the "11-position" in the trailing matrix are zero. While the objective is primarily to show just that H = O( m⁰ ), we will still write H = H0 + m RH, with H0 denoting the nominal solution corresponding to m = 0, in order to be able to provide better estimates of the size of H. Hence, we shall prove uniqueness of H0 and that ‖RH‖2 = O( m⁰ ).

This section is to a large extent analogous to the previous section. This should come as no surprise, since the decoupling can be implemented with the same kind of transform as used in the previous section, only applied to the transposed matrix pair this time.
While this proves the existence, the pair PL^T has some structural differences to the pair P, and we aim to exploit these to get insight into the problem and hopefully obtain tighter bounds. We shall return to the duality between the two decoupling steps in section 7.A.3, when we have the bounding expressions of the two steps at hand, and the reader who is not interested in the minor details of obtaining good bounds should skip to section 7.A.3 at this point.

The condition that H must satisfy has a corresponding residual function

δH( H ) ≔ [ A12 A13 0 ] + ( A11 + [ A12 A13 0 ] L ) H [ I 0 0 ; 0 E33 E34 ; 0 E43 E44 ]
          − H ( [ A22 A23 I24 ; A32 I33 0 ; I42 0 F44 ] − [ I 0 0 ; 0 E33 E34 ; 0 E43 E44 ] L [ A12 A13 0 ] )   (7.46)

so that the equation is written

δH( H ) = 0   (7.47)

Using knowledge about L0, the equation for H0 simplifies to

0 = [ A12⁰ A13⁰ 0 ] + ( A11⁰ + [ A12⁰ A13⁰ 0 ] L0 ) H0 [ I 0 0 ; 0 0 0 ; 0 0 0 ] − H0 [ A22⁰ A23⁰ I24⁰ ; A32⁰ I33⁰ 0 ; I42⁰ 0 0 ]

Reading off the last block column of the equation reveals the readily invertible

0 = H1⁰ I24⁰

From H1⁰ = 0 it follows that

H0 [ I 0 0 ; 0 E33 E34 ; 0 E43 E44 ] = [ 0  [ H2⁰  H3⁰ ] [ E33 E34 ; E43 E44 ] ]   (7.48)

and hence that H0 is given by either of the following two expressions

H0 = [ 0  [ A12⁰  A13⁰ ] [ A32⁰ I33⁰ ; I42⁰ 0 ]^{−1} ]   (7.49)
   = [ A12⁰ A13⁰ 0 ] [ A22⁰ A23⁰ I24⁰ ; A32⁰ I33⁰ 0 ; I42⁰ 0 0 ]^{−1}   (7.50)

Since H0 only solves the nominal equation, δH( H0 ) is generally non-zero. However, using (7.49) and (7.45) it is seen that δH( H0 ) = O( m ), so by taking m sufficiently small, we know the existence of

∃ cH⁰ < ∞ : ‖δH( H0 )‖2 ≤ cH⁰ m

Note that it was possible to solve for H0 without introducing additional assumptions about distinct eigenvalues, as is typically needed when the equation is in the form H0 A + B H0 = C.
In particular, this means that the linear operator H defined by

H( H′ ) ≔ H′ [ A22 A23 I24 ; A32 I33 0 ; I42 0 F44 ] − ( A11 + [ A12 A13 0 ] L ) H′ [ I 0 0 ; 0 0 0 ; 0 0 0 ]

has a matrix which is only O( m ) away from an invertible one, and hence taking m sufficiently small will enable us to bound the inverse of the operator.

P2–[7.13] Property. The constant cH < ∞ shall be chosen so that

‖H^{−1}‖2 ≤ cH   (P2–7.51)

Now that the existence and uniqueness of a nominal solution have been established, we turn to RH. Inserting H = H0 + m RH in (7.47) and cancelling a factor of m in the equation, one obtains

H( RH ) = m^{−1} δH( H0 ) + RH [ I 0 0 ; 0 E33 E34 ; 0 E43 E44 ] L [ A12 A13 0 ] + ( A11 + [ A12 A13 0 ] L ) RH [ 0 0 0 ; 0 E33 E34 ; 0 E43 E44 ]   (7.52)

which is written

H( RH ) = m^{−1} δH( H0 ) + m h( RH )   (7.53)

by means of the definition

h( RH ) ≔ RH m^{−1} [ I 0 0 ; 0 E33 E34 ; 0 E43 E44 ] L [ A12 A13 0 ] + ( A11 + [ A12 A13 0 ] L ) RH [ 0 0 0 ; 0 m^{−1} E33 m^{−1} E34 ; 0 m^{−1} E43 m^{−1} E44 ]

Restricting the analysis to m so small that (7.45) is valid, and using that

‖A11 + [ A12 A13 0 ] L‖2 ≤ c1 + m c2 ρL

the gain of the linear function h can be bounded as

‖h( RH )‖2 ≤ [ ( ‖L0‖2 cE + ρL ) c2 + ( c1 + m c2 ρL ) cE ] ‖RH‖2 ≕ ch ‖RH‖2   (7.54)

For the operator TH defined by

TH RH ≔ H^{−1}( m^{−1} δH( H0 ) + m h( RH ) )   (7.55)

and restricted to RH ∈ H ≔ { RH : ‖RH‖2 ≤ ρH }, where ρH is to be selected later, we then obtain

‖TH RH‖2 ≤ cH cH⁰ + m cH ch ρH
‖TH RH,2 − TH RH,1‖2 = m ‖H^{−1} h( RH,2 − RH,1 )‖2 ≤ m cH ch ‖RH,2 − RH,1‖2

It just remains to set ρH = ( 1 + αH ) cH cH⁰ with αH > 0, so that

m ≤ [ 1/( cH ch ) ] αH/( 1 + αH )   (7.56)

ensures that TH maps H into itself, and since this implies that m cH ch < 1, the contraction property imposes no additional requirements on m. This concludes the proof of existence of the approximation H = H0 + m RH, valid for sufficiently small m. The provided bound on ‖RH‖2 is no less than cH cH⁰, and for each choice of the bound, a corresponding bound on m follows from (7.56).
We now end the section with a last analogy to the previous section:

‖H [ I 0 0 ; 0 E33 E34 ; 0 E43 E44 ]‖2 = ‖H0 [ I 0 0 ; 0 E33 E34 ; 0 E43 E44 ] + m RH [ I 0 0 ; 0 E33 E34 ; 0 E43 E44 ]‖2 ≤ ( ‖H0‖2 cE + ρH ) m   (7.57)

7.A.3 Remarks on duality

We indicated in the beginning of section 7.A.2 that the existence of the second decoupling step could simply be obtained by applying the decoupling developed in section 7.A.1 to the transposed pair PL^T. In this section, we shall make some comparisons that will show whether or not the development in section 7.A.2 was any good.

The theory provides two bounds: one on m, which should be large for wider applicability, and one on ρH, which should be small for increased precision in the results. In view of theorem 2.50, however, obtaining a tight bound on ρH may be of little importance in applications, as iterative refinement will generally produce RH with ‖RH‖2 < ρH anyway. Hence, the crucial bound for the comparison at hand is that on m.

We need some notation to indicate what the expressions in section 7.A.1 would be if applied to the decoupling step in section 7.A.2. We will let {expr}7.A.1 denote the expression or quantity we would use in place of expr in section 7.A.1. The pair PL^T is given by

PL^T = ( [ I  0 ; 0  [ I 0 0 ; 0 E33 E34 ; 0 E43 E44 ]^T ] ,
         [ ( A11 + [ A12 A13 0 ] L )^T  0 ; [ A12 A13 0 ]^T  M^T ] )

where

M = [ A22 A23 I24 ; A32 I33 0 ; I42 0 F44 ] − [ I 0 0 ; 0 E33 E34 ; 0 E43 E44 ] L [ A12 A13 0 ]

That is, to use the notation of section 7.A.1, we have to make the replacements

{ [ I 0 0 ; 0 E33 E34 ; 0 E43 E44 ] }7.A.1 ∼ [ I 0 0 ; 0 E33 E34 ; 0 E43 E44 ]^T
{ A11 }7.A.1 ∼ ( A11 + [ A12 A13 0 ] L )^T
{ [ A12 A13 0 ] }7.A.1 ∼ 0
{ [ A21 ; A31 ; 0 ] }7.A.1 ∼ [ A12 A13 0 ]^T
{ [ A22 A23 I24 ; A32 I33 0 ; I42 0 F44 ] }7.A.1 ∼ M^T

Using the replacement rules, we find that

{ A11 + [ A12 A13 0 ] L0 }7.A.1 ∼ ( A11 + [ A12 A13 0 ] L )^T

and hence the only difference between the two operators L and H^T is that [ A22 A23 I24 ; A32 I33 0 ; I42 0 F44 ] has been replaced by M, which we know is an O( m ) difference.
Noting that ‖H^{−1}‖2 = ‖( H^T )^{−1}‖2, we see that the larger uncertainty in M generally implies that

{ cL }7.A.1 ≥ cH,  { cL }7.A.1 − cH = O( m )

Next, from δH( −H )^T = { δL }7.A.1( H ) it follows that

cH⁰ = { cL⁰ }7.A.1

Hence, for a given value of the trade-off parameter α, applying section 7.A.2 would yield the better bound on ρH, but the difference is small. To see if there are any interesting differences in the more crucial bound on m, we assume that { αL }7.A.1 is selected optimally in this respect. In section 7.A.2, the bound (7.56) approaches

1/( cH ch )

from below as αH grows. Since { c2 }7.A.1 ∼ 0, the condition (7.44) shall be used in section 7.A.1,

{ m ≤ 1/( cE c1 cL ) }7.A.1

where

{ c1 }7.A.1 ∼ c1 + m c2 ρL

This means that { cE c1 }7.A.1 constitutes just one of the two positive terms in ch (see (7.54)), and hence the bound (7.56) is more restrictive than (7.44), even for large values of αH. At the same time, (7.44) is valid as soon as αL ≥ { 1/( cL − 1 ) }7.A.1.

To conclude this section, the most promising approach to the second decoupling step is that in section 7.A.1. In a particular problem with uncertain data, however, many of the triangle inequalities used to establish bounds in this section (take (7.54) as an example) may be unnecessarily conservative; more direct methods for computing upper bounds on the gain are likely to produce tighter bounds. Hence, we are unable to tell a priori which approach would be superior for some given data. In the implementation behind some of the examples in this thesis, we have only implemented the approach of section 7.A.1.

7.B Example data

This section contains tables with matrix pair data referenced from section 7.1.3. To fit the matrices on the page, the pair

[ E1  E2 ],  [ A1  A2 ]

(where the partitioning is related to space constraints, and not to the real structure in the pair) will be typeset as the stacked blocks

[ A1 ; E1 ]  and  [ A2 ; E2 ]

and rotated.
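The contraction-mapping machinery of this appendix (in the spirit of Kokotović (1975)) can be illustrated in the scalar case. The equation below is our own minimal analogue of the residual equation (7.31), with hypothetical coefficient values; it is solved by the same kind of fixed-point iteration as in the proofs above:

```python
def decoupling_gain(a11, a12, a21, a22, eps, tol=1e-14, itmax=100):
    # Scalar analogue of delta_L(L) = 0: a21 + a22*l - eps*l*(a11 + a12*l) = 0,
    # where eps plays the role of m. Start from the nominal solution (eps = 0)
    # and iterate the fixed-point map.
    l = -a21 / a22
    for _ in range(itmax):
        l_new = (-a21 + eps * l * (a11 + a12 * l)) / a22
        if abs(l_new - l) < tol:
            break
        l = l_new
    return l

l = decoupling_gain(a11=1.0, a12=0.5, a21=2.0, a22=-3.0, eps=1e-2)
residual = 2.0 + (-3.0) * l - 1e-2 * l * (1.0 + 0.5 * l)
print(l, residual)   # residual vanishes; l stays within O(eps) of -a21/a22
```

For eps small enough, the map is a contraction with factor O( eps ), mirroring the roles of the bounds (7.40) and (7.41).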
Table 7.2: The initial matrix pair. (Entries are given with uncertainty intervals ±1·10⁻⁹; the full numeric data is not reproduced here.)
0 0 0 0 0 −16 1.5999 · 10 ±6.2549 · 10−6 −17 −7.8679 · 10 ±6.5123 · 10−6 −5.2925 · 10−17 ±2.2497 · 10−6 5.9823 · 10−17 ±2.5663 · 10−6 ··· −1.0818±7.6346 · 10−4 −1.3864±1.0081 · 10−3 −0.5101±1.3366 · 10−3 −0.75049±1.9634 · 10−3 0.52916±1.8735 · 10−3 0.6205±2.7679 · 10−3 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 00000 00000 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0 0 −13 −6 −12 −6 0 3.0652 · 10 ±4.3756 · 10 3.145 · 10 ±4.7619 · 10 0 −4.235 · 10−13 ±5.6642 · 10−6 −3.818 · 10−12 ±6.2896 · 10−6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1.5112 · 10−16 ±1.7972 · 10−6 2.3882 · 10−16 ±4.0818 · 10−7 9.2357 · 10−17 ±6.2096 · 10−7 −17 −6 −16 −7 −16 −7 −3.0947 · 10 ±1.8711 · 10 −2.6407 · 10 ±4.2497 · 10 −1.393 · 10 ±6.4651 · 10 −16 −7 −16 −7 −16 −7 4.2506 · 10 ±6.4639 · 10 −2.4822 · 10 ±1.4681 · 10 −2.3711 · 10 ±2.2334 · 10 −16 −7 −16 −7 −16 −7 −2.9196 · 10 ±7.3733 · 10 3.8433 · 10 ±1.6747 · 10 2.5761 · 10 ±2.5477 · 10 0 0 0 0 0 1 0 0 0 0 −3 0.88959±1.0295 · 10−3 0.53515±1.9335 · 10 −0.36626±2.7152 · 10−3 0 0 0 0 0 7.B Example data 223 Table 7.3: The decomposed pair in the proposed canonical form. 
1.171±6.99 · 10−2 −0.5674±2.231 · 10−2 0.1013±1.708 · 10−2 1.055±1.25 · 10−2 0.548±7.038 · 10−2 0.6351±2.005 · 10−2 −0.8142±1.576 · 10−2 0.3438±1.136 · 10−2 −0.1653±6.651 · 10−2 1.548±6.788 · 10−2 0.498±6.367 · 10−2 0.8838±2.512 · 10−2 1.2±1.842 · 10−2 −0.4501±1.359 · 10−2 −1.543±7.037 · 10−2 0.4004±6.809 · 10−2 −7.146 · 10−2 ±3.18 · 10−2 0.1257±2.203 · 10−2 0.2927±1.646 · 10−2 −6.563 · 10−2 ±8.163 · 10−2 −5.113 · 10−2 ±6.54 · 10−2 6.645 · 10−2 ±4.252 · 10−2 4.574 · 10−2 ±2.834 · 10−2 0.4459±2.138 · 10−2 0.3135±9.378 · 10−2 0.299±4.122 · 10−2 −0.4707±2.686 · 10−2 0.1419±1.845 · 10−2 0.2915±1.425 · 10−2 −0.5707±5.858 · 10−2 −0.3309±0.1286 0.7618±3.414 · 10−2 0.6939±2.721 · 10−2 0.1694±1.981 · 10−2 −7.719 · 10−2 ±0.1211 −6.025 · 10−3 ±6.027 · 10−2 −0.6369±2.08 · 10−2 0.9285±1.597 · 10−2 0.5135±1.189 · 10−2 −1.363±6.252 · 10−2 −0.1327±7.339 · 10−2 1.085±3.024 · 10−2 −0.4332±2.179 · 10−2 −1.289±1.608 · 10−2 −2.617 · 10−2 ±8.285 · 10−2 0.7529±4.42 · 10−3 0.3395±1.008 · 10−3 5.56 · 10−2 ±8.774 · 10−4 −0.457±5.882 · 10−4 −0.5196±3.921 · 10−3 5.129 · 10−2 ±3.431 · 10−3 −0.3913±8.262 · 10−4 −0.6202±7.223 · 10−4 5.836 · 10−2 ±4.827 · 10−4 1.088±3.102 · 10−3 −0.2205±6.36 · 10−4 −0.6626±4.462 · 10−3 0.5898±4.859 · 10−3 0.4885±1.194 · 10−3 0.1507±9.969 · 10−4 5.273 · 10−2 ±7.656 · 10−3 0.2873±1.82 · 10−3 −0.146±1.524 · 10−3 0.5952±9.755 · 10−4 −0.1544±6.941 · 10−3 0.7123±1.083 · 10−2 0.6824±2.659 · 10−3 6.267 · 10−2 ±2.172 · 10−3 0.2421±1.34 · 10−3 −0.5741±9.992 · 10−3 0.6296±5.721 · 10−3 −0.1869±1.442 · 10−3 0.8699±1.195 · 10−3 −0.5955±7.426 · 10−4 9.178 · 10−2 ±5.344 · 10−3 0.642±7.147 · 10−3 −0.3319±1.637 · 10−3 0.6418±1.499 · 10−3 −0.6209±1.062 · 10−3 · 10−3 8.108 · 10−2 ±3.986 · 10−3 0.7407±9.501 · 10−4 9.954 · 10−2 ±8.377 · 10−4 −4.767 · 10−3 ±5.591 · 10−4 −0.1123±6.284 −1.428±3.596 · 10−3 0.3742±6.36 · 10−3 0.1085±1.555 · 10−3 −0.4166±1.292 · 10−3 2.564 · 10−2 ±8.148 · 10−4 0.3367±5.836 · 10−3 ··· −2 −2 −0.209±8.443 · 10 −1.384±4.042 · 10 0.2263±1.645 · 10−2 0.5792±5.876 
· 10−2 −2 −2 −2 −2 0.2955±8.149 · 10 1.675±3.565 · 10 0.1552±1.418 · 10 0.3123±5.735 · 10 0.2186±7.705 · 10−2 2.489 · 10−2 ±4.706 · 10−2 0.6209±1.947 · 10−2 −0.2509±5.317 · 10−2 −2 −2 −2 −2 0.4631±8.31 · 10 0.4445±6.101 · 10 −0.8601±2.569 · 10 −0.2953±5.555 · 10 −2 −2 −2 −2 −0.5539±8.098 · 10 −0.4044±8.413 · 10 0.6138±3.618 · 10 0.7162±5.261 · 10 −2 −2 −2 −2 −2 −2 5.917 · 10 ±5.139 · 10 0.1073±5.293 · 10 0.1068±2.26 · 10 −3.191 · 10 ±3.39 · 10 −2 −2 −2 −2 −0.7586±0.154 −5.371 · 10 ±5.87 · 10 −0.8441±2.269 · 10 7.675 · 10 ±0.1076 −0.8265±7.291 · 10−2 −0.3148±3.756 · 10−2 0.3675±1.509 · 10−2 −0.237±5.086 · 10−2 0.1884±8.888 · 10−2 −0.724±5.711 · 10−2 −6.014 · 10−2 ±2.386 · 10−2 1.176±6.08 · 10−2 −0.102±5.491 · 10−3 4.029 · 10−2 ±1.863 · 10−3 2.235 · 10−2 ±8.336 · 10−4 −0.4877±3.916 · 10−3 −0.2865±4.26 · 10−3 −1.242±1.536 · 10−3 −0.1302±6.852 · 10−4 0.1593±3.059 · 10−3 −0.1389±6.07 · 10−3 0.3284±2.296 · 10−3 0.3263±1.03 · 10−3 −3.025 · 10−4 ±4.316 · 10−3 0.9214±9.599 · 10−3 −0.7305±3.488 · 10−3 −4.964 · 10−2 ±1.591 · 10−3 0.5177±6.789 · 10−3 0.3611±1.363 · 10−2 −0.2837±5.204 · 10−3 0.5257±2.361 · 10−3 0.8435±9.615 · 10−3 −0.1887±7.11 · 10−3 0.5811±2.797 · 10−3 0.3857±1.189 · 10−3 0.8764±5.113 · 10−3 1.023±8.84 · 10−3 −0.2421±2.904 · 10−3 −0.781±1.287 · 10−3 −0.5841±6.386 · 10−3 −3 −3 −2 −4 −3 0.1681±4.911 · 10 0.9353±1.741 · 10 2.638 · 10 ±7.367 · 10 −0.8943±3.56 · 10 −0.2908±7.978 · 10−3 −0.73±3.009 · 10−3 0.2014±1.371 · 10−3 0.2504±5.647 · 10−3 224 7 lti ode of nominal index 2 Table 7.4: Reconstruction of the original pair by applying the reverse transformations to the pair in its canonical form. Transformations are collapsed to just one matrix on each side of the pair, before the pair itself is transformed. 
1.171±1.161 −0.5674±0.1622 0.1013±0.1536 1.055±0.1129 0.548±0.9027 1.548±2.023 0.6351±0.2839 −0.8141±0.2683 0.3438±0.1969 −0.1653±1.574 0.4979±2.43 0.8838±0.3404 1.2±0.3215 −0.4501±0.2361 −1.543±1.889 0.4004±1.821 −7.145 · 10−2 ±0.2545 0.1257±0.2403 0.2927±0.1768 −6.566 · 10−2 ±1.415 −5.1 · 10−2 ±3.839 6.641 · 10−2 ±0.5376 4.574 · 10−2 ±0.5076 0.4459±0.373 0.3135±2.985 0.2989±2.961 −0.4706±0.4161 0.1419±0.3929 0.2915±0.2882 −0.5706±2.304 −0.3309±2.288 0.7618±0.3214 0.6939±0.3032 0.1694±0.2226 −7.721 · 10−2 ±1.779 −5.987 · 10−3 ±1.566 −0.6369±0.2184 0.9285±0.2065 0.5135±0.1523 −1.363±1.217 −0.1328±1.501 1.085±0.2107 −0.4332±0.1989 −1.289±0.146 −2.61 · 10−2 ±1.168 0.7529±2.272 · 10−2 0.3395±3.392 · 10−3 5.561 · 10−2 ±3.576 · 10−3 −0.457±2.39 · 10−3 −0.5196±1.793 · 10−2 1.088±3.035 · 10−2 5.129 · 10−2 ±3.855 · 10−2 −0.3913±5.704 · 10−3 −0.6202±5.892 · 10−3 5.836 · 10−2 ±4.016 · 10−3 0.5898±4.695 · 10−2 0.4885±6.988 · 10−3 0.1507±7.263 · 10−3 −0.2205±4.909 · 10−3 −0.6626±3.702 · 10−2 5.272 · 10−2 ±3.834 · 10−2 0.2873±5.695 · 10−3 −0.146±5.991 · 10−3 0.5952±4.008 · 10−3 −0.1544±3.024 · 10−2 0.7123±7.614 · 10−2 −2 6.267 · 10−2 ±1.197 · 10−2 −3 0.6824±1.14 · 10 0.2421±7.988 · 10 −0.5741±6.013 · 10−2 0.6296±5.635 · 10−2 −0.1869±8.424 · 10−3 0.8699±8.747 · 10−3 −0.5955±5.886 · 10−3 9.178 · 10−2 ±4.447 · 10−2 0.642±4.694 · 10−2 −0.3319±7.059 · 10−3 0.6418±7.31 · 10−3 −0.6209±4.972 · 10−3 −0.1123±3.707 · 10−2 8.108 · 10−2 ±2.801 · 10−2 0.7407±4.207 · 10−3 9.954 · 10−2 ±4.454 · 10−3 −4.767 · 10−3 ±2.946 · 10−3 −1.428±2.214 · 10−2 0.3742±3.396 · 10−2 0.1085±5.004 · 10−3 −0.4166±5.186 · 10−3 2.564 · 10−2 ±3.529 · 10−3 0.3367±2.673 · 10−2 ··· −0.2089±1.369 −1.384±0.2295 0.2263±7.882 · 10−2 0.5791±0.9775 0.2955±2.385 1.675±0.4023 0.1552±0.1375 0.3123±1.704 0.2185±2.864 2.49 · 10−2 ±0.4825 0.6209±0.1649 −0.2509±2.046 0.4631±2.146 0.4445±0.3601 −0.8601±0.1232 −0.2952±1.532 −0.5537±4.525 −0.4043±0.7619 0.6138±0.2604 0.7161±3.232 5.913 · 10−2 ±3.49 0.1072±0.59 0.1068±0.2016 −3.191 · 
10−2 ±2.494 −0.7586±2.697 −5.372 · 10−2 ±0.455 −0.8441±0.155 7.68 · 10−2 ±1.927 −0.8264±1.847 −0.3147±0.3083 0.3674±0.1056 −0.2371±1.317 0.1883±1.77 −0.724±0.2986 −6.014 · 10−2 ±0.1019 1.176±1.265 −0.102±2.702 · 10−2 4.029 · 10−2 ±5.151 · 10−3 2.235 · 10−2 ±2.046 · 10−3 −0.4877±1.993 · 10−2 −0.2865±4.578 · 10−2 −1.242±8.507 · 10−3 −0.1302±3.247 · 10−3 0.1593±3.35 · 10−2 −0.1389±5.579 · 10−2 0.3284±1.049 · 10−2 0.3263±4.045 · 10−3 −3.032 · 10−4 ±4.094 · 10−2 0.9214±4.56 · 10−2 −0.7305±8.636 · 10−3 −4.964 · 10−2 ±3.421 · 10−3 0.5177±3.358 · 10−2 0.3611±9.055 · 10−2 −0.2837±1.726 · 10−2 0.5257±6.771 · 10−3 0.8435±6.671 · 10−2 −0.1887±6.696 · 10−2 0.5811±1.269 · 10−2 0.3857±4.833 · 10−3 0.8764±4.916 · 10−2 1.023±5.579 · 10−2 −0.2421±1.048 · 10−2 −0.781±3.975 · 10−3 −0.5841±4.088 · 10−2 −2 −3 −2 −3 −2 0.1681±3.334 · 10 0.9353±6.413 · 10 2.638 · 10 ±2.573 · 10 −0.8943±2.465 · 10 −2 −3 −3 −2 −0.2908±4.034 · 10 −0.73±7.492 · 10 0.2014±2.893 · 10 0.2504±2.954 · 10 7.B Example data 225 Table 7.5: Reconstruction of the original pair by applying the reverse transformations to the pair in its canonical form. Transformations are applied one by one. 8 LTV ODE of nominal index 1 In the previous chapter, we explored some of the difficulties in generalizing the results for lti systems of nominal index 1 to nominal index 2. In this chapter we take on another generalization of the lti nominal index 1 results, namely that to timevarying systems. In view of the failure to produce a general convergence result for the lti systems of nominal index 2, treating ltv systems of nominal index 1 is the best we can hope for. Unlike chapter 6, the technicalities of dealing with non-zero pointwise indicies are avoided in the current chapter (compare with section 6.8). The idea to use a fixed-point theorem to prove the existence of decoupling transforms for ltv systems appears in Chang (1969, 1972). When it appears again in Kokotović et al. 
(1986, section 5.2), it has been modified slightly, and we shall remark on the difference in due time.

The chapter is organized as follows. Section 8.1 prepares the analysis of systems with timescale separation by considering systems where only the fast time scale is present. For ltv dae of nominal index 1, the first steps of analysis (corresponding to section 6.1 for lti systems) lead to the linear time-varying matrix-valued singular perturbation form

x′(t) + A11(t) x(t) + A12(t) z(t) = 0
E(t) z′(t) + A21(t) x(t) + A22(t) z(t) = 0

The decoupling of these equations into slow and uncertain subsystems is the topic of section 8.2, and section 8.3 contains some remarks on the difference compared to the scalar perturbation case. In section 8.4 the results of the previous sections are summarized in a theorem for ltv dae of nominal index 1. Section 8.5 concludes the chapter.

8.1 Slowly varying systems

Here, the results in Kokotović et al. (1986, section 5.2) are generalized to matrix-valued singular perturbations. The form of equations to be analyzed is

E(t) z′(t) + A(t) z(t) = 0    (8.1)

where E(t) is an unknown square matrix, which is at least assumed non-singular and with a known bound on the entries, max(E(t)) ≤ m. (For comparison, the uncertainty has the form E(t) = ε I in Kokotović et al. (1986).) Our interest is restricted to systems whose time-invariant approximations at each time instant are stable, as formalized by the following assumption about the eigenvalues λ of the pair ( E(t), A(t) ) for a fixed t:

A1–[8.1] Assumption. Assume there exist constants R0 > 0, φ0 ∈ [ 0, π/2 ), and ā > sup_t max(A(t)) such that

|λ| m < ā    and    |λ| > R0 ⟹ |arg(−λ)| ≤ φ0    (A1–8.2)

where ā presents a trade-off between generality of the assumption and the quantitative properties of the forthcoming convergence results. We refer to section 6.5, A1–[6.14], and lemma 6.18 for illustration and discussion of this assumption.
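Purely as an illustration (the following code and its data are hypothetical and not part of the thesis), the eigenvalue condition of A1–[8.1] can be checked numerically for a frozen time instant, using that the eigenvalues of the pair ( E, A ) are the eigenvalues of −E⁻¹A when E is non-singular:

```python
import numpy as np

def check_eigenvalue_condition(E, A, R0, phi0):
    # Eigenvalues lambda of the pair (E, A) solve det(lambda E + A) = 0,
    # i.e. they are the eigenvalues of -E^{-1} A for non-singular E.
    lam = np.linalg.eigvals(-np.linalg.solve(E, A))
    big = lam[np.abs(lam) > R0]
    # Eigenvalues of large modulus must lie in a sector around the
    # negative real axis with half-angle phi0.
    return bool(np.all(np.abs(np.angle(-big)) <= phi0))

# A stable pair: diagonal E and A with positive entries give real,
# negative eigenvalues, so the condition holds for any phi0 >= 0.
E_ok = np.diag([1e-3, 2e-3, 5e-4])
A_ok = np.diag([1.0, 2.0, 0.5])
print(check_eigenvalue_condition(E_ok, A_ok, R0=10.0, phi0=np.pi / 3))   # True

# Flipping the sign of one entry of A moves an eigenvalue to the right
# half-plane, violating the sector condition.
A_bad = np.diag([1.0, -2.0, 0.5])
print(check_eigenvalue_condition(E_ok, A_bad, R0=10.0, phi0=np.pi / 3))  # False
```

In experiments one would apply such a check to each sampled realization of the uncertain pair.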
The method used in experiments to produce time-varying perturbations in agreement with A1–[8.1] is described in appendix A. Two more constants are introduced to specify properties of A.

P1–[8.2] Property. The constant c2 < ∞ shall be chosen so that
‖A‖_I ≤ c2    (P1–8.3)

P2–[8.3] Property. The constant c3 < ∞ shall be chosen so that
‖A⁻¹‖_I ≤ c3    (P2–8.4)

The bound on ‖A‖_I is used also in Kokotović et al. (1986), while the bound on ‖A⁻¹‖_I is a consequence of the need to deal with the matrix-valued uncertainty E instead of a scalar.

8.4 Remark. To depend on a bound on ‖A⁻¹‖_I is actually very natural for the present setup. Since a bound on ‖A‖_I is needed, it is realized that the smaller this bound is, the stronger the conclusions regarding convergence will be. Further, any convergence result should be such that a smaller value of the bound m on E(t) also leads to stronger conclusions. Hence, if there were no need to bound ‖A⁻¹‖_I, scaling both E and A by some positive factor less than 1 would yield stronger results! This is clearly contradictory, and we may consider bounding ‖A⁻¹‖_I as a way of fixing the scaling of the problem so that an absolute interpretation of m becomes meaningful.

8.5 Remark. That P2–[8.3] should relate to the scaling of the problem is also well in agreement with example 6.28, where the scaling of the problem was fixed by inverting the trailing matrix. Then, a bounded ‖A(t)⁻¹‖₂ ensures that max( A(t)⁻¹ E(t) ) will still be O( m ). In view of the possibility to invert the trailing matrix, and in view of the success of this approach in example 6.28, it would also make sense to use E(t) z′(t) + z(t) = 0 as a starting point instead of (8.1). On the other hand, the inversion of the trailing matrix will generally introduce additional uncertainty in the problem, and it is therefore of value to not assume that this has been done beforehand.
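The observation in remark 8.5 can be illustrated numerically (with hypothetical data, not taken from the thesis): for a frozen time instant, the max-entry of A⁻¹E is bounded by ‖A⁻¹‖₂ ‖E‖₂ ≤ ‖A⁻¹‖₂ · n · m, and hence stays proportional to m.

```python
import numpy as np

# Frozen-time sketch of remark 8.5: inverting the trailing matrix turns
# E z' + A z = 0 into (A^{-1} E) z' + z = 0, and a bounded ||A^{-1}||_2
# keeps the new leading matrix of size O(m).
rng = np.random.default_rng(3)
n = 4
A = np.eye(n) + 0.2 * rng.standard_normal((n, n))  # trailing matrix
c = np.linalg.norm(np.linalg.inv(A), 2)            # bound on ||A^{-1}||_2
ok = []
for m in [1e-2, 1e-4, 1e-6]:
    Em = m * rng.uniform(-1.0, 1.0, (n, n))        # entries bounded by m
    # max-entry of A^{-1} E is at most ||A^{-1}||_2 * ||E||_2 <= c * n * m
    ok.append(np.abs(np.linalg.solve(A, Em)).max() <= c * n * m)
print(all(ok))
```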
The assumption A1–[8.1] is recognized as (A1–6.26) in chapter 6, to which we refer for illustration and discussion of this condition. The following results are readily extracted from corollary 6.19 in the same chapter.

8.6 Lemma. Under A1–[8.1], there is a constant k1 such that
m ‖E(t)⁻¹ A(t)‖₂ ≤ k1    (8.5)

Proof: This is readily extracted from the proof of corollary 6.19.

Since
λmin( −E(t)⁻¹ A(t) ) ≥ ‖A(t)⁻¹ E(t)‖₂⁻¹ ≥ ( ‖A(t)⁻¹‖₂ ‖E(t)‖₂ )⁻¹ ≥ c3⁻¹ n⁻¹ m⁻¹
(where n is the dimension of (8.1)) we will only consider
m ≤ R0⁻¹ c3⁻¹ n⁻¹    (8.6)
in the rest of the chapter, so that the bound in A1–[8.1] on the argument of the uncertain eigenvalues applies.

8.7 Lemma. Assume A1–[8.1] and take m according to (8.6). Then there exist constants K∗ and a∗ > 0 such that for all θ ≥ 0,
‖ e^( −E(t)⁻¹ A(t) θ ) ‖₂ ≤ K∗ e^( −a∗ θ )    (8.7)

Proof: As in corollary 6.19, we use that A1–[8.1] together with (8.6) implies that there exists a constant a∗ > 0 such that
α( −E(t)⁻¹ A(t) ) < −(1/2) m⁻¹ c3⁻¹ n⁻¹ cos( φ0 ) < 0
where the right-hand side magnitude is at least a∗. Hence (using lemma 8.6) the ratio ‖−E(t)⁻¹ A(t)‖₂ / ( −α( −E(t)⁻¹ A(t) ) ) is also bounded by a constant independent of t, and the bound (8.7) is now a consequence of theorem 2.27.

8.8 Corollary. Assuming A1–[8.1], using P2–[8.3], and taking m according to lemma 8.7, there is a bound on m ‖E⁻¹‖_I.

Proof: m ‖E⁻¹‖_I = ‖ m E⁻¹ A A⁻¹ ‖_I ≤ ‖ m E⁻¹ A ‖_I ‖ A⁻¹ ‖_I ≤ c3 k1

The most important thing in lemma 8.7 is that the exponential decay rate (with respect to θ) is independent of t. From here, it would be possible to derive results along a path parallel to that taken in Kokotović et al. (1986), but instead of going along a parallel track we shall build upon those results.

Since it has been assumed that E(t) is invertible, the system (8.1) can be written in ode form. Scaling the equation by m, the ode reads
m z′(t) = −m E(t)⁻¹ A(t) z(t)    (8.8)
which is reminiscent of the standard singular perturbation setup, although it contains the matrix-valued uncertainty E(t) on the right-hand side.
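The decay bound of lemma 8.7 can be illustrated for one frozen pair (the data below are hypothetical, and the code is an illustration only, not thesis material): the decay rate a is obtained from the spectral abscissa of −E⁻¹A, and the grid-wise smallest admissible K is then at least 1, since the matrix exponential equals the identity at θ = 0.

```python
import numpy as np
from scipy.linalg import expm

# Frozen-time illustration of || exp(-E^{-1} A theta) ||_2 <= K exp(-a theta).
E = np.diag([2e-3, 1e-3, 5e-4])             # small uncertain leading matrix
A = np.array([[1.0, 0.2, 0.0],
              [0.0, 1.5, 0.1],
              [0.0, 0.0, 0.8]])             # stable trailing matrix
M = -np.linalg.solve(E, A)                  # -E^{-1} A; eigenvalues deep in LHP
a = -max(np.linalg.eigvals(M).real)         # candidate decay rate a* > 0
thetas = np.linspace(0.0, 0.02, 41)
norms = [np.linalg.norm(expm(M * th), 2) for th in thetas]
# Smallest K making the bound hold on this grid; K >= 1 since expm(0) = I.
K = max(nrm * np.exp(a * th) for nrm, th in zip(norms, thetas))
print(a > 0.0 and K >= 1.0)
```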
However, not much has to be known about the right-hand side in order to apply the results in Kokotović et al. (1986), and the assumptions 2.1 and 2.2 made there have already been treated in the current context as part of the proof of lemma 8.7. We now make an additional assumption regarding the time-variability, corresponding to assumption 2.3 in Kokotović et al. (1986).

A2–[8.9] Assumption. Assume
‖ (d/dt)( m E(t)⁻¹ A(t) ) ‖_I ≤ β1    (A2–8.9)
for some constant β1.

This assumption involves the time variability of E(t), and may seem hard to justify in applications. However, in seminumerical approaches to index reduction in dae, the uncertainty E(t) may be a symbolic expression which one is unable (or unwilling to try) to reduce to zero, and then it may be possible to compute a true bound on the time variability of E(t). By writing
( m E⁻¹ A )′ = ( m E⁻¹ A ) A⁻¹ A′ − ( m E⁻¹ A ) A⁻¹ ( m⁻¹ E′ ) ( m E⁻¹ A )
it is seen that (using lemma 8.6)
‖ (d/dt)( m E(t)⁻¹ A(t) ) ‖₂ ≤ k1 c3 ( ‖A′(t)‖₂ + k1 m⁻¹ ‖E′(t)‖₂ )
It follows that the following two conditions may be a useful alternative to A2–[8.9].

P3–[8.10] Property. The constant β2 < ∞ shall be chosen so that
‖A′‖_I ≤ β2    (P3–8.10)

A3–[8.11] Assumption. Assume
‖E′‖_I ≤ m β3    (A3–8.11)
for some constant β3.

The second of these should be interpreted as a requirement that the bound on ‖E′‖_I scales with the bound on ‖E‖_I, which should be reasonable in many situations.

8.12 Lemma. Given a consistent choice of eigenvalue conditions, selecting β3 ≥ β2 / c2 in A3–[8.11] is sufficient to ensure the existence of a perturbation E.

Proof: It suffices to note that the instantiation E(t) = ( m / c2 ) A(t) satisfies (for all t) both max(E(t)) ≤ m and max(E′(t)) = ( m / c2 ) max(A′(t)) ≤ m β2 / c2.

The assumptions made so far allow us, according to lemma 2.29, to approximate the solution to the time-varying system (8.1) by a time-invariant system, as m → 0.
The lemma would have given a rather detailed account of the convergence if −m E(s)⁻¹ A(s) had been known — in the usual singular perturbation setup m E(s)⁻¹ = I and A is assumed to be a known slowly varying matrix — but here the presence of the matrix-valued uncertainty E(s) implies that convergence to zero is the only kind of convergence we can hope for. Since t > s implies
‖ e^( −m E(s)⁻¹ A(s) ( t−s )/m ) ‖₂ → 0,  as m → 0
according to theorem 2.27, pointwise convergence of φ( t, s ) for t ≥ s is established (here, we used φ( s, s ) = I).

However, we shall also include another proof of this fact without taking the detour via lemma 2.29. Let P(t) be the solution to the time-invariant Lyapunov equation (2.44) with M substituted by −m E(t)⁻¹ A(t) (so that z′ = (1/m) M z). Then V( t, z ) = zᵀ P(t) z is a time-dependent Lyapunov function candidate. Since
(d/dt) V( t, z(t) ) = (1/m) z(t)ᵀ ( M(t)ᵀ P(t) + P(t) M(t) ) z(t) + z(t)ᵀ P′(t) z(t)
  ≤ −( 1/m − ‖P′(t)‖₂ ) |z(t)|²
this can be made negative by taking m sufficiently small if there is a bound on ‖P′‖_I. In addition to such a bound, a bound on ‖P‖_I will allow V( 0, z(0) ) to be bounded (in relation to z(0), of course, and for this we only need a bound on ‖P(0)‖₂), and a bound on ‖P⁻¹‖_I will show that |z(t)| is bounded by a decreasing function. The upper bound on ‖P(0)‖₂ is readily obtained by use of (8.7) in the formal solution (2.45) to the Lyapunov equation. The corresponding lower bound is established by theorem 2.31. To bound P′(t), the time-dependent Lyapunov equation may be differentiated with respect to t, which yields a new Lyapunov equation in P′(t), whose formal solution turns out to be bounded by 2 β1 times the square of the bound for ‖P‖_I. These results are applied in the next lemma.

8.13 Lemma.
Under P1–[8.2], P2–[8.3], and A2–[8.9], the time-varying system (8.1) is uniformly γw e^( −( λ̃w(m)/m ) • )-stable, with

γw = K∗ √( c2 / a∗ )
λ̃w( m ) = c2 ( 1 − m / ( 2 a∗² / ( β1 K∗⁴ ) ) )    (8.12)

Proof: Inserting the bounds on ‖P‖_I (upper and lower) and ‖P′‖_I (upper), the convergence can be stated via the coupled system in |z(t)| and V(t) = V( t, z(t) ):

|z(t)| ≤ √( 2 c2 V(t) )
V(0) ≤ K∗² |z(0)|² / ( 2 a∗ )
V′(t) ≤ −(1/m) ( 1 − m / ( 2 a∗² / ( β1 K∗⁴ ) ) ) |z(t)|²

Solving this system with equalities everywhere will give upper bounds as functions of t. With V̄(t) being the upper bound for V(t), one obtains

V(t) ≤ V̄(t) = V(0) e^( −( 2 c2 / m ) ( 1 − m / ( 2 a∗² / ( β1 K∗⁴ ) ) ) t )

and it just remains to take a square root.

8.14 Corollary. Under the assumptions of lemma 8.13, bounding m by a constant less than 2 a∗² / ( β1 K∗⁴ ) makes the system (8.1) uniformly γw e^( −( λw/m ) • )-stable, where λw > 0 is independent of m.

Proof: Follows immediately from lemma 8.13.

8.2 Time-varying systems with timescale separation

In the last section, the main tool was to use bounds for −m E(t)⁻¹ A(t), which led to immediate application of previous results valid for scalar perturbations. In this section we shall study the decoupling transform in the presence of a matrix-valued uncertainty. For scalar perturbations in time-varying systems, a common technique is to study series expansions in the (scalar) perturbation variable (Naidu, 2002). However, the technique is demanding to generalize, since expanding a multivariable function can easily result in an overwhelming amount of bookkeeping. (The choice of notation for the parameters in the results of the previous section is motivated by the context where they are applied in this section.)

For the system
x′(t) + A11(t) x(t) + A12(t) z(t) = 0    (8.13x)
E(t) z′(t) + A21(t) x(t) + A22(t) z(t) = 0    (8.13z)
our goal is to study the decoupling transform, that is, the change(s) of variables that isolate the fast and uncertain dynamics from the slow dynamics which we wish to approximate. In a fashion similar to Kokotović et al. (1986), the existence of these transforms will be established constructively so that their approximation properties for small m are made visible, where m now is the upper bound on E in (8.13) rather than in (8.1).

8.2.1 Overview

The first decoupling step serves to eliminate x from (8.13z). Making the change of variables (compare (6.15) in the time-invariant case)
z(t) = L(t) x(t) + η(t)    (8.14)
and eliminating x′(t) from (8.13z) by row operations, one obtains
x′ + ( A11 + A12 L ) x + A12 η = 0    (8.15x)
E η′ + N1 x + ( A22 − E L A12 ) η = 0    (8.15η)
where N1 is an expression that is to be eliminated by the choice of L. Equating N1 with 0 gives
E L′ = −A21 − A22 L + E L ( A11 + A12 L )    (8.16)
Assuming that this equation is solved by the choice of L (we will have to ensure that the solution can be approximated well for sufficiently small m, even though E is unknown), we can proceed to the second decoupling step: elimination of η from (8.15x) by the change of variables
x(t) = ξ(t) + m H(t) η(t)    (8.17)
Making the change of variables and eliminating η′(t) from (8.15x) by row operations, one obtains
ξ′ + ( A11 + A12 L ) ξ + N2 η = 0    (8.18ξ)
E η′ + ( A22 − E L A12 ) η = 0    (8.18η)
where N2 is to be eliminated by the choice of H. Equating N2 with 0 gives
m H′ = m H ( E⁻¹ A22 − L A12 ) − A12 − m ( A11 + A12 L ) H    (8.19)
Until this point, the steps taken to derive the decoupling transform have been almost identical to those in Kokotović et al. (1986, section 5.3), but as they make a series expansion of L and H in the scalar perturbation parameter (with the coefficients being functions of time), the close similarity must end here.
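In the time-invariant special case L′ = 0, the condition (8.16) becomes an algebraic equation, which can be solved numerically by fixed-point iteration. The following sketch uses hypothetical data (not from the thesis), chosen small enough that the iteration is a strong contraction:

```python
import numpy as np

# Time-invariant case of (8.16): with L' = 0 it reduces to
#   0 = -A21 - A22 L + E L (A11 + A12 L),
# solved by fixed-point iteration from the nominal L0 = -A22^{-1} A21.
A11 = np.array([[-1.0, 0.2], [0.0, -0.5]])
A12 = np.array([[0.3, 0.0], [0.1, 0.2]])
A21 = np.array([[0.2, 0.1], [0.0, 0.3]])
A22 = np.array([[1.0, 0.1], [0.0, 1.2]])
E = 1e-3 * np.array([[1.0, 0.4], [0.2, 0.8]])   # small uncertain E
L = -np.linalg.solve(A22, A21)                   # nominal part L0
for _ in range(50):
    L = np.linalg.solve(A22, -A21 + E @ L @ (A11 + A12 @ L))
residual = -A21 - A22 @ L + E @ L @ (A11 + A12 @ L)
print(np.linalg.norm(residual) < 1e-12)
```

Since ‖E‖ is small, each iteration contracts by a factor of order ‖A22⁻¹‖ ‖E‖, so the residual reaches machine precision quickly.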
To make use of the results in the previous section we need to replace P2–[8.3]. It would be an unnecessary restriction to require that the slow dynamics of the coupled system must not have eigenvalues at the origin, and the property we need is another one, stated next.

P4–[8.15] Property. The constant c3 < ∞ shall be chosen so that
‖A22⁻¹‖_I ≤ c3    (P4–8.20)

8.2.2 Eliminating slow variables from uncertain dynamics

Since we are interested in convergence as m → 0, rather than forming an expansion of L (and soon H) in E, we write L in the form
L(t) = m^N RL(t) + Σ_{j=0}^{N−1} m^j Lj(t)    (8.21)
where, in order for the expansion to be meaningful, we must ensure that RL is bounded independently of m, and that each Lj can be approximated sufficiently well independently of m. In this work, the main focus is to prove convergence, and it suffices to consider N = 1 — larger values of N are of interest when a more accurate solution to the original equations is sought. In general, it will only be possible to show that the expansion is valid for sufficiently small m, but it is of interest to find an upper bound on m where validity is known.

To find equations for RL and the Lj, (8.21) is used in (8.16), occurrences of E are rewritten as m ( m⁻¹ E ), and equal powers in m (considering m⁻¹ E as one unit) are identified. For N = 1, (8.16) is written
m (m⁻¹E) ( m RL′ + L0′ ) = −A21 − A22 [ m RL + L0 ] + m (m⁻¹E) [ m RL + L0 ] ( A11 + A12 [ m RL + L0 ] )
Gathering the m⁰ terms and collecting what remains, the following two equations are obtained
0 = A21 + A22 L0    (8.22a)
m RL′ + L0′ = −m E⁻¹ A22 RL + [ m RL + L0 ] ( A11 + A12 [ m RL + L0 ] )
  = −m ( E⁻¹ A22 − L0 A12 ) RL + L0 ( A11 + A12 L0 ) + m RL ( A11 + A12 ( m RL + L0 ) )    (8.22b)
where the term L0 ( A11 + A12 L0 ) is denoted f1, and the term RL ( A11 + A12 ( m RL + L0 ) ) is denoted f2( RL ). (Unlike the previous chapters on ltv systems, we here write Lj, with the index as a subscript, to denote the different functions in the expansion. This notation allows the usual notation for the differentiated function to be used conveniently, and unlike the nominal index 2 case, there is no partitioning of L with blocks to be referred to using subscripts.)

By this identification, L0(t) = −A22(t)⁻¹ A21(t) is completely independent of E, and assuming A21 and A22 are bounded with bounded derivatives, P4–[8.15] implies that both L0(t) and L0′(t) can be bounded independently of t. It follows that there exists a c4 such that
‖ −L0′ + f1 ‖_I ≤ c4    (8.23)

It remains to establish boundedness of RL, and in the spirit of Kokotović et al. (1986) and section 7.A.1 we do so by a contraction mapping argument. We shall define an operator S to act on RL ∈ L = { RL : ‖RL‖_I < ρL } (ρL will be chosen later) such that
• S RL = RL implies that RL solves (8.22b).
• S maps L into itself if ρL is chosen small enough.
• ‖ S RL,1 − S RL,2 ‖_I < c ‖ RL,1 − RL,2 ‖_I for some c < 1.
When these conditions hold, it follows that the fixed-point equation S RL = RL has a unique solution in L, hence establishing boundedness of the solution to (8.22b).

We now introduce the approximation of the η subsystem (8.18η) obtained by replacing the decoupling function L by all terms in the expansion (8.21) except for the so far unknown rest term m RL,
E w′ + ( A22 − E L0 A12 ) w = 0    (8.24)
It is for this system that the assumption A1–[8.1] will be made in the current context. We must remark that this is not quite satisfying, since (8.24) has not yet been related to the features of the system modeled by the equations.
We shall discuss this choice briefly soon hereafter, and then discuss the issue in more detail in section 8.A.

We now check P1–[8.2], P2–[8.3], and P3–[8.10] for the trailing matrix of (8.24), in place of the A of section 8.1, as follows.
• P2–[8.3] may be checked by first bounding ‖A22⁻¹‖_I and ‖L0 A12‖_I, and then using corollary 2.47. Less restrictive bounds on m can be obtained by considering the bound as a function of time.
• P1–[8.2] is checked directly by inspection of A22, L0 A12 = −A22⁻¹ A21 A12, and any of the bounds imposed on m (for instance the one used to check P2–[8.3]).
• P3–[8.10] is checked using A3–[8.11] and inspection of A22′ and the time derivative of L0 A12.

Applying corollary 8.14 to (8.24) instead of (8.1) now gives that, for sufficiently small m, (8.24) is uniformly γw e^( −( λw/m ) • )-stable for some constants γw and λw > 0. Let φ denote the transition matrix of (8.24), so that taking m sufficiently small gives
φ( t, t ) = I
φ( •, τ )′(t) = −( E(t)⁻¹ A22(t) − L0(t) A12(t) ) φ( t, τ )
φ( τ, • )′(t) = φ( τ, t ) ( E(t)⁻¹ A22(t) − L0(t) A12(t) )
‖ φ( t, τ ) ‖₂ ≤ γw e^( −( λw/m )( t−τ ) )

Regarding the choice of system associated with φ, Kokotović et al. (1986) differs from the original Chang (1969), and our choice is a compromise. We prefer not to follow Chang (1969), since that would require the properties of (8.18η), where L appears, to be checked in the process of determining L. On the other hand, we prefer not to follow Kokotović et al. (1986), since that would make the discrepancy larger than necessary between the equations for which A1–[8.1] is assumed and the equations which can be related to the system being modeled. A feature of this work shared by Kokotović et al. (1986), but not by Chang (1969), is the splitting of L into a nominal part and higher order terms, providing better insight into approximation properties and the decoupled problem.
By differentiation with respect to t, the following choice of S is verified to be compatible with (8.22b) (compare example 2.48, (2.88))
( S RL )(t) = (1/m) ∫₀ᵗ φ( t, τ ) ( −L0′(τ) + f1(τ) + m f2( RL )(τ) ) dτ    (8.25)
In addition to (8.23), RL ∈ L implies that
‖ f2( RL ) ‖_I ≤ cξξ ρL + m cxz ρL²    (8.26)
for some cξξ, cxz that satisfy
‖ A11 + A12 L0 ‖_I < cξξ,  ‖ A12 ‖_I < cxz    (8.27)
To ensure that S maps L into itself, we note that for RL ∈ L,
‖ S RL ‖_I ≤ (1/m) ∫₀ᵗ γw e^( −( λw/m )( t−τ ) ) ( c4 + m cξξ ρL + m² cxz ρL² ) dτ
  ≤ γw c4 / λw + ( γw / λw ) ( m cξξ ρL + m² cxz ρL² )
The second of these terms will be made small by imposing a bound on m, so by taking αL > 0 and setting
ρL = ( 1 + αL ) γw c4 / λw    (8.28)
we obtain ‖ S RL ‖_I < ρL whenever m cξξ ρL + m² cxz ρL² < αL c4, or
m < ( λw / ( ( 1 + αL ) γw c4 ) ) ( √( ( cξξ / ( 2 cxz ) )² + αL c4 / cxz ) − cξξ / ( 2 cxz ) )    (8.29)
  = ( λw / ( cξξ γw ) ) αL + O( αL² )
To establish the contraction on L, note that
S RL,2 (t) − S RL,1 (t) = ∫₀ᵗ φ( t, τ ) ( f2( RL,2 )(τ) − f2( RL,1 )(τ) ) dτ
Hence, a sufficient condition for S to be a contraction on L is available in (using (7.37) to express the difference between the terms quadratic in RL)
‖ S RL,2 − S RL,1 ‖_I ≤ m ( γw / λw ) ( cξξ + 2 cxz ρL ) ‖ RL,2 − RL,1 ‖_I
  = m ( γw / λw ) ( cξξ + 2 cxz c4 ( 1 + αL ) γw / λw ) ‖ RL,2 − RL,1 ‖_I
As αL → 0, the condition for contraction tends to a constant positive bound on m (we use 0.99 < 1 just to ensure strict contraction),
m < 0.99 / ( ( γw / λw ) ( cξξ + 2 cxz c4 ( 1 + αL ) γw / λw ) )    (8.30)
  = 0.99 / ( ( γw / λw ) ( cξξ + 2 cxz c4 γw / λw ) ) + O( αL )
Hence, for small αL, (8.30) will be implied by (8.29), but while the bound on m in (8.29) is initially increasing with αL, (8.30) is decreasing for all values of αL, so one generally has to consider both bounds.

This concludes the contraction mapping argument for RL in the expansion L = L0 + m RL. That is, by taking m less than all of the finitely many positive bounds imposed on it, we obtain ‖RL‖_I ≤ ρL.
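The boundedness of the rest term can also be observed numerically. The following sketch (hypothetical, time-invariant data, not from the thesis) computes L for a shrinking bound m with a fixed "direction" of the uncertainty, and checks that RL = ( L − L0 ) / m stays of the same order as m → 0:

```python
import numpy as np

# Numeric illustration of the expansion L = L0 + m RL: the rest term
# RL = (L - L0) / m stays bounded as m -> 0.
def solve_L(E, A11, A12, A21, A22, iters=100):
    # Fixed-point iteration for 0 = -A21 - A22 L + E L (A11 + A12 L).
    L = -np.linalg.solve(A22, A21)
    for _ in range(iters):
        L = np.linalg.solve(A22, -A21 + E @ L @ (A11 + A12 @ L))
    return L

A11 = np.array([[-1.0, 0.2], [0.0, -0.5]])
A12 = np.array([[0.3, 0.0], [0.1, 0.2]])
A21 = np.array([[0.2, 0.1], [0.0, 0.3]])
A22 = np.array([[1.0, 0.1], [0.0, 1.2]])
L0 = -np.linalg.solve(A22, A21)
E_dir = np.array([[1.0, 0.4], [0.2, 0.8]])   # fixed "direction" of the uncertainty
norms = [np.linalg.norm((solve_L(m * E_dir, A11, A12, A21, A22) - L0) / m)
         for m in [1e-2, 1e-3, 1e-4]]
print(max(norms) / min(norms) < 2.0)         # RL stays of the same order
```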
The corresponding decoupling transformation will be applied in section 8.4.

8.2.3 Eliminating uncertain variables from slow dynamics

We now turn to the second change of variables, given by (8.17), where H satisfies (8.19), which can be partitioned either as
m H′ = m H ( E⁻¹ A22 − L A12 ) − A12 − m [ A11 + A12 L ] H
or
m H′ = m H ( E⁻¹ A22 − L0 A12 ) − A12 − m ( m H RL A12 + [ A11 + A12 L ] H )
where the expression within the last parentheses of the latter is denoted g2( H ), depending on whether one prefers to reuse the transition matrix φ from the previous section, or if one rather makes assumptions regarding the (8.18η) system, which has been isolated as a subsystem of the system being modeled. Using the first partitioning, the expression for g2 becomes easier to work with, and to check P3–[8.10] one can use A3–[8.11] and corollary 8.8 in (8.16). On the other hand, assumptions have already been made corresponding to the latter partitioning, and this is the choice made here.

The following approximate expression for the solution can be derived by making an expansion of H in powers of m,
H = A12 A22⁻¹ ( m⁻¹ E ) + O( m )
Since this expression is dominated by a term which can only be bounded, but not further approximated, there is little use for an expansion of H in powers of m. Instead, we aim directly for a bound on ‖H‖_I, and introduce an operator for this purpose. Let the following operator T be defined for H ∈ H = { H : ‖H‖_I < ρH } (ρH will be chosen soon). (Compare example 2.48, (2.89).)
T H = (1/m) ∫_t^{tf} ( A12(τ) + m g2( H )(τ) ) φ( τ, t ) dτ    (8.31)
For m sufficiently small to make the previous approximation of L valid, it holds that
‖ g2( H ) ‖_I ≤ ( m ρL ‖A12‖_I + ‖A11 + A12 L0‖_I + m ρL ‖A12‖_I ) ‖H‖_I ≤ ( cξξ + 2 m ρL cxz ) ‖H‖_I    (8.32)
and we obtain
‖ T H ‖_I ≤ ( ‖A12‖_I + m ( cξξ + 2 m ρL cxz ) ρH ) (1/m) ∫_t^{tf} ‖ φ( τ, t ) ‖ dτ
  ≤ ( γw / λw ) ( cxz + m cξξ ρH + 2 m² ρL cxz ρH )
We now pick αH > 0 and set
ρH = ( 1 + αH ) γw cxz / λw
which means that T maps H into itself whenever m cξξ ρH + 2 m² ρL cxz ρH ≤ αH cxz. That is, we require
m ≤ √( ( cξξ / ( 4 ρL cxz ) )² + αH / ( 2 ρL ρH ) ) − cξξ / ( 4 ρL cxz )    (8.33)
  = ( cxz / ( cξξ ρH ) ) αH + O( αH² )

Since g2( H ) is linear in H, g2( H2 ) − g2( H1 ) = g2( H2 − H1 ), and hence
‖ g2( H2 ) − g2( H1 ) ‖_I ≤ ( cξξ + 2 m ρL cxz ) ‖ H2 − H1 ‖_I
In
‖ T H2 − T H1 ‖_I ≤ m ( γw / λw ) ( cξξ + 2 m ρL cxz ) ‖ H2 − H1 ‖_I
it is seen that for m < 1, the requirement for contraction here is weaker than that for S. This bound should be compared with the approximate expression for H, which suggests that the bound could actually be as small as cxz c3 nz (with nz the dimension of z).

For completeness, ρL is expressed using αL, yielding the following expression for the bound on m (again 0.99 < 1 is an arbitrary choice just to ensure strict contraction)
m ≤ ( 0.99 λw / ( γw cξξ ( 1 + αL ) ) ) ( √( ( cξξ² / ( 4 c4 cxz ) )² + ( 1 + αL ) cξξ² / ( 2 c4 cxz ) ) − cξξ² / ( 4 c4 cxz ) )    (8.34)
  = ( 0.99 λw / ( γw cξξ ) ) ( √( ( cξξ² / ( 4 c4 cxz ) )² + cξξ² / ( 2 c4 cxz ) ) − cξξ² / ( 4 c4 cxz ) ) + O( αL )
where the factor within the large parentheses is at most 1.

This concludes the contraction mapping argument for H. That is, by taking m less than all of the finitely many positive bounds imposed on it, we obtain ‖H‖_I ≤ ρH. The corresponding decoupling transformation will be applied in section 8.4.

8.3 Comparison with scalar perturbation

Since the preceding section is similar to the treatment for a scalar perturbation found in Kokotović et al. (1986), we would like to highlight some of the differences.
• The matrix-valued uncertainty E(t) does not commute with other matrices in the way a scalar perturbation would.
• In the change of variables (8.17), the bound m on the perturbation is used rather than the perturbation E(t) itself.
• In the contraction operators, the transition matrix φ is associated with an approximation of what will turn out as the η subsystem, rather than with the ( E, A22 ) system. See section 8.A for a discussion.
• In the equation for H, there is now a term m H E⁻¹ A22 where there used to be just H A22.
• The “nominal” (or “reduced”) solution for $H$, that is $H_0$, is no longer the known entity $A_{12} A_{22}^{-1}$, but instead $A_{12} A_{22}^{-1}\, m^{-1} E$. However, since all that can be said about the solution $\eta$ is that it will be vanishing with $m$, not knowing $H_0$ is no limitation.

8.4 The decoupled system

Starting from the ltv dae

$\bar{E}(t)\, \bar{x}' + \bar{A}(t)\, \bar{x}(t) = 0, \qquad \bar{x}(0) = \bar{x}_0 \qquad (8.35)$

time-varying row and column reductions are applied in the time-varying analog of the transforms for lti dae in section 6.1. The resulting system is (8.13), for which section 8.2 shows the existence and approximation properties of the two decoupling matrices $L$ and $H$. The two changes of variables given by $L$ and $H$ can be written compactly as

$\begin{pmatrix} \xi \\ \eta \end{pmatrix} = \begin{pmatrix} I & -m H \\ 0 & I \end{pmatrix} \begin{pmatrix} x \\ \eta \end{pmatrix} = \begin{pmatrix} I & -m H \\ 0 & I \end{pmatrix} \begin{pmatrix} I & 0 \\ -L & I \end{pmatrix} \begin{pmatrix} x \\ z \end{pmatrix} = \underbrace{\begin{pmatrix} I + m H L & -m H \\ -L & I \end{pmatrix}}_{T^{-1}} \begin{pmatrix} x \\ z \end{pmatrix} \qquad (8.36)$

with inverse given by

$T = \begin{pmatrix} I & m H \\ L & m L H + I \end{pmatrix} \qquad (8.37)$

From the two factors making up $T^{-1}$ it is readily seen that $\det T^{-1} = 1$, and with $L$ and $H$ bounded, any bound on $m$ gives that $T^{-1}$ is bounded, and hence that $T^{-1}$ defines a Lyapunov transformation (recall definition 2.30). Applying the transformation yields the system (8.18), which we repeat here

$\xi' = -( A_{11} + A_{12} L )\, \xi = -\left( A_{11} - A_{12} A_{22}^{-1} A_{21} + m A_{12} R_L \right) \xi \qquad (8.38\xi)$

$E \eta' + ( A_{22} - E L A_{12} )\, \eta = 0 \qquad (8.38\eta)$

This system has two isolated parts, where (8.38ξ) is a regularly perturbed problem which may be addressed with any available method (see section 2.4). To be able to establish a rate of convergence in this section, we demand one more condition on the system (which will have to be verified in applications). Let $\mathring{x}$ denote the nominal solution for $\xi$ (and $x$, as it turns out), that is, the solution to

$\mathring{x}' = -\left( A_{11} - A_{12} A_{22}^{-1} A_{21} \right) \mathring{x} \qquad (8.39)$

P5–[8.16] Property. The system (8.39) is uniformly exponentially stable.

By theorem 2.42, P5–[8.16] lets us conclude that (8.38ξ) is uniformly $\left[ \gamma_\xi\, e^{-\lambda_\xi \bullet} \right]$-stable for some constants $\gamma_\xi, \lambda_\xi > 0$, independently of $R_L$.
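As a quick sanity check of the factorization (8.36) and its inverse (8.37), the scalar-block case (both $x$ and $z$ one-dimensional) can be verified numerically. The values of m, L, and H below are arbitrary illustrative numbers, not taken from the thesis:

```python
# Sanity check of the factorization (8.36) and its inverse (8.37) in the
# scalar-block case.  The values of m, L, and H are hypothetical.
m, L, H = 0.05, -1.3, 0.7

# T^{-1} = [[I + m H L, -m H], [-L, I]], obtained by multiplying the two factors
Tinv = [[1 + m * H * L, -m * H],
        [-L, 1]]
# T = [[I, m H], [L, m L H + I]] as in (8.37)
T = [[1, m * H],
     [L, m * L * H + 1]]

def matmul2(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

P = matmul2(Tinv, T)                                           # should be the identity
det_Tinv = Tinv[0][0] * Tinv[1][1] - Tinv[0][1] * Tinv[1][0]   # should be 1

assert all(abs(P[i][j] - (1.0 if i == j else 0.0)) < 1e-12
           for i in range(2) for j in range(2))
assert abs(det_Tinv - 1.0) < 1e-12
```

The cancellation $m H L - m H L$ in the product is exactly the mechanism that makes $\det T^{-1} = 1$ regardless of $m$, $L$, and $H$.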
Then, Rugh (1996, theorem 12.2) provides that (8.38ξ) is uniformly bounded-input, bounded-state stable, which means that the reformulated system (2.66) can be used to conclude that $\sup_t | \xi(t) - \mathring{x}(t) | = O(m)$.

The system (8.38η) is the kind of system considered in section 8.1, and since it is a true subsystem of the real system, our assumptions apply and lemma 8.13 provides parameters of uniform exponential stability for this system. Since $T^{-1}$ in (8.36) is bounded, it follows that bounded initial conditions for $x(0)$ and $z(0)$ imply bounded initial conditions for $\eta(0)$. Multiplying by the exponential convergence parameter $\gamma_w$, one obtains a bound on $\eta(t)$, valid for all $t \ge 0$. Looking at (8.17), it is seen that the boundedness of $H$ then implies that $x$ converges to $\xi$ uniformly as $m \to 0$.

In order to also obtain convergence of $z$ to $\bar{z} = -A_{22}^{-1} A_{21}\, \mathring{x}$, it is necessary to show that $\eta$ is not only bounded, but converges uniformly to 0 as $m \to 0$ (compare with (8.14) and recall that $L = -A_{22}^{-1} A_{21} + O(m)$). This can only follow if the initial condition $\eta(0)$ converges to 0 with $m$ (and then the uniform convergence follows). The convergence was the subject of lemma 6.4 as well as lemma 7.4. By identifying (8.13) with (7.8), lemma 7.4 applies in the index 1 case as well, and the time-variability here does not matter for initial conditions. Hence, there is no need to derive the convergence again, and we simply recall that the choice $z_0 = -A_{22}(0)^{-1} A_{21}(0)\, x_0$ is the only fixed choice for $z_0$ that can be used for arbitrarily small $m$.

The section is concluded with a theorem summarizing the convergence result for ltv dae of nominal index 1.

8.17 Theorem. Consider the nominal index 1 ltv dae (8.35), repeated here,

$\bar{E}(t)\, \bar{x}' + \bar{A}(t)\, \bar{x}(t) = 0, \qquad \bar{x}(0) = \bar{x}_0 \qquad (8.35)$

and the corresponding partitioned equations (8.13), repeated here with initial conditions $x(0) = x_0$,
$z(0) = z_0$,

$x'(t) + A_{11}(t)\, x(t) + A_{12}(t)\, z(t) = 0$
$E(t)\, z'(t) + A_{21}(t)\, x(t) + A_{22}(t)\, z(t) = 0$

over the time interval $I = [\, 0, t_f\, )$. Let the nominal equation refer to the same equation, but with $E(t)$ replaced by 0. Let $x_0$, $z_0$ satisfy the nominal equation, and let $\mathring{x}$ denote the solution to (8.39) (that is, the nominal differential equation for $x$). Let $\max(E(t)) \le m$ for all $t$, and make the pointwise-in-time assumption A1–[8.1] regarding the eigenvalues of the approximation (8.24) of the fast and uncertain subsystem, as well as either of the assumptions A2–[8.9] or A3–[8.11] regarding the time-variability of $E(t)$. Assume that the properties P1–[8.2], P3–[8.10]–P5–[8.16] are also satisfied. Then there exist constants $k$ and $m_0 > 0$ such that $m \le m_0$ implies

$\sup_{t \in I} | x(t) - \mathring{x}(t) | \le k\, m$
$\sup_{t \in I} \left| z(t) + A_{22}(t)^{-1} A_{21}(t)\, \mathring{x} \right| \le k\, m$

and the solution to (8.35) converges at the same rate.

Proof: This is a summary of results obtained in the present chapter.

Since the convergence of the full system depends on the convergence in the subsystem (8.38ξ), the requirement of P5–[8.16] can be replaced by other conditions which enable convergence in (8.38ξ) to be established, and if the established convergence is not $O(m)$ uniformly in time, this convergence rate will replace the rates in theorem 8.17. On the other hand, in a particular problem where the uncertainties are given, the rate of convergence is not of importance, since it is only the fixed bound on $\sup_t | \xi(t) - \mathring{x}(t) |$ that matters.

8.5 Conclusions

The solutions of the two-timescale system (8.13) with a matrix-valued singular perturbation have been shown to converge as the bound on the perturbation tends to zero.
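This $O(m)$ behavior is easy to observe on a scalar caricature of (8.13). The sketch below is only an illustration under simplifying assumptions (scalar blocks, forward-Euler integration, arbitrarily chosen coefficients), not the matrix-valued setting treated in the chapter:

```python
import math

def simulate(m, tf=1.0):
    """Forward-Euler simulation of a scalar caricature of (8.13),
        x' + a11 x + a12 z = 0,   m z' + a21 x + a22 z = 0,
    with hypothetical coefficients a11 = 2, a12 = a21 = a22 = 1 and the
    consistent initial condition z(0) = -(a21/a22) x(0).
    Returns sup_t |x(t) - x_nominal(t)|, where the nominal solution solves
    x_nominal' = -(a11 - a12 a21 / a22) x_nominal = -x_nominal."""
    a11, a12, a21, a22 = 2.0, 1.0, 1.0, 1.0
    dt = m / 100.0                      # small enough to resolve the fast time scale
    x, z = 1.0, -a21 / a22
    worst = 0.0
    for i in range(int(tf / dt)):
        t = i * dt
        worst = max(worst, abs(x - math.exp(-t)))
        x, z = (x + dt * (-a11 * x - a12 * z),
                z + (dt / m) * (-a21 * x - a22 * z))
    return worst

# The sup-deviation should shrink roughly in proportion to m.
errs = [simulate(m) for m in (0.02, 0.01, 0.005)]
assert errs[0] > errs[1] > errs[2]
```

Here the perturbation enters as the scalar coefficient $m$ itself; in the thesis' setting the same role is played by the bound on the matrix-valued $E(t)$.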
Aside from properties that can be verified in applications, the eigenvalue assumption used for lti systems in chapter 6 is assumed to hold pointwise in time here, and is formulated with respect to an approximation of the fast and uncertain subsystem rather than the true system. Further, compared to chapter 6, the time-variability of the system has led to the use of an additional assumption bounding the time derivative of the perturbation.

Regarding directions for future research, there are some results in chapter 7 which might be possible to generalize to ltv systems, and it remains to relax the assumption that $E(t)$ be pointwise non-singular, so that the theory covers not only nominal index 1, but also true index 1 systems. However, while general convergence results for nominal index 2 lti systems still remain to be derived, extending the results in this chapter to higher nominal indices should wait. In the meantime, there is also plenty of work to be done on the numeric implementation. In doing so, the svd decomposition of Steinbrecher (2006, theorem 2.4.1) is expected to be a key tool.

Appendix

8.A Dynamics of related systems

This section contains some remarks concerning the choice of system to which the transition matrix $\phi$ in section 8.2 belongs. Recall that we would ideally formulate the eigenvalue assumptions for the matrix pair of (8.38η), but we were led to use the matrix pair of (8.24) instead, since (8.38η) was not available at the time when the assumptions were needed. In retrospect, we would like to indicate how assumptions about (8.38η) can justify the use of (8.24).

In the original reference on this method, Chang (1969), the transition matrix $\phi$ is associated with the fast (and uncertain) subsystem (8.38η), while Kokotović et al. (1986) associate it with the system

$E w' + A_{22} w = 0 \qquad (8.40)$

(but with a scalar perturbation in place of $E$, of course). Our choice (8.24), repeated here,
$E w' + ( A_{22} - E L_0 A_{12} )\, w = 0 \qquad (8.24)$

is a third option, and it was explained in section 8.2.2 why this was preferred. With the (8.38η) subsystem, repeated here,

$E \eta' + ( A_{22} - E L A_{12} )\, \eta = 0 \qquad (8.38\eta)$

isolated by the choice of $L$ given in section 8.2.2, we now consider associating $\phi$ with this system instead, just like in Chang (1969), so that the eigenvalue assumptions really concern instantaneous poles of the slowly varying fast subsystem.

First of all, once a crude estimate of $L$ is available, the (8.38η) subsystem becomes isolated, and it would be possible to start over from the beginning of section 8.2.2 with $\phi$ associated with (8.38η) instead of (8.24). Doing so would not justify the assumptions about (8.24), but there may be other ways of obtaining the initial crude estimate.

We require that, for $m$ small enough, the decoupling matrix satisfies a constant bound,

$\| L \|_I \le \hat{l} \qquad (8.41)$

(Note that we do not assume convergence as $m \to 0$.) Then there is also a number $\rho \le \hat{l} + \| L_0 \|_I$ such that

$\| m R_L \|_I \le \rho \qquad (8.42)$

(It is seen that (8.42) also implies (8.41), so either one may be used as a starting point.)

As on page 235, we must check P1–[8.2], P2–[8.3], and P3–[8.10] for the trailing matrix in (8.38η) in place of the $A$ of section 8.1. P1–[8.2] and P2–[8.3] are readily checked using (8.41). To check P3–[8.10], the product $E L$ is treated as one unit in (8.38η), and in

$\frac{d\, ( E(t)\, L(t) )}{dt} = E'(t)\, L(t) + E(t)\, L'(t)$

it is seen that the time derivative is bounded by inserting (8.16) for $E L'$ and using A3–[8.11] to bound $E' L$.

Making assumption A1–[8.1] for (8.38η), corollary 8.14 provides that the system (8.38η) is uniformly $\left[ \gamma_\eta\, e^{-\frac{\lambda_\eta}{m} \bullet} \right]$-stable for some constants $\gamma_\eta, \lambda_\eta > 0$. The following lemma then shows that we can obtain the same qualitative exponential convergence for the approximation (8.24).

8.18 Lemma. The system (8.24) is uniformly $\left[ \gamma_w\, e^{-\frac{\lambda_w}{m} \bullet} \right]$-stable for some constants $\gamma_w, \lambda_w > 0$.
Proof: Rewrite (8.24) as a perturbed system,

$w' = -\left( E^{-1} A_{22} - L_0 A_{12} \right) w = -\left( E^{-1} A_{22} - L A_{12} + m R_L A_{12} \right) w$

and time-scale by means of $\bar{w}(t) \triangleq w(m t)$. This yields

$\bar{w}' = -\left[ m \left( E^{-1} A_{22} - L A_{12} \right) + m\, ( m R_L )\, A_{12} \right] \bar{w}$

Time-scaling the corresponding “nominal” system

$\bar{u}' = -m \left( E^{-1} A_{22} - L A_{12} \right) \bar{u}$

by means of $\bar{u}(t) \triangleq u(m t)$ yields

$u' = -\left( E^{-1} A_{22} - L A_{12} \right) u$

which is recognized as the $\eta$ subsystem (8.38η), with known uniform exponential convergence parameters. Due to P4–[8.15], (8.41), and corollary 2.47 it can be seen that

$\left( E^{-1} A_{22} - L A_{12} \right)^{-1} = ( A_{22} - E L A_{12} )^{-1} E$

can be bounded by some constant $\hat{\alpha}_u$ times $m$, for $m$ sufficiently small. This, together with A1–[8.1], implies that $\left\| m \left( E^{-1} A_{22} - L A_{12} \right) \right\|_I \le \alpha_{\bar{u}}$ for some constant $\alpha_{\bar{u}}$, according to theorem 6.11 (compare (6.28)).

Since the $\eta$ subsystem is uniformly $\left[ \gamma_\eta\, e^{-\frac{\lambda_\eta}{m} \bullet} \right]$-stable, the $\bar{u}$ system is uniformly $\left[ \gamma_\eta\, e^{-\lambda_\eta \bullet} \right]$-stable. Since the state feedback matrix of the same system has a norm bound of $\alpha_{\bar{u}}$ and we have $\| m R_L \|_I \le \rho$, theorem 2.42 provides a bound on $m$ which makes the $\bar{w}$ system uniformly $\left[ \gamma_w\, e^{-\lambda_w \bullet} \right]$-stable for some positive constants $\gamma_w, \lambda_w$. It then follows that the system (8.24) is uniformly $\left[ \gamma_w\, e^{-\frac{\lambda_w}{m} \bullet} \right]$-stable.

The lemma shows that, making assumptions about the eigenvalues of (8.38η) and using the bound (8.41) (which may either be derived or postulated, and does not require $L$ to converge as $m \to 0$), the necessary convergence property of the transition matrix $\phi$ used in section 8.2.2 follows.

We end the section with a corollary which provides a possible substitute for lemma 8.6, applicable in the context of the coupled system instead of the slowly varying system in section 8.1.

8.19 Corollary. Taking $m$ sufficiently small will provide a bound on $\left\| m E^{-1} A_{22} \right\|_I$.
Proof: Using the $\alpha_{\bar{u}}$ from the proof of lemma 8.18, we obtain

$\left\| m E^{-1} A_{22} \right\|_I = \left\| m \left( E^{-1} A_{22} - L A_{12} + L A_{12} \right) \right\|_I \le \alpha_{\bar{u}} + m\, \hat{l}\, \| A_{12} \|_I$

9 Concluding remarks

Looking back on the previous chapters of the thesis, we let the self-contained chapter 4 on filtering and chapter 5 on the new index speak for themselves. Here, we wrap up our findings concerning matrix-valued singular perturbation problems related to uncertain dae, because this is where the emphasis has been in the thesis. The matrix-valued singular perturbation problems were introduced in section 1.2, and chapter 3 detailed an application of future nonlinear results.

The results in the thesis have been limited to autonomous linear dae. Using assumptions regarding the system poles, convergence of solutions has been established in the nominal index 1 case, for both lti and ltv dae. For lti dae of nominal index 2 we have not (except for a very small example) been able to establish convergence of solutions, but several results that are expected to be useful in the future have been derived. These results include a Weierstrass-like canonical form for uncertain matrix pairs of nominal index 2. Most results assume that the pointwise index of the uncertain dae is 0, but for lti dae of nominal index 1, results were partly extended to pointwise index 1.

Some directions for future research have been mentioned in earlier chapters, but the following short list contains some which we think are particularly interesting.

• The canonical form for lti dae of nominal index 2 should be extended to higher indices, and a reliable numeric implementation should be developed.
• The results for ltv systems of nominal index 1 should be extended from pointwise index 0 to pointwise index 1, as was done in the lti case.
• Other function measures or stochastic formulations of the problems may both be relevant in applications and result in better error estimates.
To consider alternative function measures appears to be a good option also for systems with inputs.

A Sampling perturbations

Being unable to compute tight bounds in the analysis of matrix-valued singularly perturbed systems is closely related to the inability to construct worst-case perturbations. To illustrate our results, we are left with the option to sample randomly from the set of perturbations that agree with our assumptions, and observe how the corresponding solution set changes as a function of the parameters in our assumptions. In this chapter, we detail how the random samples were generated, so that our examples can be reproduced and readers can try their own examples.

Since our aim in the examples is to illustrate convergence of the solutions as the bound on the size of the perturbation tends to zero, it is desirable that the perturbations are such that there is not much slack in this constraint.

A.1 Time-invariant perturbations

Our sampling strategy for time-invariant perturbations is trivial in the index 0 case; details follow. Given $m > 0$ and the parameters $\bar{a}$, $R_0$, and $\phi_0$ in

$\max(E) \le m$
$\forall \lambda :\ |\lambda|\, m < \bar{a}$
$|\lambda| > R_0 \implies |\arg(-\lambda)| < \phi_0$

we sample each entry of the matrix $E$ from a uniform distribution over the interval $[\, -m, m\, ]$, and then the whole matrix is scaled to satisfy $\max(E) = m$. We then compute the eigenvalues (possibly taking also the slow dynamics into account), and reject any samples that do not satisfy the eigenvalue constraints. Clearly, one must not select $\bar{a}$ too small, or an infinite loop of rejections will occur.

Figure A.1: The trajectories of the entries of the perturbation $E$ produced in example A.1.

A.2 Time-varying perturbations

In the time-varying case, it is not only desirable to have little slack in the constraint $\max(E) \le m$, but also that there is small slack in the time-variability constraint.
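Before turning to the time-varying algorithms, the rejection sampling of section A.1 can be sketched in a few lines. The 2×2 helper below is a simplified illustration, not the thesis' implementation: max(E) is taken to be the largest absolute entry (an assumption on the notation), and the eigenvalues of the pair (E, A) are computed as the roots of det(λE + A) = 0; the trailing matrix and parameter values are illustrative:

```python
import cmath
import random

def pair_eigvals(E, A):
    """Eigenvalues of the 2x2 matrix pair (E, A): roots of det(s E + A) = 0."""
    a = E[0][0] * E[1][1] - E[0][1] * E[1][0]
    b = (E[0][0] * A[1][1] + A[0][0] * E[1][1]
         - E[0][1] * A[1][0] - A[0][1] * E[1][0])
    c = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    d = cmath.sqrt(b * b - 4 * a * c)
    return [(-b + d) / (2 * a), (-b - d) / (2 * a)]

def sample_E(m, abar, R0, phi0, A, rng):
    """Rejection sampling as in section A.1: draw entries uniformly on
    [-m, m], rescale so that max(E) = m, and reject unless every eigenvalue
    lam of the pair (E, A) satisfies |lam| m < abar and, whenever
    |lam| > R0, |arg(-lam)| < phi0."""
    while True:
        E = [[rng.uniform(-m, m) for _ in range(2)] for _ in range(2)]
        if E[0][0] * E[1][1] - E[0][1] * E[1][0] == 0.0:
            continue                      # reject a (measure-zero) singular draw
        big = max(abs(e) for row in E for e in row)
        E = [[e * m / big for e in row] for row in E]
        if all(abs(lam) * m < abar
               and (abs(lam) <= R0 or abs(cmath.phase(-lam)) < phi0)
               for lam in pair_eigvals(E, A)):
            return E

# Illustrative 2x2 trailing matrix and parameter values.
A = [[0.94, 0.94], [-0.46, 0.23]]
E = sample_E(0.01, 10.0, 5.0, 1.4, A, random.Random(1))
assert max(abs(e) for row in E for e in row) <= 0.01 * (1 + 1e-12)
```

Note how the rejection loop mirrors the warning above: choosing abar (or phi0) too tight makes acceptance improbable and the loop effectively infinite.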
This makes sampling time-varying perturbations considerably harder than the time-invariant case. The word sample has two meanings in the current section, and to remove some of the confusion we shall refer to the time-varying samples of $E$ as realizations. To obtain a computable test, the continuous-time eigenvalue constraint is relaxed by only requiring it to hold at a limited number of sampling instants. If a realization would produce unexpected results in examples, it is important to remember this relaxation and check the conditions more carefully before drawing any conclusions.

Our algorithm, presented in algorithm A.1 (page 251) and algorithm A.2, constructs trajectories for $E$ (that is, realizations) by a sequence of steps. It works with time samples of the realization, and uses entry-wise linear interpolation between sampling instants. The algorithm is initialized with the trivial trajectory given by $E(t) = m\, M_{22}(t) / \max(M_{22}(t))$ (which is assumed to fulfill the time-variability constraint), and maintains feasibility of the trajectory during the process of repeated perturbation of individual samples along the trajectory. To simplify notation, we give the algorithm for the case when there is no slow dynamics, but the extension to also include the slow dynamics is straightforward. Following the algorithms, the chapter ends with an example.

Note that sample is used with two meanings here. In the time-invariant setting, it was natural to think of $E$ as a sample from a random variable. Extending this use of sample to the time-varying setting, we use it to refer to $E$ as a realization of some stochastic process, and to produce such realizations is what the chapter is all about. The other meaning of sample is to evaluate functions of time at certain sampling instants in time.

Algorithm A.1 Sampling perturbation trajectories for ltv systems.

Input:
• An interval $I$ of time for which the realization is to be computed.
• The trailing matrix (as a matrix-valued function of time) $M_{22}$.
• The bound $m$ and a bound on the time-variability. The following two types of time-variability constraints will be considered below:

$\forall t :\ \left\| \frac{d}{dt} \left( m\, E(t)^{-1} A(t) \right) \right\|_2 \le \beta_1 \qquad \text{(A.1a)}$
$\forall t :\ \max(E'(t)) \le m\, \beta_3 \qquad \text{(A.1b)}$

Output: A continuous, piecewise linear trajectory $E$, which satisfies $\max(E) \le m$ and the time-variability constraint (at all times for (A.1b), or at a finite set of sampling instants for (A.1a)), and satisfies the sampling-relaxation of the eigenvalue constraint.

Initialization: Select an initial number of sampling instants (not counting the initial time), and distribute the sampling instants evenly over $I$. Initialize the trajectory as a linear interpolation of the function $t \mapsto m\, M_{22}(t) / \max(M_{22}(t))$ sampled at the sampling instants.

Main loop:
repeat 2 or 3 times
  Increase the number of sampling instants by a positive integer factor. Distribute the sampling instants evenly over $I$ (denote the sampling interval $t_\Delta$), and sample the current trajectory accordingly. This results in a sequence of matrices $\{\, ( t_i, E_i )\, \}_i$.
  Perturb the sequence by performing a fixed number of minor iterations, see algorithm A.2.
  Reconstruct the continuous trajectory by linear interpolation.
end

Remarks: By keeping the initial number of sampling instants low, large but slow variations are obtained at a moderate computational cost. However, the time-variability is constrained by the sampling interval, which is why the number of sampling instants is increased in each iteration. It should be validated that the final number of sampling instants is sufficiently large to allow the time-variability constraint to be activated. By multiplying the number of sampling instants by an integer in each major iteration, and distributing the sampling instants evenly over $I$, it is ensured that the up-sampled trajectory is identical to the trajectory at the end of the previous major iteration.
This is important for maintaining feasibility of the trajectory.

Algorithm A.2 Details of algorithm A.1 — minor iterations.

Input/initialization: This algorithm is just a step in algorithm A.1.

Minor iteration: The number of minor iterations during one major iteration is typically of the same order of magnitude as the number of sampling instants, but may also be bounded by computational time considerations. In each minor iteration, the following steps are taken to perturb the sequence of matrix samples.

Select which matrix sample to perturb at random (denoting the corresponding index $i$) and note the neighboring matrices (at the end points, there is just one neighbor).

if using (A.1b)
  Find the intervals of radius $\beta_3 t_\Delta$ centered at each entry in the neighboring matrices, intersect the intervals entry-wise, and intersect also with $[\, -m, m\, ]$. Using $M_{22}(t_i)$, draw a random sample from the derived intervals in the same manner as for time-invariant systems. That is, draw random samples uniformly from each interval until the eigenvalue conditions are satisfied.
  optional: Scale the obtained matrix so that it has at least one entry at the boundary of the corresponding interval.
else (That is, in case of (A.1a).)
  Using $M_{22}(t_i)$, draw a random sample matrix in exactly the same manner as for time-invariant systems. Denote the resulting matrix $E^*$. Starting with 1, try successively smaller values of the scalar parameter $a$ in $E_i + a\, ( E^* - E_i )$ until the linear interpolation between the neighbors and the new matrix satisfies the time-variability constraint at a set of intermediate points. (Note that maintaining feasibility of the trajectory implies that $a = 0$ is a feasible — though pointless — choice.)
end

For each neighbor, indexed by $j$, select a small number of time instants evenly distributed between $t_i$ and $t_j$ (excluding those of $t_i$ and $t_j$ at which the eigenvalue conditions have already been checked), and use linear interpolation between the neighbor and the new matrix to check the eigenvalue condition at the intermediate time instants.

if any eigenvalue condition fails
  Discard the new matrix (and do not update the sequence of matrix samples).
else
  Replace $E_i$ by the new matrix.
end

A.1 Example

As an illustration of the sampling algorithm for time-varying perturbations, we consider the trailing matrix given by

$A(t) \triangleq \begin{pmatrix} 0.94 - 0.05 \log(t + 1) & 0.94 \\ -0.46 & 0.23 \end{pmatrix}$

over the time interval $I = [\, 0, 3.5\, ]$. The constraints on the time-varying perturbation are given by the following parameters:

$m = 0.01 \qquad \beta_1 = 10.0 \qquad \bar{a} = 10.0 \qquad \phi_0 = 1.4 \qquad R_0 = 5.0$

The initial trajectory was first sampled with a sampling interval of 0.3, and 30 minor iterations were performed. Then the trajectory was sampled with a sampling interval of 0.03, and 300 minor iterations were performed. The four entries of the final $E$ are shown in figure A.1. The eigenvalue assumption is verified in figure A.2, and the time-variability assumption is verified in figure A.3.

The figures show that while the constraints given by $m$ and $\beta_1$ are satisfied with little or even negative slack, the other constraints have large slacks. Since the eigenvalues grow as $m \to 0$, the slack in the lower bound given by $R_0$ can always be made large by selecting $m$ small, and hence the large slack in this constraint does not have to be considered a deficiency of the sampling algorithm. Based on corresponding eigenvalue plots for 30 realizations of $E$, it was seen that the constraint given by $\phi_0$ can also obtain small slack at some point, even though the current algorithm has no component to increase the chance that this will happen.
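As an aside on reproducing such realizations, the integer-factor up-sampling between the major iterations of algorithm A.1 (which must reproduce the old samples exactly in order to preserve feasibility) can be sketched as follows. The helper and the data are hypothetical, with the 2×2 matrices stored as nested lists:

```python
def upsample(ts, Es, factor):
    """Refine a piecewise-linear trajectory { (t_i, E_i) } by an integer
    factor, as between the major iterations of algorithm A.1.  Entry-wise
    linear interpolation is used, so every old sample reappears unchanged
    in the refined sequence.  (Hypothetical helper, 2x2 matrices only.)"""
    new_ts, new_Es = [], []
    for i in range(len(ts) - 1):
        for k in range(factor):
            a = k / factor
            new_ts.append(ts[i] + a * (ts[i + 1] - ts[i]))
            new_Es.append([[(1 - a) * Es[i][r][c] + a * Es[i + 1][r][c]
                            for c in range(2)] for r in range(2)])
    new_ts.append(ts[-1])
    new_Es.append(Es[-1])
    return new_ts, new_Es

ts = [0.0, 0.3, 0.6]
Es = [[[0.010, 0.000], [0.000, 0.010]],
      [[0.005, 0.002], [0.000, 0.010]],
      [[0.000, 0.004], [-0.003, 0.010]]]
ts2, Es2 = upsample(ts, Es, 10)
# The original samples survive exactly, which is what maintains feasibility.
assert ts2[0::10] == ts and Es2[0::10] == Es
```

Because the old sampling instants are a subset of the new ones (factor an integer), interpolation at those instants degenerates to the identity, so no re-validation of the previously accepted trajectory is needed.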
Regarding the upper bound on the eigenvalues given by $m^{-1} \bar{a}$, the large slack seen in figure A.2 appears typical for the proposed algorithm; in all 30 realizations, the eigenvalue moduli were less than 400 at all times. In view of example 6.13, and considering that the time-variability constraint is locally independent of the pointwise-in-time constraints, this is expected to be a deficiency of the algorithm, and not due to the nature of the problem. It would be a possible future extension of the perturbation sampling algorithm to add components which try to minimize the slack in the constraints given by $\phi_0$ and $\bar{a}$.

The algorithm may violate constraints since it only checks validity at a finite number of points. In the current example, careful inspection of the plots shows that a violation occurs for a part of the trajectory which was generated in the first iteration of algorithm A.1. The current implementation checks the constraints at two intermediate points between the sampling instants, meaning that the constraints are checked at points 0.1 apart in the first iteration. In the next iteration, where the time intervals are ten times shorter, the violation gets detected, and attempts to modify this part of the trajectory are very likely to be rejected.

Figure A.2: Verification of the eigenvalue assumptions in example A.1. The eigenvalues of $( E(t), A(t) )$ have been sampled with a time interval of 0.01. The dashed rays show the angle constraint given by $\phi_0$. The arc near the origin is the lower bound on the eigenvalues given by $R_0$, while the upper bound $m^{-1} \bar{a} = 1000$ is outside the figure. All constraints are seen to be satisfied.

Figure A.3: Verification of the time-variability assumption in example A.1. Time instants with sampling interval 0.01 are marked with dots.
The horizontal marks are used to label the points where the assumption is checked during the first iteration of algorithm A.1. The dashed line shows the assumed bound, which is only violated between the points checked during the first iteration.

Bibliography

Eyad H. Abed. Multiparameter singular perturbation problems: Iterative expansions and asymptotic stability. Systems & Control Letters, 5(4):279–282, February 1985a. Cited on page 71. Eyad H. Abed. A new parameter estimate in singular perturbations. Systems & Control Letters, 6(3):153–222, August 1985b. Cited on page 69. Eyad H. Abed. Decomposition and stability of multiparameter singular perturbation problems. IEEE Transactions on Automatic Control, AC-31(10):925–934, October 1986. Cited on page 71. Eyad H. Abed and André L. Tits. On the stability of multiple time-scale systems. International Journal of Control, 44(1):211–218, 1986. Cited on pages 71 and 174. Jeffrey M. Augenbaum and Charles S. Peskin. On the construction of the Voronoi diagram on a sphere. Journal of Computational Physics, 59(2):177–192, June 1985. Cited on page 115. Erwin H. Bareiss. Sylvester’s identity and multistep integer-preserving Gaussian elimination. Mathematics of Computation, 22(103):565–578, July 1968. Cited on page 86. William H. Beyer, editor. CRC Handbook of mathematical sciences. CRC Press, Inc., 5th edition, 1978. Cited on page 121. Niclas Bergman. Recursive Bayesian estimation — Navigation and tracking applications. PhD thesis, Linköping University, May 1999. Cited on page 113. Stephen Boyd, Laurent El Ghaoui, Eric Feron, and Venkataramanan Balakrishnan. Linear matrix inequalities in system and control theory. SIAM Studies in Applied Mathematics, 1994. Cited on pages 58, 61, and 62. P. C. Breedveld. Proposition for an unambiguous vector bond graph notation. Journal of Dynamic Systems, Measurement, and Control, 104(3):267–270, September 1982. Cited on page 24. Kathryn Eleda Brenan, Stephen L.
Campbell, and Linda Ruth Petzold. Numerical solution of initial-value problems in differential-algebraic equations. SIAM, 1996. Classics edition. Cited on pages 27, 47, 48, 49, 51, and 100. Peter N. Brown, Alan C. Hindmarsh, and Linda Ruth Petzold. Using Krylov methods in the solution of large-scale differential-algebraic systems. SIAM Journal on Scientific Computation, 15(6):1467–1488, 1994. Cited on page 51. Anders Brun, Carl-Fredrik Westin, Magnus Herberthson, and Hans Knutsson. Intrinsic and extrinsic means on the circle — a maximum likelihood interpretation. In IEEE Conference on Acoustics, Speech, and Signal Processing, volume 3, pages 1053–1056, Honolulu, HI, USA, April 2007. Cited on pages 110 and 119. Dag Brück, Hilding Elmqvist, Hans Olsson, and Sven Erik Mattsson. Dymola for multi-engineering modeling and simulation. 2nd International Modelica Conference, Proceedings, pages 55–1–55–8, March 2002. Cited on page 34. R. S. Bucy and K. D. Senne. Digital synthesis of nonlinear filters. Automatica, 7: 287–298, 1971. Cited on page 114. Benno Büeler, Andreas Enge, and Komei Fukuda. Polytopes: Combinatorics and computation, pages 131–154. Number 29 in DMV Seminar. Birkhäuser, 2000. Chapter title: Exact volume computation for polytopes: A practical study. Cited on page 122. Stephen L. Campbell. Least squares completions for nonlinear differential algebraic equations. Numerische Mathematik, 65(1):77–94, December 1993. Cited on page 130. Stephen L. Campbell and C. William Gear. The index of general nonlinear daes. Numerische Mathematik, 72:173–196, 1995. Cited on pages 29, 32, 33, and 106. Lamberto Cesari. Asymptotic behavior and stability problems in ordinary differential equations. Springer-Verlag, third edition, 1971. Cited on page 66. Kok Wah Chang. Remarks on a certain hypothesis in singular perturbations. Proceedings of the American Mathematical Society, 23(1):41–45, October 1969. Cited on pages 69, 214, 227, 236, and 243. Kok Wah Chang. 
Singular perturbations of a general boundary value problem. SIAM Journal on Mathematical Analysis, 3(3):520–526, August 1972. Cited on pages 69, 211, and 227. Alessandro Chiuso and Stefano Soatto. Monte Carlo filtering on Lie groups. In Proceedings of the 39th IEEE Conference on Decision and Control, pages 304–309, Sydney, Australia, December 2000. Cited on page 112. Daniel Choukroun, Itzhack Y. Bar-Itzhack, and Yaakov Oshman. Novel quaternion Kalman filter. IEEE Transactions on Aerospace and Electronic Systems, 42(1):174–190, January 2006. Cited on page 112. Timothy Y. Chow. The surprise examination or unexpected hanging paradox. American Mathematical Monthly, 105(1):41–51, January 1998. Cited on page 47. Shantanu Chowdhry, Helmut Krendl, and Andreas A. Linninger. Symbolic numeric index analysis algorithm for differential algebraic equations. Industrial & Engineering Chemistry Research, 43(14):3886–3894, 2004. Cited on pages 47 and 91. Earl A. Coddington and Norman Levinson. Theory of ordinary differential equations. Robert E. Krieger Publishing Company, Inc., third edition, 1985. Cited on pages 66 and 143. Cyril Coumarbatch and Zoran Gajic. Exact decomposition of the algebraic Riccati equation of deterministic multimodeling optimal control problems. IEEE Transactions on Automatic Control, 45(4):790–794, April 2000. Cited on page 71. John L. Crassidis, F. Landis Markley, and Yang Cheng. Survey of nonlinear attitude estimation methods. Journal of Guidance, Control, and Dynamics, 30(1):12–28, January 2007. Cited on pages 110 and 113. Fred Daum. Nonlinear filters: Beyond the Kalman filter. IEEE Aerospace and Electronic Systems Magazine, 20(8:2):57–69, 2005. Cited on page 112. Alekseĭ Fedorovich Filippov. Differential equations with discontinuous righthand sides. Mathematics and its applications. Kluwer Academic Publishers, 1985. Cited on pages 61 and 62. Theodore Frankel, editor. The geometry of physics — an introduction.
Cambridge University Press, 2nd edition, 2004. Cited on page 111. Peter Fritzson, Peter Aronsson, Adrian Pop, David Akhvlediani, Bernhard Bachmann, David Broman, Anders Fernström, Daniel Hedberg, Elmin Jagudin, Håkan Lundvall, Kaj Nyström, Andreas Remar, and Anders Sandholm. OpenModelica system documentation — preliminary draft, 2006-12-14, for OpenModelica 1.4.3 beta. Technical report, Programming Environment Laboratory — PELAB, Department of Computer and Information Science, Linköping University, Sweden, 2006a. Cited on page 34. Peter Fritzson, Peter Aronsson, Adrian Pop, David Akhvlediani, Bernhard Bachmann, David Broman, Anders Fernström, Daniel Hedberg, Elmin Jagudin, Håkan Lundvall, Kaj Nyström, Andreas Remar, and Anders Sandholm. OpenModelica users guide — preliminary draft, 2006-09-28, for OpenModelica 1.4.2. Technical report, Programming Environment Laboratory — PELAB, Department of Computer and Information Science, Linköping University, Sweden, 2006b. Cited on page 34. Komei Fukuda. cddlib reference manual, version 0.94. Institute for Operations Research and Institute of Theoretical Computer Science, ETH Zentrum, CH-8092 Zurich, Switzerland, 2008. URL http://www.ifor.math.ethz.ch/~fukuda/cdd_home/cdd.html. Cited on page 122. Markus Gerdin. Identification and estimation for models described by differential-algebraic equations. PhD thesis, Linköping University, 2006. Cited on page 23. Developers of GMP. The GNU multiple precision arithmetic library, version 4.3.1. Free Software Foundation, 2009. URL http://gmplib.org/. Cited on page 7. Sergeĭ Konstantinovich Godunov. Ordinary differential equations with constant coefficient, volume 169 of Translations of mathematical monographs. American Mathematical Society, 1997. Cited on page 55. Gene H. Golub and Charles F. Van Loan. Matrix computations. The Johns Hopkins University Press, third edition, 1996. Cited on pages 22 and 100. P. Gurfil and M. Jodorkovsky.
Unified initial condition response analysis of Lur’e systems and linear time-invariant systems. International Journal of Systems Science, 34(1):49–62, 2003. Cited on page 53.
Ernst Hairer and Gerhard Wanner. Solving ordinary differential equations II — Stiff and differential-algebraic problems, volume 14. Springer-Verlag, 1991. Cited on page 51.
Ernst Hairer, Christian Lubich, and Michel Roche. The numerical solution of differential-algebraic systems by Runge-Kutta methods. Lecture Notes in Mathematics, 1409, 1989. Cited on page 33.
Michiel Hazewinkel, editor. Encyclopedia of mathematics, volume 8. Kluwer Academic Publishers, 1992. URL http://eom.springer.de/. Cited on page 126.
Nicholas J. Higham. A survey of condition number estimation for triangular matrices. SIAM Review, 29(4):575–596, December 1987. Cited on page 82.
Nicholas J. Higham, D. Steven Mackey, and Françoise Tisseur. The conditioning of linearizations of matrix polynomials. SIAM Journal on Matrix Analysis and Applications, 28(4):1005–1028, 2006. Cited on pages 40 and 72.
Inmaculada Higueras and Roswitha März. Differential algebraic equations with properly stated leading terms. Computers & Mathematics with Applications, 28(1–2):215–235, 2004. Cited on page 38.
Alan C. Hindmarsh, Radu Serban, and Aaron Collier. User documentation for IDA v2.4.0. Technical report, Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, 2004. Cited on pages 51 and 98.
Alan C. Hindmarsh, Peter N. Brown, Keith E. Grant, Steven L. Lee, Radu Serban, Dan E. Shumaker, and Carol S. Woodward. SUNDIALS: Suite of nonlinear and differential/algebraic equation solvers. ACM Transactions on Mathematical Software, 31(3):363–396, 2005. Cited on page 51.
Lars Hörmander. An introduction to complex analysis in several variables. The University Series in Higher Mathematics. D. Van Nostrand, Princeton, New Jersey, 1966. Cited on pages 143 and 155.
M. E. Ingrim and G. Y. Masada. The extended bond graph notation.
Journal of Dynamic Systems, Measurement, and Control, 113(1):113–117, March 1991. Cited on page 24.
Luc Jaulin, Michel Kieffer, Olivier Didrit, and Éric Walter. Applied interval analysis. Springer-Verlag, London, 2001. Cited on pages 77 and 195.
Luc Jaulin, Isabelle Braems, and Éric Walter. Interval methods for nonlinear identification and robust control. In Proceedings of the 41st IEEE Conference on Decision and Control, pages 4676–4681, Las Vegas, NV, USA, December 2002. Cited on page 77.
Andrew H. Jazwinski. Stochastic processes and filtering theory, volume 64 of Mathematics in science and engineering. Academic Press, New York and London, 1970. Cited on page 17.
Ulf T. Jönsson. On reachability analysis of uncertain hybrid systems. In Proceedings of the 41st IEEE Conference on Decision and Control, pages 2397–2402, Las Vegas, NV, USA, December 2002. Cited on page 62.
Thomas Kailath. Linear systems. Prentice-Hall, Inc., 1980. Cited on page 13.
M. Kathirkamanayagan and G. S. Ladde. Diagonalization and stability of large-scale singularly perturbed linear system. Journal of Mathematical Analysis and Applications, 135(1):38–60, October 1988. Cited on page 71.
R. Baker Kearfott. Interval computations: Introduction, uses, and resources. Euromath Bulletin, 1(2):95–112, 1996. Cited on page 77.
Hassan K. Khalil. Asymptotic stability of nonlinear multiparameter singularly perturbed systems. Automatica, 17(6):797–804, November 1981. Cited on page 70.
Hassan K. Khalil. Time scale decomposition of linear implicit singularly perturbed systems. IEEE Transactions on Automatic Control, AC-29(11):1054–1056, November 1984. Cited on page 69.
Hassan K. Khalil. Stability of nonlinear multiparameter singularly perturbed systems. IEEE Transactions on Automatic Control, AC-32(3):260–263, March 1987. Cited on page 71.
Hassan K. Khalil. Nonlinear systems. Prentice Hall, Inc., third edition, 2002. Cited on pages 52, 61, and 66.
Hassan K. Khalil and Peter V. Kokotović.
D-stability and multi-parameter singular perturbation. SIAM Journal on Control and Optimization, 17(1):56–65, 1979. Cited on pages 70 and 71.
K. Khorasani and M. A. Pai. Asymptotic stability improvements of multiparameter nonlinear singularly perturbed systems. IEEE Transactions on Automatic Control, AC-30(8):802–804, 1985. Cited on page 70.
Petar V. Kokotović. A Riccati equation for block-diagonalization of ill-conditioned systems. IEEE Transactions on Automatic Control, 20(6):812–814, December 1975. Cited on page 211.
Petar V. Kokotović, Hassan K. Khalil, and John O’Reilly. Singular perturbation methods in control: Analysis and applications. Academic Press Inc., 1986. Cited on pages 2, 57, 67, 68, 149, 168, 227, 228, 230, 233, 235, 236, 239, and 243.
Steven G. Krantz and Harold R. Parks. A primer of real analytic functions. Birkhäuser, Boston, second edition, 2002. Cited on page 155.
Peter Kunkel and Volker Mehrmann. Canonical forms for linear differential-algebraic equations with variable coefficients. Journal of Computational and Applied Mathematics, 56(3):225–251, 1994. Cited on page 34.
Peter Kunkel and Volker Mehrmann. A new class of discretization methods for the solution of linear differential-algebraic equations with variable coefficients. SIAM Journal on Numerical Analysis, 33(5):1941–1961, October 1996. Cited on pages 130 and 131.
Peter Kunkel and Volker Mehrmann. Regular solutions of nonlinear differential-algebraic equations. Numerische Mathematik, 79(4):581–600, June 1998. Cited on page 131.
Peter Kunkel and Volker Mehrmann. Index reduction for differential-algebraic equations by minimal extension. ZAMM — Journal of Applied Mathematics and Mechanics, 84(9):579–597, 2004. Cited on page 34.
Peter Kunkel and Volker Mehrmann. Differential-algebraic equations — Analysis and numerical solution. European Mathematical Society, 2006. Cited on pages 34, 40, 51, 72, 129, 132, 134, 138, 141, 142, and 147.
Peter Kunkel, Volker Mehrmann, Werner Rath, and Jörg Weickert. GELDA: A software package for the solution of general linear differential algebraic equations, 1995. Cited on page 51.
Peter Kunkel, Volker Mehrmann, and Werner Rath. Analysis and numerical solution of control problems in descriptor form. Mathematics of Control, Signals, and Systems, 14(1):29–61, 2001. Cited on page 34.
Alexander B. Kurzhanski and István Vályi. Ellipsoidal calculus for estimation and control. Birkhäuser, 1997. Cited on page 62.
Junghyun Kwon, Minseok Choi, F. C. Park, and Changmook Chun. Particle filtering on the Euclidean group: Framework and applications. Robotica, 25:725–737, 2007. Cited on pages 110 and 112.
Wook Hyun Kwon, Young Soo Moon, and Sang Chul Ahn. Bounds in algebraic Riccati and Lyapunov equations: A survey and some new results. International Journal of Control, 64(3):377–389, June 1996. Cited on pages 55 and 58.
G. S. Ladde and S. G. Rajalakshmi. Diagonalization and stability of multi-time-scale singularly perturbed linear systems. Applied Mathematics and Computation, 16(2):115–140, February 1985. Cited on page 71.
G. S. Ladde and S. G. Rajalakshmi. Singular perturbations of linear systems with multiparameters and multiple time scales. Journal of Mathematical Analysis and Applications, 129(2):457–481, February 1988. Cited on page 71.
G. S. Ladde and O. Sirisaengtaksin. Large-scale stochastic singularly perturbed systems. Mathematics and Computers in Simulation, 31(1–2):31–40, February 1989. Cited on page 69.
G. S. Ladde and D. D. Šiljak. Multiparameter singular perturbations of linear systems with multiple time scales. Automatica, 19(4):385–394, July 1989. Cited on page 71.
Jehee Lee and Sung Yong Shin. General construction of time-domain filters for orientation data. IEEE Transactions on Visualization and Computer Graphics, 8(2):119–128, April 2002. Cited on page 110.
Taeyoung Lee, Melvin Leok, and Harris McClamroch.
Global symplectic uncertainty propagation on SO(3). In Proceedings of the 47th IEEE Conference on Decision and Control, pages 61–66, Cancun, Mexico, December 2008. Cited on page 112.
Ben Leimkuhler, Linda Ruth Petzold, and C. William Gear. Approximation methods for the consistent initialization of differential-algebraic equations. SIAM Journal on Numerical Analysis, 28(1):204–226, February 1991. Cited on page 47.
Adrien Leitold and Katalin M. Hangos. Structural solvability analysis of dynamic process models. Computers and Chemical Engineering, 25(11–12):1633–1646, 2001. Cited on page 47.
Frans Lemeire. Bounds for condition numbers of triangular and trapezoid matrices. BIT Numerical Mathematics, 15(1):58–64, March 1975. Cited on page 83.
C.-W. Li and Y.-K. Feng. Functional reproducibility of general multivariable analytic nonlinear systems. International Journal of Control, 45(1):255–268, 1987. Cited on page 38.
Lennart Ljung. System identification: Theory for the user. Prentice-Hall, Inc., 1999. Cited on page 18.
James Ting-Ho Lo and Linda R. Eshleman. Exponential Fourier densities on SO(3) and optimal estimation and detection for rotational processes. SIAM Journal on Applied Mathematics, 36(1):73–82, February 1979. Cited on pages 110 and 113.
David G. Luenberger. Time-invariant descriptor systems. Automatica, 14(5):473–480, 1978. Cited on page 29.
Morris Marden. Geometry of polynomials. American Mathematical Society, second edition, 1966. Cited on page 82.
R. M. M. Mattheij and P. M. E. J. Wijckmans. Sensitivity of solutions of linear dae to perturbations of the system matrices. Numerical Algorithms, 19(1–4):159–171, 1998. Cited on page 72.
Sven Erik Mattsson and Gustaf Söderlind. Index reduction in differential-algebraic equations using dummy derivatives. SIAM Journal on Scientific Computing, 14(3):677–692, May 1993. Cited on pages 34 and 95.
Sven Erik Mattsson, Hans Olsson, and Hilding Elmqvist. Dynamic selection of states in Dymola.
In Modelica Workshop, pages 61–67, October 2000. Cited on page 34.
Volker Mehrmann and Chunchao Shi. Transformation of high order linear differential-algebraic systems to first order. Numerical Algorithms, 42(3–4):281–307, July 2006. Cited on page 36.
Roswitha März and Ricardo Riaza. Linear differential-algebraic equations with properly stated leading term: Regular points. Journal of Mathematical Analysis and Applications, 323(2):1279–1299, December 2006. Cited on page 38.
Roswitha März and Ricardo Riaza. Linear differential-algebraic equations with properly stated leading term: A-critical points. Mathematical and Computer Modelling of Dynamical Systems, 13(3):291–314, 2007. Cited on pages 38 and 72.
Roswitha März and Ricardo Riaza. Linear differential algebraic equations with properly stated leading terms: B-critical points. Dynamical Systems: An International Journal, 23(4):505–522, 2008. Cited on page 38.
Hyeon-Suk Na, Chung-Nim Lee, and Otfried Cheong. Voronoi diagrams on the sphere. Computational Geometry, 23:183–194, 2002. Cited on page 115.
D. Subbaram Naidu. Singular perturbations and time scales in control theory and applications: An overview. Dynamics of Continuous, Discrete and Impulsive Systems, 9(2):233–278, 2002. Cited on pages 67 and 232.
Arnold Neumaier. Overestimation in linear interval equations. SIAM Journal on Numerical Analysis, 24:207–214, 1987. Cited on page 78.
Arnold Neumaier. Interval methods for systems of equations. Cambridge University Press, 1990. Cited on page 79.
Constantinos Pantelides. The consistent initialization of differential-algebraic systems. SIAM Journal on Scientific and Statistical Computing, 9(2):213–231, March 1988. Cited on pages 34 and 47.
Xavier Pennec. Intrinsic statistics on Riemannian manifolds: Basic tools for geometric measurements. Journal of Mathematical Imaging and Vision, 25(1):127–154, July 2006. Cited on page 120.
Linda Ruth Petzold.
Order results for Runge-Kutta methods applied to differential/algebraic systems. SIAM Journal on Numerical Analysis, 23(4):837–852, 1986. Cited on page 50.
P. J. Rabier and W. C. Rheinboldt. A geometric treatment of implicit differential-algebraic equations. Journal of Differential Equations, 109(1):110–146, April 1994. Cited on page 32.
S. Reich. On an existence and uniqueness theory for non-linear differential-algebraic equations. Circuits, Systems, and Signal Processing, 10(3):344–359, 1991. Cited on page 32.
Gregory J. Reid, Ping Lin, and Allan D. Wittkopf. Differential elimination-completion algorithms for dae and pdae. Studies in Applied Mathematics, 106:1–45, 2001. Cited on page 32.
Gregory J. Reid, Chris Smith, and Jan Verschelde. Geometric completion of differential systems using numeric-symbolic continuation. ACM SIGSAM Bulletin, 36(2):1–17, June 2002. Cited on page 91.
Gregory J. Reid, Jan Verschelde, Allan Wittkopf, and Wenyuan Wu. Symbolic-numeric completion of differential systems by homotopy continuation. In Proceedings of the 2005 international symposium on symbolic and algebraic computation, pages 269–276, 2005. Cited on page 91.
Michel Roche. Implicit Runge-Kutta methods for differential algebraic equations. SIAM Journal on Numerical Analysis, 26(4):963–975, 1989. Cited on page 50.
Ronald C. Rosenberg and Dean C. Karnopp. Introduction to physical system dynamics. McGraw-Hill Book Company, 1983. Cited on page 24.
P. Rouchon, M. Fliess, and J. Lévine. Kronecker’s canonical forms for nonlinear implicit differential systems. In Proceedings of the 2nd IFAC Workshop on Systems Structure and Control, pages 248–251, Prague, Czech Republic, September 1992. Cited on pages 29, 38, 39, 88, and 92.
Walter Rudin. Principles of mathematical analysis. McGraw-Hill, third edition, 1976. Cited on page 73.
Wilson J. Rugh. Linear system theory. Prentice-Hall, Inc., second edition, 1996. Cited on pages 52, 57, 61, 64, 65, 66, and 240.
A.
Saberi and Hassan K. Khalil. Quadratic-type Lyapunov functions for singularly perturbed systems. IEEE Transactions on Automatic Control, AC-29(6):542–550, June 1984. Cited on page 71.
Helmut Schaeben. “Normal” orientation distributions. Texture, Stress, and Microstructure, 19(4):197–202, 1992. The journal changed name in 2008 from Textures and Microstructures. Cited on page 122.
Eliezer Y. Shapiro. On the Lyapunov matrix equation. IEEE Transactions on Automatic Control, 19(5):594–596, October 1974. Cited on page 58.
Johan Sjöberg. Optimal control and model reduction of nonlinear dae models. PhD thesis, Linköping University, 2008. Cited on page 23.
Sigurd Skogestad and Ian Postlethwaite. Multivariable feedback control. John Wiley & Sons, 1996. Cited on page 21.
Anuj Srivastava and Eric Klassen. Monte Carlo extrinsic estimators of manifold-valued parameters. IEEE Transactions on Signal Processing, 50(2):299–308, February 2002. Cited on pages 119 and 120.
Andreas Steinbrecher. Numerical solution of quasi-linear differential-algebraic equations and industrial simulation of multibody systems. PhD thesis, Technische Universität Berlin, 2006. Cited on pages 29, 86, 92, and 242.
G. W. Stewart and Ji-guang Sun. Matrix perturbation theory. Computer Science and Scientific Computing. Academic Press, 1990. Cited on pages 40, 42, 43, and 200.
Torsten Ström. On logarithmic norms. SIAM Journal on Numerical Analysis, 12(5):741–753, 1975. Cited on pages 55 and 56.
Tatjana Stykel. Gramian based model reduction for descriptor systems. Mathematics of Control, Signals, and Systems, 16(4):297–319, 2004. Cited on page 21.
N. Sukumar. Voronoi cell finite difference method for the diffusion operator on arbitrary unstructured grids. International Journal for Numerical Methods in Engineering, 57(1):1–34, May 2003. Cited on page 115.
Andrzej Szatkowski. Generalized dynamical systems: Differentiable dynamic complexes and differential dynamic systems.
International Journal of Systems Science, 21(8):1631–1657, August 1990. Cited on page 32.
Andrzej Szatkowski. Geometric characterization of singular differential algebraic equations. International Journal of Systems Science, 23(2):167–186, February 1992. Cited on page 32.
G. Thomas. Symbolic computation of the index of quasilinear differential-algebraic equations. In Proceedings of the 1996 international symposium on symbolic and algebraic computation, pages 196–203, 1996. Cited on pages 32 and 39.
Henrik Tidefelt. Structural algorithms and perturbations in differential-algebraic equations. Licentiate thesis No. 1318, Division of Automatic Control, Linköping University, 2007. Cited on pages 9, 10, 85, 91, 149, and 158.
Henrik Tidefelt and Torkel Glad. Index reduction of index 1 dae under uncertainty. In Proceedings of the 17th IFAC World Congress, pages 5053–5058, Seoul, Korea, July 2008. Cited on pages 54, 149, 151, 152, and 157.
Henrik Tidefelt and Torkel Glad. On the well-posedness of numerical dae. In Proceedings of the European Control Conference 2009, pages 826–831, Budapest, Hungary, August 2009. Cited on page 157.
Henrik Tidefelt and Thomas B. Schön. Robust point-mass filters on manifolds. In Proceedings of the 15th IFAC Symposium on System Identification, pages 540–545, Saint-Malo, France, July 2009. Not cited.
David Törnqvist, Thomas B. Schön, Rickard Karlsson, and Fredrik Gustafsson. Particle filter slam with high dimensional vehicle model. Journal of Intelligent and Robotic Systems, 55(4–5):249–266, August 2009. Cited on page 110.
J. Unger, A. Kröner, and W. Marquardt. Structural analysis of differential-algebraic equation systems — theory and applications. Computers and Chemical Engineering, 19(8):867–882, 1995. Cited on pages 24, 47, and 91.
Charles F. Van Loan. The sensitivity of the matrix exponential. SIAM Journal on Numerical Analysis, 14(6):971–981, December 1977. Cited on pages 53, 54, 59, and 60.
R. C.
Vieira and E. C. Biscaia Jr. An overview of initialization approaches for differential-algebraic equations. Latin American Applied Research, 30(4):303–313, 2000. Cited on page 47.
R. C. Vieira and E. C. Biscaia Jr. Direct methods for consistent initialization of dae systems. Computers and Chemical Engineering, 25(9–10):1299–1311, September 2001. Cited on page 47.
Krešimir Veselić. Bounds for exponentially stable semigroups. Linear Algebra and its Applications, 358:309–333, 2003. Cited on pages 53 and 56.
Josselin Visconti. Numerical solution of differential algebraic equations, global error estimation and symbolic index reduction. PhD thesis, Institut d’Informatique et Mathématiques Appliquées de Grenoble, November 1999. Cited on pages 29 and 86.
Wolfram Research, Inc. Mathematica, version 7.0.0. Wolfram Research, Inc., Champaign, Illinois, 2008. Cited on page 23.

Index

abstraction barrier, 110
algebraic constraints, 46
algebraic equation, 12
algebraic term, 12
autonomous, 12, 13, 118
backward difference formulas, see bdf method
balanced form, 21
Bayes’ rule, 112
  operator, 117
bdf
  abbreviation, xvii
  method, 47, 134
bond graph, 24
boundary layer, 68
Chapman-Kolmogorov equation, 112, 117
component-based model, 24
constraint propagation, 79
contraction
  mapping, 73, 182, 214, 218, 235, 238
  principle, 73
contravariant vector, 11, 111
convolution, 118
coordinate map, 121, 133
covariant vector, 11
D-stability, 70
dae, 22
  abbreviation, xvii
  quasilinear, 23
daspk, 51
dassl, 51
decoupling transform
  lti index 1, 177
  lti index 2, 210
  ltv index 1, 232
departure from normality, 54
derivative array, 31, 130
  equations, 31, 131, 134, 138
differential inclusion, 61
differential-algebraic equation, see dae
differentiation index, 29, 31, 32, 35, 43, 130, 137
drift, 17, 50
dummy derivatives, 34
eigenvalues of matrix pair, 42
elimination-differentiation, 29
embedding, 110, 124, 127, 128
  “natural”, 119, 121
Euclidean space, 122
  notation, xv
exponential map, 111
  notation, xvi
extrinsic mean, 120
Fokker-Planck equation, 112
forcing function, 12, 13
form
  balanced, 21
  implicit ode, 23
  quasilinear, 23
  state space, 19
fraction-free, 80
fraction-producing, 81
fundamental matrix, 52
  synonym, see transition matrix
Gaussian distribution, 112
gelda, 51
genda, 51
geodesic, 111
Hankel norm, 21
ida, 51
ill-posed
  in quasilinear shuffle algorithm, 92
  in shuffle algorithm, 30
  in structure algorithm, 39
  initial value problem, 46
  uncertain dae, 151
implicit ode, 23
implicit Runge-Kutta methods, see irk
index, 28
  (unqualified), 35
  differentiation, see differentiation index
  nominal, see nominal index
  perturbation, see perturbation index
  pointwise, see pointwise, index
  simplified strangeness, see simplified strangeness index
  strangeness, see strangeness index
index reduction, 3, 28, 34, 51, 85, 109
  seminumerical, 99, 105
inevitable pendulum, 95
initial value problem, 14
  consistent initial conditions, 47
  ill-posed, see ill-posed initial value problem
inner approximation, 78
input (to differential equation), 13
input matrix, 13
interval
  matrix, 78
  real, 78
  vector, 78
intrinsic mean, 119
irk, 50, 51
  abbreviation, xvii
iteration matrix, 49
leading matrix
  of (quasi)linear dae, 12
  of matrix pair, 42
Lie group, 112, 113
local coordinates, 110, 133
longevity, 96
lti
  abbreviation, xvii
  autonomous dae, 12, 25
  autonomous ode, 13
  dae, 12, 25
  ode, 13
ltv
  abbreviation, xvii
  autonomous dae, 13, 26
  autonomous ode, 13
  dae, 13, 26
  ode, 13
Lyapunov equation, 52
Lyapunov function, 52
  candidate, 52, 231
Lyapunov transformation, 57, 240
manifold, 111
Mathematica, 11, 23n, 51, 79, 135, 177
matrix pair, 41, 186
matrix pencil, 40
matrix-valued singular perturbation, 2, 71, 151, 196, 227
matrix-valued uncertainty, 152
measurement update, 111, 116
meshless, 115
model, 15
  component-based, 24
  residualized, 20
  truncated, 20
model class, 18
model reduction, 7, 19
model structure, 18
multiparameter singular perturbation, 69
multiple time scale singular perturbation, 71
nominal index, 35
  notation, xvi
non-differential equation, 12
normal, 7, 54
  departure from, see departure from normality
ode
  abbreviation, xvii
  autonomous, 13n
  implicit, 23
  time-invariant, 13
one-full, 130, 135
outer approximation, 78
pair, matrix, 41
Pantelides’ algorithm, 34
particle filter, 110
pencil, matrix, 40
perturbation
  regular, see regular perturbation
  singular, see singular perturbation
perturbation index, 33
point
  matrix, 78
  vector, 78
point estimate, 111, 119
point-mass distribution, 114
point-mass filter, 111, 114
pointwise
  index, 35
  non-singular, 78
properly stated leading term, 38
quasilinear form, 5, 12, 23
quasilinear shuffle algorithm, 5, 29, 86, 92
radau5, 51
reduced equation, 14, 98, 131
regular
  lti dae, 41
  matrix pair, 42
  matrix pencil, 40
  uncertain matrix, 42, 78
  uncertain matrix pair, 43
regular perturbation, 58
residualization, 20
residualized model, 20
right hand side
  of ode, 13
  of quasilinear dae, 12
scalar singular perturbation, 67
shuffle algorithm, 29, 30
  quasilinear, see quasilinear shuffle algorithm
simplified strangeness index, 134
singular
  lti dae, 41
  matrix pair, 42
  matrix pencil, 40
  uncertain matrix, 78
  uncertain matrix pair, 43
singular perturbation, 67
  matrix-valued, see matrix-valued singular perturbation
  multiparameter, see multiparameter singular perturbation
  multiple time scale, see multiple time scale singular perturbation
  scalar, see scalar singular perturbation
singular perturbation approximation, 21
square (dae), 14
state (vector), 14
state feedback matrix, 13
state space model, 113n
strangeness index, 34, 72, 131
structural zero, 6, 81n, 94
structure algorithm, 29, 38, 39
tangent space, 111, 122
tessellation, 111, 114, 120
time update, 111, 117, 118
trailing matrix
  of linear dae, 12
  of matrix pair, 42
transition matrix, 52, 118n, 205, 235
truncated model, 20
truncation, 20
uniformly bounded-input, bounded-output stable, 60
uniformly exponentially stable, 64
unstructured perturbation, 152
variable, 14
Voronoi diagram, 115, 121

PhD Dissertations
Division of Automatic Control
Linköping University

M. Millnert: Identification and control of systems subject to abrupt changes. Thesis No. 82, 1982. ISBN 91-7372-542-0.
A. J. M. van Overbeek: On-line structure selection for the identification of multivariable systems. Thesis No. 86, 1982. ISBN 91-7372-586-2.
B. Bengtsson: On some control problems for queues. Thesis No. 87, 1982. ISBN 91-7372-593-5.
S. Ljung: Fast algorithms for integral equations and least squares identification problems. Thesis No. 93, 1983. ISBN 91-7372-641-9.
H. Jonson: A Newton method for solving non-linear optimal control problems with general constraints. Thesis No. 104, 1983. ISBN 91-7372-718-0.
E. Trulsson: Adaptive control based on explicit criterion minimization. Thesis No. 106, 1983. ISBN 91-7372-728-8.
K. Nordström: Uncertainty, robustness and sensitivity reduction in the design of single input control systems. Thesis No. 162, 1987. ISBN 91-7870-170-8.
B. Wahlberg: On the identification and approximation of linear systems. Thesis No. 163, 1987. ISBN 91-7870-175-9.
S. Gunnarsson: Frequency domain aspects of modeling and control in adaptive systems. Thesis No. 194, 1988. ISBN 91-7870-380-8.
A. Isaksson: On system identification in one and two dimensions with signal processing applications. Thesis No. 196, 1988. ISBN 91-7870-383-2.
M. Viberg: Subspace fitting concepts in sensor array processing. Thesis No. 217, 1989. ISBN 91-7870-529-0.
K. Forsman: Constructive commutative algebra in nonlinear control theory. Thesis No. 261, 1991. ISBN 91-7870-827-3.
F. Gustafsson: Estimation of discrete parameters in linear systems. Thesis No. 271, 1992. ISBN 91-7870-876-1.
P. Nagy: Tools for knowledge-based signal processing with applications to system identification. Thesis No. 280, 1992. ISBN 91-7870-962-8.
T. Svensson: Mathematical tools and software for analysis and design of nonlinear control systems. Thesis No. 285, 1992. ISBN 91-7870-989-X.
S.
Andersson: On dimension reduction in sensor array signal processing. Thesis No. 290, 1992. ISBN 91-7871-015-4.
H. Hjalmarsson: Aspects on incomplete modeling in system identification. Thesis No. 298, 1993. ISBN 91-7871-070-7.
I. Klein: Automatic synthesis of sequential control schemes. Thesis No. 305, 1993. ISBN 91-7871-090-1.
J.-E. Strömberg: A mode switching modelling philosophy. Thesis No. 353, 1994. ISBN 91-7871-430-3.
K. Wang Chen: Transformation and symbolic calculations in filtering and control. Thesis No. 361, 1994. ISBN 91-7871-467-2.
T. McKelvey: Identification of state-space models from time and frequency data. Thesis No. 380, 1995. ISBN 91-7871-531-8.
J. Sjöberg: Non-linear system identification with neural networks. Thesis No. 381, 1995. ISBN 91-7871-534-2.
R. Germundsson: Symbolic systems – theory, computation and applications. Thesis No. 389, 1995. ISBN 91-7871-578-4.
P. Pucar: Modeling and segmentation using multiple models. Thesis No. 405, 1995. ISBN 91-7871-627-6.
H. Fortell: Algebraic approaches to normal forms and zero dynamics. Thesis No. 407, 1995. ISBN 91-7871-629-2.
A. Helmersson: Methods for robust gain scheduling. Thesis No. 406, 1995. ISBN 91-7871-628-4.
P. Lindskog: Methods, algorithms and tools for system identification based on prior knowledge. Thesis No. 436, 1996. ISBN 91-7871-424-8.
J. Gunnarsson: Symbolic methods and tools for discrete event dynamic systems. Thesis No. 477, 1997. ISBN 91-7871-917-8.
M. Jirstrand: Constructive methods for inequality constraints in control. Thesis No. 527, 1998. ISBN 91-7219-187-2.
U. Forssell: Closed-loop identification: Methods, theory, and applications. Thesis No. 566, 1999. ISBN 91-7219-432-4.
A. Stenman: Model on demand: Algorithms, analysis and applications. Thesis No. 571, 1999. ISBN 91-7219-450-2.
N. Bergman: Recursive Bayesian estimation: Navigation and tracking applications. Thesis No. 579, 1999. ISBN 91-7219-473-1.
K. Edström: Switched bond graphs: Simulation and analysis. Thesis No.
586, 1999. ISBN 91-7219-493-6.
M. Larsson: Behavioral and structural model based approaches to discrete diagnosis. Thesis No. 608, 1999. ISBN 91-7219-615-5.
F. Gunnarsson: Power control in cellular radio systems: Analysis, design and estimation. Thesis No. 623, 2000. ISBN 91-7219-689-0.
V. Einarsson: Model checking methods for mode switching systems. Thesis No. 652, 2000. ISBN 91-7219-836-2.
M. Norrlöf: Iterative learning control: Analysis, design, and experiments. Thesis No. 653, 2000. ISBN 91-7219-837-0.
F. Tjärnström: Variance expressions and model reduction in system identification. Thesis No. 730, 2002. ISBN 91-7373-253-2.
J. Löfberg: Minimax approaches to robust model predictive control. Thesis No. 812, 2003. ISBN 91-7373-622-8.
J. Roll: Local and piecewise affine approaches to system identification. Thesis No. 802, 2003. ISBN 91-7373-608-2.
J. Elbornsson: Analysis, estimation and compensation of mismatch effects in A/D converters. Thesis No. 811, 2003. ISBN 91-7373-621-X.
O. Härkegård: Backstepping and control allocation with applications to flight control. Thesis No. 820, 2003. ISBN 91-7373-647-3.
R. Wallin: Optimization algorithms for system analysis and identification. Thesis No. 919, 2004. ISBN 91-85297-19-4.
D. Lindgren: Projection methods for classification and identification. Thesis No. 915, 2005. ISBN 91-85297-06-2.
R. Karlsson: Particle Filtering for Positioning and Tracking Applications. Thesis No. 924, 2005. ISBN 91-85297-34-8.
J. Jansson: Collision Avoidance Theory with Applications to Automotive Collision Mitigation. Thesis No. 950, 2005. ISBN 91-85299-45-6.
E. Geijer Lundin: Uplink Load in CDMA Cellular Radio Systems. Thesis No. 977, 2005. ISBN 91-85457-49-3.
M. Enqvist: Linear Models of Nonlinear Systems. Thesis No. 985, 2005. ISBN 91-85457-64-7.
T. B. Schön: Estimation of Nonlinear Dynamic Systems — Theory and Applications. Thesis No. 998, 2006. ISBN 91-85497-03-7.
I.
Lind: Regressor and Structure Selection — Uses of ANOVA in System Identification. Thesis No. 1012, 2006. ISBN 91-85523-98-4.
J. Gillberg: Frequency Domain Identification of Continuous-Time Systems: Reconstruction and Robustness. Thesis No. 1031, 2006. ISBN 91-85523-34-8.
M. Gerdin: Identification and Estimation for Models Described by Differential-Algebraic Equations. Thesis No. 1046, 2006. ISBN 91-85643-87-4.
C. Grönwall: Ground Object Recognition using Laser Radar Data – Geometric Fitting, Performance Analysis, and Applications. Thesis No. 1055, 2006. ISBN 91-85643-53-X.
A. Eidehall: Tracking and threat assessment for automotive collision avoidance. Thesis No. 1066, 2007. ISBN 91-85643-10-6.
F. Eng: Non-Uniform Sampling in Statistical Signal Processing. Thesis No. 1082, 2007. ISBN 978-91-85715-49-7.
E. Wernholt: Multivariable Frequency-Domain Identification of Industrial Robots. Thesis No. 1138, 2007. ISBN 978-91-85895-72-4.
D. Axehill: Integer Quadratic Programming for Control and Communication. Thesis No. 1158, 2008. ISBN 978-91-85523-03-0.
G. Hendeby: Performance and Implementation Aspects of Nonlinear Filtering. Thesis No. 1161, 2008. ISBN 978-91-7393-979-9.
J. Sjöberg: Optimal Control and Model Reduction of Nonlinear DAE Models. Thesis No. 1166, 2008. ISBN 978-91-7393-964-5.
D. Törnqvist: Estimation and Detection with Applications to Navigation. Thesis No. 1216, 2008. ISBN 978-91-7393-785-6.
P-J. Nordlund: Efficient Estimation and Detection Methods for Airborne Applications. Thesis No. 1231, 2008. ISBN 978-91-7393-720-7.
