Linköping studies in science and technology. Dissertations.
No. 1528

A Nonlinear Optimization Approach to H_2 Optimal Modeling and Control

Daniel Petersson

Department of Electrical Engineering
Linköping University, SE-581 83 Linköping, Sweden

Linköping 2013
Linköping studies in science and technology. Dissertations.
No. 1528

A Nonlinear Optimization Approach to H_2 Optimal Modeling and Control

Daniel Petersson
[email protected]
www.control.isy.liu.se

Division of Automatic Control
Department of Electrical Engineering
Linköping University
SE-581 83 Linköping
Sweden

ISBN 978-91-7519-567-4    ISSN 0345-7524

Copyright © 2013 Daniel Petersson

Printed by LiU-Tryck, Linköping, Sweden 2013
To Maria, Wilmer and Elsa!
Abstract
Mathematical models of physical systems are pervasive in engineering. These models can be used to analyze properties of the system, to simulate the system, or to synthesize controllers. However, many of these models are too complex or too large for standard analysis and synthesis methods to be applicable. Hence, there is a need to reduce the complexity of models. In this thesis, techniques for reducing the complexity of large linear time-invariant (LTI) state-space models and linear parameter-varying (LPV) models are presented. Additionally, a method for synthesizing controllers is also presented.

The methods in this thesis all revolve around a system theoretical measure called the H_2 norm, and the minimization of this norm using nonlinear optimization. Since the optimization problems rapidly grow large, significant effort is spent on understanding and exploiting the inherent structures available in the problems to reduce the computational complexity when performing the optimization.

The first part of the thesis addresses the classical model-reduction problem for LTI state-space models. Various H_2 problems are formulated and solved using the proposed structure-exploiting nonlinear optimization technique. The standard problem formulation is extended to incorporate also frequency-weighted problems and norms defined on finite frequency intervals, both for continuous-time and discrete-time models. Additionally, a regularization-based method to account for uncertainty in data is explored. Several examples reveal that the method is highly competitive with alternative approaches.

Techniques for finding LPV models from data, and for reducing the complexity of LPV models, are presented. The basic ideas introduced in the first part of the thesis are extended to the LPV case, once again covering a range of different setups. LPV models are commonly used for analysis and synthesis of controllers, but the efficiency of these methods depends highly on a particular algebraic structure in the LPV models. A method to account for this and derive models suitable for controller synthesis is proposed. Many of the methods are thoroughly tested on a realistic modeling problem arising in the design and flight clearance of an Airbus aircraft model.

Finally, output-feedback H_2 controller synthesis for LPV models is addressed by generalizing the ideas and methods used for modeling. One of the ideas here is to skip the LPV modeling phase before creating the controller, and instead synthesize the controller directly from the data that classically would have been used to generate a model for the controller synthesis problem. The method specializes to standard output-feedback H_2 controller synthesis in the LTI case, and favorable comparisons with alternative state-of-the-art implementations are presented.
Populärvetenskaplig sammanfattning (Popular Science Summary)

In many scientific and engineering fields, mathematical models are used to describe different systems, for example to describe how an aircraft will move given that the pilot commands a certain control-surface deflection. Such mathematical models can, for instance, be used to save resources by testing different prototypes in simulation, without needing the physical prototype. The models can be created from physical principles or built up from collected data.

Today's modern and complex systems can lead to very large and complicated mathematical models, which can sometimes be too large to simulate or analyze. One then needs to reduce the complexity of these models for it to be possible to use them. The requirement on the reduced model is that it should describe the large complex model sufficiently well for the purpose at hand.

There are many kinds of mathematical models, of different degrees of complexity. The simplest type of model is the linear model, for which it is possible to analyze properties and draw important conclusions about the system. Linear models, however, have the drawback of being limited in what they can describe. Taking the aircraft as an example again, a linear model can describe what happens to the aircraft as long as it stays at a specific altitude with a specific speed, but it cannot describe what happens if the aircraft deviates too far from these specific values of speed and altitude. Another type of model is the linear parameter-varying model. Such models depend on one or more parameters that describe certain operating conditions. The aircraft that we previously described with a linear model for a specific speed and altitude could instead be described with a parameter-varying model that depends on the parameters altitude and speed, and that can then also describe what happens when the aircraft climbs to a new altitude and changes its speed.

In this thesis we develop methods for reducing large, complex, linear and linear parameter-varying models to smaller, more manageable models. The requirement is that these models should still describe the original system well enough to be usable, for example, for analyzing the system. With the methods developed for reducing large complex models to smaller models as a starting point, methods for constructing controllers to control these large complex systems have also been developed.
Acknowledgments
First of all, I would like to thank my supervisor Dr. Johan Löfberg and my co-supervisor Professor Lennart Ljung for all your patience and support. Especially Johan, for his vast (this time I got it right) knowledge in optimization and for always having an open door and taking time to answer my questions.
I would like to thank Professor Lennart Ljung again, as the former head of the
Division of Automatic Control, for the privilege of letting me join the Automatic
Control group and also our current head of the Division of Automatic Control,
Professor Svante Gunnarsson, for always being able to improve on an already excellent workplace and research environment. Of course, I would also like to thank our current administrator, Ninna Stensgård and her predecessors Ulla Salaneck and Åsa Karmelind for always keeping track of everything and always being helpful.
This thesis has been proofread by Dr. Johan Löfberg, Dr. Christian Lyzell, Lic. Sina Khoshfetrat Pakazad and Lic. Patrik Axelsson. Thank you for your invaluable comments. I would also like to thank Dr. Henrik Tidefelt and Dr. Gustaf Hendeby for the LaTeX template that was used when writing this thesis.
There have been many joys on the journey as a Ph.D. student, both at work and in private. The colleagues that I have shared office with, Dr. Henrik Tidefelt and Lic. Zoran Sjanic, deserve an extra thanks for being very good company in the beginning of this journey, maybe not in the mornings but at least after lunch. Lic. Rikard Falkeborn, Dr. Ragnar Wallin and Dr. Christian Lyzell also deserve an extra thanks for always being there to discuss anything and everything, both work-related and (mostly) irrelevant subjects.
Another person I would like to thank is Dr. Elina Rönnberg. We started at Y together a long time ago and have ever since not been able to leave the university.
All the “onsdagslunchar” and “ﬁka” have meant a lot. Thank you.
A few more people deserve my gratitude: Lic. Fredrik Lindsten and Dr. Jonas Callmer. As the journey got closer to the end, and the anxiety over the fact that a thesis should be written started to grow, Dr. Jonas Callmer, my "Bother in arms" [sic!], helped me by sharing the anxiety, writing his thesis at the same time. What also helped was that I found out that Lic. Fredrik Lindsten and I shared a common interest, beer!, which we like to both talk about and drink. I hope there will be more beer tastings in the future.
For financial support, I would like to thank the European Commission under contract No. AST5-CT-2006-030768 (COFCLUO).
Finally, I would like to thank the person who has meant the most. Thank you Maria! Thank you for all the support and encouragement, and thank you for bringing me two of the most important persons in my life: Wilmer and Elsa.
Linköping, August 2013
Daniel Petersson
Contents

Notation

1 Introduction
1.1 Outline of the Thesis
1.2 Contributions

2 Preliminaries
2.1 System Theory
2.1.1 Basic Theory and Notation
2.1.2 Gramians
2.1.3 System Norms
2.1.4 Output-Feedback Controller
2.1.5 LPV Systems
2.2 Optimization
2.2.1 Local Methods
2.3 Matrix Theory
2.3.1 Properties for Dynamical Systems
2.3.2 Matrix Functions

3 Frequency-Limited H_2 Norm
3.1 Frequency-Limited Gramians
3.1.1 Continuous Time
3.1.2 Discrete Time
3.2 Frequency-Limited H_2 Norm
3.2.1 Continuous Time
3.2.2 Discrete Time
3.3 Concluding Remarks

4 Model Reduction
4.1 Introduction
4.2 Balanced Truncation
4.3 Overview of Model-Reduction Methods using the H_2 Norm
4.4 Model Reduction using an H_2 Measure
4.4.1 Standard Model Reduction
4.4.2 Robust Model Reduction
4.4.3 Frequency-Limited Model Reduction
4.5 Computational Aspects of the Optimization Problems
4.5.1 Structure in Variables
4.5.2 Initialization
4.5.3 Structure in Equations
4.6 Examples
4.7 Conclusions
4.A Gradient of V_rob
4.B Equations for Frequency-Weighted Model Reduction
4.B.1 Continuous Time
4.B.2 Discrete Time
4.C Gradient of the Frequency-Limited Case

5 LPV Modeling
5.1 Introduction
5.2 Global Methods
5.3 Local Methods
5.4 LPV Modeling using an H_2 Measure
5.4.1 General Properties
5.4.2 The Optimization Problem
5.5 Computational Aspects of the Optimization Problems
5.5.1 Structure in Variables and Equations
5.5.2 Initialization
5.6 Examples
5.7 Conclusions

6 Controller Synthesis
6.1 Overview
6.2 Static Output-Feedback H_2 Controllers
6.2.1 Continuous Time
6.2.2 Discrete Time
6.3 Static Output-Feedback H_2 LPV Controllers
6.4 Computational Aspects
6.5 Examples
6.6 Conclusions

7 Examples of Applications
7.1 Aircraft Example
7.1.1 LPV Simplification
7.1.2 Model Reduction
7.2 Model Reduction in System Identification
7.3 Conclusions

8 Concluding Remarks

Bibliography
Notation

Symbols, Operators and Functions

N: the set of natural numbers
R: the set of real numbers
C: the set of complex numbers
O: Ordo
∈: belongs to
[a, b]: the closed interval from a to b
i: √−1
ā: the complex conjugate of a
Re a: the real part of a
Im a: the complex part of a
ẋ(t): the time derivative of the function x(t)
e_i: the unit vector with a one in the i:th element
ā: the elementwise complex conjugate of the vector a
A: matrices are denoted by bold, upright capitalized letters
I: the identity matrix
0: a matrix with only zeros
[A]_{ij}: element (i, j) of the matrix A
A^T: the transpose of A
A^*: the complex conjugate transpose of A
A^{-1}: the inverse of A
A ≻ (⪰) 0: A is a positive (semi)definite matrix
A ≺ (⪯) 0: A is a negative (semi)definite matrix
tr A: the trace of the matrix A
rank A: the rank of the matrix A
∂A/∂a: the elementwise differentiation of the matrix A with respect to the scalar variable a
‖·‖_2: for vectors the two-norm, and for matrices the induced two-norm
‖·‖_F: the Frobenius norm
‖·‖_{H_2}: the H_2 norm for dynamical systems
‖·‖_{H_2,ω}: the frequency-limited H_2 norm for dynamical systems, defined in Chapter 3
‖·‖_{H_∞}: the H_∞ norm for dynamical systems
N(μ, σ²): the Gaussian distribution with mean μ and variance σ²
E(X): the expected value of the random variable X
Cov(X): the covariance matrix of the random variable X

Abbreviations

LTI: Linear time-invariant
LPV: Linear parameter-varying
LTV: Linear time-varying
LFT: Linear fractional transformation
LFR: Linear fractional representation
SISO: Single input single output
MISO: Multiple input single output
SIMO: Single input multiple output
MIMO: Multiple input multiple output
OE: Output error
QP: Quadratic programming
SDP: Semidefinite programming
NLP: Nonlinear programming
LMI: Linear matrix inequality
BMI: Bilinear matrix inequality
BFGS: Broyden-Fletcher-Goldfarb-Shanno
COFCLUO: Clearance of flight control laws using optimization
LS: Least squares
LASSO: Least absolute shrinkage and selection operator
SVD: Singular value decomposition
1 Introduction
Mathematical models of physical systems are pervasive in engineering. These models can be used to analyze properties of the systems, to simulate the systems, or to synthesize controllers. However, many of these models are too complex or too large for standard analysis and synthesis methods to be applicable. Hence, there is a need to be able to reduce the complexity of models. The main goal of this thesis is to develop methods for reducing the complexity of different systems by minimizing the H_2 norm between the large complex system and the reduced system.

Many of the early methods for controller synthesis and model reduction rely on linear algebra and solutions to Lyapunov and Riccati equations. Later, when solvers for more general and advanced optimization methods were developed, it became possible to formulate many of the problems in control theory as, for example, semidefinite programs to be solved using interior-point solvers. However, many of these programs included not only linear matrix inequalities (LMIs), but also bilinear matrix inequalities (BMIs), which make the problems nonconvex. This, and the fact that semidefinite programs generally do not scale well with the number of variables, sometimes makes these problems time consuming and difficult to solve. In this thesis, we take a step back, and instead try to keep the original structure of the problem, formulate a general nonlinear optimization problem using linear algebra and Lyapunov equations, and use a general quasi-Newton solver to solve the problem. The problems formulated in this thesis are still nonconvex, but since the original structure of the problem is kept and a more direct approach is used, it is possible to, for example, impose certain structural constraints on the system matrices and still be able to use the methods for medium-scale systems.
1.1 Outline of the Thesis
Most of the results in this thesis concern the minimization of the H_2 norm of various linear time-invariant (LTI) systems with different structures, and how to utilize the different characteristics of the different problems. Most of the results are based on standard concepts in matrix theory, linear systems theory and optimization. A brief overview of the necessary concepts in matrix theory, linear systems theory and optimization is presented in Chapter 2.
In Chapter 3, the concept of frequency-limited Gramians is presented. Additionally, complete derivations for both the discrete-time case and the continuous-time case are given. These are then used to form a frequency-limited H_2 norm, which is later used in some of the proposed algorithms.
In Chapter 4, a short overview of the model-reduction problem is given before a number of model-reduction algorithms are presented. These algorithms all try to utilize the different structures of the equations to be able to solve the problems efficiently using quasi-Newton methods.
In Chapter 5, a number of methods for generating linear parameter-varying models, using the model-reduction methods in Chapter 4 as a foundation, are presented.
In Chapter 6, methods for designing H_2 controllers, both for linear time-invariant systems and linear parameter-varying systems, are presented. These methods are based on the same procedure as the methods in Chapter 4 and Chapter 5.
Chapter 7 presents two larger examples that highlight some properties and applications of the model-reduction and linear parameter-varying algorithms. One example shows a flight clearance application for an Airbus aircraft model, and the other highlights the connections between H_2 model reduction and system identification.
Finally, in Chapter 8, some concluding remarks about the results and suggestions for future research directions are presented.
1.2 Contributions
The first main contributions in the thesis are the model-reduction methods presented in Chapter 4, especially the frequency-limited model reduction in Section 4.4.3, and the unified and complete derivation of the frequency-limited Gramians and frequency-limited H_2 norm in Chapter 3, which are based on the publication

Daniel Petersson and Johan Löfberg. Model reduction using a frequency-limited H_2 cost. arXiv preprint arXiv:1212.1603, December 2012a. URL http://arxiv.org/abs/1212.1603,

which has been submitted to Systems and Control Letters.
The second main contributions in the thesis are the linear parameter-varying generating methods in Chapter 5. To be able to reduce the complexity of a linear parameter-varying model, the idea of model reduction is used to obtain methods that are invariant to state transformations. These results are based on the publication

Daniel Petersson and Johan Löfberg. Optimization based LPV approximation of multi-model systems. In Proceedings of the European Control Conference, pages 3172-3177, Budapest, Hungary, 2009,

which was extended with

Daniel Petersson and Johan Löfberg. Robust generation of LPV state-space models using a regularized H_2 cost. In Proceedings of the IEEE International Symposium on Computer-Aided Control System Design, pages 1170-1175, Yokohama, Japan, 2010,

to be able to handle uncertainties in the data. These publications, with some extensions, have also been published in

Daniel Petersson. Nonlinear optimization approaches to H_2 norm based LPV modelling and control. Licentiate thesis no. 1453, Department of Electrical Engineering, Linköping University, 2010,

and

Daniel Petersson and Johan Löfberg. Optimization Based Clearance of Flight Control Laws - A Civil Aircraft Application, chapter Identification of LPV State-Space Models Using H_2 Minimisation, pages 111-128. Springer, 2012b,

and have been submitted as

Daniel Petersson and Johan Löfberg. Optimization-based modeling of LPV systems using an H_2 objective. Submitted to International Journal of Control, December 2012c.
Additionally, an extension of the linear parameter-varying generating methods is presented, where it is possible to control the rank of the coefficient matrices in the resulting linear parameter-varying model.
The third main contributions are the H_2 controller-synthesis methods in Chapter 6, which use ideas similar to the other contributions to synthesize H_2 controllers instead. This chapter is partly based on the publication

Daniel Petersson and Johan Löfberg. LPV H_2 controller synthesis using nonlinear programming. In Proceedings of the 18th IFAC World Congress, pages 6692-6696, Milan, Italy, 2011.
2 Preliminaries
This chapter begins by presenting some theory and concepts from system theory. Some basic optimization background, with focus on quasi-Newton methods, is then presented. The chapter finishes with some matrix theory that will be used in the thesis, where, for example, the concept of matrix functions is presented.
2.1 System Theory
This section reviews some of the standard system theoretical concepts and explains some system norms that will be used in the thesis.
2.1.1 Basic Theory and Notation
In engineering, mathematical models are often described, in continuous time, by ordinary differential equations. An important subclass of these models is the class of systems of linear ordinary differential equations with constant coefficients. The models in this class, which are called linear time-invariant (LTI) models, can mathematically be described, for a continuous-time model, as

    ẋ(t) = Ax(t) + Bu(t),    (2.1a)
    y(t) = Cx(t) + Du(t),    (2.1b)

and for a discrete-time model with sample time T_S as

    x(t + T_S) = Ax(t) + Bu(t),    (2.2a)
    y(t) = Cx(t) + Du(t),          (2.2b)
where x(t) ∈ R^{n_x} is a vector containing the states of the system, u(t) ∈ R^{n_u} is a vector containing the input to the system, and y(t) ∈ R^{n_y} is a vector containing the output of the system. The matrices A, B, C and D are constant matrices of suitable dimensions, where A describes the dynamics of the system, B describes how the input enters the system, and C and D describe what is being measured from the system. The system in (2.1) is expressed in state-space form; the corresponding transfer-function form, for the system from u(t) to y(t), is

    Y(s) = G(s)U(s),
where U(s) and Y(s) are the Laplace transforms of u(t) and y(t), and

    G(s) = C(sI − A)^{-1}B + D ≜ [ A  B ; C  D ].

Here, the notation [ A  B ; C  D ] is introduced as the transfer function of the system given a particular realization, A, B, C and D.
In discrete time, difference equations are used to describe the dynamics of the system, (2.2), and consequently the z-transform is used instead of the Laplace transform to express the transfer function, i.e., given the discrete-time system in (2.2), the transfer function becomes

    G(z) = C(zI − A)^{-1}B + D.
The vector x, describing the states, can be transformed into a new basis, x̂, using an invertible matrix, T, i.e., x̂ = Tx. This yields the realization

    ẋ̂(t) = TAT^{-1}x̂(t) + TBu(t),    (2.3a)
    y(t) = CT^{-1}x̂(t) + Du(t).       (2.3b)
The transfer function for this system is

    Ĝ(s) = CT^{-1}(sI − TAT^{-1})^{-1}TB + D = C(sI − A)^{-1}B + D = G(s),    (2.4)

thus, there exist infinitely many realizations of a system.
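The invariance in (2.4) is easy to check numerically. The sketch below is purely illustrative: all matrices, and the transformation T, are hypothetical values, not taken from the thesis. It evaluates the transfer function of a realization and of its transformed counterpart (2.3) at an arbitrary complex point:

```python
import numpy as np

# Hypothetical stable system; the values are illustrative only.
A = np.array([[-1.0, 0.5],
              [0.0, -2.0]])
B = np.array([[1.0],
              [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

# An arbitrary invertible state transformation T.
T = np.array([[2.0, 1.0],
              [0.0, 1.0]])
Ti = np.linalg.inv(T)

def tf(A, B, C, D, s):
    """Evaluate G(s) = C (sI - A)^{-1} B + D at a complex point s."""
    n = A.shape[0]
    return C @ np.linalg.solve(s * np.eye(n) - A, B) + D

s = 0.3 + 1.7j
G1 = tf(A, B, C, D, s)                    # original realization
G2 = tf(T @ A @ Ti, T @ B, C @ Ti, D, s)  # transformed realization, cf. (2.3)
assert np.allclose(G1, G2)                # same transfer function, cf. (2.4)
```

Since T only changes the state coordinates, every invertible T yields the same input-output map, which is exactly the statement of (2.4).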
2.1.2 Gramians
Two important entities in system theory, used for determining system properties, are the controllability Gramian, P, and the observability Gramian, Q. The equations for these differ between continuous and discrete time, and the rest of the section is therefore split into two parts, one for continuous time and one for discrete time.
Continuous-Time Systems

Definition 2.1. The controllability and observability Gramians, in the continuous-time domain, of the system (2.1) are defined as

    P ≜ ∫_0^∞ e^{Aτ} BB^T e^{A^T τ} dτ,    (2.5a)
    Q ≜ ∫_0^∞ e^{A^T τ} C^T C e^{Aτ} dτ.   (2.5b)
The Gramians in (2.5) can also be written as the stationary solutions to the differential equations

    Ṗ = AP + PA^T + BB^T,    (2.6a)
    Q̇ = A^T Q + QA + C^T C,  (2.6b)

with Ṗ = Q̇ = 0, thus becoming solutions to the algebraic equations, called Lyapunov equations,

    0 = AP + PA^T + BB^T,    (2.7a)
    0 = A^T Q + QA + C^T C.  (2.7b)
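In practice the Gramians are obtained by solving the Lyapunov equations (2.7) rather than the integrals (2.5). A minimal numerical sketch follows; the system matrices are hypothetical, and a production code would use a dedicated solver such as scipy.linalg.solve_continuous_lyapunov or MATLAB's lyap rather than the naive Kronecker vectorization used here:

```python
import numpy as np

def clyap(A, W):
    """Solve A X + X A^T + W = 0 by vectorization:
    (I kron A + A kron I) vec(X) = -vec(W)."""
    n = A.shape[0]
    M = np.kron(np.eye(n), A) + np.kron(A, np.eye(n))
    x = np.linalg.solve(M, -W.reshape(-1, order="F"))
    return x.reshape((n, n), order="F")

# Hypothetical stable (A Hurwitz) system, for illustration only.
A = np.array([[-1.0, 1.0],
              [0.0, -3.0]])
B = np.array([[0.0],
              [1.0]])
C = np.array([[1.0, 1.0]])

P = clyap(A, B @ B.T)    # controllability Gramian, (2.7a)
Q = clyap(A.T, C.T @ C)  # observability Gramian, (2.7b)

# The residuals of the Lyapunov equations should vanish.
assert np.allclose(A @ P + P @ A.T + B @ B.T, 0)
assert np.allclose(A.T @ Q + Q @ A + C.T @ C, 0)
```

The Kronecker formulation costs O(n^6) and is only meant to make the equations concrete; Bartels-Stewart type solvers bring the cost down to O(n^3).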
By using Parseval's identity on (2.5), the Gramians can be expressed in the frequency domain.

Definition 2.2. The controllability and observability Gramians, in the frequency domain, for the system (2.1) are defined as

    P ≜ (1/2π) ∫_{-∞}^{∞} H(iν) BB^T H^*(iν) dν,    (2.8a)
    Q ≜ (1/2π) ∫_{-∞}^{∞} H^*(iν) C^T C H(iν) dν,   (2.8b)

where H(iω) ≜ (iωI − A)^{-1} and H^* denotes the conjugate transpose of H.
One important observation to make, both for the Gramians in continuous time and in discrete time (see Section 2.1.2), is that the Gramians depend on which realization of the system is used. If the state is transformed using an invertible matrix T, the Gramians change as

    P_T = T^{-1} P T^{-T},    (2.9a)
    Q_T = T^T Q T.            (2.9b)
Hence, the eigenvalues of the Gramians change if a state transformation is performed. However, the eigenvalues of the product of the Gramians, λ(PQ), are invariant to state transformations, since

    λ_i(P_T Q_T) = λ_i(T^{-1} P T^{-T} T^T Q T) = λ_i(T^{-1} PQ T) = λ_i(PQ) ≜ σ_i²,    (2.10)

where σ_i is called a Hankel singular value of the system.
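The invariance (2.10) can be checked numerically. In the sketch below (hypothetical matrices, chosen only for illustration) the Gramians themselves change under the transformation, but the Hankel singular values do not:

```python
import numpy as np

def clyap(A, W):
    # Solve A X + X A^T + W = 0 via Kronecker vectorization.
    n = A.shape[0]
    M = np.kron(np.eye(n), A) + np.kron(A, np.eye(n))
    return np.linalg.solve(M, -W.reshape(-1, order="F")).reshape((n, n), order="F")

# Hypothetical stable, minimal system and an arbitrary invertible T.
A = np.array([[-2.0, 0.0],
              [1.0, -1.0]])
B = np.array([[1.0],
              [0.0]])
C = np.array([[0.0, 1.0]])
T = np.array([[1.0, 2.0],
              [0.0, 1.0]])
Ti = np.linalg.inv(T)

P, Q = clyap(A, B @ B.T), clyap(A.T, C.T @ C)
At, Bt, Ct = T @ A @ Ti, T @ B, C @ Ti  # transformed realization
Pt, Qt = clyap(At, Bt @ Bt.T), clyap(At.T, Ct.T @ Ct)

# P and Q individually change with the realization ...
assert not np.allclose(P, Pt)
# ... but the Hankel singular values sqrt(lambda(PQ)) do not, cf. (2.10).
hsv = np.sort(np.sqrt(np.linalg.eigvals(P @ Q).real))
hsvt = np.sort(np.sqrt(np.linalg.eigvals(Pt @ Qt).real))
assert np.allclose(hsv, hsvt)
```

This invariance is the reason the Hankel singular values, and not the Gramian eigenvalues themselves, are meaningful system properties.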
The Gramians, both in continuous time and in discrete time, can be interpreted physically (see, e.g., Skogestad and Postlethwaite [2007] or Antoulas [2005]). Given a state x, the smallest amount of energy needed to steer the system from 0 to x is given by

    x^T P^{-1} x,    (2.11)

and the observability Gramian describes the energy obtained by observing the output of the system with initial condition x, given no other input,

    x^T Q x.    (2.12)

This holds for both continuous-time and discrete-time systems.
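The interpretation (2.12) can be verified numerically: for the autonomous system ẋ = Ax with x(0) = x0, the output energy ∫‖y‖² dt equals x0^T Q x0. The sketch below uses hypothetical matrices, and the one-step propagator is a plain Taylor series of the matrix exponential, which is adequate here only because the step is small:

```python
import numpy as np

def clyap(A, W):
    # Solve A X + X A^T + W = 0 via Kronecker vectorization.
    n = A.shape[0]
    M = np.kron(np.eye(n), A) + np.kron(A, np.eye(n))
    return np.linalg.solve(M, -W.reshape(-1, order="F")).reshape((n, n), order="F")

def expm_taylor(M, terms=15):
    # Taylor series for e^M; accurate here since ||M|| is small.
    E, term = np.eye(M.shape[0]), np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        E = E + term
    return E

# Hypothetical stable system and initial state.
A = np.array([[-1.0, 0.0],
              [1.0, -2.0]])
C = np.array([[1.0, 1.0]])
x0 = np.array([1.0, -0.5])

Q = clyap(A.T, C.T @ C)  # observability Gramian, (2.7b)

# Simulate y(t) = C e^{At} x0 and integrate the output energy.
dt, steps = 1e-3, 30000
Phi = expm_taylor(A * dt)  # exact one-step propagator (up to Taylor truncation)
x, y2 = x0, []
for _ in range(steps + 1):
    y2.append(float(C @ x) ** 2)
    x = Phi @ x
y2 = np.array(y2)
energy = dt * (np.sum(y2) - 0.5 * (y2[0] + y2[-1]))  # trapezoidal rule

assert abs(energy - x0 @ Q @ x0) < 1e-3 * (x0 @ Q @ x0)
```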
Discrete-Time Systems

Definition 2.3. The controllability and observability Gramians, in discrete time, of the system (2.2) are defined as

    P ≜ Σ_{k=0}^{∞} A^k BB^T (A^k)^T,    (2.13a)
    Q ≜ Σ_{k=0}^{∞} (A^k)^T C^T C A^k.   (2.13b)
These Gramians also satisfy the discrete Lyapunov equations

    0 = APA^T − P + BB^T,    (2.14a)
    0 = A^T QA − Q + C^T C.  (2.14b)
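A sketch of how (2.13) and (2.14) relate numerically follows. The matrices are hypothetical, and a library routine such as scipy.linalg.solve_discrete_lyapunov would normally be used instead of the vectorized solver shown:

```python
import numpy as np

def dlyap(A, W):
    """Solve A X A^T - X + W = 0 via (I - A kron A) vec(X) = vec(W)."""
    n = A.shape[0]
    M = np.eye(n * n) - np.kron(A, A)
    return np.linalg.solve(M, W.reshape(-1, order="F")).reshape((n, n), order="F")

# Hypothetical Schur-stable system (all eigenvalues of A inside the unit circle).
A = np.array([[0.5, 0.1],
              [0.0, -0.3]])
B = np.array([[1.0],
              [1.0]])
C = np.array([[1.0, 0.0]])

P = dlyap(A, B @ B.T)    # (2.14a)
Q = dlyap(A.T, C.T @ C)  # (2.14b)

# Cross-check P against a truncation of the series definition (2.13a);
# the truncation error is negligible since rho(A) = 0.5.
P_series = sum(np.linalg.matrix_power(A, k) @ B @ B.T @ np.linalg.matrix_power(A, k).T
               for k in range(100))
assert np.allclose(P, P_series)
assert np.allclose(A.T @ Q @ A - Q + C.T @ C, 0)
```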
In the frequency domain, the discrete-time Gramians are defined as follows.

Definition 2.4. The controllability and observability Gramians, in the frequency domain, for the system (2.2) are defined as

    P ≜ (1/2π) ∫_{-π}^{π} H(e^{iν}) BB^T H^*(e^{iν}) dν,    (2.15a)
    Q ≜ (1/2π) ∫_{-π}^{π} H^*(e^{iν}) C^T C H(e^{iν}) dν,   (2.15b)

where H(e^{iω}) ≜ (e^{iω}I − A)^{-1} and H^* denotes the conjugate transpose of H.
2.1.3 System Norms

System norms are important tools when it comes to comparing and analyzing systems. In this thesis, mainly the H_2 norm will be used. In this section, the two most commonly used norms in system theory, namely the H_2 norm and the H_∞ norm, are presented and defined.
Given a system G = [ A  B ; C  D ] such that

    ẋ(t) = Ax(t) + Bw(t),    (2.16a)
    z(t) = Cx(t) + Dw(t),    (2.16b)

where x is the state, w is a disturbance and z is the output of interest. Suppose a system that guarantees a certain performance is wanted, e.g., that w does not influence z too much. The system norms are functions that quantify this into something computationally tractable, with different interpretations. System norms can be interpreted as norms that answer the question: "given information about the allowed input, how large can the output be?".

To be able to do this, two signal norms that will be used to interpret the system norms are defined.
Definition 2.5 (L_2, 2-norm in time). The L_2 norm for square integrable signals is defined by

    ‖e(t)‖_{L_2} ≜ ( ∫_0^∞ ‖e(τ)‖_2² dτ )^{1/2}.    (2.17)

‖e(t)‖_{L_2} is also referred to as the energy of the signal e(t).
Definition 2.6 (L_∞, ∞-norm in time). The L_∞ norm for magnitude-bounded signals is defined as

    ‖e(t)‖_{L_∞} ≜ sup_{τ≥0} ‖e(τ)‖_2.    (2.18)

For a scalar signal e(t), ‖e(t)‖_{L_∞} is simply the peak of the signal.

These signal norms are used to define some system norms in the next section.
Continuous-Time H_2 Norm

For a SISO system G, which has the realization (2.16) with A Hurwitz and D = 0, the H_2 norm can be defined as

    ‖G‖_{H_2} ≜ sup_{‖w(t)‖_{L_2} ≤ 1} ‖z(t)‖_{L_∞}.    (2.19)
For some physical interpretations of the H_2 norm, see for example Skogestad and Postlethwaite [2007], Skelton et al. [1998] or Zhou et al. [1996]. However, the definition that will be used mostly in this thesis is

Definition 2.7 (H_2 norm). For an asymptotically stable (A Hurwitz) and strictly proper (D = 0) continuous-time system, G, the H_2 norm is defined as

    ‖G‖_{H_2} ≜ ( (1/2π) ∫_{-∞}^{∞} tr( G^*(iν) G(iν) ) dν )^{1/2}.    (2.20)
One important thing to note about the H_2 norm is that it is, in contrast to the H_∞ norm (see Section 2.1.3), not an induced norm, and does not, in general, satisfy the multiplicative property, ‖GF‖_{H_2} ≤ ‖G‖_{H_2} ‖F‖_{H_2}, with G and F being two LTI systems. This property, when it holds, makes it possible to analyze individual systems in series to conclude facts about the interconnected system.
The forms in (2.19) and (2.20) are not suitable for actual evaluation of the norm. However, the H_2 norm can be expressed in a more computationally friendly form. The H_2 norm in (2.20) can be rewritten, given a system G with a realization as in (2.16), using the Gramians in (2.5), to

    ‖G‖²_{H_2} = (1/2π) ∫_{-∞}^{∞} tr( G^*(iν) G(iν) ) dν
               = (1/2π) ∫_{-∞}^{∞} tr( B^T H^*(iν) C^T C H(iν) B ) dν = tr( B^T Q B ),    (2.21a)
    ‖G‖²_{H_2} = (1/2π) ∫_{-∞}^{∞} tr( G(iν) G^*(iν) ) dν
               = (1/2π) ∫_{-∞}^{∞} tr( C H(iν) BB^T H^*(iν) C^T ) dν = tr( C P C^T ),     (2.21b)

where P and Q satisfy

    0 = AP + PA^T + BB^T,    (2.22a)
    0 = A^T Q + QA + C^T C.  (2.22b)
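The two trace formulas in (2.21) give the same value, and both agree with the frequency-domain definition (2.20). A numerical sketch follows; the system is hypothetical, and the frequency integral is truncated to a wide finite grid, so the last comparison is only approximate:

```python
import numpy as np

def clyap(A, W):
    # Solve A X + X A^T + W = 0 via Kronecker vectorization.
    n = A.shape[0]
    M = np.kron(np.eye(n), A) + np.kron(A, np.eye(n))
    return np.linalg.solve(M, -W.reshape(-1, order="F")).reshape((n, n), order="F")

# Hypothetical strictly proper, stable system: G(s) = 1/((s+1)(s+2)).
A = np.array([[-1.0, 0.0],
              [1.0, -2.0]])
B = np.array([[1.0],
              [0.0]])
C = np.array([[0.0, 1.0]])

P = clyap(A, B @ B.T)
Q = clyap(A.T, C.T @ C)
h2_P = np.sqrt(np.trace(C @ P @ C.T))  # tr(C P C^T), (2.21b)
h2_Q = np.sqrt(np.trace(B.T @ Q @ B))  # tr(B^T Q B), (2.21a)
assert np.isclose(h2_P, h2_Q)

# Approximate the frequency-domain definition (2.20) on a truncated grid.
w = np.linspace(-200.0, 200.0, 20001)
g2 = np.array([abs(C @ np.linalg.solve(1j * wk * np.eye(2) - A, B)) ** 2
               for wk in w]).ravel()
dw = w[1] - w[0]
integral = dw * (np.sum(g2) - 0.5 * (g2[0] + g2[-1]))  # trapezoidal rule
h2_freq = np.sqrt(integral / (2 * np.pi))
assert abs(h2_freq - h2_P) < 1e-3
```

The Gramian formulas reduce the cost of an H_2 norm evaluation from a frequency sweep to one Lyapunov solve, which is what makes the optimization formulations in this thesis tractable.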
Discrete-Time H_2 Norm

All the material for the continuous-time case is readily extended to the discrete-time case.
Definition 2.8 (H_2 norm). For an asymptotically stable (A Schur) discrete-time system, G, the H_2 norm is defined as

    ‖G‖²_{H_2} ≜ (1/2π) ∫_{-π}^{π} tr( G^*(e^{iν}) G(e^{iν}) ) dν.    (2.23)
An important observation here is that the system does not have to be strictly proper for the
H
2
norm to be deﬁned. As in the continuoustime case, the above deﬁnition is not in a computationally friendly form, and (2.23) can be reformulated using the deﬁnitions of the discretetime Gramians, (2.13), which yields

G
 2
H
2
=
1
π
tr
G
∗
2
π
= tr
−
π
B
T
QB +
(e
iν
D
T
)
D
G
(e
iν
)d
ν
=
π
1
2
π
−
π
tr
G
(e
iν
)
G
∗
(e
iν
)d
ν
= tr CPC
T
+ DD
T
,
where P and Q satisfy
0 =
0 =
APA
T
−
A
T
QA
P
−
Q
+ BB
T
,
+ C
T
C
.
(2.24a)
(2.24b)
(2.25a)
(2.25b)
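The discrete-time counterpart follows the same pattern; note how the direct term D now enters, consistent with (2.24). A sketch under the same assumptions as before (the helper name and test system are illustrative):

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def h2_norm_dt(A, B, C, D):
    """Discrete-time H2 norm via tr(B^T Q B + D^T D), eq. (2.24a)."""
    # Observability Gramian Q solves A^T Q A - Q + C^T C = 0, eq. (2.25b)
    Q = solve_discrete_lyapunov(A.T, C.T @ C)
    return float(np.sqrt(np.trace(B.T @ Q @ B + D.T @ D)))

# G(z) = 1/(z - 0.5): impulse response 0.5^(k-1) for k >= 1, so ||G||^2 = 4/3
A = np.array([[0.5]]); B = np.array([[1.0]])
C = np.array([[1.0]]); D = np.array([[0.0]])
```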
Continuous-Time H_∞ Norm

Although our proposed methods revolve around the H_2 measure, the H_∞ measure will be used in various comparisons. Hence, the definition of it will be presented in this section. As with the H_2 norm, the H_∞ norm can be defined using the signal norms presented in Section 2.1.3. Given an asymptotically stable (A Hurwitz) continuous-time system, G, the H_∞ norm is

$$\|G\|_{\mathcal{H}_\infty} \triangleq \max_{w(t)\neq 0} \frac{\|z(t)\|_{\mathcal{L}_2}}{\|w(t)\|_{\mathcal{L}_2}} = \max_{\|w(t)\|_{\mathcal{L}_2}=1} \|z(t)\|_{\mathcal{L}_2}. \qquad (2.26)$$

Looking at (2.26), it can be observed that the H_∞ norm is indeed an induced norm, and hence satisfies the multiplicative property

$$\|GF\|_{\mathcal{H}_\infty} \leq \|G\|_{\mathcal{H}_\infty}\|F\|_{\mathcal{H}_\infty}.$$

This is one reason for the popularity of this norm.
The definition of the H_∞ norm in the frequency domain is

Definition 2.9 (H_∞ norm). For an asymptotically stable (A Hurwitz) continuous-time system, G, the H_∞ norm is, in the frequency domain, defined as

$$\|G\|_{\mathcal{H}_\infty} \triangleq \max_{\omega\in\mathbb{R}} \bar{\sigma}\left(G(i\omega)\right). \qquad (2.27)$$
Observe that for the H_∞ norm, the system does not have to be strictly proper. The H_∞ norm is however not as straightforward to compute as the H_2 norm. One way to compute the H_∞ norm is to compute the smallest value γ such that the Hamiltonian matrix W has no eigenvalues on the imaginary axis, where

$$W = \begin{pmatrix} A + BR^{-1}D^T C & BR^{-1}B^T \\ -C^T\left(I + DR^{-1}D^T\right)C & -\left(A + BR^{-1}D^T C\right)^T \end{pmatrix} \qquad (2.28)$$

and $R \triangleq \gamma^2 I - D^T D$.
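The eigenvalue test above suggests a simple bisection on γ. The sketch below implements this idea; the imaginary-axis test with an absolute threshold and the bound-growing loop are ad-hoc assumptions for illustration, not a robust implementation (production code would use refinements such as the Bruinsma-Steinbuch algorithm):

```python
import numpy as np

def hinf_norm(A, B, C, D, tol=1e-6):
    """H-infinity norm by bisection on gamma: ||G|| < gamma iff the
    Hamiltonian W(gamma) of (2.28) has no imaginary-axis eigenvalues."""
    m, p = B.shape[1], C.shape[0]
    def gamma_is_upper_bound(g):
        R = g**2 * np.eye(m) - D.T @ D
        Ri = np.linalg.inv(R)
        Ac = A + B @ Ri @ D.T @ C
        W = np.block([[Ac, B @ Ri @ B.T],
                      [-C.T @ (np.eye(p) + D @ Ri @ D.T) @ C, -Ac.T]])
        eigs = np.linalg.eigvals(W)
        return not np.any(np.abs(eigs.real) < 1e-8)   # ad-hoc threshold
    lo = np.linalg.norm(D, 2)          # gamma must exceed sigma_max(D)
    hi = lo + 1.0
    while not gamma_is_upper_bound(hi):  # grow upper bound until valid
        hi = 2.0 * hi + 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gamma_is_upper_bound(mid):
            hi = mid
        else:
            lo = mid
    return hi
```

For G(s) = 1/(s + 1), whose H_∞ norm is 1 (peak gain at ω = 0), the bisection converges to 1 to within the tolerance.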
Discrete-Time H_∞ Norm

The material for the continuous-time case is readily extended to the discrete-time case. The definition of the H_∞ norm in discrete time becomes

Definition 2.10 (H_∞ norm). For an asymptotically stable (A Schur) discrete-time system, G, the H_∞ norm is, in the frequency domain, defined as

$$\|G\|_{\mathcal{H}_\infty} \triangleq \max_{\omega\in[-\pi,\pi]} \bar{\sigma}\left(G(e^{i\omega})\right). \qquad (2.29)$$
2.1.4 Output-Feedback Controller

An output-feedback controller, K, of order n_K can be described as a linear system

$$\dot{x}_K(t) = K_A x_K(t) + K_B y(t), \qquad (2.30a)$$
$$u(t) = K_C x_K(t) + K_D y(t), \qquad (2.30b)$$

where $x_K \in \mathbb{R}^{n_K}$ is the state vector of the controller, $y \in \mathbb{R}^{n_y}$ the measurement signal and $u \in \mathbb{R}^{n_u}$ the control signal. A commonly used model for analyzing systems and measuring performance, which will be used in this thesis, is

$$\begin{pmatrix} \dot{x} \\ z \\ y \end{pmatrix} = \begin{pmatrix} A & B_1 & B_2 \\ C_1 & D_{11} & D_{12} \\ C_2 & D_{21} & D_{22} \end{pmatrix} \begin{pmatrix} x \\ w \\ u \end{pmatrix}, \qquad (2.31)$$

where $x \in \mathbb{R}^{n_x}$ is the state vector, $w \in \mathbb{R}^{n_w}$ the disturbance signal, $z \in \mathbb{R}^{n_z}$ the performance measure, $u \in \mathbb{R}^{n_u}$ the control signal and $y \in \mathbb{R}^{n_y}$ the measurement signal. Here, the matrix $D_{22}$ is assumed, without loss of generality, to be zero, see Zhou et al. [1996]. Combine equations (2.31) and (2.30) to arrive at a state-space representation of the closed-loop system from w to z, see Figure 2.1,

$$T_{w,z} = \begin{bmatrix} A + B_2 K_D C_2 & B_2 K_C & B_1 + B_2 K_D D_{21} \\ K_B C_2 & K_A & K_B D_{21} \\ C_1 + D_{12} K_D C_2 & D_{12} K_C & D_{11} + D_{12} K_D D_{21} \end{bmatrix}. \qquad (2.32)$$
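Assembling (2.32) is mechanical; the sketch below builds the closed-loop matrices and can be sanity-checked by comparing the resulting transfer function at a frequency point against the lower linear fractional transformation $P_{11} + P_{12}K(I - P_{22}K)^{-1}P_{21}$ (the function name is an illustrative assumption):

```python
import numpy as np

def closed_loop(A, B1, B2, C1, C2, D11, D12, D21, KA, KB, KC, KD):
    """Closed-loop realization T_{w,z} of plant (2.31) with controller (2.30),
    assuming D22 = 0, following eq. (2.32)."""
    Acl = np.block([[A + B2 @ KD @ C2, B2 @ KC],
                    [KB @ C2, KA]])
    Bcl = np.vstack([B1 + B2 @ KD @ D21, KB @ D21])
    Ccl = np.hstack([C1 + D12 @ KD @ C2, D12 @ KC])
    Dcl = D11 + D12 @ KD @ D21
    return Acl, Bcl, Ccl, Dcl
```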
The two types of controllers that will be mentioned in this thesis are H_2 and H_∞ controllers. These controllers are designed to minimize the H_2 or H_∞ norm of the closed-loop system, T_{w,z}.

Figure 2.1: Feedback interconnection of the system G and the controller K, with disturbance w, performance output z, measurement y and control signal u.

The problem of finding an H_2 or H_∞ controller can be divided into three cases. The simple case, both for H_2 and H_∞ controllers, is to find a full order controller, n_K = n_x, see e.g., Skogestad and Postlethwaite [2007] or Zhou et al. [1996]. The two more difficult cases are to find a reduced-order controller, 0 < n_K < n_x, or a static output-feedback controller, n_K = 0. However, the problem of computing a reduced-order controller can be reformulated as a static controller problem; this is shown in El Ghaoui et al. [1997] and restated here for clarification.
To see that the problem of finding a reduced-order controller can be reformulated as a static output-feedback controller, first create the augmented system, G_aug,

$$G_{\mathrm{aug}} = \begin{bmatrix} A_{\mathrm{aug}} & B_{1,\mathrm{aug}} & B_{2,\mathrm{aug}} \\ C_{1,\mathrm{aug}} & D_{11,\mathrm{aug}} & D_{12,\mathrm{aug}} \\ C_{2,\mathrm{aug}} & D_{21,\mathrm{aug}} & D_{22,\mathrm{aug}} \end{bmatrix},$$

where

$$A_{\mathrm{aug}} = \begin{bmatrix} A & 0 \\ 0 & 0 \end{bmatrix}, \quad B_{1,\mathrm{aug}} = \begin{bmatrix} B_1 \\ 0 \end{bmatrix}, \quad B_{2,\mathrm{aug}} = \begin{bmatrix} 0 & B_2 \\ I & 0 \end{bmatrix},$$

$$C_{1,\mathrm{aug}} = \begin{bmatrix} C_1 & 0 \end{bmatrix}, \quad C_{2,\mathrm{aug}} = \begin{bmatrix} 0 & I \\ C_2 & 0 \end{bmatrix}, \quad D_{11,\mathrm{aug}} = D_{11},$$

$$D_{12,\mathrm{aug}} = \begin{bmatrix} 0 & D_{12} \end{bmatrix}, \quad D_{21,\mathrm{aug}} = \begin{bmatrix} 0 \\ D_{21} \end{bmatrix}, \quad D_{22,\mathrm{aug}} = 0,$$

with the new state vector augmented with $x_K \in \mathbb{R}^{n_K}$, $x_{\mathrm{aug}} = \begin{bmatrix} x \\ x_K \end{bmatrix}$, the new control signal augmented with $u_K \in \mathbb{R}^{n_K}$, $u_{\mathrm{aug}} = \begin{bmatrix} u_K \\ u \end{bmatrix}$, and the new measurement signal augmented with $y_K \in \mathbb{R}^{n_K}$, $y_{\mathrm{aug}} = \begin{bmatrix} y_K \\ y \end{bmatrix}$. The 0's are matrices of compatible sizes with all elements zero and I are identity matrices of compatible sizes.

Now use the static controller, $u_{\mathrm{aug}} = K_{\mathrm{aug}} y_{\mathrm{aug}}$, on G_aug, where K_aug has the structure

$$K_{\mathrm{aug}} = \begin{bmatrix} K_A & K_B \\ K_C & K_D \end{bmatrix},$$

where K_A, K_B, K_C and K_D are the matrices from the controller in (2.30). Computing the closed-loop equations for this feedback system will lead to the same equations as in (2.32). This shows that any method for computing a static output-feedback controller can also be used to compute a reduced-order controller.
2.1.5 LPV Systems

A natural generalization of LTI systems is linear time-varying systems, LTV systems, where the state-space matrices can depend on time. The drawback is that LTV systems are very hard to analyze and work with. This raises the need of an intermediate step to represent systems, and this is where linear parameter-varying systems, LPV systems, come in. LPV systems depend on scheduling parameters, p, that vary with time but are measurable. A general LPV system can be written, in state-space representation, in continuous time (see Tóth [2008]), as

$$G(p):\quad \dot{x}(t) = A(p)x(t) + B(p)u(t), \qquad y(t) = C(p)x(t) + D(p)u(t), \qquad (2.33)$$

where p is the vector of scheduling parameters. Note that there is no restriction on how the LPV system depends on the scheduling parameters; hence the dependence can be nonlinear and can also involve the time derivative of p. LPV systems have the property that if the scheduling parameters in the LPV system are kept constant, the system becomes a regular LTI system.

As with ordinary LTI systems, the state-space representation of an LPV system is not unique, and it is possible, by applying a state transformation, to change the basis of the states. As with the system matrices, when generalizing to LPV systems from LTI systems, the state transformations can depend on the scheduling parameters, i.e.,

$$x = T(p)\hat{x}, \qquad (2.34)$$

where T(p) is a nonsingular continuously differentiable matrix for all t. Applying this similarity transformation to the system in (2.33) yields

$$\hat{G}(p) = \begin{bmatrix} T^{-1}(p)\left(A(p)T(p) - \dot{T}(p)\right) & T^{-1}(p)B(p) \\ C(p)T(p) & D(p) \end{bmatrix}. \qquad (2.35)$$

Note that there is a term in the new A matrix that depends on the time derivative of the state transformation.
A general discrete-time state-space LPV system can be written as, see Kulcsar and Tóth [2011],

$$G(\mathcal{P}_k) = \begin{bmatrix} A(\mathcal{P}_k) & B(\mathcal{P}_k) \\ C(\mathcal{P}_k) & D(\mathcal{P}_k) \end{bmatrix}, \qquad (2.36)$$

where $\mathcal{P}_k = \{p_{k+j}\}_{j=-\infty}^{\infty}$. By applying a similarity transformation (which can depend on the parameters), i.e.,

$$x_k = T(p_k)\hat{x}_k, \qquad (2.37)$$

where T(p_k) is a nonsingular and bounded matrix for all k, an LPV system with the same behavior but with another state-space representation is constructed,
$$\hat{G}(\mathcal{P}_k) = \begin{bmatrix} T^{-1}(p_{k+1})A(\mathcal{P}_k)T(p_k) & T^{-1}(p_{k+1})B(\mathcal{P}_k) \\ C(\mathcal{P}_k)T(p_k) & D(\mathcal{P}_k) \end{bmatrix}. \qquad (2.38)$$

Looking at how the state transformations work for the LPV system above, one realizes that in one state basis the state-space matrices can depend on only the current value of the parameter, while in another they can also depend on its derivative (in discrete time, the parameter values at time steps other than the current one). Similar behavior can be seen when going from a state-space LPV form to an input-output model structure of the system. For example, study an example from Tóth et al. [2012], where a second order state-space representation of an LPV system is used,

$$x_{k+1} = \begin{bmatrix} 0 & a_2(p_k) \\ 1 & a_1(p_k) \end{bmatrix} x_k + \begin{bmatrix} b_2(p_k) \\ b_1(p_k) \end{bmatrix} u_k, \qquad y_k = \begin{bmatrix} 0 & 1 \end{bmatrix} x_k.$$

This system only depends on the current parameter value, p_k. However, the equivalent input-output form becomes

$$y_k = a_1(p_{k-1})y_{k-1} + a_2(p_{k-2})y_{k-2} + b_1(p_{k-1})u_{k-1} + b_2(p_{k-2})u_{k-2},$$

which clearly depends on more than only the current parameter value. Hence, it is important to note, when working with LPV systems, whether one is working with state-space or input-output forms, since these can give rise to different dependencies on the parameters.
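The equivalence of the two forms, and the shift in parameter dependence, is easy to verify by simulation: the state-space recursion uses only p_k at time k, while the input-output recursion needs p_{k-1} and p_{k-2}. The coefficient functions and signals below are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 20
p = rng.standard_normal(N)            # scheduling trajectory (assumed)
u = rng.standard_normal(N)            # input signal (assumed)
a1 = lambda q: 0.3 * q                # illustrative parameter dependence
a2 = lambda q: 0.1 * q
b1 = lambda q: 1.0 + 0.2 * q
b2 = lambda q: 0.5 * q

# state-space form: matrices depend only on the current p_k
x = np.zeros(2)
y_ss = np.zeros(N)
for k in range(N):
    y_ss[k] = x[1]
    x = np.array([a2(p[k]) * x[1] + b2(p[k]) * u[k],
                  x[0] + a1(p[k]) * x[1] + b1(p[k]) * u[k]])

# input-output form: coefficients depend on p_{k-1} and p_{k-2}
y_io = np.zeros(N)
for k in range(1, N):
    y_io[k] = a1(p[k-1]) * y_io[k-1] + b1(p[k-1]) * u[k-1]
    if k >= 2:
        y_io[k] += a2(p[k-2]) * y_io[k-2] + b2(p[k-2]) * u[k-2]
```

With zero initial conditions the two output sequences coincide exactly.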
2.2 Optimization

This section starts by giving a brief presentation of optimization and some methods that can be used to solve optimization problems. The presentation will closely follow relevant sections in Nocedal and Wright [2006].

Most optimization problems can mathematically be written as

$$\underset{x}{\text{minimize}} \quad f(x)$$
$$\text{subject to} \quad g_{I,i}(x) \leq 0, \quad i = 1,\ldots,m_I,$$
$$\phantom{\text{subject to}} \quad g_{E,i}(x) = 0, \quad i = 1,\ldots,m_E,$$

where f(x) is the cost function, $f : \mathbb{R}^n \to \mathbb{R}$ and $x \in \mathbb{R}^n$, and $g_{I,i}(x)$, $g_{E,i}(x)$ are constraint functions. A vector $x^*$ is called optimal if it produces the smallest value of the cost function of all the x that satisfy the constraints. In this thesis, the problems will mostly be unconstrained, i.e., problems without any $g_{I,i}(x)$ or $g_{E,i}(x)$. The value attained at the solution, $x^*$, to the optimization problem, $f(x^*)$, is called a minimum. This can either be a local or a global minimum, and the point where this value is attained, $x^*$, is called a minimizer (local or global). One way to be able to classify when a minimum is attained is to use first order necessary conditions.
Optimization problems can be divided into two classes, convex optimization problems and nonconvex optimization problems. The problems of interest in this thesis will be nonconvex. To explain what a nonconvex problem is, a convex problem is presented first.

First, define a convex set. A convex set, $\mathcal{N}$, is a set such that any point, z, on a line between any two points, x, y, in the set also lies in the set, i.e.,

$$\theta x + (1-\theta)y = z \in \mathcal{N}, \quad \forall\, \theta \in [0,1], \quad x, y \in \mathcal{N}. \qquad (2.39)$$

A convex function is defined in the same manner. A function is convex if it satisfies

$$f(\theta x + (1-\theta)y) \leq \theta f(x) + (1-\theta)f(y)$$

for all $x, y \in \mathcal{N}$ and $\theta \in [0,1]$, where $\mathcal{N}$ is a convex set.

A convex optimization problem is an optimization problem where both the cost function and the feasible set, the set of x's defined by the constraints, are convex. Convex optimization problems have the feature that a local minimizer is always a global minimizer. This means that when a minimum is found in a convex optimization problem it is the global minimum. This guarantee does not exist in general for nonconvex optimization problems. The problem of finding the global minimizer for a general nonconvex optimization problem is difficult and often only local minimizers are sought. For further reading see e.g., Nocedal and Wright [2006].
2.2.1 Local Methods

One approach to solve nonconvex optimization problems is to use local methods, i.e., methods that seek a local minimizer: a point that in a neighborhood of feasible points has the smallest value of the cost function. A class of local methods which is widely used today in solving nonlinear nonconvex problems is the class of quasi-Newton line-search methods. These methods typically require that the cost function is twice continuously differentiable, at least for the convergence theory to hold. However, in practice, these methods have been shown to work well on certain nonsmooth problems as well, see for example Lewis and Overton [2012].

The line search strategy is to find a direction $p_k$ and a step $\alpha_k$ such that

$$f_k \triangleq f(x_k) > f(x_k + \alpha_k p_k). \qquad (2.40)$$

There exist many suggestions of how to find the direction $p_k$ and the step length $\alpha_k$. One suggestion, and maybe the most obvious, is to take the steepest descent direction, which is

$$p_k = -\frac{\nabla f_k}{\|\nabla f_k\|},$$

and choose $\alpha_k$ as

$$\alpha_k \triangleq \underset{\alpha}{\arg\min}\ f(x_k + \alpha p_k).$$

A benefit with the choice $p_k = -\nabla f_k / \|\nabla f_k\|$ is that only information about the gradient is needed and no second-order information, i.e., information about the Hessian. The problem with choosing the steepest descent direction is that the convergence can be extremely slow.
By exploiting second-order information about the cost function a better search direction can be produced. Assume a model function

$$m_k(p) \triangleq f_k + p^T \nabla f_k + \frac{1}{2} p^T \nabla^2 f_k\, p,$$

that approximates the function f well in a neighborhood of $x_k$, and then define $p_k$ to be the solution to

$$\underset{p}{\text{minimize}}\ m_k(p),$$

i.e., $p_k = -\left(\nabla^2 f_k\right)^{-1}\nabla f_k$, and $\alpha_k$ is chosen according to some conditions; for more detail see, for example, Nocedal and Wright [2006]. A method with this choice of direction is called a Newton method. There are however two major drawbacks with this method: the Hessian has to be computed, which can be very time consuming, and the Hessian has to be positive definite.
Quasi-Newton Methods

Quasi-Newton methods are methods that resemble Newton methods but in some way try to approximate the Hessian in a computationally efficient manner. As in the Newton method, start with a quadratic model function

$$m_k(p) \triangleq f_k + \nabla f_k^T p + \frac{1}{2} p^T B_k p,$$

where $B_k$ is a symmetric positive definite matrix. Instead of computing a new $B_k$ for every iteration, only an update of $B_k$ is wanted to obtain $B_{k+1}$. As for the Newton method, the minimizer of the model function is $p_k = -B_k^{-1}\nabla f_k$, which is then used to calculate $x_{k+1}$ as

$$x_{k+1} \triangleq x_k + \alpha_k p_k.$$

As in the Newton method, $\alpha_k$ is chosen according to some conditions which will not be further discussed here, see e.g., Nocedal and Wright [2006] for further reading.
One way of updating $B_k$ is to let $B_{k+1}$ be the solution to the optimization problem

$$\underset{B}{\text{minimize}} \quad \|B - B_k\|_{G_k^{-1}} \qquad (2.41a)$$
$$\text{subject to} \quad B = B^T, \quad B s_k = y_k, \qquad (2.41b)$$

where $s_k \triangleq \alpha_k p_k$ and $y_k \triangleq \nabla f_{k+1} - \nabla f_k$. The norm that is used in the optimization problem is the weighted Frobenius norm,

$$\|B\|_{G_k^{-1}} \triangleq \left\| G_k^{-1/2} B G_k^{-1/2} \right\|_F, \qquad G_k \triangleq \int_0^1 \nabla^2 f(x_k + \tau\alpha_k p_k)\,\mathrm{d}\tau.$$

The structure of the optimization problem (2.41) can be explained as follows. The constraint that B, which is an approximation of the Hessian, should be symmetric is obvious for a twice differentiable function. The second constraint, the secant equation, ensures that B generates a consistent expression for a first-order approximation of the Hessian using the gradient. To determine $B_{k+1}$ uniquely, the B that is, in some sense, closest to $B_k$ is chosen. Additionally, the minimization problem is made scale-invariant and dimensionless, which explains the minimization and the choice of norm and weights.
The optimization problem (2.41) has a closed form solution,

$$B_{k+1} = \left(I - \rho_k y_k s_k^T\right) B_k \left(I - \rho_k s_k y_k^T\right) + \rho_k y_k y_k^T, \qquad \rho_k \triangleq \frac{1}{y_k^T s_k}.$$

This update of $B_k$ is called the DFP (which stands for Davidon-Fletcher-Powell) updating formula. To compute the direction $p_k = -B_k^{-1}\nabla f_k$, the inverse of $B_k$ is needed. Since $B_{k+1}$ is a rank-two update of $B_k$, the inverse of $B_{k+1}$, $H_{k+1} \triangleq B_{k+1}^{-1}$, can be expressed in closed form as

$$H_{k+1} = H_k - \frac{H_k y_k y_k^T H_k}{y_k^T H_k y_k} + \frac{s_k s_k^T}{y_k^T s_k}.$$
An even better updating formula is the BFGS (which stands for Broyden-Fletcher-Goldfarb-Shanno) updating formula, where a similar optimization problem as before, but for $H_{k+1}$ instead, is solved. $H_{k+1}$ is the solution to the optimization problem

$$\underset{H}{\text{minimize}} \quad \|H - H_k\|_{G_k}$$
$$\text{subject to} \quad H = H^T, \quad H y_k = s_k,$$

which has the solution

$$H_{k+1} \triangleq \left(I - \rho_k s_k y_k^T\right) H_k \left(I - \rho_k y_k s_k^T\right) + \rho_k s_k s_k^T.$$

The benefit with quasi-Newton methods is that every iteration in the optimization scheme now can be performed with complexity $\mathcal{O}(n^2)$, not including function and gradient evaluations. The BFGS scheme will be used extensively in the strategies proposed in this thesis.
2.3 Matrix Theory

This section will briefly present, for the sake of easy reference in the later chapters, some basic matrix-theory concepts and definitions. The presented theory can also be found in Higham [2008], Skelton et al. [1998] and Lancaster and Tismenetsky [1985].

2.3.1 Properties for Dynamical Systems

In this thesis, linear dynamical systems play an important role, especially asymptotically stable linear systems. Two useful matrix definitions for discrete and continuous-time linear systems are:

Definition 2.11. Let $\lambda_i$ be the eigenvalues of the square matrix A. If $\operatorname{Re}\lambda_i < 0,\ \forall i$, then A is called Hurwitz.

Definition 2.12. Let $\lambda_i$ be the eigenvalues of the square matrix A. If $|\lambda_i| < 1,\ \forall i$, then A is called Schur.

For a continuous-time (discrete-time) linear system it holds that, if the A matrix is Hurwitz (Schur), then the system is asymptotically stable.
As was explained in Section 2.1.2, the Gramians for linear systems are an important part of this thesis. To compute these Gramians a number of Lyapunov equations (both continuous and discrete), as in (2.7) and (2.14), have to be solved. An important question to ask is: when do these equations have a unique solution?

Theorem 2.1 (Corollary 3.3.3 in Skelton et al. [1998]). A matrix X solving a Lyapunov equation

$$0 = AX + XA^T + Y, \qquad Y \succ 0, \qquad (2.42)$$

is unique if and only if there are no two eigenvalues of A that are symmetrically located about the imaginary axis.

Proof: The left eigenvectors $v_i$ of A satisfy $v_i^* A = \lambda_i v_i^*$. Multiply (2.42) from left and right by $v_i^*$ and $v_j$, respectively, to obtain

$$0 = v_i^* A X v_j + v_i^* X A^T v_j + v_i^* Y v_j = v_i^* X v_j \left(\lambda_i + \lambda_j\right) + v_i^* Y v_j. \qquad (2.43)$$

This yields unique values for the elements of the transformed $\hat{X}$:

$$\left(V^{-1} X V^{-*}\right)_{ij} = v_i^* X v_j = -\frac{v_i^* Y v_j}{\lambda_i + \lambda_j}, \quad \forall\, i, j, \qquad V^{-*} = \begin{bmatrix} v_1 & \cdots & v_n \end{bmatrix}, \qquad (2.44)$$

if and only if $\lambda_i + \lambda_j \neq 0$ for all i and j.
Theorem 2.2 (Corollary 3.4.1 in Skelton et al. [1998]). A matrix X solving the discrete Lyapunov equation

$$0 = AXA^T - X + Y, \qquad Y \succ 0, \qquad (2.45)$$

is unique if and only if $\lambda_i(A)\lambda_j(A) \neq 1$ for all i and j.

Proof: Multiply (2.45) from the left and right with the matrix of left eigenvectors of A (where $\lambda_i v_i^* = v_i^* A$, $V^{-*} = [v_1\ v_2\ \cdots\ v_n]$, $V^{-1}AV = \Lambda = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$), as follows,

$$V^{-1} X V^{-*} = V^{-1}\left(A X A^T + Y\right) V^{-*} = V^{-1} A V\; V^{-1} X V^{-*}\; V^* A^T V^{-*} + V^{-1} Y V^{-*} = \Lambda\; V^{-1} X V^{-*}\; \Lambda + V^{-1} Y V^{-*}.$$

This yields unique values for the elements of the transformed $\hat{X}$,

$$\left(V^{-1} X V^{-*}\right)_{ij} = v_i^* X v_j = \left(1 - \lambda_i \lambda_j\right)^{-1} v_i^* Y v_j, \qquad (2.46)$$

if and only if $\lambda_i \lambda_j \neq 1$, for all i and j.

The two theorems above tell us that, given an asymptotically stable system (A Hurwitz for continuous time and A Schur for discrete time), the solutions to the Lyapunov equations for the Gramians are unique.
2.3.2 Matrix Functions

This section will give some definitions of matrix functions and present some theory that will be useful in the later chapters of the thesis.

As stated in Higham [2008], there exist many ways of defining matrix functions, f(A). Presented here is the definition via the Jordan canonical form, which exists for all matrices, see for example Lancaster and Tismenetsky [1985].

Definition 2.13 (Definition 1.1 in Higham [2008]). The function f is said to be defined on the spectrum of $A \in \mathbb{C}^{n\times n}$ if the values

$$f^{(j)}(\lambda_i), \quad j = 0, 1, \ldots, n_i - 1, \quad i = 1, 2, \ldots, s \qquad (2.47)$$

exist. These are called the values of the function f on the spectrum of A. The $n_i$ are the sizes of the individual Jordan blocks in A and s is the number of distinct eigenvalues.

Now, if f is defined on the spectrum of the matrix, then it is possible to define f(A).

Definition 2.14 (Definition 1.2 in Higham [2008]). Let f be defined on the spectrum of $A \in \mathbb{C}^{n\times n}$, let $J_k$ denote a Jordan block in $A = ZJZ^{-1} = Z\operatorname{diag}(J_k)Z^{-1}$ and let $\lambda_k$ denote an eigenvalue of A. Then

$$f(A) \triangleq Z f(J) Z^{-1} = Z\operatorname{diag}\left(f(J_k)\right) Z^{-1}, \qquad (2.48)$$

where

$$f(J_k) \triangleq \begin{pmatrix} f(\lambda_k) & f'(\lambda_k) & \cdots & \dfrac{f^{(n_k-1)}(\lambda_k)}{(n_k-1)!} \\ & f(\lambda_k) & \ddots & \vdots \\ & & \ddots & f'(\lambda_k) \\ & & & f(\lambda_k) \end{pmatrix}. \qquad (2.49)$$
For example, given the function f(x) = sin x, and we want to compute f(A) for a diagonalizable matrix $A = ZDZ^{-1} = Z\operatorname{diag}(\lambda_i)Z^{-1}$, the definition above can be used to compute f(A) as

$$\sin A = Z (\sin D) Z^{-1} = Z\operatorname{diag}(\sin \lambda_i) Z^{-1}. \qquad (2.50)$$
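For a diagonalizable matrix, (2.50) is directly computable. The snippet below checks the eigendecomposition route against SciPy's `sinm`; the example matrix is an arbitrary choice with distinct real eigenvalues (-1 and -2):

```python
import numpy as np
from scipy.linalg import sinm

# sin(A) for a diagonalizable matrix via (2.50)
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
lam, Z = np.linalg.eig(A)
sin_A = (Z @ np.diag(np.sin(lam)) @ np.linalg.inv(Z)).real
```

For defective (non-diagonalizable) matrices this route fails and the Jordan-block formula (2.49), or a Schur-based algorithm, is needed instead.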
A number of properties for general matrix functions, to be able to use them more efficiently, can be derived.

Theorem 2.3 (Theorem 1.18 in Higham [2008]). Let f be analytic on an open subset $\Omega \subseteq \mathbb{C}$ such that each connected component of $\Omega$ is closed under conjugation. Consider the corresponding matrix function f on its natural domain in $\mathbb{C}^{n\times n}$, the set $\mathcal{D} = \{A \in \mathbb{C}^{n\times n} : \Lambda(A) \subseteq \Omega\}$. Then the following are equivalent:

(a) $f(A^*) = f(A)^*$ for all $A \in \mathcal{D}$.

(b) $f(\bar{A}) = \overline{f(A)}$ for all $A \in \mathcal{D}$.

(c) $f(\mathbb{R}^{n\times n} \cap \mathcal{D}) \subseteq \mathbb{R}^{n\times n}$.

(d) $f(\mathbb{R} \cap \Omega) \subseteq \mathbb{R}$.

Theorem 2.4 (Theorem 1.19 in Higham [2008]). Let $\mathcal{D}$ be an open subset of $\mathbb{R}$ or $\mathbb{C}$ and let f be $n-1$ times continuously differentiable on $\mathcal{D}$. Then f(A) is a continuous matrix function on the set of matrices $A \in \mathbb{C}^{n\times n}$ with spectrum in $\mathcal{D}$.

Theorem 2.5 (Theorem 1.20 in Higham [2008]). Let f satisfy the conditions in Theorem 2.4. Then f(A) = 0 for all $A \in \mathbb{C}^{n\times n}$ with spectrum in $\mathcal{D}$ if and only if f(A) = 0 for all diagonalizable $A \in \mathbb{C}^{n\times n}$ with spectrum in $\mathcal{D}$.

Theorem 2.5 (together with Theorem 2.4) can be interpreted as follows: if a function satisfies some mild continuity conditions (see Theorem 2.4), then to check the validity of a matrix identity it is sufficient to check it only for diagonalizable matrices.
One matrix function that will be used extensively in this thesis is the matrix logarithm, defined below.

Definition 2.15. Assume $A \in \mathbb{C}^{n\times n}$ and that A does not have any eigenvalues on $\mathbb{R}^-$. Let A satisfy the equation $A = e^B$ for a matrix $B \in \mathbb{C}^{n\times n}$; then it holds that $B = \ln A$, where ln denotes the principal logarithm.

This means that, for a diagonalizable matrix $A = ZDZ^{-1} = Z\operatorname{diag}(\lambda_i)Z^{-1}$, the complex logarithm of the matrix A can be written as

$$\ln A = Z\operatorname{diag}\left(\ln|\lambda_i| + i\arg\lambda_i\right) Z^{-1}. \qquad (2.51)$$
Since computing the matrix logarithm can be computationally heavy, it can be beneficial, when having a sum of logarithm evaluations, to combine them, when possible, into one matrix logarithm computation, e.g., $\ln A + \ln B = \ln AB$. The next two theorems will guide us to when this is possible.

Theorem 2.6 (Theorem 11.2 in Higham [2008]). For $A \in \mathbb{C}^{n\times n}$ with no eigenvalues on $\mathbb{R}^-$ and $\alpha \in [-1, 1]$ it holds that $\ln A^\alpha = \alpha \ln A$. In particular, $\ln A^{-1} = -\ln A$ and $\ln A^{1/2} = \frac{1}{2}\ln A$.

Theorem 2.7 (Theorem 11.3 in Higham [2008]). Suppose $B, C \in \mathbb{C}^{n\times n}$ both have no eigenvalues on $\mathbb{R}^-$ and that $BC = CB$. If for every eigenvalue $\lambda_j$ of B and the corresponding eigenvalue $\mu_j$ of C,

$$|\arg\lambda_j + \arg\mu_j| < \pi, \qquad (2.52)$$

then $\ln BC = \ln B + \ln C$.
The methods that will be derived in this thesis are gradient-based optimization algorithms. Hence, it will be required to compute the Fréchet derivative of the matrix logarithm. The Fréchet derivative can be seen as a generalization of the ordinary derivative to matrix functions.

Theorem 2.8 (See Chapter 11 in Higham [2008]). Let L(A, E) denote the Fréchet derivative of the matrix logarithm, defined in Definition 2.15, at $A \in \mathbb{C}^{n\times n}$ in the direction $E \in \mathbb{C}^{n\times n}$. Then it holds that

$$L(A, E) = \int_0^1 \left(t(A - I) + I\right)^{-1} E \left(t(A - I) + I\right)^{-1} \mathrm{d}t. \qquad (2.53)$$

As written in (2.51) and (2.53), these equations are not suitable for computational evaluation. Thankfully, there exist computationally efficient and stable algorithms to compute these entities; e.g., the Schur-Parlett algorithm (see, e.g., Higham [2008]) can be used to compute ln(A), and all other functions that are analytic, and an algorithm for computing the Fréchet derivative of the matrix logarithm is described in Al-Mohy et al. [2012].
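Although (2.53) is not suited for production use, a crude quadrature of it makes a useful sanity check against a finite-difference approximation of the matrix logarithm. The matrix, direction and step sizes below are illustrative assumptions:

```python
import numpy as np
from scipy.linalg import logm

# Midpoint-rule quadrature of the Frechet-derivative integral (2.53),
# compared with a central finite difference of logm.
rng = np.random.default_rng(2)
n = 3
A = np.eye(n) + 0.3 * rng.standard_normal((n, n))   # spectrum away from R^-
E = rng.standard_normal((n, n))
I = np.eye(n)

m = 2000                                            # quadrature nodes on [0, 1]
ts = (np.arange(m) + 0.5) / m
L = sum(np.linalg.inv(t * (A - I) + I) @ E @ np.linalg.inv(t * (A - I) + I)
        for t in ts) / m

h = 1e-6
L_fd = (logm(A + h * E) - logm(A - h * E)).real / (2 * h)
```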
3 Frequency-Limited H_2 Norm

In this chapter, a new H_2 measure is presented that, instead of taking the whole frequency interval into account, only focuses on prespecified intervals. The chapter starts by defining some new Gramians that are based on the ordinary Gramians in Section 2.1.2, but are restricted to a limited frequency interval. These new Gramians are then used to define a new H_2 measure that computes the H_2 norm over a limited frequency interval.

3.1 Frequency-Limited Gramians

This section presents the framework that the new measure, presented in Section 3.2, is based on: the frequency-limited Gramians. These Gramians were introduced in Gawronski and Juang [1990] (continuous time) and Horta et al. [1993] (discrete time). The section starts by defining the frequency-limited Gramians and continues by deriving some properties of the Gramians. Ways to efficiently compute the Gramians are also presented. The results for the continuous-time case, which are also presented in Gawronski and Juang [1990] and Gawronski [2004], are presented both for the sake of completeness and to give a more thorough derivation. Theorem 3.1 and Theorem 3.2, describing the frequency-limited Gramians, are results that already exist in Gawronski [2004]. However, in this section, the results are presented using the given notation and in more detail. The reformulations of S_ω and S_Ω presented in Theorem 3.3 and Corollary 3.1 have not been published elsewhere.

The results for the discrete-time case contain a new derivation which differs from Horta et al. [1993], both in approach and result.
3.1.1 Continuous Time

In this section, it is assumed that the system that is used, G, is asymptotically stable, with a realization

$$\dot{x}(t) = Ax(t) + Bu(t), \qquad (3.1a)$$
$$y(t) = Cx(t) + Du(t). \qquad (3.1b)$$

G being asymptotically stable is equivalent to having A Hurwitz. For this system we have that the standard controllability and observability Gramians are

$$P \triangleq \frac{1}{2\pi}\int_{-\infty}^{\infty} H_{i\nu} B B^T H_{i\nu}^*\,\mathrm{d}\nu, \qquad (3.2a)$$
$$Q \triangleq \frac{1}{2\pi}\int_{-\infty}^{\infty} H_{i\nu}^* C^T C H_{i\nu}\,\mathrm{d}\nu, \qquad (3.2b)$$

where $H_{i\nu} \triangleq (i\nu I - A)^{-1}$. The controllability and observability Gramians also satisfy the Lyapunov equations

$$0 = AP + PA^T + BB^T, \qquad (3.3a)$$
$$0 = A^T Q + QA + C^T C. \qquad (3.3b)$$

Narrowing the frequency band in (3.2), from $(-\infty, \infty)$ to $(-\omega, \omega)$, where $\omega < \infty$, leads to the definition of the frequency-limited Gramians, see Gawronski and Juang [1990].

Definition 3.1. The frequency-limited controllability and observability Gramians for the system (3.1) are defined as

$$P_\omega \triangleq \frac{1}{2\pi}\int_{-\omega}^{\omega} H_{i\nu} B B^T H_{i\nu}^*\,\mathrm{d}\nu, \qquad (3.4a)$$
$$Q_\omega \triangleq \frac{1}{2\pi}\int_{-\omega}^{\omega} H_{i\nu}^* C^T C H_{i\nu}\,\mathrm{d}\nu, \qquad (3.4b)$$

with $\omega < \infty$.
As with the ordinary Gramians, the frequency-limited Gramians can also be written as solutions to two Lyapunov equations.

Theorem 3.1. Given a system $G = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$, where A is Hurwitz, it holds that

$$P_\omega = S_\omega P + P S_\omega^T, \qquad (3.5)$$

where $AP + PA^T + BB^T = 0$ and $S_\omega \triangleq \frac{1}{2\pi}\int_{-\omega}^{\omega} H_{i\nu}\,\mathrm{d}\nu$. Furthermore, $P_\omega$ can also be computed as a solution to

$$A P_\omega + P_\omega A^T + S_\omega B B^T + B B^T S_\omega^T = 0. \qquad (3.6)$$
Lemma 3.1. For the ordinary controllability and observability Gramians, P and Q, in (3.3), it holds that

$$H_{i\nu} B B^T H_{i\nu}^* = P H_{i\nu}^* + H_{i\nu} P, \qquad (3.7a)$$
$$H_{i\nu}^* C^T C H_{i\nu} = Q H_{i\nu} + H_{i\nu}^* Q. \qquad (3.7b)$$

Proof: Using the definition of $H_{i\nu}$ and starting with a variant of the right hand side of (3.7a), it holds that

$$H_{i\nu}^{-1} P + P H_{i\nu}^{-*} = (i\nu I - A)P + P\left(-i\nu I - A^T\right) = -\left(AP + PA^T\right) = BB^T, \qquad (3.8)$$

which can be written as (3.7a) by multiplying with $H_{i\nu}$ and $H_{i\nu}^*$ from left and right, respectively. Similarly, it holds that

$$H_{i\nu}^{-*} Q + Q H_{i\nu}^{-1} = \left(-i\nu I - A^T\right)Q + Q(i\nu I - A) = -\left(A^T Q + QA\right) = C^T C, \qquad (3.9)$$

which can be written as (3.7b) by multiplying with $H_{i\nu}^*$ and $H_{i\nu}$ from left and right, respectively.
Proof of Theorem 3.1: Using the definition of $P_\omega$ in (3.4a) and Lemma 3.1, $P_\omega$ can be written as

$$P_\omega = \frac{1}{2\pi}\int_{-\omega}^{\omega} H_{i\nu} B B^T H_{i\nu}^*\,\mathrm{d}\nu = P\, \frac{1}{2\pi}\int_{-\omega}^{\omega} H_{i\nu}^*\,\mathrm{d}\nu + \frac{1}{2\pi}\int_{-\omega}^{\omega} H_{i\nu}\,\mathrm{d}\nu\; P = P S_\omega^* + S_\omega P.$$

Hence, it holds that $P_\omega = P S_\omega^* + S_\omega P$, with $S_\omega = \frac{1}{2\pi}\int_{-\omega}^{\omega} H_{i\nu}\,\mathrm{d}\nu$.

Before showing that (3.6) holds, observe that

$$A S_\omega = A\left(\frac{1}{2\pi}\int_{-\omega}^{\omega} H_{i\nu}\,\mathrm{d}\nu\right) = A\left(\frac{1}{2\pi}\int_{-\omega}^{\omega} (i\nu I - A)^{-1}\,\mathrm{d}\nu\right) = \left(\frac{1}{2\pi}\int_{-\omega}^{\omega} (i\nu I - A)^{-1}\,\mathrm{d}\nu\right) A = S_\omega A,$$

i.e., the matrices A and $S_\omega$ commute. Using this result together with $P_\omega = P S_\omega^* + S_\omega P$, the expression $A P_\omega + P_\omega A^T$ can be written as

$$A P_\omega + P_\omega A^T = A\left(S_\omega P + P S_\omega^*\right) + \left(S_\omega P + P S_\omega^*\right) A^T = S_\omega\left(AP + PA^T\right) + \left(AP + PA^T\right) S_\omega^* = -S_\omega B B^T - B B^T S_\omega^*.$$

Hence, (3.6) holds.
The same can be stated for the observability Gramian.

Theorem 3.2. Given a system $G = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$, where A is Hurwitz, it holds that

$$Q_\omega = S_\omega^T Q + Q S_\omega, \qquad (3.10)$$

where $A^T Q + QA + C^T C = 0$ and $S_\omega \triangleq \frac{1}{2\pi}\int_{-\omega}^{\omega} H_{i\nu}\,\mathrm{d}\nu$. Furthermore, $Q_\omega$ can also be computed as a solution to

$$A^T Q_\omega + Q_\omega A + S_\omega^T C^T C + C^T C S_\omega = 0. \qquad (3.11)$$

Proof: The proof is analogous to the proof of the previous theorem, with the controllability Gramian replaced by the observability Gramian.
To be able to compute the frequency-limited Gramians $P_\omega$ and $Q_\omega$ we need a more computationally tractable expression for the matrix $S_\omega$.

Theorem 3.3. The matrix $S_\omega$ can be written as

$$S_\omega = \frac{1}{2\pi}\int_{-\omega}^{\omega} H_{i\nu}\,\mathrm{d}\nu = \operatorname{Re}\left[\frac{i}{\pi}\ln\left(-A - i\omega I\right)\right]. \qquad (3.12)$$
Proof: We have that

$$S_\omega \triangleq \frac{1}{2\pi}\int_{-\omega}^{\omega} H_{i\nu}\,\mathrm{d}\nu = \frac{1}{2\pi}\int_{-\omega}^{\omega} (i\nu I - A)^{-1}\,\mathrm{d}\nu \triangleq f(A). \qquad (3.13)$$

With $f(x) = \frac{1}{2\pi}\int_{-\omega}^{\omega} (i\nu - x)^{-1}\,\mathrm{d}\nu$, Theorem 2.5 states that it is sufficient to calculate the function on the spectrum of A. Let $\lambda$ be an eigenvalue of A; since A is Hurwitz, it holds that $\operatorname{Re}\lambda < 0$. Hence

$$\frac{1}{2\pi}\int_{-\omega}^{\omega} \frac{1}{i\nu - \lambda}\,\mathrm{d}\nu = \frac{1}{2\pi}\Big[-i\ln(i\nu - \lambda)\Big]_{-\omega}^{\omega} = \frac{1}{2\pi}\Big(i\ln(-i\omega - \lambda) - i\ln(i\omega - \lambda)\Big), \qquad (3.14)$$

where ln denotes the principal branch of the complex logarithm, namely $\ln\lambda = \ln|\lambda| + i\arg\lambda$, $-\pi < \arg\lambda \leq \pi$. Going back to matrix form entails

$$S_\omega = \frac{1}{2\pi}\int_{-\omega}^{\omega} H_{i\nu}\,\mathrm{d}\nu = \frac{1}{2\pi}\Big[i\ln(-i\omega I - A) - i\ln(i\omega I - A)\Big]. \qquad (3.15)$$

Since the principal branch of the logarithm is used, Theorem 2.3 is applicable, which for this case means that given a matrix $C \in \mathbb{C}^{n\times n}$ it holds that $\ln\bar{C} = \overline{\ln C}$. $S_\omega$ becomes

$$S_\omega = \frac{i}{2\pi}\ln(-A - i\omega I) + \overline{\frac{i}{2\pi}\ln(-A - i\omega I)} = \operatorname{Re}\left[\frac{i}{\pi}\ln(-A - i\omega I)\right].$$
Remark 3.1. An interesting property to investigate is what happens when $\omega$ tends to infinity. First note that if $x \in \mathbb{C}\setminus\mathbb{R}^-$, then $\operatorname{Re}[i\ln x] = -\arg x$. Now, let $\lambda$ be an eigenvalue of A, with $\operatorname{Re}\lambda < 0$ since A is Hurwitz; then $\operatorname{Re}\left[\frac{i}{\pi}\ln(-\lambda - i\omega)\right]$ will approach $\frac{1}{2}$ when $\omega$ approaches infinity. Hence, $S_\omega$ will approach $\frac{I}{2}$ and the Lyapunov equations (3.6) and (3.11) will approach the Lyapunov equations for the regular Gramians (3.3) when $\omega$ approaches infinity.
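The closed form (3.12) and the Lyapunov characterization (3.6) can be verified numerically against the defining integrals. The Hurwitz matrix, the frequency limit and the midpoint quadrature below are illustrative assumptions, used only for checking:

```python
import numpy as np
from scipy.linalg import logm, solve_continuous_lyapunov

A = np.array([[-1.0, 2.0], [0.0, -3.0]])   # arbitrary Hurwitz example
B = np.array([[1.0], [1.0]])
w = 2.0                                     # frequency limit omega
I2 = np.eye(2)

S = np.real(1j / np.pi * logm(-A - 1j * w * I2))          # eq. (3.12)

m = 4000                                                   # midpoint quadrature
nus = -w + (np.arange(m) + 0.5) * (2 * w / m)
H = lambda nu: np.linalg.inv(1j * nu * I2 - A)
S_quad = sum(H(nu) for nu in nus) * (2 * w / m) / (2 * np.pi)

# P_omega from (3.6): A P_w + P_w A^T + S B B^T + B B^T S^T = 0
M = S @ B @ B.T
P_w = solve_continuous_lyapunov(A, -(M + M.T))
# ... versus direct quadrature of the definition (3.4a)
P_quad = sum(H(nu) @ B @ B.T @ H(nu).conj().T for nu in nus).real * (2 * w / m) / (2 * np.pi)
```

One matrix logarithm and one Lyapunov solve thus replace the frequency integral, which is the computational point of Theorem 3.3.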
Until now, only a single frequency band $(-\omega, \omega)$ around 0 has been considered. It is also possible to have arbitrary segments in the frequency domain, e.g., $\Omega = [-\omega_4, -\omega_3] \cup [-\omega_2, -\omega_1] \cup [\omega_1, \omega_2] \cup [\omega_3, \omega_4]$, $0 < \omega_1 < \omega_2 < \omega_3 < \omega_4$.

Corollary 3.1. For a union of disjoint frequency intervals $\Omega = \bigcup_{k=1}^{N} [-\omega_{2k}, -\omega_{2k-1}] \cup [\omega_{2k-1}, \omega_{2k}]$, with $0 \leq \omega_1 < \omega_2 < \cdots < \omega_{2N} < \infty$, it holds that

$$P_\Omega \triangleq \frac{1}{2\pi}\int_{\Omega} H_{i\nu} B B^T H_{i\nu}^*\,\mathrm{d}\nu, \qquad (3.16a)$$
$$Q_\Omega \triangleq \frac{1}{2\pi}\int_{\Omega} H_{i\nu}^* C^T C H_{i\nu}\,\mathrm{d}\nu, \qquad (3.16b)$$

satisfy the Lyapunov equations

$$0 = A P_\Omega + P_\Omega A^T + S_\Omega B B^T + B B^T S_\Omega^T, \qquad (3.17a)$$
$$0 = A^T Q_\Omega + Q_\Omega A + S_\Omega^T C^T C + C^T C S_\Omega, \qquad (3.17b)$$

where

$$S_\Omega = \operatorname{Re}\left[\frac{i}{\pi}\ln\prod_{k=1}^{N}\left(-A - i\omega_{2k} I\right)\left(-A - i\omega_{2k-1} I\right)^{-1}\right]. \qquad (3.18)$$
Proof: The corollary is proven for the observability Gramian; the proof for the controllability Gramian follows the same procedure. Splitting the integral in (3.16b) into two different sums with the limits of the integrals centered around 0 yields

Q_Ω = (1/2π) ∫_Ω H*_{iν} C^T C H_{iν} dν
    = Σ_{k=1}^{N} [ (1/2π) ∫_{−ω_{2k}}^{ω_{2k}} H*_{iν} C^T C H_{iν} dν − (1/2π) ∫_{−ω_{2k−1}}^{ω_{2k−1}} H*_{iν} C^T C H_{iν} dν ]
    = Σ_{k=1}^{N} ( Q_{ω_{2k}} − Q_{ω_{2k−1}} ).    (3.19)
Define L_{ω_i} ≜ A^T Q_{ω_i} + Q_{ω_i} A + S_{ω_i}^T C^T C + C^T C S_{ω_i}. Using the fact that L_{ω_i} = 0 entails

0 = Σ_{k=1}^{N} ( L_{ω_{2k}} − L_{ω_{2k−1}} )
  = A^T Σ_{k=1}^{N} ( Q_{ω_{2k}} − Q_{ω_{2k−1}} ) + Σ_{k=1}^{N} ( Q_{ω_{2k}} − Q_{ω_{2k−1}} ) A
    + [ Σ_{k=1}^{N} ( S_{ω_{2k}} − S_{ω_{2k−1}} ) ]^T C^T C + C^T C Σ_{k=1}^{N} ( S_{ω_{2k}} − S_{ω_{2k−1}} )
  = A^T Q_Ω + Q_Ω A + S_Ω^T C^T C + C^T C S_Ω.    (3.20)

Hence, it is proven that (3.17b) holds. If S_Ω can be computed, then only one Lyapunov equation (instead of 2N) has to be solved to obtain Q_Ω.
S_Ω is for the moment a sum of matrix logarithms, which, using Theorem 2.6, can be rewritten as

S_Ω = Σ_{k=1}^{N} ( S_{ω_{2k}} − S_{ω_{2k−1}} )
    = Re[ (i/π) Σ_{k=1}^{N} ( ln(−A − iω_{2k} I) − ln(−A − iω_{2k−1} I) ) ]
    = Re[ (i/π) Σ_{k=1}^{N} ( ln(−A − iω_{2k} I) + ln( (−A − iω_{2k−1} I)^{−1} ) ) ].    (3.21)
Now, we want to show that this sum can be combined into one matrix logarithm evaluation. Theorem 2.5 states that it is sufficient to calculate the function on the spectrum of A to show this. Let λ be an eigenvalue of A and define x_i = −λ − iω_i, with ω_i > 0. Since A is Hurwitz, it holds that Re λ < 0, and hence −π/2 < arg x_i < π/2. Note that arg x_i < arg x_j for i > j, and that arg x_i^{−1} = −arg x_i. Start by reordering the terms,

Σ_{k=1}^{N} ( ln x_{2k} + ln x_{2k−1}^{−1} ) = ln x_{2N} + ln x_1^{−1} + Σ_{k=1}^{N−1} ( ln x_{2k} + ln x_{2k+1}^{−1} ).    (3.22)
Analyzing the argument of the first two terms, x_{2N} and x_1, gives

−π < arg x_{2N} + arg x_1^{−1} < 0,    (3.23)

hence, using Theorem 2.7,

ln x_{2N} + ln x_1^{−1} = ln( x_{2N} x_1^{−1} ),  −π < arg( x_{2N} x_1^{−1} ) < 0.    (3.24)
Analyzing the argument for the last sum in (3.22) yields that, for all k, it holds that 0 < arg x_{2k} + arg x_{2k+1}^{−1} < π, since 0 < ω₁ < ω₂ < ⋯ < ω_{2N}. Hence, ln x_{2k} + ln x_{2k+1}^{−1} = ln( x_{2k} x_{2k+1}^{−1} ). Now, since all x_i are in the open right half plane, it holds that

0 < Σ_{k=1}^{N−1} arg( x_{2k} x_{2k+1}^{−1} ) < π.    (3.25)

Hence, using Theorem 2.7,

Σ_{k=1}^{N−1} ( ln x_{2k} + ln x_{2k+1}^{−1} ) = ln ∏_{k=1}^{N−1} x_{2k} x_{2k+1}^{−1},  0 < arg ∏_{k=1}^{N−1} x_{2k} x_{2k+1}^{−1} < π.    (3.26)
Returning to (3.22),

Σ_{k=1}^{N} ( ln x_{2k} + ln x_{2k−1}^{−1} )
  = ln( x_{2N} x_1^{−1} ) + Σ_{k=1}^{N−1} ( ln x_{2k} + ln x_{2k+1}^{−1} )
  = ln( x_{2N} x_1^{−1} ) + ln ∏_{k=1}^{N−1} x_{2k} x_{2k+1}^{−1}
  = ln ∏_{k=1}^{N} x_{2k} x_{2k−1}^{−1},    (3.27)

since −π < arg( x_{2N} x_1^{−1} ) + arg ∏_{k=1}^{N−1} x_{2k} x_{2k+1}^{−1} < π. This holds for all eigenvalues of A, and therefore it also holds that

S_Ω = Σ_{k=1}^{N} ( S_{ω_{2k}} − S_{ω_{2k−1}} )
    = Re[ (i/π) Σ_{k=1}^{N} ( ln(−A − iω_{2k} I) − ln(−A − iω_{2k−1} I) ) ]
    = Re[ (i/π) ln ∏_{k=1}^{N} (−A − iω_{2k} I)(−A − iω_{2k−1} I)^{−1} ].    (3.28)
Theorem 3.1 tells us that, by using addition of two or more frequency-limited Gramians corresponding to different frequency intervals, it is possible to construct a frequency-limited Gramian for a combined frequency interval. For example, one can construct the frequency-limited controllability Gramian, P_Ω, for the interval ω ∈ Ω = Ω₁ ∪ Ω₂, with Ω₁ = [−ω₂, −ω₁] ∪ [ω₁, ω₂] and Ω₂ = [−ω₄, −ω₃] ∪ [ω₃, ω₄], as

A P_Ω + P_Ω A^T + S_Ω BB^T + BB^T S_Ω^T = 0,    (3.29)

with S_Ω computed as in Corollary 3.1.
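Corollary 3.1 together with (3.29) means that a Gramian over a union of bands costs one matrix logarithm evaluation and one Lyapunov solve. The sketch below (hypothetical helper names; SciPy assumed) computes S for a single symmetric band from (3.18) with N = 1, and exploits the fact that the S matrices for disjoint bands simply add.

```python
import numpy as np
from scipy.linalg import logm, solve_continuous_lyapunov

def s_band(A, w1, w2):
    # S for Omega = [-w2, -w1] U [w1, w2]:
    # S_Omega = Re[(i/pi) * ln((-A - i*w2*I)(-A - i*w1*I)^{-1})]   (N = 1 in (3.18))
    n = A.shape[0]
    M = (-A - 1j * w2 * np.eye(n)) @ np.linalg.inv(-A - 1j * w1 * np.eye(n))
    return np.real(1j / np.pi * logm(M))

def p_from_s(A, B, S):
    # One Lyapunov solve: A*P + P*A^T + S*B*B^T + B*B^T*S^T = 0   (cf. (3.29))
    BBt = B @ B.T
    return solve_continuous_lyapunov(A, -(S @ BBt + BBt @ S.T))

A = np.array([[-1.0]])
B = np.array([[1.0]])
P_band = p_from_s(A, B, s_band(A, 1.0, 2.0))
```

For this scalar example, the Gramian over the band [ω₁, ω₂] has the closed form (arctan ω₂ − arctan ω₁)/π, which the check below uses; for a union of bands, adding the band S matrices before the single Lyapunov solve gives the combined Gramian.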
Remark 3.2. It is also possible to use, with abuse of notation, ω = ∞ as the end frequency; in that case the ordinary controllability Gramian, P, can be used in combination with the frequency-limited Gramians.
3.1.2 Discrete Time

The equations for the discrete-time frequency-limited Gramians are similar to the ones in the continuous-time case. However, since the derivation in Horta et al. [1993] is not as straightforward and yields an erroneous result, we will present our derivation in this section.

Given an asymptotically stable system G = [A, B; C, D] — G being asymptotically stable means having A Schur — the frequency-limited controllability and observability Gramians can be defined for this system.
Definition 3.2. The frequency-limited controllability and observability Gramians for the system G = [A, B; C, D], with ω < π and H_{e^{iω}} = (e^{iω} I − A)^{−1}, are defined as

P_ω = (1/2π) ∫_{−ω}^{ω} H_{e^{iν}} BB^T H*_{e^{iν}} dν,    (3.30)
Q_ω = (1/2π) ∫_{−ω}^{ω} H*_{e^{iν}} C^T C H_{e^{iν}} dν.    (3.31)
Inspired by the continuous-time case, the frequency-limited Gramians in discrete time can be written as solutions to two discrete-time Lyapunov equations.

Theorem 3.4. Given a discrete-time system G = [A, B; C, D], where A is Schur, it holds that P_ω = S_ω P + P S_ω^T, where A P A^T − P + BB^T = 0 and

S_ω = (1/4π) ∫_{−ω}^{ω} ( I − e^{−iν} A )^{−1} ( I + A e^{−iν} ) dν.    (3.32)

Furthermore, P_ω can be computed as a solution to

A P_ω A^T − P_ω + S_ω BB^T + BB^T S_ω^T = 0.    (3.33)
To prove Theorem 3.4, a lemma is first presented.

Lemma 3.2. For the ordinary Gramians P and Q in (2.14), it holds that

H_{e^{iω}} BB^T H*_{e^{iω}} = e^{−iω} P H*_{e^{iω}} + e^{iω} H_{e^{iω}} P − P,    (3.34a)
H*_{e^{iω}} C^T C H_{e^{iω}} = e^{iω} Q H_{e^{iω}} + e^{−iω} H*_{e^{iω}} Q − Q.    (3.34b)
Proof: Using the definition H_{e^{iω}} = (e^{iω} I − A)^{−1}, straightforward calculations yield

H^{−1}_{e^{iω}} ( e^{−iω} P H*_{e^{iω}} + e^{iω} H_{e^{iω}} P − P ) H^{−*}_{e^{iω}}
  = e^{−iω} (e^{iω} I − A) P + e^{iω} P (e^{iω} I − A)^* − (e^{iω} I − A) P (e^{iω} I − A)^*
  = −( A P A^T − P ) = BB^T,    (3.35)

which can be written as (3.34a) by multiplying with H_{e^{iω}} and H*_{e^{iω}} from the left and right, respectively. Similarly, it holds that

H^{−*}_{e^{iω}} ( e^{iω} Q H_{e^{iω}} + e^{−iω} H*_{e^{iω}} Q − Q ) H^{−1}_{e^{iω}}
  = e^{iω} (e^{iω} I − A)^* Q + e^{−iω} Q (e^{iω} I − A) − (e^{iω} I − A)^* Q (e^{iω} I − A)
  = −( A^T Q A − Q ) = C^T C,    (3.36)

which can be written as (3.34b) by multiplying with H*_{e^{iω}} and H_{e^{iω}} from the left and right, respectively.
Proof of Theorem 3.4: Using the definition of P_ω in (3.30) and Lemma 3.2, P_ω can be written as

P_ω = (1/2π) ∫_{−ω}^{ω} H_{e^{iν}} BB^T H*_{e^{iν}} dν
    = (1/2π) ∫_{−ω}^{ω} ( e^{−iν} P H*_{e^{iν}} + e^{iν} H_{e^{iν}} P − P ) dν
    = { (1/2π) ∫_{−ω}^{ω} [ ( I − e^{−iν} A )^{−1} − I/2 ] dν } P + P { (1/2π) ∫_{−ω}^{ω} [ ( I − e^{−iν} A )^{−1} − I/2 ]^* dν }
    = S_ω P + P S_ω^*.    (3.37)

Hence, it holds that P_ω = S_ω P + P S_ω^*, with

S_ω = (1/4π) ∫_{−ω}^{ω} ( I − e^{−iν} A )^{−1} ( I + A e^{−iν} ) dν.
Before showing that (3.33) holds, observe that

A S_ω = A ( (1/4π) ∫_{−ω}^{ω} ( I − e^{−iν} A )^{−1} ( I + A e^{−iν} ) dν )
      = ( (1/4π) ∫_{−ω}^{ω} ( I − e^{−iν} A )^{−1} ( I + A e^{−iν} ) dν ) A = S_ω A,    (3.38)

i.e., the matrices A and S_ω commute. Using that P_ω = S_ω P + P S_ω^* and the fact that A and S_ω commute, A P_ω A^T − P_ω can be written as

A P_ω A^T − P_ω = A ( S_ω P + P S_ω^* ) A^T − ( S_ω P + P S_ω^* )
               = S_ω ( A P A^T − P ) + ( A P A^T − P ) S_ω^*
               = −( S_ω BB^T + BB^T S_ω^* ).    (3.39)

Hence, (3.33) holds.
The same can be shown for the observability Gramian.

Theorem 3.5. Given a discrete-time system G = [A, B; C, D], where A is Schur, it holds that Q_ω = S_ω^T Q + Q S_ω, where A^T Q A − Q + C^T C = 0 and

S_ω = (1/4π) ∫_{−ω}^{ω} ( I − e^{−iν} A )^{−1} ( I + A e^{−iν} ) dν.    (3.40)

Furthermore, Q_ω can be computed as a solution to

A^T Q_ω A − Q_ω + S_ω^T C^T C + C^T C S_ω = 0.    (3.41)

Proof: The proof is analogous to the one for the controllability Gramian.
Theorem 3.6. The matrix

S_ω = (1/4π) ∫_{−ω}^{ω} ( I − e^{−iν} A )^{−1} ( I + A e^{−iν} ) dν

can be written as

S_ω = (1/2π) ( ωI − Re[ 2i ln( I − A e^{−iω} ) ] ).    (3.42)
Proof: We have that

S_ω = (1/4π) ∫_{−ω}^{ω} ( I − e^{−iν} A )^{−1} ( I + A e^{−iν} ) dν = f(A),    (3.43)

with

f(x) = (1/4π) ∫_{−ω}^{ω} ( 1 − e^{−iν} x )^{−1} ( 1 + x e^{−iν} ) dν.

Theorem 2.5 states that it is sufficient to calculate the function on the spectrum of A. Let λ be an eigenvalue of A; since A is Schur, it holds that |λ| < 1. Hence

f(λ) = (1/4π) ∫_{−ω}^{ω} (1 + λ e^{−iν}) / (1 − λ e^{−iν}) dν
     = (1/4π) [ ν − 2i ln( 1 − λ e^{−iν} ) ]_{−ω}^{ω}
     = (1/4π) ( 2ω − 2i ln( 1 − λ e^{−iω} ) + 2i ln( 1 − λ e^{iω} ) ),    (3.44)

where ln denotes the principal branch of the complex logarithm, namely ln z = ln|z| + i arg z, −π < arg z ≤ π. Going back to the matrix equation entails

S(ω) = (1/4π) ∫_{−ω}^{ω} ( I − e^{−iν} A )^{−1} ( I + A e^{−iν} ) dν = (1/2π) ( ωI − i[ ln( I − A e^{−iω} ) − ln( I − A e^{iω} ) ] ).    (3.45)
Since the principal branch of the logarithm is used, Theorem 2.3 is applicable. For this case it means that given a matrix C ∈ C^{n×n} it holds that ln C̄ = conj(ln C). S_ω becomes

S(ω) = (1/2π) ( ωI − i ln( I − A e^{−iω} ) + i ln( I − A e^{iω} ) ) = (1/2π) ( ωI − Re[ 2i ln( I − A e^{−iω} ) ] ).

Remark 3.3. If ω = π, then S_ω = I/2 − (1/π) Re[ i ln( I + A ) ], and since the logarithm of a real matrix is a real matrix, it follows that S_ω = I/2. Thus, the frequency-limited Gramians coincide with the regular Gramians when ω = π.
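The discrete-time expressions can be sketched in the same way. The snippet below (illustrative helper names; SciPy assumed) implements (3.42), solves the discrete-time Lyapunov equation (3.33), and is checked against the ω = π case of Remark 3.3 and against the defining integral (3.30).

```python
import numpy as np
from scipy.linalg import logm, solve_discrete_lyapunov

def s_omega_dt(A, w):
    # S_w = (1/2pi) * (w*I - Re[2i * ln(I - A e^{-iw})])   (eq. (3.42))
    n = A.shape[0]
    return (w * np.eye(n) - np.real(2j * logm(np.eye(n) - A * np.exp(-1j * w)))) / (2 * np.pi)

def freq_limited_ctrb_gramian_dt(A, B, w):
    # Solve A*P*A^T - P + S*B*B^T + B*B^T*S^T = 0   (eq. (3.33))
    S = s_omega_dt(A, w)
    BBt = B @ B.T
    return solve_discrete_lyapunov(A, S @ BBt + BBt @ S.T)

A = np.array([[0.5]])
B = np.array([[1.0]])
P_full = freq_limited_ctrb_gramian_dt(A, B, np.pi)  # w = pi gives the ordinary Gramian
```

For the scalar system A = 0.5, B = 1, the ordinary Gramian is 4/3, which the ω = π case must reproduce.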
3.2 Frequency-Limited H2 Norm

In this section, we will introduce a new frequency-limited H2 norm that uses the frequency-limited Gramians defined in the previous section. This new measure can for example be used to compare different models on limited frequency intervals, instead of the whole frequency domain.
3.2.1 Continuous Time

As presented in Section 2.1.3, the H2 norm of a continuous-time system G = [A, B; C, D], which is asymptotically stable (A is Hurwitz) and strictly proper (D = 0), can be described by

‖G‖²_{H2} = (1/2π) tr ∫_{−∞}^{∞} G(iν) G*(iν) dν    (3.46a)
          = (1/2π) tr ∫_{−∞}^{∞} C H_{iν} BB^T H*_{iν} C^T dν = tr( C P C^T )    (3.46b)
          = (1/2π) tr ∫_{−∞}^{∞} B^T H*_{iν} C^T C H_{iν} B dν = tr( B^T Q B ).    (3.46c)
In this section, a new frequency-limited H2-like norm, which uses the frequency-limited Gramians presented in the previous section, is defined; it is denoted ‖G‖_{H2,ω}.

Definition 3.3. For an asymptotically stable system G and 0 < ω < ∞, define

‖G‖²_{H2,ω} ≜ (1/2π) tr ∫_{−ω}^{ω} G(iν) G*(iν) dν.    (3.47)

To be able to use the limited-frequency H2 norm in practice, it has to be expressed in a more computationally friendly way.
Theorem 3.7. For an asymptotically stable system G = [A, B; C, D] and 0 < ω < ∞, the limited-frequency H2 norm can be computed as

‖G‖²_{H2,ω} = tr( C P_ω C^T ) + 2 tr( C S_ω B D^T ) + (ω/π) tr( D D^T ),    (3.48)

or

‖G‖²_{H2,ω} = tr( B^T Q_ω B ) + 2 tr( C S_ω B D^T ) + (ω/π) tr( D D^T ),    (3.49)

where

0 = A P_ω + P_ω A^T + S_ω BB^T + BB^T S_ω^T,    (3.50a)
0 = A^T Q_ω + Q_ω A + S_ω^T C^T C + C^T C S_ω,    (3.50b)
S_ω = Re[ (i/π) ln( −A − iωI ) ].    (3.50c)
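Theorem 3.7 turns the frequency-limited norm into one matrix logarithm and one Lyapunov solve. A minimal sketch, assuming SciPy and using an illustrative function name, is the following; the first-order example G(s) = 1/(s + 1) + d has the closed form (1 + 2d)·arctan(ω)/π + d²ω/π, which is used as a check.

```python
import numpy as np
from scipy.linalg import logm, solve_continuous_lyapunov

def h2_norm_sq_limited(A, B, C, D, w):
    # eq. (3.48): tr(C P_w C^T) + 2 tr(C S_w B D^T) + (w/pi) tr(D D^T)
    n = A.shape[0]
    S = np.real(1j / np.pi * logm(-A - 1j * w * np.eye(n)))
    BBt = B @ B.T
    P = solve_continuous_lyapunov(A, -(S @ BBt + BBt @ S.T))
    return float(np.trace(C @ P @ C.T) + 2 * np.trace(C @ S @ B @ D.T)
                 + w / np.pi * np.trace(D @ D.T))

A = np.array([[-1.0]])
B = np.array([[1.0]])
C = np.array([[1.0]])
D = np.array([[1.0]])
val = h2_norm_sq_limited(A, B, C, D, 1.0)
```

Note that, unlike the ordinary H2 norm, a direct term D ≠ 0 is allowed here because the integration interval is finite.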
Proof: Using Theorem 3.1 we can rewrite equation (3.47),

‖G‖²_{H2,ω} = (1/2π) tr ∫_{−ω}^{ω} G(iν) G*(iν) dν
 = (1/2π) tr ∫_{−ω}^{ω} ( C H_{iν} B + D )( B^T H*_{iν} C^T + D^T ) dν
 = tr[ C ( (1/2π) ∫_{−ω}^{ω} H_{iν} BB^T H*_{iν} dν ) C^T ] + 2 tr[ C ( (1/2π) ∫_{−ω}^{ω} H_{iν} dν ) B D^T ] + (1/2π) tr ∫_{−ω}^{ω} D D^T dν
 = tr( C P_ω C^T ) + 2 tr( C S_ω B D^T ) + (ω/π) tr( D D^T ).

The same procedure can be used, using Theorem 3.2 and the fact that ‖G‖²_{H2,ω} also can be written as (1/2π) tr ∫_{−ω}^{ω} G*(iν) G(iν) dν, to show equation (3.49). Theorem 3.3 shows how S_ω can be computed.
Using Corollary 3.1 it is possible, also for the limited-frequency H2 norm, to compute the norm on arbitrary segments in the frequency domain, ‖G‖²_{H2,Ω}, e.g., Ω = [−ω₄, −ω₃] ∪ [−ω₂, −ω₁] ∪ [ω₁, ω₂] ∪ [ω₃, ω₄], 0 < ω₁ < ω₂ < ω₃ < ω₄.

One important thing to note that differs between the limited-frequency H2 norm and the ordinary H2 norm is that, if we do not include an infinite interval in Ω, i.e., do not include ω = ∞ as the end frequency, then the system does not have to be strictly proper. This means that it is possible, in this case, to have D ≠ 0.
3.2.2 Discrete Time

In this section, the new frequency-limited H2-like norm for discrete-time systems, which uses the frequency-limited Gramians presented in Section 3.1.2, is defined.

Definition 3.4. For an asymptotically stable discrete-time system G and 0 < ω < π, define

‖G‖²_{H2,ω} ≜ (1/2π) tr ∫_{−ω}^{ω} G(e^{iν}) G*(e^{iν}) dν.    (3.51)
Analogous to the continuous-time case, (3.51) can be expressed in a more computationally friendly way.

Theorem 3.8. For an asymptotically stable discrete-time system G = [A, B; C, D] and 0 < ω < π, the limited-frequency H2 norm can be computed as

‖G‖²_{H2,ω} = tr( C P_ω C^T ) + 2 tr( C R_ω B D^T ) + (ω/π) tr( D D^T ),    (3.52)

or

‖G‖²_{H2,ω} = tr( B^T Q_ω B ) + 2 tr( C R_ω B D^T ) + (ω/π) tr( D D^T ),    (3.53)

where

0 = A P_ω A^T − P_ω + S_ω BB^T + BB^T S_ω^T,    (3.54a)
0 = A^T Q_ω A − Q_ω + S_ω^T C^T C + C^T C S_ω,    (3.54b)
S_ω = (1/2π) ( ωI − Re[ 2i ln( I − A e^{−iω} ) ] ),    (3.54c)
R_ω = −(1/π) A^{−1} Re[ i ln( I − A e^{−iω} ) ].    (3.54d)
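Theorem 3.8 can be sketched numerically as follows (illustrative function name; SciPy assumed). In this sketch, (3.54d) requires A to be invertible, which only matters when D ≠ 0 since R_ω enters only through the cross term. For G(z) = 1/(z − 0.5) + 1, the impulse response gives the full-band (ω = π) squared norm 1 + Σ_{k≥0} 0.25^k = 7/3, used as a check.

```python
import numpy as np
from scipy.linalg import logm, solve_discrete_lyapunov

def h2_norm_sq_limited_dt(A, B, C, D, w):
    # eq. (3.52): tr(C P_w C^T) + 2 tr(C R_w B D^T) + (w/pi) tr(D D^T)
    n = A.shape[0]
    I = np.eye(n)
    L = logm(I - A * np.exp(-1j * w))
    S = (w * I - np.real(2j * L)) / (2 * np.pi)       # eq. (3.54c)
    R = -np.linalg.inv(A) @ np.real(1j * L) / np.pi   # eq. (3.54d)
    BBt = B @ B.T
    P = solve_discrete_lyapunov(A, S @ BBt + BBt @ S.T)
    return float(np.trace(C @ P @ C.T) + 2 * np.trace(C @ R @ B @ D.T)
                 + w / np.pi * np.trace(D @ D.T))

A = np.array([[0.5]])
B = np.array([[1.0]])
C = np.array([[1.0]])
D = np.array([[1.0]])
val = h2_norm_sq_limited_dt(A, B, C, D, np.pi)
```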
Proof: By using Theorem 3.4 and Theorem 3.5, ‖G‖²_{H2,ω} can easily be rewritten to

‖G‖²_{H2,ω} = tr( C P_ω C^T ) + 2 tr( C R_ω B D^T ) + (ω/π) tr( D D^T ),    (3.55a)
‖G‖²_{H2,ω} = tr( B^T Q_ω B ) + 2 tr( C R_ω B D^T ) + (ω/π) tr( D D^T ),    (3.55b)

where

R_ω = (1/2π) ∫_{−ω}^{ω} H_{e^{iν}} dν = (1/2π) ∫_{−ω}^{ω} ( e^{iν} I − A )^{−1} dν.    (3.56)
This integral can be computed and simplified similarly to what is shown in the proof of Theorem 3.4, which leads to

R_ω = −(1/π) A^{−1} Re[ i ln( I − A e^{−iω} ) ].    (3.57)

3.3 Concluding Remarks
In this chapter, the frequency-limited Gramians and their derivations have been presented. Computationally more efficient expressions than those presented in the original papers (Gawronski and Juang [1990] and Horta et al. [1993]) were derived. A detailed derivation of the discrete-time frequency-limited Gramians was presented, using the same notation and framework as in the continuous-time case and correcting errors in the available literature. Additionally, the frequency-limited H2 norm that uses these Gramians, both for continuous and discrete time, was presented. This frequency-limited H2 norm will be used for frequency-limited model reduction in Chapter 4.
4 Model Reduction

This chapter starts by introducing the model-reduction problem in Section 4.1. In Section 4.2, one of the most commonly used methods, balanced truncation (including frequency-weighted and frequency-limited variants), is presented. Section 4.3 then presents some existing methods that use an H2 measure for model reduction, and the proposed methods for ordinary, robust, frequency-weighted and frequency-limited model reduction are presented in Section 4.4. The material in this chapter is based on an extended version of the results in Petersson and Löfberg [2012a].
4.1 Introduction

Direct numerical simulation of dynamical systems has been a successful strategy for studying complex physical phenomena. However, deriving sufficiently detailed mathematical models, e.g., for designing controllers or analyzing performance, can be extremely difficult and can result in large and unnecessarily complicated models. This is the case particularly for systems pertaining to circuit simulations or dynamical systems coming from discretized partial differential equations. These large-scale models can make it difficult to analyze the system, due to memory limitations, time limitations, ill-conditioning or computationally expensive analysis methods. Hence, there is a need for smaller models that can describe large complex systems well. One way of creating these low-order models is through model reduction.
Given an lti model,

G :  ẋ(t) = A x(t) + B u(t),
     y(t) = C x(t) + D u(t),

Figure 4.1: Model reduction.
where A ∈ R^{n×n}, B ∈ R^{n×m}, C ∈ R^{p×n} and D ∈ R^{p×m}. For this model, the model-reduction problem is to find a reduced-order model

Ĝ :  x̂˙(t) = Â x̂(t) + B̂ u(t),
     ŷ(t) = Ĉ x̂(t) + D̂ u(t),

where Â ∈ R^{n̂×n̂}, B̂ ∈ R^{n̂×m}, Ĉ ∈ R^{p×n̂} and D̂ ∈ R^{p×m}, such that this reduced-order model, Ĝ, describes the original model, G, well in some metric. One way to quantify the discrepancy between G and Ĝ is through the difference in their respective outputs. Particularly, given a certain input, u(t), the difference in the outputs, e(t) = y(t) − ŷ(t), should be small in some norm, see Figure 4.1. This can be written as an optimization problem

minimize_{Ĝ} ‖ G − Ĝ ‖,

where n̂ denotes the size of the reduced system, i.e., the number of states in the system, and H∞ or H2 are two examples of norms that could be used. There are a number of methods that address this problem, for example using balanced truncation (see Section 4.2), e.g., Enns [1984], Moore [1981], Glover [1984], or using optimization, e.g., Flagg et al. [2010], Beattie and Gugercin [2007], Beattie and Gugercin [2009], Antoulas [2005], Poussot-Vassal [2011], Helmersson [1994] and the material in Section 4.4.
In many applications one is mainly interested in a low-order model that describes the system well only in a certain frequency interval. This leads us to investigate frequency-weighted model reduction. For frequency-weighted model reduction, weighting filters are utilized, and in order to also facilitate mimo systems an input filter (W_i) and an output filter (W_o) are needed. Examples of such methods are Enns [1984], Diab et al. [2000], Halevi [1992], Sreeram and Sahlan [2009], Zhou [1995]. Writing the frequency-weighted model-reduction problem as an optimization problem results in

minimize_{Ĝ} ‖ W_o ( G − Ĝ ) W_i ‖.

In the frequency-weighted case, the weights have to be given by the user and are in practice often difficult to choose. However, in many applications it is the case that a system should be approximated over a limited frequency interval, while the other frequencies are not important at all. In this case one would like to use an ideal bandpass filter, but approximating an ideal bandpass filter requires a large number of states in the weighting filters, and can lead to other problems. To address this issue there are methods, which could be classified as a special class of frequency-weighted model-reduction methods, that will be called frequency-limited model reduction. This class of methods uses approaches that behave as though ideal bandpass filters have been used, e.g., Gawronski and Juang [1990], Huang et al. [2001], Horta et al. [1993], Sahlan et al. [2012] and Poussot-Vassal and Vuillemin [2012], and we will introduce a new method using this strategy in Section 4.4.3.
4.2 Balanced Truncation

One of the most commonly used model-reduction schemes is called balanced truncation, introduced in Moore [1981]. The physical interpretation of balanced truncation is very simple: remove the states that induce a small amount of energy in the output and at the same time require a large amount of energy to excite. By understanding how the observability and controllability Gramians connect to these energies, see Section 2.1.2, one realizes that the system has to be expressed in a basis where the observability and controllability Gramians are equal and diagonal. Recall that the elements on the diagonal of the Gramians are then the Hankel singular values of the system, see Section 2.1.2. In this basis, the states that can be classified as both difficult to control and difficult to observe are the states that can be removed; these are the states that correspond to the small Hankel singular values. When a system is expressed in such a basis the system is called balanced. Given a system with the controllability Gramian P and the observability Gramian Q, where P has the Cholesky factor U, P = U U*, and U* Q U = K Σ² K*, it can be shown that the transformation needed to balance the system can be written as

T = Σ^{1/2} K* U^{−1}  and  T^{−1} = U K Σ^{−1/2},    (4.1)

with x̄ = T x; see for example Antoulas [2005].
Theorem 4.1 (Balanced reduction, Theorem 7.9 in Antoulas [2005]). Given a balanced system G = [A, B; C, D], which is asymptotically stable, with the Gramians equal to Σ, and given the partitioning

A = [ A₁₁ A₁₂ ; A₂₁ A₂₂ ] ∈ R^{n×n},  B = [ B₁ ; B₂ ] ∈ R^{n×m},  C = [ C₁ C₂ ] ∈ R^{p×n},  Σ = [ Σ₁ 0 ; 0 Σ₂ ].    (4.2)

Then

Ĝ = [ A₁₁, B₁ ; C₁, D ],  A₁₁ ∈ R^{n̂×n̂},

is a reduced-order system of order n̂, which is both stable and balanced. Additionally, it holds that

‖ G − Ĝ ‖_{H∞} ≤ 2 Σ_{i=n̂+1}^{n} σᵢ,    (4.3)

where σᵢ are the Hankel singular values of the system in descending order of magnitude.

Proof: See Theorem 7.9 in Antoulas [2005].
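The balancing transformation (4.1) can be sketched as follows (illustrative function name; SciPy assumed). The check verifies that the transformation indeed makes both Gramians equal to diag(σ).

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, cholesky, svd

def balance(A, B, C):
    # Balancing transformation T = Sigma^{1/2} K^T U^{-1} from (4.1)
    P = solve_continuous_lyapunov(A, -B @ B.T)     # controllability Gramian
    Q = solve_continuous_lyapunov(A.T, -C.T @ C)   # observability Gramian
    U = cholesky(P, lower=True)                    # P = U U^T
    K, s2, _ = svd(U.T @ Q @ U)                    # U^T Q U = K Sigma^2 K^T
    sigma = np.sqrt(s2)                            # Hankel singular values
    T = np.diag(np.sqrt(sigma)) @ K.T @ np.linalg.inv(U)
    Tinv = U @ K @ np.diag(1.0 / np.sqrt(sigma))
    return T, Tinv, sigma

A = np.array([[-1.0, 0.2], [0.0, -2.0]])
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 1.0]])
T, Tinv, sigma = balance(A, B, C)
```

A reduced model of order n̂ is then obtained by truncating the balanced realization (T A T^{−1}, T B, C T^{−1}) to its leading n̂ states, as in Theorem 4.1.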
There are several variations of the balanced-truncation method, which allow us to perform model reduction in a more computationally robust and efficient manner, e.g., Safonov and Chiang [1989], Safonov et al. [1990], Glover [1984]. Two properties that most of the balanced-truncation methods have in common (and which make them very popular) are the preservation of stability and the a priori computable error bounds. Important to note is that a system resulting from a balanced-truncation scheme is not a minimizer of a specific system-norm optimization (for example H2 or H∞).
As mentioned in Section 4.1, one important class of balanced-truncation methods are the frequency-weighted balanced-truncation methods, and they are described in the following way. Let G = [A, B; C, D] be an asymptotically stable system to be reduced. Also assume that an input weighting, W_i(s), and an output weighting, W_o(s), are given. Define the weighted controllability and observability Gramians as

P_i = (1/2π) ∫_{−∞}^{∞} ( iωI − A )^{−1} B W_i(iω) W_i*(iω) B* ( iωI − A )^{−*} dω,    (4.4a)
Q_o = (1/2π) ∫_{−∞}^{∞} ( iωI − A )^{−*} C* W_o*(iω) W_o(iω) C ( iωI − A )^{−1} dω,    (4.4b)

and compute the state transformation that simultaneously diagonalizes P_i and Q_o. Frequency-weighted balanced-truncation methods then utilize this transformation, which diagonalizes P_i and Q_o, to do a balanced truncation. This approach to frequency-weighted balanced truncation was first introduced in Enns [1984]. If either W_i = I or W_o = I, this method guarantees stability of the reduced model. However, if both input and output weightings are used at the same time, nothing can be guaranteed. Modifications of this method that guarantee stability when both input and output weights are used are discussed in, e.g., Lin and Chiu [1992], Varga and Anderson [2001].
Another important class of model-reduction methods, which was mentioned in Section 4.1, is frequency-limited balanced truncation. This was introduced by Gawronski and Juang [1990] for continuous-time systems and Horta et al. [1993] for discrete-time systems. In these articles they use frequency-limited Gramians (see Section 3.1) and simultaneously diagonalize these, to obtain a basis in which the truncation is done. The method in Gawronski and Juang [1990] can be seen as a special case of the method in Enns [1984] by choosing the weighting filters to be ideal bandpass filters (see Gugercin and Antoulas [2004]). However, the method in Gawronski and Juang [1990] cannot guarantee stability. A modification to this method that guarantees stability has been presented in Gugercin and Antoulas [2004].

4.3 Overview of Model-Reduction Methods using the H2 Norm
The problem of finding a reduced-order model that, in an H2 sense, resembles the original model well has been a goal in many investigations, especially since the work of Meier and Luenberger [1967], and in particular Wilson [1970], in which first-order optimality conditions for minimization of the H2 norm are derived; see also, for example, Lepschy et al. [1991], Beattie and Gugercin [2007], Fulcheri and Olivi [1998], Yan and Lam [1999] and references therein. One reason for this could be the fact that the H2 criterion provides a meaningful characterization of the error, both in deterministic and stochastic contexts. For example, given two discrete-time asymptotically stable siso systems G and Ĝ, with outputs y(t) and ŷ(t) respectively, and a white-noise input u(t) (i.e., the input spectrum is Φ_u(ω) = 1), it holds that

minimize E( y − ŷ )² = minimize ∫_{−π}^{π} | G(e^{iω}) − Ĝ(e^{iω}) |² Φ_u(ω) dω
                     = minimize ∫_{−π}^{π} | G(e^{iω}) − Ĝ(e^{iω}) |² dω
                     = minimize ‖ G − Ĝ ‖²_{H2}.    (4.5)
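The middle step of (4.5) — the frequency-domain error integral equals the squared H2 norm, which in turn is computable from a Gramian — can be checked numerically; the sketch below uses illustrative names and a simple midpoint rule for the integral.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def h2_sq_lyap(A, B, C):
    # tr(C P C^T) with A P A^T - P + B B^T = 0
    P = solve_discrete_lyapunov(A, B @ B.T)
    return float(np.trace(C @ P @ C.T))

def h2_sq_freq(A, B, C, num=20001):
    # (1/2pi) * integral over [-pi, pi) of |G(e^{iw})|^2, midpoint rule (siso)
    n = A.shape[0]
    w = (np.arange(num) + 0.5) / num * 2 * np.pi - np.pi
    total = 0.0
    for wi in w:
        g = C @ np.linalg.solve(np.exp(1j * wi) * np.eye(n) - A, B)
        total += abs(g[0, 0]) ** 2
    return total / num

A = np.array([[0.5]])
B = np.array([[1.0]])
C = np.array([[1.0]])
```

For G(z) = 1/(z − 0.5), both evaluations give Σ_{k≥0} 0.25^k = 4/3; since the integrand is smooth and periodic, the midpoint rule is extremely accurate here.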
Finding global minimizers for the H2 approximation problem is very difficult; it is in fact a nonlinear nonconvex optimization problem (see Example 4.1). The existing methods for H2 approximation have the more modest goal of finding local minimizers and can crudely be categorized into two categories: methods using tangential interpolation techniques and methods using gradient-flow techniques.
Example 4.1: Non-Convexity

To show that the cost function V = ‖Ĝ − G_true‖²_{H2} is non-convex, we start with the system

G_true = [ −1, 1 ; 1, 0 ].

A system

Ĝ = [ a, b ; c, 0 ],

that approximates the system G_true is sought, where a, b and c are the decision variables. Consider an initial guess in an optimization formulation to be the system

G₀ = [ −8, −2 ; −4, 0 ].

Now, given the system G₀, pick a descent direction for the cost function V, for example (δa, δb, δc)^T = (7, 5, 5)^T, such that

Ĝ(t) = [ −8 + 7t, −2 + 5t ; −4 + 5t, 0 ],  t ∈ [0, 1];

then the value of the cost function, V(t) = ‖Ĝ(t) − G_true‖²_{H2}, along the descent direction is non-convex, see Figure 4.2.
Figure 4.2: The value of the cost function V(t) along the search direction described in Example 4.1. The function clearly demonstrates the presence of local minima along the search direction.
The gradient-flow algorithms use the gradients of ‖G − Ĝ‖_{H2} with respect to the state-space matrices, derived in Wilson [1974], and let these evolve in time to find a local approximation of the given system; see for example Yan and Lam [1999], Fulcheri and Olivi [1998] and Huang et al. [2001]. The different algorithms in this class use different techniques to assure that the reduced model is stable, to speed up the process and to guarantee convergence.

The interpolation-based H2 model-reduction techniques try to find a model whose transfer function interpolates the transfer function of the full-order system (and its derivative) at selected interpolation points. These methods often use computationally effective Krylov-based algorithms, which makes them suitable for large-scale problems. Examples of these algorithms are Xu and Zeng [2011], Beattie and Gugercin [2007] and Poussot-Vassal [2011].
4.4 Model Reduction using an H2 Measure

In this section, the proposed methods for model reduction are presented. We consider the following description for the model-reduction problem. Given a system G, search for the system Ĝ such that

Ĝ = arg min_{Ĝ} ‖ W_o ( G − Ĝ ) W_i ‖²_{H2,ω}.    (4.6)
It is assumed that the systems G and Ĝ have the state-space realizations

G = [ A, B ; C, D ],  Ĝ = [ Â, B̂ ; Ĉ, D̂ ],    (4.7)

where

A ∈ R^{n×n}, B ∈ R^{n×m}, C ∈ R^{p×n}, D ∈ R^{p×m},  Â ∈ R^{n̂×n̂}, B̂ ∈ R^{n̂×m}, Ĉ ∈ R^{p×n̂}, D̂ ∈ R^{p×m}.    (4.8)

Since the H2 norm is used, it is also assumed that the system that is to be reduced, G, is asymptotically stable, since otherwise the H2 norm is not defined.
The idea with the proposed methods is to tackle the model-reduction problem head on. In Helmersson [1994] the model-reduction problem (in the H∞ norm) is rewritten as an sdp problem with bmis, which, even for small models, leads to large optimization problems that are hard to solve. In Anić et al. [2013] the model-reduction problem is rewritten as an interpolation problem, which makes it hard to incorporate structure in the system matrices. The proposed technique to solve the model-reduction problem is instead to use a nonlinear optimization approach and simply use a quasi-Newton algorithm. Using this technique, the problem is not rewritten in any other format, which makes it possible to both use and incorporate structure in the system matrices. Additionally, by taking caution when differentiating the different cost functions, and using the structure, the computational complexity can be kept low (in general an overhead cost of O(n³) and O(n² + n̂²) per iteration).
4.4.1 Standard Model Reduction

The method presented in this section was proposed already in Wilson [1970] for continuous time, however as a special case. The derivation in this section will include weighting filters and also the discrete-time case. In this thesis, a different derivation will be used, compared to Wilson [1970], with focus on being computationally efficient and also laying a foundation for the methods to come in the following sections.
The objective is to minimize the error between the given model, G, and the sought reduced-order model, Ĝ, in the H2 norm with weighting filters, W_i and W_o, i.e.,

Ĝ = arg min_{Ĝ} ‖E‖²_{H2},  E = W_o ( G − Ĝ ) W_i,    (4.9)

where it is assumed that W_i and W_o are given by the user and have the realizations

W_i = [ A_i, B_i ; C_i, D_i ],  W_o = [ A_o, B_o ; C_o, D_o ],    (4.10)

where

A_i ∈ R^{n_i×n_i}, B_i ∈ R^{n_i×m}, C_i ∈ R^{m×n_i}, D_i ∈ R^{m×m},  A_o ∈ R^{n_o×n_o}, B_o ∈ R^{n_o×p}, C_o ∈ R^{p×n_o}, D_o ∈ R^{p×p}.    (4.11)
Using the realizations of G, Ĝ, W_i and W_o, E can be realized as E = [ A_E, B_E ; C_E, D_E ], with

A_E = [ A  0  B C_i  0 ;  0  Â  B̂ C_i  0 ;  0  0  A_i  0 ;  B_o C  −B_o Ĉ  B_o (D − D̂) C_i  A_o ],
B_E = [ B D_i ;  B̂ D_i ;  B_i ;  B_o (D − D̂) D_i ],
C_E = [ D_o C  −D_o Ĉ  D_o (D − D̂) C_i  C_o ],
D_E = D_o (D − D̂) D_i.    (4.12)
To be able to use the structure in the realization of E, a partitioning of the Gramians, P_E and Q_E, is introduced:

P_E = [ P  P₁₂  P₁₃  P₁₄ ;  P₁₂^T  P̂  P₂₃  P₂₄ ;  P₁₃^T  P₂₃^T  P_i  P₃₄ ;  P₁₄^T  P₂₄^T  P₃₄^T  P_o ],
Q_E = [ Q  Q₁₂  Q₁₃  Q₁₄ ;  Q₁₂^T  Q̂  Q₂₃  Q₂₄ ;  Q₁₃^T  Q₂₃^T  Q_i  Q₃₄ ;  Q₁₄^T  Q₂₄^T  Q₃₄^T  Q_o ].    (4.13)
Since there will be some differences between the continuous- and the discrete-time cases, both cases will be presented. However, due to the many similarities between the two, the continuous-time case will be presented in more detail than the discrete-time case.

Continuous Time

In the continuous-time case, it is assumed that the error system is strictly proper, otherwise the H2 norm will be unbounded, i.e., D_o (D − D̂) D_i = 0. Assuming this, the cost function in (4.9) can be written as (see Section 2.1.3)

‖E‖²_{H2} = tr( B_E^T Q_E B_E )    (4.14a)
          = tr( C_E P_E C_E^T ),    (4.14b)

which are two equivalent ways of computing the cost function, where P_E and Q_E are the controllability and observability Gramians, respectively, for the error system E, satisfying the equations

A_E P_E + P_E A_E^T + B_E B_E^T = 0,    (4.15a)
A_E^T Q_E + Q_E A_E + C_E^T C_E = 0.    (4.15b)
Using (4.14) and (4.15) it is possible to state the general necessary conditions for optimality, from which the gradients of the problem readily can be extracted to be used in a quasi-Newton algorithm. In order to be as general as possible, we first neglect the structure in (4.12).

Theorem 4.2 (Necessary conditions for optimality). Assume that G, Ĝ, W_i and W_o are asymptotically stable and that E is strictly proper, for the H2 norm to be defined, i.e., A, Â, A_i and A_o are Hurwitz and D_o (D − D̂) D_i = 0. In order for the matrices Â, B̂, Ĉ to be optimal for the problem (4.9), it is necessary that they satisfy the equations in (4.15) and that

∂‖E‖²_{H2} / ∂Â = 2 E_ĝ^T Q_E P_E E_ĝ = 0,    (4.16a)
∂‖E‖²_{H2} / ∂B̂ = 2 ( E_ĝ^T Q_E P_E E_i C_i^T + E_ĝ^T Q_E B_E D_i^T ) = 0,    (4.16b)
∂‖E‖²_{H2} / ∂Ĉ = −2 ( B_o^T E_o^T P_E Q_E E_ĝ + D_o^T C_E P_E E_ĝ ) = 0,    (4.16c)

where

E_ĝ = [ 0_{n×n̂} ; I_{n̂} ; 0_{n_i×n̂} ; 0_{n_o×n̂} ],  E_i = [ 0_{n×n_i} ; 0_{n̂×n_i} ; I_{n_i} ; 0_{n_o×n_i} ],  E_o = [ 0_{n×n_o} ; 0_{n̂×n_o} ; 0_{n_i×n_o} ; I_{n_o} ].    (4.17)
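For the unweighted special case W_i = W_o = I, the gradient expressions of Theorem 4.2 reduce to Wilson's classical formulas in terms of the blocks of P_E and Q_E. The sketch below (hypothetical function name) implements this special case only, not the general weighted algorithm; the scalar check compares against values obtained analytically from the error-system Lyapunov equations.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov as lyap

def h2_error_and_grads(A, B, C, Ah, Bh, Ch):
    # Error system E = (blkdiag(A, Ah), [B; Bh], [C, -Ch]) for W_i = W_o = I
    n, r = A.shape[0], Ah.shape[0]
    Ae = np.block([[A, np.zeros((n, r))], [np.zeros((r, n)), Ah]])
    Be = np.vstack([B, Bh])
    Ce = np.hstack([C, -Ch])
    Pe = lyap(Ae, -Be @ Be.T)
    Qe = lyap(Ae.T, -Ce.T @ Ce)
    V = float(np.trace(Ce @ Pe @ Ce.T))
    X, Ph = Pe[:n, n:], Pe[n:, n:]   # cross and reduced-model blocks of P_E
    Y, Qh = Qe[:n, n:], Qe[n:, n:]   # cross and reduced-model blocks of Q_E
    gA = 2 * (Qh @ Ph + Y.T @ X)     # dV/dAh
    gB = 2 * (Qh @ Bh + Y.T @ B)     # dV/dBh
    gC = 2 * (Ch @ Ph - C @ X)       # dV/dCh
    return V, gA, gB, gC

A = np.array([[-1.0]]); B = np.array([[1.0]]); C = np.array([[1.0]])
Ah = np.array([[-2.0]]); Bh = np.array([[1.0]]); Ch = np.array([[1.0]])
V0, gA, gB, gC = h2_error_and_grads(A, B, C, Ah, Bh, Ch)
```

These are exactly the quantities a quasi-Newton method needs at each iterate: one solve of each Lyapunov equation gives both the cost and all three gradients.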
Before proving the theorem above, two lemmas are needed to simplify the proof.

Lemma 4.1. If M and N satisfy the Sylvester equations

AM + MB + C = 0,  NA + BN + D = 0,

then tr CN = tr DM.

Proof of Lemma 4.1: Multiplying the first Sylvester equation from the left with N and the second from the right with M entails

NAM + NMB + NC = 0,
NAM + BNM + DM = 0.

Now taking the trace of both equations yields

−tr( NAM + NMB ) = tr CN,
−tr( NAM + BNM ) = tr DM.

Hence, it holds that tr CN = tr DM.
48
4 Model Reduction
Lemma 4.2. If A ∈ R^{n×p}, B ∈ R^{m×n} and C ∈ R^{p×m}, and a_ij = [A]_ij, then it holds that

    tr (B ∂A/∂a_ij C) = [B^T C^T]_ij   ∀ i, j,

or equivalently

    ∂/∂A (tr BAC) = B^T C^T.

Proof of Lemma 4.2: First note that ∂A/∂a_ij = e_i e_j^T, which is a matrix with a one in element (i, j) and zeros elsewhere. Now, it holds that

    tr (B ∂A/∂a_ij C) = tr (B e_i e_j^T C) = tr (e_j^T C B e_i) = e_j^T C B e_i = [CB]_ji = [B^T C^T]_ij.
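Lemma 4.2 is easy to verify numerically: since tr BAC is linear in A, a finite-difference quotient reproduces B^T C^T up to rounding. The dimensions below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, m = 4, 3, 5
A = rng.standard_normal((n, p))
B = rng.standard_normal((m, n))
C = rng.standard_normal((p, m))

grad = B.T @ C.T                      # Lemma 4.2: d(tr BAC)/dA = B^T C^T
eps = 1e-6
fd = np.zeros_like(A)
for i in range(n):
    for j in range(p):
        Ap = A.copy(); Ap[i, j] += eps
        fd[i, j] = (np.trace(B @ Ap @ C) - np.trace(B @ A @ C)) / eps
assert np.allclose(fd, grad, atol=1e-4)
```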
Now, continuing with the proof of Theorem 4.2.

Proof of Theorem 4.2: If A, Â, A_i and A_o are Hurwitz, then all the equations in (4.15) are uniquely solvable. The solutions to the equations in (4.15) are needed to compute the cost function and its gradient. Now, the gradient of the cost function with respect to Â, B̂ and Ĉ has to be computed. Let a_ij, b_ij and c_ij denote element (i, j) in Â, B̂ and Ĉ, respectively; differentiating (4.14) with respect to a_ij, b_ij and c_ij entails
    ∂‖E‖²_{H₂}/∂a_ij = tr (∂Q_E/∂a_ij B_E B_E^T),                             (4.18a)
    ∂‖E‖²_{H₂}/∂b_ij = tr (2 ∂B_E^T/∂b_ij Q_E B_E + ∂Q_E/∂b_ij B_E B_E^T),    (4.18b)
    ∂‖E‖²_{H₂}/∂c_ij = tr (2 ∂C_E^T/∂c_ij C_E P_E + ∂P_E/∂c_ij C_E^T C_E).   (4.18c)
Differentiating (4.15) with respect to a_ij, b_ij and c_ij yields

    A_E^T ∂Q_E/∂a_ij + ∂Q_E/∂a_ij A_E + ∂A_E^T/∂a_ij Q_E + Q_E ∂A_E/∂a_ij = 0,   (4.19a)
    A_E^T ∂Q_E/∂b_ij + ∂Q_E/∂b_ij A_E + ∂A_E^T/∂b_ij Q_E + Q_E ∂A_E/∂b_ij = 0,   (4.19b)
    A_E ∂P_E/∂c_ij + ∂P_E/∂c_ij A_E^T + ∂A_E/∂c_ij P_E + P_E ∂A_E^T/∂c_ij = 0.   (4.19c)
Using Lemma 4.1 with (4.18) and (4.19) yields

    ∂‖E‖²_{H₂}/∂a_ij = 2 tr (∂A_E^T/∂a_ij Q_E P_E),                             (4.20a)
    ∂‖E‖²_{H₂}/∂b_ij = 2 tr (∂A_E^T/∂b_ij Q_E P_E + ∂B_E^T/∂b_ij Q_E B_E),      (4.20b)
    ∂‖E‖²_{H₂}/∂c_ij = 2 tr (∂A_E/∂c_ij P_E Q_E + ∂C_E^T/∂c_ij C_E P_E).        (4.20c)
Using the structure in the realization of E, (4.12), and Lemma 4.2, entails

    ∂‖E‖²_{H₂}/∂Â = 2 E_Ĝ^T Q_E P_E E_Ĝ = 0,                            (4.21a)
    ∂‖E‖²_{H₂}/∂B̂ = 2 E_Ĝ^T (Q_E P_E E_i C_i^T + Q_E B_E D_i^T) = 0,     (4.21b)
    ∂‖E‖²_{H₂}/∂Ĉ = −2 (B_o^T E_o^T Q_E P_E + D_o^T C_E P_E) E_Ĝ = 0,    (4.21c)

where

    E_Ĝ = [0_{n×n̂}; I_{n̂}; 0_{n_i×n̂}; 0_{n_o×n̂}],
    E_i = [0_{n×n_i}; 0_{n̂×n_i}; I_{n_i}; 0_{n_o×n_i}],                  (4.22)
    E_o = [0_{n×n_o}; 0_{n̂×n_o}; 0_{n_i×n_o}; I_{n_o}].
At first glance, it can seem restrictive to have a technique that operates on system matrices, since one is given a model in a specific realization. Does this influence the realization of the resulting model or otherwise restrict the sought model? As can be seen in Theorem 4.3 below, this is not the case, since the optimization problem is invariant to the realization of the given model to be reduced.
Theorem 4.3. The cost function in the optimization problem (4.6) and its gradient, given in Theorem 4.2, are invariant under state transformations of the systems G, W_i and W_o.

Proof: Given the realizations of G, W_i and W_o in (4.7) and (4.10), the realizations of the transformed systems, given the transformation matrices T, T_i and T_o, become

    G̃ = [T^{−1}AT  T^{−1}B; CT  D],
    W̃_i = [T_i^{−1}A_iT_i  T_i^{−1}B_i; C_iT_i  D_i],
    W̃_o = [T_o^{−1}A_oT_o  T_o^{−1}B_o; C_oT_o  D_o].
This can be written as

    Ẽ = [Ã_E  B̃_E; C̃_E  D_E] = [T_E^{−1}A_ET_E  T_E^{−1}B_E; C_ET_E  D_E],
    T_E = blkdiag(T, I, T_i, T_o).                                         (4.23)

The matrices P_E and Q_E will be transformed as

    P̃_E = T_E^{−1} P_E T_E^{−T},   Q̃_E = T_E^T Q_E T_E.                   (4.24)
Now it is easy to see that the cost function (4.14) is invariant under the transformation T_E, since

    ‖Ẽ‖²_{H₂} = tr B̃_E^T Q̃_E B̃_E = tr B_E^T T_E^{−T} T_E^T Q_E T_E T_E^{−1} B_E = tr B_E^T Q_E B_E.   (4.25)
If the products E_Ĝ^T T_E^{−T}, T_E^T E_i and E_o^T T_E^{−T} are evaluated, one obtains

    E_Ĝ^T T_E^{−T} = E_Ĝ^T,   T_E^T E_i = E_i T_i^T,   E_o^T T_E^{−T} = T_o^{−T} E_o^T.   (4.26)

Using (4.26) when computing the gradient entails
    ∂‖Ẽ‖²_{H₂}/∂Â = 2 E_Ĝ^T Q̃_E P̃_E E_Ĝ
                  = 2 E_Ĝ^T T_E^T Q_E T_E T_E^{−1} P_E T_E^{−T} E_Ĝ
                  = 2 E_Ĝ^T Q_E P_E E_Ĝ,

    ∂‖Ẽ‖²_{H₂}/∂B̂ = 2 E_Ĝ^T (Q̃_E P̃_E E_i C̃_i^T + Q̃_E B̃_E D_i^T)
                  = 2 E_Ĝ^T (T_E^T Q_E P_E T_E^{−T} E_i T_i^T C_i^T + T_E^T Q_E T_E T_E^{−1} B_E D_i^T)
                  = 2 E_Ĝ^T (Q_E P_E E_i C_i^T + Q_E B_E D_i^T),

    ∂‖Ẽ‖²_{H₂}/∂Ĉ = −2 (B̃_o^T E_o^T Q̃_E P̃_E + D_o^T C̃_E P̃_E) E_Ĝ
                  = −2 (B_o^T T_o^{−T} T_o^T E_o^T Q_E P_E T_E^{−T} + D_o^T C_E T_E T_E^{−1} P_E T_E^{−T}) E_Ĝ
                  = −2 (B_o^T E_o^T Q_E P_E + D_o^T C_E P_E) E_Ĝ,

which coincide with the expressions in (4.16).
Looking at the special case without weighting filters, i.e., W_i = I, W_o = I and n_i = n_o = 0, yields the cost function

    ‖E‖²_{H₂} = tr (B^T Q B + 2 B^T Q_12 B̂ + B̂^T Q̂ B̂),        (4.27a)
    ‖E‖²_{H₂} = tr (C P C^T − 2 C P_12 Ĉ^T + Ĉ P̂ Ĉ^T),         (4.27b)

and the first-order conditions for the gradient simplify to
    ∂‖E‖²_{H₂}/∂Â = 2 (Q̂ P̂ + Q_12^T P_12) = 0,   (4.28a)
    ∂‖E‖²_{H₂}/∂B̂ = 2 (Q̂ B̂ + Q_12^T B) = 0,      (4.28b)
    ∂‖E‖²_{H₂}/∂Ĉ = 2 (Ĉ P̂ − C P_12) = 0,        (4.28c)

where P, Q, P̂, Q̂, P_12 and Q_12 satisfy the equations
    A P + P A^T + B B^T = 0,          (4.29a)
    A^T Q + Q A + C^T C = 0,          (4.29b)
    A P_12 + P_12 Â^T + B B̂^T = 0,    (4.29c)
    A^T Q_12 + Q_12 Â − C^T Ĉ = 0,    (4.29d)
    Â P̂ + P̂ Â^T + B̂ B̂^T = 0,         (4.29e)
    Â^T Q̂ + Q̂ Â + Ĉ^T Ĉ = 0.         (4.29f)
Note that P and Q satisfy the Lyapunov equations for the controllability and observability Gramians of the given system, G, and that P̂ and Q̂ satisfy the Lyapunov equations for the controllability and observability Gramians of the sought system, Ĝ.
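The special-case cost (4.27) and gradient (4.28) can be evaluated directly from the six equations in (4.29), which is essentially all a quasi-Newton solver needs per iteration. A minimal sketch, assuming SciPy and illustrative matrices:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, solve_sylvester

def cost_and_gradient(A, B, C, Ah, Bh, Ch):
    # (4.29): Lyapunov equations for P, Q, Phat, Qhat, and Sylvester equations
    # for the cross terms P12, Q12 (solve_sylvester solves A X + X B = Q)
    P   = solve_continuous_lyapunov(A, -B @ B.T)
    Q   = solve_continuous_lyapunov(A.T, -C.T @ C)
    Ph  = solve_continuous_lyapunov(Ah, -Bh @ Bh.T)
    Qh  = solve_continuous_lyapunov(Ah.T, -Ch.T @ Ch)
    P12 = solve_sylvester(A, Ah.T, -B @ Bh.T)    # A P12 + P12 Ah^T + B Bh^T = 0
    Q12 = solve_sylvester(A.T, Ah, C.T @ Ch)     # A^T Q12 + Q12 Ah - C^T Ch = 0
    V  = np.trace(B.T @ Q @ B + 2 * B.T @ Q12 @ Bh + Bh.T @ Qh @ Bh)   # (4.27a)
    gA = 2 * (Qh @ Ph + Q12.T @ P12)             # (4.28a)
    gB = 2 * (Qh @ Bh + Q12.T @ B)               # (4.28b)
    gC = 2 * (Ch @ Ph - C @ P12)                 # (4.28c)
    return V, gA, gB, gC

A = np.array([[-1.0, 0.2], [0.0, -3.0]]); B = np.array([[1.0], [0.5]]); C = np.array([[1.0, 1.0]])
Ah = np.array([[-1.2]]); Bh = np.array([[0.8]]); Ch = np.array([[1.0]])
V, gA, gB, gC = cost_and_gradient(A, B, C, Ah, Bh, Ch)
```

The cost agrees with the alternative form (4.27b), and the gradients can be cross-checked against finite differences of V.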
For this special case it is also quite straightforward to derive the Hessian of the cost function. Using differentiated (with respect to a_ij, b_ij and c_ij) versions of the equations in (4.29), together with Lemma 4.1 and Lemma 4.2, yields
    ∂²V/∂a_ij∂a_kl = 2 [Q̂ ∂P̂/∂a_ij + Q_12^T ∂P_12/∂a_ij]_kl
                   + 2 [Q̂ ∂P̂/∂a_kl + Q_12^T ∂P_12/∂a_kl]_ij,     (4.30a)
    ∂²V/∂b_ij∂b_kl = 2 [Q̂]_ik if l = j, 0 otherwise,              (4.30b)
    ∂²V/∂c_ij∂c_kl = 2 [P̂]_lj if i = k, 0 otherwise,              (4.30c)
    ∂²V/∂a_ij∂b_kl = 2 [∂Q̂/∂a_ij B̂ + ∂Q_12^T/∂a_ij B]_kl,         (4.30d)
    ∂²V/∂c_ij∂a_kl = 2 [Ĉ ∂P̂/∂a_kl − C ∂P_12/∂a_kl]_ij,           (4.30e)
    ∂²V/∂c_ij∂b_kl = 2 [Ĉ ∂P̂/∂b_kl − C ∂P_12/∂b_kl]_ij.           (4.30f)
The explicit equations for the cost function, the gradient and the Lyapunov equations for the case when having both input and output ﬁlters are included in Appendix 4.B.1.
Discrete Time
In the discrete-time case, the cost function in (4.6) can be rewritten as (see Section 2.1.3)

    ‖E‖²_{H₂} = tr (B_E^T Q_E B_E + D_E^T D_E)     (4.31a)
              = tr (C_E P_E C_E^T + D_E D_E^T),    (4.31b)

which are two equivalent ways of computing the cost function. The matrices P_E and Q_E are the controllability and observability Gramians, respectively, for the error system E, and in this case they satisfy the discrete Lyapunov equations
    A_E P_E A_E^T − P_E + B_E B_E^T = 0,    (4.32a)
    A_E^T Q_E A_E − Q_E + C_E^T C_E = 0.    (4.32b)
Note that in the discrete-time case the system E no longer has to be strictly proper; however, it still has to be asymptotically stable for the H₂ norm to be defined.
Theorem 4.4 (Necessary conditions for optimality).
Assume that G, Ĝ, W_i and W_o are asymptotically stable, for the H₂ norm to be defined, i.e., A, Â, A_i and A_o are Schur. In order for the matrices Â, B̂, Ĉ and D̂ to be optimal for the problem (4.9), it is necessary that they satisfy the equations in (4.32) and that

    ∂‖E‖²_{H₂}/∂Â = 2 E_Ĝ^T Q_E A_E P_E E_Ĝ = 0,                            (4.33a)
    ∂‖E‖²_{H₂}/∂B̂ = 2 E_Ĝ^T (Q_E A_E P_E E_i C_i^T + Q_E B_E D_i^T) = 0,     (4.33b)
    ∂‖E‖²_{H₂}/∂Ĉ = −2 (B_o^T E_o^T Q_E A_E P_E + D_o^T C_E P_E) E_Ĝ = 0,    (4.33c)
    ∂‖E‖²_{H₂}/∂D̂ = −2 (B_o^T E_o^T (Q_E A_E P_E E_i C_i^T + Q_E B_E D_i^T)
                        + D_o^T (C_E P_E E_i C_i^T + D_E D_i^T)) = 0,        (4.33d)

where

    E_Ĝ = [0_{n×n̂}; I_{n̂}; 0_{n_i×n̂}; 0_{n_o×n̂}],
    E_i = [0_{n×n_i}; 0_{n̂×n_i}; I_{n_i}; 0_{n_o×n_i}],                      (4.34)
    E_o = [0_{n×n_o}; 0_{n̂×n_o}; 0_{n_i×n_o}; I_{n_o}].

Proof: The proof is analogous to the proof of Theorem 4.2 for the continuous-time case.

Theorem 4.5. The cost function of the optimization problem (4.6) and its gradient, given in Theorem 4.4, are invariant under state transformations of the systems G, W_i and W_o.

Proof: The proof is analogous to the proof of Theorem 4.3.

Now looking at the special case without weighting filters, i.e., W_i = I, W_o = I and n_i = n_o = 0, yields the cost function

    ‖E‖²_{H₂} = tr (B^T Q B + 2 B^T Q_12 B̂ + B̂^T Q̂ B̂ + (D − D̂)^T (D − D̂)),   (4.35a)
    ‖E‖²_{H₂} = tr (C P C^T − 2 C P_12 Ĉ^T + Ĉ P̂ Ĉ^T + (D − D̂)(D − D̂)^T),    (4.35b)

and the first-order conditions for the gradient simplify to

    ∂‖E‖²_{H₂}/∂Â = 2 (Q̂ Â P̂ + Q_12^T A P_12) = 0,   (4.36a)
    ∂‖E‖²_{H₂}/∂B̂ = 2 (Q̂ B̂ + Q_12^T B) = 0,          (4.36b)
    ∂‖E‖²_{H₂}/∂Ĉ = 2 (Ĉ P̂ − C P_12) = 0,            (4.36c)
    ∂‖E‖²_{H₂}/∂D̂ = 2 (D̂ − D) = 0,                   (4.36d)
where P, Q, P̂, Q̂, P_12 and Q_12 satisfy the equations
    A P A^T − P + B B^T = 0,          (4.37a)
    A^T Q A − Q + C^T C = 0,          (4.37b)
    A P_12 Â^T − P_12 + B B̂^T = 0,    (4.37c)
    A^T Q_12 Â − Q_12 − C^T Ĉ = 0,    (4.37d)
    Â P̂ Â^T − P̂ + B̂ B̂^T = 0,         (4.37e)
    Â^T Q̂ Â − Q̂ + Ĉ^T Ĉ = 0.         (4.37f)
Note that P and Q satisfy the Lyapunov equations for the controllability and observability Gramians of the given system, G, and that P̂ and Q̂ satisfy the Lyapunov equations for the controllability and observability Gramians of the sought system, Ĝ. For this special case, in discrete time, it is also quite straightforward to derive the Hessian of the cost function. Using differentiated (with respect to a_ij, b_ij and c_ij) versions of the equations in (4.37), together with Lemma 4.1 and Lemma 4.2, yields
    ∂²V/∂a_ij∂a_kl = 2 [Q̂ Â ∂P̂/∂a_ij + Q_12^T A ∂P_12/∂a_ij]_kl
                   + 2 [Q̂ Â ∂P̂/∂a_kl + Q_12^T A ∂P_12/∂a_kl]_ij,     (4.38a)
    ∂²V/∂b_ij∂b_kl = 2 [Q̂]_ik if l = j, 0 otherwise,                  (4.38b)
    ∂²V/∂c_ij∂c_kl = 2 [P̂]_lj if i = k, 0 otherwise,                  (4.38c)
    ∂²V/∂d_ij∂d_kl = 2 if i = k and j = l, 0 otherwise,               (4.38d)
    ∂²V/∂a_ij∂b_kl = 2 [Q̂ Â ∂P̂/∂b_kl + Q_12^T A ∂P_12/∂b_kl]_ij,      (4.38e)
    ∂²V/∂c_ij∂a_kl = 2 [Ĉ ∂P̂/∂a_kl − C ∂P_12/∂a_kl]_ij,               (4.38f)
    ∂²V/∂c_ij∂b_kl = 2 [Ĉ ∂P̂/∂b_kl − C ∂P_12/∂b_kl]_ij,               (4.38g)
    ∂²V/∂a_ij∂d_kl = ∂²V/∂b_ij∂d_kl = ∂²V/∂c_ij∂d_kl = 0.             (4.38h)
The explicit equations for the cost function, the gradient and the Lyapunov equations for the case when having both input and output ﬁlters are included in Appendix 4.B.2.
4.4.2
Robust Model Reduction
In the previous section, it has been tacitly assumed that the given data (i.e., the state-space matrices) are exact. In a more realistic setting, the presence of errors (e.g., from modeling, truncation or round-off) in these data can be assumed. The question is how to cope with these errors and take them into account. This can, for example, be done using robust optimization, which can be seen as a worst-case optimization approach. However, this is a very difficult problem, see, e.g., Bertsimas et al. [2011] or Ben-Tal and Nemirovski [2002]. In this section, a different view of robust optimization is investigated, namely to use regularization as a proxy for robust optimization.
Before presenting the equations for the regularized model-reduction problem, the idea is first presented using a more general description, to give some intuition. The idea is then exemplified using a least-squares (ls) problem and a quadratic programming (qp) problem.
Regularization can be used to make ill-posed problems well posed, or to make a solution less sensitive when only a small amount of data is available. Commonly used regularization methods for least-squares problems are, in the ℓ₁ case, lasso and, in the ℓ₂ case, Tikhonov regularization or ridge regression, see, e.g., Hastie et al. [2001]. In these regularizations an extra term, V_rob(x), penalizing the ℓ₁ or ℓ₂ norm of the sought variables, is added to the cost function, V_original(x), i.e.,

    V_reg(x) = V_original(x) + λ V_rob(x).   (4.39)

The regularization parameter, here denoted λ, is seen as a design parameter and is in most cases hard to tune (see, for example, Bauer and Lukas [2011]).
In many applications, there is no a priori knowledge about the variables, e.g., that they should be small (typically achieved by ℓ₂ regularization) or that the solution should be sparse (typically achieved using ℓ₁ regularization). Instead, one would like to make the solution less sensitive to uncertainties. As mentioned above, in this section regularization will be used as a proxy for robust optimization. The idea is to penalize the first-order derivative (with respect to data) of the cost function, to make it less sensitive to uncertainties in the data. This can be interpreted as making a first-order approximation of the general robust optimization problem

    minimize_x  max_{‖Λ‖₂ ≤ λ}  V(x, ŷ),   ŷ = y + Λ,   (4.40)

where ŷ ∈ R^m is the given data, y ∈ R^m is the unperturbed data, Λ ∈ R^m represents the uncertainty in the data and x ∈ R^n is the sought variable. To see how a regularization can be an approximation of the robust optimization problem, a Taylor expansion of the cost function with respect to the data is made. Assuming that the cost function is differentiable in the data variables, the cost function can be expressed as
    V(x, ŷ) = V(x, y) + (ŷ − y)^T ∇_y V(x, y) + O(‖ŷ − y‖₂²)
            = V(x, y) + Λ^T ∇_y V(x, y) + O(‖Λ‖₂²).             (4.41)
Limiting the uncertainty to be bounded, i.e., ‖Λ‖₂ ≤ λ, and computing the maximum of (4.41), yields

    max_{‖Λ‖₂ ≤ λ} V(x, ŷ) = max_{‖Λ‖₂ ≤ λ} [V(x, y) + Λ^T ∇_y V(x, y) + O(‖Λ‖₂²)]
                           = V(x, y) + λ ‖∇_y V(x, y)‖₂ + O(λ²).   (4.42)
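The quality of the first-order proxy in (4.42) is easy to probe on a toy cost where the worst case can be computed exactly. For V(x, y) = (x − y)^T(x − y), the maximum over ‖Λ‖₂ ≤ λ is (‖x − y‖₂ + λ)², and the proxy misses it only by the predicted O(λ²) term (the numbers below are illustrative):

```python
import numpy as np

x = np.array([1.0, 2.0]); y = np.array([0.5, 1.0]); lam = 1e-3

V = float((x - y) @ (x - y))
grad_y = -2.0 * (x - y)                       # gradient of V with respect to y
exact  = (np.linalg.norm(x - y) + lam) ** 2   # worst case, attained for Λ along y - x
proxy  = V + lam * np.linalg.norm(grad_y)     # regularized cost as in (4.42)

assert abs(exact - proxy) <= 2 * lam**2       # discrepancy is exactly the λ² term
```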
To make this more clear, some examples are presented for an ls problem and a qp problem.

Example 4.2: Robust ls and qp
Let us start with one of the most common problems, an ls problem. Assume that the data A and b are given and a solution x, fulfilling

    x = arg min_x V(x, A, b) = arg min_x (Ax − b)^T (Ax − b),   (4.43)

is sought. To see how, for example, the A matrix influences the cost function, the cost function is differentiated with respect to A, i.e.,
    ∂V(x, A, b)/∂a_ij = 2 tr (e_j e_i^T (Ax − b) x^T),   (4.44)

where a_ij is the (i, j) element in A. This yields

    ∂V(x, A, b)/∂A = 2 (Ax − b) x^T.                     (4.45)
Hence,

    ‖∂V(x, A, b)/∂A‖₂ = 2 ‖Ax − b‖₂ ‖x‖₂.   (4.46)

An interesting fact about the term in (4.46) is that it can be rewritten as

    2 ‖Ax − b‖₂ ‖x‖₂ = 2 (‖Ax − b‖₂ / ‖x‖₂) ‖x‖₂² = μ(x) ‖x‖₂²,

where μ(x) resembles Miller's choice of regularization parameter (see El Ghaoui and Lebret [1997] or Miller [1970]). In Miller [1970] the regularization parameter μ(x) is determined iteratively.

It is also possible to differentiate with respect to b in the ls problem. This term, together with the terms coming from differentiating with respect to H and f in a qp problem,

    V(x; H, f) = x^T H x + f^T x,   (4.47)

are collected in Table 4.1.
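The closed form (4.45) and the norm identity (4.46) can be confirmed numerically with central differences (random illustrative data):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 3)); b = rng.standard_normal(6)
x = np.linalg.lstsq(A, b, rcond=None)[0]

V = lambda M: float((M @ x - b) @ (M @ x - b))
grad_A = 2.0 * np.outer(A @ x - b, x)          # (4.45)

eps = 1e-6
fd = np.zeros_like(A)
for i in range(A.shape[0]):
    for j in range(A.shape[1]):
        Ap = A.copy(); Am = A.copy()
        Ap[i, j] += eps; Am[i, j] -= eps
        fd[i, j] = (V(Ap) - V(Am)) / (2 * eps)

assert np.allclose(fd, grad_A, atol=1e-5)
# (4.46): the gradient is rank one, so its norm factors as 2 ||Ax - b|| ||x||
assert np.isclose(np.linalg.norm(grad_A),
                  2 * np.linalg.norm(A @ x - b) * np.linalg.norm(x))
```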
Table 4.1: The different regularization terms for the different variables in the special cases, the ls problem and the qp problem.

    Problem | Variable      | Uncertainty   | Reg. term
    ls      | A             | ‖Λ‖_F ≤ λ     | 2λ ‖Ax − b‖₂ ‖x‖₂
    ls      | b             | ‖Λ‖₂ ≤ λ      | 2λ ‖Ax − b‖₂
    qp      | H (not sym.)  | ‖Λ‖_F ≤ λ     | λ ‖x‖₂²
    qp      | H (sym.)      | ‖Λ‖_F ≤ λ     | λ ‖x‖₂² (2 − tr((xx^T) ∘ (xx^T))/‖x‖₂⁴)^{1/2}
    qp      | f             | ‖Λ‖₂ ≤ λ      | λ ‖x‖₂
Now, the regularization strategy explained above will be used as an extension to the special case of the model-reduction method in Section 4.4.1, having no weighting filters. To reduce the influence of errors in data, the unregularized cost function (4.27) is regularized by adding new terms: the Frobenius norms of the derivatives of the cost function with respect to the given data, A, B, C and D. That is, the solution obtained is inclined to be less sensitive to uncertainties in the data. The optimization problem with these new terms becomes

    min_{Â, B̂, Ĉ, D̂}  ‖E‖²_{H₂} + V_rob,   E = G − Ĝ,   (4.48)

where
    V_rob = λ_A ‖∂‖E‖²_{H₂}/∂A‖_F + λ_B ‖∂‖E‖²_{H₂}/∂B‖_F
          + λ_C ‖∂‖E‖²_{H₂}/∂C‖_F + λ_D ‖∂‖E‖²_{H₂}/∂D‖_F.   (4.49)

Note that here the term V_rob includes the regularization parameters λ_A, λ_B, λ_C and λ_D. V_rob becomes different in the continuous-time case and the discrete-time case. By exploiting the symmetry in (4.27), (4.28) and (4.29) with respect to (Â, B̂, Ĉ) and (A, B, C), we obtain, in continuous time, that V_rob is
    V_rob = 2 (λ_A ‖Q P + Q_12 P_12^T‖_F + λ_B ‖Q B + Q_12 B̂‖_F + λ_C ‖C P − Ĉ P_12^T‖_F),   (4.50)

and in the discrete-time case it becomes

    V_rob = 2 (λ_A ‖Q A P + Q_12 Â P_12^T‖_F + λ_B ‖Q B + Q_12 B̂‖_F
              + λ_C ‖C P − Ĉ P_12^T‖_F + λ_D ‖D − D̂‖_F).                                     (4.51)

By differentiating the cost function (4.48) it is possible to state the necessary conditions for optimality, both for the continuous-time case and the discrete-time case.
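The continuous-time regularization term (4.50) is cheap to evaluate once the equations in (4.29) are solved. A sketch with illustrative matrices and unit regularization weights; note that V_rob vanishes when Ĝ reproduces G exactly, since all three data gradients are then zero:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, solve_sylvester

def v_rob(A, B, C, Ah, Bh, Ch, lam_A=1.0, lam_B=1.0, lam_C=1.0):
    P   = solve_continuous_lyapunov(A, -B @ B.T)
    Q   = solve_continuous_lyapunov(A.T, -C.T @ C)
    P12 = solve_sylvester(A, Ah.T, -B @ Bh.T)
    Q12 = solve_sylvester(A.T, Ah, C.T @ Ch)
    # (4.50): Frobenius norms of the cost gradients w.r.t. the data A, B, C
    return 2 * (lam_A * np.linalg.norm(Q @ P + Q12 @ P12.T, 'fro')
              + lam_B * np.linalg.norm(Q @ B + Q12 @ Bh, 'fro')
              + lam_C * np.linalg.norm(C @ P - Ch @ P12.T, 'fro'))

A = np.array([[-1.0, 0.4], [0.0, -2.0]]); B = np.array([[1.0], [1.0]]); C = np.array([[1.0, 0.3]])
assert v_rob(A, B, C, A, B, C) < 1e-8   # Ghat = G: P12 = P, Q12 = -Q, all terms cancel
assert v_rob(A, B, C, np.array([[-0.5]]), np.array([[1.0]]), np.array([[1.0]])) > 0.0
```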
Theorem 4.6 (Necessary conditions for optimality in continuous time).
Assume that G and Ĝ are asymptotically stable and that E is strictly proper, for the H₂ norm to be defined. In order for the matrices Â, B̂ and Ĉ to be optimal for (4.48), in continuous time, it is necessary that they satisfy the equations in (4.29) and the equations

    Â^T W_1 + W_1 A + Q_12^T (Q P + Q_12 P_12^T) = 0,   (4.52a)
    A W_2 + W_2 Â^T + (Q P + Q_12 P_12^T) P_12 = 0,     (4.52b)
    A W_3 + W_3 Â^T + (Q B + Q_12 B̂) B̂^T = 0,          (4.52c)
    Â^T W_4 + W_4 A + Ĉ^T (Ĉ P_12^T − C P) = 0,        (4.52d)

and that

    ∂‖E‖²_{H₂}/∂Â + ∂V_rob/∂Â = 0,   (4.53a)
    ∂‖E‖²_{H₂}/∂B̂ + ∂V_rob/∂B̂ = 0,   (4.53b)
    ∂‖E‖²_{H₂}/∂Ĉ + ∂V_rob/∂Ĉ = 0.   (4.53c)
With

    ∂V_rob/∂Â = 4 λ_A (W_1 P_12 + Q_12^T W_2) / ‖∂‖E‖²_{H₂}/∂A‖_F
              + 4 λ_B Q_12^T W_3 / ‖∂‖E‖²_{H₂}/∂B‖_F
              + 4 λ_C W_4 P_12 / ‖∂‖E‖²_{H₂}/∂C‖_F,

    ∂V_rob/∂B̂ = 4 λ_A W_1 B / ‖∂‖E‖²_{H₂}/∂A‖_F
              + 4 λ_B Q_12^T (Q B + Q_12 B̂) / ‖∂‖E‖²_{H₂}/∂B‖_F
              + 4 λ_C W_4 B / ‖∂‖E‖²_{H₂}/∂C‖_F,

    ∂V_rob/∂Ĉ = −4 λ_A C W_2 / ‖∂‖E‖²_{H₂}/∂A‖_F
              − 4 λ_B C W_3 / ‖∂‖E‖²_{H₂}/∂B‖_F
              + 4 λ_C (Ĉ P_12^T − C P) P_12 / ‖∂‖E‖²_{H₂}/∂C‖_F,

and ∂‖E‖²_{H₂}/∂Â, ∂‖E‖²_{H₂}/∂B̂ and ∂‖E‖²_{H₂}/∂Ĉ as in (4.28).
Proof: If G and Ĝ are asymptotically stable, the equations in (4.29) and (4.52) are uniquely solvable. The solutions to the equations in (4.29) and (4.52) are needed to compute the cost function and its gradient. Now the gradient of the cost function with respect to Â, B̂ and Ĉ has to be computed. The first part of the gradient, ∂‖E‖²_{H₂}/∂Â, ∂‖E‖²_{H₂}/∂B̂ and ∂‖E‖²_{H₂}/∂Ĉ, has been computed in Theorem 4.2 and can be found in (4.28). Only the equations for the gradient of the V_rob part are left to be calculated, since this part enters as an additive term in the cost function. The calculations of this part of the gradient are moved to Appendix 4.A.
An analogous result can be stated in discrete time.
Theorem 4.7 (Necessary conditions for optimality in discrete time).
Assume that G and Ĝ are asymptotically stable, for the H₂ norm to be defined. In order for the matrices Â, B̂, Ĉ and D̂ to be optimal for (4.48), in discrete time, it is necessary that they satisfy the equations in (4.37) and the equations

    Â^T W_1 A − W_1 + Â^T Q_12^T (Q A P + Q_12 Â P_12^T) = 0,   (4.55a)
    A W_2 Â^T − W_2 + (Q A P + Q_12 Â P_12^T) P_12 Â^T = 0,     (4.55b)
    A W_3 Â^T − W_3 + (Q B + Q_12 B̂) B̂^T = 0,                  (4.55c)
    Â^T W_4 A − W_4 + Ĉ^T (Ĉ P_12^T − C P) = 0,                (4.55d)
    W_5 = Q_12^T (Q A P + Q_12 Â P_12^T) P_12,                  (4.55e)

and that

    ∂‖E‖²_{H₂}/∂Â + ∂V_rob/∂Â = 0,   (4.56a)
    ∂‖E‖²_{H₂}/∂B̂ + ∂V_rob/∂B̂ = 0,   (4.56b)
    ∂‖E‖²_{H₂}/∂Ĉ + ∂V_rob/∂Ĉ = 0,   (4.56c)
    ∂‖E‖²_{H₂}/∂D̂ + ∂V_rob/∂D̂ = 0.   (4.56d)
With

    ∂V_rob/∂Â = 4 λ_A (W_5 + W_1 A P_12 + Q_12^T A W_2) / ‖∂‖E‖²_{H₂}/∂A‖_F
              + 4 λ_B Q_12^T A W_3 / ‖∂‖E‖²_{H₂}/∂B‖_F
              + 4 λ_C W_4 A P_12 / ‖∂‖E‖²_{H₂}/∂C‖_F,

    ∂V_rob/∂B̂ = 4 λ_A W_1 B / ‖∂‖E‖²_{H₂}/∂A‖_F
              + 4 λ_B Q_12^T (Q B + Q_12 B̂) / ‖∂‖E‖²_{H₂}/∂B‖_F
              + 4 λ_C W_4 B / ‖∂‖E‖²_{H₂}/∂C‖_F,

    ∂V_rob/∂Ĉ = −4 λ_A C W_2 / ‖∂‖E‖²_{H₂}/∂A‖_F
              − 4 λ_B C W_3 / ‖∂‖E‖²_{H₂}/∂B‖_F
              + 4 λ_C (Ĉ P_12^T − C P) P_12 / ‖∂‖E‖²_{H₂}/∂C‖_F,

    ∂V_rob/∂D̂ = 4 λ_D (D̂ − D) / ‖∂‖E‖²_{H₂}/∂D‖_F,

and ∂‖E‖²_{H₂}/∂Â, ∂‖E‖²_{H₂}/∂B̂, ∂‖E‖²_{H₂}/∂Ĉ and ∂‖E‖²_{H₂}/∂D̂ as in (4.36).
Proof: The proof is analogous to the one for Theorem 4.6.
4.4.3 Frequency-Limited Model Reduction

The method proposed in this section is a new method that was introduced in Petersson and Löfberg [2012a]. The method relies heavily on the theory in Chapter 3. The variants of this method for continuous and discrete time are similar and, therefore, the continuous-time case will be presented in full detail, while less detail is provided for the discrete-time case.

The method proposed in this section is a model-reduction method that, given a model G, finds a reduced-order model Ĝ which is a good approximation of G on a chosen frequency interval, e.g., [0, ω]. The objective is to minimize the discrepancy between the given model and the sought reduced-order model in a frequency-limited H₂ norm, using the frequency-limited Gramians. Correspondingly, the optimization problem for this purpose is as follows:
    Ĝ = arg min_{Ĝ} ‖E‖²_{H₂,ω},   E = G − Ĝ,   (4.58)

where ‖E‖²_{H₂,ω} is defined in Chapter 3.
Given the realization in (4.7), the error system can be realized, in state-space form, as

    E:  [A_E  B_E; C_E  D_E] = [A  0  B; 0  Â  B̂; C  −Ĉ  D − D̂].   (4.59)
Continuous Time

In the continuous-time case, the cost function of the optimization problem in (4.58) can be rewritten as (see Section 3.2.1)

    ‖E‖²_{H₂,ω} = tr C_E P_{E,ω} C_E^T + 2 tr (C_E S_{E,ω} B_E D_E^T) + (ω/π) tr D_E D_E^T    (4.60a)
                = tr B_E^T Q_{E,ω} B_E + 2 tr (C_E S_{E,ω} B_E D_E^T) + (ω/π) tr D_E^T D_E,   (4.60b)

which are two equivalent ways of computing the cost function, where

    A_E P_{E,ω} + P_{E,ω} A_E^T + S_{E,ω} B_E B_E^T + B_E B_E^T S_{E,ω}^* = 0,   (4.61a)
    A_E^T Q_{E,ω} + Q_{E,ω} A_E + S_{E,ω}^* C_E^T C_E + C_E^T C_E S_{E,ω} = 0,   (4.61b)

with

    S_{E,ω} = Re ((i/(2π)) ln ((−A_E − iωI)(−A_E + iωI)^{−1})).   (4.62)
Now, the cost function (4.60) can be rewritten using the inherent structure in the problem. This is done by using the realization given in (4.59) and by partitioning the Gramians P_{E,ω} and Q_{E,ω} as

    P_{E,ω} = [P_ω  P_{12,ω}; P_{12,ω}^T  P̂_ω],   Q_{E,ω} = [Q_ω  Q_{12,ω}; Q_{12,ω}^T  Q̂_ω],   (4.63)

and S_{E,ω} as

    S_{E,ω} = [S_ω  0; 0  Ŝ_ω].   (4.64)
The matrices P_ω, Q_ω, P̂_ω, Q̂_ω, P_{12,ω} and Q_{12,ω} satisfy, by (4.61), the Sylvester and Lyapunov equations

    A P_ω + P_ω A^T + S_ω B B^T + B B^T S_ω^* = 0,              (4.65a)
    A^T Q_ω + Q_ω A + S_ω^* C^T C + C^T C S_ω = 0,              (4.65b)
    A P_{12,ω} + P_{12,ω} Â^T + S_ω B B̂^T + B B̂^T Ŝ_ω^* = 0,    (4.65c)
    A^T Q_{12,ω} + Q_{12,ω} Â − S_ω^* C^T Ĉ − C^T Ĉ Ŝ_ω = 0,    (4.65d)
    Â P̂_ω + P̂_ω Â^T + Ŝ_ω B̂ B̂^T + B̂ B̂^T Ŝ_ω^* = 0,            (4.65e)
    Â^T Q̂_ω + Q̂_ω Â + Ŝ_ω^* Ĉ^T Ĉ + Ĉ^T Ĉ Ŝ_ω = 0,            (4.65f)

with

    S_ω = Re ((i/(2π)) ln ((−A − iωI)(−A + iωI)^{−1})),
    Ŝ_ω = Re ((i/(2π)) ln ((−Â − iωI)(−Â + iωI)^{−1})).         (4.66)
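The matrix S_ω in (4.66) is computable with a principal matrix logarithm. A numerical sketch, assuming the product form of the logarithm argument as written above; for a diagonal Hurwitz A the diagonal entries reduce to arctan(ω/|a|)/π, and S_ω → (1/2)I as ω → ∞, consistent with recovering the ordinary Gramians on an infinite interval:

```python
import numpy as np
from scipy.linalg import logm

def S_omega(A, w):
    """Frequency-limited Gramian factor S_w for a Hurwitz A, as in (4.66)."""
    I = np.eye(A.shape[0])
    M = (-A - 1j * w * I) @ np.linalg.inv(-A + 1j * w * I)
    return np.real(1j / (2 * np.pi) * logm(M))

A = np.diag([-1.0, -2.0])
assert np.isclose(S_omega(A, 1.0)[0, 0], np.arctan(1.0 / 1.0) / np.pi)
assert np.isclose(S_omega(A, 1.0)[1, 1], np.arctan(1.0 / 2.0) / np.pi)
assert np.allclose(S_omega(A, 1e9), 0.5 * np.eye(2), atol=1e-6)
```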
Note that P_ω and Q_ω satisfy the Lyapunov equations for the frequency-limited controllability and observability Gramians of the given model, and that P̂_ω and Q̂_ω satisfy the Lyapunov equations for the frequency-limited controllability and observability Gramians of the sought model, see Section 3.1.1. With the partitioning of P_{E,ω} and Q_{E,ω}, it is possible to rewrite (4.60) in two alternative forms:
    ‖E‖²_{H₂,ω} = tr (B^T Q_ω B + 2 B^T Q_{12,ω} B̂ + B̂^T Q̂_ω B̂)
                + 2 tr ((C S_ω B − Ĉ Ŝ_ω B̂ + (ω/(2π))(D − D̂))(D − D̂)^T),   (4.67a)

    ‖E‖²_{H₂,ω} = tr (C P_ω C^T − 2 C P_{12,ω} Ĉ^T + Ĉ P̂_ω Ĉ^T)
                + 2 tr ((C S_ω B − Ĉ Ŝ_ω B̂ + (ω/(2π))(D − D̂))(D − D̂)^T).   (4.67b)
Of course, as in Chapter 3, it is possible to have arbitrary segments in the frequency domain, e.g., 0 < ω₁ < ω₂ < ω₃ < ω₄ with Ω = [−ω₄, −ω₃] ∪ [−ω₂, −ω₁] ∪ [ω₁, ω₂] ∪ [ω₃, ω₄], giving ‖E‖²_{H₂,Ω}. Important to note is that if Ω does not contain an infinite interval, then neither the given system to be reduced, G, nor the reduced system, Ĝ, has to be strictly proper.
An appealing feature of the proposed optimization problem (4.58) is that the corresponding cost function, (4.67), is differentiable in the system matrices Â, B̂, Ĉ and D̂. In addition, the closed-form expressions obtained when differentiating the cost function are expressed in the given data (A, B, C and D), the optimization variables (Â, B̂, Ĉ and D̂) and solutions to the equations in (4.65). This makes it possible to formulate necessary conditions for optimality for the optimization problem (4.58).
Theorem 4.8 (Necessary conditions for optimality).
Assume that G and Ĝ are asymptotically stable, for the frequency-limited H₂ norm to be defined, i.e., A and Â are Hurwitz. In order for the matrices Â, B̂, Ĉ and D̂ to be optimal for the problem (4.58), it is necessary that they satisfy the equations in (4.65) and the equations in (4.29) and that

    ∂‖E‖²_{H₂,ω}/∂Â = 2 (Q̂_ω P̂ + Q_{12,ω}^T P_12 + W) = 0,                     (4.68a)
    ∂‖E‖²_{H₂,ω}/∂B̂ = 2 (Q̂_ω B̂ + Q_{12,ω}^T B − Ŝ_ω^T Ĉ^T (D − D̂)) = 0,        (4.68b)
    ∂‖E‖²_{H₂,ω}/∂Ĉ = 2 (Ĉ P̂_ω − C P_{12,ω} − (D − D̂) B̂^T Ŝ_ω^T) = 0,          (4.68c)
    ∂‖E‖²_{H₂,ω}/∂D̂ = −2 (C S_ω B − Ĉ Ŝ_ω B̂ + (ω/π)(D − D̂)) = 0,               (4.68d)

where

    W = Re ((i/π) L(−Â − iωI, V))^T,                                            (4.68e)
    V = Ĉ^T Ĉ P̂ − Ĉ^T C P_12 − Ĉ^T (D − D̂) B̂^T,                                (4.68f)

with the function L(·, ·) being the Fréchet derivative of the matrix logarithm, see Higham [2008].

Proof: If A and Â are Hurwitz, then the equations in (4.65) are uniquely solvable, see Theorem 2.1. These are needed to compute the cost function and its gradient. Now, the gradient of the cost function with respect to Â, B̂, Ĉ and D̂ has to be calculated. However, this is done in Appendix 4.C, since the calculations are quite long.
As in Section 4.4.1, the optimization problem in this section is invariant to the realization of the given model to be reduced, as can be seen in the following theorem.

Theorem 4.9. The cost function in the optimization problem (4.58) and its gradient, given in Theorem 4.8, are invariant under state transformations of the system G.
Proof: Given the realization of G in (4.7) and a transformation matrix T, the realization of the transformed system becomes

    G̃ = [T^{−1}AT  T^{−1}B; CT  D].

Realizing that S̃_ω = T^{−1} S_ω T, since

    S̃_ω = Re ((i/(2π)) ln ((−T^{−1}AT − iωI)(−T^{−1}AT + iωI)^{−1}))
        = T^{−1} Re ((i/(2π)) ln ((−A − iωI)(−A + iωI)^{−1})) T = T^{−1} S_ω T,

the proof is analogous to the proof of Theorem 4.3.
Discrete Time

In the discrete-time case, the cost function in (4.58) can be written as (see Section 3.2.2)

    ‖E‖²_{H₂,ω} = tr (C P_ω C^T − 2 C P_{12,ω} Ĉ^T + Ĉ P̂_ω Ĉ^T)
                + 2 tr ((C R_ω B − Ĉ R̂_ω B̂ + (ω/(2π))(D − D̂))(D − D̂)^T)    (4.69a)
                = tr (B^T Q_ω B + 2 B^T Q_{12,ω} B̂ + B̂^T Q̂_ω B̂)
                + 2 tr ((C R_ω B − Ĉ R̂_ω B̂ + (ω/(2π))(D − D̂))(D − D̂)^T),   (4.69b)

where

    A P_ω A^T − P_ω + S_ω B B^T + B B^T S_ω^T = 0,                (4.70a)
    A^T Q_ω A − Q_ω + S_ω^T C^T C + C^T C S_ω = 0,                (4.70b)
    A P_{12,ω} Â^T − P_{12,ω} + S_ω B B̂^T + B B̂^T Ŝ_ω^T = 0,      (4.70c)
    A^T Q_{12,ω} Â − Q_{12,ω} − S_ω^T C^T Ĉ − C^T Ĉ Ŝ_ω = 0,      (4.70d)

with

    S_ω = (1/(2π)) Re (ωI − 2i ln (I − A e^{−iω})),
    Ŝ_ω = (1/(2π)) Re (ωI − 2i ln (I − Â e^{−iω})),               (4.70e)
    R_ω = −(1/π) A^{−1} Re (i ln (I − A e^{−iω})),
    R̂_ω = −(1/π) Â^{−1} Re (i ln (I − Â e^{−iω})).                (4.70f)
For the discrete-time case it is also possible to calculate a closed-form expression for the gradient of the cost function, and again this makes it possible to formulate necessary conditions for optimality.
Theorem 4.10 (Necessary conditions for optimality).
Assume that G and Ĝ are asymptotically stable, for the frequency-limited H₂ norm to be defined, i.e., A and Â are Schur. In order for the matrices Â, B̂, Ĉ and D̂ to be optimal for the problem in (4.58), it is necessary that they satisfy the equations in (4.70) and the equations in (4.37) and that

    ∂‖E‖²_{H₂,ω}/∂Â = 2 (Q̂_ω Â P̂ + Q_{12,ω}^T A P_12 + W − Â^{−T} Ĉ^T (D − D̂) B̂^T R̂_ω^T) = 0,   (4.71a)
    ∂‖E‖²_{H₂,ω}/∂B̂ = 2 (Q̂_ω B̂ + Q_{12,ω}^T B − R̂_ω^T Ĉ^T (D − D̂)) = 0,                          (4.71b)
    ∂‖E‖²_{H₂,ω}/∂Ĉ = 2 (Ĉ P̂_ω − C P_{12,ω} − (D − D̂) B̂^T R̂_ω^T) = 0,                            (4.71c)
    ∂‖E‖²_{H₂,ω}/∂D̂ = −2 (C R_ω B − Ĉ R̂_ω B̂ + (ω/π)(D − D̂)) = 0,                                 (4.71d)

where

    W = Re ((i/π) e^{−iω} L(I − Â e^{−iω}, V))^T,                                                  (4.72a)
    V = (Ĉ^T Ĉ P̂ − Ĉ^T C P_12 − Ĉ^T (D − D̂) B̂^T) Â^{−1},                                          (4.72b)

with the function L(·, ·) being the Fréchet derivative of the matrix logarithm, see Higham [2008].

Proof: The proof is analogous to the proof of Theorem 4.8 for continuous time.
Theorem 4.11. The cost function of the optimization problem (4.58) and its gradient, given in Theorem 4.10, are invariant under state transformations of the system G.

Proof: Realizing that S̃_ω = T^{−1} S_ω T and R̃_ω = T^{−1} R_ω T makes the proof analogous to the proof of Theorem 4.3.
4.5 Computational Aspects of the Optimization Problems

In this section, suggestions for how to initialize the optimization, and for how the optimization can be performed efficiently by using the inherent structure to speed up the computations, will be presented.

For all the methods that have been presented in Section 4.4, a cost function has been given, together with necessary conditions for optimality. The gradients for all the methods are readily extracted from these necessary conditions. With this information it is straightforward to, for example, use any quasi-Newton solver, see Section 2.2.1, to solve the optimization problem in (4.6).
For two special cases, the Hessians were also calculated, which can be used to initialize the Hessian in the quasi-Newton solver. Computing the Hessian in all iterations would be too computationally expensive.

[Figure 4.3: Models in parallel — the outputs of two parallel subsystems G₁ and G₂ are summed to form the output of G.]
4.5.1 Structure in Variables

In some cases, the system matrices A, B, C and D have a certain structure that is desired to be preserved while computing Ĝ. In other words, it is desirable to have a similar structure in the system matrices Â, B̂, Ĉ and D̂. For example, assume that G has the structure given in Figure 4.3, with two systems in parallel, and that we want to use model reduction on the system G but also keep the internal parallel structure. In this case a block-diagonal Â matrix is desired.

Looking at all the cost functions in Section 4.4, there is nothing holding us back from introducing structure in the system matrices, e.g., a block-diagonal Â, when formulating our optimization problem. The question is whether the derived gradients are still usable when having structure in the system matrices, and the answer is yes. This is because all the steps in deriving the gradients have been carried out elementwise; when structure is desirable, only the elements that are variables are relevant, and hence only these are used. In general, for this purpose, the so-called structure variables S_Â, S_B̂, S_Ĉ and S_D̂ are introduced, which hold the structure of the system matrices, i.e., element (i, j) in S_Â is 1 if element (i, j) is a variable in the sought system matrix and 0 otherwise.
The gradients now become

    ∂‖E‖²_{H₂,ω}/∂Â ∘ S_Â,   ∂‖E‖²_{H₂,ω}/∂B̂ ∘ S_B̂,   ∂‖E‖²_{H₂,ω}/∂Ĉ ∘ S_Ĉ,   ∂‖E‖²_{H₂,ω}/∂D̂ ∘ S_D̂,

where ∘ denotes the Hadamard (element-wise) product of two matrices. Furthermore, with Â, B̂, Ĉ and D̂ initialized with structure according to S_Â, S_B̂, S_Ĉ and S_D̂, the structure will remain when moving along a quasi-Newton step.
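In an implementation, the masking is one Hadamard product per gradient. A small sketch with an assumed block-diagonal structure for Â (cf. Figure 4.3); the mask, step length and gradient values are all illustrative:

```python
import numpy as np

# 1 marks a free variable, 0 an entry fixed to zero (block-diagonal structure)
S_A = np.array([[1.0, 0.0],
                [0.0, 1.0]])
Ahat   = np.array([[-1.0,  0.0],
                   [ 0.0, -2.0]])
grad_A = np.array([[ 0.3,  9.9],
                   [ 9.9, -0.1]])   # entries outside the structure are simply ignored

step = 0.1
Ahat_new = Ahat - step * (grad_A * S_A)   # Hadamard product masks the update

assert np.all(Ahat_new[S_A == 0] == 0)    # the structure is preserved along the step
```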
4.5.2 Initialization

The optimization problem in (4.6) is both nonlinear and nonconvex, see, for instance, Example 4.1. This makes the initialization an important part of the problem. For the methods proposed in this chapter, the model used for initialization has to be asymptotically stable. Since there exist numerous methods for model reduction that are easily computed and produce asymptotically stable reduced models, e.g., balanced truncation, see Section 4.2, any of them can be used to create a model for initialization. In the special cases in Section 4.4.1, where there are no input or output filters, even more can be done for the initialization. Looking at the cost functions, (4.27) and (4.35), one sees that the cost function is quadratic in B̂ and Ĉ when Â (and D̂) are fixed, and since Q̂ (or P̂) is positive semidefinite, the resulting quadratic program is solvable. Hence, first a basic initialization is used to obtain a model with the correct number of states, e.g., using balanced truncation. This model is then used in the quadratic program to initialize B̂ and Ĉ.
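Since the cost is quadratic in B̂ and in Ĉ, setting (4.28b) and (4.28c) to zero gives the minimizers in closed form: B̂ = −Q̂^{−1} Q_12^T B for fixed (Â, Ĉ), and Ĉ = C P_12 P̂^{−1} for fixed (Â, B̂). The sketch below performs one such alternating refinement sweep; this is an assumed scheme in the spirit of the text, not necessarily the thesis' exact initialization procedure:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, solve_sylvester

def refine_Bh_Ch(A, B, C, Ah, Bh, Ch):
    # Update Bhat from (4.28b) = 0, with Qhat and Q12 computed for the current Chat
    Qh  = solve_continuous_lyapunov(Ah.T, -Ch.T @ Ch)
    Q12 = solve_sylvester(A.T, Ah, C.T @ Ch)
    Bh  = -np.linalg.solve(Qh, Q12.T @ B)
    # Update Chat from (4.28c) = 0, with Phat and P12 computed for the new Bhat
    Ph  = solve_continuous_lyapunov(Ah, -Bh @ Bh.T)
    P12 = solve_sylvester(A, Ah.T, -B @ Bh.T)
    Ch  = np.linalg.solve(Ph.T, P12.T @ C.T).T        # Chat = C P12 Phat^{-1}
    return Bh, Ch

A = np.array([[-1.0, 0.2], [0.0, -3.0]]); B = np.array([[1.0], [0.5]]); C = np.array([[1.0, 1.0]])
Ah = np.array([[-1.2]])
Bh, Ch = refine_Bh_Ch(A, B, C, Ah, np.array([[0.8]]), np.array([[1.0]]))
```

After the sweep, the condition Ĉ P̂ = C P_12 holds exactly for the returned matrices.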
4.5.3 Structure in Equations

In this section, the inherent structure in the equations will be used to speed up the computations. First, remember that the problem is a model reduction problem, and in most cases n̂ ≪ n. The analysis in this section will be based on the continuous-time case, but the same results are also valid for the discrete-time case. Consider the cost function for the general case, when using input and output filters, (4.98). The terms D_i^T B^T Q B D_i and D_o C P C^T D_o^T do not depend on any of the optimization variables and are the only terms that include the matrices P and Q (see (4.96), (4.97) and (4.98)). Hence, P and Q do not have to be computed. The same applies for the terms B^T Q_ω B and C P_ω C^T and the matrices P_ω and Q_ω in (4.65).
In all the presented methods, for every iteration in the solver, both the cost function and its gradient have to be computed. To do this a number of Lyapunov and
Sylvester equations have to be solved. This is where most of the computational time is spent. Therefore, before starting to analyze what is done in every iteration, a brief explanation on how to solve a general Sylvester equation is presented. A general Sylvester equation can be written as
AX + XB + C = 0,   A ∈ R^{n×n},  B ∈ R^{n̂×n̂},  C ∈ R^{n×n̂}.   (4.73)
The first main step when solving a Sylvester equation is to Schur factorize A and B (see e.g., Golub and Van Loan [1996] or Bartels and Stewart [1972]), which can be done in O(n³) operations for A and O(n̂³) operations for B. Now the equation

A_S X_S + X_S B_S + C_S = 0   (4.74)

has to be solved, where A_S = U^T A U and B_S = V^T B V are block upper triangular, computed using the Schur factorization, and C_S = U^T C V and X_S = U^T X V. It is not hard to verify that the new system of linear equations, (4.74), can be solved with O(n² n̂ + n n̂²) complexity, and the solution to (4.73) is computed as X = U X_S V^T, which also costs O(n² n̂ + n n̂²). It can be concluded that when solving several Sylvester equations with the same factors A and B but different C:s, speed can be gained in the computations if A and B are Schur factorized before solving the
equations. It can also be concluded that it is computationally much more efficient to use the structure in the realizations (4.12) and (4.59) and split up the large Lyapunov equations for P_E and Q_E into a number of smaller Lyapunov/Sylvester equations, as described in (4.96) and (4.97), which can be solved much more efficiently.
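The reuse argument can be sketched in code: Schur factorize A and B once, then solve A X + X B + C = 0 for many different C:s by back-substitution over the triangular factors (a minimal complex-Schur Bartels–Stewart variant; the helper names are ours):

```python
import numpy as np
from scipy.linalg import schur, solve

def schur_factors(A, B):
    """Schur factorize A and B once; the factors can be reused for
    every right-hand side C."""
    TA, U = schur(A, output='complex')   # A = U TA U^H, TA upper triangular
    TB, V = schur(B, output='complex')   # B = V TB V^H
    return TA, U, TB, V

def solve_sylvester_factored(TA, U, TB, V, C):
    """Solve A X + X B + C = 0 given precomputed Schur factors."""
    F = -U.conj().T @ C @ V              # triangular system TA Y + Y TB = F
    n, m = F.shape
    Y = np.zeros((n, m), dtype=complex)
    for k in range(m):                   # column-wise back-substitution
        rhs = F[:, k] - Y[:, :k] @ TB[:k, k]
        Y[:, k] = solve(TA + TB[k, k] * np.eye(n), rhs)
    return (U @ Y @ V.conj().T).real     # X is real for real A, B, C
```

Factorizing once and looping over right-hand sides is exactly the saving exploited above: the O(n³) Schur step is paid once, each additional solve only costs the triangular substitution.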
For the methods in Section 4.4.1 and Section 4.4.3, which are invariant under state transformations, the given system G (and the input and/or output filter if they are present) can be transformed to a basis such that the A matrices are upper triangular (Schur factorize the A matrices). In other words, given a Schur factorization of A, such that A = U Ā U^T, where Ā is block upper triangular and U is orthogonal, we can transform the system as follows,

G = [ U^T A U   U^T B ]
    [ C U       D     ]   (4.75)

and use this realization during the iterations. Additionally, looking at the Lyapunov/Sylvester equations that need to be solved (equations (4.96) and (4.97), or equations (4.65) or (4.52)), one observes that they all have the same underlying structure, i.e., the factors in the equations are A, Â, A_i and A_o. Assuming that A (and A_i and A_o) is given in real Schur form, then for every iteration only Â, which is small compared to A, has to be Schur factorized to be able to solve all Lyapunov/Sylvester equations at a maximum cost of O(n² n̂ + n n̂²).
4.6 Examples

In this section, some examples that show the applicability of the proposed methods will be presented. Where it is possible, comparisons with other relevant methods will be made. To be able to measure how well different methods perform, the relative error for the particular norm in use will be utilized, i.e.,

‖G − Ĝ‖_H / ‖G‖_H.   (4.76)
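A minimal way to evaluate (4.76) numerically for the H2 case is via the controllability Gramian; the helpers below are our own sketch (strictly proper, asymptotically stable models assumed):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def h2_norm(A, B, C):
    """H2 norm of a strictly proper stable system: sqrt(tr(C P C^T)),
    where P solves A P + P A^T + B B^T = 0."""
    P = solve_continuous_lyapunov(A, -B @ B.T)
    return float(np.sqrt(np.trace(C @ P @ C.T)))

def relative_h2_error(G, Gr):
    """Relative error ||G - Gr||_H2 / ||G||_H2 via the error realization."""
    (A, B, C), (Ar, Br, Cr) = G, Gr
    n, r = A.shape[0], Ar.shape[0]
    Ae = np.block([[A, np.zeros((n, r))],
                   [np.zeros((r, n)), Ar]])
    Be = np.vstack([B, Br])
    Ce = np.hstack([C, -Cr])
    return h2_norm(Ae, Be, Ce) / h2_norm(A, B, C)
```

For example, G(s) = 1/(s + 1) has ‖G‖_H2 = √(1/2), which the Gramian computation reproduces.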
To shorten the names and make the figures more readable, our proposed methods will be denoted as

• h2nl – the ordinary model-reduction method without weights, described in Section 4.4.1
• wh2nl – the ordinary model-reduction method with weights, described in Section 4.4.1
• flh2nl – the frequency-limited model-reduction method, described in Section 4.4.3
• rh2nl – the robust model-reduction method, described in Section 4.4.2
The methods that will be used for comparison, in the different examples, are

• bt – ordinary balanced truncation; the implementation used is the function schurmr in Robust Control Toolbox in Matlab
• wbt – weighted balanced truncation, an implementation of the method in Enns [1984]
• flbt – frequency-limited balanced truncation, an implementation of the method in Gawronski and Juang [1990]
• mflbt – modified frequency-limited balanced truncation, an implementation of the method in Gugercin and Antoulas [2004]
• itia – iterative tangential interpolation algorithm; the implementation in the more toolbox is used (see Poussot-Vassal and Vuillemin [2012])
• istia – iterative svd tangential interpolation algorithm (see Poussot-Vassal and Vuillemin [2012]); the implementation in the more toolbox is used
• flistia – frequency-limited iterative tangential interpolation algorithm (see Vuillemin et al. [2013]); the implementation in the more toolbox is used
We start with an example to illustrate that the balanced truncation method can be used for initialization of the proposed methods.

Example 4.3: H2 Model Reduction
In this example, 10000 random asymptotically stable and strictly proper siso systems with 20 states are generated using the function rss in Control System Toolbox in Matlab. On each of these systems, the number of states is reduced to 10 with bt and with h2nl. When reducing the order of a system with h2nl, the reduced model from bt is used as the initial point. In this case h2nl works as a refinement step on top of bt.
In Figure 4.4, two histograms are plotted. They show the histograms of the quantities

‖G − Ĝ_bt‖_{H2} / ‖G − Ĝ_h2nl‖_{H2}   and   ‖G − Ĝ_bt‖_{H∞} / ‖G − Ĝ_h2nl‖_{H∞},

respectively. In other words, they show how much the systems reduced using bt have been improved by h2nl, in H2 norm and in H∞ norm. h2nl works well as a model-reduction method and can in most cases decrease the model-reduction error 1–6 times, measured in the H2 norm. The average improvement in H2 norm is 4.15. Observe that also the H∞ norm can be improved when using h2nl; this is because bt is not a solution to a minimum-norm problem, neither in H2 nor in H∞. On average, a run with h2nl takes 1.82 seconds, and with bt it takes 0.07 seconds.

We continue with two more examples based on a medium-scale model of a clamped beam. For the first example we use ordinary model reduction without weights, and for the second one the frequency-limited model-reduction method is utilized.
Figure 4.4: The figure illustrates, in two histograms, how much a system reduced using bt has been improved using h2nl; the x-axis is the quotient between the norm of the error system from using bt and the norm of the error system from using h2nl, ‖G − Ĝ_bt‖_H / ‖G − Ĝ_h2nl‖_H, for the H2 norm (top) and the H∞ norm (bottom).
Example 4.4: Clamped Beam Model, varying order
In this example, a model of a clamped beam, a siso model with 348 states which can be found in Leibfritz and Lipinski [2003], is used. The model will be reduced to different orders, n_r ∈ [4, 30], with h2nl. The reduced models using h2nl will be compared with models reduced using istia, itia and bt. In the left plot of Figure 4.5, it can be observed that for small n_r, h2nl, itia and istia are better than bt, for the H2 norm, and for larger n_r the error approaches zero for all methods. It can also be observed, in the right plot of Figure 4.5, that, even though we are minimizing the H2 norm, the H∞ norm remains small for all the methods.
Example 4.5: Clamped Beam Model, limited frequency interval
In this example, the model of the clamped beam from the previous example is reused. This time, instead of trying different orders, the focus will be on finding reduced models for different frequency intervals, [0, ω], ω ∈ [2, 40], with the reduced-order model fixed to have 12 states, n_r = 12. The proposed method flh2nl will be used, and it will be compared with the frequency-limited methods flistia, flbt and mflbt. Additionally, the methods wh2nl and wbt will be used, both with a tenth-order Butterworth low-pass filter with the cutoff frequency equal to ω. Looking at the left plot of Figure 4.6, it can be observed that for small ω all the H2-optimal methods do very well. However, for ω > 7, flh2nl gives better results than all the other methods. As in the previous example, the relative H∞
Figure 4.5: Reduction of a clamped beam model to different orders using h2nl, itia, istia and bt. To the left, the relative H2 error and to the right the relative H∞ error.
norm remains low; for almost all ω, the H2-optimal methods have better relative H∞ error than the methods using balanced truncation.

Now, two smaller examples are presented to show how models coming from frequency-limited methods can look in the frequency region of interest and outside this region. We start with a small toy example.
Example 4.6: Small toy example
This example considers a small model with four states. The model is composed of two second-order models in series, one with a resonance frequency at ω = 1 and the other at ω = 3. The frequency range is limited to ω ∈ [0, 2] to try to capture only the first model. The model used is

G = G_1 G_2 = 1/(s² + 0.2s + 1) · 3/(s² + 0.9s + 9).   (4.77)
The methods flh2nl, flistia, flbt and mflbt are compared. These methods are also compared with the methods wh2nl and wbt using a tenth-order low-pass Butterworth filter with a cutoff frequency of 2, see Figure 4.7. The results from the different methods can be seen in Figure 4.8, Figure 4.9 and Table 4.2. As can be seen in the results, flh2nl, wh2nl, flistia and flbt are successful in finding a good model for the relevant frequencies, especially flh2nl, which is almost six times better, in H2 norm, than the second-best model, wh2nl, see Table 4.2. mflbt captures the wrong resonance mode (from our perspective) and fails completely in the lower frequency region, and wbt misses capturing the gain at both the resonance frequency and at the cutoff frequency. It is interesting to note how the methods that do a good job sacrifice the model fit at higher frequencies for the lower.
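For reference, the toy model can be checked numerically; the sketch below assumes the reading G_1 = 1/(s² + 0.2s + 1) and G_2 = 3/(s² + 0.9s + 9) (our reconstruction of the example) and verifies that the dominant resonance lies near ω = 1:

```python
import numpy as np

def G(s):
    """Series connection of two second-order sections with resonance
    frequencies near omega = 1 and omega = 3 (assumed coefficients)."""
    G1 = 1.0 / (s**2 + 0.2 * s + 1.0)
    G2 = 3.0 / (s**2 + 0.9 * s + 9.0)
    return G1 * G2

# Frequency response on a logarithmic grid covering both resonances
w = np.logspace(-1, 1, 400)
mag = np.abs(G(1j * w))
```

Evaluating the magnitude on this grid shows the global peak close to ω = 1, which is why a frequency limit of [0, 2] isolates the first mode.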
Figure 4.6: Reduction of a clamped beam model to 12 states with focus on the frequency interval [0, ω], ω ∈ [2, 40], using flh2nl, wh2nl, flistia, flbt and mflbt. The filter used for the weighted methods is a tenth-order Butterworth low-pass filter with cutoff frequency ω. To the left, the relative H2 error and to the right the relative H∞ error.
Figure 4.7: The true and filtered model and the low-pass filter for Example 4.6. The dashed vertical line denotes ω = 2.
Figure 4.8: The true and reduced-order models for Example 4.6. The dashed vertical line denotes ω = 2. flh2nl, wh2nl, flistia and flbt are successful in finding a good model for the relevant frequencies, while mflbt and wbt fail.
Figure 4.9: The error models for the different methods for Example 4.6. The dashed vertical line denotes ω = 2. flh2nl, wh2nl, flistia and flbt are successful in finding a good model for the relevant frequencies, while mflbt and wbt fail.
Table 4.2: Numerical results for Example 4.6

Method    ‖G−Ĝ‖_{H2,ω}/‖G‖_{H2,ω}   ‖G−Ĝ‖_{H∞,ω}/‖G‖_{H∞,ω}   Re λ_max
wbt       3.01e-01                  2.91e-01                  -1.00e-01
mflbt     1.00e+00                  1.00e+00                  -1.51e-03
flbt      6.31e-02                  4.00e-02                  -9.93e-02
flistia   6.38e-02                  3.96e-02                  -9.99e-02
flh2nl    1.02e-02                  1.15e-02                  -1.01e-01
wh2nl     5.97e-02                  3.95e-02                  -1.00e-01
Figure 4.10: The true and filtered model and the band-pass filter for Example 4.7. The dashed vertical lines denote ω = 10 and ω = 10000.
Example 4.7: CD player
This example uses a slightly larger model, a model of a compact-disc player with 120 states and two inputs and two outputs, see Leibfritz and Lipinski [2003]. In this example, to illustrate the result in the same way as in the previous example, only one siso part of the transfer function is chosen, namely the transfer function from the second input to the first output of the model. Here, focus will be on a banded frequency interval, ω ∈ [10, 1000], where the main peak gain is, see Figure 4.10. The methods that will be compared are the frequency-limited methods flbt, mflbt, flistia and flh2nl, and the weighted methods wbt and wh2nl with a tenth-order Butterworth band-pass filter with cutoff frequencies equal to ω = 10 and ω = 1000. Looking at the results in Figure 4.11, Figure 4.12 and Table 4.3, all the methods, except flistia, do a good job, and again flh2nl finds the best model.
Figure 4.11: The true and reduced-order models for Example 4.7. The dashed vertical lines denote ω = 10 and ω = 10000. flh2nl, wh2nl, mflbt, wbt and flbt are successful in finding a good model for the relevant frequencies. However, in this example the method flistia fails.
Figure 4.12: The error models for the different methods for Example 4.7. The dashed vertical lines denote ω = 10 and ω = 10000. flh2nl, wh2nl, mflbt, wbt and flbt are successful in finding a good model for the relevant frequencies. However, in this example the method flistia fails.
Table 4.3: Numerical results for Example 4.7

Method    ‖G−Ĝ‖_{H2,ω}/‖G‖_{H2,ω}   ‖G−Ĝ‖_{H∞,ω}/‖G‖_{H∞,ω}   Re λ_max
wbt       1.24e-03                  9.50e-04                  -5.55e+00
mflbt     1.25e-03                  9.43e-04                  -5.54e+00
flbt      1.24e-03                  9.41e-04                  -5.54e+00
flistia   8.23e-02                  5.64e-02                  -2.26e-01
flh2nl    6.95e-04                  6.83e-04                  -5.63e+00
wh2nl     8.94e-04                  7.76e-04                  -5.80e+00
Example 4.8: CD player with perturbed poles
In this example, the model of the CD player from the last example is used again. However, in this case the system matrices of the model are perturbed such that

A_pert = A + E_A ∘ A,
B_pert = B + E_B ∘ B,
C_pert = C + E_C ∘ C,

where ∘ denotes elementwise multiplication and the elements in E_A, E_B and E_C are independent random variables with the distribution N(0, 0.05²). The perturbed model will be reduced to a fifteenth-order model using rh2nl with different values of the regularization parameter. This will be compared with reducing the model with h2nl. This procedure is repeated 250 times with different realizations of the random variables, and the average is computed. The result from the optimization can be seen in Figure 4.13. In this figure, the average relative errors between the true, unperturbed, model and the reduced models, as functions of the regularization parameter, are plotted for rh2nl and h2nl.

In Figure 4.13 one observes that for the tested values of the regularization parameter it is possible, in this case, to find a better model. Even for the H∞ norm, it is possible to find a model that performs better than the unregularized method.

Some more examples using model-reduction methods will be performed in Chapter 7, where two larger examples, which need more background, are presented.
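The perturbation step of this experiment can be sketched as follows (the entrywise-relative perturbation model and the helper name are our reconstruction of the setup; the reduction step itself is omitted):

```python
import numpy as np

def perturb(A, B, C, sigma=0.05, rng=None):
    """Entrywise relative perturbation of the system matrices:
    each entry is scaled by (1 + e) with e ~ N(0, sigma^2)."""
    rng = np.random.default_rng() if rng is None else rng
    EA = rng.normal(0.0, sigma, A.shape)
    EB = rng.normal(0.0, sigma, B.shape)
    EC = rng.normal(0.0, sigma, C.shape)
    return A + EA * A, B + EB * B, C + EC * C
```

In the Monte Carlo experiment, each of the 250 repetitions would draw a fresh perturbation with this routine, reduce the perturbed model, and record the error against the unperturbed model.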
4.7 Conclusions

In this chapter, three model-reduction methods (in both continuous and discrete time) based on minimizing the H2 norm using optimization have been presented. For these methods, both cost functions and gradients have been derived, which makes it possible to efficiently use off-the-shelf quasi-Newton solvers. For a few cases the Hessians have been derived, which can also be utilized in the quasi-Newton solver. The derivation of the methods enables us to impose structural
Figure 4.13: The black line is the average relative error, over different perturbations, for rh2nl using different values of the regularization parameter α, and the blue line is the average relative error, over different perturbations, for h2nl (H2 norm in the left plot and H∞ norm in the right).
constraints, e.g., a block-diagonal Â matrix, on the system matrices. Additionally, a number of examples showing the applicability of the methods, both for small and medium-scale problems, have been presented, for which the methods have performed well.

One of the drawbacks of the methods is the nonconvexity of the problem. One way to possibly reduce the influence of the nonconvexity is to have a better initialization, which is a subject of further research. However, for the examples presented in this chapter the proposed initialization procedure seems to work.
Appendix

4.A Gradient of V_rob

In this appendix, the derivation of the gradient, with respect to Â, B̂ and Ĉ, of V_rob in (4.49) in Section 4.4.2 will be presented, where

V_rob = λ_A ‖∂‖E‖²_{H2}/∂A‖_F + λ_B ‖∂‖E‖²_{H2}/∂B‖_F + λ_C ‖∂‖E‖²_{H2}/∂C‖_F + λ_D ‖∂‖E‖²_{H2}/∂D‖_F.   (4.78)

To differentiate V_rob, we first need the definition of the Frobenius norm. Given a function f(x), the Frobenius norm is defined as

‖f(x)‖²_F = tr( f(x)^T f(x) ).   (4.79)
Differentiating ‖f(x)‖_F with respect to x yields

∂‖f(x)‖_F/∂x = ( ∂/∂x tr( f(x)^T f(x) ) ) / ( 2 ‖f(x)‖_F ).   (4.80)

This means that the still unknown part when calculating ∂‖f(x)‖_F/∂x, given f(x), is the numerator. Given the structure of V_rob in (4.78), this means that to obtain an expression for the gradient of V_rob we need to calculate, for example, terms like

∂/∂a_ij tr( [ ∂‖E‖²_{H2}/∂A ]^T [ ∂‖E‖²_{H2}/∂A ] ).
In this appendix, elements in the matrices Â, B̂ and Ĉ will be denoted by a_ij, b_ij and c_ij, respectively.
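Identity (4.80) is easy to sanity-check numerically; the following sketch verifies it by a finite difference for the test function f(X) = AX + XB (our own choice of test function and data):

```python
import numpy as np

def frob(M):
    """Frobenius norm, sqrt(tr(M^T M))."""
    return np.sqrt(np.trace(M.T @ M))

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))
X = rng.standard_normal((4, 4))

def f(X):
    return A @ X + X @ B

# Numerator of (4.80): gradient of tr(f(X)^T f(X)) w.r.t. X,
# which for f(X) = A X + X B is 2 (A^T f(X) + f(X) B^T).
num = 2 * (A.T @ f(X) + f(X) @ B.T)

# (4.80): d ||f(X)||_F / dX = numerator / (2 ||f(X)||_F)
grad = num / (2 * frob(f(X)))
```

The directional derivative predicted by `grad` matches a central finite difference of ‖f(X)‖_F, which is exactly the content of (4.80).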
To simplify the equations later on, four new Sylvester equations are defined,

Â^T W_1 + W_1 A + Q_12^T ( QP + Q_12 P_12^T ) = 0,   (4.81a)
A W_2 + W_2 Â^T + ( QP + Q_12 P_12^T ) P_12 = 0,   (4.81b)
A W_3 + W_3 Â^T + ( QB + Q_12 B̂ ) B̂^T = 0,   (4.81c)
Â^T W_4 + W_4 A + Ĉ^T ( Ĉ P_12^T − CP ) = 0,   (4.81d)

whose origin will become clear soon. Differentiated versions of the equations in (4.29) will also be needed,
Â (∂P̂/∂a_ij) + (∂P̂/∂a_ij) Â^T + (∂Â/∂a_ij) P̂ + P̂ (∂Â^T/∂a_ij) = 0,   (4.82a)
A (∂P_12/∂a_ij) + (∂P_12/∂a_ij) Â^T + P_12 (∂Â^T/∂a_ij) = 0,   (4.82b)
Â^T (∂Q_12^T/∂a_ij) + (∂Q_12^T/∂a_ij) A + (∂Â^T/∂a_ij) Q_12^T = 0,   (4.82c)
A (∂P_12/∂b_ij) + (∂P_12/∂b_ij) Â^T + B (∂B̂^T/∂b_ij) = 0,   (4.82d)
Â^T (∂Q_12^T/∂c_ij) + (∂Q_12^T/∂c_ij) A − (∂Ĉ^T/∂c_ij) C = 0.   (4.82e)
We start with the terms containing ∂‖E‖²_{H2}/∂A,

tr( [∂‖E‖²_{H2}/∂A]^T [∂‖E‖²_{H2}/∂A] ) = tr( 4 ( QP + Q_12 P_12^T )^T ( QP + Q_12 P_12^T ) )
  = 4 tr( PQQP + 2 P_12 Q_12^T QP + P_12 Q_12^T Q_12 P_12^T ).   (4.83)

Differentiating with respect to Â:

∂/∂a_ij tr( [∂‖E‖²_{H2}/∂A]^T [∂‖E‖²_{H2}/∂A] ) = 8 tr( (∂P_12/∂a_ij) Q_12^T ( QP + Q_12 P_12^T ) + (∂Q_12^T/∂a_ij) ( QP + Q_12 P_12^T ) P_12 ).   (4.84)
Differentiating with respect to B̂:

∂/∂b_ij tr( [∂‖E‖²_{H2}/∂A]^T [∂‖E‖²_{H2}/∂A] ) = 8 tr( (∂P_12/∂b_ij) Q_12^T ( QP + Q_12 P_12^T ) ).   (4.85)
Differentiating with respect to Ĉ:

∂/∂c_ij tr( [∂‖E‖²_{H2}/∂A]^T [∂‖E‖²_{H2}/∂A] ) = 8 tr( (∂Q_12^T/∂c_ij) ( QP + Q_12 P_12^T ) P_12 ).   (4.86)
Now, continue with the terms containing ∂‖E‖²_{H2}/∂B,

tr( [∂‖E‖²_{H2}/∂B]^T [∂‖E‖²_{H2}/∂B] ) = tr( 4 ( QB + Q_12 B̂ )^T ( QB + Q_12 B̂ ) )
  = 4 tr( B^T QQB + 2 B̂^T Q_12^T QB + B̂^T Q_12^T Q_12 B̂ ).   (4.87)
Differentiating with respect to Â:

∂/∂a_ij tr( [∂‖E‖²_{H2}/∂B]^T [∂‖E‖²_{H2}/∂B] ) = 8 tr( (∂Q_12^T/∂a_ij) ( QB + Q_12 B̂ ) B̂^T ).   (4.88)
Differentiating with respect to B̂:

∂/∂b_ij tr( [∂‖E‖²_{H2}/∂B]^T [∂‖E‖²_{H2}/∂B] ) = 8 tr( (∂B̂^T/∂b_ij) Q_12^T ( QB + Q_12 B̂ ) ).   (4.89)
Differentiating with respect to Ĉ:

∂/∂c_ij tr( [∂‖E‖²_{H2}/∂B]^T [∂‖E‖²_{H2}/∂B] ) = 8 tr( (∂Q_12^T/∂c_ij) ( QB + Q_12 B̂ ) B̂^T ).   (4.90)
Continuing with the terms containing ∂‖E‖²_{H2}/∂C,

tr( [∂‖E‖²_{H2}/∂C]^T [∂‖E‖²_{H2}/∂C] ) = tr( 4 ( CP − Ĉ P_12^T )^T ( CP − Ĉ P_12^T ) )
  = 4 tr( PC^T CP − 2 P_12 Ĉ^T CP + P_12 Ĉ^T Ĉ P_12^T ).   (4.91)
Differentiating with respect to Â:

∂/∂a_ij tr( [∂‖E‖²_{H2}/∂C]^T [∂‖E‖²_{H2}/∂C] ) = 8 tr( (∂P_12/∂a_ij) Ĉ^T ( Ĉ P_12^T − CP ) ).   (4.92)
Differentiating with respect to B̂:

∂/∂b_ij tr( [∂‖E‖²_{H2}/∂C]^T [∂‖E‖²_{H2}/∂C] ) = 8 tr( (∂P_12/∂b_ij) Ĉ^T ( Ĉ P_12^T − CP ) ).   (4.93)
Differentiating with respect to Ĉ:

∂/∂c_ij tr( [∂‖E‖²_{H2}/∂C]^T [∂‖E‖²_{H2}/∂C] ) = −8 tr( (∂Ĉ^T/∂c_ij) ( CP − Ĉ P_12^T ) P_12 ).   (4.94)
Here is where the equations for W_1, W_2, W_3 and W_4 from (4.81) come in. Using Lemma 4.1 with the equations in (4.81) and (4.82), together with the equations above, entails

∂V_rob/∂Â = 4 λ_A ( W_1 P_12 + Q_12^T W_2 ) / ‖∂‖E‖²_{H2}/∂A‖_F + 4 λ_B Q_12^T W_3 / ‖∂‖E‖²_{H2}/∂B‖_F + 4 λ_C W_4 P_12 / ‖∂‖E‖²_{H2}/∂C‖_F,   (4.95a)

∂V_rob/∂B̂ = 4 λ_A W_1 B / ‖∂‖E‖²_{H2}/∂A‖_F + 4 λ_B Q_12^T ( QB + Q_12 B̂ ) / ‖∂‖E‖²_{H2}/∂B‖_F + 4 λ_C W_4 B / ‖∂‖E‖²_{H2}/∂C‖_F,   (4.95b)

∂V_rob/∂Ĉ = −4 λ_A C W_2 / ‖∂‖E‖²_{H2}/∂A‖_F − 4 λ_B C W_3 / ‖∂‖E‖²_{H2}/∂B‖_F − 4 λ_C ( CP − Ĉ P_12^T ) P_12 / ‖∂‖E‖²_{H2}/∂C‖_F.   (4.95c)
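The unweighted building blocks of these formulas can be verified numerically. The sketch below checks the standard gradient 2( Q̂ P̂ + Q_12^T P_12 ) of the squared H2 error with respect to Â by a finite difference (the helper functions and test systems are our own):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov as lyap, solve_sylvester

def h2_err_sq(A, B, C, Ah, Bh, Ch):
    """||G - Ghat||_H2^2 via the Gramian of the error realization."""
    n, r = A.shape[0], Ah.shape[0]
    Ae = np.block([[A, np.zeros((n, r))], [np.zeros((r, n)), Ah]])
    Be = np.vstack([B, Bh])
    Ce = np.hstack([C, -Ch])
    Pe = lyap(Ae, -Be @ Be.T)
    return float(np.trace(Ce @ Pe @ Ce.T))

def grad_Ahat(A, B, C, Ah, Bh, Ch):
    """Gradient 2(Qhat Phat + Q12^T P12) of the squared H2 error w.r.t. Ahat."""
    P12 = solve_sylvester(A, Ah.T, -B @ Bh.T)   # A P12 + P12 Ah^T + B Bh^T = 0
    Ph = lyap(Ah, -Bh @ Bh.T)                    # Ah Ph + Ph Ah^T + Bh Bh^T = 0
    Q12 = solve_sylvester(A.T, Ah, C.T @ Ch)     # A^T Q12 + Q12 Ah - C^T Ch = 0
    Qh = lyap(Ah.T, -Ch.T @ Ch)                  # Ah^T Qh + Qh Ah + Ch^T Ch = 0
    return 2 * (Qh @ Ph + Q12.T @ P12)
```

Since V_rob is built from exactly these Gramian products, checking the underlying gradient this way is a convenient way to validate an implementation of the appendix formulas.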
4.B Equations for Frequency-Weighted Model Reduction

In this appendix, the equations that come from partitioning P_E and Q_E as in (4.13) and using the realization (4.12) of E will be presented, both for continuous and discrete time.

4.B.1 Continuous Time
A
o
P
o
+ P
o
A
AP
T
o
AP +
A
12
+
AP
ˆ
+
B
o
14
P
PA
P
CP
14
12
+ P
ˆ
A
T
T
T
14
+
+
BC
BC
P
i i
T
14
P
P
T
13
T
23
C
T
+
+
B
T
o
P
P
−
13
23
A
B
i
C
C
o
P
i
T
i
T
i
B
T
B
T
+
24
+
P
BD
i i i
A
−
i
T
P
+
T
24
D
D
i
T
i
T
B
T
B
T
B
i
C
T
B
B
T
o i
T
+
A
T
o
BC
+
i
AP
13
P
T
23
+ P
BC
i
P
+
13
34
P
13
A
i
T
+
C
i
T
+
B
BC
PC
T
T
B
i
T
o
+
P
i
−
P
BD
+
i
D
i
T
B
T
BD
12
i
C
T
B
B
T
o i
T
AP
24
+ P
A
i
24
P
A
T
o
34
AP
23
+ P
+
BC
i
34
P
P
A
T
o
23
34
+
A
i
T
+
P
P
T
12
T
13
C
BC
i
C
T
B
T
T
o
P
B
T
o
−
i
−
P
T
23
BD
i
B
P
ˆ
T
B
T
o i
T
C
T
B
T
o
= 0
,
= 0
,
= 0
,
= 0
,
= 0
,
= 0
,
= 0
,
= 0
,
= 0
,
= 0
,
(4.96a)
(4.96b)
(4.96c)
(4.96d)
(4.96e)
(4.96f)
(4.96g)
(4.96h)
(4.96i)
(4.96j) and
A^T Q + Q A + Q_14 B_o C + C^T B_o^T Q_14^T + C^T D_o^T D_o C = 0,   (4.97a)
Â^T Q̂ + Q̂ Â − Q_24 B_o Ĉ − Ĉ^T B_o^T Q_24^T + Ĉ^T D_o^T D_o Ĉ = 0,   (4.97b)
A_i^T Q_i + Q_i A_i + C_i^T B^T Q_13 + Q_13^T B C_i + C_i^T B̂^T Q_23 + Q_23^T B̂ C_i = 0,   (4.97c)
A_o^T Q_o + Q_o A_o + C_o^T C_o = 0,   (4.97d)
A^T Q_12 + Q_12 Â + C^T B_o^T Q_24^T − Q_14 B_o Ĉ − C^T D_o^T D_o Ĉ = 0,   (4.97e)
A^T Q_13 + Q_13 A_i + Q B C_i + Q_12 B̂ C_i + C^T B_o^T Q_34^T = 0,   (4.97f)
A^T Q_14 + Q_14 A_o + C^T B_o^T Q_o + C^T D_o^T C_o = 0,   (4.97g)
Â^T Q_23 + Q_23 A_i + Q_12^T B C_i + Q̂ B̂ C_i − Ĉ^T B_o^T Q_34^T = 0,   (4.97h)
Â^T Q_24 + Q_24 A_o − Ĉ^T B_o^T Q_o − Ĉ^T D_o^T C_o = 0,   (4.97i)
A_i^T Q_34 + Q_34 A_o + C_i^T B^T Q_14 + C_i^T B̂^T Q_24 = 0.   (4.97j)
Splitting the cost function, (4.14), using the realization (4.12) of E and the partitioning of P_E and Q_E, yields

‖E‖²_{H2} = tr( D_i^T B^T Q B D_i + 2 D_i^T B^T Q_12 B̂ D_i + D_i^T B̂^T Q̂ B̂ D_i + B_i^T Q_i B_i + 2 B_i^T Q_13^T B D_i + 2 B_i^T Q_23^T B̂ D_i ),   (4.98a)

‖E‖²_{H2} = tr( D_o C P C^T D_o^T − 2 D_o C P_12 Ĉ^T D_o^T + D_o Ĉ P̂ Ĉ^T D_o^T + C_o P_o C_o^T + 2 D_o C P_14 C_o^T − 2 D_o Ĉ P_24 C_o^T ).   (4.98b)
The gradient becomes

∂‖E‖²_{H2}/∂Â = 2( Q̂ P̂ + Q_12^T P_12 + Q_23 P_23^T + Q_24 P_24^T ),   (4.99a)

∂‖E‖²_{H2}/∂B̂ = 2( Q_12^T P_13 + Q̂ P_23 + Q_23 P_i + Q_24 P_34^T ) C_i^T + 2( Q_12^T B D_i + Q̂ B̂ D_i + Q_23 B_i ) D_i^T,   (4.99b)

∂‖E‖²_{H2}/∂Ĉ = −2 B_o^T ( Q_14^T P_12 + Q_24^T P̂ + Q_34^T P_23^T + Q_o P_24^T ) + 2 D_o^T ( D_o Ĉ P̂ − D_o C P_12 − C_o P_24^T ).   (4.99c)
4.B.2
Discrete Time
Splitting the equations in (4.32) using the partitioning in (4.13) yields the equations
A P A^T − P + A P_13 C_i^T B^T + B C_i P_13^T A^T + B C_i P_i C_i^T B^T + B D_i D_i^T B^T = 0,   (4.100a)
Â P̂ Â^T − P̂ + Â P_23 C_i^T B̂^T + B̂ C_i P_23^T Â^T + B̂ C_i P_i C_i^T B̂^T + B̂ D_i D_i^T B̂^T = 0,   (4.100b)
A_i P_i A_i^T − P_i + B_i B_i^T = 0,   (4.100c)
A_o P_o A_o^T − P_o + A_o ( P_14^T C^T − P_24^T Ĉ^T ) B_o^T + B_o ( C P_14 − Ĉ P_24 ) A_o^T + B_o ( C P C^T − C P_12 Ĉ^T − Ĉ P_12^T C^T + Ĉ P̂ Ĉ^T ) B_o^T = 0,   (4.100d)
A P_12 Â^T − P_12 + A P_13 C_i^T B̂^T + B C_i P_23^T Â^T + B C_i P_i C_i^T B̂^T + B D_i D_i^T B̂^T = 0,   (4.100e)
A P_13 A_i^T − P_13 + B C_i P_i A_i^T + B D_i B_i^T = 0,   (4.100f)
A P_14 A_o^T − P_14 + B C_i P_34 A_o^T + ( A P + B C_i P_13^T ) C^T B_o^T − ( A P_12 + B C_i P_23^T ) Ĉ^T B_o^T = 0,   (4.100g)
Â P_23 A_i^T − P_23 + B̂ C_i P_i A_i^T + B̂ D_i B_i^T = 0,   (4.100h)
Â P_24 A_o^T − P_24 + B̂ C_i P_34 A_o^T + ( Â P_12^T + B̂ C_i P_13^T ) C^T B_o^T − ( Â P̂ + B̂ C_i P_23^T ) Ĉ^T B_o^T = 0,   (4.100i)
A_i P_34 A_o^T − P_34 + A_i ( P_13^T C^T − P_23^T Ĉ^T ) B_o^T = 0,   (4.100j)
and
A^T Q A − Q + A^T Q_14 B_o C + C^T B_o^T Q_14^T A + C^T B_o^T Q_o B_o C + C^T D_o^T D_o C = 0,   (4.101a)
Â^T Q̂ Â − Q̂ − Â^T Q_24 B_o Ĉ − Ĉ^T B_o^T Q_24^T Â + Ĉ^T B_o^T Q_o B_o Ĉ + Ĉ^T D_o^T D_o Ĉ = 0,   (4.101b)
A_i^T Q_i A_i − Q_i + C_i^T B^T Q B C_i + C_i^T B^T Q_12 B̂ C_i + C_i^T B̂^T Q_12^T B C_i + C_i^T B̂^T Q̂ B̂ C_i + A_i^T Q_13^T B C_i + C_i^T B^T Q_13 A_i + A_i^T Q_23^T B̂ C_i + C_i^T B̂^T Q_23 A_i = 0,   (4.101c)
A_o^T Q_o A_o − Q_o + C_o^T C_o = 0,   (4.101d)
A^T Q_12 Â − Q_12 + C^T B_o^T Q_24^T Â − A^T Q_14 B_o Ĉ − C^T B_o^T Q_o B_o Ĉ − C^T D_o^T D_o Ĉ = 0,   (4.101e)
A^T Q_13 A_i − Q_13 + A^T Q B C_i + C^T B_o^T Q_14^T B C_i + A^T Q_12 B̂ C_i + C^T B_o^T Q_24^T B̂ C_i + C^T B_o^T Q_34^T A_i = 0,   (4.101f)
A^T Q_14 A_o − Q_14 + C^T B_o^T Q_o A_o + C^T D_o^T C_o = 0,   (4.101g)
Â^T Q_23 A_i − Q_23 + Â^T Q_12^T B C_i − Ĉ^T B_o^T Q_14^T B C_i + Â^T Q̂ B̂ C_i − Ĉ^T B_o^T Q_24^T B̂ C_i − Ĉ^T B_o^T Q_34^T A_i = 0,   (4.101h)
Â^T Q_24 A_o − Q_24 − Ĉ^T B_o^T Q_o A_o − Ĉ^T D_o^T C_o = 0,   (4.101i)
A_i^T Q_34 A_o − Q_34 + C_i^T B^T Q_14 A_o + C_i^T B̂^T Q_24 A_o = 0.   (4.101j)
Using the partitioning in (4.13) again yields the cost function

‖E‖²_{H2} = tr( D_i^T B^T Q B D_i + 2 D_i^T B^T Q_12 B̂ D_i + D_i^T B̂^T Q̂ B̂ D_i + B_i^T Q_i B_i + 2 B_i^T Q_13^T B D_i + 2 B_i^T Q_23^T B̂ D_i + D_i^T ( D − D̂ )^T D_o^T D_o ( D − D̂ ) D_i ),   (4.102a)

‖E‖²_{H2} = tr( D_o C P C^T D_o^T − 2 D_o C P_12 Ĉ^T D_o^T + D_o Ĉ P̂ Ĉ^T D_o^T + C_o P_o C_o^T + 2 D_o C P_14 C_o^T − 2 D_o Ĉ P_24 C_o^T + D_o ( D − D̂ ) D_i D_i^T ( D − D̂ )^T D_o^T ).   (4.102b)
The gradient becomes

∂‖E‖²_{H2}/∂Â = 2( Q_12^T ( A P_12 + B C_i P_23^T ) + Q̂ ( Â P̂ + B̂ C_i P_23^T ) + Q_23 A_i P_23^T + Q_24 ( A_o P_24^T + B_o ( C P_12 − Ĉ P̂ ) ) ),   (4.103a)

∂‖E‖²_{H2}/∂B̂ = 2( Q_12^T ( A P_13 + B C_i P_i ) + Q̂ ( Â P_23 + B̂ C_i P_i ) + Q_23 A_i P_i + Q_24 ( A_o P_34 + B_o ( C P_13 − Ĉ P_23 ) ) ) C_i^T + 2( Q_12^T B D_i + Q̂ B̂ D_i + Q_23 B_i ) D_i^T,   (4.103b)

∂‖E‖²_{H2}/∂Ĉ = −2 B_o^T ( Q_14^T ( A P_12 + B C_i P_23^T ) + Q_24^T ( Â P̂ + B̂ C_i P_23^T ) + Q_34^T A_i P_23^T + Q_o ( A_o P_24^T + B_o ( C P_12 − Ĉ P̂ ) ) ) + 2 D_o^T ( D_o Ĉ P̂ − D_o C P_12 − C_o P_24^T ),   (4.103c)

∂‖E‖²_{H2}/∂D̂ = 2 D_o^T D_o ( D̂ − D ) D_i D_i^T.   (4.103d)
4.C Gradient of the Frequency-Limited Case

In this section, the derivation of the gradient of the cost function (4.67) will be presented. We start by differentiating the cost function (4.67) with respect to B̂, Ĉ and D̂. First, note that neither Q_ω, Q_{12,ω} nor Q̂_ω in equation (4.67a) depends on B̂. This means that (4.67a) is quadratic in B̂. Analogous observations can be made for equation (4.67b) and the variables Ĉ and D̂. Hence, the derivatives of the cost function with respect to B̂, Ĉ and D̂ become

∂‖E‖²_{H2,ω}/∂B̂ = 2( Q̂_ω B̂ + Q_{12,ω}^T B + Ŝ_ω^T Ĉ^T ( D̂ − D ) ),   (4.104a)
∂‖E‖²_{H2,ω}/∂Ĉ = 2( Ĉ P̂_ω − C P_{12,ω} + ( D̂ − D ) B̂^T Ŝ_ω^T ),   (4.104b)
∂‖E‖²_{H2,ω}/∂D̂ = −2( C S_ω B − Ĉ Ŝ_ω B̂ + (ω/π)( D − D̂ ) ).   (4.104c)

When differentiating with respect to Â, both Q_{12,ω} and Q̂_ω depend on Â.
∂‖E‖²_{H2,ω}/∂a_ij = tr( 2 B̂ B^T (∂Q_{12,ω}/∂a_ij) + B̂ B̂^T (∂Q̂_ω/∂a_ij) − 2 B̂ ( D − D̂ )^T Ĉ (∂Ŝ_ω/∂a_ij) ),   (4.105)

where ∂Q_{12,ω}/∂a_ij and ∂Q̂_ω/∂a_ij depend on Â via the differentiated versions of the equations in (4.65),

Â^T (∂Q_{12,ω}^T/∂a_ij) + (∂Q_{12,ω}^T/∂a_ij) A + (∂Â^T/∂a_ij) Q_{12,ω}^T − (∂Ŝ_ω^T/∂a_ij) Ĉ^T C = 0,   (4.106a)
Â^T (∂Q̂_ω/∂a_ij) + (∂Q̂_ω/∂a_ij) Â + (∂Â^T/∂a_ij) Q̂_ω + Q̂_ω (∂Â/∂a_ij) + (∂Ŝ_ω^T/∂a_ij) Ĉ^T Ĉ + Ĉ^T Ĉ (∂Ŝ_ω/∂a_ij) = 0.   (4.106b)
(4.106b)
Using Lemma 4.1 on (4.105) with the equations in (4.29) and (4.106) yields
∂

E

2
H
2
,ω
∂a ij
= 2 tr
∂
ˆ
T
∂a ij
Y
T
ω
X
ω
+
∂
ˆ
∂a ij
C
T
C
ˆ
−
2 tr
∂
ˆ
ω
∂a ij
−
C
T
CX
D
T
−
D
T
C
.
(4.107)
What remains is to rewrite the two last terms in (4.107), which include ∂Ŝ_ω/∂a_ij. Recall the definition of Ŝ_ω,

Ŝ_ω ≜ Re( (i/π) ln( −Â − iωI ) ),   (4.108)

and differentiate with respect to an element in Â, i.e., a_ij. This yields

∂Ŝ_ω/∂a_ij = Re( (i/π) (∂/∂a_ij) ln( −Â − iωI ) ) = Re( (i/π) L( −Â − iωI, −∂Â/∂a_ij ) ),   (4.109)

where L(A, E) is the Fréchet derivative of the matrix logarithm,

L(A, E) = ∫₀¹ ( t(A − I) + I )⁻¹ E ( t(A − I) + I )⁻¹ dt,   (4.110)

see Higham [2008].
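The integral representation (4.110) can be evaluated directly by quadrature, which gives a simple numerical cross-check against a finite difference of scipy.linalg.logm. The sketch below (node count and test matrices are our own choices) assumes A has no eigenvalues on the closed negative real axis:

```python
import numpy as np
from scipy.linalg import logm, solve

def logm_frechet(A, E, nodes=40):
    """Frechet derivative of the matrix logarithm,
    L(A, E) = int_0^1 (t(A-I)+I)^{-1} E (t(A-I)+I)^{-1} dt,
    evaluated by Gauss-Legendre quadrature on [0, 1]."""
    n = A.shape[0]
    I = np.eye(n)
    t, w = np.polynomial.legendre.leggauss(nodes)
    t = 0.5 * (t + 1.0)          # map nodes from [-1, 1] to [0, 1]
    w = 0.5 * w
    L = np.zeros_like(A, dtype=float)
    for tk, wk in zip(t, w):
        M = tk * (A - I) + I
        X = solve(M, E)                      # M^{-1} E
        L += wk * solve(M.T, X.T).T          # (M^{-1} E) M^{-1}
    return L
```

For well-conditioned A the integrand is smooth, so a modest number of Gauss–Legendre nodes already reproduces the derivative to high accuracy; dedicated algorithms such as those in Higham [2008] are preferable for ill-conditioned cases.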
The function L(A, E) can be efficiently evaluated using the algorithms in Higham [2008] or Al-Mohy et al. [2012]. Substituting (4.109) into (4.107) and using (4.110), together with the fact that the trace operator and the integral can be interchanged, yields
∂‖E‖²_{H2,ω}/∂a_ij = 2 tr( (∂Â^T/∂a_ij)( Q_{12,ω}^T P_12 + Q̂_ω P̂ ) ) − 2 tr( (∂Ŝ_ω/∂a_ij)( Ĉ^T C P_12 − Ĉ^T Ĉ P̂ + Ĉ^T ( D − D̂ ) B̂^T ) )
  = tr( (∂Â^T/∂a_ij)( 2 Q_{12,ω}^T P_12 + 2 Q̂_ω P̂ − 2 W ) ),   (4.111)

where

W = Re( (i/π) L( −Â − iωI, V )^T ),   (4.112)
V = Ĉ^T Ĉ P̂ − Ĉ^T C P_12 − Ĉ^T ( D − D̂ ) B̂^T.   (4.113)
5 LPV Modeling

In this chapter, local methods to approximate lpv models are developed. The methods use an approach that tries to preserve the input-output relations of the given models in the resulting lpv model. This is done by minimizing the sum of the H2 norms of the differences between the given models and a parametrized model. When developing the methods, a large effort is made to make them computationally efficient. The material in this chapter is largely based on Petersson and Löfberg [2012c].
5.1 Introduction

In the last decades, intensive research has been carried out on linear parameter-varying models (lpv models), see e.g., Rugh and Shamma [2000], Leith and Leithead [2000], Tóth [2008], Lovera et al. [2011] or Mohammadpour and Scherer [2012]. An important reason for this interest is that lpv models are a powerful tool for modeling and analysis of nonlinear systems, such as aircraft (see Marcos and Balas [2004]) or wafer stages (see Wassink et al. [2005]). Some advanced robustness-analysis methods, such as iqc-analysis and μ-analysis, see e.g., Megretski and Rantzer [1997], Zhou et al. [1996], require a conversion of the lpv model into a linear fractional representation (lfr), see e.g., Zhou et al. [1996]. For this to be possible it is necessary that the parametric matrices A(p), B(p), C(p) and D(p) of the lpv model are rational in p. This requirement is often violated in lpv models generated directly from a nonfractional model description, either due to the presence of nonfractional parametric expressions or tabulated data in the model. In both cases, rational approximations must be used to obtain a suitable model. This motivates a method that can both approximate a nonlinear model with an lpv model and approximate a complex lpv model with a less complex one.
As described in Section 2.1.5, lpv models can be described by linear differential equations whose coefficients depend on scheduling parameters,

ẋ(t) = A(p) x(t) + B(p) u(t),   (5.1)
y(t) = C(p) x(t) + D(p) u(t),   (5.2)

where x(t) is the state, u(t) and y(t) are the input and output signals, and p(t) is the vector of scheduling parameters. For example, in flight-control applications, the components of p(t) are typically mass, position of the centre of gravity and various aerodynamic coefficients, but can also include state-dependent parameters such as altitude and velocity, specifying current flight conditions.
Generation of lpv models can roughly be divided into two main families of methods: global methods (see e.g., Nemani et al. [1995], Lee and Poolla [1999], Bamieh and Giarre [2002], Felici et al. [2007], Tóth [2008]) and local methods (see e.g., Steinbuch et al. [2003], Wassink et al. [2005], Lovera and Mercere [2007], De Caigny et al. [2011], Pfifer and Hecker [2008], De Caigny et al. [2012]). A survey of existing methods can be found in Tóth [2008]. The global methods will only be mentioned briefly, since the main focus will be on local methods.
5.2 Global Methods
In the class of global methods, a global identification experiment is performed by exciting the system while the scheduling parameters change the dynamics of the system. An advantage of this approach to generating lpv models is that it can also capture the rate of change of the parameters and how they may vary between different operating points. However, one drawback is that it is sometimes, for example in some flight applications, not possible to perform such an experiment.
5.3 Local Methods

In the class of local methods, a set of lti models,

M = \left\{ G_i = \begin{bmatrix} A_i & B_i \\ C_i & D_i \end{bmatrix},\; p_i \right\}_{i=1}^{N},

are interpolated, or in some other way combined, to generate an lpv model. These local models, G_i, can, for example, have been identified using a set of input-output measurements where the parameters have been kept constant, for which there exist several methods, see e.g., Ljung [1999], or by linearizing a nonlinear model in different operating points.
In this family of methods it is assumed that the system can operate at different fixed operating points, where the scheduling parameters are "frozen". There are of course systems where this is not possible and where this family of methods is inapplicable, requiring the use of global methods. Another drawback with this family of methods is that it does not take time variations of the scheduling parameters into account, thus limiting local methods to systems where the scheduling parameters vary slowly in time, which is a commonly used assumption in gain scheduling, see Shamma and Athans [1992]. To see this more clearly, write the lpv system as
G(p, \dot p, \ldots) = \begin{bmatrix} A_S(p) & B_S(p) \\ C_S(p) & D_S(p) \end{bmatrix} + \begin{bmatrix} A_D(p, \dot p, \ldots) & B_D(p, \dot p, \ldots) \\ C_D(p, \dot p, \ldots) & D_D(p, \dot p, \ldots) \end{bmatrix} = G_S(p) + G_D(p, \dot p, \ldots),    (5.3)

where G_S(p) only depends on the current parameter value and does not include any dynamic dependence of the parameters, and G_D(p, \dot p, \ldots) includes all the dynamic dependence of the parameters. G_D has the property that G_D(p, 0, 0, \ldots) = 0. If the parameters are kept constant and the models, G_i, are generated as

G(p_i, 0, 0, \ldots) = G_S(p_i) + G_D(p_i, 0, 0, \ldots) = G_S(p_i),

one observes that the information in G_D is lost. This is one reason why one has to be careful when doing model interpolation. A paper that explains the pitfalls of interpolation is Tóth et al. [2007].
A common drawback of many of the local methods is that they need the local models to be given in the same state-space basis, see e.g., Pfifer and Hecker [2008]. However, the lti models given in M are related to the true lpv system as

G_i = \begin{bmatrix} A_i & B_i \\ C_i & D_i \end{bmatrix} = \begin{bmatrix} T_i^{-1} A_S(p_i) T_i & T_i^{-1} B_S(p_i) \\ C_S(p_i) T_i & D_S(p_i) \end{bmatrix},

for some invertible matrices T_i, which are unknown. Hence, one cannot, in general, assume that the given lti models are described in the same state-space basis. A common remedy is to find transformations T_i to be able to transform the lti models, G_i, to a common basis that encourages interpolation, usually some canonical form, see e.g., Steinbuch et al. [2003]. However, these lti models in canonical forms may suffer from bad numerics. In De Caigny et al. [2012] this problem is solved by fixing one of the given models as a reference model and transforming the other models to state-space bases that are consistent with the reference model.
5.4 lpv Modeling using an H2 Measure
The methods that will be described in this section are based on the model-reduction techniques introduced in Section 4.4 and belong to the family of local methods. The goal of the methods proposed in this section is to try to preserve the input-output relations of the given lti models in M, instead of doing direct interpolation of system matrices. Let G(p) denote the true lpv system; then ideally the goal would be to find an lpv model, Ĝ(p), that is optimal with respect to some global discrepancy measure on the model error, for instance the following integral

\underset{\hat A(p), \hat B(p), \hat C(p), \hat D(p)}{\text{minimize}} \int \| G(p) - \hat G(p) \|^2_{\mathcal{H}_2,\omega} \, dp,    (5.4)

where

\hat G(p): \quad \dot x(t) = \hat A(p) x(t) + \hat B(p) u(t), \quad y(t) = \hat C(p) x(t) + \hat D(p) u(t).    (5.5)

This is not always practical or even tractable. In many applications, e.g., flight applications, one often only has a simulation model available, or a model that is used for computational fluid-dynamics calculations, and not an analytical nonlinear model, and it is only possible to extract linearized models for discrete values of the scheduling parameters, p_i, i.e., we are given the model set M = \{ G_i, p_i \}_{i=1}^{N}.
With this in mind, (5.4) is changed into a version that is discretized in the parameters,

\underset{\hat A(p), \hat B(p), \hat C(p), \hat D(p)}{\text{minimize}} \sum_{i=1}^{N} \| G_i - \hat G(p_i) \|^2_{\mathcal{H}_2,\omega}.    (5.6)

The two most widely used norms in system theory are the H2 and H∞ norms, both capturing the input-output relation of the system. As indicated in (5.4) and (5.6), the norm that will be used here is the H2 norm (or the frequency-limited H2 norm). The main reason for this choice is, as in Chapter 4, that the cost function again becomes differentiable with respect to the optimization variables, with readily computed gradients.
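To make the H2 cost in (5.6) concrete, the squared H2 norm of a single error term can be evaluated from a Lyapunov equation for the error system. The sketch below, in Python with SciPy purely for illustration, assumes strictly proper stable models; the function name and the test models are made up and are not from the thesis.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def h2_error_norm_sq(A1, B1, C1, A2, B2, C2):
    """Squared H2 norm of E = G1 - G2 for strictly proper, stable models,
    computed from the controllability Gramian of the error system."""
    n1, n2 = A1.shape[0], A2.shape[0]
    # Error system: block-diagonal dynamics, stacked input, differenced output.
    A_E = np.block([[A1, np.zeros((n1, n2))],
                    [np.zeros((n2, n1)), A2]])
    B_E = np.vstack([B1, B2])
    C_E = np.hstack([C1, -C2])
    # Solve A_E P + P A_E^T + B_E B_E^T = 0.
    P_E = solve_continuous_lyapunov(A_E, -B_E @ B_E.T)
    return np.trace(C_E @ P_E @ C_E.T)
```

The same value is obtained from the observability Gramian of the error system, which is one way to sanity-check the computation.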
5.4.1 General Properties
Since the lpv methods in this section are based on the methods in Section 4.4, they also inherit the property that they are invariant under state transformations of the given lti systems. This was useful in the model-reduction scheme, since it does not matter in which state basis the given system is described. For the lpv methods, this fact can be utilized again. As explained in Section 5.3, what we are searching for in the local methods is the G_S(p)-part of the lpv model, which is related to the model set M as

M = \{ G_i, p_i \}_{i=1}^{N}, \quad G_i = \begin{bmatrix} A_i & B_i \\ C_i & D_i \end{bmatrix} = \begin{bmatrix} T_i^{-1} A_S(p_i) T_i & T_i^{-1} B_S(p_i) \\ C_S(p_i) T_i & D_S(p_i) \end{bmatrix},

where T_i are some unknown invertible matrices, which, in general, are not related to each other. Since the methods are invariant under state transformations we do not need to find these T_i, only G_S(p), which is an advantage compared to most
other local methods.

One thing that has been left out so far is how the system matrices Â(p), B̂(p), Ĉ(p) and D̂(p) are parametrized. These matrices are taken to be linear combinations of some basis functions w_k(p), e.g., monomials in the polynomial case. The system matrices in the lpv model, Ĝ(p), will then depend on p as

\hat A(p) = \sum_k w_k(p) \hat A^{(k)},    (5.7a)
\hat B(p) = \sum_k w_k(p) \hat B^{(k)},    (5.7b)
\hat C(p) = \sum_k w_k(p) \hat C^{(k)},    (5.7c)
\hat D(p) = \sum_k w_k(p) \hat D^{(k)},    (5.7d)
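A minimal sketch of the parametrization (5.7): each system matrix is a linear combination of basis functions w_k(p). The function and variable names below are illustrative only.

```python
import numpy as np

def eval_parametrized(coeffs, basis, p):
    """Evaluate M(p) = sum_k w_k(p) * M^(k), the expansion in (5.7)."""
    return sum(w(p) * M for w, M in zip(basis, coeffs))

# Affine example: basis {1, p}, so A(p) = A0 + p*A1.
basis = [lambda p: 1.0, lambda p: p]
A0 = np.array([[0.0, 1.0], [-1.0, -1.0]])
A1 = np.array([[0.0, 0.0], [0.0, -2.0]])
A_at = eval_parametrized([A0, A1], basis, 0.5)
```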
where the functions w_k(p) are design choices that can be hard to choose without making the model class too restrictive. However, the parametrization is not as restrictive as one might think. To see this, start by looking at how an lpv model changes under a state transformation, which can depend on the parameters. Given the state transformation

\bar x = \bar T(p) x,    (5.8)

where \bar T(p) in the continuous-time case is a nonsingular, continuously differentiable matrix for all valid parameter values, and in the discrete-time case is a rational matrix function of p and invertible for all p_k. For the continuous-time case, given an lpv model as in (5.3), the transformation entails

\bar G(p, \dot p, \ldots) = \begin{bmatrix} \bar T(p) A_S(p) \bar T^{-1}(p) & \bar T(p) B_S(p) \\ C_S(p) \bar T^{-1}(p) & D_S(p) \end{bmatrix} + \begin{bmatrix} \bar T(p) A_D(p, \dot p, \ldots) \bar T^{-1}(p) + \dot{\bar T}(p) \bar T^{-1}(p) & \bar T(p) B_D(p, \dot p, \ldots) \\ C_D(p, \dot p, \ldots) \bar T^{-1}(p) & D_D(p, \dot p, \ldots) \end{bmatrix} = \bar G_S(p) + \bar G_D(p, \dot p, \ldots).    (5.9)

Important to note here is that the part G_S(p) is transformed using only a static dependence on the parameters and, hence, it will, after the transformation, still only depend statically on the parameters. This fact can be used to realize that the choices of w_k(p) in (5.7) are not as restrictive as one might think. Let us illustrate this with an example.
Example 5.1: Effect of State Transformations
Assume that samples from the continuous-time lpv model G(p) are given, and that G(p) does not have any dynamic dependence of the parameters, i.e.,

G(p) = G_S(p) = \begin{bmatrix} A(p) & B(p) \\ C(p) & D(p) \end{bmatrix},

where the elements of A(p), B(p) and C(p) are rational functions of p, containing terms in p^{-1}, 1, p and p^2, and D(p) = 0. Since this lpv model does not have any dynamic dependence on the parameter, to be certain to be in the correct model class we can use the basis functions \{p^{-1}, 1, p, p^2\}. However, a different realization of this model is obtained with a parameter-dependent state transformation \bar T(p), giving

A_T(p) = \bar T(p) A(p) \bar T^{-1}(p), \quad B_T(p) = \bar T(p) B(p), \quad C_T(p) = C(p) \bar T^{-1}(p) = \begin{bmatrix} 1+p & 2+2p & 3+3p \end{bmatrix}, \quad D_T(p) = D(p) = 0,

in which all matrix elements are affine in p. Obviously, in this realization, the model is only affine in p. This means that it is also possible to find the correct model using only the basis functions \{1, p\}.
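The point of Example 5.1 can be reproduced numerically: for a frozen parameter value, a parameter-dependent state transformation changes the entry-wise parameter dependence of the realization but not the input-output map. The model and transformation below are made up for illustration and are not the matrices of the example.

```python
import numpy as np

p = 0.3

# An LPV model that is affine in p in this state basis (frozen at p = 0.3).
A = np.array([[-2.0 + p, 1.0], [0.0, -3.0 - p]])
B = np.array([[1.0], [1.0 + p]])
C = np.array([[1.0, 1.0]])

# A parameter-dependent state transformation x_bar = Tbar(p) x.
Tbar = np.eye(2) + p * np.array([[0.0, 1.0], [0.5, 0.0]])
Ti = np.linalg.inv(Tbar)
At, Bt, Ct = Tbar @ A @ Ti, Tbar @ B, C @ Ti   # entries now rational in p

# For frozen p the transfer function is unchanged by the transformation.
s = 1.0j
G = C @ np.linalg.inv(s * np.eye(2) - A) @ B
Gt = Ct @ np.linalg.inv(s * np.eye(2) - At) @ Bt
```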
In the example above, it can be observed that the choice of w_k(p) can sometimes be a little forgiving, since the methods are invariant to the state basis in which the given models are represented.
5.4.2 The Optimization Problem
The general optimization problem that will be studied can be written as

\underset{\hat A^{(k)}, \hat B^{(k)}, \hat C^{(k)}, \hat D^{(k)}}{\text{minimize}} \sum_{i=1}^{N} \| W_{o,i} ( G_i - \hat G(p_i) ) W_{i,i} \|^2_{\mathcal{H}_2,\omega} = \underset{\hat A^{(k)}, \hat B^{(k)}, \hat C^{(k)}, \hat D^{(k)}}{\text{minimize}} \sum_{i=1}^{N} \| W_{o,i} E_i W_{i,i} \|^2_{\mathcal{H}_2,\omega}, \quad E_i = G_i - \hat G(p_i).    (5.10)
To study the problem in (5.10), start by looking at the case when there is only one model and see what can be concluded. This problem becomes almost identical to the problems in Section 4.4. The only difference is that the matrices Â^{(k)}, B̂^{(k)}, Ĉ^{(k)} and D̂^{(k)} enter linearly in Â(p), B̂(p), Ĉ(p) and D̂(p), which makes it easy to express the gradient in the new variables instead; for example,

\frac{\partial \| W_{o,i} ( G_i - \hat G(p_i) ) W_{i,i} \|^2_{\mathcal{H}_2}}{\partial \hat A^{(k)}} = 2 w_k(p_i) E_{\hat A}^T Q_{E,i} P_{E,i} E_{\hat A}.    (5.11)
Now return to the original problem, (5.10), where a number of lti models are given instead of just one. This is also a simple extension of the problems in Section 4.4, since the cost is a sum of H2 norms over a number of lti models, which yields the structure

\frac{\partial}{\partial \hat A^{(k)}} \sum_{i=1}^{N} \| W_{o,i} ( G_i - \hat G(p_i) ) W_{i,i} \|^2_{\mathcal{H}_2} = \sum_{i=1}^{N} \frac{\partial \| W_{o,i} ( G_i - \hat G(p_i) ) W_{i,i} \|^2_{\mathcal{H}_2}}{\partial \hat A^{(k)}} = 2 \sum_{i=1}^{N} w_k(p_i) E_{\hat A}^T Q_{E,i} P_{E,i} E_{\hat A}.    (5.12)
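The gradient structure in (5.11) and (5.12) is easy to verify numerically. The sketch below treats the simplest case of a single unweighted model (all w_k(p_i) = 1 and W_{o,i} = W_{i,i} = I), where the gradient with respect to Â is the Â-block of 2 Q_E P_E; it is an illustration under these simplifying assumptions, not the thesis implementation, and all names are made up.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def cost_and_gradA(A, B, C, Ah, Bh, Ch):
    """J = ||G - Ghat||_{H2}^2 and dJ/dAh via the error-system Gramians.
    The Ah-block of 2*Q_E P_E is the gradient (Wilson-type conditions)."""
    n, nh = A.shape[0], Ah.shape[0]
    AE = np.block([[A, np.zeros((n, nh))], [np.zeros((nh, n)), Ah]])
    BE = np.vstack([B, Bh])
    CE = np.hstack([C, -Ch])
    PE = solve_continuous_lyapunov(AE, -BE @ BE.T)      # controllability Gramian
    QE = solve_continuous_lyapunov(AE.T, -CE.T @ CE)    # observability Gramian
    J = np.trace(CE @ PE @ CE.T)
    G = 2.0 * (QE @ PE)[n:, n:]                         # E_Ah^T Q_E P_E E_Ah
    return J, G
```

A central finite difference on any entry of Â reproduces the corresponding gradient entry.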
When converting the model-reduction methods in Section 4.4 into lpv methods, they will not only inherit the properties, but also the prerequisites of the methods. That is, when extending all the methods, it is required that the lti models given in M are all asymptotically stable, and additionally, for the continuous-time case and the methods in Section 4.4.1 and Section 4.4.2, the error system is required to be strictly proper, i.e., D_i = D̂(p_i). For these methods, the problem of finding D̂(p) can be seen as a separate problem.

Before stating the necessary conditions for optimality for the proposed lpv methods (derived from the model-reduction methods), some notation has to be established. The given systems, G_i, in the set M are assumed to have the realizations

G_i = \begin{bmatrix} A_i & B_i \\ C_i & D_i \end{bmatrix},    (5.13)

and correspond to the parameter values p_i. The notation and partitioning will be the same as in Section 4.4, with the exception that all variables will have a subscript i corresponding to the parameter value considered, p_i. Only the necessary conditions for the continuous-time cases are stated, since most of the details are covered in Section 4.4 and the discrete-time cases are analogous with the continuous-time case. From the necessary conditions for optimality, the expressions for the gradients can be readily extracted to be used in, for example, a quasi-Newton algorithm.
The necessary conditions for optimality for the lpv version of the method in Section 4.4.1 can be stated as follows.
Theorem 5.1 (Necessary conditions for optimality). Assume that G_i, W_{i,i} and W_{o,i} are asymptotically stable and that E_i is strictly proper, for the H2 norm to be defined, i.e., all A_i, Â_i, A_{i,i} and A_{o,i} are Hurwitz and D_{o,i} ( D_i - D̂_i ) D_{i,i} = 0 for all i. In order for the matrices Â^{(k)}, B̂^{(k)}, Ĉ^{(k)} to be optimal for the problem

\underset{\hat A^{(k)}, \hat B^{(k)}, \hat C^{(k)}}{\text{minimize}} \sum_i \| E_i \|^2_{\mathcal{H}_2}, \quad E_i = W_{o,i} ( G_i - \hat G(p_i) ) W_{i,i},    (5.14)

it is necessary that they satisfy the equations

A_{E,i} P_{E,i} + P_{E,i} A_{E,i}^T + B_{E,i} B_{E,i}^T = 0,    (5.15a)
A_{E,i}^T Q_{E,i} + Q_{E,i} A_{E,i} + C_{E,i}^T C_{E,i} = 0,    (5.15b)

for all i, and that

\frac{\partial \sum_i \| E_i \|^2_{\mathcal{H}_2}}{\partial \hat A^{(k)}} = 2 \sum_i w_k(p_i) E_{\hat A}^T Q_{E,i} P_{E,i} E_{\hat A} = 0,    (5.16a)
\frac{\partial \sum_i \| E_i \|^2_{\mathcal{H}_2}}{\partial \hat B^{(k)}} = 2 \sum_i w_k(p_i) E_{\hat A}^T \left( Q_{E,i} P_{E,i} E_i C_{i,i}^T + Q_{E,i} B_{E,i} D_{i,i}^T \right) = 0,    (5.16b)
\frac{\partial \sum_i \| E_i \|^2_{\mathcal{H}_2}}{\partial \hat C^{(k)}} = -2 \sum_i w_k(p_i) \left( B_{o,i}^T E_o^T Q_{E,i} P_{E,i} E_{\hat A} + D_{o,i}^T C_{E,i} P_{E,i} E_{\hat A} \right) = 0,    (5.16c)

where, with the error-system state partitioned into the states of G_i, Ĝ(p_i), W_{i,i} and W_{o,i},

E_{\hat A} = \begin{bmatrix} 0_{n \times \hat n} \\ I_{\hat n} \\ 0_{n_i \times \hat n} \\ 0_{n_o \times \hat n} \end{bmatrix}, \quad E_i = \begin{bmatrix} 0_{n \times n_i} \\ 0_{\hat n \times n_i} \\ I_{n_i} \\ 0_{n_o \times n_i} \end{bmatrix}, \quad E_o = \begin{bmatrix} 0_{n \times n_o} \\ 0_{\hat n \times n_o} \\ 0_{n_i \times n_o} \\ I_{n_o} \end{bmatrix}.    (5.17)

Proof: The proof is analogous with Theorem 4.2.
The necessary conditions for optimality for the lpv version of the method presented in Section 4.4.2, the robust extension, can be stated as follows.
Theorem 5.2 (Necessary conditions for optimality). Assume that G_i, W_{i,i} and W_{o,i} are asymptotically stable and that E_i is strictly proper, for the H2 norm to be defined, i.e., all A_i, Â_i, A_{i,i} and A_{o,i} are Hurwitz and D_{o,i} ( D_i - D̂_i ) D_{i,i} = 0 for all i. In order for the matrices Â^{(k)}, B̂^{(k)}, Ĉ^{(k)} to be optimal for the problem

\underset{\hat A^{(k)}, \hat B^{(k)}, \hat C^{(k)}}{\min} \sum_i \| E_i \|^2_{\mathcal{H}_2} + V_{\text{rob}}, \quad V_{\text{rob}} = \sum_i \left( 2 \Delta_A \| Q_i P_i + Q_{12,i} P_{12,i}^T \|_F + 2 \Delta_B \| Q_i B_i + Q_{12,i} \hat B_i \|_F + 2 \Delta_C \| C_i P_i - \hat C_i P_{12,i}^T \|_F \right),    (5.18)

where \Delta_A, \Delta_B and \Delta_C are the regularization weights of Section 4.4.2, it is necessary that they satisfy the equations in (5.15) (for W_{o,i} = W_{i,i} = I), the adjoint equations (5.19) in the auxiliary variables W_{1,i}, W_{2,i}, W_{3,i} and W_{4,i} (the counterparts, stated for each i, of the corresponding equations in Section 4.4.2), and that

\frac{\partial \sum_i \| E_i \|^2_{\mathcal{H}_2}}{\partial \hat A^{(k)}} + \frac{\partial V_{\text{rob}}}{\partial \hat A^{(k)}} = 0,    (5.20a)
\frac{\partial \sum_i \| E_i \|^2_{\mathcal{H}_2}}{\partial \hat B^{(k)}} + \frac{\partial V_{\text{rob}}}{\partial \hat B^{(k)}} = 0,    (5.20b)
\frac{\partial \sum_i \| E_i \|^2_{\mathcal{H}_2}}{\partial \hat C^{(k)}} + \frac{\partial V_{\text{rob}}}{\partial \hat C^{(k)}} = 0,    (5.20c)

where the gradients of \sum_i \| E_i \|^2_{\mathcal{H}_2} are as in (5.16), and the gradients of V_{\text{rob}} are the corresponding expressions from Section 4.4.2, summed over i and weighted by w_k(p_i).

Proof: The proof is analogous with the proof for Theorem 4.6.
For the lpv version of the frequencylimited method, described in Section 4.4.3, the necessary conditions for optimality can be stated as
Theorem 5.3. Assume that all G_i and Ĝ_i are asymptotically stable, for the frequency-limited H2 norm to be defined, i.e., all the matrices A_i and Â_i are Hurwitz for all i. In order for the matrices Â^{(k)}, B̂^{(k)}, Ĉ^{(k)} and D̂^{(k)} to be optimal for the problem

\underset{\hat A^{(k)}, \hat B^{(k)}, \hat C^{(k)}, \hat D^{(k)}}{\text{minimize}} \sum_i \| E_i \|^2_{\mathcal{H}_2,\omega}, \quad E_i = G_i - \hat G(p_i),

where \| \cdot \|_{\mathcal{H}_2,\omega} is defined in Chapter 3, it is necessary that they satisfy the equations in (4.65) and the equations in (4.29) for all i, and that the gradients of \sum_i \| E_i \|^2_{\mathcal{H}_2,\omega} with respect to \hat A^{(k)}, \hat B^{(k)}, \hat C^{(k)} and \hat D^{(k)} all vanish. Each of these gradients is the corresponding expression from the frequency-limited method in Section 4.4.3, summed over i and weighted by w_k(p_i), and involves the frequency-limited Gramians P_{\omega,i}, Q_{\omega,i} and the cross terms P_{12,\omega,i}, Q_{12,\omega,i}. In these expressions the function L(\cdot, \cdot) is the Fréchet derivative of the matrix logarithm, see Higham [2008].

Proof: The proof is analogous with the proof for Theorem 4.8.
Low Rank Coefficient Matrices

For some applications it can be preferable to be able to control the rank of some of the coefficient matrices Â^{(k)}, B̂^{(k)}, Ĉ^{(k)} and D̂^{(k)}. See, for instance, the example in Section 7.1, where this is important. One way of controlling the rank of the coefficient matrices is to parametrize them as

\hat A^{(k)} = V_A^{(k)} W_A^{(k)T},    (5.24a)
\hat B^{(k)} = V_B^{(k)} W_B^{(k)T},    (5.24b)
\hat C^{(k)} = V_C^{(k)} W_C^{(k)T}.    (5.24c)
If, for example, it is assumed that the resulting lpv model should have n_r states and that the rank of the matrix Â^{(k)} should be n_k < n_r, then V_A^{(k)} \in R^{n_r \times n_k} and W_A^{(k)} \in R^{n_r \times n_k} are chosen. This type of parametrization has, with success, been used in, for example, Burer and Monteiro [2003] for semidefinite programs.
If this new parametrization is introduced, the only change in Theorem 5.1, Theorem 5.2 and Theorem 5.3 will be a small change in the gradients. For example, the gradient with respect to Â^{(k)} in Theorem 5.1 was computed as

\frac{\partial \sum_i \| E_i \|^2_{\mathcal{H}_2}}{\partial \hat A^{(k)}} = 2 \sum_i w_k(p_i) E_{\hat A}^T Q_{E,i} P_{E,i} E_{\hat A}.

The new equations for the gradient, given the parametrization in (5.24), would be

\frac{\partial \sum_i \| E_i \|^2_{\mathcal{H}_2}}{\partial V_A^{(k)}} = 2 \sum_i w_k(p_i) E_{\hat A}^T Q_{E,i} P_{E,i} E_{\hat A} W_A^{(k)},    (5.25a)
\frac{\partial \sum_i \| E_i \|^2_{\mathcal{H}_2}}{\partial W_A^{(k)}} = 2 \sum_i w_k(p_i) \left( E_{\hat A}^T Q_{E,i} P_{E,i} E_{\hat A} \right)^T V_A^{(k)}.    (5.25b)

The equations for V_B^{(k)}, W_B^{(k)}, V_C^{(k)} and W_C^{(k)} follow analogously.
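The chain rule behind (5.25) is generic: if Ḡ is the gradient of the cost with respect to the full coefficient matrix V W^T, the gradients with respect to the factors are Ḡ W and Ḡ^T V. The sketch below checks this on a simple Frobenius-norm stand-in cost, not the actual H2 cost; all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
n_r, n_k = 4, 2
V = rng.standard_normal((n_r, n_k))
W = rng.standard_normal((n_r, n_k))
M = rng.standard_normal((n_r, n_r))   # stand-in target matrix, made up

def J(V, W):
    # Smooth cost in Ahat = V W^T: J = 0.5*||M - V W^T||_F^2 (a stand-in
    # for the H2 cost); its gradient in Ahat is Gbar = V W^T - M.
    return 0.5 * np.linalg.norm(M - V @ W.T, 'fro')**2

Gbar = V @ W.T - M
dV = Gbar @ W        # gradient with respect to V, cf. (5.25a)
dW = Gbar.T @ V      # gradient with respect to W, cf. (5.25b)
```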
5.5 Computational Aspects of the Optimization Problems
In this section, as in Section 4.5, an initialization is suggested, and it is again shown how to use the structure in both the variables and the equations to speed up the computations.
As with the methods in Section 4.5, both the cost functions and the gradients are available for the lpv methods, and it is straightforward to use, for example, any quasi-Newton solver to solve the optimization problem.
5.5.1 Structure in Variables and Equations
What was explained in Section 4.5.1, about structure in the sought system matrices, is applicable, with the same motivation, to the lpv methods. Hence, it is easy to impose structure in the system matrices, e.g., a block-diagonal A matrix.

In Section 4.5.3, it was explained how to use the inherent structure of the equations in the problem to compute, more efficiently, the solutions of the Lyapunov/Sylvester equations that are needed to evaluate the cost function and the gradient. For the model-reduction case it was possible to reduce the complexity of every iteration to O(n^2 + \hat n^2). For the lpv case, the same structure can be utilized for every lti model in M and every iteration. This means that the complexity per iteration will be O(N (n^2 + \hat n^2)), where N is the number of lti models in M.
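The structure exploitation rests on the block-diagonal form of the error-system dynamics: the Gramian can be computed block-wise, with the block for the given model fixed over the iterations, one Sylvester equation for the cross term and one small n̂-dimensional Lyapunov equation. A sketch with made-up data, showing that the partitioned solves match the full solve:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, solve_sylvester

rng = np.random.default_rng(3)
n, nh = 5, 2
# A lower-triangular A with negative diagonal is guaranteed Hurwitz.
A = np.tril(rng.standard_normal((n, n)), -1) - np.diag([1.0, 2.0, 3.0, 4.0, 5.0])
B = rng.standard_normal((n, 1))
Ah = np.array([[-1.0, 0.5], [0.0, -2.0]])
Bh = rng.standard_normal((nh, 1))

# One full (n+nh)-dimensional Lyapunov solve for the error-system Gramian ...
AE = np.block([[A, np.zeros((n, nh))], [np.zeros((nh, n)), Ah]])
BE = np.vstack([B, Bh])
PE = solve_continuous_lyapunov(AE, -BE @ BE.T)

# ... versus three block solves exploiting the structure:
P11 = solve_continuous_lyapunov(A, -B @ B.T)      # fixed part, reusable over iterations
P12 = solve_sylvester(A, Ah.T, -B @ Bh.T)         # A P12 + P12 Ah^T + B Bh^T = 0
P22 = solve_continuous_lyapunov(Ah, -Bh @ Bh.T)   # small nh x nh solve
```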
5.5.2 Initialization

A subject that needs more attention, though, is the initialization. It is assumed, for the initializations described here, that one basis function for the sought system matrices is w_k(p) = 1, i.e., there is a constant term in the parametrization (5.7). A simple initialization is to use one of the given models in M and set the constant matrix coefficient terms to this model.
As with the model-reduction problem, a bit more can be done in the case when there are no input or output filters. The cost function in this case becomes

\sum_i \| E_i \|^2_{\mathcal{H}_2} = \sum_i \operatorname{tr}\left( B_i^T Q_i B_i + 2 B_i^T Q_{12,i} \hat B_i + \hat B_i^T \hat Q_i \hat B_i \right),    (5.26a)
\sum_i \| E_i \|^2_{\mathcal{H}_2} = \sum_i \operatorname{tr}\left( C_i P_i C_i^T - 2 C_i P_{12,i} \hat C_i^T + \hat C_i \hat P_i \hat C_i^T \right).    (5.26b)

These cost functions are quadratic in the B̂^{(k)} (or Ĉ^{(k)}) matrices when the Â^{(k)} and Ĉ^{(k)} (or B̂^{(k)}) matrices are fixed. First, to have a system to start from, choose any of the given lti models in M, G̃ = (Ã, B̃, C̃, D̃). Now set Â(p) = Ã, i.e., choose Â to be a constant matrix that does not depend on p, and do the same thing for Ĉ. The problem of finding the B̂^{(k)} is now a quadratic problem which can be solved as explained below.

B̂_i can be written as

\hat B_i = \begin{bmatrix} w_1(p_i) I & w_2(p_i) I & \cdots & w_{N_w}(p_i) I \end{bmatrix} \begin{bmatrix} \hat B^{(1)T} & \hat B^{(2)T} & \cdots & \hat B^{(N_w)T} \end{bmatrix}^T = W_i \bar B,    (5.27)

where the w_k are defined in (5.7) and I is the identity matrix of compatible size. Now rewrite the cost function (5.26a) as

V = \sum_i \operatorname{tr}\left( B_i^T Q_i B_i + 2 B_i^T Q_{12,i} W_i \bar B + \bar B^T W_i^T \hat Q_i W_i \bar B \right) = b_1 + 2 \operatorname{tr}\left( b_2 \bar B \right) + \operatorname{tr}\left( \bar B^T b_3 \bar B \right),    (5.28)

where b_1 = \sum_i \operatorname{tr}( B_i^T Q_i B_i ), b_2 = \sum_i B_i^T Q_{12,i} W_i and b_3 = \sum_i W_i^T \hat Q_i W_i. The solution to the problem \min_{\bar B} V, which always exists since b_3 is positive semidefinite, is the solution to the linear system of equations

b_3 \bar B = -b_2^T.    (5.29)

The same procedure can be used for the Ĉ^{(k)}, now using the B̂^{(k)} that were found by solving the quadratic problem described above. Define \bar C = \begin{bmatrix} \hat C^{(1)} & \hat C^{(2)} & \cdots & \hat C^{(N_w)} \end{bmatrix}, so that Ĉ_i = \bar C W_i^T. The equations

V = \sum_i \operatorname{tr}\left( C_i P_i C_i^T - 2 C_i P_{12,i} W_i \bar C^T + \bar C W_i^T \hat P_i W_i \bar C^T \right) = c_1 + 2 \operatorname{tr}\left( c_2 \bar C^T \right) + \operatorname{tr}\left( \bar C c_3 \bar C^T \right),    (5.30)

are obtained, where c_1 = \sum_i \operatorname{tr}( C_i P_i C_i^T ), c_2 = -\sum_i C_i P_{12,i} W_i and c_3 = \sum_i W_i^T \hat P_i W_i. The solution to the quadratic problem in this case, which also always exists since c_3 is positive semidefinite, is the solution to the system of linear equations

\bar C c_3 = -c_2.    (5.31)
These are suggestions for finding initial values for the B̂^{(k)} and Ĉ^{(k)}.
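The quadratic initialization of the B̂^{(k)} can be sketched as follows, for a fixed constant Â(p) and Ĉ(p) and no weights: assemble b_2 and b_3 from the Gramians as in (5.28) and solve the stationarity system (5.29). The function name and the small test models are made up for illustration.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, solve_sylvester

def init_Bhat(models, params, basis, Ah, Ch):
    """Least-squares initialization of the Bhat^(k) for fixed Ahat(p) = Ah
    and Chat(p) = Ch (cf. (5.26a)-(5.29)). `models` is a list of (A_i, B_i, C_i),
    `basis` a list of scalar basis functions w_k."""
    nh, Nw = Ah.shape[0], len(basis)
    m = models[0][1].shape[1]
    Qh = solve_continuous_lyapunov(Ah.T, -Ch.T @ Ch)   # Ah^T Qh + Qh Ah + Ch^T Ch = 0
    b2 = np.zeros((m, Nw * nh))
    b3 = np.zeros((Nw * nh, Nw * nh))
    for (A, B, C), p in zip(models, params):
        # Cross Gramian block: A^T Q12 + Q12 Ah - C^T Ch = 0.
        Q12 = solve_sylvester(A.T, Ah, C.T @ Ch)
        Wi = np.hstack([w(p) * np.eye(nh) for w in basis])
        b2 += B.T @ Q12 @ Wi
        b3 += Wi.T @ Qh @ Wi
    Bbar = np.linalg.solve(b3, -b2.T)                  # stationarity: b3 Bbar = -b2^T
    return [Bbar[k * nh:(k + 1) * nh, :] for k in range(Nw)]
```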
If the rank of the Â^{(k)}, B̂^{(k)} or Ĉ^{(k)} matrices is to be controlled, the initialization strategy above has to be used with caution. In the above strategy, using the parametrization

\hat A(p) = \hat A^{(1)} + \sum_k w_k(p) V_A^{(k)} W_A^{(k)T},    (5.32)

the V^{(k)} and W^{(k)} are initialized as matrices with all zeros. Looking at (5.25), it can be seen that this would cause the products V_A^{(k)} W_A^{(k)T} to stay zero throughout the optimization, since both gradients vanish when both factors are zero. A remedy is to set only one of V^{(k)} and W^{(k)} to zero and the other one to a matrix with random values, or more generally to set them to two orthogonal matrices. This will avoid the problem described above.
5.6 Examples

In this section, an illustrative example is presented to shed light on some properties of the proposed methods. A larger, more extensive example using the methods in this chapter will be presented in Chapter 7, since it requires more background material.
When solving the example, the function fminunc in Matlab is used as the quasi-Newton solver framework. To generate a starting point for the solver, which is an extremely important problem in need of significant amounts of research, the initialization procedure explained in Section 5.5.2 is used.
As a comparison, a method that will be called smile is used. The method is described in detail in De Caigny et al. [2012]. This method uses interpolation of the system matrices, by first transforming all the given lti models to a common basis and then performing a standard interpolation of the elements in the system matrices.
Example 5.2: Small lpv Approximation Example

To show the potential of the lpv approximation and illustrate the importance of addressing system properties, a small example is studied.
The system in this example is defined by a connection of two second-order systems, i.e., a system with four states, with parameter-dependent damping,

G = G_1 G_2, \quad G_1 = \frac{1}{s^2 + 2 \zeta_1 s + 1}, \quad G_2 = \frac{9}{s^2 + 6 \zeta_2 s + 9},    (5.33a)

where

\zeta_1 = 0.1 + 0.9 p, \quad \zeta_2 = 0.1 + 0.9 (1 - p), \quad p \in [0, 1].    (5.33b)

The system was sampled by selecting 10 equidistant points in p \in [0, 1], i.e., 10 linear models with four states each are given as data to the method.
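The local model data of Example 5.2 can be generated as in the sketch below, which builds, for each frozen p, a four-state series realization of G = G_1 G_2 in (5.33). This is a plain companion-form cascade; the thesis data is given in balanced form, which is a state transformation of this realization.

```python
import numpy as np

def local_model(p):
    """Four-state series realization of G = G1*G2 from (5.33) at frozen p."""
    z1 = 0.1 + 0.9 * p
    z2 = 0.1 + 0.9 * (1.0 - p)
    A1 = np.array([[0.0, 1.0], [-1.0, -2.0 * z1]]);  B1 = np.array([[0.0], [1.0]])
    C1 = np.array([[1.0, 0.0]])
    A2 = np.array([[0.0, 1.0], [-9.0, -6.0 * z2]]);  B2 = np.array([[0.0], [1.0]])
    C2 = np.array([[9.0, 0.0]])
    # Series connection u -> G2 -> G1.
    A = np.block([[A1, B1 @ C2], [np.zeros((2, 2)), A2]])
    B = np.vstack([np.zeros((2, 1)), B2])
    C = np.hstack([C1, np.zeros((1, 2))])
    return A, B, C

models = [local_model(p) for p in np.linspace(0.0, 1.0, 10)]
```

Every sampled model is asymptotically stable and has unit DC gain, since G_1(0) = G_2(0) = 1 for all p.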
The data is given in a state basis where all the lti models are balanced. The elements in the system matrices then depend nonlinearly on the parameter p, see the gray dashed lines in Figure 5.1. The interesting and obvious property of this example is that there exist state bases (for example, observable canonical form) for which the model has affine dependence on p; in fact, only two elements of the system matrix A are affine in p while all other matrix elements in A, B and C are constants, see the black solid lines in Figure 5.1.
The method h2nl will be used with an affine parametrization with respect to the parameters, and we investigate whether it is possible to find a representation of the true system with this structure, given data in which the individual elements of the system matrices depend nonlinearly on the parameter. Additionally, the method h2nl with rank control of Â^{(1)}, where Â(p) = Â^{(0)} + p Â^{(1)}, will also be used. We will choose the rank to be two, since there exists a state-space basis where only two elements of the system matrix A are affine in p and all other elements are constant, see the black lines in Figure 5.1.
From the results in Table 5.1 it can be observed that when h2nl is used, both with rank 4 and with rank 2, a high-accuracy, low-order (indeed affine) lpv model of the system can be found. Using smile with an affine parametrization, a much worse model is obtained. Achieving comparable results using the smile strategy requires polynomials of order two. To further illustrate the accuracy, 100 validation points are generated from (5.33) and the relative H2 norm of the error model in these points is shown in Figure 5.2.
In this example, the importance of addressing the behavior of the system instead of interpolating the system matrices can be seen. First of all, it is hard to find basis transformations such that all the given lti models are represented in the same basis (called a coherent state basis in De Caigny et al. [2012]), and second, one cannot control how the system depends on the parameters in this basis, as is illustrated with the smile method using different orders in the polynomial of the parameter.
Figure 5.1: The elements of the A matrix as functions of p for the four-state lpv system (5.33), for two different state bases. The gray dashed lines represent the elements of the A matrix when each extracted lti model is given in balanced form; in this basis the elements depend nonlinearly on p, and this is also the basis from which the lti models given as data are extracted. The black solid lines represent the elements of the A matrix in another state basis, in which only two elements depend affinely on p and the rest are constant. This basis is shown to illustrate that there exists another, input-output equivalent, realization with a simple structure.
Table 5.1: The total error \sum_i \| E_i \|_{\mathcal{H}_2} for the different methods.

Method         | \sum_i \| E_i \|_{\mathcal{H}_2} | Degree
h2nl, rank 2   | 1.44 \cdot 10^{-4}               | 1
h2nl, rank 4   | 2.54 \cdot 10^{-5}               | 1
smile          | 2.54 \cdot 10^{-13}              | 2
smile          | 6.70                             | 1

Figure 5.2: The figure illustrates the relative H2 norm of the error system in 100 validation points for the different methods (smile, degree 1; smile, degree 2; h2nl, degree 1, rank 4; h2nl, degree 1, rank 2). Note the different scales, and that it takes a polynomial of order two using the smile approach to obtain a result as satisfactory as that of the proposed method using an affine function.

5.7 Conclusions

In this chapter, new local methods for computing an lpv model, given a set of lti models, are proposed. These methods use a nonlinear optimization approach that is based on the model-reduction techniques in Chapter 4. The proposed methods try to preserve the input-output behavior of the given systems by minimizing the H2 norm of the error systems. The cost functions and their gradients are derived to be computationally efficient. This enables us to have a measure of first-order optimality and to efficiently use standard quasi-Newton solvers to solve the problem. The methods have been shown to work both conceptually, on small examples, and on real-world examples, as we will see in Chapter 7.

There are two main advantages with the proposed methods, compared to existing local methods. The first one is that it is possible to impose structure on the elements in the system matrices. The other one is that the methods try to capture the input-output behavior of the given systems. However, this comes at the price of computational burden, which makes the methods slower than many existing local methods. The fact that the methods consider the input-output behavior, using the H2 norm, implies that the methods are invariant to which state-space bases
the given local lti models are represented in, and even how many states the given models have. It also implies that it is possible to find an lpv model with a low-complexity dependence on the parameters, despite an apparently complex parameter dependence in the given data.
6 Controller Synthesis
Let us start by quoting a sentence from Syrmos et al. [1997]: "The static output feedback problem is one of the basic problems in feedback design, which, in the multivariable case, is still analytically unsolved." In Blondel and Tsitsiklis [1997] it is shown that the static output-feedback stabilization problem is indeed NP-hard if one constrains the coefficients of the controller to lie in prespecified intervals. They also conjecture that already the unconstrained problem is NP-hard.
This chapter does not include a revolutionary solution to this problem; instead it proposes a computational method for finding locally optimal solutions to the mentioned problem, and, as will be shown, the method works for medium-scale systems and for controllers that have structural constraints. A method for synthesizing controllers for lpv systems, based on the first method, is also presented. The methods use, as the methods in the previous chapters, a general nonlinear optimization approach.
6.1 Overview
The problem of finding an unstructured state-feedback H2 or H∞ controller is well known to be a problem that, under certain assumptions, see, e.g., Zhou et al. [1996], easily can be solved. However, the problem of finding a static output-feedback H2 (or H∞) controller is generally a nonconvex problem and not solved as easily. The problem of finding an H2 controller is closely related to the problem of finding an optimal controller with a quadratic performance criterion. This problem was introduced in Kalman [1960] and has been studied since then. The problem has been attacked in different ways, both using direct general-purpose minimization, see, e.g., Rautert and Sachs [1997], and using semidefinite programs (sdp), see, e.g., Stingl [2006]. These methods can handle problems of moderate sizes but can experience problems already for small-scale systems.
sdps have been a hot topic during the last years, but the problem with the sdp approach is that it scales badly with the dimension of the problem. When formulating this particular optimization problem, of finding a reduced-order controller, it involves bilinear matrix inequalities (bmis) that make the problem even more difficult to solve, see Mesbahi et al. [1995]. Another approach that very recently has been published is the more direct approach in Lin et al. [2009] (and Fardad et al. [2009]) that formulates the problem as a general nonlinear optimization problem and uses a dedicated quasi-Newton algorithm to solve it. The first method presented in this chapter closely resembles the method presented in Lin et al. [2009], but has been independently derived with an, in our opinion, more straightforward derivation. The main focus in Lin et al. [2009] is on the ability to create structured controllers, e.g., interconnected systems subject to architectural constraints on the distributed controller. In this chapter the main goal is to find a method that is applicable to medium-scale systems and is expandable to a framework for creating robust $\mathcal{H}_2$ controllers or controllers for lpv systems, e.g., controllers for systems with parametric uncertainties. The first method is then extended to handle controller synthesis for lpv systems, much as the methods in Chapter 5 are extensions of the methods in Chapter 4.
The methods proposed in this chapter for controller synthesis, both for lti and lpv systems, will of course have at least two drawbacks. The first one is that the methods need a stabilizing controller to be able to start the optimization, and finding a stabilizing controller is most likely an NP-hard problem. The second one is that, given a stabilizing controller, the problem of finding a static output-feedback $\mathcal{H}_2$ controller is a nonconvex problem; therefore the proposed methods cannot guarantee to find a globally optimal controller, only a locally optimal one.
6.2
Static Output-Feedback $\mathcal{H}_2$ Controllers
In this section, a method for synthesizing static output-feedback $\mathcal{H}_2$ controllers for lti systems will be presented, and as explained in Section 2.1.4, this method can also be used to synthesize reduced-order controllers. The proposed method will, as the methods presented in Chapter 4 and Chapter 5, be based on minimizing the $\mathcal{H}_2$ norm.
The goal in this section is to formulate an optimization problem for synthesizing a static output-feedback controller. When formulating this optimization problem, great care needs to be taken when deriving the expressions for the cost function and its gradient, to make sure that the expressions can be evaluated efficiently. The method presented in this section is designed to work on medium-scale systems, as will be shown later, and it also works with structural constraints in the controller.
As described in Section 2.1.4, the model that will be used to measure the performance of a system is
\[
\begin{pmatrix} \dot{x} \\ z \\ y \end{pmatrix}
=
\begin{pmatrix}
A & B_1 & B_2 \\
C_1 & D_{11} & D_{12} \\
C_2 & D_{21} & 0
\end{pmatrix}
\begin{pmatrix} x \\ w \\ u \end{pmatrix},
\tag{6.1}
\]
where $x \in \mathbb{R}^{n_x}$ is the state vector, $w \in \mathbb{R}^{n_w}$ the disturbance signal, $z \in \mathbb{R}^{n_z}$ the performance measure, $u \in \mathbb{R}^{n_u}$ the control signal and $y \in \mathbb{R}^{n_y}$ the measurement signal.
Closing the loop with a static output-feedback controller, $u = Ky$, where $K$ is a matrix describing the controller, yields the closed-loop system
\[
T_{w,z} =
\begin{pmatrix} A_T & B_T \\ C_T & D_T \end{pmatrix}
=
\begin{pmatrix}
A + B_2 K C_2 & B_1 + B_2 K D_{21} \\
C_1 + D_{12} K C_2 & D_{11} + D_{12} K D_{21}
\end{pmatrix}.
\tag{6.2}
\]
Now, let us formulate the optimization problem of minimizing the $\mathcal{H}_2$ norm of the closed-loop system from $w$ to $z$, $T_{w,z}$, in (6.2), i.e.,
\[
\min_K \; \|T_{w,z}\|_{\mathcal{H}_2}^2.
\tag{6.3}
\]
Since the equations will differ in continuous and discrete time but the general ideas are the same, both versions will be presented, with less detail in the discrete-time case.
6.2.1
Continuous Time
For the $\mathcal{H}_2$ norm to be defined, the system $T_{w,z}$ has to be asymptotically stable and strictly proper, i.e., $A + B_2 K C_2$ has to be Hurwitz and $D_{11} + D_{12} K D_{21} = 0$. Note that already the problem of finding a $K$ that stabilizes the system is, as explained in the beginning of this chapter, most likely an NP-hard problem. Because of this, for the rest of the chapter, if nothing else is mentioned, it will be assumed that $K$ stabilizes the system.
To compute the cost function for the optimization problem (6.3), the cost function has to be expressed in a form more suitable for evaluation. Using (2.21), the cost function for the optimization problem (6.3) can be expressed as
\begin{align}
\|T_{w,z}\|_{\mathcal{H}_2}^2 &= \operatorname{tr} B_T^T Q_T B_T \tag{6.4a} \\
&= \operatorname{tr} C_T P_T C_T^T, \tag{6.4b}
\end{align}
where $Q_T$ and $P_T$ satisfy the Lyapunov equations
\begin{align}
A_T P_T + P_T A_T^T + B_T B_T^T &= 0, \tag{6.5a} \\
A_T^T Q_T + Q_T A_T + C_T^T C_T &= 0. \tag{6.5b}
\end{align}
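The computation in (6.4)/(6.5) is easy to sketch numerically. The following is a minimal illustration assuming scipy is available; the plant data is an arbitrary toy example, not from the thesis.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def h2_norm_sq(A_T, B_T, C_T):
    """Squared H2 norm via the observability Gramian, cf. (6.4a)/(6.5b).

    Assumes A_T is Hurwitz and the closed loop is strictly proper (D_T = 0).
    solve_continuous_lyapunov(a, q) solves a X + X a^H = q, so passing
    a = A_T^T and q = -C_T^T C_T gives A_T^T Q + Q A_T + C_T^T C_T = 0.
    """
    Q = solve_continuous_lyapunov(A_T.T, -C_T.T @ C_T)
    return float(np.trace(B_T.T @ Q @ B_T))

# toy open-loop example (K = 0): A_T = A, B_T = B_1, C_T = C_1
A = np.diag([-1.0, -2.0])
B1 = np.eye(2)
C1 = np.array([[1.0, 0.0]])
print(h2_norm_sq(A, B1, C1))  # ≈ 0.5 for this diagonal example
```

For the diagonal example the Gramian can be checked by hand: $Q = \operatorname{diag}(1/2, 0)$, so the squared norm is $1/2$.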
Now, with the equations in (6.4) and (6.5) it is possible to state necessary conditions for optimality for (6.3). From the theorem below, which states the necessary conditions for optimality, the gradient of the cost function for the optimization problem (6.3) can be readily extracted to be used in, for example, a quasi-Newton algorithm.
Theorem 6.1 (Necessary conditions for optimality).
Given a system $G$ as in (6.1) and a static output-feedback controller, described by the matrix $K$, such that $u = Ky$. The system $G$ and the controller are given such that the closed-loop system, $T_{w,z}$ in (6.2), is asymptotically stable and strictly proper, i.e., $A_T$ is Hurwitz and $D_T = 0$. In order for the matrix $K$ to be optimal for the problem (6.3), it is necessary that $K$ satisfies the equations in (6.5) and that
\[
\frac{\partial \|T_{w,z}\|_{\mathcal{H}_2}^2}{\partial K}
= 2\left( B_2^T Q_T P_T C_2^T + B_2^T Q_T B_T D_{21}^T + D_{12}^T C_T P_T C_2^T \right) = 0.
\tag{6.6}
\]
Proof: If $A_T$ is Hurwitz, then the equations in (6.5) are uniquely solvable. These are needed to compute the cost function and its gradient. Now the gradient of the cost function with respect to $K$ has to be computed. Let $k_{ij}$ denote element $(i,j)$ in $K$. First differentiate (6.5b) with respect to $k_{ij}$, which will be needed later on, which entails
\[
A_T^T \frac{\partial Q_T}{\partial k_{ij}} + \frac{\partial Q_T}{\partial k_{ij}} A_T
+ \frac{\partial A_T^T}{\partial k_{ij}} Q_T + Q_T \frac{\partial A_T}{\partial k_{ij}}
+ \frac{\partial C_T^T}{\partial k_{ij}} C_T + C_T^T \frac{\partial C_T}{\partial k_{ij}} = 0.
\tag{6.7}
\]
Now differentiate the cost function (6.4a) with respect to $k_{ij}$,
\[
\frac{\partial \|T_{w,z}\|_{\mathcal{H}_2}^2}{\partial k_{ij}}
= 2 \operatorname{tr} \frac{\partial B_T^T}{\partial k_{ij}} Q_T B_T
+ \operatorname{tr} \frac{\partial Q_T}{\partial k_{ij}} B_T B_T^T.
\tag{6.8}
\]
Using Lemma 4.1 on the equation above together with equations (6.5a) and (6.7) entails
\[
\frac{\partial \|T_{w,z}\|_{\mathcal{H}_2}^2}{\partial k_{ij}}
= 2 \operatorname{tr}\left(
\frac{\partial A_T^T}{\partial k_{ij}} Q_T P_T
+ \frac{\partial B_T^T}{\partial k_{ij}} Q_T B_T
+ \frac{\partial C_T^T}{\partial k_{ij}} C_T P_T
\right).
\tag{6.9}
\]
Using the structure of the variables $A_T$, $B_T$ and $C_T$ in (6.2) and Lemma 4.2 yields
\[
\frac{\partial \|T_{w,z}\|_{\mathcal{H}_2}^2}{\partial K}
= 2\left( B_2^T Q_T P_T C_2^T + B_2^T Q_T B_T D_{21}^T + D_{12}^T C_T P_T C_2^T \right).
\tag{6.10}
\]
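As a sanity check, the gradient expression (6.10) can be verified against finite differences on a small toy plant. The numbers below are arbitrary illustration data, not from the thesis; $D_{12} = 0$ and $D_{11} = 0$ keep the closed loop strictly proper.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# toy plant data (arbitrary; the closed loop is stable for the K used below)
A = np.array([[-1.0, 0.5], [0.0, -2.0]])
B1 = np.array([[1.0], [0.5]]); B2 = np.array([[0.0], [1.0]])
C1 = np.array([[1.0, 0.0]]); C2 = np.array([[0.0, 1.0]])
D12 = np.zeros((1, 1)); D21 = np.array([[0.2]])  # D12 = 0 ensures D_T = 0

def cost_and_grad(K):
    """Cost (6.4a) and gradient (6.10) for the closed loop (6.2)."""
    A_T = A + B2 @ K @ C2
    B_T = B1 + B2 @ K @ D21
    C_T = C1 + D12 @ K @ C2
    P = solve_continuous_lyapunov(A_T, -B_T @ B_T.T)    # (6.5a)
    Q = solve_continuous_lyapunov(A_T.T, -C_T.T @ C_T)  # (6.5b)
    J = float(np.trace(B_T.T @ Q @ B_T))
    G = 2 * (B2.T @ Q @ P @ C2.T + B2.T @ Q @ B_T @ D21.T
             + D12.T @ C_T @ P @ C2.T)
    return J, G

K = np.array([[0.3]])
J, G = cost_and_grad(K)
# central finite-difference check of (6.10)
eps = 1e-6
Jp, _ = cost_and_grad(K + eps)
Jm, _ = cost_and_grad(K - eps)
fd = (Jp - Jm) / (2 * eps)
print(abs(G[0, 0] - fd))  # should be tiny
```

This kind of check is cheap insurance before handing the gradient to a quasi-Newton solver.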
For the optimization problem (6.3) it is also quite straightforward to derive the Hessian, in the same manner as the gradient, ending up in
\[
\frac{\partial^2 \|T_{w,z}\|_{\mathcal{H}_2}^2}{\partial k_{ij} \, \partial k_{kl}}
= 2\left[ D_{12}^T C_T \frac{\partial P_T}{\partial k_{kl}} C_2^T \right]_{ij}
+ 2\left[ D_{12}^T C_T \frac{\partial P_T}{\partial k_{ij}} C_2^T \right]_{kl}
+ 2\left[ B_2^T Q_T \frac{\partial P_T}{\partial k_{kl}} C_2^T \right]_{ij}
+ 2\left[ B_2^T Q_T \frac{\partial P_T}{\partial k_{ij}} C_2^T \right]_{kl}
+ 2\left[ B_2^T Q_T B_2 \right]_{ik} \left[ D_{21} D_{21}^T \right]_{lj}
+ 2\left[ D_{12}^T D_{12} \right]_{ik} \left[ C_2 P_T C_2^T \right]_{lj}.
\tag{6.11}
\]
6.2.2
Discrete Time
In discrete time, for the $\mathcal{H}_2$ norm to be defined, the system $T_{w,z}$ must be asymptotically stable, i.e., $A + B_2 K C_2$ has to be Schur. To compute the cost function in (6.3) for discrete-time systems the equations
(6.3) for discretetime systems the equations
T w,z
2
H
2
= tr B
T
T
Q
T
B
T
= tr C
T
P
T
C
T
T
+ tr D
T
D
T
T
+ tr D
T
T
D
T
,
(6.12a)
(6.12b) can be used, where Q
T
and P
T
satisfy the discretetime Lyapunov equations
A
T
A
T
T
P
Q
T
T
A
A
T
T
T
−
P
T
+ Q
T
+ B
T
+ C
T
T
B
T
T
C
T
=
=
0
0
.
,
(6.13a)
(6.13b)
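The discrete-time computation has the same shape as the continuous-time one, with the Stein equation (6.13b) in place of (6.5b) and the extra direct-feedthrough term from (6.12a). A minimal sketch on made-up scalar data:

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def h2_norm_sq_discrete(A_T, B_T, C_T, D_T):
    """Squared discrete-time H2 norm, cf. (6.12a)/(6.13b).

    solve_discrete_lyapunov(a, q) solves a X a^H - X + q = 0, so passing
    a = A_T^T and q = C_T^T C_T yields A_T^T Q A_T - Q + C_T^T C_T = 0.
    Assumes A_T is Schur.
    """
    Q = solve_discrete_lyapunov(A_T.T, C_T.T @ C_T)
    return float(np.trace(B_T.T @ Q @ B_T) + np.trace(D_T.T @ D_T))

# scalar toy example: Q solves 0.25*Q - Q + 1 = 0, i.e. Q = 4/3
A_T = np.array([[0.5]]); B_T = np.array([[1.0]])
C_T = np.array([[1.0]]); D_T = np.array([[0.0]])
print(h2_norm_sq_discrete(A_T, B_T, C_T, D_T))  # ≈ 1.3333
```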
Theorem 6.2 (Necessary conditions for optimality).
Given a system $G$ as in (6.1) and a static output-feedback controller, described by the matrix $K$, such that $u = Ky$. The system $G$ and the controller are given such that the closed-loop system, $T_{w,z}$ in (6.2), is asymptotically stable, i.e., $A_T$ is Schur. In order for the matrix $K$ to be optimal for the problem (6.3), it is necessary that it satisfies the equations in (6.13) and that
\[
\frac{\partial \|T_{w,z}\|_{\mathcal{H}_2}^2}{\partial K}
= 2\left( B_2^T Q_T A_T P_T C_2^T + B_2^T Q_T B_T D_{21}^T + D_{12}^T C_T P_T C_2^T + D_{12}^T D_T D_{21}^T \right) = 0.
\tag{6.14}
\]
Proof: The proof is analogous to the proof of Theorem 6.1.

As in the continuous-time case, it is also possible here to compute the Hessian, which becomes
\[
\frac{\partial^2 \|T_{w,z}\|_{\mathcal{H}_2}^2}{\partial k_{ij} \, \partial k_{kl}}
= 2\left[ D_{12}^T C_T \frac{\partial P_T}{\partial k_{kl}} C_2^T \right]_{ij}
+ 2\left[ D_{12}^T C_T \frac{\partial P_T}{\partial k_{ij}} C_2^T \right]_{kl}
+ 2\left[ B_2^T Q_T A_T \frac{\partial P_T}{\partial k_{kl}} C_2^T \right]_{ij}
+ 2\left[ B_2^T Q_T A_T \frac{\partial P_T}{\partial k_{ij}} C_2^T \right]_{kl}
+ 2\left[ B_2^T Q_T B_2 \right]_{ik} \left[ D_{21} D_{21}^T \right]_{lj}
+ 2\left[ D_{12}^T D_{12} \right]_{ik} \left[ C_2 P_T C_2^T \right]_{lj}.
\tag{6.15}
\]
6.3
Static Output-Feedback $\mathcal{H}_2$ LPV Controllers
The controller synthesis method for lpv systems presented in this section is an extension of the method presented in the previous section, much as the methods in Chapter 5 are extensions of the methods in Chapter 4. The goal of the optimization problem in this section is to synthesize a static output-feedback linear parameter-varying $\mathcal{H}_2$ controller. The idea is to be able to directly synthesize a controller from data, instead of first identifying an lpv model and then synthesizing a controller from that model. As discussed in Chapter 5, given an lpv model,
\[
G(p): \quad
\begin{pmatrix} \dot{x} \\ z \\ y \end{pmatrix}
=
\begin{pmatrix}
A(p) & B_1(p) & B_2(p) \\
C_1(p) & D_{11}(p) & D_{12}(p) \\
C_2(p) & D_{21}(p) & 0
\end{pmatrix}
\begin{pmatrix} x \\ w \\ u \end{pmatrix},
\tag{6.16}
\]
what ideally is wanted is to minimize the integral
\[
\min_{K(p)} \int \|T_{w,z}(p)\|_{\mathcal{H}_2}^2 \, \mathrm{d}p,
\tag{6.17}
\]
where $T_{w,z}(p)$ is the closed-loop system when closing the loop for the lpv model $G(p)$ with the lpv controller $K(p)$. However, in this section, it is assumed that a set, $\mathcal{M}$, of $N$ lti models, $G_i$, for different fixed parameter values, $p_i$, just as in Chapter 5, is given. This will of course lead to the fact that it is not possible to control the dynamic behavior that originates from the parameters not being fixed, as discussed in Chapter 5, since this information is not present in the given data. However, this is a common problem when working with gain scheduling, and it is assumed in this thesis that the parameters move slowly, such that the dynamics from the parameters do not influence the system much, a commonly used assumption, see Shamma and Athans [1992]. The optimization problem now becomes
\[
\underset{K(p)}{\text{minimize}} \; \sum_{i=1}^{N} \|T_{w,z}(p_i)\|_{\mathcal{H}_2}^2,
\tag{6.18}
\]
which for a fixed $i$ becomes equivalent to the problem in Section 6.2.
The parametrization of the controller
\[
K(p): \quad u(t) = K(p) y(t)
\tag{6.19}
\]
with respect to the parameters is taken as
\[
K(p) = \sum_k w_k(p) K^{(k)}.
\tag{6.20}
\]
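The parametrization (6.20) is straightforward to evaluate. As an illustration, with a monomial basis $w_k(p) = p^k$ (one possible, assumed, design choice for the functions $w_k$):

```python
import numpy as np

def eval_controller(coeffs, p):
    """K(p) = sum_k w_k(p) K^(k) with the monomial basis w_k(p) = p^k.

    `coeffs` is the list [K^(0), K^(1), ...]; other basis functions w_k
    are possible and are a design choice.
    """
    return sum((p ** k) * Kk for k, Kk in enumerate(coeffs))

# quadratic scalar example: K(p) = 1 + 2p + 3p^2
coeffs = [np.array([[1.0]]), np.array([[2.0]]), np.array([[3.0]])]
print(eval_controller(coeffs, 0.5))  # [[2.75]]
```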
As when identifying lpv models, the functions $w_k(p)$ are design choices that can be hard to choose. However, given such a parametrization and a set of lti models, as in (6.16), where the given lti models are denoted $G(p_i) = G_i$, the controller is given as $K(p_i) = K_i$ and the closed-loop system as $T_{w,z}(p_i) = T_{w,z,i}$, the optimization problem can be written as
\[
\underset{K^{(k)}}{\text{minimize}} \; \sum_{i=1}^{N} \|T_{w,z,i}\|_{\mathcal{H}_2}^2,
\tag{6.21}
\]
where $Q_{T,i}$ and $P_{T,i}$ satisfy the Lyapunov equations
\begin{align}
A_{T,i} P_{T,i} + P_{T,i} A_{T,i}^T + B_{T,i} B_{T,i}^T &= 0, \tag{6.22a} \\
A_{T,i}^T Q_{T,i} + Q_{T,i} A_{T,i} + C_{T,i}^T C_{T,i} &= 0, \tag{6.22b}
\end{align}
for the continuous-time case, and their discrete-time counterparts in the discrete-time case.
Now we formulate the necessary conditions for optimality for this method in continuous time; the conditions for the discrete-time case are analogous.
Theorem 6.3 (Necessary conditions for optimality).
Assume that $K_i$ stabilizes the system $G_i$ and that all closed-loop systems, $T_{w,z,i}$, are strictly proper, i.e., $A_{T,i}$ is Hurwitz and $D_{T,i} = 0$ for all $i$. In order for the matrices $K^{(k)}$ to be optimal for the problem (6.21), it is necessary that $K(p)$ satisfies the equations in (6.22) for all $i$, and that
\[
\frac{\partial \sum_i \|T_{w,z,i}\|_{\mathcal{H}_2}^2}{\partial K^{(k)}}
= 2 \sum_{i=1}^{N} w_k(p_i) \left(
B_{2,i}^T Q_{T,i} P_{T,i} C_{2,i}^T
+ B_{2,i}^T Q_{T,i} B_{T,i} D_{21,i}^T
+ D_{12,i}^T C_{T,i} P_{T,i} C_{2,i}^T
\right) = 0.
\tag{6.23}
\]
Proof: The proof is analogous to the proof of Theorem 6.1.
6.4
Computational Aspects
In this section, a suggestion of how the methods in this chapter can be initialized, and how to speed up the computations, will be presented.

As with the methods in the previous chapters, both cost functions and their gradients have been calculated and can easily be used in, e.g., a quasi-Newton algorithm to solve the optimization problem. For the methods described in this chapter the Hessians have also been calculated, which can be utilized in a quasi-Newton algorithm to initialize the Hessian approximation in, e.g., bfgs. We do not want to use the Hessian information in every iteration, since this would be too heavy computationally.
The derivations of the gradients and the Hessians in Section 6.2 have been done element-wise, as with the methods in the previous chapters. This means that it is possible, also for these methods, to introduce structure in the controller, e.g., a diagonal controller.
For the methods in this chapter it is, however, not as straightforward to utilize the structure in the Lyapunov equations (6.22) (or (6.5)) since, in the realization (6.2), there is no obvious structure that can be exploited. What can be used is that, if both the cost function and the gradient have to be computed, both $P_T$ and $Q_T$ must be computed, and since $A_T$ is the factor in both of the Lyapunov equations, $P_T$ and $Q_T$ can be solved for efficiently together, see for example Benner et al. [1998].
The optimization problems (6.3) and (6.21) are both nonconvex and nonlinear, which makes the initialization an important problem. Additionally, it is required that the initial controller is stabilizing, which is probably an NP-hard problem, see Blondel and Tsitsiklis [1997]. If the given system (or systems, if given a set of models) is asymptotically stable, then the initialization used is a controller with all zeros. However, if given an unstable system for which an $\mathcal{H}_2$ controller should be computed, we make use of other, existing, methods/algorithms to try to stabilize the system and then start our method with this stabilizing controller. The algorithm used to find a stabilizing controller is hifoo (see Gumussoy et al. [2009]).
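The overall procedure can be sketched with an off-the-shelf quasi-Newton solver. The toy plant below is open-loop stable, so $K = 0$ is a valid initialization; the data and the crude instability guard are illustrative assumptions, not the thesis implementation.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov
from scipy.optimize import minimize

# toy open-loop-stable plant; D11 = 0 and D21 = 0 keep D_T = 0
A = np.array([[-1.0, 0.5], [0.0, -2.0]])
B1 = np.array([[1.0], [0.5]]); B2 = np.array([[0.0], [1.0]])
C1 = np.array([[1.0, 0.0]]); C2 = np.array([[0.0, 1.0]])
D12 = np.array([[0.5]]); D21 = np.zeros((1, 1))

def cost_and_grad(kvec):
    """Cost (6.4a) and gradient (6.10), with a crude penalty if the
    candidate K fails to stabilize the closed loop."""
    K = kvec.reshape(1, 1)
    A_T = A + B2 @ K @ C2
    if np.max(np.linalg.eigvals(A_T).real) >= 0:  # not Hurwitz
        return 1e6, np.zeros_like(kvec)
    B_T = B1 + B2 @ K @ D21
    C_T = C1 + D12 @ K @ C2
    P = solve_continuous_lyapunov(A_T, -B_T @ B_T.T)
    Q = solve_continuous_lyapunov(A_T.T, -C_T.T @ C_T)
    J = float(np.trace(B_T.T @ Q @ B_T))
    G = 2 * (B2.T @ Q @ P @ C2.T + B2.T @ Q @ B_T @ D21.T
             + D12.T @ C_T @ P @ C2.T)
    return J, G.ravel()

# open-loop stable, so initialize with K = 0 as described above
res = minimize(cost_and_grad, np.zeros(1), jac=True, method='BFGS')
print(res.fun <= cost_and_grad(np.zeros(1))[0])  # True
```

Structure in the controller (e.g., a diagonal $K$) can be imposed by optimizing only over the free entries of `kvec`.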
6.5
Examples
In this section, we will try to show the applicability of the methods presented in this chapter using some examples. We begin with an example where the method presented in Section 6.2 is used on some systems in the COMPleib benchmark collection (see Leibfritz and Lipinski [2003]).

Example 6.1: COMPleib Systems
In this example our goal is to compare the method presented in Section 6.2, which will be called h2nlctrl, with the sdp method described in Stingl [2006], called stingl, and the method described in Arzelier et al. [2011], called hifoo.
The systems used in this example come from COMPleib (see Leibfritz and Lipinski [2003]) and range from 2 to 1100 states. In all systems we have $D_{11} = 0$ and $D_{12} = 0$ or $D_{21} = 0$, so that $D_{11} + D_{12} K D_{21} = 0$, to make sure that the $\mathcal{H}_2$ norm of the closed-loop system is defined.
To initialize h2nlctrl and hifoo, a heuristic approach is used. First it is checked whether the system is open-loop stable and, if that is the case, the optimization is initialized with $K = 0$. If this does not hold, the optimization package hifoo, see Gumussoy et al. [2009], is called to minimize the real part of the largest eigenvalue of the matrix $A + B_2 K C_2$. Only cases where a stabilizing controller is found are reported in the tables.
In Table 6.1, Table 6.2, Table 6.3, Table 6.4 and Table 6.5, results from the numerical benchmark are presented. The name of the COMPleib system is displayed in the first column. In the second column, the relevant sizes, i.e., the number of states, outputs and inputs, denoted $n_x$, $n_y$ and $n_u$ respectively, are displayed. In columns three, four and five the $\mathcal{H}_2$ norms of the resulting closed-loop systems are displayed, and in columns six, seven and eight the times it took for the methods h2nlctrl, hifoo and stingl to find the controller. For the first 31 systems, which also occur in the test performed in Stingl [2006], the results are compared to the results reported in Stingl [2006]. For the remainder of the systems a "–" in the fourth column denotes that we do not have any results from Stingl [2006] to compare with. In Stingl [2006] they could not find a controller for these systems, mostly because of numerical problems and the rapid growth of the sdps, since they optimize over both $K$ and the Lyapunov matrices $P$ or $Q$.
When the results using hifoo and h2nlctrl are compared with the results from stingl, hifoo and h2nlctrl find, for almost all systems, the same value for $\|T_{w,z}\|_{\mathcal{H}_2}$, however, generally, much faster. When comparing h2nlctrl and hifoo, they perform very similarly for most of the systems regarding the value $\|T_{w,z}\|_{\mathcal{H}_2}$. However, for a large number of the systems, h2nlctrl is able to find the controller faster than hifoo, and for a few systems hifoo is not able to compute a controller due to running out of memory, denoted with "–" in the fifth column, where h2nlctrl can.
The results in the tables below also show the benefits of the new method: apart from being able to handle structure in the controller, it can handle medium-scale systems. The number of optimization variables does not grow with the number of states in the systems, as in the sdp case used by Stingl [2006], but only depends on the size of the controller.
Table 6.6: Numerical values for the coefficients in the lpv controllers in Example 6.2 and the time to compute them.

Model        K^(0)    K^(1)    K^(2)    Time [s]
Constant     0.1638   –        –        0.09
Linear       0.259    0.305    –        0.12
Quadratic    0.2936   0.935    0.6903   0.11
A small example of an lpv controller synthesis problem is now presented to show the potential of the method proposed in Section 6.3.
Example 6.2
The system in this example is the same as in Example 5.2,
\[
G = G_1 G_2,
\tag{6.24a}
\]
where
\[
G_1 = \frac{1}{s^2 + 2\zeta_1 s + 1}, \quad \zeta_1 = 0.2 + 0.19 p,
\tag{6.24b}
\]
\[
G_2 = \frac{9}{s^2 + 6\zeta_2 s + 9}, \quad \zeta_2 = 0.1 + 0.9(1 - p), \quad p \in [0, 1].
\tag{6.24c}
\]
From these equations we obtain $A(p)$, $B_2(p)$, $C_2(p)$ and $D_{22}(p)$, using the notation in (2.31), that represent the dynamical system. Then we create the matrices
\[
B_1(p) = I_{4\times 4}, \quad
C_1(p) = I_{4\times 4}, \quad
D_{11}(p) = 0_{4\times 4}, \quad
D_{12}(p) = \begin{pmatrix} 0_{3\times 1} \\ 1 \end{pmatrix}, \quad
D_{21}(p) = 0_{1\times 4}
\]
to have a fully defined performance measure of the system. From this system we extract five systems representing five equidistant points in $p \in [0, 1]$, i.e., we are given five lti models, extracted from the lpv system (6.24), with four states each.
The lpv system is expressed in a balanced state basis. In this state basis the lpv system depends nonlinearly on the parameter $p$, see Figure 6.1. Hence, judging from the given data, one could easily suspect that a complex lpv controller would be required. However, in this example, using the proposed method from Section 6.3, we will try to find three static output-feedback lpv controllers of different complexity: one that is constant and independent of the parameter $p$, one that is linear in $p$ and one that is quadratic in $p$. For example, the quadratic lpv controller has the structure
\[
u(t) = K(p) y(t), \quad K(p) = K^{(0)} + K^{(1)} p + K^{(2)} p^2.
\tag{6.25}
\]
The specific values of the resulting lpv controllers, and the times for computing them, can be found in Table 6.6. To validate the controllers, 100 validation points were generated from (6.24), for $p \in [0, 1]$. For each of these 100 models, an optimal static output-feedback
Figure 6.1: The elements in the $A$ matrices as functions of $p$ for the four-state lpv system (6.24) in the given state basis.
controller was created (the associated optimization problem is a scalar problem, hence trivially solved using, e.g., gridding). In Figure 6.2 the ratio between the $\mathcal{H}_2$ performance for the different lpv controllers and the $\mathcal{H}_2$ performance with the optimal static output-feedback controller in the different validation points is shown, i.e., the closer the curve is to the value one, the closer the lpv controller is to the optimal controller. In Figure 6.2, we see that the method is able to find lpv controllers that, depending on the complexity of the lpv controller, are close to the optimal reference controller in the validation points.

In Figure 6.3, the reference controller and the resulting lpv controllers (constant, linear and quadratic in $p$) are plotted. Looking at both Figure 6.2 and Figure 6.3, one can see that with an lpv controller that is quadratic in $p$ we find a controller that is very similar to the globally optimal one.
6.6
Conclusions
In this chapter, two methods for synthesizing $\mathcal{H}_2$ controllers have been presented, one for lti systems and one for lpv systems. The methods use a direct nonlinear optimization approach to solve the problem, which makes it possible to control the structure of the controller and create, e.g., a diagonal or bidiagonal controller. For these methods, cost functions, gradients and Hessians have all been derived, which makes it possible to effectively use off-the-shelf quasi-Newton solvers and to solve problems of medium-scale size. One of the drawbacks of the methods is the nonconvexity of the problems and the possible fact that finding a stabilizing controller is an NP-hard problem. However, this is a problem
Figure 6.2: The ratio between the $\mathcal{H}_2$ performance with the different lpv controllers and the $\mathcal{H}_2$ performance with the optimal static output-feedback controller in the different validation points. The closer the curve is to the value one, the closer the lpv controller is to the optimal controller.
Figure 6.3: The reference controller (solid line) and the resulting lpv controllers (constant, dashed line; linear in $p$, dash-dotted line; quadratic in $p$, dotted line) plotted as functions of the parameter $p$.
that the methods have in common with other methods too, and it is one of the problems that needs more attention in the future. One possible direct extension that has not been tested is to use the idea of controlling the rank of the system matrices, as in Section 5.4.2. By using a method that can control the rank, one could, for example, enforce the controller to have integrators.
7
Examples of Applications
In this chapter, the methods from Chapter 4 and Chapter 5 are illustrated with two more elaborate examples. In the first example, both the model-reduction methods from Chapter 4 and the lpv generation methods from Chapter 5 are used on an Airbus aircraft model, to show the applicability of the methods on a real-world example. In the second example, we show how model-reduction methods can be used in system identification to obtain better estimates for certain model structures.
7.1
Aircraft Example
The models used in this section are models of an Airbus aircraft that were developed and used in an EU project called cofcluo (Clearance Of Flight Control Laws Using Optimization, see http://cofcluo.isy.liu.se/ and Varga et al. [2012]). The main objective of the cofcluo project was to develop methods that use optimization techniques to make clearance of flight control laws more efficient and reliable, see for example Garulli et al. [2013]. The clearance of flight control laws is an important part of the certification and qualification process in the airplane industry. The models used in the examples below are three lpv models that, with different complexity, describe an airplane in closed loop in the longitudinal direction. All models are siso lpv models with 22 states, and all depend polynomially on the parameters. The difference between the lpv models is that they depend on one (different configurations for the center tank), two (different configurations for the center tank and the outer tank) or three parameters (different configurations for the center tank, the outer tank and the payload), respectively.
7.1.1
LPV Simplification
To be able to use certain analysis methods for evaluating performance criteria for flight clearance, the lpv models have to be represented as linear fractional representations, lfrs (see, e.g., Zhou et al. [1996] or Hecker [2006]). To use the analysis methods efficiently, the lfrs have to be of low order. Generally, any lpv model with rational dependence on the parameters can be turned into an lfr. However, it is a difficult problem to guarantee that the resulting lfr is of minimal order. There exist special cases for which this is possible, for example, when the lpv model depends affinely on the parameters, see Hecker [2006].
Take an lpv model
\[
G(p) = \begin{pmatrix} A(p) & B(p) \\ C(p) & D(p) \end{pmatrix},
\]
where the system matrices depend affinely on the parameters, i.e.,
\[
A(p) = A^{(0)} + A^{(1)} p_1 + A^{(2)} p_2 + \cdots + A^{(N)} p_N,
\]
and the same for $B(p)$, $C(p)$ and $D(p)$. Now create the matrices $F^{(0)}, F^{(1)}, F^{(2)}, \ldots, F^{(N)}$ as
\[
F^{(0)} = \begin{pmatrix} A^{(0)} & B^{(0)} \\ C^{(0)} & D^{(0)} \end{pmatrix}, \quad
F^{(1)} = \begin{pmatrix} A^{(1)} & B^{(1)} \\ C^{(1)} & D^{(1)} \end{pmatrix}, \quad
F^{(2)} = \begin{pmatrix} A^{(2)} & B^{(2)} \\ C^{(2)} & D^{(2)} \end{pmatrix}, \quad \ldots
\]
The minimal order the lfr, generated from $G(p)$, can have is $\sum_{i=1}^{N} \operatorname{rank} F^{(i)}$, and this lfr is easy to compute, see Hecker [2006].
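The order bound $\sum_{i=1}^{N} \operatorname{rank} F^{(i)}$ is cheap to evaluate numerically. A minimal sketch, with made-up coefficient matrices for illustration:

```python
import numpy as np

def minimal_lfr_order(F_list):
    """Order of the lfr obtained from an affine lpv model: the sum of the
    ranks of the coefficient matrices F^(1), ..., F^(N) (the constant term
    F^(0) does not contribute), cf. Hecker [2006]."""
    return sum(int(np.linalg.matrix_rank(F)) for F in F_list)

F1 = np.outer([1.0, 2.0], [1.0, 0.0])  # rank 1
F2 = np.eye(2)                          # rank 2
print(minimal_lfr_order([F1, F2]))      # 3
```

This is why constraining the coefficient matrices to low rank, as done below, directly lowers the complexity of the resulting lfr.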
In this example, the lpv generation methods described in Chapter 5 will be used to reduce the complexity, with respect to the parameters, of the original lpv models. The strategy is to sample a number of lti models from the three given lpv models and to choose an affine parametrization for the generated lpv models, to be able to guarantee that a low-order lfr can be computed from the generated lpv models.
The given lpv models are not strictly proper, which is a problem when using methods based on the $\mathcal{H}_2$ norm, since the $\mathcal{H}_2$ norm is infinite if $D \neq 0$. To circumvent this problem, the $D$ matrices are first ignored and an affine lpv model is computed using only the $A$, $B$ and $C$ matrices. To find the resulting $D$ matrices, a simple element-wise interpolation problem is solved. However, since the $D$ matrices are unaffected by state transformations, their complexity cannot as easily be reduced, and a higher-order polynomial might be necessary in the interpolation to obtain a sufficiently good approximation.
As mentioned above, an lfr of low order is preferred. The first step towards this was to use an affine parametrization. However, by using the rank-controlling method described in Section 5.4.2, it is possible to control the rank of the coefficient matrices ($F^{(1)}, F^{(2)}, \ldots$) in the generated lpv model. Hence, using the rank-controlling method described in Section 5.4.2, the complexity of the resulting lfr can be lowered even more by constraining the appropriate matrices to have low rank.
In this example, we sample 10, 100 and 125 lti models from the one, two and three parameter lpv models, respectively. The lti models are sampled equidistantly in the parameter space and are used as inputs to the proposed methods. Two lpv models will be generated from the data sets from the lpv models with one and two parameters: one with full rank in all the coefficient matrices and one with rank-deficient $A^{(1)}$ and $A^{(2)}$ matrices. A few different ranks for the $A^{(1)}$, $A^{(2)}$ and $A^{(3)}$ matrices were tested; for the one parameter model set, rank two was chosen for the matrix $A^{(1)}$, and for the two parameter model set, rank eleven was used for both $A^{(1)}$ and $A^{(2)}$. For the three parameter model set, no sufficiently good model was found for the ranks tested, and only the result using coefficient matrices with full rank will be presented.
The validity of the resulting lpv models is evaluated by sampling a new, different, set of lti models from each of the given lpv models and comparing these with the generated lpv models. The models are compared both using the relative $\mathcal{H}_2$ norm, ignoring the $D$ matrices, and the relative $\mathcal{H}_\infty$ norm, including the $D$ matrices. The results from the lpv generation are displayed for the one parameter case in Figure 7.1. For the two parameter case, the full rank case is displayed in Figure 7.2 and the low rank case in Figure 7.3. The result from the three parameter case is displayed in Figure 7.4.
In Figures 7.1–7.4, we can see that all the generated lpv models have a low relative $\mathcal{H}_2$ norm for all validation models. This suggests that we have found good approximations of the original lpv models. Not only is the relative $\mathcal{H}_2$ norm low, but also the relative $\mathcal{H}_\infty$ norm, which gives another certificate that the generated models approximate the given lpv models well. Looking at Table 7.1, we can also see that the complexity of the resulting lfrs has decreased in most cases, especially in the cases where we were able to find lpv models with rank-deficient coefficient matrices. These facts suggest that the proposed lpv methods can be used to reduce the complexity of lpv models and their lfrs. Another interesting fact that can be seen in Figure 7.1 is that, for the one parameter model, the resulting model using a rank-deficient coefficient matrix is better than the one with full rank. Two likely explanations are that this could be due to the nonconvexity of the problem, or that the full rank case is an overparametrization and the low-rank method works as a regularization of the problem.
7.1.2
Model Reduction
The three lpv models described in the previous section describe an aircraft, and more precisely a flexible aircraft. The original models were computed using finite element computations and were very large. These models were then reduced such that the dynamics above 15 rad/s were truncated. Hence, the given lpv models are only valid up to 15 rad/s, which makes them suitable for testing the frequency-limited model-reduction method described in Section 4.4.3. As can be seen in Figure 7.5, which plots the magnitude curve of one of the lti models, it would be beneficial to ignore the dynamics above 15 rad/s when doing model reduction. For this example we extract one lti model from the one parameter lpv model
Figure 7.1: Relative error in $\mathcal{H}_2$ and $\mathcal{H}_\infty$ norm at 100 validation points, in the one parameter case. The gray line comes from the case when the coefficient matrix $A^{(1)}$ has full rank, and the black dashed line from the case when $A^{(1)}$ has rank two. Interesting to note is that the low-rank model performs better than the full-rank one. This could, for example, be due to the nonconvexity of the problem or overparametrization.
Figure 7.2: Relative error in $\mathcal{H}_2$ and $\mathcal{H}_\infty$ norm at 1225 validation points, in the two parameter case when the coefficient matrices $A^{(1)}$ and $A^{(2)}$ have full rank.
7.1 Aircraft Example
125
Figure 7.3: Relative error in H2- and H∞-norm at 1225 validation points, in the two parameter case when the coefficient matrices A(1) and A(2) have rank 11.
Figure 7.4: A histogram over the relative error in H2- and H∞-norm at 3375 validation points, in the three parameter case when the coefficient matrices A(1), A(2) and A(3) have full rank.
Table 7.1: A table showing the amount of time it took to compute the different lpv models from Section 7.1.1 and the sizes of the corresponding lfrs. n̄∆ represents the size of the resulting lfr coming from the proposed methods and n∆ represents the size of the lfr from the original lpv model.

lpv Model                  n̄∆    n∆    Time
1 parameter, full rank     26     20    7m 56s
1 parameter, rank 2         6     20    8m 56s
2 parameters, full rank    52     62    1h 35m 31s
2 parameters, rank 11      32     62    50m 44s
3 parameters, full rank    94     98    1h 55m 53s
Figure 7.5: A magnitude plot for a sampled lti model from the one parameter lpv model. The dashed vertical line denotes ω = 15 rad/s.
Figure 7.6: The error models resulting from the different methods, from Section 7.1.2. The dashed vertical line denotes ω = 15 rad/s. The red line (flbt) seems to have found the best model. However, this model is unstable. The best model, in H2 norm, is then the green model, which is our proposed method from Section 4.4.3.
at the nominal value p = 0. This model will be reduced using the methods described in Chapter 4 and will be compared with other model-reduction methods. The methods flh2nl (which is our proposed frequency-limited model-reduction method, see Section 4.4.3), flistia, flbt and mflbt are compared. These methods are also compared with the methods wh2nl (which is our proposed frequency-weighted model-reduction method, see Section 4.4.1) and wbt, both using a tenth order low-pass Butterworth filter with a cutoff frequency of 15 rad/s. The model is reduced from 22 states to 16 states.
The results from the different methods can be seen in Figure 7.6, showing the different error models, in Figure 7.7, showing the true and reduced models, and in Table 7.2. In Figure 7.6 it seems that flbt has found a good approximation. However, looking at Table 7.2 we see that the model from flbt is unstable. All the other methods find models that are acceptable for the relevant frequency range and, as in the examples in Section 4.6, flh2nl finds the model with the best H2 fit.
In this example we had a model that was only valid up to a certain frequency, and looking at the results in Figure 7.6, Figure 7.7 and Table 7.2, we see that the frequency-limited model-reduction methods sacrifice the model fit in the upper frequencies for the valid, lower, frequency regions. Hence, we see the importance of using methods that are able to focus on the relevant region.
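As a rough numerical illustration of such a frequency-limited error measure, the sketch below approximates a frequency-limited H2-type error between two continuous-time transfer functions by quadrature over [0, 15] rad/s. This is only a toy stand-in: the thesis computes these quantities via frequency-limited Gramians, not quadrature, and the two models below are made-up low-order examples, not the aircraft models.

```python
import math

def freq_resp(num, den, w):
    """Evaluate a continuous-time transfer function num(s)/den(s) at s = i*w.
    num and den are coefficient lists in descending powers of s."""
    s = 1j * w
    ev = lambda c: sum(ck * s ** (len(c) - 1 - k) for k, ck in enumerate(c))
    return ev(num) / ev(den)

def h2_error_limited(num1, den1, num2, den2, w_max, n=2000):
    """Frequency-limited H2-type error sqrt(1/pi * int_0^w_max |G1 - G2|^2 dw),
    approximated with the trapezoidal rule (a quadrature stand-in for the
    Gramian-based computation of Section 4.4.3)."""
    ws = [w_max * k / n for k in range(n + 1)]
    vals = [abs(freq_resp(num1, den1, w) - freq_resp(num2, den2, w)) ** 2
            for w in ws]
    integral = sum((vals[k] + vals[k + 1]) / 2 * (ws[k + 1] - ws[k])
                   for k in range(n))
    return math.sqrt(integral / math.pi)

# Hypothetical "full" and "reduced" models (made-up numbers)
G = ([10.0], [1.0, 2.0, 30.0, 10.0])
Gr = ([1.0], [1.0, 0.35])
err = h2_error_limited(*G, *Gr, w_max=15.0)
```

Minimizing such a band-restricted error, rather than the error over all frequencies, is what lets a method ignore the (invalid) dynamics above 15 rad/s.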
Figure 7.7: The true and reduced-order models, for the different methods, from Section 7.1.2. The dashed vertical line denotes ω = 15 rad/s.
Table 7.2: Numerical results for the example in Section 7.1.2.

Method    ‖G − Ĝ‖H2,ω / ‖G‖H2,ω    ‖G − Ĝ‖H∞,ω / ‖G‖H∞,ω    Re λmax
wbt       9.90e-03                  1.12e-02                  -1.63e-01
mflbt     2.90e-02                  2.07e-02                  -1.19e-01
flbt      ∞                         3.87e-04                   6.12e+00
flistia   7.79e-03                  9.33e-03                  -1.35e-01
flh2nl    1.68e-03                  5.11e-03                  -1.90e-01
wh2nl     8.12e-03                  1.37e-02                  -1.82e-01
7.2 Model Reduction in System Identification

In this example we will show how model reduction can be used in system identification to obtain parameter estimates with a smaller covariance matrix than with direct system identification. The example is taken from Tjärnström [2003], where the theoretical results are also presented.
We will work with a siso discrete-time output-error (oe, see Ljung [1999]) model with Ts = 1. Let y(t) denote the output of the system, u(t) the input and N the total number of measured data. The output signal, y(t), is assumed to be generated from the true system, G0(q), as

y(t) = G0(q)u(t) + e(t),

where q is the discrete-time shift operator and the additive noise, e(t), is a zero-mean, white-noise sequence, independent of the input. The sought system is parametrized as an oe model, denoted Ĝ(q, θ), where θ is a vector holding the parameters for the model. To identify a model using the input-output data the prediction-error method (pem, see Ljung [1999]) can be used. One cost function that is commonly used when doing system identification using pem is
V_N(θ) = (1/(2N)) Σ_{t=1}^{N} ε²(t, θ),    ε(t, θ) = y(t) − Ĝ(q, θ)u(t),

and the estimate of θ given the N measurements, θ̂_N, is taken as

θ̂_N = arg min_θ V_N(θ).
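The prediction-error estimate above can be illustrated with a minimal sketch. The code assumes a first-order oe model and replaces a Newton-type pem solver with a crude grid search over θ = (b, f); the system and noise level are made-up numbers, but the cost V_N being minimized is the one defined above.

```python
import random

def oe_sim(b, f, u):
    """Simulate a first-order oe predictor: F(q) yhat = B(q) u with
    B(q) = b q^-1 and F(q) = 1 + f q^-1, i.e. yhat(t) = b u(t-1) - f yhat(t-1)."""
    yhat, prev = [], 0.0
    for t in range(len(u)):
        prev = b * (u[t - 1] if t > 0 else 0.0) - f * prev
        yhat.append(prev)
    return yhat

def V_N(theta, u, y):
    """Prediction-error cost V_N(theta) = 1/(2N) * sum_t eps(t, theta)^2."""
    yhat = oe_sim(theta[0], theta[1], u)
    return sum((yt - yh) ** 2 for yt, yh in zip(y, yhat)) / (2 * len(y))

# Data from a known first-order system (hypothetical numbers), small output noise
rng = random.Random(0)
b0, f0 = 2.0, -0.7
u = [rng.gauss(0, 1) for _ in range(200)]
y = [yt + 0.01 * rng.gauss(0, 1) for yt in oe_sim(b0, f0, u)]

# Coarse grid search as a stand-in for the iterative pem minimization
grid = [i / 10 for i in range(-25, 26)]
best = min(((b, f) for b in grid for f in grid), key=lambda th: V_N(th, u, y))
```

With low noise the grid minimizer lands at (or next to) the true parameters (2.0, −0.7), which is exactly the consistency property the next paragraph formalizes.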
Using the notation and definitions above we can state a connection between system identification, using pem, and model reduction, using the H2 norm. Under weak conditions, it holds that

θ̂_N → θ* = arg min_θ V̄(θ),  as N → ∞,

where

V̄(θ) = lim_{N→∞} (1/N) Σ_{t=1}^{N} E[(1/2) ε²(t, θ)].

Using Parseval's formula and an oe model structure, we have that

V̄(θ) = (1/(4π)) ∫_{−π}^{π} |G(e^{iω}) − Ĝ(e^{iω}, θ)|² Φ_u(ω) dω = (1/2) ‖G − Ĝ(θ)‖²_{Φ_u, H2},

where Φ_u(ω) denotes the spectrum of the input.
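The Parseval identity underlying this equality can be checked numerically on a toy discrete-time FIR model, for which the squared H2 norm can be computed both from the impulse response and from a frequency-domain integral (the impulse response below is made up):

```python
import cmath, math

def h2_sq_time(g):
    """Squared H2 norm of a discrete-time system from its impulse response:
    sum_k g_k^2."""
    return sum(gk * gk for gk in g)

def h2_sq_freq(g, n=1024):
    """Squared H2 norm via 1/(2*pi) * int_{-pi}^{pi} |G(e^{iw})|^2 dw,
    approximated with a Riemann sum over the unit circle (exact for an FIR
    response when n exceeds twice the filter length)."""
    total = 0.0
    for k in range(n):
        w = -math.pi + 2.0 * math.pi * k / n
        G = sum(gj * cmath.exp(-1j * w * j) for j, gj in enumerate(g))
        total += abs(G) ** 2
    return total / n  # (2*pi/n) * total, divided by 2*pi

g = [2.0, -1.0, 0.5, -0.25]  # impulse response of a toy FIR model
```

Both computations agree to floating-point precision, which is the discrete-time counterpart of the time-domain/frequency-domain equality used above.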
Results in Tjärnström and Ljung [2002] and Tjärnström [2003] state that, when estimating an oe model of low order (undermodeling), it is better to obtain the low-order model by model reduction of a high-order model than to estimate the low-order model directly from data. This was exemplified already in Tjärnström [2003], however not by using an H2 model-reduction algorithm, but by using a first-order approximation of the covariance expression for the parameters, see Tjärnström [2003]. First in this example we will use the method proposed in Section 4.4.1 to do the model reduction when having a white-noise input. Secondly, we will use an input signal with a frequency-limited spectrum, which requires the use of the method proposed in Section 4.4.3.
In this example the true system is given by

y(t) = (B(q)/F(q)) u(t) + e(t),

where

B(q) = 2q⁻¹ − q⁻²,
F(q) = 1 − 0.7q⁻¹ + 0.52q⁻² − 0.092q⁻³ − 0.1904q⁻⁴.
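A minimal simulation sketch of this true system, written directly from the difference equation implied by B(q) and F(q) above (only the simulation scheme is assumed; the coefficients are those of the section):

```python
# Coefficients of the true system: B(q) = 2q^-1 - q^-2 and
# F(q) = 1 - 0.7q^-1 + 0.52q^-2 - 0.092q^-3 - 0.1904q^-4
B = [0.0, 2.0, -1.0]
F = [1.0, -0.7, 0.52, -0.092, -0.1904]

def simulate(u, e):
    """Simulate y(t) = B(q)/F(q) u(t) + e(t) via the difference equation
    F(q) x(t) = B(q) u(t), y(t) = x(t) + e(t), with zero initial conditions."""
    x, y = [], []
    for t in range(len(u)):
        xt = sum(B[k] * u[t - k] for k in range(len(B)) if t - k >= 0)
        xt -= sum(F[k] * x[t - k] for k in range(1, len(F)) if t - k >= 0)
        x.append(xt)
        y.append(xt + e[t])
    return y

# Noise-free impulse response: y(1) = 2, y(2) = 0.7*2 - 1 = 0.4, ...
imp = simulate([1.0] + [0.0] * 9, [0.0] * 10)
```

Replacing the impulse with a white-noise input and nonzero e(t) gives the data sets used in the experiments below.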
The input, u, and noise, e, are jointly independent. The noise is a zero-mean white-noise process with variance 1.
First we will use a zero-mean white-noise process with variance 1 for the input. The system is simulated with this input with N = 250 to obtain a data set with input and output data. This data set is used first to directly estimate, using pem, three low-order oe models with orders {n_b = 1, n_f = 1, n_k = 1}, {n_b = 2, n_f = 2, n_k = 1} and {n_b = 3, n_f = 3, n_k = 1}. Using the same data set, an oe model with order {n_b = 4, n_f = 4, n_k = 1} is estimated using pem, and this estimated model is then reduced, using h2nl, to three oe models with orders {n_b = 1, n_f = 1, n_k = 1}, {n_b = 2, n_f = 2, n_k = 1} and {n_b = 3, n_f = 3, n_k = 1}, respectively. This procedure is repeated 500 times and from the obtained estimates, Monte Carlo based estimates of the covariance matrices are computed. From each of the six covariance matrices, as in Tjärnström [2003], the eigenvalues are determined to represent the size of the covariance matrices. The results are presented in Table 7.3.
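The Monte Carlo bookkeeping above can be sketched in simplified form. The stand-in below replaces the oe/pem estimation with a two-parameter linear least-squares fit on a made-up toy system (so the numbers have nothing to do with Table 7.3), but the repeat-estimate/covariance/eigenvalue procedure is the same:

```python
import random

def cov_eigs_2x2(thetas):
    """Sample covariance of 2-d parameter estimates and its eigenvalues
    (sorted descending), used as the 'size' of the covariance matrix."""
    n = len(thetas)
    m = [sum(th[i] for th in thetas) / n for i in range(2)]
    c = [[sum((th[i] - m[i]) * (th[j] - m[j]) for th in thetas) / (n - 1)
          for j in range(2)] for i in range(2)]
    tr = c[0][0] + c[1][1]
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    disc = max(tr * tr / 4 - det, 0.0) ** 0.5
    return [tr / 2 + disc, tr / 2 - disc]

def fir_estimator(rng, N=100):
    """Least-squares fit of y(t) = th1*u(t) + th2*u(t-1) + e(t): a simplified
    stand-in for the oe/pem estimation in the text (hypothetical system)."""
    u = [rng.gauss(0, 1) for _ in range(N + 1)]
    y = [1.0 * u[t + 1] + 0.5 * u[t] + 0.1 * rng.gauss(0, 1) for t in range(N)]
    s11 = sum(u[t + 1] ** 2 for t in range(N))
    s22 = sum(u[t] ** 2 for t in range(N))
    s12 = sum(u[t + 1] * u[t] for t in range(N))
    r1 = sum(y[t] * u[t + 1] for t in range(N))
    r2 = sum(y[t] * u[t] for t in range(N))
    det = s11 * s22 - s12 * s12  # normal equations for regressors (u(t), u(t-1))
    return ((s22 * r1 - s12 * r2) / det, (s11 * r2 - s12 * r1) / det)

rng = random.Random(1)
eigs = cov_eigs_2x2([fir_estimator(rng) for _ in range(200)])
```

Comparing such eigenvalue sets for a "direct" and a "reduced" estimator is exactly how Tables 7.3 and 7.4 compare the two approaches.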
Table 7.3: Numerical results for the example in Section 7.2 using a zero-mean white-noise process with variance 1 for the input. The cases marked "direct" mean that the model comes from directly using pem, and "reduced" means that first a fourth order model is identified using pem and this model is then reduced, using model reduction, to the desired order.
Model – Method          λ1      λ2      λ3      λ4      λ5      λ6
oe(1,1,1) – direct      0.930   0.0859  –       –       –       –
oe(1,1,1) – reduced     0.924   0.0671  –       –       –       –
oe(2,2,1) – direct      1.87    0.916   0.0919  0.0440  –       –
oe(2,2,1) – reduced     1.81    0.910   0.0871  0.0431  –       –
oe(3,3,1) – direct      233     3.57    0.915   0.355   0.0413  0.0276
oe(3,3,1) – reduced     179     2.11    0.952   0.305   0.0407  0.0265
In a second experiment we use an input with a limited spectrum. The input in this case is a zero-mean Gaussian signal with a nonzero spectrum on the frequency interval [0, π/2] and with variance 1. The same procedure as above is used to estimate six different oe models using the direct and the reduced approach. The difference compared to the case above is that the proposed method from Section 4.4.3 is used instead. From each of the six covariance matrices, as in Tjärnström [2003], the eigenvalues are determined to represent the size of the covariance matrices. The results are presented in Table 7.4.
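One way to produce such a zero-mean, unit-variance input with its spectrum concentrated on a limited band is to low-pass filter white noise and renormalize; the windowed-sinc filter below is an assumption for illustration, not the generation method used in the thesis:

```python
import math, random

def bandlimited_input(N, cutoff=math.pi / 2, taps=41, seed=2):
    """Generate an approximately band-limited, zero-mean, unit-variance input
    by filtering white noise through a Hamming-windowed-sinc low-pass FIR
    filter with the given cutoff (in rad/sample) and renormalizing."""
    rng = random.Random(seed)
    m = taps // 2
    h = []
    for k in range(taps):
        n = k - m
        ideal = cutoff / math.pi if n == 0 else math.sin(cutoff * n) / (math.pi * n)
        h.append(ideal * (0.54 - 0.46 * math.cos(2 * math.pi * k / (taps - 1))))
    w = [rng.gauss(0, 1) for _ in range(N + taps)]
    u = [sum(h[k] * w[t - k] for k in range(taps)) for t in range(taps, N + taps)]
    mean = sum(u) / N
    var = sum((x - mean) ** 2 for x in u) / N
    return [(x - mean) / math.sqrt(var) for x in u]

u = bandlimited_input(1000)
```

Since such an input has (almost) no energy above π/2, the data carry little information about the high-frequency behavior, which is why the frequency-limited method of Section 4.4.3 is the natural reduction tool here.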
This example repeats the results from Tjärnström [2003], that H2 model reduction can in some cases be used to find better estimates in system identification, by finding smaller covariance matrices, see Table 7.3 and Table 7.4, however this time using an H2 model-reduction algorithm, both for the case of a white-noise input and for an input with limited spectrum. This example is meant to highlight the connection between system identification and H2 model reduction, and to illustrate yet another application of our results.
Table 7.4: Numerical results for the example in Section 7.2 using a zero-mean Gaussian process with a limited spectrum and variance 1 for the input. "Direct" means that the model comes from directly using pem, and "reduced" means that first a fourth order model is identified using pem and this model is then reduced, using model reduction, to the desired order.
Model – Method          λ1      λ2      λ3      λ4      λ5      λ6
oe(1,1,1) – direct      43.2    0.411   –       –       –       –
oe(1,1,1) – reduced     40.6    0.400   –       –       –       –
oe(2,2,1) – direct      1290    80.9    10.1    0.214   –       –
oe(2,2,1) – reduced     1210    65.9    8.18    0.246   –       –
oe(3,3,1) – direct      3590    595     466     128     3.63    0.170
oe(3,3,1) – reduced     1940    530     488     99.6    3.51    0.180
7.3 Conclusions

The two examples in this chapter have been chosen to highlight some properties and applications of the model reduction and lpv algorithms and to show their applicability to a real-world example. In the aircraft example in Section 7.1 we saw how the lpv generating algorithms can be used to lower the complexity of an existing lpv model and how the frequency-limited model-reduction algorithm can be used to capture relevant frequency regions when performing model reduction. In the system identification example in Section 7.2 we highlighted the connection between system identification and H2 model reduction using an example that shows how the covariance matrix of the estimates can be made smaller using model reduction together with system identification.
8 Concluding Remarks
The previous chapters have introduced, and shown the applicability of, some new methods for reducing the complexity of lti and lpv systems and for synthesizing H2 controllers. All methods are based on the same technique, namely minimizing the H2 norm of different systems while utilizing the structure of the problems to make the methods more efficient. The methods have been developed such that an off-the-shelf quasi-Newton solver can be used to solve the problems using the equations derived in the thesis.
In Section 4.4.1 a method for model reduction, for which the basic idea is not new, was presented. However, we showed how to utilize the structure of the problem, and the method laid the foundation for the other methods in the thesis.
In Section 4.4.2 a model-reduction method that tries to cope with errors in the given data was presented. The method uses the foundation laid in Section 4.4.1 together with a different view of robust optimization, namely using regularization as a proxy for robust optimization.
In Chapter 3 a more complete and uniform derivation of frequency-limited Gramians than in the existing literature was presented. In Section 4.4.3 a frequency-limited model-reduction method was presented. This method was based on the derivations in Chapter 3 together with the foundation laid in Section 4.4.1.
All the model-reduction methods in Chapter 4 were then extended into an lpv framework to be able to handle lpv systems and to reduce the complexity both in the states and in the parameters of the lpv systems. Many of the existing lpv generating methods have one drawback in common, namely that they are not invariant to the state basis the lti models are given in. This drawback makes it hard for the existing methods to reduce the complexity of the lpv model. However, by using a model-reduction method as the foundation of the lpv generating methods in Chapter 5, this drawback is eliminated.
The model-reduction problem is closely related to the controller-synthesis problem, and using the same techniques as in Chapter 4 and Chapter 5, H2 controller-synthesis methods were developed in Chapter 6. As discussed in Chapter 6, a possible extension of the methods for synthesizing controllers could be to use the idea of controlling the rank of the system matrices, as in Section 5.4.2. By controlling the rank one could, for example, enforce the controller to have integrators.
The presented methods have been shown to work well on the presented examples, which range from small academic examples to relevant real-world examples, such as a model of an Airbus aircraft.
All the methods described in this thesis try to solve nonconvex optimization problems, which are difficult problems for which only local solutions can be guaranteed. Hence, initialization is a very important part of the methods presented in this thesis. We have presented some suggestions for initializing the methods and, in our examples, they have worked well. However, this part of the problem is in need of further research, and much can be gained from even better initializations, e.g., faster and more reliable computations, since we can hopefully start even closer to an optimum.
Another problem in need of further research is that of finding a stabilizing controller, which has not been discussed much in this thesis. Finding stabilizing controllers is crucial for the methods in Chapter 6 to be usable, and in this thesis only one simple suggestion, which relies on existing methods, is presented.
Bibliography
Awad H. Al-Mohy, Nicholas J. Higham, and Samuel D. Relton. Computing the Fréchet derivative of the matrix logarithm and estimating the condition number. MIMS EPrint 2012.72, Manchester Institute for Mathematical Sciences, The University of Manchester, UK, 2012. URL http://eprints.ma.man.ac.uk/1852/. Cited on pages 22 and 85.
Branimir Anić, Christopher Beattie, Serkan Gugercin, and Athanasios C. Antoulas. Interpolatory weighted-H2 model reduction. Automatica, 49(5):1275–1280, 2013. Cited on page 45.
Athanasios C. Antoulas. Approximation of Large-Scale Dynamical Systems. Advances in Design and Control. Society for Industrial and Applied Mathematics, 2005. ISBN 0898715296. Cited on pages 8, 40, 41, and 42.
Denis Arzelier, Georgia Deaconu, Suat Gumussoy, and Didier Henrion. H2 for hifoo. In Proceedings of the 2011 International Conference on Control and Optimization with Industrial Applications, Ankara, Turkey, 2011. Cited on page 110.
Bassam Bamieh and Laura Giarre. Identification of linear parameter varying models. International Journal of Robust and Nonlinear Control, 12(9):841–853, 2002. Cited on page 88.
Richard H. Bartels and G. W. Stewart. Algorithm 432: Solution of the matrix equation AX + XB = C. Communications of the ACM, 15(9):820–826, 1972. Cited on page 66.
Frank Bauer and Mark A. Lukas. Comparing parameter choice methods for regularization of ill-posed problems. Mathematics and Computers in Simulation, 81(9):1795–1841, May 2011. Cited on page 55.
Christopher A. Beattie and Serkan Gugercin. Krylov-based minimization for optimal H2 model reduction. In Proceedings of the 46th IEEE Conference on Decision and Control, pages 4385–4390, New Orleans, USA, 2007. Cited on pages 40, 43, and 45.
Christopher A. Beattie and Serkan Gugercin. A trust region method for optimal H2 model reduction. In Proceedings of the 48th IEEE Conference on Decision and Control and 28th Chinese Control Conference, pages 5370–5375, Shanghai, China, 2009. Cited on page 40.
Aharon BenTal and Arkadi Nemirovski. Robust Optimization  Methodology and Applications.
Mathematical Programming (Series B)
, 92:453–480, 2002.
Cited on page 55.
Peter Benner, Jose M. Claver, and Enrique S. Quintana-Orti. Efficient solution of coupled Lyapunov equations via matrix sign function iteration. In Proceedings of the 3rd Portuguese Conference on Automatic Control, pages 205–210, Coimbra, Portugal, 1998. Cited on page 110.
Dimitris Bertsimas, David B. Brown, and Constantine Caramanis. Theory and Applications of Robust Optimization.
SIAM Review
, 53(3):464–501, 2011. Cited on page 55.
Vincent Blondel and John N. Tsitsiklis. NPhardness of some linear control design problems.
SIAM Journal on Control and Optimization
, 35(6):2118–2127, 1997.
Cited on pages 103 and 110.
Samuel Burer and Renato D.C. Monteiro. A nonlinear programming algorithm for solving semideﬁnite programs via lowrank factorization.
Mathematical
Programming, Series B
, 95(2):329–357, 2003. Cited on page 97.
Jan De Caigny, Juan F. Camino, and Jan Swevers. Interpolationbased modeling of mimo lpv systems.
IEEE Transactions on Control Systems Technology
, 19
(1):46–63, 2011. Cited on page 88.
Jan De Caigny, Rik Pintelon, Juan F. Camino, and Jan Swevers. Interpolated modeling of lpv systems based on observability and controllability. In
Proceedings of the 16th IFAC Symposium on System Identiﬁcation
, pages 1773–1778, Brussels, Belgium, 2012. Cited on pages 88, 89, 99, and 100.
M. Diab, W.Q. Liu, and V. Sreeram. Optimal model reduction with a frequency weighted extension.
Dynamics and Control
, 10:255–276, 2000. Cited on page
40.
Laurent El Ghaoui and Hervé Lebret. Robust solutions to leastsquares problems with uncertain data.
SIAM Journal on Matrix Analysis and Applications
, 18(4):
1035–1064, 1997. Cited on page 56.
Laurent El Ghaoui, Francois Oustry, and Mustapha AitRami. A cone complementarity linearization algorithm for static outputfeedback and related problems.
IEEE Transactions on Automatic Control
, 42(8):1171–1176, aug 1997. Cited on page 13.
Dale F. Enns. Model reduction with balanced realizations: An error bound and a frequency weighted generalization. In Proceedings of the 23rd IEEE Conference on Decision and Control, pages 127–132, Las Vegas, USA, 1984. Cited on pages 40, 42, and 68.
Makan Fardad, Fu Lin, and Mihailo R. Jovanovic. On the optimal design of structured feedback gains for interconnected systems. In
Proceedings of the 48th
IEEE Conference on Decision and Control, held jointly with the 28th Chinese
Control Conference
, pages 978–983, Shanghai, China, 2009. Cited on page
104.
Federico Felici, JanWillem Van Wingerden, and Michel Verhaegen. Subspace identiﬁcation of mimo lpv systems using a periodic scheduling sequence.
Automatica
, 43(10):1684–1697, 2007. Cited on page 88.
Garret M. Flagg, Serkan Gugercin, and Christopher A. Beattie. An interpolation-based approach to H∞ model reduction of dynamical systems. In Proceedings of the 49th IEEE Conference on Decision and Control, pages 6791–6796, Atlanta, GA, USA, 2010. Cited on page 40.
Pascale Fulcheri and Martine Olivi. Matrix rational H2 approximation: A gradient algorithm based on Schur analysis. SIAM Journal on Control and Optimization, 36(6):2103–2127, 1998. Cited on pages 43 and 44.
Andrea Garulli, Anders Hansson, Sina Khoshfetrat Pakazad, Alﬁo Masi, and Ragnar Wallin. Robust ﬁnitefrequency
H
2 analysis of uncertain systems with application to ﬂight comfort analysis.
Control Engineering Practice
, 21(6):887–
897, 2013. Cited on page 121.
Wodek Gawronski and JerNan Juang. Model reduction in limited time and frequency intervals.
International Journal of Systems Science
, 21(2):349–376,
1990. Cited on pages 23, 24, 37, 41, 42, 43, and 68.
Wodek K. Gawronski.
Advanced Structural Dynamics and Active Control of
Structures
. Mechanical Engineering Series. Springer, 2004. Cited on page 23.
Keith Glover. All optimal Hankel-norm approximations of linear multivariable systems and their L∞-error bounds. International Journal of Control, 39(6):1115–1193, 1984. Cited on pages 40 and 42.
Gene H. Golub and Charles F. Van Loan.
Matrix Computations
. Johns Hopkins
University Press, 3rd edition, 1996. ISBN 0801854138. Cited on page 66.
Serkan Gugercin and Athanasios C. Antoulas. A survey of model reduction by balanced truncation and some new results.
International Journal of Control
,
77(8):748–766, 2004. Cited on pages 42, 43, and 68.
Suat Gumussoy, Didier Henrion, Marc Millstone, and Michael L. Overton. Multiobjective robust control with hifoo 2.0. In Proceedings of the IFAC Symposium on Robust Control Design, Haifa, Israel, 2009. Cited on pages 110, 111, 112, 113, 114, 115, and 116.
Yoram Halevi. Frequency weighted model reduction via optimal projection.
IEEE
Transactions on Automatic Control
, 37(10):1537–1542, 1992. Cited on page 40.
Trevor Hastie, Robert Tibshirani, and Jerome Friedman.
The Elements of Statistical Learning: Data Mining, Inference, and Prediction
. Springer, 2001. ISBN
0387952845. Cited on page 55.
Simon Hecker. Generation of Low Order lft Representations for Robust Control Applications. PhD thesis, Technical University of Munich, 2006. Cited on page 122.
Anders Helmersson. Model reduction using lmis. In Proceedings of the 33rd IEEE Conference on Decision and Control, volume 4, pages 3217–3222, Lake Buena Vista, USA, 1994. Cited on pages 40 and 45.
Nicholas J. Higham.
Functions of Matrices: Theory and Computation
. SIAM,
2008. Cited on pages 19, 20, 21, 22, 62, 64, 85, and 96.
Lucas G. Horta, JerNan Juang, and Richard W. Longman. Discretetime model reduction in limited frequency ranges.
Journal of Guidance, Control and Dynamics
, 16(6):1125–1130, 1993. Cited on pages 23, 30, 37, 41, and 42.
Xue-Xiang Huang, Wei-Yong Yan, and K. L. Teo. H2 near-optimal model reduction. IEEE Transactions on Automatic Control, 46(8):1279–1284, 2001. Cited on pages 41 and 44.
Rudolf E. Kalman. Contributions to the theory of optimal control.
Boletin de la
Sociedad Mathematica Mexicana
, 5:102–119, 1960. Cited on page 103.
Balazs Kulcsar and Roland Tóth. On the similarity state transformation for linear parametervarying systems. In
Proceedings of the 18th IFAC World Congress
, pages 4155–4160, Milan, Italy, 2011. Cited on page 14.
Peter Lancaster and Miron Timor Tismenetsky. The Theory of Matrices. Computer Science and Scientific Computing. Academic Press, second edition, 1985. Cited on pages 19 and 20.
Lawton H. Lee and Kameshwar Poolla. Identiﬁcation of linear parametervarying systems using nonlinear programming.
Journal of Dynamic Systems, Measurement and Control
, 121(1):71–78, 1999. Cited on page 88.
Friedemann Leibfritz and W. Lipinski. Description of the benchmark examples in COMPleib 1.0. Technical report, University of Trier, Department of Mathematics, Germany, 2003. Cited on pages 69, 73, 110, 112, 113, 114, 115, and 116.
Douglas J. Leith and William E. Leithead. Survey of gainscheduling analysis and design.
International Journal of Control
, 73(11):1001–1025, 2000. Cited on page 87.
Antonio Lepschy, Gian Antonio Mian, G. Pinato, and Umberto Viaro. Rational L2 approximation: a nongradient algorithm. In Proceedings of the 30th IEEE Conference on Decision and Control, volume 3, pages 2321–2323, Brighton, UK, 1991. Cited on page 43.
Adrian S. Lewis and Michael L. Overton. Nonsmooth optimization via quasi-Newton methods. Mathematical Programming, 2012. Cited on page 16.
ChingAn Lin and TaiYih Chiu. Model reduction via frequency weighted balanced realization.
Control Theory and Advanced Technology
, 8:341–351, 1992.
Cited on page 42.
Fu Lin, Makan Fardad, and Mihailo R. Jovanovic. Synthesis of H2 optimal static structured controllers: primal and dual formulations. In Proceedings of the 47th Annual Allerton Conference, pages 340–346, Urbana-Champaign, USA, 2009. Cited on page 104.
Lennart Ljung.
System Identiﬁcation: Theory for the User
. Prentice Hall, second edition, 1999. ISBN 0136566952. Cited on pages 88, 128, and 129.
Marco Lovera and Guillaume Mercere. Identiﬁcation for gainscheduling: a balanced subspace approach. In
Proceedings of the American Control Conference
, pages 858–863, New York, USA, 2007. Cited on page 88.
Marco Lovera, Carlo Novara, Paulo Lopes dos Santos, and Daniel Rivera. Guest editorial special issue on applied lpv modeling and identiﬁcation.
IEEE Transactions on Control Systems Technology
, 19(1):1–4, 2011. Cited on page 87.
Andres Marcos and Gary J. Balas. Development of linearparametervarying models for aircraft.
Journal of Guidance, Control and Dynamics
, 27(2):218–228,
2004. Cited on page 87.
Alexandre Megretski and Anders Rantzer. System analysis via integral quadratic constraints.
IEEE Transactions on Automatic Control
, 42(6):819–830, 1997.
Cited on page 87.
Lewis Meier and David G. Luenberger. Approximation of linear constant systems.
IEEE Transactions on Automatic Control
, 12(5):585–588, 1967. Cited on page
43.
Mehran Mesbahi, George P. Papavassilopoulos, and Michael G. Safonov. Matrix cones, complementarity problems, and the bilinear matrix inequality. In
Proceedings of the 34th IEEE Conference on Decision and Control
, volume 3, pages 3102–3107, New Orleans, USA, 1995. Cited on page 104.
Keith Miller. Least squares methods for illposed problems with a prescribed bound.
SIAM Journal on Mathematical Analysis
, 1(1):52–74, 1970. Cited on page 56.
Javad Mohammadpour and Carsten W. Scherer, editors. Control of Linear Parameter Varying Systems with Applications. Springer US, 2012. ISBN 9781461418337. Cited on page 87.
Bruce C. Moore. Principal component analysis in linear systems: Controllability, observability and model reduction.
IEEE Transactions on Automatic Control
,
AC26(1):17–32, 1981. Cited on pages 40 and 41.
Mahadevamurty Nemani, Rayadurgam Ravikanth, and Bassam A. Bamieh. Identiﬁcation of linear parametrically varying systems. In
Proceedings of the 34th
IEEE Conference on Decision and Control
, volume 3, pages 2990–2995, Seville,
Spain, 1995. Cited on page 88.
Jorge Nocedal and Stephen J. Wright.
Numerical Optimization
. Springer, 2006.
ISBN 9870387303031. Cited on pages 15, 16, and 17.
Daniel Petersson. Nonlinear Optimization Approaches to H2-Norm Based lpv Modelling and Control. Licentiate thesis no. 1453, Department of Electrical Engineering, Linköping University, 2010. Not cited.
Daniel Petersson and Johan Löfberg. Optimization based lpv-approximation of multi-model systems. In Proceedings of the European Control Conference, pages 3172–3177, Budapest, Hungary, 2009. Not cited.
Daniel Petersson and Johan Löfberg. Robust generation of lpv state-space models using a regularized H2 cost. In Proceedings of the IEEE International Symposium on Computer-Aided Control System Design, pages 1170–1175, Yokohama, Japan, 2010. Not cited.
Daniel Petersson and Johan Löfberg. lpv H2 controller synthesis using nonlinear programming. In Proceedings of the 18th IFAC World Congress, pages 6692–6696, Milan, Italy, 2011. Not cited.
Daniel Petersson and Johan Löfberg. Model reduction using a frequency-limited H2 cost. arXiv preprint arXiv:1212.1603, December 2012a. URL http://arxiv.org/abs/1212.1603. Cited on pages 39 and 60.
Daniel Petersson and Johan Löfberg. Optimization Based Clearance of Flight Control Laws - A Civil Aircraft Application, chapter Identification of lpv State-Space Models Using H2 Minimisation, pages 111–128. Springer, 2012b. Not cited.
Daniel Petersson and Johan Löfberg. Optimization-based modeling of lpv systems using an H2 objective. Submitted to International Journal of Control, December 2012c. Cited on page 87.
Harald Pfifer and Simon Hecker. Generation of optimal linear parametric models for lft-based robust stability analysis and control design. In Proceedings of the 47th IEEE Conference on Decision and Control, pages 3866–3871, Cancun, Mexico, 2008. Cited on pages 88 and 89.
Charles Poussot-Vassal. An iterative SVD-tangential interpolation method for medium-scale mimo systems approximation with application on flexible aircraft. In Proceedings of the 50th IEEE Conference on Decision and Control and the European Control Conference, pages 7117–7122, Orlando, FL, USA, 2011. Cited on pages 40 and 45.
Charles Poussot-Vassal and Pierre Vuillemin. Introduction to MORE: a MOdel REduction Toolbox. In Proceedings of the IEEE Multi Systems Conference (MSC CCA'12), pages 776–781, Dubrovnik, Croatia, 2012. Cited on pages 41 and 68.
T. Rautert and Ekkehard W. Sachs. Computational design of optimal output feedback controllers. SIAM Journal on Optimization, 7(3):837–852, 1997. Cited on page 103.
Wilson J. Rugh and Jeff S. Shamma. Research on gain scheduling. Automatica, 36(10):1401–1425, 2000. Cited on page 87.
M. G. Safonov and R. Y. Chiang. A Schur Method for BalancedTruncation Model
Reduction.
IEEE Transactions on Automatic Control
, 34(7):729–733, 1989.
Cited on page 42.
M. G. Safonov, R. Y. Chiang, and D. J. N. Limebeer. Optimal Hankel Model Reduction for Nonminimal Systems.
IEEE Transactions on Automatic Control
, 35
(4):496–502, 1990. Cited on page 42.
Shaﬁshuhaza Sahlan, Abdul Ghafoor, and Victor Sreeram. A new method for the model reduction technique via a limited frequency interval impulse response gramian.
Mathematical and Computer Modelling
, 55(34):1034–1040, 2012.
Cited on page 41.
Jeff S. Shamma and Michael Athans. Gain scheduling: potential hazards and possible remedies. IEEE Control Systems Magazine, 12(3):101–107, 1992. Cited on pages 89 and 108.
Robert E. Skelton, Tetsuya Iwasaki, and Karolos M. Grigoriadis.
A Uniﬁed Algebraic Approach to Linear Control Design
. Taylor and Francis, 1998. ISBN
0748405925. Cited on pages 10, 19, and 20.
Sigurd Skogestad and Ian Postlethwaite.
Multivariable Feedback Control: Analysis and Design
. Wiley, second edition, 2007. ISBN 0470011676. Cited on pages 8, 10, and 13.
Victor Sreeram and Shaﬁshuhaza Sahlan. Improved results on frequency weighted balanced truncation. In
Proceedings of the 48th IEEE Conference on Decision and Control, held jointly with the 28th Chinese Control Conference
, pages
3250–3255, Shanghai, China, 2009. Cited on page 40.
Maarten Steinbuch, Rene van de Molengraft, and AartJan Van Der Voort. Experimental modelling and lpv control of a motion system. In
Proceedings of the American Control Conference
, volume 2, pages 1374–1379, Denver, USA,
2003. Cited on pages 88 and 89.
Michael Stingl.
On the Solution of Nonlinear Semideﬁnite Programs by Augmented Lagrangian Methods
. PhD thesis, University of Erlangen, 2006. Cited on pages 104, 110, 111, 112, 113, 114, 115, and 116.
Vassili L. Syrmos, Chaouki T. Abdallah, Peter Dorato, and Karolos Grigoriadis. Static output feedback – a survey. Automatica, 33(2):125–137, 1997. Cited on page 103.
Fredrik Tjärnström. Variance analysis of L2 model reduction when undermodeling – the output error case. Automatica, 39(10):1809–1815, 2003. Cited on pages 128, 129, and 130.
Fredrik Tjärnström and Lennart Ljung. L2 model reduction and variance reduction. Automatica, 38(9):1517–1530, September 2002. Cited on page 129.
Roland Tóth. Modeling and Identification of Linear Parameter-Varying Systems, an Orthonormal Basis Function Approach. PhD thesis, Delft University of Technology, 2008. Cited on pages 14, 87, and 88.
Roland Tóth, Federico Felici, Peter S. C. Heuberger, and Paul Van Den Hof. Discrete time lpv I/O and state-space representations, differences of behavior and pitfalls of interpolation. In Proceedings of the European Control Conference, pages 5418–5425, Kos, Greece, 2007. Cited on page 89.
Roland Tóth, Hossam Seddik Abbas, and Herbert Werner. On the state-space realization of lpv input-output models: Practical approaches. IEEE Transactions on Control Systems Technology, 20(1):139–153, 2012. Cited on page 15.
Andras Varga and Brian D. O. Anderson. Accuracy enhancing methods for the frequency-weighted balancing related model reduction. In Proceedings of the 40th IEEE Conference on Decision and Control, pages 3659–3664, Orlando, USA, 2001. Cited on page 42.
Andreas Varga, Anders Hansson, and Guilhem Puyou, editors. Optimization Based Clearance of Flight Control Laws. Lecture Notes in Control and Information Science. Springer, 2012. Cited on page 121.
Pierre Vuillemin, Charles Poussot-Vassal, and Daniel Alazard. H2 optimal and frequency limited approximation methods for large-scale lti dynamical systems. In Proceedings of the 5th IFAC Symposium on System Structure and Control, pages 719–724, Grenoble, France, 2013. Cited on page 68.
Matthijs Groot Wassink, Marc Van De Wal, Carsten Scherer, and Okko Bosgra. lpv control for a wafer stage: beyond the theoretical solution. Control Engineering Practice, 13(2):231–245, 2005. Cited on pages 87 and 88.
D. A. Wilson. Optimum solution of model reduction problem. Proceedings of the Institution of Electrical Engineers, 117(6):1161–1165, 1970. Cited on pages 43 and 45.
D. A. Wilson. Model reduction for multivariable systems. International Journal of Control, 20(1):57–64, 1974. Cited on page 44.
Yuesheng Xu and Taishan Zeng. Optimal mimo H2 model reduction for large scale systems via tangential interpolation. International Journal of Numerical Analysis and Modeling, 8(1):174–188, 2011. Cited on page 45.
Wei-Yong Yan and James Lam. An approximate approach to H2 optimal model reduction. IEEE Transactions on Automatic Control, 44(7):1341–1358, 1999. Cited on pages 43 and 44.
Kemin Zhou. Frequency-weighted L∞ norm and optimal Hankel norm model reduction. IEEE Transactions on Automatic Control, 40(10):1687–1699, 1995. Cited on page 40.
Kemin Zhou, John C. Doyle, and Keith Glover. Robust and optimal control. Prentice-Hall, Inc., 1996. ISBN 0134565673. Cited on pages 10, 12, 13, 87, 103, and 122.
PhD Dissertations
Division of Automatic Control
Linköping University
M. Millnert: Identification and control of systems subject to abrupt changes. Thesis No. 82, 1982. ISBN 9173725420.
A. J. M. van Overbeek: On-line structure selection for the identification of multivariable systems. Thesis No. 86, 1982. ISBN 9173725862.
B. Bengtsson: On some control problems for queues. Thesis No. 87, 1982. ISBN 9173725935.
S. Ljung: Fast algorithms for integral equations and least squares identification problems. Thesis No. 93, 1983. ISBN 9173726419.
H. Jonson: A Newton method for solving nonlinear optimal control problems with general constraints. Thesis No. 104, 1983. ISBN 9173727180.
E. Trulsson: Adaptive control based on explicit criterion minimization. Thesis No. 106,
1983. ISBN 9173727288.
K. Nordström: Uncertainty, robustness and sensitivity reduction in the design of single input control systems. Thesis No. 162, 1987. ISBN 9178701708.
B. Wahlberg: On the identification and approximation of linear systems. Thesis No. 163, 1987. ISBN 9178701759.
S. Gunnarsson: Frequency domain aspects of modeling and control in adaptive systems.
Thesis No. 194, 1988. ISBN 9178703808.
A. Isaksson: On system identification in one and two dimensions with signal processing applications. Thesis No. 196, 1988. ISBN 9178703832.
M. Viberg: Subspace fitting concepts in sensor array processing. Thesis No. 217, 1989. ISBN 9178705290.
K. Forsman: Constructive commutative algebra in nonlinear control theory. Thesis No. 261, 1991. ISBN 9178708273.
F. Gustafsson: Estimation of discrete parameters in linear systems. Thesis No. 271, 1992. ISBN 9178708761.
P. Nagy: Tools for knowledge-based signal processing with applications to system identification. Thesis No. 280, 1992. ISBN 9178709628.
T. Svensson: Mathematical tools and software for analysis and design of nonlinear control systems. Thesis No. 285, 1992. ISBN 917870989X.
S. Andersson: On dimension reduction in sensor array signal processing. Thesis No. 290,
1992. ISBN 9178710154.
H. Hjalmarsson: Aspects on incomplete modeling in system identification. Thesis No. 298, 1993. ISBN 9178710707.
I. Klein: Automatic synthesis of sequential control schemes.
Thesis No. 305, 1993.
ISBN 9178710901.
J.-E. Strömberg: A mode switching modelling philosophy. Thesis No. 353, 1994. ISBN 9178714303.
K. Wang Chen: Transformation and symbolic calculations in filtering and control. Thesis No. 361, 1994. ISBN 9178714672.
T. McKelvey: Identification of state-space models from time and frequency data. Thesis No. 380, 1995. ISBN 9178715318.
J. Sjöberg: Nonlinear system identification with neural networks. Thesis No. 381, 1995. ISBN 9178715342.
R. Germundsson: Symbolic systems – theory, computation and applications. Thesis
No. 389, 1995. ISBN 9178715784.
P. Pucar: Modeling and segmentation using multiple models. Thesis No. 405, 1995.
ISBN 9178716276.
H. Fortell: Algebraic approaches to normal forms and zero dynamics. Thesis No. 407,
1995. ISBN 9178716292.
A. Helmersson: Methods for robust gain scheduling. Thesis No. 406, 1995. ISBN 9178716284.
P. Lindskog: Methods, algorithms and tools for system identification based on prior knowledge. Thesis No. 436, 1996. ISBN 9178714248.
J. Gunnarsson: Symbolic methods and tools for discrete event dynamic systems. Thesis
No. 477, 1997. ISBN 9178719178.
M. Jirstrand: Constructive methods for inequality constraints in control. Thesis No. 527,
1998. ISBN 9172191872.
U. Forssell: Closed-loop identification: Methods, theory, and applications. Thesis No. 566,
1999. ISBN 9172194324.
A. Stenman: Model on demand: Algorithms, analysis and applications. Thesis No. 571,
1999. ISBN 9172194502.
N. Bergman: Recursive Bayesian estimation: Navigation and tracking applications. Thesis
No. 579, 1999. ISBN 9172194731.
K. Edström: Switched bond graphs: Simulation and analysis. Thesis No. 586, 1999.
ISBN 9172194936.
M. Larsson: Behavioral and structural model based approaches to discrete diagnosis. Thesis No. 608, 1999. ISBN 9172196155.
F. Gunnarsson: Power control in cellular radio systems: Analysis, design and estimation.
Thesis No. 623, 2000. ISBN 9172196890.
V. Einarsson: Model checking methods for mode switching systems. Thesis No. 652, 2000.
ISBN 9172198362.
M. Norrlöf: Iterative learning control: Analysis, design, and experiments. Thesis No. 653,
2000. ISBN 9172198370.
F. Tjärnström: Variance expressions and model reduction in system identification. Thesis
No. 730, 2002. ISBN 9173732532.
J. Löfberg: Minimax approaches to robust model predictive control. Thesis No. 812, 2003.
ISBN 9173736228.
J. Roll: Local and piecewise affine approaches to system identification. Thesis No. 802, 2003. ISBN 9173736082.
J. Elbornsson: Analysis, estimation and compensation of mismatch effects in A/D converters. Thesis No. 811, 2003. ISBN 917373621X.
O. Härkegård: Backstepping and control allocation with applications to flight control.
Thesis No. 820, 2003. ISBN 9173736473.
R. Wallin: Optimization algorithms for system analysis and identification. Thesis No. 919,
2004. ISBN 9185297194.
D. Lindgren: Projection methods for classification and identification. Thesis No. 915,
2005. ISBN 9185297062.
R. Karlsson: Particle Filtering for Positioning and Tracking Applications. Thesis No. 924,
2005. ISBN 9185297348.
J. Jansson: Collision Avoidance Theory with Applications to Automotive Collision Mitigation. Thesis No. 950, 2005. ISBN 9185299456.
E. Geijer Lundin: Uplink Load in CDMA Cellular Radio Systems. Thesis No. 977, 2005.
ISBN 9185457493.
M. Enqvist: Linear Models of Nonlinear Systems. Thesis No. 985, 2005. ISBN 9185457647.
T. B. Schön: Estimation of Nonlinear Dynamic Systems — Theory and Applications. Thesis No. 998, 2006. ISBN 9185497037.
I. Lind: Regressor and Structure Selection — Uses of ANOVA in System Identification.
Thesis No. 1012, 2006. ISBN 9185523984.
J. Gillberg: Frequency Domain Identification of Continuous-Time Systems: Reconstruction and Robustness. Thesis No. 1031, 2006. ISBN 9185523348.
M. Gerdin: Identification and Estimation for Models Described by Differential-Algebraic
Equations. Thesis No. 1046, 2006. ISBN 9185643874.
C. Grönwall: Ground Object Recognition using Laser Radar Data – Geometric Fitting,
Performance Analysis, and Applications. Thesis No. 1055, 2006. ISBN 918564353X.
A. Eidehall: Tracking and threat assessment for automotive collision avoidance. Thesis
No. 1066, 2007. ISBN 9185643106.
F. Eng: NonUniform Sampling in Statistical Signal Processing. Thesis No. 1082, 2007.
ISBN 9789185715497.
E. Wernholt: Multivariable Frequency-Domain Identification of Industrial Robots. Thesis
No. 1138, 2007. ISBN 9789185895724.
D. Axehill: Integer Quadratic Programming for Control and Communication. Thesis
No. 1158, 2008. ISBN 9789185523030.
G. Hendeby: Performance and Implementation Aspects of Nonlinear Filtering. Thesis
No. 1161, 2008. ISBN 9789173939799.
J. Sjöberg: Optimal Control and Model Reduction of Nonlinear DAE Models. Thesis
No. 1166, 2008. ISBN 9789173939645.
D. Törnqvist: Estimation and Detection with Applications to Navigation. Thesis No. 1216,
2008. ISBN 9789173937856.
P.-J. Nordlund: Efficient Estimation and Detection Methods for Airborne Applications.
Thesis No. 1231, 2008. ISBN 9789173937207.
H. Tidefelt: Differential-algebraic equations and matrix-valued singular perturbation.
Thesis No. 1292, 2009. ISBN 9789173934794.
H. Ohlsson: Regularization for Sparseness and Smoothness — Applications in System
Identification and Signal Processing. Thesis No. 1351, 2010. ISBN 9789173932875.
S. Moberg: Modeling and Control of Flexible Manipulators. Thesis No. 1349, 2010.
ISBN 9789173932899.
J. Wallén: Estimation-based iterative learning control. Thesis No. 1358, 2011. ISBN 9789173932554.
J. Hol: Sensor Fusion and Calibration of Inertial Sensors, Vision, Ultra-Wideband and GPS.
Thesis No. 1368, 2011. ISBN 9789173931977.
D. Ankelhed: On the Design of Low Order H-infinity Controllers. Thesis No. 1371, 2011.
ISBN 9789173931571.
C. Lundquist: Sensor Fusion for Automotive Applications.
Thesis No. 1409, 2011.
ISBN 9789173930239.
P. Skoglar: Tracking and Planning for Surveillance Applications. Thesis No. 1432, 2012.
ISBN 9789175199412.
K. Granström: Extended target tracking using PHD filters. Thesis No. 1476, 2012.
ISBN 9789175197968.
C. Lyzell: Structural Reformulations in System Identification. Thesis No. 1475, 2012.
ISBN 9789175198002.
J. Callmer: Autonomous Localization in Unknown Environments. Thesis No. 1520, 2013.
ISBN 9789175196206.