applied matrix theory

applied matrix theory
APPLIED MATRIX THEORY
j
Lecture Notes for Math 464/514 Presented by
DR. MONIKA NITSCHE
j
Typeset and Editted by
ERIC M. BENNER
j
STUDENTS PRESS
December 3, 2013
Copyright © 2013
Contents
1
Introduction to Linear Algebra
1.1
1
Lecture 1: August 19, 2013 . . . . . . . . . . . . . . . . . . . . . . . . . . 1
About the class, 1. Linear Systems, 1. Example: Application to boundary value
problem, 2. Analysis of error, 3. Solution of the discretized equation, 4.
2
Matrix Inversion
2.1
5
Lecture 2: August 21, 2013 . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Gaussian Elimination, 5. Inner-product based implementation, 7.
other class notes, 8. Example: Gauss Elimination, 8.
2.2
Office hours and
Lecture 3: August 23, 2013 . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Example: Gauss Elimination, cont., 8. Operation Cost of Forward Elimination, 9.
Cost of the Order of an Algorithm, 10. Validation of Lower/Upper Triangular Form, 11.
Theoretical derivation of Lower/Upper Form, 11.
2.3
3
HW 1: Due August 30, 2013 . . . . . . . . . . . . . . . . . . . . . . . . . 12
Factorization
3.1
15
Lecture 4: August 26, 2013 . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Elementary Matrices, 15. Solution of Matrix using the Lower/Upper factorization, 18.
Sparse and Banded Matrices, 18. Motivation for Gauss Elimination with Pivoting, 19.
3.2
Lecture 5: August 28, 2013 . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Motivation for Gauss Elimination with Pivoting, cont., 19. Discussion of well-posedness, 20.
Gaussian elimination with pivoting, 21.
3.3
Lecture 6: August 30, 2013 . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Discussion of HW problem 2, 22. PLU factorization, 22.
3.4
Lecture 7: September 4, 2013 . . . . . . . . . . . . . . . . . . . . . . . . . 24
PLU Factorization, 24. Triangular Matrices, 25. Multiplication of lower triangular matrices, 25. Inverse of a lower triangular matrix, 25. Uniqueness of LU factorization, 26.
Existence of the LU factorization, 26.
3.5
Lecture 8: September 6, 2013 . . . . . . . . . . . . . . . . . . . . . . . . . 27
About Homeworks, 27. Discussion of ill-conditioned systems, 27. Inversion of lower
triangular matrices, 28. Example of LU decomposition of a lower triangular matrix, 28.
Banded matrix example, 29.
iii
Nitsche and Benner
3.6
Applied Matrix Theory
Lecture 9: September 9, 2013 . . . . . . . . . . . . . . . . . . . . . . . . . 29
Existence of the LU factorization (cont.), 29. Rectangular matrices, 31.
3.7
4
HW 2: Due September 13, 2013 . . . . . . . . . . . . . . . . . . . . . . . 32
Rectangular Matrices
4.1
35
Lecture 10: September 11, 2013 . . . . . . . . . . . . . . . . . . . . . . . 35
Rectangular matrices (cont.), 35. Example of RREF of a Rectangular Matrix, 37.
4.2
Lecture 11: September 13, 2013 . . . . . . . . . . . . . . . . . . . . . . . 38
Solving Ax = b, 38. Example, 38. Linear functions, 39. Example: Transpose
operator, 40. Example: trace operator, 40. Matrix multiplication, 41. Proof of
transposition property, 42.
4.3
Lecture 12: September 16, 2013 . . . . . . . . . . . . . . . . . . . . . . . 42
Inverses, 42. Low rank perturbations of I, 43. The Sherman–Morrison Formula, 44.
Finite difference example with periodic boundary conditions, 44. Examples of perturbation, 45. Small perturbations of I, 45.
4.4
Lecture 13: September 18, 2013 . . . . . . . . . . . . . . . . . . . . . . . 46
Small perturbations of I (cont.), 46. Matrix Norms, 47. Condition Number, 48.
4.5
5
HW 3: Due September 27, 2013 . . . . . . . . . . . . . . . . . . . . . . . 49
Vector Spaces
5.1
Lecture 14: September 20, 2013 . . . . . . . . . . . . . . . . . . . . . . . 55
Topics in Vector Spaces, 55.
spaces, 57.
5.2
55
Field, 55.
Vector Space, 56.
Examples of function
Lecture 15: September 23, 2013 . . . . . . . . . . . . . . . . . . . . . . . 58
The four subspaces of Am×n , 58.
5.3
Lecture 16: September 25, 2013 . . . . . . . . . . . . . . . . . . . . . . . 61
The Four Subspaces of A, 62. Linear Independence, 63.
5.4
Lecture 17: September 27, 2013 . . . . . . . . . . . . . . . . . . . . . . . 64
Linear functions (rev), 64. Review for exam, 64. Previous lecture continued, 65.
5.5
Lecture 18: October 2, 2013. . . . . . . . . . . . . . . . . . . . . . . . . . 66
Exams and Points, 66. Continuation of last lecture, 66.
6
Least Squares
6.1
69
Lecture 19: October 4, 2013. . . . . . . . . . . . . . . . . . . . . . . . . . 69
Least Squares, 69.
6.2
Lecture 20: October 7, 2013. . . . . . . . . . . . . . . . . . . . . . . . . . 70
Properties of Transpose Multiplication, 71. The Normal Equations, 71. Exam 1, 73.
6.3
Lecture 21: October 9, 2013. . . . . . . . . . . . . . . . . . . . . . . . . . 74
Exam Review, 74. Least squares and minimization, 74.
6.4
HW 4: Due October 21, 2013 . . . . . . . . . . . . . . . . . . . . . . . . . 76
iv
Nitsche and Benner
7
Applied Matrix Theory
Linear Transformations
7.1
81
Lecture 22: October 14, 2013 . . . . . . . . . . . . . . . . . . . . . . . . . 81
Linear Transformations, 83. Examples of Linear Functions, 83. Matrix representation
of linear transformations, 83.
7.2
Lecture 23: October 16, 2013 . . . . . . . . . . . . . . . . . . . . . . . . . 84
Basis of a linear transformation, 84. Action of linear transform, 87. Change of Basis, 88.
7.3
Lecture 24: October 21, 2013 . . . . . . . . . . . . . . . . . . . . . . . . . 89
Change of Basis (cont.), 89.
7.4
Lecture 25: October 23, 2013 . . . . . . . . . . . . . . . . . . . . . . . . . 91
Properties of Special Bases, 91. Invariant Subspaces, 93.
7.5
8
HW 5: Due November 4, 2013 . . . . . . . . . . . . . . . . . . . . . . . . 94
Norms
8.1
99
Lecture 26: October 25, 2013 . . . . . . . . . . . . . . . . . . . . . . . . . 99
Difinition of norms, 99. Vector Norms, 99. The two norm, 99. Matrix Norms, 101.
Induced Norms, 102.
8.2
Lecture 27: October 28, 2013 . . . . . . . . . . . . . . . . . . . . . . . . .102
Matrix norms (review), 102. Frobenius Norm, 102. Induced Matrix Norms, 104.
8.3
Lecture 28: October 30, 2013 . . . . . . . . . . . . . . . . . . . . . . . . .106
The 2-norm, 106.
9
Orthogonalization with Projection and Rotation
9.1
109
Lecture 28 (cont.) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .109
Inner Product Spaces, 109.
9.2
Lecture 29: November 1, 2013 . . . . . . . . . . . . . . . . . . . . . . . .110
Inner Product Spaces, 110.
(Gramm-Schmidt), 111.
9.3
Fourier Expansion, 111.
Orthogonalization Process
Lecture 30: November 4, 2013 . . . . . . . . . . . . . . . . . . . . . . . .112
Gramm–Schmidt Orthogonalization, 112.
9.4
Lecture 31: November 6, 2013 . . . . . . . . . . . . . . . . . . . . . . . .115
Unitary (orthogonal) matrices, 116. Rotation, 117. Reflection, 118.
9.5
9.6
HW 6: Due November 11, 2013 . . . . . . . . . . . . . . . . . . . . . . .118
Lecture 32: November 8, 2013 . . . . . . . . . . . . . . . . . . . . . . . .120
Elementary orthogonal projectors, 120.
Subspaces of V, 121. Projectors, 121.
9.7
Elementary reflection, 121.
Complimentary
Lecture 33: November 11, 2013. . . . . . . . . . . . . . . . . . . . . . . .122
Projectors, 122. Representation of a projector, 123.
9.8
Lecture 34: November 13, 2013. . . . . . . . . . . . . . . . . . . . . . . .124
Projectors, 124.
An×n , 126.
9.9
Decompositions of Rn , 125.
Range Nullspace decomposition of
HW 7: Due November 22, 2013 . . . . . . . . . . . . . . . . . . . . . . .126
v
Nitsche and Benner
Applied Matrix Theory
9.10 Lecture 35: November 15, 2013. . . . . . . . . . . . . . . . . . . . . . . .128
Range Nullspace decomposition of An×n , 128. Corresponding factorization of A, 129.
10 Singular Value Decomposition
131
10.1 Lecture 35 (cont.) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .131
Singular Value Decomposition, 131.
10.2 Lecture 36: November 18, 2013. . . . . . . . . . . . . . . . . . . . . . . .132
Singular Value Decomposition, 132. Existence of the Singular Value Decomposition, 133.
10.3 Lecture 37: November 20, 2013. . . . . . . . . . . . . . . . . . . . . . . .136
Review and correction from last time, 136. Singular Value Decomposition, 136. Geometric
interpretation, 138.
10.4 Lecture 38: November 22, 2013. . . . . . . . . . . . . . . . . . . . . . . .139
Review for Exam 2, 139. Norms, 139. More major topics, 140.
10.5 HW 8: Due December 10, 2013 . . . . . . . . . . . . . . . . . . . . . . .142
10.6 Lecture 39: November 27, 2013. . . . . . . . . . . . . . . . . . . . . . . .144
Singular Value Decomposition, 144. SVD in Matlab, 145.
11 Additional Topics
149
11.1 Lecture 39 (cont.) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .149
The Determinant, 149.
11.2 Lecture 40: December 2, 2013 . . . . . . . . . . . . . . . . . . . . . . . .150
Further details for class, 150. Diagonalizable Matrices, 150. Eigenvalues and eigenvectors, 150.
Index
155
Other Contents
157
vi
UNIT 1
Introduction to Linear Algebra
1.1
Lecture 1: August 19, 2013
About the class
The textbook for the class will be Matrix Analysis and Applied Linear Algebra by Meyer.
Another highly recommended text is Laub’s Matrix Analysis for Scientists and Engineers.
Linear Systems
A linear system may be of the general form
Ax = b.
(1.1.1)
This may be represented in several equivalent ways.
2x1 + x2 − 3x3 = 18,
−4x1 + 5x3 = −28,
6x1 + 13x2 = 37.
(1.1.2a)
(1.1.2b)
(1.1.2c)
This also may be put in matrix form

  

2 1 −3
x1
18
−4 0
5 x2  = −28.
6 13
0
x3
37
(1.1.3)
Finally, a the third common form is vector form:


 
 


2
1
−3
18
−4 x1 +  0 x2 +  5 x3 = −28.
6
13
0
37
1
(1.1.4)
Nitsche and Benner
Unit 1. Introduction to Linear Algebra
y
y(t)
t0
t1
t2
t3
···
t
tn
Figure 1.1. Finite difference approximation of a 1D boundary value problem.
Example: Application to boundary value problem
We will use finite difference approximations on a rectangular grid to solve the system,
− y 00 (t) = f (t),
for t ∈ [0, 1],
(1.1.5)
with the boundary conditions
y(0) = 0,
y(1) = 0.
(1.1.6a)
(1.1.6b)
This is a 1D version of the general Laplace equation represented by,
− ∆u = f
(1.1.7)
− ∇2 u = f.
(1.1.8)
or in more engineering/science form
The Laplace operator in cartesian coordinates,
∇2 u = ∇ · (∇u),
= uxx + uyy + uzz .
(1.1.9a)
(1.1.9b)
Finite Difference Approximation
Let tj = j∆t, with j = 0, . . . , N . The approximate forms of the solution yj ≈ y(tj ).
Now we need to approximate the derivatives with discrete values of the variables. The
forward difference approximation is
y 0 (tj ) =
yj+1 − yj
,
tj+1 − tj
(1.1.10)
y 0 (tj ) =
yj+1 − yj
,
∆t
(1.1.11)
or
2
1.1. Lecture 1: August 19, 2013
Applied Matrix Theory
The backward difference approximation is
y 0 (tj ) =
yj − yj−1
.
∆t
(1.1.12)
The centered difference approximation is
yj+1 − yj−1
.
(1.1.13)
2∆t
Each of these are useful approximations to the first derivative that have varying properties
when applied to specific differential equations.
The second derivative may be approximated by combining the approximations of the first
derivative
y 0 (tj ) =
0 0
(y ) (tj ) ≈
0
0
yj+
1 − y
j− 1
2
2
,
∆t
yj+1 −yj
y −y
− j ∆tj−1
= ∆t
,
∆t
yj+1 − 2yj + yj−1
=
.
∆t2
(1.1.14a)
(1.1.14b)
(1.1.14c)
Analysis of error
To understand the error of this approximation we may utilize the Taylor series. A general
Taylor series is
1
1
f (x) = f (a) + f 0 (a)(x − a) + f 00 (a)(x − a)2 + f 000 (a)(x − a)3 + · · ·
(1.1.15)
2
3!
By the Taylor remainder theorem, we may approximate the error with a special truncation
of the series,
1
1
f (x) = f (a) + f 0 (a)(x − a) + f 00 (a)(x − a)2 + f 000 (ξ)(x − a)3 ,
2
3!
(1.1.16)
or simply
1
f (x) = f (a) + f 0 (a)(x − a) + f 00 (a)(x − a)2 + O (x − a)3 .
2
The difference we are interested in to find the error is,
E = y 00 (tj ) −
y(tj+1 ) − 2y(tj ) + y(tj−1 )
∆t2
(1.1.17)
(1.1.18)
The Taylor series,
y(tj+1 ) = y(tj + ∆t) = y(tj ) + y 0 (tj )∆t + O ∆t2 ,
y(tj−1 ) = y(tj − ∆t) = y(tj ) − y 0 (tj )∆t + O ∆t2
(1.1.19a)
(1.1.19b)
will need to be substituted.
A function g is said to be order 2, or g = O(h2 ), if,
|g| ≤ Ch2 .
3
(1.1.20)
Nitsche and Benner
Unit 1. Introduction to Linear Algebra
Solution of the discretized equation
We now substitute the discrete difference,
−
yj+1 − 2yj + yj−1
= f (tj ),
∆t2
for j = 1, . . . , n − 1
(1.1.21)
and the boundary conditions become
y0 = 0,
yn = 0.
This gives the linear system which

2 −1
0

−1
2 −1

 0 −1
2

 . .
.
 .. . . . .
0 ···
will need to be solved for the unknowns yi .



···
0  y 
f (t1 )
1
.. 
...
 f (t2 ) 
.  y 2 




 .. 
.. .
2
...

=
∆t




0  . 
 . 

..
f (tn−2 )
. −1 yn−2 
yn−1
f (tn−1 )
0 −1
2
4
(1.1.22a)
(1.1.22b)
(1.1.23)
UNIT 2
Matrix Inversion
2.1
Lecture 2: August 21, 2013
Previously we came up with a tridiagonal system for finite difference solution last time.
Gaussian Elimination
We want to solve Ax = b. Claim: Gaussian elimination: A = LU
Notation:
A = [aij ]
(2.1.1)
Lower triangular system Lx = b. In class we use underlines to indicate the vector. In
general these vectors are column vectors, and we will use x| to indicate the row vector.
Lower triangular system Lx = b

   
`11 0
0
0
x1
b1
 `21 `22 0 · · · 0     

   
 `31 `32 `21
  ..   .. 
0

 .  =  . 

..
..     
..

. .    
.
`n1 `n2 `n3 · · · `nn
xn
bn
(2.1.2)
`11 x1 = b1
`21 x1 + `22 x2 = b2
···
`n1 x1 + `n2 x2 + · · · + `nn xn = bn
(2.1.3a)
(2.1.3b)
(2.1.3c)
(2.1.3d)
or
5
Nitsche and Benner
Unit 2. Matrix Inversion
Rearranging to solve the equations,
b1
`11
b2 − `21 x1
x2 =
`22
···
x1 =
xi =
(2.1.4a)
(2.1.4b)
(2.1.4c)
bi − `i(i−1) xi−1 + · · · + `i1 x1
`ii
(2.1.4d)
The basic algorithm for solution of the above system in pseudo code follows:
1: x1 ← b1 /`11
2: for i ← 2, n do
P
3:
xi ← [bi − i−1
k=1 `ik xk ]/`ii
4: end for
The operation count, Nops , becomes,
Nops = 1 +
n
X
i=2
1 +
|{z}
division
1
|{z}
substitution
+ (i − 1) + (i − 2) .
| {z }
| {z }
multiplication
(2.1.5)
addition
Each of the terms arise directly from the steps of the algorithm shown above.
ASIDE: Finite sums
We need the following sums for our derivations of the operation counts,
n
X
n(n + 1)
,
2
(2.1.6)
n(n + 1)(2n + 1)
.
6
(2.1.7)
i=
i=1
n
X
i2 =
i=1
Evaluating the operation count,
Nops = 1 +
n
X
(2i − 1),
(2.1.8a)
i=2
=
n
X
(2i − 1),
(2.1.8b)
i=1
=2
n
X
!
i
− n,
(2.1.8c)
i=1
= n(n + 1) − n,
= n2 .
6
(2.1.8d)
(2.1.8e)
2.1. Lecture 2: August 21, 2013
Applied Matrix Theory
Implementation of lower triangular solution in Matlab
We give a Matlab code for this solution,
1
2
3
4
5
6
7
8
9
10
11
12
13
function x = L t r i s o l (L , b )
% s o l v e $Lx = b$ , assuming $L { i i } \ ne 0$
n = length ( b ) ;
% i n i t i a l i z e t h e s i z e o f your v e c t o r s
x1 = b ( 1 ) / l ( 1 , 1 ) ;
for i = 2 : n
x( i ) = b( i ) ;
f o r k = 1 : i −1
x( i ) = x( i ) − l ( i , k) ∗ x(k );
end
end
%
end
This would be saved as the code Ltrisol.m and would be run as
>> L = ...; b = ...;
>> x = Ltrisol(L, b)
Warning: Matlab loops are very slow!
Inner-product based implementation
How do we re-write the code as inner products?
We can reorder the second for-loop so that it is simply an inner-product,
1
2
3
4
5
6
7
8
9
10
function x = L t r i s o l (L , b )
% s o l v e $Lx = b$ , assuming $L { i i } \ ne 0$
n = length ( b ) ;
% i n i t i a l i z e t h e s i z e o f your v e c t o r s
x1 = b ( 1 ) / l ( 1 , 1 ) ;
for i = 2 : n
x ( i ) = ( b ( i ) − l ( i , 1 : i −1)∗x ( 1 : i −1))/ l ( i , i ) ;
end
%
end
Note that the l(i,1:i-1) term is a row vector and x(1:i-1) is a column vector so this
code will work fine. Recall that this required that x be initialized as a column vector. The
inner part can also be rewritten more cleanly as,
1 function x = L t r i s o l (L , b )
2 % s o l v e $Lx = b$ , assuming $L { i i } \ ne 0$
3 n = length ( b ) ;
4 % i n i t i a l i z e t h e s i z e o f your v e c t o r s
5 x1 = b ( 1 ) / l ( 1 , 1 ) ;
6 for i = 2 : n
7
k = 1 : i −1;
8
x ( i ) = (b( i ) − l ( i , k )∗ x ( k ))/ l ( i , i ) ;
9 end
7
Nitsche and Benner
Unit 2. Matrix Inversion
10 %
11 end
Office hours and other class notes
Office hours will be from 12–1 on MWF, the web address is, www.math.unm.edu/~nitsche/
math464.html.
Example: Gauss Elimination
Example:
2x1 − x2 + 3x3 = 13
−4x1 + 6x2 − 5x3 = −28
6x1 + 13x2 − 16x3 = 37
(2.1.9a)
(2.1.9b)
(2.1.9c)
Let’s perform each step in full equation form. So we execute the steps R2 → R2 − (−2)R1
and R3 → R3 − (−3)R1 .
2x1 − x2 + 3x3 = 13
4x2 + x3 = −2
16x2 + 7x3 = −2
(2.1.10a)
(2.1.10b)
(2.1.10c)
Next step will be R3 → R3 − (4)R2 .
2.2
Lecture 3: August 23, 2013
Example: Gauss Elimination, cont.
Example:
2x1 − x2 + 3x3 = 13
−4x1 + 6x2 − 5x3 = −28
6x1 + 13x2 − 16x3 = 37
(2.2.1a)
(2.2.1b)
(2.2.1c)
Let’s perform each step in full equation form. So we execute the steps R2 → R2 − (−2)R1
and R3 → R3 − (−3)R1 .
2x1 − x2 + 3x3 = 13
4x2 + x3 = −2
16x2 + 7x3 = −2
8
(2.2.2a)
(2.2.2b)
(2.2.2c)
2.2. Lecture 3: August 23, 2013
Applied Matrix Theory
Next step will be R3 → R3 − (4)R2 .
2x1 − x2 + 3x3 = 13
4x2 + x3 = −2
3x3 = 6
(2.2.3a)
(2.2.3b)
(2.2.3c)
Now we begin the backward substitution.
x3 = 2;
x2 = (−2 − x3 )/4,
= −1;
x1 = (13 + x2 − 3x3 )/2,
= 3.
(2.2.4a)
(2.2.4b)
(2.2.4c)
(2.2.4d)
(2.2.4e)
Gauss Elimination is forward elimination and backward substitution. Now we will do the
same problem in matrix form,




2 −1
3
13
2 −1 3 13
 −4
6 −5 −28  →  0
4 1 −2 ,
(2.2.5a)
6 13 16
37
0 16 7 −2


2 −1 3 13
4 1 −2 .
(2.2.5b)
→ 0
0
0 3
6
Operation Cost of Forward Elimination
Now we want to know the operation count for the forward elimination step when we take
A → U without pivoting for a general n × n matrix, A = [aij ]. As an example of each step:

a11
a21

a31

a41
a51
a12
a22
a32
a42
a52
a13
a23
a33
a43
a53
a14
a24
a34
a44
a54


a15
a11


a25 
0
0
a35 
→


0
a45 
0
a55
These operations are given by, rowj →
not be close to zero or we will need to
The next step,

a11
0

→
0
0
0
a12
a022
a032
a042
a052
a13
a023
a033
a043
a053

a15
a025 

a035 

a045 
a055
a14
a024
a034
a044
a054
(2.2.6a)
a
rowj − `ij rowi , where `ij = aijii if aii 6= 0 (aii should
a1j
use pivoting). An example, a1j → aij − a11
a1j = 0.
a12
a022
0
0
0
a13
a023
a0033
a0043
a0053
9
a14
a024
a0034
a0044
a0054

a15
a025 

a0035 

a0045 
a0055
(2.2.6b)
Nitsche and Benner
Unit 2. Matrix Inversion
y
y
y(t)
t0
t1
t2
t3
y(t)
t
···
t0 t2 t4 t6 t8 t10 t12 t14 t16 · · · t4n
tn
(a) n grid
t
(b) 4n grid
Figure 2.1. One-dimensional discrete grids.
At ith step (i = 1 : n − 1),
B(n−i)×(n−i) → B̃(n−i)×(n−i) ,
(2.2.7)
the cost of the individual step: n − i + 2(n − i)2 . The total cost is thus,
| {z } | {z }
comp `ij
Nops =
comp aij
n−1
X
(n − i) + 2(n − 1)2
(2.2.8a)
i=1
Let k = n − i then i = 1 → k = n − 1 and i = n − 1 → k = n − (n − 1) = 1
=
1
X
(k + 2k 2 ),
(2.2.8b)
k=n−1
=
(n − 1)n
(n − 1)n(2(n − 1) + 1)
+2
,
2 }
6
| {z
|
{z
}
O(n2 )
(2.2.8c)
O(n3 )
3
≈O n .
(2.2.8d)
This means that the problem scales with order 3.
Cost of the Order of an Algorithm
For an order 3 algorithm, if you increase the size of your matrix by a factor of 2, the expense
of computer time will increase by a factor of 8. Similarly, if it took one day to solve a
boundary value problem in 1D with n = 1000, then it will take 64 days to do n = 4000 (see
figure 2.1).
Alternatively, if you are doing a 2D simulation, increasing by a factor of 4, as shown in
figure 2.2, would increase the domain to 16 and thus the calculations would increase to 163 .
This gets very expensive! This is one of the reasons that models of phenomena such as the
weather is very difficult.
10
2.2. Lecture 3: August 23, 2013
Applied Matrix Theory
y
y
yn
y4n
y0
y0
x
x0
xn
x
x0
(a) n × n grid
x4n
(b) 4n × 4n grid
Figure 2.2. Two-dimensional discrete grids.
Validation of Lower/Upper Triangular Form
Consider that we have the Gaussian Elimination with A = LU, where
L=
1 0
.
`ij 1
(2.2.9)
Check our previous system:

 


2 −1
3
1 0 0
2 −1 3
−4
4 1 .
6 −5 = −2 1 0 0
6 13 16
3 4 1
0
0 3
(2.2.10)
This works!
Theoretical derivation of Lower/Upper Form
We want to show that Gauss elimination naturally leads to the LU form using elementary
row operations. The three elementary operations are:
1. Multiply row by α;
2. Switch rowi and rowj ;
3. Add multiple of rowi to rowj .
All are equivalent to pre-multiplying A by an elementary matrix. Let’s illustrate these:
11
Nitsche and Benner
1. Multiply by α.

1 0 0
0 1 0 · · ·

0 0 α

 ..
 .
0 0 0 ···
{z
|
Ei
Unit 2. Matrix Inversion

a11
0
 a21
0


0
  a31

.. 
. 
an1
1
}
 

a12 a13
a1n
a11
a12
a13
a1n

a22 a23 · · · a2n 
a22
a23 · · · a2n 
  a21

αa31 αa32 αa33

a32 a33
a3n 
αa
3n
=
 



..
..
.. 
. . . ..  
...
.
.
. 
.
an2 an3 · · · ann
an1 an2 an3 · · · ann
(2.2.11a)
2.3
Homework Assignment 1: Due Friday, August 30,
2013
1. Use Taylor series expansions of f (x ± h) about x to show that
f 00 (x) =
f (x + h) − 2f (x) + f (x − h) h2 (4)
4
−
f
(x)
+
O
h
.
h2
12
(2.3.1)
2. Consider the two-point boundary value problem
y 00 (x) = ex ,
1
y(−1) = ,
e
y(1) = e
(2.3.2)
where x ∈ [−1, 1], Divide the interval [−1, 1] into N equal subintervals and apply
the finite difference method presented in class to find the approximate the solution
yj ≈ y(xj ) at the N −1 interior points j = 1, . . . , N −1, where xj = a+jh, h = (b−a)/N ,
and [a, b] = [−1, 1]. Compare the approximate values at the grid points with the exact
solution at the grid points. Use N = 2, 4, 8, . . . , 29 and report the maximal absolute
error for each N in a table. Your writeup should contain:
• the Matlab code;
• a table with two columns. The first contains h, the second contains the corresponding maximal errors. By how much is the error reduced every time N is
doubled? Can you conclude whether the error is O(h), O(h2 ) or O(hp ) for some
other integer p?
Regarding Matlab: If needed, go over the Matlab tutorial on the course website,
items 1–6. This covers more than you need for this problem. In Matlab, type
help diag or help ones
to find what these commands do. The (N −1)×(N −1) matrix with 2s on the diagonal
and –1 on the off-diagonals can be constructed by
v=ones(1,n-1);
A=2*diag(v)-diag(v(1:n-2),1)-diag(v(1:n-2),-1);
12
2.3. HW 1: Due August 30, 2013
Applied Matrix Theory
The system Ax = b can be solved in Matlab by x = A\b. The maximal difference
between two vectors x and y is error=max(abs(x-y)). Your code should have the
following structure
Listing 2.1. code stub for tridiagonal solver
1 disp ( s p r i n t f (
h
error )
2 a=...; b=...;
% Set values of endpoints
3 ya = . . . ; yb = . . . ;
% Set values of y at the endpoints
4 for n = . . . ;
5
h=2/n ;
6
x=a : h : b ;
7
% S e t m a t r i x A o f t h e l i n e a r system t o be s o l v e d .
8
v=o n e s ( 1 , n −1);
9
A=2∗diag ( v)−diag ( v ( 1 : n−2) ,1) − diag ( v ( 1 : n −2) , −1);
10
% S e t r i g h t hand s i d e o f l i n e a r system .
11
rhs = . . .
12
% S o l v e l i n e a r system t o f i n d a p p r o x i m a t e s o l u t i o n .
13
y ( 2 : n)=A\ r h s ; y (1)= ya ; y ( n+1)=yb ;
14
% Compute e x a c t s o l u t i o n and a p p r o x i m a t i o n e r r o r
15
yex = . . .
% set exact solution
16
plot ( x , y , b − , x , yex , r − ) % t o compare v i s u a l l y
17
error=max( abs ( y−yex ) )
18
disp ( s p r i n t f ( %1 5 . 1 0 f %20.15 f , h , e r r o r ) )
19 end
Note that in Matlab the index of all vectors starts with 1. Thus, x=-1:h:1, is a
vector of length n + 1 and the interior points are x(2:n).
3. Let U be an upper triangular n × n matrix with nonzero entries uij , j ≥ i.
(a) Write an algorithm that solves Ux = b for a given right hand side b for the
unknown x.
(b) Find the number of operations that it takes to solve for x, using your algorithm
above.
(c) Write a Matlab function function x=utrisol(u,b) that implements your algorithm and returns the solution x.
4. Given A, b below,
(a) find the LU factorization of A (using the Gauss Elimination algorithm);
(b) use it to solve Ax = b.


 
2 −1
0
0
0
−1



2 −1
0
0.
A=
,
b
=
(2.3.3)
 0 −1
0
2 −1
0
0 −1
2
5
5. Sparsity of L and U, given sparsity of A = LU. If A, B, C, D have non-zeros in the
positions marked by x, which zeros (marked by 0) are still guaranteed to be zero in
13
Nitsche and Benner
Unit 2. Matrix Inversion
their factors L and U? (B, C, D are all band matrices with p = 3 bands, but differing
sparsity within the bands. The question is how much of this sparsity is preserved.) In
each case, highlight the new nonzero entries in L and U.


x 0 x 0 0 0


0 x 0 x 0 0
x x x x


x x x 0 
x 0 x 0 x 0 



,
A=
,
B=

0 x x x
0
x
0
x
0
0


0 0 x 0 x 0
0 0 x x
0 0 0 x 0 x

x
0

x
C=
0

0
0
x
x
0
x
0
0
x
0
x
0
x
0
0
x
0
x
0
x
0
0
x
0
x
0

0
0

0
,
0

0
x

x
0

x
D=
0

0
0
0
x
0
x
0
0
0
0
x
0
x
0
x
0
0
x
0
x
0
x
0
0
x
0

0
0

x
,
0

0
x
6. Consider solving a differential equation in a unit cube, using N points to discretize each
dimension. That is, you have a total of N 3 points at which you want to approximate
the solution. Suppose that at each time step, you need to solve a linear system Ax = b,
where A is an N 3 × N 3 matrix, which you solve using Gauss Elimination, and suppose
there are no other computations involved. Assume your personal computer runs at 1
GigaFLOPS, that is, it executes 109 floating point operations per second.
(a) How much time does it take to solve your problem for N = 500 for 1000 timesteps?
(b) When you double the number of points N , you typically also have to halve the
timestep, that is, double the total number of timesteps taken. By what factor
does the runtime increase each time you double N ?
(c) How much time will it take to solve the problem if you use N = 2000?
14
UNIT 3
Factorization
3.1
Lecture 4: August 26, 2013
For the h in the homework, for n = 2.^(1:1:10). We want to deduce the order of the
method from the table of h and the error.
Elementary Matrices
1. Multiply rowi by α:

1 0 0 0 0
0 . . . 0 0 0 




E 1 = 0 0 α 0 0  .


0 0 0 . . . 0
0 0 0 0 1

(3.1.1)
The inverse is,

E−1
1
1 0 0 0
0 . . . 0 0


= 0 0 α1 0

0 0 0 . . .
0 0 0 0

0
0


0 .

0
(3.1.2)
1
E1 E−1
1 = I
(3.1.3)
2. Exchange rowi and rowj :

1
0

0
E2 = 
0

0
0
0
1
0
0
0
0
0
0
0
1
0
0
15
0
0
1
0
0
0
0
0
0
0
1
0

0
0

0
.
0

0
1
(3.1.4)
Nitsche and Benner
Unit 3. Factorization
E22 = I
3. Replace rowj by rowj + αrowi .

1
0

0
E3 = 
0

0
0

E−1
3
What happens if
the columns instead

a11 a12
 a21 a22


AE1 =  a31 a32

..

.
an1 an2
1
0

0
=
0

0
0
we post-multiply by
of the rows.

1
a13
a1n

a23 · · · a2n 
 0

a33
a3n  
0

. . . .. 
.  0
an3 · · · an
0

a11
 a21


AE2 =  a31


an1
a12 a13
a22 a23
a32 a33
..
.
an2 an3
0
1
0
0
0
0
0
0
1
α
0
0
(3.1.5)
0
0
0
1
0
0
0
0
0
0
1
0

0
0

0
.
0

0
1

0
0

0
.
0

0
1
0 0 0 0
1 0 0 0
0 1 0 0
0 −α 1 0
0 0 0 1
0 0 0 0
(3.1.6)
(3.1.7)
the elementary matrices? The matrices will act on
 

0
a11 a12 αa13
a1n


0
  a21 a22 αa23 · · · a2n 
  a31 a32 αa33
a3n 
0 = 

 
..
.. 
.
.
. . 
.
0 
an1 an2 α an3 · · · an
1
(3.1.8)
 1 0 0 0 0 0
a1n
0 1 0 0 0 0

· · · a2n 

0 0 0 1 0 0


a3n  
(3.1.9)
0 0 1 0 0 0
..  
..


. .  0 0 0 0 1 0
· · · an
0 0 0 0 0 1
0 0 0
.. 0 0
.
0 α 0
0 0 ...
0 0 0
Gaussian Elimination without pivoting
Premultiply by elementary matrices type 3 repeatedly.
aji
,
aii

x
0

E−21 A = 
x
x
x
`ji =
for j > i
x
x
x
x
x
16
x
x
x
x
x
x
x
x
x
x

x
x

x

x
x
(3.1.10)
(3.1.11)
3.1. Lecture 4: August 26, 2013
Applied Matrix Theory

x x x x
0 x x x

E−31 E−21 A = 
0 x x x
x x x x
x x x x
This sequence continues until we have introduced zeros to

x x
0 x

E−n,n−1 · · · E−n1 · · · E−31 E−21 A = 
0 0
0 0
0 0

x
x

x
(3.1.12)


x
x
get the lower diagonal:

x x x
x x x

x x x
(3.1.13)
=U
0 x x
0 0 x
Thus,

1 0
`21 1

0 0
E21 E31 = 
0 0

0 0
0 0
Which extends to
0
0
1
0
0
0
0
0
0
1
0
0
A = E21 E31 · · · En−1,n−2 En,n−2 En,n−1 U
{z
}
|
L

 
0 0
1 0 0 0 0 0
1 0




0 0  0 1 0 0 0 0 `21 1

 
0 0
 `31 0 1 0 0 0 = `31 0



0 0  0 0 0 1 0 0
 0 0
1 0  0 0 0 0 1 0  0 0
0 1
0 0 0 0 0 1
0 0

Ẽ1 = En1 · · · E21 E31
1
`21
`31
..
.
0 0
1 0
0 1
0
0
0
...
0 0
0 0 0
0 0 0




=


`n−1,1
`n1
(3.1.14)
0
0
1
0
0
0
0
0
0
1
0
0
0
0
0
0
1
0

0 0
0 0

0 0

.
0 0

1 0
0 1

0
0

0
 . (3.1.15)
0

0
1
(3.1.16)
This further extends to,

1
`21
`31
..
.
0
1
`32
..
.




Ẽ1 Ẽ2 = 


`n−1,1 `n−1,2
`n1
`n2
0
0
1
0
0
0
.
0 ..
0 0
0 0

0 0
0 0

0 0

.
0 0

1 0
0 1
(3.1.17)
Finally we get that

Ẽ1 Ẽ2 · · · Ẽn−1
1
`21
`31
..
.
0
1
`32
..
.
0
0
1
..
.
0
0
0
..
.
0
0
0




=

0

`n−1,1 `n−1,2 · · · `n−1,n−2
1
`n1
`n2 · · · `n,n−2 `n,n−1
17

0
0

0

.
0

0
1
(3.1.18)
Nitsche and Benner
Unit 3. Factorization
Solution of Matrix using the Lower/Upper factorization
To use A = LU to solve Ax = b.
1. Find L, U (number of operations: 23 n3 )
2. L(Ux) = b First solve Ly = b (number of operations: n2 ), then solve, Ux = y
(number of operations: n2 ).
Example:
To solve Ax = b_k k= 1,10^4
%
Find L, U once O(2/3 n^3)
then solve
L y = b
U x = y
10,000 times
O(10,000 * n^2 * 2)
Sparse and Banded Matrices
Given

x 0 0 0 0
0 . . . 0 0 0




A = 0 0 x 0 0


0 0 0 . . . 0
0 0 0 0 x

(3.1.19)
the bandwidth is 1. Below,

x
x

0
A=
0

0
0
x
x
x
0
0
0
0
x
x
x
0
0
the bandwidth is 3—this is a tridiagonal matrix .
when it undergoes LU decomposition.

 
x x 0 0 0 0
1 0 0 0 0
x x x 0 0 0  x 1 0 0 0

 
0 x x x 0 0 0 x 1 0 0

 
0 0 x x x 0 = 0 0 x 1 0

 
 0 0 0 x x x  0 0 0 x 1
0 0 0 0 x x
0 0 0 0 x
18
0
0
x
x
x
0
0
0
0
x
x
x

0
0

0
,
0

x
x
(3.1.20)
This type of matrix maintains it’s sparsity

0
x


0  0

0
 0

0 
0

0 0
1
0
x
x
0
0
0
0
0
x
x
0
0
0
0
0
x
x
0
0
0
0
0
x
x
0

0
0

0
.
0

x
x
(3.1.21)
3.2. Lecture 5: August 28, 2013
Applied Matrix Theory
Motivation for Gauss Elimination with Pivoting
When does Gauss elimination give us a problem? For example
0 1
1.
1 1
δ 1
1+δ
1
2. A =
. Solve Ax =
, the exact solution is
. However, we run into
1 1
2
1
numerical problems.
3.2
Lecture 5: August 28, 2013
Motivation for Gauss Elimination with Pivoting, cont.
Whendoes Gauss elimination
give
us a problem? Returning
to the example problem, A =
δ 1
1+δ
1
. Solve Ax =
, the exact solution is
, but we run into numerical
1 1
2
1
problems.
There are a couple approaches to this problem. First, solve for x by first finding L, U
and using them numerically,
δ 1
δ
1
=U
(3.2.1)
A=
→
1 1
0 1 − 1δ
and
1
L=
1
δ
0 1
(3.2.2)
Now we want to solve L (Ux) = b
1 f o r j =1:16
2
d e l t a = 10ˆ(− j ) ;
3
b = [1 + delta , 2 ] ;
4
L = [ 1 , 0 ; 1/ d e l t a , 1 ] ;
5
U = [ d e l t a , 1 ; 0 , 1−1/ d e l t a ] ;
6
% S o l v e Ly = b \ t o y
7
y (1) = b ( 1 ) ; y (2) = b(2) − L(2 ,1)∗ y ( 1 ) ;
8
% S o l v e Ux = y \ t o x
9
x (2) = y(2)/u (2 ,2); x (1) = (y (1) − u(1 ,2)∗ x (2))/u (1 ,1);
10
%
11
disp ( s p r i n t f ( ’ %5.0 e %20.15 f %20.15 f %10.8 e ’ , d e l t a , x ( 1 ) , x ( 2 ) ,norm( x − [ 1 , 1 ] ) ) ;
12 end
p
Note that the norm is the Euclidian norm, x − [1, 1] = (x(1) − 1)2 + (x(2) − 1)2 . This
gives us a table of results as shown below
Conclusion: Ax = b is a good problem (well-posed) introducing small perturbations
(e.g., by roundoff) does not change the solution by much. Matlab’s algorithm A\b is a
good algorithm (stable); LU decomposition does not give a good algorithm (unstable).
19
Nitsche and Benner
Unit 3. Factorization
Table 3.1. Variation of error with the perturbation variable
δ
1e-01
1e-02
1e-03
1e-04
1e-05
...
1e-16
x(1)
1.000
1.000
0.999
1.000. . . 28
...
...
0.888
x(2) ||x − [1, 1]||2
1.000
8e-16
1.000
1e-13
1.000
6e-12
1.000
e-11
1.000
e-10
...
...
1.000
e-0
Discussion of well-posedness
Geometrically, Ax = b,
δx1 + x2 = 1 + δ,
x1 + x2 = 2.
(3.2.3a)
(3.2.3b)
This is a well-posed system. Rearranging
x2 ≈ 1 − δx1 ,
x 2 = 2 − x1 .
(3.2.4a)
Our other system Ly = b,
y1 = 1
(3.2.5a)
1
y1 + y2 = 2
δ
(3.2.5b)
This makes a very ill-posed system because small wiggles in δ give much larger errors because
the slopes are so near each other.
Now we consider Ux = y,
δx1 + x2 = 1,
1
1−
x2 = y2 .
δ
(3.2.6a)
(3.2.6b)
This is also ill-posed as well. All of these linear problems are illustrated in figure 3.1.
20
3.2. Lecture 5: August 28, 2013
Applied Matrix Theory
x2
x2
x2
(1, 1)
x1
x1
(a) Ax = b
x1
(b) Ly = b
(c) Ux = y
Figure 3.1. Plot of linear problems and their solutions.
Gaussian elimination with pivoting
Pivoting means we exchange rows such that the current |aii | = max |aji |. Similarly, `ji =
j≥i
aji
aii
≤ 1 for all j > i. Now,
δ 1 1+δ
1 1
2
→
1 1
2
δ 1 1+δ
1
1
2
−−−−−−−→
0 1 − δ 1 + δ − 2δ
1
1
2
→
0 1−δ 1−δ
R2 ←R2 −δR1
(3.2.7a)
(3.2.7b)
(3.2.7c)
PLU always works. Theorem: Gaussian elimination with pivoting yields PA = LU. The
permutation matrix is P. Every matrix has a PLU factorization.
To do the pivoting, at each step, first premultiply A by


1 0 0 0 0 0
 ...

0 0 0 0
0


0 0 0 1 0 0
Pk = 
(3.2.8)

0 0 1 0 0 0


0 0 0 0 . . . 0
0 0 0 0 0 1
then premultiply by

1
0
0
 ...
0
0

0
0
1

Lk = 
0 0 `k−1,k

..
0 0
.
0 0
`n,k
21
0
0
1
1
0
0

0

0 0

0 0

0 0
... 
0
0 1
0
(3.2.9)
Nitsche and Benner
Unit 3. Factorization
We do this in succession,
Ln−1 Pn−1 · · · L2 P2 L1 P1 A = U
(3.2.10)
How do these commute into a useful P and L matrix?
3.3
Lecture 6: August 30, 2013
Discussion of HW problem 2
− yj−1 + 2yj − yj+1 = h2 f (xj ),

2 −1

−1
2

 0 −1

 . .
 .. . .
0 ···
for j = 1, . . . , n − 1.



0 ···
0  y 
f (t1 ) + y0
1
.. 
.


−1 . .
.  y 2 
f (t2 )








..
..
2
...

=
h



.
.
2
0  . 



.. ..
 f (tn−2 ) 
. −1 yn−2 
.
yn−1
f (tn−1 ) + yn
0 −1
2
(3.3.1)
(3.3.2)
So we’ve set up our matrix
rhs = matrix of zeros size \(1 \times n-1\)
for
A_{(n-1)x(n-1)}
x = a:h:b = linspace(a,b,n+1)
rhs = h^2*f(x(2:n));
rhs(1) = rhs(1) + ya;
rhs(n-1) = rhs(n-1) + yb;
Recall that our f (x) = −ex :
− y 00 = −ex
(3.3.3)
PLU factorization
For PLU factorization, we are doing Gauss elimination with pivoting. At each k th step
(k)
of Gaussian elimination, switch rows so that the pivots, akk , are the largest number by
magnitude in the k th column.
For example,

   
1 −1
3
x1
−3
−1
0 −2 x2  =  1.
(3.3.4)
2
2
4
x3
0
22
3.3. Lecture 6: August 30, 2013
Applied Matrix Theory
or




1 −1
3 −3
2
2
4
0
 −1
1  →  −1
1 ,
0 −2
0 −2
2
2
4
0
1 −1
3 −3


0
2
2
4
1 −0
1 ,
→ 0
0 −2
1 −3
row1 ↔ row3
(3.3.5a)
1
1
row2 ← row2 − row1 , and row3 ← row3 − row1
3
2
(3.3.5b)


2
2
4
0

,
−3
0
−2
1
→
row2 ↔ row3
(3.3.5c)
0
1 −0
1


0
2
2
4
1
1
−3  ,
→  0 −2
row3 ← row3 − −
row2
2
0
0 1/2 −1/2
(3.3.5d)
We need to do the back substitution to solve this system. But more importantly, we want
to know what the factorization of this system would be. Recall,

1
0
0
.

0
0 . .

1
0 0
Lk = 
0
0
`

k−1,k

..
0 0
.
0 0
`n,k

0

0 0

0 0
,
0 0

..
. 0
0 1
0
0
0
0
1
0
0
(3.3.6)
and
L−(n−1) Pn−1 · · · L−2 P2 L−1 P1 A = U.
(3.3.7)
Pn−1 · · · L−2 P2 L−1 P1 A = L(n−1) U.
(3.3.8)
Reordering,
We want to move each P to be right next to A and all the Ls such that we can form a true
L. Claim,
Pj L−k = L̃−k Pj ,
j > k.
(3.3.9)
Pj permutation moves columns below the k th row. This allows us to move L’s out.
Pj L−k Pj = L̃−k
(3.3.10a)
˜ P
L̃−n · · · L̃
−1 n−1 · · · P1 A = U
(3.3.11)
23
Nitsche and Benner
Unit 3. Factorization
Now we can return to our example but with keeping track of the






1 −1
3 −3
2
2
4
0
0 0 1
 −1
0 −2
1  →  −1
0 −2
1 ,
row1 ↔ row3 , P1 = 0 1 0
0
2
2
4
1 −1
3 −3
1 0 0
(3.3.12a)

2

− 21
→


→

0
4
1
1
row2 ← row2 − −
row1 , row3 ← row3 − ro
2
2

1 −0
1 ,

−2
1 −3
1
2

2

2
1
2
− 21
(3.3.12b)


0 0 1
row2 ↔ row3 , P2 = 1 0 0
0 1 0

0
4
1 −3 
,

1 −0
1
2
−2
(3.3.12c)


→

2
1
2
− 21
2
−2
− 12

0
−3 
,

1/2 −1/2
4
1
1
row3 ← row3 − −
2
row2
(3.3.12d)
Because P = P−1 , we should remember that,
PA = LU
A = PLU.
3.4
(3.3.13a)
(3.3.13b)
Lecture 7: September 4, 2013
PLU Factorization
Recall
PA = LU
(3.4.1)
always exists by construction. This is because we can make anything non-zero by the permutation. This is also equivalent to,
A = PLU
(3.4.2)
because P = P−1 . To use this in an actual solution,
PAx = Pb,
(3.4.3)
LUx = Pb,
(3.4.4)
or
So this system is determined by:
24
3.4. Lecture 7: September 4, 2013
Applied Matrix Theory
1. Solving Ly = Pb,
2. Solving Ux = y.
In Matlab, we would use the commands [L,U,P] = lu(A), to find these three matrices.
This factorization is not unique. We want to show the uniqueness of the LU factorization,
and are also interested in when it exists.
Triangular Matrices
We are interested in the determinants of lower or upper triangular matrices. Let’s discuss
det(L).


`11 0
0
0
0
 .. . . .

0
0
0 
 .


0 
L =  `i1 · · · `jj 0
(3.4.5)
 .

 .. · · · ... . . . 0 
`n1 · · · `nj . . . `nn
Qn
the determinant is det(L) = i=1 `ii . Thus L is invertible only if `ii 6= 0 for all `ii . We
conjecture the product of two lower triangular matrices will give us lower a triangular matrix.
e.g.
L1 L2 = L12
(3.4.6)
We want to prove this!
Multiplication of lower triangular matrices
Prove that L1 L2 is lower triangular. Assume AB are lower triangular. Show C = AB is
lower triangular. We know that bij aij = 0 for j > i. In our proof, we first consider matrix
multiplication.
X
eij =
aik bkj .
(3.4.7)
We know that aik = 0 for k > i, and bkj = 0 for j > k. If j > i, then when k < i we have
that k < j so bkj = 0. Alternatively, if k > i then aik = 0. Thus, in either case one of the
two products is zero and we have proved our hypothesis.
Inverse of a lower triangular matrix
A lower triangular matrix’s inverse is also

`11 · · ·
 .. . .
−1
L = .
.
`n1
a lower triangular matrix;

0
..  = Lower triangular
. 
· · · `nn
(3.4.8)
So, this helps with inversion of the form,
L−n · · · L−2 L−1 A = U.
25
(3.4.9)
Nitsche and Benner
Unit 3. Factorization
For matrixes of the form

L−k
1
0
0
.

0
0 . .

1
0 0
=
0
0
−`

ij

..
0 0
.
0 0 −`nj
0
0
0
0
1
0
0
0
...
0
0 0
0


0

0
;
0

0
(3.4.10)
1
the inverse matrix is

1
0
0
 ...
0
0

1
0 0
Lk = 
0 0 `ij

..
0 0
.
0 0 `nj
0
0
0
1
0
0

0

0 0

0 0
.
0 0
... 
0
0 1
0
(3.4.11)
For any



1 0 0 0 0
`11 0 0 0
0
 .. . . .
  . ...

0 0 0  ..
0 0
0 
.



0 
Lk = 0 · · · 1 0 0  0 · · · `ii 0
.
 .

.. . .
.. . .
 .. 0
. 0  ..
. 0 
.
0
.
0 · · · `in . . . 1
0 · · · 0 . . . `nn
(3.4.12)
GE
To find L−1 , [L I] −−→ [I L−1 ]. Use Gaussian elimination on L, and we go through each
column.
Uniqueness of LU factorization
Theorem: If A is such that no non-zero pivots are encountered, then A = LU with `ii = 1
a
and uii 6= 0, which are the pivots. For, `ij = aijii for j < i by construction.
Proof: Assume A = L1 U1 = L2 U2 , then
L−1
2 L1 U1 = U2 ,
−1
L−1
2 L1 = U2 U1
= diagonal matrix
= I.
(3.4.13a)
(3.4.13b)
(3.4.13c)
(3.4.13d)
If this is the case, then L−1
2 L1 = I or L2 = L1 , and similarly U2 = U1 . Thus these matrices
are the same and the solution must be unique.
Existence of the LU factorization
Theorem: A = LU with no zero pivots, then all leading principal submatrices Ak are nonsingular. We define the leading principle sub matrices Ak of An×n is Ak = A(1:k),(1:k) . These
are the upper-left square matrices of the full matrix.
26
3.5. Lecture 8: September 6, 2013
Applied Matrix Theory
Part 2. A = LU then define Ak 6= 0 for any k. We want to prove that if A = LU, show
that Ak is invertible. Then if Ak is invertible show that A = LU.
3.5
Lecture 8: September 6, 2013
About Homeworks
The median score was 50 out of 60. A histogram was shown with the general grade distribution. 1 around 10, 3 around 25, 1 around 40, 4 from 45–50, 4 from 50–55, 6 from 55–60.
Comments: write in working Matlab code. Also, L must have ones on the diagonal, while
U has pivots on the diagonal. “Computing efficiently” means using the LU decomposition,
not invert the matrix A.
For homework 2, we will have applications of finding the inverse of A or solve
AX = I
(3.5.1)
A x1 x2 · · · xn = e1 e2 · · · en
(3.5.2)
Axj = ej ,
(3.5.3)
or
To find A−1 , solve
for all j = 1, 2, . . . , n. Use the LU decomposition.
Discussion of ill-conditioned systems
We define Ax = b as an ill-conditioned system if small changes in A or b introduces large
changes in the solution. Geometrically we showed this interpretation previously on a 2 × 2
system, and we noted that the slopes were very similar to each-other. Numerically, we have
trouble because the roundoff when we solve Ãx = b. We also may compute a condition
number which tells us the amplification factor of errors in the system.
In Matlab, the command cond(A) gives you the condition. This should hopefully be
under a thousand. The condition number essentially tells you how much accuracy you can
expect to get from the final solution. In other words, if your condition number is 1 × 105
then you can only expect to have about 11 significant digits in our solution at floating point
arithmetic.
27
Nitsche and Benner
Unit 3. Factorization
Inversion of lower triangular matrices
Show that if A is a lower triangular matrix then so is A−1 . So let’s solve AX = I with A
lower triangular.






x
x
x
x
x
0
x
x
x
x
0
0
x
x
x
0
0
0
x
x
0
0
0
0
x
1
0
0
0
0
0
1
0
0
0
0
0
1
0
0
0
0
0
1
0
0
0
0
0
1






→







→





→





→


x
x
x
x
x
0
x
x
x
x
0
0
x
x
x
0
0
0
x
x
x
x
x
x
x
0
x
x
x
x
0
0
x
x
x
x
x
x
x
x
0
x
x
x
x
0
0
x
x
x
0
0
0
x
x
x
x
x
x
x
0
x
x
x
x
0
0
x
x
x
0
0
0
x
x
1
y
y
y
y
0
0
0
0
x
0
0
0
x
x
0
1
0
0
0
1
y
y
y
y
0
0
0
0
x
0
0
0
0
x
0
0
0
0
x
0
0
1
0
0
0
1
y
y
y
1
y
y
y
y
0
0
0
1
0
0
0
1
0
0
0
1
y
y
y
1
y
y
y
y
0
0
0
1
0
0
0
1
y
y
0
1
y
y
y

0
0
0
0
1


,


0
0
0
0
1
0
0
0
1
0
0
0
1
y
y



,


0
0
0
0
1
0
0
0
1
y
(3.5.4a)
(3.5.4b)



,


0
0
0
0
1
(3.5.4c)



 . (3.5.4d)


We now have shown that we can get the lower triangular matrix A into the form LD. Now
we do backward substitution to get our X. In this case this is simply deviding each row by
the value of the pivot of that row. In this way with D = U, we have X = D−1 L−1 .
Example of LU decomposition of a lower triangular matrix
Given the matrix,

 


2 0 0
1 0 0
2 0 0
1 3 0 =  1 1 0 0 3 0 ,
2
2 1 4
1 13 1
0 0 4
= LU.
28
(3.5.5a)
(3.5.5b)
3.6. Lecture 9: September 9, 2013
Applied Matrix Theory
Banded matrix example
Exercise 3.10.7: Band matrix A with bandwidth w is a matrix with aij = 0 if |i − j| > w.
If w = 0, we have a diagonal matrix.


a11 0
0
0
0
 0 a22 0
0
0



0 a33 0
0
Aw=0 =  0
(3.5.6)
.
0
0
0 a44 0 
0
0
0
0 a55
For bandwidth, w = 1,

Aw=1

a11 a12 0
0
0
a21 a22 a23 0
0


.
0
a
a
a
0
=
32
33
34


0
0 a43 a44 a45 
0
0
0 a54 a55
(3.5.7)
For bandwidth, w = 2,

Aw=2
a11
a21

=
a31
0
0
a12
a22
a32
a42
0
a13
a23
a33
a43
a53

0
0
a24 0 

a34 a35 
.
a44 a45 
a54 a55
(3.5.8)
In the LU decomposition these zeros are preserved. However there are other cases (as shown
in the homework) where the zeros may not be preserved.
We will return to our theorem on Monday. For the homework, a matrix has an LU
decomposition if and only if all principle submatrices are invertible.
3.6
Lecture 9: September 9, 2013
Existence of the LU factorization (cont.)
When does LU factorization exist? Theorem: If no zero pivots that appears in Gaussian
elimination (including the nth one) then A = LU, `ii = 1 and uii 6= 0 are pivots. Then L,
U are unique.
Theorem: A = LU if and only if the leading principle submatrices Ak is invertible.
Proof: Assume (for block matrices of length k × k, n − k × n − k and the difference)
A = LU,
L11 0
U11 U12
=
,
L21 L22
0 U22
L11 U11 L11 U12
=
L21 U11 L22 U22
29
(3.6.1)
(3.6.2)
(3.6.3)
Nitsche and Benner
Unit 3. Factorization
Q
Now our question: is Ak = L11 U11 ? We know that det L11 = kj=1 `jj 6= 0 so L11 is
Q
invertible. Similarly, U11 = kj=1 ujj 6= 0 so it is also invertibles. Since we know that the
product of two invertible matrices is also invertible, Ak must also be invertible.
We will now do a proof by induction: If we assume that all Ak are invertible. Show that
A = LU.
ASIDE: Example of proof by induction.
We want to show,
n
X
j2 =
j=1
n(n + 1)(2n + 1)
.
6
(3.6.4)
The steps of proof by induction are
1. First we show that this holds for n = 1,
2. next we assume it holds for n,
3. finally we show that it holds for n + 1.
Let’s show the third step,
n+1
X
j2 =
j=1
n
X
j 2 + (n + 1)2 ,
(3.6.5a)
j=1
=
=
=
=
=
n(n + 1)(2n + 1)
+ (n + 1)2 ,
6
n(n + 1)(2n + 1) + 6(n + 1)2
,
6
(n + 1) [n(2n + 1) + 6(n + 1)]
,
6
2
(n + 1) 2n + 7n + 1
,
6
(n + 1)(n + 2)(2n + 3)
.
6
(3.6.5b)
(3.6.5c)
(3.6.5d)
(3.6.5e)
(3.6.5f)
Which is what would be expected, and we have proved this relation by induction.
So for our system,
1. First we show that this holds for n = 1,
A = [a11 ] = [1] [a11 ] where a11 6= 0.
2. Assume true for n:
If Ak , k = 1, . . . , n are invertible, then An×n = Ln×n Un×n .
3. Show it holds for n + 1.
So let’s move onto the third step, assume A(n+1)×(n+1) with Ak , k = 1, . . . , n+1 are invertible.
By induction assumption An = Ln Un , since A1 , . . . , An are invertible. Now we need to show
that An+1 = Ln+1 Un+1 ,
An b
An+1 =
,
(3.6.6a)
c| α
Ln Un b
=
,
(3.6.6b)
c|
α
Ln 0 Un x
=
.
(3.6.6c)
|
y| 1
0 β
30
3.6. Lecture 9: September 9, 2013
Applied Matrix Theory
−1
We want Ln x = b so we let x = L−1
n b which supposes that Ln exists. We also want
|
|
y| Un = c| so we let y| = c| U−1
n . Finally, we want y x + β = α, so we let β = α − y x. We
know,
An+1
Ln Un
=
c|
Ln
=
| −1
c Un
b
,
α
0 Un
L−1
n b
.
|
−1
1
0 α − c| U−1
n Ln b
(3.6.7a)
(3.6.7b)
Since A = An+1 is invertible, we must have β 6= 0 because if β = 0 then det(Ln+1 ) det(Un+1 ) =
0, in which case An+1 would not be invertible. So, An+1 has an LU decomposition and by
principle of induction we have proven our theorem.
Rectangular matrices
For a rectangular matrix Am×n ∈ Rm×n . Our question: is Ax = b solvable? Is the solution
unique? We are presented with there options: no solution, unique solution, or infinitely
many solutions. We are going to do Gaussian elimination to reduce the form of the matrix
to see how many solutions we will have. So we will do row echelon form (REF) reduction.
Example of row echelon form

1
2
A=
1
2

1
0
→
0
0

1
0
→
0
0

1
0
→
0
0
2
4
2
4
1
0
3
0
3
4
5
4

3
4
,
5
7

2
1
3
3
0 −2 −2 −2
,
0
2
2
2
0 −2 −1
1

2 1 3 3
0 1 1 1
,
0 0 0 0
0 0 0 2

2 1 3 3
0 1 1 1
.
0 0 0 1
0 0 0 0
(3.6.8a)
(3.6.8b)
(3.6.8c)
(3.6.8d)
Where we made interchanges to have leading ones for the columns. What do we know about
our matrix A from this information? First, we know what columns are linearly independent.
We are trying to find the column space of our matrix.
31
Nitsche and Benner
3.7
Unit 3. Factorization
Homework Assignment 2: Due Friday, September
13, 2013
1. Textbook 3.10.1 (a, c): LU and PLU factorizations


1 4 5
Let, A = 4 18 26.
3 16 30
(a) Determine the LU factors of A
(c) Use the LU factors to determine A−1
2. Textbook 3.10.2
Let A and b be the matrices,


1 2
4 17
3 6 −12 3

A=
2 3 −3 2
0 2 −2 6
and
 
17
 3

b=
 3.
4
(a) Explain why A does not have an LU factorization.
(b) Use partial pivoting and find the permutation matrix P as well as the LU factors
such that PA = LU.
(c) Use the information in P, L, and U to solve Ax = b.
3. Textbook 3.10.3


ξ 2 0
Determine all values of ξ for which A = 1 ξ 1 fails to have an LU factorization.
0 1 ξ
4. Textbook 3.10.5
If A is a matrix that contains only integer entries and all of its pivots are 1, explain
why A−1 must also be an integer matrix. Note: This fact can be used to construct
random integer matrices that posses integer inverses by randomly generating integer
matrices L and U with unit diagonals and then constructing the product A = LU.
5. Lower triangular matrices
Let A be a 3 × 3 matrix with real entries. We showed that GE is equivalent to finding
lower triangular matrices L−1 and L−2 such that L−2 L−1 A = U where U is upper
triangular and,




1
0 0
1
0
0
1
0 ,
L−1 = −`21 1 0 ,
L−2 = 0
(3.7.1)
−`31 0 1
0 −`32 1
32
3.7. HW 2: Due September 13, 2013
Applied Matrix Theory
with

(L−1 )−1

1 0 0
= `21 1 0 = L1 ,
`31 0 1
(L−2 )−1


1 0 0
= 0 1 0 = L2 .
0 `32 1
(3.7.2)
It follows that A = L2 L1 U. Show that


1
0 0
L2 L1 = `21 1 0 .
`31 `32 1
(3.7.3)
Show by example that generally,
L2 L1 6= L1 L2
(3.7.4)
That is, the order in which these lower triangular matrices are multiplied matters.
6. Textbook 1.6.4: Conditioning
Using geometric considerations, rank the following three systems according to their
condition.
(a)
1.001x − y = 0.235,
x + 0.0001y = 0.765.
(b)
1.001x − y = 0.235,
x + 0.9999y = 0.765.
(c)
1.001x + y = 0.235,
x + 0.9999y = 0.765.
7. Textbook 1.6.5
Determine the exact solution of the following system:
8x + 5y + 2z = 15,
21x + 19y + 16z = 56,
39x + 48y + 53z = 140.
Now change 15 to 14 in the first equation and again solve the system with exact
arithmetic. Is the system ill-conditioned?
33
Nitsche and Benner
Unit 3. Factorization
8. Textbook 1.6.6
Show that the system
v−w−x−y−z
w−x−y−z
x−y−z
y−z
z
= 0,
= 0,
= 0,
= 0,
= 1,
is ill-conditioned by considering the following perturbed system:
v − w − x − y − z = 0,
−
1
v+w−x−y−z
15
1
− v+x−y−z
15
1
− v+y−z
15
1
− v+z
15
34
= 0,
= 0,
= 0,
= 1.
UNIT 4
Rectangular Matrices
4.1
Lecture 10: September 11, 2013
Rectangular matrices (cont.)
We are interested in a rectangular matrix, Am×n . We may apply REF, or RREF to find the
column dependence, what the basic columns are, and what the rank of the matrix is. This
way we can find for any system Ax = b, whether the system is consistent and find all the
solutions; whether it is homogeneous, or what the free variables are; and what the particular
solutions are. Last time’s example, we went from

1
2
A =
1
2

1
0
→
0
0
2
4
2
4
1
0
3
0
3
4
5
4
2
0
0
0
1
2
0
0
3
2
0
0

3
4
,
5
7

3
2
.
3
0
(4.1.1a)
(4.1.1b)
The first, third, and fifth columns have pivots and are the basic columns. They correspond
to the linearly independent columns in A. How do we write the other two columns (c2 , c4 )
as functions of the other three columns? We can notice that, c2 = 2c1 , and similarly c4 =
2c1 + c3 . The reduced row echelon form (RREF) has pivots on 1, and zeros below and above
35
Nitsche and Benner
Unit 4. Rectangular Matrices
x2
x2
x2
x1
x1
(a) Intersecting system (one
solution)
x1
(b) Parallel system (no solution)
(c) Equivalent system (infinite solutions)
Figure 4.1. Geometric illustration of linear systems and their solutions.
all pivots. So,

1
0

0
0
2
0
0
0
1
2
0
0
3
2
0
0
 
3
1
0
2
 →
3 0
0
0

1
0
→
0
0

1
0
→
0
0
2
0
0
0
1
1
0
0
3
1
0
0
2
0
0
0
1
1
0
0
3
1
0
0
2
0
0
0
0
1
0
0
2
1
0
0

3
1
,
1
0

0
0
,
1
0

0
0
.
1
0
(4.1.2a)
(4.1.2b)
(4.1.2c)
In this form, the basic columns are very clear, and the relations between the dependent
columns and the basic columns is also obvious. So again we can see that, c2 = 2c1 and
c4 = 2c1 + 1c3 . The rank of the matrix is the number of linearly independent columns, which
is also the number of linearly independent rows, and also the number of pivots in row-echelon
form of the matrix. A consistent system, Ax = b is a system that has at least one solution.
It is inconsistent if it has no solutions. To determine if Ax = b is consistent, in a 2 × 2
system, Ax = b,
a11 x1 + a12 x2 = b1 ,
a21 x1 + a22 x2 = b2 .
(4.1.3a)
(4.1.3b)
Since this system is a linear system we can see three cases: one intersection, parallel and
separated, and parallel and the same. Each of these cases are illustrated in Figure 4.1.
In general, for any size matrix, we find the row echelon form of the augmented system
36
4.1. Lecture 10: September 11, 2013
Applied Matrix Theory
h
i
[A b] → E b̃ .


x x x x x
0 x x x x
0 0 0 0 α
(4.1.4)
If α 6= 0, then the system is inconsistent. So Ax = b is consistent if rank([A b]) = rank(A).
If α = 0 then b̃ is not a basic column of (A b). The we can write b̃ as a linear combination
of the basic columns of E. We can write b as linear combinations of basic columns of A.
In our example, we had c1 , c3 , and c5 where the basic columns and Ax = b was consistent.
Here then if we were to preform a reduction, the b = x1 c1 + x3 c3 + x5 c5 , or in other words,
 
x1
0
 

A
(4.1.5)
x3  = b.
0
x5
Example of RREF of a Rectangular Matrix
Given the matrix,

1
 2

 2
3
1
2
2
5
2
4
4
8
2
4
4
6
1
3
2
5
 
1 1
1
 0 0
1 
 →
2   0 0
3
0 2

1 1
 0 2
→
 0 0
0 0
2
0
0
2
2
0
0
0
2
2
0
0
2
0
0
0

1
1
1 −1 
,
0
0 
0
2

1
1
2
0 
.
1 −1 
0
0
(4.1.6a)
(4.1.6b)
Thus, our system is consistent. We have that rank([A b]) = rank(A). Similarly, we observe
that we have 3 basic columns, r, and 2 linearly dependent columns, n − r. (If n > m, then
n > r, so n − r 6= 0). Let’s continue on to perform the reduced row echelon form.

 

1 1 2 2 1
1
1 1 2 2 1
1
 0 2 2 0 2

0 
0 

  0 1 1 0 1

(4.1.7a)
 0 0 0 0 1 −1  →  0 0 0 0 1 −1 ,
0 0 0 0 0
0
0 0 0 0 0
0


1 1 2 2 0
2
 0 1 1 0 0
1 

→
(4.1.7b)
 0 0 0 0 1 −1 ,
0 0 0 0 0
0


1 0 1 2 0
1
 0 1 1 0 0
1 

→
(4.1.7c)
 0 0 0 0 1 −1 .
0 0 0 0 0
0
37
Nitsche and Benner
Unit 4. Rectangular Matrices
Thus our b̃ = 1c̃1 + 1c̃2 − 1c̃5 . Therefore, b = 1c1 + 1c2 − 1c5 , and
 
1
 1
 

x=
 0 .
 0
−1
(4.1.8)
So in review,

1
 2

 2
3
1
2
2
5
2
4
4
8
2
4
4
6
1
3
2
5


1
1


1 
0
→
 0
2 
3
0
0
1
0
0
1
1
0
0
2
0
0
0

1
0
0
1 
.
1 −1 
0
0
(4.1.9)
We found a particular solution, xp = (1 1 0 0 − 1)| of Ax = b. For any solution xh of
Ax = 0, we have that A (xp + xH ) = b + 0. So (xp + xH ) also solves Ax = b.
4.2
Lecture 11: September 13, 2013
Solving Ax = b
Ax = b is consistent if rank[A | b] = rank(A). We have that b is a nonbasic column of
[A | b]. We can express b in terms of columns of A to get a solution Axp = b. The set of all
solutions is xp + xH , where Axp = b has the particular solution to Ax = b. We also solve
AxH = 0, and get all homogeneous solutions, xH . Since we can add these two solutions, we
have A (xp + xH ) = b.
Now to actually find the particular solution, xp , we write b in terms of basic columns.
To find the homogeneous solutions, xH , we solve Ax = 0 by solving for basic variables xi
in terms of the n − r free variables. Basic variables correspond to basic columns, while free
variables correspond to nonbasic columns. Note that if n > r then the set of columns is
linearly independent and we can find x 6= 0 such that Ax = 0.
Example
From our example

1
 2

 2
3
1
2
2
5
2
4
4
8
2
4
4
6
1
3
2
5
 
1
1


1   0
→
2   0
3
0
0
1
0
0
1
1
0
0
2
0
0
0

0
1
0
1 
,
1 −1 
0
0
(4.2.1a)
we have that
b = a:1 + a:2 − a:5 ,
= x1 a:1 + x2 a:2 − x5 a:5 ,
= Axp ,
|
where xp = 1 1 0 0 −1 .
38
(4.2.2a)
(4.2.2b)
(4.2.2c)
4.2. Lecture 11: September 13, 2013
Applied Matrix Theory
Solve,

1
 0
[A | 0] = 
 0
0
0
1
0
0
1
1
0
0
2
0
0
0
0
0
1
0

0
0 
.
0 
0
(4.2.3a)
This gives us the three equations for the homogeneous solutions,
x1 = −x3 − 2x4 ,
x2 = −x3 ,
x5 = 0.
(4.2.4a)
(4.2.4b)
(4.2.4c)
This gives us the homogeneous solutions of the form,


−x3 − 2x4
 −x3 


,
x
xH = 
3




x4
0
 
 
−1
−2
−1
 0
 
 

 
= x3 
 0 + x4  0.
 0
 1
0
0
(4.2.5a)
(4.2.5b)
Thus the set of all solutions are,
x = xp + xH ,
 
 
 
1
−1
−2
 1
−1
 0
 
 
 

 
 
=
 0 + x3  0 + x4  0 .
 0
 0
 1
−1
0
0
(4.2.6a)
(4.2.6b)
This solves Ax = b for any x3 and x4 . Therefore we have infinitely many solutions. Not we
can only have unique solutions if n = r.
Linear functions
We have any function f : D → R is a linear function if
1. f (x + y) = f (x) + f (y),
2. f (αx) = αf (x).
39
Nitsche and Benner
Unit 4. Rectangular Matrices
For example, f (x) = ax + b, with b 6= 0.
f (x + y) = (ax + b) + (ay + b) ,
= a(x + y) + 2b,
6= a(x + y) + b.
(4.2.7a)
(4.2.7b)
(4.2.7c)
Thus this is not a linear function. However when b = 0, the function f (x) = ax can be
verified to be linear.
Example: Transpose operator
The transpose operator is f (A) = A| . Define that if A = [aij ], then A| = [aji ] and
A∗ = A| = [āji ]. Is this linear?
|
f (A + B) = (A + B) ,
(4.2.8a)
|
= [aij + bij ] ,
= [aji + bji ] ,
|
|
=A +B .
(4.2.8b)
(4.2.8c)
(4.2.8d)
To check the second criterion,
|
f (αA) = [αA] ,
(4.2.9a)
|
= α [A] ,
= αf (A).
(4.2.9b)
(4.2.9c)
So this operator is linear.
Example: trace operator
P
aii .
X
f (A + B) =
(aii + bii ) ,
The trace operator is f (A) = tr(A) =
i
(4.2.10a)
i
=
X
aii +
X
i
bii ,
(4.2.10b)
i
= tr(A) + tr(B).
(4.2.10c)
The second cirterion,
f (αA) = tr(αA),
X
=
αaii ,
(4.2.11a)
(4.2.11b)
i
=α
X
aii ,
(4.2.11c)
= α tr(A),
= αf (A).
(4.2.11d)
(4.2.11e)
i
We have therefore shown that this is a linear operator.
40
4.2. Lecture 11: September 13, 2013
Applied Matrix Theory
Matrix multiplication
Given,
a b
A=
,
c d
B=
ã b̃
.
c̃ d˜
(4.2.12)
Then consider
ax1 + bx2
f (x) = Ax =
,
cx1 + dx2
g(x) = Bx =
ãx1 + b̃x2
˜ 2 .
c̃x1 + dx
(4.2.13)
Take
f (g(x)) = A (Bx) ≡ ABx.
(4.2.14)
But,
˜ 2)
a(ãx1 + b̃x2 ) + b(c̃x1 + dx
f (g(x)) =
˜ 2) ,
c(ãx1 + b̃x2 ) + d(c̃x1 + dx
˜ 2
(aã + bc̃)x1 + (ab̃ + bd)x
=
˜ 2 ,
(cã + dc̃)x1 + (cb̃ + dd)x
aã + bc̃ ab̃ + bd˜ x1
,
=
cã + dc̃ cb̃ + dd˜ x2
≡ AB.
Now if we define AB = [Ai: B:j ] or Ai: B:j =
| {z }
(4.2.15a)
(4.2.15b)
(4.2.15c)
(4.2.15d)
Pn
k=1
Aik Bkj . We get that matrix multiplication
(AB)ij
is not generally commutative, or AB 6= BA. If AB = 0 then either A = 0 or B = 0 unless
A or B are invertible. Further we know that we have the distributive properties,
A (B + C) = AB + AC,
(4.2.16)
(A + B) D = AD + BD,
(4.2.17)
(AB) C = A (BC) .
(4.2.18)
or
and the associative property
A property of the transpose operator is,
|
|
|
(AB) = B A ,
(4.2.19)
tr(AB) = tr(BA).
(4.2.20)
which also helps to understand that,
Note, however, that tr(ABC) 6= tr(ACB) as we will demonstrate on the homework.
41
Nitsche and Benner
Unit 4. Rectangular Matrices
Proof of transposition property
We want to prove the useful property,
|
|
|
(AB) = B A .
(4.2.21)
Dealing with our left hand side of the equation,
LHS :
|
|
(AB) = (AB)ij ,
(4.2.22a)
= [(AB)ji ],
= [Aj: B:i ].
(4.2.22b)
(4.2.22c)
Manipulating the right hand side of the property,
h
i
| |
| |
RHS : B A = B A ij ,
| |
= Bi: A:j ,
= [B:i Aj: ],
= [Aj: B:i ],
= LHS.
(4.2.23a)
(4.2.23b)
(4.2.23c)
(4.2.23d)
(4.2.23e)
Thus, we have proved the identity.
4.3
Lecture 12: September 16, 2013
We will be having an exam on September 30th .
Inverses
We define: A has an inverse if each A−1 exists such that,
AA−1 = A−1 A = I.
(4.3.1)
We also have the properties:
• (AB)−1 = B−1 A−1 ,
−1
• (A| )
−1
• (A−1 )
|
= (A−1 ) ,
= A.
What about the inverse of sums (A + B)−1 ? There are the special cases,
−1
• low rank perturbations of In×n : (I + CD| ) , where C, D ∈ Rn×k or the matrices are
of rank k.
• small perturbation of I : (I + A)−1 , where ||A||.
42
4.3. Lecture 12: September 16, 2013
Applied Matrix Theory
We have a rank-1 matrix uv| , with u, v ∈ Rn = Rn×1 .
 
u1
u2 
 
|
uv =  ..  v1 v2 · · · vk ,
.
uk


u1 v1 u1 v2 · · · u1 vk
u2 v1 u2 v2 · · · u2 vk 


=  ..
..
.. ,
...
 .
.
. 
uk v1 uk v2 · · · uk vk


u1 v|
u2 v| 


=  .. .
 . 
uk v|
(4.3.2a)
(4.3.2b)
(4.3.2c)
Now let’s say we have an example where all matrix entries are zero except for αij at some
point (i, j).

  
0
0
···
0

  .. 

 .
 ..
..  = α 0 · · · 1 · · · 0,
(4.3.3a)
 
.
α
.

 .

  .. 
0
···
0
0
|
= αei ej .
(4.3.3b)
Low rank perturbations of I
We make the claim the if u, v are such that v| u + 1 6= 0 then
I + uv
| −1
=I−
uv|
1 + v| u
(4.3.4)
Proof:
I + uv
|
uv|
I−
1 + v| u
uv|
u (v| u) v|
|
+
uv
−
,
1 + v| u
1 + v| u
uv|
(v| u)
|
|
=I−
+
uv
−
|
| uv ,
1+v
u
1
+
v
u
1
(v| u)
|
= I − uv
,
| +1−
|
1+v u | 1+v u
1
+
v u
|
= I − uv 1−
,
1 + v| u
= I.
=I−
43
(4.3.5a)
(4.3.5b)
(4.3.5c)
(4.3.5d)
(4.3.5e)
Nitsche and Benner
Unit 4. Rectangular Matrices
|
So if c, d ∈ Rn such that d (A−1 c) + 1 6= 0, we are interested in A−1 .
| −1
| A + cd
= A I + A−1 cd ,
|
= I + A−1 c d A−1 ,
| A−1 cd
= I−
A−1 ,
| −1
1+d A c
|
A−1 cd A−1
−1
=A −
.
|
1 + d A−1 c
(4.3.6a)
(4.3.6b)
(4.3.6c)
(4.3.6d)
The Sherman–Morrison Formula
The Sherman–Morrison formula states that if A is invertible and C, D ∈ Rn×k such that
I + D| A−1 C is invertible. Then,
−1 | −1
| −1
|
A + CD
= A−1 − A−1 C I + D A−1 C
D A
(4.3.7)
Finite difference example with periodic boundary conditions
Previously, we had,
−y 00 = f,
y(a) = ya ,
y(b) = yb .
on [a, b],
(4.3.8a)
(4.3.8b)
(4.3.8c)
We get the finite difference approximation of,




  
f1
2 −1
0
0
y0
y1
  y2 
 ..   .. 
−1
2
−1
·
·
·
0


 .  .

..




  
2
y
. 0  3  = h 
2
 fi  +  0  .
 0 −1
 . 
 .  .

.. . . . . ..
 ..   .. 

. .  .. 
.
.
yn−1
0
0
0 ··· 2
fn−1
yn
(4.3.9)
If we instead use periodic boundary conditions we have perturbed our solution,
2 −1
−1
2


 0 −1

..

.
−1
0

−y 00 = f,
on [a, b],
y(a) = y(b),
y 0 (a) = y 0 (b).




0
−1
f1
y1


 .. 
−1 · · ·
0
  y2 
 . 
...




2
0  y3  = h2  fi  .





.
..  .. 
... ...
 ... 
.
yn−1
0 ···
2
fn−1
(4.3.10a)
(4.3.10b)
(4.3.10c)
In this case the Shermann–Morrison formula would help greatly with our inversion.
44
(4.3.11)
4.3. Lecture 12: September 16, 2013
Applied Matrix Theory
Examples of perturbation
Given a matrix
1 2
A=
,
1 3
3 −2
−1
A =
.
−1
1
1 2
B=
,
2 3
0 0
=A+
,
1 0
|
= A + e2 e1 ,
(4.3.12a)
(4.3.12b)
(4.3.12c)
(4.3.12d)
(4.3.12e)
Applying the Shermann–Morrison formula
0 0
A
A−1
1
0
= A−1 −
,
1 + e|1 A−1 e2
3 −2
0
0
−1
1
3 −2
,
= A−1 −
1
−
2
−6
4
−1
=A −
,
3 −2
9 −2
=
.
−4
3
−1
B−1
(4.3.12f)
(4.3.12g)
(4.3.12h)
(4.3.12i)
Small perturbations of I
We want to show what happens when we have small perturbations from the identity matrix
I;
?
(I − A)−1 = I + A + A2 + · · · ,
(4.3.13)
when kAk < 1.
We first consider the geometric series,
1
,
1−x
∞
X
=
xn ,
(1 − x)−1 =
(4.3.14a)
(4.3.14b)
n=0
= 1 + x + x2 + x3 + · · ·
when |x| < 1.
To be continued. . .
45
(4.3.14c)
Nitsche and Benner
4.4
Unit 4. Rectangular Matrices
Lecture 13: September 18, 2013
Small perturbations of I (cont.)
We want to show what happens when we have small perturbations from the identity matrix
I;
?
(I − A)−1 = I + A + A2 + · · · ,
(4.4.1)
when kAk < 1.
We first consider the geometric series,
1
,
1−x
∞
X
=
xn ,
(1 − x)−1 =
(4.4.2a)
(4.4.2b)
n=0
= 1 + x + x2 + x3 + · · · ,
(4.4.2c)
when |x| < 1. This is proved as follows,
S=
n
X
xk ,
(4.4.3a)
k=0
S − xS = 1 + x + x2 + · · · + xn − x − x2 − · · · − xn+1 ,
2 2
n n
= 1 +
(x−x)
+ x
− x + ··· +
(x
− x ) − xn+1 ,
= 1 − xn+1 ,
1 − xn−1
,
S=
1−x
1 − xn−1
,
= lim
n→∞ 1 − x
1
=
.
1−x
(4.4.3b)
(4.4.3c)
(4.4.3d)
(4.4.3e)
(4.4.3f)
(4.4.3g)
Returning to the full series for a matrix,
(I − A) (I + A + · · · + An ) = I + A + A2 + · · · + An − A − A2 − · · · − An+1 , (4.4.4a)
n
2 2
= I +
(A−
A) + A
− A + ··· +
(An−A
) − An+1 ,
(4.4.4b)
= I − An+1 .
(4.4.4c)
If A is small, so that An → 0 as n → ∞, then
(I − A)
∞
X
Ak = I,
(4.4.4d)
k=0
(I − A)
−1
=
∞
X
k=0
46
Ak .
(4.4.4e)
4.4. Lecture 13: September 18, 2013
Applied Matrix Theory
Let’s consider the convergence of this series now.
L=
∞
X
ak ,
(4.4.5)
k=1
Pn
where ak → 0 as k → ∞. We
define
that
L
is
finite
if
lim
n→∞
Pn 1 k=1 ak exists and is finite.
P
1
diverges
since
lim
As an example we see that ∞
n→∞
k=1 k → ∞. So we also should
n=1 n
consider that the difference,
(L−) −
∞
X
ak → 0,
as n → ∞.
(4.4.6)
k=1
Thus, we can consider that,
L≈
n
X
ak ,
with error → 0 as n → ∞.
(4.4.7)
k=1
In particular if A is small then,
(I − A)−1 ≈ I + A.
(4.4.8)
−1
(A + B)−1 = A I + A−1 B
,
(4.4.9a)
−1 −1
= I + A−1 B
A ,
≈ I − A−1 B A−1 ,
= A−1 − A−1 BA−1 .
(4.4.9b)
For example,
where A−1 exists,
(4.4.9c)
(4.4.9d)
Matrix Norms
The properties of norms of matrix A ∈ Rm×n has a norm, k · k, if the norm satisfies,
1. kAk ≥ 0, and if kAk = 0 then A = 0,
2. kA + Bk ≤ kAk + kBk,
3. kαAk = |α| kAk,
and we must add the fourth property;
4. kABk ≤ kAk kBk.
As an example of a norm,
kAk = max
j
47
X
i
|aij |
(4.4.10)
Nitsche and Benner
Unit 4. Rectangular Matrices
which is the maximum absolute value of the column sum. If kAk < 1, then 0 ≤ kAn k ≤
kAkn → 0 as n → ∞. So kAn k → 0 as n → ∞ and An → 0 as n → ∞.
When is A−1 B small?
−1 −1 A B ≤ A kBk,
kBk
= A−1 kAk,
kAk
kBk
.
=
kAk
(4.4.11a)
(4.4.11b)
(4.4.11c)
Thus,
−1 A 6≤
Note, we have shown kA−1 k 6≤
1. If this is the case,
1
,
kAk
1
.
kAk
(4.4.12)
since kAA−1 k = kIk which we suppose to be equal to
1 = AA−1 ,
= kAkA−1 .
(4.4.13a)
(4.4.13b)
So,
1
≤ A−1 .
kAk
(4.4.13c)
However, we would get,
−1 kA−1 kkAk
A =
,
kAk
= A−1 κ (A) .
(4.4.13d)
(4.4.13e)
Condition Number
For example pertaining to the condition number , we suppose we have Ax = b, and we have
the perturbation (A + B) x̃ = b, where we know that kA−1 Bk < 1, or in other words that
B is sufficiently small. We can get the relative change in x introduced by the change in A,
−1
A b − (A + B)−1 b
kx − x̃k
=
,
kxk
kxk
−1
A − (A + B)−1 b
=
,
kxk
48
(4.4.14a)
(4.4.14b)
4.5. HW 3: Due September 27, 2013
Applied Matrix Theory
If we use (A + B)−1 ≈ A−1 − A−1 BA−1
kA−1 BA−1 bk
,
kxk
kA−1 Bkkxk
≤
,
kxk
kA−1 kkBkkAk
≤
,
kAk
kBk
κ(A).
=
kAk
≈
(4.4.14c)
(4.4.14d)
(4.4.14e)
(4.4.14f)
Thus, κ(A) measures the amplification of the errors.
4.5
Homework Assignment 3: Due Friday, September
27, 2013
For the first four problems, you may use the Matlab commands rref(a) and a\b to check
your work.
1. Textbook 2.2.1: Row Echelon Form, Rank, Consistency, General solution of Ax = b.
Determine the reduced row echelon form for each of the following matrices and then
express each nonbasic column in terms of the basic columns:


1 2 3 3
(a) 2 4 6 9
2 6 7 6


2 1 1
3 0 4 1
4 2 4
4 1 5 5


2 1 3
1 0 4 3


(b) 

6
3
4
8
1
9
5


0 0 3 −3 0 0 3
8 4 2 14 1 13 3
2. Textbook 2.3.3
If A is an m × n matrix with rank(A) = m, explain why the system [A|b] must be
consistent for every right-hand side b.
3. Textbook 2.5.1
Determine the general solution for each of the following non homogeneous systems.
(a)
x1 + 2x2 + x3 + 2x4 = 3,
2x1 + 4x2 + x3 + 3x4 = 4,
23x1 + 6x2 + x3 + 4x4 = 5.
49
(4.5.1a)
(4.5.1b)
(4.5.1c)
Nitsche and Benner
Unit 4. Rectangular Matrices
(b)
2x + y + z
4x + 2y + z
6x + 3y + z
8x + 4y + z
= 4,
= 6,
= 8,
= 10.
(4.5.2a)
(4.5.2b)
(4.5.2c)
(4.5.2d)
(c)
x1 + x2 + 2x3
3x1 + 3x3 + 3x4
2x1 + x2 + 3x3 + x4
x1 + 2x2 + 3x3 − x4
= 3,
= 6,
= 3,
= 0.
(4.5.3a)
(4.5.3b)
(4.5.3c)
(4.5.3d)
(d)
2x + y + z
4x + 2y + z
6x + 3y + z
8x + 5y + z
= 2,
= 5,
= 8,
= 8.
(4.5.4a)
(4.5.4b)
(4.5.4c)
(4.5.4d)
2x + 2y + 3z = 0,
4x + 8y + 12z = −4,
6x + 2y + αz = 4.
(4.5.5a)
(4.5.5b)
(4.5.5c)
4. Textbook 2.5.4
Consider the following system:
(a) Determine all values of α for which the system is consistent.
(b) Determine all values of α for which there is a unique solution, and compute the
solution for these cases.
(c) Determine all values of α for which there are infinitely many different solutions,
and give the general solution for these cases.
5. Textbook 3.3.1: Linear Functions
Each of the following is a function from R2 into R2 . Determine which are linear
functions.
x
x
=
.
(a) f
y
1+y
x
y
(b) f
=
.
y
x
50
4.5. HW 3: Due September 27, 2013
Applied Matrix Theory
Figure 4.2. Figures for Textbook problem 3.3.4.
(c)
(d)
(e)
(f)
x
0
f
=
.
y
xy
2
x
x
f
=
.
y
y2
x
x
f
=
.
y
sin y
x
x+y
f
=
.
x−y
y
6. Textbook 3.3.4
Determine which of the following three transformations in R2 are linear.
7. Textbook 3.5.4: Matrix Multiplication
Let ej denote the j th unit column that contains a 1 in the j th position and zeros
everywhere else. For a general matrix An×n , describe the following products. (a)
Aej (b) e|j A (c) e|j Aej
8. Textbook 3.5.6
(please use induction)
1/2 α
, determine limn→∞ An . Hint: Compute a few powers of A and try
0 1/2
to deduce the general form of An .
For A =
9. Textbook 3.5.9
If A = [aij (t)] is a matrix whose entries are functions of a variable t, the derivative of
A with respect to t is defined to be the matrix of derivatives. That is,
daij
dA
=
.
dt
dt
51
Nitsche and Benner
Unit 4. Rectangular Matrices
Derive the product rule for differentiation
d(AB)
dA
dB
=
B+A
.
dt
dt
dt
10. Textbook 3.6.2
For all matrices An×k and Bk×n show that the block matrix
I − BA
B
L=
2A − ABA AB − I
has the property L2 = I. Matrices with this property are said to be involuntary, and
they occur in the science of cryptography.
11. Textbook 3.6.3
For the matrix

1
0

0
A=
0

0
0
0
1
0
0
0
0
0
0
1
0
0
0
1/3
1/3
1/3
1/3
1/3
1/3
1/3
1/3
1/3
1/3
1/3
1/3

1/3
1/3

1/3
,
1/3

1/3
1/3
determine A300 . Hint: A square matrix C is said to be idempotent when it has the
property that C2 = C. Make use of the idempotent submatrices in A.
12. Textbook 3.6.5
If A and B are symmetric matrices that commute, prove that the product AB is also
symmetric. If AB 6= BA, is AB necessarily symmetric?
13. Textbook 3.6.7
For each matrix An×n , explain why it is impossible to find a solution for Xn×n in the
matrix equation
AX − AX = I.
(4.5.6)
Hint: Consider the trace function.
14. Textbook 3.6.11
Prove that each of the following statements is true for conformable matrices
(a) tr (ABC) = tr(BCA) = tr(CAB).
(b) tr (ABC) can be different from tr (BAC).
(c) A| B = tr(AB| )
15. Textbook 3.7.2: Inverses
Find the matrix X such that X = AX + B, where




0 −1
0
1 2
0 −1 and B = 2 1 .
A = 0
0
0
0
3 3
52
4.5. HW 3: Due September 27, 2013
Applied Matrix Theory
16. Textbook 3.7.6
If A is a square matrix such that I − A is nonsingular, prove that
A (I − A)−1 = (I − A)−1 A.
17. Textbook 3.7.8
If A, B, and A + B are each nonsingular, prove that
A (A + B)−1 = B (A + B)−1 A = A−1 + B−1
−1
.
18. Textbook 3.7.9
Let S be a skew-symmetric matrix with real entries.
(a) Prove that I − S is nonsingular. Hint: x| x = 0 means x = 0.
(b) If A = (I + S) (I − S)−1 , show that A−1 = A| .
19. Textbook 3.9.9: Sherman–Morrison formula, rank 1 matrices
Prove that rank(An×n ) = 1 if and only if there are nonzero columns um×1 and vn×1
such that
|
A = uv .
20. Textbook 3.9.10
Prove that rank(An×n ) = 1, then A2 = τ A, where τ = tr(A).
53
Nitsche and Benner
Unit 4. Rectangular Matrices
54
UNIT 5
Vector Spaces
5.1
Lecture 14: September 20, 2013
Topics in Vector Spaces
We will be discussing the following topics in this lecture (and possibly the next couple).
• Field
• Vector Space
• Subspace
• Spanning Set
• Basis
• Dimension
• The four subspaces of Am×n
Field
We define a field as a set F with the properties such that,
• Closed under addition (+) and multiplication ( · ). Thus if α, β ∈ F, then α + β ∈ F
and α · β ∈ F.
• Addition and multiplication are commutative.
• Addition and multiplication are associative. This means that (α + β) + γ = α + (β + γ)
and (αβ)γ = α(βγ).
• Addition with multiplication is distributive. α(β + γ) = αβ + αγ.
• There exists an additive and multiplicative identity α + 0 = α, α · 1 = α.
• There exists an additive and multiplicative inverse α + (−α) = 0, α(α−1 ) = 1.
For example the reals and the complex numbers are fields. The natural numbers are not,
the rational numbers are. The set L2 = {0, 1} has the three operations 0 + 0 = 1, 0 + 1 = 1,
1 + 1 = 0.
55
Nitsche and Benner
Unit 5. Vector Spaces
Vector Space
We may define a vector space V over a field F is a set V with operations + and · such that,
• v + w ∈ V for any v, w ∈ V.
• αv ∈ V for any v ∈ V, α ∈ F.
• v + w = w + v for any v, w ∈ V. This is the commutative property of addition.
• (u + v) + w = u + (v + w) for any u, v, w ∈ V, which is the associative law of addition.
• For each 0 ∈ V contains u + 0 = u, for any u ∈ V.
• For each −u ∈ V contains u + (−u) = 0, for any u ∈ V.
• (αβ)u = α(βu) for any α, β ∈ F, u ∈ V.
• (α + β)u = αu + βu for any α, β ∈ F, u ∈ V. This is the first form of the distributive
property.
• 1 · u = u, the 1 multiplication identity in F.
• α(u + v) = αu + αv for any α ∈ F, and u, v ∈ V.
Examples of vector spaces of R is Rn = Rn×1 , Rn×m , Cm×n , all functions such that [0, 1] → R,
all polynomials which map R → R.
Theorem 5.1. A subset S of a vector space V over F is a vector space over F if
• v + w ∈ S, for any v, w ∈ S.
• αv ∈ S for any α ∈ F, v ∈ S.
Several examples include All continuous functions: [0, 1] → R = C[0, 1], all polynomials
of degree n, S = {0} contained in V.
Definition 5.2. Let {v1 , . . . , vn } ∈ V, then span{v1 , . . . , vn } = {α1 v1 + α2 v2 + · · · + αn vn ,
Theorem 5.3. This gives the theorem: The span of {v1 , . . . , vn } is a subspace.
Definition 5.4. The set {v1 , . . . , vn } is a spanning set of span{v1 , . . . , vn }.
Note the 0 ∈ span{v
. . , vn }, and 0 ∈ subspace. 1 , . 1
1
−2
For example, span
contained in R2 = span
,
. This gives rise to
2
2
−4
1
the basis vector
, thus the system is one-dimensional. The basis vector is illustrated
2
along with the solution on Figure 5.1.
Definition 5.5. A basis for a vector space is a minimal spanning set.
Theorem 5.6. Any two passes for a vector space have the same number of elements.
Definition 5.7. The number of elements in the basis is equal to the dimension of the space.
56
αk ∈ F}.
5.1. Lecture 14: September 20, 2013
Applied Matrix Theory
x2
x1
Figure 5.1. Basis vector of example solution.
For example, P2 = {a1 + a2 x + a3 x2 } the basis of this set is {1, x, x2 } and we observe
that it must have three dimensions. Therefore, for a polynomial of degree n the dimensions
of the polynomial function space are dim(Pn ) = n + 1.
As another example, S = {0} = ∅ the basis is the null set, and we have a zero-dimensional
system. Thus, zero cannot be an element of a basis.
Definition 5.8. A set {v1 , . . . , vn } is linearly independent if α1 v1 + α2 v2 + · · · + αn vn = 0
implies α1 = α2 = · · · = αn = 0.
It follows that {0} is not a linearly independent space since,
α0 = 0,
for any α 6= 0.
(5.1.1)
Similarly, any set containing 0 is not linearly independent.
Examples of function spaces
On example is the solutions to y 00 = 0. This is the set {y = αx + b | α, β ∈ R}. The
vector space has two dimensions and the basis is {1, x}. Another example is the set
of solutions of y 00 = y. The set of solutions is {y = c1 ex + c2 e−x }, which has the twodimensional basis {ex , e−x }. A third example is the set of solutions of y 00 = −y. This set is
{y = c1 sin(x) + c2 cos(x)} which is also the two-dimensional space {sin(x), cos(x)}. A final
example of interest is y 00 = 2. This gives the solution set {y = x2 + αx + β}. This however
is not a vector space because we are restricted by the defined coefficient of x2 being one!
This results from the fact that this is a non homogeneous system, unlike the other examples
which may be rearranged into homogeneous
form.
a b
2×2
In the general example of R
=
, the basis of this system is
c d
1 0
0 1
0 0
0 0
,
,
,
.
0 0
0 0
1 0
0 1
57
Nitsche and Benner
5.2
Unit 5. Vector Spaces
Lecture 15: September 23, 2013
The four subspaces of Am×n
We now define the four fundamental subspaces of Am×n : Rn → Rm . These are:
1. R(A) = {y : y = Ax, x ∈ Rn } ⊂ Rm This is the column space.
2. N(A) = {x ∈ Rn : Ax = 0} ⊂ Rn . This is the null space of A.
3. R(A| ) = {y : y = A| x, x ∈ Rm } ⊂ Rn . This is equivalently, R(A| ) = {y : y| =
x| A, x ∈ Rn } ⊂ Rm . This determines why this is the row space of A.
|
4. N(A| ) = {x ∈ Rm : A| x = 0 or x| A = 0 } ⊂ Rm . This is called the left null space of
A.
We want to show that R(A) is a vector space. So we let y1 , y2 ∈ R(A) Then, y1 = Ax1 and
y2 = Ax2 for some x1 , x2 . This tells us that
y1 + y2 = Ax1 + Ax2 ,
= A (x1 + x2 ) ∈ R(A).
(5.2.1a)
(5.2.1b)
αy1 = αAx1 ,
= Aαx1 ∈ R(A).
(5.2.2a)
(5.2.2b)
Also
Thus R(A) is a subspace of Rm .
An example: Find the spanning

1 2
2 4
A=
1 2
2 4
set for all 4 subspaces of,


1 3 3
1 2 0 2


0 4 4
0 0 1 1
→


3 5 5
0 0 0 0
0 4 2
0 0 0 0

0
0

1
0
(5.2.3)
So the row space,
      
3 
1
1



      
2
0
,  , 4 ⊂ R4 .
R(A) = span 
1 3 5





2
2
0
(5.2.4)
To find the column space, we need the solution of the homogeneous equation Ax = 0.
x1 = −2x2 − 2x4 ,
x3 = −x4 ,
x5 = 0,
58
(5.2.5a)
(5.2.5b)
(5.2.5c)
5.2. Lecture 15: September 23, 2013
Applied Matrix Theory
or


 
−2
−2
 1
 0
 
 
 + x4 −1 .
0
x = x2 
 
 
 0
 1
0
0
(5.2.6)
   
−2
−2 





  0

1

   
5
  
N(A) = span 
 0 , −1 ⊂ R .


 0  1






0
0
(5.2.7)
A → EA : Pm×m Am×n = EA,m×n .
(5.2.8)
Thus,
Now say,
We have that Pm×m is square and invertible (it is a product matrix). We also know that
PA = EA where the rows EA are a linear combination of rows of A. Similarly, A = P−1 EA
has that the rows of A are linearly commutations of the rows of EA or that the row space
of A is equal to the row space of EA . So,
|
|
R(A ) = row space of A,
      
1
0
0 












2 0 0

    
= span 
0 , 1 , 0 ,


2 1 0






0
0
1
|
|
| = y : y = A x or y = x A
59
(5.2.9a)
(5.2.9b)
(5.2.9c)
Nitsche and Benner
Unit 5. Vector Spaces
To find the fourth space, N(A| ),

1 2 1
2 4 2

1 0 3

3 4 5
3 4 5
 
2
1
0
4
 

0
 → 0
4 0
2
0

1
0

→
0
0
0

1
0

→
0
0
0

1
0

→
0
0
0

1
0

→
0
0
0
2
0
−2
−2
−2
2
1
0
0
0
2
1
0
0
0
2
1
0
0
0
0
1
0
0
0

1
2
0
0

2 −2
,
2 −2
2
4

1
2
−1
1

0 −2
,
0
0
0
0

1
2
−1
1

0 −2
,
0
0
0
0

1 0
−1 0

0 1
,
0 0
0 0

3 0
−1 0

0 1
.
0 0
0 0
(5.2.10a)
(5.2.10b)
(5.2.10c)
(5.2.10d)
(5.2.10e)
So the solution for A| x = 0,
x1 = −3x3 ,
x2 = x3 ,
x3 = x3 .
or
(5.2.11a)
(5.2.11b)
(5.2.11c)


−3
 1

x = x3 
 1 .
0
(5.2.12)
  
−3 


  

1
|

N(A ) = span   ⊂ R4 .
1 





0
(5.2.13)
This finally gives us that,
60
5.3. Lecture 16: September 25, 2013
Applied Matrix Theory
So the dimension of the row space of A is
dim (R(A)) = r,
(5.2.14)
which is also known as the rank of A. The dimensions of the other spaces are
dim (N(A)) = n − r.
(5.2.15)
| dim R(A ) = r.
(5.2.16)
| dim N(A ) = n − r.
(5.2.17)
For
Finally,
Alternative to fin the left null space of A. That is
|
|
N(A ) = x : x A = 0 .
(5.2.18)
We use

—



PA = —


b1
..
.
br
..
.

—



—


—
0
—
(5.2.19a)
with r rows occupied and n − r zero rows. From this we can use block matrices,
P1
P=
P2
(5.2.20)
P1
P1 A
PA =
A=
.
P2
P2 A
(5.2.21)
So
We know that P2 A = 0. So we claim that the rows of P2 span the left null space of
A = N(A| ) and
|
|
R(P2 ) = N(A ).
(5.2.22)
5.3
Lecture 16: September 25, 2013
Dr. Nitsche is not in town October 18 or Wednesday before thanksgiving. May have to have
alternate times for class.
61
Nitsche and Benner
Unit 5. Vector Spaces
The Four Subspaces of A
To recall what we discussed last class,
• R(A) is the range of A or the column space. This has dimensions r.
• N(A) is the column space of A| = {A| y}. This has dimensions n − r.
• R(A| ) is the rowspace transpose of A = {(yA| )| } and is also known as the left range
of A.This has dimensions r.
• N(A| ) = {x : A| x = 0} = {x : x| A = 0} and this is the left null space of A. This has
dimensions m − r.
Returning to the manipulation A → EA with PA = EA with Pm×m is invertible.
P1
P1 A
A=
,
(5.3.1a)
P2
P2 A
B1
=
,
(5.3.1b)
0
where P2 A = 0.
Theorem 5.9.
|
|
N(A ) = R(P2 )
(5.3.2)
where the right hand side is the rowspace of P2 .
Proof. For proof of ⊇, Assume y ∈ R(P|2 ). Then y = P|2 x for some x. Reformuating,
y| = x| P2 . So y| A = x| P2 A = x| 0, which gives y ∈ N(A| ).
|
|
| −1
Also assume
⊆, assume y ∈ N(A ), Then y A = 0. This gives y P EA = 0 =
U
y| [Q1 |Q2 ]
. So, 0 = (y| Q1 )Ur×m where we have that U is full rank. This gives
0
y| Q1 = 0.
P
We know that QP = I. [Q1 |Q2 ] 1 = I, Q1 P1 + Q2 P2 = I, Q1 P1 = I − Q2 P2 . This
P2
gives, 0 = y| Q1 P1 = y| (I − Q2 P2 ). So, y| = y| Q2 P2 and so we have
|
| |
(5.3.3)
y = P2 Q2 y ∈ R(P ).
As an example,

1 2 1 3
 2 4 0 4

 1 2 3 5
2 4 0 4
3
4
5
2
1
0
0
0
0
1
0
0
0
0
1
0
 
0
1 2
 0 0
0 
 →
0   0 0
1
0 0
62
0
1
0
0
2
1
0
0
0
0
1
0

0 − 12
0
1
1
1 
0 − 23
3
2 
1
0
0 − 12 
2
1
1 − 3 − 13
0
(5.3.4a)
5.3. Lecture 16: September 25, 2013
Applied Matrix Theory
Note that the N(A| ) is orthogonal to R(A). We also find from this manipulation that
      
1
1
3 



      
2 0 −4

R(A) = span   ,   ,  
(5.3.5)
1
3
5 





2
0
2
and
  
3 


  

−1
|
|
 .
N(A ) = R(P2 ) = 
−1





0
(5.3.6)
Linear Independence
Definition 5.10. A set {v1 , . . . , vn } is linearly independent if α1 v1 + · · · + αn vn = 0 implies
α1 = · · · = αn = 0. From this we get the equivalent statements;
• {v1 , . . . , vn } linearly independent,
• A = [v1 · · · vn ] has full rank r,
• N(A) = {Aα = 0} = {0}.
For example we have the polynomial basis set to order n, {1, x, x2 , . . . , xn } which is
linearly independent because, c0 + c1 x + c2 x2 + · · · + cn xn = 0 implies that c0 = · · · = xn = 0.
As another example we can show that the zero set, {0} is linearly independent. This is
because α0 = 0 for any α 6= 0. Any set containing 0, e.g. {v1 , . . . , vn , 0} is linearly dependent.
Another example is any set of distinct unit vectors, {ei1 , ei2 , . . . , ein } where ei ∈ Rm and
n ≤ m. This is also a linear independent since,


0 0 1
0 0 0


.
1
0
0
A=
(5.3.7)


0 1 0
0 0 0
We take as another example the Van der Monde matrix which has applications in polynomial interpolation. Let x1 , . . . , xm be distinct real numbers,


1 x1 x21 · · · x1n−1
1 x2 x2 · · · xn−1 
2
2


A=
(5.3.8)

..


.
n−1
1 xm x2m · · · xm
where n ≤ m. Then we have Ac = y, where c = [c0 · · · cn−1 ]| . Because p(x1 ) = y1 and
p(xm ) = ym . Solution to Ac = y gives a polynomial that interpolates (xk , yk ). For Ac = 0
then we have p with m roots x1 , . . . , xm , but another polynomial of degree n − 1 can only
have n − 1 distinct roots since m > n − 1. So p ≡ 0 and therefore c ≡ 0.
63
Nitsche and Benner
Unit 5. Vector Spaces
y
(xn , yn )
(x1 , y1 )
x
(xk , yk )
Figure 5.2. Interpolating system.
5.4
Lecture 17: September 27, 2013
Linear functions (rev)
Is f linear? Here it was good to find the formula. Some could be done by inspection. Here
we also should check f (p1 + p2 ) = f (p1 ) + f (p2 ) and f (αp) = αf (p). So let’s talk about
the finding the functions; say the flipping function:
1
0
x
f (x, y) = (x, −y) =
(5.4.1)
0 −1
y
For the mapping of the projection,
1
x+y x+y
f (x, y) =
,
= 12
2
2
2
1
2
1
2
x
y
(5.4.2)
For the rotation, x = r cos(ψ), and y = r sin(ψ). If we denote the shifted with primes,
x0 = r cos(ψ + θ) and y 0 = r sin(ψ + θ). We can use identities to get x0 = r(cos ψ cos θ −
sin ψ sin θ) = x cos θ − y sin θ and y 0 = r(sin ψ cos θ + cos ψ sin θ) = y cos θ + x cos θ. This
gives us the function,
0 x
cos(θ) − sin(θ)
x
f (x, y) =
=
.
(5.4.3)
y0
sin(θ)
cos(θ)
y
Note this is a skew symmetric matrix with determinant equal to 1.
Review for exam
Anything on the first three homework’s is fair game. We have been doing computations
of the LU, PLU, REF, RREF. We have solved Ax = b. Writing systems of linear equations in matrix form. We have talked about the elementary matrices and the process of
premultiplication as well as there invertibility.
64
5.4. Lecture 17: September 27, 2013
Applied Matrix Theory
We have also discussed some proof, especially this last one. We showed this major
|
|
−1
one: tr(AB) = tr(AB), (AB) = B| A| , (AB)−1 = B−1 A−1 , (A−1 ) = (A| ) = A−| .
Similarly we have shown that the LU P
decomposition exists if all principle submatrices are
invertible. The relation (I − A)−1 = nk=0 Ak if Ak → 0. We also discussed (A + B)−1
with perturbation matrices. Finally, we discussed rank one matrices, so we need to know
−1
the Sherman–Morrison formula, (I + uv| ) .
Previous lecture continued
Comment on previous lecture:


1 x1 x21 · · · xn−1
1
1 x2 x2 · · · xn−1 
2
2


A=

..


.
2
n−1
1 xm xm · · · xm
m×n
(5.4.4)
When we consider Ac = y is equivalent to p(xi ) = yi , where i = 1, · · · , m. Thus we have
the equation c0 + c1 xi + c2 x2 + · · · + cn−1 xin−1 = 0, where i = 1, · · · , m, and we have a linear
system in the coefficients, ck . m ≥ n. In terms of vectors, these are linearly independent
because the set
     
 n−1 
2


x1
1
x
x
1
1


 .. 
 ..   ..   .. 
(5.4.5)
. ,  .  ,  .  , · · · ,  .  ,


 1
n−1 
2
x
x
x
m
m
m
has rank(A) = n. To show that this is linearly independent, we set up the system
 n−1   
 
 
 
x1
1
x1
x21
0
 ..   .. 
 .. 
 .. 
 .. 
c0  .  + c1  .  + c2  .  + · · · + cn−1  .  =  .  .
1
xm
x2m
xn−1
m
(5.4.6)
0
Here we must show that we have at least m distinct roots, but p ∈ Pn−1 has at most n − 1
roots. We know this by the fundamental theorem of algebra. So, m > n − 1 and the
polynomial must be identically equal to the zero polynomial, p ≡ 0, and ck = 0 for all k.
So we want to interpolate the polynomial p(x) ∈ Pn−1 . We set up p(xi ) = yi for i =
1, . . . , m. If n − 1 = m then we will have a unique solution to the interpolation. If instead
m > n then we have either no solution or infinitely many solutions. We defined the span
of a set as the set ofP
all linear combinations that are a vector set over the field of reals:
span {v1 , . . . , vn } = { cn vn , cn ∈ R}. The basis for a vector space V is the set {v1 , . . . , vk },
that spans V and is linearly independent. We also know that the basis for {0} is the empty
set ∅. Thus, for convenience, we define span {∅} = {0}.
Theorem 5.11. If {v1 , . . . , vn } is a linearly independent basis of V, then {u1 , . . . , um }
m > n is linearly dependent.
65
Nitsche and Benner
5.5
Unit 5. Vector Spaces
Lecture 18: October 2, 2013
Exams and Points
We decided that we will have three exams total, but only the best two will each count for
20% of our semester grade. Homework will be worth 60%. Lecture notes will be posted
online.
Continuation of last lecture
Theorem 5.12. If {u1 , . . . , un } spans V and S = {v1 , . . . , vn } ⊂ V with m > n, then S is
linearly dependent.
P
Pn
Proof. Consider m
i=1 αi v1 = 0. Using vi =
j=1 cij uj ,
m
X
αi
i=1
cij uj = 0,
(5.5.1a)
j=1
n
X
m
X
j=1
i=1
|
n
X
!
uj = 0.
αi cij
{z
(α| C)j
(5.5.1b)
}
Since C|n×m αP= 0 has ranks free, recall there exists α 6= 0 such that C| α = 0. So (C| α)j = 0
for any j so i αi vi = 0.
Definition 5.13. A basis of V is a linearly independent spanning set of V.
Theorem 5.14. Any two basis have the same number of elements.
Equivalent characterizations of basis,
• linearly independent spanning set
• minimal spanning set
• max linearly independent subset of V.
Definition 5.15. dim(V) is equal to the number of elements in the basis.
Recalling the four subspaces for a matrix,


| |
|
Am×n = a1 a2 · · · an 
;
| |
| m×n
• R(A) ⊂ Rm , dim = r;
• N(A) ⊂ Rn , dim = n − r;
• R(A| ) ⊂ Rn , dim = r;
66
(5.5.2)
5.5. Lecture 18: October 2, 2013
Applied Matrix Theory
• N(A| ) ⊂ Rm , dim = m − r.
Definition 5.16. If X and Y are two subspaces of V then
X + Y = {x + y, x ∈ X , y ∈ Y} .
(5.5.3)
Is X + Y a subspace? We shall illustrate this in two parts
1. Given z ∈ X + Y, is αz ∈ X + Y?
If this is the case, z = x + y and αz = αx + αy ∈ X + Y, where we recalled that the
vectors x and y are within their respective sets.
2. Given z1 , z2 ∈ X + Y, is z1 + z2 ∈ X + Y?
Here we substitute for the summed vectors of each of the z vectors, (x1 +y1 )+(x2 +y2 ) =
(x1 + x2 ) + (y1 + y2 ) ∈ X + Y.
Theorem 5.17. dim(X + Y) = dim(X ) + dim(Y) − dim(X ∩ Y).
Proof. Let BX ∩Y = {z1 , . . . , zk } be the basis for X ∩ Y. Then we can extent the set to bases
for X and Y.
BX = {z1 , . . . , zk , x1 , . . . , xn } ,
BY = {z1 , . . . , zk , y1 , . . . , ym } .
(5.5.4a)
(5.5.4b)
We now claim that we have a set S = {z1 , . . . , zk , x1 , . . . , xn , y1 , . . . , ym } = BX +Y . We now
consider: does S span X + Y? We let z ∈ X + Y. Then, we know z = x + y for every x ∈ X
and y ∈ Y. So,
!
!
X
X
X
X
z=
(5.5.5a)
αi zi +
β i xi +
αi0 zi +
γi yi ,
=
i
i
X
αi0 ) zi
(αi +
i
i
+
X
β i xi +
i
i
X
γi yi ,
∈ span(S).
(5.5.5b)
i
Is S linearly independent? Consider
X
X
X
αi zi +
β i xi +
γi yi = 0,
(5.5.6a)


X
X

αi zi +
βi xi  , ∈ X ∩ Y,
γi yi = − 
|
{z
}
| {z }
∈Y
∈X
X
=
δi zi ,
X
X
γi yi +
δi zi = 0,
X
This indicates γi = δi ≡ 0.
X
X
αi zi +
βi xi = 0,
(5.5.6b)
(5.5.6c)
(5.5.6d)
(5.5.6e)
which also indicates αi = δi ≡ 0.
67
Nitsche and Benner
Unit 5. Vector Spaces
From our example the range was spanned by the vectors,
      
1
1
3 


      

2
0
 ,   , 4 ⊂ R4 ,
R(A) = span 
1 3 5





2
0
2
    
−2
−2 










 1  0

5
  
N(A) = span 
 0 , −1 ⊂ R ,


 0  0






0
0
      
1
0
0 












2 0 0


|






R(A ) = span 0 , 1 , 0 ⊂ R5 ,
      


2
1
0 





0
0
1
  
3 



  
−1
|

N(A ) = span   ⊂ R4 .
−1 





0
(5.5.7a)
(5.5.7b)
(5.5.7c)
(5.5.7d)
Theorem 5.18. (a) R(A) is orthogonal to N(A| ) and (b) R(A) ∪ N(A| ) = {0}.
Which means R(A) + N(A| ) = Rm and R(A| ) + N(A) = Rn . Any Am×n gives an
orthogonal decomposition of Rn and Rm .
Proof. (a) Let y ∈ R(A) gives y = Az for some z. Then x ∈ N(A| ) which means that
A| x = 0 and additionally x| A = 0. Considering x| y = x| Az = 0, so x must be
orthogonal to y; therefore R(A) ⊥ N(A).
(b) If x ∈ R(A) and x ∈ N(A| ), then x| x = 0 which implies xi = 0 and x = 0.
68
UNIT 6
Least Squares
6.1
Lecture 19: October 4, 2013
Least Squares
We will now be covering the concept of least squares. If we are given an equation Ax = b, we
may multiply by the transpose of the matrix to find the least squares solution; A| Ax = A| b.
We will show that this is consistent even if Ax = b is inconsistent.
Previously we showed,
Theorem 6.1. dim(X + Y) = dim(X ) + dim(Y) − dim(X ∩ Y), where X , Y are subspaces
of V.
We now consider,
Theorem 6.2. Given conformal matrices A and B,
rank(A + B) ≤ rank(A) + rank(B) .
{z
} | {z } | {z }
|
dim(R(A+B))
dim(R(A))
(6.1.1)
dim(R(B))
Proof. R(A + B) ⊂ R(A) + R(B) since, if y ∈ R(A + B) then
y = (A + B)x,
= Ax + Bx,
∈ R(A) + R(B),
⊂ R(A) ⊂ R(B).
(6.1.2a)
(6.1.2b)
Further,
dim(R(A + B)) ≤ dim(R(A) + R(B)),
= dim(R(A)) + dim(R(B)) − dim(R(A) ∩ R(B)),
≤ dim(R(A)) + dim(R(B)),
= rank(A) + rank(B).
(6.1.3a)
(6.1.3b)
(6.1.3c)
(6.1.3d)
69
Nitsche and Benner
Unit 6. Least Squares
Theorem 6.3. rank(AB) = rank(B) − dim(N(A) ∩ R(B))
Proof. Let S = {x1 , . . . , xs } be a basis of N(A) ∩ R(B). Since N(A) ∩ R(B) ⊂ R(B) can
extend S to a basis for R(B),
BR(B) = {x1 , . . . , xs , z1 , . . . , zt } .
(6.1.4)
To prove dim(R(AB)) = t we claim {Az1 , . . . , Azt } is a basis for R(AB). First we show
that it spans. We let b ∈ R(AB). So b = ABy, for some y where By ∈ R(B). So
!
X
X
b=A
αi x1 +
βi z i ,
(6.1.5a)
i
=
X
=
X
i
i
αi Axi +
|{z}
=0
βi Azi ,
X
βi Azi ,
since x1 ∈ N(A).
(6.1.5b)
i
∈ span(S1 ).
(6.1.5c)
i
P
P
αi zi ) = 0
0.
Rearranging,
A
(
NextP
we show that S2 is lineally independent; i αi Azi =
iP
P
P
and
i αi zi ∈ N(A) ∩ R(B) since zi ∈ R(B). Thus,
i αi zi =
i βi xi and
i αi z i −
P
β
x
=
0.
Therefore,
α
=
β
=
0
since
{z
,
x
}
are
linearly
independent.
i
i
i
i
i i i
Theorem 6.4. Given matrices Am×n and Bn×p , then
rank(A) + rank(B) − n ≤ rank(AB) ≤ min(rank(A), rank(B))
(6.1.6)
Proof. We will consider the right inequality first and the left inequality second. First,
rank(AB) ≤ rank(B). We know that rank((AB)| ) = rank(B| A| ) ≤ rank(A| ) = rank(A)
and finally rank((AB)| ) = rank(AB).
For the left inequality, N(A) ∩ R(B) ⊂ N(A). Thus, dim(N(A) ∩ R(B)) ≤ dim(N(A)) =
n − rank(A). So, rank(AB) = rank(B) − dim(N(A) ∩ R(B)) ≥ rank(B) − (n − rank(A)). Theorem 6.5. (1) rank(A| A) = rank(A) and rank(AA| ) = rank(A| ) = rank(A).
(2) R(A| A) = R(A| ) and R(AA| ) = R(A).
(3) N(A| A) = N(A) and N(AA| ) = N(A| ).
Proof. For part (1), rank(A| A) = rank(A) − dim(N(A| ) ∩ R(A)), but N(A| ) ⊥ R(A) and so
N(A| ) ∩ R(A) = {0}. Since, if we let x ∈ N(A| ) and x ∈ R(A) then A| x = 0 and x = Ay.
Which gives x| x = y| Ax = 0 which implies that x = 0. So dim(N(A| ) ∩ R(A)) = 0.
to be continued...
6.2
Lecture 20: October 7, 2013
We will have two weeks for the next homework.
70
6.2. Lecture 20: October 7, 2013
Applied Matrix Theory
Properties of Transpose Multiplication
In review we covered the following theorems last time:
Theorem 6.6. dim(X + Y) = dim(X ) + dim(Y) − dim(X ∩ Y), where X , Y are subspaces
of V.
We also had the theorem,
Theorem 6.7. rank(AB) = rank(B) − dim(N(A) ∩ R(B))
And finally we showed the relation
Theorem 6.8. rank(AB) = rank(B) − dim(N(A) ∩ R(B))
We left off at the theorem covering multiplication relations and the rank and dimensions
of the matrix,
Theorem 6.9. (1) rank(A| A) = rank(A) and rank(AA| ) = rank(A| ) = rank(A).
(2) R(A| A) = R(A| ) and R(AA| ) = R(A).
(3) N(A| A) = N(A) and N(AA| ) = N(A| ).
We proved the first one using the third of the theorems above. We now prove the second
and third parts of this theorem.
Proof. For part 2, Let y ∈ R(A| A) then A| Ax = y for some x. So y = A| z for some z. Thus
y ∈ R(A| ) So R(A| A) ⊂ R(A| ), since dim(R(A| A)) = dim(R(A| )) and R(A| A) = R(A| ).
This is because BR(A| A) ⊂ BR(A| ) but since these have the same number of elements so
BR(A| A) = BR(A| ) .
For the third part we want to show that the basis are contained in the other and then we
compare the domimensions. So we let x ∈ N(A) the Ax = 0 or A| Ax = 0 and x ∈ N(A| A)
so N(A) ⊂ N(A| A). But also dim(N(A)) = n + r or dim(N(A| A)) = n − r therefore the
two sets must be the same; N(A) = N(A| A).
The Normal Equations
Definition 6.10. The normal equations for a system Ax = b is
|
|
A Ax = A b.
(6.2.1)
Theorem 6.11. For any A, A| Ax = A| b is consistent.
Proof. RHS in R(A| ); by the previous theorem RHS ∈ R(A| A) for every x such that A| Ax =
RHS.
Note: the solution to the normal equation is unique when rank(A) = n.
71
Nitsche and Benner
Unit 6. Least Squares
Example 6.12. Fit (xi , yi ), i = 1, . . . , m by a polynomial of degree 2. So p(x) = c0 + c1 x +
c2 x2 , where m > 3. Our problem it solve is p(xi ) = yi , i = 1, . . . , m or c0 + c1 xi + ci x2i =
yi , i = 1, . . . , m. The system is therefore linear in the system of the unknowns c0 , c1 , and
c2 . We can write this in matrix form,


 
1 x1 x21  
y1
 1 x 2 x 2  c0
 y2 
2

 
(6.2.2)

 c1  =  .. 
..


 . 
.
c
2
1 xm x2m
ym
or alternatively we have the system Ac =
that

1
1



1
y. What is the rank of the matrix A? We know

x1 x21
x2 x22 


..

.
2
xm xm
is invertible, since Ac = 0 implies that c = 0. So,


p(x1 )
 p(x2 ) 


Ac =  ..  .
 . 
p(xm )
(6.2.3)
Now, 3 ≤ m − 1 and we know that

1 x1 x21
 1 x2 x2 
2

rank 
=3
..


.
2
1 xm xm

(6.2.4)
and A| Ax = A| b has a unique solution.
To solve the normal equations,


 

 1 x1 x21   
 y1
c0
1 1
1 y 
1 1
1 1 x
x22 
2
  
 2
 x1 x2 · · · xm  

 c1 = x1 x2 · · · xm  ..  ,
..


 . 
.
x21 x22
x2m
c2
x21 x22
x2m
2
1 xm xm
ym

P P 2    P 
c0
Pm Pxi Pxi 3
P yi
 xi x2i





x
c
=
1
P 2 P 3 P i4
P x2i yi
xi
xi
xi
c2
xi y i
(6.2.5a)
(6.2.5b)
Suggestion: have an outline of the major proofs we have shown in class in your mind.
Go back and give them a study over.
|
Theorem 6.13. A| Ax = A| b gives x which minimizes kAx − bk22 = (Ax − b) (Ax − b).
72
6.2. Lecture 20: October 7, 2013
Applied Matrix Theory
b
N (A)
Ax
Figure 6.1. Minimization of distance between point and a plane.
y
x
Figure 6.2. Parabolic fitting by least squares
By corollary this is an if and only if statement. Here every solution
the normal
Pof
m
equations minimizes the sum
of the squares of the entries of the vector i=1 (Ax − b)2i .
P
Note here kxk22 = x| x =
x2i . We illustrate this in Figure 6.1 where the minimal line
connecting a point to a plane is shown.
Example 6.14. What does the solution
the normal equations
minimize from
Pto
P
P our example?
m
2
2
The solution c0 , c1 , and c2 minimizes i=1 (Ac − y)i = ((Ac)i − yi ) = (p(xi ) − yi )2 .
We can visualize our parabolic least squares method as shown in Figure 6.2.
Exam 1
We had a range from 36–98, with a median of 66. For this exam: 70–100 is an A-range
score, 50–70 is about a B, and below is a C (as long as as you are showing involvement in the
class). First two problems went fine; four was covered in class, five was on the homework,
we will confer the solution of the sixth problem in class next time.
73
Nitsche and Benner
6.3
Unit 6. Least Squares
Lecture 21: October 9, 2013
Need to have a couple classes early because of missing next Friday. So, next Monday and
Wednesday we will start at 8:35.
We will review problem 6 from the exam, then finish up least squares; cover linear
dependence and finally linear transformations.
Exam Review
We review exam problems 6. Given u, v ∈ Rn . (a) Show A = I + uv| is A−1 = I + αuv| .
Find α.
So we check that AA−1 = A−1 A = I. Now
|
|
AA−1 = I + αuv
I + αuv ,
(6.3.1a)
|
|
|
|
(6.3.1b)
= I + αuv + uv + uv αuv ,
|
|
| |
= I + αuv + uv + αu v u v ,
(6.3.1c)
|
|
| |
= I + αuv + uv + α v u uv ,
(6.3.1d)
|
|
= I + uv 1 + α(1 + v u) .
(6.3.1e)
This is equal to I if 1 + α(1 + v| u) = 0 or when α = 1+v1 | u . Thus, the Sherman–Morrison
formula is,
1
|
| −1
(6.3.2)
I + uv
=I−
| uv .
1+v u
|
|
For part (b) B = A + αêi êj = A I + αA−1 êi êj where A is invertible. For the inverse of
B:
| −1 −1
B−1 = I + αA−1 êi êj
A ,
(6.3.3a)
#
"
1
|
αA−1 êi êj A−1 .
(6.3.3b)
= I+
|
−1
1 + êj αA êi
|
This exists if 1 + αêj A−1 êi 6= 0 and can make α sufficiently small.
| {z }
A−1
ji
Least squares and minimization
|
Theorem 6.15. x solves A| Ax =
is equivalent to x minimizes (Ax − b)| (Ax − b) =
PA b
|
2
2
kAx − bk2 , where kxk2 = xx = i x2i .
Note:
f (x) = f (x1 , x2 , . . . , xn ),
|
= (Ax − b) (Ax − b),
|
|
(6.3.4a)
(6.3.4b)
|
= (x A − b )(Ax − b),
|
|
|
|
|
(6.3.4c)
|
= x A Ax − x A b − b Ax + b b.
74
(6.3.4d)
6.3. Lecture 21: October 9, 2013
Applied Matrix Theory
|
|
For scalars x, we have that x| = x. So, v| Ax = b Ax = x| A| b. This manipulates our
previous result to,
|
| |
| |
= x A Ax − 2x A b + b b.
(6.3.5)
This is a quadratic form and the minimum occurs when
∂f
∂xi
= 0.
Proof. To prove from the right hand side to the left; suppose x minimizes f (x), then
∂f
,
∂xi
∂x
∂x| |
∂x| |
| |
A Ax + x A A
−2
A b,
=
∂xi
∂xi
∂xi
| |
| |
= 2êi A Ax − 2êi A b.
(6.3.6a)
0=
(6.3.6b)
(6.3.6c)
This gives us
|
|
|
|
êi A Ax = êi A b
(6.3.7)
and
|
|
(A Ax)i = (A b)i ,
any i.
(6.3.8)
This finally means that we have formulated equivalently to A| Ax = A| b.
ASIDE:
∂
∂u
∂v
(uv) =
v+u
.
∂xi
∂xi
∂xi
(6.3.9)
To prove going the other direction, suppose that x solves A| Ax = A| b then show that
f (x) < f (y) for any y 6= x. First, we consider
|
|
|
|
|
|
|
|
f (y) − f (x) = y A Ay − 2y |{z}
A b +b b − x A Ax − 2x |{z}
A b +b b,
|
|
A Ax
|
|
(6.3.10a)
|
A Ax
|
|
|
= y A Ay − 2yA Ax − x A Ax − 2xA Ax,
|
(6.3.10b)
= (Ay − Ax) (Ay − Ax) ,
(6.3.10c)
x)k22 ,
(6.3.10d)
(6.3.10e)
= kA (y −
≥ 0.
If A has full rank (no nontrivial null space), then this must be greater than zero. So
any solution to the normal equations minimizes this norm, or and solution A| Ax = A| b
minimizes kAx − bk22 . Further, if A has full rank then we are guaranteed a unique least
squares solution x. Finally, if A has a nontrivial null space (r < n) then we have infinitely
many least squares solutions.
In Matlab we can do help \ to find out what solution it gives for underdetermined
solutions. What does it minimize?
75
Nitsche and Benner
6.4
Unit 6. Least Squares
Homework Assignment 4: Due Monday, October
21, 2013
1. Textbook 4.1.1: Vector spaces, subspaces, fundamental subspaces of a matrix.
Determine which of the following subsets of Rn are in fact subspaces of Rn (n > 2).
(a) {x | xi ≥ 0},
(b) {x | x1 = 0},
(c) {x | x1 x2 = 0},
n P
o
n
(d) x x
=
0
,
j=1 j
n P
o
n
(e) x x
=
1
,
j
j=1
(f) {x | Ax = b, where Am×n 6= 0 and bm×1 6= 0}.
2. Textbook 4.1.2
Determine which of the following subsets of Rn×n are in fact subspaces of Rn×n .
(a)
(b)
(c)
(d)
(e)
(f)
(g)
(h)
(i)
The symmetric matrices.
The diagonal matrices.
The nonsingular matrices.
The singular matrices.
The triangular matrices.
The upper-triangular matrices.
All matrices that commute with a given matrix A.
All matrices such that A2 = A.
All matrices such that tr(A) = 0.
3. Textbook 4.1.6
Which of the following are spanning sets for R3 ?
(a)
(b)
(c)
(d)
(e)
1
1
1
1
1
1
0
0
2
2
1 ,
0 , 0
0 , 0
1 , 2
1 , 2
0
1
0
0
1 ,
0 , 0 0 1 , 1 1 1 ,
−1 , 4 4 1 ,
−1 , 4 4 0 .
4. Textbook 4.1.7
For a vector space V, and for M, N ⊆ V, explain why span(M ∪ N ) = span(M) +
span(N ).
76
6.4. HW 4: Due October 21, 2013
Applied Matrix Theory
5. Textbook 4.2.1
Determine spanning sets for each of the four

1
2
A = −2 −4
1
2
fundamental subspaces associated with

1 1
5
0 4 −2 .
2 4
9
6. Textbook 4.2.3
Suppose that A is a 3 × 3 matrix such that
    
  
1 
 1
 −2 
R = 2, −1
and N =  1




3
2
0
spanR(A)
 and N(A), respectively, and consider a linear system Ax = b, where
1
b = −7.
0
(a) Explain why Ax = b must be consistent.
(b) Explain why Ax = b cannot have a unique solution.
7. Textbook 4.2.7
A1
If A =
is a square matrix such that N(A1 ) = R(A|2 ), prove that A must be
A2
nonsingular.
8. Textbook 4.2.8
Consider a linear system of equations Ax = b for which y| b = 0 for every y ∈ N(A| ).
Explain why this means the system must be consistent.
9. Textbook 4.3.1(abc): Linear independence, basis.
Determine which of the following sets are linearly independent. For those sets that are
linearly dependent, write one of the vectors as a linear combination of the others.
     
2
1 
 1
(a) 2, 1, 5


3
0
9
1 2 3 , 0 4 5 , 0 0 6 , 1 1 1
(b)
     
1
2 
 3





2 , 0 , 1
(c)


1
0
0
10. Textbook 4.3.4
Consider a particular species of wild flower in which each plant has several stems,
leaves, and flowers, and for each plant let the following hold.
77
Nitsche and Benner
Unit 6. Least Squares
S = the average stem length (in inches).
L = the average leaf width (in inches).
F = the number of flowers.
Four particular plants are examined, and the
matrix:
S

#1 1
#2
2
A=
#3 2
#4 3
information is tabulated in the following
L
1
1
2
2
F

10
12 

15 
17
For these four plants, determine whether or not there exists a linear relationship between S, L, and F . In other words, do there exist constants α0 , α1 , α2 , and α3 such
that α0 + α1 S + α2 L + α3 F = 0?
11. Textbook 4.3.13
Which of the following sets of functions are linearly independent?
(a) {sin(x), cos(x), x sin(x)}.
(b) {ex , xex , x2 ex }.
(c) sin2 (x), cos2 (x), cos(2x) .
12. Textbook 4.4.2
Find a basis for each of the four fundamental subspaces associated with


1 2 0 2 1
A = 3 6 1 9 6 
2 4 1 7 5
(6.4.1)
13. Textbook 4.4.8
Let B = {b1 , b2 , . . . , bn } be a basis for a vector space V. Prove that each v ∈ V can
be expressed as a linear combination of the bi ’s, v = α1 b1 + α2 b2 + · · · + αn bn , in only
one way—i.e., the coordinates αi are unique.
14. Textbook 4.5.5
For A ∈ Rm×n , explain why A| A = 0 implies A = 0.
15. Textbook 4.5.8
Is rank(AB) = rank(BA) when both products are defined? Why?
16. Textbook 4.5.14
P
Prove that if the entries of Fr×r satisfy rj=1 |fij | < 1 for each i (i.e., each absolute
row sum < 1), then I + F is nonsingular. Hint: Use the triangle inequality for scalars
|α + β| ≤ |α| + |β| to show N(I + F) = 0.
17. Textbook 4.5.18
If A is n × n, prove that the following statements are equivalent:
78
6.4. HW 4: Due October 21, 2013
Applied Matrix Theory
(a) N(A) = N(A2 )
(b) R(A) = R(A2 )
(c) R(A) ∩ N(A) = {0}
18. Textbook 4.6.1: Least Squares.
Hookes law says that the displacement y of an ideal spring is proportional to the force
x that is applied—i.e., y = kx for some constant k. Consider a spring in which k is
unknown. Various masses are attached, and the resulting displacements shown in the
figure are observed. Using these observations, determine the least squares estimate for
k.
19. Textbook 4.6.2
Show that the slope of the line that passes through the origin in R2 and comes closest
in the least squares
to passing through the points {(x1 , y1 ), (x2 , y2 ), . . . , (xn , yn )}
P sense P
is given by m = i xi yi / i x2i .
20. Textbook 4.6.6
After studying a certain type of cancer, a researcher hypothesizes that in the short run
the number (y) of malignant cells in a particular tissue grows exponentially with time
(t). That is, y = α0 eα1 t . Determine least squares estimates for the parameters α0 and
α1 from the researchers observed data given below.
t (days)
y (cells)
1
16
2
27
3
45
4
74
5
122
Hint: What common transformation converts an exponential function into a linear
function?
79
Nitsche and Benner
Unit 6. Least Squares
80
UNIT 7
Linear Transformations
7.1
Lecture 22: October 14, 2013
Theorem 7.1. Given a vector space V. If {u1 , . . . , un } spans V and {vi }m
i=1 ⊂ V, then
m
{vi }i=1 is linearly dependent if m > n (because there is more vectors in the set).
Pn
Pm
Pn
Pm
Proof. Consider
j=1 cij uj = 0 and
i=1 αi
j=1 cij uj then
i=1 αi vi = 0. Use vi =
n
X
|
Pm
α
·
·
·
α
So c| α = 0 has nonzero
α
c
u
=
0
If
we
consider
α
=
1
n
i
ij
j
i=1
j=1
| {z }
(α| C)j =(C| α)j
solutions α, since m−n > 0 free variables. So for every αi 6= 0, c| α = 0 and
P
αi vi = 0. Any two bases for V have the same number of elements.
Definition 7.2.Let
 V be a vector space with basis B = {b1 , . . . , bn } The coordinates of
c1
P
 .. 
x ∈ V are cj =  .  such that x = nj=1 cj bj .
cn
 
c1
 .. 
Theorem 7.3. Coordinates of x ∈ V with respect to the basis B are unique. [x]B =  . .
cn
Example 7.4. We take as an example a vector x ∈ R3 ,
 
1
x = 2 ,
3
(7.1.1a)
= 1ê1 + 2ê2 + 3ê3 ,
(7.1.1b)
= ı̂ + 2̂ + 3k̂.
(7.1.1c)
81
Nitsche and Benner
Unit 7. Linear Transformations
with the standard bidis in Rn = {ê1 , . . . , ên } = S or
 
1
2 = [x]S
3
(7.1.2)
We can have another basis for R3 ;
      
1
2 
 1
1 , 1 , 0 = B


0
1
0
(7.1.3)
This is linearly independent because the matrix


2 1 1
0 1 1
0 0 1
is nonsingular. So

−1
[x]B =  3
− 12

(7.1.4)
 
c1

Now find c2  such that
c3
 
 
 
2
1
1





c1 1 + c2 1 + c3 0 = 1ê1 + 2ê2 + 3ê3 .
0
1
0
(7.1.5)
In matrix form,
 
1

Bc = 2 ,
3

   
1 1 2
c1
1
1 1 0 c2  = 2 .
0 1 0
c3
3
(7.1.6a)
(7.1.6b)
Solving for the individual variables,
c2 = 3;
c1 = 2 − c1 ,
= −1;
2c3 = 1 − 3 + 1,
1
c3 = − .
2
(7.1.7a)
(7.1.7b)
(7.1.7c)
(7.1.7d)
(7.1.7e)
Summary
• For any vector space, V, there exists a basis B.
• Any x ∈ V is represented uniquely by a tuple of numbers, the coordinates [x]B .
82
7.1. Lecture 22: October 14, 2013
Applied Matrix Theory
Linear Transformations
Definition 7.5. Given the vector spaces U, V, a map T : U → V such that,
• T(x + y) = T(x) + T(y)
• T(αx) = αT(x)
is a linear transformation of U → V.
We also recognize that a linear transformation is a linear function on vector spaces.
Definition 7.6. A linear transformation U → U is a linear operator on U.
Our goal now is two fold:
• Show that the set of all linear transformations U → V is a vector space L(U, V).
• Find the basis and coordinate unit basis of any T ∈ L(U, V).
Examples of Linear Functions
Example 7.7. T(x) = Am×n xn×1 so T : Rn → Rm .
• Rotation A = R(θ)
• projection
• reflection
Example 7.8. f (x) = ax, f : R → R
df
, D : Pn →
dx
´b
= a f (x) dx, I
Example 7.9. D(f ) =
Pn−1 or D : C 1 → set of all functions.
Example 7.10. I(f )
: C0 → R
Example 7.11. One final example regarding matrices, T(Bn×k ) = Am×n Bn×k , T : Rn×k →
Rm×k .
Matrix representation of linear transformations
Every linear transformation on finite dimensional spaces has a matrix representation. Suppose T : U → V and B = {u1 , . . . , un } forms the basis for U and B 0 = {v1 , . . . , vn } forms
the basis for V. Then the action of T on U is
!
n
X
T(u) = T
ξi ui ,
(7.1.8a)
i=1
=
=
=
n
X
i=1
n
X
ξi T (ui ) ,
ξi
n
X
αij vj ,
(7.1.8c)
αij ξi vj ,
(7.1.8d)
i=1
j=1
n
n
XX
i=1 j=1
where αij describes the action of T.
83
(7.1.8b)
Nitsche and Benner
Unit 7. Linear Transformations
Theorem 7.12. The set of all linear transformations T : U, V = L(U, V) is a vector space.
Proof. Given T1 , T2 ∈ L(U, V), then (T1 + T2 ) x = T1 x + T2 x and T1 + T2 ∈ L(U, V).
Further (αT1 )x = αT1 (x) which gives αT1 ∈ L(U, V). Some other properties of note:
0x = 0 and 0 ∈ L(U, V); (T1 − T1 ) = 0, etc.
Theorem 7.13. Given U with basis B = {u1 , . . . , un } and V with basis B 0 = {v1 , . . . , vn }
then a basis for L(U, V) is {Bij }i=1,...,n; j=1,...,m , where Bij : U → V by Bij (u) = ξi vj where
P
u = nk=1 ξk uk .
It follows that dim(L(U, V)) = dim(U) dim(V) = nm.
P
Proof. Let’s prove linear independence: Consider
ηij Bij = 0, then
!
X
0=
ηij Bij (uk ),
(7.1.9a)
ij
=
X
ηij (Bij uk ),
(7.1.9b)
ηkj vj
(7.1.9c)
ij
=
X
j
ASIDE: Note that Bij uk = ξi vj = 0, i 6= k; vj , i = k With [uk ]B = 0
kth position.
···
1
···
|
0 , with the 1 at the
Since {vj } are linearly independent it follows that ηkj ≡ 0 for all j and each k. Therefore
Bij are linearly independent.
7.2
Lecture 23: October 16, 2013
The next major things we are going to try to cover are:
• Basis for L(U, V) coordinates for T ∈ L(U, V)
• Action of T
• Change of coordinates of u ∈ U under change of basis
• Change of coordinates of T ∈ L(U, V) under change of basis
Basis of a linear transformation
The linear set,
L(U, V) = {T : U → V | T linear transformation}
(7.2.1)
Theorem 7.14. Bji : U → V by Bji u = ξiP
vj where B = {u1 , . . . , un } is a basis for U and
0
B = {v1 , . . . , vn } is a basis for V and u = nk=1 ξk uk . Also, {Bij } are basis for L(U, V).
84
7.2. Lecture 23: October 16, 2013
Applied Matrix Theory
Proof. First, we observe that we have linear independence. Second we check the span. If we
let T ∈ L(U, V), then
X
T(u) = T(
ξj uj ),
(7.2.2a)
j
=
X
ξj T(uj ),
(7.2.2b)
j
=
X
ξj
j
Here we recognize that T(uj ) =
Pm
i=1
m
XX
j
=
i=1
j
i
(7.2.2c)
αij ξj vi ,
|{z}
(7.2.2d)
m
XX
j
P P
αij vi .
i=1
αij vi .
=
for any u. Thus, T =
m
X
Bij (u)
!
αij Bij
(u).
(7.2.2e)
i=1
αij Bij ; so {Bij } spans L(U, V). It follows that
[T]BB0 = {αij } ,


α11 α12
α1n
 α21 α22 · · · α2n 


=
..
..  ,
.
.

.
.
. 
αm1 αm2 · · · αmn
= [T(u1 )]B0 [T(u2 )]B0 · · · [T(un )]B0 .
(7.2.3a)
(7.2.3b)
(7.2.3c)
If T : U → U is a linear operator that goes to the same space then [T]BB = [T]B for
convenience.
Example 7.15. Let D : Pn → Pn−1 by D(p) =
85
dp
.
dx
Our basis is B = {1, x, . . . , xn } and we
Nitsche and Benner
Unit 7. Linear Transformations
also have the operated basis B 0 = {1, x, . . . , xn−1 }. So,
[D(1)]B0 = [0]B0 ,
 
0
 .. 
= . ;
(7.2.4a)
(7.2.4b)
0
[D(x)]B0 = [1]B0 ,
 
1
0
 
=  ..  ;
.
0
D(x2 ) B0 = [2x]B0 ,
 
0
2
 
 
= 0 ;
 .. 
.
0
n−1 n
[D(x )]B0 = nx
0 ,
  B
0
 .. 
 
= ..
0
n
This allows us to represent the differentiation operator by the matrix,


0 1 0 0
0
0 0 2 0 · · · 0 


..

. 0
[D]BB0 = 0 0 0 3
.

 .

.
.
.
. . . . .. 
 ..
0 0 0 0 · · · n n×(n+1)
(7.2.4c)
(7.2.4d)
(7.2.4e)
(7.2.4f)
(7.2.4g)
(7.2.4h)
(7.2.5)
dp
. This will be the same as the previous
Example 7.16. Let D : Pn → Pn by D(p) = dx
example except we will add a row of zeros at the bottom and give us a square matrix.


0 1 0 0
0
0 0 2 0 · · · 0 


..


.
0
0 0 0 3
[D]B =  .
.
(7.2.6)
. . . . .. 
 ..
.
. .


0 0 0 0 · · · n
0 0 0 0
0 (n+1)×(n+1)
86
7.2. Lecture 23: October 16, 2013
Applied Matrix Theory
We may do this for any operator. For example we could do this for projection. What
we want is to find a basis that gives us a nice representation of the operator. Highly sparse
basis are nice.
Action of linear transform
The action of T : U → V. Recall,
T(u) = T
n
X
!
ξj uj
,
(7.2.7a)
j=1
=
=
=
n
X
j=1
n
X
ξj T (uj ) ,
ξj
m
X
j=1
i=1
n
X
m
X
j=1
i=1
|
(7.2.7b)
αij vi ,
(7.2.7c)
!
vi
αij ξj
{z
[Aξ]i
(7.2.7d)
}
This gives us the coordinates of the V basis.
[T(u)]B0 = Aξ,
= [T]BB0 [u]B .
(7.2.8a)
(7.2.8b)
Thus the action is represented by matrix multiplication. Now return to our example,
dp
. Our basis is B = {1, x, . . . , xn } and we
Example 7.17. Let D : Pn → Pn−1 by D(p) = dx
also have the operated basis B 0 = {1, x, . . . , xn−1 }. If we consider p(x) = α0 +α1 x+· · ·+αn xn
and D(p(x)) = α1 + 2α2 x + · · · + nαn xn−1 . This gives our vector representation of


α1
 2α2 


 3α3 
[D(p)]B0 = 
(7.2.9a)
,
 .. 
 . 
nαn
 

 α0
0 1 0 0
0  
0 0 2 0 · · · 0   α1 

  α2 
..

 
. 0
= 0 0 0 3
(7.2.9b)
  α3  .
 .
 
.
.
.
. . . . ..   .. 
 ..
 . 
0 0 0 0 ··· n
αn
It follows that [L + T]BB0 = [L]BB0 + [T]BB0 and [αL]BB0 = α [L]BB0 . We may also consider
the composition of linear operators. Say L(T(x)) = (LT)(x), also [LT]BB00 = [L]BB0 [T]B0 B00 .
87
Nitsche and Benner
Unit 7. Linear Transformations
Change of Basis
If we change the coordinates of our system when given vector space U. Let B = {u1 , . . . , un }
is a basis for U and B 0 = {v1 , . . . , vn } be two bases for U. The relation between [u]B and
[u]B0 is given by
[u]B0 = P [u]B .
(7.2.10)
P is called the change of basis matrix from B to B 0 . Recall, the coordinates of [T(u)]B0 =
[T(u)]BB0 [u]B . Clearly P is [T(u)]BB0 when T = I or P = [I(u)]BB0 . We will use our
differentiation operator as an example once more.
Example 7.18. Given U = P2 we have the bases B = {1, x, x2 } and B 0 = {1, 1 + x, 1 + x + x2 },
then
(7.2.11a)
[I(u)]BB0 = [I(u1 )]B0 [I(u2 )]B0 [I(u3 )]B0 ,
= [u1 ]B0 [u2 ]B0 [u3 ]B0 ,
(7.2.11b)


1 −1
0

1 −1,
= 0
(7.2.11c)
0
0
1
= P.
(7.2.11d)
We know this is true for any u. We can find the representation of the polynomial p(x) =
3 + 2t + 4t2 in the [p]B0 . So,

 
1 −1
0
3
1 −1 2,
[p]B0 = 0
(7.2.12a)
0
0
1
4
 
1
= −2.
(7.2.12b)
4
Finally, let U be a vector space with basis B = {u1 , . . . , un } and B 0 = {v1 , . . . , vn }.
Then if we have T : U → U. We know the relation between [T]B and [T]B0 and we may let
P = [I]BB0 . We have,
[T(u)]B0 = [T]BB0 [u]B ,
= A [u]B .
(7.2.13a)
(7.2.13b)
[u]B0 = P [u]B ,
[T(u)]B0 = P [T(u)]B .
(7.2.14a)
(7.2.14b)
P [T(u)]B = A . . .
(7.2.15)
Further we have
So
to be continued. . .
Note: No class Friday.
88
7.3. Lecture 24: October 21, 2013
7.3
Applied Matrix Theory
Lecture 24: October 21, 2013
Change of Basis (cont.)
If we have T : U → U, let U be a vector space with basis B = {u1 , . . . , un } and B 0 =
{v1 , . . . , vn }.
1. Basis for L(U, V) = {Bij : Bij u= ξi vj , where u =
[Tu1 ]B0 [Tu2 ]B0 · · · [Tun ]B0 .
P
k ξk uk }
coordinates of T, [T] =
2. Achar of T [T(u)]B0 = [T]BB0 [u]B .
3. Given x ∈ U with B, B 0 are two bases for U, then [x]B0 = P [x]B . and P = [I]BB0 .
4. T : U → U with B, B 0 are two bases for U, then we want to relate [T]B and [T]B0 .
To show property 4,
[Tu]B0 = [T]B0 B0 [u]B0 ,
[Tu]B = [T]BB [u]B .
(7.3.1a)
(7.3.1b)
[Tu]B0 = P [Tu]B ,
(7.3.2a)
[u]B0 = P [u]B .
(7.3.2b)
P [Tu]B = · · ·
(7.3.3a)
[T]BB = P−1 [T]B0 B0 P,
[T]B = P−1 [T]B0 P.
(7.3.4a)
(7.3.4b)
But also,
Considering P = [I]BB0
So
And we get,
The matrix representation of T under different basis are self similar.
Definition 7.19. If A = C−1 BC for some C, then A and B are self-similar (A, B, C ∈
Rn×n ).
Theorem 7.20. Given any two self-similar matrices A, B, they represent the same linear
transformation under two different bases.
89
Nitsche and Benner
Unit 7. Linear Transformations
Example 7.21. Example illustrating the self-similarity: [T]B = P−1 [T]B0 P. Let T ∈
L(U, U) be defined by
0 1
x
Tu =
(7.3.5)
−2 3
y
where u = xu1 + yu2 .
Tu =
y
−2x + 3y
= yu1 + (−2x + 3y)u2 .
(7.3.6)
In basis notation we may consider this,
[Tu]B = M [u]B .
(7.3.7)
1
1
Now let’s consider a different basis. Let S = {ê1 , ê2 } and S =
,
. Now
1
2
0
[T]S = [Tê1 ]S [Tê2 ]S ,
0
1
=
,
−2 S
3 S
0 1
=
,
−2 3
= M.
(7.3.8a)
(7.3.8b)
(7.3.8c)
(7.3.8d)
Now in our different basis,
[T]S 0
1
1
T
T
=
,
1 S
2 S
1
2
=
,
1 S0
4 S0
1 0
=
.
0 2
(7.3.9a)
(7.3.9b)
(7.3.9c)
This helps us by diagonalizing the operator. Now we want to find P,
P = [I]BB0 ,
(7.3.10a)
= [Tu1 ]B0 [Tu2 ]B0 ,
1
0
=
,
0 B0
1 B0
2 −1
=
.
−1 1
(7.3.10b)
(7.3.10c)
(7.3.10d)
Similarly,
−1
P
1 1
=
.
1 2
90
(7.3.11)
7.4. Lecture 25: October 23, 2013
Applied Matrix Theory
We can verify this,
P
−1
1 1
1 0
2 −1
[T]S 0 P =
,
1 2
0 2
−1
1
1 1
2 −1
=
,
1 2
−2
2
0 1
=
.
−2 3
(7.3.12a)
(7.3.12b)
(7.3.12c)
So this checks out.
Example 7.22. Let M ∈ L(U, V) defined by [M(u)]S = M [u]S where S is the standard
basis. Then
[M]S = M,
(7.3.13a)
= [Mê1 ]S [Mê2 ]S · · · [Mên ]S
(7.3.13b)
and we define S 0 = {q1 , . . . , qn }. When we have Q = [I]S 0 S ,
[M]S 0 = Q−1 MQ,
(7.3.14a)
= [q1 ]S [q2 ]S · · · [qn ]S ,
= q1 q2 · · · qn .
(7.3.14b)
(7.3.14c)
.
Now let A = Q−1 BQ with S = {ê1 , . . . , ên } and S 0 = {q1 , . . . , qn } and Let L(u) = Bu.
[L]S = B and [I]S 0 S = Q so [L]S 0 = Q−1 BQ.
If T ∈ L(U, U) and X ⊂ U such that T(X ) ⊂ X where T(X ) = {T(x) such that x ∈ X }
then X is an invariant subspace of U under T.
Example 7.23. If (λ, v) are an eigen-pair of A then
(λI − A) v = 0,
λv = Av.
(7.3.15a)
(7.3.15b)
and span{v} is an invariant subspace under A.
7.4
Lecture 25: October 23, 2013
Properties of Special Bases
If we consider B and B 0 as bases for U with operation T : U → U. Then we have,
[T]BB0 = [T(u1 )]B0 · · · [T(un )]B0 ,
(7.4.1a)
[T]B = [T(u1 )]B · · · [T(un )]B ,
(7.4.1b)
−1
= P [T]B0 P
(7.4.1c)
91
Nitsche and Benner
Unit 7. Linear Transformations
And we also have P = (I)BB0 . We consider T on Rn , T(x) = Ax and [T]S = A. So
A = P−1 BP for appropriate B and P, with B = [T]B0
Note: A tuple is an ordered set of numbers.
Now we have two goals:
1. Find a basis such that [T]B is simple
2. FInd invariant quantities
Example 7.24. tr(P−1 BP) = tr(BPP−1 ) = tr(B)
Example 7.25. For T : Pn → Pn by T(p) = Dp ,

0 1 0 ···

0 0 2

[T]B = 
.
 ..
0
0 ···
0

0
.. 
.



n
0
(7.4.2)
tr(T) = 0
Example 7.26. rank(P−1 BP) = rank(B)
Example 7.27. Nilpotent operator of index k N
Nk = 0, but Nk−1 6= 0.
: U → U 2such thatk−1
On the homework we will have to show that x, Nx, N x, . . . , N x a basis for Rk and
x is defined such that Nk−1 (x) 6= 0. So,


0 0 ··· 0
. . . .. 

.
1 0
(7.4.3)
[N]B =  . .
=J
 .. . . 0 0
0 ··· 1 0
Example 7.28. An idempotent operator E : U → U has the property E2 = E. This is
because
which can only return the same answer if done twice.
 these are projection operators





B = x1 , . . . , xr , y1 , . . . , yn−r .

{z
}
| {z } |

BR(E)
BN(E)
Ir×r 0
[E] =
(7.4.4)
0 0
Example 7.29. If A has a full set of ê-vectors qj , j = 1, . . . , n. Then, Aqj = λj qj with
bases S, P. So
[I]PS = q1 , . . . , qn ,
(7.4.5a)
= Q,
(7.4.5b)
−1
[T]P = Q [T]S Q,
(7.4.5c)
−1
Λ = Q AQ
(7.4.5d)
92
7.4. Lecture 25: October 23, 2013
Applied Matrix Theory
So

λ1

0
[T]P =  .
 ..
0
···
...

0
.. 
λ2
.
,
... ...
0
· · · 0 λn
0
T(x) = Ax.
(7.4.6a)
(7.4.6b)
Invariant Subspaces
Let T be a linear operator T : U → U.
Definition 7.30. A subset X ⊂ U is invariant under T if Tx ∈ X for any x ∈ X (or
T(X ) ⊂ X ). Also T1x : X → X .
Example 7.31. Given
T(x) = Ax,


−1 −1 −1 −1
 0 −5 −16 −22
.
=
 0
3
10
14
4
8
12
14
(7.4.7a)
(7.4.7b)


 
2
−1
−1
 2

 
X = span {q1 , q2 } where q1 = 
 0  and q2 = −1. Show that X is invariant under T.
0
0
So,


−1
−15

T(q1 ) = 
(7.4.8a)
 −3,
0
= q1 + 3q2 ∈ span(X );
 
0
 6

T(q2 ) = 
−4,
0
= 2q1 + 4q2 ∈ span(X ).
(7.4.8b)
(7.4.8c)
(7.4.8d)
So for any T(α1 q1 +α2 q2 ) = α1 T(q1 )+α2 T(q2 ) ⊂ X . So for T : R4 → R4 with T1x : X → X ,
1 2
[T1x ]q1 ,q2 =
.
(7.4.9)
3 4
93
Nitsche and Benner
Unit 7. Linear Transformations

   
1
0 




0 0
  
Now say we have [T]P , P = q1 , q2 , 
0 , 0. Then,





0
1

1
3
[T]P = 
0
0

2
x x
4
x x

0 −1 x
0
4 x
(7.4.10)
So we have gained some zero elements. This is since,
   
1
−1
0  0
  
T
0 =  0 ,
0
4
  

0
−1
0 −12
 

T
0 =  14
1
14
(7.4.11a)
(7.4.11b)
Now if X , Y are subspaces
of U and are invariant under T; T(X ) ⊂ X and T; T(Y) ⊂ Y.
and X + Y = U. Then B = x1 , . . . , xr , y1 , . . . , yn−r .
[T]B = [T(x1 )]B · · · [T(xr )]B [T(y1 )]B · · ·
[T1x ]Bx
0
=
,
0
[T1y ]By
= Q−1 AQ.
7.5
T(yn−r ) B ,
(7.4.12a)
(7.4.12b)
(7.4.12c)
Homework Assignment 5: Due Monday, November
4, 2013
1. Explain how we proved in class that, for any A ∈ Rm×n , the linear A| Ax = Ab is
consistent. Do not reproduce all proofs, but outline the train of thought, starting from
basic linear algebra facts.
2. For the overdetermined linear system


 
1 2
1
1 2 x = 1
1 2
2
(a) Is the matrix A rank-deficient or of full rank? What is the rank of A| A?
(b) Find all least squares solutions.
94
7.5. HW 5: Due November 4, 2013
Applied Matrix Theory
(c) Find the solution that Matlab returns, using A\b. Also find the least squares
solution of minimum norm. Do they agree?
(d) What criterion does Matlabs use to choose a solution? (use help mldivide to find
out)
3. Textbook 4.7.2: Linear transformations
For A ∈ Rn×n , determine which of the following functions are linear transformations.
(a)
(b)
(c)
(d)
T(Xn×n ) = AX − XA,
T(xn×1 ) = Ax + b for b 6= 0,
T(A) = A| ,
T(Xn×n ) = (X + X| ) /2.
4. Textbook 4.7.6
For the operator T : R2 →
R2 defined
by T(x, y) = (x + y, −2x + 4y), determine [T]B ,
1
1
where B is the basis B =
,
.
1
2
5. Textbook 4.7.11
Let P be the projector that maps each point v ∈ R2 to its orthogonal projection on
the line y = x as depicted in Figure 4.7.4.
Figure 7.1. Figure 4.7.4
(a) Determine the coordinate matrix of P with respect to the standard basis.
α
(b) Determine the orthogonal projection of v =
onto the line y = x.
β
6. Textbook 4.7.13
For P2 and P3 (the spaces of polynomials of degrees less than or equal to two ´and three,
t
respectively), let S : P2 → P3 be the linear transformation defined by S(p) = 0 p(x) dx.
Determine [S]BB0 , where B = {1, t, t2 } and B 0 = {1, t, t2 , t3 }.
95
Nitsche and Benner
Unit 7. Linear Transformations
7. Textbook 4.8.1: Change of basis
Explain why rank is a similarity invariant.
8. Textbook 4.8.2
Explain why similarity is transitive in the sense that A ' B and B ' C implies
A ' C.
9. Textbook 4.8.3
A(x, y, z) = (x + 2y − z, −y, x + 7z) is a linear operator on R3 .
(a) Determine [A]S , where S is the standard basis.
−1
(b) Determine
[A]
as the
S 0 as

well
 
nonsingular matrix Q such that [A]S 0 = Q [A]S Q
1
1 
 1
for S 0 = 0 , 1 , 1 .


0
0
1
10. Textbook 4.8.11
(a) N is nilpotent of index k when Nk = 0 but Nk−1 6= 0. If N is a nilpotent operator
of index n on Rn , and if Nn−1 (y) 6= 0, show B = {y, N(y), N2 (y), . . . , Nn−1 (y)}
is a basis for Rn , and then demonstrate that?


0
0

0

.. 
.
0 0 ··· 1 0
0
1


[N]B = J = 0
 ..
.
0
0
1
..
.
···
···
···
..
.
0
0
0
..
.
(b) If A and B are any two n × n nilpotent matrices of index n, explain why A ' B.
(c) Explain why all n × n nilpotent matrices of index n must have a zero trace and be
of rank n − 1.
11. Textbook 4.8.12
E is idempotent when E2 = E. For an idempotent operator E on Rn , let X = {xi }ri=1
and Y = {xi }n−r
i=1 be bases for R(E) and N(E), respectively.
(a) Prove that B = X ∪ Y is a basis for Rn . Hint: Show Exi = xi and use this to
deduce that B is linearly independent.
Ir 0
.
(b) Show that [E]B =
0 0
(c) Explain why two n × n idempotent matrices of the same rank must be similar.
(d) If F is an idempotent matrix, prove that rank(F) = tr(F).
96
7.5. HW 5: Due November 4, 2013
Applied Matrix Theory
12. Textbook 4.9.3: Invariant subspaces
Let T be the linear operator on R4 defined by
T(x1 , x2 , x3 , x4 ) = (x1 + x2 + 2x3 − x4 , x2 + x4 , 2x3 − x4 , x3 + x4 ),
and let X = span {ê1 , ê2 } be the subspace that is spanned by the first two unit vectors
in R4 .
(a) Explain why X is invariant under T.
(b) Determine T/X {ê1 ,ê2 } .
(c) Describe the structure of [T]B , where B is any basis obtained from an extension of
{ê1 , ê2 }.
13. Textbook 4.9.4
Let T and Q be the matrices


−2 −1 −5 −2
−9
0 −8 −2

T=
 2
3
11
5
3 −5 −13 −7


1
0
0 −1
 1
1
3 −4

and Q = 
−2
0
1
0
3 −1 −4
3
(a) Explain why the columns of Q are a basis for R4 .
(b) Verify that X = span {Q:1 , Q:2 } and Y = span {Q:3 , Q:4 } are each invariant subspaces under T.
(c) Describe the structure of Q−1 TQ without doing any computation.
(d) Now compute the product Q−1 TQ to determine
T/Y {Q:3 ,Q:4 } .
T/X {Q:1 ,Q:2 } and
14. Textbook 4.9.7
If A is an n × n matrix and λ is a scalar such that (A − λI) is singular (i.e., λ is an
eigenvalue), explain why the associated space of eigenvectors N(A − λI) is an invariant
subspace under A.
15. Textbook 4.9.8
Consider the matrix A =
−9 4
.
−24 11
(a) Determine the eigenvalues of A.
(b) Identify all subspaces of R2 that are invariant under A.
(c) Find a nonsingular matrix Q such that Q−1 AQ is a diagonal matrix.
97
Nitsche and Benner
Unit 7. Linear Transformations
98
UNIT 8
Norms
8.1
Lecture 26: October 25, 2013
Homework 5 due Friday
Difinition of norms
Norm acts on a vector space V over R or C.
Definition 8.1. A norm is a function k · k : V → R by : x → kxk such that
1. kxk ≥ 0 for any x ∈ V, and kxk = 0 if and only if x = 0
2. kαxk = |α|kxk
3. kx + yk ≤ kxk + kyk
Vector Norms
Some norms:
• kxk2 =
pPn
• kxk1 =
Pn
i=1
i=1
x2i which is the 2-norm or the Euclidean norm
|xi |
P
1/p
• kxkp = ( ni=1 xpi )
• kxk∞ = maxi |xi | = limp→∞ kxkp
The two norm
x
A unit vector is kxk
and the unit ball in R2 {x ∈ R2 : kxk = 1} We illustrate the unit balls
for the three primary norms: kxk2 = 1 which gives a circle, kxk1 = 1 or |x1 | + |x2 | = 1 which
gives a rhombus, kxk∞ = 1 or (x1 , x2 ) such that max(|x1 |, |x2 |) = 1 which gives a square.
99
Nitsche and Benner
Unit 8. Norms
Theorem 8.2. kxk∞ ≤ kxk2 ≤ kxk1
Proof.
kxk∞ = max |xi |,
i
q
= max x2i ,
i
q
= x2k , for some k,
v
u n
uX
≤t
x2i ,
(8.1.1a)
(8.1.1b)
(8.1.1c)
(8.1.1d)
i=1
= kxk2 ;
qX
|xi |2 ,
=
r
2
X
|xi | ,
≤
= kxk1 .
(8.1.1e)
(8.1.1f)
(8.1.1g)
(8.1.1h)
Our goal is now to prove the triangle inequality for the 2-norm. Note that kxk22 =
x| x, where x| y is the standard inner product.
P
x2i =
Theorem 8.3. The Cauchy–Schwarz inequality (or CBS): |x| y| ≤ kxkkyk
Proof. Let α =
x| y
;
x| x
note x| y = y| x. Also,
x| y
x (αx − y) = x
x−y ,
x| x
|
|x y
|
= x | x − x y,
x x
x| y |
|
= | x x − x y,
x x
|
|
= x y − x y,
= 0.
|
|
(8.1.2a)
(8.1.2b)
(8.1.2c)
(8.1.2d)
(8.1.2e)
Further,
|
0 ≤ kαx − yk22 = (αx − y) (αx − y) ,
= αx (αx − y) − y (αx − y) ,
|
|
= −αy x + y y,
x| y |
|
= − | y x + y y,
x x
|x| y|
2
=−
2 + kyk2 .
kxk2
100
(8.1.3a)
(8.1.3b)
(8.1.3c)
(8.1.3d)
(8.1.3e)
8.1. Lecture 26: October 25, 2013
this gives kyk2 ≥
|x| y|
kxk22
Applied Matrix Theory
2
and therefore kxk2 kyk2 ≥ |x| y| .
Theorem 8.4. kx + yk2 ≤ kxk2 + kyk2
Proof.
|
kx + yk22 = (x + y) (x + y) ,
|
|
= x + y (x + y) ,
|
|
|
= x x + 2x y + y y,
| ≤ kxk2 + 2x y + kyk2 ,
≤ kxk2 + 2kxk2 kyk2 + kyk2 ,
= (kxk2 + kyk2 )2 ,
q
kx + yk2 ≤ (kxk2 + kyk2 )2 ,
≤ kxk2 + kyk2 .
(8.1.4a)
(8.1.4b)
(8.1.4c)
(8.1.4d)
(8.1.4e)
(8.1.4f)
(8.1.4g)
(8.1.4h)
Matrix Norms
Definition 8.5. A matrix norm is a function k · k : Rn×m → R such that,
1. kAk ≥ 0 for any A ∈ Rn×m , and kAk = 0 if and only if A = 0
2. kαAk = |α|kAk
3. kA + Bk ≤ kAk + kBk
The Frobenius Norm
The Frobeius norm is defined
kAkF =
sX
a2ij .
(8.1.5)
kAi,: k22 ,
(8.1.6a)
kA:,j k22 ,
(8.1.6b)
i,j
or
kAk2F =
X
i
=
X
j
=
X
|
aj aj ,
(8.1.6c)
j
|
= tr(A A).
which gives us a convenient way of expressing this norm.
101
(8.1.6d)
Nitsche and Benner
Unit 8. Norms
Induced Norms
Given a vector norm on Rn we may define (where sup is the smallest upper bound )
kAk = sup
x∈Rn
kAxk
= sup kAxk.
kxk
kxk=1
(8.1.7)
we may also replace the smallest upper bound (sup) with the maximum (max). We can now
take kAk2 , kAk1 , and kAk∞
8.2
Lecture 27: October 28, 2013
Matrix norms (review)
Definition 8.6. A norm on V
1. kAk ≥ 0 for any A ∈ Rn×m , and kAk = 0 if and only if A = 0
2. kαAk = |α|kAk
3. kA + Bk ≤ kAk + kBk
Frobenius Norm
The Frobenius norm is defined
kAk2F =
X
|aij |2 ,
(8.2.1a)
kAi,: k22 ,
(8.2.1b)
kA:,j k22 ,
(8.2.1c)
i
=
X
i
=
X
j
|
= tr(A A),
= tr(A? A)
for A ∈ Cn×m . In in the real set A? = A| .
Properties of the Frobenius norm:
1. kAxk2 = kxk2 kAkF
2. kABkF = kAkF kBkF
102
(8.2.1d)
(8.2.1e)
8.2. Lecture 27: October 28, 2013
Applied Matrix Theory
Proof. Property (1):
kAxk2 =
X
(Ax)2i ,
(8.2.2a)
i
=
X
(Ai,: x)2 ,
(8.2.2b)
kAi,: k22 kxk22 ,
(8.2.2c)
i
≤
X
i
= kxk22
X
kAi,: k22 .
(8.2.2d)
i
|
{z
kAk2F
}
Property (2):
kABk2F =
X
k(AB)j,: k22 ,
(8.2.3a)
kABj,: k22 ,
(8.2.3b)
kAk2F kBj,: k22 ,
(8.2.3c)
X
(8.2.3d)
j
=
X
j
≤
X
j
= kAk2F
kBj,: k22 .
j
|
{z
kBk2F
}
Example 8.7.
1 2
A=
0 2
1
AA=
2
1
=
2
|
0
2
2
8
1 2
,
0 2
(8.2.4)
(8.2.5a)
(8.2.5b)
So
p
tr(A| A) ,
√
= 9,
= 3.
kAk2 =
which may be called by norm(A, ’fro’) in Matlab.
103
(8.2.6a)
(8.2.6b)
(8.2.6c)
Nitsche and Benner
Unit 8. Norms
Induced Matrix Norms
Definition 8.8. For A ∈ Rn×m the induced norm of the matrix is
kAxk
,
x∈R
kxk
= max kAxk
kAk = maxn
kxk=1
(8.2.7a)
(8.2.7b)
Example 8.9.
1 2
A=
0 2
kAk1 = max kAxk1 ,
kxk=1
X
= max
|(Ax)|.
kxk=1
(8.2.8)
(8.2.9a)
(8.2.9b)
This provides a remap of the vector x. For example we may find the image of the points
of the corners of the unit rhombus for the x vector. Which can provide a way to find the
1-norm, but this is not the best physically. Returning to the ∞-norm,
kAk∞ = max kAxk∞ ,
kxk∞ =1
= max max |(Ax)i |.
kxk∞ =1
i
(8.2.10a)
(8.2.10b)
Here we can remap the corners of the unit square to a stretched parallelogram. What is
the maximum ∞-norm? From the figure, we can see it is 3. Now we are interested in the
mapping of the 2-norm, which is the unit circle.
kAk2 = max kAxk2 ,
kxk2 =1
≈ 2.92.
(8.2.11a)
(8.2.11b)
ASIDE: Say we have,
(Ax)21 + (Ax)22 = (a11 x1 + a12 x2 )2 + (a21 x1 + a22 x2 )2 ,
(8.2.12a)
= a211 x21 + x1 x2 (a11 a12 + a21 a22 ) + a222 x22 ,
(8.2.12b)
= constant.
(8.2.12c)
which would give an ellipse.
P
Theorem
8.10.
kAk
=
max
j
1
i |aij | which gives the maximum column-sum, and kAk1 =
P
maxi j |aij | which is the maximum row-sum.
104
8.2. Lecture 27: October 28, 2013
Applied Matrix Theory
Properties
The induced norms of a matrix have similar properties to the Frobenius norm:
1. kAxk ≤ kAkkxk since
kAxk
kxk
≤ kAk
2. kABk = kAkkBk (Will be shown in the homework)
Example 8.11. The induced norm of the identity matrix is 1; kIk = 1.
Proof.
kAk1 = max
X
kxk1 =1
|(Ax)|,
(8.2.13a)
i
{z
|
kAxk1
}
X X
= max
a
x
ij j ,
kxk1 =1
i
j
XX
≤ max
|aij ||xj |,
kxk1 =1
i
= max
(8.2.13c)
j
X
kxk1 =1
≤ max
(8.2.13b)
|xj |
X
i
X
kxk1 =1
|aij |,
(8.2.13d)
j
X
|xj |
i
|aij | ,
(8.2.13e)
j
| {z }
independent of j
= max max
j
kxk1 =1
X
|aij |
i
X
|xj |,
(8.2.13f)
j
| {z }
kxk1 =1
= max
j
X
|aij |.
(8.2.13g)
i
Now find an x such that the upper bound is attained. So let k =
Now let x = êk , then
kAxk1 = kAêk k,
= kA:k k1 ,
X
=
|aij |,
P
i
|aik | = maxj
P
i
|aij |.
(8.2.14a)
(8.2.14b)
(8.2.14c)
i
= max |aij |,
(8.2.14d)
= upper bound.
(8.2.14e)
j
Further kAk22 = max kAxk22 such that kxk22 = 1. Then, kAk22 = max (x| A| Ax) such that
x| x = 1. This arrizes Lagrange multipliers, or ∇f = λ∇g.
105
Nitsche and Benner
8.3
Unit 8. Norms
Lecture 28: October 30, 2013
The 2-norm
Given the 2-norm kAk2 = maxkxk2 =1 kAxk2 we have, f (x) = kAk22 = max (x| A| Ax) such
that g(x) = x| x = 1 where f (x) : Rn → R. This needs Lagrange multipliers, or ∇f = λ∇g.
For a minimization problem.
∂UV
∂U
∂V
=
V+U
(8.3.1)
∂xj
∂xj
∂xj
Lemma 8.12. If B is symmetric, ∇ (x| Bx) = 2Bx
Note: ∇ (x| x) = 2x
Proof. To prove this lemma,
∂
∂x
∂
|
|
|
,
x Bx =
x Bx + x B
∂xj
∂xj
∂xj
|
|
= êj Bx + x Bêj ,
|
|
|
= êj Bx + x Bêj ,
=
|
êj
=
=B|
|
2êj Bx,
B x+
|{z}
| |
êj B x,
(8.3.2a)
(8.3.2b)
(8.3.2c)
(8.3.2d)
(8.3.2e)
= 2 (Bx)j .
(8.3.2f)
Proof. Alternatively, we may consider,
∂ X
xi (Bx)i
∂xj
i
!
!
∂ X X
=
xi
Bik xk ,
∂xj
i
k
!
X
X
∂
=
xi Bik xk ,
∂xj
i
k
X
X
=
Bjk xk +
xi Bij ,
k
i
=
X
Bjk xk +
X
k
k
=
X
Bjk xk +
X
k
= 2 (Bx)j .
(8.3.3a)
(8.3.3b)
(8.3.3c)
Bkj xk ,
(8.3.3d)
Bjk xk ,
(8.3.3e)
k
(8.3.3f)
106
8.3. Lecture 28: October 30, 2013
Applied Matrix Theory
So,
|
2A Ax = 2x,
|
A Ax = λx,
(8.3.4a)
(8.3.4b)
and the solution (λ, x) is an eigenpair of A| A. Note, for these x, f (x) = x| A| Ax = x| λx =
λx| x = λ. Thus,
max(f ) = λmax = max λk
(8.3.5)
k
|
|
and λk = eigenvalue of A A. Note further that A A is symmetric so the eigenvalues are real
and therefore f (x) ≥ 0 and λk ≥ 0.
Example 8.13. Given
and
1 2
A=
0 2
(8.3.6)
1 2
AA=
.
2 8
(8.3.7)
|
Then,
1 − λ
2
,
det A A − λI = 2
8 − λ
|
= (1 − λ) (8 − λ) − 4,
= λ2 − 9λ + 4.
So,
(8.3.8a)
(8.3.8b)
(8.3.8c)
√
81 − 16
,
(8.3.9a)
λ1,2 =
√2
9 ± 65
=
,
(8.3.9b)
2
and
√
9 + 65
λmax =
.
(8.3.10)
2
Therefore:
√
9 + 65
kAk2 =
≈ 2.9208 . . .
(8.3.11)
2
Now, kxk∞ ≤ kxk2 ≤ kxk1 . This inequality does not hold for matrices. Some properties,
(where U| U = I and V| V = I)
9±
• kAk2 = kA| k2
• kA| Ak2 = kAk22
A 0 • 0 B = max (kAk2 , kBk2 )
2
• kU| AUk2 = kAk2
• kA−1 k2 = √
1
λmin (A| A)
107
Nitsche and Benner
Unit 8. Norms
108
UNIT 9
Orthogonalization with Projection and
Rotation
9.1
Lecture 28 (cont.)
Inner Product Spaces
An inner product space V plus the the inner product.
Definition 9.1. Given a vector space V, an inner product is a function f : V × V → R or C
by f (x, y) = hx, yi such that
• hx, yi = hy, xi
• hx, αyi = α hx, yi, note hx, αyi = hy, xαi = α hy, xi = αhy, xi = α hx, yi
• hx + z, yi = hx, yi + hz, yi
• hx, xi ≥ 0 for any x ∈ V
• hx, xi = 0 implies x = 0
1. hx, yi = x| y with V = Rn and hx, yi = x∗ y with V = Cn , where
Example 9.2.
x∗ = x| .
2. hx, yiA = x| A| Ay with V = Rn and hx, yiA = x∗ A∗ Ay with V = Cn
√
This gives us a new norm kxkA = x| A| Ax = kAxk2 .
q´
q´
´b
b
b
0
3. hf, gi = a f (x)g(x) dx, V = C [a, b] and kf k =
f (x)f (x) dx =
|f (x)|2 dx
a
a
4. hf, gi =
´b
ω(x)f (x)g(x) dx where ω(x) ≥ 0
p
5. hA, Bi = tr(A| B) and kAk = tr(A| A) = kAkF
a
109
Nitsche and Benner
9.2
Unit 9. Orthogonalization with Projection and Rotation
Lecture 29: November 1, 2013
Inner Product Spaces
Reviewing properties of inner product spaces,
• hx, yi = hy, xi
• hx, αyi = α hx, yi
• hx + z, yi = hx, yi + hz, yi
• hx, xi ≥ 0 for any x ∈ V
• hx, xi = 0 implies x = 0
p
Now we may define norms kxk = hx, xi . Let’s say we want to define angles between
vectors and ky − xk2 = kxk2 + kyk2 − 2kxkkyk cos(θ). Rearranged,
cos(θ) =
=
=
=
−ky − xk2 + kxk2 + kyk2
,
2kxkkyk
hx, xi + hy, yi − hy − x, y − xi
,
2kxkkyk
hy, xi + hx, yi
,
2kxkkyk
hx, yi
,
kxkkyk
(9.2.1a)
(9.2.1b)
(9.2.1c)
(9.2.1d)
only if hx, yi ∈ R. For a more general definition hy, xi + hx, yi = hy, xi + hy, xi =
2 Re(hy, xi). So we would have the problem of the conjugate in finding the angle, but
have reduced this issue.
Definition 9.3. The angle between x, y is given by
cos(θ) =
hx, yi
.
kxkkyk
(9.2.2)
So, for x ⊥ y means hx, yi = 0.
Note: If the inner product is not a real number, then hx, yi = 0 means kxk2 + kyk2 =
ky − xk2 , but not vice-versa.
Example 9.4.

1
−2

x=
 3
−1



4
 1

and y = 
−2 .
−4
110
9.2. Lecture 29: November 1, 2013
Applied Matrix Theory
So x ⊥ y in hx, yi = x| y, but x 6 ⊥y in hx, yiA = x| A| Ay where,

1
0
A=
0
0
2
1
0
0
0
0
1
0

0
0
.
0
1
Definition 9.5. A set {u1 , . . . , un } is orthonormal if kuk k = 1 for any k and huj , uk i = 0
for any j 6= k.
Fourier Expansion
Given an orthonormal basis for V we can write x ∈ V as
x = c1 u1 + c2 u2 + · · · cn un
(9.2.3)
with hx, uj i = cj huj , uj i = cj .
Example 9.6. Given a series
´π
product −π f (x)g(x) dx.
n
√1
π
on
sin(kx)
is orthonormal with respect to the inner
k−1
´
´ 1−cos(2kx)
How do we compute theP
following integrals?
sin(kx) dx
=
dx So if f ∈
2
´π
n
1
1
span {sin(kx)} then f = √π k=1 ck sin(kx). Thus, ck = √π −π f (x) sin(kx) dx.
In homework we will approximate a line on [−π, π] with the sine and cosine Fourier series.
This is essentially the 2-norm approximation of the span of the Fourier series. The Gibbs
phenomena will be observed with overshoot of the sines and cosines above the function.
Thus, orthonormal bases are useful for partial differential equations applications.
Orthogonalization Process (Gramm-Schmidt)
Goal: Given basis {a1 , . . . , an } find an orthonormal basis {u1 , . . . , un } for V. This is the
orthogonalization process. Method: find uk such that span {u1 , . . . , un } = span {a1 , . . . , an }
for k = 1, . . . , n. Now let’s show the process.
k = 1:
u1 =
a1
ka1 k
k = 2:
u2 =
a2 − hu1 , a2 i u1
ka2 − hu1 , a2 i u1 k
111
Nitsche and Benner
Unit 9. Orthogonalization with Projection and Rotation
As an example of the orthogonality of u1 and u2
a2 − hu1 , a2 i u1
hu1 , a2 − hu1 , a2 i u1 i
=
,
ka2 − hu1 , a2 i u1 k
`2
1
= hu1 , a2 − hu1 , a2 i u1 i ,
`2
1
= [hu1 , a2 i − u1 hu1 , a2 i u1 ] ,
`2


1
= hu1 , a2 i − hu1 , a2 i hu1 , u1 i ,
| {z }
`2
(9.2.4a)
(9.2.4b)
(9.2.4c)
(9.2.4d)
1
= 0.
(9.2.4e)
k = 3:
...
k = k:
uk =
ak − hu1 , ak i u1 − hu2 , ak i u2 − · · · − huk−1 , ak i uk−1
.
kak − hu1 , ak i u1 − hu2 , ak i u2 − · · · − huk−1 , ak i uk−1 k
This is the Gramm–Schmidt orthogonalization process. If we want, we can write it as,
uk =
Here
(9.2.5)

Uk−1
9.3
(I − Uk−1 U∗k ) ak
.
k(I − Uk−1 U∗k ) ak k

|
|
= u1 · · · uk−1 .
|
|
Lecture 30: November 4, 2013
Gramm–Schmidt Orthogonalization
Given basis {a1 , . . . , an } find an orthonormal basis {u1 , . . . , un } for that spans the same
space. Algorithm,
a1
u1 =
,
(9.3.1a)
ka1 k
a2 − (u1 a2 ) u1
u2 =
,
(9.3.1b)
`2
(9.3.1c)
with using projections,
| hu1 , a2 i u1 = u1 a2 u1 ,
|
= u1 u1 a2 .
| {z }
P11
112
(9.3.2a)
(9.3.2b)
9.3. Lecture 30: November 4, 2013
Applied Matrix Theory
From,
a2 − (u1 a2 ) u1
,
ka2 − (u1 a2 ) u1 k
(I − u1 u|1 ) a2
,
=
k(I − u1 u|1 ) a2 k
= P⊥ a2 .
u2 =
(9.3.3a)
(9.3.3b)
(9.3.3c)
Example 9.7. Given the vectors,
 
0

a1 = 3 ,
4


−20
a2 =  27 ,
11


−14
and a3 =  −4
−2
Then we can find the orthogonal vectors,
 
0
1 
3 .
u1 =
5
4
(9.3.4)
Then,
v2 = a2 − hu1 , a1 i u1 ,



 
−20
−20
0
1
1





27 −
0 3 4
27
3 ,
=
5
5
11
11
4


 
−20
0
125  


27 −
3
=
25
11
4
(9.3.5a)
(9.3.5b)
···
(9.3.5c)
***
and
 
0
1
u1 = 3 ,
5
4


−20
1 
−12 ,
u2 =
25
−9


−15
1 
−16 .
and u3 =
25
12
113
(9.3.6a)
(9.3.6b)
(9.3.6c)
Nitsche and Benner
Unit 9. Orthogonalization with Projection and Rotation
Now rewriting our system,
a1
,
`1
a2 − r12 u1
u2 =
,
`2
a3 − r13 u1 − r23 u2
u3 =
,
`3
···
an − r1n u1 − r2n u2 − · · · − rn−1,n un−1
.
un =
`n
(9.3.7a)
u1 =
(9.3.7b)
(9.3.7c)
(9.3.7d)
(9.3.7e)
where rij = hai , uj i. Now in different vector form,
a1 = `1 u1 ,
a2 = r12 u1 + `2 u2 ,
a3 = r13 u1 + r23 u2 + `3 u3 ,
···
an = r1n u1 + r2n u2 + · · · + rn−1,n un−1 + `n un .
(9.3.8a)
(9.3.8b)
(9.3.8c)
(9.3.8d)
(9.3.8e)
We can put this in a matrix form. If A is full rank (must have m ≥ n). Since can have at
most m linearly independent vectors ai . With A = QR,



|
|
| |
|

a1 a2 · · · an 
= u1 u2
|
|
| |
| m×n

`1 r12 r13

 0 `2 r23
|



· · · un
 0 0 `3

.. . .
| m×n 
.
.
0 0
0
···
...
..
.
···

r1n
r2n 


r3n 
. (9.3.9)
.. 
. 
`n n×n
where rii = `i 6= 0 > 0 and R is invertible. This uniquely determines the Fourier coefficients
of the Fourier expansion of this system.
Thus, every matrix A of full rank has a unique decomposition, known as a QR factorization, Am×n = Qm×n Rn×n , where R is invertible. What do we know about Q| Q?
(Q| Q)ij = u|i uj which is zero for i 6= j and one for i = j. So Q| Q = In×n . These are
orthogonal matrices.
Decompositions of A:
• Am×n = Qm×n Rn×n , where Q| Q = I and R is invertible.
• A = LU if |Ak | =
6 0.
• PA = LU always exists.
Now what about QQ| ? It will be an m × m matrix, but otherwise we know little about it.
114
9.4. Lecture 31: November 6, 2013
Applied Matrix Theory
Example 9.8. Returning to our example,

 


0 −20 −14
0 −20/25 −15/25
5 25 r13
3
27 −4 = 3/5
12/25 −16/25 0 `2 r23 
4
11 −2
4/5 −9/25
12/25
0 0 `3
(9.3.10)
In this case Q has three linearly independent columns and three linearly independent rows.
|
So Q| has linearly independent columns. And interestingly (Q| ) Q| = QQ| = I. This is
an orthogonal matrix: it is both invertible and has orthogonal columns. In general this is
not the case because it is not n × n and QQ| is not necessarily the identity if m > n.
Use A = QR:
Example 9.9. Assume An×n invertible; solve Ax = b. Rewrite
QRx = b,
|
Q QRx = Q b,
|
Rx = Q b.
|
(9.3.11a)
(9.3.11b)
(9.3.11c)
This system is quick to solve (once Q and R are known).
Example 9.10. Assume Am×n full rank m > n then Ax = b is an overdetermined system
and least squares solution satisfies,
|
|
A Ax = A b,
| |
| |
R Q QRx = R Q b,
|
| |
R Rx = R Q b,
| −1
| |
Rx = R
R Q b,
|
Rx = Q b.
(9.3.12a)
(9.3.12b)
(9.3.12c)
(9.3.12d)
(9.3.12e)
Go through this proof and the solutions manual. Then we will see how well SVD can
improve things later.
9.4
Lecture 31: November 6, 2013
In homework the reduced QR factorization reffered to is where we can always write Am×n =
Qm×n Rn×n where Q| Q = I and Rn×n is triangular. This factorization is unique, but we
may also



 x x ··· x
|
| 
..
. x
0 x

QR = q1 · · · qn   . .
(9.4.1)

 .. . . . . . ... 
|
|
0 0 ··· x
115
Nitsche and Benner
Unit 9. Orthogonalization with Projection and Rotation
since {q1 , . . . , qn } is an orthogonal basis for R(A) ⊂ Rm . Now,


|
|
|
QR = q1 · · · qn qn+1
|
|
|
x x
0 x
.

. ...
|
.

· · · qm 
0 0

| m×m  0 0

..

.
0 0
···
···
..
.

x
x
.. 

.

· · · x

· · · 0


··· 0
(9.4.2)
m×n
In this case, the reduced QR is not unique.
Unitary (orthogonal) matrices
The unitary refers to the complex case and the orthogonal refers to the real.
Definition 9.11. A unitary matrix is Q ∈ Cn×n such that Q∗ Q = I. This means we have
Q has n orthogonal columns. Additionally, since Q is square we have n orthogonal rows.
So, Q ∈ Rn×n , with Q| Q = QQ| = In×n .
Properties
Some properties for a unitary Q:
• Q∗ Q = QQ∗ = In×n
• Q−1 = Q∗
• columns are orthonormal
• rows are orthonormal
• (Qx)∗ Qy = x∗ Q∗ Qy = x∗ y for any x, y.
Note: kQxk = kxk, so Q is an isometry. Also, If u, v unitary, then uv is unitary since,
(uv)∗ (uv) = v∗ u∗ uv,
= v∗ v,
= I,
= (uv) (uv)∗ .
(9.4.3a)
(9.4.3b)
(9.4.3c)
(9.4.3d)
Example 9.12. Q in full QR factorization of any A. In Matlab, [q, r] = qr(a) (is this
full QR?) and [q, r] = qr(a,0) (is this reduced QR?).
Now to compute the QR factorization, the Gramm–Schmidt algorithm is not numerically
stable. Thus, small changes in the input matrix values can cause large changes in the result.
The alternative is the modified Gramm–Schmidt which improves the stability properties. We
116
9.4. Lecture 31: November 6, 2013
Applied Matrix Theory
will not cover this here, but it is discussed in future courses. A better algorithm is to obtain
the QR by premuliplying by orthogonal matrices until it is triangular, or
Qn · · · Q1 A = R,
| {z }
(9.4.4)
Q∗
then A = QR. This is better because it does not use projections, which are not orthogonal.
Rotations of orthogonal matrices as well as reflections are useful to introduce zeros. As an
example,
Rotation
cos(θ) − sin(θ)
.
sin(θ) cos(θ)
Example 9.13. Rotation in the xy plane about the origin. So the matrix P =
cos(θ) sin(θ)
−1
Now, P = P−θ =
= P| . This again shows that it is orthogonal, par− sin(θ) cos(θ)
ticularly the columns are orthogonal. These are rotations in the plane.
Example 9.14. 3D Rotation
Rotation in three dimensions about the z-axis: This is very similar,


cos(θ) − sin(θ) 0
P =  sin(θ) cos(θ) 0 .
0
0
1
(9.4.5)
this rotates in the xy plane.
We can further rotate in any plane ij for some vector in Rn ;
i

j

1
i

P= 

j
− sin(θ)
cos(θ)


,


1
sin(θ)
cos(θ)
(9.4.6)
1

x1
..
.









xi−1


cos(θ)xi − sin(θ)xj 




xi+1


.

.
..
Px = 



x


j−1
sin(θ)x − cos(θ)x 

i
j


xj+1




.
.


.
xn
117
(9.4.7)
Nitsche and Benner
Unit 9. Orthogonalization with Projection and Rotation
This is called a Givens rotation. We can choose our θ
 

x x x x
x
x x x x  x

 
 
Pθ = 
x x x x  = x
x x x x  x
x x x x
0
such that (Qi x)j = 0. So if we have

x x x
x x x

x x x
(9.4.8)
.

x x x
x x x
So the QR factorization by Givens rotations,
Pθn · · · Pθ2 Pθ1 A = R.
{z
}
|
(9.4.9)
Q∗
Note, projections are not orthogonal. We can check this with PP∗ = I. However P(u1 ) = 0
and this means we have a non-trivial null-space so projections is not invertible. Therefore,
this is not invertible.
Reflection
Example 9.15. Suppose we have the vectors u and x, where kuk = 1. We want to reflect
x across the plane orthogonal to u. We will consider this operation Rx This operation is
also orthogonal. Now we will generalize a vector u⊥ = {v : v| u = 0}. So the orthogonal
projection onto u⊥ ; first we know hu, xi,
Px = x − hu, xi u,
= (I − uu∗ ) x,
Rx = (I − 2uu∗ ) x.
(9.4.10a)
(9.4.10b)
(9.4.10c)
where P is the projection onto the subspace and R is the reflection across the subspace.
Now R∗ = (I − 2uu∗ ) = R and R2 = I. This implies that R−1 = R∗ and R is orthogonal.
9.5
Homework Assignment 6: Due Monday, November
11, 2013
1 2
1. Let A =
. Find kAkp for p = 1, 2, ∞, F.
3 4
P
2. Show that kAk∞ = max j |aij | (Hint: make sure you understand how the analogous
i
formula for kAk1 was derived in class.)
defines a matrix
3. (a) Given a vector norm kxk, prove that the formula kAk = sup kAxk
kxk
x6=0
norm.
(This is called the induced matrix norm.)
(b) Show that for any induced matrix norm, kAxk ≤ kAkkxk.
(c) Prove that any induced matrix norm also satisfies kABk ≤ kAkkBk.
118
9.5. HW 6: Due November 11, 2013
Applied Matrix Theory
4. Consider the formula kAk = max |aij |
i,j
(a) Show that it defines a matrix norm.
(b) Show that it is not induced by a vector norm.
5. Meyer, Exercise 5.2.6
Establish the following properties of the matrix 2-norm.
(a) kAk2 =
(b)
(c)
max
|y∗ Ax|,
kxk2 =1, kyk2 =1
kAk2 = kA∗ k2 ,
kA∗ Ak = kAk22 ,
A 0 (d) 0 B = max {kAk2 , kBk2 } (take A, B to be real),
2
∗
(e) kU AVk2 = kAk2 when UU∗ = I and V∗ V = I.
q
1
−1
where λmin is the smallest eigenvalue of A| A.
6. Show that kA k = λmin
7. Show that hA, Bi = tr(A∗ B) defines an inner product.
8. Meyer, Exercise 5.3.4
For a real inner-product space with k ? k2 = h?, ?i, derive the inequality
kxk2 + kyk2
hx, yi ≤
.
2
Hint: Consider x − y.
9. Meyer, Exercise 5.3.5
For n × n matrices A and B, explain why each of the following inequalities is valid.
(a) |tr(B)|2 ≤ n[tr(B∗ B)].
(b) tr(B2 ) ≤ tr(B| B) for real matrices.
(c) tr(A| B) ≤
10. Given
tr(A| A)+tr(B| B)
2
for real matrices.

1
1
A=
1
0
(a)
(b)
(c)
(d)
(e)

0 −1
2
1
,
1 −3
1
1
 
1
1

and b = 
1.
1
Find an orthonormal basis for R(A), using the standard inner product.
Find the (reduced) QR decomposition of A.
For the matrix Q in (b), compute Q| Q and QQ| .
Find the least squares solution of Ax = b, using your results above.
Determine the Fourier expansion of b with respect to the basis you found in (a).
11. Explain why the (reduced) QR factorization of a matrix A of full rank is unique.
119
Nitsche and Benner
Unit 9. Orthogonalization with Projection and Rotation
12. Meyer, Exercise 5.5.11
Let V be the inner-product space of real-valued continuous functions defined on the
interval [−1, 1], where the inner product is defined by
ˆ
1
hf, gi =
f (x)g(x) dx ,
−1
and let S be the subspace of V that is spanned by the three linearly independent
polynomials q0 = 1, q1 = x, q2 = x2 .
(a) Use the Gram–Schmidt process to determine an orthonormal set of polynomials
{p0 , p1 , p2 } that spans S. These polynomials are the first three normalized Legendre
polynomials.
(b) Verify that pn satisfies Legendres differential equation
(1 − x2 )y 00 − 2xy 0 + n(n + 1)y = 0
for n = 0, 1, 2. This equation and its solutions are of considerable importance in
applied mathematics.
9.6
Lecture 32: November 8, 2013
From last time:
Elementary orthogonal projectors
Let u, where kuk = 1, then the projection of a vector x onto the sub plane orthogonal
to u is P⊥ x = x − hu, xi u. And P|| = uu∗ and P = I − uu∗ . Now this projector,
P, is not orthogonal. This is because an orthogonal matrix has the form Q∗ = Q−1 or
Q∗ Q = QQ∗ = I. Now,
P∗⊥ = I − (u∗ )∗ u∗ ,
= P⊥ .
(9.6.1a)
(9.6.1b)
P∗ P = P2 ,
= P,
6= I.
(9.6.2a)
(9.6.2b)
(9.6.2c)
This further gives
This property shows that once we project, projection a second time does not change the
result. Also, N(P) 6= 0, so the projectors are not invertible. Now the null space of P|| is
equal to u⊥ , or N(P|| ) = u⊥ . Similarly N(P⊥ ) = span(u).
120
9.6. Lecture 32: November 8, 2013
Applied Matrix Theory
Elementary reflection
Now Rx = x − hu, xi u, and in this case R is orthogonal. So R∗ = R and R∗ R = RR∗ = I.
Also,
(I − 2uu∗ ) (I − 2uu∗ ) = I − 2uu∗ − 2uu∗ + 4u (u∗ u) u∗ ,
= I − 4uu∗ + 4uu∗ ,
= I.
Now use reflectors to compute A = QR. So say we have


x x x
x x x

Ru = 
x x x .
x x x
(9.6.3a)
(9.6.3b)
(9.6.3c)
(9.6.4)
So Rx = (kuk, 0) = kukêi . Thus, u = x − kxkêi . Doing successive reflections,
Ru · · · Ru Ru A = R.
| N {z 2 }1
(9.6.5)
Q
This gives us the Householder method .
Complimentary Subspaces of V
Definition 9.16. If V = X + Y, where X , Y are subspaces such that X ∩ Y = {0}, which
are called complimentary subspaces and V = X ⊕ Y is the direct sum of X , Y.
Given the general picture, how do we define the angle between two subspaces? Note: If
V = X ⊕ Y then any z ∈ V can be written uniquely as z = x + y, for x ∈ X and y ∈ Y.
Further dim(V) = dim(X ) + dim(Y) and BV = BX ∪ BY .
Proof. If z = x1 + y1 = x2 + y2 then x1 − x2 = y1 − y2 ∈ X ∩ Y. So x1 − x2 = y1 − y2 = 0
and X ∩ Y = {0}.
Example 9.17. Say we have Rn = R(A) ⊕ N(A| ) for Am×n .
Projectors
Definition 9.18. We define general projectors: The projector P onto X along Y is the
linear operator such that P(z) = P(x + y) = x.
Note: If P projects onto X along Y then P2 = P because P2 (x+y) = P(x) = P(x+0) =
x = P(z). Now the null space, N(P) = y because P(z) = P(x + y) = x = 0. Further,
R(P) = x. Also, R(P) ⊕ N(P) = Rn as we showed in Homework 5.
Ultimately, we want to find the Jordan canonical form of our matrices. In general R(A)+
N(A) 6= Rn . This is obvious if Am×n because they have different dimensions, so this only
makes sense if An×n . But even if A is square, let y ∈ N(A) ∩ R(A) then Ay = 0 and
y = Az for some z. Then A (Az) = A2 z = 0, and we have a non-trivial intersection. So if
A2 has a nontrivial null space, then N(A) and R(A) have nontrivial intersection.
121
Nitsche and Benner
Unit 9. Orthogonalization with Projection and Rotation
Example 9.19. Obviously this cannot be an invertible matrix, so say we have
A=
0 1
0 0
0 0
and A =
0 0
2
This is an example of a null-potent matrix. But this is only true for projectors.
Theorem 9.20. P is a projector if and only if P2 = P. These are also known as idempotent
matrices.
9.7
Lecture 33: November 11, 2013
From last time:
Definition 9.21. P : V → V is a projector if for each X , Y such that V = X ⊕ Y and
V(x + y) for any z = x + y ∈ V.
Note: R(V) = X and N(V) = Y.
Projectors
Theorem 9.22. P is a projector if and only if P2 = P. These are also known as idempotent
matrices.
Proof. Given the vector space V and the operator P = P2 ,
R(P) ⊕ N(P) = V,
| {z } | {z }
X
(9.7.1a)
Y
P(x + y) = Px + Py,
= P Px0 ,
|{z}
(9.7.1b)
(9.7.1c)
x, some x0
= Px0 ,
= x.
(9.7.1d)
(9.7.1e)
Going the other way,
z = |{z}
Pz + (z − Pz),
| {z }
(9.7.2a)
V = R(P) ⊕ N(P).
(9.7.2b)
∈R(P)
∈N(P)
122
9.7. Lecture 33: November 11, 2013
Applied Matrix Theory
Representation of a projector
We discuss the representation of P. Given {m1 , . . . , mr } as a basis for R(P) = X and
{n1 , . . . , nn−r } as a basis for N(P) = Y. Then Pmi = mi and Pni = 0. Let B = [M | N].
Then
PB = P[M | N],
= [M | 0].
(9.7.3a)
(9.7.3b)
[P]s = P,
= [M | 0]B−1 ,
Ir×r 0
= [M | N]
B−1 ,
0
0
Ir×r 0
=B
B−1 ,
0
0
(9.7.4a)
(9.7.4b)
(9.7.4c)
(9.7.4d)
= [I]BS [P]B [I]−1
BS .
(9.7.4e)
Definition 9.23. For any subspace M ⊂ V, M⊥ = v ∈ V such that v⊥ u = 0, u ∈ M .
Theorem 9.24. For any subspace M ⊂ V, V = M ⊕ M⊥
Proof. Given basis {b1 , . . . , bm } of M, choose {bi } orthonormal complement by orthogonal
set {bm+1 , . . . , bn } such that {b1 , . . . , bm , bm+1 , . . . , bn } is a basis for V.
| {z } |
{z
}
basis for M
basis for M⊥
Example 9.25. Rn = R(A) ⊕ N(A| ) where R(A) ⊥ N(A| ). An orthogonal projector onto
M is PM is
I 0
PM = [M | N]
[M | N]−1 ,
(9.7.5a)
0 0
M∗ M = 0,
N∗ N = 0,
(9.7.5b)
(9.7.5c)
(9.7.5d)
Where


|
|
M = m1 · · · mm 
|
|
n×m
Note:

|
and N = nm+1
|

|
· · · nn 
.
| n×(n−m)
I 0
(M∗ M)−1 M∗
M N =
0 I
(N∗ N)−1 N∗
{z
}
|
(9.7.6)
[M | N]−1
123
(9.7.7)
Nitsche and Benner
Unit 9. Orthogonalization with Projection and Rotation
and
PM
∗ −1 ∗ (M M) M
= [M | 0]
,
(N∗ N)−1 N∗
(9.7.8a)
= M (M∗ M)−1 M∗ .
(9.7.8b)
But if the basis were orthonormal, how does this change the formula? Given any basis
{m1 , . . . , mm } for subspace M, orthogonal projector.
PM = M (M∗ M)−1 M∗ .
(9.7.9)
If {m1 , . . . , mm } are orthogonal then M∗ M = I and
PM = MM∗ .
(9.7.10)
Example 9.26. Elementary orthogonal projectors,
P|| = uu∗ .
(9.7.11)
P⊥ = I − uu∗
(9.7.12)
kx − PM xk22 = min kx − yk22 .
(9.7.13)
and
Theorem 9.27.
y∈M
(we will prove this as an exercise)
−1
Note: A (A| A) A| is the projector onto the range of A, or PR(A) where we assume
that A has full rank. The normal equations to solve Ax = b is
|
|
A Ax = A b.
and
(9.7.14)
| −1 |
A b.
x= AA
{z
}
|
(9.7.15)
pseudoinverse
So,
|
Ax = A A A
= PR(A) b.
9.8
−1
|
A b,
(9.7.16a)
(9.7.16b)
Lecture 34: November 13, 2013
Projectors
We discussed a projector P onto X along Y and also that the projector is idempotent,
P2 = P. Further, R(P) = X and N(P) = Y.
Ir×r 0
[P]S = [M | N]
[M | N]−1 .
(9.8.1)
0
0
124
9.8. Lecture 34: November 13, 2013
Applied Matrix Theory
The orthogonal projector onto M = R(M), where M = [m1 · · · mm ] is the basis of M,
P = M (M∗ M)−1 M∗
(9.8.2)
The normal equations for Ax = b, with A being a full rank matrix, are
Ax = PR(A) b.
(9.8.3)
Projector P is orthogonal, then P∗ = P.
Proof. P is an orthogonal projector,
P = M (M∗ M)−1 M∗ ,
∗
∗
−1
P = M (M M)
= P.
(9.8.4a)
∗
M,
(9.8.4b)
(9.8.4c)
further suppose that P = P2 and P = P∗ . Now we want to show that N(P) ⊥ R(P), where
it is normal in the standard inner product. Let x ∈ R(P) and y ∈ N(P). Then consider the
inner product,
y∗ x = y∗ Px,
= (|{z}
P∗ y)∗ x,
(9.8.5a)
(9.8.5b)
P
= (Py)∗ x,
| {z }
(9.8.5c)
= 0∗ x,
= 0.
(9.8.5d)
(9.8.5e)
0∗
if {mi } are orthogonal, PM = MM∗ .
Example 9.28.
P|| = uu∗ ,
P⊥ = I − uu∗ .
(9.8.6a)
(9.8.6b)
V = X ⊕ Y.
Decompositions of Rn
Given An×n , we know R(A) ⊕ N(A| ) = Rn and R(A| ) ⊕ N(A) = Rn , but R(A)⊥ = N(A| ).
Let B = { u1 , . . . , ur , ur+1 , . . . , un } orthonormal. Further B = { v1 , . . . , vr , vr+1 , . . . , vn }
| {z } |
| {z } |
{z
}
{z
}
basis for R(A| )
basis for R(A) basis for N(A| )
125
basis for N(A)
Nitsche and Benner
Unit 9. Orthogonalization with Projection and Rotation
also orthonormal. So,
|
|
U AV = UR(A) UN(A| ) A VR(A| ) VN(A) ,
| UR(A)
AVR(A| ) AVN(A) ,
=
|
UN(A| )
|
UR(A) AVR(A| ) 0
,
=
U|N(A| ) AVN(A) 0
Cr×r 0
=
.
0
0
UN(A| ) A
|
|
= A UN(A| ) ,
= 0.
(9.8.7a)
(9.8.7b)
(9.8.7c)
(9.8.7d)
(9.8.8a)
(9.8.8b)
Range Nullspace decomposition of An×n
Theorem 9.29. Rn = R(Ak ) ⊕ N(Ak ) for some k. This is not necessarily an orthogonal
decomposition. The smallest such k is called the index of A.
Proof. First, note that R(Ak+1 ) ⊂ R(Ak ) for any k. This is because if y ∈ R(Ak+1 ), then
y = Ak+1 z for some z, then y = Ak (Az). Second, R(A) ⊂ R(A2 ) ⊂ R(A3 ) ⊂ · · · ⊂
R(Ak ) = R(Ak+1 ) = R(Ak+2 ) = · · · contains equality for some k.
to be continued. . .
9.9
Homework Assignment 7: Due Friday, November
22, 2013
You may use Matlab to compute matrix products, or to reduce a matrix to Row Echelon
Form.
1. (a) Let A ∈ Rm×n . Prove R(A)

1

(b) Verify this fact for A = 2
1
and N(A| ) are orthogonal complements of Rm .

2 0
4 1 .
2 0
2. Prove: If X , Y are subspaces of V such that V = X ⊕ Y, then for any x ∈ V there
exists a unique x ∈ X and y ∈ Y such that z = x + y.
3. Prove: If X , Y are subspaces of V such that V = X +Y and dim(X )+dim(Y) = dim(V)
then X ∩ Y = {0}.
4. Textbook 5.11.3:
   
1
2 


   

2 4

Find a basis for the orthogonal complement of M = span  ,   .
0
1 





3
6
126
9.9. HW 7: Due November 22, 2013
Applied Matrix Theory
5. Let P be a projector. Let P0 = I − P.
(a) Show that P0 = I − P is also a projector. It is called the complementary projector
of P.
(b) Any projector projects a point z ∈ V onto X along Y, where X ⊕ Y = V, by
P(z) = P(x + y) = x. What are the X and Y for P and I − P, respectively?
6. Textbook 5.9.1:
Let X and Y be subspaces of R3 whose respective bases are
   
  
1 
 1
 1 




1 , 2
BX =
and BY = 2




1
2
3
(a) Explain why X and Y are complementary subspaces of R3 .
(b) Determine the projector P onto X along Y as well as the complementary projector
Q onto Y along X .
 
2

(c) Determine the projection of v = −1 onto Y along X .
1
(d) Verify that P and Q are both idempotent.
(e) Verify that R(P) = X = N(Q) and N(P) = Y = R(Q).
7. (a) Find the orthogonal projection of b = (4, 8)| onto M = span {u}, where u =
(3, 1)| .
(b) Find the orthogonal projection of b onto u⊥ , for b, u given in (a).
(c) Find the orthogonal projection of b = (5, 2, 5, 3)| onto
|
|
|
M = span (3/5, 0, 4/5, 0) , (0, 0, 0, 1) , (4/5, 0, 3/5, 0) .
(Note: the given columns are orthonormal.)
(d) Find the orthogonal projection of b = (1, 1, 1)| onto the range of


1 0
A = 2 1 
1 0
8. (a) Show that kPk2 ≥ 1 for every projector P 6= 0. When is kPk2 = 1?
(b) Show that kI − Pk2 = kPk2 for all projectors P 6= 0, I.
9. (a) Show that the eigenvalues of a unitary matrix satisfy |λ| = 1. Show by a counterexample that reverse not true.
(b) Show that the eigenvalues of a projector are either 0 or 1. Show by a counterexample that the reverse not true.
10. Let u be a unit vector. The elementary reflector about u⊥ is defined to be R = I−2uu∗ .
127
Nitsche and Benner
Unit 9. Orthogonalization with Projection and Rotation
(a) Prove that all elementary reflectors are involutory (R2 = I), hermitian, and unitary.
(b) Prove that if Rx = µêi , then µ = ±kxk2 , and that R:i = Rêi = ±x.
(c) Find the elementary reflector that maps x = 13 (1, −2, −2)| onto the x-axis.
(d) Verify by direct computation that your reflector in (c) is symmetric, orthogonal,
involutory.
(e) Extend the vector x in (c), to an orthonormal basis for R3 . (Hint: what do you
know about the columns of R from parts (a,b) above?)
11. Textbook 5.6.17:
Perform the following sequence of rotations in R3 beginning with
 
1

1
v0 =
−1
1. Rotate v0 counterclockwise 45° around the x-axis to produce v1 .
2. Rotate v1 clockwise 90° around the y-axis to produce v2 .
3. Rotate v2 counterclockwise 30° around the z-axis to produce v3 .
Determine the coordinates of v3 as well as an orthogonal matrix Q such that Qv0 = v3 .


−2 0 −4
4. Find its core-nilpotent decomposition.
12. (a) Find the index of A =  4 2
3 2
2
(b) A matrix is said to be nilpotent if Ak = 0 for some k. Show that the index of
a nilpotent matrix is the smallest k for which Ak = 0. Find its core-nilpotent
decomposition.
(c) Find the index of a projector that is not the identity. Find its core-nilpotent
decomposition.
(d) What is the index of the identity?
9.10
Lecture 35: November 15, 2013
Range Nullspace decomposition of An×n
Theorem 9.30. For any An×n and some k; Rn = R(Ak ) ⊕ N(Ak ). The smallest such k is
called the index of A.
Example 9.31. Nilpotent matrices have some k such that Nk = 0, R(Nk ) = {0}, and
N(Nk ) = Rn
Proof. First, note that R(Ak+1 ) ⊆ R(Ak ) for any k. This is because if y ∈ R(Ak+1 ), then
y = Ak+1 z for some z, then y = Ak (Az). Second, R(A) ⊂ R(A2 ) ⊂ R(A3 ) ⊂ · · · ⊂
R(Ak ) = R(Ak+1 ) = R(Ak+2 ) = · · · contains equality for some k. The dimensions decrease
128
9.10. Lecture 35: November 15, 2013
Applied Matrix Theory
if proper. Third, once equality achieved, it is maintained through the rest of the chain. The
proof:
R(Ak+2 ) = R(Ak+1 A),
(9.10.1a)
= AR(Ak+1 ),
(9.10.1b)
= AR(Ak ),
(9.10.1c)
k
= R(A ).
(9.10.1d)
Fourth, N(A0 ) ⊂ N(A) ⊂ N(A2 ) ⊂ · · · ⊂ N(Ak ) = N(Ak+1 ) = N(Ak+2 ) = · · ·. Why
does the nullspace change at the same spot as the columnspace? Because dim(N(Ak )) =
n−dim(R(Ak )), so once the dimensions are constant in the columnspace, then the dimensions
will be constant for the nullspace. Fifth, R(Ak ) ∩ N(Ak ) = {0}: Let y ∈ R(Ak ) and
y ∈ N(Ak ), then y = Ak x for some x, and Ak y = 0. So A2k x = 0 and x ∈ N(A2k ) = N(Ak )
k
so A
x = 0. Sixth, R(Ak ) + N(Ak ) = Rn since the dimensions add up and there is no
|{z}
y
intersection of the two spaces (except for {0}).
Now, how can we factor the matrix?
Corresponding factorization of A
k
k
Let
{x1 , . . . , xr } be a basis
for R(A ) and y1 , . . . , yn−r be a basis for N(A ). Then S = x1 , . . . , xr , y1 , . . . , yn−r , and we note that X = span {x1 , . . . , xr } and Y = span y1 , . . . , yn−r
which are both invariant subspaces. So
Cr×r
0
−1
S AS =
.
(9.10.2)
0
N(n−r),(n−r)
k
Note S−1 Ak S = (S−1 AS) because the inverse and normal S terms cancel out in the exponentiation. Thus,
C̃ 0
−1 k
S A S=
,
(9.10.3a)
0 Nk
= S−1 Ak X Y ,
(9.10.3b)
= S−1 Ak X Ak Y ,
(9.10.3c)
−1
Ak X 0 ,
=S
(9.10.3d)
−1 k
= S A X 0 .
(9.10.3e)
Thus Nk = 0 and N is nilpotent and C is invertible. So we have a core-nilpotent factorization
of A. So we have a similarity factorization which always exists. We recall the decomposition
for any A ∈ Rn×n = R(A) ⊕ N(A| ) = R(A| ) ⊕ N(A), corresponding factorization
C 0
|
U AV =
.
(9.10.4)
0 0
129
Nitsche and Benner
Unit 9. Orthogonalization with Projection and Rotation
130
UNIT 10
Singular Value Decomposition
10.1
Lecture 35 (cont.)
Singular Value Decomposition
The singular value decomposition is a way to find the orthogonal matrices Un and Vn may
be found such that we may diagonalize A. Or


σ1 0 · · · 0 0 · · · 0
.. 
. .

.
. 
 0 σ2 . . .. ..
 . .

 ..
.. ... 0 0 · · · 0 


|
| |

Um · · · U2 U1 AV1 V2 · · · Vm = 
(10.1.1)
 0 · · · 0 σr 0 · · · 0 
 0 ··· 0 0 0 ··· 0 



 .
.
.
.
.
.
.
.
.
.
.
.
 .
. . 
.
. .
0 ··· 0 0 0 ··· 0
Theorem 10.1. For any Am×n there exists orthogonal U and V such that
|
Am×n = UDV ,
(10.1.2a)

σ1 0 · · · 0
.

.
 0 σ2 . . ..
. .
 ..
.. ... 0

= [U]m×m 
 0 · · · 0 σr
 0 ··· 0 0

.
..
..
 ..
.
.
0 ··· 0 0
0 ···
..
.
0
0
0
..
.
···
···
···
..
.
0 ···

0
.. 
.

0
 |
0
 V n×n
0

.. 
.
0
(10.1.2b)
where σi are real and greater than 0. Further σ1 ≥ σ2 ≥ · · · ≥ σr , where r = rank(A).
Definition 10.2. σi are the singular values of A.
Note:
131
Nitsche and Benner
Unit 10. Singular Value Decomposition
1. σi are uniquely determined, but U, V are not unique
2. rank(A) = rank(D)
3. kAk2 = kDk2
A 0 0 B = max (kAk2 , kBk2 )
2
4. If A is invertible,

σ1

0
A = U .
 ..
0
1
σ
A−1
 1
0
= V
 ..
.
0
1
σn

0
= Ṽ 
 ..
.
0
Now K(A) = kAk·kA−1 k =
σ1
σn

··· 0
.
.
σ2 . . ..  |
V ,
.. ..
.
. 0
· · · 0 σr

0 ··· 0
.
..
1
. .. 
σ2
 U| ,

.. ..
. 0
.
· · · 0 σ1n

0 ··· 0
.
..
1
. ..  |
σn−1
 Ũ .

..
..
. 0
.
· · · 0 σ11
0
(10.1.3a)
(10.1.3b)
(10.1.3c)
which means that we can have issues with singularities.
Example 10.3. Prove kI − Pk2 = kPk2 . What is the norm of P and of I − P? From
illustration we can use tangents to the unit ball. Then kPωk = k(I − P) ωk needs to be
shown.
10.2
Lecture 36: November 18, 2013
We will do review for exam on Friday.
Singular Value Decomposition
SVD:
Theorem 10.4. For any Am×n there exists orthogonal U, V such that
|
Am×n = Um×m Dm×n Vn×n ,
132
(10.2.1)
10.2. Lecture 36: November 18, 2013
Applied Matrix Theory
where

σ1
0
···
...
0
..
.
0 ···
..
.

0
.
 ..

D=
0
0

.
 ..
σ2
... ...
···
···
0
0
..
.
0
σr
0
..
.
0
0
0
..
.
0
···
0
0
0 ···
···
···
···
..
.

0
.. 
.

0

0

0

.. 
.
0 m×n
(10.2.2)
and σi > 0, σ1 ≥ σ2 ≥ · · · ≥ σr > 0.
Notes:
1. kAk2 = σ1 , kA−1 k2 =
1
,
σn
where A would have to be invertible.
The condition number is κ(A) =
σ1
.
σn
2. r = rank(A).
3. |det(A)| =
Qn
i=1
σi .
4. A−1 = VD−1 U| .
Existence of the Singular Value Decomposition
Proof. We know that there exists U and V such that,
|
UA V =
C 0
,
0 0
(10.2.3)
where Cr×r is invertible. Let x be such that kxk = 1, and
kCk2 = max kCyk2 ,
(10.2.4a)
= kCxk2 ,
= σ1 ,
= kAk2 .
(10.2.4b)
(10.2.4c)
(10.2.4d)
kyk2 =1
Let y =
Cx
kCxk2
and further the two orthogonal matrices, [x | X] and [y | Y]. Now,
|
C 0
y
Cx CX ,
[y | Y]
[x | X] =
|
0 0
Y
|
y Cx y| CX
=
Y| Cx Y| CX
|
133
(10.2.5a)
(10.2.5b)
Nitsche and Benner
Unit 10. Singular Value Decomposition
Further,
x| C| Cx
y Cx =
,
kCxk2
|
kCxk22
,
=
kCxk2
= kCxk2 ,
= σ1 .
(10.2.6a)
(10.2.6b)
(10.2.6c)
(10.2.6d)
Similarly, YCx = 0,
|
x| C| CX
,
kCxk2
x| C| Cxx| X
=
,
kCxk2
x| C| Cx |
=
x X,
kCxk2
|
= σ1 x
X ,
|{z}
y CX =
(10.2.7a)
(10.2.7b)
(10.2.7c)
(10.2.7d)
orthogonal
= 0.
(10.2.7e)
So we have reduced to,
σ1 0
.
|
0 C̃
We may then repeat this by maximizing the two-norm to get the full singular value decomposition.
Notes:

Am×n
··· 0 0
. .

.
 0 σ2 . . .. ..
. .
 ..
.. ... 0 0

= [U]m×m 
 0 · · · 0 σr 0
 0 ··· 0 0 0

.
..
.. ..
 ..
.
. .
0 ··· 0 0 0

σ1 0


|
|

 0 σ2
= u1 · · · ur 
. .
..
.
|
| m×r .
0 ···
σ1
0

··· 0
.. 
.

· · · 0

|
[V ]n×n ,
· · · 0

· · · 0

. . .. 
. .
· · · 0 m×n



··· 0
− v|1 −
.. 
..
. . 

..
,
 

.
..
|
. 0
− vr − r×n
0 σr r×r
= ÛD̂V̂.
(10.2.8a)
(10.2.8b)
(10.2.8c)
134
10.2. Lecture 36: November 18, 2013
Applied Matrix Theory
from trimming out the zeros. Here σ1 , . . . , σr are unique, and u1 , . . . , ur and v1 , . . . , vr are
unique up to sign.
From the existence of A = UDV| , what can we deduce? We know that U| U = UU| = I
and V| V = VV| = I. So,
[AV]:j = [UD]:j ,
 
0
 .. 
.
 
0
 
= U σ j  ,
 
0
.
 .. 
0
(10.2.9a)
(10.2.9b)
= σj uj .
(10.2.9c)
σj uj , 1 ≤ j ≤ r
,
0,
j>r
(10.2.10a)
where (AB):j = AB:j . Now,
Avj =
|
|
|
A = VD U ,
|
|
A U = VD, A uj =
σj vj , 1 ≤ j ≤ r
.
0,
j>r
(10.2.10b)
(10.2.10c)
So, the four fundamental subspaces are
• R(A) = span {u1 , . . . , ur }
• N(A) = span {vr+1 , . . . , vn }
• R(A| ) = span {v1 , . . . , vr }
• N(A| ) = span {ur+1 , . . . , um }
|
AA
|
n×n
|
|
= VD U UDV ,
|
(10.2.11a)
|
= VD DV ,
 2
σ1 0 · · · 0
.

.
 0 σ22 . . ..
. .
 ..
.. ... 0

2
= V
 0 · · · 0 σr
 0 ··· 0 0

.
..
..
 ..
.
.
0 ··· 0 0
|
| A AV :j = VD D :j ,
1
σj vj , j ≤ r
|
A Avj =
.
0,
j>r
135
(10.2.11b)
0 ···
..
.
0
0
0
..
.
···
···
···
...
0 ···

0
.. 
.

0

|
V ,
0

0

.. 
.
0 n×n
(10.2.11c)
(10.2.11d)
(10.2.11e)
Nitsche and Benner
Unit 10. Singular Value Decomposition
p
Thus, σj = λs (A| A) , for j = 1, . . . , r. Similarly, vj are the eigenvectors of A| A for j =
1, . . . , r and vj are orthogonal because eigenvectors of symmetric matrices are orthogonal.
To construct the SVD, we will
1. find λj , which are the eigenvalues of A| A and the eigenvectors of A| A, vj .
2. Find u1 , . . . , ur for σj uj = Avj
3. Find complementary orthogonal set ur+1 , . . . , um and vr+1 , . . . , vn .
10.3
Lecture 37: November 20, 2013
Review and correction from last time
From last time:
C 0
U AV =
0 0
|
(10.3.1)
Then we said there exists an x such that kCxk = kCk2 = σ1 . Then we let y = Cx
Consider,
σ1
[x | X] and [y | Y]. In our system (we must correct this from last lecture), since we know that
the x is the eigenvector corresponding to the λ and C| Cx = λx. Then x| C| C = x| λ = λx|
x| C| CX
,
σ1
λx| X
,
=
σ1
= 0.
|
y CX =
(10.3.2a)
(10.3.2b)
(10.3.2c)
SVD will not be on the exam, but will be on the final.
Singular Value Decomposition
We know,
|
(10.3.3a)
|
(10.3.3b)
VA = UD.
(10.3.4)
A = UDV ,
= ÛD̂V̂
Similarly,
This means that
Avj =
σj uj , j ≤ r
0,
j>r
(10.3.5)
σj v j , j ≤ r
0,
j>r
(10.3.6)
Then,
|
A uj =
Thus,pvj are called the right singular vectors, the uj are the left singular vectors, and
σj = λ(A| A) are the singular values. Also we may define the four subspaces,
136
10.3. Lecture 37: November 20, 2013
Applied Matrix Theory
• R(A) = span {u1 , . . . , ur }
• N(A) = span {vr+1 , . . . , vn }
• R(A| ) = span {v1 , . . . , vr }
• N(A| ) = span {ur+1 , . . . , um }
So if we have the SVD, it is easy to describe these subspaces. So we will construct the SVD
using these facts. Now,
|
|
|
|
A A = VD U UDV ,
|
|
= VD DV ,
|
|
A AV = VD D.
(10.3.7a)
(10.3.7b)
(10.3.7c)
1 1
A=
.
2 2
(10.3.8)
Example 10.5. Given
Then r = 1 and
1
A A=
1
5
=
5
|
2
2
5
5
1 1
,
2 2
(10.3.9a)
(10.3.9b)
|
A Av = λv,
|
det A A − λI = 0,
5 − λ
5 = 25 − 10λ + λ2 − 25,
5
5 − λ
(10.3.10a)
(10.3.10b)
(10.3.10c)
= λ2 − 10λ,
= λ (λ − 10) .
(10.3.10d)
(10.3.10e)
So to find v1 for (A| A − λI)v = 0.
5 − 10
5
v = 0,
5
5 − 10
−5 5
v1
0
=
,
5 −5
v2
0
(10.3.11a)
−5v1 + 5v2 = 0,
1
1
v1 = √
2 1
v2 = v1 ,
(10.3.11b)
(10.3.11c)
(10.3.11d)
So
1
1
1 2
1
2
√
Av1 =
=√
1 2
1
2
2 4
137
(10.3.12a)
Nitsche and Benner
Unit 10. Singular Value Decomposition
Thus,
1
2
,
u1 = √
20 4
1
1
=√
5 2
(10.3.13a)
(10.3.13b)
1
1 √ 1
A= √
10 √ 1 1 ,
5 2
2
|
= ÛD̂V̂ ,
√
1
1
1 2
1 1
10 0
√
=√
,
0
0
5 2 −1
2 −1 1
|
= UDV
(10.3.14a)
(10.3.14b)
(10.3.14c)
(10.3.14d)
This is great to do by hand, but is not a very numerically stable way to find the SVD.
Geometric interpretation
The image of the unit sphere S2 = {x ∈ Rn , kxk2 = 1}
y = Ax,
|
= UDV x,
|
|
U y = DV x.
(10.3.15a)
(10.3.15b)
(10.3.15c)
y0 = Dx0 ,
yj0 = σj x0j .
(10.3.16a)
(10.3.16b)
Let y0 = U| x and x0 = V| x. So
2
Now kxk22 = 1 and kx0 k22 = kV| xk2 = 1. Thus,
2
2
2
(x01 ) + (x02 ) + · · · + (x0n ) = 1,
0 2 0 2
0 2
y1
y2
yn
+
+ ··· +
= 1,
σ1
σ2
σn
(10.3.17a)
(10.3.17b)
which is a hyperellipse! Viewing the transformation of Axj = σj uj . This shows that the σj
give the major and minor axes of the multi-dimensional ellipsoid.
There is a nice fact about the SVD. For low rank approximations (the second step may
be rationalized easily from the matrix form)
|
A = UDV ,
r
X
|
=
σj uj vj .
(10.3.18a)
(10.3.18b)
j=1
This is a way to write any matrix as a sum of rank 1 matrices. Now the σj decrease, so we may
P
truncate the series when σj gets close to zero. Let Ak = kj=1 σj uj v|j with rank(Ak ) = k.
138
10.4. Lecture 38: November 22, 2013
Applied Matrix Theory
Theorem 10.6. kA − Ak k2 = σk+1 and is the best approximation, or
kA − Ak k2 =
From


..
.
σk
σk+1
..
.
σr
0
..






= U






kA − Bk2 .
(10.3.19)

σ1






Ak = U






min
rank(B)=k
.
0
σk+1
..
.
σr


σ1



..
.









σk



 |

 |
0
V − U
V ,
.



.
.






0






..
.
.. 

. 
0
0
(10.3.20a)






 |
V





..
. 
0
(10.3.20b)
We will explore the proof and implications of this theorem later.
10.4
Lecture 38: November 22, 2013
Review for Exam 2
From homework, we need to be able to go through proofs like this
• kAk∞ ⇐⇒ kAk1
• Matrix norm
• QR unique
• kA−1 k2 = √
• kAk2 =
1
λmin (A| A)
p
λmax (A| A)
Norms
To show that something is a norm (whether for matrices or vectors), we must show the following
properties:
1. ‖x‖ ≥ 0 for any x, and ‖x‖ = 0 implies x = 0
2. ‖αx‖ = |α|‖x‖
3. ‖x + y‖ ≤ ‖x‖ + ‖y‖
Several matrix norms, the induced norms and the Frobenius norm, have the fourth (submultiplicative) property,
    ‖AB‖ ≤ ‖A‖‖B‖.                                  (10.4.1)
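A quick numerical sanity check of (10.4.1) in Matlab (random matrices; a check, not a proof):

  A = randn(4); B = randn(4);
  norm(A*B, 2)    <= norm(A,2)*norm(B,2)            % returns 1 (true)
  norm(A*B,'fro') <= norm(A,'fro')*norm(B,'fro')    % Frobenius norm is also submultiplicative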
More major topics
The exam covers chapters 4 and 5 (minus the SVD). These are things to know:
• Subspace (closed under addition and scalar multiplication)
• Linear transformations (Definition: addition and scalar multiplication)
• Coordinates and change of bases
    [x]S = x,                                       (10.4.2a)
         = Σi xi êi = Ix,                           (10.4.2b)
         = Σi ci ui = Uc.                           (10.4.2c)
where c = [x]B and this is clearly a problem of inverting a matrix. The formula is
    c = [x]B ,                                      (10.4.3a)
      = [[ê1 ]B [ê2 ]B ··· [ên ]B ] [x]S ,          (10.4.3b)
where the bracketed matrix is U−1 .
So we really care about what the representation of some linear operator is in some basis:
    [T]B = [[T(u1 )]B [T(u2 )]B ··· [T(un )]B ],    (10.4.4a)
    [T(x)]B = [T]B [x]B .                           (10.4.4b)
• Change coordinates
    [T]B ∼ [T]′B ,                                  (10.4.5a)
    T = ST′ S−1 .                                   (10.4.5b)
• Least squares: Ax = b. The normal equations are
    A| A x̂ = A| b.                                 (10.4.6)
This connects with projections because
    Ax̂ = A(A| A)−1 A| b = Pb,                      (10.4.7)
where P is the orthogonal projector onto R(A). The solution is unique if the matrix has full
column rank, because then A| A is invertible.
• Projectors
Defined by P = P². Similarly we have the properties of the complementary projector (I − P);
their ranges are orthogonal to each other when P is an orthogonal projector, which satisfies
P∗ = P. (Recall a unitary matrix, called orthogonal when real, satisfies Q∗ Q = I, i.e.,
Q∗ = Q−1 .) A projector always projects onto its range. Know the proofs of the properties of
P and (I − P). Orthogonal projectors can be written as P = uu∗ (onto span{u}), P = UU∗
(onto R(U) when U has orthonormal columns), or I − uu∗ (onto the orthogonal complement).
• Gram–Schmidt is needed to orthogonalize a set of vectors.
• QR orthogonalization (using Gram–Schmidt)
Show A = QR is unique, with rjj > 0, Q orthonormal, and R upper triangular. Existence
and uniqueness? From the Gram–Schmidt construction process we know we can get
it because we can always construct it (a short code sketch of the construction appears at the
end of this review). GS was
    a1 = r11 q1 ,                                   (10.4.8a)
    a2 = r12 q1 + r22 q2 ,                          (10.4.8b)
    ···                                             (10.4.8c)
    an = r1n q1 + r2n q2 + ··· + rnn qn .           (10.4.8d)
Uniqueness: this also shows uniqueness directly, because you have these equations and may
invert them. (Invertibility) a1 = r11 q1 implies ‖a1 ‖ = ‖r11 q1 ‖ = |r11 |‖q1 ‖, so we may find
r11 and then q1 = (1/r11 ) a1 . Then induction proves this is true for all the other values of n.
First we show it is true for n = 1 (q1 and r11 are uniquely determined), then show that if it is
true for n = k then it is also true for n = k + 1. This is done by showing
r1,k+1 , . . . , rk+1,k+1 , qk+1 are uniquely determined:
    ak+1 = r1,k+1 q1 + ··· + rk,k+1 qk + rk+1,k+1 qk+1 .   (10.4.9a)
This is like a Fourier series, and we may take ⟨ak+1 , qj ⟩ = rj,k+1 ⟨qj , qj ⟩ = rj,k+1 for j < k + 1;
therefore all we have left is to find the vector
    rk+1,k+1 qk+1 = ak+1 − r1,k+1 q1 − ··· − rk,k+1 qk =: b,   (10.4.9b)
and we can do the same argument again to finish with rk+1,k+1 and qk+1 :
    ‖rk+1,k+1 qk+1 ‖ = ‖b‖,                         (10.4.10a)
    |rk+1,k+1 | ‖qk+1 ‖ = ‖b‖,                      (10.4.10b)
    |rk+1,k+1 | = ‖b‖   (since ‖qk+1 ‖ = 1),        (10.4.10c)
    rk+1,k+1 = ‖b‖,                                 (10.4.10d)
for positive rjj , and then qk+1 = b/rk+1,k+1 .
So we have several decompositions now to work with.
• Invariant subspaces will give a block diagonal form of the matrix.
We will have class on Wednesday.
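As referenced above, a minimal Matlab sketch of the classical Gram–Schmidt construction (the function name gsqr.m and the assumption that the columns of A are linearly independent are mine, not from the notes):

  function [Q,R] = gsqr(A)
  % classical Gram-Schmidt: A = Q*R with orthonormal columns in Q,
  % R upper triangular, and r_kk = ||b|| > 0 as in (10.4.10)
  [m,n] = size(A);
  Q = zeros(m,n);  R = zeros(n,n);
  for k = 1:n
      b = A(:,k);
      for j = 1:k-1
          R(j,k) = Q(:,j)' * A(:,k);     % Fourier coefficient <a_k, q_j>
          b = b - R(j,k) * Q(:,j);
      end
      R(k,k) = norm(b);
      Q(:,k) = b / R(k,k);
  end
  end

For example, with A = randn(5,3) and [Q,R] = gsqr(A), both norm(A − Q*R) and norm(Q'*Q − eye(3)) should be tiny; in floating point the modified Gram–Schmidt discussed earlier is preferred.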
10.5
Homework Assignment 8: Due Tuesday, December 10, 2013
You may use Matlab to compute matrix products, or to reduce a matrix to Row Echelon
Form.
1. Determine the SVDs of the following matrices (by hand calculation).
   (a) [3 0; 0 −2]
   (b) [0 2; 0 0; 0 0]
   (c) [1 1; 1 1]
2. Let A = [1 2; 0 2].
(a) Use Matlab to find the SVD of A. State U, Σ, V (4-decimal digit format is
fine).
(b) In one plot draw the unit circle C and indicate the vectors v1 , v2 , and in another
plot draw the ellipse AC (i.e. the image of the circle under the transformation x →
Ax) and indicate the vectors Av1 = σ1 u1 , Av2 = σ2 u2 . Use the axis(’square’)
command in Matlab to ensure that the horizontal and vertical axes have the
same scale.
(c) Find A1 , the best rank-1 approximation to A in the 2-norm. Find ‖A − A1 ‖2 .
3. Let A ∈ Rm×n , with rank r. Use the singular value decomposition of A to prove the
following.
(a) N(A) and R(A| ) are orthogonal complementary subspaces of Rn .
(b) Properties in 5.2.6 (b, c, d, e): Establish the following properties of the matrix 2-norm.
    (a) *
    (b) ‖A‖2 = ‖A∗ ‖2 ,
    (c) ‖A∗ A‖2 = ‖A‖2² ,
    (d) ‖[A 0; 0 B]‖2 = max {‖A‖2 , ‖B‖2 } (take A, B to be real),
    (e) ‖U∗ AV‖2 = ‖A‖2 when UU∗ = I and V∗ V = I.
(c) ‖A‖F = √(σ1² + σ2² + ··· + σr² ).
4. Show that if A ∈ Rn×n is symmetric then σj = |λj |.
5. Compute the determinants of the matrices given in 6.1.3 (a), 6.1.3 (c), 6.2.1 (b).
   (a) A = [1 2 3; 2 4 1; 1 4 4]
   (b) A = [1 2 −3 4; 4 8 12 −8; 2 3 2 1; −3 −1 1 −4]
   (c) [0 0 −2 3; 1 0 1 2; 2 1 −1 1; 0 2 −3 0]
6. (a) Show that if A is invertible, then det(A−1 ) = 1/ det(A).
   (b) Show that for any invertible matrix S, det(SAS−1 ) = det(A).
   (c) If A is n × n, show that det(αA) = α^n det(A).
   (d) If A is skew-symmetric, show that A is singular whenever n is odd.
   (e) Show by example that in general, det(A + B) 6= det(A) + det(B).
7. (a) Let An×n = diag {d1 , d2 , . . . , dn }. What are the eigenvalues and eigenvectors of A?
(b) Let A be a nonsingular matrix and let λ be an eigenvalue of A. Show that 1/λ is
an eigenvalue of A−1 .
(c) Let A be an n × n matrix and let B = A − αI for some scalar α. How do the
eigenvalues of A and B compare? Explain.
(d) Show that all eigenvalues of a nilpotent matrix are 0.
8. For each of the two matrices,
   A = A1 = [3 2 1; 0 2 0; −2 −3 0],   A = A2 = [−4 −3 −3; 0 −1 0; 6 6 5],
determine if they are diagonalizable. If they are, find
(a) a nonsingular P such that P−1 AP is diagonal.
(b) A100
(c) eA .
9. Use diagonalization to solve the system
   dx/dt = x + y,   dy/dt = −x + y,   x(0) = 100,   y(0) = 100.
10. 7.4.1
Suppose that An×n is diagonalizable, and let P = [x1 |x2 | · · · |xn ] be a matrix whose
columns are a complete set of linearly independent eigenvectors corresponding to eigenvalues λi . Show that the solution to u′ = Au, u(0) = c, can be written as
   u(t) = ξ1 e^{λ1 t} x1 + ξ2 e^{λ2 t} x2 + ··· + ξn e^{λn t} xn
in which the coefficients ξi satisfy the algebraic system Pξ = c.
11. 7.5.3
Show that A ∈ Rn×n is normal and has real eigenvalues if and only if A is symmetric.
12. 7.5.4
Prove that the eigenvalues of a real skew-symmetric or skew-Hermitian matrix must be
pure imaginary numbers (i.e., multiples of i).
13. 7.6.1
Which of the following matrices are positive definite?
   A = [1 −1 −1; −1 5 1; −1 1 5],   B = [20 6 8; 6 3 0; 8 0 8],   C = [2 0 2; 0 6 2; 2 2 4].
14. 7.6.4
By diagonalizing the quadratic form 13x² + 10xy + 13y², show that the rotated graph
of 13x² + 10xy + 13y² = 72 is an ellipse in standard form as shown in Figure 7.2.1 on
p. 505.
10.6
Lecture 39: November 27, 2013
We will have one more homework before the end. We will have a homework on SVD and
eigenvalues with the diagonalization, and we will be covering the Jordan Canonical Form
but may not be putting it on the homework. It will be due next Friday so we have time for
the solutions before the final. The final is cumulative and will be held on Wednesday.
Singular Value Decomposition
We know that A = UΣV| for any matrix A. Here Σ is a diagonal matrix. We may rearrange,
    AV = UΣ,                                        (10.6.1a)
    Avj = σj uj for j ≤ r,   Avj = 0 for j > r.     (10.6.1b)
The SVD is A = Σ_{j=1}^{r} σj uj v|j for a matrix of rank r. We may define Ak = Σ_{j=1}^{k} σj uj v|j and
have an approximation of rank k.
Theorem 10.7.
    ‖A − Ak ‖2 = σk+1 ,                             (10.6.2a)
               = min_{rank(B)=k} ‖A − B‖2 .         (10.6.2b)
In words, Ak is a best approximation of rank k of A in the 2-norm.
Proof. The first part is easily seen from the matrix form, since the singular values sit in the
diagonal matrix of the SVD. For the second part, assume there is a matrix B which has
rank k and satisfies ‖A − B‖2 < σk+1 . Then there exists a subspace W of
dim(W) = n − k such that Bw = 0 for any w ∈ W. For such a w,
    ‖Aw‖2 = ‖(A − B) w‖2 ,                          (10.6.3a)
          ≤ ‖A − B‖2 ‖w‖2 ,                         (10.6.3b)
          < σk+1 ‖w‖2 .                             (10.6.3c)
But there is a subspace V of dim(V) = k + 1 such that ‖Aw‖2 ≥ σk+1 ‖w‖2 for all w ∈ V, namely
V = span {v1 , . . . , vk+1 }. Since dim(V) + dim(W) > n, there exists w 6= 0 in V ∩ W. For
this w we must have both ‖Aw‖2 < σk+1 ‖w‖2 and ‖Aw‖2 ≥ σk+1 ‖w‖2 , which is a
contradiction. This proof is a little more elementary than the proof in the book.
Thus, we can approximate a matrix by lower-rank matrices. This is good because
then we have fewer entries to store and we reduce our cost.
SVD in Matlab
Example handed out in class: In Matlab, if you say load clown.mat and then type
whos, you will see a matrix X. This may be displayed with image(X). Then we do
[U,S,V] = svd(X). The first figure (Figure 10.1) plots the diagonal entries of S. So we see
we can truncate the small values. As we increase the approximations for k = 3, 10, 30 we
see a significantly improving image in Figure 10.2. So Ak = ŨΣ̃Ṽ| and this is done with
Ak = U(:,1:k) * S(1:k,1:k) * V(:,1:k)’. Now we see that for k = 30 we have a good
approximation which is significantly less expensive than the original matrix. Further in
Table 10.1 we observe that the relative error decreases significantly.
Listing 10.1. svdimag.m
% application of the SVD to image compression
% from "Applied Numerical Linear Algebra", by J. Demmel, page 114 (SIAM)
load clown.mat
% X is a matrix of pixels of dimension 200 by 320
[U,S,V] = svd(X);
%%
figure(1)
plot(diag(S));
set(gca,'FontSize',15)
xlabel('k')
ylabel('\sigma_k')
title('Singular values of X')
%%
figure(2)
ifont = 12;
colormap('gray')
subplot('position',[.07,.54,.40,.40])
k=3;  image(U(:,1:k)*S(1:k,1:k)*V(:,1:k)'); title('k=3')
set(gca,'FontSize',ifont)
set(gca,'XTickLabel','')
%
subplot('position',[.5,.54,.40,.40])
k=10; image(U(:,1:k)*S(1:k,1:k)*V(:,1:k)'); title('k=10')
set(gca,'FontSize',ifont)
set(gca,'YTickLabel','')
set(gca,'XTickLabel','')
%
subplot('position',[.07,.06,.40,.40])
k=30; image(U(:,1:k)*S(1:k,1:k)*V(:,1:k)'); title('k=30','FontSize',ifont)
set(gca,'FontSize',ifont)
%
subplot('position',[.5,.06,.40,.40])
image(X); title('original')
set(gca,'FontSize',ifont)
set(gca,'YTickLabel','')

Table 10.1. Relative error of SVD approximation matrix Ak
k     relative error σk+1 /σ1     compression ratio 520k/(200 · 320)
3     0.155                       0.024
10    0.077                       0.081
30    0.027                       0.244
Figure 10.1. Singular values σk of matrix X versus k.
Figure 10.2. Rank k approximations of the original image (panels: k = 3, k = 10, k = 30, original).
UNIT 11
Additional Topics
11.1
Lecture 39 (cont.)
The Determinant
We will quickly cover the essentials of chapter 6. The determinant is defined:
Definition 11.1.
    det(A) = Σp σ(p) a1p1 a2p2 ··· anpn             (11.1.1)
where the sum runs over all permutations p : (1, . . . , n) → (p1 , p2 , . . . , pn ). Also, σ(p) is the sign
of the permutation,
    σ(p) = +1 if an even number of exchanges is needed to obtain p from (1, . . . , n),
           −1 if an odd number of exchanges is needed to obtain p from (1, . . . , n).   (11.1.2)
If we have a non-zero determinant, then Ax = b has a unique solution.
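A minimal Matlab sketch of Definition 11.1 for small n (the function name detperm and the inversion-counting trick for σ(p) are mine; perms enumerates all n! permutations, so this is only for illustration):

  function d = detperm(A)
  % determinant via the permutation sum (11.1.1); only sensible for small n
  n = size(A,1);
  P = perms(1:n);                          % all n! permutations
  d = 0;
  for i = 1:size(P,1)
      p = P(i,:);
      ninv = 0;                            % number of inversions gives the parity of p
      for r = 1:n-1
          ninv = ninv + sum(p(r) > p(r+1:n));
      end
      d = d + (-1)^ninv * prod(A(sub2ind([n n], 1:n, p)));
  end
  end

For example, with A = rand(4), detperm(A) and det(A) agree to roundoff.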
Theorem 11.2. We have several interesting properties of determinants.
1. Triangular matrices:
    det [a11 a12 ··· a1n ; 0 a22 ··· a2n ; ··· ; 0 0 ··· ann ] = Π_{i=1}^{n} aii .   (11.1.3)
2. det(A| ) = det(A).
3. det(AB) = det(A) det(B).
4. If B is obtained from A by
   • exchanging row i with row j, then det(B) = − det(A);
   • multiplying row i by α, then det(B) = α det(A);
   • adding a multiple of row i to row j, then det(B) = det(A).
5. det(A) is a multilinear function of the rows (and of the columns) of A.
11.2
Lecture 40: December 2, 2013
Further details for class
Homework is due Friday; the latest it can possibly be turned in is Tuesday before 4:30 (to
get solutions in time). The final is on Wednesday at 7:30–9:30. (?)
Today we will cover eigenvalues and eigenvectors. Then on Wednesday we will cover
positive-definite matrices.
For the final, we will review on Friday. Some homework problems may safely be ignored
because they were too involved.
Diagonalizable Matrices
We know that for any matrix,
    A ∼ B                                           (11.2.1)
means
    A = SBS−1 .                                     (11.2.2)
Now we want to know when A ∼ D, where D is a diagonal matrix.
Eigenvalues and eigenvectors
Say we have the eigen-pair (λ, v), when
    Av = λv,                                        (11.2.3a)
    (A − λI) v = 0,                                 (11.2.3b)
which has a nonzero solution only when N(A − λI) is nontrivial. Thus we care about det(A − λI) = 0. So,
    det(A − λI) = det [a11 −λ, a12 , . . . , a1n ; a21 , a22 −λ, . . . , a2n ; . . . ; an1 , an2 , . . . , ann −λ],   (11.2.4a)
                = (a11 − λ)(a22 − λ) ··· (ann − λ) + powers of λ of degree ≤ n − 2,   (11.2.4b)
                = p(λ),                              (11.2.4c)
                = (−1)^n λ^n + (−1)^{n−1} λ^{n−1} (a11 + a22 + ··· + ann ) + lower order terms in λ^k , k ≤ n − 2,   (11.2.4d)
where the coefficient a11 + a22 + ··· + ann is tr(A). Also,
    p(λ) = (λ − λ1 )(λ − λ2 ) ··· (λ − λn ) (−1)^n ,   (11.2.4e)
         = (−1)^n λ^n + (−1)^{n−1} λ^{n−1} (λ1 + λ2 + ··· + λn ) + l.o.t.,   (11.2.4f)
         = (−1)^n λ^n + (−1)^n λ^{n−1} (−λ1 − λ2 − ··· − λn ) + l.o.t.,   (11.2.4g)
with the final step being from the fundamental theorem of algebra. From this we get the
following:
• Every matrix A has n eigenvalues.
• The sum Σ λk = tr(A).
• The product Π λk = p(0) = det(A).
• If A is triangular, det(A − λI) = Π (aii − λ) = 0, so the roots are simply the aii and λi = aii .
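A quick Matlab check of the trace and determinant facts (random example; any imaginary parts in the sums are roundoff):

  A   = randn(5);
  lam = eig(A);
  [sum(lam)  trace(A)]     % sum of eigenvalues equals the trace
  [prod(lam) det(A)]       % product of eigenvalues equals the determinant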
Example 11.3. For a little review, find the eigenvalues and the eigenvectors of
    A = [1 −1; 1 1].
So,
    det(A − λI) = det [1−λ −1; 1 1−λ] = (1 − λ)² + 1,   (11.2.5a)
                = λ² − 2λ + 2,                      (11.2.5b)
    λ1,2 = (2 ± √(4 − 8))/2,                        (11.2.5c)
         = 1 ± i.                                   (11.2.5d)
Then for λ1 = 1 + i:
    (A − λ1 I | 0) = [1−(1+i) −1 0; 1 1−(1+i) 0]    (11.2.6a)
                   = [−i −1 0; 1 −i 0],             (11.2.6b)
                   → [1 −i 0; −i −1 0],             (11.2.6c)
                   → [1 −i 0; 0 0 0].               (11.2.6d)
So,
    v1 − iv2 = 0,                                   (11.2.7a)
    v1 = iv2 ,                                      (11.2.7b)
    v1 = [i; 1].                                    (11.2.7c)
Then
    λ2 = 1 − i,                                     (11.2.8a)
    v2 = [−i; 1].                                   (11.2.8b)
Note that the eigenvectors v1 , v2 are linearly independent.
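A minimal Matlab check of this example (eig may scale the eigenvectors differently, but they are parallel to the ones above):

  A = [1 -1; 1 1];
  [V, D] = eig(A)          % eigenvalues 1+i and 1-i on the diagonal of D
  norm(A*V - V*D)          % should be ~0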
Note: If A has a linearly independent set of eigenvectors, then
    V = [v1 | v2 | ··· | vn ]
is invertible and Avj = λj vj . Then, for a diagonal matrix D with the eigenvalues along the
diagonal,
    (AV):j = (VD):j ,                               (11.2.9a)
    AV = VD,                                        (11.2.9b)
    A = VDV−1 .                                     (11.2.9c)
But not all matrices are diagonalizable.
Example 11.4. The matrix
    A = [1 1; 0 1]
has the double eigenvalue 1; λ1 = λ2 = 1. So,
    A − λI = [0 1; 0 0],                            (11.2.10a)
    dim(N(A − λI)) = 1.                             (11.2.10b)
Thus there is only one eigenvector.
Example 11.5. The matrix
    A = [1 0; 0 1]
has the double eigenvalue 1; λ1 = λ2 = 1. But here,
    A − λI = [0 0; 0 0],                            (11.2.11a)
    dim(N(A − λI)) = 2,                             (11.2.11b)
and there are two linearly independent eigenvectors,
    v1 = [1; 0]   and   v2 = [0; 1].
Example 11.6. A nonzero nilpotent matrix with N^k = 0 does not have a full set of eigenvectors.
This is because, for example,
    A ∼ [0 0 ··· 0; 1 0 ··· 0; ··· ; 0 ··· 1 0],    (11.2.12a)
so λ1 = λ2 = ··· = λn = 0 and dim(N(A − λI)) = dim(N(A)) = 1.
Theorem 11.7. If A has n distinct eigenvalues, then the corresponding eigenvectors are
linearly independent.
Proof. Assume that the {vk } are linearly dependent. Then we can write one of them as a
linear combination of a linearly independent subset of the other eigenvectors: vk = Σ_{j≠k} cj vj ,
where the {vj } are linearly independent and not all cj are zero. Then,
    (A − λk I) vk = (A − λk I) Σ_{j≠k} cj vj ,      (11.2.13a)
    0 = λk vk − λk vk                               (11.2.13b)
      = Σ_{j≠k} cj (Avj − λk vj ),                  (11.2.13c)
      = Σ_{j≠k} cj (λj − λk ) vj ,                  (11.2.13d)
      = Σ_{j≠k} αj vj ,                             (11.2.13e)
with αj = cj (λj − λk ) not all zero, since the cj are not all zero and λj ≠ λk . This however
means that the set {vj } is linearly dependent, which is a contradiction, so the assumption is
not possible. So the eigenvectors are linearly independent.
Now if A = VDV−1 then,
    A^k = VDV−1 VDV−1 ··· VDV−1 ,                   (11.2.14a)
        = VD^k V−1 .                                (11.2.14b)
Similarly we can do a power series. This will be useful in solving systems of differential
equations.
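A minimal Matlab sketch of using the diagonalization for matrix powers and the matrix exponential (the matrix below is an arbitrary diagonalizable example; expm is Matlab's built-in matrix exponential, used only for comparison):

  A = [2 1; 1 2];                        % symmetric, hence diagonalizable
  [V, D] = eig(A);
  k = 10;
  norm(V*D^k/V - A^k)                    % V*D^k*inv(V) reproduces A^k, ~0
  norm(V*diag(exp(diag(D)))/V - expm(A)) % the same idea gives e^A, ~0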
Index
backward substitution, 9
basic columns, 38
basis, 56, 66, 84
bilinear operator, 149
Cauchy–Schwarz inequality, 100
change of basis, 88
column space, 58
complementary projector, 127
complementary subspaces, 121
condition number, 27, 48
consistent system, 36
determinant, 149
diagonal matrix, 150
differentiation, 86
direct sum, 121
eigenvalues, 150
eigenvectors, vi, 150
elementary operations, 15
Euclidean norm, 19
exams, 73, 74
field, 55
finite difference, 2, 44
four fundamental subspaces, 58
Frobenius norm, 101
fundamental theorem of algebra, 65, 150
geometric series, 46
Givens rotation, 118
Gram–Schmidt orthogonalization, 112
homogeneous solutions, 39
Householder method, 121
idempotent matrices, 122
idempotent operator, 92
ill-posed, 20
induced norm, 104
inner product, 109
interpolation, 63
invariant subspace, 91
isometry, 116
Laplace equation, 2
least squares, 69
left null space, 58
linear function, 39
linear system, 1
linear transformation, 83
action, 83, 87
linearly dependent, 66
linearly independent, 57, 63
lower triangular, 25
lower triangular system, 5
matrix form, 1
matrix norm, 101
minimization, 74
modified Gram–Schmidt, 116
nilpotent matrix, 128
nilpotent operator, 92
nonbasic columns, 38
norm, 47, 99
normal equations, 71
null space, 58
operation count, 9
order, 3
orthogonal projector, 123
orthogonalization, 111
orthonormal, 111
orthonormal basis, 111
partial differential equations, 111
particular solution, 38
periodic boundary conditions, 44
perturbations, 42
pivoting, 19, 22
PLU factorization, 22
projection, 118
QR factorization, 114
rank, 61
reduced row echelon form, 35
reflection, 118
review, 140
rotation, 117
row echelon form, 31
row space, 58
self-similar, 89
Sherman–Morrison formula, 44
singular value decomposition, 131
singular values, 131
smallest upper bound, 102
spanning set, 56
sparsity, 18
submatrices, 26
subspaces, 67
Taylor series, 3
trace, 40
tridiagonal matrix, 18
tuple, 92
Van der Monde matrix, 63
vector form, 1
vector space, 56
well-posed, 20
Figures
1.1  Finite difference approximation of a 1D boundary value problem . . . . . . . 2
2.1  One-dimensional discrete grids . . . . . . . . . . . . . . . . . . . . . . . 10
2.2  Two-dimensional discrete grids . . . . . . . . . . . . . . . . . . . . . . . 11
3.1  Plot of linear problems and their solutions . . . . . . . . . . . . . . . . . 21
4.1  Geometric illustration of linear systems and their solutions . . . . . . . . 36
4.2  Figures for Textbook problem 3.3.4 . . . . . . . . . . . . . . . . . . . . . 51
5.1  Basis vector of example solution . . . . . . . . . . . . . . . . . . . . . . 57
5.2  Interpolating system . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
6.1  Minimization of distance between point and a plane . . . . . . . . . . . . . 73
6.2  Parabolic fitting by least squares . . . . . . . . . . . . . . . . . . . . . 73
7.1  Figure 4.7.4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
10.1 Singular values σk of matrix X versus k . . . . . . . . . . . . . . . . . . 147
10.2 Rank k approximations of original image . . . . . . . . . . . . . . . . . . 147
Tables
3.1  Variation of error with the perturbation variable . . . . . . . . . . . . . . 20
10.1 Relative error of SVD approximation matrix Ak . . . . . . . . . . . . . . . 146
Listings
2.1 code stub for tridiagonal solver . . . . . . . . . . . . . . . . . . . . . . . . . 13
10.1 svdimag.m . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145