INFORMATION TO USERS

This manuscript has been reproduced from the microfilm master. UMI
films the text directly from the original or copy submitted. Thus, some
thesis and dissertation copies are in typewriter face, while others may
be from any type of computer printer.
The quality of this reproduction is dependent upon the quality of the
copy submitted. Broken or indistinct print, colored or poor quality
illustrations and photographs, print bleedthrough, substandard margins,
and improper alignment can adversely affect reproduction.
In the unlikely event that the author did not send UMI a complete
manuscript and there are missing pages, these will be noted. Also, if
unauthorized copyright material had to be removed, a note will indicate
the deletion.
Oversize materials (e.g., maps, drawings, charts) are reproduced by
sectioning the original, beginning at the upper left-hand corner and
continuing from left to right in equal sections with small overlaps. Each
original is also photographed in one exposure and is included in
reduced form at the back of the book.
Photographs included in the original manuscript have been reproduced
xerographically in this copy. Higher quality 6" x 9" black and white
photographic prints are available for any photographs or illustrations
appearing in this copy for an additional charge. Contact UMI directly
to order.
University Microfilms International
A Bell & Howell Information Company
300 North Zeeb Road, Ann Arbor, MI 48106-1346 USA
313/761-4700 800/521-0600
Order Number 1S5S117
New methods for super-resolution
Walsh, David Oliver, M.S.
The University of Arizona, 1993
UMI
300 N. Zeeb Rd.
Ann Arbor, MI 48106
NEW METHODS FOR SUPER-RESOLUTION
by
David Oliver Walsh
A Thesis Submitted to the Faculty of the
DEPARTMENT OF ELECTRICAL AND COMPUTER ENGINEERING
In Partial Fulfillment of the Requirements
For the Degree of
MASTER OF SCIENCE
WITH A MAJOR IN ELECTRICAL ENGINEERING
In the Graduate College
THE UNIVERSITY OF ARIZONA
1993
STATEMENT BY AUTHOR
This thesis has been submitted in partial fulfillment of requirements for an advanced degree at The University of Arizona and is deposited in the University Library to be made available to borrowers under rules of the Library.
Brief quotations from this thesis are allowable without special permission, provided that accurate acknowledgment of source is made. Requests for permission for
extended quotation from or reproduction of this manuscript in whole or in part may
be granted by the head of the major department or the Dean of the Graduate College
when in his or her judgment the proposed use of the material is in the interests of
scholarship. In all other instances, however, permission must be obtained from the
author.
SIGNED:
APPROVAL BY THESIS DIRECTOR
This thesis has been approved on the date shown below:
Pamela A. Nielsen
Assistant Professor of
Electrical and Computer Engineering
ACKNOWLEDGMENTS
"That which does not kill us, makes us stronger." -Friedrich Nietzsche
First and foremost I wish to thank my advisor, Dr. Pamela Nielsen, who deserves
credit for many of the new ideas contained in this thesis. I am particularly grateful for
the generous financial assistance which she provided to me. I also wish to thank Dr.
Donald Dudley and Dr. Michael Marcellin for taking the time to serve on my thesis
committee. I must thank Mr. David Marshall for serving as a mathematical reference
and for providing valuable feedback, and I would like to thank Mr. Justin Judkins for acknowledging me in his thesis. Thanks also go to Dr. Richard Ziolkowski
and Dr. Hal Tharp for providing excellent reference materials and computer software
which I used for this thesis. And, of course, I must thank my parents who have
supported me in all my endeavors.
In Memory of Noog (1981-1992)
TABLE OF CONTENTS

LIST OF FIGURES 7
LIST OF TABLES 9
ABSTRACT 10
1. INTRODUCTION 11
2. THE DIRECT METHOD 16
2.1. Description 16
2.2. Errors And Error Bounds 18
2.2.1. Types of Errors 19
2.2.2. Effects of Errors 25
2.2.3. Error Bounds 27
2.3. The Least Squares Solution 31
2.4. Results From Experimental Trials 32
2.4.1. Case 1 Trials—Gaussian Noise 33
2.4.2. Case 2 Trials—Uniform Noise 34
2.4.3. Results 35
2.4.4. Analysis of Trial Results 42
2.5. Summary 43
3. THE GERCHBERG ALGORITHM 44
3.1. Description of the Gerchberg Algorithm 45
3.1.1. Example 3.1.1 45
3.1.2. Error Energy Reduction 53
3.1.3. Discrete Implementation 54
3.1.4. The Method of Alternating Orthogonal Projections 54
3.1.5. Example 3.1.2 55
3.2. Relationship to the Direct Method 58
3.2.1. Example 3.2 60
3.3. The Overdetermined Case 65
3.3.1. Example 3.3 66
3.4. Error Bounds for the Gerchberg Algorithm 68
3.5. Summary, Advantages and Disadvantages 69
4. TERMINATION SCHEMES FOR THE GERCHBERG ALGORITHM 71
4.1. Convergence For The Gerchberg Algorithm 71
4.2. Results From Experimental Trials 74
4.2.1. Procedure 74
4.2.2. Results 75
4.2.3. Aliasing Error 79
4.3. Termination Schemes 82
4.3.1. Convergence Factor 82
4.3.2. 2nd Derivative of Energy 83
4.3.3. Statistically Optimum Termination 87
4.3.4. Comparison Between Termination Schemes 87
4.4. Summary 88
5. THE SVD METHOD 90
5.1. Eigenvector Expansion 91
5.2. SVD Expansion 93
5.2.1. Advantages 94
5.3. Example: Two Point Target 96
5.4. 2-D Example 106
5.5. Error Concentration 117
5.6. Summary 119
6. SUMMARY AND CONCLUSION 121
REFERENCES 123
LIST OF FIGURES

1.1. Effect of bandlimiting a time-limited function 12
1.2. Infinite frequency spectrum which has been bandlimited to ±10 Hz 13
2.1. The original time-limited function w(t) 20
2.2. The infinite frequency spectrum W(f) of the time-limited function w(t) 21
2.3. The periodic time-limited function w_s(t) 22
2.4. The periodic frequency spectrum W_s(f) of w_s(t) 23
2.5. Aliasing error between continuous frequency spectrum W(f) and discrete periodic frequency spectrum W_s(f) 24
2.6. A-priori error bound and error norms for 1000 trials (exactly determined case) 40
2.7. A-priori error bound and error norms for 1000 trials (overdetermined case) 41
3.1. Flow chart for the Gerchberg algorithm 46
3.2. Known portion of frequency spectrum 47
3.3. Time domain representation of known frequency spectrum 48
3.4. Figure 3.3 truncated to time-limited region 49
3.5. Frequency spectrum of Figure 3.4 50
3.6. Known portion of frequency spectrum replaced 51
3.7. Time domain representation of Figure 3.6 52
3.8. The method of alternating orthogonal projections 56
3.9. Convergence factor vs. iteration number for Example 3.2 62
3.10. Convergence to numerical limit of computer 64
3.11. Correction energy for Example 3.3 67
4.1. Example of reconstructed energy 84
4.2. Second derivative of total reconstructed energy from Figure 4.1 85
4.3. Normalized mean squared error between true solution and Gerchberg algorithm's latest estimate 86
5.1. Original time-limited object 97
5.2. Discrete frequency spectrum of original object 98
5.3. Known portion of frequency spectrum distorted by noise 99
5.4. Image from noisy, diffraction limited spectrum 100
5.5. Error norm for Gerchberg algorithm 101
5.6. Result from Gerchberg algorithm after 154 iterations 102
5.7. Result from SVD method: 6 singular values thrown out 103
5.8. Result from SVD method: 5 singular values thrown out 104
5.9. Original space-limited object 107
5.10. Frequency spectrum of original object 108
5.11. Frequency spectrum distorted by noise 109
5.12. Known portion of frequency spectrum 110
5.13. Image from noisy, diffraction-limited spectrum 111
5.14. SVD result using 8 singular vectors 112
5.15. SVD result using 6 singular vectors 113
5.16. SVD result using 7 singular vectors 114
5.17. SVD result using 9 singular vectors 115
5.18. SVD result using 10 singular vectors 116
LIST OF TABLES

2.1. Results from Case 1 trials 36
2.2. Results from Case 2 trials 37
2.3. Condition number of A 38
3.1. Result after each step of example in Figure 3.8 57
4.1. Expected distribution of f and e over eigenvectors of P 76
4.2. Change in distribution of f and e over eigenvectors of P as known frequencies move away from main lobe of frequency spectrum 78
ABSTRACT
This thesis presents a new, non-iterative method for super-resolution which we call the direct method. By exploiting the inherent structure of the discrete signal processing environment, the direct method reduces the discrete super-resolution problem to solving a linear set of equations. The direct method is shown to be closely related to the Gerchberg algorithm for super-resolution. A mathematical justification for early termination of the Gerchberg algorithm is presented and the design of optimal termination schemes is discussed. Another new super-resolution method, which we call the SVD method, is presented. The SVD method is based on the direct method and employs SVD techniques to minimize errors in the solution due to noise and aliasing errors on the known frequency samples. The new SVD method is shown to provide results nearly identical to the optimal solution given by the Gerchberg algorithm, with huge savings in time and computational work.
CHAPTER 1
INTRODUCTION
Super-resolution is the process of restoring lost frequency or spatial frequency
information to improve the resolution of a time or space domain object. The missing
frequency information has usually been lost by passing the time domain function
through a bandlimited system. For example, consider the time-limited function

f(t) = 1 for -10 ms < t < 10 ms, and f(t) = 0 elsewhere   (1.1)

which has been passed through an ideal low-pass filter with cutoff frequencies of ±10 Hz. The original time domain function f(t) is shown as the dotted line in Figure 1.1. The result of low-pass filtering (bandlimiting) this function to ±10 Hz is shown as the solid line in Figure 1.1. The filtered version of f(t) has clearly lost the sharp edges of the original function. The original frequency spectrum of f(t) is shown as the dotted line in Figure 1.2 and the low-pass filtered frequency spectrum is shown as the solid line in the same figure. If the original frequency spectrum (the dotted line in Figure 1.2) can be restored, then f(t) can be completely resolved.
Super-resolution has been shown to be theoretically possible for a time-limited
(space-limited) function [1], such as f(t) in the example above. The theory is based
[Figure 1.1 plot: dotted = time-limited function, solid = low-pass filtered to 10 Hz; horizontal axis: Time (ms).]
Figure 1.1: Effect of bandlimiting a time-limited function.
[Figure 1.2 plot: dotted = original spectrum, solid = low-pass filtered to 10 Hz; horizontal axis: Frequency (Hz).]
Figure 1.2: Infinite frequency spectrum which has been bandlimited to ±10 Hz.
on the fact that the frequency spectrum of a finite object is analytic [2], and any analytic function is uniquely determined by a finite portion of itself. All super-resolution methods discussed in this thesis make use of known time-limited (space-limited) constraints as well as known portions of the frequency spectrum to achieve super-resolution.
Bandlimited extrapolation is essentially the same problem as super-resolution with
the domains reversed. Using a finite portion of an analytic time or space domain
function, and known bandlimited constraints, the entire analytic function can be
restored. To avoid confusion, this thesis will deal only with the super-resolution
problem, although all of the super-resolution methods discussed can be used for the
bandlimited extrapolation problem as well.
There are many applications for super-resolution in addition to the example given
above. Super-resolution can be used to improve the optical resolution of a diffraction
limited object. Any optical imaging system which has a finite aperture (such as a
camera lens) will have a diffraction limit (only a finite range of spatial frequencies are
passed). A space-limited object must have an infinite frequency spectrum, therefore
a diffraction limited system will reduce the resolution of a space-limited object, such
as an antenna.
Super-resolution also has applications in geophysical imaging. Using samples from a small portion of the frequency response of a finite portion of the earth, super-resolution can restore the entire frequency spectrum and thus provide a high resolution model of underground structures. Super-resolution has found similar applications for limited angle tomography in the field of medicine.
In some cases super-resolution can reduce data storage requirements. Since super-resolution can be used to restore the entire function from the partial information, only a small portion of a sampled analytic function needs to be stored.
In Chapter 2 of this thesis a new super-resolution method which we call the direct
method will be presented and analyzed. Chapter 3 will examine the popular Gerchberg algorithm for super-resolution. The Gerchberg algorithm will be related to
the direct method and the overdetermined case will be re-evaluated. Chapter 4 will
provide the first mathematical justification for early termination of the Gerchberg
algorithm. The design of termination schemes for the Gerchberg algorithm will be
discussed and three specific schemes will be implemented and tested. Chapter 5 will
present a new non-iterative method based on the direct method of Chapter 2 and the
early termination criteria of Chapter 4. We call this new method the SVD method
because it incorporates SVD techniques. The SVD method is so fast and insensitive to error that it should have significant impact on the field of super-resolution.
Chapter 6 will summarize the results in this thesis and draw a few conclusions from
them.
CHAPTER 2
THE DIRECT METHOD
This chapter will introduce a new super-resolution method which we call the direct method. The direct method is a non-iterative method which extrapolates functions using known time-limited constraints, available frequency domain samples, and the Discrete Fourier Transform (DFT) coefficients which relate the known frequency samples to the unknown time domain samples. In Section 2.1 the direct method is described. Section 2.2 explains the sources of error, illustrates their effects, and derives and implements error bounds. Section 2.3 generalizes the direct method to the overdetermined case and implements a least squares solution. In Section 2.4 the results of extensive experimental trials are presented and analyzed. Finally, Section 2.5 summarizes the chapter.
2.1 Description

Suppose there is a discrete periodic time domain function f with a period of N samples which is known to be time-limited to n non-zero samples. Also suppose that we know n samples of this function's periodic frequency spectrum F. The direct method of super-resolution utilizes the relationship between the time and frequency coefficients of the DFT and a-priori knowledge about the duration and location of the function in the time domain to solve for the n unknown time domain samples given n known frequency samples.
For example, consider a function / which is known to be time-limited to the first
3 time domain samples, has a periodic length of 8 samples, and only the 1st, 2nd,
and 8th frequency domain samples are known (only the low frequency components
are known).
f = [f_1 f_2 f_3 0 0 0 0 0]   (2.1)

F = [F_1 F_2 F_3 F_4 F_5 F_6 F_7 F_8]   (2.2)
Each element of F is related to the elements of f by the discrete Fourier transform

F_m = Σ_{n=0}^{N-1} f_{n+1} W_N^{(m-1)n}   (2.3)

where W_N = e^{-j2π/N}. For this example, N = 8, so W_N = W_8 = e^{-jπ/4}. Since f_4 through f_8 are all known to be zero, we have the following equations for F_1, F_2, and F_8:

F_1 = f_1 W_8^0 + f_2 W_8^0 + f_3 W_8^0   (2.4)

F_2 = f_1 W_8^0 + f_2 W_8^1 + f_3 W_8^2   (2.5)

F_8 = f_1 W_8^0 + f_2 W_8^7 + f_3 W_8^{14}   (2.6)

Since F_1, F_2, and F_8 are known, we have a set of 3 linear equations in 3 unknowns (f_1, f_2, and f_3). We can use Gaussian elimination to solve the system Ax = b for the
unknown vector x, where

A = [ 1    1       1
      1   W_8     W_8^2
      1   W_8^7   W_8^{14} ],    x = [ f_1, f_2, f_3 ]^T,    b = [ F_1, F_2, F_8 ]^T   (2.7)
Having solved for the unknown time domain samples, the entire time domain function f can be transformed via the DFT to yield the entire discrete frequency spectrum F. Thus given partial frequency information and knowledge about the location and duration of the time domain function f, we can determine f and F completely.
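As an illustration of this procedure, the following is a minimal numerical sketch in Python/NumPy (an assumption of this write-up; the thesis experiments used MATLAB) that builds the 3 x 3 system above and solves it:

import numpy as np

# Worked example from Section 2.1: N = 8, signal time-limited to its first
# n = 3 samples, with F_1, F_2 and F_8 (the low frequencies) known.
N, n = 8, 3
W = np.exp(-2j * np.pi / N)                        # W_8 = e^{-j*pi/4}

f_true = np.array([1.0, 1.0, 1.0, 0, 0, 0, 0, 0])  # example time-limited signal
F = np.fft.fft(f_true)                             # full periodic spectrum

known = [0, 1, 7]                                  # 0-based indices of F_1, F_2, F_8
b = F[known]                                       # known frequency samples

# A[k, i] is the DFT coefficient relating unknown sample f_{i+1} to known F_m.
A = np.array([[W ** (m * i) for i in range(n)] for m in known])

x = np.linalg.solve(A, b)                          # direct method: one linear solve
print(np.round(x.real, 6))                         # recovers [1. 1. 1.]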
In the example above, the system of equations is exactly determined (the number of known frequency samples is equal to the number of unknown time domain samples). Since the rows of A are independent, a unique solution for x is guaranteed. When the number of unknown time domain samples is greater than the number of known frequency samples, the system Ax = b is underdetermined and the solution for x will not be unique. To avoid the underdetermined case the sampling rates in each domain can be adjusted such that the number of known frequency samples is equal to the number of unknown time domain samples. The overdetermined case and the least squares solution will be discussed in detail in Section 2.3.
2.2 Errors And Error Bounds

The direct method is subject to many kinds of errors and can be quite sensitive to them. This section will introduce and discuss sources of error and their effects on the direct method. Also, a-priori and a-posteriori error bounds for the direct method's solution will be derived and implemented.
2.2.1 Types of Errors
Two commonly encountered sources of error are random noise and measurement
errors. These errors may be incurred when sampling the frequency response directly,
or they may propagate to the frequency domain from noisy or poorly measured time
domain samples.
Another type of error to which frequency samples may be subjected is aliasing
error. Aliasing error occurs whenever a non-periodic sequence is modeled as a periodic
sequence, which is usually what we need to do to implement the direct method. The
following example illustrates how aliasing errors could occur.
Suppose we want to reconstruct the rectangular waveform w(t) shown in Figure 2.1 from samples of its continuous frequency spectrum W(f) (Figure 2.2) taken between -1.5 Hz and +1.5 Hz. Suppose we also know that w(t) is time-limited to between -0.5 and +0.5 seconds, and since w(t) is time-limited we know that W(f) cannot be bandlimited. The time domain sampling rate has been arbitrarily chosen as 32 Hz for this example. Since the direct method is based on the DFT, it assumes that w(t) is the periodic function w_s(t) (Figure 2.3) with periodic frequency spectrum W_s(f) (Figure 2.4). By using the direct method we are modeling w(t) as one period of w_s(t).
If error-free samples from the periodic spectrum W_s(f) are used, we can reconstruct w_s(t) perfectly, but remember we are not using samples from W_s(f); we are using samples from W(f).
[Figure 2.1 plot: horizontal axis Time (seconds).]
Figure 2.1: The original time-limited function w(t).

[Figure 2.2 plot: W(f) = frequency spectrum of w(t); horizontal axis Frequency (Hz).]
Figure 2.2: The infinite frequency spectrum W(f) of the time-limited function w(t).

[Figure 2.3 plot: w_s(t) = periodic representation of w(t); horizontal axis Time (seconds).]
Figure 2.3: The periodic time-limited function w_s(t).

[Figure 2.4 plot: W_s(f) = frequency spectrum of w_s(t); horizontal axis Frequency (Hz).]
Figure 2.4: The periodic frequency spectrum W_s(f) of w_s(t).

[Figure 2.5 plot: dotted = W_s(f), dashed = W(f), solid = aliasing error; horizontal axis Frequency (Hz).]
Figure 2.5: Aliasing error between continuous frequency spectrum W(f) and discrete periodic frequency spectrum W_s(f).
From Figure 2.5 it is apparent that W(f) and W_s(f) are not exactly the same over the frequencies of interest. The difference between W(f) and W_s(f) is the aliasing error.
Whenever samples from a non-bandlimited continuous frequency spectrum are
used in the reconstruction, aliasing error will occur. The aliasing error can be reduced
by increasing the sampling rate in the time domain, however this will increase the
number of unknown time domain samples which, as will be shown later in this chapter,
increases the direct method's sensitivity to errors dramatically.
A third type of error often encountered is computer roundoff error. Although roundoff error in the frequency samples used for the reconstruction is usually negligible compared to noise and aliasing error, roundoff error on the elements of the matrix A (the DFT coefficients) can become significant. As the periodic length N of the signal gets large, the DFT coefficients W_N^n get closer together, which increases the condition number cond(A).
2.2.2 Effects of Errors

The effects of errors in the known frequency samples will be illustrated by an example. Consider the function f from Section 2.1 which was time-limited to the first 3 samples and had a periodic length of 8 samples. For this example we will let the first 3 samples equal 1:

f = [1 1 1 0 0 0 0 0]   (2.8)
Recall, we do not know the values of f, only that f_4 through f_8 are zero. We want to determine f and F completely, but we know only the first, second, and eighth samples of F (the periodic frequency spectrum of f):

b = [ F_1, F_2, F_8 ]^T = [ 3.0000, 1.7071 - 1.7071i, 1.7071 + 1.7071i ]^T   (2.9)
The matrix A is the same as before:

A = [ 1    1       1                 [ 1   1                  1
      1   W_8     W_8^2         =      1   0.7071 - 0.7071i  -i
      1   W_8^7   W_8^{14} ]           1   0.7071 + 0.7071i   i ]   (2.10)
Solving the system Ax = b we get

x = [ f_1, f_2, f_3 ]^T = [ 1.000, 1.000, 1.000 ]^T   (2.11)

which is the true solution for x.
To illustrate the effect of errors, the vector r, consisting of samples of complex Gaussian noise of zero mean and variance 0.01, is added to the known frequency samples b:

r = [ -0.1140 - 0.1435i, -0.0516 + 0.0825i, 0.0316 + 0.0673i ]^T   (2.12)

b̂ = b + r = [ 2.8860 - 0.1435i, 1.6555 - 1.6246i, 1.7387 + 1.7744i ]^T   (2.13)
Now the system Ax̂ = b̂ is solved, yielding

x̂ = [ 0.8254 - 0.2269i, 1.2329 + 0.4268i, 0.8277 - 0.3434i ]^T   (2.14)
The resulting error in the solution due to the input error r is given by e:

e = x̂ - x = [ -0.1746 - 0.2269i, 0.2329 + 0.4268i, -0.1723 - 0.3434i ]^T   (2.15)

The output error e for this example is relatively large compared to the input error r. It is shown in the next subsection that this amplification of error is limited by the character of the matrix A.
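This amplification is easy to reproduce numerically. The sketch below is a hypothetical NumPy experiment (the noise scaling is an assumption chosen to roughly match the example) that perturbs b and compares the sizes of the input and output errors:

import numpy as np

rng = np.random.default_rng(0)
W = np.exp(-2j * np.pi / 8)

# Matrix A from Eq. 2.10 and the exact right-hand side b from Eq. 2.9.
A = np.array([[1, 1, 1],
              [1, W, W**2],
              [1, W**7, W**14]])
x_true = np.array([1.0, 1.0, 1.0])
b = A @ x_true

# Small complex Gaussian perturbation of the known frequency samples.
r = rng.normal(0, 0.1, 3) + 1j * rng.normal(0, 0.1, 3)
x_hat = np.linalg.solve(A, b + r)

e = x_hat - x_true
print(np.linalg.norm(e) / np.linalg.norm(r))   # error amplification factor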
2.2.3 Error Bounds

We would like to compute bounds for errors in the solution x due to input errors on the known frequency samples b, rounding errors on the DFT coefficients of A, or both. Mathematical bounds have already been established for errors in the solution to systems of linear equations, hence these known bounds can be used to bound the absolute error, ||e||, and the relative error, ||e||/||x||, of a reconstructed function.
First consider the solution of Ax = b when there are errors in the known frequency samples b. As in the previous example, b̂ = b + r where r is the error in the frequency samples, and x̂ = x + e where e is the error in the solution. Now substituting for x̂ and b̂, the system becomes

A(x + e) = (b + r)   (2.16)

Since this system is linear, and since we know that Ax = b, the system can be separated into two parts:

Ax = b,   Ae = r   (2.17)
From the second part of the previous equation:

e = A^{-1} r   (2.18)

Given that A^{-1} is bounded, there exists a finite c_1 such that

||A^{-1} r|| ≤ c_1 ||r||   (2.19)

Define ||A^{-1}|| as the smallest c_1 such that Eq. 2.19 holds. Then ||A^{-1} r|| ≤ ||A^{-1}|| ||r||, and we have the following bound for the absolute error:

||e|| ≤ ||A^{-1}|| ||r||   (2.20)

Eq. 2.20 is equivalent to

||e|| / ||x|| ≤ ||A|| ||A^{-1}|| ||r|| / (||A|| ||x||)   (2.21)

where the left hand side is the relative error. Now given that A is bounded, there exists a finite c_2 such that

||Ax|| ≤ c_2 ||x||   (2.22)

Define ||A|| as the smallest c_2 such that Eq. 2.22 holds. Then ||b|| = ||Ax|| ≤ ||A|| ||x||. Substituting ||b|| for the denominator in Eq. 2.21, we obtain the following relative error bound for perturbations in b:

||e|| / ||x|| ≤ ||A|| ||A^{-1}|| ||r|| / ||b||   (2.23)
The term ||A|| ||A^{-1}|| is known as the condition number of the matrix A [5], and is denoted cond(A). Since the condition number can vary with the choice of norm, another condition number, cond_*(A), is defined as the ratio of the largest singular value of A to the smallest singular value of A [3]. The justification for the use of cond_*(A) is based on the singular value decomposition of A, which will be discussed in detail in Chapter 5.
A similar (but longer) process is used to derive the following generalized relative error bound for errors in A and/or b, where δx, δb, and δA are perturbations in x, b, and A respectively:

||δx|| / ||x|| ≤ [ cond(A) / (1 - cond(A) ||δA|| / ||A||) ] ( ||δA|| / ||A|| + ||δb|| / ||b|| )   (2.24)

The proof for this error bound has been omitted, but the interested reader is referred to Atkinson [3] for details.
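These quantities are straightforward to evaluate numerically. The sketch below is a hypothetical NumPy check (the matrix construction is assumed from the description in Section 2.1) that computes ||A^{-1}|| and cond_*(A) from the singular values and verifies the absolute bound of Eq. 2.20 for a random perturbation:

import numpy as np

rng = np.random.default_rng(1)
N, n = 8, 3
W = np.exp(-2j * np.pi / N)
A = np.array([[W ** (m * i) for i in range(n)] for m in [0, 1, 7]])

s = np.linalg.svd(A, compute_uv=False)
cond_star = s[0] / s[-1]        # ratio of largest to smallest singular value
inv_norm = 1.0 / s[-1]          # 2-norm of A^{-1}

r = rng.normal(0, 0.01, n) + 1j * rng.normal(0, 0.01, n)
e = np.linalg.solve(A, r)       # Ae = r, so e = A^{-1} r
print(np.linalg.norm(e) <= inv_norm * np.linalg.norm(r))   # Eq. 2.20 holds: True
print(cond_star)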
An alternative to the a-priori bounds given by Eqs. 2.23 and 2.24 is an a-posteriori bound. The bound suggested by Aird and Lynch [4] was implemented and tested. The error e = x̂ - x for the computed solution x̂ to the problem Ax = b can be bounded as

||Cr|| / (1 + τ) ≤ ||e|| ≤ ||Cr|| / (1 - τ)   (2.25)

where C = A^{-1} is the computed inverse of A, τ = ||CA - I|| < 1, and the residual r = b - Ax̂.
Aird and Lynch [4] have implemented the a-posteriori bound for real numbers as follows. First, assume that all elements of vectors and matrices are real and that A is error free. Compute C = A^{-1}. Assume that the magnitude of the error on each element of b is less than some known positive number K. (In a practical situation, x and b are unknown, so the bound in Eq. 2.25 cannot be implemented directly; however, it may be reasonable to assume that the error on each element of b is less than some fixed value. Such a situation arises with measurement errors.) Since b and K are known, each element of b can be confined to an interval [b_l, b_u] with the corresponding element of b in the center of this interval. Next, using the endpoints of [b_l, b_u], the residual r can be confined to an interval [r_l, r_u]. Now, using the extreme values of [r_l, r_u] and appropriate matching of signs in the inner products, Cr can be confined to an interval C[r_l, r_u] = [Cr_l, Cr_u]. Now substitute the minimum value for ||Cr|| in the lower bound and the maximum value for ||Cr|| in the upper bound. If zero is in the interval containing a component of a vector, then the lower bound for that component is zero. Aird and Lynch [4] have shown that this bound can provide a marked improvement over the a-priori bound when used for applications with real numbers.
For our application it must be assumed that all components can be complex. We adapt the a-posteriori bound of Aird and Lynch to complex elements as follows. Again, we know b, and we know that the magnitude of the error for each element must be less than K. Since the components of b are complex, we can no longer confine the elements of b to an interval. However, we can confine each element of b to a circular region whose radius is K and whose center is the corresponding element of b. Next we confine each element of the residual r to a circular region whose radius is K and whose center is zero. Since we know that the magnitude of each element of r must be less than K, we can find the worst case ||r|| (the largest possible ||r||) for whichever type of norm we choose. Next we compute C = A^{-1} as before and use the definition of the matrix norm to bound Cr by ||Cr|| ≤ ||C|| ||r||, where ||r|| is the largest possible ||r||. Substituting the bound for ||Cr|| into the a-posteriori error upper bound, it becomes

||e|| ≤ ||C|| ||r|| / (1 - τ)   (2.26)

which is identical to the a-priori upper bound except for the term in the denominator. Since 0 < τ < 1, this bound when adapted for complex elements is, at best, equivalent to the a-priori bound and therefore not worth computing.
2.3 The Least Squares Solution

Suppose the number of known frequency samples is greater than the number of unknown time-domain samples. In this case the system of equations Ax = b is overdetermined. If the known frequency samples contain error there may be no solution x which satisfies the entire set of equations and hence the direct method will fail. One way to get around this problem is to simply throw away the extra frequency samples and use the direct method to solve an exactly determined set of equations. A better idea is to use the extra samples and the redundancy which they provide to compute a least squares solution [5] for x.
The least squares solution to the system

A_{m x n} x_{n x 1} = b_{m x 1},   m > n   (2.27)

is that x which minimizes (b - Ax)^H (b - Ax). This can be computed by the following equation [5]:

x = (A^H A)^{-1} A^H b   (2.28)

When using MATLAB [6] the command x = A\b, which uses a subroutine based on Gaussian elimination, gives a numerically more accurate solution than the previous equation, which requires computing an inverse.
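For reference, here is a short NumPy sketch (an assumption of this write-up, not the thesis's MATLAB code) of the overdetermined direct method solved with a least-squares routine rather than the explicit inverse of Eq. 2.28:

import numpy as np

def direct_method_ls(b_known, known_rows, n, N):
    """Least-squares direct method: b_known are the known (possibly noisy)
    frequency samples, known_rows their 0-based DFT indices, n the number of
    unknown leading time samples, and N the periodic length."""
    W = np.exp(-2j * np.pi / N)
    A = np.array([[W ** (m * i) for i in range(n)] for m in known_rows])
    # Equivalent to x = (A^H A)^{-1} A^H b, but avoids forming the inverse,
    # which is the same reason the thesis prefers MATLAB's A\b.
    x, *_ = np.linalg.lstsq(A, b_known, rcond=None)
    return x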
2.4 Results From Experimental Trials
Over 160 thousand experimental trials were performed to find the practical limits
of the direct method and the least squares method. The trials were designed to answer
a number of questions; we wanted to know how the SNR affected the accuracy of
the results and to be able to determine a minimum SNR for a desired accuracy; we
wanted to determine whether the least squares method produced better results than
the direct method; we wanted to evaluate the tightness of the error bounds presented
in Section 2.2.
The trials were divided into 2 cases. The Case 1 trials were designed to test the
direct method's performance in the presence of Gaussian noise. These trials were
intended to give a general idea of the magnitude of the errors to be expected for a
given set of values for the total number of samples, N, the number of unknown time
domain samples, n, the number of known frequency samples m, and SNR. The Case
2 trials tested the same things as the Case 1 trials except the noise was uniform. The uniform noise also provided a more realistic setting in which to test the tightness
of the error bounds because the magnitude of the input errors (the uniform noise
samples) could be absolutely bounded for an entire set of trials.
2.4.1 Case 1 Trials—Gaussian Noise
For the Case 1 trials, values of N, n, m, and σ (the standard deviation of the Gaussian noise distribution) were fixed and sets of 1000 trials were performed. Each trial proceeded as follows. First a random periodic time-limited signal f was generated. Each of the first n samples of f was randomly chosen from a real uniform distribution between 0 and 1 (thus a substantial DC component was guaranteed). All other samples of f were set to zero. Next, the periodic frequency spectrum F was generated by computing the DFT of the time-limited signal f. The use of the DFT precluded the possibility of introducing any aliasing error in the frequency domain. A vector of complex Gaussian noise samples of zero mean and standard deviation σ (each noise sample being the sum a + jb, where a and b were chosen randomly from the Gaussian distribution) was then added to the frequency samples, and the (m + 1)/2 lowest non-negative frequency samples (including the DC sample) were selected for the reconstruction. Since the function f was known to be real, the complex conjugates of the known frequency samples (except the DC sample) were computed and used as the corresponding negative low-pass samples to provide a total of m known frequency samples. Next the direct method was used to form the estimate f̂ of f (if m > n the least squares solution was computed) and the norm of the error ||f - f̂|| was computed. All vector norms for the Case 1 and Case 2 trials were computed as the Euclidean or 2-norm, given in Atkinson [5] as
||V|| = ( Σ_{j=1}^{m} |V_j|^2 )^{1/2}   (2.29)
where V_j is the jth element of the vector V. Next the signal to noise ratio for the trial was computed by the following formula:

SNR =    (2.30)
To implement the a-priori error bound, K was taken as the magnitude of the largest element of the noise vector:

||f - f̂|| ≤ ||A^{-1}|| ||v_worst||   (2.31)

||v_worst|| = ( Σ_{i=1}^{m} K^2 )^{1/2}   (2.32)
For the least squares solution the relative error bound was computed.
Upon completion of 1000 trials various statistics (mean and standard deviation of the error norm ||f - f̂||, SNR, a-priori bound, etc.) were calculated and stored. The number of violations of the error bounds, if any, was also stored.
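The overall trial loop is simple to emulate. The following is a simplified NumPy sketch of a Case-1-style trial (the function name, the exact noise scaling, and the choice of known samples are assumptions made for illustration, not the original MATLAB code):

import numpy as np

def case1_trial(N, n, m, sigma, rng):
    """One simplified Case-1-style trial: random time-limited signal, complex
    Gaussian noise on the m lowest frequency samples, reconstruction by the
    direct/least-squares method, and the resulting error norm."""
    f = np.zeros(N)
    f[:n] = rng.uniform(0, 1, n)                   # random time-limited signal
    F = np.fft.fft(f)

    half = (m - 1) // 2                            # DC plus conjugate-pair samples
    rows = [0] + list(range(1, half + 1)) + [N - k for k in range(1, half + 1)]
    noise = rng.normal(0, sigma, m) + 1j * rng.normal(0, sigma, m)
    b = F[rows] + noise

    W = np.exp(-2j * np.pi / N)
    A = np.array([[W ** (r * c) for c in range(n)] for r in rows])
    x_hat, *_ = np.linalg.lstsq(A, b, rcond=None)  # exact solve when m == n
    return np.linalg.norm(f[:n] - x_hat)

rng = np.random.default_rng(0)
errors = [case1_trial(16, 3, 5, 1e-3, rng) for _ in range(1000)]
print(np.mean(errors))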
2.4.2 Case 2 Trials—Uniform Noise
The Case 2 trials were similar to the Case 1 trials except for the noise. The noise was complex, with the magnitude and phase components uniformly and independently distributed between [0, K] and [0, 2π] respectively. Values for N, n, m, and K were fixed and each trial was performed in the same manner as the Case 1 trials. As in the Case 1 trials, the error bound was computed by Eq. 2.31:

||f - f̂|| ≤ ||A^{-1}|| ||v_worst||   (2.33)

||v_worst|| = ( Σ_{i=1}^{m} K^2 )^{1/2}   (2.34)

where K was the maximum possible magnitude of a noise sample. For the least
squares case, this bound was computed using the pseudoinverse of A. As in the Case
1 trials, various statistics were compiled upon completion of 1000 trials.
2.4.3 Results

Some of the statistics compiled from the Case 1 and Case 2 trials are shown in Tables 2.1 and 2.2 respectively. Each entry in the statistics columns of these tables represents a statistic calculated from 1000 independent trials. Each of these entries is accurate to about ±1/√(number of trials) = 1/√1000, or about ±3%. For this reason only two significant digits are given for these entries. In Table 2.1 the least squares results for fixed values of N, n, and SNR are given in the same row as the direct method for easy comparison. The condition number cond_*(A) was calculated for a variety of values of N, n, and m. The results are tabulated in Table 2.3.
Table 2.1: Results from Case 1 trials.
[Tabulated for each combination of N, n, and σ: the a-priori mean error bound, the mean error norm for m = n, m = n + 2, m = n + 4, and m = n + 10, and the average SNR, each computed from 1000 trials. N = periodic length of function, n = number of unknown time domain samples, m = number of known frequency samples, σ = standard deviation of the noise samples.]
Table 2.2: Results from Case 2 trials.
[Tabulated for each combination of N, n, and K: the a-priori bound and mean error norm for m = n, and the a-priori bound and mean error norm for m = n + 2, each computed from 1000 trials. N = periodic length of function, n = number of unknown time domain samples, m = number of known frequency samples, K = maximum magnitude of each noise sample.]
Table 2.3: Condition number of A.
[Tabulated values of cond_*(A) for combinations of N (8 to 512), n (3 to 7), and m (3 to 97). N = periodic length of function, n = number of unknown time domain samples, m = number of known frequency samples.]
After studying the three tables it becomes clear that the factor which has the greatest effect on the size of the error in the solution is the number of unknown time domain samples (n). For example, from Table 2.3, for N = 16, n = 3, m = 5, cond_*(A) = 14.3396. By increasing the number of unknowns by 2 (N = 16, n = 5, m = 5), cond_*(A) = 422.3243, an increase by a factor of about 30. As another example from Table 2.3, for N = 128, n = 3, m = 5, cond_*(A) = 1049.8. Again, just by increasing n by 2 (N = 128, n = 5, m = 5), cond_*(A) jumps to 2,494,900, an increase by a factor of about 2377.
The condition number cond_*(A) is also dependent on the periodic length N and the number of unknown time domain samples n. For example, if n and m are both fixed at 3, doubling N tends to increase cond_*(A) by a factor of 4. If n = 3 and m = 5, doubling N still tends to increase cond_*(A) by a factor of about 4. If n and m are fixed at 5, doubling N tends to increase cond_*(A) by a factor of about 16. If n = 5 and m = 7, doubling N will also tend to increase cond_*(A) by a factor of about 16.
While the effects of N and n appear to be linked, the effect of adding extra known frequency samples seems to be independent. Referring again to Table 2.3, if N = 64 and n = 5, the addition of two extra known frequency samples from m = 5 to m = 7 decreases cond_*(A) by a factor of about 6.4. Using 4 extra samples (m = 9) decreases cond_*(A) by a factor of about 21. Using 10 extra (m = 15) and 16 extra (m = 21) decreases cond_*(A) by factors of 210 and 978 respectively. Decreasing the norm of the error by a factor of 978 translates to about 3 more significant digits in the solution.
Increasing the magnitude of the noise appears to cause a linearly proportional increase in the magnitude of the error. Table 2.1 shows that increasing the standard deviation of the noise by a constant factor results in the size of the error in the solution increasing by the same factor.

[Figure 2.6 plot: 'Norm of Error in Solution. L=64, n=3, m=3, K=.0001'; error norms vs. trial number with the a-priori bound marked.]
Figure 2.6: A-priori error bound and error norms for 1000 trials (exactly determined case).
In over 160,000 independent trials, not a single violation of any of the error bounds
occurred. For the Case 2 trials using the direct method, the a-priori error bound was
typically 3 to 4 times as large as the mean of the error norm.
Figure 2.6 shows the
error norms and a-priori bound for a typical 1000 trial run. For several of the trials
in Figure 2.6 the norm of the error was greater than 80% of the bound, suggesting
that this bound is about as tight as one can expect while still being guaranteed.
[Figure 2.7 plot: 'Norm of Error in Solution. L=64, n=3, m=5, K=.0001'; error norms vs. trial number with the a-priori bound marked.]
Figure 2.7: A-priori error bound and error norms for 1000 trials (overdetermined case).
For the Case 2 trials using the least squares method, the a-priori bound was
implemented using the pseudoinverse of A and was generally 4 to 6 times as large as
the mean of the error norms. Although this bound is not guaranteed for the least
squares implementation, there was not a single violation in 27000 trials.
Figure 2.7 shows the error norm and a-priori bound for a typical 1000 trial run using the least squares method. All variables for this example were the same as those in Figure 2.6 except the number of known frequency samples m. The lower error bound and error norms of Figure 2.7 illustrate the value of using any extra samples in the least squares method.
2.4.4 Analysis of Trial Results

Using Tables 2.1 through 2.3 one can estimate how well the direct method or the least squares method will work for a given application. The precision in the solution is limited by the condition number of A and the size of errors in the known frequency samples. The size of the errors in the frequency samples may not always be controllable, but there are ways to control the size of cond_*(A).
The most effective way to control the size of cond_*(A) is to limit the number of unknowns n in the time domain (preferably to fewer than 7 samples). This can be accomplished by lowering the time-domain sampling rate. If an application requires a large number of unknown time domain samples (such as resolving fine details in a complicated time-limited signal) then another extrapolation method should be used.
Another way to improve the condition of A is to use any and all extra frequency samples when they are available. The number of available samples will be limited by the extent of the frequency spectrum which is available. Decreasing the sampling interval in the frequency domain will increase the number of available frequency samples m without affecting n, but the resulting increase in the total length N will mostly offset the value of using the extra frequency samples.
Section 2.2 alluded briefly to the unique problems caused by aliasing. To decrease aliasing errors the sampling frequency must be increased. Increasing the sampling frequency will necessarily increase n, which then increases cond_*(A). We have already seen the drastic effects caused to cond_*(A) by even slight increases of n. Our experimental simulations have shown that reducing the sampling frequency by 1/2 roughly doubles the magnitude of the aliasing error. The additional aliasing error is more than offset by the decrease in cond_*(A) caused by reducing n, therefore it is better to limit n at the expense of additional aliasing error.
2.5 Summary

This chapter has introduced a new super-resolution method which we call the direct method. This method uses DFT coefficients and known frequency samples to solve for the unknown time domain samples directly. Unlike other super-resolution methods ([2], [7]) which were first conceived in a continuous form and then adapted to the discrete form, the direct method was originally conceived in the discrete form. (In fact, there does not exist a continuous form for the direct method.) As a result, the direct method exploits the inherent structure of the discrete form to provide a very simple, very fast super-resolution method. Unfortunately, the direct method's usefulness is severely limited by its condition number, but it does represent a new way of looking at the super-resolution problem and it provides the foundation for more sophisticated super-resolution methods to be introduced in Chapter 5.
CHAPTER 3
THE GERCHBERG ALGORITHM
The algorithm examined in this chapter was proposed independently by both Gerchberg [2] and Papoulis [7]. Since [2] predates [7], this algorithm for super-resolution will be referred to as the Gerchberg algorithm throughout this thesis. The Gerchberg algorithm can be used to extrapolate time or frequency domain functions from incomplete information. As in the previous chapter, only the frequency extrapolation (super-resolution) problem will be considered, although the algorithm works essentially the same for both applications.
This chapter begins with a review of the continuous and discrete forms of the Gerchberg algorithm. In Section 3.2, the relationship of the Gerchberg algorithm to the direct method is discussed and the exactly determined case is examined. In Section 3.3 the overdetermined case is analyzed and the correction energy described by Gerchberg [2] is re-evaluated. Computable error bounds for the Gerchberg algorithm are introduced in Section 3.4. Finally, the advantages and disadvantages of the Gerchberg algorithm will be discussed.
3.1 Description of the Gerchberg Algorithm
If a portion of the frequency spectrum of a time-limited object is known, and the location and extent of the object in time are also known, then under certain reasonable conditions [8] the Gerchberg algorithm can be used to reconstruct the unknown portion of the frequency spectrum, thereby improving the object's resolution. The algorithm is iterative and each iteration consists of four steps. The first step transforms the known portion of the frequency spectrum into the time domain. Step two sets the time domain function to zero outside the region in which it is known to be time-limited. Step three transforms this result back to the frequency domain and the fourth step replaces the known portion of the frequency spectrum into its appropriate location. This four step process is repeated starting with the new frequency spectrum estimate and iterates until a satisfactory estimate is obtained. A flow chart for the Gerchberg algorithm is shown in Figure 3.1. The algorithm is better understood with the help of the following example.
3.1.1 Example 3.1.1
Consider a periodic time-limited signal which has been sent through a band-limited
channel so that its high frequency components have been lost. The location and
extent of the original signal in the time domain are known. The application of one
cycle of the Gerchberg algorithm to this problem is illustrated in Figures 3.2-3.7.
The known portion of the frequency spectrum is shown in Figure 3.2 (the dotted line
Known Portion of
Frequency Spectrum
Estimated Frequency
Spectrum Corrected
Over Known Portion
Fourier
Transform
Fourier
Transform
Estimated object
Corrected to Zero
Outside Known
Extent
Known Extent
of Object
Figure 3.1: Flow chart for the Gerchberg algorithm.
47
350
400
450
500
550
600
figure 3.1 (a)
Figure 3.2: Known portion of frequency spectrum.
650
48
1.5
0.5
-0.5
350
400
450
500
550
650
600
figure 3.1 (b)
Figure 3.3: Time domain representation of known frequency spectrum,
indicates the original frequency spectrum which is to be reconstructed).
Step one
transformed the known portion of the frequency spectrum into the time domain as
shown in Figure 3.3.
Next, step two truncated the function to satisfy the known
time-limited constraints as shown in Figure 3.4.
Step three transformed this result
back into the frequency domain as shown in Figure 3.5.
Step four replaced the
known portion of the frequency spectrum into its corresponding frequencies as shown
in Figure 3.6.
Beginning another iteration, step one transformed the new frequency
spectrum estimate back to the time domain as shown in Figure 3.7.
49
450
500
550
figure 3.1 (c)
Figure 3.4: Figure 3.3 truncated to time-limited region.
50
1.5
0.5
-0.5
350
400
450
500
550
figure 3.1 (d)
Figure 3.5: Frequency spectrum of Figure 3.4.
600
650
51
1.5
0.5
-0.5
350
400
450
500
550
600
figure 3.1 (e)
Figure 3.6: Known portion of frequency spectrum replaced.
650
52
450
500
550
figure 3.1 (f)
Figure 3.7: Time domain representation of Figure 3.6.
650
3.1.2 Error Energy Reduction
The convergence for this algorithm is based on reducing the error energy at each
iteration. The following arguments were originally provided by Gerchberg [2]. The
entire band-limited frequency spectrum of the object can be considered as the sum
of the true spectrum and an error spectrum. Assuming the known portion of the
frequency spectrum is free from error, the error spectrum will be zero over the known
frequency region and it will be equal and opposite to the true spectrum outside the
known region. Since the algorithm is linear, its effect on the true spectrum and the
error spectrum will be independent. As long as the time-limited constraints are not
underestimated the true spectrum will be unaffected by the algorithm [2]. Since the
error spectrum has a finite length section equal to zero, it cannot be analytic, so
its inverse transform (in the time domain) will be infinite in extent. From Parseval's
theorem, the error energy at this point will be equal to the original error energy. After
the time-limiting step all of the error energy outside the time-limited region will be
lost, so the error energy will be less than its original value. Now since this function
is time-limited, its Fourier transform in the frequency domain must be analytic, so it
will have energy in the region where the true spectrum is known. When the known
portion of the true spectrum is replaced, the error energy in this region will be lost.
Therefore, at each iteration the error energy is reduced twice.
3.1.3 Discrete Implementation
To implement the Gerchberg algorithm on a computer, the time and frequency
domain functions must be represented by finite length vectors, thus a discrete version
of the Gerchberg algorithm is required. For the discrete version, continuous time
and frequency functions are modeled as discrete periodic vectors and transforms are
performed using the DFT. The known information will consist of sampled data which
can be obtained by sampling the continuous frequency spectrum directly.
In the discrete form, the problem is set up exactly the same for the Gerchberg algorithm as it would be for the direct method. Both methods utilize the same sampled frequency information and time-limited constraints; the same choices relating to sampling frequency and frequency spacing must be made. The two methods are also subject to the same types of errors. In particular, the discrete version of the Gerchberg algorithm will be subject to the same aliasing problems discussed in Section 2.2.
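To make the discrete four-step procedure concrete, here is an illustrative NumPy sketch (an assumption of this write-up, not the code used for the thesis experiments; the example signal and iteration count are made up):

import numpy as np

def gerchberg(F_known, known, support, N, iterations=1000):
    """Discrete Gerchberg algorithm: F_known holds the known frequency samples
    at the 0-based DFT indices in `known`; `support` marks the time samples
    where the signal may be non-zero (the known time-limited extent)."""
    F_est = np.zeros(N, dtype=complex)
    F_est[known] = F_known                 # start from the known spectrum
    for _ in range(iterations):
        f_est = np.fft.ifft(F_est)         # step 1: transform to the time domain
        f_est[~support] = 0                # step 2: enforce the time-limited extent
        F_est = np.fft.fft(f_est)          # step 3: back to the frequency domain
        F_est[known] = F_known             # step 4: replace the known samples
    return np.fft.ifft(F_est)

# Example: N = 8 signal time-limited to its first 3 samples, low frequencies known.
N = 8
f_true = np.array([1.0, 2.0, 3.0, 0, 0, 0, 0, 0])
known = np.array([0, 1, 7])
support = np.zeros(N, dtype=bool)
support[:3] = True
estimate = gerchberg(np.fft.fft(f_true)[known], known, support, N)
print(np.round(estimate.real, 4))          # approaches [1. 2. 3. 0. 0. 0. 0. 0.]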
3.1.4 The Method of Alternating Orthogonal Projections
The discrete Gerchberg algorithm has been shown by Youla [8] to be a special case of the method of alternating orthogonal projections. If a vector f is known a-priori to belong to a subspace P_b of a parent Hilbert space H, but all that is known to the observer is its projection g = P_a f, then the original vector f can be restored by the following three step iterative algorithm: 1) project the latest estimate f_n onto P_b, 2) project this result onto P_a^⊥ (the subspace of H which is orthogonal to P_a), 3) add back the known vector g to obtain the new estimate f_{n+1}. Each iteration of the algorithm can be expressed mathematically by the following equation:

f_{n+1} = g + P_a^⊥ P_b f_n   (3.1)

The known vector g is used as the initial estimate and the algorithm is allowed to iterate until a satisfactory estimate is obtained. If the known vector g is free from error and P_b ∩ P_a^⊥ = {0}, then f_n will converge to f as n approaches infinity (see the proof in Youla [8] for details). The operation of this algorithm and its relationship to the Gerchberg algorithm are illustrated by the following simple example in 2-space.
3.1.5 Example 3.1.2

The method of alternating orthogonal projections can be used to restore the original vector f = [2, 0] (shown in Figure 3.8) from g = [1, 1] (its projection onto P_a). The vector f is known a-priori to belong to the normalized subspace P_b = [1, 0]. The normalized subspaces P_a = [0.7071, 0.7071] and P_a^⊥ = [0.7071, -0.7071] are also shown in Figure 3.8 (all vectors and subspaces shown in Figure 3.8 are in the time domain). In terms of the Gerchberg algorithm, P_b is the set of functions time-limited to the first sample, P_a is the set of functions band-limited to the first sample, and P_a^⊥ is the set of functions band-limited to the second sample. By inspecting the time domain representations of P_a and P_a^⊥, this may not seem obvious, but their respective frequency domain transformations (obtained by the DFT) are [1, 0] and [0, 1].
[Figure 3.8 diagram: the vectors P_a, P_b, and P_a^⊥ in the time-domain plane.]
Figure 3.8: The method of alternating orthogonal projections.

Table 3.1: Result after each step of example in Figure 3.8.

                time domain          frequency domain
g (known)       [1, 1]               [2, 0]
step 1          [1, 0]               [1, 1]
step 2          [0.5, -0.5]          [0, 1]
step 3          [1.5, 0.5]           [2, 1]
step 1          [1.5, 0]             [1.5, 1.5]
step 2          [0.75, -0.75]        [0, 1.5]
step 3          [1.75, 0.25]         [2, 1.5]
step 1          [1.75, 0]            [1.75, 1.75]
step 2          [0.875, -0.875]      [0, 1.75]
step 3          [1.875, 0.125]       [2, 1.75]
The three steps of the algorithm are illustrated in Figure 3.8 for three iterations.
The first step, projecting the latest estimate of / onto Pf,, is equivalent to timelimiting to the first sample (step two for the Gerchberg algorithm). The second
step, the projection onto J_ Pa, is equivalent to zeroing the frequency spectrum over
the region where it is known. The third step, adding back the known vector g,
is equivalent to adding back the known portion of the frequency spectrum. The
combination of the second and third steps of the method of alternating orthogonal
projections are therefore equivalent to step four of the Gerchberg algorithm.
A
chart listing the result in both the time and frequency domains after each step of this
example is shown in Table 3.1.
For this example the method of alternating orthogonal projections performs the same basic operations, time-limiting and replacing the known frequency information, as the Gerchberg algorithm. In general, the method of alternating orthogonal projections can project onto any subspace, not just time-limited or band-limited spaces; therefore the discrete Gerchberg algorithm is a special case of the method of alternating orthogonal projections.
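The numbers in Table 3.1 can be verified directly. Below is a small hypothetical NumPy sketch that carries out the three projection steps of Eq. 3.1 for this 2-space example:

import numpy as np

f = np.array([2.0, 0.0])
pa = np.array([1.0, 1.0]) / np.sqrt(2)        # normalized subspace P_a
pa_perp = np.array([1.0, -1.0]) / np.sqrt(2)  # orthogonal complement of P_a
pb = np.array([1.0, 0.0])                     # known support subspace P_b

g = (f @ pa) * pa                             # observed projection g = P_a f = [1, 1]
est = g.copy()
for _ in range(3):
    est = (est @ pb) * pb                     # step 1: project onto P_b
    est = (est @ pa_perp) * pa_perp           # step 2: project onto P_a^perp
    est = est + g                             # step 3: add back the known vector g
    print(est)                                # [1.5 0.5], then [1.75 0.25], [1.875 0.125]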
3.2 Relationship to the Direct Method

There are many similarities between the direct method and the Gerchberg algorithm. As was pointed out in the previous section, they have the same problem set-up requirements and are subject to the same errors. In fact, for the exactly determined case, the Gerchberg algorithm will converge to the same result given by the direct method (at least to the numerical precision of the computer).
To explain why this is true, consider the exactly determined case where the known frequency samples are free from error. Neglecting computational roundoff errors, the direct method will provide the correct solution. (Define the correct solution to be the exact solution to the system Ax = b, or, in the case of a system perturbed by errors, the exact solution to Ax̂ = b̂; the correct solution to the perturbed system is not necessarily the same as the correct solution to the error-free system.) Furthermore, Youla [8] and Jones [9] have both shown that when the known samples are error free the Gerchberg algorithm will converge to the correct solution. Since there can only be one correct solution, the two methods must provide the same solution for the exactly determined error-free case.
Now consider the exactly determined case where the known frequency samples do contain errors. In this case there is a unique solution to the set of equations and the direct method will find it. This solution will not be the same as that given by error-free samples, but it will be the correct solution for the given frequency samples which have been corrupted by error. The Gerchberg algorithm will also converge to the correct solution for the given frequency samples; it will find the one function which satisfies both the time-limited constraints and the known frequency samples simultaneously. In other words, the known portion of the frequency spectrum of f which contains error can be thought of as a portion of the error-free spectrum of another time-limited function f̂. The direct method and the Gerchberg algorithm will both find the correct solution for the given information: f̂.
Before demonstrating the equivalence of the direct method and the Gerchberg algorithm for the exactly determined case, a convergence factor will be defined. The purpose for defining a convergence factor is to track the rate of convergence of the Gerchberg algorithm. The convergence factor is defined as

    c_f = |f_n - f_{n-1}| / |f - f_{n-1}|                                    (3.2)

where f is the solution which the Gerchberg algorithm will converge to and f_n is the estimate of the solution after n iterations. Obviously, computing the convergence factor requires knowing the solution f a-priori. When the vector estimate f_n is converging in a straight line toward f the convergence factor can be defined as

    c_f = ||f_n - f_{n-1}|| / ||f - f_{n-1}||                                (3.3)

where || · || indicates the standard Euclidean norm. When defined as Eq. 3.3, the convergence factor represents the fraction of the distance to the solution closed at each iteration. For example, a constant convergence factor of 0.5 means the difference between the latest estimate and the solution is being cut in half with each iteration.
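As a concrete illustration of Eq. 3.3, the small helper below computes the convergence factor from the solution and two successive estimates; the function name and the use of NumPy are illustrative assumptions, not part of the original development.

    import numpy as np

    # Sketch of the convergence factor of Eq. 3.3: the fraction of the remaining
    # distance to the solution f closed by the latest iteration.
    def convergence_factor(f, f_prev, f_curr):
        return np.linalg.norm(f_curr - f_prev) / np.linalg.norm(f - f_prev)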
3.2.1 Example 3.2
The direct method and the Gerchberg algorithm are used to reconstruct the time-limited function

    f = [1 2 3 0 0 0 0 0]                                                    (3.4)

from the first, second and eighth samples of its discrete frequency spectrum, which have been corrupted by noise. The known frequency samples are given as ĥ.

    ĥ = h + noise = [6.0000 + 0.0000i]   [0.0582 + 0.0000i]
                    [2.4142 - 4.4142i] + [0.0038 + 0.0176i]                  (3.5)
                    [2.4142 + 4.4142i]   [0.0038 - 0.0176i]

                  = [6.0582 + 0.0000i]
                    [2.4180 - 4.3966i]                                       (3.6)
                    [2.4180 + 4.3966i]
The solution f̂ given by the direct method is as follows.

    f̂ = [1.1268 + 0.0000i]
        [1.8260 - 0.0000i]
        [3.1055 + 0.0000i]
        [0.0000 + 0.0000i]
        [0.0000 + 0.0000i]                                                   (3.7)
        [0.0000 + 0.0000i]
        [0.0000 + 0.0000i]
        [0.0000 + 0.0000i]
The Gerchberg algorithm is allowed to iterate 10000 times and the convergence factor (Eq. 3.3) is computed after each iteration using f̂ obtained by the direct method above. The estimate given by the Gerchberg algorithm after 10000 iterations is as follows.

    f_10000 = [1.1268 + 0.0000i]
              [1.8260 - 0.0000i]
              [3.1055 + 0.0000i]
              [0.0000 + 0.0000i]
              [0.0000 + 0.0000i]                                             (3.8)
              [0.0000 + 0.0000i]
              [0.0000 + 0.0000i]
              [0.0000 + 0.0000i]

The difference f̂ - f_10000 is, relative to the size of f̂, numerically equivalent to zero for the double precision computer system on which the reconstructions were performed.
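A minimal sketch of the Gerchberg iteration used to produce this result is given below. It assumes N = 8, a time support on the first three samples, and the noisy frequency samples of Eq. 3.6 at the first, second and eighth bins (zero-based indices 0, 1 and 7); the loop is a plain reading of the algorithm's time-limiting and known-sample-replacement steps, not the author's original code.

    import numpy as np

    # Sketch of the discrete Gerchberg iteration for Example 3.2.
    N = 8
    support = np.zeros(N, dtype=bool); support[:3] = True      # time-limited region
    known = np.zeros(N, dtype=bool);   known[[0, 1, 7]] = True  # known frequency bins

    h = np.array([6.0582 + 0.0000j, 2.4180 - 4.3966j, 2.4180 + 4.3966j])  # Eq. 3.6

    F = np.zeros(N, dtype=complex)
    F[known] = h                        # initial spectrum: the known samples only
    for _ in range(10000):
        f = np.fft.ifft(F)
        f[~support] = 0.0               # impose the time-limited constraint
        F = np.fft.fft(f)
        F[known] = h                    # replace the known frequency samples
    print(np.round(np.fft.ifft(F).real[:3], 4))   # approaches [1.1268, 1.8260, 3.1055]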
The convergence factor (Eq. 3.2) is plotted for the first 100 iterations in Figure 3.9. After about twenty iterations the convergence factor approaches a constant of about 0.0185. The reason for the convergence factor eventually approaching a constant will be discussed in Chapter 4. For now, the fact that the convergence factor will eventually become a constant will be used to prove that for the exactly determined case the Gerchberg algorithm will converge to the same solution given by the direct method.

Figure 3.9: Convergence factor vs. iteration number for Example 3.2.
Assume that the Gerchberg algorithm is converging toward the direct method's solution, f, (at a constant rate) after k iterations, with k finite. Then the distance between f and f_k is given by the following.

    ||f - f_k|| = ||f - f_{k-1}|| (1 - c_f)                                  (3.9)

Now it will be shown that

    ||f - f_{k+n}|| = ||f - f_k|| (1 - c_f)^n                                (3.10)

Since the convergence factor is assumed to be constant,

    ||f - f_{k+1}|| = ||f - f_k|| (1 - c_f)                                  (3.11)

so Eq. 3.10 is true for n = 1. Now assume that it is true for some n > 1.

    ||f - f_{k+(n+1)}||                                                      (3.12)
      = ||f - f_{k+n}|| (1 - c_f)                                            (3.13)
      = ||f - f_k|| (1 - c_f)^n (1 - c_f)                                    (3.14)
      = ||f - f_k|| (1 - c_f)^{n+1}                                          (3.15)

Eq. 3.10 is true for n + 1, therefore by induction it must be true for all n. Since 0 < c_f < 1, the term (1 - c_f)^n must converge to zero as n approaches infinity. Therefore ||f - f_{k+n}|| must also converge to zero as n approaches infinity, so the Gerchberg algorithm's estimate f_n must converge to the direct method's solution f as the number of iterations approaches infinity if the convergence factor is constant.

Figure 3.10: Convergence to numerical limit of computer.
In practice, the numerical precision of the computer will limit how close the Gerchberg solution can converge to f̂. Figure 3.10 is a plot of the convergence factor for this example between iterations 2000 and 6000. This plot illustrates the convergence factor becoming unstable as f̂ - f_n nears the precision limit of the computer. At about 4800 iterations the Gerchberg algorithm finally reaches its numerical convergence limit and will converge no further.
3.3 The Overdetermined Case
As is the case with the direct method, when there are more known frequency sam­
ples than unknown time-domain samples the system of equations is overdetermined,
so if there is any error on the known frequency samples there usually is no solution
which perfectly matches the data. The Gerchberg algorithm will be unable to find
a solution which satisfies the known frequency samples and the time-limited con­
straints simultaneously. In this case, the Gerchberg algorithm will provide a solution
estimate but the correction energy, as defined by Gerchberg [2], will not converge to
zero and the Gerchberg algorithm's estimate will not in general be the same as the
least squares solution.
Gerchberg [2] defined the correction energy as the amount by which the error
energy is reduced with each correction, and he also showed that the correction energy
must decrease with each iteration. Gerchberg [2] provided examples in which the
correction energy failed to converge to zero and reasoned that this was caused by error
on the known frequency samples. This is partly true, but the failure of the correction
energy to converge to zero is also due to the fact that the system is overdetermined.
This concept is illustrated by the following example.
3.3.1 Example 3.3
Consider the discrete time-limited function f and its discrete frequency spectrum F.

    f = [1 1 1 0 0 0 0 0]^T

    F = [3.0000 + 0.0000i]
        [1.7071 - 1.7071i]
        [0.0000 - 1.0000i]
        [0.2929 + 0.2929i]
        [1.0000 + 0.0000i]                                                   (3.16)
        [0.2929 - 0.2929i]
        [0.0000 + 1.0000i]
        [1.7071 + 1.7071i]

The frequency samples are corrupted by the following noise vector.

    noise = [ 0.3750 + 0.0000i]
            [ 1.1252 + 0.3180i]
            [ 0.7286 - 0.5112i]
            [-2.3775 - 0.0020i]
            [-0.2738 + 1.6065i]                                              (3.17)
            [-2.3775 + 0.0020i]
            [ 0.7286 + 0.5112i]
            [ 1.1252 - 0.3180i]
The Gerchberg algorithm was used to reconstruct f for an exactly determined case (using the first, second, and eighth corrupted frequency samples) and for an overdetermined case (using the third and seventh samples also).
The correction energy was calculated at each iteration for the two cases and is plotted in Figure 3.11. It is apparent from the figure that the correction energy converged to zero for the exactly determined case despite significant error on the known frequency samples. Clearly, the correction energy converged to a non-zero value for the overdetermined case.

Figure 3.11: Correction energy for Example 3.3 (solid line: exactly determined; dotted line: overdetermined).

While the least squares solution of the direct method finds the optimum solution to the overdetermined problem in the least squares sense, the Gerchberg algorithm
finds the optimum solution in terms of minimizing the correction energy. Which
solution is better is a matter of debate, but it seems reasonable to suggest that they
won't be radically different for most cases.
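For monitoring purposes, the correction energy can be tracked as the energy outside the time-limited region at each iteration. The sketch below does this for a given set of known samples; the function name, its arguments, and the NumPy usage are illustrative assumptions rather than the author's implementation.

    import numpy as np

    # Sketch: track the energy outside the time-limited region (the correction energy)
    # at each Gerchberg iteration, for known frequency samples F_known at bins `known`.
    def correction_energy_history(F_known, known, support, n_iter):
        N = known.size
        F = np.zeros(N, dtype=complex)
        F[known] = F_known
        history = []
        for _ in range(n_iter):
            f = np.fft.ifft(F)
            history.append(float(np.sum(np.abs(f[~support]) ** 2)))
            f[~support] = 0.0
            F = np.fft.fft(f)
            F[known] = F_known
        return history

For Example 3.3, the exactly determined case would use the first, second and eighth bins of the noisy spectrum, and the overdetermined case would add the third and seventh bins.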
As with the direct method, when the system is underdetermined (there are more unknown time domain samples than known frequency samples) there will not be a unique solution. The Gerchberg algorithm will provide a solution in this case, but it is not guaranteed to be correct even if error free frequency samples are used for the reconstruction; therefore the underdetermined case should always be avoided.
3.4 Error Bounds for the Gerchberg Algorithm
Youla [8] suggested a theoretical error bound based on the sine of the infimum of the angle between the subspaces P_b and ⊥P_a. For most applications this error bound would be difficult to visualize and more difficult to implement. For the exactly determined case, since the Gerchberg algorithm will converge to the solution given by the direct method, the error bounds given in Chapter 2 for the direct method also apply for the Gerchberg algorithm for the exactly determined case. These bounds (specifically the a-priori bounds given by Eqs. 2.20 and 2.23) are very easily implemented and were shown in Chapter 2 to be reasonably tight.
For the overdetermined case, since the Gerchberg algorithm does not converge, in general, to the least squares solution, the least squares bounds would not necessarily apply for the Gerchberg algorithm, although it might be reasonable to use the least squares bound as an estimate of the bound for the Gerchberg algorithm. It also might be possible to use the correction energy to estimate the error in the solution given by the Gerchberg algorithm. This is a topic for future research.
3.5 Summary, Advantages and Disadvantages
This chapter examined the Gerchberg algorithm for super-resolution and related it
to the method of alternating orthogonal projections and the direct method of Chapter
2. It was shown that for the exactly determined case, the Gerchberg algorithm
will converge to the result given by the direct method. This relationship was used
to establish easily computable a-priori error bounds for the Gerchberg algorithm.
The correction energy described by Gerchberg [2] was reevaluated and linked to the
overdetermined case.
The one obvious disadvantage to using the Gerchberg algorithm is its slow convergence speed. It takes at least several thousand times as long for the Gerchberg algorithm to converge to within one percent of its final value as it takes for the direct method to find the solution. In general, a problem which the direct method can solve in milliseconds will take minutes for the Gerchberg algorithm to solve.
To this point, it would not seem that there would be any advantages to using the Gerchberg algorithm at all. For the exactly determined case and for the overdetermined error free case, the Gerchberg method converges to the same solution given by the direct method, and for the overdetermined case with error, the Gerchberg algorithm converges to a solution which is worse (in the least squares sense) than the direct method's solution. There is, however, one big advantage to using the Gerchberg algorithm when there is error on the known frequency samples, especially for problems where the condition number for the direct method is large. The Gerchberg algorithm can get closer to the true solution by terminating the iteration early. The reasons for this and the design of early termination schemes will be discussed in the next chapter.
CHAPTER 4
TERMINATION SCHEMES FOR THE GERCHBERG
ALGORITHM
Papoulis [7] states that the propagation of error through the Gerchberg algorithm
can be controlled by early termination of the iteration. This is generally accepted as
true. To date, however, there has been no explanation as to why early termination
should limit the effects of error and no criteria offered as to the optimal termination
point. This chapter will expand the model of Jones [9] to explain the conditions under
which early termination would limit the effects of error, and the results of extensive
experimental trials will show that these conditions seldom fail to occur. These results
may be used as a basis for the design of termination schemes. Finally, three specific
termination schemes are presented and tested.
4.1 Convergence For The Gerchberg Algorithm
The following model was introduced by Jones [9] to analyze the rate of convergence for the Gerchberg algorithm. Define x as the solution to the unknown time domain samples which the Gerchberg algorithm will converge to as the number of iterations approaches infinity. Define Ω_u as the matrix of DFT coefficients relating the unknown frequency samples to the unknown time domain samples. The rows of Ω_u correspond to the unknown frequency samples and the columns of Ω_u correspond to the unknown time domain samples. Define P as Ω_u^H Ω_u, normalized by the period N so that its eigenvalues lie between zero and one (the superscript H indicates conjugate transpose). Now the solution x can be expanded in eigenvectors of the matrix P

    x = Σ_{j=1}^{n} a_j V_j                                                  (4.1)

where V_j is the jth eigenvector of P and the coefficient a_j is a real or complex number.
The following example illustrates the meaning of the matrix Ω_u. Consider again the time-limited function f with discrete frequency spectrum F.

    f = [f_1 f_2 f_3 0 0 0 0 0]                                              (4.2)

    F = [F_1 F_2 F_3 F_4 F_5 F_6 F_7 F_8]                                    (4.3)

Suppose that only the first, second, and eighth frequency samples are known. The matrix Ω contains the DFT coefficients relating all of the frequency samples to all of the time domain samples; its entry in row k and column n is W_8^{(k-1)(n-1)}, where W_8 = e^{-j2π/8} is the DFT kernel.

    Ω = [ W_8^{(k-1)(n-1)} ],   k, n = 1, ..., 8                             (4.4)

Since the first three time domain samples are unknown, and the third through seventh frequency samples are unknown, Ω_u is the intersection of the first three columns and the third through seventh rows of Ω.

    Ω_u = [ W_8^0   W_8^2   W_8^4  ]
          [ W_8^0   W_8^3   W_8^6  ]
          [ W_8^0   W_8^4   W_8^8  ]                                         (4.5)
          [ W_8^0   W_8^5   W_8^10 ]
          [ W_8^0   W_8^6   W_8^12 ]
From Jones [9], the estimate of x after r iterations is given by

    x_r = Σ_{j=1}^{n} (1 - λ_j^r) a_j V_j + P^r x_0                          (4.6)

where λ_j is the eigenvalue corresponding to the jth eigenvector of P. From [9], since all eigenvalues of P satisfy 0 < λ < 1, the term P^r x_0 will converge to zero as r → ∞. From Eq. 4.6, it is apparent that the speed of convergence of each eigenvector is determined by the size of its corresponding eigenvalue. Eigenvectors with small corresponding eigenvalues will converge rapidly, while eigenvectors with large eigenvalues (close to 1) will converge slowly. Therefore, if the function x is concentrated in the eigenvectors with small eigenvalues, x can be recovered quickly, and if x is concentrated in eigenvectors with large eigenvalues it will take a large number of iterations to converge.
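The sketch below builds Ω and Ω_u for the small example above and examines the eigenvalues of P. The normalization of P by the period N is the assumption adopted earlier (chosen so that the eigenvalues fall between zero and one), and the NumPy construction is illustrative rather than the author's code.

    import numpy as np

    # Sketch: eigenvalues of P for the example of Eqs. 4.2-4.5 (N = 8, unknown time
    # samples 1-3, unknown frequency samples 3-7; zero-based indices below).
    N = 8
    unknown_t = [0, 1, 2]
    unknown_f = [2, 3, 4, 5, 6]
    W = np.exp(-2j * np.pi / N)
    Omega = W ** np.outer(np.arange(N), np.arange(N))   # full DFT coefficient matrix
    Omega_u = Omega[np.ix_(unknown_f, unknown_t)]       # unknown freq rows x unknown time cols
    P = Omega_u.conj().T @ Omega_u / N                  # assumed normalization
    lam = np.sort(np.linalg.eigvalsh(P))
    print(lam)    # eigenvalues in (0, 1); those close to 1 converge slowly (Eq. 4.6)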
In practice, there will be error in the solution x due to errors (noise and aliasing) on the known frequency samples. Hence, the solution can be represented as the sum of a true solution x̄ and an error e.

    x = x̄ + e                                                                (4.7)

The true solution and the error can be expanded separately in eigenvectors of P.

    x = x̄ + e = Σ_{j=1}^{n} α_j V_j + Σ_{j=1}^{n} β_j V_j                    (4.8)

By terminating the algorithm early, the contribution to the solution x from eigenvectors with large eigenvalues can be eliminated almost entirely. Therefore, whenever the error e is concentrated in the eigenvectors of P with large eigenvalues, the contribution from e can be minimized by terminating the algorithm early. The results from the next section will show that the error will usually be concentrated in these eigenvectors, therefore justifying early termination.
4.2 Results From Experimental Trials
Experimental trials were performed in order to determine how the true function x̄ and the error e are usually distributed among the eigenvectors of P. These trials were made feasible by the fact that for the exactly determined case the Gerchberg algorithm will converge to the same result given by the direct method. Using the Gerchberg algorithm to obtain the final result to within a small tolerance would have required thousands of iterations per trial.
4.2.1 Procedure
For each trial a time-limited function f was generated by choosing the non-zero time domain samples randomly from a uniform distribution. Complex Gaussian noise samples (each noise sample was the sum a + jb, where a and b were chosen randomly from a zero-mean Gaussian distribution) were then added to the frequency samples used for the reconstruction, and the solution for the unknown time domain samples x was obtained by the direct method. The solution x was divided into a true function x̄ and an error function e. The functions x̄ and e were expanded separately in eigenvectors of P.

    x̄ = Σ_{j=1}^{n} α_j V_j,    e = Σ_{j=1}^{n} β_j V_j                      (4.9)

The eigenvectors V_j were ordered such that their corresponding eigenvalues were in increasing order.

    λ_1 < λ_2 < ... < λ_{n-1} < λ_n                                          (4.10)

In this way the first eigenvector V_1 converged fastest and the last eigenvector V_n was the slowest to converge. The fractions of the true function x̄ and the error function e in the jth eigenvector of P were calculated as follows.

    α'_j = |α_j| / Σ_{i=1}^{n} |α_i|,    β'_j = |β_j| / Σ_{i=1}^{n} |β_i|    (4.11)

For a given set of values for m (number of known frequency samples), n (number of unknown time domain samples), and N (periodic length), 1000 independent trials were performed and expected values for α'_j and β'_j were computed and tabulated.
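One trial of this procedure can be sketched as follows. The choice of the first m bins as the known (low-pass) frequency samples and the unnormalized complex noise are simplifying assumptions made for illustration, and P is formed as in the earlier sketch.

    import numpy as np

    # Sketch of a single trial from Section 4.2.1 (exactly determined case, m = n).
    rng = np.random.default_rng(0)
    N, n = 32, 3
    m = n
    W = np.exp(-2j * np.pi / N)
    Omega = W ** np.outer(np.arange(N), np.arange(N))
    A = Omega[:m, :n]                                  # known freq rows x unknown time cols

    x_true = rng.uniform(0.0, 1.0, n)                  # random time-limited samples
    noise = rng.normal(size=m) + 1j * rng.normal(size=m)
    x = np.linalg.solve(A, A @ x_true + noise)         # direct-method solution
    e = x - x_true

    P = Omega[m:, :n].conj().T @ Omega[m:, :n] / N
    lam, V = np.linalg.eigh(P)                         # eigenvalues in increasing order
    alpha = np.linalg.solve(V, x_true.astype(complex)) # expand the true function
    beta = np.linalg.solve(V, e)                       # expand the error
    print(np.abs(alpha) / np.abs(alpha).sum())         # fractions alpha'_j of Eq. 4.11
    print(np.abs(beta) / np.abs(beta).sum())           # fractions beta'_j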
4.2.2 Results
Table 4.1: Expected distribution of x̄ and e over eigenvectors of P.

    N    n  m    α'_1    α'_2    α'_3    α'_4    α'_5    α'_6    α'_7
    8    3  3    0.6524  0.1730  0.1746
    16   3  3    0.6419  0.1761  0.1818
    16   5  5    0.5219  0.1124  0.1413  0.1112  0.1133
    32   3  3    0.6434  0.1761  0.1805
    32   5  5    0.5392  0.1138  0.1169  0.1141  0.1161
    32   7  7    0.4641  0.0823  0.1172  0.0836  0.0824  0.0891  0.0814
    64   3  3    0.6484  0.1700  0.1816
    64   5  5    0.5413  0.1142  0.1160  0.1154  0.1132
    64   7  7    0.4863  0.0844  0.0877  0.0855  0.0841  0.0848  0.0873
    128  3  3    0.6485  0.1755  0.1760
    128  5  5    0.5490  0.1168  0.1074  0.1127  0.1142
    128  7  7    0.4822  0.0870  0.0868  0.0843  0.0855  0.0854  0.0887
    256  3  3    0.6517  0.1738  0.1746
    256  5  5    0.5443  0.1117  0.1155  0.1113  0.1173
    256  7  7    0.4836  0.0892  0.0842  0.0851  0.0843  0.0876  0.0861

    N    n  m    β'_1    β'_2    β'_3    β'_4    β'_5    β'_6    β'_7
    8    3  3    0.0930  0.2017  0.7053
    16   3  3    0.0318  0.1265  0.8418
    16   5  5    0.0045  0.0065  0.0172  0.1113  0.8604
    32   3  3    0.0111  0.0851  0.9037
    32   5  5    0.0003  0.0010  0.0054  0.0700  0.9232
    32   7  7    0.0000  0.0000  0.0001  0.0005  0.0042  0.0647  0.9304
    64   3  3    0.0037  0.0493  0.9470
    64   5  5    0.0000  0.0001  0.0015  0.0428  0.9555
    64   7  7    0.0000  0.0000  0.0000  0.0001  0.0013  0.0361  0.9626
    128  3  3    0.0010  0.0299  0.9691
    128  5  5    0.0000  0.0000  0.0004  0.0233  0.9763
    128  7  7    0.0000  0.0000  0.0000  0.0000  0.0003  0.0281  0.9716
    256  3  3    0.0004  0.0175  0.9822
    256  5  5    0.0000  0.0000  0.0001  0.0152  0.9847
    256  7  7    0.0000  0.0000  0.0000  0.0000  0.0001  0.0147  0.9852

    N = periodic length of function, n = number of unknown time domain samples, m = number of known frequency samples, α'_j = expected fraction of x̄ in the jth eigenvector of P, β'_j = expected fraction of e in the jth eigenvector of P.
Table 4.1 contains results from trials using time-limited functions with a substantial dc component. The time domain samples of x̄ were randomly chosen from the uniform distribution [0, 1]. The frequency samples used for the reconstruction were taken from the low-pass portion of the spectrum (including the dc frequency component).

The top half of Table 4.1 contains the expected distribution of x̄ over the eigenvectors of P. The largest portion of x̄ was found in the first eigenvector, which was the only eigenvector with a non-zero dc component. The function x̄ is distributed fairly evenly among the remaining eigenvectors. Additional trials were performed using time-limited functions generated from the uniform distribution [-0.5, +0.5] (the expected value of the dc component for each trial was zero). For these trials the true function x̄ was generally spread evenly among all of the eigenvectors of P.
The lower half of Table 4.1 contains the sample density of the error function e over the eigenvectors of P. From the table, it is apparent that the error tends to be concentrated in the last few eigenvectors (the slowly converging ones), and this tendency becomes very strong as N and n get large (as cond(A) gets large). This result is important because when the error is heavily concentrated in the last few eigenvectors its contribution to x can be reduced greatly by terminating the Gerchberg algorithm early.
Table 4.2 illustrates how the distribution of the true function x̄ among the eigenvectors of P changes as the location of the known frequency samples shifts away from the low-pass region.
Table 4.2: Change in distribution of x̄ and e over eigenvectors of P as known frequencies move away from main lobe of frequency spectrum.

    N    n  m  1st   α'_1    α'_2    α'_3    α'_4    α'_5    α'_6
    64   6  6   2    0.4821  0.1296  0.0984  0.0987  0.0974  0.0938
    64   6  6  10    0.0953  0.1005  0.4813  0.1004  0.1322  0.0903
    64   6  6  20    0.0881  0.1002  0.1985  0.0931  0.0875  0.4326
    128  6  6   2    0.5006  0.1071  0.0978  0.0952  0.1000  0.0992
    128  6  6  10    0.4598  0.0945  0.1696  0.0917  0.0920  0.0924
    128  6  6  20    0.0935  0.1033  0.4780  0.0921  0.1357  0.0974
    128  6  6  30    0.1556  0.0899  0.0915  0.1186  0.4516  0.0928
    128  6  6  40    0.0901  0.1018  0.1981  0.0933  0.0912  0.4256

    N    n  m  1st   β'_1    β'_2    β'_3    β'_4    β'_5    β'_6
    64   6  6   2    ...     ...     ...     ...     ...     ...
    64   6  6  10    0.0013  0.0013  0.0094  0.0099  0.1990  0.7791
    64   6  6  20    0.0019  0.0021  0.0117  0.0204  0.2299  0.7341
    128  6  6   2    0.0000  0.0000  0.0001  0.0021  0.2067  0.7912
    128  6  6  10    0.0000  0.0000  0.0007  0.0029  0.1324  0.8639
    128  6  6  20    0.0003  0.0003  0.0043  0.0046  0.2067  0.7838
    128  6  6  30    0.0007  0.0009  0.0105  0.0136  0.4124  0.5618
    128  6  6  40    0.0005  0.0005  0.0063  0.0094  0.2460  0.7372

    N = periodic length of function, n = number of unknown time domain samples, m = number of known frequency samples, 1st = sample number of first known frequency sample, α'_j = expected fraction of x̄ in the jth eigenvector of P, β'_j = expected fraction of e in the jth eigenvector of P.
The samples of x̄ were again chosen randomly from the
uniform distribution [0,1] to ensure a substantial dc component. From Table 4.2 it
can be seen that as the known frequencies move away from the low-pass region, the
eigenvector containing the dc component of x shifts to the right. For cases where the
known frequency samples are quite far from the low-pass region, the dc component
is in the last (slowest converging) eigenvector. This means that for functions with a
substantial dc component, it is important that the known frequencies be as close to
the low-pass region as possible. In general, the reconstruction will be most effective
if the known frequencies are as close to the center of the main lobe of the frequency
spectrum as possible.
4.2.3 Aliasing Error
The previous trials have shown that error due to random noise is usually concentrated in the slower converging eigenvectors of P. It is more difficult to determine the effects of aliasing error experimentally. To introduce aliasing error into a trial one must start with a continuous time-limited function whose continuous frequency spectrum can be expressed mathematically by the Fourier Transform equation:

    F(ω) = ∫_{-∞}^{∞} f(t) e^{-jωt} dt                                       (4.13)

Therefore randomly generated signals are not applicable, and one quickly runs out of time-limited functions whose Fourier transforms are easily derived. In lieu of
thousands of trials, this section will provide two examples of the distribution of the
error function e caused by aliasing.
For the first example consider a time-limited function which is non-zero only over 0 ≤ t ≤ 1 (Eq. 4.14) and whose Fourier Transform is given by Eq. 4.15.
Five low frequency samples from F(ω) were modeled as five samples of a periodic frequency spectrum 128 samples long with a sampling frequency of 4.2 Hz. The only error on the known frequency samples was the aliasing error as described in Section 2.3. The solution x = x̄ + e was computed by the direct method and the true function x̄ and the error function e were expanded separately in eigenvectors of P. The fractions of x̄ and e in each eigenvector are given in the following table.

    α'_1    α'_2    α'_3    α'_4    α'_5
    0.9960  0.0000  0.0000  0.0040  0.0000
                                                                             (4.16)
    β'_1    β'_2    β'_3    β'_4    β'_5
    0.0015  0.0834  0.3665  0.0710  0.4476
For the second example consider the time-limited ramp function, non-zero only over 0 ≤ t ≤ 1 (Eq. 4.17), whose Fourier Transform is given by Eq. 4.18.
Again five samples from F(ω) were used to reconstruct x = x̄ + e into a vector 128 samples long. The true function x̄ and the error e were expanded in eigenvectors of P and the fractions of x̄ and e in each eigenvector are given in the following table.

    α'_1    α'_2    α'_3    α'_4    α'_5
    0.5840  0.4130  0.0024  0.0007  0.0000
                                                                             (4.19)
    β'_1    β'_2    β'_3    β'_4    β'_5
    0.0010  0.1540  0.2478  0.2606  0.3366
The results from these two examples are not intended to be taken as a standard for error in the solution due to aliasing, but they do show that aliasing error can be a substantial problem. The error for these two examples is not nearly as well behaved as for similar results shown for random noise (see Table 4.1: N = 128, m = n = 5). It is interesting to note that for these two examples the true function x̄ is particularly well behaved in terms of being concentrated in the faster converging eigenvectors.

The results from this section indicate that the error in the solution due to input errors will usually be concentrated in the slowly converging eigenvectors of the matrix P, and as the size of the error increases the error becomes more heavily concentrated in these eigenvectors. By terminating the Gerchberg algorithm early, the contribution from these eigenvectors (and thus the error) can be minimized. However, terminating the algorithm early introduces its own error because a portion of the true function x̄ is also in the slowly converging eigenvectors. Therefore we need to find the point which minimizes the overall error; this will be the optimal termination point.
4.3 Termination Schemes
In the past, the Gerchberg algorithm has been terminated by a human observer at his discretion [10]. We would like to have a termination scheme which automatically terminates the algorithm at or near the optimal termination point. Three possible termination schemes are suggested in this section. These schemes are not necessarily the only schemes, or even the optimal schemes, but they all make use of the results from the preceding section.
4.3.1 Convergence Factor
This method is based on computing the convergence factor (Eq. 3.3) between iterations and terminating the algorithm when it becomes a constant (or changes less than a certain threshold). When the convergence factor is a constant, there is only one eigenvector (the last eigenvector) which is still converging. This is based on Eq. 4.6 which relates the speed of convergence to the eigenvectors of P. Assuming all eigenvectors except the last eigenvector V_n have converged, and the term P^r x_0 has converged to zero, Eq. 4.6 can be rewritten as:

    x - x_r = λ_n^r (α_n + β_n) V_n                                          (4.20)

where x_r is the estimate of x after r iterations. Since λ_n < 1, the term λ_n^r will decrease by the constant factor λ_n at each iteration, so x - x_r will also decrease by a constant factor at each iteration.
By terminating the algorithm when the convergence factor approaches a constant
the contribution from the last eigenvector will be minimized and the contributions
from the other eigenvectors will be realized. Figure 3.9 from Chapter 3 shows a
typical plot of the convergence factor. For this example, the algorithm would be
terminated at around iteration number 20.
Since the convergence factor can be computed for only the exactly determined
case, this termination scheme is of limited value. Also, since this scheme can reliably
eliminate only the last eigenvector it would be inappropriate for applications where
more than one eigenvector would need to be eliminated (such as longer time-limited
functions with substantial noise on the known frequency samples).
4.3.2 2nd Derivative of Energy
This method computes the total reconstructed energy after each iteration and
terminates the algorithm when the slope of the energy function approaches a constant
(its second derivative approaches zero). The justification for this method is based on
the assumption that most of the true function will reside in the first few eigenvectors
and the error will be concentrated in the last few eigenvectors. Hence, there should
be a sharp increase in the reconstructed energy in the early iterations as the first few
eigenvectors converge rapidly. After this point the energy should increase at a slow,
nearly constant rate as the slower eigenvectors converge.
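A minimal sketch of this test is given below, assuming the total reconstructed energy has been recorded after each iteration; the threshold value and the function name are illustrative assumptions.

    import numpy as np

    # Sketch of the 2nd-derivative-of-energy termination test (Section 4.3.2).
    def energy_termination_point(energies, threshold=1e-3):
        d2 = np.diff(energies, n=2)                 # approximate second derivative
        below = np.flatnonzero(np.abs(d2) < threshold)
        return int(below[0]) + 2 if below.size else None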
Figure 4.1: Example of reconstructed energy (solid: total energy; dotted: energy due to the true function).
The main advantage of the energy method proposed herein is its easy implementation: simply compute the energy after each iteration and compute an approximate second derivative as the algorithm moves along. When the magnitude of the second derivative drops below a specified threshold the algorithm is terminated. The main disadvantage of this method is its lack of a rigorous supporting theory.
An example of this scheme is shown in Figures 4.1-4.3. Figure 4.1 shows the total reconstructed energy from using noisy frequency samples and the energy due to the true function alone. Figure 4.2 shows the second derivative of the total reconstructed energy. The energy termination scheme would terminate the algorithm at the first zero crossing (iteration number 25). Figure 4.3 shows the normalized mean squared error for this example. It reaches its minimum at iteration number 10, and then increases steadily. For this example the energy method works well because the mean squared error at iteration number 25 (the termination point given by the energy method) is close to the minimum mean squared error.

Figure 4.2: Second derivative of the total reconstructed energy from Figure 4.1 (zero crossing at iteration number 25).

Figure 4.3: Normalized mean squared error between the true solution and the Gerchberg algorithm's latest estimate (minimum at iteration number 10).
4.3.3 Statistically Optimum Termination
This method is based on performing experimental trials to determine one termi­
nation point which is the best on average. The basic idea is to run large numbers of
experimental trials using functions and errors similar to those which will be encoun­
tered for a particular application. One can then find the optimal termination point
for each trial and then choose the mean or the median as the statistically optimal
termination point.
4.3.4 Comparison Between Termination Schemes
Experimental trials were performed to compare the performance of the different
methods for two cases. Random time-limited functions and noise samples were gener­
ated as in the previous trials, and the Gerchberg algorithm was allowed to iterate up
to 500 times for each trial. For each trial the minimum mean squared error and the
iteration number at which it occurred were recorded. The termination points given
by various termination schemes were also recorded along with their corresponding
mean squared errors.
For the first case the performance of the energy method and the statistically optimal method were tested for discrete functions with N = 64 and m = n = 9. After 1000 trials the mean was computed for the number of iterations by which each method missed the optimal termination point. The average mean squared error in the solution at those points was also computed. The energy method did slightly better
for both measures, but neither method performed well consistently. On average, the
energy method missed the optimal termination point by about 120 iterations, and
the statistical method missed the optimal termination point by about 131 iterations.
For the second sample case the energy and convergence factor methods were tested
for discrete functions with N = 32, m = n = 3. Again neither method proved to be
extremely reliable. On average, the energy method missed the optimal termination
point by about 76 iterations and the convergence factor method missed by about 91
iterations. The two methods had an average mean squared error slightly less than
twice the minimum mean squared error, so in this regard, they didn't perform too
badly.
It is possible that for some of the trials the energy method mistakenly terminated
the algorithm early because the termination criteria appeared coincidentally before it
should have. Errors such as these would normally be detected by a human observer
and the algorithm would be allowed to continue.
4.4 Summary
This chapter explained the conditions under which early termination of the Gerchberg algorithm will limit the effects of input error on the result. The results from
experimental trials showed that these conditions can usually be expected, and the
conditions generally become stronger as the size of the error increases. Three schemes
were designed to automatically terminate the algorithm. These schemes were imple­
mented and proved to be fairly reliable, but not perfect.
CHAPTER 5
THE SVD METHOD
This chapter presents a new super-resolution method which could make the Gerchberg algorithm obsolete. The first section presents a super-resolution method which
combines the direct method of Chapter 2 and the eigenvector expansion of Chapter 4.
Although this method is non-iterative, it is based on the early termination criterion
of Chapter 4 and it achieves results nearly identical to the Gerchberg algorithm's
optimal solution with great savings in work and time. Section 2 presents a similar,
even more efficient method which is based on singular value decomposition (SVD)
techniques. Section 3 compares the new SVD method to the Gerchberg algorithm by
means of the familiar two point target example. Section 4 presents an example of the
SVD method used for a 2-dimensional super-resolution problem. The final section
provides explanations for the distribution of error and the change in the distribution
of the true function as the known portion of the frequency spectrum moves away
from the main lobe.
5.1 Eigenvector Expansion
In the previous chapter, the solution x given by the Gerchberg algorithm (and
the direct method for the exactly determined case) was expanded in eigenvectors of
the matrix P in order to show that the error is usually concentrated in one or a
few undesirable eigenvectors. The contribution from these eigenvectors (and thus the
error in the solution) can be minimized by terminating the algorithm early. Now, if
the solution x can be determined by the direct method and expanded in eigenvectors
of P, and the undesirable eigenvectors can be identified, then these eigenvectors can
be eliminated completely by simply recombining the expanded version of x while
omitting the components of x in the undesirable eigenvectors.
For example, consider once again the discrete time-limited function f with discrete frequency spectrum F.

    f = [f_1 f_2 f_3 0 0 0 0 0]                                              (5.1)

    F = [F_1 F_2 F_3 F_4 F_5 F_6 F_7 F_8]                                    (5.2)
Suppose the first, second, and eighth frequency samples are known but have been
corrupted by noise. The direct method can be used to find the solution x to the
unknown time domain samples. Since the known frequency samples contain error,
the solution x will be the sum of a true solution x̄ and an error function e. The solution x is expanded in eigenvectors of P by solving the linear set of equations Vα = x for α

    [V_11 V_12 V_13] [α_1]   [x_1]
    [V_21 V_22 V_23] [α_2] = [x_2],   i.e.,   x = α_1 V_1 + α_2 V_2 + α_3 V_3   (5.3)
    [V_31 V_32 V_33] [α_3]   [x_3]

where the columns of V are the eigenvectors of P. Suppose that the error e is concentrated in the third term α_3 V_3 and that the true solution x̄ is concentrated in the first two terms. Now the error in the solution can be greatly reduced by simply throwing away the third term and estimating the solution as the sum of the first two terms.

    x̂ = α_1 V_1 + α_2 V_2
For the exactly determined case, the eigenvector expansion can find the estimated solution given at any iteration of the Gerchberg algorithm by weighting the eigenvectors in Eq. 5.3 by the terms α_j (1 - λ_j^r) from Jones [9], where λ_j is the eigenvalue corresponding to the jth eigenvector V_j and r is the iteration number. For the overdetermined case, the least squares solution (Eq. 2.27) can also be expanded in eigenvectors of P and the undesirable vectors can be thrown out as described above.
By eliminating the undesirable eigenvectors directly, the eigenvector expansion method accomplishes the same thing as the Gerchberg algorithm and it achieves nearly identical results. The eigenvector expansion method, however, provides tremendous savings in work and time over the Gerchberg algorithm. The entire process consists of three steps. Step one is solving Ax = b for x. Step two is expanding x in eigenvectors of P, which requires solving another set of linear equations. Step three is recombining the expanded components of x while omitting those components in the undesired eigenvectors.
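A sketch of steps two and three, assuming the direct-method solution x and the matrix P of Chapter 4 are already available, is given below; the number of trailing (slowest-converging) eigenvectors to discard is left to the caller, and the function name is illustrative.

    import numpy as np

    # Sketch of the eigenvector expansion method (Section 5.1).
    def eigenvector_expansion(x, P, discard):
        lam, V = np.linalg.eigh(P)          # eigenvalues in increasing order, columns are V_j
        alpha = np.linalg.solve(V, x)       # expand x in eigenvectors of P (step two)
        if discard > 0:
            alpha[-discard:] = 0.0          # drop the slowest-converging components
        return V @ alpha                    # recombine the remaining components (step three)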
5.2 SVD Expansion
The solution x can also be expanded in right singular vectors of the matrix A. Recall from Chapter 2 that the matrix A consists of the DFT coefficients relating the known frequency samples to the unknown time domain samples. From [5], any m x n matrix A whose number of rows m is greater than or equal to its number of columns n can be written as the product of an m x n column-orthonormal matrix U, an n x n diagonal matrix W with non-negative elements in decreasing order down the diagonal, and the transpose of an n x n orthonormal matrix V.

Singular Value Decomposition of A:

    A = U · diag(w_1, w_2, ..., w_n) · V^T

where U is m x n, the diagonal matrix is n x n, and V^T is n x n. The w_j's are the singular values of A.
From [11], the columns of V (known as the right singular vectors of A) are an orthonormal set of eigenvectors for A^H A. The singular values (the w_j's) corresponding to the columns of V are the positive square roots of the eigenvalues of A^H A. The columns of V form an orthonormal set of vectors which span the space orthogonal to the nullspace of A, and the columns of U form an orthonormal set of vectors which span the range of A [12]. For our application, each column of A is linearly independent. Therefore if m ≥ n then A is full rank and its null space consists of the zero vector, so in this case the columns of V span R^n [13].
Experimental trials have shown that when the singular value decomposition is done
in this way the columns of V are similar (in some cases identical) to the orthonormal
eigenvectors of the matrix P . This means that the error will usually be concentrated
in the last few columns of V just as it is concentrated in the last few eigenvectors
of P. These are the columns of V which have the smallest corresponding singular
values. Expanding the solution x in column vectors of V (right singular vectors of A)
and throwing away the vectors with small corresponding singular values accomplishes
essentially the same thing as the eigenvector expansion described previously.
5.2.1 Advantages
The SVD expansion is preferable to the eigenvector expansion because it only involves the matrix A, whereas the eigenvector expansion requires constructing the matrix Ω_u, which can be very large for long functions. Extra matrix manipulations must also be performed to find the matrix P, and numerical errors occur with each extra computation.

The entire process of computing x, expanding it in column vectors of V, and eliminating the undesirable column vectors can be accomplished in one step by computing the following equation from [5].
    x = V · [diag(1/w_j)] · [U^T · b]                                        (5.5)

If the 1/w_j terms are left unaltered, the equation above will compute the least squares solution for Ax = b. Setting the largest 1/w_j terms to zero will eliminate the contribution from the undesirable column vectors of V to the solution x. The following expansion of Eq. 5.5 illustrates the process.

    x = V · [diag(1/w_j)] · [U^T · b]
      = (1/w_1)(U_1 b) V_1 + (1/w_2)(U_2 b) V_2 + ... + (1/w_n)(U_n b) V_n   (5.6)

where V_j denotes the jth column of V and U_j b denotes the jth element of U^T b. From the final form of the equation above it is clear that x is a linear combination of the right singular vectors V_j. It is also clear that the contribution from a particular V_j can be eliminated by setting the corresponding 1/w_j to zero.
Equation 5.5 is the main result of this thesis. The entire discrete-time super-resolution problem has been reduced to finding the SVD of the matrix A and performing a simple matrix multiplication. There are no linear equations to solve and no inverses to compute. This method accomplishes essentially the same thing as terminating the Gerchberg algorithm early: it eliminates the components of the solution which are dominated by error. Therefore, the SVD method can provide results nearly identical to the optimal solution given by the Gerchberg algorithm with huge savings in time and work.
5.3 Example: Two Point Target
This example will be used to compare the performance of the SVD method to the
Gerchberg algorithm for a two point target. This is the same example which was
used by Gerchberg in his original paper [2], with the exception of the noise samples
(Gerchberg used uniform noise, this example uses Gaussian noise).
The original time-limited object and its discrete frequency spectrum are shown in
Figures 5.1 and 5.2 respectively. The original object is a two point target. Each point
of the target consists of 2 samples and the points are separated by a distance of 7
samples. The total periodic length of the function is 256 samples. As in [2], all that
is known of the original object are the 37 low-pass samples of its frequency spectrum which have been corrupted by random noise, and the exact extent of the object in the time-domain (9 samples). The known portion of the frequency spectrum is shown in Figure 5.3, and the image given from this noisy, diffraction-limited spectrum is shown in Figure 5.4. The image in Figure 5.4 has lost all information about the two point nature of the original object due to noise and diffraction in the frequency domain. Note, we did not zero the samples outside the time-limited region, as would normally be done. The energy outside the time-limited region is the correction energy which was described in Chapter 3.

Figure 5.1: Original time-limited object.

Figure 5.2: Discrete frequency spectrum of original object.

Figure 5.3: Known portion of frequency spectrum distorted by noise.

Figure 5.4: Image from noisy, diffraction limited spectrum.

Figure 5.5: Error norm for the Gerchberg algorithm (minimum error norm = 999.5 at 154 iterations).
First, the Gerchberg algorithm's optimal solution was found. The algorithm was
allowed to iterate 1000 times and the norm of the error between the original object
and the algorithm's estimate over the time-limited region was computed after each
iteration.
A plot of the error norm versus iteration number is shown in Figure
5.5 from which it appears the minimum error norm was about 1000. The actual
minimum error norm was found to be 999.5 and it occurred at iteration number 154.
Figure 5.6: Result from the Gerchberg algorithm after 154 iterations (norm of error = 999.5).
The Gerchberg algorithm was restarted and allowed to iterate 154 times. The result given by the Gerchberg algorithm after 154 iterations is shown in Figure 5.6; this is the best result obtainable using the Gerchberg algorithm.¹
Next the SVD method was used for the same problem.
Figure 5.7 shows the
result obtained by the SVD method eliminating 6 right singular vectors. This result
is nearly identical to Figure 5.6, with a slightly lower error norm.
Figure 5.8
shows the result obtained by the SVD method eliminating 5 right singular vectors.
¹The portion of the result outside the known time-limited region would normally be set to zero. This portion of the function is the error energy which was discussed in Chapter 3.
Figure 5.7: Result from SVD method with 6 singular values thrown out (norm of error = 989.9).
Figure 5.8: Result from SVD method with 5 singular values thrown out (norm of error = 1030.4).
Although the error norm for this result is slightly larger than that for Figure 5.6, the
result is still a reasonable approximation to the original object.
To provide a fair comparison of the time and work requirements of the two meth­
ods, a 'stripped-down' version of the Gerchberg algorithm was used for this problem.
The stripped-down Gerchberg algorithm took about 8 seconds to complete 154 it­
erations, which was about 800 times as long as the 10 milliseconds required by the
SVD method. In terms of computing work, the SVD method required 9144 flops.
The Gerchberg algorithm required 3,895,740 flops for 154 iterations, an increase by
a factor of about 426 over the SVD method.
The savings in time and computing work for this example are probably uncharac­
teristically low. The signal to noise ratio for this example was less than 10 dB, which
is pretty dismal. As the SNR increases, the Gerchberg algorithm requires many more
iterations to reach its optimal result, while the SVD method will not require any
additional time or work. For the extreme case when there is no error, the SNR will
be infinite and theoretically the Gerchberg algorithm will require an infinite number
of iterations to reach its optimal solution. Therefore, much greater savings in time
and work can be expected for applications with a reasonably high signal to noise
ratio.
5.4 2-D Example
This example is included to show that the SVD method is applicable to multidi­
mensional super-resolution problems.
The original object (shown in Figure 5.9)
was known to be space-limited to a 7 x 7 square in the center of the 128 x 128 image
plane.
The discrete spatial frequency spectrum of the original object is shown
in Figure 5.10.
The frequency spectrum was corrupted by complex Gaussian
noise samples^ as shown in Figure 5.11 and a 15 x 15 section of the noisy frequency
spectrum (shown in Figure 5.12) was used for the reconstruction.
The image given
by the known portion of the frequency spectrum before the reconstruction is shown
in Figure 5.13. This blurred image gives no indication of the four distinct points of
the original object.
The SVD method was used to reconstruct the image from the known portion of
the noisy frequency spectrum.
The best result (lowest error norm) was obtained
using 8 singular vectors of A and is shown in Figure 5.14. The image in Figure 5.14
clearly shows 4 distinct peaks and is a great improvement over the blurred image in
Figure 5.13. The four point nature of the original object was also clearly visible in
results using 6, 7, 9, and 10 singular vectors.
These results are shown in
Figures 5.15, 5.16, 5.17, and 5.18 respectively.
(Each noise sample was the sum a + jb, where a and b were chosen randomly from a zero-mean Gaussian distribution.)
Figure 5.9: Original space-limited object.

Figure 5.11: Frequency spectrum distorted by noise.

Figure 5.12: Known portion of frequency spectrum.

Figure 5.13: Image from noisy, diffraction-limited spectrum.

Figure 5.14: SVD result using 8 singular vectors.

Figure 5.15: SVD result using 6 singular vectors.

Figure 5.16: SVD result using 7 singular vectors.

Figure 5.17: SVD result using 9 singular vectors.

Figure 5.18: SVD result using 10 singular vectors.
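One way to cast the 2-D problem in the same Ax = b form is sketched below. The placement of the space-limited block and of the known low-pass block, and the use of a Kronecker product of 1-D DFT submatrices for the vectorized image, are assumptions made for illustration rather than details taken from the experiment above.

    import numpy as np

    # Sketch: posing a 2-D reconstruction for the SVD method via a Kronecker product.
    def dft_submatrix(N, freq_rows, time_cols):
        W = np.exp(-2j * np.pi / N)
        return W ** np.outer(freq_rows, time_cols)

    N, n, mf = 128, 7, 15
    support = np.arange(n)                 # assumed location of the 7 x 7 space-limited block
    known = np.arange(mf)                  # assumed known (low-pass) 15 x 15 frequency block
    A1 = dft_submatrix(N, known, support)  # 15 x 7 one-dimensional submatrix
    A = np.kron(A1, A1)                    # 225 x 49 matrix acting on the vectorized image
    # b would be the vectorized 15 x 15 block of noisy frequency samples; the truncated
    # solution of Eq. 5.5 (svd_superresolution above) then yields the 49 image samples,
    # reshaped to 7 x 7.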
5.5 Error Concentration
The results from earlier sections have shown that as the size of the error increases, it becomes increasingly concentrated in the right singular vectors of A with small corresponding singular values. It was also shown that as the known frequencies move away from the main lobe of the frequency spectrum the true function x becomes increasingly concentrated in these same singular vectors. In this section the nature of the SVD solution itself will be used to partially explain why these two phenomena occur.

Again consider the problem Ax = b and the equivalent problem

    A(x + e) = (b + r)

where e is the error in the solution and r is the error on the known frequency samples.
The error e is given by the SVD solution as:

    e = V · [diag(1/w_j)] · [U^T · r]
      = (1/w_1)(U_1 r) V_1 + (1/w_2)(U_2 r) V_2 + ... + (1/w_n)(U_n r) V_n   (5.7)

where U_j denotes the jth row of U^T (the transpose of the jth column of U). From the final form of Eq. 5.7 it is clear that the error e is a linear combination of the right singular vectors (the V_j's) of A. The contribution to e from each V_j is given by the coefficient (1/w_j) U_j r. Using the Schwarz inequality and the fact that the U_j's are orthonormal, the size of the contribution from each right singular vector V_j is limited by the following formula.

    ||(1/w_j) U_j r|| ≤ (1/w_j) ||r||                                        (5.8)

Eq. 5.8 implies that if the error norm ||e|| is large compared to ||r||, then the error e must have large components from singular vectors with small corresponding singular values.
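The bound can be seen in one line from the Cauchy-Schwarz inequality, using the facts that each V_j and each U_j has unit norm; the short derivation below restates that step and is not text from the original.

    \left\| \frac{1}{w_j}\,(U_j r)\,V_j \right\|
      = \frac{1}{w_j}\,\lvert U_j r \rvert\,\lVert V_j \rVert
      = \frac{1}{w_j}\,\lvert U_j r \rvert
      \le \frac{1}{w_j}\,\lVert U_j \rVert\,\lVert r \rVert
      = \frac{1}{w_j}\,\lVert r \rVert .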
The SVD expansion can also be used to explain the change in the distribution of the true function x among the V_j's as the known frequencies move away from the main lobe of the frequency spectrum. The true function x can be expanded in terms of the SVD components as was done above for the error e. Omitting the intermediate steps, x can be expressed as the following linear combination of right singular vectors of A.

    x = (1/w_1)(U_1 b) V_1 + (1/w_2)(U_2 b) V_2 + ... + (1/w_n)(U_n b) V_n   (5.9)
Again using the Schwarz inequality and the fact that the U_j's are orthonormal, the contribution to x from each V_j is limited by the following formula.

    ||(1/w_j) U_j b|| ≤ (1/w_j) ||b||                                        (5.10)

As the known portion of the frequency spectrum b moves away from the main lobe, its size given by ||b|| decreases. The size of x remains constant however, so as ||b|| decreases, larger coefficients (1/w_j) U_j b are required to reconstruct x. Since the terms with large singular values w_j are severely limited by Eq. 5.10, the signal is effectively forced into the singular vectors with small corresponding singular values.
5.6 Summary
This chapter introduced two new super-resolution methods, the eigenvector expansion method and the SVD method. Both of these methods are based on solving the linear system Ax = b. It was shown that the eigenvector expansion method is a generalization of the discrete Gerchberg algorithm. Hence the eigenvector method can directly determine the result given by the Gerchberg algorithm at any iteration with huge savings in time and work. The SVD method was introduced and shown
to do essentially the same thing as the eigenvector expansion method, and the use
of SVD techniques makes this method numerically superior to the eigenvector ex­
pansion method. An example was used to compare the performance of the SVD
method to the Gerchberg algorithm. A second example applied the SVD method to
a 2-dimensional super-resolution problem. Finally, the SVD solution was expanded
to partly explain the expected distribution of the true function x and the error e
among the right singular vectors of A.
CHAPTER 6
SUMMARY AND CONCLUSION
This thesis has introduced several new concepts and methods for super-resolution.
A new super-resolution method, which we call the direct method, was introduced in
Chapter 2. This method is the first to recognize that, in the discrete form, the
super-resolution problem can be reduced to solving a linear set of equations relating
the known frequency samples to the unknown time or space domain samples. By
taking full advantage of the inherent structure of the discrete case, the direct method
achieves super-resolution very quickly and efficiently. The direct method can be
generalized to provide a least squares solution for the overdetermined case. The
main drawback of the direct method is its sensitivity to input errors.
The Gerchberg algorithm for super-resolution was examined in Chapter 3. The
discrete Gerchberg algorithm is an iterative method for solving the same set of equa­
tions which the direct method solves directly. For the exactly determined case, the
discrete Gerchberg algorithm converges to the same solution given by the direct
method. For the overdetermined case, the discrete Gerchberg algorithm converges to
the solution which minimizes the correction energy, which is not usually the same as
the least squares solution.
Chapter 4 presented a mathematical justification for early termination of the Gerchberg algorithm to limit the effects of errors. The conditions under which early
termination would minimize the effects of errors were outlined, and the results from
experimental trials showed that these conditions seldom fail to occur. Three termi­
nation schemes designed to limit the effects of error were introduced and tested.
In Chapter 5 two new super-resolution methods were introduced, the eigenvector
expansion method and the SVD method. The eigenvector expansion method is a noniterative generalization of the discrete Gerchberg algorithm based on solving the set of
linear equations directly. The eigenvector expansion method can directly determine
the solution given by the Gerchberg algorithm at any iteration. The SVD method
was shown to accomplish essentially the same thing. Due to the SVD techniques
which it employs, the SVD method is faster and numerically more accurate than the
eigenvector expansion method.
The SVD method has overcome the most significant drawback of the Gerchberg
algorithm, its slow convergence speed. The savings in time and computational work
which it provides over the Gerchberg algorithm are huge. What might take hours for
the Gerchberg algorithm to accomplish can be done in seconds with the new SVD
method. The savings in time might be especially crucial to the prospects for real-time
super-resolution applications.
REFERENCES

[1] J. L. Harris, "Diffraction and Resolving Power," Journal of the Optical Society of America, vol. 54, No. 7, pp. 931-936, July 1964.

[2] R. W. Gerchberg, "Super-resolution through error energy reduction," Optica Acta, vol. 21, pp. 709-720, Sept. 1974.

[3] K. E. Atkinson, An Introduction to Numerical Analysis. New York, NY: John Wiley & Sons, 1978.

[4] T. J. Aird and R. E. Lynch, "Computable Accurate Upper and Lower Error Bounds for Approximate Solutions of Linear Algebraic Systems," ACM Transactions on Mathematical Software, vol. 1, No. 3, pp. 217-231, Sept. 1975.

[5] W. H. Press et al., Numerical Recipes. Cambridge, UK: Cambridge University Press, 1986.

[6] The MathWorks, Matlab: A Tutorial. Natick, MA: The MathWorks, Inc., 1985.

[7] A. Papoulis, "A New Algorithm in Spectral Analysis and Band-Limited Extrapolation," IEEE Transactions on Circuits and Systems, vol. CAS-22, No. 9, pp. 735-742, Sept. 1975.

[8] D. C. Youla, "Generalized Image Restoration by the Method of Alternating Orthogonal Projections," IEEE Transactions on Circuits and Systems, vol. CAS-25, No. 9, pp. 694-702, Sept. 1978.

[9] M. C. Jones, "The Discrete Gerchberg Algorithm," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. ASSP-34, No. 3, pp. 624-626, June 1986.

[10] I. Sadka and H. Ur, "On The Application of Cadzow's Extrapolation Method of BL Signals," in International Conference on Acoustics, Speech, and Signal Processing, 1992.

[11] R. A. DeCarlo, Linear Systems, a State Variable Approach with Numerical Implementation. Englewood Cliffs, NJ: Prentice-Hall, 1989.

[12] H. S. Tharp, "A Numerical Algorithm for Chained Aggregation and Modified Chained Aggregation," M.S. thesis, University of Illinois, Urbana-Champaign, 1983.

[13] F. Ayres, Theory and Problems of Matrices. New York, NY: McGraw-Hill, 1962.