Fourier and Wavelet Signal Processing
Jelena Kovačević
Carnegie Mellon University
Vivek K Goyal
Massachusetts Institute of Technology
Martin Vetterli
École Polytechnique Fédérale de Lausanne
January 17, 2013
Copyright (c) 2013 Jelena Kovačević, Vivek K Goyal, and Martin Vetterli.
These materials are protected by copyright under the
Attribution-NonCommercial-NoDerivs 3.0 Unported License
from Creative Commons.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Cover photograph by Christiane Grimm, Geneva, Switzerland.
Experimental set up by Prof. Libero Zuppiroli and Philippe Bugnon,
Laboratory of Optoelectronics Molecular Materials, EPFL, Lausanne,
Switzerland.
The cover photograph captures the experiment first described by Isaac
Newton in Opticks in 1730, showing that white light can be split into
its color components and then synthesized back into white light. It
is a physical implementation of a decomposition of white light into its
Fourier components—the colors of the rainbow, followed by a synthesis
to recover the original.
Contents

Image Attribution
Quick Reference
Preface
Acknowledgments

1  Filter Banks: Building Blocks of Time-Frequency Expansions
   1.1  Introduction
   1.2  Orthogonal Two-Channel Filter Banks
        1.2.1  A Single Channel and Its Properties
        1.2.2  Complementary Channels and Their Properties
        1.2.3  Orthogonal Two-Channel Filter Bank
        1.2.4  Polyphase View of Orthogonal Filter Banks
        1.2.5  Polynomial Approximation by Filter Banks
   1.3  Design of Orthogonal Two-Channel Filter Banks
        1.3.1  Lowpass Approximation Design
        1.3.2  Polynomial Approximation Design
        1.3.3  Lattice Factorization Design
   1.4  Biorthogonal Two-Channel Filter Banks
        1.4.1  A Single Channel and Its Properties
        1.4.2  Complementary Channels and Their Properties
        1.4.3  Biorthogonal Two-Channel Filter Bank
        1.4.4  Polyphase View of Biorthogonal Filter Banks
        1.4.5  Linear-Phase Two-Channel Filter Banks
   1.5  Design of Biorthogonal Two-Channel Filter Banks
        1.5.1  Factorization Design
        1.5.2  Complementary Filter Design
        1.5.3  Lifting Design
   1.6  Two-Channel Filter Banks with Stochastic Inputs
   1.7  Computational Aspects
        1.7.1  Two-Channel Filter Banks
        1.7.2  Boundary Extensions
   Chapter at a Glance
   Historical Remarks
   Further Reading

2  Local Fourier Bases on Sequences
   2.1  Introduction
   2.2  N-Channel Filter Banks
        2.2.1  Orthogonal N-Channel Filter Banks
        2.2.2  Polyphase View of N-Channel Filter Banks
   2.3  Complex Exponential-Modulated Local Fourier Bases
        2.3.1  Balian-Low Theorem
        2.3.2  Application to Power Spectral Density Estimation
        2.3.3  Application to Communications
   2.4  Cosine-Modulated Local Fourier Bases
        2.4.1  Lapped Orthogonal Transforms
        2.4.2  Application to Audio Compression
   2.5  Computational Aspects
   Chapter at a Glance
   Historical Remarks
   Further Reading

3  Wavelet Bases on Sequences
   3.1  Introduction
   3.2  Tree-Structured Filter Banks
        3.2.1  The Lowpass Channel and Its Properties
        3.2.2  Bandpass Channels and Their Properties
        3.2.3  Relationship between Lowpass and Bandpass Channels
   3.3  Orthogonal Discrete Wavelet Transform
        3.3.1  Definition of the Orthogonal DWT
        3.3.2  Properties of the Orthogonal DWT
   3.4  Biorthogonal Discrete Wavelet Transform
        3.4.1  Definition of the Biorthogonal DWT
        3.4.2  Properties of the Biorthogonal DWT
   3.5  Wavelet Packets
        3.5.1  Definition of the Wavelet Packets
        3.5.2  Properties of the Wavelet Packets
   3.6  Computational Aspects
   Chapter at a Glance
   Historical Remarks
   Further Reading

4  Local Fourier and Wavelet Frames on Sequences
   4.1  Introduction
   4.2  Finite-Dimensional Frames
        4.2.1  Tight Frames for C^N
        4.2.2  General Frames for C^N
        4.2.3  Choosing the Expansion Coefficients
   4.3  Oversampled Filter Banks
        4.3.1  Tight Oversampled Filter Banks
        4.3.2  Polyphase View of Oversampled Filter Banks
   4.4  Local Fourier Frames
        4.4.1  Complex Exponential-Modulated Local Fourier Frames
        4.4.2  Cosine-Modulated Local Fourier Frames
   4.5  Wavelet Frames
        4.5.1  Oversampled DWT
        4.5.2  Pyramid Frames
        4.5.3  Shift-Invariant DWT
   4.6  Computational Aspects
        4.6.1  The Algorithm à Trous
        4.6.2  Efficient Gabor and Spectrum Computation
        4.6.3  Efficient Sparse Frame Expansions
   Chapter at a Glance
   Historical Remarks
   Further Reading

5  Local Fourier Transforms, Frames and Bases on Functions
   5.1  Introduction
   5.2  Local Fourier Transform
        5.2.1  Definition of the Local Fourier Transform
        5.2.2  Properties of the Local Fourier Transform
   5.3  Local Fourier Frame Series
        5.3.1  Sampling Grids
        5.3.2  Frames from Sampled Local Fourier Transform
   5.4  Local Fourier Series
        5.4.1  Complex Exponential-Modulated Local Fourier Bases
        5.4.2  Cosine-Modulated Local Fourier Bases
   5.5  Computational Aspects
        5.5.1  Complex Exponential-Modulated Local Fourier Bases
        5.5.2  Cosine-Modulated Local Fourier Bases
   Chapter at a Glance
   Historical Remarks
   Further Reading

6  Wavelet Bases, Frames and Transforms on Functions
   6.1  Introduction
        6.1.1  Scaling Function and Wavelets from Haar Filter Bank
        6.1.2  Haar Wavelet Series
        6.1.3  Haar Frame Series
        6.1.4  Haar Continuous Wavelet Transform
   6.2  Scaling Function and Wavelets from Orthogonal Filter Banks
        6.2.1  Iterated Filters
        6.2.2  Scaling Function and its Properties
        6.2.3  Wavelet Function and its Properties
        6.2.4  Scaling Function and Wavelets from Biorthogonal Filter Banks
   6.3  Wavelet Series
        6.3.1  Definition of the Wavelet Series
        6.3.2  Properties of the Wavelet Series
        6.3.3  Multiresolution Analysis
        6.3.4  Biorthogonal Wavelet Series
   6.4  Wavelet Frame Series
        6.4.1  Definition of the Wavelet Frame Series
        6.4.2  Frames from Sampled Wavelet Series
   6.5  Continuous Wavelet Transform
        6.5.1  Definition of the Continuous Wavelet Transform
        6.5.2  Existence and Convergence of the Continuous Wavelet Transform
        6.5.3  Properties of the Continuous Wavelet Transform
   6.6  Computational Aspects
        6.6.1  Wavelet Series: Mallat's Algorithm
        6.6.2  Wavelet Frames
   Chapter at a Glance
   Historical Remarks
   Further Reading

7  Approximation, Estimation, and Compression
   7.1  Introduction
   7.2  Abstract Models and Approximation
        7.2.1  Local Fourier and Wavelet Approximations of Piecewise Smooth Functions
        7.2.2  Wide-Sense Stationary Gaussian Processes
        7.2.3  Poisson Processes
   7.3  Empirical Models
        7.3.1  ℓp Models
        7.3.2  Statistical Models
   7.4  Estimation and Denoising
        7.4.1  Connections to Approximation
        7.4.2  Wavelet Thresholding and Variants
        7.4.3  Frames
   7.5  Compression
        7.5.1  Audio Compression
        7.5.2  Image Compression
   7.6  Inverse Problems
        7.6.1  Deconvolution
        7.6.2  Compressed Sensing
   Chapter at a Glance
   Historical Remarks
   Further Reading
   7.A  Elements of Source Coding
        7.A.1  Entropy Coding
        7.A.2  Quantization
        7.A.3  Transform Coding

Bibliography
Image Attribution

Sources
s1   Christiane Grimm, Geneva, Switzerland
s2   Wikimedia Commons
s3   Nobelprize.org
s4   KerryR.net
s5   Edward Lee's home page

Permissions
p1   Permission of the copyright holder
p2   Copyright expired
p3   GNU Free Documentation license
p4   Public domain
p5   Common property, no original authorship
p6   NASA material, not protected by copyright
p7   Public domain in the US
p8   Unknown rights

Front Material
• Cover photograph (s1, p1). Experimental set up by Prof. Libero Zuppiroli, Laboratory of Optoelectronics Molecular Materials, EPFL, Lausanne, Switzerland.
Quick Reference

Abbreviations
AR      Autoregressive
ARMA    Autoregressive moving average
AWGN    Additive white Gaussian noise
BIBO    Bounded input, bounded output
CDF     Cumulative distribution function
DCT     Discrete cosine transform
DFT     Discrete Fourier transform
DTFT    Discrete-time Fourier transform
DWT     Discrete wavelet transform
FFT     Fast Fourier transform
FIR     Finite impulse response
i.i.d.  Independent and identically distributed
IIR     Infinite impulse response
KLT     Karhunen–Loève transform
LOT     Lapped orthogonal transform
LPSV    Linear periodically shift varying
LSI     Linear shift invariant
MA      Moving average
MSE     Mean square error
PDF     Probability density function
POCS    Projection onto convex sets
ROC     Region of convergence
SVD     Singular value decomposition
WSCS    Wide-sense cyclostationary
WSS     Wide-sense stationary

Abbreviations used in tables and captions but not in the text
FT      Fourier transform
FS      Fourier series
LFT     Local Fourier transform
WT      Wavelet transform
Elements of Sets
N          natural numbers              0, 1, ...
Z          integers                     ..., −1, 0, 1, ...
Z+         positive integers            1, 2, ...
R          real numbers                 (−∞, ∞)
R+         positive real numbers        (0, ∞)
C          complex numbers              a + jb or re^{jθ} with a, b, r, θ ∈ R
I          a generic index set
V          a generic vector space
H          a generic Hilbert space
ℜ(·)       real part of
ℑ(·)       imaginary part of
S̄          closure of set S
x(t)       functions                    argument t is continuous valued, t ∈ R
x_n        sequences                    argument n is an integer, n ∈ Z
(x_n)_n    ordered sequence
{x_n}_n    set containing x_n
[x_n]      vector x with x_n as elements
δ(t)       Dirac delta function         ∫_{−∞}^{∞} x(t) δ(t) dt = x(0)
δ_n        Kronecker delta sequence     δ_n = 1 for n = 0; δ_n = 0 otherwise
1_I(t)     indicator function of interval I    1_I(t) = 1 for t ∈ I; 1_I(t) = 0 otherwise

Elements of Real Analysis
integration by parts    ∫ u dv = u v − ∫ v du

Elements of Complex Analysis
z          complex number               a + jb, re^{jθ}, a, b ∈ R, r ∈ [0, ∞), θ ∈ [0, 2π)
z^*        conjugation                  a − jb, re^{−jθ}
X_*(z)     conjugation of coefficients but not of z itself    X^*(z^*)
W_N        principal root of unity      e^{−j2π/N}

Asymptotic Notation
x ∈ O(y)   big O                        0 ≤ x_n ≤ γ y_n for all n ≥ n_0; some n_0 and γ > 0
x ∈ o(y)   little o                     0 ≤ x_n ≤ γ y_n for all n ≥ n_0; some n_0, any γ > 0
x ∈ Ω(y)   Omega                        x_n ≥ γ y_n for all n ≥ n_0; some n_0 and γ > 0
x ∈ Θ(y)   Theta                        x ∈ O(y) and x ∈ Ω(y)
x ≍ y      asymptotic equivalence       lim_{n→∞} x_n / y_n = 1
Standard Vector Spaces

ℓ²(Z)    Hilbert space of square-summable sequences:
         {x : Z → C | Σ_n |x_n|² < ∞}, with inner product ⟨x, y⟩ = Σ_n x_n y_n^*

L²(R)    Hilbert space of square-integrable functions:
         {x : R → C | ∫ |x(t)|² dt < ∞}, with inner product ⟨x, y⟩ = ∫ x(t) y(t)^* dt

ℓ^p(Z)   normed vector space of sequences with finite p norm, 1 ≤ p < ∞:
         {x : Z → C | Σ_n |x_n|^p < ∞}, with norm ‖x‖_p = (Σ_n |x_n|^p)^{1/p}

L^p(R)   normed vector space of functions with finite p norm, 1 ≤ p < ∞:
         {x : R → C | ∫ |x(t)|^p dt < ∞}, with norm ‖x‖_p = (∫ |x(t)|^p dt)^{1/p}

ℓ^∞(Z)   normed vector space of bounded sequences with supremum norm:
         {x : Z → C | sup_n |x_n| < ∞}, with norm ‖x‖_∞ = sup_n |x_n|

L^∞(R)   normed vector space of bounded functions with supremum norm:
         {x : R → C | sup_t |x(t)| < ∞}, with norm ‖x‖_∞ = sup_t |x(t)|

Bases and Frames for Sequences

{e_n}          standard Euclidean basis: e_{n,k} = 1 for k = n, and 0 otherwise
φ              vector, element of a basis or frame; when applicable, a column vector
Φ              basis or frame: set of vectors {φ_n}
Φ              operator: concatenation of the φ_n in a linear operator, [φ_0 φ_1 ... φ_{N−1}]
φ̃              vector, element of a dual basis or frame; when applicable, a column vector
Φ̃              dual basis or frame: set of vectors {φ̃_n}
Φ̃              operator: concatenation of the φ̃_n in a linear operator, [φ̃_0 φ̃_1 ... φ̃_{N−1}]
x = Φ Φ̃^* x    expansion in a basis or frame
Transforms

Fourier transform (FT), x(t) ↔ X(ω):
    X(ω) = ∫_{−∞}^{∞} x(t) e^{−jωt} dt,
    x(t) = (1/2π) ∫_{−∞}^{∞} X(ω) e^{jωt} dω.

Fourier series (FS), x(t) ↔ X_k:
    X_k = (1/T) ∫_{−T/2}^{T/2} x(t) e^{−j(2π/T)kt} dt,
    x(t) = Σ_{k∈Z} X_k e^{j(2π/T)kt}.

Discrete-time Fourier transform (DTFT), x_n ↔ X(e^{jω}):
    X(e^{jω}) = Σ_{n∈Z} x_n e^{−jωn},
    x_n = (1/2π) ∫_{−π}^{π} X(e^{jω}) e^{jωn} dω.

Discrete Fourier transform (DFT), x_n ↔ X_k:
    X_k = Σ_{n=0}^{N−1} x_n W_N^{kn},
    x_n = (1/N) Σ_{k=0}^{N−1} X_k W_N^{−kn}.

Local Fourier transform (LFT), x(t) ↔ X(Ω, τ):
    X(Ω, τ) = ∫_{−∞}^{∞} x(t) p(t − τ) e^{−jΩt} dt,
    x(t) = (1/2π) ∫_{−∞}^{∞} ∫_{−∞}^{∞} X(Ω, τ) g_{Ω,τ}(t) dΩ dτ.

Continuous wavelet transform (CWT), x(t) ↔ X(a, b):
    X(a, b) = ∫_{−∞}^{∞} x(t) ψ_{a,b}(t) dt,
    x(t) = (1/C_ψ) ∫_{0}^{∞} ∫_{−∞}^{∞} X(a, b) ψ_{a,b}(t) (db da / a²).

Wavelet series (WS), x(t) ↔ β_k^{(ℓ)}:
    β_k^{(ℓ)} = ∫_{−∞}^{∞} x(t) ψ_{ℓ,k}(t) dt,
    x(t) = Σ_{ℓ∈Z} Σ_{k∈Z} β_k^{(ℓ)} ψ_{ℓ,k}(t).

Discrete wavelet transform (DWT), x_n ↔ {α_k^{(J)}, β_k^{(J)}, ..., β_k^{(1)}}:
    α_k^{(J)} = Σ_{n∈Z} x_n g_{n−2^J k}^{(J)},   β_k^{(ℓ)} = Σ_{n∈Z} x_n h_{n−2^ℓ k}^{(ℓ)},
    x_n = Σ_{k∈Z} α_k^{(J)} g_{n−2^J k}^{(J)} + Σ_{ℓ=1}^{J} Σ_{k∈Z} β_k^{(ℓ)} h_{n−2^ℓ k}^{(ℓ)}.

Discrete cosine transform (DCT), x_n ↔ X_k:
    X_0 = √(1/N) Σ_{n=0}^{N−1} x_n,   X_k = √(2/N) Σ_{n=0}^{N−1} x_n cos((2π/2N) k (n + 1/2)),
    x_n = √(1/N) X_0 + √(2/N) Σ_{k=1}^{N−1} X_k cos((2π/2N) k (n + 1/2)).

z-transform (ZT), x_n ↔ X(z):
    X(z) = Σ_{n∈Z} x_n z^{−n}.
Discrete-Time Nomenclature

Sequence (signal, vector): x_n

Convolution ((h ∗ x)_n is the nth element of the convolution result):
    linear       h ∗ x:   Σ_{k∈Z} x_k h_{n−k} = Σ_{k∈Z} h_k x_{n−k}
    circular     h ⊛ x:   Σ_{k=0}^{N−1} x_k h_{(n−k) mod N} = Σ_{k=0}^{N−1} h_k x_{(n−k) mod N}
    with shifted arguments:   h_{ℓ−n} ∗_n x_{n−m} = Σ_{k∈Z} x_{k−m} h_{ℓ−n+k}

Eigensequence (eigenfunction, eigenvector) v_n:
    infinite time    v_n = e^{jωn},        h ∗ v = H(e^{jω}) v
    finite time      v_n = e^{j2πkn/N},    h ⊛ v = H_k v

Frequency response (eigenvalue corresponding to v_n):
    infinite time    H(e^{jω}) = Σ_{n∈Z} h_n e^{−jωn}
    finite time      H_k = Σ_{n=0}^{N−1} h_n e^{−j2πkn/N} = Σ_{n=0}^{N−1} h_n W_N^{kn}

Continuous-Time Nomenclature

Function (signal): x(t)

Convolution ((h ∗ x)(t) is the convolution result at t):
    linear       h ∗ x:   ∫_{−∞}^{∞} x(τ) h(t − τ) dτ = ∫_{−∞}^{∞} h(τ) x(t − τ) dτ
    circular     h ⊛ x:   ∫_{0}^{T} x(τ) h(t − τ) dτ = ∫_{0}^{T} h(τ) x(t − τ) dτ

Eigenfunction (eigenvector) v(t):
    infinite time    v(t) = e^{jωt},       h ∗ v = H(ω) v
    finite time      v(t) = e^{j2πkt/T},   h ⊛ v = H_k v

Frequency response (eigenvalue corresponding to v(t)):
    infinite time    H(ω) = ∫_{−∞}^{∞} h(t) e^{−jωt} dt
    finite time      H_k = ∫_{−T/2}^{T/2} h(τ) e^{−j2πkτ/T} dτ
Two-Channel Filter Banks

Basic characteristics: number of channels M = 2; sampling factor N = 2; channel sequences α_n, β_n.

Filters                    Synthesis                           Analysis
                           lowpass          highpass           lowpass            highpass
orthogonal                 g_n              h_n                g_{−n}             h_{−n}
biorthogonal               g_n              h_n                g̃_n                h̃_n
polyphase components       g_{0,n}, g_{1,n} h_{0,n}, h_{1,n}   g̃_{0,n}, g̃_{1,n}   h̃_{0,n}, h̃_{1,n}

Tree-Structured Filter Banks (DWT)

Basic characteristics: number of channels M = J + 1; sampling at level ℓ: N^{(ℓ)} = 2^ℓ, ℓ = 1, 2, ..., J; channel sequences α_n^{(J)}, β_n^{(ℓ)}.

Filters                    Synthesis                           Analysis
                           lowpass          bandpass (ℓ)       lowpass            bandpass (ℓ)
orthogonal                 g_n^{(J)}        h_n^{(ℓ)}          g_{−n}^{(J)}       h_{−n}^{(ℓ)}
biorthogonal               g_n^{(J)}        h_n^{(ℓ)}          g̃_n^{(J)}          h̃_n^{(ℓ)}
polyphase component j      g_{j,n}^{(J)}    h_{j,n}^{(ℓ)}      g̃_{j,n}^{(J)}      h̃_{j,n}^{(ℓ)}      (j = 0, 1, ..., 2^ℓ − 1)

N-Channel Filter Banks

Basic characteristics: number of channels M = N; sampling factor N; channel sequences α_{i,n}, i = 0, 1, ..., N − 1.

Filters                    Synthesis        Analysis
orthogonal filter i        g_{i,n}          g_{i,−n}
biorthogonal filter i      g_{i,n}          g̃_{i,n}
polyphase component j      g_{i,j,n}        g̃_{i,j,n}      (j = 0, 1, ..., N − 1)

Oversampled Filter Banks

Basic characteristics: number of channels M > N; sampling factor N; channel sequences α_{i,n}, i = 0, 1, ..., M − 1.

Filters                    Synthesis        Analysis
filter i                   g_{i,n}          g̃_{i,n}
polyphase component j      g_{i,j,n}        g̃_{i,j,n}      (j = 0, 1, ..., N − 1)
Preface
The aim of this book, together with its predecessor Signal Processing: Foundations
(SP:F) [107], is to provide a set of tools for users of state-of-the-art signal processing
technology and a solid foundation for those hoping to advance the theory and practice of signal processing. Many of the results and techniques presented here, while
rooted in classic Fourier techniques for signal representation, first appeared during
a flurry of activity in the 1980s and 1990s. New constructions for local Fourier
transforms and orthonormal wavelet bases during that period were motivated both
by theoretical interest and by applications, in particular in multimedia communications. New bases with specified time–frequency behavior were found, with impact
well beyond the original fields of application. Areas as diverse as computer graphics
and numerical analysis embraced some of the new constructions—no surprise given
the pervasive role of Fourier analysis in science and engineering.
Now that the dust has settled, some of what was new and esoteric is now
fundamental. Our motivation is to bring these new fundamentals to a broader
audience to further expand their impact. We thus provide an integrated view of
classical Fourier analysis of signals and systems alongside structured representations
with time–frequency locality and their myriad of applications.
This book relies heavily on the base built in SP:F. Thus, these two books
are to be seen as integrally related to each other. References to SP:F are given in
italics.
Signal Processing: Foundations The first book covers the foundations for an extensive understanding of signal processing. It contains material that many readers
may have seen before, but without the Hilbert space interpretations that are essential in contemporary signal processing research and technology. In Chapter 2,
From Euclid to Hilbert, the basic geometric intuition central to Hilbert spaces is developed, together with all the necessary tools underlying the construction of bases.
Chapter 3, Sequences and Discrete-Time Systems, is a crash course on processing
signals in discrete time or discrete space. In Chapter 4, Functions and Continuous-Time Systems, the mathematics of Fourier transforms and Fourier series is reviewed.
Chapter 5, Sampling and Interpolation, talks about the critical link between discrete
and continuous domains as given by the sampling theorem and interpolation, while
Chapter 6, Approximation and Compression, veers from the exact world to the approximate one. The final chapter is Chapter 7, Localization and Uncertainty, and
it considers time–frequency behavior of the abstract representation objects studied
thus far. It also discusses issues arising in the real world as well as ways of adapting
these tools for use in the real world. The main concepts seen—such as geometry of
Hilbert spaces, existence of bases, Fourier representations, sampling and interpolation, and approximation and compression—build a powerful foundation for modern
signal processing. These tools hit roadblocks they must overcome: finiteness and
localization, limitations of uncertainty, and computational costs.
Signal Processing: Fourier and Wavelet Representations This book presents
signal representations, including Fourier, local Fourier and wavelet bases, related
constructions, as well as frames and continuous transforms.
It starts with Chapter 1, Filter Banks: Building Blocks of Time-Frequency Expansions, which presents a thorough treatment of the basic block—the two-channel
filter bank, a signal processing device that splits a signal into a coarse, lowpass
approximation, and a highpass detail.
We generalize this block in the three chapters that follow, all dealing with
Fourier- and wavelet-like representations on sequences: In Chapter 2, Local Fourier
Bases on Sequences, we discuss Fourier-like bases on sequences, implemented by
N -channel modulated filter banks (first generalization of the two-channel filter
banks). In Chapter 3, Wavelet Bases on Sequences, we discuss wavelet-like bases
on sequences, implemented by tree-structured filter banks (second generalization).
In Chapter 4, Local Fourier and Wavelet Frames on Sequences, we discuss both
Fourier- and wavelet-like frames on sequences, implemented by oversampled filter
banks (third generalization).
We then move to the two chapters dealing with Fourier- and wavelet-like representations on functions. In Chapter 5, Local Fourier Transforms, Frames and
Bases on Functions, we start with the most natural representation of smooth functions with some locality, the local Fourier transform, followed by its sampled version/frame, and leading to results on whether bases are possible. In Chapter 6,
Wavelet Bases, Frames and Transforms on Functions, we do the same for wavelet
representations on functions, but in opposite order: starting from bases, through
frames and finally continuous wavelet transform.
The last chapter, Chapter 7, Approximation, Estimation, and Compression,
uses all the tools we introduced to address state-of-the-art signal processing and
communication problems and their solutions. The guiding principle is that there is
a domain where the problem at hand will have a sparse solution, at least approximately so. This is known as sparse signal processing, and many examples, from the
classical Karhunen-Loève expansion to nonlinear approximation in discrete cosine
transform and wavelet domains, all the way to contemporary research in compressed
sensing, use this principle. The chapter introduces and overviews sparse signal processing, covering approximation methods, estimation procedures such as denoising,
as well as compression methods and inverse problems.
Teaching Points Our aim is to present a synthetic view from basic mathematical principles to actual constructions of bases and frames, always with an eye on
concrete applications. While the benefit is a self-contained presentation, the cost
is a rather sizable manuscript. Referencing in the main text is sparse; pointers to
bibliography are given in Further Reading at the end of each chapter.
The material grew out of teaching signal processing, wavelets and applications
in various settings. Two of the authors, Martin Vetterli and Jelena Kovačević, authored a graduate textbook, Wavelets and Subband Coding (originally with Prentice
Hall in 1995), which they and others used to teach graduate courses at various US
and European institutions. This book is now online with open access.1 With more
than a decade of experience, the maturing of the field, and the broader interest
arising from and for these topics, the time was right for an entirely new text geared
towards a broader audience, one that could be used to span levels from undergraduate to graduate, as well as various areas of engineering and science. As a case in
point, parts of the text have been used at Carnegie Mellon University in classes
on bioimage informatics, where some of the students are life-sciences majors. This
plasticity of the text is one of the features which we aimed for, and that most probably differentiates the present book from many others. Another aim is to present
side-by-side all methods that arose around signal representations, without favoring
any in particular. The truth is that each representation is a tool in the toolbox of
the practitioner, and the problem or application at hand ultimately determines the
appropriate one to use.
Free Version This free, electronic version of the book contains all of the main
material, except for solved exercises and exercises. Consequently, all references to
exercises will show as ??. Moreover, this version does not contain PDF hyperlinks.
Jelena Kovačević, Vivek K Goyal, and Martin Vetterli
March 2012
1 http://waveletsandsubbandcoding.org/
Acknowledgments
This project exists thanks to the help of many people, whom we attempt to list
below. We apologize for any omissions. The current edition is a work in progress;
we welcome corrections, complaints, and suggestions.
We are grateful to Prof. Libero Zuppiroli of EPFL and Christiane Grimm
for the photograph that graces the cover; Prof. Zuppiroli proposed an experiment
from Newton’s treatise on Opticks [65] as emblematic of the book, and Ms. Grimm
beautifully photographed the apparatus that he designed. Françoise Behn, Jocelyne
Plantefol and Jacqueline Aeberhard typed and organized parts of the manuscript,
Eric Strattman assisted with many of the figures, Krista Van Guilder designed and
implemented the book web site, and Jorge Albaladejo Pomares designed and implemented the book blog. We thank them for their diligence and patience. Patrick
Vandewalle designed, wrote and implemented most of the Matlab companion to the
book. Similarly, S. Grace Chang and Yann Barbotin helped organize and edit the
problem companion. We thank them for their expertise and insight. We are indebted to George Beck and Ed Pegg, Jr., for their Mathematica editorial comments
on Wolfram Demonstrations inspired by this book.
Many instructors have gamely tested pre-alpha versions of this manuscript.
Of these, Amina Chebira, Yue M. Lu, and Thao Nguyen have done far more than
their share in providing invaluable comments and suggestions. We also thank Zoran Cvetković and Minh Do for teaching with the manuscript and providing many
constructive comments, Matthew Fickus for consulting on some finer mathematical
points, and Thierry Blu for providing a simple proof for a particular case of the
Strang–Fix theorem. Useful comments have also been provided by Pedro Aguilar,
A. Avudainayagam, Aniruddha Bhargava, André Tomaz de Carvalho, S. Esakkirajan, Germán González, Alexandre Haehlen, Mina Karzand, Hossein Rouhani, Noah
Stein, and Christophe Tournery.
Martin Vetterli thanks current and former EPFL graduate students and postdocs who helped develop the material, solve problems, catch typos, and suggest improvements, among other things. They include Florence Bénézit, Amina Chebira,
Minh Do, Ivan Dokmanic, Pier Luigi Dragotti, Ali Hormati, Ivana Jovanović, Mihailo Kolundzija, Jérôme Lebrun, Yue Lu, Pina Marziliano, Fritz Menzer, Reza
Parhizkar, Paolo Prandoni, Juri Ranieri, Olivier Roy, Rahul Shukla, Jayakrishnan Unnikrishnan, Patrick Vandewalle and Vladan Velisavljević. He gratefully acknowledges support from the Swiss National Science Foundation through awards
2000-063664, 200020-103729, 200021-121935, and the European Research Council
through award SPARSAM 247006.
Jelena Kovačević thanks her present and past graduate students Ramu Bhagavatula, Amina Chebira, Jackie Chen, Siheng Chen, Charles Jackson, Xindian Long,
Mike McCann, Anupama Kuruvilla, Tad Merryman, Vivek Oak, Aliaksei Sandryhaila and Gowri Srinivasa, many of whom served as TAs for her classes at CMU, together with Pablo Hennings Yeomans. Thanks also to all the students in the following classes at CMU: 42-431/18-496, 42-540, 42-731/18-795, 18-799, 42-540, taught
from 2003 to 2012. She is grateful to her husband, Giovanni Pacifici, for the many
useful scripts automating various book-writing processes. She gratefully acknowledges support from the NSF through awards 1017278, 1130616, 515152, 633775,
0331657, 627250 and 331657; the NIH through awards 1DC010283, EB008870,
EB009875 and 1DC010283; the CMU CIT through Infrastructure for Large-Scale
Biomedical Computing award; the PA State Tobacco Settlement, Kamlet-Smith
Bioinformatics Grant; and the Philip and Marsha Dowd Teaching Fellowship.
Vivek Goyal thanks TAs and students at MIT for suggestions, in particular
Baris Erkmen, Ying-zong Huang, Zahi Karam, Ahmed Kirmani, Ranko Sredojevic,
Ramesh Sridharan, Watcharapan Suwansantisuk, Vincent Tan, Archana Venkataraman, Adam Zelinski, and Serhii Zhak. He gratefully acknowledges support from the
NSF through awards 0643836, 0729069, 1101147, 1115159, and 1161413; Texas Instruments through its Leadership University Program; Hewlett-Packard, Inc.; MIT
through an Esther and Harold E. Edgerton Career Development Chair; and the
Centre Bernoulli of EPFL.
Chapter 1

Filter Banks: Building Blocks of Time-Frequency Expansions
Contents
1.1  Introduction
1.2  Orthogonal Two-Channel Filter Banks
1.3  Design of Orthogonal Two-Channel Filter Banks
1.4  Biorthogonal Two-Channel Filter Banks
1.5  Design of Biorthogonal Two-Channel Filter Banks
1.6  Two-Channel Filter Banks with Stochastic Inputs
1.7  Computational Aspects
Chapter at a Glance
Historical Remarks
Further Reading
The aim of this chapter is to build discrete-time bases with desirable time-frequency features and structure that enable tractable analysis and efficient algorithmic implementation. We achieve these goals by constructing bases via filter banks.
Using filter banks provides an easy way to understand the relationship between
analysis and synthesis operators, while, at the same time, making their efficient
implementation obvious. Moreover, filter banks are at the root of the constructions
of wavelet bases in Chapters 3 and 6. In short, together with discrete-time filters
and the FFT, filter banks are among the most basic tools of signal processing.
This chapter deals exclusively with two-channel filter banks since they (1) are the simplest; (2) reveal the essence of the N-channel ones; and (3) are used as building blocks for more general bases. We focus first on the orthogonal case, which is the most structured and has the easiest geometric interpretation. Due to its importance in practice, we follow with the discussion of the biorthogonal case. We consider real-coefficient filter banks exclusively; pointers to complex-coefficient ones, as well as to various generalizations, such as N-channel filter banks, multidimensional filter banks and transmultiplexers, are given in Further Reading.
1.1  Introduction
Implementing a Haar Orthonormal Basis Expansion
At the end of the previous book, we constructed an orthonormal basis for ℓ²(Z) that possesses structure in terms of time and frequency localization properties (it serves as an almost perfect localization tool in time and a rather rough one in frequency) and is efficient (it is built from two template sequences, one lowpass and the other highpass, and their shifts). This was the so-called Haar basis.
What we want to do now is implement that basis using signal processing
machinery. We first rename our template basis sequences from (??) and (??) as:
    g_n = φ_{0,n} = (1/√2)(δ_n + δ_{n−1}),    (1.1a)
    h_n = φ_{1,n} = (1/√2)(δ_n − δ_{n−1}).    (1.1b)
This is done both for simplicity and because it is the standard way these sequences are denoted. We start by rewriting the reconstruction formula (??) as

    x_n = Σ_{k∈Z} ⟨x, φ_{2k}⟩ φ_{2k,n} + Σ_{k∈Z} ⟨x, φ_{2k+1}⟩ φ_{2k+1,n}
        = Σ_{k∈Z} α_k φ_{2k,n} + Σ_{k∈Z} β_k φ_{2k+1,n}
        = Σ_{k∈Z} α_k g_{n−2k} + Σ_{k∈Z} β_k h_{n−2k},    (1.2)
where we have renamed the basis functions as in (1.1), as well as denoted the expansion coefficients as

    ⟨x, φ_{2k}⟩ = ⟨x_n, g_{n−2k}⟩_n = α_k,    (1.3a)
    ⟨x, φ_{2k+1}⟩ = ⟨x_n, h_{n−2k}⟩_n = β_k.    (1.3b)
Then, recognize each sum in (1.2) as the output of upsampling followed by filtering
(3.203) with the input sequences being αk and βk , respectively. Thus, the first sum
in (1.2) can be implemented as the input sequence α going through an upsampler by
2 followed by filtering by g, and the second as the input sequence β going through
an upsampler by 2 followed by filtering by h.
By the same token, we can identify the computation of the expansion coefficients in (1.3) as (3.200), that is, both α and β sequences can be obtained using
filtering by g−n followed by downsampling by 2 (for αk ), or filtering by h−n followed
by downsampling by 2 (for βk ).
We can put together the above operations to yield a two-channel filter bank
implementing a Haar orthonormal basis expansion as in Figure 1.1(a). The left part
that computes the expansion coefficients is termed an analysis filter bank, while the
right part that computes the projections is termed a synthesis filter bank.
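To make the two-channel structure concrete, here is a minimal NumPy sketch (our own illustration, not code from the book or its Matlab companion) of the Haar analysis and synthesis just described: the analysis computes α_k = ⟨x_n, g_{n−2k}⟩_n and β_k = ⟨x_n, h_{n−2k}⟩_n, and the synthesis rebuilds x via (1.2). All function names are ours.

```python
import numpy as np

g = np.array([1.0, 1.0]) / np.sqrt(2.0)    # Haar lowpass template g_n, (1.1a)
h = np.array([1.0, -1.0]) / np.sqrt(2.0)   # Haar highpass template h_n, (1.1b)

def haar_analysis(x):
    """Expansion coefficients alpha_k, beta_k: filter by g_{-n}, h_{-n}, downsample by 2."""
    pairs = np.asarray(x, dtype=float).reshape(-1, 2)   # (x_{2k}, x_{2k+1}) for each k
    return pairs @ g, pairs @ h                         # alpha_k, beta_k

def haar_synthesis(alpha, beta):
    """Upsample by 2, filter by g and h, and add the two channels, as in (1.2)."""
    return (np.outer(alpha, g) + np.outer(beta, h)).reshape(-1)

x = np.random.randn(8)                     # any even-length input
alpha, beta = haar_analysis(x)
print(np.allclose(haar_synthesis(alpha, beta), x))   # True: perfect reconstruction
```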
[Figure 1.1: A two-channel analysis/synthesis filter bank. (a) Block diagram, where an analysis filter bank is followed by a synthesis filter bank. In the orthogonal case, the impulse responses of the analysis filters are time-reversed versions of the impulse responses of the synthesis filters. The filter g is typically lowpass, while the filter h is typically highpass. (b) Frequency responses of the two Haar filters computing averages and differences, showing the decomposition into low- and high-frequency content.]

As before, once we have identified all the appropriate multirate components, we can examine the Haar filter bank via matrix operations (linear operators). For
example, in matrix notation, the analysis process (1.3) can be expressed as

\[
\begin{bmatrix} \vdots \\ \alpha_0 \\ \beta_0 \\ \alpha_1 \\ \beta_1 \\ \alpha_2 \\ \beta_2 \\ \vdots \end{bmatrix}
=
\underbrace{\frac{1}{\sqrt{2}}
\begin{bmatrix}
\ddots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots &        \\
\cdots & 1      & 1      & 0      & 0      & 0      & 0      & \cdots \\
\cdots & 1      & -1     & 0      & 0      & 0      & 0      & \cdots \\
\cdots & 0      & 0      & 1      & 1      & 0      & 0      & \cdots \\
\cdots & 0      & 0      & 1      & -1     & 0      & 0      & \cdots \\
\cdots & 0      & 0      & 0      & 0      & 1      & 1      & \cdots \\
\cdots & 0      & 0      & 0      & 0      & 1      & -1     & \cdots \\
       & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{bmatrix}}_{\Phi^T}
\begin{bmatrix} \vdots \\ x_0 \\ x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ \vdots \end{bmatrix},
\tag{1.4}
\]
and the synthesis process (1.2) as

\[
\underbrace{\begin{bmatrix} \vdots \\ x_0 \\ x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ \vdots \end{bmatrix}}_{x}
=
\underbrace{\frac{1}{\sqrt{2}}
\begin{bmatrix}
\ddots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots &        \\
\cdots & 1      & 1      & 0      & 0      & 0      & 0      & \cdots \\
\cdots & 1      & -1     & 0      & 0      & 0      & 0      & \cdots \\
\cdots & 0      & 0      & 1      & 1      & 0      & 0      & \cdots \\
\cdots & 0      & 0      & 1      & -1     & 0      & 0      & \cdots \\
\cdots & 0      & 0      & 0      & 0      & 1      & 1      & \cdots \\
\cdots & 0      & 0      & 0      & 0      & 1      & -1     & \cdots \\
       & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{bmatrix}}_{\Phi}
\underbrace{\begin{bmatrix} \vdots \\ \alpha_0 \\ \beta_0 \\ \alpha_1 \\ \beta_1 \\ \alpha_2 \\ \beta_2 \\ \vdots \end{bmatrix}}_{X},
\tag{1.5}
\]

or

\[
x = \Phi\,\Phi^T x \quad\Longrightarrow\quad \Phi\,\Phi^T = I. \tag{1.6}
\]
Of course, the matrix Φ is the same matrix we have seen in (??). Moreover, from
(1.6), it is a unitary matrix, which we know from Chapter 2, Chapter at a Glance,
implies that the Haar basis is an orthonormal basis (as we have already shown in Chapter 7). Table 1.8 gives a summary of the Haar filter bank in various domains.
Implementing a General Orthonormal Basis Expansion
What we have seen for the Haar orthonormal basis is true in general; we can construct an orthonormal basis for ℓ2 (Z) using two template basis sequences and their
even shifts. As we have seen, we can implement such an orthonormal basis using a
two-channel filter bank, consisting of downsamplers, upsamplers and filters g and h.
Let g and h be two real-coefficient, causal filters,2 where we implicitly assume that
these filters have certain time and frequency localization properties, as discussed in
Chapter 7 (g is lowpass and h is highpass). The synthesis (1.5) generalizes to

\[
\begin{bmatrix} \vdots \\ x_0 \\ x_1 \\ x_2 \\ x_3 \\ x_4 \\ \vdots \end{bmatrix}
=
\begin{bmatrix}
\ddots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots &        \\
\cdots & g_0    & h_0    & 0      & 0      & 0      & 0      & \cdots \\
\cdots & g_1    & h_1    & 0      & 0      & 0      & 0      & \cdots \\
\cdots & g_2    & h_2    & g_0    & h_0    & 0      & 0      & \cdots \\
\cdots & g_3    & h_3    & g_1    & h_1    & 0      & 0      & \cdots \\
\cdots & g_4    & h_4    & g_2    & h_2    & g_0    & h_0    & \cdots \\
       & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{bmatrix}
\begin{bmatrix} \vdots \\ \alpha_0 \\ \beta_0 \\ \alpha_1 \\ \beta_1 \\ \alpha_2 \\ \vdots \end{bmatrix}
= \Phi X,
\tag{1.7}
\]
with the basis matrix Φ as before. To have an orthonormal basis, the basis sequences {φ_k}_{k∈Z} (even shifts of the template sequences g and h) must form an orthonormal set as in (2.89); that is, Φ must be unitary, implying that its columns are orthonormal:

    ⟨g_n, g_{n−2k}⟩_n = δ_k,   ⟨h_n, h_{n−2k}⟩_n = δ_k,   ⟨g_n, h_{n−2k}⟩_n = 0.    (1.8)
² While causality is not necessary to construct a filter bank, we impose it later and it improves readability here. We stress again that we deal exclusively with real-coefficient filters.
We have seen in (3.215) that such filters are called orthogonal; how to design them
is a central topic of this chapter.
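As a quick numerical illustration (our own sketch, not the book's code), the conditions in (1.8) can be checked directly in the time domain. Here we use the well-known length-4 Daubechies orthogonal lowpass filter and its companion highpass filter obtained by modulating and time-reversing g (the rule derived later in Theorem 1.1); how to design such filters is the topic of Section 1.3.

```python
import numpy as np

# Length-4 Daubechies orthogonal lowpass filter (normalized so that sum of g_n is sqrt(2))
g = np.array([1 + np.sqrt(3), 3 + np.sqrt(3), 3 - np.sqrt(3), 1 - np.sqrt(3)]) / (4 * np.sqrt(2))
L = len(g)
h = np.array([(-1) ** n * g[L - 1 - n] for n in range(L)])   # companion highpass filter

def inner_prod_shift(a, b, shift):
    """<a_n, b_{n-shift}>_n for finitely supported real filters a, b starting at n = 0."""
    n_min, n_max = min(0, shift), max(len(a), shift + len(b))
    aa = np.zeros(n_max - n_min); bb = np.zeros(n_max - n_min)
    aa[-n_min:len(a) - n_min] = a
    bb[shift - n_min:shift - n_min + len(b)] = b
    return float(aa @ bb)

for k in range(-2, 3):   # check (1.8) for a few even shifts 2k
    print(k, round(inner_prod_shift(g, g, 2 * k), 12),
             round(inner_prod_shift(h, h, 2 * k), 12),
             round(inner_prod_shift(g, h, 2 * k), 12))
# the k = 0 row reads 1 1 0; every other row reads 0 0 0
```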
As we are building an orthonormal basis, computing the expansion coefficients
of an input sequence means taking the inner product between that sequence and
each basis sequence. In terms of the orthonormal set given by the columns of Φ,
this amounts to a multiplication by Φ^T:

\[
X =
\begin{bmatrix} \vdots \\ \alpha_0 \\ \beta_0 \\ \alpha_1 \\ \beta_1 \\ \alpha_2 \\ \vdots \end{bmatrix}
=
\begin{bmatrix} \vdots \\ \langle x_n, g_n \rangle_n \\ \langle x_n, h_n \rangle_n \\ \langle x_n, g_{n-2} \rangle_n \\ \langle x_n, h_{n-2} \rangle_n \\ \langle x_n, g_{n-4} \rangle_n \\ \vdots \end{bmatrix}
=
\underbrace{\begin{bmatrix}
\ddots & \vdots & \vdots & \vdots & \vdots & \vdots &        \\
\cdots & g_0    & g_1    & g_2    & g_3    & g_4    & \cdots \\
\cdots & h_0    & h_1    & h_2    & h_3    & h_4    & \cdots \\
\cdots & 0      & 0      & g_0    & g_1    & g_2    & \cdots \\
\cdots & 0      & 0      & h_0    & h_1    & h_2    & \cdots \\
\cdots & 0      & 0      & 0      & 0      & g_0    & \cdots \\
       & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{bmatrix}}_{\Phi^T}
\underbrace{\begin{bmatrix} \vdots \\ x_0 \\ x_1 \\ x_2 \\ x_3 \\ x_4 \\ \vdots \end{bmatrix}}_{x}.
\tag{1.9}
\]
As in the Haar case, this can be implemented with convolutions by g−n and h−n ,
followed by downsampling by 2—an analysis filter bank as in Figure 1.1(a). In filter
bank terms, the representation of x in terms of a basis (or frame) is called perfect
reconstruction.
Thus, what we have built is as in Chapter 7—an orthonormal basis with structure (time and frequency localization properties) as well as efficient implementation
guaranteed by the filter bank. As in the Haar case, this structure is seen in the
subspaces V and W on which the orthonormal basis projects; we implicitly assume
that V is the space of coarse (lowpass) sequences and W is the space of detail
(highpass) sequences. Figure 1.3 illustrates that, where a synthetic sequence with
features at different scales is split into lowpass and highpass components. These
subspaces are spanned by the lowpass template g and its even shifts (V ) and the
highpass template h and its even shifts (W ) as in (??):
    V = span({φ_{0,n−2k}}_{k∈Z}) = span({g_{n−2k}}_{k∈Z}),    (1.10a)
    W = span({φ_{1,n−2k}}_{k∈Z}) = span({h_{n−2k}}_{k∈Z}),    (1.10b)
and produce the lowpass and highpass approximations, respectively:
    x_V = Σ_{k∈Z} α_k g_{n−2k},    (1.11a)
    x_W = Σ_{k∈Z} β_k h_{n−2k}.    (1.11b)
As the basis sequences spanning these spaces are orthogonal to each other and all
together form an orthonormal basis, the two projection subspaces together give back
the original space as in (??): ℓ2 (Z) = V ⊕ W .
[Figure 1.2: A sequence x is split into two approximation sequences x_V and x_W. An orthonormal filter bank ensures that x_V and x_W are orthogonal and sum up to the original sequence. We also show the split of ℓ²(Z) into two orthogonal complements V (lowpass subspace) and W (highpass subspace).]

[Figure 1.3: A sequence and its projections. (a) The sequence x with different-scale features (low-frequency sinusoid, high-frequency noise, piecewise polynomial, and a Kronecker delta sequence). (b) The lowpass projection x_V. (c) The highpass projection x_W.]

In this brief chapter preview, we introduced the two-channel filter bank as in Figure 1.1(a). It uses orthogonal filters satisfying (1.8) and computes an expansion
with respect to the set of basis vectors {gn−2k , hn−2k }k∈Z , yielding a decomposition into approximation spaces V and W having complementary signal processing
properties. Our task now is to find appropriate filters (template basis sequences)
and develop properties of the filter bank in detail. We start by considering the
lowpass filter g, since everything else will follow from there. We concentrate only
on real-coefficient FIR filters since they are dominant in practice.
Chapter Outline
We start by showing how orthonormal bases are implemented by orthogonal filter
banks in Section 1.2 and follow by discussing three approaches to the design of
orthogonal filter banks in Section 1.3. We then discuss the theory and design of
biorthogonal filter banks in Sections 1.4 and 1.5. In Section 1.6, we discuss stochastic
filter banks, followed by algorithms in Section 1.7.
Notation used in this chapter: In this chapter, we consider real-coefficient filter
banks exclusively; pointers to complex-coefficient ones are given in Further Reading.
Thus, Hermitian transposition will occur rarely; when filter coefficients are complex,
the transposition in some places should be Hermitian transposition; however, only the coefficients should be conjugated, not z. We will point these out throughout
the chapter.
1.2  Orthogonal Two-Channel Filter Banks
This section develops necessary conditions for the design of orthogonal two-channel
filter banks implementing orthonormal bases and the key properties of such filter
banks. We assume that the system shown in Figure 1.1(a) implements an orthonormal basis for sequences in ℓ2 (Z) using the basis sequences {gn−2k , hn−2k }k∈Z . We
first determine what this means for the lowpass and highpass channels separately,
and follow by combining the channels. We then develop a polyphase representation
for orthogonal filter banks and discuss their polynomial approximation properties.
1.2.1  A Single Channel and Its Properties
We now look at each channel of Figure 1.1 separately and determine its properties.
As the lowpass and highpass channels are essentially symmetric, our approach is
to establish (1) the properties inherent to each channel on its own; and (2) given
one channel, establish the properties the other has to satisfy so as to build an
orthonormal basis when combined. While we have seen most of the properties
already, we summarize them here for completeness.
Consider the lower branch of Figure 1.1(a), projecting the input x onto its lowpass approximation xV , depicted separately in Figure 1.4. In (1.11a), that lowpass
approximation xV was given as
    x_V = Σ_{k∈Z} α_k g_{n−2k}.    (1.12a)

[Figure 1.4: The lowpass branch of a two-channel filter bank, mapping x to x_V.]
Similarly, in (1.11b), the highpass approximation xW was given as
    x_W = Σ_{k∈Z} β_k h_{n−2k}.    (1.12b)
Orthogonality of the Lowpass Filter Since we started with an orthonormal basis,
the set {gn−2k }k∈Z is an orthonormal set. We have seen in Section 3.7.4 that such
a filter is termed orthogonal and satisfies (3.215):
    ⟨g_n, g_{n−2k}⟩ = δ_k
        Matrix View  ←→  D_2 G^T G U_2 = I
        ZT           ←→  G(z) G(z^{−1}) + G(−z) G(−z^{−1}) = 2
        DTFT         ←→  |G(e^{jω})|² + |G(e^{j(ω+π)})|² = 2        (1.13)
In the matrix view, we have used linear operators (infinite matrices) introduced in
Section 3.7. These are: (1) downsampling by 2, D2 , from (3.183a); (2) upsampling
by 2, U2 , from (3.189a) and (3) filtering by G, from (3.62). The matrix view
expresses the fact that the columns of GU2 form an orthonormal set.3 The DTFT
version is the quadrature mirror formula from (3.214).
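For instance, the quadrature mirror formula in (1.13) is easy to verify numerically by evaluating the DTFT on a frequency grid. The short check below is our own sketch; it uses the Haar lowpass filter, but any orthogonal lowpass filter passes the same test.

```python
import numpy as np

g = np.array([1.0, 1.0]) / np.sqrt(2.0)            # Haar lowpass filter
w = np.linspace(-np.pi, np.pi, 513)                # frequency grid

def dtft(f, w):
    """Evaluate F(e^{jw}) = sum_n f_n e^{-jwn} for a causal FIR filter f."""
    n = np.arange(len(f))
    return f @ np.exp(-1j * np.outer(n, w))

lhs = np.abs(dtft(g, w)) ** 2 + np.abs(dtft(g, w + np.pi)) ** 2
print(np.allclose(lhs, 2.0))                       # True: |G(e^{jw})|^2 + |G(e^{j(w+pi)})|^2 = 2
```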
Orthogonality of the Highpass Filter Similarly to {gn−2k }k∈Z , the set {hn−2k }k∈Z
is an orthonormal set, and the sequence h can be seen as the impulse response of
an orthogonal filter satisfying:
    ⟨h_n, h_{n−2k}⟩ = δ_k
        Matrix View  ←→  D_2 H^T H U_2 = I
        ZT           ←→  H(z) H(z^{−1}) + H(−z) H(−z^{−1}) = 2
        DTFT         ←→  |H(e^{jω})|² + |H(e^{j(ω+π)})|² = 2        (1.14)
The matrix view expresses the fact that the columns of HU2 form an orthonormal
set. Again, the DTFT version is the quadrature mirror formula from (3.214).
³ We remind the reader once more that we are considering exclusively real-coefficient filter banks, and thus transposition instead of Hermitian transposition in (1.13).

Lowpass Channel in a Two-Channel Orthogonal Filter Bank
Lowpass filter
    Original domain     g_n                                 ⟨g_n, g_{n−2k}⟩_n = δ_k
    Matrix domain       G                                   D_2 G^T G U_2 = I
    z-domain            G(z)                                G(z) G(z^{−1}) + G(−z) G(−z^{−1}) = 2
    DTFT domain         G(e^{jω})                           |G(e^{jω})|² + |G(e^{j(ω+π)})|² = 2  (quadrature mirror formula)
    Polyphase domain    G(z) = G_0(z²) + z^{−1} G_1(z²)     G_0(z) G_0(z^{−1}) + G_1(z) G_1(z^{−1}) = 1
Deterministic autocorrelation
    Original domain     a_n = ⟨g_k, g_{k+n}⟩_k              a_{2k} = δ_k
    Matrix domain       A = G^T G                           D_2 A U_2 = I
    z-domain            A(z) = G(z) G(z^{−1})               A(z) + A(−z) = 2,  A(z) = 1 + 2 Σ_{k=0}^{∞} a_{2k+1} (z^{2k+1} + z^{−(2k+1)})
    DTFT domain         A(e^{jω}) = |G(e^{jω})|²            A(e^{jω}) + A(e^{j(ω+π)}) = 2
Orthogonal projection onto smooth space V = span({g_{n−2k}}_{k∈Z})
    x_V = P_V x,   P_V = G U_2 D_2 G^T

Table 1.1: Properties of the lowpass channel in an orthogonal two-channel filter bank. Properties for the highpass channel are analogous.

Deterministic Autocorrelation of the Lowpass Filter As it is widely used in filter design, we rephrase (1.13) in terms of the deterministic autocorrelation of g, given by (3.96):

    ⟨g_n, g_{n−2k}⟩ = a_{2k} = δ_k
        Matrix View  ←→  D_2 A U_2 = I
        ZT           ←→  A(z) + A(−z) = 2
        DTFT         ←→  A(e^{jω}) + A(e^{j(ω+π)}) = 2        (1.15)
In the above, A = GT G is a symmetric matrix with element ak on the kth diagonal
left/right from the main diagonal. Thus, except for a0 , all the other even terms of
ak are 0, leading to
    A(z) =⁽ᵃ⁾ G(z) G(z^{−1}) = 1 + 2 Σ_{k=0}^{∞} a_{2k+1} (z^{2k+1} + z^{−(2k+1)}),    (1.16)
where (a) follows from (3.143).
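The even-lag condition a_{2k} = δ_k in (1.15)–(1.16) is also easy to check numerically. The sketch below (ours) computes the deterministic autocorrelation of the same length-4 Daubechies lowpass filter used in the earlier sketch and inspects its even-indexed samples.

```python
import numpy as np

g = np.array([1 + np.sqrt(3), 3 + np.sqrt(3), 3 - np.sqrt(3), 1 - np.sqrt(3)]) / (4 * np.sqrt(2))
a = np.correlate(g, g, mode='full')   # a_n = sum_k g_k g_{k+n}, lags n = -(L-1), ..., L-1
lag0 = len(g) - 1                     # index of lag n = 0
even = a[lag0::2]                     # a_0, a_2, a_4, ...
print(np.round(a, 12))                # only a_0 and odd-lag terms are nonzero
print(np.isclose(even[0], 1.0), np.allclose(even[1:], 0.0))   # a_0 = 1, a_{2k} = 0 for k != 0
```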
Deterministic Autocorrelation of the Highpass Filter Similarly to the lowpass
filter,
    ⟨h_n, h_{n−2k}⟩ = a_{2k} = δ_k
        Matrix View  ←→  D_2 A U_2 = I
        ZT           ←→  A(z) + A(−z) = 2
        DTFT         ←→  A(e^{jω}) + A(e^{j(ω+π)}) = 2        (1.17)
Equation (1.16) holds for this deterministic autocorrelation as well.
Orthogonal Projection Property of the Lowpass Channel We now look at the lowpass channel as a composition of the four linear operators we just saw:

    x_V = P_V x = G U_2 D_2 G^T x.    (1.18)

The notation is evocative of projection onto V, and we will now show that the lowpass channel accomplishes precisely this. Using (1.13), we check idempotency and self-adjointness of P_V (Definition 2.27):

    P_V² = (G U_2 D_2 G^T)(G U_2 D_2 G^T) =⁽ᵃ⁾ G U_2 D_2 G^T = P_V,
    P_V^T = (G U_2 D_2 G^T)^T = G (U_2 D_2)^T G^T =⁽ᵇ⁾ G U_2 D_2 G^T = P_V,

where (a) follows from (1.13) and (b) from (3.194). Indeed, P_V is an orthogonal projection operator, with the range given in (1.10a):

    V = span({g_{n−2k}}_{k∈Z}).    (1.19)
The summary of properties of the lowpass channel is given in Table 1.1.
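To see the projection property in action, the finite-dimensional sketch below (ours; it uses a circular extension of length N so that the operators become exact matrices) builds the lowpass and highpass channel operators, checks that P_V is idempotent and self-adjoint, and verifies that the two channels split x into orthogonal pieces that sum to x.

```python
import numpy as np

N = 16
g = np.array([1 + np.sqrt(3), 3 + np.sqrt(3), 3 - np.sqrt(3), 1 - np.sqrt(3)]) / (4 * np.sqrt(2))
h = np.array([(-1) ** n * g[len(g) - 1 - n] for n in range(len(g))])

def channel_matrix(f, N):
    """N x N/2 matrix whose columns are the circular even shifts f_{(n-2k) mod N} (i.e., F U_2)."""
    cols = []
    for k in range(N // 2):
        col = np.zeros(N)
        for n, fn in enumerate(f):
            col[(n + 2 * k) % N] += fn
        cols.append(col)
    return np.stack(cols, axis=1)

GU2, HU2 = channel_matrix(g, N), channel_matrix(h, N)
PV, PW = GU2 @ GU2.T, HU2 @ HU2.T          # circular versions of G U_2 D_2 G^T and H U_2 D_2 H^T

x = np.random.randn(N)
xV, xW = PV @ x, PW @ x
print(np.allclose(PV @ PV, PV), np.allclose(PV, PV.T))    # idempotent and self-adjoint
print(np.allclose(xV + xW, x), np.isclose(xV @ xW, 0.0))  # x = x_V + x_W and <x_V, x_W> = 0
```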
Orthogonal Projection Property of the Highpass Channel The highpass channel as a composition of four linear operators (infinite matrices) is:

    x_W = P_W x = H U_2 D_2 H^T x.    (1.20)

It is no surprise that P_W is an orthogonal projection operator with the range given in (1.10b):

    W = span({h_{n−2k}}_{k∈Z}).    (1.21)

The summary of properties of the highpass channel is given in Table 1.1 (the table is given for the lowpass channel; make the appropriate substitutions).
1.2.2  Complementary Channels and Their Properties
While we have discussed which properties each channel has to satisfy on its own,
we now discuss what they have to satisfy with respect to each other to build an
orthonormal basis. Intuitively, one channel has to keep what the other throws away;
in other words, that channel should project to a subspace orthogonal to the range
of the projection operator of the other. For example, given PV , PW should project
onto the leftover space between ℓ2 (Z) and PV ℓ2 (Z).
Thus, we start by assuming our filter bank in Figure 1.1(a) implements an
orthonormal basis, which means that the set of basis sequences {gn−2k , hn−2k }k∈Z
is an orthonormal set, compactly represented by (1.8). We have already used the
orthonormality of the set {gn−2k }k∈Z in (1.13) as well as the orthonormality of the
set {hn−2k }k∈Z in (1.14). What is left is that these two sets are orthogonal to each
other, the third equation in (1.8).
Orthogonality of the Lowpass and Highpass Filters Using similar methods as before, we summarize what the lowpass and highpass sequences must satisfy:

    ⟨g_n, h_{n−2k}⟩ = 0
        Matrix View  ←→  D_2 H^T G U_2 = 0
        ZT           ←→  G(z) H(z^{−1}) + G(−z) H(−z^{−1}) = 0
        DTFT         ←→  G(e^{jω}) H(e^{−jω}) + G(e^{j(ω+π)}) H(e^{−j(ω+π)}) = 0        (1.22)
Deterministic Crosscorrelation of the Lowpass and Highpass Filters Instead of
the deterministic autocorrelation properties of an orthogonal filter, we look at the
deterministic crosscorrelation properties of two filters orthogonal to each other:
    ⟨g_n, h_{n−2k}⟩ = c_{2k} = 0
        Matrix View  ←→  D_2 C U_2 = 0
        ZT           ←→  C(z) + C(−z) = 0
        DTFT         ←→  C(e^{jω}) + C(e^{j(ω+π)}) = 0        (1.23)
In the above, C = H T G is the deterministic crosscorrelation operator, and the
deterministic crosscorrelation is given by (3.99). In particular, all the even terms
of c are equal to zero.
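A brief numerical check (ours) of (1.23): the deterministic crosscorrelation of an orthogonal pair (g, h) vanishes at all even lags, here again for the length-4 Daubechies pair used above.

```python
import numpy as np

g = np.array([1 + np.sqrt(3), 3 + np.sqrt(3), 3 - np.sqrt(3), 1 - np.sqrt(3)]) / (4 * np.sqrt(2))
h = np.array([(-1) ** n * g[len(g) - 1 - n] for n in range(len(g))])
c = np.correlate(h, g, mode='full')     # c_n = sum_k g_k h_{k+n}, lags n = -(L-1), ..., L-1
lag0 = len(g) - 1                       # index of lag n = 0; all even lags share its parity
print(np.allclose(c[lag0 % 2::2], 0.0)) # True: c_{2k} = 0 for all k
```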
1.2.3  Orthogonal Two-Channel Filter Bank
We are now ready to put together everything we have developed so far. We have
shown that the sequences {gn−2k , hn−2k }k∈Z form an orthonormal set. What is left
to show is completeness: any sequence from ℓ2 (Z) can be represented using the
orthonormal basis built by our orthogonal two-channel filter bank. To do this, we
must be more specific, that is, we must have an explicit form of the filters involved.
In essence, we start with an educated guess (and it will turn out to be unique,
Theorem 1.2), inspired by what we have seen in the Haar case. We can also
strengthen our intuition by considering a two-channel filter bank with ideal filters
as in Figure 1.5. If we are given an orthogonal lowpass filter g, can we say anything
about an appropriate orthogonal highpass filter h such that the two together build
an orthonormal basis? A good approach would be to shift the spectrum of the lowpass filter by π, leading to the highpass filter. In time domain, this is equivalent to
multiplying gn by (−1)n . Because of the orthogonality of the lowpass and highpass
filters, we also reverse the impulse response of g. We will then need to shift the
filter to make it causal again. Based on this discussion, we now show how, given
an orthogonal filter g, it completely specifies an orthogonal two-channel filter bank
implementing an orthonormal basis for ℓ2 (Z):
Theorem 1.1 (Orthogonal two-channel filter bank) Given is an FIR
filter g of even length L = 2ℓ, ℓ ∈ Z+ , orthonormal to its even shifts as in
(1.13). Choose
    h_n = ±(−1)^n g_{−n+L−1}     ←ZT→     H(z) = ∓ z^{−L+1} G(−z^{−1}).    (1.24)
[Figure 1.5: Two-channel decomposition of a sequence using ideal filters. The left side depicts the process in the lowpass channel, while the right side depicts the process in the highpass channel. (a) Original spectrum. (b) Spectra after filtering. (c) Spectra after downsampling. (d) Spectra after upsampling. (e) Spectra after interpolation filtering. (f) Reconstructed spectrum.]
Then, {gn−2k , hn−2k }k∈Z is an orthonormal basis for ℓ2 (Z), implemented by an
orthogonal filter bank specified by analysis filters {g−n , h−n } and synthesis filters
{gn , hn }. The expansion splits ℓ2 (Z) as
ℓ2 (Z) = V ⊕ W,
with
V = span({gn−2k }k∈Z ),
W = span({hn−2k }k∈Z ).
(1.25)
Proof. To prove the theorem, we must prove that (i) {g_{n−2k}, h_{n−2k}}_{k∈Z} is an orthonormal set and (ii) it is complete. The sign ± in (1.24) just changes the phase; assuming G(1) = √2, if the sign is positive, H(−1) = √2, and if the sign is negative, H(−1) = −√2. Most of the time we will implicitly assume the sign to be positive; the proof of the theorem does not change in either case.

(i) To prove that {g_{n−2k}, h_{n−2k}}_{k∈Z} is an orthonormal set, we must prove (1.8). The first condition is satisfied by assumption. To prove the second, that is, h is orthogonal to its even shifts, we must prove one of the conditions in (1.14). The definition of h in (1.24) implies

    H(z) H(z^{−1}) = G(−z) G(−z^{−1}),    (1.26)

and thus,

    H(z) H(z^{−1}) + H(−z) H(−z^{−1}) =⁽ᵃ⁾ G(−z) G(−z^{−1}) + G(z) G(z^{−1}) = 2,

where (a) follows from (1.13).

To prove the third condition in (1.8), that is, h is orthogonal to g and all its even shifts, we must prove one of the conditions in (1.22):

    G(z) H(z^{−1}) + G(−z) H(−z^{−1}) =⁽ᵃ⁾ −z^{L−1} G(z) G(−z) + (−1)^L z^{L−1} G(−z) G(z)
                                      =⁽ᵇ⁾ −z^{L−1} G(z) G(−z) + z^{L−1} G(z) G(−z) = 0,

where (a) follows from (1.24); and (b) from L = 2ℓ being even.

(ii) To prove completeness, we prove that perfect reconstruction holds for any x ∈ ℓ²(Z) (an alternative would be to prove Parseval's equality ‖x‖² = ‖x_V‖² + ‖x_W‖²). What we do is find z-domain expressions for X_V(z) and X_W(z) and prove they sum up to X(z). We start with the lowpass branch. In the lowpass channel, the input X(z) is filtered by G(z^{−1}), and is then down- and upsampled, followed by filtering with G(z) (and similarly for the highpass channel). Thus, the z-transforms of x_V and x_W are:

    X_V(z) = ½ G(z) [G(z^{−1}) X(z) + G(−z^{−1}) X(−z)],    (1.27a)
    X_W(z) = ½ H(z) [H(z^{−1}) X(z) + H(−z^{−1}) X(−z)].    (1.27b)

The output of the filter bank is the sum of x_V and x_W:

    X_V(z) + X_W(z) = ½ [G(z) G(−z^{−1}) + H(z) H(−z^{−1})] X(−z) + ½ [G(z) G(z^{−1}) + H(z) H(z^{−1})] X(z)
                    = ½ S(z) X(−z) + ½ T(z) X(z).    (1.28)

Substituting (1.24) into the above equation, we get:

    S(z) = G(z) G(−z^{−1}) + H(z) H(−z^{−1})
         =⁽ᵃ⁾ G(z) G(−z^{−1}) + [−z^{−L+1} G(−z^{−1})] [−(−z^{−1})^{−L+1} G(z)]
         =⁽ᵇ⁾ [1 + (−1)^{−L+1}] G(z) G(−z^{−1}) = 0,    (1.29a)
    T(z) = G(z) G(z^{−1}) + H(z) H(z^{−1})
         =⁽ᶜ⁾ G(z) G(z^{−1}) + G(−z^{−1}) G(−z) =⁽ᵈ⁾ 2,    (1.29b)

where (a) follows from (1.24); (b) from L = 2ℓ being even; (c) from (1.26); and (d) from (1.13). Substituting this back into (1.28), we get

    X_V(z) + X_W(z) = X(z),    (1.30)

proving perfect reconstruction, or, in other words, the assertion in the theorem statement that the expansion can be implemented by an orthogonal filter bank.

To show (1.25), we write (1.30) in the original domain as in (1.11):

    x_n = Σ_{k∈Z} α_k g_{n−2k} + Σ_{k∈Z} β_k h_{n−2k} = x_{V,n} + x_{W,n},    (1.31)

showing that any sequence x ∈ ℓ²(Z) can be written as a sum of its projections onto the two subspaces V and W, and these subspaces add up to ℓ²(Z). V and W are orthogonal from (1.22), proving (1.25).
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
14
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 1. Filter Banks: Building Blocks of Time-Frequency Expansions
In the theorem, L is an even integer, which is a requirement for FIR filters of lengths
greater than 1 (see Exercise ??). Moreover, the choice (1.24) is unique; this will
be shown in Theorem 1.2. Table 1.9 summarizes various properties of orthogonal,
two-channel filter banks we covered until now.
Along with the time reversal and shift, the other qualitative feature of (1.24)
is modulation by ejnπ = (−1)n (mapping z → −z in the z domain, see (3.137)). As
we said, this makes h a highpass filter when g is a lowpass filter. As an example,
if we apply Theorem 1.1 to the Haar lowpass filter from (1.1a), we obtain the Haar
highpass filter from (1.1b).
In applications, filters are causal. To implement a filter bank with causal
filters, we make analysis filters causal (we already assumed the synthesis ones are)
by shifting them both by (−L + 1). Beware that such an implementation implies
perfect reconstruction within a shift (delay), and the orthonormal basis expansion
is not technically valid anymore. However, in applications this is often done, as the
output sequence is a perfect replica of the input one, within a shift: x̂n = xn−L+1 .
1.2.4
Polyphase View of Orthogonal Filter Banks
As we saw in Section 3.7, downsampling introduces periodic shift variance into the
system. To deal with this, we often analyze multirate systems in polyphase domain,
as discussed in Section 3.7.5. The net result is that the analysis of a single-input,
single-output, periodically shift-varying system is equivalent to the analysis of a
multiple-input, multiple-output, shift-invariant system.
Polyphase Representation of an Input Sequence For two-channel filter banks,
a polyphase decomposition of the input sequence is achieved by simply splitting it
into its even- and odd-indexed subsequences as in (3.216), the main idea being that
the sequence can be recovered from the two subsequences by upsampling, shifting
and summing up, as we have seen in Figure 3.29. This simple process is called a
polyphase transform (forward and inverse).
Polyphase Representation of a Synthesis Filter Bank To define the polyphase
decomposition of the synthesis filters, we use the expressions for upsampling followed
by filtering from (3.220):
X
ZT
g0,n = g2n ←→ G0 (z) =
g2n z −n ,
(1.32a)
n∈Z
g1,n = g2n+1
ZT
←→
G1 (z) =
X
g2n+1 z −n ,
(1.32b)
n∈Z
G(z) = G0 (z 2 ) + z −1 G1 (z 2 ),
(1.32c)
where we split each synthesis filter into its even and odd subsequence as we have
done for the input sequence x. Analogous relations hold for the highpass filter h.
We can now define a polyphase matrix Φp (z):
G0 (z) H0 (z)
Φp (z) =
.
(1.33)
G1 (z) H1 (z)
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
1.2. Orthogonal Two-Channel Filter Banks
15
As we will see in (1.37), such a matrix allows for a compact representation, analysis
and computing projections in the polyphase domain.
Polyphase Representation of an Analysis Filter Bank The matrix in (1.33) is on
the synthesis side; to get it on the analysis side, we can use the fact that this is an
orthogonal filter bank. Thus, we can write
e
G(z)
= G(z −1 ) = G0 (z −2 ) + zG1 (z −2 ).
In other words, the polyphase components of the analysis filter are, not surprisingly,
time-reversed versions of the polyphase components of the synthesis filter. We can
summarize this as (we could have obtained the same result using the expression for
polyphase representation of downsampling preceded by filtering (3.226)):
g0,n = ge2n = g−2n
e
g1,n = ge2n−1 = g−2n+1
e
ZT
←→
ZT
←→
e0 (z) =
G
e1 (z) =
G
X
g−2n z −n ,
(1.34a)
g−2n+1 z −n ,
(1.34b)
n∈Z
X
n∈Z
e
G(z)
= G0 (z −2 ) + zG1 (z −2 ), (1.34c)
with analogous relations for the highpass filter e
h, yielding the expression for the
analysis polyphase matrix
#
"
e0 (z) H
e 0 (z)
G
G0 (z −1 )
e
=
Φp (z) = e
e 1 (z)
G1 (z −1 )
G1 (z) H
H0 (z −1 )
= Φp (z −1 ).
H1 (z −1 )
(1.35)
A block diagram of the polyphase implementation of the system is given in Figure 1.6. The left part shows the reconstruction of the original sequence using the
synthesis polyphase matrix.4 The right part shows the computation of expansion
coefficient sequences α and β; note that as usual, the analysis matrix (polyphase in
this case) is taken as a transpose, as it operates on the input sequence. To check
that, compute these expansion coefficient sequences:
α(z)
X0 (z)
G0 (z −1 ) G1 (z −1 ) X0 (z)
= ΦTp (z −1 )
=
β(z)
X1 (z)
H0 (z −1 ) H1 (z −1 ) X1 (z)
G0 (z −1 )X0 (z) + G1 (z −1 )X1 (z)
=
.
H0 (z −1 )X0 (z) + H1 (z −1 )X1 (z)
(1.36)
We can obtain exactly the same expressions if we substitute (1.34) into the expression for downsampling by 2 preceded by filtering in (3.201a).
4 A comment is in order: we typically put the lowpass filter in the lower branch, but in matrices
it appears in the first row/column, leading to a slight inconsistency when the filter bank is depicted
in the polyphase domain.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
16
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 1. Filter Banks: Building Blocks of Time-Frequency Expansions
xn
αn
2
ΦTp (z −1 )
z
2
+
2
xn
Φp (z)
βn
2
z −1
Figure 1.6: Polyphase representation of a two-channel orthogonal filter bank.
Polyphase Representation of an Orthogonal Filter Bank The above polyphase
expressions allow us now to compactly represent an orthogonal two-channel filter
bank in the polyphase domain:
X0 (z 2 )
X(z) = 1 z −1 Φp (z 2 )ΦTp (z −2 )
.
(1.37)
X1 (z 2 )
From (1.24), we get that the polyphase components of H are
H0 (z) = ±z −L/2+1 G1 (z −1 ),
H1 (z) = ∓z −L/2+1 G0 (z −1 ),
leading to the polyphase matrix
G0 (z) ±z −L/2+1 G1 (z −1 )
Φp (z) =
.
G1 (z) ∓z −L/2+1 G0 (z −1 )
(1.38a)
(1.38b)
(1.39)
Since g is orthogonal to its even translates, substitute (1.32) into the z-domain
version of (1.13) to get the condition for orthogonality of a filter in polyphase form:
G0 (z)G0 (z −1 ) + G1 (z)G1 (z −1 ) = 1.
(1.40)
Using this, the determinant of Φp (z) becomes −z −L/2+1 . From (1.37), the polyphase
matrix Φp (z) satisfies the following:
Φp (z)ΦTp (z −1 ) = I,
(1.41)
a paraunitary matrix as in (3.303a). In fact, (1.39), together with (1.40), define the
most general 2 × 2, real-coefficient, causal FIR lossless matrix, a fact we summarize
in form of a theorem, the proof of which can be found in [101]:
Theorem 1.2 (General form of a paraunitary matrix) The most general 2 × 2, real-coefficient, causal FIR lossless matrix is given by (1.39), where
G0 and G1 satisfy (1.40) and L/2 − 1 is the degree of G0 (z), G1 (z).
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
1.2. Orthogonal Two-Channel Filter Banks
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
17
Example 1.1 (Haar filter bank in polyphase form) The Haar filters (1.1)
are extremely simple in polyphase form: Since they are both of length 2, their
polyphase components are of length 1. The polyphase matrix is simply
1 1
1
Φp (z) = √
.
(1.42)
2 1 −1
The form of the polyphase matrix for the Haar orthonormal basis is exactly the
same as the Haar orthonormal basis for R2 , or one block of the Haar orthonormal
basis infinite matrix Φ from (??). This is true only when a filter bank implements
the so-called block transform, that is, when the nonzero support of the basis
sequences is equal to the sampling factor, 2 in this case.
The polyphase notation and the associated matrices are powerful tools to
derive filter bank results. We now rephrase what it means for a filter bank to be
orthogonal—implement an orthonormal basis, in polyphase terms.
Theorem 1.3 (Paraunitary polyphase matrix and orthonormal basis)
A 2 × 2 polyphase matrix Φp (z) is paraunitary if and only if the associated
two-channel filter bank implements an orthonormal basis for ℓ2 (Z).
Proof. If the polyphase matrix is paraunitary, then the expansion it implements is
complete, due to (1.41). To prove that the expansion is an orthonormal basis, we must
show that the basis sequences form an orthonormal set. From (1.39) and (1.41), we get
(1.40). Substituting this into the z-domain version of (1.13), we see that it holds, and
thus g and its even shifts form an orthonormal set. Because h is given in terms of g as
(1.24), h and its even shifts form an orthonormal set as well. Finally, because of the
way h is defined, g and h are orthogonal by definition and so are their even shifts.
The argument in the other direction is similar; we start with an orthonormal basis
implemented by a two-channel filter bank. That means we have template sequences g
and h related via (1.24), and their even shifts, all together forming an orthonormal
basis. We can now translate those conditions into z-transform domain using (1.13) and
derive the corresponding polyphase-domain versions, such as the one in (1.40). These
lead to the polyphase matrix being paraunitary.
We have seen in Chapter 3 that we can characterize vector sequences using deterministic autocorrelation matrices (see Table 3.13). We use this now to
describe the deterministic autocorrelation of a vector sequence of expansion coeffi
T
cients αn βn , as
Aα (z) Cα,β (z)
α(z) α(z −1 ) α(z) β(z −1 )
Ap,α (z) =
=
Cβ,α (z) Aβ (z)
β(z) α(z −1 ) β(z) β(z −1 )
α(z) α(z −1 ) β(z −1 )
=
β(z)
(a)
X0 (z) X1 (z −1 ) X0 (z −1 ) Φp (z)
= ΦTp (z −1 )
X1 (z)
= ΦTp (z −1 )Ap,x (z)Φp (z),
α3.2 [January 2013] [free version] CC by-nc-nd
(1.43)
Comments to [email protected]
Fourier and Wavelet Signal Processing
18
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 1. Filter Banks: Building Blocks of Time-Frequency Expansions
where (a) follows from (1.36), and Ap,x is the deterministic autocorrelation matrix
of the vector of polyphase components of x. This deterministic autocorrelation
matrix can be seen as a filtered deterministic autocorrelation of the input. We now
have the following result:
Theorem 1.4 (Filtered deterministic autocorrelation matrix) Given
is a 2 × 2 paraunitary polyphase matrix Φp (ejω ). Then the filtered deterministic
autocorrelation matrix, Ap,α (ejω ), is positive semidefinite.
Proof. Since Φp (z) is paraunitary, Φp (ejω ) is unitary on the unit circle. This further
means that:
cos θ sin θ ΦTp (e−jω ) = cos φ sin φ ,
(1.44)
for some φ. We can now write:
cos θ
cos θ (a) sin θ Ap,α (ejω )
= cos θ
sin θ
(b)
=
cos φ
cos θ
sin θ ΦTp (e−jω )Ap,x (ejω )Φp (ejω )
sin θ
(c)
cos φ
sin φ Ap,x (ejω )
≥ 0,
sin φ
where (a) follows from (1.43); (b) from (1.44); and (c) from TBD, proving the theorem.
1.2.5
Polynomial Approximation by Filter Banks
An important class of orthogonal filter banks are those that have polynomial approximation properties; these filter banks will approximate polynomials of a certain
degree5 in the lowpass (coarse) branch, while, at the same time, blocking those
same polynomials in the highpass (detail) branch. To derive these filter banks, we
recall what we have learned in Section 3.B.1: Convolution of a polynomial sequence
x with a differencing filter (δn − δn−1 ), or, multiplication of X(z) by (1 − z −1 ),
reduces the degree of P
the polynomial by 1. In general, to block a polynomial of
N −1
degree (N − 1), xn = k=0
ak nk , we need a filter of the form:
(1 − z −1 )N R′ (z).
(1.45)
Let us now apply what we just learned to two-channel orthogonal filter banks with
polynomial sequences as inputs. We will construct the analysis filter in the highpass
branch to have N zeros at z = 1, thus blocking polynomials of degree up to (N − 1).
Of course, since the filter bank is perfect reconstruction, whatever disappeared in the
highpass branch must be preserved in the lowpass one; thus, the lowpass branch will
reconstruct polynomials of degree (N − 1). In other words, xV will be a polynomial
approximation of the input sequence a certain degree.
5 We restrict our attention to finitely-supported polynomial sequences, ignoring the boundary
issues. If this were not the case, these sequences would not belong to any ℓp space.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
1.2. Orthogonal Two-Channel Filter Banks
19
To construct such a filter bank, we start with the analysis highpass filter e
h
which must be of the form (1.45); we write it as:
(a)
(b)
e
H(z)
= (1 − z −1 )N ∓z L−1 R(−z) = ∓z L−1 (1 − z −1 )N R(−z) = ∓z L−1 G(−z),
|
{z
}
R′ (z)
where in (a) we have chosen R′ (z) to lead to a simple form of G(z) in what follows;
and (b) follows from Table 1.9, allowing us to directly read the synthesis lowpass as
G(z) = (1 + z −1 )N R(z).
(1.46)
If we maintain the convention that g is causal and of length L, then R(z) is a polynomial in z −1 of degree (L − 1 − N ). Of course, R(z) has to be chosen appropriately,
so as to obtain an orthogonal filter bank.
Putting at least one zero at z = −1 in G(z) makes a lot of signal processing
sense. After all, z = −1 corresponds to ω = π, the maximum discrete frequency;
it is thus natural for a lowpass filter to have a zero at z = −1 and block that
highest frequency. Putting more than one zero at z = −1 has further approximation
advantages, as the Theorem 1.5 specifies, and as we will see in wavelet constructions
in later chapters.
Theorem 1.5 (Polynomial reproduction) Given is an orthogonal filter
bank in which the synthesis lowpass filter G(z) has N zeros at z = −1. Then
polynomial sequences up to degree (N − 1) and of finite support are reproduced
in the lowpass approximation subspace spanned by {gn−2k }k∈Z .
Proof. By assumption, the synthesis filter G(z) is given by (1.46). From Table 1.9,
the analysis highpass filter is of the form ∓z L−1 G(−z), which means it has a factor
(1 − z −1 )N , that is, it has N zeros at z = 1. From our discussion, this factor annihilates
a polynomial input of degree (N − 1), resulting in β = 0 and xW = 0. Because of the
perfect reconstruction property, x = xV , showing that the polynomial sequences are
reproduced by a linear combination of {gn−2k }k∈Z , as in (1.11a).
Polynomial reproduction by the lowpass channel and polynomial cancellation
in the highpass channel are basic features in wavelet approximations. In particular,
the cancellation of polynomials of degree (N − 1) is also called the zero-moment
property of the filter (see (3.140a)):
mk =
X
n∈Z
nk hn = 0,
k = 0, 1, . . . , N − 1,
(1.47)
that is, kth-order moments of h up to (N − 1) are zero (see Exercise ??).
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
20
1.3
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 1. Filter Banks: Building Blocks of Time-Frequency Expansions
Design of Orthogonal Two-Channel Filter Banks
To design a two-channel orthogonal filter bank, it suffices to design one orthogonal
filter—the lowpass synthesis g with the z-transform G(z) satisfying (1.13); we have
seen how the other three filters follow (Table 1.9). The design is based on (1)
finding a deterministic autocorrelation function satisfying (1.15) (it is symmetric,
positive semi-definite and has a single nonzero even-indexed coefficient; and (2)
factoring that deterministic autocorrelation A(z) = G(z)G(z −1 ) into its spectral
factors (many factorizations are possible, see Section 3.5).6
We consider three different designs. The first tries to approach an ideal halfband lowpass filter, the second aims at polynomial approximation, while the third
uses lattice factorization in polyphase domain.
1.3.1
Lowpass Approximation Design
Assume we wish to get our lowpass synthesis filter G(ejω ) to be as close as possible to an ideal lowpass halfband filter as in TBD. Since according to (3.96)
the deterministic
autocorrelation of g can be expressed in the DTFT domain as
jω
jω 2
A(e ) = G(e ) , this deterministic autocorrelation is an ideal lowpass halfband
function as well:
2, if |ω| < π/2;
jω
A(e ) =
(1.48)
0, otherwise.
From Table 3.5, the deterministic autocorrelation sequence is
Z π/2
1
an =
2 ejnω dω = sinc(nπ/2),
2π −π/2
(1.49)
a valid deterministic autocorrelation; it has a single nonzero even-indexed coefficient (a0 = 1) and is positive semi-definite. To get a realizable function, we apply
a symmetric window function w that decays to zero. The new deterministic autocorrelation a′ is the pointwise product
a′n = an wn .
(1.50)
Clearly, a′ is symmetric and still has a single nonzero even-indexed coefficient.
However, this is not enough for a′ to be a deterministic autocorrelation. We can
see this in frequency domain,
(a)
A′ (ejω ) =
1
A(ejω ) ∗ W (ejω ),
2π
(1.51)
where we used the convolution in frequency property (3.94). In general, (1.51)
is not nonnegative for all frequencies anymore, and thus not a valid deterministic
autocorrelation. One easy way to enforce nonnegativity is to choose W (ejω ) itself
positive, for example as the deterministic autocorrelation of another window w′ , or
W (ejω ) = |W ′ (ejω )|2 .
6 Recall that we only consider real-coefficient filters, thus a is symmetric and not Hermitian
symmetric.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
1.3. Design of Orthogonal Two-Channel Filter Banks
21
If w′ is of norm 1, then w0 = 1, and from (1.50), a′0 = 1 as well. Therefore, since
A(ejω ) is real and positive, A′ (ejω ) will be as well. The resulting sequence a′ and
its z-transform A′ (z) can then be used in spectral factorization (see Section 3.5.3)
to obtain an orthogonal filter g.
Example 1.2 (Lowpass approximation design of orthogonal filters)
We design a length-4 filter by the lowpass approximation method. Its deterministic autocorrelation is of length 7 with the target impulse response obtained by
evaluating (1.49):
h
iT
2
2
.
a =
. . . 0 − 3π
0 π2
1 π2 0 − 3π
0 ...
For the window w, we take it to be the deterministic autocorrelation of the
sequence wn′ , which is specified by wn′ = 1/2 for 0 ≤ n ≤ 3, and wn′ = 0
otherwise:
h
iT
w =
.
. . . 0 0 14 21 34 1 34 12 41 0 0 . . .
Using (1.50), we obtain the new deterministic autocorrelation of the lowpass filter
as
h
iT
1
3
3
1
a′ =
.
. . . 0 − 6π
0 2π
1 2π
0 − 6π
0 ...
Factoring this deterministic autocorrelation (requires numerical polynomial root
finding) gives
h
iT
g ≈ . . . 0 0.832 0.549 0.0421 −0.0637 0 . . . .
The impulse response and frequency response of g are shown in Figure 1.7.
The method presented is very simple, and does not lead to the best designs.
For better designs, one uses standard filter design procedures followed by adjustments to ensure positivity. For example, consider (1.51) again, and define
min
A′ (ejω ) = ε.
ω∈[−π,π]
If ε ≥ 0, we are done, otherwise, we simply choose a new function
A′′ (ejω ) = A′ (ejω ) − ε,
which is now nonnegative, allowing us to perform spectral factorization. Filters
designed using this method are tabulated in [100].
1.3.2
Polynomial Approximation Design
Recall that a lowpass filter G(z) with N zeros at z = −1 as in (1.46) reproduces
polynomials up to degree (N − 1). Thus, the goal of this design procedure is to find
a deterministic autocorrelation A(z) of the form
A(z) = G(z)G(z −1 ) = (1 + z −1 )N (1 + z)N Q(z),
α3.2 [January 2013] [free version] CC by-nc-nd
(1.52)
Comments to [email protected]
Fourier and Wavelet Signal Processing
22
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 1. Filter Banks: Building Blocks of Time-Frequency Expansions
jω
|G(e )|
g
n
1.4
0.9
0.8
1.2
0.7
1
0.6
0.5
0.8
0.4
0.6
0.3
0.2
0.4
0.1
0.2
0
−0.1
−2
0
2
4
0
0
π/2
π
Figure 1.7: Orthogonal filter design based on lowpass approximation in Example 1.2.
(a) Impulse response. (b) Frequency response.
with Q(z) chosen such that (1.15) is satisfied, that is,
A(z) + A(−z) = 2,
(1.53)
Q(z) = Q(z −1 ) (qn symmetric in time domain), and Q(z) is nonnegative on the
unit circle. Satisfying these conditions allows one to find a spectral factor of A(z)
with N zeros at z = −1, and this spectral factor is the desired orthogonal filter.
We illustrate this procedure through an example.
Example 1.3 (Polynomial approximation design of orthogonal filters)
We will design a filter g such that it reproduces linear polynomials, that is, N = 2:
A(z) = (1 + z −1 )2 (1 + z)2 Q(z) = (z −2 + 4z −1 + 6 + 4z + z 2 ) Q(z).
Can we now find Q(z) so as to satisfy (1.53), in particular, a minimum-degree
solution? We try with (remember qn is symmetric)
Q(z) = az + b + az −1
and compute A(z) as
A(z) = a(z 3 + z −3 ) + (4a + b)(z 2 + z −2 ) + (7a + 4b)(z + z −1 ) + (8a + 6b).
To satisfy (1.53), A(z) must have a single nonzero even-indexed coefficient. We
thus need to solve the following pair of equations:
4a + b = 0,
8a + 6b = 1,
yielding a = −1/16 and b = 1/4. Thus, our candidate factor is
1 −1
1
1
− z +1− z .
Q(z) =
4
4
4
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
1.3. Design of Orthogonal Two-Channel Filter Banks
|G(ejω)|
g
1.2
23
n
1.5
1
0.8
1
0.6
0.4
0.5
0.2
0
−0.2
0
π/2
(a)
(b)
(c)
(d)
π
Figure 1.8: Orthogonal filter design based on polynomial approximation in Example 1.3.
(a) Impulse response. (b) Frequency response. (c) Linear x is preserved in V . (d) Only
the linear portion of the quadratic x is preserved in V ; the rest shows in W .
It remains to check whether Q(ejω ) is nonnegative:
1
1 jω
1
jω
−jω
Q(e ) =
1 − (e + e
) =
1−
4
4
4
1
2
cos ω
> 0
since | cos(ω)| ≤ 1. So Q(z) is a valid deterministic autocorrelation and can be
written as Q(z) = R(z)R(z −1). Extracting its causal spectral factor
R(z) =
√
√
1
√ (1 + 3 + (1 − 3)z −1 ),
4 2
the causal orthogonal lowpass filter with 2 zeros at z = −1 becomes
G(z) = (1 + z −1 )2 R(z)
i
√
√
√
√
1 h
= √ (1 + 3) + (3 + 3)z −1 + (3 − 3)z −2 + (1 − 3)z −3 .
4 2
This filter is one of the filters from the Daubechies family of orthogonal filters.
Its impulse and frequency responses are shown in Figure 1.8. The rest of the
filters in the filter bank can be found from Table 1.9.
In the example, we saw that solving a linear system followed by spectral factorization were the key steps. In general, for G(z) with N zeros at z = −1, the
minimum-degree R(z) to obtain an orthogonal filter is of degree (N − 1), corresponding to N unknown coefficients. Q(z) = R(z)R(z −1 ) is obtained by solving an
N × N linear system (to satisfy A(z) + A(z) = 2), followed by spectral factorization
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
24
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 1. Filter Banks: Building Blocks of Time-Frequency Expansions
Step
Operation
1.
2.
Choose N , the number of zeros at z = −1
G(z) = (1 + z −1 )N R(z), where R(z) is causal with powers
(0, −1, . . . , −N + 1)
A(z) = (1 + z −1 )N (1 + z)N Q(z), where Q(z) is symmetric and
has powers (−(N − 1), . . . , 0, . . . , (N + 1))
A(z) + A(−z) = 2. This leads to N linear constraints on
the coefficients of Q(z)
Solve the N × N linear system for the coefficients of Q(z)
Take the spectral factor of Q(z) = R(z)R(z −1 )
(for example, the minimum-phase factor, see Section 3.5)
The minimum phase orthogonal filter is G(z) = (1 + z −1 )N R(z)
3.
4.
5.
6.
7.
Table 1.2: Design of orthogonal lowpass filters with maximum number of zeros at z = −1.
g0
g1
g2
g3
g4
g5
g6
g7
g8
g9
g10
g11
L=4
L=6
L=8
L = 10
L = 12
0.482962913
0.836516304
0.224143868
-0.129409522
0.332670553
0.806891509
0.459877502
-0.135011020
-0.085441274
0.035226292
0.230377813309
0.714846570553
0.630880767930
-0.027983769417
-0.187034811719
0.030841381836
0.032883011667
-0.010597401785
0.160102398
0.603829270
0.724308528
0.138428146
-0.242294887
-0.032244870
0.077571494
-0.006241490
-0.012580752
0.003335725
0.111540743350
0.494623890398
0.751133908021
0.315250351709
-0.226264693965
-0.129766867567
0.097501605587
0.027522865530
-0.031582039318
0.000553842201
0.004777257511
-0.001077301085
Table 1.3: Orthogonal filters with maximum number of zeros at z = −1 (from [29]). For
a lowpass filter of even length L = 2ℓ, there are L/2 zeros at z = −1.
to produce the desired result. (It can be verified that Q(ejω ) ≥ 0.) These steps are
summarized in Table 1.2, while Table 1.3 gives filter-design examples.
Note that A(z) has the following form when evaluated on the unit circle:
A(ejω ) = 2N (1 + cos ω)N Q(ejω ),
with Q(ejω ) real and positive. Since A(ejω ) and its (2N − 1) derivatives are zero at
ω = π, |G(ejω )| and its (N − 1) derivatives are zero at ω = π. Moreover, because
of the quadrature mirror formula (3.214), |G(ejω )| and its (N − 1) derivatives are
zero at ω = 0 as well. These facts are the topic of Exercise ??.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
1.3. Design of Orthogonal Two-Channel Filter Banks
1.3.3
25
Lattice Factorization Design
When discussing the polyphase view of filter banks in Section 1.2.4, we saw that
orthogonality of a two-channel filter bank is connected to its polyphase matrix being
paraunitary. The following elegant factorization result is used in the design of that
paraunitary matrix:
Theorem 1.6 The polyphase matrix of any real-coefficient, causal, FIR orthogonal two-channel filter bank can be written as
Φp (z) = U
K−1
Y 1
0
0
z −1
k=1
Rk ,
(1.54)
where U is a general unitary matrix as in (2.227) (either a rotation as in (2.229a)
or a rotoinversion (2.229b)), and Rk , k = 1, 2, . . . , K − 1, are rotation matrices
as in (2.229a).
The resulting filters are of even length 2K (see Exercise ??). That the above
structure produces and orthogonal filter bank is clear as the corresponding polyphase matrix Φp (z) is paraunitary. Proving that any orthogonal filter bank can be
written in the form of (1.54) is a bit more involved. It is based on the result that for
two, real-coefficient polynomials PK−1 and QK−1 of degree (K − 1), with pK−1 (0)
pK−1 (K − 1) 6= 0 (and PK−1 , QK−1 are power complementary as in (3.214)), there
exists another pair PK−2 , QK−2 such that
PK−1 (z)
cos θ
=
QK−1 (z)
sin θ
− sin θ
cos θ
PK−2 (z)
.
z −1 QK−2 (z)
(1.55)
Repeatedly applying the above result to (1.39) one obtains the lattice factorization
given in (1.6). The details of the proof are given in [100].
Using the factored form, designing an orthogonal filter bank amounts to choosing U and a set of angles (θ0 , θ1 , . . . , θK−1 ). For example, the Haar filter bank in
lattice form amounts to keeping only the constant-matrix term, U , as in (1.42), a rotoinversion. The factored form also suggests a structure, called a lattice, convenient
for hardware implementations (see Figure 1.9).
How do we impose particular properties, such as zeros at ω = π, or, z = −1,
for the lowpass filter G(z)? Write the following set of equations:
(a)
(b) √
G(z)|z=1 = (G0 (z 2 ) + z −1 G1 (z 2 ))z=1 = G0 (1) + G1 (1) =
2, (1.56a)
(c)
(d)
G(z)|z=−1 = (G0 (z 2 ) + z −1 G1 (z 2 ))z=−1 = G0 (1) − G1 (1) = 0, (1.56b)
where (a) and (c) follow from (1.32); (b) from (1.13) and the requirement that G(z)
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
26
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 1. Filter Banks: Building Blocks of Time-Frequency Expansions
RK−2
RK−1
U
•••
z−1
z−1
•••
z−1
Figure 1.9: Two-channel lattice factorization of paraunitary filter banks. The 2×2 blocks
Rk are rotation matrices, and U is a general unitary matrix (rotation or rotoinversion).
The inputs are the polyphase components of the sequence x, and the output are the lowpass
and highpass channels.
be 0 at z = −1, and similarly for (d). We can rewrite these compactly as:
√ 1
1 G0 (1)
2
=
.
1 −1 G0 (1)
0
| {z }
(1.57)
G
The vector G above is just the first column of Φp (z 2 )z=−1 , which, in turn, is either
a product of (1) K rotations by (θ0 , θ1 , . . . , θK−1 ), or, (2) one rotoinversion by θ0
and K − 1 rotations (θ1 , . . . , θK−1 ). The solution to the above is:
K−1
X
θk = 2nπ +
k=0
θ0 −
K−1
X
k=1
π
,
4
U is a rotation,
(1.58a)
FIGURE 3.5
θk
π
= 2nπ + ,
4
figA.1.0
U is a rotoinversion,
(1.58b)
for some n ∈ Z. Imposing higher-order zeros at z = −1, as required for higher-order
polynomial reproduction, leads to more complicated algebraic constraints. As an
example, choosing θ0 = π/3 and θ1 = −π/12 leads to a double zero at z = −1, and
is thus the lattice version of the filter designed in Example 1.3 (see Exercise ??). In
general, design problems in lattice factored form are nonlinear and thus nontrivial.
1.4
Biorthogonal Two-Channel Filter Banks
While orthogonal filter banks have many attractive features, one eludes them: when
restricted to real-coefficient, FIR filters, solutions that are both orthonormal and
linear phase do not exist except for Haar filters. This is one of the key motivations
for looking beyond the orthogonal case, as well as for the popularity of biorthogonal filter banks, especially in image processing. Similarly to the orthogonal case,
we want to find out how to implement biorthogonal bases using filter banks, in
particular, those having certain time and frequency localization properties. From
Definition 2.43, we know that a system {ϕk , ϕ
ek } constitutes a pair of biorthogonal bases of the Hilbert space ℓ2 (Z), if (1) they satisfy biorthogonality constraints
(2.108):
e T = ΦΦ
e T = I,
hϕk , ϕ
ei i = δk−i
↔
ΦΦ
(1.59)
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
1.4. Biorthogonal Two-Channel Filter Banks
2
h̃n
β
27
hn
2
xW
x
x
+
g̃n
2
α
gn
2
xV
Figure 1.10: A biorthogonal two-channel analysis/synthesis filter bank. The output is
the sum of the lowpass approximation xV and its highpass counterpart xW .
e is an infinite matrix
where Φ is an infinite matrix having ϕk as its columns, while Φ
having ϕ
ek as its columns; and (2) it is complete:
X
X
ek ϕ
e X,
e
x =
Xk ϕk = ΦX =
X
ek = Φ
(1.60)
k∈Z
k∈Z
for all x ∈ ℓ2 (Z), where
Xk = hϕ
ek , xi
↔
e T x,
X = Φ
and
ek = hϕk , xi
X
↔
e = ΦT x.
X
It is not a stretch now to imagine that, similarly to the orthogonal case, we are
looking for two template basis sequences—a lowpass/highpass pair g and h, and a
dual pair e
g and e
h so that the biorthogonality constraints (1.59) are satisfied. Under
the right circumstances described in this section, such a filter bank will compute a
biorthogonal expansion. Assume that indeed, we are computing such an expansion.
Start from the reconstructed output as in Figure 1.10:
X
X
x = xV + xW =
αk gn−2k +
βk hn−2k ,
k∈Z
k∈Z
or












|
..
.
x0
x1
x2
x3
x4
..
.
{z
x


..
.



. . .



. . .


 = . . .



. . .



. . .


..
.
}
|
..
.
g0
g1
g2
g3
g4
..
.
..
.
h0
h1
h2
h3
h4
..
.
..
.
0
0
g0
g1
g2
..
.
{z
Φ
..
.
0
0
h0
h1
h2
..
.
..
.
0
0
0
0
g0
..
.
..
.
0
0
0
0
h0
..
.

.


. . .


. . .


. . .


. . .


. . .

..
.
}|
..
..
.
α0
β0
α1
β1
α2
..
.
{z
X






 = Φ X,





(1.61)
}
exactly the same as (1.7). As in (1.7), gn−2k and hn−2k are the impulse responses
of the synthesis filters g and h shifted by 2k, and αk and βk are the outputs of the
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
28
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 1. Filter Banks: Building Blocks of Time-Frequency Expansions
analysis filter bank downsampled by 2. The basis sequences are the columns of
Φ = {ϕk }k∈Z = {ϕ2k , ϕ2k+1 }k∈Z = {gn−2k , hn−2k }k∈Z ,
(1.62)
that is, the even-indexed basis sequences are the impulse responses of the synthesis
lowpass filter and its even shifts, while the odd-indexed basis sequences are the
impulse responses of the synthesis highpass filter and its even shifts.
So far, the analysis has been identical to that of orthogonal filter banks; we
repeated it here for emphasis. Since we are implementing a biorthogonal expansion,
the transform coefficients αk and βk are inner products between the dual basis
sequences and the input sequence: αk = hx, ϕ
e2k i, βk = hx, ϕ
e2k+1 i. From (3.60a),
αk = hx, ϕ
e2k i = hxn , ge2k−n in = gen−2k ∗k x
βk = hx, ϕ
e2k+1 i = hxn , e
h2k−n in = e
hn−2k ∗k x
↔
↔
e Tg x,
α = Φ
e T x,
β = Φ
h
that is, we can implement the computation of the expansion coefficients αk and βk
using convolutions, exactly as in the orthogonal case. We finally get
e T x.
X = Φ
From above, we see that the dual basis sequences are
e = {ϕ
Φ
ek }k∈Z = {ϕ
e2k , ϕ
e2k+1 }k∈Z = {e
g2k−n , e
h2k−n }k∈Z ,
(1.63)
that is, the even-indexed dual basis sequences are the shift-reversed impulse responses of the analysis lowpass filter and its even shifts, while the odd-indexed
basis sequences are the shift-reversed impulse responses of the analysis highpass
filter and its even shifts.
We stress again that the basis sequences of Φ are synthesis filters’ impulse
e are the shift-reversed
responses and their even shifts, while the basis sequences of Φ
analysis filters’ impulse responses and their even shifts. This shift reversal comes
from the fact that we are implementing our inner product using a convolution. Note
e are completely interchangeable.
also that Φ and Φ
As opposed to the three orthonormality relations (1.8), here we have four
biorthogonality relations, visualized in Figure 1.11:
hgn , ge2k−n in = δk ,
hhn , e
h2k−n in = δk ,
hhn , ge2k−n in = 0,
hgn , e
h2k−n in = 0.
(1.64a)
(1.64b)
(1.64c)
(1.64d)
The purpose of this section is to explore the family of impulse responses {g, h}
and their duals {e
g, e
h} so as to satisfy the biorthogonality constraints. This family is
much larger than the orthonormal family, and will contain symmetric/antisymmetric
solutions, on which we will focus.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
1.4. Biorthogonal Two-Channel Filter Banks
f
W
e
h
29
W
h
g
g
e
V
Ve
Figure 1.11: In a biorthogonal basis, e
g is orthogonal to h, and e
h is orthogonal to g.
e
Then, e
g and h are normalized so that the inner products with their duals are 1.
1.4.1
A Single Channel and Its Properties
As we have done for the orthogonal case, we first discuss channels in isolation and
determine what they need to satisfy. Figure 1.12 shows the biorthogonal lowpass
channel, projecting the input x onto its lowpass approximation xV . That lowpass
approximation xV can be expressed identically to (1.12a):
xV =
X
αk gn−2k .
(1.65a)
k∈Z
The highpass channel follows the lowpass exactly, substituting h for g, e
h for ge, and
xW for xV (see Figure 1.12). The highpass approximation xW is
xW =
X
βk hn−2k .
(1.65b)
k∈Z
Biorthogonality of the Lowpass Filters Since we started with a pair of biorthogonal bases, {gn−2k , e
g2k−n }k∈Z satisfy biorthogonality relations (1.64a). Similarly to
the orthogonal case, these can be expressed in various domains as:
Matrix View
e GU2 = I
D2 G
e
e
hgn , e
g2k−n in = δk
←→
G(z)G(z)
+ G(−z)G(−z)
= 2
DTFT
jω e jω
j(ω+π) e j(ω+π)
←→
G(e )G(e ) + G(e
)G(e
) = 2
(1.66)
In the matrix view, we have used linear operators (infinite matrices) as we did for
the orthogonal case; it expresses the fact that the columns of GU2 are orthogonal
e The z-transform expression is often the defining equation of a
to the rows of D2 G.
e
biorthogonal filter bank, where G(z) and G(z)
are not causal in general.
←→
ZT
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
30
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 1. Filter Banks: Building Blocks of Time-Frequency Expansions
Lowpass Channel in a Two-Channel Biorthogonal Filter Bank
Lowpass filters
Original domain
Matrix domain
z domain
DTFT domain
Polyphase domain
gn , e
gn
e
G, G
e
G(z), G(z)
e jω )
G(ejω ), G(e
G(z) = G0 (z 2 ) + z −1 G1 (z 2 )
e
e0 (z 2 ) + z G
e 1 (z 2 )
G(z)
=G
Deterministic crosscorrelation
Original domain
cn = hgk , e
gk+n ik
e
Matrix domain
C = GG
e −1 )
z domain
C(z) = G(z)G(z
e jω )
DTFT domain
C(ejω ) = G(ejω )G(e
hgn , e
g2k−n in = δk
e GU2 = I
D2 G
e
e
G(z)G(z)
+ G(−z)G(−z)
=2
e jω ) + G(ej(ω+π) )G(e
e j(ω+π) ) = 2
G(ejω )G(e
e
e
G0 (z)G0 (z) + G1 (z)G1 (z) = 1
D2 CU2 = I
C(z) + C(−z) = 2
C(ejω ) + C(ej(ω+π) ) = 2
Oblique projection onto smooth space V = span({gn−2k }k∈Z )
e
xV = PV x
PV = GU2 D2 G
Table 1.4: Properties of the lowpass channel in a biorthogonal two-channel filter bank.
e
Properties for the highpass channel are analogous. With gen = g−n , or, G(z)
= G(z −1 )
in the z-transform domain, the relations in this table reduce to those in Table 1.1 for the
orthogonal two-channel filter bank.
Biorthogonality of the Highpass Filters
Matrix View
hhn , e
h2k−n in = δk
←→
ZT
←→
DTFT
←→
e HU2 = I
D2 H
e
e
H(z)H(z)
+ H(−z)H(−z)
= 2
jω e jω
j(ω+π) e j(ω+π)
H(e )H(e ) + H(e
)H(e
) = 2
(1.67)
Deterministic Crosscorrelation of the Lowpass Filters In the orthogonal case, we
rephrased relations as in (1.66) in terms of the deterministic autocorrelation of g;
here, as we have two sequences g and e
g, we express it in terms of the deterministic
crosscorrelation of g and ge, (3.99):
Matrix View
hgn , ge2k−n in = c2k = δk
←→
ZT
←→
DTFT
←→
D2 CU2 = I
C(z) + C(−z) = 2
(1.68)
C(ejω ) + C(ej(ω+π) ) = 2
e is a Toeplitz matrix with element c±k on the kth diagonal
In the above, C = GG
left/right from the main diagonal (see (2.237)). While this deterministic crosscorrelation will be used for design as in the orthogonal case, unlike in the orthogonal
case: (1) C(z) does not have to be symmetric; (2) C(ejω ) does not have to be positive; and (3) any factorization of C(z) leads to a valid solution, that is, the roots
e
of C(z) can be arbitrarily assigned to G(z)
and G(z).
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
1.4. Biorthogonal Two-Channel Filter Banks
ℓ2 (Z)
x
g̃n
2
Figure 1.12:
α
31
2
V
xV
gn
The biorthogonal lowpass channel.
Deterministic Crosscorrelation of the Highpass Filters
Matrix View
hhn , e
h2k−n in = c2k = δk
←→
ZT
←→
DTFT
←→
D2 CU2 = I
C(z) + C(−z) = 2
(1.69)
C(ejω ) + C(ej(ω+π) ) = 2
Projection Property of the Lowpass Channel We now look at the lowpass channel
as a composition of linear operators:
e x.
xV = PV x = GU2 D2 G
(1.70)
While PV is a projection, it is not an orthogonal projection:
e = PV ,
e (GU2 D2 G)
e = GU2 D2 G
PV2 = (GU2 D2 G)
|
{z
}
I
e T = G
eT (U2 D2 )T GT = G
eT U2 D2 GT 6= PV .
PVT = (GU2 D2 G)
Indeed, PV is a projection operator (it is idempotent), but it is not orthogonal (it
is not self-adjoint). Its range is as in the orthogonal case:
V = span({gn−2k }k∈Z ).
(1.71)
Note the interchangeable roles of ge and g. When g is used in the synthesis, then xV
lives in the above span, while if e
g is used, it lives in the span of {e
gn−2k }k∈Z . The
summary of properties of the lowpass channel is given in Table 1.4.
Projection Property of the Highpass Channel The highpass projection operator
PW is:
e x;
xW = PW x = HU2 D2 H
(1.72)
again a projection operator (it is idempotent), but not orthogonal (it is not selfadjoint) the same way as for PV . Its range is:
W = span({hn−2k }k∈Z ).
α3.2 [January 2013] [free version] CC by-nc-nd
(1.73)
Comments to [email protected]
Fourier and Wavelet Signal Processing
32
1.4.2
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 1. Filter Banks: Building Blocks of Time-Frequency Expansions
Complementary Channels and Their Properties
Following the path set during the analysis of orthogonal filter banks, we now discuss
what the two channels have to satisfy with respect to each other to build a biorthogonal filter bank. Given a pair of filters g and ge satisfying (1.66), how can we choose
h and e
h to complete the biorthogonal filter bank and thus implement a biorthogonal basis expansion? The sets of basis and dual basis sequences {gn−2k , hn−2k }k∈Z
and {e
g2k−n , e
h2k−n }k∈Z must satisfy (1.64). We have already used (1.64a) in (1.66)
and similarly for the highpass sequences in (1.67). What is left to use is that these
lowpass and highpass sequences are orthogonal to each other as in (1.64c)–(1.64d):
Orthogonality of the Lowpass and Highpass Filters
Matrix View
←→
ZT
hhn , e
g2k−n in = 0
←→
DTFT
←→
and similarly for g and e
h:
Matrix View
hgn , e
h2k−n in = 0
1.4.3
←→
ZT
←→
DTFT
←→
e HU2 = 0
D2 G
e
e
H(z)G(z)
+ H(−z)G(−z)
= 0
jω e jω
j(ω+π) e j(ω+π)
H(e )G(e ) + H(e
)G(e
) = 0
(1.74a)
e GU2 = 0
D2 H
e
e
G(z)H(z)
+ G(−z)H(−z)
= 0
jω e jω
j(ω+π) e j(ω+π)
G(e )H(e ) + G(e
)H(e
) = 0
(1.74b)
Biorthogonal Two-Channel Filter Bank
We now pull together what we have developed for biorthogonal filter banks. The
following result gives one possible example of a biorthogonal filter bank, inspired by
the orthogonal case. We choose the highpass synthesis filter as a modulated version
of the lowpass, together with an odd shift. However, because of biorthogonality, it
is the analysis lowpass that comes into play.
Theorem 1.7 (Biorthogonal two-channel filter bank) Given are two
FIR filters g and ge of even length L = 2ℓ, ℓ ∈ Z+ , orthogonal to each other
and their even shifts as in (1.66). Choose
hn = (−1)n e
gn−2ℓ+1
e
hn = (−1)n gn+2ℓ−1
ZT
←→
ZT
←→
e
H(z) = −z −L+1 G(−z)
e
H(z)
= −z L−1 G(−z)
(1.75a)
(1.75b)
Then, sets {gn−2k , hn−2k }k∈Z and {e
g2k−n , e
h2k−n }k∈Z are a pair of biorthogonal
2
bases for ℓ (Z), implemented by a biorthogonal filter bank specified by analysis
filters {e
g, e
h} and synthesis filters {g, h}.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
1.4. Biorthogonal Two-Channel Filter Banks
33
Proof. To prove the theorem, we must prove that (i) {gn−2k , hn−2k }k∈Z and {e
g2k−n ,
e
h2k−n }k∈Z are biorthogonal sets and (ii) they are complete.
(i) To prove that {gn−2k , hn−2k }k∈Z and {e
g2k−n , e
h2k−n }k∈Z are biorthogonal sets,
we must prove (1.64). The first condition, (1.64a), is satisfied by assumption. To
prove the second, (1.64b), that is, h is orthogonal to e
h and its even shifts, we
must prove one of the conditions in (1.67). The definitions of h and e
h in (1.75)
imply
e
e
H(z)H(z)
= G(−z)G(−z)
(1.76)
and thus,
(a)
e
e
e
e
H(z)H(z)
+ H(−z)H(−z)
= G(−z)G(−z)
+ G(z)G(z)
= 2,
where (a) follows from (1.66).
To prove (1.64c)–(1.64d), we must prove one of the conditions in (1.74a)–
(1.74b), respectively. We prove (1.64c), (1.64d) follows similarly.
(a)
e
e
e
e
e G(−z)
e
H(z)G(z)
+ H(−z)G(−z)
= −z L−1 G(−z)
G(z)
− (−1)−L+1 z L−1 G(z)
(b)
e
e G(−z)
e
= −z L−1 G(−z)G(z)
+ z L−1 G(z)
= 0,
where (a) follows from (1.75a); and (b) L = 2ℓ even.
(ii) To prove completeness, we prove that perfect reconstruction holds for any x ∈
ℓ2 (Z). What we do is find z-domain expressions for XV (z) and XW (z) and prove
they sum up to X(z). We start with the lowpass branch. The proof proceeds as
in the orthogonal case.
h
i
e
e
XV (z) = 12 G(z) G(z)X(z)
+ G(−z)X(−z)
,
(1.77a)
h
i
e
e
+ H(−z)X(−z)
.
(1.77b)
XW (z) = 12 H(z) H(z)X(z)
The output of the filter bank is the sum of xV and xW :
h
i
e
e
XV (z) + XW (z) = 21 G(z)G(−z)
+ H(z)H(−z)
X(−z)
{z
}
|
S(z)
+
1
2
h
i
e
e
G(z)G(z)
+ H(z)H(z)
X(z).
|
{z
}
(1.78)
T (z)
Substituting (1.75) into the above equation, we get:
e
e
S(z) = G(z)G(−z)
+ H(z)H(−z)
h
ih
i
(a)
e
e
= G(z)G(z)
+ −z −L+1 G(−z)
−(−z)L−1 G(z)
h
i
(b)
e
= 1 + (−1)L+1 G(z)G(−z)
= 0,
e
e
T (z) = G(z)G(z)
+ H(z)H(z)
(c)
(d)
e
e
= G(z)G(z)
+ G(−z)G(−z)
= 2,
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
34
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 1. Filter Banks: Building Blocks of Time-Frequency Expansions
where (a) follows from (1.75); (b) from L = 2ℓ even; (c) from (1.75); and (d) from
(1.66). Substituting this back into (1.78), we get
XV (z) + XW (z) = X(z),
(1.79)
proving perfect reconstruction, or, in other words, the assertion in the theorem
statement that the expansion can be implemented by a biorthogonal filter bank.
Note that we could have also expressed our design problem based on the synthesis
(analysis) filters only.
Unlike the orthogonal case, the approximation spaces V and W are not orthogonal
anymore, and therefore, there exist dual spaces Ṽ and W̃ spanned by e
g−n and e
h−n
and their even shifts. However, V is orthogonal to W̃ and W is orthogonal to
Ṽ . This was schematically shown in Figure 1.11. Table 1.10 summarizes various
properties of biorthogonal, two-channel filter banks we covered until now.
1.4.4
Polyphase View of Biorthogonal Filter Banks
We have already seen how polyphase analysis of orthogonal filter banks adds to the
analysis toolbox. We now give a brief account of important polyphase notions when
dealing with biorthogonal filter banks. First, recall from (1.33) that the polyphase
matrix of the synthesis bank is given by7
G0 (z) H0 (z)
G(z) = G0 (z) + z −1 G1 (z),
Φp (z) =
,
(1.80a)
G1 (z) H1 (z)
H(z) = H0 (z) + z −1 H1 (z).
By the same token, the polyphase matrix of the analysis bank is given by
#
"
e0 (z) H
e 0 (z)
e
e0 (z) + z G
e1 (z),
G
G(z)
= G
e p (z) =
,
(1.80b)
Φ
e1 (z) H
e 1 (z)
e
e 0 (z) + z H
e 1 (z).
G
H(z)
= H
Remember that the different polyphase decompositions of the analysis and synthesis
filters are a matter of a carefully chosen convention.
For a biorthogonal filter bank to implement a biorthogonal expansion, the
following must be satisfied:
e Tp (z) = I.
Φp (z) Φ
(1.81)
From this, [NOTE: Requires scrutiny. Possibly needs transposition.]
1
H1 (z) −H0 (z)
e p (z) = (ΦT (z))−1 =
Φ
.
p
G0 (z)
det Φp (z) −G1 (z)
(1.82)
Since all the matrix entries are FIR, for the analysis to be FIR as well, det Φp (z)
must be a monomial, that is:
det Φp (z) = G0 (z)H1 (z) − G1 (z)H0 (z) = z −k .
(1.83)
7 When we say polyphase matrix, we will mean the polyphase matrix of the synthesis bank; for
the analysis bank, we will explicitly state analysis polyphase matrix.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
1.4. Biorthogonal Two-Channel Filter Banks
35
In the above, we have implicitly assumed that Φp (z) was invertible, that is, its
columns are linearly independent. This can be rephrased in filter bank terms by
stating when, given G(z), it is possible to find H(z) such that it leads to a perfect reconstruction biorthogonal filter bank. Such a filter H(z) will be called a
complementary filter.
Theorem 1.8 (Complementary filters) Given a causal FIR filter G(z),
there exists a complementary FIR filter H(z), if and only if the polyphase components of G(z) are coprime (except for possible zeros at z = ∞).
Proof. We just saw that a necessary and sufficient condition for perfect FIR reconstruction is that det(Φp (z)) be a monomial. Thus, coprimeness is obviously necessary,
since if there were a common factor between G0 (z) and G1 (z), it would show up in the
determinant.
Sufficiency follows from the Bézout’s identity (3.290) that says that given two
coprime polynomials a(z) and b(z), the equation a(z)p(z) + b(z)q(z) = c(z) has a
solution p(z), q(z). Fixing a(z) = G0 (z), b(z) = G1 (z) and c(z) = z −k , we see that
Bézout’s identity is equal to (1.83), and thus guarantees a solution p(z) = H0 (z) and
q(z) = H1 (z), that is, a complementary filter H(z).
Note that the coprimeness of G0 (z) and G1 (z) is equivalent to G(z) not having any
zero pairs {z0 , −z0 }. This can be used to prove that the binomial filter G(z) =
(1 + z −1 )N always has a complementary filter (see Exercise ??).
The counterpart to Theorem 1.3 and Corollary 1.4 for orthogonal filter banks
are the following theorem and corollary for the biorthogonal ones (we state these
without proof):
Theorem 1.9 (Positive definite matrix and biorthogonal basis) Given
a filter bank implementing a biorthogonal basis for ℓ2 (Z) and its associated
polyphase matrix Φp (ejω ), then Φp (ejω )ΦTp (e−jω ) is positive definite.
Corollary 1.10 (Filtered deterministic autocorrelation matrix is positive semidefinite)
Given is a 2 × 2 polyphase matrix Φp (ejω ) such that Φp (ejω )ΦTp (e−jω ). Then the
filtered deterministic autocorrelation matrix, Ap,α (ejω ), is positive semidefinite.
1.4.5
Linear-Phase Two-Channel Filter Banks
We started this section by saying that one of the reasons we go through the trouble
of analyzing and constructing two-channel biorthogonal filter banks is because they
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
36
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 1. Filter Banks: Building Blocks of Time-Frequency Expansions
allow us to obtain real-coefficient FIR filters with linear phase.8 Thus, we now do
just that: we build perfect reconstruction filter banks where all the filters involved
are linear phase. Linear-phase filters were defined in (3.107).
As was true for orthogonal filters, not all lengths of filters are possible if we
want to have a linear-phase filter bank. This is summarized in the following theorem,
the proof of which is left as Exercise ??:
Theorem 1.11 In a two-channel, perfect reconstruction filter bank where all filters are linear phase, the synthesis filters have one of the following forms:
(i) Both filters are odd-length symmetric, the lengths differing by an odd multiple of 2.
(ii) One filter is symmetric and the other is antisymmetric; both lengths are
even, and are equal or differ by an even multiple of 2.
(iii) One filter is of odd length, the other one of even length; both have all zeros
on the unit circle. Either both filters are symmetric, or one is symmetric
and the other one is antisymmetric.
Our next task is to show that indeed, it is not possible to have an orthogonal
filter bank with linear-phase filters if we restrict ourselves to the two-channel, FIR,
real-coefficient case:
Theorem 1.12 The only two-channel perfect reconstruction orthogonal filter
bank with real-coefficient FIR linear-phase filters is the Haar filter bank.
Proof. In orthogonal filter banks, (1.40)–(1.41) hold, and the filters are of even length.
Therefore, following Theorem 1.11, one filter is symmetric and the other antisymmetric.
Take the symmetric one, G(z) for example,
(a)
G(z) = G0 (z 2 ) + z −1 G1 (z 2 )
(b)
(c)
= z −L+1 G(z −1 ) = z −L+1 (G0 (z −2 ) + zG1 (z −2 ))
= z −L+2 G1 (z −2 ) + z −1 (z −L+2 G0 (z −2 )),
where (a) and (c) follow from (1.32), and (b) from (3.153). This further means that
for the polyphase components, the following hold:
G0 (z) = z −L/2+1 G1 (z −1 ),
G1 (z) = z −L/2+1 G0 (z −1 ).
(1.84)
Substituting (1.84) into (1.40) we obtain
G0 (z) G0 (z −1 ) =
1
.
2
8 If we allow filters to have complex-valued coefficients or if we lift the restriction of two channels,
linear phase and orthogonality can be satisfied simultaneously.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
1.5. Design of Biorthogonal Two-Channel Filter Banks
37
The only FIR, real-coefficient polynomial satisfying the above is
G0 (z) =
−m
√1 z
.
2
√
Performing a similar analysis for G1 (z), we get that G1 (z) = (1/ 2)z −k , and
G(z) =
−2ℓ
√1 (z
2
+ z −2k−1 ),
H(z) = G(−z),
yielding Haar filters (m = k = 0) or trivial variations thereof.
While the outstanding features of the Haar filters make it a very special solution,
Theorem 1.12 is a fundamentally negative result as the Haar filters have poor frequency localization and no polynomial reproduction capability.
1.5
Design of Biorthogonal Two-Channel Filter
Banks
Given that biorthogonal filters are less constrained than their orthogonal counterparts, the design space is much more open. In both cases, one factors a Laurent
polynomial9 C(z) satisfying C(z) + C(−z) = 2 as in (1.68). In the orthogonal case,
C(z) was a deterministic autocorrelation, while in the biorthogonal case, it is a deterministic crosscorrelation and thus more general. In addition, the orthogonal case
requires spectral factorization (square root), while in the biorthogonal case, any factorization will do. While the factorization method is not the only approach, it is the
most common. Other approaches include the complementary filter design method
and the lifting design method. In the former, a desired filter is complemented so
as to obtain a perfect reconstruction filter bank. In the latter, a structure akin
to a lattice is used to guarantee perfect reconstruction as well as other desirable
properties.
1.5.1
Factorization Design
From (1.66)–(1.68), C(z) satisfying C(z) + C(−z) = 2 can be factored into
e
C(z) = G(z)G(z),
e
where G(z) is the synthesis and G(z)
the analysis lowpass filter (or vice-versa, since
the roles are dual). The most common designs use the same C(z) as those used
in orthogonal filter banks, for example, those with a maximum number of zeros at
z = −1, performing the factorization so that the resulting filters have linear phase.
Example 1.4 (Biorthogonal filter bank with linear-phase filters) We
reconsider Example 1.3, in particular C(z) given by
2
1
1
1
C(z) = 1 + z −1 (1 + z)2
− z −1 + 1 − z ,
4
4
4
9 A Laurent polynomial is a polynomial with both positive and negative powers, see Appendix 3.B.1.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
38
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 1. Filter Banks: Building Blocks of Time-Frequency Expansions
which satisfies C(z) + C(−z) = 2 by construction. This also means it satisfies
e
(1.66) for any factorization of C(z) into G(z)G(z).
Note that we can add factors
−1
z or z
in one filter, as long as we cancel it in the other; this is useful for
obtaining purely causal/anticausal solutions.
One possible factorization is
2
3
G(z) = z −1 1 + z −1 (1 + z) = 1 + z −1
= 1 + 3z −1 + 3z −2 + z −3 ,
1
1
1
1
e
G(z)
= z(1 + z)
− z −1 + 1 − z =
−1 + 3z + 3z 2 − z 3 .
4
4
4
16
The other filters follow from (1.75), with L = 2ℓ = 4:
H(z) = −z −3
e
= −z 3
H(z)
1
1
−1 − 3z + 3z 2 + z 3 =
−1 − 3z −1 + 3z −2 + z −3 ,
16
16
1 − 3z −1 + 3z −2 − z −3 = 1 − 3z + 3z 2 − z 3 .
The lowpass filters are both symmetric, while the highpass ones are antisymmete
ric. As H(z)
has three zero moments, G(z) can reproduce polynomials up to
degree 2, since such sequences go through the lowpass channel only.
Another possible factorization is
$$G(z) = (1 + z^{-1})^2(1 + z)^2 = z^{-2} + 4z^{-1} + 6 + 4z + z^2,$$
$$\widetilde{G}(z) = \tfrac{1}{4}\left(-\tfrac{1}{4}z^{-1} + 1 - \tfrac{1}{4}z\right) = \tfrac{1}{16}\left(-z^{-1} + 4 - z\right),$$
where both lowpass filters are symmetric and zero phase. The highpass filters are (with L = 0):
$$H(z) = -\tfrac{1}{16}\,z\left(z^{-1} + 4 + z\right),$$
$$\widetilde{H}(z) = -z^{-1}\left(z^{-2} - 4z^{-1} + 6 - 4z + z^2\right),$$
which are also symmetric, but with a phase delay of ±1 sample.
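As a quick numerical check of these factorizations, the following minimal Python/NumPy sketch (illustrative only; the variable names are ours) convolves the second lowpass pair and confirms the half-band property C(z) + C(−z) = 2:

import numpy as np

# Lowpass pair from the second factorization above (coefficient lists only)
g  = np.array([1., 4., 6., 4., 1.])          # G(z) = z^2 + 4z + 6 + 4z^-1 + z^-2
gt = np.array([-1., 4., -1.]) / 16.          # G~(z) = (1/16)(-z + 4 - z^-1)

c = np.convolve(g, gt)                        # C(z) = G(z) G~(z)
center = (len(c) - 1) // 2
# Half-band check: even-indexed terms about the center vanish, center term equals 1
assert abs(c[center] - 1.0) < 1e-12
assert np.allclose(c[center::2][1:], 0) and np.allclose(c[center::-2][1:], 0)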
The zeros at z = −1 in the synthesis lowpass filter become, following (1.75b),
zeros at z = 1 in the analysis highpass filter. Therefore, many popular biorthogonal
filters come from symmetric factorizations of C(z) with a maximum number of zeros
at z = −1.
Example 1.5 (Design of the 9/7 filter pair) The next higher-degree C(z) with a maximum number of zeros at z = −1 is of the form
$$C(z) = 2^{-8}(1 + z)^3(1 + z^{-1})^3\left(3z^2 - 18z + 38 - 18z^{-1} + 3z^{-2}\right).$$
One possible factorization yields the so-called Daubechies 9/7 filter pair (see
Table 1.5). These filters have odd length and even symmetry, and are part of the
JPEG 2000 image compression standard.
        Daubechies 9/7                                       LeGall 5/3
 n      g̃_n                     g_n                          g̃_n      g_n
 0       0.60294901823635790     1.11508705245699400          3/4      1
 ±1      0.26686411844287230     0.59127176311424700          1/4      1/2
 ±2     -0.07822326652898785    -0.05754352622849957         -1/8
 ±3     -0.01686411844287495    -0.09127176311424948
 ±4      0.02674875741080976

Table 1.5: Biorthogonal filters used in the still-image compression standard JPEG 2000. The lowpass filters are given; the highpass filters can be derived using (1.75a)–(1.75b). The first pair is from [5] and the second from [57].
1.5.2 Complementary Filter Design
Assume we have a desired synthesis lowpass filter G(z). How can we find $\widetilde{G}(z)$ such that we obtain a perfect reconstruction biorthogonal filter bank? It suffices to find $\widetilde{G}(z)$ so that (1.66) is satisfied, which, according to Theorem 1.8, can always be done if G(z) has coprime polyphase components. Then $\widetilde{G}(z)$ can be found by solving a linear system of equations.
Example 1.6 (Complementary filter design) Suppose
$$G(z) = \tfrac{1}{2}z + 1 + \tfrac{1}{2}z^{-1} = \tfrac{1}{2}(1 + z)(1 + z^{-1}).$$
We would like to find $\widetilde{G}(z)$ such that $C(z) = G(z)\widetilde{G}(z)$ satisfies C(z) + C(−z) = 2. (It is easy to verify that the polyphase components of G(z) are coprime, so such a $\widetilde{G}(z)$ should exist. We exclude the trivial solution $\widetilde{G}(z) = 1$; it is of no interest as it has no frequency selectivity.) For a length-5 symmetric filter $\widetilde{G}(z) = cz^2 + bz + a + bz^{-1} + cz^{-2}$, we get the following system of equations:
$$a + b = 1 \qquad\text{and}\qquad \tfrac{1}{2}b + c = 0.$$
To get a unique solution, we could, for example, impose that the filter have a zero at z = −1,
$$a - 2b + 2c = 0,$$
leading to a = 6/8, b = 2/8, and c = −1/8:
$$\widetilde{G}(z) = \tfrac{1}{8}\left(-z^2 + 2z + 6 + 2z^{-1} - z^{-2}\right).$$
All coefficients of (g, g̃) are integer multiples of 1/8, making the analysis and
synthesis exactly invertible even with finite-precision (binary) arithmetic. These
filters are used in the JPEG 2000 image compression standard; see Table 1.5.
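The 3-by-3 linear system above is easily solved numerically. A minimal Python/NumPy sketch (illustrative only) recovers a = 6/8, b = 2/8, c = −1/8 and double-checks the half-band condition:

import numpy as np

# Unknowns (a, b, c) of the length-5 symmetric filter G~(z) = c z^2 + b z + a + b z^-1 + c z^-2
A = np.array([[1.0, 1.0, 0.0],     # a + b = 1        (half-band, coefficient of z^0)
              [0.0, 0.5, 1.0],     # b/2 + c = 0      (half-band, coefficient of z^2)
              [1.0, -2.0, 2.0]])   # a - 2b + 2c = 0  (zero at z = -1)
a, b, c = np.linalg.solve(A, np.array([1.0, 0.0, 0.0]))
print(a, b, c)                     # 0.75 0.25 -0.125

g  = np.array([0.5, 1.0, 0.5])             # G(z) = (1/2)(1+z)(1+z^-1)
gt = np.array([c, b, a, b, c])             # the LeGall 5/3 analysis lowpass
ct = np.convolve(g, gt)
center = (len(ct) - 1) // 2
assert abs(ct[center] - 1.0) < 1e-12       # C(z) + C(-z) = 2
assert np.allclose(ct[center + 2::2], 0) and np.allclose(ct[center - 2::-2], 0)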
As can be seen from this example, the solution for the complementary filter is highly
nonunique. Not only are there solutions of different lengths (in the case above, any
length 3 + 4m, m ∈ N, is possible), but even a given length has multiple solutions. It can be shown that this variety is given by the solutions of a Diophantine equation related to the polyphase components of the filter G(z).

Figure 1.13: The lifting filter bank, with P and U predict and update operators, respectively.
1.5.3 Lifting Design
We conclude this section with the design procedure based on lifting. While the original idea behind lifting was to build shift-varying perfect reconstruction filter banks,
it has also become popular as it allows for building discrete-time bases with nonlinear operations. The simplest filter bank from which to start lifting is the polyphase transform, which splits the sequence into even- and odd-indexed components as in Figure 1.13.
In the first lifting step, we use a prediction filter P to predict the odd coefficients
from the even ones. The even coefficients remain unchanged, while the result of
the prediction filter applied to the even coefficients is subtracted from the odd coefficients yielding the highpass coefficients. In the second step, we use an update
filter U to update the even coefficients based on the previously computed highpass
coefficients. We start with a simple example.
Example 1.7 (Haar filter bank obtained by lifting) The two polyphase
components of x are x0 (even subsequence) and x1 (odd subsequence) as in
(3.216). The purpose of the prediction operator P is to predict odd coefficients
based on the even ones. The simplest prediction says that the odd coefficients are
exactly the same as the even ones, that is pn = δn . The output of the highpass
branch is thus the difference (δn − δn−1 ), a reasonable outcome. The purpose of
the update operator U is to then update the even coefficients based on the newly
computed odd ones. As we are looking for a lowpass-like version in the other
branch, the easiest is to subtract half of this difference from the even sequence,
leading to x_{0,n} − (x_{0,n} − x_{1,n})/2, that is, the average (x_{0,n} + x_{1,n})/2, again a
reasonable output, but this time lowpass in nature. Within scaling, it is thus
clear that the choice pn = δn , un = (1/2)δn leads to the Haar filter bank.
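A minimal numerical sketch of this lifting step (Python/NumPy; the sign convention below is one of several equivalent choices and need not match Figure 1.13 exactly):

import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(16)
x0, x1 = x[0::2], x[1::2]             # even and odd polyphase components

# Forward lifting: predict the odd samples from the even ones, then update the even ones
d = x1 - x0                           # highpass (difference) channel, p_n = delta_n
s = x0 + d / 2                        # lowpass (average) channel,     u_n = (1/2) delta_n

# Inverse lifting: undo the steps in reverse order with opposite signs
x0_rec = s - d / 2
x1_rec = d + x0_rec
x_rec = np.empty_like(x)
x_rec[0::2], x_rec[1::2] = x0_rec, x1_rec
assert np.allclose(x_rec, x)          # perfect reconstruction
assert np.allclose(s, (x0 + x1) / 2)  # within scaling, the Haar lowpass output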
Let us now identify the polyphase matrix $\Phi_p(z)$:
$$\Phi_g(z) = \alpha(z) - U(z)\beta(z),$$
$$\Phi_h(z) = \beta(z) + P(z)\Phi_g(z) = \beta(z) + P(z)\bigl(\alpha(z) - U(z)\beta(z)\bigr) = P(z)\alpha(z) + \bigl(1 - P(z)U(z)\bigr)\beta(z),$$
which we can write as
$$\begin{bmatrix} \Phi_g(z) \\ \Phi_h(z) \end{bmatrix} = \begin{bmatrix} 1 & -U(z) \\ P(z) & 1 - P(z)U(z) \end{bmatrix} \begin{bmatrix} \alpha(z) \\ \beta(z) \end{bmatrix} = \Phi_p(z) \begin{bmatrix} \alpha(z) \\ \beta(z) \end{bmatrix}. \qquad (1.85)$$
On the analysis side, $\widetilde{\Phi}_p(z)$ is:
$$\widetilde{\Phi}_p(z) = \bigl(\Phi_p^T(z)\bigr)^{-1} = \begin{bmatrix} 1 - P(z)U(z) & -P(z) \\ U(z) & 1 \end{bmatrix}. \qquad (1.86)$$
Since det(Φp(z)) = 1, inverting Φp(z) does not involve an actual inversion, which is one of the reasons why this technique is popular. Moreover, we can write Φp(z) as
$$\Phi_p(z) = \begin{bmatrix} 1 & -U(z) \\ P(z) & 1 - P(z)U(z) \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ P(z) & 1 \end{bmatrix} \begin{bmatrix} 1 & -U(z) \\ 0 & 1 \end{bmatrix}, \qquad (1.87)$$
decomposing Φp (z) into a sequence of lower/upper triangular matrices—lifting steps.
We also see that each matrix of this form is inverted simply by changing a sign,
$$\begin{bmatrix} 1 & 0 \\ M & 1 \end{bmatrix}^{-1} = \begin{bmatrix} 1 & 0 \\ -M & 1 \end{bmatrix} \qquad\text{and}\qquad \begin{bmatrix} 1 & M \\ 0 & 1 \end{bmatrix}^{-1} = \begin{bmatrix} 1 & -M \\ 0 & 1 \end{bmatrix},$$
meaning that to invert these one needs only to reverse the sequence of operations as shown in Figure 1.13. This is why this scheme allows for nonlinear operations; if M is nonlinear, its inversion amounts to simply reversing the sign in the matrix.
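To make the point about nonlinear operators concrete, here is an illustrative sketch (Python/NumPy; the particular predict/update maps are arbitrary choices of ours): the operators below involve rounding, yet the scheme remains exactly invertible because each step is undone by reversing its sign and the order of operations.

import numpy as np

rng = np.random.default_rng(1)
x = rng.integers(0, 256, 16)                           # integer input, e.g., image samples
x0, x1 = x[0::2].astype(int), x[1::2].astype(int)

predict = lambda e: np.floor(e / 2).astype(int)        # nonlinear (rounded) predict operator
update  = lambda d: np.floor((d + 1) / 2).astype(int)  # nonlinear (rounded) update operator

d = x1 - predict(x0)                                   # lifting step of the [1 0; -M 1] type
s = x0 + update(d)                                     # lifting step of the [1 M; 0 1] type

x0_rec = s - update(d)                                 # invert by reversing the operations and the signs
x1_rec = d + predict(x0_rec)
assert np.array_equal(x0_rec, x0) and np.array_equal(x1_rec, x1)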
1.6 Two-Channel Filter Banks with Stochastic Inputs
Our discussion so far assumed we are dealing with deterministic sequences as inputs into our filter bank, most often those with finite energy. If the input into
our filter bank is stochastic, then we must use the tools developed in Section 3.8.
The periodic shift variance for deterministic systems has its counterpart in wide-sense cyclostationarity. The notions of energy spectral density (3.96) (DTFT of
the deterministic autocorrelation) and energy (3.98) have their counterparts in the
notions of power spectral density (3.239) (DTFT of the stochastic autocorrelation)
and power (3.240). We now briefly discuss the effects of a filter bank on an input
WSS sequence.
Until now, we have seen various ways of characterizing systems with deterministic and stochastic inputs, among others via the deterministic and stochastic
autocorrelations.
In a single-input single-output system:
(i) For a deterministic sequence, its autocorrelation is Hermitian symmetric (see
(3.16)) and can be factored as in (3.96), (3.143), that is, it is nonnegative on
the unit circle. It is sometimes called energy spectral density.
(ii) For a WSS sequence, the counterpart to the deterministic autocorrelation is
the power spectral density given in (3.239).
In a multiple-input multiple-output system, such as a filter bank, where the multiple inputs are naturally polyphase components of the input sequence:

(i) For a deterministic sequence, we have a matrix autocorrelation of the vector of polyphase components [x0 x1]^T, given by (3.219). In particular, we have seen it for the vector of expansion coefficient sequences [α β]^T in a two-channel filter bank earlier in this chapter, in (1.43).

(ii) For a WSS sequence x, we can also look at the matrix of power spectral densities of the polyphase components [x0 x1]^T as in (3.247). In what follows, we analyze that matrix for the vector of expansion coefficient sequences [α β]^T.
Filter-Bank Optimization Based on Input Statistics   The area of optimizing filter banks based on input statistics is an active one. In particular, principal-component filter banks have been shown to be optimal for a wide variety of problems (we give pointers to the literature on the subject in Further Reading). For example, in parallel to our discussion in Chapter 6 on the use of KLT, it is known that the coding gain is maximized if the channel sequences are decorrelated ($\widetilde{\Phi}_a$ is diagonal), and $A_\alpha(e^{j\omega}) \ge A_\beta(e^{j\omega})$ if var(α_n) ≥ var(β_n). We can diagonalize $\widetilde{\Phi}_a$ by factoring it as $\widetilde{\Phi}_a = Q A Q^*$, where Q is the matrix of eigenvectors of $\widetilde{\Phi}_a$.
1.7 Computational Aspects
The power of filter banks is that they are a computational tool; they implement a
wide variety of bases (and frames, see Chapter 4). As the two-channel filter bank
is the basic building block for many of these, we now spend some time discussing
various computational concerns that arise in applications.
1.7.1 Two-Channel Filter Banks
We start with a two-channel filter bank with synthesis filters {g, h} and analysis filters {g̃, h̃}. For simplicity and comparison purposes, we assume that the input is of even length M, filters are of even length L, and all costs are computed per input sample. From (1.80b), the channel signals α and β are
$$\alpha = \widetilde{g}_0 * x_0 + \widetilde{g}_1 * x_1, \qquad (1.88a)$$
$$\beta = \widetilde{h}_0 * x_0 + \widetilde{h}_1 * x_1, \qquad (1.88b)$$
where $\widetilde{g}_{0,1}$, $\widetilde{h}_{0,1}$ are the polyphase components of the analysis filters g̃ and h̃. We have immediately written the expression in polyphase domain, as it is implicitly
clear that it does not make sense to do the filtering first and then discard every
other product (see Section 3.9.3).
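To see why the polyphase form is the natural one, the following sketch (Python/NumPy, with an arbitrary stand-in analysis filter) checks that two half-length convolutions at the downsampled rate reproduce filter-then-downsample; the exact placement of the one-sample delay depends on the polyphase convention, which (1.88) absorbs into the definition of the analysis polyphase components:

import numpy as np

rng = np.random.default_rng(2)
x = rng.standard_normal(64)
gt = rng.standard_normal(6)                  # stand-in analysis filter of even length L

# Naive approach: filter at full rate, then discard every other output sample
alpha_direct = np.convolve(x, gt)[::2]

# Polyphase approach: two half-length convolutions running at the downsampled rate
x0, x1 = x[0::2], x[1::2]
g0, g1 = gt[0::2], gt[1::2]
b0 = np.convolve(x0, g0)
b1 = np.convolve(x1, g1)
alpha_poly = np.zeros(len(alpha_direct))
alpha_poly[:len(b0)] += b0
alpha_poly[1:1 + len(b1)] += b1              # the odd branch appears with a one-sample delay
assert np.allclose(alpha_direct, alpha_poly)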
In general, (1.88) amounts to four convolutions with polyphase components
x0 and x1 , each of half the original length, plus the necessary additions. Instead
of using (3.275a), we compute directly the cost per input sample. The four convolutions operate at half the input rate and thus, for every two input samples, we
compute 4L/2 multiplications and 4((L/2) − 1) + 2 additions. This leads to L multiplications and L − 1 additions/input sample, that is, exactly the same complexity
as a convolution by a single filter of size L. The cost is thus
$$C_{\text{biorth,time}} = 2L - 1 \;\sim\; O(L), \qquad (1.89)$$
per input sample.
If an FFT-based convolution algorithm is used, for example, overlap-add, we need four convolutions using DFTs of length N as in (3.274), plus 2N additions. Assume for simplicity and comparison purposes that M = L = N/2. Then
$$C_{\text{biorth,freq}} = 16\alpha \log_2 L + 14 \;\sim\; O(\log_2 L), \qquad (1.90)$$
per input sample.
In [73], a precise analysis is made involving FFTs with optimized lengths so as
to minimize the operation count. Using the split-radix FFT algorithm, the number
of operations becomes (for large L)
Cbiorth,freq,optim = 4 log2 L + O(log2 log2 L),
again per input sample. Comparing this to Cbiorth,freq (and disregarding the constant α), the algorithm starts to be effective for L = 8 and a length-16 FFT, where
it achieves around 5 multiplications per sample rather than 8, and leads to improvements of an order of magnitude for large filters (such as L = 64 or 128). For
medium-size filters (L = 6, . . . , 12), a method based on fast running convolution is
best (see [73]).
Let us now consider some special cases where additional savings are possible.
Linear-Phase Filter Banks   It is well-known that if a filter is symmetric or antisymmetric, the number of operations can be halved in (1.89) by simply adding (or subtracting) the two input samples that are multiplied by the same coefficient. This trick can be used in the downsampled case as well, that is, filter banks with linear-phase filters require half the number of multiplications, or L/2 multiplications per input sample (the number of additions remains unchanged), for a total cost of
$$C_{\text{lp,direct}} = \tfrac{3}{2}L - 1 \;\sim\; O(L), \qquad (1.91)$$
still O(L) but with a savings of roughly 25% over (1.89). If the filter length is
odd, the polyphase components are themselves symmetric or antisymmetric, and
the saving is obvious in (1.88).
Another option is to use a linear-phase lattice factorization:
$$\Phi_p(z) = \alpha \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \prod_{i=1}^{N/2-1} \begin{bmatrix} 1 & 0 \\ 0 & z^{-1} \end{bmatrix} \begin{bmatrix} 1 & \alpha_i \\ \alpha_i & 1 \end{bmatrix}.$$
The individual 2 × 2 symmetric matrices can be written as (we assume $\alpha_i \ne 1$)
$$\begin{bmatrix} 1 & \alpha_i \\ \alpha_i & 1 \end{bmatrix} = \frac{1-\alpha_i}{2} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} \dfrac{1+\alpha_i}{1-\alpha_i} & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}.$$
By gathering the scale factors together, we see that each new block in the cascade
structure (which increases the length of the filters by two) adds only one multiplication. Thus, we need L/4 multiplications, and (L − 1) additions per input sample,
for a total cost of
$$C_{\text{lp,lattice}} = \tfrac{5}{4}L - 1 \;\sim\; O(L), \qquad (1.92)$$
per input sample. The savings is roughly 16% over (1.91), and 37.5% over (1.89).
Two-Channel Filter Bank        µ                 ν          Cost                 Order
Biorthogonal
  Frequency                    16α log2 L        14         16α log2 L + 14      O(log2 L)
  Time                         L                 L − 1      2L − 1               O(L)
Linear phase
  Direct form                  (1/2)L            L − 1      (3/2)L − 1           O(L)
  Lattice form                 (1/4)L            L − 1      (5/4)L − 1           O(L)
Orthogonal
  Lattice form                 (3/4)L            (3/4)L     (3/2)L               O(L)
  Denormalized lattice         (1/2)L + 1        (3/4)L     (5/4)L + 1           O(L)
QMF                            (1/2)L            (1/2)L     L                    O(L)

Table 1.6: Cost per input sample of computing various two-channel filter banks with length-L filters (µ: multiplications, ν: additions).
Orthogonal Filter Banks   As we have seen, there exists a general form for a two-channel paraunitary matrix, given in (1.39). If G0 (z) and G1 (z) were of degree
zero, it is clear that the matrix in (1.39) would be a rotation matrix, which can
be implemented with three multiplications, as we will show shortly. It turns out
that for arbitrary-degree polyphase components, terms can still be gathered into
rotations, saving 25% of multiplications (at the cost of 25% more additions). This
rotation property is more obvious in the lattice structure form of orthogonal filter
banks (1.54), where matrices R_k can be written as:
$$\begin{bmatrix} \cos\theta_i & -\sin\theta_i \\ \sin\theta_i & \cos\theta_i \end{bmatrix} = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix} \begin{bmatrix} \cos\theta_i - \sin\theta_i & 0 & 0 \\ 0 & \cos\theta_i + \sin\theta_i & 0 \\ 0 & 0 & \sin\theta_i \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 1 & -1 \end{bmatrix}.$$
Thus, only three multiplications are needed, or 3L/2 for the whole lattice. Since
the lattice works in the downsampled domain, the cost is 3L/4 multiplications per
input sample and a similar number of additions, for a total cost of
$$C_{\text{orth,lattice}} = \tfrac{3}{2}L \;\sim\; O(L), \qquad (1.93)$$
per input sample. We could also denormalize the diagonal matrix in the above
equation (taking out sin θi for example) and gather all scale factors at the end of
the lattice, leading to (L/2 + 1) multiplications per input sample, and the same
number of additions as before, for a total cost of
$$C_{\text{orth,lattice,denorm}} = \tfrac{5}{4}L + 1 \;\sim\; O(L), \qquad (1.94)$$
per input sample.
QMF Filter Banks The classic QMF solution discussed in Exercise ??, besides
using even-length linear phase filters, forces the highpass filter to be equal to the
lowpass, modulated by (−1)n . The polyphase matrix is therefore:
$$\Phi_p(z) = \begin{bmatrix} G_0(z) & G_1(z) \\ G_0(z) & -G_1(z) \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} G_0(z) & 0 \\ 0 & G_1(z) \end{bmatrix},$$
where G0 and G1 are the polyphase components of G(z). The factorized form on
the right indicates that the cost is halved. However, this scheme only approximates a basis expansion (perfect reconstruction) when using FIR filters. Table 1.6
summarizes the costs of various filter banks we have seen so far.
Multidimensional Filter Banks While we have not discussed multidimensional
filter banks so far (some pointers are given in Further Reading), we do touch upon
the cost of computing them. For example, filtering an M × M image with a filter of size L × L requires on the order of O(M²L²) operations. If the filter is separable, that is, G(z₁, z₂) = G₁(z₁)G₂(z₂), then filtering on rows and columns can be done separately and the cost is reduced to the order of O(2M²L) operations (M row filterings and M column filterings, each using ML operations).
A multidimensional filter bank can be implemented in its polyphase form,
bringing the cost down to the order of a single nondownsampled convolution, just
as in the one-dimensional case. A few cases of particular interest allow further
reductions in cost. For example, when both filters and downsampling are separable, the system is the direct product of one-dimensional systems, and the implementation is done separately over each dimension. Consider a two-dimensional
system filtering an M × M image into four subbands using the filters {G(z₁)G(z₂), G(z₁)H(z₂), H(z₁)G(z₂), H(z₁)H(z₂)}, each of size L × L, followed by separable downsampling by two in each dimension. This requires M decompositions in one dimension (one for each row), followed by M decompositions in the other, for a total of O(2M²L) multiplications and a similar number of additions. This is a saving of
the order of L/2 with respect to the nonseparable case.
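A small sketch of the separable case (Python/NumPy, assuming SciPy is available for the reference 2-D convolution; filters are arbitrary placeholders):

import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(3)
M, L = 32, 5
image = rng.standard_normal((M, M))
g1 = rng.standard_normal(L)                  # row filter G1(z1)
g2 = rng.standard_normal(L)                  # column filter G2(z2)

# Nonseparable view: one 2-D convolution with the L x L kernel, O(M^2 L^2) operations
kernel = np.outer(g2, g1)
direct = convolve2d(image, kernel)

# Separable view: filter the rows, then the columns, O(2 M^2 L) operations
rows = np.apply_along_axis(lambda r: np.convolve(r, g1), 1, image)
both = np.apply_along_axis(lambda c: np.convolve(c, g2), 0, rows)
assert np.allclose(direct, both)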
1.7.2 Boundary Extensions
While most of the literature as well as our exposition implicitly assume infinite-length sequences, in practice this is not the case. Given an N × N image, for
example, the result of processing it should be another image of the same size. In
Chapter 3, we discussed the finite-length case by introducing periodic (circular)
extension, when the appropriate convolution is the circular convolution and the
appropriate Fourier transform is the DFT. In practice, however, periodic extension
is rather artificial as it wraps the sequence around (for example, what is on the
left boundary of the image would appear on the right boundary). Other extensions
are possible, and while for some of them (for example, symmetric), appropriate
notions of convolution and Fourier transform are available, in practice this is not
done. Instead, different types of extensions are applied (zero-padding, symmetric,
continuous, smooth) while still using the tools developed for the periodic extension.
Throughout this subsection, we assume a sequence of length N ; also, we will be
using the extension nomenclature adopted in Matlab, and will point out other names
under which these extensions are known.
Periodic Extension From x, create a periodic y as
yn = xn mod N .
Of those we consider here, this is the only mathematically correct extension in
conjunction with the DFT. Moreover, it is simple and works for any sequence length.
The drawback is that the underlying sequence is most likely not periodic, and thus,
periodization creates artificial discontinuities at multiples of N; see Figure 1.14(b).¹⁰
Zero-Padding Extension   From x, create y as
$$y_n = \begin{cases} x_n, & n = 0, 1, \ldots, N-1; \\ 0, & \text{otherwise.} \end{cases}$$
Again, this extension is simple and works for any sequence length. However, it
too creates artificial discontinuities as in Figure 1.14(c). Also, during the filtering
process, the sequence is extended by the length of the filter (minus 1), which is
often undesirable.
Symmetric Extension   From x, create a double-length y as
$$y_n = \begin{cases} x_n, & n = 0, 1, \ldots, N-1; \\ x_{2N-n-1}, & n = N, N+1, \ldots, 2N-1, \end{cases}$$
and then periodize it. As shown in Figure 1.14(d), this periodic sequence of period
2N does not show the artificial discontinuities of the previous two cases.¹¹ However,
¹⁰ Technically speaking, a discrete sequence cannot be continuous or discontinuous. However, if
the sequence is a densely sampled version of a smooth sequence, periodization will destroy this
smoothness.
¹¹ It does remain discontinuous in its derivatives, however; for example, if it is linear, it will be continuous but not differentiable at 0 and N.
Figure 1.14: Boundary extensions. (a) Original sequence x of length N. (b) Periodic extension: x is repeated with period N. (c) Zero-padding extension: Beyond the support, y is set to zero. (d) Symmetric extension: The sequence is flipped at the boundaries to preserve continuity. (Half-point symmetry is shown.) (e) Continuous extension: The boundary value is replicated. (f) Smooth extension: At the boundary, a polynomial extension is applied to preserve higher-order continuity.
the sequence is now twice as long, and unless carefully treated, this redundancy
is hard to undo. Cases where it can be handled easily are when the filters are
symmetric or antisymmetric, because the output of the filtering will be symmetric
or antisymmetric as well.
There exist two versions of the symmetric extension, depending on whether
whole- or half-point symmetry is used. The formulation above is called half-point
symmetric because y is symmetric about the half-integer index value N − 1/2. An
alternative is the whole-point symmetric y,
$$y_n = \begin{cases} x_n, & n = 0, 1, \ldots, N-1; \\ x_{2N-n-2}, & n = N, N+1, \ldots, 2N-2, \end{cases}$$
with even symmetry around n = N − 1.
Continuous Extension   From x, create a double-length y as
$$y_n = \begin{cases} x_n, & n = 0, 1, \ldots, N-1; \\ x_{N-1}, & n = N, N+1, \ldots, 2N-1; \\ x_0, & n = 0, -1, \ldots, -N+1, \end{cases}$$
shown in Figure 1.14(e). This extension is also called boundary replication extension.
It is a relatively smooth extension and is often used in practice.
Smooth Extension Another idea is to extend the sequence by polynomial extrapolation, as in Figure 1.14(f). This is only lightly motivated at this point, but after
we establish polynomial approximation properties of the discrete wavelet transforms
in Chapter 3, it will be clear that a sequence extension by polynomial extrapolation
will be a way to get zeros as detail coefficients. The degree of the polynomial is
such that on the one hand, it gets annihilated by the zero moments of the wavelet,
and on the other hand, it can be extrapolated by the lowpass filter.
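For concreteness, here is an illustrative sketch of several of these extensions (Python/NumPy; the function and mode names are ours and do not follow Matlab's nomenclature; the smooth, polynomial-extrapolation extension is omitted):

import numpy as np

def extend(x, K, mode):
    """Return x extended by K samples on each side, for a few of the extensions above."""
    N = len(x)
    n = np.arange(-K, N + K)
    if mode == 'periodic':
        return x[n % N]
    if mode == 'zero':
        return np.where((n >= 0) & (n < N), x[np.clip(n, 0, N - 1)], 0.0)
    if mode == 'half-point symmetric':           # ... x1 x0 | x0 x1 ... xN-1 | xN-1 xN-2 ...
        m = n % (2 * N)
        return x[np.where(m < N, m, 2 * N - m - 1)]
    if mode == 'whole-point symmetric':          # ... x2 x1 | x0 x1 ... xN-1 | xN-2 xN-3 ...
        m = n % (2 * N - 2)
        return x[np.where(m < N, m, 2 * N - m - 2)]
    if mode == 'constant':                       # boundary replication ("continuous" extension)
        return x[np.clip(n, 0, N - 1)]
    raise ValueError(mode)

x = np.arange(1.0, 6.0)                          # [1 2 3 4 5], N = 5
for mode in ['periodic', 'zero', 'half-point symmetric', 'whole-point symmetric', 'constant']:
    print(mode, extend(x, 3, mode))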
Chapter at a Glance
Our goal in this chapter was to use signal processing machinery to build discrete-time bases
with structure in terms of time-frequency localization properties. Moreover, we restricted
ourselves to those bases generated by two prototype sequences, one that together with its
shifts covers the space of lowpass sequences, and the other that together with its shifts
covers the space of highpass sequences. The signal processing tool implementing such
bases is a two-channel filter bank.
Block diagram   [Two-channel filter bank: analysis filters g−n, h−n (or g̃n, h̃n) followed by downsampling by 2 produce the channel sequences α and β; upsampling by 2 followed by the synthesis filters gn, hn and summation reconstructs x.]

Basic characteristics
  number of channels      M = 2
  sampling factor         N = 2
  channel sequences       αn, βn

Filters                      Synthesis                          Analysis
                             lowpass        highpass            lowpass         highpass
  orthogonal                 gn             hn                  g−n             h−n
  biorthogonal               gn             hn                  g̃n              h̃n
  polyphase components       g0,n, g1,n     h0,n, h1,n          g̃0,n, g̃1,n      h̃0,n, h̃1,n

Table 1.7: Two-channel filter bank.
                 Synthesis                                         Analysis
                 lowpass                highpass                   lowpass                 highpass
Time domain      gn = (δn + δn−1)/√2    hn = (δn − δn−1)/√2        g−n = (δn + δn+1)/√2    h−n = (δn − δn+1)/√2
z domain         G(z) = (1 + z⁻¹)/√2    H(z) = (1 − z⁻¹)/√2        G(z⁻¹) = (1 + z)/√2     H(z⁻¹) = (1 − z)/√2
DTFT domain      G(e^jω) = (1 + e^−jω)/√2   H(e^jω) = (1 − e^−jω)/√2   G(e^−jω) = (1 + e^jω)/√2   H(e^−jω) = (1 − e^jω)/√2

Table 1.8: Haar filter bank.
Relationship between lowpass and highpass filters
  Time domain         ⟨hn, gn−2k⟩n = 0
  Matrix domain       D2 Hᵀ G U2 = 0
  z domain            G(z)H(z⁻¹) + G(−z)H(−z⁻¹) = 0
  DTFT domain         G(e^{jω})H(e^{−jω}) + G(e^{j(ω+π)})H(e^{j(−ω+π)}) = 0
  Polyphase domain    G0(z)G1(z⁻¹) + H0(z)H1(z⁻¹) = 0

Basis sequences (time domain)
  lowpass: {g_{n−2k}}_{k∈Z}          highpass: {h_{n−2k}}_{k∈Z}

Filters              Synthesis                                          Analysis
                     lowpass       highpass                             lowpass       highpass
  Time domain        gn            ±(−1)^n g_{−n+2ℓ−1}                  g_{−n}        ±(−1)^n g_{n+2ℓ−1}
  z domain           G(z)          ∓z^{−2ℓ+1} G(−z⁻¹)                   G(z⁻¹)        ∓z^{2ℓ−1} G(−z)
  DTFT domain        G(e^{jω})     ∓e^{j(−2ℓ+1)ω} G(e^{−j(ω+π)})        G(e^{−jω})    ∓e^{j(2ℓ−1)ω} G(e^{j(ω+π)})

Matrix view of the basis
  Time domain         Φ = [ ... g_{n−2k}  h_{n−2k} ... ]
  z domain            Φ(z) = [ G(z) H(z) ; G(−z) H(−z) ]
  DTFT domain         Φ(e^{jω}) = [ G(e^{jω}) H(e^{jω}) ; G(e^{j(ω+π)}) H(e^{j(ω+π)}) ]
  Polyphase domain    Φp(z) = [ G0(z) H0(z) ; G1(z) H1(z) ]

Constraints           Orthogonality relations            Perfect reconstruction
  Time domain         ΦᵀΦ = I                            ΦΦᵀ = I
  z domain            Φ(z⁻¹)ᵀ Φ(z) = I                   Φ(z) Φᵀ(z⁻¹) = I
  DTFT domain         Φᵀ(e^{−jω}) Φ(e^{jω}) = I          Φ(e^{jω}) Φᵀ(e^{−jω}) = I
  Polyphase domain    Φpᵀ(z⁻¹) Φp(z) = I                 Φp(z) Φpᵀ(z⁻¹) = I

Table 1.9: Two-channel orthogonal filter bank.
Relationship between lowpass and highpass filters
  Time domain         ⟨hn, gn−2k⟩n = 0
  Matrix domain       D2 Hᵀ G U2 = 0
  z domain            G(z)H(z⁻¹) + G(−z)H(−z⁻¹) = 0
  DTFT domain         G(e^{jω})H(e^{−jω}) + G(e^{j(ω+π)})H(e^{j(−ω+π)}) = 0
  Polyphase domain    G0(z)G1(z⁻¹) + H0(z)H1(z⁻¹) = 0

Sequences (time domain)
  Basis:      lowpass {g_{n−2k}}_{k∈Z},   highpass {h_{n−2k}}_{k∈Z}
  Dual basis: lowpass {g̃_{2k−n}}_{k∈Z},   highpass {h̃_{2k−n}}_{k∈Z}

Filters              Synthesis                                          Analysis
                     lowpass       highpass                             lowpass       highpass
  Time domain        gn            ±(−1)^n g̃_{n−2ℓ+1}                   g̃n            ±(−1)^n g_{n+2ℓ−1}
  z domain           G(z)          ∓z^{−2ℓ+1} G̃(−z)                     G̃(z)          ∓z^{2ℓ−1} G(−z)
  DTFT domain        G(e^{jω})     ∓e^{j(−2ℓ+1)ω} G̃(e^{j(ω+π)})         G̃(e^{jω})     ∓e^{j(2ℓ−1)ω} G(e^{j(ω+π)})

Matrix view           Basis                                                        Dual basis
  Time domain         Φ = [ ... g_{n−2k}  h_{n−2k} ... ]                           Φ̃ = [ ... g̃_{2k−n}  h̃_{2k−n} ... ]
  z domain            Φ(z) = [ G(z) H(z) ; G(−z) H(−z) ]                           Φ̃(z) = [ G̃(z) H̃(z) ; G̃(−z) H̃(−z) ]
  DTFT domain         Φ(e^{jω}) = [ G(e^{jω}) H(e^{jω}) ; G(e^{j(ω+π)}) H(e^{j(ω+π)}) ]   Φ̃(e^{jω}) = [ G̃(e^{jω}) H̃(e^{jω}) ; G̃(e^{j(ω+π)}) H̃(e^{j(ω+π)}) ]
  Polyphase domain    Φp(z) = [ G0(z) H0(z) ; G1(z) H1(z) ]                        Φ̃p(z) = [ G̃0(z) H̃0(z) ; G̃1(z) H̃1(z) ]

Constraints           Biorthogonality relations          Perfect reconstruction
  Time domain         ΦᵀΦ̃ = I                            ΦΦ̃ᵀ = I
  z domain            Φᵀ(z) Φ̃(z) = I                     Φ(z) Φ̃ᵀ(z) = I
  DTFT domain         Φᵀ(e^{jω}) Φ̃(e^{jω}) = I           Φ(e^{jω}) Φ̃ᵀ(e^{jω}) = I
  Polyphase domain    Φpᵀ(z) Φ̃p(z) = I                   Φp(z) Φ̃pᵀ(z) = I

Table 1.10: Two-channel biorthogonal filter bank.
Historical Remarks
Filter banks have been popular in signal processing since the
1970s when the question of critically-sampled filter banks, those
with the number of channel samples per unit of time conserved,
arose in the context of subband coding of speech. In that method,
a speech sequence is split into downsampled frequency bands, allowing for more powerful compression. However, downsampling
can create a perceptually disturbing effect known as aliasing, prompting Esteban and Galand [38] in 1977 to propose a simple and elegant quadrature mirror filter (QMF) aliasing-removal technique. As the QMF solution does not allow for perfect reconstruction, a flurry of work followed to solve the problem. Mintzer [64] as well as Smith and Barnwell [83] proposed
an orthogonal solution independently in the mid 1980s. Vaidyanathan [96] established
a connection to lossless systems, unveiling the factorization and design of paraunitary
matrices [100]. For wavelet purposes, Daubechies then designed filters with a maximum
number of zeros at z = −1 [29], a solution that goes back to Herrmann’s design of maximally flat FIR filters [47]. The equivalent IIR filter design problem leads to Butterworth
filters, as derived by Herley and Vetterli [45]. Vetterli solved the biorthogonal filter bank
problem [103, 104], while Cohen, Daubechies and Feauveau [23] as well as Vetterli and
Herley [105] tackled those with a maximum number of zeros at z = −1. The polyphase
framework was used by many authors working on filter banks, but really goes back to earlier work on transmultiplexers by Bellanger and Daguet [7]. The realization that perfect
reconstruction subband coding can be used for perfect transmultiplexing appears in [104].
The idea of multichannel structures that can be inverted perfectly, including with quantization, goes back to ladder structures in filter design and implementation, in the works of
Bruckens and van den Enden, Marshall, Shah and Kalker [14,62,79]. Sweldens generalized
this idea under the name of lifting [90], deriving a number of new schemes based on this
concept, including filter banks with nonlinear operators and nonuniform sampling.
Further Reading
Books and Textbooks A few standard textbooks on filter banks exist, written by
Vaidyanathan [98], Vetterli and Kovačević [106], Strang and Nguyen [87], among others.
N-Channel Filter Banks   One of the important and immediate generalizations of two-channel filter banks is when we allow the number of channels to be N. Numerous options are available, from directly designing N-channel filter banks, studied in detail by Vaidyanathan [96, 97], through those built by cascading filter banks with different numbers of branches, leading to almost arbitrary frequency divisions. The analysis methods follow closely those of the two-channel filter banks, albeit with more freedom; for example,
of branches, leading to almost arbitrary frequency divisions. The analysis methods follow closely those of the two-channel filter banks, albeit with more freedom; for example,
orthogonality and linear phase are much easier to achieve at the same time. We discuss
N -channel filter banks in detail in Chapter 2, with special emphasis on local Fourier bases.
Multidimensional Filter Banks The first difference we encounter when dealing with
multidimensional filter banks is that of sampling. Regular sampling with a given density
can be accomplished using any number of sampling lattices, each having any number of
associated sampling matrices. These have been described in detail by Dubois in [35], and
have been used by Viscito and Allebach [109], Karlsson and Vetterli [50], Kovačević and
Vetterli [55], Do and Vetterli [33], among others, to design multidimensional filter banks.
Apart from the freedom coming with different sampling schemes, the associated filters can
now be truly multidimensional, allowing for a much larger space of solutions.
IIR Filter Banks   While IIR filters are attractive because of their good frequency selectivity and computational efficiency, they have not been used extensively, as their implementation in a filter-bank framework comes at a cost: one side of the filter bank
is necessarily anticausal. They have found some use in image processing as the finite length
of the input allows for storing the state in the middle of the filter bank and synthesizing
from that stored state. Coverage of IIR filter banks can be found in [45, 70, 82].
Oversampled Filter Banks Yet another generalization occurs when we allow for redundancy, leading to overcomplete filter banks implementing frame expansions, covered
in Chapter 4. These filter banks are becoming popular in applications due to inherent
freedom in design.
Complex-Coefficient Filter Banks   This entire chapter dealt exclusively with real-coefficient filter banks, due to their prevalence in practice. Complex-coefficient filter banks
exist, from the very early QMFs [66] to more recent ones, mostly in the form of complex
exponential-modulated local Fourier bases, discussed in Section 2.3, as well as the redundant ones, such as Gabor frames [10–12, 26, 39], discussed in Chapter 4.
QMF Filter Banks QMF filter banks showed the true potential of filter banks, as it was
clear that one could have nonideal filters and still split and reconstruct the input spectrum.
The excitement was further spurred by the famous linear-phase designs by Johnston [49]
in 1980. Exercise ?? discusses derivation of these filters and their properties.
Time-Varying Filter Banks and Boundary Filters The periodic shift variance of filter
banks can be exploited to change a filter bank essentially every period. This was done for
years in audio coding through the so-called MDCT filter banks, discussed in Section 2.4.1.
Herley and Vetterli proposed a more formal approach in [46], by designing different filters
to be used at the boundary of a finite-length input, or a filter-bank change.
Transmultiplexing The dual scheme to a filter bank is known as a transmultiplexer,
where two sequences are synthesized into a combined sequence from which the two parts
can be extracted perfectly. An orthogonal decomposition with many channels leads to orthogonal frequency-division multiplexing (OFDM), the basis for many modulation schemes
used in communications, such as 802.11. The analysis of transmultiplexers uses similar
tools as for filter banks [104], covered in Solved Exercise ?? for the orthogonal case, and
in Exercise ?? for the biorthogonal case. Exercise ?? considers frequency-division multiplexing with Haar filters.
Filter Banks with Stochastic Inputs Among the wealth of filter banks available, it
is often necessary to determine which one is the most suitable for a given application.
A number of measures have been proposed, for example, quantifying shift variance of
subband energies for deterministic inputs. Similarly, in [1, 2], the author proposes, among
others, a counterpart measure based on the cyclostationarity of subband powers. We do
not dwell on these here, rather we leave the discussion for Chapter 7. Akkarakaran and
Vaidyanathan in [4] discuss bifrequency and bispectrum maps (deterministic and stochastic
time-varying autocorrelations) in filter banks and answer many relevant questions; some
similar issues are tackled by Therrien in [91]. In [99], Vaidyanathan and Akkarakaran
give a review of optimal filter banks based on input statistics. In particular, principalcomponent filter banks offer optimal solutions to various problems, some of these discussed
in [93] and [94, 95].
Chapter 2
Local Fourier Bases on Sequences
Contents
2.1  Introduction
2.2  N-Channel Filter Banks
2.3  Complex Exponential-Modulated Local Fourier Bases
2.4  Cosine-Modulated Local Fourier Bases
2.5  Computational Aspects
     Chapter at a Glance
     Historical Remarks
     Further Reading
Think of a piece of music: notes appear at different instants of time, and then
fade away. These are short-time frequency events the human ear identifies easily,
but are a challenge for a computer to understand. These notes are well identified
frequencies, but they are short lived. Thus, we would like to have access to a
local Fourier transform, that is, a time-frequency analysis tool that understands the
spectrum locally in time. While such a transform is known under many names, such
as windowed Fourier transform, Gabor transform and short-time Fourier transform,
we will use local Fourier transform exclusively throughout the manuscript. The
local energy distribution over frequency, which can be obtained by squaring the
magnitude of the local Fourier coefficients, is called the spectrogram, and is widely
used in speech processing and time-series analysis.
Our purpose in this chapter is to explore what is possible in terms of obtaining
such a local version of the Fourier transform of a sequence. While, unfortunately, we
will see that, apart from short ones, there exist no good longer local Fourier bases,
there exist good local Fourier frames, the topic we explore in Chapter 4. Moreover,
there exist good local cosine bases, where the complex-exponential modulation is
replaced by cosine modulation. These constructions will all be implemented using
general, N -channel filter banks, the first generalization of the basic two-channel
filter bank block we just saw in the last chapter.
2.1 Introduction
We now look at the simplest example of a local Fourier transform decomposing
the spectrum into N equal parts. As we have learned in the previous chapter, for
N = 2, two-channel filter banks do the trick; for a general N , it is no surprise that
N -channel filter banks perform that role, and we now show just that.
If we have an infinite-length sequence, we could use the DTFT we discussed
in Chapter 3; however, as we mentioned earlier, this representation will erase any
time-local information present in the sequence. We could, however, use another tool
also discussed in Chapter 3, the DFT. While we have said that the DFT is a natural
tool for the analysis of either periodic sequences or infinite-length sequences with
a finite number of nonzero samples, circularly extended, we can also use the DFT
as a tool to observe the local behavior of an infinite-length sequence by dividing it
into pieces of length N , followed by a length-N DFT.
Implementing a Length-N DFT Basis Expansion
We now mimic what we have done for the Haar basis in the previous chapter, that
is, implement the DFT basis using signal processing machinery. We start with the
basis view of the DFT from Section 3.6.1; we assume this finite-dimensional basis
is applied to length-N pieces of our input sequence. The final basis then consists of
$\{\varphi_i\}_{i=0}^{N-1}$ from (3.162) and all their shifts by integer multiples of N, that is,
$$\Phi_{\text{DFT}} = \{\varphi_{i,n-Nk}\}_{i \in \{0,1,\ldots,N-1\},\, k \in \mathbb{Z}}. \qquad (2.1)$$
In other words, as opposed to two template basis sequences generating the entire
basis by shifting as in the Haar case, not surprisingly, we now have N template
basis sequences generating the entire basis by shifting. We rename those template
basis sequences to (we use the normalized version of the DFT):
$$g_{i,n} = \varphi_{i,n} = \frac{1}{\sqrt{N}} W_N^{-in}. \qquad (2.2)$$
This is again done both for simplicity, as well as because it is the standard way
these sequences are denoted.
Then, we rewrite the reconstruction formula (3.160b) as
$$x_n = \sum_{i=0}^{N-1} \sum_{k \in \mathbb{Z}} \underbrace{\langle x_n, \varphi_{i,n-Nk} \rangle_n}_{\alpha_{i,k}} \varphi_{i,n-Nk} = \sum_{i=0}^{N-1} \sum_{k \in \mathbb{Z}} \alpha_{i,k} \underbrace{\varphi_{i,n-Nk}}_{g_{i,n-Nk}} = \sum_{i=0}^{N-1} \sum_{k \in \mathbb{Z}} \alpha_{i,k}\, g_{i,n-Nk}, \qquad (2.3a)$$
where we have renamed the basis sequences as explained above, as well as denoted
the expansion coefficients as
$$\langle x_n, \varphi_{i,n-Nk} \rangle_n = \langle x_n, g_{i,n-Nk} \rangle_n = \alpha_{i,k}. \qquad (2.3b)$$
Figure 2.1: An N -channel analysis/synthesis filter bank.
As for Haar, we recognize each sum in (2.3a) as the output of upsampling by N followed by filtering ((3.203) for upsampling factor N ) with the input sequences being
αi,k . Thus, each sum in (2.3a) can be implemented as the input sequence αi going
through an upsampler by N followed by filtering by gi (right side in Figure 2.1).
By the same token, we can identify the computation of the expansion coefficients αi in (2.3b) as (3.200) for downsampling factor N , that is, filtering by gi,−n
followed by downsampling by N (left side in Figure 2.1).
We can now merge the above operations to yield an N -channel filter bank
implementing a DFT orthonormal basis expansion as in Figure 2.1. The part on the
left, which computes the projection coefficients, is termed an analysis filter bank,
while the part on the right, which computes the actual projections, is termed a
synthesis filter bank.
As before, once we have identified all the appropriate multirate components,
we can examine the DFT filter bank via matrix operations. For example, in matrix
notation, the analysis process (2.3b) can be expressed as
$$\begin{bmatrix} \vdots \\ \alpha_{0,0} \\ \vdots \\ \alpha_{N-1,0} \\ \alpha_{0,1} \\ \vdots \\ \alpha_{N-1,1} \\ \vdots \end{bmatrix} = \underbrace{\frac{1}{\sqrt{N}} \begin{bmatrix} \ddots & & & \\ & F & & \\ & & F & \\ & & & \ddots \end{bmatrix}}_{\Phi^*} \begin{bmatrix} \vdots \\ x_0 \\ \vdots \\ x_{N-1} \\ x_N \\ \vdots \\ x_{2N-1} \\ \vdots \end{bmatrix}, \qquad (2.4a)$$
with F as in (3.161a), and the synthesis process (2.3a) as
$$\begin{bmatrix} \vdots \\ x_0 \\ \vdots \\ x_{N-1} \\ x_N \\ \vdots \\ x_{2N-1} \\ \vdots \end{bmatrix} = \underbrace{\frac{1}{\sqrt{N}} \begin{bmatrix} \ddots & & & \\ & F^* & & \\ & & F^* & \\ & & & \ddots \end{bmatrix}}_{\Phi} \begin{bmatrix} \vdots \\ \alpha_{0,0} \\ \vdots \\ \alpha_{N-1,0} \\ \alpha_{0,1} \\ \vdots \\ \alpha_{N-1,1} \\ \vdots \end{bmatrix}. \qquad (2.4b)$$
Of course, Φ is a unitary matrix, since $F/\sqrt{N}$ is.
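A minimal numerical sketch of this block-DFT expansion (Python/NumPy; illustrative only) computes the analysis and synthesis blockwise and verifies perfect reconstruction:

import numpy as np

N = 8
F = np.fft.fft(np.eye(N))                    # N x N DFT matrix; F / sqrt(N) is unitary

rng = np.random.default_rng(4)
x = rng.standard_normal(5 * N)               # length chosen as a multiple of N for simplicity

# Analysis: inner products with the shifted template sequences g_{i,n-Nk}, one block at a time
blocks = x.reshape(-1, N)
alpha = blocks @ F / np.sqrt(N)              # alpha[k, i] = <x_n, g_{i,n-Nk}>_n

# Synthesis: rebuild each length-N block from its N expansion coefficients
x_rec = (alpha @ np.conj(F) / np.sqrt(N)).reshape(-1)
assert np.allclose(x_rec, x)                 # perfect reconstruction: Phi is unitary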
Localization Properties of the Length-N DFT It is quite clear that the time
localization properties of the DFT are superior to those of the DTFT, as now, we
have access to the time-local events at the resolution of length N . However, as a
result, the frequency resolution must necessarily worsen; to see this, consider the
frequency response of g0 (the other gk s are modulated versions and therefore have
the same frequency resolution):
$$G_0(e^{j\omega}) = \sqrt{N}\, \frac{\operatorname{sinc}(\omega N/2)}{\operatorname{sinc}(\omega/2)}, \qquad (2.5)$$
that is, it is the DTFT of a box sequence (see Table 4.6). It has zeros at ω = 2πk/N,
k = 1, 2, . . . , N − 1, but decays slowly in between.
The orthonormal basis given by the DFT is just one of many basis options implementable by N -channel filter banks; many others, with template basis sequences
with more than N nonzero samples are possible (similarly to the two-channel case).
The DFT is a local Fourier version as the time events can be captured with the
resolution of N samples.
Chapter Outline
This short introduction leads naturally to the following structure of the chapter:
In Section 2.2, we give an overview of N -channel filter banks. In Section 2.3, we
present the local Fourier bases implementable by complex exponential-modulated
filter banks. We then come to the crucial, albeit negative result: the Balian-Low
theorem, which states the impossibility of good complex exponential-modulated
local Fourier bases. We look into their applications: local power spectral density via
periodograms, as well as in transmultiplexing. To mitigate the Balian-Low negative
result, Section 2.4 considers what happens if we use cosine modulation instead
of the complex one to obtain a local frequency analysis. In the block-transform
case, we encounter the discrete cosine transform, which plays a prominent role in
image processing. In the sliding window case, a cosine-modulated filter bank allows
the best of both worlds, namely an orthonormal basis with good time-frequency
localization. We also discuss variations on this construction as well as an application
to audio compression.
Notation used in this chapter: Unlike in the previous chapter, in this one, complex-coefficient filter banks are the norm. Thus, Hermitian transposition is used often,
with the caveat that only coefficients should be conjugated and not z. We will point
these out throughout the chapter.
2.2 N-Channel Filter Banks
We could imagine achieving our goal of splitting the spectrum into N pieces in many ways; we have just seen one, achievable by using the DFT, a representation with
reasonable time but poor frequency localization. Another option is using an ideal
N th band filter and its shifts (we have seen it in Table 3.5 and Table 4.6, as well as
(3.108) with ω0 = 2π/N , but repeat it here for completeness):
$$g_{0,n} = \frac{1}{\sqrt{N}}\operatorname{sinc}(\pi n/N) \;\overset{\text{DTFT}}{\longleftrightarrow}\; G_0(e^{j\omega}) = \begin{cases} \sqrt{N}, & |\omega| \le \pi/N; \\ 0, & \text{otherwise}, \end{cases} \qquad (2.6)$$
which clearly has perfect frequency localization but poor time localization as its
impulse response is a discrete sinc sequence. We have discussed this trade-off already
in Chapter 7, and depict it in Figure 2.2.
The question now is whether there exist constructions in between these two extreme cases. Specifically, are there basis sequences with better frequency localization than the block transform, but with impulse responses that decay faster than
the sinc impulse response (for example, a finite impulse response)?
To explore this issue, we introduce general N -channel filter banks. These are as
shown in Figure 2.1, where the input is analyzed by N filters g̃i, i = 0, 1, . . . , N − 1,
and downsampled by N . The synthesis is done by upsampling by N , followed by
interpolation with gi , i = 0, 1, . . . , N − 1.
The analysis of N -channel filter banks can be done in complete analogy to the
two-channel case, by using the relevant equations for sampling rate changes by N .
We now state these without proofs, and illustrate them on a particular case of a
3-channel filter bank, especially in polyphase domain.
2.2.1 Orthogonal N-Channel Filter Banks
As for two-channel filter banks, N -channel orthogonal filter banks are of particular
interest; the DFT is one example. We now briefly follow the path from the previous
chapter and put in one place the relations governing such filter banks. The biorthogonal ones follow similarly, and we just touch upon them during the discussion of
the polyphase view.
Orthogonality of a Single Filter Since we started with an orthonormal basis, the
set {gi,n−N k }k∈Z,i∈{0,1,...,N −1} is an orthonormal set. We have seen in Section 3.7.4
Figure 2.2: Time- and frequency-domain behaviors of two orthonormal bases with N = 8
channels. (a)–(b) Sinc basis. (a) Impulse response is a sinc sequence, with poor time
localization. (b) Frequency response is a box function, with perfect frequency localization.
(c)–(d) DFT basis. (c) Impulse response is a box sequence, with good time localization.
(d) Frequency response is a sinc function, with poor frequency localization.
that each such filter is orthogonal and satisfies, analogously to (3.215):
$$\langle g_{i,n}, g_{i,n-Nk}\rangle = \delta_k
\;\overset{\text{Matrix View}}{\longleftrightarrow}\; D_N G_i^T G_i U_N = I
\;\overset{\text{ZT}}{\longleftrightarrow}\; \sum_{k=0}^{N-1} G_i(W_N^k z)\, G_i(W_N^{-k} z^{-1}) = N
\;\overset{\text{DTFT}}{\longleftrightarrow}\; \sum_{k=0}^{N-1} \bigl|G_i(e^{j(\omega-(2\pi/N)k)})\bigr|^2 = N \qquad (2.7)$$
As before, the matrix view expresses the fact that the columns of Gi UN form an
orthonormal set and the DTFT version is a generalization of the quadrature mirror
formula (3.214). For example, take g0 and N = 3. The DTFT version is then
$$\bigl|G_0(e^{j\omega})\bigr|^2 + \bigl|G_0(e^{j(\omega-2\pi/N)})\bigr|^2 + \bigl|G_0(e^{j(\omega-4\pi/N)})\bigr|^2 = 3;$$
essentially, the magnitude response squared of the filter, added to its modulated versions by 2π/N and 4π/N, sums up to a constant. This is easily seen in the case of an ideal third-band filter, whose frequency response would be constant √3 (see (2.6)), and thus squared and shifted across the spectrum would satisfy the above.
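This is easy to verify numerically; a small sketch (Python/NumPy, illustrative only) checks the sum for the length-3 DFT template sequence g0 from (2.2) on a grid of frequencies:

import numpy as np

N = 3
g0 = np.ones(N) / np.sqrt(N)                            # g_0 of the length-3 DFT basis, (2.2)

omega = np.linspace(0, 2 * np.pi, 200, endpoint=False)
total = np.zeros_like(omega)
for k in range(N):
    w = omega - 2 * np.pi * k / N
    G = g0 @ np.exp(-1j * np.outer(np.arange(N), w))    # G_0(e^{j(omega - 2 pi k / N)})
    total += np.abs(G) ** 2
assert np.allclose(total, N)                            # the squared magnitudes sum to N = 3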
Deterministic Autocorrelation of a Single Filter With ai the deterministic autocorrelation of gi , the deterministic autocorrelation version of (2.7) is straightfor-
ward:
$$\langle g_{i,n}, g_{i,n-Nk}\rangle = a_{i,Nk} = \delta_k
\;\overset{\text{Matrix View}}{\longleftrightarrow}\; D_N A_i U_N = I
\;\overset{\text{ZT}}{\longleftrightarrow}\; \sum_{k=0}^{N-1} A_i(W_N^k z) = N
\;\overset{\text{DTFT}}{\longleftrightarrow}\; \sum_{k=0}^{N-1} A_i(e^{j(\omega-(2\pi/N)k)}) = N \qquad (2.8)$$
Orthogonal Projection Property of a Single Channel   Analogously to two channels,
a single channel with orthogonal filters projects onto a coarse subspace V0 or detail
subspaces Wi , i = 1, 2, . . . , N − 1, depending on the frequency properties of the
filter. Each of the orthogonal projection operators is given as
$$P_{V_0} = G_0 U_N D_N G_0^T, \qquad P_{W_i} = G_i U_N D_N G_i^T, \quad i = 1, 2, \ldots, N-1,$$
with the range
$$V_0 = \operatorname{span}\bigl(\{g_{0,n-Nk}\}_{k\in\mathbb{Z}}\bigr), \qquad W_i = \operatorname{span}\bigl(\{g_{i,n-Nk}\}_{k\in\mathbb{Z}}\bigr), \quad i = 1, 2, \ldots, N-1.$$
Orthogonality of Filters As in the previous chapter, once the orthonormality of
each single channel is established, what is left is the orthogonality of the channels
among themselves. Again, all the expressions are analogous to the two-channel case;
we state them here without proof. We assume below that i ≠ j.
$$\langle g_{i,n}, g_{j,n-Nk}\rangle = 0
\;\overset{\text{Matrix View}}{\longleftrightarrow}\; D_N G_j^T G_i U_N = 0
\;\overset{\text{ZT}}{\longleftrightarrow}\; \sum_{k=0}^{N-1} G_i(W_N^k z)\, G_j(W_N^{-k} z^{-1}) = 0
\;\overset{\text{DTFT}}{\longleftrightarrow}\; \sum_{k=0}^{N-1} G_i(e^{j(\omega-(2\pi/N)k)})\, G_j(e^{-j(\omega+(2\pi/N)k)}) = 0 \qquad (2.9)$$
Deterministic Crosscorrelation of Filters   Calling $c_{i,j}$ the deterministic crosscorrelation of $g_i$ and $g_j$:
$$\langle g_{i,n}, g_{j,n-Nk}\rangle = c_{i,j,Nk} = 0
\;\overset{\text{Matrix View}}{\longleftrightarrow}\; D_N C_{i,j} U_N = 0
\;\overset{\text{ZT}}{\longleftrightarrow}\; \sum_{k=0}^{N-1} C_{i,j}(W_N^k z) = 0
\;\overset{\text{DTFT}}{\longleftrightarrow}\; \sum_{k=0}^{N-1} C_{i,j}(e^{j(\omega-(2\pi/N)k)}) = 0 \qquad (2.10)$$

2.2.2 Polyphase View of N-Channel Filter Banks
To develop the polyphase view for general N, we first work through an example with N = 3, and then briefly summarize the discussion for a general N.
Example 2.1 (Orthogonal 3-channel filter banks) For two-channel filter banks, a polyphase decomposition is achieved by simply splitting both sequences and filters into their even- and odd-indexed subsequences; for 3-channel
filter banks, we split sequences and filters into subsequences modulo 3. While
we have seen the expression for a polyphase representation of a sequence and
filters for a general N in (3.227), we write them out for N = 3 to develop some
intuition, starting with the input sequence x:
$$x_{0,n} = x_{3n} \;\overset{\text{ZT}}{\longleftrightarrow}\; X_0(z) = \sum_{n\in\mathbb{Z}} x_{3n}\, z^{-n},$$
$$x_{1,n} = x_{3n+1} \;\overset{\text{ZT}}{\longleftrightarrow}\; X_1(z) = \sum_{n\in\mathbb{Z}} x_{3n+1}\, z^{-n},$$
$$x_{2,n} = x_{3n+2} \;\overset{\text{ZT}}{\longleftrightarrow}\; X_2(z) = \sum_{n\in\mathbb{Z}} x_{3n+2}\, z^{-n},$$
$$X(z) = X_0(z^3) + z^{-1} X_1(z^3) + z^{-2} X_2(z^3).$$
In the above, x0 is the subsequence of x at multiples of 3 downsampled by 3, and
similarly for x1 and x2 :
$$x_0 = \begin{bmatrix} \ldots & x_{-3} & x_0 & x_3 & x_6 & \ldots \end{bmatrix}^T, \quad
x_1 = \begin{bmatrix} \ldots & x_{-2} & x_1 & x_4 & x_7 & \ldots \end{bmatrix}^T, \quad
x_2 = \begin{bmatrix} \ldots & x_{-1} & x_2 & x_5 & x_8 & \ldots \end{bmatrix}^T.$$
This is illustrated in Figure 2.3(a): to get x0 we simply keep every third sample
from x; to get x1 , we shift x by one to the left (advance by one represented by
z) and then keep every third sample; finally, to get x2 , we shift x by two to
the left and then keep every third sample. To get the original sequence back,
we upsample each subsequence by 3, shift appropriately to the right (delays
represented by z −1 and z −2 ), and sum up.
Using (3.227), we define the polyphase decomposition of the synthesis filters:
$$g_{i,0,n} = g_{i,3n} \;\overset{\text{ZT}}{\longleftrightarrow}\; G_{i,0}(z) = \sum_{n\in\mathbb{Z}} g_{i,3n}\, z^{-n}, \qquad (2.11a)$$
$$g_{i,1,n} = g_{i,3n+1} \;\overset{\text{ZT}}{\longleftrightarrow}\; G_{i,1}(z) = \sum_{n\in\mathbb{Z}} g_{i,3n+1}\, z^{-n}, \qquad (2.11b)$$
$$g_{i,2,n} = g_{i,3n+2} \;\overset{\text{ZT}}{\longleftrightarrow}\; G_{i,2}(z) = \sum_{n\in\mathbb{Z}} g_{i,3n+2}\, z^{-n}, \qquad (2.11c)$$
$$G_i(z) = G_{i,0}(z^3) + z^{-1} G_{i,1}(z^3) + z^{-2} G_{i,2}(z^3),$$
where the first subscript denotes the filter, the second the polyphase component,
and the last, the discrete time index. In the above, we split each synthesis filter
into its subsequences modulo 3 as we have done for the input sequence x:
$$g_0 = \begin{bmatrix} \ldots & g_{-3} & g_0 & g_3 & g_6 & \ldots \end{bmatrix}^T, \quad
g_1 = \begin{bmatrix} \ldots & g_{-2} & g_1 & g_4 & g_7 & \ldots \end{bmatrix}^T, \quad
g_2 = \begin{bmatrix} \ldots & g_{-1} & g_2 & g_5 & g_8 & \ldots \end{bmatrix}^T.$$
We can now define the polyphase matrix $\Phi_p(z)$:
$$\Phi_p(z) = \begin{bmatrix} G_{0,0}(z) & G_{1,0}(z) & G_{2,0}(z) \\ G_{0,1}(z) & G_{1,1}(z) & G_{2,1}(z) \\ G_{0,2}(z) & G_{1,2}(z) & G_{2,2}(z) \end{bmatrix}.$$
The matrix above is on the synthesis side; to get it on the analysis side, we
define the polyphase decomposition of analysis filters using (3.227) and similarly
to what we have done in the two-channel case:
$$\widetilde{g}_{i,0,n} = \widetilde{g}_{i,3n} = g_{i,-3n} \;\overset{\text{ZT}}{\longleftrightarrow}\; \widetilde{G}_{i,0}(z) = \sum_{n\in\mathbb{Z}} g_{i,-3n}\, z^{-n},$$
$$\widetilde{g}_{i,1,n} = \widetilde{g}_{i,3n-1} = g_{i,-3n+1} \;\overset{\text{ZT}}{\longleftrightarrow}\; \widetilde{G}_{i,1}(z) = \sum_{n\in\mathbb{Z}} g_{i,-3n+1}\, z^{-n},$$
$$\widetilde{g}_{i,2,n} = \widetilde{g}_{i,3n-2} = g_{i,-3n+2} \;\overset{\text{ZT}}{\longleftrightarrow}\; \widetilde{G}_{i,2}(z) = \sum_{n\in\mathbb{Z}} g_{i,-3n+2}\, z^{-n},$$
$$\widetilde{G}_i(z) = G_{i,0}(z^{-3}) + z\, G_{i,1}(z^{-3}) + z^2\, G_{i,2}(z^{-3}).$$
The three polyphase components are:
$$\widetilde{g}_0 = \begin{bmatrix} \ldots & \widetilde{g}_{-3} & \widetilde{g}_0 & \widetilde{g}_3 & \widetilde{g}_6 & \ldots \end{bmatrix}^T = \begin{bmatrix} \ldots & g_3 & g_0 & g_{-3} & g_{-6} & \ldots \end{bmatrix}^T,$$
$$\widetilde{g}_1 = \begin{bmatrix} \ldots & \widetilde{g}_{-4} & \widetilde{g}_{-1} & \widetilde{g}_2 & \widetilde{g}_5 & \ldots \end{bmatrix}^T = \begin{bmatrix} \ldots & g_4 & g_1 & g_{-2} & g_{-5} & \ldots \end{bmatrix}^T,$$
$$\widetilde{g}_2 = \begin{bmatrix} \ldots & \widetilde{g}_{-5} & \widetilde{g}_{-2} & \widetilde{g}_1 & \widetilde{g}_4 & \ldots \end{bmatrix}^T = \begin{bmatrix} \ldots & g_5 & g_2 & g_{-1} & g_{-4} & \ldots \end{bmatrix}^T.$$
Note that $\widetilde{G}_i(z) = G_i(z^{-1})$, as in the two-channel case. With these definitions, the analysis polyphase matrix is:
$$\widetilde{\Phi}_p(z) = \begin{bmatrix} G_{0,0}(z^{-1}) & G_{1,0}(z^{-1}) & G_{2,0}(z^{-1}) \\ G_{0,1}(z^{-1}) & G_{1,1}(z^{-1}) & G_{2,1}(z^{-1}) \\ G_{0,2}(z^{-1}) & G_{1,2}(z^{-1}) & G_{2,2}(z^{-1}) \end{bmatrix} = \Phi_p(z^{-1}).$$
Figure 2.3 shows the polyphase implementation of the system, with the reconstruction of the original sequence using the synthesis polyphase matrix on the
right¹² and the computation of projection sequences αi on the left; note that as
¹² Remember that we typically put the lowpass filter in the lower branch, but in matrices it
appears in the first row/column, leading to a slight inconsistency when the filter bank is depicted
in the polyphase domain.
Figure 2.3: A 3-channel analysis/synthesis filter bank in polyphase domain.
usual, the analysis matrix (polyphase here) is taken as a transpose (to check it,
we could mimic what we did in Section 1.2.4; we skip it here).
The upshot of all this algebra is that we now have a very compact input-output relationship between the input (decomposed into polyphase components)
and the result coming out of the synthesis filter bank:
$$X(z) = \begin{bmatrix} 1 & z^{-1} & z^{-2} \end{bmatrix} \Phi_p(z^3)\, \Phi_p^*(z^{-3}) \begin{bmatrix} X_0(z^3) \\ X_1(z^3) \\ X_2(z^3) \end{bmatrix}.$$
Note that we use the Hermitian transpose here because we will often deal with
complex-coefficient filter banks in this chapter. The conjugation is applied only to
coefficients and not to z. The above example went through various polyphase
concepts for an orthogonal 3-channel filter bank. We now summarize the same
concepts for a general, biorthogonal N -channel filter bank, and characterize classes
of solutions using polyphase machinery.
Using (3.227), in an N -channel filter bank, the polyphase decomposition of
the input sequence, synthesis and analysis filters, respectively, is given by:
$$x_{j,n} = x_{Nn+j} \;\overset{\text{ZT}}{\longleftrightarrow}\; X_j(z) = \sum_{n\in\mathbb{Z}} x_{Nn+j}\, z^{-n}, \qquad (2.12a)$$
$$X(z) = \sum_{j=0}^{N-1} z^{-j}\, X_j(z^N), \qquad (2.12b)$$
$$g_{i,j,n} = g_{i,Nn+j} \;\overset{\text{ZT}}{\longleftrightarrow}\; G_{i,j}(z) = \sum_{n\in\mathbb{Z}} g_{i,Nn+j}\, z^{-n}, \qquad (2.12c)$$
$$G_i(z) = \sum_{j=0}^{N-1} z^{-j}\, G_{i,j}(z^N), \qquad (2.12d)$$
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
2.2. N -Channel Filter Banks
gei,j,n = gei,N n−j
65
ZT
←→
ei,j (z) =
G
ei (z) =
G
X
n∈Z
N
−1
X
j=0
leading to the corresponding polyphase matrices:

G0,0 (z)
G1,0 (z)
...
G0,1 (z)
G
(z)
...
1,1

Φp (z) = 
..
..
..

.
.
.
G0,N −1 (z)
e
G0,0 (z)
G
e 0,1 (z)
e p (z) = 
Φ

..

.
e
G0,N −1 (z)
G1,N −1 (z)
e 1,0 (z)
G
e 1,1 (z)
G
..
.
e
G1,N −1 (z)
...
...
...
..
.
...
gei,N n−j z −n ,
e i,j (z N ),
zj G
GN −1,0 (z)
GN −1,1 (z)
..
.



,

GN −1,N −1 (z)
eN −1,0 (z) 
G
eN −1,1 (z) 
G

.
..

.
e
GN −1,N −1 (z)
(2.12e)
(2.12f)
(2.13a)
(2.13b)
This formulation allows us to characterize classes of solutions. We state these
without proof as they follow easily from the equivalent two-channel filter bank
results, and can be found in the literature.
Theorem 2.1 (N -channel filter banks in polyphase domain) Given
e p (z). Then:
an N -channel filter bank and the polyphase matrices Φp (z), Φ
is
(i) The filter bank implements a biorthogonal expansion if and only if
e ∗p (z) = I.
Φp (z)Φ
(2.14a)
Φp (z)Φ∗p (z −1 ) = I,
(2.14b)
(ii) The filter bank implements an orthonormal expansion if and only if
that is, Φp (z) is paraunitary.
(iii) The filter bank implements an FIR biorthogonal expansion if and only if
Φp (z) is unimodular (within scaling), that is, if
det(Φp (z)) = αz −k .
(2.14c)
Note that we use the Hermitian transpose in (2.14b) because we will often deal
with complex-coefficient filter banks in this chapter. The conjugation is applied only
to coefficients and not to z.
Design of N -Channel Filter Banks In the next two sections, we will discuss two
particular N -channel filter bank design options, in particular, those that add localization features to the DFT. To design general N -channel orthogonal filter banks,
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
66
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 2. Local Fourier Bases on Sequences
we must design N × N paraunitary matrices. As in the two-channel case, where
such matrices can be obtained by a lattice factorization (see Section 1.3.3), N × N
paraunitary matrices can be parameterized in terms of elementary matrices (2 × 2
rotations and delays). Here, we just give an example of a design of a 3 × 3 paraunitary matrix leading to a 3-channel orthogonal filter bank; pointers to literature are
given in Further Reading.
Example 2.2 (Orthogonal N -channel filter banks) One way of parameterizing paraunitary matrices is via the following factorization:
"K−1
#
Y
−1
Φp (z) = U0
diag([z , 1, 1]) Uk ,
(2.15a)
k=1
where
U0
Uk



1
0
0 cos θ01 0 − sin θ01
cos θ02
1
0  sin θ02
= 0 cos θ00 − sin θ00   0
0 sin θ00
cos θ00
sin θ01 0
cos θ01
0



cos θk0 − sin θk0 0 1
0
0
cos θk0 0 0 cos θk1 − sin θk1  .
=  sin θk0
0
0 1 0 sin θk1
cos θk1

− sin θ02 0
cos θ02 0 ,
0 1
(2.15b)
The degrees of freedom in design are given by the angles θkj . This freedom in
design allows for constructions of orthogonal and linear-phase FIR solutions, not
possible in the two-channel case.
2.3
Complex Exponential-Modulated Local Fourier
Bases
At the start of the previous section, we considered two extreme cases of local Fourier
representations implementable by N -channel filter banks: those based on the DFT
(box in time/sinc in frequency, good time/poor frequency localization) and those
based on ideal bandpass filters (sinc in time/box in frequency, good frequency/poor
time localization). These two particular representations have something in common; as implemented via N -channel orthogonal filter banks, they are both obtained
through a complex-exponential modulation of a single prototype filter.
Complex-Exponential Modulation Given a prototype filter p = g0 , the rest of the
filters are obtained via complex-exponential modulation:
gi,n = pn ej(2π/N )in = pn WN−in ,
Gi (z) = P (WNi z),
(2.16)
Gi (ejω ) = P (ej(ω−(2π/N )i) ) = P (WNi ejω ),
for i = 1, 2, . . . , N −1. This is clearly true for the DFT basis from (2.2), (2.5), as well
as that constructed from the ideal filters (2.6) (see also Exercise ??). A filter bank
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
2.3. Complex Exponential-Modulated Local Fourier Bases
67
implementing such an expansion is often called complex exponential-modulated filter
bank. While the prototype filter p = g0 is typically real, the rest of the bandpass
filters are complex.
2.3.1
Balian-Low Theorem
We are now back to the question whether we can find complex exponential-modulated
local Fourier bases with a trade-off of time and frequency localization we have seen
for the DFT and sinc bases in Figure 2.2. To that end, we might want to worsen
the time localization of the DFT a bit in the hope of improving the frequency
one; unfortunately, the following result excludes the possibility of having complex
exponential-modulated local Fourier bases with support longer than N :13
Theorem 2.2 (Discrete Balian-Low theorem) There does not exist a complex exponential-modulated local Fourier basis implementable by an N -channel
FIR filter bank, except for a filter bank with filters of length N .
Proof. To prove the theorem, we analyze the structure of the polyphase matrix of a
complex exponential-modulated filter bank with filters as in (2.16). Given the polyphase
representation (2.12d) of the prototype filter p = g0 ,
P (z) = P0 (z N ) + z −1 P1 (z N ) + . . . + z −(N−1) PN−1 (z N ),
the modulated versions become
Gi (z) = P (WNi z) = P0 (z N ) + WN−i z −1 P1 (z N ) + . . . + WN
−(N−1)i −(N−1)
z
for i = 1, 2, . . . , N − 1. As an example, for N = 3, the polyphase matrix is


P0 (z)
P0 (z)
P0 (z)
−1
−2
Φp (z) = P1 (z) W3 P1 (z) W3 P1 (z)
P2 (z) W3−2 P2 (z) W3−1 P2 (z)



1
1
1
P0 (z)
−1
−2
 1 W3
= 
P1 (z)
W3 ,
P2 (z)
1 W3−2 W3−1
|
{z
}
PN−1 (z N )
(2.17)
F∗
that is, a product of a diagonal matrix of prototype filter polyphase components and
the conjugated DFT matrix (3.161a). According to Theorem 2.1, this filter bank implements an FIR biorthogonal expansion if and only if Φp (z) is a monomial. So,
det(Φp (z)) =
N−1
Y
j=0
Pj (z) det(F ∗ ) =
| {z }
1/
√
N
√
N
N−1
Y
Pj (z),
(2.18)
j=0
13 This result is known as the Balian-Low theorem in the continuous-domain setting, see Section 5.4.1.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
68
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 2. Local Fourier Bases on Sequences
is a monomial if and only if each polyphase component is; in other words, each polyphase
component of P (z) has exactly one nonzero term, or, P (z) has N nonzero coefficients
(one from each polyphase component).
While the above theorem is a negative result in general, the proof shows the factorization (2.17) that can be used to derive a fast algorithm, shown in Section 2.5 (the
same factorization is used in Solved Exercise ?? to derive the relationship between
the modulation and polyphase matrices). Rewriting (2.17) for general N , as well
as the analysis polyphase version of it,
Φp (z) = diag([P0 (z), P1 (z), . . . , PN −1 (z)]) F ∗ ,
e p (z) = diag([G̃0,0 (z), G̃0,1 (z), . . . , G̃0,N −1 (z)]) F ∗ , ,
Φ
(2.19a)
(2.19b)
this filter bank implements a basis expansion if and only if
diag([P0 (z), . . . , PN −1 (z)]) diag([G̃0,0 (z), . . . , G̃0,N −1 (z)])∗ = z −k N I,
e ∗p (z) = z −k I (N
a more constrained condition then the general one of Φp (z)Φ
appears here since we are using the unnormalized version of F ). We also see exactly
what the problem is in trying to have an orthogonal complex exponential-modulated
filter bank with filters of length longer than N : If the filter bank were orthogonal,
then G̃0 (z) = G0 (z −1 ) = P (z −1 ), and the above would reduce to
Φp (z)Φ∗p (z −1 ) = diag([P0 (z), P1 (z), . . . , PN −1 (z)]) F ∗
F diag([P0 (z −1 ), P1 (z −1 ), . . . , PN −1 (z −1 )])∗
= N diag([P0 (z)P0 (z −1 ), P1 (z)P1 (z −1 ), . . . , PN −1 (z)PN −1 (z −1 )])
= N I,
possible with FIR filters if and only if each polyphase component Pj (z) of the prototype filter P (z) were exactly of length 1 (we assumed the prototype to be real). Figure 2.4 depicts a complex exponential-modulated filter bank with N = 3 channels.
Solved Exercise ?? explores relationships between various matrix representations of
a 3-channel complex exponential-modulated filter bank.
2.3.2
Application to Power Spectral Density Estimation
[NOTE: This is a total mess. Not clear what is deterministic, what stochastic.
We say we talk about periodograms and put a footnote explaning the difference
between periodograms and spectrograms, but then continue on to explain
something that sounds more like a spectrogram. Needs serious scrutiny.] We
now discuss the computation of periodograms, a widely used application of complex
exponential-modulated filter banks.14
Given a discrete stochastic process, there exist various ways to estimate its
autocorrelation. In Chapter 3, we have seen that, for a discrete WSS process, there
14 The terms periodogram and spectrogram should not be confused with each other: the former
computes the estimate of the power spectral density of a sequence, while the latter shows the
dependence of the power spectral density on time.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
2.3. Complex Exponential-Modulated Local Fourier Bases
x
3
g̃0,0
z
3
g̃0,1
z2
3
g̃0,2
α0
F
α1
F∗
α2
69
g0,0
3
g0,1
3
z −1 +
g0,2
3
z −2
x
Figure 2.4: A 3-channel analysis/synthesis complex exponential-modulated filter bank,
with analysis filters G̃i (z) = G̃0 (W3i z) and synthesis filters Gi (z) = G0 (W3i z) = P (W3i z).
F and F ∗ are unnormalized DFT matrices (thus 3x at the output) and are implemented
using FFTs.
is a direct link between the autocorrelation of a discrete stochastic process and the
power spectral density, given by (3.239). However, when the process is changing
over time, and we are interested in local behavior, we need a local estimate of
the autocorrelation, and therefore, a local power spectral density. We thus need a
local Fourier transform, by windowing the sequence appropriately, and squaring the
Fourier coefficients to obtain a local power spectral density.
Block-Based Power Spectral Density Estimation A straightforward way to estimate local power spectral density is simply to cut the sequence into adjacent, but
nonoverlapping, blocks of size M ,
[. . . x−1 x0 x1 . . .] = [. . . xnM . . . xnM+M−1 x(n+1)M . . . x(n+1)M+M−1 . . .],
{z
} |
|
{z
}
bn
bn+1
(2.20)
with bn the nth block of length M , and then take a length-M DFT of each block,
Bn = F b n ,
(2.21)
with F from (3.161a). Squaring the magnitudes of the elements of Bn leads to an
approximation of a local power spectral density, known as a periodogram.
While this method is simple and computationally attractive (using an order
O(log M ) operations per input sample), it has a major drawback we show through
a simple example, when the sequence is white noise, or xn is i.i.d. with variance
σx2 . Since F is a unitary transform (within scaling), or, a rotation in M dimensions,
the entries of Bn are i.i.d. with variance σx2 , independently of M (see Exercise ??).
The power spectral density is a constant, but while the resolution increases with
M , the variance does not diminish.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
70
Chapter 2. Local Fourier Bases on Sequences
Figure 2.5: Power spectral density from (2.23). (a) Theoretical, as well as local estimates
computed using (2.21) on the blocked version of the source yn , with blocks of length (b)
M = 64, (c) M = 256 and (d) M = 1024, respectively.
Figure 2.6: Averaged power spectral density from (2.25). The theoretical power spectral
density is the same as in Figure 2.5(a). (a) Average of 16 blocks of length 64. (b) Average
of 4 blocks of length 256.
Example 2.3 (Block-based power spectral density estimation) Consider
a source generated by filtering white Gaussian noise with a causal filter
H(z) =
1 − αz −1
,
(1 − 2β cos ω0 z −1 + β 2 z −1 )
ROC = {z | |z| >
1
},
β
(2.22)
where α, β are real and 1 < β < ∞. This filter has poles at (1/β)e±jω0 and
zeroes at (1/α) and ∞. The power spectral density of y = h ∗ x is
Ay (ejω ) =
|1 − αe−jω |2
,
|1 − 2β cos ω0 e−jω + β 2 e−j2ω |2
(2.23)
plotted in Figure 2.5(a) for α = 1.1, β = 1.1 and ω0 = 2π/3. Figures 2.5 (b), (c)
and (d) show the power spectral density calculated using (2.21) on the blocked
version of yn , with blocks of length M = 64, 256 and 1024. While the shape of
the power spectral density can be guessed, the variance does indeed not diminish.
Averaged Block-Based Power Spectral Density Estimation When the sequence
is stationary, the obvious fix is to average several power spectra. Calling An,k the
block-based power spectral density, or, from (2.21)
An,k = |Bn,k |2 = |(F bn )k |2 ,
α3.2 [January 2013] [free version] CC by-nc-nd
(2.24)
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
2.3. Complex Exponential-Modulated Local Fourier Bases
71
Figure 2.7: Rectangular, triangle and Hamming windows of length M = 31, centered
at the origin. (a) Time-domain sequences. (b) DTFT magnitude responses (in dB).
Figure 2.8: Windowing with 50% overlap using a triangle window.
we can define an averaged local power spectrum by summing K successive ones,
(K)
An,k =
K−1
1 X
An,k ,
K n=0
(2.25)
(K)
known as an averaged periodogram. Exercise ?? shows that the variance of An,k
is about 1/K the variance of An,k . Given a length-L (L = KM ) realization of a
stationary process, we can now vary M or K to achieve a trade-off between spectral
resolution (large M ) and small variance (large K).
Example 2.4 (Averaged block-based power spectral density estimation)
Continuing Example 2.3, we now have a realization of length L = 1024, but would
like to reduce the variance of the estimate by averaging. While any factorization
of 1024 into K blocks of length 1024/K, for K = 2i , i = 0, 1, . . . , 10, is possible, too many blocks lead to a poor frequency resolution (K = 1 was shown
in Figure 2.5 (d)). We consider two intermediate cases, 16 blocks of length 64
and 4 blocks of length 256, shown in Figure 2.6(a) and (b). These should be
compared to Figure 2.5 (b) and (c), where the same block size was used, but
without averaging.
Estimation Using Windowing and Overlapping Blocks In practice, both the periodogram and its averaged version are computed using a window, and possibly
overlapping blocks.
We first discuss windowing. In the simplest case, the block bn in (2.20) corresponds to applying a rectangular window from (3.13a) otherwise, shifted to location
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
72
Chapter 2. Local Fourier Bases on Sequences
H(z)
N
|.|2
S(z)
S
H(wz)
N
|.|2
S(z)
S
..
.
..
.
..
.
..
.
..
.
H(wN −1 z)
N
|.|2
S(z)
S
xn
Figure 2.9: Filter-bank implementation of the periodogram and averaged periodogram
using a complex exponential-modulated filter bank with filters (2.16). The sampling factor
N indicates the overlapping between blocks (N = M , basis; N ≤ M , frame); |.|2 computes
the squared magnitude; S(z) computes the K-point averaging filter; and finally, the output
is possibly downsampled by S. (TfBD: s should be S, K should be N , N should be M .)
nM (most often the nonunit-norm version, so that the height of the window is 1).
To smooth the boundary effects, smoother windows are used, of which many designs
are possible. All windows provide a trade-off between the width of the main lobe
(the breadth of the DTFT around zero, typically of the order of 1/M ), and the
height of the side lobes (the other maxima of the DTFT). Exercise ?? considers
a few such windows and their respective characteristics. The upshot is that the
rectangular window has the narrowest main but the highest side lobe, while others,
such as the triangle window, have lower side but broader main lobes. Figure 2.7
shows three commonly used windows and their DTFT magnitude responses in dB.
Instead of computing the DFT on adjacent, but nonoverlapping blocks of size
M as in (2.20), we can allow for overlap between adjacent blocks, for example
(assume for simplicity M is even),
bn =
xnM/2
xnM/2+1
xnM/2+2
. . . xnM/2+M−1 ,
(2.26)
a 50% overlap, shown in Figure 2.8, using a triangle window for illustration. In
general, the windows move by N that is smaller or equal to M .
Filter-Bank Implementation The estimation process we just discussed has a natural filter bank implementation. Consider a prototype filter g̃0 = p (we assume a symmetric window so time reversal is not an issue), and construct an M -channel complex
exponential-modulated filter bank as in Figure 2.1 and (2.16). The prototype filter
computes the windowing, and the modulation computes the DFT. With the sampling factor N = M , we get a critically sampled, complex exponential-modulated
filter bank. The sampling factor N can be smaller than M , in which case the resulting filter bank implements a frame, discussed in Chapter 4 (see Figure 4.14).
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
2.3. Complex Exponential-Modulated Local Fourier Bases
73
Squaring the output computes a local approximation to the power spectral density. Averaging the output over K outputs computes the averaged periodogram,
accomplished withP
a K-point averaging filter on each of the M filter bank outputs,
K−1
or S(z) = (1/K) m=0 z −m . Finally, the output of the averaging filters maybe
downsampled by a factor S ≤ K. We have thus constructed a versatile device to
compute local power spectral density, summarized in Figure 2.9, and Table 2.1.
Parameter
Filter-Bank Operation
Computes
M
Number of channels
G̃0 (z) = P (z)
N
Prototype filter
Downsampling factor
S(z)
Channel filter
Number of frequency bins
(frequency resolution of the analysis)
Windowing
Overlap between adjacent blocks
(N = M , basis; N < M , frame)
Averaging and variance reduction
Table 2.1: Complex exponential-modulated filter-bank implementation of block-based
power spectral density estimation.
The discussion so far focused on nonparametric spectral estimation, that is,
we assumed no special structure for the underlying power spectral density. When
a deterministic sequence, like a sinusoid, is buried in noise, we have a parametric
spectral estimation problem, since we have prior knowledge on the underlying deterministic sequence. While the periodogram can be used here as well (with the
effect of windowing now spreading the sinusoid), there exist powerful parametric
estimation methods specifically tailored to this problem (see Exercise ??).
2.3.3
Application to Communications
Transmultiplexers15 are used extensively in communication systems. They are at
the heart of orthogonal frequency division multiplexing (OFDM), a modulation
scheme popular both in mobile communications as well as in local wireless broadband systems such as IEEE 802.11 (Wi-Fi).
As we have seen in Chapter 1 and Solved Exercise ??, a transmultiplexer exchanges the order of analysis/synthesis banks. If the filters used are the complex
exponential-modulated ones as we have seen above, transmultiplexers become computationally efficient. For example, consider N sequences αi , i = 0, 1, . . . , N −
1, entering an N -channel complex exponential-modulated synthesis filter bank,
to produce a synthesized sequence x. Analyzing x with an N -channel complex
exponential-modulated analysis filter bank should yield again N sequences αi ,
i = 0, 1, . . . , N − 1 (Figure 2.10).
Similarly to what we have seen earlier, either length-N filters (from the DFT
(2.2)) or sinc filters, (2.6), lead to a basis. Using a good lowpass leads to approximate
reconstruction (see Exercise ?? for an exploration of the end-to-end behavior).
15 Such devices were used to modulate a large number of phone conversations onto large bandwidth transatlantic cables.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
74
Chapter 2. Local Fourier Bases on Sequences
α0
N
g0
g0
e
N
α0
α1
N
g1
N
α1
..
.
..
.
..
.
g1
e
..
.
..
.
..
.
αN −1
N
gN −1
geN −1
N
αN −1
+
x
Figure 2.10: A transmultiplexer modulates N sequences into a single sequence x of
N -times higher bandwidth, and analyzes it into N channels.
In typical communication scenarios, a desired signal is sent over a channel with
impulse response c(t) (or equivalently cn in a bandlimited and sampled case), and
thus, the received signal is the input convolved with the channel. Often, the effect
of the channel needs to be canceled, a procedure called channel equalization. If the
channel is LSI, this equalization can be performed in Fourier domain, assuming cn is
known (either a priori or measured). This is a first motivation for using a Fourierlike decomposition. Moreover, as the complex sinusoids are eigensignals of LSI
systems, such complex sinusoids (or approximations thereof) are good candidates
for signaling over a known LSI channel. Namely, an input sinusoid of frequency ω0
and a known amplitude A (or a set of possible amplitudes {Ai }), will come out of the
channel as a sinusoid, scaled by the channel frequency response at ω0 , and perturbed
by additive channel noise present at that frequency. Digital communication amounts
to being able to distinguish a certain number of signaling waveforms per unit of time,
given a constraint on the input (such as maximum power).
It turns out that an optimal way to communicate over an LSI channel with
additive Gaussian noise is precisely to use Fourier-like waveforms. While an ideal
system would require a very large number of perfect bandpass channels, practical
systems use a few hundred channels (for example, 256 or 512). Moreover, instead
of perfect bandpass filters (which require sinc filters), approximate bandpass filters
based on finite windows are used. This time localization also allows to adapt to a
changing channel, for example, in mobile communications.
The system is summarized in Figure 2.11, for M channels upsampled by N .16
Such a device, allowing to put M sequences {xi }i=0,...,M−1 , onto a single channel
of N -times larger bandwidth, has been historically known as a transmultiplexer.
When the prototype filter is a rectangular window of length M , in the absence
of channel effects, the synthesis/analysis complex exponential-modulated filter bank
is perfect reconstruction. When a channel is present, and the prototype filter is a
16 Again,
if M > N , such a filter bank implements a frame, discussed in Chapter 4.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
2.4. Cosine-Modulated Local Fourier Bases
75
x0
N
g0
g0
e
N
x1
N
g1
N
..
.
..
.
..
.
g1
e
..
.
..
.
xM−1
N
gM−1
geM−1
N
+
c
Figure 2.11: Communication over a channel using a complex exponential-modulated
transmultiplexer. When M = N , it is critically sampled, while for M > N , the sequence
entering the channel is redundant.
perfect lowpass filter of bandwidth [−π/M, π/M ), each bandpass channel is affected
by the channel independently of the others, and can be individually equalized.
For filters with finite impulse response, one can either use a narrower-band
prototype (so neighboring channels do not interact), or, use fewer than the critical
number of channels, which in both cases means some redundancy is left in the
synthesized sequence that enters the channel.
2.4
Cosine-Modulated Local Fourier Bases
A possible escape from the restriction imposed by the Balian-Low theorem is to
replace complex-exponential modulation (multiplication by WNi = e−j2πi/N ) with
an appropriate cosine modulation. This has an added advantage that all filters are
real if the prototype is real.
Cosine Modulation Given a prototype filter p, all of the filters are obtained via
cosine modulation:
2π
1
gi,n = pn cos
(i + )n + θi
(2.27)
2N
2
h
i
1 jθi −(i+1/2)n
(i+1/2)n
= pn
e W2N
+ e−jθi W2N
,
2
h
i
1 jθi
(i+1/2)
−(i+1/2)
Gi (z) =
e P (W2N
z) + e−jθi P (W2N
z) ,
2
i
1 h jθi
jω
Gi (e ) =
e P (ej(ω−(2π/2N )(i+1/2)) ) + e−jθi P (ej(ω+(2π/2N )(i+1/2) ) ,
2
for i = 0, 1, . . . , N −1, and θi is a phase factor that gives us flexibility in designing the
representation; you may assume it to be 0 for now. Compare the above with (2.16)
for the complex-exponential modulation; the difference is that given a real prototype
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
76
Chapter 2. Local Fourier Bases on Sequences
|Gi (ejω )|2
0
1
2
2π
6
4π
6
3
4
8π
6
π
(a)
5
10π
6
2π
|Gi (ejω )|2
0
1
2
π
12
3π
12
5π
12
3
4
···
5
5
π
4
3
2
···
1
0
2π
(b)
Figure 2.12: Complex exponential-modulated versus cosine-modulated filter bank with
N = 6. (a) In the complex case, the bandwidth of the prototype is 2π/6, and the center
frequencies are 2πi/6, i = 0, 1, . . . , 5. (b) In the cosine case, the bandwidth of the prototype
is 2π/12, and the center frequencies are (2i + 1)π/12, i = 0, 1, . . . , 5. Unlike in (a), the
first filter does not correspond to the prototype, but is modulated to π/12.
filter, all the other filters are real. Moreover, the effective bandwidth, while 2π/N
in the case of complex-exponential modulation, is π/N here. The difference occurs
because, the cosine-modulated filters being real, have two side lobes, which reduces
the bandwidth per side lobe by two. The modulation frequencies follow from an
even coverage of the interval [0, π] with side lobes of width π/N . This is illustrated
in Figure 2.12 for N = 6 for both complex as well as cosine modulation.
Will such a modulation lead to an orthonormal basis? One possibility is to
choose an ideal lowpass filter as prototype, with support [−π/2N, π/2N ). However,
as we know, this leads to a sinc-like basis with infinite and slowly-decaying impulse
responses. Another solution is a block transform, such as the discrete cosine transform (DCT) discussed in Solved Exercise ??, too short for an interesting analysis.
Fortunately, other solutions exist, with FIR filters of length longer than N , which
we introduce next.
2.4.1
Lapped Orthogonal Transforms
The earliest example of such cosine-modulated bases was developed for filters of
length 2N , implying that the nonzero support of the basis sequences overlaps by
N/2 on each side of a block of length N (see Figure 2.13), earning them the name
lapped orthogonal transforms (LOT).
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
2.4. Cosine-Modulated Local Fourier Bases
g0,n+8
77
g0,n
g0,n−8
8
n
16
Figure 2.13: LOT for N = 8. The filters are of length 2N = 16, and thus, they overlap
with their nearest neighbors by N/2 = 4 (only g0 is shown). Tails are orthogonal since the
left half (red) is symmetric and the right half (orange) is antisymmetric.
LOTs with a Rectangular Prototype Window
Consider N filters g0 , g1 , . . . , gN −1 of length 2N given by (2.27). We start with a
rectangular prototype window filter:
1
pn = √ ,
n = 0, 1, . . . , 2N − 1,
(2.28)
N
√
where, by choosing pn = 1/ N , we ensured that kgi k = 1, for all i.
As we always do, we first find conditions so that the set {gi,n−N k }, k ∈ Z,
i ∈ {0, 1, . . . , N − 1}, is an orthonormal set. To do that, we prove (2.7) and (2.9).
Orthogonality of a Single Filter To prove that a single filter gi is orthogonal to its
shifts by N , it is enough to prove this for just two neighboring shifts (as the length
of the filters is 2N , see Figure 2.13). An easy way to force this orthogonality would
be if the left half (tail) of the filter support (from 0, 1, . . . , N − 1) were symmetric
around its midpoint (N − 1)/2, while the right half (tail) of the filter support (from
N, N + 1, . . . , 2N − 1) were antisymmetric around its midpoint (3N − 1)/2. Then,
the inner product hgi,n , gi,n−N i would amount to the inner product of the right tail
of gi,n with the left tail of gi,n−N , and would automatically be zero as a product
of a symmetric sequence with an antisymmetric sequence. The question is whether
we can force such conditions on all the filters. Fortunately, we have a degree of
freedom per filter given by θi , which we choose to be
θi = −
2π
1 N −1
(i + )
.
2N
2
2
(2.29)
After substituting it into (2.27), we get
2π
1
N −1
1
gi,n = √ cos
(i + ) (n −
) .
2N
2
2
N
We now check the symmetries of the tails; for the left tail,
1
2π
1
N −1
(a)
gi,N −n−1 = √ cos
(i + ) (−n +
) = gi,n ,
2N
2
2
N
α3.2 [January 2013] [free version] CC by-nc-nd
(2.30)
(2.31a)
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
78
Chapter 2. Local Fourier Bases on Sequences
for n = 0, 1, . . . , N/2 − 1, that is, it is indeed symmetric. In the above, (a) follows
from the symmetry of the cosine function. Similarly, for the right tail,
gi,2N −n−1 =
(a)
=
=
=
(b)
1
√
N
1
√
N
1
√
N
1
√
N
2π
cos
2N
2π
cos
2N
2π
cos
2N
2π
cos
2N
(i +
(i +
(i +
(i +
1
3N − 1
) (−n +
)
2
2
1
3N − 1
) (n −
)
2
2
1
N +1
) (n +
− 2N )
2
2
1
N +1
) (n +
)+π
2
2
= −gi,N +n ,
(2.31b)
for n = 0, 1, . . . , N/2 − 1, that is, it is indeed antisymmetric. In the above, (a)
follows from the symmetry of the cosine function and (b) from cos(θ + π) = cos(θ).
An LOT example with N = 8 is given in Figure 2.14.
Orthogonality of Filters We now turn our attention to showing that all the filters
are orthogonal to each other (and their shifts). As we have done in (2.27), we use
(3.283), to express gi from (2.30)
gi,n =
1 (i+1/2)(n−(N −1)/2)
−(i+1/2)(n−(N −1)/2)
√
+ W2N
.
W2N
2 N
(2.32)
The inner product between two different filters is then:
hgi , gk i =
2N −1
1 X (i+1/2)(n−(N −1)/2)
−(i+1/2)(n−(N −1)/2)
W2N
+ W2N
4N n=0
(k+1/2)(n−(N −1)/2)
−(k+1/2)(n−(N −1)/2)
W2N
+ W2N
2N −1
1 X (i+k+1)(n−(N −1)/2)
(i−k)(n−(N −1)/2)
=
W2N
+ W2N
+
4N n=0
−(i−k)(n−(N −1)/2)
−(i+k+1)(n−(N −1)/2)
W2N
+ W2N
.
To show that the above inner product is zero, we show that each of the four sums
are zero. We show it for the first sum; the other three follow the same way.
2N −1
2N
−1
X
1 X
1
−(N −1)/2
(i+k+1) n
(i+k+1)(n−(N −1)/2)
=
W
(W2N
) = 0, (2.33)
W
4N n=0 2N
4N 2N
n=0
because of the orthogonality of the roots of unity (3.285c).
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
2.4. Cosine-Modulated Local Fourier Bases
g0,n
79
g2,n
g1,n
n
n
g3,n
g4,n
n
g5,n
n
n
g6,n
n
|Gi (ejω )|
g7,n
n
n
ω
Figure 2.14: LOT for N = 8 with a rectangular prototype window. The eight basis
sequences (note the symmetric and antisymmetric tails) and their magnitude responses,
showing the uniform split of the spectrum.
Matrix View As usual, we find the matrix view of an expansion to be illuminating.
We can write the output of an LOT synthesis bank similarly to (1.7),

 .



Φ = 




α3.2 [January 2013] [free version] CC by-nc-nd

..
G0
G1
G0
G1
G0
G1
..
.




,




(2.34)
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
80
Chapter 2. Local Fourier Bases on Sequences
where the columns of G0 and G1 are

g0,0
g0,1


..

.

g0,N −1
G0
= 
g0,N
G1

g0,N +1


..

.
g0,2N −1
the left and right tails respectively,

g1,0
. . . gN −1,0

g1,1
. . . gN −1,1


..
..
..

.
.
.

g1,N −1 . . . gN −1,N −1 
.

g1,N
. . . gN −1,N

g1,N +1 . . . gN −1,N +1 


..
..
..

.
.
.
g1,2N −1 . . . gN −1,2N −1
(2.35)
Since the expansion is orthonormal, ΦΦT = I, but also ΦT Φ = I, or,
G0 GT0 + G1 GT1 = I,
(2.36a)
= 0,
= I,
(2.36b)
(2.36c)
= 0.
(2.36d)
G1 GT0
GT0 G0
GT0 G1
=
+
=
G0 GT1
GT1 G1
GT1 G0
Following the symmetry/antisymmetry of the tails, the matrices G0 and G1 have
repeated rows. For example, for N = 4,




g0,0 g1,0 g2,0 g3,0
g0,4
g1,4
g2,4
g3,4
g0,1 g1,1 g2,1 g3,1 
 g0,5
g1,5
g2,5
g3,5 



G0 = 
g0,1 g1,1 g2,1 g3,1  and G1 = −g0,5 −g1,5 −g2,5 −g3,5  .
g0,0 g1,0 g2,0 g3,0
−g0,4 −g1,4 −g2,4 −g3,4
b0 and G
b1 the upper halves of G0 and G1 , respectively, we can express
Denoting by G
G0 and G1 as
IN/2 b
IN/2 b
G0 =
G0
and
G1 =
G1 ,
(2.37)
JN/2
−JN/2
where IN/2 is an N/2 × N/2 identity matrix and JN/2 is an N/2 × N/2 antidiagonal
2
matrix (defined in Section 2.B.2). Note that JN
= IN , and that premultiplying by
JN reverses the row order (postmultiplying reverses the column order).
From the above, both G0 and G1 have rank N/2. We can
√ easily check that
b 0 and G
b1 form an orthogonal set, with norm 1/ 2. Using all of the
the rows of G
above, we finally unearth the special structure of the LOTs:
IN/2 b bT IN/2 1
G0 GT0 =
G0 G0 IN/2 JN/2 =
IN/2 IN/2 JN/2
JN/2
JN/2 2
1 IN/2 JN/2
1
=
= (IN + JN ),
(2.38a)
2 JN/2 IN/2
2
1
1
IN/2 −JN/2
G1 GT1 =
= (IN − JN ),
(2.38b)
IN/2
2 −JN/2
2
GT0 G1 = GT1 G0 = 0.
α3.2 [January 2013] [free version] CC by-nc-nd
(2.38c)
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
2.4. Cosine-Modulated Local Fourier Bases
81
pn
|W (ejω )|
n
8
ω
(a)
(b)
′
g1,n
′
g0,n
′
g2,n
n
′
g3,n
n
′
g5,n
′
g4,n
n
′
g6,n
n
n
′
g7,n
n
|G′0 (ejω )|
n
n
ω
(c)
Figure 2.15: LOT for N = 8 with a smooth, power-complementary prototype window
from Table 2.2. Its (a) impulse response and (b) magnitude response. (c) The eight
windowed basis sequences and their magnitude responses. Note the improved frequency
resolution compared to Figure 2.14(b).
LOTs with a Nonrectangular Prototype Window
At this point, we have N filters of length 2N , but their impulse responses are simply
rectangularly-windowed cosine sequences. Such a rectangular prototype window is
discontinuous at the boundary, and thus not desirable; instead we aim for smooth
tapering at the boundary. We now investigate whether we can window our previous
solution with a smooth prototype window and still retain orthogonality.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
82
Chapter 2. Local Fourier Bases on Sequences
p0
p1
p2
p3
p4
p5
p6
p7
0.0887655
2366415
0.4238081
0.6181291
0.7860766
0.9057520
0.9715970
0.9960525
Table 2.2: Power-complementary prototype window used in Figure 2.15. The prototype
window is symmetric, so only half of the coefficients are shown.
For this, we choose a power-complementary,17 real and symmetric prototype
window sequence p, such that:
p2N −n−1 = pn ,
|pn | + |pN −n−1 |2 = 2,
(2.39a)
(2.39b)
2
for n = 0, 1, . . . , N − 1. Let
P0 = diag([p0 , p1 , . . . , pN −1 ]),
P1 = diag([pN , pN +1 , . . . , p2N −1 ]).
Then, (2.39) can be rewritten as
P1 = JN P0 JN ,
P02 + P12 = 2I.
(2.40a)
(2.40b)
The counterpart to (2.35) are now the windowed impulse responses
′
G0
P0 G0
P0
G0
=
=
.
G′1
P1 G1
JN P0 JN G1
(2.41)
Note that
P0 JN P0 = P1 JN P1 .
(2.42)
These windowed impulse responses have to satisfy (2.36), or, (2.38) (substituting
G′i for Gi ). For example, we check the orthogonality of the tails (2.38c):
(b)
IN/2 b
′ (a)
T
T
b
I
−J
G′T
G
=
G
(J
P
J
)
P
G
=
G
G0 ,
J
P
J
P
N 0 N
0 0
N 0 N 0
N/2
N/2
1
0
1
1
JN/2
where (a) follows from (2.41), and (b) from (2.37). As the product JN P0 JN P0 is
diagonal and symmetric (the kth entry is pk pN −k ), we get
I
IN/2 −JN/2 JN P0 JN P0 N/2 = 0.
JN/2
17 The term power complementary is typically used to denote a filter whose magnitude response
squared added to the frequency-reversed version of the magnitude response squared, sums to a
constant, as in (3.214). We use the term more broadly here to denote a sequence whose magnitude
squared added to the time-reversed version of the magnitude squared, sums to a constant.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
2.4. Cosine-Modulated Local Fourier Bases
83
To complete the orthogonality proof, we need to verify (2.36a) (with appropriate
substitutions as above),
(a)
G′0 (G′0 )T + G′1 G′T
= P0 G0 GT0 P0 + P1 G1 GT1 P1
1
1
(b) 1
= P0 (IN + JN )P0 + P1 (IN − JN )P1
2
2
1 2
1
(c)
2
= (P0 + P1 ) + (P0 JN P0 − P1 JN P1 ) = I,
|2
{z
} |2
{z
}
I
0
where (a) follows from (2.41); (b) from (2.38) and (c) from (2.40b) and (2.42). An
example of a windowed LOT is shown in Figure 2.15 for N = 8. The prototype
window is symmetric of length 16, with coefficients as in Table 2.2.
Shift-Varying LOT Filter Banks We end this section with a discussion of a variation on the theme of prototype windows, both for its importance in practice18 and
because it shows the same basic principles at work. Assume one wants to process
a sequence with an N -channel filter bank and then switch to a 2N -channel filter
bank. In addition, one would like a smooth rather than an abrupt transition. Interestingly, to achieve this, it is enough for the two adjacent prototype windows
to have overlapping tails that are power complementary (see Figure 2.16). Calling
p(L) and p(R) the two prototype windows involved, then
2
(R) 2
= 2
|p(L)
n | + |pn |
leads again to orthogonality of the overlapping tails of the two filter banks.
2.4.2
Application to Audio Compression
In Section 2.3.2, we have made numerous references to redundancy, which we will
discuss in Chapter 4. In compression, the opposite is required: we want to remove
the redundancy from the sequence as much as possible, and thus, typically, bases are
used (in particular, orthonormal bases). While we will discuss compression in detail
in Chapter 7, here, we just discuss its main theme: a small number of transform
coefficients should capture a large part of the energy of the original sequence. In
audio compression, the following characteristics are important:
(i) The spectrum is often harmonic, with a few dominant spectral components.
(ii) The human auditory system exhibits a masking effect such that a large sinusoid
masks neighboring smaller sinusoids.
(iii) Sharp transitions, or attacks, are a key feature of many instruments.
It is clear that (i) and (iii) are in contradiction. The former requires long prototype
windows, with local frequency analysis, while the latter requires short prototype
windows, with global frequency analysis. The solution is to adapt the prototype
window size, depending on the sequence content.
18 Audio coding schemes use this feature extensively, as it allows for switching the number of
channels in a filter bank, and consequently, the time and frequency resolutions of the analysis.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
84
Chapter 2. Local Fourier Bases on Sequences
(a)
(b)
(c)
(d)
Figure 2.16: An example of the flexibility allowed by LOTs illustrated through different
transitions from an 2-channel LOT to an 8-channel LOT. (a) Direct transition (both
prototype windows have the same tails, a restriction on the 8-channel LOT as its prototype
window must then be flat in the middle). (b) Transition using an asymmetric 4-channel
LOT prototype window (allows for a greater flexibility in the 8-channel LOT prototype
window). (c) Transition using an asymmetric 4-channel LOT prototype window and a
symmetric 4-channel LOT prototype window. (d) Transition using several symmetric 4channel LOT prototype windows (all the prototype windows are now symmetric and have
the same tails).
x(t)
frequency
t
(a)
(b)
t
Figure 2.17: Analysis of an audio segment using a cosine-modulated filter bank. (a)
Time-domain sequence. (b) Tiling of the time-frequency plane where shading indicates
the square of the coefficient corresponding to basis sequence situated at that specific timefrequency location.
Both for harmonic analysis and for windowing, including changing the size
of the filter bank (we have just seen this), we use cosine-modulated filter banks
similar to those from Section 2.4, creating an adaptive tiling of the time-frequency
plane, with local frequency resolution in stationary, harmonic segments, and local
time resolution in transition, or, attack phases. The best tiling is chosen based
on optimization procedures that try to minimize the approximation error when
keeping only a small number of transform coefficients (we discuss such methods in
Chapter 7). Figure 2.17 gives an example of adaptive time-frequency analysis.
For actual compression, in addition to an adaptive representation, a number
of other tricks come into play, related to perceptual coding (for example, masking),
quantization and entropy coding, all specifically tuned to audio compression.19
19 All of this is typically done off line, that is, on the recorded audio, rather than in real time.
This allows for complex optimizations, potentially using trial and error, until a satisfactory solution
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
2.5. Computational Aspects
85
pn
0.03
0.01
64
128
192
256
320
384
448
512
n
Figure 2.18: Impulse response of the prototype window sequence modulating the cosinemodulated filter bank used in MP3.
Example 2.5 (Filter banks used in audio compression) The MPEG audio standard20 , often called MP3 in consumer products, uses a 32-channel filter
bank. It is not a perfect reconstruction filter bank; rather, it uses a symmetric
prototype window pn of length L = 2N − 1 (in this case L = 511) with a symmetry around n = 255. The ith filter is obtained by modulation of the prototype
window as
2π
1
N
gi,n = pn cos
(i + )(n + ) ,
(2.43)
2N
2
2
for i = 0, 1, . . . , N −1. Comparing this to (2.30), we see that, except for the phase
factor, the cosine modulation is the same. Of course, the prototype window is
also different (it is of odd length hinting at the phase difference). The impulse
response of the prototype window used in MP3 is displayed in Figure 2.18. Such a
filter bank is called pseudo-QMF, because nearest neighbor aliasing is canceled as
in a classical two-channel filter bank.21 While aliasing from other, further bands,
is not automatically canceled, the prototype is a very good lowpass suppressing it
almost perfectly. The input-output relationship is not perfect (unlike for LOTs),
but again, with a good prototype window, it is almost perfect.
2.5
Computational Aspects
The expressions for the synthesis and analysis complex exponential-modulated filter
banks in (2.19a) and (2.19b) (see an illustration with three channels in Figure 2.4)
lead to the corresponding fast algorithms given in Tables 2.3 and 2.4.
Complex Exponential-Modulated Filter Banks We now look into the cost of implementing the analysis filter bank; the cost of implementing the synthesis one is
the same, as the two are dual to each other. Consider a prototype filter p of length
L = N M ; each polyphase component is then of length N .
is obtained.
20 While MPEG is a video standardization body, MP3 is its subgroup dealing with audio. Several
different versions of audio compression, of different complexity and quality, have been developed,
and the best of these, called layer III, gave the acronym MP3.
21 The QMF filters are discussed in Further Reading of Chapter 1, as well as Exercise ??.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
86
Chapter 2. Local Fourier Bases on Sequences
ComplexModSynthesis(p, {α0 , . . . , αN−1 })
Input: The prototype filter p and N channel sequences {α0 , . . . , αN−1 }.
Output: Original sequence x.
Decompose prototype p into its N polyphase components pj,n = pNn+j
for all n do
Fourier transform: Transform channel sequences with

1
1
1
 ′

α0,n

−1
−2
1
W
W

N
N
 α′


 1,n 
−2
−4

WN
WN

 = 1


...
.
..
..


 ..
.
.

α′N−1,n
−(N−1)
−2(N−1)
WN
1 WN
the scaled inverse DFT

...
1


α0,n
−(N−1) 
. . . WN

α1,n 

−2(N−1) 

. . . WN


.


.

..
..


.

.
.

α
2
N−1,n
−(N−1)
. . . WN
Convolution: Convolve each sequence α′j,n with the jth polyphase component of p

α′′
0,n
α′′
1,n
..
.









 = 






α′′
N−1,n
 
p0,n
p1,n
..
.
α′0,n
α′1,n
..
.

 

 

∗

 

 

pN−1,n
α′N−1,n
Inverse polyphase transform: Upsample/interleave channel sequences to get xNn+j = α′′
j,n
end for
return x
Table 2.3: Fast implementation of a complex exponential-modulated synthesis filter bank.
First, we need to compute M convolutions, but on polyphase components of
the input sequence, that is, at a rate M times slower. This is equivalent to a single
convolution at full rate, or, of order O(N ) operations per input sample. We then
use an FFT, again at the slower rate. From (3.268), an FFT requires of the order
O(log2 M ) operations per input sample. In total, we have
C ∼ α log2 M + N
∼
O(log2 M ),
(2.44)
operations per input sample. This is very efficient, since simply taking a length-M
FFT for each consecutive block of M samples would already require log2 M operations per input sample. Thus, the price of windowing given by the prototype filter
is of the order O(N ) operations per input sample, or, the length of the prototype
window normalized per input sample. A value for N depends on the desired frequency selectivity; a typical value can be of order O(log2 M ). Exercise ?? looks
into the cost of a filter bank similar to those used in audio compression standards,
such as MPEG from Example 2.5.
What is the numerical conditioning of this algorithm? Clearly, both the polyphase transform and the FFT are unitary maps, so the key resides in the diagonal
matrix of polyphase components. While there are cases when it is unitary (such as
in the block-transform case where it is the identity), it is highly dependent on the
prototype window. See Exercise ?? for an exploration of this issue.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter at a Glance
87
ComplexModAnalysis(p, x)
Input: The prototype filter p and input x.
Output: N channel sequences {α0 , . . . , αN−1 }.
Decompose prototype p into its N polyphase components pj,n = pNn−j
for all n do
Polyphase transform: Compute input sequence polyphase components xj,n = xNn+j
Convolution: Convolve polyphase components of prototype and input
 ′


 

α0,n
p0,n
x0,n
 α′


  x

p1,n
 1,n 

  1,n 

 = 
∗

..
..
..



 




 

.
.
.
α′N−1,n
pN−1,n
xN−1,n
Fourier transform: Compute channel sequences by applying the forward DFT


1
1
1
...
1


 ′

α0,n
α0,n
1
N−1 
2
W
W
.
.
.
W


N
N
N
 α

 α′


2(N−1) 
 1,n 
2
4
1,n 

WN
WN
. . . WN

 = 

1

..
.




..
.

..
..
..
..



.
 ..

.
.
.
.


′
αN−1,n
α
2
N−1,n
(N−1)
2(N−1)
(N−1)
1 WN
WN
. . . WN
end for
return {α0 , . . . , αN−1 }
Table 2.4: Fast implementation of a complex exponential-modulated analysis filter bank.
Chapter at a Glance
Our goal in this chapter was twofold: (1) to extend the discussion from Chapter 1 to more
than two channels and associated bases; and (2) to consider those filter banks implementing
local Fourier bases.
The extension to N channels, while not difficult, is a bit more involved as we now
deal with more general matrices, and, in particular, N × N matrices of polynomials. Many
of the expressions are analogous to those seen Chapter 1; we went through them in some
detail for orthogonal N -channel filter banks, as the biorthogonal ones are similar.
General, unstructured N -channel filter banks are rarely seen in practice; instead,
N -channel modulated filter banks are widespread because (1) of their close connection
to local Fourier representations, (2) computational efficiency, (modulated filter banks are
implemented using FFTs), and (3) only a single prototype filter needs to be designed.
We studied uniformly-modulated filters bank using both complex exponentials and
as well as cosines. The former, while directly linked to a local Fourier series (indeed, when
the filter length is N , we have a blockwise DFT), is hampered by a negative result, BalianLow theorem, which prohibits good orthonormal bases. The latter, with proper design of
the prototype filter, leads to good, orthonormal, local cosine bases (LOTs). These are
popular in audio and image processing using a prototype filter of length L = 2N .
To showcase their utility, we looked at the use of complex exponential-modulated
filter banks in power spectral density estimation, communications (OFDM) and transmultiplexing (Wi-Fi), as well as that of cosine-modulated ones in audio compression (MP3).
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
88
Chapter 2. Local Fourier Bases on Sequences
Block diagram
g̃N−1
N
αN−1
N
gN−1
b
b
b
b
b
b
x
g̃0
Basic characteristics
number of channels
sampling factor
channel sequences
N
α0
M =N
N
αi,n
N
+
x
g0
i = 0, 1, . . . , N − 1
Filters
Synthesis Analysis
orthogonal filter i
gi,n
gi,−n
i = 0, 1, . . . , N − 1
biorthogonal filter i
gei,n
polyphase component j
gi,j,n
gei,j,n
j = 0, 1, . . . , N − 1
Table 2.5: N -channel filter bank.
Filters
Modulation
complex-exponential
gi,n
−in
pn WN
Gi (z)
i
P (WN
z)
Gi (ejω )
P (ej(ω−(2π/N)i) )
cosine
2π
1
pn cos
(i + )n + θi
2N
2
i
1 h jθi −(i+1/2)n
(i+1/2)n
pn
e W2N
+ e−jθi W2N
2
i
1 h jθi
(i+1/2)
−(i+1/2)
e P (W2N
z) + e−jθi P (W2N
z)
2
i
1 h jθi
e P (ej(ω−(2π/2N)(i+1/2)) ) + e−jθi P (ej(ω+(2π/2N)(i+1/2) )
2
Table 2.6: Local Fourier modulated filter bank.
Historical Remarks
The earliest application of a local Fourier analysis was by Dennis Gabor to the analysis
of speech [40]. The idea of a local spectrum, or periodogram, was studied and refined by
statisticians interested in time series of sun spots, floods, temperatures, and many others.
It led to the question of windowing the data. Blackman, Tukey, Hamming, among others,
worked on window designs, while the question of smoothing was studied by Bartlett and
Welch, producing windowed and smoothed periodograms.
MP3 For compression, especially speech and audio, real modulated local Fourier filter
banks with perfect or almost perfect reconstruction appeared in the 1980’s and 1990’s.
Nussbaumer, Rothweiler and others proposed pseudo-QMF filter banks, with nearly perfect
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Historical Remarks
89
Relationship between filters
Time domain
hgi,n , gj,n−Nk in = δi−j δk
Matrix domain
D N GT
j Gi UN = δi−j
N−1
X
−k −1
k
z domain
Gi (WN
z)Gj (WN
z ) = N δi−j
DTFT domain
Polyphase domain
k=0
N−1
X
k=0
N−1
X
k −jω
Gi (ej(ω−(2π/N)k) )Gj (WN
e
) = N δi−j
Gi,k (z)Gj,k (z) = δi−j
k=0
Basis sequences
Time domain
{gi,n−2k }i=0,...,N−1,k∈Z
Frequency domain
{Gi (z)}i=0,...,N−1
Filters
Synthesis
gi,n , Gi (z), Gi (ejω )
Analysis
gi,−n , Gi (z −1 ), Gi (e−jω )
Matrix view
Basis
Time domain
Φ
z domain
Φ(z)
DTFT domain
Φ(ejω )
Polyphase domain
Φp (z)
Constraints
Time domain
Orthogonality relations
Φ∗ Φ = I
Perfect reconstruction
Φ Φ∗ = I
z domain
Φ(z −1 )∗ Φ(z) = I
Φ(z) Φ∗ (z −1 ) = I
∗
−jω
h
i
. . . g0,n−2k g1,n−2k . . . g0N−1,n−2k


G0 (z)
G1 (z)
. . . GN−1 (z)
G (W z)
G1 (WN z)
. . . GN−1 (WN z)
 0 N



.
..
..
..


.
.


.
.
.
N−1
N−1
N−1
G0 (WN z) G1 (WN z) . . . GN−1 (WN z)


G0 (ejω )
G1 (ejω )
. . . GN−1 (ejω )
G (W ejω )
G1 (WN ejω )
. . . GN−1 (WN ejω )
 0 N



..
..
..
..


.


.
.
.
N−1 jω
N−1 jω
jω )
G0 (WN
e ) G1 (WN
e ) . . . GN−1
(W
e
N
N−1


G0,0 (z)
G1,0 (z)
. . . GN−1,0 (z)
G (z)

G1,1 (z)
. . . GN−1,1 (z)
 0,1



..
..
..
..


.


.
.
.
G0,N−1 (z) G1,N−1 (z) . . . GN−1,N−1 (z)
) Φ(e
jω
DTFT domain
Φ (e
)=I
Polyphase domain
Φ∗p (z −1 ) Φp (z) = I
Φ(ejω ) Φ∗ (e−jω ) = I
Φp (z) Φ∗p (z −1 ) = I
Table 2.7: N -channel orthogonal filter bank
reconstruction, frequency selective filters and high computational efficiency. This type
of filter bank is used today in most audio coding standards, such as MP3. A different
approach, leading to shorter filters and LOTs, was championed by Malvar, Princen and
Bradley, among others. These are popular in image processing, where frequency selectivity
is not as much of a concern.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
90
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 2. Local Fourier Bases on Sequences
Wi-Fi Frequency division multiplexing has been a popular communications method since the 1960’s, and its digital version led to
complex exponential-modulated transmultiplexers with FFTs, as
proposed by Bellanger and co-workers. That perfect transmultiplexing is possible was pointed out by Vetterli. Multicarrier
frequency signaling, which relies on efficient complex exponential-modulated transmultiplexers is one of the main communications methods, with orthogonal frequency division
multiplexing (OFDM) being at the heart of many standards (for example, Wi-Fi, 802.11).
Further Reading
Books and Textbooks For a general treatment of N -channel filter banks, see the books
by Vaidyanathan [98], Vetterli and Kovačević [106], Strang and Nguyen [87], among others. For modulated filter banks, see [98] as well as Malvar’s book [61], the latter with a
particular emphasis on LOTs. For a good basic discussion of periodograms, see Porat’s
book on signal processing [68], while a more advanced treatment of spectral estimation
can be found in Porat’s book on statistical signal processing [67], and Stoica and Moses’
book on spectral estimation [86].
Design of N -Channel Filter Banks General N -channel filter
bank designs were investigated in [96,97]. The freedom in design
offered by more channels shows in examples such as linear-phase,
orthogonal FIR solutions, not possible in the two-channel case [84, 108].
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 3
Wavelet Bases on
Sequences
Contents
3.1
Introduction
93
3.2
Tree-Structured Filter Banks
3.3
Orthogonal Discrete Wavelet Transform
99
3.4
Biorthogonal Discrete Wavelet Transform
112
3.5
Wavelet Packets
114
3.6
Computational Aspects
116
106
Chapter at a Glance
118
Historical Remarks
119
Further Reading
119
If the projection of the signal onto two subspaces is advantageous, projecting
onto more subspaces might be even better. These projections onto multiple subspaces are implemented via multichannel filter banks, which come in various flavors:
For example, there are direct multichannel filter banks, with N filters covering the
entire spectrum, their outputs downsampled by N , covered in Chapter 2. There
are also tree-structured multichannel filter banks, where a two-channel filter bank
from Chapter 1 is used as a building block for more complex structures. While we
will discuss arbitrary tree structures later in this chapter, most of the chapter deals
with a particularly simple one that has some distinguishing features, both from
mathematical as well as practical points of view. This elementary tree structure
recursively splits the coarse space into ever coarser ones, yielding, in signal processing parlance, an octave-band filter bank. The input spectrum (subspace) from 0 to
π is cut into a highpass part from π/2 to π, with the remainder cut again into π/4
to π/2 and a new remainder from 0 to π/4, and so on. As an example, performing
the split three times leads to the following 4-channel spectral division:
\left[0, \tfrac{\pi}{8}\right), \quad \left[\tfrac{\pi}{8}, \tfrac{\pi}{4}\right), \quad \left[\tfrac{\pi}{4}, \tfrac{\pi}{2}\right), \quad \left[\tfrac{\pi}{2}, \pi\right),
Figure 3.1: A two-channel orthogonal filter bank iterated three times to obtain one
coarse subspace with support [0, π/8), and three bandpass subspaces. (a) Analysis filter
bank. (b) Synthesis filter bank. (c) The corresponding frequency division.
yielding a lowpass (coarse) version and three bandpass (detail) versions, where each
corresponds to an octave of the initial spectrum, shown in Figure 3.1(c).22
Such an unbalanced tree-structured filter bank, shown in Figure 3.1, is a central concept both in filter banks and in wavelets. Most of this chapter is devoted to its study, properties, and geometrical interpretation. In wavelet parlance, when the
22 Another
interpretation of octave-band filter banks is that the bandpass channels have constant
relative bandwidth. For a bandpass channel, its relative bandwidth Q is defined as its center
frequency divided by its bandwidth. In the example above, the channels go from π/2i+1 to π/2i ,
with the center frequency 3π/2i+2 and bandwidth π/2i+1 . The relative bandwidth Q is then 3/2.
In classic circuit theory, the relative bandwidth is called the Q-factor, and the filter bank above
has constant-Q bandpass channels.
lowpass filter is designed appropriately, the filter bank computes a discrete wavelet
transform (DWT). Even more is true: the same construction can be used to derive
continuous-time wavelet bases, and the filter bank leads to an algorithm to compute
wavelet series coefficients, the topic of Chapter 6.
3.1 Introduction
The iterated structure from Figure 3.1(a) clearly performs only the analysis operation. Given what we have learned so far, the channel sequences β (1) , β (2) , β (3) ,
α(3) compute projection coefficients onto some, yet unidentified, subspaces; it is
left to establish which expansion has these as its projection/transform coefficients.
Moreover, we should be able to then express the entire expansion using filter banks
as we have done in Chapter 1. It is not difficult to see that the synthesis filter bank corresponding to the analysis one from Figure 3.1(a) is the one in Figure 3.1(b).
Every analysis two-channel block in Figure 3.1(a) has a corresponding synthesis
two-channel block in Figure 3.1(b). We can thus use the whole machinery from
Chapter 1 to study such an iterated structure. Moreover, the example with J = 3
levels can be easily generalized to an arbitrary number of levels.
As we have done throughout the book, we introduce the main concepts of
this chapter through our favorite example—Haar. Building upon the intuition we
develop here, generalizations will come without surprise in the rest of the chapter.
Implementing a Haar DWT Expansion
We start with a 3-level iterated filter bank structure as in Figure 3.1, where the
two-channel filter bank block is the Haar orthogonal filter bank from Table 1.8,
with synthesis filters
G(z) = \frac{1}{\sqrt{2}}\,(1 + z^{-1}), \qquad H(z) = \frac{1}{\sqrt{2}}\,(1 - z^{-1}).
Equivalent Filters As mentioned earlier, we now have four channel sequences, β (1) ,
β (2) , β (3) , α(3) , and thus, we should be able to represent the tree structure from
Figure 3.1 as a 4-channel filter bank, with four channel filters and four samplers.
This is our aim now.
We first consider the channel sequence α(3) and its path through the lower
branches of the first two filter banks, depicted in Figure 3.2(a). In part (b) of the
same figure, we use one of the identities on the interchange of multirate operations
and filtering we saw in Chapter 3, Figure 3.28, to move the first filter G(z) across the
second upsampler, resulting in part (c) of the figure. In essence, we have compacted
the sequence of steps “upsampler by 2—filter G(z)—upsampler by 2—filter G(z)”
into a sequence of steps “upsampler by 4—equivalent filter G(2) (z) = G(z)G(z 2 )”.
We can now iteratively continue the process by taking the equivalent filter and
passing it across the third upsampler along the path of the lower branch in the last
(rightmost) filter bank, resulting in a single branch with a single upsampler by 8
followed by a single equivalent filter G(3) (z) = G(z)G(z 2 )G(z 4 ), resulting in the
lowest branch of Figure 3.3.
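To make this manipulation concrete, here is a minimal numerical sketch (ours, not from the text; it assumes NumPy is available): moving G(z) across an upsampler amounts to upsampling its impulse response and convolving, which for the Haar lowpass filter gives the length-4 averaging filter g^{(2)}.

```python
import numpy as np

def upsample(h, m):
    # Insert (m - 1) zeros between samples: H(z) -> H(z^m).
    out = np.zeros(m * len(h) - (m - 1))
    out[::m] = h
    return out

g = np.array([1.0, 1.0]) / np.sqrt(2)       # Haar lowpass, G(z) = (1 + z^-1)/sqrt(2)

# Equivalent filter over two levels: G^(2)(z) = G(z) G(z^2).
g2 = np.convolve(g, upsample(g, 2))
print(g2)                                    # [0.5 0.5 0.5 0.5]
```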
Figure 3.2: Path through the lower branches of the first two filter banks in Figure 3.1. (a) Original system. (b) Use of one of the identities on the interchange of multirate operations and filtering from Figure 3.28 results in moving the filter across the upsampler by upsampling its impulse response. (c) Equivalent system consisting of a single upsampler by 4 followed by an equivalent filter G^{(2)}(z) = G(z)G(z^2).
Repeating the process on the other three branches transforms the 3-level tree-structured synthesis filter bank from Figure 3.1(b) into the 4-channel synthesis filter bank from Figure 3.3, with the equivalent filters:
H^{(1)}(z) = H(z) = \frac{1}{\sqrt{2}}\,(1 - z^{-1}),    (3.1a)
H^{(2)}(z) = G(z)\,H(z^2) = \frac{1}{2}\,(1 + z^{-1})(1 - z^{-2}) = \frac{1}{2}\,(1 + z^{-1} - z^{-2} - z^{-3}),    (3.1b)
H^{(3)}(z) = G(z)\,G(z^2)\,H(z^4) = \frac{1}{2\sqrt{2}}\,(1 + z^{-1})(1 + z^{-2})(1 - z^{-4})
           = \frac{1}{2\sqrt{2}}\,(1 + z^{-1} + z^{-2} + z^{-3} - z^{-4} - z^{-5} - z^{-6} - z^{-7}),    (3.1c)
G^{(3)}(z) = G(z)\,G(z^2)\,G(z^4) = \frac{1}{2\sqrt{2}}\,(1 + z^{-1})(1 + z^{-2})(1 + z^{-4})
           = \frac{1}{2\sqrt{2}}\,(1 + z^{-1} + z^{-2} + z^{-3} + z^{-4} + z^{-5} + z^{-6} + z^{-7}).    (3.1d)
If we repeated the above iterative process J times instead, the lowpass equivalent filter would have the z-transform
G^{(J)}(z) = \prod_{\ell=0}^{J-1} G(z^{2^\ell}) = \frac{1}{2^{J/2}} \sum_{n=0}^{2^J - 1} z^{-n},    (3.2a)
that is, it is a length-2^J averaging filter
g^{(J)}_n = \frac{1}{2^{J/2}} \sum_{k=0}^{2^J - 1} \delta_{n-k},    (3.2b)
while the same-level bandpass equivalent filter follows from
H^{(J)}(z) = H(z^{2^{J-1}})\, G^{(J-1)}(z) = \frac{1}{\sqrt{2}} \left( G^{(J-1)}(z) - z^{-2^{J-1}} G^{(J-1)}(z) \right),    (3.2c)
Figure 3.3: Equivalent filter bank to the 3-level synthesis bank shown in Figure 3.1(b).
with the impulse response
h^{(J)}_n = \frac{1}{2^{J/2}} \left( \sum_{k=0}^{2^{J-1}-1} \delta_{n-k} \;-\; \sum_{k=2^{J-1}}^{2^J-1} \delta_{n-k} \right).    (3.2d)
Basis Sequences As we have done in Chapter 1, we now identify the resulting expansion and corresponding basis sequences. To each branch in Figure 3.3 corresponds a subspace spanned by the appropriate basis sequences. Let us start from the top. The first channel, with input β^{(1)}, has the equivalent filter h^{(1)} = h, just as in the basic two-channel filter bank, (1.10b), with upsampling by 2 in front. The corresponding sequences spanning the subspace W^{(1)} are (Figure 3.4(a)):
W^{(1)} = \operatorname{span}(\{h^{(1)}_{n-2k}\}_{k \in \mathbb{Z}}).    (3.3a)
The second channel, with input β^{(2)}, has the equivalent filter (3.1b) with upsampling by 4 in front. The corresponding sequences spanning the subspace W^{(2)} are (Figure 3.4(b)):
W^{(2)} = \operatorname{span}(\{h^{(2)}_{n-4k}\}_{k \in \mathbb{Z}}).    (3.3b)
The third and fourth channels, with inputs β^{(3)} and α^{(3)}, have the equivalent filters (3.1c), (3.1d), respectively, with upsampling by 8 in front. The corresponding sequences spanning the subspaces W^{(3)} and V^{(3)} are (Figure 3.4(c), (d)):
W^{(3)} = \operatorname{span}(\{h^{(3)}_{n-8k}\}_{k \in \mathbb{Z}}),    (3.3c)
V^{(3)} = \operatorname{span}(\{g^{(3)}_{n-8k}\}_{k \in \mathbb{Z}}).    (3.3d)
The complete set of basis sequences is thus:
\Phi = \{h^{(1)}_{n-2k},\; h^{(2)}_{n-4k},\; h^{(3)}_{n-8k},\; g^{(3)}_{n-8k}\}_{k \in \mathbb{Z}}.    (3.3e)
Figure 3.4: Discrete-time Haar basis. Eight of the basis sequences forming Φ0: (a) level ℓ = 1, h^{(1)}_n and three of its shifts by 2; (b) level ℓ = 2, h^{(2)}_n and one of its shifts by 4; (c) level ℓ = 3, h^{(3)}_n; and (d) level ℓ = 3, g^{(3)}_n. A basis sequence at level i is orthogonal to a basis sequence at level j, i < j, because it changes sign over an interval where the latter is constant (see, for example, the blue basis sequences).
Orthogonality of Basis Sequences While we have called the above sequences basis sequences, we have not yet established that they indeed form a basis (although this is almost obvious from the two-channel filter bank discussion).
The sets spanning W (1) , W (2) , W (3) and V (3) are all orthonormal sets, as the
sequences within those sets do not overlap. To show that Φ is an orthonormal set,
we must show that sequences in each of the above subsets are orthogonal to each
other. To prove that, we have to show that h(1) and its shifts by 2 are orthogonal
to h(2) and its shifts by 4, h(3) and its shifts by 8, and g (3) and its shifts by 8.
Similarly, we must show that h(2) and its shifts by 4 are orthogonal to h(3) and
its shifts by 8 and g (3) and its shifts by 8, etc. For Haar filters, this can be done
by observing, for example, that h(1) and its shifts by 2 always overlap a constant
portion of h(2) , h(3) and g (3) , leading to a zero inner product (see Figure 3.4). With
more general filters, this proof is more involved and will be considered later in the
chapter.
To prove completeness, we first introduce the matrix view of this expansion.
Matrix View We have already seen that while g (3) and h(3) move in steps of 8,
h(2) moves in steps of 4 and h(1) moves in steps of 2. That is, during the nonzero
portion of g (3) and h(3) , h(2) and its shift by 4 occur, as well as h(1) and its shifts
by 2, 4 and 6 (see Figure 3.4). Thus, as in Chapter 1, we can describe the action of
Figure 3.5: Projection of the input sequence onto V (3) , W (3) , W (2) and W (1) , respectively,
and perfect reconstruction as the sum of the projections.
the filter bank via an infinite matrix:
Φ = diag(Φ0 ),
with Φ0 as
\Phi_0 = \begin{bmatrix} h^{(1)}_n & h^{(1)}_{n-2} & h^{(1)}_{n-4} & h^{(1)}_{n-6} & h^{(2)}_n & h^{(2)}_{n-4} & h^{(3)}_n & g^{(3)}_n \end{bmatrix}
       = \begin{bmatrix}
           1 &  0 &  0 &  0 &  1 &  0 &  1 & 1 \\
          -1 &  0 &  0 &  0 &  1 &  0 &  1 & 1 \\
           0 &  1 &  0 &  0 & -1 &  0 &  1 & 1 \\
           0 & -1 &  0 &  0 & -1 &  0 &  1 & 1 \\
           0 &  0 &  1 &  0 &  0 &  1 & -1 & 1 \\
           0 &  0 & -1 &  0 &  0 &  1 & -1 & 1 \\
           0 &  0 &  0 &  1 &  0 & -1 & -1 & 1 \\
           0 &  0 &  0 & -1 &  0 & -1 & -1 & 1
         \end{bmatrix}
         \operatorname{diag}\!\left(\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}, \tfrac{1}{2}, \tfrac{1}{2}, \tfrac{1}{2\sqrt{2}}, \tfrac{1}{2\sqrt{2}}\right).    (3.4)
As before, Φ is block diagonal only when the length of the filters in the original filter bank is equal to the downsampling factor, as is the case for Haar. The block is of size 8 × 8 in this case, since the same structure repeats itself every 8 samples. That is, h^{(3)} and g^{(3)} repeat every 8 samples, h^{(2)} repeats every 4 samples, while h^{(1)} repeats every 2 samples. Thus, there will be 2 instances of h^{(2)} in block Φ0 and 4 instances of h^{(1)} (see Figure 3.4). The basis sequences are the columns of the matrix Φ at the center block Φ0 and all their shifts by 8 (which correspond to the other blocks Φ0 in Φ). Φ is unitary, as each block Φ0 is unitary, proving completeness for the Haar case. As we shall see, if each two-channel filter bank is orthonormal, even for longer filters, the orthonormality property will hold in general.
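As a sanity check (a small sketch of ours, assuming NumPy; it is not part of the text), one can assemble the block Φ0 from the Haar equivalent filters (3.1a)–(3.1d) and verify numerically that it is unitary.

```python
import numpy as np

s2 = np.sqrt(2.0)
h1 = np.array([1, -1]) / s2                              # h^(1), shifted by multiples of 2
h2 = np.array([1, 1, -1, -1]) / 2                        # h^(2), shifted by multiples of 4
h3 = np.array([1, 1, 1, 1, -1, -1, -1, -1]) / (2 * s2)   # h^(3)
g3 = np.ones(8) / (2 * s2)                               # g^(3)

def col(f, shift):
    # Place the impulse response f at the given shift inside a length-8 column.
    c = np.zeros(8)
    c[shift:shift + len(f)] = f
    return c

Phi0 = np.column_stack([col(h1, 0), col(h1, 2), col(h1, 4), col(h1, 6),
                        col(h2, 0), col(h2, 4), col(h3, 0), col(g3, 0)])
print(np.allclose(Phi0.T @ Phi0, np.eye(8)))             # True: Phi0 is unitary
```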
Figure 3.6: Approximation properties of the discrete wavelet transform. (a) Original sequence x with various components. The highpass approximation after the (b) first iteration x_h^{(1)}, (c) second iteration x_h^{(2)}, and (d) third iteration x_h^{(3)}. (e) The lowpass approximation x_g^{(3)}.
Projection Properties In summary, the 3-level iterated two-channel filter bank from Figure 3.1 splits the original space ℓ^2(\mathbb{Z}) into four subspaces,
\ell^2(\mathbb{Z}) = V^{(3)} \oplus W^{(3)} \oplus W^{(2)} \oplus W^{(1)},
given in (3.3a)–(3.3d); again, this is a property that will hold in general. Figure 3.5 illustrates this split, where x_g^{(3)} denotes the projection onto V^{(3)}, and x_h^{(\ell)} denotes the projection onto W^{(\ell)}, ℓ = 1, 2, 3, while Figure 3.6 shows an example input sequence and the resulting channel sequences. The low-frequency sinusoid and the polynomial pieces are captured by the lowpass projection, white noise is apparent in all channels, and effects of discontinuities are localized in the bandpass channels.
Chapter Outline
After this brief introduction, the structure of the chapter follows naturally. First,
we generalize the Haar discussion and consider tree-structured filter banks, and,
in particular, those that will lead to the DWT in Section 3.2. In Section 3.3, we
study the orthogonal DWT and its properties such as approximation and projection
capabilities. Section 3.4 discusses a variation on the DWT, the biorthogonal DWT,
while Section 3.5 discusses another one, wavelet packets, which allow for rather
arbitrary tilings of the time-frequency plane. We conclude with computational aspects
in Section 3.6.
3.2 Tree-Structured Filter Banks
Out of the basic building blocks of a two-channel filter bank (Chapter 1) and an
N -channel filter bank (Chapter 2), we can build many different representations
(Figure 3.7 shows some of the options together with the associated time-frequency
tilings). We now set the stage by showing how to compute equivalent filters in such
filter banks, as a necessary step towards building an orthogonal DWT in the next
section, biorthogonal DWT in Section 3.4, wavelet packets in Section 3.5. We will
assume we iterate orthogonal two-channel filter banks only (the analysis is parallel
for biorthogonal and/or N -channel filter banks), and that J times. We consider the
equivalent filter along the lowpass branches separately, followed by the bandpass
branches, and finally, the relationship between the lowpass and bandpass ones.
While we could make the discussion more general, we consider bandpass channels
to be only those iterated through lowpass branches until the last step, when a final
iteration is through a highpass one, as is the case for the DWT (iterations through
arbitrary combinations of lowpass and highpass branches would follow similarly).
3.2.1 The Lowpass Channel and Its Properties
We start with the lowpass channel iterated J times, leading to g^{(J)}. Using the same identity to move the filter past the upsampler as in Figure 3.2, a cascade of J times upsampling and filtering by G(z) leads to upsampling by 2^J followed by filtering with the equivalent filter
G^{(J)}(z) = G(z)\,G(z^2)\,G(z^4) \cdots G(z^{2^{J-1}}) = \prod_{\ell=0}^{J-1} G(z^{2^\ell}),    (3.5a)
as shown in Figure 3.8. If g is of length L, then g^{(J)} is of length
L^{(J)} = (L-1)(2^J - 1) + 1 \;\leq\; (L-1)\,2^J.    (3.5b)
Moreover, we see that
G^{(J)}(z) = G(z)\, G^{(J-1)}(z^2) = G^{(J-1)}(z)\, G(z^{2^{J-1}}).    (3.5c)
Some other recursive relations are given in Exercise ??.
Orthogonality of the Lowpass Filter We can use the orthogonality of the basic two-channel synthesis building block to show the orthogonality of the synthesis operator obtained by iterating. It then follows that the iterated lowpass filter is orthogonal to its translates by 2^J. As in Section 1.2.1, we summarize the orthogonality properties here (as the proof is just a straightforward, albeit tedious,
Figure 3.7: Filter banks and variations together with the corresponding time-frequency tilings. (a) Two-channel filter bank (Chapter 1). (b) N-channel filter bank (Chapter 2). (c) The local Fourier transform filter bank (Chapter 2). (d) The DWT tree (present chapter). (e) The wavelet packet filter bank (present chapter). (f) The time-varying filter bank. [NOTE: The top-right filter-bank representation in (f) should be as in (d), DWT.]
Figure 3.8: Cascade of J times upsampling and filtering. (a) Original system. (b)
Equivalent system.
use of multirate operations, we leave it as Exercise ??):
Matrix View:  \langle g^{(J)}_n, g^{(J)}_{n-2^J k} \rangle = \delta_k \;\longleftrightarrow\; D_{2^J} (G^{(J)})^T G^{(J)} U_{2^J} = I
ZT:           \longleftrightarrow\; \sum_{k=0}^{2^J-1} G^{(J)}(W_{2^J}^{k} z)\, G^{(J)}(W_{2^J}^{-k} z^{-1}) = 2^J
DTFT:         \longleftrightarrow\; \sum_{k=0}^{2^J-1} \bigl| G^{(J)}(W_{2^J}^{k} e^{j\omega}) \bigr|^2 = 2^J    (3.6a)
In the matrix view, we have used linear operators (infinite matrices) introduced in Section 3.7: (1) downsampling by 2^J, D_{2^J}, from (3.187); (2) upsampling by 2^J, U_{2^J}, from (3.192); and (3) filtering by G^{(J)}, from (3.62). The matrix view expresses the fact that the columns of G^{(J)} U_{2^J} form an orthonormal set. The DTFT version is a version of the quadrature mirror formula we have seen in (3.214). This filter is a 2^Jth-band filter (an ideal 2^Jth-band filter would be bandlimited to |ω| ≤ π/2^J, see Figure 3.1(c) with J = 3).
Let us check the z-transform version of the above for J = 2:
\sum_{k=0}^{3} G^{(2)}(W_4^k z)\, G^{(2)}(W_4^{-k} z^{-1})
  = G^{(2)}(z)G^{(2)}(z^{-1}) + G^{(2)}(jz)G^{(2)}(-jz^{-1}) + G^{(2)}(-z)G^{(2)}(-z^{-1}) + G^{(2)}(-jz)G^{(2)}(jz^{-1})
  \overset{(a)}{=} G(z)G(z^2)G(z^{-1})G(z^{-2}) + G(jz)G(-z^2)G(-jz^{-1})G(-z^{-2})
     + G(-z)G(z^2)G(-z^{-1})G(z^{-2}) + G(-jz)G(-z^2)G(jz^{-1})G(-z^{-2})
  \overset{(b)}{=} G(z^2)G(z^{-2}) \underbrace{\bigl[ G(z)G(z^{-1}) + G(-z)G(-z^{-1}) \bigr]}_{2}
     + G(-z^2)G(-z^{-2}) \underbrace{\bigl[ G(jz)G(-jz^{-1}) + G(-jz)G(jz^{-1}) \bigr]}_{2}
  \overset{(c)}{=} 2\, \underbrace{\bigl[ G(z^2)G(z^{-2}) + G(-z^2)G(-z^{-2}) \bigr]}_{2} \;\overset{(d)}{=}\; 4,
where (a) follows from the expression for the equivalent lowpass filter at level 2,
(3.5a); in (b) we pulled out common terms G(z 2 )G(z −2 ) and G(−z 2 )G(−z −2 ); and
(c) and (d) follow from the orthogonality of the lowpass filter g, (1.13). This,
of course, is to be expected, because we have done nothing else but concatenate
orthogonal filter banks, which we know already implement orthonormal bases, and
thus, must satisfy orthogonality properties.
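The same conclusion can be reached numerically. The sketch below (ours, assuming NumPy; the length-4 filter is the Daubechies filter that will appear in Example 3.1) evaluates the DTFT version of (3.6a) for J = 2 on a frequency grid.

```python
import numpy as np

def upsample(h, m):
    out = np.zeros(m * len(h) - (m - 1))
    out[::m] = h
    return out

def dtft(h, om):
    # H(e^{j om}) = sum_n h[n] e^{-j om n}, evaluated on an array of frequencies.
    return np.exp(-1j * np.outer(om, np.arange(len(h)))) @ h

# Any orthogonal lowpass filter works; here the length-4 Daubechies filter of Example 3.1.
g = np.array([1 + np.sqrt(3), 3 + np.sqrt(3), 3 - np.sqrt(3), 1 - np.sqrt(3)]) / (4 * np.sqrt(2))
g2 = np.convolve(g, upsample(g, 2))                 # G^(2)(z) = G(z) G(z^2)

w = np.linspace(0, 2 * np.pi, 512, endpoint=False)
total = sum(np.abs(dtft(g2, w - np.pi * k / 2)) ** 2 for k in range(4))
print(np.allclose(total, 4.0))                      # True: sum_k |G^(2)(W_4^k e^{jw})|^2 = 4
```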
Deterministic Autocorrelation of the Lowpass Filter As we have done in Chapter 1, we rephrase the above results in terms of the deterministic autocorrelation of
the filter. This is also what we use to prove (3.6a) in Exercise ??. The deterministic
Lowpass Channel in a J-Level Octave-Band Orthogonal Filter Bank

Lowpass filter
  original domain: g^{(J)}_n        matrix domain: G^{(J)}
  z-domain: G^{(J)}(z) = \prod_{\ell=0}^{J-1} G(z^{2^\ell})

Orthogonality
  original domain: \langle g^{(J)}_n, g^{(J)}_{n-2^J k} \rangle_n = \delta_k
  matrix domain:   D_{2^J} (G^{(J)})^T G^{(J)} U_{2^J} = I
  z-domain:        \sum_{k=0}^{2^J-1} G^{(J)}(W^{k}_{2^J} z)\, G^{(J)}(W^{-k}_{2^J} z^{-1}) = 2^J
  DTFT domain:     \sum_{k=0}^{2^J-1} |G^{(J)}(W^{k}_{2^J} e^{j\omega})|^2 = 2^J

Deterministic autocorrelation
  original domain: a^{(J)}_n = \langle g^{(J)}_k, g^{(J)}_{k+n} \rangle_k, \qquad a^{(J)}_{2^J k} = \delta_k
  matrix domain:   A^{(J)} = (G^{(J)})^T G^{(J)}, \qquad D_{2^J} A^{(J)} U_{2^J} = I
  z-domain:        A^{(J)}(z) = G^{(J)}(z)\, G^{(J)}(z^{-1}), \qquad \sum_{k=0}^{2^J-1} A^{(J)}(W^{k}_{2^J} z) = 2^J
  DTFT domain:     A^{(J)}(e^{j\omega}) = |G^{(J)}(e^{j\omega})|^2, \qquad \sum_{k=0}^{2^J-1} A^{(J)}(W^{k}_{2^J} e^{j\omega}) = 2^J

Orthogonal projection onto smooth space V^{(J)} = \operatorname{span}(\{g^{(J)}_{n-2^J k}\}_{k\in\mathbb{Z}})
  P_{V^{(J)}} = G^{(J)} U_{2^J} D_{2^J} (G^{(J)})^T, \qquad x_{V^{(J)}} = P_{V^{(J)}}\, x

Table 3.1: Properties of the lowpass channel in an orthogonal J-level octave-band filter bank.
autocorrelation of g (J) is denoted by a(J) .
Matrix View:  \langle g^{(J)}_n, g^{(J)}_{n-2^J k} \rangle = a^{(J)}_{2^J k} = \delta_k \;\longleftrightarrow\; D_{2^J} A^{(J)} U_{2^J} = I
ZT:           \longleftrightarrow\; \sum_{k=0}^{2^J-1} A^{(J)}(W^{k}_{2^J} z) = 2^J
DTFT:         \longleftrightarrow\; \sum_{k=0}^{2^J-1} A^{(J)}(W^{k}_{2^J} e^{j\omega}) = 2^J    (3.6b)
Orthogonal Projection Property of the Lowpass Channel We now look at the lowpass channel as a composition of four linear operators we just saw:
x_{V^{(J)}} = P_{V^{(J)}}\, x = G^{(J)} U_{2^J} D_{2^J} (G^{(J)})^T x.    (3.7)
As before, the notation is evocative of projection onto V (J) , and we will now show
that the lowpass channel accomplishes precisely this. Using (3.6a), we check idem-
potency and self-adjointness of P_{V^{(J)}} (Definition 2.27),
P_{V^{(J)}}^2 = G^{(J)} U_{2^J}\, \underbrace{D_{2^J} (G^{(J)})^T\, G^{(J)} U_{2^J}}_{I}\, D_{2^J} (G^{(J)})^T \;\overset{(a)}{=}\; G^{(J)} U_{2^J} D_{2^J} (G^{(J)})^T = P_{V^{(J)}},
P_{V^{(J)}}^T = \bigl( G^{(J)} U_{2^J} D_{2^J} (G^{(J)})^T \bigr)^T = G^{(J)} (U_{2^J} D_{2^J})^T (G^{(J)})^T \;\overset{(b)}{=}\; G^{(J)} U_{2^J} D_{2^J} (G^{(J)})^T = P_{V^{(J)}},
where (a) follows from (3.6a) and (b) from (3.194). Indeed, P_{V^{(J)}} is an orthogonal projection operator, with the range given in:
V^{(J)} = \operatorname{span}(\{g^{(J)}_{n-2^J k}\}_{k\in\mathbb{Z}}).    (3.8)
The summary of properties of the lowpass channel is given in Table 3.1.
3.2.2
Bandpass Channels and Their Properties
While we have only one lowpass filter, in a J-level octave-band filter bank leading to the DWT we also have J bandpass filters, ideally each bandlimited to π/2^{ℓ+1} ≤ |ω| ≤ π/2^ℓ, for ℓ = 0, 1, . . . , J − 1, as in Figure 3.1(c) with J = 3. The analysis of an iterated filter bank constructed through arbitrary combinations of lowpass and highpass branches would follow similarly.
The filter H^{(ℓ)}(z) corresponds to a branch with a highpass filter followed by (ℓ − 1) lowpass filters (always with upsampling by 2 in between). The (ℓ − 1)th lowpass filter branch has an equivalent filter G^{(ℓ−1)}(z) as in (3.5a), preceded by upsampling by 2^{ℓ−1}. Passing this upsampling across the initial highpass filter changes H(z) into H(z^{2^{ℓ−1}}), and
H^{(\ell)}(z) = H(z^{2^{\ell-1}})\, G^{(\ell-1)}(z), \qquad \ell = 1, \ldots, J,    (3.9)
follows. The basis vectors correspond to the impulse responses and the shifts given by the upsampling factors. These upsampling factors are 2^J for the lowpass branch and 2^ℓ for the bandpass branches.
Example 3.1 (The 3-level octave-band Daubechies filter bank) We continue Example 1.3, the Daubechies filter with two zeros at z = −1:
G(z) = \frac{1}{4\sqrt{2}} \left[ (1+\sqrt{3}) + (3+\sqrt{3})z^{-1} + (3-\sqrt{3})z^{-2} + (1-\sqrt{3})z^{-3} \right].    (3.10)
A 3-level octave-band filter bank has as basis sequences the impulse responses of
H^{(1)}(z) = H(z) = -z^{-3}\, G(-z^{-1}),
H^{(2)}(z) = G(z)\, H(z^2),
H^{(3)}(z) = G(z)\, G(z^2)\, H(z^4),
G^{(3)}(z) = G(z)\, G(z^2)\, G(z^4),
together with shifts by multiples of 2, 4, 8 and 8, respectively (see Figure 3.9).
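A short sketch (ours, assuming NumPy; the helper name is hypothetical) builds these four equivalent impulse responses directly from G(z) and H(z) = −z^{−3}G(−z^{−1}).

```python
import numpy as np

def upsample(h, m):
    out = np.zeros(m * len(h) - (m - 1))
    out[::m] = h
    return out

g = np.array([1 + np.sqrt(3), 3 + np.sqrt(3), 3 - np.sqrt(3), 1 - np.sqrt(3)]) / (4 * np.sqrt(2))
h = ((-1) ** np.arange(4)) * g[::-1]                # H(z) = -z^{-3} G(-z^{-1})

h1 = h                                                              # H^(1)(z) = H(z)
h2 = np.convolve(g, upsample(h, 2))                                 # H^(2)(z) = G(z) H(z^2)
h3 = np.convolve(np.convolve(g, upsample(g, 2)), upsample(h, 4))    # H^(3)(z)
g3 = np.convolve(np.convolve(g, upsample(g, 2)), upsample(g, 4))    # G^(3)(z)
print(len(h1), len(h2), len(h3), len(g3))           # 4 10 22 22, consistent with (3.5b)
```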
Figure 3.9: Basis sequences for a 3-level octave-band filter bank based on a Daubechies orthonormal length-4 filter with two zeros at z = −1, as in (3.10). The basis sequences are h^{(1)}, h^{(2)}, h^{(3)} and g^{(3)}, together with shifts by multiples of 2, 4, 8 and 8, respectively. We show (a) h^{(1)}_n, h^{(1)}_{n-2}, h^{(1)}_{n-4}, h^{(1)}_{n-6}; (b) h^{(2)}_n, h^{(2)}_{n-4}; (c) h^{(3)}_n; and (d) g^{(3)}_n.
Orthogonality of an Individual Bandpass Filter Unlike in the simple two-channel case, each bandpass filter h^{(ℓ)} is now orthogonal to its shifts by 2^ℓ, and to all the other bandpass filters h^{(j)}, ℓ ≠ j, as well as to their shifts by 2^j. We expect these to hold as they hold in the basic building block, but state them nevertheless. While we could state together the orthogonality properties for a single level and across levels, we separate them for clarity. All the proofs are left for Exercise ??.
Matrix View:  \langle h^{(\ell)}_n, h^{(\ell)}_{n-2^\ell k} \rangle = \delta_k \;\longleftrightarrow\; D_{2^\ell} (H^{(\ell)})^T H^{(\ell)} U_{2^\ell} = I
ZT:           \longleftrightarrow\; \sum_{k=0}^{2^\ell-1} H^{(\ell)}(W^{k}_{2^\ell} z)\, H^{(\ell)}(W^{-k}_{2^\ell} z^{-1}) = 2^\ell
DTFT:         \longleftrightarrow\; \sum_{k=0}^{2^\ell-1} |H^{(\ell)}(W^{k}_{2^\ell} e^{j\omega})|^2 = 2^\ell    (3.11a)
Orthogonality of Different Bandpass Filters Without loss of generality, let us
assume that ℓ < j. We summarize the orthogonality properties of the bandpass
filter h(ℓ) and its shift by 2ℓ to the bandpass filter h(j) and its shift by 2j .
Matrix View:  \langle h^{(\ell)}_{n-2^\ell k}, h^{(j)}_{n-2^j k} \rangle = 0 \;\longleftrightarrow\; D_{2^j} (H^{(j)})^T H^{(\ell)} U_{2^\ell} = 0
ZT:           \longleftrightarrow\; \sum_{k=0}^{2^\ell-1} H^{(\ell)}(W^{k}_{2^\ell} z)\, H^{(j)}(W^{-k}_{2^\ell} z^{-1}) = 0
DTFT:         \longleftrightarrow\; \sum_{k=0}^{2^\ell-1} H^{(\ell)}(W^{k}_{2^\ell} e^{j\omega})\, H^{(j)}(W^{k}_{2^\ell} e^{-j\omega}) = 0    (3.11b)
Deterministic Autocorrelation of Individual Bandpass Filters
Matrix View:  \langle h^{(\ell)}_n, h^{(\ell)}_{n-2^\ell k} \rangle = a^{(\ell)}_{2^\ell k} = \delta_k \;\longleftrightarrow\; D_{2^\ell} A^{(\ell)} U_{2^\ell} = I
ZT:           \longleftrightarrow\; \sum_{k=0}^{2^\ell-1} A^{(\ell)}(W^{k}_{2^\ell} z) = 2^\ell
DTFT:         \longleftrightarrow\; \sum_{k=0}^{2^\ell-1} A^{(\ell)}(W^{k}_{2^\ell} e^{j\omega}) = 2^\ell    (3.11c)
Deterministic Crosscorrelation of Different Bandpass Filters Again without loss of generality, we assume that ℓ < j.
Matrix View:  \langle h^{(\ell)}_{n-2^\ell k}, h^{(j)}_{n-2^j k} \rangle = c^{(\ell,j)}_{2^\ell k} = 0 \;\longleftrightarrow\; D_{2^j} C^{(\ell,j)} U_{2^\ell} = 0
ZT:           \longleftrightarrow\; \sum_{k=0}^{2^\ell-1} C^{(\ell,j)}(W^{k}_{2^\ell} z) = 0
DTFT:         \longleftrightarrow\; \sum_{k=0}^{2^\ell-1} C^{(\ell,j)}(W^{k}_{2^\ell} e^{j\omega}) = 0    (3.11d)
Orthogonal Projection Property of Bandpass Channels The lowpass channel computes a projection onto a space of coarse sequences spanned by g^{(J)} and its shifts by 2^J. Similarly, each bandpass channel computes a projection onto a space of detail sequences spanned by each of h^{(ℓ)} and its shifts by 2^ℓ, for ℓ = 1, 2, . . . , J. That is, we have J bandpass projection operators, computing bandpass projections:
x_{W^{(\ell)}} = P_{W^{(\ell)}}\, x = H^{(\ell)} U_{2^\ell} D_{2^\ell} (H^{(\ell)})^T x,    (3.12)
for ℓ = 1, 2, . . . , J. That P_{W^{(\ell)}} is an orthogonal projection operator is easy to show; follow the same path as for the lowpass filter. Each bandpass space is given by
W^{(\ell)} = \operatorname{span}(\{h^{(\ell)}_{n-2^\ell k}\}_{k\in\mathbb{Z}}).    (3.13)

3.2.3 Relationship between Lowpass and Bandpass Channels
The only condition left to show, for the lowpass impulse response and its shifts by 2^J together with all the bandpass impulse responses and their appropriate shifts to form an orthonormal set, is the orthogonality of the lowpass and bandpass sequences. Since the proofs follow the same path as before, we again leave them for Exercise ??.
Orthogonality of the Lowpass and Bandpass Filters
Matrix View:  \langle g^{(J)}_{n-2^J k}, h^{(\ell)}_{n-2^\ell k} \rangle = 0 \;\longleftrightarrow\; D_{2^\ell} (H^{(\ell)})^T G^{(J)} U_{2^J} = 0
ZT:           \longleftrightarrow\; \sum_{k=0}^{2^\ell-1} G^{(J)}(W^{k}_{2^\ell} z)\, H^{(\ell)}(W^{-k}_{2^\ell} z^{-1}) = 0
DTFT:         \longleftrightarrow\; \sum_{k=0}^{2^\ell-1} G^{(J)}(W^{k}_{2^\ell} e^{j\omega})\, H^{(\ell)}(W^{k}_{2^\ell} e^{-j\omega}) = 0    (3.14a)
Deterministic Crosscorrelation of the Lowpass and Bandpass Filters
Matrix View:  \langle g^{(J)}_{n-2^J k}, h^{(\ell)}_{n-2^\ell k} \rangle = c^{(J,\ell)}_{2^\ell k} = 0 \;\longleftrightarrow\; D_{2^\ell} C^{(J,\ell)} U_{2^J} = 0
ZT:           \longleftrightarrow\; \sum_{k=0}^{2^\ell-1} C^{(J,\ell)}(W^{k}_{2^\ell} z) = 0
DTFT:         \longleftrightarrow\; \sum_{k=0}^{2^\ell-1} C^{(J,\ell)}(W^{k}_{2^\ell} e^{j\omega}) = 0    (3.14b)
3.3 Orthogonal Discrete Wavelet Transform
Following our introductory Haar example, it is now quite clear what the DWT does: it produces coarse projection coefficients α^{(J)}, together with a sequence of ever finer detail projection (wavelet) coefficients β^{(1)}, β^{(2)}, . . ., β^{(J)}, using a J-level octave-band filter bank as a vehicle. As we have seen in that simple example, the original space is split into a sequence of subspaces, each having a spectrum half the size of the previous one (octave-band decomposition). Such a decomposition is appropriate for smooth sequences with isolated discontinuities (natural images are one example of such signals; there is evidence that the human visual system processes visual information in exactly such a manner).
3.3.1 Definition of the Orthogonal DWT
We are now ready to formally define the orthogonal DWT:

Definition 3.1 (Orthogonal DWT) The J-level orthogonal DWT of a sequence x is a function of ℓ ∈ {1, 2, . . . , J} given by
\alpha^{(J)}_k = \langle x_n, g^{(J)}_{n-2^J k} \rangle_n = \sum_{n\in\mathbb{Z}} x_n\, g^{(J)}_{n-2^J k}, \qquad k \in \mathbb{Z},    (3.15a)
\beta^{(\ell)}_k = \langle x_n, h^{(\ell)}_{n-2^\ell k} \rangle_n = \sum_{n\in\mathbb{Z}} x_n\, h^{(\ell)}_{n-2^\ell k}, \qquad k \in \mathbb{Z},\ \ell \in \{1, 2, \ldots, J\}.    (3.15b)
The inverse DWT is given by
x_n = \sum_{k\in\mathbb{Z}} \alpha^{(J)}_k\, g^{(J)}_{n-2^J k} + \sum_{\ell=1}^{J} \sum_{k\in\mathbb{Z}} \beta^{(\ell)}_k\, h^{(\ell)}_{n-2^\ell k}.    (3.15c)
In the above, the α^{(J)} are the scaling coefficients and the β^{(ℓ)} are the wavelet coefficients.
The equivalent filter g (J) is often called the scaling sequence and h(ℓ) wavelets
(wavelet sequences), ℓ = 1, 2, . . . , J; they are given in (3.5a) and (3.9), respectively,
and satisfy (3.6a)–(3.6b), (3.11a)–(3.11c), as well as (3.14a)–(3.14b).
To denote such a DWT pair, we write:
x_n \;\overset{\text{DWT}}{\longleftrightarrow}\; \alpha^{(J)}_k,\ \beta^{(J)}_k,\ \beta^{(J-1)}_k,\ \ldots,\ \beta^{(1)}_k.
The orthogonal DWT is implemented using a J-level octave-band orthogonal
filter bank as in Figure 3.1. This particular version of the DWT is called the dyadic
DWT as each subsequent channel has half of the coefficients of the previous one.
Various generalizations are possible; for example, Solved Exercise ?? considers the
DWT obtained from a 3-channel filter bank.
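For illustration only, here is a minimal NumPy sketch of Definition 3.1 (ours, not the authors' implementation): it iterates a single two-channel analysis step on the lowpass output, uses circular indexing instead of a proper boundary treatment, and is written for the Haar filters, for which this shortcut is exact.

```python
import numpy as np

def analyze(x, g, h):
    # One level: alpha_k = <x_n, g_{n-2k}>, beta_k = <x_n, h_{n-2k}> (circular boundary).
    N = len(x)
    alpha, beta = np.zeros(N // 2), np.zeros(N // 2)
    for k in range(N // 2):
        idx = (np.arange(len(g)) + 2 * k) % N
        alpha[k] = x[idx] @ g
        beta[k] = x[idx] @ h
    return alpha, beta

def dwt(x, g, h, J):
    betas = []
    for _ in range(J):                 # keep splitting the lowpass (scaling) channel
        x, beta = analyze(x, g, h)
        betas.append(beta)
    return x, betas                    # alpha^(J) and beta^(1), ..., beta^(J)

g = np.array([1.0, 1.0]) / np.sqrt(2)   # Haar lowpass
h = np.array([1.0, -1.0]) / np.sqrt(2)  # Haar highpass
x = np.random.randn(64)
alpha, betas = dwt(x, g, h, J=3)
# Parseval's equality (3.18): the transform preserves energy.
print(np.isclose(np.sum(x**2), np.sum(alpha**2) + sum(np.sum(b**2) for b in betas)))
```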
3.3.2 Properties of the Orthogonal DWT
Some properties of the DWT are rather obvious (such as linearity), while others are
more involved (such as shift in time). We now list and study a few of these.
DWT properties

Linearity:            a\,x_n + b\,y_n \;\longleftrightarrow\; a\,\{\alpha^{(J)}_{x,k}, \beta^{(J)}_{x,k}, \ldots, \beta^{(1)}_{x,k}\} + b\,\{\alpha^{(J)}_{y,k}, \beta^{(J)}_{y,k}, \ldots, \beta^{(1)}_{y,k}\}
Shift in time:        x_{n-2^J n_0} \;\longleftrightarrow\; \alpha^{(J)}_{k-n_0},\ \beta^{(J)}_{k-n_0},\ \beta^{(J-1)}_{k-2n_0},\ \ldots,\ \beta^{(1)}_{k-2^{J-1} n_0}
Parseval's equality:  \|x\|^2 = \sum_{n\in\mathbb{Z}} |x_n|^2 = \|\alpha^{(J)}\|^2 + \sum_{\ell=1}^{J} \|\beta^{(\ell)}\|^2

Table 3.2: Properties of the DWT.
Linearity The DWT operator is a linear operator, or,
a\,x_n + b\,y_n \;\overset{\text{DWT}}{\longleftrightarrow}\; a\,\{\alpha^{(J)}_{x,k}, \beta^{(J)}_{x,k}, \ldots, \beta^{(1)}_{x,k}\} + b\,\{\alpha^{(J)}_{y,k}, \beta^{(J)}_{y,k}, \ldots, \beta^{(1)}_{y,k}\}.    (3.16)
Shift in Time A shift in time by 2^J n_0 results in
x_{n-2^J n_0} \;\overset{\text{DWT}}{\longleftrightarrow}\; \alpha^{(J)}_{k-n_0},\ \beta^{(J)}_{k-n_0},\ \beta^{(J-1)}_{k-2n_0},\ \ldots,\ \beta^{(1)}_{k-2^{J-1} n_0}.    (3.17)
This property shows that the DWT is not shift invariant; it is periodically shift varying with period 2^J.
Parseval's Equality The DWT operator is a unitary operator and thus preserves the Euclidean norm (see (2.53)):
\|x\|^2 = \sum_{n\in\mathbb{Z}} |x_n|^2 = \|\alpha^{(J)}\|^2 + \sum_{\ell=1}^{J} \|\beta^{(\ell)}\|^2.    (3.18)
Projection After our Haar example, it should come as no surprise that a J-level orthogonal DWT projects the input sequence x onto one lowpass space
V^{(J)} = \operatorname{span}(\{g^{(J)}_{n-2^J k}\}_{k\in\mathbb{Z}}),
and J bandpass spaces
W^{(\ell)} = \operatorname{span}(\{h^{(\ell)}_{n-2^\ell k}\}_{k\in\mathbb{Z}}), \qquad \ell = 1, \ldots, J,
where g^{(J)} and h^{(ℓ)} are the equivalent filters given in (3.5a) and (3.9), respectively. The input space ℓ^2(\mathbb{Z}) is split into the following (J + 1) spaces:
\ell^2(\mathbb{Z}) = V^{(J)} \oplus W^{(J)} \oplus W^{(J-1)} \oplus \cdots \oplus W^{(1)}.
Polynomial Approximation As we have done in Section 1.2.5, we now look at polynomial approximation properties of an orthogonal DWT,²³ under the assumption that the lowpass filter g has N ≥ 1 zeros at z = −1, as in (1.46):
G(z) = (1 + z^{-1})^N R(z).
Note that R(z)|_{z=1} cannot be zero because of the orthogonality constraint (1.13). Remember that the highpass filter, being a modulated version of the lowpass, has N zeros at z = 1. In other words, it annihilates polynomials up to degree (N − 1) since it takes an Nth-order difference of the sequence.
In the DWT, each bandpass channel annihilates finitely-supported polynomials of a certain degree, which are therefore carried by the lowpass branch. That is, if x is a polynomial sequence of degree smaller than N, the channel sequences β^{(ℓ)} are all zero, and that polynomial sequence x is projected onto V^{(J)}, the lowpass approximation space:
x_n = n^i = \sum_{k\in\mathbb{Z}} \alpha^{(J)}_k\, g^{(J)}_{n-2^J k}, \qquad 0 \leq i < N,
that is, the equivalent lowpass filter reproduces polynomials up to degree (N − 1). As this is an orthogonal DWT, the scaling coefficients follow from (3.15a),
\alpha^{(J)}_k = \langle x_n, g^{(J)}_{n-2^J k} \rangle_n = \langle n^i, g^{(J)}_{n-2^J k} \rangle_n = \sum_{n\in\mathbb{Z}} n^i\, g^{(J)}_{n-2^J k}.
An example with the 4-tap Daubechies orthogonal filter from (3.10) is given in
Figure 3.10. Part (a) shows the equivalent filter after 6 levels of iteration: J = 6,
G(6) (z) = G(z)G(z 2 )G(z 4 )G(z 8 )G(z 16 )G(z 32 ) and length L(6) = 190 from (3.5b).
Part (b) shows the reproduction of a linear polynomial (over a finite range, ignoring
boundary effects).
In summary, the DWT, when wavelet basis sequences have zero moments, will
have very small inner products with smooth parts of an input sequence (and exactly
zero when the sequence is locally polynomial). This will be one key ingredient in
building successful approximation schemes using the DWT in Chapter 7.
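A quick numerical illustration of this annihilation property (ours, assuming NumPy): with the length-4 Daubechies filter of (3.10), whose highpass has N = 2 zeros at z = 1, the wavelet coefficients of a degree-1 polynomial vanish away from the boundaries.

```python
import numpy as np

g = np.array([1 + np.sqrt(3), 3 + np.sqrt(3), 3 - np.sqrt(3), 1 - np.sqrt(3)]) / (4 * np.sqrt(2))
h = ((-1) ** np.arange(4)) * g[::-1]           # highpass with two zeros at z = 1

n = np.arange(64)
x = 0.5 * n + 3.0                               # polynomial of degree 1 < N = 2

# beta_k = <x_n, h_{n-2k}>, interior coefficients only (mode='valid').
beta = np.correlate(x, h, mode='valid')[::2]
print(np.allclose(beta, 0.0))                   # True: the wavelet channel annihilates it
```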
Characterization of Singularities Due to its localization properties, the DWT has a unique ability to characterize singularities.
Consider a single nonzero sample in the input, x_n = δ_{n−k}. This delayed Kronecker delta sequence excites each equivalent filter's impulse response of length L^{(ℓ)} = (L − 1)(2^ℓ − 1) + 1 at level ℓ (see (3.5b)), which is then downsampled by 2^ℓ (see Figure 3.5 for an illustration with J = 3). Thus, this single nonzero input creates at most (L − 1) nonzero coefficients in each channel. Furthermore, since each equivalent filter is of norm 1 and downsampled by 2^ℓ, the energy resulting at level ℓ is of the order
\|\beta^{(\ell)}\|^2 \sim 2^{-\ell}.
23 Recall that we are dealing with finitely-supported polynomial sequences, ignoring the boundary
issues. If this were not the case, these sequences would not belong to any ℓp space.
Figure 3.10: Polynomial reproduction, exact over a finite range. (a) Equivalent filter's impulse response after six iterations. (b) Reproduction of a linear polynomial (in red) over a finite range. We also show the underlying weighted basis sequences (α^{(6)}_k g^{(6)}_{n-2^6 k}, k = 0, 1, . . . , 5) contributing to the reproduction of the polynomial. While the plots are all discrete, they give the impression of being connected due to point density.
In other words, the energy of the Kronecker delta sequence is roughly spread across the channels according to a geometric distribution. Another way to phrase the above result is to note that as ℓ increases, the coefficients β^{(ℓ)} decay roughly as
\beta^{(\ell)} \sim 2^{-\ell/2},
when the input is an isolated Kronecker delta.
For a piecewise constant sequence, the coefficients behave instead as
\beta^{(\ell)} \sim 2^{\ell/2}.
Thus, two different types of singularities lead to different behaviors of wavelet coefficients across scales. In other words, if we can observe the behavior of wavelet coefficients across scales, we can make an educated guess of the type of singularity present in the input sequence, as we illustrate in Example 3.2. We will study this in more detail in the continuous-time case, in Chapter 6.
Example 3.2 (Characterization of singularities by the Haar DWT) Convolution of the Kronecker delta sequence at position k with the Haar analysis filter h^{(ℓ)}_{−n} in (3.2c) generates 2^ℓ coefficients; downsampling by 2^ℓ then leaves a single coefficient of size 2^{−ℓ/2}.
As an example of a piecewise constant sequence, we use the Heaviside sequence (3.10) delayed by k. A single wavelet coefficient will be different from zero at each scale, the one corresponding to the wavelet that straddles the discontinuity (Figure 3.11). At scale 2^ℓ, this corresponds to the wavelet with support from 2^ℓ⌊k/2^ℓ⌋ to 2^ℓ(⌊k/2^ℓ⌋ + 1). All other wavelet coefficients are zero; on the left of the discontinuity because the sequence is zero, and on the right because the inner product is zero. The magnitude of the nonzero coefficient depends on the location k and varies between 0 and 2^{ℓ/2−1}. When k is a multiple of 2^ℓ, this magnitude is zero, and when k is an odd multiple of 2^{ℓ−1}, it achieves its maximum value. The latter occurs when the discontinuity is aligned with the discontinuity
Figure 3.11: Characterization of singularities by the Haar DWT. (a) A Heaviside sequence at location k, and (b) the equivalent wavelet sequences, the highpass filter h^{(ℓ)} and its shifts, at scale 2^ℓ. A single wavelet, with support from n_1 = 2^ℓ⌊k/2^ℓ⌋ to n_2 = 2^ℓ(⌊k/2^ℓ⌋ + 1), has a nonzero inner product with a magnitude of the order of 2^{ℓ/2}.
of the wavelet itself; then, the inner product is 2^{ℓ−1} \cdot 2^{−ℓ/2} = 2^{(ℓ/2)−1}, and we obtain
\beta^{(\ell)} \sim 2^{(\ell/2)-1}.
We thus see that the magnitudes of the wavelet coefficients will vary, but as ℓ increases, they will increase at most as 2^{ℓ/2}. In Figure 3.12(a), we show an example input sequence consisting of a piecewise constant sequence plus a Kronecker delta sequence, and its DWT in (b). We see that the wavelet coefficients are gathered around the singular points (Kronecker delta, Heaviside step), and they decay or increase, depending on the type of singularity.
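The scaling behavior in this example is easy to reproduce numerically (a sketch of ours, assuming NumPy): the Haar wavelet coefficients of an isolated Kronecker delta decay exactly as 2^{−ℓ/2}, while those of a Heaviside step are of the order 2^{ℓ/2}, with the exact value depending on how the discontinuity aligns with the wavelet.

```python
import numpy as np

def haar_wavelet(l):
    # Equivalent Haar highpass impulse response h^(l) from (3.2d).
    half = 2 ** (l - 1)
    return np.concatenate([np.ones(half), -np.ones(half)]) / 2 ** (l / 2)

N = 512
delta = np.zeros(N); delta[299] = 1.0            # isolated Kronecker delta
step = np.zeros(N); step[299:] = 1.0             # Heaviside sequence

for l in range(1, 6):
    hl = haar_wavelet(l)
    bd = np.correlate(delta, hl, mode='valid')[::2 ** l]   # beta^(l) for the delta
    bs = np.correlate(step, hl, mode='valid')[::2 ** l]    # beta^(l) for the step
    print(l, np.abs(bd).max(), np.abs(bs).max())
    # delta: exactly 2^(-l/2), decaying; step: of the order 2^(l/2), growing overall
```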
In the example above, we obtained precisely β^{(ℓ)} = 2^{(ℓ/2)−1} for the one nonzero wavelet coefficient at scale 2^ℓ. Figure 3.12 gives another example, with a sequence with more types of singularities and a DWT with longer filters. We again have ∼ 2^{−ℓ/2} scaling of wavelet coefficient magnitudes and a roughly constant number of nonzero wavelet coefficients per scale. We will study this effect in more detail in Chapter 7, where the bounds on the coefficient magnitudes will play a large role in quantifying approximation performance.
In summary, the DWT acts as a singularity detector, that is, it leads to nonzero wavelet coefficients around singular points of a sequence. The number of nonzero coefficients per scale is bounded by (L − 1). Moreover, the magnitude of the wavelet coefficients across scales is an indicator of the type of singularity. Together with its
Figure 3.12: A piecewise constant sequence plus a Kronecker delta sequence and its
DWT. (a) The original sequence x. (b)–(e) Wavelet coefficients β (ℓ) at scales 2ℓ , ℓ =
1, 2, 3, 4. (f) Scaling coefficients α(4) . To compare different channels, all the sequences
have been upsampled by a factor 2ℓ , ℓ = 1, 2, 3, 4.
polynomial approximation properties as described previously, this ability to characterize singularities will be the other key ingredient in building successful approximation schemes using the DWT in Chapter 7.
Basis for ℓ2 (Z) An interesting twist on a J-level DWT is what happens when
we let the number of levels J go to infinity. This will be one way we will be
building continuous-time wavelet bases in Chapter 6. For sequences in ℓ2 (Z), such
an infinitely-iterated DWT can actually build an orthonormal basis based on wavelet
sequences (equivalent highpass filters) alone. The energy of the scaling coefficients
vanishes in ℓ2 norm; in other words, the original sequence is entirely captured by
the wavelet coefficients, thus proving Parseval’s equality for such a basis. While
this is true in general, below we prove the result for Haar filters only; the general
proof needs additional technical conditions that are beyond the scope of our text.
Theorem 3.2 (Discrete Haar wavelets as a basis for ℓ^2(\mathbb{Z})) The discrete-time wavelets h^{(ℓ)} with impulse responses as in (3.2d) and their shifts by 2^ℓ,
\Phi = \{h^{(\ell)}_{n-2^\ell k}\}_{k\in\mathbb{Z},\ \ell = 1, 2, \ldots},
form an orthonormal basis for the space of finite-energy sequences, ℓ^2(\mathbb{Z}).
Proof. To prove Φ is an orthonormal basis, we must prove it is an orthonormal set
and that it is complete. The orthonormality of basis functions was shown earlier in
Section 3.1 for Haar filters and in Section 3.2 for the more general ones.
To prove completeness, we will show that Parseval's equality holds, that is, for an arbitrary input x ∈ ℓ^2(\mathbb{Z}), we have
\|x\|^2 = \sum_{\ell=1}^{\infty} \|\beta^{(\ell)}\|^2,    (3.19)
where β^{(ℓ)} are the wavelet coefficients at scales 2^ℓ, ℓ = 1, 2, . . .:
\beta^{(\ell)}_k = \langle x_n, h^{(\ell)}_{n-2^\ell k} \rangle_n.
For any finite number of decomposition levels J, Parseval's equality (3.18) holds. Thus, our task is to show \lim_{J\to\infty} \|\alpha^{(J)}\|^2 = 0. We show this by bounding two quantities: the energy lost in truncating x and the energy in the scaling coefficients that represent the truncated sequence.
Without loss of generality, assume x has unit norm. For any ε > 0, we will show that \|\alpha^{(J)}\|^2 < ε for sufficiently large J. First note that there exists a K such that the restriction of x to \{-2^K, -2^K+1, \ldots, 0, 1, \ldots, 2^K-1\} has energy at least 1 − ε/2; this follows from the convergence of the series defining the ℓ^2 norm of x. Denote the restriction by \tilde{x}.
A K-level decomposition of \tilde{x} has at most two nonzero scaling coefficients \tilde{\alpha}^{(K)}_{-1} and \tilde{\alpha}^{(K)}_{0}. Each of these scaling coefficients satisfies |\tilde{\alpha}^{(K)}_k| \leq 1 because \|\tilde{\alpha}^{(K)}\|^2 \leq \|x\|^2 = 1 by Bessel's inequality. We will now consider further levels of decomposition beyond K. After one more level, the lowpass output \tilde{\alpha}^{(K+1)} has coefficients
\tilde{\alpha}^{(K+1)}_{0} = \tfrac{1}{\sqrt{2}}\, \tilde{\alpha}^{(K)}_{0} \qquad\text{and}\qquad \tilde{\alpha}^{(K+1)}_{-1} = \tfrac{1}{\sqrt{2}}\, \tilde{\alpha}^{(K)}_{-1}.
Similarly, after K + j total levels of decomposition we have
\tilde{\alpha}^{(K+j)}_{k} = \tfrac{1}{2^{j/2}}\, \tilde{\alpha}^{(K)}_{k} \leq \tfrac{1}{2^{j/2}}, \qquad\text{for } k = -1, 0.
Thus, \|\tilde{\alpha}^{(K+j)}\|^2 = |\tilde{\alpha}^{(K+j)}_{-1}|^2 + |\tilde{\alpha}^{(K+j)}_{0}|^2 \leq 2^{-(j-1)}.
Let J = K + j where 2^{-(j-1)} < ε/2. Then \|\alpha^{(J)}\|^2 < ε because \|\tilde{\alpha}^{(J)}\|^2 < ε/2 and \|\alpha^{(J)}\|^2 cannot exceed \|\tilde{\alpha}^{(J)}\|^2 by more than the energy ε/2 excluded in the truncation of x.
3.4 Biorthogonal Discrete Wavelet Transform
We have seen that the properties of the dyadic orthogonal DWT follow from the
properties of the orthogonal two-channel filter bank. Similarly, the properties of
the biorthogonal DWT follow from the properties of the biorthogonal two-channel
filter bank. Instead of fully developing the biorthogonal DWT (as it is parallel to
the orthogonal one), we quickly summarize its salient elements.
3.4.1 Definition of the Biorthogonal DWT
If, instead of an orthogonal pair of highpass/lowpass filters, we use a biorthogonal set \{h, g, \tilde{h}, \tilde{g}\} as in (1.64a)–(1.64d), we obtain two sets of equivalent filters, one for
Figure 3.13: Iteration of a biorthogonal pair of lowpass filters from (3.21): (a) the iteration of g_n leads to a smooth-looking sequence, while (b) the iteration of \tilde{g}_n does not.
the synthesis side, G^{(J)}, H^{(ℓ)}, ℓ = 1, 2, . . . , J, and the other for the analysis side, \tilde{G}^{(J)}, \tilde{H}^{(ℓ)}, ℓ = 1, 2, . . . , J:
G^{(J)}(z) = \prod_{k=0}^{J-1} G(z^{2^k}), \qquad H^{(\ell)}(z) = H(z^{2^{\ell-1}})\, G^{(\ell-1)}(z),    (3.20a)
\tilde{G}^{(J)}(z) = \prod_{k=0}^{J-1} \tilde{G}(z^{2^k}), \qquad \tilde{H}^{(\ell)}(z) = \tilde{H}(z^{2^{\ell-1}})\, \tilde{G}^{(\ell-1)}(z),    (3.20b)
for ℓ = 1, . . . , J. This iterated product will play a crucial role in the construction of continuous-time wavelet bases in Chapter 6, as g^{(J)} and \tilde{g}^{(J)} can exhibit quite different behaviors; we illustrate this with an example.
Example 3.3 (Biorthogonal DWT) In Example 1.4, we derived a biorthogonal pair with lowpass filters
g_n = \begin{bmatrix} \ldots & 0 & 1 & 3 & 3 & 1 & 0 & \ldots \end{bmatrix},    (3.21a)
\tilde{g}_n = \frac{1}{4} \begin{bmatrix} \ldots & 0 & -1 & 3 & 3 & -1 & 0 & \ldots \end{bmatrix}.    (3.21b)
Figure 3.13 shows the first few iterations of g^{(J)} and \tilde{g}^{(J)}, indicating a very different behavior. Recall that both filters are lowpass filters, as are their iterated
versions. However, the iteration of \tilde{g} does not look smooth, indicating possible problems as we iterate to infinity (as we will see in Chapter 6).
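The contrast in Figure 3.13 can be reproduced with a few lines (ours, assuming NumPy; the helper names are hypothetical): iterating g produces a smooth-looking bump, while iterating g̃ does not.

```python
import numpy as np

def upsample(h, m):
    out = np.zeros(m * len(h) - (m - 1))
    out[::m] = h
    return out

def iterate(g, J):
    # g^(J): impulse response of G(z) G(z^2) ... G(z^(2^(J-1))).
    gJ = np.array([1.0])
    for l in range(J):
        gJ = np.convolve(gJ, upsample(g, 2 ** l))
    return gJ

g  = np.array([1.0, 3.0, 3.0, 1.0])              # (3.21a), up to normalization
gt = np.array([-1.0, 3.0, 3.0, -1.0]) / 4        # (3.21b)

g4, gt4 = iterate(g, 4), iterate(gt, 4)
# Normalized for display: g4 traces out a smooth bump, while gt4 stays jagged.
# Plotting both (e.g., with matplotlib) reproduces the contrast of Figure 3.13.
print(np.round(g4 / np.abs(g4).max(), 2))
print(np.round(gt4 / np.abs(gt4).max(), 2))
```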
Similarly to properties that equivalent filters satisfy in an orthogonal DWT
(Section 3.2), we have such properties here mimicking the biorthogonal relations
from Section 1.4; their formulation and proofs are left as an exercise to the reader.
Definition 3.3 (Biorthogonal DWT) The J-level biorthogonal DWT of a sequence x is a function of ℓ ∈ {1, 2, . . . , J} given by
\alpha^{(J)}_k = \langle x_n, \tilde{g}^{(J)}_{n-2^J k} \rangle_n, \qquad \beta^{(\ell)}_k = \langle x_n, \tilde{h}^{(\ell)}_{n-2^\ell k} \rangle_n, \qquad k, n \in \mathbb{Z},\ \ell \in \{1, 2, \ldots, J\}.    (3.22a)
The inverse DWT is given by
x_n = \sum_{k\in\mathbb{Z}} \alpha^{(J)}_k\, g^{(J)}_{n-2^J k} + \sum_{\ell=1}^{J} \sum_{k\in\mathbb{Z}} \beta^{(\ell)}_k\, h^{(\ell)}_{n-2^\ell k}.    (3.22b)
In the above, the α^{(J)} are the scaling coefficients and the β^{(ℓ)} are the wavelet coefficients.
The equivalent filters g^{(J)}, \tilde{g}^{(J)} are often called the scaling sequences, and h^{(ℓ)}, \tilde{h}^{(ℓ)}, ℓ = 1, 2, . . . , J, wavelets (wavelet sequences).
3.4.2 Properties of the Biorthogonal DWT
Similarly to the orthogonal DWT, the biorthogonal DWT is linear and shift varying. As a biorthogonal expansion, it does not satisfy Parseval's equality. However, as we now have access to dual bases, we can choose which one to use for projection (analysis) and which one for reconstruction (synthesis). This allows us to choose the better-suited one between g and \tilde{g} to induce an expansion with the desired polynomial approximation properties and characterization of singularities.
3.5 Wavelet Packets
So far, the iterated decomposition was always applied to the lowpass filter, and
often, there are good reasons to do so. However, to match a wide range of sequences,
we can consider an arbitrary tree decomposition. In other words, start with a
sequence x and decompose it into a lowpass and a highpass version.24 Then, decide
if the lowpass, the highpass, or both, are decomposed further, and keep going until
a given depth J. The DWT is thus one particular case when only the lowpass
version is repeatedly decomposed. Figure 3.7 depicts some of these decomposition
possibilities.
24 Of
course, there is no reason why one could not split into N channels initially.
For example, the full tree yields a linear division of the spectrum similar to
the local Fourier transform from Chapter 2, while the octave-band tree performs a
J-level DWT expansion. Such arbitrary tree structures were introduced as a family
of orthonormal bases for discrete-time sequences, and are known under the name of
wavelet packets. The potential of wavelet packets lies in the capacity to offer a rich
menu of orthonormal bases, from which the best one can be chosen (best according
to a given criterion). We discuss this in more detail in Chapter 7.
What we do here is define the basis functions and write down the appropriate
orthogonality relations; since the proofs are in principle similar to those for the
DWT, we chose to omit them.
3.5.1 Definition of the Wavelet Packets

Equivalent Channels and Their Properties Denote the equivalent filters by g^{(ℓ)}_{i,n}, i = 0, . . . , 2^ℓ − 1. In other words, g^{(ℓ)}_i is the ith equivalent filter going through one of the possible paths of length ℓ. The ordering is somewhat arbitrary, and we will choose the one corresponding to a full tree with a lowpass in the lower branch of each fork, and start numbering from the bottom.
Example 3.4 (2-level wavelet packet equivalent filters) Let us find all equivalent filters at level 2, or, the filters corresponding to depth-1 and depth-2 trees.
G^{(1)}_0(z) = G_0(z), \qquad G^{(1)}_1(z) = G_1(z),
G^{(2)}_0(z) = G_0(z)\, G_0(z^2), \qquad G^{(2)}_1(z) = G_0(z)\, G_1(z^2),
G^{(2)}_2(z) = G_1(z)\, G_0(z^2), \qquad G^{(2)}_3(z) = G_1(z)\, G_1(z^2).    (3.23)
With the ordering chosen in the above equations for level 2, increasing index does not always correspond to increasing frequency. For ideal filters, G^{(2)}_2(e^{jω}) covers the range [3π/4, π), while G^{(2)}_3(e^{jω}) covers the range [π/2, 3π/4). Besides the identity basis, which corresponds to the no-split situation, we have four possible orthonormal bases (full 2-level split, full 1-level split, full 1-level split plus either lowpass or highpass split).
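As a small check of this bookkeeping (ours, assuming NumPy; Haar filters are used purely as an example), one can form the four level-2 equivalent filters of (3.23) and verify that they are orthonormal. For Haar they have length 4, so, together with their non-overlapping shifts by 4, the zero-shift inner products already settle orthonormality.

```python
import numpy as np

def upsample(h, m):
    out = np.zeros(m * len(h) - (m - 1))
    out[::m] = h
    return out

g0 = np.array([1.0, 1.0]) / np.sqrt(2)    # lowpass G_0(z) (Haar, as an example)
g1 = np.array([1.0, -1.0]) / np.sqrt(2)   # highpass G_1(z)

# Level-2 equivalent filters from (3.23): G_i^(2)(z) = G_a(z) G_b(z^2).
packets = [np.convolve(a, upsample(b, 2)) for a in (g0, g1) for b in (g0, g1)]
# ordering: G_0^(2), G_1^(2) (outer filter G_0), then G_2^(2), G_3^(2) (outer filter G_1)

gram = np.array([[p @ q for q in packets] for p in packets])
print(np.allclose(gram, np.eye(4)))       # True: the four filters are orthonormal
```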
Wavelet Packet Bases Among the myriad of possible bases wavelet packets generate, one can choose the one best fitting the sequence at hand.
Example 3.5 (2-level wavelet packet bases) Continuing Example 3.4, we have a family W = {Φ0, Φ1, Φ2, Φ3, Φ4}, where Φ4 is simply \{\delta_{n-k}\}_{k\in\mathbb{Z}};
\Phi_0 = \{g^{(2)}_{0,n-2^2 k},\; g^{(2)}_{1,n-2^2 k},\; g^{(2)}_{2,n-2^2 k},\; g^{(2)}_{3,n-2^2 k}\}_{k\in\mathbb{Z}}
corresponds to the full tree;
\Phi_1 = \{g^{(1)}_{1,n-2k},\; g^{(2)}_{1,n-2^2 k},\; g^{(2)}_{0,n-2^2 k}\}_{k\in\mathbb{Z}}
corresponds to the DWT tree;
\Phi_2 = \{g^{(1)}_{0,n-2k},\; g^{(2)}_{2,n-2^2 k},\; g^{(2)}_{3,n-2^2 k}\}_{k\in\mathbb{Z}}
corresponds to the tree with the highpass split twice; and,
\Phi_3 = \{g^{(1)}_{0,n-2k},\; g^{(1)}_{1,n-2k}\}_{k\in\mathbb{Z}}
corresponds to the usual two-channel filter bank basis.

In general, we will have Fourier-like bases, given by
\Phi_0 = \{g^{(J)}_{0,n-2^J k},\; \ldots,\; g^{(J)}_{2^J-1,n-2^J k}\}_{k\in\mathbb{Z}},    (3.24)
and wavelet-like bases, given by
\Phi_1 = \{g^{(1)}_{1,n-2k},\; g^{(2)}_{1,n-2^2 k},\; \ldots,\; g^{(J)}_{1,n-2^J k},\; g^{(J)}_{0,n-2^J k}\}_{k\in\mathbb{Z}}.    (3.25)
That these are all bases follows trivially from each building block being a basis (either orthonormal or biorthogonal).
3.5.2 Properties of the Wavelet Packets
Exercises at the end of this chapter discuss various forms and properties of wavelet
packets: biorthogonal wavelet packets in Exercise ?? and arbitrary wavelet packets
in Exercise ??.
Number of Wavelet Packets How many wavelet packets (different trees) are there? Call N^{(J)} the number of trees of depth J; then we have the recursion
N^{(J)} = (N^{(J-1)})^2 + 1,    (3.26)
since each branch of the initial two-channel filter bank can have N^{(J−1)} possible trees attached to it, and the +1 comes from not splitting at all. As an initial condition, we have N^{(1)} = 2 (either no split or a single split). It can be shown that the recursion leads to an order of
N^{(J)} \sim 2^{2^J}    (3.27)
possible trees. Of course, many of these trees will be poor matches to real-life sequences, but an efficient search algorithm that finds the best match between a given sequence and a tree-structured expansion is possible. The proof of (3.26) is left as Exercise ??.
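The recursion (3.26) is easy to evaluate (a tiny sketch of ours, in Python), and the first few values already show the doubly-exponential growth.

```python
def num_wavelet_packet_trees(J):
    # Recursion (3.26): N^(J) = (N^(J-1))^2 + 1, with N^(1) = 2.
    n = 2
    for _ in range(J - 1):
        n = n * n + 1
    return n

print([num_wavelet_packet_trees(J) for J in range(1, 6)])
# [2, 5, 26, 677, 458330]: the number of trees explodes with the depth J
```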
3.6 Computational Aspects
We now consider the computational complexity of the DWT, and show an elementary but astonishing result, in large part responsible for the popularity of the DWT:
the complexity is linear in the input size.
Complexity of the DWT Computing a DWT amounts to computing a set of convolutions but with a twist crucial to the computational efficiency of the transform;
as the decomposition progresses down the tree (see, for example, Figure 3.1), the
sampling rate decreases. The implementation of a J-level DWT for a length-N signal with a filter bank is equivalent to a factorization
\begin{bmatrix} I_{(1-2^{-J+1})N} & 0 \\ 0 & \begin{bmatrix} H^{(J)} \\ G^{(J)} \end{bmatrix} \end{bmatrix} \cdots \begin{bmatrix} I_{3N/4} & 0 \\ 0 & \begin{bmatrix} H^{(3)} \\ G^{(3)} \end{bmatrix} \end{bmatrix} \begin{bmatrix} I_{N/2} & 0 \\ 0 & \begin{bmatrix} H^{(2)} \\ G^{(2)} \end{bmatrix} \end{bmatrix} \begin{bmatrix} H^{(1)} \\ G^{(1)} \end{bmatrix},
where the H^{(ℓ)}'s and G^{(ℓ)}'s are the highpass and lowpass operators, respectively, each with downsampling by two (see (3.199) and (3.202) for ℓ = 1), both sparse.
In the DWT tree, the second level has a similar cost to the first, but at half the sampling rate (see (3.276)). Continuing this argument, the cost of the J-level DWT is
L + \frac{L}{2} + \frac{L}{4} + \cdots + \frac{L}{2^{J-1}} < 2L
in both multiplications and additions per input sample, so that the total cost is of the order of at most
C_{\mathrm{DWT}} \sim 2NL \sim O(N),    (3.28)
that is, it is linear in the input size with a constant depending on the filter length.
While the cost remains bounded, the delay does not. If the first block contributes a delay D, the second will produce a delay 2D and the ℓth block a delay 2^{ℓ−1}D, for a total delay of
D_{\mathrm{DWT}} = D + 2D + 2^2 D + \cdots + 2^{J-1} D = (2^J - 1)D.
This large delay is a serious drawback, especially for real-time applications such as
speech coding.
Complexity of the General Wavelet Packets What happens for more general trees? Clearly, the worst case is the full tree (see Figure 3.7(e)).
We start with a naive, direct implementation of a 2^J-channel filter bank, downsampled by 2^J. Recall that, according to (3.5b), the lengths of the equivalent filters (in the DWT or the full tree) are of the order O(L\,2^J). Computing each filter output and downsampling by 2^J leads to L operations per channel per input sample, or, for 2^J channels,
C_{\mathrm{direct}} \sim N L\, 2^J \sim O(N L\, 2^J),    (3.29)
which grows exponentially with J. Exercise ?? compares these two implementations of the full tree with two Fourier-based ones, showing gains in computational cost. If instead the full tree is computed level by level, then, as the sampling rate goes down, the number of channels goes up, and the two effects cancel each other. Therefore, for a J-level full tree, the cost amounts to
C_{\mathrm{full}} \sim N L J \sim O(N L J)    (3.30)
multiplications or additions, again for a length-N sequence and length-L filter.
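The operation counts above can be tabulated with a couple of helper functions (ours, a rough per-input-sample count that ignores constants and boundary effects).

```python
def dwt_cost_per_sample(L, J):
    # L + L/2 + ... + L/2^(J-1): each further level runs at half the previous rate.
    return sum(L / 2 ** l for l in range(J))

def full_tree_cost_per_sample(L, J):
    # In the full wavelet-packet tree every level costs about L per input sample.
    return L * J

for J in (1, 3, 5, 10):
    print(J, dwt_cost_per_sample(4, J), full_tree_cost_per_sample(4, J))
# The DWT cost stays below 2L = 8 per input sample; the full tree grows like L*J.
```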
Chapter at a Glance
The goal of this chapter was twofold: (1) to extend the discussion from Chapters 1 and 2 to more general multichannel filter banks constructed as trees, and their associated bases; and (2) to consider those filter banks implementing the DWT and wavelet packets.
While, in general, tree-structured filter banks can have general N-channel filter banks as their building blocks, here we concentrated mostly on those built using basic two-channel filter banks. Moreover, the bulk of the chapter was devoted to octave-band tree-structured filter banks, where only the lowpass channel is further decomposed, as they
implement the DWT. Such a decomposition is a natural one, with both theoretical and
experimental evidence to support its use. Experimentally, research shows that the human
visual system decomposes the field of view into octave bands; in parallel, theoretically, the
DWT is an appropriate tool for the analysis of smooth sequences with isolated discontinuities. Moreover, the DWT has interesting polynomial approximation powers as well as
the ability to characterize singularities.
Wavelet packets extend these ideas to more general tilings of the time-frequency
plane, adapting the decomposition to the sequence at hand. Here we discussed the decompositions only; the criteria for which decomposition to use are left for later.
Block diagram: Tree structure — a J-stage cascade of two-channel filter banks: the analysis side filters with h_{−n} and g_{−n} and downsamples by 2 at each stage, splitting only the lowpass branch further, producing β^{(1)}, β^{(2)}, . . . , β^{(J)} and α^{(J)}; the synthesis side mirrors it with upsampling by 2 and filters h_n, g_n to recover x.
Block diagram: Multichannel structure — the equivalent multichannel filter bank: analysis filters h^{(1)}_{−n}, h^{(2)}_{−n}, . . . , h^{(J)}_{−n} and g^{(J)}_{−n} with downsampling by 2, 2^2, . . . , 2^J, 2^J, producing β^{(1)}, β^{(2)}, . . . , β^{(J)} and α^{(J)}; synthesis uses the corresponding upsamplers and filters h^{(ℓ)}_n, g^{(J)}_n to recover x.
Basic characteristics
    number of channels       M = J + 1
    sampling at level ℓ      N^{(ℓ)} = 2^ℓ
    channel sequences        α_n^{(J)}, β_n^{(ℓ)},  ℓ = 1, 2, . . . , J

Filters                      Synthesis                          Analysis
                             lowpass        bandpass^{(ℓ)}      lowpass        bandpass^{(ℓ)}
    orthogonal               g_n^{(J)}      h_n^{(ℓ)}           g_{−n}^{(J)}   h_{−n}^{(ℓ)}
    biorthogonal             g_n^{(J)}      h_n^{(ℓ)}           g̃_n^{(J)}      h̃_n^{(ℓ)}
    polyphase component j    g_{j,n}^{(J)}  h_{j,n}^{(ℓ)}       g̃_{j,n}^{(J)}  h̃_{j,n}^{(ℓ)}
                             ℓ = 1, 2, . . . , J;  j = 0, 1, . . . , 2^ℓ − 1

Table 3.3: DWT filter bank.
Historical Remarks
The tree-structured filter banks, and, in particular, octave-band,
or, constant-Q, ones, have been used in speech and audio. A
similar scheme, but redundant, was proposed as a pyramid coding technique by Burt and Adelson [15], and served as the initial link between the discrete-time setting and the works in the
continuous-time setting of Daubechies [29] and Mallat [59]. This
prompted a flurry of connections between the wavelet transform,
filter banks, and subband coding schemes, for example, the biorthogonal bases by Herley
and Vetterli [44]. It, moreover, further opened the door to a formal treatment and definition of the DWT as a purely discrete transform, and not only as a vehicle for implementing
continuous-time ones. Rioul [72] rigorously defined the discrete multiresolution analysis,
and Coifman, Meyer, Quake and Wickerhauser [24] proposed wavelet packets as an adaptive tool for signal analysis. As a result of these developments and its low computational
complexity, the DWT and its variations found their way into numerous applications and
standards, JPEG 2000 among others.
Further Reading
Books and Textbooks Numerous books cover the topic of the DWT, such as those by
Vetterli and Kovačević [106], Strang and Nguyen [87] and Mallat [60], among others.
Wavelet Packets Wavelet packets were introduced in 1991 by Coifman, Meyer, Quake and Wickerhauser [24], followed by the widely cited [112] and [113]. In 1993, Ramchandran and Vetterli were the first to use a different cost measure for pruning wavelet packets, rate-distortion, the one suitable for compression [69]; Saito and Coifman extended the idea further with local discriminant bases [74, 75].
Chapter 4
Local Fourier and Wavelet
Frames on Sequences
Contents
    4.1  Introduction                  122
    4.2  Finite-Dimensional Frames     133
    4.3  Oversampled Filter Banks      151
    4.4  Local Fourier Frames          158
    4.5  Wavelet Frames                165
    4.6  Computational Aspects         171
    Chapter at a Glance                173
    Historical Remarks                 174
    Further Reading                    174
Redundancy is a common tool in our daily lives; it helps remove doubt or
uncertainty. Redundant signal representations follow the same idea to create robustness. Given a sequence, we often represent it in another domain where its
characteristics are more readily apparent in the expansion coefficients. If the representation in that other domain is achieved via a basis, corruption or loss of expansion
coefficients can be serious. If, on the other hand, that representation is achieved
via a redundant representation, such problems can be avoided.
As introduced in Chapter 2, Section 2.5.4, the redundant counterpart of bases
are called frames. Frames are the topic of the present chapter. The building blocks
of a representation can be seen as words in a dictionary; while a basis uses a minimum number of such words, a frame uses an overcomplete set. This is similar to
multiple words with slight variations for similar concepts, allowing for very short
sentences describing complex ideas.25 While in most of the previous chapters, our
emphasis was on finding the best expansion/representation vectors (Fourier, wavelet,
etc.), frames allow us even more freedom; not only do we look for the best expansion, we can also look for the best expansion coefficients given a fixed expansion
and under desired constraints (sparsity as one example). This freedom is due to
²⁵ As the urban legend goes, Eskimos have a hundred words for snow.
the fact that in a redundant dictionary, the expansion coefficients are not unique,
while in a basis they are.
We briefly introduced frames in R2 in Chapter 2, Section 2.1. We followed this
by a more formal discussion in Section 2.5.4. Our goal in this chapter is to explore
the potential of such overcomplete (redundant) representations in specific settings;
in particular, we study fundamental properties of frames, both in finite dimensions
as well as in ℓ2 (Z). We look into designing frames with some structure, especially
those implementable by oversampled filter banks, and more specifically those with
Fourier-like or wavelet-like time-frequency behavior. We end the chapter with the
discussion of computational aspects related to frame expansions.
Notation used in this chapter: We consider both real-coefficient and complex-coefficient frames here, unlike, for example, in Chapter 1. When Hermitian conjugation is applied to polyphase matrices, it is applied only to the coefficients and not to z.
4.1
Introduction
Redundant sets of vectors look like an overcomplete basis,26 and we call such sets
frames. Thus, a frame is an extension of a basis, where, for a given space, more
vectors than necessary are used to obtain an expansion with desirable properties. In
this section, we use the two frame examples from Section 2.1 to introduce and discuss
frame concepts in a simple setting but in more detail. For ease of presentation, we
will repeat pertinent equations as well as figures.
A Tight Frame for R2
Our first example is that from (2.15), a set of three vectors in R²,

$$\varphi_0 = \begin{bmatrix} \sqrt{2/3} \\ 0 \end{bmatrix}, \qquad \varphi_1 = \begin{bmatrix} -1/\sqrt{6} \\ 1/\sqrt{2} \end{bmatrix}, \qquad \varphi_2 = \begin{bmatrix} -1/\sqrt{6} \\ -1/\sqrt{2} \end{bmatrix}. \qquad (4.1)$$
Expansion These vectors clearly span R2 since any two of them do (see Figure 4.1).
How do we represent a vector x ∈ R2 as a linear combination of {ϕi }i=0,1,2 ,
$$x = \sum_{i=0}^{2} \alpha_i\,\varphi_i\,? \qquad (4.2)$$
Let us gather these vectors into a frame matrix Φ,

$$\Phi = \begin{bmatrix}\varphi_0 & \varphi_1 & \varphi_2\end{bmatrix} = \begin{bmatrix} \sqrt{2/3} & -1/\sqrt{6} & -1/\sqrt{6} \\ 0 & 1/\sqrt{2} & -1/\sqrt{2} \end{bmatrix}. \qquad (4.3)$$

²⁶ Even though this term is a contradiction.
Figure 4.1: The three vectors {ϕ0, ϕ1, ϕ2} from (4.1) form a frame for R² (the same as Figure 2.4(b)).
To compute the expansion coefficients αi in (4.2), we need a right inverse of Φ. Since
Φ is rectangular, such an inverse is not unique, so we look for the simplest one. As
the rows of Φ are orthonormal, Φ ΦT = I2 , and thus, a possible right inverse of Φ
is just its transpose:

$$\Phi^T = \begin{bmatrix} \sqrt{2/3} & 0 \\ -1/\sqrt{6} & 1/\sqrt{2} \\ -1/\sqrt{6} & -1/\sqrt{2} \end{bmatrix} = \begin{bmatrix} \varphi_0^T \\ \varphi_1^T \\ \varphi_2^T \end{bmatrix}. \qquad (4.4)$$
Gathering the expansion coefficients into a vector α,

$$\alpha = \Phi^T x, \qquad (4.5)$$

and, using the fact that Φα = ΦΦ^T x = x, we obtain the following expansion formula:

$$x = \sum_{i=0}^{2} \langle x, \varphi_i\rangle\,\varphi_i, \qquad (4.6)$$

for all x ∈ R², which looks exactly like the usual orthonormal expansion (2.91a), except for the number of vectors involved.
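As a numerical companion (an illustration, not part of the text), the following NumPy sketch verifies (4.5)–(4.7) and the Parseval-like energy equality for the frame (4.3); the test vector x is arbitrary.

```python
import numpy as np

# The tight frame of (4.1)/(4.3).
phi = np.array([[np.sqrt(2/3), -1/np.sqrt(6), -1/np.sqrt(6)],
                [0.0,           1/np.sqrt(2), -1/np.sqrt(2)]])

x = np.array([0.7, -1.3])                 # arbitrary test vector
alpha = phi.T @ x                         # analysis: alpha_i = <x, phi_i>   (4.5)
x_hat = phi @ alpha                       # synthesis: sum_i alpha_i phi_i   (4.6)

print(np.allclose(phi @ phi.T, np.eye(2)))   # rows orthonormal              (4.7)
print(np.allclose(x_hat, x))                 # perfect reconstruction
print(np.isclose(alpha @ alpha, x @ x))      # Parseval-like energy conservation
```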
Geometry of the Expansion Let us understand why this is so. The rows of Φ are orthonormal,

$$\Phi\Phi^T = I_2. \qquad (4.7)$$
Actually, Φ can be seen as two rows of a unitary matrix whose third row is orthogonal to the rows of Φ. We call that third row Φ⊥, that is, Φ⊥ = (1/√3)[1  1  1]. That unitary matrix is then

$$\begin{bmatrix} \Phi \\ \Phi^\perp \end{bmatrix} = \begin{bmatrix} \sqrt{2/3} & -1/\sqrt{6} & -1/\sqrt{6} \\ 0 & 1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{3} & 1/\sqrt{3} & 1/\sqrt{3} \end{bmatrix}. \qquad (4.8)$$
We can then write

$$\Phi\Phi^T = I_{2\times2}, \qquad (4.9a)$$
$$\Phi(\Phi^\perp)^T = 0_{2\times1}, \qquad (4.9b)$$
$$\Phi^\perp(\Phi^\perp)^T = I_{1\times1} = 1. \qquad (4.9c)$$
Calling S the subspace of R³ spanned by the columns of Φ^T, and S⊥ its orthogonal complement in R³ (spanned by the one column of (Φ⊥)^T), we can write

$$S = \mathrm{span}\,(\Phi^T) = \mathrm{span}\left\{\begin{bmatrix} \sqrt{2/3} \\ -1/\sqrt{6} \\ -1/\sqrt{6} \end{bmatrix}, \begin{bmatrix} 0 \\ 1/\sqrt{2} \\ -1/\sqrt{2} \end{bmatrix}\right\}, \qquad S^\perp = \mathrm{span}\,\left((\Phi^\perp)^T\right) = \mathrm{span}\left\{\begin{bmatrix} 1/\sqrt{3} \\ 1/\sqrt{3} \\ 1/\sqrt{3} \end{bmatrix}\right\},$$
$$\mathbb{R}^3 = S \oplus S^\perp.$$
We just saw that the rows of Φ are orthonormal; moreover, while not of unit norm, the columns of Φ are of the same norm, ‖ϕi‖ = √(2/3). Therefore, Φ is a very special matrix, as can be guessed by looking at Figure 4.1.
Let us now understand the nature of the expansion coefficients α a bit more in depth. Obviously, α cannot be arbitrary; since α = Φ^T x, it belongs to the range of the columns of Φ^T, or, α ∈ S. What about some arbitrary α′ ∈ R³? As the expansion coefficients must belong to S, we can calculate the orthogonal projection of α′ onto S by first calculating some x′ = Φα′, and then computing the unique orthogonal projection we call α as

$$\alpha = \Phi^T x' = \Phi^T\Phi\,\alpha',$$
where G = Φ^TΦ is the Gram matrix from (2.118),

$$G = \Phi^T\Phi \stackrel{(a)}{=} \frac{1}{3}\begin{bmatrix} 2 & -1 & -1 \\ -1 & 2 & -1 \\ -1 & -1 & 2 \end{bmatrix}, \qquad (4.10)$$
and (a) follows from (4.3) and (4.4). We can therefore express any α′ ∈ R³ as

$$\alpha' = \alpha + \alpha^\perp \qquad (4.11a)$$
with α ∈ S and α⊥ ∈ S⊥, and thus

$$\langle\alpha, \alpha^\perp\rangle = 0. \qquad (4.11b)$$
Therefore, we see that, for our Φ, many different α′ are possible as expansion
coefficients. In other words, ΦT is not the only possible right inverse of Φ. While
this is not surprising, it allows frames to be extremely flexible, a fact we will explore
in detail in the next section. Throughout this chapter, when we write α′ , we will
mean any vector of expansion coefficients; in contrast, α will be the unique one
obtained using the canonical (unique) dual frame.
Energy of Expansion Coefficients For orthonormal bases, Parseval’s equality (energy conservation) (2.93a) is fundamental. To find out what happens here, we
compute the norm of α,
$$\|\alpha\|^2 = \alpha^T\alpha \stackrel{(a)}{=} x^T\Phi\Phi^T x \stackrel{(b)}{=} x^T x = \|x\|^2, \qquad (4.12)$$
where (a) follows from (4.5); and (b) from (4.7), again formally the same as for an
orthonormal basis. Beware though that the comparison is not entirely fair, as the
frame vectors are not of unit norm; we will see in a moment what happens when
this is the case.
Robustness to Corruption and Loss What does the redundancy of this expansion
buy us? For example, what if the expansion coefficients get corrupted by noise?
Assume, for instance, that α is perturbed by noise η ′ , where the noise components
ηi′ are uncorrelated with kη ′ k = 1. Then, reconstruction will project the noise
η ′ = η + η ⊥ , and thus cancel that part of η not in S:
$$y = \Phi(\alpha' + \eta') = x + \underbrace{\Phi\eta}_{x_\eta} + \underbrace{\Phi\eta^\perp}_{0}.$$
To compute ‖x_η‖², we write

$$\|x_\eta\|^2 = x_\eta^T x_\eta = \eta'^T\Phi^T\Phi\,\eta' = \eta'^T U\Sigma U^T\eta' \quad (\text{since } \Phi\eta = \Phi\eta'),$$

where we have performed a singular value decomposition (2.222) on G = Φ^TΦ as

$$G = \Phi^T\Phi = U\Sigma U^T = \begin{bmatrix} -\frac{1}{\sqrt2} & -\frac{1}{\sqrt6} & \frac{1}{\sqrt3} \\ 0 & \sqrt{\frac23} & \frac{1}{\sqrt3} \\ \frac{1}{\sqrt2} & -\frac{1}{\sqrt6} & \frac{1}{\sqrt3} \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} -\frac{1}{\sqrt2} & 0 & \frac{1}{\sqrt2} \\ -\frac{1}{\sqrt6} & \sqrt{\frac23} & -\frac{1}{\sqrt6} \\ \frac{1}{\sqrt3} & \frac{1}{\sqrt3} & \frac{1}{\sqrt3} \end{bmatrix},$$

and U is a unitary matrix. Since U is unitary, ‖U^Tη′‖ = ‖η′‖ = 1, and, as Σ retains only the two unit eigenvalues associated with S, on average over the uncorrelated noise components ‖x_η‖² = (2/3)‖η′‖². We have thus established that the energy of the noise gets reduced during reconstruction by the contraction factor 2/3.
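A quick Monte Carlo check of the 2/3 contraction factor (an illustration under the stated assumption of uncorrelated, unit-energy noise; the random seed and trial count are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
phi = np.array([[np.sqrt(2/3), -1/np.sqrt(6), -1/np.sqrt(6)],
                [0.0,           1/np.sqrt(2), -1/np.sqrt(2)]])

# Perturb the expansion coefficients with unit-norm white noise and measure
# the noise energy that survives reconstruction.
ratios = []
for _ in range(10000):
    eta = rng.standard_normal(3)
    eta /= np.linalg.norm(eta)        # ||eta'|| = 1
    x_eta = phi @ eta                 # reconstructed noise component
    ratios.append(x_eta @ x_eta)
print(np.mean(ratios))                # approximately 2/3
```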
We have just looked at the effect of noise on the reconstructed vector in a frame
expansion. We now ask a different question: What happens if, for some reason, we
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
126
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 4. Local Fourier and Wavelet Frames on Sequences
have access to only two out of three expansion coefficients during reconstruction (for
example, one was lost)? As in this case, any two remaining vectors form a basis,
we will still be able to reconstruct; however, the reconstruction is now performed
differently. For example, assume we lost the first expansion coefficient α0 . To
reconstruct, we must behave as if we had started without the first vector ϕ0 and
had computed the expansion coefficients using only ϕ1 and ϕ2 . This further means
that to reconstruct, we must find the inverse of the 2 × 2 submatrix of ΦT formed
by taking its last two rows. This new reconstruction matrix Φe (where e stands for
erasures) is

$$\Phi^e = \begin{bmatrix} -\frac{1}{\sqrt6} & \frac{1}{\sqrt2} \\ -\frac{1}{\sqrt6} & -\frac{1}{\sqrt2} \end{bmatrix}^{-1} = \begin{bmatrix} -\sqrt{\frac32} & -\sqrt{\frac32} \\ \frac{1}{\sqrt2} & -\frac{1}{\sqrt2} \end{bmatrix},$$

and thus, multiplying [α₁  α₂]^T by Φ^e reconstructs the input vector:

$$\Phi^e\begin{bmatrix} \alpha_1 \\ \alpha_2 \end{bmatrix} = \begin{bmatrix} -\sqrt{\frac32} & -\sqrt{\frac32} \\ \frac{1}{\sqrt2} & -\frac{1}{\sqrt2} \end{bmatrix}\begin{bmatrix} -\frac{1}{\sqrt6} & \frac{1}{\sqrt2} \\ -\frac{1}{\sqrt6} & -\frac{1}{\sqrt2} \end{bmatrix}\begin{bmatrix} x_0 \\ x_1 \end{bmatrix} = \begin{bmatrix} x_0 \\ x_1 \end{bmatrix}.$$
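The erasure reconstruction is easy to check numerically; a minimal sketch (illustrative; `phi_e` is a hypothetical name for Φ^e):

```python
import numpy as np

phi = np.array([[np.sqrt(2/3), -1/np.sqrt(6), -1/np.sqrt(6)],
                [0.0,           1/np.sqrt(2), -1/np.sqrt(2)]])
x = np.array([0.3, 0.9])
alpha = phi.T @ x                        # full set of coefficients

# Suppose alpha_0 is lost: invert the 2x2 submatrix of Phi^T built from its last two rows.
phi_e = np.linalg.inv(phi.T[1:, :])      # reconstruction matrix Phi^e
print(np.allclose(phi_e @ alpha[1:], x)) # True: x is recovered from two coefficients
```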
Unit-Norm Version The frame we have just seen is a very particular frame and
intuitively close to an orthonormal basis. However,
there is one difference: while all the frame vectors are of the same norm √(2/3), they are not of unit norm. We can normalize each ϕi to be of norm 1, leading to

$$\Phi = \begin{bmatrix} 1 & -\frac12 & -\frac12 \\ 0 & \frac{\sqrt3}{2} & -\frac{\sqrt3}{2} \end{bmatrix}, \qquad \Phi^T = \begin{bmatrix} 1 & 0 \\ -\frac12 & \frac{\sqrt3}{2} \\ -\frac12 & -\frac{\sqrt3}{2} \end{bmatrix}, \qquad (4.13)$$

yielding the expansion

$$x = \frac{2}{3}\sum_{i=0}^{2}\langle x,\varphi_i\rangle\,\varphi_i, \qquad (4.14)$$

and the energy in the expansion coefficients

$$\|\alpha\|^2 = \frac{3}{2}\,\|x\|^2. \qquad (4.15)$$
The difference between (4.13) and (4.3) is in normalization; thus, the factor (3/2)
appears in (4.15), showing that the energy in the expansion coefficients is (3/2)
times larger than that of the input vector. When the frame vectors are of unit
norm as in this case, this factor represents the redundancy 27 of the system—we
have (3/2) times more vectors than needed to represent a vector in R2 .
The frame (4.3) and its normalized version (4.13) are instances of the socalled tight frames. A tight frame has a right inverse that is its own transpose and
conserves energy (both within a scale factor). While the tight frame vectors we
have seen are of the same norm, in general, this is not a requirement for tightness,
as we will see later in the chapter.
²⁷ There exists a precise quantitative definition of redundancy; see Further Reading for details.
Filter-Bank Implementation As we have seen in previous chapters, infinite-dimensional expansions can be implemented using filter banks. Let us for a moment
go back to the Haar expansion for ℓ2 (Z) and try to draw some parallels. First,
we have, until now, seen many times a 2 × 2 Haar matrix Φ, as a basis for R2 .
Then, in Chapter 7, we used these Haar vectors to form a basis for ℓ2 (Z), by
slicing the infinite-length sequence into pieces of length 2 and applying a Haar
basis to each of these. The resulting basis sequences for ℓ2 (Z) are then obtained as
infinite sequences with two nonzero elements only, shifted by integer multiples of 2,
(??),(??). Finally, in Chapter 1, (1.2)–(1.3), we showed how to implement such an
orthonormal expansion for ℓ2 (Z) by using a two-channel filter bank.
We can do exactly the same here. We slice an infinite-length input sequence
into pieces of length 2 and apply the frame we just saw to each of these. To form a
frame for ℓ²(Z), we form three template frame vectors from the vectors (4.1) as:

$$\varphi_0 = \begin{bmatrix} \vdots \\ 0 \\ \sqrt{2/3} \\ 0 \\ 0 \\ \vdots \end{bmatrix}, \qquad \varphi_1 = \begin{bmatrix} \vdots \\ 0 \\ -1/\sqrt{6} \\ 1/\sqrt{2} \\ 0 \\ \vdots \end{bmatrix}, \qquad \varphi_2 = \begin{bmatrix} \vdots \\ 0 \\ -1/\sqrt{6} \\ -1/\sqrt{2} \\ 0 \\ \vdots \end{bmatrix}. \qquad (4.16)$$
We then form all the other frame sequences as versions of (4.16) shifted by integer
multiples of 2:
$$\Phi = \{\varphi_{0,n-2k},\ \varphi_{1,n-2k},\ \varphi_{2,n-2k}\}_{k\in\mathbb{Z}}.$$
To implement this frame expansion using signal processing machinery, we do exactly
the same as we did for Haar basis in Section 1.1: we rename the template frame
sequences ϕ0 = g0 , ϕ1 = g1 and ϕ2 = g2 . Then we can write the reconstruction
formula as

$$x_n = \sum_{k\in\mathbb{Z}} \alpha_{0,k}\, g_{0,n-2k} + \sum_{k\in\mathbb{Z}} \alpha_{1,k}\, g_{1,n-2k} + \sum_{k\in\mathbb{Z}} \alpha_{2,k}\, g_{2,n-2k}, \qquad (4.17a)$$

with

$$\alpha_{i,k} = \langle x_n, g_{i,n-2k}\rangle_n. \qquad (4.17b)$$
There is really no difference between (4.17a)–(4.17b) and (1.2)–(1.3), except that
we have 3 template frame sequences here instead of 2 template basis sequences
for Haar.28 We thus know exactly how to implement (4.17): it is going to be a
3-channel filter bank with down/upsampling by 2, as shown in Figure 4.2, with
synthesis filters’ impulse responses given by the frame vectors, and analysis filters’
impulse responses given by the time-reversed frame vectors.
²⁸ Unlike for the Haar case, the αi are not unique.
Figure 4.2: A 3-channel filter bank with sampling by 2 implementing a tight frame expansion.
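A block-based sketch of this 3-channel filter bank (for these length-2 filters, filtering followed by down/upsampling by 2 reduces to processing blocks of length 2); `analysis` and `synthesis` are illustrative helper names, not from the text.

```python
import numpy as np

# Template frame sequences (4.16) used as synthesis impulse responses g_i.
g = [np.array([np.sqrt(2/3), 0.0]),
     np.array([-1/np.sqrt(6),  1/np.sqrt(2)]),
     np.array([-1/np.sqrt(6), -1/np.sqrt(2)])]

def analysis(x):
    # alpha_{i,k} = <x_n, g_{i,n-2k}>_n, computed block by block
    return [np.array([x[2*k:2*k+2] @ gi for k in range(len(x)//2)]) for gi in g]

def synthesis(alpha):
    x = np.zeros(2 * len(alpha[0]))
    for gi, ai in zip(g, alpha):
        for k, a in enumerate(ai):
            x[2*k:2*k+2] += a * gi        # sum_k alpha_{i,k} g_{i,n-2k}
    return x

x = np.random.randn(8)
print(np.allclose(synthesis(analysis(x)), x))   # perfect reconstruction
```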
Figure 4.3: The three vectors {ϕ0, ϕ1, ϕ2} from (4.18) form a frame for R² (the same as Figure 2.4(a)).
A General Frame for R2
Our second example is that from (2.14), again a set of three vectors in R2 ,
$$\varphi_0 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \qquad \varphi_1 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \qquad \varphi_2 = \begin{bmatrix} -1 \\ -1 \end{bmatrix}, \qquad (4.18)$$
the standard orthonormal basis {ϕ0 , ϕ1 } plus a third vector. We follow the same
path as we just did to spot commonalities and differences.
Expansion Again, these vectors clearly span R2 since any two of them do (see
Figure 4.3). We have already paved the path for representing a vector x ∈ R2 as a
linear combination of {ϕi }i=0,1,2 by introducing a matrix Φ as in (2.16a),
$$\Phi = \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & -1 \end{bmatrix}. \qquad (4.19)$$
Unlike (4.3), this Φ does not have orthogonal rows, and thus, Φ^T is not one of its right inverses. We call these right inverses Φ̃^T and Φ̃ a dual frame. Then

$$\Phi\tilde\Phi^T = I_2. \qquad (4.20)$$
We have seen one possible dual frame in (2.16c),²⁹

$$\tilde\Phi = \begin{bmatrix} 0 & -1 & -1 \\ -1 & 0 & -1 \end{bmatrix},$$

with the associated expansion

$$x \stackrel{(a)}{=} \sum_{i=0}^{2}\langle x,\tilde\varphi_i\rangle\,\varphi_i \stackrel{(b)}{=} \sum_{i=0}^{2}\langle x,\varphi_i\rangle\,\tilde\varphi_i, \qquad (4.21)$$

where we expressed x both in the frame (a) and the dual frame (b). This looks exactly like the usual biorthogonal expansion (2.111a), except for the number of vectors involved. The dual frame vectors are:

$$\tilde\varphi_0 = \begin{bmatrix} 0 \\ -1 \end{bmatrix}, \qquad \tilde\varphi_1 = \begin{bmatrix} -1 \\ 0 \end{bmatrix}, \qquad \tilde\varphi_2 = \begin{bmatrix} -1 \\ -1 \end{bmatrix}. \qquad (4.22)$$
Geometry of the Expansion In the previous example, the geometry of the expansion was captured by Φ and Φ⊥ as in (4.8); here, we must add the dual frame Φ̃ and its complement Φ̃⊥. Two possible complements corresponding to Φ and Φ̃ are

$$\Phi^\perp = \begin{bmatrix} 1 & 1 & -1 \end{bmatrix}, \qquad \tilde\Phi^\perp = \begin{bmatrix} 1 & 1 & 1 \end{bmatrix}, \qquad (4.23)$$

both of size 1 × 3. Then, the following capture the geometry of the matrices involved:

$$\Phi\tilde\Phi^T = I_{2\times2}, \qquad (4.24a)$$
$$\Phi(\tilde\Phi^\perp)^T = 0_{2\times1}, \qquad (4.24b)$$
$$\Phi^\perp\tilde\Phi^T = 0_{1\times2}, \qquad (4.24c)$$
$$\Phi^\perp(\tilde\Phi^\perp)^T = I_{1\times1} = 1. \qquad (4.24d)$$

Thus, R³ is spanned by both Φ^T ⊕ (Φ⊥)^T and Φ̃^T ⊕ (Φ̃⊥)^T.
Energy of Expansion Coefficients We saw how energy is conserved in a Parseval-like manner before; what can we say about ‖α‖ here?

$$\|\alpha\|^2 = \alpha^T\alpha = x^T\tilde\Phi\tilde\Phi^T x = x^T U\Sigma U^T x,$$

where we have performed a singular value decomposition (2.222) on the Hermitian matrix Φ̃Φ̃^T via (2.232a) as

$$\tilde\Phi\tilde\Phi^T = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} = \underbrace{\frac{1}{\sqrt2}\begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}}_{U}\ \underbrace{\begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix}}_{\Sigma}\ \underbrace{\frac{1}{\sqrt2}\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}}_{U^T}. \qquad (4.25a)$$
29 Note that while there exist infinitely many dual frames since Φ has a nontrivial null space,
here we concentrate on the canonical one as will be clear later in the chapter.
Figure 4.4: A 3-channel filter bank with sampling by 2 implementing a general frame expansion.
Because Φ̃Φ̃^T is a Hermitian matrix, (2.234) holds, that is,

$$\lambda_{\min} I \le \tilde\Phi\tilde\Phi^T \le \lambda_{\max} I, \qquad (4.25b)$$

where λmin and λmax are the smallest and largest eigenvalues of Φ̃Φ̃^T. Thus, with λmin = 1, λmax = 3, we get

$$\|x\|^2 \le \|\alpha\|^2 = \|\tilde\Phi^T x\|^2 \le 3\|x\|^2. \qquad (4.26)$$

Therefore, although energy is not preserved, it is bounded from below and above by the eigenvalues of Φ̃Φ̃^T. Depending on the range between the minimum and maximum eigenvalues, the energy can fluctuate; in general, the closer (tighter) the eigenvalues are, the better-behaved the frame is.³⁰ The set of inequalities (4.26) is similar to how Riesz bases were defined in (2.86), and a similar relation holds for Φ,

$$\frac{1}{\lambda_{\max}}\,I \le \Phi\Phi^T \le \frac{1}{\lambda_{\min}}\,I,$$

and thus

$$\frac{1}{3}\|x\|^2 \le \|\Phi^T x\|^2 \le \|x\|^2.$$
This frame is related to the previous one in the same way a biorthogonal basis is
related to an orthonormal basis, and is called a general frame.
³⁰ This explains the word tight in tight frames (where the eigenvalues are equal).

Filter-Bank Implementation In parallel to what we have done for (4.1), we can use this finite-dimensional frame as an expansion for sequences in ℓ²(Z) by slicing the input sequence into pieces of length 2 and applying the frame we just saw to each of these. To form a frame for ℓ²(Z), we form three template frame vectors
from the three R² vectors (4.18):

$$\varphi_0 = \begin{bmatrix} \vdots \\ 0 \\ 1 \\ 0 \\ 0 \\ \vdots \end{bmatrix}, \qquad \varphi_1 = \begin{bmatrix} \vdots \\ 0 \\ 0 \\ 1 \\ 0 \\ \vdots \end{bmatrix}, \qquad \varphi_2 = \begin{bmatrix} \vdots \\ 0 \\ -1 \\ -1 \\ 0 \\ \vdots \end{bmatrix}. \qquad (4.27)$$
To form the dual frame, we form three template vectors from the three R² dual frame vectors (4.22), leading to the frame Φ and dual frame Φ̃:

$$\Phi = \{\varphi_{0,n-2k},\ \varphi_{1,n-2k},\ \varphi_{2,n-2k}\}_{k\in\mathbb{Z}}, \qquad \tilde\Phi = \{\tilde\varphi_{0,n-2k},\ \tilde\varphi_{1,n-2k},\ \tilde\varphi_{2,n-2k}\}_{k\in\mathbb{Z}}.$$

Renaming the template frame sequences ϕ_{i,n} = g_{i,n} and the dual ones ϕ̃_{i,n} = g̃_{i,−n}, we again have a 3-channel filter bank with down/upsampling by 2, as in Figure 4.4.
Choosing the Frame Expansion and Expansion Coefficients
So far, we have seen two redundant representations, a tight frame (4.3), akin to
an orthonormal basis, and a general frame (4.19), akin to a biorthogonal basis.
We showed properties, including robustness to noise and loss. Given a sequence
x, how do we then choose an appropriate frame expansion? Moreover, as we have
already mentioned, we can have infinitely many dual frames, and thus, infinitely
many expansion coefficients α′; which one do we choose? We tackle these questions in
the next section; here, we just show a simple example that indicates the trade-offs.
Choosing the Frame Expansion Assume we are working in RN and we are given
an input sequence x consisting of a single complex sinusoidal sequence of unknown
frequency (2π/N )ℓ and a Kronecker delta sequence of unknown location k:
xn = β1 ej(2π/N )ℓn + β2 δn−k .
As discussed in Chapter 7, we can use a length-N DFT to expand x; we know this
will effectively localize the sinusoid in frequency, but will do a poor job in time, and
the location of the Kronecker delta impulse will be essentially lost. We can use the
dual, standard basis, with the dual effect: it will do an excellent job of localizing
the Kronecker delta impulse but will fail in localizing the frequency of the sinusoid.
While we could use a wavelet representation from Chapter 3, an even more
obvious option is to use both bases at the same time, effectively creating a frame
with the following 2N frame vectors:
$$\Phi = \begin{bmatrix}\mathrm{DFT} & I\end{bmatrix}_{N\times 2N}, \qquad (4.28)$$

where the first N are the DFT basis vectors (3.162),

$$\varphi_i = \frac{1}{\sqrt N}\begin{bmatrix} W_N^0 & W_N^i & \ldots & W_N^{i(N-1)}\end{bmatrix}^T,$$
while the last N are the standard basis vectors δn−i . Using (2.144), we see that
this is a tight frame, since³¹

$$\Phi\Phi^* = \begin{bmatrix}\mathrm{DFT} & I\end{bmatrix}\begin{bmatrix}\mathrm{DFT}^* \\ I\end{bmatrix} = \underbrace{\mathrm{DFT}\,\mathrm{DFT}^*}_{I} + I = 2I. \qquad (4.29)$$
Choosing the Expansion Coefficients As x has only two components, there exists
a way to write it as
$$x = \Phi\alpha',$$

where α′ = Φ̃^∗x has exactly 2 nonzero coefficients,³²

$$\alpha' = \begin{bmatrix} 0 & \ldots & 0 & \beta_1 & 0 & \ldots & 0 & \beta_2 & 0 & \ldots & 0\end{bmatrix}^T,$$
where β1 is at the ℓth location and β2 at the (N + k)th location. Such an expansion
is called sparse, in the sense that it uses a small number of frame vectors. This is
different from α obtained from the canonical dual Φ̃ = (1/2)Φ,

$$\alpha = \frac{1}{2}\,\Phi^* x,$$
which has two dominant components at the same locations as α′ , but also many more
nonzero components. We will see later that, while α′ has fewer nonzero coefficients,
α has a smaller ℓ2 norm (see Solved Exercise ??).33 This is an important message;
while in bases, the expansion coefficients are always unique, in frames they are not,
and minimizing different norms will lead to different expansions.
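A small NumPy illustration of this trade-off for the DFT-plus-identity frame (4.28); it is not from the text, and the values N = 32, β1 = 1.3, β2 = 0.7, and the locations ℓ, k are arbitrary assumptions.

```python
import numpy as np

N, ell, k = 32, 5, 12
F = np.fft.fft(np.eye(N)) / np.sqrt(N)        # orthonormal DFT synthesis vectors
Phi = np.hstack([F, np.eye(N)])               # N x 2N tight frame, Phi Phi* = 2I

x = 1.3 * F[:, ell] + 0.7 * np.eye(N)[:, k]   # sinusoid plus spike

alpha_l2 = Phi.conj().T @ x / 2               # canonical dual: alpha = (1/2) Phi* x
alpha_sp = np.zeros(2 * N, dtype=complex)
alpha_sp[ell], alpha_sp[N + k] = 1.3, 0.7     # the sparse choice: 2 nonzero coefficients

print(np.allclose(Phi @ alpha_l2, x), np.allclose(Phi @ alpha_sp, x))  # both reconstruct x
print(np.linalg.norm(alpha_l2), "<", np.linalg.norm(alpha_sp))         # canonical dual wins in l2
print(np.sum(np.abs(alpha_sp) > 1e-9), "vs", np.sum(np.abs(alpha_l2) > 1e-9), "nonzeros")
```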
Chapter Outline
This chapter is somewhat unusual in its scope. While most of the chapters in Part II
deal either with Fourier- or wavelet-like expansions, this chapter deals with both.
However, there is one important distinction: these expansions are all overcomplete,
or, redundant. Thus, our decision to keep them all in one chapter.
Unlike for bases, where we have discussed standard finite-dimensional expansions such as the DFT, we have not done so with frames until now, and thus,
Section 4.2 investigates finite-dimensional frames. We then resume the structure
we have been following starting with Chapter 1, that is, we discuss the signal-processing vehicle for implementing frame expansions—oversampled filter banks.
We follow with local Fourier frames in Section 4.4 and wavelet frames in Section 4.5.
Section 4.6 concludes with computational aspects.
The sections that follow can also be seen as redundant counterparts of previous
chapters. For example, Section 4.2 on finite-dimensional frames, has its basis counterpart in Chapter 2 and Chapter 3, where we discussed finite-dimensional bases in
general (Chapter 2) as well as some specific ones, such as the DFT (Chapter 3).
³¹ Note that now we use the Hermitian transpose of Φ as it contains complex entries.
³² Although it might not be obvious how to calculate that expansion.
³³ In fact, α′ can be chosen to minimize the ℓ₁ norm, while α minimizes the ℓ₂ norm.
Section 4.3 on oversampled filter banks has its basis (critically-sampled filter bank)
counterpart in Chapter 1 (two-channel critically-sampled filter banks), Chapter 2
(N-channel critically-sampled filter banks), and Chapter 3 (tree-structured critically-sampled filter banks). Section 4.4 on local Fourier frames has its basis counterpart
in Chapter 2 (local Fourier bases on sequences), while Section 4.5 on wavelet frames
has its basis counterpart in Chapter 3 (wavelet bases on sequences). Thus, this chapter also plays a unifying role in summarizing concepts on expansions of sequences.
The two chapters that follow this one will deal with functions instead of sequences.
4.2
Finite-Dimensional Frames
We have just seen examples showing that finite-dimensional overcomplete sets of
vectors have properties similar to orthonormal and/or biorthogonal bases. We will
now look into general properties of such finite-dimensional frames in CN , with the
understanding that RN is just a special case. We start with tight frames and follow
with general frames. Since the representation in a frame is in general nonunique,
we discuss how to compute expansion coefficients, and point out that, depending
on which norm is minimized, different solutions are obtained.
Finite-dimensional frames are represented via rectangular matrices; thus, all
the material in this section is basic linear algebra. We use this simplicity to develop
the geometric intuition to be carried to infinite-dimensional frames that follow.
4.2.1
Tight Frames for CN
We work in a finite-dimensional space C^N, where a set of vectors Φ = {ϕi}_{i=0}^{M−1}, M > N, is a frame represented (similarly to (4.3) and (4.19)) by the frame matrix Φ as

$$\Phi = \begin{bmatrix}\varphi_0 & \varphi_1 & \ldots & \varphi_{M-1}\end{bmatrix}_{N\times M}. \qquad (4.30)$$

Assume that rank(Φ) = N, that is, the column range of Φ is C^N. Thus, any x ∈ C^N can be written as a nonunique linear combination of the ϕi's.
We impose a further constraint and start with frames that satisfy a Parseval-like equality:

Definition 4.1 (Tight frame) A family Φ = {ϕi}_{i=0}^{M−1} in C^N is called a tight frame, or λ-tight frame, when there exists a constant 0 < λ < ∞, called the frame bound, such that for all x ∈ C^N,

$$\lambda\|x\|^2 = \sum_{i=0}^{M-1}|\langle x,\varphi_i\rangle|^2 = \|\Phi^* x\|^2. \qquad (4.31)$$
Expansion
Equation (4.31) has a number of consequences. First, it means that

$$\Phi\Phi^* = \lambda I. \qquad (4.32)$$

Thus, Φ∗ is a right inverse of Φ (within scaling). Calling Φ̃ the dual frame as before, we see that

$$\tilde\Phi = \frac{1}{\lambda}\,\Phi. \qquad (4.33)$$

Then:

$$x = \Phi\alpha, \qquad \alpha = \frac{1}{\lambda}\,\Phi^* x, \qquad (4.34a)$$
$$x = \frac{1}{\lambda}\sum_{i=0}^{M-1}\langle x,\varphi_i\rangle\,\varphi_i. \qquad (4.34b)$$
This looks very similar to an orthonormal basis expansion, except for the scaling factor (1/λ) and the fact that Φ = {ϕi}_{i=0}^{M−1} cannot be a basis since the ϕi are not linearly independent. We can pull the factor (1/λ) into the sum and renormalize the frame vectors as ϕ′i = (1/√λ)ϕi, leading to an expression that formally looks identical to that of an orthonormal basis expansion:

$$x = \sum_{i=0}^{M-1}\langle x,\varphi_i'\rangle\,\varphi_i'. \qquad (4.35)$$
We have already seen an example of such a renormalization in (4.1) and (4.13). A
frame normalized so that λ = 1 is called a Parseval tight frame or a 1-tight frame.
The expression for x is what is typically called a reconstruction or a representation of a sequence, or, in filter banks, synthesis, while the expression for the
expansion coefficients α is a decomposition, or, analysis in filter banks.
In the discussion above, we said nothing about the norms of the individual
frame vectors. Since the analysis computes inner products αi = hx, ϕi i, it often
makes sense for all ϕi to have the same norm, leading to an equal-norm frame (which
may not be tight). When we combine equal norm with tightness, we get a frame
that acts every bit like an orthonormal basis, except for the redundancy. That is,
all inner products αi = hx, ϕi i are projections of x onto vectors of the same norm,
allowing us to compare coefficients αi to each other. Moreover, because the frame
is tight, the right inverse is simply its adjoint (within scaling). In finite dimensions,
tightness corresponds to the rows of Φ being orthogonal. Because of this, it is hard
in general to obtain an equal-norm frame starting from the tight one.
Geometry of the Expansion Let us explore the geometry of tight frames. With
the frame matrix Φ as in (4.30), and as we did in (4.8), we introduce Φ⊥ ,
$$\Phi^\perp = \begin{bmatrix}\varphi_0^\perp & \varphi_1^\perp & \ldots & \varphi_{M-1}^\perp\end{bmatrix}_{(M-N)\times M} \qquad (4.36)$$
as one possible orthogonal complement of Φ in CM :
$$\begin{bmatrix}\Phi \\ \Phi^\perp\end{bmatrix} = \begin{bmatrix}\varphi_0 & \varphi_1 & \ldots & \varphi_{M-1} \\ \varphi_0^\perp & \varphi_1^\perp & \ldots & \varphi_{M-1}^\perp\end{bmatrix}_{M\times M}, \qquad (4.37)$$

that is, the rows of Φ⊥ are chosen to be orthogonal to the rows of Φ, orthogonal to each other and of norm 1, or,

$$\begin{bmatrix}\Phi \\ \Phi^\perp\end{bmatrix}\begin{bmatrix}\Phi^* & (\Phi^\perp)^*\end{bmatrix} = \begin{bmatrix}\Phi\Phi^* & \Phi(\Phi^\perp)^* \\ \Phi^\perp\Phi^* & \Phi^\perp(\Phi^\perp)^*\end{bmatrix} = \begin{bmatrix}I_{N\times N} & 0 \\ 0 & I_{(M-N)\times(M-N)}\end{bmatrix} = I_{M\times M}. \qquad (4.38)$$
Note that each vector ϕi is in C^N, while each vector ϕ⊥i is in C^{M−N}. We can rewrite (4.37) as

$$S = \mathrm{span}\,(\Phi^T) \subset \mathbb{C}^M, \qquad (4.39a)$$
$$S^\perp = \mathrm{span}\,\left((\Phi^\perp)^T\right) \subset \mathbb{C}^M, \qquad (4.39b)$$
$$\mathbb{C}^M = S \oplus S^\perp, \qquad (4.39c)$$

and, because of (4.38),

$$\Phi\Phi^* = I_{N\times N}, \qquad (4.40a)$$
$$\Phi(\Phi^\perp)^* = 0_{N\times(M-N)}, \qquad (4.40b)$$
$$\Phi^\perp(\Phi^\perp)^* = I_{(M-N)\times(M-N)}. \qquad (4.40c)$$
A vector of expansion coefficients α′ ∈ C^M can be written as

$$\alpha' = \alpha + \alpha^\perp \qquad (4.41a)$$

with α ∈ S and α⊥ ∈ S⊥, and thus

$$\langle\alpha,\alpha^\perp\rangle = 0, \qquad (4.41b)$$
as we have already seen in the simple example in the previous section, (4.11b).
Relation to Orthonormal Bases
We now look into connections between tight frames and orthonormal bases.
Theorem 4.2 A 1-tight frame with unit-norm vectors is an orthonormal basis.
Proof. For a tight frame expansion with λ = 1,
$$T = \Phi\Phi^* = I,$$

and thus, all the eigenvalues of T are equal to 1. Using one of the useful frame relations we introduce later in (4.64a),

$$N \stackrel{(a)}{=} \sum_{j=0}^{N-1}\lambda_j \stackrel{(b)}{=} \sum_{i=0}^{M-1}\|\varphi_i\|^2 \stackrel{(c)}{=} M,$$
where (a) follows from all eigenvalues being equal to 1; (b) from (4.64a); and (c) from
all frame vectors being of unit norm. We get that M = N , and thus, our frame is an
orthonormal basis.
In this result, we see once more the tantalizing connection between tight frames and
orthonormal bases. In fact, even more is true: tight frames and orthonormal bases
arise from the minimization of the same quantity called the frame potential :
$$\mathrm{FP}(\Phi) = \sum_{i,j=0}^{M-1} |\langle\varphi_i,\varphi_j\rangle|^2. \qquad (4.42)$$

In fact, minimizing the frame potential has two possible outcomes:

(i) When M ≤ N, the minimum value of the frame potential is FP(Φ) = N, achieved when Φ is an orthonormal set.

(ii) When M > N, the minimum value of the frame potential is FP(Φ) = M²/N, achieved when Φ is a unit-norm tight frame.³⁴
This tells us that unit-norm tight frames are a natural extension of orthonormal
bases, that is, the theorem formalizes the intuitive notion that unit-norm tight
frames are a generalization of orthonormal bases. Moreover, both orthonormal bases
and unit-norm tight frames are results of the minimization of the frame potential,
with different parameters (number of elements equal/larger than the dimension of
the space). We give pointers to more details on this topic in Further Reading.
Example 4.1 (Tight frames and orthonormal bases) We illustrate this
result with an example. Fix N = 2.
(i) We first consider the case when M = N = 2. Then, we have two vectors
only, ϕ0 and ϕ1 , both on the unit circle. According to (4.42), the frame
potential is
$$\mathrm{FP}(\{\varphi_0,\varphi_1\}) \stackrel{(a)}{=} \|\varphi_0\|^2 + \|\varphi_1\|^2 + 2\,|\langle\varphi_0,\varphi_1\rangle|^2 = 2\,(1+|\langle\varphi_0,\varphi_1\rangle|^2), \qquad (4.43)$$

where (a) follows from ϕ0, ϕ1 being of unit norm. The above expression is minimized when ⟨ϕ0, ϕ1⟩ = 0, that is, when ϕ0 and ϕ1 form an orthonormal basis. In that case, the minimum of the frame potential is FP = 2 = N.
³⁴ This lower bound for frames is known as the Welch bound, arising when minimizing interuser interference in a CDMA system (see Further Reading for pointers).
Figure 4.5: Minimization of the frame potential for frames with unit-norm vectors. (a) Three unit-norm vectors in R². (b) Density plot of the frame potential as a function of the angles between frame vectors. The two minima are identical and appear for θ1 = π/3, θ2 = π/3 and θ1 = 2π/3, θ2 = 2π/3.
(ii) We now look at M larger than N; we choose M = 3. Let us fix ϕ0 = [1  0]^T; ϕ1 is θ1 away from ϕ0 in the counterclockwise direction; ϕ2 is θ2 away from ϕ1 in the counterclockwise direction (see Figure 4.5(a)). The frame potential is now

$$\mathrm{FP}(\{\theta_1,\theta_2\}) = \|\varphi_0\|^2 + \|\varphi_1\|^2 + \|\varphi_2\|^2 + 2\left(|\langle\varphi_0,\varphi_1\rangle|^2 + |\langle\varphi_0,\varphi_2\rangle|^2 + |\langle\varphi_1,\varphi_2\rangle|^2\right) \qquad (4.44)$$
$$= 3 + 2\left(\cos^2\theta_1 + \cos^2\theta_2 + \cos^2(\theta_1+\theta_2)\right). \qquad (4.45)$$
Figure 4.5(b) shows the density plot of FP({θ1 , θ2 }) for θi ∈ [0, π]. From
the figure, we see that there are two minima, for θ1 = θ2 = π/3 and
θ1 = θ2 = 2π/3, both of which lead to tight frames; the second choice is
the frame we have seen in (4.13), the first choice is, in fact, identical to the
second (within reflection). We thus see that the results of minimizing the
frame potential in this case are tight frames, with the minimum of

$$\mathrm{FP}\!\left(\left\{\tfrac{\pi}{3},\tfrac{\pi}{3}\right\}\right) = \mathrm{FP}\!\left(\left\{\tfrac{2\pi}{3},\tfrac{2\pi}{3}\right\}\right) = 3 + 2\left(\tfrac14+\tfrac14+\tfrac14\right) = \tfrac92 = \tfrac{M^2}{N},$$
as per the theorem.
This simple example shows that minimizing the frame potential with different
parameters leads to either orthonormal sets (orthonormal bases when M = N )
or unit-norm tight frames.
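A brute-force sketch of this minimization (illustrative only; the grid resolution is an arbitrary choice):

```python
import numpy as np

# Minimize the frame potential (4.42) for M = 3 unit-norm vectors in R^2,
# parameterized by the angles of Example 4.1.
def frame_potential(theta1, theta2):
    angles = np.array([0.0, theta1, theta1 + theta2])
    phi = np.stack([np.cos(angles), np.sin(angles)])    # 2 x 3, unit-norm columns
    return np.sum(np.abs(phi.T @ phi) ** 2)

grid = np.linspace(0, np.pi, 181)
fp = np.array([[frame_potential(t1, t2) for t2 in grid] for t1 in grid])
i, j = np.unravel_index(np.argmin(fp), fp.shape)
print(fp.min(), grid[i], grid[j])   # ~4.5 = M^2/N, at theta1 = theta2 = pi/3 (or 2*pi/3)
```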
Naimark’s Theorem Another powerful connection between orthonormal bases and
tight frames is also a constructive way to obtain all tight frames. It is given by the
following theorem, due to Naimark, which says that all tight frames can be obtained
by projecting orthonormal bases from a larger space (of dimension M ) onto a smaller
one (of dimension N ). We have seen one such example in (4.8), where the frame
Φ ∈ R2 from (4.3) was obtained by projecting an orthonormal basis from R3 .
Theorem 4.3 (Naimark [3], Han & Larson [42]) A frame Φ ∈ CN is tight
if and only if there exists an orthonormal basis Ψ ∈ CM , M ≥ N , such that
Φ∗ = Ψ[J],
(4.46)
where J ⊂ {0, 1, . . . , M − 1} is the index set of the retained columns of Ψ, a
process known as seeding.
Here, we considered only the tight-frame finite-dimensional instantiation of
the theorem. For general finite-dimensional frames, a similar result holds, that
is, any frame can be obtained by projecting a biorthogonal basis from a larger
space. In Theorem 4.8 we formulate the statement for infinite-dimensional frames
implementable by oversampled filter banks.
Proof. Given is a tight frame Φ, with columns ϕi, i = 0, 1, . . . , M − 1, and rows ψj, j = 0, 1, . . . , N − 1. Because Φ is a tight frame, it satisfies (4.32); without loss of generality, renormalize it by (1/√λ) so that the frame we work with is 1-tight. This further means that

$$\langle\psi_i,\psi_j\rangle = \delta_{i-j},$$
that is, {ψ0 , ψ1 , . . . , ψN−1 } is an orthonormal set, and, according to (4.39a), it spans
the subspace S ⊂ CM . The whole proof in this direction follows from the geometry of
tight frames we discussed earlier, by showing, as we did in (4.37), how to complete the
tight frame matrix Φ to obtain an orthonormal basis Ψ∗ .
The other direction is even easier. Assume we are given a unitary Ψ. Choose any
N columns of Ψ and call the resulting M × N matrix Φ∗ . Because these columns form
an orthonormal set, the rows of Φ form an orthonormal set, that is,
ΦΦ∗ = I.
Therefore, Φ is a tight frame.
Harmonic Tight Frames We now look into an example of a well-known family of
tight frames called harmonic tight frames, a representative of which we have already
seen in (4.3). Harmonic tight frames are frame counterparts of the DFT, and are,
in fact, obtained from the DFT by seeding, a process defined in Theorem 4.3.
Specifically, to obtain harmonic tight frames, we start with the DFT matrix
Ψ = DFT_M given in (3.161a) and delete its last (M − N) columns, yielding:

$$\Phi = \begin{bmatrix} 1 & 1 & 1 & \ldots & 1 \\ 1 & W_M & W_M^2 & \ldots & W_M^{M-1} \\ 1 & W_M^2 & W_M^4 & \ldots & W_M^{2(M-1)} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & W_M^{N-1} & W_M^{(N-1)\cdot 2} & \ldots & W_M^{(N-1)(M-1)} \end{bmatrix}, \qquad (4.47a)$$

with the corresponding frame vectors

$$\varphi_i = \begin{bmatrix} W_M^0 & W_M^i & \ldots & W_M^{i(N-1)} \end{bmatrix}^T, \qquad (4.47b)$$
for i = 0, 1, . . . , M − 1, where WM = e−j2π/M is the principal M th root of unity
(3.284). The norm of each frame vector is

$$\|\varphi_i\|^2 = \varphi_i^*\varphi_i = N,$$

the frame is M-tight,

$$\Phi\Phi^* = M I,$$

and the Parseval-like equality is

$$\|\Phi^* x\|^2 = M\|x\|^2.$$
In its unit-norm version, we can compute its redundancy as (M/N ). Harmonic tight
frames have a number of other interesting properties, some of which are explored
in Exercise ??.
For example, we explore an interesting property of frames that holds for harmonic tight frames.
Definition 4.4 (Frame maximally robust to erasures) A frame Φ is
called maximally robust to erasures when its every N × N submatrix is invertible.
We have seen one example of a frame maximally robust to erasures: every
2 × 2 submatrix of (4.3) is invertible. The motivation in that example was that
such frames can sustain a loss of a maximum number of expansion coefficients and
still afford perfect reconstruction of the original vector. In fact, harmonic tight
frames in general possess such a property since every N × N submatrix of (4.47a)
is invertible (it is a Vandermonde matrix whose determinant is always nonzero, see
(2.240) and Exercise ??).
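A short sketch checking both properties, tightness and maximal robustness to erasures, for an assumed M = 5, N = 2 harmonic tight frame (illustrative; not part of the text):

```python
import numpy as np
from itertools import combinations

M, N = 5, 2
W = np.exp(-2j * np.pi / M)
Phi = np.array([[W ** (i * n) for i in range(M)] for n in range(N)])  # N x M, as in (4.47a)

print(np.allclose(Phi @ Phi.conj().T, M * np.eye(N)))                 # M-tight
print(all(abs(np.linalg.det(Phi[:, list(c)])) > 1e-12
          for c in combinations(range(M), N)))                        # every N x N submatrix invertible
```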
Random Frames
While it seems that tight frames such as those we have just seen are very special,
it turns out that any unit-norm frame with high redundancy will be almost tight.
This is made precise by the following result:
Theorem 4.5 (Tightness of random frames [41]) Let {Φ_M}_{M=N}^∞ be a sequence of frames in R^N such that Φ_M is generated by choosing M vectors independently with a uniform distribution on the unit sphere in R^N. Then, in the mean-squared sense,

$$\frac{1}{M}\,\Phi\Phi^T \to \frac{1}{N}\,I_N$$

elementwise as M → ∞.
An illustration of the theorem for N = 2 is given in Figure 4.6.
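An empirical sketch of the theorem for N = 2 (illustrative; the random seed and values of M are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 2
for M in (10, 100, 1000, 10000):
    theta = rng.uniform(0, 2 * np.pi, M)
    Phi = np.stack([np.cos(theta), np.sin(theta)])   # N x M, unit-norm columns
    print(M, np.round(Phi @ Phi.T / M, 3))           # entries approach [[0.5, 0], [0, 0.5]]
```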
Figure 4.6: Illustration of the tightness of random frames for N = 2 and M = 2, 3, . . . , 1000. Since the convergence is elementwise, each graph plots the behavior ∆ij = [(1/M)ΦΦ^T − (1/2)I₂]_{ij} for i, j = 0, 1.
4.2.2
General Frames for CN
Tight frames are attractive the same way orthonormal bases are. They obey a
Parseval-like energy conservation equality, and the dual frame is equal to the frame
itself (the right inverse of Φ is just its own Hermitian transpose, possibly within
scaling). Tightness, however, has sometimes to be relaxed, just as orthonormality
does (for example, when we wanted to design two-channel linear-phase filter banks
in Chapter 1). Either a frame is given by a specific construction, and is not tight,
or the constraints posed by tightness are too restrictive for a desired design.
Given a frame Φ as in (4.30) with rank(Φ) = N, we can find the canonical dual frame Φ̃ (formalized in Definition 4.7), also of size N × M, made of dual frame vectors as

$$\tilde\Phi = (\Phi\Phi^*)^{-1}\Phi \qquad (4.48a)$$
$$\phantom{\tilde\Phi} = \begin{bmatrix}\tilde\varphi_0 & \tilde\varphi_1 & \ldots & \tilde\varphi_{M-1}\end{bmatrix}_{N\times M}, \qquad (4.48b)$$
$$\tilde\varphi_i = (\Phi\Phi^*)^{-1}\varphi_i. \qquad (4.48c)$$

The above are all well defined, since T = ΦΦ∗ is of rank N and can thus be inverted. Therefore, Φ and Φ̃ play the same roles for frames as their namesakes do for biorthogonal bases, and, using (4.48a):

$$\Phi\tilde\Phi^* = \Phi\Phi^*(\Phi\Phi^*)^{-1} = I_N, \qquad (4.49a)$$
$$\tilde\Phi\Phi^* = (\Phi\Phi^*)^{-1}\Phi\Phi^* = I_N. \qquad (4.49b)$$

Note that the canonical dual frame Φ̃ chosen here is a particular right inverse; we will formalize this in Definition 4.7. Note also that when Φ is tight, with this definition of the dual, we indeed obtain Φ̃ = Φ.
Expansion
We know that Φ = {ϕi}_{i=0}^{M−1} spans C^N, that is, any x ∈ C^N can be written as

$$x = \sum_{i=0}^{M-1}\alpha_i\,\varphi_i \qquad (4.50a)$$
$$\phantom{x} = \sum_{i=0}^{M-1}\tilde\alpha_i\,\tilde\varphi_i, \qquad (4.50b)$$

both of which follow from (4.49) by writing

$$x \stackrel{(a)}{=} \Phi\tilde\Phi^* x = \Phi\alpha \qquad (4.51a)$$
$$\phantom{x} \stackrel{(b)}{=} \tilde\Phi\Phi^* x = \tilde\Phi\tilde\alpha, \qquad (4.51b)$$

where (a) leads to (4.50a) and (b) to (4.50b), respectively. Again, these are reconstruction (representation) of a sequence, or, in filter banks, synthesis.
We have used αi and α̃i liberally, defining them implicitly. It comes as no surprise that

$$\alpha_i = \langle x,\tilde\varphi_i\rangle, \qquad \alpha = \tilde\Phi^* x, \qquad (4.52a)$$
$$\tilde\alpha_i = \langle x,\varphi_i\rangle, \qquad \tilde\alpha = \Phi^* x; \qquad (4.52b)$$

they are interchangeable like the expansion expressions. As before, the expression for α is sometimes called decomposition (or, analysis, in filter banks).
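A minimal numerical check of (4.48)–(4.52) for a randomly drawn (hence, with probability one, full-rank) real frame; this is an illustration, not part of the text:

```python
import numpy as np

rng = np.random.default_rng(2)
N, M = 3, 7
Phi = rng.standard_normal((N, M))            # generic frame

T = Phi @ Phi.T                              # frame operator (4.57)
Phi_d = np.linalg.inv(T) @ Phi               # canonical dual (4.48a)

x = rng.standard_normal(N)
alpha  = Phi_d.T @ x                         # alpha_i = <x, dual phi_i>          (4.52a)
alphad = Phi.T @ x                           # dual coefficients                  (4.52b)
print(np.allclose(Phi @ alpha, x))           # x = sum_i alpha_i phi_i            (4.50a)
print(np.allclose(Phi_d @ alphad, x))        # x = sum_i alphad_i dual phi_i      (4.50b)
```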
Geometry of the Expansion Similarly to tight frames, let us explore the geometry of general frames. For tight frames, we dealt with Φ and Φ⊥ as in (4.30), (4.37); here, we must add the dual frame Φ̃ and its complement Φ̃⊥:

$$\underbrace{\Phi,\ \tilde\Phi}_{N\times M}, \qquad \underbrace{\Phi^\perp,\ \tilde\Phi^\perp}_{(M-N)\times M},$$

with, similarly to (4.38),

$$\Phi\tilde\Phi^* = I_{N\times N}, \qquad (4.53a)$$
$$\Phi(\tilde\Phi^\perp)^* = 0_{N\times(M-N)}, \qquad (4.53b)$$
$$\Phi^\perp\tilde\Phi^* = 0_{(M-N)\times N}, \qquad (4.53c)$$
$$\Phi^\perp(\tilde\Phi^\perp)^* = I_{(M-N)\times(M-N)}. \qquad (4.53d)$$

As in (4.39a), S is the subspace of C^M spanned by the columns of Φ∗, while S⊥ is the subspace of C^M spanned by the columns of (Φ⊥)∗.³⁵ We will see shortly, when discussing projection operators, that

$$\mathrm{span}(\Phi^*) = \mathrm{span}(\tilde\Phi^*), \qquad (4.54a)$$
$$\mathrm{span}\left((\Phi^\perp)^*\right) = \mathrm{span}\left((\tilde\Phi^\perp)^*\right), \qquad (4.54b)$$

and thus, from now on, we will use that

$$S = \mathrm{span}\,(\Phi^*). \qquad (4.55)$$

As before, an arbitrary vector of expansion coefficients α′ ∈ C^M can be written as

$$\alpha' = \alpha + \alpha^\perp \qquad (4.56a)$$

with α ∈ S and α⊥ ∈ S⊥, and thus

$$\langle\alpha,\alpha^\perp\rangle = 0. \qquad (4.56b)$$

³⁵ Note that span(Φ∗) = span(Φ^T).
(4.56b)
Frame Operator T When calculating the dual frame, the so-called canonical dual
of Φ, the product ΦΦ∗ is central; we call it
TN ×N = ΦΦ∗ .
(4.57)
It is a Hermitian and positive definite matrix (see (2.233), and thus, all of its
eigenvalues are real and positive. According to (2.232a), T can be diagonalized as
T = ΦΦ∗ = U ΛU ∗ ,
(4.58)
where Λ is a diagonal matrix of eigenvalues, and U a unitary matrix of eigenvectors.
The largest λmax and smallest λmin eigenvalues play a special role. For tight frames,
λmax = λmin = λ, and T is a scaled identity, T = λI, as it possesses a single
eigenvalue λ of multiplicity N .
Energy of Expansion Coefficients In the examples in the introduction as well as for tight frames earlier, we have seen how the energy of the expansion coefficients is conserved or bounded, (4.12), (4.26), and (4.31), respectively. We now look into it for general frames by computing the energy of the expansion coefficients α̃ as

$$\|\tilde\alpha\|^2 = \tilde\alpha^*\tilde\alpha \stackrel{(a)}{=} x^*\Phi\Phi^* x \stackrel{(b)}{=} x^* U\Lambda U^* x, \qquad (4.59)$$

where (a) follows from (4.51b) and (b) from (4.57). Thus, using (4.25b),

$$\lambda_{\min}\|x\|^2 \le \|\tilde\alpha\|^2 \le \lambda_{\max}\|x\|^2. \qquad (4.60a)$$

Therefore, the energy, while not preserved, is bounded from below and above by the eigenvalues of T. How close (tight) these eigenvalues are will influence the quality of the frame in question, as we will see later. The same argument can be repeated for ‖α‖, leading to

$$\frac{1}{\lambda_{\max}}\|x\|^2 \le \|\alpha\|^2 \le \frac{1}{\lambda_{\min}}\|x\|^2. \qquad (4.60b)$$
Relation to Tight Frames Given a general frame Φ, we can easily transform it
into a tight frame Φ′ . We do this by diagonalizing T as in (4.58). Then the tight
frame is obtained as
Φ′ = U Λ−1/2 U ∗ Φ.
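A sketch of this tightening step (illustrative; uses a random real frame, so the adjoint is a transpose):

```python
import numpy as np

rng = np.random.default_rng(3)
N, M = 3, 6
Phi = rng.standard_normal((N, M))

lam, U = np.linalg.eigh(Phi @ Phi.T)              # T = U diag(lam) U^T, as in (4.58)
Phi_t = U @ np.diag(lam ** -0.5) @ U.T @ Phi      # Phi' = U Lambda^{-1/2} U* Phi
print(np.allclose(Phi_t @ Phi_t.T, np.eye(N)))    # the new frame is 1-tight
```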
Frame Operators
The pair of inequalities (4.60a) leads to an alternate definition of a frame, similar
in spirit to Definition 2.43 for biorthogonal bases:
Definition 4.6 (Frame) A family Φ = {ϕi}_{i=0}^{M−1} in C^N is called a frame when there exist two constants 0 < λmin ≤ λmax < ∞, such that for all x ∈ C^N,

$$\lambda_{\min}\|x\|^2 \le \sum_{i=0}^{M-1}|\langle x,\varphi_i\rangle|^2 \le \lambda_{\max}\|x\|^2, \qquad (4.61)$$

where λmin, λmax are called the lower and upper frame bounds.

Because of (4.60a), the frame bounds are clearly the eigenvalues of T as we have seen previously. From the definition, we can also understand the meaning of the λ we have seen in (4.31); tight frames are obtained when the two frame bounds are equal, that is, when λmax = λmin = λ.
The operators we have seen so far are sometimes called: analysis frame operator Φ̃∗, synthesis frame operator Φ, and frame operator T = ΦΦ∗. The analysis frame operator is one of many, as there exist infinitely many dual frames for a given frame Φ. In our finite-dimensional setting, the analysis frame operator maps an input x ∈ C^N onto a subspace of C^M; namely, α = Φ̃∗x belongs to the subspace S spanned by the columns of Φ∗, as we have seen in (4.55).³⁶ These, together with other frame operators introduced shortly, are summarized in Table 4.1.
Given x ∈ C^N, the frame operator T = ΦΦ∗ is a linear operator from C^N to C^N, guaranteed to be of full rank N since (4.61) ensures that λmin > 0. Also,

$$\lambda_{\min} I \le T = \Phi\Phi^* \le \lambda_{\max} I. \qquad (4.62)$$

On the other hand, given x ∈ C^M, the operator Φ∗Φ, which we have seen before as a projection operator (4.10) in our simple example, maps the input onto a

³⁶ Remember that S is spanned by either Φ∗ or Φ̃∗.
subspace S of C^M. We called that operator a Gram operator in (2.118), G = Φ∗Φ:

$$G = \begin{bmatrix}\varphi_0^* \\ \varphi_1^* \\ \vdots \\ \varphi_{M-1}^*\end{bmatrix}\begin{bmatrix}\varphi_0 & \varphi_1 & \ldots & \varphi_{M-1}\end{bmatrix} = \begin{bmatrix}\langle\varphi_0,\varphi_0\rangle & \langle\varphi_0,\varphi_1\rangle & \ldots & \langle\varphi_0,\varphi_{M-1}\rangle \\ \langle\varphi_1,\varphi_0\rangle & \langle\varphi_1,\varphi_1\rangle & \ldots & \langle\varphi_1,\varphi_{M-1}\rangle \\ \vdots & \vdots & \ddots & \vdots \\ \langle\varphi_{M-1},\varphi_0\rangle & \langle\varphi_{M-1},\varphi_1\rangle & \ldots & \langle\varphi_{M-1},\varphi_{M-1}\rangle\end{bmatrix}_{M\times M}$$

$$\phantom{G} = \begin{bmatrix}\|\varphi_0\|^2 & \langle\varphi_0,\varphi_1\rangle & \ldots & \langle\varphi_0,\varphi_{M-1}\rangle \\ \langle\varphi_0,\varphi_1\rangle^* & \|\varphi_1\|^2 & \ldots & \langle\varphi_1,\varphi_{M-1}\rangle \\ \vdots & \vdots & \ddots & \vdots \\ \langle\varphi_0,\varphi_{M-1}\rangle^* & \langle\varphi_1,\varphi_{M-1}\rangle^* & \ldots & \|\varphi_{M-1}\|^2\end{bmatrix} = G^*. \qquad (4.63)$$
This matrix G contains correlations between different frame vectors, and, while of
size M × M , it is of rank N only.
The frame operator T = ΦΦ∗ and the Gram operator G = Φ∗Φ have the same nonzero eigenvalues (see Section 2.B.2) and thus the same trace. This fact can be used to show that the sum of the eigenvalues of T is equal to the sum of the norms of the frame vectors. We state this and three further useful frame facts here; their proofs are left for Exercise ??:

$$\sum_{j=0}^{N-1}\lambda_j = \sum_{i=0}^{M-1}\|\varphi_i\|^2, \qquad (4.64a)$$
$$Tx = \sum_{i=0}^{M-1}\langle x,\varphi_i\rangle\,\varphi_i, \qquad (4.64b)$$
$$\langle x,Tx\rangle = \sum_{i=0}^{M-1}|\langle x,\varphi_i\rangle|^2, \qquad (4.64c)$$
$$\sum_{i=0}^{M-1}\langle\varphi_i,T\varphi_i\rangle = \sum_{i,j=0}^{M-1}|\langle\varphi_i,\varphi_j\rangle|^2. \qquad (4.64d)$$
Dual Frame Operators We have already discussed the dual frame operator in (4.48a); we now formalize it a bit more.

Definition 4.7 (Canonical dual frame) Given a frame satisfying (4.61), its canonical dual frame Φ̃ and dual frame vectors are:

$$\tilde\Phi = (\Phi\Phi^*)^{-1}\Phi = T^{-1}\Phi, \qquad (4.65a)$$
$$\tilde\varphi_i = (\Phi\Phi^*)^{-1}\varphi_i = T^{-1}\varphi_i. \qquad (4.65b)$$
From (4.60b), we can say that the dual frame Φ̃ is a frame with frame bounds (1/λmax) and (1/λmin). We also see that

$$\tilde T = \tilde\Phi\tilde\Phi^* = T^{-1}\underbrace{\Phi\Phi^*}_{T}\underbrace{(T^{-1})^*}_{T^{-1}} = T^{-1}. \qquad (4.66)$$

What is particular about this canonical dual frame is that, among all right inverses of Φ, Φ̃ leads to the smallest expansion coefficients α in Euclidean norm, as shown in Solved Exercise ??. We will also see later in this section that expansion coefficients α′ other than the coefficients α obtained from the canonical dual might be more appropriate when minimizing other norms (such as the ℓ1 or ℓ∞ norms).
From (4.65b), we see that to compute those dual frame vectors, we need to
invert T . While in finite dimensions, and for reasonable M and N , this is not a
problem, it becomes an issue as M and N grow. In that case, the inverse can be
computed via a series
$$T^{-1} = \frac{2}{\lambda_{\min}+\lambda_{\max}}\sum_{k=0}^{\infty}\left(I - \frac{2}{\lambda_{\min}+\lambda_{\max}}\,T\right)^{k}, \qquad (4.67)$$
which converges faster when the frame bounds λmin and λmax are close, that is,
when the frame is close to being tight. Solved Exercise ?? sketches a proof of
(4.67), and Solved Exercise ?? illustrates it with examples.
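A sketch of the series (4.67) in NumPy (illustrative; the truncation to 50 terms is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(4)
N, M = 4, 12
Phi = rng.standard_normal((N, M))
T = Phi @ Phi.T
lam = np.linalg.eigvalsh(T)
c = 2 / (lam.min() + lam.max())

approx, term = np.zeros_like(T), np.eye(N)
for k in range(50):
    approx += c * term                    # add c * (I - cT)^k
    term = term @ (np.eye(N) - c * T)
print(np.max(np.abs(approx - np.linalg.inv(T))))   # small; converges faster for tighter frames
```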
Projection Operators We have seen various versions of frame operators, mapping C^N to C^N, as well as the Gram operator that maps C^M to C^M. We now look at two other operators, P = Φ̃∗Φ and P̃ = Φ∗Φ̃. In fact, these are the same, as

$$P = \tilde\Phi^*\Phi = \left((\Phi\Phi^*)^{-1}\Phi\right)^*\Phi = \Phi^*(\Phi\Phi^*)^{-1}\Phi = \Phi^*\tilde\Phi = \tilde P. \qquad (4.68)$$

Therefore, P maps C^M to a subspace of C^M, S, and is an orthogonal projection operator, as it is idempotent and self-adjoint (Definition 2.27):

$$P^2 = (\tilde\Phi^*\Phi)(\tilde\Phi^*\Phi) = \tilde\Phi^*\underbrace{(\Phi\tilde\Phi^*)}_{I}\Phi \stackrel{(a)}{=} \tilde\Phi^*\Phi = P,$$
$$P^* = (\tilde\Phi^*\Phi)^* = \Phi^*\tilde\Phi \stackrel{(b)}{=} \Phi^*(T^{-1}\Phi) = (T^{-1}\Phi)^*\Phi \stackrel{(c)}{=} \tilde\Phi^*\Phi = P,$$

where (a) follows from (4.49a); (b) from (4.65a); and (c) from T being Hermitian and thus self-adjoint. This projection operator projects the input onto the column space of Φ̃∗, or, since P and P̃ are the same, onto the column space of Φ∗. Table 4.1 summarizes the various operators we have seen until now, Table 4.2 does so for frame expansions, and Table 4.3 summarizes various classes of frames and their properties, while Figure 4.7 does so pictorially.
4.2.3
Choosing the Expansion Coefficients
Given a frame Φ and a vector x, we have seen in (4.52a) that the expansion coefficients are given by α = Φ̃∗x; for a tight frame, this reduces to α = Φ∗x.
Operator                          Symbol   Expression     Size
Synthesis frame operator          Φ                       N × M
Dual (analysis) frame operator    Φ̃        (ΦΦ∗)⁻¹Φ       N × M
Frame operator                    T        ΦΦ∗            N × N
Gram operator                     G        Φ∗Φ            M × M
Projection operator               P        Φ̃∗Φ            M × M

Table 4.1: Frame operators.
                  In Φ                        In Φ̃
Expansion         x = Φα                      x = Φ̃α̃
Coefficients      α = Φ̃∗x                     α̃ = Φ∗x
                  ΦΦ̃∗ = I                     Φ̃Φ∗ = I
                  λmin I ≤ T ≤ λmax I         (1/λmax) I ≤ T⁻¹ ≤ (1/λmin) I

Table 4.2: Frame expansions.
Figure 4.7: Frames at a glance: frames, tight frames, equal-norm frames, and orthonormal bases (ONB). Tight frames with λ = 1 and unit-norm vectors lead to orthonormal bases.
For frames, because Φ has a nontrivial null space, there exists an infinite set of possible expansion coefficients (see also Solved Exercise ??). That is, given a frame Φ and its canonical dual Φ̃ from (4.65a), from (4.56a) we can write x as

$$x = \Phi\alpha' = \Phi(\alpha + \alpha^\perp), \qquad (4.69)$$
where α′ is a possible vector of expansion coefficients from CM , α is its unique
projection onto S, and α⊥ is an arbitrary vector in S ⊥ . Within this infinite set
of possible expansion coefficients, we can choose particular solutions by imposing
further constraints on α′ . Typically, this is done by minimizing a particular norm,
some of which we discuss now.
Minimum ℓ2 -Norm Solution Among all expansion vectors α′ such that Φα′ = x,
the solution with the smallest ℓ2 norm is
$$\min_{\alpha':\,\Phi\alpha'=x}\|\alpha'\|_2 = \|\alpha\|_2, \qquad (4.70)$$
Frame               Constraints                          Properties
General             {ϕi}_{i=0}^{M−1} is a frame for C^N  λmin‖x‖² ≤ Σ_{i=0}^{M−1} |⟨x,ϕi⟩|² ≤ λmax‖x‖²
                                                         λmin I ≤ T ≤ λmax I
                                                         tr(T) = Σ_{j=0}^{N−1} λj = tr(G) = Σ_{i=0}^{M−1} ‖ϕi‖²
Equal-norm          ‖ϕi‖ = ‖ϕj‖ = ϕ for all i and j      λmin‖x‖² ≤ Σ_{i=0}^{M−1} |⟨x,ϕi⟩|² ≤ λmax‖x‖²
                                                         λmin I ≤ T ≤ λmax I
                                                         tr(T) = Σ_{j=0}^{N−1} λj = tr(G) = Σ_{i=0}^{M−1} ‖ϕi‖² = Mϕ²
Tight               λmin = λmax = λ                      Σ_{i=0}^{M−1} |⟨x,ϕi⟩|² = λ‖x‖²
                                                         T = λI
                                                         tr(T) = Σ_{j=0}^{N−1} λj = Nλ = tr(G) = Σ_{i=0}^{M−1} ‖ϕi‖²
Orthonormal basis   λmin = λmax = 1                      Σ_{i=0}^{M−1} |⟨x,ϕi⟩|² = ‖x‖²
                    ‖ϕi‖ = 1 for all i                   T = I
                    N = M                                tr(T) = Σ_{j=0}^{N−1} λj = N = tr(G) = Σ_{i=0}^{M−1} ‖ϕi‖² = M

Table 4.3: Summary of properties for various classes of frames.
where α = Φ̃∗x is the expansion computed with respect to the canonical dual frame. The proof of this fact is along the lines of what we have seen in the introduction for the frame (4.1), since, from (4.69),

$$\|\alpha'\|^2 = \|\alpha\|^2 + \|\alpha^\perp\|^2 + 2\,\Re\langle\alpha,\alpha^\perp\rangle \stackrel{(a)}{=} \|\alpha\|^2 + \|\alpha^\perp\|^2,$$

where (a) follows from (4.56b). The minimum is achieved for α⊥ = 0.
Since Φ contains sets of N linearly independent vectors (often a very large
number of such sets), we can write x as a linear combination of N vectors from
one such set, that is, α′ will contain exactly N nonzero coefficients and will be
sparse.³⁷ On the other hand, the minimum ℓ2-norm expansion coefficients α, using
the canonical dual, will typically contain M nonzero coefficients. We illustrate this
in the following example:
Example 4.2 (Nonuniqueness of the dual frame) Take R2 and the unitnorm tight frame covering the unit circle at angles (2πi)/M , for i = 0, 1, . . . , M −
1, an example of which we have already seen in (4.3) for M = 3. For M = 5, we
get

√ 
√
√
√
1
1
(−1 + 5)
− 14 (1 + 5)
− 41 (1 + 5)
(−1
+
5)
4


Φ = 
p
p
p
p
√
√
√
√ .
1
1
1
1
0 2√2 ( 5 + 5) 2√2 ( 5 − 5) − 2√2 ( 5 − 5) − 2√2 ( 5 + 5)
37 Remember
that by sparse we mean an expansion that uses only N out of the M frame vectors.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
148
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 4. Local Fourier and Wavelet Frames on Sequences
The dual frame is just a scaled version of the frame itself,
e = 2 Φ.
Φ
5
e ∗ x will typically have 5 nonzero coefficients, but no
For an arbitrary x, α = Φ
fewer than 4 (when x is orthogonal to one of the ϕ’s). On the other hand, every
set {ϕi , ϕj }, i 6= j, is a biorthogonal basis for R2 , meaning we can achieve an
expansion with only 2 nonzero coefficients. Specifically, choose a biorthogonal
e = {ϕ
basis Ψ = {ϕi , ϕj }, calculate its dual basis Ψ
ei , ϕ
ej }, and choose α′ as
α′k
=
hx, ϕ
ek i, k = i or k = j;
0,
otherwise.
We have 52 = 10 possible bases; which ones are the best? As usual, those closer
to an orthonormal basis will be better, because they are better conditioned.
Thus, we should look for pairs {ϕi , ϕj } that have an inner product |hϕi , ϕj i|
that is as small as possible. To do this, calculate the Gram operator (4.63) and
take the absolute values of its entries |hϕi , ϕj i|:
√
5−1
√ 4
 5−1
√ 4
√
1
 5+1
√
√5 − 1
4
 5+1
√
√5 + 1
5−1
5+1
√
√5 + 1
5−1
√ 4
√5 − 1
5+1

√
√5 + 1
√5 + 1
5−1
√ 4
5−1
√

√5 − 1

√5 + 1
,
5
+
1

√
5 − 1
4
and we see, as it is obvious from the geometry of the problem, that 5 bases,
{{ϕ0 , ϕ1 }, {ϕ0 , ϕ4 }, {ϕ1 , ϕ3 }, {ϕ
√ 2 , ϕ4 }, {ϕ3 , ϕ4 }}, have a minimum inner product, those with |hϕi , ϕj i| = ( 5 − 1)/4 ∼ 0.31. Now which of these to choose?
If we do not take into account x, it really does not matter. However, if we do
take it into account, then it makes sense to first choose a vector ϕi that is most
aligned with x:
max |hx, ϕi i|.
i
Assume ϕ0 is chosen, that is, x is in the shaded region in Figure 4.8. Then,
either ϕ1 or ϕ4 can be used. Let us choose an x in the shaded region, say
√
T
e ∗ x, as well as α′ = Ψ
e ∗ x with the
x=
3/2 1/2 , and compute both α = Φ
biorthogonal basis Ψ = {ϕ0 , ϕ1 }. Then,
T
0.34641 0.297258 −0.162695 −0.397809 −0.0831647 ,
T
α′ = 0.703566 0.525731 0 0 0 .
α =
As expected, α has 5 nonzero coefficients, while α′ has only 2. Then,
kαk2 = 0.63246
α3.2 [January 2013] [free version] CC by-nc-nd
<
0.87829 = kα′ k2 ,
(4.71)
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
4.2. Finite-Dimensional Frames
149
ϕ2
ϕ1
ϕ0
1
ϕ3
ϕ4
Figure 4.8: Unit-norm tight frame in R2 . Those x belonging to the shaded region have
the maximum inner product (in magnitude) with ϕ0 . One can then choose ϕ1 or ϕ4 as
the other vector in a biorthogonal expansion.
as expected, as α has the minimum ℓ2 norm. However,
kαk1 = 1.28734
>
1.22930 = kα′ k1 ,
(4.72)
and thus, the sparser expansion is worse with respect to the ℓ2 norm, but is
better with respect to the ℓ1 norm, illustrating the wide range of possibilities for
expansions in frames, as well as algorithmic issues that will be explored later.
Minimum ℓ1 -Norm Solution Instead of the ℓ2 norm, we can minimize the ℓ1 norm.
That is, solve
min kα′ k1
under the constraint
Φα′ = x.
This can be turned into a linear program (see Section 4.6.3). Interestingly, minimizing the ℓ1 norm will promote sparsity.
Example 4.3 (Nonuniqueness of the dual frame (cont’d)) We now continue our previous example and calculate the expansion coefficients for the 5
biorthogonal bases Ψ01 = {ϕ0 , ϕ1 }, Ψ04 = {ϕ0 , ϕ4 }, Ψ13 = {ϕ1 , ϕ3 }, Ψ24 =
{ϕ2 , ϕ4 }, Ψ34 = {ϕ3 , ϕ4 }. These, and their ℓ1 norms are (we have already computed α′01 = α′ above but repeat it here for completeness):
α′
α′01
α′04
α′13
α′24
α′34
0.703566
1.028490
−0.177834
−1.664120
−1.028490
0.525731
−0.525731
−1.138390
−1.554221
0.109908
kα′ k1
1.22930
1.55422
1.31623
3.21834
1.13839
So we see that even among sparse expansions with exactly 2 nonzero coefficients
there are differences. In this particular case, Ψ34 has the lowest ℓ1 norm.
Minimum ℓ0 -Norm Solution The ℓ0 norm simply counts the number of nonzero
entries in a vector:
X
kxk0 = lim
|xk |p ,
(4.73)
p→0
k∈Z
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
150
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 4. Local Fourier and Wavelet Frames on Sequences
with 00 = 0. Since a frame with M vectors in an N -dimensional space has necessarily a set of N linearly independent vectors, we can take these as a basis, compute
the biorthogonal dual basis, and find an expansion α′ with exactly N nonzero components (as we have just done in Example 4.2). Usually, there are many such sets
(see Exercise ??), all leading to an expansion with N nonzero coefficients. Among
these multiple solutions, we may want to choose that one with the least ℓ2 norm.
This shows that there exists a sparse expansion, very different from the expansion
that minimizes the ℓ2 norm (which will typically uses all M frame vectors and is
thus not sparse).
Minimum ℓ∞ -Norm Solution Among possible expansion coefficients α′ , we can
also chose that one that minimizes the maximum value |α′i |. That is, solve
min kα′ k∞
under the constraint
Φα′ = x.
This optimization problem can be solved using TBD. While such a solution is useful
when one wants to avoid large coefficients, minimizing the ℓ2 norm achieves a similar
goal.
Choosing the Expansion Coefficients In summary, we have seen that the nonuniqueness of possible frame expansion coefficients leaves us with freedom to optimize some
other criteria. For example, for a sparse expansion using only a few vectors from
the frame, minimizing the ℓ0 norm is a possible route, although computationally
difficult. Instead, minimizing the ℓ1 norm achieves a similar goal (as we will see in
Chapter 7), and can be done with an efficient algorithm—namely, linear programming (see Section 4.6.3). Minimizing the ℓ2 norm does not lead to sparsity; instead,
it promotes small coefficients, similarly to minimizing the maximum absolute value
of coefficients, or the ℓ∞ norm. We illustrate this discussion with a simple example:
Example 4.4 (Different norms lead to different expansions) Consider
the simplest example, N = 1, M = 2. As a frame and its dual, choose
Φ =
1
1 2
5
e =
Φ
1 2
e ∗ = I.
ΦΦ
Given an input x, the subspace of all expansion coefficients α′ that leads to
x = Φα′ is described by
1
2
α′ = α + α⊥ =
x +
γ,
2
−1
since the first term is colinear with Φ, while the second is orthogonal to Φ. In
′
Figure 4.9 we show
′ α for x = ′1. It is a line of slope −1/2 passing through
the point 1 2 , α1 = −(1/2)α0 + 5/2. We can choose any point on this line
T
as a possible set α′0 α′1
for reconstructing x with the frame Φ. Recalling
Figure 2.7 depicting points with constant ℓ1 -, ℓ2 -, and ℓ∞ norms, we now see
what the solutions are to the minimization problem in different norms:
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
4.3. Oversampled Filter Banks
151
α′1
3
D
C
2
S
1
-3
α′
1
-1
3
5
α′0
-1
-2
-3
Figure 4.9: The space of possible expansion coefficients in the frame Φ = (1/5)[1 2], and
the subspace α′ = [1 2]T x + [2 − 1]T γ for x = 1. To find the points of minimum ℓ1 -,
ℓ2 -, and ℓ∞ norms, we grow a diamond, a circle and a square, respectively, and find the
intercept points with the subspace α′ (see also Figure 2.7 showing points with constant
ℓ1 -, ℓ2 -, and ℓ∞ norms). These are D = [0 5/2], C = [1 2] and S = [5/3 5/3], respectively.
(i) Minimum ℓ2 -norm solution: The points with the same ℓ2 norm form a
′
circle. Thus, growing
a circle from the origin2 to the intercept with α yields
the point C = 1 2 with the minimum ℓ norm (see Figure 4.9). From
what we know about the ℓ2 norm, we could have also obtained it as the
point on α′ closest to the origin (orthogonal projection of the origin onto
the line of possible α′ ).
(ii) Minimum ℓ1 -norm solution: The points with the same ℓ1 norm form a
diamond. Thus, growing
from the origin to the intercept with α′
a diamond
yields the point D = 0 5/2 with the minimum ℓ1 norm (see Figure 4.9).
(iii) Minimum ℓ∞ -norm solution: The points with the same ℓ∞ norm form a
square. Thus, growing a square from the origin to the intercept with α′
yields the point S = 5/3 5/3 with the minimum ℓ∞ norm (see Figure 4.9).
The table below numerically compares these three cases:
D
C
S
ℓ1
ℓ2
ℓ∞
2.50
3.00
3.33
2.50
2.24
2.36
2.50
2.00
1.67
Emphasized entries are the minimum values for each respective norm.
4.3
Oversampled Filter Banks
This section develops necessary conditions for the design of oversampled filter banks
implementing frame expansions. We consider mostly those filter banks implementing tight frames, as the general ones follow easily and can be found in the literature.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
152
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 4. Local Fourier and Wavelet Frames on Sequences
x
geM−1
N
N
gM−1
b
b
b
b
b
b
g0
e
N
N
+
x
g0
Figure 4.10: A filter-bank implementation of a frame expansion: It is an M -channel
filter bank with sampling by N , M > N .
As we have done for filter banks implementing basis expansions (Chapters 1-3) we
also look into their polyphase representation.
From everything we have learned so far, we may expect to have an M -channel
filter bank, where each channel corresponds to one of the template frame vectors (a
couple of simple examples were given in Section 4.1 and illustrated in Figures 4.2 and
4.4). The infinite set of frame vectors is obtained by shifting the M template ones
by integer multiples of N , N < M ; thus the redundancy of the system. This shifting
can be modeled by the samplers in the system, as we have seen previously. Not
surprisingly thus, a general oversampled filter bank implementing a frame expansion
is given in Figure 4.10. We now go through the salient features in some detail;
however, since this material is a simple extension of what we have seen previously
for bases, we will be brief.
4.3.1
Tight Oversampled Filter Banks
We now follow the structure of the previous section and show the filter-bank equivalent of the expansion, expansion coefficients, geometry of the expansion, as well as
look into the polyphase decomposition as a standard analysis tool, as we have done
in the previous chapters.
As opposed to the previous section, we now work in an infinite-dimensional
space, ℓ2 (Z), where formally, many things will look the same. However, we need to
exercise care, and will point out specific instances when this is the case. Instead of
a finite-dimensional matrix Φ as in (4.30), we now deal with an infinite-dimensional
one, and with structure: the M template frame vectors, ϕ0 , ϕ1 , . . ., ϕM−1 , repeat
themselves shifted in time, much the same way they do for bases. Renaming them
g0 = ϕ0 , g1 = ϕ1 , . . ., gM−1 = ϕM−1 , we get
Φ =
h
...
g0,n
g1,n
...
gM−1,n
g0,n−N
g1,n−N
...
gM−1,n−N
i
... ,
just like for critically-sampled filter banks (those with the number of channel samples per unit of time conserved, that is, M = N , or, those implementing basis
expansions), except for the larger number of template frame vectors. We could easily implement finite-dimensional frame expansions we have seen in the last section
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
4.3. Oversampled Filter Banks
gM−1,−n
153
N
αM−1
N
gM−1,n
b
b
b
b
b
b
x
g0,−n
N
α0
N
+
x
g0,n
Figure 4.11: A filter-bank implementation of a tight frame expansion.
by just limiting the number of nonzero coefficients in

..
..
..
..
..
.
.
.
.
 .
. . .
g0,0
...
gM−1,0
0


. . .
g0,1
...
gM−1,1
0

..
..
..
..

. . .
.
.
.
.

. . . g0,N −1 . . . gM−1,N −1
0
Φ = 
. . .
0
...
0
g0,0

. . .
0
...
0
g0,1

..
..
..
..

. . .
.
.
.
.

. . .
0
...
0
g0,N −1

..
..
..
..
.
..
.
.
.
.


..
..
..
.
.. 
.
.
 .
 . . . Φ0
. . .
,
= 
. . .
Φ
. . .
0


..
. ..
..
.
.
.
.
.
gi to N , resulting in
..
.
...
...
..
.
..
.
0
0
..
.
...
...
...
..
.
0
gM−1,0
gM−1,1
..
.
...
..
.
gM−1,N −1
..
.
.
..


. . .


. . .


. . .

. . .

. . .

. . .


. . .

. . .

..
.
that is, a block-diagonal matrix, with the finite-dimensional frame matrix Φ0 of size
N × M on the diagonal. Recall that we concentrate on the tight-frame case, and
therefore, Φ0 Φ∗0 = I.
Expansion We can express the frame expansion formally in the same way as we
did for finite-dimensional frames in (4.32) (again, because it is the tight-frame case)
Φ Φ∗ = I,
(4.74)
except√that we will always work with 1-tight frames by normalizing Φ if necessary
by 1/ λ, for the filter bank to be perfect reconstruction. Writing out the expansion,
however, we see its infinite-dimensional aspect:
x =
M−1
XX
i=0 k∈Z
α3.2 [January 2013] [free version] CC by-nc-nd
hx, gi,n−N k i gi,n−N k .
(4.75)
Comments to [email protected]
Fourier and Wavelet Signal Processing
154
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 4. Local Fourier and Wavelet Frames on Sequences
The process of computing the expansion coefficients is implemented via an analysis
filter bank, filtering by individual filters gi,−n , i = 0, 1, . . . , M − 1, and downsampling by N , as on the left side of Figure 4.11:
α = Φ∗ x
αi,k = hx, gi,n−N k i,
(4.76)
while the process of reconstructing x is implemented via a synthesis filter bank,
upsampling by N and filtering by individual filters gi,n , i = 0, 1, . . . , M − 1, as on
the right side of Figure 4.11:
x = Φα
x =
M−1
XX
αi,k gi,n−N k .
(4.77)
i=0 k∈Z
In all of the above, Φ is an infinite matrix, α and x are infinite vectors.
One can, of course, use the Fourier-domain or z-transform-domain expressions,
as before. Since they are identical (except for the number of filters), we just give
one as an example. For example, in z-transform-domain, we can find the expression
of the effect of one single branch as
Gi (z)
N −1
1 X
Gi (WN−k z −1 )X(WNk z).
N
k=0
Summing these over all branches, i = 0, 1, . . . , M − 1, we get
X(z) =
M−1
X
i=0
=
N −1
1 X
Gi (WN−k z −1 )X(WNk z)
N
k=0
!
M−1
X
−k −1
Gi (z)Gi (WN z ) X(WNk z).
Gi (z)
N −1
1 X
N
k=0
i=0
Therefore, for perfect reconstruction, the term with X(z) must equal N , while all
the others (aliasing terms) must cancel, that is:
M−1
X
Gi (z)Gi (z −1 ) = N,
i=0
M−1
X
i=0
Gi (z)Gi (WN−k z −1 ) = 0,
k = 1, 2, . . . , M − 1.
For example, for N = 2 and M = 3, we get that:
G0 (z)G0 (z −1 ) + G1 (z)G1 (z −1 ) + G2 (z)G2 (z −1 ) = 2,
G0 (z)G0 (−z −1 ) + G1 (z)G1 (−z −1 ) + G2 (z)G2 (−z −1 ) = 0.
Compare this to its counterpart expression in two-channel filter banks in (1.28).
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
4.3. Oversampled Filter Banks
155
Geometry of the Expansion Analogously to bases, each branch (channel) projects
onto a subspace of ℓ2 (Z) we call V0 or Wi , i = 1, 2, . . . , M − 1.38 While each of
these is on its own an orthogonal projection (because PV in (1.18) is an orthogonal
projection operator), they are not orthogonal to each other because of oversampling.
Each of the orthogonal projection operators is given as
PV0 = G0 UN DN GT0 ,
PWi = Gi UN DN GTi ,
i = 1, 2, . . . , M − 1,
with the range
V0 = span({g0,n−N k }k∈Z ),
Wi = span({gi,n−N k }k∈Z ),
4.3.2
i = 1, 2, . . . , M − 1.
Polyphase View of Oversampled Filter Banks
To cover the polyphase view for general N and M , we cover it through an example
with N = 2, M = 3; expressions for general N and M follow easily.
Example 4.5 (Tight oversampled 3-channel filter banks) For two-channel
filter banks, a polyphase decomposition is achieved by simply splitting both sequences and filters into their even- and odd-indexed subsequences; this is governed by the sampling factor. In an oversampled tight filter bank with N = 2
and M = 3, we still do the same; the difference is going to be in the number of
filters, as before. We have already seen how to decompose an input sequence in
(3.216), synthesis filters in (1.32), and analysis filters in (1.34). In our context,
these polyphase decompositions are the same, except that for filters, we have
more of them involved:
X
ZT
gi,0,n = gi,2n ←→ Gi,0 (z) =
gi,2n z −n ,
n∈Z
gi,1,n = gi,2n+1
ZT
←→
Gi,1 (z) =
X
gi,2n+1 z −n ,
n∈Z
Gi (z) = Gi,0 (z 2 ) + z −1 Gi,1 (z 2 ),
for i = 0, 1, 2 and synthesis filters. That is, we have 3 filters with 2 polyphase
components each, leading to the following synthesis polyphase matrix Φp (z):
Φp (z) =
G0,0 (z) G1,0 (z) G2,0 (z)
.
G0,1 (z) G1,1 (z) G2,1 (z)
As expected, the polyphase matrix is no longer square; rather, it is a (2 × 3)
matrix of polynomials. Similarly, on the analysis side, since this is a filter bank
e = Φ, we assume the same filters as on the
implementing a tight frame with Φ
38 We
assume here that the space V0 is lowpass in nature, while the Wi are bandpass.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
156
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 4. Local Fourier and Wavelet Frames on Sequences
synthesis side, only time reversed,
gi,0,n = e
e
gi,2n = gi,−2n
gi,1,n = gei,2n−1 = gi,−2n+1
e
ZT
←→
ZT
←→
ei,0 (z) =
G
ei,1 (z) =
G
X
gi,−2n z −n ,
n∈Z
X
gi,−2n+1 z −n ,
n∈Z
ei (z) = Gi,0 (z −2 ) + zGi,1 (z −2 ),
G
for i = 0, 1, 2. With this definition, the analysis polyphase matrix is, similarly
to the one for the two-channel case:
−1
−1
−1
e p (z) = G0,0 (z −1 ) G1,0 (z −1 ) G2,0 (z −1 ) = Φp (z −1 ),
Φ
G0,1 (z ) G1,1 (z ) G2,1 (z )
e p (z) is again a (2 × 3) matrix of polynomials.
where Φ
As before, this type of a representation allows for a very compact inputoutput relationship between the input (decomposed into polyphase components)
and the result coming out of the synthesis filter bank:
X0 (z 2 )
2
∗ −2
−1
,
X(z) = 1 z
Φp (z ) Φp (z )
X1 (z 2 )
where we have again used Hermitian transpose because we will often deal with
complex-coefficient filter banks in this chapter. The above is formally the same
as the expression for a critically-sampled filter bank with 2 channels; the overe p.
sampling is hidden in the dimensions of the rectangular matrices Φp and Φ
2
∗ −2
Clearly for the above to hold, Φp (z ) Φp (z ) must be an identity, analogously
to orthogonal filter banks. This result for tight frames is formalized in Theorem 4.8.
The above example went through various polyphase concepts for a tight oversampled
3-channel filter bank. For general oversampled filter banks with N , M , expressions
are the same as those given in (2.12c), (2.12e), except with M filters instead of N .
The corresponding polyphase matrices are of sizes N × M each.
Frame Operators All the frame operators we have seen so far can be expressed
via filter bank ones as well.
The frame operator T for a general infinite-dimensional frame is formally
defined as for the finite-dimensional one in (4.57), except that it is now infinitedimensional itself. Its polyphase counterpart is:
Tp (z) = Φp (z) Φ∗p (z −1 ).
(4.80)
For a tight frame implemented by a tight oversampled filter bank, this has to be
an identity as we have already said in the above example. In other words, Φp is a
rectangular paraunitary matrix. The frame operator Tp (z) is positive definite on
the unit circle:
Tp (ejω ) = |Φp (ejω )|2 > 0.
(4.81)
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
4.3. Oversampled Filter Banks
157
The canonical dual frame operator has its polyphase counterpart in:
e p (z) = Tp (z)−1 Φp (z).
Φ
(4.82)
Again, we can see that when the frame is tight, Tp (z) = I, then the dual polyphase
matrix is the same as Φp (z).
Polyphase Decomposition of an Oversampled Filter Bank As before, the polyphase formulation allows us to characterize classes of solutions. The following theorem, the counterpart of Theorem 2.1 for critically-sampled filter banks, summarizes
these without proof, the pointers to which are given in Further Reading.
Theorem 4.8 (Oversampled M -channel filter banks in polyphase domain)
Given is an M -channel filter bank with sampling by N and the polyphase matrices
e p (z). Then:
Φp (z), Φ
(i) Frame expansion in polyphase domain
A filter bank implements a general frame expansion if and only if
e ∗ (z) = I.
Φp (z)Φ
p
(4.83a)
Tp (z) = Φp (z)Φ∗p (z −1 ) = I,
(4.83b)
A filter bank implements a tight frame expansion if and only if
that is, Φp (z) is paraunitary.
(ii) Naimark’s theorem in polyphase domain
An infinite-dimensional frame implementable via an M -channel filter bank
with sampling by N is a general frame if and only if there exists a biorthogonal basis implementable via an M -channel filter bank with sampling by M
so that
Φ∗p (z) = Ψp (z)[J],
(4.84)
where J ⊂ {0, . . . , M − 1} is the index set of the retained columns of Ψp (z),
and Φp (z), Ψp (z) are the frame/basis polyphase matrices, respectively.
An infinite-dimensional frame implementable via an M -channel filter bank
with sampling by N is a tight frame if and only if there exists an orthonormal
basis implementable via an M -channel filter bank with sampling by M so
that (4.84) holds.39
(iii) Frame bounds
The frame bounds of a frame implementable by a filter bank are given by:
λmin =
min
Tp (ejω ),
(4.85a)
max Tp (ejω ).
(4.85b)
ω∈[−π,π)
λmax =
ω∈[−π,π)
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
158
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 4. Local Fourier and Wavelet Frames on Sequences
The last statement on eigenvalues stems from the fact that the frame operator
T and its polyphase counterpart Tp (ejω ) are related via a unitary transformation.
If the eigenvalues of Tp (ejω ) are defined via Tp (ejω )v(ω) = λ(ω)v(ω), then the
eigenvalues of T and Tp (ejω ) are the same, leading to (4.85).
Example 4.6 (Tight oversampled 3-channel filter banks cont’d) We
now set M = 3, N = 2 and show how one can obtain a linear-phase tight frame
with filters of length greater than 2, a solution not possible for critically-sampled
filter banks with sampling by 2, as was shown in Theorem 1.12. We know that
such a filter bank implementing a tight frame transform must be seeded from an
orthogonal filter bank with a 3 × 3 paraunitary matrix.
We use such a matrix in the example showing how to parameterize N channel orthogonal filter banks, Example 2.2 with K = 2, that is, all polyphase
components will be first-degree polynomials in z −1 . We form a tight frame by
deleting its last column and call the resulting frame polyphase matrix ΦTp (z).
Since there are 5 angles involved, the matrix is too big to explicitly state here;
instead, we start imposing the linear-phase conditions to reduce the number of
degrees of freedom. A simple solution with θ00 = π/2, θ11 = π/2, θ02 = π/4 and
θ10 = 3π/4, leads to the first two filters being symmetric of length 3 and the last
antisymmetric of length 3. The resulting polyphase matrix is (where we have
rescaled the first and third columns by −1):
Φp (z) =
1
2
cos θ01 (1 + z −1 )
sin θ01
1
2
sin θ01 (1 + z −1 )
− cos θ01
1
2 (1
T
− z −1 )
,
0
leading to the following three filters:
1
cos θ01 + sin θ01 z −1 +
2
1
G1 (z) =
sin θ01 − cos θ01 z −1 +
2
1 1 −2
G2 (z) =
− z .
2 2
G0 (z) =
1
cos θ01 z −2 ,
2
1
sin θ01 z −2 ,
2
For example, with θ01 = π/3, the three resulting filters have reasonable coverage
of the frequency axis (see Figure 4.12).
4.4
Local Fourier Frames
Until now, the material in this chapter covered finite-dimensional frames (Section 4.2) and oversampled filter banks as a vehicle for implementing both finitedimensional as well as certain infinite-dimensional frames (previous section). We
now investigate a more specific class of frames; those obtained by modulating (shifting in frequency) a single prototype filter/frame vector, introduced in their basis
form in Chapter 2. These are some of the oldest bases and frames, and some of the
most widely used. The local Fourier expansions arose in response to the need to
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
4.4. Local Fourier Frames
159
1.5
|Gi (ejω )|
1.
0.5
0
Π
Π
2
ω
Figure 4.12: Tight oversampled 3-channel filter bank with sampling by N = 2 and
linear-phase filters. The figure depicts the three magnitude responses.
create a local Fourier tool, able to achieve some localization in time, at the price of
worsening the known excellent localization in frequency.
As in Chapter 2, we will consider two large classes of local Fourier frames, those
obtained by complex-exponential modulation, as well as those obtained by cosine
modulation of a single prototype filter/frame vector. In Chapter 2, we learned
that, while there exist no good local Fourier bases (apart from those equivalent to
a finite-dimensional basis), there do exist good local cosine bases. In this section,
we go even farther; we show that there exist good local Fourier frames, due to the
extra freedom redundancy buys us.
4.4.1
Complex Exponential-Modulated Local Fourier Frames
Complex-exponential modulation is used in many instances, such as the DFT basis,
(2.2), (2.5), as well as the basis constructed from the ideal filters (2.6), and is at the
heart of the local Fourier expansion known as Gabor transform. The term Gabor
frame is often used to describe any frame with complex-exponential modulation
and overlapping frame vectors (oversampled filter banks with filters of lengths longer
than the sampling factor N ). For complex exponential-modulated bases, we defined
this modulation in (2.16); for complex exponential-modulated frames, we do it now.
Complex-Exponential Modulation Given a prototype filter p = g0 , the rest of the
filters are obtained via complex-exponential modulation:
−in
gi,n = pn ej(2π/N )in = pn WM
,
i
Gi (z) = P (WM z),
(4.86)
i jω
Gi (ejω ) = P (ej(ω−(2π/M)i) ) = P (WM
e ),
for i = 1, 2, . . . , M − 1. A filter bank implementing such a frame expansion is often
called complex exponential-modulated oversampled filter bank. While the prototype
filter p = g0 is typically real, the rest of the bandpass filters are complex. The above
is identical to the expression for bases, (2.16); the difference is in the sampling factor
N , smaller here than the number of filters M .
Overcoming the Limitations of the Balian-Low Theorem In Chapter 2, Theorem 2.2, we saw that there does not exist a complex exponential-modulated local
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
160
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 4. Local Fourier and Wavelet Frames on Sequences
Fourier basis implementable by an N -channel FIR filter bank, except for a filter
bank with filters of length N . We illustrated the proof with an example for N = 3
in (2.17) and demonstrated that the only solution consisted of each polyphase component being a monomial, leading to a block-based expansion.
We now investigate what happens with frames. Start with the polyphase
representation (2.12d) of the prototype filter p = g0 ,
P (z) = P0 (z N ) + z −1 P1 (z N ) + . . . + z −(N −1) PN −1 (z N ),
where Pi (z), i = 0, 1, . . . , N − 1 are its polyphase components. The modulated
versions become
i
Gi (z) = P (WM
z)
−(N −1)i −(N −1)
iN N
= P0 (WM
z ) + . . . + WM
z
iN N
z ),
PN −1 (WM
for i = 1, 2, . . . , M − 1. On a simple example, we now show that relaxing the basis
requirement allows us to implement a tight frame expansion via an oversampled
filter bank with FIR filters longer than the sampling factor N .
Example 4.7 (Overcoming limitations of Balian-Low theorem) Let
N = 2 and M = 3. The polyphase matrix corresponding to the complex
exponential-modulated filter bank is given by
P0 (z)
P0 (W32 z)
P0 (W3 z)
Φp (z) =
P1 (z) W32 P1 (W32 z) W3 P1 (W3 z)

1
=
0
1 1
0 0
0 0
1 1
1
0
0
0

0
0 Pu (z)
1

1
Pℓ (z) 
1
0

0
0
0 W32

0
1 

0 
 , (4.87)
0 

W3 
0
with Pu (z) and Pℓ (z) the diagonal matrices of polyphase components:
Pu (z) = diag([P0 (z), P0 (W3 z), P0 (W32 z)]),
Pu (z) = diag([P1 (z), P1 (W3 z), P1 (W32 z)]),
and W3−1 = W32 , W3−2 = W3 . Compare (4.87) to its basis counterpart in (2.17).
We now want to see whether it is possible for such a frame polyphase matrix
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
4.4. Local Fourier Frames
161
to implement a tight frame, in which case, it would have to satisfy (4.83b).
Φp (z)Φ∗p (z −1 ) =
=
=
=
(a)
=

1
1

1
1 1 1 0 0 0 Pu (z)
I W −1 Pu (z −1 )

−1
0 0 0 1 1 1
Pℓ (z) W
I
Pℓ (z ) 
0
0
0


1 0
 1 0


1 1 1 0 0 0
Pu (z)Pu (z −1 ) W −1 Pu (z)Pℓ (z −1 ) 
1 0



0 0 0 1 1 1 W Pℓ (z)Pu (z −1 )
Pℓ (z)Pℓ (z −1 ) 
 0 1
 0 1
0 1
P2
P2
−i
−i
i −1
i
i −1
P0 (W3 z)P0 (W3 z )
)
i=0 W3 P0 (W3 z)P1 (W3 z
Pi=0
P
2
2
−i −1
−i
i
i −1
)P1 (W3i z)
)
i=0 W3 P0 (W3 z
i=0 P1 (W3 z)P1 (W3 z
P2
P
2
−i −1
−i −1
i
i
i
P0 (W3 z)P0 (W3 z )
W3 P0 (W3 z)P1 (W3 z )
i=0
i=0
P2
P2
= I,
−i
−i −1
−i −1
i
)P1 (W3i z)
)
i=0 W3 P0 (W3 z
i=0 P1 (W3 z)P1 (W3 z

0
0

0

1

1
1
where (a) follows again from W3−1 = W32 , W3−2 = W3 , W = diag([1, W3 , W32 ]),
and we assumed that p is real. It is clear that the set of conditions above is much
less restrictive than that of every polyphase component of the prototype filter
having to be a monomial (the condition that lead to the negative result in the
discrete Balian-Low theorem, Theorem 2.2).
For example, we see that the conditions on each polyphase component:
2
X
i=0
2
X
P0 (W3i z)P0 (W3−i z −1 ) = 1,
P1 (W3i z)P1 (W3−i z −1 ) = 1,
i=0
are equivalent to those polyphase components being orthogonal filters as in (2.7).
On the other hand, the conditions involving both polyphase components:
2
X
W3i P0 (W3i z)P1 (W3−i z −1 ) = 0,
i=0
2
X
W3−i P0 (W3−i z −1 )P1 (W3i z) = 0,
i=0
are equivalent to P0 (z) and z −1 P1 (z) being orthogonal to each other as in (2.9).
For example, we know that the rows of (4.3) are orthogonal filters (since
it is a tight frame and the rows are orthonormal vectors from a 3 × 3 unitary
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
162
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 4. Local Fourier and Wavelet Frames on Sequences
|P (ejω )|
1.
0.5
0
Π
2Π
3
3
ω
Figure 4.13: Magnitude response of the prototype filter P (z) of length 5.
Figure 4.14: Spectrogram of a speech segment. 64 frequency bins are evaluated between
0 and 4 KHz, and a triangle window with 50% overlap is used.
matrix via Naimark’s theorem), so we can take (with normalization)
P0 (z) =
1 √
( 2−
3
√1 z −1
2
−
√1 z −2 ),
2
1
P1 (z) = √ (1 − z −1 ).
6
We can now get the prototype filter P (z) as
P (z) = P0 (z 2 ) + z −1 P1 (z 2 ) =
√
√
1
√ (2 + 3z −1 − z −2 − 3z −3 − z −4 ),
3 2
a longer solutions than N = 2, with the magnitude response as in Figure 4.13.
Another example, with N = 2 and M = 4 is left as Exercise ??.
Application to Power Spectral Density Estimation In Chapter 2, Section 2.3.2,
we discussed the computation of periodograms as a widely used application of complex exponential-modulated filter banks. It is a process of estimating and computing
the local power spectral density. That process has a natural filter-bank implementation described in the same section. The prototype filter p computes the windowing,
and the modulation computes the DFT (see Figure 2.9 and Table 2.1). The downsampling factor N can be smaller than M , which is when we have a frame. For
example, with N = M/2, we have 50% overlap, and if N = 1 (that is, no downsampling) we are computing a sliding window DFT (with (M − 1)/M % overlap).
When both the time redundancy and the number of frequencies increases, this
time-frequency frame approaches a continuous transform called the local Fourier
transform, treated in detail in Chapter 5. A typical example for calculating the
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
4.4. Local Fourier Frames
163
periodogram of a speech signal uses M = 64, N = 32 (or 50% overlap) and a Hamming window. No averaging of the power spectral density coefficient is used. The
result is shown in Figure 4.14. This display is often called a spectrogram in the
speech processing literature. From this figure, one clearly sees the time-frequency
behavior typical of signals that have time-varying spectra.
4.4.2
Cosine-Modulated Local Fourier Frames
In Chapter 2, we saw that a possible escape from the restriction imposed by the
discrete Balian-Low theorem was to replace complex-exponential modulation with
an appropriate cosine modulation, with an added advantage that all filters are real
if the prototype is real. While frames in general offer another such escape, cosinemodulated frames provides even more options.
Cosine Modulation Given a prototype filter p, one of the possible ways to use the
cosine modulation is (other ways leading to different classes of cosine-modulated
filter banks exist; see Further Reading for pointers):
gi,n = pn cos
2π
1
(i + )n + θi
2M
2
(4.88)
i
1 h jθi −(i+1/2)n
(i+1/2)n
= pn
e W2M
+ e−jθi W2M
,
2
i
1 h jθi
(i+1/2)
−(i+1/2)
Gi (z) =
e P (W2M
z) + e−jθi P (W2M
z) ,
2
i
1 h jθi
jω
Gi (e ) =
e P (ej(ω−(2π/2M)(i+1/2)) ) + e−jθi P (ej(ω+(2π/2M)(i+1/2) ) ,
2
for i = 0, 1, . . . , M − 1, and θi is a phase factor that gives us flexibility in designing
the representation. Compare the above with (4.86) for the complex-exponential
modulation; the difference is that given a real prototype filter, all the other filters
are real. Compare it also with (2.27) for the cosine modulation in bases. The two
expressions are identical; the difference is in the sampling factor N , smaller here
than the number of filters M .
Matrix View We look at a particular class of cosine-modulated frames, those with
filters of length L = 2N , a natural extension of the LOTs from Section 2.4.1 (see
also Further Reading). We choose the same phase factor as in (2.29), leading to
gi,n = pn cos
2π
1
M −1
(i + ) (n −
) ,
2M
2
2
(4.89)
for i = 0, 1, . . . , M − 1, n =
√ 0, 1, . . . , 2N − 1. We know that for a rectangular
prototype window, pn = 1/ M , the above filters form a tight frame since they
were obtained directly by seeding the LOT with the rectangular prototype window
(compare (4.89) to (2.30)). We follow the same analysis as we did in Section 2.4.1.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
164
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 4. Local Fourier and Wavelet Frames on Sequences
As in (2.34), we can express the frame matrix Φ as

..
 .

G0


G1 G0
Φ = 

G1 G0


G1


..
.




,




(4.90)
except that blocks Gi that contain synthesis filters’ impulse responses are now of
size 2N × M (instead of 2N × N ). Given that the frame is tight (and real),
G0 GT0 + G1 GT1 = I,
(4.91a)
G1 GT0
(4.91b)
=
G0 GT1
= 0.
Assume we want to impose a prototype window; then, as in (2.41), the windowed
impulse responses are G′0 = P0 G0 and G′1 = P1 G1 , where P0 and P1 are the N × N
diagonal matrices with the left and right tails of the prototype window p on the
diagonal, and if the prototype window is symmetric, P1 = JP0 J. We can thus
substitute G′0 and G′1 into (4.91) to verify that the resulting frame is indeed tight
′ ′T
G′0 G′T
= P0 G0 GT0 P0 + P1 G1 GT1 P1
0 + G1 G1
= P0 G0 GT0 P0 + JP0 JG1 GT1 JP0 J = I.
Unlike for the LOTs, G0 GT0 has no special structure now; its elements are given by
π(i+n+1)
π(i−n)
sin
sin
2
2
1
1
+
,
(G0 GT0 )i,n = ti,n =
2M sin π(i+n+1)
2M sin π(i−n)
2M
2M
where notation t for the elements of G0 GT0 is evocative of the frame matrix T = ΦΦ∗ .
This leads to the following conditions on the prototype window:
tn,n p2n + (1 − tn,n ) p2N −n−1 = 1,
pn pk = pN −n−1 pN −k−1 ,
for n = 0, 1, . . . , N −1, k = 0, 1, . . . , N −1, k 6= n. We can fix one coefficient; let us
choose p0 = −1, then pN −1 = ±1 and pk = −pN −1 pN −k−1 for k = 1, 2, . . . , N − 2.
A possible solution for the prototype window satisfying the above conditions is
(
− cos( Nnπ
−1 ), N = 2k + 1;
dn =
2nπ
− cos( N −1 ), N = 2k,
for n = 0, 1, . . . , N − 1; an example window design is given in Figure 4.15, with
coefficients as in Table 4.4.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
4.5. Wavelet Frames
165
pn
1.0
0.5
2
4
6
8
10
12
n
-0.5
-1.0
Figure 4.15: An example prototype window design for N = 7.
p0
−1
p1
√
− 3/2
p2
p3
p4
−1/2
0
1/2
p5
√
3/2
p6
1
Table 4.4: Prototype window used in Figure 4.15. The prototype window is symmetric,
so only half of the coefficients are shown.
4.5
Wavelet Frames
We now move from the Fourier-like frames to those that are wavelet-like. We have
seen examples of moving from bases to frames (DFT to harmonic tight frame,
for example, see Table 4.7), and we would like to do that in the wavelet case as
well. We start with the most obvious way to generate a frame from the DWT:
by removing some downsamplers. Then we move on to the predecessor of wavelet
frames originating in the work of Burt and Adelson on pyramid coding, and close
the section with the fully-redundant frames called shift-invariant DWT.
4.5.1
Oversampled DWT
How do we add redundancy starting from the DWT? We already mentioned that
an obvious way to do that was to remove some downsamplers, thereby getting a
finer time localization. Consider Figure 4.16(a), showing the sampling grid for the
DWT (corresponding to the wavelet tiling from Figure 3.7(d)): at each subsequent
level, only half of the points are present (half of the basis functions exist at that
scale). Ideally, we would like to, for each scale, insert additional points (one point
between every two). This can be achieved by having a DWT tree with the samplers
removed at all free branches (see Figure 4.17). We call this scheme oversampled
DWT, also known as the partial DWT (see Further Reading). The redundancy of
this scheme at level ℓ is Aj = 2, for a total redundancy of A = 2. The sampling
grid with J = 4 is depicted in Figure 4.16(b).
Example 4.8 (Oversampled DWT) Let us now look at a simple example
with J = 3. By moving upsamplers across filters, the filter bank in Figure 4.17
reduces to the one in Figure 4.18. The equivalent filters are then (we leave the
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
166
Chapter 4. Local Fourier and Wavelet Frames on Sequences
5
4
3
2
1
0
2
4
6
8
10
12
14
16
18
20
22
24
26
28
30
32
18
20
22
24
26
28
30
32
n
(a)
5
4
3
2
1
0
2
4
6
8
10
12
14
16
n
(b)
Figure 4.16: Sampling grids corresponding to the time-frequency tilings of (a) the DWT
(points—nonredundant) and (b) the oversampled DWT (squares—redundant).
h
h
h
level J
level 2
+
b
2
b
level 1
+
2
+
x
g
g
g
Figure 4.17: The synthesis part of the filter bank implementing the oversampled DWT.
The samplers are omitted at all the inputs into the bank. The analysis part is analogous.
H(z)
2
4
H(z 4 )
level 3
4
H(z 2 )
level 2
+
level 1
+
+
x
G(z)
G(z 2 )
G(z 4 )
Figure 4.18: The synthesis part of the equivalent filter bank implementing the oversampled DWT with J = 3 levels. The analysis part is analogous.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
4.5. Wavelet Frames
167
filter bank in its tree form as this is how it is actually implemented):40
H (1) (z) = H(z),
H (2) (z) = G(z)H(z 2 ),
(4.92a)
(4.92b)
H (3) (z) = G(z)G(z 2 )H(z 4 ),
G(3) (z) = G(z)G(z 2 )G(z 4 ),
(4.92c)
(4.92d)
and the frame can be expressed as
(1)
(2)
(3)
(3)
Φ = {hn−k , hn−2k , hn−4k , gn−4k }k∈Z .
(4.93)
The template vector h moves by 1, h(2) moves by multiples of 2, and h(3) and
g (3) move by multiples of 4. Thus, the basic block of the infinite matrix is of
size 8 × 16 (the smallest period after which it starts repeating itself, redundancy
of 2) and it moves by multiples of 8. However, even for filters such as Haar
for which the DWT would become a block transform (the infinite matrix Φ is
block diagonal, see (3.4)), here this is not the case. Substituting Haar filters (see
Table 1.8) into the expressions for H (1) , H (2) , H (3) and G(3) above, we get
H (1) (z) =
√1 (1
2
− z −1 ),
1
(1 + z −1 − z −2 − z −3 ),
2
1
H (3) (z) = √ (1 + z −1 + z −2 + z −3 − z −4 − z −5 − z −6 − z −7 ),
2 2
1
(3)
G (z) = √ (1 + z −1 + z −2 + z −3 + z −4 + z −5 + z −6 + z −7 ).
2 2
H (2) (z) =
Renaming the template frame vectors, we can rewrite the frame Φ as
ϕk,n
ϕ8+k,n
ϕ12+k,n
ϕ14+k,n
=
=
=
=
(1)
hn−k ,
(2)
hn−2k ,
(3)
hn−4k ,
(3)
gn−4k ,
k
k
k
k
= 0,
= 0,
= 0,
= 0,
1, . . . , 7;
1, 2, 3;
1;
1;
Φ = {ϕi,n−8k }k∈Z, i=0, 1, ..., 15 .
(4.94)
Compare this to the DWT example from Section 3.1.
4.5.2
Pyramid Frames
Pyramid frames were introduced for coding in 1983 by Burt and Adelson. Although
redundant, the pyramid coding scheme was developed for compression of images and
was recognized in the late 1980s as one of the precursors of wavelet octave-band
decompositions. The scheme works as follows: First, a coarse approximation α is
40 Remember
that superscript (ℓ) denotes the level in the tree.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
168
Chapter 4. Local Fourier and Wavelet Frames on Sequences
+
x
g
e
2
g
e
α
2
g
−
+
β
α
2
(a)
β
α
+
2
x
g
(b)
Figure 4.19: The (a) analysis and (b) synthesis part of the pyramid filter bank. This
scheme implements a frame expansion. The dashed line indicates the actual implementation, as in reality, the lowest branch would not be implemented; it is indicated here for
clarity and parallelism with two-channel filter banks.
derived (an example of how this could be done is in Figure 4.19).41 Then, from this
coarse version, the original is predicted (in the figure, this is done by upsampling
and filtering) followed by calculating the prediction error β. If the prediction is
good (as is the case for most natural images that have a lowpass characteristic), the
error will have a small variance and can thus be well compressed. The process can
be iterated on the coarse version. The outputs of the analysis filter bank are:
h
i
(a)
1/2
e 1/2 )X(z 1/2 ) + G(−z
e
)X(−z 1/2 ) ,
(4.95a)
α(z) = 12 G(z
i
h
(b)
e
e
+ G(−z)X(−z)
β(z) = X(z) − 21 G(z) G(z)X(z)
= X(z) − G(z) α(z 2 ),
(4.95b)
where (a) follows from (3.201a) and (b) from (1.77a). To reconstruct, we simply
upsample and interpolate the prediction α(z) and add it back to the prediction
error β(z):
G(z) α(z 2 ) + β(z) = X(z).
(4.96)
Upsampling and interpolating is, however, only one way to obtain the prediction
back at full resolution; any appropriate operator (even a nonlinear one) could have
been simply inverted by subtraction. We can also see that in the figure, the redundancy of the system is 50%; α is at half resolution while β is at full resolution,
that is, after analysis, we have 50% more samples than we started with. With the
analysis given in Figure 4.19(a), we now have several options:
41 While in the figure the intensity of the coarse approximation α is obtained by linear filtering
and downsampling, this need not be so; in fact, one of the powerful features of the original scheme
is that any operator can be used, not necessarily linear.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
4.5. Wavelet Frames
β
α
169
e
h
2
β′
β′
α
α
(a)
2
h
2
g
+
x
(b)
Figure 4.20: The pyramid filter bank implementing a basis expansion. With {g, e
g , h, e
h}
a biorthogonal set, the scheme implements a biorthogonal basis expansion, while with g
and ge orthogonal, that is, gen = g−n and g satisfies (1.13), the scheme implements an
orthonormal basis expansion. The output β from Figure 4.19(a) goes through (a) filtering
and downsampling creating a new output β ′ . (b) Synthesis part.
• Synthesis is performed by upsampling and interpolating α by g as in Figure 4.19(b). In this case, the resulting scheme is clearly redundant, as we
have just discussed, and implements a frame expansion, which can be either:
(i) general, when filters g and ge are biorthogonal (they satisfy (1.66)), or,
(ii) tight, when filters g and ge are orthogonal, that is, gen = g−n and g satisfies
(1.13). We illustrate this case in Example 4.9.
• The analysis goes through one more stage, as in Figure 4.20(a), and synthesis
is performed as in Figure 4.20(b). In this case, the scheme implements a basis
expansion, which can be either (both are illustrated in Exercise ??):
(i) biorthogonal, when filters g and e
g are biorthogonal, or,
(ii) orthonormal, when filters g and ge are orthogonal.
Example 4.9 We use the pyramid filter bank as in Figure 4.19. Let us assume
that g is the Haar lowpass filter from (1.1a) and that e
gn = g−n . Then we know
from Chapter 1, that β is nothing else but the output of the highpass branch,
given in (??). For every two input samples, while α produces one output sample,
β produces two output samples; thus, the redundancy. We can write this as:



 1
√
√1
αn
2
2
β2n  =  1 − 1  x2n
.
2
2
x2n+1
1
β2n+1
− 21
|
{z 2 }
eT
Φ
We know, however, from our previous discussion that the above matrix is the
e T . Finding its canonical dual, we get that Φ = Φ,
e and thus,
dual frame matrix Φ
this pyramid scheme implements a tight frame expansion.
The redundancy for pyramid frames is A1 = 3/2 at level 1, A2 = 7/4 at level 2,
leading to A∞ = 2 (see Figure 4.21), far less than the shift-invariant DWT construction we will see in a moment. Thanks to this constant redundancy, pyramid coding
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
170
Chapter 4. Local Fourier and Wavelet Frames on Sequences
5
4
3
2
1
0
2
4
6
8
10
12
14
16
18
20
22
24
26
28
30
32
n
Figure 4.21: Sampling grid corresponding to the time-frequency tiling of the pyramid
coding scheme (points—nonredundant, squares—redundant).
x
β1
β2
α2
Figure 4.22: Two-level pyramid decomposition of an image x. A first-level coarse
approximation α1 is computed. A first-level prediction error β1 is obtained as the difference
of x and the prediction calculated on α1 . A second-level coarse approximation α2 is
computed. A second-level prediction error β2 is obtained as the difference of α1 and the
prediction calculated on α2 . The scheme is redundant, as the total number of samples
in expansion coefficients β1 , β2 , α2 is (1 + 1/4 + 1/16) times the number original image
samples, yielding redundancy of about 31%.
has been used together with directional coding to form the basis for nonseparable
multidimensional frames called contourlets (see Further Reading). An example of
a pyramid decomposition of an image is given in Figure 4.22.
4.5.3
Shift-Invariant DWT
The shift-invariant DWT is basically the nondownsampled DWT (an example for
J = 3 levels is shown in Figure 4.23). It is sometimes called stationary wavelet
transform, or, algorithme à trous,42 due to the its implementation algorithm by the
same name (see Section 4.6.1).
Let g and h be the filters used in this filter bank. At level ℓ we will have
equivalent upsampling by 2ℓ , which means that the filter moved across the upsampler will be upsampled by 2ℓ , inserting (2ℓ − 1) zeros between every two samples
and thus creating holes (thus algorithm with holes).
Figure 4.24 shows the sampling grid for the shift-invariant DWT, from where
it is clear that this scheme is completely redundant, as all points are computed.
This is in contrast to a completely nonredundant scheme such as the DWT shown
in Figure 4.16(a). In fact, while the redundancy per level of this algorithm grows
exponentially since A1 = 2, A2 P
= 4, . . ., AJ = 2J , . . ., the total redundancy for J
−J
levels is linear, as A = AJ 2 + Jℓ=1 Aℓ 2−ℓ = (J +1). This growing redundancy is
42 From French for algorithm with holes, coming from the computational method that can take
advantage of upsampled filter impulse responses, discussed in Section 4.6.1.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
4.6. Computational Aspects
171
H(z)
H(z 2 )
H(z 4 )
level 1
+
level 2
x
G(z)
G(z 2 )
+
level 3
+
G(z 4 )
Figure 4.23: The synthesis part of the equivalent 3-channel filter bank implementing the
shift-invariant DWT with J = 3 levels. The analysis part is analogous and filters are given
in (4.92). This is the same scheme as in Figure 4.18 with all the upsamplers removed.
5
4
3
2
1
0
2
4
6
8
10
12
14
16
18
20
22
24
26
28
30
32
n
Figure 4.24: Sampling grid corresponding to the time-frequency tiling of the shiftinvariant DWT (points—nonredundant, squares—redundant).
the price we pay for shift invariance as well as the simplicity of the algorithm. The
2D version of the algorithm is obtained by extending the 1D
in a separable
Pversion
J
manner, leading to the total redundancy of A = AJ 2−J + 3 ℓ=1 Aℓ 2−ℓ = (3J + 1).
Exercise ?? illustrates the redundancy of such a frame.
4.6
4.6.1
Computational Aspects
The Algorithm à Trous
This algorithm was introduced as a fast implementation of the dyadic (continuous)
wavelet transform by Holschneider, Kronland-Martinet, Morlet, and Tchamitchian
in 1989, and corresponds to the DWT with samplers removed. We introduced it in
Section 4.5 as shift-invariant DWT and showed an example for J = 3 in Figure 4.23.
The equivalent filters in each branch are computed first, and then, the samplers are
removed. Because the equivalent filters are convolutions with upsampled filters, the
algorithm can be efficiently computed due to holes produced by upsampling.
4.6.2
Efficient Gabor and Spectrum Computation
4.6.3
Efficient Sparse Frame Expansions
Matching Pursuit
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
172
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 4. Local Fourier and Wavelet Frames on Sequences
aTrous(α(0) )
Input: x = α(0) , the input signal.
Output: α(J ) , β (ℓ) , ℓ = 1, 2, . . . , J, transform coefficients.
initialize
for ℓ = 1 to J do
α(ℓ) = α(ℓ−1) ∗ (↑ 2ℓ−1 )g
β (ℓ) = α(ℓ−1) ∗ (↑ 2ℓ−1 )h
end for
return α(J ) , β (ℓ) , ℓ = 1, 2, . . . , J
Table 4.5: Algorithme à trous implementing the shift-invariant DWT. Upsampling an
impulse response g by a factor of n is denoted by (↑ n)g.
Orthonormal Matching Pursuit
Linear Programming
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter at a Glance
173
Chapter at a Glance
This chapter relaxed the constraint of nonredundancy bases carry, using frames to achieve
robustness and freedom in choosing not only the best expansion, but also, given a fixed
expansion, the best expansion coefficients under desired constraints. We introduced these
mostly on finite-dimensional frames, as they can be easily visualized via rectangular matrices. The infinite-dimensional frames we discussed were only those implementable by
oversampled filter banks, summarized in Table 4.6.
Block diagram
g̃M −1
N
αM −1
N
gM −1
b
b
b
b
x
b
+
x
b
g̃0
Basic characteristics
number of channels
sampling factor
channel sequences
N
α0
M >N
N
αi,n
N
g0
i = 0, 1, . . . , M − 1
Filters
Synthesis Analysis
filter i
gi,n
gi,n
e
i = 0, 1, . . . , M − 1
polyphase component j
gi,j,n
gi,j,n
e
j = 0, 1, . . . , N − 1
Table 4.6: Oversampled filter bank.
Block
Overlapped
transforms
transforms
(Fourier-like)
Time-frequency
constraints
(wavelet-like)
Bases
DFT
LOT
DWT
Frames
HTF
Local Fourier
Oversampled DWT
Table 4.7: Bases versus frames.
We discussed two big classes of frames following their counterparts in bases: local Fourier
frames and wavelet frames. Table 4.7 depicts relationships existing between various classes
of bases and frames. For example, the block-transform counterpart of the DFT are the harmonic tight frames, while the same for the LOT will be local Fourier frames, obtained by
both complex-exponential modulation as well as cosine modulation. By increasing the support of basis functions we can go from the DFT to the LOT, and similarly, from harmonic
tight frames to local Fourier frames. Imposing time-frequency constraints leads to new
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
174
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 4. Local Fourier and Wavelet Frames on Sequences
classes of representations, such as the DWT, whose frame counterpart is the oversampled
DWT.
Historical Remarks
In the signal processing and harmonic analysis communities, frames are generally considered to have been born in 1952 in the paper by Duffin and Schaeffer [36]. Despite being
over half a century old, frames gained popularity only in the 1990s, due mostly to the work
of three wavelet pioneers—Daubechies, Grossman and Meyer [32]. An important piece to
understanding frames came with Naimark’s theorem, known for a long time in operator
algebra and used in quantum information theory, and rediscovered by several people in
the 1990s, among others, Han and Larson [42]; they came up with the idea that a frame
could be obtained by compressing a basis in a larger space.
The idea behind the class of complex exponential-modulated frames, consisting of
many families, dates back to Gabor [40] with insight of constructing bases by modulation of
a single prototype function. Gabor originally used complex-exponential modulation, and
thus, all those families with the same type of modulation are termed complex exponentialmodulated frames, or sometimes, Gabor frames. Other types of modulation are possible,
such as cosine modulation, and again, all those families with cosine modulation are termed
cosine-modulated frames.
Frame-like ideas, that is, building redundancy into a signal expansion, can be found
in numerous fields, from source and channel coding, to communications, classification,
operator and quantum theory.
Further Reading
Books and Textbooks The sources on frames are the book by Daubechies [31], a text
by Christensen [22] a number of classic papers [18, 30, 42, 43] as well as an introductory
tutorial on frames by Kovačević and Chebira [54].
Results on Frames A thorough analysis of oversampled filter banks implementing frame
expansions is given in [12, 27, 28]. Following up on the result of Benedetto and Fickus [8]
on minimizing frame potential from Section 4.2.1, Cassaza, Fickus, Kovačević, Leon and
Tremain extended the result to nonequal-norm tight frames, giving rise to the fundamental
inequality, which has ties to the capacity region in synchronous CDMA systems [110].
Casazza and Kutyniok in [20] investigated Gram-Schmidt-like procedure for producing
tight frames. In [9], the authors introduce a quantitative notion of redundancy through
local redundancy and a redundancy function, applicable to all finite-dimensional frames.
Local Fourier Frames For finite-dimensional frames, similar ideas to those of harmonic
tight frames have appeared in the work by Eldar and Bölcskei [37] under the name geometrically uniform frames, frames defined over a finite Abelian group of unitary matrices both
with a single as well as multiple generators. Harmonic tight frames have been generalized
in the works by Vale and Waldron [102], as well as Casazza and Kovačević [19].
Harmonic tight frames, as well as equiangular frames (where |⟨ϕ_i, ϕ_j⟩| is a constant) [71], have strong connections to Grassmannian frames. In a comprehensive paper [89], Strohmer and Heath discuss those frames and their connection to Grassmannian
packings, spherical codes, graph theory and Welch Bound sequences (see also [48]).
The lack of infinite-dimensional bases with good time and frequency localization,
the result of the discrete Balian-Low theorem, prompted the development of oversampled
filter banks that use complex-exponential modulation. They are known under various
names: oversampled DFT filter banks, complex exponential-modulated filter banks, short-time Fourier filter banks and Gabor filter banks, and have been studied in [10–12, 26, 39, 88].
Bölcskei and Hlawatsch in [11] have studied the other type of modulation, cosine. The
connection between these two classes is deep as there exists a general decomposition of
the frame operator corresponding to a cosine-modulated filter bank as the sum of the
frame operator of the underlying complex exponential-modulated frame and an additional
operator, which vanishes under certain conditions [6]. The lapped tight frame transforms
were proposed as a way to obtain a large number of frames by seeding from LOTs [21, 76].
Wavelet Frames Apart from those already discussed, like pyramid frames [15], many
other wavelet-like frame families have been proposed, among them, the dual-tree complex
wavelet transform, a nearly shift-invariant transform with redundancy of only 2, introduced by Kingsbury [51–53]. Selesnick in [77, 78] followed with the double-density DWT
and variations, which can approximately be implemented using a 3-channel filter bank with
sampling by 2, again nearly shift invariant with redundancy that tends towards 2 when
iterated. Some other variations include power-shiftable DWT [81] or partial DWT [85],
which removes samplers at the first level but leaves them at all other levels, with redundancy Aj = 2 at each level and again near shift invariance. Bradley in [13] introduces the
overcomplete DWT, the DWT with critical sampling for the first k levels followed by the
shift-invariant DWT for the last j − k levels.
Multidimensional Frames  Apart from the obvious, tensor-like constructions of multidimensional frames, true multidimensional solutions exist. The oldest multidimensional
frame seems to be the steerable pyramid introduced by Simoncelli, Freeman, Adelson
and Heeger in 1992 [81], following on the previous work by Burt and Adelson on pyramid coding [15]. The steerable pyramid possesses many nice properties, such as joint
space-frequency localization, approximate shift invariance, approximate tightness and approximate rotation invariance. An excellent overview of the steerable pyramid and its
applications is given on Simoncelli’s web page [80].
Another multidimensional example is the work of Do and Vetterli on contourlets [25,
34], motivated by the need to construct efficient and sparse representations of intrinsic
geometric structure of information within an image. The authors combine the ideas of
pyramid filter banks [33] with directional processing, to obtain contourlets, expansions
capturing contour segments. These are almost critically sampled, with redundancy of
1.33.
Some other examples include [58], where the authors build both critically-sampled and shift-invariant 2D DWTs. Many "-lets" are also multidimensional frames, such as curvelets [16, 17] and shearlets [56]. As the name implies, curvelets are used to approximate
curved singularities in an efficient manner [16, 17]. As opposed to wavelets, which use
dilation and translation, shearlets use dilation, shear transformation and translation, and
possess useful properties such as directionality, elongated shapes and many others [56].
Applications of Frames Frames have become extremely popular and have been used in
many application fields. The text by Kovačević and Chebira [54] contains an overview of
many of these and a number of relevant references. In some fields, frames have been used
for years, for example in CDMA systems, in the work of Massey and Mittelholzer [63] on
Welch bound and sequence sets for CDMA systems. It turns out that the Welch bound is
equivalent to the frame potential minimization inequality. The equivalence between unit-norm tight frames and Welch bound sequences was shown in [89]. Waldron formalized that equivalence for general tight frames in [111], and consequently, tight frames are referred to in some works as Welch bound sequences [92].
Chapter 5
Local Fourier Transforms, Frames and Bases on Functions
Contents
5.1  Introduction                          178
5.2  Local Fourier Transform               178
5.3  Local Fourier Frame Series            188
5.4  Local Fourier Series                  188
5.5  Computational Aspects                 188
     Chapter at a Glance                   188
     Historical Remarks                    188
     Further Reading                       188
Dear Reader,
This chapter needs to be finished. The only existing section, Section 5.2, has been proofread and integrated with the previous text. The rest of the
sections are yet to be written.
Please read on.
— MV, JK, and VKG
The aim of this chapter follows that of Chapter 2, but for functions. We look for ways to localize the analysis the Fourier transform provides by windowing the complex exponentials. As before, this will improve the time localization of the corresponding transform at the expense of the frequency localization. The original idea dates back to Gabor, and thus the name Gabor transform is frequently used, as are windowed Fourier transform and short-time Fourier transform. We choose the intuitive local Fourier transform, as a counterpart to local Fourier bases from Chapter 2 and local Fourier frames from Chapter 4.
We start with the most redundant one, the local Fourier transform, and then sample to obtain local Fourier frames. With critical sampling we then try for local Fourier bases, where, not surprisingly after what we have seen in Chapter 2, bases with simultaneously good time and frequency localization do not exist, the result
known as the Balian-Low theorem. Again, as in Chapter 2, cosine-modulated local Fourier bases do exist, as do the wavelet ones we discuss in the next chapter.
5.1 Introduction
Fourier Series Basis Expansion
Localization Properties of the Fourier Series
Chapter Outline
We start with the most redundant version, the local Fourier transform, in Section 5.2, and then sample to obtain local Fourier frames in Section 5.3. With critical sampling we then try for local Fourier bases in Section 5.4, where, not surprisingly after what we have seen in Chapter 2, complex exponential-modulated local Fourier bases with simultaneously good time and frequency localization do not exist, the result known as the Balian-Low theorem. Again, as in Chapter 2, cosine-modulated local Fourier bases do exist, as do the wavelet ones we discuss in the next chapter.
Notation used in this chapter: The prototype window function in this chapter is
named p(t); this is for consistency with the prototype window sequences used in
Chapters 2 and 4; g(t) is more commonly seen in the literature.
5.2 Local Fourier Transform
Given a function x(t), we start with its Fourier transform X(ω) as in Definition 4.10. We analyze x(t) locally by using a prototype window function p(t). We will assume p(t) is real and symmetric, p(t) = p(−t). The prototype function should be smooth as well; in particular, it should be smoother than the function to be analyzed.43
5.2.1 Definition of the Local Fourier Transform
We can look at our windowing with the prototype function p(t) in two ways:
(i) We window the function x(t) as

        x_τ(t) = p(t − τ) x(t),                                                        (5.1)

    and then take its Fourier transform (4.42a),

        X_τ(ω) = ⟨x_τ, v_ω⟩ = ∫_{t∈ℝ} x_τ(t) e^{−jωt} dt = ∫_{t∈ℝ} x(t) p(t − τ) e^{−jωt} dt,    (5.2)

    for ω ∈ ℝ.
43 Otherwise, it will interfere with the smoothness of the function to be analyzed; see Section 5.2.2.
Figure 5.1: Local Fourier transform. The prototype function p(t) is centered at τ , and
thus, the Fourier transform only sees the neighborhood around τ . For simplicity, a triangle
prototype function is shown; in practice, smoother ones are used.
(ii) We window the complex exponentials v_Ω(t) = e^{jΩt}, yielding

        g_{Ω,τ}(t) = p(t − τ) e^{jΩt},    G_{Ω,τ}(ω) = e^{−j(ω−Ω)τ} P(ω − Ω),          (5.3)

    for τ, Ω ∈ ℝ, and then define a new transform by taking the inner product between x and g_{Ω,τ}(t) as

        X(Ω, τ) = ⟨x, g_{Ω,τ}⟩ = ∫_{t∈ℝ} x(t) p(t − τ) e^{−jΩt} dt,                    (5.4)

    that is, this new transform X(Ω, τ) is the Fourier transform of the windowed function x_τ as in (5.2).
From the construction, it is clear why this is called the local Fourier transform, as shown
in Figure 5.1. We are now ready to formally define it:
Definition 5.1 (Local Fourier transform)  The local Fourier transform of a function x(t) is a function of Ω, τ ∈ ℝ given by

        X(Ω, τ) = ⟨x, g_{Ω,τ}⟩ = ∫_{t∈ℝ} x(t) p(t − τ) e^{−jΩt} dt,    Ω, τ ∈ ℝ.       (5.5a)

The inverse local Fourier transform of X(Ω, τ) is

        x(t) = (1/2π) ∫_{Ω∈ℝ} ∫_{τ∈ℝ} X(Ω, τ) g_{Ω,τ}(t) dΩ dτ.                        (5.5b)

To denote such a local Fourier-transform pair, we write:

        x(t)  ⟷^{LFT}  X(Ω, τ).
We will prove the inversion formula (5.5b) in a moment. For the analysis of a function x(t), the local Fourier transform X(Ω, τ) uses time-frequency atoms g_{Ω,τ}(t) that are centered
around Ω and τ, as shown schematically in Figure 5.2. The local Fourier transform is highly redundant, mapping a one-dimensional function x(t) into a two-dimensional transform X(Ω, τ).

Figure 5.2: Time-frequency atom used in the local Fourier transform. (a) Time-domain waveform g_{Ω,τ}(t). The prototype function is the triangle function, and the real and imaginary parts of the complex exponential-modulated prototype function are shown. (b) Schematic time-frequency footprint of g_{Ω,τ}(t).
Prototype Window Function  The prototype function p(t) is critical in the local Fourier transform. The classical choice for p(t) is the unit-norm version of the Gaussian function given in (4.11a) with γ = (2α/π)^{1/4}:

        p(t) = (2α/π)^{1/4} e^{−α(t−µ)²},                                               (5.6)

where α is a scale parameter allowing us to tune the time resolution of the local Fourier transform.
Another classic choice is the unit-norm sinc function from (4.75),

        p(t) = √(ω0/(2π)) · sin(ω0 t/2) / (ω0 t/2),                                     (5.7)

that is, a perfect lowpass of bandwidth |ω| ≤ ω0/2. Here, the scale parameter allows us to tune the frequency resolution of the local Fourier transform.
Other prototype functions of choice include rectangular, triangular (hat), or higher-degree spline functions, as well as other classic prototype functions from spectral analysis. An example is the Hanning, or raised-cosine, window (we have seen its discrete counterpart in (3.15)); its unit-norm version is defined as

        p(t) = { √(2/(3α)) (1 + cos(2πt/α)),   |t| ≤ α/2;
               { 0,                            otherwise,
where α is a scale parameter.44
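As a quick numerical illustration (our own sketch, not from the text; the parameter values, grid, and function names such as gaussian_window are arbitrary choices, assuming only NumPy), the code below builds the unit-norm Gaussian prototype (5.6) and the Hanning prototype above and checks that both have unit L² norm on a fine grid.

```python
# Sketch (not from the text): unit-norm Gaussian and Hanning prototype windows.
import numpy as np

def gaussian_window(t, alpha, mu=0.0):
    """Unit-norm Gaussian prototype (5.6): (2*alpha/pi)^(1/4) * exp(-alpha*(t - mu)^2)."""
    return (2.0 * alpha / np.pi) ** 0.25 * np.exp(-alpha * (t - mu) ** 2)

def hanning_window(t, alpha):
    """Unit-norm raised-cosine (Hanning) prototype, supported on |t| <= alpha/2."""
    p = np.sqrt(2.0 / (3.0 * alpha)) * (1.0 + np.cos(2.0 * np.pi * t / alpha))
    return np.where(np.abs(t) <= alpha / 2.0, p, 0.0)

t = np.linspace(-10.0, 10.0, 200001)             # fine grid for a numerical L2 norm
for name, p in [("Gaussian", gaussian_window(t, alpha=1.0)),
                ("Hanning", hanning_window(t, alpha=4.0))]:
    print(name, np.sqrt(np.trapz(p ** 2, t)))    # both should print values close to 1
```

Either window could be substituted for the other in what follows; the choice only shifts the balance between time and frequency resolution.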
Inversion of the Local Fourier Transform While we have taken for granted that
the inversion formula (5.5b) holds, this is not a given. However, given the redundancy present in the local Fourier transform, we expect such an inversion to be
possible, which we now prove.
We are going to apply the generalized Parseval's equality to (5.5b), and we thus need the Fourier transform of X(Ω, τ) with respect to τ. We have that

        X(Ω, τ) = ∫_{t∈ℝ} x(t) p(t − τ) e^{−jΩt} dt
                = ∫_{t∈ℝ} p(τ − t) x(t) e^{−jΩt} dt                                     (a)
                = (p ∗ x_Ω)(τ),                                                          (b)

where (a) follows from p(t) = p(−t), and in (b) we introduced x_Ω(t) = x(t) e^{−jΩt}. Using the shift-in-frequency property (4.54), the Fourier transform of x_Ω(t) is X(ω + Ω). Then, using the convolution property (4.61), the Fourier transform of X(Ω, τ) with respect to τ becomes

        X(Ω, ω) = P(ω) X(ω + Ω).                                                        (5.8)
In (5.5b), the other term involving τ is g_{Ω,τ}(t) = p(t − τ) e^{jΩt}. Using the shift-in-time property (4.53) and because p(t) is symmetric, the Fourier transform of p(t − τ) with respect to τ is

        p(t − τ)  ⟷^{FT}  e^{−jωt} P(ω).                                                (5.9)
We now apply the generalized Parseval's equality (4.70b) to the right side of (5.5b):

        (1/2π) ∫_{Ω∈ℝ} ∫_{τ∈ℝ} X(Ω, τ) p(t − τ) e^{jΩt} dτ dΩ
            = (1/2π) ∫_{Ω∈ℝ} (1/2π) ∫_{ω∈ℝ} X(ω + Ω) P(ω) P*(ω) e^{jωt} e^{jΩt} dω dΩ    (a)
            = (1/2π) ∫_{ω∈ℝ} |P(ω)|² (1/2π) ∫_{Ω∈ℝ} X(ω + Ω) e^{j(ω+Ω)t} dΩ dω           (b)
            = x(t) (1/2π) ∫_{ω∈ℝ} |P(ω)|² dω                                              (c)
            = x(t),                                                                        (d)

where (a) follows from (5.8), (5.9) and the generalized Parseval's equality (4.70b); (b) from Fubini's theorem (see Appendix 2.A.3), allowing for the exchange of the order of integration; (c) from the inverse Fourier transform (4.42b); and (d) from p being of unit norm and Parseval's equality (4.70a).
44 In the signal processing literature, the normalization factor is usually 1/2, such that p(0) = 1.
5.2.2 Properties of the Local Fourier Transform
We now look into the main properties of the local Fourier transform, including
energy conservation, followed by basic characteristics such as localization properties
and examples, including spectrograms, which are density plots of the magnitude of
the local Fourier transform.
TBD: Table with properties.
Linearity  The local Fourier transform operator is a linear operator, or,

        α x(t) + β y(t)  ⟷^{LFT}  α X(Ω, τ) + β Y(Ω, τ).                                (5.10)
Shift in Time  A shift in time by t0 results in

        x(t − t0)  ⟷^{LFT}  e^{−jΩt0} X(Ω, τ − t0).                                     (5.11)

This is to be expected as it follows from the shift-in-time property of the Fourier transform, (4.53). To see that,

        ∫_{t∈ℝ} p(t − τ) x(t − t0) e^{−jΩt} dt = e^{−jΩt0} ∫_{t′∈ℝ} p(t′ − (τ − t0)) x(t′) e^{−jΩt′} dt′    (a)
                                               = e^{−jΩt0} X(Ω, τ − t0),                                    (b)

where (a) follows from the change of variable t′ = t − t0; and (b) from the definition of the local Fourier transform (5.5a). Thus, a shift by t0 simply shifts the local Fourier transform and adds a phase factor. The former illustrates the locality of the local Fourier transform, while the latter follows from the equivalent Fourier-transform property.
Shift in Frequency  A shift in frequency by ω0 results in

        e^{jω0 t} x(t)  ⟷^{LFT}  X(Ω − ω0, τ).                                           (5.12)

To see this,

        ∫_{t∈ℝ} p(t − τ) e^{jω0 t} x(t) e^{−jΩt} dt = ∫_{t∈ℝ} p(t − τ) x(t) e^{−j(Ω−ω0)t} dt = X(Ω − ω0, τ),

the same as for the Fourier transform. As before, a shift in frequency is often referred to as modulation, and is dual to the shift in time.
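Both shift properties are easy to confirm numerically. The sketch below is our own illustration (not from the text): it discretizes (5.5a) on a finite grid with a unit-norm Gaussian prototype and an arbitrary test function (the grid, the helper name lft, and all parameter values are our own choices) and checks the shift-in-time relation (5.11) up to grid and truncation error.

```python
# Sketch (not from the text): numerical check of the shift-in-time property (5.11).
import numpy as np

t = np.linspace(-20.0, 20.0, 4001)
dt = t[1] - t[0]

def lft(x, Omega, tau):
    """Discretized X(Omega, tau) with a unit-norm Gaussian prototype (alpha = 1)."""
    p = (2.0 / np.pi) ** 0.25 * np.exp(-(t - tau) ** 2)
    return np.sum(x * p * np.exp(-1j * Omega * t)) * dt

x = np.exp(-0.1 * t ** 2) * np.cos(3.0 * t)                        # arbitrary test function
t0 = 2.0
x_shift = np.exp(-0.1 * (t - t0) ** 2) * np.cos(3.0 * (t - t0))    # x(t - t0) on the same grid

Omega, tau = 3.0, 1.0
lhs = lft(x_shift, Omega, tau)
rhs = np.exp(-1j * Omega * t0) * lft(x, Omega, tau - t0)
print(abs(lhs - rhs))                                              # tiny, up to discretization error
```

The shift-in-frequency relation (5.12) can be checked the same way by multiplying x by e^{jω0 t} before calling lft.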
Parseval's Equality  The local Fourier-transform operator is a unitary operator and thus preserves the Euclidean norm (see (2.53)):

        ‖x‖² = ∫_{t∈ℝ} |x(t)|² dt = (1/2π) ∫_{Ω∈ℝ} ∫_{τ∈ℝ} |X(Ω, τ)|² dΩ dτ = (1/2π) ‖X‖².    (5.13)
We now prove Parseval's equality for functions that are both in L¹ and L²: It should come as no surprise that to derive Parseval's equality for the local Fourier transform, we use Parseval's equality for the Fourier transform. Start with the right side of (5.13) to get

        (1/2π) ∫_{Ω∈ℝ} ∫_{τ∈ℝ} |X(Ω, τ)|² dτ dΩ
            = (1/2π) ∫_{Ω∈ℝ} (1/2π) ∫_{ω∈ℝ} |X(ω + Ω) P(ω)|² dω dΩ                      (a)
            = (1/2π) ∫_{ω∈ℝ} |P(ω)|² (1/2π) ∫_{Ω∈ℝ} |X(ω + Ω)|² dΩ dω                   (b)
            = ‖x‖² (1/2π) ∫_{ω∈ℝ} |P(ω)|² dω                                             (c)
            = ‖x‖²,                                                                       (d)

where (a) follows from Parseval's equality (4.70a) and (5.8); (b) from Fubini's theorem (see Appendix 2.A.3), allowing for the exchange of the order of integration; (c) from Parseval's equality (4.70a) applied to X(ω + Ω); and (d) from p being of unit norm and Parseval's equality (4.70a).
Redundancy  The local Fourier transform maps a function of one variable into a function of two variables. It is thus highly redundant, and this redundancy is expressed by the reproducing kernel:

        K(Ω, τ, ω0, t0) = ⟨g_{Ω,τ}, g_{ω0,t0}⟩ = ∫_{t∈ℝ} p(t − τ) p(t − t0) e^{j(ω0−Ω)t} dt.      (5.14)

While this is a four-dimensional object, its magnitude depends only on the two differences (ω0 − Ω) and (t0 − τ).45
Theorem 5.2 (Reproducing kernel formula for the local Fourier transform)  A function X(ω0, t0) is the local Fourier transform of some function x(t) if and only if it satisfies

        X(Ω, τ) = (1/2π) ∫_{t0∈ℝ} ∫_{ω0∈ℝ} X(ω0, t0) K(Ω, τ, ω0, t0) dω0 dt0.           (5.15)
45 This is expressed in a closely related function called the ambiguity function; see Exercise ??.

Proof. If X(ω0, t0) is a local Fourier transform, then there is a function x(t) such that
X(ω0, t0) = ⟨x, g_{ω0,t0}⟩, or

        X(Ω, τ) = ∫_{t∈ℝ} x(t) g*_{Ω,τ}(t) dt
                = ∫_{t∈ℝ} ( (1/2π) ∫_{t0∈ℝ} ∫_{ω0∈ℝ} X(ω0, t0) g_{ω0,t0}(t) dω0 dt0 ) g*_{Ω,τ}(t) dt      (a)
                = (1/2π) ∫_{t0∈ℝ} ∫_{ω0∈ℝ} X(ω0, t0) ( ∫_{t∈ℝ} g_{ω0,t0}(t) g*_{Ω,τ}(t) dt ) dω0 dt0
                = (1/2π) ∫_{t0∈ℝ} ∫_{ω0∈ℝ} X(ω0, t0) K(Ω, τ, ω0, t0) dω0 dt0,                             (b)

where (a) follows from the inversion formula (5.5b) and (b) from (5.14).
For the converse, write (5.15) by making K(Ω, τ, ω0, t0) explicit as an integral over t (see (5.14)):

        X(Ω, τ) = (1/2π) ∫_{t0∈ℝ} ∫_{ω0∈ℝ} ∫_{t∈ℝ} X(ω0, t0) g*_{Ω,τ}(t) g_{ω0,t0}(t) dt dω0 dt0
                = ∫_{t∈ℝ} g*_{Ω,τ}(t) ( (1/2π) ∫_{t0∈ℝ} ∫_{ω0∈ℝ} X(ω0, t0) g_{ω0,t0}(t) dω0 dt0 ) dt      (a)
                = ∫_{t∈ℝ} g*_{Ω,τ}(t) x(t) dt,                                                             (b)

where (a) follows from Fubini's theorem (see Appendix 2.A.3), allowing for the exchange of the order of integration, and (b) from the inversion formula (5.5b). Therefore, X(Ω, τ) is indeed a local Fourier transform, namely the local Fourier transform of x(t).
The redundancy present in the local Fourier transform allows sampling and interpolation, and the interpolation kernel depends on the reproducing kernel.
Characterization of Singularities and Smoothness To characterize singularities,
we will take the view that the local Fourier transform is a Fourier transform of a
windowed function xτ (t) as in (5.1). Since this is a product between the function
and the prototype function, using the convolution-in-frequency property (4.64), in
the Fourier domain this is a convolution. That is, singularities are smoothed by the
prototype function.
We now characterize singularities in time and frequency, as depicted in Figure 5.3.
(i) Characterization of singularities in time: Take a function perfectly localized in time, the Dirac delta function x(t) = δ(t − t0). Then

        X(Ω, τ) = ∫_{t∈ℝ} p(t − τ) δ(t − t0) e^{−jΩt} dt  =(a)  p(t0 − τ) e^{−jΩt0},

    where (a) follows from Table 4.1. This illustrates the characterization of singularities in time by the local Fourier transform: an event at time location t0 will spread around t0 according to the prototype function, and this across all frequencies. If p(t) has compact support [−T/2, T/2], then X(Ω, τ) has support (−∞, ∞) × [t0 − T/2, t0 + T/2].
Figure 5.3: Localization properties of the local Fourier transform. (a) A function perfectly localized in time, a Dirac delta function at τ, with a compactly supported prototype function [−T/2, T/2]. (b) A function perfectly localized in frequency, a complex exponential function of frequency ω, with a prototype function having a compactly supported Fourier transform [−B/2, B/2].
(ii) Characterization of singularities in frequency: Take now a function perfectly localized in frequency, a complex exponential function x(t) = e^{jωt}. Then,

        X(Ω, τ) = ∫_{t∈ℝ} p(t − τ) e^{−j(Ω−ω)t} dt
                = e^{−j(Ω−ω)τ} ∫_{t′∈ℝ} p(t′) e^{−j(Ω−ω)t′} dt′                         (a)
                = e^{−j(Ω−ω)τ} P(Ω − ω),                                                 (b)   (5.16)

    where (a) follows from the change of variables t′ = t − τ; and (b) from the Fourier transform of p(t). This illustrates the characterization of singularities in frequency by the local Fourier transform: an event at frequency location ω will spread around ω according to the prototype function, and this across all time. If P(ω) has compact support [−B/2, B/2], then X(Ω, τ) has support [ω − B/2, ω + B/2] × (−∞, ∞).
What is important to understand is that if singularities appear together within a
prototype function, they appear mixed in the local Fourier transform domain. This
is unlike the continuous wavelet transform we will see in the next chapter, where
arbitrary time resolution is possible for the scale factor going to 0.
If the prototype function is smoother than the function to be analyzed, then
the type of singularity (assuming there is a single one inside the prototype function)
is determined by the decay of the Fourier transform.
Example 5.1 (Singularity characterization of the local Fourier transform)  Let us consider, as an illustrative example, a triangle prototype function from (4.45):

        p(t) = { 1 − |t|,   |t| < 1;
               { 0,         otherwise,
which has a Fourier transform (4.47) decaying as |ω|^{−2} for large ω.

Consider a function x(t) ∈ C¹ (continuous and with at least one continuous derivative, see Section 2.2.4) except for a discontinuity at t = t0. If it were not for the discontinuity, the Fourier transform of x(t) would decay faster than |ω|^{−2} (that is, faster than |P(ω)| does). However, because of the singularity at t = t0, |X(ω)| decays only as |ω|^{−1}.

Now the locality of the local Fourier transform comes into play. There are two modes, given by the regularity of the windowed function x_τ(t): (1) When τ is far from t0, |τ − t0| > 1, x_τ(t) is continuous (but its derivative is not, because of the triangle prototype function), and |X(Ω, τ)| decays as |Ω|^{−2}. (2) When τ is close to t0, |τ − t0| ≤ 1, that is, it is close to the discontinuity, x_τ(t) is discontinuous, and |X(Ω, τ)| decays only as |Ω|^{−1}.
The above example indicates that there is a subtle interplay between the smoothness and support of the prototype function and the singularities or smoothness of the analyzed function. This is formalized in the following two results:
Theorem 5.3 (Singularity characterization of the local Fourier transform)  Assume a prototype function p(t) with compact support [−T/2, T/2] and sufficient smoothness. Consider a function x(t) which is smooth except for a singularity of order n at t = t0, that is, its nth derivative at t0 is a Dirac delta function. Then its local Fourier transform decays as

        |X(Ω, τ)| ∼ O( 1 / (1 + |Ω|^n) )

in the region τ ∈ [t0 − T/2, t0 + T/2].
The proof follows by using the decay property of the Fourier transform applied to
the windowed function and is left as Exercise ??.
Conversely, a sufficiently decaying local Fourier transform indicates a smooth
function in the region of interest.
Theorem 5.4 (Smoothness from decay of the local Fourier transform)  Consider a sufficiently smooth prototype function p(t) of compact support [−T/2, T/2]. If the local Fourier transform at t0 decays sufficiently fast, that is, if for some α and ε > 0,

        |X(Ω, τ)| ≤ α / (1 + |Ω|^{p+1+ε}),

then x(t) is C^p on the interval [t0 − T/2, t0 + T/2].
Figure 5.4: The spectrogram. (a) A signal with various modes. (b) The spectrogram,
or |X(Ω, τ )|, with a short prototype function. (c) The spectrogram with a long prototype
function.
Spectrograms The standard way to display the local Fourier transform is as a
density plot of |X(Ω, τ )|. This is called the spectrogram and is very popular, for
example, for speech and music signals. Figure 5.4 shows a standard signal with
various modes and two spectrograms.
As can be seen, both the sinusoid and the singularities are identified, but how precisely they are localized depends on the length of the prototype function. For the short prototype function (Figure 5.4(b)), the various singularities are still isolated, but the
sinusoid is not well localized. The reverse is true for the long prototype function
(Figure 5.4(c)), where the sinusoid is well identified, but some of the singularities
are now mixed together. This is of course the fundamental tension between time
and frequency localization, as governed by the uncertainty principle we have seen
in Chapter 7.
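A spectrogram in the spirit of Figure 5.4 takes only a few lines to compute. The sketch below is our own illustration (not from the text): the sampling rate, test signal, hop size, and the two Gaussian prototype widths are arbitrary choices, and the helper name spectrogram is ours.

```python
# Sketch (not from the text): spectrograms with a short and a long Gaussian prototype.
import numpy as np
import matplotlib.pyplot as plt

fs = 1000.0                                      # arbitrary sampling rate
t = np.arange(0.0, 2.0, 1.0 / fs)
x = np.sin(2.0 * np.pi * 50.0 * t)               # a sinusoid ...
x[400] += 5.0                                    # ... plus two isolated singularities
x[1400] += 5.0

def spectrogram(x, width, hop=16):
    """|X(Omega, tau)| on a grid; Gaussian prototype of the given width (in samples)."""
    n = np.arange(len(x))
    rows = []
    for tau in range(0, len(x), hop):
        w = np.exp(-0.5 * ((n - tau) / width) ** 2)
        rows.append(np.abs(np.fft.rfft(x * w)))
    return np.array(rows).T                      # frequency (rows) by time (columns)

for width, name in [(10, "short prototype"), (100, "long prototype")]:
    plt.figure()
    plt.imshow(spectrogram(x, width), origin="lower", aspect="auto")
    plt.title(f"Spectrogram, {name}")
    plt.xlabel("tau (frames)")
    plt.ylabel("frequency bin")
plt.show()
```

With the short prototype the two impulses stay separated while the sinusoidal line smears in frequency; with the long prototype the line sharpens and the impulses blur, which is precisely the trade-off discussed above.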
5.3 Local Fourier Frame Series
5.3.1 Sampling Grids
5.3.2 Frames from Sampled Local Fourier Transform

5.4 Local Fourier Series

5.4.1 Complex Exponential-Modulated Local Fourier Bases
Complex-Exponential Modulation
Balian-Low Theorem

5.4.2 Cosine-Modulated Local Fourier Bases
Cosine Modulation
Local Cosine Bases

5.5 Computational Aspects

5.5.1 Complex Exponential-Modulated Local Fourier Bases
5.5.2 Cosine-Modulated Local Fourier Bases

Chapter at a Glance
TBD

Historical Remarks
TBD

Further Reading
TBD
Chapter 6
Wavelet Bases, Frames and Transforms on Functions
Contents
6.1  Introduction                                                  189
6.2  Scaling Function and Wavelets from Orthogonal Filter Banks   208
6.3  Wavelet Series                                                222
6.4  Wavelet Frame Series                                          242
6.5  Continuous Wavelet Transform                                  242
6.6  Computational Aspects                                         254
     Chapter at a Glance                                           259
     Historical Remarks                                            259
     Further Reading                                               259
The previous chapter started with the most redundant version of the local
Fourier expansions on functions: the local Fourier transform. We lowered its redundancy through sampling, leading to Fourier frames. Ultimately, we wanted to
make it nonredundant by trying to build local Fourier bases; however, we hit a
roadblock, the Balian-Low theorem, prohibiting such bases with reasonable joint
time and frequency localization. While bases are possible with cosine, instead of
complex-exponential, modulation, we can do even better. In this chapter, we will
start by constructing wavelet bases, and then go in the direction of increasing redundancy, by building frames and finally the continuous wavelet transform.
6.1 Introduction
Iterated filter banks from Chapter 3 pose interesting theoretical and practical questions, the key one quite simple: what happens if we iterate the DWT to infinity?
While we need to make the question precise by indicating how this iterative process
takes place, when done properly, and under certain conditions on the filters used in
the filter bank, the limit leads to a wavelet basis for the space of square-integrable
functions, L2 (R). The key notion is that we take a discrete-time basis (orthonormal
or biorthogonal) for ℓ2 (Z), and derive from it a continuous-time basis for L2 (R).
This connection between discrete and continuous time is reminiscent of the concepts
and aim of Chapter 5, including the sampling theorem. The iterative process itself
is fascinating, but the resulting bases are even more so: they are scale invariant
(as opposed to shift invariant) so that all basis vectors are obtained from a single
function ψ(t) through shifting and scaling. What we do in this opening section is
go through some salient points on a simple example we have seen numerous times
in Part II: the Haar basis. We will start from its discrete-time version seen in Chapter 1 (level 1, scale 0) and the iterated one seen in Chapter 3 (level J, scale 2^J)
and build a continuous-time basis for L2 (R). We then mimic this process and show
how it can lead to a wealth of different wavelet bases. We will also look into the
Haar frame and Haar continuous wavelet transform. We follow the same roadmap
from this section, iterated filters—wavelet series—wavelet frame series—continuous
wavelet transform, throughout the rest of the chapter, but in a more general setting.
As the chapter contains a fair amount of material, some of it quite technical, this
section attempts to cover all the main concepts, and is thus rather long. The details
in more general settings are covered throughout the rest of the chapter.
6.1.1 Scaling Function and Wavelets from Haar Filter Bank
To set the stage, we start with the Haar filters g and h given in Table 1.8, Chapter 1, where we used their impulse responses and shifts by multiples of two as a basis for ℓ²(ℤ). This orthonormal basis was implemented using a critically-sampled two-channel filter bank with down- and upsampling by 2, the synthesis lowpass/highpass filter pair g_n, h_n from (1.1) (repeated here for easy reference),

        g_n = (1/√2)(δ_n + δ_{n−1})  ⟷^{ZT}  G(z) = (1/√2)(1 + z^{−1}),                 (6.1a)
        h_n = (1/√2)(δ_n − δ_{n−1})  ⟷^{ZT}  H(z) = (1/√2)(1 − z^{−1}),                 (6.1b)

and a corresponding analysis lowpass/highpass filter pair g_{−n}, h_{−n}.
and a corresponding analysis lowpass/highpass filter pair g−n , h−n .
We then used these filters and the associated two-channel filter bank as a
building block for the Haar DWT in Chapter 3. For example, we saw that in a
3-level iterated Haar filter bank, the lowpass and highpass at level 3 were given by
(3.1c)–(3.1d) and plotted in Figure 3.4:
        G^(3)(z) = G(z) G(z^2) G(z^4)
                 = (1/(2√2)) (1 + z^{−1})(1 + z^{−2})(1 + z^{−4})
                 = (1/(2√2)) (1 + z^{−1} + z^{−2} + z^{−3} + z^{−4} + z^{−5} + z^{−6} + z^{−7}),        (6.2a)

        H^(3)(z) = G(z) G(z^2) H(z^4)
                 = (1/(2√2)) (1 + z^{−1})(1 + z^{−2})(1 − z^{−4})
                 = (1/(2√2)) (1 + z^{−1} + z^{−2} + z^{−3} − z^{−4} − z^{−5} − z^{−6} − z^{−7}).        (6.2b)
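The products of upsampled filters in (6.2a)–(6.2b) can be formed mechanically. The following sketch (our own illustration, not from the text; the helper names upsample and equivalent_filters are ours, and only NumPy is assumed) computes the level-J equivalent Haar filters by repeated upsampling and convolution and reproduces the coefficients above for J = 3.

```python
# Sketch (not from the text): equivalent filters g^(J), h^(J) of a J-level iterated Haar filter bank.
import numpy as np

g = np.array([1.0, 1.0]) / np.sqrt(2.0)       # Haar lowpass, (6.1a)
h = np.array([1.0, -1.0]) / np.sqrt(2.0)      # Haar highpass, (6.1b)

def upsample(f, factor):
    """Insert factor-1 zeros between samples: F(z) -> F(z^factor)."""
    out = np.zeros((len(f) - 1) * factor + 1)
    out[::factor] = f
    return out

def equivalent_filters(J):
    """g^(J)(z) = prod_l G(z^(2^l));  h^(J)(z) = prod_l G(z^(2^l)) * H(z^(2^(J-1)))."""
    partial = np.array([1.0])
    for ell in range(J - 1):
        partial = np.convolve(partial, upsample(g, 2 ** ell))
    hJ = np.convolve(partial, upsample(h, 2 ** (J - 1)))
    gJ = np.convolve(partial, upsample(g, 2 ** (J - 1)))
    return gJ, hJ

g3, h3 = equivalent_filters(3)
print(np.round(g3 * 2 * np.sqrt(2)))          # [1 1 1 1 1 1 1 1], matching (6.2a)
print(np.round(h3 * 2 * np.sqrt(2)))          # [1 1 1 1 -1 -1 -1 -1], matching (6.2b)
```

The same routine also confirms that the filter lengths grow as 2^J, which is the fact (6.4) used below.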
Iterated Filters We now revisit these filters and their iterations, but with a new
angle, as we let the iteration go to infinity by associating a continuous-time function
to the discrete-time sequence (impulse response of the iterated filter).
We first write the expressions for the equivalent filters at the last level of a J-level iterated Haar filter bank:

        G^(J)(z) = ∏_{ℓ=0}^{J−1} G(z^{2^ℓ}) = 2^{−J/2} ∏_{ℓ=0}^{J−1} (1 + z^{−2^ℓ})
                 = 2^{−J/2} ∑_{n=0}^{2^J−1} z^{−n} = G^(J−1)(z) · (1/√2)(1 + z^{−2^{J−1}}),
        H^(J)(z) = ∏_{ℓ=0}^{J−2} G(z^{2^ℓ}) H(z^{2^{J−1}}) = 2^{−J/2} ∏_{ℓ=0}^{J−2} (1 + z^{−2^ℓ}) (1 − z^{−2^{J−1}})        (6.3)
                 = 2^{−J/2} ( ∑_{n=0}^{2^{J−1}−1} z^{−n} − ∑_{n=2^{J−1}}^{2^J−1} z^{−n} ) = G^(J−1)(z) · (1/√2)(1 − z^{−2^{J−1}}),

the last factors being G(z^{2^{J−1}}) and H(z^{2^{J−1}}), respectively. We have seen the above expressions in (3.5c), (3.9) already; they construct the equivalent filter at the subsequent level from the equivalent filters at the previous one.

We know that, by construction, these filters are orthonormal to their shifts by 2^J, (3.6a), (3.11a), as well as orthogonal to each other, (3.14a), and their lengths are

        L^(J) = 2^J.                                                                     (6.4)
Scaling Function and its Properties  We now associate a piecewise-constant function ϕ^(J)(t) to g_n^(J) so that ϕ^(J)(t) is of finite length and norm 1; we thus have to determine the width and height of the piecewise segments. Since the number of piecewise segments (equal to the number of nonzero coefficients of g_n^(J)) grows exponentially with J because of (6.4), we choose their width as 2^{−J}, upper bounding the length of ϕ^(J)(t) by 1. For ϕ^(J)(t) to inherit the unit-norm property from g_n^(J), we choose the height of the piecewise segments as 2^{J/2} g_n^(J). Then, the nth piece of ϕ^(J)(t) contributes

        ∫_{n/2^J}^{(n+1)/2^J} |ϕ^(J)(t)|² dt = ∫_{n/2^J}^{(n+1)/2^J} 2^J (g_n^(J))² dt = (g_n^(J))²  =(a)  2^{−J}

to ‖ϕ^(J)(t)‖² (where (a) follows from (6.3)). Summing up the individual contributions,

        ‖ϕ^(J)(t)‖² = ∑_{n=0}^{2^J−1} ∫_{n/2^J}^{(n+1)/2^J} |ϕ^(J)(t)|² dt = ∑_{n=0}^{2^J−1} 2^{−J} = 1,

as in Figure 6.1. We have thus defined our piecewise-constant function as
        ϕ^(J)(t) = 2^{J/2} g_n^(J) = 1,    n/2^J ≤ t < (n+1)/2^J.                        (6.5)

Figure 6.1: Example construction of a piecewise-constant function ϕ^(J)(t) from g_n^(J). (a) Discrete-time sequence 2² g_n^(4). (b) Continuous-time piecewise-constant function ϕ^(4)(t) (we plot a few isolated piecewise segments for emphasis).

Figure 6.2: Magnitude response Φ(ω) of the scaling function ϕ(t).
As ϕ^(J)(t) is 1 on every interval of length 2^{−J} and g^(J) has exactly 2^J nonzero entries (see (6.4)), this function is actually 1 on the interval [0, 1) for every J, that is, the limit of ϕ^(J)(t) is the indicator function of the unit interval [0, 1] (or, a box function shifted to 1/2), ϕ(t), independently of J,

        ϕ^(J)(t) = { 1,   0 ≤ t < 1;
                   { 0,   otherwise,
                 = ϕ(t).                                                                 (6.6)

Convergence is achieved without any problem, actually in one step!46
The function ϕ(t) is called the Haar scaling function. Had we started with a different lowpass filter g, the resulting limit, had it existed, would have led to a different scaling function, a topic we will address later in the chapter.
In the Fourier domain, Φ^(J)(ω), the Fourier transform of ϕ^(J)(t), will be the same for every J because of (6.6), and thus, the Fourier transform of the scaling function will be the sinc function in frequency (see Table 4.6 and Figure 6.2):

        Φ(ω) = e^{−jω/2} sin(ω/2)/(ω/2) = e^{−jω/2} sinc(ω/2).                           (6.7)
We now turn our attention to some interesting properties of the scaling function:
46 Although we could have defined a piecewise-linear function instead of a piecewise-constant
one, we chose not to do so as the behavior of the limit we will study does not change.
Figure 6.3: Two-scale equation for the Haar scaling function. (a) The scaling function ϕ(t) and (b) expressed as a linear combination of ϕ(2t) and ϕ(2t − 1).
(i) Two-scale equation: The Haar scaling function ϕ(t) satisfies

        ϕ(t) = √2 (g0 ϕ(2t) + g1 ϕ(2t − 1)) = ϕ(2t) + ϕ(2t − 1),                        (6.8)

    the so-called two-scale equation. We see that the scaling function is built out of two scaled versions of itself, illustrated in Figure 6.3. While in this Haar case this does not come as a big surprise, it will when the scaling functions become more complex. To find the expression for the two-scale equation in the Fourier domain, we rewrite (6.8) as a convolution,

        ϕ(t) = ϕ(2t) + ϕ(2t − 1) = √2 ∑_{k=0}^{1} g_k ϕ(2t − k).                        (6.9)

    We can then use the convolution-in-time property (4.61) and the scaling property (4.55a) of the Fourier transform to get

        Φ(ω) = (1/√2) G(e^{jω/2}) Φ(ω/2) = ½ (1 + e^{−jω/2}) e^{−jω/4} sinc(ω/4).       (6.10)
(ii) Smoothness: The Haar scaling function ϕ(t) is not continuous.47 This can also be seen from the decay of its Fourier transform Φ(ω), which, as we know from (6.7), is a sinc function (see also Figure 6.2), and thus decays slowly (it has, however, only two points of discontinuity and is, therefore, not altogether ill-behaved).

(iii) Reproduction of polynomials: The Haar scaling function ϕ(t) with its integer shifts can reproduce constant functions. This stems from the polynomial approximation properties of g, as in Theorem 1.5. In the next section, we will see how other scaling functions will be able to reproduce polynomials of degree N. The key will be the number of zeros at ω = π of the lowpass filter G(e^{jω}); from (6.1a), for the Haar scaling function, there is just 1.
(iv) Orthogonality to integer shifts: The Haar scaling function ϕ(t) is orthogonal to its integer shifts, another property inherited from the underlying filter.

47 In Theorem 6.1, we will see a sufficient condition for the limit function, if it exists, to be continuous (and possibly k-times differentiable).
     Of course, in this Haar case the property is obvious, as the support of ϕ(t) is limited to the unit interval. The property will still hold for more general scaling functions, albeit it will not be that obvious to see.

Figure 6.4: Two-scale equation for the Haar wavelet. (a) The wavelet ψ(t) and (b) expressed as a linear combination of ϕ(2t) and ϕ(2t − 1).
Wavelet and its Properties  The scaling function we have just seen is lowpass in nature (if the underlying filter g is lowpass in nature). Similarly, we can construct a wavelet function (or, simply, wavelet) that will be bandpass in nature (if the underlying filter h is highpass in nature).

We thus associate a piecewise-constant function ψ^(J)(t) to h_n^(J) in such a way that ψ^(J)(t) is of finite length and of norm 1; we use the same arguments as before to determine the width and height of the piecewise segments, leading to

        ψ^(J)(t) = 2^{J/2} h_n^(J),    n/2^J ≤ t < (n+1)/2^J.                            (6.11)
Like ϕ^(J)(t), the function ψ^(J)(t) is again the same for every J since the length of h^(J) is exactly 2^J. It is 1 for n = 0, 1, . . . , 2^{J−1} − 1, and is −1 for n = 2^{J−1}, 2^{J−1} + 1, . . . , 2^J − 1. Thus, it comes as no surprise that the limit is

        ψ(t) = {  1,   0 ≤ t < 1/2;
               { −1,   1/2 ≤ t < 1;                                                      (6.12)
               {  0,   otherwise,
called the Haar wavelet48 (see Figure 6.4(a)).
Similarly to Φ(ω), in the Fourier domain,

        Ψ(ω) = ½ (1 − e^{−jω/2}) e^{−jω/4} sinc(ω/4) = (1/√2) H(e^{jω/2}) Φ(ω/2).        (6.13)
We now turn our attention to some interesting properties of the Haar wavelet:
(i) Two-scale equation: We can see from Ψ(ω) its highpass nature as it is 0 at ω = 0 (because H(1) = 0). Using the convolution-in-time property (4.61) and the scaling property (4.55a) of the Fourier transform, we see that the above can be written in the time domain as (see Figure 6.4(b))

        ψ(t) = √2 ⟨h_k, ϕ(2t − k)⟩_k = ϕ(2t) − ϕ(2t − 1).                                (6.14)

    In other words, the wavelet is built out of the scaling function at a different scale and its shift, its own two-scale equation, but involving scaled versions of the scaling function instead of itself. The last expression in (6.13) is the Fourier-domain version of the two-scale equation.

48 Depending on the initial discrete-time filters, the resulting limits, when they exist, lead to different wavelets.
(ii) Smoothness: Since the wavelet is a linear combination of the scaled scaling function and its shift, its smoothness is inherited from the scaling function; in other words, like the scaling function, it is not continuous, having 3 points of discontinuity.

(iii) Zero-moment property: We have seen that the Haar scaling function can reproduce constant functions. The Haar wavelet has a complementary property, called the zero-moment property. To see that,

        Ψ(ω)|_{ω=0} = ∫_{−∞}^{∞} ψ(t) dt = 0,

    that is, the inner product between the wavelet and a constant function will be zero. In other words, the wavelet annihilates constant functions while the scaling function reproduces them.

(iv) Orthogonality to integer shifts: Finally, like the scaling function, the wavelet is orthogonal with respect to integer shifts. Again, this is trivial to see for the Haar wavelet as it is supported on the unit interval only.

(v) Orthogonality of the scaling function and wavelet: It is also trivial to see that the scaling function and the wavelet are orthogonal to each other. All these properties are setting the stage for us to build a basis based on these functions.
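All of the relations above are easy to confirm numerically for the Haar pair. The sketch below is our own check (not from the text), on a dyadic grid of arbitrary length; it verifies the two-scale equations (6.8) and (6.14) and the orthogonality relations just listed.

```python
# Sketch (not from the text): numerical check of the Haar two-scale equations and orthogonality.
import numpy as np

t = np.arange(0.0, 4.0, 1.0 / 1024)              # dyadic grid on [0, 4)
dt = t[1] - t[0]

def phi(t):                                       # Haar scaling function (6.6)
    return np.where((t >= 0) & (t < 1), 1.0, 0.0)

def psi(t):                                       # Haar wavelet (6.12)
    return (np.where((t >= 0) & (t < 0.5), 1.0, 0.0)
            - np.where((t >= 0.5) & (t < 1), 1.0, 0.0))

# two-scale equations (6.8) and (6.14)
print(np.allclose(phi(t), phi(2 * t) + phi(2 * t - 1)))     # True
print(np.allclose(psi(t), phi(2 * t) - phi(2 * t - 1)))     # True

# orthogonality to integer shifts and between phi and psi
ip = lambda f, g: np.sum(f * g) * dt
print(round(ip(phi(t), phi(t)), 3), round(ip(phi(t), phi(t - 1)), 3))   # 1.0  0.0
print(round(ip(psi(t), psi(t - 1)), 3), round(ip(phi(t), psi(t)), 3))   # 0.0  0.0
```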
6.1.2 Haar Wavelet Series
Thus far, we have constructed two functions, the scaling function ϕ(t) and the wavelet ψ(t), by iterating the Haar filter bank. That filter bank implements a discrete-time Haar basis; what about continuous time? What we can say is that this scaling function and the wavelet, together with their integer shifts, {ϕ(t − k), ψ(t − k)}_{k∈ℤ}, do constitute a basis for the space of piecewise-constant functions on intervals of half-integer length or more (see Figure 6.5(a)–(c)). We can see that as follows. Assume we are given a function x^(−1)(t) that equals a for 0 ≤ t < 1/2; b for 1/2 ≤ t < 1; and 0 otherwise. Then,

        x^(−1)(t) = ((a + b)/2) ϕ(t) + ((a − b)/2) ψ(t)
                  = ⟨x^(−1)(t), ϕ(t)⟩ ϕ(t) + ⟨x^(−1)(t), ψ(t)⟩ ψ(t)
                  = x^(0)(t) + d^(0)(t).                                                 (6.15)
Figure 6.5: Haar series decomposition of (a) x^(−1)(t), a function constant on half-integer intervals, using {ϕ(t − k), ψ(t − k)}_{k∈ℤ}, into (b) x^(0)(t) and (c) d^(0)(t).
Figure 6.6: Haar series decomposition of (a) x^(−2)(t), a function constant on quarter-integer intervals, using {√2 ϕ(2t − k), √2 ψ(2t − k)}_{k∈ℤ}, into (b) x^(−1)(t) and (c) d^(−1)(t).
Had the function x^(−1)(t) been nonzero on any other interval, we could have used the integer shifts of ϕ(t) and ψ(t).

Clearly, this process scales by 2. In other words, the scaled scaling function and the wavelet, together with their shifts by multiples of 1/2, {√2 ϕ(2t − k), √2 ψ(2t − k)}_{k∈ℤ}, do constitute a basis for the space of piecewise-constant functions on intervals of quarter-integer length or more (see Figure 6.6(a)–(c)). Assume, for example, we are now given a function x^(−2)(t) that equals c for 0 ≤ t < 1/4; d for 1/4 ≤ t < 1/2; and 0 otherwise. Then,

        x^(−2)(t) = ((c + d)/2) ϕ(2t) + ((c − d)/2) ψ(2t)
                  = ⟨x^(−2), √2 ϕ(2t)⟩ √2 ϕ(2t) + ⟨x^(−2)(t), √2 ψ(2t)⟩ √2 ψ(2t)
                  = x^(−1)(t) + d^(−1)(t)  =(a)  x^(0)(t) + d^(0)(t) + d^(−1)(t),

where (a) follows from (6.15). In other words, we could have also decomposed x^(−2)(t) using {ϕ(t − k), ψ(t − k), √2 ψ(2t − k)}_{k∈ℤ}.
Continuing this argument, to represent piecewise-constant functions on inter-
vals of length 2^{−ℓ}, we need the following basis:
        Scale          Functions
        0              scaling function and its shifts     ϕ(t − k)
                       wavelet and its shifts              ψ(t − k)
        −1             wavelet and its shifts              2^{1/2} ψ(2t − k)
        ...            ...                                 ...
        −(ℓ − 1)       wavelet and its shifts              2^{(ℓ−1)/2} ψ(2^{ℓ−1} t − k)
Definition of the Haar Wavelet Series From what we have seen, if we want to
represent shorter and shorter constant pieces, we need to keep on adding wavelets
with decreasing scale together with the scaling function at the coarsest scale. We
may imagine, and we will formalize this in a moment, that if we let this process
go to infinity, the scaling function will eventually become superfluous, and it does.
This previous discussion leads to the Haar orthonormal set and a truly surprising
result dating back to Haar in 1910, that this orthonormal system is in fact a basis
for L2 (R). This result is the Haar continuous-time counterpart of Theorem 3.2,
which states that the discrete-time wavelet hn and its shifts and scales (equivalent
iterated filters) form an orthonormal basis for the space of finite-energy sequences,
ℓ2 (Z). The general result will be given by Theorem 6.6.
For compactness, we start by renaming our basis functions as:

        ψ_{ℓ,k}(t) = 2^{−ℓ/2} ψ(2^{−ℓ} t − k) = (1/2^{ℓ/2}) ψ((t − 2^ℓ k)/2^ℓ),          (6.16a)
        ϕ_{ℓ,k}(t) = 2^{−ℓ/2} ϕ(2^{−ℓ} t − k) = (1/2^{ℓ/2}) ϕ((t − 2^ℓ k)/2^ℓ).          (6.16b)

A few of the wavelets are given in Figure 6.7. Since we will show that the Haar wavelets form an orthonormal basis, we can define the Haar wavelet series to be

        β_k^{(ℓ)} = ⟨x, ψ_{ℓ,k}⟩ = ∫_{−∞}^{∞} x(t) ψ_{ℓ,k}(t) dt,    ℓ, k ∈ ℤ,           (6.17a)

and the inverse Haar wavelet series

        x(t) = ∑_{ℓ∈ℤ} ∑_{k∈ℤ} β_k^{(ℓ)} ψ_{ℓ,k}(t).                                     (6.17b)
Figure 6.7: Example Haar basis functions. (a) The prototype function ψ(t) = ψ_{0,0}(t); (b) ψ_{−1,1}(t); (c) ψ_{−2,1}(t). (Repeated Figure 1.2.)
We call β_k^{(ℓ)} the wavelet coefficients, and denote such a wavelet series pair by

        x(t)  ⟷^{WS}  β_k^{(ℓ)}.
Properties of the Haar Wavelet Series

(i) Linearity: The Haar wavelet series operator is a linear operator.

(ii) Parseval's Equality: The Haar wavelet series operator is a unitary operator and thus preserves the Euclidean norm (see (2.53)):

        ‖x‖² = ∫_{−∞}^{∞} |x(t)|² dt = ∑_{ℓ∈ℤ} ∑_{k∈ℤ} |β_k^{(ℓ)}|².                     (6.18)
(iii) Zero-Moment Property: We have seen earlier that, while the Haar scaling
function with its integer shifts can reproduce constant functions, the Haar
wavelet with its integer shifts annihilates them. As the wavelet series uses
wavelets as its basis functions, it inherits that property; it annihilates constant
functions. In the remainder of the chapter, we will see this to be true for
higher-degree polynomials with different wavelets.
(iv) Characterization of Singularities: One of the powerful properties of wavelet-like representations is that they can characterize the type and position of singularities via the behavior of wavelet coefficients. Assume, for example, that we want to characterize the step singularity present in the Heaviside function (4.7) with the step at location t0. We compute the wavelet coefficient β_k^{(ℓ)} to get

        β_k^{(ℓ)} = ∫_{−∞}^{∞} x(t) ψ_{ℓ,k}(t) dt
                  = {  2^{ℓ/2} k − 2^{−ℓ/2} t0,          2^ℓ k ≤ t0 < 2^ℓ (k + ½);
                    { −2^{ℓ/2} (k + 1) + 2^{−ℓ/2} t0,    2^ℓ (k + ½) ≤ t0 < 2^ℓ (k + 1).

    Because the Haar wavelets at a fixed scale do not overlap, there exists exactly one nonzero wavelet coefficient per scale, the one that straddles the discontinuity. Therefore, as ℓ → −∞, the wavelet series zooms towards the singularity, as shown in Figure 6.8. We see that as ℓ decreases from 5 to 1, the single nonzero wavelet coefficient gets closer and closer to the discontinuity.
Figure 6.8: Behavior of Haar wavelet coefficients across scales. We plot β_k^{(ℓ)} for ℓ = 1, 2, . . . , 5, where k is dependent on the scale. Because the wavelet is Haar, there is exactly one nonzero coefficient per scale, the one corresponding to the wavelet that straddles the discontinuity.
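The behavior shown in Figure 6.8 can be reproduced directly from (6.17a). The following sketch is our own illustration (not from the text): the step location, interval, grid, and the helper name psi_lk are arbitrary choices; it computes the Haar wavelet coefficients of a step on a fine grid and lists, for each scale, the shifts k with a nonzero coefficient, exactly one per scale.

```python
# Sketch (not from the text): Haar wavelet coefficients (6.17a) of a step at t0.
import numpy as np

t0 = 13.3
t = np.arange(0.0, 32.0, 1.0 / 256)
dt = t[1] - t[0]
x = (t >= t0).astype(float)                       # Heaviside-like step inside [0, 32)

def psi_lk(ell, k, t):
    """psi_{ell,k}(t) = 2^(-ell/2) psi(2^(-ell) t - k), with psi the Haar wavelet."""
    u = 2.0 ** (-ell) * t - k
    return 2.0 ** (-ell / 2) * (np.where((u >= 0) & (u < 0.5), 1.0, 0.0)
                                - np.where((u >= 0.5) & (u < 1), 1.0, 0.0))

for ell in range(1, 6):                           # scales 2^1 ... 2^5
    betas = [np.sum(x * psi_lk(ell, k, t)) * dt for k in range(int(32 / 2 ** ell))]
    nonzero = [k for k, b in enumerate(betas) if abs(b) > 1e-9]
    print(f"scale 2^{ell}: nonzero coefficients at k = {nonzero}")   # exactly one k per scale
```

As the scale decreases, the single surviving coefficient corresponds to a wavelet whose support shrinks around t0, which is the zooming-in behavior described above.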
Multiresolution Analysis In the discrete-time Chapters 1-4, we have often encountered coarse/detail approximating spaces V and W . We now use the same intuition
and start from similar spaces to build the Haar wavelet series in reverse. What
we will see is how the iterative construction and the two-scale equations are the
manifestations of a fundamental embedding property explicit in the multiresolution
analysis of Mallat and Meyer.
We call V^(0) the space of piecewise-constant functions over unit intervals, that is, we say that x(t) ∈ V^(0) if and only if x(t) is constant for t ∈ [k, k + 1), and x(t) is of finite L² norm. Another way to phrase the above is to note that

        V^(0) = span({ϕ(t − k)}_{k∈ℤ}) = span({ϕ_{0,k}}_{k∈ℤ}),                          (6.19)

where ϕ(t) is the Haar scaling function (6.6), and, since ⟨ϕ(t − k), ϕ(t − m)⟩ = δ_{k−m}, this scaling function and its integer translates form an orthonormal basis for V^(0) (see Figure 6.9). Thus, x(t) from V^(0) can be written as a linear combination

        x(t) = ∑_{k∈ℤ} α_k^{(0)} ϕ(t − k),

where α_k^{(0)} is simply the value of x(t) on the interval [k, k + 1). Since ‖ϕ(t)‖ = 1,

        ‖x(t)‖² = ‖α^{(0)}‖²,

or, Parseval's equality for this orthonormal basis.
Figure 6.9: Haar multiresolution spaces and basis functions. (a) A function x^(0)(t) ∈ V^(0). (b) Basis functions for V^(0).

Figure 6.10: Multiresolution spaces.

We now introduce a scaled
version of V^(0) called V^(ℓ), the space of piecewise-constant functions over intervals of size 2^ℓ, that is, [2^ℓ k, 2^ℓ(k + 1)), ℓ ∈ ℤ. Then,

        V^(ℓ) = span({ϕ_{ℓ,k}}_{k∈ℤ}),

for ℓ ∈ ℤ. For ℓ > 0, V^(ℓ) is a stretched version of V^(0), and for ℓ < 0, V^(ℓ) is a compressed version of V^(0) (both by 2^ℓ). Moreover, the fact that functions constant over [2^ℓ k, 2^ℓ(k + 1)) are also constant over [2^m k, 2^m(k + 1)), ℓ > m, leads to the inclusion property (see Figure 6.10),

        V^(ℓ) ⊂ V^(m),    ℓ > m.                                                          (6.20)
We can use this to derive the two-scale equation (6.8), by noting that because of V^(0) ⊂ V^(−1), ϕ(t) can be expanded in the basis for V^(−1). Graphically, we show the spaces V^(0), V^(−1), and their basis functions in Figures 6.9 and 6.11; the two-scale equation was shown in Figure 6.3.

What about the detail spaces? Take a function x^(−1)(t) in V^(−1) but not in V^(0); such a function is constant over half-integer intervals but not so over integer intervals (see Figure 6.11(a)). Decompose it as a sum of its projections onto V^(0) and W^(0), the latter the orthogonal complement of V^(0) in V^(−1) (see Figure 6.12),

        x^(−1)(t) = P_{V^(0)}(x^(−1))(t) + P_{W^(0)}(x^(−1))(t).                          (6.21)
Figure 6.11: Haar multiresolution spaces and basis functions. (a) A function x^(−1)(t) ∈ V^(−1). (b) Basis functions for V^(−1).
Figure 6.12: A function from V^(−1) as the sum of its projections onto V^(0) and W^(0).
We first find the projection of x^(−1)(t) onto V^(0) as

        P_{V^(0)}(x^(−1))(t) = x^(0)(t)
                             = ∑_{k∈ℤ} α_k^{(0)} ϕ(t − k) = ∑_{k∈ℤ} ⟨x^(−1)(t), ϕ(t − k)⟩_t ϕ(t − k)
                             = ∑_{k∈ℤ} ⟨x^(−1)(t), ϕ(2t − 2k) + ϕ(2t − 2k − 1)⟩_t ϕ(t − k)          (a)
                             = ∑_{k∈ℤ} ½ [x^(−1)(k) + x^(−1)(k + ½)] ϕ(t − k),                      (b)   (6.22)

where (a) follows from the two-scale equation (6.8); and (b) from evaluating the inner product between x^(−1)(t) and the basis functions for V^(−1). In other words, x^(0)(t) is simply the average of x^(−1)(t) over two successive intervals. This is the best least-squares approximation of x^(−1)(t) by a function in V^(0) (see Exercise ??).
We now find the projection of x(−1) onto W (0) . Subtract the projection x(0)
from x(−1) and call the difference d(0) . Since x(0) is an orthogonal projection, d(0)
is orthogonal to V^(0) (see Figure 6.12). Using (6.22) leads to

        P_{W^(0)}(x^(−1))(t) = d^(0)(t) = x^(−1)(t) − x^(0)(t)
                             = {  ½ [x^(−1)(k) − x^(−1)(k + ½)],   k ≤ t < k + ½;
                               { −½ [x^(−1)(k) − x^(−1)(k + ½)],   k + ½ ≤ t < k + 1,
                             = ∑_k ½ [x^(−1)(k) − x^(−1)(k + ½)] ψ(t − k)
                             = ∑_k (1/√2) [β_{2k}^{(−1)} − β_{2k+1}^{(−1)}] ψ(t − k) = ∑_k β_k^{(0)} ψ(t − k).          (6.23)

Figure 6.13: Haar decomposition of a function (a) x^(−1)(t) ∈ V^(−1) into a projection (b) x^(0)(t) ∈ V^(0) (average over two successive intervals) and (c) d^(0)(t) ∈ W^(0) (difference over two successive intervals). (d)–(f) Appropriate basis functions.
We have thus informally shown that the space V^(−1) can be decomposed as

        V^(−1) = V^(0) ⊕ W^(0)                                                            (6.24)

(see an example in Figure 6.13). We also derived bases for these spaces, scaling functions and their shifts for V^(−1) and V^(0), and wavelets and their shifts for W^(0). This process can be further iterated on V^(0) (see Figure 6.10).
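In practice, (6.22) and (6.23) amount to taking averages and differences of the values of x^(−1) on successive half-integer intervals. The short sketch below (our own illustration, with arbitrary sample values) performs this one-level Haar split and confirms that x^(−1) = x^(0) + d^(0).

```python
# Sketch (not from the text): one-level Haar split of a piecewise-constant x^(-1) into x^(0) + d^(0).
import numpy as np

# values of x^(-1)(t) on successive half-integer intervals [k/2, (k+1)/2)
x_half = np.array([1.0, 3.0, 4.0, 2.0, -1.0, 1.0, 0.0, 2.0])

avg = (x_half[0::2] + x_half[1::2]) / 2     # x^(0): averages over [k, k+1), as in (6.22)
dif = (x_half[0::2] - x_half[1::2]) / 2     # d^(0): +dif on the first half, -dif on the second, as in (6.23)

# reconstruct the half-integer values from the two projections
x_rec = np.empty_like(x_half)
x_rec[0::2] = avg + dif
x_rec[1::2] = avg - dif
print(np.allclose(x_rec, x_half))           # True: x^(-1) = x^(0) + d^(0)
```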
6.1.3 Haar Frame Series
The Haar wavelet series we just saw is an elegant representation, and completely
nonredundant. As we have seen in Chapters 4 and 5, at times we can benefit from
relaxing this constraint, and allowing some redundancy in the system. Our aim
would then be to build frames. There are many ways in which we could do that.
For example, by adding to the Haar wavelet basis wavelets at points halfway in
between the existing ones, we would have twice as many wavelets, leading to a redundant series representation with a redundancy factor of 2, a simple example we will use to illustrate the Haar frame series.

Figure 6.14: A few Haar wavelets at scale ℓ = 0 for the (a) Haar wavelet series (nonredundant) and (b) Haar frame series (redundant). Clearly, there are twice as many wavelets in (b), making it a redundant expansion with the redundancy factor 2.
Definition of the Haar Frame Series  We now relax the constraint of critical sampling, but still retain the series expansion. That is, we assume that the expansion coefficients are

        β_k^{(ℓ)} = ⟨x, ψ_{ℓ,k}⟩ = ∫_{−∞}^{∞} x(t) ψ_{ℓ,k}(t) dt,    ℓ, k ∈ ℤ,            (6.25)

where ψ_{ℓ,k}(t) is now given by

        ψ_{ℓ,k}(t) = a0^{−ℓ/2} ψ(a0^{−ℓ} t − b0 k) = a0^{−ℓ/2} ψ((t − a0^ℓ b0 k)/a0^ℓ),   (6.26)

with a0 > 1 and b0 > 0. With a0 = 2 and b0 = 1, we get back our nonredundant Haar wavelet series. What we have allowed ourselves to do here is to choose scale factors different from 2, as well as different coverage of the time axis by shifted wavelets at a fixed scale. For example, keep the scale factor the same, a0 = 2, but allow overlap of half width between Haar wavelets, that is, choose b0 = 1/2. Figure 6.14(b) shows how many wavelets then populate the time axis at a fixed scale (example for ℓ = 0), compared to the wavelet series (part (a) of the same figure).
Properties of the Haar Frame Series  Such a Haar frame series satisfies similar properties as the Haar wavelet series: it is linear, it is able to characterize singularities, and it inherits the zero-moment property. One property, though, Parseval's equality, bears further scrutiny. Let us express the energy of the expansion coefficients as:

        ∑_{ℓ∈ℤ} ∑_{k∈ℤ} |β_{k/2}^{(ℓ)}|² = ∑_{ℓ∈ℤ} ∑_{k∈ℤ} |β_k^{(ℓ)}|² + ∑_{ℓ∈ℤ} ∑_{k∈ℤ} |β_{k+1/2}^{(ℓ)}|²  =(a)  2 ‖x‖²,

where (a) follows from (6.18). In other words, this frame series behaves like two orthonormal bases glued together; it is then not surprising that the energy in the expansion coefficients is twice that of the input function, making this transform a tight frame.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
204
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 6. Wavelet Bases, Frames and Transforms on Functions
ψ(t)
1.0
0.5
-1.0
0.5
-0.5
1.0
t
-0.5
-1.0
Figure 6.15: The prototype wavelet in the Haar continuous wavelet transform.
6.1.4
Haar Continuous Wavelet Transform
We finally relax all the constraints and discuss the most redundant version of a
wavelet expansion, where the Haar wavelet (6.12) is now shifted to be centered at
t = 0 (see Figure 6.15):


1, −1/2 ≤ t < 0;
−1,
0 ≤ t < 1/2;
ψ(t) =

0,
otherwise.
(6.27)
Then, instead of a = aℓ0 , we allow all positive real numbers, a ∈ R+ . Similarly,
instead of shifts b = b0 k, we allow all real numbers, b ∈ R:
1
ψa,b (t) = √ ψ
a
t−b
a
,
a ∈ R+ ,
b ∈ R,
(6.28)
with ψ(t) the Haar wavelet. The scaled and shifted Haar wavelet is then centered at
t = b and scaled by a factor a. All the wavelets are again of unit norm, kψa,b (t)k = 1.
For a = 2ℓ and b = 2ℓ k, we get the nonredundant wavelet basis as in TBD.
Definition of the Haar Continuous Wavelet Transform We then define the Haar
continuous wavelet transform to be (an example is given in Figure 6.16):
X(a, b) = hx, ψa,b i =
Z
∞
x(t) ψa,b (t) dt,
−∞
a ∈ R+ ,
b ∈ R,
(6.29a)
with ψa,b (t) from (6.28), with the inverse Haar continuous wavelet transform
x(t) =
1
Cψ
Z
a∈R+
Z
X(a, b)ψa,b (t)
b∈R
db da
,
a2
(6.29b)
where the equality holds in L2 sense. To denote such a pair, we write:
x(t)
α3.2 [January 2013] [free version] CC by-nc-nd
CWT
←→
X(a, b).
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
6.1. Introduction
205
a
-3
-2
-1
a
1
2
3
b
-1
(a)
a
1
2
3
4
5
b
-3
-2
(b)
1
-1
2
3
b
(c)
Figure 6.16: (a) The Haar wavelet transform of an example function x(t) (hat function).
(b) Illustration of the shift-in-time property for x(t − 2). (c) Illustration of the scaling-intime property for x(t/2).
Properties of the Haar Continuous Wavelet Transform
(i) Linearity: The Haar continuous wavelet transform operator is a linear operator.
(ii) Shift in time: A shift in time by t0 results in (see Figure 6.16(b))
x(t − t0 )
CWT
←→
X(a, b − t0 ).
(6.30)
(iii) Scaling in time: Scaling in time by α results in (see Figure 6.16(c))
x(αt)
CWT
←→
1
a b
√ X( , ).
α
α α
(6.31)
(iv) Parseval’s equality: Parseval’s equality holds for the Haar continuous wavelet
transform; we omit it here and revisit it in the general context in (6.112).
(v) Redundancy: Just like for the local Fourier transform, the continuous wavelet
transform maps a function of one variable into a function of two variables. It
is thus highly redundant, and this redundancy is expressed by the reproducing
kernel :
K(a0 , b0 , a, b) = hψa0 ,b0 , ψa,b i,
(6.32)
a four-dimensional function. Figure 6.17 shows the reproducing kernel of the
Haar wavelet, namely K(1, 0, a, b); note that the reproducing kernel is zero at
all dyadic scale points as the wavelets are then orthogonal to each other.
(vi) Characterization of Singularities: This is one of the most important properties
of the continuous wavelet transform, since, by looking at its behavior, we can
infer the type and position of singularities occurring in the function.
For example, assume we are given a Dirac delta function at location t0 ,
x(t) = δ(t − t0 ). At scale a, only those Haar wavelets whose support straddles
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
206
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 6. Wavelet Bases, Frames and Transforms on Functions
a
-3
-2
-1
1
2
3
b
Figure 6.17: The Haar wavelet transform of the Haar wavelet (6.12). This is also the
reproducing kernel, K(1, 0, a, b), of the Haar wavelet.
t0 will produce nonzero coefficients. Because of the form of the Haar wavelet,
the nonzero coefficients will extend over a region of size 2a around t0 . As a →
−∞, these coefficients focus arbitrarily closely on the singularity. Moreover,
these coefficients grow at a specific rate, another way to identify the type of a
singularity. We will go into more details on this later in the chapter.
As a simple example, take x(t) to be the Heaviside function (4.7) with
the step at location t0 . We want to see how the Haar wavelet (6.27) isolates
and characterizes the step singularity. To do that, we will need two things:
(1) The primitive of the wavelet, defined as
θ(t) =
Z
t
−∞
ψ(τ ) dτ =
1/2 − |t|, |t| < 1/2;
0,
otherwise,
(6.33)
that is, a triangle function (4.45) on the interval |t| < 1/2. Note
√ that the
primitive of the scaled and normalized wavelet a−1/2 ψ(t/a) is aθ(t/a), or
a factor a larger due to integration. (2) We also need the derivative of x(t),
which exists only in a generalized sense (using distributions) and can be shown
to be a Dirac delta function at t0 , x′ (t) = δ(t − t0 ).
Now, the continuous wavelet transform of the Heaviside function follows
as
Z ∞
1
t−b
√ ψ
X(a, b) =
x(t) dt
a
a
−∞
∞
Z ∞
√
√
t−b
t−b
(a)
=
aθ
x(t)
−
aθ
x′ (t) dt
a
a
−∞
−∞
Z ∞
√
t−b
(b)
= −
aθ
δ(t − t0 ) dt
a
−∞
t0 − b
(c) √
=
aθ
,
(6.34)
a
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
6.1. Introduction
207
a
1
2
3
4
5
6
b
Figure 6.18: The Haar wavelet transform of a piecewise-polynomial function x(t) as in
(6.35).
where (a) follows from integration by parts; (b) from θ being of compact
support; and (c) from the shifting property of the Dirac delta function in
Table 4.1. Thus, as a → 0, the continuous wavelet transform zooms towards
the singularity and scales as a1/2 , with a shape given by the primitive of
the wavelet θ(t); thus, we may expect a triangle-like region of influence (see
Figure 6.18 at the step discontinuity, t = 4, for illustration).
This discussion focused on the behavior of the Haar wavelet transform
around a point of singularity; what about smooth regions? Take x(t) to be

 t − 1, 1 ≤ t < 2;
2, 2 ≤ t < 4;
x(t) =
(6.35)

0, otherwise,
and the Haar wavelet(6.27). The function x(t) has three singularities, discontinuities at t = 1, 2, 3. The wavelet has 1 zero moment, so it will have a zero
inner product inside the interval [2, 3], where x(t) is constant (see Figure 6.18).
What happens in the interval [1, 2], where x(t) is linear?
Calculate the continuous wavelet transform for some shift b ∈ [1, 2] for a
sufficiently small so that the support of the shifted wavelet [b − a/2, b + a/2] ∈
[1, 2]:
!
Z b
Z b+a/2
1
1
X(a, b) = √
t dt −
t dt = − a3/2 .
(6.36)
a
4
b−a/2
b
Thus, the lack of a second zero moment (which would have produced a zero
inner product) produces a residual of order a3/2 as a → 0.
To study qualitatively the overall behavior, there will be two cones of
influence at singular points 2 and 3, with an order a1/2 behavior as in (6.34),
a constant of order a3/2 in the (1, 2) interval as in (6.36) (which spills over
into the (2, 3) interval), and zero elsewhere, shown in Figure 6.18.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
208
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 6. Wavelet Bases, Frames and Transforms on Functions
Chapter Outline
We now follow the path set through this simple Haar example, and follow with more
general developments. We start in Section 6.2 with iterated filter banks building
scaling functions and wavelets as limits of iterated filters. We study issues of convergence and smoothness of resulting functions. In Section 6.3, we then define and look
into the properties of wavelet series: localization, zero moments and decay of wavelet
coefficients, before considering the characterization of singularities by the decay of
the associated wavelet series coefficients. We study multiresolution analysis, revisiting the wavelet construction from an axiomatic point of view. In Section 6.4, we relax the constraints of nonredundancy to construct wavelet frames, midway between
the nonredundant wavelet series and a completely redundant wavelet transform.
We follow in Section 6.5 with the continuous wavelet transform. Section 6.6 is devoted to computational issues, in particular to Mallat’s algorithm, which allows us
to compute wavelet coefficients with an initial continuous-time projection followed
by a discrete-time, filter-bank algorithm.
Notation used in this chapter: In most of this chapter, we consider real-valued
wavelets only and thus the domain for the scale factor a is R+ ; the extension to
complex wavelets requires simply a ∈ R, a 6= 0.
6.2
Scaling Function and Wavelets from Orthogonal
Filter Banks
In the previous section, we set the stage for this section by examining basic properties of iterated filter banks with Haar filters. The results in this section should
thus not come as a surprise, as they generalize those for Haar filter banks.
We start with an orthogonal filer bank with filters g and h whose properties
were summarized in Table 1.9, Chapter 1, where we used these filters and their shifts
by multiples by two as the basis for ℓ2 (Z). This orthonormal basis was implemented
using a critically-sampled two-channel filter bank with down- and upsampling by
2, an orthogonal synthesis lowpass/highpass filter pair gn , hn and a corresponding
analysis lowpass/highpass filter pair g−n , h−n . We then used these filters and the
associated two-channel filter bank as building blocks for a DWT in Chapter 3. For
example, we saw that in a 3-level iterated Haar filter bank, the lowpass and highpass
at level 3 were given by (3.1c)–(3.1d) and plotted in Figure 3.4; we repeated the lastlevel filters in (6.2). Another example, a 3-level iterated filter bank with Daubechies
filters, was given in Example 3.1.
6.2.1
Iterated Filters
As for the Haar case, we come back to filters and their iterations, and associate
a continuous-time function to the discrete-time sequence representing the impulse
response of the iterated filter.
We assume a length-L orthogonal lowpass/highpass filter pair (gn , hn ), and
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
6.2. Scaling Function and Wavelets from Orthogonal Filter Banks
209
write the equivalent filters at the last level of a J-level iterated filter bank:
G(J) (z) =
J−1
Y
ℓ
J−1
G(z 2 ) = G(J−1) (z) G(z 2
),
(6.37a)
ℓ=0
H (J) (z) =
J−2
Y
ℓ
J−1
G(z 2 )H(z 2
J−1
) = G(J−1) (z) H(z 2
).
(6.37b)
ℓ=0
We know that, by construction, these filters are orthonormal to their shifts by 2J ,
(3.6a), (3.11a), as well as orthogonal to each other, (3.14a).
The equivalent filters g (J) , h(J) have norm 1 and length L(J) , which can be
upper bounded by (see (3.5b))
L(J) ≤ (L − 1) 2J .
6.2.2
(6.38)
Scaling Function and its Properties
(J)
We now associate a piecewise-constant function ϕ(J) (t) to gn so that ϕ(J) (t) is
of finite length and norm 1. Since the number of piecewise segments (equal to the
(J)
number of nonzero coefficients of gn ) grows exponentially with J (see (6.38)), we
−J
choose their width as 2 , upper bounding the length of ϕ(J) (t) by (L − 1):
support ϕ(J) (t) ⊂ [0, L − 1],
(6.39)
where support( ) stands for the interval of the real line where the function is different
(J)
from zero. For ϕ(J) (t) to inherit the unit-norm property from gn , we choose the
(J)
height of the piecewise segments as 2J/2 gn . Then, the nth piece of the ϕ(J) (t)
contributes
Z
(n+1)/2J
(J)
|ϕ
n/2J
2
(t)| dt =
Z
(n+1)/2J
2J (gn(J) )2 dt = (gn(J) )2
n/2J
to ϕ(J) (t). Summing up the individual contributions,
|ϕ(J) (t)|2 =
L(J)
X−1
n=0
Z
(n+1)/2J
n/2J
|ϕ(J) (t)|2 dt =
L(J)
X−1
(gn(J) )2 = 1.
n=0
We have thus defined the piecewise-constant function as
ϕ(J) (t) = 2J/2 gn(J) ,
=
L(J)
X−1
n=0
α3.2 [January 2013] [free version] CC by-nc-nd
n+1
n
≤ t <
,
2J
2J
gn(J) 2J/2 ϕh (2J t − n),
(6.40)
Comments to [email protected]
Fourier and Wavelet Signal Processing
210
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 6. Wavelet Bases, Frames and Transforms on Functions
(1)
1.5
21/2 gn = 21/2 gn
1.5
1.0
1.0
0.5
0.5
0.0
1
2
3
4
5
6
n
-0.5
0.0
ϕ(1) (t)
0.5
1.0
(a)
22/2 gn
1.5
1.0
1.0
0.5
0.5
0.0
2
4
6
8
10
12
n
-0.5
3.0
t
0.0
0.5
1.0
1.5
2.0
2.5
3.0
t
2.0
2.5
3.0
t
2.0
2.5
3.0
t
-0.5
(d)
(3)
23/2 gn
1.5
1.0
1.0
0.5
0.5
0.0
5
10
15
20
n
-0.5
0.0
ϕ(3) (t)
0.5
1.0
1.5
-0.5
(e)
(f)
(4)
1.5
2.5
ϕ(2) (t)
(c)
1.5
2.0
(b)
(2)
1.5
1.5
-0.5
24/2 gn
1.5
1.0
1.0
0.5
0.5
0.0
10
20
30
40
-0.5
n
0.0
ϕ(4) (t)
0.5
1.0
1.5
-0.5
(g)
(h)
(J )
Figure 6.19: Iterated filter 2J/2 gn and associated piecewise-constant function ϕ(J ) (t)
based on a 4-tap Daubechies lowpass filter (3.10) at level (a)–(b) J = 1; (c)–(d) J = 2;
(e)–(f) J = 3; and (g)–(h) J = 4. Note that we have rescaled the equivalent filters’
impulse responses as well as plotted them at different discrete intervals to highlight the
correspondences with their piecewise-constant functions.
where ϕh (t) is the Haar scaling function (box function) from (6.6). We verified that
the above iterated function is supported on a finite interval and has unit norm.49
In Figure 6.19, we show a few iterations of a 4-tap filter from Example 3.1 and
its associated piecewise-constant function. The piecewise-constant function ϕ(J)
has geometrically decreasing piecewise segments and a support contained in the
49 We could have defined a piecewise-linear function instead of a piecewise-constant one, but it
does not change the behavior of the limit we will study.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
6.2. Scaling Function and Wavelets from Orthogonal Filter Banks
211
interval [0, 3]. From the figure it is clear that the smoothness of ϕ(J) (t) depends on
(J)
the smoothness of gn . If the latter tends, as J increases, to a sequence with little
local variation, then the piecewise-constant approximation will tend to a smooth
function as well, as the piecewise segments become finer and finer. On the contrary,
(J)
if gn has too much variation as J → ∞, the sequence of functions ϕ(J) (t) might
not have a limit as J → ∞. This leads to the following necessary condition for the
filter gn , the proof of which is given in Solved Exercise ??:
Theorem 6.1 (Necessity of a zero at π) For the limJ→∞ ϕ(J) (t) to exist, it
is necessary for G(ejω ) to have a zero at ω = π.
As a direct corollary of this result, the necessity of a zero at ω = π translates
also to the necessity of
√
(6.41)
G(ejω )ω=0 = 2,
because of (1.13). We are now ready to define the limit function:
Definition 6.2 (Scaling function) We call the scaling function ϕ(t) the
limit, when it exists, of:
ϕ(t) = lim ϕ(J) (t),
(6.42)
J→∞
Scaling Function in the Fourier Domain We now find the Fourier transform of
ϕ(J) (t), denoted by Φ(J) (ω). The functions ϕ(J) (t) is a linear combination of box
functions, each of width 1/2J and height 2J/2 , where the unit box function is equal
to the Haar scaling function (6.6), with the Fourier transform Φh (ω) as in (6.7).
Using the scaling-in-time property of the Fourier transform (4.55a), the transform
of a box function on the interval [0, 1/2J ) of height 2J/2 is
J+1
(J)
Φh (ω) = 2−J/2 e−jω/2
sin(ω/2J+1 )
.
ω/2J+1
(6.43)
Shifting the nth box to start at t = n/2J multiplies its Fourier transform by
J
e−jωn/2 . Putting it all together, we find
Φ
(J)
(ω) =
(J)
Φh (ω)
L(J)
X−1
(a)
J
J
(J)
e−jωn/2 gn(J) = Φh (ω) G(J) (ejω/2 )
n=0
(b)
(J)
= Φh (ω)
J−1
Y
ℓ=0
ℓ
G(ejω2
/2J
(c)
(J)
) = Φh (ω)
J
Y
ℓ
G(ejω/2 ),
(6.44)
ℓ=1
where (a) follows from the definition of the DTFT (3.78a); (b) from (6.3); and (c)
from reversing the order of the factors in the product.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
212
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 6. Wavelet Bases, Frames and Transforms on Functions
In the sequel, we will be interested in what happens in the limit, when J → ∞.
(J)
For any finite ω, the effect of the interpolation function Φh (ω) becomes negligible
as J → ∞. Indeed in (6.43), both terms dependent on ω tend to 1 as J → ∞ and
only the factor 2−J/2 remains. So, in (6.44), the key term is the product, which
becomes an infinite product, which we now define:
Definition 6.3 (Fourier transform of the infinite product) We call
Φ(ω) the limit, if it exists, of the infinite product:
Φ(ω) = lim Φ(J) (ω) =
J→∞
∞
Y
ℓ
2−1/2 G(ejω/2 ).
(6.45)
ℓ=1
√
The corollary to Theorem √
6.1, (6.41) is now clear; if G(1) > 2, Φ(0) would
grow unbounded, and if G(1) < 2, Φ(0) would be zero, contradicting the fact that
Φ(ω) is the limit of lowpass filters and hence a lowpass function.
A more difficult question is to understand when the limits of the time-domain
iteration ϕ(J) (t) (6.40) and the Fourier-domain iteration Φ(J) (ω) (6.44) form the
Fourier-transform pair. We will show in Example 6.1 that this is a nontrivial question. As the exact conditions are technical and beyond the scope of our text, we
concentrate on those cases when the limits in Definitions 6.2 and 6.3 are well defined
and form a Fourier-transform pair, that is, when
ϕ(t)
FT
←→
Φ(ω),
We now look into the behavior of the infinite product. If Φ(ω) decays sufficiently fast in ω, the scaling function ϕ(t) will be smooth. How this can be done
while maintaining other desirable properties (such as compact support and orthogonality) is the key result for designing wavelet bases from iterated filter banks.
Example 6.1 (Fourier transform of the infinite product) To gain intuition, we now look into examples of filters and their associated infinite products.
(i) Daubechies filter with two zeros at ω = π: We continue with our example
of the Daubechies lowpass filter from (3.10) with its associated piecewiseconstant function in Figure 6.19. In the Fourier-domain product (6.44),
the terms are periodic with periods 4π, 8π, . . ., 2J 2π, since G(ejω ) is 2πperiodic (see Figure 6.20(a)–(c)). We show the product in part (d) of the
figure. The terms are oscillating depending on their periodicity, but the
product decays rather nicely. We will study this decay in detail shortly.
(ii) Length-4 filter designed using lowpass approximation method: Consider the
orthogonal filter designed using the window method in Example 1.2. This
filter does not have a zero at ω = π, since
G(ejω )ω=π ≈ 0.389.
Its iteration is shown in Figure 6.21, with noticeable high-frequency oscillations, prohibiting convergence of the iterated function ϕ(J) (t).
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
6.2. Scaling Function and Wavelets from Orthogonal Filter Banks
1.5
|G(ejω/2 )|
1.5
|G(ejω/4 )|
1.5
1.
1.
1.
0.5
0.5
0.5
0
0
4Π
8Π
12 Π
16 Π
ω
0
(a)
1.2
0
4Π
8Π
|Φ(J ) (ω)|
12 Π
16 Π
ω
(b)
0
213
|G(ejω/8 )|
0
4Π
8Π
12 Π
16 Π
ω
(c)
1.0
0.8
0.6
0.4
0.2
0.0
0
4Π
8Π
12 Π
16 Π
ω
(d)
Figure 6.20: Factors (a) |G(ejω/2 )|, (b) |G(ejω/4 )|, and (c) |G(ejω/8 )| that appear in
(d) the Fourier-domain product Φ(J ) (ω).
φ(J)(t)
1.4
1.2
1
0.8
0.6
0.4
0.2
0
−0.2
−0.4
0
1
2
3
4
5
6
7
Figure 6.21: Iteration of a filter without a zero at ω = π. The high-frequency oscillations
prohibit the convergence of the iterated function ϕ(J ) (t).
(iii) Stretched Haar filter: Instead of the standard Haar filter, consider:
ZT
√1
√1
0
0
g =
←→ G(z) = √12 (1 + z −3 ).
2
2
It is clearly an orthogonal lowpass filter and has one zero at ω = π. However, unlike the Haar filter, its iteration is highly unsmooth. Consider the
equivalent filter after J stages of iteration:
i
1 h
g (J) = J/2 1 0 0 1 . . . 1 0 0 1 .
2
The piecewise-constant function ϕ(J) (t) inherits this lack of smoothness,
and does not converge pointwise to a proper limit, as shown graphically in
Figure 6.22. Considering the frequency domain and the infinite product, it
turns out that L2 convergence fails as well (see Exercise ??).
The examples above show that iterated filters and their associated graphical
functions behave quite differently. The Haar case we saw in the previous section
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
214
1.5
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 6. Wavelet Bases, Frames and Transforms on Functions
ϕ(1) (t)
1.5
ϕ(2) (t)
1.5
1.0
1.0
1.0
0.5
0.5
0.5
0.0
0.5
1.0
1.5
2.0
2.5
3.0
-0.5
t
0.0
0.5
1.0
1.5
2.0
2.5
3.0
t
ϕ(J ) (t)
0.0
-0.5
0.5
1.0
1.5
2.0
2.5
3.0
t
-0.5
(b)
(a)
(c)
Figure
6.22:
Iteration of the stretched Haar filter with impulse response g =
√
√
[1/ 2 0 0 1/ 2]. (a) ϕ(1) (t). (b) ϕ(2) (t). (c) ϕ(J ) (t).
was trivial, the 4-tap filters showed a smooth behavior, and the stretched Haar
filter pointed out potential convergence problems.
In the sequel, we concentrate on orthonormal filters with N ≥ 1 zeros at
ω = π, or
N
1 + e−jω
jω
G(e ) =
R(ejω ),
(6.46)
2
√
with R(ejω )ω=π = 2 for the limit to exist. We assume (1) pointwise convergence
of the iterated function ϕ(J) (t) to ϕ(t), (2) pointwise convergence of the iterated
Fourier-domain function Φ(J) (ω) to Φ(ω), and finally (3) that ϕ(t) and Φ(ω) are a
Fourier-transform pair. In other words, we avoid all convergence issues and concentrate on the well-behaved cases exclusively.
Two-Scale Equation We have seen in Section 6.1.1 that the Haar scaling function
satisfies a two-scale equation 6.8. This is true in general, except that more terms
will be involved in the summation. To show this, we start with the Fourier-domain
limit of the infinite product (6.45):
Φ(ω) =
∞
Y
ℓ
(a)
2−1/2 G(ejω/2 ) = 2−1/2 G(ejω/2 )
ℓ=1
∞
Y
ℓ
2−1/2 G(ejω/2 ),
ℓ=2
(b)
(c)
= 2−1/2 G(ejω/2 ) Φ(ω/2) = 2−1/2
L−1
X
gn e−jωn/2 Φ(ω/2),
n=0
(d)
=
L−1
X
n=0
gn
h
i
2−1/2 e−jωn/2 Φ(ω/2) ,
(6.47)
where in (a) we took one factor out, 2−1/2 G(ejω/2 ); in (b) we recognize the infinite product as Φ(ω/2); (c) follows from the definition of the DTFT (3.78a); and
in (d) we just rearranged the terms. Then, using the scaling-in-time property of
FT
the Fourier transform (4.55a), Φ(ω/2) ←→ 2ϕ(2t), and the shift-in-time property
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
6.2. Scaling Function and Wavelets from Orthogonal Filter Banks
215
2
1.5
1
0.5
0
−0.5
0
20
40
60
80
100
120
140
160
180
200
Figure 6.23: Two-scale equation for the Daubechies scaling function. (a) The scaling
function ϕ(t) and (b) expressed as a linear combination of ϕ(2t − n).
FT
(4.53), e−jωn/2 X(ω) ←→ x(t − n/2), we get the two-scale equation:
ϕ(t) =
X
√ L−1
2
gn ϕ(2t − n),
(6.48)
n=0
shown in Figure 6.23 for the Daubechies 4-tap filter from Examples 6.1 and 6.2.
Smoothness As seen earlier, the key is to understand the infinite product (6.45)
which becomes, using (6.46),
!N
−jω/2ℓ
ℓ
1
+
e
Φ(ω) =
2−1/2
R(ejω/2 )
2
ℓ=1
!!N ∞
ℓ
∞
Y
Y
ℓ
1 + e−jω/2
=
2−1/2 R(ejω/2 ) .
2
ℓ=1
ℓ=1
|
{z
}|
{z
}
∞
Y
A(ω)
(6.49)
B(ω)
Our goal is to see if Φ(ω) has a sufficiently fast decay for large ω. We know from
Chapter 4, (4.82a), that if |Φ(ω)| decays faster than 1/|ω| for large |ω|, then ϕ(t)
is bounded and continuous. Consider first the product
!
ℓ
∞
∞
Y
ℓ
1 + e−jω/2
sin(ω/2)
(a) Y −1/2 1
(b)
√ (1 + e−jω/2 ) = e−jω/2
2
=
,
2
2
ω/2
ℓ=1
ℓ=1
where in (a) we extracted the Haar filter (6.1a), and (b) follows from (6.45) as well
as the Haar case (6.7). The decay of this Fourier transform is of order O(1/|ω|), and
thus, A(ω) in (6.49) decays as O(1/|ω|N ). 50 So, as long as |B(ω)| does not grow
faster than |ω|N −1−ε , ε > 0, the product (6.49) will decay fast enough to satisfy
(4.82a), leading to a continuous scaling function ϕ(t). We formalize this discussion
in the following theorem, the proof of which is given in Solved Exercise ??:
50 In time domain, it is the convolution of N box functions, or a B spline of order N − 1 (see
Chapter 6.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
216
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 6. Wavelet Bases, Frames and Transforms on Functions
Theorem 6.4 (Smoothness of the scaling function) With R(ejω ) as in
(6.46), if
B = sup |R(ejω )| < 2N −1/2 ,
(6.50)
ω∈[0,2π]
then, as J → ∞, the iterated function ϕ(J) (t) converges pointwise to a continuous
function ϕ(t) with the Fourier transform
Φ(ω) =
∞
Y
ℓ
2−1/2 G(ejω/2 ).
ℓ=1
Condition (6.50) is sufficient, but not necessary: many filters fail the test but
still lead to continuous limits (and more sophisticated tests can be used).
If we strengthened the bound to
B < 2N −k−1/2
k ∈ N,
then ϕ(t) would be continuous and k-times differentiable (see Exercise TBD).
Example 6.2 (Smoothness of the scaling function) We now test the continuity condition (6.50) on the two filters we have used most often.
The Haar filter
√
1 + e−jω
G(ejω ) = √12 (1 + e−jω ) =
2 ,
|{z}
2
R(ejω )
√
√
has N = 1 zero at ω = π and R(z) = 2. Thus, B = 2, which does not meet
the inequality in (6.50). According to Theorem 6.4, ϕ(t) may or may not be
continuous (and we know it is not).
The Daubechies filter (3.10)
G(ejω ) =
1 + e−jω
2
2
√1 (1
2
|
+
√
√
3 + (1 − 3)e−jω ),
{z
}
R(ejω )
has N = 2 zero at ω = π. The supremum of |R(ejω | is attained at ω = π,
√
B = sup |R(ejω )| = 6 < 23/2 ,
ω∈[0,2π]
and thus, the scaling function ϕ(t) must be continuous.
Reproduction of Polynomials We have seen in Chapter 6 that splines of degree
N and their shifts can reproduce polynomials of degree up to N . Given that the
scaling functions based on a filter having N zeros at ω = π contain a spline part of
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
6.2. Scaling Function and Wavelets from Orthogonal Filter Banks
217
350
300
250
200
150
100
50
0
−50
−100
0
50
100
150
200
250
300
350
400
450
Figure 6.24: An example of the reproduction of polynomials by the scaling function and
its shifts. The scaling function ϕ(t) is based on the Daubechies filter with two zeros at
π, (3.10), and reproduces the linear function x(t) = t (on an interval because of the finite
number of scaling functions used).
degree (N − 1), linear combinations of {ϕ(t − n)}n∈Z can reproduce polynomials of
degree (N − 1). We illustrate this property in Figure 6.24, where the Daubechies
filter with two zeros at π, (3.10), reproduces the linear function x(t) = t. (We give
the proof of this property later in the chapter.)
Orthogonality to Integer Shifts As we have seen in the Haar case already, the
scaling function is orthogonal to its integer shifts, a property inherited from the
underlying filter:
hϕ(t), ϕ(t − n)it = δn .
(6.51)
Since ϕ(t) is defined through a limit and the inner product is continuous in both
arguments, orthogonality (6.51) follows from the orthogonality of ϕ(J) (t) and its
integer shifts for any J:
hϕ(J) (t), ϕ(J) (t − n)it = δn ,
J ∈ Z+ ,
which follows, in turn, from the same property for the iterated filter
(J)
hϕ
(a)
(J)
(t), ϕ
(J)
LX
−1
= h
(b)
=
=
gn(J) 2J/2 ϕh (2J t
n=0
(J)
L(J)
X−1 L X−1
n=0
(c)
(t − k)it
L(J)
X−1
n=0
(J)
gn(J) gm
Z
− n),
∞
−∞
m=0
(J)
L(J)
X−1
m=0
(6.52)
(J)
gn
in (3.6a):
(J) J/2
gm
2 ϕh (2J (t − k) − m)it
2J ϕh (2J t − n)ϕh (2J t − 2J k − m)) dt
(J)
(d)
gn(J) gn−2J k = hgn(J) , gn−k 2J in = δk ,
where (a) follows from (6.40); in (b) we took the sums and filter coefficients out of
the inner product; (c) from the orthogonality of the Haar scaling functions; and (d)
from the orthogonality of the filters themselves, (3.6a).
The orthogonality (6.51) at scale 0 has counterparts at other scales:
hϕ(2ℓ t), ϕ(2ℓ t − n)it = 2−ℓ δn ,
(6.53)
easily verified by changing the integration variable.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
218
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 6. Wavelet Bases, Frames and Transforms on Functions
6.2.3
Wavelet Function and its Properties
The scaling function we have just seen is lowpass in nature (if the underlying filter
g is lowpass in nature). Similarly to what we have done for the scaling function,
we can construct a wavelet function (or, simply wavelet ) that will be bandpass in
nature (if the underlying filter h is highpass in nature).
(J)
We thus associate a piecewise-constant function ψ (J) (t) to hn , the impulse
response of (6.37b), in such a way that ψ (J) (t) is of finite length and of norm 1;
we use the same arguments as before to determine the width and height of the
piecewise segments, leading to
n
n+1
≤ t <
.
2J
2J
ψ (J) (t) = 2J/2 h(J)
n
(6.54)
(J)
Unlike ϕ(J) (t), our new
object of interest ψ (t) is a bandpass function. In particjω ular, because H(e ) ω=π = 0, its Fourier transform Ψ(ω) satisfies
Ψ(ω)|ω=0 = 0.
Again, we are interested in what happens when J → ∞. Clearly, this involves
an infinite product, but it is the same infinite product we studied for the convergence
of ϕ(J) (t) towards ϕ(t). In short, we assume this question to be settled. The
development parallels the one for the scaling function, with the important twist of
J−1
J−1
consistently replacing the lowpass filter G(z 2 ) by the highpass filter H(z 2 ).
We do not repeat the details, but rather indicate the main points. Equation (6.44)
becomes
J
Y
ℓ
(J)
Ψ(J) (ω) = Φh (ω) H(ejω/2 )
G(ejω/2 ).
(6.55)
ℓ=2
Similarly to the scaling function, we define the wavelet as the limit of ψ (J) (t)
or Ψ (ω), where we now assume that both are well defined and form a Fouriertransform pair.
(J)
Definition 6.5 (Wavelet) Assuming the limit to exist, we define the wavelet
in time and frequency domains to be
ψ(t) = lim ψ (J) (t),
(6.56a)
J→∞
Ψ(ω) = lim Ψ(J) (ω).
(6.56b)
J→∞
From (6.55) and using the steps leading to (6.45), we can write
Ψ(ω) = 2−1/2 H(ejω/2 )
∞
Y
ℓ
2−1/2 G(ejω/2 ).
(6.57)
ℓ=2
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
6.2. Scaling Function and Wavelets from Orthogonal Filter Banks
219
2
1.5
1
0.5
0
−0.5
−1
−1.5
0
0.5
1
1.5
2
2.5
3
2
1.5
1
0.5
0
−0.5
−1
−1.5
0
20
40
60
80
100
120
140
160
180
200
Figure 6.25: Wavelet based on the Daubechies highpass filter (6.60). (a) Wavelet ψ(t)
and (b) the two-scale equation for the wavelet.
Two-Scale Equation Similarly to (6.47), we can rewrite (6.57) as
Ψ(ω) = 2−1/2 H(ejω/2 )Φ(ω/2).
(6.58)
Taking the inverse Fourier transform, we get a relation similar to (6.48), namely
ψ(t) =
X
√ L−1
2
hn ϕ(2t − n),
(6.59)
n=0
the two-scale equation for the wavelet. From the support of ϕ(t) in (6.39), it also
follows that ψ(t) has the same support on [0, L − 1]. To illustrate the two-scale
relation and also show a wavelet, consider the following example.
Example 6.3 (Wavelet and the two-scale equation) Take the Daubechies
lowpass filter (3.10) and construct its highpass via (1.24). It has a double zero
at ω = 0, and is given by:
i
√
√
√
1 h√
H(z) = √ ( 3 − 1) + (3 − 3)z −1 − (3 + 3)z −2 + (1 + 3)z −3 . (6.60)
4 2
Figure 6.25 shows the wavelet ψ(t) and the two-scale equation.
Smoothness Since the wavelet is a finite linear combination of scaling functions
and their shifts as in (6.59), the smoothness is inherited from the scaling function,
as illustrated in Figure 6.25(a).
Zero-Moment Property We assumed that the lowpass filter G(ejω ) had N zeros
(N ≥ 1) at ω = π. Using (1.24) and applying it to (6.46), we get
N
1 − ejω
j(L−1)ω
H(z) = e
R(ej(ω+π) ).
(6.61)
2
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
220
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 6. Wavelet Bases, Frames and Transforms on Functions
It has therefore N zeros at ω = 0. These N zeros carry over directly to Ψ(ω)
because of (6.58) and Φ(ω)|ω=0 = 1. Because of this
dn X(ω) = 0.
(6.62)
dω n ω=0
We can now use the moment property of the Fourier transform, (4.59a), to find the
Fourier-transform pair of the above equation, leading to
Z ∞
tn ψ(t) dt = 0
n = 0, 1, . . . , N − 1.
(6.63)
−∞
In other words, if p(t) is a polynomial function of degree (N − 1), its inner product
with the wavelet at any shift and/or scale will be 0:
hp(t), ψ(at − b)it = 0
for all a, b ∈ R.
(6.64)
Remembering that ϕ(t) is able to reproduce polynomials up to degree (N − 1), it
is a good role split for the two functions: wavelets annihilate polynomial functions
while scaling functions reproduce them.
Orthogonality to Integer Shifts In our quest towards building orthonormal bases
of wavelets, we will need that the wavelet is orthogonal to its integer shifts. The
derivation is analogous to that for the scaling function; we thus skip it here, and
instead just summarize this and other orthogonality conditions:
hψ(2ℓ t), ψ(2ℓ t − n)it = 2−ℓ δn ,
hϕ(2ℓ t), ψ(2ℓ t − n)it = 0.
6.2.4
(6.65a)
(6.65b)
Scaling Function and Wavelets from Biorthogonal Filter
Banks
As we have already seen with filter banks, not all cases of interest are necessarily
orthogonal. In Chapter 1, we designed biorthogonal filter banks to obtain symmetric/antisymmetric FIR filters. Similarly, with wavelets, except for the Haar case,
there exist no orthonormal and compactly-supported wavelet bases that are symmetric/antisymmetric. Since symmetry is often a desirable feature, we need to relax
orthonormality. We thus set the stage here for the biorthogonal wavelet series by
briefly going through the necessary concepts.
To start, we assume a quadruple (hn , gn , e
hn , gen ) of biorthogonal impulse responses satisfying the four biorthogonality relations (1.64a)–(1.64d). We further
require that both lowpass filters have at least one zero at ω = π, and more if
possible:
G(e ) =
jω
1 + e−jω
2
α3.2 [January 2013] [free version] CC by-nc-nd
N
R(e ),
jω
e jω ) =
G(e
1 + e−jω
2
Ne
R̃(ejω ).
(6.66)
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
6.2. Scaling Function and Wavelets from Orthogonal Filter Banks
221
Since the highpass filters are related to the lowpass ones by (1.75), the highpass
e jω ) will have N
e and N zeros at ω = 0, respectively. In the
filters H(ejω ) and H(e
biorthogonal case, unlike in the orthonormal one, there is no implicit normalization,
so we will assume that
√
e jω )
G(ejω )ω=0 = G(e
= 2,
ω=0
e jω ) accordingly.
which can be enforced by normalizing H(ejω ) and H(e
Analogously to the orthogonal case, the iterated filters are given by
G(J) (z) =
J−1
Y
ℓ
e (J) (z) =
G
G(z 2 ),
ℓ=0
J−1
Y
ℓ=0
and we define scaling functions in the Fourier domain as
Φ(ω) =
∞
Y
2
−1/2
G(e
jω/2ℓ
),
ℓ=0
e
Φ(ω)
=
∞
Y
ℓ=0
e 2ℓ ),
G(z
ℓ
e jω/2 ).
2−1/2 G(e
(6.67)
In the sequel, we will concentrate on well-behaved cases only, that is, when the
infinite products are well defined. Also, the iterated time-domain functions corree(J) (z) have well-defined limits ϕ(t) and ϕ(t),
sponding to G(J) (z) and G
e
respectively,
related to (6.67) by Fourier transform.
The two-scale relations follow similarly to the orthogonal case:
Φ(ω) = 2−1/2 G(ejω/2 ) Φ(ω/2),
ϕ(t) =
√
2
L−1
X
n=0
(6.68a)
gn ϕ(2t − n),
(6.68b)
as well as
e
e jω/2 ) Φ(ω/2),
e
Φ(ω)
= 2−1/2 G(e
ϕ(t)
e
=
(6.69a)
X
√ L−1
2
gn ϕ(2t
e − n)a.
(6.69b)
n=0
Example 6.4 (Scaling function and wavelets from linear B-splines)
Choose as the lowpass filter
G(ejω ) =
√ jω
2e
1 + e−jω
2
2
=
1
√ (ejω + 2 + e−jω ),
2 2
which has a double zero at ω = 0 and satisfies the normalization G(ejω )ω=0 =
√
2. Then, using (6.67), we compute Φ(ω) to be (4.47), that is, the Fourier
transform of the triangle function (4.45), or linear B-spline. This is because
G(z) is (up to a normalization and shift) the convolution of the Haar filter with
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
222
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 6. Wavelet Bases, Frames and Transforms on Functions
5
4
1
3
0.8
2
0.6
1
0.4
0
0.2
−1
0
−1
0
1
2
3
−2
0
1
2
3
Figure 6.26: The triangle function and its dual. (a) ϕ(t) from the iteration of
1
(b) ϕ(t)
e from the iteration of 4√
[−1, 2, 6, 2, −1].
2
1
√
[1, 2, 1].
2 2
itself. Thus, the limit of the iterated filter is the convolution of the box function
with itself, the result being shifted to be centered at the origin.
We now search for a biorthogonal scaling function ϕ(t)
e
by finding first a
e
(nonunique) biorthogonal lowpass filter G(z)
satisfying (1.66). Besides the trivial
e
solution G(z)
= 1, the following is a solution as well:
e
G(z)
=
1
1
√ (1 + z)(1 + z −1)(−z + 4 − z −1) = √ (−z 2 + 2z + 6 + 2z −1 − z −2 ),
4 2
4 2
obtained as one possible factorization of C(z) from Example 1.4. The resulting
dual scaling function ϕ(t)
e
looks quite irregular (see Figure 6.26). We could,
e
instead, look for a G(z)
with more zeros at ω = π to obtain a smoother dual
scaling function. For example, choose
e
G(z)
=
1
√ (1 + z)2 (1 + z −1 )2 (3z 2 − 18z + 38 − 18z −1 + 3z −2 ),
64 2
leading to quite a different ϕ(t)
e (see Figure 6.27).
Choosing the highpass filters as in (1.75),
−1
e
H(z) = z G(−z
),
e
H(z)
= z −1 G(−z),
with only a minimal shift, since the lowpass filters are centered around the origin
and symmetric, we get all four functions as in Figure 6.27.
6.3
Wavelet Series
So far, we have considered only a single scale with the two functions ϕ(t) and
ψ(t). Yet, as for the Haar case in Section 6.1, multiple scales are already lurking
in the background through the two-scale equations (6.48),(6.59). And just like in
the DWT in Chapter 3, the real action appears when all scales are considered as
we have already seen with the Haar wavelet series.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
6.3. Wavelet Series
223
1.5
1
1
0.8
0.5
0.6
0.4
0
0.2
−0.5
0
0
0.5
1
−1
1.5
2
3
1.5
2
0
1
2
3
4
5
0
1
2
3
4
5
1
1
0.5
0
0
−1
−0.5
−1
0
2
4
6
−2
8
Figure 6.27: Biorthogonal linear spline basis. (a) The linear B-spline is the triangle
function ϕ(t). (b) The linear B-spline wavelet ψ(t). (c) The dual scaling function ϕ(t).
e
e
(d) The dual wavelet ψ(t).
ψ0,0 (t)
2
2
1
1
2
-2
4
6
8
10
t
-1
-2
-2
ψ2,1 (t)
2
1
2
-2
-1
(a)
ψ−1,2 (t)
4
6
8
10
t
2
-2
4
6
8
10
t
-1
-2
(b)
(c)
Figure 6.28: Example wavelets. (a) The prototype wavelet ψ(t) = ψ0,0 (t); (b) ψ−1,2 (t);
(c) ψ2,1 (t).
6.3.1
Definition of the Wavelet Series
We thus recall
ψℓ,k (t) = 2−ℓ/2 ψ(2−ℓ t − k) =
1
ψ
t − 2ℓ k
2ℓ
,
2ℓ/2
1
t − 2ℓ k
ϕℓ,k (t) = 2−ℓ/2 ϕ(2−ℓ t − k) = ℓ/2 ϕ
,
2ℓ
2
(6.70a)
(6.70b)
for ℓ, k ∈ Z, with the understanding that the basic scaling function ϕ(t) and the
wavelet ψ(t) are no longer Haar, but can be more general. As before, for ℓ = 0, we
have the usual scaling function and wavelet and their integer shifts; for ℓ > 0, the
functions are stretched by a power of 2, and the shifts are proportionally increased;
and for ℓ < 0, the functions are compressed by a power of 2, with appropriately
reduced shifts. Both the scaling function and the wavelet are of unit norm, and
that at all scales (a few examples are given in Figure 6.28).
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
224
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 6. Wavelet Bases, Frames and Transforms on Functions
Two-Scale Equations at Nonconsecutive Scales Since we want to deal with multiple scales (not just two), we extend the two-scale equations for ϕ(t) and ψ(t) across
arbitrary scales that are powers of 2:
(a)
Φ(ω) = 2−1/2 G(ejω/2 ) Φ(ω/2),
(b)
= 2−1 G(ejω/2 )G(ejω/4 ) Φ(ω/4),
(c)
= 2−1 G(2) (ejω/4 ) Φ(ω/4),
..
.
k
= 2−k/2 G(k) (ejω/2 ) Φ(ω/2k ),
ϕ(t) = 2k/2
L−1
X
n=0
gn(k) ϕ(2k t − n),
(6.71a)
(6.71b)
for k ∈ Z+ , where both (a) and (b) follow from the two-scale equation in the Fourier
domain, (6.47); (c) from the expression for the equivalent filter, (6.37a); and (d) by
repeatedly applying the same (see Exercise ??). The last expression is obtained by
applying the inverse DTFT to (6.71a).
Using an analogous derivation for the wavelet, we get
k
Ψ(ω) = 2−k/2 H (k) (ejω/2 ) Φ(ω/2k ),
ψ(t) = 2k/2
L−1
X
n=0
k
h(k)
n ϕ(2 t − n),
(6.72a)
(6.72b)
for k = 2, 3, . . .. The attractiveness of the above expressions lies in their ability
to express any ϕℓ,k (t), ψℓ,k (t), in terms of a linear combination of an appropriately
scaled ϕ(t), where the linear combination is given by the coefficients of an equivalent
(k)
(k)
filter gn or hn . We are now ready for the main result of this chapter:
Theorem 6.6 (Orthonormal basis for L2 (R)) The continuous-time wavelet
ψ(t) satisfying (6.59) and its shifts and scales,
1
t − 2ℓ k
{ψℓ,k (t)} = { √ ψ
},
ℓ, k ∈ Z,
(6.73)
2ℓ
2ℓ
form an orthonormal basis for the space of square-integrable functions, L2 (R).
Proof. To prove the theorem, we must prove that (i) {ψℓ,k (t)}ℓ, k∈Z is an orthonormal
set and (ii) it is complete. The good news is that most of the hard work has already
been done while studying the DWT in Theorem 3.2, Chapter 3.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
6.3. Wavelet Series
225
(i) We have already shown that the wavelets and their shifts are orthonormal at a
single scale, (6.65a), and need to show the same across scales:
(a)
hψℓ,k (t), ψm,n (t)it = 2−ℓ hψ0,k (τ ), ψ−i,n (τ )iτ ,
X (i)
(b)
= 2−ℓ h2i/2
hn ϕ−i,2i k+n (τ ), ψ−i,n (τ )iτ ,
n
(c)
−ℓ+i/2
= 2
X
n
h(i)
n hϕ−i,2i k+n (τ ), ψ−i,n (τ )iτ = 0,
where (a) follows from assuming (without loss of generality) ℓ = m + i, i > 0, and
change of variable t = 2ℓ τ ; (b) from two-scale equation for the wavelet (6.72b);
and (c) from the linearity of the inner product as well as orthogonality of the
wavelet and scaling function (6.65a).
(ii) The proof of completeness is more involved, and thus, we show it only for the
Haar case. Further Reading gives pointers to texts with the full proof. Consider a
unit-norm function x(t) such that x(t) = 0 for t < 0 with finite length at most 2J
for some J ∈ Z.51 We approximate x(t) by a piecewise-constant approximation
at scale ℓ, (where ℓ ≪ J), or
x(ℓ) (t) = 2−ℓ
(a)
=
=
2ℓ (k+1)
X
k∈Z
2ℓ k ≤ t < 2ℓ (k + 1),
x(τ ) ϕℓ,k (τ ) dτ ϕℓ,k (t),
x(τ ) dτ,
2ℓ k
X Z
k∈Z
(b)
Z
τ ∈R
(c)
hx, ϕℓ,k iϕℓ,k (t) =
X
(ℓ)
αk ϕℓ,k (t),
(6.74a)
k∈Z
where (a) follows from (6.16b); (b) from the definition of the inner product; and
(ℓ)
in (c) we introduced αk = hx, ϕℓ,k i.
(ℓ)
Because of the finite-length assumption of x(t), the sequence αk is also of
J −ℓ
finite length (of degree 2
). Since x(t) is of norm 1 and the approximation in
(ℓ)
(6.74a) is a projection, kαk k ≤ 1. Thus, we can apply Theorem 3.2 and represent
(ℓ)
the sequence αk by discrete Haar wavelets only
(ℓ)
αk
=
X X
(i)
βn(i) hk−2i n .
n∈Z i∈Z+
Since the expression (6.74a) is the piecewise-constant interpolation of the sequence
(ℓ)
αk , together with proper scaling and normalization, by linearity, we can apply
(ℓ)
this interpolation to the discrete Haar wavelets used to represent αk , which leads
51 Both of these restrictions are inconsequential; the former because a general function can be
decomposed into a function nonzero on t > 0 and t ≤ 0, the latter because the fraction of the
energy of x(t) outside of the interval under consideration can be made arbitrarily small by making
J arbitrarily large.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
226
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 6. Wavelet Bases, Frames and Transforms on Functions
to a continuous-time Haar wavelet representation of x(ℓ) (t):
X (ℓ)
x(ℓ) (t) =
αk ϕℓ,k (t),
k∈Z
=
XX X
(i)
βn(i) hk−2i n ϕℓ,k (t),
k∈Z n∈Z i∈Z+
=
X X
βn(i) ψℓ+i,n2i (t).
n∈Z i∈Z+
This
follows from h(i) being of length 2i . Thus, for a fixed n and
P last statement
(i)
i, k∈Z hk−2i n ϕℓ,k (t) will equal the Haar wavelet of length 2i 2ℓ = 2i+ℓ at shift
n2i . Again by Theorem 3.2, this representation is exact.
What remains to be shown is that x(ℓ) (t) can be made arbitrarily close, in
2
L norm, to x(t). This is achieved by letting ℓ → −∞ and using the fact that
piecewise-constant functions are dense in L2 (R);52 we get
lim kx(t) − x(ℓ) (t)k = 0.
i→−∞
TBD: Might be expanded.
The proof once more shows the intimate relation between the DWT from Chapter 3
and the wavelet series from this chapter.
Definition We can now formally define the wavelet series:
Definition 6.7 (Wavelet series) The wavelet series of a function x(t) is a
function of ℓ, k ∈ Z given by
Z ∞
(ℓ)
βk = hx, ψℓ,k i =
x(t) ψℓ,k (t) dt,
ℓ, k ∈ Z,
(6.75a)
−∞
with ψℓ,k (t) the prototype wavelet. The inverse wavelet series is given by
X X (ℓ)
x(t) =
βk ψℓ,k (t).
(6.75b)
ℓ∈Z k∈Z
In the above, β (ℓ) are the wavelet coefficients.
To denote such a wavelet series pair, we write:
x(t)
WS
←→
(ℓ)
βk .
We derived such bases already and we will see other constructions when we
talk about multiresolution analysis.
52 That is, any L2 function can be approximated arbitrarily closely by a piecewise-constant
function over intervals that tend to 0. This is a standard result but technical, and thus we just
use it without proof.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
6.3. Wavelet Series
6.3.2
227
Properties of the Wavelet Series
We now consider some of the properties of the wavelet series. Many follow from the
properties of the wavelet (Section 6.2.3) or of the DWT (Chapter 3), and thus our
treatment will be brief.
Linearity
The wavelet series operator is a linear operator, or,
a x(t) + b y(t)
WS
(ℓ)
←→
(ℓ)
a βk + b β k .
(6.76)
Shift in Time A shift in time by 2m n, m, n ∈ Z, results in
x(t − 2m n)
WS
←→
(ℓ)
βk−2m n ,
ℓ ≤ m.
(6.77)
This is a restrictive condition as it holds only for scales smaller than m. In other
words, only a function x(t) that has a scale-limited expansion, that is, it can be
written as
m X
X
(ℓ)
x(t) =
βk ψℓ,k (t),
ℓ=−∞ k∈Z
will possess the shift-in-time property for all (of its existing) scales. This is a
counterpart to the shift-in-time property of the DWT, (3.17), and the fact that the
DWT is periodically shift varying.
Scaling in Time
Scaling in time by 2−m , m ∈ Z, results in
x(2−m t)
WS
←→
(ℓ−m)
2m/2 βk
.
(6.78)
Parseval’s Equality The wavelet series operator is a unitary operator and thus
preserves the Euclidean norm (see (2.53)):
Z ∞
X X (ℓ)
kxk2 =
|x(t)|2 dt =
|βk |2 .
(6.79)
−∞
ℓ∈Z k∈Z
Time-Frequency Localization Assume that the wavelet ψ(t) is centered around
t = 0 in time and ω = 3π/4 in frequency (that is, it is a bandpass filter with the
support of approximately [π/2, π]). Then, from (6.70a), ψℓ,0 (t) is centered around
ω = 2−ℓ (3π/4) in frequency (see Figure 6.29).
With our assumption of g being a causal FIR filter of length L, the support in
time of the wavelets is easy to characterize. Since the support of ψ(t) is [0, L − 1],
support(ψℓ,k (t)) ⊆ [2ℓ k, 2ℓ (k + L − 1)).
(6.80)
Because of the FIR assumption, the frequency localization is less precise (no compact support in frequency), but the center frequency is around 2−ℓ (3π/4) and the
passband is mostly in an octave band,
π
support(Ψℓ,k (ω)) ∼ [2−ℓ , 2−ℓ π].
(6.81)
2
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
228
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 6. Wavelet Bases, Frames and Transforms on Functions
Figure 6.29: Time-frequency localization of wavelet basis functions. Three wavelets are
highlighted: at scale ℓ = 0, ψ(t); at scale ℓ = −1, a higher-frequency wavelet ψ−1,7 (t);
and at scale ℓ = 2, a lower-frequency wavelet ψ2,1 (t). These are centered along the dyadic
sampling grid [2ℓ k 2−ℓ (3π/4)], for ℓ, k ∈ Z.
Characterization of Singularities As we have seen with the example of Haar
wavelet series (see Figure 6.8), one of the powerful features of the wavelet series
is its ability to characterize both the position and type of singularities present in a
function.
Consider a function with the simplest singularity, a Dirac delta function at a
location t0 , that is, x(t) = δ(t − t0 ). At scale ℓ, only wavelets having their support
(6.80) straddling t0 will produce nonzero coefficients,
(ℓ)
βk
6= 0
for
⌊t0 /2ℓ ⌋ − L < k ≤ ⌊t0 /2ℓ ⌋.
(6.82)
Thus, there are L nonzero coefficients at each scale. These coefficients correspond
to a region of size 2ℓ (L − 1) around t0 , or, as ℓ → −∞, they focus arbitrarily closely
on the singularity. What about the size of the coefficients at scale ℓ? The inner
product of the wavelet with a Dirac delta function simply picks out a value of the
wavelet. Because of the scaling factor 2−ℓ/2 in (6.70a), the nonzero coefficients will
be of order
(ℓ)
|βk | ∼ O(2−ℓ/2 )
(6.83)
for the range of k in (6.82). That is, as ℓ → −∞, the nonzero wavelet series
coefficients zoom in onto the discontinuity, and they grow at a specific rate given
by (6.83). An example for the Haar wavelet was shown in Figure 6.8.
Generalizing the Dirac delta function singularity, a function is said to have an
nth-order singularity at t0 when its nth-order derivative has a Dirac delta function
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
6.3. Wavelet Series
229
component at t0 . The scaling (6.83) for a zeroth-order singularity is an example of
the following result:
Theorem 6.8 (Scaling behavior around singularities) Given a wavelet
ψ(t) with N zero moments, around a singularity of order n, 0 ≤ n ≤ N , the
(ℓ)
wavelet series coefficients βk behave as
(ℓ) ℓ → −∞.
(6.84)
βk ∼ O(2ℓ(n−1/2) ),
Proof. We have analyzed n = 0 earlier. We now give a proof for n = 1; generalizing to
n > 1 is the topic of Exercise 6.3.
Assume the wavelet has at least one zero moment, N ≥ 1. A function with a
first-order singularity at t0 looks like a Heaviside function (4.7) locally (at t0 ). We can
reduce the analysis to n = 0 by considering the derivative x′ (t), which is a Dirac delta
function at t0 . We use the fact that ψ has at least one zero moment and is of finite
support. Then, as in (6.34), using integration by parts,
Z ∞
Z ∞
hx(t), ψ(t)it =
ψ(t)x(t) dt = −
θ(t)x′ (t) dt
−∞
−∞
= −hx′ (t), θ(t)it
hx(t), ψℓ,k (t)it = −hx′ (t), θℓ,k (t)it ,
Rt
where θ(t) = −∞ ψ(τ ) dτ is the primitive of ψ(t), θℓ,k (t) is the primitive of ψℓ,k (t), and
x′ (t) is the derivative of x(t). Because ψ(t) has at least one zero at ω = 0 and is of
finite support, its primitive is well defined and also of finite support. The key is now
the scaling behavior of θℓ,k (t) with respect to θ(t). Evaluating
θℓ,k (t) =
Z
t
−∞
2−ℓ/2 ψ(2−ℓ τ − k) dτ = 2ℓ/2
Z
2−ℓ t−k
−∞
ψ(t′ ) dt′ = 2ℓ/2 θ(2−ℓ t − k),
we see that this scaling is given by 2ℓ/2 . Therefore, the wavelet coefficients scale as
(ℓ) ′
βk = |hx(t), ψℓ,k (t)it | = −hx (t), θℓ,k (t)it (6.85)
∼ 2ℓ/2 hδ(t − t0 ), θ(2−ℓ t − k)it ∼ O(2ℓ/2 ),
at fine scales and close to t0 .
Zero-Moment Property When the lowpass filter g has N zeros at ω = π, we
verified that ψ(t) has N zero moments (6.63). This property carries over to all
scaled versions of ψ(t), and thus, for any polynomial function p(t) of degree smaller
than N ,
(ℓ)
βk = hp(t), ψℓ,k (t)it = 0.
This allows us to prove the following result:
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
230
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 6. Wavelet Bases, Frames and Transforms on Functions
Theorem 6.9 (Decay of wavelet series coefficients for x ∈ CN ) For a
function x(t) with N continuous and bounded derivations, that is, x ∈ C N , the
wavelet series coefficients decay as
(ℓ) βk ≤ α2mN
for some constant α > 0 and m → −∞.
Proof. Consider the Taylor series expansion of x(t) around some point t0 . Since x(t)
has N continuous derivatives,
x′′ (t0 ) 2
x(N−1) (t0 ) N−1
x′ (t0 )
ε+
ε + ··· +
ε
+ RN (ε),
1!
2!
(N − 1)!
= p(t) + RN (ε),
x(t0 + ε) = x(t0 ) +
where
|RN (ε)| ≤
εN
N!
sup
t0 ≤t≤t0 +ε
(N) x (t) ,
and we view it as a polynomial p(t) of degree (N − 1) and a remainder RN (ε). Because
of the zero-moment property of the wavelet,
(ℓ) βk = |hx(t), ψm,n (t)i| = |hp(t) + RN (ε), ψm,n (t)i| = |hRN (ε), ψm,n (t)i| ,
that is, the inner product with the polynomial term is zero, and only the remainder
matters. To minimize the upper bound on |hRN (ε), ψm,n i|, we want t0 close to the
center of the wavelet. Since the spacing of the sampling grid at scale ℓ is 2ℓ , we see
that ε is at most 2ℓ and thus |hRN (ε), ψℓ,k i| has an upper bound of order 2ℓN .
A stronger result, in which N is replaced by N + 1/2, follows from Theorem 6.17
in the context of the continuous wavelet transform.
6.3.3
Multiresolution Analysis
We have already introduced the concept of multiresolution analysis with the Haar
scaling function and wavelet in Section 6.1. As opposed to having a discrete-time
filter and constructing a continuous-time basis from it, multiresolution analysis does
the opposite: it starts from the multiresolution spaces to build the wavelet series.
For example, we saw that the continuous-time wavelet basis generated a partition
of L2 (R) into a sequence of nested spaces
. . . ⊂ V (2) ⊂ V (1) ⊂ V (0) ⊂ V (−1) ⊂ V (−2) ⊂ . . . ,
and that these spaces were all scaled copies of each other, that is, V (ℓ) is V (0) scaled
by 2ℓ . We will turn the question around and ask: assuming we have a sequence of
nested and scaled spaces as above, does it generate a discrete-time filter bank? The
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
6.3. Wavelet Series
231
answer is yes; the framework is multiresolution analysis we have seen in the Haar
case. We present it shortly in its more general form, starting with the axiomatic
definition and followed by examples.
The embedded spaces above are very natural for piecewise-polynomial functions over uniform intervals of length 2ℓ . For example, the Haar case leads to
piecewise-constant functions. The next higher order is for piecewise-linear functions, and so on. The natural bases for such spaces are B-splines we discussed in
Chapter 6; these are not orthonormal bases, requiring the use of orthogonalization
methods.
Axioms of Multiresolution Analysis We now summarize the fundamental characteristics of the spaces and basis functions seen in the Haar case. These are also the
axioms of multiresolution analysis.
(i) Embedding: We work with a sequence of embedded spaces
. . . ⊂ V (2) ⊂ V (1) ⊂ V (0) ⊂ V (−1) ⊂ V (−2) ⊂ . . . ,
(6.86a)
where V (ℓ) is the space of piecewise-constant functions over [2ℓ k, 2ℓ (k + 1))k∈Z
with finite L2 norm. We call the V (ℓ) s successive approximation spaces, since
as ℓ → −∞, we get finer and finer approximations.
(ii) Upward Completeness: Since piecewise-constant functions over arbitrarilyshort intervals are dense in L2 (see Footnote 52),
[
lim V (ℓ) =
V (ℓ) = L2 (R).
(6.86b)
ℓ∈Z
ℓ→−∞
(iii) Downward Completeness: As ℓ → ∞, we get coarser and coarser approximations. Given a function x(t) ∈ L2 (R), its projection onto V (ℓ) tends to zero as
ℓ → ∞, since we lose all the details. More formally,
\
V (ℓ) = {0}.
(6.86c)
ℓ∈Z
(iv) Scale Invariance: The spaces V (ℓ) are just scaled versions of each other,
x(t) ∈ V (ℓ)
⇔
x(2m t) ∈ V (ℓ−m) .
(6.86d)
(v) Shift Invariance: Because x(t) is a piecewise-constant function over intervals
[2ℓ k, 2ℓ (k + 1)), it is invariant to shifts by multiples of 2ℓ ,
x(t) ∈ V (ℓ)
⇔
x(t − 2ℓ k) ∈ V (ℓ) .
(6.86e)
(vi) Existence of a Basis: There exists ϕ(t) ∈ V (0) such that
{ϕ(t − k)}k∈Z
(6.86f)
is a basis for V (0) .
The above six characteristics, which naturally generalize the Haar multiresolution
analysis, are the defining characteristics of a broad class of wavelet systems.
Definition 6.10 (Multiresolution analysis) A sequence {V^(ℓ)}_{ℓ∈Z} of subspaces of L2(R) satisfying (6.86a)–(6.86f) is called a multiresolution analysis. The spaces V^(ℓ) are called the successive approximation spaces, while the spaces W^(ℓ), defined as the orthogonal complements of V^(ℓ) in V^(ℓ−1), that is,
    V^(ℓ−1) = V^(ℓ) ⊕ W^(ℓ),    (6.87)
are called the successive detail spaces.

For simplicity, we will assume the basis in (6.86f) to be orthonormal; we cover the general case in Solved Exercise ??.
The two-scale equation (6.48) follows naturally from the scale-invariance axiom (iv). What can we say about the coefficients g_n? Evaluate
    δ_k (a)= ⟨ϕ(t), ϕ(t − k)⟩_t
        (b)= 2 Σ_{n∈Z} Σ_{m∈Z} g_n g_m ⟨ϕ(2t − n), ϕ(2t − 2k − m)⟩_t
        (c)= Σ_{n∈Z} g_n g_{n−2k},
where (a) is true by assumption; in (b) we substituted the two-scale equation (6.48) for both ϕ(t) and ϕ(t − k); and (c) follows from ⟨ϕ(2t − n), ϕ(2t − 2k − m)⟩_t = 0 except for n = 2k + m, when it is 1/2. We thus conclude that the sequence g_n corresponds to an orthogonal filter (1.13). Assuming that the Fourier transform Φ(ω) of ϕ(t) is continuous and satisfies^53
    |Φ(0)| = 1,
it follows from the two-scale equation in the Fourier domain that
    |G(1)| = √2,
making g_n a lowpass sequence. Assume it to be of finite length L and derive the equivalent highpass filter using (1.24). Defining the wavelet as in (6.59), we have:
Theorem 6.11 The wavelet given by (6.59) satisfies
    ⟨ψ(t), ψ(t − n)⟩_t = δ_n,
    ⟨ψ(t), ϕ(t − n)⟩_t = 0,
and W^(0) = span({ψ(t − n)}_{n∈Z}) is the orthogonal complement of V^(0) in V^(−1),
    V^(−1) = V^(0) ⊕ W^(0).    (6.88)

^53 If ϕ(t) is integrable, this follows from upward completeness (6.86b), for example.
We do not prove the theorem but rather just discuss the outline of a proof. The orthogonality relations follow from the orthogonality of the sequences g_n and h_n by using the two-scale equations (6.48) and (6.59). That {ψ(t − n)}_{n∈Z} is an orthonormal basis for W^(0) requires checking completeness and is more technical.
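As a quick sanity check of the two conditions just derived, the short numerical sketch below verifies Σ_n g_n g_{n−2k} = δ_k and |G(1)| = √2 for a concrete orthogonal lowpass filter; the Daubechies length-4 filter used here is only an illustrative choice, not one prescribed by the text.

```python
import numpy as np

# A minimal numerical check of the orthogonality and lowpass conditions,
# using the Daubechies length-4 lowpass filter purely as an example.
s3 = np.sqrt(3.0)
g = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2.0))

# sum_n g_n g_{n-2k} = delta_k for all shifts k that can give a nonzero value
for k in (-1, 0, 1):
    val = sum(g[n] * g[n - 2 * k] for n in range(4) if 0 <= n - 2 * k < 4)
    print(k, round(val, 12))          # 1 for k = 0, 0 otherwise

# |G(1)| = |sum_n g_n| = sqrt(2), the lowpass normalization
print(g.sum())                        # ~1.41421
```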
By construction, and in parallel to (6.86d), the spaces W^(ℓ) are just scaled versions of each other,
    x(t) ∈ W^(ℓ)  ⇔  x(2^m t) ∈ W^(ℓ−m).    (6.89)
Putting all the pieces above together, we have:
Theorem 6.12 (Wavelet basis for L2(R)) Given a multiresolution analysis of L2(R) from Definition 6.10, the family
    ψ_{ℓ,k}(t) = 2^{−ℓ/2} ψ((t − 2^ℓ k)/2^ℓ),    ℓ, k ∈ Z,
with ψ(t) as in (6.59), is an orthonormal basis for L2(R).
Proof. Scaling (6.88) using (6.86d), we get that V^(ℓ) = V^(ℓ+1) ⊕ W^(ℓ+1). Iterating it n times leads to
    V^(ℓ) = W^(ℓ+1) ⊕ W^(ℓ+2) ⊕ . . . ⊕ W^(ℓ+n) ⊕ V^(ℓ+n).
As n → ∞ and because of (6.86c), we get^54
    V^(ℓ) = ⊕_{i=ℓ+1}^∞ W^(i),
and finally, letting ℓ → −∞ and because of (6.86b), we obtain
    L2(R) = ⊕_{ℓ∈Z} W^(ℓ).    (6.90)
Since {ψ(t − k)}_{k∈Z} is an orthonormal basis for W^(0), by scaling, {ψ_{ℓ,k}(t)}_{k∈Z} is an orthonormal basis for W^(ℓ). Then, following (6.90), the family {ψ_{ℓ,k}(t)}_{ℓ,k∈Z} is an orthonormal basis for L2(R).
Thus, in a fashion complementary to Section 6.1, we obtain a split of L2 (R) into
a collection {W (ℓ) }ℓ∈Z as a consequence of the axioms of multiresolution analysis
(6.86a)–(6.86f) (see Figure 6.10 for a graphical representation of the spaces V (ℓ) and
W (ℓ) ). We illustrate our discussion with examples.
^54 In the infinite sum, we imply closure.
Figure 6.30: Sinc scaling function and wavelet. (a) Scaling function ϕ(t). (b) Magnitude
Fourier transform |Φ(ω)|. (c) Wavelet ψ(t). (d) Magnitude Fourier transform |Ψ(ω)|.
Examples
Example 6.5 (Sinc multiresolution analysis) Let V^(0) be the space of L2 functions bandlimited to [−π, π), for which we know that
    ϕ(t) = sin(πt)/(πt)    (6.91)
and its integer shifts form an orthonormal basis. Define V^(ℓ) to be the space of L2 functions bandlimited to [−2^{−ℓ}π, 2^{−ℓ}π). These are nested spaces of bandlimited functions, which obviously satisfy (6.86a), as they do the remaining axioms of multiresolution analysis (6.86b)–(6.86f): the union of the V^(ℓ)s is L2(R), their intersection is the trivial space {0}, the spaces are scaled versions of each other, and they are shift invariant with respect to shifts by integer multiples of 2^ℓ. The existence of the basis we stated in (6.91). The details are left as Exercise ??, including the derivation of the wavelet and of the detail spaces W^(ℓ), the spaces of L2 bandpass functions supported on
    [−2^{−ℓ+1}π, −2^{−ℓ}π) ∪ [2^{−ℓ}π, 2^{−ℓ+1}π).    (6.92)
Figure 6.30 shows the sinc scaling function and wavelet both in time as well as
Fourier domains.
While the perfect bandpass spaces lead to a bona fide multiresolution analysis of L2(R), the basis functions have slow decay in time. Since the Fourier transform is discontinuous, the tails of the scaling function and the wavelet decay only as O(1/t) (as can be seen in the sinc function (6.91)). We will see possible remedies to this problem in later examples.
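The slow decay is easy to see numerically: at half-integer points the sinc scaling function attains its O(1/t) envelope exactly. The following minimal sketch (the sample points are an arbitrary choice) illustrates this.

```python
import numpy as np

# Tail decay of the sinc scaling function phi(t) = sin(pi t)/(pi t).
# At half-integer points |sin(pi t)| = 1, so |phi(t)| equals its 1/(pi |t|)
# envelope exactly, decaying only as O(1/t).
t = np.array([10.5, 100.5, 1000.5])
phi = np.sin(np.pi * t) / (np.pi * t)
print(np.abs(phi))            # ~3.0e-2, 3.2e-3, 3.2e-4
print(1.0 / (np.pi * t))      # the O(1/t) envelope: identical values
```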
Example 6.6 (Piecewise-linear multiresolution analysis) Let V^(0) be the space of continuous L2 functions that are piecewise linear over intervals [k, k + 1); that is, x(t) ∈ V^(0) if ‖x‖ < ∞ and x′(t) is piecewise constant over intervals [k, k + 1).
For simplicity, consider functions x(t) such that x(t) = 0 for t < 0. Then x′(t) is specified by the sequence {a_k}, the slopes of x(t) over the intervals [k, k + 1), for k ∈ N. The nodes of x(t), that is, the values at the integers, are given by
    x(k) = 0 for k ≤ 0,  and  x(k) = Σ_{i=0}^{k−1} a_i for k > 0,
and the piecewise-linear function is
    x(t) = [x(k + 1) − x(k)](t − k) + x(k) = a_k (t − k) + Σ_{i=0}^{k−1} a_i    (6.93)
for t ∈ [k, k + 1) (see Figure 6.31).
The spaces V (ℓ) are simply scaled versions of V (0) ; they contain functions
that are continuous and piecewise linear over intervals [2ℓ k, 2ℓ (k + 1)). Let us
verify the axioms of multiresolution.
(i) Embedding: Embedding as in (6.86a) is clear.
(ii) Upward Completeness: Similarly to the piecewise-constant case, piecewise-linear functions are dense in L2(R) (see Footnote 52), and thus upward completeness (6.86b) holds.
(iii) Downward Completeness: Conversely, as ℓ → ∞, the approximation gets coarser and coarser, ultimately verifying downward completeness (6.86c).
(iv) Scale Invariance: Scaling (6.86d) is clear from the definition of the piecewise-linear functions over intervals scaled by powers of 2.
(v) Shift Invariance: Similarly, shift invariance (6.86e) is clear from the definition of the piecewise-linear functions over intervals scaled by powers of 2.
(vi) Existence of a Basis: What remains is to find a basis for V^(0). As an educated guess, take the triangle function from (4.45) shifted by 1 to the right and call it θ(t) (see Figure 6.32(a)). Then x(t) in (6.93) can be written as
    x(t) = Σ_{k=0}^∞ b_k θ(t − k),
with b_0 = a_0 and b_k = a_k + b_{k−1}. We prove this as follows: First,
    θ′(t) = ϕ_h(t) − ϕ_h(t − 1),
where ϕ_h(t) is the Haar scaling function, the indicator function of the unit interval. Thus, x′(t) is piecewise constant. Then, the value of the constant between k and k + 1 is (b_k − b_{k−1}) and thus equals a_k, as desired. The only detail is that θ(t) is clearly not orthogonal to its integer translates, since
    ⟨θ(t), θ(t − k)⟩_t = 2/3 for k = 0;  1/6 for k = −1, 1;  0 otherwise.
We can apply the orthogonalization procedure in (??). The z-transform of the sequence {1/6, 2/3, 1/6} is
    (1/6)(z + 4 + z^{−1}) (a)= ((2 + √3)/6) [1 + (2 − √3)z] [1 + (2 − √3)z^{−1}],
where the factor [1 + (2 − √3)z] is left sided and [1 + (2 − √3)z^{−1}] is right sided, and (a) follows because the left-hand side is a deterministic autocorrelation, positive on the unit circle, and can thus be factored into its spectral roots. Choosing just the right-sided part, with the impulse response
    α_k = √((2 + √3)/6) (−1)^k (2 − √3)^k,    k ≥ 0,
leads to
    ϕ_c(t) = Σ_{k=0}^∞ α_k θ(t − k),
a function such that ϕ_c(t) = 0 for t < 0 and orthonormal to its integer translates. It is piecewise linear over integer pieces, but of infinite extent (see Figure 6.32).
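The spectral factorization above is easy to confirm numerically; the following short sketch checks the factored form against the autocorrelation {1/6, 2/3, 1/6} on the unit circle (the grid of frequencies is an arbitrary choice).

```python
import numpy as np

# Numerical check of the spectral factorization of the autocorrelation
# {1/6, 2/3, 1/6}: (z + 4 + z^{-1})/6 = c (1 + r z)(1 + r z^{-1})
# with r = 2 - sqrt(3) and c = (2 + sqrt(3))/6.
r = 2 - np.sqrt(3.0)
c = (2 + np.sqrt(3.0)) / 6
w = np.linspace(-np.pi, np.pi, 7)
z = np.exp(1j * w)
lhs = (z + 4 + 1 / z) / 6
rhs = c * (1 + r * z) * (1 + r / z)
print(np.max(np.abs(lhs - rhs)))   # ~1e-16: the two sides agree
```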
Instead of the spectral factorization, we can just take the square root as in (??). In the Fourier domain,
    2/3 + (1/6)e^{jω} + (1/6)e^{−jω} = (1/3)(2 + cos ω).
Then,
    Φ_s(ω) = √3 Θ(ω) / (2 + cos ω)^{1/2}
is the Fourier transform of a symmetric and orthogonal scaling function ϕ_s(t) (see Figure 6.32(c)).
Because of the embedding of the spaces V (ℓ) , the scaling functions all satisfy
two-scale equations (Exercise ??). Once the two-scale equation coefficients are
derived, the wavelet can be calculated in the standard manner. Naturally, since
the wavelet is a basis for the orthogonal complement of V (0) in V (−1) , it will be
piecewise linear over half-integer intervals (Exercise ??).
Example 6.7 (Meyer multiresolution analysis) The idea behind Meyer's wavelet construction is to smooth the sinc solution in the Fourier domain, so as to obtain faster decay of the basis functions in the time domain. The simplest way to do this is to let the squared Fourier transform magnitude of the scaling function, |Φ(ω)|^2, decay linearly to zero, that is,
    |Φ(ω)|^2 = 1 for |ω| < 2π/3;  2 − 3|ω|/(2π) for 2π/3 < |ω| < 4π/3;  0 otherwise.    (6.94)
We start by defining a function orthonormal to its integer translates and the
space V (0) spanned by those, axiom (vi).
Figure 6.31: (a) A continuous and piecewise-linear function x(t) and (b) its derivative
x′ (t).
Figure 6.32: Basis function for piecewise-linear spaces. (a) The nonorthogonal basis
function θ(t). (b) An orthogonalized basis function ϕc (t) such that ϕc (t) = 0 for t < 0.
(c) An orthogonalized symmetric basis function ϕs (t).
(i) Existence of a Basis: The basis function is shown in Figure 6.33, where
we also show graphically that (??) holds, proving that {ϕ(t − k)}k∈Z is an
orthonormal set. We now define V (0) to be
V (0) = span ({ϕ(t − k)}k∈Z ) .
(ii) Upward Completeness: Define V (ℓ) as the scaled version of V (0) . Then
(6.86b) holds, similarly to the sinc case.
(iii) Downward Completeness: Again, (6.86c) holds.
(iv) Scale Invariance: Holds by construction.
(v) Shift Invariance: Holds by construction.
(vi) Embedding: To check V (0) ⊂ V (−1) we use Figure 6.34 to see that V (0)
is perfectly represented in V (−1) . This means we can find a 2π-periodic
Figure 6.33: Meyer scaling function, with a piecewise linear squared Fourier transform
magnitude. (a) The function |Φ(ω)|2 . (b) Proof of orthogonality by verifying (??).
function G(ejω ) to satisfy the two-scale equation in Fourier domain (6.47),
illustrated in Figure 6.35.
Now that we have verified the axioms of multiresolution analysis, we can construct the wavelet. From (6.94), (6.47) and the figure, the DTFT of the discrete-time filter g_n is
    |G(e^{jω})| = √2 for |ω| ≤ π/3;  √(4 − 6|ω|/π) for π/3 ≤ |ω| < 2π/3;  0 for 2π/3 < |ω| ≤ π.    (6.95)
As the phase is not specified, we chose it to be zero, making G(e^{jω}) real and symmetric. Such a filter has an infinite impulse response, and its z-transform is not rational (since it is exactly zero over an interval of nonzero measure). It does satisfy, however, the quadrature formula for an orthogonal lowpass filter from (1.13). Choosing the highpass filter in the standard way, (1.24),
    H(e^{jω}) = e^{−jω} G(e^{j(ω+π)}),
with G(e^{jω}) real, and using the two-scale equation for the wavelet in the Fourier domain, (6.58), we get
    Ψ(ω) = 0 for |ω| < 2π/3;
           e^{−jω} √(3ω/(2π) − 1) for 2π/3 ≤ |ω| < 4π/3;
           e^{−jω} √(2 − 3ω/(4π)) for 4π/3 ≤ |ω| < 8π/3;
           0 for |ω| ≥ 8π/3.    (6.96)
The construction and resulting wavelet (a bandpass function) are shown in Figure 6.36. Finally, the scaling function ϕ(t) and wavelet ψ(t) are shown, together
with their Fourier transforms, in Figure 6.37.
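The two defining relations of this construction, orthonormality of the integer shifts of ϕ and the quadrature condition on g, can be checked directly from (6.94) and (6.95). The sketch below does so numerically; the sampling grid is an arbitrary choice, and the functions implement only the squared magnitudes given in the text.

```python
import numpy as np

# Check of the Meyer design: sum_k |Phi(w + 2*pi*k)|^2 = 1 (orthonormal shifts)
# and |G(e^{jw})|^2 + |G(e^{j(w+pi)})|^2 = 2 (quadrature condition).
def phi_sq(w):                     # |Phi(w)|^2 from (6.94)
    w = np.abs(w)
    return np.where(w < 2 * np.pi / 3, 1.0,
           np.where(w < 4 * np.pi / 3, 2 - 3 * w / (2 * np.pi), 0.0))

def G_sq(w):                       # |G(e^{jw})|^2 from (6.95), 2*pi-periodic
    w = np.abs((w + np.pi) % (2 * np.pi) - np.pi)   # fold into [-pi, pi]
    return np.where(w <= np.pi / 3, 2.0,
           np.where(w < 2 * np.pi / 3, 4 - 6 * w / np.pi, 0.0))

w = np.linspace(-np.pi, np.pi, 1001)
shifts = sum(phi_sq(w + 2 * np.pi * k) for k in range(-3, 4))
print(np.max(np.abs(shifts - 1)))                     # ~0: orthonormal shifts
print(np.max(np.abs(G_sq(w) + G_sq(w + np.pi) - 2)))  # ~0: quadrature pair
```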
Figure 6.34: Embedding V (0) ⊂ V (−1) for the Meyer wavelet.
Figure 6.35: The two-scale equation for the Meyer wavelet in frequency domain. Note
how the 4π-periodic function G(ej(ω/2+π) ) carves out Φ(ω) from Φ(ω/2).
The example above showed all the ingredients of the general construction
of Meyer wavelets. The key was the orthogonality relation for Φ(ω), the fact
that Φ(ω) is continuous, and that the spaces V (ℓ) are embedded. Since Φ(ω) is
continuous, ϕ(t) decays as O(1/t2 ). Smoother Φ(ω)’s can be constructed, leading
to faster decay of ϕ(t) (Exercise ??).
6.3.4 Biorthogonal Wavelet Series
Instead of one scaling function and one wavelet, we now seek two scaling functions, ϕ(t) and ϕ̃(t), as well as two corresponding wavelets, ψ(t) and ψ̃(t), as in Section 6.2.4, such that the families
    ψ_{ℓ,k}(t) = (1/√(2^ℓ)) ψ((t − 2^ℓ k)/2^ℓ),    (6.97a)
    ψ̃_{ℓ,k}(t) = (1/√(2^ℓ)) ψ̃((t − 2^ℓ k)/2^ℓ),    (6.97b)
for ℓ, k ∈ Z, form a biorthogonal set,
    ⟨ψ_{ℓ,k}(t), ψ̃_{m,n}(t)⟩ = δ_{ℓ−m} δ_{k−n},
and are complete in L2(R). That is, any x(t) ∈ L2(R) can be written as either
    x(t) = Σ_{ℓ∈Z} Σ_{k∈Z} β_k^{(ℓ)} ψ_{ℓ,k}(t),    β_k^{(ℓ)} = ⟨x, ψ̃_{ℓ,k}⟩,
Figure 6.36: Construction of the wavelet from the two-scale equation. (a) The stretched
scaling function Φ(ω/2). (b) The stretched and shifted lowpass filter G(ej(ω/2+π) ). (c)
The resulting bandpass wavelet Ψ(ω).
Figure 6.37: Meyer scaling function and wavelet. (a) ϕ(t). (b) Φ(ω). (c) ψ(t). (d) Ψ(ω).
or
    x(t) = Σ_{ℓ∈Z} Σ_{k∈Z} β̃_k^{(ℓ)} ψ̃_{ℓ,k}(t),    β̃_k^{(ℓ)} = ⟨x, ψ_{ℓ,k}⟩.
These scaling functions and wavelets will satisfy two-scale equations as before,
    ϕ(t) = √2 Σ_{n∈Z} g_n ϕ(2t − n),    ϕ̃(t) = √2 Σ_{n∈Z} g̃_n ϕ̃(2t − n),
    ψ(t) = √2 Σ_{n∈Z} h_n ϕ(2t − n),    ψ̃(t) = √2 Σ_{n∈Z} h̃_n ϕ̃(2t − n).
We can then define a biorthogonal multiresolution analysis by
    V^(0) = span({ϕ(t − k)}_{k∈Z}),    Ṽ^(0) = span({ϕ̃(t − k)}_{k∈Z}),
and the appropriate scaled spaces
    V^(ℓ) = span({ϕ_{ℓ,k}}_{k∈Z}),    Ṽ^(ℓ) = span({ϕ̃_{ℓ,k}}_{k∈Z}),    (6.98)
for ℓ ∈ Z. For a given ϕ(t) (for example, the triangle function), we can verify that the axioms of multiresolution analysis hold (Exercise ??). From there, define the wavelet families as in (6.97a)–(6.97b), which then lead to the wavelet spaces W^(ℓ) and W̃^(ℓ). While this seems very natural, the geometry is more complicated than in the orthogonal case. On the one hand, we have the decompositions
    V^(ℓ) = V^(ℓ+1) ⊕ W^(ℓ+1),    (6.99)
    Ṽ^(ℓ) = Ṽ^(ℓ+1) ⊕ W̃^(ℓ+1),    (6.100)
as can be verified by using the two-scale equations for the scaling functions and wavelets involved. On the other hand, unlike the orthonormal case, V^(ℓ) is not orthogonal to W^(ℓ). Instead,
    W̃^(ℓ) ⊥ V^(ℓ),    W^(ℓ) ⊥ Ṽ^(ℓ),
similarly to a biorthogonal basis (see Figure 1.11). We explore these relationships in Exercise ?? to show that
    W̃^(ℓ) ⊥ W^(m),    ℓ ≠ m.
The embedding (6.86a) then has two forms:
    . . . ⊂ V^(2) ⊂ V^(1) ⊂ V^(0) ⊂ V^(−1) ⊂ V^(−2) ⊂ . . . ,
with detail spaces {W^(ℓ)}_{ℓ∈Z}, or
    . . . ⊂ Ṽ^(2) ⊂ Ṽ^(1) ⊂ Ṽ^(0) ⊂ Ṽ^(−1) ⊂ Ṽ^(−2) ⊂ . . . ,
with detail spaces {W̃^(ℓ)}_{ℓ∈Z}. The detail spaces allow us to write
    L2(R) = ⊕_{ℓ∈Z} W^(ℓ) = ⊕_{ℓ∈Z} W̃^(ℓ).
The diagram in Figure 6.38 illustrates these two splits and the biorthogonality
between them.
Figure 6.38: The space L2(R) is split according to two different embeddings. (a) Embedding V^(ℓ) based on the scaling function ϕ(t). (b) Embedding Ṽ^(ℓ) based on the dual scaling function ϕ̃(t). Note that orthogonality is "across" the spaces and their duals, for example, W̃^(ℓ) ⊥ V^(ℓ).
6.4 Wavelet Frame Series
6.4.1 Definition of the Wavelet Frame Series
6.4.2 Frames from Sampled Wavelet Series

6.5 Continuous Wavelet Transform
6.5.1 Definition of the Continuous Wavelet Transform
The continuous wavelet transform uses a function ψ(t) and all its shifted and scaled versions to analyze functions. Here we consider only real wavelets; this can be extended to complex wavelets without too much difficulty.
Consider a real wavelet ψ(t) ∈ L2(R) centered around t = 0 and having at least one zero moment (that is, ∫ ψ(t) dt = 0). Now, consider all its shifts and scales, denoted by
    ψ_{a,b}(t) = (1/√a) ψ((t − b)/a),    a ∈ R+, b ∈ R,    (6.101)
which means that ψ_{a,b}(t) is centered around b and scaled by a factor a. The scale factor 1/√a ensures that the L2 norm is preserved, and without loss of generality, we can assume ‖ψ‖ = 1 and thus
    ‖ψ_{a,b}‖ = 1.
There is one more condition on the wavelet, namely the admissibility condition, stating that the Fourier transform Ψ(ω) must satisfy
    C_ψ = ∫_{ω∈R+} (|Ψ(ω)|^2 / |ω|) dω < ∞.    (6.102)
Since |Ψ(0)| = 0 because of the zero-moment property, this means that |Ψ(ω)| has to decay for large ω, which it will if ψ has any smoothness. In short, (6.102) is a very mild requirement that is satisfied by all wavelets of interest (see, for example,
Figure 6.39: The wavelet transform. (a) An example function. (b) The magnitude of
wavelet transform |X(a, b)|.
Exercise ??). Now, given a function x(t) in L2(R), we can define its continuous wavelet transform as
    X(a, b) = (1/√a) ∫_{−∞}^∞ ψ((t − b)/a) x(t) dt = ∫_{−∞}^∞ ψ_{a,b}(t) x(t) dt = ⟨x, ψ_{a,b}⟩.    (6.103)
In words, we take the inner product of the function x(t) with a wavelet centered at location b and rescaled by a factor a, as shown in Figure 6.16. A numerical example is given in Figure 6.39, which displays the magnitude |X(a, b)| as an image. It is already clear that the continuous wavelet transform acts as a singularity detector or derivative operator, and that smooth regions are suppressed, which follows from the zero-moment property.
Let us rewrite the continuous wavelet transform at scale a as a convolution. For this, it will be convenient to introduce the scaled and normalized version of the wavelet,
    ψ_a(t) = (1/√a) ψ(t/a)  ←FT→  Ψ_a(ω) = √a Ψ(aω),    (6.104)
as well as the notation ψ̄(t) = ψ(−t). Then
    X(a, b) = ∫_{−∞}^∞ (1/√a) ψ((t − b)/a) x(t) dt = ∫_{−∞}^∞ ψ_a(t − b) x(t) dt = (x ∗ ψ̄_a)(b).    (6.105)
Now the Fourier transform of X(a, b) over the "time" variable b is
    X(a, ω) = X(ω) Ψ_a^*(ω) = √a X(ω) Ψ^*(aω),    (6.106)
where we used ψ(−t) ←FT→ Ψ^*(ω), since ψ(t) is real.
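A minimal numerical sketch of (6.104)–(6.105) is given below: one row X(a, ·) is computed as the convolution of x with the scaled wavelet. The Mexican-hat wavelet and the step input are used purely as examples (the wavelet is real and even, so ψ̄_a = ψ_a); neither is prescribed by the text.

```python
import numpy as np

# One row of the continuous wavelet transform via (6.105): X(a, .) = (x * psi_a)(.)
def mexican_hat(t):
    return (2 / (np.sqrt(3.0) * np.pi**0.25)) * (1 - t**2) * np.exp(-t**2 / 2)

dt = 0.01
t = np.arange(-10, 10, dt)
x = np.where(t > 0, 1.0, 0.0)                    # a step at t = 0

def cwt_row(x, a):
    psi_a = mexican_hat(t / a) / np.sqrt(a)      # the scaled wavelet (6.104)
    return np.convolve(x, psi_a, mode="same") * dt

mask = np.abs(t) < 5                             # stay away from the grid edges
for a in (2.0, 1.0, 0.5):
    row = cwt_row(x, a)
    b_peak = t[mask][np.argmax(np.abs(row[mask]))]
    print(a, round(b_peak, 2), round(np.max(np.abs(row[mask])), 3))
# As a decreases, the response concentrates around the discontinuity at t = 0.
```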
6.5.2 Existence and Convergence of the Continuous Wavelet Transform
The invertibility of the continuous wavelet transform is of course a key result: not
only can we compute the continuous wavelet transform, but we are actually able to
come back! This inversion formula was first proposed by J. Morlet.55
Theorem 6.13 (Inversion of the continuous wavelet transform) Consider a real wavelet ψ satisfying the admissibility condition (6.102). A function x ∈ L2(R) can be recovered from its continuous wavelet transform X(a, b) by the inversion formula
    x(t) = (1/C_ψ) ∫_0^∞ ∫_{−∞}^∞ X(a, b) ψ_{a,b}(t) (db da / a^2),    (6.107)
where equality is in the L2 sense.
Proof. Denote the right-hand side of (6.107) by x̂(t). In that expression, we replace X(a, b) by (6.105) and ψ_{a,b}(t) by ψ_a(t − b) to obtain
    x̂(t) = (1/C_ψ) ∫_0^∞ ∫_{−∞}^∞ (x ∗ ψ̄_a)(b) ψ_a(t − b) (db da / a^2)
         = (1/C_ψ) ∫_0^∞ (x ∗ ψ̄_a ∗ ψ_a)(t) (da / a^2),
where the integral over b was recognized as a convolution. We will show the L2 equality of x̂(t) and x(t) through the equality of their Fourier transforms. The Fourier transform of x̂(t) is
    X̂(ω) = (1/C_ψ) ∫_{−∞}^∞ ∫_0^∞ (x ∗ ψ̄_a ∗ ψ_a)(t) e^{−jωt} (da dt / a^2)
        (a)= (1/C_ψ) ∫_0^∞ X(ω) Ψ_a^*(ω) Ψ_a(ω) (da / a^2)
        (b)= (1/C_ψ) X(ω) ∫_0^∞ a |Ψ(aω)|^2 (da / a^2),    (6.108)
where in (a) we integrated first over t and transformed the two convolutions into products; and in (b) we used (6.104). In the remaining integral above, apply the change of variable Ω = aω to compute
    ∫_0^∞ |Ψ(aω)|^2 (da / a) = ∫_0^∞ (|Ψ(Ω)|^2 / Ω) dΩ = C_ψ,    (6.109)
which, together with (6.108), shows that X̂(ω) = X(ω). By Fourier inversion, we have proven that x̂(t) = x(t) in the L2 sense.
The formula (6.107) is also sometimes called the resolution of the identity and goes back to Calderón in the 1960s, in a context other than wavelets.
6.5.3 Properties of the Continuous Wavelet Transform
Linearity
^55 The story goes that Morlet asked a mathematician for a proof, but only got as an answer: "This formula, being so simple, would be known if it were correct."
Figure 6.40: The shift property of the continuous wavelet transform.
Shift in Time The continuous wavelet transform has a number of properties, several of them being extensions or generalizations of properties already seen for the wavelet series. Let us start with shift and scale invariance. Consider g(t) = x(t − τ), a delayed version of x(t). Then
    X_g(a, b) = (1/√a) ∫_{−∞}^∞ ψ((t − b)/a) x(t − τ) dt = (1/√a) ∫_{−∞}^∞ ψ((t′ + τ − b)/a) x(t′) dt′ = X(a, b − τ),    (6.110)
by using the change of variables t′ = t − τ. That is, the continuous wavelet transform of g is simply a delayed version of the continuous wavelet transform of x(t), as shown in Figure 6.40.
Scaling in Time For the scaling property, consider a scaled and normalized version of x(t),
    g(t) = (1/√s) x(t/s),
where the renormalization ensures that ‖g‖ = ‖x‖. Computing the continuous wavelet transform of g, using the change of variables t′ = t/s, gives
    X_g(a, b) = (1/√a) ∫_{−∞}^∞ ψ((t − b)/a) (1/√s) x(t/s) dt = (s/√(as)) ∫_{−∞}^∞ ψ((s t′ − b)/a) x(t′) dt′
              = √(s/a) ∫_{−∞}^∞ ψ((t′ − b/s)/(a/s)) x(t′) dt′ = X(a/s, b/s).    (6.111)
In words: if g(t) is a version of x(t) scaled by a factor s and normalized to maintain its energy, then its continuous wavelet transform is scaled by s in both a and b. A graphical representation of the scaling property is shown in Figure 6.41.
Consider now a function x(t) with unit energy and having its wavelet transform
concentrated mostly in a unit square, say [a0 , a0 + 1] × [b0 , b0 + 1]. The continuous
wavelet transform of g(t) is then mostly concentrated in a square [sa_0, s(a_0 + 1)] × [sb_0, s(b_0 + 1)], a cell of area s^2. But remember that g(t) still has unit energy, while its continuous wavelet transform now covers an area larger by a factor of s^2. Therefore, when evaluating an energy measure in the continuous wavelet transform domain, we need to renormalize by the factor a^2, as seen in both the inversion formula (6.107) and the energy conservation formula (6.112).

Figure 6.41: The scaling property of the continuous wavelet transform.
When comparing the above properties with the equivalent ones for the wavelet series, the major difference is that shift and scale are arbitrary real variables, rather than being constrained to dyadic values (powers of 2 for the scale, multiples of the scale for the shifts). Therefore, we obtain true scaling and shift properties.
Parseval’s Equality Closely related to the resolution of the identity is an energy
conservation formula, an analogue to Parseval’s equality.
Theorem 6.14 (Energy conservation of the continuous wavelet transform) Consider a function x ∈ L2(R) and its continuous wavelet transform X(a, b) with respect to a real wavelet ψ satisfying the admissibility condition (6.102). Then, the following energy conservation holds:
    ∫_{−∞}^∞ |x(t)|^2 dt = (1/C_ψ) ∫_{a∈R+} ∫_{b∈R} |X(a, b)|^2 (db da / a^2).    (6.112)
Proof. Expand the right-hand side (without the leading constant) as
    ∫_{a∈R+} ∫_{b∈R} |X(a, b)|^2 (db da / a^2)
     (a)= ∫_{a∈R+} ∫_{b∈R} |(x ∗ ψ̄_a)(b)|^2 (db da / a^2)
     (b)= (1/2π) ∫_{a∈R+} ∫_{ω∈R} |X(ω) √a Ψ^*(aω)|^2 (dω da / a^2)
        = (1/2π) ∫_{a∈R+} ∫_{ω∈R} |X(ω)|^2 |Ψ(aω)|^2 (dω da / a),
where (a) uses (6.105); and (b) uses Parseval's equality for the Fourier transform with respect to b, also transforming the convolution into a product. Changing the order of integration and using in (c) the change of variables Ω = aω allows us to write the above as
    ∫_{a∈R+} ∫_{b∈R} |X(a, b)|^2 (db da / a^2) = (1/2π) ∫_{−∞}^∞ |X(ω)|^2 ∫_{a∈R+} |Ψ(aω)|^2 (da / a) dω
     (c)= (1/2π) ∫_{ω∈R} |X(ω)|^2 dω · ∫_{Ω∈R+} (|Ψ(Ω)|^2 / Ω) dΩ,
where the last factor is C_ψ. Therefore,
    (1/C_ψ) ∫_{a∈R+} ∫_{b∈R} |X(a, b)|^2 (db da / a^2) = (1/2π) ∫_{ω∈R} |X(ω)|^2 dω,
and applying Parseval's equality to the right side proves (6.112).
Both the inversion formula and the energy conservation formula use db da/a^2 as an integration measure. This is related to the scaling property of the continuous wavelet transform, as will be shown below. Note that the extension to a complex wavelet is not hard; the integral over a has to go from −∞ to ∞, and C_ψ has to be defined accordingly.
Redundancy The continuous wavelet transform maps a one-dimensional function into a two-dimensional one: this is clearly very redundant. In other words, only a small subset of two-dimensional functions corresponds to wavelet transforms. We are thus interested in characterizing the image of one-dimensional functions in the continuous wavelet transform domain.
A simple analogue is in order. Consider an M × N matrix T having orthonormal columns (that is, T^T T = I) with M > N. Suppose y is the image of an arbitrary vector x ∈ R^N through the operator T, or y = Tx. Clearly y belongs to a subspace S of R^M, namely the span of the columns of T.
There is a simple test to check whether an arbitrary vector z ∈ R^M belongs to S. Introduce the kernel matrix K,
    K = T T^T,    (6.113)
which is the M × M matrix of outer products of the columns of T. Then, a vector z belongs to S if and only if it satisfies
    Kz = z.    (6.114)
Indeed, if z is in S, then it can be written as z = Tx for some x. Substituting this into the left side of (6.114) leads to
    Kz = T T^T T x = T x = z.
Conversely, if (6.114) holds, then z = Kz = T(T^T z) = Tx with x = T^T z, showing that z belongs to S.
If z is not in S, then Kz = ẑ is the orthogonal projection of z onto S, as can be verified. See Exercise ?? for a discussion of this, as well as of the case of non-orthonormal columns in T.
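The finite-dimensional test (6.113)–(6.114) can be illustrated with a few lines of code; the random matrix below is an arbitrary example, orthonormalized so that T^T T = I.

```python
import numpy as np

# Finite-dimensional analogue of the reproducing-kernel test: K = T T^T
# satisfies K z = z exactly when z lies in the column span of T; otherwise
# K z is the orthogonal projection of z onto that span.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
T, _ = np.linalg.qr(A)                 # 5x3 matrix with orthonormal columns
K = T @ T.T

z_in = T @ rng.standard_normal(3)      # in the span: K z = z
z_out = rng.standard_normal(5)         # generic vector: K z is its projection
print(np.allclose(K @ z_in, z_in))             # True
print(np.allclose(K @ (K @ z_out), K @ z_out)) # True: K z_out already in span
```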
We now extend the test given in (6.114) to the case of the continuous wavelet transform. For this, let us introduce the reproducing kernel of the wavelet ψ(t), defined as
    K(a_0, b_0, a, b) = ⟨ψ_{a_0,b_0}, ψ_{a,b}⟩.    (6.115)
This is the deterministic crosscorrelation of two wavelets at scales and shifts (a_0, b_0) and (a, b), respectively, and is the equivalent of the matrix K in (6.113).
Call V the space of functions X(a, b) that are square integrable with respect to the measure (db da)/a^2 (see also Theorem 6.14). In this space, there exists a subspace S that corresponds to bona fide continuous wavelet transforms. Similarly to what we just did in finite dimensions, we give a test to check whether a function X(a, b) in V actually belongs to S, that is, whether it is the continuous wavelet transform of some one-dimensional function x(t).
Theorem 6.15 (Reproducing kernel property of the continuous wavelet transform) A function X(a, b) is the continuous wavelet transform of a function x(t) if and only if it satisfies
    X(a_0, b_0) = (1/C_ψ) ∫_0^∞ ∫_{−∞}^∞ K(a_0, b_0, a, b) X(a, b) (db da / a^2).    (6.116)
Proof. We show that if X(a, b) is the continuous wavelet transform of some function x(t), then (6.116) holds. Completing the proof by showing that the converse is also true is left as Exercise ??.
By assumption,
    X(a_0, b_0) = ∫_{−∞}^∞ ψ_{a_0,b_0}(t) x(t) dt.
Replace x(t) by its inversion formula (6.107) to obtain
    X(a_0, b_0) = ∫_{−∞}^∞ ψ_{a_0,b_0}(t) (1/C_ψ) ∫_0^∞ ∫_{−∞}^∞ ψ_{a,b}(t) X(a, b) (db da / a^2) dt
        (a)= (1/C_ψ) ∫_0^∞ ∫_{−∞}^∞ [∫_{−∞}^∞ ψ_{a_0,b_0}(t) ψ_{a,b}(t) dt] X(a, b) (db da / a^2)
        (b)= (1/C_ψ) ∫_0^∞ ∫_{−∞}^∞ K(a_0, b_0, a, b) X(a, b) (db da / a^2),
where in (a) we interchanged the order of integration; and in (b) we integrated over t to get the reproducing kernel (6.115).
Characterization of Singularities The continuous wavelet transform has an interesting localization property, related to the fact that as a → 0, the wavelet ψ_{a,b}(t) becomes arbitrarily narrow, performing a zoom in the vicinity of b. This is easiest to see for x(t) = δ(t − τ). Then
    X(a, b) = (1/√a) ∫_{−∞}^∞ ψ((t − b)/a) δ(t − τ) dt = (1/√a) ψ((τ − b)/a).    (6.117)
As a function of b, this is the wavelet scaled by a and centered at the singularity location τ. As a → 0, the continuous wavelet transform narrows exactly onto the singularity and grows as a^{−1/2}.
A similar behavior can be shown for other singularities as well, which we do now. For simplicity, we consider a compactly supported wavelet with N zero moments. We have seen the most elementary case, namely the Haar wavelet (with a single zero moment), in Section 6.1. Another example is the ramp function starting at τ:
    x(t) = 0 for t ≤ τ;  x(t) = t − τ for t > τ.
This function is continuous, but its derivative is not; actually, its second derivative is a Dirac delta function at location τ.
To analyze this function and its singularity, we need a wavelet with at least 2 zero moments. Given such a compactly supported wavelet, its second-order primitive will be compactly supported as well. To compute the continuous wavelet transform X(a, b), we can apply integration by parts just as in (6.34) to obtain
    X(a, b) = −√a ∫_{−∞}^∞ θ((t − b)/a) x′(t) dt,
where θ denotes the primitive of the wavelet ψ (as in (6.34)) and x′(t) is now a step function. We apply integration by parts one more time to get
    X(a, b) = −a^{3/2} [θ^{(1)}((t − b)/a) x′(t)]_{t∈R} + a^{3/2} ∫_{−∞}^∞ θ^{(1)}((t − b)/a) x′′(t) dt
            = a^{3/2} ∫_{−∞}^∞ θ^{(1)}((t − b)/a) δ(t − τ) dt = a^{3/2} θ^{(1)}((τ − b)/a),    (6.118)
where θ^{(1)}(t) is the primitive of θ(t), and the factor a^{3/2} comes from an additional factor a due to the integration of θ(t/a). The key, of course, is that as a → 0, the continuous wavelet transform zooms in towards the singularity and has a behavior of order a^{3/2}. These are examples of the following general result.
Theorem 6.16 (Localization property of the continuous wavelet transform) Consider a wavelet ψ of compact support having N zero moments and a function x(t) with a singularity of order n ≤ N (meaning that the nth derivative is a Dirac delta function; for example, a Dirac delta function is of order 0, a step of order 1, a ramp of order 2, etc.). Then, the wavelet transform in the vicinity of the singularity at τ is of the form
    X(a, b) = (−1)^n a^{n−1/2} ψ^{(n)}((τ − b)/a),    (6.119)
where ψ^{(n)} is the nth primitive of ψ.

Figure 6.42: A function with singularities of order 0, 1 and 2, and its wavelet transform.
Proof. (Sketch) The proof follows the arguments developed above for n = 0, 1, and 2. Because ψ(t) has N zero moments, its primitives of order n ≤ N are also compactly supported. For a singularity of order n, we apply integration by parts n times. Each primitive adds a scaling factor a; this explains the factor a^{n−1/2} (the −1/2 comes from the initial 1/√a factor in the wavelet). After n integrations by parts, x(t) has been differentiated n times, is thus a Dirac delta function, and reproduces ψ^{(n)} at location τ.
The key is that the singularities are not only precisely located at small scales,
but the behavior of the continuous wavelet transform also indicates the singularity
type. Figure 6.42 sketches the continuous wavelet transform of a function with a
few singularities.
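The scaling behavior in (6.119) can also be observed numerically. The sketch below uses a ramp (a singularity of order n = 2) and the Mexican-hat wavelet, which has two zero moments; it is not compactly supported, as the theorem assumes, but its Gaussian decay is fast enough for the a^{3/2} behavior to appear.

```python
import numpy as np

# Peak of |X(a, b)| for a ramp singularity (order n = 2): it should shrink
# roughly as a^{n - 1/2} = a^{3/2} as a -> 0. The Mexican-hat wavelet is an
# example choice with two zero moments.
def mexican_hat(t):
    return (2 / (np.sqrt(3.0) * np.pi**0.25)) * (1 - t**2) * np.exp(-t**2 / 2)

dt = 0.005
t = np.arange(-20, 20, dt)
x = np.where(t > 0, t, 0.0)                      # ramp starting at tau = 0
mask = np.abs(t) < 5                             # keep away from the grid edges

for a in (1.0, 0.5, 0.25):
    psi_a = mexican_hat(t / a) / np.sqrt(a)
    row = np.convolve(x, psi_a, mode="same") * dt
    print(a, round(np.max(np.abs(row[mask])), 4))
# successive values drop by about (1/2)^{3/2} ~ 0.354
```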
We considered the behavior around points of singularity, but what about "smooth" regions? Again, assume a wavelet of compact support having N zero moments. Clearly, if the function x(t) is a polynomial of degree N − 1 or less, all inner products with the wavelet will be exactly zero because of the zero-moment property. If the function x(t) is piecewise polynomial,^56 then the inner product will be zero once the wavelet is inside an interval, while boundaries will be detected according to the types of singularities that appear. We calculated an example in Section 6.1 for Haar, which makes the above explicit, while also pointing out what happens when the wavelet does not have enough zero moments.
Decay and Smoothness Beyond polynomial and piecewise-polynomial functions,
let us consider more general smooth functions. Among the many possible classes
^56 That is, the function is a polynomial over intervals (t_i, t_{i+1}), with singularities at the interval boundaries.
of smooth functions, we consider functions having m continuous derivatives, that is, the space C^m.
For the wavelet, we take a compactly supported wavelet ψ having N zero moments. Then the Nth primitive, denoted ψ^{(N)}, is compactly supported and
    ∫_{−∞}^∞ ψ^{(N)}(t) dt = C ≠ 0.
This follows since the Fourier transform of ψ has N zeros at the origin, and each integration removes one, leaving the Fourier transform of ψ^{(N)} nonzero at the origin. For example, the primitive of the Haar wavelet is the triangle function in (6.33), with integral equal to 1/2.
Consider the following scaled version of ψ^{(N)}, namely a^{−1} ψ^{(N)}(t/a). This function has an integral equal to C, and it acts like a Dirac delta function as a → 0 in the sense that, for a continuous function x(t),
    lim_{a→0} ∫_{−∞}^∞ (1/a) ψ^{(N)}((t − b)/a) x(t) dt = C x(b).    (6.120)
Again, the Haar wavelet with its primitive is a typical example, since a limit of scaled triangle functions is a classic way to obtain the Dirac delta function. We are now ready to prove the decay behavior of the continuous wavelet transform as a → 0.
Theorem 6.17 (Decay of the continuous wavelet transform for x ∈ C^N) Consider a compactly supported wavelet ψ with N zero moments, N ≥ 1, and primitives ψ^{(1)}, . . . , ψ^{(N)}, where ∫ ψ^{(N)}(t) dt = C. Given a function x(t) having N continuous and bounded derivatives x^{(1)}, . . . , x^{(N)}, that is, x ∈ C^N, the continuous wavelet transform of x(t) with respect to ψ behaves as
    |X(a, b)| ≤ C′ a^{N+1/2}    (6.121)
for a → 0.
Proof. (Sketch) The proof closely follows the method of integration by parts used in Theorem 6.16. That is, we take the Nth derivative of x(t), namely x^{(N)}(t), which is continuous and bounded by assumption. We also have the Nth primitive of the wavelet, ψ^{(N)}(t), which is of compact support and has a finite integral. After N integrations by parts, we have
    X(a, b) = ∫_{−∞}^∞ (1/√a) ψ((t − b)/a) x(t) dt
        (a)= (−1)^N a^N (1/√a) ∫_{−∞}^∞ ψ^{(N)}((t − b)/a) x^{(N)}(t) dt
        (b)= (−1)^N a^{N+1/2} ∫_{−∞}^∞ (1/a) ψ^{(N)}((t − b)/a) x^{(N)}(t) dt,
where in (a) the N steps of integration by parts contribute a factor a^N; and in (b) we normalize the Nth primitive by 1/a so that it has a constant integral and acts as a Dirac delta function as a → 0. Therefore, for small a, the integral above tends towards C x^{(N)}(b), which is finite, and the decay of the continuous wavelet transform is thus of order a^{N+1/2}.

Figure 6.43: A function and its scalogram. (a) Function with various modes. (b) Scalogram with a Daubechies wavelet. (c) Scalogram with a symmetric wavelet. (d) Scalogram with a Morlet wavelet.
While we used global smoothness, it is clear that it is sufficient for x(t) to be C^N in the vicinity of b for the decay to hold. The converse result, namely the decay of the wavelet transform necessary for x(t) to be in C^N, is a technical result that is more difficult to prove; it requires non-integer (Lipschitz) regularity. Note that if x(t) is smoother, that is, it has more than N continuous derivatives, the decay will still be of order a^{N+1/2}, since we cannot apply more integration-by-parts steps. Also, the above result is valid for N ≥ 1 and thus cannot be applied to functions in C^0, but it can still be shown that the behavior is of order a^{1/2}, as is to be expected.
Scalograms So far, we have only sketched continuous wavelet transforms, to point
out general behavior like localization and other relevant properties. For “real” functions, a usual way of displaying the continuous wavelet transform is the density
plot of the continuous wavelet transform magnitude |X(a, b)|. This is done in Figure 6.43 for a particular function and for 3 different wavelets, namely an orthogonal
Daubechies wavelet, a symmetric biorthogonal wavelet, and the Morlet wavelet.
As can be seen, the scalograms with respect to symmetric wavelets (Figure 6.43
(c) and (d)) have no drift across scales, which helps identify singularities. The
zooming property at small scales is quite evident from the scalogram.
Remarks The continuous-time continuous wavelet transform can be seen as a
mathematical microscope. Indeed, it can zoom in, and describe the local behavior of a function very precisely. This pointwise characterization is a distinguishing
feature of the continuous wavelet transform. The characterization itself is related
to the wavelet being a local derivative operator. Indeed, a wavelet with N zero
moments acts like an N th order derivative on the function analyzed by the wavelet
transform, as was seen in the proofs of Theorems 6.16 and 6.17. Together with the
fact that all scales are considered, this shows that the continuous wavelet transform is a multiscale differential operator.

Figure 6.44: Morlet wavelet. (a) Time-domain function, with real and imaginary parts in solid and dotted lines, respectively. (b) Magnitude spectrum of the Fourier transform.
Compactly Supported Wavelets: Throughout the discussion so far, we have
often used the Haar wavelet (actually, its centered version) as the exemplary wavelet
used in a continuous wavelet transform. The good news is that it is simple, short,
and antisymmetric around the origin. The limitation is that in the frequency domain
it has only a single zero at the origin; thus it can only characterize singularities up
to order 1, and the decay of the continuous wavelet transform for smooth functions
is limited.
Therefore, one can use higher order wavelets, like any of the Daubechies
wavelets, or any biorthogonal wavelet. The key is the number of zeros at the
origin. The attraction of biorthogonal wavelets is that there are symmetric or antisymmetric solutions. Thus, singularities are well localized along vertical lines,
which is not the case for non-symmetric wavelets like the Daubechies wavelets. At
the same time, there is no reason to use orthogonal or biorthogonal wavelets, since
any functions satisfying the admissibility conditions (6.102) and having a sufficient
number of zero moments will do. In the next subsection, scalograms will highlight
differences between continuous wavelet transforms using different wavelets.
Morlet Wavelet: The classic, and historically first, wavelet is a windowed complex exponential, first proposed by Jean Morlet. As a window, a Gaussian bell shape is used, and the complex exponential makes it a bandpass filter. Specifically, the wavelet is given by
    ψ(t) = (1/√(2π)) e^{−jω_0 t} e^{−t^2/2},    (6.122)
with
    ω_0 = π √(2/ln 2),
where ω_0 is chosen such that the second maximum of ℜ(ψ(t)) is half of the first one (at t = 0), and 1/√(2π) is a normalizing scale factor. It is to be noted that Ψ(0) ≠ 0, and as such the wavelet is not admissible. However, Ψ(0) is very small (of order 10^{−7}) and has numerically no consequence (and can be corrected by removing it from the wavelet). Figure 6.44 shows the Morlet wavelet in time and frequency domains.
frequency domains.
α3.2 [January 2013] [free version] CC by-nc-nd
Comments to [email protected]
Fourier and Wavelet Signal Processing
254
Copyright 2013 J. Kovačević, V. K. Goyal, and M. Vetterli
Chapter 6. Wavelet Bases, Frames and Transforms on Functions
It is interesting to note that the Morlet wavelet and the Gabor function are related. From (6.122), the Morlet wavelet at scale a ≠ 0 is
    ψ_{a,0}(t) = (1/√(2πa)) e^{−jω_0 t/a} e^{−(t/a)^2/2},
while, following (5.3) and (5.6), the Gabor function at frequency ω is
    g_{ω,0}(t) = (1/√(2πa)) e^{jωt/a} e^{−t^2/(2a^2)},
which are equal for ω = ω_0 = π√(2/ln 2) and the same scale factor a. Thus, there is a frequency and a scale at which the continuous wavelet transform (with a Morlet wavelet) and a local Fourier transform (with a Gabor function) coincide.
6.6 Computational Aspects
The multiresolution framework derived above is more than just of theoretical interest. In addition to allowing the construction of wavelets, such as the spline and Meyer wavelets, it also has direct algorithmic implications, as we now show by deriving Mallat's algorithm for the computation of the wavelet series.
6.6.1 Wavelet Series: Mallat's Algorithm
Given a wavelet basis {ψ_{ℓ,k}(t)}_{ℓ,k∈Z}, any function x(t) can be written as
    x(t) = Σ_{ℓ∈Z} Σ_{k∈Z} β_k^{(ℓ)} ψ_{ℓ,k}(t),  where  β_k^{(ℓ)} = ⟨x, ψ_{ℓ,k}⟩.    (6.123)
Assume that only a finite-resolution version of x(t) can be acquired, in particular the projection of x(t) onto V^(0), denoted f^{(0)}(t). Because
    V^(0) = ⊕_{ℓ=1}^∞ W^(ℓ),
we can write
    f^{(0)}(t) = Σ_{ℓ=1}^∞ Σ_{k∈Z} β_k^{(ℓ)} ψ_{ℓ,k}(t).    (6.124)
Since f^{(0)}(t) ∈ V^(0), we can also write
    f^{(0)}(t) = Σ_{n∈Z} α_n^{(0)} ϕ(t − n),    (6.125)
where
    α_n^{(0)} = ⟨x(t), ϕ(t − n)⟩_t = ⟨x, ϕ_{0,n}⟩.
Given these two ways of expressing f^{(0)}(t), how do we go from one to the other? The answer, as is to be expected, lies in the two-scale equation, and leads to a filter bank algorithm. Consider f^{(1)}(t), the projection of f^{(0)}(t) onto V^(1). This involves computing the inner products
    α_n^{(1)} = ⟨f^{(0)}(t), (1/√2) ϕ(t/2 − n)⟩_t,    n ∈ Z.    (6.126)
From (6.48), we can write
    (1/√2) ϕ(t/2 − n) = Σ_{k∈Z} g_k ϕ(t − 2n − k).    (6.127)
Replacing this and (6.125) into (6.126) leads to
    α_n^{(1)} = Σ_{k∈Z} Σ_{ℓ∈Z} g_k α_ℓ^{(0)} ⟨ϕ(t − 2n − k), ϕ(t − ℓ)⟩_t
        (a)= Σ_{ℓ∈Z} g_{ℓ−2n} α_ℓ^{(0)}
        (b)= (g̃ ∗ α^{(0)})_{2n},    (6.128)
where (a) follows because the inner product is 0 unless ℓ = 2n + k; and (b) simply rewrites the sum as a convolution, with
    g̃_n = g_{−n}.
The upshot is that the sequence α_n^{(1)} is obtained by convolving α_n^{(0)} with g̃ (the time-reversed impulse response of g) and downsampling by 2. The same development for the wavelet series coefficients
    β_n^{(1)} = ⟨f^{(0)}(t), (1/√2) ψ(t/2 − n)⟩_t
yields
    β_n^{(1)} = (h̃ ∗ α^{(0)})_{2n},    (6.129)
where
    h̃_n = h_{−n}
is the time-reversed impulse response of the highpass filter h. The argument just developed holds irrespective of the scale at which we start, thus allowing us to split a function f^{(ℓ)} in V^(ℓ) into its components f^{(ℓ+1)} in V^(ℓ+1) and d^{(ℓ+1)} in W^(ℓ+1). This split is achieved by filtering and downsampling α^{(ℓ)} with g̃ and h̃, respectively. Likewise, this process can be iterated k times, to go from V^(ℓ) to V^(ℓ+k), while splitting off W^(ℓ+1), W^(ℓ+2), . . . , W^(ℓ+k), or
    V^(ℓ) = W^(ℓ+1) ⊕ W^(ℓ+2) ⊕ . . . ⊕ W^(ℓ+k) ⊕ V^(ℓ+k).
The key insight is of course that once we have an initial projection, for example, f^{(0)}(t) with expansion coefficients α_n^{(0)}, then all the other expansion coefficients can be computed using discrete-time filtering. This is shown in Figure 6.46, where the sequence α_n^{(0)}, corresponding to an initial projection of x(t) onto V^(0), is decomposed into the expansion coefficients in W^(1), W^(2), W^(3) and V^(3). This algorithm is known as Mallat's algorithm, since it is directly related to the multiresolution analysis of Mallat and Meyer.
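A direct, if naive, implementation of one analysis step, (6.128)–(6.129), is sketched below. The Haar filter pair is used purely as an example, and out-of-range terms at the boundary are simply dropped; neither choice is prescribed by the text.

```python
import numpy as np

# One step of Mallat's analysis algorithm, implementing (6.128)-(6.129)
# directly: alpha^(1)_n = sum_l g_{l-2n} alpha^(0)_l, and similarly with h.
g = np.array([1.0, 1.0]) / np.sqrt(2.0)     # example orthogonal lowpass (Haar)
h = np.array([1.0, -1.0]) / np.sqrt(2.0)    # corresponding highpass

def analysis_step(alpha, g, h):
    N, L = len(alpha), len(g)
    a_next = [sum(g[l - 2 * n] * alpha[l]
                  for l in range(N) if 0 <= l - 2 * n < L)
              for n in range(N // 2)]
    b_next = [sum(h[l - 2 * n] * alpha[l]
                  for l in range(N) if 0 <= l - 2 * n < L)
              for n in range(N // 2)]
    return np.array(a_next), np.array(b_next)

alpha0 = np.array([1.0, 3.0, 2.0, 2.0, 0.0, -1.0, 4.0, 4.0])
a1, b1 = analysis_step(alpha0, g, h)        # coefficients in V(1) and W(1)
a2, b2 = analysis_step(a1, g, h)            # iterate: split V(1) into V(2), W(2)
print(a1, b1)
print(a2, b2)
```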
Figure 6.45: Splitting of V^(0) into W^(1), W^(2), W^(3) and V^(3), shown for a sinc multiresolution analysis.

Figure 6.46: Mallat's algorithm. From the initial sequence α_n^{(0)}, all of the wavelet series coefficients are computed through a discrete filter bank algorithm.

Figure 6.47: Initialization of Mallat's algorithm. The function x(t) is convolved with ϕ̃(t) = ϕ(−t) and sampled at t = n.
Initialization How do we initialize Mallat's algorithm, that is, compute the initial sequence α_n^{(0)}? There is no escape from computing the inner products
    α_n^{(0)} = ⟨x(t), ϕ(t − n)⟩_t = (ϕ̃ ∗ x)(t)|_{t=n},
where ϕ̃(t) = ϕ(−t). This is shown in Figure 6.47.
The simplification obtained through this algorithm is the following. Computing inner products involves continuous-time filtering and sampling, which is difficult. Instead of having to compute such inner products at all scales as in (6.123), only a single scale has to be computed, namely the one leading to α_n^{(0)}. All the subsequent inner products are obtained from that sequence, using only discrete-time processing.
The question is: How well does f (0) (t) approximate the function x(t)? The
Figure 6.48: Initial approximation in Mallat’s algorithm. (a) Function x(t). (b) Approximation f (0) (t) with Haar scaling function in V (0) and error e(0) (t) = x(t) − f (0) (t).
(c) Same but in V (−3) , or f (−3) (t) and e(−3) (t) = x(t) − f (−3) (t).
key is that if the error ‖f^{(0)} − x‖ is too large, we can go to finer resolutions f^{(ℓ)}, ℓ < 0, until ‖f^{(ℓ)} − x‖ is small enough. Because of completeness, we know that there is an ℓ such that the initial approximation error can be made as small as we like.
In Figure 6.48, we show two different initial approximations and the resulting errors,
    e^{(ℓ)}(t) = x(t) − f^{(ℓ)}(t).
Clearly, the smoother the function, the faster the decay of ‖e^{(ℓ)}‖ as ℓ → −∞. Exercise ?? explores this further.
The Synthesis Problem We have considered the analysis problem: given a function, how to obtain its wavelet coefficients. Conversely, we can also consider the synthesis problem: given a wavelet series representation as in (6.124), how to synthesize f^{(0)}(t). One way is to effectively add wavelets at different scales and shifts, with the appropriate weights (6.123).
The other way is to synthesize f^{(0)}(t) as in (6.125), which involves only linear combinations of a single function ϕ(t) and its integer shifts. To make matters specific, assume we want to reconstruct f^{(0)}(t) ∈ V^(0) from f^{(1)}(t) ∈ V^(1) and d^{(1)}(t) ∈ W^(1). There are two ways to write f^{(0)}(t), namely
    f^{(0)}(t) = Σ_{n∈Z} α_n^{(0)} ϕ(t − n)    (6.130)
             = (1/√2) Σ_{n∈Z} α_n^{(1)} ϕ(t/2 − n) + (1/√2) Σ_{n∈Z} β_n^{(1)} ψ(t/2 − n),    (6.131)
Figure 6.49: Synthesis of f^{(0)}(t) using Mallat's algorithm. The wavelet and scaling coefficients are fed through a DWT synthesis, generating the sequence α_n^{(0)}. A final continuous-time processing step implementing (6.130) leads to f^{(0)}(t).
where the latter is the sum of f^{(1)}(t) and d^{(1)}(t). Now,
    α_ℓ^{(0)} = ⟨f^{(0)}(t), ϕ(t − ℓ)⟩_t.
Using the two-scale equation (6.127) and its equivalent for ψ(t/2 − n),
    (1/√2) ψ(t/2 − n) = Σ_{k∈Z} h_k ϕ(t − 2n − k),
we can write
    α_ℓ^{(0)} = ⟨f^{(1)}(t), ϕ(t − ℓ)⟩_t + ⟨d^{(1)}(t), ϕ(t − ℓ)⟩_t
        (a)= Σ_{n∈Z} Σ_{k∈Z} α_n^{(1)} g_k ⟨ϕ(t − ℓ), ϕ(t − 2n − k)⟩_t + Σ_{n∈Z} Σ_{k∈Z} β_n^{(1)} h_k ⟨ϕ(t − ℓ), ϕ(t − 2n − k)⟩_t
        (b)= Σ_{n∈Z} α_n^{(1)} g_{ℓ−2n} + Σ_{n∈Z} β_n^{(1)} h_{ℓ−2n},    (6.132)
where (a) follows from (6.131) using the two-scale equations; and (b) is obtained from the orthogonality of the integer shifts of ϕ, which makes the inner product vanish unless k = ℓ − 2n. The obtained expression for α_ℓ^{(0)} indicates that the two sequences α_n^{(1)} and β_n^{(1)} are upsampled by 2 before being filtered by g and h, respectively. In other words, a two-channel synthesis filter bank produces the coefficients for synthesizing f^{(0)}(t) according to (6.130). The argument above can be extended to any number of scales and leads to the synthesis version of Mallat's algorithm, shown in Figure 6.49.
Again, the simplification arises because, instead of having to use continuous-time wavelets and scaling functions at many scales, only a single continuous-time prototype function is needed: ϕ(t) and its shifts, the basis for V^(0). Because of the inclusion of all the coarser spaces in V^(0), the result is intuitive; nonetheless, it is remarkable that the multiresolution framework leads naturally to a discrete-time filter bank algorithm.
6.6.2 Wavelet Frames

Chapter at a Glance
Filter. Lowpass: G(z) = ((1 + z^{−1})/2)^N R(z). Highpass: H(z) = z^{−L+1} ((1 − z)/2)^N R(−z^{−1}).
Function. Scaling function: Φ(ω) = Π_{i=1}^∞ 2^{−1/2} G(e^{jω/2^i}). Wavelet: Ψ(ω) = 2^{−1/2} H(e^{jω/2}) Π_{i=2}^∞ 2^{−1/2} G(e^{jω/2^i}).
Two-scale equation. Scaling function: Φ(ω) = 2^{−1/2} G(e^{jω/2}) Φ(ω/2), that is, ϕ(t) = √2 Σ_n g_n ϕ(2t − n). Wavelet: Ψ(ω) = 2^{−1/2} H(e^{jω/2}) Φ(ω/2), that is, ψ(t) = √2 Σ_n h_n ϕ(2t − n).
Smoothness. Scaling function: can be tested, increases with N. Wavelet: same as for ϕ(t).
Moments. Polynomials of degree N − 1 are in span({ϕ(t − n)}_{n∈Z}); the wavelet has N zero moments.
Size and support. support(g) = {0, . . . , L − 1}, support(ϕ) = [0, L − 1]; support(h) = {0, . . . , L − 1}, support(ψ) = [0, L − 1].
Orthogonality. ⟨ϕ(t), ϕ(t − n)⟩_t = δ_n,  ⟨ϕ(t), ψ(t − n)⟩_t = 0,  ⟨ψ(t), ψ(t − n)⟩_t = δ_n,  ⟨ψ(t), 2^{−m/2} ψ(2^{−m} t − n)⟩_t = δ_n.

Table 6.1: Major properties of the scaling function and wavelet based on an iterated filter bank with an orthonormal lowpass filter having N zeros at z = −1, or ω = π.
Historical Remarks
TBD
Further Reading
Books and Textbooks Daubechies [31].
Results on Wavelets For the proof of completeness of Theorem 6.6, see [29, 31].
Chapter 7
Approximation, Estimation, and Compression

Contents
7.1 Introduction  262
7.2 Abstract Models and Approximation  262
7.3 Empirical Models  262
7.4 Estimation and Denoising  262
7.5 Compression  262
7.6 Inverse Problems  262
Chapter at a Glance  262
Historical Remarks  262
Further Reading  262
7.A Elements of Source Coding  263
7.1 Introduction
Chapter Outline
7.2 Abstract Models and Approximation
7.2.1 Local Fourier and Wavelet Approximations of Piecewise Smooth Functions
7.2.2 Wide-Sense Stationary Gaussian Processes
7.2.3 Poisson Processes
7.3 Empirical Models
7.3.1 ℓp Models
7.3.2 Statistical Models
7.4 Estimation and Denoising
7.4.1 Connections to Approximation
7.4.2 Wavelet Thresholding and Variants
7.4.3 Frames
7.5 Compression
7.5.1 Audio Compression
7.5.2 Image Compression
7.6 Inverse Problems
7.6.1 Deconvolution
7.6.2 Compressed Sensing

Chapter at a Glance
TBD

Historical Remarks
TBD

Further Reading
TBD
Appendix 7.A Elements of Source Coding
7.A.1 Entropy Coding
7.A.2 Quantization
7.A.3 Transform Coding