Linköping University Post Print

Image Analysis and Reconstruction using a Wavelet Transform Constructed from a Reducible Representation of the Euclidean Motion Group
Remco Duits, Michael Felsberg, Gösta Granlund and Bart M. ter Haar Romeny
N.B.: When citing this work, cite the original article.
The original publication is available at www.springerlink.com:
Remco Duits, Michael Felsberg, Gösta Granlund and Bart M. ter Haar Romeny, Image Analysis and Reconstruction using a Wavelet Transform Constructed from a Reducible Representation of the Euclidean Motion Group, 2007, International Journal of Computer Vision, 72(1), 79-102.
http://dx.doi.org/10.1007/s11263-006-8894-5
Copyright: Springer Science+Business Media
http://www.springerlink.com/
Postprint available at: Linköping University Electronic Press
http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-41574
Image Analysis and Reconstruction using a Wavelet Transform constructed from a Reducible Representation of the Euclidean Motion Group

Remco Duits, Michael Felsberg†, Gösta Granlund† and Bart ter Haar Romeny
[email protected] [email protected] [email protected] [email protected]

Eindhoven University of Technology
Department of Biomedical Engineering
P.O. Box 513, NL-5600 MB Eindhoven, The Netherlands

†Computer Vision Laboratory
Department of Electrical Engineering
Linköping University
S-58183 Linköping, Sweden

The Netherlands Organization for Scientific Research is gratefully acknowledged for financial support.
Abstract

Inspired by the early visual system of many mammals we consider the construction of (and reconstruction from) an orientation score $U_f : \mathbb{R}^2 \times S^1 \to \mathbb{C}$ as a local orientation representation of an image $f : \mathbb{R}^2 \to \mathbb{R}$. The mapping $f \mapsto U_f$ is a wavelet transform $\mathcal{W}_\psi$ corresponding to a reducible representation of the Euclidean motion group onto $L_2(\mathbb{R}^2)$ and an oriented wavelet $\psi \in L_2(\mathbb{R}^2)$. This wavelet transform is a special case of a recently developed generalization of standard wavelet theory, and it has the practical advantage over the usual wavelet approaches in image analysis (constructed from irreducible representations of the similitude group) that it allows a stable reconstruction from one (single-scale) orientation score. Since our wavelet transform is a unitary mapping with a stable inverse, we can directly relate operations on orientation scores to operations on images in a robust manner.

Furthermore, by geometrical examination of the Euclidean motion group $G = \mathbb{R}^2 \rtimes \mathbb{T}$, which is the domain of our orientation scores, we deduce that an operator $\Phi$ on orientation scores must be left invariant to ensure that the corresponding operator $\mathcal{W}_\psi^{-1} \circ \Phi \circ \mathcal{W}_\psi$ on images is Euclidean invariant. As an example we consider all linear second-order left-invariant evolutions on orientation scores corresponding to stochastic processes on $G$. As an application we detect elongated structures in (medical) images and automatically close the gaps between them.

Finally, we consider robust orientation estimates by means of channel representations, where we combine robust orientation estimation and learning of wavelets, resulting in an auto-associative processing of orientation features. Here linear averaging of the channel representation is equivalent to robust orientation estimation, and an adaptation of the wavelet to the statistics of the considered image class leads to auto-associative behavior of the system.
1 Introduction

In many medical image applications it is desirable to construct a local orientation score of a grey-value image. In the case of 2D images $f : \mathbb{R}^2 \to \mathbb{R}$ such an orientation score $U_f : \mathbb{R}^2 \rtimes \mathbb{T} \to \mathbb{C}$ depends on 3 variables $(b_1, b_2, e^{i\theta})$, where $b = (b_1, b_2) \in \mathbb{R}^2$ denotes position and where $e^{i\theta} \in \mathbb{T} \leftrightarrow (\cos\theta, \sin\theta) \in S^1$ is a local orientation variable.¹
Mostly, such an orientation score is obtained by means of a convolution with some anisotropic wavelet $\psi \in L_1(\mathbb{R}^2) \cap L_2(\mathbb{R}^2)$, cf. [24], [20]:

$$U_f(b, e^{i\theta}) = \int_{\mathbb{R}^2} \overline{\psi(R_\theta^{-1}(x' - b))}\, f(x')\, dx', \quad \text{with } R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}. \tag{1}$$
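Equation (1) amounts to a cross-correlation of the image with rotated copies of the wavelet, one per sampled orientation. The following numpy sketch illustrates this with an anisotropic Gaussian as a stand-in wavelet (the proper wavelets of Section 4 would be used in practice); `orientation_score` and `psi_fun` are names introduced here for illustration:

```python
import numpy as np

def orientation_score(f, psi_fun, n_theta):
    """Discretization of eq. (1): U_f(b, theta_k) as an FFT-based
    cross-correlation of f with the wavelet rotated over theta_k."""
    h, w = f.shape
    ys, xs = np.mgrid[0:h, 0:w]
    x, y = xs - w // 2, ys - h // 2            # centered grid coordinates
    F = np.fft.fft2(f)
    U = np.empty((n_theta, h, w), dtype=complex)
    for k in range(n_theta):
        th = 2 * np.pi * k / n_theta
        # sample psi(R_theta^{-1} x) directly on the grid
        xr = np.cos(th) * x + np.sin(th) * y
        yr = -np.sin(th) * x + np.cos(th) * y
        psi = psi_fun(xr, yr)
        Psi = np.fft.fft2(np.fft.ifftshift(psi))
        # correlation theorem: F[U_f(., theta)] = conj(F[psi_theta]) F[f]
        U[k] = np.fft.ifft2(np.conj(Psi) * F)
    return U

# stand-in oriented wavelet: anisotropic Gaussian elongated along the x-axis
psi_fun = lambda x, y: np.exp(-(x**2 / 50.0 + y**2 / 2.0))

f = np.zeros((64, 64))
f[32, :] = 1.0                                 # a horizontal line
U = orientation_score(f, psi_fun, 4)           # orientations 0, 90, 180, 270 deg
peaks = [np.abs(U[k]).max() for k in range(4)]
```

The peak response is largest for the orientations aligned with the line ($k = 0$ and $k = 2$) and much weaker for the perpendicular ones, which is exactly the local orientation information the score makes explicit.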
This idea is inspired by the early visual system of many mammals, in which receptive fields exist that are tuned to various locations and orientations. Thereby a simple cell receptive field can be parameterized by its position and orientation. Assemblies of oriented receptive fields are grouped together on the surface of the primary visual cortex in a pinwheel-like structure, known as the orientation preference structure. The orientation preference structure is an almost everywhere smooth mapping of the Euclidean motion group space $\mathbb{R}^2 \rtimes \mathbb{T}$ onto the 2D surface. Due to the difference in dimensionality, the orientation preference structure is punctuated by so-called pinwheels, which are singularities in this mapping, see Figure 1. Perceptual organization (or image enhancement) on the basis of orientation similarity on images $f$ can be done via their orientation scores $U_f$, as there exists a linear well-posed invertible transformation $\mathcal{W}_\psi$ from the image $f$ to the orientation score $U_f$ and vice versa. This invertibility ensures that no information is lost in the decomposition of an input image into various orientations. As a model for the orientation preference structure in the visual system this implies that the orientation

¹ The torus $\mathbb{T}$ is the group of elements in the set $S^1$ with group product $e^{i\theta} e^{i\theta'} = e^{i(\theta + \theta')}$.
Figure 1: Left: A: Parts of visual cortex active under different orientation stimuli. B: Orientation preference map obtained by vector summation of data obtained for each angle. Orientation preference is color coded according to the key shown below, replicated with permission from [5], copyright 1997 Society of Neuroscience. Right: enlarged section of the rectangular area in the upper figure. Shaded and unshaded areas denote the left and right eye, respectively. Colored lines connect cells with equal orientation sensitivity, replicated with permission from [32].
score may serve as an alternative format to the input luminance function, since there is no loss of data evidence.² As a tool for image processing, however, the inverse mapping from orientation score to original image is a very useful one, as we will see later.

The domain of an orientation score $U_f$ is the well-known Euclidean motion group $G = \mathbb{R}^2 \rtimes \mathbb{T}$, with group product

$$g\, g' = (b, e^{i\theta})(b', e^{i\theta'}) = (b + R_\theta b', e^{i(\theta + \theta')}), \quad g = (b, e^{i\theta}),\ g' = (b', e^{i\theta'}) \in \mathbb{R}^2 \rtimes \mathbb{T},$$

and the mapping $f \mapsto U_f$ is a wavelet transformation

$$U_f(b, e^{i\theta}) = (\mathcal{W}_\psi[f])(g) = (\mathcal{U}_g \psi, f)_{L_2(\mathbb{R}^2)} = (\mathcal{T}_b \mathcal{R}_{e^{i\theta}} \psi, f)_{L_2(\mathbb{R}^2)}, \quad g = (b, e^{i\theta}), \tag{2}$$

where $\mathcal{T}_b \mathcal{R}_{e^{i\theta}} \psi$ is the translated and rotated wavelet and the representation $g \mapsto \mathcal{U}_g$ is given by

$$\mathcal{U}_g \psi(x) = (\mathcal{T}_b \mathcal{R}_{e^{i\theta}} \psi)(x) = \psi(R_\theta^{-1}(x - b)), \quad g = (b, e^{i\theta}) \in G,\ x \in \mathbb{R}^2, \tag{3}$$

where $(\mathcal{T}_b \psi)(x) = \psi(x - b)$, $x \in \mathbb{R}^d$, and $(\mathcal{R}_{e^{i\theta}} \psi)(x) = \psi(R_\theta^{-1} x)$, with $R_\theta$, $\theta \in [0, 2\pi)$, the counterclockwise rotation given in (1).

Because the local orientation is explicitly encoded in the orientation score, it is much easier to do (enhancement or perceptual organization) operations based on local orientations on the score. In section 3 we quantify the stability of $\mathcal{W}_\psi$ and its inverse, by means of a functional Hilbert space approach to the theory of wavelets. In fact, this approach leads to a generalization of standard wavelet theory, which is necessary for our application. Here we will only give a brief explanation of this theory and restrict ourselves to the practical consequences. For a more in-depth mathematical treatment we refer to earlier work, cf. [8], [7], [6] and [11].

² This by no means implies that the visual system actually runs an inverse process.
In section 4 we discuss explicit constructions of wavelets (so-called proper wavelets) that allow a stable (re)construction. Using these proper wavelets we can do image processing via processing on its score, which is discussed in section 5. This kind of image processing, which is useful for detection and completion of elongated structures in (medical) imaging, is also generalized to image processing via invertible orientation scores of 3D images. This is briefly discussed in section 6, where we do not deal with all technical details, but just show some preliminary results.

Finally, in section 7 we focus on robust orientation estimation rather than image enhancement. For this purpose we will describe a different paradigm, based on channel representations. Here we will also discuss the similarity and complementarity of channel representations with the foregoing theory of orientation scores.
2 Preliminaries and Notation

The Fourier transform $\mathcal{F} : L_2(\mathbb{R}^d) \to L_2(\mathbb{R}^d)$ is almost everywhere defined by

$$[\mathcal{F}(f)](\omega) = \hat{f}(\omega) = \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} f(x)\, e^{-i\omega \cdot x}\, dx.$$

Notice that $\|\mathcal{F}[f]\|_2 = \|f\|_2$ and $\mathcal{F}[f * g] = (2\pi)^{d/2}\, \mathcal{F}[f]\, \mathcal{F}[g]$, for all $f, g \in L_2(\mathbb{R}^d)$.

We use the following notation for Euclidean/polar coordinates in the spatial and Fourier domain, respectively: $x = (x, y) = (r\cos\phi, r\sin\phi)$, $\omega = (\omega_x, \omega_y) = (\rho\cos\varphi, \rho\sin\varphi)$, with $\phi, \varphi \in [0, 2\pi)$, $r, \rho > 0$. The corresponding complex variables will be denoted by $z = x + iy = re^{i\phi}$ and $w = \omega_x + i\omega_y = \rho e^{i\varphi}$.

Images are assumed to be within $L_2(\mathbb{R}^d)$. We mainly consider $d = 2$, unless explicitly stated otherwise. The space of bandlimited (by $\varrho > 0$) images is given by

$$L_2^\varrho(\mathbb{R}^d) = \{f \in L_2(\mathbb{R}^2) \mid \mathrm{supp}(\mathcal{F}[f]) \subset B_{0,\varrho}\}, \quad \varrho > 0, \tag{4}$$

where $B_{0,\varrho} = \{\omega \in \mathbb{R}^d \mid \|\omega\| < \varrho\}$.
We will use short notation for the following groups:

- $\mathrm{Aut}(\mathbb{R}^d) = \{A : \mathbb{R}^d \to \mathbb{R}^d \mid A \text{ linear and } A^{-1} \text{ exists}\}$
- dilation group $D(d) = \{A \in \mathrm{Aut}(\mathbb{R}^d) \mid A = aI,\ a > 0\}$
- orthogonal group $O(d) = \{X \in \mathrm{Aut}(\mathbb{R}^d) \mid X^T = X^{-1}\}$
- rotation group $SO(d) = \{R \in O(d) \mid \det(R) = 1\}$
- circle group $\mathbb{T} = \{z \in \mathbb{C} \mid |z| = 1\}$, $z = e^{i\theta}$, $\theta = \arg z$, with group homomorphism $\rho : \mathbb{T} \to SO(2) \subset \mathrm{Aut}(\mathbb{R}^2)$ given by $\rho(z) = R_\theta$, recall (1).

With $B(H)$ we denote the space of bounded operators on $H$. The range of a linear operator $A$ will be denoted by $\mathcal{R}(A)$ and its null space will be denoted by $\mathcal{N}(A)$.

A representation $\mathcal{R}$ of a group $G$ onto a Hilbert space $H$ is a homomorphism $\mathcal{R}$ between $G$ and $B(H)$, the space of bounded linear operators on $H$. It satisfies $\mathcal{R}_{gh} = \mathcal{R}_g \mathcal{R}_h$ for all $g, h \in G$ and $\mathcal{R}_e = I$. A representation $\mathcal{R}$ is irreducible if the only invariant closed subspaces of $H$ are $H$ and $\{0\}$, and otherwise reducible. We mainly consider unitary representations (i.e. $\|\mathcal{U}_g \psi\|_H = \|\psi\|_H$ for all $g \in G$ and $\psi \in H$), which will be denoted by $\mathcal{U}$ rather than $\mathcal{R}$.

In this article we mainly consider the left regular representation of the Euclidean motion group on $L_2(\mathbb{R}^d)$, which is given by (3).
Let $b \in \mathbb{R}^d$, $a > 0$ and $g \in G$ with corresponding $\rho(g) \in \mathrm{Aut}(\mathbb{R}^d)$. Then the unitary operators $f \mapsto \check{f}$, $\mathcal{T}_b$, $\mathcal{D}_a$ and $\mathcal{R}_g$ on $L_2(\mathbb{R}^d)$ are defined by

$$\check{f}(x) = f(-x), \quad (\mathcal{D}_a \psi)(x) = \frac{1}{a^{d/2}}\, \psi\!\left(\frac{x}{a}\right), \quad (\mathcal{T}_b \psi)(x) = \psi(x - b), \quad (\mathcal{R}_g \psi)(x) = \frac{1}{\sqrt{\det \rho(g)}}\, \psi(\rho(g)^{-1} x), \tag{5}$$

which are left regular actions of $O(1)$, $\mathbb{R}^d$, $D(d)$ and $G$, respectively, in $L_2(\mathbb{R}^d)$.

The 2D Gaussian kernel $G_s$ at scale $s$ is given by $G_s(x) = \frac{1}{4\pi s}\, e^{-\frac{\|x\|^2}{4s}}$.
3 Quantification of Well-posedness of Transformations between Image and Orientation Score

Because the local orientation is explicitly encoded in the orientation score, it is much easier to do (enhancement or perceptual organization) operations based on local orientations on the score. However, well-posed image enhancement on the basis of orientation similarity in an image $f$ (without loss of data evidence) can be done via its orientation score $U_f$ iff there exists a stable transformation from the image $f$ to $U_f$ and vice versa. Stability entails that a small³ perturbation of the image must correspond to a small perturbation of the orientation score, and vice versa. For instance, in the case of the Fourier transformation, stability is ensured by the Plancherel theorem, which states that $\|\mathcal{F}(f)\|^2_{L_2(\mathbb{R}^d)} = \|f\|^2_{L_2(\mathbb{R}^d)}$ for all images $f \in L_2(\mathbb{R}^d)$. In standard wavelet theory there also exists such a theorem, but it is not applicable to our case, which brings us to the challenge of generalizing standard wavelet theory.
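The discrete counterpart of this stability statement is the unitarity of the orthonormal DFT, which is easy to check numerically. A minimal sketch (the orthonormal normalization `norm="ortho"` gives the discrete Plancherel identity):

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.standard_normal((32, 32))
g = f + 1e-3 * rng.standard_normal((32, 32))   # a small perturbation of f

# discrete Plancherel: the orthonormal DFT preserves the quadratic norm ...
F = np.fft.fft2(f, norm="ortho")
G = np.fft.fft2(g, norm="ortho")
norm_preserved = np.isclose(np.linalg.norm(F), np.linalg.norm(f))

# ... so a small perturbation of the image gives an equally small
# perturbation of the transform, and vice versa (perfect stability)
perturbation_ratio = np.linalg.norm(G - F) / np.linalg.norm(g - f)
```

For the wavelet transform $\mathcal{W}_\psi$ such an identity only holds in the generalized sense of Theorem 4 below, with a norm weighted by the function $M_\psi$.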
3.1 Why Standard Wavelet Theory is not Applicable to our Application

In this subsection we first explain why standard wavelet theory, summarized in Theorem 1, cannot be applied to the framework of orientation scores. Then we give a short summary (only as far as relevant for our purpose) of the results from a new wavelet theory which we developed in earlier work. For more mathematical background and generalizations we refer to cf. [8], [7], [6] and [11].

Let $H$ be a Hilbert space, e.g. the space of images $L_2(\mathbb{R}^d)$. Let $\mathcal{U}$ be an irreducible unitary representation of the locally compact group $G$, with left-invariant Haar measure $\mu_G$. Recall the definitions in the preliminaries. Let $\psi \in H$ be an admissible wavelet, which means that

$$C_\psi = \int_G \frac{|(\mathcal{U}_g \psi, \psi)|^2}{(\psi, \psi)}\, d\mu_G(g) < \infty;$$

then the wavelet transform $\mathcal{W}_\psi : H \to L_2(G)$ is defined by

$$\mathcal{W}_\psi[f](g) = (\mathcal{U}_g \psi, f)_H.$$

The next theorem is well known in mathematical physics [33], and was first formulated and proven in Grossmann et al. [22]. For a simple alternative and self-contained proof, see [7] p. 20, which uses a topological version of Schur's lemma; for its proof see [6] p. 86.

³ Notice that this depends on the norm imposed on the set of images and on the norm imposed on the set of orientation scores.
Theorem 1 (The Wavelet Reconstruction Theorem) The wavelet transform is a linear isometry (up to a constant) from the Hilbert space $H$ onto a closed subspace $\mathbb{C}_K^G$ of $L_2(G; d\mu_G)$:

$$\|\mathcal{W}_\psi[f]\|^2_{L_2(G)} = C_\psi\, \|f\|^2. \tag{6}$$

The space $\mathbb{C}_K^G$ is the unique functional Hilbert space with reproducing kernel $K(g, g') = \frac{1}{C_\psi}(\mathcal{U}_g \psi, \mathcal{U}_{g'} \psi)$. The corresponding orthogonal projection $\mathbb{P} : L_2(G; d\mu_G) \to \mathbb{C}_K^G$ is given by

$$(\mathbb{P}\Phi)(g) = \int_G K(g, g')\, \Phi(g')\, d\mu_G(g'), \quad \Phi \in L_2(G; d\mu_G).$$

Furthermore, $\mathcal{W}_\psi$ intertwines $\mathcal{U}$ with the left regular representation $\mathcal{L}$, i.e. $\mathcal{L}_g$ given by $\mathcal{L}_g(\Phi) = (h \mapsto \Phi(g^{-1}h))$ on $L_2(G)$:

$$\mathcal{W}_\psi\, \mathcal{U}_g = \mathcal{L}_g\, \mathcal{W}_\psi. \tag{7}$$

We notice that if $\mathcal{U}$ is the left regular representation of $G = \mathbb{R} \rtimes D(1) \rtimes O(1)$ (the group consisting of translations, dilations and polarity) onto $H = L_2(\mathbb{R})$, one obtains the more familiar framework of wavelet theory in 1D signal processing, [9] p. 5-6.
Of course, we would like to apply Theorem 1 to the wavelet transformation that maps an image to its orientation score, recall (2), since it would imply that reconstruction of an image from its orientation score is perfectly well-posed in the sense that (just like in the Fourier transform) the quadratic norm is preserved. Unfortunately, the next lemma shows us that we are not allowed to do so in our case. Therefore, in earlier work, cf. [9], [7], [8], we generalized the standard wavelet theory in such a way that irreducibility is neither a requirement nor replaced by a requirement.

Lemma 2 The left-regular action $\mathcal{U}$ of the Euclidean motion group in $L_2(\mathbb{R}^2)$, given by (2), is a reducible representation.

Proof: Consider the subspace consisting of $L_2$-functions whose Fourier transform has support inside a given disk around the origin with radius, say, $a > 0$, i.e. $L_2^a(\mathbb{R}^2) = \{f \in L_2(\mathbb{R}^2) \mid \mathrm{supp}(\mathcal{F}[f]) \subset B_{0,a}\}$. This is a non-trivial vector space unequal to $L_2(\mathbb{R}^2)$ which is invariant under $\mathcal{U}$, as directly follows from $\mathcal{F}[\mathcal{U}_g \psi] = e^{-i\omega \cdot b}\, \mathcal{R}_{e^{i\theta}} \mathcal{F}[\psi]$, for all $\psi \in L_2^a(\mathbb{R}^2)$. $\square$
We could consider the similitude group $SIM(2) = \mathbb{R}^2 \rtimes (\mathbb{T} \times D(1))$ with representation

$$\mathcal{V}_{b, e^{i\theta}, a}\, \psi(x) = \frac{1}{a}\, \psi\!\left(\frac{R_\theta^{-1}(x - b)}{a}\right), \quad a > 0,\ \theta \in [0, 2\pi),\ b \in \mathbb{R}^2,$$

which is irreducible; for a proof see [26] p. 51-52. This brings us within the standard wavelet frameworks in 2D image analysis (in particular to 2D Gabor wavelets, [25], or Cauchy wavelets [2]). But from an implementation/practical point of view we do not want to consider multiple scales, but stick to a single scale. This pertains to the so-called Euclidean coherent states from mathematical physics, [23], which are not to be confused with the more familiar Euclidean coherent states constructed from the irreducible⁴ representations of the Euclidean motion group onto $L_2(S^1)$, cf. [1] p. 219-220.

Omitting the dilation group poses an important question. For example, in scale space theory, [10], it is a well-known problem that reconstruction of a sharp image $f$ from its (say Gaussian) blurred version $f * G_s$ is extremely ill-posed. Is it possible to get around this ill-posedness by considering all rotated versions of linear combinations of Gaussian derivatives

⁴ They are, in fact, up to equivalence, the only irreducible representations of the Euclidean motion group, cf. [30].
Figure 2: Integrating the discrete orientation score $U_f^4$ over its 4 discrete orientations (8) boils down to convolution with the discrete spike $\delta$: the four rotated copies of the discrete wavelet (each with value $1$ at the center and at one 4-neighbour, and $-1/3$ at the three remaining 4-neighbours) sum to a stencil with value $4$ at the origin and $0$ elsewhere.
$f * (\partial_x)^p (\partial_y)^q G_s$? Before we give an affirmative answer to this question and deal with the issue of well-posed reconstruction of images from orientation scores, we give an illustration by means of an extremely simplified discrete example, where reconstruction is done by integration over discrete orientations, rather than by inverse convolution.
Example: Suppose we construct a discrete orientation score with only 4 orientations (up, down, left and right), constructed with the following discrete oriented wavelet $\psi : \mathbb{Z} \times \mathbb{Z} \to \mathbb{R}$, given by

$$\psi[x^1, x^2] = \begin{cases} 1 & \text{if } (x^1, x^2) \in \{(0,0), (1,0)\}, \\ -1/3 & \text{if } (x^1, x^2) \in \{(0,1), (0,-1), (-1,0)\}, \\ 0 & \text{else,} \end{cases}$$
see Figure 2. Then reconstruction of the original discrete image $f : \mathbb{Z} \times \mathbb{Z} \to \mathbb{R}$ from its orientation score is done by integration over all directions:

$$f[x^1, x^2] = \frac{1}{4} \sum_{k=1}^{4} U_f^4[x^1, x^2, e^{ik\pi/2}]. \tag{8}$$
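This toy example is small enough to verify directly: the four 90-degree rotations of the stencil sum to a spike of height 4, so averaging the four correlation responses as in (8) returns the original image exactly. A numpy sketch (`correlate` is a helper name introduced here):

```python
import numpy as np

# the discrete oriented wavelet: 1 at the origin and at one 4-neighbour,
# -1/3 at the three remaining 4-neighbours (3x3 stencil, centre at [1, 1])
psi = np.array([[0.0, -1/3, 0.0],
                [-1/3, 1.0, 1.0],
                [0.0, -1/3, 0.0]])

# the four rotated copies sum to 4 * delta (a spike of height 4)
total = sum(np.rot90(psi, k) for k in range(4))

def correlate(f, kernel):
    """Circular cross-correlation of image f with a 3x3 kernel, via FFT."""
    K = np.zeros_like(f)
    K[:3, :3] = kernel
    K = np.roll(K, (-1, -1), axis=(0, 1))     # put the kernel centre at (0, 0)
    return np.real(np.fft.ifft2(np.conj(np.fft.fft2(K)) * np.fft.fft2(f)))

rng = np.random.default_rng(1)
f = rng.standard_normal((16, 16))
# discrete orientation score: one correlation response per orientation
U = np.stack([correlate(f, np.rot90(psi, k)) for k in range(4)])
f_rec = U.mean(axis=0)                         # reconstruction (8)
```

Averaging the responses is the same as correlating with the averaged kernel, which is exactly the spike $\delta$, hence the exact reconstruction.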
3.2 Generalization of Standard Wavelet Theory by Reproducing Kernel Theory
Before we formulate the main theorem (Theorem 4), from which we can quantify the stability of the transformations between image and orientation score, we give a short explanation of reproducing kernel Hilbert spaces (also called functional Hilbert spaces), which is necessary to read and understand the theorem.

A reproducing kernel Hilbert space is a Hilbert space consisting of complex-valued functions on an index set $I$ on which the point evaluation $\delta_a$, given by $\delta_a(f) = f(a)$, is a continuous linear functional for all $a \in I$. This means that $\delta_a(f_n) = f_n(a) \to \delta_a(f) = f(a)$ for every sequence $\{f_n\}$ in $H$ which converges to $f$, $f_n \to f$. It is not difficult to show that a linear functional is continuous if and only if it is bounded. So $\delta_a$ is a continuous linear functional if and only if there exists a constant $C_a$ such that $|f(a)| \leq C_a \|f\|_H$. For example, the spaces $L_2(\mathbb{R}^d)$ are not functional Hilbert spaces, but the well-known first-order Sobolev space $H^1(\mathbb{R})$ is such a functional Hilbert space. Another example, related to image processing, is the space of bandlimited images (on a square $[-a, a] \times [-a, a]$), where the reproducing kernel⁵ is given by $K(x, x') = \frac{4a^2}{2\pi}\, \mathrm{sinc}(a(x - x'))\, \mathrm{sinc}(a(y - y')) = \mathcal{F}^{-1}[1_{[-a,a] \times [-a,a]}](x - x')$, and, as is pointed out in [8] p. 121, p. 126-127 and [6], the Nyquist theorem is a direct consequence of Theorem 3 below.

⁵ Also the space $L_2^\varrho(\mathbb{R}^2)$ is a reproducing kernel Hilbert space, with reproducing kernel $K(x, x') = \mathcal{F}^{-1}[1_{B_{0,\varrho}}](x - x') = \varrho\, \frac{J_1(\varrho \|x - x'\|)}{\|x - x'\|}$.
If $H$ is a functional Hilbert space, then $\delta_a$ is a continuous linear functional, so that by the Riesz representation theorem it has a Riesz representant $K_a \in H$ such that

$$f(a) = \delta_a(f) = (K_a, f)_H,$$

for every $a \in I$. The function $K : I \times I \to \mathbb{C}$ given by $K(a, b) = (K_a, K_b)_H = K_b(a)$ is called the reproducing kernel. This reproducing kernel is a function of positive type on $I$, i.e.

$$\sum_{i=1}^{n} \sum_{j=1}^{n} K(m_i, m_j)\, c_i\, \overline{c_j} \geq 0, \quad \text{for all } n \in \mathbb{N},\ c_1, \ldots, c_n \in \mathbb{C},\ m_1, \ldots, m_n \in I.$$

Conversely, as Aronszajn pointed out in his paper, cf. [3], a function $K$ of positive type on a set $I$ uniquely induces a reproducing kernel Hilbert space consisting of functions on $I$ with reproducing kernel $K$. We denote this unique reproducing kernel Hilbert space by $\mathbb{C}_K^I$. For the explicit construction of this space, we refer to [8] p. 120, p. 221-222 or [27], [3]. We notice that this (abstract) construction is somewhat disappointing from the engineering point of view, as it does not lead to a simple tangible description of the inner product: only in some cases is it possible to obtain explicit tangible descriptions of these inner products. Nevertheless, there is the following very general result⁶:
Theorem 3 Let $V = \{\psi_m \mid m \in I\}$ be a subset of $H$ such that its linear span is dense in $H$. Define the function $K : I \times I \to \mathbb{C}$ by $K(m, m') = (\psi_m, \psi_{m'})_H$. Then the transform $\mathcal{W} : H \to \mathbb{C}_K^I$ defined by

$$(\mathcal{W}[f])(m) = (\psi_m, f)_H \tag{9}$$

is a unitary mapping, i.e. $\|\mathcal{W}[f]\|_{\mathbb{C}_K^I} = \|f\|_H$ for all $f \in H$.

For the proof, see [7] p. 8 or [8] p. 221-222.
By applying this theorem to the case $H = L_2(\mathbb{R}^2)$, $I = G$, where $G = \mathbb{R}^2 \rtimes \mathbb{T} \equiv \mathbb{R}^2 \rtimes SO(2)$ and $V = \{\mathcal{U}_g \psi \mid g \in G\}$, whose span is dense in $L_2(\mathbb{R}^2)$ iff

$$0 < M_\psi(\omega) := (2\pi) \int_0^{2\pi} |\mathcal{F}(\psi)(\rho, \varphi)|^2\, d\varphi < \infty \tag{10}$$

almost everywhere on $\mathbb{R}^2$, and by characterizing the inner product on the space $\mathbb{C}_K^{\mathbb{R}^2 \rtimes \mathbb{T}}$, we obtain the following result:
Theorem 4 The space of orientation scores is a reproducing kernel Hilbert space $\mathbb{C}_K^{\mathbb{R}^2 \rtimes \mathbb{T}}$, which is a closed subspace of $\mathbb{H}_\psi \otimes L_2(\mathbb{T})$, which in turn is a vector subspace⁷ of $L_2(G)$, where $\mathbb{H}_\psi = \{f \in L_2(\mathbb{R}^d) \mid M_\psi^{-1/2}\, \mathcal{F}[f] \in L_2(\mathbb{R}^d)\}$. The inner product on $\mathbb{C}_K^{\mathbb{R}^2 \rtimes \mathbb{T}}$ is given by $(\Phi, \Psi)_{M_\psi} = (T_{M_\psi}[\Phi], T_{M_\psi}[\Psi])_{L_2(G)}$, where

$$[T_{M_\psi}[\Phi]](b, \theta) = \mathcal{F}^{-1}\!\left[\omega \mapsto M_\psi^{-\frac{1}{2}}(\omega)\, \mathcal{F}[\Phi(\cdot, e^{i\theta})](\omega)\right]\!(b),$$

which is thereby explicitly characterized by means of the function $M_\psi$ given in (10). The wavelet transformation which maps an image $f \in L_2(\mathbb{R}^2)$ onto its orientation score $U_f \in \mathbb{C}_K^{\mathbb{R}^2 \rtimes \mathbb{T}}$ is a unitary mapping: $\|f\|^2_{L_2(\mathbb{R}^2)} = \|U_f\|^2_{M_\psi} = (U_f, U_f)_{M_\psi}$.

As a result, the image $f$ can be exactly reconstructed from its orientation score $U_f = \mathcal{W}_\psi[f]$ by means of the adjoint wavelet transformation $\mathcal{W}_\psi^*$:

$$f = \mathcal{W}_\psi^* \mathcal{W}_\psi[f] = \mathcal{F}^{-1}\!\left[\omega \mapsto \int_0^{2\pi} \mathcal{F}[U_f(\cdot, e^{i\theta})](\omega)\, \mathcal{F}[\mathcal{R}_{e^{i\theta}} \psi](\omega)\, d\theta \cdot M_\psi^{-1}(\omega)\right]. \tag{11}$$

Proof: Take $d = 2$, $T = \mathbb{T}$ and $\rho : \mathbb{T} \to \mathrm{Aut}(\mathbb{R}^2)$ given by $\rho(e^{i\theta})x = R_\theta x$ in Theorem 4.4 formulated in [9], which is a summary of results proved in [7] p. 27-30 (here one must set $S = \mathbb{R}^2$). $\square$

⁶ Special cases include frame theory, cf. [8], sampling theorems and wavelet theory, cf. [6].
⁷ I.e. it is a subspace as a vector space, but is equipped with a different norm.
Consequences and Remarks:

1. This theorem easily generalizes to $d$-dimensional images, i.e. $f \in L_2(\mathbb{R}^d)$, $d = 2, 3, \ldots$. The only thing that changes is that integration now takes place over $SO(d)$ and the function $M_\psi$ becomes

$$M_\psi(\omega) = (2\pi)^{d/2} \int_{SO(d)} |\mathcal{F}(\mathcal{R}_t \psi)(\omega)|^2\, d\mu_{SO(d)}(t),$$

where $d\mu_{SO(d)}(t)$ is the normalized left-invariant Haar measure of $SO(d)$; this $M_\psi$ is the Fourier transform of

$$\tilde{\psi}(x) = \int_{SO(d)} (\overline{\mathcal{R}_t \psi} \star \mathcal{R}_t \psi)(x)\, d\mu_{SO(d)}(t),$$

the integral over the autocorrelations of all rotated kernels. It can be shown that if $\psi \in L_1(\mathbb{R}^2)$, then $M_\psi$ and $\tilde{\psi}$ are continuous functions, and thereby vanishing at infinity. As a result, the ideal case $M_\psi = (2\pi)^{d/2}$ (in which case we would have quadratic norm preservation between image and orientation score in $L_2(\mathbb{R}^d \rtimes SO(d))$) cannot be obtained, unless one uses a Gelfand triple structure (just like for the Fourier transform) constructed by means of the Laplace operator⁸; but this goes beyond the scope of this paper, for details see [6].
2. Theorem 4 easily generalizes to the discrete orientation group, i.e. $G = \mathbb{R}^2 \rtimes \mathbb{T}_N$, where

$$\mathbb{T}_N = \{e^{ik\theta} \mid k \in \{0, 1, \ldots, N-1\},\ \theta = \tfrac{2\pi}{N}\}, \quad N \in \mathbb{N}, \tag{12}$$

by replacing integrations by discrete summations. Notice that the discrete orientation score $U_f^N(b, e^{ik\theta})$ of an image $f \in L_2(\mathbb{R}^2)$ is given by

$$U_f^N(b, e^{ik\theta}) = (\mathcal{T}_b \mathcal{R}_{e^{ik\theta}} \psi, f)_{L_2(\mathbb{R}^2)}, \quad k \in \{0, 1, \ldots, N-1\},\ \theta = \frac{2\pi}{N},$$

and the discrete version of the function $M_\psi$ is $M_\psi(\omega) = \frac{1}{N} \sum_{k=0}^{N-1} |\mathcal{F}(\mathcal{R}_{e^{ik\theta}} \psi)(\omega)|^2$.
3. The function $M_\psi$ completely determines the stability of the forward and backward transformation. In practice (due to sampling) we work with bandlimited images. If we restrict the wavelet transformation to the space of bandlimited images $L_2^\varrho(\mathbb{R}^2)$ we can define a

⁸ This operator commutes with the left regular actions $\mathcal{U}_g$ for all $g \in \mathbb{R}^d \rtimes SO(d)$.
Figure 3: Plots of $\lambda \mapsto M_N(\rho^2)$, with $\rho^2 = \lambda(N+1)$, for $N = 20, 40, 60, 80, 100$.
condition number (with respect to quadratic norms on the space of images and the space of orientation scores), [8]. This condition number tends to 1 when $M_\psi$ tends to a constant function on the relevant part of the spectrum, say $1_{B_{0,\varrho}}$. We will call wavelets with the property $M_\psi|_{B_{0,\varrho}} \approx 1$ proper wavelets, as they guarantee a stable reconstruction. For these types of wavelets we could just as well use the approximative reconstruction formula⁹

$$\tilde{f} = \mathcal{F}^{-1}\!\left[\omega \mapsto \int_0^{2\pi} \mathcal{F}[U_f(\cdot, e^{i\theta})](\omega)\, \mathcal{F}[\mathcal{R}_{e^{i\theta}} \psi](\omega)\, d\theta\right]. \tag{13}$$
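In the discrete setting of Remark 2 this is easy to make concrete: with unnormalized FFTs, the exact reconstruction (11) reduces to dividing the angular average of $\hat{U}_k \hat{\psi}_k$ by the discrete $M_\psi$, and this works for any wavelet whose $M_\psi$ never vanishes, not only proper ones. A sketch with an anisotropic Gaussian stand-in wavelet (constants of the paper's Fourier convention are absorbed by the FFT normalization):

```python
import numpy as np

def rotated_wavelets(shape, n_theta):
    """Sample an anisotropic Gaussian at n_theta rotations; its spectrum is
    strictly positive, so the discrete M_psi below never vanishes."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    x, y = xs - w // 2, ys - h // 2
    psis = []
    for k in range(n_theta):
        th = 2 * np.pi * k / n_theta
        xr = np.cos(th) * x + np.sin(th) * y
        yr = -np.sin(th) * x + np.cos(th) * y
        psis.append(np.exp(-(xr**2 / 8.0 + yr**2 / 1.0)))
    return np.array(psis)

n_theta, shape = 8, (32, 32)
Psi = np.fft.fft2(np.fft.ifftshift(rotated_wavelets(shape, n_theta), axes=(1, 2)))

rng = np.random.default_rng(2)
f = rng.standard_normal(shape)
F = np.fft.fft2(f)

U_hat = np.conj(Psi) * F                  # forward transform, per orientation
M = np.mean(np.abs(Psi) ** 2, axis=0)     # discrete M_psi(omega)

# exact reconstruction, discrete analogue of (11)
f_rec = np.real(np.fft.ifft2(np.mean(U_hat * Psi, axis=0) / M))

# condition quality: how far M_psi is from a constant on the spectrum
condition = M.max() / M.min()
```

For a proper wavelet `condition` would be close to 1 on the band of interest and the division by `M` could be dropped, which is exactly the approximation (13).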
4 Construction of Proper Wavelets

Wavelets with $M_\psi = 1_{B_{0,\varrho}}$ induce optimal stability of the (inverse) wavelet transform, but because of the discontinuity at $\rho = \|\omega\| = \varrho$ this choice causes numerical problems with the discrete inverse Fourier transform in practice. To avoid this practical problem we mainly focus on wavelets $\psi$ with $M_\psi(\omega) = M_N(\sigma^2 \rho^2)$, $N \in \mathbb{N}$, $\sigma > 0$, $\rho = \|\omega\|$, where

$$M_N(\rho^2) = e^{-\rho^2} \sum_{k=0}^{N} \frac{\rho^{2k}}{k!} \leq 1. \tag{14}$$

The function $M_\psi$ then smoothly approximates $1_{B_{0,\varrho}}$, see Figure 3, and thereby guarantees a stable reconstruction. In the sequel we will call a wavelet $\psi \in L_2(\mathbb{R}^2) \cap L_1(\mathbb{R}^2)$ with such an $M_\psi$ a proper wavelet. Within the class of proper wavelets, we are looking for wavelets that are also good practical line detectors, since in the score we want to see a clear distinction between elongated structures and non-elongated structures.
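The truncated-exponential profile (14) is cheap to evaluate and behaves as claimed: it stays in $[0, 1]$, is flat near 1 well inside the band, and falls smoothly to 0 beyond the transition around $\rho^2 \approx N + 1$ (the scaling used in Figure 3). A sketch:

```python
import numpy as np

def M_N(rho2, N):
    """M_N(rho^2) = exp(-rho^2) * sum_{k=0}^N rho^(2k) / k!   (eq. 14),
    evaluated with a stable multiplicative recurrence for the terms."""
    rho2 = np.asarray(rho2, dtype=float)
    term = np.exp(-rho2)          # k = 0 term
    total = term.copy()
    for k in range(1, N + 1):
        term = term * rho2 / k
        total += term
    return total

N = 20
inside = M_N(1.0, N)              # deep inside the band: close to 1
edge = M_N(N + 1.0, N)            # at the transition rho^2 = N + 1
outside = M_N(100.0, N)           # far outside the band: close to 0
```

Since the profile never actually vanishes, exact reconstruction via (11) remains possible; the point of a proper wavelet is only that the plain approximation (13) is already good.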
To this end we notice that it is hard to give a strict mathematical answer to the question of which elongated wavelet to choose for an application in which elongated structures must be enhanced, detected and completed. In practice it is usually sufficient to consider wavelets that are similar to the local elongated patches one would like to detect and orthogonal to the local patches one does not want to detect; in other words, to employ the basic principles of template matching.
4.1 Construction of Proper Wavelets by Expansion in Eigenfunctions of the Harmonic Oscillator

A convenient basis in which to expand wavelets is formed by the eigenfunctions (represented in polar coordinates) of the harmonic oscillator operator $\|x\|^2 - \Delta$ from quantum mechanics. To this end we notice that this basis is a complete basis in $L_2(\mathbb{R}^d)$, $d = 2, 3, \ldots$. Moreover, it is readily verified that

⁹ We stress that even if $M_\psi \neq 1$, stability is still manifest. The only requirement on $M_\psi$ is that it remains overall finite and non-vanishing. Recall that in general one has to use (11) for exact reconstruction.
Figure 4: Radial basis functions $h_n^m$: left for $m = 0$ and middle for $m = 1$, in lighter gray for $n = 0, 1, 2, \ldots$. Right: the basis functions are effectively active on $[0, R_{mn})$, where $R_{mn} = \sqrt{2(2n + |m| + 1)}$, as this equals the radius where the total energy $E_{mn} = 2(2n + |m| + 1)$ equals the potential energy given by $V(x) = r^2$. This is illustrated by joining the graphs of $h_n^m$, $m = 0, 1$, $n = 0, 1, 2$, together with their corresponding energy levels and the graph of the potential $V$.
Figure 5: Local expansion of a $33 \times 33$ pixel patch of an MRI image showing a bifurcating blood vessel in the retina. The original image is on the left, the reconstruction with basis functions up to $|m| = 32$ and $n = 12$ is in the middle, and the same reconstruction with linear dampening of higher $m$ and $n$ components is depicted on the right.
the harmonic oscillator commutes both with the rotation operators $\mathcal{R}_R$, given by $\mathcal{R}_R \psi(x) = \psi(R^{-1}x)$, $x \in \mathbb{R}^d$, $\psi \in L_2(\mathbb{R}^d)$, and with the Fourier transformation. As a result they have a common set of eigenfunctions. Consequently, these eigenfunctions are steerable. Moreover, they are (up to a phase factor) Fourier invariant, which enables us to control the shape of the wavelet in the Fourier domain (in order to get a proper wavelet, i.e. stable (re)construction) and simultaneously in the spatial domain (in order to get a good line detector). In this respect, we stress that any other choice of a complete polar-separable basis is inferior to the complete basis of eigenfunctions of the harmonic oscillator, due to the Bochner-Hecke Theorem, see Appendix A. In this section we only consider 2D images (so $d = 2$), so $L_2(\mathbb{R}^2) = L_2(S^1) \otimes L_2((0, \infty); r\, dr)$, and the eigenfunctions¹⁰ are given by $(h_n^m \otimes Y_m)(x) = h_n^m(r)\, Y_m(\phi)$, $x = (r\cos\phi, r\sin\phi)$, where

$$h_n^m(r) = \left(\frac{2\, n!}{(|m|+n)!}\right)^{1/2} r^{|m|}\, e^{-r^2/2}\, L_n^{(|m|)}(r^2), \quad r > 0,\ n \in \mathbb{N} \cup \{0\},\ m \in \mathbb{Z}; \qquad Y_m(\phi) = \frac{1}{\sqrt{2\pi}}\, e^{im\phi}, \quad \phi \in [0, 2\pi), \tag{15}$$

where $L_n^{(|m|)}$ is the $n$-th generalized Laguerre polynomial of type $|m|$, see Figure 4. Considering numerical expansions of local patches of elongated structures in this basis (in order to see which components are important to detect), we notice that for each $|m|$ a soft (linear) cut-off in $n$ is required, see Figure 5. By expanding the wavelet $\psi$ in the complete basis of eigenfunctions

¹⁰ The eigenvalues of $Y_m \otimes h_n^m$, $m \in \mathbb{Z}$, $n \in \mathbb{N} \cup \{0\}$, with respect to $\mathcal{F}$, $\mathcal{R}_{e^{i\theta}}$ and $\|x\|^2 - \Delta$ are, respectively, $(-1)^{n+m}(-i)^{|m|}$, $e^{-im\theta}$ and $2(2n + |m| + 1)$.
of the harmonic oscillator we get:

$$\begin{aligned}
\psi(x) &= \sum_{m \in \mathbb{Z}} \sum_{n=0}^{\infty} \alpha_n^m\, (Y_m \otimes h_n^m)(\phi, r), \\
\mathcal{F}[\psi](\omega) &= \sum_{m \in \mathbb{Z}} \sum_{n=0}^{\infty} (-i)^{|m|}(-1)^{n+m}\, \alpha_n^m\, (Y_m \otimes h_n^m)(\varphi, \rho), \\
(\mathcal{R}_{e^{i\theta}} \psi)(x) &= \sum_{m \in \mathbb{Z}} \sum_{n=0}^{\infty} \alpha_n^m\, e^{-im\theta}\, (Y_m \otimes h_n^m)(\phi, r), \\
U_f(b, e^{i\theta}) &= \sum_{m \in \mathbb{Z}} \sum_{n=0}^{\infty} (-1)^m\, \overline{\alpha_n^m}\, e^{im\theta}\, \big((Y_{-m} \otimes h_n^m) * f\big)(b), \\
M_\psi(\omega) &= \sum_{m=-\infty}^{\infty} \Big| \sum_{n=0}^{\infty} (-1)^n\, \alpha_n^m\, h_n^m(\rho) \Big|^2 = \sum_{m=-\infty}^{\infty} \sum_{n=0}^{\infty} \sum_{n'=0}^{\infty} (-1)^{n+n'}\, \alpha_n^m\, \overline{\alpha_{n'}^m}\, h_n^m(\rho)\, h_{n'}^m(\rho).
\end{aligned} \tag{16}$$
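The radial functions $h_n^m$ of (15) can be evaluated without special-function libraries, via the standard three-term recurrence of the generalized Laguerre polynomials, and their orthonormality in $L_2((0, \infty); r\, dr)$, which underlies the expansions above, can be checked by quadrature. A sketch (`laguerre` and `h` are helper names introduced here):

```python
import numpy as np
from math import factorial

def laguerre(n, alpha, u):
    """Generalized Laguerre polynomial L_n^(alpha)(u) by the standard
    three-term recurrence."""
    L_prev = np.ones_like(u)
    if n == 0:
        return L_prev
    L = 1.0 + alpha - u
    for k in range(1, n):
        L_prev, L = L, ((2*k + 1 + alpha - u) * L - (k + alpha) * L_prev) / (k + 1)
    return L

def h(n, m, r):
    """Radial eigenfunction h_n^m of eq. (15)."""
    m = abs(m)
    norm = np.sqrt(2.0 * factorial(n) / factorial(m + n))
    return norm * r**m * np.exp(-r**2 / 2.0) * laguerre(n, m, r**2)

# orthonormality in L2((0, inf); r dr), checked by a simple Riemann sum
r = np.linspace(0.0, 12.0, 40001)
dr = r[1] - r[0]
gram = np.array([[np.sum(h(n, 2, r) * h(k, 2, r) * r) * dr for k in range(4)]
                 for n in range(4)])
```

The Gram matrix comes out as the identity to quadrature accuracy, confirming the normalization constant in (15).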
For details on derivations of proper wavelets with $M_\psi(\omega) = M_N(\sigma^2 \rho^2)$ we refer to earlier work [9]. Here we only consider a specific case which leads to a nice line-detecting proper wavelet, which corresponds to the wavelet proposed by Kalitzin et al. in [24].

Example: the special case $\alpha_n^m = \alpha_m\, \delta_{n0}$.
In this case $M_\psi = M_N$, recall (14), and (16) simplifies to

$$M_\psi(\omega) = \sum_{m=0}^{N} |\alpha_m|^2\, (h_0^m(\rho))^2 = M_N(\rho^2), \quad \rho = \|\omega\|. \tag{17}$$
The (up to phase factors unique) solution $\psi_N^0$ of (17) is now given by ($\alpha_m = 1$ for all $m$):

$$\psi_N^0(x) = \sum_{m=0}^{N} \frac{1}{\sqrt{2\pi}}\, \frac{r^m e^{-r^2/2}}{\sqrt{m!}}\, e^{im\phi} = \frac{1}{\sqrt{2\pi}} \sum_{m=0}^{N} \frac{z^m}{\sqrt{m!}}\, e^{-|z|^2/2} = \frac{1}{\sqrt{2\pi}} \sum_{m=0}^{N} \frac{(-1)^m}{\sqrt{m!}}\, (2\partial_{\bar z})^m\, e^{-|z|^2/2}, \quad z = re^{i\phi}. \tag{18}$$

This series converges uniformly on compacta, but not in the $L_2$-sense. The real part of this wavelet corresponds to the wavelet first proposed by Kalitzin, cf. [24], as a line detector in medical images. The imaginary part is a good edge detector.
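The collapse of (16) to (17) for $\alpha_n^m = \delta_{n0}$ can be checked numerically: with the normalization of $h_0^m$ as in (15), the sum $\sum_{m=0}^{N} (h_0^m(\rho))^2$ reproduces the profile $M_N(\rho^2)$ of (14) up to a constant factor (the factor is exactly 2 for the normalization used in this sketch; in the paper it is absorbed into the normalization conventions). A sketch:

```python
import numpy as np
from math import factorial

N = 20
rho = np.linspace(0.1, 8.0, 400)

# h_0^m from eq. (15); note L_0^(m) = 1
def h0(m):
    return np.sqrt(2.0 / factorial(m)) * rho**m * np.exp(-rho**2 / 2.0)

S = sum(h0(m) ** 2 for m in range(N + 1))      # left-hand side of (17)

# truncated-exponential profile M_N of eq. (14)
term = np.exp(-rho**2)
M = term.copy()
for k in range(1, N + 1):
    term = term * rho**2 / k
    M += term

ratio = S / M                                   # constant across all rho
```

That the ratio is constant in $\rho$ is the essential point: the radial profile of $M_\psi$ for this coefficient choice is the smooth band-indicator $M_N$, so $\psi_N^0$ is a proper wavelet.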
Practical Aspects: The cut-off index $N$ has a practical upper bound because of sampling. If $N$ increases the reconstruction will become better, but if we choose $N$ too large the wavelet behaves badly along $\phi = 0$, see Figure 6.

We notice that $\tilde{\psi}_N^0 = \mathcal{F}^{-1}[\omega \mapsto M_N(\rho^2)]$, which equals the integral over the autocorrelations of all rotated kernels, is an approximation of the identity¹¹, whereas the wavelets $\psi_N^0$ themselves are not.
The size of the wavelet $\psi_N^0$ can be controlled by dilation, $x \mapsto (\mathcal{D}_\alpha \psi_N^0)(x) = \frac{1}{\alpha}\, \psi_N^0(x/\alpha)$. This does affect $M_\psi$, since $\mathcal{F}\mathcal{D}_\alpha = \mathcal{D}_{1/\alpha}\mathcal{F}$, but, for $N$ sufficiently large, the stability remains guaranteed. Moreover, for large $N$, the wavelet can be smoothed by convolving it with a relatively small Gaussian kernel:

$$\psi_{N,s}^0(x) = (G_s * \psi_N^0)(x) = e^{-\frac{1}{2}\beta(\varrho,s)\, \bar{z}z}\, (\mathcal{D}_{\lambda(\varrho,s)}\, \psi_N^0)(z), \quad z = x + iy, \tag{19}$$

where we recall that $\varrho > 0$ equals the maximum frequency radius, and where $\lambda(\varrho, s)$ and $\beta(\varrho, s) = O(s\varrho^2)$ depend on $\varrho$ and $s$, with $s\varrho^2 \ll 1$. It is easily verified that $\psi \mapsto G_s * \psi$ implies $\tilde{\psi} \mapsto G_{2s} * \tilde{\psi}$, so as long as the scale $s$ is relatively small, the Gaussian convolution of the wavelet is harmless for the stability of the (re)construction.

¹¹ This means $f * \tilde{\psi}_N^0 \to f$ in the $L_2$-sense for all $f \in L_2(\mathbb{R}^2)$ as $N \to \infty$.
Figure 6: Top row: Left: two different views of the graph of $\psi_{N=15}^0$. Right: two different views of the graph of $\psi_{N=30}^0$. Bottom row: Left: 3 plots of the graph of $\Re(\psi_{N=100}^0)$. Right: $100 \times 100$-pixel grey-value plots of $\Re(\psi_{N=100}^0)$, plain and Gaussian blurred (with $\sigma = 0.8$ pixels). Notice that the kernel becomes sharper and the oscillatory ring vanishes as $N$ increases. The locally highly oscillatory behavior within $\Re(\psi_{N=100}^0)$ may seem awkward, but is not really harmful, since it disappears immediately by convolution with a Gaussian kernel of tiny scale; see also (19).
Figure 7: Left: the graphs of the kernel \psi_N^0(x), cut off at N = 10, 20, 30, 40, 50 with \alpha = 1/8, restricted to its main direction \theta = 0 (grey-value as a function of r). Notice that the peaks move outward as N increases, and that the asymptotic formula \psi_\infty^0(r, \theta = 0) = (8\pi)^{1/4}\sqrt{r}\,\big(1 - \frac{1}{16r} + O(\frac{1}{r^2})\big) = (8\pi)^{1/4}\sqrt{r} + O\big(\frac{1}{\sqrt{r}}\big), derived in [8], section 7.3, is a very good approximation (we included the graph of r \mapsto (8\pi)^{1/4}\sqrt{r}). Right: the corresponding functions M_N = M_{\psi_N}, plotted as functions of the frequency radius \rho, which indeed approximate 1 as N \to \infty.
4.2 Simple Approach to Construction of Proper Wavelets
In [8] we also developed a simpler approach to obtain proper wavelets, which we will again illustrate with an example. In this more heuristic approach we do not insist on having full control over the analytic expansions in the spatial and Fourier domain by means of the harmonic oscillator basis. Here we take the condition M_\psi \approx 1 as a starting point and consider line detector wavelets only in the Fourier domain. Although we do not have full control over the shape of the wavelet in the spatial domain, we will rely on the basic but clear intuition that an elongated wavelet in the Fourier domain (say along axis e_\eta) corresponds to a double-sided elongated wavelet in the spatial domain (along axis R_{\pi/2} e_\eta). The following example shows that it is possible to obtain proper wavelets which are nice line detectors and which allow a simple and fast reconstruction scheme.12

12 Once again exact reconstruction can be obtained by (11), but to avoid any kind of de-blurring, that is, divisions in the Fourier domain, we give fast and simple approximative reconstruction schemes (either by (13) or even by (21)) that are sufficient in practical applications.
R. Duits, M. Felsberg, G.H. Granlund , B.M. ter Haar Romeny
Figure 8: Upper row: plots of the graph of the real and imaginary part of the proper wavelet \psi given by (20), determined by the discrete inverse Fourier transform of \omega \mapsto \sqrt{A(\varphi) G_s(\|\omega\|)} (we set s = 800), sampled on a 256 x 256 equidistant grid. From left to right: density plot of \Re(\psi) at true size, 3D views of the graphs of \Re(\psi) and \Im(\psi). Bottom row: MRI image of the retina, three slices U_f(\cdot, e^{ik\Delta}), k = 0, 2, 4, of the discrete orientation score, and the fast approximative reconstruction, which is close to exact reconstruction.
Example:
Consider N = 18 discrete orientations, so \Delta = \frac{2\pi}{N}. The idea is to "fill a cake by pieces of cake" in the Fourier domain. In order to avoid high frequencies in the spatial domain, these pieces must be smooth and thereby they must overlap. Let \psi = \mathcal{F}^{-1}[\omega \mapsto \sqrt{A(\varphi) G_s(\rho)}] and let A : S^1 \to \mathbb{R}^+ be given by

A(\varphi) = \begin{cases} 1 - \frac{2}{\Delta^2}\varphi^2 & \text{if } |\varphi| \leq \frac{\Delta}{2} \\ \frac{2}{\Delta^2}\varphi^2 - \frac{4}{\Delta}|\varphi| + 2 & \text{if } \frac{\Delta}{2} \leq |\varphi| \leq \Delta \\ 0 & \text{else} \end{cases} , \qquad \varphi \in [-\pi, \pi) ,   (20)

then it is easily checked (the shifted pieces A(\varphi - k\Delta) sum to 1) that M_\psi(\omega) = G_s(\rho), \rho = \|\omega\|. In our experiments we took s = \frac{1}{2}\sigma^2 large, which allowed us to use the approximate reconstruction (13), but this gave similar results to a fast reconstruction by integration over the angles only:
\tilde{f}(x) = \sum_{k=0}^{N-1} U_f(x, e^{ik\Delta}) ,   (21)

see Figure 8. In [8], p.134, we applied elementary processing on the orientation score of the retinal image in Figure 8 and compared it with a standard line detection technique.
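A quick numerical sanity check of the angular profile as we read it here (the function names and the sampling density are ours): the rotated pieces of (20) should form a partition of unity over the circle, which is exactly what makes M_\psi(\omega) = G_s(\rho) with the square-root construction above.

```python
import numpy as np

N = 18                      # number of discrete orientations
delta = 2 * np.pi / N       # angular width Delta

def A(phi):
    """Angular profile of Eq. (20) (our reconstruction): a two-piece
    quadratic bump supported on [-Delta, Delta]."""
    phi = np.mod(np.asarray(phi, dtype=float) + np.pi, 2 * np.pi) - np.pi
    a = np.abs(phi)
    out = np.zeros_like(a)
    inner = a <= delta / 2
    outer = (a > delta / 2) & (a <= delta)
    out[inner] = 1 - 2 * a[inner] ** 2 / delta ** 2
    out[outer] = 2 * a[outer] ** 2 / delta ** 2 - 4 * a[outer] / delta + 2
    return out

phi = np.linspace(-np.pi, np.pi, 720, endpoint=False)
total = sum(A(phi - k * delta) for k in range(N))
# `total` should be identically 1: the "cake" is exactly filled.
```

Since at most two rotated pieces overlap at any angle, the check also confirms that the pieces are smooth across the seams at |\varphi| = \Delta/2.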
5 Image Enhancement via Left Invariant Operations on Orientation Scores
Now that we have constructed an orientation score Uf from image f , such that it allows a
well-posed reconstruction of f from Uf , one can think of suitable operations on the orientation
scores.
Let \psi be a proper wavelet; then there exists a 1-to-1 correspondence between bounded operators \Phi \in \mathcal{B}(\mathbb{C}_K^G) on orientation scores and bounded operators \Upsilon \in \mathcal{B}(L_2^\varrho(\mathbb{R}^d)) on band-limited images:

\Upsilon[f] = (\mathcal{W}_\psi^\varrho)^* [\Phi[\mathcal{W}_\psi^\varrho[f]]] , \qquad f \in L_2^\varrho(\mathbb{R}^d) .   (22)

This allows us to relate operations on orientation scores to operations on images in a robust manner. For proper wavelets, we have that the space of orientation scores \mathbb{C}_K^G can be considered as a linear subspace of L_2(G).
Let \Phi : L_2(G) \to L_2(G) be some bounded operator on L_2(G); then the range of the restriction of this operator to the subspace \mathbb{C}_K^G of orientation scores need not be contained in \mathbb{C}_K^G, i.e. \Phi(U_f) need not be the orientation score of an image. The adjoint of the mapping \mathcal{W}_\psi^\varrho : L_2^\varrho(\mathbb{R}^d) \to L_2(G), given by \mathcal{W}_\psi^\varrho[f] = \mathcal{W}_\psi[f], f \in L_2^\varrho(\mathbb{R}^d), is given by13

(\mathcal{W}_\psi^\varrho)^*(V) = \int_G \mathcal{U}_g \psi \, V(g) \, d\mu_G(g) , \qquad V \in L_2(G) .

The operator \mathbb{P}_\psi = \mathcal{W}_\psi^\varrho (\mathcal{W}_\psi^\varrho)^* is the orthogonal projection onto the space \mathbb{C}_K^G of orientation scores. This projection can be used to decompose the manipulated orientation score:

\Phi(U_f) = \mathbb{P}_\psi(\Phi(U_f)) + (I - \mathbb{P}_\psi)(\Phi(U_f)) .
Definition 5 An operator \Phi : L_2(G) \to L_2(G) is left invariant iff

\Phi \circ \mathcal{L}_g = \mathcal{L}_g \circ \Phi , \qquad \text{for all } g \in G ,   (23)

where the left regular action \mathcal{L}_g (also known as shift-twist transformation, cf. [35]) of g \in G onto L_2(G) is given by

(\mathcal{L}_g U)(h) = U(g^{-1} h) = U(R_\theta^{-1}(b' - b), e^{i(\theta' - \theta)}) ,   (24)

with g = (b, e^{i\theta}) \in G, h = (b', e^{i\theta'}) \in G.

Theorem 6 Let \Phi be a bounded operator on \mathbb{C}_K^G. Then the unique corresponding operator \Upsilon on L_2^\varrho(\mathbb{R}^d), which is given by \Upsilon[f] = (\mathcal{W}_\psi^\varrho)^* \Phi \mathcal{W}_\psi^\varrho[f], is Euclidean invariant, i.e. \Upsilon \mathcal{U}_g = \mathcal{U}_g \Upsilon for all g \in G, if and only if \Phi is left invariant, i.e. \Phi \mathcal{L}_g = \mathcal{L}_g \Phi for all g \in G.
Here we omit the proof. It follows from \mathcal{W}_\psi^\varrho \mathcal{U}_g = \mathcal{L}_g \mathcal{W}_\psi^\varrho, for all g \in G, cf. [9], p.38.
Practical Consequence: Euclidean invariance of \Upsilon is of great practical importance, since the result should not be essentially different if the original image is rotated or translated. So by Theorem 6 the only reasonable operations on orientation scores are left invariant ones. It is not a problem when the mapping \Phi : \mathbb{C}_K^G \to L_2(G) maps an orientation score to an element in L_2(G) \setminus \mathbb{C}_K^G, but be aware that \mathbb{P}_\psi \circ \Phi : \mathbb{C}_K^G \to \mathbb{C}_K^G yields the same result.14
All linear left invariant kernel operators \Phi : L_2(G) \to L_2(G) are G-convolution operators. They are given by

[\Phi(U)](g) = \int_G K(h^{-1} g)\, U(h) \, d\mu_G(h) = \int_{\mathbb{R}^2} \int_0^{2\pi} K(R_{\theta'}^{-1}(b - b'), e^{i(\theta - \theta')})\, U(b', e^{i\theta'}) \, d\theta' \, db'_1 \, db'_2 ,   (25)

with g = (b, e^{i\theta}),

13 Note that the approximative reconstruction can be written \tilde{f} = (\mathcal{W}_\psi^\varrho)^* \mathcal{W}_\psi[f], which approximates the original image f = (\mathcal{W}_\psi)^* \mathcal{W}_\psi[f].
14 One can always compute the angle between \Phi(U_f) and \mathbb{C}_K^G to see how effective the operation is (in most of our applications this angle was small).
for almost every g = (b, e^{i\theta}) \in G. From the practical point of view (speed) these can be implemented via the impulse response and then taking the G-convolution. Before we propose left invariant operators on orientation scores, we give a brief overview of the interesting geometry within the domain G of orientation scores, which is the Euclidean motion group.
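As an illustrative sketch (not the authors' implementation), a discrete G-convolution in the spirit of (25) can be written down for four orientations, so that every spatial rotation is an exact 90-degree rotation; the periodic spatial convolutions are done with the FFT, and the orientation axis is handled according to the group product:

```python
import numpy as np

def se2_convolution(K, U):
    """Discrete G-convolution in the spirit of Eq. (25), for n_theta = 4
    orientations (0, 90, 180, 270 degrees), so every spatial rotation
    R_{theta'} is exact (np.rot90):

        (K *_G U)(b, l) = sum_k sum_{b'} K(R_k^{-1}(b - b'), l - k) U(b', k).

    Spatial shifts are periodic; K and U have shape (4, n, n)."""
    n_theta = U.shape[0]
    assert n_theta == 4 and K.shape == U.shape
    out = np.zeros(U.shape, dtype=complex)
    for l in range(n_theta):                 # output orientation theta
        for k in range(n_theta):             # summation over theta'
            # kernel slice at relative angle l - k, spatially rotated by k*90 deg
            Krot = np.rot90(K[(l - k) % n_theta], k=k)
            out[l] += np.fft.ifft2(np.fft.fft2(Krot) * np.fft.fft2(U[k]))
    return out.real      # real inputs give a real G-convolution
```

Because the spatial convolutions are circular, the operator commutes exactly with discrete spatial shifts, a finite analogue of the left invariance discussed above; the sign/orientation conventions of `np.rot90` would need adjusting for any particular sampling convention.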
For any Lie group G the tangent space T_e(G) at the unity element, equipped with the product

[A, B] = \lim_{t \downarrow 0} \frac{a(t)\, b(t)\, (a(t))^{-1}\, (b(t))^{-1} - e}{t^2} ,

where t \mapsto a(t) resp. t \mapsto b(t) are any smooth curves in G with a(0) = b(0) = e and a'(0) = A and b'(0) = B, is isomorphic to \mathcal{L}(G). \mathcal{L}(G) is the Lie algebra of left invariant vector fields on G, i.e. all vector fields \tilde{A} on G such that

\tilde{A}_g f = \tilde{A}_e (f \circ L_g) = \tilde{A}_e (h \mapsto f(g h)) ,

equipped with the product [\tilde{A}, \tilde{B}] = \tilde{A}\tilde{B} - \tilde{B}\tilde{A}. The isomorphism is given by A \leftrightarrow \tilde{A} \Leftrightarrow \tilde{A}_g(\phi) = A(\phi \circ L_g) = A(h \mapsto \phi(g h)) for all smooth \phi : G \supset \Omega_g \to \mathbb{R} and all g, h \in G. In our case of the Euclidean motion group we have that T_e(G) is spanned by \{A_1 = e_\theta, A_2 = e_\xi, A_3 = e_\eta\} with

\xi = b_1 \cos\theta + b_2 \sin\theta , \qquad e_\xi = \cos\theta\, e_{b_1} + \sin\theta\, e_{b_2}   (26)

(\xi in the spatial plane along the measured orientation) and

\eta = -b_1 \sin\theta + b_2 \cos\theta , \qquad e_\eta = -\sin\theta\, e_{b_1} + \cos\theta\, e_{b_2}   (27)

(\eta in the spatial plane orthogonal to the measured orientation). The corresponding left (or shift-twist) invariant vector fields are given by

\{\tilde{A}_1 = \partial_\theta , \; \tilde{A}_2 = \partial_\xi = \cos\theta\, \partial_{b_1} + \sin\theta\, \partial_{b_2} , \; \tilde{A}_3 = \partial_\eta = -\sin\theta\, \partial_{b_1} + \cos\theta\, \partial_{b_2}\} .   (28)
It is easily verified that

A_3 = [A_1, A_2] , \quad -A_2 = [A_1, A_3] , \quad 0 = [A_2, A_3] , \qquad \text{and} \qquad \tilde{A}_3 = [\tilde{A}_1, \tilde{A}_2] , \quad -\tilde{A}_2 = [\tilde{A}_1, \tilde{A}_3] , \quad 0 = [\tilde{A}_2, \tilde{A}_3] ,

which coincides with the isomorphism A \leftrightarrow \tilde{A}. For dimensional consistency define X_1 = \frac{Z}{2\pi} A_1, X_2 = A_2 and X_3 = A_3, where Z is the width of the image domain (so b_1, b_2 \in [0, Z], assuming a square image domain). A group element g = (b, e^{i\theta}) can then be parameterized using either the so-called coordinates of the first kind \{\alpha_i\}_{i=1,2,3}, see [8] section 7.6, p.228-229, or the coordinates of the second kind \{\beta_i\}_{i=1,2,3}:

g = (b, e^{i\theta}) = \exp\Big(\sum_{i=1}^{3} \alpha_i X_i\Big) = \Big( \tfrac{Z}{2\pi\alpha_1}\big(\alpha_2 \sin\tfrac{2\pi\alpha_1}{Z} + \alpha_3(\cos\tfrac{2\pi\alpha_1}{Z} - 1)\big) , \; \tfrac{Z}{2\pi\alpha_1}\big(\alpha_2(1 - \cos\tfrac{2\pi\alpha_1}{Z}) + \alpha_3 \sin\tfrac{2\pi\alpha_1}{Z}\big) , \; e^{i\frac{2\pi\alpha_1}{Z}} \Big) ,

g = (b, e^{i\theta}) = \prod_{i=1}^{3} \exp(\beta_i X_i) = \Big( \beta_2 \cos\tfrac{2\pi\beta_1}{Z} - \beta_3 \sin\tfrac{2\pi\beta_1}{Z} , \; \beta_2 \sin\tfrac{2\pi\beta_1}{Z} + \beta_3 \cos\tfrac{2\pi\beta_1}{Z} , \; e^{i\frac{2\pi\beta_1}{Z}} \Big) .   (29)

The coordinates of the second kind correspond to (e^{i\theta}, \xi, \eta), since by (29) we have (\tfrac{2\pi\beta_1}{Z}, \beta_2, \beta_3) = (\theta, R_\theta^{-1} b) = (\theta, \xi, \eta).
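The closed form for the coordinates of the first kind in (29) can be checked numerically; the sketch below (under our conventions, taking Z = 2\pi so that X_1 = A_1) integrates the left-invariant flow and compares it with the closed form:

```python
import numpy as np

def exp_first_kind(a1, a2, a3, n_steps=20000):
    """Numerically integrate the left-invariant flow
        theta'(t) = a1,  b'(t) = R_{theta(t)} (a2, a3)
    from the identity over t in [0, 1]; this is exp(a1 A1 + a2 A2 + a3 A3)
    (with Z = 2*pi, so that X1 = A1)."""
    b = np.zeros(2)
    dt = 1.0 / n_steps
    for i in range(n_steps):
        t = (i + 0.5) * dt                  # midpoint rule
        th = a1 * t
        b += dt * np.array([a2 * np.cos(th) - a3 * np.sin(th),
                            a2 * np.sin(th) + a3 * np.cos(th)])
    return b, a1

def exp_closed_form(a1, a2, a3):
    """Closed form of Eq. (29), coordinates of the first kind, Z = 2*pi."""
    b1 = (a2 * np.sin(a1) + a3 * (np.cos(a1) - 1.0)) / a1
    b2 = (a2 * (1.0 - np.cos(a1)) + a3 * np.sin(a1)) / a1
    return np.array([b1, b2]), a1
```

The exponential curves are indeed circular arcs in the spatial plane traversed at a constant angular rate, as described in Section 5.2 below for the convection case.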
Figure 9: Illustration of image processing via an elementary operation on the orientation score. Visual illusion: normalization (see (30)) of the orientation layers in the orientation score reveals the most contrasting lines in the triangles.
Figure 10: Illustration of image processing via an elementary operation on the orientation score. From left to right: 1: noisy medical image with guidewire; 2-3: small oriented wavelet \psi with corresponding processed image \tilde{f} = \mathcal{W}_\psi^* \tilde{U}_f, with \tilde{U}_f = \Big( \frac{\Re(U_f) - \min \Re(U_f)}{\max\{\Re(U_f) - \min(\Re(U_f))\}} \Big)^2; 4-5: same as 2-3 with a relatively larger kernel \frac{1}{\alpha}\psi_N^0(\frac{x}{\alpha}), where we recall that \psi_N^0 was given by (18). For the sake of clarity the wavelet plots (2, 4) are zoomed in with a factor of 2.
5.1 Basic Left Invariant Operations on Orientation Scores
In image analysis it is well known that differential operators used for corner/line/edge/blob detection must be Euclidean invariant. Mostly, such differential invariants are easily expressed in a local coordinate system (gauge coordinates) where in 2D one coordinate axis (say v) is along the isophote and the other runs along the gradient direction (say w), cf. [17].
Rather than putting these gauge coordinates along isophotes we propose a local coordinate system along the measured orientation. Note to this end that in some medical image applications the elongated structures are not along isophotes. So in our orientation scores \xi and \eta play the role of v and w. Moreover, we can differentiate along the \theta direction and obtain directional frequencies.
Besides these local left invariant operators we can think of more global left invariant operators, like the normalization

[\Phi(U_f)](b, e^{i\theta}) = U_f(b, e^{i\theta}) \Big/ \Big( \int_{\mathbb{R}^2} |U_f(x, e^{i\theta})|^p \, dx \Big)^{1/p} , \qquad p > 1 ,   (30)
cf. Figure 9, or grey-value transformations

\Phi(U_f) = (U_f - \min_{g}\{U_f(g)\})^q , \qquad q > 0 ,   (31)

to enhance elongated structures.
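On a sampled score U (array layout `(n_theta, n1, n2)` is our assumption), both global operations are one-liners; a minimal sketch:

```python
import numpy as np

def normalize_layers(U, p=2, eps=1e-12):
    """Eq. (30): divide every fixed-orientation layer U(., theta) by its
    spatial L^p norm, so each layer gets unit L^p norm."""
    norms = (np.abs(U) ** p).sum(axis=(1, 2)) ** (1.0 / p)
    return U / (norms[:, None, None] + eps)

def power_enhance(U, q=2.0):
    """Eq. (31): subtract the global minimum and raise to the power q,
    which emphasises the strongest (elongated) responses."""
    return (U - U.min()) ** q
```

Both maps act pointwise or per layer and therefore commute with the left regular action on the sampled grid, in line with Theorem 6.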
5.2 Evolution Equations Corresponding to Left Invariant Stochastic Processes on the Euclidean Motion Group
Just like the well-known Gaussian scale space satisfies the translation and rotation invariance axioms, [10], the following linear evolutions on orientation scores are left invariant:

\begin{cases} \partial_s W = A\, W , \\ \lim_{s \downarrow 0} W(\cdot, s) = U_f(\cdot) , \end{cases}   (32)

where the generator A, acting on L_2(G), is given by (the closure of)

A = \frac{a_1}{Z}\,\partial_\theta + a_2\,\partial_\xi + a_3\,\partial_\eta + \frac{D_{11}}{Z^2}\,\partial_\theta^2 + D_{22}\,\partial_\xi^2 + D_{33}\,\partial_\eta^2 , \qquad a_i, D_{ii} \in \mathbb{C} , \; i = 1, \ldots, 3 .   (33)
The first order derivatives take care of transport (convection) and the second order derivatives give diffusion. We first consider the case where all D_{ii}'s are zero and the initial condition is a spike bundle \delta_{\theta_0, b_0} (i.e. one "oriented particle", so to speak). This spike will move over time along exponential curves, which are straight lines in the spatial plane, spirals through G, and straight lines along the \theta-direction. By introducing the variables t = s\,\frac{a_1}{Z}, \beta_2 = \frac{a_2}{a_1} Z, \beta_3 = \frac{a_3}{a_1} Z, equations (32)-(33) reduce to

\begin{cases} \partial_t W = [\partial_\theta + \beta_2 \partial_\xi + \beta_3 \partial_\eta]\, W , \\ \lim_{t \downarrow 0} W(\cdot, t) = \delta_{\theta_0, b_0} , \end{cases} \qquad \beta_2, \beta_3 \in \mathbb{C} .

Notice that indeed [s] = 1 \leftrightarrow [t] = 1 and [a_1] = [a_2] = [a_3] = [\text{length}] \leftrightarrow [\beta_2] = [\beta_3] = [\text{length}].
It follows by equality (29) that the orbit of the Dirac distribution at initial position (b_0, e^{i\theta_0}) is given by

t \mapsto \big( b_0^1 + \beta_2(\sin(t + \theta_0) - \sin\theta_0) + \beta_3(\cos(t + \theta_0) - \cos\theta_0) , \; b_0^2 - \beta_2(\cos(t + \theta_0) - \cos\theta_0) + \beta_3(\sin(t + \theta_0) - \sin\theta_0) , \; e^{i(t + \theta_0)} \big) ,

which is a circular spiral with radius \sqrt{\beta_2^2 + \beta_3^2} around the central point (b_0^1 - \beta_2 \sin\theta_0 - \beta_3 \cos\theta_0 , \; b_0^2 + \beta_2 \cos\theta_0 - \beta_3 \sin\theta_0), which exactly corresponds to the results from our numerical implementation. The solution of the pure diffusion problem, i.e. a_1 = a_2 = a_3 = 0 in (33), is a G-convolution kernel operator with some positive kernel K_s \in L_1(G), which can be sharply estimated from above and below by Gaussian kernels on G, viz.

c'\,(V(t))^{-1/2}\, e^{-b'\frac{|g|^2}{t}} \;\leq\; K_s(g) \;\leq\; c\,(V(t))^{-1/2}\, e^{-b\frac{|g|^2}{t}} ,

with c, c', b, b' > 0. For details see [12]. In the degenerate case a_1 = a_2 = a_3 = D_{11} = 0, the diffusion boils down to an ordinary spatial convolution for each fixed \theta with an anisotropic Gaussian kernel, where \frac{D_{22}}{D_{33}} gives the anisotropy factor of the Gaussian convolution along e_\xi and e_\eta.
The evolution equations given by (32) correspond to stochastic processes. For example, the case a_1 = a_3 = 0, D_{11} = \frac{1}{2}\sigma^2 and D_{22} = D_{33} = 0 is the forward Kolmogorov equation corresponding to the stochastic process known as the so-called direction process15, cf. Mumford's

15 In many later works Mumford's final Fokker-Planck equation, which is physically correct as long as \sigma^2 is the variance in the average curvature \kappa, is often mis-formulated in the literature, introducing dimensional inconsistencies; for example in [35] and [4], where \sigma^2/2 must be \sigma^2.
work [28]:

\begin{cases} \bar\kappa = \frac{1}{L}\int_0^L \kappa(s) \, ds = \frac{1}{L}\int_0^L |\dot\theta(s)| \, ds \sim \mathcal{N}(0, \sigma^2) , \\ x(s) = \int_0^s \begin{pmatrix} \cos\theta(\tau) \\ \sin\theta(\tau) \end{pmatrix} d\tau + x(0) , \\ L \sim NE(\lambda) , \end{cases}

which is the limit of the following discrete stochastic process:

\begin{cases} \theta(s_i + \Delta s) = \theta(s_i) + \sqrt{\Delta s}\; \varepsilon , \qquad \varepsilon \sim \mathcal{N}(0, \sigma_N^2) , \\ x(s_i + \Delta s) = x(s_i) + \Delta s \begin{pmatrix} \cos\theta(s_i) \\ \sin\theta(s_i) \end{pmatrix} , \end{cases} \qquad \Delta s = \frac{L}{N} , \text{ with } L \sim NE(\lambda) .
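The discrete process above is straightforward to simulate; the following sketch (the parameter values are ours) draws one sample path:

```python
import numpy as np

def sample_direction_process(sigma=0.2, lam=0.05, n=1000, rng=None):
    """Sample one path of the discrete direction process:
        theta_{i+1} = theta_i + sqrt(ds) * eps,   eps ~ N(0, sigma^2),
        x_{i+1}     = x_i + ds * (cos theta_i, sin theta_i),
    with total length L ~ NE(lam) and ds = L / n."""
    rng = np.random.default_rng() if rng is None else rng
    L = rng.exponential(1.0 / lam)       # lifetime/length, E[L] = 1/lam
    ds = L / n
    theta = 0.0
    x = np.zeros((n + 1, 2))
    for i in range(n):
        x[i + 1] = x[i] + ds * np.array([np.cos(theta), np.sin(theta)])
        theta += np.sqrt(ds) * rng.normal(0.0, sigma)
    return x, L
```

Since the particle travels at unit speed, the arc length of every sampled path equals its drawn lifetime L, reflecting the identification T = L used below.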
Just like in scale space theory16, where a scale space representation u(x, s) = G_s * f can be regarded as an isotropic stochastic process in which the distribution of positions of particles evolves over time (the well-known Wiener process), evolutions on orientation scores can be considered as stochastic processes where the distribution of positions of oriented particles evolves over time. The life time T of a particle travelling with unit speed (so T = L) through G is assumed to be negatively exponentially distributed, T \sim NE(\lambda), i.e. p(T = t) = \lambda e^{-\lambda t}, with expected life time E(T) = \frac{1}{\lambda}, because this distribution is memoryless, which must be the case in a Markov process. The probability density of finding an e^{i\theta}-oriented particle at position b is given by

p(b, \theta) = \int_0^\infty p(b, \theta \mid T = t)\, p(T = t) \, dt = \lambda \int_0^\infty [e^{tA} U_f](b, \theta)\, e^{-\lambda t} \, dt = -\lambda\, [(A - \lambda I)^{-1} U_f](b, \theta) .
Consider two independent stochastic processes, one generated by A = \text{Conv} + \text{Diff}, where Conv resp. Diff stands for the convection resp. diffusion part of A given by (33), and one generated by its adjoint A^* = -\text{Conv} + \text{Diff}. So the direction of shooting particles is opposite, while the stochastic behavior is similar in the two processes. The probability density of collision of particles from these two processes yields the following left invariant operation, see Figure 11:

(\Phi[U_f])(g) = [(A - \lambda I)^{-1} U_f](g)\; [(A^* - \lambda I)^{-1} U_f](g) .   (34)
For detailed numerical algorithms and analytic approximations17 (by deriving exact Green's functions of a nilpotent group of Heisenberg type) of the Green's functions of the involved left invariant evolution equations we refer to [8], section 4.9, p.163-177, and [34].
6 Invertible Orientation Scores of 3D images
We generalized all of our results on 2D image processing via left invariant operations on invertible orientation scores of 2D images to the more complicated case of 3D image processing via left invariant operations on invertible orientation scores of 3D images. In this section we

16 In a scale space representation u(x, s) = (G_s * f)(x) the evolution parameter s = \frac{1}{2}\sigma^2 inherits the physical dimension [length]^2 from the generator \Delta of the corresponding evolution equation u_s = \Delta u, and thereby in image analysis s > 0 is usually considered as the scale of observation of an image f. Scale can be related to time via a diffusion constant D: Dt = s = \frac{1}{2}\sigma^2.
17 In [31] it is claimed that the authors found the analytic solution of the direction process. Their claim and derivations are incorrect. This is easily verified by substitution of their solution into the partial differential equation. We did find the analytic solution in terms of elliptic functions of a special kind, the existence of which was conjectured by Mumford, cf. [28], p.497. This will be the main topic of a future publication.
Figure 11: Example of perceptual organization. From left to right: 1. original image; 2. detection of elongated structures via orientation scores: \tilde{f} = \mathcal{W}_\psi^*[\tilde{U}_f] with \tilde{U}_f = \Big( \frac{\Re(U_f) - \min \Re(U_f)}{\max\{\Re(U_f) - \min(\Re(U_f))\}} \Big)^2; 3. inverse transformation of the evolved orientation score \mathcal{W}_\psi^*[\Upsilon[\tilde{U}_f]], where \Upsilon denotes the shooting process maintaining curvature and direction; 4. inverse transformation of the probability density of collision of the forward and backward process on the orientation score, see (34). In contrast to related work [35] we do not put sources and sinks by hand, but use our orientation scores instead. The only parameters involved are the range of the wavelet, the decay time \lambda, and the stochastic process parameters D_{11}, D_{22}, D_{33} in (33).
will not deal with all technical details, but restrict ourselves to the definition of a 3D orientation score and some preliminary results on line detection in 3D by means of left invariant operations on invertible orientation scores.
Although some generalizations are straightforward, some difficulties arise that did not arise in the 2D case. First of all, SO(3) is not commutative, so the SO(3)-irreducible representations are not one-dimensional. Secondly, in practice one is mainly interested in constructing orientation scores by "cigar-shaped" wavelets, i.e. wavelets that are invariant under the stabilizer of the north pole, which brings us to the 2-sphere S^2 = \frac{SO(3)}{SO(2)}. Thirdly, it is not obvious which discrete subgroup of SO(3) to take, and thereby the question arises how to store the orientation score, since an equidistant sampling in spherical coordinates does not make sense.
Let f \in L_2^\varrho(\mathbb{R}^3) be a bandlimited 3D image; then we define its wavelet transform \mathcal{W}_\psi[f] \in \mathbb{C}_K^G by

\mathcal{W}_\psi[f](g) = \int_{\mathbb{R}^3} \psi(R^{-1}(x - b))\, f(x)\, dx , \qquad g = (b, R) \in G = \mathbb{R}^3 \rtimes SO(3) .

We restrict ourselves to the case where the wavelet \psi is invariant under the stabilizer of the north pole e_z, which is the subgroup of SO(3) consisting of all rotations around the z-axis. So we assume

\psi(R x) = \psi(x) , \qquad \text{for all } R \in \mathrm{Stab}(e_z) .   (35)
On SO(3) we define the following equivalence relation:

R_1 \sim R_2 \;\Leftrightarrow\; (R_2)^{-1} R_1 \in \mathrm{Stab}(e_z) \equiv SO(2) .

The equivalence classes are the left cosets [R] = R\,\mathrm{Stab}(e_z), R \in SO(3). The partition of all equivalence classes will be denoted by SO(3)/SO(2), which is isomorphic to S^2 and thereby not a group. Rather than using the canonical parameterization of SO(3), given by

R_{a,\psi}(x) = (\cos\psi)\, x + (1 - \cos\psi)(a, x)\, a + \sin\psi\, (a \times x) , \qquad x, a \in \mathbb{R}^3 , \; \psi \in [0, 2\pi) ,   (36)

we will use the well-known Euler angle parameterization Y : B_{0,2\pi} \to SO(3):

Y(x) = R_{e_z,\gamma}\, R_{e_y,\beta}\, R_{e_z,\alpha} , \qquad x = \alpha\, (\cos\gamma \sin\beta, \sin\gamma \sin\beta, \cos\beta)^T ,   (37)
Figure 12: Density plots through the XOZ-plane and YOZ-plane of the 3D equivalent (for details see [9]) of the 2D wavelet obtained in the simple-approach framework, see (20). A joint contour plot of the iso-surfaces of this 3D equivalent at \psi(x) = +0.02 and \psi(x) = -0.02 shows that it is rather a surface patch detector than a line detector.
which gives us directly an explicit isomorphism between S^2 and \frac{SO(3)}{SO(2)}:

S^2 \ni n(\beta, \gamma) = (\cos\gamma \sin\beta, \sin\gamma \sin\beta, \cos\beta)^T \;\leftrightarrow\; [R_{e_z,\gamma}\, R_{e_y,\beta}] \in \frac{SO(3)}{SO(2)} .

Because of our assumption (35) we can define the orientation score U_f : \mathbb{R}^3 \times S^2 \to \mathbb{C} corresponding to the image f \in L_2(\mathbb{R}^3) by means of

U_f(b, n(\beta, \gamma)) = \mathcal{W}_\psi[f](b, R_{e_z,\gamma}\, R_{e_y,\beta}) .
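The correspondence n(\beta, \gamma) \leftrightarrow [R_{e_z,\gamma} R_{e_y,\beta}] can be made concrete with explicit rotation matrices (a small sketch; the function names are ours):

```python
import numpy as np

def Rz(a):
    """Rotation about the z-axis by angle a."""
    return np.array([[np.cos(a), -np.sin(a), 0.0],
                     [np.sin(a),  np.cos(a), 0.0],
                     [0.0, 0.0, 1.0]])

def Ry(b):
    """Rotation about the y-axis by angle b."""
    return np.array([[ np.cos(b), 0.0, np.sin(b)],
                     [0.0, 1.0, 0.0],
                     [-np.sin(b), 0.0, np.cos(b)]])

def n_sphere(beta, gamma):
    """n(beta, gamma) = (cos(gamma) sin(beta), sin(gamma) sin(beta), cos(beta))."""
    return np.array([np.cos(gamma) * np.sin(beta),
                     np.sin(gamma) * np.sin(beta),
                     np.cos(beta)])

# R_{e_z,gamma} R_{e_y,beta} maps the north pole e_z to n(beta, gamma), and
# right-multiplying by any R_{e_z,alpha} in Stab(e_z) does not change this,
# which is why the orientation score only depends on the coset.
```

This is exactly the well-definedness condition behind (35): two rotations in the same left coset produce the same column of the 3D orientation score.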
We can again expand the wavelet in eigenfunctions of the harmonic oscillator (which are invariant under rotations around the z-axis, i.e. m = 0), see Appendix A, and thereby we obtain (for details see [8], par. 4.7.2, p.146-151):

\psi(x) = \sum_{n=0}^{\infty} \sum_{l=0}^{n} \alpha_l^n\, g_n^l(r)\, Y_l^0(\beta, \gamma) , \qquad \text{where } \alpha_l^n = \alpha_{0l}^n ,

\psi((R_{e_z,\gamma} R_{e_y,\beta})^{-1} x) = \sum_{n=0}^{\infty} \sum_{l=0}^{n} \sum_{m'=-l}^{l} \alpha_l^n\, \big[D^{(l)}(R_{e_z,\gamma} R_{e_y,\beta})\big]_{0m'}\, g_n^l(r)\, Y_l^{m'}(\beta, \gamma) ,

\mathcal{F}[\psi](\omega) = \sum_{n=0}^{\infty} \sum_{l=0}^{n} \alpha_l^n\, (-1)^{n+l}\, (-i)^l\, g_n^l(\rho)\, Y_l^0(\vartheta, \varphi) ,

M_\psi(\omega) = \sum_{l=0}^{\infty} \sum_{n=0}^{\infty} \sum_{\tilde{n}=0}^{\infty} (-1)^{n+\tilde{n}}\, \alpha_l^n\, \overline{\alpha_l^{\tilde{n}}}\, g_n^l(\rho)\, g_{\tilde{n}}^l(\rho) , \qquad \text{for all } \omega \in \mathbb{R}^3 ,

where

\big[D^{(l)}(R_{e_z,\gamma} R_{e_y,\beta})\big]_{0m'} = (-1)^{m'} \sqrt{\frac{4\pi}{2l+1}}\; \overline{Y_l^{m'}(\beta, \gamma)} .

Completely analogous to the 2D case, the choice \alpha_l^n = \delta_{nl} establishes the 3D equivalent of the wavelet (18), which is again a proper wavelet with M_\psi(\omega) = M_N(\rho^2), N \in \mathbb{N}, \rho = \|\omega\|, see Figure 13.
7 Channel Representations
Orientation scores are a nice tool for well-posed image enhancement and stochastic completion
of \gaps" by means of left-invariant processing on the orientation scores. However, for some
practical applications where detection of oriented structures is required it is more relevant
to obtain a fast robust estimation for a single orientation per position, rather than image
Figure 13: The 3D equivalent of the 2D proper wavelet given by (18) is also a proper wavelet and is given by \psi_{N,3D}^0(x) = \sum_{l=0}^{N} \frac{1}{\sqrt{l!}}\, r^l\, e^{-r^2/2}\, Y_l^0(\beta, \gamma), where Y_l^0 is the well-known surface harmonic Y_l^0(\beta, \gamma) = \sqrt{\frac{2l+1}{4\pi}}\, P_l(\cos\beta). From left to right: plots of \psi_{N,3D}^0(x, 0, z) for N = 10, 20, 40, and G_s * \psi_{N=40,3D}^0(x, 0, z) for a tiny scale s, and finally a joint 3D plot of the iso-intensity contours (at levels -0.5 and +0.5) of the rotated 3D kernel \psi((R_{e_z,\gamma} R_{e_x,\beta})^{-1} x), at angles \frac{\pi}{4} and \frac{\pi}{4}.
Figure 14: Illustration of robust line detection in 3D images via orientation scores. The 3D images (64 x 64 x 64) are illustrated by 3 different 2D cuts (along the xy-plane, at z = 2, 12, 22). First column: original image of a straight line and 2 circular spirals, parameterized by respectively (10 + 0.2t, 20 + 0.2t, t), (32 + 10\cos\frac{2\pi t}{128}, 32 + 10\sin\frac{2\pi t}{128}, t) and (20 + 12\cos\frac{2\pi t}{80}, 20 + 12\sin\frac{2\pi t}{80}, t). In the second column we added other geometrical structures, some spots and a cube. In the third column we obtained f_1 by adding strongly correlated, Gaussian distributed noise to the grey-values. 4th and 5th column: two layers U_{f_1}(\cdot, \tilde{c}_i) of the orientation score \mathcal{W}_\psi[f_1], with the wavelet \psi_{N=16}^{3D} illustrated in Figure 13, with \tilde{c}_2 \approx (-0.19, 0.58, 0.79) and \tilde{c}_7 \approx (-0.30, 0.93, 0.19). 6th column: the approximative reconstruction \tilde{f}_1, and last column: the enhanced/processed image after simple power enhancement (see (31)) in the orientation score. We did not use any thresholding of the grey-values (which is by definition an ill-posed operation). These experiments show that we can enhance lines (and distinguish them from other geometrical structures such as spots and planes) by extremely simple operations on the 3D scores in a robust manner. For an application of this technique in medical imaging (detection of the Adamkiewicz vessel, responsible for the blood supply to the spinal cord), we refer to [29].
enhancement via orientation scores, where the columns U_f(x, \cdot) represent a distribution of the local orientation per position x.
A simple way to get a first orientation estimate \hat\theta(x) would be by storing the angle where the response is maximal within each orientation column of the score. To each orientation estimate \hat\theta(x) we attach a measure of reliability of the estimate, r(x). This can be obtained by computing the value of the maximum relative to the other response values in the orientation column U_f(x, \cdot). This gives a rough estimate of the local orientation (with its corresponding confidence). For example, if the score is constructed by a second order derivative of the 2D Gaussian kernel with respect to y (so oriented along x), this boils down to an orientation estimate by means of the angle of the eigenvector v_1(x) of the Hessian matrix (consisting of Gaussian derivatives) with smallest absolute eigenvalue, so \hat\theta(x) = \angle v_1(x), and we can attach the confidence r(x) = \frac{|\lambda_2(x)| - |\lambda_1(x)|}{|\lambda_1(x)| + |\lambda_2(x)|}. Typically, such rough orientation estimates are noisy and unreliable, like the example in the middle image of the bottom row of Figure 17. Therefore, the main goal in this section is to obtain a more robust orientation estimate from the rough orientation estimate x \mapsto \hat\theta(x) with confidence x \mapsto r(x). This brings us to the framework of channel representations, [21]. This approach has quite some analogy with the orientation score framework of the previous sections in the sense that
- they both provide a new object, which is an orientation decomposition of the image and which is a function on the Euclidean motion group \mathbb{R}^2 \rtimes \mathbb{T} (or rather \mathbb{R}^2 \rtimes \mathbb{T}_N), and therefore all theory and algorithms concerning left-invariant operations on orientation scores may as well be applied to (the smoothing of) channel representations,
- they both pose the requirement of exact invertibility, and
- they both admit an operational scheme for reconstruction from the orientation decomposition.
Nevertheless, we stress that the starting point with respect to the invertibility of the two frameworks is different: in the orientation score framework the reconstructable input is the original image f : \mathbb{R}^2 \to \mathbb{R}, whereas the reconstructable input in the framework of channel representations is a first rough orientation estimate \hat\theta : \mathbb{R}^2 \to \mathbb{T}, obtained18 from the original image f : \mathbb{R}^2 \to \mathbb{R}, rather than the original image itself.
Next we explain the method of robust orientation estimation by channel representations in more detail. Let L > 0; then the channel representation u : [0, L] \times \mathbb{R}^+ \to \mathbb{R}^L is an encoding of a signal value f(x) \in [0, L], obtained from a modular19 signal f : \mathbb{R}^2 \to \mathbb{R} at position x, and an associated confidence r(x) > 0, given by

u(f(x), r(x)) = r(x)\, (B^0(f(x)), \ldots, B^{L-1}(f(x)))^T ,

where B^n : [0, L] \to \mathbb{R}^+, n = 0, \ldots, L-1, are continuous functions such that the mapping (y, r) \mapsto u(y, r) is injective. The vector u(f(x), r(x)) is usually called the channel vector (at position x \in \mathbb{R}^2) and its components are called channels. The injectivity assumption allows us to decode (f(x), r(x)) from the channel vector u(f(x), r(x)).
Here we only consider the case f(x) = \frac{L \hat\theta(x)}{2\pi}, where \hat\theta(x) is a first rough orientation estimate (for example by means of a gradient, Riesz transform, or Hessian of the original image). Thereby we have f(x) \in [0, L] for all x \in \mathbb{R}^2, and the interval [0, L) will be equipped with the group product f(x) + g(x) = (f(x) + g(x)) \bmod L and the modular distance

d_L(f(x), g(x)) = \min(\mathrm{mod}(f(x) - g(x), L), \, \mathrm{mod}(g(x) - f(x), L)) ,
18 For example obtained by an orientation score.
19 With modular we mean periodic in its co-domain. For channel representations of non-modular signals, see [19].
which makes [0, L) isomorphic to the circle group \mathbb{T}. Moreover, we will only consider the special case

B^n(f(x)) = B^0(d_L(f(x), n)) ,

i.e. the basis functions B^n are obtained by a modular shift over n from the smooth basis function B^0. The support of the symmetric function B^0 will be a connected bounded interval [-\frac{W}{2}, \frac{W}{2}], where W is called the kernel width. As a result, the support of B^n equals [n - \frac{W}{2}, n + \frac{W}{2}]. Consequently, we have a nice, practical, localized smooth basis. Further, we notice that the number N_A of active (non-zero) channels in the channel representation is limited, as it equals 1 + 2\lfloor\frac{W}{2}\rfloor. This is a major practical advantage over, for example, the discrete Fourier basis, where all discrete frequencies are needed to represent a discrete \delta-spike.
Example: B-spline channels
The B-spline channel representation, [16], with parameter K \in \mathbb{N} is obtained by setting

B^0(f(x)) = B_K^0(f(x)) = (\mathrm{rect}^{*(K)} * \mathrm{rect})(f(x)) ,

where we used K-fold periodic convolution (so that the channel width equals K + 1) and where \mathrm{rect}(y) = 1 if |y| < \frac{1}{2}, \mathrm{rect}(y) = \frac{1}{2} if |y| = \frac{1}{2}, and \mathrm{rect}(y) = 0 elsewhere. The decoding is linear and is given by

(f(x), r(x)) = \Big( \frac{1}{r(x)} \sum_{n=0}^{L-1} n\, u_K^n(f(x), r(x)) , \; \sum_{n=0}^{L-1} u_K^n(f(x), r(x)) \Big) ,   (38)

where u_K^n(f(x), r(x)) = r(x)\, B_K^0(f(x) - n). It is not difficult to give a formal proof of (38):
1. The first part of (38) can easily be proved by induction. The case K = 1 is trivial; moreover, if we assume that it holds for K - 1, then we have

\sum_{n=0}^{L-1} n\, B_K^n(y) = \sum_{n=0}^{L-1} n\, B_K^0(y - n) = \sum_{n=0}^{L-1} n \int B_0^0(u)\, B_{K-1}^0(y - n - u)\, du = \int B_0^0(u) \sum_{n=0}^{L-1} n\, B_{K-1}^0(y - n - u)\, du = \int B_0^0(u)\,(y - u)\, du = 1 \cdot y + 0 = y .
2. With respect to the second argument we notice that

\sum_{n=0}^{L-1} B_0^n(y) = 1 \text{ for all } y \in [0, L] \;\Rightarrow\; \sum_{n=0}^{L-1} B_K^n(y) = 1 \text{ for all } y \in [0, L] ,

from which we deduce that \sum_{n=0}^{L-1} u^n(y, r) = r \sum_{n=0}^{L-1} B_K^n(y) = r.
In practice one uses decoding after manipulation, and then one would like to use a decoding window (of size, say, 2p + 1) and obtain the following fast decoding estimate:

\tilde{f}(x) = \frac{\sum_{n=n_0-p}^{n_0+p} n\, u^n(f(x), r(x))}{\sum_{n=n_0-p}^{n_0+p} u^n(f(x), r(x))} .   (39)
Figure 15: Illustration of robust orientation estimation by means of channel smoothing: the orientation signal is channel encoded, the channels are smoothed spatially with an averaging kernel, and the smoothed channels are decoded to a smoothed orientation estimate.
In the experiments discussed in subsection 7.1 we set p = 1 and K = 2, i.e. we used quadratic B-splines with a decoding window size of 3, where the decoding estimate (39) is given by

\tilde{f}(x) = \frac{u^{n_0+1}(f(x), r(x)) - u^{n_0-1}(f(x), r(x))}{u^{n_0+1}(f(x), r(x)) + u^{n_0}(f(x), r(x)) + u^{n_0-1}(f(x), r(x))} + n_0 ;   (40)

the window center n_0 can be chosen in various (somewhat ad hoc) ways. For example, n_0 could be chosen such that u^{n_0+1}(f(x), r(x)) + u^{n_0}(f(x), r(x)) + u^{n_0-1}(f(x), r(x)) is maximal.
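For quadratic B-spline channels (K = 2) the windowed decoding (40) reproduces the encoded value exactly; a self-contained sketch (modular, with L = 10 channels; the names are ours):

```python
import numpy as np

L = 10  # number of channels

def B2(t):
    """Quadratic B-spline rect*rect*rect, support [-1.5, 1.5]."""
    t = np.abs(np.asarray(t, dtype=float))
    return np.where(t <= 0.5, 0.75 - t**2,
           np.where(t <= 1.5, 0.5 * (1.5 - t)**2, 0.0))

def encode(y, r=1.0):
    """Channel vector u^n = r * B2(d_L(y, n)), n = 0..L-1 (modular)."""
    n = np.arange(L)
    d = np.minimum((y - n) % L, (n - y) % L)     # modular distance
    return r * B2(d)

def decode(u):
    """Windowed decoding (40), with window centre n0 at the strongest
    three-channel sum (one of the ad hoc choices mentioned above)."""
    n = np.arange(L)
    triple = u + u[(n + 1) % L] + u[(n - 1) % L]
    n0 = int(np.argmax(triple))
    num = u[(n0 + 1) % L] - u[(n0 - 1) % L]
    den = u[(n0 + 1) % L] + u[n0] + u[(n0 - 1) % L]
    return (n0 + num / den) % L
```

With y = n0 + d, |d| <= 1/2, the numerator of (40) equals d and the denominator equals r, so the round trip is exact, including across the modular wrap-around.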
Example: Cosine square channels
The cosine square channel representation, cf. [18], with parameter \omega > 0 is obtained by setting

B^0(f(x)) = C_\omega^0(f(x)) := \begin{cases} \cos^2(\omega f(x)) & \text{if } \omega |f(x)| \leq \frac{\pi}{2} , \\ 0 & \text{else.} \end{cases}   (41)

Note that \frac{\pi}{\omega} equals the channel width.
The decoding algorithm is more complicated in this case than in the B-spline case. It is non-linear and involves L_2-approximation for \omega > \frac{\pi}{N}, but it is possible to give an exact decoding for the case \omega = \frac{\pi}{N} (i.e. channel width W = N), N \geq 3:

f(x) = l + \frac{N}{2\pi} \arg\Big( \sum_{n=l}^{l+N-1} u_{\omega=\pi/N}^n(f(x), r(x))\, e^{\frac{2\pi i (n - l)}{N}} \Big) ,   (42)

where l is the index of the first channel in the decoding interval; see [19], p.17-19, for more details.
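For \omega = \pi/N every channel is (modularly) active, and the arg-based decoding (42) — in our reading of the garbled printed formula, taking l = 0 — is exact; a sketch with N = 9:

```python
import numpy as np

def encode_cos2(f, N=9, r=1.0):
    """Cosine-square channels (41) with omega = pi/N. With this width every
    channel is active, and since cos^2 is pi-periodic we may evaluate it
    directly on f - n instead of the modular distance."""
    n = np.arange(N)
    return r * np.cos(np.pi * (f - n) / N) ** 2

def decode_cos2(u, N=9):
    """Exact decoding (42) with l = 0:
        f = (N / (2*pi)) * arg( sum_n u^n * exp(2*pi*i*n/N) )   (mod N).
    One checks that the sum equals (N*r/4) * exp(2*pi*i*f/N)."""
    z = np.sum(u * np.exp(2j * np.pi * np.arange(N) / N))
    return (N / (2 * np.pi)) * np.angle(z) % N
```

Note that the confidence r cancels inside the argument, so the decoded position is independent of r, as it should be.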
7.1 Channel Smoothing for Robust Orientation Estimation
In this section we consider robust orientation estimation by means of channel smoothing. The idea is to construct a channel representation, to smooth the channel representation, and to decode the smoothed channels to obtain an orientation estimate, see Figure 15.
If the decoding scheme f(x) = \mathrm{dec}[u(f(x), r(x))] is linear, we have that smoothing of f (by means of a continuous convolution kernel K : \mathbb{R}^2 \to \mathbb{R}^+ with \int_{\mathbb{R}^2} K(x)\, dx = 1) leads to the same result as decoding the smoothed channel representation \tilde{u}(f(x), r(x)) = (K * u(f(\cdot), r(\cdot)))(x). To this end we notice that

(K * f)(x) = \mathrm{dec}[\tilde{u}(f(x), r(x))] \;\Leftrightarrow\; \int_{\mathbb{R}^d} f(y)\, K(x - y)\, dy = \int_{\mathbb{R}^d} \mathrm{dec}[u(f(y), r(y))]\, K(x - y)\, dy = \mathrm{dec}\Big[\int_{\mathbb{R}^d} u(f(y), r(y))\, K(x - y)\, dy\Big] ,   (43)
for all x \in \mathbb{R}^d. The windowed decoding in the B-spline case and the windowed decoding in the cosine square case (42) (or similarly, for general \omega, in [19], p.18) are non-linear (even in the case r(x) = 1). As a result, in general, the left hand side and right hand side of (43) do not coincide. In fact, they only coincide if the orientation measurement x \mapsto f(x) is such that the decoding becomes linear. For example, in the B-spline case it requires that f(x) is such that the active part of the smoothed channels is a subset of the decoding window, since then we have \tilde{u}^{n_0-1}(f) + \tilde{u}^{n_0}(f) + \tilde{u}^{n_0+1}(f) = 1, with the consequence that (40) and (38) coincide. This will only be the case if the image is locally 1D at x. At discontinuities the channels will also be smoothed, but there the contributions of the smoothing fall outside the decoding window, i.e. the smoothing contributions lie in the null-space of the decoding operation. So, effectively, the channel smoothing does not smooth over discontinuities, which is desirable for a robust orientation estimation.
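This edge-preserving behavior is easy to reproduce numerically; the following sketch (our toy signal and parameters) smooths the channels of a piecewise-constant orientation signal and decodes — away from the jump the estimate is untouched, and the jump itself is not blurred:

```python
import numpy as np

L = 10  # number of channels

def B2(t):
    """Quadratic B-spline, support [-1.5, 1.5]."""
    t = np.abs(np.asarray(t, dtype=float))
    return np.where(t <= 0.5, 0.75 - t**2,
           np.where(t <= 1.5, 0.5 * (1.5 - t)**2, 0.0))

def encode(y):
    n = np.arange(L)
    d = np.minimum((y - n) % L, (n - y) % L)
    return B2(d)

def decode(u):
    n = np.arange(L)
    triple = u + u[(n + 1) % L] + u[(n - 1) % L]
    n0 = int(np.argmax(triple))
    den = u[(n0 + 1) % L] + u[n0] + u[(n0 - 1) % L]
    return (n0 + (u[(n0 + 1) % L] - u[(n0 - 1) % L]) / den) % L

# Piecewise-constant orientation signal with a jump (a discontinuity):
sig = np.where(np.arange(100) < 50, 2.0, 7.0)
U = np.stack([encode(y) for y in sig])               # shape (100, L)

# Smooth every channel spatially with a box filter of width 9:
kernel = np.ones(9) / 9.0
Us = np.stack([np.convolve(U[:, c], kernel, mode="same")
               for c in range(L)], axis=1)

fs = np.array([decode(u) for u in Us])
# The two modes occupy disjoint channel triples, so near the jump the
# decoding window simply picks the dominant mode instead of averaging.
```

In this example every decoded value is exactly 2.0 or 7.0: the minority mode's channels fall outside the chosen decoding window, which is precisely the null-space argument made above.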
In the cosine square channel case a similar observation can be made. Here we notice that
$$
\arg\Big[\sum_{k=0}^{N-1} \tilde{u}_{k+l}(f(x); r(x))\, e^{\frac{2\pi i k}{N}}\Big]
= \frac{1}{N}\arg\Big[\prod_{k=0}^{N-1} \tilde{u}_{k+l}\, e^{\frac{2\pi i k}{N}}\Big]
= \frac{1}{N}\sum_{k=0}^{N-1} \arg\big(\tilde{u}_{k+l}\, e^{\frac{2\pi i k}{N}}\big)
= \frac{\pi(N-1)}{N},
$$
which is the case$^{20}$ if $\tilde{u}(f(x); r(x))$ is symmetric (for the sake of simplicity we assume $N$ is odd):
$$
\tilde{u}_{l+\frac{N-1}{2}+j}(f(x); r(x)) = \tilde{u}_{l+\frac{N-1}{2}-j}(f(x); r(x)), \qquad j = 1, \ldots, \frac{N-1}{2},
$$
which again will be the case if the image is locally 1D at $x$. Moreover, we again have that
effectively the channel smoothing does not smooth over discontinuities.
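The symmetric-decoding claim can be checked numerically: for a channel vector that is symmetric around a center index, the argument of its first circular moment (a simplified version of the cosine-square decoding; the construction below is our own toy example) recovers exactly the center angle.

```python
import numpy as np

N = 9                                   # odd number of channels
c = 3                                   # center channel of a symmetric vector

# A channel vector that is symmetric around index c (cosine-square shaped,
# truncated to its positive lobe); entries far from c are zero.
k = np.arange(N)
d = np.minimum(np.abs(k - c), N - np.abs(k - c))   # circular distance to c
u = np.where(d <= N // 4, np.cos(2 * np.pi * d / N) ** 2, 0.0)

# Decoding: the argument of the first circular moment of the channel vector.
z = np.sum(u * np.exp(2j * np.pi * k / N))
angle = np.angle(z)

print(angle, 2 * np.pi * c / N)   # both equal 2*pi*c/N for a symmetric vector
```

The symmetric pairs $u_{c+j}, u_{c-j}$ contribute $2u_{c+j}\cos(2\pi j/N)\,e^{2\pi i c/N}$, a positive real multiple of $e^{2\pi i c/N}$, so the argument is unaffected by the smoothing as long as symmetry is preserved.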
7.1.1 Robust Estimation
In this section we draw some parallels between channel smoothing and non-parametric methods
in random variable estimation. Assume that $f : x \mapsto f(x)$ is a realization of a stochastic
process $P$ that is ergodic for all $x \in \Omega$. This implies that we can replace averaging over realizations
of $P$ by averaging $f$ over $f(\Omega) \subset [0, L]$. We denote the probability density function of $f$ by
pdf.
The orientation estimate due to decoding is robust in the sense that it minimizes the energy
$$
E(g) = \int_0^L \rho(f - g)\, \mathrm{pdf}(f)\, df = (\rho * \mathrm{pdf})(g).
$$
If we use B-spline channels, it follows by (40) that the influence function is given by
$$
\psi(f) = \rho'(f) = B_2(f - 1) - B_2(f + 1).
$$
This influence function is smooth, compactly supported on $[-2.5, 2.5]$, and locally linear
near the origin. The influence function tells us how much an outlier affects the orientation
estimate. To this end we notice that $\frac{\partial E}{\partial g}(f) = (\psi * \mathrm{pdf})(f)$. For example, if the outlier lies
outside the compact support $[-2.5, 2.5]$ of the influence function it does not affect the estimate at all,
and if the outlier equals $\pm 1$ it does maximal damage to the estimate. Clearly, this influence
function (see figure 16) is highly preferable over the linear influence function in the non-robust
case of a least-squares estimate $\rho(f) = f^2$.
$^{20}$ $\arg\big(e^{\frac{i\pi(N-1)}{N}}(1 + z + \bar{z})\big) = \arg\big(e^{\frac{i\pi(N-1)}{N}}\big) = \frac{\pi(N-1)}{N}$ for all $z \in \mathbb{C}$.
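The influence function can be evaluated directly. A short numpy sketch (the sign convention $\psi(f) = B_2(f-1) - B_2(f+1)$ is our reconstruction, chosen so that $\psi$ is odd with positive unit slope near the origin):

```python
import numpy as np

def B2(t):
    """Quadratic B-spline, support [-1.5, 1.5]."""
    t = np.abs(t)
    return np.where(t <= 0.5, 0.75 - t**2,
           np.where(t <= 1.5, 0.5 * (1.5 - t)**2, 0.0))

def psi(f):
    """Influence function psi = rho' for B2-spline channels
    (signs chosen so that psi is odd with positive slope at 0)."""
    return B2(f - 1.0) - B2(f + 1.0)

f = np.linspace(-3, 3, 13)
print(psi(f))
# psi vanishes outside [-2.5, 2.5], is exactly linear (psi(f) = f) on
# [-0.5, 0.5], peaks at f = +/-1 and redescends to 0: outliers beyond
# 2.5 do not move the estimate at all.
```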
Figure 16: The influence function $\psi$ in case of robust orientation estimation by means of channel smoothing on B-splines.
The channel vector $u(x)$ at position $x$ is related to the probability density function (pdf)
of the generating stochastic process $P$ at position $x$. By definition, a kernel density estimate
with kernel $B_0$ is given by $\tilde{p}_{f_n}(f) = \frac{1}{N}\sum_{n=1}^{N} B_0(f - f_n)$, where $\{f_n\}$ is a set of $N$ independent
samples of the stochastic variable $f$.
By the ergodicity assumption, averaging of the elements of the channel representation,
$\tilde{u}_n(f(x); 1) = \frac{1}{|\Omega|}\int_\Omega u_n(f(x); 1)\, dx$, is equivalent to sampling a kernel density estimate with
the symmetric positive kernel $B_0$. The expectation value of the kernel density estimate equals
$(B_0 * \mathrm{pdf})(f)$ and thereby the expectation of the channel averaging is
$$
\mathbb{E}\Big\{ \frac{1}{|\Omega|}\int_\Omega u_n(f(x); 1)\, dx \Big\} = (B_0 * \mathrm{pdf})(f)\big|_{f=n}.
\qquad (44)
$$
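The equivalence between channel averaging and sampling a kernel density estimate can be made concrete in a few lines (a toy sketch; the sample values are our own made-up stand-in for measurements on $\Omega$):

```python
import numpy as np

def B0(t):
    """Box kernel of width 1 (zeroth-order B-spline)."""
    return np.where(np.abs(t) < 0.5, 1.0, 0.0)

# A fixed set of "measurements" f_n, standing in for samples of the
# ergodic process P on a region Omega.
samples = np.array([2.1, 2.4, 1.9, 2.2, 5.8, 2.0, 2.3, 6.1, 2.2, 1.8])

# Averaging the n-th channel over the samples ...
channels = np.arange(8)
channel_avg = np.array([np.mean(B0(samples - n)) for n in channels])

# ... equals the kernel density estimate (1/N) sum_n B0(f - f_n),
# sampled at the channel centers f = n (B0 is symmetric).
kde_at_centers = np.array([np.mean(B0(n - samples)) for n in channels])

print(channel_avg)   # mode of the estimated pdf sits at channel 2
```

The two computations coincide exactly because $B_0$ is symmetric, which is the content of (44) for a finite sample.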
7.1.2 Explicit Example of Channel Smoothing
Analogous to the orientation score framework, where we ensured Euclidean invariance of the
processing on the original image $x \mapsto f(x)$, we would like to ensure Euclidean invariant processing on the original orientation map $x \mapsto \hat{\theta}_f(x)$. This means that operations on the channel
representation should be left invariant. Therefore channel smoothing should be done by means
of a $G_L$-convolution, recall (12), the discrete version of the $G$-convolution given by (25):
$$
\tilde{u}_n(x) = (\tilde{K} *_{G_L} u)(x, e^{i\theta_n})
= \sum_{k=0}^{L-1} \int_{\mathbb{R}^2} \tilde{K}\big(R_{\theta_k}^{-1}(x - b'),\, e^{i(\theta_n - \theta_k)}\big)\, U(b', e^{i\theta_k})\, db',
\qquad n = 0, \ldots, L-1,
$$
where $L$ denotes the total number of channels, $G_L = \mathbb{R}^2 \rtimes \mathbb{T}_L$, $U : G_L \to \mathbb{R}$ is
given by $U(x, e^{i\theta_n}) = u_n(x)$, $\tilde{K} : G_L \to \mathbb{R}$ represents the smoothing kernel, and
$\{u_n\}_{n=0}^{L-1}$ respectively $\{\tilde{u}_n\}_{n=0}^{L-1}$ represent the input and output channels. In particular, if $\tilde{K}$ is
singular with respect to the angle$^{21}$, that is $\tilde{K}(x, e^{i\theta_n}) = \delta_{n0}\, K(x)$, we obtain:
$$
\tilde{u}_n(x) = \big(K(R_{\theta_n}^{-1}\, \cdot)\, *_{\mathbb{R}^2}\, u_n\big)(x).
$$
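The singular-kernel case, where each channel is convolved with a rotated copy $K(R_{\theta_n}^{-1}\,\cdot)$ of one spatial kernel, can be sketched in numpy (a toy sketch; the anisotropic Gaussian shape, the grid size, and the FFT-based circular convolution are our own choices, not the paper's):

```python
import numpy as np

L = 8                                    # number of orientation channels
size = 33                                # kernel grid size (odd)
ax = np.arange(size) - size // 2
X, Y = np.meshgrid(ax, ax, indexing='xy')

def oriented_kernel(theta, s1=6.0, s2=1.5):
    """Anisotropic Gaussian K(R_theta^{-1} x): long axis along theta."""
    xr = X * np.cos(theta) + Y * np.sin(theta)    # coordinates rotated back
    yr = -X * np.sin(theta) + Y * np.cos(theta)
    K = np.exp(-xr**2 / (2 * s1**2) - yr**2 / (2 * s2**2))
    return K / K.sum()                            # normalize to unit mass

# One smoothing kernel per channel, each aligned with its channel direction.
kernels = [oriented_kernel(np.pi * n / L) for n in range(L)]

def smooth_channel(u_n, K):
    """Channel-wise smoothing via FFT (circular convolution on the grid)."""
    return np.real(np.fft.ifft2(np.fft.fft2(u_n)
                                * np.fft.fft2(np.fft.ifftshift(K))))

u0 = np.zeros((size, size))
u0[size // 2, size // 2] = 1.0            # an impulse in one channel
out = smooth_channel(u0, kernels[3])
# The impulse response reproduces the kernel: mass is preserved, and the
# smoothing applied to channel n is elongated along its direction theta_n.
```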
To obtain adaptive channel smoothing we quadratically fitted convolution kernels $K$ of
the type $k_{s,\omega}(x) = e^{-rs}\, C_\omega^0(\phi)$, $x = (r\cos\phi, r\sin\phi)$, where we recall that $C_\omega^0$ is given by (41),
to the auto-correlations
$$
A_n(x) = \mathcal{F}^{-1}\big[\boldsymbol{\omega} \mapsto |\mathcal{F}(u_n(\cdot,\, r = 1))(\boldsymbol{\omega})|^2\big](x)
$$
$^{21}$ One may also want to consider non-singular convolution kernels, to model interaction between the orientation channels.
Figure 17: Robust orientation estimation by means of adaptive channel smoothing. Top row, from
left to right: test pattern, $t$ = test pattern with 1 percent salt and pepper noise and Gaussian noise of
variance 0.01, ground truth of the orientation. Bottom row, from left to right: the white disk area inside
which we considered orientation estimation, the rough orientation estimate computed from the gradient,
and the result after channel smoothing of $\theta_f(x) = 2\arccos\frac{\mathbf{grad}\, t(x) \cdot \mathbf{e}}{\|\mathbf{grad}\, t(x)\|}$, where $\mathbf{e} \in \mathbb{R}^2$ is a fixed normalized
direction.
of the channel vectors $u_n(\cdot,\, r = 1)$. More precisely, we minimized$^{22}$
$$
\min_{s,\omega} \sum_{n=0}^{N-1} \int_{\mathbb{R}^2}
\Big| \frac{\|A_n\|_{L_1(\mathbb{R}^2)}}{\|k_{s,\omega}\|_{L_1(\mathbb{R}^2)}}\, k_{s,\omega}(x) - A_n(R_{\theta_n}^{-1} x) \Big|^2 w(x)\, dx,
$$
where $\|A_n\|_{L_1(\mathbb{R}^2)} = \int_{\mathbb{R}^2} |A_n(x)|\, dx$ and where $R_{\theta_n} \in SO(2)$ is a rotation such that the
direction of the $n$-th channel is mapped to the direction of the 0-th channel, so that $x \mapsto A_n(R_{\theta_n}^{-1} x)$ is aligned with $A_0(x)$. In the implementation we replaced the integral by a discrete
summation and used a steepest descent algorithm. We took $w(x) = \|x\|^{-1+\epsilon}$ ($1 \gg \epsilon > 0$, to
avoid singularity problems at the origin) to compensate for the Jacobian of the polar coordinate
transformation. Notice to this end that local orientation means locally 1D signals, so we wanted
to have a 1D subspace projection along the orientation.
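The fitting step can be sketched as a tiny weighted steepest descent (a toy sketch under strong simplifications we introduce ourselves: a single scalar decay parameter $s$, a synthetic 1D radial target in place of the measured auto-correlations $A_n$, and our own fixed step size):

```python
import numpy as np

# Discretized radial profiles: target a(r) (standing in for an
# auto-correlation profile) and the model m_s(r) = exp(-s*r).
r = np.linspace(0.1, 5.0, 50)
eps = 0.1
w = r ** (-1.0 + eps)          # weight compensating the polar Jacobian
a = np.exp(-0.7 * r)           # synthetic target with true decay s = 0.7

def energy_grad(s):
    """Weighted least-squares energy and its analytic derivative in s."""
    m = np.exp(-s * r)
    res = m - a
    E = np.mean(res**2 * w)
    dE = np.mean(2 * res * (-r * m) * w)
    return E, dE

s = 1.0                         # initial guess
for _ in range(2000):           # plain steepest descent
    _, dE = energy_grad(s)
    s -= 0.5 * dE

print(s)                        # converges to the true decay rate 0.7
```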
Some results are illustrated in figure 17. This method showed very good results compared to
other existing orientation estimation methods. For a detailed comparison to other orientation
estimation methods, we refer to earlier work [15], pp. 22-25.
7.1.3 The Number of Channels L and the Channel Width W
The shapes of the B-spline and the cosine square channel representations are quite similar if
the corresponding channel widths coincide, i.e. $\omega = K + 1$.
In practice there is not much difference between the two representations. It is rather the
number of channels $L$ and the channel width $W$ that makes the difference. Both quantities
lead to a trade-off.
$^{22}$ Although this minimization works fine, it does not result in the best possible noise suppression. For an optimal minimization (with the same type of kernel) with respect to the signal-to-noise ratio we refer to [16].
Orientation Scores
29
On the one hand the channel width $W$ should not be too narrow, since then the estimated
pdf becomes too spiky, and consequently finding the maxima of the pdf is ill-posed. On the
other hand $W$ should not be too broad, as this leads to interference of the local maxima.
On the one hand the number of channels $L$ should not be too large, as this requires many
samples of the pdf while only a limited number of samples of the measurement is available. On the
other hand $L$ should not be too small, as the accuracy of the maxima of the pdf improves
with the number of channels.
8 Conclusion
Given an image $f \in L_2(\mathbb{R}^d)$, we construct a local orientation score $U_f$, which is a complex-valued
function on the Euclidean motion group $G$. The corresponding transformation $f \mapsto U_f$ is a
wavelet transform constructed from an oriented wavelet $\psi$ and a representation of the Euclidean
motion group onto $L_2(\mathbb{R}^2)$. Since this representation $U$ is reducible, the well-known wavelet
reconstruction theorem, which guarantees a perfectly well-posed reconstruction, does not apply.
Therefore we generalized standard wavelet theory by means of reproducing kernel theory.
From this new generalization it follows that our wavelet transformation is a unitary mapping
between the space of bandlimited images, modelled by $L_2^\varrho(\mathbb{R}^2)$, and the functional Hilbert space
$C_G^K$ of orientation scores, which is explicitly characterized. The norm on $C_G^K$ explicitly depends
on the oriented wavelet $\psi$ via a function $M_\psi$, which thereby characterizes the stability of the
explicitly described inverse wavelet transformation. As a result, by proper choice of the wavelet
$\psi$, the image $f$ can be reconstructed from $U_f$ in a robust way. We developed and implemented
several approaches to obtain proper wavelets (which are also good line detectors in practice)
in the 2-dimensional case, $d = 2$. These results are generalized to (and implemented in) the 3-dimensional case, $d = 3$.
These proper wavelets give rise to a stable transformation from an image to a single scale
orientation score and vice versa, which allows us to relate operations on orientation scores
to operations on images in a robust way.
We showed that operations on orientation scores must be left invariant in order to obtain a Euclidean invariant transformation on images. As a result the only left-invariant kernel operators on the
orientation scores are $G$-convolutions. As an example we considered the probability of collision
of particles from two stochastic processes on the Euclidean motion group, which is used to
automatically detect elongated structures and to close gaps between them.
Finally, we focused on robust orientation estimation rather than image enhancement. For
this purpose we used the framework of (invertible) channel representations, which is closely
analogous to the framework of invertible orientation scores: a single orientation estimate
for each position is encoded into a channel vector. We obtained robust orientation estimates
via the decoding algorithm after channel smoothing (which is done by a discrete $G$-convolution
on the orientation channels). This is analogous to image processing via left-invariant stochastic
processes on invertible orientation scores, in the sense that the processing is done by left-invariant processing on the Euclidean motion group followed by an inverse transformation to
the input space. The difference, though, is that in the framework of invertible orientation
channels the input is given by the original orientation estimate $\hat{\theta}_f : \mathbb{R}^2 \to \mathbb{T}$ (obtained from the
image $f : \mathbb{R}^2 \to \mathbb{R}$), whereas in the framework of invertible orientation scores the input is the
original image $f$ itself.
A The Bochner-Hecke Theorem and the Spectral Decomposition of the Hankel Transform
Theorem 7 Let $H$ be a harmonic homogeneous polynomial of degree $m$ in $d$ variables. Let
$F$ be an element of $L_2((0, \infty),\, r^{d+2m-1}\, dr)$; then the Fourier transform of their direct product
$(r, x) \mapsto F(r)H(x)$, which is in $L_2(\mathbb{R}^d)$, is given by:
$$
(-i)^m H(\omega) \int_0^\infty (r\rho)^{-\left(\frac{d-2}{2}+m\right)} J_{\frac{d-2}{2}+m}(r\rho)\, F(r)\, r^{d+2m-1}\, dr
= (-i)^m H(\omega)\, \rho^{-\left(\frac{d-1}{2}+m\right)} \Big[\mathcal{H}_{\frac{d-2}{2}+m}\Big(r \mapsto r^{\frac{d-1}{2}+m} F(r)\Big)\Big](\rho),
$$
where $\rho = \|\omega\|$ and $\mathcal{H}_{\frac{d-2}{2}+m}$ is the Hankel transform given by:
$$
(\mathcal{H}_{\frac{d-2}{2}+m}\, \phi)(\rho) = \int_0^\infty (r\rho)^{1/2}\, \phi(r)\, J_{\frac{d-2}{2}+m}(r\rho)\, dr, \qquad \phi \in L_2((0, \infty)).
\qquad (45)
$$
The proof can be found in [14], pp. 24-25. The Hankel transform $\mathcal{H}_\nu$, $\nu = \frac{d-2}{2} + m$, is a unitary
map on $L_2((0, \infty), dr)$ and has a complete set of orthonormal eigenfunctions $\{E_n^\nu\}$ given by
$$
E_n^\nu(r) = \left(\frac{2\, n!}{\Gamma(n + \nu + 1)}\right)^{1/2} r^{\nu + \frac{1}{2}}\, e^{-\frac{r^2}{2}}\, L_n^{(\nu)}(r^2),
\qquad n = 0, 1, 2, \ldots,\; r > 0,
\qquad (46)
$$
where $L_n^{(\nu)}(r)$ is the $n$-th generalized Laguerre polynomial of type $\nu > -1$,
$$
L_n^{(\nu)}(r) = \frac{r^{-\nu} e^{r}}{n!} \left(\frac{d}{dr}\right)^{n} \big(e^{-r} r^{n+\nu}\big), \qquad r > 0,
$$
with corresponding eigenvalues $(-1)^n$: $\mathcal{H}_\nu\, \phi = \sum_{n=0}^{\infty} (-1)^n\, (E_n^\nu, \phi)_{L_2((0,\infty),dr)}\, E_n^\nu$. The functions
$E_n^\nu$ are also eigenfunctions of the operator $-\frac{d^2}{dr^2} + r^2 + \frac{\nu^2 - \frac{1}{4}}{r^2}$, with eigenvalue $4n + 2\nu + 2$,
cf. [13], p. 79, which coincides with the fact that the harmonic oscillator $-\Delta + \|x\|^2$ commutes with the
Fourier transform. These results give us:
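The eigenrelation $\mathcal{H}_\nu E_n^\nu = (-1)^n E_n^\nu$ can be checked numerically for $\nu = \frac{1}{2}$, where $J_{1/2}(x) = \sqrt{2/(\pi x)}\,\sin x$ reduces the Hankel transform to the Fourier sine transform, so no special functions are needed. A sketch (normalization constants are dropped, since the eigenrelation is scale invariant; the quadrature scheme is our own choice):

```python
import numpy as np

r = np.linspace(0.0, 12.0, 6001)        # radial grid for quadrature
dr = r[1] - r[0]

def sine_hankel(phi_vals, rho):
    """(H_{1/2} phi)(rho) = sqrt(2/pi) * int_0^inf sin(r*rho) phi(r) dr,
    since J_{1/2}(x) = sqrt(2/(pi*x)) sin(x).  Trapezoidal quadrature;
    the integrand decays like exp(-r^2/2), so truncating at r=12 is safe."""
    integrand = np.sin(r * rho) * phi_vals
    return np.sqrt(2 / np.pi) * dr * (integrand.sum()
                                      - 0.5 * (integrand[0] + integrand[-1]))

# Unnormalized eigenfunctions E_n^{1/2}(r) ~ r e^{-r^2/2} L_n^{(1/2)}(r^2):
phi0 = r * np.exp(-r**2 / 2)                   # n = 0, L_0^{(1/2)} = 1
phi1 = r * (1.5 - r**2) * np.exp(-r**2 / 2)    # n = 1, L_1^{(1/2)}(x) = 3/2 - x

for rho in [0.5, 1.0, 2.0, 3.0]:
    # Expect H phi0 = +phi0 (eigenvalue (-1)^0) and H phi1 = -phi1.
    print(sine_hankel(phi0, rho), rho * np.exp(-rho**2 / 2))
    print(sine_hankel(phi1, rho), -rho * (1.5 - rho**2) * np.exp(-rho**2 / 2))
```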
For $d = 2$, we have $L_2(\mathbb{R}^2) = L_2(S^1) \otimes L_2((0, \infty), r\, dr)$ and a Fourier invariant orthonormal base is given by $\{\mathcal{Y}_m \otimes h_n^m\}_{m \in \mathbb{Z},\, n \in \mathbb{N} \cup \{0\}}$, where
$$
h_n^m = r^{-1/2} E_n^{|m|}(r),
\qquad (47)
$$
and $\mathcal{Y}_m(\phi) = \frac{1}{\sqrt{2\pi}} e^{i m \phi}$. It now follows by the Bochner-Hecke Theorem that:
$$
\mathcal{F}(\mathcal{Y}_m \otimes h_n^m)
= \mathcal{F}\big(r^{|m|}\mathcal{Y}_m \otimes r^{-|m|} h_n^m\big)
= (-1)^n (-i)^{|m|}\, (\mathcal{Y}_m \otimes h_n^m)
= (-1)^{n+|m|}\, (i)^{|m|}\, (\mathcal{Y}_m \otimes h_n^m).
\qquad (48)
$$
For $d = 3$, we have $L_2(\mathbb{R}^3) = L_2(S^2) \otimes L_2((0, \infty), r^2\, dr)$. All homogeneous harmonic
polynomials are spanned by $\{x \mapsto r^l Y_l^m(\theta, \phi)\}_{l = 0, \ldots, \infty;\; m = -l, \ldots, l}$. Define
$$
g_n^l(r) = r^{-1} E_n^{l + \frac{1}{2}}(r), \qquad r > 0,
\qquad (49)
$$
then $g_n^l \in L_2((0, \infty), r^2\, dr)$ are eigenfunctions of $\phi \mapsto \frac{1}{r}\, \mathcal{H}_{l+\frac{1}{2}}\big(r' \mapsto r' \phi(r')\big)$ with corresponding eigenvalues $(-1)^n$. Therefore it follows by the Bochner-Hecke Theorem that
$$
\mathcal{F}(Y_l^m \otimes g_n^l) = (i)^l (-1)^{n+l}\, (Y_l^m \otimes g_n^l).
\qquad (50)
$$
References
[1] S.T. Ali, J.P. Antoine, and J.P. Gazeau. Coherent States, Wavelets and Their Generalizations. Springer Verlag, New York, Berlin, Heidelberg, 1999.
[2] J.P. Antoine. Directional wavelets revisited: Cauchy wavelets and symmetry detection in patterns. Applied and Computational Harmonic Analysis, 6:314-345, 1999.
[3] N. Aronszajn. Theory of reproducing kernels. Trans. A.M.S., 68:337-404, 1950.
[4] J. August and S.W. Zucker. The curve indicator random field and Markov processes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(4), 2003.
[5] W.H. Bosking, Y. Zhang, B. Schofield, and D. Fitzpatrick. Orientation selectivity and the arrangement of horizontal connections in tree shrew striate cortex. The Journal of Neuroscience, 17(6):2112-2127, March 1997.
[6] M. Duits. A functional Hilbert space approach to frame transforms and wavelet transforms. Master thesis, Applied Analysis group, Department of Mathematics and Computer Science, Eindhoven University of Technology, September 2004.
[7] M. Duits and R. Duits. A functional Hilbert space approach to the theory of wavelets. Technical report, TUE, Eindhoven, March 2004. RANA/CASA Report RANA-7-2004, available on the web: ftp://ftp.win.tue.nl/pub/rana/rana04-07.pdf, Department of Mathematics, Eindhoven University of Technology.
[8] R. Duits. Perceptual Organization in Image Analysis. PhD thesis, Eindhoven University of Technology, Department of Biomedical Engineering, The Netherlands, 2005. A digital version is available on the web: http://www.bmi2.bmt.tue.nl/ImageAnalysis/People/RDuits/THESISRDUITS.pdf.
[9] R. Duits, M. Duits, and M. van Almsick. Invertible orientation scores as an application of generalized wavelet theory. Technical Report 04-04, Biomedical Image and Analysis, Department of Biomedical Engineering, Eindhoven University of Technology, March 2004.
[10] R. Duits, L.M.J. Florack, J. de Graaf, and B. ter Haar Romeny. On the axioms of scale space theory. Journal of Mathematical Imaging and Vision, 20:267-298, May 2004.
[11] R. Duits, M. van Almsick, M. Duits, E. Franken, and L.M.J. Florack. Image processing via shift-twist invariant operations on orientation bundle functions. In Geppener, Gurevich, Niemann, Zhuralev, et al., editors, 7th International Conference on Pattern Recognition and Image Analysis: New Information Technologies, pages 193-196, St. Petersburg, October 2004. Extended version to appear in a special issue of the International Journal for Pattern Recognition and Image Analysis, MAIK.
[12] N. Dungey, A.F.M. ter Elst, and D.W. Robinson. Analysis on Lie Groups with Polynomial Growth, volume 214 of Progress in Mathematics. Birkhäuser, Boston, 2003.
[13] S.J.L. Eijndhoven and J. de Graaf. Some results on Hankel invariant distribution spaces. Proceedings of the Koninklijke Akademie van Wetenschappen, Series A, 86(1):77-87, 1982.
[14] J. Faraut and K. Harzallah. Deux cours d'analyse harmonique. Birkhäuser, Tunis, 1984.
[15] M. Felsberg, P.-E. Forssén, and H. Scharr. Efficient robust smoothing of low-level signal features. Technical Report LiTH-ISY-R-2619, SE-581 83 Linköping, Sweden, August 2004.
[16] M. Felsberg, P.-E. Forssén, and H. Scharr. Channel smoothing: Efficient robust smoothing of low-level signal features. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005. Accepted.
[17] L.M.J. Florack. Image Structure. Kluwer Academic Publishers, Dordrecht, The Netherlands, 1997.
[18] P.-E. Forssén and G.H. Granlund. Sparse feature maps in a scale hierarchy. In G. Sommer and Y.Y. Zeevi, editors, Proc. Int. Workshop on Algebraic Frames for the Perception-Action Cycle, volume 1888 of Lecture Notes in Computer Science, Kiel, Germany, September 2000. Springer, Heidelberg.
[19] P.-E. Forssén. Low and Medium Level Vision using Channel Representations. PhD thesis, Linköping University, Dept. EE, Linköping, Sweden, March 2004.
[20] M. van Ginkel. Image Analysis using Orientation Space based on Steerable Filters. PhD thesis, Delft University of Technology, Delft, The Netherlands, October 2002.
[21] G.H. Granlund. An associative perception-action structure using a localized space variant information representation. In Proceedings of Algebraic Frames for the Perception-Action Cycle (AFPAC), Kiel, Germany, September 2000. Also as Technical Report LiTH-ISY-R-2255.
[22] A. Grossmann, J. Morlet, and T. Paul. Integral transforms associated to square integrable representations. J. Math. Phys., 26:2473-2479, 1985.
[23] C.J. Isham and J.R. Klauder. Coherent states for n-dimensional Euclidean groups E(n) and their application. Journal of Mathematical Physics, 32(3):607-620, March 1991.
[24] S.N. Kalitzin, B.M. ter Haar Romeny, and M.A. Viergever. Invertible apertured orientation filters in image analysis. International Journal of Computer Vision, 31(2/3):145-158, April 1999.
[25] T.S. Lee. Image representation using 2D Gabor wavelets. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(10):959-971, 1996.
[26] A.K. Louis, P. Maass, and A. Rieder. Wavelets, Theory and Applications. Wiley, New York, 1997.
[27] F.J.L. Martens. Spaces of Analytic Functions on Inductive/Projective Limits of Hilbert Spaces. PhD thesis, University of Technology Eindhoven, Department of Mathematics and Computing Science, Eindhoven, The Netherlands, 1988. This PhD thesis is available on the webpages of the Technische Universiteit Eindhoven. Webpage in 2004: http://alexandria.tue.nl/extra3/proefschrift/PRF6A/8810117.pdf.
[28] D. Mumford. Elastica and computer vision. In Algebraic Geometry and Its Applications, pages 491-506. Springer-Verlag, 1994.
[29] R.W. van der Put. Methods for 3D orientation analysis and their application to the study of arterial remodelling. Master's thesis, Department of Biomedical Engineering, Eindhoven University of Technology, June 2005. Technical Report BMIA-0502.
[30] M. Sugiura. Unitary Representations and Harmonic Analysis. North-Holland Mathematical Library, 44, Amsterdam; Kodansha, Tokyo, second edition, 1990.
[31] K.K. Thornber and L.R. Williams. Analytic solution of stochastic completion fields. Biological Cybernetics, 75:141-151, 1996.
[32] D.Y. Ts'o, R.D. Frostig, E.E. Lieke, and A. Grinvald. Functional organization of primate visual cortex revealed by high resolution optical imaging. Science, 249:417-420, 1990.
[33] S. Twareque Ali. A general theorem on square-integrability: Vector coherent states. Journal of Mathematical Physics, 39(8), 1998.
[34] M.A. van Almsick, R. Duits, E. Franken, and B.M. ter Haar Romeny. From stochastic completion fields to tensor voting. In Proceedings DSSCC-workshop on Deep Structure Singularities and Computer Vision, Maastricht, The Netherlands, June 9-10, 2005. Springer-Verlag.
[35] L.R. Williams and J.W. Zweck. A rotation and translation invariant saliency network. Biological Cybernetics, 88:2-10, 2003.