# TOTAL VARIATION MINIMIZATION WITH AN $H^{-1}$ CONSTRAINT
CAROLA-BIBIANE SCHÖNLIEB
Abstract. This paper is concerned with the numerical minimization of total variation functionals with an $H^{-1}$ constraint. We present an algorithm for its solution which is based on the dual formulation of the total variation and show its application in several areas of image processing.
1. Introduction and Motivations
Let $\Omega \subset \mathbb{R}^2$ be a bounded and open domain with Lipschitz boundary. For a given function $g \in L^2(\Omega)$ we are interested in the numerical realization of the following minimization problem
$$\min_{u \in BV(\Omega)} J(u) = |Du|(\Omega) + \frac{1}{2\lambda}\|Tu - g\|_{-1}^2, \tag{1.1}$$
where $T \in \mathcal{L}(L^2(\Omega))$ is a bounded linear operator and $\lambda > 0$ is a tuning parameter. The functional $|Du|(\Omega)$ is the total variation of $u$ and $\|\cdot\|_{-1}$ is the norm in $H^{-1}(\Omega)$, the dual of $H_0^1(\Omega)$.
Problem (1.1) is a model for minimizing the total variation of a function $u$ which obeys an $H^{-1}$ constraint, i.e., $\|Tu - g\|_{-1}$ is small, for a given function $g \in L^2(\Omega)$. In the terminology of inverse problems this means that from an observed datum $g$ one wants to determine the original function $u$, about which we know a priori that $Tu = g$ and that $u$ has some regularity properties modeled by the total variation and the $H^{-1}$ norm. Minimization problems like this have important applications in a wide range of image processing tasks. We give an overview of these in the following subsections.
The main interest of this paper is the numerical solution of (1.1). Chambolle proposed an algorithm to solve total variation minimization with an $L^2$ constraint and $T = \mathrm{Id}$. This approach was extended to more general operators $T$ in a subsequent work. A generalization to norms other than $L^2$, including the $H^{-1}$ norm for the case $T = \mathrm{Id}$, was proposed in [5, 6]. In the following we report the generalization of Chambolle's algorithm to the case of an $H^{-1}$ norm in the problem. Moreover, we present strategies to extend the use of this algorithm from problems with $T = \mathrm{Id}$ to problems (1.1) with an arbitrary linear operator $T$. In addition to the theory of this algorithm we present applications of it in image processing, in particular for image denoising, image decomposition and inpainting. Finally we show how to speed up the numerical computations by considering a domain decomposition approach for our problem.
For now, let us start our discussion with some motivations for considering TV-$H^{-1}$ minimization.
1.1. Total Variation Minimization in Image Processing. In a wide range of image processing
tasks one encounters the situation that the observed image g is corrupted, e.g., by noise or blur, or that
the features of interest in the image are hidden. Now the challenge is to recover the original image u,
i.e., the hidden image features, from the observed datum g. In mathematical terms this means that one
has to solve an inverse problem T u = g, where T models the process through which the image u went
before observation. In the case of an operator T with unbounded inverse, this problem is ill-posed. In
such cases one modifies the problem by introducing some additional a-priori information on u, usually
in terms of a regularizing term, e.g., the total variation of u.
2000 Mathematics Subject Classification. Primary 46N10; Secondary 46N60.
Key words and phrases. functional minimization, total variation, negative Hilbert-Sobolev space, image processing.
Many such problems in image processing are formulated as minimization problems. More precisely, let $\Omega \subset \mathbb{R}^2$ be an open and bounded domain with Lipschitz boundary, $B_1$, $B_2$ two Banach spaces over $\Omega$ and $g \in B_1$ be the given image. A general variational approach in image processing can then be written as
$$J(u) = R(u) + \frac{1}{2\lambda}\|Tu - g\|_{B_1}^2 \to \min_{u \in B_2}, \tag{1.2}$$
where $\lambda > 0$ is the tuning parameter of the problem and $T \in \mathcal{L}(B_1)$ is a bounded linear operator. The term $R : B_2 \to \mathbb{R}$ denotes the regularizing term, which smooths the image $u$ and represents some kind of a priori information about the minimizer $u$. The term $\|Tu - g\|_{B_1}^2$ is the so-called fidelity term of the approach, which forces the minimizer $u$ to stay close to the given image $g$ (how close depends on the size of $\lambda$). In the case of image denoising and decomposition the operator is $T = \mathrm{Id}$, the identity in $B_1$. For deblurring problems $T$ is a convolution with a symmetric kernel, e.g., a Gaussian, and for image inpainting $T$ denotes multiplication by the characteristic function of a subdomain of $\Omega$, cf. Section 1.3 for details. In general $B_2 \subset B_1$, signifying the smoothing effect of the regularizing term on the minimizer $u \in B_2$.
Under certain regularity assumptions on a minimizer $u$ of the functional $J$, the minimizer fulfills a so-called optimality condition of (1.2), i.e., the corresponding Euler-Lagrange equation. That is, for a minimizer $u$ the first variation, i.e., the Fréchet derivative, of $J$ has to be zero. In the case $B_1 = L^2(\Omega)$, in mathematical terms this reads
$$-\lambda \nabla R(u) + T^*(g - Tu) = 0 \quad \text{in } \Omega, \tag{1.3}$$
which is a partial differential equation with certain boundary conditions on $\partial\Omega$. Thereby $\nabla R$ denotes the Fréchet derivative of $R$ over $B_1 = L^2(\Omega)$. The evolutionary version of (1.3) is the so-called steepest-descent or gradient flow approach. More precisely, a minimizer $u$ of (1.2) is embedded in an evolution process. We denote it by $u(\cdot, t)$. At time $t = 0$, $u(\cdot, t = 0) = g \in B_1$ is the given image. It is then transformed through a process that can be written as
$$u_t = -\lambda \nabla R(u) + T^*(g - Tu) \quad \text{in } \Omega. \tag{1.4}$$
Given a variational formulation (1.2), the steepest-descent approach is used to numerically compute a minimizer of $J$. Thereby (1.4) is solved iteratively until one is close enough to a minimizer of $J$, i.e., until $u^{k+1} - u^k$ is sufficiently small for two subsequent iterates $u^k$ and $u^{k+1}$.
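The stopping rule just described can be illustrated with a minimal sketch. It is not the scheme used in this paper: to keep the example short it assumes a one-dimensional signal, $T = \mathrm{Id}$, and the quadratic regularizer $R(u) = \frac{1}{2}\sum_i |u_{i+1} - u_i|^2$ (so that $-\nabla R(u)$ is a discrete Laplacian) in place of the total variation. The names `laplacian_1d` and `steepest_descent`, the step size `dt` and the tolerance `tol` are illustrative choices.

```python
def laplacian_1d(u):
    """Second difference with reflecting (zero-flux) ends, grid spacing 1."""
    n = len(u)
    out = []
    for i in range(n):
        left = u[i - 1] if i > 0 else u[i]
        right = u[i + 1] if i < n - 1 else u[i]
        out.append(left - 2.0 * u[i] + right)
    return out


def steepest_descent(g, lam=1.0, dt=0.2, tol=1e-10, max_iter=10000):
    """Explicit time stepping of u_t = lam * Laplace(u) + (g - u), i.e. (1.4)
    with T = Id and the assumed quadratic regularizer. The loop stops when two
    successive iterates differ by less than tol in the maximum norm."""
    u = list(g)  # u(., t = 0) = g
    steps = 0
    for _ in range(max_iter):
        lap = laplacian_1d(u)
        u_next = [u[i] + dt * (lam * lap[i] + g[i] - u[i]) for i in range(len(u))]
        change = max(abs(a - b) for a, b in zip(u_next, u))
        u = u_next
        steps += 1
        if change < tol:
            break
    return u, steps


g = [0.0, 0.1, -0.1, 0.0, 1.0, 0.9, 1.1, 1.0]
u, iterations = steepest_descent(g)
```

With these parameters the explicit step is stable because `dt * (4 * lam + 1) <= 2`; for a fourth-order flow the analogous restriction is far more severe, which is the difficulty discussed in Section 1.4.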
In other situations we encounter equations that do not come from variational principles, such as Cahn-Hilliard and TV-$H^{-1}$ inpainting. Then the image processing approach is directly given in the form of (1.4).
In what follows we shall concentrate on imaging approaches (1.2) which use the total variation as a regularizing term $R$, i.e., evolutionary approaches (1.4) where $\nabla R(u)$ is replaced by elements of the subdifferential of the total variation of $u$. We recall that for $u \in L^1_{loc}(\Omega)$
$$V(u, \Omega) := \sup\left\{ \int_\Omega u\, \nabla\cdot\varphi \, dx : \varphi \in \left[C_c^1(\Omega)\right]^2,\ \|\varphi\|_\infty \le 1 \right\}$$
is the variation of $u$ and that $u \in BV(\Omega)$ (the space of bounded variation functions, [2, 25]) if and only if $V(u, \Omega) < \infty$, see [2, Proposition 3.6]. In such a case, $|Du|(\Omega) = V(u, \Omega)$, where $|Du|(\Omega)$ is the total variation of the finite Radon measure $Du$, the derivative of $u$ in the sense of distributions. The subdifferential $\partial|Du|(\Omega)$ is defined as
$$\partial|Du|(\Omega) := \partial_{BV(\Omega)}|Du|(\Omega) := \left\{ u^* \in BV(\Omega)' : \langle u^*, v - u\rangle + |Du|(\Omega) \le |Dv|(\Omega) \ \ \forall v \in BV(\Omega) \right\},$$
where $BV(\Omega)'$ denotes the dual space of $BV(\Omega)$. It is obvious from this definition that $0 \in \partial|Du|(\Omega)$ if and only if $u$ is a minimizer of $|D\cdot|(\Omega)$.
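To make the variation concrete, the following sketch evaluates a discrete counterpart of $|Du|(\Omega) \approx \int_\Omega |\nabla u|\,dx$ on small test images. It assumes forward differences that are set to zero at the last row and column; the function name `total_variation` and the three test images are illustrative only.

```python
import math


def total_variation(u):
    """Isotropic discrete total variation of a 2D image given as a list of rows.
    Forward differences are used and set to zero at the last row and column."""
    rows, cols = len(u), len(u[0])
    tv = 0.0
    for i in range(rows):
        for j in range(cols):
            dx = u[i + 1][j] - u[i][j] if i < rows - 1 else 0.0
            dy = u[i][j + 1] - u[i][j] if j < cols - 1 else 0.0
            tv += math.hypot(dx, dy)
    return tv


# A constant image, a 2x2 square on a dark background, and a checkerboard.
flat = [[3.0] * 6 for _ in range(6)]
square = [[1.0 if 2 <= i <= 3 and 2 <= j <= 3 else 0.0 for j in range(6)]
          for i in range(6)]
checker = [[float((i + j) % 2) for j in range(6)] for i in range(6)]

tv_flat = total_variation(flat)
tv_square = total_variation(square)
tv_checker = total_variation(checker)
```

The constant image has zero variation, the square has a variation close to its perimeter, and the oscillating checkerboard has a much larger one, which is why total variation regularization tends to remove oscillations while keeping edges.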
The minimization of energies with total variation constraints, i.e.,
$$\min_{u \in BV(\Omega)} |Du|(\Omega) + \frac{1}{2\lambda}\|Tu - g\|_{H}^2, \tag{1.5}$$
for a given $g \in H$, where $H$ is a suitable Hilbert space, e.g., $H = L^2(\Omega)$, traces back to the first uses of such a functional model in noise removal in digital images as proposed by Rudin, Osher, and Fatemi. There the operator $T$ is just the identity. Extensions to more general operators $T$ and numerical methods for the minimization of the functional appeared later in several important contributions [15, 23, 4, 39, 14]. From these pioneering and very successful results, the scientific output related to total variation minimization and its applications in signal and image processing increased.
One of those outputs is TV-$H^{-1}$ minimization (1.1), i.e., (1.5) with $H = H^{-1}(\Omega)$. This approach has found growing interest in recent years due to several advantages compared to the $L^2$ constrained problem, cf. the following subsections. Thereby the norm in $H^{-1}(\Omega) := (H_0^1(\Omega))^*$, i.e., the dual of $H_0^1(\Omega)$, is defined by
$$\|f\|_{-1}^2 = \left\|\nabla\Delta^{-1}f\right\|_{L^2(\Omega)}^2 = \int_\Omega \left|\nabla\Delta^{-1}f\right|^2 dx.$$
The operator $\Delta^{-1}$ denotes the inverse of the negative Dirichlet Laplacian, i.e., $u = \Delta^{-1}f$ is the unique solution to
$$-\Delta u = f \ \text{ in } \Omega, \qquad u = 0 \ \text{ on } \partial\Omega. \tag{1.6}$$
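As an illustration of this definition, the sketch below evaluates a one-dimensional discrete analogue of $\|f\|_{-1}^2$: it solves the Dirichlet problem (1.6) on a grid with spacing 1 and then sums the squared differences of the solution. The one-dimensional setting, the function names and the two test signals are assumptions made for the example.

```python
def solve_dirichlet_1d(f):
    """Solve -(u[i+1] - 2 u[i] + u[i-1]) = f[i] with u = 0 at both ends
    (grid spacing 1) using the Thomas algorithm for tridiagonal systems."""
    n = len(f)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = -1.0 / 2.0
    d[0] = f[0] / 2.0
    for i in range(1, n):
        denom = 2.0 + c[i - 1]
        c[i] = -1.0 / denom
        d[i] = (f[i] + d[i - 1]) / denom
    u = [0.0] * n
    u[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        u[i] = d[i] - c[i] * u[i + 1]
    return u


def h_minus_one_norm_sq(f):
    """Discrete analogue of ||f||_{-1}^2 = int |grad(Delta^{-1} f)|^2 dx."""
    u = [0.0] + solve_dirichlet_1d(f) + [0.0]  # attach the boundary zeros
    return sum((u[i + 1] - u[i]) ** 2 for i in range(len(u) - 1))


# Two signals with the same squared L2 norm (equal to 8).
constant = [1.0] * 8
oscillating = [1.0, -1.0] * 4

norm_constant = h_minus_one_norm_sq(constant)        # 60
norm_oscillating = h_minus_one_norm_sq(oscillating)  # 20/9
```

Both signals have the same $L^2$ norm, yet the oscillating one is far smaller in the discrete $H^{-1}$ norm. This is the property that makes the $H^{-1}$ fidelity term attractive for textured or noisy data.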
Note that the existence and uniqueness of minimizers for (1.1) is guaranteed. In fact the existence of a unique solution of (1.1) follows as a special case from the following theorem.
Theorem 1.1. (Theorem 3.1 in the cited reference) Given $\Omega \subset \mathbb{R}^2$, open, bounded and connected, with Lipschitz boundary, $g \in H^{-s}(\mathbb{R}^2)$ ($s > 0$), $g = 0$ outside of $\bar\Omega$, $\lambda > 0$, and $T \in \mathcal{L}(L^2(\Omega))$ an injective continuous linear operator such that $T 1_\Omega \neq 0$, then the minimization problem
$$\min_{u \in BV(\Omega)} |Du|(\Omega) + \frac{1}{2\lambda}\|Tu - g\|_{-s}^2, \qquad s > 0,$$
where $\|\cdot\|_{-s}$ denotes the norm in $H^{-s}(\mathbb{R}^2)$, has a unique solution in $BV(\Omega)$.
Proof. The proof is a standard application of methods from variational calculus. The main ingredients in the proof, in order to guarantee compactness, are the Poincaré-Wirtinger inequality, which bounds the $L^2$ and the $L^1$ norm by the total variation, and the fact that $L^2(\Omega)$ can be embedded in $H^{-1}(\Omega)$.
In the subsequent two subsections we shall present two main applications of TV-$H^{-1}$ minimization (1.1): image denoising/decomposition, and image inpainting.
1.2. TV-$H^{-1}$ Minimization for Image Denoising and Image Decomposition. Taking $T = \mathrm{Id}$ in (1.2), we encounter two interesting areas in image processing, namely image denoising and image decomposition. Thereby, in many image denoising models, a given noisy image $g$ is decomposed into its piecewise smooth part $u$ and its oscillatory, noisy part $v$, i.e., $g = u + v$. Similarly, in image decomposition, the piecewise smooth part $u$ represents the structure/cartoon part of the image, and the oscillatory part $v$ the texture part of the image. We also call the latter task cartoon-texture decomposition. The most famous model within this range is the TV-$L^2$ denoising model proposed by Rudin, Osher and Fatemi
$$J(u) = |Du|(\Omega) + \frac{1}{2\lambda}\|u - g\|_{L^2(\Omega)}^2 \to \min_{u \in BV(\Omega)}. \tag{1.7}$$
This model produces very good results for removing noise and preserving edges in structure images, meaning images without texture-like components, i.e., highly oscillatory edges. Unfortunately it fails in the presence of the latter. Namely, it cannot separate pure noise, i.e., highly oscillatory components, from highly oscillatory edges but removes both equally.
To overcome this situation, Y. Meyer suggested replacing the $L^2$ fidelity term by a weaker norm. Namely he proposes the following model:
$$J(u) = |Du|(\Omega) + \frac{1}{2\lambda}\|u - g\|_{*} \to \min_{u \in BV(\Omega)},$$
where $\|\cdot\|_*$ is defined as follows.
Definition 1.1. Let $G$ denote the Banach space consisting of all generalized functions $g(x, y)$ which can be written as
$$g(x, y) = \nabla\cdot(\vec f(x, y)), \quad \vec f = (f_1, f_2), \quad f_1, f_2 \in L^\infty(\Omega), \quad \vec f\cdot\vec n = 0 \text{ on } \partial\Omega,$$
where $\vec n$ is the unit normal on $\partial\Omega$. Then $\|g\|_*$ is the induced norm of $G$ defined as
$$\|g\|_* = \inf_{g = \nabla\cdot\vec f} \left\| \sqrt{f_1(x, y)^2 + f_2(x, y)^2} \right\|_{L^\infty(\Omega)}.$$
In fact, the space $G$ is the dual space of $W_0^{1,1}(\Omega)$. Meyer further introduces two other spaces with properties similar to those of $G$, but we are not going into detail here. We only mention that these spaces are intrinsically appropriate for modeling textured or oscillatory patterns and in fact they provide norms which are smaller for such patterns than the $L^2$ norm.
The drawback of Meyer's model is that it cannot be solved directly with respect to the minimizer $u$ and therefore has to be approximated. Thereby the $*$-norm is replaced by
$$\frac{1}{2\mu}\left\|(u + \nabla\cdot\vec f) - g\right\|_{L^2(\Omega)}^2 + \frac{1}{\lambda}\left\|\sqrt{f_1^2 + f_2^2}\right\|_{L^p(\Omega)},$$
with $\lambda, \mu > 0$ and $p \ge 1$. In the case $p = 2$ the second term in the above expression is equivalent to the $H^{-1}$ norm. In particular $v = \nabla\cdot\vec f$ corresponds to $v \in H^{-1}(\Omega)$. Indeed, for $v \in H^{-1}(\Omega)$, there is a unique $v^* \in H_0^1(\Omega)$ such that
$$\|v\|_{-1}^2 = \|\nabla v^*\|_{L^2(\Omega)}^2 = \left\|\nabla\Delta^{-1}v\right\|_{L^2(\Omega)}^2 = \left\|\sqrt{f_1^2 + f_2^2}\right\|_{L^2(\Omega)}^2.$$
Restricting to the case $p = 2$ and taking the limit $\mu \to 0$, the TV-$H^{-1}$ denoising model was created, i.e.,
$$|Du|(\Omega) + \frac{1}{2\lambda}\|u - g\|_{-1}^2 \to \min_{u \in BV(\Omega)}. \tag{1.8}$$
Numerical experiments showed that (1.8) gives much better results than (1.7) in the presence of oscillatory data. In Section 3 we will present some numerical results that support this claim.
1.3. TV-$H^{-1}$ Inpainting. Another important task in image processing is the process of filling in missing parts of damaged images based on the information gleaned from the surrounding areas. It is essentially a type of interpolation and is called inpainting. In this case the operator $T$ in (1.2) is multiplication by the characteristic function of a subdomain of $\Omega$.
Now, let $g$ be the given image defined on the image domain $\Omega$. The problem is to reconstruct the original image $u$ in the damaged domain $D \subset \Omega$, called the inpainting domain. The general variational approach (1.2) then takes the form
$$J(u) = R(u) + \frac{1}{2\lambda}\left\|\chi_{\Omega\setminus D}(u - g)\right\|_{B_1}^2 \to \min_{u \in B_2}, \tag{1.9}$$
where
$$\chi_{\Omega\setminus D}(x) = \begin{cases} 1 & x \in \Omega\setminus D \\ 0 & x \in D \end{cases} \tag{1.10}$$
is the characteristic function of $\Omega\setminus D$. Here the role of the regularizing term $R(u)$ is to fill in the image content into the missing domain $D$, e.g., by diffusion and/or transport. Due to the characteristic function $\chi_{\Omega\setminus D}$, the fidelity term of the inpainting approach has impact on the minimizer $u$ outside of the inpainting domain only.
After the pioneering works of Masnou and Morel, and Bertalmio et al., the basic variational inpainting model is the TV-$L^2$ model, where as before $R(u) = |Du|(\Omega) \approx \int_\Omega |\nabla u|\,dx$, $B_1 = L^2(\Omega)$ and $B_2 = BV(\Omega)$, cf. [18, 16, 37, 36]. A variational model with a regularizing term containing higher order derivatives is the Euler elastica model [20, 19, 31], where $R(u) = \int_\Omega \left(a + b\left(\nabla\cdot(\nabla u/|\nabla u|)\right)^2\right)|\nabla u|\,dx$ with positive weights $a$ and $b$. Other examples to be mentioned for (1.9) are the active contour model based on Mumford and Shah's segmentation, and the inpainting scheme based on the Mumford-Shah-Euler image model, only to give a rough overview.
Now second order variational inpainting methods (where the order of the method is determined by the derivatives of highest order in the corresponding Euler-Lagrange equation), like TV inpainting, have drawbacks, such as in the connection of edges over large distances (Connectivity Principle, cf. Figure 1) and the smooth propagation of level lines (sets of image points with constant grayvalue) into the damaged domain (Staircasing Effect, cf. Figure 2).
Figure 1. Two examples of Euler elastica inpainting compared with TV inpainting.
In the case of large aspect ratios the TV inpainting fails to comply to the Connectivity
Principle. Figure from .
This is due to the penalization of the length of the level lines within the minimizing process with a second order regularizing term, connecting level lines from the boundary of the inpainting domain via the shortest distance (linear interpolation). The regularizing term $R(u) \approx \int_\Omega |\nabla u|\,dx$ in the TV-$L^2$ inpainting approach, for instance, can be interpreted via the coarea formula, which gives
$$\min_u \int_\Omega |\nabla u|\,dx \iff \min_{\Gamma_\lambda} \int_{-\infty}^{\infty} \mathrm{length}(\Gamma_\lambda)\,d\lambda,$$
where $\Gamma_\lambda = \{x \in \Omega : u(x) = \lambda\}$ is the level line for the grayvalue $\lambda$. If we consider on the other hand the regularizing term in the Euler elastica inpainting approach, the coarea formula reads
$$\min_u \int_\Omega \left(a + b\left(\nabla\cdot\frac{\nabla u}{|\nabla u|}\right)^2\right)|\nabla u|\,dx \iff \min_{\Gamma_\lambda} \int_{-\infty}^{\infty} a\,\mathrm{length}(\Gamma_\lambda) + b\,\mathrm{curvature}^2(\Gamma_\lambda)\,d\lambda. \tag{1.11}$$
Figure 2. An example of Euler elastica inpainting compared with TV inpainting.
Despite the presence of high curvature TV inpainting truncates the circle inside the
inpainting domain (linear interpolation of level lines, i.e., Staircasing Effect). Depending on the weights a and b Eulers elastica inpainting returns a smoothly restored
object, taking the curvature of the circle into account. Figure from .
Thereby not only the length of the level lines but also their curvature is penalized (where the penalization of each depends on the ratio b/a). This results in a smooth continuation of level lines over
the inpainting domain also over large distances, compare Figure 1 and 2. The performance of higher
order inpainting methods, such as Euler elastica inpainting, can also be interpreted via the second
boundary condition, necessary for the well-posedness of the corresponding Euler-Lagrange equation of
fourth order. Not only the grayvalues of the image are specified on the boundary of the inpainting
domain but also the gradient of the image function, namely the direction of the level lines are given.
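The coarea interpretation of the total variation used above can be checked by hand in a simplified setting. The sketch below assumes a one-dimensional, integer-valued signal, where the "length" of a level line reduces to the number of neighbouring samples separated by the level; the function names and the test signal are illustrative.

```python
def total_variation_1d(u):
    """Sum of absolute jumps between neighbouring samples."""
    return sum(abs(u[i + 1] - u[i]) for i in range(len(u) - 1))


def level_set_boundary_count(u, level):
    """Number of neighbouring pairs separated by the given grey-value level,
    i.e. the size of the boundary of the upper level set {u > level}."""
    return sum((u[i] > level) != (u[i + 1] > level) for i in range(len(u) - 1))


def coarea_sum(u):
    """Integrate the boundary count over all levels. For integer-valued u the
    count is constant on each interval (k, k + 1), so half-integer levels suffice."""
    lo, hi = min(u), max(u)
    return sum(level_set_boundary_count(u, k + 0.5) for k in range(lo, hi))


signal = [0, 0, 3, 3, 1, 4, 4, 0]
tv = total_variation_1d(signal)
layers = coarea_sum(signal)
```

Both quantities equal 12 for this signal: summing the jump heights gives the same number as stacking the boundaries of all level sets. In two dimensions the boundary count becomes the length of the level line $\Gamma_\lambda$, which is why minimizing the total variation shortens level lines.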
In an attempt to address both the connectivity principle and the staircasing effect resulting from second order image diffusions, a number of third and fourth order diffusions have been suggested for image inpainting. One of them is TV-$H^{-1}$ inpainting. Thereby the inpainted image $u$ of $g \in L^2(\Omega)$ shall evolve via
$$u_t = \lambda\Delta p + \chi_{\Omega\setminus D}(g - u), \qquad p \in \partial|Du|(\Omega), \tag{1.12}$$
where $\partial|Du|(\Omega)$ denotes the subdifferential of the total variation. A similar form of this inpainting approach has appeared previously in the literature. In Section 3 we will present some numerical results for this approach.
Note that (1.12) does not follow a variational principle. Let $p \in \partial|Du|(\Omega)$. Then, in fact, the functional
$$|Du|(\Omega) + \frac{1}{2\lambda}\left\|\chi_{\Omega\setminus D}(u - g)\right\|_{-1}^2 \tag{1.13}$$
exhibits the optimality condition
$$0 = \lambda p + \chi_{\Omega\setminus D}\,\Delta^{-1}\left(\chi_{\Omega\setminus D}(u - g)\right),$$
which splits into
$$0 = \lambda p \ \text{ in } D, \qquad 0 = \lambda p + \Delta^{-1}(u - g) \ \text{ in } \Omega\setminus D.$$
Hence the minimization of (1.13) translates into a second order diffusion inside the inpainting domain $D$, whereas a stationary solution of (1.12) fulfills
$$0 = \lambda\Delta p \ \text{ in } D, \qquad 0 = \lambda\Delta p + (g - u) \ \text{ in } \Omega\setminus D.$$
1.4. Numerical Solution for TV-$H^{-1}$ Minimization. The numerical solution of TV-$H^{-1}$ approaches is a challenging task on its own. Generally speaking the computational costs are very high because of the high (fourth!) differential order of the corresponding optimality condition, induced by the $H^{-1}$ norm in the problem. Solving (1.1) by an explicit steepest descent iteration, for instance, results in a stepsize of order $O(\Delta x^4)$, where $\Delta x$ is the spatial stepsize in $\Omega$. In particular in the case when $T \neq \mathrm{Id}$, existing semi-implicit schemes are unconditionally stable, i.e., the stepsize for the iterations can be chosen arbitrarily large, but their convergence to a minimizer is still slow, depending on the size of the tuning parameter $\lambda$. This will be discussed in more detail at the end of this section. However, it is clear that the numerical solution of TV-$H^{-1}$ approaches in general poses the challenge of limiting the computational costs.
Now, the numerical solution of TV-$H^{-1}$ minimization depends on the specific problem at hand. Lieu and Vese proposed a numerical method to solve TV-$H^{-1}$ denoising/decomposition (1.8) by using the Fourier representation of the $H^{-1}$ norm on the whole $\mathbb{R}^d$, $d \ge 1$. Thereby the space $H^{-1}(\mathbb{R}^d)$ is defined as a Hilbert space equipped with the inner product
$$\langle f, g\rangle_{-1} = \int \left(1 + |\xi|^2\right)^{-1} \hat f\,\bar{\hat g}\,d\xi$$
and associated norm $\|f\|_{-1} = \sqrt{\langle f, f\rangle_{-1}}$. Here $\hat g$ denotes the Fourier transform of $g$ in $L^2(\mathbb{R}^d)$, i.e.,
$$\hat g(y) := \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} e^{-ixy} g(x)\,dx, \qquad y \in \mathbb{R}^d, \tag{1.14}$$
and $\bar g$ the complex conjugate of $g$. $H^{-1}(\mathbb{R}^d)$ is the dual space of $H^1(\mathbb{R}^d)$. Note that we consider functions $g$ defined in $\mathbb{R}^d$ rather than on a bounded domain, which can be done by considering zero extensions of the image function. With this definition of the $H^{-1}$ norm the corresponding optimality condition reads
$$\begin{cases} \lambda p + 2\,\mathrm{Re}\left\{\left[\left(1 + |\xi|^2\right)^{-1}\left(\bar{\hat g} - \bar{\hat u}\right)\right]^{\vee}\right\} = 0 & \text{in } \Omega \\ \dfrac{\nabla u}{|\nabla u|}\cdot\vec n = 0 & \text{on } \partial\Omega \\ u = 0 & \text{outside } \bar\Omega, \end{cases} \tag{1.15}$$
where $\check g$ denotes the inverse Fourier transform of $g$, defined in analogy to (1.14), $\mathrm{Re}$ denotes the real part of a complex number, and $\vec n$ is the outward pointing unit normal vector on $\partial\Omega$. For the numerical computation of a solution of (1.15), we approximate an element $p$ of the subdifferential of $|Du|(\Omega)$ by its relaxed version
$$p \approx \nabla\cdot\left(\nabla u/|\nabla u|_\epsilon\right), \tag{1.16}$$
where $|\nabla u|_\epsilon = \sqrt{|\nabla u|^2 + \epsilon}$.
Equation (1.15) leads to solving a second order PDE rather than a fourth order PDE, resulting in a better CFL condition for the numerical scheme. The dual approach for TV-$H^{-1}$ denoising and decomposition [5, 6] shall be presented in Section 2.
In the case of TV-$H^{-1}$ inpainting the situation is completely different since (1.12) does not fulfill a variational principle, cf. Section 1.3. In earlier work a convexity splitting scheme was used to solve (1.12). Convexity splitting algorithms, proposed by Eyre, are usually applied to gradient flows and provide a semi-implicit scheme for the discretization in time. Roughly speaking, convexity splitting means solving a gradient system
$$u_t = -\nabla J(u) \ \text{ in } \Omega, \qquad u(\cdot, t = 0) = u_0 \ \text{ in } \Omega,$$
with initial condition $u_0 \in \mathbb{R}^N$ and a functional $J$ from $\mathbb{R}^N$ into $\mathbb{R}$, where $N$ is the dimension of the data space, by a semi-implicit algorithm of the form: Pick an initial $u^0 = u_0$ and iterate for $k \ge 0$
$$u^{k+1} - u^k = \tau\left(\nabla J_e(u^k) - \nabla J_c(u^{k+1})\right),$$
where $u^k$ approximates the exact solution $u(k\tau)$ (where $\tau$ denotes the timestep) and $J_c$, $J_e$ are strictly convex and chosen such that
$$J(u) = J_c(u) - J_e(u).$$
Under certain assumptions this discretization approach is unconditionally stable and relatively easy to apply to a large range of variational problems. Although most higher-order inpainting schemes, among them (1.12), are not gradient flows, this method can still be applied in a modified form.
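The mechanics of convexity splitting can be seen on a scalar toy problem. The sketch below is not the inpainting scheme of this section: it assumes the double-well energy $J(u) = (u^2 - 1)^2/4$ with the splitting $J_c(u) = u^4/4 + 1/4$ and $J_e(u) = u^2/2$, and solves the implicit step by bisection. All function names and parameter values are illustrative.

```python
def energy(u):
    """Double-well energy J(u) = (u^2 - 1)^2 / 4 = J_c(u) - J_e(u)
    with J_c(u) = u^4 / 4 + 1 / 4 and J_e(u) = u^2 / 2, both convex."""
    return 0.25 * (u * u - 1.0) ** 2


def implicit_step(u_old, tau):
    """Solve u_new - u_old = tau * (J_e'(u_old) - J_c'(u_new)), that is
    u_new + tau * u_new^3 = (1 + tau) * u_old, by bisection. The left-hand
    side is strictly increasing in u_new, so the root is unique."""
    rhs = (1.0 + tau) * u_old
    lo, hi = -abs(rhs) - 1.0, abs(rhs) + 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid + tau * mid ** 3 < rhs:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def convexity_splitting(u0, tau, steps):
    """Iterate the semi-implicit scheme and return all iterates."""
    history = [u0]
    for _ in range(steps):
        history.append(implicit_step(history[-1], tau))
    return history


# A deliberately large time step; an explicit scheme would not be stable here.
trajectory = convexity_splitting(u0=0.1, tau=10.0, steps=60)
energies = [energy(u) for u in trajectory]
```

Even with the large step `tau = 10` the energy never increases and the iterates settle at the minimizer $u = 1$. The price is that the convex part has to be treated implicitly in every step.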
For the application of convexity splitting to (1.12) an element $p \in \partial TV(u)$ is replaced by its relaxed version (1.16), namely we want to solve
$$u_t = -\lambda\Delta\left(\nabla\cdot\left(\frac{\nabla u}{|\nabla u|_\epsilon}\right)\right) + \chi_{\Omega\setminus D}(g - u). \tag{1.17}$$
The following splitting has been proposed for the TV-$H^{-1}$ inpainting equation. The regularizing term in (1.17) can be modeled by a gradient flow in $H^{-1}$ of the energy
$$J^1(u) = \int_\Omega |\nabla u|_\epsilon\,dx.$$
We split $J^1$ into $J_c^1 - J_e^1$ with
$$J_c^1(u) = \int_\Omega \frac{C_1}{2}|\nabla u|^2\,dx, \qquad J_e^1(u) = \int_\Omega \left(-|\nabla u|_\epsilon + \frac{C_1}{2}|\nabla u|^2\right) dx.$$
The fitting term is a gradient flow in $L^2$ of the energy
$$J^2(u) = \frac{1}{2\lambda}\int_\Omega \chi_{\Omega\setminus D}(u - g)^2\,dx$$
and is split into $J^2 = J_c^2 - J_e^2$ with
$$J_c^2(u) = \int_\Omega \frac{C_2}{2}|u|^2\,dx, \qquad J_e^2(u) = \int_\Omega \left(-\frac{1}{2\lambda}\chi_{\Omega\setminus D}(u - g)^2 + \frac{C_2}{2}|u|^2\right) dx.$$
For the splittings discussed above the resulting time-stepping scheme is
$$\frac{u^{k+1} - u^k}{\tau} = -\nabla_{H^{-1}}\left(J_c^1(u^{k+1}) - J_e^1(u^k)\right) - \nabla_{L^2}\left(J_c^2(u^{k+1}) - J_e^2(u^k)\right),$$
where $\nabla_{H^{-1}}$ and $\nabla_{L^2}$ represent the Fréchet derivatives with respect to the $H^{-1}$ inner product and the $L^2$ inner product respectively. This translates to a numerical scheme of the form
$$\frac{u^{k+1} - u^k}{\tau} + C_1\Delta^2 u^{k+1} + C_2 u^{k+1} = C_1\Delta^2 u^k - \Delta\left(\nabla\cdot\left(\frac{\nabla u^k}{|\nabla u^k|_\epsilon}\right)\right) + C_2 u^k + \frac{1}{\lambda}\chi_{\Omega\setminus D}(g - u^k), \tag{1.18}$$
where $\Delta^2 = \Delta\Delta$. In order to make the scheme unconditionally stable, the constants $C_1$ and $C_2$ have to be chosen so that $J_c^i$, $J_e^i$, $i = 1, 2$, are all convex. The condition turns out to be $C_1 > 1/\epsilon$ and $C_2 > 1/\lambda$. Since $\lambda$ is usually chosen comparatively small in inpainting tasks, e.g., $\lambda = 10^{-3}$, the condition on $C_2$ makes the numerical scheme (1.18), although unconditionally stable, quite slow.
In the following we are going to present a method introduced by Chambolle for TV-$L^2$ minimization (1.7) and its generalization for the TV-$H^{-1}$ case (1.1). This algorithm will give us the opportunity to address TV-$H^{-1}$ minimization in a general way.
2. An Algorithm for TV-$H^{-1}$ Minimization
2.1. Preliminaries. Throughout this section $\|\cdot\|$ denotes the norm in $X = L^2(\Omega)$ in the continuous setting, and the Euclidean norm in $X = \mathbb{R}^{N\times M}$ in the discrete setting. In the discrete setting the continuous image domain $\Omega = [a, b]\times[c, d] \subset \mathbb{R}^2$ is approximated by a finite grid $\{a = x_1 < \ldots < x_N = b\}\times\{c = y_1 < \ldots < y_M = d\}$ with equidistant step-size $h = x_{i+1} - x_i = \frac{b-a}{N} = \frac{d-c}{M} = y_{j+1} - y_j$ equal to 1 (one pixel). The digital image $u$ is an element in $X$. We denote $u(x_i, y_j) = u_{i,j}$ for $i = 1, \ldots, N$ and $j = 1, \ldots, M$.
Further we define $Y = X\times X$ with Euclidean norm $\|\cdot\|_Y$ and inner product $\langle\cdot,\cdot\rangle_Y$. Moreover the operators gradient $\nabla$, divergence $\nabla\cdot$ and Laplacian $\Delta$ in the discrete setting are defined as follows:
The gradient $\nabla u$ is a vector in $Y$ given by forward differences
$$(\nabla u)_{i,j} = \left((\nabla_x u)_{i,j}, (\nabla_y u)_{i,j}\right),$$
with
$$(\nabla_x u)_{i,j} = \begin{cases} u_{i+1,j} - u_{i,j} & \text{if } i < N \\ 0 & \text{if } i = N, \end{cases} \qquad (\nabla_y u)_{i,j} = \begin{cases} u_{i,j+1} - u_{i,j} & \text{if } j < M \\ 0 & \text{if } j = M, \end{cases}$$
for $i = 1, \ldots, N$, $j = 1, \ldots, M$.
We further introduce a discrete divergence $\nabla\cdot : Y \to X$ defined, by analogy with the continuous setting, by $\nabla\cdot = -\nabla^*$ ($\nabla^*$ is the adjoint of the gradient $\nabla$). That is, the discrete divergence operator is given by backward differences,
$$(\nabla\cdot p)_{i,j} = \begin{cases} p^x_{i,j} - p^x_{i-1,j} & \text{if } 1 < i < N \\ p^x_{i,j} & \text{if } i = 1 \\ -p^x_{i-1,j} & \text{if } i = N \end{cases} \; + \; \begin{cases} p^y_{i,j} - p^y_{i,j-1} & \text{if } 1 < j < M \\ p^y_{i,j} & \text{if } j = 1 \\ -p^y_{i,j-1} & \text{if } j = M, \end{cases}$$
for every $p = (p^x, p^y) \in Y$.
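The relation $\nabla\cdot = -\nabla^*$ is easy to verify numerically. The following sketch implements the forward-difference gradient and the backward-difference divergence exactly as defined above (with zero-based indices) and checks $\langle\nabla u, p\rangle_Y = -\langle u, \nabla\cdot p\rangle_X$ on arbitrary test data. The function names and the test arrays are illustrative.

```python
def gradient(u):
    """Forward differences; u[i][j] with i = 0..N-1 and j = 0..M-1."""
    N, M = len(u), len(u[0])
    gx = [[u[i + 1][j] - u[i][j] if i < N - 1 else 0.0 for j in range(M)]
          for i in range(N)]
    gy = [[u[i][j + 1] - u[i][j] if j < M - 1 else 0.0 for j in range(M)]
          for i in range(N)]
    return gx, gy


def divergence(px, py):
    """Backward differences with the boundary rules of the text, so that
    the divergence is minus the adjoint of the gradient."""
    N, M = len(px), len(px[0])
    div = [[0.0] * M for _ in range(N)]
    for i in range(N):
        for j in range(M):
            dx = (px[i][j] if i < N - 1 else 0.0) - (px[i - 1][j] if i > 0 else 0.0)
            dy = (py[i][j] if j < M - 1 else 0.0) - (py[i][j - 1] if j > 0 else 0.0)
            div[i][j] = dx + dy
    return div


# Arbitrary deterministic test data on a 4 x 5 grid.
N, M = 4, 5
u = [[float((3 * i + 5 * j) % 7) for j in range(M)] for i in range(N)]
px = [[float((i * i + 2 * j) % 5 - 2) for j in range(M)] for i in range(N)]
py = [[float((2 * i + j * j) % 3 - 1) for j in range(M)] for i in range(N)]

gx, gy = gradient(u)
div_p = divergence(px, py)
lhs = sum(gx[i][j] * px[i][j] + gy[i][j] * py[i][j]
          for i in range(N) for j in range(M))
rhs = -sum(u[i][j] * div_p[i][j] for i in range(N) for j in range(M))
```

The two inner products `lhs` and `rhs` agree, which is the discrete integration by parts used repeatedly in the convergence analysis.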
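Finally we define the discrete Laplacian as $\Delta = \nabla\cdot\nabla$, i.e.,
$$(\Delta u)_{i,j} = \begin{cases} u_{i+1,j} - 2u_{i,j} + u_{i-1,j} & \text{if } 1 < i < N \\ u_{i+1,j} - u_{i,j} & \text{if } i = 1 \\ u_{i-1,j} - u_{i,j} & \text{if } i = N \end{cases} \; + \; \begin{cases} u_{i,j+1} - 2u_{i,j} + u_{i,j-1} & \text{if } 1 < j < M \\ u_{i,j+1} - u_{i,j} & \text{if } j = 1 \\ u_{i,j-1} - u_{i,j} & \text{if } j = M, \end{cases}$$
and its inverse operator $\Delta^{-1}$, as in the continuous setting (1.6), i.e., $u = \Delta^{-1}f$ is the unique solution of
$$-(\Delta u)_{i,j} = f_{i,j}, \quad 1 < i < N,\ 1 < j < M, \qquad u_{i,j} = 0, \quad i = 1, N;\ j = 1, M.$$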
Moreover, without always indicating it, when in the discrete setting, instead of minimizing
$$J(u) = \frac{1}{2\lambda}\|u - g\|_{-1}^2 + |Du|(\Omega),$$
we consider the discretized functional
$$J^h(u) := \sum_{\substack{1\le i\le N \\ 1\le j\le M}} \left[\frac{1}{2\lambda}\left|\left(\nabla\Delta^{-1}(u - g)\right)_{i,j}\right|^2 + \left|(\nabla u)_{i,j}\right|\right],$$
with $|y| = \sqrt{y_1^2 + y_2^2}$ for every $y = (y_1, y_2) \in \mathbb{R}^2$ and some step-size $h$. The functional $J^h$ multiplied by $h$ converges as $h \to 0$ in the $\Gamma$ sense to $J$.
2.2. Chambolle's Algorithm for Total Variation Minimization. Chambolle proposes an algorithm to numerically compute a minimizer of
$$J(u) = \frac{1}{2\lambda}\|u - g\|_{L^2(\Omega)}^2 + |Du|(\Omega).$$
His algorithm is based on considerations of the convex conjugate of the total variation and on exploiting the corresponding optimality condition. It amounts to computing the minimizer $u$ of $J$ as
$$u = g - P_{\lambda K}(g),$$
where $P_{\lambda K}$ denotes the orthogonal projection in $L^2(\Omega)$ onto the convex set $\lambda K$, with $K$ the closure of the set
$$\left\{\nabla\cdot\xi : \xi \in C_c^1(\Omega; \mathbb{R}^2),\ |\xi(x)| \le 1 \ \forall x \in \mathbb{R}^2\right\}.$$
To numerically compute the projection $P_{\lambda K}(g)$ he uses a fixed point algorithm. All this will be explained in more detail in the context of TV-$H^{-1}$ minimization in the following subsection.
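The structure of this fixed point algorithm can be sketched in one dimension. The code below assumes the standard form of Chambolle's iteration, $p^{n+1} = \left(p^n + \tau\nabla(\nabla\cdot p^n - g/\lambda)\right)/\left(1 + \tau|\nabla(\nabla\cdot p^n - g/\lambda)|\right)$, with $u = g - \lambda\nabla\cdot p$; the one-dimensional operators, the step size $\tau = 0.2$ (chosen below the assumed one-dimensional bound $1/4$) and the function names are illustrative.

```python
def grad(u):
    """1D forward difference, zero at the last sample."""
    n = len(u)
    return [u[i + 1] - u[i] if i < n - 1 else 0.0 for i in range(n)]


def div(p):
    """1D backward difference with the boundary rules of the discrete divergence."""
    n = len(p)
    return [(p[i] if i < n - 1 else 0.0) - (p[i - 1] if i > 0 else 0.0)
            for i in range(n)]


def chambolle_tv_l2(g, lam, tau=0.2, iterations=2000):
    """Fixed point iteration for the dual variable p of the TV-L2 problem.
    Returns the denoised signal u = g - lam * div(p) together with p."""
    n = len(g)
    p = [0.0] * n
    for _ in range(iterations):
        d = div(p)
        a = grad([d[i] - g[i] / lam for i in range(n)])
        p = [(p[i] + tau * a[i]) / (1.0 + tau * abs(a[i])) for i in range(n)]
    d = div(p)
    u = [g[i] - lam * d[i] for i in range(n)]
    return u, p


# A clean step edge: total variation denoising keeps the edge and lowers the contrast.
g = [0.0] * 4 + [1.0] * 4
u, p = chambolle_tv_l2(g, lam=1.0)
```

For this step signal the exact minimizer is known in closed form: each plateau of length 4 moves towards the other by $\lambda/4$, so `u` approaches `[0.25] * 4 + [0.75] * 4`. The edge stays sharp and only the contrast is reduced, which is the typical behaviour of the TV-$L^2$ model.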
2.3. A Generalization of Chambolle's Algorithm for TV-$H^{-1}$ Minimization. The main contribution of this paper is to generalize Chambolle's algorithm to the case of an $H^{-1}$ constrained minimization of the total variation where $T$ is an arbitrary linear and bounded operator. In short we shall see how to solve (1.1) using a strategy similar to Chambolle's. We start with solving the simplified problem when $T = \mathrm{Id}$, as also proposed in [5, 6], and as a second step present a method for using this solution in order to solve the general case (1.1). Hence for the time being we consider the minimization problem
$$\min_u \left\{ J(u) = |Du|(\Omega) + \frac{1}{2\lambda}\|u - g\|_{-1}^2 \right\}. \tag{2.19}$$
We proceed by exploiting the optimality condition of (2.19), i.e.,
$$0 \in \partial|Du|(\Omega) + \frac{1}{\lambda}\Delta^{-1}(u - g).$$
This can be rewritten as
$$\frac{\Delta^{-1}(g - u)}{\lambda} \in \partial|Du|(\Omega). \tag{2.20}$$
Since
$$s \in \partial f(x) \iff x \in \partial f^*(s),$$
where $f^*$ is the conjugate (or Fenchel transform) of $f$, it follows
$$u \in \partial|D\cdot|(\Omega)^*\left(\frac{\Delta^{-1}(g - u)}{\lambda}\right).$$
Here
$$|D\cdot|(\Omega)^*(v) = \chi_K(v) = \begin{cases} 0 & \text{if } v \in K \\ +\infty & \text{otherwise}, \end{cases} \tag{2.21}$$
where $K$ is the closure of the set
$$\left\{\nabla\cdot\xi : \xi \in C_c^1(\Omega; \mathbb{R}^2),\ |\xi(x)| \le 1 \ \forall x \in \mathbb{R}^2\right\},$$
as before. Rewriting the above inclusion again we have
$$\frac{g}{\lambda} \in \frac{g - u}{\lambda} + \frac{1}{\lambda}\partial|D\cdot|(\Omega)^*\left(\frac{\Delta^{-1}(g - u)}{\lambda}\right),$$
i.e., with $w = \Delta^{-1}(g - u)/\lambda$ it reads
$$0 \in (-\Delta w - g/\lambda) + \frac{1}{\lambda}\partial|D\cdot|(\Omega)^*(w) \ \text{ in } \Omega, \qquad w = 0 \ \text{ on } \partial\Omega.$$
In other words $w$ is a minimizer of
$$\frac{\left\|w - \Delta^{-1}g/\lambda\right\|_{H_0^1(\Omega)}^2}{2} + \frac{1}{\lambda}|D\cdot|(\Omega)^*(w),$$
where $H_0^1(\Omega) = \{v \in H^1(\Omega) : v = 0 \text{ on } \partial\Omega\}$ and $\|v\|_{H_0^1(\Omega)} = \|\nabla v\|$. Because of (2.21), for $w$ being a minimizer of the above functional it is necessary that $|D\cdot|(\Omega)^*(w) = 0$, i.e., $w \in K$. Hence a minimizer $w$ fulfills
$$w = P^1_K(\Delta^{-1}g/\lambda),$$
where $P^1_K$ is the orthogonal projection on $K$ over $H_0^1(\Omega)$, i.e.,
$$P^1_K(u) = \operatorname{argmin}_{v \in K} \|u - v\|_{H_0^1(\Omega)}.$$
Hence the solution $u$ of problem (2.19) is given by
$$u = g + \Delta\left(P^1_{\lambda K}(\Delta^{-1}g)\right),$$
where $-\Delta$ denotes the zero Dirichlet Laplacian as before.
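Computing the nonlinear projection $P^1_{\lambda K}(\Delta^{-1}g)$ amounts to solving the following problem:
$$\min\left\{ \left\|\nabla\left(\lambda\nabla\cdot p - \Delta^{-1}g\right)\right\|^2 : p \in Y,\ |p_{i,j}| \le 1 \ \forall i = 1, \ldots, N;\ j = 1, \ldots, M \right\}. \tag{2.22}$$
Analogously to Chambolle's approach we use the Karush-Kuhn-Tucker conditions for the above constrained minimization. Then there exist $\alpha_{i,j} \ge 0$ such that the corresponding Euler-Lagrange equation reads
$$\left(\nabla\Delta\left(\lambda\nabla\cdot p - \Delta^{-1}g\right)\right)_{i,j} + \alpha_{i,j}\,p_{i,j} = 0, \quad \forall i = 1, \ldots, N;\ j = 1, \ldots, M,$$
where either $\alpha_{i,j} > 0$ and $|p_{i,j}| = 1$, or $|p_{i,j}| < 1$ and $\alpha_{i,j} = 0$. Now, following Chambolle's arguments, in both cases this yields
$$\alpha_{i,j} = \left|\left(\nabla\Delta\left(\nabla\cdot p^n - \Delta^{-1}g/\lambda\right)\right)_{i,j}\right|, \quad \forall i = 1, \ldots, N;\ j = 1, \ldots, M.$$
Then the gradient descent algorithm for solving (2.22) reads: for an initial $p^0 = 0$, iterate for $n \ge 0$
$$p^{n+1}_{i,j} = \frac{p^n_{i,j} - \tau\left(\nabla\Delta\left(\nabla\cdot p^n - \Delta^{-1}g/\lambda\right)\right)_{i,j}}{1 + \tau\left|\left(\nabla\Delta\left(\nabla\cdot p^n - \Delta^{-1}g/\lambda\right)\right)_{i,j}\right|}. \tag{2.23}$$
Redoing Chambolle's convergence proof we end up with a similar result. Essentially the same proof can be found in [5, 6].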
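Iteration (2.23) can be sketched in one dimension as follows. This is a simplified illustration under stated assumptions, not the implementation behind the results of this paper: the discrete Laplacian is taken as $\Delta = \nabla\cdot\nabla$, and $\Delta\Delta^{-1}g$ is replaced by $-g$, so that no linear system has to be solved (this ignores the boundary treatment of $\Delta^{-1}$). The step size $\tau = 1/16$ is the assumed one-dimensional analogue of a bound of the form $\tau \le 1/\kappa^2$ with $\kappa = \|\nabla\nabla\cdot\|$, and all names are illustrative.

```python
def grad(u):
    """1D forward difference, zero at the last sample."""
    n = len(u)
    return [u[i + 1] - u[i] if i < n - 1 else 0.0 for i in range(n)]


def div(p):
    """1D backward difference with the boundary rules of the discrete divergence."""
    n = len(p)
    return [(p[i] if i < n - 1 else 0.0) - (p[i - 1] if i > 0 else 0.0)
            for i in range(n)]


def laplace(u):
    """Discrete Laplacian defined as div(grad(u))."""
    return div(grad(u))


def tv_h_minus_one(g, lam, tau=1.0 / 16.0, iterations=500):
    """Sketch of iteration (2.23) in 1D. Returns u = g + lam * laplace(div(p)) and p."""
    n = len(g)
    p = [0.0] * n
    for _ in range(iterations):
        # grad(laplace(div p - Delta^{-1} g / lam)), with laplace(Delta^{-1} g)
        # replaced by -g (assumption, see the lead-in).
        ld = laplace(div(p))
        a = grad([ld[i] + g[i] / lam for i in range(n)])
        p = [(p[i] - tau * a[i]) / (1.0 + tau * abs(a[i])) for i in range(n)]
    ld = laplace(div(p))
    u = [g[i] + lam * ld[i] for i in range(n)]
    return u, p


# A step edge with a superimposed oscillation.
g = [0.2, -0.2, 0.2, -0.2, 1.2, 0.8, 1.2, 0.8]
u, p = tv_h_minus_one(g, lam=1.0)
```

By construction every iterate satisfies $|p_i| \le 1$, and the mean of `u` equals the mean of `g` because the discrete divergence sums to zero. Compared with the TV-$L^2$ iteration, the only change is the additional Laplacian inside the gradient, which is where the fourth-order character of the problem enters.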
Theorem 2.1. Let $\tau \le 1/64$. Then, $\lambda\nabla\cdot p^n$ converges to $P^1_{\lambda K}(\Delta^{-1}g)$ as $n \to \infty$.
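Proof. The proof works similarly to the convergence proof for Chambolle's algorithm. For the sake of completeness and clarity we still present the detailed proof here, keeping close to Chambolle's notation. By induction we easily see that for every $n \ge 0$, $|p^n_{i,j}| \le 1$ for all $i, j$. Indeed, starting with $p^n$, with $|p^n_{i,j}| \le 1$ for all $i = 1, \ldots, N$; $j = 1, \ldots, M$, we have
$$\left|p^{n+1}_{i,j}\right| \le \frac{\left|p^n_{i,j}\right| + \tau\left|\left(\nabla(-\Delta)\left(\nabla\cdot p^n - \Delta^{-1}g/\lambda\right)\right)_{i,j}\right|}{1 + \tau\left|\left(\nabla(-\Delta)\left(\nabla\cdot p^n - \Delta^{-1}g/\lambda\right)\right)_{i,j}\right|} \le 1.$$
Now, let us fix an $n \ge 0$ and consider $\left\|\nabla\left(\nabla\cdot p^{n+1} - \Delta^{-1}(g/\lambda)\right)\right\|_Y$. We want to show that this norm is decreasing with $n$. In what follows we will abbreviate $\|\cdot\|_Y$ by $\|\cdot\|$. We have
$$\begin{aligned} \left\|\nabla\left(\nabla\cdot p^{n+1} - \Delta^{-1}(g/\lambda)\right)\right\|^2 &= \left\|\nabla\nabla\cdot(p^{n+1} - p^n) + \nabla\left(\nabla\cdot p^n - \Delta^{-1}(g/\lambda)\right)\right\|^2 \\ &= \left\|\nabla\left(\nabla\cdot p^n - \Delta^{-1}(g/\lambda)\right)\right\|^2 + 2\left\langle \nabla\nabla\cdot(p^{n+1} - p^n),\ \nabla\left(\nabla\cdot p^n - \Delta^{-1}(g/\lambda)\right)\right\rangle \\ &\quad + \left\|\nabla\nabla\cdot(p^{n+1} - p^n)\right\|^2. \end{aligned}$$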
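Inserting $\eta = (p^{n+1} - p^n)/\tau$ in the above equation and integrating by parts in the second term we get
$$\left\|\nabla\left(\nabla\cdot p^{n+1} - \Delta^{-1}(g/\lambda)\right)\right\|^2 = \left\|\nabla\left(\nabla\cdot p^n - \Delta^{-1}(g/\lambda)\right)\right\|^2 + 2\tau\left\langle \eta,\ \nabla\Delta\left(\nabla\cdot p^n - \Delta^{-1}(g/\lambda)\right)\right\rangle + \tau^2\left\|\nabla\nabla\cdot\eta\right\|^2.$$
By further estimating $\|\nabla\nabla\cdot\eta\| \le \kappa\|\eta\|$, where $\kappa = |||\nabla\nabla\cdot||| = \sup_{\|p\|\le 1}\|\nabla\nabla\cdot p\|$ is the norm of the operator $\nabla\nabla\cdot : Y \to Y$, we deduce
$$\left\|\nabla\left(\nabla\cdot p^{n+1} - \Delta^{-1}(g/\lambda)\right)\right\|^2 \le \left\|\nabla\left(\nabla\cdot p^n - \Delta^{-1}(g/\lambda)\right)\right\|^2 + \tau\left[2\left\langle \eta,\ \nabla\Delta\left(\nabla\cdot p^n - \Delta^{-1}(g/\lambda)\right)\right\rangle + \kappa^2\tau\|\eta\|^2\right].$$
The operator norm $\kappa$ will be bounded at the end of the proof. For now we are going to show that the term multiplied by $\tau$ is always negative as long as $p^{n+1} \neq p^n$ and $\tau \le 1/\kappa^2$, and hence that $\left\|\nabla\left(\nabla\cdot p^n - \Delta^{-1}(g/\lambda)\right)\right\|^2$ is decreasing. To do so we consider
$$2\left\langle \eta,\ \nabla\Delta\left(\nabla\cdot p^n - \Delta^{-1}(g/\lambda)\right)\right\rangle + \kappa^2\tau\|\eta\|^2 = \sum_{\substack{1\le i\le N \\ 1\le j\le M}} \left[2\eta_{i,j}\cdot\left(\nabla\Delta\left(\nabla\cdot p^n - \Delta^{-1}(g/\lambda)\right)\right)_{i,j} + \kappa^2\tau|\eta_{i,j}|^2\right]. \tag{2.24}$$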
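Now, from the fixed point equation we have
$$\eta_{i,j} = -\left[\left(\nabla\Delta\left(\nabla\cdot p^n - \Delta^{-1}(g/\lambda)\right)\right)_{i,j} + \left|\left(\nabla\Delta\left(\nabla\cdot p^n - \Delta^{-1}(g/\lambda)\right)\right)_{i,j}\right| p^{n+1}_{i,j}\right].$$
Setting $\rho_{i,j} = \left|\left(\nabla\Delta\left(\nabla\cdot p^n - \Delta^{-1}(g/\lambda)\right)\right)_{i,j}\right| p^{n+1}_{i,j}$ and inserting the above expression for $\eta_{i,j}$ into (2.24) we have for every $i, j$
$$2\eta_{i,j}\cdot\left(\nabla\Delta\left(\nabla\cdot p^n - \Delta^{-1}(g/\lambda)\right)\right)_{i,j} + \kappa^2\tau|\eta_{i,j}|^2 = (\kappa^2\tau - 1)|\eta_{i,j}|^2 - \left|\left(\nabla\Delta\left(\nabla\cdot p^n - \Delta^{-1}(g/\lambda)\right)\right)_{i,j}\right|^2 + |\rho_{i,j}|^2.$$
Since $\left|p^{n+1}_{i,j}\right| \le 1$ it follows that $|\rho_{i,j}| \le \left|\left(\nabla\Delta\left(\nabla\cdot p^n - \Delta^{-1}(g/\lambda)\right)\right)_{i,j}\right|$, and hence
$$2\eta_{i,j}\cdot\left(\nabla\Delta\left(\nabla\cdot p^n - \Delta^{-1}(g/\lambda)\right)\right)_{i,j} + \kappa^2\tau|\eta_{i,j}|^2 \le (\kappa^2\tau - 1)|\eta_{i,j}|^2.$$
The last term is negative or zero if and only if $\kappa^2\tau - 1 \le 0$. Hence, if
$$\tau \le 1/\kappa^2,$$
we see that $\left\|\nabla\left(\nabla\cdot p^n - \Delta^{-1}(g/\lambda)\right)\right\|$ is nonincreasing with $n$. For $\tau < 1/\kappa^2$ it is immediately clear that the norm is even decreasing, unless $\eta = 0$, that is, $p^{n+1} = p^n$. The same holds for $\kappa^2\tau = 1$. Indeed, in this case, if $\left\|\nabla\left(\nabla\cdot p^{n+1} - \Delta^{-1}(g/\lambda)\right)\right\|^2 = \left\|\nabla\left(\nabla\cdot p^n - \Delta^{-1}(g/\lambda)\right)\right\|^2$ it follows that
$$\begin{aligned} 0 &= 2\left\langle \eta,\ \nabla\Delta\left(\nabla\cdot p^n - \Delta^{-1}(g/\lambda)\right)\right\rangle + \tau\|\nabla\nabla\cdot\eta\|^2 \\ &\le 2\left\langle \eta,\ \nabla\Delta\left(\nabla\cdot p^n - \Delta^{-1}(g/\lambda)\right)\right\rangle + \kappa^2\tau\|\eta\|^2 \\ &= \sum_{\substack{1\le i\le N \\ 1\le j\le M}} \left[-\left|\left(\nabla\Delta\left(\nabla\cdot p^n - \Delta^{-1}(g/\lambda)\right)\right)_{i,j}\right|^2 + |\rho_{i,j}|^2\right], \end{aligned}$$
and therefore $\left|\left(\nabla\Delta\left(\nabla\cdot p^n - \Delta^{-1}(g/\lambda)\right)\right)_{i,j}\right| \le |\rho_{i,j}|$. Since in turn $|\rho_{i,j}| \le \left|\left(\nabla\Delta\left(\nabla\cdot p^n - \Delta^{-1}(g/\lambda)\right)\right)_{i,j}\right|$ we deduce $|\rho_{i,j}| = \left|\left(\nabla\Delta\left(\nabla\cdot p^n - \Delta^{-1}(g/\lambda)\right)\right)_{i,j}\right|$ for each $i, j$. But this can only be if either $\left(\nabla\Delta\left(\nabla\cdot p^n - \Delta^{-1}(g/\lambda)\right)\right)_{i,j} = 0$ or $\left|p^{n+1}_{i,j}\right| = 1$. In both cases, the fixed point iteration (2.23) yields $p^{n+1}_{i,j} = p^n_{i,j}$ for all $i, j$.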
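Now, since $\|\nabla(\nabla\cdot p^n - \Delta^{-1}(g/\lambda))\|$ is decreasing with $n$, the norm is uniformly bounded and hence there exists an $m \ge 0$ such that
$$m = \lim_{n\to\infty} \bigl\|\nabla\bigl(\nabla\cdot p^n - \Delta^{-1}(g/\lambda)\bigr)\bigr\| .$$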
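Moreover the sequence $p^n$ has converging subsequences. Let $\bar p$ be the limit of a subsequence $(p^{n_k})$ and $\bar p'$ be the limit of $(p^{n_k+1})$. Inserting $p^{n_k+1}$ and $p^{n_k}$ into the fixed point equation (2.23) and passing to the limit we have
$$\bar p'_{i,j} = \frac{\bar p_{i,j} - \tau\bigl(\nabla\Delta\bigl(\nabla\cdot\bar p - \Delta^{-1}g/\lambda\bigr)\bigr)_{i,j}}{1 + \tau\bigl|\bigl(\nabla\Delta\bigl(\nabla\cdot\bar p - \Delta^{-1}g/\lambda\bigr)\bigr)_{i,j}\bigr|} .$$
Repeating the previous calculations we see that since $m = \|\nabla(\nabla\cdot\bar p - \Delta^{-1}(g/\lambda))\| = \|\nabla(\nabla\cdot\bar p' - \Delta^{-1}(g/\lambda))\|$, it must be that $\bar\eta_{i,j} = (\bar p'_{i,j} - \bar p_{i,j})/\tau = 0$ for every $i, j$, that is, $\bar p = \bar p'$.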
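Hence $\bar p$ is a fixed point of (2.23), i.e.,
$$\bigl(\nabla\Delta\bigl(\nabla\cdot\bar p - \Delta^{-1}g/\lambda\bigr)\bigr)_{i,j} + \bigl|\bigl(\nabla\Delta\bigl(\nabla\cdot\bar p - \Delta^{-1}g/\lambda\bigr)\bigr)_{i,j}\bigr|\,\bar p_{i,j} = 0, \quad \forall i = 1,\dots,N;\ j = 1,\dots,M,$$
which is the Euler equation for a solution of (2.22). One can deduce that $\bar p$ solves (2.22) and that $\lambda\nabla\cdot\bar p$ is the projection $P^1_K(\Delta^{-1}g)$. Since this projection is unique, we deduce that the whole sequence $\lambda\nabla\cdot p^n$ converges to $P^1_K(\Delta^{-1}g)$. The theorem is proved if we can show that $\kappa^2 \le 64$. By definition
$$\kappa = |||\nabla\nabla\cdot||| = \sup_{\|p\|\le 1}\|\nabla\nabla\cdot p\| .$$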
Then, summing over all $i, j$, we have
$$\|\nabla\nabla\cdot p\|^2 = \sum_{\substack{1\le i\le N\\ 1\le j\le M}} \bigl|(\nabla\nabla\cdot p)_{i,j}\bigr|^2 .$$
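For more clarity let us set $u := \nabla\cdot p \in X$ for now. With the convention that $p_{0,j} = p_{N,j} = p_{i,0} = p_{i,M} = 0$ we get
$$\begin{aligned}
\|\nabla\nabla\cdot p\|^2
&= \sum_{\substack{1\le i\le N\\ 1\le j\le M}} \bigl|(\nabla u)_{i,j}\bigr|^2
 = \sum_{\substack{1\le i\le N\\ 1\le j\le M}} \Bigl[(\nabla_x u)_{i,j}^2 + (\nabla_y u)_{i,j}^2\Bigr] \\
&= \sum_{\substack{1\le i<N\\ 1\le j\le M}} (u_{i+1,j} - u_{i,j})^2 + \sum_{\substack{1\le i\le N\\ 1\le j<M}} (u_{i,j+1} - u_{i,j})^2 \\
&= \sum_{\substack{1\le i<N\\ 1\le j\le M}} \bigl((\nabla\cdot p)_{i+1,j} - (\nabla\cdot p)_{i,j}\bigr)^2 + \sum_{\substack{1\le i\le N\\ 1\le j<M}} \bigl((\nabla\cdot p)_{i,j+1} - (\nabla\cdot p)_{i,j}\bigr)^2 \\
&= \sum_{\substack{1\le i<N\\ 1\le j\le M}} \Bigl[p^x_{i+1,j} - p^x_{i,j} + p^y_{i+1,j} - p^y_{i+1,j-1} - \bigl(p^x_{i,j} - p^x_{i-1,j} + p^y_{i,j} - p^y_{i,j-1}\bigr)\Bigr]^2 \\
&\quad + \sum_{\substack{1\le i\le N\\ 1\le j<M}} \Bigl[p^x_{i,j+1} - p^x_{i-1,j+1} + p^y_{i,j+1} - p^y_{i,j} - \bigl(p^x_{i,j} - p^x_{i-1,j} + p^y_{i,j} - p^y_{i,j-1}\bigr)\Bigr]^2 \\
&\le 8\cdot\sum_{\substack{1\le i<N\\ 1\le j\le M}} \Bigl[(p^x_{i+1,j})^2 + (p^x_{i,j})^2 + (p^y_{i+1,j})^2 + (p^y_{i+1,j-1})^2 + (p^x_{i,j})^2 + (p^x_{i-1,j})^2 + (p^y_{i,j})^2 + (p^y_{i,j-1})^2\Bigr] \\
&\quad + 8\cdot\sum_{\substack{1\le i\le N\\ 1\le j<M}} \Bigl[(p^x_{i,j+1})^2 + (p^x_{i-1,j+1})^2 + (p^y_{i,j+1})^2 + (p^y_{i,j})^2 + (p^x_{i,j})^2 + (p^x_{i-1,j})^2 + (p^y_{i,j})^2 + (p^y_{i,j-1})^2\Bigr] \\
&\le 64\cdot\|p\|^2 \le 64 .
\end{aligned}$$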
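The bound $\kappa^2 \le 64$ can be checked numerically on a small grid. The following is a minimal sketch, not part of the paper: it assumes forward differences for $\nabla$ and the divergence formula used in the estimate above (with the zero convention on the boundary), and the function names `div`, `grad`, `norm` and `estimate_kappa` are hypothetical. A power iteration returns $\|\nabla\nabla\cdot p\|$ for a unit vector $p$, which is a lower estimate of $\kappa$.

```python
import math
import random

def div(px, py):
    """Discrete divergence with the convention p_{0,j} = p_{N,j} = p_{i,0} = p_{i,M} = 0."""
    n, m = len(px), len(px[0])
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            # Entries on the last row/column are treated as zero (convention above).
            x_here = px[i][j] if i < n - 1 else 0.0
            x_prev = px[i - 1][j] if i > 0 else 0.0
            y_here = py[i][j] if j < m - 1 else 0.0
            y_prev = py[i][j - 1] if j > 0 else 0.0
            out[i][j] = x_here - x_prev + y_here - y_prev
    return out

def grad(u):
    """Forward differences; the last row (x) and last column (y) are set to zero."""
    n, m = len(u), len(u[0])
    gx = [[u[i + 1][j] - u[i][j] if i < n - 1 else 0.0 for j in range(m)] for i in range(n)]
    gy = [[u[i][j + 1] - u[i][j] if j < m - 1 else 0.0 for j in range(m)] for i in range(n)]
    return gx, gy

def norm(px, py):
    """Euclidean norm of a vector field given by its two components."""
    return math.sqrt(sum(v * v for row in px for v in row)
                     + sum(v * v for row in py for v in row))

def estimate_kappa(n, m, iterations=200, seed=0):
    """Power iteration on p -> grad(div(p)); returns ||grad div p|| for a unit p."""
    rng = random.Random(seed)
    px = [[rng.uniform(-1.0, 1.0) for _ in range(m)] for _ in range(n)]
    py = [[rng.uniform(-1.0, 1.0) for _ in range(m)] for _ in range(n)]
    kappa = 0.0
    for _ in range(iterations):
        scale = norm(px, py)
        px = [[v / scale for v in row] for row in px]
        py = [[v / scale for v in row] for row in py]
        px, py = grad(div(px, py))
        kappa = norm(px, py)
    return kappa

kappa = estimate_kappa(12, 10)
print(kappa ** 2 <= 64.0)  # the estimate above guarantees this holds
```

Since the returned value never exceeds $\kappa$, the check only confirms that the estimate is not violated on the chosen grid; it does not replace the proof.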
Remark 2.1. In our numerical computations we stop the fixed point iteration (2.23) as soon as the distance between the iterates is small enough, i.e.,
$$\|p^{n+1} - p^n\| \le e\cdot\|p^{n+1}\|, \tag{2.25}$$
where $e$ is a chosen error bound.
Then, in summary, to minimize (2.19) we apply the following algorithm:
Algorithm (P)
• For an initial $p^0 = 0$, iterate (2.23) until (2.25);
• Set $P^1_{\lambda K}(\Delta^{-1}g) = \lambda\nabla\cdot p^{\hat n}$, where $\hat n$ is the first iterate of (2.23) which fulfills (2.25);
• Compute a minimizer $u$ of (2.19) by
$$u = g + \Delta\bigl(P^1_{\lambda K}(\Delta^{-1}g)\bigr) = g + \Delta\bigl(\lambda\nabla\cdot p^{\hat n}\bigr) .$$
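The two building blocks of Algorithm (P), the pointwise update in (2.23) and the stopping rule (2.25), can be sketched as follows. This is an illustration under assumptions, not the author's implementation: the field `a` stands for $\nabla\Delta(\nabla\cdot p^n - \Delta^{-1}(g/\lambda))$ and is assumed to be computed elsewhere, the norm in (2.25) is taken to be the Euclidean one, and the function names `update_p` and `has_converged` are hypothetical.

```python
import math

def update_p(p, a, tau):
    """One pointwise step p <- (p - tau * a) / (1 + tau * |a|) of iteration (2.23).

    p and a are lists of 2-vectors, one per pixel; a is the field
    grad Laplace(div p^n - Laplace^{-1}(g / lambda)), assumed given.
    """
    out = []
    for (px, py), (ax, ay) in zip(p, a):
        denom = 1.0 + tau * math.hypot(ax, ay)
        out.append(((px - tau * ax) / denom, (py - tau * ay) / denom))
    return out

def has_converged(p_new, p_old, e=1e-4):
    """Stopping rule (2.25): ||p_new - p_old|| <= e * ||p_new|| (Euclidean norm assumed)."""
    diff = math.sqrt(sum((x1 - x0) ** 2 + (y1 - y0) ** 2
                         for (x1, y1), (x0, y0) in zip(p_new, p_old)))
    size = math.sqrt(sum(x * x + y * y for x, y in p_new))
    return diff <= e * size

# Toy usage with made-up values of the field a and the step size tau = 1/64.
p0 = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]
a = [(3.0, 4.0), (0.0, 0.0), (-1.0, 0.0)]
p1 = update_p(p0, a, tau=1.0 / 64.0)
print(has_converged(p1, p0))
```

The division by $1 + \tau|a|$ keeps every $|p_{i,j}| \le 1$ whenever the previous iterate satisfies this constraint, which is the property used in the convergence proof.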
The second step is to use the presented algorithm for (2.19) in order to solve (1.1), i.e.,
$$\min_u \Bigl\{ J(u) = |Du|(\Omega) + \frac{1}{2\lambda}\|Tu - g\|_{-1}^2 \Bigr\} .$$
To do so we first approximate a minimizer of (1.1) iteratively by a sequence of minimizers of, what we
call, surrogate functionals J s . This approach is inspired by similar methods used, e.g., in , .
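Let $\tau > 0$ be a fixed stepsize. Starting with an initial condition $u^0 = g$, we solve for $k \ge 0$
$$u^{k+1} = \operatorname{argmin}_u\, J^s(u, u^k) = |Du|(\Omega) + \frac{1}{2\tau}\bigl\|u - u^k\bigr\|_{-1}^2 + \frac{1}{2\lambda}\bigl\|u - \bigl(g + (Id - T)u^k\bigr)\bigr\|_{-1}^2 . \tag{2.26}$$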
Note that $J^s(u, u) = J(u)$, so that a fixed point of the iteration (2.26), i.e., a function $u$ which minimizes $J^s(\cdot, u)$, is a potential minimizer for $J$. A rigorous derivation of convergence properties is still missing and is a matter of future investigation. Note however that in the case of image inpainting, i.e., $T = \chi_{\Omega\setminus D}$ and $g$ replaced by $\chi_{\Omega\setminus D}\, g$, the optimality condition of (2.26) indeed describes a fourth order diffusion inside the inpainting domain $D$. Hence, in this case, minimizing (2.26) describes the behaviour of solutions of the inpainting approach (1.12) rather than directly minimizing (1.1), cf. also Subsection 1.3 and especially (1.13). Despite the missing theory, the numerical results obtained by using this scheme for inpainting suggest its correct asymptotic behaviour, see Section 3.2.
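Now, the corresponding optimality condition to (2.26) reads
$$0 \in \partial|Du|(\Omega) + \frac{1}{\tau}\Delta^{-1}(u - u^k) + \frac{1}{\lambda}\Delta^{-1}\bigl(u - \bigl(g + (Id - T)u^k\bigr)\bigr),$$
which can be rewritten as
$$\Delta^{-1}\Bigl(\frac{g_1 - u}{\tau} + \frac{g_2 - u}{\lambda}\Bigr) \in \partial|Du|(\Omega),$$
where $g_1 = u^k$, $g_2 = g + (Id - T)u^k$. Setting (with a slight abuse of notation, since $g$ is redefined here)
$$g = \frac{g_1\lambda + g_2\tau}{\lambda + \tau}, \qquad \mu = \frac{\lambda\tau}{\lambda + \tau},$$
we end up with the same inclusion as (2.20), i.e.,
$$\frac{\Delta^{-1}(g - u)}{\mu} \in \partial|Du|(\Omega),$$
and Algorithm (P) for solving (2.19) can be directly applied.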
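For completeness, the reduction of the two fidelity terms to a single one is a direct computation. With $g_1 = u^k$ and $g_2 = g + (Id - T)u^k$ as above, and writing $\tilde g$ (a symbol used only in this sketch) for the merged datum that the text again calls $g$:

```latex
\frac{g_1 - u}{\tau} + \frac{g_2 - u}{\lambda}
  = \frac{\lambda g_1 + \tau g_2 - (\lambda + \tau)\,u}{\lambda\tau}
  = \frac{\lambda + \tau}{\lambda\tau}
    \Bigl(\frac{g_1\lambda + g_2\tau}{\lambda + \tau} - u\Bigr)
  = \frac{\tilde g - u}{\mu},
\qquad
\tilde g = \frac{g_1\lambda + g_2\tau}{\lambda + \tau},\quad
\mu = \frac{\lambda\tau}{\lambda + \tau}.
```

Applying $\Delta^{-1}$, which is linear, to both sides gives the inclusion of the form (2.20) with data $\tilde g$ and parameter $\mu$.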
3. Applications
In this section we present applications of our new algorithm for solving (1.1) for image denoising, decomposition and inpainting, and present numerical results obtained. For comparison, we also present results for the TV-$L^2$ model in  on the same images.
Now, in order to compute the minimizer $u$ of (1.1), we have the following algorithm.
Algorithm TV-$H^{-1}$:
• In the case $T = Id$ directly apply Algorithm (P) to compute a minimizer of (1.1).
• In the case $T \neq Id$ iterate (2.26) by solving Algorithm (P) in every iteration step until the two subsequent iterates $u^k$ and $u^{k+1}$ are sufficiently close.
Note that in our numerical examples $e$ in (2.25) is chosen to be $10^{-4}$.
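The outer loop for the case $T \neq Id$ can be sketched as follows. This is a minimal illustration under assumptions: the callable `solve_P` stands in for Algorithm (P) applied to (2.19) with a given datum and parameter, `apply_T` applies the operator $T$, and the closeness of subsequent iterates is measured in the maximum norm with a tolerance `tol`; none of these names or choices come from the paper.

```python
def tv_hminus1_outer_loop(g, apply_T, solve_P, lam, tau, tol=1e-6, max_iter=500):
    """Iterate (2.26): merge the two fidelity terms, then call the inner solver.

    g        -- data as a flat list of floats
    apply_T  -- callable applying the operator T to a list (assumed given)
    solve_P  -- callable (data, mu) -> minimizer of (2.19); placeholder for Algorithm (P)
    """
    mu = lam * tau / (lam + tau)
    u = list(g)  # initial condition u^0 = g
    for _ in range(max_iter):
        Tu = apply_T(u)
        g1 = u                                               # g1 = u^k
        g2 = [gi + ui - ti for gi, ui, ti in zip(g, u, Tu)]  # g2 = g + (Id - T) u^k
        merged = [(a * lam + b * tau) / (lam + tau) for a, b in zip(g1, g2)]
        u_next = solve_P(merged, mu)
        change = max(abs(a - b) for a, b in zip(u_next, u))
        u = u_next
        if change <= tol:  # subsequent iterates are sufficiently close
            break
    return u

# Toy usage: T = 0.5 * Id and an inner "solver" that just returns its data
# (i.e. no total variation term), so the iteration converges to 2 * g.
result = tv_hminus1_outer_loop(
    g=[1.0, 2.0],
    apply_T=lambda u: [0.5 * x for x in u],
    solve_P=lambda data, mu: list(data),
    lam=0.1,
    tau=0.1,
)
print(result)
```

In the toy run the inner solver ignores the regularization, so the loop only demonstrates the control flow and the merging of $g_1$ and $g_2$; in an actual computation `solve_P` would run Algorithm (P) with parameter $\mu$.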
Figure 3. Image of a horse and its noisy version with additive white noise. Panels: (a) original; (b) noisy, SNR = 25.2.
3.1. Image Denoising and Decomposition. In the case of image denoising and image decomposition the operator $T = Id$ and thus Algorithm (P) can be directly applied. For image denoising the signal to noise ratio (SNR) is computed as
$$\mathrm{SNR} = 20\log\Bigl(\frac{\langle g\rangle}{\sigma}\Bigr),$$
with $\langle g\rangle$ the average value of the pixels $g_{i,j}$ and $\sigma$ the standard deviation of the noise. For our numerical results the parameter $\lambda$ in (1.1) was chosen so that the best residual-mean-squared-error (RMSE) is obtained. We define the RMSE as
$$\mathrm{RMSE} = \frac{1}{NM}\sqrt{\sum_{\substack{1\le i\le N\\ 1\le j\le M}} (u_{i,j} - \hat u_{i,j})^2},$$
where $\hat u$ is the original image without noise, cf. . Numerical examples for image denoising with TV-$H^{-1}$ minimization and their comparison with the results obtained by the TV-$L^2$ approach are presented in Figures 3-6. In both examples the superiority of the TV-$H^{-1}$ minimization approach with respect to the separation of noise and edges is clearly visible.
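The two quality measures can be computed directly from their definitions. The sketch below follows the formulas as stated above, including the factor $1/(NM)$ outside the square root in the RMSE; the base-10 logarithm in the SNR is an assumption (the text does not state the base), and the function names `snr` and `rmse` are hypothetical.

```python
import math

def snr(g, sigma):
    """SNR = 20 * log(<g> / sigma), with <g> the mean pixel value (base-10 log assumed)."""
    pixels = [v for row in g for v in row]
    mean = sum(pixels) / len(pixels)
    return 20.0 * math.log10(mean / sigma)

def rmse(u, u_hat):
    """RMSE as defined in the text: (1 / (N * M)) * sqrt(sum of squared differences)."""
    n, m = len(u), len(u[0])
    total = sum((u[i][j] - u_hat[i][j]) ** 2 for i in range(n) for j in range(m))
    return math.sqrt(total) / (n * m)

# Toy usage on 2 x 2 "images" with made-up values.
print(snr([[10.0, 10.0], [10.0, 10.0]], sigma=1.0))         # 20.0
print(rmse([[1.0, 2.0], [3.0, 4.0]], [[1.0, 2.0], [3.0, 2.0]]))  # 0.5
```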
We also apply (1.1) for texture removal in images, i.e., image decomposition, and compare the numerical results with those of the TV-$L^2$ approach, cf. Figure 7. The cartoon-texture decomposition in this example works better in the case of TV-$H^{-1}$ minimization, since this approach differentiates between small oscillations and strong edges better than the TV-$L^2$ approach.
3.2. Image Inpainting. In order to apply our algorithm to TV-$H^{-1}$ inpainting we follow the method of surrogate functionals from Section 2. In fact it turns out that a fixed point of the corresponding optimality condition of (2.26) with $T = \chi_{\Omega\setminus D}$ and $g$ replaced by $\chi_{\Omega\setminus D}\, g$ is indeed a stationary solution of (1.12). This approach is also motivated by the fixed point approach used in  in order to prove existence of a stationary solution of (1.12). Hence a stationary solution to (1.12) can be computed iteratively by the following algorithm: Take $u^0 = g$, with any trivial (zero) expansion to the inpainting domain, and solve for $k \ge 0$
$$\min_{u\in BV(\Omega)} \Bigl\{ |Du|(\Omega) + \frac{1}{2\tau}\|u - u^k\|_{-1}^2 + \frac{1}{2\lambda}\bigl\|u - \chi_{\Omega\setminus D}\, g - (1 - \chi_{\Omega\setminus D})u^k\bigr\|_{-1}^2 \Bigr\} \to u^{k+1}, \tag{3.27}$$
Figure 4. Denoising results for the image of a horse in Figure 3. Results from the TV-$L^2$ denoising model compared with TV-$H^{-1}$ denoising with $\lambda = 0.05$ for both. Panels: (a) $g = u + v$; (b) TV-$L^2$: $u$; (c) TV-$L^2$: $v$; (d) TV-$H^{-1}$: $u$; (e) TV-$H^{-1}$: $v$.
Figure 5. Image of the roof of a house in Scotland and its noisy version with additive white noise. Panels: (a) original; (b) noisy, SNR = 29.4.
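for positive iteration steps $\tau > 0$. Now, as before, let $g_1 = u^k$, $g_2 = \chi_{\Omega\setminus D}\, g + (1 - \chi_{\Omega\setminus D})u^k$ and
$$g = \frac{g_1\lambda + g_2\tau}{\lambda + \tau}, \qquad \mu = \frac{\lambda\tau}{\lambda + \tau},$$
then we end up with the same inclusion as (2.20), i.e.,
$$\frac{\Delta^{-1}(g - u)}{\mu} \in \partial|Du|(\Omega),$$
and Algorithm (P) can be directly applied. Compare Figure 8 for a numerical example.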
3.3. Domain Decomposition for TV-$H^{-1}$ Minimization. As already discussed in Section 1.4, one of the drawbacks of using TV-$H^{-1}$ minimization in applications is its slow numerical performance. Even with our new algorithm, i.e., Algorithm (P), we are restricted to time steps $\tau \le 1/64$. If, in addition, the data dimension is large, e.g., when we have to process 2D images of high resolution, of sizes 3000 × 3000 pixels for instance, or even 3D image data, each iteration step itself is computationally expensive and we are far away from real-time computations. Total variation approaches, for example, have already turned out to be an effective tool for the reconstruction of medical images such as the ones gained from PET (Positron Emission Tomography) measurements (cf. , for instance). These imaging approaches need to deal with 3D or even 4D image data (including time dependence) in a fast and robust way. Motivated by this, we considered alternative ways to reduce the dimensionality of the data and hence speed up the reconstruction process. This is the subject of this section, where we present a domain decomposition approach to be applied to TV-$H^{-1}$ minimization models.
Domain decomposition methods were introduced as techniques for solving partial differential equations based on a decomposition of the spatial domain of the problem into several subdomains. This means that instead of solving one big problem, i.e., the equation on the whole domain $\Omega$, a number of local subproblems is solved, i.e., the equation on $\Omega_1, \dots, \Omega_N$ where $\Omega = \bigcup_{i=1}^N \Omega_i$. Thereby the main outcome of this method is a dimension reduction of the problem, whose impact is shown when computed on parallel processors. In  the authors proposed a domain decomposition algorithm which is applicable to TV-$L^2$ minimization. The difficulty of doing this for total variation approaches is that solutions may be discontinuous, and hence their correct treatment on the interfaces of the domain decomposition patches is not straightforward. In the following we will roughly present how to modify the algorithm in  for TV-$H^{-1}$ minimization. In particular, we want to investigate splittings of the domain $\Omega$ into arbitrary nonoverlapping domains $\Omega = \Omega_1 \cup \Omega_2$ with $\Omega_1 \cap \Omega_2 = \emptyset$. For simplicity,
Figure 6. Denoising results for the image of the roof in Figure 5. Results from the TV-$L^2$ denoising model with $\lambda = 0.05$ compared with TV-$H^{-1}$ denoising with $\lambda = 0.01$. Panels: (a) $g = u + v$; (b) TV-$L^2$: $u$; (c) TV-$L^2$: $v$; (d) TV-$H^{-1}$: $u$; (e) TV-$H^{-1}$: $v$.
we shall only consider a splitting in two domains. However note that, as in , the algorithm can be easily extended to more than two subdomains.
Following the notation in , let $H = L^2(\Omega)$ and $V_i = L^2(\Omega_i)$, where $H = V_1 \oplus V_2$. Let further $H^\psi = BV(\Omega)$ and $V_i^\psi = BV(\Omega_i)$, $i = 1, 2$. We take the norm in $BV(\Omega)$ to be $\|\cdot\|_{BV(\Omega)} = |D\cdot|(\Omega) + \|\cdot\|_{L^2(\Omega)}$. Let us further denote by $\Pi_{V_i}$ the orthogonal projection onto $V_i$. This setting is the
Figure 7. Decomposition into cartoon and texture of a synthetic image. Results from the TV-$L^2$ model with $\lambda = 1$ and TV-$H^{-1}$ minimization with $\lambda = 0.1$. Panels: (a) $g = u + v$; (b) TV-$L^2$: $u$; (c) TV-$L^2$: $v$; (d) TV-$H^{-1}$: $u$; (e) TV-$H^{-1}$: $v$.
Figure 8. TV-$H^{-1}$ inpainting result for the image of the two statues with $\lambda = 0.005$.
same as the one for TV-$L^2$ minimization in Example 2.1 in , with the only difference that the fidelity term in (1.1) is minimized in the weaker $H^{-1}$ norm instead of the norm in $L^2(\Omega)$. However, considering (1.1) within this setting, we fulfill all the necessary properties, assumed in , in order to apply their theory to our case. In what follows we will sketch the main arguments for this.
Now, we want to minimize $J$ in (1.1) by the following alternating algorithm: Pick an initial $V_1 \oplus V_2 \ni u_1^0 + u_2^0 := u^0 \in BV(\Omega)$, for example $u^0 = 0$, and iterate
$$\begin{cases}
u_1^{n+1} \approx \operatorname{argmin}_{u_1\in V_1} J(u_1 + u_2^n) \\
u_2^{n+1} \approx \operatorname{argmin}_{u_2\in V_2} J(u_1^{n+1} + u_2) \\
u^{n+1} := u_1^{n+1} + u_2^{n+1} .
\end{cases}$$
In  this algorithm is implemented by solving the subspace minimization problems via an oblique thresholding iteration. Let us present the strategy in short for the TV-$H^{-1}$ case. The subproblem on $V_1$ reads
$$\operatorname{argmin}_{u_1\in V_1} J(u_1 + u_2) = |D(u_1 + u_2)|(\Omega) + \frac{1}{2\lambda}\|Tu_1 - (g - Tu_2)\|_{-1}^2 . \tag{3.28}$$
As in Section 2 in (2.26), we introduce a sequence of surrogate functionals for this subminimization problem, i.e., let $u^0 = 0$, for $k \ge 0$ let
$$J_1^s(u_1 + u_2, u_1^k) = |D(u_1 + u_2)|(\Omega) + \frac{1}{2\tau}\bigl\|u_1 - u_1^k\bigr\|_{-1}^2 + \frac{1}{2\lambda}\bigl\|u_1 - \bigl(g - Tu_2 + (Id - T)u_1^k\bigr)\bigr\|_{-1}^2 .$$
We want to realize an approximate solution to (3.28) by using the following algorithm: For $u_1^0 \in BV(\Omega_1)$,
$$u_1^{k+1} = \operatorname{argmin}_{u_1\in V_1} J_1^s(u_1 + u_2, u_1^k), \quad k \ge 0. \tag{3.29}$$
Problem (3.29) can be reformulated as
$$u_1^{k+1} = \operatorname{argmin}_{u\in BV(\Omega)} \bigl\{F(u),\ \Pi_{V_2}(u) = 0\bigr\},$$
with $F(u) = J_1^s(u + u_2, u_1^k)$. Like in  we are going to use the following theorem:
Theorem 3.1. (Theorem 4.3 in ) We consider the following problem
$$\operatorname{argmin}_{x\in V} \{F(x) : G(x) = 0\}, \tag{3.30}$$
where $G : V \to R$ is a bounded linear operator on $V$. If $F$ is continuous in a point of $\ker G$ and $G^*$ has closed range in $V$, then a point $x_0 \in \ker G$ is an optimal solution of (3.30) if and only if
$$\partial F(x_0) \cap \operatorname{Range} G^* \neq \emptyset .$$
Now, since $L^2(\Omega) \subset L^2(\mathbb{R}^2) \subset H^{-1}(\Omega)$ (by zero extensions of functions on $\Omega$ to $\mathbb{R}^2$), our functional $F$ is continuous on $V_1^\psi \subset V_1 = \ker \Pi_{V_2}$ in the norm topology of $BV(\Omega)$. Further $\Pi_{V_2}|_{BV(\Omega)}$ is a bounded and surjective map with closed range in the norm topology of $BV(\Omega)$, i.e., $(\Pi_{V_2}|_{BV(\Omega)})^*$ is injective and $\operatorname{Range}(\Pi_{V_2}|_{BV(\Omega)})^* \cong (V_2^\psi)'$ is closed. By applying Theorem 3.1, we know that the optimality of $u_1^{k+1}$ is equivalent to the existence of an $\eta \in \operatorname{Range}(\Pi_{V_2}|_{BV(\Omega)})^* \cong (BV(\Omega_2))'$ such that
$$-\eta \in \partial_{BV(\Omega)} F(u_1^{k+1}),$$
where $\partial_{BV(\Omega)}$ denotes the subdifferential of $F$ on $BV(\Omega)$. Now
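$$\partial_{BV(\Omega)} F(u_1^{k+1}) = \frac{1}{\mu}\Delta^{-1}(u_1^{k+1} - z) + \partial_{BV(\Omega)}\bigl|D(u_1^{k+1} + u_2)\bigr|(\Omega),$$
where
$$z = \frac{z_1\lambda + z_2\tau}{\lambda + \tau}, \qquad \mu = \frac{\lambda\tau}{\lambda + \tau},$$
with $z_1 = u_1^k$, $z_2 = g - Tu_2 + (Id - T)u_1^k$. Then the optimality of $u_1^{k+1}$ is equivalent to
$$0 \in \frac{1}{\mu}\Delta^{-1}(u_1^{k+1} - z) + \eta + \partial_{BV(\Omega)}\bigl|D(u_1^{k+1} + u_2)\bigr|(\Omega) .$$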
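The latter is equivalent to
$$u_1^{k+1} + u_2 \in \partial_{BV(\Omega)}|D\cdot|(\Omega)^*\Bigl(\frac{1}{\mu}\Delta^{-1}(z - u_1^{k+1}) - \eta\Bigr),$$
i.e.,
$$\frac{u_2 + z}{\mu} \in \frac{z - u_1^{k+1}}{\mu} + \frac{1}{\mu}\,\partial_{BV(\Omega)}|D\cdot|(\Omega)^*\Bigl(\frac{1}{\mu}\Delta^{-1}(z - u_1^{k+1}) - \eta\Bigr) .$$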
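By letting $w = \Delta^{-1}(z - u_1^{k+1})/\mu - \eta$ we have
$$0 \in \bigl(-\Delta(w + \eta) - (u_2 + z)/\mu\bigr) + \frac{1}{\mu}\,\partial_{BV(\Omega)}|D\cdot|(\Omega)^*(w),$$
or, in other words, $w$ is a minimizer of
$$\frac{\bigl\|w - \bigl(\Delta^{-1}(u_2 + z)/\mu - \eta\bigr)\bigr\|_{H_0^1(\Omega)}^2}{2} + \frac{1}{\mu}|D\cdot|(\Omega)^*(w) .$$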
Following the same procedure as in Section 2 we get that
$$w = P^1_K\bigl(\Delta^{-1}(u_2 + z)/\mu - \eta\bigr),$$
where $P^1_K$ denotes the orthogonal projection on $K$ over $H_0^1(\Omega)$ like in Section 2. Then a minimizer $u_1^{k+1}$ of (3.29) can be computed as
$$u_1^{k+1} = -\Delta\bigl(Id - P^1_{\mu K}\bigr)\bigl(\Delta^{-1}(z + u_2) - \mu\eta\bigr) - u_2 .$$
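By applying $\Pi_{V_2}$ to both sides of the latter equality we get
$$0 = \mu\Delta\eta + \Pi_{V_2}\Delta P^1_{\mu K}\bigl(\Delta^{-1}(u_2 + z) - \mu\eta\bigr) .$$
Assuming necessary zero boundary conditions on $\partial\Omega$ the resulting fixed point equation for $\eta$ reads
$$\eta = \frac{1}{\mu}\Pi_{V_2} P^1_{\mu K}\bigl(\mu\eta - \Delta^{-1}(u_2 + z)\bigr) .$$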
Like in  this fixed point can be computed via the iteration
$$\eta^0 \in V_2, \qquad \eta^{m+1} = \frac{1}{\mu}\Pi_{V_2} P^1_{\mu K}\bigl(\mu\eta^m - \Delta^{-1}(u_2 + z)\bigr), \quad m \ge 0.$$
Figure 9. TV-$H^{-1}$ inpainting computation for a model example on four domains with $\lambda = 0.01$. Panels: (a) given image; (b) intermediate inpainting iterate; (c) inpainting result.
In sum we solve (1.1) by the alternating subspace minimizations: Pick an initial $V_1 \oplus V_2 \ni u_1^{0,L} + u_2^{0,M} := u^0 \in BV(\Omega)$, for example $u^0 = 0$, and iterate
$$\begin{cases}
u_1^{n+1,0} = u_1^{n,L} \\
u_1^{n+1,\ell+1} = \operatorname{argmin}_{u_1\in V_1} J_1^s(u_1 + u_2^{n,M}, u_1^{n+1,\ell}), & \ell = 0, \dots, L-1 \\
u_2^{n+1,0} = u_2^{n,M} \\
u_2^{n+1,m+1} = \operatorname{argmin}_{u_2\in V_2} J_2^s(u_1^{n+1,L} + u_2, u_2^{n+1,m}), & m = 0, \dots, M-1 \\
u^{n+1} := u_1^{n+1,L} + u_2^{n+1,M},
\end{cases} \tag{3.31}$$
where each subminimization problem is computed by the oblique thresholding algorithm
Oblique Thresholding for TV-$H^{-1}$ Minimization
$$u_i^{k+1} = -\Delta\bigl(Id - P^1_{\mu K}\bigr)\bigl(\Delta^{-1}(z + u_j) - \mu\eta\bigr) - u_j, \tag{3.32}$$
where $\eta$ in (3.32) is computed via the fixed point iteration
Let $\eta^0 = 0 \in V_2$ and iterate
$$\eta^{m+1} = \frac{1}{\mu}\Pi_{V_2} P^1_{\mu K}\bigl(\mu\eta^m - \Delta^{-1}(u_j + z)\bigr), \quad i = 1, 2,\ i \neq j,\ m \ge 0. \tag{3.33}$$
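The control flow of the fixed point iteration (3.33) can be sketched as follows. This is only an illustration under assumptions: the three operators are passed in as callables, where `project_muK` stands in for $P^1_{\mu K}$ (computed by Algorithm (P) in the paper), `inv_laplace` for $\Delta^{-1}$ and `restrict_V2` for $\Pi_{V_2}$. All names, the fixed number of steps and the toy operators in the usage example are hypothetical.

```python
def eta_fixed_point(u_j, z, mu, project_muK, inv_laplace, restrict_V2, steps=50):
    """Iterate (3.33): eta <- (1/mu) * Pi_V2 P_{mu K}(mu * eta - inv_laplace(u_j + z)).

    All fields are flat lists of floats; the three callables are supplied by the
    caller and stand in for the operators of the paper.
    """
    rhs = inv_laplace([a + b for a, b in zip(u_j, z)])  # Delta^{-1}(u_j + z), computed once
    eta = [0.0] * len(z)                                # eta^0 = 0
    for _ in range(steps):
        arg = [mu * e - r for e, r in zip(eta, rhs)]
        eta = [v / mu for v in restrict_V2(project_muK(arg))]
    return eta

# Toy usage with stand-in operators (NOT the operators of the paper):
# clipping to [-mu, mu] instead of the projection onto mu*K, the identity
# instead of Delta^{-1}, and zeroing the first half instead of Pi_{V_2}.
mu = 0.5
eta = eta_fixed_point(
    u_j=[0.0, 0.0, 0.0, 0.0],
    z=[1.0, 1.0, 0.05, -3.0],
    mu=mu,
    project_muK=lambda v: [max(-mu, min(mu, x)) for x in v],
    inv_laplace=lambda v: list(v),
    restrict_V2=lambda v: [0.0, 0.0] + list(v[2:]),
)
print(eta)
```

The toy operators only exercise the loop; with the actual projection, inverse Laplacian and restriction, the resulting $\eta$ would be inserted into (3.32).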
As before the projection $P^1_{\mu K}$ is computed by Algorithm (P). Note that, as in , a parallel version of (3.31) can be obtained by a slight modification of the update, i.e., $u^{n+1} := (u^n + u_1^{n+1,L} + u_2^{n+1,M})/2$.
In Figures 9 and 10 the given image was divided into four subdomains, marked by the red lines, and the image is inpainted via (3.31).
Acknowledgments
This paper was initiated during my participation at the workshop Singularities in nonlinear evolution phenomena and applications, held at the Centro di Ricerca Matematica Ennio De Giorgi on May
26-29, 2008, organized by Sisto Baldo, Matteo Novaga and Giandomenico Orlandi. The author thanks
the research center and the workshop organizers for their hospitality and a very interesting workshop.
Further, CBS would like to thank Massimo Fornasier for a sequence of fruitful discussions and Antonin
Chambolle for his very helpful suggestions concerning the generalization of his algorithm to the H −1
constrained case.
Figure 10. TV-H −1 inpainting computation on four domains with λ = 0.005.
The author also acknowledges the following funds for their financial support. CBS is partially supported by the project WWTF Five senses-Call 2006, Mathematical Methods for Image Analysis and
Processing in the Visual Arts project nr. CI06 003, by the FFG project Erarbeitung neuer Algorithmen
zum Image Inpainting project nr. 813610 and by KAUST (King Abdullah University of Science and
Technology).
References
 R. Acar, C.R. Vogel, Analysis of bounded variation penalty methods for ill-posed problems, Inverse Problems 10:
1217-1229, 1994.
 L. Ambrosio, N. Fusco, and D. Pallara, Functions of bounded variation and free discontinuity problems., Oxford
Mathematical Monographs. Oxford: Clarendon Press. xviii, 2000.
 G. Aubert, and P. Kornprobst, Mathematical Problems in Image Processing. Partial Differential Equations and
the Calculus of Variations, Springer, Applied Mathematical Sciences, Vol 147, 2006.
 G. Aubert and L. Vese, A variational method in image recovery., SIAM J. Numer. Anal. 34 (1997), no. 5, 1948–1979.
 J-F Aujol and A Chambolle, Dual norms and image decomposition models, International Journal of Computer
Vision, volume 63, number 1, pages 85-104, June 2005.
 J-F Aujol and G Gilboa, Constrained and SNR-based Solutions for TV-Hilbert Space Image Denoising, Journal of
Mathematical Imaging and Vision, volume 26, numbers 1-2, pages 217-237, November 2006.
 J. Bect, L. Blanc-Féraud, G. Aubert, and A. Chambolle, A ℓ1 -unified variational framework for image restoration.
In T. Pajdla and J. Matas, editors, Proceedings of the 8th European Conference on Computer Vision, vol. IV, Prague,
Czech Republic, 2004. Springer Verlag.
 M. Bertalmio, G. Sapiro, V. Caselles, and C. Ballester, Image Inpainting, Siggraph 2000, Computer Graphics
Proceedings, pp.417–424, 2000.
 A. Bertozzi, S. Esedoglu, and A. Gillette, Inpainting of Binary Images Using the Cahn-Hilliard Equation. IEEE
Trans. Image Proc. 16(1) pp. 285-291, 2007.
 A. Bertozzi, S. Esedoglu, and A. Gillette, Analysis of a two-scale Cahn-Hilliard model for image inpainting, Multiscale Modeling and Simulation, vol. 6, no. 3, pages 913-936, 2007.
 A. Bertozzi, C.-B. Schönlieb, Unconditionally stable schemes for higher order inpainting, in preparation.
 A. Braides, Gamma-Convergence for Beginners. Nr. 22 in Oxford Lecture Series in Mathematics and its Applications. Oxford University Press, 2002.
 M. Burger, L. He, C. Schönlieb, Cahn-Hilliard inpainting and a generalization for grayvalue images, UCLA CAM
report 08-41, June 2008.
 A. Chambolle, An Algorithm for Total Variation Minimization and Applications, J. Math. Imaging Vis. 20, 1-2,
pp. 89-97, 2004.
 A. Chambolle and P.-L. Lions, Image recovery via total variation minimization and related problems., Numer.
Math. 76 (1997), no. 2, 167–188.
 T. F. Chan and J. Shen, Mathematical models for local non-texture inpaintings, SIAM J. Appl. Math.,
62(3):1019-1043, 2001.
 T. F. Chan and J. Shen, Non-texture inpainting by curvature driven diffusions (CDD), J. Visual Comm. Image
Rep., 12(4):436-449, 2001.
 T. F. Chan and J. Shen, Variational restoration of non-flat image features: models and algorithms, SIAM J. Appl.
Math., 61(4):1338-1361, 2001.
 T.F. Chan, S.H. Kang, and J. Shen, Euler’s elastica and curvature-based inpainting, SIAM J. Appl. Math., Vol. 63,
Nr.2, pp.564–592, 2002.
 T.F. Chan and J. Shen, Variational Image Inpainting, Comm. Pure Applied Math, Vol. 58, pp. 579-619, 2005.
 I. Daubechies, G. Teschke, and L. Vese, Iteratively solving linear inverse problems under general convex constraints,
Inverse Probl. Imaging 1 (2007), no. 1, 29–46.
 R. Dautray, and J.L. Lions, Mathematical Analysis and Numerical Methods for Science and Technology. Vol.2,
Springer-Verlag, 1988.
 D. C. Dobson and C. R. Vogel, Convergence of an iterative method for total variation denoising, SIAM J. Numer.
Anal. 34 (1997), no. 5, 1779–1791.
 S. Esedoglu, and J.-H. Shen, Digital inpainting based on the Mumford-Shah-Euler image model, Eur. J. Appl.
Math., 13:4, pp. 353-370, 2002.
 L. C. Evans and R. F. Gariepy, Measure Theory and Fine Properties of Functions., CRC Press, 1992.
 D. Eyre, An Unconditionally Stable One-Step Scheme for Gradient Systems, Jun. 1998, unpublished.
 M. Fornasier, and C.-B. Schönlieb, Subspace correction methods for total variation and ℓ1 − minimization, 33p,
arXiv: 0712.2258v1 [math.NA].
 E. Jonsson, S.-C. Huang, and T. Chan, Total Variation Regularization in Positron Emission Tomography, CAM
report 98-58, UCLA, November 1998.
 L. Lieu and L. Vese, Image restoration and decomposition via bounded total variation and negative Hilbert-Sobolev
spaces, Applied Mathematics & Optimization, Vol. 58, pp. 167-193, 2008.
 O. M. Lysaker and Xue-C. Tai, Iterative image restoration combining total variation minimization and a second-order functional, International Journal of Computer Vision, Vol.66, Nr.1, pp.5-18, 2006.
 S. Masnou and J. Morel, Level Lines based Disocclusion, 5th IEEE Int’l Conf. on Image Processing, Chicago, IL,
Oct. 4-7, 1998, pp.259–263, 1998.
 Y. Meyer, Oscillating Patterns in Image Processing and Nonlinear Evolution Equations, Univ. Lecture Ser. 22,
AMS, Providence, RI, 2002.
 D. Mumford, and B. Gidas, Stochastic models for generic images, Quart. Appl. Math., 59 (2001), pp. 85-111.
 Y. Nesterov, Smooth minimization of non-smooth functions, Math. Program. 103, 1, pp. 127-152, 2005.
 S. Osher, A. Sole, and L. Vese. Image decomposition and restoration using total variation minimization and the
$H^{-1}$ norm, Multiscale Modeling and Simulation: A SIAM Interdisciplinary Journal, Vol. 1, Nr. 3, pp. 349-370, 2003.
 L. I. Rudin, S. Osher, and E. Fatemi, Nonlinear total variation based noise removal algorithms, Physica D 60, pp.
259-268, 1992.
 L. Rudin and S. Osher, Total variation based image restoration with free local constraints, Proc. 1st IEEE ICIP,
1:31-35, 1994.
 A. Tsai, A. Yezzi, Jr., and A. S. Willsky, Curve evolution implementation of the Mumford-Shah functional for image
segmentation, denoising, interpolation and magnification, IEEE Trans. Image Process., 10(8):1169-1186, 2001.
 L. Vese, A study in the BV space of a denoising-deblurring variational problem., Appl. Math. Optim. 44 (2001),
131–161.
 L. Vese, S. Osher, Image Denoising and Decomposition with Total Variation Minimization and Oscillatory Functions, J. Math. Imaging Vision, 20 (2004), pp.7-18.
Carola-Bibiane Schönlieb
Department of Applied Mathematics and Theoretical Physics (DAMTP),
Centre for Mathematical Sciences,