# Image Processing – Variational and PDE Methods

Dr C.-B. Schönlieb, Mathematical Tripos Part III, Lent Term 2013/14

## Contents

- 1 Introduction
  - 1.1 What is a digital image?
  - 1.2 Image processing tasks
  - 1.3 Image processing approaches
  - 1.4 Motivation for nonlinear PDE and variational approaches
- 2 Mathematical Representation of Images
  - 2.1 Images as elements in a function space
  - 2.2 The Mumford-Shah image model
- 3 Variational Approach to Image Processing
  - 3.1 Mathematical preliminaries
  - 3.2 Image denoising
  - 3.3 Image reconstruction in the context of inverting a linear operator
  - 3.4 Image inpainting
  - 3.5 Image segmentation with Mumford-Shah
- 4 PDEs in Imaging
  - 4.1 Perona-Malik
  - 4.2 Anisotropic diffusion filters

## 1 Introduction

These lecture notes are the result of a graduate course given at the University of Cambridge. They are heavily based on various textbooks and review articles in the mathematical imaging literature. The main sources for the creation of this material are

• Aubert and Kornprobst's book in Springer, Applied Mathematical Sciences, '06 [AK06]: Sections 3.5 and 4.
• Bredies and Lorenz's book (in German) in Vieweg+Teubner, '11 [BL11]: Section 3.

• Chambolle et al.'s lecture notes in De Gruyter, '10 [CCCNP10]: Section 3.2.

• Chan and Shen's book in SIAM, '05 [CS05a]: Section 2.

### 1.1 What is a digital image?

In order to appreciate the following theory and the image processing applications, we first need to understand what a digital image really is. Roughly speaking, a digital image is obtained from an analogue image (representing the continuous world) by sampling and quantisation. Basically this means that the digital camera superimposes a regular grid on an analogue image and assigns a value, e.g., the mean brightness in this field, to each grid element. In the terminology of digital images these grid elements are called pixels. The image content is then described by grey values or colour values prescribed in each pixel. The grey values are scalar values ranging between 0 (black) and 255 (white). The colour values are vector valued, e.g., $(r,g,b)$, where the channels $r$, $g$ and $b$ represent the red, green and blue components of the colour and range, as the grey values do, from 0 to 255.

The mathematical representation of a digital image is a so-called image function $u$ defined on a two-dimensional (in general rectangular) image domain, the grid. This function is either scalar valued in the case of a grey value image, or vector valued in the case of a colour image. Here the function value $u(x,y)$ denotes the grey value (or colour value) of the image in the pixel $(x,y)$ of the image domain. Figure 1 visualises the connection between the digital image and its image function for the case of a grey value image.
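To fix ideas, the discrete picture can be sketched in a few lines. This is an illustration of my own (using NumPy, not part of the notes): a grey value image is simply a matrix of intensities between 0 and 255, one per pixel.

```python
import numpy as np

# A tiny 4x4 grey value "image": a bright square on a dark background.
# Entries are intensities between 0 (black) and 255 (white).
u = np.array([
    [ 10,  10,  10,  10],
    [ 10, 200, 200,  10],
    [ 10, 200, 200,  10],
    [ 10,  10,  10,  10],
], dtype=np.uint8)

# u[i, j] plays the role of the image function value u(x, y) at pixel (x, y).
print(u.shape)           # → (4, 4): image size in pixels
print(u.min(), u.max())  # → 10 200: grey values stay within [0, 255]
```

A colour image would simply be a three-channel array of shape `(rows, cols, 3)` holding the $(r,g,b)$ values.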
Figure 1: Digital image versus image function. On the very left, a zoom into a digital photograph where the image pixels (small squares) are clearly visible; in the middle, the grey values of the red selection in the digital photograph displayed in matrix form; on the very right, the image function of the digital photograph, where the grey value $u(x,y)$ is plotted as the height over the $(x,y)$-plane.

Typical sizes of digital images range from 2000 × 2000 pixels in images taken with a simple digital camera, to 10000 × 10000 pixels in images taken with high-resolution cameras used by professional photographers. The size of images in medical imaging applications depends on the task at hand. PET, for example, produces three-dimensional image data, where a full-length body scan has a typical size of 175 × 175 × 500 pixels.

Please email all corrections and suggestions to these notes to [email protected]

### 1.2 Image processing tasks

Image de-noising. In most acquisition processes for digital image data, wrong information is added to the image. Even modern cameras which are able to acquire high-resolution images produce noisy outputs, cf. Figure 2. In fact, the appearance of noise is an intrinsic problem in image processing. The task can be phrased as:

Identify and remove the noise while preserving the most important information and structures.

Figure 2: Bad lighting conditions may result in a noisy image. First: a digital photo which has been acquired under too little light. Second: plot of the grey values of the red channel along the one-dimensional slice marked in red in the photograph.

Figure 3: Blurred and de-blurred image using a total variation approach, see Section 3.3.

Remark 1.1. For the human eye, noise is an easy problem to cope with. If the noise is not too strong, we are still able to analyse an image for its contents.
However, for the computer this is not the case. This is important when aiming for the automated analysis of an image.

Image de-blurring. Image de-blurring denotes the task of removing blur in images. Blur can be caused by wrong focusing, shaking of the camera, atmospheric turbulence (for instance in earth-based astronomical imaging), or movement due to the breathing of the patient in medical imaging. The task here is:

Identify the blur and enhance the image by removing (or reducing) the blur.

The better the blurring process is understood, the better de-blurring works.

Segmentation. The goal is to segment an image into its different objects. The simplest situation is binary, that is, the segmentation into object and background. Image segmentation aims to:

Segment one or more objects of interest in an image, also in the presence of noise and blur.

Figure 4: Image segmentation with a level set method, compare Section 3.5.

Image inpainting. An important task in image processing is the process of filling in missing parts of damaged images based on the information obtained from the intact part of the image. It is essentially a type of interpolation and is called inpainting or disocclusion. This imaging task is drastically different from de-noising and de-blurring: in image inpainting all the information in certain pixels has been lost. The task is:

Reconstruct a guess for the original image contents inside the holes in an image, using the available image information from the intact parts of the image domain.

Figure 5: Damaged image (left) and its inpainted version (right), see Section 3.3.
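To make the inpainting task concrete, here is a minimal sketch of perhaps the simplest possible scheme: harmonic inpainting, which fills the hole with the solution of Laplace's equation, using the intact pixels as boundary data. This baseline is my own illustration, not the total variation method treated later in the notes; the function name is hypothetical.

```python
import numpy as np

def harmonic_inpaint(g, mask, n_iter=500):
    """Fill the pixels where mask is True by iterated averaging of the four
    neighbours (a Jacobi iteration for the Laplace equation in the hole);
    pixels with mask False are kept fixed as boundary data."""
    u = g.astype(float).copy()
    u[mask] = u[~mask].mean()            # crude initial guess inside the hole
    for _ in range(n_iter):
        avg = 0.25 * (np.roll(u, 1, 0) + np.roll(u, -1, 0)
                      + np.roll(u, 1, 1) + np.roll(u, -1, 1))
        u[mask] = avg[mask]              # update unknown pixels only
    return u

# Example: a flat image with a 2x2 hole is filled in exactly.
g = np.full((8, 8), 7.0)
mask = np.zeros((8, 8), dtype=bool)
mask[3:5, 3:5] = True
damaged = g.copy()
damaged[mask] = 0.0
restored = harmonic_inpaint(damaged, mask)
print(np.allclose(restored, 7.0))        # → True
```

Harmonic interpolation propagates smooth image content into the hole but, being linear and smoothing, it cannot continue edges; this is precisely what motivates the nonlinear methods of these notes.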
Image reconstruction from samples of linear transformations. In medical, seismic or biological imaging, or for some visualisation tasks in chemistry, physical imaging tools are employed to visualise the inside of the body, the earth, a cell, or chemical reactions. In such applications it is usually not the image itself that is measured, but samples of its Fourier transform or its Radon transform, for instance. In that case we want to:

Reconstruct an approximation of the image density from (usually under-)sampled transform data by "smoothly" inverting the transformation.

Remark 1.2. Other tasks, which we are not considering in this lecture, are image registration, edge detection, video processing and many more.

### 1.3 Image processing approaches

### 1.4 Motivation for nonlinear PDE and variational approaches

Let us motivate the consideration of such sophisticated approaches as nonlinear PDEs for the solution of imaging tasks by a brief discussion of image denoising. Noise in an image usually constitutes a highly oscillatory (high-frequency) component of the acquired image data. One way to think about denoising an image is to smooth that image, aiming to "smooth" away the noise. The simplest and best investigated method for smoothing images is to apply a linear filter to them. One example of such a filter is Gaussian smoothing.

Gaussian smoothing. For the following considerations we represent a grey scale image $g$ as a real-valued mapping $g \in L^1(\mathbb{R}^2)$. Gaussian smoothing denotes the construction of a smoothed version $u$ of $g$ by convolving $g$ with a Gaussian kernel, that is

$$u(x) = (G_\sigma * g)(x) := \int_{\mathbb{R}^2} G_\sigma(x-y)\, g(y)\, dy, \qquad (1.1)$$

where $G_\sigma$ denotes the two-dimensional Gaussian of width $\sigma > 0$:

$$G_\sigma(x) := \frac{1}{2\pi\sigma^2}\, e^{-|x|^2/(2\sigma^2)}.$$

This convolution is smoothing the image $g$ because $G_\sigma * g \in C^\infty(\mathbb{R}^2)$, even if $g$ is only absolutely integrable. To understand this smoothing process better, we investigate the effect of Gaussian smoothing in the frequency domain.
To do so, we define the Fourier transform $\mathcal{F}$ by

$$(\mathcal{F}g)(\omega) := \int_{\mathbb{R}^2} g(x)\, e^{-i\omega\cdot x}\, dx,$$

and get

$$(\mathcal{F}(G_\sigma * g))(\omega) = (\mathcal{F}G_\sigma)(\omega) \cdot (\mathcal{F}g)(\omega) = e^{-\sigma^2|\omega|^2/2} \cdot (\mathcal{F}g)(\omega).$$

This means that (1.1) is a low-pass filter that attenuates high frequencies in a monotone way.

Figure 6: Reconstruction of the Shepp-Logan phantom from 11% undersampling of the Fourier transform.

Relation to linear diffusion filtering. It is a classical result that for $g \in C(\mathbb{R}^2)$ bounded, the linear diffusion equation

$$u_t = \Delta u, \qquad u(x, t=0) = g(x), \qquad (1.2)$$

possesses the solution

$$u(x,t) = \begin{cases} g(x) & t = 0 \\ \big(G_{\sqrt{2t}} * g\big)(x) & t > 0. \end{cases} \qquad (1.3)$$

This solution is unique within the class of functions that satisfy

$$|u(x,t)| \le M e^{a|x|^2}, \qquad M, a > 0.$$

It depends continuously on the initial image $g$ with respect to the $L^\infty$ norm, and it fulfils the max-min principle

$$\inf_{\mathbb{R}^2} g \le u(x,t) \le \sup_{\mathbb{R}^2} g \quad \text{on } \mathbb{R}^2 \times [0,\infty),$$

cf. [Wei98]. Investigating (1.3) we find that the time $t$ of the solution of the linear diffusion equation is related to the spatial width $\sigma = \sqrt{2t}$; that is, the later in time the solution $u(x,t)$ is considered, the "flatter" it becomes (or the lower its frequencies). Hence, smoothing structures of order $\sigma$ requires stopping the diffusion process at time

$$T = \frac{1}{2}\sigma^2.$$

The problem with using Gaussian filtering, i.e. linear diffusion filtering, for image denoising is that the smoothing is isotropic: it does not depend on the image and it is the same in all directions. In particular, image edges are not preserved.

Nonlinear diffusion. We would like to find models for removing image noise while preserving one of the most important parts of image information, that is, image edges. To do so, it is essential to enrich the diffusion process with nonlinearity.
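Before turning to nonlinear models, the low-pass behaviour of (1.1) and the semigroup structure behind (1.3) can be checked numerically. The sketch below is my own illustration (not part of the notes): it implements Gaussian smoothing in the frequency domain on a periodic grid, assuming a square image, and verifies the heat-flow composition property that two successive blurs of widths $\sigma_1$ and $\sigma_2$ equal one blur of width $\sqrt{\sigma_1^2 + \sigma_2^2}$.

```python
import numpy as np

def gaussian_blur_fft(g, sigma):
    """Gaussian smoothing via the frequency domain: multiply the Fourier
    transform of g by exp(-sigma^2 |omega|^2 / 2), a monotone low-pass filter."""
    wx = 2 * np.pi * np.fft.fftfreq(g.shape[0])
    wy = 2 * np.pi * np.fft.fftfreq(g.shape[1])
    w2 = wx[:, None] ** 2 + wy[None, :] ** 2        # |omega|^2 on the grid
    low_pass = np.exp(-sigma ** 2 * w2 / 2)         # attenuates high frequencies
    return np.real(np.fft.ifft2(np.fft.fft2(g) * low_pass))

rng = np.random.default_rng(0)
g = rng.standard_normal((64, 64))                   # a "noisy" test image
two_steps = gaussian_blur_fft(gaussian_blur_fft(g, 1.0), 2.0)
one_step = gaussian_blur_fft(g, np.sqrt(1.0 ** 2 + 2.0 ** 2))
print(np.allclose(two_steps, one_step))             # → True (semigroup property)
```

The composition property holds here exactly because the frequency-domain filters multiply; it is the discrete counterpart of running the heat equation for times $t_1 + t_2$.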
Instead of (1.2) we consider a nonlinear diffusion equation of the form

$$u_t = \operatorname{div}\big(c(|\nabla u|^2)\, \nabla u\big), \qquad u(x, t=0) = g(x), \qquad (1.4)$$

with an appropriate diffusion coefficient $c(\cdot)$ that makes the strength of the smoothing dependent on the size of the gradient of the solution. Equation (1.4) is called the Perona-Malik equation, and is just one example of a nonlinear PDE approach in image processing, cf. Section 4. Related, but not always equivalent, are variational models for image smoothing. The role of the nonlinearity in a PDE model is here taken by the non-smoothness of the functional that is to be minimised, cf. Section 3.

## 2 Mathematical Representation of Images

Now, since the image function is a mathematical object, we can treat it as such and apply mathematical operations to it. These mathematical operations are summarised by the term image processing techniques, and range from statistical methods and morphological operations to solving a partial differential equation for the image function. We are especially interested in the last, i.e., PDE and variational methods used in imaging.

We have introduced the object "digital image" already in Section 1.1. There, a digital image has been introduced as a sampled and quantised version of an analogue (also called physical or real) image. The higher the resolution of a digital image, the closer it is to the analogue image in the real world. While digital image processing is indeed concerned with digital images, the methods used are often motivated by considerations in the continuum; that is, methods are formulated for the analogue image. In this course we take up this mathematically more challenging and analytically more beautiful position, and let our image $u$ be a continuous object defined on a rectangular domain $\Omega = (a,b) \times (c,d)$.
Within this framework there are many possibilities how images can be modelled, compare [CS05a, Chapter 3]. For our purposes we will focus on so-called deterministic image models only, and on three deterministic models in particular: images as elements in a function space, the level set representation of images, and the Mumford-Shah representation of images. Throughout this section let $\Omega = (a,b) \times (c,d) \subset \mathbb{R}^2$, where $a < b$, $c < d \in \mathbb{R}$. Moreover, we consider grey value images $u : \Omega \to \mathbb{R}$ only.

### 2.1 Images as elements in a function space

Images as distributions. We define the set of test functions

$$\mathcal{D}(\Omega) = \{\varphi \in C^\infty(\Omega) : \operatorname{supp}\varphi \subset\subset \Omega\}.$$

Example 2.1. Let $B_\alpha(m) = \{|x - m| < \alpha\}$ be the largest ball such that $B_\alpha(m) \subset \Omega$. Then we define

$$\varphi_{\alpha,m}(x) = \begin{cases} e^{\alpha^2/(|x-m|^2 - \alpha^2)} & x \in B_\alpha(m) \\ 0 & \text{otherwise.} \end{cases}$$

A test function $\varphi \in \mathcal{D}(\Omega)$ can be interpreted as a linear sensor for capturing image signals. More precisely, we model an image $u$ on $\Omega$ as a distribution, that is, a continuous linear functional on $\mathcal{D}(\Omega)$. Let us phrase this in accurate mathematics.

Definition 2.2 (Convergence in $\mathcal{D}(\Omega)$). A sequence $\{\varphi_n\}$ in $\mathcal{D}(\Omega)$ converges to zero if the $\varphi_n$ vanish outside a common bounded set and all partial derivatives of $\varphi_n$ converge to zero uniformly, that is,

$$\lim_{n\to\infty} \sup_{x\in\Omega} |D^\alpha \varphi_n(x)| = 0$$

for all multi-indices $\alpha = (\alpha_1, \alpha_2) \in \mathbb{Z}_+^2$.

Definition 2.3 (Distribution). A linear functional on $\mathcal{D}(\Omega)$ (that is, a linear map from $\mathcal{D}(\Omega)$ to $\mathbb{R}$) that is continuous w.r.t. convergence in $\mathcal{D}(\Omega)$ is a distribution on $\Omega$. We write $\mathcal{D}'(\Omega)$ for the set of all distributions on $\Omega$, and $u(\varphi) = \langle u, \varphi\rangle$ for the image of a test function $\varphi \in \mathcal{D}(\Omega)$ under the distribution $u \in \mathcal{D}'(\Omega)$.

An image $u \in \mathcal{D}'(\Omega)$ outputs a single response $\langle u, \varphi\rangle$ for any sensor $\varphi \in \mathcal{D}(\Omega)$ that attempts to sense image features.

Example 2.4 (Examples of distributional images).

• $u(x) = \delta(x)$, the Dirac delta function, compare Figure 7 on the left. Then $\langle u, \varphi\rangle = \varphi(0)$ for any sensor $\varphi \in \mathcal{D}(\Omega)$.
Figure 7: Left: the image $u$ is a bright spot concentrated at the origin, i.e., $u(x) = \delta(x)$, where $\delta$ stands for the Dirac delta function. Right: the image $u$ describes a step edge from 0 to 1, i.e., $u(x) = u(x_1, x_2) = H(x_1)$, where $H(t)$ is the Heaviside 0-1 step function.

• $u(x) = u(x,y) = H(x)$, the Heaviside step function, that is

$$H(x) = \begin{cases} 0 & x < 0 \\ 1 & x \ge 0, \end{cases}$$

compare Figure 7 on the right.

We summarise the following properties of distributional images.

• The sensing with test functions is a linear operation.

• With an additional positivity constraint, the Riesz representation theorem is valid: if $u$ is a positive distribution, then there exists a Radon measure $\mu$ on $\Omega$ (i.e. a Borel measure that is finite on any compact subset of $\Omega$) such that for any sensor $\varphi \in \mathcal{D}(\Omega)$

$$\langle u, \varphi\rangle = \int_\Omega \varphi(x)\, d\mu. \qquad (2.1)$$

• There is a notion of derivatives: the distributional derivative $v = D^\alpha u$ is defined as a new distribution such that

$$\langle v, \varphi\rangle = (-1)^{|\alpha|}\, \langle u, D^\alpha \varphi\rangle \quad \forall \varphi \in \mathcal{D}(\Omega),$$

where $\alpha$ is a multi-index.

• Sensing of distributions mimics the digital sensor devices in CCD cameras.

• Distributions constitute a very general class of generalised functions. In particular, $\mathcal{D}'(\Omega)$ is a complete space, but not a normed vector space. However, both the variational and the PDE imaging models are set within Banach spaces. In the following we will narrow down the class of functions we consider. This leads us first to Sobolev spaces.

Sobolev images. We start with Lebesgue integrable functions. For $p \in [1, \infty)$ we define

$$L^p(\Omega) = \Big\{ u : \int_\Omega |u(x)|^p\, dx < \infty \Big\},$$

and $L^\infty(\Omega)$ is the space of essentially bounded functions. These are Banach spaces with norms

$$\|u\|_p := \Big( \int_\Omega |u(x)|^p\, dx \Big)^{1/p}$$

and

$$\|u\|_\infty := \operatorname{ess\,sup}_\Omega |u| = \inf\{C > 0 : |u(x)| \le C \text{ for a.e. } x \in \Omega\}.$$
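On a discrete image, these norms correspond (up to normalisation) to vector norms of the pixel values. A small sketch of my own (not part of the notes), using the normalised discrete $L^p$ norm, which is non-decreasing in $p$ and dominated by the supremum norm:

```python
import numpy as np

def lp_norm(u, p):
    """Normalised discrete L^p norm of an image: (mean of |u|^p)^(1/p)."""
    return np.mean(np.abs(u) ** p) ** (1.0 / p)

u = np.array([[0.0, 1.0], [2.0, 3.0]])        # a tiny 2x2 "image"
norms = [lp_norm(u, p) for p in (1, 2, 4)]
# With the normalised measure the norms increase with p and are all
# bounded by the supremum norm (here max |u| = 3).
print(norms)
```

The monotonicity in $p$ is an instance of Jensen's inequality for the normalised measure $dx/|\Omega|$.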
In the case $p = 2$, $L^2(\Omega)$ is a Hilbert space with inner product

$$(u, v)_2 = \int_\Omega u \cdot v\, dx \quad \forall u, v \in L^2(\Omega).$$

Can we use $L^p$ norms to quantify image contents? We will try as follows. For $\Omega' \subset \Omega$ with $|\Omega'| > 0$, define

$$\langle u\rangle_{\Omega'} := \frac{1}{|\Omega'|} \int_{\Omega'} u(x)\, dx,$$

the average of $u$ over $\Omega'$. Then the average information content of $u$ in $\Omega'$ is defined by the so-called $p$-mean oscillation

$$\Big( \frac{1}{|\Omega'|} \int_{\Omega'} |u - \langle u\rangle_{\Omega'}|^p\, dx \Big)^{1/p}.$$

For $p = 2$ this gives the canonical definition of the empirical standard deviation in statistics. However, to describe "change" in images we need the notion of derivatives, which leads us to consider Sobolev spaces.

Definition 2.5 (Sobolev space). For $1 \le p \le \infty$, we define the Sobolev space $W^{k,p}(\Omega)$ as the space of all locally summable functions $u : \Omega \to \mathbb{R}$ such that for each multi-index $\alpha$ with $|\alpha| \le k$ the derivative $D^\alpha u$ exists in the weak sense and belongs to $L^p(\Omega)$. If $p = 2$, we usually write $H^k(\Omega) = W^{k,2}(\Omega)$.

The spaces $W^{k,p}(\Omega)$ are Banach spaces equipped with the norm

$$\|u\|_{W^{k,p}(\Omega)} := \begin{cases} \Big( \sum_{|\alpha|\le k} \int_\Omega |D^\alpha u|^p\, dx \Big)^{1/p} & 1 \le p < \infty \\[4pt] \sum_{|\alpha|\le k} \operatorname{ess\,sup}_\Omega |D^\alpha u| & p = \infty. \end{cases}$$

In the case $p = 2$, the spaces $H^k$ are in fact Hilbert spaces with inner product

$$(u, v)_{H^k} = \sum_{|\alpha|\le k} \int_\Omega D^\alpha u \cdot D^\alpha v\, dx.$$

If we start by modelling change in an image by its first derivatives, we might consider $H^1(\Omega)$ as an appropriate space for an image, and in turn the corresponding $H^1$ norm as an appropriate measure for image information. However, one disadvantage of taking $u \in H^1(\Omega)$ is that $u$ must, in a suitable sense, be continuous.

Problem 2.6 (Hölder continuity of $H^1$ functions). We start with the one-dimensional case. Let $u : (0,1) \to \mathbb{R}$, $u \in H^1(0,1)$. Then, for each $0 < s < t < 1$, we have

$$|u(t) - u(s)| \underset{\text{formally}}{=} \Big| \int_s^t u'(r)\, dr \Big| \le \sqrt{t-s}\, \Big( \int_s^t |u'(r)|^2\, dr \Big)^{1/2} \le \sqrt{t-s}\, \|u\|_{H^1},$$

so that $u$ must be $1/2$-Hölder continuous.
(The formal argument above can be made rigorous: it should be made for smooth functions first and then justified by a density argument for $H^1(0,1)$.)

In two dimensions we have, for $u : (0,1)^2 \to \mathbb{R}$ with $u \in H^1((0,1)^2)$, that its one-dimensional restrictions are in $H^1(0,1)$; that is, for a.e. $y \in (0,1)$, $x \mapsto u(x,y) \in H^1(0,1)$, which essentially comes from the fact that

$$\int_0^1 \int_0^1 |D_x u(x,y)|^2\, dx\, dy \le \|u\|_{H^1}^2 < \infty.$$

It means that for a.e. $y$ the map $x \mapsto u(x,y)$ is $1/2$-Hölder continuous in $x$, so that the image $u$ certainly cannot jump across vertical boundaries. A similar kind of regularity can be shown for any $u \in W^{1,p}(\Omega)$, $1 \le p \le \infty$.

The problem with forcing an image to be continuous is that, as we have seen before with the solution to the heat equation, edges in an image (that is, jumps in the image function) cannot be represented. However, edges constitute one of the most important features in images, and whenever we process an image – denoising, deblurring or interpolating it, say – we seek methods that can represent and preserve edges in the process. These considerations call for a less smooth space: the space of functions of bounded variation.

BV images. In the context of image reconstruction, Rudin, Osher and Fatemi (ROF) introduced in 1992 the total variation as a measure for image reconstruction. The motivation was to propose a quantity that differentiates between noise and structure in the image, in particular between noise and image edges. For a function $u \in W^{1,1}(\Omega)$ they considered

$$\int_\Omega |Du(x)|\, dx. \qquad (2.2)$$

Well, that is confusing: $W^{1,1}$ functions are still not allowed to have discontinuities along horizontal and vertical directions in the image domain, see Exercises. However, the key is that the total variation can be defined for much more general functions than those in $W^{1,1}$. To see this, let us first consider a simple example.

Example 2.7 (Dirac delta). Let $f(x,y) = \delta(x,y)$ be the Dirac delta at $(0,0) \in \Omega$.
Then, formally,

$$I(\delta) = \text{``}\int_\Omega \delta(x,y)\, dx\, dy\text{''} = 1,$$

but $\delta$ is not an $L^1$ function in the traditional sense; it is a measure. So what is the correct definition of the integral $I$ for $\delta$? More generally, let $\mu$ be a non-negative measure on all Borel subsets of $\Omega$, with $\mu(K) < \infty$ for all compact $K \subset \Omega$ (or, more generally, let $\mu$ be a signed Radon measure). Then we can define

$$I(\mu) = \int_\Omega 1\, d\mu = \mu(\Omega),$$

and with that, signed Radon measures generalise the space $L^1(\Omega)$. The same idea applies to the total variation (2.2) of an image $u$ by taking $f = Du$.

Example 2.8 (Total variation of the Heaviside function). In $\mathbb{R}^1$ we consider $f(x) = \delta(x)$. Then for $u' = \delta$ and $u(-\infty) = 0$ we get

$$u(x) = H(x) = \begin{cases} 0 & x < 0 \\ 1 & x \ge 0. \end{cases}$$

Then $u \notin W^{1,1}(\mathbb{R}^1)$, but still we can define

$$TV(H) = \text{``}\int_{\mathbb{R}^1} |H'(x)|\, dx\text{''} = \int_{\mathbb{R}^1} 1\, d\delta = 1.$$

More generally, for $\mu$ a signed Radon measure on $\mathbb{R}^1$ and $u$ its cumulative distribution function, we can define

$$TV(u) = \int_{\mathbb{R}^1} d|\mu| = |\mu|(\mathbb{R}^1),$$

where $|\mu|$ is the total variation measure of $\mu$, that is, $|\mu| = \mu^+ + \mu^-$, where $\mu^+$ and $\mu^-$ are the positive and negative variations of $\mu$, respectively.

With this discussion in mind we can now generalise the notion of the total variation (2.2) to functions $u \in L^1_{loc}$.

Definition 2.9 (Total variation). The total variation of an image is defined by duality: for $u \in L^1_{loc}(\Omega)$ it is given by

$$TV(u) = |Du|(\Omega) = \sup\Big\{ -\int_\Omega u\, \operatorname{div}\varphi\, dx : \varphi \in C_c^\infty(\Omega; \mathbb{R}^2),\ |\varphi(x)| \le 1\ \forall x \in \Omega \Big\}, \qquad (2.3)$$

and it is the total variation measure of the Radon measure $Du$ given by the Riesz representation theorem (2.1). The space $BV(\Omega)$ of functions of bounded variation is the set of functions $u \in L^1(\Omega)$ such that $TV(u) < \infty$, endowed with the norm

$$\|u\|_{BV} = \|u\|_1 + TV(u).$$

Example 2.10.

• For $u \in W^{1,1}(\Omega)$ we have $|Du|(\Omega) = \|Du\|_1$.

• For $u = \chi_C$, the characteristic function of a subset $C \subset \Omega$ with smooth boundary, we have $|Du|(\Omega) = \mathcal{H}^1(\partial C)$, the perimeter of $C$ in $\Omega$.
Here $\mathcal{H}^1$ is the 1-dimensional Hausdorff measure.

### 2.2 The Mumford-Shah image model

See Section 3.5.

## 3 Variational Approach to Image Processing

The variational approach to image processing constitutes the computation of a reconstructed image $u$, based on the observed image (or, more generally, data) $g$, as a minimiser of a functional. The modelling of this variational approach can be best motivated from Bayesian statistics.

Figure 8: In one space dimension, the total variation of a signal $u$ on the interval $(0,1)$ is just the maximal sum over the absolute differences of function values for partitions of $(0,1)$. Hence, in this example the total variation of $u$ on the left and of $u$ on the right is the same!

The Bayesian point of view on image reconstruction. For this we go back to the discrete setting for a moment. Given an image $g \in \mathbb{R}^N \times \mathbb{R}^N$, we can formulate two components for solving a general inverse problem:

• Data model: $g = Tu + n$, where $u \in \mathbb{R}^N \times \mathbb{R}^N$ is the original image (to be reconstructed), $T$ is a linear transformation (for instance, $T = \mathrm{Id}$ in image denoising, $T = S$ a sampling operator for image inpainting, $T = k\,*$ a convolution with a kernel $k$ in image deblurring, or $T = \mathcal{F}$ for reconstructing an image from its Fourier transform), and $n$ is the noise, which for now is assumed to be Gaussian distributed with mean 0 and standard deviation $\sigma$.

• A-priori probability density: $P(u) = e^{-p(u)}\, du$, the a-priori information on the original image.

Then the a-posteriori probability for $u$ knowing $g$ is given by Bayes' rule as

$$P(u|g) = \frac{P(g|u)\, P(u)}{P(g)},$$

with

$$P(g|u) = e^{-\frac{1}{2\sigma^2}\sum_{i,j} |(Tu)_{i,j} - g_{i,j}|^2}, \qquad P(u) = e^{-p(u)}.$$

The idea of "maximum a posteriori" (MAP) image reconstruction is to find the "best" image as the one which maximises this probability, or, equivalently, which solves the minimisation problem

$$\min_u\ p(u) + \frac{1}{2\sigma^2} \sum_{i,j} |g_{i,j} - (Tu)_{i,j}|^2.$$

The variational model. Going back to the continuous setting with $g \in L^2(\Omega)$, where $\Omega \subset \mathbb{R}^2$ is a rectangular image domain, and $T \in \mathcal{L}(L^2(\Omega))$ a bounded linear operator from $L^2$ into itself, a generic minimisation problem to recover $u$ from $g$ reads

$$\min_{u \in L^2(\Omega)}\ \alpha \psi(u) + \frac{1}{2} \int_\Omega |Tu(x) - g(x)|^2\, dx. \qquad (3.1)$$

Here the so-called regulariser $\psi$ corresponds to the a-priori information that we have on the reconstructed image $u$, and $\alpha > 0$ is the so-called regularisation parameter, which acts as a balance between the data model and the regulariser. One of the most successful variational approaches for image reconstruction is the total variation model, which takes $\psi(u) = TV(u)$, cf. (2.3).

### 3.1 Mathematical preliminaries

Definition 3.1 (Strong, weak, and weak* convergence). A sequence $(x_n)$ in a normed space $(X, \|\cdot\|_X)$ converges

• strongly to $x \in X$ if $\lim_{n\to\infty} \|x_n - x\|_X = 0$. In this case we write $x_n \to x$.

• weakly to $x \in X$ if

$$\lim_{n\to\infty} \langle x_n, y\rangle_{X\times Y} = \langle x, y\rangle_{X\times Y} \quad \forall y \in Y := X^*,$$

the dual space of $X$. In this case we write $x_n \rightharpoonup x$.

A sequence $(y_n) \subset Y = X^*$ converges weakly* to $y \in Y$ if

$$\lim_{n\to\infty} \langle y_n, x\rangle_{Y\times X} = \langle y, x\rangle_{Y\times X} \quad \forall x \in X.$$

In this case we write $y_n \rightharpoonup^* y$.

Definition 3.2 (Weakly sequentially compact). Let $(X, \|\cdot\|_X)$ be a normed space. A subset $U \subset X$ is weakly sequentially compact if every sequence in $U$ has a weakly converging subsequence with limit in $U$.

Theorem 3.3. A normed vector space $X$ is reflexive if and only if every bounded ball in $X$ is weakly sequentially compact.

Example 3.4 (Reflexive Banach spaces).

• $L^2$ is reflexive but $L^1$ is not.

• $H^1$ is reflexive but $W^{1,1}$ is not.

Hausdorff measure. Let us start with the definition of the $k$-dimensional Hausdorff measure $\mathcal{H}^k$.

Definition 3.5. Let $0 \le k \le +\infty$ and let $S$ be a subset of $\mathbb{R}^d$.
The $k$-dimensional Hausdorff measure of $S$ is given by

$$\mathcal{H}^k(S) = \lim_{\rho\to 0}\, \Big[ n(k) \cdot \inf\Big\{ \sum_{i\in I} |\operatorname{diam}(A_i)|^k : \operatorname{diam}(A_i) \le \rho,\ S \subset \bigcup_{i\in I} A_i \Big\} \Big],$$

where $n(k)$ is a normalisation factor and $\operatorname{diam}(A_i)$ denotes the diameter of the set $A_i$. Then the Hausdorff dimension of a set $S$ is defined by

$$\mathcal{H}\text{-dim}(S) = \inf\{k \ge 0 : \mathcal{H}^k(S) = 0\}.$$

Remark 3.6. Let $k$ be a positive integer less than or equal to the dimension $d$, and let $S \subset \mathbb{R}^d$ be a $C^1$ $k$-dimensional manifold in $\mathbb{R}^d$. Then the Hausdorff measure $\mathcal{H}^k(S)$ equals the classical $k$-dimensional area of $S$. Moreover, $\mathcal{H}^d(S)$ equals the Lebesgue measure of $S$.

Level sets. Let $u$ be an integrable function on $\Omega \subset \mathbb{R}^2$, open and bounded with Lipschitz boundary. We define the level sets $\Gamma_\lambda$ of $u$ as

$$\Gamma_\lambda(u) = \{x \in \Omega : u(x) \le \lambda\}. \qquad (3.2)$$

Then the level-set representation of $u$ is $\Gamma(u) = \{\Gamma_\lambda(u) : \lambda \in \mathbb{R}\}$. Note that the definition of level sets is not unique. In particular, the above definition differs from the classical level set formulation, where the level sets are defined as curves $\gamma_\lambda$ (i.e. $\gamma_\lambda = \{x \in \Omega : u(x) = \lambda\}$) rather than sets. In fact, for a continuous image function $u$, the boundary $\partial\Gamma_\lambda = \gamma_\lambda$. The advantage of the set notion (3.2) is that it makes sense for non-smooth images as well.

Curvature. In Section 3.4 we will consider interpolation methods that find the interpolant with minimal curvature. To formalise this we first have to say what we mean by curvature, both in the sense of curves (associated to level sets (3.2)) as well as in the sense of curvature of a function (i.e. mean curvature of the surface defined by the image function $u$). To do so, we first recall some facts about planar curves, their length and their curvature.

Let $\gamma : [0,1] \to \mathbb{R}^2$ be a simple (i.e. without self-intersections) curve parameterised from the interval $[0,1]$ to $\mathbb{R}^2$.
Then the length of $\gamma$ is defined as

$$\operatorname{length}(\gamma) := \sup\Big\{ \sum_{i=1}^n |\gamma(t_i) - \gamma(t_{i-1})| : n \in \mathbb{N},\ 0 = t_0 < t_1 < \ldots < t_n = 1 \Big\}.$$

A rectifiable curve is a curve with finite length. Moreover, if $\gamma$ is Lipschitz continuous on $[0,1]$, then the metric derivative (the speed) of the curve $\gamma$ is defined by

$$|\gamma'|(t) := \lim_{s\to 0} \frac{\|\gamma(t+s) - \gamma(t)\|}{|s|},$$

where $\|\cdot\|$ is the Euclidean norm on $\mathbb{R}^2$. With that, the length of $\gamma$ is equivalently given by

$$\operatorname{length}(\gamma) = \int_0^1 |\gamma'|(t)\, dt. \qquad (3.3)$$

Note that a generalised notion of the length of a curve appears in the context of the coarea formula for functions of bounded variation, cf. Theorem 3.18. The arc length $s(t)$ of $\gamma$ is given in the same flavour as (3.3) by

$$s(t) = \int_0^t |\gamma'|(\tilde t)\, d\tilde t, \qquad t \in [0,1].$$

Re-parametrising $\gamma$ in terms of its arc length is called the natural parametrisation and yields the unit tangent vector

$$\gamma'(s) = \frac{\gamma'(t)}{|\gamma'|(t)}.$$

If $\gamma$ is twice continuously differentiable, then the signed curvature of $\gamma$ at $t$ is given by

$$\kappa(t) = \frac{\det(\gamma'(t), \gamma''(t))}{|\gamma'(t)|^3} \qquad (3.4)$$

and

$$|\kappa(s)| = |\gamma''(s)|. \qquad (3.5)$$

We will see later in Section 3.4 that the curvature of a curve $\gamma$ can be defined in a weaker sense and the regularity assumption on $\gamma$ can be relaxed.

### 3.2 Image denoising

Total variation minimisation. Given $g \in L^2(\Omega)$, we propose to compute a denoised version $u$ of $g$ as a solution of

$$\min_{u \in L^2(\Omega)}\ \mathcal{J}(u) = \alpha\, |Du|(\Omega) + \frac{1}{2}\|u - g\|_2^2. \qquad \text{(ROF)}$$

This is the classical ROF denoising approach, named after Rudin, Osher and Fatemi, who introduced it in 1992.

Theorem 3.7 (Existence and uniqueness for ROF). The minimisation problem (ROF) has a unique minimiser $u$.

For proving the existence of minimisers of this problem, we follow the direct method of calculus of variations.
For a generic minimisation problem of the form min J (u), X a Banach space, u∈X we can phrase this strategy within three steps Method 3.8 (Direct method of calculus of variations). 1. Show that J is bounded from below, that is inf u∈X J (u) > −∞. Hence, there exists a minimising sequence (un ) ∈ X such that J (un ) < ∞ for all n and limn→∞ J (un ) = inf u∈X J (u). 2. Check that this sequence is contained in a set which is sequentially compact w.r.t. the topology induced on X. From that we get that there exists a subsequence (unk ) and a u∗ ∈ X such that limk→∞ unk = u∗ with convergence w.r.t. the topology in X. This u∗ is a candidate for a minimiser. 3. Can we now deduce that J (u∗ ) = lim J (unk )? (3.6) k→∞ Unfortunately no, since in essentially all cases the functional J is not continuous with respect to the convergence induced by the topology on X. Luckily, a key observation is that we do not really need the full strength of (3.6). To prove that u∗ in fact is a minimiser of J we need to confirm one last property of J , that is that J is (sequentially) lower-semicontinuous (l.s.c.) with respect to the topology in X. Then inf J (u) ≤ J (u∗ ) ≤ lim inf J (unk ) = inf J (u), u∈X k→∞ u∈X and we have that u∗ ∈ X is a minimiser of J Proof. We put the cart before the horse and start with proving that the total variation is l.s.c. with respect to weak convergence in Lp for p ∈ [1, +∞). The idea is that the total variation is the supremum of continuous functions and as such is l.s.c. Indeed, let Z Lϕ : u 7→ − u(x)divϕ(x) dx. Ω p If un * u in L (Ω) then Lϕ un → Lϕ u (this is due to the continuity of Lϕ even w.r.t. very weak topologies). But then Lϕ u = lim Lϕ un ≤ lim inf T V (un ). n→∞ Taking the supremum over all ϕ ∈ Cc∞ (Ω; R2 ) n→∞ with |ϕ(x)| ≤ 1 we deduce T V (u) ≤ lim inf T V (un ), n→∞ Please email all corrections and suggestions to these notes to [email protected] 17 Image Processing – Variational and PDE Methods C.-B. 
that is, $TV$ is (sequentially) l.s.c. with respect to all of the above-mentioned topologies.

Now let $(u_n)$ be a minimising sequence for $\mathcal{J}$, that is $\lim_{n\to\infty}\mathcal{J}(u_n) = \inf_u \mathcal{J}(u)$. Since $\mathcal{J}(u_n) \le \mathcal{J}(0) < \infty$ for $n$ large enough (using $g \in L^2(\Omega)$ and $TV \ge 0$), the sequence $(u_n)$ is bounded in $L^2$ and hence there exists a weakly converging subsequence $u_{n_k} \rightharpoonup u$ in $L^2$. The total variation is l.s.c. with respect to weak convergence in $L^2$, and the $L^2$ norm is l.s.c. as well, that is
$$\|u-g\|_2 \le \liminf_{k\to\infty}\|u_{n_k}-g\|_2.$$
Hence
$$\mathcal{J}(u) \le \liminf_{k\to\infty} \mathcal{J}(u_{n_k}) = \inf \mathcal{J},$$
so $u$ is a minimiser of $\mathcal{J}$.

To prove uniqueness of the minimiser $u$ we observe that $\|\cdot\|_2^2$ is strictly convex. Moreover, the total variation is convex, since it is the supremum of linear (and hence convex) functionals $L_\varphi$: indeed, for any $u_1, u_2$ and $t \in [0,1]$,
$$L_\varphi(t u_1 + (1-t)u_2) = t L_\varphi(u_1) + (1-t)L_\varphi(u_2) \le t\, TV(u_1) + (1-t)\, TV(u_2),$$
and taking the sup on the left-hand side gives the result. Hence, if $u$ and $u'$ are two minimisers of $\mathcal{J}$, then
$$\mathcal{J}\Big(\frac{u+u'}{2}\Big) \le \frac{\alpha}{2}\big(TV(u) + TV(u')\big) + \frac12\int_\Omega \Big(\frac{u+u'}{2} - g\Big)^2 dx = \frac12\big(\mathcal{J}(u) + \mathcal{J}(u')\big) - \frac18\int_\Omega (u-u')^2\, dx,$$
which would be strictly less than $\inf \mathcal{J}$ unless $u = u'$. Hence the minimiser of the ROF problem (ROF) exists and is unique. $\square$

In the proof of Theorem 3.7 we established the existence of a minimiser $u$ for $\mathcal{J}$ in (ROF). The next obvious question is: what is the regularity of the minimiser $u$? Our immediate answer is that, since $|Du|(\Omega) + \|u\|_2^2 < \infty$ (and hence also $\|u\|_1 < \infty$), we have $u \in BV(\Omega)$, cf. Exercises. Very elegantly, this property also results from the following compactness property of the space $BV$.

Theorem 3.9 (Rellich's compactness theorem). Let $\Omega \subset \mathbb{R}^2$ be a rectangular image domain, and let $(u_n)$ be a sequence of functions in $BV(\Omega)$ such that $\sup_n \|u_n\|_{BV} < +\infty$. Then there exist $u \in BV(\Omega)$ and a subsequence $(u_{n_k})$ such that $u_{n_k} \to u$ strongly in $L^1(\Omega)$ as $k \to \infty$.

Remark 3.10.
Note that the space $BV(\Omega)$ is not reflexive, but Rellich's theorem provides enough compactness to prove existence of solutions for the ROF minimisation problem from a minimising sequence in $BV$. This will also become important when considering total variation minimisation with an $L^1$ data fidelity term.

Two numerical algorithms for total variation denoising

In what follows we discuss two approaches for solving the total variation denoising problem numerically. The first method relies on a smoothing of the total variation regulariser that turns it into a differentiable quantity, and is based on solving the corresponding Euler-Lagrange equation via a fixed-point iteration. In the second method the total variation denoising problem is reformulated in its dual form, using standard facts about the Legendre-Fenchel dual of a convex, one-homogeneous function. The solution of the dual problem turns out to be an orthogonal projection onto a convex set, which can again be computed via a fixed-point iteration.

• Method A: Lagged diffusivity. The idea of this method is to replace the minimisation problem (ROF) by
$$\min_{u \in W^{1,1}(\Omega)} \Big\{ \mathcal{J}_\varepsilon(u) = \alpha \int_\Omega \sqrt{|\nabla u|^2 + \varepsilon}\; dx + \frac12 \|u-g\|_2^2 \Big\}, \tag{3.7}$$
where $g \in L^2(\Omega)$ and the smoothing parameter satisfies $0 < \varepsilon \ll 1$. The well-posedness of this scheme cannot be derived via the direct method of calculus of variations, because $W^{1,1}$ is not reflexive. Even if we set the problem up as a minimisation over $L^2$ (as before), the smoothed total variation (with $\nabla u$ representing the absolutely continuous part of the distributional derivative only) is not lower semicontinuous. Well-posedness can still be proven via the method of relaxation. We will not discuss this here but refer the reader to [AK06, Chapter 3.2.3] for the proof.
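The smoothed problem (3.7) is simple enough to minimise by plain steepest descent (cf. Remark 3.11 below). The following is a minimal numpy sketch; the helper names, the forward-difference boundary handling and the parameter values are my own choices, not prescribed by the notes:

```python
import numpy as np

def grad(u):
    """Forward-difference gradient, zero at the last row/column."""
    gx = np.zeros_like(u)
    gy = np.zeros_like(u)
    gx[:-1, :] = u[1:, :] - u[:-1, :]
    gy[:, :-1] = u[:, 1:] - u[:, :-1]
    return gx, gy

def div(px, py):
    """Discrete divergence, the negative adjoint of grad."""
    dx = np.zeros_like(px)
    dx[0, :] = px[0, :]
    dx[1:-1, :] = px[1:-1, :] - px[:-2, :]
    dx[-1, :] = -px[-2, :]
    dy = np.zeros_like(py)
    dy[:, 0] = py[:, 0]
    dy[:, 1:-1] = py[:, 1:-1] - py[:, :-2]
    dy[:, -1] = -py[:, -2]
    return dx + dy

def tv_denoise_descent(g, alpha, eps=1e-2, tau=1e-2, n_iter=500):
    """Steepest descent for the smoothed functional (3.7):
    u <- u + tau * ( alpha * div( grad u / sqrt(|grad u|^2 + eps) ) + g - u )."""
    u = g.copy()
    for _ in range(n_iter):
        gx, gy = grad(u)
        mag = np.sqrt(gx**2 + gy**2 + eps)
        u = u + tau * (alpha * div(gx / mag, gy / mag) + g - u)
    return u
```

As the notes warn, the admissible step size $\tau$ shrinks with $\varepsilon$; the values above are merely ones that behave stably on small test images.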
For computing a minimiser of (3.7) we first derive the corresponding first-order optimality condition (the Euler-Lagrange equation) in weak form. That is, for a smooth and compactly supported test function $\varphi \in C_c^\infty(\Omega)$ we require
$$\frac{d}{d\tau}\,\mathcal{J}_\varepsilon(u + \tau\varphi)\Big|_{\tau=0} = 0 \qquad \forall\, \varphi \in C_c^\infty(\Omega).$$
This gives
$$\frac{d}{d\tau}\,\mathcal{J}_\varepsilon(u+\tau\varphi)\Big|_{\tau=0} = \alpha \int_\Omega \frac{\nabla u}{\sqrt{|\nabla u|^2+\varepsilon}}\cdot\nabla\varphi\; dx + \int_\Omega (u-g)\,\varphi\; dx = \int_\Omega \Big( -\alpha\, \mathrm{div}\Big(\frac{\nabla u}{\sqrt{|\nabla u|^2+\varepsilon}}\Big) + (u-g) \Big)\varphi\; dx,$$
and therefore the Euler-Lagrange equation (in the weak sense)
$$-\alpha\, \mathrm{div}\Big(\frac{\nabla u}{\sqrt{|\nabla u|^2+\varepsilon}}\Big) + (u-g) = 0.$$
The lagged-diffusivity method computes the minimiser of (3.7) by a fixed-point iteration on this Euler-Lagrange equation. More precisely, starting from an initial $u^{(0)}$ (e.g. $u^{(0)} = g$), one iteratively computes $u^{(k+1)}$ for $k = 0, 1, \dots$ as the solution of
$$-\alpha\, \mathrm{div}\Big(\frac{\nabla u^{(k+1)}}{\sqrt{|\nabla u^{(k)}|^2+\varepsilon}}\Big) + (u^{(k+1)}-g) = 0,$$
which is a linear partial differential equation in $u^{(k+1)}$. For more information on this scheme and its convergence analysis see [VO96, CM99].

Remark 3.11. If you are looking for a very easy-to-implement method for total variation denoising, you can also attempt to compute a minimiser of (3.7) via the method of steepest descent. As above, start with an initial $u^{(0)}$ and iterate for $k = 0, 1, \dots$
$$u^{(k+1)} = u^{(k)} + \tau\Big( \alpha\, \mathrm{div}\Big(\frac{\nabla u^{(k)}}{\sqrt{|\nabla u^{(k)}|^2+\varepsilon}}\Big) + g - u^{(k)} \Big),$$
with time step size $\tau > 0$. Stability of this scheme is only guaranteed under a condition on the step size $\tau$ whose upper bound depends on the space discretisation and on the size of $\varepsilon$. This condition can be rather restrictive and, for small $\varepsilon$, allows only very small incremental changes in time. However, the advantage of this scheme is that it is very easy to implement and each iteration is very cheap, since it only requires the evaluation of the right-hand side of the equation.

• Method B: Convex optimisation. For the derivation of this algorithm we change to the discrete setting.
We consider images as $N \times N$ matrices and denote by $X$ the Euclidean space $\mathbb{R}^{N\times N}$. To define the discrete total variation we introduce a discrete (linear) gradient operator. If $u \in X$, the gradient $\nabla u$ is a vector in $Y = X \times X$ given by $(\nabla u)_{i,j} = ((\nabla^x u)_{i,j}, (\nabla^y u)_{i,j})$, with
$$(\nabla^x u)_{i,j} = \begin{cases} u_{i+1,j}-u_{i,j} & \text{if } i<N \\ 0 & \text{if } i=N, \end{cases} \qquad (\nabla^y u)_{i,j} = \begin{cases} u_{i,j+1}-u_{i,j} & \text{if } j<N \\ 0 & \text{if } j=N, \end{cases}$$
for $i, j = 1, \dots, N$. Then the discrete total variation of $u$ is defined by
$$J(u) = \sum_{1\le i,j\le N} |(\nabla u)_{i,j}|,$$
with $|y| := \sqrt{y_1^2 + y_2^2}$ for every $y = (y_1, y_2) \in \mathbb{R}^2$. This functional $J$ is a discretisation of the standard total variation, defined in the continuous setting for a function $u \in L^1(\Omega)$ ($\Omega$ an open subset of $\mathbb{R}^2$) by (2.3). It can be shown that if some stepsize (or pixel size) $h \approx 1/N$ is introduced in the discrete definition of $J$ (defining a new functional $J_h$ equal to $h$ times the expression in the discrete definition of $J$), then as $h \to 0$ (and the number of pixels goes to infinity) $J_h$ $\Gamma$-converges to the continuous $J$ (defined on $\Omega = (0,1) \times (0,1)$). This means that, if the pixel size is very small, the minimisers of the problems we are going to consider correctly approximate minimisers of similar problems defined in the continuous setting, see [L11, Chapter 2.7].

Since $J$ is one-homogeneous (that is, $J(\lambda u) = \lambda J(u)$ for every $u$ and $\lambda > 0$), it is a standard fact of convex analysis that the Legendre-Fenchel transform $J^*(v) = \sup_u \{\langle u, v\rangle_X - J(u)\}$ (with $\langle u, v\rangle_X = \sum_{i,j} u_{i,j} v_{i,j}$) is the characteristic function of a closed convex set $K$, see [HUL93]. That is,
$$J^*(v) = \chi_K(v) = \begin{cases} 0 & \text{if } v \in K \\ +\infty & \text{otherwise.} \end{cases}$$
Since $J^{**} = J$, we recover
$$J(u) = \sup_{v\in K} \langle u, v\rangle_X.$$
In the continuous setting, one readily sees from the definition of the functional $J$ that $K$ is the closure of the set
$$\{\mathrm{div}\,\xi : \xi \in C_c^1(\Omega;\mathbb{R}^2),\ |\xi(x)| \le 1\ \forall x \in \Omega\}.$$
Let us now find a similar characterisation in the discrete setting.
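As a quick aside, the discrete gradient and total variation just defined can be coded directly. A self-contained numpy sketch (the axis convention, axis 0 for $i$ and axis 1 for $j$, and the helper names are my own):

```python
import numpy as np

def grad(u):
    """(nabla u)_{i,j} with forward differences, zero at i = N and j = N."""
    gx = np.zeros_like(u)
    gy = np.zeros_like(u)
    gx[:-1, :] = u[1:, :] - u[:-1, :]
    gy[:, :-1] = u[:, 1:] - u[:, :-1]
    return gx, gy

def discrete_tv(u):
    """J(u) = sum_{i,j} |(nabla u)_{i,j}| with the Euclidean vector norm."""
    gx, gy = grad(u)
    return float(np.sum(np.sqrt(gx**2 + gy**2)))
```

For a binary step image the discrete total variation counts the jump pixels, a discrete analogue of the perimeter appearing in the coarea formula of Theorem 3.18.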
In $Y = X \times X$ we use the Euclidean scalar product, defined in the standard way by
$$\langle p, q\rangle_Y = \sum_{1\le i,j\le N}\big( (p^x)_{i,j}(q^x)_{i,j} + (p^y)_{i,j}(q^y)_{i,j} \big)$$
for every $p = (p^x, p^y),\ q = (q^x, q^y) \in Y$. Then, for every $u$,
$$J(u) = \sup_p\, \langle p, \nabla u\rangle_Y,$$
where the sup is taken over all $p \in Y$ such that $|p_{i,j}| \le 1$ for every $i, j$. We introduce a discrete divergence $\mathrm{div} : Y \to X$ defined, by analogy with the continuous setting, by $\mathrm{div} = -\nabla^*$ ($\nabla^*$ being the adjoint of the gradient $\nabla$). That is, for every $p \in Y$ and $u \in X$,
$$\langle -\mathrm{div}\, p, u\rangle_X = \langle p, \nabla u\rangle_Y.$$
One checks easily that $\mathrm{div}$ is given by
$$(\mathrm{div}\, p)_{i,j} = \begin{cases} (p^x)_{i,j} - (p^x)_{i-1,j} & \text{if } 1<i<N \\ (p^x)_{i,j} & \text{if } i=1 \\ -(p^x)_{i-1,j} & \text{if } i=N \end{cases} \;+\; \begin{cases} (p^y)_{i,j} - (p^y)_{i,j-1} & \text{if } 1<j<N \\ (p^y)_{i,j} & \text{if } j=1 \\ -(p^y)_{i,j-1} & \text{if } j=N, \end{cases}$$
for every $p = (p^x, p^y) \in Y$. From this one immediately deduces that $K$ is given by
$$K = \{\mathrm{div}\, p : p \in Y,\ |p_{i,j}| \le 1\ \forall\, i,j = 1,\dots,N\}.$$
Based on these preliminary observations, Chambolle [Ch04] proposes an algorithm to solve
$$\min_{u\in X}\, \frac{\|u-g\|^2}{2\alpha} + J(u),$$
given $g \in X$ and $\alpha > 0$, which will be described in the sequel. Here $\|\cdot\|$ is the Euclidean norm in $X$, given by $\|u\|^2 = \langle u, u\rangle_X$. The Euler equation for the proposed optimisation problem is
$$u - g + \alpha\, \partial J(u) \ni 0.$$
Here $\partial J$ is the sub-differential of $J$, defined by

Definition 3.12 (Sub-differential). Let $H$ be a Hilbert space and let $\varphi : H \to (-\infty, +\infty]$ be convex and proper ($\varphi \not\equiv +\infty$). Then for any $x \in H$ the sub-differential of $\varphi$ at $x$ is defined as
$$\partial\varphi(x) = \{ y \in H \;:\; \varphi(\xi) \ge \varphi(x) + \langle y, \xi - x\rangle_H\ \ \forall\,\xi\in H \}.$$
The Euler equation can be rewritten as $(g-u)/\alpha \in \partial J(u)$, which is equivalent to $u \in \partial J^*((g-u)/\alpha)$ (we only quote this convex-analysis result here; it can be found in [HUL93]). Writing this as
$$\frac{g}{\alpha} \in \frac{g-u}{\alpha} + \frac{1}{\alpha}\, \partial J^*\Big(\frac{g-u}{\alpha}\Big),$$
we get that $w = (g - u)/\alpha$ is the minimiser of
$$\frac12\Big\| w - \frac{g}{\alpha}\Big\|^2 + \frac{1}{\alpha}\, J^*(w).$$
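A numpy sketch of this discrete divergence, together with a check of the defining adjoint relation $\langle -\mathrm{div}\, p, u\rangle_X = \langle p, \nabla u\rangle_Y$ and the resulting projection iteration of Chambolle [Ch04] described in the sequel. The step size and iteration count are my own choices ($\tau = 1/8$ matches the bound in Theorem 3.13):

```python
import numpy as np

def grad(u):
    gx = np.zeros_like(u); gy = np.zeros_like(u)
    gx[:-1, :] = u[1:, :] - u[:-1, :]
    gy[:, :-1] = u[:, 1:] - u[:, :-1]
    return gx, gy

def div(px, py):
    """div = -adjoint(grad), matching the case-by-case formula above."""
    dx = np.zeros_like(px)
    dx[0, :] = px[0, :]                      # i = 1
    dx[1:-1, :] = px[1:-1, :] - px[:-2, :]   # 1 < i < N
    dx[-1, :] = -px[-2, :]                   # i = N
    dy = np.zeros_like(py)
    dy[:, 0] = py[:, 0]
    dy[:, 1:-1] = py[:, 1:-1] - py[:, :-2]
    dy[:, -1] = -py[:, -2]
    return dx + dy

def rof_chambolle(g, alpha, tau=0.125, n_iter=200):
    """Semi-implicit dual fixed-point iteration; returns u = g - alpha * div p."""
    px = np.zeros_like(g); py = np.zeros_like(g)
    for _ in range(n_iter):
        gx, gy = grad(div(px, py) - g / alpha)
        norm = np.sqrt(gx**2 + gy**2)
        px = (px + tau * gx) / (1.0 + tau * norm)
        py = (py + tau * gy) / (1.0 + tau * norm)
    return g - alpha * div(px, py)
```

The dual variable $p$ stays inside the constraint set $|p_{i,j}| \le 1$ by construction of the update, which is why no explicit projection step appears in the loop.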
Since $J^*$ is the characteristic function $\chi_K$, we deduce $w = \Pi_K(g/\alpha)$. Hence the solution $u$ of the proposed problem is simply given by
$$u = g - \Pi_{\alpha K}(g).$$
A possible algorithm for computing $u$ is therefore to compute the nonlinear projection $\Pi_{\alpha K}$. Computing this projection amounts to solving the problem
$$\min\big\{ \|\alpha\, \mathrm{div}\, p - g\|^2 \;:\; p \in Y,\ |p_{i,j}| \le 1\ \forall\, i,j=1,\dots,N \big\}.$$
The Karush-Kuhn-Tucker conditions yield the existence of a Lagrange multiplier $\alpha_{i,j} \ge 0$ associated to each constraint in the above problem. With this we have, for each $i, j$,
$$-\big(\nabla(\alpha\, \mathrm{div}\, p - g)\big)_{i,j} + \alpha_{i,j}\, p_{i,j} = 0,$$
with either $\alpha_{i,j} > 0$ and $|p_{i,j}| = 1$, or $|p_{i,j}| < 1$ and $\alpha_{i,j} = 0$. In the latter case also $(\nabla(\alpha\, \mathrm{div}\, p - g))_{i,j} = 0$. We see that in any case
$$\alpha_{i,j} = \big|\big(\nabla(\alpha\, \mathrm{div}\, p - g)\big)_{i,j}\big|.$$
We thus propose the following semi-implicit gradient descent (or fixed-point) algorithm. Choose $\tau > 0$, let $p^{(0)} = 0$ and for any $n \ge 0$ set
$$p^{(n+1)}_{i,j} = p^{(n)}_{i,j} + \tau\Big( \big(\nabla(\mathrm{div}\, p^{(n)} - g/\alpha)\big)_{i,j} - \big|\big(\nabla(\mathrm{div}\, p^{(n)} - g/\alpha)\big)_{i,j}\big|\, p^{(n+1)}_{i,j} \Big),$$
so that
$$p^{(n+1)}_{i,j} = \frac{p^{(n)}_{i,j} + \tau \big(\nabla(\mathrm{div}\, p^{(n)} - g/\alpha)\big)_{i,j}}{1 + \tau \big|\big(\nabla(\mathrm{div}\, p^{(n)} - g/\alpha)\big)_{i,j}\big|}.$$
Chambolle shows the following result.

Theorem 3.13. Let $\tau \le 1/8$. Then $\alpha\, \mathrm{div}\, p^{(n)}$ converges to $\Pi_{\alpha K}(g)$ as $n \to \infty$.

Other noise models

For the Gaussian noise model we derived, at the beginning of Section 3, the variational approach (ROF). The following overview lists possible other noise models and their corresponding data fidelity terms in the generic total variation denoising approach
$$\min_{u\in\mathcal{A}}\, \{ \mathcal{J}(u) = \alpha|Du|(\Omega) + d(u,g) \},$$
where $\mathcal{A}$ is an admissible set of functions and $d(u, g)$ is the data fidelity term, that is an adequate distance function between $u$ and $g$, cf. Table 1.

Noise distribution       $P(g|u)$                                                           $d(u, g)$
Additive Laplace noise   $\propto e^{-\frac{1}{\sigma^2}\sum_{i,j}|g_{i,j}-u_{i,j}|}$        $\|u-g\|_1$
Poisson noise            $\prod_{i,j} \frac{u_{i,j}^{g_{i,j}}}{g_{i,j}!}\, e^{-u_{i,j}}$     $\int_\Omega (u - g\log u)\, dx$
Multiplicative noise     $g_{i,j} = n_{i,j}\cdot u_{i,j}$                                    a function that depends on $g/u$

Table 1: Noise models in the Bayesian framework and their corresponding data fidelity terms in the continuous variational model.

3.3 Image reconstruction in the context of inverting a linear operator

In this section we start with a general introduction to variational regularisation for inverse problems. A general (linear) inverse problem is: given measurements/data $g$ with
$$g = Tu + n,$$
where $T$ is a linear operator and $n$ possible noise, reconstruct a preferably noise-free image $u$. In the previous section we derived the variational model (3.1) for reconstructing $u$ in the discrete setting. In the following we will prove the well-posedness of the corresponding model in infinite dimensions. To do so, let $g \in L^2(\Omega)$ and let the linear operator $T$ fulfil the following assumptions:

(A1) $T : L^2(\Omega) \to L^2(\Omega)$ is a linear and continuous operator.
(A2) $T\chi_\Omega \ne 0$.

Examples of admissible $T$ fulfilling (A1) and (A2) are convolution operators $T = k*$, $T = \mathrm{Id}$ and $T = \chi_{\Omega\setminus D}$. Then we consider the following minimisation problem for reconstructing $u$:
$$\min_u\, \alpha|Du|(\Omega) + \frac12\|Tu-g\|_2^2, \tag{3.8}$$
and we have:

Theorem 3.14 (Existence of TV-regularised reconstructions). For $g \in L^2(\Omega)$ and an operator $T$ that fulfils assumptions (A1)-(A2), there exists a solution $u \in BV(\Omega)$ of the minimisation problem (3.8). If $T$ is injective, then the minimiser is unique.

Proof. Let $(u_n)$ be a minimising sequence for (3.8); then there exists a constant $C > 0$ such that
$$|Du_n|(\Omega) \le C \qquad \forall\, n \ge 1.$$
Next we prove that $\int_\Omega u_n\, dx \le C$ for all $n \ge 1$. Let
$$w_n = \frac{\int_\Omega u_n\, dx}{|\Omega|}\,\chi_\Omega \qquad \text{and} \qquad v_n = u_n - w_n.$$
Then $\int_\Omega v_n\, dx = 0$ and $Dv_n = Du_n$. Hence $|Dv_n|(\Omega) \le C$. Using the Poincaré-Wirtinger inequality, that is

Lemma 3.15 (Poincaré-Wirtinger inequality for BV functions). For $u \in BV(\Omega)$, let
$$u_\Omega := \frac{1}{|\Omega|}\int_\Omega u(x)\, dx.$$
Then there exists a constant $C > 0$ such that
$$\|u - u_\Omega\|_2 \le C\, |Du|(\Omega),$$

we obtain that $\|v_n\|_2 \le C$. We also have
$$C \ge \|Tu_n - g\|_2^2 = \|Tv_n + Tw_n - g\|_2^2 \ge \big(\|Tw_n\|_2 - \|Tv_n - g\|_2\big)^2 \ge \|Tw_n\|_2\big(\|Tw_n\|_2 - 2\|Tv_n - g\|_2\big) \ge \|Tw_n\|_2\big[\|Tw_n\|_2 - 2(\|T\|\cdot\|v_n\|_2 + \|g\|_2)\big].$$
Let $x_n = \|Tw_n\|_2$ and $a_n = \|T\|\cdot\|v_n\|_2 + \|g\|_2$. Then $x_n(x_n - 2a_n) \le C$, with $0 \le a_n \le \|T\|\cdot C + \|g\|_2 =: C'$ for all $n$. Hence we obtain
$$0 \le x_n \le a_n + \sqrt{a_n^2 + C} \le C'',$$
which implies
$$\|Tw_n\|_2 = \Big|\int_\Omega u_n\, dx\Big| \cdot \frac{\|T\chi_\Omega\|_2}{|\Omega|} \le C'' \qquad \forall\, n,$$
and thanks to (A2) we obtain that $\int_\Omega u_n\, dx$ is uniformly bounded. Again by the Poincaré-Wirtinger inequality we have
$$\Big\| u_n - \frac{\int_\Omega u_n\, dx}{|\Omega|} \Big\|_2 \le \mathrm{const}\cdot |Du_n|(\Omega) \le \mathrm{const}\cdot C.$$
Finally we obtain
$$\|u_n\|_2 \le \Big\| u_n - \frac{\int_\Omega u_n\, dx}{|\Omega|}\Big\|_2 + \Big\|\frac{\int_\Omega u_n\, dx}{|\Omega|}\Big\|_2 \le C'''.$$
Therefore $(u_n)$ is bounded in $L^2$ and in particular in $L^1$. Then $(u_n)$ is bounded in $BV(\Omega)$ and there exist a subsequence (still denoted $u_n$) and a $u \in BV(\Omega)$ such that $u_n \rightharpoonup u$ in $L^2$. Moreover, $Tu_n$ converges weakly to $Tu$ in $L^2$ by assumption (A1). With the l.s.c. properties of $|Du|(\Omega)$ and $\|Tu - g\|_2^2$ w.r.t. weak convergence in $L^2$, we are done.

If $T$ is injective, it follows from the strict convexity of $\|\cdot\|_2^2$ that the map $u \mapsto \|Tu - g\|_2^2$ is also strictly convex; hence the whole functional in (3.8) is strictly convex and the minimiser is unique. $\square$

Image deblurring

In the case of image deblurring the data model is given by
$$g = k * u + n, \tag{3.9}$$
where $k(x)$ is a suitable blurring kernel (also called point spread function (PSF)) and $n$ possible additive noise. One example is out-of-focus blur, which can be modelled as a convolution with a Gaussian kernel, that is
$$k(x) = G_\sigma(x) = \frac{1}{2\pi\sigma^2}\, e^{-\frac{|x|^2}{2\sigma^2}}. \tag{3.10}$$
In the case when $g$ is really a noisy and blurred version of the original image $u$, it is clear from our discussion so far that the reconstruction of $u$ (or rather of an image function that approximates the original $u$) is done by solving the generic problem (3.8) with $T$ being the convolution with the kernel $k$; the regularisation with the total variation is a way to eliminate the noise in the blurred image $g$. However, let us assume for a moment that the image $g$ in (3.9) does not contain any noise, i.e. it is just given by $g = k * u$. Then $T = k*$ is a linear operator and we could consider reconstructing $u$ by just inverting this operator. Unfortunately, we would be very ill-advised to attempt that, because the inversion of a blurring process is in general ill-posed. In particular, when considering the Gaussian blur model, deblurring the image would correspond to solving the heat equation backward in time, cf. (1.3), which we know is an unstable process. Hence, in either case regularisation is needed.

We start by specifying the generic total variation model (3.8) for deblurring on $\Omega = \mathbb{R}^2$, that is
$$u_\alpha = \mathrm{argmin}_u\, \alpha|Du|(\Omega) + \frac12\|k*u - g\|_2^2,$$
where
$$(k*u)(x) = \int_{\mathbb{R}^2} k(x-y)\, u(y)\, dy.$$

Figure 9: Domains for the setup of the variational deblurring problem (3.11).

To formulate this for an image on a bounded domain there are several approaches considered in the literature. One way, which considers bounded domains only, is as follows. Let $\Omega_0, \Omega', \Omega \subset \mathbb{R}^2$ be open and bounded domains. The blurry image $g \in L^2(\Omega_0)$ is defined on $\Omega_0$, and the blurring kernel $k$ is an $L^1$ function with compact support in $\Omega'$ and $\int k\, dx = 1$. Since $g$ has emerged from $u$ by convolution with $k$, for $g$ in $\Omega_0$ only information from $u$ in $\Omega$ and from $k$ in $\Omega'$ can be used.
Hence we have to assume that $\Omega$ is a domain such that
$$\Omega_0 - \Omega' = \{x - y \;:\; x \in \Omega_0,\ y \in \Omega'\} \subset \Omega.$$
Then the linear operator $T$ for this problem is defined for $x \in \Omega_0$ as
$$(Tu)(x) = (u * k)(x) = \int_{\Omega'} u(x-y)\, k(y)\, dy.$$
The operator $T$ is linear and continuous as an operator from $L^2(\Omega)$ to $L^2(\Omega_0)$, maps constant functions on $\Omega$ to constant functions on $\Omega_0$, and in particular fulfils assumptions (A1)-(A2). Hence, from Theorem 3.14, there exists a minimiser of the problem
$$\min_{u\in L^2(\Omega)}\, \alpha|Du|(\Omega) + \frac12\int_{\Omega_0} |u*k - g|^2\, dx \tag{3.11}$$
for every $\alpha > 0$. If $T$ is additionally injective, then the minimiser of (3.11) is unique. Examples of injective blurring kernels are the Gaussian kernel in (3.10) or simple averaging kernels.

Medical image reconstruction (MRI)

The ROF-type ($L^2$ fidelity) sparse reconstruction problem with sampling operator $S$ and Fourier transform operator $\mathcal{F}$ may then be written as
$$\min_u\, \alpha|Du|(\Omega) + \frac12\|g - S\mathcal{F}u\|_2^2. \tag{3.12}$$
Note that in this application the data $g$ is complex-valued, and so is the reconstructed "image" $u$.

Figure 10: Sub-sampling patterns employed. $N^{q\%}_\sigma$ denotes two-dimensional Gaussian sampling with variance $128\sigma$ and $q\%$ coverage of the $256 \times 256$ total space. $L^{q\%}_\sigma$ denotes one-dimensional Gaussian sampling of lines with variance $128\sigma$, while $S^{q\%}$ denotes a spiral pattern with $q\%$ coverage.

3.4 Image inpainting

An important task in image processing is the process of filling in missing parts of damaged images based on the information obtained from the surrounding areas. It is essentially a type of interpolation and is called inpainting. Let $f$ represent some given image defined on an image domain $\Omega$. Loosely speaking, the problem is to reconstruct the original image $u$ in the (damaged) domain $D \subset \Omega$, called the inpainting domain or a hole/gap (cf. Figure 11).

Figure 11: The inpainting task.
More precisely, let $\Omega \subset \mathbb{R}^2$ be an open and bounded domain with Lipschitz boundary, let $B_1, B_2$ be two Banach spaces with $B_2 \subseteq B_1$, let $f \in B_1$ denote the given image, and let $D \subset \Omega$ be the missing domain. A general variational approach in image inpainting is formulated mathematically as a minimisation problem for a regularised cost functional $\mathcal{J} : B_2 \to \mathbb{R}$,
$$\mathcal{J}(u) = \mathcal{R}(u) + \frac12\|\lambda(f-u)\|_{B_1}^2 \;\to\; \min_{u\in B_2}, \tag{3.13}$$
where $\mathcal{R} : B_2 \to \mathbb{R}$ and
$$\lambda(x) = \begin{cases} \lambda_0 & \Omega\setminus D \\ 0 & D \end{cases} \tag{3.14}$$
is the indicator function of $\Omega \setminus D$ multiplied by a constant $\lambda_0 \gg 1$ (whose size depends on how noisy the given image data is). This constant is the tuning parameter of the approach. As before, $\mathcal{R}(u)$ denotes the regularising term and represents certain a-priori information about the image $u$, i.e. it determines in which space the restored image lies. In the context of image inpainting, i.e. in the setting of (3.13)-(3.14), it plays the main role of filling the image content into the missing domain $D$, e.g. by diffusion and/or transport. The fidelity term $\frac12\|\lambda(f-u)\|_{B_1}^2$ of the inpainting approach forces the minimiser $u$ to stay close to the given image $f$ outside of the inpainting domain (how close depends on the size of $\lambda_0$). In this case the operator $T$ from the general approach equals the indicator function of $\Omega \setminus D$. In general we have $B_2 \subset B_1$, which signifies the smoothing effect of the regularising term on the minimiser $u \in B_2(\Omega)$.

Note that the variational approach (3.13)-(3.14) acts on the whole image domain $\Omega$ (a global inpainting model), instead of posing the problem on the missing domain $D$ only. This has the advantage of simultaneous noise removal in the whole image and makes the approach independent of the number and shape of the holes in the image. In this global model the boundary condition on $\partial D$ is imposed implicitly by the fidelity term.
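As an illustration of (3.13)-(3.14), the following numpy sketch performs gradient-descent inpainting with the smoothed total variation of Section 3.2 playing the role of $\mathcal{R}$. The choice of regulariser, the helper names and all parameter values are my own, not prescribed by the notes:

```python
import numpy as np

def grad(u):
    gx = np.zeros_like(u); gy = np.zeros_like(u)
    gx[:-1, :] = u[1:, :] - u[:-1, :]
    gy[:, :-1] = u[:, 1:] - u[:, :-1]
    return gx, gy

def div(px, py):
    dx = np.zeros_like(px)
    dx[0, :] = px[0, :]; dx[1:-1, :] = px[1:-1, :] - px[:-2, :]; dx[-1, :] = -px[-2, :]
    dy = np.zeros_like(py)
    dy[:, 0] = py[:, 0]; dy[:, 1:-1] = py[:, 1:-1] - py[:, :-2]; dy[:, -1] = -py[:, -2]
    return dx + dy

def inpaint(f, mask, lam0=10.0, alpha=1.0, eps=1e-2, tau=5e-3, n_iter=2000):
    """Gradient descent on (3.13)-(3.14) with R(u) = alpha * smoothed TV.
    mask is 1 on Omega \\ D (intact data) and 0 on the hole D."""
    lam = lam0 * mask          # the weight lambda(x) of (3.14)
    u = f * mask               # start from the known data, zero in the hole
    for _ in range(n_iter):
        gx, gy = grad(u)
        mag = np.sqrt(gx**2 + gy**2 + eps)
        u = u + tau * (alpha * div(gx / mag, gy / mag) + lam * (f - u))
    return u
```

Note how the fidelity term only acts where `mask` is 1: inside the hole the evolution is driven purely by the regulariser, which diffuses grey values inwards from the hole's boundary.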
Total variation inpainting

Starting from our standard imaging model so far, we are tempted to take the total variation as the regularising term $\mathcal{R}$ in (3.13). Total variation inpainting was proposed by Chan and Shen in [CS01a]. Within the same setting as (3.13)-(3.14), the inpainted image $u$ is recovered as a minimiser of
$$|Du|(\Omega) + \frac12\|\lambda(u-f)\|_2^2, \tag{3.15}$$
where $|Du|(\Omega)$ is the total variation of $u$ in $\Omega$. In the noise-free case, that is if we assume that $g|_{\Omega\setminus D}$ is completely intact, we can also formulate the following variational approach: assume that $g \in BV(\Omega)$ and seek the inpainted image $u^*$ that solves
$$\min_{\{u\in L^2(\Omega)\,:\ u|_{\Omega\setminus D} = g|_{\Omega\setminus D}\}} |Du|(\Omega). \tag{3.16}$$
Theorem 3.16. For an original image $g \in BV(\Omega)$, the minimisation problem (3.16) has a minimiser $u^* \in BV(\Omega)$.

Proof. We can rewrite the constrained problem (3.16) as the following unconstrained problem:
$$u^* = \mathrm{argmin}_u\Big\{ |Du|(\Omega) + \mathbb{1}_{\{v\in L^2(\Omega)\,:\ v|_{\Omega\setminus D} = g|_{\Omega\setminus D}\}}(u) \Big\},$$
where
$$\mathbb{1}_S(u) = \begin{cases} 0 & u\in S \\ +\infty & \text{otherwise.} \end{cases}$$
Then we can apply the direct method of calculus of variations, noting (or proving) that the indicator function is l.s.c. and using compactness properties in $L^2$ as before. $\square$

The disadvantage of the total variation approach in inpainting is that the level lines are interpolated linearly. This means that the direction of the level lines is not preserved, since they are connected by a straight line across the missing domain. This is due to the penalisation of the length of the level lines within the minimisation with a total variation regulariser, which connects level lines from the boundary of the inpainting domain via the shortest distance (linear interpolation). To see this, we take the level-line point of view. We want to derive another characterisation of the total variation in terms of the level sets of the image function $u$.

Definition 3.17.
Let $E \subset \Omega$ be a measurable set in $\mathbb{R}^2$. The set $E$ is called a set of finite perimeter if its characteristic function $\chi_E \in BV(\Omega)$. We write
$$\mathrm{Per}(E;\Omega) := |D\chi_E|(\Omega)$$
for the perimeter of $E$ in $\Omega$.

With the notion of sets of finite perimeter we have the following theorem.

Theorem 3.18 (Coarea formula). Let $u \in BV(\Omega)$ and, for $s \in \mathbb{R}$, let $\{u > s\}$ be the $s$-level set of $u$. Then one has
$$|Du|(\Omega) = \int_{-\infty}^{\infty} \mathrm{Per}(\{u>s\};\Omega)\, ds.$$
Proof. We will only present a sketch of the proof here, since the full version is very complicated and technical. The sketch is discussed in three steps.

• We start by considering affine functions $u(x) = p\cdot x$, $p \in \mathbb{R}^2$, on a simplex $\Omega = T$. Then the total variation of $u$ in $\Omega$ can easily be computed:
$$|Du|(T) = \sup\Big\{ -\int_T p\cdot x\; \mathrm{div}\,\varphi\, dx \;:\; \varphi\in C_c^\infty(T;\mathbb{R}^2),\ |\varphi(x)|\le 1\ \forall x \Big\} = |T|\,|p|.$$
On the other hand, the hypersurfaces $\partial\{u > s\}$ are the lines $\{p\cdot x = s\}$, and hence
$$\int_{-\infty}^{\infty} \mathrm{Per}(\{u>s\};T)\, ds = \int_{-\infty}^{\infty} \mathcal{H}^1\big(\{p\cdot x = s\}\cap T\big)\, ds = |T|\,|p|.$$
• Now, having proved the result for affine functions on simplices, the idea is to triangulate $\Omega$ and approximate a general function $u \in BV$ by piecewise affine functions $u_n$ on these simplices such that $\int_\Omega |\nabla u_n|\, dx \to |Du|(\Omega)$. Indeed, we can start this process by first using the following approximation theorem for functions of bounded variation.

Theorem 3.19. Let $\Omega \subset \mathbb{R}^2$ be a rectangular domain and $u \in BV(\Omega)$. Then there exists a sequence $(u_n) \subset C^\infty(\Omega) \cap W^{1,1}(\Omega)$ such that
(i) $u_n \to u$ in $L^1(\Omega)$,
(ii) $\int_\Omega |\nabla u_n|\, dx \to |Du|(\Omega)$ as $n \to \infty$.

Then we can approximate $u$ with smooth functions, and those (similarly to finite element theory) with piecewise affine functions. Using the l.s.c. of the total variation and Fatou's lemma we get
$$\int_{\mathbb{R}} \mathrm{Per}(\{u>s\};\Omega)\, ds \le |Du|(\Omega).$$
• The proof of the reverse inequality is straightforward by the following calculation. For admissible $\varphi$,
$$\int_\Omega u(x)\,\mathrm{div}\,\varphi(x)\, dx = \int_{\{u>0\}} \Big(\int_0^{u(x)} ds\Big) \mathrm{div}\,\varphi(x)\, dx - \int_{\{u<0\}} \Big(\int_{u(x)}^0 ds\Big) \mathrm{div}\,\varphi(x)\, dx$$
$$\underset{\text{Fubini}}{=} \int_0^{\infty}\!\!\int_\Omega \chi_{\{u>s\}}(x)\,\mathrm{div}\,\varphi(x)\, dx\, ds \;-\; \int_{-\infty}^{0}\!\!\int_\Omega \big(1-\chi_{\{u>s\}}(x)\big)\,\mathrm{div}\,\varphi(x)\, dx\, ds \;\underset{\int_\Omega \mathrm{div}\,\varphi\, dx = 0}{=}\; \int_{-\infty}^{\infty}\!\!\int_{\{u>s\}} \mathrm{div}\,\varphi\, dx\, ds,$$
and hence
$$-\int_\Omega u\,\mathrm{div}\,\varphi\, dx \le \int_{-\infty}^{\infty} \mathrm{Per}(\{u>s\};\Omega)\, ds.$$
Taking the sup over all admissible $\varphi$ on the left finishes the proof. $\square$

Now assume for a moment that the original image $g$ is smooth and that the missing part $g|_D$ is non-constant, that is, there exists a point $x_0$ close to $D$ where $\nabla g(x_0) \ne 0$. Then (by the implicit function theorem) level lines of $g$ in $D$ are well defined and distinguishable, and in particular the $s$-level line
$$g_s^{-1} = \Gamma_s := \{x\in\mathbb{R}^2 \;:\; g(x) = s\}$$
is uniquely labelled by the level $s$. Combining this with the coarea formula, we can derive the total variation inpainting problem for one level line $\Gamma_s$, which should be interpolated in $D$ across the two points $p_1 \in \partial D$ and $p_2 \in \partial D$, as
$$\min_{\{\gamma_s\,:\ \gamma_s(p_1)=\Gamma_s(p_1),\ \gamma_s(p_2)=\Gamma_s(p_2)\}} \int_{\gamma_s} ds \;=\; \min_{\{\gamma_s\,:\ \gamma_s(p_1)=\Gamma_s(p_1),\ \gamma_s(p_2)=\Gamma_s(p_2)\}} \mathrm{length}(\gamma_s).$$
This means that level lines are interpolated with straight lines. While a straight-line connection might still be pleasant for small holes, it is very unpleasant in the presence of larger gaps, even for simple images. Another consequence of the linear interpolation is that level lines might not be connected across large distances. A solution for this is the use of higher-order regularisation terms. One of the prototypes of higher-order variational inpainting methods is Euler's elastica inpainting, which will be the topic of the next section.

Euler's elastica inpainting

Based on the work of Nitzberg, Mumford & Shiota in '93 [NMS93], Masnou and Morel in '98 [Ma98, MM98] proposed the Euler elastica energy (which goes back to Euler in 1744) for the interpolation of level lines.
In particular, they proposed to connect two points $\Gamma_s(p_1)$ and $\Gamma_s(p_2)$, for $p_1, p_2 \in \partial D$, by an elastica curve, which is the one that solves
$$\min_{\gamma_s\in\mathcal{A}} \int_{\gamma_s} \big(\alpha + \beta\kappa^2\big)\, dt,$$
where $\kappa$ is the scalar curvature of $\gamma_s$, the parameters $\alpha$ and $\beta$ are positive constants, and $\mathcal{A}$ is the admissible set for the minimisation, defined by
$$\mathcal{A} = \{\gamma_s \;:\; \gamma_s(p_1)=\Gamma_s(p_1),\ \gamma_s(p_2)=\Gamma_s(p_2),\ \text{and normals } \vec n_1 \text{ and } \vec n_2 \text{ in } p_1 \text{ and } p_2\}.$$
The idea of this approach is to weight the length penalty against a smoothness penalty modelled by the curvature of the interpolating curve. Now, for all grey values $0 \le s \le 1$ we have corresponding elastica curves $\mathcal{F}^* = \{\Gamma_s : 0 \le s \le 1\}$, which minimise
$$E(\mathcal{F}) = \int_0^1 \int_{\gamma_s} \big(\alpha+\beta\kappa^2\big)\, dt\, ds$$
over $\mathcal{F} = \{\gamma_s : 0 \le s \le 1,\ \text{with appropriate boundary conditions}\}$. Potential problems, however, are that two level lines might intersect and that $\mathcal{F}^*$ does not have to sweep the entire inpainting domain. This motivated Chan, Kang and Shen in 2002 to formulate a corresponding functionalised model for Euler's elastica inpainting. For an admissible (for the moment smooth) inpainting $u$ on $D$ we represent the curvature of the level line $\gamma_s : u \equiv s$ by
$$\kappa = \mathrm{div}\,\vec n = \mathrm{div}\Big(\frac{Du}{|Du|}\Big).$$
Therefore we can write
$$\mathcal{J}(u) = E(\mathcal{F}) = \int_0^1\int_{\gamma_s:\, u=s} \big(\alpha+\beta\kappa^2\big)\, dt\, ds \;\underset{ds/dl = |Du|}{=}\; \int_0^1\int_{\gamma_s:\, u=s} \Big(\alpha+\beta\Big(\mathrm{div}\frac{Du}{|Du|}\Big)^2\Big)\, |Du|\, dl\, dt \;=\; \int_D \Big(\alpha+\beta\Big(\mathrm{div}\frac{Du}{|Du|}\Big)^2\Big)\, |Du|\; dx,$$
where $dl$ is the length element along the normal direction $\vec n$, which is orthogonal to $dt$.

Remark 3.20. Note that this derivation from the level-line formulation to the functionalised model works in the smooth (and non-constant) case only. A result as general as the coarea formula for the total variation does not exist for the curvature (a conjecture of De Giorgi).
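The level-line curvature $\kappa = \mathrm{div}(Du/|Du|)$ that drives this functional is easy to probe numerically. Below is a sketch with an $\varepsilon$-regularised norm so that flat regions give $\kappa \approx 0$ rather than a division by zero; the helper names and the tolerance are my own choices:

```python
import numpy as np

def grad(u):
    gx = np.zeros_like(u); gy = np.zeros_like(u)
    gx[:-1, :] = u[1:, :] - u[:-1, :]
    gy[:, :-1] = u[:, 1:] - u[:, :-1]
    return gx, gy

def div(px, py):
    dx = np.zeros_like(px)
    dx[0, :] = px[0, :]; dx[1:-1, :] = px[1:-1, :] - px[:-2, :]; dx[-1, :] = -px[-2, :]
    dy = np.zeros_like(py)
    dy[:, 0] = py[:, 0]; dy[:, 1:-1] = py[:, 1:-1] - py[:, :-2]; dy[:, -1] = -py[:, -2]
    return dx + dy

def curvature(u, eps=1e-8):
    """Discrete analogue of kappa = div( grad u / |grad u| )."""
    gx, gy = grad(u)
    mag = np.sqrt(gx**2 + gy**2 + eps)
    return div(gx / mag, gy / mag)
```

Sanity checks: a linear ramp has straight level lines, so $\kappa \approx 0$ away from the image boundary; a radial image $u(x) = |x|$ has circular level lines, so at radius $r$ the computed field should approximate $1/r$.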
From our considerations so far, the smoothness assumption on $u$ (which we need for the curvature to be well defined) is not a very realistic assumption for an image function. We would rather desire an inpainting approach that is defined for image functions even if they are discontinuous. This can be done by introducing a "weak" form of the curvature and, using this weak curvature, formulating Euler's elastica inpainting for image functions in $BV$. Let $g \in BV(\Omega)$ with $\int_{\partial D} |Dg| = 0$. The latter condition can be rewritten as
$$\int_{\partial D} |Dg| = \int_{\partial D} |g^+ - g^-|\; d\mathcal{H}^1 = 0,$$
which means that $g^+ = g^-$ a.e. (w.r.t. $\mathcal{H}^1$) along $\partial D$. In other words, we assume that $g$ does not jump across the boundary of $D$, that is, there is no essential overlap between $\partial D$ and image edges. Then we consider the minimisation of
$$\mathcal{J}(u) = \int_D \Big( \alpha + \beta\Big(\mathrm{div}\,\frac{Du}{|Du|}\Big)^2 \Big)\, |Du|\; dx \tag{3.17}$$
under the conditions
$$u|_{\Omega\setminus D} = g|_{\Omega\setminus D}, \qquad \int_{\partial D} |Du| = 0, \qquad |\kappa| < \infty \text{ a.e. (in the sense of the Hausdorff measure) along } \partial D.$$
The second and third conditions enforce a certain regularity of $u$ on the boundary $\partial D$, namely that $u$ does not have essential discontinuities across the boundary and that its curvature $\kappa$ is finite. Now, for a function $u \in BV(\Omega)$ the Euler elastica energy in (3.17) is still defined only formally, because a general $BV$ function lacks the necessary regularity for the curvature to be well defined. To be able to rigorously define (3.17) for functions of bounded variation we shall, in the following, introduce the weak form of curvature. For that, let us denote, for a function $u \in BV(D)$, by
$$d_u\nu(S) = |Du|(S), \qquad S\subset D,$$
the total variation of $u$ in a subset $S$ of $D$, which is a Radon measure on $D$. Let further $\mathrm{supp}(d_u\nu)$ be the support of the total variation measure. Then, for any $p \in \mathrm{supp}(d_u\nu)$ we have $d_u\nu(N_p) = |Du|(N_p) > 0$ for any small neighbourhood $N_p \subset D$ of $p$.
Now let
$$\rho_\sigma(x) = \frac{1}{\sigma^2}\,\rho\Big(\frac{x}{\sigma}\Big), \qquad u_\sigma = \rho_\sigma * u,$$
where $\rho$ is a fixed, radially symmetric, non-negative mollifier with compact support and unit integral (so that $\rho_\sigma \to \delta$ as $\sigma \to 0$). Then, for $p \in \mathrm{supp}(d_u\nu)$, we define the weak absolute curvature
$$\tilde\kappa(p) = \limsup_{\sigma\to 0}\; \Big|\mathrm{div}\Big(\frac{\nabla u_\sigma(p)}{|\nabla u_\sigma(p)|}\Big)\Big|,$$
where for those $\sigma$ which give $|\nabla u_\sigma(p)| = 0$ we define $\mathrm{div}(\nabla u_\sigma/|\nabla u_\sigma|) = \infty$. For any pixel $p$ outside the support $\mathrm{supp}(d_u\nu)$ we define $\tilde\kappa(p) = 0$, since $u$ is a.e. constant in a neighbourhood of $p$. With this concept of weak curvature, the functionalised elastica energy in (3.17) can now be rigorously defined for $BV$ functions $u$ with $\tilde\kappa \in L^2(D, d_u\nu)$. For more properties of the weak curvature and the equivalence between classical and weak curvature over certain classes of functions see [CKS02].

In what follows we will continue by deriving the first variation of the Euler elastica energy (3.17), and in turn an interpretation of its inpainting dynamics in terms of transport and diffusion of grey-value information in $g$.

Theorem 3.21. Let $\varphi \in C_c^1(\mathbb{R}, [0,\infty))$ and define
$$\mathcal{R}(u) = \int_\Omega \varphi(\kappa)\cdot|\nabla u|\; dx$$
for $u \in W^{2,1}(\Omega)$. Then the first variation of $\mathcal{R}$ over the set $C_c^\infty(\Omega)$ is given by
$$\nabla_u \mathcal{R} = -\mathrm{div}\,\vec V, \qquad \text{where}\qquad \vec V = \varphi(\kappa)\cdot\vec n - \frac{\vec t}{|\nabla u|}\,\frac{\partial}{\partial\vec t}\big(\varphi'(\kappa)\,|\nabla u|\big).$$
Here $\vec n = \nabla u/|\nabla u|$ and $\vec t \perp \vec n$.

Proof. Let $v \in C_c^\infty(\Omega)$ be a compactly supported test function on $\Omega$; then the first variation of $\mathcal{R}$ is defined, for $\tau \in \mathbb{R}$, by
$$\int_\Omega \nabla_u\mathcal{R}\cdot v\; dx = \frac{d}{d\tau}\,\mathcal{R}(u+\tau v)\Big|_{\tau=0}.$$
We first compute
$$\frac{d}{d\tau}\,\mathcal{R}(u+\tau v) = \int_\Omega \Big( \varphi(\kappa_{u+\tau v})\cdot\frac{\nabla(u+\tau v)}{|\nabla(u+\tau v)|}\cdot\nabla v + \varphi'(\kappa_{u+\tau v})\,\Big(\frac{d}{d\tau}\kappa_{u+\tau v}\Big)\,|\nabla(u+\tau v)| \Big)\, dx.$$
Then, we determine the derivative of $\kappa_{u + \tau v} = \operatorname{div}\left( \nabla(u + \tau v)/|\nabla(u + \tau v)| \right)$ as
\[
\frac{d}{d\tau} \operatorname{div} \frac{\nabla(u + \tau v)}{|\nabla(u + \tau v)|}
= \operatorname{div} \frac{\nabla v \, |\nabla(u + \tau v)| - \nabla(u + \tau v) \, \frac{(\nabla(u + \tau v))^T \nabla v}{|\nabla(u + \tau v)|}}{|\nabla(u + \tau v)|^2}
= \operatorname{div}\left( \frac{1}{|\nabla(u + \tau v)|} \left[ \mathrm{Id} - \frac{\nabla(u + \tau v)}{|\nabla(u + \tau v)|} \otimes \frac{\nabla(u + \tau v)}{|\nabla(u + \tau v)|} \right] \nabla v \right),
\]
where $\vec{a} \otimes \vec{a} = \vec{a} \vec{a}^T$ is the orthogonal projection onto the vector $\vec{a} \in \mathbb{R}^2$. Then,
\[
\frac{d}{d\tau} \mathcal{R}(u + \tau v) \Big|_{\tau = 0}
= \int_\Omega \left[ \phi(\kappa) \, \vec{n} \cdot \nabla v + \phi'(\kappa) \operatorname{div}\left( \frac{1}{|\nabla u|} [\mathrm{Id} - \vec{n} \otimes \vec{n}] \nabla v \right) |\nabla u| \right] dx,
\]
where we have used $\vec{n} = \nabla u / |\nabla u|$. The second term in the above sum can be rewritten by integration by parts as
\[
- \int_\Omega \frac{1}{|\nabla u|} [\mathrm{Id} - \vec{n} \otimes \vec{n}] \, \nabla\!\left( \phi'(\kappa) |\nabla u| \right) \cdot \nabla v \, dx
= \int_\Omega \operatorname{div}\left( [\mathrm{Id} - \vec{n} \otimes \vec{n}] \frac{1}{|\nabla u|} \nabla\!\left( \phi'(\kappa) |\nabla u| \right) \right) v \, dx.
\]
Integrating the first term by parts as well, and using $\vec{n} \otimes \vec{n} + \vec{t} \otimes \vec{t} = \mathrm{Id}$ and $\vec{t} \otimes \vec{t} \, \nabla f = \vec{t} \, \frac{\partial f}{\partial \vec{t}}$, we derive
\[
\int_\Omega \nabla_u \mathcal{R} \cdot v \, dx
= - \int_\Omega \operatorname{div}\left( \phi(\kappa) \vec{n} - \frac{\vec{t}}{|\nabla u|} \frac{\partial}{\partial \vec{t}}\left( \phi'(\kappa) |\nabla u| \right) \right) v \, dx
= - \int_\Omega \operatorname{div} \vec{V} \cdot v \, dx,
\]
which completes the proof.

Corollary 3.22. For the case of Euler elastica inpainting, i.e., $\phi(\kappa) = a + b\kappa^2$ for nonnegative constants $a, b$, the above theorem gives the following expression for the first variation of the respective regularising energy $\mathcal{R}$:
\[
\nabla_u \mathcal{R} = -\operatorname{div} \vec{V}, \qquad
\vec{V} = (a + b\kappa^2) \, \vec{n} - \frac{2b}{|\nabla u|} \frac{\partial (\kappa |\nabla u|)}{\partial \vec{t}} \, \vec{t}.
\]

3.5 Image segmentation with Mumford-Shah

Mumford and Shah [MS89] introduced in 1989 a segmentation model based on the idea of decomposing an image into piecewise smooth parts separated by an edge set $\Gamma$. As before, let $\Omega \subset \mathbb{R}^2$ be a rectangular domain and $f$ a given (possibly noisy) image. Further, define an edge set $\Gamma$ to be a relatively closed subset of $\Omega$ with finite one-dimensional Hausdorff measure. We search for a pair $(u, \Gamma)$ minimising
\[
E(u, \Gamma) = \frac{1}{2} \int_\Omega (u - f)^2 \, dx + \mathcal{J}(u, \Gamma), \tag{3.18}
\]
with
\[
\mathcal{J}(u, \Gamma) = \frac{\alpha}{2} \int_{\Omega \setminus \Gamma} |\nabla u|^2 \, dx + \beta \mathcal{H}^1(\Gamma). \tag{3.19}
\]
Here $\alpha$ and $\beta$ are nonnegative constants, and $\mathcal{H}^1(\Gamma)$ is the one-dimensional Hausdorff measure of $\Gamma$ (which is the length of $\Gamma$ if $\Gamma$ is regular). The Mumford-Shah model is inspired by earlier statistical approaches, e.g. [GG84, BZ87]. It aims to decompose a given image into its piecewise smooth part $u$ and its edge set $\Gamma$, where the former is measured with the $H^1$ norm and the latter by its length, or more generally by the one-dimensional Hausdorff measure $\mathcal{H}^1(\Gamma)$.

Reduced case $\alpha \to +\infty$: To study the reduced case of (3.19) as $\alpha \to +\infty$, we consider the minimisation of
\[
\tilde{E}(u, \Gamma) = \int_{\Omega \setminus \Gamma} (u - g)^2 \, dx + \beta \mathcal{H}^1(\Gamma),
\]
over the admissible set
\[
\mathcal{A} = \left\{ (u, \Gamma) : Du|_{\Omega \setminus \Gamma} = 0, \ \Gamma \text{ is closed in } \Omega \text{ and } \mathcal{H}^1(\Gamma) < \infty \right\}.
\]
Then, for $\Gamma \neq \emptyset$ fixed, a minimiser $u$ of $\tilde{E}$ over $\mathcal{A}$ is piecewise constant on the connected components of $\Omega \setminus \Gamma$. In particular, if $\Omega \setminus \Gamma = \bigcup_{i=1}^N \Omega_i$ is the unique decomposition into connected components, then for $u = u_i$ constant on $\Omega_i$ we have the least squares approximation
\[
\operatorname{argmin}_{u_i} \int_{\Omega_i} (g - u_i)^2 \, dx = \frac{1}{|\Omega_i|} \int_{\Omega_i} g \, dx =: \bar{g}|_{\Omega_i}.
\]
Inserting this into $\tilde{E}$, the problem for the edge set $\Gamma$ becomes
\[
\tilde{E}(\Gamma \,|\, g) = \int_{\Omega \setminus \Gamma} (g - \bar{g}_\Gamma)^2 \, dx + \beta \mathcal{H}^1(\Gamma),
\]
where
\[
\bar{g}_\Gamma := \sum_{i=1}^N \bar{g}|_{\Omega_i} \, \mathbb{1}_{\Omega_i}(x), \qquad x \in \Omega.
\]
This is a model that seeks object boundaries $\Gamma$ such that in each connected component of $\Omega \setminus \Gamma$ the given image $g$ is approximately constant. This model was rediscovered by Chan & Vese in 2001 under the name "Active Contours without Edges".

Well-posedness of the scheme: Going back to the general case, we attempt to prove existence of minimisers via the direct method of calculus of variations. To do so we must find a topology that ensures both compactness of the minimising sequence and lower semicontinuity of $E$. However, we face the following difficulty.

Theorem 3.23 (Ambrosio 1989).
Let $E$ be a Borel set of $\mathbb{R}^N$ with boundary $\partial E$. Then the map $E \mapsto \mathcal{H}^{N-1}(\partial E)$ is not lower semicontinuous with respect to any compact topology.

The proof of this theorem is very involved and goes beyond the scope of this course. However, let us understand it by means of an example.

Example 3.24 (Lack of l.s.c. of the Hausdorff measure). Let $(x_i)$ be the sequence of all rational points in $\mathbb{R}^N$ and let
\[
B_i = \{ x \in \mathbb{R}^N : |x - x_i| \le 2^{-i} \}, \qquad E_k = \bigcup_{i=0}^k B_i, \qquad E = \bigcup_{i=0}^\infty B_i.
\]
Moreover, let $|E|$ be the Lebesgue measure of $E$ and $\omega_N = |B_1(0)|$ the Lebesgue measure of the unit ball in $\mathbb{R}^N$. Then, for $N \ge 1$,
\[
|E| = \Big| \bigcup_{i=0}^\infty B_i \Big| \le \sum_{i=0}^\infty |B_i| = \sum_{i=0}^\infty 2^{-iN} \, \omega_N = \frac{\omega_N}{1 - 2^{-N}} < +\infty.
\]
But then, since the rationals are dense in $\mathbb{R}^N$, we have $\operatorname{cl}(E) = \mathbb{R}^N$, and hence $\partial E = \operatorname{cl}(E) \setminus E$ has infinite Lebesgue measure. Therefore, also $\mathcal{H}^{N-1}(\partial E) = +\infty$. On the other hand, for $N \ge 2$,
\[
\mathcal{H}^{N-1}(\partial E_k) = \mathcal{H}^{N-1}\Big( \partial \bigcup_{i=0}^k B_i \Big) \le \mathcal{H}^{N-1}\Big( \bigcup_{i=0}^k \partial B_i \Big) = \sum_{i=0}^k N \omega_N \, 2^{-i(N-1)} \le \frac{N \omega_N}{1 - 2^{-(N-1)}},
\]
using that the surface area of a ball of radius $r$ in $\mathbb{R}^N$ is $N \omega_N r^{N-1}$. Hence, for $N \ge 2$ the sequence $\{ \mathcal{H}^{N-1}(\partial E_k) \}$ is bounded and $E_k \to E$ as $k \to +\infty$ in the sense of measures, that is $\chi_{E_k} \to \chi_E$ in $L^1$. But $\mathcal{H}^{N-1}(\partial E) > \liminf_k \mathcal{H}^{N-1}(\partial E_k)$.

Hence, to obtain a well-defined imaging model we shall introduce a relaxed version of (3.19), for which existence of solutions can be proven in the space of so-called special functions of bounded variation $SBV(\Omega)$ [AFP00], defined below.

Let us start the well-posedness discussion of (3.18) by rephrasing the problem within a more regular setting (cf. [CS05a]) than the one of the relaxed functional we will end up with. This provides better ground to motivate the idea the relaxation is built on. Let $u$ be piecewise smooth, i.e. $u$ is defined on a partition $\Omega = \bigcup_{k=1}^K \Omega_k \cup \Gamma$ with $u|_{\Omega_k} \in H^1(\Omega_k)$ for $k = 1, \ldots, K$, and with piecewise $C^1$ edge set $\Gamma$.
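Coming back to the reduced model $\tilde{E}(\Gamma \,|\, g)$ above: once the partition into connected components is fixed, the optimal $u$ is the per-region mean $\bar{g}|_{\Omega_i}$. A minimal numerical sketch (the label-array encoding of the partition is our own choice, not part of the notes):

```python
import numpy as np

def piecewise_constant_fit(g, labels):
    # For a fixed partition (encoded as one integer label per pixel),
    # the fidelity term is minimised by the mean of g on each region.
    u = np.zeros_like(g, dtype=float)
    for lab in np.unique(labels):
        region = labels == lab
        u[region] = g[region].mean()
    return u

g = np.array([[0.0, 0.2, 0.9, 1.1],
              [0.1, 0.1, 1.0, 1.0]])
labels = np.array([[0, 0, 1, 1],
                   [0, 0, 1, 1]])
u = piecewise_constant_fit(g, labels)   # region means 0.1 and 1.0
```

This is exactly the inner minimisation of the Chan-Vese type model: the outer problem then optimises over the partition itself.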
Then one can define a piecewise continuous vector field $\vec{n}$ at each point $x \in \Gamma$ that is part of a $C^1$ component, and with that the jump of $u$ at $x$ as
\[
[u](x) = \lim_{\rho \to 0^+} u(x + \rho \, \vec{n}(x)) - u(x - \rho \, \vec{n}(x)),
\]
where the limit for $\rho \to 0^+$ is taken in the trace sense. With that we can define a vectorial measure on $\Gamma$ as
\[
J_u := [u] \, \vec{n} \, d\mathcal{H}^1, \qquad \text{with} \quad J_u(\gamma) = \int_\gamma [u](x) \, \vec{n}(x) \, d\mathcal{H}^1 \quad \forall \gamma \in \mathcal{B}(\Gamma).
\]
In this setting one can easily check that the restriction of the distributional derivative $Du$ to the edge set $\Gamma$ equals $J_u$. Moreover, on the components $\Omega_k$ where $u$ is $H^1$ one has $Du = \nabla u \in L^2(\Omega_k)$, and hence
\[
Du = \nabla u|_{\Omega \setminus \Gamma} + J_u|_\Gamma.
\]
This is our key observation: instead of defining the edge set $\Gamma$ separately from $u$, we rather capture it within the jump set of $u$, i.e. the set in which $Du$ is singular and one-dimensional. The problem with assuming $\Gamma \in C^1$ (or even piecewise $C^1$) is that this space does not provide us with any compactness property, and as a consequence no existence proof. Hence, we have to loosen our assumption on $\Gamma$ and prove existence in a less restrictive (i.e. larger) function space. This space will turn out to involve the space of functions of bounded variation $BV(\Omega)$, which provides sufficient compactness and semicontinuity properties and gives sense to one-dimensional discontinuities (edges) of functions. The latter becomes clear when recalling some facts about functions of bounded variation, such as that the distributional derivative $Du$ of a $BV$ function can be written as the sum of its absolutely continuous part $\nabla u \, dx$, its jump part $J_u$ and its Cantor part $C_u$, i.e.
\[
Du = \nabla u \, dx + \underbrace{(u^+ - u^-) \, \vec{n}_u \, \mathcal{H}^1|_{S_u}}_{J_u} + C_u, \tag{3.20}
\]
cf. the next paragraph for the details of this decomposition.

Lebesgue decomposition of $Du$: Let $u \in BV(\Omega)$. Then, from the general theory of the Lebesgue decomposition of measures, cf. e.g. [AFP00, p.
14, Theorem 1.28], we have that
\[
Du = \nabla u \, dx + D^s u,
\]
where $\nabla u = \frac{d(Du)}{dx} \in L^1(\Omega)$ is the absolutely continuous part of $Du$ and $D^s u \perp dx$ is the singular part of $Du$. The latter can be further decomposed into a jump part $J_u$ and a Cantor part $C_u$, cf. [AFP00, Section 3.9]. Before we specify what these parts are exactly, we have to introduce some additional terminology. For $\lambda \in \mathbb{R}$, $z \in \Omega$, and a small $\rho > 0$, we define the following subsets of the disc $B_{z,\rho} = \{ x : |x - z| < \rho \}$:
\[
\{u > \lambda\}_{z,\rho} := \{ x \in \Omega \cap B_{z,\rho} : u(x) > \lambda \}, \qquad
\{u < \lambda\}_{z,\rho} := \{ x \in \Omega \cap B_{z,\rho} : u(x) < \lambda \}.
\]

Definition 3.25. We call a function $u$ essentially not greater than $\lambda$ in a point $x \in \Omega$, and write $u(x) \preceq \lambda$, if
\[
\lim_{\rho \to 0^+} \frac{dx(\{u > \lambda\}_{x,\rho})}{dx(B_{x,\rho})} = 0,
\]
and analogously, $u$ is essentially not smaller than $\lambda$ in $x$, written $u(x) \succeq \lambda$, if
\[
\lim_{\rho \to 0^+} \frac{dx(\{u < \lambda\}_{x,\rho})}{dx(B_{x,\rho})} = 0.
\]
Then, we define the approximate upper and lower limit of a measurable function $u$ in $\Omega$ as
\[
u^+(x) := \inf\{ \lambda \in \mathbb{R} : u(x) \preceq \lambda \}, \qquad
u^-(x) := \sup\{ \lambda \in \mathbb{R} : u(x) \succeq \lambda \},
\]
respectively. For a function $u \in L^1(\Omega)$ we have
\[
\lim_{\rho \to 0} \frac{1}{dx(B_{x,\rho})} \int_{B_{x,\rho}} |u(x) - u(y)| \, dy = 0 \qquad \text{for a.e. } x \in \Omega.
\]
Points $x$ for which the above holds are called Lebesgue points of $u$; they have the properties
\[
u(x) = \lim_{\rho \to 0} \frac{1}{dx(B_{x,\rho})} \int_{B_{x,\rho}} u(y) \, dy, \qquad u(x) = u^+(x) = u^-(x).
\]
The complement of the set of Lebesgue points (up to a set of $\mathcal{H}^1$ measure zero) is called the jump set $S_u$, i.e.
\[
S_u = \{ x \in \Omega : u^-(x) < u^+(x) \}.
\]
The set $S_u$ is countably rectifiable, and for $\mathcal{H}^1$-a.e. $x \in \Omega$ we can define a normal $\vec{n}_u(x)$. These considerations lead us to the decomposition (3.20) of the distributional derivative $Du$. The idea now is, similar to the $C^1$ case discussed before, to identify the edge set $\Gamma$ with $S_u$. Hence, instead of (3.18) we minimise
\[
E(u) = \frac{1}{2} \int_{\Omega \setminus D} (u - f)^2 \, dx + \frac{\alpha}{2} \int_\Omega |\nabla u|^2 \, dx + \beta \mathcal{H}^1(S_u).
\]
Solving this would allow us to eliminate the unknown $\Gamma$ from the minimisation problem. The issue, however, is that we cannot do this in $BV$. The objectionable element of $BV$ is the Cantor part $C_u$ in the decomposition (3.20). For a function $u \in BV$ this part may contain pathological functions such as the Cantor-Vitali function, which make the minimisation problem ill-posed, cf. [Am89]. The Cantor-Vitali function is non-constant and continuous, but with approximate differential equal to zero almost everywhere; for such a function $v$ we have $\nabla v = 0$ a.e. and $S_v = \emptyset$, so only the fidelity term survives:
\[
E(v) = \frac{1}{2} \int_{\Omega \setminus D} (v - f)^2 \, dx \ \ge \ \inf_{u \in BV(\Omega)} E(u) = 0,
\]
since such $BV$ functions are dense in $L^2$. But this means that the infimum cannot be achieved in general. To exclude this case, we consider the space of special functions of bounded variation $SBV(\Omega)$, which is the space of $BV$ functions with $C_u = 0$. Then, our new problem replacing (3.18) reads
\[
\min \{ E(u) : u \in SBV(\Omega) \cap L^\infty(\Omega) \}. \tag{3.21}
\]
For the relaxed problem (3.21) we have the following existence result.

Theorem 3.26. Let $f \in L^\infty(\Omega)$. Then the minimisation problem (3.21) admits a solution.

To prove Theorem 3.26 we will use the following compactness and closure results for $SBV$ functions, cf. [AFP00, Section 4].

Theorem 3.27 (Closure of SBV). Let $\Omega \subset \mathbb{R}^d$ be open and bounded, and let $(u_n) \subset SBV(\Omega)$ with
\[
\sup_n \left\{ \int_\Omega |\nabla u_n|^2 \, dx + \int_{S_{u_n}} |u_n^+ - u_n^-| \, d\mathcal{H}^{d-1} \right\} < \infty. \tag{3.22}
\]
If $(u_n)$ weakly* converges in $BV$ to $u$, then $u \in SBV(\Omega)$, $\nabla u_n$ weakly converges to $\nabla u$ in $[L^2(\Omega)]^d$, and $D^j u_n$ weakly* converges to $D^j u$ in $\Omega$. Moreover,
\[
\int_\Omega |\nabla u|^2 \, dx \le \liminf_{n \to \infty} \int_\Omega |\nabla u_n|^2 \, dx, \qquad
\int_{S_u} |u^+ - u^-| \, d\mathcal{H}^{d-1} \le \liminf_{n \to \infty} \int_{S_{u_n}} |u_n^+ - u_n^-| \, d\mathcal{H}^{d-1}. \tag{3.23}
\]

Theorem 3.28 (Compactness of SBV). Let $\Omega \subset \mathbb{R}^d$ be open and bounded, and let $(u_n) \subset SBV(\Omega)$. Assume that $(u_n)$ satisfies (3.22) and $|u_n(x)| \le C$ for a.e. $x \in \Omega$, for a constant $C \ge 0$ and all $n \ge 1$.
Then, there exists a subsequence $(u_{n(k)})$ weakly* converging in $BV(\Omega)$ to a $u \in SBV(\Omega)$ with $|u(x)| \le C$ for a.e. $x \in \Omega$.

Proof of Theorem 3.26. Let $(u_n) \subset SBV(\Omega) \cap L^\infty(\Omega)$ be a minimising sequence of $E$. First, we convince ourselves that we can restrict the minimisation problem to functions $u$ that are essentially bounded by $C = \|f\|_{L^\infty}$. This is because for the truncated version $\tilde{u} = \max(\min(u, C), -C)$ of $u$ we have $S_{\tilde{u}} \subset S_u$ and
\[
\alpha \int_\Omega |\nabla \tilde{u}|^2 \, dx + \int_\Omega |f - \tilde{u}|^2 \, dx \le \alpha \int_\Omega |\nabla u|^2 \, dx + \int_\Omega |f - u|^2 \, dx.
\]
Then, for such a minimising sequence we have the uniform bound
\[
E(u_n) = \frac{1}{2} \int_{\Omega \setminus D} (u_n - f)^2 \, dx + \frac{\alpha}{2} \int_\Omega |\nabla u_n|^2 \, dx + \beta \mathcal{H}^1(S_{u_n}) \le C,
\]
for a constant $C \ge 0$ and all $n \ge 1$. By Theorem 3.28 we can find a subsequence $(u_{n(k)})$ that weakly* converges in $BV$ to a $u \in SBV(\Omega)$ with $|u(x)| \le C$ for a.e. $x \in \Omega$. Moreover, by Theorem 3.27, $\nabla u_n$ weakly converges to $\nabla u$ in $(L^2(\Omega))^d$ and $D^j u_n$ weakly* converges to $D^j u$ in $\Omega$. Applying the lower semicontinuity properties (3.23) finishes the existence proof.

Having established the existence theory for the relaxed problem (3.21), the question arises what exactly the connection between the relaxed and the original formulation (3.18) is. To answer this we make use of the following theorem from [Am89a].

Theorem 3.29. Let $\Gamma \subset \Omega$ be a closed set such that $\mathcal{H}^{d-1}(\Gamma) < \infty$, and let $u \in H^1(\Omega \setminus \Gamma) \cap L^\infty(\Omega)$. Then $u \in SBV(\Omega)$ and $S_u \subset \Gamma \cup P$ with $\mathcal{H}^{d-1}(P) = 0$. Hence,
\[
\min_u E(u) \le \inf_{(u, \Gamma)} E(u, \Gamma).
\]
Moreover, for a minimiser of $E$ it is proven in [DCL89, DMS92, MS95] that $\mathcal{H}^{d-1}(\Omega \cap (\bar{S}_u \setminus S_u)) = 0$. Then, by choosing $\Gamma = \Omega \cap \bar{S}_u$ we get a solution of the original problem and
\[
\min_u E(u) = \min_u E(u, \Omega \cap \bar{S}_u).
\]
Following the existence theory, there is a series of works concerned with the regularity of the edge set $\Gamma$, cf. e.g. [MS89, Bo96].
In practice the edge set $\Gamma$ is mostly assumed to be at least Lipschitz continuous, in which case the Hausdorff measure as a regularity measure of $\Gamma$ is replaced by the length of $\Gamma$ as defined in (3.3).

4 PDEs in Imaging

In Section 1.4 we have already seen that for processing images with PDE methods one should turn to a nonlinear (or anisotropic) approach. In what follows we discuss two nonlinear PDEs that have been proposed for image enhancement: the Perona-Malik model and the anisotropic diffusion PDEs studied by Joachim Weickert.

4.1 Perona-Malik

One of the most important and classical nonlinear PDEs for image enhancement is the Perona-Malik model, proposed by Perona and Malik in 1990 [PM90]. The basic idea behind this model is to use a nonlinear diffusion that reduces the diffusivity at those locations in the image that are more likely to be edges. This likelihood is measured by $|\nabla u|$, the size of the gradient of an image function $u$. For a given image $g$, the Perona-Malik filter is based on the equation
\[
u_t = \operatorname{div}\left( c(|\nabla u|^2) \nabla u \right), \qquad u(x, t = 0) = g(x), \tag{4.1}
\]
and it uses diffusivities such as
\[
c(s^2) = \frac{1}{1 + s^2 / \lambda^2}, \qquad \lambda > 0. \tag{4.2}
\]
Here $\lambda$ is a given threshold that encodes the size of edges that should be preserved or enhanced by the diffusion process. This equation applies diffusion and edge detection in one single process.

Smoothing and edge enhancement in 1D: Let us start by studying the behaviour of (4.1) in one space dimension. This simplifies notation and illustrates the main behaviour, since near a straight edge a two-dimensional image approximates a function of one variable. In that case, (4.1) can be rewritten as
\[
u_t = \left( V(u_x) \right)_x,
\]
where – for our chosen diffusivity (4.2) – the flux function $V$ is given by
\[
V(s) = s \cdot c(s^2).
\]
Then
\[
V'(s) = c(s^2) + s \left( c(s^2) \right)', \qquad \text{with} \quad \left( c(s^2) \right)' = \left( \frac{1}{1 + s^2/\lambda^2} \right)' = - \frac{2s/\lambda^2}{(1 + s^2/\lambda^2)^2},
\]
which gives
\[
V'(s) = \frac{1}{1 + s^2/\lambda^2} \Bigg( 1 - \underbrace{\frac{s^2}{\lambda^2} \cdot \frac{2}{1 + s^2/\lambda^2}}_{\le 1 \ \text{if } |s| \le \lambda} \Bigg).
\]
Hence,
\[
V'(s) \ge 0 \quad \text{if } |s| \le \lambda, \qquad V'(s) < 0 \quad \text{if } |s| > \lambda.
\]
Now, we can rewrite (4.1) as
\[
u_t = V'(u_x) \cdot u_{xx}
\]
and observe that the Perona-Malik model is of forward parabolic type if $|u_x| \le \lambda$ and of backward parabolic type if $|u_x| > \lambda$. This means that $\lambda$ plays the role of a contrast parameter that separates forward (low contrast) from backward (high contrast) diffusion areas, see Figure 12.

Figure 12: The Perona-Malik model performs forward and backward diffusion in areas with low and high contrast, respectively.

Smoothing and edge enhancement in 2D: A similar argument can be made in two dimensions, where the diffusion dynamics (in points where $|\nabla u| \neq 0$) can be divided into diffusion in normal and tangential direction to the level lines of $u$. In that case
\[
u_t = \operatorname{div}\left( c(|\nabla u|^2) \nabla u \right)
= c(|\nabla u|^2)(u_{xx} + u_{yy}) + 2 c'(|\nabla u|^2) \left( u_x^2 u_{xx} + 2 u_x u_y u_{xy} + u_y^2 u_{yy} \right).
\]
Now, for each point $x$ where $|\nabla u(x)| \neq 0$ we can define the normal and tangential vectors $\vec{N}$ and $\vec{T}$, respectively, as
\[
\vec{N}(x) = \frac{\nabla u(x)}{|\nabla u(x)|}, \qquad \vec{T}(x) \perp \vec{N}(x) \ \text{with} \ |\vec{T}(x)| = 1.
\]
Then, we can rewrite (4.1) in terms of diffusion in $\vec{N}$ and $\vec{T}$ direction as
\[
u_t = c(|\nabla u|^2) \, u_{TT} + \left( c(|\nabla u|^2) + 2 |\nabla u|^2 c'(|\nabla u|^2) \right) u_{NN}, \tag{4.3}
\]
where
\[
u_{TT} = \vec{T}^t \nabla^2 u \, \vec{T} = \frac{1}{|\nabla u|^2} \left( u_x^2 u_{yy} + u_y^2 u_{xx} - 2 u_x u_y u_{xy} \right), \qquad
u_{NN} = \vec{N}^t \nabla^2 u \, \vec{N} = \frac{1}{|\nabla u|^2} \left( u_x^2 u_{xx} + u_y^2 u_{yy} + 2 u_x u_y u_{xy} \right).
\]

Directional smoothing in 2D: From the representation of the diffusion dynamics in terms of normal and tangential direction in (4.3), it suggests itself to choose $c$ in such a way that we smooth more in tangential direction $\vec{T}$ (that is, along edges) and less in normal direction $\vec{N}$. This could be imposed by choosing $c$ such that
\[
\lim_{s \to +\infty} \frac{c(s) + 2 s c'(s)}{c(s)} = 0, \qquad \text{or equivalently} \qquad \lim_{s \to +\infty} \frac{s c'(s)}{c(s)} = - \frac{1}{2}.
\]
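The Perona-Malik evolution (4.1)-(4.2) can be sketched with a minimal explicit finite-difference scheme (grid spacing 1, four-neighbour fluxes, homogeneous Neumann boundary via edge padding; step size and parameters are illustrative, not tuned):

```python
import numpy as np

def perona_malik(g, lam=2.0, tau=0.2, steps=50):
    # Explicit scheme for u_t = div(c(|grad u|^2) grad u) with
    # c(s^2) = 1/(1 + s^2/lambda^2); Neumann boundary via edge padding.
    c = lambda d: 1.0 / (1.0 + (d / lam) ** 2)
    u = g.astype(float).copy()
    for _ in range(steps):
        p = np.pad(u, 1, mode="edge")
        dN = p[:-2, 1:-1] - u   # one-sided differences to the
        dS = p[2:, 1:-1] - u    # four nearest neighbours
        dW = p[1:-1, :-2] - u
        dE = p[1:-1, 2:] - u
        u = u + tau * (c(dN) * dN + c(dS) * dS + c(dW) * dW + c(dE) * dE)
    return u

rng = np.random.default_rng(0)
noisy = 0.5 + 0.05 * rng.standard_normal((32, 32))
smoothed = perona_malik(noisy)
```

Because the pairwise fluxes cancel and the boundary flux vanishes, the scheme preserves the mean grey value; with the low-contrast noise above (gradients well below λ) it acts as forward diffusion and reduces the variance.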
Restricting ourselves to functions $c(s) > 0$ with power growth, this limit implies
\[
c(s) \approx \frac{1}{\sqrt{s}} \qquad \text{as } s \to +\infty.
\]
One example of such a $c$ is
\[
c(s) = \frac{1}{\sqrt{1 + s}}.
\]

A comment on the well-posedness of Perona-Malik: For most interesting choices of $c$, such as (4.2), the Perona-Malik equation is ill-posed (due to the backward diffusion). This intrinsic ill-posedness of the equation is what makes it work for edge enhancement. A way to derive a well-posed model from the Perona-Malik equation is to regularise the nonlinearity in (4.1). An example is the work [CLMC92], where the authors consider
\[
u_t = \operatorname{div}\left( c(|\nabla(G_\sigma * u)|^2) \nabla u \right), \qquad u(x, t = 0) = g(x),
\]
and prove the following theorem.

Theorem 4.1. Let $\phi(s) = c(s^2)$, with $\phi : \mathbb{R}^+ \to \mathbb{R}^+$ smooth and decreasing, $\phi(0) = 1$, $\lim_{s \to +\infty} \phi(s) = 0$, and $s \mapsto \phi(\sqrt{s})$ smooth. If $g \in L^2(\Omega)$, then there exists a unique function $u(t, x) \in C([0, T]; L^2(\Omega)) \cap L^2(0, T; H^1(\Omega))$ satisfying, in the distributional sense,
\[
u_t = \operatorname{div}\left( \phi(|\nabla(G_\sigma * u)|) \nabla u \right) \quad \text{in } (0, T) \times \Omega, \qquad
\nabla u \cdot \nu = 0 \quad \text{on } (0, T) \times \partial\Omega, \qquad
u(x, t = 0) = g(x) \quad \text{in } \Omega.
\]
Moreover, $\|u\|_{L^\infty((0,T); L^2(\Omega))} \le \|g\|_{L^2}$, and $u \in C^\infty((0, T) \times \operatorname{cl}(\Omega))$.

4.2 Anisotropic diffusion filters

In the Perona-Malik model (4.1) the diffusion is weighted by a scalar diffusivity $c$ that only determines the strength of the diffusion but cannot change its direction. In this respect, the diffusion is nonlinear but isotropic. In this section we shall derive another image enhancement method, based on an anisotropic diffusion equation that takes into account local variations of the gradient orientation and can potentially assign the diffusion a certain orientation. A natural idea is to choose as a preferred smoothing direction $\vec{d}$ the one that minimises grey value fluctuations.
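The decay condition $\lim_{s \to +\infty} s c'(s)/c(s) = -1/2$ from the directional-smoothing discussion above can be checked numerically for the example $c(s) = 1/\sqrt{1+s}$; the finite-difference step below is an ad hoc choice of ours:

```python
def c(s):
    # example diffusivity with the desired decay c(s) ~ 1/sqrt(s)
    return (1.0 + s) ** -0.5

def log_derivative_ratio(s, h=1e-3):
    # central-difference approximation of s * c'(s) / c(s);
    # the exact value for this c is -s / (2 * (1 + s)) -> -1/2
    dc = (c(s + h) - c(s - h)) / (2.0 * h)
    return s * dc / c(s)
```

For this $c$ one computes $c'(s) = -\tfrac{1}{2}(1+s)^{-3/2}$, hence $s c'(s)/c(s) = -s/(2(1+s))$, which indeed tends to $-1/2$.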
We consider $\vec{d}(\theta) = (\cos\theta, \sin\theta)$ and
\[
F(\theta) = |\vec{d}(\theta) \cdot \nabla u(x)|^2
\begin{cases}
\text{maximal} & \text{if } \vec{d} \parallel \nabla u, \\
\text{minimal} & \text{if } \vec{d} \perp \nabla u.
\end{cases}
\]
Now, minimising/maximising $F(\theta)$ is equivalent to minimising/maximising the quadratic form
\[
\vec{d}^{\,t} \, \nabla u \, (\nabla u)^t \, \vec{d},
\]
where
\[
\nabla u \, (\nabla u)^t = \begin{pmatrix} u_x^2 & u_x u_y \\ u_x u_y & u_y^2 \end{pmatrix}
\]
is positive semidefinite with eigenvalues
\[
\lambda_1 = |\nabla u|^2, \qquad \lambda_2 = 0,
\]
and an orthonormal basis of eigenvectors $v_1 \parallel \nabla u$, $v_2 \perp \nabla u$. Having this, the idea is to define at each point $x$ in the image an orientation descriptor as a function of $(\nabla u (\nabla u)^t)(x)$. The problem with this idea is that it constitutes a pointwise estimate, which does not take into account possible information contained in a neighbourhood of $x$ (and as such makes the descriptor very sensitive to spurious changes in the image). To tackle this shortcoming, Weickert [Wei98] proposed the introduction of smoothing kernels at different scales into the orientation descriptor – the result he called the structure tensor of the image. In sketch form his approach consists of the following steps.

The structure tensor:

• To avoid false detections due to noise, the image $u$ is first convolved with a Gaussian kernel, that is $u_\sigma(x) := (G_\sigma * u)(x)$, $\sigma \ge 0$.

• Further, local orientation information is averaged by building the so-called structure tensor
\[
J_\rho(\nabla u_\sigma) := G_\rho * \left( \nabla u_\sigma \, (\nabla u_\sigma)^t \right), \qquad \rho \ge 0.
\]

• Interpreting the information in $J_\rho$ in terms of its eigenvalues and eigenvectors one derives a way of manipulating the structure tensor. Namely, with $J_\rho(\nabla u_\sigma) = (j_{ij})$, a pair of orthonormal eigenvectors $v_1, v_2$ is given by
\[
v_1 \parallel \begin{pmatrix} 2 j_{12} \\ j_{22} - j_{11} + \sqrt{(j_{11} - j_{22})^2 + 4 j_{12}^2} \end{pmatrix}, \qquad v_2 \perp v_1,
\]
with corresponding eigenvalues
\[
\mu_1 = \frac{1}{2} \left( j_{11} + j_{22} + \sqrt{(j_{11} - j_{22})^2 + 4 j_{12}^2} \right), \qquad
\mu_2 = \frac{1}{2} \left( j_{11} + j_{22} - \sqrt{(j_{11} - j_{22})^2 + 4 j_{12}^2} \right).
\]
Then, the eigenvalues $\mu_1$ and $\mu_2$ describe the average contrast of the smoothed image function $u_\sigma$ within a neighbourhood of size $O(\rho)$, and the eigenvectors $v_1$ and $v_2$ give the orientation maximising the grey value fluctuations within this neighbourhood ($\approx \parallel \nabla u_\sigma$) and its orthogonal (the preferred direction of smoothing), respectively. Writing the spectral decomposition $J_\rho(\nabla u_\sigma) = \mu_1 v_1 v_1^t + \mu_2 v_2 v_2^t$, the eigenvalues encode the weighting of each eigendirection in $J_\rho$. In particular, $\mu_1$ and $\mu_2$ convey shape information in the form
\[
\mu_1(x) \approx \mu_2(x): \ \text{the image has isotropic structure in } x,
\]
\[
\mu_1(x) \gg \mu_2(x) \approx 0: \ \text{the image has a line-like structure in } x,
\]
\[
\mu_1(x) \ge \mu_2(x) \gg 0: \ \text{an object edge forms a corner in } x.
\]
Now, using the structure descriptor, the generic form of the anisotropic, nonlinear diffusion equation used in imaging reads
\[
u_t = \operatorname{div}\left( D(J_\rho(\nabla u_\sigma)) \nabla u \right),
\]
where $D$ is a diffusion tensor that depends on the structure tensor $J_\rho$ and can be chosen with respect to the imaging task at hand. Roughly speaking, the eigenvectors of $D$ should reflect the local image structure. Hence, a good choice is the same orthonormal basis of eigenvectors as that of $J_\rho$. Moreover, the choice of the eigenvalues $\lambda_1$ and $\lambda_2$ of $D$ depends on the desired goal. We will discuss one example below. For more details on image enhancement with anisotropic diffusion and other examples for $D$, see [Wei98].

Coherence-enhancing diffusion: This type of anisotropic diffusion is designed to enhance flow-like (line-like) structures in an image function $u$, with the potential to even repair broken line-like structures by an appropriate choice of $\rho$. To describe this approach, we first define the so-called coherence of an image $u$ as
\[
\mathrm{coh} = (\mu_1 - \mu_2)^2,
\]
where $\mu_1$ and $\mu_2$ are the eigenvalues of $J_\rho(\nabla u_\sigma)$, and coh is taken as an indicator for line-like structures in $u$; that is, the larger $\mathrm{coh}(x)$, the more likely it is – we assume – that a line-like structure goes through $x$.
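The structure tensor, its eigenvalues, and hence coh can be sketched as follows. This is our own illustration: `scipy.ndimage.gaussian_filter` stands in for convolution with $G_\sigma$ and $G_\rho$, and the parameter values are illustrative.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def structure_tensor_eigenvalues(u, sigma=1.0, rho=3.0):
    # J_rho(grad u_sigma) = G_rho * (grad u_sigma (grad u_sigma)^t);
    # returns its eigenvalue fields mu1 >= mu2.
    u_s = gaussian_filter(u.astype(float), sigma)
    uy, ux = np.gradient(u_s)
    j11 = gaussian_filter(ux * ux, rho)
    j12 = gaussian_filter(ux * uy, rho)
    j22 = gaussian_filter(uy * uy, rho)
    disc = np.sqrt((j11 - j22) ** 2 + 4.0 * j12 ** 2)
    mu1 = 0.5 * (j11 + j22 + disc)
    mu2 = 0.5 * (j11 + j22 - disc)
    return mu1, mu2

# vertical stripes: all gradients point in x-direction (line-like structure)
x = np.arange(64)
stripes = np.tile(np.sin(0.3 * x), (64, 1))
mu1, mu2 = structure_tensor_eigenvalues(stripes)
coh = (mu1 - mu2) ** 2
```

For this purely line-like test pattern the smaller eigenvalue vanishes ($\mu_1 \gg \mu_2 \approx 0$), so the coherence is large wherever the stripes have contrast.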
With this assumption, one chooses a diffusion tensor $D$ with the eigenvalues
\[
\lambda_1 = \alpha, \qquad
\lambda_2 = \begin{cases}
\alpha & \text{if } \mu_1 = \mu_2, \\
\alpha + (1 - \alpha) \exp\left( \dfrac{-1}{(\mu_1 - \mu_2)^2} \right) & \text{otherwise},
\end{cases}
\]
where $\alpha \in (0, 1)$.

References

[Am89] L. Ambrosio, Variational problems in SBV and image segmentation, Acta Applicandae Mathematicae, 17, pp. 1–40, 1989.

[Am89a] L. Ambrosio, A compactness theorem for a new class of functions of bounded variation, Bollettino della Unione Matematica Italiana, VII (4), pp. 857–881, 1989.

[AFP00] L. Ambrosio, N. Fusco, and D. Pallara, Functions of Bounded Variation and Free Discontinuity Problems, Oxford Mathematical Monographs, Clarendon Press, Oxford, 2000.

[Al07] W. K. Allard, Total variation regularization for image denoising, I. Geometric theory, SIAM J. Math. Anal. 39 (2007), pp. 1150–1190.

[ACC05] F. Alter, V. Caselles, and A. Chambolle, A characterization of convex calibrable sets in R^N, Math. Ann. 332 (2005), pp. 329–366.

[ACC05a] F. Alter, V. Caselles, and A. Chambolle, Evolution of characteristic functions of convex sets in the plane by the minimizing total variation flow, Interfaces Free Bound. 7 (2005), pp. 29–53.

[AK06] G. Aubert and P. Kornprobst, Mathematical Problems in Image Processing. Partial Differential Equations and the Calculus of Variations, Springer, Applied Mathematical Sciences, Vol. 147, 2006.

[BSCB00] M. Bertalmio, G. Sapiro, V. Caselles, and C. Ballester, Image inpainting, Siggraph 2000, Computer Graphics Proceedings, pp. 417–424, 2000.

[BZ87] A. Blake and A. Zisserman, Visual Reconstruction, MIT Press, Cambridge, MA, 1987.

[Bo96] A. Bonnet, On the regularity of the edge set of Mumford-Shah minimizers, Progress in Nonlinear Differential Equations, 25, pp. 93–103, 1996.

[BL11] K. Bredies and D. Lorenz, Mathematische Bildverarbeitung, textbook (in German), Vieweg+Teubner, 445 pages, 2011.

[CLMC92] F. Catté, P.-L. Lions, J.-M.
Morel, and T. Coll, Image selective smoothing and edge detection by nonlinear diffusion, SIAM J. Numer. Anal. 29 (1), pp. 182–193, 1992.

[Ch04] A. Chambolle, An algorithm for total variation minimization and applications, Journal of Mathematical Imaging and Vision, 20(1–2), pp. 89–97, 2004.

[CCCNP10] A. Chambolle, V. Caselles, D. Cremers, M. Novaga, and T. Pock, An introduction to total variation for image analysis, chapter in Theoretical Foundations and Numerical Methods for Sparse Recovery, ed. Massimo Fornasier, De Gruyter, 2010.

[CKS02] T. F. Chan, S. H. Kang, and J. Shen, Euler's elastica and curvature-based inpainting, SIAM J. Appl. Math., 63(2), pp. 564–592, 2002.

[CM99] T. F. Chan and P. Mulet, On the convergence of the lagged diffusivity fixed point method in total variation image restoration, SIAM Journal on Numerical Analysis, 36(2), pp. 354–367, 1999.

[CS01a] T. F. Chan and J. Shen, Mathematical models for local non-texture inpaintings, SIAM J. Appl. Math., 62(3), pp. 1019–1043, 2001.

[CS01c] T. F. Chan and J. Shen, Non-texture inpainting by curvature driven diffusions (CDD), J. Visual Comm. Image Rep., 12(4), pp. 436–449, 2001.

[CS05a] T. F. Chan and J. J. Shen, Image Processing and Analysis: Variational, PDE, Wavelet, and Stochastic Methods, SIAM, 2005.

[DCL89] E. De Giorgi, M. Carriero, and A. Leaci, Existence theorem for a minimum problem with free discontinuity set, Archive for Rational Mechanics and Analysis, 108, pp. 195–218, 1989.

[DMS92] G. Dal Maso, J.-M. Morel, and S. Solimini, A variational method in image segmentation: existence and approximation results, Acta Math., 168, pp. 89–151, 1992.

[GG84] S. Geman and D. Geman, Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images, IEEE Trans. Pattern Anal. Machine Intell. 6 (1984), pp. 721–741.

[HUL93] J. B. Hiriart-Urruty and C.
Lemaréchal, Convex Analysis and Minimization Algorithms, Part 1: Fundamentals, Vol. 305, Springer, 1993.

[L11] A. Langer, Subspace Correction and Domain Decomposition Methods for Total Variation Minimization, doctoral thesis, Johannes Kepler University Linz, July 2011. http://people.ricam.oeaw.ac.at/a.langer/publications/PhD_thesis.pdf.

[Ma98] S. Masnou, Filtrage et désocclusion d'images par méthodes d'ensembles de niveau, Thèse de doctorat, Université Paris-Dauphine, 1998.

[MM98] S. Masnou and J. Morel, Level lines based disocclusion, 5th IEEE Int'l Conf. on Image Processing, Chicago, IL, Oct. 4–7, 1998, pp. 259–263, 1998.

[MS95] J.-M. Morel and S. Solimini, Variational Methods in Image Segmentation, Progress in Nonlinear Differential Equations and Their Applications, Vol. 14, Birkhäuser, Boston, 1995.

[MS89] D. Mumford and J. Shah, Optimal approximations by piecewise smooth functions and associated variational problems, Comm. Pure Applied Math. 42, pp. 577–685, 1989.

[NMS93] M. Nitzberg, D. Mumford, and T. Shiota, Filtering, Segmentation, and Depth, Springer-Verlag, Lecture Notes in Computer Science, 662, 1993.

[PM90] P. Perona and J. Malik, Scale-space and edge detection using anisotropic diffusion, IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(7), pp. 629–639, July 1990.

[RO94] L. Rudin and S. Osher, Total variation based image restoration with free local constraints, Proc. 1st IEEE ICIP, 1, pp. 31–35, 1994.

[ROF92] L. I. Rudin, S. Osher, and E. Fatemi, Nonlinear total variation based noise removal algorithms, Physica D 60, pp. 259–268, 1992.

[VO96] C. R. Vogel and M. E. Oman, Iterative methods for total variation denoising, SIAM Journal on Scientific Computing, 17(1), pp. 227–238, 1996.

[Wei98] J. Weickert, Anisotropic Diffusion in Image Processing, Teubner, Stuttgart, Germany, 1998.