Autocorrelation-Driven Diffusion Filtering Linköping University Post Print
Autocorrelation-Driven Diffusion Filtering Michael Felsberg Linköping University Post Print N.B.: When citing this work, cite the original article. ©2011 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. Michael Felsberg , Autocorrelation-Driven Diffusion Filtering, 2011, IEEE Transactions on Image Processing, (20), 7, 1797-1806. http://dx.doi.org/10.1109/TIP.2011.2107330 Postprint available at: Linköping University Electronic Press http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-65430 TRANSACTIONS OF IMAGE PROCESSING, VOL. ..., NO. ..., ... 2011 1 Autocorrelation-Driven Diffusion Filtering Michael Felsberg, Member, IEEE Abstract—In this paper we present a novel scheme for anisotropic diffusion driven by the image autocorrelation function. We show the equivalence of this scheme to a special case of iterated adaptive filtering. By determining the diffusion tensor field from an autocorrelation estimate, we obtain an evolution equation that is computed from a scalar product of diffusion tensor and the image Hessian. We propose further a set of filters to approximate the Hessian on a minimized spatial support. On standard benchmarks, the resulting method performs favorable in many cases, in particular at low noise levels. In a GPU implementation video real-time performance is easily achieved. Index Terms—Diffusion filtering, adaptive filtering, structure tensor, steerable filters, image enhancement EDICS: TEC-PRC, TEC-PDE, TEC-RST I. I NTRODUCTION T HE efficient structure-preserving image denoising and enhancement is a relevant problem in many applications. In this paper, we propose a novel variant of tensor-driven diffusion filtering that: (a) only requires a scalar product of diffusion tensor and Hessian in the evolution equation, (b) is equivalent to a particular instance of adaptive filtering, and (c) is easy to implement in a fast numerical scheme. Structure-preserving denoising and enhancement are practical problems to be solved in many applications, e.g. image and video coding, digital cameras, and medical imaging. Most of these applications require a real-time approach, which limits the range of available methods. The development of real-time capable approaches with state-of-the-art performance is still a relevant topic. The theoretical and practical insights formulated in this paper allow to achieve this goal, as it is shown in the experiments. Structure-preserving denoising has a rather long history and non-linear diffusion  is probably one of the most commonly used techniques. The anisotropic extension of nonlinear diffusion using the structure tensor  often leads to better results, in particular close to lines and edges. The numerical implementation of anisotropic diffusion is however less trivial as expected, as can be seen by the variety of algorithms and publications on the topic , , . This is even more severe as we observed that the applied numerical scheme has larger influence on the quality of results than the choice of the method, i.e., using a sub-optimal numerical scheme for anisotropic diffusion results in worse peak-signal to noise ratios (PSNR) than a closer to optimal scheme for non-linear diffusion. Copyright (c) 2010 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to [email protected] M. Felsberg is with the Computer Vision Laboratory, Department of Electrical Engineering, Linköping University, S-58183 Linköping, Sweden, [email protected] Besides non-linear diffusion, a large variety of approaches for image denoising have been published, too many to be covered here. Some of the more popular approaches use iterated adaptive filters , bilateral filtering , , meanshift filtering , , channel decomposition , multiband filtering , or block-based methods , . Optimal results usually require estimates of image priors , , . Some of the mentioned methods have been analyzed and compared in , , . Besides the above cited work on diffusion, approaches to trace-based diffusion filtering have been suggested  and their relation to variational formulations and divergence expressions have been investigated . The main difference of our approach to this previous work lies in the diffusion function and the spatial averaging in its computation. Tschumperlé and Deriche compute the diffusion tensor by averaging over different channels, i.e., in the case of grey scale images, no averaging is done at all. In that case, the diffusion tensor is at most of rank 1 and the processing across and parallel to the image gradient are not independent. The processing becomes independent if the image Hessian is used to compute the diffusivity, even if it is not spatially averaged . Computing the diffusivity from the structure tensor with spatial averaging, however, results in a third alternative to trace-based diffusion , which has a rank 2 diffusivity tensor and is based on first order derivatives. In the present work, the latter approach is modified by considering the diffusion tensor as a function of the autocorrelation function. In particular, we use the Hessian, which represents the curvature of the autocorrelation function, to control the diffusion process. This is intuitively motivated by the fact that the curvature (flat, parabolic, or elliptic) represents the type of local structure (flat, linear, or two-dimensional) to a large extend independently of intensity or texture. Flat structures should be diffused isotropically, linear structures only in one direction, and two-dimensional structures not at all. Furthermore, we derive new discrete filter masks, apply a new noise estimation method , and extend the experimental evaluation using methods that take into account the visual quality: the visual information fidelity (VIF)  and the structural similarity index (SSIM) . The paper is structured as follows. In Sect. II, we introduce some tensor notation and calculus and we give some background on the structure tensor, anisotropic diffusion, and adaptive filtering. In Sect. III, we derive the novel scheme from the anisotropic diffusion equation and relate it to iterated adaptive filtering, we propose concrete discrete filters for all relevant steps and discuss practical aspects of the algorithm. In Sect. IV, we present a number of standard denoising experiments in order to compare our approach with methods from the literature. Finally, we discuss the advantages and 2 TRANSACTIONS OF IMAGE PROCESSING, VOL. ..., NO. ..., ... 2011 drawbacks of the proposed method in the concluding Sect. V. II. T ENSORS , D IFFUSION , AND A DAPTIVE F ILTERING In this section we introduce some notation, terms, and methods from the literature, which will be required in subsequent sections. A. Notation and Tensor Calculus Despite the fact that tensors are coordinate-system independent algebraic entities, we restrict ourselves to matrix representations of tensors in this paper. In what follows, (real) vectors are denoted as bold letters and matrices as bold capital letters. The scalar product between two column vectors a and b is usually denoted by the matrix product of the transpose of a and b: aT b. Tensors (of second order) will be often built by outer products of vectors, e.g., T = abT . Since second order tensors form a vector space, a scalar product (Frobenius product) is defined, which reads for two tensors A and B XX hA|Bi = trace(ABT ) = aij bij . (1) i j In most cases we will consider symmetric tensors, i.e., aij = aji and for these tensors the spectral theorem allows a decomposition into real eigenvalues: A = OΛOT = O diag(λi ) OT (2) where O is an orthogonal matrix formed by the eigenvectors of A and Λ is the diagonal matrix containing the corresponding (real) eigenvalues λi . Note that arbitrary powers of A can be computed as An = (OΛOT )n = OΛn OT = O diag(λni ) OT (3) = O diag(exp(k λi )) OT . and due to linearity exp(k A) (4) In many cases we deal with vector-valued or tensor-valued functions in this paper, even if written as vectors or tensors. The spatial variable, mostly denoted as x, is often omitted. In this context, differential operators are based on the partial derivatives with respect to the components of x and are denoted using the nabla operator ∇, such that ∂b (5) ∇b = ∂xi i X ∂ai ∇T a = div(a) = (6) ∂xi i ! X ∂aij T ∇ A = div(A) = . (7) ∂xi i j B. Structure Tensor Tensor representations occur frequently in image processing, mostly for representing local orientation and the deviation from a simple signal model . The gradient based structure tensor can be derived as the solution of a leastsquares approach to fitting a simple signal model . Another alternative is to consider the structure tensor as an estimate of the Hessian of the autocorrelation function of the signal . Assuming ergodicity, the autocorrelation function of a 2D image b(x) is given by correlating the image with itself R = b ? b − b20 (8) where b0 denotes the DC part of b. Since R is a symmetric function, we might expand it as 1 R(x) = R(0) + xT HRx + O(|x|4 ) (9) 2 where H denotes the Hessian operator. With the power theorem we obtain (see , p. 317) HR = E[∇b∇T b] (10) where E[·] denotes the expectation value. This expectation value is usually estimated by locally weighted averaging the outer product of the image gradient , : Z J(x) = w(y)∇b(x − y)∇T b(x − y) dy . (11) Note that the gradient operators in the equation above are normally regularized by some low-pass filter, e.g., using Gaussian derivatives. The local weight function w is also mostly chosen as a Gaussian function. C. Anisotropic Diffusion We will not give a detailed introduction to anisotropic diffusion, the reader is referred to, e.g., the work of Weickert . The diffusion scheme is defined as an evolution equation of an image b(x) over time t as: ∂b = div(D(J)∇b) (12) ∂t where D(J) is the diffusion tensor, which is computed by modifying the eigenvalues of the structure tensor J. Applying the product rule, the divergence term is split into two parts (cf. (7)) div(D(J)∇b) = div(D(J))∇b + trace(D(J)Hb) . (13) If the structure tensor is computed by spatial averaging (11), the divergence of the diffusion tensor is non-zero in general and needs to be estimated. If the tensor is computed by averaging over the different dimensions of a vector-valued image, i.e., no spatial averaging is applied, the divergence term can be converted into a trace-based term , . Thus, by modifying D(J), the evolution equation can be expressed by a single trace-based term, i.e., the divergence of the diffusion tensor is eliminated. However, according to our knowledge, this elimination is impossible if the structure tensor is computed by spatial averaging. When implementing anisotropic diffusion, an iterative algorithm that updates the input image successively has to be implemented. In each time-step, the diffusion tensor and the image gradient have to be estimated. If the evolution is divergence-based, another numerical approximation of a derivative operator is required, which is applied to the product of diffusion tensor and gradient. Hence, five sequential operations are required: image gradients, point-wise products, local FELSBERG: AUTOCORRELATION-DRIVEN DIFFUSION FILTERING 3 averaging, point-wise product, and gradient. If the evolution is reformulated as a trace-based approach, only four sequential operations are required, since the final gradient is replaced with a second image derivative which can be computed in parallel to the diffusion tensor. D. Adaptive Filtering Adaptive filtering is a more general and actually earlier published variant of steerable filters , developed by Knuttson et al. , . The main idea is to compose a spatially variant filter kernel by linear combinations of shift invariant kernels. The linear coefficients are locally estimated by an orientation-dependent scheme. A comprehensible formulation of the method can be found in , Chapt. 10. Adopting the notation above, the filter is composed as X hadapt = hLP + hC(J)|Ñk ihHP,k (14) k where hHP,k is an orientation selective high-pass filter with orientation nk , Ñk is the dual tensor for the orientation tensor nk nTk and C is the control tensor. The high-pass filter with orientation nk is defined in the Fourier domain by a polar separable filter with radial component ρ(|u|) and an angular component D(u/|u|) = (nTk u)2 /|u|2 . Consider for instance the case ρ(|u|) = |u|2 such that HHP,k = (nTk u)2 (15) p p with n1 = (1, 0)T , n2 = (0, 1)T , and n3 = ( 1/2, 1/2)T . The resulting frequency responses are HHP,1 = u21 , HHP,2 = u22 , and HHP,3 = u21 /2 + u1 u2 + u22 /2. The corresponding orientation tensors are given as 1 0 0 0 1/2 1/2 N1 = N2 = N3 = 0 0 0 1 1/2 1/2 The dual tensors can be computed via the isometric vector representation as 0 − 21 0 1 1 − 12 Ñ2 = Ñ3 = . Ñ1 = 1 0 0 1 − 12 − 12 Finally, in our example, the rightmost term in (14) is proportional to the negative Hessian since X Ñk HHP,k = uuT . (16) k The tensor controlled adaptive filter scheme is often applied in an iterated way to achieve good denoising results , Chapt. 10. By selecting the high-pass filter in the way we have done in the example above, but with a very small constant multiplier 0 < γ << 1 and selecting the low-pass filter appropriately as HLP = 1 − γρ(|u|) = 1 − γ|u|2 , (17) it is possible to show that the iterated adaptive filtering implements a numerical scheme for autocorrelation-driven anisotropic diffusion, see Sect. III-B. III. AUTOCORRELATION -D RIVEN D IFFUSION F ILTERING This section starts with defining autocorrelation-driven diffusion filtering in the continuous domain, where the autocorrelation function estimate is inserted after computing the divergence, resulting in a purely trace-based formulation. This continuous equation is then discretized in the temporal domain and an equivalence to iterated adaptive filtering is found. Finally, the scheme is fully discretized, its stability is analyzed, and practical aspects are discussed. A. Continuous Autocorrelation-Driven Diffusion Filtering Many schemes for tensor-driven anisotropic diffusion use the structure tensor for controlling the diffusion process. Interpreting the structure tensor as an estimate of the autocorrelation function, cf. Sect. II-B (9) to (11), one can go one step back and reformulate anisotropic diffusion with a diffusion tensor that depends on the autocorrelation function instead: Definition 1: The autocorrelation-driven anisotropic diffusion is defined by the evolution ∂b = div(B(R)∇b) , (18) ∂t where B(R) is the diffusion tensor determined by the autocorrelation function R at the current spatial position. This definition is particularly sensible from a statistical point of view if we consider the ergodicity assumption which is essential for denoising by weighted spatial averaging. If the image signal is ergodic in a strict sense, all statistical moments are stationary and thus the autocorrelation is constant, resulting in unweighted averaging. If the image signal is structured, its statistical moments are not stationary, and thus, it is not ergodic and should not be averaged spatially. The autocorrelation function becomes highly peaked in all directions in this case. If the image signal is ergodic in one direction, but structured in the orthogonal direction (e.g., edge or line structure), averaging should only be performed orthogonal to the structure. The autocorrelation function is peaked along the structured direction and flat orthogonal to it. In this work, we consider anisotropic diffusion (12) as an approximation to (18). We will now derive another approximation to (18) which is of fourth order and which always guarantees an even autocorrelation function: Theorem 1: Assume that J according to (11) is at least a second order approximation to HR and B is continuously differentiable. Then ∂b = hD(J)|Hbi (19) ∂t is a fourth order approximation to (18) for a suitable function D. Proof: For the diffusivity B(R), a fourth order approximation is given as 1 B(R) ≈ B(R(0)) + xT HRx B0 (R(0)) . (20) 2 Since J is at least a second order approximation of HR, 1 . D(J) = B(R(0)) + xT Jx B0 (R(0)) (21) 2 4 TRANSACTIONS OF IMAGE PROCESSING, VOL. ..., NO. ..., ... 2011 is also a fourth order approximation of B(R). If we now apply the product rule to (18), we obtain with (1) div(B(R)∇b) = div(B(R))∇b + hB(R)|Hbi . (22) For the first term, we obtain div(B(R)) = B0 ∇R (23) which vanishes in the origin since R is an even function. Hence, div(B(R)∇b) = hB(R)|Hbi , (24) and plugging in (21) we obtain that (19) is a fourth order approximation to (18). If we plug the autocorrelation approximation (21) into (18) before computing the divergence, we obtain the standard formulation of anisotropic diffusion (12). The only difference to (19) is that the divergence of the diffusion tensor does not vanish in general: div(D(J)) 6= 0 . However, this implies that the autocorrelation function contains a first order term ∂R ∂x , leading to a third order approximation to (18). Note that this result is just on the approximation order and assumes that (18) is our model instead of (12). B. A Time-Discrete Scheme for Autocorrelation-Driven Anisotropic Diffusion In order to obtain a filtering method, the continuous differential equation (19) needs to be discretized in the spatial and temporal domain. We start with discretizing the temporal domain, since we aim at relating diffusion filtering to adaptive filtering, which has been introduced with continuous notation in Sect. II-D. Using a standard explicit scheme for time discretization, we obtain from (19) bt+1 = bt + γhD(J)|Hbt i (25) where we use γ = 0.05 throughout this paper. The upper limit for γ depends on the spatial discretization and is discussed in Sect. III-D. For the iterative filtering (25), we obtain Theorem 2: Autocorrelation-driven diffusion filtering according to (25) and the particular case of iterated adaptive filtering using (15) are equivalent. Proof: As it has been shown in the example in Sect. II-D, the Hessian can be written in terms of P a set of dual tensors and corresponding filters (16). Hence, k Ñk hHP,k = −H and plugging this into (25) gives X D E bt+1 = bt + γ Id − C(J) − Ñk hHP,k bt (26) k where we replaced D(J) = Id − C(J). Sorting terms gives ! X bt+1 = δ − γ(hHP,1 + hHP,2 ) + hC(J)|Ñk iγhHP,k bt k (27) which is equivalent to an adaptive filtering according to (14) with constant multiplier γ and the low-pass filter δ−γ(hHP,1 + hHP,2 ). This low-pass filter is the inverse Fourier transform of (17). C. A Fully Discrete Scheme for Autocorrelation-Driven Anisotropic Diffusion In order to derive a fully discrete scheme for (25), we also discretize the spatial domain. Three filter (sets) have to be designed: The Hessian, the derivatives in (11), and the averaging kernel in (11). The latter is chosen to be a Gaussian function with variance σ 2 = 1 throughout the paper. The exact value for the variance is not crucial, as the experimental results do not differ for small variations. The derivatives used in the structure tensor (11) and the second derivatives used to compute the Hessian are computed with filter masks of minimum size. This is motivated by an observation made in : Reducing linear regularization, i.e., using smaller filter masks, for the benefit of increased regularization of the structure tensor improves the overall result. In order to obtain small filter masks with good isotropy, we apply a similar optimization strategy as in , but with a different objective function. We start by assuming 3 × 3 filter masks, which gives us one degree of freedom for the first derivatives (p ∈ [0; 0.25] since we want to avoid extra zeros in the frequency response): ∂b ∂x1 ∂b ∂x2 ≈ ≈ 1 ([p 1 − 2p p]T [−1 0 1]) ∗ b 2 1 ([−1 0 1]T [p 1 − 2p p]) ∗ b . 2 (28) (29) A second order derivative approximation can also be computed with a 3×3 filter, which we derive from one-sided finite differences. This guarantees frequency responses with zero DC level and without extra zeros. Starting with the second derivative for the horizontal coordinate, the 1D Laplace filter is obtained by combining a left-sided and a right-sided difference: ∂2 ≈ [−1 1 0] ∗ [0 − 1 1] = [1 − 2 1] . ∂x21 (30) A second approximation without extra zeros is obtained by the 2D filters that average in the vertical direction with a 2-box filter (see also ): −1 1 0 0 0 0 ∂2 1 −1 1 0 ∗ 0 −1 1 ≈ ∂x21 4 0 0 0 0 −1 1 1 −2 1 1 2 −4 2 . = (31) 4 1 −2 1 The actually used filters are the weighted average of both approximations (30) and (31), which span the whole space of zero DC 3 × 3 second derivatives without extra zeros in the frequency domain: q −2q q ∂2 1 4 − 2q −8 + 4q 4 − 2q (32) ≈ ∂x21 4 q −2q q q 4 − 2q q ∂2 1 −2q −8 + 4q −2q ≈ (33) ∂x22 4 q 4 − 2q q FELSBERG: AUTOCORRELATION-DRIVEN DIFFUSION FILTERING The mixed derivatives corresponding to (30) and identical: T −1 1 0 −1 1 1 ∂2 0 ∗ 0 0 0 ≈ = ∂x1 ∂x2 4 4 1 −1 0 1 5 (31) are −1 0 . 1 (34) The optimal filterset in terms pf p and q is obtained by minimizing the weighted angular error between two normalized tensors in the Fourier domain: 1 H12 H1 H2 (35) T1 = H22 H12 + H22 H1 H2 1 H11 H12 , (36) T2 = − p 2 2 2 H11 + H22 + 2H12 H12 H22 where u is the 2D frequency, Hk = Hk (u; p) are the Fourier transforms of ∂x∂ k and Hkl = Hkl (u; q) are the Fourier 2 transforms of ∂x∂k ∂xl . The weighting function v = v(|u|) is determined by the weight w in (11) and the statistics of natural images (|u|−1 ). Hence, we minimize Z 2 min v cos−1 (hT1 |T2 i) du . (37) p,q The resulting parameters p and q are given in good approximation as p = 14 , i.e., a Sobel filter for the first derivative, and q = 13 16 . D. Stability of the Discrete Scheme Conditions for the stability of discrete diffusion schemes have been proposed in . The fully discrete scheme is rewritten in the form bt+1 = Q(bt )bt (38) and the function Q(bt ) is tested for six criteria (D1)–(D6): continuity in its argument, symmetry, unit row sums, nonnegative off-diagonals, irreducibility, and positive diagonal. In our case, we derive a parametrized effective convolution kernel, which can then be converted into the corresponding matrix operator Q(bt ).1 First of all, we need to specify the diffusivity function D(J). It is known from the literature that the diffusivity determines the relation between non-linear diffusion and robust estimation , . The width of the influence function in robust estimation is more important than the exact shape , such that we decided to choose an algebraically simple diffusivity of exponential form: 1 (39) D(J) = exp − J . k Due to (4), this corresponds to applying the corresponding exponential mapping to the eigenvalues. Without loss of generality, we assume k = 1 for the reminder of this section. 1 This actually means that we assume an infinitely large or periodic discrete image. For finite domains, Q(bt ) is no longer equivalent to a convolution kernel. Let λ1 > λ2 denote the eigenvalues of J and θ the angle of the eigenvector corresponding to λ1 . Hence, using σk = exp(−λk ) (note σ2 > σ1 ) d11 d12 D(J) = (40) d12 d22 σ1 cos2 θ + σ2 sin2 θ −(σ2 − σ1 ) cos θ sin θ = −(σ2 − σ1 ) cos θ sin θ σ1 sin2 θ + σ2 cos2 θ and we obtain (using α = σ1 + σ2 = d11 + d22 ) αq + 2d12 4d22 − 2αq αq − 2d12 hD(J)|Hi = 4d11 − 2αq 4α(q − 2) 4d11 − 2αq αq − 2d12 4d22 − 2αq αq + 2d12 (41) Finally, the filter that corresponds to Q(bt ) is given as δ(x) + γhD(J)|Hi, cf. (25). For Q(bt ), we immediately get continuity, symmetry, unit row sums, and irreducibility. Positive diagonal entries (D6) are obtained if 2 . (42) γ< 19 The off-diagonals are however not nonnegative (D4), choose e.g. σ2 = 1, σ1 = 1/15 and θ = π/4, the upper left element in (41) equals −1/15. Even though the nonnegativity is violated, this does not imply instability of the scheme. In particular, for q = 0, the proposed scheme is identical to the explicit scheme , which is known to be conditional stable . If we restrict the stability analysis to locally simple signals, we can draw conclusions concerning stability. For this class of signals, the structure tensor J is of rank one. Hence, σ1 1 and σ2 = 1. Assuming σ1 = 0, we can compute the θ-intervals where the coefficients in (41) become negative. These are (modulo π): 4d22 − 2αq (43) (0.15π; 0.35π) for αq + 2d12 (44) (0.28π; 0.72π) for 4d11 − 2αq (45) (0.65π; 0.85π) for αq − 2d12 (46) (−0.22π; 0.22π) for These intervals are such that the angular distance to the gradient orientation is maximized, i.e., we have a maximum level of integration in these sectors and due to the vanishing sums property, the resulting effective operator is nonnegative. For instance, if we integrate along the columns (assuming θ = 0) of (41), we obtain the weighted 1D Laplacian operator σ1 [4, −8, 4]. It also worthwhile noticing that for the case q = 0, the first (43) and the third sector (45) vanish, but the second (44) and fourth (46) cover the full respective quadrant. For q = 1, the first (43) and third sector (45) cover the full respective quadrant and the second (44) and fourth sector (46) vanish. E. Practical Details What we experienced as the most critical parameter of the algorithm in order to obtain comparative results, is the width of the diffusivity function (39) k. Too large values of k lead to blurred results, i.e., structure is considered as noise, and too small values lead to noisy results, since noise is considered 6 TRANSACTIONS OF IMAGE PROCESSING, VOL. ..., NO. ..., ... 2011 as structure. The parameter that gives the statistically best discrimination of structure and noise can be derived from a null-hypothesis test and the resulting choice for k is  k= e−1 2 σ e−2 e (47) where σe2 is the estimated noise variance according to the method described in  with a minor modification. The estimation is based on a mixture of two χ22 distributions, which are fitted to the squared gradient magnitude distribution using the EM algorithm. The minor modification that we have applied here is, that we swap the two mixture components during the iterations if the noise has a larger estimated variance than the signal. Note further that the variance is given as twice the mixture parameter µ of the χ22 distribution: σe2 = 2µ. When implementing the diffusivity function (39), it appears to be preferable to apply the exponential mapping directly on the matrix instead of an eigenvalue manipulation. However, the latter is faster since the eigenvalues can be extracted in closed form , . In principle this would also allow to combine explicitly isotropic and anisotropic diffusion , but we chose to apply the same scheme to edge-like images (e.g. the pepper image) and fingerprint images since the achieved results were satisfying in both cases. Our scheme shares similarities with edge-enhancing diffusion, which explains good performance on edge-like images. Good performance on line-like structures might be explained by the missing odd order terms in the scheme due to omitting the divergence of the diffusion tensor. Finally, an often discussed parameter of anisotropic diffusion is the stopping time. We will not give a detailed consideration here. In our implementation the iterations are stopped as soon as the image update is below a certain threshold. Based on k, the threshold for the stopping criterion is empirically set to k 1/5 /40 by looking at curves of maximum PSNR in logarithmic scale. This is however only partly satisfying as the rate of change might depend on the image structure and not on the difference to a noise-free instance of the image. The stopping time should rather be based on the estimated noise level at each time instance. In practice, however, we observe little difference of the two strategies, since the largest area of the images is rather flat and the mean or median image update is basically given by the amount of removed noise (which is in itself directly related to the amount of noise). IV. E XPERIMENTS This section consists of two parts: the experiments and the discussion of results. A. Experimental Results For the experiments, we used a popular set of images that has already been used in earlier publications , , . The evaluation criteria are, besides the PSNR, SSIM  and VIF . The PSNR does not reflect what kind of artifacts are introduced, cf. Fig. 1, therefore, the other two criteria have been included. Two of the most successful methods in terms of PSNR are due to Portilla et al. (GSM)  and Dabov et al. Fig. 1. The PSNR does not reflect the type of residual error. Here: detail of the pepper image (σ = 15, top left) for which the proposed method, GSM, and EEDF yield about the same PSNR result. Top right: EEDF, bottom left: GSM, and bottom right: proposed method. (BM3D) , so that we compare our method with their implementations. Other methods as FoE  and K-SVD  are respectively slightly inferior .2 Furthermore, we compare our results to a standard implementation of anisotropic (edge enhancing) diffusion filtering (EEDF) . We evaluate each image and noise level with ten different instances of noise, clipped and quantized to eight bit integers. Each method is fed with an estimated noise level and not with the ground truth noise level of the added noise as done in the original evaluations , . The noise level is used as a parameter in all four methods to determine the degree of smoothing. Hence, results for GSM and BM3D differ from the literature, since the original images already contain noise, resulting in a higher noise level estimate. For the smoothed result, PSNR, SSIM, and VIF values are reported. Our method is entirely implemented in Matlab and is about 6 times faster than GSM, which is a combination of Matlab and C code. Our implementation is similar to the EEDF implementation  and shares basically the same parameters. Therefore, we use the same parameter settings for the EEDF, 2 The method in  seems to work slightly better than , but the code for the latter was more straightforward to be used for denoising. FELSBERG: AUTOCORRELATION-DRIVEN DIFFUSION FILTERING 7 cf. Section III-E. All our attempts to improve the results from EEDF by varying the parameters have not lead to a systematic increase of the quality. The resulting PSNRs for the four methods are plotted in Fig. 2 for two different images. Numerical results of the house.png 50 proposed EEDF GSM BM3D 45 40 TABLE I PSNR S FOR THE PROPOSED METHOD σ 1 2 5 10 15 20 25 50 75 100 Lena 47.8 42.5 37.5 33.4 31.6 30.4 29.7 25.8 22.8 20.5 Barb. 47.8 42.8 36.6 32.3 29.1 26.5 24.9 22.9 21.1 19.3 Boat 46.6 42.5 36.1 32.3 30.9 29.6 28.5 24.9 22.1 20.0 House 47.8 42.1 36.1 32.6 30.9 29.4 28.7 25.0 22.0 20.2 Pepper 48.1 42.8 36.9 33.2 30.7 29.6 28.9 24.1 21.4 19.4 Fingerp. 37.9 36.9 34.4 30.8 28.7 27.3 26.3 23.1 20.8 19.1 Flinst. 38.5 37.4 34.8 31.3 29.9 28.4 27.0 21.2 19.5 17.7 35 1 25 0.9 20 0.8 15 0.7 10 0.6 VIF PSNR [dB] house.png 30 5 0 0.5 0.4 1 2 5 10 15 20 25 50 75 100 m 0.3 fingerprint.png 50 0.2 proposed EEDF GSM BM3D 45 40 proposed EEDF GSM BM3D 0.1 0 1 2 5 10 15 20 25 50 75 100 25 50 75 100 m 35 1 25 0.9 20 0.8 15 0.7 10 0.6 VIF PSNR [dB] fingerprint.png 30 5 0 0.5 0.4 1 2 5 10 15 20 25 50 75 100 m 0.3 0.2 Fig. 2. PSNR results for the proposed method, EEDF, GSM, and BM3D for the house image and the fingerprint image. proposed method are summarized in Table I. The resulting VIFs and SSIMs for the four methods are plotted in Fig. 3 and Fig. 4, respectively. Numerical results of the proposed method are summarized in Table II for the VIF and in Table III for the SSIM. In order to allow a more systematic analysis of the performance, the differences between the proposed method and the three competing methods are plotted in Fig. 5. B. Discussion of Results As it can be seen from the PSNRs in Fig. 2 and the solid line plots in Fig. 5 (top) for low noise levels, the proposed proposed EEDF GSM BM3D 0.1 0 1 2 5 10 15 20 m Fig. 3. VIF results for the proposed method, EEDF, GSM, and BM3D for the house image and the fingerprint image. method works better than GSM and BM3D and slightly worse than EEDF. For high noise levels, GSM and BM3D work slightly better than the proposed method and EEDF works worst. Integrated over the different noise levels, GSM works only better than the proposed method for the house image, BM3D works worse for boat, pepper, and fingerprint, and 8 TRANSACTIONS OF IMAGE PROCESSING, VOL. ..., NO. ..., ... 2011 TABLE III SSIM SCORES FOR THE PROPOSED METHOD house.png 1 σ 1 2 5 10 15 20 25 50 75 100 0.9 0.8 0.7 SSIM 0.6 0.5 0.4 Lena 1.00 0.99 0.98 0.94 0.90 0.87 0.84 0.67 0.53 0.43 Barb. 1.00 1.00 0.99 0.96 0.92 0.87 0.83 0.69 0.56 0.45 Boat 1.00 1.00 0.98 0.95 0.92 0.89 0.86 0.71 0.55 0.44 House 0.99 0.97 0.89 0.81 0.76 0.71 0.69 0.51 0.37 0.30 Pepper 0.99 0.98 0.94 0.89 0.83 0.81 0.81 0.59 0.45 0.35 Fingerp. 1.00 1.00 0.99 0.99 0.97 0.96 0.95 0.88 0.80 0.72 Flinst. 0.99 0.99 0.99 0.98 0.97 0.95 0.93 0.79 0.71 0.62 0.3 0.2 1 2 15 5 10 15 20 25 50 75 100 m fingerprint.png 1 0.9 0.8 0.7 SSIM 0.6 10 6 PSNR / 100 6 VIF / 100 6 SSIM 0.1 0 difference as function of noiselevel proposed EEDF GSM BM3D 5 0 ï5 ï10 ï15 0.5 ï20 0.4 ï25 1 PSNR/GSM PSNR/EEDF PSNR/BM3D SSIM/GSM SSIM/EEDF SSIM/BM3D VIF/GSM VIF/EEDF VIF/BM3D 2 5 10 15 20 25 50 75 100 m 0.3 difference as function of image 0.2 1 2 20 5 10 15 20 25 50 75 100 m Fig. 4. SSIM results for the proposed method, EEDF, GSM, and BM3D for the house image and the fingerprint image. TABLE II VIF SCORES FOR THE PROPOSED METHOD σ 1 2 5 10 15 20 25 50 75 100 Lena 1.00 1.00 0.99 0.98 0.96 0.94 0.91 0.76 0.65 0.55 Barb. 1.00 1.00 0.99 0.96 0.90 0.82 0.74 0.67 0.60 0.51 Boat 1.00 1.00 0.99 0.97 0.93 0.90 0.86 0.62 0.57 0.49 House 1.00 1.00 1.00 0.99 0.98 0.96 0.93 0.75 0.64 0.53 Pepper 1.00 1.00 0.99 0.98 0.97 0.94 0.89 0.72 0.61 0.50 Fingerp. 0.91 0.90 0.90 0.86 0.80 0.75 0.71 0.59 0.49 0.39 Flinst. 1.00 0.99 0.99 0.97 0.96 0.94 0.92 0.55 0.49 0.40 EEDF works only better for the fingerprint image, cf. Fig. 5 (bottom). The proposed method works as well as other recently published methods, see e.g.,  in comparison to Table I. Note that the quite obvious deviation from Portilla’s original evaluation for the fingerprint image at low noise levels is due to using the estimated noise level instead of the added noise level 6 PSNR / 100 6 VIF / 100 6 SSIM 0.1 0 25 proposed EEDF GSM BM3D 15 10 PSNR/GSM PSNR/EEDF PSNR/BM3D SSIM/GSM SSIM/EEDF SSIM/BM3D VIF/GSM VIF/EEDF VIF/BM3D 5 0 ï5 ï10 ï15 Lena Barb. Boat House image Pepper Fingerp. Flinst. Fig. 5. Comparison of results as a function of noise (top) and as a function of the image (bottom). Positive values mean that the proposed method performs better than the competing method. ground-truth. The fingerprint image contains a lot of noise even without added noise and hence the comparison for the PSNR is made to a very noisy image. If using the added noise level, the method reproduces the noise from the ’groundtruth’ image. The obvious interpretation of this observation must be that GSM determines the amount of smoothing nearly entirely FELSBERG: AUTOCORRELATION-DRIVEN DIFFUSION FILTERING from the noise level parameter, which also leads to low-pass effects, cf. also Fig. 1, bottom left. A similar dependency on the noise parameter is observed for BM3D. The proposed method is only mildly affected and EEDF is hardly affected by changing the noise level parameter for the fingerprint case. As it can be seen from the VIF scores in Fig. 3 and the dashdotted line plots in Fig. 5 (top) throughout all noise levels, the proposed method works better than GSM and worse than EEDF. BM3D is better than the proposed method in a very few cases. The same order of results is obtained if the results are integrated over the different noise levels, cf. Fig. 5 (bottom). As it can be seen from the SSIM scores in Fig. 4 and the dashed line plots in Fig. 5 (top) for low noise levels, the proposed method works better than GSM and BM3D. For high noise levels, GSM and BM3D work better than the proposed method. The proposed method works better than EEDF throughout, but with an increasing difference with increasing noise level. Integrated over the different noise levels, GSM and BM3D work better than the proposed one on all images except for the fingerprint image and the proposed method works better than EEDF for all images, cf. Fig. 5 (bottom). The proposed method works better than the fields of expert except for the three highest noise levels, the house image, and medium to high level noise in the pepper image, cf.  in comparison to Table III. The proposed method is significantly faster than GSM and EEDF (measured on a 3GHz Intel Core 2 Duo). Furthermore, it is extremely simple to implement and due to the low number of sequential steps in each iteration well suited for a GPU implementation. First performance evaluations on the GPU (own framework within DirectX on an Nvidea 8800GTX) allowed us to run 50 iterations in video real-time on PAL image size (720 × 576). 50 iterations are sufficient for all considered noise levels. The method can be made even faster by not updating the diffusion tensor in every iteration, but only in every second or third. V. C ONCLUSION We proposed a new approach to define tensor driven diffusion based on the autocorrelation function. We show that our scheme is equivalent to a special case of adaptive filtering. We propose an optimized filter setup for the computation and compare the method to GSM, BM3D, and EEDF using three quality measures, PSNR, VIF, and SSIM. The result of the comparison is ambivalent in the way that the ranking of methods depends on the choice of the quality measure, the noise level, and the kind of image. Still, we observe that BM3D is always better than GSM and that our method is superior to GSM and BM3D for low-noise cases and images with many line structures. Compared to EEDF, the proposed method is clearly superior, except for the VIF scores. Visually, the diffusion results (EEDF and proposed method) contain fewer artifacts than the GSM results. The proposed scheme is easy to implement and fast to execute. Hence, we believe that our scheme is a good choice in many practical denoising tasks. 9 ACKNOWLEDGMENT I would like to thank Dr. Hanno Scharr and Kai Krajsek for discussions on the topic and pointing me to references with relevant work. I further thank Johan Hedborg for running the evaluation on the GPU. This research has received funding from the EC’s 7th Framework Programme (FP7/2007-2013), grant agreements 21578 (DIPLECS) and 247947 (GARNICS), from ELLIIT, the Strategic Area for ICT research funded by the Swedish Government, and from the The Swedish Research Council through a grant for the project Non-linear adaptive color image processing. R EFERENCES  P. Perona and J. Malik, “Scale-space and edge detection using anisotropic diffusion,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 12, no. 7, pp. 629–639, 1990.  J. Weickert, “Theoretical foundations of anisotropic diffusion in image processing,” Computing, Suppl., vol. 11, pp. 221–236, 1996.  R. van den Boomgaard, “Nonlinear diffusion in computer vision,” http://staff.science.uva.nl/∼rein/nldiffusionweb/material.html, (Accessed 28 Jan 2008).  J. Weickert, K. Zuiderveld, B. ter Haar Romeny, and W. Niessen, “Parallel implementations of AOS schemes: A fast way of nonlinear diffusion filtering,” in IEEE Int. Conf. on Image Processing, 1997, pp. 396–399.  M. Welk, J. Weickert, and G. Steidl, “From tensor-driven diffusion to anisotropic wavelet shrinkage,” in European Conference on Computer Vision, 2006.  G. H. Granlund and H. Knutsson, Signal Processing for Computer Vision. Kluwer Academic Publishers, Dordrecht, 1995.  J. Weule, “Iteration nichtlinearer Gauss-Filter in der Bildverarbeitung,” Ph.D. dissertation, Heinrich-Heine-Universität Düsseldorf, 1994.  C. Tomasi and R. Manduchi, “Bilateral filtering for gray and color images,” in Proceedings of the 6th ICCV, 1998.  Y. Cheng, “Mean shift, mode seeking, and clustering,” IEEE Trans. Pattern Analysis and Machine Intell., vol. 17, no. 8, pp. 790–799, August 1995.  D. Comaniciu and P. Meer, “Mean shift: A robust approach toward feature space analysis,” IEEE Trans. Pattern Analysis and Machine Intell., vol. 24, no. 5, pp. 603–619, 2002.  M. Felsberg and G. Granlund, “Anisotropic channel filtering,” in Proc. 13th Scandinavian Conference on Image Analysis, ser. LNCS 2749, 2003, pp. 755–762.  J. Portilla, V. Strela, J. Wainwright, and E. P. Simoncelli, “Image denoising using scale mixtures of Gaussians in the wavelet domain,” IEEE Trans. Image Processing, vol. 12, no. 11, pp. 1338–1351, 2003.  M. Elad and M. Aharon, “Image denoising via sparse and redundant representations over learned dictionaries,” Image Processing, IEEE Transactions on, vol. 15, no. 12, pp. 3736–3745, November 2006.  K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, “Image denoising by sparse 3d transform-domain collaborative filtering,” IEEE Trans. Image Processing, vol. 16, no. 8, pp. 2080–2095, August 2007.  J. Mairal, F. Bach, J. Ponce, G. Sapiro, and A. Zisserman, “Non-local sparse models for image restoration,” in International Conference on Computer Vision, 2009, pp. 2272–2279.  S. Roth and M. J. Black, “Fields of experts,” International Journal of Computer Vision, vol. 82, no. 2, pp. 205–229, 2009.  S. C. Zhu and D. Mumford, “Prior learning and Gibbs reactiondiffusion,” IEEE Trans. Pattern Analysis and Machine Intell., vol. 19, no. 11, pp. 1236–1250, 1997.  M. Felsberg, P.-E. Forssén, and H. Scharr, “Channel smoothing: Efficient robust smoothing of low-level signal features,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 2, pp. 209–222, 2006.  S. K. Weeratunga and C. Kamath, “A comparison of PDE-based nonlinear anisotropic diffusion techniques for image denoising,” in Image Processing: Algorithms and Systems II, SPIE Electronic Imaging, 2003.  R. A. Carmona and S. Zhong, “Adaptive smoothing respecting feature directions,” IEEE Trans. Image Processing, vol. 7, no. 3, pp. 353–358, 1998. 10  D. Tschumperlé and R. Deriche, “Vector-valued image regularization with PDE’s : A common framework for different applications.” IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, no. 4, pp. 506–517, 2005.  M. Felsberg, “On the relation between anisotropic diffusion and iterated adaptive filtering,” in 30. DAGM Symposium Mustererkennung, ser. LNCS, vol. 5096. Springer, 2008, pp. 436–445.  M. Felsberg, S. Kalkan, and N. Krüger, “Continuous dimensionality characterization of image structures,” Image and Vision Computing, vol. 27, no. 6, pp. 628–636, 2009.  H. R. Sheikh and A. C. Bovik, “Image information and visual quality,” IEEE Trans. Image Processing, vol. 15, no. 2, pp. 430–444, 2006.  Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: From error visibility to structural similarity,” IEEE Trans. Image Processing, vol. 13, no. 4, pp. 600–612, 2004.  J. Bigün, G. H. Granlund, and J. Wiklund, “Multidimensional orientation estimation with applications to texture analysis and optical flow,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, no. 8, pp. 775–790, August 1991.  W. Förstner, Statistische Verfahren für die automatische Bildanalyse und ihre Bewertung bei der Objekterkennung und -vermessung, ser. C. Verlag der Bayerischen Akademie der Wissenschaften, 1991, no. 370.  A. Papoulis, Probability, Random Variables and Stochastic Processes. McGraw-Hill, 1965.  J. Bigün and G. H. Granlund, “Optimal orientation detection of linear symmetry,” in Proceedings of the IEEE First International Conference on Computer Vision, London, Great Britain, June 1987, pp. 433–438.  W. Förstner and E. Gülch, “A fast operator for detection and precise location of distinct points, corners and centres of circular features,” in ISPRS Intercommission Workshop, Interlaken, June 1987, pp. 149–155.  J. Weickert, “A review of nonlinear diffusion filtering,” in Scale-Space Theory in Computer Vision, ser. LNCS, B. ter Haar Romeny, L. Florack, J. Koenderink, and M. Viergever, Eds., vol. 1252. Springer, Berlin, 1997, pp. 260–271.  W. T. Freeman and E. H. Adelson, “The design and use of steerable filters,” IEEE Trans. Pattern Analysis and Machine Intell., vol. 13, no. 9, pp. 891–906, 1991.  H. Knutsson, R. Wilson, and G. H. Granlund, “Anisotropic nonstationary image estimation and its applications: Part I – restoration of noisy images,” IEEE Trans. on Communications, vol. COM–31, no. 3, pp. 388–397, March 1983.  H. Knutsson, L. Haglund, and G. H. Granlund, “A new approach to image enhancement using tensor fields,” in Proc. PROART Workshop on Vision, 1990, pp. 111–115.  B. Johansson, “Low level operations and learning in computer vision,” Ph.D. dissertation, Linköping University, Sweden, SE-581 83 Linköping, Sweden, December 2004, dissertation No. 912, ISBN 91-85295-93-0.  H. Scharr, S. Körkel, and B. Jähne, “Numerische Isotropieoptimierung von FIR-Filtern mittels Querglättung,” in 19. DAGM Symposium Mustererkennung, Braunschweig, E.Paulus and F.M.Wahl, Eds., 1997, pp. 367–374.  J. Weickert, “Anisotropic diffusion in image processing,” Ph.D. dissertation, Faculty of Mathematics, University of Kaiserslautern, 1996.  M. J. Black and A. Rangarajan, “On the unification of line processes, outlier rejection, and robust statistics with applications in early vision,” International Journal of Computer Vision, vol. 19, no. 1, pp. 57–91, 1996.  P. D. Lax and R. D. Richtmyer, “Survey of the stability of linear finite difference equations,” Comm. Pure Appl. Math., vol. 9, pp. 267–293, 1956.  S. McKee and A. R. Mitchell, “Alternating direction methods for parabolic equations in two space dimensions with a mixed derivative,” The Computer Journal, vol. 13, no. 1, pp. 81–86, 1970.  W. Förstner, “Image preprocessing for feature extraction in digital intensity, color and range images,” in Proc. Int’l. Summer School on Data Analysis and the Statistical Foundation of Geomatics, ser. LNES. Springer, 1998.  R. Terebes, O. Lavialle, P. Baylou, and M. Borda, “Mixed anisotropic diffusion,” in International Conference on Pattern Recognition, vol. 3, 2002, pp. 760–763.  S. Roth and M. J. Black, “Fields of experts: A framework for learning image priors,” in Computer Vision and Pattern Recognition, vol. 2, 2005, pp. 860–869.  J. Bai and X.-C. Feng, “Fractional-order anisotropic diffusion for image denoising,” IEEE Trans. Image Processing, vol. 16, no. 10, pp. 2492– 2502, 2007. TRANSACTIONS OF IMAGE PROCESSING, VOL. ..., NO. ..., ... 2011 Michael Felsberg received the Diploma degree in 1998 and the PhD degree in 2002 (summa cum laude, awarded ’Fakultätspreis 2002’) in engineering from Christian-Albrechts-University Kiel, Germany. During his PhD studies, he held scholarships from the DFG and the German National Merit Foundation. He became associate professor at Linköping University, Sweden, in 2004, docent in 2005, and full professor of computer vision and head of division in 2008. His research interests include multidimensional signal theory, image processing, low-level computer vision, and signal-processing based approaches to probabilistic representations and learning. He is coordinator of the Seventh Framework EU Programme Project DIPLECS. He has published more than 90 conference papers, journal articles, and book contributions on image processing, computer vision, and pattern recognition. He received paper awards from the German Pattern Recognition Society (DAGM) in 2000 and 2004 and from the Swedish Society for Automated Image Analysis (SSBA) in 2007 and 2010. He received the Olympus Award in 2005. He is vice-president of the SSBA.
* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project
Efficient Spatiotemporal Filtering and Modelling Jörgen Karlholm Linköping Studies in Science and Technology