Discontinuous Galerkin Methods: Linear Systems and Hidden Accuracy

Dissertation (Proefschrift) for the degree of doctor at Delft University of Technology, by authority of the Rector Magnificus, prof.ir. K.C.A.M. Luyben, chairman of the Board for Doctorates, to be defended in public on Friday 14 June 2013 at 15:00 by Paulien van Slingerland, wiskundig ingenieur, born in Leiderdorp.

This dissertation has been approved by the promotor: Prof.dr.ir. C. Vuik

Composition of the doctoral committee:
Rector Magnificus, chairman
Prof.dr.ir. C. Vuik, Technische Universiteit Delft, promotor
Prof.dr. Y. Notay, Université Libre de Bruxelles
Prof.dr. J. Qiu, Xiamen University
Prof.dr.ir. B. Koren, Technische Universiteit Eindhoven
Dr. H.Q. van der Ven, Nationaal Lucht- en Ruimtevaartlaboratorium
Prof.dr.ir. J.D. Jansen, Technische Universiteit Delft
Prof.dr.ir. C.W. Oosterlee, Technische Universiteit Delft

The research in this dissertation was carried out at and supported by the Delft Institute of Applied Mathematics, Delft University of Technology. Part of the work was also sponsored by the Air Force Office of Scientific Research, Air Force Materiel Command, USAF, under grant number FA8655-09-1-3055.

Discontinuous Galerkin Methods: Linear Systems and Hidden Accuracy. Dissertation at Delft University of Technology.
Copyright © 2013 by P. van Slingerland
ISBN: 978-94-6186-157-3

Summary

Discontinuous Galerkin Methods: Linear Systems and Hidden Accuracy

Just like it is possible to predict tomorrow's weather, it is possible to predict, for example, the presence of oil in a reservoir, or the air flow around a newly designed airfoil. These predictions are often based on computer simulations, for which the Discontinuous Galerkin (DG) method can be particularly suitable. This discretization scheme uses discontinuous piecewise polynomials of degree p to combine the best of both classical finite element and finite volume methods.
This thesis focuses on its linear systems and ‘hidden’ accuracy.

Linear DG systems are relatively large and ill-conditioned. In search of efficient linear solvers, much attention has been paid to subspace correction methods. A particular example is a two-level preconditioner with a coarse space that is based on the DG scheme with polynomial degree p = 0. This more or less reduces the DG matrix to a (smaller) central difference matrix, for which many efficient linear solvers are readily available. An alternative to preconditioning is deflation. To contribute to the ongoing comparison between multigrid and deflation, and to extend the available analysis for the aforementioned two-level preconditioner, we have cast it into the deflation framework, and studied the impact of both variants on the convergence of the Conjugate Gradient (CG) method. This thesis discusses the results for Symmetric Interior Penalty (discontinuous) Galerkin (SIPG) discretizations for diffusion problems with strong contrasts in the coefficients. In addition, it considers the influence of the SIPG penalty parameter, weighted averages in the SIPG formulation (SWIP), the smoother, damping of the smoother, and the strategy for solving the coarse systems. We have found that both two-level methods yield fast and scalable CG convergence (independent of the mesh element diameter), provided that the penalty parameter is chosen depending on local values of the diffusion coefficient. The latter choice also benefits the discretization accuracy. Without damping, deflation can be up to 35% faster. If an optimal damping parameter is used, both two-level strategies yield similar efficiency. However, it remains an open question how the damping parameter can best be selected in practice.

At the same time, DG approximations can contain ‘hidden’ accuracy.
For a large class of sufficiently smooth periodic problems, there exists a post-processor that enhances the DG convergence from order p + 1 to order 2p + 1. Interestingly, this technique needs to be applied only once, at the final simulation time, and does not require any information about the underlying physics or numerics. To be able to post-process near non-periodic boundaries and shocks as well, we have developed the so-called position-dependent post-processor, and analyzed its impact on the discretization accuracy in both the L2- and the L∞-norm. This thesis presents the results for DG (upwind) discretizations for hyperbolic problems, and demonstrates the benefits of post-processing for streamline visualization. Our numerical and theoretical results show that the position-dependent post-processor can enhance the DG convergence from order p + 1 to order 2p + 1. Furthermore, unlike before, this technique can be applied in the entire domain to enhance the smoothness and accuracy of the DG approximation, even near non-periodic boundaries and shocks.

Altogether, this work contributes to shorter computational times and more accurate visualization of DG approximations. This strengthens the increasing consensus that DG methods can be an effective alternative to classical discretization schemes, and sustains the idea that numerical approximations can contain more information than we originally thought.

Contents

1 Introduction
  1.1 Introduction
  1.2 Discontinuous Galerkin (DG) methods
  1.3 Linear DG systems
  1.4 Hidden DG accuracy
  1.5 Outline
2 Linear DG systems
  2.1 Introduction
  2.2 Discretization
    2.2.1 SIPG method for diffusion problems
    2.2.2 Linear system
    2.2.3 Penalty parameter
  2.3 Two-level preconditioner
    2.3.1 Coarse correction operator
    2.3.2 Two-level preconditioner
    2.3.3 Implementation in a CG algorithm
  2.4 Deflation variant
    2.4.1 Two-level deflation
    2.4.2 FLOPS
    2.4.3 Coarse systems
    2.4.4 Damping
  2.5 Numerical experiments
    2.5.1 Experimental setup
    2.5.2 The influence of the penalty parameter
    2.5.3 Coarse systems
    2.5.4 Smoothers and damping
    2.5.5 Other test cases
  2.6 Conclusion

3 Theoretical scalability
  3.1 Introduction
  3.2 Methods and assumptions
    3.2.1 SIPG discretization for diffusion problems
    3.2.2 Linear system
    3.2.3 Two-level preconditioning and deflation
  3.3 Abstract relations for any SPD matrix A
    3.3.1 Using the error iteration matrix
    3.3.2 Implications for the two-level methods
    3.3.3 Comparing deflation and preconditioning
  3.4 Intermezzo: regularity on the block diagonal of A
    3.4.1 Using regularity of the mesh
    3.4.2 The desired result in terms of ‘small’ bilinear forms
    3.4.3 Regularity on the block diagonal of A
  3.5 Main result: scalability for SIPG systems
    3.5.1 Main result: scalability for SIPG systems
    3.5.2 Special case: block Jacobi smoothing
    3.5.3 Influence of damping and the penalty parameter for block Jacobi smoothing
  3.6 Conclusion

4 Hidden DG accuracy
  4.1 Introduction
  4.2 Discretization
    4.2.1 DG for one-dimensional hyperbolic problems
    4.2.2 DG for two-dimensional hyperbolic systems
  4.3 Original post-processing strategies
    4.3.1 B-splines
    4.3.2 Symmetric post-processor
    4.3.3 One-sided post-processor
  4.4 Position-dependent post-processor
    4.4.1 Generalized post-processor
    4.4.2 Position-dependent post-processor
    4.4.3 Post-processing in two dimensions
  4.5 Numerical Results
    4.5.1 L2-Projection
    4.5.2 Constant coefficients
    4.5.3 Dirichlet BCs
    4.5.4 Variable coefficients
    4.5.5 Discontinuous coefficients
    4.5.6 Two-dimensional system
    4.5.7 Two-dimensional streamlines
  4.6 Conclusion

5 Theoretical Superconvergence
  5.1 Introduction
  5.2 Methods and notation
    5.2.1 Post-processor
    5.2.2 DG discretization for hyperbolic problems
    5.2.3 Additional notation
  5.3 Auxiliary results
    5.3.1 Estimating ‖u − u⋆‖
    5.3.2 Derivatives of B-splines
  5.4 The main result in abstract form
    5.4.1 Reducing the post-processor to its building blocks
    5.4.2 Treating the remaining building blocks
    5.4.3 The main error estimate in abstract form
  5.5 The main result for DG approximations
    5.5.1 Unfiltered DG convergence
    5.5.2 Main result: extracting DG superconvergence
    5.5.3 Implications for the position-dependent post-processor
  5.6 Conclusion

6 Conclusion
  6.1 Introduction
  6.2 Linear DG systems
  6.3 Hidden DG accuracy
  6.4 Future research
1 Introduction

1.1 Introduction

Just like it is possible to predict tomorrow's weather, it is possible to predict the presence of oil in a reservoir, or the air flow around a newly designed airfoil, for instance. Such predictions can be based on field or model experiments, but also on computer simulations. The latter can be significantly cheaper (in terms of both time and money), and allow us to study potentially dangerous situations in a safe and virtual setting. Unfortunately, for most real-life applications, the underlying physical model is too complex to calculate the corresponding solution exactly. Hence, we must settle for an approximation instead. The mathematical field of numerical analysis is concerned with advanced numerical algorithms for this purpose.

In principle, the numerical algorithms available can provide solution approximations with an arbitrarily small error. However, the computational time (and computer memory) must also be taken into account: tomorrow's weather forecast should not take a week to compute, for instance. This trade-off between accuracy and efficiency can be compared to using pixels in a digital camera: more pixels mean sharper photos, but also more data to process and store. It remains a challenge to develop numerical algorithms with ‘smarter pixels’ to improve this balance for many applications.

In this context, a particular challenge is formed by problems with shocks or strong contrasts in the model coefficients. The latter is usually the case for oil reservoir simulations, for example, where layers of different material (sand, shale, etc.) may result in permeability contrasts with typical values between 10^{-1} and 10^{-7}. In this thesis, we seek to improve on existing numerical algorithms for such applications. In particular, we focus on so-called Discontinuous Galerkin (DG) discretization schemes, their linear systems, and their ‘hidden accuracy’.
The remainder of this chapter introduces this research in the following manner: Section 1.2 provides a short introduction to Discontinuous Galerkin (DG) discretizations and their advantages in the context of finite volume schemes. Section 1.3 motivates the importance of an efficient solver for the corresponding linear systems. Section 1.4 discusses the hidden accuracy that DG approximations can contain and its extraction through post-processing. Finally, Section 1.5 provides an outline of this thesis.

1.2 Discontinuous Galerkin (DG) methods

Discontinuous Galerkin (DG) methods [63, 4, 21, 39] are flexible discretization schemes for approximating the solution to partial differential equations. A DG method can be thought of as a Finite Volume Method (FVM) that uses piecewise polynomials rather than piecewise constants. More specifically, for a given problem and mesh, a DG approximation is a polynomial of degree p or lower within each mesh element, while it may be discontinuous at the element boundaries. As such, it combines the best of both classical finite element and finite volume methods. This makes DG methods particularly suitable for handling non-matching grids and local hp-refinement strategies.

A DG method typically yields a higher convergence rate than a FVM (assuming similar numerical fluxes), as it makes use of a higher polynomial degree. Figure 1.1 illustrates this for a one-dimensional problem with five mesh elements.¹ This higher-order accuracy comes at a higher price though. While the FVM requires only one unknown per mesh element, a DG method requires multiple unknowns per mesh element: one for each polynomial basis function, e.g. p + 1 for one-dimensional problems, and (p+1)(p+2)/2 for two-dimensional problems. As a consequence, DG matrices are relatively large (both in size and in terms of the total number of nonzeros), resulting in higher computational costs.
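As a small illustrative sketch (not part of the thesis; the helper name is hypothetical), the unknown counts above can be tabulated to reproduce the matrix sizes quoted below for a 3 × 3 mesh:

```python
def dofs_per_element(p: int, dim: int) -> int:
    """Number of DG unknowns per mesh element for polynomial degree p."""
    if dim == 1:
        return p + 1                   # 1D: p + 1 basis functions
    if dim == 2:
        return (p + 1) * (p + 2) // 2  # 2D: dimension of the polynomial space
    raise ValueError("only 1D and 2D are sketched here")

# A 3 x 3 Cartesian mesh has 9 elements. With p = 2, the DG matrix is
# 54 x 54, versus 9 x 9 for the FVM (one unknown per element):
n_elements = 9
print(n_elements * dofs_per_element(2, 2))  # 54
print(n_elements * dofs_per_element(0, 2))  # 9
```

For p = 0 the DG count reduces to one unknown per element, consistent with the interpretation of the FVM as the lowest-order DG scheme.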
Figure 1.2 illustrates this for a two-dimensional Laplace problem on a 3 × 3 uniform Cartesian mesh. In this light, it is interesting to compare the accuracy of the DG method and the FVM for a fixed number of unknowns, rather than a fixed number of mesh elements. Figure 1.3 demonstrates that a DG method can yield much higher accuracy for the same number of unknowns.² Together with the advantages mentioned earlier, this motivates the current study of DG methods in this thesis.

¹ In the illustrations in this section, the DG scheme under consideration is the SIPG method specified in Sections 2.2 and 3.2. The FVM is based on central fluxes.
² In this figure, we study a two-dimensional diffusion problem with five layers, where the diffusion K is either 1 or 0.001 in each layer. The meshes are Cartesian and uniform. See Section 2.5.1 for further details.

Figure 1.1. A DG method typically yields a higher convergence rate than a FVM, as it makes use of a higher polynomial degree p (left panel: FVM, p = 0; right panel: DG, p = 1).
Figure 1.2. DG matrices are relatively large: for the same 3 × 3 mesh, the FVM matrix is 9 × 9, whereas the DG matrix with p = 2 is 54 × 54.
Figure 1.3. DG discretizations can yield significantly better accuracy (in the L2 norm) for the same number of unknowns (error versus number of unknowns for the FVM and DG with p = 3).

1.3 Linear DG systems

In the previous section, we have seen that linear DG systems are relatively large. At the same time, their condition number typically increases with the number of mesh elements, the polynomial degree, and the stabilization factor [17, 72]. Figure 1.4 illustrates this.³ Problems with extreme contrasts in the coefficients (cf. Section 1.1) usually pose an extra challenge. In search of efficient and scalable algorithms (for which the number of iterations does not increase with e.g.
the number of mesh elements), much attention has been paid to subspace correction methods [83]. For example, Schwarz domain decomposition methods are based on subspaces that arise from subdividing the spatial domain into smaller subdomains [3, 32]; geometric (h-)multigrid methods make use of subspaces resulting from multiple coarser meshes [34, 13]; spectral (p-)multigrid methods apply global corrections by solving problems with a lower polynomial degree [33, 58]; and algebraic multigrid methods use algebraic criteria to separate the unknowns of the original system into two sets, one of which is labeled ‘coarse’ [59, 68]. A particular example makes use of a single coarse space that is based on the DG scheme with polynomial degree p = 0. This two-level method, proposed by Dobrev et al. [24], more or less reduces the DG matrix to a (smaller) central difference matrix, for which many efficient linear solvers are readily available.

Usually, the subspace correction methods above can be used either as a standalone solver or as a preconditioner in an iterative Krylov method. The second strategy tends to be more robust for problems with a few isolated ‘bad’ eigenvalues, which is common for problems with strongly varying coefficients. An alternative to preconditioning is the method of deflation. This technique was developed in the late 1980s by Nicolaides [55], Dostal [26] and Mansfield [46], and further studied in [69, 80, 81], among others. Deflation is strongly related to the subspace correction methods mentioned earlier, as found in [52, 53, 54, 76, 75]. To contribute to this ongoing comparison between multigrid and deflation, and to extend the available analysis for the aforementioned two-level preconditioner [24], we have cast it into the deflation framework, and studied the impact of both variants on the convergence of the Conjugate Gradient (CG) method.
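To make the deflation framework concrete, here is a minimal NumPy sketch (not the thesis implementation; all sizes and the test matrix are illustrative). It builds a piecewise-constant coarse space Z, the Galerkin coarse matrix E = ZᵀAZ, and the deflation operator P = I − AZE⁻¹Zᵀ, and checks that P annihilates the coarse-space components:

```python
import numpy as np

rng = np.random.default_rng(0)
m, N = 3, 4                      # unknowns per element, number of elements
n = m * N
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)      # SPD test matrix playing the role of the DG matrix

# Z has one column per element, selecting that element's constant mode
# (the analogue of the p = 0 coarse space):
Z = np.zeros((n, N))
for j in range(N):
    Z[j * m, j] = 1.0

E = Z.T @ A @ Z                                  # coarse (Galerkin) matrix, N x N
P = np.eye(n) - A @ Z @ np.linalg.solve(E, Z.T)  # deflation operator

# P removes the coarse-space components: P A Z = 0
print(np.allclose(P @ A @ Z, 0.0))  # True
```

The identity P A Z = A Z − A Z E⁻¹(ZᵀA Z) = 0 holds for any SPD matrix A with full-rank Z, which is why deflation can eliminate the isolated ‘bad’ eigenvalues associated with the coarse space.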
This thesis discusses the results for Symmetric Interior Penalty (discontinuous) Galerkin (SIPG) discretizations for diffusion problems with strong contrasts in the coefficients. In addition, it considers the influence of the SIPG penalty parameter, weighted averages in the SIPG formulation (SWIP), the smoother, damping of the smoother, and the strategy for solving the coarse systems.

³ In this figure, we study the two-dimensional Laplace problem on a uniform Cartesian mesh with 10 × 10 mesh elements. We use the SIPG discretization with (stabilizing) penalty parameter σ = 2p². See Section 2.5.1 for further details.

Figure 1.4. The condition number κ₂(A) of a DG matrix can increase rapidly with the polynomial degree p (condition number versus p = 0, ..., 5).

1.4 Hidden DG accuracy

DG approximations can contain ‘hidden accuracy’: although the convergence rate of a DG scheme is typically of order p + 1 (where p is the polynomial degree), the Fourier analysis of Lowrie [45] reveals an accurate mode that evolves with an accuracy of order 2p + 1. Furthermore, Adjerid et al. [1] proved superconvergence of order p + 2 at the roots of a Radau polynomial of degree p + 1 on each element, and of order 2p + 1 at the downwind point of each element. See also [2] and references therein.

This hidden accuracy can be extracted by applying a local averaging operator, which was introduced by Bramble and Schatz [11] in the context of Ritz-Galerkin approximations for elliptic problems. They showed that this so-called symmetric post-processor yields superconvergence of order 2p + 1. Interestingly, this technique needs to be applied only once, at the final simulation time, and does not require any information about the underlying physics or numerics. Thomée [77] provided alternative proofs for the results in [11] using Fourier transforms. This point of view reveals that the symmetric post-processor is related to the Lanczos filter (see e.g.
[38, p. 163]), which is a classical filter in the context of spectral methods for reducing the Gibbs phenomenon. Thomée also demonstrated that a modified version of the post-processor can be used to obtain accurate derivative approximations. This work was extended in [78] for semidiscrete Galerkin finite element methods for parabolic problems.

Cockburn, Luskin, Shu, and Süli [22] combined the ideas in [11] and [51] to apply the post-processor to DG approximations for linear periodic hyperbolic PDEs. They showed that superconvergence of order 2p + 1 can be extracted by the symmetric post-processor for this alternative application type. Since then, the post-processor has been modified to be able to post-process near non-periodic boundaries [64], applied to extract derivatives (following [77]) [65, 66], improved computationally [49, 50], studied for non-uniform rectangular [23], structured triangular [47] and unstructured triangular [48] meshes, analyzed for linear convection-diffusion problems [41] and nonlinear hyperbolic conservation laws [42], and used to enhance streamline visualizations [73, 82].

To improve the accuracy and smoothness of the one-sided post-processor [64] near non-periodic boundaries and shocks, we have developed the so-called position-dependent post-processor, and analyzed its impact on the discretization accuracy in both the L2- and the L∞-norm. This thesis presents the results for DG (upwind) discretizations for hyperbolic problems, and demonstrates the benefits of the proposed post-processor for streamline visualization.

1.5 Outline

The outline of this thesis is as follows. Chapter 2 discusses the two-level preconditioner proposed by Dobrev et al. [24], and casts it into the deflation framework. The difference between both methods is studied through various numerical experiments. Chapter 3 provides theoretical support for the convergence of both the two-level preconditioner and the deflation variant.
In particular, it derives upper bounds for the condition number of the preconditioned/deflated system. Chapter 4 discusses the original (symmetric and) one-sided post-processor [64], and proposes the position-dependent post-processor. Both techniques are compared numerically in terms of smoothness and accuracy. Chapter 5 presents theoretical error estimates in both the L2- and the L∞-norm for the position-dependent post-processor. Chapter 6 summarizes the main conclusions of this thesis and indicates possibilities for future research.

2 Linear DG systems

This chapter is based on: P. van Slingerland, C. Vuik, Fast linear solver for pressure computation in layered domains. Submitted to Comput. Geosci.

2.1 Introduction

Linear DG systems are relatively large and ill-conditioned (cf. Section 1.3). In search of efficient linear solvers, much attention has been paid to subspace correction methods. A particular example is a two-level preconditioner proposed by Dobrev et al. [24]. This method uses coarse corrections based on the DG discretization with polynomial degree p = 0. Using the analysis of Falgout et al. [31], Dobrev et al. have shown theoretically (for polynomial degree p = 1) that this preconditioner yields scalable convergence of the CG method, independent of the mesh element diameter. Another nice property is that the use of only two levels offers an appealing simplicity. More importantly, the coefficient matrix that is used for the coarse correction is quite similar to a matrix resulting from a central difference discretization, for which very efficient solution techniques are readily available.

An alternative to preconditioning is the method of deflation. This method has been proved effective for layered problems with extreme contrasts in the coefficients in [80, 81]. Deflation is related to multigrid in the sense that it also makes use of a coarse space that is combined with a smoothing operator at the fine level.
This relation has been considered from an abstract point of view by Tang et al. in [76, 75]. To continue this comparison between preconditioning and deflation in the context of DG schemes, and to extend the work in [24], we have cast the two-level preconditioner into the deflation framework, and studied the impact of both variants on the convergence of the Conjugate Gradient (CG) method. This chapter discusses the numerical results for Symmetric Interior Penalty (discontinuous) Galerkin (SIPG) discretizations for diffusion problems with strong contrasts in the coefficients. In addition, it considers the influence of the SIPG penalty parameter, weighted averages in the SIPG formulation, the smoother, damping of the smoother, and the strategy for solving the coarse systems. Theoretical analysis of both two-level methods will be considered in Chapter 3.

The outline of this chapter is as follows. Section 2.2 discusses the SIPG method for diffusion problems with large jumps in the coefficients. To solve the resulting systems, Section 2.3 discusses the two-level preconditioner. Section 2.4 rewrites the latter as a deflation method. Section 2.5 compares the performance of both two-level methods through various numerical experiments. Section 2.6 summarizes the main conclusions.

2.2 Discretization

This section specifies the DG discretization under consideration. Section 2.2.1 discusses the SIPG method for stationary diffusion problems following [63]. Section 2.2.2 describes the resulting linear systems. Section 2.2.3 motivates the use of a diffusion-dependent penalty parameter following [27].

2.2.1 SIPG method for diffusion problems

We study the following model problem on the spatial domain Ω with boundary ∂Ω = ∂Ω_D ∪ ∂Ω_N and outward normal n:

$$-\nabla \cdot (K \nabla u) = f \ \text{ in } \Omega, \qquad u = g_D \ \text{ on } \partial\Omega_D, \qquad K \nabla u \cdot n = g_N \ \text{ on } \partial\Omega_N. \tag{2.1}$$
We assume that the diffusion K is a symmetric and positive-definite tensor whose eigenvalues are bounded below and above by positive constants, and that the other model parameters are chosen such that a weak solution of (2.1) exists¹.

The SIPG approximation for the model above can be constructed in the following manner. First, choose a mesh with elements E_1, ..., E_N. The numerical experiments in this chapter are for uniform Cartesian meshes on the domain Ω = [0, 1]², although our solver can be applied to a wider range of problems. Next, define the test space V that contains each function that is a polynomial of degree p or lower within each mesh element, and that may be discontinuous at the element boundaries. The SIPG approximation u_h is now defined as the unique element of this test space that satisfies the relation

$$B(u_h, v) = L(v) \quad \text{for all test functions } v \in V, \tag{2.2}$$

where B and L are (bi)linear forms that are specified hereafter. To define these forms for mesh elements of size h × h, we require the following additional notation: the vector n_i denotes the outward normal of mesh element E_i; the set Γ_h is the collection of all interior edges e = ∂E_i ∩ ∂E_j; the set Γ_D is the collection of all Dirichlet boundary edges e = ∂E_i ∩ ∂Ω_D; and the set Γ_N is the collection of all Neumann boundary edges e = ∂E_i ∩ ∂Ω_N. Finally, we introduce the usual trace operators for jumps and averages at the mesh element boundaries. In the interior, we define at ∂E_i ∩ ∂E_j:

$$[v] = v_i \cdot n_i + v_j \cdot n_j, \qquad \{v\} = \tfrac{1}{2}(v_i + v_j),$$

where v_i denotes the trace of the (scalar- or vector-valued) function v along the side of E_i with outward normal n_i. Similarly, at the domain boundary, we define at ∂E_i ∩ ∂Ω:

$$[v] = v_i \cdot n_i, \qquad \{v\} = v_i.$$
Using this notation, the forms B and L can be defined as follows:

    B(uh, v) = Σ_{i=1}^{N} ∫_{Ei} K∇uh · ∇v
             + Σ_{e∈Γh∪ΓD} ∫_e ( −{K∇uh} · [v] − [uh] · {K∇v} + (σ/h) [uh] · [v] ),

    L(v) = ∫_Ω f v + Σ_{e∈ΓD} ∫_e ( −[K∇v] + (σ/h) v ) gD + Σ_{e∈ΓN} ∫_e v gN,

where σ is the so-called penalty parameter. This positive parameter penalizes the inter-element jumps to enforce weak continuity and ensure convergence. Although it is presented as a constant here, its value may vary throughout the domain. This is discussed further in Section 2.2.3 later on. For a large class of sufficiently smooth problems, the SIPG method yields convergence of order p + 1 [63].

¹That is, f, gN ∈ L²(Ω), and gD ∈ H^{1/2}(Ω) [63, p. 25, 26].

2.2.2 Linear system

In order to compute the SIPG approximation defined by (2.2), it needs to be rewritten as a linear system. To this end, we choose basis functions for the test space V. More specifically, for each mesh element Ei, we define the basis function φ1^{(i)}, which is zero in the entire domain, except in Ei, where it is equal to one. Similarly, we define higher-order basis functions φ2^{(i)}, ..., φm^{(i)}, which are higher-order polynomials in Ei and zero elsewhere.

In this chapter, we use monomial basis functions. These are defined as follows. In the mesh element Ei with center (xi, yi) and size h × h, the function φk^{(i)} reads:

    φk^{(i)}(x, y) = ( (x − xi) / (½h) )^{kx} ( (y − yi) / (½h) )^{ky},

where kx and ky are selected as follows:

    k    1  |  2  3  |  4  5  6  |  7  8  9  10  | ...
    kx   0  |  1  0  |  2  1  0  |  3  2  1  0   | ...
    ky   0  |  0  1  |  0  1  2  |  0  1  2  3   | ...
        p=0 |  p=1   |    p=2    |      p=3      | ...

The dimension of the basis within one mesh element is equal to m = (p+1)(p+2)/2. Next, we express uh as a linear combination of the basis functions:

    uh = Σ_{i=1}^{N} Σ_{k=1}^{m} uk^{(i)} φk^{(i)}.    (2.3)

The new unknowns uk^{(i)} in (2.3) can be determined by solving a linear system Au = b of the form:

    [ A11  A12  ...  A1N ] [ u1 ]   [ b1 ]
    [ A21  A22       ... ] [ u2 ]   [ b2 ]
    [ ...       ...  ... ] [ ...] = [ ...]
    [ AN1  ...       ANN ] [ uN ]   [ bN ]
where the blocks all have dimension m:

    Aji = [ B(φ1^{(i)}, φ1^{(j)})  B(φ2^{(i)}, φ1^{(j)})  ...  B(φm^{(i)}, φ1^{(j)}) ]
          [ B(φ1^{(i)}, φ2^{(j)})  B(φ2^{(i)}, φ2^{(j)})       B(φm^{(i)}, φ2^{(j)}) ]
          [ ...                                           ...  ...                   ]
          [ B(φ1^{(i)}, φm^{(j)})  ...                         B(φm^{(i)}, φm^{(j)}) ]

    ui = [ u1^{(i)}, u2^{(i)}, ..., um^{(i)} ]ᵀ,    bj = [ L(φ1^{(j)}), L(φ2^{(j)}), ..., L(φm^{(j)}) ]ᵀ,

for all i, j = 1, ..., N. This system is obtained by substituting the expression (2.3) for uh and the basis functions φℓ^{(j)} for v into (2.2). Once the unknowns uk^{(i)} are solved from the system Au = b, the final SIPG approximation uh can be obtained from (2.3).

2.2.3 Penalty parameter

The SIPG method involves the penalty parameter σ, which penalizes the inter-element jumps to enforce weak continuity. This parameter should be selected carefully: on the one hand, it needs to be sufficiently large to ensure that the SIPG method converges and the coefficient matrix A is Symmetric and Positive Definite (SPD) [63]. At the same time, it needs to be chosen as small as possible, since the condition number of A increases rapidly with the penalty parameter [17].

Computable theoretical lower bounds for a large variety of problems have been derived by Epshteyn and Riviere [28]. For one-dimensional diffusion problems, it suffices to choose σ ≥ 2 (k1²/k0) p², where k0 and k1 are the global lower and upper bound respectively for the diffusion coefficient K. However, while this lower bound for σ is sufficient to ensure convergence (assuming the exact solution is sufficiently smooth), it can be impractical for problems with strong variations in the coefficients. For instance, if the diffusion coefficient K takes values between 1 and 10⁻³, we obtain σ ≥ 2000p², which is inconveniently large. For this reason, it is common practice to choose e.g. σ = 20 rather than σ = 20 000 for such problems [24, 60]. An alternative strategy is to choose the penalty parameter based on local values of the diffusion coefficient K, e.g. choosing σ = 20K rather than σ = 20.
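For illustration, the lower bound above can be evaluated for the example mentioned in the text (a minimal sketch; the function name is ours):

```python
# Epshteyn-Riviere-type lower bound for the penalty parameter of a
# one-dimensional diffusion problem: sigma >= 2 * k1^2 / k0 * p^2,
# with k0 and k1 the global lower and upper bounds on K.
def sigma_lower_bound(k0, k1, p):
    return 2.0 * k1 ** 2 / k0 * p ** 2

# K between 1e-3 and 1 gives the inconveniently large bound 2000 p^2:
for p in (1, 2, 3):
    print(p, sigma_lower_bound(1e-3, 1.0, p))
```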
It has been demonstrated numerically in [25] that such a diffusion-dependent penalty parameter can benefit the efficiency of a linear solver (also cf. Section 2.5.2). For general tensors K, this strategy can be defined as follows: for an edge with normal n, we set σ = αλ, where λ = nᵀKn and α is a user-defined constant. The latter should be as small as possible in light of the discussion above.

If the diffusion is discontinuous, this definition may not be unique. For instance, in the example above, we could have λ = 1 on one side and λ = 0.001 on the other side of an edge. In such cases, it seems a safe choice to use the largest limit value of λ in the definition above (e.g. λ = 1 in the example). The reason for this is that theoretical stability and convergence analysis are usually based on a penalty parameter that is sufficiently large.

An alternative strategy for dealing with discontinuities is to use the harmonic average of both limit values [27, 15, 29, 7]. In this case, the penalty parameter reads σ = 2α λi λj / (λi + λj), where λi and λj are based on the information in the mesh elements Ei and Ej respectively (adjacent to the edge under consideration). This choice is equivalent to using the minimum of both limit values [7, p. 5]. In that sense, it seems less ‘safe’ than the maximum strategy above.

In [27, 15, 29, 7], the ‘harmonic’ penalty parameter is used in combination with the so-called Symmetric Weighted Interior Penalty (SWIP) method. The main difference between the standard SIPG method and the SWIP method is the following: whenever an average of a function at a mesh element boundary is considered (denoted by {.} in Section 2.2.1), the SWIP method uses a weighted average rather than the standard average. For this purpose, the weights typically depend on the diffusion coefficient, i.e. wi = λj / (λi + λj) and wj = λi / (λi + λj) (note that the harmonic penalty can then be written as σ = α(wi λi + wj λj)).
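The two diffusion-dependent strategies can be summarized in a short sketch (function names are ours; λ denotes the limit values nᵀKn on either side of an edge):

```python
def penalty_max(alpha, lam_i, lam_j):
    # 'maximum' strategy: use the largest limit value at a discontinuity
    return alpha * max(lam_i, lam_j)

def penalty_harmonic(alpha, lam_i, lam_j):
    # 'harmonic' strategy: sigma = 2*alpha*lam_i*lam_j / (lam_i + lam_j)
    return 2.0 * alpha * lam_i * lam_j / (lam_i + lam_j)

def swip_weights(lam_i, lam_j):
    # SWIP weights; note w_i + w_j = 1, and alpha*(w_i*lam_i + w_j*lam_j)
    # reproduces the harmonic penalty above
    return lam_j / (lam_i + lam_j), lam_i / (lam_i + lam_j)

alpha, lam_i, lam_j = 20.0, 1.0, 1e-3   # the example from the text
w_i, w_j = swip_weights(lam_i, lam_j)
print(penalty_max(alpha, lam_i, lam_j))       # 20.0
print(penalty_harmonic(alpha, lam_i, lam_j))  # much smaller here
```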
In this chapter, we study the effects of both a constant and a diffusion-dependent penalty parameter, using either the maximum or the harmonic strategy above. Furthermore, we consider both the SIPG and the SWIP method.

2.3 Two-level preconditioner

To solve the linear SIPG system obtained in the previous section, we start by considering the two-level preconditioner proposed by Dobrev et al. [24]. Section 2.3.1 specifies the corresponding coarse correction operator. Section 2.3.2 defines the resulting two-level preconditioner. Section 2.3.3 indicates its implementation in a standard preconditioned CG algorithm.

2.3.1 Coarse correction operator

The two-level preconditioner is defined in terms of a coarse correction operator Q ≈ A⁻¹ that switches from the original DG test space to a coarse subspace, performs a correction that is comparatively simple in this coarse space, and finally switches back to the original DG test space. In this case, the coarse subspace is spanned by the piecewise constant basis functions.

More specifically, the coarse correction operator Q is defined as follows. Let R denote the so-called restriction operator such that A0 := RARᵀ is the SIPG matrix for polynomial degree p = 0. More specifically, the matrix R is defined as the following N × N block matrix:

    R = [ R11  R12  ...  R1N ]
        [ R21  R22       ... ]
        [ ...       ...  ... ]
        [ RN1  ...       RNN ]

where the blocks all have size 1 × m:

    Rii = [ 1  0  ...  0 ],    Rij = [ 0  ...  0 ],    for all i, j = 1, ..., N and i ≠ j.
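Since each Rii selects only the first of the m local unknowns, multiplication by R and Rᵀ amounts to array slicing; a minimal NumPy sketch (the 2-element example matrix is hypothetical):

```python
import numpy as np

def restrict(v, m):
    # R v: keep the piecewise constant coefficient of each element
    return v[::m]

def prolongate(v0, m):
    # R^T v0: insert zeros for the higher-order coefficients
    v = np.zeros(len(v0) * m)
    v[::m] = v0
    return v

def coarse_matrix(A, m):
    # A0 = R A R^T: the upper-left entry of every m x m block of A
    return A[::m, ::m]

# Hypothetical example: N = 2 elements with m = 3 unknowns each.
A = np.arange(36.0).reshape(6, 6)
print(coarse_matrix(A, 3))   # entries A[0,0], A[0,3], A[3,0], A[3,3]
print(prolongate(restrict(np.arange(6.0), 3), 3))
```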
Using this notation, the coarse correction operator is defined as

    Q := Rᵀ A0⁻¹ R.    (2.4)

For example, for a Laplace problem on the domain [0, 1]² with p = 1, a uniform Cartesian mesh with 2 × 2 elements, and penalty parameter σ = 10, we obtain the following coarse matrix A0 and restriction matrix R (the full 12 × 12 matrix A, with 3 × 3 blocks, is omitted here):

    A0 = [  40  −10  −10    0 ]        R = [ 1 0 0   0 0 0   0 0 0   0 0 0 ]
         [ −10   40    0  −10 ]            [ 0 0 0   1 0 0   0 0 0   0 0 0 ]
         [ −10    0   40  −10 ]            [ 0 0 0   0 0 0   1 0 0   0 0 0 ]
         [   0  −10  −10   40 ]            [ 0 0 0   0 0 0   0 0 0   1 0 0 ]

Observe that A0 has the same structure as a central difference matrix (aside from a factor σ = 10). Furthermore, every element in A0 is also present in the upper left corner of the corresponding block in the matrix A. This is because the piecewise constant basis functions are assumed to be contained in any polynomial basis. As a consequence, the matrix R contains elements equal to 0 and 1 only, and does not need to be stored explicitly: multiplications with R can be implemented by simply extracting elements or inserting zeros.

2.3.2 Two-level preconditioner

We can now formulate the two-level preconditioner proposed by Dobrev et al. [24]. This is established by combining the coarse correction operator Q with a (damped) smoother:

Definition 2.3.1 (Two-level preconditioner) Consider the coarse correction operator Q defined in (2.4), a damping parameter ω ≤ 1, and an invertible smoother M⁻¹ ≈ A⁻¹. Then, the result y = Pprec r of applying the two-level preconditioner to a vector r can be computed as follows:

    y(1) = ωM⁻¹ r                      (pre-smoothing),
    y(2) = y(1) + Q(r − Ay(1))         (coarse correction),
    y    = y(2) + ωM⁻ᵀ(r − Ay(2))      (post-smoothing).    (2.5)
In this chapter, we consider block Jacobi and block Gauss-Seidel smoothing. These smoothers have the following property (cf. Chapter 3):

    M + Mᵀ − ωA is SPD.    (2.6)

Using the more abstract analysis in [79, p. 66], condition (2.6) implies that the preconditioning operator Pprec is SPD. As a consequence, the two-level preconditioner can be implemented in a standard preconditioned CG algorithm (cf. Section 2.3.3 hereafter). Requirement (2.6) also implies that the two-level preconditioner yields scalable convergence of the CG method (independent of the mesh element diameter) for a large class of problems. This has been shown for polynomial degree p = 1 by Dobrev et al. [24], using the analysis in [31]. In Chapter 3, we will extend this theory for p > 1.

2.3.3 Implementation in a CG algorithm

Assuming (2.6), the two-level preconditioner is SPD and can be implemented in a standard preconditioned CG algorithm. Below, we summarize the implementation of this scheme for a given preconditioning operator P and start vector x0:

 1. r0 := b − Ax0
 2. y0 := P r0
 3. p0 := y0
 4. for j = 0, 1, ... until convergence do
 5.     wj := Apj
 6.     αj := (rj, yj)/(pj, wj)
 7.     xj+1 := xj + αj pj
 8.     rj+1 := rj − αj wj
 9.     yj+1 := P rj+1
10.     βj := (rj+1, yj+1)/(rj, yj)
11.     pj+1 := yj+1 + βj pj
12. end

2.4 Deflation variant

Next, we cast the two-level preconditioner into the deflation framework using the abstract analysis in [76]. Section 2.4.1 defines the resulting operator. Section 2.4.2 compares the two-level preconditioner and the corresponding deflation variant in terms of computational costs. Section 2.4.3 discusses the coarse systems involved in each iteration. Section 2.4.4 considers the influence of damping of the smoother.

2.4.1 Two-level deflation

There are multiple strategies to construct a deflation method based on the components of the two-level preconditioner. An overview is given in [76].
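Before turning to the deflation variant, the CG listing of Section 2.3.3 can be combined with Definition 2.3.1 into a minimal NumPy sketch (the 4 × 4 test matrix, the simple smoother, and all names are hypothetical; for the symmetric smoothers used in this chapter, M⁻ᵀ = M⁻¹):

```python
import numpy as np

def two_level_prec(r, A, M_inv, Q, omega=1.0):
    # y = P_prec r as in Definition 2.3.1 (M assumed symmetric, so the
    # post-smoothing step may reuse M_inv in place of M^{-T})
    y1 = omega * M_inv(r)                   # pre-smoothing
    y2 = y1 + Q(r - A @ y1)                 # coarse correction
    return y2 + omega * M_inv(r - A @ y2)   # post-smoothing

def pcg(A, b, P, x0, tol=1e-6, maxit=1000):
    # standard preconditioned CG, following the listing in Section 2.3.3
    x = x0.copy()
    r = b - A @ x
    y = P(r)
    p = y.copy()
    for _ in range(maxit):
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        w = A @ p
        alpha = (r @ y) / (p @ w)
        x = x + alpha * p
        r_new = r - alpha * w
        y_new = P(r_new)
        beta = (r_new @ y_new) / (r @ y)
        p = y_new + beta * p
        r, y = r_new, y_new
    return x

# Hypothetical example: 2 'elements' with m = 2 unknowns each.
A = np.array([[4.0, 1, 0, 0], [1, 4, 1, 0], [0, 1, 4, 1], [0, 0, 1, 4]])
b = np.ones(4)
M_inv = lambda r: r / np.diag(A)            # Jacobi-type smoother sketch
A0 = A[::2, ::2]                            # A0 = R A R^T

def Q(r):
    # Q r = R^T A0^{-1} R r
    y = np.zeros_like(r)
    y[::2] = np.linalg.solve(A0, r[::2])
    return y

x = pcg(A, b, lambda r: two_level_prec(r, A, M_inv, Q), np.zeros(4))
print(np.allclose(A @ x, b, atol=1e-5))     # True
```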
Below, we consider the so-called ‘A-DEF2’ deflation scheme, as this type can be implemented relatively efficiently, and allows for inexact solving of coarse systems (cf. Section 2.4.3 later on). Basically, this deflation variant is obtained from Definition 2.3.1 by skipping the last smoothing step in (2.5).

Definition 2.4.1 (Two-level deflation) Consider the coarse correction operator Q defined in (2.4), a damping parameter ω ≤ 1, and an invertible smoother M⁻¹ ≈ A⁻¹. Then, the result y = Pdefl r of applying the two-level deflation technique to a vector r can be computed as:

    y(1) := ωM⁻¹ r                 (pre-smoothing),
    y    := y(1) + Q(r − Ay(1))    (coarse correction).    (2.7)

The operator Pdefl is not symmetric in general. As such, it seems unsuitable for the standard preconditioned CG method. Interestingly, it can still be implemented successfully in its current asymmetric form, as long as the smoother M⁻¹ is SPD (requirement (2.6) is not needed), and the start vector x0 is preprocessed according to:

    x0 ↦ Qb + (I − AQ)ᵀ x0.    (2.8)

Other than that, the CG implementation remains as discussed in Section 2.3.3. Indeed, it has been shown in [76, Theorem 3.4] that, under the aforementioned conditions, Pdefl yields the same CG iterates as an alternative operator (called ‘BNN’, cf. Section 3.2.3) that actually is SPD. It will be shown in Chapter 3 that, similar to the two-level preconditioner, the deflation variant above yields scalable convergence of the CG method (independent of the mesh element diameter).

2.4.2 FLOPS

Because the deflation variant skips one of the two smoothing steps, its costs per CG iteration are lower than for the preconditioning variant. In this section, we compare the differences in terms of FLoating point OPerationS (FLOPS). Table 2.1 displays the costs for a two-dimensional diffusion problem with polynomial degree p, a Cartesian mesh with N = n × n elements, and polynomial space dimension m := (p+1)(p+2)/2.
Using the preconditioning variant, the CG method requires (27m² + 14m)N flops per iteration, plus the costs for two smoothing steps and one coarse solve. Using the two-level deflation method, the CG method requires (18m² + 12m)N flops per iteration, plus the costs for only one smoothing step and one coarse solve.

A block Jacobi smoothing step with blocks of size m requires (2m² − m)N flops, assuming that an LU-decomposition is known. In this case, the smoothing costs are low compared to the costs for a matrix-vector product, and the deflation variant is roughly 30% cheaper (per iteration). For more expensive smoothers, this factor becomes larger. A block Gauss-Seidel sweep (either forward or backward) requires the costs for one block Jacobi step, plus the costs for the updates based on the off-diagonal elements, which are approximately 4m²N flops.

    operation                    flops (rounded)   # defl.   # prec.
    mat-vec (Au)                 9m²N              2         3
    inner product (uᵀv)          2mN               2         2
    scalar multiplication (αu)   mN                3         3
    vector update (u ± v)        mN                5         7
    smoothing (M⁻¹u)             variable          1         2
    coarse solve (A0⁻¹u0)        variable          1         1

Table 2.1. Comparing the computational costs per CG iteration for A-DEF2 deflation and the two-level preconditioner for our applications.

2.4.3 Coarse systems

Both two-level methods require the solution of a coarse system in each iteration, involving the coefficient matrix A0. In Section 2.3.1, we have seen that A0 has the same structure and size (N × N) as a central difference matrix. As a consequence, a direct solver is not feasible for most practical applications. At the same time, many effective inexact solvers are readily available for this type of system.

For some deflation methods, including DEF1, DEF2, R-BNN1, and R-BNN2, such an inexact coarse solver is not suitable. This is because those methods contain eigenvalue clusters at 0, so small perturbations in those schemes (e.g.
due to inexact coarse solves) can result in an “unfavorable spectrum, resulting in slow convergence of the method” [76, p. 353]. A-DEF2 does not have this limitation, as it clusters these eigenvalues at 1 rather than 0. This is one of the reasons why we focus on this particular deflation variant.

In Section 2.5.3 we will investigate the use of an inexact coarse solver that applies the CG method in an inner loop with a scalable algebraic multigrid preconditioner. This strategy will be studied for both the two-level preconditioner and the A-DEF2 deflation variant. An alternative strategy is the Flexible CG (FCG) method [6, 56], which can be used to relax the stopping criterion for inner iterations while remaining as effective as a direct solver. The main difference with standard CG lies in the explicit orthogonalization and truncation of the search direction vectors, possibly combined with a restart strategy. We do not study this topic further in this thesis.

2.4.4 Damping

Damping often benefits the convergence of multigrid methods [84]. For multigrid methods with smoother M = I, a “typical choice of [ω] is close to 1/‖A‖₂”, although a “better choice of [ω] is possible if we make further assumptions on how the eigenvectors of A associated with small eigenvalues are treated by coarse-grid correction” [75, p. 1727]. In that reference, the latter is established for a coarse space that is based on a set of orthonormal eigenvectors of A. However, such a result does not seem available yet for the coarse space (and smoothers) currently under consideration.

At the same time, deflation may not be influenced by damping at all. The latter has been observed theoretically in [75, p. 1727] for the DEF(1) variant. For the A-DEF2 variant under consideration, such a result is not yet available. Altogether, it is an open question how the damping parameter can best be selected in practice.
For this reason, we use an empirical approach in this chapter, and study the effects on both two-level methods for several values of ω ≤ 1.

2.5 Numerical experiments

Next, we compare the two-level preconditioner and the corresponding deflation variant through numerical experiments. Section 2.5.1 specifies the experimental setup. Section 2.5.2 studies the influence of the SIPG penalty parameter. Section 2.5.3 investigates the effectiveness of an inexact solver for the coarse systems. Section 2.5.4 studies the impact of (damping of) the smoother on the overall computational efficiency. Section 2.5.5 considers similar experiments for more challenging test cases.

2.5.1 Experimental setup

We consider multiple diffusion problems of the form (2.1) with strong contrasts in the coefficients on the domain [0, 1]². At first, we primarily focus on the problem illustrated in Figure 2.1. This problem has five layers, and the diffusion coefficient is either 1 or 10⁻³ in each layer. Such strong contrasts in the coefficients are typically encountered during oil reservoir simulations involving e.g. layers of sand and shale. In Section 2.5.5, we also study problems that mimic the occurrence of sand inclusions within a layer of shale, and ground water flow. Furthermore, we examine a bubbly flow problem, the influence of Neumann boundary conditions, and a full anisotropic problem.

Figure 2.1. Illustration of the problem with five layers (the diffusion coefficient K alternates between 1 and 10⁻³ from layer to layer).

The Dirichlet boundary conditions and the source term f are chosen such that the exact solution reads u(x, y) = cos(10πx) cos(10πy) (unless indicated otherwise). We stress that this choice does not impact the matrix or the performance of the linear solver, as we use random start vectors (see below). Furthermore, subdividing the domain into 10 × 10 equally sized squares, the diffusion coefficient is constant within each square.
All model problems are discretized by means of the SIPG method as discussed in Section 2.2, although the SWIP variant with weighted averages is also discussed. We use a uniform Cartesian mesh with N = n × n elements, where n = 20, 40, 80, 160, 320. Furthermore, we use monomial basis functions with polynomial degree p = 2, 3 (results for p = 1 are similar though). For p = 3 and n = 320, this means that the number of degrees of freedom is a little over 10⁶. In most cases, the penalty parameter is chosen diffusion-dependent, σ = 20nᵀKn, using the largest limit value at discontinuities (cf. Section 2.2.3). However, we also study a constant penalty parameter, and a parameter based on harmonic means.

The linear systems resulting from the SIPG discretizations are solved by means of the CG method, combined with either the two-level preconditioner (Definition 2.3.1) or the corresponding deflation variant (Definition 2.4.1). Unless specified otherwise, damping is not used. For the smoother M⁻¹, we use block Jacobi with small blocks of size m × m (recall that m = (p+1)(p+2)/2). For the preconditioning variant, we also consider block Gauss-Seidel with the same block size (deflation requires a symmetric smoother). Diagonal scaling is applied as a pre-processing step in all cases, and the same random start vector x0 is used for all problems of the same size. Preprocessing of the start vector according to (2.8) is applied for deflation only, as it makes no difference for the preconditioning variant. For the stopping criterion we use:

    ‖rk‖₂ / ‖b‖₂ ≤ TOL,    (2.9)

where TOL = 10⁻⁶, and rk is the residual after the k-th iteration. Coarse systems, involving the SIPG matrix A0 with polynomial degree p = 0, are solved directly in most cases. However, a more efficient alternative is provided in Section 2.5.3. In any case, the coarse matrix A0 is quite similar to a central difference matrix, for which very efficient solvers are readily available.
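The degrees-of-freedom count quoted in the setup above can be checked directly (a small sketch; the function name is ours):

```python
# m = (p+1)(p+2)/2 local unknowns per element, N = n*n mesh elements
def dofs(p, n):
    m = (p + 1) * (p + 2) // 2
    return m * n * n

print(dofs(3, 320))   # 1024000: a little over 10^6, as stated
print(dofs(2, 320))
```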
Finally, we remark that all computations are carried out using an Intel Core 2 Duo CPU (E8400) at 3 GHz.

2.5.2 The influence of the penalty parameter

This section studies the influence of the SIPG penalty parameter on the convergence of the CG method and the accuracy of the SIPG method. We compare the differences between using a constant penalty parameter and a diffusion-dependent value. Similar experiments have been considered in [25] for the two-level preconditioner for p = 1, a single mesh, and symmetric Gauss-Seidel smoothing (solving the coarse systems using geometric multigrid). They found that “proper weighting”, i.e. a diffusion-dependent penalty parameter, “is essential for the performance”. In this section, we consider p = 2, 3, and both preconditioning and deflation (using block Jacobi smoothing). Furthermore, we analyze the scalability of the methods by considering multiple meshes. Our results are consistent with those in [25].

Table 2.2 displays the number of CG iterations required for convergence for a Poisson problem (i.e. K = 1 everywhere) with σ = 20. Because the diffusion coefficient is constant, a diffusion-dependent value (σ = 20K) would yield the same results. We observe that both the two-level preconditioner (TL Prec.) and the deflation variant (TL Defl.) yield fast and scalable convergence (independent of the mesh element diameter). For comparison, the results for standard Jacobi and block Jacobi preconditioning are also displayed (not scalable). Interestingly, the two-level deflation method requires fewer iterations than the preconditioning variant, even though its costs per iteration are about 30% lower (cf. Section 2.4.2).

Table 2.3 considers the same test (using a constant σ = 20), but now for the problem with five layers (cf. Figure 2.1). It can be seen that the convergence is no longer fast and scalable for this problem with large jumps in the coefficients.
The deflation method is significantly faster than the preconditioning variant, but neither produces satisfactory results.

                               p=2                                p=3
    mesh              N=20²  N=40²  N=80²  N=160²    N=20²  N=40²  N=80²  N=160²
    Jacobi            301    581    1049   1644      325    576    1114   1903
    Block Jacobi (BJ) 205    356    676    1190      206    357    696    1183
    TL Prec., 2x BJ   36     38     39     40        49     52     53     54
    TL Defl., 1x BJ   32     33     33     34        36     37     37     38

Table 2.2. Both two-level methods yield fast scalable convergence for a problem with constant coefficients (Poisson, # CG iterations, σ = 20).

                               p=2                                p=3
    mesh              N=20²  N=40²  N=80²  N=160²    N=20²  N=40²  N=80²  N=160²
    Jacobi            1671   4311   9069   15923     2675   5064   9104   15655
    Block Jacobi (BJ) 933    2253   4996   9651      1357   2960   5660   9783
    TL Prec., 2x BJ   415    1215   2534   3571      1089   2352   4709   8781
    TL Defl., 1x BJ   200    414    531    599       453    591    667    698

Table 2.3. For a problem with extreme contrasts in the coefficients, a constant penalty yields poor convergence (five layers, # CG iterations, σ = 20).

Switching to a diffusion-dependent penalty parameter

In Table 2.4, we revisit the experiment in Table 2.3, but this time for a diffusion-dependent penalty parameter (σ = 20K, using the largest limit value of K at discontinuities). Due to this alternative discretization, the results are now similar to those for the Poisson problem (cf. Table 2.2): both two-level methods yield fast and scalable convergence.

These results motivate the use of a diffusion-dependent penalty parameter, provided that this strategy does not worsen the accuracy of the SIPG discretization compared to a constant penalty parameter. In Figure 2.2, it is verified that a diffusion-dependent penalty parameter actually improves the accuracy of the SIPG approximation (for p = 3). Similar results have been observed for p = 1, 2 (not displayed). The higher accuracy can be explained by the fact that the discretization contains more information on the underlying physics for a diffusion-dependent penalty parameter.
Altogether, the penalty parameter can best be chosen diffusion-dependent, and we will do so in the remainder of this chapter.

Figure 2.2. A diffusion-dependent penalty parameter (σ = 20K) yields better SIPG accuracy than a constant one (σ = 20): the SIPG L²-error is plotted against the number of mesh elements (20², 40², 80², 160²) for the problem with five layers.

                               p=2                                p=3
    mesh              N=20²  N=40²  N=80²  N=160²    N=20²  N=40²  N=80²  N=160²
    Jacobi            976    1264   1570   2315      1303   1490   1919   3109
    Block Jacobi (BJ) 243    424    788    1285      244    425    697    1485
    TL Prec., 2x BJ   46     43     43     44        55     56     56     57
    TL Defl., 1x BJ   43     45     45     46        47     48     48     48

Table 2.4. For a diffusion-dependent penalty parameter, both two-level methods yield fast scalable convergence for a problem with large permeability contrasts (five layers, # CG iterations, σ = 20K).

Weighted averages

The results for the diffusion-dependent penalty parameter in Figure 2.2 and Table 2.4 were established using the largest limit value of K in the definition of σ at the discontinuities. In this section, we consider the influence of using weighted averages, resulting in the SWIP method and a diffusion-dependent penalty parameter based on harmonic means (cf. Section 2.2.3). For this purpose, we study the problem with five layers again.

We have found that using σ = 20K with this approach results in negative eigenvalues, implying that the scheme is not coercive, and resulting in poor CG convergence. The same is true for σ = αK with α = 100, 200, 500, 1000. For α = 20 000, the matrix is positive-definite (tested for N = 10, 20 and p = 1, 2, 3). Similar outcomes were found using the SIPG scheme rather than the SWIP scheme (for the same ‘harmonic’ penalty parameter).

At the mesh element edges where the diffusion coefficient K is discontinuous, using α = 20 000 and a harmonic penalty yields σ = 20/1.001. At the same time, using α = 20 and a ‘maximum’ penalty (i.e. using the largest limit value at discontinuities) yields σ = 20.
These values are nearly the same. However, at all other edges, where K is continuous, σ is 1000 times larger for the harmonic penalty (with α = 20 000) than for the maximum penalty (with α = 20). Because the penalty parameter should be chosen as small as possible (cf. Section 2.2.3), we conclude that it can best be based on the largest limit value at discontinuities. This is in line with our earlier speculation that using the maximum is a ‘safe’ choice.

We have also combined the ‘maximum’ penalty with the SWIP method and compared the outcomes to the earlier results for the SIPG method (both for σ = 20K). We have found that the discretization accuracy and the CG convergence are practically the same: the relative absolute difference in the discretization error is less than 2% (for p = 1, 2, 3 and N = 20², 40², 80², 160²). Comparing Table 2.5 (SWIP) to Table 2.4 (SIPG), it can be seen that the number of CG iterations required for convergence is nearly identical.

Altogether, we conclude that both the SIPG and the SWIP method are suitable for our application, as long as the penalty parameter is chosen diffusion-dependent, using the largest limit value at discontinuities. We will apply this strategy using the standard SIPG method in the remainder of this chapter.

                               p=2                                p=3
    mesh              N=20²  N=40²  N=80²  N=160²    N=20²  N=40²  N=80²  N=160²
    Jacobi            980    1270   1575   2325      1309   1500   1935   3129
    Block Jacobi (BJ) 244    424    790    1287      244    424    697    1485
    TL Prec., 2x BJ   46     43     43     44        55     56     56     57
    TL Defl., 1x BJ   43     45     45     46        47     48     49     49

Table 2.5. The difference between the SWIP method (this table) and the SIPG method (cf. Table 2.4) is small (five layers, # CG iterations).

2.5.3 Coarse systems

To solve the coarse systems with coefficient matrix A0, a direct solver is usually not feasible in practice. This is because A0 has the same structure and size (N × N) as a central difference matrix.
To improve the efficiency of the coarse solver, we have investigated the cheaper alternative of applying the CG method again in an inner loop (cf. Section 2.4.3). This section discusses the results using the algebraic multigrid preconditioner MI20 from the HSL software package², which is based on a classical scheme described in [74]. The inner loop uses a stopping criterion of the form (2.9).

Table 2.6 displays the number of outer CG iterations required for convergence using the two-level preconditioner and deflation variant respectively (for the problem with five layers). Different values of the inner tolerance TOL are considered in these tables. For comparison, the results for the direct solver are also displayed. We observe that low accuracy in the inner loop is sufficient to reproduce the latter. For both two-level methods, the inner tolerance can be 10⁴ times as large as the outer tolerance. For the highest acceptable inner tolerance TOL = 10⁻², the number of iterations in the inner loop is between 2 and 5 in all cases (not displayed in the tables).

In terms of computational time, the difference between the direct solver and the inexact AMG-CG solver is negligible for the problems under consideration. However, for large three-dimensional problems, it can be expected that the inexact coarse solver is much faster, and thus crucial for the overall efficiency and scalability of the linear solver.

Preconditioning:
                              p=2                                  p=3
    mesh          N=40²  N=80²  N=160²  N=320²     N=40²  N=80²  N=160²  N=320²
    direct        43     43     44      44         56     56     57      58
    TOL = 10⁻⁴    43     43     44      44         56     56     57      58
    TOL = 10⁻³    43     43     44      44         56     56     57      58
    TOL = 10⁻²    43     43     44      44         56     57     58      58
    TOL = 10⁻¹    48     62     55      57         59     65     70      78

Deflation:
                              p=2                                  p=3
    mesh          N=40²  N=80²  N=160²  N=320²     N=40²  N=80²  N=160²  N=320²
    direct        45     45     46      46         48     48     48      49
    TOL = 10⁻⁴    45     45     46      46         48     48     48      49
    TOL = 10⁻³    45     45     46      46         48     48     48      49
    TOL = 10⁻²    45     45     46      46         48     48     48      49
    TOL = 10⁻¹    60     81     51      300        48     54     79      67

Table 2.6.
Coarse systems can be solved efficiently by using an inexact solver with a relatively low accuracy (TOL) in the inner loop (five layers, # outer CG iterations).

²HSL, a collection of Fortran codes for large-scale scientific computation. See http://www.hsl.rl.ac.uk/

2.5.4 Smoothers and damping

This section discusses the influence of the smoother and damping on both two-level methods. In particular, we consider multiple damping values ω ∈ [0.5, 1], and both Jacobi and (block) Gauss-Seidel smoothing. The latter is applied for the preconditioning variant only, as deflation requires a symmetric smoother.

Table 2.7 displays the number of CG iterations required for convergence for the problem with five layers. For the deflation variant (Defl.), we have found that damping makes no difference for the CG convergence, so the outcomes for ω < 1 are not displayed. Such a result has also been observed theoretically in [75] for an alternative deflation variant (known as DEF(1)). For the preconditioning variant (Prec.), damping can both improve and worsen the efficiency: for block Jacobi (BJ) smoothing, choosing e.g. ω = 0.7 can reduce the number of iterations by 37%; for block Gauss-Seidel (BGS) smoothing, choosing ω < 1 has either no influence or a small negative impact in most cases.

We have also performed the experiment for standard point Jacobi and point Gauss-Seidel (not displayed in the table). However, this did not lead to satisfactory results: over 250 iterations in all cases, even when the relaxation parameter was sufficiently low to ensure that (2.6) is satisfied (only restricting for point Jacobi). Altogether, the block (Jacobi/Gauss-Seidel) smoothers yield significantly better results than the point (Jacobi/Gauss-Seidel) smoothers.
We speculate that this is due to the following: the coarse correction operator Q simplifies the matrix A to A₀, which eliminates the 'higher-order' information in each element (regarding the higher-order basis functions), but preserves the 'mesh' information (i.e. which elements are neighbors and which are not). Intuitively, a suitable smoother should reintroduce this higher-order information, which is originally contained in dense blocks of size m × m. The block (Jacobi/Gauss-Seidel) smoothers are better suited for this task, which could explain why they are more effective.

# CG iterations:

                                   p = 2                            p = 3
                   N=40²   N=80²   N=160²  N=320²    N=40²   N=80²   N=160²  N=320²
Prec., 2x BJ (ω=1)   43      43      44      44        56      56      57      58
        (ω = 0.9)    34      34      37      37        40      40      42      43
        (ω = 0.8)    33      34      34      35        36      37      39      39
        (ω = 0.7)    33      33      33      34        35      36      36      37
        (ω = 0.6)    32      33      34      35        35      36      36      37
        (ω = 0.5)    34      34      35      35        36      37      38      39
Prec., 2x BGS (ω=1)  33      33      34      35        34      35      35      37
        (ω = 0.9)    33      33      34      34        34      34      35      36
        (ω = 0.8)    33      33      35      35        35      34      36      37
        (ω = 0.7)    32      34      36      36        35      36      37      38
        (ω = 0.6)    33      35      36      37        36      37      38      39
        (ω = 0.5)    34      35      37      38        37      39      39      40
Defl., 1x BJ (ω=1)   45      45      46      46        48      48      48      49

Table 2.7. Damping can improve the convergence for the preconditioner, but has no influence for deflation (five layers, # CG iterations).

Section 2.5  Numerical experiments

Taking the computational time into account

Based on the results in Table 2.7, it appears that the preconditioning variant with either block Jacobi (with optimal damping) or block Gauss-Seidel is the most efficient choice. However, the costs per iteration also need to be taken into account. Table 2.8 reconsiders the results in Table 2.7, but now in terms of the computational time in seconds (using a direct coarse solver). It can be seen that block Gauss-Seidel smoothing is relatively expensive. The deflation variant (with block Jacobi smoothing) is the fastest in nearly all cases. This is due to the fact that it requires only one smoothing step per iteration, instead of two.
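The cost difference per iteration can be made concrete with a small sketch of both operator applications, following the smoothing and coarse-correction steps of the two variants described earlier in this chapter. It assumes a symmetric smoother (so that the transposed post-smoothing step coincides with the pre-smoothing step) and uses a direct coarse solve for brevity; an inner CG loop with a loose tolerance would serve as well. The names are illustrative.

```python
import numpy as np

def make_two_level_operators(A, R, smooth):
    """Sketch of one application y = P^{-1} r for both two-level variants.

    R restricts to the coarse (p = 0) space, so A0 = R A R^T is the coarse
    matrix; smooth(x, b) performs one symmetric smoothing step for A x = b.
    """
    A0_inv = np.linalg.inv(R @ A @ R.T)   # direct coarse solver (sketch only)
    Q = R.T @ A0_inv @ R                  # coarse correction operator

    def apply_prec(r):                    # two smoothing steps per call
        y = smooth(np.zeros_like(r), r)   # pre-smoothing
        y = y + Q @ (r - A @ y)           # coarse correction
        return smooth(y, r)               # post-smoothing

    def apply_defl(r):                    # one smoothing step per call (BNN)
        y = Q @ r                         # pre-coarse correction
        y = smooth(y, r)                  # smoothing
        return y + Q @ (r - A @ y)        # post-coarse correction

    return apply_prec, apply_defl
```

Both resulting operators are symmetric when the smoother is symmetric, and thus admissible CG preconditioners. Counting the calls shows the difference discussed above: two smoothing steps per iteration for the preconditioning variant against one for deflation, at the price of a second coarse solve.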
When an optimal damping parameter is known, the preconditioning variant reaches a comparable efficiency. This is also illustrated in Figure 2.3. However, it is an open question how the damping parameter can best be selected in practice.

[Figure 2.3: # CG iterations (left) and CPU time in seconds (right) versus the damping parameter ω, for the preconditioning and deflation variants.]

Figure 2.3. Unless an optimal damping parameter is known, deflation is faster due to lower costs per iteration (five layers, N = 160², block Jacobi).

CPU time in seconds:

                                   p = 2                              p = 3
                   N=40²   N=80²   N=160²  N=320²    N=40²   N=80²   N=160²  N=320²
Prec., 2x BJ (ω=1)  0.09    0.61    3.23   15.43      0.38    1.73    7.85   37.41
        (ω = 0.9)   0.07    0.48    2.76   13.12      0.27    1.25    5.83   27.91
        (ω = 0.8)   0.07    0.48    2.53   12.43      0.24    1.16    5.42   25.12
        (ω = 0.7)   0.07    0.47    2.46   12.11      0.24    1.12    5.05   24.03
        (ω = 0.6)   0.07    0.47    2.55   12.41      0.24    1.13    5.04   24.11
        (ω = 0.5)   0.07    0.48    2.61   12.47      0.24    1.16    5.33   25.48
Prec., 2x BGS (ω=1) 0.15    0.80    3.84   17.50      0.41    1.84    7.75   36.32
        (ω = 0.9)   0.15    0.80    3.85   17.03      0.41    1.79    7.71   35.36
        (ω = 0.8)   0.15    0.80    3.96   17.64      0.42    1.79    7.97   36.36
        (ω = 0.7)   0.14    0.82    4.07   18.10      0.42    1.89    8.22   37.28
        (ω = 0.6)   0.15    0.84    4.08   18.46      0.43    1.94    8.44   38.23
        (ω = 0.5)   0.15    0.84    4.19   19.10      0.45    2.06    8.66   39.21
Defl., 1x BJ (ω=1)  0.07    0.50    2.77   13.61      0.22    1.05    4.92   23.31

Table 2.8. The deflation method and the block Jacobi smoother tend to be faster due to lower costs per iteration (five layers, CPU time in seconds).

2.5.5  Other test cases

In this section, we repeat the experiments in Table 2.8 for more challenging test cases. For the preconditioning variant, we only display the results for block Jacobi smoothing without damping (ω = 1) and with optimal damping (ω = 0.7). Table 2.9 and Table 2.10 consider problems that mimic the occurrence of sand inclusions within a layer of shale and groundwater flow, respectively. Similar problems have been studied in [81].
Table 2.11 displays the results for a bubbly flow problem, inspired by [75]. Table 2.12 studies a bowl-shaped problem: at the black lines in the illustration, homogeneous Neumann boundary conditions are applied. These have a negative impact on the matrix properties, resulting in a challenging problem. Table 2.13 considers an anisotropic problem with two layers (with exact solution u(x, y) = cos(2πy)). Because the diffusion is a full tensor, this test case mimics the effect of using a non-Cartesian mesh. It can be seen from these tables that, as before, both two-level methods yield fast and scalable convergence. Without damping, deflation is the most efficient. When an optimal damping value is known, the preconditioning variant performs comparably to deflation.

2.6  Conclusion

This chapter compares the two-level preconditioner proposed in [24] to an alternative (ADEF2) deflation variant for linear systems resulting from SIPG discretizations for diffusion problems. We have found that both two-level methods yield fast and scalable convergence for diffusion problems with large jumps in the coefficients. This result is obtained provided that the SIPG penalty parameter is chosen dependent on local values of the permeability (using the largest limit value at discontinuities). The latter also benefits the accuracy of the SIPG discretization. Furthermore, the impact of using weighted averages (SWIP) is then small. Coarse systems can be solved efficiently by applying the CG method again in an inner loop with low accuracy. The main difference between both methods is that the deflation method can be implemented by skipping one of the two smoothing steps in the algorithm for the preconditioning variant. This may be particularly advantageous for expensive smoothers, although the basic block Jacobi smoother was found to be highly effective for the problems under consideration. Without damping, deflation can be up to 35% faster than the original preconditioner.
If an optimal damping parameter is used, both two-level strategies yield similar efficiency (deflation appears unaffected by damping). However, it remains an open question how the damping parameter can best be selected in practice. Altogether, both two-level strategies can contribute to faster linear solvers for SIPG systems with strong contrasts in the coefficients, such as those encountered in oil reservoir simulations.

[Figure: CPU time in seconds versus the number of mesh elements for Prec., 2x BJ; Prec., 2x BJ (ω = 0.7); and Defl., 1x BJ, for the sand inclusions problem (regions with K = 1 and K = 10⁻⁵).]

# CG iterations:

                            p = 2                               p = 3
                 N=40²   N=80²   N=160²  N=320²    N=40²   N=80²   N=160²  N=320²
DOF               9600   38400   153600  614400    16000   64000   256000  1024000
Prec., 2x BJ        44      48      46      46       53      56      58       59
  (ω = 0.7)         30      30      30      31       32      34      34       34
Defl., 1x BJ        42      42      42      43       46      47      48       48

CPU time in seconds:

Prec., 2x BJ      0.09    0.68    3.28   16.77     0.36    1.75    8.11    37.42
  (ω = 0.7)       0.06    0.43    2.18   11.38     0.22    1.09    4.86    22.01
Defl., 1x BJ      0.06    0.47    2.44   13.14     0.21    1.05    4.91    23.62

Table 2.9. Sand inclusions
[Figure: CPU time in seconds versus the number of mesh elements for Prec., 2x BJ; Prec., 2x BJ (ω = 0.7); and Defl., 1x BJ, for the groundwater problem (regions with K = 10², K = 10⁴ and K = 10⁻³).]

# CG iterations:

                            p = 2                               p = 3
                 N=40²   N=80²   N=160²  N=320²    N=40²   N=80²   N=160²  N=320²
DOF               9600   38400   153600  614400    16000   64000   256000  1024000
Prec., 2x BJ        54      52      52      52       67      68      68       69
  (ω = 0.7)         38      38      38      40       41      42      42       42
Defl., 1x BJ        54      54      54      55       59      59      60       60

CPU time in seconds:

Prec., 2x BJ      0.10    0.74    3.70   19.04     0.45    2.15    9.56    43.27
  (ω = 0.7)       0.07    0.54    2.69   14.74     0.28    1.34    5.93    26.88
Defl., 1x BJ      0.07    0.59    3.12   16.77     0.27    1.30    6.12    29.25

Table 2.10. Ground water

[Figure: CPU time in seconds versus the number of mesh elements for Prec., 2x BJ; Prec., 2x BJ (ω = 0.7); and Defl., 1x BJ, for the bubbly flow problem (regions with K = 1 and K = 10⁻⁵).]

# CG iterations:

                            p = 2                               p = 3
                 N=40²   N=80²   N=160²  N=320²    N=40²   N=80²   N=160²  N=320²
DOF               9600   38400   153600  614400    16000   64000   256000  1024000
Prec., 2x BJ        41      42      43      44       55      56      57       58
  (ω = 0.7)         31      31      32      32       33      34      35       35
Defl., 1x BJ        41      39      40      41       45      45      45       46

CPU time in seconds:

Prec., 2x BJ      0.08    0.59    3.07   16.02     0.37    1.74    7.95    37.00
  (ω = 0.7)       0.06    0.45    2.32   11.77     0.22    1.09    4.89    22.66
Defl., 1x BJ      0.06    0.44    2.35   12.63     0.21    1.00    4.60    22.79

Table 2.11. Bubbly flow
[Figure: CPU time in seconds versus the number of mesh elements for Prec., 2x BJ; Prec., 2x BJ (ω = 0.7); and Defl., 1x BJ, for the bowl-shaped problem with Neumann boundary conditions (regions with K = 1 and K = 10⁻¹).]

# CG iterations:

                            p = 2                               p = 3
                 N=40²   N=80²   N=160²  N=320²    N=40²   N=80²   N=160²  N=320²
DOF               9600   38400   153600  614400    16000   64000   256000  1024000
Prec., 2x BJ        45      45      45      45       59      59      60       60
  (ω = 0.7)         34      35      36      36       36      37      37       38
Defl., 1x BJ        47      47      47      47       49      49      50       50

CPU time in seconds:

Prec., 2x BJ      0.09    0.63    3.26   16.53     0.40    1.83    8.35    38.42
  (ω = 0.7)       0.07    0.49    2.62   13.31     0.24    1.16    5.20    24.65
Defl., 1x BJ      0.07    0.52    2.75   14.41     0.23    1.09    5.09    24.69

Table 2.12. Neumann BCs

[Figure: CPU time in seconds versus the number of mesh elements for Prec., 2x BJ; Prec., 2x BJ (ω = 0.7); and Defl., 1x BJ, for the anisotropic two-layer problem; the illustration shows the two full diffusion tensors (with entries 1, 1/4 and 1/2; one layer scaled by 10⁻³).]

# CG iterations:

                            p = 2                               p = 3
                 N=40²   N=80²   N=160²  N=320²    N=40²   N=80²   N=160²  N=320²
DOF               9600   38400   153600  614400    16000   64000   256000  1024000
Prec., 2x BJ        47      48      49      50       61      67      68       71
  (ω = 0.7)         36      37      38      38       39      39      41       42
Defl., 1x BJ        48      49      51      52       54      55      57       57

CPU time in seconds:

Prec., 2x BJ      0.09    0.68    3.57   17.65     0.41    2.08    9.63    44.32
  (ω = 0.7)       0.07    0.52    2.79   13.62     0.27    1.23    5.83    26.68
Defl., 1x BJ      0.07    0.54    2.99   15.47     0.25    1.21    5.85    27.04

Table 2.13. Anisotropy

3  Theoretical scalability

This chapter is based on: P. van Slingerland, C. Vuik, Scalable two-level preconditioning and deflation based on a piecewise constant subspace for (SIP)DG systems. Submitted to JCAM.
3.1  Introduction

Chapter 2 reformulated the two-level preconditioner proposed in [24] as a deflation method, and demonstrated numerically that both two-level variants can yield fast and scalable CG convergence using (damped) block Jacobi smoothing. In this chapter, we focus on theoretical support for these findings.

For symmetric problems, convergence theory for a large class of two-level methods has been established by Falgout et al. [31]. They have derived spectral bounds for the error iteration matrix corresponding to such two-level methods. These results are abstract in the sense that they apply for a large family of coarse spaces and smoothers, and for any symmetric positive-definite coefficient matrix A. The extension to the nonsymmetric case has been presented by Notay [57]. By applying the work in [31] to SIPG matrices, Dobrev et al. [24] have shown theoretically that their two-level preconditioner (with coarse corrections based on the DG discretization with polynomial degree p = 0) yields scalable convergence of the CG method (independent of the mesh element diameter). This result was established for SIPG schemes with polynomial degree p = 1.

To extend the work in [24], in this chapter, we derive bounds for the condition number of the preconditioned system for both two-level methods studied in Chapter 2. In particular, we show that these bounds are independent of the mesh element diameter. Unlike before, our results also apply for p > 1 (besides p = 1). Another difference is that we include BNN/ADEF2 deflation in the analysis, and compare it to the original preconditioning variant. Additionally, we demonstrate that the required restrictions on the smoother are satisfied for (damped) block Jacobi smoothing.

The outline of this chapter is as follows. Section 3.2 specifies both two-level methods for the linear SIPG systems under consideration.
Section 3.3 discusses abstract relations for the condition number of the preconditioned/deflated system, valid for any SPD matrix A. Section 3.4 derives an auxiliary property of SIPG matrices that is a consequence of the regularity of the mesh. Section 3.5 uses this property to obtain the main scalability result. Finally, Section 3.6 summarizes the main conclusions.

3.2  Methods and assumptions

Basically, this chapter considers the same linear SIPG systems and two-level preconditioning strategies as in Chapter 2, but now for a larger class of meshes and basis functions. This section specifies the slightly alternative formulations and additional assumptions used in the theoretical analysis to come. Section 3.2.1 specifies the diffusion model and discretizes it by means of the SIPG method. Section 3.2.2 discusses the resulting linear systems. Section 3.2.3 considers two two-level preconditioning strategies for solving the resulting linear system by means of the preconditioned CG method.

3.2.1  SIPG discretization for diffusion problems

We study the following diffusion problem on the d-dimensional domain Ω with boundary ∂Ω = ∂Ω_D ∪ ∂Ω_N and outward normal n:

    −∇·(K∇u) = f       in Ω,
           u = g_D     on ∂Ω_D,
    K∇u · n = g_N      on ∂Ω_N.                                      (3.1)

We assume that the diffusion K is a scalar that is bounded below and above by positive constants, and that the other model parameters are chosen such that a weak solution of (3.1) exists¹. Additionally, we assume that Ω is either an interval (d = 1), a polygon (d = 2), or a polyhedron (d = 3).

To discretize the model problem (3.1), we subdivide Ω into mesh elements E₁, ..., E_N with maximum element diameter h. We assume that

    each mesh element E_i is affine-equivalent with a certain reference
    element E₀ that is an interval/polygon/polyhedron (independent        (3.2)
    of h) with mutually affine-equivalent edges.
Note that all meshes consisting entirely of either intervals, triangles, tetrahedra, parallelograms, or parallelepipeds satisfy this property. Furthermore, we assume that the mesh is regular in the sense of [18, p. 124]. To specify this property, for all i = 0, ..., N, let h_i and ρ_i denote the diameter of E_i and the diameter of the largest ball contained in E_i respectively. We can now define regularity as²:

    h / ρ_i ≲ 1,    ∀i = 1, ..., N.                                  (3.3)

Now that we have specified the mesh, we can construct an SIPG approximation for our model problem (3.1). To this end, define the test space V that contains each function that is a polynomial of degree p or lower within each mesh element, and that may be discontinuous at the element boundaries. The SIPG approximation u_h is now defined as the unique element in this test space that satisfies the relation

    B(u_h, v) = L(v),    for all test functions v ∈ V,

where B and L are (bi)linear forms that are similar to those defined earlier in Section 2.2.1, but now for a larger class of meshes. The only difference is that the quantity h is replaced by an alternative value that depends on the edge under consideration (and that K is scalar).

¹ That is, f, g_N ∈ L²(Ω) and g_D ∈ H^{1/2}(Ω) [63, p. 25, 26].
² Throughout this paper, we use the symbol ≲ in expressions of the form "F(x) ≲ G(x) for all x ∈ X" to indicate that there exists a constant C > 0, independent of the variable x and the maximum mesh element diameter h (or the number of mesh elements), such that F(x) ≤ C G(x) for all x ∈ X. The symbol ≳ is defined similarly.

To specify this, we use the same notation as before: the vector n_i denotes the outward normal of mesh element E_i; the set Γ_h is the collection of all interior edges e = ∂E_i ∩ ∂E_j; the set Γ_D is the collection of all Dirichlet boundary edges e = ∂E_i ∩ ∂Ω_D; the set Γ_N is the collection of all Neumann boundary edges e = ∂E_i ∩ ∂Ω_N; and [·]
and {·} denote the usual trace operators for jumps and averages at the mesh element boundaries, as defined in Section 2.2.1. Additionally, for all edges e, we write h_e to denote the length of the largest mesh element adjacent to e for one-dimensional problems, the length of e for two-dimensional problems, and the square root of the surface area of e for three-dimensional problems. Using this notation, the forms B and L are defined as follows (for one-dimensional problems, the boundary integrals below should be interpreted as function evaluations of the integrand):

    B_Ω(u_h, v) = Σ_{i=1}^{N} ∫_{E_i} K ∇u_h · ∇v,

    B_σ(u_h, v) = Σ_{e ∈ Γ_h ∪ Γ_D} ∫_e (σ / h_e) [u_h] · [v],

    B_r(u_h, v) = − Σ_{e ∈ Γ_h ∪ Γ_D} ∫_e ( {K∇u_h} · [v] + [u_h] · {K∇v} ),

    B(u_h, v) = B_Ω(u_h, v) + B_σ(u_h, v) + B_r(u_h, v),

    L(v) = ∫_Ω f v + Σ_{e ∈ Γ_D} ∫_e g_D ( (σ / h_e) v − K∇v · n ) + Σ_{e ∈ Γ_N} ∫_e g_N v,        (3.4)

where σ ≥ 0 is the penalty parameter (cf. Section 2.2.3). Its value may vary throughout the domain, and we assume that it is bounded below and above by positive constants (independent of the maximum element diameter h). Furthermore, we assume that it is sufficiently large so that the scheme is coercive³ [63, p. 38–40], i.e.:

    B_Ω(v, v) + B_σ(v, v) ≲ B(v, v),    ∀v ∈ V.                      (3.5)

3.2.2  Linear system

In order to compute the SIPG approximation, we need to choose a basis for the test space V. In Section 2.2.2, we have discussed the monomial basis functions for uniform Cartesian meshes. We now consider a more general class: for all i = 1, ..., N, let F_i : E_i → E₀ denote an invertible affine mapping (which exists by assumption (3.2)). Furthermore, let the functions φ_k^{(0)} : E₀ → ℝ (with

³ For coercivity, it suffices that σ ≥ 2C n₀ k₁²/k₀, where k₀ and k₁ are the global lower and upper bounds for the diffusion coefficient K respectively, n₀ is the maximum number of neighbors an element can have (e.g. n₀ = 4 for a two-dimensional quadrilateral mesh), and C is a constant occurring in a trace inequality that does not depend on h (but may depend on p).
See [63, p. 23, 38–39] for more details.

k = 1, ..., m) form a basis for the space of all polynomials of degree p and lower on the reference element (setting φ₁^{(0)} = 1). Using this basis on the reference element, the basis function φ_k^{(i)} (with k = 1, ..., m and i = 1, ..., N) is zero in the entire domain, except in the mesh element E_i, where it reads φ_k^{(i)} = φ_k^{(0)} ∘ F_i.

Now that we have defined the basis functions, we can express u_h as a linear combination of these functions and construct a linear system. Although we are now considering a larger class of meshes and basis functions, we arrive at the same forms we have seen earlier in Section 2.2.2:

    u_h = Σ_{i=1}^{N} Σ_{k=1}^{m} u_k^{(i)} φ_k^{(i)},

where the unknowns u_k^{(i)} in this expression can be determined by solving a linear system A u = b of the following form:

    [ A₁₁   A₁₂   ...   A₁ɴ ] [ u₁ ]   [ b₁ ]
    [ A₂₁   A₂₂         ... ] [ u₂ ] = [ b₂ ]                        (3.6)
    [ ...         ...   ... ] [ ...]   [ ...]
    [ Aɴ₁   ...   ...   Aɴɴ ] [ uɴ ]   [ bɴ ]

where the blocks all have dimension m, and where, for all i, j = 1, ..., N:

    (A_ji)_{lk} = B(φ_k^{(i)}, φ_l^{(j)}),    u_i = [u₁^{(i)}, ..., u_m^{(i)}]ᵀ,    b_j = [L(φ₁^{(j)}), ..., L(φ_m^{(j)})]ᵀ.        (3.7)

Note that A is Symmetric and Positive-Definite (SPD), as the bilinear form B is SPD (cf. (3.4) and (3.5)).

3.2.3  Two-level preconditioning and deflation

To solve the linear SIPG system by means of the preconditioned CG method, we consider the two-level preconditioner and the corresponding deflation variant discussed in Chapter 2. For the deflation method, we consider the alternative (yet equivalent) BNN formulation. We specify both methods below, as well as some additional assumptions on the smoother.
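Before recalling the formal definitions, the coarse-space operators can be sketched concretely. With the basis ordering of Section 3.2.2 (where φ₁^{(0)} = 1, so the first coefficient on each element represents the piecewise-constant part), one natural form of the restriction is simply the selection of that coefficient per element. This selection form and the function name are illustrative assumptions for the sketch, not a quote from the analysis.

```python
import numpy as np

def coarse_operators(A, N, m):
    """Sketch of the piecewise-constant (p = 0) coarse operators.

    A has N diagonal blocks of size m (one per mesh element); assuming the
    first local basis function is the constant one, R selects its
    coefficient on each element, i.e. R = I_N (kron) e_1^T.
    """
    R = np.kron(np.eye(N), np.eye(1, m))   # N x (N*m) restriction
    A0 = R @ A @ R.T                       # coarse matrix A0 = R A R^T
    Q = R.T @ np.linalg.solve(A0, R)       # coarse correction Q = R^T A0^{-1} R
    return R, A0, Q
```

For SPD A, the resulting Q satisfies the identity Q A Q = Q, reflecting that QA projects onto the coarse space Range(Rᵀ).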
Recall the restriction operator R, which is defined such that A₀ := R A Rᵀ is the SIPG matrix for polynomial degree p = 0, and the coarse correction operator Q := Rᵀ A₀⁻¹ R. The two-level preconditioner (Definition 2.3.1) combines this operator with a nonsingular smoother M_prec⁻¹ ≈ A⁻¹. The result y = P_prec⁻¹ r of applying the two-level preconditioner to a vector r can be computed in three steps:

    y⁽¹⁾ = M_prec⁻¹ r                       (pre-smoothing),
    y⁽²⁾ = y⁽¹⁾ + Q (r − A y⁽¹⁾)            (coarse correction),
    y    = y⁽²⁾ + M_prec⁻ᵀ (r − A y⁽²⁾)     (post-smoothing).        (3.8)

Basically, the BNN deflation variant is obtained by turning (3.8) inside out, and using an SPD smoother M_defl⁻¹ ≈ A⁻¹. The result y = P_defl⁻¹ r of applying the BNN deflation technique to a vector r can now be computed as:

    y⁽¹⁾ := Q r                             (pre-coarse correction),
    y⁽²⁾ := y⁽¹⁾ + M_defl⁻¹ (r − A y⁽¹⁾)    (smoothing),
    y    := y⁽²⁾ + Q (r − A y⁽²⁾)           (post-coarse correction).    (3.9)

Both two-level strategies can be implemented in a standard preconditioned CG algorithm (cf. Section 2.3.3). We stress that the BNN deflation variant can be implemented more efficiently in a CG algorithm by using the so-called ADEF2 deflation variant (cf. Section 2.4). However, for the theoretical purposes in this paper, it is more convenient to study BNN rather than ADEF2.

Furthermore, we require some additional assumptions on the smoothers, indicated hereafter. To specify these, for any SPD matrix M, let

    π_M := Rᵀ (R M Rᵀ)⁻¹ R M                                          (3.10)

denote the projection onto the coarse space Range(Rᵀ) that yields the best approximation in the M-norm [31]. Additionally, for any nonsingular matrix M such that M + Mᵀ − A is SPD, define the symmetrization

    M̃ := Mᵀ (M + Mᵀ − A)⁻¹ M.

Using this notation, we can now specify the additional smoother requirements. For the preconditioner, we assume:

    M_prec + M_precᵀ − A is SPD,                                      (3.11)

    h^{2−d} vᵀ M̃_prec v ≲ vᵀ v,    ∀v ∈ Range(I − π_I).
(3.12)

For the deflation method, we assume:

    2 M_defl − A is SPD,                                              (3.13)

    h^{2−d} vᵀ M_defl v ≲ vᵀ v,    ∀v ∈ Range(I − π_I).               (3.14)

We have seen in Chapter 2 that the operator P_prec⁻¹ is SPD assuming that (3.11) is satisfied (for the deflation variant, it suffices that M_defl is SPD). The conditions (3.11) and (3.13) imply that "the smoother iteration is a contraction in the A-norm" [31, p. 473]. The main idea behind the conditions (3.12) and (3.14) is that the smoother should scale with h^{2−d} in the same way that A does, and that M̃ is an efficient preconditioner for A in the space orthogonal to the coarse space Range(Rᵀ) [79, p. 78]. A slightly stronger version of (3.12) is also used in [24] to establish scalable convergence. It will be shown in Section 3.5.2 that the requirements (3.11)–(3.14) are satisfied for (damped) block Jacobi smoothing (with m × m blocks). A similar strategy can be used to show (3.11) and (3.12) for (damped) block Gauss-Seidel smoothing (with m × m blocks)⁴.

3.3  Abstract relations for any SPD matrix A

This section discusses abstract relations for the condition number of the preconditioned system for both two-level methods. These results are abstract in the sense that they apply for any SPD matrix A, so they are not restricted to SIPG matrices. Section 3.3.1 discusses the condition number of a preconditioned system for a certain class of operators. Section 3.3.2 considers the specific implications for both two-level methods. Section 3.3.3 compares them under specific assumptions on the smoother.

3.3.1  Using the error iteration matrix

In this section, we consider the condition number of the preconditioned system for arbitrary SPD matrices A and a certain class of SPD preconditioners P⁻¹. Specifically, each preconditioner P⁻¹ in this class is such that, for some SPD matrix M, the so-called error iteration matrix I − P⁻¹A has the same eigenvalues as (recall the notation in Section 3.2.3):

    T_M := (I − π_A)(I − M⁻¹A)(I − π_A).

In Section 3.3.2 hereafter, we will see that both two-level methods are in this class for certain specific choices of M. Defining

    K_M := sup_{v ≠ 0} ‖(I − π_M) v‖²_M / ‖v‖²_A,                     (3.15)

we now have the following result:

⁴ To show that (3.11) and (3.12) are valid for block Gauss-Seidel smoothing, note that (3.11) is automatically satisfied as M_prec + M_precᵀ − A is the block diagonal of A, which is SPD. To show (3.12), the main idea is to follow a similar strategy as in [79, Proposition 6.12], and then use the result for block Jacobi obtained in Section 3.5.2.
Specifically, each preconditioner P −1 in this class is such that, for some SPD matrix M , the so-called error iteration matrix I − P −1 A has the same eigenvalues as (recall the notation in Section 3.2.3): TM := (I − πA )(I − M −1 A)(I − πA ). In Section 3.3.2 hereafter, we will see that both two-level methods are in this class for certain specific choices of M . Defining 2 KM := sup v6=0 k(I − πM )vkM 2 kvkA , (3.15) we now have the following result: 4 To show that (3.11) and (3.12) are valid for block Gauss-Seidel smoothing, note that T (3.11) is automatically satisfied as Mprec + Mprec − A is the block diagonal of A, which is SPD. To show (3.12), the main idea is to follow a similar strategy as in [79, Proposition 6.12], and then use the result for block Jacobi obtained in Section 3.5.2. 40 Chapter 3 Theoretical scalability Lemma 3.3.1 The condition number (in the 2-norm) of the preconditioned system P −1 A above can be bounded as follows5 : κ2 (P −1 A) ≤ λmax (M −1 A)KM . (3.16) Additionally, if6 M − A ≥ 0, then, κ2 (P −1 A) = KM . (3.17) 1 1 To show this, we use that TM has real eigenvalues (as A 2 TM A− 2 is symmetric), and that [57, Theorem 2.1 and Corollary 2.1]: TM has m times eigenvalue 0, −1 λmin (TM ) ≥ 1 − λmax (M A), 1 λmax (TM ) = 1 − . λmax (A−1 M (I − πM )) 1 (3.18) (3.19) (3.20) 1 Proof (of Lemma 3.3.1) First, note that P −1 A and P − 2 AP − 2 have the same positive eigenvalues and singular values. Hence, we may express the condition number as: κ2 (P −1 A) = (3.20) = 1 − λmin (TM ) λmax (P −1 A) = λmin (P −1 A) 1 − λmax (TM ) 1 − λmin (TM ) λmax A−1 M (I − πM ) . (3.21) Because I − πM = (I − πM )2 is a projection and M (I − πM ) is symmetric, it follows that: λmax A−1 M (I − πM ) = λmax A−1 M (I − πM )2 = λmax A−1 (I − πM )T M (I − πM ) 1 1 = λmax A− 2 (I − πM )T M (I − πM )A− 2 = sup v6=0 k(I − πM )vk2M kvk2A (3.15) = KM . Substitution into (3.21) yields: κ2 (P −1 A) = 1 − λmin (TM ) KM . 
(3.22)

Application of (3.19) now completes the proof of (3.16). To show (3.17), assume that M − A ≥ 0, which implies that I − A^{1/2} M⁻¹ A^{1/2} ≥ 0:

    M ≥ A  ⟹  M⁻¹ ≤ A⁻¹  [40, p. 398, 471]
           ⟹  A^{1/2} M⁻¹ A^{1/2} ≤ I
           ⟹  I − A^{1/2} M⁻¹ A^{1/2} ≥ 0.

Hence, defining the symmetric projection π̄_A := A^{1/2} π_A A^{−1/2}, it follows that the eigenvalues of T_M are non-negative:

    A^{1/2} T_M A^{−1/2} = (I − π̄_A)(I − A^{1/2} M⁻¹ A^{1/2})(I − π̄_A) ≥ 0.

As a result, (3.18) implies that λ_min(T_M) = 0. Substitution into (3.22) yields (3.17), which then completes the proof.

⁵ Throughout this chapter, λ_min and λ_max denote the smallest and largest eigenvalue of a matrix with real eigenvalues respectively.
⁶ Throughout this chapter, for symmetric matrices M₁, M₂ ∈ ℝ^{n×n}, we write M₁ ≤ M₂ to indicate that vᵀM₁v ≤ vᵀM₂v for all vectors v ∈ ℝⁿ; the notation ≥, <, and > is used similarly.

3.3.2  Implications for the two-level methods

Next, we apply the result in the previous section to analyze the condition number of the preconditioned system for both the two-level preconditioner P_prec⁻¹ and the corresponding BNN deflation variant P_defl⁻¹ (as specified in Section 3.2.3). For the two-level preconditioner, it is well-known that

    κ₂(P_prec⁻¹ A) ≤ K_{M̃_prec}.                                     (3.23)

This follows as a special case from [31] (also cf. [79, p. 70–73]), and relies on assumption (3.11). Below, we observe that the theory in [57] implies (via Lemma 3.3.1) that (3.23) remains true if we replace the inequality by equality. Furthermore, we obtain similar bounds for the deflation variant. Altogether, we have the following result, which applies for any SPD matrix A:

Lemma 3.3.2  Suppose that A is SPD and let P_prec⁻¹ and P_defl⁻¹ be the two-level operators specified in Section 3.2.3. Then, assuming (3.11), the condition number (in the 2-norm) of the preconditioned system P_prec⁻¹A can be expressed as follows:

    κ₂(P_prec⁻¹ A) = K_{M̃_prec}.
(3.24)

Additionally, assuming (3.13), we have for deflation:

    κ₂(P_defl⁻¹ A) ≤ λ_max(M_defl⁻¹ A) K_{M_defl} < 2 K_{M_defl},     (3.25)

and, under the stronger assumption M_defl − A ≥ 0:

    κ₂(P_defl⁻¹ A) = K_{M_defl}.                                      (3.26)

To show this result, we apply Lemma 3.3.1, using (σ denotes the spectrum):

    σ(I − P_prec⁻¹ A) = σ(T_{M̃_prec}),                               (3.27)
    σ(I − P_defl⁻¹ A) = σ(T_{M_defl}).                                (3.28)

These relations follow similarly to [75, p. 1730]. Finally, we use that, for any nonsingular M [79, Proposition 3.8]:

    M + Mᵀ − A > 0  ⟹  M̃ − A ≥ 0.                                    (3.29)

Proof (of Lemma 3.3.2)  Combining (3.11) and (3.29) gives M̃_prec − A ≥ 0. Using this with (3.27) in Lemma 3.3.1 yields (3.24). Similarly, (3.26) follows from Lemma 3.3.1 using (3.28) and the assumption M_defl − A ≥ 0. To show (3.25), note that the first inequality results from Lemma 3.3.1 and (3.28), while the second inequality follows from observing that (3.13) implies that λ_max(M_defl⁻¹ A) < 2. This completes the proof.

3.3.3  Comparing deflation and preconditioning

In this section, we compare both two-level methods in terms of the corresponding condition numbers. In [75, Theorem 6.1], it has been shown that the eigenvalues of P_prec⁻¹A and P_defl⁻¹A are equal if M_defl = M̃_prec. Below, we compare both two-level methods in case they both use the same smoother.

Theorem 3.3.3  Suppose that A is SPD and let P_prec⁻¹ and P_defl⁻¹ be the two-level operators specified in Section 3.2.3. Furthermore, choose M_prec = M_defl =: M SPD with M − A ≥ 0. Then, both methods are related in the following sense:

    (1/2) κ₂(P_defl⁻¹ A) ≤ κ₂(P_prec⁻¹ A) ≤ κ₂(P_defl⁻¹ A).           (3.30)

Before showing this result, we note that it implies that the CG convergence for the preconditioner is asymptotically not worse than for the deflation variant. That is, assuming M − A ≥ 0. In general, we may have 2M − A > 0 rather than the stronger assumption M − A ≥ 0.
This is the case for block Jacobi smoothing in the numerical experiments in Section 2.5, where we have seen that deflation can yield fewer iterations (at lower costs per iteration). Nevertheless, if the smoother M is replaced by the damped alternative ω⁻¹M such that ω⁻¹M − A ≥ 0 (note that any ω ≤ 1/2 suffices if 2M − A > 0), then the result above applies (although a larger ω might give better results). Altogether, Theorem 3.3.3 provides insight in the way both two-level methods are related, but does not imply that preconditioning is always better.

To show Theorem 3.3.3, we use, for any SPD matrices M, N:

    K_M ≤ ‖N^{−1/2} M (I − π_M) N^{−1/2}‖₂ K_N.                       (3.31)

This follows from the more general work in [Not10, Corollary 2.2]. Additionally, we use, for any SPD matrix M [Not05, eq. (45)]:

    (1/2) M ≤ M̃ ≤ (1 / (2 − λ_max(M⁻¹A))) M.                         (3.32)

Theorem 3.3.3 can now be shown as follows:

Proof (of Theorem 3.3.3)  First, we show that (3.31) implies that, for any SPD matrices M, N and scalar α > 0:

    M ≤ αN  ⟹  K_M ≤ α K_N.                                          (3.33)

To show this, define π̄_M := M^{1/2} π_M M^{−1/2}, and observe that I − π̄_M is a symmetric projection. Hence:

    ‖N^{−1/2} M (I − π_M) N^{−1/2}‖₂ = ‖N^{−1/2} M^{1/2} (I − π̄_M) M^{1/2} N^{−1/2}‖₂
                                     ≤ ‖N^{−1/2} M^{1/2}‖₂ ‖I − π̄_M‖₂ ‖M^{1/2} N^{−1/2}‖₂
                                     ≤ √α · 1 · √α = α,

where the last step uses that M ≤ αN implies ‖M^{1/2} N^{−1/2}‖₂² = ‖N^{−1/2} M N^{−1/2}‖₂ ≤ α. Combining this with (3.31) completes the proof of (3.33). Combining (3.33) and (3.32) now gives:

    (1/2) K_M ≤ K_{M̃} ≤ (1 / (2 − λ_max(M⁻¹A))) K_M.

Application of Lemma 3.3.2 (using M_prec = M_defl =: M SPD with M − A ≥ 0) yields κ₂(P_prec⁻¹A) = K_{M̃} and κ₂(P_defl⁻¹A) = K_M. Hence:

    (1/2) κ₂(P_defl⁻¹ A) ≤ κ₂(P_prec⁻¹ A) ≤ (1 / (2 − λ_max(M⁻¹A))) κ₂(P_defl⁻¹ A).
Observing that λ_max(M⁻¹A) ≤ 1 (as M − A ≥ 0) yields (3.30), which then completes the proof.

3.4  Intermezzo: regularity on the block diagonal of A

To further refine the abstract relations in the previous section for our SIPG application, we need to derive an auxiliary result: the diagonal blocks of an SIPG matrix A all 'behave' in a similar manner in the space orthogonal to the coarse space. This section obtains this result in three steps: Section 3.4.1 discusses an auxiliary property based on the regularity of the mesh. Section 3.4.2 uses this property to establish the desired result in terms of 'small' bilinear forms. Section 3.4.3 can then show the main result of this section: regularity on the block diagonal of A.

3.4.1  Using regularity of the mesh

The first step is a rather abstract consequence of the regularity of the mesh. To state this result, recall the mapping F_i : E_i → E₀ (cf. Section 3.2.2). Because this mapping is invertible and affine by assumption (3.2), there exists an invertible matrix G_i ∈ ℝ^{d×d} and a vector g_i ∈ ℝ^d such that

    F_i(x) = G_i x + g_i,    ∀x ∈ E_i.

Next, let |G_i⁻¹| denote the determinant of G_i⁻¹, and define

    Z_i := |G_i⁻¹| G_iᵀ G_i.

Using regularity of the mesh, the following spectral properties of Z_i can be shown:

Lemma 3.4.1  Assuming (3.2) and (3.3), the eigenvalues of the matrices Z_i above satisfy the following relation:

    1 ≲ λ_min(h^{2−d} Z_i) ≤ λ_max(h^{2−d} Z_i) ≲ 1,    ∀i = 1, ..., N.        (3.34)

To show this result, we use the following relations [18, p. 120–122]⁷:

    |G_i⁻¹| = meas(E_i) / meas(E₀),    ‖G_i‖₂ ≤ h₀ / ρ_i,    ‖G_i⁻¹‖₂ ≤ h_i / ρ₀.        (3.35)

We can now prove Lemma 3.4.1:

Proof (of Lemma 3.4.1)  Because Z_i := |G_i⁻¹| G_iᵀ G_i, and G_i is invertible, we have (cf. [67, p. 26]):

    λ_max(h^{2−d} Z_i) = h^{2−d} |G_i⁻¹| λ_max(G_iᵀ G_i) = h^{2−d} |G_i⁻¹| ‖G_i‖₂²,

    λ_min(h^{2−d} Z_i) = h^{2−d} |G_i⁻¹| λ_min(G_iᵀ G_i) = h^{2−d} |G_i⁻¹| / ‖G_i⁻¹‖₂².
i i T −1 2 λmax (Gi Gi )−1 Gi 2 Applying the relations in (3.35), using that meas(E0 ), h0 and ρ0 do not depend on h, and observing that ρdi . meas(Ei ) . hdi (for all i = 1, ..., N ), we may write: d 2 meas(Ei ) h0 2 h hi meas(Ei ) h 2 λmax (h2−d Zi ) ≤ h2−d . . , meas(E0 ) ρi hd ρi h ρi 7 Here, meas(.) denotes the Lesbesque measure. Section 3.4 Intermezzo: regularity on the block diagonal of A λmin (h2−d Zi ) ≥ h2−d meas(Ei ) meas(E0 ) ρ0 hi 2 & meas(Ei ) hd 45 h hi 2 & ρ d h 2 i . h hi Hence, the proof is completed if we can show that 1≤ h h ≤ . 1, hi ρi ∀i = 1, ..., N. The first two inequalities follow from the fact that ρi ≤ hi ≤ h. The last inequality follows as a special case from (3.3). Hence, we obtain (3.34), which completes the proof. 3.4.2 The desired result in terms of ‘small’ bilinear forms The second step is to use the regularity result in the previous section to obtain the desired result (regularity on the block diagonal of A) in terms of ‘small’ bilinear forms. To state this result, we require the following notation: let V0 denote the space of all polynomials of degree p and lower defined on the reference element E0 . Additionally, let Γi denote the set of all edges of Ei that are either in the interior or at the Dirichlet boundary. Furthermore, let Γ0 denote the set of all edges of the reference element E0 . Next, define the following bilinear forms8 : Z Z (0) (i) ∇v · ∇w, K∇ (v ◦ Fi ) · ∇ (w ◦ Fi ) , BΩ (v, w) = BΩ (v, w) = E0 Ei XZ σ XZ Bσ(i) (v, w) = [v ◦ Fi ] · [w ◦ Fi ] , Bσ(0) (v, w) = [v] · [w] , e he e e∈Γi e∈Γ0 for all v, w ∈ V0 and i = 1, ..., N . Using this notation, we now have the following result: Lemma 3.4.2 Suppose that the diffusion coefficient K and the penalty parameter σ are bounded above and below by positive constants (independent of h). Assume that (3.2) and (3.3) hold. Then, the bilinear forms above satisfy the following relations: (0) (i) (0) BΩ (w, w) . h2−d BΩ (w, w) . BΩ (w, w), ∀w ∈ V0 , ∀i = 1, ..., N. 
(3.36) 0 ≤ h2−d Bσ(i) (w, w) . Bσ(0) (w, w), ∀w ∈ V0 , ∀i = 1, ..., N. (3.37) We discuss the proof of both relations individually hereafter. 8 Here, the trace operators are defined as before by extending the function to be zero outside E0 and Ei . 46 Chapter 3 Theoretical scalability Proof (of (3.36)) Because the diffusion coefficient K is bounded below and above by positive constants (independent of h), we may write (all displayed relations below are for all w ∈ V0 and for all i = 1, ..., N ): Z Z (i) ∇ (w ◦ Fi ) · ∇ (w ◦ Fi ) . (3.38) ∇ (w ◦ Fi ) · ∇ (w ◦ Fi ) . BΩ (w, w) . Ei Ei Next, we apply the chain rule, using that the Jacobian of Fi is equal to Gi : Z Z ∇ (w ◦ Fi ) · ∇ (w ◦ Fi ) = Gi ∇w ◦ Fi · Gi ∇w ◦ Fi . Ei Ei A change of variables (from x ∈ Ei to Fi (x) ∈ E0 ) introduces a factor G−1 i : Z Z −1 ∇ (w ◦ Fi ) · ∇ (w ◦ Fi ) = Gi Gi ∇w · Gi ∇w E0 Ei = Z T (∇w)T G−1 i Gi Gi ∇w. E0 | {z } =Zi Substitution of this expression into (3.38) and multiplication with h2−d yields: Z Z (i) (∇w)T (h2−d Zi )(∇w). (∇w)T (h2−d Zi )(∇w) . h2−d BΩ (w, w) . E0 E0 Application of Lemma 3.4.1 gives: Z Z (i) ∇w · ∇w . h2−d BΩ (w, w) . | E0 {z (0) =BΩ (w,w) | } which completes the proof of (3.36). ∇w · ∇w, E0 {z (0) =BΩ (w,w) } Proof (of (3.37) for 1D problems) Because the penalty parameter σ is bounded be(i) low and above by positive constants (independent of h), and because Bσ is SPSD, it follows that (all displayed relations below are for all w ∈ V0 and for all i = 1, ..., N ): X Z h2−d (i) 0 ≤ h2−d Bσ (w, w) . [w ◦ Fi ] · [w ◦ Fi ] . (3.39) e he e∈Γ i For one-dimensional problems, the integral over an edge e should be interpreted as the evaluation of the integrand. Furthermore, he denotes the size of the largest mesh element adjacent to e, so regularity (3.3) implies that hh . 1 for all e. Hence, we may write (for e d = 1): X (i) [w ◦ Fi (e)] · [w ◦ Fi (e)] . 0 ≤ hBσ (w, w) . e∈Γi Finally, observe that, for all e ∈ Γi , the transformed edge value Fi (e) =: e0 ∈ Γ0 . 
Different e \in \Gamma_i yield different e_0 \in \Gamma_0, although not all e_0 \in \Gamma_0 are reached in the presence of Neumann boundary conditions. Nevertheless, we may write:

    0 \leq h B_\sigma^{(i)}(w, w) \lesssim \sum_{e_0 \in \Gamma_0} [w(e_0)] \cdot [w(e_0)] = B_\sigma^{(0)}(w, w).

This completes the proof of (3.37) for one-dimensional problems.

Proof (of (3.37) for 2D problems) Similar to the one-dimensional case, we can obtain (3.39). For two-dimensional problems, the edges e are line segments and h_e = meas(e), i.e. the length of e. Hence, for all e there exists an invertible affine mapping r_e : [0, 1] \to e. By definition of the line integral over e, we may now rewrite (3.39) as (using d = 2):

    0 \leq B_\sigma^{(i)}(w, w) \lesssim \sum_{e \in \Gamma_i} \frac{1}{h_e} \int_0^1 [w \circ F_i \circ r_e(t)] \cdot [w \circ F_i \circ r_e(t)] \, |r_e'| \, dt.

Because r_e(t) is affine, its derivative r_e' is a constant and:

    \int_0^1 |r_e'| \, dt = |r_e'| = \int_e 1 \, de = meas(e) = h_e.

Hence:

    0 \leq B_\sigma^{(i)}(w, w) \lesssim \sum_{e \in \Gamma_i} \int_0^1 [w \circ F_i \circ r_e(t)] \cdot [w \circ F_i \circ r_e(t)] \, dt.

Next, consider a single e \in \Gamma_i: note that F_i \circ r_e([0, 1]) = F_i(e) =: e_0 \subset \partial E_0, and define the invertible affine mapping r_{e_0} := F_i \circ r_e. As above, we have that |r_{e_0}'| = meas(e_0). By definition of the line integral over e_0, we may now write (using that meas(e_0) does not depend on h):

    \int_0^1 [w \circ F_i \circ r_e(t)] \cdot [w \circ F_i \circ r_e(t)] \, dt = \frac{1}{meas(e_0)} \int_{e_0} [w] \cdot [w] \lesssim \int_{e_0} [w] \cdot [w].

Next, we apply this strategy for all e \in \Gamma_i, which yield different (disjoint) e_0 \subset \partial E_0, although the entire boundary of E_0 is not reached in the presence of Neumann boundary conditions. Combining the results, we may write (after possible repartitioning of the edges e_0):

    0 \leq B_\sigma^{(i)}(w, w) \lesssim \sum_{e_0 \in \Gamma_0} \int_{e_0} [w] \cdot [w] = B_\sigma^{(0)}(w, w).

This completes the proof of (3.37) for two-dimensional problems (d = 2).

Proof (of (3.37) for 3D problems) The proof is similar to the two-dimensional case, except that we are now dealing with surface integrals rather than line integrals. Similar to the one-dimensional case, we have (3.39). For three-dimensional problems, the faces e are polygons and h_e = \sqrt{meas(e)}, i.e. the square root of the surface area of e. Because all faces are mutually affine-equivalent (3.2), for all e, there exists an invertible affine mapping r_e : D \to e for some polygon D \subset R^2 (independent of h). By definition of the surface integral over e, we may rewrite (3.39) as (using d = 3):

    0 \leq h^{-1} B_\sigma^{(i)}(w, w) \lesssim \sum_{e \in \Gamma_i} \frac{1}{h \, h_e} \int_D [w \circ F_i \circ r_e(u, v)] \cdot [w \circ F_i \circ r_e(u, v)] \left\| \frac{\partial r_e}{\partial u} \times \frac{\partial r_e}{\partial v} \right\| du \, dv.

Because r_e(u, v) is affine, its derivatives \partial r_e / \partial u and \partial r_e / \partial v are constant, and:

    \left\| \frac{\partial r_e}{\partial u} \times \frac{\partial r_e}{\partial v} \right\| = \frac{1}{meas(D)} \int_D \left\| \frac{\partial r_e}{\partial u} \times \frac{\partial r_e}{\partial v} \right\| du \, dv = \frac{1}{meas(D)} \int_e 1 \, de = \frac{meas(e)}{meas(D)} = \frac{h_e^2}{meas(D)}.

Hence:

    0 \leq h^{-1} B_\sigma^{(i)}(w, w) \lesssim \sum_{e \in \Gamma_i} \frac{h_e}{h \, meas(D)} \int_D [w \circ F_i \circ r_e(u, v)] \cdot [w \circ F_i \circ r_e(u, v)] \, du \, dv.

Because e can be contained in a circle with diameter h, it follows that meas(e) \lesssim h^2, hence h_e / h \lesssim 1:

    0 \leq h^{-1} B_\sigma^{(i)}(w, w) \lesssim \sum_{e \in \Gamma_i} \frac{1}{meas(D)} \int_D [w \circ F_i \circ r_e(u, v)] \cdot [w \circ F_i \circ r_e(u, v)] \, du \, dv.

Next, consider a single e \in \Gamma_i: note that F_i \circ r_e(D) = F_i(e) =: e_0 \subset \partial E_0, and define the invertible affine mapping r_{e_0} := F_i \circ r_e. As above, we have that

    \left\| \frac{\partial r_{e_0}}{\partial u} \times \frac{\partial r_{e_0}}{\partial v} \right\| = \frac{meas(e_0)}{meas(D)}.

By definition of the surface integral over e_0, we may now write (using that meas(e_0) does not depend on h):

    \frac{1}{meas(D)} \int_D [w \circ F_i \circ r_e(u, v)] \cdot [w \circ F_i \circ r_e(u, v)] \, du \, dv = \frac{1}{meas(e_0)} \int_{e_0} [w] \cdot [w] \lesssim \int_{e_0} [w] \cdot [w].

Next, we apply this strategy for all e \in \Gamma_i, which yield different (disjoint) e_0 \subset \partial E_0, although the entire boundary of E_0 is not reached in the presence of Neumann boundary conditions. Combining the results, we may write (after possible repartitioning of the edges e_0):

    0 \leq h^{-1} B_\sigma^{(i)}(w, w) \lesssim \sum_{e_0 \in \Gamma_0} \int_{e_0} [w] \cdot [w] = B_\sigma^{(0)}(w, w).

This completes the proof of (3.37) for three-dimensional problems (d = 3).
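The h-uniform spectral bounds of Lemma 3.4.1 are easy to probe numerically. The sketch below is a hypothetical illustration (not taken from the thesis): it fixes a well-conditioned shape factor G0, scales it so that G_i maps an element of diameter of order h onto an O(1) reference element, forms Z_i = |det(G_i^{-1})| G_i G_i^T, and checks that the spectrum of h^{2-d} Z_i does not drift as h is refined.

```python
import numpy as np

def Z(G):
    # Z_i = |det(G_i^{-1})| G_i G_i^T (the transposition order does not change the spectrum)
    return abs(1.0 / np.linalg.det(G)) * (G @ G.T)

d = 3
rng = np.random.default_rng(0)
G0 = np.eye(d) + 0.3 * rng.standard_normal((d, d))   # fixed, well-conditioned "shape" factor

spectra = []
for h in [0.1, 0.01, 0.001]:
    Gi = G0 / h   # F_i maps an element of diameter ~h onto the O(1) reference element
    spectra.append(np.linalg.eigvalsh(h**(2 - d) * Z(Gi)))

# for this family of maps the eigenvalues of h^{2-d} Z_i are exactly h-independent
for w in spectra[1:]:
    assert np.allclose(w, spectra[0])
```

For shape-regular but non-uniform meshes the spectra would not coincide exactly, but (3.34) predicts that they stay bounded above and below independently of h.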
3.4.3 Regularity on the block diagonal of A

As a final step, we now demonstrate that the diagonal blocks of an SIPG matrix A all behave in a similar manner in the space orthogonal to the coarse space. To state this result, we require the following notation: suppose that A_\Omega results from the bilinear form h^{2-d} B_\Omega in the same way that A results from the bilinear form B: this is established by substituting A_\Omega for A and h^{2-d} B_\Omega for B in (3.6) and (3.7). Similarly, suppose that the matrices A_\sigma and A_r result from the bilinear forms h^{2-d} B_\sigma and h^{2-d} B_r respectively. Altogether, we may write

    A = h^{d-2} (A_\Omega + A_\sigma + A_r).    (3.40)

Finally, let D_\sigma be the result of extracting the diagonal blocks of size m x m from A_\sigma. Using this notation, we now have regularity on the block diagonal of A in the following sense:

Theorem 3.4.3 Suppose that the diffusion coefficient K and the penalty parameter \sigma are bounded above and below by positive constants (independent of h). Assume that (3.2) and (3.3) hold. Then, the matrices A_\Omega and D_\sigma above satisfy the following relations:

    v^T v \lesssim v^T A_\Omega v, \quad \forall v \in Range(I - \pi_I),    (3.41)
    v^T A_\Omega v \lesssim v^T v, \quad \forall v \in R^{mN},    (3.42)
    0 \leq v^T D_\sigma v \lesssim v^T v, \quad \forall v \in R^{mN}.    (3.43)

To show this result, the main idea is to observe that A_\Omega is an N x N block diagonal matrix with blocks of size m x m, where the first row and column in every diagonal block is zero: this follows from the fact that B_\Omega(\phi_k^{(i)}, \phi_\ell^{(j)}) = 0 for i \neq j, and that the gradient of the piecewise constant basis function \phi_1^{(j)} is (piecewise) zero. Altogether, A_\Omega is of the following form:

    A_\Omega = diag\left( \begin{pmatrix} 0 & 0 \\ 0 & \bar{A}_\Omega^{(1)} \end{pmatrix}, \; ..., \; \begin{pmatrix} 0 & 0 \\ 0 & \bar{A}_\Omega^{(N)} \end{pmatrix} \right).    (3.44)

As a consequence, we can treat the diagonal blocks individually by applying Lemma 3.4.2, and then combine the results (a similar strategy is used for the block diagonal D_\sigma).

To show (3.41), we also use the nature of \pi_I = R^T R, which is the projection operator onto the coarse space Range(R^T) that yields the best approximation in the 2-norm. As a result, the space Range(I - \pi_I) is orthogonal to Range(R^T), where the latter corresponds to the piecewise constant basis functions. In particular, any v \in Range(I - \pi_I) \subset R^{Nm} is of the form:

    v = (0, v_1^T, \; 0, v_2^T, \; ..., \; 0, v_N^T)^T, \qquad v_i \in R^{m-1}.    (3.45)

Using these ideas, we can now show Theorem 3.4.3:

Proof (of Theorem 3.4.3) Let \bar{A}_\Omega^{(i)} denote the result of deleting the first row and column in the i-th diagonal block in A_\Omega, as indicated in (3.44). In other words:

    \left( \bar{A}_\Omega^{(i)} \right)_{\ell-1, k-1} = h^{2-d} B_\Omega^{(i)}(\phi_k^{(0)}, \phi_\ell^{(0)}),

for all k, \ell = 2, ..., m. Next, observe that B_\Omega^{(0)} is independent of h and symmetric. Furthermore, for all higher-order polynomials v \in span{\phi_2^{(0)}, ..., \phi_m^{(0)}} \ {0}, the gradient of v is nonzero, which implies that B_\Omega^{(0)}(v, v) > 0. In other words, B_\Omega^{(0)} is even positive-definite for the subspace under consideration. As a consequence, applying Lemma 3.4.2, we obtain a result similar to (3.41), but then for the individual diagonal blocks:

    w^T w \lesssim w^T \bar{A}_\Omega^{(i)} w, \quad \forall w \in R^{m-1}, \; \forall i = 1, ..., N.

Using the notation in (3.45), this relation holds in particular for w = v_i, for all i = 1, ..., N. Summing over all i then yields:

    \sum_{i=1}^N v_i^T v_i \lesssim \sum_{i=1}^N v_i^T \bar{A}_\Omega^{(i)} v_i, \quad \forall v \in Range(I - \pi_I).

Using the notation in (3.45), this can be rewritten as (3.41), which then completes its proof. The relations (3.42) and (3.43) follow in a similar manner from Lemma 3.4.2 (without deleting the first row and column in each diagonal block).

3.5 Main result: scalability for SIPG systems

Combining the results in the previous section, we can now show the main result of this chapter: both two-level methods yield scalable convergence (independent of the mesh element diameter) of the preconditioned CG method for SIPG systems. This result has been shown by Dobrev et al. [24] for the preconditioning variant for p = 1. In this section, we extend these results for p \geq 1 and for the deflation variant. Section 3.5.1 obtains the aforementioned scalability for a general class of smoothers. Section 3.5.2 shows that the required smoother criteria are valid for block Jacobi smoothing. Section 3.5.3 studies the influence of damping and the penalty parameter on the upper bound of the condition numbers for block Jacobi smoothing.

3.5.1 Main result: scalability for SIPG systems

To state the main scalability result of this chapter, let A be the discretization matrix resulting from an SIPG scheme with p \geq 1, as defined in Section 3.2.1. Furthermore, let P_prec^{-1} and P_defl^{-1} denote the two-level preconditioner and BNN deflation variant respectively, as specified in Section 3.2.3. The main result can now be stated as follows:

Theorem 3.5.1 (Main result) Suppose that the diffusion coefficient K and the penalty parameter \sigma are bounded above and below by positive constants (independent of h). Assume that (3.2), (3.3), and (3.5) hold. Furthermore, assume that the smoother conditions (3.11), (3.12), (3.13), and (3.14) are satisfied. Then, both two-level methods yield scalable CG convergence in the sense that the condition number \kappa_2 (in the 2-norm) of the preconditioned system can be bounded independently of the maximum mesh element diameter h:

    \kappa_2(P_prec^{-1} A) \lesssim 1, \qquad \kappa_2(P_defl^{-1} A) \lesssim 1.    (3.46)

To show Theorem 3.5.1, the main idea is to consider Lemma 3.3.2:

    \kappa_2(P_prec^{-1} A) = K_{\widetilde{M}_prec}, \qquad \kappa_2(P_defl^{-1} A) < 2 K_{M_defl}.    (3.47)

The proof is then completed by showing that K_{\widetilde{M}_prec}, K_{M_defl} \lesssim 1, for any smoother that satisfies the criteria above. This is established using the auxiliary result Theorem 3.4.3, and coercivity (3.5) in matrix form:

    h^{d-2} v^T (A_\Omega + A_\sigma) v \lesssim v^T A v, \quad \forall v \in R^{Nm}.    (3.48)

Altogether, Theorem 3.5.1 can now be shown as follows:

Proof (of Theorem 3.5.1) First, we will show that K_{\widetilde{M}_prec} \lesssim 1 (a similar strategy yields K_{M_defl} \lesssim 1). For ease of notation, we will write \widetilde{M} for \widetilde{M}_prec. The main idea is to show that \|(I - \pi_{\widetilde{M}}) v\|_{\widetilde{M}} \lesssim \|v\|_A for all v: because \pi_{\widetilde{M}} is a projection onto the coarse space Range(R^T) that yields the best approximation in the \widetilde{M}-norm, we can replace \pi_{\widetilde{M}} by the suboptimal projection \pi_I, and then combine the properties established so far:

    \|(I - \pi_{\widetilde{M}}) v\|_{\widetilde{M}}^2
      \leq \|(I - \pi_I) v\|_{\widetilde{M}}^2
      \lesssim h^{d-2} \|(I - \pi_I) v\|_2^2                [(3.12)]
      \lesssim h^{d-2} \|(I - \pi_I) v\|_{A_\Omega}^2       [(3.41)]
      = h^{d-2} \|v\|_{A_\Omega}^2                          [(3.44), (3.45)]
      \leq h^{d-2} \|v\|_{A_\Omega + A_\sigma}^2            [B_\sigma SPSD => A_\sigma SPSD]
      \lesssim \|v\|_A^2                                    [(3.48)]

for all v \in R^{Nm}. Substitution of this relation into the definition of K_{\widetilde{M}} yields:

    K_{\widetilde{M}} := \sup_{v \neq 0} \frac{\|(I - \pi_{\widetilde{M}}) v\|_{\widetilde{M}}^2}{\|v\|_A^2} \lesssim 1.

A similar strategy, using (3.14) instead of (3.12), yields K_{M_defl} \lesssim 1. Substitution of K_{\widetilde{M}_prec}, K_{M_defl} \lesssim 1 into (3.47) now yields (3.46), which completes the proof of Theorem 3.5.1.

3.5.2 Special case: block Jacobi smoothing

This section demonstrates that Theorem 3.5.1 is valid for (damped) block Jacobi smoothing. To specify this result, suppose that M_BJ is the block Jacobi smoother with blocks of size m x m. Next, consider the specific choice M_prec = M_defl = \omega^{-1} M_BJ with damping parameter \omega > 0 (independent of h). We assume that \omega \leq 1, with \omega < 1 strictly for the preconditioning variant. Additionally, we assume that there exists a permutation matrix P such that A can be permuted as:

    P A P^T = \Delta - L - L^T,    (3.49)

with

    \Delta = \begin{pmatrix} \Delta_1 & & \\ & \ddots & \\ & & \Delta_q \end{pmatrix}, \qquad
    -L = \begin{pmatrix} 0 & & & \\ L_1 & 0 & & \\ & \ddots & \ddots & \\ & & L_{q-1} & 0 \end{pmatrix},

for some block-diagonal matrices \Delta_1, ..., \Delta_q with blocks of size m x m, matrices L_1, ..., L_{q-1}, and integer q \leq N. Note that this assumption implies that the matrix A has property A^\pi in the sense of [5, Definition 6.7]. Moreover, we remark that (3.49) is satisfied if the mesh can be colored by two colors^9 (in that case, we can choose q = 2, and \Delta_1 and \Delta_2 each correspond to one of the two colors).
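The blockwise argument behind this theorem can be mimicked in a few lines. The following hypothetical sketch (illustrative sizes and random SPD blocks, not an actual SIPG matrix) builds a block diagonal matrix whose blocks have a zero first row and column, takes a vector whose first (piecewise constant) coefficient vanishes in every block, and verifies the resulting uniform lower bound in the A_\Omega-inner product.

```python
import numpy as np

rng = np.random.default_rng(1)
N, m = 5, 4   # N elements, m local basis functions (hypothetical sizes)

A_omega = np.zeros((N * m, N * m))
lam_min = []
for i in range(N):
    X = rng.standard_normal((m - 1, m - 1))
    Abar = X @ X.T + np.eye(m - 1)   # SPD action on the m-1 higher-order modes
    # first row/column of each diagonal block stays zero (constant mode has zero gradient)
    A_omega[i*m + 1:(i+1)*m, i*m + 1:(i+1)*m] = Abar
    lam_min.append(np.linalg.eigvalsh(Abar).min())

# a vector in Range(I - pi_I): the first coefficient vanishes in every block
v = rng.standard_normal(N * m)
v[::m] = 0.0

c = min(lam_min)   # uniform blockwise constant, c > 0
assert v @ A_omega @ v >= c * (v @ v) - 1e-10
```

The assertion is exactly the discrete analogue of (3.41): summing the blockwise bounds w^T w <= (1/c) w^T Abar w over all elements yields the global estimate on the space orthogonal to the coarse space.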
In particular, structured rectangular meshes can be colored by two colors and thus satisfy (3.49).

^9 That is, the mesh can be represented by a graph whose vertices can be colored such that connected vertices do not have the same color.

Altogether, assuming^10 (3.49), we can now show that all smoother requirements for Theorem 3.5.1 are satisfied for (damped) block Jacobi smoothing:

^10 Alternatively, we could assume that the damping parameter \omega is sufficiently small. This option is not considered further in this thesis.

Corollary 3.5.2 If (3.49) is satisfied, then Theorem 3.5.1 applies for the damped block Jacobi smoothers M_prec and M_defl above, i.e. both two-level methods yield scalable CG convergence in the sense that the condition number \kappa_2 (in the 2-norm) of the preconditioned system can be bounded independently of the maximum mesh element diameter h:

    \kappa_2(P_prec^{-1} A) \lesssim 1, \qquad \kappa_2(P_defl^{-1} A) \lesssim 1.

This result follows immediately from Theorem 3.5.1 once we have verified that the conditions (3.11), (3.13), (3.12), and (3.14) are satisfied for the damped block Jacobi smoothers under consideration. In other words, writing M := \omega^{-1} M_BJ, we need to show:

    2M - A > 0,    (3.50)
    h^{2-d} v^T M v \lesssim v^T v, \quad \forall v \in Range(I - \pi_I),    (3.51)
    h^{2-d} v^T \widetilde{M} v \lesssim v^T v, \quad \forall v \in Range(I - \pi_I),    (3.52)

for all \omega \leq 1, with \omega < 1 strictly for (3.52). We treat each relation separately. To show (3.50), we use that (\rho denotes the spectral radius):

    B := \Delta^{-1}(L + L^T), \qquad \rho(B) < 1,    (3.53)

which follows from [5, Theorem 6.38] using (3.49) and the fact that A and M are SPD.

Proof (of (3.50)) Without loss of generality, assume that \omega = 1. Next, observe that P M P^T = \Delta. Hence:

    P (2M - A) P^T = 2\Delta - P A P^T = 2\Delta - \Delta + L + L^T = \Delta \Big( I + \underbrace{\Delta^{-1}(L + L^T)}_{=: B} \Big) = \Delta^{1/2} \left( I + \Delta^{1/2} B \Delta^{-1/2} \right) \Delta^{1/2}.

And because (3.53) implies that \rho(\Delta^{1/2} B \Delta^{-1/2}) = \rho(B) < 1, the symmetric matrix I + \Delta^{1/2} B \Delta^{-1/2} is positive definite. By congruence, it follows that 2M - A > 0. This completes the proof.
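Both the two-coloring assumption (3.49) and the resulting positivity (3.50) can be checked concretely. The sketch below uses a hypothetical block tridiagonal SPD matrix with the coupling pattern of a 1D mesh (which is trivially two-colorable: even and odd elements); reordering by color moves all couplings off the diagonal color blocks, and the block Jacobi splitting satisfies 2 M_BJ - A > 0.

```python
import numpy as np

N, m = 6, 2                 # 6 mesh elements, 2x2 blocks (hypothetical SIPG-like pattern)
D = 2.0 * np.eye(m)         # diagonal block
C = -0.5 * np.eye(m)        # coupling between neighboring elements

A = np.zeros((N * m, N * m))
for i in range(N):
    A[i*m:(i+1)*m, i*m:(i+1)*m] = D
    if i + 1 < N:
        A[i*m:(i+1)*m, (i+1)*m:(i+2)*m] = C
        A[(i+1)*m:(i+2)*m, i*m:(i+1)*m] = C.T

# two-coloring of a 1D mesh: even elements first, odd elements last
perm = [e for e in range(N) if e % 2 == 0] + [e for e in range(N) if e % 2 == 1]
idx = np.concatenate([np.arange(e*m, (e+1)*m) for e in perm])
PA = A[np.ix_(idx, idx)]

# within one color there is no coupling: both diagonal color blocks of PA are block diagonal
half = (N // 2) * m
assert np.allclose(PA[:half, :half], np.kron(np.eye(N // 2), D))
assert np.allclose(PA[half:, half:], np.kron(np.eye(N - N // 2), D))

# block Jacobi smoother M_BJ = block diagonal of A; here M_BJ = kron(I, D)
M = np.kron(np.eye(N), D)
assert np.linalg.eigvalsh(2 * M - A).min() > 0   # (3.50) with omega = 1
```

The same reordering idea applies to structured rectangular meshes in 2D and 3D with a checkerboard coloring (q = 2 in (3.49)).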
To show (3.51), the main idea is to use Theorem 3.4.3 and the following property (cf. [24, p. 760] and [43, p. 4]):

    0 < B(v, v) \lesssim B_\Omega(v, v) + B_\sigma(v, v), \quad \forall v \in R^{Nm}, \; v \neq 0.    (3.54)

Proof (of (3.51)) Without loss of generality, assume that \omega = 1. Next, recall the notation introduced in the beginning of Section 3.4.3. Additionally, similar to D_\sigma, let D_r be the result of extracting the diagonal blocks of size m x m from A_r. Using this notation, and the fact that A_\Omega is a block diagonal matrix with blocks of size m x m, we may write:

    h^{2-d} M = A_\Omega + D_\sigma + D_r.

Next, we write (3.54) in matrix form:

    0 < v^T A v \lesssim v^T \left( h^{d-2} A_\Omega + h^{d-2} A_\sigma \right) v, \quad \forall v \in R^{Nm}, \; v \neq 0.    (3.55)

Because this relation is also true when considering the diagonal blocks only, we may write:

    h^{2-d} v^T M v \lesssim v^T (A_\Omega + D_\sigma) v, \quad \forall v \in R^{Nm}.

Application of Theorem 3.4.3 now yields (3.51), which completes the proof.

To show (3.52), we combine the previous results (3.50) and (3.51):

Proof (of (3.52)) Using (3.50), and the fact that \omega < 1 strictly, it can be shown that

    \widetilde{M} \leq \frac{1}{2(1 - \omega)} M,    (3.56)

which can be seen as follows: by (3.50) (applied with damping parameter 1), we have 2\omega M - A = 2 M_BJ - A \geq 0, so that

    2M - A = (2 - 2\omega) M + \underbrace{2\omega M - A}_{\geq 0} \geq (2 - 2\omega) M.

Hence [40, p. 398, 471]:

    (2M - A)^{-1} \leq \frac{1}{2(1 - \omega)} M^{-1}, \qquad \underbrace{M (2M - A)^{-1} M}_{=: \widetilde{M}} \leq \frac{1}{2(1 - \omega)} M.

Combining this relation with (3.51), it now follows that

    h^{2-d} v^T \widetilde{M} v \leq \frac{1}{2(1 - \omega)} h^{2-d} v^T M v \lesssim v^T v, \quad \forall v \in Range(I - \pi_I),

where the first step uses (3.56) and the second uses (3.51). This completes the proof of (3.52).

3.5.3 Influence of damping and the penalty parameter for block Jacobi smoothing

In Section 2.5.2, we studied the influence of damping and the penalty parameter on the CG convergence. We found that both two-level methods perform significantly better if the penalty parameter is chosen dependent on local values of the diffusion coefficient. Furthermore, a damping parameter around \omega = 0.7 was observed to be optimal for the preconditioning variant during the numerical experiments.
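The bound (3.56) shown above is easy to probe numerically. In the hypothetical sketch below (random SPD matrices, not an actual SIPG system and not an actual block Jacobi matrix), an SPD A and an SPD smoother M_BJ with 2 M_BJ - A > 0 are drawn, M = M_BJ / omega is formed for several damping parameters, and the ordering M_tilde = M (2M - A)^{-1} M <= M / (2(1 - omega)) is verified via the smallest eigenvalue of the gap.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 8
X = rng.standard_normal((n, n))
A = X @ X.T + n * np.eye(n)    # SPD test matrix (hypothetical)
M_BJ = 0.5 * A + np.eye(n)     # SPD "smoother" with 2*M_BJ - A = 2*I > 0

def damped_gap(omega):
    """Smallest eigenvalue of M/(2(1-omega)) - Mtilde for M = M_BJ/omega;
    the bound (3.56) predicts this is nonnegative."""
    M = M_BJ / omega
    Mtilde = M @ np.linalg.solve(2 * M - A, M)
    gap = M / (2 * (1 - omega)) - Mtilde
    return np.linalg.eigvalsh((gap + gap.T) / 2).min()

for omega in [0.5, 0.7, 0.9]:
    assert damped_gap(omega) > -1e-8
```

Note how the allowed gap shrinks as omega tends to 1, mirroring the blow-up of the factor 1/(2(1 - omega)) and the observation that the preconditioning variant prefers damping safely away from 1.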
To gain more insight into these results, in this section, we study the influence of damping and the penalty parameter on the constants in Corollary 3.5.2, where the same damped block Jacobi smoother is used for both two-level methods. Regarding damping, it can be shown (the proof is given at the end of this section):

    \kappa_2(P_defl^{-1} A) \leq \frac{2}{\omega} K_{M_BJ}, \qquad \kappa_2(P_prec^{-1} A) < \frac{1}{2\omega(1 - \omega)} K_{M_BJ}.    (3.57)

Although these upper bounds may not be optimal, the fact that the upper bound for the preconditioner blows up as \omega tends to 1 is in line with our earlier numerical observation in Section 2.5 that the preconditioning variant performs better for \omega safely away from 1.

To study the influence of the penalty parameter, let \sigma_max^{(i)} denote the largest value that the penalty parameter \sigma attains at the edges of mesh element E_i, and let K_min^{(i)} denote the smallest value that the diffusion coefficient K attains within E_i (for all i = 1, ..., N). We can now bound K_{M_BJ} in terms of the local ratio between the penalty parameter and the diffusion coefficient, assuming^11 A_\sigma + A_r \geq 0 (the proof is given at the end of this section):

    K_{M_BJ} \leq C_1 \max_{i=1,...,N} \frac{\sigma_max^{(i)}}{K_min^{(i)}} + C_2,    (3.58)

for some positive constants C_1 and C_2 that are independent of the mesh element diameter h and the penalty parameter \sigma (but possibly dependent on the diffusion coefficient K). The result of substituting (3.58) into (3.57) is in line with the observation in Section 2.5 that the penalty parameter can best be chosen dependent on local values of the diffusion coefficient.

^11 This condition seems closely related to coercivity (3.5). How either can be guaranteed in practice (for problems with strong contrasts in the coefficients) is left for future research.

We end this section with the proofs of (3.57) and (3.58):

Proof (of (3.57)) The first inequality follows from (3.47), (3.33) and the fact that M_defl = \omega^{-1} M_BJ. To show the second inequality, we use that (3.50) implies that \lambda_{\max}(M_prec^{-1} A) < 2\omega:

    \kappa_2(P_prec^{-1} A) = K_{\widetilde{M}_prec}                                 [(3.47)]
      \leq \frac{1}{2 - \lambda_{\max}(M_prec^{-1} A)} K_{M_prec}                    [(3.33), (3.32)]
      < \frac{1}{2(1 - \omega)} K_{M_prec}                                           [\lambda_{\max}(M_prec^{-1} A) < 2\omega]
      \leq \frac{1}{2\omega(1 - \omega)} K_{M_BJ}.                                   [(3.33), M_prec = \omega^{-1} M_BJ]

This completes the proof of (3.57).

Proof (of (3.58)) By definition (3.15),

    K_{M_BJ} = \sup_{v \neq 0} \frac{\|(I - \pi_{M_BJ}) v\|_{M_BJ}^2}{\|v\|_A^2}.

As in the proof of Theorem 3.5.1, we may replace \pi_{M_BJ} by the suboptimal projection \pi_I:

    K_{M_BJ} \leq \sup_{v \neq 0} \frac{\|(I - \pi_I) v\|_{M_BJ}^2}{\|v\|_A^2}.

Using the notation in (3.40) and (3.55), we can rewrite this as (note that the factor h^{d-2} cancels):

    K_{M_BJ} \leq \sup_{v \neq 0} \frac{\|(I - \pi_I) v\|_{A_\Omega + D_\sigma + D_r}^2}{\|v\|_{A_\Omega + A_\sigma + A_r}^2}.

Next, we use the assumption A_\sigma + A_r \geq 0:

    K_{M_BJ} \leq \sup_{v \neq 0} \frac{\|(I - \pi_I) v\|_{A_\Omega + D_\sigma + D_r}^2}{\|v\|_{A_\Omega}^2}
      = \sup_{v \in Range(I - \pi_I), \, v \neq 0} \frac{\|v\|_{A_\Omega + D_\sigma + D_r}^2}{\|v\|_{A_\Omega}^2}
      = 1 + \sup_{v \in Range(I - \pi_I), \, v \neq 0} \frac{\|v\|_{D_\sigma + D_r}^2}{\|v\|_{A_\Omega}^2},

where the first equality uses (3.44) and (3.45). Next, consider the notation for \bar{A}_\Omega^{(i)} in (3.44) and, similarly, let \bar{D}_\sigma^{(i)} and \bar{D}_r^{(i)} denote the result of removing the first row and column from diagonal block i in D_\sigma and D_r respectively. Then, we may write, using the notation for the components of v in (3.45):

    K_{M_BJ} \leq 1 + \sup_{v \in Range(I - \pi_I), \, v \neq 0} \frac{\sum_{i=1}^N v_i^T (\bar{D}_\sigma^{(i)} + \bar{D}_r^{(i)}) v_i}{\sum_{i=1}^N v_i^T \bar{A}_\Omega^{(i)} v_i}.

At the same time, it can be shown (similar to Section 3.4) that there exist positive constants C_\Omega, C_\sigma, C_r independent of h and \sigma, with C_\Omega, C_\sigma also independent of K, such that, for all w \in R^{m-1}:

    w^T \bar{A}_\Omega^{(i)} w \geq C_\Omega K_min^{(i)} w^T w, \qquad w^T \bar{D}_\sigma^{(i)} w \leq C_\sigma \sigma_max^{(i)} w^T w, \qquad w^T \bar{D}_r^{(i)} w \leq C_r w^T w.

Combining these relations gives:

    v_i^T (\bar{D}_\sigma^{(i)} + \bar{D}_r^{(i)}) v_i \leq \frac{C_\sigma \sigma_max^{(i)} + C_r}{C_\Omega K_min^{(i)}} \, v_i^T \bar{A}_\Omega^{(i)} v_i.

Using the latter relation, we may now write:

    K_{M_BJ} \leq 1 + \sup_{v \in Range(I - \pi_I), \, v \neq 0} \frac{\sum_{i=1}^N \frac{C_\sigma \sigma_max^{(i)} + C_r}{C_\Omega K_min^{(i)}} v_i^T \bar{A}_\Omega^{(i)} v_i}{\sum_{i=1}^N v_i^T \bar{A}_\Omega^{(i)} v_i}
      \leq 1 + \max_{i=1,...,N} \frac{C_\sigma \sigma_max^{(i)}}{C_\Omega K_min^{(i)}} + \max_{i=1,...,N} \frac{C_r}{C_\Omega K_min^{(i)}}.

This can be rewritten as (3.58), which then completes the proof.

3.6 Conclusion

This chapter focused on the theoretical analysis of the two-level preconditioner and deflation variant studied numerically in Chapter 2 for linear systems resulting from SIPG discretizations for diffusion problems. For both two-level methods, we have found that the condition number of the preconditioned system can be bounded independently of the mesh element diameter. This result is valid for any polynomial degree p \geq 1, which extends the available analysis for the preconditioning variant for p = 1 in [24]. We have verified that the restrictions on the smoother are satisfied for block Jacobi smoothing. Altogether, our theory explains the scalable CG convergence observed during the numerical experiments in Chapter 2, and guarantees similar results for a large class of other diffusion problems on a variety of meshes.

4 Hidden DG accuracy

This chapter is based on: P. van Slingerland, J.K. Ryan, C. Vuik, Position-dependent smoothness-increasing accuracy-conserving (SIAC) filtering for improving discontinuous Galerkin solutions, SIAM J. Sci. Comput., 33 (2011), pp. 802-825.

4.1 Introduction

DG approximations can contain 'hidden accuracy': although the convergence rate of a DG scheme is typically of order p + 1 (where p is the polynomial degree), it can be improved to order 2p + 1 by applying a post-processor (cf. Section 1.4). Interestingly, this post-processor does not contain any information about the underlying physics or numerics, and needs to be applied only once, at the final time. The main idea behind the post-processor is to compute a convolution of the DG approximation against a linear combination of 2p + 1 B-splines.
A positive side effect of this strategy is that the smoothness of the B-splines is carried over to the DG approximation, which then becomes p - 1 times continuously differentiable. This enhanced smoothness can benefit the visualization of the approximation, e.g. in the form of streamlines.

The aforementioned filter, also known as the symmetric post-processor, was introduced by Bramble and Schatz [11] in the context of Ritz-Galerkin approximations for elliptic problems. Cockburn, Luskin, Shu, and Süli [22] demonstrated that the same technique can also be used to enhance the convergence rate of DG approximations from order p + 1 to order 2p + 1. This was shown for linear periodic hyperbolic problems with a sufficiently smooth exact solution. An overview of the development of post-processing techniques can also be found in [22].

To make the post-processor applicable near non-periodic boundaries and shocks, Ryan and Shu [64] introduced the one-sided post-processor. Inspired by the ideas in [16, 35, 51], they shifted the support of the local averaging operator to one side of the evaluation point. Using the resulting one-sided post-processor near boundaries and shocks and the original symmetric post-processor in the interior, it was now possible to post-process the entire domain, even for non-smooth solutions or non-periodic boundary conditions.

To improve the accuracy and smoothness of this one-sided strategy, in this chapter, we propose a position-dependent post-processor. In particular, we investigate the impact of using extra B-splines near the boundary. This way, we seek to reduce the possibility that the post-processor worsens the errors near the boundary (despite superconvergence). Furthermore, we study the effect of smoother transitions in the position of the B-splines. With this strategy, we aim to eliminate the necessity of (re)introducing artificial discontinuities.
To compare the performance of the resulting position-dependent post-processor and the original one-sided filter, we discuss seven numerical experiments, including a problem with stationary shocks, a two-dimensional system, and a streamline visualization example (theoretical error estimates are discussed in Chapter 5). The outline of this chapter is as follows. Section 4.2 specifies the DG schemes under consideration. Section 4.3 provides the basics of the original symmetric and one-sided post-processor. Section 4.4 introduces the positiondependent post-processor. Section 4.5 discusses the numerical experiments. Section 4.2 Discretization 59 Section 4.6 summarizes the main conclusions. 4.2 Discretization This section summarizes the DG method for linear hyperbolic problems. Section 4.2.1 considers the one-dimensional case, following [21]. Section 4.2.2 discusses two-dimensional systems, similar to [22]. 4.2.1 DG for one-dimensional hyperbolic problems Consider the following problem on the interval [a, b]: ut + cu x = f, with initial condition u0 and either periodic or Dirichlet boundary conditions at x = a. The functions c and f may depend on space and time, but not on u. We assume that the velocity c is positive. To construct a DG approximation for this problem, consider a uniform mesh with elements Ei = [xi− 21 , xi+ 21 ] of length h > 0, where x 12 = a and xi+ 12 = xi− 21 + h (for all i = 1, ..., N ). Next, define the test space V that contains each function that is a polynomial of degree p or lower within each mesh element, and that may be discontinuous at the mesh element boundaries. For all v ∈ V , we let v (i) denote the (continuous) restriction of v to Ei . At the initial time t = 0, the DG approximation uh is the L2 -projection of u0 onto V . 
For t > 0, it is the function in V such that: Z b Z b f v, ∀v ∈ V, (uh )t v + B(uh , v) = a a (0) where B is the following bilinear form (defining uh |x 1 in terms of the boundary 2 condition for u at x = a): B(uh , v) = − N Z X i=1 cuh vx + Ei N X i=1 (i) v )|xi− 1 . (i−1) (i) (cuh v (i) )|xi+ 1 − (cuh 2 2 This form uses an upwind flux approximation for uh at the element boundaries. For more details, cf. [21]. 4.2.2 DG for two-dimensional hyperbolic systems Similar to the one-dimensional case, we can construct a DG approximation for two-dimensional hyperbolic systems. To specify this, consider the following problem on a square domain Ω: ut + A1 ux1 + A2 ux2 = f, (4.1) 60 Chapter 4 Hidden DG accuracy with initial condition u0 and periodic boundary conditions. Here, u and f are vector-valued functions with m entries and the coefficients Aj are constant matrices of size m×m. We assume that the system above is strongly hyperbolic in the sense that, for all n ∈ R2 , there exists a diagonal matrix Λ and a nonsingular matrix R such that " # λ1 −1 R(A1 n1 + A2 n2 )R = Λ = . (4.2) .. . λ m To construct a DG approximation for the system (4.1), we use a uniform Cartesian mesh for the spatial domain Ω with compact mesh elements E1 , ..., EN of size h × h. Next, define the test space V that contains each vector-valued function with m entries that are polynomials of degree p or lower within each mesh element, and that may be discontinuous at the mesh element boundaries. For all v ∈ V , we let v (i) denote the (continuous) restriction of v to Ei . At the initial time t = 0, the DG approximation uh is the L2 -projection of u0 onto V . For t > 0, it is the function in V such that (h., .i denotes the standard inner product in ℓ2 ): Z Z h(uh )t , vi + B(uh , v) = hf, vi , ∀v ∈ V, Ω Ω where B is a bilinear form specified hereafter. To define B, let ûh denote the following upwind flux approximation for uh on the mesh element boundaries [22, p. 
587]: consider an edge e of mesh (i) (i) element Ei with outward normal n(i) = (n1 , n2 ) and neighboring element Ej . Using the notation in (4.2) (substituting n = n(i) ), define w := R−1 uh . We then set ûh = Rŵ, where the k th entry of ŵ is defined as: ( (i) vk , if λk > 0, ŵk = (i) vk , else. Using this numerical flux, the bilinear form B can be specified as follows: B(uh , v) = − + N Z X uh , AT1 vx1 + AT2 vx2 i=1 Ei N X E X Z D (i) (i) (A1 n1 + A2 n2 )ûh , v (i) . i=1 e∈∂Ei 4.3 e Original post-processing strategies The smoothness and accuracy of a DG approximation can be improved by applying a post-processer at the final simulation time. This section provides Section 4.3 Original post-processing strategies 61 the basics of this technique. Section 4.3.1 defines B-splines [70, 71], which are the building blocks of the filter. Section 4.3.2 discusses the original symmetric post-processor studied in [11, 22]. Section 4.3.3 considers the one-sided postprocessor introduced in [64] for post-processing near non-periodic boundaries and shocks. 4.3.1 B-splines A B-spline ψ (p+1) of order p + 1 can be defined recursively in the following manner1 : ψ (1) := 1[− 12 , 12 ] , ψ (p+1) := ψ (p) ⋆ ψ (1) , for all p ≥ 1. (4.3) Figure 4.1 provides an illustration of a B-spline. In general, a B-spline of order p + 1 is a piecewise polynomial of degree p that is p − 1 times continuously p+1 differentiable. Moreover, its support reads [− p+1 2 , 2 ]. For more details on B-splines, cf. [70, 71]. 1 Figure 4.1 Illustration of a B-spline ψ p+1 (x) for p = 1. Its support is contained in [− p+1 , p+1 ]. 2 2 0 − p+1 2 4.3.2 x p+1 2 Symmetric Post-processor The symmetric post-processor [11, 22] enhances a DG approximation (with polynomial degree p) by convolving it against a linear combination of B-splines. To specify this technique, we choose 2p + 1 integer nodes that are located symmetrically around the origin: xj = −p + j, j = 0, ..., 2p. 
(4.4)

Next, we place a B-spline of order p + 1 at each kernel node, and define a kernel K that is a linear combination of these B-splines:

\[
K(x) = \sum_{j=0}^{2p} c_j\, \psi^{(p+1)}(x - x_j), \qquad \text{for all } x \in \mathbb{R}, \tag{4.5}
\]

where the coefficients c_j are determined by the following linear system (for k = 0, ..., 2p):

\[
\sum_{j=0}^{2p} c_j \int_{-\infty}^{\infty} \psi^{(p+1)}(x)\,(x + x_j)^k \, dx = \begin{cases} 1, & \text{for } k = 0, \\ 0, & \text{else.} \end{cases} \tag{4.6}
\]

¹ Here, the symbol ⋆ denotes the convolution operator. Furthermore, \(1_{[-\frac12,\frac12]}\) is the indicator function that is 1 in \([-\frac12, \frac12]\) and 0 elsewhere.

Existence and uniqueness of the solution of the system in (4.6) have been shown in [10, Lemma 8.1]. The coefficients are chosen in this way to ensure that the kernel reproduces polynomials q of degree 2p and lower in the sense that K ⋆ q = q. The relevance of this property will be discussed further in Section 5.3.1. The symmetric post-processor can now be defined as follows:

Definition 4.3.1 (Symmetric post-processor) Consider a periodic DG approximation u_h at some final simulation time on the interval [a, b], using mesh elements of size h and polynomials of degree p, as discussed in Section 4.2.1. Let K denote the kernel defined by (4.4), (4.5) and (4.6). Then, the result of post-processing u_h in the evaluation point x̄ ∈ (a, b) is computed by convolving u_h against the scaled kernel K (using a periodic extension of u_h when needed):

\[
u_h^\star(\bar x) = \frac{1}{h} \int_a^b K\!\left(\frac{\bar x - x}{h}\right) u_h(x)\, dx. \tag{4.7}
\]

[Figure 4.2. Application of the scaled symmetric kernel K((x̄ − x)/h) at the evaluation point x̄ in the mesh (p = 1). The kernel nodes are indicated by circles, and the corresponding B-splines by dashed lines.]

Figure 4.2 illustrates the symmetric post-processor.
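The moment conditions (4.6) form a small (2p + 1) × (2p + 1) linear system for the coefficients c_j. The following is a minimal Python sketch (not the thesis implementation; the function names and quadrature setup are illustrative) that builds and solves this system, using SciPy's B-spline basis element for ψ^{(p+1)} and Gauss–Legendre quadrature, which is exact for the piecewise-polynomial integrands:

```python
import numpy as np
from scipy.interpolate import BSpline

def bspline(p):
    """Centered cardinal B-spline psi^(p+1): degree p, knots at
    -(p+1)/2, ..., (p+1)/2, unit integral."""
    knots = np.arange(-(p + 1) / 2, (p + 1) / 2 + 1)
    return BSpline.basis_element(knots, extrapolate=False)

def kernel_coefficients(p, nodes):
    """Solve the moment system (4.6) for the kernel coefficients c_j."""
    psi = bspline(p)
    r = len(nodes) - 1
    gl_x, gl_w = np.polynomial.legendre.leggauss(p + r + 2)
    lo, hi = -(p + 1) / 2, (p + 1) / 2
    A = np.zeros((r + 1, r + 1))
    for k in range(r + 1):
        for j, xj in enumerate(nodes):
            val = 0.0
            for left in np.arange(lo, hi):          # unit knot intervals of psi
                x = left + 0.5 * (gl_x + 1.0)       # map Gauss points to [left, left+1]
                val += 0.5 * np.sum(gl_w * psi(x) * (x + xj) ** k)
            A[k, j] = val
    rhs = np.zeros(r + 1)
    rhs[0] = 1.0                                    # reproduce constants; kill moments k >= 1
    return np.linalg.solve(A, rhs)

p = 1
nodes = np.arange(-p, p + 1)        # symmetric nodes (4.4)
c = kernel_coefficients(p, nodes)
print(c)                            # known closed form for p = 1: (-1/12, 7/6, -1/12)
```

For p = 1 this reproduces the classic symmetric kernel coefficients (−1/12, 7/6, −1/12); the same routine accepts the shifted node sets used later in this chapter.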
Although this technique does not contain any information about the underlying physics or numerics, it has been shown in [22] that it enhances the DG convergence rate from order p + 1 to order 2p + 1 for the linear periodic hyperbolic problems under consideration, assuming that the exact solution is sufficiently smooth. A second feature of the post-processor is that the smoothness of the B-splines is carried over to the approximation, i.e. u⋆_h is p − 1 times continuously differentiable. This enhanced smoothness can benefit the visualization of the approximation, e.g. in the form of streamlines (also cf. Section 4.5.7).

Finally, we remark that the application of the post-processor is not restricted to the DG approximations above: it can be applied to any function, also in higher dimensions (cf. Section 4.4.3). However, whether this also yields acceptable accuracy and efficiency does depend on the underlying problem. The post-processor has been applied effectively for DG discretizations on non-uniform rectangular [23] and unstructured triangular [48] meshes. Furthermore, it has been shown successful for certain linear convection-diffusion problems [41] and nonlinear hyperbolic conservation laws [42].

4.3.3 One-sided post-processor

The symmetric post-processor is an effective technique for enhancing the accuracy and smoothness for periodic problems with a sufficiently smooth exact solution. To be able to post-process near non-periodic boundaries and shocks as well, the one-sided post-processor has been proposed in [64]. This technique is similar to the symmetric case, except that the kernel nodes are shifted to one side of the origin, e.g. the right side:

\[
x_j = \frac{p+1}{2} + j, \qquad \text{for all } j = 0, ..., 2p. \tag{4.8}
\]

The resulting right-sided post-processor is now obtained by replacing (4.4) by (4.8) in Definition 4.3.1 (note that the kernel coefficients c_j now take different values).
[Figure 4.3. Application of the scaled right-sided kernel K((x̄ − x)/h) at the evaluation point x̄ in the mesh (p = 1). The kernel nodes are indicated by circles, and the corresponding B-splines by dashed lines. Unlike the symmetric kernel, it can be applied near the right boundary.]

Figure 4.3 illustrates the right-sided post-processor. Unlike the symmetric post-processor (cf. Figure 4.2), it can be applied anywhere close to the right boundary of the domain, without requiring information outside the spatial domain. This is because the support of the right-sided kernel is located entirely on the right side of the origin (although the support of K((x̄ − ·)/h) is located on the left side of the evaluation point x̄, as illustrated). This follows from (4.5), (4.8), and the fact that the support of the B-spline ψ^{(p+1)} is contained in the interval \([-\frac{p+1}{2}, \frac{p+1}{2}]\).

Although the right-sided post-processor is suitable near the right boundary, it is not suitable near the left boundary, as it would require information outside of the spatial domain in this region (this is similar to the symmetric case). To be able to post-process near the left boundary, we can reverse the strategy above and shift the kernel nodes to the other (left) side of the origin. The resulting left-sided post-processor is applicable near the left boundary, but not near the right. Although neither of the one-sided post-processors can be applied in the entire domain, we can still cover the entire domain by combining the previous kernel types: in the interior, we can use the symmetric kernel; at the right boundary, the right-sided kernel; at the left boundary, the left-sided kernel; and in the transition regions, kernels that are between a symmetric and a one-sided kernel (specified below).
Altogether, we obtain the following post-processor, as proposed in [64]:

Definition 4.3.2 ((Combined) one-sided post-processor) Consider a DG approximation u_h at some final simulation time on the interval [a, b], using mesh elements of size h and polynomials of degree p, as discussed in Section 4.2.1. Let x̄ denote an evaluation point in (a, b). Depending on this evaluation point, define the following nodes:

\[
x_j = -p + j + \lambda(\bar x), \qquad \text{for all } j = 0, ..., 2p, \tag{4.9}
\]

\[
\lambda(\bar x) = \begin{cases} \min\big\{0,\; -\big\lceil \tfrac{3p+1}{2} \big\rceil + \big\lfloor \tfrac{\bar x - a}{h} \big\rfloor\big\}, & \text{for } \bar x \in [a, \tfrac{a+b}{2}), \\[4pt] \max\big\{0,\; \big\lceil \tfrac{3p+1}{2} \big\rceil + \big\lceil \tfrac{\bar x - b}{h} \big\rceil\big\}, & \text{for } \bar x \in [\tfrac{a+b}{2}, b]. \end{cases} \tag{4.10}
\]

Let K be the result of substituting these nodes into (4.5) and (4.6)². Then, the result of post-processing u_h in x̄ is obtained by computing the convolution with the scaled kernel K as in (4.7).

² As for the symmetric kernel, existence and uniqueness of the kernel coefficients defined by (4.6) using shifted nodes can be shown following [10, Lemma 8.1].

[Figure 4.4. Illustration of the shift function λ(x̄) in (4.10). This function selects the appropriate kernel, depending on the evaluation point x̄: left-sided (λ = −⌈(3p+1)/2⌉) near a, symmetric (λ = 0) in the interior, right-sided (λ = ⌈(3p+1)/2⌉) near b.]

The shift function λ in (4.10) is designed so that the post-processor above can be applied in the entire domain, even for non-periodic boundary conditions. An illustration of λ is given in Figure 4.4. Different values of λ result in different kernels, so the kernel has become dependent on the evaluation point x̄ through λ: the case λ = 0 corresponds to the symmetric kernel. Similarly, for λ = −⌈(3p+1)/2⌉ and λ = ⌈(3p+1)/2⌉, the left- and right-sided kernels are obtained. Observe that the symmetric kernel is applied whenever possible. The reason for this is that the symmetric post-processor yields better accuracy than the non-symmetric variants. The latter has been observed in [64, p. 298] and will also be demonstrated in Section 4.5.
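The shift function (4.10) is straightforward to implement directly; a minimal sketch (the function name is illustrative), showing the three regimes for p = 1 on [0, 1] with 10 elements, where ⌈(3p+1)/2⌉ = 2:

```python
import math

def shift(x, a, b, h, p):
    """Shift function lambda(x) from (4.10): integer kernel-node offset
    for an evaluation point x in [a, b] (mesh width h, polynomial degree p)."""
    s = math.ceil((3 * p + 1) / 2)
    if x < (a + b) / 2:
        # left half: clamp to 0 once x is far enough from a
        return min(0, -s + math.floor((x - a) / h))
    # right half: clamp to 0 once x is far enough from b
    return max(0, s + math.ceil((x - b) / h))

a, b, h, p = 0.0, 1.0, 0.1, 1
print(shift(0.05, a, b, h, p))   # -2: fully left-sided kernel near a
print(shift(0.5, a, b, h, p))    #  0: symmetric kernel in the interior
print(shift(0.95, a, b, h, p))   #  2: fully right-sided kernel near b
```

Note that the floor/ceiling operators make λ jump by integer steps between elements, which is exactly the discontinuity that the generalized shift function of Section 4.4.1 removes.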
Finally, note that we can now post-process near shocks by treating them as a domain boundary. This is established by subdividing the spatial domain into subdomains such that the shocks occur at the boundaries of these subdomains. After that, each subdomain can be post-processed individually as prescribed by Definition 4.3.2. Unlike the symmetric post-processor, which would render the shock p − 1 times continuously differentiable, this strategy does not remove the discontinuous nature of the shock. In that sense, the underlying physics is better represented.

4.4 Position-dependent post-processor

The (combined) one-sided post-processor can be applied in the entire domain, including near boundaries and shocks. To enhance the smoothness and accuracy of this technique, in this section, we propose the position-dependent post-processor. Section 4.4.1 generalizes the original one-sided post-processor by relaxing both the number and the position of the kernel nodes. Section 4.4.2 uses this generalization to introduce the position-dependent post-processor. Section 4.4.3 describes the application of a post-processor for two-dimensional problems.

4.4.1 Generalized post-processor

To improve the one-sided post-processor, the first step is to generalize it by relaxing both the number and the position of the kernel nodes. More specifically, we drop the convention that the kernel must be based on 2p + 1 kernel nodes, and use r + 1 B-splines instead (where r is any positive integer at this point). Furthermore, we allow the kernel nodes to take real values, rather than restricting them to integers as in the original approach. This makes it possible to render the shift function, and thus the overall technique, smooth.
Altogether, the generalized post-processor is obtained by considering the original post-processor in Definition 4.3.2, but then using r + 1 kernel nodes and removing the rounding operators in the shift function (4.10):

Definition 4.4.1 (Generalized post-processor) Consider a DG approximation u_h at some final simulation time on the interval [a, b], using mesh elements of size h and polynomials of degree p, as discussed in Section 4.2.1. Let x̄ denote an evaluation point in (a, b). Depending on this evaluation point, define the following r + 1 kernel nodes:

\[
x_j = -\frac{r}{2} + j + \lambda(\bar x), \qquad \text{for all } j = 0, ..., r, \tag{4.11}
\]

\[
\lambda(\bar x) = \begin{cases} \min\big\{0,\; -\tfrac{r+p+1}{2} + \tfrac{\bar x - a}{h}\big\}, & \text{for } \bar x \in [a, \tfrac{a+b}{2}), \\[4pt] \max\big\{0,\; \tfrac{r+p+1}{2} + \tfrac{\bar x - b}{h}\big\}, & \text{for } \bar x \in [\tfrac{a+b}{2}, b]. \end{cases} \tag{4.12}
\]

Let K be the result of substituting these nodes into (4.5) and (4.6) (using r rather than 2p in the upper bound of the summation). Then, the result of post-processing u_h in x̄ is obtained by computing the convolution with the scaled kernel K as in (4.7).

[Figure 4.5. Illustration of the modified shift function λ(x̄) in (4.12): left-sided (λ = −(r+p+1)/2) near a, symmetric (λ = 0) in the interior, right-sided (λ = (r+p+1)/2) near b. Unlike the original shift function (cf. Figure 4.4), it no longer contains discontinuities.]

The modified shift function in (4.12) is illustrated in Figure 4.5. Compared to the former shift function (cf. Figure 4.4), there are two main advantages. The first is that its values are closer to zero. As a consequence, the resulting kernels are ‘more symmetric’ and thus more accurate (cf. Section 4.3.3). The second advantage is that the modified shift function is now continuous everywhere. Because the filtered DG solution is as smooth as the shift function (but typically not smoother than the B-splines), the generalized post-processor yields a better and more realistic smoothness than before.
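The continuous shift function (4.12) differs from (4.10) only in dropping the rounding operators; a minimal sketch (the function name is illustrative), for p = 1, r = 2p = 2 on [0, 1] with 20 elements, where (r + p + 1)/2 = 2:

```python
def shift_smooth(x, a, b, h, p, r):
    """Continuous shift function (4.12) of the generalized post-processor:
    as (4.10), but real-valued, i.e. without floor/ceiling operators."""
    s = (r + p + 1) / 2
    if x < (a + b) / 2:
        return min(0.0, -s + (x - a) / h)   # ramps linearly from -s at a up to 0
    return max(0.0, s + (x - b) / h)        # ramps linearly from 0 up to s at b

a, b, h, p, r = 0.0, 1.0, 0.05, 1, 2
print(shift_smooth(1e-12, a, b, h, p, r))   # ~ -2: (almost) fully left-sided at a
print(shift_smooth(0.05, a, b, h, p, r))    # -1.0: halfway through the transition
print(shift_smooth(0.5, a, b, h, p, r))     #  0.0: symmetric in the interior
```

The output varies linearly with x̄ in the transition regions, so the selected kernel deforms continuously from one-sided to symmetric instead of jumping between elements.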
We remark that the smoothness could be enhanced further by designing a shift function that has the same smoothness as the B-splines throughout the entire domain (note that the present version is not differentiable in two locations). However, such an alternative shift function would be less close to zero, resulting in kernels that are ‘less symmetric’ and thus less accurate. It will be shown in Chapter 5 that, similar to the symmetric post-processor, the generalized post-processor can extract DG superconvergence of order 2p + 1.

4.4.2 Position-dependent post-processor

The generalized post-processor can be seen as a variant of the original one-sided post-processor that yields better smoothness for an arbitrary number of kernel nodes. To enhance the accuracy as well, we now combine two generalized post-processors to incorporate extra kernel nodes in the non-symmetric kernels near the boundary. The resulting technique is called the position-dependent post-processor, and aims to increase the accuracy near the boundary to that established by the symmetric kernel in the interior. To specify this method, we apply the generalized post-processor twice: once using 2p + 1 kernel nodes and once using 4p + 1 kernel nodes (this particular choice is motivated below). After that, we compute a ‘smooth’ convex combination of these two results to select the extra kernel nodes near the boundary only, while maintaining the enhanced smoothness. Altogether, we arrive at the following definition:

Definition 4.4.2 (Position-dependent post-processor) Consider a DG approximation u_h at some final simulation time on the interval [a, b], using mesh elements of size h, and polynomials of degree p, as discussed in Section 4.2.1. Let x̄ denote an evaluation point in (a, b).
For this evaluation point, suppose that u⋆_{h,2p+1}(x̄) and u⋆_{h,4p+1}(x̄) are the result of applying the generalized post-processor (Definition 4.4.1) using 2p + 1 and 4p + 1 kernel nodes respectively. Furthermore, let θ : [a, b] → [0, 1] be a (‘smooth’) function that is equal to 1 in the interior and equal to 0 near the boundary. Then, we set

\[
u_h^\star(\bar x) = \theta(\bar x)\, u_{h,2p+1}^\star(\bar x) + \big(1 - \theta(\bar x)\big)\, u_{h,4p+1}^\star(\bar x).
\]

[Figure 4.6. Illustration of the coefficient function θ(x̄), which smoothly selects extra kernel nodes (4p + 1 near the boundary, 2p + 1 in the interior) for higher accuracy.]

The purpose of the coefficient function θ is to select the appropriate number of kernel nodes, depending on the evaluation point. An illustration of a specific choice for θ is given in Figure 4.6. Near the boundary, we prefer to use extra nodes to enhance the accuracy of the non-symmetric kernels. Hence, θ = 0 in this region³, to select u⋆_{h,4p+1} rather than u⋆_{h,2p+1}. In the interior, the symmetric kernel already provides the desired accuracy without the additional costs of extra B-splines (costs are discussed further below). Hence, θ = 1 in this region, to select u⋆_{h,2p+1} rather than u⋆_{h,4p+1}. Between the boundary regions and the interior, there are two transition regions, where θ varies smoothly between 0 and 1 to avoid introducing artificial discontinuities⁴.

It will be shown in Chapter 5 that, similar to the generalized post-processor, the position-dependent post-processor can extract DG superconvergence of order 2p + 1. There are several remarks to be made with respect to the position-dependent post-processor:

• The symmetric and position-dependent post-processor are based on the same building blocks.
For this reason, we speculate that suitable application types for the symmetric post-processor (away from non-periodic boundaries and shocks) are also suitable for the position-dependent post-processor (but now in the entire domain). This would include the unstructured and non-linear problems mentioned at the end of Section 4.3.2, but verification of the latter is left for future research. One- and two-dimensional linear problems on uniform meshes will be studied in Section 4.5.

• It is not needed to compute both u⋆_{h,2p+1} and u⋆_{h,4p+1} in the entire domain. This is only required in the two small transition regions where θ ∈ (0, 1). Near the boundary, we only need to compute u⋆_{h,4p+1}. In the interior, we only require u⋆_{h,2p+1}.

• Similar to the (combined) one-sided post-processor, we can apply the position-dependent post-processor near shocks, as discussed in Section 4.3.3.

• The use of extra kernel nodes increases the kernel support, and thus the computational costs. Using r + 1 B-splines of order p + 1, the convolution with the kernel can be implemented using small matrix-vector multiplications. These matrices have size (r + 1) × (p + 1) and contain the inner products of the B-splines with the DG basis functions. The vectors (of length p + 1) contain the DG modes for a mesh element in the kernel support. Altogether, to compute u⋆_{h,r+1} in an evaluation point, r + p + 2 such matrix-vector multiplications are required, plus the summation of all elements in the resulting vectors. We stress that the post-processor is applied only once, at the final simulation time.

³ In this example, the boundary region where θ = 0 is (3p+1)/2 mesh elements wide, i.e. the smallest region where the symmetric kernel with 2p + 1 nodes cannot be applied.
⁴ In this example, the transition regions are two mesh elements wide. In these regions, θ is a polynomial of degree 2p + 1 such that θ is p + 1 times differentiable.
To keep the computational time at a minimum, extra kernel nodes should only be applied where necessary. For a more efficient inexact implementation, see [49, 47, 50].

• Instead of 4p + 1, we could use any number of kernel nodes near the boundary (provided that the convolution remains well-defined). For example, we also ran tests using 3p + 1 and 5p + 1 kernel nodes. As expected, we found that a larger number of kernel nodes leads to more accuracy. However, the difference between using 5p + 1 nodes and 4p + 1 nodes was relatively small. For 3p + 1 nodes, the errors were not improved over the unfiltered errors for the coarser meshes. Based on these experiments, using 4p + 1 kernel nodes is a natural choice for enhancing the errors where necessary, without increasing the kernel support and the computational costs too much.

• For coarse meshes, the use of extra kernel nodes can increase the support of K((x̄ − ·)/h) such that it is no longer contained in the domain [a, b] (no matter how you shift it). This is the case if (r + p + 1)h becomes larger than b − a. To be able to apply the post-processor anyway, a possible solution is to scale the kernel by the smaller scaling

\[
H = \frac{b - a}{r + p + 1} < h \tag{4.13}
\]

rather than h in (4.7). This shrinks the kernel support appropriately. It will be demonstrated in Section 4.5 that the use of a smaller scaling for the coarser meshes can reduce the accuracy-enhancing quality of the post-processor. For this reason, the alternative scaling should only be applied when necessary.

4.4.3 Post-processing in two dimensions

All post-processing techniques discussed in this chapter can also be applied for higher-dimensional problems using tensor products. In this section, we illustrate this for a square domain Ω = [a₁, b₁] × [a₂, b₂] (for higher-dimensional problems, cf. Section 5.2.1). To this end, consider an evaluation point (x̄₁, x̄₂) ∈ (a₁, b₁) × (a₂, b₂).
Let K₁ denote the one-dimensional kernel for the evaluation point x̄₁ ∈ (a₁, b₁), corresponding to any of the post-processors discussed in the previous sections: symmetric, (combined) one-sided, generalized, or position-dependent⁵. Similarly, let K₂ denote such a kernel for x̄₂ ∈ (a₂, b₂). Note that K₁ and K₂ are typically not the same kernel, as they are based on different evaluation points. Using this notation, a DG approximation u_h on the square domain Ω (cf. Section 4.2.2) can now be post-processed by computing the convolution against the tensor product of the two scaled kernels:

\[
u_h^\star(\bar x_1, \bar x_2) = \frac{1}{h^2} \int_{a_1}^{b_1} \int_{a_2}^{b_2} K_1\!\left(\frac{\bar x_1 - x_1}{h}\right) K_2\!\left(\frac{\bar x_2 - x_2}{h}\right) u_h(x_1, x_2)\, dx_2\, dx_1.
\]

If u_h is vector-valued, then this strategy can be applied to each individual component.

⁵ For the position-dependent post-processor, this means we take the convex combination of the two generalized kernels involved, i.e. θ(x̄₁)K_{2p+1} + (1 − θ(x̄₁))K_{4p+1}.

4.5 Numerical Results

The previous section proposed a position-dependent post-processor to improve on the accuracy of the original (combined) one-sided post-processor. In this section, we compare both techniques by means of numerical experiments (for a theoretical study, cf. Chapter 5). In particular, we consider the L²-projection of a sine function (Section 4.5.1); four one-dimensional hyperbolic problems, with periodic boundary conditions (Section 4.5.2), Dirichlet boundary conditions (Section 4.5.3), variable coefficients (Section 4.5.4), and two stationary shocks (Section 4.5.5); a two-dimensional system (Section 4.5.6); and the two-dimensional test cases studied in [73], including streamline visualizations (Section 4.5.7). In our implementation, the DG approximations are based on first-order upwind fluxes, monomial basis functions, and uniform meshes, as discussed in Section 4.2.
For the time-discretization, we use a third-order SSP-RK scheme [36, 37] with a sufficiently small time step to ensure that the time-stepping errors do not dominate the spatial errors. We apply both the ‘old’ and the ‘new’ post-processor at the final time T = 12.5, as specified in Definition 4.3.2 and Definition 4.4.2 respectively. Convolutions of the form (4.7) are computed exactly using Gaussian quadrature [64]. Finally, we make use of the ARPREC multi-precision package to reduce round-off errors appropriately [8].

4.5.1 L²-Projection

The first test case is the L²-projection of u(x) = sin(x) onto the space of piecewise polynomials of degree p = 1, 2, 3 on the domain [0, 2π]. This test case can also be interpreted as a DG approximation at the initial time. It is the most elementary case that we can use to test the reliability of our filter. Table 4.1 demonstrates that both post-processors enhance the convergence rate from O(h^{p+1}) to O(h^{2p+1}). However, both the orders and the magnitude of the L²- and L∞-errors are better for the new post-processor. More importantly, it yields an improvement over the unfiltered errors in both norms, even for coarse meshes. The plots (for p = 2) illustrate that both post-processors produce identical results in the interior of the domain, where they apply the same symmetric kernel with 2p + 1 nodes. The differences occur at the boundary: the new post-processor yields significantly better results without introducing a spurious stair-stepping effect. This can be explained by the use of extra kernel nodes and the continuity of the new shift function (cf. Section 4.4). As a result, unlike before, the new post-processor enhances the accuracy of the DG approximation in the entire domain.
[Pointwise-error plots over the spatial domain [0, 2π] for 20, 40, 80, and 160 elements: before post-processing, after post-processing (old), and after post-processing (new).]

Table 4.1. L²-projection. L²- and L∞-errors and convergence orders before post-processing, after the old post-processor, and after the new post-processor.

Polynomial degree p = 1
mesh | Before: L² (order), L∞ (order) | Old: L² (order), L∞ (order) | New: L² (order), L∞ (order)
20   | 6.51e-03 (–), 5.95e-03 (–)       | 1.60e-02 (–), 2.21e-02 (–)       | 4.88e-04 (–), 1.26e-03 (–)
40   | 1.63e-03 (2.00), 1.50e-03 (1.99) | 1.71e-03 (3.22), 3.11e-03 (2.83) | 1.90e-05 (4.68), 5.35e-05 (4.56)
80   | 4.07e-04 (2.00), 3.76e-04 (2.00) | 1.58e-04 (3.43), 4.00e-04 (2.96) | 9.02e-07 (4.40), 1.79e-06 (4.90)
160  | 1.02e-04 (2.00), 9.40e-05 (2.00) | 1.42e-05 (3.48), 5.03e-05 (2.99) | 5.33e-08 (4.08), 5.69e-08 (4.98)

Polynomial degree p = 2
20   | 1.73e-04 (–), 1.28e-04 (–)       | 3.95e-03 (–), 6.68e-03 (–)       | 4.19e-06 (–), 3.14e-06 (–)
40   | 2.16e-05 (3.00), 1.61e-05 (2.99) | 2.11e-04 (4.23), 3.92e-04 (4.09) | 8.69e-08 (5.59), 6.71e-08 (5.55)
80   | 2.70e-06 (3.00), 2.02e-06 (3.00) | 5.47e-06 (5.27), 1.39e-05 (4.82) | 1.38e-09 (5.97), 7.87e-10 (6.41)
160  | 3.38e-07 (3.00), 2.53e-07 (3.00) | 1.26e-07 (5.45), 4.46e-07 (4.96) | 2.17e-11 (5.99), 1.23e-11 (6.00)

Polynomial degree p = 3
20   | 3.42e-06 (–), 2.15e-06 (–)       | 1.06e-04 (–), 2.26e-04 (–)       | 3.75e-07 (–), 9.84e-07 (–)
40   | 2.14e-07 (4.00), 1.35e-07 (3.99) | 4.71e-06 (4.49), 8.96e-06 (4.66) | 6.30e-10 (9.22), 3.89e-10 (11.31)
80   | 1.34e-08 (4.00), 8.49e-09 (4.00) | 3.41e-08 (7.11), 8.72e-08 (6.68) | 2.67e-12 (7.88), 1.53e-12 (7.99)
160  | 8.36e-10 (4.00), 5.31e-10 (4.00) | 2.00e-10 (7.41), 7.16e-10 (6.93) | 1.06e-14 (7.98), 5.97e-15 (8.00)

Isolating the boundary effects

To isolate the effect of extra kernel nodes at the boundary, we revisit our previous test case and apply the left-sided post-processor in the entire domain for 2p + 1 (‘old’) and 4p + 1 (‘new’) kernel nodes (using a periodic extension of u_h when needed). In other words, the only difference between the two post-processors is now the number of kernel nodes. Table 4.2 demonstrates that the use of 4p + 1 kernel nodes leads to better accuracy of O(h^{4p+1}).
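The orders reported in these tables follow the usual convention for successive mesh refinements by a factor 2; a minimal sketch (the function name is illustrative), applied to the unfiltered p = 1 L²-errors of Table 4.1:

```python
import math

def observed_orders(errors, refinement=2.0):
    """Observed convergence orders between successive mesh refinements:
    order = log(e_coarse / e_fine) / log(refinement)."""
    return [math.log(coarse / fine) / math.log(refinement)
            for coarse, fine in zip(errors, errors[1:])]

# L2-errors before post-processing for p = 1 (Table 4.1, meshes 20..160)
errors = [6.51e-03, 1.63e-03, 4.07e-04, 1.02e-04]
print([round(o, 2) for o in observed_orders(errors)])   # -> [2.0, 2.0, 2.0]
```

This confirms the expected O(h^{p+1}) = O(h²) rate of the unfiltered DG approximation for p = 1.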
The same phenomenon occurs at the boundary in Table 4.1, which explains why the new post-processor improves the errors in that region. We stress that the accuracy of O(h^{4p+1}) cannot be expected in general. Based on the analysis in Section 5.5.3 later on, the theoretical error takes the form C₁h^{4p+1} + C₂h^{2p+1}, where the order in the first term is equal to the number of kernel nodes (and C₁, C₂ > 0 are constant with respect to h). We speculate that the meshes in this test case are sufficiently coarse so that the first error term dominates the second. For finer meshes, it is likely that the second error term will start to dominate, so that the error then becomes O(h^{2p+1}). We will encounter such an example in the next section.

[Pointwise-error plots over the spatial domain [0, 2π] for 20, 40, 80, and 160 elements: before post-processing, after post-processing (old), and after post-processing (new).]

Table 4.2. L²-projection: isolating boundary effects.

Polynomial degree p = 1
mesh | Before: L² (order), L∞ (order) | Old: L² (order), L∞ (order) | New: L² (order), L∞ (order)
20   | 6.51e-03 (–), 5.95e-03 (–)       | 4.50e-02 (–), 2.54e-02 (–)       | 3.72e-03 (–), 2.10e-03 (–)
40   | 1.63e-03 (2.00), 1.50e-03 (1.99) | 5.70e-03 (2.98), 3.22e-03 (2.98) | 1.18e-04 (4.97), 6.71e-05 (4.97)
80   | 4.07e-04 (2.00), 3.76e-04 (2.00) | 7.15e-04 (3.00), 4.03e-04 (3.00) | 3.72e-06 (4.99), 2.11e-06 (4.99)
160  | 1.02e-04 (2.00), 9.40e-05 (2.00) | 8.94e-05 (3.00), 5.05e-05 (3.00) | 1.17e-07 (5.00), 6.62e-08 (4.99)

Polynomial degree p = 2
20   | 1.73e-04 (–), 1.28e-04 (–)       | 1.03e-02 (–), 5.79e-03 (–)       | 9.52e-05 (–), 5.37e-05 (–)
40   | 2.16e-05 (3.00), 1.61e-05 (2.99) | 3.29e-04 (4.97), 1.85e-04 (4.96) | 1.93e-07 (8.95), 1.09e-07 (8.95)
80   | 2.70e-06 (3.00), 2.02e-06 (3.00) | 1.03e-05 (4.99), 5.83e-06 (4.99) | 3.81e-10 (8.98), 2.16e-10 (8.97)
160  | 3.38e-07 (3.00), 2.53e-07 (3.00) | 3.23e-07 (5.00), 1.82e-07 (5.00) | 7.61e-13 (8.97), 4.42e-13 (8.93)

Polynomial degree p = 3
20   | 3.42e-06 (–), 2.15e-06 (–)       | 2.58e-03 (–), 1.46e-03 (–)       | 2.82e-06 (–), 1.59e-06 (–)
40   | 2.14e-07 (4.00), 1.35e-07 (3.99) | 2.09e-05 (6.95), 1.18e-05 (6.95) | 3.62e-10 (12.93), 2.04e-10 (12.93)
80   | 1.34e-08 (4.00), 8.49e-09 (4.00) | 1.65e-07 (6.99), 9.29e-08 (6.99) | 4.47e-14 (12.98), 2.53e-14 (12.98)
160  | 8.36e-10 (4.00), 5.31e-10 (4.00) | 1.29e-09 (7.00), 7.28e-10 (7.00) | 5.51e-18 (12.99), 3.21e-18 (12.94)

4.5.2 Constant coefficients

We now consider a one-dimensional linear hyperbolic equation with constant coefficients and periodic boundary conditions:

\[
u_t + u_x = 0, \qquad x \in [0, 2\pi], \quad t \in [0, 12.5].
\]

The initial condition is chosen such that the exact solution reads u(x, t) = sin(x − t). For t = 0, this test case is equivalent to the one discussed in the previous section. Here, we consider the final time t = 12.5. This test case is more challenging than the previous one because u_h now contains information about the physics of the PDE and the numerics of the DG method. Nevertheless, Table 4.3 demonstrates that the results are similar to those for the L²-projection: we are able to obtain better errors than both the DG solution and the old post-processor. In fact, the magnitude of the errors is improved throughout the entire domain when the new position-dependent post-processor is applied to the DG approximation. Furthermore, the convergence rate is improved from O(h^{p+1}) to O(h^{2p+1}).

[Pointwise-error plots over the spatial domain [0, 2π] for 20, 40, 80, and 160 elements: before post-processing, after post-processing (old), and after post-processing (new).]
Table 4.3. Constant coefficients.

Polynomial degree p = 1
mesh | Before: L² (order), L∞ (order) | Old: L² (order), L∞ (order) | New: L² (order), L∞ (order)
20   | 1.41e-02 (–), 1.02e-02 (–)       | 1.89e-02 (–), 2.21e-02 (–)       | 9.60e-03 (–), 5.44e-03 (–)
40   | 2.91e-03 (2.28), 2.69e-03 (1.92) | 2.10e-03 (3.17), 3.12e-03 (2.82) | 1.20e-03 (3.00), 6.78e-04 (3.00)
80   | 6.81e-04 (2.09), 7.57e-04 (1.83) | 2.18e-04 (3.27), 4.03e-04 (2.95) | 1.50e-04 (3.00), 8.45e-05 (3.00)
160  | 1.67e-04 (2.03), 2.00e-04 (1.92) | 2.34e-05 (3.22), 5.10e-05 (2.98) | 1.87e-05 (3.00), 1.05e-05 (3.00)

Polynomial degree p = 2
20   | 2.68e-04 (–), 3.18e-04 (–)       | 4.00e-03 (–), 7.50e-03 (–)       | 1.30e-05 (–), 8.41e-06 (–)
40   | 3.35e-05 (3.00), 3.98e-05 (3.00) | 2.11e-04 (4.25), 4.07e-04 (4.20) | 3.77e-07 (5.11), 2.16e-07 (5.28)
80   | 4.19e-06 (3.00), 4.97e-06 (3.00) | 5.46e-06 (5.27), 1.41e-05 (4.85) | 1.06e-08 (5.16), 5.97e-09 (5.18)
160  | 5.24e-07 (3.00), 6.22e-07 (3.00) | 1.25e-07 (5.45), 4.49e-07 (4.97) | 3.09e-10 (5.10), 1.74e-10 (5.10)

Polynomial degree p = 3
20   | 5.18e-06 (–), 4.40e-06 (–)       | 1.30e-04 (–), 3.21e-04 (–)       | 3.76e-07 (–), 1.05e-06 (–)
40   | 3.24e-07 (4.00), 2.76e-07 (4.00) | 4.71e-06 (4.79), 9.45e-06 (5.09) | 6.63e-10 (9.15), 4.09e-10 (11.32)
80   | 2.02e-08 (4.00), 1.72e-08 (4.00) | 3.41e-08 (7.11), 8.91e-08 (6.73) | 2.96e-12 (7.81), 1.69e-12 (7.92)
160  | 1.26e-09 (4.00), 1.08e-09 (4.00) | 2.00e-10 (7.41), 7.23e-10 (6.95) | 1.29e-14 (7.84), 7.28e-15 (7.86)

Isolating boundary effects

Similar to Section 4.5.1, we isolate the effect of extra kernel nodes near the boundary by repeating the experiment while applying the left-sided post-processor throughout the entire domain. The results are displayed in Table 4.4. As before (cf. Table 4.2), the extra nodes yield faster convergence and smaller errors. This explains the higher accuracy of the position-dependent post-processor near the boundary in Table 4.3. Unlike before, the convergence rate is typically lower than O(h^{4p+1}), and it drops to O(h^{2p+1}) for finer meshes.
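The dominance switch between the two error components can be made concrete with a toy model of the form C₁h^{4p+1} + C₂h^{2p+1}; the constants below are arbitrary illustrative values, not fitted to the experiments:

```python
import math

def model_order(h, c1, c2, p):
    """Observed order of the model error c1*h^(4p+1) + c2*h^(2p+1)
    between meshes of width h and h/2 (illustrative constants only)."""
    err = lambda s: c1 * s ** (4 * p + 1) + c2 * s ** (2 * p + 1)
    return math.log(err(h) / err(h / 2)) / math.log(2.0)

p, c1, c2 = 1, 1.0, 1e-3
for h in [1 / 10, 1 / 40, 1 / 160, 1 / 640]:
    # orders decrease from near 4p+1 = 5 towards 2p+1 = 3 as h shrinks
    print(f"h = {h:.5f}: order = {model_order(h, c1, c2, p):.2f}")
```

On coarse meshes the h^{4p+1} term dominates and the observed order is close to 4p + 1; once h is small enough, the h^{2p+1} term takes over and the order settles at 2p + 1, matching the behaviour described above.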
This change in the convergence rate is in line with our earlier speculation that the lower-order component of the error, which is O(h^{2p+1}), starts to dominate the higher-order component for sufficiently fine meshes.

[Pointwise-error plots over the spatial domain [0, 2π] for 20, 40, 80, and 160 elements: before post-processing, after post-processing (old), and after post-processing (new).]

Table 4.4. Constant coefficients: isolating boundary effects.

Polynomial degree p = 1
mesh | Before: L² (order), L∞ (order) | Old: L² (order), L∞ (order) | New: L² (order), L∞ (order)
20   | 1.41e-02 (–), 1.02e-02 (–)       | 4.19e-02 (–), 2.36e-02 (–)       | 1.22e-02 (–), 6.87e-03 (–)
40   | 2.91e-03 (2.28), 2.69e-03 (1.92) | 5.57e-03 (2.91), 3.14e-03 (2.91) | 1.24e-03 (3.30), 6.98e-04 (3.30)
80   | 6.81e-04 (2.09), 7.57e-04 (1.83) | 7.15e-04 (2.96), 4.03e-04 (2.96) | 1.50e-04 (3.05), 8.45e-05 (3.05)
160  | 1.67e-04 (2.03), 2.00e-04 (1.92) | 9.04e-05 (2.98), 5.10e-05 (2.98) | 1.86e-05 (3.01), 1.05e-05 (3.01)

Polynomial degree p = 2
20   | 2.68e-04 (–), 3.18e-04 (–)       | 1.03e-02 (–), 5.79e-03 (–)       | 1.05e-04 (–), 5.90e-05 (–)
40   | 3.35e-05 (3.00), 3.98e-05 (3.00) | 3.29e-04 (4.97), 1.85e-04 (4.97) | 4.49e-07 (7.86), 2.53e-07 (7.86)
80   | 4.19e-06 (3.00), 4.97e-06 (3.00) | 1.03e-05 (4.99), 5.83e-06 (4.99) | 9.34e-09 (5.59), 5.27e-09 (5.59)
160  | 5.24e-07 (3.00), 6.22e-07 (3.00) | 3.23e-07 (5.00), 1.82e-07 (5.00) | 2.88e-10 (5.02), 1.62e-10 (5.02)

Polynomial degree p = 3
20   | 5.18e-06 (–), 4.40e-06 (–)       | 2.58e-03 (–), 1.46e-03 (–)       | 2.82e-06 (–), 1.59e-06 (–)
40   | 3.24e-07 (4.00), 2.76e-07 (4.00) | 2.09e-05 (6.95), 1.18e-05 (6.95) | 3.96e-10 (12.80), 2.24e-10 (12.80)
80   | 2.02e-08 (4.00), 1.72e-08 (4.00) | 1.65e-07 (6.99), 9.29e-08 (6.99) | 3.16e-13 (10.29), 1.78e-13 (10.29)
160  | 1.26e-09 (4.00), 1.08e-09 (4.00) | 1.29e-09 (7.00), 7.28e-10 (7.00) | 2.32e-15 (7.09), 1.31e-15 (7.09)

4.5.3 Dirichlet BCs

The previous section discussed a test case with periodic boundary conditions.
For those problems, it is actually not necessary to use a one-sided approach near the boundary: the more accurate symmetric post-processor could be applied by using a periodic extension of the DG solution. However, in most real-life applications, the boundary conditions are not periodic. For this reason, we revisit the test case of the previous section, but now using Dirichlet boundary conditions. That is,

\[
u_t + u_x = 0, \qquad u(0, t) = \sin(-t), \qquad x \in [0, 2\pi], \quad t \in [0, 12.5].
\]

The initial condition is chosen such that the exact solution reads u(x, t) = sin(x − t). The results are displayed in Table 4.5: similar to the periodic case, we observe that the convergence rate is improved from O(h^{p+1}) to O(h^{2p+1}). Furthermore, the smoothness and the accuracy are improved in the entire domain, including the boundary.

[Pointwise-error plots over the spatial domain [0, 2π] for 20, 40, 80, and 160 elements: before post-processing, after post-processing (old), and after post-processing (new).]

Table 4.5. Dirichlet BCs.

Polynomial degree p = 1
mesh | Before: L² (order), L∞ (order) | Old: L² (order), L∞ (order) | New: L² (order), L∞ (order)
20   | 1.10e-02 (–), 1.29e-02 (–)       | 1.63e-02 (–), 2.33e-02 (–)       | 3.37e-03 (–), 2.41e-03 (–)
40   | 2.68e-03 (2.03), 3.29e-03 (1.97) | 1.76e-03 (3.22), 3.23e-03 (2.85) | 4.14e-04 (3.02), 3.00e-04 (3.01)
80   | 6.67e-04 (2.01), 8.32e-04 (1.98) | 1.66e-04 (3.40), 4.13e-04 (2.97) | 5.13e-05 (3.01), 3.72e-05 (3.01)
160  | 1.66e-04 (2.00), 2.09e-04 (1.99) | 1.55e-05 (3.42), 5.19e-05 (2.99) | 6.39e-06 (3.01), 4.63e-06 (3.01)

Polynomial degree p = 2
20   | 2.68e-04 (–), 3.17e-04 (–)       | 4.00e-03 (–), 7.50e-03 (–)       | 6.98e-06 (–), 5.23e-06 (–)
40   | 3.35e-05 (3.00), 3.98e-05 (2.99) | 2.11e-04 (4.25), 4.07e-04 (4.20) | 1.84e-07 (5.25), 1.24e-07 (5.40)
80   | 4.19e-06 (3.00), 4.97e-06 (3.00) | 5.46e-06 (5.27), 1.41e-05 (4.85) | 4.63e-09 (5.31), 3.16e-09 (5.29)
160  | 5.24e-07 (3.00), 6.22e-07 (3.00) | 1.25e-07 (5.45), 4.49e-07 (4.97) | 1.28e-10 (5.18), 8.83e-11 (5.16)

Polynomial degree p = 3
20   | 5.18e-06 (–), 4.40e-06 (–)       | 1.30e-04 (–), 3.21e-04 (–)       | 3.75e-07 (–), 1.05e-06 (–)
40   | 3.24e-07 (4.00), 2.76e-07 (4.00) | 4.71e-06 (4.79), 9.45e-06 (5.09) | 6.39e-10 (9.20), 3.97e-10 (11.37)
80   | 2.02e-08 (4.00), 1.72e-08 (4.00) | 3.41e-08 (7.11), 8.91e-08 (6.73) | 2.75e-12 (7.86), 1.59e-12 (7.96)
160  | 1.26e-09 (4.00), 1.08e-09 (4.00) | 2.00e-10 (7.41), 7.23e-10 (6.95) | 1.12e-14 (7.94), 6.48e-15 (7.94)

4.5.4 Variable coefficients

As a first step towards nonlinear problems, we now consider a linear problem with smoothly varying coefficients:

\[
u_t + (a u)_x = f, \qquad a(x, t) = 2 + \sin(x + t), \qquad x \in [0, 2\pi], \quad t \in [0, 12.5].
\]

The boundary conditions are periodic, and the initial condition and the forcing term f(x, t) are chosen such that the exact solution reads u(x, t) = sin(x − t). The results are displayed in Table 4.6. Similar to all of the previous test cases, we observe that the convergence rate is improved from O(h^{p+1}) to O(h^{2p+1}). Furthermore, the smoothness and the accuracy are improved in the entire domain, including the boundary.

[Pointwise-error plots over the spatial domain [0, 2π] for 20, 40, 80, and 160 elements: before post-processing, after post-processing (old), and after post-processing (new).]
[Figure: pointwise errors on the spatial domain [0, 6.2832] before post-processing, after post-processing (old), and after post-processing (new), for meshes with 20, 40, 80, and 160 elements.]

Table 4.6. Variable coefficients

| mesh | Before L2 (order) | Before L∞ (order) | After (old) L2 (order) | After (old) L∞ (order) | After (new) L2 (order) | After (new) L∞ (order) |
Polynomial degree p = 1:
| 20  | 1.09e-02 | 1.46e-02 | 1.63e-02 | 2.47e-02 | 2.75e-03 | 2.95e-03 |
| 40  | 2.68e-03 (2.03) | 3.53e-03 (2.05) | 1.75e-03 (3.22) | 3.38e-03 (2.87) | 3.48e-04 (2.98) | 2.77e-04 (3.41) |
| 80  | 6.66e-04 (2.01) | 8.62e-04 (2.03) | 1.64e-04 (3.41) | 4.31e-04 (2.97) | 4.38e-05 (2.99) | 2.99e-05 (3.21) |
| 160 | 1.66e-04 (2.00) | 2.13e-04 (2.02) | 1.52e-05 (3.44) | 5.40e-05 (3.00) | 5.50e-06 (3.00) | 3.63e-06 (3.04) |
Polynomial degree p = 2:
| 20  | 2.68e-04 | 3.31e-04 | 4.00e-03 | 7.50e-03 | 4.58e-06 | 4.67e-06 |
| 40  | 3.35e-05 (3.00) | 4.07e-05 (3.03) | 2.11e-04 (4.25) | 4.07e-04 (4.21) | 1.01e-07 (5.50) | 1.18e-07 (5.31) |
| 80  | 4.19e-06 (3.00) | 5.03e-06 (3.02) | 5.46e-06 (5.27) | 1.41e-05 (4.85) | 2.77e-09 (5.19) | 2.15e-09 (5.78) |
| 160 | 5.24e-07 (3.00) | 6.25e-07 (3.01) | 1.25e-07 (5.45) | 4.49e-07 (4.97) | 9.81e-11 (4.82) | 7.36e-11 (4.87) |
Polynomial degree p = 3:
| 20  | 5.17e-06 | 4.41e-06 | 1.30e-04 | 3.21e-04 | 1.11e-05 | 3.91e-05 |
| 40  | 3.23e-07 (4.00) | 2.76e-07 (4.00) | 4.71e-06 (4.79) | 9.45e-06 (5.09) | 6.63e-10 (14.03) | 1.12e-09 (15.09) |
| 80  | 2.02e-08 (4.00) | 1.73e-08 (4.00) | 3.41e-08 (7.11) | 8.91e-08 (6.73) | 2.65e-12 (7.97) | 1.53e-12 (9.51) |
| 160 | 1.26e-09 (4.00) | 1.08e-09 (4.00) | 2.00e-10 (7.41) | 7.23e-10 (6.95) | 1.06e-14 (7.97) | 7.13e-15 (7.75) |

4.5.5 Discontinuous coefficients

For all of the previous test cases, the exact solution is infinitely smooth. In Section 4.4.2, we emphasized that the position-dependent post-processor can also be applied near a shock, by treating it as a boundary. To test this numerically, we now consider a problem with discontinuous coefficients [64]:

    u_t + (a u)_x = 0,    a(x) = { 1/2,  x ∈ [−1/2, 1/2],
                                   1,    else, }
    x ∈ [−1, 1],  t ∈ [0, 12.5].

The boundary conditions are periodic, and the initial condition is chosen such that the exact solution has two stationary shocks:

    u(x, t) = { −2 cos(4π(x − t/2)),  x ∈ [−1/2, 1/2],
                cos(2π(x − t)),       else. }

The results are displayed in Table 4.7. The accuracy of the new post-processor is better than that of the DG solution as long as the mesh is sufficiently fine. For the coarser meshes, the lower accuracy arises because a kernel scale smaller than h is required to ensure that the support of the (scaled) kernel fits between two subsequent boundaries/shocks (cf. Section 4.4.2, (4.13)). Nevertheless, for p = 2 and p = 3, the magnitude of the errors is much smaller for the new post-processor than for the old one. For p = 1, this is not the case: the errors are slightly worse. We speculate that this is due to the fact that more extra kernel nodes are used for p = 2 than for p = 1. Improvement of this issue is left for future research.

[Figure: pointwise errors on the spatial domain [−1, 1] before post-processing, after post-processing (old), and after post-processing (new), for meshes with 20, 40, 80, and 160 elements.]

Table 4.7. Discontinuous coefficients

| mesh | Before L2 (order) | Before L∞ (order) | After (old) L2 (order) | After (old) L∞ (order) | After (new) L2 (order) | After (new) L∞ (order) |
Polynomial degree p = 1:
| 20  | 1.21e+00 | 1.56e+00 | 1.15e+00 | 1.54e+00 | 1.20e+00 | 1.62e+00 |
| 40  | 2.72e-01 (2.15) | 3.77e-01 (2.05) | 2.54e-01 (2.18) | 3.49e-01 (2.14) | 2.74e-01 (2.13) | 4.33e-01 (1.91) |
| 80  | 3.83e-02 (2.83) | 5.74e-02 (2.71) | 3.63e-02 (2.80) | 4.88e-02 (2.84) | 3.75e-02 (2.87) | 5.02e-02 (3.11) |
| 160 | 5.20e-03 (2.88) | 8.62e-03 (2.74) | 4.70e-03 (2.95) | 6.19e-03 (2.98) | 4.75e-03 (2.98) | 6.17e-03 (3.03) |
Polynomial degree p = 2:
| 20  | 3.65e-02 | 5.14e-02 | 6.81e+00 | 1.62e+01 | 5.71e-01 | 2.94e+00 |
| 40  | 2.05e-03 (4.15) | 4.84e-03 (3.41) | 1.67e-01 (5.35) | 6.78e-01 (4.58) | 1.25e-03 (8.84) | 1.83e-03 (10.66) |
| 80  | 2.17e-04 (3.24) | 6.27e-04 (2.95) | 6.03e-03 (4.79) | 2.79e-02 (4.60) | 4.16e-05 (4.91) | 1.40e-04 (3.71) |
| 160 | 2.68e-05 (3.02) | 7.94e-05 (2.98) | 8.41e-05 (6.16) | 5.79e-04 (5.59) | 1.18e-06 (5.14) | 1.69e-06 (6.37) |
Polynomial degree p = 3:
| 20  | 1.08e-03 | 2.45e-03 | 3.58e+00 | 1.25e+01 | 2.27e-01 | 6.61e-01 |
| 40  | 6.60e-05 (4.04) | 1.37e-04 (4.16) | 1.87e-02 (7.58) | 9.95e-02 (6.97) | 2.64e-03 (6.43) | 1.85e-02 (5.16) |
| 80  | 4.13e-06 (4.00) | 8.74e-06 (3.97) | 6.50e-04 (4.84) | 2.92e-03 (5.09) | 5.20e-06 (8.99) | 6.98e-05 (8.05) |
| 160 | 2.58e-07 (4.00) | 5.51e-07 (3.99) | 2.62e-06 (7.95) | 1.77e-05 (7.36) | 4.67e-09 (10.12) | 8.70e-08 (9.65) |

4.5.6 Two-dimensional system

In the previous sections, we considered several one-dimensional problems. However, higher-dimensional fields can also be filtered, as discussed in Section 4.4.3. In this section, we apply such a strategy for the following two-dimensional system:

    (u)     (−1  0) (u)     ( 0 −1) (u)     (0)
    (v)_t + ( 0  1) (v)_x + (−1  0) (v)_y = (0),    (x, y) ∈ [0, 2π]²,  t ∈ [0, 12.5].

The boundary conditions are periodic, and the initial condition is given by

    u₀(x, y) = (1/(2√2)) (sin(x + y) − cos(x + y)),
    v₀(x, y) = (1/(2√2)) ((√2 − 1) sin(x + y) + (1 + √2) cos(x + y)).

The results for this test case are displayed in Table 4.8 for a final time of t = 12.5. Similar to the one-dimensional problems, we observe that the convergence rate is improved from O(h^{p+1}) to O(h^{2p+1}). Furthermore, unlike the original filter, the position-dependent post-processor improves the DG errors in the entire domain, even for coarse meshes. In other words, the results for this two-dimensional problem are similar to those for the previous smooth one-dimensional cases.

4.5.7 Two-dimensional streamlines

Next, we study the two-dimensional tests of Steffen et al. [73], including streamline visualizations. In each test, a velocity profile (u, v) on the square [−1, 1]² is given. We consider the L2-projection of that solution onto the space of piecewise polynomials of degree p = 1, 2, 3 (as before, this is similar to a DG approximation at the initial time).
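The elementwise L2-projection used here can be sketched in one dimension with a Legendre basis and Gauss quadrature. This is a minimal illustration, not the thesis code; the function names are ours:

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legval

def l2_project(u, a, b, n_elements, p):
    """Per-element Legendre coefficients of the L2-projection of u
    onto piecewise polynomials of degree p on [a, b]."""
    nodes, weights = leggauss(p + 2)                    # Gauss rule on [-1, 1]
    edges = np.linspace(a, b, n_elements + 1)
    coeffs = np.zeros((n_elements, p + 1))
    for e in range(n_elements):
        xl, xr = edges[e], edges[e + 1]
        x = 0.5 * (xr - xl) * nodes + 0.5 * (xl + xr)   # map nodes to element
        for k in range(p + 1):
            Pk = legval(nodes, [0] * k + [1])            # Legendre P_k at the nodes
            # L2-orthogonality of Legendre polynomials: c_k = (2k+1)/2 * int u P_k dxi
            coeffs[e, k] = 0.5 * (2 * k + 1) * np.sum(weights * u(x) * Pk)
    return edges, coeffs

def evaluate(edges, coeffs, x):
    """Evaluate the piecewise projection at a point x."""
    e = min(np.searchsorted(edges, x, side='right') - 1, len(coeffs) - 1)
    xi = 2 * (x - edges[e]) / (edges[e + 1] - edges[e]) - 1
    return legval(xi, coeffs[e])

edges, c = l2_project(np.sin, 0.0, 2 * np.pi, 20, 2)
print(abs(evaluate(edges, c, 1.0) - np.sin(1.0)) < 1e-3)  # projection error is small
```

By construction the projection reproduces polynomials of degree p or lower exactly, so such a field is a convenient stand-in for a DG approximation at the initial time.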
The velocity (u, v) is obtained as a function of (x, y) from the real and imaginary parts of a complex number r:

    u := Re(r),    v := Im(r),

where, defining the complex number z := x + iy, the following three test cases are given:

    r = (z − (0.74 + 0.35i)) (z − (0.68 − 0.59i)) (z − (−0.11 − 0.72i)) (z − (−0.58 + 0.64i)) (z − (0.51 − 0.27i)) (z − (−0.12 + 0.84i))²    (Case 1),
    r = (z − (0.94 + 0.15i)) (z + (−0.38 − 0.39i)) (z − (0.09 − 0.92i)) (z − (−0.38 + 0.84i)) (z − (0.71 − 0.07i))    (Case 2),
    r = −(z − (0.74 + 0.35i)) (z − (0.11 − 0.11i))² (z − (−0.11 + 0.72i)) (z − (−0.58 + 0.64i)) (z − (0.51 − 0.27i))    (Case 3).

Table 4.8. Two-dimensional system

u-component:
| mesh | Before L2 (order) | Before L∞ (order) | After (old) L2 (order) | After (old) L∞ (order) | After (new) L2 (order) | After (new) L∞ (order) |
Polynomial degree p = 1:
| 20²  | 1.22e-01 | 3.94e-02 | 1.16e-01 | 2.77e-02 | 1.18e-01 | 2.73e-02 |
| 40²  | 1.77e-02 (2.78) | 7.12e-03 (2.47) | 1.55e-02 (2.90) | 4.58e-03 (2.60) | 1.54e-02 (2.93) | 3.48e-03 (2.97) |
| 80²  | 2.94e-03 (2.59) | 1.38e-03 (2.36) | 1.96e-03 (2.98) | 6.33e-04 (2.85) | 1.95e-03 (2.98) | 4.39e-04 (2.99) |
Polynomial degree p = 2:
| 20²  | 1.58e-03 | 1.24e-03 | 1.96e-02 | 1.77e-02 | 3.33e-04 | 1.03e-04 |
| 40²  | 1.95e-04 (3.02) | 1.62e-04 (2.94) | 4.29e-04 (5.51) | 6.00e-04 (4.88) | 1.01e-05 (5.05) | 2.29e-06 (5.50) |
| 80²  | 2.44e-05 (3.00) | 2.05e-05 (2.98) | 9.46e-06 (5.50) | 1.76e-05 (5.09) | 3.12e-07 (5.01) | 7.03e-08 (5.03) |
Polynomial degree p = 3:
| 20²  | 7.87e-05 | 5.58e-05 | 2.00e-03 | 1.79e-03 | 1.33e-06 | 1.93e-06 |
| 40²  | 4.98e-06 (3.98) | 3.30e-06 (4.08) | 1.10e-05 (7.51) | 1.54e-05 (6.86) | 6.87e-09 (7.60) | 1.69e-09 (10.16) |
| 80²  | 3.11e-07 (4.00) | 1.97e-07 (4.06) | 6.07e-08 (7.50) | 1.17e-07 (7.05) | 5.02e-11 (7.10) | 1.16e-11 (7.19) |

v-component:
Polynomial degree p = 1:
| 20²  | 1.43e-01 | 3.89e-02 | 1.36e-01 | 3.65e-02 | 1.39e-01 | 3.21e-02 |
| 40²  | 2.04e-02 (2.81) | 6.13e-03 (2.67) | 1.81e-02 (2.91) | 5.42e-03 (2.75) | 1.80e-02 (2.95) | 4.07e-03 (2.98) |
| 80²  | 3.31e-03 (2.62) | 1.63e-03 (1.91) | 2.27e-03 (2.99) | 7.23e-04 (2.91) | 2.26e-03 (2.99) | 5.09e-04 (3.00) |
Polynomial degree p = 2:
| 20²  | 2.22e-03 | 2.45e-03 | 2.16e-02 | 1.43e-02 | 3.80e-04 | 1.18e-04 |
| 40²  | 2.72e-04 (3.03) | 3.07e-04 (3.00) | 4.89e-04 (5.46) | 6.64e-04 (4.43) | 1.15e-05 (5.04) | 2.63e-06 (5.49) |
| 80²  | 3.39e-05 (3.00) | 3.84e-05 (3.00) | 1.09e-05 (5.49) | 2.18e-05 (4.93) | 3.59e-07 (5.01) | 8.09e-08 (5.02) |
Polynomial degree p = 3:
| 20²  | 1.14e-04 | 1.26e-04 | 2.21e-03 | 1.41e-03 | 1.60e-06 | 1.86e-06 |
| 40²  | 7.49e-06 (3.93) | 8.21e-06 (3.94) | 1.25e-05 (7.46) | 1.60e-05 (6.45) | 8.32e-09 (7.59) | 2.04e-09 (9.84) |
| 80²  | 4.75e-07 (3.98) | 5.23e-07 (3.97) | 6.98e-08 (7.48) | 1.40e-07 (6.84) | 5.94e-11 (7.13) | 1.37e-11 (7.21) |

For each test case, the position-dependent post-processor enhances the convergence rate from O(h^{p+1}) to O(h^{2p+1}). This can be seen from Table 4.9, Table 4.10, and Table 4.11. Figure 4.7 illustrates the local accuracy improvement for the first case. An interesting effect can be seen in the three tables: for sufficiently large p, the errors of the post-processed field are of the order of the machine precision, which suggests that the exact solution has been reached. This happens e.g. for the second case with p ≥ 2. For that problem, the exact solution is a polynomial of degree 5. At the same time, the post-processed solution is a piecewise polynomial of degree no more than 2p + 1 in each variable⁶. Combining these facts (observing 2p + 1 ≥ 5 for p ≥ 2), the high accuracy suggests that the post-processed L2-projection onto the space of piecewise polynomials of degree p behaves like the L2-projection onto the space of piecewise polynomials of degree 2p + 1 in each variable. Theoretical support for this phenomenon is left for future research.

A good feature of the post-processor is that it can enhance the accuracy of streamlines, especially near critical points. This was observed by Steffen et al. [73] for the symmetric post-processor, away from the boundary.
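The critical points of such a velocity field are the roots of r, which can be read off directly from the factored form above. A quick check for Case 2 using Python's complex arithmetic (the helper names are ours):

```python
def r_case2(z):
    """Velocity field of Case 2 as a complex polynomial in z = x + iy."""
    return ((z - (0.94 + 0.15j)) * (z + (-0.38 - 0.39j))
            * (z - (0.09 - 0.92j)) * (z - (-0.38 + 0.84j))
            * (z - (0.71 - 0.07j)))

def velocity(x, y):
    r = r_case2(complex(x, y))
    return r.real, r.imag          # u = Re(r), v = Im(r)

# the first factor vanishes at z = 0.94 + 0.15i, so (u, v) = (0, 0) there
u, v = velocity(0.94, 0.15)
print(u == 0 and v == 0)  # → True
```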
Figure 4.8 shows that similar improvements are obtained for the position-dependent post-processor in the entire spatial domain (using a standard RK-4 method with time step ∆t = 0.01 to compute the streamlines). We have translated the second field of Steffen et al. so that the critical points are located close to the boundary to emphasize the improved applicability and accuracy of the position-dependent post-processor near the boundary.

4.6 Conclusion

This chapter proposes the position-dependent post-processor as an alternative to the one-sided post-processor [64], and analyzes the impact of both strategies on the accuracy and smoothness of DG (upwind) approximations for hyperbolic problems. Our numerical results demonstrate that the new post-processor can enhance the convergence rate from order p + 1 to order 2p + 1, in both the L2- and the L∞-norm. The differences with the original one-sided method occur at the boundary of the domain: in those regions, the new post-processor uses extra kernel nodes, as well as a smoother transition of the nodes. This results in significantly smaller errors with a more realistic smoothness. Altogether, unlike before, the proposed position-dependent post-processor can be used to obtain better smoothness and accuracy than the unfiltered DG approximation in the entire domain, including near (non-periodic) boundaries and shocks. This can aid better visualization of the results, e.g. in the form of streamlines.

⁶ This is because the post-processed solution is obtained from the convolution of a piecewise polynomial of degree p (the L2-projection before post-processing) with a piecewise polynomial of degree p in each variable (the kernel, a linear combination of B-splines of order p + 1).
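The streamline computation described above is a classic fourth-order Runge-Kutta integration of dx/dt = u, dy/dt = v. A generic sketch, assuming a callable velocity field (the function names are ours):

```python
def rk4_streamline(velocity, x0, y0, dt=0.01, n_steps=1000):
    """Trace a streamline of the field (u, v) = velocity(x, y) with classic RK-4."""
    path = [(x0, y0)]
    x, y = x0, y0
    for _ in range(n_steps):
        k1 = velocity(x, y)
        k2 = velocity(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1])
        k3 = velocity(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1])
        k4 = velocity(x + dt * k3[0], y + dt * k3[1])
        x += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
        path.append((x, y))
    return path

# sanity check on a solid-body rotation: streamlines are circles
path = rk4_streamline(lambda x, y: (-y, x), 1.0, 0.0, dt=0.01, n_steps=628)
x, y = path[-1]
print(abs((x * x + y * y) ** 0.5 - 1.0) < 1e-6)  # → True (radius is preserved)
```

In practice the velocity would be the (post-processed) DG field; the accuracy of the traced streamline is then limited by the accuracy of that field rather than by the RK-4 step.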
[Figure 4.7. Logarithm of the local error of the u- and v-components before and after post-processing (Case 1, p = 2, N = 40²).]

[Figure 4.8. Enhanced streamline visualization after post-processing for Case 1 (left) and Case 2 (right) (p = 1, N = 20²): exact streamlines together with those before and after post-processing.]

Table 4.9. Case 1

u-component:
| mesh | L2 before (order) | L2 after (order) | L∞ before (order) | L∞ after (order) |
Polynomial degree p = 1:
| 20²  | 5.36e-02 | 3.58e-03 | 4.63e-01 | 2.26e-02 |
| 40²  | 1.35e-02 (1.99) | 1.20e-04 (4.90) | 1.27e-01 (1.87) | 8.48e-04 (4.74) |
| 80²  | 3.37e-03 (2.00) | 5.98e-06 (4.33) | 3.32e-02 (1.93) | 2.95e-05 (4.85) |
Polynomial degree p = 2:
| 20²  | 1.92e-03 | 6.01e-06 | 1.97e-02 | 7.56e-06 |
| 40²  | 2.41e-04 (2.99) | 2.00e-07 (4.91) | 2.67e-03 (2.89) | 2.19e-07 (5.11) |
| 80²  | 3.01e-05 (3.00) | 4.23e-09 (5.56) | 3.47e-04 (2.94) | 4.25e-09 (5.69) |
Polynomial degree p = 3:
| 20²  | 4.96e-05 | 1.08e-23 | 4.22e-04 | 1.84e-22 |
| 40²  | 3.11e-06 (4.00) | 7.03e-22 (-) | 2.80e-05 (3.91) | 1.40e-20 (-) |
| 80²  | 1.94e-07 (4.00) | 2.13e-21 (-) | 1.81e-06 (3.96) | 7.16e-20 (-) |

v-component:
Polynomial degree p = 1:
| 20²  | 1.23e-01 | 4.82e-03 | 1.40e+00 | 3.53e-02 |
| 40²  | 3.10e-02 (1.99) | 1.75e-04 (4.78) | 3.77e-01 (1.90) | 1.31e-03 (4.75) |
| 80²  | 7.75e-03 (2.00) | 9.84e-06 (4.16) | 9.78e-02 (1.95) | 4.57e-05 (4.85) |
Polynomial degree p = 2:
| 20²  | 4.20e-03 | 1.14e-05 | 4.78e-02 | 1.56e-05 |
| 40²  | 5.27e-04 (3.00) | 3.07e-07 (5.21) | 6.37e-03 (2.91) | 3.49e-07 (5.48) |
| 80²  | 6.59e-05 (3.00) | 5.99e-09 (5.68) | 8.21e-04 (2.95) | 6.31e-09 (5.79) |
Polynomial degree p = 3:
| 20²  | 9.09e-05 | 1.75e-23 | 8.18e-04 | 3.47e-22 |
| 40²  | 5.69e-06 (4.00) | 1.41e-21 (-) | 5.37e-05 (3.93) | 2.87e-20 (-) |
| 80²  | 3.56e-07 (4.00) | 2.44e-21 (-) | 3.44e-06 (3.96) | 6.81e-20 (-) |

Table 4.10. Case 2

u-component:
| mesh | L2 before (order) | L2 after (order) | L∞ before (order) | L∞ after (order) |
Polynomial degree p = 1:
| 20²  | 2.00e-02 | 2.57e-04 | 1.31e-01 | 7.92e-04 |
| 40²  | 5.00e-03 (2.00) | 1.25e-05 (4.36) | 3.42e-02 (1.93) | 2.65e-05 (4.90) |
| 80²  | 1.25e-03 (2.00) | 8.39e-07 (3.90) | 8.76e-03 (1.97) | 9.38e-07 (4.82) |
Polynomial degree p = 2:
| 20²  | 4.65e-04 | 2.68e-26 | 2.80e-03 | 2.95e-25 |
| 40²  | 5.83e-05 (3.00) | 7.75e-25 (-) | 3.62e-04 (2.95) | 1.08e-23 (-) |
| 80²  | 7.28e-06 (3.00) | 2.34e-24 (-) | 4.60e-05 (2.98) | 4.55e-23 (-) |
Polynomial degree p = 3:
| 20²  | 6.80e-06 | 4.13e-24 | 2.53e-05 | 5.94e-23 |
| 40²  | 4.25e-07 (4.00) | 3.91e-22 (-) | 1.61e-06 (3.97) | 4.90e-21 (-) |
| 80²  | 2.66e-08 (4.00) | 9.63e-22 (-) | 1.02e-07 (3.99) | 2.31e-20 (-) |

v-component:
Polynomial degree p = 1:
| 20²  | 2.69e-02 | 2.72e-04 | 1.84e-01 | 8.04e-04 |
| 40²  | 6.74e-03 (2.00) | 1.42e-05 (4.26) | 4.80e-02 (1.94) | 2.73e-05 (4.88) |
| 80²  | 1.69e-03 (2.00) | 9.50e-07 (3.90) | 1.23e-02 (1.97) | 1.05e-06 (4.71) |
Polynomial degree p = 2:
| 20²  | 5.73e-04 | 7.39e-26 | 3.49e-03 | 1.04e-24 |
| 40²  | 7.17e-05 (3.00) | 1.45e-24 (-) | 4.50e-04 (2.96) | 2.82e-23 (-) |
| 80²  | 8.97e-06 (3.00) | 3.63e-24 (-) | 5.71e-05 (2.98) | 9.69e-23 (-) |
Polynomial degree p = 3:
| 20²  | 7.45e-06 | 1.01e-23 | 2.81e-05 | 1.77e-22 |
| 40²  | 4.66e-07 (4.00) | 7.18e-22 (-) | 1.79e-06 (3.98) | 1.22e-20 (-) |
| 80²  | 2.91e-08 (4.00) | 1.68e-21 (-) | 1.13e-07 (3.99) | 4.11e-20 (-) |

Table 4.11. Case 3

u-component:
| mesh | L2 before (order) | L2 after (order) | L∞ before (order) | L∞ after (order) |
Polynomial degree p = 1:
| 20²  | 4.92e-02 | 1.34e-03 | 3.98e-01 | 8.15e-03 |
| 40²  | 1.23e-02 (1.99) | 3.73e-05 (5.16) | 1.06e-01 (1.90) | 2.82e-04 (4.85) |
| 80²  | 3.09e-03 (2.00) | 1.56e-06 (4.58) | 2.75e-02 (1.95) | 8.84e-06 (5.00) |
Polynomial degree p = 2:
| 20²  | 1.89e-03 | 4.50e-06 | 1.42e-02 | 4.78e-06 |
| 40²  | 2.37e-04 (3.00) | 1.11e-07 (5.34) | 1.88e-03 (2.92) | 7.47e-08 (6.00) |
| 80²  | 2.97e-05 (3.00) | 2.03e-09 (5.77) | 2.41e-04 (2.96) | 1.17e-09 (6.00) |
Polynomial degree p = 3:
| 20²  | 4.27e-05 | 3.98e-24 | 2.40e-04 | 5.29e-23 |
| 40²  | 2.67e-06 (4.00) | 5.44e-22 (-) | 1.56e-05 (3.94) | 1.45e-20 (-) |
| 80²  | 1.67e-07 (4.00) | 1.25e-21 (-) | 9.92e-07 (3.97) | 4.04e-20 (-) |

v-component:
Polynomial degree p = 1:
| 20²  | 4.17e-02 | 7.11e-04 | 2.04e-01 | 3.90e-03 |
| 40²  | 1.05e-02 (1.99) | 2.26e-05 (4.98) | 5.39e-02 (1.92) | 1.28e-04 (4.93) |
| 80²  | 2.62e-03 (2.00) | 1.10e-06 (4.35) | 1.39e-02 (1.96) | 4.17e-06 (4.94) |
Polynomial degree p = 2:
| 20²  | 1.66e-03 | 3.65e-26 | 6.80e-03 | 5.33e-25 |
| 40²  | 2.08e-04 (3.00) | 9.01e-25 (-) | 8.95e-04 (2.93) | 2.28e-23 (-) |
| 80²  | 2.60e-05 (3.00) | 2.58e-24 (-) | 1.15e-04 (2.96) | 9.17e-23 (-) |
Polynomial degree p = 3:
| 20²  | 3.79e-05 | 4.01e-24 | 1.24e-04 | 5.62e-23 |
| 40²  | 2.37e-06 (4.00) | 4.48e-22 (-) | 8.05e-06 (3.95) | 1.11e-20 (-) |
| 80²  | 1.48e-07 (4.00) | 9.47e-22 (-) | 5.12e-07 (3.97) | 3.13e-20 (-) |

5 Theoretical Superconvergence

This chapter is based on: L. Ji, P. van Slingerland, J.K. Ryan, C. Vuik, Superconvergent error estimates for position-dependent smoothness-increasing accuracy-conserving (SIAC) post-processing of Discontinuous Galerkin solutions. Accepted for publication in Math. Comp.

5.1 Introduction

In Chapter 4, we proposed the position-dependent post-processor by generalizing the original symmetric and one-sided post-processor.
Various numerical experiments demonstrated that, unlike before, this technique enhances both the smoothness and the accuracy of the unfiltered DG approximation in the entire spatial domain, including the boundary region. Furthermore, an improvement of the convergence rate from order p + 1 to order 2p + 1 was observed. This chapter focuses on theoretical support for these findings.

For the symmetric filter, such theory is already available: Bramble and Schatz [11] derived superconvergence in the L2- and L∞-norm for (continuous) Ritz-Galerkin approximations. Cockburn, Luskin, Shu and Süli [22] extended this work to show an accuracy improvement from order p + 1 to order 2p + 1 for DG schemes in the L2-norm for linear periodic hyperbolic problems. Interestingly, these results were established despite the fact that the post-processor does not contain any information of the underlying physics or numerics.

To extend this available theory for the symmetric filter and to explain the numerical results in Chapter 4, in this chapter, we derive error estimates for the generalized and position-dependent post-processor. In particular, we show that it enhances the DG convergence from order p + 1 to order 2p + 1 in the L2-norm, and to order min{2p + 1, 2p + 2 − d/2} in the L∞-norm, for d-dimensional linear periodic hyperbolic problems. Unlike [11, 22], these estimates are valid in the entire spatial domain, while the post-processor does not require information outside the domain. Furthermore, it is the first time that such results are established in the L∞-norm for DG approximations.

The outline of this chapter is as follows. Section 5.2 discusses the DG method and post-processor under consideration, as well as some basic notation. Section 5.3 derives two auxiliary properties. Section 5.4 uses these to obtain the main error estimates in abstract form. Section 5.5 considers the implications for DG approximations and links the theory to the numerical observations in Chapter 4.
Finally, Section 5.6 summarizes the main conclusions.

5.2 Methods and notation

This section specifies the methods and notation considered in this chapter. Section 5.2.1 recalls the generalized post-processor from Chapter 4. Section 5.2.2 summarizes the DG method for linear periodic hyperbolic problems. Section 5.2.3 introduces some additional notation that we will use frequently.

5.2.1 Post-processor

The position-dependent post-processor is based on a convex combination of two generalized post-processors (cf. Section 4.4). For this reason, in this chapter, we primarily focus on the generalized post-processor. Below, we repeat the definition of the latter, using slightly different notation than before. Implications for the position-dependent post-processor are discussed in Section 5.5.3.

Recall the definition of a B-spline of order ℓ ≥ 1 (cf. (4.3)):

    ψ^{(1)} := 1_{[−1/2, 1/2]},    ψ^{(ℓ)} := ψ^{(ℓ−1)} ⋆ ψ^{(1)},    for all ℓ ≥ 2.    (5.1)

Next, we define the one-dimensional kernel for an evaluation point x̄ ∈ (a, b) as the following linear combination of r + 1 B-splines of order ℓ:

    K(x) = Σ_{j=0}^{r} c_j ψ^{(ℓ)}(x − x_j),    for all x ∈ R,    (5.2)

where the kernel nodes read (the additional small factor ε > 0 is discussed below):

    x_j = −r/2 + j + λ(x̄),    for all j = 0, ..., r,    (5.3)

    λ(x̄) = min{ 0, −(r + ℓ + ε)/2 + (x̄ − a)/h },    for x̄ ∈ [a, (a + b)/2),
    λ(x̄) = max{ 0,  (r + ℓ + ε)/2 + (x̄ − b)/h },    for x̄ ∈ [(a + b)/2, b],    (5.4)

and the kernel coefficients c_j satisfy:

    Σ_{j=0}^{r} c_j ∫_{−∞}^{∞} ψ^{(ℓ)}(x) (x + x_j)^k dx = 1 for k = 0, and = 0 else.    (5.5)

Different evaluation points result in different kernels. Now that we have specified the one-dimensional kernel, we can define the generalized post-processor for a d-dimensional domain Ω = (a₁, b₁) × ... × (a_d, b_d). To this end, we proceed as in Section 4.4.3: consider an evaluation point x̄ = (x̄₁, ..., x̄_d) ∈ Ω.
Next, let K_j denote the one-dimensional kernel for the evaluation point x̄_j ∈ (a_j, b_j) (as specified above), and set:

    K(x) = K₁(x₁) ... K_d(x_d),    for all x ∈ R^d.    (5.6)

Using this definition, we can apply the generalized post-processor to a function u ∈ L²(Ω) by computing the convolution with the scaled kernel K:

    u⋆(x̄) = (1/h^d) ∫_Ω K( (x̄ − x)/h ) u(x) dx.    (5.7)

Note that we have added a term ε/2 in the definition of the shift function (5.4). In practice, we can set ε = 0 (cf. (4.12)). However, for the theoretical purposes in this chapter, we need ε > 0 to be arbitrarily small yet fixed (independent of h). We stress that the resulting post-processor is still applicable in the entire spatial domain, including the region near the boundary. A nonzero value ε > 0 simply means that the post-processor is slightly less 'symmetric' near the boundary (cf. Section 4.3.3).

5.2.2 DG discretization for hyperbolic problems

We study the generalized post-processor both in an abstract framework and in the context of DG schemes for linear periodic hyperbolic problems on uniform meshes with exact time integration. Below, we specify the latter. We acknowledge that the aforementioned assumptions are quite strong, and usually not valid in practice. Nevertheless, numerical experiments show that the position-dependent post-processor enhances the accuracy in a similar manner for other problems as well (cf. Chapter 4). Altogether, we study the following d-dimensional linear hyperbolic problem on the spatial domain Ω = [0, 1]^d:

    u_t + Σ_{j=1}^{d} A_j u_{x_j} + A_0 u = 0,    (5.8)

with initial condition u₀ and periodic boundary conditions. The coefficients A_j are assumed to be scalar and constant. To obtain a DG approximation for this system, consider a uniform mesh for the spatial domain Ω with elements E₁, ..., E_N of size h × ... × h.
Next, define the test space V that contains each element of L²(Ω) that is a polynomial of degree p or lower within each mesh element, and that may be discontinuous at the mesh element boundaries. At the initial time t = 0, the DG approximation u_h is the L²-projection of u₀ onto V. For t > 0, u_h is the function in V such that:

    ∫_Ω (u_h)_t v + B(u_h, v) = 0,    for all v ∈ V,

where B is a bilinear form defined hereafter. To specify B, let (n₁^{(i)}, ..., n_d^{(i)}) be the outward normal of E_i. Furthermore, let û_h denote the usual upwind flux (cf. Section 4.2). Now, the bilinear form B can be specified as follows:

    B(u_h, v) = Σ_{i=1}^{N} ( Σ_{e∈∂E_i} ∫_e Σ_{j=1}^{d} A_j n_j^{(i)} û_h v + ∫_{E_i} u_h ( − Σ_{j=1}^{d} A_j v_{x_j} + A_0 v ) ).

5.2.3 Additional notation

This section specifies some additional notation that we use throughout this chapter. Unless specified otherwise, Ω denotes a spatial domain of the form (a₁, b₁) × ... × (a_d, b_d). The standard norms in L^p(Ω) are denoted as:

    ‖u‖_{L^p(Ω)} = ( ∫_Ω |u|^p )^{1/p},    1 ≤ p < ∞,
    ‖u‖_{L^∞(Ω)} = ess sup_{x∈Ω} |u(x)|.

Furthermore, for integers k ≥ 0 and p = 2, ∞, let W^{k,p}(Ω) denote the usual Sobolev space, i.e. the set of all functions u such that, for every d-dimensional multi-index¹ α with |α| ≤ k, the weak partial derivative D^α u belongs to L^p(Ω). For p = 2, we write H^k(Ω) = W^{k,2}(Ω), which is equipped with the norm:

    ‖u‖_{H^k(Ω)} = ( Σ_{|α|≤k} ‖D^α u‖²_{L²(Ω)} )^{1/2}.    (5.9)

Additionally, we define the negative-order space H^{−k}(Ω) for integers k ≥ 0: this space is the closure of the smooth functions C^∞(Ω) with respect to the so-called negative-order norm²:

    ‖u‖_{H^{−k}(Ω)} = sup_{v∈C₀^∞(Ω)} ( ∫_Ω u v ) / ‖v‖_{H^k(Ω)}.    (5.10)

A multi-variate B-spline of order ℓ is the tensor product of one-dimensional B-splines:

    ψ^{(ℓ)}(x) = ψ^{(ℓ)}(x₁) ... ψ^{(ℓ)}(x_d),    for all x ∈ R^d.

A scaled multi-variate B-spline is defined as:

    ψ_h^{(ℓ)}(x) = (1/h^d) ψ^{(ℓ)}(x/h),    for all x ∈ R^d.
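The moment conditions (5.5) give a small linear system for the kernel coefficients c_j. Since ψ^{(ℓ)} is a repeated convolution of box functions, its moments obey a binomial-convolution recursion, so the system can be assembled exactly. A sketch (the function names are ours); for the classical symmetric choice of three B-splines of order 2 at nodes −1, 0, 1, it reproduces the well-known coefficients (−1/12, 7/6, −1/12):

```python
import numpy as np
from math import comb

def bspline_moments(order, k_max):
    """Moments m_k = int psi^(order)(x) x^k dx, using the fact that the
    moments of f*g are binomial convolutions of the moments of f and g."""
    # moments of the indicator psi^(1) on [-1/2, 1/2]
    m1 = [(0.5 ** (k + 1) - (-0.5) ** (k + 1)) / (k + 1) for k in range(k_max + 1)]
    m = list(m1)
    for _ in range(order - 1):                       # convolve (order - 1) times
        m = [sum(comb(k, i) * m[i] * m1[k - i] for i in range(k + 1))
             for k in range(k_max + 1)]
    return m

def kernel_coefficients(nodes, order):
    """Solve (5.5): sum_j c_j int psi^(order)(x) (x + x_j)^k dx = delta_{k0}."""
    r = len(nodes) - 1
    m = bspline_moments(order, r)
    # expand (x + x_j)^k with the binomial theorem: rows k = 0..r
    A = np.array([[sum(comb(k, i) * m[i] * xj ** (k - i) for i in range(k + 1))
                   for xj in nodes] for k in range(r + 1)])
    rhs = np.zeros(r + 1)
    rhs[0] = 1.0
    return np.linalg.solve(A, rhs)

c = kernel_coefficients([-1.0, 0.0, 1.0], order=2)
print(np.round(c, 6))  # -1/12, 7/6, -1/12
```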
The usual central difference operator in the j-th direction is denoted as³

    ∂_{h,j} u(x) = (1/h) ( u(x + (h/2) e_j) − u(x − (h/2) e_j) ).

Using this notation, we set for any d-dimensional multi-index α (writing ∂_{h,j}^k u = ∂_{h,j} ∂_{h,j}^{k−1} u for integers k ≥ 2):

    ∂_h^α u = ∂_{h,1}^{α₁} ... ∂_{h,d}^{α_d} u.

Finally, we use the notation Ω_α for the largest subset of Ω = (a₁, b₁) × ... × (a_d, b_d) such that ∂_h^α u : Ω_α → R does not require information outside Ω:

    Ω_α = ( a₁ + α₁ h/2, b₁ − α₁ h/2 ) × ... × ( a_d + α_d h/2, b_d − α_d h/2 ).    (5.11)

For scalars γ we set Ω_γ = Ω_{(γ,...,γ)}.

1. A d-dimensional multi-index α is a d-tuple of nonnegative integers: α = (α₁, ..., α_d). Furthermore, |α| = α₁ + ... + α_d and D^α u = (∂/∂x₁)^{α₁} ... (∂/∂x_d)^{α_d} u.
2. C₀^∞(Ω) denotes the usual set of all functions in C^∞(Ω) with compact support.
3. Here, e_j is the multi-index whose j-th component is 1 and all other components are 0.

5.3 Auxiliary results

To obtain the main error estimates, we require two auxiliary results, which are discussed in this section. Section 5.3.1 derives an estimate for ‖u − u⋆‖. Section 5.3.2 expresses derivatives of convolutions with B-splines in terms of central differences.

5.3.1 Estimating ‖u − u⋆‖

In Section 4.3.2, we mentioned that the symmetric post-processor reproduces polynomials of degree 2p. As a result, the difference between u and u⋆ is of order 2p + 1 (assuming that u is sufficiently smooth). Actually, Bramble and Schatz [11, Lemma 5.2] designed the symmetric kernel this way. In this section, we show that a similar result holds for the generalized post-processor. Unlike before, this estimate is valid in the entire spatial domain.

Lemma 5.3.1 Consider the generalized post-processor with r + 1 kernel nodes, as defined in Section 5.2.1. Let k ≤ r + 1 be a positive integer. Then⁴,

    ‖u − u⋆‖_{L∞(Ω)} ≲ Σ_{|α|=k} ‖D^α u‖_{L∞(Ω)} h^k,    for all u ∈ W^{k,∞}(Ω).    (5.12)

Here, α denotes a d-dimensional multi-index.
The constant involved depends on the L¹-norm of the kernels, as indicated in the proof below. To show Lemma 5.3.1, we follow the same strategy sketched for the symmetric post-processor in [11, p. 103]: the main idea is to demonstrate (the proof is given below) that the generalized post-processor reproduces polynomials p of degree r (or lower, in each variable):

    p⋆(x̄) = p(x̄),    for all x̄ ∈ Ω.    (5.13)

Using this property, the proof can be completed using Taylor's theorem: for u ∈ W^{k,∞}(Ω) and x, x̄ ∈ Ω, we have⁵:

    u(x) = Σ_{|α|≤k−1} ((x − x̄)^α / α!) D^α u(x̄)
         + k Σ_{|α|=k} ((x − x̄)^α / α!) ∫₀¹ s^{k−1} D^α u(x + s(x̄ − x)) ds.    (5.14)

This follows from [14, p. 83, 100, 101] (using the chain rule and Lipschitz continuity of the derivatives of order k − 1 and lower [30, p. 269]).

4. Throughout this chapter, we use the symbol ≲ in expressions of the form "F(x) ≲ G(x) for all x ∈ X" to indicate that there exists a constant C > 0, independent of the variable x and the scaling h, such that F(x) ≤ C G(x) for all x ∈ X. The symbol ≳ is defined similarly.
5. For a d-dimensional multi-index α, we write α! = α₁! ... α_d! and x^α = x₁^{α₁} ... x_d^{α_d}.

We can now show Lemma 5.3.1:

Proof (of Lemma 5.3.1). To show (5.13), first consider the one-dimensional case. Without loss of generality, we may assume that p is a monomial basis function, i.e. p(x) = x^m (with m = 0, ..., r). Next, observe that, by definition of the post-processor (substituting y := (x̄ − x)/h − x_j),

    p⋆(x̄) := (1/h) ∫_Ω K( (x̄ − x)/h ) x^m dx
            = Σ_{j=0}^{r} c_j (1/h) ∫_{−∞}^{∞} ψ^{(ℓ)}( (x̄ − x)/h − x_j ) x^m dx
            = Σ_{j=0}^{r} c_j ∫_{−∞}^{∞} ψ^{(ℓ)}(y) ( x̄ − h(y + x_j) )^m dy.

Using the binomial theorem we may write:

    p⋆(x̄) = Σ_{n=0}^{m} ( m! / ((m − n)! n!) ) x̄^n (−h)^{m−n} Σ_{j=0}^{r} c_j ∫_R ψ^{(ℓ)}(y) (y + x_j)^{m−n} dy = x̄^m = p(x̄),

since, by (5.5), the inner sum equals 1 for n = m and zero otherwise. This completes the proof of (5.13) for the one-dimensional case. For higher-dimensional problems, this result follows from (5.7) and repeated application of (5.13) for one-dimensional kernels.
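The reproduction property (5.13) can also be checked numerically. The sketch below uses the symmetric p = 1 kernel (order-2 B-splines at nodes −1, 0, 1 with coefficients −1/12, 7/6, −1/12, as in the standard SIAC construction); convolving against x² in the form u⋆(x̄) = ∫ K(y) u(x̄ − hy) dy returns x̄² up to quadrature error:

```python
import numpy as np

def kernel(y):
    """Symmetric SIAC kernel for p = 1: hat functions psi^(2)(y - x_j)."""
    hat = lambda t: np.maximum(0.0, 1.0 - np.abs(t))
    c, nodes = [-1/12, 7/6, -1/12], [-1.0, 0.0, 1.0]
    return sum(cj * hat(y - xj) for cj, xj in zip(c, nodes))

def post_process(u, xbar, h):
    """u*(xbar) = int K(y) u(xbar - h y) dy, by a simple Riemann/trapezoid sum
    (exact at the endpoints since K vanishes there)."""
    y = np.linspace(-2.0, 2.0, 40001)   # support of K is [-2, 2]
    f = kernel(y) * u(xbar - h * y)
    return np.sum(f) * (y[1] - y[0])

# the kernel reproduces polynomials up to degree r = 2, cf. (5.13)
print(abs(post_process(lambda x: x**2, 0.3, h=0.1) - 0.09) < 1e-8)  # → True
```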
Now that we have obtained (5.13), we can show (5.12). To this end, choose an evaluation point x̄ ∈ Ω. Because the post-processor reproduces polynomials (5.13), we may write for any polynomial p of degree r or lower (note that u, u⋆ and p are continuous):

    |u − u⋆|(x̄) = |(u − p)(x̄) − (u − p)⋆(x̄)|.

In particular, we may choose p to be the Taylor polynomial of degree k − 1 that approximates u near the point x̄. As a consequence, (u − p)(x̄) = 0, and we may write (substituting y := (x̄ − x)/h and using Hölder's inequality in the last step):

    |u − u⋆|(x̄) = |(u − p)⋆(x̄)|
                = | (1/h^d) ∫_Ω K( (x̄ − x)/h ) (u − p)(x) dx |
                = | ∫_{supp(K)} K(y) (u − p)(x̄ − hy) dy |
                ≤ ‖K‖_{L¹(R^d)} ‖(u − p)(x̄ − h·)‖_{L∞(supp(K))}.    (5.15)

To estimate the second term, we apply Taylor's theorem (5.14) for all y ∈ supp(K):

    |u − p|(x̄ − hy) = | k Σ_{|α|=k} ((−hy)^α / α!) ∫₀¹ s^{k−1} D^α u( x̄ − hy + s(hy) ) ds |
                    ≤ Σ_{|α|=k} (|(−hy)^α| / α!) ‖D^α u‖_{L∞(Ω)} k ∫₀¹ s^{k−1} ds
                    ≤ Σ_{|α|=k} (|y^α| / α!) ‖D^α u‖_{L∞(Ω)} h^k,

where k ∫₀¹ s^{k−1} ds = 1. Substitution of this result into (5.15) yields:

    |u − u⋆|(x̄) ≤ ‖K‖_{L¹(R^d)} sup_{y∈supp(K)} Σ_{|α|=k} (|y^α| / α!) ‖D^α u‖_{L∞(Ω)} h^k
                ≲ Σ_{|α|=k} ‖D^α u‖_{L∞(Ω)} h^k,    for all x̄ ∈ Ω.

Here we have used that K is independent of h for all x̄ (although different x̄ yield different kernels). We now arrive at (5.12), which completes the proof.

5.3.2 Derivatives of B-splines

The derivative of a B-spline ψ^{(ℓ)} can be expressed as the central difference of the lower-order B-spline ψ^{(ℓ−1)} [70, p. 12]. In [11, Lemma 5.3], this property was exploited to estimate norms of derivatives of u⋆ for the symmetric post-processor. In this section, we obtain a similar result for the convolution of u with a single B-spline (without requiring continuity of u). This reduction to the core elements of the post-processor is convenient when handling different kernel types later on. Another difference with [11] is that we provide an explicit expression for the largest subdomain for which the estimates are valid (without requiring information outside Ω).
These subdomains will turn out to be just large enough to ensure that our main estimates are applicable in the entire spatial domain. Altogether, we have the following result:

Lemma 5.3.2 Let α be a d-dimensional multi-index whose entries are not larger than ℓ ≥ 1. If u ∈ L∞(Ω), then ψ_h^{(ℓ)} ⋆ u ∈ W^{ℓ,∞}(Ω_ℓ), and

    ‖ D^α ( ψ_h^{(ℓ)} ⋆ u ) ‖_{L∞(Ω_ℓ)} ≤ ‖ ∂_h^α u ‖_{L∞(Ω_α)}.    (5.16)

Similarly, if u ∈ L²(Ω), then ψ_h^{(ℓ)} ⋆ u ∈ H^ℓ(Ω_ℓ), and, for k ≥ 0:

    ‖ D^α ( ψ_h^{(ℓ)} ⋆ u ) ‖_{H^{−k}(Ω_ℓ)} ≤ ‖ ∂_h^α u ‖_{H^{−k}(Ω_α)}.    (5.17)

To show Lemma 5.3.2, the main idea is to demonstrate (the proof is given below)⁶:

    D^α ( ψ_h^{(ℓ)} ⋆ u ) = ψ_h^{(ℓ−α)} ⋆ ∂_h^α u.    (5.18)

Furthermore, we make use of Young's inequality for convolutions [9, Theorem 3.9.4]: for 1 ≤ p, q, r ≤ ∞, f ∈ L^p(R^d), and g ∈ L^q(R^d):

    1/p + 1/q = 1/r + 1    ⇒    ‖f ⋆ g‖_{L^r(R^d)} ≤ ‖f‖_{L^p(R^d)} ‖g‖_{L^q(R^d)}.    (5.19)

Additionally, we use that [70, p. 3]:

    ‖ψ_h^{(ℓ)}‖_{L¹(R^d)} = 1.    (5.20)

6. Here, we use the notation ψ_h^{(ℓ−α)}(x) = ψ_h^{(ℓ−α₁)}(x₁) ... ψ_h^{(ℓ−α_d)}(x_d), for all x ∈ R^d. Furthermore, the notation ψ_h^{(0)} ⋆ u should be interpreted as u.

We can now show Lemma 5.3.2:

Proof (of Lemma 5.3.2). To show (5.18), first consider the one-dimensional case with ℓ = 1 (substituting s := x − y):

    ( ψ_h^{(1)} ⋆ u )(x) = (1/h) ∫_{−h/2}^{h/2} u(x − y) dy = (1/h) ∫_{x−h/2}^{x+h/2} u(s) ds.

As a result, ψ_h^{(1)} ⋆ u is continuous, and D(ψ_h^{(1)} ⋆ u) = ∂_h u almost everywhere [9, Theorem 5.4.2]. For ℓ ≥ 1, we may now write, using (5.1):

    D( ψ_h^{(ℓ)} ⋆ u ) = D( ψ_h^{(ℓ−1)} ⋆ (ψ_h^{(1)} ⋆ u) ) = ψ_h^{(ℓ−1)} ⋆ D( ψ_h^{(1)} ⋆ u ) = ψ_h^{(ℓ−1)} ⋆ ∂_h u.

For higher-order derivatives, we may repeat this strategy to obtain (5.18) for the one-dimensional case. For the multi-dimensional case, we apply the above in each direction, which then completes the proof of (5.18).

To show (5.16), note that (5.18) implies that

    ‖ D^α ( ψ_h^{(ℓ)} ⋆ u ) ‖_{L∞(Ω_ℓ)} = ‖ ψ_h^{(ℓ−α)} ⋆ ∂_h^α u ‖_{L∞(Ω_ℓ)}.

Next, define w ∈ L∞(R^d) such that w = ∂_h^α u in Ω_α and zero everywhere else.
Because of the local support of the B-spline, the convolution in the right-hand side above requires only information of ∂h^α u in Ωα ⊇ Ωℓ. Hence, we may replace ∂h^α u by w:
\[
\big\| D^\alpha \big( \psi_h^{(\ell)} \star u \big) \big\|_{L^\infty(\Omega_\ell)} = \big\| \psi_h^{(\ell-\alpha)} \star w \big\|_{L^\infty(\Omega_\ell)}.
\]
Next, we apply Young's inequality (5.19) with p = 1, and q, r = ∞:
\[
\big\| D^\alpha \big( \psi_h^{(\ell)} \star u \big) \big\|_{L^\infty(\Omega_\ell)} \le \big\| \psi_h^{(\ell-\alpha)} \big\|_{L^1(\mathbb{R}^d)} \, \| w \|_{L^\infty(\mathbb{R}^d)}.
\]
Using (5.20) and the fact that ‖w‖_{L∞(R^d)} = ‖∂h^α u‖_{L∞(Ωα)} now yields (5.16).

To show (5.17), note that (5.18) implies that
\[
\big\| D^\alpha \big( \psi_h^{(\ell)} \star u \big) \big\|_{H^{-k}(\Omega_\ell)} = \big\| \psi_h^{(\ell-\alpha)} \star \partial_h^\alpha u \big\|_{H^{-k}(\Omega_\ell)}.
\]
As before, we define w ∈ L2(R^d) such that w = ∂h^α u in Ωα and zero everywhere else, so we may replace ∂h^α u by w:
\[
\big\| D^\alpha \big( \psi_h^{(\ell)} \star u \big) \big\|_{H^{-k}(\Omega_\ell)} = \big\| \psi_h^{(\ell-\alpha)} \star w \big\|_{H^{-k}(\Omega_\ell)}.
\]
Next, choose v ∈ C₀^∞(Ωℓ), and extend it to R^d by setting it equal to zero outside Ωℓ. We may now write:
\[
\int_{\Omega_\ell} \big( \psi_h^{(\ell-\alpha)} \star w \big)\, v \overset{\text{Fubini}}{=} \int_{\Omega_\alpha} w \, \big( \psi_h^{(\ell-\alpha)} \star v \big).
\]
Next, we note that ψh^(ℓ−α) ⋆ v ∈ C₀^∞(Ωα), so we may consider it as a test function in the definition of the negative-order norm (5.10) of w:
\[
\int_{\Omega_\ell} \big( \psi_h^{(\ell-\alpha)} \star w \big)\, v
= \underbrace{ \frac{ \int_{\Omega_\alpha} w \, \big( \psi_h^{(\ell-\alpha)} \star v \big) }{ \big\| \psi_h^{(\ell-\alpha)} \star v \big\|_{H^k(\Omega_\alpha)} } }_{\le\, \| w \|_{H^{-k}(\Omega_\alpha)}} \big\| \psi_h^{(\ell-\alpha)} \star v \big\|_{H^k(\Omega_\alpha)}
\le \| w \|_{H^{-k}(\Omega_\alpha)} \, \big\| \psi_h^{(\ell-\alpha)} \star v \big\|_{H^k(\Omega_\alpha)}.
\]
At the same time,
\[
\big\| \psi_h^{(\ell-\alpha)} \star v \big\|_{H^k(\Omega_\alpha)}^2
= \sum_{|\beta| \le k} \big\| D^\beta \big( \psi_h^{(\ell-\alpha)} \star v \big) \big\|_{L^2(\Omega_\alpha)}^2
= \sum_{|\beta| \le k} \big\| \psi_h^{(\ell-\alpha)} \star D^\beta v \big\|_{L^2(\Omega_\alpha)}^2.
\]
Next, apply Young's inequality (5.19) with p = 1 and q, r = 2 to obtain (using that v, and thus its derivatives, have compact support in Ωℓ):
\[
\big\| \psi_h^{(\ell-\alpha)} \star v \big\|_{H^k(\Omega_\alpha)}^2
\le \underbrace{ \big\| \psi_h^{(\ell-\alpha)} \big\|_{L^1(\mathbb{R}^d)}^2 }_{\overset{(5.20)}{=}\,1} \sum_{|\beta| \le k} \big\| D^\beta v \big\|_{L^2(\Omega_\ell)}^2 = \| v \|_{H^k(\Omega_\ell)}^2.
\]
Finally, we combine the results above to obtain:
\[
\big\| D^\alpha \big( \psi_h^{(\ell)} \star u \big) \big\|_{H^{-k}(\Omega_\ell)}
= \sup_{v \in C_0^\infty(\Omega_\ell)} \frac{ \int_{\Omega_\ell} \big( \psi_h^{(\ell-\alpha)} \star w \big)\, v }{ \| v \|_{H^k(\Omega_\ell)} }
\le \| w \|_{H^{-k}(\Omega_\alpha)}.
\]
Replacing w by ∂h^α u completes the proof of (5.17).

5.4 The main result in abstract form

Using the auxiliary properties discussed in the previous section, we now derive an estimate for ‖u − v⋆‖. This result is applicable for any (sufficiently smooth) functions u and v.
The implications for DG approximations will be studied in Section 5.5. Section 5.4.1 expresses a post-processed function in terms of convolutions with B-splines. Section 5.4.2 estimates the remaining terms further. Section 5.4.3 combines these two results with Lemma 5.3.1 to obtain the final estimate for ‖u − v⋆‖.

5.4.1 Reducing the post-processor to its building blocks

The first step is to express v⋆ in terms of convolutions with B-splines. This removes the dependency on the evaluation point and, as such, simplifies the analysis. As before, we provide explicit expressions for the subdomains involved, which are crucial to ensure that our final error estimates apply in the entire domain. Altogether, we show the following result:

Lemma 5.4.1 Consider the generalized post-processor with B-splines of order ℓ ≥ 1 and ε > 0 small, as defined in Section 5.2.1. Then, for all v ∈ L2(Ω):
\[
\| v^\star \|_{L^2(\Omega)} \lesssim \big\| \psi_h^{(\ell)} \star v \big\|_{L^2(\Omega_{\ell+\varepsilon})}, \tag{5.21}
\]
\[
\| v^\star \|_{L^\infty(\Omega)} \lesssim \big\| \psi_h^{(\ell)} \star v \big\|_{L^\infty(\Omega_{\ell+\varepsilon})}. \tag{5.22}
\]
The constants involved depend on the kernel coefficients, as indicated in the proof below.

To show Lemma 5.4.1, the key is to observe that the kernel nodes are located within Ωℓ+ε during the convolution (this is motivated below).

Proof (of Lemma 5.4.1) First, consider the one-dimensional case, and choose an evaluation point x̄ ∈ Ω. Then,
\[
v^\star(\bar{x}) = \sum_{j=0}^{r} c_j \int_\Omega \frac{1}{h}\, \psi^{(\ell)}\!\left( \frac{\bar{x}-x}{h} - x_j \right) v(x)\, dx = \sum_{j=0}^{r} c_j \big( \psi_h^{(\ell)} \star v \big)(\bar{x} - h x_j).
\]
Next, observe that x̄ − h xj ∈ Ωℓ+ε for all x̄ ∈ Ω, which can be seen as follows. At the end points of the domain Ω = (a, b), the shift function (5.4) reads λ(a) = −(r+ℓ+ε)/2 and λ(b) = (r+ℓ+ε)/2, respectively. Hence, for x̄ = a, the right-most kernel node (5.3) is x_r(a) = −(ℓ+ε)/2. Similarly, for x̄ = b, the left-most kernel node is x₀(b) = (ℓ+ε)/2. Hence, the quantity x̄ − h xj takes values in [a + h(ℓ+ε)/2, b − h(ℓ+ε)/2], which is precisely Ωℓ+ε by definition (5.11).
Considering the L2-norm of v⋆ now yields (recall the dependence of the kernel coefficients on the evaluation point):
\[
\| v^\star \|_{L^2(\Omega)}^2
= \int_\Omega \Bigg| \sum_{j=0}^{r} c_j \big( \psi_h^{(\ell)} \star v \big)(\bar{x} - h x_j) \Bigg|^2 d\bar{x}
\overset{\text{Cauchy--Schwarz}}{\le} \int_\Omega \sum_{j=0}^{r} |c_j|^2 \sum_{j=0}^{r} \Big| \big( \psi_h^{(\ell)} \star v \big)(\bar{x} - h x_j) \Big|^2 d\bar{x}
\le \sup_{\bar{x} \in \Omega} \sum_{j=0}^{r} |c_j(\bar{x})|^2 \; \sum_{j=0}^{r} \int_\Omega \Big| \big( \psi_h^{(\ell)} \star v \big)(\bar{x} - h x_j) \Big|^2 d\bar{x}
\le \underbrace{ (r+1) \sup_{\bar{x} \in \Omega} \sum_{j=0}^{r} |c_j(\bar{x})|^2 }_{\text{constant}} \; \big\| \psi_h^{(\ell)} \star v \big\|_{L^2(\Omega_{\ell+\varepsilon})}^2,
\]
where the last step uses that x̄ − h xj ∈ Ωℓ+ε. Similarly, when we consider the L∞-norm of v⋆, we obtain:
\[
\| v^\star \|_{L^\infty(\Omega)}
= \sup_{\bar{x} \in \Omega} \Bigg| \sum_{j=0}^{r} c_j \big( \psi_h^{(\ell)} \star v \big)(\bar{x} - h x_j) \Bigg|
\le \underbrace{ \sup_{\bar{x} \in \Omega} \sum_{j=0}^{r} |c_j(\bar{x})| }_{\text{constant}} \; \big\| \psi_h^{(\ell)} \star v \big\|_{L^\infty(\Omega_{\ell+\varepsilon})}.
\]
This completes the proof for the one-dimensional case.

For general d ≥ 1, the desired result follows by applying the strategy above for each spatial direction. To be more specific, recall that the kernel K is a tensor product of one-dimensional kernels K₁, ..., K_d (cf. (5.6)). We can now write for x̄ ∈ Ω (the super-scripted indices below indicate the corresponding one-dimensional kernel):
\[
v^\star(\bar{x}) = \sum_{j_1=0}^{r} \cdots \sum_{j_d=0}^{r} c^{(1)}_{j_1} \cdots c^{(d)}_{j_d} \, \big( \psi_h^{(\ell)} \star v \big)\!\left( \bar{x} - h \begin{pmatrix} x^{(1)}_{j_1} \\ \vdots \\ x^{(d)}_{j_d} \end{pmatrix} \right),
\]
where the arguments again lie in Ωℓ+ε. Considering the L2- and L∞-norm yields:
\[
\| v^\star \|_{L^2(\Omega)} \le \sqrt{ (r+1)^d \sup_{\bar{x} \in \Omega} \sum_{j_1=0}^{r} \cdots \sum_{j_d=0}^{r} \big| c^{(1)}_{j_1}(\bar{x}) \cdots c^{(d)}_{j_d}(\bar{x}) \big|^2 } \; \big\| \psi_h^{(\ell)} \star v \big\|_{L^2(\Omega_{\ell+\varepsilon})},
\]
\[
\| v^\star \|_{L^\infty(\Omega)} \le \sup_{\bar{x} \in \Omega} \sum_{j_1=0}^{r} \cdots \sum_{j_d=0}^{r} \big| c^{(1)}_{j_1}(\bar{x}) \cdots c^{(d)}_{j_d}(\bar{x}) \big| \; \big\| \psi_h^{(\ell)} \star v \big\|_{L^\infty(\Omega_{\ell+\varepsilon})}.
\]
This completes the proof.

5.4.2 Treating the remaining building blocks

Now that we have reduced the post-processor to its building blocks, in this section, we further estimate the latter. To this end, we follow Bramble and Schatz [11, p. 104–105], while considering B-splines rather than the symmetric kernel (and without any restriction to continuous functions).
As before, another difference with [11] is that we keep a careful administration of the subdomains involved, as this is crucial for our final estimate to be applicable in the entire domain. Altogether, we show the following result:

Lemma 5.4.2 Consider a B-spline of order ℓ ≥ 1, let ε > 0 be small (cf. Section 5.2.1), and define⁷ d₀ = 1 + [d/2]. Then, for all v ∈ L2(Ω):
\[
\big\| \psi_h^{(\ell)} \star v \big\|_{L^2(\Omega_{\ell+\varepsilon})} \lesssim \sum_{|\alpha| \le \ell} \| \partial_h^\alpha v \|_{H^{-\ell}(\Omega_\alpha)}. \tag{5.23}
\]
Furthermore, for all v ∈ L∞(Ω):
\[
\big\| \psi_h^{(\ell)} \star v \big\|_{L^\infty(\Omega_{\ell+\varepsilon})} \lesssim \sum_{|\alpha| \le \ell + d_0} \| \partial_h^\alpha v \|_{H^{-\ell}(\Omega_\alpha)} + h^\ell \sum_{|\alpha| \le \ell} \| \partial_h^\alpha v \|_{L^\infty(\Omega_\alpha)}. \tag{5.24}
\]
⁷Here, [d/2] denotes the integer part of d/2.

To show Lemma 5.4.2, the main idea is to switch from L2-norms to negative-order norms at the cost of smoothness: for open bounded d-dimensional domains⁸ X₀ ⋐ X₁ and nonnegative integers k, we have:
\[
\| u \|_{L^2(X_0)} \lesssim \sum_{|\alpha| \le k} \| D^\alpha u \|_{H^{-k}(X_1)}, \qquad \forall u \in L^2(X_1). \tag{5.25}
\]
This has been shown in [11, p. 96]. At the same time, we can switch from L∞-norms to L2-norms (again, at the cost of smoothness): for open bounded d-dimensional domains X₀ ⋐ X₁ and d₀ := [d/2] + 1, we have that u ∈ H^{d₀}(X₁) is continuous almost everywhere in X₀, and:
\[
\| u \|_{L^\infty(X_0)} \lesssim \| u \|_{H^{d_0}(X_1)}, \qquad \forall u \in H^{d_0}(X_1). \tag{5.26}
\]
This result is given in [12, p. 679]. ⁸We write X₀ ⋐ X₁ to indicate that X₀ is compactly embedded in X₁, i.e. the closure of X₀ is compact and a subset of the interior of X₁.

Combining these relations with the expression for derivatives of B-splines in Lemma 5.3.2, we can now show Lemma 5.4.2 as follows:

Proof (of Lemma 5.4.2) Relation (5.23) is obtained as follows:
\[
\big\| \psi_h^{(\ell)} \star v \big\|_{L^2(\Omega_{\ell+\varepsilon})}
\overset{(5.25),\ \Omega_{\ell+\varepsilon} \Subset \Omega_\ell}{\lesssim} \sum_{|\alpha| \le \ell} \big\| D^\alpha \big( \psi_h^{(\ell)} \star v \big) \big\|_{H^{-\ell}(\Omega_\ell)}
\overset{\text{Lemma 5.3.2}}{\le} \sum_{|\alpha| \le \ell} \| \partial_h^\alpha v \|_{H^{-\ell}(\Omega_\alpha)}.
\]
Relation (5.24) requires a little more work: let (ψh^(ℓ) ⋆ v)⋆ denote the result of applying the generalized post-processor to ψh^(ℓ) ⋆ v using r + 1 B-splines of order d₀ (!) in the domain Ωℓ+ε (cf. Section 5.2.1). The triangle inequality then gives:
\[
\big\| \psi_h^{(\ell)} \star v \big\|_{L^\infty(\Omega_{\ell+\varepsilon})}
\le \big\| \psi_h^{(\ell)} \star v - \big( \psi_h^{(\ell)} \star v \big)^\star \big\|_{L^\infty(\Omega_{\ell+\varepsilon})}
+ \big\| \big( \psi_h^{(\ell)} \star v \big)^\star \big\|_{L^\infty(\Omega_{\ell+\varepsilon})}. \tag{5.27}
\]
To estimate the first term in the right-hand side of (5.27), we observe that ψh^(ℓ) ⋆ v ∈ W^{ℓ,∞}(Ωℓ) by Lemma 5.3.2. Hence, we can apply Lemma 5.3.1 (substituting ψh^(ℓ) ⋆ v for u) and rewrite the derivatives using Lemma 5.3.2:
\[
\big\| \psi_h^{(\ell)} \star v - \big( \psi_h^{(\ell)} \star v \big)^\star \big\|_{L^\infty(\Omega_{\ell+\varepsilon})}
\overset{\text{Lemma 5.3.1}}{\lesssim} h^\ell \sum_{|\alpha| = \ell} \big\| D^\alpha \big( \psi_h^{(\ell)} \star v \big) \big\|_{L^\infty(\Omega_{\ell+\varepsilon})}
\overset{\Omega_{\ell+\varepsilon} \subseteq \Omega_\ell}{\le} h^\ell \sum_{|\alpha| = \ell} \big\| D^\alpha \big( \psi_h^{(\ell)} \star v \big) \big\|_{L^\infty(\Omega_\ell)}
\overset{\text{Lemma 5.3.2}}{\le} h^\ell \sum_{|\alpha| \le \ell} \| \partial_h^\alpha v \|_{L^\infty(\Omega_\alpha)}. \tag{5.28}
\]
To estimate the second term in the right-hand side of (5.27), we apply Lemma 5.4.1, substituting Ωℓ+ε for Ω and d₀ for ℓ. This results in a reduction from Ωℓ+ε to Ωℓ+d₀+2ε:
\[
\big\| \big( \psi_h^{(\ell)} \star v \big)^\star \big\|_{L^\infty(\Omega_{\ell+\varepsilon})}
\overset{\text{Lemma 5.4.1}}{\lesssim} \big\| \psi_h^{(\ell+d_0)} \star v \big\|_{L^\infty(\Omega_{\ell+d_0+2\varepsilon})}.
\]
Next, we switch to L2-norms using (5.26), after which we can proceed as before for the estimates in the L2-norm:
\[
\big\| \big( \psi_h^{(\ell)} \star v \big)^\star \big\|_{L^\infty(\Omega_{\ell+\varepsilon})}
\overset{(5.26),\ \Omega_{\ell+d_0+2\varepsilon} \Subset \Omega_{\ell+d_0+\varepsilon}}{\lesssim} \sum_{|\alpha| \le d_0} \big\| D^\alpha \big( \psi_h^{(\ell+d_0)} \star v \big) \big\|_{L^2(\Omega_{\ell+d_0+\varepsilon})}
\overset{(5.25),\ \Omega_{\ell+d_0+\varepsilon} \Subset \Omega_{\ell+d_0}}{\lesssim} \sum_{|\alpha| \le d_0} \sum_{|\beta| \le \ell} \big\| D^{\alpha+\beta} \big( \psi_h^{(\ell+d_0)} \star v \big) \big\|_{H^{-\ell}(\Omega_{\ell+d_0})}
\overset{\text{reordering}}{\lesssim} \sum_{|\alpha| \le \ell+d_0} \big\| D^\alpha \big( \psi_h^{(\ell+d_0)} \star v \big) \big\|_{H^{-\ell}(\Omega_{\ell+d_0})}
\overset{\text{Lemma 5.3.2}}{\lesssim} \sum_{|\alpha| \le \ell+d_0} \| \partial_h^\alpha v \|_{H^{-\ell}(\Omega_\alpha)}. \tag{5.29}
\]
Substituting (5.28) and (5.29) into (5.27) yields (5.24), which then completes the proof.

5.4.3 The main error estimate in abstract form

Combining the auxiliary results in the previous sections, we can now estimate ‖u − v⋆‖ in both the L2- and the L∞-norm. These estimates are in abstract form in the sense that they are applicable for any (sufficiently smooth) functions u and v. To obtain these results, we follow [11, p. 104–106], while considering the generalized post-processor rather than the symmetric filter. Unlike before, our error estimates are valid in the entire domain, and the L∞-estimates are not restricted to continuous v.
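For concreteness, the kernel coefficients cⱼ of such a post-processor are fixed by the polynomial-reproduction property (5.13), which amounts to moment conditions on K. The sketch below (illustrative, not code from this thesis) computes them for the symmetric kernel with r + 1 = 3 hat functions (B-splines of order 2), i.e. the case p = 1:

```python
import numpy as np

# Moment conditions for polynomial reproduction: K reproduces polynomials of
# degree r iff  int K(y) y^m dy = delta_{m,0}  for m = 0, ..., r.
# Here K(y) = sum_j c_j psi^(2)(y - x_j), with hat functions at integer nodes.

def hat(x):
    """Central B-spline of order 2 (hat function supported on [-1, 1])."""
    return np.maximum(0.0, 1.0 - np.abs(x))

def trapz(f, y):
    """Trapezoid rule on a uniform grid."""
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(y)))

nodes = np.array([-1.0, 0.0, 1.0])     # kernel nodes x_j (symmetric case)
y = np.linspace(-2.5, 2.5, 20001)      # fine quadrature grid

# A[m, j] = int y^m psi^(2)(y - x_j) dy
A = np.array([[trapz(y**m * hat(y - xj), y) for xj in nodes] for m in range(3)])
c = np.linalg.solve(A, np.array([1.0, 0.0, 0.0]))
print(c)  # close to [-1/12, 7/6, -1/12]
```

The resulting coefficients (−1/12, 7/6, −1/12) are the known symmetric kernel for p = 1; near the boundary the nodes shift (cf. (5.3)–(5.4)), and solving the same moment system with shifted nodes yields the one-sided coefficients.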
Theorem 5.4.3 Consider the generalized post-processor using r + 1 B-splines of order ℓ, as discussed in Section 5.2.1. Let k ≤ r + 1. Then, for all u ∈ W^{k,∞}(Ω) and v ∈ L2(Ω):
\[
\| u - v^\star \|_{L^2(\Omega)} \lesssim \sum_{|\alpha| = k} \| D^\alpha u \|_{L^\infty(\Omega)} \, h^k + \sum_{|\alpha| \le \ell} \| \partial_h^\alpha (u - v) \|_{H^{-\ell}(\Omega_\alpha)}. \tag{5.30}
\]
Furthermore, for all u ∈ W^{k,∞}(Ω) and v ∈ L∞(Ω):
\[
\| u - v^\star \|_{L^\infty(\Omega)} \lesssim \sum_{|\alpha| = k} \| D^\alpha u \|_{L^\infty(\Omega)} \, h^k + \sum_{|\alpha| \le \ell + d_0} \| \partial_h^\alpha (u - v) \|_{H^{-\ell}(\Omega_\alpha)} + h^\ell \sum_{|\alpha| \le \ell} \| \partial_h^\alpha (u - v) \|_{L^\infty(\Omega_\alpha)}. \tag{5.31}
\]
To show Theorem 5.4.3, the main idea is to apply the triangle inequality to write:
\[
\| u - v^\star \| \le \| u - u^\star \| + \| (u - v)^\star \|. \tag{5.32}
\]
After that, we can apply Lemma 5.3.1 to the first term and Lemma 5.4.2 to the second. Altogether, the proof of Theorem 5.4.3 reads as follows:

Proof (of Theorem 5.4.3) To show (5.30) and (5.31), use the triangle inequality and the linearity of the post-processor to write:
\[
\| u - v^\star \|_{L^2(\Omega)} \le \| u - u^\star \|_{L^2(\Omega)} + \| u^\star - v^\star \|_{L^2(\Omega)} \lesssim \| u - u^\star \|_{L^\infty(\Omega)} + \| (u - v)^\star \|_{L^2(\Omega)},
\]
\[
\| u - v^\star \|_{L^\infty(\Omega)} \le \| u - u^\star \|_{L^\infty(\Omega)} + \| (u - v)^\star \|_{L^\infty(\Omega)}.
\]
Application of Lemma 5.4.1 to the last terms gives:
\[
\| u - v^\star \|_{L^2(\Omega)} \lesssim \| u - u^\star \|_{L^\infty(\Omega)} + \big\| \psi_h^{(\ell)} \star (u - v) \big\|_{L^2(\Omega_{\ell+\varepsilon})},
\]
\[
\| u - v^\star \|_{L^\infty(\Omega)} \lesssim \| u - u^\star \|_{L^\infty(\Omega)} + \big\| \psi_h^{(\ell)} \star (u - v) \big\|_{L^\infty(\Omega_{\ell+\varepsilon})}.
\]
Application of Lemma 5.3.1 to the first term and Lemma 5.4.2 to the second term in each inequality yields (5.30) and (5.31), which then completes the proof.

5.5 The main result for DG approximations

In the previous section, we obtained error estimates for arbitrary filtered functions. In this section, we study the implications for filtered DG approximations. Section 5.5.1 discusses the convergence of unfiltered DG approximations, including the superconvergence in the negative-order norm established in [22]. Section 5.5.2 uses these results to derive the main theorem of this chapter: the generalized post-processor improves the convergence rate of a DG approximation from order p + 1 to order 2p + 1 in the L2-norm, and to order min{2p + 1, 2p + 2 − d/2} in the L∞-norm.
Section 5.5.3 discusses why the same convergence rates result for the position-dependent post-processor, and explains the accuracy improvement we observed earlier during the numerical experiments in Section 4.5.

5.5.1 Unfiltered DG convergence

A DG approximation with polynomial degree p typically converges at rate p + 1 in the L2-norm for sufficiently smooth problems. In the negative-order norm, superconvergence of order 2p + 1 has been shown for the linear periodic hyperbolic problems under consideration [22]. In this section, we summarize these results, including the implications for L∞-norms and central differences of the error:

Lemma 5.5.1 Consider the linear periodic hyperbolic problem (5.8) with exact solution u and initial data u0. Suppose that uh is the DG approximation for u with polynomial degree p ≥ 1 and mesh size h, as discussed in Section 5.2.2. Let α be a d-dimensional multi-index. Then, for all initial data u0 ∈ W^{p+1+|α|,∞}(Ω) and corresponding u and uh:
\[
\| \partial_h^\alpha (u - u_h) \|_{H^{-(p+1)}(\Omega_\alpha)} \lesssim \| u_0 \|_{H^{p+1+|\alpha|}(\Omega)} \, h^{2p+1}. \tag{5.33}
\]
Furthermore, for all u0 ∈ W^{p+2+|α|,∞}(Ω) and corresponding u and uh, if the DG scheme yields convergence of order p + 1 in the sense that
\[
\| u - u_h \|_{L^2(\Omega)} \lesssim \| u_0 \|_{H^{p+2}(\Omega)} \, h^{p+1}, \tag{5.34}
\]
then,
\[
\| \partial_h^\alpha (u - u_h) \|_{L^2(\Omega_\alpha)} \lesssim \| u_0 \|_{H^{p+2+|\alpha|}(\Omega)} \, h^{p+1}, \tag{5.35}
\]
\[
\| \partial_h^\alpha (u - u_h) \|_{L^\infty(\Omega_\alpha)} \lesssim \max_{|\beta|=p+1} \big\| D^{\beta+\alpha} u \big\|_{L^\infty(\Omega)} \, h^{p+1-\frac{d}{2}} + \| u_0 \|_{H^{p+2+|\alpha|}(\Omega)} \, h^{p+1-\frac{d}{2}}. \tag{5.36}
\]
Before we show these results, we have the following remarks regarding assumption (5.34). First of all, for one-dimensional problems with A0 = 0 in (5.8), relation (5.34) has been shown in [21, p. 166, 189–199]. For stationary problems, the same result has been derived for certain two-dimensional triangulations (satisfying a so-called transversality condition) [61], and for d-dimensional meshes with a unique outflow edge per mesh element [19, 85, 20] (with a lower-order Sobolev norm in the right-hand side).
For some stationary problems, the convergence rate of order p + ½, as shown by Johnson and Pitkäranta [44], has been shown to be sharp [62]. Nevertheless, convergence of order p + 1 is usually observed in practice for (5.8) [65, 47]. In any case, assumption (5.34) is not needed to obtain the main error estimate in Section 5.5.2 hereafter in the L2-norm; only the result in the L∞-norm relies on it. Altogether, it seems reasonable to require (5.34) at this point.

To show Lemma 5.5.1, we use the following known error estimate in the negative-order norm⁹ [22, Theorem 3.3]:
\[
\| u - u_h \|_{H^{-(p+1)}(\Omega)} \lesssim \| u_0 \|_{H^{p+1}(\Omega)} \, h^{2p+1}. \tag{5.37}
\]
To obtain the L∞-estimate, we use the following inverse inequality [12, p. 680]:
\[
\| v \|_{L^\infty(\Omega)} \lesssim h^{-\frac{d}{2}} \| v \|_{L^2(\Omega)}, \qquad \forall v \in V, \tag{5.38}
\]
and the following property of the L2-projection Ph onto V [18, p. 121–129]:
\[
\| v - P_h v \|_{L^\infty(\Omega)} \lesssim \max_{|\beta|=p+1} \big\| D^\beta v \big\|_{L^\infty(\Omega)} \, h^{p+1}, \qquad \forall v \in W^{p+1,\infty}(\Omega). \tag{5.39}
\]
⁹Actually, it follows from [22, Theorem 3.3] that ‖u − uh‖_{H^{−(p+1)}(X₀)} ≲ ‖u0‖_{H^{p+1}(Ω)} h^{2p+1} for X₀ ⋐ Ω. However, the same result is true when we replace X₀ by Ω, due to the periodicity of the problem.

Altogether, Lemma 5.5.1 can be shown as follows:

Proof (of Lemma 5.5.1) Let Ω′ be the result of translating Ω (and the corresponding mesh) by a distance h/2 in each direction, i.e.
\[
\Omega' = \left( \tfrac{h}{2},\, 1 + \tfrac{h}{2} \right) \times \cdots \times \left( \tfrac{h}{2},\, 1 + \tfrac{h}{2} \right).
\]
Next, consider ∂h^α u, ∂h^α u0 and ∂h^α uh on this translated domain (using periodic extensions). Because the problem is linear and periodic, ∂h^α u is a solution to (5.8) for the domain Ω′ (with initial condition ∂h^α u0). At the same time, ∂h^α uh is the DG approximation for ∂h^α u (on the translated mesh). Hence, we can apply (5.34) and (5.37) for this translated problem on Ω′ (also cf. [22, p. 590]):
\[
\| \partial_h^\alpha (u - u_h) \|_{L^2(\Omega')} \lesssim \| \partial_h^\alpha u_0 \|_{H^{p+2}(\Omega')} \, h^{p+1}, \qquad
\| \partial_h^\alpha (u - u_h) \|_{H^{-(p+1)}(\Omega')} \lesssim \| \partial_h^\alpha u_0 \|_{H^{p+1}(\Omega')} \, h^{2p+1}.
\]
Next, observe that it follows from (5.18), (5.19), (5.20) (and the periodicity) that, for 1 ≤ p ≤ ∞:
\[
\| \partial_h^\alpha u \|_{L^p(\Omega')} \lesssim \| D^\alpha u \|_{L^p(\Omega')}. \tag{5.40}
\]
With this reasoning, we obtain:
\[
\| \partial_h^\alpha (u - u_h) \|_{L^2(\Omega')} \lesssim \| u_0 \|_{H^{p+2+|\alpha|}(\Omega')} \, h^{p+1}, \qquad
\| \partial_h^\alpha (u - u_h) \|_{H^{-(p+1)}(\Omega')} \lesssim \| u_0 \|_{H^{p+1+|\alpha|}(\Omega')} \, h^{2p+1}. \tag{5.41}
\]
Using periodicity and translation invariance again, we may replace Ω′ by Ω, and we arrive at (5.35) and (5.33) (using Ωα ⊆ Ω).

To show (5.36), we consider the translated domain again, and use the triangle inequality and the inverse inequality (5.38) to write:
\[
\| \partial_h^\alpha (u - u_h) \|_{L^\infty(\Omega')}
\le \| \partial_h^\alpha u - P_h \partial_h^\alpha u \|_{L^\infty(\Omega')} + \| P_h \partial_h^\alpha u - \partial_h^\alpha u_h \|_{L^\infty(\Omega')}
\overset{(5.38)}{\lesssim} \| \partial_h^\alpha u - P_h \partial_h^\alpha u \|_{L^\infty(\Omega')} + h^{-\frac{d}{2}} \| P_h \partial_h^\alpha u - \partial_h^\alpha u_h \|_{L^2(\Omega')}.
\]
Next, apply the triangle inequality again, and use the property of the polynomial projection (5.39):
\[
\| \partial_h^\alpha (u - u_h) \|_{L^\infty(\Omega')}
\lesssim \| \partial_h^\alpha u - P_h \partial_h^\alpha u \|_{L^\infty(\Omega')} + h^{-\frac{d}{2}} \| P_h \partial_h^\alpha u - \partial_h^\alpha u \|_{L^2(\Omega')} + h^{-\frac{d}{2}} \| \partial_h^\alpha u - \partial_h^\alpha u_h \|_{L^2(\Omega')}
\lesssim \big( 1 + h^{-\frac{d}{2}} \big) \| \partial_h^\alpha u - P_h \partial_h^\alpha u \|_{L^\infty(\Omega')} + h^{-\frac{d}{2}} \| \partial_h^\alpha u - \partial_h^\alpha u_h \|_{L^2(\Omega')}
\overset{(5.39)}{\lesssim} \big( 1 + h^{-\frac{d}{2}} \big) \max_{|\beta|=p+1} \big\| D^\beta \partial_h^\alpha u \big\|_{L^\infty(\Omega')} \, h^{p+1} + h^{-\frac{d}{2}} \| \partial_h^\alpha (u - u_h) \|_{L^2(\Omega')}
\overset{(5.41),\,(5.40)}{\lesssim} \max_{|\beta|=p+1} \big\| D^{\beta+\alpha} u \big\|_{L^\infty(\Omega')} \, h^{p+1-\frac{d}{2}} + \| u_0 \|_{H^{p+2+|\alpha|}(\Omega')} \, h^{p+1-\frac{d}{2}}.
\]
Using periodicity and translation invariance again, we may replace Ω′ by Ω, and we arrive at (5.36) (using Ωα ⊆ Ω), which then completes the proof.

5.5.2 Main result: extracting DG superconvergence

We now arrive at the main theorem of this chapter: the generalized post-processor improves the convergence rate of a DG approximation from order p + 1 to order 2p + 1 in the L2-norm, and to order min{2p + 1, 2p + 2 − d/2} in the L∞-norm. This theorem extends existing error estimates for the symmetric post-processor in the L2-norm [22]. Below, we also include the L∞-norm in the analysis.

Theorem 5.5.2 (Main result) Consider the linear periodic hyperbolic problem (5.8) with exact solution u and initial data u0. Suppose that uh is the DG approximation for u with polynomial degree p ≥ 1 and mesh size h, as discussed in Section 5.2.2. Furthermore, consider the generalized post-processor with at least 2p + 1 B-splines of order p + 1 (cf. Section 5.2.1). Then, assuming u0, u ∈ W^{2p+3+[d/2],∞}(Ω):
\[
\| u - u_h^\star \|_{L^2(\Omega)} \lesssim h^{2p+1}. \tag{5.42}
\]
If, furthermore, (5.34) holds, then,
\[
\| u - u_h^\star \|_{L^\infty(\Omega)} \lesssim h^{\min\{2p+1,\; 2p+2-d/2\}}. \tag{5.43}
\]
The constants involved may depend on u and u0, but not on uh (as indicated in the proof below).

To show Theorem 5.5.2, we substitute Lemma 5.5.1 into Theorem 5.4.3:

Proof (of Theorem 5.5.2) Let k be any positive integer not larger than the number of kernel nodes such that u ∈ W^{k,∞}(Ω). Then, substitution of v = uh and ℓ = p + 1 into (5.30) and (5.31) yields:
\[
\| u - u_h^\star \|_{L^2(\Omega)} \lesssim \sum_{|\alpha|=k} \| D^\alpha u \|_{L^\infty(\Omega)} h^k + \sum_{|\alpha| \le p+1} \| \partial_h^\alpha (u - u_h) \|_{H^{-(p+1)}(\Omega_\alpha)},
\]
\[
\| u - u_h^\star \|_{L^\infty(\Omega)} \lesssim \sum_{|\alpha|=k} \| D^\alpha u \|_{L^\infty(\Omega)} h^k + \sum_{|\alpha| \le p+1+d_0} \| \partial_h^\alpha (u - u_h) \|_{H^{-(p+1)}(\Omega_\alpha)} + h^{p+1} \sum_{|\alpha| \le p+1} \| \partial_h^\alpha (u - u_h) \|_{L^\infty(\Omega_\alpha)}.
\]
Application of Lemma 5.5.1 gives (using u0, u ∈ W^{2p+3+[d/2],∞}(Ω) by assumption):
\[
\| u - u_h^\star \|_{L^2(\Omega)} \lesssim \sum_{|\alpha|=k} \| D^\alpha u \|_{L^\infty(\Omega)} h^k + \sum_{|\alpha| \le p+1} \| u_0 \|_{H^{p+1+|\alpha|}(\Omega)} h^{2p+1},
\]
\[
\| u - u_h^\star \|_{L^\infty(\Omega)} \lesssim \sum_{|\alpha|=k} \| D^\alpha u \|_{L^\infty(\Omega)} h^k + \sum_{|\alpha| \le p+1+d_0} \| u_0 \|_{H^{p+1+|\alpha|}(\Omega)} h^{2p+1} + \sum_{|\alpha| \le p+1} \Big( \max_{|\beta|=p+1} \big\| D^{\beta+\alpha} u \big\|_{L^\infty(\Omega)} + \| u_0 \|_{H^{p+2+|\alpha|}(\Omega)} \Big) h^{2p+2-\frac{d}{2}}.
\]
Choosing k = 2p + 1, this can be simplified to (5.42) and (5.43), which then completes the proof.

5.5.3 Implications for the position-dependent post-processor

Now that we have established error estimates for the generalized post-processor, the same convergence rates follow automatically for the position-dependent post-processor. This is because the latter is based on a convex combination of two generalized post-processors with 2p + 1 and 4p + 1 B-splines respectively (cf. Section 4.4.2).
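The convex combination can be sketched in a few lines. The helper below is hypothetical (not thesis code), and the support radius (3p + 1)h/2 of the symmetric kernel, built from 2p + 1 B-splines of order p + 1, is our own assumption for the ramp width:

```python
import numpy as np

# Sketch of a blending function theta(x) for the position-dependent
# post-processor: theta = 1 where the symmetric 2p+1-spline kernel fits inside
# the domain, theta = 0 near the boundary (one-sided 4p+1-spline kernel), with
# a linear ramp in between. The half-width (3p+1)h/2 is an assumption here.

def theta(x, h, p, a=0.0, b=1.0):
    half_width = (3 * p + 1) * h / 2.0
    d = np.minimum(x - a, b - x)        # distance to the nearest boundary
    return np.clip(d / half_width, 0.0, 1.0)

x = np.linspace(0.0, 1.0, 11)
t = theta(x, h=0.05, p=1)
print(t[0], t[5], t[-1])  # 0.0 at both boundaries, 1.0 in the interior
```

The filtered value is then formed as θ(x) u⋆_{h,2p+1}(x) + (1 − θ(x)) u⋆_{h,4p+1}(x), which is exactly the convex combination used in the estimate below.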
More precisely, we claim that Theorem 5.5.2 is also valid if uh⋆ is the result of applying the position-dependent post-processor to uh (the constants involved are typically different, though). To see this, consider Theorem 5.5.2 and suppose that u⋆_{h,2p+1} and u⋆_{h,4p+1} are the result of applying the generalized post-processor to uh with 2p + 1 and 4p + 1 B-splines respectively. Then, for θ : Ω → [0, 1], we have in any norm:
\[
\| u - u_h^\star \| := \big\| u - \big( \theta\, u_{h,2p+1}^\star + (1-\theta)\, u_{h,4p+1}^\star \big) \big\| \le \big\| u - u_{h,2p+1}^\star \big\| + \big\| u - u_{h,4p+1}^\star \big\|.
\]
Application of Theorem 5.5.2 to both terms then yields (5.42) and (5.43) for the position-dependent post-processor. This means that it improves the convergence rate of a DG approximation from order p + 1 to order 2p + 1 in the L2-norm, and to order min{2p + 1, 2p + 2 − d/2} in the L∞-norm.

We can also use the analysis in this chapter to explain the accuracy improvement near the boundary we observed earlier during the numerical experiments in Section 4.5. Similar to (5.32), we may write for the pointwise error in x̄ ∈ Ω:
\[
|u - u_h^\star|(\bar{x}) \le |u - u^\star|(\bar{x}) + |u^\star - u_h^\star|(\bar{x}). \tag{5.44}
\]
Following the analysis in the previous sections, the accuracy of the second term is O(h^{2p+1}) (in the L2-norm). For the first term, we can apply the bounds obtained in the proof of Lemma 5.3.1 (we omit the dependency on u for simplicity):
\[
|u - u^\star|(\bar{x}) \lesssim \theta(\bar{x})\, \| K_{2p+1} \|_{L^1(\mathbb{R}^d)}\, h^{2p+1} + \big( 1 - \theta(\bar{x}) \big)\, \| K_{4p+1} \|_{L^1(\mathbb{R}^d)}\, h^{4p+1}.
\]
Here, K_{2p+1} and K_{4p+1} denote the kernels corresponding to the generalized post-processor with 2p + 1 and 4p + 1 B-splines respectively. From this expression it can be seen that this part of the error is influenced by the L1-norm of the kernel. The latter becomes larger as the kernel becomes less symmetric. At the same time, we are forced to switch to non-symmetric kernels near the boundary. Using θ = 0 in this region (cf. Section 4.4.2), we can compensate the relatively large L1-norm of the kernel by a larger order (4p + 1 rather than 2p + 1).
This basically explains our observations in Section 4.5: the use of extra B-splines in the kernel near the boundary yields better accuracy.

5.6 Conclusion

This chapter derives theoretical error estimates for the position-dependent post-processor proposed in Chapter 4 for DG (upwind) approximations for linear hyperbolic problems. We have found that it enhances the accuracy from order p + 1 to order 2p + 1 in the L2-norm, and to order min{2p + 1, 2p + 2 − d/2} in the L∞-norm (where p is the polynomial degree and d is the spatial dimension). This expands the L2-estimates in [22] for the symmetric post-processor, which cannot be applied near non-periodic boundaries and shocks. Altogether, our theory explains the superconvergence observed during the numerical experiments in Chapter 4, and guarantees similar results for a certain class of linear hyperbolic problems. Furthermore, our abstract formulation can be used to obtain similar error estimates for any approximation for which superconvergence in the negative-order norm can be shown.

6 Conclusion

6.1 Introduction

This thesis is focused on the linear systems and hidden accuracy of Discontinuous Galerkin (DG) discretizations. In particular, it discusses the two-level preconditioner in [24], investigates an alternative strategy in the form of a deflation method with the same coarse space, and studies the impact of both techniques on the convergence of the Conjugate Gradient (CG) method for Symmetric Interior Penalty (discontinuous) Galerkin (SIPG) discretizations for diffusion problems. Moreover, this thesis considers the one-sided post-processor in [64], proposes the position-dependent post-processor, and analyzes the impact of both strategies on the accuracy and smoothness of DG (upwind) approximations for hyperbolic problems.
The remainder of this chapter summarizes the main conclusions of this research in the following manner: Section 6.2 considers the two-level methods for solving the linear systems. Section 6.3 discusses post-processing for extracting the hidden accuracy. Finally, Section 6.4 provides suggestions for future research.

6.2 Linear DG systems

We have found that both the two-level preconditioner and the deflation variant yield scalable CG convergence (independent of the mesh element diameter). This has been shown theoretically for any polynomial degree p ≥ 1, which extends the available analysis for the preconditioning variant for p = 1 in [24]. The scalability of both methods is also confirmed by our numerical experiments for various diffusion problems with extreme contrasts in the coefficients. These include problems mimicking bubbly flow, ground water flow, and the presence of layers of sand and shale in oil reservoir simulations. These experiments also demonstrate that both two-level methods only yield fast CG convergence provided that the penalty parameter is chosen dependent on local values of the diffusion coefficient (using the largest limit value at discontinuities). The latter also benefits the accuracy of the SIPG discretization.

At the coarse level, both two-level methods need to solve the same coarse system. The latter is similar to a system resulting from a central difference scheme, for which very efficient solution techniques are readily available. In that sense, both two-level methods transform the original challenging DG system into a more familiar problem. It has been verified that the coarse systems can be solved efficiently by using an inexact solver with relatively low accuracy, such as the CG method combined with a scalable algebraic multigrid preconditioner.
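The deflation strategy can be sketched generically in a few lines. The following is an illustrative implementation of deflated CG on a 1D Poisson matrix with a piecewise-constant coarse space Z (mirroring the p = 0 coarse space, but not the thesis's SIPG code):

```python
import numpy as np

# Deflated CG (sketch): solve A x = b with projector P = I - A Q, where
# Q = Z E^{-1} Z^T and E = Z^T A Z. CG runs on the deflated system
# P A x_hat = P b; the solution is recovered as x = Q b + (I - Q A) x_hat.

def deflated_cg(A, b, Z, tol=1e-10, maxit=1000):
    E = Z.T @ A @ Z
    Q = Z @ np.linalg.solve(E, Z.T)          # coarse-grid correction operator
    P = lambda v: v - A @ (Q @ v)            # deflation projector
    x = np.zeros_like(b)
    r = P(b)
    p_dir = r.copy()
    rr = r @ r
    for _ in range(maxit):
        Ap = P(A @ p_dir)
        alpha = rr / (p_dir @ Ap)
        x += alpha * p_dir
        r -= alpha * Ap
        rr_new = r @ r
        if np.sqrt(rr_new) < tol:
            break
        p_dir = r + (rr_new / rr) * p_dir
        rr = rr_new
    return Q @ b + x - Q @ (A @ x)           # x = Q b + (I - Q A) x_hat

# 1D Poisson matrix and a piecewise-constant coarse space (8 blocks of 8)
n = 64
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
Z = np.kron(np.eye(8), np.ones((8, 1)))
b = np.random.default_rng(0).standard_normal(n)
x = deflated_cg(A, b, Z)
print(np.linalg.norm(A @ x - b))  # small residual
```

The coarse solve (here a dense solve with E) is exactly the step that, per the observations above, may be replaced by an inexact inner solver such as AMG-preconditioned CG.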
The main difference between the two methods is that the deflation method can be implemented by skipping one of the two smoothing steps in the algorithm for the preconditioning variant. This may be particularly advantageous for expensive smoothers, although the basic block Jacobi smoother was found to be highly effective for the problems under consideration. Despite the lower costs per iteration, we have found that the deflation method can require fewer iterations, especially for large problems. As a result, it can be up to 35% faster than the original preconditioner (in terms of the overall computational time in seconds). That is, when damping of the smoother is not taken into account. If an optimal damping parameter is used, both two-level strategies yield similar efficiency (deflation appears unaffected by damping). However, it remains an open question how the damping parameter can best be selected in practice.

Altogether, this work contributes to shorter computational times of DG discretizations, e.g. for oil reservoir simulations. This strengthens the increasing consensus that DG methods can be an effective alternative for classical discretization schemes, such as the Finite Volume Method (FVM).

6.3 Hidden DG accuracy

We have found that the proposed position-dependent post-processor enhances the DG convergence from order p + 1 to order 2p + 1. This result has been shown theoretically in both the L2- and the L∞-norm (with a slightly lower order in the L∞-norm for higher-dimensional problems). This expands the L2-estimates in [22] for the symmetric post-processor, which cannot be applied near non-periodic boundaries and shocks. The aforementioned superconvergence of order 2p + 1 is also demonstrated by our numerical experiments. These include problems with non-periodic boundary conditions, problems with stationary shocks, a two-dimensional system, and a streamline visualization example.
These experiments illustrate that both the position-dependent post-processor and the original one-sided technique produce the same results in the domain interior. In that region, both techniques apply the symmetric post-processor, resulting in the usual high level of accuracy and smoothness. The differences occur at the boundary of the domain: in those regions, the position-dependent post-processor yields a more realistic smoothness without the previous artificial stair-stepping effect. Furthermore, unlike before, it improves the local accuracy of the DG approximation in the entire domain, not just in order but also in magnitude.

Altogether, this work contributes to more accurate visualization of DG approximations, e.g. in the form of streamlines. Furthermore, it sustains the idea that numerical approximations may contain more information than we originally thought.

6.4 Future research

Suggestions for future research include the following:

1. The comparison of both two-level methods could be continued for more advanced applications, e.g. for three-dimensional unstructured meshes, or a strongly anisotropic diffusion tensor.

2. According to the present study, the coarse systems in the two-level methods can be solved efficiently using an inexact CG solver. To improve the efficiency further, the Flexible CG (FCG) method could be studied to reduce the number of inner iterations (cf. Section 2.4.3).

3. The derived convergence theory for the two-level methods is based on the assumption that the scheme is coercive (or the condition used in Section 3.5.3). To ensure this in applications, practical conditions for the diffusion-dependent penalty parameter could be derived, possibly by applying available global strategies in a local fashion (cf. Section 2.2.3).

4. The application of the two-level deflation method could be extended to non-symmetric DG schemes, such as the Non-symmetric Interior Penalty Galerkin (NIPG) method.
The latter could be less sensitive to the choice of the penalty parameter.

5. The theoretical error estimates for the post-processor apply for linear hyperbolic problems with constant coefficients and periodic boundary conditions. Nevertheless, the numerical results in this thesis suggest that it is effective for a larger class of problems, including Dirichlet boundary conditions and variable coefficients. Theoretical support for these findings could be a welcome step towards real-life applications. Additionally, the position-dependent post-processor could be analyzed for unstructured meshes and non-linear problems, e.g. by following [48, 42].

6. For large values of the polynomial degree p, further modification of the position-dependent post-processor may be required to avoid that the kernel support becomes too large to fit the geometric setting, and to avoid that round-off errors start to dominate (as the magnitude of the one-sided kernel coefficients increases rapidly with p).

7. Numerical results suggest that there exists a relation between the post-processor and the L2-projection onto the space of piecewise polynomials of degree 2p + 1 (cf. Section 4.5.7). This could be studied theoretically.

Acknowledgments

For their contributions to this dissertation, I would like to express my sincere gratitude to ...

... Kees Vuik, for being my promotor and supervisor. Kees, we have been working together since I chose "Een 'lastig' probleem" for my Bachelor's thesis. Over the years, you have become like an academic dad to me. Without your faith and support, this dissertation would not have been written.

... Ben de Pagter and Wim van Horssen, for supporting my position as a PhD student at DIAM.

... my committee, for taking the time to evaluate this thesis. Special thanks are due to Yvan Notay, for sending extensive and valuable comments on Chapter 3.
Furthermore, I am thankful to Jan Dirk Jansen, for giving relevant feedback on early versions of Chapter 2.

... Scott MacLachlan, for offering useful suggestions with respect to damping at Copper Mountain.

... Jan van Neerven, Markus Haase, and Mark Veraar, for helping me with derivations with an inspiring passion for functional analysis.

... Kees Lemmens, for turning all my computer issues into enjoyable mini classes on Linux. I can never go back to Windows now.

... my colleagues in the Numerical Analysis group, for sharing interesting and amusing ideas during tea talks and at the coffee machine. I have warm memories of the times we were celebrating Sinterklaas, having a Pakistani dinner, and losing with style to our computer science colleagues at karting.

... Aletta Wubben and Louise van Swaaij, for teaching me the soft skills that are easily overlooked at a technical university.

... Cor Kraaikamp, for offering a cup of tea and alternative views at just the right moments.

... my friends and family, for supporting me, inviting me to fun outings, and reminding me of who I was before I started this research.

... Sonja Cox, for being my paranymph and for analyzing the world with me, including ourselves and, yes, our analyzing habits... Sonja, you are a wonderful friend and I hope you will move back to the Netherlands soon.

... my parents, Gé and Mary van Slingerland, for giving me "een goede basis" these last thirty years. Mom, dad, your unconditional love and support, like on that day when it was snowing heavily, move me every time. To quote the song, "You are the wind beneath my wings".

... Ewoud Marijt, for being my paranymph and for encouraging me to think in terms of possibilities. Ewoud, you always manage to make me smile, even if I, for some reason, try not to. I am incredibly lucky to have you by my side.
Paulien van Slingerland
Delft, May 2013

Curriculum Vitae

Paulien van Slingerland was born on May 19, 1983, in Leiderdorp, The Netherlands. After completing her secondary education at the Stedelijk Gymnasium Leiden in 2001, she enrolled in Applied Mathematics at Delft University of Technology. She was awarded the CIVI aanmoedigingsprijs for her propaedeutics diploma (cum laude) in 2002, after which she obtained her MSc degree (cum laude) in 2007. The time integration scheme that resulted from her Master’s thesis is still being used by Deltares for simulating water quality.

In September 2007, Paulien started working as a PhD student at Delft University of Technology. Initially, she studied coastal wave modeling in the Fluid Mechanics group (Faculty of Civil Engineering and Geosciences). After a year she switched to the Numerical Analysis group (Faculty of Electrical Engineering, Mathematics and Computer Science), which has resulted in this thesis, four refereed journal papers (one is accepted, three are in submission), and a prize for the best poster presentation during the thirty-sixth Woudschoten conference of the Werkgemeenschap Scientific Computing. Since January 2013, Paulien has been working at TNO as a trainee.

Publications

Journal papers

- P. van Slingerland, C. Vuik. Scalable two-level preconditioning and deflation based on a piecewise constant subspace for (SIP)DG systems. Submitted to JCAM.
- P. van Slingerland, C. Vuik. Fast linear solver for pressure computation in layered domains. Submitted to Comput. Geosci.
- L. Ji, P. van Slingerland, J.K. Ryan, C. Vuik. Superconvergent error estimates for position-dependent smoothness-increasing accuracy-conserving (SIAC) post-processing of Discontinuous Galerkin solutions. Accepted for publication in Math. Comp.
- P. van Slingerland, J.K. Ryan, C. Vuik. Position-dependent smoothness-increasing accuracy-conserving (SIAC) filtering for improving Discontinuous Galerkin solutions. SIAM J. Sci. Comput., 33 (2011), pp. 802–825.

Technical reports

- P. van Slingerland, C. Vuik. Scalable two-level preconditioning and deflation based on a piecewise constant subspace for (SIP)DG systems. DIAM report 12-11, Delft University of Technology, 2012.
- P. van Slingerland, C. Vuik. Fast linear solver for pressure computation in layered domains. DIAM report 12-10, Delft University of Technology, 2012.
- P. van Slingerland, C. Vuik. Spectral two-level deflation for DG: a preconditioner for CG that does not need symmetry. DIAM report 11-12, Delft University of Technology, 2011.
- P. van Slingerland, J.K. Ryan, C. Vuik. Smoothness-increasing convergence-conserving spline filters applied to streamline visualisation of DG approximations. DIAM report 09-06, Delft University of Technology, 2009.
- P. van Slingerland, M. Borsboom, C. Vuik. A local theta scheme for advection problems with strongly varying meshes and velocity profiles. DIAM report 08-17, Delft University of Technology, 2008.

Talks at international conferences

- Fast linear solver for pressure computation in layered domains. 13th European conference on the mathematics of oil recovery (ECMOR). Biarritz, France, 2012.
- A preconditioner for CG that does not need symmetry. Twelfth Copper Mountain conference on iterative methods (COPPER). Copper Mountain (Colorado), United States of America, 2012.
- Exploiting the nested block structure of DG matrices: a block ILU preconditioner with deflation and a spectral two-level strategy. International conference on preconditioning techniques for scientific and industrial applications (PRECOND). Bordeaux, France, 2011.
- A local theta scheme for advection problems with strongly varying meshes and velocity profiles. The mathematics of finite elements and applications (MAFELAP). Brunel University, London, England, 2009.
- A robust higher-order variable-θ scheme for the advection-diffusion equation on unstructured grids. 2nd international conference on high order non-oscillatory methods for wave propagation, transport and flow problems. Trento, Italy, 2007.

Other talks

- Spectral two-level deflation for DG: a preconditioner for CG that does not need symmetry. Discontinuous Galerkin methods in computational electromagnetics: a workshop on recent developments in theory and applications. National Aerospace Laboratory of the Netherlands, Amsterdam, The Netherlands, 2011.
- Extracting the hidden accuracy of DG solutions. Spring meeting Werkgemeenschap Scientific Computing. Antwerp, Belgium, 2010.
- A local theta scheme for advection (dominated) problems with strongly varying meshes and velocity profiles. Invited talk at the Institute of Applied Mathematics. Dortmund University of Technology, Dortmund, Germany, 2009.
- An accurate and robust local theta FCT scheme for the advection equation for strongly varying meshes and velocity profiles. Meeting Contactgroep Numerieke Stromingsleer. University of Twente, Enschede, The Netherlands, 2007.

Poster presentations

- A preconditioner for CG that does not need symmetry. NWO-JSPS joint seminar: numerical linear algebra - algorithms, applications, and training. Delft University of Technology, Delft, The Netherlands, 2012.
- A preconditioner for CG that does not need symmetry. Thirty-sixth Woudschoten conference, Werkgemeenschap Scientific Computing. Zutphen, The Netherlands, 2011.
- The hidden accuracy of DG. Thirty-fifth Woudschoten conference, Werkgemeenschap Scientific Computing. Zutphen, The Netherlands, 2010.
- Post-processing for DG: improving the accuracy near boundaries and shocks. Opening symposium Applied Mathematics Institute. Delft University of Technology, Delft, The Netherlands, 2010.
- Post-processing for DG. Thirty-fourth Woudschoten conference, Werkgemeenschap Scientific Computing. Zutphen, The Netherlands, 2009.
- Smoothness-increasing accuracy-conserving spline filters applied to streamline visualisation of DG approximations. Burgersdag 2009, J.M. Burgerscentrum. Eindhoven University of Technology, Eindhoven, The Netherlands, 2009.

Bibliography

[1] S. Adjerid, K. D. Devine, J. E. Flaherty, and L. Krivodonova. A posteriori error estimation for discontinuous Galerkin solutions of hyperbolic problems. Comput. Methods Appl. Mech. Engrg., 191(11-12):1097–1112, 2002.
[2] S. Adjerid and T. C. Massey. Superconvergence of discontinuous Galerkin solutions for a nonlinear scalar hyperbolic problem. Comput. Methods Appl. Mech. Engrg., 195(25-28):3331–3346, 2006.
[3] P. F. Antonietti and B. Ayuso. Schwarz domain decomposition preconditioners for discontinuous Galerkin approximations of elliptic problems: non-overlapping case. M2AN Math. Model. Numer. Anal., 41(1):21–54, 2007.
[4] D. N. Arnold, F. Brezzi, B. Cockburn, and L. D. Marini. Unified analysis of discontinuous Galerkin methods for elliptic problems. SIAM J. Numer. Anal., 39(5):1749–1779 (electronic), 2002.
[5] O. Axelsson. Iterative solution methods. Cambridge University Press, Cambridge, 1994.
[6] O. Axelsson and P. S. Vassilevski. Variable-step multilevel preconditioning methods. I. Selfadjoint and positive definite elliptic problems. Numer. Linear Algebra Appl., 1(1):75–101, 1994.
[7] B. Ayuso de Dios, M. Holst, Y. Zhu, and L. Zikatanov. Multilevel preconditioners for discontinuous Galerkin approximations of elliptic problems with jump coefficients. arXiv:1012.1287v2, 2012.
[8] D. H. Bailey, Y. Hida, X. S. Li, and B. Thompson. ARPREC: an arbitrary precision computation package. Technical Report 53651, Lawrence Berkeley National Laboratory, September 2002.
[9] V. I. Bogachev. Measure theory. Springer-Verlag, Berlin, 2007.
[10] J. H. Bramble and A. H. Schatz. Estimates for spline projections. Rev. Française Automat. Informat. Recherche Opérationnelle, 10(R-2):5–37, 1976.
[11] J. H. Bramble and A. H. Schatz. Higher order local accuracy by averaging in the finite element method. Math. Comp., 31(137):94–111, 1977.
[12] J. H. Bramble, J. A. Nitsche, and A. H. Schatz. Maximum-norm interior estimates for Ritz-Galerkin methods. Math. Comput., 29:677–688, 1975.
[13] S. C. Brenner and J. Zhao. Convergence of multigrid algorithms for interior penalty methods. Appl. Numer. Anal. Comput. Math., 2(1):3–18, 2005.
[14] V. I. Burenkov. Sobolev spaces on domains, volume 137 of Teubner-Texte zur Mathematik [Teubner Texts in Mathematics]. B. G. Teubner Verlagsgesellschaft mbH, Stuttgart, 1998.
[15] E. Burman and P. Zunino. A domain decomposition method based on weighted interior penalties for advection-diffusion-reaction problems. SIAM J. Numer. Anal., 44(4):1612–1638 (electronic), 2006.
[16] W. Cai, D. Gottlieb, and C.-W. Shu. On one-sided filters for spectral Fourier approximations of discontinuous functions. SIAM J. Numer. Anal., 29(4):905–916, 1992.
[17] P. Castillo. Performance of discontinuous Galerkin methods for elliptic PDEs. SIAM J. Sci. Comput., 24(2):524–547, 2002.
[18] P. G. Ciarlet. The finite element method for elliptic problems, volume 4 of Studies in Mathematics and its Applications. North-Holland, New York, 1978.
[19] B. Cockburn, B. Dong, and J. Guzmán. Optimal convergence of the original DG method for the transport-reaction equation on special meshes. SIAM J. Numer. Anal., 46(3):1250–1265, 2008.
[20] B. Cockburn, B. Dong, J. Guzmán, and J. Qian. Optimal convergence of the original DG method on special meshes for variable transport velocity. SIAM J. Numer. Anal., 48(1):133–146, 2010.
[21] B. Cockburn, C. Johnson, C.-W. Shu, and E. Tadmor. Advanced numerical approximation of nonlinear hyperbolic equations, volume 1697 of Lecture Notes in Mathematics. Springer-Verlag, Berlin, 1998. Papers from the C.I.M.E. Summer School held in Cetraro, June 23–28, 1997. Edited by Alfio Quarteroni, Fondazione C.I.M.E. [C.I.M.E. Foundation].
[22] B. Cockburn, M. Luskin, C.-W. Shu, and E. Süli. Enhanced accuracy by post-processing for finite element methods for hyperbolic equations. Math. Comp., 72(242):577–606, 2003.
[23] S. Curtis, R. M. Kirby, J. K. Ryan, and C.-W. Shu. Postprocessing for the discontinuous Galerkin method over nonuniform meshes. SIAM J. Sci. Comput., 30(1):272–289, 2007.
[24] V. A. Dobrev, R. D. Lazarov, P. S. Vassilevski, and L. T. Zikatanov. Two-level preconditioning of discontinuous Galerkin approximations of second-order elliptic equations. Numer. Linear Algebra Appl., 13(9):753–770, 2006.
[25] V. A. Dobrev, R. D. Lazarov, and L. T. Zikatanov. Preconditioning of symmetric interior penalty discontinuous Galerkin FEM for elliptic problems. In Domain decomposition methods in science and engineering XVII, volume 60 of Lect. Notes Comput. Sci. Eng., pages 33–44. Springer, Berlin, 2008.
[26] Z. Dostál. Conjugate gradient method with preconditioning by projector. International Journal of Computer Mathematics, 23(3-4):315–323, 1988.
[27] M. Dryja. On discontinuous Galerkin methods for elliptic problems with discontinuous coefficients. Comput. Methods Appl. Math., 3(1):76–85 (electronic), 2003. Dedicated to Raytcho Lazarov.
[28] Y. Epshteyn and B. Rivière. Estimation of penalty parameters for symmetric interior penalty Galerkin methods. J. Comput. Appl. Math., 206(2):843–872, 2007.
[29] A. Ern, A. F. Stephansen, and P. Zunino. A discontinuous Galerkin method with weighted averages for advection-diffusion equations with locally small and anisotropic diffusivity. IMA J. Numer. Anal., 29(2):235–256, 2009.
[30] L. C. Evans. Partial differential equations, volume 19 of Graduate Studies in Mathematics. American Mathematical Society, Providence, RI, 1998.
[31] R. D. Falgout, P. S. Vassilevski, and L. T. Zikatanov. On two-grid convergence estimates. Numer. Linear Algebra Appl., 12(5-6):471–494, 2005.
[32] X. Feng and O. A. Karakashian. Two-level additive Schwarz methods for a discontinuous Galerkin approximation of second order elliptic problems. SIAM J. Numer. Anal., 39(4):1343–1365 (electronic), 2001.
[33] K. J. Fidkowski, T. A. Oliver, J. Lu, and D. L. Darmofal. p-Multigrid solution of high-order discontinuous Galerkin discretizations of the compressible Navier-Stokes equations. J. Comput. Phys., 207(1):92–113, 2005.
[34] J. Gopalakrishnan and G. Kanschat. A multilevel discontinuous Galerkin method. Numer. Math., 95(3):527–550, 2003.
[35] D. Gottlieb, C.-W. Shu, A. Solomonoff, and H. Vandeven. On the Gibbs phenomenon. I. Recovering exponential accuracy from the Fourier partial sum of a nonperiodic analytic function. J. Comput. Appl. Math., 43(1-2):81–98, 1992.
[36] S. Gottlieb and C.-W. Shu. Total variation diminishing Runge-Kutta schemes. Mathematics of Computation, 67:73–85, 1998.
[37] S. Gottlieb, C.-W. Shu, and E. Tadmor. Strong stability preserving high-order time discretization methods. SIAM Review, 43:89–112, 2001.
[38] J. S. Hesthaven, S. Gottlieb, and D. Gottlieb. Spectral methods for time-dependent problems, volume 21 of Cambridge Monographs on Applied and Computational Mathematics. Cambridge University Press, Cambridge, 2007.
[39] J. S. Hesthaven and T. Warburton. Nodal discontinuous Galerkin methods, volume 54 of Texts in Applied Mathematics. Springer, New York, 2008. Algorithms, analysis, and applications.
[40] R. A. Horn and C. R. Johnson. Matrix analysis. Cambridge University Press, Cambridge, 1988.
[41] L. Ji, Y. Xu, and J. K. Ryan. Accuracy-enhancement of discontinuous Galerkin solutions for convection-diffusion equations in multiple-dimensions. Math. Comp., 81(280):1929–1950, 2012.
[42] L. Ji, Y. Xu, and J. K. Ryan. Negative-order norm estimates for nonlinear hyperbolic conservation laws. J. Sci. Comput., 54(2-3):531–548, 2013.
[43] K. Johannsen. A symmetric smoother for the nonsymmetric interior penalty discontinuous Galerkin discretization. Technical Report ICES Report 05-23, University of Texas at Austin, 2005.
[44] C. Johnson and J. Pitkäranta. An analysis of the discontinuous Galerkin method for a scalar hyperbolic equation. Math. Comp., 46(173):1–26, 1986.
[45] R. B. Lowrie. Compact higher-order numerical methods for hyperbolic conservation laws. PhD thesis, University of Michigan, 1996.
[46] L. Mansfield. On the use of deflation to improve the convergence of conjugate gradient iteration. Communications in Applied Numerical Methods, 4(2):151–156, 1988.
[47] H. Mirzaee, L. Ji, J. K. Ryan, and R. M. Kirby. Smoothness-increasing accuracy-conserving (SIAC) postprocessing for discontinuous Galerkin solutions over structured triangular meshes. SIAM J. Numer. Anal., 49(5):1899–1920, 2011.
[48] H. Mirzaee, J. King, J. K. Ryan, and R. M. Kirby. Smoothness-increasing accuracy-conserving filters for discontinuous Galerkin solutions over unstructured triangular meshes. SIAM J. Sci. Comput., 35(1):A212–A230, 2013.
[49] H. Mirzaee, J. K. Ryan, and R. M. Kirby. Quantification of errors introduced in the numerical approximation and implementation of smoothness-increasing accuracy conserving (SIAC) filtering of discontinuous Galerkin (DG) fields. J. Sci. Comput., 45(1-3):447–470, 2010.
[50] H. Mirzaee, J. K. Ryan, and R. M. Kirby. Efficient implementation of smoothness-increasing accuracy-conserving (SIAC) filters for discontinuous Galerkin solutions. J. Sci. Comput., 52(1):85–112, 2012.
[51] M. S. Mock and P. D. Lax. The computation of discontinuous solutions of linear hyperbolic equations. Comm. Pure Appl. Math., 31(4):423–430, 1978.
[52] R. Nabben and C. Vuik. A comparison of deflation and coarse grid correction applied to porous media flow. SIAM J. Numer. Anal., 42(4):1631–1647 (electronic), 2004.
[53] R. Nabben and C. Vuik. A comparison of deflation and the balancing preconditioner. SIAM J. Sci. Comput., 27(5):1742–1759 (electronic), 2006.
[54] R. Nabben and C. Vuik. A comparison of abstract versions of deflation, balancing and additive coarse grid correction preconditioners. Numerical Linear Algebra with Applications, 15(4):355–372, 2008.
[55] R. A. Nicolaides. Deflation of conjugate gradients with applications to boundary value problems. SIAM J. Numer. Anal., 24(2):355–365, 1987.
[56] Y. Notay. Flexible conjugate gradients. SIAM J. Sci. Comput., 22(4):1444–1460 (electronic), 2000.
[57] Y. Notay. Algebraic analysis of two-grid methods: The nonsymmetric case. Numer. Linear Algebra Appl., 17(1):73–96, 2010.
[58] P.-O. Persson and J. Peraire. Newton-GMRES preconditioning for discontinuous Galerkin discretizations of the Navier-Stokes equations. SIAM J. Sci. Comput., 30(6):2709–2733, 2008.
[59] F. Prill, M. Lukáčová-Medviďová, and R. Hartmann. Smoothed aggregation multigrid for the discontinuous Galerkin method. SIAM J. Sci. Comput., 31(5):3503–3528, 2009.
[60] J. Proft and B. Rivière. Discontinuous Galerkin methods for convection-diffusion equations for varying and vanishing diffusivity. Int. J. Numer. Anal. Model., 6(4):533–561, 2009.
[61] G. R. Richter. An optimal-order error estimate for the discontinuous Galerkin method. Math. Comp., 50(181):75–88, 1988.
[62] G. R. Richter. On the order of convergence of the discontinuous Galerkin method for hyperbolic equations. Math. Comp., 77(264):1871–1885, 2008.
[63] B. Rivière. Discontinuous Galerkin methods for solving elliptic and parabolic equations, volume 35 of Frontiers in Applied Mathematics. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2008. Theory and implementation.
[64] J. K. Ryan and C.-W. Shu. On a one-sided post-processing technique for the discontinuous Galerkin methods. Methods Appl. Anal., 10(2):295–307, 2003.
[65] J. K. Ryan, C.-W. Shu, and H. Atkins. Extension of a postprocessing technique for the discontinuous Galerkin method for hyperbolic equations with application to an aeroacoustic problem. SIAM J. Sci. Comput., 26(3):821–843, 2005.
[66] J. K. Ryan and B. Cockburn. Local derivative post-processing for the discontinuous Galerkin method. J. Comput. Phys., 228(23):8642–8664, 2009.
[67] Y. Saad. Iterative methods for sparse linear systems. This is a revised version of the book published in 1996 by PWS Publishing, Boston. It can be downloaded from http://www-users.cs.umn.edu/~saad/books.html, 2000.
[68] Y. Saad and B. Suchomel. ARMS: an algebraic recursive multilevel solver for general sparse linear systems. Numer. Linear Algebra Appl., 9(5):359–378, 2002.
[69] Y. Saad, M. Yeung, J. Erhel, and F. Guyomarc’h. A deflated version of the conjugate gradient algorithm. SIAM J. Sci. Comput., 21(5):1909–1926, December 1999.
[70] I. J. Schoenberg. Cardinal spline interpolation. SIAM, Philadelphia, Pa., 1973.
[71] L. L. Schumaker. Spline functions: basic theory. John Wiley & Sons Inc., New York, 1981.
[72] S. J. Sherwin, R. M. Kirby, J. Peiró, R. L. Taylor, and O. C. Zienkiewicz. On 2D elliptic discontinuous Galerkin methods. Internat. J. Numer. Methods Engrg., 65(5):752–784, 2006.
[73] M. Steffen, S. Curtis, R. M. Kirby, and J. K. Ryan. Investigation of smoothness-increasing accuracy-conserving filters for improving streamline integration through discontinuous fields. IEEE Transactions on Visualization and Computer Graphics, 14:680–692, 2008.
[74] K. Stüben. An introduction to algebraic multigrid. In U. Trottenberg, C. W. Oosterlee, and A. Schüller, editors, Multigrid, pages 413–532. Academic Press, 2001.
[75] J. M. Tang, S. P. MacLachlan, R. Nabben, and C. Vuik. A comparison of two-level preconditioners based on multigrid and deflation. SIAM J. Matrix Anal. Appl., 31(4):1715–1739, 2010.
[76] J. M. Tang, R. Nabben, C. Vuik, and Y. A. Erlangga. Comparison of two-level preconditioners derived from deflation, domain decomposition and multigrid methods. J. Sci. Comput., 39(3):340–370, 2009.
[77] V. Thomée. High order local approximations to derivatives in the finite element method. Math. Comp., 31(139):652–660, 1977.
[78] V. Thomée. Negative norm estimates and superconvergence in Galerkin methods for parabolic problems. Math. Comp., 34(149):93–113, 1980.
[79] P. S. Vassilevski. Multilevel block factorization preconditioners. Springer, New York, 2008. Matrix-based analysis and algorithms for solving finite element equations.
[80] C. Vuik, A. Segal, and J. A. Meijerink. An efficient preconditioned CG method for the solution of a class of layered problems with extreme contrasts in the coefficients. Journal of Computational Physics, 152:385–403, 1999.
[81] C. Vuik, A. Segal, J. A. Meijerink, and G. T. Wijma. The construction of projection vectors for a Deflated ICCG method applied to problems with extreme contrasts in the coefficients. Journal of Computational Physics, 172:426–450, 2001.
[82] D. Walfish, J. K. Ryan, R. M. Kirby, and R. Haimes. One-sided smoothness-increasing accuracy-conserving filtering for enhanced streamline integration through discontinuous fields. Journal of Scientific Computing, 38:164–184, 2009.
[83] J. Xu. Iterative methods by space decomposition and subspace correction. SIAM Rev., 34(4):581–613, 1992.
[84] I. Yavneh. Why multigrid methods are so efficient. Computing in Science & Engineering, 8(6):12–22, 2006.
[85] T. Zhang and Z. Li. Optimal error estimate and superconvergence of the DG method for first-order hyperbolic problems. J. Comput. Appl. Math., 235(1):144–153, 2010.
