# 14913_INN_FC_PT2_01_V3.

```Discontinuous Galerkin Methods:
Linear Systems and Hidden Accuracy
Dissertation
for the purpose of obtaining the degree of doctor
at Delft University of Technology,
by the authority of the Rector Magnificus prof.ir. K.C.A.M. Luyben,
chairman of the Board for Doctorates,
to be defended publicly on
Friday 14 June 2013 at 15:00
by
Paulien van Slingerland
mathematical engineer (wiskundig ingenieur)
born in Leiderdorp
This dissertation has been approved by the promotor:
Prof.dr.ir. C. Vuik
Composition of the doctoral committee:
Rector Magnificus, chairman
Prof.dr.ir. C. Vuik, Technische Universiteit Delft, promotor
Prof.dr. Y. Notay, Université Libre de Bruxelles
Prof.dr. J. Qiu, Xiamen University
Prof.dr.ir. B. Koren, Technische Universiteit Eindhoven
Dr. H.Q. van der Ven, Nationaal Lucht- en Ruimtevaartlaboratorium
Prof.dr.ir. J.D. Jansen, Technische Universiteit Delft
Prof.dr.ir. C.W. Oosterlee, Technische Universiteit Delft
The research in this dissertation was carried out at and supported by Delft
Institute of Applied Mathematics, Delft University of Technology. Part of the
work was also sponsored by the Air Force Office of Scientific Research, Air
Force Material Command, USAF, under grant number FA8655-09-1-3055.
Discontinuous Galerkin Methods: Linear Systems and Hidden Accuracy.
Dissertation at Delft University of Technology.
© 2013 by P. van Slingerland
Summary
Discontinuous Galerkin Methods:
Linear Systems and Hidden Accuracy
Just like it is possible to predict tomorrow’s weather, it is possible to predict
e.g. the presence of oil in a reservoir, or the air flow around a newly designed
air foil. These predictions are often based on computer simulations, for which
the Discontinuous Galerkin (DG) method can be particularly suitable. This
discretization scheme uses discontinuous piecewise polynomials of degree p to
combine the best of both classical finite element and finite volume methods.
This thesis focuses on its linear systems and ‘hidden’ accuracy.
Linear DG systems are relatively large and ill-conditioned. In search of
efficient linear solvers, much attention has been paid to subspace correction
methods. A particular example is a two-level preconditioner with a coarse
space that is based on the DG scheme with polynomial degree p = 0. This
more or less reduces the DG matrix to a (smaller) central difference matrix,
for which many efficient linear solvers are readily available. An alternative for
preconditioning is deflation. To contribute to the ongoing comparison between
multigrid and deflation, and to extend the available analysis for the aforementioned two-level preconditioner, we have cast it into the deflation framework,
and studied the impact of both variants on the convergence of the Conjugate
Gradient (CG) method. This thesis discusses the results for Symmetric Interior
Penalty (discontinuous) Galerkin (SIPG) discretizations for diffusion problems
with strong contrasts in the coefficients. In addition, it considers the influence
of the SIPG penalty parameter, weighted averages in the SIPG formulation
(SWIP), the smoother, damping of the smoother, and the strategy for solving
the coarse systems. We have found that both two-level methods yield fast and
scalable CG convergence (independent of the mesh element diameter), provided
that the penalty parameter is chosen dependent on local values of the diffusion
coefficient. The latter choice also benefits the discretization accuracy. Without
damping, deflation can be up to 35% faster. If an optimal damping parameter
is used, both two-level strategies yield similar efficiency. However, it remains
an open question how the damping parameter can best be selected in practice.
At the same time, DG approximations can contain ‘hidden’ accuracy. For
a large class of sufficiently smooth periodic problems, there exists a post-processor that enhances the DG convergence from order p + 1 to order 2p + 1.
Interestingly, this technique needs to be applied only once, at the final simulation time, and does not contain any information of the underlying physics or
numerics. To be able to post-process near non-periodic boundaries and shocks
as well, we have developed the so-called position-dependent post-processor,
and analyzed its impact on the discretization accuracy in both the $L^2$- and
the $L^\infty$-norm. This thesis presents the results for DG (upwind) discretizations
for hyperbolic problems, and demonstrates the benefits of post-processing for
streamline visualization. Our numerical and theoretical results show that the
position-dependent post-processor can enhance the DG convergence from order p + 1 to order 2p + 1. Furthermore, unlike before, this technique can be
applied in the entire domain to enhance the smoothness and accuracy of the
DG approximation, even near non-periodic boundaries and shocks.
Altogether, this work contributes to shorter computational times and more
accurate visualization of DG approximations. This strengthens the increasing
consensus that DG methods can be an effective alternative to classical discretization schemes, and sustains the idea that numerical approximations can
contain more information than we originally thought.
Samenvatting (English translation of the Dutch summary)
Discontinuous Galerkin Methods:
Linear Systems and Hidden Accuracy
Just as it is possible to predict tomorrow’s weather, it is possible to predict, for example, the presence of oil in a reservoir, or the air flow around a newly designed aircraft wing. These predictions are usually based on computer simulations, for which the Discontinuous Galerkin (DG) method can be eminently suitable. This discretization scheme uses piecewise polynomials of degree p to combine the advantages of classical finite element and finite volume methods. This thesis focuses on the corresponding linear systems and ‘hidden’ accuracy.
Linear DG systems are relatively large and ill-conditioned. In search of efficient linear solvers, much attention has been paid to subspace correction methods. A specific example is a two-level preconditioner in which the coarse-level space is based on the DG scheme with polynomial degree p = 0. This more or less reduces the DG matrix to a (smaller) central difference matrix, for which many efficient linear solvers are already available. An alternative to preconditioning is deflation. To contribute to the ongoing comparison between multigrid and deflation, and to extend the available analysis for the aforementioned two-level preconditioner, we have converted it into a deflation technique, and studied the effect of both methods on the convergence of the Conjugate Gradient (CG) method. This thesis describes the results for Symmetric Interior Penalty (discontinuous) Galerkin (SIPG) discretizations for diffusion problems with strong contrasts in the coefficients. In addition, it considers the influence of the SIPG penalty parameter, weighted averages in the SIPG formulation (SWIP), the smoother, damping of the smoother, and the strategy for solving the linear systems at the coarse level. We have found that both two-level methods yield fast and scalable CG convergence (independent of the diameter of the grid elements), provided that the penalty parameter is chosen dependent on local values of the diffusion coefficient. The latter also improves the accuracy of the discretization. Without damping, deflation can be up to 35% faster. If an optimal damping parameter is used, both two-level strategies are approximately equally efficient. Nevertheless, it is an open question how the damping parameter can best be chosen in practice.
At the same time, DG approximations can contain ‘hidden’ accuracy. For a large class of sufficiently smooth periodic problems, there exists a filter that improves the DG convergence from order p + 1 to order 2p + 1. Interestingly, this technique needs to be applied only once, while it contains no information whatsoever about the underlying physical and numerical equations. To be able to filter near non-periodic boundaries and shocks as well, we have developed the so-called position-dependent filter, and analyzed its effect on the accuracy of the discretization in the $L^2$- and the $L^\infty$-norm. This thesis presents the results for DG (upwind) discretizations for hyperbolic problems, and demonstrates the benefits of filtering for visualizations in the form of streamlines. In addition, unlike before, this technique can be applied in the entire domain to improve the smoothness and accuracy of the DG approximation, even near non-periodic boundaries and shocks.
All in all, this work contributes to shorter computational times and more accurate visualization of DG approximations. This strengthens the growing consensus that DG methods can be an effective alternative to classical discretization schemes, and supports the idea that numerical approximations can contain more information than we originally thought.
Contents
1 Introduction
1.1 Introduction
1.2 Discontinuous Galerkin (DG) methods
1.3 Linear DG systems
1.4 Hidden DG accuracy
1.5 Outline
2 Linear DG systems
2.1 Introduction
2.2 Discretization
2.2.1 SIPG method for diffusion problems
2.2.2 Linear system
2.2.3 Penalty parameter
2.3 Two-level preconditioner
2.3.1 Coarse correction operator
2.3.2 Two-level preconditioner
2.3.3 Implementation in a CG algorithm
2.4 Deflation variant
2.4.1 Two-level deflation
2.4.2 FLOPS
2.4.3 Coarse systems
2.4.4 Damping
2.5 Numerical experiments
2.5.1 Experimental setup
2.5.2 The influence of the penalty parameter
2.5.3 Coarse systems
2.5.4 Smoothers and damping
2.5.5 Other test cases
2.6 Conclusion
3 Theoretical scalability
3.1 Introduction
3.2 Methods and assumptions
3.2.1 SIPG discretization for diffusion problems
3.2.2 Linear system
3.2.3 Two-level preconditioning and deflation
3.3 Abstract relations for any SPD matrix A
3.3.1 Using the error iteration matrix
3.3.2 Implications for the two-level methods
3.3.3 Comparing deflation and preconditioning
3.4 Intermezzo: regularity on the block diagonal of A
3.4.1 Using regularity of the mesh
3.4.2 The desired result in terms of ‘small’ bilinear forms
3.4.3 Regularity on the block diagonal of A
3.5 Main result: scalability for SIPG systems
3.5.1 Main result: scalability for SIPG systems
3.5.2 Special case: block Jacobi smoothing
3.5.3 Influence of damping and the penalty parameter for block Jacobi smoothing
3.6 Conclusion
4 Hidden DG accuracy
4.1 Introduction
4.2 Discretization
4.2.1 DG for one-dimensional hyperbolic problems
4.2.2 DG for two-dimensional hyperbolic systems
4.3 Original post-processing strategies
4.3.1 B-splines
4.3.2 Symmetric Post-processor
4.3.3 One-sided post-processor
4.4 Position-dependent post-processor
4.4.1 Generalized post-processor
4.4.2 Position-dependent post-processor
4.4.3 Post-processing in two dimensions
4.5 Numerical Results
4.5.1 $L^2$-Projection
4.5.2 Constant coefficients
4.5.3 Dirichlet BCs
4.5.4 Variable coefficients
4.5.5 Discontinuous coefficients
4.5.6 Two-dimensional system
4.5.7 Two-dimensional streamlines
4.6 Conclusion
5 Theoretical Superconvergence
5.1 Introduction
5.2 Methods and notation
5.2.1 Post-processor
5.2.2 DG discretization for hyperbolic problems
5.2.3 Additional notation
5.3 Auxiliary results
5.3.1 Estimating $\|u - u^\star\|$
5.3.2 Derivatives of B-splines
5.4 The main result in abstract form
5.4.1 Reducing the post-processor to its building blocks
5.4.2 Treating the remaining building blocks
5.4.3 The main error estimate in abstract form
5.5 The main result for DG approximations
5.5.1 Unfiltered DG convergence
5.5.2 Main result: extracting DG superconvergence
5.5.3 Implications for the position-dependent post-processor
5.6 Conclusion
6 Conclusion
6.1 Introduction
6.2 Linear DG systems
6.3 Hidden DG accuracy
6.4 Future research
1 Introduction
1.1 Introduction
Just like it is possible to predict tomorrow’s weather, it is possible to predict
the presence of oil in a reservoir, or the air flow around a newly designed air
foil, for instance. Such predictions can be based on field or model experiments,
but also on computer simulations. The latter can be significantly cheaper (in
terms of both time and money), and allow us to study potentially dangerous
situations in a safe and virtual setting.
Unfortunately, for most real-life applications, the underlying physical model
is too complex to calculate the corresponding solution exactly. Hence, we
must settle for an approximation instead. The mathematical field of numerical
analysis is concerned with advanced numerical algorithms for this purpose.
In principle, the numerical algorithms available can provide solution approximations with an arbitrarily small error. However, the computational time (and
computer memory) must also be taken into account: tomorrow’s weather
forecast should not take a week to compute, for instance. This trade-off between accuracy and efficiency can be compared to using pixels in a digital
camera: more pixels mean sharper photos, but also more data to process and
store. It remains a challenge to develop numerical algorithms with ‘smarter
pixels’ to improve this balance for many applications.
In this context, a particular challenge is formed by problems with shocks
or strong contrasts in the model coefficients. The latter is usually the case for
oil reservoir simulations, for example, where layers of different material (sand,
shale, etc.) may result in permeability contrasts with typical values between
$10^{-1}$ and $10^{-7}$. In this thesis, we seek to improve on existing numerical algorithms for such applications. In particular, we focus on so-called Discontinuous
Galerkin (DG) discretization schemes, and their linear systems and ‘hidden
accuracy’.
The remainder of this chapter introduces this research in the following manner: Section 1.2 provides a short introduction to Discontinuous Galerkin (DG)
discretizations and their advantages in the context of finite volume schemes.
Section 1.3 motivates the importance of an efficient solver for the corresponding
linear systems. Section 1.4 discusses the hidden accuracy that DG approximations can contain and its extraction through post-processing. Finally, Section
1.5 provides an outline of this thesis.
1.2 Discontinuous Galerkin (DG) methods
Discontinuous Galerkin (DG) methods [63, 4, 21, 39] are flexible discretization
schemes for approximating the solution to partial differential equations. A
DG method can be thought of as a Finite Volume Method (FVM) that uses
piecewise polynomials rather than piecewise constants. More specifically, for
a given problem and mesh, a DG approximation is a polynomial of degree
p or lower within each mesh element, while it may be discontinuous at the
element boundaries. As such, it combines the best of both classical finite
element and finite volume methods. This makes DG methods particularly
suitable for handling non-matching grids and local hp-refinement strategies.
A DG method typically yields a higher convergence rate than an FVM (assuming similar numerical fluxes), as it makes use of a higher polynomial degree. Figure 1.1 illustrates this for a one-dimensional problem with five mesh
elements.¹ This higher-order accuracy comes at a higher price though. While
the FVM requires only one unknown per mesh element, a DG method requires
multiple unknowns per mesh element: one for each polynomial basis function,
e.g. $p + 1$ for one-dimensional problems, and $\frac{(p+1)(p+2)}{2}$ for two-dimensional
problems. As a consequence, DG matrices are relatively large (both in size
and in terms of the total number of nonzeros), resulting in more computational
costs. Figure 1.2 illustrates this for a two-dimensional Laplace problem on a
3 × 3 uniform Cartesian mesh.
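As a quick check of these counts, write N for the number of mesh elements and m for the number of unknowns per element (shorthand assumed here for illustration). For the 3 × 3 mesh of Figure 1.2, N = 9, and the two-dimensional formula gives
\[
m = \frac{(p+1)(p+2)}{2}, \qquad p = 0:\ N m = 9 \cdot 1 = 9, \qquad p = 2:\ N m = 9 \cdot 6 = 54 .
\]
This agrees with the 9 × 9 FVM matrix and the 54 × 54 DG matrix reported in Figure 1.2.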
In this light, it is interesting to compare the accuracy of the DG method and
the FVM for a fixed number of unknowns, rather than a fixed number of mesh
elements. Figure 1.3 demonstrates that a DG method can yield much higher
accuracy for the same number of unknowns.² Together with the advantages
mentioned earlier, this motivates the current study of DG methods in this
thesis.
¹ In the illustrations in this section, the DG scheme under consideration is the SIPG
method specified in Sections 2.2 and 3.2. The FVM is based on central fluxes.
² In this figure, we study a two-dimensional diffusion problem with five layers, where the
diffusion K is either 1 or 0.001 in each layer. The meshes are Cartesian and uniform. See
Section 2.5.1 for further details.
Figure 1.1. A DG method typically yields a higher convergence rate than an FVM,
as it makes use of a higher polynomial degree p. (Panels: FVM (p = 0); DG (p = 1).)
Figure 1.2. DG matrices are relatively large (both for a 3 × 3 mesh). (Panels: FVM: 9 × 9 matrix; DG (p = 2): 54 × 54 matrix.)
Figure 1.3. DG discretizations can yield significantly better accuracy (in the $L^2$-norm) for the same number of unknowns. (Log-log plot of the error against the number of unknowns, for FVM and DG (p = 3).)
1.3 Linear DG systems
In the previous section, we have seen that linear DG systems are relatively
large. At the same time, their condition number typically increases with the
number of mesh elements, the polynomial degree, and the stabilization factor
[17, 72]. Figure 1.4 illustrates this.³ Problems with extreme contrasts in the
coefficients (cf. Section 1.1) usually pose an extra challenge.
In search of efficient and scalable algorithms (for which the number of iterations does not increase with e.g. the number of mesh elements), much attention
has been paid to subspace correction methods [83]. For example, Schwarz domain decomposition methods are based on subspaces that arise from subdividing the spatial domain into smaller subdomains [3, 32]; geometric (h-)multigrid
methods make use of subspaces resulting from multiple coarser meshes [34, 13];
spectral (p-)multigrid methods apply global corrections by solving problems
with a lower polynomial degree [33, 58]; and algebraic multigrid methods use
algebraic criteria to separate the unknowns of the original system into two sets,
one of which is labeled ‘coarse’ [59, 68]. A particular example makes use of a
single coarse space that is based on the DG scheme with polynomial degree
p = 0. This two-level method, proposed by Dobrev et al. [24], more or less
reduces the DG matrix to a (smaller) central difference matrix, for which many
efficient linear solvers are readily available.
Usually, the subspace correction methods above can either be used as a
standalone solver, or as a preconditioner in an iterative Krylov method. The
second strategy tends to be more robust for problems with a few isolated ‘bad’
eigenvalues, which is common for problems with strongly varying coefficients.
An alternative to preconditioning is the method of deflation. This technique was developed in the late eighties by Nicolaides [55], Dostal [26]
and Mansfield [46], and further studied in [69, 80, 81], among others. Deflation
is strongly related to the subspace correction methods mentioned earlier, as
found in [52, 53, 54, 76, 75].
To contribute to this ongoing comparison between multigrid and deflation,
and to extend the available analysis for the aforementioned two-level preconditioner [24], we have cast it into the deflation framework, and studied the impact
of both variants on the convergence of the Conjugate Gradient (CG) method.
This thesis discusses the results for Symmetric Interior Penalty (discontinuous)
Galerkin (SIPG) discretizations for diffusion problems with strong contrasts in
the coefficients. In addition, it considers the influence of the SIPG penalty
parameter, weighted averages in the SIPG formulation (SWIP), the smoother,
damping of the smoother, and the strategy for solving the coarse systems.
³ In this figure, we study the two-dimensional Laplace problem on a uniform Cartesian
mesh with 10 × 10 mesh elements. We use the SIPG discretization with (stabilizing) penalty
parameter $\sigma = 2p^2$. See Section 2.5.1 for further details.
Figure 1.4. The condition number of a DG matrix can increase rapidly with the
polynomial degree p. (Axes: polynomial degree p, from 0 to 5, against the condition number κ2(A).)
1.4 Hidden DG accuracy
DG approximations can contain ‘hidden accuracy’: although the convergence
rate of a DG scheme is typically of order p+1 (where p is the polynomial degree),
the Fourier analysis of Lowrie [45] reveals an accurate mode that evolves with
an accuracy of order 2p + 1. Furthermore, Adjerid et al. [1] proved superconvergence of order p + 2 at the roots of a Radau polynomial of degree p + 1
on each element, and of order 2p + 1 at the downwind point of each element.
This hidden accuracy can be extracted by applying a local averaging operator, which has been introduced by Bramble and Schatz [11] in the context
of Ritz-Galerkin approximations for elliptic problems. They showed that this
so-called symmetric post-processor yields super-convergence of order 2p + 1.
Interestingly, this technique needs to be applied only once, at the final simulation time, and does not contain any information of the underlying physics or
numerics.
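To give a feeling for what this gain can mean, suppose, as an idealization, that the errors behave exactly like their leading-order terms, with hypothetical constants $C$ and $C^\star$ that do not depend on the mesh size $h$. The notation $u_h^\star$ for the post-processed approximation is an assumption made here:
\[
\|u - u_h\| \approx C\, h^{p+1}, \qquad \|u - u_h^\star\| \approx C^\star h^{2p+1} .
\]
Halving $h$ then reduces the first error by a factor $2^{p+1}$ and the second by a factor $2^{2p+1}$; for $p = 2$ these factors are 8 and 32, respectively. In practice, the constants and pre-asymptotic effects matter as well, so this is only a rough indication.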
Thomee [77] provided alternative proofs for the results in [11] using Fourier
transforms. This point of view reveals that the symmetric post-processor is
related to the Lanczos filter (see e.g. [38, p. 163]), which is a classical filter in
the context of spectral methods for reducing the Gibbs phenomenon. Thomee
also demonstrated that a modified version of the post-processor can be used to
obtain accurate derivative approximations. This work was extended in [78] for
semidiscrete Galerkin finite element methods for parabolic problems.
Cockburn, Luskin, Shu, and Süli [22] combined the ideas in [11] and [51] to
apply the post-processor to DG approximations for linear periodic hyperbolic
PDEs. They showed that super-convergence of order 2p + 1 can be extracted
by the symmetric post-processor for this alternative application type. Since
then, the post-processor has been modified to be able to post-process near non-periodic boundaries [64], applied to extract derivatives (following [77]) [65, 66],
improved computationally [49, 50], studied for non-uniform rectangular [23],
structured triangular [47] and unstructured triangular [48] meshes, analyzed for
linear convection-diffusion problems [41], and nonlinear hyperbolic conservation
laws [42], and used to enhance streamline visualizations [73, 82].
To improve on the accuracy and smoothness of the one-sided post-processor
[64] near non-periodic boundaries and shocks, we have developed the so-called
position-dependent post-processor, and analyzed its impact on the discretization accuracy in both the $L^2$- and the $L^\infty$-norm. This thesis presents the results
for DG (upwind) discretizations for hyperbolic problems, and demonstrates the
benefits of the proposed post-processor for streamline visualization.
1.5 Outline
The outline of this thesis is as follows.
Chapter 2 discusses the two-level preconditioner proposed by Dobrev et al.
[24], and casts it into the deflation framework. The difference between
both methods is studied through various numerical experiments.
Chapter 3 provides theoretical support for the convergence of both the two-level
preconditioner and the deflation variant. In particular, it derives
upper bounds for the condition number of the preconditioned/deflated
system.
Chapter 4 discusses the original (symmetric and) one-sided post-processor
[64], and proposes the position-dependent post-processor. Both techniques
are compared numerically in terms of smoothness and accuracy.
Chapter 5 presents theoretical error estimates in both the $L^2$- and the
$L^\infty$-norm for the position-dependent post-processor.
Chapter 6 summarizes the main conclusions of this thesis and indicates possibilities
for future research.
2 Linear DG systems
This chapter is based on:
P. van Slingerland, C. Vuik, Fast linear solver for pressure computation in
layered domains. Submitted to Comput. Geosci.
2.1 Introduction
Linear DG systems are relatively large and ill-conditioned (cf. Section 1.3).
In search of efficient linear solvers, much attention has been paid to subspace
correction methods. A particular example is a two-level preconditioner proposed by Dobrev et al. [24]. This method uses coarse corrections based on
the DG discretization with polynomial degree p = 0. Using the analysis of
Falgout et al. [31], Dobrev et al. have shown theoretically (for polynomial
degree p = 1) that this preconditioner yields scalable convergence of the CG
method, independent of the mesh element diameter. Another nice property is
that the use of only two levels offers an appealing simplicity. More importantly,
the coefficient matrix that is used for the coarse correction is quite similar to a
matrix resulting from a central difference discretization, for which very efficient
solvers are readily available.
An alternative to preconditioning is the method of deflation. This method
has proven effective for layered problems with extreme contrasts in the
coefficients in [80, 81]. Deflation is related to multigrid in the sense that it also
makes use of a coarse space that is combined with a smoothing operator at the
fine level. This relation has been considered from an abstract point of view by
Tang et al. in [76, 75].
To continue this comparison between preconditioning and deflation in the
context of DG schemes, and to extend the work in [24], we have cast the two-level preconditioner into the deflation framework, and studied the impact of
both variants on the convergence of the Conjugate Gradient (CG) method.
This chapter discusses the numerical results for Symmetric Interior Penalty
(discontinuous) Galerkin (SIPG) discretizations for diffusion problems with
strong contrasts in the coefficients. In addition, it considers the influence of
the SIPG penalty parameter, weighted averages in the SIPG formulation, the
smoother, damping of the smoother, and the strategy for solving the coarse
systems. Theoretical analysis of both two-level methods will be considered in
Chapter 3.
The outline of this chapter is as follows. Section 2.2 discusses the SIPG
method for diffusion problems with large jumps in the coefficients. To solve
the resulting systems, Section 2.3 discusses the two-level preconditioner. Section 2.4 rewrites the latter as a deflation method. Section 2.5 compares the
performance of both two-level methods through various numerical experiments.
Section 2.6 summarizes the main conclusions.
2.2 Discretization
This section specifies the DG discretization under consideration. Section 2.2.1
discusses the SIPG method for stationary diffusion problems following [63].
Section 2.2.2 describes the resulting linear systems. Section 2.2.3 motivates
the use of a diffusion-dependent penalty parameter following [27].
2.2.1 SIPG method for diffusion problems
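We study the following model problem on the spatial domain $\Omega$ with boundary
$\partial\Omega = \partial\Omega_D \cup \partial\Omega_N$ and outward normal $n$:
\begin{equation}
\begin{aligned}
-\nabla \cdot (K \nabla u) &= f, && \text{in } \Omega, \\
u &= g_D, && \text{on } \partial\Omega_D, \\
K \nabla u \cdot n &= g_N, && \text{on } \partial\Omega_N.
\end{aligned}
\tag{2.1}
\end{equation}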
We assume that the diffusion K is a symmetric and positive-definite tensor
whose eigenvalues are bounded below and above by positive constants, and
that the other model parameters are chosen such that a weak solution of (2.1)
exists.¹
The SIPG approximation for the model above can be constructed in the following manner. First, choose a mesh with elements $E_1, \ldots, E_N$. The numerical
experiments in this chapter are for uniform Cartesian meshes on the domain
$\Omega = [0, 1]^2$, although our solver can be applied for a wider range of problems.
Next, define the test space $V$ that contains each function that is a polynomial
of degree p or lower within each mesh element, and that may be discontinuous
at the element boundaries. The SIPG approximation $u_h$ is now defined as the
unique element in this test space that satisfies the relation
\begin{equation}
B(u_h, v) = L(v), \qquad \text{for all test functions } v \in V,
\tag{2.2}
\end{equation}
where B and L are (bi)linear forms that are specified hereafter.
To define these forms for mesh elements of size h×h, we require the following
additional notation: the vector ni denotes the outward normal of mesh element
Ei ; the set Γh is the collection of all interior edges e = ∂Ei ∩ ∂Ej ; the set ΓD is
the collection of all Dirichlet boundary edges e = ∂Ei ∩ ∂ΩD ; and the set ΓN
is the collection of all Neumann boundary edges e = ∂Ei ∩ ∂ΩN . Finally, we
introduce the usual trace operators for jumps and averages at the mesh element
boundaries: in the interior, we define at $\partial E_i \cap \partial E_j$: $[v] = v_i \cdot n_i + v_j \cdot n_j$, and
$\{v\} = \frac{1}{2}(v_i + v_j)$, where $v_i$ denotes the trace of the (scalar or vector-valued)
function v along the side of Ei with outward normal ni . Similarly, at the
domain boundary, we define at ∂Ei ∩ ∂Ω: [v] = vi · ni , and {v} = vi . Using
this notation, the forms B and L can be defined as follows:
$$
L(v) = \int_\Omega f v
+ \sum_{e \in \Gamma_D} \int_e \Big( -[K \nabla v] + \frac{\sigma}{h} v \Big) g_D
+ \sum_{e \in \Gamma_N} \int_e v \, g_N,
$$

$$
B(u_h, v) = \sum_{i=1}^{N} \int_{E_i} K \nabla u_h \cdot \nabla v
+ \sum_{e \in \Gamma_h \cup \Gamma_D} \int_e \Big( -\{K \nabla u_h\} \cdot [v] - [u_h] \cdot \{K \nabla v\} + \frac{\sigma}{h} [u_h] \cdot [v] \Big),
$$

where σ is the so-called penalty parameter. This positive parameter penalizes
the inter-element jumps to enforce weak continuity and ensure convergence.
Although it is presented as a constant here, its value may vary throughout the
domain. This is discussed further in Section 2.2.3.

1 That is, $f, g_N \in L^2(\Omega)$, and $g_D \in H^{1/2}(\Omega)$ [63, p. 25, 26].
For a large class of sufficiently smooth problems, the SIPG method yields
convergence of order p + 1 [63].
2.2.2 Linear system
In order to compute the SIPG approximation defined by (2.2), it needs to be
rewritten as a linear system. To this end, we choose basis functions for the
test space V. More specifically, for each mesh element $E_i$, we define the basis
function $\phi_1^{(i)}$, which is zero in the entire domain, except in $E_i$, where it is equal
to one. Similarly, we define higher-order basis functions $\phi_2^{(i)}, ..., \phi_m^{(i)}$, which are
higher-order polynomials in Ei and zero elsewhere. In this chapter, we use
monomial basis functions.
The latter are defined as follows. In the mesh element $E_i$ with center
$(x_i, y_i)$ and size h × h, the function $\phi_k^{(i)}$ reads:

$$
\phi_k^{(i)}(x, y) = \left( \frac{x - x_i}{\frac{1}{2} h} \right)^{k_x} \left( \frac{y - y_i}{\frac{1}{2} h} \right)^{k_y},
$$
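where $k_x$ and $k_y$ are selected as follows (each column group lists the basis functions that are added at that polynomial degree):

| | p = 0 | p = 1 | p = 2 | p = 3 | ... |
|---|---|---|---|---|---|
| $k$ | 1 | 2 3 | 4 5 6 | 7 8 9 10 | ... |
| $k_x$ | 0 | 1 0 | 2 1 0 | 3 2 1 0 | ... |
| $k_y$ | 0 | 0 1 | 0 1 2 | 0 1 2 3 | ... |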
The dimension of the basis within one mesh element is equal to $m = \frac{(p+1)(p+2)}{2}$.
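The ordering of the monomial exponents in the table above can be generated programmatically. The following is a minimal sketch, assuming the ordering by total degree shown in the table; the function names `monomial_exponents` and `basis_value` are illustrative and do not come from the thesis.

```python
def monomial_exponents(p):
    """Return [(kx, ky), ...] for k = 1, ..., m, ordered by total degree.

    Within each total degree d, kx runs from d down to 0 (so ky = d - kx),
    which reproduces the ordering in the table of exponents.
    """
    exponents = []
    for degree in range(p + 1):
        for ky in range(degree + 1):
            exponents.append((degree - ky, ky))
    return exponents


def basis_value(k, x, y, xi, yi, h, p):
    """Evaluate the k-th monomial (1-based k) of the element centred at (xi, yi).

    Assumes (x, y) lies inside the element; outside it the basis function is zero.
    """
    kx, ky = monomial_exponents(p)[k - 1]
    return ((x - xi) / (0.5 * h)) ** kx * ((y - yi) / (0.5 * h)) ** ky


exps = monomial_exponents(3)
m = len(exps)  # equals (p + 1)(p + 2) / 2 for p = 3
```

With this scaling, each monomial takes values in [-1, 1] on its own element, since the distance from the center to an edge is h/2.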
Next, we express $u_h$ as a linear combination of the basis functions:

$$
u_h = \sum_{i=1}^{N} \sum_{k=1}^{m} u_k^{(i)} \phi_k^{(i)}.
\tag{2.3}
$$
The new unknowns $u_k^{(i)}$ in (2.3) can be determined by solving a linear system
$Au = b$ of the form:

$$
\begin{bmatrix}
A_{11} & A_{12} & \dots & A_{1N} \\
A_{21} & A_{22} & & \vdots \\
\vdots & & \ddots & \vdots \\
A_{N1} & \dots & \dots & A_{NN}
\end{bmatrix}
\begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_N \end{bmatrix}
=
\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_N \end{bmatrix},
$$
where the blocks all have dimension m:

$$
A_{ji} = \begin{bmatrix}
B(\phi_1^{(i)}, \phi_1^{(j)}) & B(\phi_2^{(i)}, \phi_1^{(j)}) & \dots & B(\phi_m^{(i)}, \phi_1^{(j)}) \\
B(\phi_1^{(i)}, \phi_2^{(j)}) & B(\phi_2^{(i)}, \phi_2^{(j)}) & & \vdots \\
\vdots & & \ddots & \vdots \\
B(\phi_1^{(i)}, \phi_m^{(j)}) & \dots & \dots & B(\phi_m^{(i)}, \phi_m^{(j)})
\end{bmatrix},
\quad
u_i = \begin{bmatrix} u_1^{(i)} \\ u_2^{(i)} \\ \vdots \\ u_m^{(i)} \end{bmatrix},
\quad
b_j = \begin{bmatrix} L(\phi_1^{(j)}) \\ L(\phi_2^{(j)}) \\ \vdots \\ L(\phi_m^{(j)}) \end{bmatrix},
$$

for all i, j = 1, ..., N. This system is obtained by substituting the expression
(2.3) for $u_h$ and the basis functions $\phi_\ell^{(j)}$ for v into (2.2). Once the unknowns
$u_k^{(i)}$ are solved from the system $Au = b$, the final SIPG approximation $u_h$ can
be obtained from (2.3).
2.2.3 Penalty parameter
The SIPG method involves the penalty parameter σ which penalizes the inter-element jumps to enforce weak continuity. This parameter should be selected
carefully: on the one hand, it needs to be sufficiently large to ensure that
the SIPG method converges and the coefficient matrix A is Symmetric and
Positive Definite (SPD) [63]. At the same time, it needs to be chosen as small
as possible, since the condition number of A increases rapidly with the penalty
parameter [17].
Computable theoretical lower bounds for a large variety of problems have
been derived by Epshteyn and Riviere [28]. For one-dimensional diffusion problems, it suffices to choose $\sigma \geq 2 \frac{k_1^2}{k_0} p^2$, where $k_0$ and $k_1$ are the global lower and
upper bound respectively for the diffusion coefficient K. However, while this
lower bound for σ is sufficient to ensure convergence (assuming the exact solution is sufficiently smooth), it can be impractical for problems with strong
variations in the coefficients. For instance, if the diffusion coefficient K takes
values between 1 and $10^{-3}$, we obtain $\sigma \geq 2000 p^2$, which is inconveniently
large. For this reason, it is common practice to choose e.g. σ = 20 rather than
σ = 20 000 for such problems [24, 60].
An alternative strategy is to choose the penalty parameter based on local
values of the diffusion coefficient K, e.g. choosing σ = 20K rather than σ = 20.
It has been demonstrated numerically in [25] that such a diffusion-dependent
penalty parameter can benefit the efficiency of a linear solver (also cf. Section
2.5.2). For general tensors K, this strategy can be defined as follows: for an
edge with normal n, we set σ = αλ, where $\lambda = n^T K n$ and α is a user-defined
constant. The latter should be as small as possible in light of the discussion
above.
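The two penalty choices discussed so far can be compared with a few lines of arithmetic. The following is a minimal sketch, assuming a 2 × 2 diffusion tensor and a unit edge normal; the helper names `sigma_lower_bound_1d` and `edge_penalty` are hypothetical and only illustrate the formulas quoted above.

```python
def sigma_lower_bound_1d(k0, k1, p):
    """Sufficient bound sigma >= 2 * k1**2 / k0 * p**2 for 1D diffusion problems."""
    return 2.0 * k1 ** 2 / k0 * p ** 2


def edge_penalty(alpha, K, n):
    """Diffusion-dependent penalty sigma = alpha * n^T K n (2x2 tensor K, unit normal n)."""
    Kn = [K[0][0] * n[0] + K[0][1] * n[1],
          K[1][0] * n[0] + K[1][1] * n[1]]
    return alpha * (n[0] * Kn[0] + n[1] * Kn[1])


# Diffusion between 1e-3 and 1: the theoretical bound is about 2000 * p**2.
bound = sigma_lower_bound_1d(k0=1e-3, k1=1.0, p=1)

# Illustrative anisotropic tensor: the penalty depends on the edge orientation.
K = [[1.0, 0.0], [0.0, 1e-3]]
sigma_x = edge_penalty(20.0, K, (1.0, 0.0))  # edge with normal in x-direction
sigma_y = edge_penalty(20.0, K, (0.0, 1.0))  # edge with normal in y-direction
```

For a scalar coefficient, K is a multiple of the identity and the second function reduces to σ = αK, independent of the edge orientation.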
If the diffusion is discontinuous, this definition may not be unique. For
instance, in the example above, we could have λ = 1 on one side and λ = 0.001
on the other side of an edge. In such cases, it seems a safe choice to use the
largest limit value of λ in the definition above (e.g. λ = 1 in the example). The
reason for this is that theoretical stability and convergence analysis are usually
based on a penalty parameter that is sufficiently large.
An alternative strategy for dealing with discontinuities is to use the harmonic average of both limit values [27, 15, 29, 7]. In this case, the penalty
parameter reads $\sigma = 2\alpha \frac{\lambda_i \lambda_j}{\lambda_i + \lambda_j}$, where $\lambda_i$ and $\lambda_j$ are based on the information
in the mesh elements $E_i$ and $E_j$ respectively (adjacent to the edge under consideration). This choice is equivalent to using the minimum of both limit values
[7, p. 5], since the harmonic average always lies between the minimum and twice the minimum. In that sense, it seems less ‘safe’ than the maximum strategy above.
In [27, 15, 29, 7], the ‘harmonic’ penalty parameter is used in combination
with the so-called Symmetric Weighted Interior Penalty (SWIP) method. The
main difference between the standard SIPG method and the SWIP method is
the following: whenever an average of a function at a mesh element boundary is
considered (denoted by {.} in Section 2.2.1), the SWIP method uses a weighted
average rather than the standard average. For this purpose, the weights typically depend on the diffusion coefficient, i.e. $w_i = \frac{\lambda_j}{\lambda_i + \lambda_j}$ and $w_j = \frac{\lambda_i}{\lambda_i + \lambda_j}$ (note
that the harmonic penalty can then be written as $\sigma = \alpha(w_i \lambda_i + w_j \lambda_j)$).
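The relation between the ‘maximum’ penalty, the ‘harmonic’ penalty and the SWIP weights can be checked numerically. The following is a minimal sketch for a single edge, assuming scalar limit values λ_i and λ_j; the function names are illustrative only, and the value α = 20 is chosen purely for illustration.

```python
def max_penalty(alpha, lam_i, lam_j):
    """'Maximum' strategy: use the largest limit value at the edge."""
    return alpha * max(lam_i, lam_j)


def harmonic_penalty(alpha, lam_i, lam_j):
    """'Harmonic' strategy: sigma = 2 * alpha * lam_i * lam_j / (lam_i + lam_j)."""
    return 2.0 * alpha * lam_i * lam_j / (lam_i + lam_j)


def swip_weights(lam_i, lam_j):
    """SWIP weights w_i = lam_j / (lam_i + lam_j), w_j = lam_i / (lam_i + lam_j)."""
    total = lam_i + lam_j
    return lam_j / total, lam_i / total


alpha, lam_i, lam_j = 20.0, 1.0, 1e-3
w_i, w_j = swip_weights(lam_i, lam_j)
sigma_harmonic = harmonic_penalty(alpha, lam_i, lam_j)
sigma_weighted = alpha * (w_i * lam_i + w_j * lam_j)  # same value, written with weights
sigma_maximum = max_penalty(alpha, lam_i, lam_j)
```

For strongly discontinuous coefficients, the harmonic value stays close to twice α times the smaller limit value, whereas the maximum strategy follows the larger one.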
In this chapter, we study the effects of both a constant and a diffusion-dependent penalty parameter, using either the maximum or the harmonic strategy above. Furthermore, we consider both the SIPG and the SWIP method.
2.3 Two-level preconditioner
To solve the linear SIPG system obtained in the previous section, we start by
considering the two-level preconditioner proposed by Dobrev et al. [24]. Section 2.3.1 specifies the corresponding coarse correction operator. Section 2.3.2
defines the resulting two-level preconditioner. Section 2.3.3 indicates its implementation in a standard preconditioned CG algorithm.
2.3.1 Coarse correction operator
The two-level preconditioner is defined in terms of a coarse correction operator
$Q \approx A^{-1}$ that switches from the original DG test space to a coarse subspace,
then performs a correction that is now simple in this coarse space, and finally
switches back to the original DG test space. In this case, the coarse subspace
is based on the piecewise constant basis functions.
More specifically, the coarse correction operator Q is defined as follows. Let
R denote the so-called restriction operator such that $A_0 := R A R^T$ is the SIPG
matrix for polynomial degree p = 0. The matrix R is defined
as the following N × N block matrix:

$$
R = \begin{bmatrix}
R_{11} & R_{12} & \dots & R_{1N} \\
R_{21} & R_{22} & & \vdots \\
\vdots & & \ddots & \vdots \\
R_{N1} & \dots & \dots & R_{NN}
\end{bmatrix},
$$

where the blocks all have size 1 × m:

$$
R_{ii} = \begin{bmatrix} 1 & 0 & \dots & 0 \end{bmatrix},
\qquad
R_{ij} = \begin{bmatrix} 0 & \dots & 0 \end{bmatrix},
$$

for all i, j = 1, ..., N and i ≠ j. Using this notation, the coarse correction
operator is defined as

$$
Q := R^T A_0^{-1} R.
\tag{2.4}
$$
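For example, for a Laplace problem on the domain $[0, 1]^2$ with p = 1, a
uniform Cartesian mesh with 2 × 2 elements, and penalty parameter σ = 10,
we obtain the following matrices:

$$
A = \left[\begin{array}{rrr|rrr|rrr|rrr}
40 & 1 & 1 & -10 & 9 & 0 & -10 & 0 & 9 & 0 & 0 & 0 \\
1 & 25 & 0 & -9 & 8 & 0 & 0 & -3 & 0 & 0 & 0 & 0 \\
1 & 0 & 25 & 0 & 0 & -3 & -9 & 0 & 8 & 0 & 0 & 0 \\
\hline
-10 & -9 & 0 & 40 & -1 & 1 & 0 & 0 & 0 & -10 & 0 & 9 \\
9 & 8 & 0 & -1 & 25 & 0 & 0 & 0 & 0 & 0 & -3 & 0 \\
0 & 0 & -3 & 1 & 0 & 25 & 0 & 0 & 0 & -9 & 0 & 8 \\
\hline
-10 & 0 & -9 & 0 & 0 & 0 & 40 & 1 & -1 & -10 & 9 & 0 \\
0 & -3 & 0 & 0 & 0 & 0 & 1 & 25 & 0 & -9 & 8 & 0 \\
9 & 0 & 8 & 0 & 0 & 0 & -1 & 0 & 25 & 0 & 0 & -3 \\
\hline
0 & 0 & 0 & -10 & 0 & -9 & -10 & -9 & 0 & 40 & -1 & -1 \\
0 & 0 & 0 & 0 & -3 & 0 & 9 & 8 & 0 & -1 & 25 & 0 \\
0 & 0 & 0 & 9 & 0 & 8 & 0 & 0 & -3 & -1 & 0 & 25
\end{array}\right],
$$

$$
A_0 = \left[\begin{array}{rrrr}
40 & -10 & -10 & 0 \\
-10 & 40 & 0 & -10 \\
-10 & 0 & 40 & -10 \\
0 & -10 & -10 & 40
\end{array}\right],
$$

$$
R = \left[\begin{array}{rrr|rrr|rrr|rrr}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0
\end{array}\right].
$$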
Observe that A0 has the same structure as a central difference matrix (aside
from a factor σ = 10). Furthermore, every element in A0 is also present in the
upper left corner of the corresponding block in the matrix A. This is because
the piecewise constant basis functions are assumed to be in any polynomial
basis. As a consequence, the matrix R contains elements equal to 0 and 1
only, and does not need to be stored explicitly: multiplications with R can be
implemented by simply extracting elements or inserting zeros.
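The remark that R never needs to be stored can be illustrated with index slicing. The following is a minimal sketch in plain Python, assuming the unknowns are ordered element by element with the piecewise constant coefficient first in each block of size m; the matrix entries are transcribed from the 2 × 2 mesh example above, and the helper names `restrict`, `prolong` and `coarse_matrix` are hypothetical.

```python
# SIPG matrix of the example above (p = 1, 2 x 2 mesh, sigma = 10), m = 3.
A = [
    [ 40,   1,   1, -10,   9,   0, -10,   0,   9,   0,   0,   0],
    [  1,  25,   0,  -9,   8,   0,   0,  -3,   0,   0,   0,   0],
    [  1,   0,  25,   0,   0,  -3,  -9,   0,   8,   0,   0,   0],
    [-10,  -9,   0,  40,  -1,   1,   0,   0,   0, -10,   0,   9],
    [  9,   8,   0,  -1,  25,   0,   0,   0,   0,   0,  -3,   0],
    [  0,   0,  -3,   1,   0,  25,   0,   0,   0,  -9,   0,   8],
    [-10,   0,  -9,   0,   0,   0,  40,   1,  -1, -10,   9,   0],
    [  0,  -3,   0,   0,   0,   0,   1,  25,   0,  -9,   8,   0],
    [  9,   0,   8,   0,   0,   0,  -1,   0,  25,   0,   0,  -3],
    [  0,   0,   0, -10,   0,  -9, -10,  -9,   0,  40,  -1,  -1],
    [  0,   0,   0,   0,  -3,   0,   9,   8,   0,  -1,  25,   0],
    [  0,   0,   0,   9,   0,   8,   0,   0,  -3,  -1,   0,  25],
]
m = 3


def restrict(v, m):
    """Apply R: keep the first (piecewise constant) coefficient of every block."""
    return v[::m]


def prolong(v0, m):
    """Apply R^T: insert m - 1 zeros after every coarse coefficient."""
    out = []
    for c in v0:
        out.extend([c] + [0] * (m - 1))
    return out


def coarse_matrix(A, m):
    """Form A0 = R A R^T by extracting the upper left entry of every block."""
    return [row[::m] for row in A[::m]]


A0 = coarse_matrix(A, m)
```

Here `A0` reproduces the 4 × 4 coarse matrix of the example without any matrix-matrix product.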
2.3.2 Two-level preconditioner
We can now formulate the two-level preconditioner proposed by Dobrev et al.
[24]. This is established by combining the coarse correction operator Q with a
(damped) smoother:
Definition 2.3.1 (Two-level preconditioner) Consider the coarse correction operator Q defined in (2.4), a damping parameter ω ≤ 1, and an
invertible smoother $M^{-1} \approx A^{-1}$. Then, the result $y = P_{\text{prec}} r$ of applying
the two-level preconditioner to a vector r can be computed as follows:

$$
\begin{aligned}
y^{(1)} &= \omega M^{-1} r && \text{(pre-smoothing)}, \\
y^{(2)} &= y^{(1)} + Q(r - A y^{(1)}) && \text{(coarse correction)}, \\
y &= y^{(2)} + \omega M^{-T}(r - A y^{(2)}) && \text{(post-smoothing)}.
\end{aligned}
\tag{2.5}
$$
In this chapter, we consider block Jacobi and block Gauss-Seidel smoothing.
These smoothers have the following property (cf. Chapter 3):
$$
M + M^T - \omega A \ \text{is SPD}.
\tag{2.6}
$$
Using the more abstract analysis in [79, p. 66], condition (2.6) implies that
the preconditioning operator Pprec is SPD. As a consequence, the two-level
preconditioner can be implemented in a standard preconditioned CG algorithm
(cf. Section 2.3.3 hereafter).
Requirement (2.6) also implies that the two-level preconditioner yields scalable convergence of the CG method (independent of the mesh element diameter) for a large class of problems. This has been shown for polynomial degree
p = 1 by Dobrev et al. [24], using the analysis in [31]. In Chapter 3, we will
extend this theory to p > 1.
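The three steps of Definition 2.3.1 can be sketched as follows. This is a small dense illustration in plain Python, not the implementation used for the experiments: it assumes block Jacobi smoothing (so that M is symmetric and $M^{-T} = M^{-1}$), solves all small systems by Gaussian elimination, and uses a made-up 4 × 4 SPD matrix with two blocks of size m = 2. All function names are hypothetical.

```python
def matvec(A, x):
    """Dense matrix-vector product."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def solve(A, b):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def block_jacobi(A, r, m):
    """Apply M^{-1}, where M holds the diagonal m x m blocks of A."""
    y = []
    for s in range(0, len(r), m):
        block = [row[s:s + m] for row in A[s:s + m]]
        y.extend(solve(block, r[s:s + m]))
    return y


def coarse_correction(A, r, m):
    """Apply Q = R^T A0^{-1} R, with R extracting the first entry of each block."""
    A0 = [row[::m] for row in A[::m]]
    y = [0.0] * len(r)
    y[::m] = solve(A0, r[::m])
    return y


def two_level_prec(A, r, m, omega=1.0):
    """Apply the two-level preconditioner (2.5) with block Jacobi smoothing."""
    y1 = [omega * v for v in block_jacobi(A, r, m)]               # pre-smoothing
    res = [ri - ai for ri, ai in zip(r, matvec(A, y1))]
    y2 = [a + b for a, b in zip(y1, coarse_correction(A, res, m))]  # coarse correction
    res = [ri - ai for ri, ai in zip(r, matvec(A, y2))]
    return [a + omega * b for a, b in zip(y2, block_jacobi(A, res, m))]  # post-smoothing


# Illustrative SPD matrix (not a real SIPG matrix): two blocks of size m = 2.
A = [[4.0, 1.0, -1.0, 0.0],
     [1.0, 3.0, 0.0, -1.0],
     [-1.0, 0.0, 4.0, 1.0],
     [0.0, -1.0, 1.0, 3.0]]
n = len(A)
# Columns of the preconditioning operator, obtained by applying it to unit vectors.
cols = [two_level_prec(A, [1.0 if i == j else 0.0 for i in range(n)], m=2)
        for j in range(n)]
```

For this example, $M + M^T - A$ is strictly diagonally dominant with positive diagonal, so (2.6) holds with ω = 1 and the assembled operator should be symmetric with positive diagonal entries.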
2.3.3 Implementation in a CG algorithm
Assuming (2.6), the two-level preconditioner is SPD and can be implemented in
a standard preconditioned CG algorithm. Below, we summarize the implementation of this scheme for a given preconditioning operator P and start vector
$x_0$:

1. $r_0 := b - A x_0$
2. $y_0 := P r_0$
3. $p_0 := y_0$
4. for j = 0, 1, ... until convergence do
5.     $w_j := A p_j$
6.     $\alpha_j := (r_j, y_j)/(p_j, w_j)$
7.     $x_{j+1} := x_j + \alpha_j p_j$
8.     $r_{j+1} := r_j - \alpha_j w_j$
9.     $y_{j+1} := P r_{j+1}$
10.    $\beta_j := (r_{j+1}, y_{j+1})/(r_j, y_j)$
11.    $p_{j+1} := y_{j+1} + \beta_j p_j$
12. end
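The twelve steps above translate almost line by line into code. The following is a minimal sketch for small dense systems, assuming the preconditioner is passed as a function `apply_P` and that convergence is measured by the relative residual norm (an assumption at this point of the text); the names `pcg`, `apply_P`, `tol` and `max_iter` are illustrative.

```python
def dot(u, v):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    """Dense matrix-vector product."""
    return [dot(row, x) for row in A]


def pcg(A, b, x0, apply_P, tol=1e-6, max_iter=1000):
    """Preconditioned CG; returns the approximate solution and the iteration count."""
    x = list(x0)
    r = [bi - ai for bi, ai in zip(b, matvec(A, x))]      # step 1
    y = apply_P(r)                                         # step 2
    p = list(y)                                            # step 3
    b_norm = dot(b, b) ** 0.5
    for j in range(max_iter):                              # step 4
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, j
        w = matvec(A, p)                                   # step 5
        ry = dot(r, y)
        alpha = ry / dot(p, w)                             # step 6
        x = [xi + alpha * pi for xi, pi in zip(x, p)]      # step 7
        r = [ri - alpha * wi for ri, wi in zip(r, w)]      # step 8
        y = apply_P(r)                                     # step 9
        beta = dot(r, y) / ry                              # step 10
        p = [yi + beta * pi for yi, pi in zip(y, p)]       # step 11
    return x, max_iter


# Illustrative SPD system with a simple point Jacobi preconditioner.
A = [[4.0, 1.0, -1.0, 0.0],
     [1.0, 3.0, 0.0, -1.0],
     [-1.0, 0.0, 4.0, 1.0],
     [0.0, -1.0, 1.0, 3.0]]
b = [1.0, 2.0, 3.0, 4.0]
x, iters = pcg(A, b, [0.0] * 4, lambda r: [ri / A[i][i] for i, ri in enumerate(r)])
```

Any SPD operator can be passed as `apply_P`; the point Jacobi choice here only keeps the example short.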
2.4 Deflation variant
Next, we cast the two-level preconditioner into the deflation framework using the abstract analysis in [76]. Section 2.4.1 defines the resulting operator.
Section 2.4.2 compares the two-level preconditioner and the corresponding deflation variant in terms of computational costs. Section 2.4.3 discusses the
coarse systems involved in each iteration. Section 2.4.4 considers the influence
of damping of the smoother.
2.4.1 Two-level deflation
There are multiple strategies to construct a deflation method based on the components of the two-level preconditioner. An overview is given in [76]. Below,
we consider the so-called ‘ADEF2’ deflation scheme, as this type can be implemented relatively efficiently, and allows for inexact solving of coarse systems
(cf. Section 2.4.3 later on).
Basically, this deflation variant is obtained from Definition 2.3.1 by skipping
the last smoothing step in (2.5).
Definition 2.4.1 (Two-level deflation) Consider the coarse correction
operator Q defined in (2.4), a damping parameter ω ≤ 1, and an invertible
smoother $M^{-1} \approx A^{-1}$. Then, the result $y = P_{\text{defl}} r$ of applying the two-level
deflation technique to a vector r can be computed as:

$$
\begin{aligned}
y^{(1)} &:= \omega M^{-1} r && \text{(pre-smoothing)}, \\
y &:= y^{(1)} + Q(r - A y^{(1)}) && \text{(coarse correction)}.
\end{aligned}
\tag{2.7}
$$
The operator $P_{\text{defl}}$ is not symmetric in general. As such, it seems unsuitable
for the standard preconditioned CG method. Interestingly, it can still be implemented successfully in its current asymmetric form, as long as the smoother
$M^{-1}$ is SPD (requirement (2.6) is not needed), and the start vector $x_0$ is pre-processed according to:

$$
x_0 \mapsto Qb + (I - AQ)^T x_0.
\tag{2.8}
$$
Other than that, the CG implementation remains as discussed in Section 2.3.3.
Indeed, it has been shown by [76, Theorem 3.4] that, under the aforementioned
conditions, Pdefl yields the same CG iterates as an alternative operator (called
‘BNN’, cf. Section 3.2.3) that actually is SPD.
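The deflation operator (2.7) and the pre-processing step (2.8) can be sketched in the same dense setting. The example below is an illustration under stated assumptions, not the thesis implementation: it uses block Jacobi smoothing, a made-up 4 × 4 SPD matrix with blocks of size m = 2, and hypothetical function names. It also computes the coarse part of the initial residual after pre-processing, which vanishes because $R(I - AQ) = 0$ for $Q = R^T A_0^{-1} R$.

```python
def matvec(A, x):
    """Dense matrix-vector product."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def solve(A, b):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def block_jacobi(A, r, m):
    """Apply M^{-1}, where M holds the diagonal m x m blocks of A."""
    y = []
    for s in range(0, len(r), m):
        y.extend(solve([row[s:s + m] for row in A[s:s + m]], r[s:s + m]))
    return y


def coarse_correction(A, r, m):
    """Apply Q = R^T A0^{-1} R, with R extracting the first entry of each block."""
    y = [0.0] * len(r)
    y[::m] = solve([row[::m] for row in A[::m]], r[::m])
    return y


def two_level_defl(A, r, m, omega=1.0):
    """Apply the deflation operator (2.7): pre-smoothing plus coarse correction."""
    y1 = [omega * v for v in block_jacobi(A, r, m)]
    res = [ri - ai for ri, ai in zip(r, matvec(A, y1))]
    return [a + b for a, b in zip(y1, coarse_correction(A, res, m))]


def preprocess_start(A, b, x0, m):
    """Apply (2.8); since A and Q are symmetric, (I - AQ)^T x0 = x0 - Q A x0."""
    Qb = coarse_correction(A, b, m)
    QAx0 = coarse_correction(A, matvec(A, x0), m)
    return [q + x - qa for q, x, qa in zip(Qb, x0, QAx0)]


A = [[4.0, 1.0, -1.0, 0.0],
     [1.0, 3.0, 0.0, -1.0],
     [-1.0, 0.0, 4.0, 1.0],
     [0.0, -1.0, 1.0, 3.0]]
b = [1.0, 2.0, 3.0, 4.0]
x0 = preprocess_start(A, b, [1.0, -2.0, 0.5, 3.0], m=2)
r0 = [bi - ai for bi, ai in zip(b, matvec(A, x0))]
coarse_residual = r0[::2]   # R r0, expected to vanish up to rounding
y0 = two_level_defl(A, r0, m=2)
```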
It will be shown in Chapter 3 that, similar to the two-level preconditioner,
the deflation variant above yields scalable convergence of the CG method (independent of the mesh element diameter).
2.4.2 FLOPS
Because the deflation variant skips one of the two smoothing steps, its costs
per CG iteration are lower than for the preconditioning variant. In this section,
we compare the differences in terms of FLoating point OPerationS (FLOPS).
Table 2.1 displays the costs for a two-dimensional diffusion problem with
polynomial degree p, a Cartesian mesh with N = n × n elements, and polynomial space dimension $m := \frac{(p+1)(p+2)}{2}$. Using the preconditioning variant, the
CG method requires per iteration $(27m^2 + 14m)N$ flops, plus the costs for two
smoothing steps and one coarse solve. Using the two-level deflation method,
the CG method requires per iteration $(18m^2 + 12m)N$ flops, plus the costs for
only one smoothing step and one coarse solve.
A block Jacobi smoothing step with blocks of size m requires $(2m^2 - m)N$
flops, assuming that an LU decomposition is known. In this case, the smoothing costs are low compared to the costs for a matrix-vector product, and the
deflation variant is roughly 30% cheaper (per iteration). For more expensive
smoothers, this factor becomes larger. A block Gauss-Seidel sweep (either forward or backward) requires the costs for one block Jacobi step, plus the costs
for the updates based on the off-diagonal elements, which are approximately
$4m^2 N$ flops.
| operation | flops (rounded) | # defl. | # prec. |
|---|---|---|---|
| mat-vec ($Au$) | $9m^2N$ | 2 | 3 |
| inner product ($u^T v$) | $2mN$ | 2 | 2 |
| scalar multiplication ($\alpha u$) | $mN$ | 3 | 3 |
| vector update ($u \pm v$) | $mN$ | 5 | 7 |
| smoothing ($M^{-1}u$) | variable | 1 | 2 |
| coarse solve ($A_0^{-1}u_0$) | variable | 1 | 1 |

Table 2.1. Comparing the computational costs per CG iteration for A-DEF2
deflation and the two-level preconditioner for our applications.
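The totals quoted in this section follow directly from the operation counts in Table 2.1. The following is a minimal sketch that adds them up, assuming block Jacobi smoothing at $(2m^2 - m)N$ flops per step and leaving out the (variable) coarse solve; the function and dictionary names are illustrative.

```python
# Operation counts per CG iteration from Table 2.1:
# (mat-vecs, inner products, scalar multiplications, vector updates, smoothing steps)
COUNTS = {"defl": (2, 2, 3, 5, 1), "prec": (3, 2, 3, 7, 2)}


def block_jacobi_flops(m, N):
    """Cost of one block Jacobi step, assuming the LU decomposition is known."""
    return (2 * m * m - m) * N


def flops_per_iteration(p, N, variant):
    """Flops per CG iteration, excluding the coarse solve."""
    m = (p + 1) * (p + 2) // 2
    matvecs, inner, scal, update, smooth = COUNTS[variant]
    base = (matvecs * 9 * m * m + inner * 2 * m + scal * m + update * m) * N
    return base + smooth * block_jacobi_flops(m, N)


N = 40 * 40
prec = flops_per_iteration(2, N, "prec")   # (27 m^2 + 14 m) N + 2 smoothing steps
defl = flops_per_iteration(2, N, "defl")   # (18 m^2 + 12 m) N + 1 smoothing step
saving = 1.0 - defl / prec                 # relative saving of the deflation variant
```

For p = 2 this gives a saving of about one third per iteration, in line with the rough estimate of 30% quoted above.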
2.4.3 Coarse systems
Both two-level methods require the solution of a coarse system in each iteration,
involving the coefficient matrix A0 . In Section 2.3.1, we have seen that A0 has
the same structure and size (N × N ) as a central difference matrix. As a
consequence, a direct solver is not feasible for most practical applications. At
the same time, many effective inexact solvers are readily available for this type
of system.
For some deflation methods, including DEF1, DEF2, R-BNN1, R-BNN2,
such an inexact coarse solver is not suitable. This is because those methods
contain eigenvalue clusters at 0, so small perturbations in those schemes (e.g.
due to inexact coarse solves) can result in an “unfavorable spectrum, resulting
in slow convergence of the method” [76, p. 353]. ADEF2 does not have
this limitation, as it clusters these eigenvalues at 1 rather than 0. This is one
of the reasons why we focus on this particular deflation variant.
In Section 2.5.3 we will investigate the use of an inexact coarse solver that
applies the CG method in an inner loop with a scalable algebraic multigrid preconditioner. This strategy will be studied for both the two-level preconditioner
and the corresponding deflation variant.
An alternative strategy is the Flexible CG (FCG) method [6, 56], which
can be used to relax the stopping criterion for inner iterations while remaining
as effective as a direct solver. The main difference with standard CG lies in
the explicit orthogonalization and truncation of the search direction vectors,
possibly combined with a restart strategy. We do not study this topic further
in this thesis.
2.4.4 Damping
Damping often benefits the convergence of multigrid methods [84]. For multigrid methods with smoother M = I, a “typical choice of [ω] is close to $\frac{1}{\|A\|_2}$”,
although a “better choice of [ω] is possible if we make further assumptions
on how the eigenvectors of A associated with small eigenvalues are treated by
coarse-grid correction” [75, p. 1727]. In that reference, the latter is established for a coarse space that is based on a set of orthonormal eigenvectors of
A. However, such a result does not seem available yet for the coarse space (and
smoothers) currently under consideration.
At the same time, deflation may not be influenced by damping at all. The
latter has been observed theoretically in [75, p. 1727] for the DEF(1) variant.
For the ADEF2 variant under consideration, such a result is not yet available.
Altogether, it is an open question how the damping parameter can best
be selected in practice. For this reason, we use an empirical approach in this
chapter, and study the effects on both two-level methods for several values of
ω ≤ 1.
2.5 Numerical experiments
Next, we compare the two-level preconditioner and the corresponding deflation
variant through numerical experiments. Section 2.5.1 specifies the experimental
setup. Section 2.5.2 studies the influence of the SIPG penalty parameter. Section
2.5.3 investigates the effectiveness of an inexact solver for the coarse systems.
Section 2.5.4 studies the impact of (damping of) the smoother on the overall
computational efficiency. Section 2.5.5 considers similar experiments for more
challenging test cases.
2.5.1 Experimental setup
We consider multiple diffusion problems of the form (2.1) with strong contrasts
in the coefficients on the domain $[0, 1]^2$. At first, we primarily focus on the
problem illustrated in Figure 2.1. This problem has five layers, and the diffusion
is either 1 or $10^{-3}$ in each layer. Such strong contrasts in the coefficients
are typically encountered during oil reservoir simulations involving e.g. layers
of sand and shale. In Section 2.5.5, we also study problems that mimic the
occurrence of sand inclusions within a layer of shale, and ground water flow.
Furthermore, we examine a bubbly flow problem, the influence of Neumann
boundary conditions, and a full anisotropic problem.
[Figure 2.1. Illustration of the problem with five layers: the diffusion coefficient alternates between K = 1 and K = $10^{-3}$, with K = 1 in the first, third and fifth layer.]
The Dirichlet boundary conditions and the source term f are chosen such
that the exact solution reads u(x, y) = cos(10πx) cos(10πy) (unless indicated
otherwise). We stress that this choice does not impact the matrix or the performance of the linear solver, as we use random start vectors (see below).
Furthermore, subdividing the domain into 10 × 10 equally sized squares, the
diffusion coefficient is constant within each square.
All model problems are discretized by means of the SIPG method as discussed in Section 2.2, although the SWIP variant with weighted averages is
also discussed. We use a uniform Cartesian mesh with N = n × n elements,
where n = 20, 40, 80, 160, 320. Furthermore, we use monomial basis functions
with polynomial degree p = 2, 3 (results for p = 1 are similar though). For
p = 3 and n = 320, this means that the number of degrees of freedom is a little
over $10^6$. In most cases, the penalty parameter is chosen diffusion-dependent,
$\sigma = 20\, n^T K n$, using the largest limit value at discontinuities (cf. Section 2.2.3).
However, we also study a constant penalty parameter, and a parameter based
on harmonic means.
The linear systems resulting from the SIPG discretizations are solved by
means of the CG method, combined with either the two-level preconditioner
(Definition 2.3.1) or the corresponding deflation variant (Definition 2.4.1). Unless specified otherwise, damping is not used. For the smoother $M^{-1}$, we use
block Jacobi with small blocks of size m × m (recall that $m = \frac{(p+1)(p+2)}{2}$). For
the preconditioning variant, we also consider block Gauss-Seidel with the same
block size (deflation requires a symmetric smoother).
Diagonal scaling is applied as a pre-processing step in all cases, and the
same random start vector x0 is used for all problems of the same size. Preprocessing of the start vector according to (2.8) is applied for deflation only,
as it makes no difference for the preconditioning variant. For the stopping
criterion we use:

$$
\frac{\|r_k\|_2}{\|b\|_2} \leq \text{TOL},
\tag{2.9}
$$

where TOL = $10^{-6}$, and $r_k$ is the residual after the k-th iteration.
Coarse systems, involving the SIPG matrix A0 with polynomial degree p =
0, are solved directly in most cases. However, a more efficient alternative is
provided in Section 2.5.3. In any case, the coarse matrix A0 is quite similar to
a central difference matrix, for which very efficient solvers are readily available.
Finally, we remark that all computations are carried out using an Intel Core
2 Duo CPU (E8400) at 3 GHz.
2.5.2 The influence of the penalty parameter
This section studies the influence of the SIPG penalty parameter on the convergence of the CG and the SIPG method. We compare the differences between
using a constant penalty parameter, and a diffusion-dependent value. Similar
experiments have been considered in [25] for the two-level preconditioner for
p = 1, a single mesh, and symmetric Gauss-Seidel smoothing (solving the coarse
systems using geometric multigrid). They found that “proper weighting”, i.e.
a diffusion-dependent penalty parameter, “is essential for the performance”.
In this section, we consider p = 2, 3, and both preconditioning and deflation
(using block Jacobi smoothing). Furthermore, we analyze the scalability of the
methods by considering multiple meshes. Our results are consistent with those
in [25].
Table 2.2 displays the number of CG iterations required for convergence
for a Poisson problem (i.e. K = 1 everywhere) with σ = 20. Because the
diffusion coefficient is constant, a diffusion-dependent value (σ = 20K) would
yield the same results. We observe that both the two-level preconditioner (TL
prec.) and the deflation variant (TL defl.) yield fast and scalable convergence
(independent of the mesh element diameter). For comparison, the results for
standard Jacobi and block Jacobi preconditioning are also displayed (not scalable). Interestingly, the two-level deflation method requires fewer iterations
than the preconditioning variant, even though its costs per iteration are about
30% lower (cf. Section 2.4.2).
Table 2.3 considers the same test (using a constant σ = 20), but now for the
problem with five layers (cf. Figure 2.1). It can be seen that the convergence is
no longer fast and scalable for this problem with large jumps in the coefficients.
The deflation method is significantly faster than the preconditioning variant,
but neither produces satisfactory results.
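| preconditioner | p=2, $N=20^2$ | p=2, $N=40^2$ | p=2, $N=80^2$ | p=2, $N=160^2$ | p=3, $N=20^2$ | p=3, $N=40^2$ | p=3, $N=80^2$ | p=3, $N=160^2$ |
|---|---|---|---|---|---|---|---|---|
| Jacobi | 301 | 581 | 1049 | 1644 | 325 | 576 | 1114 | 1903 |
| Block Jacobi (BJ) | 205 | 356 | 676 | 1190 | 206 | 357 | 696 | 1183 |
| TL Prec., 2x BJ | 36 | 38 | 39 | 40 | 49 | 52 | 53 | 54 |
| TL Defl., 1x BJ | 32 | 33 | 33 | 34 | 36 | 37 | 37 | 38 |

Table 2.2. Both two-level methods yield fast scalable convergence for a problem
with constant coefficients (Poisson, # CG iterations, σ = 20).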
| preconditioner | p=2, $N=20^2$ | p=2, $N=40^2$ | p=2, $N=80^2$ | p=2, $N=160^2$ | p=3, $N=20^2$ | p=3, $N=40^2$ | p=3, $N=80^2$ | p=3, $N=160^2$ |
|---|---|---|---|---|---|---|---|---|
| Jacobi | 1671 | 4311 | 9069 | 15923 | 2675 | 5064 | 9104 | 15655 |
| Block Jacobi (BJ) | 933 | 2253 | 4996 | 9651 | 1357 | 2960 | 5660 | 9783 |
| TL Prec., 2x BJ | 415 | 1215 | 2534 | 3571 | 1089 | 2352 | 4709 | 8781 |
| TL Defl., 1x BJ | 200 | 414 | 531 | 599 | 453 | 591 | 667 | 698 |

Table 2.3. For a problem with extreme contrasts in the coefficients, a constant
penalty yields poor convergence (five layers, # CG iterations, σ = 20).
Switching to a diffusion-dependent penalty parameter
In Table 2.4, we revisit the experiment in Table 2.3, but this time for a diffusion-dependent penalty parameter (σ = 20K, using the largest limit value of K
at discontinuities). Due to this alternative discretization, the results are now
similar to those for the Poisson problem (cf. Table 2.2): both two-level methods
yield fast and scalable convergence.
These results motivate the use of a diffusion-dependent penalty parameter,
provided that this strategy does not worsen the accuracy of the SIPG
discretization compared to a constant penalty parameter. In Figure 2.2, it is
verified that a diffusion-dependent penalty parameter actually improves the
accuracy of the SIPG approximation (for p = 3). Similar results have been
observed for p = 1, 2 (not displayed). The higher accuracy can be explained
by the fact that the discretization contains more information of the underlying
physics for a diffusion-dependent penalty parameter. Altogether, the penalty
parameter can best be chosen diffusion-dependent, and we will do so in the
remainder of this chapter.
[Figure 2.2. A diffusion-dependent penalty parameter yields better SIPG accuracy (five layers, σ = 20K). The plot shows the SIPG $L^2$-error (logarithmic axis, $10^{-6}$ to $10^{-2}$) against the number of mesh elements ($20^2$, $40^2$, $80^2$, $160^2$) for constant σ = 20 and for diffusion-dependent σ = 20K.]
| preconditioner | p=2, $N=20^2$ | p=2, $N=40^2$ | p=2, $N=80^2$ | p=2, $N=160^2$ | p=3, $N=20^2$ | p=3, $N=40^2$ | p=3, $N=80^2$ | p=3, $N=160^2$ |
|---|---|---|---|---|---|---|---|---|
| Jacobi | 976 | 1264 | 1570 | 2315 | 1303 | 1490 | 1919 | 3109 |
| Block Jacobi (BJ) | 243 | 424 | 788 | 1285 | 244 | 425 | 697 | 1485 |
| TL Prec., 2x BJ | 46 | 43 | 43 | 44 | 55 | 56 | 56 | 57 |
| TL Defl., 1x BJ | 43 | 45 | 45 | 46 | 47 | 48 | 48 | 48 |

Table 2.4. For a diffusion-dependent penalty parameter, both two-level methods
yield fast scalable convergence for a problem with large permeability contrasts (five
layers, # CG iterations, σ = 20K).
Weighted averages
The results for the diffusion-dependent penalty parameter in Figure 2.2 and
Table 2.4 were established using the largest limit value of K in the definition
of σ at the discontinuities. In this section, we consider the influence of using
weighted averages, resulting in the SWIP method and a diffusion-dependent
penalty parameter based on harmonic means (cf. Section 2.2.3). For this
purpose, we study the problem with five layers again.
We have found that using σ = 20K with this approach results in negative
eigenvalues, implying that the scheme is not coercive, and resulting in poor
CG convergence. The same is true for σ = αK with α = 100, 200, 500, 1000.
For α = 20 000, the matrix is positive-definite (tested for N = 10, 20 and
p = 1, 2, 3). Similar outcomes were found using the SIPG scheme rather than
the SWIP scheme (for the same ‘harmonic’ penalty parameter).
At the mesh element edges where the diffusion coefficient K is discontinuous,
using α = 20 000 and a harmonic penalty yields $\sigma = \frac{20}{1.001}$. At the same
time, using α = 20 and a ‘maximum’ penalty (i.e. using the largest limit
value at discontinuities), yields σ = 20. These values are nearly the same.
However, at all other edges, where K is continuous, σ is 1000 times larger for
the harmonic penalty (with α = 20 000) than for the maximum penalty (with
α = 20). Because the penalty parameter should be chosen as small as possible
(cf. Section 2.2.3), we conclude that it can best be based on the largest limit
value at discontinuities. This is in line with our earlier speculation that using
the maximum is a ‘safe’ choice.
We have also combined the ‘maximum’ penalty with the SWIP method
and compared the outcomes to the earlier results for the SIPG method (both
for σ = 20K). We have found that the discretization accuracy and the CG
convergence are practically the same: the relative absolute difference in the
discretization error is less than 2% (for p = 1, 2, 3 and $N = 20^2, 40^2, 80^2, 160^2$).
Comparing Table 2.5 (SWIP) to Table 2.4 (SIPG), it can be seen that the
number of CG iterations required for convergence is nearly identical.
Altogether, we conclude that both the SIPG and the SWIP method are
suitable for our application, as long as the penalty parameter is chosen diffusion-dependent, using the largest limit value at discontinuities. We will apply this
strategy using the standard SIPG method in the remainder of this chapter.
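| preconditioner | p=2, $N=20^2$ | p=2, $N=40^2$ | p=2, $N=80^2$ | p=2, $N=160^2$ | p=3, $N=20^2$ | p=3, $N=40^2$ | p=3, $N=80^2$ | p=3, $N=160^2$ |
|---|---|---|---|---|---|---|---|---|
| Jacobi | 980 | 1270 | 1575 | 2325 | 1309 | 1500 | 1935 | 3129 |
| Block Jacobi (BJ) | 244 | 424 | 790 | 1287 | 244 | 424 | 697 | 1485 |
| TL Prec., 2x BJ | 46 | 43 | 43 | 44 | 55 | 56 | 56 | 57 |
| TL Defl., 1x BJ | 43 | 45 | 45 | 46 | 47 | 48 | 49 | 49 |

Table 2.5. The difference between the SWIP method (this table) and the SIPG
method (cf. Table 2.4) is small (five layers, # CG iterations).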
2.5.3 Coarse systems
To solve the coarse systems with coefficient matrix A0 , a direct solver is usually
not feasible in practice. This is because A0 has the same structure and size
(N × N ) as a central difference matrix. To improve on the efficiency of the
coarse solver, we have investigated the cheaper alternative of applying the CG
method again in an inner loop (cf. Section 2.4.3). This section discusses the
results using the algebraic multigrid preconditioner MI20 in the HSL software
package (see footnote 2), which is based on a classical scheme described in [74]. The inner loop
uses a stopping criterion of the form (2.9).
Table 2.6 displays the number of outer CG iterations required for convergence using the two-level preconditioner and deflation variant respectively (for
the problem with five layers). Different values of the inner tolerance TOL are
considered in these tables. For comparison, the results for the direct solver
are also displayed. We observe that low accuracy in the inner loop is sufficient
to reproduce the latter. For both two-level methods, the inner tolerance can
be $10^4$ times as large as the outer tolerance. For the highest acceptable inner
tolerance TOL = $10^{-2}$, the number of iterations in the inner loop is between 2
and 5 in all cases (not displayed in the tables).
In terms of computational time, the difference between the direct solver and
the inexact AMG-CG solver is negligible for the problems under consideration.
However, for large three-dimensional problems, it can be expected that the
inexact coarse solver is much faster, and thus crucial for the overall efficiency
and scalability of the linear solver.
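Preconditioning:

| coarse solver | p=2, $N=40^2$ | p=2, $N=80^2$ | p=2, $N=160^2$ | p=2, $N=320^2$ | p=3, $N=40^2$ | p=3, $N=80^2$ | p=3, $N=160^2$ | p=3, $N=320^2$ |
|---|---|---|---|---|---|---|---|---|
| direct | 43 | 43 | 44 | 44 | 56 | 56 | 57 | 58 |
| TOL = $10^{-4}$ | 43 | 43 | 44 | 44 | 56 | 56 | 57 | 58 |
| TOL = $10^{-3}$ | 43 | 43 | 44 | 44 | 56 | 56 | 57 | 58 |
| TOL = $10^{-2}$ | 43 | 43 | 44 | 44 | 56 | 57 | 58 | 58 |
| TOL = $10^{-1}$ | 48 | 62 | 55 | 57 | 59 | 65 | 70 | 78 |

Deflation:

| coarse solver | p=2, $N=40^2$ | p=2, $N=80^2$ | p=2, $N=160^2$ | p=2, $N=320^2$ | p=3, $N=40^2$ | p=3, $N=80^2$ | p=3, $N=160^2$ | p=3, $N=320^2$ |
|---|---|---|---|---|---|---|---|---|
| direct | 45 | 45 | 46 | 46 | 48 | 48 | 48 | 49 |
| TOL = $10^{-4}$ | 45 | 45 | 46 | 46 | 48 | 48 | 48 | 49 |
| TOL = $10^{-3}$ | 45 | 45 | 46 | 46 | 48 | 48 | 48 | 49 |
| TOL = $10^{-2}$ | 45 | 45 | 46 | 46 | 48 | 48 | 48 | 49 |
| TOL = $10^{-1}$ | 60 | 81 | 51 | 300 | 48 | 54 | 79 | 67 |

Table 2.6. Coarse systems can be solved efficiently by using an inexact solver
with a relatively low accuracy (TOL) in the inner loop (five layers, # outer CG
iterations).

2 HSL, a collection of Fortran codes for large-scale scientific computation. See
http://www.hsl.rl.ac.uk/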
2.5.4 Smoothers and damping
This section discusses the influence of the smoother and damping on both two-level methods. In particular, we consider multiple damping values ω ∈ [0.5, 1],
and both Jacobi and (block) Gauss-Seidel smoothing. The latter is applied for
the preconditioning variant only, as deflation requires a symmetric smoother.
Table 2.7 displays the number of CG iterations required for convergence for
the problem with five layers. For the deflation variant (Defl.), we have found
that damping makes no difference for the CG convergence, so the outcomes for
ω < 1 are not displayed. Such a result has also been observed theoretically in
[75] for an alternative deflation variant (known as DEF(1)).
For the preconditioning variant (Prec.), damping can both improve and
worsen the efficiency: for block Jacobi (BJ) smoothing, choosing e.g. ω = 0.7
can reduce the number of iterations by 37%; for block Gauss-Seidel (BGS)
smoothing, choosing ω < 1 has either no influence or a small negative impact
in most cases.
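As a small worked check of the quoted reduction, the following sketch recomputes the relative savings from the block Jacobi rows of Table 2.7 (ω = 1 versus ω = 0.7). The iteration counts are copied from the table; the variable and function names are illustrative only.

```python
# Iteration counts copied from Table 2.7 (preconditioning variant, 2x BJ).
# Keys are (polynomial degree p, number of elements per direction).
undamped = {(2, 40): 43, (2, 80): 43, (2, 160): 44, (2, 320): 44,
            (3, 40): 56, (3, 80): 56, (3, 160): 57, (3, 320): 58}
damped_07 = {(2, 40): 33, (2, 80): 33, (2, 160): 33, (2, 320): 34,
             (3, 40): 35, (3, 80): 36, (3, 160): 36, (3, 320): 37}


def relative_reduction(before, after):
    """Fraction of iterations saved when going from `before` to `after`."""
    return (before - after) / before


reductions = {case: relative_reduction(undamped[case], damped_07[case])
              for case in undamped}

for (p, n), red in sorted(reductions.items()):
    print(f"p={p}, N={n}^2: {100 * red:.1f}% fewer CG iterations")
```

For p = 3 the savings lie roughly between 36% and 37.5%, in line with the figure quoted above; for p = 2 they are closer to 23% to 25%.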
We have also performed the experiment for standard point Jacobi and point
Gauss-Seidel (not displayed in the table). However, this did not lead to satisfactory results: over 250 iterations in all cases, even when the relaxation
parameter was sufficiently low to ensure that (2.6) is satisfied (only restricting
for point Jacobi). Altogether, the block (Jacobi/Gauss-Seidel) smoothers yield
significantly better results than the point (Jacobi/Gauss-Seidel) smoothers.
We speculate that this is due to the following: the coarse correction operator Q simplifies the matrix A to $A_0$, which eliminates the ‘higher-order’
information in each element (regarding the higher-order basis functions), but
preserves the ‘mesh’ information (i.e. which elements are neighbors and which
are not). Intuitively, a suitable smoother would reintroduce this higher-order
information, originally contained in dense blocks of size m × m. The block
(Jacobi/Gauss-Seidel) smoothers are better suited for this task, which could
explain why they are more effective.
degree                    p=2      p=2      p=2      p=2      p=3      p=3      p=3      p=3
mesh                      N=40^2   N=80^2   N=160^2  N=320^2  N=40^2   N=80^2   N=160^2  N=320^2
Prec., 2x BJ (ω = 1)      43       43       44       44       56       56       57       58
Prec., 2x BJ (ω = 0.9)    34       34       37       37       40       40       42       43
Prec., 2x BJ (ω = 0.8)    33       34       34       35       36       37       39       39
Prec., 2x BJ (ω = 0.7)    33       33       33       34       35       36       36       37
Prec., 2x BJ (ω = 0.6)    32       33       34       35       35       36       36       37
Prec., 2x BJ (ω = 0.5)    34       34       35       35       36       37       38       39
Prec., 2x BGS (ω = 1)     33       33       34       35       34       35       35       37
Prec., 2x BGS (ω = 0.9)   33       33       34       34       34       34       35       36
Prec., 2x BGS (ω = 0.8)   33       33       35       35       35       34       36       37
Prec., 2x BGS (ω = 0.7)   32       34       36       36       35       36       37       38
Prec., 2x BGS (ω = 0.6)   33       35       36       37       36       37       38       39
Prec., 2x BGS (ω = 0.5)   34       35       37       38       37       39       39       40
Defl., 1x BJ (ω = 1)      45       45       46       46       48       48       48       49

Table 2.7. Damping can improve the convergence for the preconditioner, but has
no influence for deflation (five layers, # CG Iterations).
Taking the computational time into account
Based on the results in Table 2.7, it appears that the preconditioning
variant with either block Jacobi (with optimal damping) or block Gauss-Seidel
is the most efficient choice. However, the costs per iteration also need to be
taken into account.
Table 2.8 reconsiders the results in Table 2.7 but now in terms of the computational time in seconds (using a direct coarse solver). It can be seen that
block Gauss-Seidel smoothing is relatively expensive. The deflation variant
(with block Jacobi smoothing) is the fastest in nearly all cases. This is due to
the fact that it requires only one smoothing step per iteration, instead of two.
When an optimal damping parameter is known, the preconditioning variant
reaches a comparable efficiency. This is also illustrated in Figure 2.3. However, it is an open question how the damping parameter can best be selected
in practice.
[Figure 2.3: two panels plotting the # CG iterations and the CPU time in seconds against the damping parameter ω (from 0.5 to 1), for the preconditioning variant (prec.) and the deflation variant (defl.).]
Figure 2.3. Unless an optimal damping parameter is known, deflation is faster
due to lower costs per iteration (five layers, N = 160^2, block Jacobi).
degree                    p=2      p=2      p=2      p=2      p=3      p=3      p=3      p=3
mesh                      N=40^2   N=80^2   N=160^2  N=320^2  N=40^2   N=80^2   N=160^2  N=320^2
Prec., 2x BJ (ω = 1)      0.09     0.61     3.23     15.43    0.38     1.73     7.85     37.41
Prec., 2x BJ (ω = 0.9)    0.07     0.48     2.76     13.12    0.27     1.25     5.83     27.91
Prec., 2x BJ (ω = 0.8)    0.07     0.48     2.53     12.43    0.24     1.16     5.42     25.12
Prec., 2x BJ (ω = 0.7)    0.07     0.47     2.46     12.11    0.24     1.12     5.05     24.03
Prec., 2x BJ (ω = 0.6)    0.07     0.47     2.55     12.41    0.24     1.13     5.04     24.11
Prec., 2x BJ (ω = 0.5)    0.07     0.48     2.61     12.47    0.24     1.16     5.33     25.48
Prec., 2x BGS (ω = 1)     0.15     0.80     3.84     17.50    0.41     1.84     7.75     36.32
Prec., 2x BGS (ω = 0.9)   0.15     0.80     3.85     17.03    0.41     1.79     7.71     35.36
Prec., 2x BGS (ω = 0.8)   0.15     0.80     3.96     17.64    0.42     1.79     7.97     36.36
Prec., 2x BGS (ω = 0.7)   0.14     0.82     4.07     18.10    0.42     1.89     8.22     37.28
Prec., 2x BGS (ω = 0.6)   0.15     0.84     4.08     18.46    0.43     1.94     8.44     38.23
Prec., 2x BGS (ω = 0.5)   0.15     0.84     4.19     19.10    0.45     2.06     8.66     39.21
Defl., 1x BJ (ω = 1)      0.07     0.50     2.77     13.61    0.22     1.05     4.92     23.31

Table 2.8. The deflation method and the block Jacobi smoother tend to be faster
due to lower costs per iteration (five layers, CPU time in seconds).
2.5.5 Other test cases
In this section, we repeat the experiments in Table 2.8 for more challenging
test cases. For the preconditioning variant, we only display the results for
block Jacobi smoothing without damping (ω = 1) and with optimal damping
(ω = 0.7).
Table 2.9 and Table 2.10 consider problems that mimic the occurrence of
sand inclusions within a layer of shale and groundwater flow respectively. Similar problems have been studied in [81]. Table 2.11 displays the results for a
bubbly flow problem, inspired by [75]. Table 2.12 studies a bowl-shaped problem: at the black lines in the illustration, homogeneous Neumann boundary
conditions are applied. These have a negative impact on the matrix properties,
resulting in a challenging problem. Table 2.13 considers an anisotropic problem
with two layers (with exact solution u(x, y) = cos(2πy)). Because the diffusion
is a full tensor, this test case mimics the effect of using a non-Cartesian mesh.
It can be seen from these tables that, as before, both two-level methods
yield fast and scalable convergence. Without damping, deflation is the most
efficient. When an optimal damping value is known, the preconditioning variant
performs comparably to deflation.
2.6 Conclusion
This chapter compares the two-level preconditioner proposed in [24] to an alternative (ADEF2) deflation variant for linear systems resulting from SIPG
discretizations for diffusion problems. We have found that both two-level methods yield fast and scalable convergence for diffusion problems with large jumps
in the coefficients. This result is obtained provided that the SIPG penalty
parameter is chosen dependent on local values of the permeability (using the
largest limit value at discontinuities). The latter also benefits the accuracy of
the SIPG discretization. Furthermore, the impact of using weighted averages
(SWIP) is then small. Coarse systems can be solved efficiently by applying
the CG method again in an inner loop with low accuracy. The main difference
between both methods is that the deflation method can be implemented by
skipping one of the two smoothing steps in the algorithm for the preconditioning variant. This may be particularly advantageous for expensive smoothers,
although the basic block Jacobi smoother was found to be highly effective for
the problems under consideration. Without damping, deflation can be up to
35% faster than the original preconditioner. If an optimal damping parameter
is used, both two-level strategies yield similar efficiency (deflation appears unaffected by damping). However, it remains an open question how the damping
parameter can best be selected in practice. Altogether, both two-level strategies
can contribute to faster linear solvers for SIPG systems with strong contrasts
in the coefficients, such as those encountered in oil reservoir simulations.
[Illustration of the test problem (regions with K = 1 and K = 10^-5) and plot of the CPU time in seconds against the # mesh elements (40^2 to 320^2) for Prec., 2x BJ; Prec., 2x BJ (ω = 0.7); and Defl., 1x BJ.]

# CG Iterations:
degree  mesh      degrees of freedom  Prec., 2x BJ  Prec., 2x BJ (ω = 0.7)  Defl., 1x BJ
p=2     N=40^2    9600                44            30                      42
p=2     N=80^2    38400               48            30                      42
p=2     N=160^2   153600              46            30                      42
p=2     N=320^2   614400              46            31                      43
p=3     N=40^2    16000               53            32                      46
p=3     N=80^2    64000               56            34                      47
p=3     N=160^2   256000              58            34                      48
p=3     N=320^2   1024000             59            34                      48

CPU time in seconds:
degree  mesh      degrees of freedom  Prec., 2x BJ  Prec., 2x BJ (ω = 0.7)  Defl., 1x BJ
p=2     N=40^2    9600                0.09          0.06                    0.06
p=2     N=80^2    38400               0.68          0.43                    0.47
p=2     N=160^2   153600              3.28          2.18                    2.44
p=2     N=320^2   614400              16.77         11.38                   13.14
p=3     N=40^2    16000               0.36          0.22                    0.21
p=3     N=80^2    64000               1.75          1.09                    1.05
p=3     N=160^2   256000              8.11          4.86                    4.91
p=3     N=320^2   1024000             37.42         22.01                   23.62

Table 2.9. Sand inclusions
[Illustration of the test problem (regions with K = 10^2, K = 10^4 and K = 10^-3) and plot of the CPU time in seconds against the # mesh elements (40^2 to 320^2) for Prec., 2x BJ; Prec., 2x BJ (ω = 0.7); and Defl., 1x BJ.]

# CG Iterations:
degree  mesh      degrees of freedom  Prec., 2x BJ  Prec., 2x BJ (ω = 0.7)  Defl., 1x BJ
p=2     N=40^2    9600                54            38                      54
p=2     N=80^2    38400               52            38                      54
p=2     N=160^2   153600              52            38                      54
p=2     N=320^2   614400              52            40                      55
p=3     N=40^2    16000               67            41                      59
p=3     N=80^2    64000               68            42                      59
p=3     N=160^2   256000              68            42                      60
p=3     N=320^2   1024000             69            42                      60

CPU time in seconds:
degree  mesh      degrees of freedom  Prec., 2x BJ  Prec., 2x BJ (ω = 0.7)  Defl., 1x BJ
p=2     N=40^2    9600                0.10          0.07                    0.07
p=2     N=80^2    38400               0.74          0.54                    0.59
p=2     N=160^2   153600              3.70          2.69                    3.12
p=2     N=320^2   614400              19.04         14.74                   16.77
p=3     N=40^2    16000               0.45          0.28                    0.27
p=3     N=80^2    64000               2.15          1.34                    1.30
p=3     N=160^2   256000              9.56          5.93                    6.12
p=3     N=320^2   1024000             43.27         26.88                   29.25

Table 2.10. Ground water
[Illustration of the test problem (regions with K = 10^-5 and K = 1) and plot of the CPU time in seconds against the # mesh elements (40^2 to 320^2) for Prec., 2x BJ; Prec., 2x BJ (ω = 0.7); and Defl., 1x BJ.]

# CG Iterations:
degree  mesh      degrees of freedom  Prec., 2x BJ  Prec., 2x BJ (ω = 0.7)  Defl., 1x BJ
p=2     N=40^2    9600                41            31                      41
p=2     N=80^2    38400               42            31                      39
p=2     N=160^2   153600              43            32                      40
p=2     N=320^2   614400              44            32                      41
p=3     N=40^2    16000               55            33                      45
p=3     N=80^2    64000               56            34                      45
p=3     N=160^2   256000              57            35                      45
p=3     N=320^2   1024000             58            35                      46

CPU time in seconds:
degree  mesh      degrees of freedom  Prec., 2x BJ  Prec., 2x BJ (ω = 0.7)  Defl., 1x BJ
p=2     N=40^2    9600                0.08          0.06                    0.06
p=2     N=80^2    38400               0.59          0.45                    0.44
p=2     N=160^2   153600              3.07          2.32                    2.35
p=2     N=320^2   614400              16.02         11.77                   12.63
p=3     N=40^2    16000               0.37          0.22                    0.21
p=3     N=80^2    64000               1.74          1.09                    1.00
p=3     N=160^2   256000              7.95          4.89                    4.60
p=3     N=320^2   1024000             37.00         22.66                   22.79

Table 2.11. Bubbly flow
[Illustration of the test problem (regions with K = 10^-1 and K = 1) and plot of the CPU time in seconds against the # mesh elements (40^2 to 320^2) for Prec., 2x BJ; Prec., 2x BJ (ω = 0.7); and Defl., 1x BJ.]

# CG Iterations:
degree  mesh      degrees of freedom  Prec., 2x BJ  Prec., 2x BJ (ω = 0.7)  Defl., 1x BJ
p=2     N=40^2    9600                45            34                      47
p=2     N=80^2    38400               45            35                      47
p=2     N=160^2   153600              45            36                      47
p=2     N=320^2   614400              45            36                      47
p=3     N=40^2    16000               59            36                      49
p=3     N=80^2    64000               59            37                      49
p=3     N=160^2   256000              60            37                      50
p=3     N=320^2   1024000             60            38                      50

CPU time in seconds:
degree  mesh      degrees of freedom  Prec., 2x BJ  Prec., 2x BJ (ω = 0.7)  Defl., 1x BJ
p=2     N=40^2    9600                0.09          0.07                    0.07
p=2     N=80^2    38400               0.63          0.49                    0.52
p=2     N=160^2   153600              3.26          2.62                    2.75
p=2     N=320^2   614400              16.53         13.31                   14.41
p=3     N=40^2    16000               0.40          0.24                    0.23
p=3     N=80^2    64000               1.83          1.16                    1.09
p=3     N=160^2   256000              8.35          5.20                    5.09
p=3     N=320^2   1024000             38.42         24.65                   24.69

Table 2.12. Neumann BCs
[Illustration of the test problem (two layers, each with a full diffusion tensor K, one of them scaled by 10^-3; the tensor entries are not recoverable from the extracted text) and plot of the CPU time in seconds against the # mesh elements (40^2 to 320^2) for Prec., 2x BJ; Prec., 2x BJ (ω = 0.7); and Defl., 1x BJ.]

# CG Iterations:
degree  mesh      degrees of freedom  Prec., 2x BJ  Prec., 2x BJ (ω = 0.7)  Defl., 1x BJ
p=2     N=40^2    9600                47            36                      48
p=2     N=80^2    38400               48            37                      49
p=2     N=160^2   153600              49            38                      51
p=2     N=320^2   614400              50            38                      52
p=3     N=40^2    16000               61            39                      54
p=3     N=80^2    64000               67            39                      55
p=3     N=160^2   256000              68            41                      57
p=3     N=320^2   1024000             71            42                      57

CPU time in seconds:
degree  mesh      degrees of freedom  Prec., 2x BJ  Prec., 2x BJ (ω = 0.7)  Defl., 1x BJ
p=2     N=40^2    9600                0.09          0.07                    0.07
p=2     N=80^2    38400               0.68          0.52                    0.54
p=2     N=160^2   153600              3.57          2.79                    2.99
p=2     N=320^2   614400              17.65         13.62                   15.47
p=3     N=40^2    16000               0.41          0.27                    0.25
p=3     N=80^2    64000               2.08          1.23                    1.21
p=3     N=160^2   256000              9.63          5.83                    5.85
p=3     N=320^2   1024000             44.32         26.68                   27.04

Table 2.13. Anisotropy
3 Theoretical scalability
This chapter is based on:
P. van Slingerland, C. Vuik, Scalable two-level preconditioning and deflation
based on a piecewise constant subspace for (SIP)DG systems. Submitted to
JCAM.
3.1 Introduction
Chapter 2 reformulated the two-level preconditioner proposed in [24] as a deflation method, and demonstrated numerically that both two-level variants can
yield fast and scalable CG convergence using (damped) block Jacobi smoothing.
In this chapter, we focus on theoretical support for these findings.
For symmetric problems, convergence theory for a large class of two-level
methods has been established by Falgout et al. [31]. They have derived spectral
bounds for the error iteration matrix corresponding to such two-level methods.
These results are abstract in the sense that they apply for a large family of
coarse spaces and smoothers, and for any symmetric and positive-definite coefficient matrix A. The extension to the nonsymmetric case has been presented
by Notay [57].
By applying the work in [31] to SIPG matrices, Dobrev et al. [24] have shown
theoretically that their two-level preconditioner (with coarse corrections based
on the DG discretization with polynomial degree p = 0) yields scalable convergence of the CG method (independent of the mesh element diameter). This
result was established for SIPG schemes with polynomial degree p = 1.
To extend the work in [24], in this chapter, we derive bounds for the condition number of the preconditioned system for both two-level methods studied
in Chapter 2. In particular, we show that these bounds are independent of the
mesh element diameter. Unlike before, our results also apply for p > 1 (besides
p = 1). Another difference is that we include BNN/ADEF2 deflation in the
analysis, and compare it to the original preconditioning variant. Additionally,
we demonstrate that the required restrictions on the smoother are satisfied for
(damped) block Jacobi smoothing.
The outline of this chapter is as follows. Section 3.2 specifies both two-level
methods for the linear SIPG systems under consideration. Section 3.3 discusses
abstract relations for the condition number of the preconditioned/deflated system, valid for any SPD matrix A. Section 3.4 derives an auxiliary property
of SIPG matrices that is a consequence of regularity of the mesh. Section 3.5
uses this property to obtain the main scalability result. Finally, Section 3.6
summarizes the main conclusions.
3.2 Methods and assumptions
Basically, this chapter considers the same linear SIPG systems and two-level
preconditioning strategies as in Chapter 2, but now for a larger class of meshes
and basis functions. This section specifies the slightly alternative formulations
and additional assumptions used in the theoretical analysis to come. Section
3.2.1 specifies the diffusion model and discretizes it by means of the SIPG
method. Section 3.2.2 discusses the resulting linear systems. Section 3.2.3
considers two two-level preconditioning strategies for solving the resulting linear
system by means of the preconditioned CG method.
3.2.1 SIPG discretization for diffusion problems
We study the following diffusion problem on the d-dimensional domain Ω with
boundary $\partial\Omega = \partial\Omega_D \cup \partial\Omega_N$ and outward normal n:
\[
\begin{aligned}
-\nabla \cdot (K \nabla u) &= f && \text{in } \Omega, \\
u &= g_D && \text{on } \partial\Omega_D, \\
K \nabla u \cdot n &= g_N && \text{on } \partial\Omega_N.
\end{aligned}
\tag{3.1}
\]
We assume that the diffusion K is a scalar that is bounded below and above by
positive constants, and that the other model parameters are chosen such that
a weak solution of (3.1) exists¹. Additionally, we assume that Ω is either
an interval (d = 1), a polygon (d = 2) or a polyhedron (d = 3).
To discretize the model problem (3.1), we subdivide Ω into mesh elements
$E_1, ..., E_N$ with maximum element diameter h.
We assume that each mesh element $E_i$ is affine-equivalent with a certain reference element $E_0$ that is an interval/polygon/polyhedron (independent of h) with mutually affine-equivalent edges. (3.2)
Note that all meshes consisting entirely of either intervals, triangles, tetrahedrons, parallelograms, or parallelepipeds satisfy this property.
Furthermore, we assume that the mesh is regular in the sense of [18, p. 124].
To specify this property, for all i = 0, ..., N, let $h_i$ and $\rho_i$ denote the diameter
of $E_i$, and the diameter of the largest ball contained in $E_i$ respectively. We can
now define regularity as²:
\[
\frac{h_i}{\rho_i} \lesssim 1, \qquad \forall i = 1, ..., N. \tag{3.3}
\]
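To make the ratio in (3.3) concrete, the following sketch evaluates $h_i / \rho_i$ for a single triangle. It is an illustration only: the function name is hypothetical, and it relies on the standard fact that the largest ball contained in a triangle is its inscribed circle, whose radius equals the area divided by the semi-perimeter.

```python
import math


def regularity_ratio(vertices):
    """Return h / rho for a triangle given as three (x, y) vertices.

    h is the diameter of the triangle (its longest edge); rho is the diameter
    of the inscribed circle, i.e. twice the inradius area / semi-perimeter.
    """
    (x1, y1), (x2, y2), (x3, y3) = vertices
    edges = [math.dist((x1, y1), (x2, y2)),
             math.dist((x2, y2), (x3, y3)),
             math.dist((x3, y3), (x1, y1))]
    area = 0.5 * abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1))
    semi_perimeter = 0.5 * sum(edges)
    rho = 2.0 * area / semi_perimeter
    return max(edges) / rho


equilateral = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3.0) / 2.0)]
sliver = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.01)]
print(regularity_ratio(equilateral))  # well-shaped element: small ratio
print(regularity_ratio(sliver))       # nearly degenerate element: large ratio
```

The ratio is invariant under uniform scaling of the element, so a family of meshes obtained by uniform refinement keeps it bounded, whereas a family containing increasingly flat elements does not.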
¹ That is, $f, g_N \in L^2(\Omega)$ and $g_D \in H^{1/2}(\Omega)$ [63, p. 25, 26].
² Throughout this paper, we use the symbol $\lesssim$ in expressions of the form “$F(x) \lesssim G(x)$ for all $x \in X$” to indicate that there exists a constant C > 0, independent of the variable x and the maximum mesh element diameter h (or the number of mesh elements), such that $F(x) \le C\,G(x)$ for all $x \in X$. The symbol $\gtrsim$ is defined similarly.
Now that we have specified the mesh, we can construct an SIPG approximation for our model problem (3.1). To this end, define the test space V that
contains each function that is a polynomial of degree p or lower within each
mesh element, and that may be discontinuous at the element boundaries. The
SIPG approximation $u_h$ is now defined as the unique element in this test space
that satisfies the relation
\[
B(u_h, v) = L(v), \qquad \text{for all test functions } v \in V,
\]
where B and L are (bi)linear forms that are similar to those defined earlier in
Section 2.2.1, but now for a larger class of meshes. The only difference is that
the quantity h is replaced by an alternative value that depends on the edge
under consideration (and that K is scalar).
To specify this, we use the same notation as before: the vector $n_i$ denotes
the outward normal of mesh element $E_i$; the set $\Gamma_h$ is the collection of all interior
edges $e = \partial E_i \cap \partial E_j$; the set $\Gamma_D$ is the collection of all Dirichlet boundary edges
$e = \partial E_i \cap \partial\Omega_D$; the set $\Gamma_N$ is the collection of all Neumann boundary edges
$e = \partial E_i \cap \partial\Omega_N$; and [.] and {.} denote the usual trace operators for jumps and
averages at the mesh element boundaries, as defined in Section 2.2.1.
Additionally, for all edges e, we write $h_e$ to denote the length of the largest
mesh element adjacent to e for one-dimensional problems, the length of e for
two-dimensional problems, and the square root of the surface area of e for three-dimensional problems. Using this notation, the forms B and L are defined as
follows (for one-dimensional problems, the boundary integrals below should be
interpreted as function evaluations of the integrand):
\[
\begin{aligned}
B_\Omega(u_h, v) &= \sum_{i=1}^{N} \int_{E_i} K \nabla u_h \cdot \nabla v, \\
B_\sigma(u_h, v) &= \sum_{e \in \Gamma_h \cup \Gamma_D} \int_e \frac{\sigma}{h_e} [u_h] \cdot [v], \\
B_r(u_h, v) &= -\sum_{e \in \Gamma_h \cup \Gamma_D} \int_e \Big( \{K \nabla u_h\} \cdot [v] + [u_h] \cdot \{K \nabla v\} \Big), \\
B(u_h, v) &= B_\Omega(u_h, v) + B_\sigma(u_h, v) + B_r(u_h, v), \\
L(v) &= \int_\Omega f v - \sum_{e \in \Gamma_D} \int_e \Big( [K \nabla v] + \frac{\sigma}{h_e} v \Big) g_D + \sum_{e \in \Gamma_N} \int_e v g_N,
\end{aligned}
\tag{3.4}
\]
where σ ≥ 0 is the penalty parameter (cf. Section 2.2.3). Its value may
vary throughout the domain, and we assume that it is bounded below and
above by positive constants (independent of the maximum element diameter
h). Furthermore, we assume that it is sufficiently large so that the scheme is
coercive³ [63, p. 38–40], i.e.:
\[
B_\Omega(v, v) + B_\sigma(v, v) \lesssim B(v, v), \qquad \forall v \in V. \tag{3.5}
\]
³ For coercivity, it suffices that $\sigma \ge 2 C \frac{k_1^2}{k_0} n_0$, where $k_0$ and $k_1$ are the global lower and
upper bounds for the diffusion coefficient K respectively, $n_0$ is the maximum number of
neighbors an element can have (e.g. $n_0 = 4$ for a two-dimensional quadrilateral mesh), and
C is a constant occurring in a trace inequality that does not depend on h (but may depend
on p). See [63, p. 23, 38-39] for more details.
3.2.2 Linear system
In order to compute the SIPG approximation, we need to choose a basis for the
test space V. In Section 2.2.2, we have discussed the monomial basis functions
for uniform Cartesian meshes. We now consider a more general class: for all
i = 1, ..., N, let $F_i : E_i \to E_0$ denote an invertible affine mapping (which exists
by assumption (3.2)). Furthermore, let the functions $\phi_k^{(0)} : E_0 \to \mathbb{R}$ (with
k = 1, ..., m) form a basis for the space of all polynomials of degree p and lower
on the reference element (setting $\phi_1^{(0)} = 1$). Using this basis on the reference
element, the basis function $\phi_k^{(i)}$ (with k = 1, ..., m and i = 1, ..., N) is zero in the
entire domain, except in the mesh element $E_i$, where it reads $\phi_k^{(i)} = \phi_k^{(0)} \circ F_i$.
Now that we have defined the basis functions, we can express uh as a linear
combination of these functions and construct a linear system. Although we are
now considering a larger class of meshes and basis functions, we arrive at the
same forms we have seen earlier in Section 2.2.2:
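\[
u_h = \sum_{i=1}^{N} \sum_{k=1}^{m} u_k^{(i)} \phi_k^{(i)},
\]
where the unknowns $u_k^{(i)}$ in this expression can be determined by solving a
linear system Au = b of the following form:
\[
\begin{bmatrix}
A_{11} & A_{12} & \dots & A_{1N} \\
A_{21} & A_{22} & & \vdots \\
\vdots & & \ddots & \vdots \\
A_{N1} & \dots & \dots & A_{NN}
\end{bmatrix}
\begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_N \end{bmatrix}
=
\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_N \end{bmatrix},
\tag{3.6}
\]
where the blocks all have dimension m, and where, for all i, j = 1, ..., N:
\[
A_{ji} =
\begin{bmatrix}
B(\phi_1^{(i)}, \phi_1^{(j)}) & B(\phi_2^{(i)}, \phi_1^{(j)}) & \dots & B(\phi_m^{(i)}, \phi_1^{(j)}) \\
B(\phi_1^{(i)}, \phi_2^{(j)}) & B(\phi_2^{(i)}, \phi_2^{(j)}) & & \vdots \\
\vdots & & \ddots & \vdots \\
B(\phi_1^{(i)}, \phi_m^{(j)}) & \dots & \dots & B(\phi_m^{(i)}, \phi_m^{(j)})
\end{bmatrix},
\quad
u_i = \begin{bmatrix} u_1^{(i)} \\ u_2^{(i)} \\ \vdots \\ u_m^{(i)} \end{bmatrix},
\quad
b_j = \begin{bmatrix} L(\phi_1^{(j)}) \\ L(\phi_2^{(j)}) \\ \vdots \\ L(\phi_m^{(j)}) \end{bmatrix}.
\tag{3.7}
\]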
Note that A is Symmetric and Positive-Definite (SPD), as the bilinear form B
is SPD (cf. (3.4) and (3.5)).
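The block structure in (3.6) can be illustrated with a short NumPy sketch. This is a sketch under stated assumptions rather than an SIPG assembly: the matrix is a random SPD stand-in, the helper names are hypothetical, and the restriction shown is one simple choice that selects the coefficient of the constant basis function $\phi_1^{(i)}$ in every element.

```python
import numpy as np


def block(A, j, i, m):
    """Return the m x m block A_ji (block row j, block column i), 1-based as in (3.6)."""
    return A[(j - 1) * m: j * m, (i - 1) * m: i * m]


def block_diagonal(A, m):
    """Keep only the diagonal blocks A_11, ..., A_NN (the block Jacobi matrix)."""
    N = A.shape[0] // m
    M = np.zeros_like(A)
    for i in range(1, N + 1):
        s = slice((i - 1) * m, i * m)
        M[s, s] = block(A, i, i, m)
    return M


def restriction(N, m):
    """Hypothetical restriction: select the first unknown of every block,
    i.e. the coefficient of the constant basis function of each element."""
    R = np.zeros((N, N * m))
    R[np.arange(N), np.arange(N) * m] = 1.0
    return R


# Random SPD stand-in for A (an assumption; not an actual SIPG matrix).
rng = np.random.default_rng(0)
N, m = 4, 3
G = rng.standard_normal((N * m, N * m))
A = G @ G.T + N * m * np.eye(N * m)

R = restriction(N, m)
A0 = R @ A @ R.T           # N x N coarse matrix built from the constant modes
M_bj = block_diagonal(A, m)
```

With this choice of restriction, the entries of `A0` are exactly the entries of A that couple the constant basis functions, which is why the coarse matrix can be read as a discretization with polynomial degree p = 0.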
3.2.3 Two-level preconditioning and deflation
To solve the linear SIPG system by means of the preconditioned CG method, we
consider the two-level preconditioner and the corresponding deflation variant
discussed in Chapter 2. For the deflation method, we consider the alternative
(yet equivalent) BNN formulation. We specify both methods below, as well as
some additional assumptions on the smoother.
Recall the restriction operator R, which is defined such that $A_0 := R A R^T$ is
the SIPG matrix for polynomial degree p = 0, and the coarse correction operator $Q := R^T A_0^{-1} R$. The two-level preconditioner (Definition 2.3.1) combines
this operator with a nonsingular smoother $M_{\mathrm{prec}}^{-1} \approx A^{-1}$. The result $y = P_{\mathrm{prec}}^{-1} r$
of applying the two-level preconditioner to a vector r can be computed in three
steps:
\[
\begin{aligned}
y^{(1)} &= M_{\mathrm{prec}}^{-1} r && \text{(pre-smoothing)}, \\
y^{(2)} &= y^{(1)} + Q (r - A y^{(1)}) && \text{(coarse correction)}, \\
y &= y^{(2)} + M_{\mathrm{prec}}^{-T} (r - A y^{(2)}) && \text{(post-smoothing)}.
\end{aligned}
\tag{3.8}
\]
Basically, the BNN deflation variant is obtained by turning (3.8) inside out,
and using an SPD smoother $M_{\mathrm{defl}}^{-1} \approx A^{-1}$. The result $y = P_{\mathrm{defl}}^{-1} r$ of applying
the BNN deflation technique to a vector r can now be computed as:
\[
\begin{aligned}
y^{(1)} &:= Q r && \text{(pre-coarse correction)}, \\
y^{(2)} &:= y^{(1)} + M_{\mathrm{defl}}^{-1} (r - A y^{(1)}) && \text{(smoothing)}, \\
y &:= y^{(2)} + Q (r - A y^{(2)}) && \text{(post-coarse correction)}.
\end{aligned}
\tag{3.9}
\]
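The steps in (3.8) and (3.9) can be sketched in a few lines of NumPy. The following is a minimal illustration under stated assumptions, not the implementation used for the experiments: the matrix is a small 1D Laplacian stencil standing in for an SIPG matrix, the restriction `R` simply selects the first unknown of every block (mimicking the constant basis function), the smoother is block Jacobi, and all function names are hypothetical.

```python
import numpy as np


def model_problem(N=8, m=3):
    """Toy SPD matrix with N diagonal blocks of size m (1D Laplacian stencil)."""
    n = N * m
    A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    # Assumed restriction: keep the first unknown of every block.
    R = np.zeros((N, n))
    R[np.arange(N), np.arange(N) * m] = 1.0
    # Block Jacobi smoother: block diagonal of A with m x m blocks.
    M = np.zeros_like(A)
    for i in range(N):
        s = slice(i * m, (i + 1) * m)
        M[s, s] = A[s, s]
    return A, R, M


def coarse_correction(A, R):
    """Q = R^T A_0^{-1} R with A_0 = R A R^T."""
    A0 = R @ A @ R.T
    return R.T @ np.linalg.solve(A0, R)


def apply_prec(A, Q, M, r):
    """Two-level preconditioner, steps (3.8)."""
    y1 = np.linalg.solve(M, r)                    # pre-smoothing
    y2 = y1 + Q @ (r - A @ y1)                    # coarse correction
    return y2 + np.linalg.solve(M.T, r - A @ y2)  # post-smoothing


def apply_bnn(A, Q, M, r):
    """BNN deflation variant, steps (3.9)."""
    y1 = Q @ r                                    # pre-coarse correction
    y2 = y1 + np.linalg.solve(M, r - A @ y1)      # smoothing
    return y2 + Q @ (r - A @ y2)                  # post-coarse correction


A, R, M = model_problem()
Q = coarse_correction(A, R)
r = np.ones(A.shape[0])
y_prec = apply_prec(A, Q, M, r)
y_defl = apply_bnn(A, Q, M, r)
```

Both functions use dense solves for readability; in practice the smoother and the coarse system would be applied with sparse or iterative solvers. Note that the deflation variant applies the smoother once per call, whereas the preconditioner applies it twice.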
Both two-level strategies can be implemented in a standard preconditioned
CG algorithm (cf. Section 2.3.3). We stress that the BNN deflation variant
can be implemented more efficiently in a CG algorithm by using the so-called
ADEF2 deflation variant (cf. Section 2.4). However, for the theoretical purposes in this paper, it is more convenient to study BNN rather than ADEF2.
Furthermore, we require some additional assumptions on the smoothers, indicated hereafter.
To specify these, for any SPD matrix M, let
\[
\pi_M := R^T (R M R^T)^{-1} R M \tag{3.10}
\]
denote the projection onto the coarse space Range($R^T$) that yields the best
approximation in the M-norm [31]. Additionally, for any nonsingular matrix
M such that $M + M^T - A$ is SPD, define the symmetrization
\[
\widetilde{M} := M^T (M + M^T - A)^{-1} M.
\]
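The two definitions above translate directly into code. The sketch below is illustrative only: the matrix is a small 1D Laplacian stencil, the restriction selects the first unknown of every block, and the nonsymmetric smoother is point Gauss-Seidel (the lower triangle of A), chosen here merely as a convenient example for which $M + M^T - A$ equals the diagonal of A. The function names are hypothetical.

```python
import numpy as np


def projection(R, M):
    """pi_M = R^T (R M R^T)^{-1} R M, cf. (3.10)."""
    return R.T @ np.linalg.solve(R @ M @ R.T, R @ M)


def symmetrization(M, A):
    """M~ = M^T (M + M^T - A)^{-1} M, defined when M + M^T - A is SPD."""
    return M.T @ np.linalg.solve(M + M.T - A, M)


n, m = 6, 3
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
R = np.zeros((n // m, n))
R[np.arange(n // m), np.arange(n // m) * m] = 1.0

pi_A = projection(R, A)             # A-orthogonal projection onto Range(R^T)
M_gs = np.tril(A)                   # point Gauss-Seidel smoother (nonsymmetric)
M_tilde = symmetrization(M_gs, A)   # symmetric by construction
```

For this example one can verify numerically that `pi_A` is idempotent and leaves the columns of `R.T` unchanged, and that `M_tilde - A` has no negative eigenvalues.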
Using this notation, we can now specify the additional smoother requirements.
For the preconditioner, we assume:
\[
M_{\mathrm{prec}} + M_{\mathrm{prec}}^T - A \text{ is SPD}, \tag{3.11}
\]
\[
h^{2-d}\, v^T \widetilde{M}_{\mathrm{prec}} v \lesssim v^T v, \qquad \forall v \in \mathrm{Range}(I - \pi_I). \tag{3.12}
\]
For the deflation method, we assume:
\[
2 M_{\mathrm{defl}} - A \text{ is SPD}, \tag{3.13}
\]
\[
h^{2-d}\, v^T M_{\mathrm{defl}} v \lesssim v^T v, \qquad \forall v \in \mathrm{Range}(I - \pi_I). \tag{3.14}
\]
We have seen in Chapter 2 that the operator $P_{\mathrm{prec}}^{-1}$ is SPD assuming that
(3.11) is satisfied (for the deflation variant, it suffices that $M_{\mathrm{defl}}$ is SPD). The
conditions (3.11) and (3.13) imply that “the smoother iteration is a contraction
in the A-norm” [31, p. 473]. The main idea behind the conditions (3.12) and
(3.14) is that the smoother should scale with $h^{2-d}$ in the same way that A
does, and that $\widetilde{M}$ is an efficient preconditioner for A in the space orthogonal
to the coarse space Range($R^T$) [79, p. 78]. A slightly stronger version of (3.12)
is also used in [24] to establish scalable convergence.
It will be shown in Section 3.5.2 that the requirements (3.11)–(3.14) are
satisfied for (damped) block Jacobi smoothing (with m × m blocks). A similar
strategy can be used to show (3.11) and (3.12) for (damped) block Gauss-Seidel
smoothing (with m × m blocks)⁴.
⁴ To show that (3.11) and (3.12) are valid for block Gauss-Seidel smoothing, note that
(3.11) is automatically satisfied as $M_{\mathrm{prec}} + M_{\mathrm{prec}}^T - A$ is the block diagonal of A, which is
SPD. To show (3.12), the main idea is to follow a similar strategy as in [79, Proposition 6.12],
and then use the result for block Jacobi obtained in Section 3.5.2.
3.3 Abstract relations for any SPD matrix A
This section discusses abstract relations for the condition number of the preconditioned system for both two-level methods. These results are abstract in
the sense that they apply for any SPD matrix A, so they are not restricted
to SIPG matrices. Section 3.3.1 discusses the condition number of a preconditioned system for a certain class of operators. Section 3.3.2 considers the
specific implications for both two-level methods. Section 3.3.3 compares them
under specific assumptions on the smoother.
3.3.1 Using the error iteration matrix
In this section, we consider the condition number of the preconditioned system for arbitrary SPD matrices A and a certain class of SPD preconditioners
$P^{-1}$. Specifically, each preconditioner $P^{-1}$ in this class is such that, for some
SPD matrix M, the so-called error iteration matrix $I - P^{-1}A$ has the same
eigenvalues as (recall the notation in Section 3.2.3):
\[
T_M := (I - \pi_A)(I - M^{-1}A)(I - \pi_A).
\]
In Section 3.3.2 hereafter, we will see that both two-level methods are in this
class for certain specific choices of M. Defining
\[
K_M := \sup_{v \neq 0} \frac{\|(I - \pi_M) v\|_M^2}{\|v\|_A^2}, \tag{3.15}
\]
we now have the following result:
Lemma 3.3.1 The condition number (in the 2-norm) of the preconditioned
system $P^{-1}A$ above can be bounded as follows⁵:
\[
\kappa_2(P^{-1}A) \le \lambda_{\max}(M^{-1}A)\, K_M. \tag{3.16}
\]
Additionally, if⁶ $M - A \ge 0$, then,
\[
\kappa_2(P^{-1}A) = K_M. \tag{3.17}
\]
⁵ Throughout this chapter, $\lambda_{\min}$ and $\lambda_{\max}$ denote the smallest and largest eigenvalue of a
matrix with real eigenvalues respectively.
⁶ Throughout this chapter, for symmetrical matrices $M_1, M_2 \in \mathbb{R}^{n \times n}$, we write $M_1 \le M_2$
to indicate that $v^T M_1 v \le v^T M_2 v$ for all vectors $v \in \mathbb{R}^n$; the notation ≥, <, and > is used
similarly.
To show this, we use that $T_M$ has real eigenvalues (as $A^{1/2} T_M A^{-1/2}$ is symmetric),
and that [57, Theorem 2.1 and Corollary 2.1]:
\[
T_M \text{ has } m \text{ times eigenvalue } 0, \tag{3.18}
\]
\[
\lambda_{\min}(T_M) \ge 1 - \lambda_{\max}(M^{-1}A), \tag{3.19}
\]
\[
\lambda_{\max}(T_M) = 1 - \frac{1}{\lambda_{\max}\big(A^{-1} M (I - \pi_M)\big)}. \tag{3.20}
\]
Proof (of Lemma 3.3.1) First, note that $P^{-1}A$ and $P^{-1/2} A P^{-1/2}$ have the same positive
eigenvalues and singular values. Hence, we may express the condition number as:
\[
\kappa_2(P^{-1}A) = \frac{\lambda_{\max}(P^{-1}A)}{\lambda_{\min}(P^{-1}A)} = \frac{1 - \lambda_{\min}(T_M)}{1 - \lambda_{\max}(T_M)} \overset{(3.20)}{=} \big(1 - \lambda_{\min}(T_M)\big)\, \lambda_{\max}\big(A^{-1} M (I - \pi_M)\big). \tag{3.21}
\]
Because $I - \pi_M = (I - \pi_M)^2$ is a projection and $M(I - \pi_M)$ is symmetric, it follows
that:
\[
\begin{aligned}
\lambda_{\max}\big(A^{-1} M (I - \pi_M)\big) &= \lambda_{\max}\big(A^{-1} M (I - \pi_M)^2\big) \\
&= \lambda_{\max}\big(A^{-1} (I - \pi_M)^T M (I - \pi_M)\big) \\
&= \lambda_{\max}\big(A^{-1/2} (I - \pi_M)^T M (I - \pi_M) A^{-1/2}\big) \\
&= \sup_{v \neq 0} \frac{\|(I - \pi_M) v\|_M^2}{\|v\|_A^2} \overset{(3.15)}{=} K_M.
\end{aligned}
\]
Substitution into (3.21) yields:
\[
\kappa_2(P^{-1}A) = \big(1 - \lambda_{\min}(T_M)\big) K_M. \tag{3.22}
\]
Application of (3.19) now completes the proof of (3.16). To show (3.17), assume that
$M - A \ge 0$, which implies that $I - A^{1/2} M^{-1} A^{1/2} \ge 0$:
\[
M \ge A
\;\overset{[40,\ \text{p. 398, 471}]}{\Longrightarrow}\;
M^{-1} \le A^{-1}
\;\Longrightarrow\;
A^{1/2} M^{-1} A^{1/2} \le I
\;\Longrightarrow\;
I - A^{1/2} M^{-1} A^{1/2} \ge 0.
\]
Hence, defining the symmetric projection $\bar{\pi}_A := A^{1/2} \pi_A A^{-1/2}$, it follows that the eigenvalues of $T_M$ are non-negative:
\[
A^{1/2} T_M A^{-1/2} = (I - \bar{\pi}_A)(I - A^{1/2} M^{-1} A^{1/2})(I - \bar{\pi}_A) \ge 0.
\]
As a result, (3.18) implies that $\lambda_{\min}(T_M) = 0$. Substitution into (3.22) yields (3.17),
which then completes the proof.
3.3.2 Implications for the two-level methods
Next, we apply the result in the previous section to analyze the condition
number of the preconditioned system for both the two-level preconditioner
$P_{\mathrm{prec}}^{-1}$ and the corresponding BNN deflation variant $P_{\mathrm{defl}}^{-1}$ (as specified in Section
3.2.3). For the two-level preconditioner, it is well-known that
\[
\kappa_2(P_{\mathrm{prec}}^{-1} A) \le K_{\widetilde{M}_{\mathrm{prec}}}. \tag{3.23}
\]
This follows as a special case from [31] (also cf. [79, p. 70-73]), and relies
on assumption (3.11). Below, we observe that the theory in [57] implies (via
Lemma 3.3.1) that (3.23) remains true if we replace the inequality by equality.
Furthermore, we obtain similar bounds for the deflation variant. Altogether,
we have the following result, which applies for any SPD matrix A:
Lemma 3.3.2 Suppose that A is SPD and let $P_{\mathrm{prec}}^{-1}$ and $P_{\mathrm{defl}}^{-1}$ be the two-level operators specified in Section 3.2.3. Then, assuming (3.11), the condition number (in the 2-norm) of the preconditioned system $P_{\mathrm{prec}}^{-1} A$ can be
expressed as follows:
\[
\kappa_2(P_{\mathrm{prec}}^{-1} A) = K_{\widetilde{M}_{\mathrm{prec}}}. \tag{3.24}
\]
Additionally, assuming (3.13), we have for deflation:
\[
\kappa_2(P_{\mathrm{defl}}^{-1} A) \le \lambda_{\max}(M_{\mathrm{defl}}^{-1} A)\, K_{M_{\mathrm{defl}}} < 2 K_{M_{\mathrm{defl}}}, \tag{3.25}
\]
and, under the stronger assumption $M_{\mathrm{defl}} - A \ge 0$:
\[
\kappa_2(P_{\mathrm{defl}}^{-1} A) = K_{M_{\mathrm{defl}}}. \tag{3.26}
\]
To show this result, we apply Lemma 3.3.1, using (σ denotes the spectrum):
\[
\sigma\big(I - P_{\mathrm{prec}}^{-1} A\big) = \sigma\big(T_{\widetilde{M}_{\mathrm{prec}}}\big), \tag{3.27}
\]
\[
\sigma\big(I - P_{\mathrm{defl}}^{-1} A\big) = \sigma\big(T_{M_{\mathrm{defl}}}\big). \tag{3.28}
\]
These relations follow similarly to [75, p. 1730]. Finally, we use that, for any
nonsingular M [79, Proposition 3.8]:
\[
M + M^T - A > 0 \quad \Rightarrow \quad \widetilde{M} - A \ge 0. \tag{3.29}
\]
Proof (of Lemma 3.3.2) Combining (3.11) and (3.29) gives $\widetilde{M}_{\mathrm{prec}} - A \ge 0$. Using this
with (3.27) in Lemma 3.3.1 yields (3.24). Similarly, (3.26) follows from Lemma 3.3.1
using (3.28) and the assumption $M_{\mathrm{defl}} - A \ge 0$. To show (3.25), note that the first
inequality results from Lemma 3.3.1 and (3.28), while the second inequality follows from
observing that (3.13) implies that $\lambda_{\max}(M_{\mathrm{defl}}^{-1} A) < 2$. This completes the proof.
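The relations (3.25) and (3.26) can be checked numerically on a small example. The sketch below is an illustration under stated assumptions: A is a 1D Laplacian stencil rather than an SIPG matrix, the restriction selects the first unknown of every block, the smoother is block Jacobi, and the condition number is evaluated as the ratio of the extreme eigenvalues of $P_{\mathrm{defl}}^{-1} A$. All function names are hypothetical.

```python
import numpy as np


def lam_max(M, S):
    """Largest eigenvalue of M^{-1} S for SPD M and symmetric S."""
    L_inv = np.linalg.inv(np.linalg.cholesky(M))
    return np.linalg.eigvalsh(L_inv @ S @ L_inv.T).max()


def K(M, A, R):
    """K_M from (3.15), evaluated as lambda_max(A^{-1} (I - pi_M)^T M (I - pi_M))."""
    n = A.shape[0]
    P = np.eye(n) - R.T @ np.linalg.solve(R @ M @ R.T, R @ M)
    return lam_max(A, P.T @ M @ P)


def cond_bnn(M, A, R):
    """Eigenvalue ratio of P_defl^{-1} A, with P_defl^{-1} assembled from (3.9)."""
    n = A.shape[0]
    Q = R.T @ np.linalg.solve(R @ A @ R.T, R)
    E = np.eye(n) - Q @ A
    P_inv = Q + E @ np.linalg.solve(M, E.T)
    L = np.linalg.cholesky(A)
    B = L.T @ P_inv @ L               # symmetric, same spectrum as P_inv @ A
    lam = np.linalg.eigvalsh(0.5 * (B + B.T))
    return lam.max() / lam.min()


N, m = 6, 3
n = N * m
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
R = np.zeros((N, n))
R[np.arange(N), np.arange(N) * m] = 1.0
M_bj = np.zeros_like(A)
for i in range(N):
    s = slice(i * m, (i + 1) * m)
    M_bj[s, s] = A[s, s]

lmax = lam_max(M_bj, A)               # below 2 here, so (3.13) holds
M_damped = (lmax / 0.99) * M_bj       # scaled so that M_damped - A is SPD

print(cond_bnn(M_bj, A, R), lmax * K(M_bj, A, R))      # bound (3.25)
print(cond_bnn(M_damped, A, R), K(M_damped, A, R))     # equality (3.26)
```

The expression used for the deflation operator, $Q + (I - QA) M^{-1} (I - AQ)$, follows from composing the three steps in (3.9) and using $QAQ = Q$.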
3.3.3 Comparing deflation and preconditioning
In this section, we compare both two-level methods in terms of the corresponding condition numbers. In [75, Theorem 6.1], it has been shown that the
eigenvalues of $P_{\mathrm{prec}}^{-1} A$ and $P_{\mathrm{defl}}^{-1} A$ are equal if $M_{\mathrm{defl}} = \widetilde{M}_{\mathrm{prec}}$. Below, we compare
both two-level methods in case they both use the same smoother.
Theorem 3.3.3 Suppose that A is SPD and let $P_{\mathrm{prec}}^{-1}$ and $P_{\mathrm{defl}}^{-1}$ be the two-level operators specified in Section 3.2.3. Furthermore, choose $M_{\mathrm{prec}} = M_{\mathrm{defl}} =: M$ SPD with $M - A \ge 0$. Then, both methods are related in the
following sense:
\[
\frac{1}{2}\, \kappa_2(P_{\mathrm{defl}}^{-1} A) \le \kappa_2(P_{\mathrm{prec}}^{-1} A) \le \kappa_2(P_{\mathrm{defl}}^{-1} A). \tag{3.30}
\]
Before showing this result, we note that it implies that the CG convergence for
the preconditioner is asymptotically not worse than for the deflation variant,
at least under the assumption $M - A \ge 0$. In general, we may have $2M - A > 0$ rather
than the stronger assumption $M - A \ge 0$. This is the case for block Jacobi
smoothing in the numerical experiments in Section 2.5, where we have seen that
deflation can yield fewer iterations (at lower costs per iteration). Nevertheless,
if the smoother M is replaced by the damped alternative $\omega^{-1} M$ such that
$\omega^{-1} M - A \ge 0$ (note that any $\omega \le \frac{1}{2}$ suffices if $2M - A > 0$), then the result
above applies (although a larger ω might give better results). Altogether,
Theorem 3.3.3 provides insight into the way both two-level methods are related,
but does not imply that preconditioning is always better.
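The relation (3.30) and the role of damping can be illustrated numerically. The sketch below is a toy check under stated assumptions, not a proof: A is a 1D Laplacian stencil, the restriction selects the first unknown of every block, and the block Jacobi smoother is damped with a hypothetical choice $\omega = 0.99 / \lambda_{\max}(M^{-1}A)$ so that $\omega^{-1} M - A$ is SPD. Condition numbers are evaluated as ratios of extreme eigenvalues.

```python
import numpy as np


def two_level_operators(A, R, M):
    """Assemble P_prec^{-1} from (3.8) and P_defl^{-1} from (3.9) as dense matrices."""
    I = np.eye(A.shape[0])
    Q = R.T @ np.linalg.solve(R @ A @ R.T, R)           # coarse correction operator
    Y1 = np.linalg.solve(M, I)                          # pre-smoothing
    Y2 = Y1 + Q @ (I - A @ Y1)                          # coarse correction
    P_prec_inv = Y2 + np.linalg.solve(M.T, I - A @ Y2)  # post-smoothing
    Z2 = Q + np.linalg.solve(M, I - A @ Q)              # pre-coarse correction, smoothing
    P_defl_inv = Z2 + Q @ (I - A @ Z2)                  # post-coarse correction
    return P_prec_inv, P_defl_inv


def kappa(P_inv, A):
    """lambda_max / lambda_min of P^{-1} A via the symmetric matrix L^T P^{-1} L."""
    L = np.linalg.cholesky(A)
    B = L.T @ P_inv @ L
    lam = np.linalg.eigvalsh(0.5 * (B + B.T))
    return lam.max() / lam.min()


N, m = 6, 3
n = N * m
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
R = np.zeros((N, n))
R[np.arange(N), np.arange(N) * m] = 1.0
M_bj = np.zeros_like(A)
for i in range(N):
    s = slice(i * m, (i + 1) * m)
    M_bj[s, s] = A[s, s]

lmax = np.linalg.eigvals(np.linalg.solve(M_bj, A)).real.max()
omega = 0.99 / lmax          # damping chosen so that M_bj / omega - A is SPD
M = M_bj / omega

P_prec_inv, P_defl_inv = two_level_operators(A, R, M)
kappa_prec = kappa(P_prec_inv, A)
kappa_defl = kappa(P_defl_inv, A)
print(omega, kappa_prec, kappa_defl)
```

In this example the damping parameter obtained from $\lambda_{\max}(M^{-1}A)$ is larger than the always-sufficient value 1/2, which matches the remark that a larger ω may be admissible.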
To show Theorem 3.3.3, we use, for any SPD matrices M, N:
\[
K_M \le \bigl\| N^{-1/2} M (I - \pi_M) N^{-1/2} \bigr\|_2 \, K_N. \tag{3.31}
\]
This follows from the more general work in [Not10, Corollary 2.2]. Additionally, we use, for any SPD matrix M [Not05, eq. (45)]:
\[
\tfrac{1}{2} M \le \widetilde{M} \le \frac{1}{2 - \lambda_{\max}(M^{-1} A)}\, M. \tag{3.32}
\]
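As a brief plausibility check of (3.32) (a sketch only, not the argument of [Not05]): assume that M is symmetric with 2M − A > 0 and that $\widetilde{M} = M(2M - A)^{-1}M$, which is the form used for symmetric smoothers later in this chapter. Inverting both sides (which reverses the ordering of SPD matrices) and then multiplying from the left and right by M reduces both bounds to statements about A:

```latex
\begin{align*}
\tfrac12 M \le \widetilde M
  &\iff \widetilde M^{-1} \le 2M^{-1}
   \iff M\widetilde M^{-1}M = 2M - A \le 2M
   \iff A \ge 0, \\
\widetilde M \le \frac{1}{2-\lambda_{\max}(M^{-1}A)}\,M
  &\iff 2M - A \ge \bigl(2-\lambda_{\max}(M^{-1}A)\bigr)M
   \iff \lambda_{\max}(M^{-1}A)\,M - A \ge 0.
\end{align*}
```

Both right-most statements hold for SPD matrices A and M: the first trivially, the second because $\lambda_{\max}(M^{-1}A)$ is the largest generalized eigenvalue of the pair (A, M).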
Theorem 3.3.3 can now be shown as follows:
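Proof (of Theorem 3.3.3) First, we show that (3.31) implies that, for any SPD matrices M, N and scalar α > 0:
\[
M \le \alpha N \quad\Rightarrow\quad K_M \le \alpha K_N. \tag{3.33}
\]
To show this, define $\bar\pi_M = M^{1/2} \pi_M M^{-1/2}$ and observe that $I - \bar\pi_M$ is a symmetric projection. Hence:
\begin{align*}
\bigl\| N^{-1/2} M (I - \pi_M) N^{-1/2} \bigr\|_2^2
&= \bigl\| N^{-1/2} M^{1/2}\, M^{1/2} (I - \pi_M) M^{-1/2}\, M^{1/2} N^{-1/2} \bigr\|_2^2 \\
&= \bigl\| N^{-1/2} M^{1/2} (I - \bar\pi_M) M^{1/2} N^{-1/2} \bigr\|_2^2 \\
&= \sup_{v \ne 0} \frac{\bigl\| (I - \bar\pi_M) M^{1/2} N^{-1/2} v \bigr\|^2_{M^{1/2} N^{-1} M^{1/2}}}{\|v\|_2^2} \\
&\le \alpha \sup_{v \ne 0} \frac{\bigl\| (I - \bar\pi_M) M^{1/2} N^{-1/2} v \bigr\|_2^2}{\|v\|_2^2}
  && \text{(as } M \le \alpha N \Rightarrow M^{1/2} N^{-1} M^{1/2} \le \alpha\text{)} \\
&\le \alpha \sup_{v \ne 0} \frac{\bigl\| M^{1/2} N^{-1/2} v \bigr\|_2^2}{\|v\|_2^2}
  && \text{(as } I - \bar\pi_M \text{ is a symmetric projection)} \\
&\le \alpha \sup_{v \ne 0} \frac{\|v\|^2_{N^{-1/2} M N^{-1/2}}}{\|v\|_2^2} \\
&\le \alpha^2 \sup_{v \ne 0} \frac{\|v\|_2^2}{\|v\|_2^2}
  && \text{(as } M \le \alpha N \Rightarrow N^{-1/2} M N^{-1/2} \le \alpha\text{)} \\
&= \alpha^2.
\end{align*}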
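This completes the proof of (3.33). Combining (3.33) and (3.32) now gives:
\[
\tfrac{1}{2} K_M \le K_{\widetilde{M}} \le \frac{1}{2 - \lambda_{\max}(M^{-1} A)}\, K_M.
\]
Application of Lemma 3.3.2 (using $M_{\mathrm{prec}} = M_{\mathrm{defl}} =: M$ SPD with $M - A \ge 0$) yields $\kappa_2(P_{\mathrm{prec}}^{-1} A) = K_{\widetilde{M}}$ and $\kappa_2(P_{\mathrm{defl}}^{-1} A) = K_M$. Hence:
\[
\tfrac{1}{2}\,\kappa_2(P_{\mathrm{defl}}^{-1} A) \le \kappa_2(P_{\mathrm{prec}}^{-1} A) \le \frac{1}{2 - \lambda_{\max}(M^{-1} A)}\,\kappa_2(P_{\mathrm{defl}}^{-1} A).
\]
Observing that $\lambda_{\max}(M^{-1} A) \le 1$ (as $M - A \ge 0$) yields (3.30), which then completes the proof.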
3.4 Intermezzo: regularity on the block diagonal of A
To further refine the abstract relations in the previous section for our SIPG
application, we need to derive an auxiliary result: the diagonal blocks of a SIPG
matrix A all ‘behave’ in a similar manner in the space orthogonal to the coarse
space. This section obtains this result in three steps: Section 3.4.1 discusses
an auxiliary property based on the regularity of the mesh. Section 3.4.2 uses
this property to establish the desired result in terms of ‘small bilinear forms’.
Section 3.4.3 can then show the main result of this section: regularity on the
block diagonal of A.
3.4.1 Using regularity of the mesh
The first step is a rather abstract consequence of the regularity of the mesh.
To state this result, recall the mapping Fi : Ei → E0 (cf. Section 3.2.2).
Because this mapping is invertible and affine by assumption (3.2), there exists
an invertible matrix Gi ∈ Rd×d and a vector gi ∈ Rd such that
Fi (x) = Gi x + gi ,
∀x ∈ Ei .
Next, let $|G_i^{-1}|$ denote the determinant of $G_i^{-1}$, and define
\[
Z_i := |G_i^{-1}|\, G_i^T G_i.
\]
Using regularity of the mesh, the following spectral properties of Zi can be
shown:
Lemma 3.4.1 Assuming (3.2) and (3.3), the eigenvalues of the matrices $Z_i$ above satisfy the following relation:
\[
1 \lesssim \lambda_{\min}(h^{2-d} Z_i) \le \lambda_{\max}(h^{2-d} Z_i) \lesssim 1, \qquad \forall i = 1, ..., N. \tag{3.34}
\]
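To show this result, we use the following relations [18, p. 120–122]⁷:
\[
|G_i^{-1}| = \frac{\operatorname{meas}(E_i)}{\operatorname{meas}(E_0)}, \qquad
\|G_i\|_2 \le \frac{h_0}{\rho_i}, \qquad
\|G_i^{-1}\|_2 \le \frac{h_i}{\rho_0}. \tag{3.35}
\]
We can now prove Lemma 3.4.1: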
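Proof (of Lemma 3.4.1) Because $Z_i := |G_i^{-1}|\, G_i^T G_i$, and $G_i$ is invertible, we have (cf. [67, p. 26]):
\begin{align*}
\lambda_{\max}(h^{2-d} Z_i) &= h^{2-d} |G_i^{-1}|\, \lambda_{\max}(G_i^T G_i) = h^{2-d} |G_i^{-1}|\, \|G_i\|_2^2, \\
\lambda_{\min}(h^{2-d} Z_i) &= h^{2-d} |G_i^{-1}|\, \frac{1}{\lambda_{\max}\bigl((G_i^T G_i)^{-1}\bigr)} = h^{2-d} |G_i^{-1}|\, \frac{1}{\|G_i^{-1}\|_2^2}.
\end{align*}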
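Applying the relations in (3.35), using that $\operatorname{meas}(E_0)$, $h_0$ and $\rho_0$ do not depend on h, and observing that $\rho_i^d \lesssim \operatorname{meas}(E_i) \lesssim h_i^d$ (for all i = 1, ..., N), we may write:
\begin{align*}
\lambda_{\max}(h^{2-d} Z_i) &\le h^{2-d}\, \frac{\operatorname{meas}(E_i)}{\operatorname{meas}(E_0)} \Bigl(\frac{h_0}{\rho_i}\Bigr)^2
 \lesssim \frac{\operatorname{meas}(E_i)}{h^d} \Bigl(\frac{h}{\rho_i}\Bigr)^2
 \lesssim \Bigl(\frac{h_i}{h}\Bigr)^d \Bigl(\frac{h}{\rho_i}\Bigr)^2, \\
\lambda_{\min}(h^{2-d} Z_i) &\ge h^{2-d}\, \frac{\operatorname{meas}(E_i)}{\operatorname{meas}(E_0)} \Bigl(\frac{\rho_0}{h_i}\Bigr)^2
 \gtrsim \frac{\operatorname{meas}(E_i)}{h^d} \Bigl(\frac{h}{h_i}\Bigr)^2
 \gtrsim \Bigl(\frac{\rho_i}{h}\Bigr)^d \Bigl(\frac{h}{h_i}\Bigr)^2.
\end{align*}
⁷ Here, meas(·) denotes the Lebesgue measure.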
Hence, the proof is completed if we can show that
\[
1 \le \frac{h}{h_i} \le \frac{h}{\rho_i} \lesssim 1, \qquad \forall i = 1, ..., N.
\]
The first two inequalities follow from the fact that ρi ≤ hi ≤ h. The last inequality follows
as a special case from (3.3). Hence, we obtain (3.34), which completes the proof.
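The scale invariance expressed by Lemma 3.4.1 can be illustrated numerically for d = 2, where the factor $h^{2-d}$ equals 1. The sketch below is an illustration under stated assumptions, not part of the original analysis: the reference element is taken to be the unit triangle with vertices (0,0), (1,0), (0,1), and the helper names `z_matrix` and `eigenvalues_sym2` are hypothetical. It builds $Z_i = |G_i^{-1}|\, G_i^T G_i$ for a triangle and for a copy of it shrunk by a factor 100, and shows that the eigenvalues coincide.

```python
import math

def z_matrix(v0, v1, v2):
    """Entries (z11, z12, z22) of Z = |det(G^{-1})| * G^T G for the affine map
    F(x) = G x + g that sends the triangle (v0, v1, v2) to the unit triangle."""
    # G^{-1} maps the reference triangle to the physical one: its columns are
    # the edge vectors v1 - v0 and v2 - v0.
    a, c = v1[0] - v0[0], v1[1] - v0[1]
    b, d = v2[0] - v0[0], v2[1] - v0[1]
    det_inv = a * d - b * c                      # det(G^{-1})
    g11, g12 = d / det_inv, -b / det_inv         # G = (G^{-1})^{-1}
    g21, g22 = -c / det_inv, a / det_inv
    s = abs(det_inv)
    z11 = s * (g11 * g11 + g21 * g21)
    z12 = s * (g11 * g12 + g21 * g22)
    z22 = s * (g12 * g12 + g22 * g22)
    return z11, z12, z22

def eigenvalues_sym2(z11, z12, z22):
    """Eigenvalues (smallest first) of the symmetric matrix [[z11, z12], [z12, z22]]."""
    mean = 0.5 * (z11 + z22)
    radius = math.hypot(0.5 * (z11 - z22), z12)
    return mean - radius, mean + radius

coarse = ((0.0, 0.0), (2.0, 0.0), (0.5, 1.0))
fine = tuple((0.01 * x, 0.01 * y) for x, y in coarse)   # same shape, 100x smaller

eig_coarse = eigenvalues_sym2(*z_matrix(*coarse))
eig_fine = eigenvalues_sym2(*z_matrix(*fine))
print(eig_coarse, eig_fine)
```

The eigenvalues depend only on the shape of the element, not on its size, which is the mechanism behind the h-independent bounds in (3.34).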
3.4.2 The desired result in terms of ‘small’ bilinear forms
The second step is to use the regularity result in the previous section to obtain
the desired result (regularity on the block diagonal of A) in terms of ‘small’
bilinear forms. To state this result, we require the following notation: let
V0 denote the space of all polynomials of degree p and lower defined on the
reference element E0 . Additionally, let Γi denote the set of all edges of Ei
that are either in the interior or at the Dirichlet boundary. Furthermore, let
Γ0 denote the set of all edges of the reference element E0 . Next, define the
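following bilinear forms⁸:
\begin{align*}
B_\Omega^{(i)}(v, w) &= \int_{E_i} K\, \nabla(v \circ F_i) \cdot \nabla(w \circ F_i), &
B_\Omega^{(0)}(v, w) &= \int_{E_0} \nabla v \cdot \nabla w, \\
B_\sigma^{(i)}(v, w) &= \sum_{e \in \Gamma_i} \int_e \frac{\sigma}{h_e}\, [v \circ F_i] \cdot [w \circ F_i], &
B_\sigma^{(0)}(v, w) &= \sum_{e \in \Gamma_0} \int_e [v] \cdot [w],
\end{align*}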
for all v, w ∈ V0 and i = 1, ..., N . Using this notation, we now have the following
result:
Lemma 3.4.2 Suppose that the diffusion coefficient K and the penalty parameter σ are bounded above and below by positive constants (independent
of h). Assume that (3.2) and (3.3) hold. Then, the bilinear forms above
satisfy the following relations:
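\begin{align*}
B_\Omega^{(0)}(w, w) \lesssim h^{2-d} B_\Omega^{(i)}(w, w) &\lesssim B_\Omega^{(0)}(w, w), && \forall w \in V_0,\ \forall i = 1, ..., N, \tag{3.36} \\
0 \le h^{2-d} B_\sigma^{(i)}(w, w) &\lesssim B_\sigma^{(0)}(w, w), && \forall w \in V_0,\ \forall i = 1, ..., N. \tag{3.37}
\end{align*}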
We discuss the proof of both relations individually hereafter.
⁸ Here, the trace operators are defined as before by extending the function to be zero outside $E_0$ and $E_i$.
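Proof (of (3.36)) Because the diffusion coefficient K is bounded below and above by positive constants (independent of h), we may write (all displayed relations below are for all $w \in V_0$ and for all i = 1, ..., N):
\[
\int_{E_i} \nabla(w \circ F_i) \cdot \nabla(w \circ F_i) \lesssim B_\Omega^{(i)}(w, w) \lesssim \int_{E_i} \nabla(w \circ F_i) \cdot \nabla(w \circ F_i). \tag{3.38}
\]
Next, we apply the chain rule, using that the Jacobian of $F_i$ is equal to $G_i$:
\[
\int_{E_i} \nabla(w \circ F_i) \cdot \nabla(w \circ F_i) = \int_{E_i} \bigl(G_i \nabla w \circ F_i\bigr) \cdot \bigl(G_i \nabla w \circ F_i\bigr).
\]
A change of variables (from $x \in E_i$ to $F_i(x) \in E_0$) introduces a factor $|G_i^{-1}|$:
\begin{align*}
\int_{E_i} \nabla(w \circ F_i) \cdot \nabla(w \circ F_i)
&= \int_{E_0} |G_i^{-1}|\, (G_i \nabla w) \cdot (G_i \nabla w) \\
&= \int_{E_0} (\nabla w)^T \underbrace{|G_i^{-1}|\, G_i^T G_i}_{= Z_i} \nabla w.
\end{align*}
Substitution of this expression into (3.38) and multiplication with $h^{2-d}$ yields:
\[
\int_{E_0} (\nabla w)^T (h^{2-d} Z_i)(\nabla w) \lesssim h^{2-d} B_\Omega^{(i)}(w, w) \lesssim \int_{E_0} (\nabla w)^T (h^{2-d} Z_i)(\nabla w).
\]
Application of Lemma 3.4.1 gives:
\[
\underbrace{\int_{E_0} \nabla w \cdot \nabla w}_{= B_\Omega^{(0)}(w, w)} \lesssim h^{2-d} B_\Omega^{(i)}(w, w) \lesssim \underbrace{\int_{E_0} \nabla w \cdot \nabla w}_{= B_\Omega^{(0)}(w, w)},
\]
which completes the proof of (3.36).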
Proof (of (3.37) for 1D problems) Because the penalty parameter σ is bounded below and above by positive constants (independent of h), and because $B_\sigma^{(i)}$ is SPSD, it follows that (all displayed relations below are for all $w \in V_0$ and for all i = 1, ..., N):
\[
0 \le h^{2-d} B_\sigma^{(i)}(w, w) \lesssim \sum_{e \in \Gamma_i} \int_e \frac{h^{2-d}}{h_e}\, [w \circ F_i] \cdot [w \circ F_i]. \tag{3.39}
\]
For one-dimensional problems, the integral over an edge e should be interpreted as the evaluation of the integrand. Furthermore, $h_e$ denotes the size of the largest mesh element adjacent to e, so regularity (3.3) implies that $h / h_e \lesssim 1$ for all e. Hence, we may write (for d = 1):
\[
0 \le h B_\sigma^{(i)}(w, w) \lesssim \sum_{e \in \Gamma_i} [w \circ F_i(e)] \cdot [w \circ F_i(e)].
\]
Finally, observe that, for all $e \in \Gamma_i$, the transformed edge value $F_i(e) =: e_0 \in \Gamma_0$. Different $e \in \Gamma_i$ yield different $e_0 \in \Gamma_0$, although not all $e_0 \in \Gamma_0$ are reached in the presence of Neumann boundary conditions. Nevertheless, we may write:
\[
0 \le h B_\sigma^{(i)}(w, w) \lesssim \underbrace{\sum_{e_0 \in \Gamma_0} [w(e_0)] \cdot [w(e_0)]}_{= B_\sigma^{(0)}(w, w)}.
\]
This completes the proof of (3.37) for one-dimensional problems.
Proof (of (3.37) for 2D problems) Similar to the one-dimensional case, we can obtain (3.39). For two-dimensional problems, the edges e are line segments and $h_e = \operatorname{meas}(e)$, i.e. the length of e. Hence, for all e there exists an invertible affine mapping $r_e : [0, 1] \to e$. By definition of the line integral over e, we may now rewrite (3.39) as (using d = 2):
\[
0 \le B_\sigma^{(i)}(w, w) \lesssim \sum_{e \in \Gamma_i} \frac{1}{h_e} \int_0^1 [w \circ F_i \circ r_e(t)] \cdot [w \circ F_i \circ r_e(t)]\, |r_e'|\, dt.
\]
Because $r_e(t)$ is affine, its derivative $r_e'$ is a constant and:
\[
|r_e'| = \int_0^1 |r_e'|\, dt = \int_e 1\, de = \operatorname{meas}(e) = h_e.
\]
Hence:
\[
0 \le B_\sigma^{(i)}(w, w) \lesssim \sum_{e \in \Gamma_i} \int_0^1 [w \circ F_i \circ r_e(t)] \cdot [w \circ F_i \circ r_e(t)]\, dt.
\]
Next, consider a single $e \in \Gamma_i$: note that $F_i \circ r_e([0, 1]) = F_i(e) =: e_0 \subset \partial E_0$, and define the invertible affine mapping $r_{e_0} := F_i \circ r_e$. As above, we have that $|r_{e_0}'| = \operatorname{meas}(e_0)$. By definition of the line integral over $e_0$, we may now write (using that $\operatorname{meas}(e_0)$ does not depend on h):
\[
\int_0^1 [w \circ F_i \circ r_e(t)] \cdot [w \circ F_i \circ r_e(t)]\, dt = \frac{1}{\operatorname{meas}(e_0)} \int_{e_0} [w] \cdot [w] \lesssim \int_{e_0} [w] \cdot [w].
\]
Next, we apply this strategy for all $e \in \Gamma_i$, which yield different (disjoint) $e_0 \subset \partial E_0$, although the entire boundary of $E_0$ is not reached in the presence of Neumann boundary conditions. Combining the results, we may write (after possible repartitioning of the edges $e_0$):
\[
0 \le B_\sigma^{(i)}(w, w) \lesssim \underbrace{\sum_{e_0 \in \Gamma_0} \int_{e_0} [w] \cdot [w]}_{= B_\sigma^{(0)}(w, w)}.
\]
This completes the proof of (3.37) for two-dimensional problems (d = 2).
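Proof (of (3.37) for 3D problems) The proof is similar to the two-dimensional case, except that we are now dealing with surface integrals rather than line integrals. Similar to the one-dimensional case, we have (3.39). For three-dimensional problems, the faces e are polygons and $h_e = \sqrt{\operatorname{meas}(e)}$, i.e. the square root of the surface area of e. Because all faces are mutually affine-equivalent (3.2), for all e, there exists an invertible affine mapping $r_e : D \to e$ for some polygon $D \subset \mathbb{R}^2$ (independent of h). By definition of the surface integral over e, we may rewrite (3.39) as (using d = 3):
\[
0 \le h^{-1} B_\sigma^{(i)}(w, w) \lesssim \sum_{e \in \Gamma_i} \frac{1}{h\, h_e} \int_D [w \circ F_i \circ r_e(u, v)] \cdot [w \circ F_i \circ r_e(u, v)]\, \Bigl\| \frac{\partial r_e}{\partial u} \times \frac{\partial r_e}{\partial v} \Bigr\|\, du\, dv.
\]
Because $r_e(u, v)$ is affine, it follows that its derivatives $\frac{\partial r_e}{\partial u}$ and $\frac{\partial r_e}{\partial v}$ are constant, and:
\begin{align*}
\Bigl\| \frac{\partial r_e}{\partial u} \times \frac{\partial r_e}{\partial v} \Bigr\|
&= \frac{1}{\operatorname{meas}(D)} \int_D \Bigl\| \frac{\partial r_e}{\partial u} \times \frac{\partial r_e}{\partial v} \Bigr\|\, du\, dv \\
&= \frac{1}{\operatorname{meas}(D)} \int_e 1\, de = \frac{\operatorname{meas}(e)}{\operatorname{meas}(D)} = \frac{h_e^2}{\operatorname{meas}(D)}.
\end{align*}
Hence:
\[
0 \le h^{-1} B_\sigma^{(i)}(w, w) \lesssim \sum_{e \in \Gamma_i} \frac{h_e}{h\, \operatorname{meas}(D)} \int_D [w \circ F_i \circ r_e(u, v)] \cdot [w \circ F_i \circ r_e(u, v)]\, du\, dv.
\]
Because e can be contained in a circle with diameter h, it follows that $\operatorname{meas}(e) \lesssim h^2$, hence $h_e / h \lesssim 1$:
\[
0 \le h^{-1} B_\sigma^{(i)}(w, w) \lesssim \sum_{e \in \Gamma_i} \frac{1}{\operatorname{meas}(D)} \int_D [w \circ F_i \circ r_e(u, v)] \cdot [w \circ F_i \circ r_e(u, v)]\, du\, dv.
\]
Next, consider a single $e \in \Gamma_i$: note that $F_i \circ r_e(D) = F_i(e) =: e_0 \subset \partial E_0$, and define the invertible affine mapping $r_{e_0} := F_i \circ r_e$. As above, we have that $\bigl\| \frac{\partial r_{e_0}}{\partial u} \times \frac{\partial r_{e_0}}{\partial v} \bigr\| = \frac{\operatorname{meas}(e_0)}{\operatorname{meas}(D)}$. By definition of the surface integral over $e_0$, we may now write (using that $\operatorname{meas}(e_0)$ does not depend on h):
\[
\frac{1}{\operatorname{meas}(D)} \int_D [w \circ F_i \circ r_e(u, v)] \cdot [w \circ F_i \circ r_e(u, v)]\, du\, dv = \frac{1}{\operatorname{meas}(e_0)} \int_{e_0} [w] \cdot [w] \lesssim \int_{e_0} [w] \cdot [w].
\]
Next, we apply this strategy for all $e \in \Gamma_i$, which yield different (disjoint) $e_0 \subset \partial E_0$, although the entire boundary of $E_0$ is not reached in the presence of Neumann boundary conditions. Combining the results, we may write (after possible repartitioning of the faces $e_0$):
\[
0 \le h^{-1} B_\sigma^{(i)}(w, w) \lesssim \underbrace{\sum_{e_0 \in \Gamma_0} \int_{e_0} [w] \cdot [w]}_{= B_\sigma^{(0)}(w, w)}.
\]
This completes the proof of (3.37) for three-dimensional problems (d = 3).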
3.4.3 Regularity on the block diagonal of A
As a final step, we now demonstrate that the diagonal blocks of a SIPG matrix
A all behave in a similar manner in the space orthogonal to the coarse space.
To state this result, we require the following notation: suppose that AΩ results
from the bilinear form h2−d BΩ in the same way that A results from the bilinear
form B: this is established by substituting AΩ for A and h2−d BΩ for B in (3.6)
and (3.7). Similarly, suppose that the matrices Aσ and Ar result from the
bilinear forms h2−d Bσ and h2−d Br respectively. Altogether, we may write
\[
A = h^{d-2} (A_\Omega + A_\sigma + A_r). \tag{3.40}
\]
Finally, let Dσ be the result of extracting the diagonal blocks of size m × m
from Aσ . Using this notation, we now have regularity on the block diagonal of
A in the following sense:
Theorem 3.4.3 Suppose that the diffusion coefficient K and the penalty
parameter σ are bounded above and below by positive constants (independent
of h). Assume that (3.2) and (3.3) hold. Then, the matrices AΩ and Dσ
above satisfy the following relations:
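\begin{align*}
v^T v &\lesssim v^T A_\Omega v, && \forall v \in \operatorname{Range}(I - \pi_I), \tag{3.41} \\
v^T A_\Omega v &\lesssim v^T v, && \forall v \in \mathbb{R}^{mN}, \tag{3.42} \\
0 \le v^T D_\sigma v &\lesssim v^T v, && \forall v \in \mathbb{R}^{mN}. \tag{3.43}
\end{align*}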
To show this result, the main idea is to observe that $A_\Omega$ is an N × N block diagonal matrix with blocks of size m × m, where the first row and column in every diagonal block is zero: this follows from the fact that $B_\Omega(\phi_k^{(i)}, \phi_\ell^{(j)}) = 0$ for $i \ne j$, and that the gradient of the piecewise constant basis function $\phi_1^{(j)}$ is (piecewise) zero. Altogether, $A_\Omega$ is of the following form:
\[
A_\Omega = \begin{bmatrix}
\begin{bmatrix} 0 & 0 \\ 0 & A_\Omega^{(1)} \end{bmatrix} & & & & \\
& \ddots & & & \\
& & \begin{bmatrix} 0 & 0 \\ 0 & A_\Omega^{(i)} \end{bmatrix} & & \\
& & & \ddots & \\
& & & & \begin{bmatrix} 0 & 0 \\ 0 & A_\Omega^{(N)} \end{bmatrix}
\end{bmatrix}. \tag{3.44}
\]
As a consequence, we can treat the diagonal blocks individually by applying
Lemma 3.4.2, and then combine the results (a similar strategy is used for the
block diagonal Dσ ).
To show (3.41), we also use the nature of πI = RT R, which is the projection
operator onto the coarse space Range(RT ) that yields the best approximation in
the 2-norm. As a result, the space Range(I − πI ) is orthogonal to Range(RT ),
where the latter corresponds to the piecewise constant basis functions. In
particular, any $v \in \operatorname{Range}(I - \pi_I) \subset \mathbb{R}^{Nm}$ is of the form:
\[
v = \begin{bmatrix} 0 \\ v_1 \\ \vdots \\ 0 \\ v_i \\ \vdots \\ 0 \\ v_N \end{bmatrix}. \tag{3.45}
\]
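The interplay between the block structure (3.44) and the vector structure (3.45) can be sketched in a few lines. The example assumes, for illustration only, that $\pi_I$ simply keeps the first (piecewise constant) coefficient of every block of size m, so that $I - \pi_I$ zeroes exactly those entries; the helper names and the 3 × 3 sample block are hypothetical. It shows that a matrix of the form (3.44) does not see the coarse component, i.e. $v^T A_\Omega v = ((I - \pi_I)v)^T A_\Omega (I - \pi_I)v$.

```python
def apply_i_minus_pi(v, m):
    """Apply I - pi_I, assuming pi_I keeps the first coefficient of each block of size m."""
    return [0.0 if k % m == 0 else x for k, x in enumerate(v)]

def quad_form_block_diag(blocks, v):
    """Evaluate v^T A v for a block diagonal matrix A given as a list of m x m blocks."""
    m = len(blocks[0])
    total = 0.0
    for i, blk in enumerate(blocks):
        vi = v[i * m:(i + 1) * m]
        total += sum(vi[r] * blk[r][c] * vi[c] for r in range(m) for c in range(m))
    return total

# Sample diagonal block with zero first row and column, as in (3.44) (m = 3).
block = [[0.0, 0.0, 0.0],
         [0.0, 2.0, -1.0],
         [0.0, -1.0, 2.0]]
blocks = [block, block]                     # N = 2 mesh elements

v = [5.0, 1.0, 2.0, -3.0, 0.5, 4.0]
w = apply_i_minus_pi(v, 3)                  # has the form (3.45)

print(w)
print(quad_form_block_diag(blocks, v), quad_form_block_diag(blocks, w))
```

This is the property used later whenever a step is justified by "(3.44), (3.45)".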
Using these ideas, we can now show Theorem 3.4.3:
Proof (of Theorem 3.4.3) Let $A_\Omega^{(i)}$ denote the result of deleting the first row and column in the ith diagonal block in $A_\Omega$, as indicated in (3.44). In other words:
\[
\bigl(A_\Omega^{(i)}\bigr)_{\ell-1, k-1} = h^{2-d} B_\Omega^{(i)}(\phi_k^{(0)}, \phi_\ell^{(0)}),
\]
for all k, ℓ = 2, ..., m. Next, observe that $B_\Omega^{(0)}$ is independent of h and symmetric. Furthermore, for all higher-order polynomials $v \in \operatorname{span}\{\phi_2^{(0)}, ..., \phi_m^{(0)}\} \setminus \{0\}$, the gradient of v is nonzero, which implies that $B_\Omega^{(0)}(v, v) > 0$. In other words, $B_\Omega^{(0)}$ is even positive-definite for the subspace under consideration. As a consequence, applying Lemma 3.4.2, we obtain a result similar to (3.41), but then for the individual diagonal blocks:
\[
w^T w \lesssim w^T A_\Omega^{(i)} w, \qquad \forall w \in \mathbb{R}^{m-1}, \quad \forall i = 1, ..., N.
\]
Using the notation in (3.45), these relations hold in particular for $w = v_i$, for all i = 1, ..., N. Summing over all i then yields:
\[
\sum_{i=1}^N v_i^T v_i \lesssim \sum_{i=1}^N v_i^T A_\Omega^{(i)} v_i, \qquad \forall v \in \operatorname{Range}(I - \pi_I).
\]
Using the notation in (3.45), this can be rewritten as (3.41), which then completes its proof. The relations (3.42) and (3.43) follow in a similar manner from Lemma 3.4.2 (without deleting the first row and column in each diagonal block).
3.5 Main result: scalability for SIPG systems
Combining the results in the previous section, we can now show the main result
of this chapter: both two-level methods yield scalable convergence (independent of the mesh element diameter) of the preconditioned CG method for SIPG
systems. This result has been shown by Dobrev et al. [24] for the preconditioning variant for p = 1. In this section, we extend these results for p ≥ 1 and for
the deflation variant. Section 3.5.1 obtains the aforementioned scalability for
a general class of smoothers. Section 3.5.2 shows that the required smoother
criteria are valid for block Jacobi smoothing. Section 3.5.3 studies the influence
of damping and the penalty parameter on the upper bound of the condition
numbers for block Jacobi smoothing.
3.5.1 Main result: scalability for SIPG systems
To state the main scalability result of this chapter, let A be the discretization
matrix resulting from an SIPG scheme with p ≥ 1, as defined in Section 3.2.1.
Furthermore, let $P_{\mathrm{prec}}^{-1}$ and $P_{\mathrm{defl}}^{-1}$ denote the two-level preconditioner and BNN deflation variant respectively, as specified in Section 3.2.3. The main result can now be stated as follows:
Theorem 3.5.1 (Main result) Suppose that the diffusion coefficient K and the penalty parameter σ are bounded above and below by positive constants (independent of h). Assume that (3.2), (3.3), and (3.5) hold. Furthermore, assume that the smoother conditions (3.11), (3.12), (3.13), and (3.14) are satisfied. Then, both two-level methods yield scalable CG convergence in the sense that the condition number $\kappa_2$ (in the 2-norm) of the preconditioned system can be bounded independently of the maximum mesh element diameter h:
\[
\kappa_2(P_{\mathrm{prec}}^{-1} A) \lesssim 1, \qquad \kappa_2(P_{\mathrm{defl}}^{-1} A) \lesssim 1. \tag{3.46}
\]
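To show Theorem 3.5.1, the main idea is to consider Lemma 3.3.2:
\[
\kappa_2(P_{\mathrm{prec}}^{-1} A) = K_{\widetilde{M}_{\mathrm{prec}}}, \qquad \kappa_2(P_{\mathrm{defl}}^{-1} A) < 2 K_{M_{\mathrm{defl}}}. \tag{3.47}
\]
The proof is then completed by showing that $K_{\widetilde{M}_{\mathrm{prec}}}, K_{M_{\mathrm{defl}}} \lesssim 1$, for any smoother that satisfies the criteria above. This is established using the auxiliary result Theorem 3.4.3, and coercivity (3.5) in matrix form:
\[
h^{d-2}\, v^T (A_\Omega + A_\sigma)\, v \lesssim v^T A v, \qquad \forall v \in \mathbb{R}^{Nm}. \tag{3.48}
\]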
Altogether, Theorem 3.5.1 can now be shown as follows:
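Proof (of Theorem 3.5.1) First, we will show that $K_{\widetilde{M}_{\mathrm{prec}}} \lesssim 1$ (a similar strategy yields $K_{M_{\mathrm{defl}}} \lesssim 1$). For ease of notation, we will write $\widetilde{M}$ for $\widetilde{M}_{\mathrm{prec}}$. The main idea is to show that $\|(I - \pi_{\widetilde{M}}) v\|_{\widetilde{M}} \lesssim \|v\|_A$ for all v: because $\pi_{\widetilde{M}}$ is a projection onto the coarse space $\operatorname{Range}(R^T)$ that yields the best approximation in the $\widetilde{M}$-norm, we can replace $\pi_{\widetilde{M}}$ by the suboptimal projection $\pi_I$, and then combine the properties established so far:
\begin{align*}
\|(I - \pi_{\widetilde{M}}) v\|^2_{\widetilde{M}}
&\le \|(I - \pi_I) v\|^2_{\widetilde{M}} \\
&\lesssim h^{d-2}\, \|(I - \pi_I) v\|_2^2 && \text{(by (3.12))} \\
&\lesssim h^{d-2}\, \|(I - \pi_I) v\|^2_{A_\Omega} && \text{(by (3.41))} \\
&= h^{d-2}\, \|v\|^2_{A_\Omega} && \text{(by (3.44), (3.45))} \\
&\lesssim h^{d-2}\, \|v\|^2_{A_\Omega + A_\sigma} && \text{(as } B_\sigma \text{ SPSD} \Rightarrow A_\sigma \text{ SPSD)} \\
&\lesssim \|v\|^2_A && \text{(by (3.48))}, \qquad \forall v \in \mathbb{R}^{Nm}.
\end{align*}
Substitution of this relation into the definition of $K_{\widetilde{M}}$ yields:
\[
K_{\widetilde{M}} := \sup_{v \ne 0} \frac{\|(I - \pi_{\widetilde{M}}) v\|^2_{\widetilde{M}}}{\|v\|^2_A} \lesssim 1.
\]
A similar strategy, using (3.14) instead of (3.12), yields $K_{M_{\mathrm{defl}}} \lesssim 1$. Substitution of $K_{\widetilde{M}_{\mathrm{prec}}}, K_{M_{\mathrm{defl}}} \lesssim 1$ into (3.47) now yields (3.46), which completes the proof of Theorem 3.5.1.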
3.5.2 Special case: block Jacobi smoothing
This section demonstrates that Theorem 3.5.1 is valid for (damped) block Jacobi smoothing. To specify this result, suppose that MBJ is the block Jacobi
smoother with blocks of size m × m. Next, consider the specific choice
Mprec = Mdefl = ω −1 MBJ
with damping parameter ω > 0 (independent of h). We assume that ω ≤ 1,
with ω < 1 strictly for the preconditioning variant.
Additionally, we assume that there exists a permutation matrix P such that
A can be permuted as:
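\[
P A P^T = \Delta - L - L^T, \tag{3.49}
\]
with
\[
\Delta = \begin{bmatrix} \Delta_1 & & & \\ & \Delta_2 & & \\ & & \ddots & \\ & & & \Delta_q \end{bmatrix}, \qquad
-L = \begin{bmatrix} 0 & & & \\ L_1 & 0 & & \\ & \ddots & \ddots & \\ & & L_{q-1} & 0 \end{bmatrix},
\]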
for some block-diagonal matrices ∆1 , ..., ∆q with blocks of size m × m, matrices
L1 , ..., Lq−1 , and integer q ≤ N . Note that this assumption implies that the
matrix A has property Aπ in the sense of [5, Definition 6.7]. Moreover, we
remark that (3.49) is satisfied if the mesh can be colored by two colors9 (in
that case, we can choose q = 2, and ∆1 and ∆2 each correspond to one of the
two colors). In particular, structured rectangular meshes can be colored by two
colors and thus satisfy (3.49).
Altogether, assuming10 (3.49), we can now show that all smoother requirements for Theorem 3.5.1 are satisfied for (damped) block Jacobi smoothing:
Corollary 3.5.2 If (3.49) is satisfied, then Theorem 3.5.1 applies for the damped block Jacobi smoothers $M_{\mathrm{prec}}$ and $M_{\mathrm{defl}}$ above, i.e. both two-level methods yield scalable CG convergence in the sense that the condition number $\kappa_2$ (in the 2-norm) of the preconditioned system can be bounded independently of the maximum mesh element diameter h:
\[
\kappa_2(P_{\mathrm{prec}}^{-1} A) \lesssim 1, \qquad \kappa_2(P_{\mathrm{defl}}^{-1} A) \lesssim 1.
\]
9 That is, the mesh can be represented by a graph whose vertices can be colored such that connected vertices do not have the same color.
10 Alternatively, we could assume that the damping parameter ω is sufficiently small. This option is not considered further in this thesis.

This result follows immediately from Theorem 3.5.1 once we have verified that the conditions (3.11), (3.13), (3.12), and (3.14) are satisfied for the damped block Jacobi smoothers under consideration. In other words, writing $M := \omega^{-1} M_{\mathrm{BJ}}$, we need to show:
\begin{align*}
2M - A &> 0, \tag{3.50} \\
h^{2-d}\, v^T M v &\lesssim v^T v, && \forall v \in \operatorname{Range}(I - \pi_I), \tag{3.51} \\
h^{2-d}\, v^T \widetilde{M} v &\lesssim v^T v, && \forall v \in \operatorname{Range}(I - \pi_I), \tag{3.52}
\end{align*}
for all ω ≤ 1, with ω < 1 strictly for (3.52). We treat each relation separately.
To show (3.50), we use that (ρ denotes the spectral radius):
\[
B := \Delta^{-1} (L + L^T), \qquad \rho(B) < 1, \tag{3.53}
\]
which follows from [5, Theorem 6.38] using (3.49) and the fact that A and M are SPD.
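Proof (of (3.50)) Without loss of generality, assume that ω = 1. Next, observe that $P M P^T = \Delta$. Hence,
\begin{align*}
P (2M - A) P^T
&= 2\Delta - P A P^T && \text{(as } P M P^T = \Delta\text{)} \\
&= 2\Delta - \Delta + L + L^T && \text{(as } P A P^T = \Delta - L - L^T\text{)} \\
&= \Delta \bigl( I + \underbrace{\Delta^{-1} (L + L^T)}_{=: B} \bigr) \\
&= \Delta (I + B).
\end{align*}
Hence,
\[
\lambda_{\min}(2M - A) = \lambda_{\min}(I + \Delta^{1/2} B \Delta^{-1/2}).
\]
And because (3.53) implies that $\rho(B) = \rho(\Delta^{1/2} B \Delta^{-1/2}) < 1$, it follows that 2M − A > 0. This completes the proof.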
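The role of the two-coloring assumption (3.49) in (3.50) can be illustrated with two small stand-in matrices. The sketch below rests on assumptions made purely for illustration: neither matrix is an actual SIPG matrix, and the helpers `is_spd` and `two_m_minus_a` are hypothetical. The first matrix is a 1D Laplacian-like chain with 2 × 2 blocks, whose element graph can be colored by two colors; the second is a dense 3 × 3 SPD matrix whose three unknowns are all mutually coupled, so that its graph cannot be two-colored.

```python
def is_spd(S, tol=1e-12):
    """Test positive definiteness of a symmetric matrix via a Cholesky factorization."""
    n = len(S)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = S[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= tol:
                    return False          # nonpositive pivot: not SPD
                L[i][i] = s ** 0.5
            else:
                L[i][j] = s / L[j][j]
    return True

def two_m_minus_a(A, m, omega=1.0):
    """Return 2M - A for the damped block Jacobi smoother M = (1/omega) * blockdiag(A)."""
    n = len(A)
    return [[(2.0 / omega if i // m == j // m else 0.0) * A[i][j] - A[i][j]
             for j in range(n)] for i in range(n)]

# Chain of three 2x2 blocks (two-colorable): tridiag(-1, 2, -1).
chain = [[2.0 if i == j else (-1.0 if abs(i - j) == 1 else 0.0)
          for j in range(6)] for i in range(6)]

# Fully coupled 3x3 SPD matrix with 1x1 blocks (not two-colorable).
dense = [[1.0, 0.8, 0.8],
         [0.8, 1.0, 0.8],
         [0.8, 0.8, 1.0]]

chain_ok = is_spd(two_m_minus_a(chain, 2))
dense_ok = is_spd(two_m_minus_a(dense, 1))
dense_damped_ok = is_spd(two_m_minus_a(dense, 1, omega=0.5))
print(chain_ok, dense_ok, dense_damped_ok)
```

For the chain, 2M − A is positive definite without damping. For the fully coupled example it is not, although A itself is SPD; sufficiently strong damping (here ω = 0.5) restores positivity, in line with the alternative mentioned in the footnote on small damping parameters.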
To show (3.51), the main idea is to use Theorem 3.4.3 and the following property (cf. [24, p. 760] and [43, p. 4]):
\[
0 < B(v, v) \lesssim B_\Omega(v, v) + B_\sigma(v, v), \qquad \forall v \in \mathbb{R}^{Nm}. \tag{3.54}
\]
Proof (of (3.51)) Without loss of generality, assume that ω = 1. Next, recall the notation introduced in the beginning of Section 3.4.3. Additionally, similar to $D_\sigma$, let $D_r$ be the result of extracting the diagonal blocks of size m × m from $A_r$. Using this notation, and the fact that $A_\Omega$ is a block diagonal matrix with blocks of size m × m, we may write:
\[
h^{2-d} M = A_\Omega + D_\sigma + D_r.
\]
Next, we write (3.54) in matrix form:
\[
0 < v^T A v \lesssim v^T \bigl( h^{d-2} A_\Omega + h^{d-2} A_\sigma \bigr) v, \qquad \forall v \in \mathbb{R}^{Nm}. \tag{3.55}
\]
Because this relation is also true when considering the diagonal blocks only, we may write:
\[
h^{2-d}\, v^T M v \lesssim v^T (A_\Omega + D_\sigma) v, \qquad \forall v \in \mathbb{R}^{Nm}.
\]
Application of Theorem 3.4.3 now yields (3.51), which completes the proof.
To show (3.52), we combine the previous results (3.50) and (3.51):
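Proof (of (3.52)) Using (3.50), and the fact that ω < 1 strictly, it can be shown that
\[
\widetilde{M} \le \frac{1}{2(1 - \omega)}\, M, \tag{3.56}
\]
which can be seen as follows:
\begin{align*}
2\omega M - A &\ge 0, \\
2M - A = (2 - 2\omega) M + \underbrace{2\omega M - A}_{\ge 0} &\ge (2 - 2\omega) M, \\
(2M - A)^{-1} &\le \frac{1}{2(1 - \omega)}\, M^{-1} && \text{(cf. [40, p. 398, 471])}, \\
\underbrace{M (2M - A)^{-1} M}_{=: \widetilde{M}} &\le \frac{1}{2(1 - \omega)}\, M.
\end{align*}
Combining this relation with (3.51), it now follows that
\[
h^{2-d}\, v^T \widetilde{M} v \overset{(3.56)}{\le} \frac{1}{2(1 - \omega)}\, h^{2-d}\, v^T M v \overset{(3.51)}{\lesssim} v^T v, \qquad \forall v \in \operatorname{Range}(I - \pi_I).
\]
This completes the proof of (3.52).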
3.5.3 Influence of damping and the penalty parameter for block Jacobi smoothing
In Section 2.5.2, we studied the influence of damping and the penalty parameter on the CG convergence. We found that both two-level methods perform
significantly better if the penalty parameter is chosen dependent on local values of the diffusion coefficient. Furthermore, a damping parameter around
ω = 0.7 was observed to be optimal for the preconditioning variant during the
numerical experiments. To gain more insight in these results, in this section,
we study the influence of damping and the penalty parameter on the constants
in Corollary 3.5.2, where the same damped block Jacobi smoother is used for
both two-level methods.
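Regarding damping, it can be shown (the proof is given at the end of this section):
\[
\kappa_2(P_{\mathrm{defl}}^{-1} A) \le \frac{2}{\omega}\, K_{M_{\mathrm{BJ}}}, \qquad
\kappa_2(P_{\mathrm{prec}}^{-1} A) < \frac{1}{2\omega(1 - \omega)}\, K_{M_{\mathrm{BJ}}}. \tag{3.57}
\]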
Although these upper bounds may not be optimal, the fact that the upper
bound for the preconditioner blows up as ω tends to 1 is in line with our earlier
numerical observation in Section 2.5 that the preconditioning variant performs
better for ω safely away from 1.
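To get a feel for the ω-dependence in (3.57), the two prefactors of $K_{M_{\mathrm{BJ}}}$ can be tabulated directly. The function names below are hypothetical, and the numbers describe only the upper bounds, not the actual condition numbers.

```python
def deflation_factor(omega):
    """Prefactor 2 / omega of K_{M_BJ} in the deflation bound of (3.57)."""
    return 2.0 / omega

def preconditioner_factor(omega):
    """Prefactor 1 / (2 omega (1 - omega)) of K_{M_BJ} in the preconditioning bound of (3.57)."""
    return 1.0 / (2.0 * omega * (1.0 - omega))

for omega in (0.3, 0.5, 0.7, 0.9, 0.99):
    print(f"omega = {omega:4.2f}:  deflation {deflation_factor(omega):7.2f},"
          f"  preconditioning {preconditioner_factor(omega):7.2f}")
```

The preconditioning prefactor is smallest at ω = 0.5 and grows without bound as ω approaches 1, whereas the deflation prefactor decreases monotonically in ω. Since the bounds need not be sharp, the minimizer of the prefactor does not have to coincide with the damping value that performed best in the experiments.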
To study the influence of the penalty parameter, let $\sigma_{\max}^{(i)}$ denote the largest value that the penalty parameter σ attains at the edges of mesh element $E_i$, and let $K_{\min}^{(i)}$ denote the smallest value that the diffusion coefficient K attains within $E_i$ (for all i = 1, ..., N). We can now bound $K_{M_{\mathrm{BJ}}}$ in terms of the local ratio between the penalty parameter and the diffusion coefficient, assuming¹¹ $A_\sigma + A_r \ge 0$ (the proof is given at the end of this section):
\[
K_{M_{\mathrm{BJ}}} \le C_1 \max_{i = 1, ..., N} \frac{\sigma_{\max}^{(i)}}{K_{\min}^{(i)}} + C_2, \tag{3.58}
\]
for some positive constants $C_1$ and $C_2$ that are independent of the mesh element diameter h and the penalty parameter σ (but possibly dependent on the diffusion coefficient K). The result of substituting (3.58) into (3.57) is in line with the observation in Section 2.5 that the penalty parameter can best be chosen dependent on local values of the diffusion coefficient.
We end this section with the proofs of (3.57) and (3.58):
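Proof (of (3.57)) The first inequality follows from (3.47), (3.33) and the fact that $M_{\mathrm{defl}} = \omega^{-1} M_{\mathrm{BJ}}$. To show the second inequality, we use that (3.50) implies that $\lambda_{\max}(M_{\mathrm{prec}}^{-1} A) < 2\omega$:
\begin{align*}
\kappa_2(P_{\mathrm{prec}}^{-1} A)
&= K_{\widetilde{M}_{\mathrm{prec}}} && \text{(by (3.47))} \\
&\le \frac{1}{2 - \lambda_{\max}(M_{\mathrm{prec}}^{-1} A)}\, K_{M_{\mathrm{prec}}} && \text{(by (3.33), (3.32))} \\
&< \frac{1}{2(1 - \omega)}\, K_{M_{\mathrm{prec}}} && \text{(as } \lambda_{\max}(M_{\mathrm{prec}}^{-1} A) < 2\omega\text{)} \\
&\le \frac{1}{2\omega(1 - \omega)}\, K_{M_{\mathrm{BJ}}} && \text{(by (3.33), as } M_{\mathrm{prec}} = \omega^{-1} M_{\mathrm{BJ}}\text{)}.
\end{align*}
This completes the proof of (3.57).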
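Proof (of (3.58)) By definition (3.15),
\[
K_{M_{\mathrm{BJ}}} = \sup_{v \ne 0} \frac{\|(I - \pi_{M_{\mathrm{BJ}}}) v\|^2_{M_{\mathrm{BJ}}}}{\|v\|^2_A}.
\]
As in the proof of Theorem 3.5.1, we may replace $\pi_{M_{\mathrm{BJ}}}$ by the suboptimal projection $\pi_I$:
\[
K_{M_{\mathrm{BJ}}} \le \sup_{v \ne 0} \frac{\|(I - \pi_I) v\|^2_{M_{\mathrm{BJ}}}}{\|v\|^2_A}.
\]
Using the notation in (3.40) and (3.55), we can rewrite this as (note that the factor $h^{d-2}$ cancels):
\[
K_{M_{\mathrm{BJ}}} \le \sup_{v \ne 0} \frac{\|(I - \pi_I) v\|^2_{A_\Omega + D_\sigma + D_r}}{\|v\|^2_{A_\Omega + A_\sigma + A_r}}.
\]
Next, we use the assumption $A_\sigma + A_r \ge 0$:
\begin{align*}
K_{M_{\mathrm{BJ}}}
&\le \sup_{v \ne 0} \frac{\|(I - \pi_I) v\|^2_{A_\Omega + D_\sigma + D_r}}{\|v\|^2_{A_\Omega}} && \text{(as } A_\sigma + A_r \ge 0\text{)} \\
&= \sup_{v \ne 0} \frac{\|(I - \pi_I) v\|^2_{A_\Omega + D_\sigma + D_r}}{\|(I - \pi_I) v\|^2_{A_\Omega}} && \text{(by (3.44), (3.45))} \\
&= 1 + \sup_{v \in \operatorname{Range}(I - \pi_I)} \frac{\|v\|^2_{D_\sigma + D_r}}{\|v\|^2_{A_\Omega}}.
\end{align*}
Next, consider the notation for $A_\Omega^{(i)}$ in (3.44) and, similarly, let $D_\sigma^{(i)}$ and $D_r^{(i)}$ denote the result of removing the first row and column from diagonal block i in $D_\sigma$ and $D_r$ respectively. Then, we may write, using the notation for the components of v in (3.45):
\[
K_{M_{\mathrm{BJ}}} \le 1 + \sup_{v \in \operatorname{Range}(I - \pi_I)} \frac{\sum_{i=1}^N v_i^T (D_\sigma^{(i)} + D_r^{(i)}) v_i}{\sum_{i=1}^N v_i^T A_\Omega^{(i)} v_i}.
\]
At the same time, it can be shown (similar to Section 3.4) that there exist positive constants $C_\Omega, C_\sigma, C_r$ independent of h and σ, with $C_\Omega, C_\sigma$ also independent of K, such that, for all $w \in \mathbb{R}^{m-1}$:
\[
w^T A_\Omega^{(i)} w \ge C_\Omega K_{\min}^{(i)}\, w^T w, \qquad
w^T D_\sigma^{(i)} w \le C_\sigma \sigma_{\max}^{(i)}\, w^T w, \qquad
w^T D_r^{(i)} w \le C_r\, w^T w.
\]
Combining these relations gives:
\[
v_i^T (D_\sigma^{(i)} + D_r^{(i)}) v_i \le \frac{C_\sigma \sigma_{\max}^{(i)} + C_r}{C_\Omega K_{\min}^{(i)}}\, v_i^T A_\Omega^{(i)} v_i.
\]
Using the latter relation, we may now write:
\begin{align*}
K_{M_{\mathrm{BJ}}}
&\le 1 + \sup_{v \in \operatorname{Range}(I - \pi_I)} \frac{\sum_{i=1}^N \frac{C_\sigma \sigma_{\max}^{(i)} + C_r}{C_\Omega K_{\min}^{(i)}}\, v_i^T A_\Omega^{(i)} v_i}{\sum_{i=1}^N v_i^T A_\Omega^{(i)} v_i} \\
&\le 1 + \max_{i = 1, ..., N} \left\{ \frac{C_\sigma \sigma_{\max}^{(i)}}{C_\Omega K_{\min}^{(i)}} \right\} + \max_{i = 1, ..., N} \left\{ \frac{C_r}{C_\Omega K_{\min}^{(i)}} \right\}.
\end{align*}
This can be rewritten as (3.58), which then completes the proof.
¹¹ This condition seems closely related to coercivity (3.5). How either can be guaranteed in practice (for problems with strong contrasts in the coefficients) is left for future research.
3.6 Conclusion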
This chapter is focused on the theoretical analysis of the two-level preconditioner and deflation variant studied numerically in Chapter 2 for linear systems
resulting from SIPG discretizations for diffusion problems. For both two-level
methods, we have found that the condition number of the preconditioned system can be bounded independently of the mesh element diameter. This result
is valid for any polynomial degree p ≥ 1, which extends the available analysis
for the preconditioning variant for p = 1 in [24]. We have verified that the
restrictions on the smoother are satisfied for block Jacobi smoothing. Altogether, our theory explains the scalable CG convergence observed during the
numerical experiments in Chapter 2, and guarantees similar results for a large
class of other diffusion problems on a variety of meshes.
4 Hidden DG accuracy
This chapter is based on:
P. van Slingerland, J.K. Ryan, C. Vuik, Position-dependent smoothness-increasing accuracy-conserving (SIAC) filtering for improving Discontinuous Galerkin
solutions, SIAM J. Sci. Comp., 33(2011), pp 802–825.
4.1 Introduction
DG approximations can contain ‘hidden accuracy’: although the convergence
rate of a DG scheme is typically of order p + 1 (where p is the polynomial
degree), it can be improved to order 2p + 1 by applying a post-processor (cf.
Section 1.4). Interestingly, this post-processor does not contain any information
of the underlying physics or numerics, and needs to be applied only once, at
the final time.
The main idea behind the post-processor is to compute a convolution of the
DG approximation against a linear combination of 2p + 1 B-splines. A positive
side effect of this strategy is that the smoothness of the B-splines is carried
over to the DG approximation, which then becomes p − 1 times continuously
differentiable. This enhanced smoothness can benefit the visualization of the
approximation, e.g. in the form of streamlines.
The aforementioned filter, also known as the symmetric post-processor, was
introduced by Bramble and Schatz [11] in the context of Ritz-Galerkin approximations for elliptic problems. Cockburn, Luskin, Shu, and Süli [22] demonstrated that the same technique can also be used to enhance the convergence
rate of DG approximations from order p + 1 to order 2p + 1. This was shown
for linear periodic hyperbolic problems with a sufficiently smooth exact solution. An overview of the development of post-processing techniques can also
be found in [22].
To make the post-processor applicable near non-periodic boundaries and
shocks, Ryan and Shu [64] introduced the one-sided post-processor. Inspired
by the ideas in [16, 35, 51], they shifted the support of the local averaging
operator to one side of the evaluation point. Using the resulting one-sided
post-processor near boundaries and shocks and the original symmetric post-processor in the interior, it was now possible to post-process the entire domain,
even for non-smooth solutions or non-periodic boundary conditions.
To improve the accuracy and smoothness of this one-sided strategy, in this
chapter, we propose a position-dependent post-processor. In particular, we investigate the impact of using extra B-splines near the boundary. This way, we
seek to reduce the possibility that the post-processor worsens the errors near
the boundary (despite superconvergence). Furthermore, we study the effect of
smoother transitions in the position of the B-splines. With this strategy, we
aim to eliminate the necessity of (re)introducing artificial discontinuities. To
compare the performance of the resulting position-dependent post-processor
and the original one-sided filter, we discuss seven numerical experiments, including a problem with stationary shocks, a two-dimensional system, and a
streamline visualization example (theoretical error estimates are discussed in
Chapter 5).
The outline of this chapter is as follows. Section 4.2 specifies the DG
schemes under consideration. Section 4.3 provides the basics of the original
symmetric and one-sided post-processor. Section 4.4 introduces the position-dependent post-processor. Section 4.5 discusses the numerical experiments.
Section 4.6 summarizes the main conclusions.
4.2 Discretization
This section summarizes the DG method for linear hyperbolic problems. Section 4.2.1 considers the one-dimensional case, following [21]. Section 4.2.2 discusses two-dimensional systems, similar to [22].
4.2.1 DG for one-dimensional hyperbolic problems
Consider the following problem on the interval [a, b]:
\[
u_t + c\, u_x = f,
\]
with initial condition u0 and either periodic or Dirichlet boundary conditions
at x = a. The functions c and f may depend on space and time, but not on u.
We assume that the velocity c is positive.
To construct a DG approximation for this problem, consider a uniform mesh with elements $E_i = [x_{i-1/2}, x_{i+1/2}]$ of length h > 0, where $x_{1/2} = a$ and $x_{i+1/2} = x_{i-1/2} + h$ (for all i = 1, ..., N). Next, define the test space V that contains each function that is a polynomial of degree p or lower within each mesh element, and that may be discontinuous at the mesh element boundaries. For all $v \in V$, we let $v^{(i)}$ denote the (continuous) restriction of v to $E_i$.
At the initial time t = 0, the DG approximation uh is the L2 -projection of
u0 onto V . For t > 0, it is the function in V such that:
Z b
Z b
f v,
∀v ∈ V,
(uh )t v + B(uh , v) =
a
a
(0)
where B is the following bilinear form (defining uh |x 1 in terms of the boundary
2
condition for u at x = a):
B(uh , v) = −
N Z
X
i=1
cuh vx +
Ei
N X
i=1
(i)
v )|xi− 1 .
(i−1) (i)
(cuh v (i) )|xi+ 1 − (cuh
2
2
This form uses an upwind flux approximation for uh at the element boundaries.
For more details, cf. [21].
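As a minimal worked illustration (our own sketch, assuming $p = 0$ and a constant velocity $c$; the symbol $U_i$ for the constant value of $u_h$ on $E_i$ is introduced here only for this example), take $v$ equal to 1 on $E_i$ and 0 elsewhere. The volume term vanishes because $v_x = 0$, and the scheme above reduces to
\[ h \frac{dU_i}{dt} + c\,(U_i - U_{i-1}) = \int_{E_i} f, \qquad i = 1, ..., N, \]
where $U_0$ is given by the boundary condition at $x = a$. This is the classical first-order upwind finite volume scheme, which shows how the upwind flux draws its information from the left neighbor when $c > 0$.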
4.2.2 DG for two-dimensional hyperbolic systems
Similar to the one-dimensional case, we can construct a DG approximation for
two-dimensional hyperbolic systems. To specify this, consider the following
problem on a square domain Ω:
\[ u_t + A_1 u_{x_1} + A_2 u_{x_2} = f, \tag{4.1} \]
with initial condition $u_0$ and periodic boundary conditions. Here, $u$ and $f$ are vector-valued functions with $m$ entries and the coefficients $A_j$ are constant matrices of size $m \times m$. We assume that the system above is strongly hyperbolic in the sense that, for all $n \in \mathbb{R}^2$, there exist a diagonal matrix $\Lambda$ and a nonsingular matrix $R$ such that
\[ R^{-1}(A_1 n_1 + A_2 n_2) R = \Lambda = \begin{bmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_m \end{bmatrix}. \tag{4.2} \]
To construct a DG approximation for the system (4.1), we use a uniform Cartesian mesh for the spatial domain $\Omega$ with compact mesh elements $E_1, ..., E_N$ of size $h \times h$. Next, define the test space $V$ that contains each vector-valued function with $m$ entries that are polynomials of degree $p$ or lower within each mesh element, and that may be discontinuous at the mesh element boundaries. For all $v \in V$, we let $v^{(i)}$ denote the (continuous) restriction of $v$ to $E_i$.
At the initial time $t = 0$, the DG approximation $u_h$ is the $L^2$-projection of $u_0$ onto $V$. For $t > 0$, it is the function in $V$ such that ($\langle \cdot, \cdot \rangle$ denotes the standard inner product in $\ell^2$):
\[ \int_\Omega \langle (u_h)_t, v \rangle + B(u_h, v) = \int_\Omega \langle f, v \rangle, \qquad \forall v \in V, \]
where $B$ is a bilinear form specified hereafter.
To define $B$, let $\hat{u}_h$ denote the following upwind flux approximation for $u_h$ on the mesh element boundaries [22, p. 587]: consider an edge $e$ of mesh element $E_i$ with outward normal $n^{(i)} = (n_1^{(i)}, n_2^{(i)})$ and neighboring element $E_j$. Using the notation in (4.2) (substituting $n = n^{(i)}$), define $w := R^{-1} u_h$. We then set $\hat{u}_h = R\hat{w}$, where the $k$th entry of $\hat{w}$ is defined as:
\[ \hat{w}_k = \begin{cases} w_k^{(i)}, & \text{if } \lambda_k > 0, \\ w_k^{(j)}, & \text{else.} \end{cases} \]
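To illustrate this flux in a simple special case (a sketch under the assumption $m = 1$, so that $A_1 = a_1$ and $A_2 = a_2$ are scalars; the symbols $a_1, a_2$ are introduced only for this example): we can then take $R = 1$ and $\lambda_1 = a_1 n_1^{(i)} + a_2 n_2^{(i)}$, so that
\[ \hat{u}_h = \begin{cases} u_h^{(i)}, & \text{if } a_1 n_1^{(i)} + a_2 n_2^{(i)} > 0, \\ u_h^{(j)}, & \text{else.} \end{cases} \]
In other words, on an edge where the advection direction $(a_1, a_2)$ points out of $E_i$, the flux uses the value from $E_i$ itself; otherwise it uses the value from the neighbor $E_j$. For systems, the same choice is made for each characteristic component separately.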
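Using this numerical flux, the bilinear form $B$ can be specified as follows:
\[ B(u_h, v) = -\sum_{i=1}^{N} \int_{E_i} \big\langle u_h, A_1^T v_{x_1} + A_2^T v_{x_2} \big\rangle + \sum_{i=1}^{N} \sum_{e \in \partial E_i} \int_e \big\langle (A_1 n_1^{(i)} + A_2 n_2^{(i)}) \hat{u}_h, v^{(i)} \big\rangle. \]
4.3 Original post-processing strategies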
The smoothness and accuracy of a DG approximation can be improved by applying a post-processor at the final simulation time. This section provides the basics of this technique. Section 4.3.1 defines B-splines [70, 71], which are the building blocks of the filter. Section 4.3.2 discusses the original symmetric post-processor studied in [11, 22]. Section 4.3.3 considers the one-sided post-processor introduced in [64] for post-processing near non-periodic boundaries and shocks.
4.3.1 B-splines
A B-spline $\psi^{(p+1)}$ of order $p + 1$ can be defined recursively in the following manner¹:
\[ \psi^{(1)} := 1_{[-\frac12, \frac12]}, \qquad \psi^{(p+1)} := \psi^{(p)} \star \psi^{(1)}, \quad \text{for all } p \geq 1. \tag{4.3} \]
Figure 4.1 provides an illustration of a B-spline. In general, a B-spline of order $p + 1$ is a piecewise polynomial of degree $p$ that is $p - 1$ times continuously differentiable. Moreover, its support reads $[-\frac{p+1}{2}, \frac{p+1}{2}]$. For more details on B-splines, cf. [70, 71].
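For instance, evaluating the convolution in (4.3) for $p = 1$ gives the familiar hat function (a short worked computation, added here for illustration):
\[ \psi^{(2)}(x) = \int_{-\infty}^{\infty} \psi^{(1)}(x - y)\, \psi^{(1)}(y) \, dy = \Big| [x - \tfrac12, x + \tfrac12] \cap [-\tfrac12, \tfrac12] \Big| = \max\{0, 1 - |x|\}. \]
This is a piecewise polynomial of degree 1 with support $[-1, 1]$ that is continuous but not differentiable at $x = -1, 0, 1$, in line with the general properties stated above for $p = 1$.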
Figure 4.1. Illustration of a B-spline $\psi^{(p+1)}(x)$ for $p = 1$. Its support is contained in $[-\frac{p+1}{2}, \frac{p+1}{2}]$.
4.3.2 Symmetric Post-processor
The symmetric post-processor [11, 22] enhances a DG approximation (with
polynomial degree p) by convolving it against a linear combination of B-splines.
To specify this technique, we choose 2p + 1 integer nodes that are located
symmetrically around the origin:
\[ x_j = -p + j, \qquad j = 0, ..., 2p. \tag{4.4} \]
Next, we place a B-spline of order p + 1 at each kernel node, and define a kernel
K that is a linear combination of these B-splines:
\[ K(x) = \sum_{j=0}^{2p} c_j\, \psi^{(p+1)}(x - x_j), \qquad \text{for all } x \in \mathbb{R}, \tag{4.5} \]
where the coefficients $c_j$ are determined by the following linear system:
\[ \sum_{j=0}^{2p} c_j \int_{-\infty}^{\infty} \psi^{(p+1)}(x)\,(x + x_j)^k \, dx = \begin{cases} 1, & \text{for } k = 0, \\ 0, & \text{else.} \end{cases} \tag{4.6} \]
¹ Here, the symbol $\star$ denotes the convolution operator. Furthermore, $1_{[-\frac12, \frac12]}$ is the indicator function that is 1 in $[-\frac12, \frac12]$ and 0 elsewhere.
Existence and uniqueness of the solution of the system in (4.6) have been
shown in [10, Lemma 8.1]. The coefficients are chosen in this way to ensure
that the kernel reproduces polynomials q of degree 2p and lower in the sense
that K ⋆ q = q. The relevance of this property will be discussed further in
Section 5.3.1.
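As a worked example of (4.6) (our own illustration for $p = 1$, taking $k = 0, 1, 2$ and using that the B-spline of order 2 is the hat function $\psi^{(2)}(x) = \max\{0, 1 - |x|\}$): the nodes are $x_0 = -1$, $x_1 = 0$, $x_2 = 1$, and the B-spline has the moments $\int \psi^{(2)}(x)\,dx = 1$, $\int \psi^{(2)}(x)\, x \, dx = 0$ and $\int \psi^{(2)}(x)\, x^2 \, dx = \frac16$. Hence $\int \psi^{(2)}(x)(x + x_j)^k \, dx$ equals $1$, $x_j$ and $\frac16 + x_j^2$ for $k = 0, 1, 2$ respectively, and (4.6) becomes
\[ c_0 + c_1 + c_2 = 1, \qquad -c_0 + c_2 = 0, \qquad \tfrac76 c_0 + \tfrac16 c_1 + \tfrac76 c_2 = 0, \]
with solution $c_0 = c_2 = -\frac{1}{12}$ and $c_1 = \frac76$. The coefficients sum to one, and the negative outer weights are what make the second moment of the kernel vanish, as needed to reproduce quadratic polynomials.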
The symmetric post-processor can now be defined as follows:
Definition 4.3.1 (Symmetric Post-processor) Consider a periodic
DG approximation uh at some final simulation time on the interval [a, b],
using mesh elements of size h and polynomials of degree p, as discussed in
Section 4.2.1. Let K denote the kernel defined by (4.4), (4.5) and (4.6).
Then, the result of post-processing uh in the evaluation point x̄ ∈ (a, b) is
computed by convolving uh against the scaled kernel K (using a periodic
extension of uh when needed):
\[ u_h^\star(\bar{x}) = \frac{1}{h} \int_a^b K\!\left(\frac{\bar{x} - x}{h}\right) u_h(x) \, dx. \tag{4.7} \]
Figure 4.2. Application of the scaled symmetric kernel $K\left(\frac{\bar{x}-x}{h}\right)$ at the evaluation point $\bar{x}$ in the mesh ($p = 1$). The kernel nodes are indicated by circles, and the corresponding B-splines by dashed lines.
Figure 4.2 illustrates the symmetric post-processor. Although this technique
does not contain any information of the underlying physics or numerics, it has
been shown in [22] that it enhances the DG convergence rate from order p + 1
to order 2p + 1 for the linear periodic hyperbolic problems under consideration,
assuming that the exact solution is sufficiently smooth.
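The amount of data used by (4.7) follows directly from the definitions above (a short derivation added for illustration). The outermost kernel nodes in (4.4) are $\pm p$, and each B-spline has half-width $\frac{p+1}{2}$, so the support of $K$ is contained in $[-\frac{3p+1}{2}, \frac{3p+1}{2}]$. Consequently,
\[ u_h^\star(\bar{x}) \text{ depends on } u_h \text{ only on } \left[ \bar{x} - \tfrac{3p+1}{2} h, \; \bar{x} + \tfrac{3p+1}{2} h \right]. \]
For $p = 1$, this is two mesh elements on either side of $\bar{x}$. It also indicates why the symmetric kernel needs a periodic extension (or another remedy) whenever $\bar{x}$ lies within a distance $\frac{3p+1}{2} h$ of a boundary.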
A second feature of the post-processor is that the smoothness of the B-splines is carried over to the approximation, i.e. $u_h^\star$ is p − 1 times continuously differentiable. This enhanced smoothness can benefit the visualization of the approximation, e.g. in the form of streamlines (also cf. Section 4.5.7).
Finally, we remark that the application of the post-processor is not restricted to the DG approximations above, but can be applied to any function,
also in higher dimensions (cf. Section 4.4.3). However, whether this also yields
acceptable accuracy and efficiency does depend on the underlying problem.
The post-processor has been effectively applied for DG discretizations on nonuniform rectangular [23] and unstructured triangular [48] meshes. Furthermore,
it has been shown successful for certain linear convection-diffusion problems [41]
and nonlinear hyperbolic conservation laws [42].
4.3.3 One-sided post-processor
The symmetric post-processor is an effective technique for enhancing the accuracy and smoothness for periodic problems with a sufficiently smooth exact
solution. To be able to post-process near non-periodic boundaries and shocks
as well, the one-sided post-processor has been proposed in [64].
This technique is similar to the symmetric case, except that the kernel nodes are shifted to one side of the origin, e.g. the right side:
\[ x_j = \frac{p+1}{2} + j, \qquad \text{for all } j = 0, ..., 2p. \tag{4.8} \]
The resulting right-sided post-processor is now obtained by replacing (4.4) by (4.8) in Definition 4.3.1 (note that the kernel coefficients $c_j$ now take different values).
Figure 4.3. Application of the scaled right-sided kernel $K\left(\frac{\bar{x}-x}{h}\right)$ at the evaluation point $\bar{x}$ in the mesh ($p = 1$). The kernel nodes are indicated by circles, and the corresponding B-splines by dashed lines. Unlike the symmetric kernel, it can be applied near the right boundary.
Figure 4.3 illustrates the right-sided post-processor. Unlike the symmetric post-processor (cf. Figure 4.2), it can be applied anywhere close to the right boundary of the domain, without requiring information outside the spatial domain. This is because the support of the right-sided kernel is located entirely on the right side of the origin (although the support of $K\left(\frac{\bar{x}-\cdot}{h}\right)$ is located on the left side of the evaluation point $\bar{x}$, as illustrated). This follows from (4.5), (4.8), and the fact that the support of the B-spline $\psi^{(p+1)}$ is contained in the interval $[-\frac{p+1}{2}, \frac{p+1}{2}]$.
Although the right-sided post-processor is suitable near the right boundary,
it is not suitable near the left boundary, as it would require information outside
of the spatial domain in this region (this is similar to the symmetric case). To
be able to post-process near the left boundary, we can reverse the strategy
above and shift the kernel nodes to the other (left) side of the origin. The
resulting left-sided post-processor is applicable near the left boundary, but not
the right.
Although neither of the post-processors can be applied in the entire domain,
we can still cover the entire domain by combining the previous kernel types: in
the interior, we can use the symmetric kernel; at the right boundary, we can
use the right-sided kernel; at the left boundary, we can use the left-sided kernel;
and in the transition regions, we can use kernels that are between a symmetric
and a one-sided kernel (specified below). Altogether, we obtain the following
post-processor, as proposed in [64]:
Definition 4.3.2 ((Combined) one-sided post-processor) Consider
a DG approximation uh at some final simulation time on the interval [a, b],
using mesh elements of size h and polynomials of degree p, as discussed in
Section 4.2.1. Let x̄ denote an evaluation point in (a, b). Depending on
this evaluation point, define the following nodes:
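\[ x_j = -p + j + \lambda(\bar{x}), \qquad \text{for all } j = 0, ..., 2p, \tag{4.9} \]
\[ \lambda(\bar{x}) = \begin{cases} \min\left\{0, -\left\lceil \frac{3p+1}{2} \right\rceil + \left\lfloor \frac{\bar{x}-a}{h} \right\rfloor \right\}, & \text{for } \bar{x} \in [a, \frac{a+b}{2}), \\ \max\left\{0, \left\lceil \frac{3p+1}{2} \right\rceil + \left\lceil \frac{\bar{x}-b}{h} \right\rceil \right\}, & \text{for } \bar{x} \in [\frac{a+b}{2}, b]. \end{cases} \tag{4.10} \]
Let $K$ be the result of substituting these nodes into (4.5) and (4.6)². Then, the result of post-processing $u_h$ in $\bar{x}$ is obtained by computing the convolution with the scaled kernel $K$ as in (4.7).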
Figure 4.4. Illustration of the shift function $\lambda(\bar{x})$ in (4.10), which takes the value $-\lceil \frac{3p+1}{2} \rceil$ (left-sided kernel) near $a$, 0 (symmetric kernel) in the interior, and $\lceil \frac{3p+1}{2} \rceil$ (right-sided kernel) near $b$. This function selects the appropriate kernel, depending on the evaluation point $\bar{x}$.
² As for the symmetric kernel, existence and uniqueness of the kernel coefficients defined by (4.6) using shifted nodes can be shown following [10, Lemma 8.1].
The shift function λ in (4.10) is designed so that the post-processor above can
be applied in the entire domain, even for non-periodic boundary conditions.
An illustration of λ is given in Figure 4.4. Different values of λ result in
different kernels, so the kernel has become dependent on the evaluation point x̄
through λ: the case λ = 0 corresponds to the symmetric kernel. Similarly, for
$\lambda = -\lceil \frac{3p+1}{2} \rceil$ and $\lambda = \lceil \frac{3p+1}{2} \rceil$, the left- and right-sided kernels are obtained.
Observe that the symmetric kernel is applied whenever possible. The reason
for this is that the symmetric post-processor yields better accuracy than the
non-symmetric variants. The latter has been observed in [64, p. 298] and will
also be demonstrated in Section 4.5.
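As a concrete illustration of (4.9) and (4.10) (a worked example with values chosen by us: $p = 1$, $[a, b] = [0, 1]$ and $h = 0.1$, so that $\lceil \frac{3p+1}{2} \rceil = 2$), the shift, the kernel nodes, and the support of $K\left(\frac{\bar{x}-\cdot}{h}\right)$ read:
\[ \begin{array}{llll} \bar{x} = 0.05: & \lambda = \min\{0, -2 + 0\} = -2, & x_j = -3, -2, -1, & \text{support } [0.05, 0.45], \\ \bar{x} = 0.15: & \lambda = \min\{0, -2 + 1\} = -1, & x_j = -2, -1, 0, & \text{support } [0.05, 0.45], \\ \bar{x} = 0.25: & \lambda = \min\{0, -2 + 2\} = 0, & x_j = -1, 0, 1, & \text{support } [0.05, 0.45], \\ \bar{x} = 0.95: & \lambda = \max\{0, 2 + 0\} = 2, & x_j = 1, 2, 3, & \text{support } [0.55, 0.95]. \end{array} \]
The supports follow from the half-width $\frac{p+1}{2} = 1$ of each B-spline. In all four cases the required data lies inside $[a, b]$, while $\lambda$ jumps by one whenever $\bar{x}$ crosses an element boundary in the transition region.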
Finally, note that we can now post-process near shocks by treating them as
a domain boundary. This is established by subdividing the spatial domain into
subdomains such that the shocks occur at the boundaries of these subdomains.
After that, each subdomain can be post-processed individually as prescribed
by Definition 4.3.2. Unlike the symmetric post-processor, which would render
the shock p − 1 times continuously differentiable, this strategy does not remove
the discontinuous nature of the shock. In that sense, the underlying physics is
better represented.
4.4 Position-dependent post-processor
The (combined) one-sided post-processor can be applied in the entire domain,
including near boundaries and shocks. To enhance the smoothness and accuracy of this technique, in this section, we propose the position-dependent
post-processor.
Section 4.4.1 generalizes the original one-sided post-processor by relaxing
both the number and the position of the kernel nodes. Section 4.4.2 uses
this generalization to introduce the position-dependent post-processor. Section 4.4.3 describes the application of a post-processor for two-dimensional
problems.
4.4.1 Generalized post-processor
To improve the one-sided post-processor, the first step is to generalize it by
relaxing both the number and the position of the kernel nodes. More specifically, we drop the convention that the kernel must be based on 2p + 1 kernel
nodes, and use r + 1 B-splines instead (where r is any positive integer at this
point). Furthermore, we allow the kernel nodes to take real values, instead
of the original approach to consider integers only. This makes it possible to
render the shift function, and thus the overall technique, smooth.
Altogether, the generalized post-processor is obtained by considering the
original post-processor in Definition 4.3.2, but then using r + 1 kernel nodes
and removing the rounding operators in the shift function (4.10):
Definition 4.4.1 (Generalized post-processor) Consider a DG approximation $u_h$ at some final simulation time on the interval $[a, b]$, using mesh elements of size $h$ and polynomials of degree $p$, as discussed in Section 4.2.1. Let $\bar{x}$ denote an evaluation point in $(a, b)$. Depending on this evaluation point, define the following $r + 1$ kernel nodes:
\[ x_j = -\frac{r}{2} + j + \lambda(\bar{x}), \qquad \text{for all } j = 0, ..., r, \tag{4.11} \]
\[ \lambda(\bar{x}) = \begin{cases} \min\left\{0, -\frac{r+p+1}{2} + \frac{\bar{x}-a}{h}\right\}, & \text{for } \bar{x} \in [a, \frac{a+b}{2}), \\ \max\left\{0, \frac{r+p+1}{2} + \frac{\bar{x}-b}{h}\right\}, & \text{for } \bar{x} \in [\frac{a+b}{2}, b]. \end{cases} \tag{4.12} \]
Let $K$ be the result of substituting these nodes into (4.5) and (4.6) (using $r$ rather than $2p$ in the upper bound of the summation). Then, the result of post-processing $u_h$ in $\bar{x}$ is obtained by computing the convolution with the scaled kernel $K$ as in (4.7).
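Figure 4.5. Illustration of the modified shift function $\lambda(\bar{x})$ in (4.12), which ranges from $-\frac{r+p+1}{2}$ (left-sided kernel) at $a$, through 0 (symmetric kernel) in the interior, to $\frac{r+p+1}{2}$ (right-sided kernel) at $b$. Unlike the original shift function (cf. Figure 4.4), it no longer contains discontinuities.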
The modified shift function in (4.12) is illustrated in Figure 4.5. Compared to
the former shift function (cf. Figure 4.4), there are two main advantages: the
first is that its values are closer to zero. As a consequence, the resulting kernels
are ‘more symmetric’ and thus more accurate (cf. Section 4.3.3).
The second advantage is that the modified shift function is now continuous
everywhere. Because the filtered DG solution is as smooth as the shift function
(but typically not smoother than the B-splines), the generalized post-processor
yields a better and more realistic smoothness than before.
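The effect of removing the rounding operators can be made explicit (a short derivation added for illustration, using only (4.5), (4.11) and (4.12)). Since the nodes range from $-\frac r2 + \lambda(\bar{x})$ to $\frac r2 + \lambda(\bar{x})$ and each B-spline has half-width $\frac{p+1}{2}$, the scaled kernel $K\left(\frac{\bar{x} - \cdot}{h}\right)$ is supported in
\[ \left[ \bar{x} - \left(\lambda(\bar{x}) + \tfrac{r+p+1}{2}\right) h, \;\; \bar{x} - \left(\lambda(\bar{x}) - \tfrac{r+p+1}{2}\right) h \right]. \]
Near the left boundary, where $\lambda(\bar{x}) = -\frac{r+p+1}{2} + \frac{\bar{x}-a}{h}$, this interval equals $[a, a + (r+p+1)h]$: the kernel is shifted exactly as far as needed to stay inside the domain, and no further. For example, for $p = 1$, $r = 2$, $[a, b] = [0, 1]$, $h = 0.1$ and $\bar{x} = 0.05$ (values chosen by us), (4.12) gives $\lambda = -1.5$, whereas (4.10) gives $\lambda = -2$.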
We remark that the smoothness could be enhanced further by designing a shift function that has the same smoothness as the B-splines throughout the entire domain (note that the present version is not differentiable in two locations). However, such an alternative shift function would be less close to zero, resulting in kernels that are ‘less symmetric’ and thus less accurate.
It will be shown in Chapter 5 that, similar to the symmetric post-processor, the generalized post-processor can extract DG superconvergence of order 2p + 1.
4.4.2 Position-dependent post-processor
The generalized post-processor can be seen as a variant of the original one-sided post-processor that yields better smoothness for an arbitrary number of
kernel nodes. To enhance the accuracy as well, we now combine two generalized
post-processors to incorporate extra kernel nodes in the non-symmetric kernels
near the boundary. The resulting technique is called the position-dependent
post-processor, and aims to increase the accuracy near the boundary to that
established by the symmetric kernel in the interior.
To specify this method, we apply the generalized post-processor twice: once
using 2p + 1 kernel nodes and once using 4p + 1 kernel nodes (this particular
choice is motivated below). After that, we compute a ‘smooth’ convex combination of these two results to select the extra kernel nodes near the boundary
only, while maintaining the enhanced smoothness. Altogether, we arrive at the
following definition:
Definition 4.4.2 (Position-dependent post-processor) Consider a DG approximation $u_h$ at some final simulation time on the interval $[a, b]$, using mesh elements of size $h$, and polynomials of degree $p$, as discussed in Section 4.2.1. Let $\bar{x}$ denote an evaluation point in $(a, b)$. For this evaluation point, suppose that $u^\star_{h,2p+1}(\bar{x})$ and $u^\star_{h,4p+1}(\bar{x})$ are the result of applying the generalized post-processor (Definition 4.4.1) using $2p + 1$ and $4p + 1$ kernel nodes respectively. Furthermore, let $\theta : [a, b] \to [0, 1]$ be a (‘smooth’) function that is equal to 1 in the interior and equal to 0 near the boundary. Then, we set
\[ u_h^\star(\bar{x}) = \theta(\bar{x})\, u^\star_{h,2p+1}(\bar{x}) + \big(1 - \theta(\bar{x})\big)\, u^\star_{h,4p+1}(\bar{x}). \]
Figure 4.6. Illustration of the coefficient function $\theta(\bar{x})$, which smoothly selects extra kernel nodes near the boundary for higher accuracy: $\theta = 0$ ($4p+1$ nodes) near $a$ and $b$, and $\theta = 1$ ($2p+1$ nodes) in the interior.
The purpose of the coefficient function θ is to select the appropriate number of kernel nodes, depending on the evaluation point. An illustration of a specific choice for θ is given in Figure 4.6. Near the boundary, we prefer to use extra nodes to enhance the accuracy of the non-symmetric kernels. Hence, θ = 0 in this region³, to select $u^\star_{h,4p+1}$ rather than $u^\star_{h,2p+1}$. In the interior, the symmetric kernel already provides the desired accuracy without the additional costs for the use of extra B-splines (costs are discussed further below). Hence, θ = 1 in this region, to select $u^\star_{h,2p+1}$ rather than $u^\star_{h,4p+1}$. Between the boundary regions and the interior, there are two transition regions, where θ varies smoothly between 0 and 1 to avoid introducing artificial discontinuities⁴.
It will be shown in Chapter 5 that, similar to the generalized post-processor, the position-dependent post-processor can extract DG superconvergence of order 2p + 1.
There are several remarks to be made with respect to the position-dependent
post-processor:
• The symmetric and position-dependent post-processor are based on the same building blocks. For this reason, we speculate that suitable application types for the symmetric post-processor (away from non-periodic boundaries and shocks) are also suitable for the position-dependent post-processor (but now in the entire domain). This would include the unstructured and non-linear problems mentioned at the end of Section 4.3.2, but verification of the latter is left for future research. One- and two-dimensional linear problems on uniform meshes will be studied in Section 4.5.
• It is not needed to compute both u⋆h,2p+1 and u⋆h,4p+1 in the entire domain.
This is only required in the two small transition regions where θ ∈ (0, 1).
Near the boundary, we only need to compute u⋆h,4p+1 . In the interior, we
only require u⋆h,2p+1 .
• Similar to the (combined) one-sided post-processor, we can apply the
position-dependent post-processor near shocks, as discussed in Section
4.3.3.
• The use of extra kernel nodes increases the kernel support, and thus the computational costs. Using r + 1 B-splines of order p + 1, the convolution with the kernel can be implemented using small matrix-vector multiplications. These matrices have size (r + 1) × (p + 1) and contain the inner products of the B-splines with the DG basis functions. The vectors (of length p + 1) contain the DG modes for a mesh element in the kernel support. Altogether, to compute $u^\star_{h,r+1}$ in an evaluation point, r + p + 2 such matrix-vector multiplications are required, plus the summation of all elements in the resulting vectors. We stress that the post-processor is applied only once, at the final simulation time. To keep the computational time at a minimum, extra kernel nodes should only be applied where necessary. For a more efficient inexact implementation, see [49, 47, 50].
³ In this example, the boundary region where θ = 0 is $\frac{3p+1}{2}$ mesh elements wide, i.e. the smallest region where the symmetric kernel with 2p + 1 nodes cannot be applied.
⁴ In this example, the transition regions are two mesh elements wide. In these regions, θ is a polynomial of degree 2p + 1 such that θ is p + 1 times differentiable.
• Instead of 4p + 1, we could use any number of kernel nodes near the
boundary (provided that the convolution remains well-defined). For example, we also ran tests using 3p+1 and 5p+1 kernel nodes. As expected,
we found that a larger number of kernel nodes leads to more accuracy.
However, the difference between using 5p + 1 nodes and 4p + 1 nodes was
relatively small. For 3p + 1 nodes, the errors were not improved over the
unfiltered errors for the coarser meshes. Based on these experiments, using 4p + 1 kernel nodes is a natural choice for enhancing the errors where
necessary, without increasing the kernel support and the computational
costs too much.
• For coarse meshes, the use of extra kernel nodes can increase the support of $K\left(\frac{\bar{x}-\cdot}{h}\right)$ such that it is no longer contained in the domain $[a, b]$ (no matter how it is shifted). This is the case if $(r + p + 1)h$ becomes larger than $b - a$. To be able to apply the post-processor anyway, a possible solution is to scale the kernel by the smaller scaling
\[ H = \frac{b-a}{r+p+1} < h \tag{4.13} \]
rather than $h$ in (4.7). This shrinks the kernel support appropriately.
It will be demonstrated in Section 4.5 that the use of a smaller scaling
for the coarser meshes can reduce the accuracy-enhancing quality of the
post-processor. For this reason, the alternative scaling should only be
applied when necessary.
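To give a feeling for the numbers in the last remarks (a worked example with values chosen by us): for $p = 2$, the kernel with $2p + 1 = 5$ nodes ($r = 4$) has a scaled support of width $(r + p + 1)h = 7h$ and requires $r + p + 2 = 8$ small matrix-vector multiplications per evaluation point, while the kernel with $4p + 1 = 9$ nodes ($r = 8$) has support width $11h$ and requires 12 such multiplications. Regarding (4.13): on a uniform mesh with $N$ elements, $h = \frac{b-a}{N}$, so
\[ (r + p + 1)h > b - a \iff N < r + p + 1, \qquad \text{in which case} \quad \frac{H}{h} = \frac{N}{r+p+1} < 1. \]
For instance, with $p = 3$ and $4p + 1$ nodes ($r = 12$), the alternative scaling would be needed for meshes with fewer than 16 elements.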
4.4.3 Post-processing in two dimensions
All post-processing techniques discussed in this chapter can also be applied
for higher-dimensional problems using tensor products. In this section, we
illustrate this for a square domain Ω = [a1 , b1 ] × [a2 , b2 ] (for higher-dimensional
problems, cf. Section 5.2.1).
To this end, consider an evaluation point (x̄1 , x̄2 ) ∈ (a1 , b1 ) × (a2 , b2 ). Let
K1 denote the one-dimensional kernel for the evaluation point x̄1 ∈ (a1 , b1 ),
corresponding to any of the post-processors discussed in the previous sections:
symmetric, (combined) one-sided, generalized, or position-dependent⁵. Similarly, let $K_2$ denote such a kernel for $\bar{x}_2 \in (a_2, b_2)$. Note that $K_1$ and $K_2$ are typically not the same kernel, as they are based on different evaluation points.
⁵ For the position-dependent post-processor, this means we take the convex combination of the two generalized kernels involved, i.e. $\theta(\bar{x}_1) K_{2p+1} + (1 - \theta(\bar{x}_1)) K_{4p+1}$.
Using this notation, a DG approximation uh on the square domain Ω (cf.
Section 4.2.2) can now be post-processed by computing the convolution against
the tensor product of the two scaled kernels:
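\[ u_h^\star(\bar{x}_1, \bar{x}_2) = \frac{1}{h^2} \int_{a_1}^{b_1} \int_{a_2}^{b_2} K_1\!\left(\frac{\bar{x}_1 - x_1}{h}\right) K_2\!\left(\frac{\bar{x}_2 - x_2}{h}\right) u_h(x_1, x_2) \, dx_2 \, dx_1. \]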
If uh is vector-valued, then this strategy can be applied to each individual
component.
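Because the two-dimensional kernel is a tensor product, the double integral can be evaluated as two successive one-dimensional convolutions (a standard reformulation, stated here for illustration; the intermediate function $g$ is notation introduced only for this remark):
\[ g(x_1, \bar{x}_2) = \frac{1}{h} \int_{a_2}^{b_2} K_2\!\left(\frac{\bar{x}_2 - x_2}{h}\right) u_h(x_1, x_2) \, dx_2, \qquad u_h^\star(\bar{x}_1, \bar{x}_2) = \frac{1}{h} \int_{a_1}^{b_1} K_1\!\left(\frac{\bar{x}_1 - x_1}{h}\right) g(x_1, \bar{x}_2) \, dx_1. \]
This makes clear that each coordinate direction is treated exactly as in the one-dimensional case.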
4.5 Numerical Results
The previous section proposed a position-dependent post-processor to improve
on the accuracy of the original (combined) one-sided post-processor. In this
section, we compare both techniques in terms of numerical experiments (for a
theoretical study, cf. Chapter 5).
In particular, we consider the $L^2$-projection of a sine function (Section
4.5.1); four one-dimensional hyperbolic problems, including periodic boundary conditions (Section 4.5.2), Dirichlet boundary conditions (Section 4.5.3),
variable coefficients (Section 4.5.4), and two stationary shocks (Section 4.5.5);
a two-dimensional system (Section 4.5.6); and the two-dimensional test cases
studied in [73], including streamline visualizations (Section 4.5.7).
In our implementation, the DG approximations are based on first order
upwind fluxes, monomial basis functions, and uniform meshes, as discussed in
Section 4.2. For the time-discretization, we use a third-order SSP-RK scheme
[36, 37] with a sufficiently small time step to ensure that the time-stepping
errors are not dominating the spatial errors. We apply both the ‘old’ and the
‘new’ post-processor at the final time T = 12.5, as specified in Definition 4.3.2
and Definition 4.4.2 respectively. Convolutions of the form (4.7) are computed
exactly using Gaussian quadrature [64]. Finally, we make use of the ARPREC
multi-precision package to reduce round-off errors appropriately [8].
4.5.1 L2-Projection
The first test case is the $L^2$-projection of $u(x) = \sin(x)$ onto the space of piecewise polynomials of degree $p = 1, 2, 3$ on the domain $[0, 2\pi]$. This test case can also be interpreted as a DG approximation at the initial time. It is the most elementary case that we can use to test the reliability of our filter.
Table 4.1 demonstrates that both post-processors enhance the convergence rate from $O(h^{p+1})$ to $O(h^{2p+1})$. However, both the orders and the magnitude of the $L^2$- and $L^\infty$-errors are better for the new post-processor. More importantly, it yields an improvement over the unfiltered errors in both norms, even
for coarse meshes. The plots (for p = 2) illustrate that both post-processors
produce identical results in the interior of the domain, where they apply the
same symmetric kernel with 2p + 1 nodes. The differences occur at the boundary: the new post-processor yields significantly better results without introducing a spurious stair-stepping effect. This can be explained by the use of extra
kernel nodes and the continuity of the new shift function (cf. Section 4.4). As
a result, unlike before, the new post-processor enhances the accuracy of the
DG approximation in the entire domain.
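The ‘order’ columns in the tables of this section are consistent with the usual estimate from two successive meshes (we assume this is how they were computed; the symbol $e_N$ for the error on $N$ elements is introduced here for illustration):
\[ \text{order} = \log_2 \frac{e_N}{e_{2N}}. \]
For example, for $p = 1$ the unfiltered $L^2$-errors $6.51 \cdot 10^{-3}$ (20 elements) and $1.63 \cdot 10^{-3}$ (40 elements) give $\log_2(6.51 / 1.63) \approx 2.00$, matching the tabulated value.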
[Plots (p = 2): pointwise error on a logarithmic scale versus the spatial domain [0, 6.2832], for 20, 40, 80, and 160 elements. Panels: Error Before Post-Processing; Error After Post-Processing (Old); Error After Post-Processing (New).]

      Before Post-Processing          |  After Post-Processing (Old)     |  After Post-Processing (New)
mesh  L2-error order  L∞-error order  |  L2-error order  L∞-error order  |  L2-error order  L∞-error order
Polynomial Degree p = 1
  20  6.51e-03     -  5.95e-03     -  |  1.60e-02     -  2.21e-02     -  |  4.88e-04     -  1.26e-03     -
  40  1.63e-03  2.00  1.50e-03  1.99  |  1.71e-03  3.22  3.11e-03  2.83  |  1.90e-05  4.68  5.35e-05  4.56
  80  4.07e-04  2.00  3.76e-04  2.00  |  1.58e-04  3.43  4.00e-04  2.96  |  9.02e-07  4.40  1.79e-06  4.90
 160  1.02e-04  2.00  9.40e-05  2.00  |  1.42e-05  3.48  5.03e-05  2.99  |  5.33e-08  4.08  5.69e-08  4.98
Polynomial Degree p = 2
  20  1.73e-04     -  1.28e-04     -  |  3.95e-03     -  6.68e-03     -  |  4.19e-06     -  3.14e-06     -
  40  2.16e-05  3.00  1.61e-05  2.99  |  2.11e-04  4.23  3.92e-04  4.09  |  8.69e-08  5.59  6.71e-08  5.55
  80  2.70e-06  3.00  2.02e-06  3.00  |  5.47e-06  5.27  1.39e-05  4.82  |  1.38e-09  5.97  7.87e-10  6.41
 160  3.38e-07  3.00  2.53e-07  3.00  |  1.26e-07  5.45  4.46e-07  4.96  |  2.17e-11  5.99  1.23e-11  6.00
Polynomial Degree p = 3
  20  3.42e-06     -  2.15e-06     -  |  1.06e-04     -  2.26e-04     -  |  3.75e-07     -  9.84e-07     -
  40  2.14e-07  4.00  1.35e-07  3.99  |  4.71e-06  4.49  8.96e-06  4.66  |  6.30e-10  9.22  3.89e-10 11.31
  80  1.34e-08  4.00  8.49e-09  4.00  |  3.41e-08  7.11  8.72e-08  6.68  |  2.67e-12  7.88  1.53e-12  7.99
 160  8.36e-10  4.00  5.31e-10  4.00  |  2.00e-10  7.41  7.16e-10  6.93  |  1.06e-14  7.98  5.97e-15  8.00

Table 4.1. L2-projection
Isolating the boundary effects To isolate the effect of extra kernel nodes
at the boundary, we revisit our previous test case and apply the left-sided
post-processor in the entire domain for 2p + 1 (‘old’) and 4p + 1 (‘new’) kernel
nodes (using a periodic extension of uh when needed). In other words, the only
difference between the two post-processors is now the number of kernel nodes.
Table 4.2 demonstrates that the use of 4p + 1 kernel nodes leads to better accuracy of $O(h^{4p+1})$. The same phenomenon occurs at the boundary in Table 4.1, which explains why the new post-processor improves the errors in that region.
We stress that the accuracy of $O(h^{4p+1})$ cannot be expected in general. Based on the analysis in Section 5.5.3 later on, the theoretical error takes the form $C_1 h^{4p+1} + C_2 h^{2p+1}$, where the order in the first term is equal to the number of kernel nodes (and $C_1, C_2 > 0$ are constant with respect to $h$). We speculate that the meshes in this test case are sufficiently coarse so that the first error dominates the second. For finer meshes, it is likely that the second error will start to dominate, so that the error then becomes $O(h^{2p+1})$. We will encounter such an example in the next section.
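The crossover between the two regimes can be quantified (a short computation added for illustration, treating $C_1$ and $C_2$ as given). The two terms balance when
\[ C_1 h^{4p+1} = C_2 h^{2p+1} \iff h = h^\ast := \left(\frac{C_2}{C_1}\right)^{\frac{1}{2p}}. \]
For $h > h^\ast$ the first term dominates and the observed order is close to $4p + 1$; for $h < h^\ast$ the second term dominates and the order approaches $2p + 1$. A small ratio $C_2 / C_1$ therefore postpones the drop in the convergence rate to finer meshes.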
[Plots: pointwise error on a logarithmic scale versus the spatial domain [0, 6.2832], for 20, 40, 80, and 160 elements. Panels: Error Before Post-Processing; Error After Post-Processing (Old); Error After Post-Processing (New).]

      Before Post-Processing          |  After Post-Processing (Old)     |  After Post-Processing (New)
mesh  L2-error order  L∞-error order  |  L2-error order  L∞-error order  |  L2-error order  L∞-error order
Polynomial Degree p = 1
  20  6.51e-03     -  5.95e-03     -  |  4.50e-02     -  2.54e-02     -  |  3.72e-03     -  2.10e-03     -
  40  1.63e-03  2.00  1.50e-03  1.99  |  5.70e-03  2.98  3.22e-03  2.98  |  1.18e-04  4.97  6.71e-05  4.97
  80  4.07e-04  2.00  3.76e-04  2.00  |  7.15e-04  3.00  4.03e-04  3.00  |  3.72e-06  4.99  2.11e-06  4.99
 160  1.02e-04  2.00  9.40e-05  2.00  |  8.94e-05  3.00  5.05e-05  3.00  |  1.17e-07  5.00  6.62e-08  4.99
Polynomial Degree p = 2
  20  1.73e-04     -  1.28e-04     -  |  1.03e-02     -  5.79e-03     -  |  9.52e-05     -  5.37e-05     -
  40  2.16e-05  3.00  1.61e-05  2.99  |  3.29e-04  4.97  1.85e-04  4.96  |  1.93e-07  8.95  1.09e-07  8.95
  80  2.70e-06  3.00  2.02e-06  3.00  |  1.03e-05  4.99  5.83e-06  4.99  |  3.81e-10  8.98  2.16e-10  8.97
 160  3.38e-07  3.00  2.53e-07  3.00  |  3.23e-07  5.00  1.82e-07  5.00  |  7.61e-13  8.97  4.42e-13  8.93
Polynomial Degree p = 3
  20  3.42e-06     -  2.15e-06     -  |  2.58e-03     -  1.46e-03     -  |  2.82e-06     -  1.59e-06     -
  40  2.14e-07  4.00  1.35e-07  3.99  |  2.09e-05  6.95  1.18e-05  6.95  |  3.62e-10 12.93  2.04e-10 12.93
  80  1.34e-08  4.00  8.49e-09  4.00  |  1.65e-07  6.99  9.29e-08  6.99  |  4.47e-14 12.98  2.53e-14 12.98
 160  8.36e-10  4.00  5.31e-10  4.00  |  1.29e-09  7.00  7.28e-10  7.00  |  5.51e-18 12.99  3.21e-18 12.94

Table 4.2. L2-projection: isolating boundary effects
4.5.2 Constant coefficients
We now consider a one-dimensional linear hyperbolic equation with constant
coefficients and periodic boundary conditions:
\[ u_t + u_x = 0, \qquad x \in [0, 2\pi], \quad t \in [0, 12.5]. \]
The initial condition is chosen such that the exact solution reads u(x, t) =
sin(x − t). For t = 0, this test case is equivalent to the one discussed in the
previous section. Here, we consider the final time t = 12.5. This test case is
more challenging than the previous one because uh now contains information
of the physics of the PDE and the numerics of the DG method.
Nevertheless, Table 4.3 demonstrates that the results are similar to those for the $L^2$-projection: we are able to obtain better errors than the DG solution and the old post-processor. In fact, the magnitude of the errors is improved throughout the entire domain when the new position-dependent post-processor is applied to the DG approximation. Furthermore, the convergence rate is improved from $O(h^{p+1})$ to $O(h^{2p+1})$.
[Plots: pointwise error on a logarithmic scale over the spatial domain [0, 2π] in three panels (error before post-processing, error after post-processing (old), error after post-processing (new)), for 20, 40, 80, and 160 elements.]
mesh  | Before post-processing         | After post-processing (old)    | After post-processing (new)
      | L2-error order  L∞-error order | L2-error order  L∞-error order | L2-error order  L∞-error order
Polynomial degree p = 1
20    | 1.41e-02 -      1.02e-02 -     | 1.89e-02 -      2.21e-02 -     | 9.60e-03 -      5.44e-03 -
40    | 2.91e-03 2.28   2.69e-03 1.92  | 2.10e-03 3.17   3.12e-03 2.82  | 1.20e-03 3.00   6.78e-04 3.00
80    | 6.81e-04 2.09   7.57e-04 1.83  | 2.18e-04 3.27   4.03e-04 2.95  | 1.50e-04 3.00   8.45e-05 3.00
160   | 1.67e-04 2.03   2.00e-04 1.92  | 2.34e-05 3.22   5.10e-05 2.98  | 1.87e-05 3.00   1.05e-05 3.00
Polynomial degree p = 2
20    | 2.68e-04 -      3.18e-04 -     | 4.00e-03 -      7.50e-03 -     | 1.30e-05 -      8.41e-06 -
40    | 3.35e-05 3.00   3.98e-05 3.00  | 2.11e-04 4.25   4.07e-04 4.20  | 3.77e-07 5.11   2.16e-07 5.28
80    | 4.19e-06 3.00   4.97e-06 3.00  | 5.46e-06 5.27   1.41e-05 4.85  | 1.06e-08 5.16   5.97e-09 5.18
160   | 5.24e-07 3.00   6.22e-07 3.00  | 1.25e-07 5.45   4.49e-07 4.97  | 3.09e-10 5.10   1.74e-10 5.10
Polynomial degree p = 3
20    | 5.18e-06 -      4.40e-06 -     | 1.30e-04 -      3.21e-04 -     | 3.76e-07 -      1.05e-06 -
40    | 3.24e-07 4.00   2.76e-07 4.00  | 4.71e-06 4.79   9.45e-06 5.09  | 6.63e-10 9.15   4.09e-10 11.32
80    | 2.02e-08 4.00   1.72e-08 4.00  | 3.41e-08 7.11   8.91e-08 6.73  | 2.96e-12 7.81   1.69e-12 7.92
160   | 1.26e-09 4.00   1.08e-09 4.00  | 2.00e-10 7.41   7.23e-10 6.95  | 1.29e-14 7.84   7.28e-15 7.86
Table 4.3. Constant coefficients
Isolating boundary effects. Similar to Section 4.5.1, we isolate the effect of extra kernel nodes near the boundary by repeating the experiment while applying the left-sided post-processor throughout the entire domain.
The results are displayed in Table 4.4. As before (cf. Table 4.2), the extra nodes yield faster convergence and smaller errors. This explains the higher accuracy of the position-dependent post-processor near the boundary in Table 4.3.
Unlike before, the convergence rate is typically lower than $O(h^{4p+1})$, and it drops to $O(h^{2p+1})$ for finer meshes. This change in the convergence rate is in line with our earlier speculation that the lower order component of the error, which is $O(h^{2p+1})$, starts to dominate the higher order component for sufficiently fine meshes.
[Plots: pointwise error on a logarithmic scale over the spatial domain [0, 2π] in three panels (error before post-processing, error after post-processing (old), error after post-processing (new)), for 20, 40, 80, and 160 elements.]
mesh  | Before post-processing         | After post-processing (old)    | After post-processing (new)
      | L2-error order  L∞-error order | L2-error order  L∞-error order | L2-error order  L∞-error order
Polynomial degree p = 1
20    | 1.41e-02 -      1.02e-02 -     | 4.19e-02 -      2.36e-02 -     | 1.22e-02 -      6.87e-03 -
40    | 2.91e-03 2.28   2.69e-03 1.92  | 5.57e-03 2.91   3.14e-03 2.91  | 1.24e-03 3.30   6.98e-04 3.30
80    | 6.81e-04 2.09   7.57e-04 1.83  | 7.15e-04 2.96   4.03e-04 2.96  | 1.50e-04 3.05   8.45e-05 3.05
160   | 1.67e-04 2.03   2.00e-04 1.92  | 9.04e-05 2.98   5.10e-05 2.98  | 1.86e-05 3.01   1.05e-05 3.01
Polynomial degree p = 2
20    | 2.68e-04 -      3.18e-04 -     | 1.03e-02 -      5.79e-03 -     | 1.05e-04 -      5.90e-05 -
40    | 3.35e-05 3.00   3.98e-05 3.00  | 3.29e-04 4.97   1.85e-04 4.97  | 4.49e-07 7.86   2.53e-07 7.86
80    | 4.19e-06 3.00   4.97e-06 3.00  | 1.03e-05 4.99   5.83e-06 4.99  | 9.34e-09 5.59   5.27e-09 5.59
160   | 5.24e-07 3.00   6.22e-07 3.00  | 3.23e-07 5.00   1.82e-07 5.00  | 2.88e-10 5.02   1.62e-10 5.02
Polynomial degree p = 3
20    | 5.18e-06 -      4.40e-06 -     | 2.58e-03 -      1.46e-03 -     | 2.82e-06 -      1.59e-06 -
40    | 3.24e-07 4.00   2.76e-07 4.00  | 2.09e-05 6.95   1.18e-05 6.95  | 3.96e-10 12.80  2.24e-10 12.80
80    | 2.02e-08 4.00   1.72e-08 4.00  | 1.65e-07 6.99   9.29e-08 6.99  | 3.16e-13 10.29  1.78e-13 10.29
160   | 1.26e-09 4.00   1.08e-09 4.00  | 1.29e-09 7.00   7.28e-10 7.00  | 2.32e-15 7.09   1.31e-15 7.09
Table 4.4. Constant coefficients: isolating boundary effects
4.5.3 Dirichlet BCs
The previous section discussed a test case with periodic boundary conditions. For those problems, it is actually not necessary to use a one-sided approach near the boundary: the more accurate symmetric post-processor could be applied by using a periodic extension of the DG solution. However, in most real-life applications, the boundary conditions are not periodic. For this reason, we revisit the test case of the previous section, but now using Dirichlet boundary conditions. That is,
\[ u_t + u_x = 0, \qquad u(0, t) = \sin(-t), \qquad x \in [0, 2\pi], \quad t \in [0, 12.5]. \]
The initial condition is chosen such that the exact solution reads $u(x, t) = \sin(x - t)$.
The results are displayed in Table 4.5: similar to the periodic case, we observe that the convergence rate is improved from $O(h^{p+1})$ to $O(h^{2p+1})$. Furthermore, the smoothness and the accuracy are improved in the entire domain, including the boundary.
[Plots: pointwise error on a logarithmic scale over the spatial domain [0, 2π] in three panels (error before post-processing, error after post-processing (old), error after post-processing (new)), for 20, 40, 80, and 160 elements.]
mesh  | Before post-processing         | After post-processing (old)    | After post-processing (new)
      | L2-error order  L∞-error order | L2-error order  L∞-error order | L2-error order  L∞-error order
Polynomial degree p = 1
20    | 1.10e-02 -      1.29e-02 -     | 1.63e-02 -      2.33e-02 -     | 3.37e-03 -      2.41e-03 -
40    | 2.68e-03 2.03   3.29e-03 1.97  | 1.76e-03 3.22   3.23e-03 2.85  | 4.14e-04 3.02   3.00e-04 3.01
80    | 6.67e-04 2.01   8.32e-04 1.98  | 1.66e-04 3.40   4.13e-04 2.97  | 5.13e-05 3.01   3.72e-05 3.01
160   | 1.66e-04 2.00   2.09e-04 1.99  | 1.55e-05 3.42   5.19e-05 2.99  | 6.39e-06 3.01   4.63e-06 3.01
Polynomial degree p = 2
20    | 2.68e-04 -      3.17e-04 -     | 4.00e-03 -      7.50e-03 -     | 6.98e-06 -      5.23e-06 -
40    | 3.35e-05 3.00   3.98e-05 2.99  | 2.11e-04 4.25   4.07e-04 4.20  | 1.84e-07 5.25   1.24e-07 5.40
80    | 4.19e-06 3.00   4.97e-06 3.00  | 5.46e-06 5.27   1.41e-05 4.85  | 4.63e-09 5.31   3.16e-09 5.29
160   | 5.24e-07 3.00   6.22e-07 3.00  | 1.25e-07 5.45   4.49e-07 4.97  | 1.28e-10 5.18   8.83e-11 5.16
Polynomial degree p = 3
20    | 5.18e-06 -      4.40e-06 -     | 1.30e-04 -      3.21e-04 -     | 3.75e-07 -      1.05e-06 -
40    | 3.24e-07 4.00   2.76e-07 4.00  | 4.71e-06 4.79   9.45e-06 5.09  | 6.39e-10 9.20   3.97e-10 11.37
80    | 2.02e-08 4.00   1.72e-08 4.00  | 3.41e-08 7.11   8.91e-08 6.73  | 2.75e-12 7.86   1.59e-12 7.96
160   | 1.26e-09 4.00   1.08e-09 4.00  | 2.00e-10 7.41   7.23e-10 6.95  | 1.12e-14 7.94   6.48e-15 7.94
Table 4.5. Dirichlet BCs
4.5.4 Variable coefficients
As a first step towards nonlinear problems, we now consider a linear problem with smoothly varying coefficients:
\[ u_t + (a u)_x = f, \qquad a(x, t) = 2 + \sin(x + t), \qquad x \in [0, 2\pi], \quad t \in [0, 12.5]. \]
The boundary conditions are periodic, and the initial condition and the forcing term $f(x, t)$ are chosen such that the exact solution reads $u(x, t) = \sin(x - t)$.
The results are displayed in Table 4.6. Similar to all of the previous test cases, we observe that the convergence rate is improved from $O(h^{p+1})$ to $O(h^{2p+1})$. Furthermore, the smoothness and the accuracy are improved in the entire domain, including the boundary.
[Plots: pointwise error on a logarithmic scale over the spatial domain [0, 2π] in three panels (error before post-processing, error after post-processing (old), error after post-processing (new)), for 20, 40, 80, and 160 elements.]
mesh  | Before post-processing         | After post-processing (old)    | After post-processing (new)
      | L2-error order  L∞-error order | L2-error order  L∞-error order | L2-error order  L∞-error order
Polynomial degree p = 1
20    | 1.09e-02 -      1.46e-02 -     | 1.63e-02 -      2.47e-02 -     | 2.75e-03 -      2.95e-03 -
40    | 2.68e-03 2.03   3.53e-03 2.05  | 1.75e-03 3.22   3.38e-03 2.87  | 3.48e-04 2.98   2.77e-04 3.41
80    | 6.66e-04 2.01   8.62e-04 2.03  | 1.64e-04 3.41   4.31e-04 2.97  | 4.38e-05 2.99   2.99e-05 3.21
160   | 1.66e-04 2.00   2.13e-04 2.02  | 1.52e-05 3.44   5.40e-05 3.00  | 5.50e-06 3.00   3.63e-06 3.04
Polynomial degree p = 2
20    | 2.68e-04 -      3.31e-04 -     | 4.00e-03 -      7.50e-03 -     | 4.58e-06 -      4.67e-06 -
40    | 3.35e-05 3.00   4.07e-05 3.03  | 2.11e-04 4.25   4.07e-04 4.21  | 1.01e-07 5.50   1.18e-07 5.31
80    | 4.19e-06 3.00   5.03e-06 3.02  | 5.46e-06 5.27   1.41e-05 4.85  | 2.77e-09 5.19   2.15e-09 5.78
160   | 5.24e-07 3.00   6.25e-07 3.01  | 1.25e-07 5.45   4.49e-07 4.97  | 9.81e-11 4.82   7.36e-11 4.87
Polynomial degree p = 3
20    | 5.17e-06 -      4.41e-06 -     | 1.30e-04 -      3.21e-04 -     | 1.11e-05 -      3.91e-05 -
40    | 3.23e-07 4.00   2.76e-07 4.00  | 4.71e-06 4.79   9.45e-06 5.09  | 6.63e-10 14.03  1.12e-09 15.09
80    | 2.02e-08 4.00   1.73e-08 4.00  | 3.41e-08 7.11   8.91e-08 6.73  | 2.65e-12 7.97   1.53e-12 9.51
160   | 1.26e-09 4.00   1.08e-09 4.00  | 2.00e-10 7.41   7.23e-10 6.95  | 1.06e-14 7.97   7.13e-15 7.75
Table 4.6. Variable coefficients
4.5.5 Discontinuous coefficients
For all of the previous test cases, the exact solution is infinitely smooth. In Section 4.4.2, we emphasized that the position-dependent post-processor can also be applied near a shock, by treating it as a boundary. To test this numerically, we now consider a problem with discontinuous coefficients [64]:
\[ u_t + (a u)_x = 0, \qquad a(x) = \begin{cases} \tfrac{1}{2}, & x \in [-\tfrac{1}{2}, \tfrac{1}{2}], \\ 1, & \text{else}, \end{cases} \qquad x \in [-1, 1], \quad t \in [0, 12.5]. \]
The boundary conditions are periodic, and the initial condition is chosen such that the exact solution has two stationary shocks:
\[ u(x, t) = \begin{cases} -2 \cos\big(4\pi(x - \tfrac{1}{2} t)\big), & x \in [-\tfrac{1}{2}, \tfrac{1}{2}], \\ \cos\big(2\pi(x - t)\big), & \text{else}. \end{cases} \]
The results are displayed in Table 4.7. The accuracy of the new post-processor is better than that of the DG solution as long as the mesh is sufficiently fine. For the coarser meshes, the lower accuracy is because a kernel
scale smaller than $h$ is required to ensure that the support of the (scaled) kernel fits between two subsequent boundaries/shocks (cf. Section 4.4.2, (4.13)).
[Plots: pointwise error on a logarithmic scale over the spatial domain [−1, 1] in three panels (error before post-processing, error after post-processing (old), error after post-processing (new)), for 20, 40, 80, and 160 elements.]
mesh  | Before post-processing         | After post-processing (old)    | After post-processing (new)
      | L2-error order  L∞-error order | L2-error order  L∞-error order | L2-error order  L∞-error order
Polynomial degree p = 1
20    | 1.21e+00 -      1.56e+00 -     | 1.15e+00 -      1.54e+00 -     | 1.20e+00 -      1.62e+00 -
40    | 2.72e-01 2.15   3.77e-01 2.05  | 2.54e-01 2.18   3.49e-01 2.14  | 2.74e-01 2.13   4.33e-01 1.91
80    | 3.83e-02 2.83   5.74e-02 2.71  | 3.63e-02 2.80   4.88e-02 2.84  | 3.75e-02 2.87   5.02e-02 3.11
160   | 5.20e-03 2.88   8.62e-03 2.74  | 4.70e-03 2.95   6.19e-03 2.98  | 4.75e-03 2.98   6.17e-03 3.03
Polynomial degree p = 2
20    | 3.65e-02 -      5.14e-02 -     | 6.81e+00 -      1.62e+01 -     | 5.71e-01 -      2.94e+00 -
40    | 2.05e-03 4.15   4.84e-03 3.41  | 1.67e-01 5.35   6.78e-01 4.58  | 1.25e-03 8.84   1.83e-03 10.66
80    | 2.17e-04 3.24   6.27e-04 2.95  | 6.03e-03 4.79   2.79e-02 4.60  | 4.16e-05 4.91   1.40e-04 3.71
160   | 2.68e-05 3.02   7.94e-05 2.98  | 8.41e-05 6.16   5.79e-04 5.59  | 1.18e-06 5.14   1.69e-06 6.37
Polynomial degree p = 3
20    | 1.08e-03 -      2.45e-03 -     | 3.58e+00 -      1.25e+01 -     | 2.27e-01 -      6.61e-01 -
40    | 6.60e-05 4.04   1.37e-04 4.16  | 1.87e-02 7.58   9.95e-02 6.97  | 2.64e-03 6.43   1.85e-02 5.16
80    | 4.13e-06 4.00   8.74e-06 3.97  | 6.50e-04 4.84   2.92e-03 5.09  | 5.20e-06 8.99   6.98e-05 8.05
160   | 2.58e-07 4.00   5.51e-07 3.99  | 2.62e-06 7.95   1.77e-05 7.36  | 4.67e-09 10.12  8.70e-08 9.65
Table 4.7. Discontinuous coefficients
Nevertheless, for p = 2 and p = 3, the magnitude of the errors is much
smaller for the new post-processor than for the old one. For p = 1, this is not
the case: the errors are slightly worse. We speculate that this is due to the fact
that more extra kernel nodes are used for p = 2 than for p = 1. Improvement
of this issue is left for future research.
4.5.6 Two-dimensional system
In the previous sections, we considered several one-dimensional problems. However, higher-dimensional fields can also be filtered, as discussed in Section 4.4.3. In this section, we apply such a strategy for the following two-dimensional system:
\[ \begin{pmatrix} u \\ v \end{pmatrix}_t + \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix}_x + \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix}_y = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \qquad (x, y) \in [0, 2\pi]^2, \quad t \in [0, 12.5]. \]
The boundary conditions are periodic, and the initial condition is given by
\[ u_0(x, y) = \frac{1}{2\sqrt{2}} \big( \sin(x + y) - \cos(x + y) \big), \]
\[ v_0(x, y) = \frac{1}{2\sqrt{2}} \big( (\sqrt{2} - 1) \sin(x + y) + (1 + \sqrt{2}) \cos(x + y) \big). \]
The results for this test case are displayed in Table 4.8 for a final time of $t = 12.5$. Similar to the one-dimensional problems, we observe that the convergence rate is improved from $O(h^{p+1})$ to $O(h^{2p+1})$. Furthermore, unlike the original filter, the position-dependent post-processor improves the DG errors in the entire domain, even for coarse meshes. In other words, the results for this two-dimensional problem are similar to those for the previous smooth one-dimensional cases.
u-component:
mesh  | Before post-processing         | After post-processing (old)    | After post-processing (new)
      | L2-error order  L∞-error order | L2-error order  L∞-error order | L2-error order  L∞-error order
Polynomial degree p = 1
20^2  | 1.22e-01 -      3.94e-02 -     | 1.16e-01 -      2.77e-02 -     | 1.18e-01 -      2.73e-02 -
40^2  | 1.77e-02 2.78   7.12e-03 2.47  | 1.55e-02 2.90   4.58e-03 2.60  | 1.54e-02 2.93   3.48e-03 2.97
80^2  | 2.94e-03 2.59   1.38e-03 2.36  | 1.96e-03 2.98   6.33e-04 2.85  | 1.95e-03 2.98   4.39e-04 2.99
Polynomial degree p = 2
20^2  | 1.58e-03 -      1.24e-03 -     | 1.96e-02 -      1.77e-02 -     | 3.33e-04 -      1.03e-04 -
40^2  | 1.95e-04 3.02   1.62e-04 2.94  | 4.29e-04 5.51   6.00e-04 4.88  | 1.01e-05 5.05   2.29e-06 5.50
80^2  | 2.44e-05 3.00   2.05e-05 2.98  | 9.46e-06 5.50   1.76e-05 5.09  | 3.12e-07 5.01   7.03e-08 5.03
Polynomial degree p = 3
20^2  | 7.87e-05 -      5.58e-05 -     | 2.00e-03 -      1.79e-03 -     | 1.33e-06 -      1.93e-06 -
40^2  | 4.98e-06 3.98   3.30e-06 4.08  | 1.10e-05 7.51   1.54e-05 6.86  | 6.87e-09 7.60   1.69e-09 10.16
80^2  | 3.11e-07 4.00   1.97e-07 4.06  | 6.07e-08 7.50   1.17e-07 7.05  | 5.02e-11 7.10   1.16e-11 7.19
v-component:
mesh  | Before post-processing         | After post-processing (old)    | After post-processing (new)
      | L2-error order  L∞-error order | L2-error order  L∞-error order | L2-error order  L∞-error order
Polynomial degree p = 1
20^2  | 1.43e-01 -      3.89e-02 -     | 1.36e-01 -      3.65e-02 -     | 1.39e-01 -      3.21e-02 -
40^2  | 2.04e-02 2.81   6.13e-03 2.67  | 1.81e-02 2.91   5.42e-03 2.75  | 1.80e-02 2.95   4.07e-03 2.98
80^2  | 3.31e-03 2.62   1.63e-03 1.91  | 2.27e-03 2.99   7.23e-04 2.91  | 2.26e-03 2.99   5.09e-04 3.00
Polynomial degree p = 2
20^2  | 2.22e-03 -      2.45e-03 -     | 2.16e-02 -      1.43e-02 -     | 3.80e-04 -      1.18e-04 -
40^2  | 2.72e-04 3.03   3.07e-04 3.00  | 4.89e-04 5.46   6.64e-04 4.43  | 1.15e-05 5.04   2.63e-06 5.49
80^2  | 3.39e-05 3.00   3.84e-05 3.00  | 1.09e-05 5.49   2.18e-05 4.93  | 3.59e-07 5.01   8.09e-08 5.02
Polynomial degree p = 3
20^2  | 1.14e-04 -      1.26e-04 -     | 2.21e-03 -      1.41e-03 -     | 1.60e-06 -      1.86e-06 -
40^2  | 7.49e-06 3.93   8.21e-06 3.94  | 1.25e-05 7.46   1.60e-05 6.45  | 8.32e-09 7.59   2.04e-09 9.84
80^2  | 4.75e-07 3.98   5.23e-07 3.97  | 6.98e-08 7.48   1.40e-07 6.84  | 5.94e-11 7.13   1.37e-11 7.21
Table 4.8. Two-dimensional system
4.5.7 Two-dimensional streamlines
Next, we study the two-dimensional tests of Steffen et al. [73], including streamline visualizations. In each test, a velocity profile $(u, v)$ on the square $[-1, 1]^2$ is given. We consider the $L^2$-projection of that solution onto the space of piecewise polynomials of degree $p = 1, 2, 3$ (as before, this is similar to a DG approximation at the initial time). The velocity $(u, v)$ is obtained as a function of $(x, y)$ from the real and imaginary parts of a complex number $r$:
\[ u := \mathrm{Re}(r), \qquad v := \mathrm{Im}(r), \]
where, defining the complex number $z := x + iy$, the following three test cases are given:
\begin{align*}
r &= (z - (0.74 + 0.35i))(z - (0.68 - 0.59i))(z - (-0.11 - 0.72i))(z - (-0.58 + 0.64i))(z - (0.51 - 0.27i))(z - (-0.12 + 0.84i))^2 && \text{(Case 1)}, \\
r &= (z - (0.94 + 0.15i))(z + (-0.38 - 0.39i))(z - (0.09 - 0.92i))(z - (-0.38 + 0.84i))(z - (0.71 - 0.07i)) && \text{(Case 2)}, \\
r &= -(z - (0.74 + 0.35i))(z - (0.11 - 0.11i))^2 (z - (-0.11 + 0.72i))(z - (-0.58 + 0.64i))(z - (0.51 - 0.27i)) && \text{(Case 3)}.
\end{align*}
For each test case, the position-dependent post-processor enhances the convergence rate from $O(h^{p+1})$ to $O(h^{2p+1})$. This can be seen from Table 4.9, Table 4.10, and Table 4.11. Figure 4.7 illustrates the local accuracy improvement for the first case.
An interesting effect can be seen in the three tables: for sufficiently large $p$, the errors of the post-processed field are of the order of the machine precision, which suggests that the exact solution has been reached. This happens e.g. for the second case with $p \ge 2$. For that problem, the exact solution is a polynomial of degree 5. At the same time, the post-processed solution is a piecewise polynomial of degree no more than $2p + 1$ in each variable (see footnote 6). Combining these facts (observing $2p + 1 \ge 5$ for $p \ge 2$), the high accuracy suggests that the post-processed $L^2$-projection onto the space of piecewise polynomials of degree $p$ behaves like the $L^2$-projection onto the space of piecewise polynomials of degree $2p + 1$ in each variable. Theoretical support for this phenomenon is left for future research.
A useful feature of the post-processor is that it can enhance the accuracy of streamlines, especially near critical points. This was observed by Steffen et al. [73] for the symmetric post-processor, away from the boundary. Figure 4.8 shows that similar improvements are obtained for the position-dependent post-processor in the entire spatial domain (using a standard RK-4 method with time step $\Delta t = 0.01$ to compute the streamlines). We have translated the second field of Steffen et al. so that the critical points are located close to the boundary to emphasize the improved applicability and accuracy of the position-dependent post-processor near the boundary.
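The streamline computation mentioned above can be sketched as follows. This is a minimal illustration, not the code used for Figure 4.8: it integrates the exact Case 2 field as printed in this section (rather than a DG or post-processed field) with the classical fourth-order Runge-Kutta method and the stated time step of 0.01. The function names, the starting point, and the number of steps are assumptions made for the example.

```python
def velocity_case2(x, y):
    """Velocity (u, v) = (Re r, Im r) for Case 2 as printed in this section."""
    z = complex(x, y)
    r = ((z - (0.94 + 0.15j)) * (z + (-0.38 - 0.39j)) * (z - (0.09 - 0.92j))
         * (z - (-0.38 + 0.84j)) * (z - (0.71 - 0.07j)))
    return r.real, r.imag

def rk4_streamline(velocity, x0, y0, dt=0.01, steps=100):
    """Trace a streamline from (x0, y0) with the classical RK-4 method."""
    path = [(x0, y0)]
    x, y = x0, y0
    for _ in range(steps):
        k1x, k1y = velocity(x, y)
        k2x, k2y = velocity(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
        k3x, k3y = velocity(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
        k4x, k4y = velocity(x + dt * k3x, y + dt * k3y)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        y += dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        path.append((x, y))
    return path

# Hypothetical starting point inside the square [-1, 1]^2.
path = rk4_streamline(velocity_case2, 0.5, 0.5)
print(path[-1])
```

A streamline started exactly at a root of r (a critical point of the field) does not move, which gives a simple sanity check for the integrator.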
4.6 Conclusion
This chapter proposes the position-dependent post-processor as an alternative to the one-sided post-processor [64], and analyzes the impact of both strategies on the accuracy and smoothness of DG (upwind) approximations for hyperbolic problems. Our numerical results demonstrate that the new post-processor can enhance the convergence rate from order $p + 1$ to order $2p + 1$, in both the $L^2$- and the $L^\infty$-norm. The differences with the original one-sided method occur at the boundary of the domain: in those regions, the new post-processor uses extra kernel nodes, as well as a smoother transition of the nodes. This results in significantly smaller errors with a more realistic smoothness. Altogether, unlike before, the proposed position-dependent post-processor can be used to obtain better smoothness and accuracy than the unfiltered DG approximation in the entire domain, including near (non-periodic) boundaries and shocks. This can aid the visualization of the results, e.g. in the form of streamlines.
Footnote 6: This is because the post-processed solution is obtained from the convolution of a piecewise polynomial of degree $p$ (the $L^2$-projection before post-processing) with a piecewise polynomial of degree $p$ in each variable (the kernel, a linear combination of B-splines of order $p + 1$).
[Figure: four panels over the (x, y) domain (u-component before post-processing, u-component after post-processing, v-component before post-processing, v-component after post-processing), each showing the logarithm of the local error.]
Figure 4.7. Logarithm of the local error before and after post-processing (Case 1, p = 2, N = 40^2).
[Figure: two panels showing streamlines labelled Before, After, and Exact.]
Figure 4.8. Enhanced streamline visualization after post-processing for Case 1 (left) and Case 2 (right) (p = 1, N = 20^2).
u-component:
mesh  | L2 before      | L2 after       | L∞ before      | L∞ after
      | error    order | error    order | error    order | error    order
Polynomial degree p = 1
20^2  | 5.36e-02 -     | 3.58e-03 -     | 4.63e-01 -     | 2.26e-02 -
40^2  | 1.35e-02 1.99  | 1.20e-04 4.90  | 1.27e-01 1.87  | 8.48e-04 4.74
80^2  | 3.37e-03 2.00  | 5.98e-06 4.33  | 3.32e-02 1.93  | 2.95e-05 4.85
Polynomial degree p = 2
20^2  | 1.92e-03 -     | 6.01e-06 -     | 1.97e-02 -     | 7.56e-06 -
40^2  | 2.41e-04 2.99  | 2.00e-07 4.91  | 2.67e-03 2.89  | 2.19e-07 5.11
80^2  | 3.01e-05 3.00  | 4.23e-09 5.56  | 3.47e-04 2.94  | 4.25e-09 5.69
Polynomial degree p = 3
20^2  | 4.96e-05 -     | 1.08e-23 -     | 4.22e-04 -     | 1.84e-22 -
40^2  | 3.11e-06 4.00  | 7.03e-22 -     | 2.80e-05 3.91  | 1.40e-20 -
80^2  | 1.94e-07 4.00  | 2.13e-21 -     | 1.81e-06 3.96  | 7.16e-20 -
v-component:
mesh  | L2 before      | L2 after       | L∞ before      | L∞ after
      | error    order | error    order | error    order | error    order
Polynomial degree p = 1
20^2  | 1.23e-01 -     | 4.82e-03 -     | 1.40e+00 -     | 3.53e-02 -
40^2  | 3.10e-02 1.99  | 1.75e-04 4.78  | 3.77e-01 1.90  | 1.31e-03 4.75
80^2  | 7.75e-03 2.00  | 9.84e-06 4.16  | 9.78e-02 1.95  | 4.57e-05 4.85
Polynomial degree p = 2
20^2  | 4.20e-03 -     | 1.14e-05 -     | 4.78e-02 -     | 1.56e-05 -
40^2  | 5.27e-04 3.00  | 3.07e-07 5.21  | 6.37e-03 2.91  | 3.49e-07 5.48
80^2  | 6.59e-05 3.00  | 5.99e-09 5.68  | 8.21e-04 2.95  | 6.31e-09 5.79
Polynomial degree p = 3
20^2  | 9.09e-05 -     | 1.75e-23 -     | 8.18e-04 -     | 3.47e-22 -
40^2  | 5.69e-06 4.00  | 1.41e-21 -     | 5.37e-05 3.93  | 2.87e-20 -
80^2  | 3.56e-07 4.00  | 2.44e-21 -     | 3.44e-06 3.96  | 6.81e-20 -
Table 4.9. Case 1
u-component:
mesh  | L2 before      | L2 after       | L∞ before      | L∞ after
      | error    order | error    order | error    order | error    order
Polynomial degree p = 1
20^2  | 2.00e-02 -     | 2.57e-04 -     | 1.31e-01 -     | 7.92e-04 -
40^2  | 5.00e-03 2.00  | 1.25e-05 4.36  | 3.42e-02 1.93  | 2.65e-05 4.90
80^2  | 1.25e-03 2.00  | 8.39e-07 3.90  | 8.76e-03 1.97  | 9.38e-07 4.82
Polynomial degree p = 2
20^2  | 4.65e-04 -     | 2.68e-26 -     | 2.80e-03 -     | 2.95e-25 -
40^2  | 5.83e-05 3.00  | 7.75e-25 -     | 3.62e-04 2.95  | 1.08e-23 -
80^2  | 7.28e-06 3.00  | 2.34e-24 -     | 4.60e-05 2.98  | 4.55e-23 -
Polynomial degree p = 3
20^2  | 6.80e-06 -     | 4.13e-24 -     | 2.53e-05 -     | 5.94e-23 -
40^2  | 4.25e-07 4.00  | 3.91e-22 -     | 1.61e-06 3.97  | 4.90e-21 -
80^2  | 2.66e-08 4.00  | 9.63e-22 -     | 1.02e-07 3.99  | 2.31e-20 -
v-component:
mesh  | L2 before      | L2 after       | L∞ before      | L∞ after
      | error    order | error    order | error    order | error    order
Polynomial degree p = 1
20^2  | 2.69e-02 -     | 2.72e-04 -     | 1.84e-01 -     | 8.04e-04 -
40^2  | 6.74e-03 2.00  | 1.42e-05 4.26  | 4.80e-02 1.94  | 2.73e-05 4.88
80^2  | 1.69e-03 2.00  | 9.50e-07 3.90  | 1.23e-02 1.97  | 1.05e-06 4.71
Polynomial degree p = 2
20^2  | 5.73e-04 -     | 7.39e-26 -     | 3.49e-03 -     | 1.04e-24 -
40^2  | 7.17e-05 3.00  | 1.45e-24 -     | 4.50e-04 2.96  | 2.82e-23 -
80^2  | 8.97e-06 3.00  | 3.63e-24 -     | 5.71e-05 2.98  | 9.69e-23 -
Polynomial degree p = 3
20^2  | 7.45e-06 -     | 1.01e-23 -     | 2.81e-05 -     | 1.77e-22 -
40^2  | 4.66e-07 4.00  | 7.18e-22 -     | 1.79e-06 3.98  | 1.22e-20 -
80^2  | 2.91e-08 4.00  | 1.68e-21 -     | 1.13e-07 3.99  | 4.11e-20 -
Table 4.10. Case 2
u-component:
mesh  | L2 before      | L2 after       | L∞ before      | L∞ after
      | error    order | error    order | error    order | error    order
Polynomial degree p = 1
20^2  | 4.92e-02 -     | 1.34e-03 -     | 3.98e-01 -     | 8.15e-03 -
40^2  | 1.23e-02 1.99  | 3.73e-05 5.16  | 1.06e-01 1.90  | 2.82e-04 4.85
80^2  | 3.09e-03 2.00  | 1.56e-06 4.58  | 2.75e-02 1.95  | 8.84e-06 5.00
Polynomial degree p = 2
20^2  | 1.89e-03 -     | 4.50e-06 -     | 1.42e-02 -     | 4.78e-06 -
40^2  | 2.37e-04 3.00  | 1.11e-07 5.34  | 1.88e-03 2.92  | 7.47e-08 6.00
80^2  | 2.97e-05 3.00  | 2.03e-09 5.77  | 2.41e-04 2.96  | 1.17e-09 6.00
Polynomial degree p = 3
20^2  | 4.27e-05 -     | 3.98e-24 -     | 2.40e-04 -     | 5.29e-23 -
40^2  | 2.67e-06 4.00  | 5.44e-22 -     | 1.56e-05 3.94  | 1.45e-20 -
80^2  | 1.67e-07 4.00  | 1.25e-21 -     | 9.92e-07 3.97  | 4.04e-20 -
v-component:
mesh  | L2 before      | L2 after       | L∞ before      | L∞ after
      | error    order | error    order | error    order | error    order
Polynomial degree p = 1
20^2  | 4.17e-02 -     | 7.11e-04 -     | 2.04e-01 -     | 3.90e-03 -
40^2  | 1.05e-02 1.99  | 2.26e-05 4.98  | 5.39e-02 1.92  | 1.28e-04 4.93
80^2  | 2.62e-03 2.00  | 1.10e-06 4.35  | 1.39e-02 1.96  | 4.17e-06 4.94
Polynomial degree p = 2
20^2  | 1.66e-03 -     | 3.65e-26 -     | 6.80e-03 -     | 5.33e-25 -
40^2  | 2.08e-04 3.00  | 9.01e-25 -     | 8.95e-04 2.93  | 2.28e-23 -
80^2  | 2.60e-05 3.00  | 2.58e-24 -     | 1.15e-04 2.96  | 9.17e-23 -
Polynomial degree p = 3
20^2  | 3.79e-05 -     | 4.01e-24 -     | 1.24e-04 -     | 5.62e-23 -
40^2  | 2.37e-06 4.00  | 4.48e-22 -     | 8.05e-06 3.95  | 1.11e-20 -
80^2  | 1.48e-07 4.00  | 9.47e-22 -     | 5.12e-07 3.97  | 3.13e-20 -
Table 4.11. Case 3
5 Theoretical Superconvergence
This chapter is based on:
L. Ji, P. van Slingerland, J.K. Ryan, C. Vuik, Superconvergent error estimates for position-dependent smoothness-increasing accuracy-conserving (SIAC) post-processing of Discontinuous Galerkin Solutions. Accepted for publication in Math. Comp.
5.1 Introduction
In Chapter 4, we proposed the position-dependent post-processor by generalizing the original symmetric and one-sided post-processors. Various numerical experiments demonstrated that, unlike before, this technique improves both the smoothness and the errors of the unfiltered approximation in the entire spatial domain, including the boundary region. Furthermore, an improvement of the convergence rate from order $p + 1$ to order $2p + 1$ was observed. This chapter focuses on theoretical support for these findings.
For the symmetric filter, such theory is already available: Bramble and Schatz [11] derived superconvergence in the $L^2$- and $L^\infty$-norm for (continuous) Ritz-Galerkin approximations. Cockburn, Luskin, Shu and Süli [22] extended this work to show an accuracy improvement from order $p + 1$ to order $2p + 1$ for DG schemes in the $L^2$-norm for linear periodic hyperbolic problems. Interestingly, these results were established despite the fact that the post-processor does not contain any information about the underlying physics or numerics.
To extend this available theory for the symmetric filter and to explain the numerical results in Chapter 4, we derive error estimates for the generalized and position-dependent post-processor in this chapter. In particular, we show that it enhances the DG convergence from order $p + 1$ to order $2p + 1$ in the $L^2$-norm and to order $\min\{2p + 1, 2p + 2 - d/2\}$ in the $L^\infty$-norm for d-dimensional linear periodic hyperbolic problems. Unlike [11, 22], these estimates are valid in the entire spatial domain, while the post-processor does not require information outside the domain. Furthermore, it is the first time that such results are established in the $L^\infty$-norm for DG approximations.
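To see when the two terms in the $L^\infty$ rate differ, the stated orders can simply be tabulated. The short sketch below only evaluates the formulas quoted above; the function name `predicted_rates` is illustrative.

```python
def predicted_rates(p, d):
    """Return the (L2, L-infinity) orders stated in this chapter for degree p in d dimensions."""
    l2_rate = 2 * p + 1
    linf_rate = min(2 * p + 1, 2 * p + 2 - d / 2)
    return l2_rate, linf_rate

for d in (1, 2, 3):
    print(d, [predicted_rates(p, d) for p in (1, 2, 3)])
```

For d = 1 and d = 2 the minimum equals 2p + 1, so both norms have the same order; the $L^\infty$ order only drops below 2p + 1 from d = 3 onwards.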
The outline of this chapter is as follows. Section 5.2 discusses the DG
method and post-processor under consideration, as well as some basic notation.
Section 5.3 derives two auxiliary properties. Section 5.4 uses these to obtain the
main error estimates in abstract form. Section 5.5 considers the implications
for DG approximations and links the theory to the numerical observations in
Chapter 4. Finally, Section 5.6 summarizes the main conclusions.
5.2 Methods and notation
This section specifies the methods and notation considered in this chapter. Section 5.2.1 recalls the generalized post-processor from Chapter 4. Section 5.2.2
summarizes the DG method for linear periodic hyperbolic problems. Section
5.2.3 introduces some additional notation that we will use frequently.
5.2.1 Post-processor
The position-dependent post-processor is based on a convex combination of two generalized post-processors (cf. Section 4.4). For this reason, in this chapter, we primarily focus on the generalized post-processor. Below, we repeat the definition of the latter, using slightly different notation than before. Implications for the position-dependent post-processor are discussed in Section 5.5.3.
Recall the definition of a B-spline of order $\ell \ge 1$ (cf. (4.3)):
\[ \psi^{(1)} := 1_{[-\frac{1}{2}, \frac{1}{2}]}, \qquad \psi^{(\ell)} := \psi^{(\ell-1)} \star \psi^{(1)}, \quad \text{for all } \ell \ge 2. \tag{5.1} \]
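For pointwise evaluation, the repeated convolution in (5.1) is commonly replaced by the standard two-term recursion for cardinal B-splines, which yields the same functions. The sketch below uses that recursion; it is an illustration rather than the implementation behind the results in this thesis. The function name `bspline` is an assumption, and the indicator is taken on the half-open interval [-1/2, 1/2) so that breakpoints are counted once.

```python
def bspline(order, x):
    """Evaluate the central B-spline psi^(order) at x.

    Recursion (equivalent to the convolution definition):
    psi^(l)(x) = [(l/2 + x) psi^(l-1)(x + 1/2) + (l/2 - x) psi^(l-1)(x - 1/2)] / (l - 1).
    """
    if order < 1:
        raise ValueError("order must be at least 1")
    if order == 1:
        # Indicator of [-1/2, 1/2), half-open to avoid double counting at breakpoints.
        return 1.0 if -0.5 <= x < 0.5 else 0.0
    half = order / 2
    left = (half + x) * bspline(order - 1, x + 0.5)
    right = (half - x) * bspline(order - 1, x - 0.5)
    return (left + right) / (order - 1)

# The order-2 B-spline is the hat function 1 - |x| on [-1, 1].
print(bspline(2, 0.25), bspline(3, 0.0))
```

The support of the order-$\ell$ B-spline is $[-\ell/2, \ell/2]$, and its integer translates sum to one, which the checks below rely on.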
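Next, we define the one-dimensional kernel for an evaluation point $\bar{x} \in (a, b)$ as the following linear combination of $r + 1$ B-splines of order $\ell$:
\[ K(x) = \sum_{j=0}^{r} c_j \psi^{(\ell)}(x - x_j), \qquad \text{for all } x \in \mathbb{R}, \tag{5.2} \]
where the kernel nodes read (the additional small factor $\varepsilon > 0$ is discussed below):
\[ x_j = -\frac{r}{2} + j + \lambda(\bar{x}), \qquad \text{for all } j = 0, ..., r, \tag{5.3} \]
\[ \lambda(\bar{x}) = \begin{cases} \min\left\{0, -\frac{r + \ell + \varepsilon}{2} + \frac{\bar{x} - a}{h}\right\}, & \text{for } \bar{x} \in [a, \frac{a+b}{2}), \\ \max\left\{0, \frac{r + \ell + \varepsilon}{2} + \frac{\bar{x} - b}{h}\right\}, & \text{for } \bar{x} \in [\frac{a+b}{2}, b], \end{cases} \tag{5.4} \]
and the kernel coefficients $c_j$ satisfy:
\[ \sum_{j=0}^{r} c_j \int_{-\infty}^{\infty} \psi^{(\ell)}(x) (x + x_j)^k \, dx = \begin{cases} 1, & \text{for } k = 0, \\ 0, & \text{else}. \end{cases} \tag{5.5} \]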
Different evaluation points result in different kernels.
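The conditions (5.5) form a small linear system for the coefficients. The sketch below solves it in exact rational arithmetic, under the assumption that the conditions are imposed for k = 0, ..., r (so that there are as many equations as unknowns). The moments of the B-spline are computed from the convolution definition (5.1); all function names, and the use of a plain `shift` argument in place of $\lambda(\bar{x})$, are choices made for this illustration.

```python
from fractions import Fraction
from math import comb

def bspline_moment(order, m):
    """Exact m-th moment of psi^(order), built up from the moments of the indicator psi^(1)."""
    if order == 1:
        return Fraction(0) if m % 2 else Fraction(1, (m + 1) * 2**m)
    return sum(comb(m, i) * bspline_moment(order - 1, i) * bspline_moment(1, m - i)
               for i in range(m + 1))

def kernel_nodes(r, shift=Fraction(0)):
    """Nodes x_j = -r/2 + j + shift, cf. (5.3); `shift` plays the role of lambda."""
    return [Fraction(-r, 2) + j + shift for j in range(r + 1)]

def kernel_coefficients(r, order, shift=Fraction(0)):
    """Solve (5.5) for c_0, ..., c_r, assuming the conditions hold for k = 0, ..., r."""
    nodes = kernel_nodes(r, shift)
    n = r + 1
    # Row k, column j: integral of psi(x) (x + x_j)^k dx via the binomial theorem;
    # the last column holds the right-hand side.
    rows = [[sum(comb(k, i) * bspline_moment(order, i) * xj**(k - i) for i in range(k + 1))
             for xj in nodes] + [Fraction(1 if k == 0 else 0)]
            for k in range(n)]
    for col in range(n):  # Gauss-Jordan elimination in exact arithmetic
        pivot = next(row for row in range(col, n) if rows[row][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [entry / rows[col][col] for entry in rows[col]]
        for row in range(n):
            if row != col:
                factor = rows[row][col]
                rows[row] = [a - factor * b for a, b in zip(rows[row], rows[col])]
    return [rows[j][n] for j in range(n)]

print(kernel_coefficients(2, 2))
```

For the symmetric case r = 2, $\ell$ = 2 with zero shift, the system reduces to $c_0 = c_2$, $c_0 + c_1 + c_2 = 1$ and $\tfrac{7}{6}(c_0 + c_2) + \tfrac{1}{6} c_1 = 0$, which gives the coefficients -1/12, 7/6, -1/12.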
Now that we have specified the one-dimensional kernel, we can define the generalized post-processor for a d-dimensional domain $\Omega = (a_1, b_1) \times ... \times (a_d, b_d)$. To this end, we proceed as in Section 4.4.3: consider an evaluation point $\bar{x} = (\bar{x}_1, ..., \bar{x}_d) \in \Omega$. Next, let $K_j$ denote the one-dimensional kernel for the evaluation point $\bar{x}_j \in (a_j, b_j)$ (as specified above), and set:
\[ K(x) = K_1(x_1) \cdots K_d(x_d), \qquad \text{for all } x \in \mathbb{R}^d. \tag{5.6} \]
Using this definition, we can apply the generalized post-processor to a function $u \in L^2(\Omega)$ by computing the convolution with the scaled kernel $K$:
\[ u^\star(\bar{x}) = \frac{1}{h^d} \int_\Omega K\left(\frac{\bar{x} - x}{h}\right) u(x) \, dx. \tag{5.7} \]
Note that we have added a term $\varepsilon/2$ in the definition of the shift function (5.4). In practice, we can set $\varepsilon = 0$ (cf. (4.12)). However, for the theoretical purposes in this chapter, we need $\varepsilon > 0$ to be arbitrarily small yet fixed (independent of $h$). We stress that the resulting post-processor is still applicable in the entire spatial domain, including the region near the boundary. A nonzero value $\varepsilon > 0$ simply means that the post-processor is slightly less 'symmetric' near the boundary (cf. Section 4.3.3).
5.2.2 DG discretization for hyperbolic problems
We study the generalized post-processor both in an abstract framework and in the context of DG schemes for linear periodic hyperbolic problems on uniform meshes with exact time integration. Below, we specify the latter. We acknowledge that the aforementioned assumptions are quite strong, and usually not valid in practice. Nevertheless, numerical experiments show that the position-dependent post-processor enhances the accuracy in a similar manner for other problems as well (cf. Chapter 4).
Altogether, we study the following d-dimensional linear hyperbolic problem on the spatial domain $\Omega = [0, 1]^d$:
\[ u_t + \sum_{j=1}^{d} A_j u_{x_j} + A_0 u = 0, \tag{5.8} \]
with initial condition $u_0$ and periodic boundary conditions. The coefficients $A_j$ are assumed to be scalar and constant.
To obtain a DG approximation for this system, consider a uniform mesh for
the spatial domain Ω with elements E1 , ..., EN of size h × ... × h. Next, define
the test space V that contains each element of L2 (Ω) that is a polynomial of
degree p or lower within each mesh element, and that may be discontinuous at
the mesh element boundaries.
At the initial time $t = 0$, the DG approximation $u_h$ is the $L^2$-projection of $u_0$ onto $V$. For $t > 0$, $u_h$ is the function in $V$ such that:
\[ \int_\Omega (u_h)_t v + B(u_h, v) = 0, \qquad \forall v \in V, \]
where $B$ is a bilinear form defined hereafter.
To specify $B$, let $(n_1^{(i)}, ..., n_d^{(i)})$ be the outward normal of $E_i$. Furthermore, let $\hat{u}_h$ denote the usual upwind flux (cf. Section 4.2). Now, the bilinear form $B$ can be specified as follows:
\[ B(u_h, v) = \sum_{i=1}^{N} \left( \sum_{e \in \partial E_i} \int_e \sum_{j=1}^{d} A_j n_j^{(i)} \hat{u}_h v + \int_{E_i} u_h \Big( -\sum_{j=1}^{d} A_j v_{x_j} + A_0 v \Big) \right). \]
5.2.3
This section specifies some additional notation that we use throughout this chapter.
Unless specified otherwise, $\Omega$ denotes a spatial domain of the form $(a_1, b_1) \times ... \times (a_d, b_d)$. The standard norms in $L^p(\Omega)$ are denoted as:
\[ \|u\|_{L^p(\Omega)} = \left( \int_\Omega |u|^p \right)^{\frac{1}{p}}, \quad 1 \le p < \infty, \qquad \|u\|_{L^\infty(\Omega)} = \operatorname*{ess\,sup}_{x \in \Omega} |u(x)|. \]
Furthermore, for integers $k \ge 0$ and $p = 2, \infty$, let $W^{k,p}(\Omega)$ denote the usual Sobolev space, i.e. the set of all functions $u$ such that, for every d-dimensional multi-index (see footnote 1) $\alpha$ with $|\alpha| \le k$, the weak partial derivative $D^\alpha u$ belongs to $L^p(\Omega)$. For $p = 2$, we write $H^k(\Omega) = W^{k,2}(\Omega)$, which is equipped with the norm:
\[ \|u\|_{H^k(\Omega)} = \Big( \sum_{|\alpha| \le k} \|D^\alpha u\|^2_{L^2(\Omega)} \Big)^{1/2}. \tag{5.9} \]
Additionally, we define the negative-order space $H^{-k}(\Omega)$ for integers $k \ge 0$: this space is the closure of the smooth functions $C^\infty(\Omega)$ with respect to the so-called negative-order norm (see footnote 2):
\[ \|u\|_{H^{-k}(\Omega)} = \sup_{v \in C_0^\infty(\Omega)} \frac{\int_\Omega u v}{\|v\|_{H^k(\Omega)}}. \tag{5.10} \]
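A multi-variate B-spline of order $\ell$ is the tensor product of one-dimensional B-splines:
\[ \psi^{(\ell)}(x) = \psi^{(\ell)}(x_1) \cdots \psi^{(\ell)}(x_d), \qquad \forall x \in \mathbb{R}^d. \]
A scaled multi-variate B-spline is defined as:
\[ \psi_h^{(\ell)}(x) = \frac{1}{h^d} \psi^{(\ell)}\left(\frac{1}{h} x\right), \qquad \forall x \in \mathbb{R}^d. \]
The usual central difference operator in the $j$th direction is denoted as (see footnote 3)
\[ \partial_{h,j}^1 u(x) = \frac{u(x + \frac{h}{2} e_j) - u(x - \frac{h}{2} e_j)}{h}. \]
Using this notation, we set for any d-dimensional multi-index $\alpha$ (writing $\partial_{h,j}^k u = \partial_{h,j}^{k-1} \partial_{h,j}^1 u$ for integers $k \ge 2$):
\[ \partial_h^\alpha u = \partial_{h,1}^{\alpha_1} \cdots \partial_{h,d}^{\alpha_d} u. \]
Finally, we use the notation $\Omega_\alpha$ for the largest subset of $\Omega = (a_1, b_1) \times ... \times (a_d, b_d)$ such that $\partial_h^\alpha u : \Omega_\alpha \to \mathbb{R}$ does not require information outside $\Omega$:
\[ \Omega_\alpha = \left(a_1 + \alpha_1 \frac{h}{2}, \, b_1 - \alpha_1 \frac{h}{2}\right) \times ... \times \left(a_d + \alpha_d \frac{h}{2}, \, b_d - \alpha_d \frac{h}{2}\right). \tag{5.11} \]
For scalars $\gamma$ we set $\Omega_\gamma = \Omega_{(\gamma, ..., \gamma)}$.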
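The repeated central difference can be illustrated in one dimension with a few lines of code. The sketch below composes the first-order difference k times, mirroring the recursive definition above; the function name `central_difference` and the test function are assumptions for the example.

```python
def central_difference(u, h, k=1):
    """Return the k-th central difference of the one-dimensional function u with spacing h."""
    if k < 1:
        raise ValueError("k must be a positive integer")

    def first_difference(f):
        # One application of the operator: (f(x + h/2) - f(x - h/2)) / h.
        return lambda x: (f(x + h / 2) - f(x - h / 2)) / h

    result = u
    for _ in range(k):
        result = first_difference(result)
    return result

h = 0.1
second_difference = central_difference(lambda x: x**3, h, k=2)
print(second_difference(1.0))
```

Evaluating the k-th difference at x uses values of u on $[x - k h/2, x + k h/2]$, which is why the domain in (5.11) shrinks by $\alpha_j h/2$ on each side. For a cubic, the second central difference equals the second derivative 6x exactly, up to rounding.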
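Footnote 1: A d-dimensional multi-index $\alpha$ is a d-tuple of nonnegative integers: $\alpha = (\alpha_1, ..., \alpha_d)$. Furthermore, $|\alpha| = \alpha_1 + ... + \alpha_d$ and $D^\alpha u = \left(\frac{\partial}{\partial x_1}\right)^{\alpha_1} \cdots \left(\frac{\partial}{\partial x_d}\right)^{\alpha_d} u$.
Footnote 2: $C_0^\infty(\Omega)$ denotes the usual set of all functions in $C^\infty(\Omega)$ with compact support.
Footnote 3: Here, $e_j$ is the multi-index whose $j$th component is 1 and all other components are 0.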
90
Chapter 5 Theoretical Superconvergence
5.3 Auxiliary results
To obtain the main error estimates, we require two auxiliary results, which are discussed in this section. Section 5.3.1 derives an estimate for $\|u - u^\star\|$. Section 5.3.2 expresses derivatives of convolutions with B-splines in terms of central differences.
5.3.1 Estimating $\|u - u^\star\|$
In Section 4.3.2, we mentioned that the symmetric post-processor reproduces
polynomials of degree 2p. As a result, the difference between u and u⋆ is of
order 2p + 1 (assuming that u is sufficiently smooth). Actually, Bramble and
Schatz [11, Lemma 5.2] designed the symmetric kernel this way. In this section,
we show that a similar result holds for the generalized post-processor. Unlike
before, this estimate is valid in the entire spatial domain.
Lemma 5.3.1 Consider the generalized post-processor with $r + 1$ kernel nodes, as defined in Section 5.2.1. Let $k \le r + 1$ be a positive integer. Then$^4$,
\[
\|u - u^\star\|_{L^\infty(\Omega)} \lesssim \sum_{|\alpha| = k} \|D^\alpha u\|_{L^\infty(\Omega)} \, h^k, \qquad \forall u \in W^{k,\infty}(\Omega). \tag{5.12}
\]
Here, $\alpha$ denotes a $d$-dimensional multi-index. The constant involved depends on the $L^1$-norm of the kernels, as indicated in the proof below.
To show Lemma 5.3.1, we follow the same strategy sketched for the symmetric post-processor in [11, p. 103]: the main idea is to demonstrate (the proof is given below) that the generalized post-processor reproduces polynomials $p$ of degree $r$ (or lower, in each variable):
\[
p^\star(\bar x) = p(\bar x), \qquad \forall \bar x \in \Omega. \tag{5.13}
\]
Using this property, the proof can be completed using Taylor's theorem: for $u \in W^{k,\infty}(\Omega)$ and $x, \bar x \in \Omega$, we have$^5$:
\[
u(x) = \sum_{|\alpha| \le k-1} \frac{(x - \bar x)^\alpha}{\alpha!} D^\alpha u(\bar x) + k \sum_{|\alpha| = k} \frac{(x - \bar x)^\alpha}{\alpha!} \int_0^1 s^{k-1} D^\alpha u\big( x + s(\bar x - x) \big) \, ds. \tag{5.14}
\]
$^4$ Throughout this chapter, we use the symbol $\lesssim$ in expressions of the form "$F(x) \lesssim G(x)$ for all $x \in X$" to indicate that there exists a constant $C > 0$, independent of the variable $x$ and the scaling $h$, such that $F(x) \le C G(x)$ for all $x \in X$. The symbol $\gtrsim$ is defined similarly.
$^5$ For a $d$-dimensional multi-index $\alpha$, we write $\alpha! = \alpha_1! \cdots \alpha_d!$ and $x^\alpha = x_1^{\alpha_1} \cdots x_d^{\alpha_d}$.
This follows from [14, p. 83, 100, 101] (using the chain rule and Lipschitz
continuity of the derivatives of order k − 1 and lower [30, p. 269]). We can now
show Lemma 5.3.1:
Proof (of Lemma 5.3.1) To show (5.13), first consider the one-dimensional case. Without loss of generality, we may assume that $p$ is a monomial basis function, i.e. $p(x) = x^m$ (with $m = 0, \dots, r$). Next, observe that, by definition of the post-processor,
\[
p^\star(\bar x) := \frac{1}{h} \int_\Omega K\!\left( \frac{\bar x - x}{h} \right) x^m \, dx
:= \frac{1}{h} \sum_{j=0}^{r} \int_{-\infty}^{\infty} c_j \, \psi^{(\ell)}\Big( \underbrace{\frac{\bar x - x}{h} - x_j}_{=:y} \Big) x^m \, dx
= \sum_{j=0}^{r} c_j \int_{-\infty}^{\infty} \psi^{(\ell)}(y) \big( \bar x - h(y + x_j) \big)^m \, dy.
\]
Using the binomial theorem we may write:
\[
p^\star(\bar x) = \sum_{n=0}^{m} \frac{m!}{(m-n)! \, n!} \, \bar x^n (-h)^{m-n} \underbrace{\sum_{j=0}^{r} c_j \int_{\mathbb{R}} \psi^{(\ell)}(y) (y + x_j)^{m-n} \, dy}_{= 1 \text{ for } m = n, \text{ and zero otherwise (5.5)}} = \bar x^m = p(\bar x).
\]
This completes the proof of (5.13) for the one-dimensional case. For higher-dimensional problems, this result follows from (5.7) and repeated application of (5.13) for one-dimensional kernels.
Now that we have obtained (5.13), we can show (5.12). To this end, choose an evaluation point $\bar x \in \Omega$. Because the post-processor reproduces polynomials (5.13), we may write for any polynomial $p$ of degree $r$ or lower (note that $u$, $u^\star$ and $p$ are continuous):
\[
|u - u^\star|(\bar x) = |(u - p)(\bar x) - (u - p)^\star(\bar x)|.
\]
In particular, we may choose $p$ to be the Taylor polynomial of degree $k - 1$ that approximates $u$ near the point $\bar x$. As a consequence, $(u - p)(\bar x) = 0$, and we may write:
\begin{align*}
|u - u^\star|(\bar x) &= |(u - p)^\star(\bar x)| \\
&= \left| \frac{1}{h^d} \int_\Omega K\Big( \underbrace{\frac{\bar x - x}{h}}_{=:y} \Big) (u - p)(x) \, dx \right| \\
&= \left| \int_{\operatorname{supp}\{K\}} K(y) \, (u - p)(\bar x - h y) \, dy \right| \\
&\overset{\text{Hölder}}{\le} \|K\|_{L^1(\mathbb{R}^d)} \, \|(u - p)(\bar x - h \, \cdot)\|_{L^\infty(\operatorname{supp}(K))}. \tag{5.15}
\end{align*}
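To estimate the second term, we apply Taylor's theorem (5.14) for all $y \in \operatorname{supp}(K)$:
\begin{align*}
|u - p|(\bar x - h y) &= \left| k \sum_{|\alpha| = k} \frac{(-h y)^\alpha}{\alpha!} \int_0^1 s^{k-1} D^\alpha u\big( \bar x - h y + s(h y) \big) \, ds \right| \\
&\le \sum_{|\alpha| = k} \left| \frac{(-h y)^\alpha}{\alpha!} \right| \|D^\alpha u\|_{L^\infty(\Omega)} \underbrace{k \int_0^1 s^{k-1} \, ds}_{= 1} \\
&\le \sum_{|\alpha| = k} \frac{|y^\alpha|}{\alpha!} \|D^\alpha u\|_{L^\infty(\Omega)} \, h^k.
\end{align*}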
Substitution of this result into (5.15) yields:
\begin{align*}
|u - u^\star|(\bar x) &\le \|K\|_{L^1(\mathbb{R}^d)} \sup_{y \in \operatorname{supp}(K)} \Bigg\{ \sum_{|\alpha| = k} \frac{|y^\alpha|}{\alpha!} \|D^\alpha u\|_{L^\infty(\Omega)} \, h^k \Bigg\} \\
&\lesssim \sum_{|\alpha| = k} \|D^\alpha u\|_{L^\infty(\Omega)} \, h^k, \qquad \forall \bar x \in \Omega.
\end{align*}
Here we have used that $K$ is independent of $h$ for all $\bar x$ (although different $\bar x$ yield different kernels). We now arrive at (5.12), which completes the proof.
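The polynomial reproduction property (5.13) can be checked in exact arithmetic for a small one-dimensional case. The sketch below is an illustration only and rests on several assumptions: it uses B-splines of order $\ell = 1$ (the indicator function of $[-\frac12, \frac12]$), takes the kernel nodes as given input instead of computing them from (5.3) and (5.4), and obtains the coefficients $c_j$ by solving the moment conditions quoted from (5.5) in the proof above. The function names are hypothetical.

```python
from fractions import Fraction
from typing import List, Sequence

HALF = Fraction(1, 2)


def box_moment(node: Fraction, k: int) -> Fraction:
    """Integral of (y + node)^k over [-1/2, 1/2], i.e. against the order-1 B-spline."""
    return ((node + HALF) ** (k + 1) - (node - HALF) ** (k + 1)) / (k + 1)


def kernel_coefficients(nodes: Sequence[Fraction]) -> List[Fraction]:
    """Solve sum_j c_j * box_moment(x_j, k) = [k == 0] for k = 0..r (Gauss-Jordan)."""
    n = len(nodes)
    rows = [[box_moment(x, k) for x in nodes] + [Fraction(int(k == 0))]
            for k in range(n)]
    for col in range(n):
        pivot = next(i for i in range(col, n) if rows[i][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for i in range(n):
            if i != col:
                factor = rows[i][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[col])]
    return [row[-1] for row in rows]


def post_process_monomial(m: int, x_bar: Fraction, h: Fraction,
                          nodes: Sequence[Fraction],
                          coeffs: Sequence[Fraction]) -> Fraction:
    """Evaluate sum_j c_j * integral of psi(y) * (x_bar - h*(y + x_j))^m dy exactly."""
    total = Fraction(0)
    for c, x in zip(coeffs, nodes):
        lo, hi = x - HALF, x + HALF  # t = y + x_j runs over [lo, hi]
        # Antiderivative of (x_bar - h*t)^m in t is -(x_bar - h*t)^(m+1) / (h*(m+1)).
        total += c * ((x_bar - h * lo) ** (m + 1)
                      - (x_bar - h * hi) ** (m + 1)) / (h * (m + 1))
    return total
```

With $r + 1 = 3$ nodes, all monomials up to degree $r = 2$ are reproduced exactly, both for centred nodes such as $(-1, 0, 1)$ and for one-sided nodes such as $(0, 1, 2)$; only the size of the coefficients changes.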
5.3.2 Derivatives of B-splines
The derivative of a B-spline $\psi^{(\ell)}$ can be expressed as the central difference of the lower-order B-spline $\psi^{(\ell-1)}$ [70, p. 12]. In [11, Lemma 5.3], this property was exploited to estimate norms of derivatives of $u^\star$ for the symmetric post-processor. In this section, we obtain a similar result for the convolution of $u$
with a single B-spline (without requiring continuity of u). This reduction to
the core elements of the post-processor is convenient when handling different
kernel types later on. Another difference with [11] is that we provide an explicit
expression for the largest subdomain for which the estimates are valid (without
requiring information outside Ω). These subdomains will turn out to be just
large enough to ensure that our main estimates are applicable in the entire
spatial domain. Altogether, we have the following result:
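Lemma 5.3.2 Let $\alpha$ be a $d$-dimensional multi-index whose entries are not larger than $\ell \ge 1$. If $u \in L^\infty(\Omega)$, then $\psi_h^{(\ell)} \star u \in W^{\ell,\infty}(\Omega_\ell)$, and
\[
\big\| D^\alpha \big( \psi_h^{(\ell)} \star u \big) \big\|_{L^\infty(\Omega_\ell)} \le \|\partial_h^\alpha u\|_{L^\infty(\Omega_\alpha)}. \tag{5.16}
\]
Similarly, if $u \in L^2(\Omega)$, then $\psi_h^{(\ell)} \star u \in H^\ell(\Omega_\ell)$, and, for $k \ge 0$:
\[
\big\| D^\alpha \big( \psi_h^{(\ell)} \star u \big) \big\|_{H^{-k}(\Omega_\ell)} \le \|\partial_h^\alpha u\|_{H^{-k}(\Omega_\alpha)}. \tag{5.17}
\]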
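To show Lemma 5.3.2, the main idea is to demonstrate (the proof is given below)$^6$:
\[
D^\alpha \big( \psi_h^{(\ell)} \star u \big) = \psi_h^{(\ell - \alpha)} \star \partial_h^\alpha u. \tag{5.18}
\]
Furthermore, we make use of Young's inequality for convolutions [9, Theorem 3.9.4]: for $1 \le p, q, r \le \infty$, $f \in L^p(\mathbb{R}^d)$, and $g \in L^q(\mathbb{R}^d)$:
\[
\frac{1}{p} + \frac{1}{q} = \frac{1}{r} + 1 \quad \Rightarrow \quad \|f \star g\|_{L^r(\mathbb{R}^d)} \le \|f\|_{L^p(\mathbb{R}^d)} \|g\|_{L^q(\mathbb{R}^d)}. \tag{5.19}
\]
$^6$ Here, we use the notation $\psi_h^{(\ell - \alpha)}(x) = \psi_h^{(\ell - \alpha_1)}(x_1) \cdots \psi_h^{(\ell - \alpha_d)}(x_d)$, for all $x \in \mathbb{R}^d$. Furthermore, the notation $\psi_h^{(0)} \star u$ should be interpreted as $u$.
Additionally, we use that [70, p.3]:
\[
\big\| \psi_h^{(\ell)} \big\|_{L^1(\mathbb{R}^d)} = 1. \tag{5.20}
\]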
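The way (5.19) and (5.20) are combined later (with $p = 1$ and $q = r$) has a simple discrete analogue: convolving a sequence with a filter whose absolute values sum to one cannot increase its maximum norm or its Euclidean norm. The following sketch illustrates this for finite sequences; it is a discrete stand-in for the continuous statement, and the function names are hypothetical.

```python
from typing import List, Sequence


def discrete_convolution(f: Sequence[float], g: Sequence[float]) -> List[float]:
    """Full discrete convolution of two finitely supported sequences."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out


def norm_1(v: Sequence[float]) -> float:
    return sum(abs(x) for x in v)


def norm_2(v: Sequence[float]) -> float:
    return sum(x * x for x in v) ** 0.5


def norm_inf(v: Sequence[float]) -> float:
    return max(abs(x) for x in v)


# A box filter with unit 1-norm plays the role of the scaled B-spline in (5.20).
box_filter = [0.25, 0.25, 0.25, 0.25]
signal = [3.0, -1.0, 4.0, -1.0, 5.0, -9.0, 2.0, 6.0]
smoothed = discrete_convolution(box_filter, signal)
```

Here `norm_inf(smoothed)` and `norm_2(smoothed)` are bounded by `norm_inf(signal)` and `norm_2(signal)` respectively, since `norm_1(box_filter)` equals one.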
We can now show Lemma 5.3.2:
Proof (of Lemma 5.3.2) To show (5.18), first, consider the one-dimensional case with $\ell = 1$:
\[
\big( \psi_h^{(1)} \star u \big)(x) = \frac{1}{h} \int_{-h/2}^{h/2} u(\underbrace{x - y}_{=:s}) \, dy = \frac{1}{h} \int_{x - h/2}^{x + h/2} u(s) \, ds.
\]
As a result, $\psi_h^{(1)} \star u$ is continuous, and $D(\psi_h^{(1)} \star u) = \partial_h u$ almost everywhere [9, Theorem 5.4.2]. For $\ell \ge 1$, we may now write:
\[
D \big( \psi_h^{(\ell)} \star u \big) \overset{(5.1)}{=} D \big( \psi_h^{(\ell-1)} \star \psi_h^{(1)} \star u \big) = \psi_h^{(\ell-1)} \star D \big( \psi_h^{(1)} \star u \big) = \psi_h^{(\ell-1)} \star \partial_h u.
\]
For higher-order derivatives, we may repeat this strategy to obtain (5.18) for the one-dimensional case. For the multi-dimensional case, we apply the above in each direction, which then completes the proof of (5.18).
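The base case $\ell = 1$ of (5.18) can be verified numerically for a smooth test function. The sketch below assumes $u = \sin$, for which the box average has a closed form, and compares a finite-difference approximation of its derivative with the central difference $\partial_h u$. The helper names and the step size `eps` are choices made for this illustration.

```python
import math
from typing import Callable


def box_average_sin(x: float, h: float) -> float:
    """(psi_h^(1) * u)(x) for u = sin: the mean of sin over [x - h/2, x + h/2]."""
    return (math.cos(x - h / 2) - math.cos(x + h / 2)) / h


def central_difference_sin(x: float, h: float) -> float:
    """The central difference of u = sin with spacing h."""
    return (math.sin(x + h / 2) - math.sin(x - h / 2)) / h


def numerical_derivative(f: Callable[[float], float], x: float,
                         eps: float = 1e-6) -> float:
    """Second-order finite-difference approximation of f'(x)."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


h = 0.25
x = 0.4
lhs = numerical_derivative(lambda t: box_average_sin(t, h), x)
rhs = central_difference_sin(x, h)
```

Up to the accuracy of the finite-difference derivative, `lhs` and `rhs` agree, which reflects the identity $D(\psi_h^{(1)} \star u) = \partial_h u$.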
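To show (5.16), note that (5.18) implies that
\[
\big\| D^\alpha \big( \psi_h^{(\ell)} \star u \big) \big\|_{L^\infty(\Omega_\ell)} = \big\| \psi_h^{(\ell - \alpha)} \star \partial_h^\alpha u \big\|_{L^\infty(\Omega_\ell)}.
\]
Next, define $w \in L^\infty(\mathbb{R}^d)$ such that $w = \partial_h^\alpha u$ in $\Omega_\alpha$ and zero everywhere else. Because of the local support of the B-spline, the convolution in the right hand side above requires only information of $\partial_h^\alpha u$ in $\Omega_\alpha \supseteq \Omega_\ell$. Hence, we may replace $\partial_h^\alpha u$ by $w$:
\[
\big\| D^\alpha \big( \psi_h^{(\ell)} \star u \big) \big\|_{L^\infty(\Omega_\ell)} = \big\| \psi_h^{(\ell - \alpha)} \star w \big\|_{L^\infty(\Omega_\ell)}.
\]
Next, we apply Young's inequality (5.19) with $p = 1$, and $q, r = \infty$:
\[
\big\| D^\alpha \big( \psi_h^{(\ell)} \star u \big) \big\|_{L^\infty(\Omega_\ell)} \le \big\| \psi_h^{(\ell - \alpha)} \big\|_{L^1(\mathbb{R}^d)} \|w\|_{L^\infty(\mathbb{R}^d)}.
\]
Using (5.20) and the fact that $\|w\|_{L^\infty(\mathbb{R}^d)} = \|\partial_h^\alpha u\|_{L^\infty(\Omega_\alpha)}$ now yields (5.16).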
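To show (5.17), note that (5.18) implies that
\[
\big\| D^\alpha \big( \psi_h^{(\ell)} \star u \big) \big\|_{H^{-k}(\Omega_\ell)} = \big\| \psi_h^{(\ell - \alpha)} \star \partial_h^\alpha u \big\|_{H^{-k}(\Omega_\ell)}.
\]
As before, we define $w \in L^\infty(\mathbb{R}^d)$ such that $w = \partial_h^\alpha u$ in $\Omega_\alpha$ and zero everywhere else, so we may replace $\partial_h^\alpha u$ by $w$:
\[
\big\| D^\alpha \big( \psi_h^{(\ell)} \star u \big) \big\|_{H^{-k}(\Omega_\ell)} = \big\| \psi_h^{(\ell - \alpha)} \star w \big\|_{H^{-k}(\Omega_\ell)}.
\]
Next, choose $v \in C_0^\infty(\Omega_\ell)$, and extend it to $\mathbb{R}^d$ by setting it equal to zero outside $\Omega_\ell$. We may now write:
\[
\int_{\Omega_\ell} \big( \psi_h^{(\ell - \alpha)} \star w \big) \, v \overset{\text{Fubini}}{=} \int_{\Omega_\alpha} w \, \big( \psi_h^{(\ell - \alpha)} \star v \big).
\]
Next, we note that $\psi_h^{(\ell - \alpha)} \star v \in C_0^\infty(\Omega_\alpha)$, so we may consider it as a test function in the definition of the negative-order norm (5.10) of $w$:
\[
\int_{\Omega_\ell} \big( \psi_h^{(\ell - \alpha)} \star w \big) \, v = \underbrace{\frac{\int_{\Omega_\alpha} w \, \big( \psi_h^{(\ell - \alpha)} \star v \big)}{\big\| \psi_h^{(\ell - \alpha)} \star v \big\|_{H^k(\Omega_\alpha)}}}_{\le \|w\|_{H^{-k}(\Omega_\alpha)}} \big\| \psi_h^{(\ell - \alpha)} \star v \big\|_{H^k(\Omega_\alpha)} \le \|w\|_{H^{-k}(\Omega_\alpha)} \big\| \psi_h^{(\ell - \alpha)} \star v \big\|_{H^k(\Omega_\alpha)}.
\]
At the same time,
\[
\big\| \psi_h^{(\ell - \alpha)} \star v \big\|^2_{H^k(\Omega_\alpha)} = \sum_{|\beta| \le k} \big\| D^\beta \big( \psi_h^{(\ell - \alpha)} \star v \big) \big\|^2_{L^2(\Omega_\alpha)} = \sum_{|\beta| \le k} \big\| \psi_h^{(\ell - \alpha)} \star D^\beta v \big\|^2_{L^2(\Omega_\alpha)}.
\]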
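Next, apply Young's inequality (5.19) with $p = 1$ and $q, r = 2$ to obtain (using that $v$, and thus its derivatives, have compact support in $\Omega_\ell$):
\[
\big\| \psi_h^{(\ell - \alpha)} \star v \big\|^2_{H^k(\Omega_\alpha)} \le \sum_{|\beta| \le k} \underbrace{\big\| \psi_h^{(\ell - \alpha)} \big\|^2_{L^1(\mathbb{R}^d)}}_{= 1} \big\| D^\beta v \big\|^2_{L^2(\Omega_\ell)} \overset{(5.20)}{=} \|v\|^2_{H^k(\Omega_\ell)}.
\]
Finally, we combine the results above to obtain:
\[
\big\| D^\alpha \big( \psi_h^{(\ell)} \star u \big) \big\|_{H^{-k}(\Omega_\ell)} = \sup_{v \in C_0^\infty(\Omega_\ell)} \frac{\int_{\Omega_\ell} \big( \psi_h^{(\ell - \alpha)} \star w \big) \, v}{\|v\|_{H^k(\Omega_\ell)}} \le \|w\|_{H^{-k}(\Omega_\alpha)}.
\]
Replacing $w$ by $\partial_h^\alpha u$ completes the proof of (5.17).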
5.4 The main result in abstract form
Using the auxiliary properties discussed in the previous section, we now derive an estimate for $\|u - v^\star\|$. This result is applicable for any (sufficiently smooth) functions $u$ and $v$. The implications for DG approximations will be studied in Section 5.5. Section 5.4.1 expresses a post-processed function in terms of convolutions with B-splines. Section 5.4.2 estimates the remaining terms further. Section 5.4.3 combines these two results with Lemma 5.3.1 to obtain the final estimate for $\|u - v^\star\|$.
5.4.1 Reducing the post-processor to its building blocks
The first step is to express $v^\star$ in terms of convolutions with B-splines. This
removes the dependency on the evaluation point and, as such, simplifies the
analysis. As before, we provide explicit expressions for the subdomains involved, which are crucial to ensure that our final error estimates apply in the
entire domain. Altogether, we show the following result:
Lemma 5.4.1 Consider the generalized post-processor with B-splines of order $\ell \ge 1$ and $\varepsilon > 0$ small, as defined in Section 5.2.1. Then, for all $v \in L^2(\Omega)$:
\[
\|v^\star\|_{L^2(\Omega)} \lesssim \big\| \psi_h^{(\ell)} \star v \big\|_{L^2(\Omega_{\ell+\varepsilon})}, \tag{5.21}
\]
\[
\|v^\star\|_{L^\infty(\Omega)} \lesssim \big\| \psi_h^{(\ell)} \star v \big\|_{L^\infty(\Omega_{\ell+\varepsilon})}. \tag{5.22}
\]
The constants involved depend on the kernel coefficients, as indicated in the proof below.
To show Lemma 5.4.1, the key is to observe that the kernel nodes are located within $\Omega_{\ell+\varepsilon}$ during the convolution (this is motivated below).
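Proof (of Lemma 5.4.1) First, consider the one-dimensional case, and choose an evaluation point $\bar x \in \Omega$. Then,
\[
v^\star(\bar x) = \sum_{j=0}^{r} c_j \int_\Omega \frac{1}{h} \psi^{(\ell)}\!\left( \frac{\bar x - x}{h} - x_j \right) v(x) \, dx = \sum_{j=0}^{r} c_j \big( \psi_h^{(\ell)} \star v \big)(\underbrace{\bar x - h x_j}_{\in \Omega_{\ell+\varepsilon}}).
\]
Next, observe that $\bar x - h x_j \in \Omega_{\ell+\varepsilon}$ for all $\bar x \in \Omega$, which can be seen as follows. At the end points of the domain $\Omega = (a, b)$, the shift function (5.4) reads $\lambda(a) = -\frac{r + \ell + \varepsilon}{2}$ and $\lambda(b) = \frac{r + \ell + \varepsilon}{2}$ respectively. Hence, for $\bar x = a$, the right-most kernel node (5.3) is $x_r(a) = -\frac{\ell + \varepsilon}{2}$. Similarly, for $\bar x = b$, the left-most kernel node is $x_0(b) = \frac{\ell + \varepsilon}{2}$. Hence, the quantity $\bar x - h x_j$ takes values in $\big[ a + h \frac{\ell + \varepsilon}{2}, \, b - h \frac{\ell + \varepsilon}{2} \big]$, which is precisely $\Omega_{\ell+\varepsilon}$ by definition (5.11).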
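Considering the $L^2$-norm of $v^\star$ now yields (recall the dependence of the kernel coefficients on the evaluation point):
\begin{align*}
\|v^\star\|^2_{L^2(\Omega)} &= \int_\Omega \Bigg| \sum_{j=0}^{r} c_j \big( \psi_h^{(\ell)} \star v \big)(\bar x - h x_j) \Bigg|^2 d\bar x \\
&\overset{\text{Cauchy-Schwarz}}{\le} \int_\Omega \Bigg( \sum_{j=0}^{r} |c_j|^2 \Bigg) \Bigg( \sum_{j=0}^{r} \big| \big( \psi_h^{(\ell)} \star v \big)(\bar x - h x_j) \big|^2 \Bigg) d\bar x \\
&\le \sup_{\bar x \in \Omega} \Bigg( \sum_{j=0}^{r} |c_j(\bar x)|^2 \Bigg) \sum_{j=0}^{r} \int_\Omega \big| \big( \psi_h^{(\ell)} \star v \big)(\underbrace{\bar x - h x_j}_{\in \Omega_{\ell+\varepsilon}}) \big|^2 d\bar x \\
&\le \underbrace{r \sup_{\bar x \in \Omega} \Bigg( \sum_{j=0}^{r} |c_j(\bar x)|^2 \Bigg)}_{\text{constant}} \big\| \psi_h^{(\ell)} \star v \big\|^2_{L^2(\Omega_{\ell+\varepsilon})}.
\end{align*}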
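Similarly, when we consider the $L^\infty$-norm of $v^\star$, we obtain:
\begin{align*}
\|v^\star\|_{L^\infty(\Omega)} &= \sup_{\bar x} \Bigg\{ \Bigg| \sum_{j=0}^{r} c_j \big( \psi_h^{(\ell)} \star v \big)(\underbrace{\bar x - h x_j}_{\in \Omega_{\ell+\varepsilon}}) \Bigg| \Bigg\} \\
&\le \underbrace{\sup_{\bar x \in \Omega} \Bigg( \sum_{j=0}^{r} |c_j(\bar x)| \Bigg)}_{\text{constant}} \big\| \psi_h^{(\ell)} \star v \big\|_{L^\infty(\Omega_{\ell+\varepsilon})}.
\end{align*}
This completes the proof for the one-dimensional case.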
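For general $d \ge 1$, the desired result follows by applying the strategy above for each spatial direction. To be more specific, recall that the kernel $K$ is a tensor product of one-dimensional kernels $K_1, \dots, K_d$ (cf. (5.6)). We can now write for $\bar x \in \Omega$ (the superscripted indices below indicate the corresponding one-dimensional kernel):
\[
v^\star(\bar x) = \sum_{j_1=0}^{r} \cdots \sum_{j_d=0}^{r} c_{j_1}^{(1)} \cdots c_{j_d}^{(d)} \, \big( \psi_h^{(\ell)} \star v \big) \Bigg( \underbrace{\bar x - h \begin{pmatrix} x_{j_1}^{(1)} \\ \vdots \\ x_{j_d}^{(d)} \end{pmatrix}}_{\in \Omega_{\ell+\varepsilon}} \Bigg).
\]
Considering the $L^2$- and $L^\infty$-norm yields:
\begin{align*}
\|v^\star\|_{L^2(\Omega)} &\le \underbrace{\sqrt{ d \, r \sup_{\bar x \in \Omega} \Bigg( \sum_{j_1=0}^{r} \cdots \sum_{j_d=0}^{r} \big| c_{j_1}^{(1)}(\bar x) \cdots c_{j_d}^{(d)}(\bar x) \big|^2 \Bigg) }}_{\text{constant}} \; \big\| \psi_h^{(\ell)} \star v \big\|_{L^2(\Omega_{\ell+\varepsilon})}, \\
\|v^\star\|_{L^\infty(\Omega)} &\le \underbrace{\sup_{\bar x \in \Omega} \Bigg( \sum_{j_1=0}^{r} \cdots \sum_{j_d=0}^{r} \big| c_{j_1}^{(1)}(\bar x) \cdots c_{j_d}^{(d)}(\bar x) \big| \Bigg)}_{\text{constant}} \; \big\| \psi_h^{(\ell)} \star v \big\|_{L^\infty(\Omega_{\ell+\varepsilon})}.
\end{align*}
This completes the proof.
5.4.2 Treating the remaining building blocks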
Now that we have reduced the post-processor to its building blocks, in this
section, we further estimate the latter. To this end, we follow Bramble and
Schatz [11, p. 104–105], while considering B-splines rather than the symmetric
kernel (and without any restriction to continuous functions). As before, another
difference with [11] is that we keep a careful administration of the subdomains
involved, as this is crucial for our final estimate to be applicable in the entire
domain. Altogether, we show the following result:
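Lemma 5.4.2 Consider a B-spline of order $\ell \ge 1$, let $\varepsilon > 0$ small (cf. Section 5.2.1), and define$^7$ $d_0 = 1 + [d/2]$. Then, for all $v \in L^2(\Omega)$:
\[
\big\| \psi_h^{(\ell)} \star v \big\|_{L^2(\Omega_{\ell+\varepsilon})} \lesssim \sum_{|\alpha| \le \ell} \|\partial_h^\alpha v\|_{H^{-\ell}(\Omega_\alpha)}. \tag{5.23}
\]
Furthermore, for all $v \in L^\infty(\Omega)$:
\[
\big\| \psi_h^{(\ell)} \star v \big\|_{L^\infty(\Omega_{\ell+\varepsilon})} \lesssim \sum_{|\alpha| \le \ell + d_0} \|\partial_h^\alpha v\|_{H^{-\ell}(\Omega_\alpha)} + h^\ell \sum_{|\alpha| \le \ell} \|\partial_h^\alpha v\|_{L^\infty(\Omega_\alpha)}. \tag{5.24}
\]
$^7$ Here, $[d/2]$ denotes the integer part of $d/2$.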
To show Lemma 5.4.2, the main idea is to switch from $L^2$-norms to negative-order norms at the cost of smoothness: for open bounded $d$-dimensional domains$^8$ $X_0 \Subset X_1$ and nonnegative integers $k$, we have:
\[
\|u\|_{L^2(X_0)} \lesssim \sum_{|\alpha| \le k} \|D^\alpha u\|_{H^{-k}(X_1)}, \qquad \forall u \in L^2(X_1). \tag{5.25}
\]
This has been shown in [11, p. 96]. At the same time, we can switch from $L^\infty$-norms to $L^2$-norms (again, at the cost of smoothness): for open bounded $d$-dimensional domains $X_0 \Subset X_1$ and $d_0 := [d/2] + 1$, we have that $u \in H^{d_0}(X_1)$ is continuous almost everywhere in $X_0$, and:
\[
\|u\|_{L^\infty(X_0)} \lesssim \|u\|_{H^{d_0}(X_1)}, \qquad \forall u \in H^{d_0}(X_1). \tag{5.26}
\]
This result is given in [12, p. 679]. Combining these relations with the expression for derivatives of B-splines in Lemma 5.3.2, we can now show Lemma 5.4.2 as follows:
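Proof (of Lemma 5.4.2) Relation (5.23) is obtained as follows:
\begin{align*}
\big\| \psi_h^{(\ell)} \star v \big\|_{L^2(\Omega_{\ell+\varepsilon})} &\overset{(5.25), \; \Omega_{\ell+\varepsilon} \Subset \Omega_\ell}{\lesssim} \sum_{|\alpha| \le \ell} \big\| D^\alpha \big( \psi_h^{(\ell)} \star v \big) \big\|_{H^{-\ell}(\Omega_\ell)} \\
&\overset{\text{Lemma 5.3.2}}{\le} \sum_{|\alpha| \le \ell} \|\partial_h^\alpha v\|_{H^{-\ell}(\Omega_\alpha)}.
\end{align*}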
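Relation (5.24) requires a little more work: let $(\psi_h^{(\ell)} \star v)^\star$ denote the result of applying the generalized post-processor to $\psi_h^{(\ell)} \star v$ using $r + 1$ B-splines of order $d_0$ (!) in the domain $\Omega_{\ell+\varepsilon}$ (cf. Section 5.2.1). The triangle inequality then gives:
\[
\big\| \psi_h^{(\ell)} \star v \big\|_{L^\infty(\Omega_{\ell+\varepsilon})} \le \big\| \psi_h^{(\ell)} \star v - \big( \psi_h^{(\ell)} \star v \big)^\star \big\|_{L^\infty(\Omega_{\ell+\varepsilon})} + \big\| \big( \psi_h^{(\ell)} \star v \big)^\star \big\|_{L^\infty(\Omega_{\ell+\varepsilon})}. \tag{5.27}
\]
To estimate the first term in the right hand side of (5.27), we observe that $\psi_h^{(\ell)} \star v \in W^{\ell,\infty}(\Omega_\ell)$ by Lemma 5.3.2. Hence, we can apply Lemma 5.3.1 (substituting $\psi_h^{(\ell)} \star v$ for $u$) and rewrite the derivatives using Lemma 5.3.2:
\begin{align*}
\big\| \psi_h^{(\ell)} \star v - \big( \psi_h^{(\ell)} \star v \big)^\star \big\|_{L^\infty(\Omega_{\ell+\varepsilon})} &\overset{\text{Lemma 5.3.1}}{\lesssim} h^\ell \sum_{|\alpha| = \ell} \big\| D^\alpha \big( \psi_h^{(\ell)} \star v \big) \big\|_{L^\infty(\Omega_{\ell+\varepsilon})} \\
&\overset{\Omega_{\ell+\varepsilon} \subseteq \Omega_\ell}{\le} h^\ell \sum_{|\alpha| = \ell} \big\| D^\alpha \big( \psi_h^{(\ell)} \star v \big) \big\|_{L^\infty(\Omega_\ell)} \\
&\overset{\text{Lemma 5.3.2}}{\le} h^\ell \sum_{|\alpha| \le \ell} \|\partial_h^\alpha v\|_{L^\infty(\Omega_\alpha)}. \tag{5.28}
\end{align*}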
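To estimate the second term in the right hand side of (5.27), we apply Lemma 5.4.1, substituting $\Omega_{\ell+\varepsilon}$ for $\Omega$ and $d_0$ for $\ell$. This results in a reduction from $\Omega_{\ell+\varepsilon}$ to $\Omega_{\ell+d_0+2\varepsilon}$:
\[
\big\| \big( \psi_h^{(\ell)} \star v \big)^\star \big\|_{L^\infty(\Omega_{\ell+\varepsilon})} \overset{\text{Lemma 5.4.1}}{\lesssim} \big\| \psi_h^{(\ell+d_0)} \star v \big\|_{L^\infty(\Omega_{\ell+d_0+2\varepsilon})}.
\]
$^8$ We write $X_0 \Subset X_1$ to indicate that $X_0$ is compactly embedded in $X_1$, i.e. the closure of $X_0$ is compact and a subset of the interior of $X_1$.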
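Next, we switch to $L^2$-norms using (5.26), after which we can proceed as before for the estimates in the $L^2$-norm:
\begin{align*}
\big\| \big( \psi_h^{(\ell)} \star v \big)^\star \big\|_{L^\infty(\Omega_{\ell+\varepsilon})} &\overset{(5.26), \; \Omega_{\ell+d_0+2\varepsilon} \Subset \Omega_{\ell+d_0+\varepsilon}}{\lesssim} \sum_{|\alpha| \le d_0} \big\| D^\alpha \big( \psi_h^{(\ell+d_0)} \star v \big) \big\|_{L^2(\Omega_{\ell+d_0+\varepsilon})} \\
&\overset{(5.25), \; \Omega_{\ell+d_0+\varepsilon} \Subset \Omega_{\ell+d_0}}{\lesssim} \sum_{|\alpha| \le d_0} \sum_{|\beta| \le \ell} \big\| D^{\alpha+\beta} \big( \psi_h^{(\ell+d_0)} \star v \big) \big\|_{H^{-\ell}(\Omega_{\ell+d_0})} \\
&\overset{\text{reordering}}{\lesssim} \sum_{|\alpha| \le \ell+d_0} \big\| D^\alpha \big( \psi_h^{(\ell+d_0)} \star v \big) \big\|_{H^{-\ell}(\Omega_{\ell+d_0})} \\
&\overset{\text{Lemma 5.3.2}}{\lesssim} \sum_{|\alpha| \le \ell+d_0} \|\partial_h^\alpha v\|_{H^{-\ell}(\Omega_\alpha)}. \tag{5.29}
\end{align*}
Substituting (5.28) and (5.29) into (5.27) yields (5.24), which then completes the proof.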
5.4.3 The main error estimate in abstract form
Combining the auxiliary results in the previous sections, we can now estimate $\|u - v^\star\|$ in both the $L^2$- and the $L^\infty$-norm. These estimates are in abstract
form in the sense that they are applicable for any (sufficiently smooth) functions
u and v. To obtain these results, we follow [11, p. 104–106], while considering
the generalized post-processor rather than the symmetric filter. Unlike before,
our error estimates are valid in the entire domain, and the L∞ -estimates are
not restricted to continuous v.
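Theorem 5.4.3 Consider the generalized post-processor using $r + 1$ B-splines of order $\ell$, as discussed in Section 5.2.1. Let $k \le r + 1$. Then, for all $u \in W^{k,\infty}(\Omega)$ and $v \in L^2(\Omega)$:
\[
\|u - v^\star\|_{L^2(\Omega)} \lesssim \sum_{|\alpha| = k} \|D^\alpha u\|_{L^\infty(\Omega)} \, h^k + \sum_{|\alpha| \le \ell} \|\partial_h^\alpha (u - v)\|_{H^{-\ell}(\Omega_\alpha)}. \tag{5.30}
\]
Furthermore, for all $u \in W^{k,\infty}(\Omega)$ and $v \in L^\infty(\Omega)$:
\[
\|u - v^\star\|_{L^\infty(\Omega)} \lesssim \sum_{|\alpha| = k} \|D^\alpha u\|_{L^\infty(\Omega)} \, h^k + \sum_{|\alpha| \le \ell + d_0} \|\partial_h^\alpha (u - v)\|_{H^{-\ell}(\Omega_\alpha)} + \sum_{|\alpha| \le \ell} \|\partial_h^\alpha (u - v)\|_{L^\infty(\Omega_\alpha)} \, h^\ell. \tag{5.31}
\]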
To show Theorem 5.4.3, the main idea is to apply the triangle inequality to write:
\[
\|u - v^\star\| \le \|u - u^\star\| + \|(u - v)^\star\|. \tag{5.32}
\]
After that, we can apply Lemma 5.3.1 to the first term and Lemma 5.4.2 to the second. Altogether, the proof of Theorem 5.4.3 reads as follows:
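Proof (of Theorem 5.4.3) To show (5.30), use the triangle inequality and the linearity of the post-processor to write:
\begin{align*}
\|u - v^\star\|_{L^2(\Omega)} &\le \|u - u^\star\|_{L^2(\Omega)} + \|u^\star - v^\star\|_{L^2(\Omega)} \lesssim \|u - u^\star\|_{L^\infty(\Omega)} + \|(u - v)^\star\|_{L^2(\Omega)}, \\
\|u - v^\star\|_{L^\infty(\Omega)} &\le \|u - u^\star\|_{L^\infty(\Omega)} + \|(u - v)^\star\|_{L^\infty(\Omega)}.
\end{align*}
Application of Lemma 5.4.1 to the last terms gives:
\begin{align*}
\|u - v^\star\|_{L^2(\Omega)} &\lesssim \|u - u^\star\|_{L^\infty(\Omega)} + \big\| \psi_h^{(\ell)} \star (u - v) \big\|_{L^2(\Omega_{\ell+\varepsilon})}, \\
\|u - v^\star\|_{L^\infty(\Omega)} &\lesssim \|u - u^\star\|_{L^\infty(\Omega)} + \big\| \psi_h^{(\ell)} \star (u - v) \big\|_{L^\infty(\Omega_{\ell+\varepsilon})}.
\end{align*}
Application of Lemma 5.3.1 to the first term and Lemma 5.4.2 to the second term in each inequality yields (5.30) and (5.31), which then completes the proof.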
5.5 The main result for DG approximations
In the previous section, we obtained error estimates for arbitrary filtered functions. In this section, we study the implications for filtered DG approximations. Section 5.5.1 discusses the convergence of unfiltered DG approximations, including the superconvergence in the negative-order norm established
in [22]. Section 5.5.2 uses these results to derive the main theorem of this
chapter: the generalized post-processor improves the convergence rate of a DG
approximation from order p + 1 to order 2p + 1 in the L2 -norm, and to order
min{2p + 1, 2p + 2 − d/2} in the L∞ -norm. Section 5.5.3 discusses why the
same convergence rates result for the position-dependent post-processor, and
explains the accuracy improvement we observed earlier during the numerical
experiments in Section 4.5.
5.5.1 Unfiltered DG convergence
A DG approximation with polynomial degree p typically converges at rate
p + 1 in the L2 -norm for sufficiently smooth problems. In the negative-order
norm, superconvergence of order 2p + 1 has been shown for the linear periodic
hyperbolic problems under consideration [22]. In this section, we summarize
these results, including the implications for L∞ -norms and central differences
of the error:
Lemma 5.5.1 Consider the linear periodic hyperbolic problem (5.8) with exact solution $u$ and initial data $u_0$. Suppose that $u_h$ is the DG approximation for $u$ with polynomial degree $p \ge 1$ and mesh size $h$, as discussed in Section 5.2.2. Let $\alpha$ be a $d$-dimensional multi-index. Then, for all initial data $u_0 \in W^{p+1+|\alpha|,\infty}(\Omega)$ and corresponding $u$ and $u_h$:
\[
\|\partial_h^\alpha (u - u_h)\|_{H^{-(p+1)}(\Omega_\alpha)} \lesssim \|u_0\|_{H^{p+1+|\alpha|}(\Omega)} \, h^{2p+1}. \tag{5.33}
\]
Furthermore, for all $u_0 \in W^{p+2+|\alpha|,\infty}(\Omega)$ and corresponding $u$ and $u_h$, if the DG scheme yields convergence of order $p + 1$ in the sense that
\[
\|u - u_h\|_{L^2(\Omega)} \lesssim \|u_0\|_{H^{p+2}(\Omega)} \, h^{p+1}, \tag{5.34}
\]
then,
\begin{align*}
\|\partial_h^\alpha (u - u_h)\|_{L^2(\Omega_\alpha)} &\lesssim \|u_0\|_{H^{p+2+|\alpha|}(\Omega)} \, h^{p+1}, \tag{5.35} \\
\|\partial_h^\alpha (u - u_h)\|_{L^\infty(\Omega_\alpha)} &\lesssim \max_{|\beta| = p+1} \big\| D^{\beta+\alpha} u \big\|_{L^\infty(\Omega)} \, h^{p+1-\frac{d}{2}} + \|u_0\|_{H^{p+2+|\alpha|}(\Omega)} \, h^{p+1-\frac{d}{2}}. \tag{5.36}
\end{align*}
Before we show these results, we have the following remarks regarding assumption (5.34). First of all, for one-dimensional problems with $A_0 = 0$ in (5.8), relation (5.34) has been shown in [21, p. 166, 189–199]. For stationary problems, the same result has been derived for certain two-dimensional triangulations (satisfying a so-called transversality condition) [61], and for $d$-dimensional meshes with a unique outflow edge per mesh element [19, 85, 20] (with a lower order Sobolev norm in the right hand side). For some stationary problems, a convergence rate of order $p + \frac{1}{2}$, as shown by Johnson and Pitkäranta [44], has been shown to be sharp [62]. Nevertheless, convergence of order $p + 1$ is usually observed in practice for (5.8) [65, 47]. In any case, assumption (5.34) is not needed to obtain the main error estimate in Section 5.5.2 hereafter in the $L^2$-norm; only the result in the $L^\infty$-norm relies on it. Altogether, it seems reasonable to require (5.34) at this point.
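Whether an assumption such as (5.34) holds for a given computation is usually judged from the observed order of convergence on a sequence of meshes. A minimal sketch of that standard calculation follows. The function name `observed_orders` is hypothetical, and the errors used below are synthetic values of the form $C h^{p+1}$ generated for illustration, not results from this thesis.

```python
import math
from typing import List, Sequence


def observed_orders(mesh_sizes: Sequence[float],
                    errors: Sequence[float]) -> List[float]:
    """Estimate the convergence order between consecutive meshes.

    If error ~ C * h**q, then q = log(e0 / e1) / log(h0 / h1).
    """
    pairs = list(zip(mesh_sizes, errors))
    return [math.log(e0 / e1) / math.log(h0 / h1)
            for (h0, e0), (h1, e1) in zip(pairs, pairs[1:])]


# Synthetic errors behaving exactly like C * h**(p + 1) with p = 2 and C = 3.
p = 2
mesh_sizes = [0.1, 0.05, 0.025]
synthetic_errors = [3.0 * h ** (p + 1) for h in mesh_sizes]
orders = observed_orders(mesh_sizes, synthetic_errors)
```

For these synthetic values every entry of `orders` equals $p + 1 = 3$ up to rounding; with measured errors the entries only approach the asymptotic order as $h$ decreases.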
To show Lemma 5.5.1, we use the following known error estimate in the negative-order norm$^9$ [22, Theorem 3.3]:
\[
\|u - u_h\|_{H^{-(p+1)}(\Omega)} \lesssim \|u_0\|_{H^{p+1}(\Omega)} \, h^{2p+1}. \tag{5.37}
\]
To obtain the $L^\infty$-estimate, we use the following inverse inequality [12, p. 680]:
\[
\|v\|_{L^\infty(\Omega)} \lesssim h^{-\frac{d}{2}} \|v\|_{L^2(\Omega)}, \qquad \forall v \in V, \tag{5.38}
\]
and the following property of the $L^2$-projection $P_h$ onto $V$ [18, p. 121-129]:
\[
\|v - P_h v\|_{L^\infty(\Omega)} \lesssim \max_{|\beta| = p+1} \big\| D^\beta v \big\|_{L^\infty(\Omega)} \, h^{p+1}, \qquad \forall v \in W^{p+1,\infty}(\Omega). \tag{5.39}
\]
$^9$ Actually, it follows from [22, Theorem 3.3] that $\|u - u_h\|_{H^{-(p+1)}(X_0)} \lesssim \|u_0\|_{H^{p+1}(\Omega)} \, h^{2p+1}$ for $X_0 \Subset \Omega$. However, the same result is true when we replace $X_0$ by $\Omega$, due to the periodicity of the problem.
Altogether, Lemma 5.5.1 can be shown as follows:
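Proof (of Lemma 5.5.1) Let $\Omega'$ be the result of translating $\Omega$ (and the corresponding mesh) by a distance $\frac{h}{2}$ in each direction, i.e.
\[
\Omega' = \Big( \tfrac{h}{2}, \, 1 + \tfrac{h}{2} \Big) \times \dots \times \Big( \tfrac{h}{2}, \, 1 + \tfrac{h}{2} \Big).
\]
Next, consider $\partial_h^\alpha u$, $\partial_h^\alpha u_0$ and $\partial_h^\alpha u_h$ on this translated domain (using periodic extensions). Because the problem is linear and periodic, $\partial_h^\alpha u$ is a solution to (5.8) for the domain $\Omega'$ (with initial condition $\partial_h^\alpha u_0$). At the same time, $\partial_h^\alpha u_h$ is the DG approximation for $\partial_h^\alpha u$ (on the translated mesh). Hence, we can apply (5.34) and (5.37) for this translated problem on $\Omega'$ (also cf. [22, p. 590]):
\begin{align*}
\|\partial_h^\alpha (u - u_h)\|_{L^2(\Omega')} &\lesssim \|\partial_h^\alpha u_0\|_{H^{p+2}(\Omega')} \, h^{p+1}, \\
\|\partial_h^\alpha (u - u_h)\|_{H^{-(p+1)}(\Omega')} &\lesssim \|\partial_h^\alpha u_0\|_{H^{p+1}(\Omega')} \, h^{2p+1}.
\end{align*}
Next, observe that it follows from (5.18), (5.19), (5.20) (and the periodicity) that, for $1 \le p \le \infty$:
\[
\|\partial_h^\alpha u\|_{L^p(\Omega')} \lesssim \|D^\alpha u\|_{L^p(\Omega')}. \tag{5.40}
\]
With this reasoning, we obtain:
\[
\begin{aligned}
\|\partial_h^\alpha (u - u_h)\|_{L^2(\Omega')} &\lesssim \|u_0\|_{H^{p+2+|\alpha|}(\Omega')} \, h^{p+1}, \\
\|\partial_h^\alpha (u - u_h)\|_{H^{-(p+1)}(\Omega')} &\lesssim \|u_0\|_{H^{p+1+|\alpha|}(\Omega')} \, h^{2p+1}.
\end{aligned} \tag{5.41}
\]
Using periodicity and translation invariance again, we may replace $\Omega'$ by $\Omega$, and we arrive at (5.35) and (5.33) (using $\Omega_\alpha \subseteq \Omega$).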
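To show (5.36), we consider the translated domain again, and use the triangle inequality and the inverse inequality (5.38) to write:
\begin{align*}
\|\partial_h^\alpha (u - u_h)\|_{L^\infty(\Omega')} &\le \|\partial_h^\alpha u - P_h \partial_h^\alpha u\|_{L^\infty(\Omega')} + \|P_h \partial_h^\alpha u - \partial_h^\alpha u_h\|_{L^\infty(\Omega')} \\
&\overset{(5.38)}{\lesssim} \|\partial_h^\alpha u - P_h \partial_h^\alpha u\|_{L^\infty(\Omega')} + h^{-\frac{d}{2}} \|P_h \partial_h^\alpha u - \partial_h^\alpha u_h\|_{L^2(\Omega')}.
\end{align*}
Next, apply the triangle inequality again, and use the property of the polynomial projection (5.39):
\begin{align*}
\|\partial_h^\alpha (u - u_h)\|_{L^\infty(\Omega')} &\lesssim \|\partial_h^\alpha u - P_h \partial_h^\alpha u\|_{L^\infty(\Omega')} + h^{-\frac{d}{2}} \|P_h \partial_h^\alpha u - \partial_h^\alpha u\|_{L^2(\Omega')} + h^{-\frac{d}{2}} \|\partial_h^\alpha u - \partial_h^\alpha u_h\|_{L^2(\Omega')} \\
&\lesssim \big( 1 + h^{-\frac{d}{2}} \big) \|\partial_h^\alpha u - P_h \partial_h^\alpha u\|_{L^\infty(\Omega')} + h^{-\frac{d}{2}} \|\partial_h^\alpha u - \partial_h^\alpha u_h\|_{L^2(\Omega')} \\
&\overset{(5.39)}{\lesssim} \big( 1 + h^{-\frac{d}{2}} \big) \max_{|\beta| = p+1} \big\| D^\beta \partial_h^\alpha u \big\|_{L^\infty(\Omega')} \, h^{p+1} + h^{-\frac{d}{2}} \|\partial_h^\alpha (u - u_h)\|_{L^2(\Omega')} \\
&\overset{(5.41), \, (5.40)}{\lesssim} \max_{|\beta| = p+1} \big\| D^{\beta+\alpha} u \big\|_{L^\infty(\Omega')} \, h^{p+1-\frac{d}{2}} + \|u_0\|_{H^{p+2+|\alpha|}(\Omega')} \, h^{p+1-\frac{d}{2}}.
\end{align*}
Using periodicity and translation invariance again, we may replace $\Omega'$ by $\Omega$, and we arrive at (5.36) (using $\Omega_\alpha \subseteq \Omega$), which then completes the proof.
5.5.2 Main result: extracting DG superconvergence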
We now arrive at the main theorem of this chapter: the generalized post-processor improves the convergence rate of a DG approximation from order
p + 1 to order 2p + 1 in the L2 -norm, and to order min{2p + 1, 2p + 2 − d/2} in
the L∞ -norm. This theorem extends existing error estimates for the symmetric
post-processor in the L2 -norm [22]. Below, we also include the L∞ -norm in the
analysis.
Theorem 5.5.2 (Main result) Consider the linear periodic hyperbolic problem (5.8) with exact solution $u$ and initial data $u_0$. Suppose that $u_h$ is the DG approximation for $u$ with polynomial degree $p \ge 1$ and mesh size $h$, as discussed in Section 5.2.2. Furthermore, consider the generalized post-processor with at least $2p + 1$ B-splines of order $p + 1$ (cf. Section 5.2.1). Then, assuming $u_0, u \in W^{2p+3+[d/2],\infty}(\Omega)$:
\[
\|u - u_h^\star\|_{L^2(\Omega)} \lesssim h^{2p+1}. \tag{5.42}
\]
If, furthermore, (5.34) holds, then,
\[
\|u - u_h^\star\|_{L^\infty(\Omega)} \lesssim h^{\min\{2p+1, \, 2p+2-d/2\}}. \tag{5.43}
\]
The constants involved may depend on $u$ and $u_0$, but not on $u_h$ (as indicated in the proof below).
To show Theorem 5.5.2, we substitute Lemma 5.5.1 into Theorem 5.4.3:
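Proof (of Theorem 5.5.2) Let $k$ be any positive integer not larger than the number of kernel nodes such that $u \in W^{k,\infty}(\Omega)$. Then, substitution of $v = u_h$ and $\ell = p + 1$ into (5.30) and (5.31) yields:
\begin{align*}
\|u - u_h^\star\|_{L^2(\Omega)} &\lesssim \sum_{|\alpha| = k} \|D^\alpha u\|_{L^\infty(\Omega)} \, h^k + \sum_{|\alpha| \le p+1} \|\partial_h^\alpha (u - u_h)\|_{H^{-(p+1)}(\Omega_\alpha)}, \\
\|u - u_h^\star\|_{L^\infty(\Omega)} &\lesssim \sum_{|\alpha| = k} \|D^\alpha u\|_{L^\infty(\Omega)} \, h^k + \sum_{|\alpha| \le p+1+d_0} \|\partial_h^\alpha (u - u_h)\|_{H^{-(p+1)}(\Omega_\alpha)} + \sum_{|\alpha| \le p+1} \|\partial_h^\alpha (u - u_h)\|_{L^\infty(\Omega_\alpha)} \, h^{p+1}.
\end{align*}
Application of Lemma 5.5.1 gives (using $u_0, u \in W^{2p+3+[d/2],\infty}(\Omega)$ by assumption):
\begin{align*}
\|u - u_h^\star\|_{L^2(\Omega)} &\lesssim \sum_{|\alpha| = k} \|D^\alpha u\|_{L^\infty(\Omega)} \, h^k + \sum_{|\alpha| \le p+1} \|u_0\|_{H^{p+1+|\alpha|}(\Omega)} \, h^{2p+1}, \\
\|u - u_h^\star\|_{L^\infty(\Omega)} &\lesssim \sum_{|\alpha| = k} \|D^\alpha u\|_{L^\infty(\Omega)} \, h^k + \sum_{|\alpha| \le p+1+d_0} \|u_0\|_{H^{p+1+|\alpha|}(\Omega)} \, h^{2p+1} \\
&\quad + \sum_{|\alpha| \le p+1} \Big( \max_{|\beta| = p+1} \big\| D^{\beta+\alpha} u \big\|_{L^\infty(\Omega)} \, h^{2p+2-\frac{d}{2}} + \|u_0\|_{H^{p+2+|\alpha|}(\Omega)} \, h^{2p+2-\frac{d}{2}} \Big).
\end{align*}
Choosing $k = 2p + 1$, this can be simplified to (5.42) and (5.43), which then completes the proof.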
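To see what the rates in (5.42) and (5.43) amount to for concrete values of $p$ and $d$, the exponents can simply be tabulated. The small helper below is an illustration only; the name `predicted_rates` and the dictionary keys are hypothetical.

```python
from fractions import Fraction
from typing import Dict


def predicted_rates(p: int, d: int) -> Dict[str, Fraction]:
    """Convergence orders stated in Theorem 5.5.2 for degree p and dimension d."""
    return {
        # Typical order of the unfiltered DG approximation in the L2-norm.
        "unfiltered_L2": Fraction(p + 1),
        # Order after post-processing in the L2-norm, cf. (5.42).
        "filtered_L2": Fraction(2 * p + 1),
        # Order after post-processing in the L-infinity norm, cf. (5.43).
        "filtered_Linf": min(Fraction(2 * p + 1),
                             Fraction(2 * p + 2) - Fraction(d, 2)),
    }


rates_1d = predicted_rates(p=2, d=1)
rates_2d = predicted_rates(p=2, d=2)
rates_3d = predicted_rates(p=2, d=3)
```

For $d = 1$ and $d = 2$ the $L^\infty$-order coincides with $2p + 1$; the minimum in (5.43) only becomes active for $d \ge 3$, where the order drops to $2p + 2 - d/2$.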
5.5.3 Implications for the position-dependent post-processor
Now that we have established error estimates for the generalized post-processor, the same convergence rates follow automatically for the position-dependent post-processor. This is because the latter is based on a convex combination of two generalized post-processors with $2p + 1$ and $4p + 1$ B-splines respectively (cf. Section 4.4.2).
More precisely, we claim that Theorem 5.5.2 is also valid if $u_h^\star$ is the result of applying the position-dependent post-processor to $u_h$ (the constants involved are typically different though). To see this, consider Theorem 5.5.2 and suppose that $u^\star_{h,2p+1}$ and $u^\star_{h,4p+1}$ are the result of applying the generalized post-processor to $u_h$ with $2p + 1$ and $4p + 1$ B-splines respectively. Then, for $\theta : \Omega \to [0, 1]$, we have in any norm:
\[
\|u - u_h^\star\| := \big\| u - \big( \theta \, u^\star_{h,2p+1} + (1 - \theta) \, u^\star_{h,4p+1} \big) \big\| \le \big\| u - u^\star_{h,2p+1} \big\| + \big\| u - u^\star_{h,4p+1} \big\|.
\]
Application of Theorem 5.5.2 to both terms then yields (5.42) and (5.43) for the position-dependent post-processor. This means that it improves the convergence rate of a DG approximation from order $p + 1$ to order $2p + 1$ in the $L^2$-norm, and to order $\min\{2p + 1, 2p + 2 - d/2\}$ in the $L^\infty$-norm.
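The convex-combination argument can be illustrated on sampled values. The sketch below uses the maximum norm over a handful of sample points; the two "post-processed" vectors are arbitrary stand-ins, not outputs of an actual post-processor, and all names are hypothetical. The key step is writing $u = \theta u + (1 - \theta) u$, so that the blended error is the same convex combination of the two individual errors.

```python
from typing import List, Sequence


def blend(theta: Sequence[float], first: Sequence[float],
          second: Sequence[float]) -> List[float]:
    """Pointwise convex combination theta * first + (1 - theta) * second."""
    return [t * a + (1.0 - t) * b for t, a, b in zip(theta, first, second)]


def max_error(exact: Sequence[float], approx: Sequence[float]) -> float:
    """Maximum norm of the pointwise difference."""
    return max(abs(e - a) for e, a in zip(exact, approx))


# Arbitrary sample values standing in for u and two approximations of it.
exact = [0.0, 0.5, 1.0, 0.5, 0.0]
approx_a = [0.02, 0.49, 1.03, 0.52, -0.01]
approx_b = [-0.01, 0.51, 0.99, 0.50, 0.03]
theta = [1.0, 0.75, 0.5, 0.25, 0.0]  # blending weights in [0, 1]

blended_error = max_error(exact, blend(theta, approx_a, approx_b))
bound = max_error(exact, approx_a) + max_error(exact, approx_b)
```

In this pointwise setting the blended error is in fact bounded by the larger of the two individual errors, which is slightly stronger than the sum used in the text.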
We can also use the analysis in this chapter to explain the accuracy improvement near the boundary we observed earlier during the numerical experiments in Section 4.5. Similar to (5.32), we may write for the pointwise error in $\bar x \in \Omega$:
\[
|u - u_h^\star|(\bar x) \le |u - u^\star|(\bar x) + |u^\star - u_h^\star|(\bar x). \tag{5.44}
\]
Following the analysis in the previous sections, the accuracy of the second term is $O(h^{2p+1})$ (in the $L^2$-norm). For the first term, we can apply the bounds obtained in the proof of Lemma 5.3.1 (we omit the dependency on $u$ for simplicity):
\[
|u - u^\star|(\bar x) \lesssim \theta(\bar x) \, \|K_{2p+1}\|_{L^1(\mathbb{R}^d)} \, h^{2p+1} + \big( 1 - \theta(\bar x) \big) \|K_{4p+1}\|_{L^1(\mathbb{R}^d)} \, h^{4p+1}.
\]
Here, $K_{2p+1}$ and $K_{4p+1}$ denote the kernels corresponding to the generalized post-processor with $2p + 1$ and $4p + 1$ B-splines respectively. From this expression it can be seen that this part of the error is influenced by the $L^1$-norm of the kernel. The latter becomes larger as the kernel becomes less symmetric. At the same time, we are forced to switch to non-symmetric kernels near the boundary. Using $\theta = 0$ in this region (cf. Section 4.4.2), we can compensate the relatively large $L^1$-norm of the kernel by a larger order ($4p + 1$ rather than $2p + 1$). This basically explains our observations in Section 4.5: the use of extra kernel nodes near the boundary yields better accuracy.
5.6 Conclusion
This chapter derives theoretical error estimates for the position-dependent post-processor proposed in Chapter 4 for DG (upwind) approximations for linear
hyperbolic problems. We have found that it enhances the accuracy from order
p + 1 to order 2p + 1 in the L2 -norm, and to order min{2p + 1, 2p + 2 − d/2}
in the L∞ -norm (where p is the polynomial degree and d is the spatial dimension). This expands the L2 -estimates in [22] for the symmetric post-processor,
which cannot be applied near non-periodic boundaries and shocks. Altogether,
our theory explains the superconvergence observed during the numerical experiments in Chapter 4, and guarantees similar results for a certain class of linear
hyperbolic problems. Furthermore, our abstract formulation can be used to obtain similar error estimates for any approximation for which superconvergence
in the negative-order norm can be shown.
6 Conclusion
6.1 Introduction
This thesis is focused on the linear systems and hidden accuracy of Discontinuous Galerkin (DG) discretizations. In particular, it discusses the two-level preconditioner in [24], investigates an alternative strategy in the form of a deflation
method with the same coarse space, and studies the impact of both techniques
on the convergence of the Conjugate Gradient (CG) method for Symmetric
Interior Penalty (discontinuous) Galerkin (SIPG) discretizations for diffusion
problems. Moreover, this thesis considers the one-sided post-processor in [64],
proposes the position-dependent post-processor, and analyzes the impact of
both strategies on the accuracy and smoothness of DG (upwind) approximations for hyperbolic problems.
The remainder of this chapter summarizes the main conclusions of this research in the following manner: Section 6.2 considers the two-level methods for
solving the linear systems. Section 6.3 discusses post-processing for extracting the hidden accuracy. Finally, Section 6.4 provides suggestions for future
research.
6.2 Linear DG systems
We have found that both the two-level preconditioner and the deflation variant
yield scalable CG convergence (independent of the mesh element diameter).
This has been shown theoretically for any polynomial degree p ≥ 1, which
extends the available analysis for the preconditioning variant for p = 1 in [24].
The scalability of both methods is also confirmed by our numerical experiments for various diffusion problems with extreme contrasts in the coefficients.
These include problems mimicking bubbly flow, ground water flow, and the
presence of layers of sand and shale in oil reservoir simulations. These experiments also demonstrate that both two-level methods only yield fast CG
convergence provided that the penalty parameter is chosen dependent on local
values of the diffusion coefficient (using the largest limit value at discontinuities). The latter also benefits the accuracy of the SIPG discretization.
At the coarse level, both two-level methods need to solve the same coarse
system. The latter is similar to a system resulting from a central difference
scheme, for which very efficient solution techniques are readily available. In that
sense, both two-level methods transform the original challenging DG system
into a more familiar problem. It has been verified that the coarse systems can
be solved efficiently by using an inexact solver with relatively low accuracy, such
as the CG method combined with a scalable algebraic multigrid preconditioner.
The main difference between both methods is that the deflation method can
be implemented by skipping one of the two smoothing steps in the algorithm
for the preconditioning variant. This may be particularly advantageous for
expensive smoothers, although the basic block Jacobi smoother was found to
be highly effective for the problems under consideration. Although its cost per
iteration is lower, we have found that the deflation method can also require fewer
iterations, especially for large problems. As a result, it can be up to 35% faster
than the original preconditioner (in terms of the overall computational time in
seconds), provided that damping of the smoother is not taken into account. If
an optimal damping parameter is used, both two-level strategies yield similar
efficiency (deflation appears unaffected by damping). However, it remains an
open question how the damping parameter can best be selected in practice.
Altogether, this work contributes to shorter computational times of DG
discretizations, e.g. for oil reservoir simulations. This strengthens the growing consensus that DG methods can be an effective alternative to classical
discretization schemes, such as the Finite Volume Method (FVM).
6.3 Hidden DG accuracy
We have found that the proposed position-dependent post-processor enhances
the DG convergence from order p + 1 to order 2p + 1. This result has been
shown theoretically in both the L2- and the L∞-norm (with a slightly lower
order in the L∞-norm for higher-dimensional problems). This extends the L2-estimates in [22] for the symmetric post-processor, which cannot be applied
near non-periodic boundaries and shocks.
The aforementioned superconvergence of order 2p + 1 is also demonstrated
by our numerical experiments. These include problems with non-periodic
boundary conditions, problems with stationary shocks, a two-dimensional system, and a streamline visualization example. These experiments illustrate that
both the position-dependent post-processor and the original one-sided technique produce the same results in the domain interior. In that region, both
techniques apply the symmetric post-processor, resulting in the usual high level
of accuracy and smoothness.
The differences occur at the boundary of the domain: in those regions, the
position-dependent post-processor yields a more realistic smoothness without
the previous artificial stair-stepping effect. Furthermore, unlike before, it improves the local accuracy of the DG approximation in the entire domain, not
just in order but also in magnitude.
Altogether, this work contributes to more accurate visualization of DG approximations, e.g. in the form of streamlines. Furthermore, it sustains the
originally thought.
6.4 Future research
Suggestions for future research include the following:
1. The comparison of both two-level methods could be continued for more
advanced applications, e.g. for three-dimensional unstructured meshes,
or a strongly anisotropic diffusion tensor.
2. According to the present study, the coarse systems in the two-level methods can be solved efficiently using an inexact CG solver. To improve the
efficiency further, the Flexible CG (FCG) method could be studied to
reduce the number of inner iterations (cf. Section 2.4.3).
3. The derived convergence theory for the two-level methods is based on
the assumption that the scheme is coercive (or the condition used in
Section 3.5.3). To ensure this in applications, practical conditions for
the diffusion-dependent penalty parameter could be derived, possibly by
applying available global strategies in a local fashion (cf. Section 2.2.3).
4. The application of the two-level deflation method could be extended to
non-symmetric DG schemes, such as the Non-symmetric Interior Penalty
Galerkin (NIPG) method. The latter could be less sensitive to the choice
of the penalty parameter.
5. The theoretical error estimates for the post-processor apply to linear
hyperbolic problems with constant coefficients and periodic boundary
conditions. Nevertheless, the numerical results in this thesis suggest that
it is effective for a larger class of problems, including Dirichlet boundary
conditions and variable coefficients. Theoretical support for these findings could be a welcome step towards real-life applications. Additionally,
the position-dependent post-processor could be analyzed for unstructured
meshes and non-linear problems, e.g. by following [48, 42].
6. For large values of the polynomial degree p, further modification of the
position-dependent post-processor may be required to prevent the kernel support from becoming too large to fit the geometric setting, and to prevent
round-off errors from dominating (as the magnitude of the one-sided
kernel coefficients increases rapidly with p).
7. Numerical results suggest that there exists a relation between the post-processor and the L2-projection onto the space of piecewise polynomials
of degree 2p + 1 (cf. Section 4.5.7). This could be studied theoretically.
Acknowledgments
For their contributions to this dissertation, I would like to express my sincere
gratitude to ...
... Kees Vuik, for being my promotor and supervisor. Kees, we have been
working together since I chose “Een ‘lastig’ probleem” for my Bachelor’s
thesis. Over the years, you have become like an academic dad to me.
Without your faith and support, this dissertation would not have been
written.
... Ben de Pagter and Wim van Horssen, for supporting my position as a
PhD student at DIAM.
... my committee, for taking the time to evaluate this thesis. Special thanks
are due to Yvan Notay, for sending extensive and valuable comments on
Chapter 3. Furthermore I am thankful to Jan Dirk Jansen, for giving
relevant feedback on early versions of Chapter 2.
... Scott MacLachlan, for offering useful suggestions with respect to damping
at Copper Mountain.
... Jan van Neerven, Markus Haase, and Mark Veraar, for helping me with
derivations with an inspiring passion for functional analysis.
... Kees Lemmens, for turning all my computer issues into enjoyable mini
classes on Linux. I can never go back to Windows now.
... my colleagues in the Numerical Analysis group, for sharing interesting
and amusing ideas during tea talks and at the coffee machine. I have
warm memories of the times we were celebrating Sinterklaas, having a
Pakistani dinner, and losing with style to our computer science colleagues
at karting.
... Aletta Wubben and Louise van Swaaij, for teaching me the soft skills that
are easily overlooked at a technical university.
... Cor Kraaikamp, for offering a cup of tea and alternative views at just the
right moments.
... my friends and family, for supporting me, inviting me to fun outings, and
reminding me of who I was before I started this research.
... Sonja Cox, for being my paranymph and for analyzing the world with
me, including ourselves and, yes, our analyzing habits... Sonja, you are a
wonderful friend and I hope you will move back to the Netherlands soon.
... my parents, Gé and Mary van Slingerland, for giving me “een goede basis”
like on that day when it was snowing heavily, move me every time. To
quote the song, “You are the wind beneath my wings”.
... Ewoud Marijt, for being my paranymph and for encouraging me to think
in terms of possibilities. Ewoud, you always manage to make me smile,
even if I, for some reason, try not to. I am incredibly lucky to have you
by my side.
Paulien van Slingerland
Delft, May 2013
Curriculum Vitae
Paulien van Slingerland was born on May 19, 1983, in Leiderdorp, The Netherlands. After completing her secondary education at the Stedelijk Gymnasium
Leiden in 2001, she enrolled for Applied Mathematics at Delft University of
Technology. She was awarded the CIVI aanmoedigingsprijs for her propaedeutics diploma (cum laude) in 2002, after which she obtained her MSc. degree
(cum laude) in 2007. The time integration scheme that was the result of her
Master’s thesis is still being used by Deltares for simulating water quality. In
September 2007, Paulien started working as a PhD student at Delft University
of Technology. Initially, she studied coastal wave modeling in the Fluid Mechanics group (Faculty of Civil Engineering and Geosciences). After a year she
switched to the Numerical Analysis group (Faculty of Electrical Engineering,
Mathematics and Computer Science), which has resulted in this thesis, four refereed journal papers (one is accepted, three are in submission), and a prize for
the best poster presentation during the thirty-sixth Woudschoten conference
of the Werkgemeenschap Scientific Computing. Since January 2013, Paulien has been
working at TNO as a trainee.
Publications
Journal papers
- P. van Slingerland, C. Vuik. Scalable two-level preconditioning and deflation based on a piecewise constant subspace for (SIP)DG systems. Submitted to JCAM.
- P. van Slingerland, C. Vuik. Fast linear solver for pressure computation
in layered domains. Submitted to Comput. Geosci.
- L. Ji, P. van Slingerland, J.K. Ryan, C. Vuik. Superconvergent error estimates for position-dependent smoothness-increasing accuracy-conserving
(SIAC) post-processing of Discontinuous Galerkin Solutions. Accepted
for publication in Math. Comp.
- P. van Slingerland, J.K. Ryan, C. Vuik. Position-dependent smoothness-increasing accuracy-conserving (SIAC) filtering for improving Discontinuous Galerkin solutions. SIAM J. Sci. Comp., 33(2011), pp 802–825.
Technical reports
- P. van Slingerland, C. Vuik. Scalable two-level preconditioning and deflation based on a piecewise constant subspace for (SIP)DG systems. DIAM
report 12-11, Delft University of Technology, 2012.
- P. van Slingerland, C. Vuik. Fast linear solver for pressure computation
in layered domains. DIAM report 12-10, Delft University of Technology,
2012.
- P. van Slingerland, C. Vuik. Spectral two-level deflation for DG: a preconditioner for CG that does not need symmetry. DIAM report 11-12, Delft
University of Technology, 2011.
- P. van Slingerland, J.K. Ryan, C. Vuik. Smoothness-increasing convergence-conserving spline filters applied to streamline visualisation of DG
approximations. DIAM report 09-06, Delft University of Technology,
2009.
- P. van Slingerland, M. Borsboom, C. Vuik. A local theta scheme for
advection problems with strongly varying meshes and velocity profiles.
DIAM report 08-17, Delft University of Technology, 2008.
Talks at international conferences
- Fast linear solver for pressure computation in layered domains. 13th European conference on the mathematics of oil recovery (ECMOR). Biarritz,
France, 2012.
- A preconditioner for CG that does not need symmetry. Twelfth Copper
Mountain conference on iterative methods (COPPER). Copper Mountain
(Colorado), United States of America, 2012.
- Exploiting the nested block structure of DG matrices: a block ILU preconditioner with deflation and a spectral two-level strategy. International
conference on preconditioning techniques for scientific and industrial applications (PRECOND). Bordeaux, France, 2011.
- A local theta scheme for advection problems with strongly varying meshes
and velocity profiles. The mathematics of finite elements and applications
(MAFELAP). Brunel University, London, England, 2009.
- A robust higher-order variable-θ scheme for the advection diffusion equation on unstructured grids. 2nd international conference on high order
non-oscillatory methods for wave propagation, transport and flow problems. Trento, Italy, 2007.
Other talks
- Spectral two-level deflation for DG: a preconditioner for CG that does
not need symmetry .... Discontinuous Galerkin methods in computational electromagnetics: a workshop on recent developments in theory
and applications. National Aerospace Laboratory of the Netherlands,
Amsterdam, The Netherlands, 2011.
- Extracting the hidden accuracy of DG solutions. Spring meeting Werkgemeenschap Scientific Computing. Antwerp, Belgium, 2010.
- A local theta scheme for advection (dominated) problems with strongly
varying meshes and velocity profiles. Invited talk at the Institute of Applied Mathematics. Dortmund University of Technology, Dortmund, Germany, 2009.
- An accurate and robust local theta FCT scheme for the advection equation
for strongly varying meshes and velocity profiles. Meeting Kontactgroep
Numerieke Stromingsleer. University of Twente, Enschede, The Netherlands, 2007.
Poster presentations
- A preconditioner for CG that does not need symmetry. NWO-JSPS joint
seminar: numerical linear algebra - algorithms, applications, and training.
Delft University of Technology, Delft, 2012.
- A preconditioner for CG that does not need symmetry. Thirty-sixth
Woudschoten conference, Werkgemeenschap Scientific Computing. Zutphen, The Netherlands, 2011.
- The hidden accuracy of DG. Thirty-fifth Woudschoten conference, Werkgemeenschap Scientific Computing. Zutphen, The Netherlands, 2010.
- Post-processing for DG: improving the accuracy near boundaries and
shocks. Opening symposium Applied Mathematics Institute. Delft University of Technology, Delft, The Netherlands, 2010.
- Post-processing for DG. Thirty-fourth Woudschoten conference, Werkgemeenschap Scientific Computing. Zutphen, The Netherlands, 2009.
- Smoothness-increasing accuracy-conserving spline filters applied to
streamline visualisation of DG approximations. Burgersdag 2009, J.M.
Burgerscentrum. Eindhoven University of Technology, Eindhoven, The
Netherlands, 2009.
Bibliography
[1] S. Adjerid, K. D. Devine, J. E. Flaherty, and Lilia Krivodonova. A posteriori error estimation for discontinuous Galerkin solutions of hyperbolic
problems. Comput. Methods Appl. Mech. Engrg., 191(11-12):1097–1112,
2002.
[2] S. Adjerid and T. C. Massey. Superconvergence of discontinuous Galerkin
solutions for a nonlinear scalar hyperbolic problem. Comput. Methods
Appl. Mech. Engrg., 195(25-28):3331–3346, 2006.
[3] P. F. Antonietti and B. Ayuso. Schwarz domain decomposition preconditioners for discontinuous Galerkin approximations of elliptic problems:
non-overlapping case. M2AN Math. Model. Numer. Anal., 41(1):21–54,
2007.
[4] D. N. Arnold, F. Brezzi, B. Cockburn, and L. D. Marini. Unified analysis
of discontinuous Galerkin methods for elliptic problems. SIAM J. Numer.
Anal., 39(5):1749–1779 (electronic), 2002.
[5] O. Axelsson. Iterative solution methods. Cambridge University Press,
Cambridge, 1994.
[6] O. Axelsson and P. S. Vassilevski. Variable-step multilevel preconditioning
methods. I. Selfadjoint and positive definite elliptic problems. Numer.
Linear Algebra Appl., 1(1):75–101, 1994.
[7] B. Ayuso de Dios, M. Holst, Y. Zhu, and L. Zikatanov. Multilevel preconditioners for discontinuous Galerkin approximations of elliptic problems
with jump coefficients. arXiv:1012.1287v2, 2012.
[8] D.H. Bailey, Y. Hida, X. S. Li, and B. Thompson. ARPREC: an arbitrary precision computation package. Technical Report 53651, Lawrence
Berkeley National Laboratory, September 2002.
[9] V. I. Bogachev. Measure theory. Springer-Verlag, Berlin, 2007.
[10] J. H. Bramble and A. H. Schatz. Estimates for spline projections. Rev.
Française Automat. Informat. Recherche Opérationnelle, 10(R-2):5–37,
1976.
[11] J. H. Bramble and A. H. Schatz. Higher order local accuracy by averaging
in the finite element method. Math. Comp., 31(137):94–111, 1977.
[12] James H. Bramble, Joachim A. Nitsche, and Alfred H. Schatz. Maximum-norm interior estimates for Ritz-Galerkin methods. Math. Comput.,
29:677–688, 1975.
[13] S. C. Brenner and J. Zhao. Convergence of multigrid algorithms for interior
penalty methods. Appl. Numer. Anal. Comput. Math., 2(1):3–18, 2005.
[14] V. I. Burenkov. Sobolev spaces on domains, volume 137 of Teubner-Texte
zur Mathematik [Teubner Texts in Mathematics]. B. G. Teubner Verlagsgesellschaft mbH, Stuttgart, 1998.
[15] E. Burman and P. Zunino. A domain decomposition method based
on weighted interior penalties for advection-diffusion-reaction problems.
SIAM J. Numer. Anal., 44(4):1612–1638 (electronic), 2006.
[16] W. Cai, D. Gottlieb, and C.-W. Shu. On one-sided filters for spectral Fourier approximations of discontinuous functions. SIAM J. Numer.
Anal., 29(4):905–916, 1992.
[17] P. Castillo. Performance of discontinuous Galerkin methods for elliptic
PDEs. SIAM J. Sci. Comput., 24(2):524–547, 2002.
[18] P. G. Ciarlet. The finite element method for elliptic problems, volume 4
of Studies in mathematics and its applications. North-Holland, New York,
1978.
[19] B. Cockburn, B. Dong, and J. Guzmán. Optimal convergence of the original DG method for the transport-reaction equation on special meshes.
SIAM J. Numer. Anal., 46(3):1250–1265, 2008.
[20] B. Cockburn, B. Dong, J. Guzmán, and J. Qian. Optimal convergence of
the original DG method on special meshes for variable transport velocity.
SIAM J. Numer. Anal., 48(1):133–146, 2010.
[21] B. Cockburn, C. Johnson, C.-W. Shu, and E. Tadmor. Advanced numerical
approximation of nonlinear hyperbolic equations, volume 1697 of Lecture
Notes in Mathematics. Springer-Verlag, Berlin, 1998. Papers from the
C.I.M.E. Summer School held in Cetraro, June 23–28, 1997, Edited by
Alfio Quarteroni, Fondazione C.I.M.E. [C.I.M.E. Foundation].
[22] B. Cockburn, M. Luskin, C.-W. Shu, and E. Süli. Enhanced accuracy by
post-processing for finite element methods for hyperbolic equations. Math.
Comp., 72(242):577–606, 2003.
[23] S. Curtis, R. M. Kirby, J. K. Ryan, and C.-W. Shu. Postprocessing for
the discontinuous Galerkin method over nonuniform meshes. SIAM J. Sci.
Comput., 30(1):272–289, 2007.
[24] V. A. Dobrev, R. D. Lazarov, P. S. Vassilevski, and L. T. Zikatanov.
Two-level preconditioning of discontinuous Galerkin approximations of
second-order elliptic equations. Numer. Linear Algebra Appl., 13(9):753–
770, 2006.
[25] Veselin A. Dobrev, Raytcho D. Lazarov, and Ludmil T. Zikatanov. Preconditioning of symmetric interior penalty discontinuous Galerkin FEM for
elliptic problems. In Domain decomposition methods in science and engineering XVII, volume 60 of Lect. Notes Comput. Sci. Eng., pages 33–44.
Springer, Berlin, 2008.
[26] Zdeněk Dostál. Conjugate gradient method with preconditioning by projector. International Journal of Computer Mathematics, 23(3-4):315–323,
1988.
[27] Maksymilian Dryja. On discontinuous Galerkin methods for elliptic problems with discontinuous coefficients. Comput. Methods Appl. Math.,
3(1):76–85 (electronic), 2003. Dedicated to Raytcho Lazarov.
[28] Y. Epshteyn and B. Rivière. Estimation of penalty parameters for symmetric interior penalty Galerkin methods. J. Comput. Appl. Math.,
206(2):843–872, 2007.
[29] A. Ern, A.F. Stephansen, and P. Zunino. A discontinuous Galerkin method
with weighted averages for advection-diffusion equations with locally small
and anisotropic diffusivity. IMA J. Numer. Anal., 29(2):235–256, 2009.
[30] L. C. Evans. Partial differential equations, volume 19 of Graduate Studies
in Mathematics. American Mathematical Society, Providence, RI, 1998.
[31] R. D. Falgout, P. S. Vassilevski, and L. T. Zikatanov. On two-grid convergence estimates. Numer. Linear Algebra Appl., 12(5-6):471–494, 2005.
[32] X. Feng and O. A. Karakashian. Two-level additive Schwarz methods for
a discontinuous Galerkin approximation of second order elliptic problems.
SIAM J. Numer. Anal., 39(4):1343–1365 (electronic), 2001.
[33] K. J. Fidkowski, T. A. Oliver, J. Lu, and D. L. Darmofal. p-Multigrid
solution of high-order discontinuous Galerkin discretizations of the compressible Navier-Stokes equations. J. Comput. Phys., 207(1):92–113, 2005.
[34] J. Gopalakrishnan and G. Kanschat. A multilevel discontinuous Galerkin
method. Numer. Math., 95(3):527–550, 2003.
[35] D. Gottlieb, C.-W. Shu, A. Solomonoff, and H. Vandeven. On the Gibbs
phenomenon. I. Recovering exponential accuracy from the Fourier partial
sum of a nonperiodic analytic function. J. Comput. Appl. Math., 43(1-2):81–98, 1992.
[36] S. Gottlieb and C.-W. Shu. Total variation diminishing Runge-Kutta
schemes. Mathematics of Computation, 67:73–85, 1998.
[37] S. Gottlieb, C.-W. Shu, and E. Tadmor. Strong stability preserving high-order time discretization methods. SIAM Review, 43:89–112, 2001.
[38] J. S. Hesthaven, S. Gottlieb, and D. Gottlieb. Spectral methods for time-dependent problems, volume 21 of Cambridge Monographs on Applied and
Computational Mathematics. Cambridge University Press, Cambridge,
2007.
[39] J.S. Hesthaven and T. Warburton. Nodal discontinuous Galerkin methods,
volume 54 of Texts in Applied Mathematics. Springer, New York, 2008.
Algorithms, analysis, and applications.
[40] R.A. Horn and C.R. Johnson. Matrix analysis. Cambridge University
Press, Cambridge, 1988.
[41] Liangyue Ji, Yan Xu, and Jennifer K. Ryan. Accuracy-enhancement
of discontinuous Galerkin solutions for convection-diffusion equations in
multiple-dimensions. Math. Comp., 81(280):1929–1950, 2012.
[42] Liangyue Ji, Yan Xu, and Jennifer K Ryan. Negative-order norm estimates
for nonlinear hyperbolic conservation laws. J. Sci. Comput., 54(2-3):531–
548, 2013.
[43] K. Johannsen. A symmetric smoother for the nonsymmetric interior
penalty discontinuous Galerkin discretization. Technical Report ICES Report 05-23, University of Texas at Austin, 2005.
[44] C. Johnson and J. Pitkäranta. An analysis of the discontinuous Galerkin
method for a scalar hyperbolic equation. Math. Comp., 46(173):1–26, 1986.
[45] R. B. Lowrie. Compact higher-order numerical methods for hyperbolic
conservation laws. PhD thesis, University of Michigan, 1996.
[46] Lois Mansfield. On the use of deflation to improve the convergence of conjugate gradient iteration. Communications in Applied Numerical Methods,
4(2):151–156, 1988.
[47] H. Mirzaee, L. Ji, J. K. Ryan, and R. M. Kirby. Smoothness-increasing
accuracy-conserving (SIAC) postprocessing for discontinuous Galerkin
solutions over structured triangular meshes. SIAM J. Numer. Anal.,
49(5):1899–1920, 2011.
[48] H. Mirzaee, J. King, J.K. Ryan, and R.M. Kirby. Smoothness-increasing
accuracy-conserving filters for discontinuous Galerkin solutions over unstructured triangular meshes. SIAM J. Sci. Comput., 35(1):A212–A230,
2013.
[49] H. Mirzaee, J. K. Ryan, and R. M. Kirby. Quantification of errors introduced in the numerical approximation and implementation of smoothness-increasing accuracy conserving (SIAC) filtering of discontinuous Galerkin
(DG) fields. J. Sci. Comput., 45(1-3):447–470, 2010.
[50] H. Mirzaee, J. K. Ryan, and R. M. Kirby. Efficient implementation of
smoothness-increasing accuracy-conserving (SIAC) filters for discontinuous Galerkin solutions. J. Sci. Comput., 52(1):85–112, 2012.
[51] M. S. Mock and P. D. Lax. The computation of discontinuous solutions
of linear hyperbolic equations. Comm. Pure Appl. Math., 31(4):423–430,
1978.
[52] R. Nabben and C. Vuik. A comparison of deflation and coarse grid correction applied to porous media flow. SIAM J. Numer. Anal., 42(4):1631–
1647 (electronic), 2004.
[53] R. Nabben and C. Vuik. A comparison of deflation and the balancing
preconditioner. SIAM J. Sci. Comput., 27(5):1742–1759 (electronic), 2006.
[54] R. Nabben and C. Vuik. A comparison of abstract versions of deflation,
balancing and additive coarse grid correction preconditioners. Numerical
Linear Algebra with Applications, 15(4):355–372, 2008.
[55] R. A. Nicolaides. Deflation of conjugate gradients with applications to
boundary value problems. SIAM J. Numer. Anal., 24(2):355–365, 1987.
[56] Y. Notay. Flexible conjugate gradients. SIAM J. Sci. Comput.,
22(4):1444–1460 (electronic), 2000.
[57] Yvan Notay. Algebraic analysis of two-grid methods: The nonsymmetric
case. Numer. Linear Algebra Appl., 17(1):73–96, 2010.
[58] P.-O. Persson and J. Peraire. Newton-GMRES preconditioning for discontinuous Galerkin discretizations of the Navier-Stokes equations. SIAM J.
Sci. Comput., 30(6):2709–2733, 2008.
[59] F. Prill, M. Lukáčová-Medviďová, and R. Hartmann. Smoothed aggregation multigrid for the discontinuous Galerkin method. SIAM J. Sci.
Comput., 31(5):3503–3528, 2009.
[60] J. Proft and B. Rivière. Discontinuous Galerkin methods for convection-diffusion equations for varying and vanishing diffusivity. Int. J. Numer.
Anal. Model., 6(4):533–561, 2009.
[61] G.R. Richter. An optimal-order error estimate for the discontinuous
Galerkin method. Math. Comp., 50(181):75–88, 1988.
[62] G.R. Richter. On the order of convergence of the discontinuous Galerkin
method for hyperbolic equations. Math. Comp., 77(264):1871–1885, 2008.
[63] B. Rivière. Discontinuous Galerkin methods for solving elliptic and
parabolic equations, volume 35 of Frontiers in Applied Mathematics. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA,
2008. Theory and implementation.
[64] J. K. Ryan and C.-W. Shu. On a one-sided post-processing technique for
the discontinuous Galerkin methods. Methods Appl. Anal., 10(2):295–307,
2003.
[65] J. K. Ryan, C.-W. Shu, and H. Atkins. Extension of a postprocessing technique for the discontinuous Galerkin method for hyperbolic equations with
application to an aeroacoustic problem. SIAM J. Sci. Comput., 26(3):821–
843, 2005.
[66] J.K. Ryan and B. Cockburn. Local derivative post-processing for the
discontinuous Galerkin method. J. Comput. Phys., 228(23):8642–8664,
2009.
[67] Y. Saad. Iterative methods for sparse linear systems. This is a revised
version of the book published in 1996 by PWS Publishing, Boston. It
2000.
[68] Y. Saad and B. Suchomel. ARMS: an algebraic recursive multilevel solver
for general sparse linear systems. Numer. Linear Algebra Appl., 9(5):359–
378, 2002.
[69] Y. Saad, M. Yeung, J. Erhel, and F. Guyomarc’h. A deflated version of
the conjugate gradient algorithm. SIAM J. Sci. Comput., 21(5):1909–1926,
December 1999.
[70] I. J. Schoenberg. Cardinal spline interpolation. SIAM, Philadelphia, Pa.,
1973.
[71] L. L. Schumaker. Spline functions: basic theory. John Wiley & Sons Inc.,
New York, 1981.
[72] S. J. Sherwin, R. M. Kirby, J. Peiró, R. L. Taylor, and O. C. Zienkiewicz.
On 2D elliptic discontinuous Galerkin methods. Internat. J. Numer. Methods Engrg., 65(5):752–784, 2006.
[73] M. Steffen, S. Curtis, R. M. Kirby, and J. K. Ryan. Investigation of
smoothness-increasing accuracy-conserving filters for improving streamline integration through discontinuous fields. IEEE Transactions on Visualization and Computer Graphics, 14:680–692, 2008.
[74] K. Stüben. An introduction to algebraic multigrid. In U. Trottenberg,
C. W. Oosterlee, and A. Schüller, editors, Multigrid, pages 413–532. Academic Press, 2001.
[75] J. M. Tang, S. P. MacLachlan, R. Nabben, and C. Vuik. A comparison
of two-level preconditioners based on multigrid and deflation. SIAM J.
Matrix Anal. Appl., 31(4):1715–1739, 2010.
[76] J. M. Tang, R. Nabben, C. Vuik, and Y. A. Erlangga. Comparison of two-level preconditioners derived from deflation, domain decomposition and
multigrid methods. J. Sci. Comput., 39(3):340–370, 2009.
[77] V. Thomée. High order local approximations to derivatives in the finite
element method. Math. Comp., 31(139):652–660, 1977.
[78] V. Thomée. Negative norm estimates and superconvergence in Galerkin
methods for parabolic problems. Math. Comp., 34(149):93–113, 1980.
[79] P. S. Vassilevski. Multilevel block factorization preconditioners. Springer,
New York, 2008. Matrix-based analysis and algorithms for solving finite
element equations.
[80] C. Vuik, A. Segal, and J.A. Meijerink. An efficient preconditioned CG
method for the solution of a class of layered problems with extreme contrasts in the coefficients. Journal of Computational Physics, 152:385–403,
1999.
[81] C. Vuik, A. Segal, J.A. Meijerink, and G.T. Wijma. The construction of
projection vectors for a Deflated ICCG method applied to problems with
extreme contrasts in the coefficients. Journal of Computational Physics,
172:426–450, 2001.
[82] D. Walfish, J. K. Ryan, R. M. Kirby, and R. Haimes. One-sided
smoothness-increasing accuracy-conserving filtering for enhanced streamline integration through discontinuous fields. Journal of Scientific Computing, 38:164–184, 2009.
[83] Jinchao Xu. Iterative methods by space decomposition and subspace correction. SIAM Rev., 34(4):581–613, 1992.
[84] I. Yavneh. Why multigrid methods are so efficient. Computing in Science
& Engineering, 8(6):12–22, 2006.
[85] T. Zhang and Z. Li. Optimal error estimate and superconvergence of the
DG method for first-order hyperbolic problems. J. Comput. Appl. Math.,
235(1):144–153, 2010.
```