Couplex Benchmark Computations with UG

Peter Bastian and Stefan Lang
Interdisciplinary Center for Scientific Computing, Universität Heidelberg, Im Neuenheimer Feld 368, 69120 Heidelberg, Germany

November 29, 2002

Abstract. This paper describes the numerical results for the Couplex benchmark obtained with the simulation software UG using vertex centered finite volume and higher order discontinuous Galerkin schemes. Multigrid solvers on unstructured grids, local mesh refinement and parallel computation are employed to yield very accurate solutions. Since the full range of results required in the benchmarks is too large to be displayed in this paper, we focus on the comparison of discretization schemes, the assessment of numerical errors and the presentation of parallel computations.

Keywords: flow, transport, finite volume scheme, discontinuous Galerkin scheme, parallel computation

1. Introduction

In 2001 ANDRA (agence nationale pour la gestion des déchets radioactifs) proposed a benchmark for porous medium flow and transport simulations in connection with the safety assessment of underground waste repositories in clay formations. The complete specification of the benchmark is given in [this issue]. This paper describes the results obtained with the simulation software UG [8] and the numerical methods that have been used. The focus of this work is on the comparison of discretization schemes, the assessment of numerical errors through mesh refinement studies and the presentation of parallel computations.

In order to be self-contained we briefly review the relevant model equations and the Couplex benchmark. Let $\Omega$ be a domain in $\mathbb{R}^d$, $d = 2, 3$, with outward unit normal $n$. The equation for groundwater flow in head-based formulation is given by

$$ \nabla \cdot u = f \quad \text{in } \Omega, \qquad u = -K \nabla H, \tag{1} $$

with Dirichlet boundary conditions $H = H_0$ on $\Gamma_H$ and flux boundary conditions $u \cdot n = U$ on the boundary $\Gamma_U$. $H$ is the hydraulic head, $u$ is the Darcy velocity and $K$ is the permeability tensor.
The challenge for numerical methods solving (1) is to achieve high accuracy for the flow velocity $u$, which subsequently enters the transport equation. Moreover, the permeability tensor may vary over many orders of magnitude.

© 2002 Kluwer Academic Publishers. Printed in the Netherlands. couplex.tex; 29/11/2002; 16:23; p.1

The generic mass balance equation including convective and dispersive transport as well as radioactive decay reads

$$ R\omega \left( \frac{\partial C}{\partial t} + \lambda C \right) + \nabla \cdot j = q(C) \quad \text{in } \Omega, \qquad j = uC - D(u)\nabla C, \tag{2} $$

with Dirichlet boundary conditions $C = C_0$ on $\Gamma_C$, flux boundary conditions $j \cdot n = J$ on the boundary $\Gamma_J$ and outflow boundary conditions $j \cdot n = (uC - D(u)\nabla C) \cdot n$ on $\Gamma_O$. We assume that $u \cdot n \ge 0$ on $\Gamma_O$. $R$ is the retardation factor, $\omega$ the effective porosity, $\lambda = \log 2 / T$ with $T$ the half-life of the element, and $D$ the diffusion/dispersion tensor.

The Couplex benchmark consists of three phases. Couplex 1 requires a far field simulation in a layered aquifer system of 25 km × 700 m. Nuclide transport is described by linear transport and decay (2) in a stationary flow field given by (1). The waste repository is modelled by a source term $q(x)$ located in a region of 3 km × 6 m. The difficulty of Couplex 1 lies in the anisotropic repository geometry combined with the large range of concentration values $C \in [10^{-12}, 10^{-2}]$ to be computed. Moreover, the permeability $K$ varies by 7 orders of magnitude. Couplex 2 requires a detailed three-dimensional simulation of a single elementary cell of the repository, which is assumed to be repeated periodically. In total the transport of 10 components has to be computed, some of which depend on each other through a nonlinear dissolution-precipitation term $q(C)$. Additionally, Couplex 2 requires the implementation of periodic boundary conditions, which is non-trivial for unstructured mesh codes. Couplex 3 is an extension of Couplex 1 with the source term being the result of the Couplex 2 simulation.
Equation (1) is an elliptic equation for the hydraulic head $H$. Transport simulations require an accurate approximation of the velocity $u$ in the presence of large permeability contrasts. Moreover, the computed flow field should be locally conservative. In this work we use the vertex centered finite volume scheme and the higher order discontinuous Galerkin method for the solution of the flow equation. The latter scheme is comparable in accuracy and efficiency with the mixed finite element method.

The transport equation (2) is hyperbolic for $D = 0$, otherwise it is parabolic. If $D$ is small compared to $u$ the equation is singularly perturbed. It is formally parabolic but may exhibit certain features of hyperbolic equations, e.g. solutions with sharp fronts. Let us define the dimensionless numbers

$$ Cr = |u|\,\Delta t / h \quad \text{and} \quad Pe = |u|\,h / D, \tag{3} $$

where we assumed that $D$ is isotropic. The Courant number $Cr$ describes the propagation speed in terms of mesh cells and $Pe$ measures the relative strength of convection and diffusion. If the mesh size $h$, velocity $u$ and diffusion coefficient $D$ vary spatially, the Courant and Peclet numbers are defined locally.

State-of-the-art methods for solving time-dependent hyperbolic problems ($Pe = \infty$) are the high-resolution schemes such as the higher-order Godunov method, ENO schemes [29] and the Runge-Kutta discontinuous Galerkin method [19, 17, 21]. These schemes are constructed by combining a conservative spatial discretization with explicit TVD time-stepping schemes and slope or flux limiters. Limiters are necessary to remove unphysical oscillations from the numerical solution. In order to ensure stability, an upper bound on $Cr$ is required. Implicit schemes are usually not used for transient hyperbolic problems because accuracy demands time steps similar in size to those of explicit schemes and unphysical oscillations are hard to avoid.
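As a small illustration of the definitions in (3), the local Courant and Peclet numbers can be evaluated cell by cell. The following Python sketch is not taken from UG; the function names and the representation of a cell as a tuple (speed, mesh size, diffusion coefficient) are assumptions made only for this example, and $D$ is treated as an isotropic scalar as in (3).

```python
def courant_number(speed, dt, h):
    """Cr = |u| * dt / h for one cell (speed is |u|)."""
    return speed * dt / h


def peclet_number(speed, h, diffusion):
    """Pe = |u| * h / D for one cell; infinite in the hyperbolic limit D = 0."""
    if diffusion == 0.0:
        return float("inf")
    return speed * h / diffusion


def local_numbers(cells, dt):
    """Return (Cr, Pe) for every cell.

    `cells` is a list of hypothetical (speed, h, diffusion) tuples,
    one entry per mesh cell, so that both numbers are defined locally.
    """
    return [
        (courant_number(speed, dt, h), peclet_number(speed, h, diffusion))
        for (speed, h, diffusion) in cells
    ]
```

Calling `local_numbers` with per-cell data and the current time step gives the spatial distribution of both numbers, which is the quantity that decides between the explicit and implicit regimes discussed in this section.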
High-resolution implicit schemes are possible but require the solution of nonlinear algebraic equations while having the same time-step restriction as explicit schemes [28]. Higher-order implicit schemes are efficient, however, if the solution is smooth and large time steps are required.

Efficient methods for the parabolic case $Pe = 0$ are constructed from spatial discretizations for elliptic problems combined with implicit temporal discretizations for stiff ordinary differential equations (method of lines approach). Since solutions tend to get smoother with time, large time steps should be possible.

For problems with $0 < Pe < \infty$, many different methods have been proposed. For large Peclet numbers explicit methods are very efficient. One can show for standard schemes (second order Godunov with cell centered finite volumes) that $Cr < 1 < Pe$ allows the use of fully explicit schemes. The time step must be restricted to ensure stability, but accuracy demands a time step of similar size. Methods of this type are presented in [22, 20, 1]. If $Pe < 1$ the fully implicit approach [34] is applicable because no sharp fronts are present in the solution.

In the Couplex benchmark the Peclet number varies in space over several orders of magnitude. In that case operator splitting methods may be used which treat the convective part explicitly and the diffusive part implicitly. If the solution is smooth, as is the case in the Couplex benchmark, higher order implicit schemes are the method of choice. Moreover, the Couplex benchmark demands the computation of long time intervals with a nearly stationary solution. This can only be done efficiently with implicit methods.

We note that there are alternative approaches to solve the transport equation based on the method of characteristics [23, 15, 3, 40]. These methods maintain sharp fronts while allowing large time steps.

The paper is organized as follows.
In Sections 2 and 3 we describe the vertex centered finite volume method and the higher order discontinuous Galerkin scheme. The PDE software framework UG underlying all the computations is described in Section 4, while Sections 5 and 6 contain the numerical results for Couplex 1 and Couplex 2, respectively. Conclusions are given in Section 7.

2. Vertex Centered Finite Volume Scheme

In this section we present the vertex centered finite volume (VCFV) scheme, which has been widely used in computational fluid mechanics [38, 30, 13]. Combined with the fractional step θ time stepping procedure it is used for Couplex 1 as well as Couplex 2.

2.1. Notation

Let $E_h = \{e_1, \ldots, e_{n_h}\}$ be a non-degenerate quasi-uniform subdivision of $\Omega$ where $e \in E_h$ is a triangle or quadrilateral if $d = 2$ and a tetrahedron, pyramid, prism or hexahedron with planar faces if $d = 3$. Let $h$ denote the maximum diameter of the elements in $E_h$. In $E_h$ the intersection of two elements is either a vertex, an edge or a face (if $d = 3$). The domain covered by $e \in E_h$ is denoted by $\Omega_e$ and the outward unit normal to $\Omega_e$ is $n_e$.

Let $B_h = \{b_1, \ldots, b_{m_h}\}$ be a secondary subdivision of $\Omega$ constructed from $E_h$ as follows: In two dimensions connect the barycenter of an element with the midpoints of the edges of the element. This construction is shown in Figure 1. In three dimensions the barycenter of an element is connected to edge midpoints and face barycenters. The boxes $b \in B_h$ are the polygonal regions associated with each vertex of the mesh. The domain covered by $b \in B_h$ is denoted by $\Omega_b$. The position of the vertex associated with $b \in B_h$ is $x_b$. Note that if $b$ is associated with a boundary vertex we have $x_b \in \partial\Omega_b$, cf. $v_j$ in Figure 1. The outward unit normal to $\Omega_b$ is denoted by $n_b$.

We define the internal skeleton

$$ \Gamma_{\mathrm{int}} = \{ \gamma_{e,b,b'} \mid \gamma_{e,b,b'} = \Omega_e \cap \partial\Omega_b \cap \partial\Omega_{b'} \text{ for } e \in E_h,\ b, b' \in B_h \} \tag{4} $$

where $\gamma_{e,b,b'} \subseteq \mathbb{R}^{d-1}$ is the intersection of two boxes within an element.
Correspondingly, the external skeleton is defined as

$$ \Gamma_{\mathrm{ext}} = \{ \gamma_{e,b} \mid \gamma_{e,b} = \partial\Omega_e \cap \partial\Omega_b \cap \partial\Omega \text{ for } e \in E_h,\ b \in B_h \}. \tag{5} $$

With each $\gamma_{e,b,b'} \in \Gamma_{\mathrm{int}}$ we associate a unit normal $n$. The orientation can be selected arbitrarily. With any $\gamma_{e,b} \in \Gamma_{\mathrm{ext}}$ we associate the unit normal $n$ oriented outward to $\Omega$.

Figure 1. Construction of secondary mesh. Box $b_i$ is associated with vertex $v_i$.

For any $x \in \gamma \in \Gamma_{\mathrm{int}}$ we denote the jump of a function $v$ by

$$ [v](x) = \lim_{\varepsilon \to 0+} v(x + \varepsilon n) - \lim_{\varepsilon \to 0+} v(x - \varepsilon n). \tag{6} $$

Note that the jump is well defined for functions $v$ that are discontinuous on the skeleton $\Gamma_{\mathrm{int}}$.

Finally we need the following finite element spaces.

$$ V_h = \{ v \in C^0(\Omega) \mid v \text{ is linear on } e \in E_h \} \tag{7} $$

is the standard conforming, piecewise linear finite element space. Its subspace necessary for homogeneous Dirichlet boundary conditions is

$$ V_h^0 = \{ v \in V_h \mid v|_{\Gamma_H} = 0 \}. \tag{8} $$

We also need the following discontinuous function space

$$ W_h = \{ w \in L^2(\Omega) \mid w \text{ is constant on } b \in B_h \} \tag{9} $$

and its subspace

$$ W_h^0 = \{ w \in W_h \mid w|_{\Gamma_H} = 0 \}. \tag{10} $$

2.2. Scheme

The VCFV method can be written as a Petrov-Galerkin finite element method with continuous trial and discontinuous test functions. Consider the case $H_0 = 0$ (homogeneous Dirichlet boundary conditions). Then the VCFV scheme applied to the flow equation is defined as follows: Find $H \in V_h^0$ such that for all $w \in W_h^0$

$$ -\sum_{\gamma \in \Gamma_{\mathrm{int}}} \int_\gamma (K\nabla H) \cdot n\,[w]\,ds = \sum_{b \in B_h} \int_{\Omega_b} f w\,dx - \sum_{\gamma \in \Gamma_{\mathrm{ext}} \cap \Gamma_U} \int_\gamma U w\,ds. \tag{11} $$

The case of inhomogeneous Dirichlet boundary conditions is treated as in the standard finite element method.

The VCFV scheme applied to the transport equation is written in semi-discrete form.
Again in the case of homogeneous Dirichlet boundary conditions $C_0 = 0$ the problem is to find $C(t) : [0, T] \to V_h^0$ such that for all $w \in W_h^0$ we have

$$
\begin{aligned}
&\frac{\partial}{\partial t} \sum_{b \in B_h} \int_{\Omega_b} R\omega C w\,dx + \sum_{b \in B_h} \int_{\Omega_b} R\omega\lambda C w\,dx \\
&\quad + \sum_{\gamma \in \Gamma_{\mathrm{int}}} \left( \int_\gamma C^* u \cdot n\,[w]\,ds - \int_\gamma (D\nabla C) \cdot n\,[w]\,ds \right) \\
&\quad + \sum_{\gamma \in \Gamma_{\mathrm{ext}} \cap \Gamma_O} \left( \int_\gamma C^* u \cdot n\,w\,ds - \int_\gamma (D\nabla C) \cdot n\,w\,ds \right) \\
&= \sum_{b \in B_h} \int_{\Omega_b} q w\,dx - \sum_{\gamma \in \Gamma_{\mathrm{ext}} \cap \Gamma_J} \int_\gamma J w\,ds.
\end{aligned}
\tag{12}
$$

The evaluation of $C$ at $x \in \gamma_{e,b,b'}$ is denoted by $C^*$ and is done as follows:

$$ C^*(x) = (1 - \beta)\,C(x) + \beta \begin{cases} C(x_b) & u \cdot n_b \ge 0, \\ C(x_{b'}) & u \cdot n_b < 0. \end{cases} \tag{13} $$

The value $\beta = 1$ results in full upwinding and $\beta = 0$ corresponds to a second order formulation which is equivalent to central finite differences on certain meshes. The evaluation on the outflow boundary for $x \in \gamma_{e,b}$ is done in a similar way:

$$ C^*(x) = (1 - \beta)\,C(x) + \beta\,C(x_b). \tag{14} $$

The theoretical properties of the VCFV scheme are treated extensively in [30, 13]. The spatial discretization error of the method in $L^2$ is first order if $\beta = 1$ and second order if $\beta = 0$, provided the solution is sufficiently regular. Additional interesting properties of the method are:

- The method is locally conservative on the subdivision $B_h$.
- It is possible to treat full tensors $D$, which might be a problem in cell centered finite volume schemes.
- A discrete maximum principle (or, equivalently, stability in the maximum norm or the M-matrix property of the stiffness matrix) can be proven under certain restrictions: Discretization of the diffusion term leads to an M-matrix if the mesh is sufficiently regular. For triangular meshes the Delaunay criterion is required (the sum of the two angles opposite an edge is at most $\pi$). The discretization of the convective term always leads to an M-matrix if $\beta = 1$ (upwind). In the case $\beta = 0$ one needs $Pe < 2$ and the Delaunay criterion.

2.3. Fractional step θ scheme

Insertion of a basis for the function spaces $V_h^0$ and $W_h^0$ into Eq. (12) leads to a system of ordinary differential equations for the coefficients $y_h$ which we denote in its generic form by

$$ \frac{d y_h}{dt} = L_h(t, y_h(t)). \tag{15} $$

Here we describe certain implicit time discretizations used in connection with the VCFV scheme. The time interval $[0, T]$ is subdivided into $0 = t^0 < t^1 < \ldots < t^M = T$ with $\Delta t^n = t^{n+1} - t^n$. The approximation of $y_h(t^n)$ to be computed is denoted by $y_h^n$. The so-called one step θ scheme is given as follows:

$$ y_h^{n+1} - \Delta t^n\,\theta\,L_h(t^{n+1}, y_h^{n+1}) = y_h^n + \Delta t^n (1 - \theta)\,L_h(t^n, y_h^n). \tag{16} $$

Setting $\theta = 1$ gives the implicit Euler method and $\theta = 1/2$ corresponds to the Crank-Nicolson scheme. An improvement over these one step schemes is the fractional step θ scheme, which consists of three successive steps of the one step θ scheme with $\theta$ and $\Delta t$ chosen in a special way:

$$
\begin{aligned}
\theta_1 &= \sqrt{2} - 1, & \Delta t_1 &= (\sqrt{2} - 1)\,\Delta t, \\
\theta_2 &= 2 - \sqrt{2}, & \Delta t_2 &= (1 - \sqrt{2}/2)\,\Delta t, \\
\theta_3 &= 2 - \sqrt{2}, & \Delta t_3 &= (1 - \sqrt{2}/2)\,\Delta t.
\end{aligned}
\tag{17}
$$

Each time step $\Delta t$ is subdivided into the smaller time steps $\Delta t_1$, $\Delta t_2$ and $\Delta t_3$, but $\Delta t$ is chosen three times as large as for the one step schemes. The fractional step scheme is second order accurate and has improved stability properties (cf. [32]). The fractional step scheme is used in all VCFV computations for Couplex 1.

Within each substep a large system of linear algebraic equations has to be solved. This is done by multigrid methods. Multigrid methods have the advantage of being of optimal, i.e. linear, complexity with respect to the number of unknowns. Standard references for multigrid methods are [25, 26]. Robust multigrid methods for porous medium applications are treated in [10, 16].

3. Discontinuous Galerkin Method

Due to their flexibility, discontinuous Galerkin (DG) methods have been popular in the finite element community and have been applied to a wide range of problems in computational fluid dynamics.
Since the first DG method was introduced in [33], the methods have been developed further for hyperbolic problems, where they are known as Runge-Kutta DG methods [17, 21], and for elliptic problems [41, 31, 20, 36, 37]. A unified analysis of many DG methods for elliptic problems has been given recently in [4]. A general overview is available in [18].

Advantages of DG methods are their higher order convergence property, local conservation of mass and flexibility with respect to meshing and hp-adaptive refinement. Their uniform applicability to hyperbolic, elliptic and parabolic problems as well as their robustness with respect to strongly discontinuous coefficients render them very attractive for porous medium flow and transport calculations [35, 1]. DG methods for elliptic problems are comparable in quality with mixed finite element methods [5].

In this work we use a formulation due to Oden, Babuška, and Baumann [31] and combine it with diagonally implicit Runge-Kutta time discretizations.

3.1. Notation

Let $E_h$ denote a subdivision of the domain $\Omega$ as defined above. The space of polynomial functions of degree $r$ on element $e \in E_h$ is defined by

$$ P_r(\Omega_e) = \Big\{ w : \Omega_e \to \mathbb{R} \;\Big|\; w(x, y) = \sum_{0 \le a + b \le r} c_{ab}\,x^a y^b \Big\}. \tag{18} $$

The extension to three space dimensions is obvious. Note that $P_r$ can be used on triangles (tetrahedra) and quadrilaterals (hexahedra). In the implementation $P_r$ is generated from basis polynomials on the reference element. Moreover, we use basis polynomials that are $L^2$-orthogonal on the reference elements. This improves the conditioning of the arising matrices and leads to diagonal mass matrices.

The finite element space used in the DG method is defined as

$$ V^r(E_h) = \prod_{e \in E_h} P_r(\Omega_e). \tag{19} $$

Note that functions in $V^r(E_h)$ are discontinuous at element boundaries. The skeleton $\Gamma_{\mathrm{int}}$ has to be redefined suitably to cope with the discontinuities at element boundaries. We define the internal skeleton

$$ \Gamma_{\mathrm{int}} = \{ \gamma_{e,f} \mid \gamma_{e,f} = \partial\Omega_e \cap \partial\Omega_f \ \ \forall e, f \in E_h \}. \tag{20} $$

Correspondingly, the external skeleton is defined as

$$ \Gamma_{\mathrm{ext}} = \{ \gamma_e \mid \gamma_e = \partial\Omega_e \cap \partial\Omega \ \ \forall e \in E_h \}. \tag{21} $$

With each $\gamma_{e,f} \in \Gamma_{\mathrm{int}}$ we associate a unit normal $n$. The orientation can be selected arbitrarily. With any $\gamma_e \in \Gamma_{\mathrm{ext}}$ we associate the unit normal $n$ oriented outward to $\Omega$.

In addition to the jump (6) we also define the average of a function at $x \in \gamma \in \Gamma_{\mathrm{int}}$:

$$ \langle v \rangle(x) = \frac{1}{2} \left( \lim_{\varepsilon \to 0+} v(x + \varepsilon n) + \lim_{\varepsilon \to 0+} v(x - \varepsilon n) \right). \tag{22} $$

3.2. Scheme

The DG scheme for solving the elliptic problem (1) is given as follows: Find $H \in V^r(E_h)$ such that for all $v \in V^r(E_h)$

$$
\begin{aligned}
&\sum_{e \in E_h} \int_{\Omega_e} (K\nabla H) \cdot \nabla v\,dx
+ \sum_{\gamma \in \Gamma_{\mathrm{int}}} \int_\gamma \big( \langle K\nabla v \cdot n \rangle [H] - [v] \langle K\nabla H \cdot n \rangle \big)\,ds \\
&\quad + \sum_{\gamma \in \Gamma_{\mathrm{ext}} \cap \Gamma_H} \int_\gamma \big( (K\nabla v \cdot n) H - v\,K\nabla H \cdot n \big)\,ds \\
&= \sum_{e \in E_h} \int_{\Omega_e} f v\,dx
- \sum_{\gamma \in \Gamma_{\mathrm{ext}} \cap \Gamma_U} \int_\gamma U v\,ds
+ \sum_{\gamma \in \Gamma_{\mathrm{ext}} \cap \Gamma_H} \int_\gamma (K\nabla v \cdot n) H_0\,ds.
\end{aligned}
\tag{23}
$$

Note that the Dirichlet boundary condition is approximated weakly. Assuming that the solution is sufficiently regular, the convergence rate of the scheme in the energy norm (and thus for the velocity $u = -K\nabla H$) is $O(h^r)$, and the convergence rate in $L^2$ is $O(h^r)$ if $r$ is even and $O(h^{r+1})$ if $r$ is odd. This anomaly can be remedied with other stabilizations such as the nonsymmetric interior penalty DG method or the local DG method [20, 37, 4].

Insertion of a basis into (23) results in a large system of linear equations. In [11] we developed an optimal order multigrid algorithm for solving these systems. It uses polynomials of the same degree $r$ on the coarse grid and a point-block ILU smoother combined with a reordering strategy.

The local conservation property of the DG scheme becomes obvious when a test function $v \in V^r(E_h)$ with $v|_{\Omega_e} = 1$ on a single element $e$ (and $v = 0$ on all other elements) is inserted. Then the scheme reduces to

$$ \sum_{\gamma \in \Gamma_{\mathrm{int}}} \int_\gamma [v] \langle u \cdot n \rangle\,ds + \sum_{\gamma \in \Gamma_{\mathrm{ext}}} \int_\gamma u \cdot n\,v\,ds = \sum_{e \in E_h} \int_{\Omega_e} f v\,dx \tag{24} $$

which shows that the conserved flux is the average $\langle u \cdot n \rangle$.
The Darcy velocity $u_{DG} = -K\nabla H$ is discontinuous at element boundaries and does not have a continuous normal component $u_{DG} \cdot n$. Thus, the average flux $\langle u \cdot n \rangle$ is inconsistent with the fluxes evaluated from left and right. Mathematically we have $u_{DG} \notin H(\mathrm{div}; \Omega)$. A velocity field with continuous normal component is, however, required by most transport simulations such as the scheme described below. In [12] we describe a simple projection scheme $\Pi : (V^r(E_h))^d \to H(\mathrm{div}; \Omega)$ and prove that the projection does not reduce the accuracy of the DG scheme. This projected velocity $u^* = \Pi(u_{DG})$ is used in the transport simulation.

The semi-discrete DG scheme for solving problem (2) in either the hyperbolic or parabolic form is given as follows: Find $C : [0, T] \to V^r(E_h)$ such that for all $v \in V^r(E_h)$

$$
\begin{aligned}
&\frac{\partial}{\partial t} \sum_{e \in E_h} \int_{\Omega_e} R\omega C v\,dx
+ \sum_{e \in E_h} \int_{\Omega_e} R\omega\lambda C v\,dx
- \sum_{e \in E_h} \int_{\Omega_e} (uC - D\nabla C) \cdot \nabla v\,dx \\
&\quad + \sum_{\gamma \in \Gamma_{\mathrm{int}}} \int_\gamma [v]\,C^* \langle u \cdot n \rangle\,ds
+ \sum_{\gamma \in \Gamma_{\mathrm{int}}} \int_\gamma \big( \langle D\nabla v \cdot n \rangle [C] - [v] \langle D\nabla C \cdot n \rangle \big)\,ds \\
&\quad + \sum_{\gamma \in \Gamma_{\mathrm{ext}} \cap \Gamma_C} \int_\gamma \big( (D\nabla v \cdot n) C - v (D\nabla C \cdot n) \big)\,ds
+ \sum_{\gamma \in \Gamma_{\mathrm{ext}} \cap \Gamma^*} \int_\gamma v\,C\,u \cdot n\,ds \\
&= \sum_{e \in E_h} \int_{\Omega_e} q v\,dx
- \sum_{\gamma \in \Gamma_{\mathrm{ext}} \cap \Gamma_J} \int_\gamma J v\,ds
- \sum_{\gamma \in \Gamma_{\mathrm{ext}} \cap \Gamma^0} \int_\gamma v\,C_0\,u \cdot n\,ds
+ \sum_{\gamma \in \Gamma_{\mathrm{ext}} \cap \Gamma_C} \int_\gamma (D\nabla v \cdot n)\,C_0\,ds
\end{aligned}
$$

where we have used the refined decomposition of the boundary into outflow combined with Dirichlet outflow

$$ \Gamma^* = \Gamma_O \cup \{ x \in \Gamma_C \mid u(x) \cdot n > 0 \} \tag{25} $$

and Dirichlet inflow

$$ \Gamma^0 = \{ x \in \Gamma_C \mid u(x) \cdot n \le 0 \}. \tag{26} $$

The concentration in the convective term for $x \in \gamma \in \Gamma_{\mathrm{int}}$ is upwinded via

$$ C^*(x) = \begin{cases} \lim_{\varepsilon \to 0+} C(x - \varepsilon n) & \text{if } \langle u \cdot n \rangle \ge 0, \\ \lim_{\varepsilon \to 0+} C(x + \varepsilon n) & \text{else.} \end{cases} \tag{27} $$

The spatial error of this formulation is $O(h^{r+1})$ in $L^2$ in the hyperbolic case ($D = 0$) for a sufficiently regular solution. Error estimates are provided in [34].

3.3. Diagonally Implicit Runge-Kutta Methods

After introducing a basis and inverting the mass matrix the scheme (25) can be written as a large system of ordinary differential equations (15) for the coefficients $y_h(t)$.
All Runge-Kutta methods used here can be written in the following form which computes $y_h^{n+1}$ from a given $y_h^n$:

1. $y_h^{(i)} = y_h^n + \sum_{k=1}^{i} b_{ik}\,\Delta t^n\,L_h(t^n + d_k \Delta t^n, y_h^{(k)})$, $\quad i = 1, \ldots, s$;
2. $y_h^{n+1} = y_h^{(s)}$.

The number of stages of the scheme is $s$. For hyperbolic and convection-dominated parabolic problems explicit schemes with the total variation diminishing (TVD) property were developed in [39, 19]. In this work we use three diagonally implicit Runge-Kutta schemes with favourable stability characteristics: the strongly S-stable schemes of order 2 (with 2 stages) and order 3 (with 3 stages) given in [2] and the L-stable scheme of order 4 with 5 stages given in [27]. No slope limiters are used in this scheme. We assume that the grid Peclet number is small enough to avoid oscillations.

Within each stage of the Runge-Kutta method a large system of algebraic equations has to be solved. This is done with Newton-multigrid techniques.

4. PDE Simulation Software UG

The schemes described above have all been implemented on the basis of the software system UG [8, 6]. UG is an ongoing long-term development effort and establishes a software framework for the numerical solution of partial differential equations (PDEs). The following features are provided by the framework in a problem-independent and reusable way:

- Mesh management: Unstructured meshes in two and three space dimensions with six different element types as well as flexible placement of degrees of freedom.
- Parallelism: All modules realized inside UG provide their functionality also in parallel. Using parallel resources is nearly transparent at user level. Even an inexperienced user is able to use a powerful supercomputer to speed up computations significantly, by factors exceeding $10^2$. As target platform to run the UG software the most scalable machine architecture is selected.
  Data exchange between processes (communication) is done via message passing, e.g. using MPI. The parallelization of UG is based on the flexible, graph-based programming model DDD [14, 7] which also supports dynamic migration of complex data structures.
- h-adaptivity: Local grid adaption can be performed to concentrate the unknowns in areas where interesting phenomena are expected or high accuracy of the solution is needed, e.g. the container region in the Couplex 2 benchmark.
- Multigrid methods: Multigrid methods are applied as solvers for the large linear systems. Especially when performing large-scale parallel computations with millions of unknowns, optimal complexity solvers are a key issue for scalability.

Over the last years, besides the Couplex module, several other application modules in the context of subsurface flow have been realized. $d^3f$ [24] is a simulator for density driven flow scenarios. It was primarily designed to study flow phenomena in geological formations around salt domes. These salt domes play a central role in German nuclear waste repository management. The simulator includes tools to perform several stages of a case study: modelling of geological layers, grid generation, the numerical simulator and postprocessing tools for visualization. For transport simulation in fractured porous media a specialized module including fracture and grid generation has been developed, see [9]. Multiphase/multicomponent flow applications are described in [10, 16].

Key UG capabilities used in the Couplex benchmark are:

- Anisotropic mesh support: Bisection refinement rules can be applied to decrease geometric anisotropy. This capability has been used in Couplex 2 to keep both element angles good and mesh anisotropy low.
- Periodic boundaries: An arbitrary number of periodic boundaries can be defined. These are transparent for local grid adaption and parallelism.

5. Couplex 1

5.1. Meshing

Figure 2 shows the coarse mesh with 222 elements used to discretize the Couplex 1 domain. In order to ensure a maximum principle, obtuse angles had to be avoided. The anisotropic repository region is discretized with long and flat rectangular elements, then triangular elements are used to make the transition from the small side with a length of 6 m to the large elements needed in the rest of the domain. Structured mesh codes would result in a mesh with anisotropic elements also outside the repository region. To our knowledge no automatic mesh generator is able to produce a mesh comparable in quality to the hand-made one.

Figure 2. Initial coarse mesh for Couplex 1 simulation with zoom of the vicinity of the repository. Both plots are scaled by a factor 12 in y-direction.

5.2. Grid Convergence of Transport Calculation

An important question in a nuclear waste management simulation is to determine the concentration levels at various positions in space and time. In this subsection we consider the concentration levels of $^{129}$I at $t = 200000$ yr and try to assess the accuracy in the position of the isoline for concentration $10^{-8}$. In order to do this we compute the concentration field with decreasing spatial and temporal mesh size. In particular we use 14521 (level 3), 56832 (level 4), 227328 (level 5) and 909312 (level 6) elements with, respectively, 52, 68, 68 and 104 time steps.

The result for the VCFV scheme using second order approximation for the convective terms ("central differences") is shown in Figure 3. The results do not show unphysical oscillations if the mesh is sufficiently fine (level 4). This is a consequence of both a small mesh Peclet number and good mesh design. As can be seen in Figure 3, the position of the isoline at the border between the clay and limestone layers is overestimated in the coarse grid simulation.
With subsequent refinement the position of the contour line moves to the right. The size of the time steps is varied from 100 yr initially up to $10^5$ yr for the VCFV/fractional step scheme and $10^6$ yr for the DG/SDIRK schemes according to a prescribed schedule. Note that each time step is subdivided into three substeps in the fractional step and the SDIRK(3) scheme.

Let $x_l$ be the position of the contour line computed with the second order VCFV scheme on level $l$, which is shown in the first column of Table I. From that data we estimate the convergence rate as

$$ \frac{|x_6 - x_5|}{|x_5 - x_4|} \approx \frac{1}{3.5} \tag{28} $$

(where we expected an asymptotic convergence factor of 1/4) and therefore get the error estimate

$$ |x_6 - x_{\mathrm{exact}}| \le \frac{|x_6 - x_5|}{3.5} \cdot \frac{1}{1 - 1/3.5} = 161\ \mathrm{[m]}. \tag{29} $$

Columns two, three and four of Table I contain the error estimates obtained with all three schemes considered. The DG(2) scheme is formally second order accurate and yields an accuracy comparable to the second order VCFV scheme with about the same number of degrees of freedom (DOF). The error in the VCFV full upwind scheme is about 12 times larger than for the second order schemes for the same number of DOF.

A comparison of computation times is shown in Table II. The DG scheme is significantly more efficient than the VCFV scheme, mainly because much larger time steps can be taken. The size of the time step is limited in the second order VCFV scheme because the linear systems get very difficult to solve due to loss of diagonal dominance.

Figure 3. Contour lines of $^{129}$I at $t = 200000$ yr computed with 2nd order VCFV/fractional step scheme on levels 3, 4, 5, 6 and 52, 68, 68 and 104 (FS-θ) time steps (from top). Colors code the contour lines: $10^{-12}$ (blue), $10^{-10}$ (light blue), $10^{-8}$ (light green), $10^{-6}$ (green), $10^{-4}$ (orange), $10^{-2}$ (red).

Figure 4. Contour lines of $^{129}$I at $t = 200000$ yr. Computed with full upwinding VCFV scheme on level 6 (top) and DG(2) in space, third order RK scheme in time on level 5 (bottom). Colors code the contour lines: $10^{-12}$ (blue), $10^{-10}$ (light blue), $10^{-8}$ (light green), $10^{-6}$ (green), $10^{-4}$ (orange), $10^{-2}$ (red).

Table I. Error analysis for x-position of the isoline $10^{-8}$ on the border of clay and limestone layer obtained with different discretization schemes and mesh refinements.

level   VCFV 2nd order/FS-θ         DG(2)/SDIRK(3)   VCFV full upwind/FS-θ
        position [m]   error [m]    error [m]        error [m]
3       15743          6760         -1240            17160
4       11089          1160         -540             7260
5       9604           560          -140             3860
6       9208           161          -                1860

Table II. Computation time for most accurate Couplex 1 calculations.

Scheme            DOF           time steps   processors      comp. time [h]
2nd Order VCFV    0.9 · 10^6    1692         40 × PII/400    46.3
DG(2)/SDIRK(3)    1.35 · 10^6   87           1 × PIV/2200    63.1

5.3. Breakthrough Curves at Aquifer Boundary

Figure 5 shows the breakthrough curves for $^{129}$I at the outflow boundary of the limestone layer obtained on meshes with 3725 up to 227809 vertices. The figure shows that such an integral quantity can already be computed accurately on a relatively coarse mesh. The solutions obtained on the two finest meshes are almost indistinguishable.

Figure 5. Breakthrough curves for $^{129}$I at outflow boundary of limestone layers on successively refined meshes. (Plot: "I129 Outflow through Limestone Layer", outflow in [Mole/yr] versus time in [yr], for meshes with 3725, 14521, 57329 and 227809 vertices.)

6. Couplex 2

6.1. Global Solution Algorithm and Meshing

Couplex 2 involves the solution of in total 10 coupled equations for the dissolved components in the water phase, $C_s$ (silica), $C_0$ ($^{135}$Cs), $C_1$ ($^{238}$Pu), $C_{-1}$ ($^{234}$U), $C_2$ ($^{242}$Pu), $C_{-2}$ ($^{238}$U), and the precipitated components in the solid phase, $F_1$ ($^{238}$Pu), $F_{-1}$ ($^{234}$U), $F_2$ ($^{242}$Pu), $F_{-2}$ ($^{238}$U). These equations are discretized with the 2nd order VCFV scheme in space and implicit Euler in time. The discrete equations for the coupled system are solved sequentially in blocks as follows: First the linear equation for $C_s$ is solved, then that for $C_0$, then the block $\{C_1, C_2, F_1, F_2\}$ and finally the block $\{C_{-1}, C_{-2}, F_{-1}, F_{-2}\}$. The two blocks are each solved in a fully coupled way using Newton-multigrid because these unknowns depend nonlinearly on each other through the precipitation term.

The domain used in the Couplex 2 computation is 18 × 24.8 × 100 [m³] and contains exactly one container (i.e. we use half of the domain proposed in the benchmark document due to symmetry). The meshing of the Couplex 2 geometry with its components backfill, seal, buffer and container was done manually to avoid bad angles. The resulting unstructured mesh involves hexahedra and prisms. The left picture in Figure 6 shows the minimal mesh needed to resolve the geometry. Across periodic boundaries the mesh has to match. This requirement leads to the final initial mesh, which is shown in the middle picture of Figure 6.

Figure 6. Minimal mesh to resolve geometry (left), final initial mesh, which is conforming across periodic boundaries (middle), mesh with element anisotropy reduced by directed refinement and load balancing of mesh onto processors (right).

The starting mesh is then further refined using directed refinement. Bisection rules for prisms/hexahedra and trisection rules for hexahedra were realized to reduce element anisotropy and furthermore improve element angles.
The directed bisection of prisms introduces two new elements: a prism and a hexahedron with angles equal to or better than the angles of the refined prism. This is done until element anisotropies are better than 1/2.5 and element angles do not exceed 135 degrees. This initial refinement phase to improve mesh quality resulted in a mesh with 1182 elements and 1622 vertices. The mesh obtained after further uniform refinement and load balancing onto 32 processors is shown in the right picture of figure 6. Figure 7 shows the isosurfaces of the cesium concentration at three different times computed on a mesh with 79862 vertices.

Figure 7. Concentration isosurfaces of Cesium after $10^4$, $10^5$ and $10^6$ years.

6.2. Comments on Glass Dissolution Model

The Couplex benchmark document proposes a detailed model of glass dissolution in Section 3.3 resulting in a Fourier type boundary condition at the container. The equation for silica transport is given by

\[ \phi R_s \frac{\partial C_s}{\partial t} + \nabla \cdot j_s(C_s) = \frac{\rho_p \nu_p}{\lambda_p} \left( 1 - \frac{C_s}{S_p} \right) \quad \text{in } \Omega \tag{30} \]

where the (slightly modified) boundary condition at the container is given by

\[ j_s(C_s) \cdot n = -\rho_m \nu_m \left( 1 - \frac{C_s}{S_m} \right). \tag{31} \]

Note that the two equilibrium concentrations $S_p = 0.54$ [mol/m$^3$] and $S_m = 0.82$ [mol/m$^3$] are different. This results in a boundary layer which is hard to resolve numerically. The size of the boundary layer depends on the various constants given in the equations above. Unfortunately, the parameter $\lambda_p$ is not specified uniquely but is allowed to range from $8 \cdot 10^{-10}$ to $10^{-3}$ which, consequently, leads to a large variation in the size of the boundary layer.

In this section we compute the size of the boundary layer analytically under the assumption that the convective/dispersive fluxes in Eq. (30) can be neglected. Then we will verify the analytical observations by highly resolved computations.

Let $B$ be a cube of size $h^3$ located with one face at the container boundary.
Integration of (30) over $B$ gives

\[ \phi R_s \frac{\partial}{\partial t} \int_B C_s \, dx + \int_{\partial B} j_s(C_s) \cdot n \, ds = \int_B \frac{\rho_p \nu_p}{\lambda_p} \left( 1 - \frac{C_s}{S_p} \right) dx. \tag{32} \]

Neglecting convective/diffusive fluxes over interior boundaries, inserting the boundary condition (31) and assuming $C_s$ to be constant in $B$ results in an ordinary differential equation

\[ \phi R_s h^3 \frac{\partial C_s}{\partial t} - \rho_m \nu_m \left( 1 - \frac{C_s}{S_m} \right) h^2 = h^3 \frac{\rho_p \nu_p}{\lambda_p} \left( 1 - \frac{C_s}{S_p} \right), \tag{33} \]

which, after dividing by $h^3$ and collecting the terms in $C_s$, can be written in the form

\[ \phi R_s \frac{\partial C_s}{\partial t} = \left( \frac{\rho_p \nu_p}{\lambda_p} + \frac{\rho_m \nu_m}{h} \right) \left( 1 - \frac{C_s}{S^*} \right) \tag{34} \]

with

\[ S^*(\lambda_p, h) = S_p S_m \, \frac{\dfrac{\rho_p \nu_p}{\lambda_p} + \dfrac{\rho_m \nu_m}{h}}{\dfrac{\rho_p \nu_p}{\lambda_p} S_m + \dfrac{\rho_m \nu_m}{h} S_p}. \tag{35} \]

$S^*$ is the value of $C_s$ for $t \to \infty$ at the container boundary computed by a numerical scheme (say VCFV) on a mesh of size $h$ under the assumption that convective/dispersive fluxes can be neglected. It can be shown that $S^*$ is reached very quickly with respect to the time scale of Couplex 2 (i.e. after a few years). Note that $S^*(\lambda_p, h)$ is still a function of $\lambda_p$ and $h$ since both parameters can be chosen by the implementor of Couplex 2. Figure 8 shows the dependence of $S^*$ on the mesh size for the given range of $\lambda_p$ and directly visualizes the boundary layer. Note that $h$ is given on a logarithmic scale. According to Figure 8 it requires a mesh resolution of $h = 10^{-6}$ [m] for $\lambda_p = 10^{-6}$ in order to resolve the boundary layer!

This consideration has direct consequences for the dissolution time (the time at which all the silica in the container has been dissolved). Since $S^*$ converges to $S_m$ for $h \to 0$, the flux in the boundary condition (31) converges to 0 and consequently the dissolution time is $\infty$. Taking convective/dispersive fluxes into account will result in a limiting value below $S_m$ at the container boundary. This effect can only be analysed by numerical computations on meshes which are highly resolved in the vicinity of the container. Table III shows the corresponding results for a local mesh size as small as $1.25 \cdot 10^{-2}$ [m] obtained through local mesh refinement.
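The mesh dependence of the limiting value can be explored with a few lines of code. The following is a minimal sketch and not part of the computations reported here. It reads Eq. (35) as $S^* = S_p S_m (a + b) / (a S_m + b S_p)$ with $a = \rho_p \nu_p / \lambda_p$ and $b = \rho_m \nu_m / h$. The values of $S_p$ and $S_m$ are taken from the text, whereas `RHO_P_NU_P` and `RHO_M_NU_M` are hypothetical placeholder values, since the benchmark constants are not listed in this paper.

```python
# Sketch: evaluate the mesh-dependent limiting value S*(lambda_p, h) of Eq. (35).
# S_P and S_M are from the text. RHO_P_NU_P and RHO_M_NU_M are HYPOTHETICAL
# placeholders (assumption): the benchmark constants are not given in this paper.

S_P = 0.54  # equilibrium concentration S_p [mol/m^3]
S_M = 0.82  # equilibrium concentration S_m [mol/m^3]
RHO_P_NU_P = 1.0  # placeholder for rho_p * nu_p (assumption)
RHO_M_NU_M = 1.0  # placeholder for rho_m * nu_m (assumption)


def s_star(lambda_p: float, h: float) -> float:
    """Return S*(lambda_p, h) as read from Eq. (35)."""
    a = RHO_P_NU_P / lambda_p  # coefficient of the volume source term
    b = RHO_M_NU_M / h         # coefficient of the boundary flux on a cell of size h
    return S_P * S_M * (a + b) / (a * S_M + b * S_P)


if __name__ == "__main__":
    # Tabulate S* for two values of lambda_p over a wide range of mesh sizes.
    for lambda_p in (1e-3, 1e-6):
        for h in (1.0, 1e-2, 1e-4, 1e-6, 1e-8):
            print(f"lambda_p={lambda_p:.0e}  h={h:.0e}  S*={s_star(lambda_p, h):.4f}")
```

With equal placeholder constants the transition from $S_p$ (coarse mesh) to $S_m$ ($h \to 0$) is centred at $h = \lambda_p$, where $S^* = 2 S_p S_m / (S_p + S_m) \approx 0.651$. The true location of the transition depends on the actual ratio $\rho_m \nu_m / (\rho_p \nu_p)$.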
Table III clearly shows that the boundary layer can be resolved for $\lambda_p = 10^{-3}$, where the concentration at the container boundary converges to a value ≈ 0.747. For smaller values of $\lambda_p$ the boundary layer cannot be resolved due to mesh size limitations but the effect can be seen. The fact that the silica concentration at the container boundary is highly sensitive to both mesh size $h$ and parameter $\lambda_p$ renders this model unsuitable for a benchmark computation. The validity of this model should be checked carefully. For these reasons we decided to use the flux boundary condition with $\tau = 10^{-2}$ for our Couplex 2 computations.

Figure 8. Silica concentration at container boundary depending on $\lambda_p$ and mesh size $h$. [Plot "Silica concentration at container boundary": $C_s$ at the boundary [mol/m$^3$] over $h$ [m] on a logarithmic axis from $10^{-10}$ to 1, with one curve for each of $\lambda_p = 10^{-3}, 10^{-4}, \ldots, 10^{-9}$.]

Table III. Computed concentration value of silica at container boundary $x = 4.95$, $y = 13.33$, $z = 50$ after $t = 250$ [yr] for various mesh sizes and values of parameter $\lambda_p$.

h [m]     λp = 10^-3   λp = 10^-4   λp = 10^-5   λp = 10^-6
0.4       0.6496       0.5574       0.5418       0.5402
0.2       0.6976       0.5754       0.5441       0.5404
0.1       0.7287       0.6034       0.5484       0.5409
0.05      0.7416       0.6356       0.5564       0.5418
0.025     0.7455       0.6586       0.5695       0.5436
0.0125    0.7464       0.6683       0.5851       0.5469

6.3. Convergence of Breakthrough Curves

As one result of the Couplex 2 computation we consider the breakthrough curves for the dissolved components $C_0$ ($^{135}$Cs) and $C_{-1}$ ($^{234}$U) at the upper and lower domain boundaries (host to outside) in more detail. These two components have been chosen since $C_0$ is governed by a linear equation and $C_{-1}$ is governed by a nonlinear equation (precipitation). Moreover, these results are the input of the Couplex 3 computation and therefore have direct influence on the accuracy of the far field computation. Figure 9 shows the corresponding breakthrough curves.
Clearly the results for $^{135}$Cs can be obtained quite accurately already on a very coarse mesh. However, the accurate computation of the breakthrough curve for $^{234}$U requires a much finer mesh. Reducing the relative error in the peak value to about 1% requires a calculation on 609418 vertices with 6703598 degrees of freedom.

Figure 9. Breakthrough curves for $^{135}$Cs (top) and $^{234}$U (bottom) at upper and lower boundaries of the domain. [Plots "Cs135 Host to Outside" (meshes with 1622, 10936 and 79862 vertices) and "U234 Host to Outside" (meshes with 1622, 10936, 79862 and 609418 vertices): flux [Micromole/yr] over time [yr].]

6.4. Parallel Computation

Total execution times for three different Couplex 2 configurations are listed in Table IV. Computations have been performed on the HELICS cluster consisting of AMD Athlon 1.3 GHz processors connected by a Myrinet 2000 interconnect. In the configurations the problem size is roughly scaled with the number of processors. All computation times (reported in seconds) are in the range of a few hours. It can be seen that the largest computation on more than $6 \cdot 10^6$ degrees of freedom would require 9 days computing time on a sequential machine using 6 GB main memory! Above, it has been demonstrated that grid convergence for the $^{234}$U breakthrough curve is obtained only with this fine mesh computation. Therefore, parallelization was absolutely necessary for a successful computation of the Couplex 2 benchmark. Table IV also contains the time needed per time step and the resulting speedup per time step.

Table IV. Parallel speedup of Couplex 2 computations with 1, 8 and 64 processors.
To increase accuracy also in the time dimension, the time step sizes have been varied.

P     DOF        Timesteps   Job time [s]   Time/Step [s]   Speedup/Step
1     120296     50          8621           172             -
8     878482     57          22720          398             3.5
64    6703598    74          23040          311             35.4

7. Conclusions

In this paper we demonstrate that a successful computation of the Couplex benchmark is an interdisciplinary task that requires knowledge about the underlying partial differential equations, accurate discretization schemes, efficient solvers and simulation software that supports unstructured meshes, local refinement and parallel computation. We have shown that estimates of the error in the unknown solution can be obtained through systematic mesh refinement studies, which is extremely important in repository safety analysis. This assessment of numerical errors required computations on about $10^7$ degrees of freedom in three space dimensions over long time intervals, which can only be done effectively with high-performance computing capabilities.

Acknowledgements

We gratefully acknowledge the use of the “HEidelberg LInux Cluster System” for the parallel computations.
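As a supplementary check of the figures quoted in Section 6.4, the time per step and the sequential run-time estimate can be recomputed from the entries of Table IV. The sketch below is only illustrative: the numbers are copied from Table IV, while the function names and the assumption that the sequential run time equals the 64-processor job time multiplied by the reported speedup per step are ours.

```python
# Sketch: recompute time per step and the sequential run-time estimate from Table IV.
# Each row: (processors, degrees of freedom, time steps, job time [s], reported speedup/step).
TABLE_IV = [
    (1, 120296, 50, 8621, None),
    (8, 878482, 57, 22720, 3.5),
    (64, 6703598, 74, 23040, 35.4),
]

SECONDS_PER_DAY = 86400.0


def time_per_step(job_time_s: float, n_steps: int) -> float:
    """Average wall-clock time per time step in seconds."""
    return job_time_s / n_steps


def sequential_days(job_time_s: float, speedup: float) -> float:
    """Estimated sequential run time in days (assumption: job time times speedup)."""
    return job_time_s * speedup / SECONDS_PER_DAY


if __name__ == "__main__":
    for procs, dof, steps, job, _speedup in TABLE_IV:
        print(f"P={procs:2d}  DOF={dof:7d}  time/step={time_per_step(job, steps):6.1f} s")
    _, _, _, job64, speedup64 = TABLE_IV[-1]
    print(f"estimated sequential time: {sequential_days(job64, speedup64):.1f} days")
```

Under this assumption the estimate is about 9.4 days, which is consistent with the 9 days stated in Section 6.4.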
