PETSc Users Manual Revision 3.2

Chapter 5

SNES: Nonlinear Solvers

The solution of large-scale nonlinear problems pervades many facets of computational science and demands robust and flexible solution strategies. The SNES library of PETSc provides a powerful suite of datastructure-neutral numerical routines for such problems. Built on top of the linear solvers and data structures discussed in preceding chapters, SNES enables the user to easily customize the nonlinear solvers according to the application at hand. Also, the SNES interface is identical for the uniprocess and parallel cases; the only difference in the parallel version is that each process typically forms only its local contribution to various matrices and vectors.

The SNES class includes methods for solving systems of nonlinear equations of the form

$$F(x) = 0, \qquad (5.1)$$

where $F : \mathbb{R}^n \to \mathbb{R}^n$. Newton-like methods provide the core of the package, including both line search and trust region techniques, which are discussed further in Section 5.2. Following the PETSc design philosophy, the interfaces to the various solvers are all virtually identical. In addition, the SNES software is completely flexible, so that the user can at runtime change any facet of the solution process.

The general form of the $n$-dimensional Newton's method for solving (5.1) is

$$x_{k+1} = x_k - [F'(x_k)]^{-1} F(x_k), \quad k = 0, 1, \ldots, \qquad (5.2)$$

where $x_0$ is an initial approximation to the solution and $F'(x_k)$, the Jacobian, is nonsingular at each iteration. In practice, the Newton iteration (5.2) is implemented by the following two steps:

1. (Approximately) solve $F'(x_k) \Delta x_k = -F(x_k)$. (5.3)

2. Update $x_{k+1} = x_k + \Delta x_k$. (5.4)

5.1 Basic SNES Usage

In the simplest usage of the nonlinear solvers, the user must merely provide a C, C++, or Fortran routine to evaluate the nonlinear function of Equation (5.1). The corresponding Jacobian matrix can be approximated with finite differences. For codes that are typically more efficient and accurate, the user can provide a routine to compute the Jacobian; details regarding these application-provided routines are discussed below.

To provide an overview of the use of the nonlinear solvers, we first introduce a complete and simple example in Figure 14, corresponding to ${PETSC_DIR}/src/snes/examples/tutorials/ex1.c.

static char help[] = "Newton's method for a two-variable system, sequential.\n\n";

/*T
   Concepts: SNES^basic example
T*/

/*
   Include "petscsnes.h" so that we can use SNES solvers.  Note that this
   file automatically includes:
     petscsys.h    - base PETSc routines
     petscvec.h    - vectors
     petscmat.h    - matrices
     petscis.h     - index sets
     petscviewer.h - viewers
     petscksp.h    - linear solvers (Krylov subspace methods)
     petscpc.h     - preconditioners
*/
#include <petscsnes.h>

typedef struct {
  Vec        xloc,rloc;    /* local solution, residual vectors */
  VecScatter scatter;
} AppCtx;

/*
   User-defined routines
*/
extern PetscErrorCode FormJacobian1(SNES,Vec,Mat*,Mat*,MatStructure*,void*);
extern PetscErrorCode FormFunction1(SNES,Vec,Vec,void*);
extern PetscErrorCode FormJacobian2(SNES,Vec,Mat*,Mat*,MatStructure*,void*);
extern PetscErrorCode FormFunction2(SNES,Vec,Vec,void*);

#undef __FUNCT__
#define __FUNCT__ "main"
int main(int argc,char **argv)
{
  SNES           snes;              /* nonlinear solver context */
  KSP            ksp;               /* linear solver context */
  PC             pc;                /* preconditioner context */
  Vec            x,r;               /* solution, residual vectors */
  Mat            J;                 /* Jacobian matrix */
  PetscErrorCode ierr;
  PetscInt       its;
  PetscMPIInt    size,rank;
  PetscScalar    pfive = .5,*xx;
  PetscBool      flg;
  AppCtx         user;              /* user-defined work context */
  IS             isglobal,islocal;

  PetscInitialize(&argc,&argv,(char *)0,help);
  ierr = MPI_Comm_size(PETSC_COMM_WORLD,&size);CHKERRQ(ierr);
  ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr);

  /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
     Create nonlinear solver context
     - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
  ierr = SNESCreate(PETSC_COMM_WORLD,&snes);CHKERRQ(ierr);

  /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
     Create matrix and vector data structures; set corresponding routines
     - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */

  /*
     Create vectors for solution and nonlinear function
  */
  ierr = VecCreate(PETSC_COMM_WORLD,&x);CHKERRQ(ierr);
  ierr = VecSetSizes(x,PETSC_DECIDE,2);CHKERRQ(ierr);
  ierr = VecSetFromOptions(x);CHKERRQ(ierr);
  ierr = VecDuplicate(x,&r);CHKERRQ(ierr);

  if (size > 1){
    ierr = VecCreateSeq(PETSC_COMM_SELF,2,&user.xloc);CHKERRQ(ierr);
    ierr = VecDuplicate(user.xloc,&user.rloc);CHKERRQ(ierr);

    /* Create the scatter between the global x and local xloc */
    ierr = ISCreateStride(MPI_COMM_SELF,2,0,1,&islocal);CHKERRQ(ierr);
    ierr = ISCreateStride(MPI_COMM_SELF,2,0,1,&isglobal);CHKERRQ(ierr);
    ierr = VecScatterCreate(x,isglobal,user.xloc,islocal,&user.scatter);CHKERRQ(ierr);
    ierr = ISDestroy(&isglobal);CHKERRQ(ierr);
    ierr = ISDestroy(&islocal);CHKERRQ(ierr);
  }

  /*
     Create Jacobian matrix data structure
  */
  ierr = MatCreate(PETSC_COMM_WORLD,&J);CHKERRQ(ierr);
  ierr = MatSetSizes(J,PETSC_DECIDE,PETSC_DECIDE,2,2);CHKERRQ(ierr);
  ierr = MatSetFromOptions(J);CHKERRQ(ierr);

  ierr = PetscOptionsHasName(PETSC_NULL,"-hard",&flg);CHKERRQ(ierr);
  if (!flg) {
    /*
       Set function evaluation routine and vector.
    */
    ierr = SNESSetFunction(snes,r,FormFunction1,&user);CHKERRQ(ierr);

    /*
       Set Jacobian matrix data structure and Jacobian evaluation routine
    */
    ierr = SNESSetJacobian(snes,J,J,FormJacobian1,PETSC_NULL);CHKERRQ(ierr);
  } else {
    if (size != 1) SETERRQ(PETSC_COMM_SELF,1,"This case is a uniprocessor example only!");
    ierr = SNESSetFunction(snes,r,FormFunction2,PETSC_NULL);CHKERRQ(ierr);
    ierr = SNESSetJacobian(snes,J,J,FormJacobian2,PETSC_NULL);CHKERRQ(ierr);
  }

  /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
     Customize nonlinear solver; set runtime options
     - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */

  /*
     Set linear solver defaults for this problem. By extracting the
     KSP and PC contexts from the SNES context, we can then
     directly call any KSP and PC routines to set various options.
  */
  ierr = SNESGetKSP(snes,&ksp);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
  ierr = PCSetType(pc,PCNONE);CHKERRQ(ierr);
  ierr = KSPSetTolerances(ksp,1.e-4,PETSC_DEFAULT,PETSC_DEFAULT,20);CHKERRQ(ierr);

  /*
     Set SNES/KSP/PC runtime options, e.g.,
         -snes_view -snes_monitor -ksp_type <ksp> -pc_type <pc>
     These options will override those specified above as long as
     SNESSetFromOptions() is called _after_ any other customization
     routines.
  */
  ierr = SNESSetFromOptions(snes);CHKERRQ(ierr);

  /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
     Evaluate initial guess; then solve nonlinear system
     - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
  if (!flg) {
    ierr = VecSet(x,pfive);CHKERRQ(ierr);
  } else {
    ierr = VecGetArray(x,&xx);CHKERRQ(ierr);
    xx[0] = 2.0; xx[1] = 3.0;
    ierr = VecRestoreArray(x,&xx);CHKERRQ(ierr);
  }

  /*
     Note: The user should initialize the vector, x, with the initial guess
     for the nonlinear solver prior to calling SNESSolve().  In particular,
     to employ an initial guess of zero, the user should explicitly set
     this vector to zero by calling VecSet().
  */
  ierr = SNESSolve(snes,PETSC_NULL,x);CHKERRQ(ierr);
  ierr = SNESGetIterationNumber(snes,&its);CHKERRQ(ierr);
  if (flg) {
    Vec f;
    ierr = VecView(x,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
    ierr = SNESGetFunction(snes,&f,0,0);CHKERRQ(ierr);
    ierr = VecView(r,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
  }

  ierr = PetscPrintf(PETSC_COMM_WORLD,"number of Newton iterations = %D\n\n",its);CHKERRQ(ierr);

  /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
     Free work space.  All PETSc objects should be destroyed when they
     are no longer needed.
     - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&r);CHKERRQ(ierr);
  ierr = MatDestroy(&J);CHKERRQ(ierr);
  ierr = SNESDestroy(&snes);CHKERRQ(ierr);
  if (size > 1){
    ierr = VecDestroy(&user.xloc);CHKERRQ(ierr);
    ierr = VecDestroy(&user.rloc);CHKERRQ(ierr);
    ierr = VecScatterDestroy(&user.scatter);CHKERRQ(ierr);
  }
  ierr = PetscFinalize();
  return 0;
}

/* ------------------------------------------------------------------- */
#undef __FUNCT__
#define __FUNCT__ "FormFunction1"
/*
   FormFunction1 - Evaluates nonlinear function, F(x).

   Input Parameters:
.  snes - the SNES context
.  x    - input vector
.  ctx  - optional user-defined context

   Output Parameter:
.  f - function vector
 */
PetscErrorCode FormFunction1(SNES snes,Vec x,Vec f,void *ctx)
{
  PetscErrorCode ierr;
  PetscScalar    *xx,*ff;
  AppCtx         *user = (AppCtx*)ctx;
  Vec            xloc=user->xloc,floc=user->rloc;
  VecScatter     scatter=user->scatter;
  MPI_Comm       comm;
  PetscMPIInt    size,rank;
  PetscInt       rstart,rend;

  ierr = PetscObjectGetComm((PetscObject)snes,&comm);CHKERRQ(ierr);
  ierr = MPI_Comm_size(comm,&size);CHKERRQ(ierr);
  ierr = MPI_Comm_rank(comm,&rank);CHKERRQ(ierr);
  if (size > 1){
    /*
       This is a ridiculous case for testing intermediate steps from
       sequential code development to parallel implementation.
       (1) scatter x into a sequential vector;
       (2) each process evaluates all values of floc;
       (3) scatter floc back to the parallel f.
     */
    ierr = VecScatterBegin(scatter,x,xloc,INSERT_VALUES,SCATTER_FORWARD);CHKERRQ(ierr);
    ierr = VecScatterEnd(scatter,x,xloc,INSERT_VALUES,SCATTER_FORWARD);CHKERRQ(ierr);

    ierr = VecGetOwnershipRange(f,&rstart,&rend);CHKERRQ(ierr);
    ierr = VecGetArray(xloc,&xx);CHKERRQ(ierr);
    ierr = VecGetArray(floc,&ff);CHKERRQ(ierr);
    ff[0] = xx[0]*xx[0] + xx[0]*xx[1] - 3.0;
    ff[1] = xx[0]*xx[1] + xx[1]*xx[1] - 6.0;
    ierr = VecRestoreArray(floc,&ff);CHKERRQ(ierr);
    ierr = VecRestoreArray(xloc,&xx);CHKERRQ(ierr);

    ierr = VecScatterBegin(scatter,floc,f,INSERT_VALUES,SCATTER_REVERSE);CHKERRQ(ierr);
    ierr = VecScatterEnd(scatter,floc,f,INSERT_VALUES,SCATTER_REVERSE);CHKERRQ(ierr);
  } else {
    /*
       Get pointers to vector data.
       - For default PETSc vectors, VecGetArray() returns a pointer to
         the data array.  Otherwise, the routine is implementation dependent.
       - You MUST call VecRestoreArray() when you no longer need access to
         the array.
    */
    ierr = VecGetArray(x,&xx);CHKERRQ(ierr);
    ierr = VecGetArray(f,&ff);CHKERRQ(ierr);

    /* Compute function */
    ff[0] = xx[0]*xx[0] + xx[0]*xx[1] - 3.0;
    ff[1] = xx[0]*xx[1] + xx[1]*xx[1] - 6.0;

    /* Restore vectors */
    ierr = VecRestoreArray(x,&xx);CHKERRQ(ierr);
    ierr = VecRestoreArray(f,&ff);CHKERRQ(ierr);
  }
  return 0;
}

/* ------------------------------------------------------------------- */
#undef __FUNCT__
#define __FUNCT__ "FormJacobian1"
/*
   FormJacobian1 - Evaluates Jacobian matrix.

   Input Parameters:
.  snes  - the SNES context
.  x     - input vector
.  dummy - optional user-defined context (not used here)

   Output Parameters:
.  jac  - Jacobian matrix
.  B    - optionally different preconditioning matrix
.  flag - flag indicating matrix structure
*/
PetscErrorCode FormJacobian1(SNES snes,Vec x,Mat *jac,Mat *B,MatStructure *flag,void *dummy)
{
  PetscScalar    *xx,A[4];
  PetscErrorCode ierr;
  PetscInt       idx[2] = {0,1};

  /*
     Get pointer to vector data
  */
  ierr = VecGetArray(x,&xx);CHKERRQ(ierr);

  /*
     Compute Jacobian entries and insert into matrix.
      - Since this is such a small problem, we set all entries for
        the matrix at once.
  */
  A[0] = 2.0*xx[0] + xx[1]; A[1] = xx[0];
  A[2] = xx[1];             A[3] = xx[0] + 2.0*xx[1];
  ierr = MatSetValues(*B,2,idx,2,idx,A,INSERT_VALUES);CHKERRQ(ierr);
  *flag = SAME_NONZERO_PATTERN;

  /*
     Restore vector
  */
  ierr = VecRestoreArray(x,&xx);CHKERRQ(ierr);

  /*
     Assemble matrix
  */
  ierr = MatAssemblyBegin(*B,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(*B,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  if (*jac != *B){
    ierr = MatAssemblyBegin(*jac,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(*jac,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  }
  return 0;
}

/* ------------------------------------------------------------------- */
#undef __FUNCT__
#define __FUNCT__ "FormFunction2"
PetscErrorCode FormFunction2(SNES snes,Vec x,Vec f,void *dummy)
{
  PetscErrorCode ierr;
  PetscScalar    *xx,*ff;

  /*
     Get pointers to vector data.
     - For default PETSc vectors, VecGetArray() returns a pointer to
       the data array.  Otherwise, the routine is implementation dependent.
     - You MUST call VecRestoreArray() when you no longer need access to
       the array.
  */
  ierr = VecGetArray(x,&xx);CHKERRQ(ierr);
  ierr = VecGetArray(f,&ff);CHKERRQ(ierr);

  /*
     Compute function
  */
  ff[0] = PetscSinScalar(3.0*xx[0]) + xx[0];
  ff[1] = xx[1];

  /*
     Restore vectors
  */
  ierr = VecRestoreArray(x,&xx);CHKERRQ(ierr);
  ierr = VecRestoreArray(f,&ff);CHKERRQ(ierr);
  return 0;
}

/* ------------------------------------------------------------------- */
#undef __FUNCT__
#define __FUNCT__ "FormJacobian2"
PetscErrorCode FormJacobian2(SNES snes,Vec x,Mat *jac,Mat *B,MatStructure *flag,void *dummy)
{
  PetscScalar    *xx,A[4];
  PetscErrorCode ierr;
  PetscInt       idx[2] = {0,1};

  /*
     Get pointer to vector data
  */
  ierr = VecGetArray(x,&xx);CHKERRQ(ierr);

  /*
     Compute Jacobian entries and insert into matrix.
      - Since this is such a small problem, we set all entries for
        the matrix at once.
  */
  A[0] = 3.0*PetscCosScalar(3.0*xx[0]) + 1.0; A[1] = 0.0;
  A[2] = 0.0;                                 A[3] = 1.0;
  ierr = MatSetValues(*B,2,idx,2,idx,A,INSERT_VALUES);CHKERRQ(ierr);
  *flag = SAME_NONZERO_PATTERN;

  /*
     Restore vector
  */
  ierr = VecRestoreArray(x,&xx);CHKERRQ(ierr);

  /*
     Assemble matrix
  */
  ierr = MatAssemblyBegin(*B,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(*B,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  if (*jac != *B){
    ierr = MatAssemblyBegin(*jac,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(*jac,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  }
  return 0;
}

Figure 14: Example of Uniprocess SNES Code

To create a SNES solver, one must first call SNESCreate() as follows:

SNESCreate(MPI_Comm comm,SNES *snes);

The user must then set routines for evaluating the function of Equation (5.1) and its associated Jacobian matrix, as discussed in the following sections.

To choose a nonlinear solution method, the user can either call

SNESSetType(SNES snes,SNESType method);

or use the option -snes_type <method>, where details regarding the available methods are presented in Section 5.2. The application code can take complete control of the linear and nonlinear techniques used in the Newton-like method by calling

SNESSetFromOptions(snes);

This routine provides an interface to the PETSc options database, so that at runtime the user can select a particular nonlinear solver, set various parameters and customized routines (e.g., specialized line search variants), prescribe the convergence tolerance, and set monitoring routines. With this routine the user can also control all linear solver options in the KSP and PC modules, as discussed in Chapter 4.

After having set these routines and options, the user solves the problem by calling

SNESSolve(SNES snes,Vec b,Vec x);

where x indicates the solution vector and b is an optional right-hand side (the example in Figure 14 passes PETSC_NULL, since it solves F(x) = 0). The user should initialize x to the initial guess for the nonlinear solver prior to calling SNESSolve(). In particular, to employ an initial guess of zero, the user should explicitly set this vector to zero by calling VecSet(). Finally, after solving the nonlinear system (or several systems), the user should destroy the SNES context with

SNESDestroy(SNES *snes);

5.1.1 Nonlinear Function Evaluation

When solving a system of nonlinear equations, the user must provide a vector, f, for storing the function of Equation (5.1), as well as a routine that evaluates this function at the vector x. This information should be set with the command

SNESSetFunction(SNES snes,Vec f,PetscErrorCode (*FormFunction)(SNES snes,Vec x,Vec f,void *ctx),void *ctx);

The argument ctx is an optional user-defined context, which can store any private, application-specific data required by the function evaluation routine; PETSC_NULL should be used if such information is not needed. In C and C++, a user-defined context is merely a structure in which various objects can be stashed; in Fortran a user context can be an integer array that contains both parameters and pointers to PETSc objects. ${PETSC_DIR}/src/snes/examples/tutorials/ex5.c and ${PETSC_DIR}/src/snes/examples/tutorials/ex5f.F give examples of user-defined application contexts in C and Fortran, respectively.
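The role of the ctx argument can be illustrated without PETSc. In the sketch below, which uses hypothetical names (Params, ResidualFn, form_function, evaluate) and plain arrays in place of Vec objects, the caller stores application data in a structure and passes its address through a void pointer; the evaluation routine casts the pointer back to recover the data. It only mimics the calling pattern of SNESSetFunction() and is not PETSc code.

```c
#include <stddef.h>

/* Application-specific data needed by the function evaluation (hypothetical) */
typedef struct {
  double a, b;   /* constants subtracted in the two equations */
} Params;

/* Callback type modeled loosely on the FormFunction argument of SNESSetFunction() */
typedef int (*ResidualFn)(const double *x, double *f, void *ctx);

/* Evaluates f(x) using the constants stored in the context */
int form_function(const double *x, double *f, void *ctx)
{
  Params *p = (Params*)ctx;
  if (p == NULL) return 1;          /* this particular routine requires a context */
  f[0] = x[0]*x[0] + x[0]*x[1] - p->a;
  f[1] = x[0]*x[1] + x[1]*x[1] - p->b;
  return 0;
}

/* Stand-in for the solver: it forwards ctx to the callback untouched */
int evaluate(ResidualFn fn, void *ctx, const double *x, double *f)
{
  return fn(x, f, ctx);
}
```

With Params p = {3.0, 6.0}, calling evaluate(form_function, &p, x, f) reproduces the function of the example in Figure 14. The solver never inspects the context, which is why any structure can be used.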

5.1.2 Jacobian Evaluation

The user must also specify a routine to form some approximation of the Jacobian matrix, A, at the current iterate, x, as is typically done with

SNESSetJacobian(SNES snes,Mat A,Mat B,PetscErrorCode (*FormJacobian)(SNES snes,Vec x,Mat *A,Mat *B,MatStructure *flag,void *ctx),void *ctx);

The arguments of the routine FormJacobian() are the current iterate, x; the Jacobian matrix, A; the preconditioner matrix, B (which is usually the same as A); a flag indicating information about the preconditioner matrix structure; and an optional user-defined Jacobian context, ctx, for application-specific data. The options for flag are identical to those for the flag of KSPSetOperators(), discussed in Section 4.1.

Note that the SNES solvers are all data-structure neutral, so the full range of PETSc matrix formats (including “matrix-free” methods) can be used. Chapter 3 discusses information regarding available matrix formats and options, while Section 5.5 focuses on matrix-free methods in SNES. We briefly touch on a few details of matrix usage that are particularly important for efficient use of the nonlinear solvers.

A common usage paradigm is to assemble the problem Jacobian in the preconditioner storage B, rather than A. In the case where they are identical, as in many simulations, this makes no difference. However, it allows us to check the analytic Jacobian we construct in FormJacobian() by passing the -snes_mf_operator flag. This causes PETSc to approximate the Jacobian using finite differencing of the function evaluation (discussed in Section 5.6), and the analytic Jacobian becomes merely the preconditioner. Even if the analytic Jacobian is incorrect, it is likely that the finite difference approximation will converge, and thus this is an excellent method to verify the analytic Jacobian. Moreover, if the analytic Jacobian is incomplete (some terms are missing or approximate), -snes_mf_operator may be used to obtain the exact solution, where the Jacobian approximation has been transferred to the preconditioner.

Method          SNESType   Options Name   Default Convergence Test
Line search     SNESLS     ls             SNESConverged_LS()
Trust region    SNESTR     tr             SNESConverged_TR()
Test Jacobian   SNESTEST   test

Table 6: PETSc Nonlinear Solvers

During successive calls to FormJacobian(), the user can either insert new matrix contexts or reuse old ones, depending on the application requirements. For many sparse matrix formats, reusing the old space (and merely changing the matrix elements) is more efficient; however, if the matrix structure completely changes, creating an entirely new matrix context may be preferable. Upon subsequent calls to the FormJacobian() routine, the user may wish to reinitialize the matrix entries to zero by calling MatZeroEntries(). See Section 3.4 for details on the reuse of the matrix context.

If the preconditioning matrix retains identical nonzero structure during successive nonlinear iterations, setting the parameter, flag, in the FormJacobian() routine to be SAME_NONZERO_PATTERN and reusing the matrix context can save considerable overhead. For example, when one is using a parallel preconditioner such as incomplete factorization in solving the linearized Newton systems for such problems, matrix colorings and communication patterns can be determined a single time and then reused repeatedly throughout the solution process. In addition, if using different matrices for the actual Jacobian and the preconditioner, the user can hold the preconditioner matrix fixed for multiple iterations by setting flag to SAME_PRECONDITIONER. See the discussion of KSPSetOperators() in Section 4.1 for details.

The directory ${PETSC_DIR}/src/snes/examples/tutorials provides a variety of examples.

5.2 The Nonlinear Solvers

As summarized in Table 6, SNES includes several Newton-like nonlinear solvers based on line search techniques and trust region methods.

Each solver may have associated with it a set of options, which can be set with routines and options database commands provided for this purpose. A complete list can be found by consulting the manual pages or by running a program with the -help option; we discuss just a few in the sections below.

5.2.1 Line Search Techniques

The method SNESLS (-snes_type ls) provides a line search Newton method for solving systems of nonlinear equations. By default, this technique employs cubic backtracking [4]. An alternative line search routine can be set with the command

SNESSetLineSearch(SNES snes,PetscErrorCode (*ls)(SNES,Vec,Vec,Vec,Vec,double,double*,double*),void *lsctx);

Other line search methods provided by PETSc are SNESSearchQuadraticLine(), SNESLineSearchNo(), and SNESLineSearchNoNorms(), which can be set with the option

-snes_ls [cubic, quadratic, basic, basicnonorms]

The line search routines involve several parameters, which are set to defaults that are reasonable for many applications. The user can override the defaults by using the options -snes_ls_alpha <alpha>, -snes_ls_maxstep <max>, and -snes_ls_steptol <tol>.

5.2.2 Trust Region Methods

The trust region method in SNES for solving systems of nonlinear equations, SNESTR (-snes_type tr), is taken from the MINPACK project [13]. Several parameters can be set to control the variation of the trust region size during the solution process. In particular, the user can control the initial trust region radius, computed by

$$\Delta = \Delta_0 \| F_0 \|_2,$$

by setting $\Delta_0$ via the option -snes_tr_delta0 <delta0>.

5.3 General Options

This section discusses options and routines that apply to all SNES solvers and problem classes. In particular, we focus on convergence tests, monitoring routines, and tools for checking derivative computations.

5.3.1 Convergence Tests

Convergence of the nonlinear solvers can be detected in a variety of ways; the user can even specify a customized test, as discussed below. The default convergence routines for the various nonlinear solvers within SNES are listed in Table 6; see the corresponding manual pages for detailed descriptions. Each of these convergence tests involves several parameters, which are set by default to values that should be reasonable for a wide range of problems. The user can customize the parameters to the problem at hand by using some of the following routines and options.

One method of convergence testing is to declare convergence when the norm of the change in the solution between successive iterations is less than some tolerance, stol. Convergence can also be determined based on the norm of the function. Such a test can use either the absolute size of the norm, atol, or its relative decrease, rtol, from an initial guess. The following routine sets these parameters, which are used in many of the default SNES convergence tests:

SNESSetTolerances(SNES snes,double atol,double rtol,double stol,int its,int fcts);

This routine also sets the maximum numbers of allowable nonlinear iterations, its, and function evaluations, fcts. The corresponding options database commands for setting these parameters are -snes_atol <atol>, -snes_rtol <rtol>, -snes_stol <stol>, -snes_max_it <its>, and -snes_max_funcs <fcts>. A related routine is SNESGetTolerances().
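As a rough sketch of how the three tolerances interact, the following PETSc-independent C function applies the tests described above in a fixed order. The function name check_tolerances, the return codes, the ordering of the tests, and the exact form of each comparison are assumptions made for illustration; the actual default tests may differ in detail, so the manual pages remain the reference.

```c
/* Illustrative return codes (hypothetical, not PETSc's SNESConvergedReason values) */
enum { ITERATING = 0, CONVERGED_ATOL = 1, CONVERGED_RTOL = 2, CONVERGED_STOL = 3 };

/*
   fnorm  - current function norm ||F(x_k)||
   fnorm0 - function norm at the initial guess, ||F(x_0)||
   snorm  - norm of the latest change in the solution, ||x_k - x_{k-1}||
*/
int check_tolerances(double fnorm, double fnorm0, double snorm,
                     double atol, double rtol, double stol)
{
  if (fnorm < atol)          return CONVERGED_ATOL;  /* absolute size of the function norm */
  if (fnorm < rtol * fnorm0) return CONVERGED_RTOL;  /* relative decrease from the initial guess */
  if (snorm < stol)          return CONVERGED_STOL;  /* small change in the solution */
  return ITERATING;
}
```

A solver loop would call such a function once per nonlinear iteration and stop as soon as the result is nonzero, in addition to enforcing the limits its and fcts.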

Convergence tests for trust region methods often use an additional parameter that indicates the minimum allowable trust region radius. The user can set this parameter with the option -snes_trtol <trtol> or with the routine

SNESSetTrustRegionTolerance(SNES snes,double trtol);

Users can set their own customized convergence tests in SNES by using the command

SNESSetConvergenceTest(SNES snes,PetscErrorCode (*test)(SNES snes,int it,double xnorm,double gnorm,double f,SNESConvergedReason *reason,void *cctx),void *cctx,PetscErrorCode (*destroy)(void *cctx));

The final argument of the convergence test routine, cctx, denotes an optional user-defined context for private data. When solving systems of nonlinear equations, the arguments xnorm, gnorm, and f are the current iterate norm, current step norm, and function norm, respectively. The output argument reason, of type SNESConvergedReason, should be set positive for convergence and negative for divergence. See include/petscsnes.h for a list of values for SNESConvergedReason.

5.3.2 Convergence Monitoring

By default the SNES solvers run silently without displaying information about the iterations. The user can initiate monitoring with the command

SNESMonitorSet(SNES snes,PetscErrorCode (*mon)(SNES,int its,double norm,void* mctx),void *mctx,PetscErrorCode (*monitordestroy)(void**));

The routine, mon, indicates a user-defined monitoring routine, where its and mctx respectively denote the iteration number and an optional user-defined context for private data for the monitor routine. The argument norm is the function norm.

The routine set by SNESMonitorSet() is called once after every successful step computation within the nonlinear solver. Hence, the user can employ this routine for any application-specific computations that should be done after the solution update. The option -snes_monitor activates the default SNES monitor routine, SNESMonitorDefault(), while -snes_monitor_draw draws a simple line graph of the residual norm's convergence.

One can cancel hardwired monitoring routines for SNES at runtime with -snes_monitor_cancel.

As the Newton method converges so that the residual norm is small, say $10^{-10}$, many of the final digits printed with the -snes_monitor option are meaningless. Worse, they differ from machine to machine because of the different round-off rules used by, say, the IBM RS6000 and the Sun Sparc. This makes testing between different machines difficult. The option -snes_monitor_short causes PETSc to print fewer of the digits of the residual norm as it gets smaller; thus on most machines it prints the same numbers, which makes cross-machine testing easier.

The routines

SNESGetSolution(SNES snes,Vec *x);
SNESGetFunction(SNES snes,Vec *r,void *ctx,int(**func)(SNES,Vec,Vec,void*));

return the solution vector and function vector from a SNES context. These routines are useful, for instance, if the convergence test requires some property of the solution or function other than those passed with routine arguments.

5.3.3 Checking Accuracy of Derivatives

Since hand-coding routines for Jacobian matrix evaluation can be error prone, SNES provides easy-to-use support for checking these matrices against finite difference versions. In the simplest form of comparison, users can employ the option -snes_type test to compare the matrices at several points. Although not exhaustive, this test will generally catch obvious problems. One can compare the elements of the two matrices by using the option -snes_test_display, which causes the two matrices to be printed to the screen.

Another means for verifying the correctness of a code for Jacobian computation is running the problem with either the finite difference or matrix-free variant, -snes_fd or -snes_mf (see Section 5.6 or Section 5.5). If a problem converges well with these matrix approximations but not with a user-provided routine, the problem probably lies with the hand-coded matrix.
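The idea behind such a comparison can be sketched in plain C for the two-variable system of Figure 14. The code below is only an illustration under stated assumptions, not the algorithm used by -snes_type test: the helper names eval_f, eval_jac, and jacobian_fd_error are hypothetical, and a simple forward difference with a fixed, caller-chosen step h is assumed.

```c
#include <math.h>

/* Function of the example: F(x) = (x0^2 + x0 x1 - 3, x0 x1 + x1^2 - 6) */
void eval_f(const double x[2], double f[2])
{
  f[0] = x[0]*x[0] + x[0]*x[1] - 3.0;
  f[1] = x[0]*x[1] + x[1]*x[1] - 6.0;
}

/* Hand-coded Jacobian to be checked, stored row by row */
void eval_jac(const double x[2], double J[2][2])
{
  J[0][0] = 2.0*x[0] + x[1];  J[0][1] = x[0];
  J[1][0] = x[1];             J[1][1] = x[0] + 2.0*x[1];
}

/*
   Hypothetical helper: returns the largest absolute difference between
   the hand-coded Jacobian and a forward-difference approximation built
   column by column,  J(:,j) ~ (F(x + h e_j) - F(x)) / h.
*/
double jacobian_fd_error(const double x[2], double h)
{
  double J[2][2], f0[2], f1[2], xp[2], err = 0.0, d;
  int    i, j;

  eval_jac(x, J);
  eval_f(x, f0);
  for (j = 0; j < 2; j++) {
    xp[0] = x[0]; xp[1] = x[1];
    xp[j] += h;                       /* perturb one component */
    eval_f(xp, f1);
    for (i = 0; i < 2; i++) {
      d = fabs((f1[i] - f0[i]) / h - J[i][j]);
      if (d > err) err = d;
    }
  }
  return err;
}
```

For a correct Jacobian the returned value is on the order of h plus round-off; a value comparable to the Jacobian entries themselves would point to a mistake in eval_jac.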


5.4 Inexact Newton-like Methods

Since exact solution of the linear Newton systems within (5.2) at each iteration can be costly, modifications are often introduced that significantly reduce these expenses and yet retain the rapid convergence of Newton's method. Inexact or truncated Newton techniques approximately solve the linear systems using an iterative scheme. In comparison with using direct methods for solving the Newton systems, iterative methods have the virtue of requiring little space for matrix storage and potentially saving significant computational work. Within the class of inexact Newton methods, of particular interest are Newton-Krylov methods, where the subsidiary iterative technique for solving the Newton system is chosen from the class of Krylov subspace projection methods. Note that at runtime the user can set any of the linear solver options discussed in Chapter 4, such as -ksp_type <ksp_method> and -pc_type <pc_method>, to set the Krylov subspace and preconditioner methods.

Two levels of iterations occur for the inexact techniques, where during each global or outer Newton iteration a sequence of subsidiary inner iterations of a linear solver is performed. Appropriate control of the accuracy to which the subsidiary iterative method solves the Newton system at each global iteration is critical, since these inner iterations determine the asymptotic convergence rate for inexact Newton techniques. While the Newton systems must be solved well enough to retain fast local convergence of the Newton iterates, use of excessive inner iterations, particularly when $\| x_k - x_* \|$ is large, is neither necessary nor economical. Thus, the number of required inner iterations typically increases as the Newton process progresses, so that the truncated iterates approach the true Newton iterates.

A sequence of nonnegative numbers $\{\eta_k\}$ can be used to indicate the variable convergence criterion.

In this case, when solving a system of nonlinear equations, the update step of the Newton process remains unchanged, and direct solution of the linear system is replaced by iteration on the system until the residuals

$$r_k^{(i)} = F'(x_k) \Delta x_k + F(x_k)$$

satisfy

$$\frac{\| r_k^{(i)} \|}{\| F(x_k) \|} \le \eta_k \le \eta < 1.$$

Here $x_0$ is an initial approximation of the solution, and $\| \cdot \|$ denotes an arbitrary norm in $\mathbb{R}^n$.

By default a constant relative convergence tolerance is used for solving the subsidiary linear systems within the Newton-like methods of SNES. When solving a system of nonlinear equations, one can instead employ the techniques of Eisenstat and Walker [6] to compute $\eta_k$ at each step of the nonlinear solver by using the option -snes_ksp_ew_conv. In addition, by adding one's own KSP convergence test (see Section 4.3.2), one can easily create one's own, problem-dependent, inner convergence tests.
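For orientation, one of the forcing-term choices proposed by Eisenstat and Walker ties $\eta_k$ to the observed reduction of the function norm between successive outer iterations. The formula below is a sketch of that general idea only; the symbols $\gamma$ and $\alpha$ are parameters of the technique, and which variant, parameter values, and safeguards PETSc applies are not stated in this section, so the manual page for SNESKSPSetUseEW() should be consulted.

```latex
\eta_k = \gamma \left( \frac{\| F(x_k) \|}{\| F(x_{k-1}) \|} \right)^{\alpha},
\qquad \gamma \in [0, 1], \quad \alpha \in (1, 2].
```

With such a choice the linear systems are solved loosely while the function norm is decreasing slowly and more tightly as the outer iteration converges, which matches the behavior described in the preceding paragraphs.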

5.5

Matrix-Free Methods

The SNES class fully supports matrix-free methods. The matrices specified in the Jacobian evaluation routine need not be conventional matrices; instead, they can point to the data required to implement a particular matrix-free method. The matrix-free variant is allowed only when the linear systems are solved by an iterative method in combination with no preconditioning ( PCNONE or

-pc_type none

), a user-provided preconditioner matrix, or a user-provided preconditioner shell ( PCSHELL , discussed in Section

4.4

); that

is, obviously matrix-free methods cannot be used if a direct solver is to be employed.

The user can create a matrix-free context for use within SNES with the routine

MatCreateSNESMF ( SNES snes, Mat *mat);

103

This routine creates the data structures needed for the matrix-vector products that arise within Krylov space

iterative methods [ 2 ] by employing the matrix type

MATSHELL , discussed in Section

3.3

. The default

SNES matrix-free approximations can also be invoked with the command

-snes_mf

. Or, one can retain the user-provided Jacobian preconditioner, but replace the user-provided Jacobian matrix with the default matrix free variant with the option -snes_mf_operator .

See also

MatCreateMFFD ( Vec x, Mat *mat);

for users who need a matrix-free matrix but are not using SNES.

The user can set one parameter to control the Jacobian-vector product approximation with the command

MatMFFDSetFunctionError ( Mat mat,double rerror);

The parameter rerror should be set to the square root of the relative error in the function evaluations, $e_{rel}$; the default is $10^{-8}$, which assumes that the functions are evaluated to full double precision accuracy. This parameter can also be set from the options database with

-snes_mf_err <err>

In addition, SNES provides a way to register new routines to compute the differencing parameter ($h$); see the manual pages for MatMFFDSetType() and MatMFFDRegisterDynamic(). We currently provide two default routines accessible via

-snes_mf_type <default or wp>

For the default approach there is one “tuning” parameter, set with

MatMFFDDSSetUmin ( Mat mat, PetscReal umin);

This parameter, umin (or $u_{min}$), is a bit involved; its default is $10^{-6}$. The Jacobian-vector product is approximated via the formula

$$ F'(u) a \approx \frac{F(u + h * a) - F(u)}{h} $$

where $h$ is computed via

$$ h = \begin{cases} e_{rel} * u^T a / \|a\|_2^2 & \text{if } |u^T a| > u_{min} * \|a\|_1 \\ e_{rel} * u_{min} * \mathrm{sign}(u^T a) * \|a\|_1 / \|a\|_2^2 & \text{otherwise.} \end{cases} $$

This approach is taken from Brown and Saad [2]. The parameter can also be set from the options database with

-snes_mf_umin <umin>
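The default differencing rule and the resulting Jacobian-vector product can be sketched in self-contained C as follows. This is an illustration of the formulas only, not the PETSc source: the names mf_default_h, mf_jacobian_apply, and example_residual are hypothetical, sign(0) is taken to be +1 (an assumption), and the caller supplies the work arrays.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

typedef void (*residual_fn)(const double *u, double *f, size_t n);

/* Differencing parameter h chosen from u, a, e_rel and u_min. */
double mf_default_h(const double *u, const double *a, size_t n,
                    double e_rel, double u_min)
{
  double uta = 0.0, a1 = 0.0, a2sq = 0.0;
  for (size_t i = 0; i < n; i++) {
    uta  += u[i] * a[i];     /* u^T a      */
    a1   += fabs(a[i]);      /* ||a||_1    */
    a2sq += a[i] * a[i];     /* ||a||_2^2  */
  }
  assert(a2sq > 0.0);        /* a must be nonzero */
  if (fabs(uta) > u_min * a1) return e_rel * uta / a2sq;
  /* Assumption: sign(0) is treated as +1 so that h never vanishes. */
  return e_rel * u_min * (uta >= 0.0 ? 1.0 : -1.0) * a1 / a2sq;
}

/* y ~= F'(u) a = (F(u + h a) - F(u)) / h; up and fu are work arrays. */
void mf_jacobian_apply(residual_fn F, const double *u, const double *a, size_t n,
                       double e_rel, double u_min,
                       double *up, double *fu, double *y)
{
  double h = mf_default_h(u, a, n, e_rel, u_min);
  for (size_t i = 0; i < n; i++) up[i] = u[i] + h * a[i];
  F(u, fu, n);
  F(up, y, n);
  for (size_t i = 0; i < n; i++) y[i] = (y[i] - fu[i]) / h;
}

/* Hypothetical test function F_i(u) = u_i^2, so (F'(u) a)_i = 2 u_i a_i. */
void example_residual(const double *u, double *f, size_t n)
{
  for (size_t i = 0; i < n; i++) f[i] = u[i] * u[i];
}
```

With u = (1, 2) and a = (1, 1), mf_jacobian_apply() returns a vector close to (2, 4), the exact product for this test function.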

The second approach, taken from Walker and Pernice [15], computes $h$ via

$$ h = \frac{\sqrt{1 + \|u\|} \, e_{rel}}{\|a\|} $$

This has no tunable parameters, but note that inside the nonlinear solve $u$ does not change during the entire linear iterative process; hence $\sqrt{1 + \|u\|}$ need be computed only once. This information may be set with the options

MatMFFDWPSetComputeNormU ( Mat mat, PetscBool );

or

-mat_mffd_compute_normu <true or false>

This information is used to eliminate the redundant computation of these parameters, therefore reducing the number of collective operations and improving the efficiency of the application code.

It is also possible to monitor the differencing parameters h that are computed via the routines

MatMFFDSetHHistory ( Mat , PetscScalar *,int);

MatMFFDResetHHistory ( Mat , PetscScalar *,int);

MatMFFDGetH ( Mat , PetscScalar *);

We include an example in Figure 15 that explicitly uses a matrix-free approach. Note that by using the option -snes_mf one can easily convert any SNES code to use a matrix-free Newton-Krylov method without a preconditioner. As shown in this example, SNESSetFromOptions() must be called after SNESSetJacobian() to enable runtime switching between the user-specified Jacobian and the default SNES matrix-free form.

Table 7 summarizes the various matrix situations that SNES supports. In particular, different linear system matrices and preconditioning matrices are allowed, as well as both matrix-free and application-provided preconditioners. All combinations are possible, as demonstrated by the example, ${PETSC_DIR}/src/snes/examples/tutorials/ex6.c, in Figure 15.

Matrix Use              Conventional Matrix Formats          Matrix-Free Versions
----------------------  -----------------------------------  ---------------------------------------------
Jacobian Matrix         Create matrix with MatCreate().∗     Create matrix with MatCreateShell().
                        Assemble matrix with user-defined    Use MatShellSetOperation() to set
                        routine.†                            various matrix actions.
                                                             Or use MatCreateMFFD() or MatCreateSNESMF().

Preconditioning Matrix  Create matrix with MatCreate().∗     Use SNESGetKSP() and KSPGetPC()
                        Assemble matrix with user-defined    to access the PC, then use
                        routine.†                            PCSetType(pc,PCSHELL); followed by
                                                             PCShellSetApply().

∗ Use either the generic MatCreate() or a format-specific variant such as MatCreateMPIAIJ().
† Set user-defined matrix formation routine with SNESSetJacobian().

Table 7: Jacobian Options

static char help[] = "u`` + u^{2} = f. Different matrices for the Jacobian and the preconditioner.\n\
Demonstrates the use of matrix-free Newton-Krylov methods in conjunction\n\
with a user-provided preconditioner.  Input arguments are:\n\
   -snes_mf : Use matrix-free Newton methods\n\
   -user_precond : Employ a user-defined preconditioner.  Used only with\n\
                   matrix-free methods in this example.\n\n";

/*T
   Concepts: SNES^different matrices for the Jacobian and preconditioner;
   Concepts: SNES^matrix-free methods
   Concepts: SNES^user-provided preconditioner;
   Concepts: matrix-free methods
   Concepts: user-provided preconditioner;
   Processors: 1
T*/

/*
   Include "petscsnes.h" so that we can use SNES solvers.  Note that this
   file automatically includes:
     petscsys.h    - base PETSc routines
     petscvec.h    - vectors
     petscmat.h    - matrices
     petscis.h     - index sets
     petscviewer.h - viewers
     petscksp.h    - linear solvers (Krylov subspace methods)
     petscpc.h     - preconditioners
*/
#include <petscsnes.h>

/*
   User-defined routines
*/
PetscErrorCode FormJacobian(SNES,Vec,Mat*,Mat*,MatStructure*,void*);
PetscErrorCode FormFunction(SNES,Vec,Vec,void*);
PetscErrorCode MatrixFreePreconditioner(PC,Vec,Vec);

int main(int argc,char **argv)
{
  SNES           snes;                 /* SNES context */
  KSP            ksp;                  /* KSP context */
  PC             pc;                   /* PC context */
  Vec            x,r,F;                /* vectors */
  Mat            J,JPrec;              /* Jacobian,preconditioner matrices */
  PetscErrorCode ierr;                 /* error code checked by CHKERRQ() */
  PetscInt       it,n = 5,i;
  PetscMPIInt    size;
  PetscInt       *Shistit = 0,Khistl = 200,Shistl = 10;
  PetscReal      h,xp = 0.0,*Khist = 0,*Shist = 0;
  PetscScalar    v,pfive = .5;
  PetscBool      flg;

  PetscInitialize(&argc,&argv,(char *)0,help);
  ierr = MPI_Comm_size(PETSC_COMM_WORLD,&size);CHKERRQ(ierr);
  if (size != 1) SETERRQ(PETSC_COMM_SELF,1,"This is a uniprocessor example only!");
  ierr = PetscOptionsGetInt(PETSC_NULL,"-n",&n,PETSC_NULL);CHKERRQ(ierr);
  h = 1.0/(n-1);

  /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
     Create nonlinear solver context
     - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
  ierr = SNESCreate(PETSC_COMM_WORLD,&snes);CHKERRQ(ierr);

  /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
     Create vector data structures; set function evaluation routine
     - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
  ierr = VecCreate(PETSC_COMM_SELF,&x);CHKERRQ(ierr);
  ierr = VecSetSizes(x,PETSC_DECIDE,n);CHKERRQ(ierr);
  ierr = VecSetFromOptions(x);CHKERRQ(ierr);
  ierr = VecDuplicate(x,&r);CHKERRQ(ierr);
  ierr = VecDuplicate(x,&F);CHKERRQ(ierr);
  ierr = SNESSetFunction(snes,r,FormFunction,(void*)F);CHKERRQ(ierr);

  /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
     Create matrix data structures; set Jacobian evaluation routine
     - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
  ierr = MatCreateSeqAIJ(PETSC_COMM_SELF,n,n,3,PETSC_NULL,&J);CHKERRQ(ierr);
  ierr = MatCreateSeqAIJ(PETSC_COMM_SELF,n,n,1,PETSC_NULL,&JPrec);CHKERRQ(ierr);

  /*
     Note that in this case we create separate matrices for the Jacobian
     and preconditioner matrix.  Both of these are computed in the
     routine FormJacobian()
  */
  ierr = SNESSetJacobian(snes,J,JPrec,FormJacobian,0);CHKERRQ(ierr);

  /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
     Customize nonlinear solver; set runtime options
     - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */

  /* Set preconditioner for matrix-free method */
  flg = PETSC_FALSE;
  ierr = PetscOptionsGetBool(PETSC_NULL,"-snes_mf",&flg,PETSC_NULL);CHKERRQ(ierr);
  if (flg) {
    ierr = SNESGetKSP(snes,&ksp);CHKERRQ(ierr);
    ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
    ierr = PetscOptionsHasName(PETSC_NULL,"-user_precond",&flg);CHKERRQ(ierr);
    if (flg) { /* user-defined precond */
      ierr = PCSetType(pc,PCSHELL);CHKERRQ(ierr);
      ierr = PCShellSetApply(pc,MatrixFreePreconditioner);CHKERRQ(ierr);
    } else {ierr = PCSetType(pc,PCNONE);CHKERRQ(ierr);}
  }

  ierr = SNESSetFromOptions(snes);CHKERRQ(ierr);

  /*
     Save all the linear residuals for all the Newton steps; this enables us
     to retain complete convergence history for printing after the conclusion
     of SNESSolve().  Alternatively, one could use the monitoring options
           -snes_monitor -ksp_monitor
     to see this information during the solver's execution; however, such
     output during the run distorts performance evaluation data.  So, the
     following is a good option when monitoring code performance, for example
     when using -log_summary.
  */
  ierr = PetscOptionsHasName(PETSC_NULL,"-rhistory",&flg);CHKERRQ(ierr);
  if (flg) {
    ierr = SNESGetKSP(snes,&ksp);CHKERRQ(ierr);
    ierr = PetscMalloc(Khistl*sizeof(PetscReal),&Khist);CHKERRQ(ierr);
    ierr = KSPSetResidualHistory(ksp,Khist,Khistl,PETSC_FALSE);CHKERRQ(ierr);
    ierr = PetscMalloc(Shistl*sizeof(PetscReal),&Shist);CHKERRQ(ierr);
    ierr = PetscMalloc(Shistl*sizeof(PetscInt),&Shistit);CHKERRQ(ierr);
    ierr = SNESSetConvergenceHistory(snes,Shist,Shistit,Shistl,PETSC_FALSE);CHKERRQ(ierr);
  }

  /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
     Initialize application:
     Store right-hand-side of PDE and exact solution
     - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
  xp = 0.0;
  for (i=0; i<n; i++) {
    v = 6.0*xp + pow(xp+1.e-12,6.0); /* +1.e-12 is to prevent 0^6 */
    ierr = VecSetValues(F,1,&i,&v,INSERT_VALUES);CHKERRQ(ierr);
    xp += h;
  }

  /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
     Evaluate initial guess; then solve nonlinear system
     - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
  ierr = VecSet(x,pfive);CHKERRQ(ierr);
  ierr = SNESSolve(snes,PETSC_NULL,x);CHKERRQ(ierr);
  ierr = SNESGetIterationNumber(snes,&it);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_SELF,"Newton iterations = %D\n\n",it);CHKERRQ(ierr);

  ierr = PetscOptionsHasName(PETSC_NULL,"-rhistory",&flg);CHKERRQ(ierr);
  if (flg) {
    ierr = KSPGetResidualHistory(ksp,PETSC_NULL,&Khistl);CHKERRQ(ierr);
    ierr = PetscRealView(Khistl,Khist,PETSC_VIEWER_STDOUT_SELF);CHKERRQ(ierr);
    ierr = PetscFree(Khist);CHKERRQ(ierr);
    ierr = SNESGetConvergenceHistory(snes,PETSC_NULL,PETSC_NULL,&Shistl);CHKERRQ(ierr);
    ierr = PetscRealView(Shistl,Shist,PETSC_VIEWER_STDOUT_SELF);CHKERRQ(ierr);
    ierr = PetscIntView(Shistl,Shistit,PETSC_VIEWER_STDOUT_SELF);CHKERRQ(ierr);
    ierr = PetscFree(Shist);CHKERRQ(ierr);
    ierr = PetscFree(Shistit);CHKERRQ(ierr);
  }

  /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
     Free work space.  All PETSc objects should be destroyed when they
     are no longer needed.
     - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&F);CHKERRQ(ierr);
  ierr = VecDestroy(&r);CHKERRQ(ierr);
  ierr = MatDestroy(&J);CHKERRQ(ierr);
  ierr = MatDestroy(&JPrec);CHKERRQ(ierr);
  ierr = SNESDestroy(&snes);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return 0;
}

/* ------------------------------------------------------------------- */
/*
   FormFunction - Evaluates the nonlinear function, F(x).

   Input Parameters:
.  snes - the SNES context
.  x - input vector
.  dummy - user-defined context; here the right-hand-side vector F passed to SNESSetFunction()

   Output Parameter:
.  f - function vector
*/
PetscErrorCode FormFunction(SNES snes,Vec x,Vec f,void *dummy)
{
  PetscScalar    *xx,*ff,*FF,d;
  PetscErrorCode ierr;
  PetscInt       i,n;

  ierr = VecGetArray(x,&xx);CHKERRQ(ierr);
  ierr = VecGetArray(f,&ff);CHKERRQ(ierr);
  ierr = VecGetArray((Vec)dummy,&FF);CHKERRQ(ierr);
  ierr = VecGetSize(x,&n);CHKERRQ(ierr);
  d = (PetscReal)(n - 1); d = d*d;
  ff[0] = xx[0];
  for (i=1; i<n-1; i++) {
    ff[i] = d*(xx[i-1] - 2.0*xx[i] + xx[i+1]) + xx[i]*xx[i] - FF[i];
  }
  ff[n-1] = xx[n-1] - 1.0;
  ierr = VecRestoreArray(x,&xx);CHKERRQ(ierr);
  ierr = VecRestoreArray(f,&ff);CHKERRQ(ierr);
  ierr = VecRestoreArray((Vec)dummy,&FF);CHKERRQ(ierr);
  return 0;
}

/* ------------------------------------------------------------------- */
/*
   FormJacobian - This routine demonstrates the use of different
   matrices for the Jacobian and preconditioner

   Input Parameters:
.  snes - the SNES context
.  x - input vector
.  ptr - optional user-defined context, as set by SNESSetJacobian()

   Output Parameters:
.  A - Jacobian matrix
.  B - different preconditioning matrix
.  flag - flag indicating matrix structure
*/
PetscErrorCode FormJacobian(SNES snes,Vec x,Mat *jac,Mat *prejac,MatStructure *flag,void *dummy)
{
  PetscScalar    *xx,A[3],d;
  PetscInt       i,n,j[3];
  PetscErrorCode ierr;

  ierr = VecGetArray(x,&xx);CHKERRQ(ierr);
  ierr = VecGetSize(x,&n);CHKERRQ(ierr);
  d = (PetscReal)(n - 1); d = d*d;

  /* Form Jacobian.  Also form a different preconditioning matrix that
     has only the diagonal elements. */
  i = 0; A[0] = 1.0;
  ierr = MatSetValues(*jac,1,&i,1,&i,&A[0],INSERT_VALUES);CHKERRQ(ierr);
  ierr = MatSetValues(*prejac,1,&i,1,&i,&A[0],INSERT_VALUES);CHKERRQ(ierr);
  for (i=1; i<n-1; i++) {
    j[0] = i - 1; j[1] = i; j[2] = i + 1;
    A[0] = d; A[1] = -2.0*d + 2.0*xx[i]; A[2] = d;
    ierr = MatSetValues(*jac,1,&i,3,j,A,INSERT_VALUES);CHKERRQ(ierr);
    ierr = MatSetValues(*prejac,1,&i,1,&i,&A[1],INSERT_VALUES);CHKERRQ(ierr);
  }
  i = n-1; A[0] = 1.0;
  ierr = MatSetValues(*jac,1,&i,1,&i,&A[0],INSERT_VALUES);CHKERRQ(ierr);
  ierr = MatSetValues(*prejac,1,&i,1,&i,&A[0],INSERT_VALUES);CHKERRQ(ierr);

  ierr = MatAssemblyBegin(*jac,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyBegin(*prejac,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(*jac,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(*prejac,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  ierr = VecRestoreArray(x,&xx);CHKERRQ(ierr);
  *flag = SAME_NONZERO_PATTERN;
  return 0;
}

/* ------------------------------------------------------------------- */
/*
   MatrixFreePreconditioner - This routine demonstrates the use of a
   user-provided preconditioner.  This code implements just the null
   preconditioner, which of course is not recommended for general use.

   Input Parameters:
+  pc - preconditioner
-  x - input vector

   Output Parameter:
.  y - preconditioned vector
*/
PetscErrorCode MatrixFreePreconditioner(PC pc,Vec x,Vec y)
{
  PetscErrorCode ierr;

  ierr = VecCopy(x,y);CHKERRQ(ierr);
  return 0;
}

Figure 15: Example of Uniprocess SNES Code - Both Conventional and Matrix-Free Jacobians

5.6

Finite Difference Jacobian Approximations

PETSc provides some tools to help approximate the Jacobian matrices efficiently via finite differences.

These tools are intended for use in certain situations where one is unable to compute Jacobian matrices analytically, and matrix-free methods do not work well without a preconditioner, due to very poor conditioning.

The approximation requires several steps:

• First, one colors the columns of the (not yet built) Jacobian matrix, so that columns of the same color do not share any common rows.

• Next, one creates a MatFDColoring data structure that will be used later in actually computing the Jacobian.

• Finally, one tells the nonlinear solvers of SNES to use the SNESDefaultComputeJacobianColor() routine to compute the Jacobians.

A code fragment that demonstrates this process is given below.

ISColoring    iscoloring;
MatFDColoring fdcoloring;
MatStructure  str;

/*
   This initializes the nonzero structure of the Jacobian. This is artificial
   because clearly if we had a routine to compute the Jacobian we wouldn't
   need to use finite differences.
*/
FormJacobian(snes,x,&J,&J,&str,&user);

/*
   Color the matrix, i.e. determine groups of columns that share no common
   rows. These columns in the Jacobian can all be computed simultaneously.
*/
MatGetColoring(J,MATCOLORINGSL,&iscoloring);

/*
   Create the data structure that SNESDefaultComputeJacobianColor() uses
   to compute the actual Jacobians via finite differences.
*/
MatFDColoringCreate(J,iscoloring,&fdcoloring);
ISColoringDestroy(&iscoloring);
MatFDColoringSetFromOptions(fdcoloring);

/*
   Tell SNES to use the routine SNESDefaultComputeJacobianColor()
   to compute Jacobians.
*/
SNESSetJacobian(snes,J,J,SNESDefaultComputeJacobianColor,fdcoloring);

Of course, we are cheating a bit. If we do not have an analytic formula for computing the Jacobian, then how do we know what its nonzero structure is so that it may be colored? Determining the structure is problem dependent, but fortunately, for most structured grid problems (the class of problems for which PETSc is designed) if one knows the stencil used for the nonlinear function one can usually fairly easily obtain an estimate of the location of nonzeros in the matrix. This is harder in the unstructured case, and has not yet been implemented in general.
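To make the coloring requirement concrete, the following self-contained C sketch colors the columns of a known nonzero pattern so that no two columns of the same color share a row. It is a plain greedy pass in natural column order, chosen for brevity; it is not one of the MINPACK orderings used by PETSc, and the name color_columns, the dense 0/1 pattern storage, and the MAX_COLORS limit are all assumptions for this example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_COLORS 64

/* pattern: row-major m-by-n array of 0/1 flags marking possible nonzeros.
   color:   output array of length n receiving a color index per column.
   Returns the number of colors used, or -1 if more than MAX_COLORS are needed. */
int color_columns(const unsigned char *pattern, size_t m, size_t n, int *color)
{
  int ncolors = 0;
  assert(pattern != NULL && color != NULL);
  for (size_t j = 0; j < n; j++) {
    unsigned char forbidden[MAX_COLORS] = {0};
    /* A color is forbidden if an earlier column using it shares a row with column j. */
    for (size_t k = 0; k < j; k++) {
      for (size_t i = 0; i < m; i++) {
        if (pattern[i * n + j] && pattern[i * n + k]) {
          forbidden[color[k]] = 1;
          break;
        }
      }
    }
    /* Pick the smallest color not forbidden; open a new one if necessary. */
    int c = 0;
    while (c < ncolors && forbidden[c]) c++;
    if (c == MAX_COLORS) return -1;
    color[j] = c;
    if (c == ncolors) ncolors++;
  }
  return ncolors;
}
```

For a tridiagonal pattern this yields three colors regardless of the matrix size, so a finite difference Jacobian would need only three extra function evaluations instead of one per column.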

One need not necessarily use the routine MatGetColoring() to determine a coloring. For example, if a grid can be colored directly (without using the associated matrix), then that coloring can be provided to MatFDColoringCreate(). Note that the user must always preset the nonzero structure in the matrix regardless of which coloring routine is used.

For sequential matrices PETSc provides three matrix coloring routines from the MINPACK package [13]: smallest-last (sl), largest-first (lf), and incidence-degree (id). These colorings, as well as the "natural" coloring for which each column has its own unique color, may be accessed with the command line options

-mat_coloring_type <sl,id,lf,natural>

Alternatively, one can set a coloring type of MATCOLORINGSL, MATCOLORINGID, MATCOLORINGLF, or MATCOLORINGNATURAL when calling MatGetColoring().

As for the matrix-free computation of Jacobians (see Section 5.5), two parameters affect the accuracy of the finite difference Jacobian approximation. These are set with the command

MatFDColoringSetParameters(MatFDColoring fdcoloring,double rerror,double umin);

The parameter rerror is the square root of the relative error in the function evaluations, $e_{rel}$; the default is $10^{-8}$, which assumes that the functions are evaluated to full double-precision accuracy. The second parameter, umin, is a bit more involved; its default is 10e-8. Column $i$ of the Jacobian matrix (denoted by $F'_{:i}$) is approximated by the formula

$$ F'_{:i} \approx \frac{F(u + h * dx_i) - F(u)}{h} $$

where $h$ is computed via

$$ h = \begin{cases} e_{rel} * u_i & \text{if } |u_i| > u_{min} \\ e_{rel} * u_{min} * \mathrm{sign}(u_i) & \text{otherwise.} \end{cases} $$

These parameters may be set from the options database with

-mat_fd_coloring_err <err>
-mat_fd_coloring_umin <umin>
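The column formula and step selection above can be tried out with the following self-contained C sketch. It is only an illustration under stated assumptions: $dx_i$ is interpreted as the $i$-th unit vector, sign(0) is taken to be +1, and fd_step, fd_jacobian_column, and example_residual are hypothetical names rather than PETSc routines.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

typedef void (*residual_fn)(const double *u, double *f, size_t n);

/* Step for column i: e_rel*u_i if |u_i| > u_min, else e_rel*u_min*sign(u_i). */
double fd_step(double ui, double e_rel, double u_min)
{
  if (fabs(ui) > u_min) return e_rel * ui;
  /* Assumption: sign(0) is treated as +1 so that the step is never zero. */
  return e_rel * u_min * (ui >= 0.0 ? 1.0 : -1.0);
}

/* Approximate column i of the Jacobian: col = (F(u + h e_i) - F(u)) / h.
   fu must already hold F(u); up and fp are work arrays of length n. */
void fd_jacobian_column(residual_fn F, const double *u, const double *fu,
                        size_t i, size_t n, double e_rel, double u_min,
                        double *up, double *fp, double *col)
{
  assert(i < n);
  double h = fd_step(u[i], e_rel, u_min);
  for (size_t k = 0; k < n; k++) up[k] = u[k];
  up[i] += h;
  F(up, fp, n);
  for (size_t k = 0; k < n; k++) col[k] = (fp[k] - fu[k]) / h;
}

/* Hypothetical test function: F_k(u) = u_k^2 + u_{k+1} (last row: u_k^2 only). */
void example_residual(const double *u, double *f, size_t n)
{
  for (size_t k = 0; k < n; k++) {
    f[k] = u[k] * u[k];
    if (k + 1 < n) f[k] += u[k + 1];
  }
}
```

With a coloring in hand, all columns of one color could be perturbed in a single call to F, since their nonzero rows do not overlap; the sketch perturbs one column at a time to keep the formula visible.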

Note that although MatGetColoring() works for parallel matrices, the routine currently uses a sequential algorithm. Extensions may be forthcoming. However, if one can compute the coloring iscoloring some other way, the routine MatFDColoringCreate() is scalable. An example of this for 2D distributed arrays, using the utility routine DMGetColoring(), is given below.

DMGetColoring(da,IS_COLORING_GHOSTED,&iscoloring);
MatFDColoringCreate(J,iscoloring,&fdcoloring);
MatFDColoringSetFromOptions(fdcoloring);
ISColoringDestroy(&iscoloring);

Note that the routine MatFDColoringCreate() currently is only supported for the AIJ and BAIJ matrix formats.


5.7

Variational Inequalities

SNES can also solve variational inequalities with box constraints. These are nonlinear algebraic systems with additional inequality constraints on some or all of the variables: $Lu_i \le u_i \le Hu_i$. Some or all of the lower bounds may be negative infinity (indicated to PETSc with SNES_VI_NINF) and some or all of the upper bounds may be infinity (indicated by SNES_VI_INF). The command

SNESVISetVariableBounds(SNES,Vec Lu,Vec Hu);

is used to indicate that one is solving a variational inequality. The option -snes_vi_monitor turns on extra monitoring of the active set associated with the bounds, and -snes_vi_type allows selecting from several VI solvers; the default is preferred.
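The box constraints themselves can be illustrated with a few lines of self-contained C. The sketch clips a vector to its bounds and counts the components that end up on a bound. It does not reproduce any of the SNES VI solvers; project_onto_box is a hypothetical name, and the C macros INFINITY and -INFINITY stand in for the role that SNES_VI_INF and SNES_VI_NINF play in PETSc.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Clip u componentwise to [lo_i, hi_i] and return how many components
   sit on one of their bounds afterwards.  Use -INFINITY or INFINITY in
   lo or hi for a component that is unbounded on that side. */
size_t project_onto_box(double *u, const double *lo, const double *hi, size_t n)
{
  size_t nbound = 0;
  for (size_t i = 0; i < n; i++) {
    assert(lo[i] <= hi[i]);
    if (u[i] <= lo[i]) {
      u[i] = lo[i];
      nbound++;
    } else if (u[i] >= hi[i]) {
      u[i] = hi[i];
      nbound++;
    }
  }
  return nbound;
}
```

A component with both bounds infinite is never counted, which matches the idea that unconstrained variables do not contribute to the set being monitored.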
