# Spline Methods Draft Tom Lyche and Knut Mørken 25th April 2003 ```Spline Methods
Draft
Tom Lyche and Knut Mørken
25th April 2003
Contents

1 Splines and B-splines, an Introduction
1.1 Convex combinations and convex hulls
1.1.1 Stable computations
1.1.2 The convex hull of a set of points
1.2 Some fundamental concepts
1.3 Interpolating polynomial curves
1.3.1 Quadratic interpolation of three points
1.3.2 General polynomial interpolation
1.3.3 Interpolation by convex combinations?
1.4 Bézier curves
1.4.1 Quadratic Bézier curves
1.4.2 Bézier curves based on four and more points
1.4.3 Composite Bézier curves
1.5 A geometric construction of spline curves
1.5.1 Linear spline curves
1.5.2 Quadratic spline curves
1.5.3 Spline curves of higher degrees
1.5.4 Smoothness of spline curves
1.6 Representing spline curves in terms of basis functions
1.7 Conclusion

2 Basic properties of splines and B-splines
2.1 Some simple consequences of the recurrence relation
2.2 Linear combinations of B-splines
2.2.1 Spline functions
2.2.2 Spline curves
2.3 A matrix representation of B-splines
2.4 Algorithms for evaluating a spline

3 Further properties of splines and B-splines
3.1 Linear independence and representation of polynomials
3.1.1 Some properties of the B-spline matrices
3.1.2 Marsden’s identity and representation of polynomials
3.1.3 Linear independence of B-splines
3.2 Differentiation and smoothness of B-splines
3.2.1 Derivatives of B-splines
3.2.2 Computing derivatives of splines and B-splines
3.2.3 Smoothness of B-splines
3.3 B-splines as a basis for piecewise polynomials

4 Knot insertion
4.1 Convergence of the control polygon for spline functions
4.2 Knot insertion
4.2.1 Formulas and algorithms for knot insertion
4.3 B-spline coefficients as functions of the knots
4.3.1 The blossom
4.3.2 B-spline coefficients as blossoms
4.4 Inserting one knot at a time
4.5 Bounding the number of sign changes in a spline

5 Spline Approximation of Functions and Data
5.1 Local Approximation Methods
5.1.1 Piecewise linear interpolation
5.1.2 Cubic Hermite interpolation
5.1.3 Estimating the derivatives
5.2 Cubic Spline Interpolation
5.2.1 Interpretations of cubic spline interpolation
5.2.2 Numerical solution and examples
5.3 General Spline Approximation
5.3.1 Spline interpolation
5.3.2 Least squares approximation
5.4 The Variation Diminishing Spline Approximation
5.4.1 Preservation of bounds on a function
5.4.2 Preservation of monotonicity
5.4.3 Preservation of convexity

6 Parametric Spline Curves
6.1 Definition of Parametric Curves
6.1.1 Regular parametric representations
6.1.2 Changes of parameter and parametric curves
6.1.3 Arc length parametrisation
6.2 Approximation by Parametric Spline Curves
6.2.1 Definition of parametric spline curves
6.2.2 The parametric variation diminishing spline approximation
6.2.3 Parametric spline interpolation
6.2.4 Assigning parameter values to discrete data
6.2.5 General parametric spline approximation

7 Tensor Product Spline Surfaces
7.1 Explicit tensor product spline surfaces
7.1.1 Definition of the tensor product spline
7.1.2 Evaluation of tensor product spline surfaces
7.2 Approximation methods for tensor product splines
7.2.1 The variation diminishing spline approximation
7.2.2 Tensor Product Spline Interpolation
7.2.3 Least Squares for Gridded Data
7.3 General tensor product methods
7.4 Trivariate Tensor Product Methods
7.5 Parametric Surfaces
7.5.1 Parametric Tensor Product Spline Surfaces

8 Quasi-interpolation methods
8.1 A general recipe
8.1.1 The basic idea
8.1.2 A more detailed description
8.2 Some quasi-interpolants
8.2.1 Piecewise linear interpolation
8.2.2 A 3-point quadratic quasi-interpolant
8.2.3 A 5-point cubic quasi-interpolant
8.2.4 Some remarks on the constructions
8.3 Quasi-interpolants are linear operators
8.4 Different kinds of linear functionals and their uses
8.4.1 Point functionals
8.4.2 Derivative functionals
8.4.3 Integral functionals
8.4.4 Preservation of moments and interpolation of linear functionals
8.4.5 Least squares approximation
8.4.6 Computation of integral functionals
8.5 Alternative ways to construct coefficient functionals
8.5.1 Computation via evaluation of linear functionals
8.5.2 Computation via explicit representation of the local approximation
8.6 Two quasi-interpolants based on point functionals
8.6.1 A quasi-interpolant based on the Taylor polynomial
8.6.2 Quasi-interpolants based on evaluation

9 Approximation theory and stability
9.1 The distance to polynomials
9.2 The distance to splines
9.2.1 The constant and linear cases
9.2.2 The quadratic case
9.2.3 The general case
9.3 Stability of the B-spline basis
9.3.1 A general definition of stability
9.3.2 The condition number of the B-spline basis. Infinity norm
9.3.3 The condition number of the B-spline basis. p-norm

10 Shape Preserving Properties of B-splines
10.1 Bounding the number of zeros of a spline
10.2 Uniqueness of spline interpolation
10.2.1 Lagrange Interpolation
10.2.2 Hermite Interpolation
10.3 Total positivity

A Some Linear Algebra
A.1 Matrices
A.1.1 Nonsingular matrices, and inverses.
A.1.2 Determinants.
A.1.3 Criteria for nonsingularity and singularity.
A.2 Vector Norms
A.3 Vector spaces of functions
A.3.1 Linear independence and bases
A.4 Normed Vector Spaces

CHAPTER 8
Quasi-interpolation methods
In Chapter 5 we considered a number of methods for computing spline approximations.
The starting point for the approximation methods is a data set that is usually discrete and
in the form of function values given at a set of abscissas. The methods in Chapter 5 roughly
fall into two categories: global methods and local methods. A global method is one where
any B-spline coefficient depends on all initial data points, whereas a local method is one
where a B-spline coefficient only depends on data points taken from the neighbourhood
of the support of the corresponding B-spline. Typical global methods are cubic spline
interpolation and least squares approximation, while cubic Hermite interpolation and the
Schoenberg variation diminishing spline approximation are popular local methods.
In this chapter we are going to describe a general recipe for developing local spline
approximation methods. This will enable us to produce an infinite number of approximation schemes that can be tailored to any special needs that we may have or that our given
data set dictates. In principle, the methods are local, but by allowing the area of influence
for a given B-spline coefficient to grow, our general recipe may even encompass the global
methods in Chapter 5.
The recipe we describe produces approximation methods known under the collective
term quasi-interpolation methods. Their advantage is their flexibility and their simplicity.
There is considerable freedom in the recipe to produce tailor-made approximation schemes
for initial data sets with special structure. Quasi-interpolants also allow us to establish
important properties of B-splines. In the next chapter we will employ them to study how
well a given function can be approximated by splines, and to show that B-splines form a
stable basis for splines.
8.1 A general recipe
A spline approximation method consists of two main steps: First the degree and knot vector are determined, and then the B-spline coefficients of the approximation are computed
from given data according to some formula. For some methods like spline interpolation and
least squares approximation, this formula corresponds to the solution of a linear system
of equations. In other cases, like cubic Hermite interpolation and Schoenberg’s Variation
Diminishing spline approximation, the formula for the coefficients is given directly in terms
of given values of the function to be interpolated.
8.1.1 The basic idea
The basic idea behind the construction of quasi-interpolants is very simple. We focus
on how to compute the B-spline coefficients of the approximation and assume that the
degree and knot vector are known. The procedure depends on two versions of the local
support property of B-splines that we know well from earlier chapters: (i) The B-spline $B_j$ is nonzero only within the interval $[t_j, t_{j+d+1}]$, and (ii) on the interval $[t_\mu, t_{\mu+1})$ there are only $d+1$ B-splines in $S_{d,t}$ that are nonzero, so a spline $g$ in $S_{d,t}$ can be written as $g(x) = \sum_{i=\mu-d}^{\mu} b_i B_i(x)$ when $x$ is restricted to this interval.
Suppose we are to compute an approximation $g = \sum_i c_i B_i$ in $S_{d,t}$ to a given function $f$. To compute $c_j$ we can select one knot interval $I = [t_\mu, t_{\mu+1}]$ which is a subinterval of $[t_j, t_{j+d+1}]$. We denote the restriction of $f$ to this interval by $f^I$ and determine an approximation $g^I = \sum_{i=\mu-d}^{\mu} b_i B_i$ to $f^I$. One of the coefficients of $g^I$ will be $b_j$ and we fix $c_j$ by setting $c_j = b_j$. The whole procedure is then repeated until all the $c_i$ have been determined.
It is important to note the flexibility of this procedure. In choosing the interval $I$ we will in general have the $d+1$ choices $\mu = j, j+1, \ldots, j+d$ (fewer if there are multiple knots). As we shall see below we do not necessarily have to restrict $I$ to be one knot interval; all that is required is that $I \cap (t_j, t_{j+d+1})$ is nonempty. When approximating $f^I$ by $g^I$ we have a vast number of possibilities. We may use interpolation or least squares approximation, or any other approximation method. Suppose we settle for interpolation, then we have complete freedom in choosing the interpolation points within the interval $I$. In fact, there is so much freedom that we can have no hope of exploring all the possibilities.
It turns out that some of this freedom is only apparent: to produce useful quasi-interpolants we have to enforce certain conditions. With the general setup described above, a useful restriction is that if $f^I$ should happen to be a polynomial of degree $d$ then $g^I$ should reproduce $f^I$, i.e., in this case we should have $g^I = f^I$. This has the important consequence that if $f$ is a spline in $S_{d,t}$ then the approximation $g$ will reproduce $f$ exactly (apart from rounding errors in the numerical computations). To see why this is the case, suppose that $f = \sum_i \hat b_i B_i$ is a spline in $S_{d,t}$. Then $f^I$ will be a polynomial that can be written as $f^I = \sum_{i=\mu-d}^{\mu} \hat b_i B_i$. Since we have assumed that polynomials will be reproduced we know that $g^I = f^I$, so $\sum_{i=\mu-d}^{\mu} b_i B_i = \sum_{i=\mu-d}^{\mu} \hat b_i B_i$, and by the linear independence of the B-splines involved we conclude that $b_i = \hat b_i$ for $i = \mu-d, \ldots, \mu$. But then we see that $c_j = b_j = \hat b_j$ so $g$ will agree with $f$. An approximation scheme with the property that $Pf = f$ for all $f$ in a space $S$ is said to reproduce the space.
8.1.2 A more detailed description
Hopefully, the basic idea behind the construction of quasi-interpolants became clear above.
In this section we describe the construction in some more detail with the generalisations mentioned before. We first write down the general procedure for determining quasi-interpolants and then comment on the different steps afterwards.
Algorithm 8.1 (Construction of quasi-interpolants). Let the spline space $S_{d,t}$ of dimension $n$ and the real function $f$ defined on the interval $[t_{d+1}, t_{n+1}]$ be given, and suppose that $t$ is a $(d+1)$-regular knot vector. To approximate $f$ from the space $S_{d,t}$ perform the following steps for $j = 1, 2, \ldots, n$:
1. Choose a subinterval $I = [t_\mu, t_\nu]$ of $[t_{d+1}, t_{n+1}]$ with the property that $I \cap (t_j, t_{j+d+1})$ is nonempty, and let $f^I$ denote the restriction of $f$ to this interval.
2. Choose a local approximation method $P^I$ and determine an approximation $g^I$ to $f^I$,
$$g^I = P^I f^I = \sum_{i=\mu-d}^{\nu-1} b_i B_i, \tag{8.1}$$
on the interval $I$.
3. Set coefficient $j$ of the global approximation $Pf$ to $b_j$, i.e., $c_j = b_j$.
The spline $Pf = \sum_{j=1}^{n} c_j B_j$ will then be an approximation to $f$.
The coefficient cj obviously depends on f and this dependence on f is often indicated
by using the notation λj f for cj . This will be our normal notation in the rest of the
chapter.
An important point to note is that every function in the restriction $S_{d,t,I}$ of the spline space $S_{d,t}$ to the interval $I = [t_\mu, t_\nu]$ can be written as a linear combination of the B-splines $\{B_i\}_{i=\mu-d}^{\nu-1}$. These are exactly the B-splines whose supports intersect the interior of the interval $I$, and by construction, one of them must clearly be $B_j$. This ensures that the coefficient $b_j$ that is needed in step 3 is computed in step 2.
Algorithm 8.1 generalises the simplified procedure in Section 8.1.1 in that I is no
longer required to be a single knot interval in [tj , tj+d+1 ]. This gives us considerably
more flexibility in the choice of local approximation methods. Note in particular that the
classical global methods are included as special cases since we may choose I = [td+1 , tn+1 ].
As we mentioned in Section 8.1.1, we do not get good approximation methods for free.
If P f is going to be a decent approximation to f we must make sure that the local methods
used in step 2 reproduce polynomials or splines.
Lemma 8.2. Suppose that all the local methods used in step 2 of Algorithm 8.1 reproduce
all polynomials of some degree d1 ≤ d. Then the global approximation method P will also
reproduce polynomials of degree d1 . If all the local methods reproduce all the splines in
Sd,t,I then P will reproduce the whole spline space Sd,t .
Proof. The proof of both claims follows just as in the special case in Section 8.1.1, but let us even so go through the proof of the second claim. We want to prove that if all the local methods $P^I$ reproduce the local spline spaces $S_{d,t,I}$ and $f$ is a spline in $S_{d,t}$, then $Pf = f$. If $f$ is in $S_{d,t}$ we clearly have $f = \sum_{i=1}^{n} \hat b_i B_i$ for appropriate coefficients $(\hat b_i)_{i=1}^{n}$, and the restriction of $f$ to $I = [t_\mu, t_\nu]$ can be represented as $f^I = \sum_{i=\mu-d}^{\nu-1} \hat b_i B_i$. Since $P^I$ reproduces $S_{d,t,I}$ we will have $P^I f^I = f^I$ or
$$\sum_{i=\mu-d}^{\nu-1} b_i B_i = \sum_{i=\mu-d}^{\nu-1} \hat b_i B_i.$$
The linear independence of the B-splines involved over the interval $I$ then allows us to conclude that $b_i = \hat b_i$ for all indices $i$ involved in this sum. Since $j$ is one of the indices we therefore have $c_j = b_j = \hat b_j$. When this holds for all values of $j$ we obviously have $Pf = f$.
The reader should note that if I is a single knot interval, the local spline space Sd,t,I
reduces to the space of polynomials of degree d. Therefore, when I is a single knot interval,
local reproduction of polynomials of degree d leads to global reproduction of the whole
spline space.
Why does reproduction of splines or polynomials ensure that P will be a good approximation method? We will study this in some detail in Chapter 9, but as is often
the case the basic idea is simple: The functions we want to approximate are usually nice
and smooth, like the exponential functions or the trigonometric functions. An important
property of polynomials is that they approximate such smooth functions well, although if
the interval becomes wide we may need to use polynomials of high degree. A quantitative
manifestation of this phenomenon is that if we perform a Taylor expansion of a smooth
function, then the error term will be small, at least if the degree is high enough. If our
approximation method reproduces polynomials it will pick up the essential behaviour of
the Taylor polynomial, while the approximation error will pick up the essence of the error
in the Taylor expansion. The approximation method will therefore perform well whenever the error in the Taylor expansion is small. If we reproduce spline functions we can
essentially reproduce Taylor expansions on each knot interval as long as the function we
approximate has at least the same smoothness as the splines in the spline space we are
using. So instead of increasing the polynomial degree because we are approximating over a
wide interval, we can keep the spacing in the knot vector small and thereby keep the polynomial degree of the spline low. Another way to view this is that by using splines we can
split our function into suitable pieces that each can be approximated well by polynomials
of relatively low degree, even though this is not possible for the complete function. By
constructing quasi-interpolants as outlined above we obtain approximation methods that
actually utilise this approximation power of polynomials on each subinterval. In this way
we can produce good approximations even to functions that are only piecewise smooth.
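The Taylor argument above can be made quantitative with the standard remainder estimate. The statement below is a textbook fact added here as an illustration, not a formula from this chapter; the symbols $a$, $h$ and $T_d f$ (the Taylor polynomial of degree $d$ of $f$ about $a$) are introduced only for this purpose, and $f$ is assumed to have $d+1$ continuous derivatives on $[a, a+h]$.

```latex
\[
  |f(x) - T_d f(x)| \le \frac{h^{d+1}}{(d+1)!}\,
  \max_{a \le z \le a+h} |f^{(d+1)}(z)|,
  \qquad x \in [a, a+h].
\]
```

With the degree $d$ held fixed, halving the interval width $h$ reduces this bound by a factor $2^{d+1}$, which is the mechanism that lets a spline with closely spaced knots succeed where a single low-degree polynomial on a wide interval would not.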
8.2 Some quasi-interpolants
It is high time to try out our new tool for constructing approximation methods. Let us
see how some simple methods can be obtained from Algorithm 8.1.
8.2.1 Piecewise linear interpolation
Perhaps the simplest local approximation method is piecewise linear interpolation. We assume that our $n$-dimensional spline space $S_{1,t}$ is given and that $t$ is a 2-regular knot vector. For simplicity we also assume that all the interior knots are simple. The function $f$ is given on the interval $[t_2, t_{n+1}]$. To determine $c_j$ we choose the local interval to be $I = [t_j, t_{j+1}]$. In this case, we have no interior knots in $I$ so $S_{1,t,I}$ is the two dimensional space of linear polynomials. A basis for this space is given by the two linear B-splines $B_{j-1}$ and $B_j$, restricted to the interval $I$. A natural candidate for our local approximation method is interpolation at $t_j$ and $t_{j+1}$. On the interval $I$, the B-spline $B_{j-1}$ is a straight line with value 1 at $t_j$ and value 0 at $t_{j+1}$, while $B_j$ is a straight line with value 0 at $t_j$ and value 1 at $t_{j+1}$. The local interpolant can therefore be written
$$P_1^I f(x) = f(t_j) B_{j-1}(x) + f(t_{j+1}) B_j(x).$$
From Algorithm 8.1 we know that the coefficient multiplying $B_j$ is the one that should multiply $B_j$ also in our global approximation, in other words $c_j = \lambda_j f = f(t_{j+1})$. The global approximation is therefore
$$P_1 f(x) = \sum_{j=1}^{n} f(t_{j+1}) B_j(x).$$
Since a straight line is completely characterised by its value at two points, the local approximation has zero error whenever $f^I$ is a linear polynomial and therefore reproduces all linear polynomials. Then we know from Lemma 8.2 that $P_1$ will reproduce all splines in $S_{1,t}$.
This may seem like unnecessary formalism in this simple case where the conclusions
are almost obvious, but it illustrates how the construction works in a very transparent
situation.
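The construction can be tried out numerically. The following is a minimal sketch, assuming a small hypothetical 2-regular knot vector and a plain recursive B-spline evaluator; the names `bspline` and `linear_quasi_interpolant` are illustrative and do not come from the text. Knots are stored 1-based (index 0 is a placeholder) so that the indices match the formulas above.

```python
def bspline(j, d, t, x):
    """Cox-de Boor recursion: value at x of the degree-d B-spline B_j on knots t (1-based)."""
    if d == 0:
        return 1.0 if t[j] <= x < t[j + 1] else 0.0
    value = 0.0
    if t[j + d] > t[j]:  # skip terms with a zero denominator
        value += (x - t[j]) / (t[j + d] - t[j]) * bspline(j, d - 1, t, x)
    if t[j + d + 1] > t[j + 1]:
        value += (t[j + d + 1] - x) / (t[j + d + 1] - t[j + 1]) * bspline(j + 1, d - 1, t, x)
    return value


def linear_quasi_interpolant(f, t, n):
    """Return P_1 f as a callable, using the coefficients c_j = f(t_{j+1}), j = 1, ..., n."""
    c = [None] + [f(t[j + 1]) for j in range(1, n + 1)]
    return lambda x: sum(c[j] * bspline(j, 1, t, x) for j in range(1, n + 1))


# Hypothetical 2-regular knot vector t_1, ..., t_6 (an assumption made for this example).
t = [None, 0.0, 0.0, 1.0, 2.0, 3.0, 3.0]
n = len(t) - 1 - 2  # there are n + d + 1 knots, with d = 1

line = lambda x: 2.0 * x + 1.0
p1_line = linear_quasi_interpolant(line, t, n)
print(p1_line(0.5), line(0.5))  # a linear polynomial is reproduced

square = lambda x: x * x
p1_square = linear_quasi_interpolant(square, t, n)
print(p1_square(2.0), p1_square(1.5))  # interpolation at a knot, chord value between knots
```

The evaluator is right-continuous, so the sketch should only be evaluated at points strictly to the left of the last knot.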
8.2.2 A 3-point quadratic quasi-interpolant
In our repertoire of approximation methods, we only have one local, quadratic method, Schoenberg’s variation diminishing spline. With the quasi-interpolant construction it is easy to construct alternative, local methods. Our starting point is a quadratic spline space $S_{2,t}$ based on a 3-regular knot vector with distinct interior knots, and a function $f$ to be approximated by a scheme which we denote $P_2$. The support of the B-spline $B_j$ is $[t_j, t_{j+3}]$, and we choose our local interval as $I = [t_{j+1}, t_{j+2}]$. Since $I$ is one knot interval, we need a local approximation method that reproduces quadratic polynomials. One such method is interpolation at three distinct points. We therefore choose three distinct points $x_{j,0}$, $x_{j,1}$ and $x_{j,2}$ in $I$. Some degree of symmetry is always a good guide so we choose
$$x_{j,0} = t_{j+1}, \qquad x_{j,1} = \frac{t_{j+1} + t_{j+2}}{2}, \qquad x_{j,2} = t_{j+2}.$$
To determine $P_2^I f$ we have to solve the linear system of three equations in the three unknowns $b_{j-1}$, $b_j$ and $b_{j+1}$ given by
$$P_2^I f(x_{j,k}) = \sum_{i=j-1}^{j+1} b_i B_i(x_{j,k}) = f(x_{j,k}), \qquad \text{for } k = 0, 1, 2.$$
With the aid of a tool like Mathematica we can solve these equations symbolically. The result is that
$$b_j = \frac{1}{2}\bigl(-f(t_{j+1}) + 4f(t_{j+3/2}) - f(t_{j+2})\bigr),$$
where $t_{j+3/2} = (t_{j+1} + t_{j+2})/2$. The expressions for $b_{j-1}$ and $b_{j+1}$ are much more complicated and involve the knots $t_j$ and $t_{j+3}$ as well. The simplicity of the expression for $b_j$ stems from the fact that $x_{j,1}$ was chosen as the midpoint between $t_{j+1}$ and $t_{j+2}$.
The expression for $b_j$ is valid whenever $t_{j+1} < t_{j+2}$, which is not the case for $j = 1$ and $j = n$ since $t_1 = t_2 = t_3$ and $t_{n+1} = t_{n+2} = t_{n+3}$. But from Lemma 2.12 we know that any spline $g$ in $S_{2,t}$ will interpolate its first and last B-spline coefficient at these points so we simply set $c_1 = f(t_1)$ and $c_n = f(t_{n+1})$.
Having constructed the local interpolants, we have all the ingredients necessary to construct the quasi-interpolant $P_2 f = \sum_{j=1}^{n} \lambda_j f B_j$, namely
$$\lambda_j f = \begin{cases} f(t_1), & \text{when } j = 1; \\ \frac{1}{2}\bigl(-f(x_{j,0}) + 4f(x_{j,1}) - f(x_{j,2})\bigr), & \text{when } 1 < j < n; \\ f(t_{n+1}), & \text{when } j = n. \end{cases}$$
Since the local approximation reproduced the local spline space (the space of quadratic
polynomials in this case), the complete quasi-interpolant will reproduce the whole spline
space S2,t .
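The reproduction property can be checked numerically. The sketch below is an illustration under stated assumptions: the knot vector is a hypothetical 3-regular one with uniform interior knots, and the helper names `bspline`, `quadratic_quasi_interpolant` and `evaluate` are made up for this example. It applies the three cases of $\lambda_j f$ to a quadratic polynomial and to one of the B-splines themselves.

```python
def bspline(j, d, t, x):
    """Cox-de Boor recursion: value at x of the degree-d B-spline B_j on knots t (1-based)."""
    if d == 0:
        return 1.0 if t[j] <= x < t[j + 1] else 0.0
    value = 0.0
    if t[j + d] > t[j]:  # skip terms with a zero denominator
        value += (x - t[j]) / (t[j + d] - t[j]) * bspline(j, d - 1, t, x)
    if t[j + d + 1] > t[j + 1]:
        value += (t[j + d + 1] - x) / (t[j + d + 1] - t[j + 1]) * bspline(j + 1, d - 1, t, x)
    return value


def quadratic_quasi_interpolant(f, t, n):
    """Coefficients lambda_j f of P_2 f as a 1-based list (index 0 is unused)."""
    lam = [None] * (n + 1)
    lam[1] = f(t[1])          # first coefficient: value at the left end
    lam[n] = f(t[n + 1])      # last coefficient: value at the right end
    for j in range(2, n):
        a, b = t[j + 1], t[j + 2]
        lam[j] = 0.5 * (-f(a) + 4.0 * f(0.5 * (a + b)) - f(b))
    return lam


def evaluate(c, t, n, x):
    """Value at x of the quadratic spline with B-spline coefficients c."""
    return sum(c[j] * bspline(j, 2, t, x) for j in range(1, n + 1))


# Hypothetical 3-regular knot vector t_1, ..., t_9 (an assumption made for this example).
t = [None, 0.0, 0.0, 0.0, 1.0, 2.0, 3.0, 4.0, 4.0, 4.0]
n = len(t) - 1 - 3  # there are n + d + 1 knots, with d = 2

quad = lambda x: x * x - 3.0 * x + 2.0
c = quadratic_quasi_interpolant(quad, t, n)
errors = [abs(evaluate(c, t, n, x) - quad(x)) for x in (0.3, 1.7, 2.0, 3.9)]
print(max(errors))  # expected to be at rounding level

# Applying the functionals to B_3 should return the unit coefficient vector.
b3 = lambda x: bspline(3, 2, t, x)
print(quadratic_quasi_interpolant(b3, t, n)[1:])
```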
8.2.3 A 5-point cubic quasi-interpolant
The most commonly used splines are cubic, so let us construct a cubic quasi-interpolant.
We assume that the knot vector is 4-regular and that the interior knots are all distinct. As
usual we focus on the coefficient $c_j = \lambda_j f$. It turns out that the choice $I = [t_{j+1}, t_{j+3}]$ is convenient. The local spline space $S_{3,t,I}$ has dimension 5 and is spanned by the (restrictions of the) B-splines $\{B_i\}_{i=j-2}^{j+2}$. We want the quasi-interpolant to reproduce the whole spline space and therefore need $P^I$ to reproduce $S_{3,t,I}$. We want to use interpolation as our local
approximation method, and we know from Chapter 5 that spline interpolation reproduces
the spline space as long as it has a unique solution. The solution is unique if the coefficient
matrix of the resulting linear system is nonsingular, and from Theorem 5.18 we know that
a B-spline coefficient matrix is nonsingular if and only if its diagonal is positive. Since the
dimension of $S_{3,t,I}$ is 5 we need 5 interpolation points. We use the three knots $t_{j+1}$, $t_{j+2}$ and $t_{j+3}$ and one point from each of the knot intervals in $I$,
$$x_{j,0} = t_{j+1}, \quad x_{j,1} \in (t_{j+1}, t_{j+2}), \quad x_{j,2} = t_{j+2}, \quad x_{j,3} \in (t_{j+2}, t_{j+3}), \quad x_{j,4} = t_{j+3}.$$
Our local interpolation problem is
$$\sum_{i=j-2}^{j+2} b_i B_i(x_{j,k}) = f(x_{j,k}), \qquad \text{for } k = 0, 1, \ldots, 4.$$
In matrix-vector form this becomes
$$\begin{pmatrix}
B_{j-2}(x_{j,0}) & B_{j-1}(x_{j,0}) & B_j(x_{j,0}) & 0 & 0 \\
B_{j-2}(x_{j,1}) & B_{j-1}(x_{j,1}) & B_j(x_{j,1}) & B_{j+1}(x_{j,1}) & 0 \\
0 & B_{j-1}(x_{j,2}) & B_j(x_{j,2}) & B_{j+1}(x_{j,2}) & 0 \\
0 & B_{j-1}(x_{j,3}) & B_j(x_{j,3}) & B_{j+1}(x_{j,3}) & B_{j+2}(x_{j,3}) \\
0 & 0 & B_j(x_{j,4}) & B_{j+1}(x_{j,4}) & B_{j+2}(x_{j,4})
\end{pmatrix}
\begin{pmatrix} b_{j-2} \\ b_{j-1} \\ b_j \\ b_{j+1} \\ b_{j+2} \end{pmatrix}
=
\begin{pmatrix} f(x_{j,0}) \\ f(x_{j,1}) \\ f(x_{j,2}) \\ f(x_{j,3}) \\ f(x_{j,4}) \end{pmatrix}$$
when we insert the matrix entries that are zero. Because of the way we have chosen the interpolation points we see that all the entries on the diagonal of the coefficient matrix will be positive so the matrix is nonsingular. The local problem therefore has a unique solution and will reproduce $S_{3,t,I}$. The expression for $\lambda_j f$ is in general rather complicated, but in the special case where the widths of the two knot intervals are equal and $x_{j,1}$ and $x_{j,3}$ are chosen as the midpoints of the two intervals we end up with
$$\lambda_j f = \frac{1}{6}\bigl(f(t_{j+1}) - 8f(t_{j+3/2}) + 20f(t_{j+2}) - 8f(t_{j+5/2}) + f(t_{j+3})\bigr),$$
where $t_{j+3/2} = (t_{j+1} + t_{j+2})/2$ and $t_{j+5/2} = (t_{j+2} + t_{j+3})/2$. Unfortunately, this formula
is not valid when $j = 1$, $2$, $n-1$ or $n$ since then one or both of the knot intervals in $I$ collapse to one point. However, our procedure is sufficiently general to derive alternative formulas for computing the first two coefficients. The first value of $j$ for which the general procedure works is $j = 3$. In this case $I = [t_4, t_6]$ and our interpolation problem involves the B-splines $\{B_i\}_{i=1}^{5}$. This means that when we solve the local interpolation problem we obtain B-spline coefficients multiplying all of these B-splines, including $B_1$ and $B_2$. There is nothing stopping us from using the same interval $I$ for computation of several coefficients, so in addition to obtaining $\lambda_3 f$ from this local interpolant, we also use it as our source for the first two coefficients. In the special case when the interior knots are uniformly distributed and $x_{3,1} = t_{9/2}$ and $x_{3,3} = t_{11/2}$, we find
$$\lambda_1 f = f(t_4), \qquad \lambda_2 f = \frac{1}{18}\bigl(-5f(t_4) + 40f(t_{9/2}) - 24f(t_5) + 8f(t_{11/2}) - f(t_6)\bigr).$$
The weights in $\lambda_2 f$ sum to 1, as they must since constants are reproduced. In general, the second coefficient will be much more complicated, but the first one will not change.
This same procedure can obviously be used to determine values for the last two coefficients, and under the same conditions of uniformly distributed knots and interpolation points we find
$$\lambda_{n-1} f = \frac{1}{18}\bigl(-f(t_{n-1}) + 8f(t_{n-1/2}) - 24f(t_n) + 40f(t_{n+1/2}) - 5f(t_{n+1})\bigr), \qquad \lambda_n f = f(t_{n+1}).$$
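When closed-form weights are not at hand, the local system can be solved numerically. The following is a minimal sketch, assuming a hypothetical 4-regular knot vector with uniform interior knots and the midpoints as the two free interpolation points; the helpers `bspline`, `solve` and `local_weights` are illustrative names, not part of the text. It solves the local $5 \times 5$ interpolation problem on $[t_{j+1}, t_{j+3}]$ once per unit right-hand side, which yields the weights that express one chosen coefficient in terms of the five function values.

```python
def bspline(j, d, t, x):
    """Cox-de Boor recursion: value at x of the degree-d B-spline B_j on knots t (1-based)."""
    if d == 0:
        return 1.0 if t[j] <= x < t[j + 1] else 0.0
    value = 0.0
    if t[j + d] > t[j]:  # skip terms with a zero denominator
        value += (x - t[j]) / (t[j + d] - t[j]) * bspline(j, d - 1, t, x)
    if t[j + d + 1] > t[j + 1]:
        value += (t[j + d + 1] - x) / (t[j + d + 1] - t[j + 1]) * bspline(j + 1, d - 1, t, x)
    return value


def solve(a, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(a, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def local_weights(j, t, target):
    """Weights w_k such that b_target = sum_k w_k f(x_{j,k}) on I = [t_{j+1}, t_{j+3}]."""
    pts = [t[j + 1], 0.5 * (t[j + 1] + t[j + 2]), t[j + 2],
           0.5 * (t[j + 2] + t[j + 3]), t[j + 3]]
    idx = list(range(j - 2, j + 3))  # B-splines B_{j-2}, ..., B_{j+2}
    a = [[bspline(i, 3, t, x) for i in idx] for x in pts]
    row = idx.index(target)
    # Solving with the k-th unit vector as data gives the weight of f(x_{j,k}).
    return [solve(a, [1.0 if r == k else 0.0 for r in range(5)])[row] for k in range(5)]


# Hypothetical 4-regular knot vector t_1, ..., t_13 (an assumption made for this example).
t = [None, 0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 6.0, 6.0, 6.0]

interior = local_weights(5, t, 5)   # weights for lambda_5 from I = [t_6, t_8]
print([round(6 * w, 10) for w in interior])
```

For an interior index the printed values, being six times the weights, should agree with the coefficients $1, -8, 20, -8, 1$ of the closed-form expression for $\lambda_j f$. Calling `local_weights(3, t, 1)` or `local_weights(3, t, 2)` extracts the first two coefficients from the interval $[t_4, t_6]$ in the same way.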
8.2.4 Some remarks on the constructions
In all our constructions, we have derived specific formulas for the B-spline coefficients
of the quasi-interpolants in terms of the function f to be approximated, which makes it
natural to use the notation cj = λj f . To do this, we had to solve the local linear system
of equations symbolically. When the systems are small this can be done quite easily with
a computer algebra system like Maple or Mathematica, but the solutions quickly become
complicated and useless unless the knots and interpolation points are nicely structured,
preferably with uniform spacing. The advantage of solving the equations symbolically is of
course that we obtain explicit formulas for the coefficients once and for all and can avoid
solving equations when we approximate a particular function.
For general knots, the local systems of equations usually have to be solved numerically, but quasi-interpolants can nevertheless prove very useful. One situation is real-time
processing of data. Suppose we are in a situation where data are measured and need to
be fitted with a spline in real time. With a global approximation method we would have
to recompute the whole spline each time we receive new data. This would be acceptable
at the beginning, but as the data set grows, we would not be able to compute the new
approximation quickly enough. We could split the approximation into smaller pieces at
regular intervals, but quasi-interpolants seem to be a perfect tool for this kind of application. In a real-time application the data will often be measured at fixed time intervals,
and as we have seen it is then easy to construct quasi-interpolants with explicit formulas
for the coefficients. Even if this is not practicable because the explicit expressions are not
available or become too complicated, we just have to solve a simple, linear set of equations
to determine each new coefficient. The important fact is that the size of the system is
constant so that we can handle almost arbitrarily large data sets, the only limitation being
available storage space.
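The real-time setting can be sketched in a few lines. The example below is a minimal illustration, assuming uniformly spaced knots and samples arriving at half-knot spacing, so that every second sample falls on a knot; it uses the 3-point quadratic formula of Section 8.2.2, and the function name `streaming_coefficients` is hypothetical. Each new coefficient needs only the last three samples, so the work per coefficient stays constant however long the stream becomes.

```python
from collections import deque

def streaming_coefficients(samples):
    """Yield coefficients c_j = (-f(t_{j+1}) + 4 f(t_{j+3/2}) - f(t_{j+2})) / 2
    from a stream of samples taken at t_1, t_{3/2}, t_2, t_{5/2}, ...
    (assumption: uniform knots, samples at half-knot spacing)."""
    window = deque(maxlen=3)          # only the last three samples are kept
    for count, value in enumerate(samples):
        window.append(value)
        # A new knot is reached at every second sample after the first.
        if count >= 2 and count % 2 == 0:
            left, mid, right = window
            yield (-left + 4 * mid - right) / 2

# Samples of f(x) = x^2 on the knots 0, 1, 2, 3 and their midpoints.
values = [x * x for x in (0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0)]
coefficients = list(streaming_coefficients(values))
```

For this quadratic the coefficients come out as the products of neighbouring knots, 0·1, 1·2 and 2·3, in agreement with the reproduction of quadratic polynomials.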
Another important feature of quasi-interpolants is their flexibility. In our constructions
we have assumed that the function we approximate can be evaluated at any point that
we need. This may sometimes be the case, but often the function is only partially known
by a few discrete, measured values at specific abscissas. The procedure for constructing
quasi-interpolants has so much inherent freedom that it can be adapted in a number of
ways to virtually any specific situation, whether the whole data set is available a priori or
the approximation has to be produced in real-time as the data is generated.
8.3 Quasi-interpolants are linear operators
Now that we have seen some examples of quasi-interpolants, let us examine them from a
more general point of view. The basic ingredient of quasi-interpolants is the computation
of each B-spline coefficient, and we have used the notation cj = λj f = λj (f ) to
indicate that each coefficient depends on f . It is useful to think of λj as a ’function’ that
takes an ordinary function as input and gives a real number as output; such ’functions’
are usually called functionals. If we go back and look at our examples, we notice that in
each case the dependency of our coefficient functionals on f is quite simple: The function
values occur explicitly in the coefficient expressions and are not multiplied or operated
on in any way other than being added together and multiplied by real numbers. This is
familiar from linear algebra.
Definition 8.3. In the construction of quasi-interpolants, each B-spline coefficient is computed by evaluating a linear functional. A linear functional λ is a mapping from a suitable
space of functions S into the real numbers R with the property that if f and g are two
arbitrary functions in S and α and β are two real numbers then
λ(αf + βg) = αλf + βλg.
Linearity is a necessary property of a functional that is being used to compute B-spline
coefficients in the construction of quasi-interpolants. If one of the coefficient functionals is
not linear, then the resulting approximation method is not a quasi-interpolant. Linearity
of the coefficient functionals leads to linearity of the approximation scheme.
Lemma 8.4. Any quasi-interpolant P is a linear operator, i.e., for any two admissible
functions f and g and any real numbers α and β,
P (αf + βg) = αP f + βP g.
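Proof. Suppose that the linear coefficient functionals are $(\lambda_j)_{j=1}^n$. Then we have
\[
P(\alpha f + \beta g) = \sum_{j=1}^{n} \lambda_j(\alpha f + \beta g) B_j
= \alpha \sum_{j=1}^{n} \lambda_j f \, B_j + \beta \sum_{j=1}^{n} \lambda_j g \, B_j
= \alpha P f + \beta P g,
\]
which demonstrates the linearity of $P$.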
This lemma is simple, but very important since there are so many powerful mathematical tools available to analyse linear operators. In Chapter 9 we are going to see how well
a given function can be approximated by splines. We will do this by applying basic tools
in the analysis of linear operators to some specific quasi-interpolants.
8.4 Different kinds of linear functionals and their uses
In our examples of quasi-interpolants in Section 8.2 the coefficient functionals were all
linear combinations of function values, but there are other functionals that can be useful.
In this section we will consider some of these and how they turn up in approximation
problems.
8.4.1 Point functionals
Let us start by recording the form of the functionals that we have already encountered.
The coefficient functionals in Section 8.2 were all in the form
\[
\lambda f = \sum_{i=0}^{\ell} w_i f(x_i) \tag{8.2}
\]
for suitable numbers $(w_i)_{i=0}^{\ell}$ and $(x_i)_{i=0}^{\ell}$. Functionals of this kind can be used if a procedure
is available to compute values of the function f or if measured values of f at specific points
are known. Most of our quasi-interpolants will be of this kind.
Point functionals of this type occur naturally in at least two situations. The first is
when the local approximation method is interpolation, as in our examples above. The
second is when the local approximation method is discrete least squares approximation.
As a simple example, suppose our spline space is $S_{2,t}$ and that in determining $c_j$ we
consider the single knot interval $I = [t_{j+1}, t_{j+2}]$. Suppose also that we have 10 function
values at the points $(x_{j,k})_{k=0}^{9}$ in this interval. Since the dimension of $S_{2,t,I}$ is 3, we cannot
interpolate all 10 points. The solution is to perform a local least squares approximation
and determine the local approximation by minimising the sum of the squares of the errors,
\[
\min_{g \in S_{2,t,I}} \sum_{k=0}^{9} \bigl( g(x_{j,k}) - f(x_{j,k}) \bigr)^2.
\]
The result is that $c_j$ will be a linear combination of the 10 function values,
\[
c_j = \lambda_j f = \sum_{k=0}^{9} w_{j,k} f(x_{j,k}).
\]
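The weights of such a least squares functional can be computed numerically. The sketch below is an illustration under stated assumptions rather than the construction used in the text: the knot interval is normalised to [0, 1], the ten points are taken equally spaced, and the three B-splines are replaced by the quadratic Bernstein polynomials, which is legitimate because the middle B-spline coefficient of a quadratic polynomial only involves the two knots bounding the interval. The helper name `bernstein2` is hypothetical.

```python
import numpy as np

def bernstein2(u):
    """Quadratic Bernstein basis at u in [0, 1]; the columns play the role of
    B_{j-1}, B_j, B_{j+1} on the knot interval (assumption: Bezier knots)."""
    u = np.asarray(u, dtype=float)
    return np.column_stack([(1 - u) ** 2, 2 * u * (1 - u), u ** 2])

# Ten sample points in the normalised knot interval [0, 1].
x = np.linspace(0.0, 1.0, 10)
A = bernstein2(x)                      # 10 x 3 collocation matrix

# The least squares coefficients are pinv(A) @ f(x), so the middle row of
# pinv(A) holds the weights w_{j,k} in c_j = sum_k w_{j,k} f(x_{j,k}).
weights = np.linalg.pinv(A)[1]

def f(s):
    return 3 * s ** 2 - s + 2          # a quadratic, hence reproduced exactly

c_j = weights @ f(x)
```

Since the sampled function is itself a quadratic, `c_j` equals its middle Bernstein coefficient, and the weights sum to one because constants are reproduced.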
8.4.2 Derivative functionals
In addition to function values, we can also compute derivatives of a function at a point.
Since differentiation is a linear operator it is easy to check that a functional like $\lambda f = f''(x_i)$
is linear. The most general form of a derivative functional based at a point that we will
consider is
\[
\lambda f = \sum_{k=0}^{r} w_k f^{(k)}(x)
\]
where x is a suitable point in the domain of f . We will construct a quasi-interpolant based
on this kind of coefficient functionals in Section 8.6.1. By combining derivative functionals
based at different points we obtain
\[
\lambda f = \sum_{i=0}^{\ell} \sum_{k=0}^{r_i} w_{i,k} f^{(k)}(x_i)
\]
where each ri is a nonnegative integer. A typical functional of this kind is the divided
difference of a function when some of the arguments are repeated. Such functionals are
fundamental in interpolation with polynomials. Recall that if the same argument occurs
r + 1 times in a divided difference, this signifies that all derivatives of order 0, 1, . . . , r
are to be interpolated at the point. Note that the point functionals above are derivative
functionals with ri = 0 for all i.
8.4.3 Integral functionals
The final kind of linear functionals that we will consider are based on integration. A
typical functional of this kind is
\[
\lambda f = \int_a^b f(x)\phi(x)\,dx \tag{8.3}
\]
where φ is some fixed function. Because of basic properties of integration, it is easy to
check that this is a linear functional. Just as with point functionals, we can combine
several functionals like the one in (8.3) together,
\[
\lambda f = w_0 \int_a^b f(x)\phi_0(x)\,dx + w_1 \int_a^b f(x)\phi_1(x)\,dx + \cdots + w_\ell \int_a^b f(x)\phi_\ell(x)\,dx,
\]
where $(w_i)_{i=0}^{\ell}$ are real numbers and $\{\phi_i\}_{i=0}^{\ell}$ are suitable functions. Note that the right-hand side of this equation can be written in the form (8.3) if we define $\phi$ by
φ(x) = w0 φ0 (x) + w1 φ1 (x) + · · · + w` φ` (x).
Point functionals can be considered a special case of integral functionals. For if $\phi_\epsilon$ is a
function that is positive on the interval $I_\epsilon = (x_i - \epsilon, x_i + \epsilon)$ and $\int_{I_\epsilon} \phi_\epsilon = 1$, then we know
from the mean value theorem that
\[
\int_{I_\epsilon} f(x)\phi_\epsilon(x)\,dx = f(\xi)
\]
for some $\xi$ in $I_\epsilon$, as long as $f$ is a nicely behaved (for example continuous) function. If we
let $\epsilon$ tend to 0 we clearly have
\[
\lim_{\epsilon \to 0} \int_{I_\epsilon} f(x)\phi_\epsilon(x)\,dx = f(x_i), \tag{8.4}
\]
so by letting $\phi$ in (8.3) be a nonnegative function with small support around $x_i$ and unit
integral we can come as close to point interpolation as we wish.
If we include the condition that $\int_a^b \phi\,dx = 1$, then the natural interpretation of (8.3)
is that λf gives a weighted average of the function f , with φ(x) giving the weight of the
function value f (x). A special case of this is when φ is the constant φ(x) = 1/(b − a); then
λf is the traditional average of f . From this point of view the limit (8.4) is quite obvious:
if we take the average of f over ever smaller intervals around xi , the limit must be f (xi ).
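The limit (8.4) is easy to observe numerically. The following sketch assumes the simplest admissible weight, the constant 1/(2ε) on (x_i − ε, x_i + ε), and approximates the integral with a composite trapezoidal rule; the function name `averaged_value` and the choice f = cos are illustrative only.

```python
import numpy as np

def averaged_value(f, x0, eps, m=2001):
    """Approximate the integral of f * phi_eps over (x0 - eps, x0 + eps),
    where phi_eps is the constant 1 / (2 * eps), by the trapezoidal rule."""
    x = np.linspace(x0 - eps, x0 + eps, m)
    y = f(x) / (2 * eps)
    dx = x[1] - x[0]
    return dx * (y.sum() - 0.5 * (y[0] + y[-1]))

f = np.cos
x0 = 0.3
# Distance between the weighted average and the point value f(x0).
errors = [abs(averaged_value(f, x0, eps) - f(x0)) for eps in (0.1, 0.01, 0.001)]
```

The entries of `errors` shrink as the interval narrows, which is the behaviour that (8.4) predicts.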
The functional $\int_a^b f(x)\,dx$ is often referred to as the first moment of $f$. As the name
suggests there are more moments. The $(i+1)$st moment of $f$ is given by
\[
\int_a^b f(x) x^i\,dx.
\]
Moments of a function occur in many applications of mathematics like physics and the
theory of probability.
8.4.4 Preservation of moments and interpolation of linear functionals
Interpolation of function values is a popular approximation method, and we have used it
repeatedly in this book. However, is it a good way to approximate a given function f ?
Is it not a bit haphazard to pick out a few, rather arbitrary, points on the graph of f
and insist that our approximation should reproduce these points exactly and then ignore
all other information about f ? As an example of what can happen, suppose that we are
given a set of function values $\bigl(x_i, f(x_i)\bigr)_{i=1}^{m}$ and that we use piecewise linear interpolation
to approximate the underlying function. If f has been sampled densely and we interpolate
all the values, we would expect the approximation to be good, but consider what happens
if we interpolate only two of the values. In this case we cannot expect the resulting
straight line to be a good approximation. If we are only allowed to reproduce two pieces
of information about f we would generally do much better by reproducing its first two
moments, i.e., the two integrals $\int f(x)\,dx$ and $\int f(x)x\,dx$, since this would ensure that the
approximation would reproduce some of the average behaviour of f .
Reproduction of moments is quite easy to accomplish. If our approximation is g, we
just have to ensure that the conditions
\[
\int_a^b g(x) x^i\,dx = \int_a^b f(x) x^i\,dx, \qquad i = 0, 1, \ldots, n-1
\]
are enforced if we want to reproduce $n$ moments. In fact, this can be viewed as a generalisation of interpolation if we view interpolation to be preservation of the values of a set
of linear functionals $(\rho_i)_{i=1}^{n}$,
\[
\rho_i g = \rho_i f, \qquad \text{for } i = 1, 2, \ldots, n. \tag{8.5}
\]
When $\rho_i f = \int_a^b f(x) x^{i-1}\,dx$ for $i = 1, \ldots, n$ we preserve moments, while if $\rho_i f = f(x_i)$ for
$i = 1, \ldots, n$ we preserve function values. Suppose for example that $g$ is required to lie in
the linear space spanned by the basis $\{\psi_j\}_{j=1}^{n}$. Then we can determine coefficients $(c_j)_{j=1}^{n}$
so that $g(x) = \sum_{j=1}^{n} c_j \psi_j(x)$ satisfies the interpolation conditions (8.5) by inserting this
expression for $g$ into (8.5). By exploiting the linearity of the functionals, we end up with
the $n$ linear equations
\[
c_1 \rho_i(\psi_1) + c_2 \rho_i(\psi_2) + \cdots + c_n \rho_i(\psi_n) = \rho_i(f), \qquad i = 1, \ldots, n
\]
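in the $n$ unknown coefficients $(c_j)_{j=1}^{n}$. In matrix-vector form this becomes
\[
\begin{pmatrix}
\rho_1(\psi_1) & \rho_1(\psi_2) & \cdots & \rho_1(\psi_n) \\
\rho_2(\psi_1) & \rho_2(\psi_2) & \cdots & \rho_2(\psi_n) \\
\vdots & \vdots & \ddots & \vdots \\
\rho_n(\psi_1) & \rho_n(\psi_2) & \cdots & \rho_n(\psi_n)
\end{pmatrix}
\begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{pmatrix}
=
\begin{pmatrix} \rho_1(f) \\ \rho_2(f) \\ \vdots \\ \rho_n(f) \end{pmatrix}.
\tag{8.6}
\]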
A fundamental property of interpolation by point functionals is that the only polynomial
of degree $d$ that interpolates the value 0 at $d+1$ points is the zero polynomial. This
corresponds to the fact that when $\rho_i f = f(x_i)$ and $\psi_i(x) = x^i$ for $i = 0, \ldots, d$, the matrix
in (8.6) is nonsingular. Similarly, it turns out that the only polynomial of degree $d$ whose
first $d+1$ moments vanish is the zero polynomial, which corresponds to the fact that the
matrix in (8.6) is nonsingular when $\rho_i f = \int_a^b f(x) x^i\,dx$ and $\psi_i(x) = x^i$ for $i = 0, \ldots, d$.
If the equations (8.6) can be solved, each coefficient will be a linear combination of the
entries on the right-hand side,
\[
c_j = \lambda_j f = w_{j,1}\rho_1(f) + w_{j,2}\rho_2(f) + \cdots + w_{j,n}\rho_n(f).
\]
We recognise this as (8.2) when the $\rho_i$ correspond to point functionals, whereas we have
\[
\begin{aligned}
c_j = \lambda_j f &= w_{j,1}\int_a^b f(x)\,dx + w_{j,2}\int_a^b f(x)x\,dx + \cdots + w_{j,n}\int_a^b f(x)x^{n-1}\,dx \\
&= \int_a^b f(x)\bigl( w_{j,1} + w_{j,2}x + \cdots + w_{j,n}x^{n-1} \bigr)\,dx
\end{aligned}
\]
when the ρi correspond to preservation of moments.
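As a small worked instance of the system (8.6) for moment preservation, the sketch below takes [a, b] = [0, 1], the monomial basis ψ_j(x) = x^{j−1}, and the sample function f(x) = x³, whose moments are known in closed form. These choices, and all variable names, are assumptions made purely for illustration.

```python
import numpy as np

n = 2                                   # number of moments to preserve
i = np.arange(1, n + 1)

# Matrix of (8.6) on [0, 1] with psi_j(x) = x**(j-1) and
# rho_i f = integral of f(x) * x**(i-1): entry (i, j) equals 1 / (i + j - 1).
G = 1.0 / (i[:, None] + i[None, :] - 1)

# Right-hand side for f(x) = x**3: rho_i f = 1 / (i + 3).
rhs = 1.0 / (i + 3)

c = np.linalg.solve(G, rhs)             # coefficients of g(x) = c[0] + c[1] * x

# The moments of g are G @ c; they should coincide with the moments of f.
moments_g = G @ c
```

Under these assumptions the solution is g(x) = −0.2 + 0.9x, the straight line whose first two moments on [0, 1] agree with those of x³.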
8.4.5 Least squares approximation
In the discussion of point functionals, we mentioned that least squares approximation leads
to coefficients that are linear combinations of point functionals when the error is measured
by summing up the squares of the errors at a given set of data points. This is naturally
termed discrete least squares approximation. In continuous least squares approximation
we minimise the integral of the square of the error. If the function to be approximated is
f and the approximation g is required to lie in a linear space S, we solve the minimisation
problem
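\[
\min_{g \in S} \int_a^b \bigl( f(x) - g(x) \bigr)^2\,dx.
\]
If $S$ is spanned by $(\psi_i)_{i=1}^{n}$, we can write $g$ as $g = \sum_{i=1}^{n} c_i \psi_i$ and the minimisation problem
becomes
\[
\min_{(c_1, \ldots, c_n) \in \mathbb{R}^n} \int_a^b \Bigl( f(x) - \sum_{i=1}^{n} c_i \psi_i(x) \Bigr)^2\,dx.
\]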
To determine the minimum we differentiate with respect to each coefficient and set the
derivatives to zero which leads to the so-called normal equations
\[
\sum_{i=1}^{n} c_i \int_a^b \psi_i(x)\psi_j(x)\,dx = \int_a^b \psi_j(x) f(x)\,dx, \qquad \text{for } j = 1, \ldots, n.
\]
If we use the notation above and introduce the linear functionals $\rho_i f = \int_a^b \psi_i(x) f(x)\,dx$
represented by the basis functions, we recognise this linear system as an instance of (8.6).
In other words, least squares approximation is nothing but interpolation of the linear
functionals represented by the basis functions. In particular, preservation of moments
corresponds to least squares approximation by polynomials.
8.4.6 Computation of integral functionals
In our discussions involving integral functionals we have tacitly assumed that the values
of integrals like $\int_a^b f(x)\psi(x)\,dx$ are readily available. This is certainly true if both $f$ and $\psi$
are polynomials, and it turns out that it is also true if both $f$ and $\psi$ are splines. However,
if $f$ is some general function, then the integral cannot usually be determined exactly, even
when $\psi$ is a polynomial. In such situations we have to resort to numerical integration
methods. Numerical integration amounts to computing an approximation to an integral
by evaluating the function to be integrated at certain points, multiplying the function
values by suitable weights, and then adding up to obtain the approximate value of the
integral,
\[
\int_a^b f(x)\,dx \approx w_0 f(x_0) + w_1 f(x_1) + \cdots + w_\ell f(x_\ell).
\]
In other words, when it comes to practical implementation of integral functionals we have
to resort to point functionals. In spite of this, integral functionals and continuous least
squares approximation are such important concepts that it is well worth while to have an
exact mathematical description. And it is important to remember that we do have exact
formulas for the integrals of polynomials and splines.
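To illustrate how an integral functional turns into a point functional in practice, the sketch below replaces the second moment functional on [0, 1] by a Gauss-Legendre rule. The choice of rule, the number of nodes and the function name `moment_as_point_functional` are assumptions for the example; any quadrature rule gives a functional of the same form.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

def moment_as_point_functional(a, b, power, nodes=5):
    """Return points x_k and weights w_k such that
    sum_k w_k f(x_k) approximates the integral of f(x) * x**power over [a, b]."""
    t, w = leggauss(nodes)                     # rule on [-1, 1]
    x = 0.5 * (b - a) * t + 0.5 * (a + b)      # map the nodes to [a, b]
    w = 0.5 * (b - a) * w * x ** power         # absorb the factor x**power
    return x, w

x, w = moment_as_point_functional(0.0, 1.0, power=1)

approx_poly = w @ x ** 2        # integral of x^2 * x over [0, 1] is 1/4
approx_exp = w @ np.exp(x)      # integral of exp(x) * x over [0, 1] is 1
```

The polynomial integrand is integrated exactly up to rounding, while the exponential one is only approximated, although very accurately with five nodes.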
8.5 Alternative ways to construct coefficient functionals
In Section 8.2 we constructed three quasi-interpolants by following the general procedure
in Section 8.1. In this section we will deduce two alternative ways to construct quasi-interpolants.
8.5.1 Computation via evaluation of linear functionals
Let us use the 3-point, quadratic quasi-interpolant in subsection 8.2.2 as an example. In
this case we used $I = [t_{j+1}, t_{j+2}]$ as the local interval for determining $c_j = \lambda_j f$. This
meant that the local spline space $S_{2,t,I}$ becomes the space of quadratic polynomials on $I$
which has dimension three. This space is spanned by the three B-splines $\{B_i\}_{i=j-1}^{j+1}$ and
interpolation at the three points
\[
t_{j+1}, \qquad t_{j+3/2} = \frac{t_{j+1} + t_{j+2}}{2}, \qquad t_{j+2}
\]
allowed us to determine a local interpolant $g^I = \sum_{i=j-1}^{j+1} b_i B_i$ whose middle coefficient $b_j$
we used as $\lambda_j f$.
An alternative way to do this is as follows. Since $g^I$ is constructed by interpolation at
the three points $t_{j+1}$, $t_{j+3/2}$ and $t_{j+2}$, we know that $\lambda_j f$ can be written in the form
\[
\lambda_j f = w_1 f(t_{j+1}) + w_2 f(t_{j+3/2}) + w_3 f(t_{j+2}). \tag{8.7}
\]
We want to reproduce the local spline space which in this case is just the space of quadratic
polynomials. This means that (8.7) should be valid for all quadratic polynomials. Reproduction of quadratic polynomials can be accomplished by demanding that (8.7) should be
exact when $f$ is replaced by the three elements of a basis for $S_{2,t,I}$. The natural basis to
use in our situation is the B-spline basis $\{B_i\}_{i=j-1}^{j+1}$. Inserting this, we obtain the system
\[
\begin{aligned}
\lambda_j B_{j-1} &= w_1 B_{j-1}(t_{j+1}) + w_2 B_{j-1}(t_{j+3/2}) + w_3 B_{j-1}(t_{j+2}), \\
\lambda_j B_j &= w_1 B_j(t_{j+1}) + w_2 B_j(t_{j+3/2}) + w_3 B_j(t_{j+2}), \\
\lambda_j B_{j+1} &= w_1 B_{j+1}(t_{j+1}) + w_2 B_{j+1}(t_{j+3/2}) + w_3 B_{j+1}(t_{j+2})
\end{aligned}
\]
in the three unknowns $w_1$, $w_2$ and $w_3$. The left-hand sides of these equations are easy to
determine. Since $\lambda_j f$ denotes the $j$th B-spline coefficient, it is clear that $\lambda_j B_i = \delta_{i,j}$, i.e.,
it is 1 when $i = j$ and 0 otherwise.
To determine the right-hand sides we have to compute the values of the B-splines. For
this it is useful to note that the weights $w_i$ in equation (8.7) cannot involve any of the knots
other than $t_{j+1}$ and $t_{j+2}$ since a general polynomial knows nothing about these knots.
This means that we can choose the other knots so as to make life simple for ourselves.
The easiest option is to choose the first three knots equal to $t_{j+1}$ and the last three equal
to $t_{j+2}$. But then we are in the Bézier setting, and we know that the B-splines in this
case will have the same values if we choose $t_{j+1} = 0$ and $t_{j+2} = 1$. The knots are then
$(0, 0, 0, 1, 1, 1)$ which means that $t_{j+3/2} = 1/2$. If we denote the B-splines on these knots
by $\{\tilde{B}_i\}_{i=1}^{3}$, we can replace $B_i$ in the system above by $\tilde{B}_{i-j+2}$ for $i = j-1, j, j+1$. We can now simplify
the system to
\[
\begin{aligned}
0 &= w_1 \tilde{B}_1(0) + w_2 \tilde{B}_1(1/2) + w_3 \tilde{B}_1(1), \\
1 &= w_1 \tilde{B}_2(0) + w_2 \tilde{B}_2(1/2) + w_3 \tilde{B}_2(1), \\
0 &= w_1 \tilde{B}_3(0) + w_2 \tilde{B}_3(1/2) + w_3 \tilde{B}_3(1).
\end{aligned}
\]
If we insert the values of the B-splines we end up with the system
\[
\begin{aligned}
w_1 + w_2/4 &= 0, \\
w_2/2 &= 1, \\
w_2/4 + w_3 &= 0,
\end{aligned}
\]
which has the solution $w_1 = -1/2$, $w_2 = 2$ and $w_3 = -1/2$. In conclusion we have
\[
\lambda_j f = \frac{-f(t_{j+1}) + 4f(t_{j+3/2}) - f(t_{j+2})}{2},
\]
as we found in Section 8.2.2.
This approach to determining the linear functional works quite generally and is often
the easiest way to compute the weights (wi ).
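The small system for the weights can also be set up and solved numerically. The sketch below assumes the normalised Bézier knots (0, 0, 0, 1, 1, 1), so that the three B-splines are the quadratic Bernstein polynomials; the helper name `bernstein2` is hypothetical.

```python
import numpy as np

def bernstein2(u):
    """Values of the three quadratic B-splines on the knots (0,0,0,1,1,1) at u."""
    return np.array([(1 - u) ** 2, 2 * u * (1 - u), u ** 2])

points = [0.0, 0.5, 1.0]               # t_{j+1}, t_{j+3/2}, t_{j+2} after normalisation

# Entry (i, k) is the value of the (i+1)th B-spline at the kth point, so each
# row is one equation of the system in the unknowns w_1, w_2, w_3.
M = np.column_stack([bernstein2(u) for u in points])

rhs = np.array([0.0, 1.0, 0.0])        # lambda_j B_i = delta_{i,j}
w = np.linalg.solve(M, rhs)
```

The computed vector `w` agrees with the weights −1/2, 2 and −1/2 found by hand.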
8.5.2 Computation via explicit representation of the local approximation
There is a third way to determine the expression for λj f . For this we write down an explicit
expression for the approximation g I . Using the 3-point quadratic quasi-interpolant as our
example again, we introduce the abbreviations $a = t_{j+1}$, $b = t_{j+3/2}$ and $c = t_{j+2}$. We can
write the local interpolant $g^I$ as
\[
g^I(x) = \frac{(x-b)(x-c)}{(a-b)(a-c)} f(a) + \frac{(x-a)(x-c)}{(b-a)(b-c)} f(b) + \frac{(x-a)(x-b)}{(c-a)(c-b)} f(c),
\]
as it is easily verified that $g^I$ then satisfies the three interpolation conditions $g^I(a) = f(a)$,
$g^I(b) = f(b)$ and $g^I(c) = f(c)$. What remains is to write this in terms of the B-spline
basis $\{B_i\}_{i=j-1}^{j+1}$ and pick out coefficient number $j$. Recall that we have the notation $\gamma_j(f)$
for the $j$th B-spline coefficient of a spline $f$. Coefficient number $j$ on the left-hand side is
$\lambda_j f$. On the right, we find the B-spline coefficients of each of the three polynomials and
add up. The numerator of the first polynomial is $(x-b)(x-c) = x^2 - (b+c)x + bc$. To
find the $j$th B-spline coefficient of this polynomial, we make use of Corollary 3.5 which tells us that,
when $d = 2$, we have $\gamma_j(x^2) = t_{j+1}t_{j+2} = ac$ and $\gamma_j(x) = (t_{j+1} + t_{j+2})/2 = (a+c)/2 = b$,
respectively. The $j$th B-spline coefficient of the first polynomial is therefore
\[
\gamma_j\Bigl( \frac{(x-b)(x-c)}{(a-b)(a-c)} \Bigr) = \frac{ac - (b+c)b + bc}{(a-b)(a-c)} = \frac{ac - b^2}{(a-b)(a-c)} \tag{8.8}
\]
which simplifies to $-1/2$ since $b = (a+c)/2$. Similarly, we find that the $j$th B-spline
coefficients of the second and third polynomials are 2 and $-1/2$, respectively. The complete
$j$th B-spline coefficient of the local interpolant $g^I$ is therefore $-f(a)/2 + 2f(b) - f(c)/2$.
In total, we have therefore obtained
\[
\lambda_j f = \gamma_j(g^I) = -\frac{f(t_{j+1})}{2} + 2f(t_{j+3/2}) - \frac{f(t_{j+2})}{2},
\]
as required.
This procedure also works quite generally, and we will see another example of it in
Section 8.6.1.
8.6 Two quasi-interpolants based on point functionals
In this section we consider two particular quasi-interpolants that can be constructed for
any polynomial degree. They may be useful for practical approximation problems, but
we are going to use them to prove special properties of spline functions in Chapters 9
and 10. Both quasi-interpolants are based on point functionals: In the first case all the
points are identical which leads to derivative functionals, in the second case all the points
are distinct.
8.6.1 A quasi-interpolant based on the Taylor polynomial
A very simple local, polynomial approximation is the Taylor polynomial. This leads to a
quasi-interpolant based on derivative functionals. Even though we use splines of degree d,
our local approximation can be of lower degree; in Theorem 8.5 this degree is given by r.
Theorem 8.5 (de Boor-Fix). Let $r$ be an integer with $0 \le r \le d$ and let $x_j$ be a number
in $[t_j, t_{j+d+1}]$ for $j = 1, \ldots, n$. Consider the quasi-interpolant
\[
Q_{d,r} f = \sum_{j=1}^{n} \lambda_j(f) B_{j,d}, \qquad \text{where } \lambda_j(f) = \frac{1}{d!} \sum_{k=0}^{r} (-1)^k D^{d-k}\rho_{j,d}(x_j)\, D^k f(x_j), \tag{8.9}
\]
and ρj,d (y) = (y − tj+1 ) · · · (y − tj+d ). Then Qd,r reproduces all polynomials of degree r
and Qd,d reproduces all splines in Sd,t .
Proof. To construct $Q_{d,r}$ we let $I$ be the knot interval that contains $x_j$ and let the local
approximation $g^I = P_r^I f$ be the Taylor polynomial of degree $r$ at the point $x_j$,
\[
g^I(x) = P_r^I f(x) = \sum_{k=0}^{r} \frac{(x - x_j)^k}{k!} D^k f(x_j).
\]
To construct the linear functional $\lambda_j f$, we have to find the B-spline coefficients of this
polynomial. We use the same approach as in Section 8.5.2. For this Marsden's identity,
\[
(y - x)^d = \sum_{j=1}^{n} \rho_{j,d}(y) B_{j,d}(x),
\]
will be useful. Setting $y = x_j$, we see that the $j$th B-spline coefficient of $(x_j - x)^d$ is
$\rho_{j,d}(x_j)$. Differentiating Marsden's identity $d - k$ times with respect to $y$, setting $y = x_j$
and rearranging, we obtain the $j$th B-spline coefficient of $(x - x_j)^k/k!$ as
\[
\gamma_j\bigl( (x - x_j)^k/k! \bigr) = (-1)^k D^{d-k}\rho_{j,d}(x_j)/d! \qquad \text{for } k = 0, \ldots, r.
\]
Summing up, we find that
\[
\lambda_j(f) = \frac{1}{d!} \sum_{k=0}^{r} (-1)^k D^{d-k}\rho_{j,d}(x_j)\, D^k f(x_j).
\]
Since the Taylor polynomial of degree r reproduces polynomials of degree r, we know
that the quasi-interpolant will do the same. If r = d, we reproduce polynomials of degree
d which agree with the local spline space Sd,t,I since I is a single knot interval. The
quasi-interpolant therefore reproduces the whole spline space Sd,t in this case.
Example 8.6. We find
\[
D^d \rho_{j,d}(y)/d! = 1, \qquad D^{d-1}\rho_{j,d}(y)/d! = y - t_j^*, \qquad \text{where } t_j^* = \frac{t_{j+1} + \cdots + t_{j+d}}{d}. \tag{8.10}
\]
For $r = 1$ and $x_j = t_j^*$ we therefore obtain
\[
Q_{d,r} f = \sum_{j=1}^{n} f(t_j^*) B_{j,d}
\]
which is the Variation Diminishing spline approximation. For $d = r = 2$ we obtain
\[
Q_{2,2} f = \sum_{j=1}^{n} \Bigl( f(x_j) - (x_j - t_{j+3/2}) Df(x_j) + \frac{1}{2}(x_j - t_{j+1})(x_j - t_{j+2}) D^2 f(x_j) \Bigr) B_{j,2}, \tag{8.11}
\]
while for $d = r = 3$ and $x_j = t_{j+2}$ we obtain
\[
Q_{3,3} f = \sum_{j=1}^{n} \Bigl( f(t_{j+2}) + \frac{1}{3}(t_{j+3} - 2t_{j+2} + t_{j+1}) Df(t_{j+2}) - \frac{1}{6}(t_{j+3} - t_{j+2})(t_{j+2} - t_{j+1}) D^2 f(t_{j+2}) \Bigr) B_{j,3}. \tag{8.12}
\]
We leave the detailed derivation as a problem for the reader.
Since $Q_{d,d} f = f$ for all $f \in S_{d,t}$ it follows that the coefficients of a spline $f = \sum_{j=1}^{n} c_j B_{j,d}$ can be written in the form
\[
c_j = \frac{1}{d!} \sum_{k=0}^{d} (-1)^k D^{d-k}\rho_{j,d}(x_j)\, D^k f(x_j), \qquad \text{for } j = 1, \ldots, n, \tag{8.13}
\]
where $x_j$ is any number in $[t_j, t_{j+d+1}]$.
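The functional in (8.9) is straightforward to evaluate when f is a polynomial. The following sketch is an illustration only: it stores the knots t_1, t_2, ... in a Python list starting at index 0, represents both ρ_{j,d} and f with NumPy's `Polynomial` class, and uses the hypothetical function name `de_boor_fix`. The knot vector and the evaluation point are arbitrary choices.

```python
from math import factorial
from numpy.polynomial import Polynomial

def de_boor_fix(f, knots, j, d, x, r=None):
    """Evaluate lambda_j(f) from (8.9) for a polynomial f.

    knots[i - 1] holds t_i, so the roots t_{j+1}, ..., t_{j+d} of rho_{j,d}
    are knots[j:j + d]. If r is omitted, r = d is used."""
    r = d if r is None else r
    rho = Polynomial.fromroots(knots[j:j + d])
    total = 0.0
    for k in range(r + 1):
        total += (-1) ** k * rho.deriv(d - k)(x) * f.deriv(k)(x)
    return total / factorial(d)

# Quadratic splines (d = 2) on a 3-regular knot vector with n = 5 B-splines.
knots = [0, 0, 0, 1, 2, 4, 4, 4]

# For j = 3 we have t_4 = 1 and t_5 = 2; x = 1.7 lies in [t_3, t_6] = [0, 4].
coef_x2 = de_boor_fix(Polynomial([0, 0, 1]), knots, j=3, d=2, x=1.7)   # f(x) = x^2
coef_x = de_boor_fix(Polynomial([0, 1]), knots, j=3, d=2, x=1.7)       # f(x) = x
```

The two values agree with the B-spline coefficients t_{j+1} t_{j+2} = 2 of x² and (t_{j+1} + t_{j+2})/2 = 1.5 of x for d = 2, and they do not depend on the particular point x chosen in the support.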
8.6.2 Quasi-interpolants based on evaluation
Another natural class of linear functionals is the one where each $\lambda_j$ used to define $Q$ is
constructed by evaluating the data at $r + 1$ distinct points
\[
t_j \le x_{j,0} < x_{j,1} < \cdots < x_{j,r} \le t_{j+d+1} \tag{8.14}
\]
located in the support $[t_j, t_{j+d+1}]$ of the B-spline $B_{j,d}$ for $j = 1, \ldots, n$. We consider the
quasi-interpolant
\[
P_{d,r} f = \sum_{j=1}^{n} \lambda_{j,r}(f) B_{j,d}, \tag{8.15}
\]
where
\[
\lambda_{j,r}(f) = \sum_{k=0}^{r} w_{j,k} f(x_{j,k}). \tag{8.16}
\]
From the preceding theory we know how to choose the constants $w_{j,k}$ so that $P_{d,r} f = f$
for all $f \in \pi_r$.
Theorem 8.7. Let $S_{d,t}$ be a spline space with a $d+1$-regular knot vector $t = (t_i)_{i=1}^{n+d+1}$.
Let $(x_{j,k})_{k=0}^{r}$ be $r + 1$ distinct points in $[t_j, t_{j+d+1}]$ for $j = 1, \ldots, n$, and let $w_{j,k}$ be the
$j$th B-spline coefficient of the polynomial
\[
p_{j,k}(x) = \prod_{\substack{s=0 \\ s \ne k}}^{r} \frac{x - x_{j,s}}{x_{j,k} - x_{j,s}}.
\]
Then $P_{d,r} f = f$ for all $f \in \pi_r$, and if $r = d$ and all the numbers $(x_{j,k})_{k=0}^{r}$ lie in one
subinterval
\[
t_j \le t_{\ell_j} \le x_{j,0} < x_{j,1} < \cdots < x_{j,r} \le t_{\ell_j + 1} \le t_{j+d+1} \tag{8.17}
\]
then $P_{d,d} f = f$ for all $f \in S_{d,t}$.
Proof. It is not hard to see that
\[
p_{j,k}(x_{j,i}) = \delta_{k,i}, \qquad k, i = 0, \ldots, r,
\]
so that the polynomial
\[
P_{d,r}^I f(x) = \sum_{k=0}^{r} p_{j,k}(x) f(x_{j,k})
\]
satisfies the interpolation conditions $P_{d,r}^I f(x_{j,i}) = f(x_{j,i})$ for all $j$ and $i$. The result
therefore follows from the general recipe.
In order to give examples of quasi-interpolants based on evaluation we need to know
the B-spline coefficients of the polynomials pj,k . We will return to this in more detail in
Chapter 9, see (9.14) in the case r = d. A similar formula can be given for r < d.
Example 8.8. For $r = 1$ we have
\[
p_{j,0}(x) = \frac{x_{j,1} - x}{x_{j,1} - x_{j,0}}, \qquad p_{j,1}(x) = \frac{x - x_{j,0}}{x_{j,1} - x_{j,0}}
\]
and (8.15) takes the form
\[
P_{d,1} f = \sum_{j=1}^{n} \Bigl( \frac{x_{j,1} - t_j^*}{x_{j,1} - x_{j,0}} f(x_{j,0}) + \frac{t_j^* - x_{j,0}}{x_{j,1} - x_{j,0}} f(x_{j,1}) \Bigr) B_{j,d}. \tag{8.18}
\]
This quasi-interpolant reproduces straight lines for any choice of $t_j \le x_{j,0} < x_{j,1} \le t_{j+d+1}$. If we choose
$x_{j,0} = t_j^*$ the method simplifies to
\[
\tilde{P}_{d,1} f = \sum_{j=1}^{n} f(t_j^*) B_{j,d}. \tag{8.19}
\]
This is again the Variation diminishing method of Schoenberg.
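The coefficient formula in (8.18) can be checked on a straight line, whose jth B-spline coefficient is the line evaluated at the knot average. The sketch below stores the knots t_1, t_2, ... in a list starting at index 0; the function name `p_d1_coefficient`, the knot vector and the two evaluation points are illustrative assumptions.

```python
def p_d1_coefficient(f, knots, j, d, x0, x1):
    """Coefficient of B_{j,d} in (8.18); knots[i - 1] holds t_i."""
    t_star = sum(knots[j:j + d]) / d          # knot average t_j^*
    w0 = (x1 - t_star) / (x1 - x0)            # weight of f(x_{j,0})
    w1 = (t_star - x0) / (x1 - x0)            # weight of f(x_{j,1})
    return w0 * f(x0) + w1 * f(x1)

def line(x):
    return 3 * x + 1

knots = [0, 0, 0, 1, 2, 4, 4, 4]              # d = 2, so t_3^* = (1 + 2) / 2 = 1.5
coefficient = p_d1_coefficient(line, knots, j=3, d=2, x0=0.5, x1=3.0)
```

Here `coefficient` equals `line(1.5)`, as reproduction of straight lines requires.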
Exercises for Chapter 8
8.1 In this exercise we assume that the points $(x_{j,k})$ and the spline space $S_{d,t}$ are as in
Theorem 8.7.
a) Show that for $r = d = 2$
\[
\begin{aligned}
P_{2,2} f = \sum_{j=1}^{n} \biggl( & \frac{(t_{j+1} - x_{j,1})(t_{j+2} - x_{j,2}) + (t_{j+2} - x_{j,1})(t_{j+1} - x_{j,2})}{2(x_{j,0} - x_{j,1})(x_{j,0} - x_{j,2})} f(x_{j,0}) \\
& + \frac{(t_{j+1} - x_{j,0})(t_{j+2} - x_{j,2}) + (t_{j+2} - x_{j,0})(t_{j+1} - x_{j,2})}{2(x_{j,1} - x_{j,0})(x_{j,1} - x_{j,2})} f(x_{j,1}) \\
& + \frac{(t_{j+1} - x_{j,0})(t_{j+2} - x_{j,1}) + (t_{j+2} - x_{j,0})(t_{j+1} - x_{j,1})}{2(x_{j,2} - x_{j,0})(x_{j,2} - x_{j,1})} f(x_{j,2}) \biggr) B_{j,2}.
\end{aligned} \tag{8.20}
\]
b) Show that (8.20) reduces to the operator (9.4) for a suitable choice of $(x_{j,k})_{k=0}^{2}$.
8.2 Derive the following operators $Q_{d,l}$ and show that they are exact for $\pi_r$ for the
indicated $r$. Again the points $(x_{j,k})$ and the spline space $S_{d,t}$ are as in Theorem 8.7.
Which of the operators reproduce the whole spline space?
a) $Q_{d,0} f = \sum_{j=1}^{n} f(x_j) B_{j,d}$, $(r = 0)$.
b) $Q_{d,1} f = \sum_{j=1}^{n} \bigl( f(x_j) + (t_j^* - x_j) Df(x_j) \bigr) B_{j,d}$, $(r = 1)$.
c) $\tilde{Q}_{d,1} f = \sum_{j=1}^{n} f(t_j^*) B_{j,d}$, $(r = 1)$.
d)
\[
Q_{2,2} f = \sum_{j=1}^{n} \Bigl( f(x_j) - (x_j - t_{j+3/2}) Df(x_j) + \frac{1}{2}(x_j - t_{j+1})(x_j - t_{j+2}) D^2 f(x_j) \Bigr) B_{j,2}, \qquad (r = 2).
\]
e) $\tilde{Q}_{2,2} f = \sum_{j=1}^{n} \bigl( f(t_{j+3/2}) - \frac{1}{8}(t_{j+2} - t_{j+1})^2 D^2 f(t_{j+3/2}) \bigr) B_{j,2}$, $(r = 2)$.
f)
\[
Q_{3,3} f = \sum_{j=1}^{n} \Bigl( f(t_{j+2}) + \frac{1}{3}(t_{j+3} - 2t_{j+2} + t_{j+1}) Df(t_{j+2}) - \frac{1}{6}(t_{j+3} - t_{j+2})(t_{j+2} - t_{j+1}) D^2 f(t_{j+2}) \Bigr) B_{j,3}, \qquad (r = 3).
\]
CHAPTER 9
Approximation theory and stability
Polynomials of degree d have d+1 degrees of freedom, namely the d+1 coefficients relative
to some polynomial basis. It turns out that each of these degrees of freedom can be utilised
to gain approximation power so that the possible rate of approximation by polynomials of
degree $d$ is $h^{d+1}$, see Section 9.1. The meaning of this is that when a smooth function is
approximated by a polynomial of degree $d$ on an interval of length $h$, the error is bounded
by $Ch^{d+1}$, where $C$ is a constant that is independent of $h$. The exponent $d + 1$ therefore
controls how fast the error tends to zero with h.
When several polynomials are linked smoothly together to form a spline, each polynomial piece has d + 1 coefficients, but some of these are tied up in satisfying the smoothness
conditions. It therefore comes as a nice surprise that the approximation power of splines of
degree $d$ is the same as for polynomials, namely $h^{d+1}$, where $h$ is now the largest distance
between two adjacent knots. In passing from polynomials to splines we have therefore
gained flexibility without sacrificing approximation power. We prove this in Section 9.2,
by making use of some of the simple quasi-interpolants that we constructed in Chapter 8;
it turns out that these produce spline approximations with the required accuracy.
The quasi-interpolants also allow us to establish two important properties of B-splines.
The first is that B-splines form a stable basis for splines, see Section 9.3. This means
that small perturbations of the B-spline coefficients can only lead to small perturbations
in the spline, which is of fundamental importance for numerical computations. We have
already seen that an important consequence of the stability of the B-spline basis is that
the control polygon of a spline converges to the spline as the knot spacing tends to zero;
this was proved in Section 4.1.
9.1 The distance to polynomials
We start by determining how well a given real-valued function $f$ defined on an interval
$[a, b]$ can be approximated by a polynomial of degree $d$. To measure the error in the
approximation we will use the uniform norm which for a bounded function $g$ defined on
an interval $[a, b]$ is defined by
\[
\|g\|_{\infty,[a,b]} = \sup_{a \le x \le b} |g(x)|.
\]
Whenever we have an approximation $p$ to $f$ we can then measure the error by $\|f - p\|_{\infty,[a,b]}$. There are many possible approximations to $f$ by polynomials of degree $d$, and
the approximation that makes the error as small as possible is of course of special interest.
This error is referred to as the distance from f to the space πd of polynomials of degree
≤ d and is defined formally as
\[
\operatorname{dist}_{\infty,[a,b]}(f, \pi_d) = \inf_{p \in \pi_d} \|f - p\|_{\infty,[a,b]}.
\]
In order to bound this approximation error, we have to place some restrictions on the
functions that we approximate, and we will only consider functions with piecewise continuous derivatives. Such functions lie in a space that we denote $C_\Delta^k[a, b]$ for some integer
$k \ge 0$. A function $f$ lies in this space if it has $k - 1$ continuous derivatives on the interval
$[a, b]$, and the $k$th derivative $D^k f$ is continuous everywhere except for a finite number of
points in the interior $(a, b)$, given by $\Delta = (z_j)$. At the points of discontinuity $\Delta$ the limits
from the left and right, given by $D^k f(z_j-)$ and $D^k f(z_j+)$, should exist so all the jumps
are finite. If there are no continuous derivatives we write $C_\Delta[a, b] = C_\Delta^0[a, b]$. Note that
we will often refer to these spaces without stating explicitly what the singularities $\Delta$ are.
An upper bound for the distance of f to polynomials of degree d is fairly simple to
give by choosing a particular approximation, namely Taylor expansion.
Theorem 9.1. Given a polynomial degree $d$ and a function $f$ in $C_\Delta^{d+1}[a, b]$, then
\[
\operatorname{dist}_{\infty,[a,b]}(f, \pi_d) \le K_d h^{d+1} \|D^{d+1} f\|_{\infty,[a,b]},
\]
where $h = b - a$ and
\[
K_d = \frac{1}{2^{d+1}(d+1)!}
\]
depends only on $d$.
Proof. Consider the truncated Taylor series of $f$ at the midpoint $m = (a + b)/2$ of $[a, b]$,
\[
T_d f(x) = \sum_{k=0}^{d} \frac{(x - m)^k}{k!} D^k f(m), \qquad \text{for } x \in [a, b].
\]
Since Td f is a polynomial of degree d we clearly have
\[
\operatorname{dist}_{\infty,[a,b]}(f, \pi_d) \le \|f - T_d f\|_{\infty,[a,b]}. \tag{9.1}
\]
To study the error we use the integral form of the remainder in the Taylor expansion,
\[
f(x) - T_d f(x) = \frac{1}{d!} \int_m^x (x - y)^d D^{d+1} f(y)\,dy,
\]
which is valid for any $x \in [a, b]$. If we restrict $x$ to the interval $[m, b]$ we obtain
\[
|f(x) - T_d f(x)| \le \|D^{d+1} f\|_{\infty,[a,b]} \frac{1}{d!} \int_m^x (x - y)^d\,dy.
\]
The integral is given by
\[
\frac{1}{d!} \int_m^x (x - y)^d\,dy = \frac{1}{(d+1)!}(x - m)^{d+1} \le \frac{1}{(d+1)!} \Bigl( \frac{h}{2} \Bigr)^{d+1},
\]
so for $x \ge m$ we have
\[
|f(x) - T_d f(x)| \le \frac{1}{2^{d+1}(d+1)!} h^{d+1} \|D^{d+1} f\|_{\infty,[a,b]}.
\]
By symmetry this estimate must also hold for x ≤ m and combining it with (9.1) completes
the proof of the theorem.
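The bound of Theorem 9.1 can be compared with the actual Taylor error in a simple case. The sketch below assumes f = exp, for which every derivative is again exp and the norm of the (d+1)st derivative on [a, b] is exp(b); the error is measured on a finite sample of the interval, and the function name `taylor_bound_check` is hypothetical.

```python
import math
import numpy as np

def taylor_bound_check(d, a, b, samples=1001):
    """Return (sampled max |f - T_d f|, K_d h^(d+1) ||D^(d+1) f||) for f = exp."""
    m, h = 0.5 * (a + b), b - a
    x = np.linspace(a, b, samples)
    # All derivatives of exp equal exp, so D^k f(m) = exp(m) for every k.
    taylor = sum(math.exp(m) * (x - m) ** k / math.factorial(k) for k in range(d + 1))
    error = np.max(np.abs(np.exp(x) - taylor))
    K_d = 1.0 / (2 ** (d + 1) * math.factorial(d + 1))
    bound = K_d * h ** (d + 1) * math.exp(b)     # sup of exp on [a, b] is exp(b)
    return error, bound

results = [taylor_bound_check(d, 0.0, 1.0) for d in range(5)]
```

For each degree the sampled error stays below the bound, and both decrease rapidly as d grows.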
We remark that the best possible constant $K_d$ can actually be computed. In fact, for
each $f \in C^{d+1}[a, b]$ there is a point $\xi \in [a, b]$ such that
\[
\operatorname{dist}_{\infty,[a,b]}(f, \pi_d) = \frac{2}{4^{d+1}(d+1)!} h^{d+1} |D^{d+1} f(\xi)|.
\]
Applying this formula to the function $f(x) = x^{d+1}$ we see that the exponent $d + 1$ in $h^{d+1}$
is best possible.
9.2 The distance to splines
Just as we defined the distance from a function f to the space of polynomials of degree
d we can define the distance from f to a spline space. Our aim is to show that on one
knot interval, the distance from f to a spline space of degree d is essentially the same as
the distance from f to the space of polynomials of degree d on a slightly larger interval,
see Theorem 9.2 and Corollary 9.11. Our strategy is to consider the cases d = 0, 1 and 2
separately and then generalise to degree d. The main ingredient in the proof is a family
of simple approximation methods called quasi-interpolants. As well as leading to good
estimates of the distance between f and a spline space, many of the quasi-interpolants are
good, practical approximation methods.
We consider a spline space $S_{d,t}$ where d is a nonnegative integer and $t = (t_i)_{i=1}^{n+d+1}$ is a d + 1 regular knot vector. We set
$$a = t_1, \quad b = t_{n+d+1}, \quad h_j = t_{j+1} - t_j, \quad h = \max_{1 \le j \le n} h_j.$$
Given a function f we consider the distance from f to Sd,t defined by
$$\operatorname{dist}_{\infty,[a,b]}(f, S_{d,t}) = \inf_{g \in S_{d,t}} \|f - g\|_{\infty,[a,b]}.$$
We want to show the following.
Theorem 9.2. Let the polynomial degree d and the function f in $C_\Delta^{d+1}[a, b]$ be given. Then for any spline space $S_{d,t}$
$$\operatorname{dist}_{\infty,[a,b]}(f, S_{d,t}) \le K_d h^{d+1} \|D^{d+1} f\|_{\infty,[a,b]}, \tag{9.2}$$
where the constant Kd depends on d, but not on f, h or t.
We will prove this theorem by constructing a spline Pd f such that
$$|f(x) - P_d f(x)| \le K_d h^{d+1} \|D^{d+1} f\|_{\infty,[a,b]}, \quad x \in [a, b] \tag{9.3}$$
for a constant $K_d$ depending only on d. The approximation $P_d f$ will be of the form
$$P_d f = \sum_{i=1}^{n} \lambda_i(f) B_{i,d}$$
where λi is a rule for computing the ith B-spline coefficient. We will restrict ourselves to
rules $\lambda_i$ like
$$\lambda_i(f) = \sum_{k=0}^{d} w_{i,k} f(x_{i,k})$$
where the points $(x_{i,k})_{k=0}^{d}$ all lie in one knot interval and $(w_{i,k})_{k=0}^{d}$ are suitable coefficients.
These kinds of approximation methods are called quasi-interpolants.
9.2.1 The constant and linear cases
We first prove Theorem 9.2 in the low degree cases d = 0 and d = 1. For d = 0 the knots
form a partition a = t1 < · · · < tn+1 = b of [a, b] and the B-spline Bi,0 is the characteristic
function of the interval [ti , ti+1 ) for i = 1, . . . , n−1, while Bn,0 is the characteristic function
of the closed interval [tn , tn+1 ]. We consider the step function
$$g = P_0 f = \sum_{i=1}^{n} f(t_{i+1/2}) B_{i,0},$$
where $t_{i+1/2} = (t_i + t_{i+1})/2$. Fix x ∈ [a, b] and let l be an integer such that $t_l \le x < t_{l+1}$.
We then have
$$f(x) - P_0 f(x) = f(x) - f(t_{l+1/2}) = \int_{t_{l+1/2}}^{x} Df(y)\,dy$$
so
$$|f(x) - P_0 f(x)| \le |x - t_{l+1/2}| \, \|Df\|_{\infty,[t_l,t_{l+1}]} \le \frac{h}{2} \|Df\|_{\infty,[a,b]}.$$
In this way we obtain (9.2) with $K_0 = 1/2$.
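The estimate for d = 0 is easy to try out numerically. The sketch below evaluates the step function $P_0 f$ for f = sin on an arbitrarily chosen partition and compares the sampled error with $(h/2)\|Df\|$; the knot values, the function name `p0` and the sample grid are assumptions made for illustration.

```python
import math
from bisect import bisect_right

def p0(f, knots, x):
    """P_0 f(x): the value of f at the midpoint of the knot interval containing x."""
    l = min(bisect_right(knots, x) - 1, len(knots) - 2)   # last interval is closed
    return f(0.5 * (knots[l] + knots[l + 1]))

knots = [0.0, 0.3, 0.5, 1.2, 2.0]                  # a = t_1 < ... < t_{n+1} = b
h = max(q - p for p, q in zip(knots, knots[1:]))   # largest knot spacing
xs = [knots[0] + (knots[-1] - knots[0]) * j / 2000 for j in range(2001)]
err = max(abs(math.sin(x) - p0(math.sin, knots, x)) for x in xs)
bound = 0.5 * h * 1.0                              # ||D sin|| = ||cos|| <= 1
print(err, bound)
```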
In the linear case d = 1 we define P1 f to be the piecewise linear interpolant to f on t
$$g = P_1 f = \sum_{i=1}^{n} f(t_{i+1}) B_{i,1}.$$
Proposition 5.2 gives an estimate of the error in linear interpolation and by applying this
result on each interval we obtain
$$\|f - P_1 f\|_{\infty,[a,b]} \le \frac{h^2}{8} \|D^2 f\|_{\infty,[a,b]}$$
which is (9.2) with $K_1 = 1/8$.
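A similar experiment illustrates the linear case. The sketch evaluates the piecewise linear interpolant directly on each knot interval (rather than through the B-splines $B_{i,1}$) and compares the sampled error for f = sin on uniform partitions of [0, π] with $h^2/8$; the helper names and the choice of test function are assumptions for this example.

```python
import math
from bisect import bisect_right

def p1(f, knots, x):
    """Piecewise linear interpolant to f at the (distinct) knots, evaluated at x."""
    l = min(bisect_right(knots, x) - 1, len(knots) - 2)
    t0, t1 = knots[l], knots[l + 1]
    w = (x - t0) / (t1 - t0)
    return (1 - w) * f(t0) + w * f(t1)

def max_error(f, knots, samples=2001):
    """Sampled approximation to ||f - P_1 f|| over [knots[0], knots[-1]]."""
    a, b = knots[0], knots[-1]
    xs = [a + (b - a) * j / (samples - 1) for j in range(samples)]
    return max(abs(f(x) - p1(f, knots, x)) for x in xs)

for n in (4, 8, 16):
    knots = [math.pi * j / n for j in range(n + 1)]
    h = math.pi / n
    print(n, max_error(math.sin, knots), h * h / 8)   # ||D^2 sin|| <= 1
```

Halving h should reduce the printed error by a factor of roughly four, in line with the exponent 2 in the estimate.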
9.2.2
Consider next the quadratic case d = 2. We shall approximate f by the quasi-interpolant
P2 f that we constructed in Section 8.2.2. Its properties are summarised in the following
lemma.
Lemma 9.3. Suppose $t = (t_i)_{i=1}^{n+3}$ is a knot vector with $t_{i+3} > t_i$ for i = 1, . . . , n. The operator
$$P_2 f = \sum_{i=1}^{n} \lambda_i(f) B_{i,2,t}, \quad \text{with } \lambda_i(f) = -\frac{1}{2} f(t_{i+1}) + 2 f(t_{i+3/2}) - \frac{1}{2} f(t_{i+2}) \tag{9.4}$$
satisfies $P_2 p = p$ for all $p \in \pi_2$.
To show that (9.3) holds for d = 2 we now give a sequence of small lemmas.
Lemma 9.4. Let $P_2(f)$ be as in (9.4). Then
$$|\lambda_i(f)| \le 3 \|f\|_{\infty,[t_{i+1},t_{i+2}]}, \quad i = 1, \ldots, n. \tag{9.5}$$
Proof. Fix an integer i. Then
$$|\lambda_i(f)| = \Big| -\frac{1}{2} f(t_{i+1}) + 2 f(t_{i+3/2}) - \frac{1}{2} f(t_{i+2}) \Big| \le \Big( \frac{1}{2} + 2 + \frac{1}{2} \Big) \|f\|_{\infty,[t_{i+1},t_{i+2}]}$$
from which the result follows.
Lemma 9.5. For ℓ = 3, . . . , n we can bound $P_2 f$ on a subinterval $[t_\ell, t_{\ell+1}]$ by
$$\|P_2 f\|_{\infty,[t_\ell,t_{\ell+1}]} \le 3 \|f\|_{\infty,[t_{\ell-1},t_{\ell+2}]}. \tag{9.6}$$
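Proof. Fix $x \in [t_\ell, t_{\ell+1}]$. Since the B-splines are nonnegative and form a partition of unity we have
$$|P_2 f(x)| = \Big| \sum_{i=\ell-2}^{\ell} \lambda_i(f) B_{i,2,t}(x) \Big| \le \max_{\ell-2 \le i \le \ell} |\lambda_i(f)| \le 3 \max_{\ell-2 \le i \le \ell} \|f\|_{\infty,[t_{i+1},t_{i+2}]} = 3 \|f\|_{\infty,[t_{\ell-1},t_{\ell+2}]},$$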
where we used Lemma 9.4. This completes the proof.
The following lemma shows that locally, the spline P2 f approximates f essentially as
well as the best quadratic polynomial.
Lemma 9.6. For ℓ = 3, . . . , n the error $f - P_2 f$ on the interval $[t_\ell, t_{\ell+1}]$ is bounded by
$$\|f - P_2 f\|_{\infty,[t_\ell,t_{\ell+1}]} \le 4 \operatorname{dist}_{\infty,[t_{\ell-1},t_{\ell+2}]}(f, \pi_2). \tag{9.7}$$
Proof. Let p ∈ π2 be any quadratic polynomial. Since P2 p = p and P2 is a linear operator,
application of (9.6) to f − p yields
$$|f(x) - (P_2 f)(x)| = \big| f(x) - p(x) - \big( (P_2 f)(x) - p(x) \big) \big| \le |f(x) - p(x)| + |P_2(f - p)(x)| \le (1 + 3) \|f - p\|_{\infty,[t_{\ell-1},t_{\ell+2}]}. \tag{9.8}$$
Since p is arbitrary we obtain (9.7).
We can now prove (9.2) for d = 2. For any interval [a, b] Theorem 9.1 with d = 2 gives
$$\operatorname{dist}_{\infty,[a,b]}(f, \pi_2) \le K_2 h^3 \|D^3 f\|_{\infty,[a,b]},$$
where h = b − a and $K_2 = 1/(2^3\, 3!)$. Combining this estimate on $[a, b] = [t_{\ell-1}, t_{\ell+2}]$ with
(9.7) we obtain (9.3) and hence (9.2).
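The quadratic quasi-interpolant (9.4) can be tried out with a few lines of code. The sketch below is an illustration under stated assumptions: B-splines are evaluated with the standard Cox-de Boor recurrence using 0-based indices, the knot vectors are uniform and 3-regular on [0, π], and the helper names are invented for this example. It first checks that a quadratic polynomial is reproduced and then compares the sampled error for f = sin with the bound $4 K_2 (3h)^3 \|D^3 f\|$ that follows from (9.7) and Theorem 9.1, since $t_{\ell+2} - t_{\ell-1} \le 3h$.

```python
import math

def bspline(i, d, t, x):
    """Cox-de Boor recurrence for the B-spline of degree d with knots t[i..i+d+1]."""
    if d == 0:
        return 1.0 if t[i] <= x < t[i + 1] else 0.0
    value = 0.0
    if t[i + d] > t[i]:
        value += (x - t[i]) / (t[i + d] - t[i]) * bspline(i, d - 1, t, x)
    if t[i + d + 1] > t[i + 1]:
        value += ((t[i + d + 1] - x) / (t[i + d + 1] - t[i + 1])
                  * bspline(i + 1, d - 1, t, x))
    return value

def p2(f, t, x):
    """Evaluate the quasi-interpolant (9.4) at x; the index i is 0-based here."""
    total = 0.0
    for i in range(len(t) - 3):
        a, b = t[i + 1], t[i + 2]
        lam = -0.5 * f(a) + 2.0 * f(0.5 * (a + b)) - 0.5 * f(b)
        total += lam * bspline(i, 2, t, x)
    return total

def uniform_knots(m):
    """3-regular knot vector on [0, pi] with 2**m equal knot intervals."""
    k = 2 ** m
    return [0.0] * 2 + [math.pi * j / k for j in range(k + 1)] + [math.pi] * 2

def quad(x):
    return 3 * x * x - 2 * x + 1

xs = [math.pi * j / 500 for j in range(500)]       # right end point excluded
print(max(abs(quad(x) - p2(quad, uniform_knots(2), x)) for x in xs))
for m in range(1, 5):
    h = math.pi / 2 ** m
    err = max(abs(math.sin(x) - p2(math.sin, uniform_knots(m), x)) for x in xs)
    print(m, err, 4 * (3 * h) ** 3 / 48)           # 4 K_2 (3h)^3 ||D^3 sin||
```

The right end point is left out of the sample grid because this simple evaluator treats every knot interval as half-open.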
9.2.3 The general case
The general case is analogous to the quadratic case, but the details are more complicated.
Recall that for d = 2 we picked three points xi,k = ti+1 + k(ti+2 − ti+1 )/2 for k = 0, 1, 2
in each subinterval [ti+1 , ti+2 ] and then chose constants wi,k for k = 0, 1, 2 such that the
operator
$$P_2 f = \sum_{i=1}^{n} \lambda_i(f) B_{i,2,t}, \quad \text{with } \lambda_i(f) = w_{i,0} f(x_{i,0}) + w_{i,1} f(x_{i,1}) + w_{i,2} f(x_{i,2}),$$
reproduced quadratic polynomials. We will follow the same strategy for general degree.
The resulting quasi-interpolant is a special case of the one given in Theorem 8.7.
Suppose that d ≥ 2 and fix an integer i such that ti+d > ti+1 . We pick the largest
subinterval [ai , bi ] = [tl , tl+1 ] of [ti+1 , ti+d ] and define the uniformly spaced points
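$$x_{i,k} = a_i + \frac{k}{d}(b_i - a_i), \quad \text{for } k = 0, 1, \ldots, d \tag{9.9}$$
in this interval. Given $f \in C_\Delta[a, b]$ we define $P_d f \in S_{d,t}$ by
$$P_d f(x) = \sum_{i=1}^{n} \lambda_i(f) B_{i,d}(x), \quad \text{where } \lambda_i(f) = \sum_{k=0}^{d} w_{i,k} f(x_{i,k}). \tag{9.10}$$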
The following lemma shows how the coefficients (wi,k )dk=0 should be chosen so that Pd p = p
for all p ∈ πd .
Lemma 9.7. Suppose that in (9.10) the functionals λi are given by λi (f ) = f (ti+1 ) if
ti+d = ti+1 , while if ti+d > ti+1 we set
$$w_{i,k} = \gamma_i(p_{i,k}), \quad k = 0, 1, \ldots, d, \tag{9.11}$$
where $\gamma_i(p_{i,k})$ is the ith B-spline coefficient of the polynomial
$$p_{i,k}(x) = \prod_{\substack{j=0 \\ j \ne k}}^{d} \frac{x - x_{i,j}}{x_{i,k} - x_{i,j}}. \tag{9.12}$$
Then the operator Pd in (9.10) satisfies Pd p = p for all p ∈ πd .
Proof. Suppose first that ti+d > ti+1 . Any p ∈ πd can be written in the form
$$p(x) = \sum_{k=0}^{d} p(x_{i,k})\, p_{i,k}(x). \tag{9.13}$$
For if we denote the function on the right by q(x) then q(xi,k ) = p(xi,k ) for k = 0, 1, . . . ,
d, and since q ∈ πd it follows by the uniqueness of the interpolating polynomial that p = q.
Now, by linearity of γi we have
$$\lambda_i(p) = \sum_{k=0}^{d} w_{i,k}\, p(x_{i,k}) = \sum_{k=0}^{d} \gamma_i(p_{i,k})\, p(x_{i,k}) = \gamma_i\Big( \sum_{k=0}^{d} p_{i,k}\, p(x_{i,k}) \Big) = \gamma_i(p).$$
If $t_{i+1} = t_{i+d}$ we know that a spline of degree d with knots t agrees with its ith coefficient at $t_{i+1}$. In particular, for any polynomial p we have $\lambda_i(p) = p(t_{i+1}) = \gamma_i(p)$. Altogether this means that
$$P_d(p) = \sum_{i=1}^{n} \lambda_i(p) B_{i,d}(x) = \sum_{i=1}^{n} \gamma_i(p) B_{i,d}(x) = p$$
which confirms the lemma.
The B-spline coefficients of pi,k can be found from the following lemma.
Lemma 9.8. Let a spline space $S_{d,t}$ and numbers $v_1, \ldots, v_d$ be given. The ith B-spline coefficient of the polynomial $p(x) = (x - v_1) \cdots (x - v_d)$ can be written
$$\gamma_i(p) = \frac{1}{d!} \sum_{(j_1,\ldots,j_d) \in \Pi_d} (t_{i+j_1} - v_1) \cdots (t_{i+j_d} - v_d), \tag{9.14}$$
where Πd is the set of all permutations of the integers 1, 2, . . . , d.
Proof. By Theorem 4.16 we have
γi (p) = B[p](ti+1 , . . . , ti+d ),
where B[p] is the blossom of p. It therefore suffices to verify that the expression (9.14) for
γi (p) satisfies the three properties of the blossom, but this is immediate.
As an example, for d = 2 the set of all permutations of 1, 2 is $\Pi_2 = \{(1, 2), (2, 1)\}$ and therefore
$$\gamma_i\big( (x - v_1)(x - v_2) \big) = \frac{1}{2} \big( (t_{i+1} - v_1)(t_{i+2} - v_2) + (t_{i+2} - v_1)(t_{i+1} - v_2) \big).$$
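Formula (9.14) translates directly into code. The following sketch sums over all permutations of the interior knots; the function name and the calling convention (passing the d knots $t_{i+1}, \ldots, t_{i+d}$ and the d numbers $v_1, \ldots, v_d$ as tuples) are assumptions made for this illustration. Since the cost grows like d!, it is only sensible for small degrees.

```python
from itertools import permutations
from math import factorial

def bspline_coefficient(knots, v):
    """Formula (9.14): knots = (t_{i+1}, ..., t_{i+d}), v = (v_1, ..., v_d)."""
    d = len(v)
    total = 0.0
    for perm in permutations(knots):        # all orderings (t_{i+j_1}, ..., t_{i+j_d})
        prod = 1.0
        for tj, vr in zip(perm, v):
            prod *= tj - vr
        total += prod
    return total / factorial(d)

# p(x) = x^2 (v_1 = v_2 = 0) with interior knots 1 and 3: the coefficient is 1 * 3.
print(bspline_coefficient((1.0, 3.0), (0.0, 0.0)))
# p(x) = (x - 1)(x - 2) with the same knots.
print(bspline_coefficient((1.0, 3.0), (1.0, 2.0)))
```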
We can now give a bound for λi (f ).
Theorem 9.9. Let $P_d(f) = \sum_{i=1}^{n} \lambda_i(f) B_{i,d}$ be the operator in Lemma 9.7. Then
$$|\lambda_i(f)| \le K_d \|f\|_{\infty,[t_{i+1},t_{i+d}]}, \quad i = 1, \ldots, n, \tag{9.15}$$
where
$$K_d = \frac{2^d}{d!} [d(d-1)]^d \tag{9.16}$$
depends only on d.
Proof. Fix an integer i. From Lemma 9.8 we have
$$w_{i,k} = \sum_{(j_1,\ldots,j_d) \in \Pi_d} \prod_{r=1}^{d} \frac{t_{i+j_r} - v_r}{x_{i,k} - v_r} \Big/ d!, \tag{9.17}$$
where $(v_r)_{r=1}^{d} = (x_{i,0}, \ldots, x_{i,k-1}, x_{i,k+1}, \ldots, x_{i,d})$ and $\Pi_d$ denotes the set of all permutations of the integers 1, 2, . . . , d. Since the numbers $t_{i+j_r}$ and $v_r$ belong to the interval $[t_{i+1}, t_{i+d}]$ for all r we have the inequality
$$\prod_{r=1}^{d} (t_{i+j_r} - v_r) \le (t_{i+d} - t_{i+1})^d.$$
We also note that $x_{i,k} - v_r = (k - q)(b_i - a_i)/d$ for some q in the range 0 ≤ q ≤ d but with q ≠ k. Taking the product over all r we therefore obtain
$$\prod_{r=1}^{d} |x_{i,k} - v_r| = \prod_{\substack{q=0 \\ q \ne k}}^{d} \frac{|k - q|}{d} (b_i - a_i) \ge k!(d-k)! \Big( \frac{b_i - a_i}{d} \Big)^d \ge k!(d-k)! \Big( \frac{t_{i+d} - t_{i+1}}{d(d-1)} \Big)^d$$
for all values of k and r since [ai , bi ] is the largest subinterval of [ti+1 , ti+d ]. Since the sum
in (9.17) contains d! terms, we find
$$\sum_{k=0}^{d} |w_{i,k}| \le \frac{[d(d-1)]^d}{d!} \sum_{k=0}^{d} \binom{d}{k} = \frac{2^d}{d!} [d(d-1)]^d = K_d$$
and hence
$$|\lambda_i(f)| \le \|f\|_{\infty,[t_{i+1},t_{i+d}]} \sum_{k=0}^{d} |w_{i,k}| \le K_d \|f\|_{\infty,[t_{i+1},t_{i+d}]} \tag{9.18}$$
which is the required inequality.
From the bound for λi (f ) we easily obtain a bound for the norm of Pd f .
Theorem 9.10. For d + 1 ≤ l ≤ n and f ∈ C∆ [a, b] we have the bound
$$\|P_d f\|_{\infty,[t_l,t_{l+1}]} \le K_d \|f\|_{\infty,[t_{l-d+1},t_{l+d}]}, \tag{9.19}$$
where Kd is the constant in Theorem 9.9.
Proof. Fix x ∈ [tl , tl+1 ]. Since the B-splines are nonnegative and form a partition of unity
we have by Theorem 9.9
$$|P_d f(x)| = \Big| \sum_{i=l-d}^{l} \lambda_i(f) B_{i,d,t}(x) \Big| \le \max_{l-d \le i \le l} |\lambda_i(f)| \le K_d \max_{l-d \le i \le l} \|f\|_{\infty,[t_{i+1},t_{i+d}]} = K_d \|f\|_{\infty,[t_{l-d+1},t_{l+d}]}.$$
This completes the proof.
The following corollary shows that Pd f locally approximates f essentially as well as
the best polynomial approximation of f of degree d.
Corollary 9.11. For l = d + 1, . . . , n the error f − Pd f on the interval [tl , tl+1 ] is bounded
by
$$\|f - P_d f\|_{\infty,[t_l,t_{l+1}]} \le (1 + K_d) \operatorname{dist}_{\infty,[t_{l-d+1},t_{l+d}]}(f, \pi_d), \tag{9.20}$$
where $K_d$ is the constant in Theorem 9.9.
Proof. We argue exactly as in the quadratic case. Let p ∈ πd be any polynomial in πd .
Since Pd p = p and Pd is a linear operator we therefore have
$$|f(x) - (P_d f)(x)| = \big| f(x) - p(x) - \big( (P_d f)(x) - p(x) \big) \big| \le |f(x) - p(x)| + |P_d(f - p)(x)| \le (1 + K_d) \|f - p\|_{\infty,[t_{l-d+1},t_{l+d}]}.$$
Since p is arbitrary we obtain (9.20).
We can now prove (9.2) for general d. By Theorem 9.1 we have for any interval [a, b]
$$\operatorname{dist}_{\infty,[a,b]}(f, \pi_d) \le K_d h^{d+1} \|D^{d+1} f\|_{\infty,[a,b]},$$
where h = b−a and Kd only depends on d. Combining this estimate on [a, b] = [tl−d+1 , tl+d ]
with (9.20) we obtain (9.3) and hence (9.2).
9.3 Stability of the B-spline basis
In order to compute with polynomials or splines we need to choose a basis to represent
the functions. If a basis is to be suitable for computer manipulations then it should be
reasonably insensitive to round-off errors. In particular, functions with ‘small’ function
values should have ‘small’ coefficients and vice versa. A basis with this property is said
to be well conditioned or stable. In this section we will study the relationship between a
spline and its coefficients quantitatively by introducing the condition number of a basis.
We have already seen that the size of a spline is bounded by its B-spline coefficients.
There is also a reverse inequality, i.e., a bound on the B-spline coefficients in terms of the
size of f . There are several reasons why such inequalities are important. In Section 4.1
we made use of this fact to estimate how fast the control polygon converges to the spline
as more and more knots are inserted. A more direct consequence is that small relative
perturbations in the coefficients can only lead to small changes in the function values.
Both properties reflect the fact that the B-spline basis is well conditioned.
9.3.1 A general definition of stability
The stability of a basis can be defined quite generally. Instead of considering polynomials,
we can consider a general linear vector space where we can measure the size of the elements
through a norm; this is called a normed linear space.
Definition 9.12. Let U be a normed linear space. A basis $(\phi_j)$ for U is said to be stable with respect to a vector norm || · || if there are small positive constants $C_1$ and $C_2$ such that
$$C_1^{-1} \|(c_j)\| \le \Big\| \sum_j c_j \phi_j \Big\| \le C_2 \|(c_j)\|, \tag{9.21}$$
for all sets of coefficients $c = (c_j)$. Let $C_1^*$ and $C_2^*$ denote the smallest possible values of $C_1$ and $C_2$ such that (9.21) holds. The condition number of the basis is then defined to be $\kappa = \kappa((\phi_i)_i) = C_1^* C_2^*$.
At the risk of confusion, we have used the same symbol both for the norm in U and
the vector norm of the coefficients. In our case U will of course be some spline space Sd,t
and the basis (φj ) will be the B-spline basis. The norms we will consider are the p-norms
which are defined by
$$\|f\|_p = \|f\|_{p,[a,b]} = \Big( \int_a^b |f(x)|^p\,dx \Big)^{1/p}, \quad \text{and} \quad \|c\|_p = \Big( \sum_j |c_j|^p \Big)^{1/p},$$
where f is a function on the interval [a, b], $c = (c_j)$ is a real vector, and p is a real number in the range 1 ≤ p < ∞. For p = ∞ the norms are defined by
$$\|f\|_\infty = \|f\|_{\infty,[a,b]} = \max_{a \le x \le b} |f(x)|, \quad \text{and} \quad \|c\|_\infty = \|(c_j)\|_\infty = \max_j |c_j|.$$
In practice, the most important norms are the 1-, 2- and ∞-norms.
In Definition 9.12 we require the constants C1 and C2 to be ‘small’, but how small is
‘small’ ? There is no unique answer to this question, but it is typically required that C1
and C2 should be independent of the dimension n of U, or at least grow very slowly with
n. Note that we always have κ ≥ 1, and κ = 1 if and only if we have equality in both
inequalities in (9.21).
A stable basis is desirable for many reasons, and the constant κ = C1 C2 crops up in
many different contexts. The condition number κ does in fact act as a sort of derivative
of the basis and gives a measure of how much an error in the coefficients is magnified in a
function value.
Proposition 9.13. Suppose $(\phi_j)$ is a stable basis for U. If $f = \sum_j c_j \phi_j$ and $g = \sum_j b_j \phi_j$ are two elements in U with f ≠ 0, then
$$\frac{\|f - g\|}{\|f\|} \le \kappa \frac{\|c - b\|}{\|c\|}, \tag{9.22}$$
where κ is the condition number of the basis as in Definition 9.12.
Proof. From (9.21), we have the two inequalities ||f − g|| ≤ C2 ||(cj − bj )|| and 1/||f || ≤
C1 /||(cj )||. Multiplying these together gives the result.
If we think of g as an approximation to f , then (9.22) says that the relative error in
g is at most κ times the relative error in the coefficients. If κ is small,
then a small relative error in the coefficients gives a small relative error in the function
values. This is important in floating point calculations on a computer. A function is
usually represented by its coefficients relative to some basis. Normally, the coefficients are
real numbers that must be represented inexactly as floating point numbers in a computer.
This round-off error means that the computed spline, here g, will differ from the exact
f . Proposition 9.13 shows that this is not so serious if the perturbed coefficients of g are
close to those of f and the basis is stable.
Proposition 9.13 also provides some information as to what are acceptable values of $C_1^*$ and $C_2^*$. If for example $\kappa = C_1^* C_2^* = 100$ we risk losing 2 decimal places in evaluation of a function; exactly how much accuracy one can afford to lose will of course vary.
One may wonder whether there are any unstable polynomial bases. It turns out that
the power basis $1, x, x^2, \ldots$ on the interval [0, 1] is unstable even for quite low degrees.
Already for degree 10, one risks losing as much as 4 or 5 decimal digits in the process of
computing the value of a polynomial on the interval [0, 1] relative to this basis, and other
operations such as numerical root finding are even more sensitive.
9.3.2 The condition number of the B-spline basis. Infinity norm
Since splines and B-splines are defined via the knot vector, it is quite conceivable that
the condition number of the B-spline basis could become arbitrarily large for certain knot
configurations, for example in the limit when two knots merge into one. We will now prove
that the condition number of the B-spline basis can be bounded independently of the knot
vector so it cannot grow beyond all bounds when the knots vary.
The best constant C2∗ in Definition 9.12 can be found quite easily for the B-spline basis.
Lemma 9.14. In all spline spaces $S_{d,t}$ the bound
$$\Big\| \sum_{i=1}^{m} b_i B_{i,d} \Big\|_{\infty,[t_1,t_{m+1+d}]} \le \|b\|_\infty$$
holds. Equality holds if $b_i = 1$ for all i and the knot vector $t = (t_i)_{i=0}^{n+d}$ is d + 1-extended; in this case $C_2^* = 1$.
Proof. This follows since the B-splines are nonnegative and sum to one.
To find a bound for the constant C1 we shall use the operator Pd given by (9.10). We
recall that Pd reproduces polynomials of degree d, i.e., Pd p = p for all p ∈ πd . We now
show that more is true; we have in fact that Pd reproduces all splines in Sd,t .
Theorem 9.15. The operator
$$P_d f = \sum_{i=1}^{n} \lambda_i(f) B_{i,d}$$
given by (9.10) reproduces all splines in $S_{d,t}$, i.e., $P_d f = f$ for all $f \in S_{d,t}$.
Proof. We first show that
$$\lambda_j(B_{k,d}) = \delta_{j,k}, \quad \text{for } j, k = 1, \ldots, n. \tag{9.23}$$
Fix i and let
$$I_i = [a_i, b_i] = [t_{l_i}, t_{l_i+1}]$$
be the interval used to define $\lambda_i(f)$. We consider the polynomials
$$\phi_k = B_{k,d}|_{I_i} \quad \text{for } l_i - d \le k \le l_i$$
obtained by restricting the B-splines $\{B_{k,d}\}_{k=l_i-d}^{l_i}$ to the interval $I_i$. Since $P_d$ reproduces $\pi_d$ we have
$$\phi_k(x) = (P_d \phi_k)(x) = \sum_{j=l_i-d}^{l_i} \lambda_j(\phi_k) \phi_j(x)$$
for x in the interval $I_i$. By the linear independence of the polynomials $(\phi_k)$ we therefore obtain
$$\lambda_j(B_{k,d}) = \lambda_j(\phi_k) = \delta_{j,k}, \quad \text{for } j, k = l_i - d, \ldots, l_i.$$
In particular we have λi Bi,d = 1 since li − d ≤ i ≤ li . For k < li − d or k > li the support
of Bk,d has empty intersection with Ii so λi (Bk,d ) = 0 for these values of k. Thus (9.23)
holds for all k.
To complete the proof suppose $f = \sum_{k=1}^{n} c_k B_{k,d}$ is a spline in $S_{d,t}$. From (9.23) we then have
$$P_d f = \sum_{j=1}^{n} \sum_{k=1}^{n} c_k \lambda_j(B_{k,d}) B_{j,d} = \sum_{j=1}^{n} c_j B_{j,d} = f.$$
To obtain an upper bound for C1∗ we note that the leftmost inequality in (9.21) is
equivalent to
|bi | ≤ C1 ||f ||, i = 1, . . . , m.
Lemma 9.16. There is a constant $K_d$, depending only on the polynomial degree d, such that for all splines $f = \sum_{i=1}^{m} b_i B_{i,d}$ in some given spline space $S_{d,t}$ the inequality
$$|b_i| \le K_d \|f\|_{[t_{i+1},t_{i+d}]} \tag{9.24}$$
holds for all i.
Proof. Consider the operator Pd given in Lemma 9.7. Since Pd f = f we have bi = λi (f ).
The result now follows from (9.15).
Note that if [a, b] ⊆ [c, d], then $\|f\|_{\infty,[a,b]} \le \|f\|_{\infty,[c,d]}$. From (9.24) we therefore
conclude that $|b_i| \le K_d \|f\|_{\infty,[t_1,t_{m+1+d}]}$ for all i or briefly $\|b\| \le K_d \|f\|$. The constant $K_d$
can therefore be used as C1 in Definition 9.12 in the case where the norm is the ∞-norm.
Combining the two lemmas we obtain the following theorem.
Theorem 9.17. There is a constant $K_1$, depending only on the polynomial degree d, such that for all spline spaces $S_{d,t}$ and all splines $f = \sum_{i=1}^{m} b_i B_{i,d} \in S_{d,t}$ with B-spline coefficients $b = (b_i)_{i=1}^{m}$ the inequalities
$$K_1^{-1} \|b\|_\infty \le \|f\|_{\infty,[t_1,t_{m+d}]} \le \|b\|_\infty \tag{9.25}$$
hold.
The condition number of the B-spline basis on the knot vector t with respect to the ∞-norm is usually denoted $\kappa_{d,\infty,t}$. By taking the supremum over all knot vectors we obtain
the knot independent condition number $\kappa_{d,\infty}$,
$$\kappa_{d,\infty} = \sup_t \kappa_{d,\infty,t}.$$
Theorem 9.17 shows that κd,∞ is bounded above by K1 .
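The two inequalities in (9.25) can be observed on a concrete quadratic spline. In the sketch below the constant 3 from Lemma 9.4 plays the role of $K_1$ for d = 2; the knot vector, the coefficients, the Cox-de Boor evaluator with 0-based indices and the sampling points are all assumptions chosen for illustration. The sample contains the knots and the midpoints of the knot intervals, which are exactly the points used by the functionals in (9.4).

```python
def bspline(i, d, t, x):
    """Cox-de Boor recurrence (0-based i); the last knot interval is treated as closed."""
    if d == 0:
        inside = t[i] <= x < t[i + 1]
        at_right_end = t[i] < t[i + 1] == x == t[-1]
        return 1.0 if inside or at_right_end else 0.0
    value = 0.0
    if t[i + d] > t[i]:
        value += (x - t[i]) / (t[i + d] - t[i]) * bspline(i, d - 1, t, x)
    if t[i + d + 1] > t[i + 1]:
        value += ((t[i + d + 1] - x) / (t[i + d + 1] - t[i + 1])
                  * bspline(i + 1, d - 1, t, x))
    return value

def spline(b, t, x, d=2):
    """Evaluate sum_i b[i] * B_{i,d}(x)."""
    return sum(bi * bspline(i, d, t, x) for i, bi in enumerate(b))

t = [0.0, 0.0, 0.0, 1.0, 1.5, 4.0, 4.5, 6.0, 6.0, 6.0]   # 3-regular, n = 7
b = [1.0, -2.0, 0.5, 3.0, -1.0, 2.5, -0.5]

# Sample at the knots, at the midpoints of the knot intervals and on a fine grid.
knots = sorted(set(t))
pts = knots + [0.5 * (p + q) for p, q in zip(knots, knots[1:])]
pts += [6.0 * j / 600 for j in range(601)]
sup_f = max(abs(spline(b, t, x)) for x in pts)
max_b = max(abs(v) for v in b)
print(max_b / 3, sup_f, max_b)      # expected ordering: first <= second <= third
```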
The estimate Kd for C1∗ given by (9.16) is a number which grows quite rapidly with
d and does not indicate that the B-spline basis is stable. However, it is possible to find
better estimates for the condition number, and it is known that the B-spline basis is very
stable, at least for moderate values of d. To determine the condition number is relatively
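simple for d ≤ 2; we have $\kappa_{0,\infty} = \kappa_{1,\infty} = 1$ and $\kappa_{2,\infty} = 3$. For d ≥ 3 it has recently been
shown that $\kappa_{d,\infty} = O(2^d)$. The first few values are known numerically to be $\kappa_{3,\infty} \approx 5.5680$
and $\kappa_{4,\infty} \approx 12.088$.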
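To see how pessimistic the estimate (9.16) is, one can tabulate $K_d$ next to the condition numbers quoted above. The short sketch below does this for d = 2, 3, 4; the function name `K` is an assumption, and the reference values are simply the ones stated in the text.

```python
from math import factorial

def K(d):
    """The constant K_d = 2^d / d! * [d(d-1)]^d from (9.16)."""
    return 2 ** d * (d * (d - 1)) ** d / factorial(d)

quoted = {2: 3.0, 3: 5.5680, 4: 12.088}    # condition numbers quoted in the text
for d, kappa in quoted.items():
    print(d, K(d), kappa)
```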
9.3.3 The condition number of the B-spline basis. p-norm
With 1 ≤ p ≤ ∞ and q such that 1/p+1/q = 1 we recall the Hölder inequality for functions
$$\int_a^b |f(x) g(x)|\,dx \le \|f\|_p \|g\|_q,$$
and the Hölder inequality for sums
$$\sum_{i=1}^{m} |b_i c_i| \le \|(b_i)_{i=1}^{m}\|_p \, \|(c_i)_{i=1}^{m}\|_q.$$
We also note that for any polynomial g ∈ πd and any interval [a, b] we have
$$|g(x)| \le \frac{C}{b - a} \int_a^b |g(x)|\,dx, \quad x \in [a, b], \tag{9.26}$$
where the constant C only depends on the degree d. This follows on [a, b] = [0, 1] since
all norms on a finite dimensional vector space are equivalent, and then on an arbitrary
interval [a, b] by a change of variable.
In order to generalise the stability result (9.25) to arbitrary p-norms we need to scale
the B-splines differently. We define the p-norm B-splines to be identically zero if $t_{i+d+1} = t_i$ and
$$B^p_{i,d,t} = \Big( \frac{d+1}{t_{i+d+1} - t_i} \Big)^{1/p} B_{i,d,t}, \tag{9.27}$$
otherwise.
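The scaling factor in (9.27) is tied to the integral of a B-spline, which equals $(t_{i+d+1} - t_i)/(d+1)$; in particular the 1-norm B-splines integrate to one. The sketch below checks this numerically with a composite midpoint rule. The evaluator (Cox-de Boor recurrence with 0-based indices), the knot vector and the number of quadrature cells are assumptions for this illustration.

```python
def bspline(i, d, t, x):
    """Cox-de Boor recurrence for the B-spline of degree d with knots t[i..i+d+1]."""
    if d == 0:
        return 1.0 if t[i] <= x < t[i + 1] else 0.0
    value = 0.0
    if t[i + d] > t[i]:
        value += (x - t[i]) / (t[i + d] - t[i]) * bspline(i, d - 1, t, x)
    if t[i + d + 1] > t[i + 1]:
        value += ((t[i + d + 1] - x) / (t[i + d + 1] - t[i + 1])
                  * bspline(i + 1, d - 1, t, x))
    return value

def integral(i, d, t, cells=4000):
    """Composite midpoint rule for the integral of B_{i,d} over its support."""
    a, b = t[i], t[i + d + 1]
    w = (b - a) / cells
    return w * sum(bspline(i, d, t, a + (j + 0.5) * w) for j in range(cells))

t = [0.0, 0.0, 0.0, 1.0, 2.0, 4.0, 5.0, 5.0, 5.0]
d = 2
for i in range(len(t) - d - 1):
    gamma = (d + 1) / (t[i + d + 1] - t[i])      # scaling factor for p = 1
    print(i, integral(i, d, t), 1 / gamma)
```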
Theorem 9.18. There is a constant K, depending only on the polynomial degree d, such that for all 1 ≤ p ≤ ∞, all spline spaces $S_{d,t}$ and all splines $f = \sum_{i=1}^{m} b_i B^p_{i,d} \in S_{d,t}$ with p-norm B-spline coefficients $b = (b_i)_{i=1}^{m}$ the inequalities
$$K^{-1} \|b\|_p \le \|f\|_{p,[t_1,t_{m+d}]} \le \|b\|_p \tag{9.28}$$
hold.
Proof. We first prove the upper inequality. Let γi = (d + 1)/(ti+d+1 − ti ) for i = 1, . . . , m
and set [a, b] = [t1 , tm+d+1 ]. Using the Hölder inequality for sums we have
$$\sum_i |b_i B^p_{i,d}| = \sum_i |b_i \gamma_i^{1/p} B_{i,d}^{1/p}| \, B_{i,d}^{1/q} \le \Big( \sum_i |b_i|^p \gamma_i B_{i,d} \Big)^{1/p} \Big( \sum_i B_{i,d} \Big)^{1/q}.$$
Raising this to the pth power and using the partition of unity property we obtain the
inequality
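$$\Big| \sum_i b_i B^p_{i,d}(x) \Big|^p \le \sum_i |b_i|^p \gamma_i B_{i,d}(x), \quad x \in \mathbb{R}.$$
Therefore, recalling that $\int B_{i,d}(x)\,dx = 1/\gamma_i$ we find
$$\|f\|^p_{p,[a,b]} = \int_a^b \Big| \sum_i b_i B^p_{i,d}(x) \Big|^p dx \le \sum_i |b_i|^p \gamma_i \int_a^b B_{i,d}(x)\,dx = \sum_i |b_i|^p.$$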
Taking pth roots proves the upper inequality.
Consider now the lower inequality. Recall from (9.24) that we can bound the B-spline
coefficients in terms of the infinity norm of the function. In terms of the coefficients bi of
the p-norm B-splines we obtain from (9.24) for all i
$$\Big( \frac{d+1}{t_{i+d+1} - t_i} \Big)^{1/p} |b_i| \le K_1 \max_{t_{i+1} \le x \le t_{i+d}} |f(x)|,$$
where the constant K1 only depends on d. Taking max over a larger subinterval, using
(9.26), and then Hölder for integrals we find
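$$\begin{aligned}
|b_i| &\le K_1 (d+1)^{-1/p} (t_{i+d+1} - t_i)^{1/p} \max_{t_i \le x \le t_{i+d+1}} |f(x)| \\
&\le C K_1 (d+1)^{-1/p} (t_{i+d+1} - t_i)^{-1+1/p} \int_{t_i}^{t_{i+d+1}} |f(y)|\,dy \\
&\le C K_1 (d+1)^{-1/p} \Big( \int_{t_i}^{t_{i+d+1}} |f(y)|^p\,dy \Big)^{1/p}.
\end{aligned}$$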
Raising both sides to the pth power and summing over i we obtain
$$\sum_i |b_i|^p \le C^p K_1^p (d+1)^{-1} \sum_i \int_{t_i}^{t_{i+d+1}} |f(y)|^p\,dy \le C^p K_1^p \|f\|^p_{p,[a,b]}.$$
Taking pth roots we obtain the lower inequality in (9.28) with K = CK1 .
Exercises for Chapter 9
9.1 In this exercise we will study the order of approximation by the Schoenberg Variation
Diminishing Spline Approximation of degree d ≥ 2. This approximation is given by
$$V_d f = \sum_{i=1}^{n} f(t_i^*) B_{i,d}, \quad \text{with } t_i^* = \frac{t_{i+1} + \cdots + t_{i+d}}{d}.$$
Here $B_{i,d}$ is the ith B-spline of degree d on a d + 1-regular knot vector $t = (t_i)_{i=1}^{n+d+1}$. We assume that $t_{i+d} > t_i$ for i = 2, . . . , n. Moreover we define the quantities
$$a = t_1, \quad b = t_{n+d+1}, \quad h = \max_{1 \le i \le n} (t_{i+1} - t_i).$$
We want to show that $V_d f$ is an $O(h^2)$ approximation to a sufficiently smooth f.
We first consider the more general spline approximation
$$\tilde{V}_d f = \sum_{i=1}^{n} \lambda_i(f) B_{i,d}, \quad \text{with } \lambda_i(f) = w_{i,0} f(x_{i,0}) + w_{i,1} f(x_{i,1}).$$
Here xi,0 and xi,1 are two distinct points in [ti , ti+d ] and wi,0 , wi,1 are constants,
i = 1, . . . , n.
Before attempting to solve this exercise the reader might find it helpful to review
Section 9.2.2.
a) Suppose for i = 1, . . . , n that $w_{i,0}$ and $w_{i,1}$ are such that
$$w_{i,0} + w_{i,1} = 1, \qquad x_{i,0} w_{i,0} + x_{i,1} w_{i,1} = t_i^*.$$
Show that then Ṽd p = p for all p ∈ π1 . (Hint: Consider the polynomials
p(x) = 1 and p(x) = x.)
b) Show that if we set xi,0 = t∗i for all i then Ṽd f = Vd f for all f , regardless of
how we choose the value of xi,1 .
In the rest of this exercise we set λi (f ) = f (t∗i ) for i = 1, . . . , n, i.e. we consider
Vd f . We define the usual uniform norm on an interval [c, d] by
$$\|f\|_{[c,d]} = \sup_{c \le x \le d} |f(x)|, \quad f \in C_\Delta[c, d].$$
c) Show that for d + 1 ≤ l ≤ n
$$\|V_d f\|_{[t_l,t_{l+1}]} \le \|f\|_{[t^*_{l-d},t^*_l]}, \quad f \in C_\Delta[a, b].$$
d) Show that for $f \in C_\Delta[t^*_{l-d}, t^*_l]$ and d + 1 ≤ l ≤ n
$$\|f - V_d f\|_{[t_l,t_{l+1}]} \le 2 \operatorname{dist}_{[t^*_{l-d},t^*_l]}(f, \pi_1).$$
e) Explain why the following holds for d + 1 ≤ l ≤ n
$$\operatorname{dist}_{[t^*_{l-d},t^*_l]}(f, \pi_1) \le \frac{(t^*_l - t^*_{l-d})^2}{8} \|D^2 f\|_{[t^*_{l-d},t^*_l]}.$$
f) Show that the following $O(h^2)$ estimate holds
$$\|f - V_d f\|_{[a,b]} \le \frac{d^2}{4} h^2 \|D^2 f\|_{[a,b]}.$$
(Hint: Verify that $t^*_l - t^*_{l-d} \le hd$.)
9.2 In this exercise we want to perform a numerical simulation experiment to determine
the order of approximation by the quadratic spline approximations
$$V_2 f = \sum_{i=1}^{n} f(t_i^*) B_{i,2}, \quad \text{with } t_i^* = \frac{t_{i+1} + t_{i+2}}{2},$$
$$P_2 f = \sum_{i=1}^{n} \Big( -\frac{1}{2} f(t_{i+1}) + 2 f(t_i^*) - \frac{1}{2} f(t_{i+2}) \Big) B_{i,2}.$$
We want to test the hypotheses $f - V_2 f = O(h^2)$ and $f - P_2 f = O(h^3)$ where $h = \max_i (t_{i+1} - t_i)$. We test these on the function f(x) = sin x on [0, π] for various values of h. Consider for m ≥ 0 and $n_m = 2 + 2^m$ the 3-regular knot vector $t^m = (t_i^m)_{i=1}^{n_m+3}$ on the interval [0, π] with uniform spacing $h_m = \pi 2^{-m}$. We define
$$V_2^m f = \sum_{i=1}^{n} f(t^m_{i+3/2}) B^m_{i,2}, \quad \text{with } t^m_{i+3/2} = \frac{t^m_{i+1} + t^m_{i+2}}{2},$$
$$P_2^m f = \sum_{i=1}^{n} \Big( -\frac{1}{2} f(t^m_{i+1}) + 2 f(t^m_{i+3/2}) - \frac{1}{2} f(t^m_{i+2}) \Big) B^m_{i,2},$$
and $B^m_{i,2}$ is the ith quadratic B-spline on $t^m$. As approximations to the norms $\|f - V_2^m f\|_{[0,\pi]}$ and $\|f - P_2^m f\|_{[0,\pi]}$ we use
$$E_V^m = \max_{0 \le j \le 100} |f(j\pi/100) - V_2^m f(j\pi/100)|,$$
$$E_P^m = \max_{0 \le j \le 100} |f(j\pi/100) - P_2^m f(j\pi/100)|.$$
Write a computer program to compute numerically the values of $E_V^m$ and $E_P^m$ for m = 0, 1, 2, 3, 4, 5, and the ratios $E_V^m / E_V^{m-1}$ and $E_P^m / E_P^{m-1}$ for 1 ≤ m ≤ 5. What can you deduce about the approximation order of the two methods?
Make plots of $V_2^m f$, $P_2^m f$, $f - V_2^m f$, and $f - P_2^m f$ for some values of m.
9.3 Suppose we have m ≥ 3 data points $\big(x_i, f(x_i)\big)_{i=1}^{m}$ sampled from a function f, where the abscissas $x = (x_i)_{i=1}^{m}$ satisfy $x_1 < \cdots < x_m$. In this exercise we want to derive a local quasi-interpolation scheme which only uses the data values at the $x_i$'s and which has $O(h^3)$ order of accuracy if the y-values are sampled from a smooth function f. The method requires m to be odd.
From x we form a 3-regular knot vector by using every second data point as a knot
$$t = (t_j)_{j=1}^{n+3} = (x_1, x_1, x_1, x_3, x_5, \ldots, x_{m-2}, x_m, x_m, x_m), \tag{9.29}$$
where n = (m + 3)/2. In the quadratic spline space $S_{2,t}$ we can then construct the spline
$$Q_2 f = \sum_{j=1}^{n} \lambda_j(f) B_{j,2}, \tag{9.30}$$
where the B-spline coefficients $\big(\lambda_j(f)\big)_{j=1}^{n}$ are defined by the rule
$$\lambda_j(f) = \frac{1}{2} \Big( -\theta_j^{-1} f(x_{2j-3}) + \theta_j^{-1} (1 + \theta_j)^2 f(x_{2j-2}) - \theta_j f(x_{2j-1}) \Big), \tag{9.31}$$
for j = 1, . . . , n. Here $\theta_1 = \theta_n = 1$ and
$$\theta_j = \frac{x_{2j-2} - x_{2j-3}}{x_{2j-1} - x_{2j-2}} \quad \text{for } j = 2, \ldots, n - 1.$$
a) Show that $Q_2$ simplifies to $P_2$ given by (9.4) when the data abscissas are uniformly spaced.
b) Show that $Q_2 p = p$ for all $p \in \pi_2$ and that because of the multiple abscissas at the ends we have $\lambda_1(f) = f(x_1)$, $\lambda_n(f) = f(x_m)$, so only the original data are used to define $Q_2 f$. (Hint: Use the formula in Exercise 1.)
c) Show that for j = 1, . . . , n and $f \in C_\Delta[x_1, x_m]$
$$|\lambda_j(f)| \le (2\theta + 1) \|f\|_{\infty,[t_{j+1},t_{j+2}]},$$
where
$$\theta = \max_{1 \le j \le n} \{\theta_j^{-1}, \theta_j\}.$$
d) Show that for l = 3, . . . , n, $f \in C_\Delta[x_1, x_m]$, and $x \in [t_l, t_{l+1}]$
$$|Q_2(f)(x)| \le (2\theta + 1) \|f\|_{\infty,[t_{l-1},t_{l+2}]}.$$
e) Show that for l = 3, . . . , n and $f \in C_\Delta[x_1, x_m]$
$$\|f - Q_2 f\|_{\infty,[t_l,t_{l+1}]} \le (2\theta + 2) \operatorname{dist}_{[t_{l-1},t_{l+2}]}(f, \pi_2).$$
f) Show that for $f \in C_\Delta^3[x_1, x_m]$ we have the $O(h^3)$ estimate
$$\|f - Q_2 f\|_{\infty,[x_1,x_m]} \le K(\theta) |\Delta x|^3 \|D^3 f\|_{\infty,[x_1,x_m]},$$
where
$$|\Delta x| = \max_j |x_{j+1} - x_j|$$
and the constant K(θ) only depends on θ.
CHAPTER 10
Shape Preserving Properties of B-splines
In earlier chapters we have seen a number of examples of the close relationship between a
spline function and its B-spline coefficients. This is especially evident in the properties of
the Schoenberg operator, but the same phenomenon is apparent in the diagonal property
of the blossom, the stability of the B-spline basis, the convergence of the control polygon
to the spline it represents and so on. In the present chapter we are going to add to this list
by relating the number of zeros of a spline to the number of sign changes in the sequence of
its B-spline coefficients. From this property we shall obtain an accurate characterisation
of when interpolation by splines is uniquely solvable. In the final section we show that
the knot insertion matrix and the B-spline collocation matrix are totally positive, i.e., all
their square submatrices have nonnegative determinants.
10.1 Bounding the number of zeros of a spline
In Section 4.5 of Chapter 4 we showed that the number of sign changes in a spline is
bounded by the number of sign changes in its B-spline coefficients, a generalisation of
Descartes’ rule of signs for polynomials, Theorem 4.23. Theorem 4.25 is not a completely
satisfactory generalisation of Theorem 4.23 since it does not allow multiple zeros. In this
section we will prove a similar result that does allow multiple zeros, but we cannot allow
the most general spline functions. We have to restrict ourselves to connected splines.
Definition 10.1. A spline $f = \sum_{j=1}^{n} c_j B_{j,d}$ in $S_{d,\tau}$ is said to be connected if for each x in $(\tau_1, \tau_{n+d+1})$ there is some j such that $\tau_j < x < \tau_{j+d+1}$ and $c_j \ne 0$. A point x where this condition fails is called a splitting point for f.
To develop some intuition about connected splines, let us see when a spline is not
connected. A splitting point of f can be of two kinds:
(i) The splitting point x is not a knot. If τµ < x < τµ+1 , then τj < x < τj+d+1
for j = µ − d, . . . , µ (assuming the knot vector is long enough) so we must have
$c_{\mu-d} = \cdots = c_\mu = 0$. In other words f must be identically zero on $(\tau_\mu, \tau_{\mu+1})$. In this
case f splits into two spline functions $f_1$ and $f_2$ with knot vectors $\tau^1 = (\tau_j)_{j=1}^{\mu}$ and
$\tau^2 = (\tau_j)_{j=\mu+1}^{n+d+1}$. We clearly have
$$f_1 = \sum_{j=1}^{\mu-d-1} c_j B_{j,d}, \qquad f_2 = \sum_{j=\mu+1}^{n} c_j B_{j,d}.$$
(ii) The splitting point x is a knot of multiplicity m, say
$$\tau_\mu < x = \tau_{\mu+1} = \cdots = \tau_{\mu+m} < \tau_{\mu+m+1}.$$
In this case we have $\tau_j < x < \tau_{j+1+d}$ for j = µ + m − d, . . . , µ. We must therefore have $c_{\mu+m-d} = \cdots = c_\mu = 0$. (Note that if m = d + 1, then no coefficients need to be zero.) This means that all the B-splines that “cross” x do not contribute to f. It therefore splits into two parts $f_1$ and $f_2$, but now the two pieces are not separated by an interval, but only by the single point x. The knot vector of $f_1$ is $\tau^1 = (\tau_j)_{j=1}^{\mu+m}$ and the knot vector of $f_2$ is $\tau^2 = (\tau_j)_{j=\mu+1}^{n+d+1}$, while
$$f_1 = \sum_{j=1}^{\mu+m-d-1} c_j B_{j,d}, \qquad f_2 = \sum_{j=\mu+1}^{n} c_j B_{j,d}.$$
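Definition 10.1 can be tested mechanically: a spline is connected precisely when the open intervals $(\tau_j, \tau_{j+d+1})$ with $c_j \ne 0$ together cover $(\tau_1, \tau_{n+d+1})$. The sketch below implements this check with a sweep over the sorted intervals. The function name and the 0-based indexing are assumptions made for this illustration; the two knot vectors in the usage lines correspond to the two kinds of splitting points described above.

```python
def is_connected(c, tau, d):
    """True if every x in (tau[0], tau[-1]) lies in some open interval
    (tau[j], tau[j+d+1]) with c[j] != 0 (Definition 10.1, 0-based indices)."""
    spans = sorted((tau[j], tau[j + d + 1]) for j, cj in enumerate(c)
                   if cj != 0 and tau[j] < tau[j + d + 1])
    if not spans or spans[0][0] > tau[0]:
        return False
    reach = spans[0][1]                  # everything in (tau[0], reach) is covered
    for left, right in spans[1:]:
        if left >= reach:                # x = reach is not covered: a splitting point
            return False
        reach = max(reach, right)
    return reach >= tau[-1]

simple = [0, 0, 0, 1, 2, 3, 3, 3]        # d = 2, n = 5, simple interior knots
double = [0, 0, 0, 1, 1, 2, 2, 2]        # interior knot 1 has multiplicity m = d = 2
print(is_connected([1, 1, 1, 1, 1], simple, 2))   # all coefficients nonzero
print(is_connected([1, 0, 0, 0, 1], simple, 2))   # zero on (1, 2): case (i)
print(is_connected([1, 1, 0, 1, 1], double, 2))   # splits at the knot x = 1: case (ii)
```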
Before getting on with our zero counts we need the following lemma.
Lemma 10.2. Suppose that z is a knot that occurs m times in τ,
$$\tau_i < z = \tau_{i+1} = \cdots = \tau_{i+m} < \tau_{i+m+1}$$
for some i. Let $f = \sum_j c_j B_{j,d}$ be a spline in $S_{d,\tau}$. Then
$$c_j = \frac{1}{d!} \sum_{k=0}^{d-m} (-1)^k D^{d-k} \rho_{j,d}(z) D^k f(z) \tag{10.1}$$
for all j such that $\tau_j < z < \tau_{j+d+1}$, where $\rho_{j,d}(y) = (y - \tau_{j+1}) \cdots (y - \tau_{j+d})$.
Proof. Recall from Theorem 8.5 that the B-spline coefficients of f can be written as
$$c_j = \lambda_j f = \frac{1}{d!} \sum_{k=0}^{d} (-1)^k D^{d-k} \rho_{j,d}(y) D^k f(y),$$
where y is a number such that $B_{j,d}(y) > 0$. In particular, we may choose y = z for j = i + m − d, . . . , i so
$$c_j = \lambda_j f = \frac{1}{d!} \sum_{k=0}^{d} (-1)^k D^{d-k} \rho_{j,d}(z) D^k f(z), \tag{10.2}$$
for these values of j. But in this case $\rho_{j,d}(y)$ contains the factor $(y - \tau_{i+1}) \cdots (y - \tau_{i+m}) = (y - z)^m$ so $D^{d-k} \rho_{j,d}(z) = 0$ for k > d − m and j = i + m − d, . . . , i, i.e., for all values of j such that $\tau_j < z < \tau_{j+d+1}$. The formula (10.1) therefore follows from (10.2).
In the situation of Lemma 10.2, we know from Lemma 2.6 that Dk f is continuous at z
for k = 0, . . . , d−m, but Dd+1−m f may be discontinuous. Equation (10.1) therefore shows
that the B-spline coefficients of f can be computed solely from continuous derivatives of
f at a point.
Lemma 10.3. Let f be a spline that is connected. For each x in $(\tau_1, \tau_{n+d+1})$ there is
then a nonnegative integer r such that $D^r f$ is continuous at x and $D^r f(x) \ne 0$.
Proof. The claim is clearly true if x is not a knot, for otherwise f would be identically zero
on an interval and therefore not connected. Suppose next that x is a knot of multiplicity
m. Then the first discontinuous derivative at x is Dd−m+1 f , so if the claim is not true,
we must have Dj f (x) = 0 for j = 0, . . . , d − m. But then we see from Lemma 10.2
that cl = λl f = 0 for all l such that τl < x < τl+d+1 . But this is impossible since f is
connected.
The lemma shows that we can count zeros of connected splines precisely as for smooth functions. If $f$ is a connected spline then a zero must be of the form $f(z) = Df(z) = \cdots = D^{r-1} f(z) = 0$ with $D^r f(z) \neq 0$ for some integer $r$. Moreover $D^r f$ is continuous at $z$. The total number of zeros of $f$ on $(a, b)$, counting multiplicities, is denoted $Z(f) = Z_{(a,b)}(f)$. Recall from Definition 4.21 that $S^-(c)$ denotes the number of sign changes in the vector $c$ (zeros are completely ignored).
Example 10.4. Below are some examples of zero counts of functions. For comparison we have also included counts of sign changes. All zero counts are over the whole real line.
$$\begin{aligned}
Z(x) &= 1, & S^-(x) &= 1, \\
Z(x^2) &= 2, & S^-(x^2) &= 0, \\
Z(x^7) &= 7, & S^-(x^7) &= 1, \\
Z\bigl(x(1-x)^2\bigr) &= 3, & S^-\bigl(x(1-x)^2\bigr) &= 1, \\
Z\bigl(x^3(1-x)^2\bigr) &= 5, & S^-\bigl(x^3(1-x)^2\bigr) &= 1, \\
Z(-1 - x^2 + \cos x) &= 2, & S^-(-1 - x^2 + \cos x) &= 0.
\end{aligned}$$
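For polynomials, the zero counts in Example 10.4 can be reproduced mechanically: the multiplicity of a zero $z$ is the smallest $r$ with $D^r p(z) \neq 0$. The following is only a sketch under our own assumptions; the helper names `poly_eval`, `poly_derivative` and `zero_multiplicity` are hypothetical, and the polynomial is assumed to be nonzero and given by its monomial coefficients.

```python
def poly_eval(coeffs, x):
    """Evaluate a polynomial by Horner's rule; coeffs[k] multiplies x**k."""
    value = 0
    for c in reversed(coeffs):
        value = value * x + c
    return value


def poly_derivative(coeffs):
    """Return the monomial coefficients of the derivative."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def zero_multiplicity(coeffs, z):
    """Smallest r such that D^r p(z) != 0 (0 means z is not a zero).

    Assumes the polynomial is not identically zero.
    """
    r = 0
    p = list(coeffs)
    while p and poly_eval(p, z) == 0:
        p = poly_derivative(p)
        r += 1
    return r


# x(1 - x)^2 = x - 2x^2 + x^3 has zeros at 0 and 1.
p = [0, 1, -2, 1]
total_p = zero_multiplicity(p, 0) + zero_multiplicity(p, 1)

# x^3 (1 - x)^2 = x^3 - 2x^4 + x^5 has the same zeros, with higher multiplicity at 0.
q = [0, 0, 0, 1, -2, 1]
total_q = zero_multiplicity(q, 0) + zero_multiplicity(q, 1)
```

Summing the multiplicities over the distinct real zeros gives the count $Z$ used in the example.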
We are now ready to prove a generalization of Theorem 4.23 that allows zeros to be
counted with multiplicities.
Theorem 10.5. Let $f = \sum_{j=1}^{n} c_j B_{j,d}$ be a spline in $S_{d,\tau}$ that is connected. Then
$$Z_{(\tau_1, \tau_{n+d+1})}(f) \le S^-(c) \le n - 1.$$
Proof. Let $z_1 < z_2 < \cdots < z_\ell$ be the zeros of $f$ in the interval $(\tau_1, \tau_{n+d+1})$, with $z_i$ of multiplicity $r_i$. Lemma 10.2 shows that $z_i$ occurs at most $d - r_i$ times in $\tau$. For if $z_i$ occurred $m > d - r_i$ times in $\tau$ then $d - m < r_i$, and hence $c_j = 0$ by (10.1) for all $j$ such that $\tau_j < z_i < \tau_{j+d+1}$, which means that $z_i$ is a splitting point for $f$. But this is impossible since $f$ is connected.
Now we form a new knot vector $\hat\tau$ where $z_i$ occurs exactly $d - r_i$ times and the numbers $z_i - h$ and $z_i + h$ occur $d + 1$ times. Here $h$ is a number that is small enough to ensure that there are no other zeros of $f$ or knots from $\tau$ other than $z_i$ in $[z_i - h, z_i + h]$ for $1 \le i \le \ell$. Let $\hat c$ be the B-spline coefficients of $f$ relative to $\hat\tau$. By Lemma 4.24 we then have $S^-(\hat c) \le S^-(c)$ so it suffices to prove that $Z_{(\tau_1, \tau_{n+d+1})}(f) \le S^-(\hat c)$. But since
$$Z_{(\tau_1, \tau_{n+d+1})}(f) = \sum_{i=1}^{\ell} Z_{(z_i - h, z_i + h)}(f),$$
it suffices to establish the theorem in the following situation: The knot vector is given by
$$\tau = (\underbrace{z - h, \ldots, z - h}_{d+1}, \underbrace{z, \ldots, z}_{d-r}, \underbrace{z + h, \ldots, z + h}_{d+1})$$
and $z$ is a zero of $f$ of multiplicity $r$. We want to show that
$$c_j = \frac{(d - r)!}{d!} (-1)^{d+1-j} h^r D^r f(z), \qquad j = d + 1 - r, \ldots, d + 1, \tag{10.3}$$
so that the $r + 1$ coefficients $(c_j)_{j=d+1-r}^{d+1}$ alternate in sign. For then $S^-(c) \ge r = Z_{(z-h,z+h)}(f)$. Fix $j$ in the range $d + 1 - r \le j \le d + 1$. By equation (10.1) we have
$$c_j = \frac{1}{d!} \sum_{k=0}^{r} (-1)^k D^{d-k}\rho_{j,d}(z)\, D^k f(z) = \frac{(-1)^r}{d!} D^{d-r}\rho_{j,d}(z)\, D^r f(z),$$
since $D^k f(z) = 0$ for $k = 0, \ldots, r - 1$. With our special choice of knot vector we have
$$\rho_{j,d}(y) = (y - z + h)^{d+1-j} (y - z)^{d-r} (y - z - h)^{r-d-1+j}.$$
Taking $d - r$ derivatives we therefore obtain
$$D^{d-r}\rho_{j,d}(z) = (d - r)!\, h^{d+1-j} (-h)^{r-d-1+j} = (d - r)!\, (-1)^{r-d-1+j} h^r$$
and (10.3) follows.
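The last step of the proof can be checked with exact arithmetic. The sketch below builds $\rho_{j,d}$ from the special knot vector, differentiates it $d - r$ times and evaluates (10.1) under the assumption that $D^r f(z)$ is the only nonvanishing derivative of order at most $r$. All function names are our own illustrative choices, and knots are indexed from 1 in the text but from 0 in the code.

```python
from fractions import Fraction
from math import factorial


def poly_mul_linear(coeffs, root):
    """Multiply a polynomial (coeffs[k] multiplies y**k) by (y - root)."""
    out = [Fraction(0)] * (len(coeffs) + 1)
    for k, c in enumerate(coeffs):
        out[k + 1] += c
        out[k] -= root * c
    return out


def derivative(coeffs, order):
    """Differentiate a polynomial `order` times."""
    for _ in range(order):
        coeffs = [k * c for k, c in enumerate(coeffs)][1:]
    return coeffs


def evaluate(coeffs, y):
    """Horner evaluation."""
    value = Fraction(0)
    for c in reversed(coeffs):
        value = value * y + c
    return value


def local_knots(d, r, z, h):
    """Knot vector with z-h and z+h of multiplicity d+1 and z of multiplicity d-r."""
    return [z - h] * (d + 1) + [z] * (d - r) + [z + h] * (d + 1)


def coefficients_from_10_1(d, r, z, h, drf):
    """c_j for j = d+1-r, ..., d+1 from (10.1), given D^r f(z) = drf."""
    tau = local_knots(d, r, z, h)
    result = {}
    for j in range(d + 1 - r, d + 2):
        rho = [Fraction(1)]
        # tau_{j+1}, ..., tau_{j+d} in the 1-based indexing of the text
        for knot in tau[j:j + d]:
            rho = poly_mul_linear(rho, knot)
        d_rho = evaluate(derivative(rho, d - r), z)
        result[j] = Fraction((-1) ** r, factorial(d)) * d_rho * drf
    return result


def closed_form_10_3(d, r, h, drf, j):
    """Right-hand side of (10.3)."""
    return Fraction(factorial(d - r), factorial(d)) * (-1) ** (d + 1 - j) * h ** r * drf


# Example: f(x) = (x - 1)^2 viewed as a cubic spline, so d = 3, r = 2, D^2 f(1) = 2.
coeffs = coefficients_from_10_1(3, 2, Fraction(1), Fraction(1, 2), 2)
```

In this example the three coefficients come out as $1/12, -1/12, 1/12$, which alternate in sign as the proof requires.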
Figures 10.1 (a)–(d) show some examples of splines with multiple zeros of the sort
discussed in the proof of Theorem 10.5. All the knot vectors are d + 1-regular on the
interval [0, 2], with additional knots at x = 1. In Figure 10.1 (a) there is one knot at x = 1
and the spline is the polynomial (x − 1)2 which has a double zero at x = 1. The control
polygon models the spline in the normal way and has two sign changes. In Figure 10.1 (b)
the knot vector is the same, but the spline is now the polynomial (x − 1)3 . In this case
the multiplicity of the zero is so high that the spline has a splitting point at x = 1. The
construction in the proof of Theorem 10.5 prescribes a knot vector with no knots at x = 1
in this case. Figure 10.1 (c) shows the polynomial (x − 1)3 as a degree 5 spline on a
6-regular knot vector with a double knot at x = 1. As promised by the theorem and its
proof the coefficients change sign exactly three times. The spline in Figure 10.1 (d) is
more extreme. It is the polynomial (x − 1)8 represented as a spline of degree 9 with one
knot at x = 1. The control polygon has the required 8 changes of sign.
10.2 Uniqueness of spline interpolation
Having established Theorem 10.5, we return to the problem of showing that the B-spline collocation matrix that occurs in spline interpolation is nonsingular. We first consider Lagrange interpolation, and then turn to Hermite interpolation where we also allow interpolation of derivatives.
Figure 10.1. Splines of varying degree with a varying number of zeros and knots at x = 1. Panels: (a) Cubic, 2 zeros, simple knot. (b) Cubic, multiplicity 3, simple knot. (c) Degree 5, multiplicity 3, double knot. (d) Degree 9, multiplicity 8, simple knot.
10.2.1 Lagrange Interpolation
In Chapter 8 we studied spline interpolation. With a spline space $S_{d,\tau}$ of dimension $n$ and data $(y_i)_{i=1}^{n}$ given at $n$ distinct points $x_1 < x_2 < \cdots < x_n$, the aim is to determine a spline $g = \sum_{i=1}^{n} c_i B_{i,d}$ in $S_{d,\tau}$ such that
$$g(x_i) = y_i, \qquad \text{for } i = 1, \ldots, n. \tag{10.4}$$
This leads to the linear system of equations
$$Ac = y,$$
where
$$A = \begin{pmatrix} B_{1,d}(x_1) & B_{2,d}(x_1) & \cdots & B_{n,d}(x_1) \\ B_{1,d}(x_2) & B_{2,d}(x_2) & \cdots & B_{n,d}(x_2) \\ \vdots & \vdots & \ddots & \vdots \\ B_{1,d}(x_n) & B_{2,d}(x_n) & \cdots & B_{n,d}(x_n) \end{pmatrix}, \qquad c = \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{pmatrix}, \qquad y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}.$$
The matrix A is often referred to as the B-spline collocation matrix. Since Bi,d (x) is
nonzero only if τi < x < τi+d+1 (we may allow τi = x if τi = τi+d < τi+d+1 ), the matrix A
will in general be sparse. The following theorem tells us exactly when A is nonsingular.
Theorem 10.6. Let $S_{d,\tau}$ be a given spline space, and let $x_1 < x_2 < \cdots < x_n$ be $n$ distinct numbers. The collocation matrix $A$ with entries $\bigl(B_{j,d}(x_i)\bigr)_{i,j=1}^{n}$ is nonsingular if and only if its diagonal is positive, i.e.,
$$B_{i,d}(x_i) > 0 \qquad \text{for } i = 1, \ldots, n. \tag{10.5}$$
Proof. We start by showing that A is singular if a diagonal entry is zero. Suppose that
xq ≤ τq (strict inequality if τq = τq+d < τq+d+1 ) for some q so that Bq,d (xq ) = 0. By the
support properties of B-splines we must have ai,j = 0 for i = 1, . . . , q and j = q, . . . ,
n. But this means that only the n − q last entries of each of the last n − q + 1 columns
of A can be nonzero; these columns must therefore be linearly dependent and A must be
singular. A similar argument shows that A is also singular if xq ≥ τq+d+1 .
To show the converse, suppose that (10.5) holds but $A$ is singular. Then there is a nonzero vector $c$ such that $Ac = 0$. Let $f = \sum_{i=1}^{n} c_i B_{i,d}$ denote the spline with B-spline coefficients $c$. We clearly have $f(x_q) = 0$ for $q = 1, \ldots, n$. Let $G$ denote the set
$$G = \bigcup_i \bigl\{ (\tau_i, \tau_{i+d+1}) \mid c_i \neq 0 \bigr\}.$$
Since each $x$ in $G$ must be in $(\tau_i, \tau_{i+d+1})$ for some $i$ with $c_i \neq 0$, we note that $G$ contains no splitting points of $f$. Note that if $x_i = \tau_i = \tau_{i+d} < \tau_{i+d+1}$ occurs at a knot of multiplicity $d + 1$, then $0 = f(x_i) = c_i$. To complete the proof, suppose first that $G$ is an open interval. Since $x_i$ is in $G$ if $c_i \neq 0$, the number of zeros of $f$ in $G$ is greater than or equal to the number $\ell$ of nonzero coefficients in $c$. Since we also have $S^-(c) < \ell \le Z_G(f)$, we have a contradiction to Theorem 10.5. In general $G$ consists of several subintervals which means that $f$ is not connected, but can be written as a sum of connected components, each defined on one of the subintervals. The above argument then leads to a contradiction on each subinterval, and hence we conclude that $A$ is nonsingular.
Theorem 10.6 makes it simple to ensure that the collocation matrix is nonsingular. We
just place the knots and interpolation points in such a way that τi < xi < τi+d+1 for i = 1,
. . . , n (note again that if τi = τi+d < τi+d+1 , then xi = τi is allowed).
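The diagonal condition of Theorem 10.6 is easy to test numerically. The sketch below evaluates B-splines by the usual recurrence (with the convention that a term with zero denominator is dropped and with right-continuous degree-zero B-splines), builds the collocation matrix and compares the diagonal test with an exact determinant. The function names and the example knot vector are our own assumptions, and indices are 0-based in the code.

```python
from fractions import Fraction


def bspline(j, d, tau, x):
    """B_{j,d}(x) on the knot vector tau (0-based j), continuous from the right."""
    if d == 0:
        return Fraction(1) if tau[j] <= x < tau[j + 1] else Fraction(0)
    value = Fraction(0)
    if tau[j + d] > tau[j]:
        value += (x - tau[j]) / (tau[j + d] - tau[j]) * bspline(j, d - 1, tau, x)
    if tau[j + d + 1] > tau[j + 1]:
        value += ((tau[j + d + 1] - x) / (tau[j + d + 1] - tau[j + 1])
                  * bspline(j + 1, d - 1, tau, x))
    return value


def collocation_matrix(tau, d, xs):
    """Matrix with entry (i, j) equal to B_{j,d}(x_i)."""
    n = len(tau) - d - 1
    return [[bspline(j, d, tau, x) for j in range(n)] for x in xs]


def determinant(matrix):
    """Exact determinant by Gaussian elimination with row swaps."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def diagonal_is_positive(tau, d, xs):
    """Condition (10.5): B_{i,d}(x_i) > 0 for every i."""
    return all(bspline(i, d, tau, x) > 0 for i, x in enumerate(xs))


# Quadratic splines on a 3-regular knot vector with one interior knot (n = 4).
tau = [Fraction(t) for t in (0, 0, 0, 1, 2, 2, 2)]
good_points = [Fraction(0), Fraction(1, 2), Fraction(3, 2), Fraction(7, 4)]
bad_points = [Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)]
```

With `good_points` the diagonal is positive and the determinant is nonzero; with `bad_points` all four points lie in $[0, 1)$, the last B-spline vanishes there, and the matrix is singular.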
10.2.2 Hermite Interpolation
In earlier chapters, particularly in Chapter 8, we made use of polynomial interpolation with Hermite data, that is, data based on derivatives as well as function values. This is also of interest for splines, and as for polynomials this is conveniently indicated by allowing the interpolation points to coalesce. If for example $x_1 = x_2 = x_3 = x$, we take $x_1$ to signify interpolation of the function value at $x$, the second occurrence of $x$ signifies interpolation of the first derivative, and the third tells us to interpolate the second derivative at $x$. If we introduce the notation
$$\lambda_x(i) = \max_j \{ j \mid x_{i-j} = x_i \}$$
and assume that the interpolation points are given in nondecreasing order as $x_1 \le x_2 \le \cdots \le x_n$, then the interpolation conditions are
$$D^{\lambda_x(i)} g(x_i) = D^{\lambda_x(i)} f(x_i) \tag{10.6}$$
where $f$ is a given function and $g$ is the spline to be determined. Since we are dealing with splines of degree $d$ we cannot interpolate derivatives of higher order than $d$; we therefore assume that $x_i < x_{i+d+1}$ for $i = 1, \ldots, n - d - 1$. At a point of discontinuity (10.6) is to be interpreted according to our usual convention of taking limits from the right. The $(i, j)$-entry of the collocation matrix $A$ is now given by
$$a_{i,j} = D^{\lambda_x(i)} B_{j,d}(x_i),$$
and as before the interpolation problem is generally solvable if and only if the collocation matrix is nonsingular. Also as before, it turns out that the collocation matrix is nonsingular if and only if $\tau_i \le x_i < \tau_{i+d+1}$, where equality is allowed in the first inequality only if $D^{\lambda_x(i)} B_{i,d}(x_i) \neq 0$. This result will follow as a special case of our next theorem where we consider an even more general situation.
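The notation $\lambda_x(i)$ simply counts how many earlier interpolation points coincide with $x_i$, which is the order of the derivative to be interpolated at that occurrence. A minimal sketch, assuming a nondecreasing list of points and using a function name of our own choosing:

```python
def derivative_orders(xs):
    """Return lambda_x(i) for each point of the nondecreasing sequence xs.

    The value is the number of immediately preceding points equal to xs[i],
    i.e. the order of the derivative interpolated at that occurrence.
    """
    orders = []
    for i, x in enumerate(xs):
        if i > 0 and xs[i - 1] == x:
            orders.append(orders[-1] + 1)
        else:
            orders.append(0)
    return orders


# Value at 0; value, first and second derivative at 1; value and first derivative at 2.
orders = derivative_orders([0, 1, 1, 1, 2, 2])
```

For splines of degree $d$ no entry of the result may exceed $d$, which is the requirement $x_i < x_{i+d+1}$ stated above.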
At times it is of interest to know exactly when a submatrix of the collocation matrix is nonsingular. The submatrices we consider are obtained by removing the same number of rows and columns from $A$. Any columns may be removed, or equivalently, we consider a subset $\{B_{j_1,d}, \ldots, B_{j_\ell,d}\}$ of the B-splines. When removing rows we have to be a bit more careful. The convention is that if a row with derivatives of order $r$ at $z$ is included, then we also include all the lower order derivatives at $z$. This is most easily formulated by letting the sequence of interpolation points only contain $\ell$ points as in the following theorem.
Theorem 10.7. Let $S_{d,\tau}$ be a spline space and let $\{B_{j_1,d}, \ldots, B_{j_\ell,d}\}$ be a subsequence of its B-splines. Let $x_1 \le \cdots \le x_\ell$ be a sequence of interpolation points with $x_i < x_{i+d+1}$ for $i = 1, \ldots, \ell - d - 1$. Then the $\ell \times \ell$ matrix $A(j)$ with entries given by
$$a_{i,q} = D^{\lambda_x(i)} B_{j_q,d}(x_i)$$
for $i = 1, \ldots, \ell$ and $q = 1, \ldots, \ell$ is nonsingular if and only if
$$\tau_{j_i} \le x_i < \tau_{j_i+d+1}, \qquad \text{for } i = 1, \ldots, \ell, \tag{10.7}$$
where equality is allowed in the first inequality if $D^{\lambda_x(i)} B_{j_i,d}(x_i) \neq 0$.
Proof. The proof follows along the same lines as the proof of Theorem 10.6. The most challenging part is the proof that condition (10.7) is sufficient, so we focus on this. Suppose that (10.7) holds, but $A(j)$ is singular. Then we can find a nonzero vector $c$ such that $A(j)c = 0$. Let $f = \sum_{i=1}^{\ell} c_i B_{j_i,d}$ denote the spline with $c$ as its B-spline coefficients, and let $G$ denote the set
$$G = \bigcup_{i=1}^{\ell} \bigl\{ (\tau_{j_i}, \tau_{j_i+d+1}) \mid c_i \neq 0 \bigr\}.$$
To carry through the argument of Theorem 10.6 we need to verify that in the exceptional case where $x_i = \tau_{j_i}$ then $c_i = 0$.
Set $r = \lambda_x(i)$ and suppose that the knot $\tau_{j_i}$ occurs $m$ times in $\tau$ and that $\tau_{j_i} = x_i$ so $D^r B_{j_i,d}(x_i) \neq 0$. In other words
$$\tau_\mu < x_i = \tau_{\mu+1} = \cdots = \tau_{\mu+m} < \tau_{\mu+m+1}$$
for some integer $\mu$, and in addition $j_i = \mu + k$ for some integer $k$ with $1 \le k \le m$. Note that $f$ satisfies
$$f(x_i) = Df(x_i) = \cdots = D^r f(x_i) = 0.$$
(Remember that if a derivative is discontinuous at $x_i$ we take limits from the right.) Recall from Lemma 2.6 that all B-splines have continuous derivatives up to order $d - m$ at $x_i$. Since $D^r B_{j_i}$ clearly is discontinuous at $x_i$, it must be true that $r > d - m$. We therefore have $f(x_i) = Df(x_i) = \cdots = D^{d-m} f(x_i) = 0$ and hence $c_{\mu+m-d} = \cdots = c_\mu = 0$ by Lemma 10.2. The remaining interpolation conditions at $x_i$ are $D^{d-m+1} f(x_i) = D^{d-m+2} f(x_i) = \cdots = D^r f(x_i) = 0$. Let us consider each of these in turn. By the continuity properties of B-splines we have $D^{d-m+1} B_{\mu+1}(x_i) \neq 0$ and $D^{d-m+1} B_{\mu+\nu}(x_i) = 0$ for $\nu > 1$. This means that
$$0 = D^{d-m+1} f(x_i) = c_{\mu+1} D^{d-m+1} B_{\mu+1}(x_i)$$
and $c_{\mu+1} = 0$. Similarly, we also have
$$0 = D^{d-m+2} f(x_i) = c_{\mu+2} D^{d-m+2} B_{\mu+2}(x_i),$$
and hence $c_{\mu+2} = 0$ since $D^{d-m+2} B_{\mu+2}(x_i) \neq 0$. Continuing this process we find
$$0 = D^r f(x_i) = c_{\mu+r+m-d} D^r B_{\mu+r+m-d}(x_i),$$
so $c_{\mu+r+m-d} = 0$ since $D^r B_{\mu+r+m-d}(x_i) \neq 0$. This argument also shows that $j_i$ cannot be chosen independently of $r$; we must have $j_i = \mu + r + m - d$.
For the rest of the proof it is sufficient to consider the case where $G$ is an open interval, just as in the proof of Theorem 10.6. Having established that $c_i = 0$ if $x_i = \tau_{j_i}$, we know that if $c_i \neq 0$ then $x_i \in G$. The number of zeros of $f$ in $G$ (counting multiplicities) is therefore greater than or equal to the number of nonzero coefficients. But this is impossible according to Theorem 10.5.
10.3 Total positivity
In this section we are going to deduce another interesting property of the knot insertion
matrix and the B-spline collocation matrix, namely that they are totally positive. We
follow the same strategy as before and establish this first for the knot insertion matrix and
then obtain the total positivity of the collocation matrix by recognising it as a submatrix
of a knot insertion matrix.
Definition 10.8. A matrix $A$ in $\mathbb{R}^{m,n}$ is said to be totally positive if all its square submatrices have nonnegative determinant. More formally, let $i = (i_1, i_2, \ldots, i_\ell)$ and $j = (j_1, j_2, \ldots, j_\ell)$ be two integer sequences such that
$$1 \le i_1 < i_2 < \cdots < i_\ell \le m, \tag{10.8}$$
$$1 \le j_1 < j_2 < \cdots < j_\ell \le n, \tag{10.9}$$
and let $A(i, j)$ denote the submatrix of $A$ with entries $(a_{i_p,j_q})_{p,q=1}^{\ell}$. Then $A$ is totally positive if $\det A(i, j) \ge 0$ for all sequences $i$ and $j$ of the form (10.8) and (10.9), for all $\ell$ with $1 \le \ell \le \min\{m, n\}$.
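For small matrices Definition 10.8 can be checked directly by enumerating all index sequences of the form (10.8) and (10.9). The following brute-force sketch is meant only as an illustration (its cost grows exponentially with the matrix size); the function names are our own.

```python
from fractions import Fraction
from itertools import combinations


def determinant(matrix):
    """Exact determinant by Gaussian elimination with row swaps."""
    a = [[Fraction(v) for v in row] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def is_totally_positive(matrix):
    """True if every square submatrix A(i, j) has nonnegative determinant."""
    m, n = len(matrix), len(matrix[0])
    for size in range(1, min(m, n) + 1):
        for rows in combinations(range(m), size):      # increasing row indices
            for cols in combinations(range(n), size):  # increasing column indices
                sub = [[matrix[i][j] for j in cols] for i in rows]
                if determinant(sub) < 0:
                    return False
    return True


# A bidiagonal matrix with nonnegative entries and one more row than columns.
bidiagonal = [[1, 0], [Fraction(1, 2), Fraction(1, 2)], [0, 1]]
# A permutation matrix with a negative 2 x 2 determinant.
swap = [[0, 1], [1, 0]]
```

Here `is_totally_positive(bidiagonal)` holds, while `swap` fails because its determinant is $-1$.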
We first show that knot insertion matrices are totally positive.
Theorem 10.9. Let τ and t be two knot vectors with τ ⊆ t. Then the knot insertion
matrix from Sd,τ to Sd,t is totally positive.
Proof. Suppose that there are $k$ more knots in $t$ than in $\tau$; our proof is by induction on $k$. We first note that if $k = 0$, then $A = I$, the identity matrix, while if $k = 1$, then $A$ is a bi-diagonal matrix with one more row than columns. Let us denote the entries of $A$ by $\bigl(\alpha_j(i)\bigr)_{i,j=1}^{n+1,n}$ (if $k = 0$ the range of $i$ is $1, \ldots, n$). In either case all the entries are nonnegative and $\alpha_j(i) = 0$ for $j < i - 1$ and $j > i$. Consider now the determinant of $A(i, j)$. If $j_\ell \ge i_\ell$ then $j_\ell > i_q$ for $q = 1, \ldots, \ell - 1$ so $\alpha_{j_\ell}(i_q) = 0$ for $q < \ell$. This means that only the last entry of the last column of $A(i, j)$ can be nonzero. The other possibility is that $j_\ell \le i_\ell - 1$ so that $j_q < i_\ell - 1$ for $q < \ell$. Then $\alpha_{j_q}(i_\ell) = 0$ for $q < \ell$ so only the last entry of the last row of $A(i, j)$ can be nonzero. Expanding the determinant either by the last column or last row we therefore have $\det A(i, j) = \alpha_{j_\ell}(i_\ell) \det A(i', j')$ where $i' = (i_1, \ldots, i_{\ell-1})$ and $j' = (j_1, \ldots, j_{\ell-1})$. Continuing this process we find that
$$\det A(i, j) = \alpha_{j_1}(i_1) \alpha_{j_2}(i_2) \cdots \alpha_{j_\ell}(i_\ell)$$
which clearly is nonnegative.
For $k \ge 2$, we make use of the factorization
$$A = A_k \cdots A_1 = A_k B, \tag{10.10}$$
where each $A_r$ corresponds to insertion of one knot and $B = A_{k-1} \cdots A_1$ is the knot insertion matrix for inserting $k - 1$ of the knots. By the induction hypothesis we know that both $A_k$ and $B$ are totally positive; we must show that $A$ is totally positive. Let $(a_i)$ and $(b_i)$ denote the rows of $A$ and $B$, and let $\bigl(\alpha_j(i)\bigr)_{i,j=1}^{m,m-1}$ denote the entries of $A_k$. From (10.10) we have
$$a_i = \alpha_{i-1}(i) b_{i-1} + \alpha_i(i) b_i \qquad \text{for } i = 1, \ldots, m,$$
where $\alpha_0(1) = \alpha_m(m) = 0$. Let $a_i(j)$ and $b_i(j)$ denote the vectors obtained by keeping only entries $(j_q)_{q=1}^{\ell}$ of $a_i$ and $b_i$ respectively. Row $q$ of the submatrix $A(i, j)$ of $A$ is then given by
$$a_{i_q}(j) = \alpha_{i_q-1}(i_q) b_{i_q-1}(j) + \alpha_{i_q}(i_q) b_{i_q}(j).$$
Using the linearity of the determinant in row $q$ we therefore have
$$\det \begin{pmatrix} a_{i_1}(j) \\ \vdots \\ a_{i_q}(j) \\ \vdots \\ a_{i_\ell}(j) \end{pmatrix} = \det \begin{pmatrix} a_{i_1}(j) \\ \vdots \\ \alpha_{i_q-1}(i_q) b_{i_q-1}(j) + \alpha_{i_q}(i_q) b_{i_q}(j) \\ \vdots \\ a_{i_\ell}(j) \end{pmatrix} = \alpha_{i_q-1}(i_q) \det \begin{pmatrix} a_{i_1}(j) \\ \vdots \\ b_{i_q-1}(j) \\ \vdots \\ a_{i_\ell}(j) \end{pmatrix} + \alpha_{i_q}(i_q) \det \begin{pmatrix} a_{i_1}(j) \\ \vdots \\ b_{i_q}(j) \\ \vdots \\ a_{i_\ell}(j) \end{pmatrix}.$$
By expanding the other rows similarly we find that det A(i, j) can be written as a sum
of determinants of submatrices of B, multiplied by products of αj (i)’s. By the induction
hypothesis all these quantities are nonnegative, so the determinant of A(i, j) must also
be nonnegative. Hence A is totally positive.
Knowing that the knot insertion matrix is totally positive, we can prove a similar
property of the B-spline collocation matrix, even in the case where multiple collocation
points are allowed.
Theorem 10.10. Let $S_{d,\tau}$ be a spline space and let $\{B_{j_1,d}, \ldots, B_{j_\ell,d}\}$ be a subsequence of its B-splines. Let $x_1 \le \cdots \le x_\ell$ be a sequence of interpolation points with $x_i < x_{i+d+1}$ for $i = 1, \ldots, \ell - d - 1$, and denote by $A(j)$ the $\ell \times \ell$ matrix with entries given by
$$a_{i,q} = D^{\lambda_x(i)} B_{j_q,d}(x_i)$$
for $i = 1, \ldots, \ell$ and $q = 1, \ldots, \ell$. Then
$$\det A(j) \ge 0.$$
Proof. We first prove the claim in the case $x_1 < x_2 < \cdots < x_\ell$. By inserting knots of multiplicity $d + 1$ at each of $(x_i)_{i=1}^{\ell}$ we obtain a knot vector $t$ that contains $\tau$ as a subsequence. If $t_{i-1} < t_i = t_{i+d} < t_{i+d+1}$ we know from Lemma 2.6 that $B_{j,d,\tau}(t_i) = \alpha_{j,d}(i)$. This means that the matrix $A(j)$ appears as a submatrix of the knot insertion matrix from $\tau$ to $t$. It therefore follows from Theorem 10.9 that $\det A(j) \ge 0$ in this case.
To prove the theorem in the general case we consider a set of distinct collocation points $y_1 < \cdots < y_\ell$ and let $A(j, y)$ denote the corresponding collocation matrix. Set $\lambda_i = \lambda_x(i)$ and let $\rho_i$ denote the linear functional given by
$$\rho_i f = \lambda_i!\, [y_{i-\lambda_i}, \ldots, y_i] f \tag{10.11}$$
for $i = 1, \ldots, \ell$. Here $[\cdot, \ldots, \cdot] f$ is the divided difference of $f$. By standard properties of divided differences we have
$$\rho_i B_{j,d} = \sum_{s=i-\lambda_i}^{i} \gamma_{i,s} B_{j,d}(y_s)$$
and $\gamma_{i,i} > 0$. Denoting by $D$ the matrix with $(i, j)$-entry $\rho_i B_{j,d}$, we find by properties of determinants and (10.11) that
$$\det D = \gamma_{1,1} \cdots \gamma_{\ell,\ell} \det A(j, y).$$
If we now let $y$ tend to $x$ we know from properties of the divided difference functional that $\rho_i B_j$ tends to $D^{\lambda_i} B_j$ in the limit. Hence $D$ tends to $A(j)$ so $\det A(j) \ge 0$.
APPENDIX A
Some Linear Algebra
A.1 Matrices
The collection of $m \times n$ matrices
$$A = \begin{pmatrix} a_{1,1} & \cdots & a_{1,n} \\ \vdots & & \vdots \\ a_{m,1} & \cdots & a_{m,n} \end{pmatrix}$$
with real elements $a_{i,j}$ is denoted by $\mathbb{R}^{m,n}$. If $n = 1$ then $A$ is called a column vector. Similarly, if $m = 1$ then $A$ is a row vector. We let $\mathbb{R}^m$ denote the collection of all column or row vectors with $m$ real components.
A.1.1 Nonsingular matrices, and inverses.
Definition A.1. A collection of vectors $a_1, \ldots, a_n \in \mathbb{R}^m$ is linearly independent if $x_1 a_1 + \cdots + x_n a_n = 0$ for some real numbers $x_1, \ldots, x_n$, implies that $x_1 = \cdots = x_n = 0$.
Suppose $a_1, \ldots, a_n$ are the columns of a matrix $A \in \mathbb{R}^{m,n}$. For a vector $x = (x_1, \ldots, x_n)^T \in \mathbb{R}^n$ we have $Ax = \sum_{j=1}^{n} x_j a_j$. It follows that the collection $a_1, \ldots, a_n$ is linearly independent if and only if $Ax = 0$ implies $x = 0$.
Definition A.2. A square matrix A such that Ax = 0 implies x = 0 is said to be
nonsingular.
Definition A.3. A square matrix A ∈ Rn,n is said to be invertible if for some B ∈ Rn,n
BA = AB = I,
where I ∈ Rn,n is the identity matrix.
An invertible matrix A has a unique inverse B = A−1 . If A, B, and C are square
matrices, and A = BC, then A is invertible if and only if both B and C are also invertible.
Moreover, the inverse of A is the product of the inverses of B and C in reverse order,
A−1 = C −1 B −1 .
A.1.2 Determinants.
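The determinant of a square matrix $A$ will be denoted $\det(A)$ or
$$\begin{vmatrix} a_{1,1} & \cdots & a_{1,n} \\ \vdots & & \vdots \\ a_{n,1} & \cdots & a_{n,n} \end{vmatrix}.$$
Recall that the determinant of a $2 \times 2$ matrix is
$$\begin{vmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{vmatrix} = a_{1,1} a_{2,2} - a_{1,2} a_{2,1}.$$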
A.1.3 Criteria for nonsingularity and singularity.
We state without proof the following criteria for nonsingularity.
Theorem A.4. The following are equivalent for a square matrix $A \in \mathbb{R}^{n,n}$.
1. $A$ is nonsingular.
2. $A$ is invertible.
3. $Ax = b$ has a unique solution $x = A^{-1} b$ for any $b \in \mathbb{R}^n$.
4. $A$ has linearly independent columns.
5. $A^T$ is nonsingular.
6. $A$ has linearly independent rows.
7. $\det(A) \neq 0$.
We also have a number of criteria for a matrix to be singular.
Theorem A.5. The following are equivalent for a square matrix $A \in \mathbb{R}^{n,n}$.
1. There is a nonzero x ∈ Rn so that Ax = 0.
2. A has no inverse.
3. Ax = b has either no solution or an infinite number of solutions.
4. A has linearly dependent columns.
5. There is a nonzero x so that xT A = 0.
6. A has linearly dependent rows.
7. det(A) = 0.
Corollary A.6. A matrix with more columns than rows has linearly dependent columns.
Proof. Suppose $A \in \mathbb{R}^{m,n}$ with $n > m$. By adding $n - m$ rows of zeros to $A$ we obtain a square matrix $B \in \mathbb{R}^{n,n}$. This matrix has linearly dependent rows. By Theorem A.5 the matrix $B$ has linearly dependent columns. But then the columns of $A$ are also linearly dependent.
A.2 Vector Norms
Formally, a vector norm $\|\cdot\|$ is a function $\|\cdot\| : \mathbb{R}^n \to [0, \infty)$ that satisfies for $x, y \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$ the following properties
1. $\|x\| = 0$ implies $x = 0$.
2. $\|\alpha x\| = |\alpha| \|x\|$.
3. $\|x + y\| \le \|x\| + \|y\|$. (A.1)
Property 3 is known as the Triangle Inequality. For us the most useful class of norms are the $p$ or $\ell^p$ norms. They are defined for $p \ge 1$ and $x = (x_1, x_2, \ldots, x_n)^T \in \mathbb{R}^n$ by
$$\|x\|_p = \bigl(|x_1|^p + |x_2|^p + \cdots + |x_n|^p\bigr)^{1/p}, \qquad \|x\|_\infty = \max_i |x_i|. \tag{A.2}$$
Since
$$\|x\|_\infty \le \|x\|_p \le n^{1/p} \|x\|_\infty, \qquad p \ge 1, \tag{A.3}$$
and $\lim_{p \to \infty} n^{1/p} = 1$ for any $n \in \mathbb{N}$ we see that $\lim_{p \to \infty} \|x\|_p = \|x\|_\infty$.
The 1, 2, and $\infty$ norms are the most important. We have
$$\|x\|_2^2 = x_1^2 + \cdots + x_n^2 = x^T x. \tag{A.4}$$
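The bounds in (A.3) and the limit $\|x\|_p \to \|x\|_\infty$ are easy to observe in floating point. The sketch below uses a hypothetical helper `p_norm` of our own and an arbitrary example vector.

```python
def p_norm(x, p):
    """The p norm of (A.2); p = float('inf') gives the max norm."""
    if p == float("inf"):
        return max(abs(v) for v in x)
    return sum(abs(v) ** p for v in x) ** (1.0 / p)


x = [3.0, -4.0, 1.0]
n = len(x)
max_norm = p_norm(x, float("inf"))

# Check (A.3) for a few values of p, allowing for rounding errors.
bounds_hold = all(
    max_norm <= p_norm(x, p) + 1e-12
    and p_norm(x, p) <= n ** (1.0 / p) * max_norm + 1e-12
    for p in (1, 2, 3, 10, 50)
)

# For large p the p norm is close to the max norm.
gap = abs(p_norm(x, 200) - max_norm)
```

The value of `gap` is tiny because the upper bound $n^{1/p} \|x\|_\infty$ in (A.3) tends to $\|x\|_\infty$ as $p$ grows.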
Lemma A.7 (The Hölder inequality). We have for $1 \le p \le \infty$ and $x, y \in \mathbb{R}^n$
$$\sum_{i=1}^{n} |x_i y_i| \le \|x\|_p \|y\|_q, \qquad \text{where } \frac{1}{p} + \frac{1}{q} = 1. \tag{A.5}$$
Proof. We base the proof on properties of the exponential function. Recall that the exponential function is convex, i.e. with $f(x) = e^x$ we have the inequality
$$f(\lambda x + (1 - \lambda) y) \le \lambda f(x) + (1 - \lambda) f(y) \tag{A.6}$$
for every $\lambda \in [0, 1]$ and $x, y \in \mathbb{R}$.
If $x = 0$ or $y = 0$, there is nothing to prove. Suppose $x, y \neq 0$. Define $u = x / \|x\|_p$ and $v = y / \|y\|_q$. Then $\|u\|_p = \|v\|_q = 1$. If we can prove that $\sum_i |u_i v_i| \le 1$, we are done because then $\sum_i |x_i y_i| = \|x\|_p \|y\|_q \sum_i |u_i v_i| \le \|x\|_p \|y\|_q$. Since $|u_i v_i| = |u_i| |v_i|$, we can assume that $u_i \ge 0$ and $v_i \ge 0$. Moreover, we can assume that $u_i > 0$ and $v_i > 0$ because a zero term contributes no more to the left hand side than to the right hand side of (A.5). Let $s_i, t_i$ be such that $u_i = e^{s_i/p}$, $v_i = e^{t_i/q}$. Taking $f(x) = e^x$, $\lambda = 1/p$, $1 - \lambda = 1/q$, $x = s_i$ and $y = t_i$ in (A.6) we find
$$e^{s_i/p + t_i/q} \le \frac{1}{p} e^{s_i} + \frac{1}{q} e^{t_i}.$$
But then
$$\sum_i |u_i v_i| = \sum_i e^{s_i/p + t_i/q} \le \frac{1}{p} \sum_i e^{s_i} + \frac{1}{q} \sum_i e^{t_i} = \frac{1}{p} \sum_i u_i^p + \frac{1}{q} \sum_i v_i^q = \frac{1}{p} + \frac{1}{q} = 1.$$
This completes the proof of (A.5).
When p = 2 then q = 2 and the Hölder inequality is associated with the names
Buniakowski-Cauchy-Schwarz.
Lemma A.8 (The Minkowski inequality). We have for $1 \le p \le \infty$ and $x, y \in \mathbb{R}^n$
$$\|x + y\|_p \le \|x\|_p + \|y\|_p. \tag{A.7}$$
Proof. Let $u = (u_1, \ldots, u_n)$ with $u_i = |x_i + y_i|^{p-1}$. Since $q(p - 1) = p$ and $p/q = p - 1$, we find
$$\|u\|_q = \Bigl(\sum_i |x_i + y_i|^{q(p-1)}\Bigr)^{1/q} = \Bigl(\sum_i |x_i + y_i|^p\Bigr)^{1/q} = \|x + y\|_p^{p/q} = \|x + y\|_p^{p-1}.$$
Using this and the Hölder inequality we obtain
$$\|x + y\|_p^p = \sum_i |x_i + y_i|^p \le \sum_i |u_i| |x_i| + \sum_i |u_i| |y_i| \le (\|x\|_p + \|y\|_p) \|u\|_q \le (\|x\|_p + \|y\|_p) \|x + y\|_p^{p-1}.$$
Dividing by $\|x + y\|_p^{p-1}$ proves Minkowski.
Using the Minkowski inequality it follows that the $p$ norms satisfy the axioms for a vector norm.
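Both inequalities can be sanity-checked numerically for a particular pair of vectors. This is a sketch only: `p_norm` and `conjugate_exponent` are names of our own, the check assumes $1 < p < \infty$, and a small tolerance absorbs rounding errors.

```python
def p_norm(x, p):
    """The p norm of a vector for finite p >= 1."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)


def conjugate_exponent(p):
    """The exponent q with 1/p + 1/q = 1, for 1 < p < infinity."""
    return p / (p - 1.0)


x = [1.0, -2.0, 3.0]
y = [4.0, 0.0, -1.0]
p = 3.0
q = conjugate_exponent(p)

# Hoelder (A.5): sum |x_i y_i| <= ||x||_p ||y||_q
holder_lhs = sum(abs(a * b) for a, b in zip(x, y))
holder_rhs = p_norm(x, p) * p_norm(y, q)

# Minkowski (A.7): ||x + y||_p <= ||x||_p + ||y||_p
minkowski_lhs = p_norm([a + b for a, b in zip(x, y)], p)
minkowski_rhs = p_norm(x, p) + p_norm(y, p)
```

A check of this kind illustrates the inequalities for one example; it is of course no substitute for the proofs above.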
In (A.3) we established the inequality
$$\|x\|_\infty \le \|x\|_p \le n^{1/p} \|x\|_\infty, \qquad p \ge 1.$$
More generally, we say that two vector norms $\|\cdot\|$ and $\|\cdot\|'$ are equivalent if there exist positive constants $\mu$ and $M$ such that
$$\mu \|x\| \le \|x\|' \le M \|x\| \tag{A.8}$$
for all $x \in \mathbb{R}^n$.
Theorem A.9. All vector norms on Rn are equivalent.
Proof. It is enough to show that a vector norm $\|\cdot\|$ is equivalent to the $l_\infty$ norm, $\|\cdot\|_\infty$. Let $x \in \mathbb{R}^n$ and let $e_i$, $i = 1, \ldots, n$ be the unit vectors in $\mathbb{R}^n$. Writing $x = x_1 e_1 + \cdots + x_n e_n$ we have
$$\|x\| \le \sum_i |x_i| \|e_i\| \le \|x\|_\infty M, \qquad M = \sum_i \|e_i\|.$$
To find $\mu > 0$ such that $\|x\| \ge \mu \|x\|_\infty$ for all $x \in \mathbb{R}^n$ is less elementary. Consider the function $f$ given by $f(x) = \|x\|$ defined on the $l_\infty$ "unit ball"
$$S = \{x \in \mathbb{R}^n : \|x\|_\infty = 1\}.$$
$S$ is a closed and bounded set. From the inverse triangle inequality
$$\bigl| \|x\| - \|y\| \bigr| \le \|x - y\|, \qquad x, y \in \mathbb{R}^n,$$
it follows that $f$ is continuous on $S$. But then $f$ attains its maximum and minimum on $S$, i.e. there is a point $x^* \in S$ such that
$$\|x^*\| = \min_{x \in S} \|x\|.$$
Moreover, since $x^*$ is nonzero we have $\mu := \|x^*\| > 0$. If $x \in \mathbb{R}^n$ is nonzero then $\tilde{x} = x / \|x\|_\infty \in S$. Thus
$$\mu \le \|\tilde{x}\| = \Bigl\| \frac{x}{\|x\|_\infty} \Bigr\| = \frac{1}{\|x\|_\infty} \|x\|,$$
and this establishes the lower inequality.
It can be shown that for the $p$ norms we have for any $q$ with $1 \le q \le p \le \infty$
$$\|x\|_p \le \|x\|_q \le n^{1/q - 1/p} \|x\|_p, \qquad x \in \mathbb{R}^n. \tag{A.9}$$
A.3 Vector spaces of functions
In Rm we have the operations x + y and ax of vector addition and multiplication by
a scalar a ∈ R. Such operations can also be defined for functions. As an example, if
f (x) = x, g(x) = 1 , and a, b are real numbers then af (x) + bg(x) = ax + b. In general, if
f and g are two functions defined on the same set I and a ∈ R, then the sum f + g and
the product af are functions defined on I by
$$(f + g)(x) = f(x) + g(x), \qquad (af)(x) = a f(x).$$
Two functions f and g defined on I are equal if f (x) = g(x) for all x ∈ I. We say that f
is the zero function, i.e. f = 0, if f (x) = 0 for all x ∈ I.
Definition A.10. Suppose S is a collection of real valued or vector valued functions, all
defined on the same set I. The collection S is called a vector space if af + bg ∈ S for all
f, g ∈ S and all a, b ∈ R. A subset T of S is called a subspace of S if T itself is a vector
space.
Example A.11. Vector spaces
• All polynomials πd of degree at most d.
• All polynomials of all degrees.
• All trigonometric polynomials $a_0 + \sum_{k=1}^{d} (a_k \cos kx + b_k \sin kx)$ of degree at most $d$.
• The set $C(I)$ of all continuous real valued functions defined on $I$.
• The set $C^r(I)$ of all real valued functions defined on $I$ with continuous $j$'th derivative for $j = 0, 1, \ldots, r$.
Definition A.12. A vector space $S$ is said to be finite dimensional if
$$S = \operatorname{span}(\varphi_1, \ldots, \varphi_n) = \Bigl\{ \sum_{j=1}^{n} c_j \varphi_j : c_j \in \mathbb{R} \Bigr\},$$
for a finite number of functions $\varphi_1, \ldots, \varphi_n$ in $S$. The functions $\varphi_1, \ldots, \varphi_n$ are said to span or generate $S$.
Of the examples above the space $\pi_d = \operatorname{span}(1, x, x^2, \ldots, x^d)$ generated by the monomials $1, x, x^2, \ldots, x^d$ is finite dimensional. Also the trigonometric polynomials are finite dimensional. The space of all polynomials of all degrees is not finite dimensional. To see this we observe that no finite set can generate the monomial $x^{d+1}$ where $d$ is the maximal degree of the elements in the spanning set. Finally we observe that $C(I)$ and $C^r(I)$ contain the space of polynomials of all degrees as a subspace. Hence they are not finite dimensional.
If $f \in S = \operatorname{span}(\varphi_1, \ldots, \varphi_n)$ then $f = \sum_{j=1}^{n} c_j \varphi_j$ for some $c = (c_1, \ldots, c_n)$. With $\varphi = (\varphi_1, \ldots, \varphi_n)^T$ we will often use the vector notation
$$f(x) = \varphi(x)^T c \tag{A.10}$$
for $f$.
A.3.1 Linear independence and bases
All vector spaces in this section will be finite dimensional.
Definition A.13. A set of functions φ = (φ1 , . . . , φn )T in a vector space S is said to be
linearly independent on a subset J of I if φ(x)T c = c1 φ1 (x) + · · · + cn φn (x) = 0 for all
x ∈ J implies that c = 0. If J = I then we simply say that φ is linearly independent.
If φ is linearly independent then the representation in (A.10) is unique. For if f =
φT c = φT b for some c, b ∈ Rn then f = φT (c − b) = 0. Since φ is linearly independent
we have c − b = 0, or c = b.
Definition A.14. A set of functions φT = (φ1 , . . . , φn ) in a vector space S is a basis for
S if the following two conditions hold
1. φ is linearly independent.
2. S = span(φ).
Theorem A.15. The monomials $1, x, x^2, \ldots, x^d$ are linearly independent on any set $J \subset \mathbb{R}$ containing at least $d + 1$ distinct points. In particular these functions form a basis for $\pi_d$.
Proof. Let $x_0, \ldots, x_d$ be $d + 1$ distinct points in $J$, and let $p(x) = c_0 + c_1 x + \cdots + c_d x^d = 0$ for all $x \in J$. Then $p(x_i) = 0$, for $i = 0, 1, \ldots, d$. Since a nonzero polynomial of degree at most $d$ can have at most $d$ zeros we conclude that $p$ must be the zero polynomial. But then $c_k = p^{(k)}(0)/k! = 0$ for $k = 0, 1, \ldots, d$. It follows that the monomials form a basis for $\pi_d$ since they span $\pi_d$ by definition.
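An alternative way to see Theorem A.15, which we only sketch here and which is not the argument used in the proof, is through the Vandermonde matrix: the conditions $p(x_i) = 0$ say that $Vc = 0$, where $V$ has entries $x_i^k$, and by the classical product formula $\det V = \prod_{i<j} (x_j - x_i)$ this matrix is nonsingular precisely when the points are distinct. The function names below are our own.

```python
from fractions import Fraction
from itertools import combinations


def vandermonde_matrix(points):
    """Matrix with entry (i, k) equal to x_i**k for k = 0, ..., len(points) - 1."""
    d = len(points) - 1
    return [[Fraction(x) ** k for k in range(d + 1)] for x in points]


def vandermonde_determinant(points):
    """Classical product formula: product of (x_j - x_i) over all pairs i < j."""
    det = Fraction(1)
    for xi, xj in combinations(points, 2):
        det *= Fraction(xj) - Fraction(xi)
    return det


distinct = [0, 1, 2]      # d + 1 = 3 distinct points for d = 2
repeated = [0, 1, 1]      # only two distinct points

det_distinct = vandermonde_determinant(distinct)
det_repeated = vandermonde_determinant(repeated)
```

A nonzero determinant for `distinct` means that $Vc = 0$ forces $c = 0$, in agreement with the theorem; the zero determinant for `repeated` shows why $d + 1$ distinct points are needed.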
To prove some basic results about bases in a vector space of functions it is convenient
to introduce a matrix transforming one basis into another.
Lemma A.16. Suppose S and T are finite dimensional vector spaces with S ⊂ T , and
let φ = (φ1 , . . . , φn )T be a basis for S and ψ = (ψ1 , . . . , ψm )T a basis for T . Then
φ = AT ψ,
(A.11)
for some matrix A ∈ Rm,n . If f = φT c ∈ S is given then f = ψ T b with
b = Ac.
(A.12)
Moreover A has linearly independent columns.
Proof. Since $\varphi_j \in T$ there are real numbers $a_{i,j}$ such that
$$\varphi_j = \sum_{i=1}^{m} a_{i,j} \psi_i, \qquad \text{for } j = 1, \ldots, n.$$
This equation is simply the component version of (A.11). If f ∈ S then f ∈ T and f = ψ T b
for some b. By (A.11) we have φT = ψ T A and f = φT c = ψ T Ac or ψ T b = ψ T Ac. Since
ψ is linearly independent we get (A.12). Finally, to show that A has linearly independent
columns suppose Ac = 0. Define f ∈ S by f = φT c. By (A.11) we have f = ψ T Ac = 0.
But then f = φT c = 0. Since φ is linearly independent we conclude that c = 0.
The matrix A in Lemma A.16 is called a change of basis matrix.
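As a small illustration of Lemma A.16, take $S = \pi_1$ with basis $\varphi = (1, x)^T$ and $T = \pi_2$ with the quadratic Bernstein basis $\psi = ((1-x)^2, 2x(1-x), x^2)^T$. This choice of bases is our own example and does not come from the lemma. Since $1 = \psi_1 + \psi_2 + \psi_3$ and $x = \tfrac{1}{2}\psi_2 + \psi_3$, the columns of $A$ are $(1, 1, 1)^T$ and $(0, \tfrac12, 1)^T$. The sketch checks $b = Ac$ at a few sample points.

```python
from fractions import Fraction


def phi(x):
    """Basis (1, x) of the linear polynomials."""
    return [Fraction(1), Fraction(x)]


def psi(x):
    """Quadratic Bernstein basis ((1-x)^2, 2x(1-x), x^2)."""
    x = Fraction(x)
    return [(1 - x) ** 2, 2 * x * (1 - x), x ** 2]


def mat_vec(matrix, vector):
    """Matrix-vector product."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


def dot(u, v):
    """Inner product of two coefficient lists."""
    return sum(a * b for a, b in zip(u, v))


# Change of basis matrix A in R^{3,2}: column j holds the psi-coefficients of phi_j.
A = [[Fraction(1), Fraction(0)],
     [Fraction(1), Fraction(1, 2)],
     [Fraction(1), Fraction(1)]]

c = [Fraction(3), Fraction(-2)]   # f(x) = 3 - 2x in the basis phi
b = mat_vec(A, c)                 # coefficients of f in the basis psi, see (A.12)

samples = [Fraction(k, 4) for k in range(-4, 9)]
agrees = all(dot(phi(x), c) == dot(psi(x), b) for x in samples)
```

The two columns of $A$ are clearly linearly independent, as the lemma asserts.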
A basis for a vector space generated by n functions can have at most n elements.
Lemma A.17. If ψ = (ψ1 . . . , ψk )T is a linearly independent set in a vector space S =
span(φ1 , . . . , φn ), then k ≤ n.
Proof. With φ = (φ1 , . . . , φn )T we have
ψ = AT φ,
for some A ∈ Rn,k .
If $k > n$ then $A$ is a rectangular matrix with more columns than rows. From Corollary A.6 we know that the columns of such a matrix must be linearly dependent; i.e. there is some nonzero $c \in \mathbb{R}^k$ such that $Ac = 0$. But then $\psi^T c = \varphi^T A c = 0$, for some nonzero $c$. This implies that $\psi$ is linearly dependent, a contradiction. We conclude that $k \le n$.
Lemma A.18. Every basis for a vector space must have the same number of elements.
Proof. Suppose φ = (φ1 , . . . , φn )T and ψ = (ψ1 , . . . , ψm )T are two bases for the vector
space. We need to show that m = n. Now
φ = AT ψ,
for some A ∈ Rm,n ,
ψ = B T φ,
for some B ∈ Rn,m .
By Lemma A.16 we know that both A and B have linearly independent columns. But
then by Corollary A.6 we see that m = n.
Definition A.19. The number of elements in a basis in a vector space S is called the
dimension of S, and is denoted by dim(S).
The following lemma shows that every set of linearly independent functions in a vector
space S can be extended to a basis for S. In particular every finite dimensional vector
space has a basis.
Lemma A.20. A set φT = (φ1 , . . . , φk ) of linearly independent elements in a finite dimensional vector space S, can be extended to a basis ψ T = (ψ1 , . . . , ψm ) for S.
Proof. Let S_k = span(ψ_1, . . . , ψ_k) where ψ_j = φ_j for j = 1, . . . , k. If S_k = S then we set
m = k and stop. Otherwise there must be an element ψ_{k+1} ∈ S such that ψ_1, . . . , ψ_{k+1} are
linearly independent. We define a new vector space S_{k+1} by S_{k+1} = span(ψ_1, . . . , ψ_{k+1}).
If S_{k+1} = S then we set m = k + 1 and stop the process. Otherwise we continue to generate vector spaces S_{k+2}, S_{k+3}, . . . . Since S is finitely generated we must by Lemma A.17
eventually find some m such that S_m = S.
The following simple but useful lemma shows that a spanning set must be a basis if
it contains the correct number of elements.
Lemma A.21. Suppose S = span(φ). If φ contains dim(S) elements then φ is a basis
for S.
Proof. Let n = dim(S) and suppose φ = (φ_1, . . . , φ_n) is a linearly dependent set. Then
there is one element, say φ_n, which can be written as a linear combination of φ_1, . . . , φ_{n−1}.
But then S = span(φ_1, . . . , φ_{n−1}) and dim(S) < n by Lemma A.17, which contradicts
n = dim(S). Hence φ is linearly independent, and since it spans S it is a basis for S.
A.4 Normed Vector Spaces
Suppose S is a vector space of functions. A norm || · || is a function || · || : S → [0, ∞)
that satisfies, for f, g ∈ S and α ∈ R, the following properties:
1. ||f|| = 0 implies f = 0.
2. ||αf|| = |α| ||f||.
3. ||f + g|| ≤ ||f|| + ||g||.
(A.13)
Property 3 is known as the Triangle Inequality. The pair (S, || · ||) is called a normed vector
space (of functions).
In the rest of this section we assume that the functions in S are continuous, or at least
piecewise continuous on some interval [a, b].
Analogous to the p or ℓ^p norms for vectors in R^n we have the p or L^p norms for
functions. They are defined for 1 ≤ p ≤ ∞ and f ∈ S by
$$
\begin{aligned}
\|f\|_p &= \|f\|_{L^p[a,b]} = \Bigl(\int_a^b |f(x)|^p \, dx\Bigr)^{1/p}, \qquad p \ge 1, \\
\|f\|_\infty &= \|f\|_{L^\infty[a,b]} = \max_{a \le x \le b} |f(x)|.
\end{aligned}
\tag{A.14}
$$
The 1, 2, and ∞ norms are the most important.
We have for 1 ≤ p ≤ ∞ and f, g ∈ S the Hölder inequality
$$
\int_a^b |f(x) g(x)| \, dx \le \|f\|_p \, \|g\|_q, \qquad \text{where } \frac{1}{p} + \frac{1}{q} = 1,
\tag{A.15}
$$
and the Minkowski inequality
$$
\|f + g\|_p \le \|f\|_p + \|g\|_p.
\tag{A.16}
$$
For p = 2, (A.15) is known as the Schwarz inequality, the Cauchy-Schwarz inequality, or
the Buniakowski-Cauchy-Schwarz inequality.
```
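
The L^p norms in (A.14) and the inequalities (A.15) and (A.16) can be checked numerically. The following is a minimal sketch, not part of the original text: it approximates the integrals with a composite midpoint rule, and the names `lp_norm`, `holder_gap` and `minkowski_gap`, as well as the default of 2000 subintervals, are assumptions made purely for illustration. The max norm is only approximated by the largest sampled value.

```python
import math


def lp_norm(f, a, b, p, n=2000):
    """Approximate the L^p[a, b] norm of f using the composite midpoint rule.

    If p is math.inf, the largest value of |f| over the midpoints is
    returned, which approximates the max norm from below.
    """
    h = (b - a) / n
    midpoints = [a + (i + 0.5) * h for i in range(n)]
    if math.isinf(p):
        return max(abs(f(x)) for x in midpoints)
    integral = h * sum(abs(f(x)) ** p for x in midpoints)
    return integral ** (1.0 / p)


def holder_gap(f, g, a, b, p, n=2000):
    """Right-hand side minus left-hand side of (A.15), for 1 < p < infinity."""
    q = p / (p - 1.0)  # conjugate exponent: 1/p + 1/q = 1
    lhs = lp_norm(lambda x: f(x) * g(x), a, b, 1, n)
    rhs = lp_norm(f, a, b, p, n) * lp_norm(g, a, b, q, n)
    return rhs - lhs


def minkowski_gap(f, g, a, b, p, n=2000):
    """Right-hand side minus left-hand side of (A.16), for finite p >= 1."""
    lhs = lp_norm(lambda x: f(x) + g(x), a, b, p, n)
    rhs = lp_norm(f, a, b, p, n) + lp_norm(g, a, b, p, n)
    return rhs - lhs


if __name__ == "__main__":
    f = lambda x: x
    g = lambda x: 1.0 - x
    print("||f||_2      ~", lp_norm(f, 0.0, 1.0, 2))
    print("||f||_inf    ~", lp_norm(f, 0.0, 1.0, math.inf))
    print("Hoelder gap   ", holder_gap(f, g, 0.0, 1.0, 2))
    print("Minkowski gap ", minkowski_gap(f, g, 0.0, 1.0, 3))
```

Both gaps should come out nonnegative up to rounding, since the weighted sums produced by the midpoint rule satisfy the discrete forms of the same inequalities. Choosing g as a multiple of f with p = 2 gives a Hölder gap of essentially zero, which is the equality case of the Schwarz inequality.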