TREE-RING BULLETIN, Vol. 41, 1981
THE SMOOTHING SPLINE: A NEW APPROACH TO
STANDARDIZING FOREST INTERIOR TREE-RING WIDTH SERIES
FOR DENDROCLIMATIC STUDIES
EDWARD R. COOK
and
KENNETH PETERS
Lamont-Doherty Geological Observatory
Palisades, New York
ABSTRACT
A new approach to removing the non-climatic variance of forest interior tree-ring
width series, using the smoothing spline, is described. This method is superior to
orthogonal polynomials because it makes no assumptions about the shape of the curve
to be used for standardization. Also, because the spline curve can range continuously
from a linear least squares fit to cubic interpolation through the data, it is far more
flexible than polynomials and provides a more "natural" fit.
For computing the spline, we found that specifying the Lagrange multiplier p,
which appears in the calculus of variations solution, rather than the residual variance as
suggested by Reinsch, was both practical and more efficient. In effect, the smoothing
spline is a one-parameter family of low-pass filters defined by p. We describe the
general characteristics of these filters in the time and frequency domains and compute
the response functions for several of them.
The smoothing spline is an excellent tree-ring standardization method because its
filtering characteristics are well defined. Its utility for dendroclimatology should be
considerable since, outside of semiarid environments, sites similar to forest interiors
predominate.
INTRODUCTION
In the semiarid environments of western North America, trees growing at or near
the upper and lower forest borders are generally unaffected by stand competition and
disturbance because of the wide spacing between neighboring trees. The rate of
annual radial growth of trees growing in such open-canopy environments generally
declines in an orderly fashion with increasing age to a relatively stationary mean level.
Because this declining growth rate is biological in origin, it must be removed from
each tree-ring series before the final composite tree-ring chronology can be used to
study variations in past climate. The process of modeling and removing such
non-climatic "noise" is known as standardization (Fritts 1971). Simple linear regressions
and modified negative exponential curves (Fritts et al. 1969) are commonly used to
standardize ring-width series for semiarid site trees.
When one moves to closed-canopy forest interior sites common to the deciduous
forests of eastern North America, the non-climatic component in annual ring-width
series becomes increasingly complex and variable due to competition between trees for
light and nutrients and from stand disturbances. Such additional noise is difficult to
model because it is often episodic in nature. The standardization techniques used for
semiarid site tree growth lack sufficient flexibility for removing the non-climatic
variance of forest interior ring-width series. This problem is not trivial since it can
severely limit the potential of dendroclimatic research in eastern North America and
Europe.
One approach towards minimizing this problem has been the use of orthogonal
polynomials of varying degrees to model and remove non-climatic variance (Fritts
1976). Because of occasional dissatisfaction with the polynomial standardization
technique, we have researched a promising alternative, the smoothing spline.
BACKGROUND
Splines have been used traditionally as mathematical analogues to the thin flexible
strips used in drafting for interpolating new values between adjacent measurements.
As such, the spline is of no use for tree-ring research since it passes through the data
points on the assumption that these points are measured without error. Reinsch
(1967), however, considered the case where the data points were subject to unwanted
experimental error. In order to extract the underlying function from the experimental
noise, he developed an algorithm for a smoothing spline. Like the cubic interpolating
spline, this smoothing spline has continuous first and second derivatives.
The cubic spline can be thought of as a concatenation of cubic polynomial
segments that are joined together at their ends or knots. The continuity of the first and
second derivatives assures that the segments are joined in a very smooth fashion. In this
sense the smoothing spline is a series of piecewise cubic polynomials with a knot at each
data point abscissa (Wold 1974).
Splines are inherently superior to polynomials for approximating functions that
are disjointed or episodic in nature. To quote Rice (1969: 123) from Wold (1974):
"Spline functions are the most successful approximating functions for practical
applications so far discovered. The reader may be unaware of the fact that ordinary
polynomials are inadequate in many situations. This is particularly the case when one
approximates functions which arise from the physical world rather than from the
mathematical world. Functions which express physical relationships are frequently of
a disjointed or disassociated nature. That is to say that their behavior in one region
may be totally unrelated to their behavior in another region. Polynomials, along with
most other mathematical functions, have just the opposite property. Namely, their
behavior in a small region determines their behavior everywhere. Splines do not suffer
this handicap since they are defined piecewise, yet, for [more than 3 data points] they
represent nice, smooth curves in the physical world."
A major advantage of orthogonal polynomials that allows for automated curve
fitting is the statistical independence between successively higher order curves. The
widely used ring-width standardization program, developed by the Laboratory of
Tree-Ring Research at The University of Arizona, utilizes a testing algorithm in its
orthogonal polynomial option. It accepts a given order polynomial fit when the next
two higher orders do not reduce the residual variance by 5% or more. By contrast, the
greatest obstacle to using the spline in an efficient automated fashion was the lack of a
satisfactory criterion for specifying the degree of smoothing.
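The stopping rule described above can be sketched roughly as follows. This is an illustration only, not the Arizona program: it assumes ordinary least squares polynomial fits from NumPy, that the 5% reduction is measured relative to the residual variance of the order under test, and that the search starts at order 1. The function name `select_polynomial_order` and the parameters `max_order` and `min_reduction` are hypothetical.

```python
import numpy as np

def select_polynomial_order(x, y, max_order=8, min_reduction=0.05):
    """Return the lowest order whose next two higher orders each fail to
    reduce the residual variance by `min_reduction` (assumed: relative to
    the residual variance of the order being tested)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    def residual_variance(order):
        # Polynomial.fit rescales x internally, which keeps the fit stable.
        fit = np.polynomial.Polynomial.fit(x, y, order)
        return float(np.mean((y - fit(x)) ** 2))

    for order in range(1, max_order + 1):
        threshold = (1.0 - min_reduction) * residual_variance(order)
        if (residual_variance(order + 1) > threshold
                and residual_variance(order + 2) > threshold):
            return order
    return max_order

# Example: a quadratic trend with a small, rapidly oscillating term.
x = np.linspace(0.0, 1.0, 200)
y = 1.0 + 2.0 * x - 3.0 * x ** 2 + 0.01 * np.sin(200.0 * x)
order = select_polynomial_order(x, y)
```

A rule of this kind selects one global order per series, which is why a single episode of suppression or release can force a high-order polynomial over the whole record.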
THE SMOOTHING SPLINE
The smoothing spline algorithm of Reinsch (1967) minimizes the total squared
curvature of the spline,
$$\int_{x_0}^{x_n} [g''(x)]^2 \, dx, \tag{1a}$$
under the constraint
$$\sum_{i=0}^{n} \left( \frac{g(x_i) - y_i}{\delta y_i} \right)^2 \le S, \tag{1b}$$
where $y_i$ is the input series, $\delta y_i$ is a series of weights, and S is a scaling parameter. The
quantities $\delta y_i$ control the extent of smoothing and are implicitly rescaled by varying S.
Reinsch suggested using for $\delta y_i$ a standard deviation associated with $y_i$. We tried this
but found that it led to a noticeable bias in the fits and in the resultant tree-ring
indices. Also, there are good reasons for weighting all of the ring widths equally. First,
the actual measurement errors are independent, of reasonably constant variance and
usually negligible. Second, most of the variation in the ring widths, which is ring-width
dependent, is not error. For ring-width series the local standard deviation is
proportional to the local mean. Roughly speaking, the standardization curve should
pass through the local average of the ring widths, and this result is not achieved easily if
mean-dependent weighting is used. Thus we decided to set $\delta y_i$ = 1.0 for all i.
The above expression then reduces to an unweighted residual sum-of-squares
criterion and the spline corresponding to a given S value is computed by an iterative
procedure. For a given data set (with all $\delta y_i$ equal to 1.0) the spline fit is determined by
one parameter, S, which we scaled to be a fraction, s', of the variance of the data
about the mean.
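Written out, the unweighted criterion and the scaling of S might look as follows. This is a sketch under an assumption: the text does not give the normalization, so we take "variance of the data about the mean" to be the sum of squared deviations from the mean, with $\bar{y}$ denoting the series mean.

```latex
\sum_{i=0}^{n} \bigl( g(x_i) - y_i \bigr)^2 \le S,
\qquad
S = s' \sum_{i=0}^{n} \bigl( y_i - \bar{y} \bigr)^2 .
```

Under this reading, s' is the fraction of the total variation about the mean that the spline is allowed to leave in the residuals.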
A better parameter for spline selection was found by examining the calculus of
variations solution of the Reinsch problem. Each spline can be defined uniquely also by
the value of the Lagrange multiplier p that is associated with the constraint (1b). The
base 10 logarithms of p range from +∞ to -∞, but virtually all the corresponding
variation in S occurs between +2.00 and -10.00 for our data. The positive extreme
defines an interpolating spline and the negative a simple linear least squares fit to the
data. S (and s') increases monotonically with 1/p. An important property of the p
versus s' relation is that it is unaffected by the mean or variance of the data and, for a
time-invariant process that is sampled sufficiently often, independent of N also. That
is, a particular value of p or s' corresponds uniquely to a particular fraction of variance
removed from a given process. In addition, specifying p instead of s' allows us to
compute the spline directly rather than iteratively, which greatly reduces the
computation time per spline.
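The direct computation for a specified p can be sketched as follows. This is a minimal illustration, not the authors' program. It assumes unit spacing between rings, equal weights, dense NumPy matrices rather than a banded solver, and a scaling of p chosen so that the interior frequency response agrees with equation (2). The names `smoothing_spline`, `Q` and `T` are ours.

```python
import numpy as np

def smoothing_spline(y, p):
    """Smoothing-spline values at unit-spaced points for a given p.

    Assumed formulation (equal weights, unit spacing):
        solve (Q'Q + p*T) u = Q'y, then fit = y - Q u.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    if n < 3:
        raise ValueError("need at least 3 points")
    # Q (n x n-2): each column holds the second-difference stencil 1, -2, 1.
    Q = np.zeros((n, n - 2))
    j = np.arange(n - 2)
    Q[j, j] = 1.0
    Q[j + 1, j] = -2.0
    Q[j + 2, j] = 1.0
    # T (n-2 x n-2): tridiagonal, 4/3 on the diagonal and 1/3 beside it.
    T = (4.0 / 3.0) * np.eye(n - 2) + (1.0 / 3.0) * (
        np.eye(n - 2, k=1) + np.eye(n - 2, k=-1))
    # One linear solve per spline; no iteration on S is needed.
    u = np.linalg.solve(Q.T @ Q + p * T, Q.T @ y)
    return y - Q @ u

# Example with synthetic "ring widths" (made-up data, for illustration only).
rng = np.random.default_rng(0)
widths = 1.0 + 0.3 * rng.standard_normal(120)
curve = smoothing_spline(widths, 10.0 ** -4.0)
```

In this form a very small p returns the least squares straight line and a very large p returns the data themselves, matching the two extremes described in the text.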
Figure 1. Four examples of the smoothing spline applied to forest interior ring-width
series. Each spline was computed for log p = -4.0. The solid-line curve is the spline
fit and the dashed-line curve is the orthogonal polynomial fit. The four series are
from the same site in southeastern New York and the lower two plots are cores from
the same tree.
Using a wide range of p values, we fit splines to many different raw ring-width
series of forest interior trees. By displaying each curve fit on a cathode ray tube, the
aptness of the fit was quickly, albeit subjectively, evaluated. By this trial and error
procedure, we found that the log p value of -4.00 generally yielded a satisfactory curve
fit to the data. Figure 1 illustrates four examples of this smoothing spline for log p =
-4.00. For comparison, the orthogonal polynomial curve fit computed by The
University of Arizona program is shown as a dashed line in each series. The four
ring-width series are from trees growing within 100 meters of each other on a site in
southeastern New York. Series A and B are from two different trees while series C and
D are from opposite sides of a third tree.
Series A is an example where the polynomial and spline fits give roughly the same
solution. Here the underlying growth curve is consistent in the sense that its
characteristics for the entire length are reasonably modeled by the selected polynomial
equation. Series B-D, however, illustrate a major advantage of the spline. The spline
curves have a "natural" flexibility due to their piecewise nature as if they were fit to the
data by eye and are consistently satisfactory. The polynomial curves are either totally
unsatisfactory as in C or adequate only in certain intervals as Rice (1969) described.
The poorer performance of the orthogonal polynomials may in part be a function
of the testing algorithm. If the required reduction of residual variance were reduced to
allow for more flexibility, the polynomial curves might better approximate the splines
as in series A. Higher order polynomials, however, are still subject to more constraints
than splines and the residual variance test may not consistently select the best order
polynomial for each data series.
Because the smoothing spline can range continuously from a simple linear fit to a
cubic interpolation through the data, problems of overfitting can quickly arise. By
overfitting, we mean that an excessive amount of variance, some of it climatic, has
been removed from the tree-ring series. Since we have no a priori knowledge of the
variance structure of the climatic signal other than that it tends to be "red",
we can best minimize the overfitting problem by comparing the spline fits of different
tree-ring series from the same site. Any consistent low-frequency similarities between
the curves would suggest that a common signal, perhaps climate, has been removed.
In Figure 1, the four splines show very little covariation through time, indicating that
the non-climatic variance in each series has been reasonably modeled. In cases where a
general disturbance such as fire or insect infestation has affected an entire stand of
trees, this approach will be more difficult to apply.
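One way to put a number on the covariation between fitted curves is sketched below. The paper compares the splines by eye; the summary statistic here (the mean pairwise Pearson correlation) is our choice, and the sketch assumes the curves have already been trimmed to a common span of years. The function name `mean_pairwise_correlation` is hypothetical.

```python
import numpy as np

def mean_pairwise_correlation(curves):
    """Average Pearson correlation over all pairs of fitted curves.

    curves: 2-D array-like, one row per series, with columns aligned
    on the same years (an assumption of this sketch).
    """
    corr = np.corrcoef(np.asarray(curves, dtype=float))
    upper = np.triu_indices_from(corr, k=1)  # each pair counted once
    return float(corr[upper].mean())

# Two curves that rise together give a value near +1, which would hint
# that a shared signal has been absorbed into the fits.
shared = mean_pairwise_correlation([[1.0, 2.0, 3.0, 4.0],
                                    [2.0, 4.0, 6.0, 8.0]])
```

Values near zero are consistent with the curves capturing tree-specific variance, although a stand-wide disturbance could also produce high correlations, as noted in the text.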
TIME AND FREQUENCY DOMAIN PROPERTIES
The results shown in Figure 1 and others using different p values suggest that the
smoothing spline behaves at least approximately as a running average; the shape of the
weight function being determined by the parameter p, peaky for large p and flat for
small p. Under this interpretation the smoothing spline can be characterized by an
impulse response function in the time domain and by a frequency response function in
the frequency domain.
Figure 2 shows the results of computing three different splines for a 300-point series
consisting of a unit spike at point 150 added to a constant series of 300 values of 1.0 (solid
lines). The weight functions are symmetric and only the central values and the leading
weights are shown. Each filter is smoothly tapered and has minor side lobes that
damp out quickly. Note that the base widths of the filters are all moderate fractions
of 300. When the filter base width is comparable to the number of data points, or if
the spike is near the ends of the data, the shape of the response is different. For
instance, the log p = -4.0 response is the same to a few parts in ten thousand for a unit
spike centered in 100 points, but for 50 points the response is noticeably different
(dashed line). For spikes near the ends of the data the response is asymmetric as well.
By Fourier transforming the matrix operations occurring in the computation of the
spline and ignoring the finite length of the data set, a frequency response function of
the form
$$u(f) = 1 - \frac{1}{1 + \dfrac{p(\cos 2\pi f + 2)}{6(\cos 2\pi f - 1)^2}} \tag{2}$$
is derived. Several of these functions are plotted in Figure 3. The frequency at which
the spline reduces the amplitude of a sine wave by 50% is shown on each curve. For
example, the 50% frequency response for log p = -4.0 occurs at a period of 53 years.
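The response function is straightforward to evaluate numerically. The sketch below assumes the reading u(f) = 1 - 1/(1 + p(cos 2πf + 2) / (6(cos 2πf - 1)^2)), with f in cycles per year. The function name `spline_response` is hypothetical, and the formula is undefined at f = 0, so the period must be finite.

```python
import math

def spline_response(period_years, log_p):
    """Fraction of a sine wave's amplitude retained in the spline curve,
    for a wave of the given period and a spline with the given log10(p)."""
    p = 10.0 ** log_p
    c = math.cos(2.0 * math.pi / period_years)
    ratio = p * (c + 2.0) / (6.0 * (c - 1.0) ** 2)
    return 1.0 - 1.0 / (1.0 + ratio)

# Response of the log p = -4.0 spline at a few periods (in years).
for period in (10, 53, 400):
    print(period, round(spline_response(period, -4.0), 4))
```

At a period of 53 years the value comes out close to one half, in line with the 50% point quoted for log p = -4.0. Short periods are almost entirely excluded from the spline curve and long periods almost entirely absorbed by it.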
Figure 2. The impulse response functions of the smoothing spline for different log p
values. The units of each axis are dimensionless. The dashed-line filter (log p =
-4.0) is from a data set with only 50 points. [Horizontal axis: FILTER WIDTH.]
Also, from (2) we can compute the p value for a spline which has a 50% frequency
response at a specified frequency:
$$p = \frac{6(\cos 2\pi f - 1)^2}{\cos 2\pi f + 2}. \tag{3}$$
Again, for small p values (corresponding to large base widths in the time domain) (2)
and (3) do not tell the whole story. When p = 0, for instance, u(f) = 0 everywhere
except at f = 0, where it equals 1.0. In the time domain this corresponds to convolving
with a function that is zero everywhere. In fact the spline fit is a sloping line which
according to Reinsch is a least squares fit to the original data.
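For readers who want the intermediate step, equation (3) follows from (2) by setting the response to one half. The worked value for a 53-year period is our own arithmetic, rounded, and is included only as a check against the figure quoted in the text.

```latex
u(f) = \tfrac{1}{2}
\;\Longrightarrow\;
\frac{1}{1 + \dfrac{p(\cos 2\pi f + 2)}{6(\cos 2\pi f - 1)^2}} = \tfrac{1}{2}
\;\Longrightarrow\;
\frac{p(\cos 2\pi f + 2)}{6(\cos 2\pi f - 1)^2} = 1
\;\Longrightarrow\;
p = \frac{6(\cos 2\pi f - 1)^2}{\cos 2\pi f + 2}.

% Check with f = 1/53 cycles per year:
% \cos(2\pi/53) \approx 0.99298, so
% p \approx \frac{6\,(0.00702)^2}{2.99298} \approx 9.9 \times 10^{-5},
% i.e. \log_{10} p \approx -4.0 .
```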
The time series description of the smoothing spline outlined here, which is
developed more fully in another paper (Peters and Cook 1981), is intended to help
one select the degree of smoothing objectively on the basis of the frequency response
function (2) or (3) when the error variance in the data values is unknown or a
meaningless concept, thus extending the applicability of the technique. In practice we
have found that the qualifications regarding small values of p, although they must be
borne in mind, are not a serious restriction. Another example of the end effects is
shown in Figure 4. This ring-width series of an eastern hemlock is a dramatic example
of the suppression and release found in some trees growing in forest interiors. Two
splines were computed using a value of log p = -4.0: one for the entire series (solid
line) and one for a segment (dash-dot line). The end effects are obvious but acceptably
small when compared with the magnitude of the ring-width variation. The spline fits
the data near the end points as though a reasonable extension had been made and
then a moving average filter defined by (2) had been applied. This is the case for all
the other data sets we have worked with, for all values of p. Defining a
"reasonable" extension as one in accord with these observations, we have found that a
reliable rule of thumb is to select a p value on the basis of (2) or (3) as though the
frequency response function were an accurate description of the spline for all values of
p.
Figure 3. Frequency response functions for several smoothing splines. The 50%
frequency response in years for each spline is shown. [Horizontal axes: LOG FREQUENCY and PERIOD IN YEARS.]
DENDROCLIMATIC CONSIDERATIONS
We believe that the smoothing spline offers a major improvement in standardizing
ring-width series that are poorly modeled with straight line or negative exponential
curve fits. Because splines form a well-defined class of low-pass filters, their behavior is
well characterized in both the time and frequency domains. A spline provides a more natural fit
to the data because it operates effectively as a centrally weighted moving average on
the data. Orthogonal polynomials, however, try to generalize the underlying structure
of the data by operating on the entire sequence in a least squares sense. While a
polynomial fit may coincide with a spline fit as in Figure 1A, it is always under more
constraints, which usually cause distortion in the shape of the computed growth curve.
Because the filtering action of these splines is known, each tree-ring chronology
should always be catalogued with information about the frequency response of the
splines used in standardization. For those researchers investigating long-term climatic
change or low-frequency cyclic phenomena, this information must be provided lest
they arrive at biased conclusions due to the filtered nature of the data.
The choice of each spline used for standardization is still subjective and must be
made on the basis of both frequency and time domain considerations. We want to
preserve as much low-frequency climatic variance as possible and yet remove divergent
non-climatic anomalies that, in the time domain, could be wrongly interpreted as
exceptional climatic events. Ideally, each spline should be "as straight as possible" and
still remove most of the variance that is not common to all tree-ring series collected
from the same site. The key to this approach is adequate sampling which, for forest
interior sites, means a minimum field collection of 40 increment cores from trees of the
same species. By carefully comparing the splines by eye or statistically prior to merging
the standardized tree-ring series into a composite site chronology, the effects of
inadequate curve fits can be minimized.
Figure 4. An example of the effect of filter truncation at the ends of a data series.
The solid line is the spline fit to the complete (1690-1976) tree-ring series. The
dash-dot line is the spline fit to the middle (1750-1899) segment. Each spline was
computed using a log p value of -4.0. [Horizontal axis: YEARS.]
The log p value of -4.0 that defines a spline with a 50% frequency response of 53
years is a useful starting point for using the smoothing spline. It was generally
satisfactory for the ring-width series we used for testing purposes because any lower
frequency climatic variance was indistinguishable from the variance judged to be
non-climatic. The latter component, being a combination of biological growth trend,
changing stand density, and episodic disturbance, masked more slowly varying climatic
signals. There will certainly be instances, however, where more low-frequency variance
should be retained in the final tree-ring chronology when the configuration of the
non-climatic component is relatively simple.
The smoothing spline is not a panacea for removing non-climatic variance in forest
ring-width series. A certain amount of climatic information will always be lost due to
the shape of the frequency response curves and where the signal and noise spectra
overlap in the lower frequencies. These are problems common to any filtering
operation. Nor will the spline allow us to relax the need for adequate sampling since
good replication is still the best way to increase the signal-to-noise ratio in tree-ring
chronologies.
With these considerations in mind, the smoothing spline represents a highly
flexible standardization technique that can be tailored to the needs of the researcher.
Although developed specifically for tree-ring series from forest interior sites, its
application extends to any series for which a particular model is not easily justifiable.
ACKNOWLEDGEMENTS
We thank Drs. W. S. Broecker, P. Stoffa, G. C. Jacoby and H. C. Fritts for comments and suggestions
that improved this paper. This research was supported by Grant ATM77-19217 from the Climate Dynamics
Research Section of the National Science Foundation. Lamont-Doherty Geological Observatory
Contribution No. 3283.
REFERENCES
Fritts, H. C.
1971
Dendroclimatology and dendroecology. Quaternary Research 1: 419-49.
1976
Tree-rings and climate. Academic Press, New York.
Fritts, H. C., J. E. Mosimann, and C. P. Bottorff
1969
A revised computer program for standardizing tree-ring series. Tree-Ring Bulletin 29: 15-20.
Peters, K. and E. R. Cook
1981
The cubic smoothing spline as a digital filter. Lamont-Doherty Geological Observatory of
Columbia University, Technical Report #CU-1-81/TRI.
Reinsch, C. H.
1967
Smoothing by spline functions. Numerische Mathematik 10: 177-83.
Rice, J. R.
1969
The approximation of functions, Vol. 2. Addison-Wesley, Reading, Mass.
Wold, S.
1974
Spline functions in data analysis. Technometrics 16: 1-11.