Renormalization of Lorenz Maps
BJÖRN WINCKLER
Doctoral Thesis
Stockholm, Sweden, 2011
TRITA-MAT-11-MA-02
ISSN 1401-2278
ISRN KTH/MAT/DA 11/01-SE
ISBN 978-91-7501-011-3
Department of Mathematics
KTH
100 44 Stockholm
Academic thesis which, with permission of the Royal Institute of Technology (KTH), is submitted for public examination for the degree of Doctor of Technology in Mathematics, on Tuesday 23 August 2011, in lecture hall D3.
© Björn Winckler, August 2011
Printed by Universitetsservice US-AB
Abstract
This thesis is a study of the renormalization operator on Lorenz
maps with a critical point. Lorenz maps arise naturally as first-return
maps for three-dimensional geometric Lorenz flows. Renormalization
is a tool for analyzing the microscopic geometry of dynamical systems
undergoing a phase transition.
In the first part we develop new tools to study the limit set of
renormalization for Lorenz maps whose combinatorics satisfy a long
return condition. This combinatorial condition leads to the construction of a relatively compact subset of Lorenz maps which is essentially
invariant under renormalization. From here we can deduce topological properties of the limit set (e.g. existence of periodic points of renormalization) as well as measure theoretic properties of infinitely renormalizable maps (e.g. existence of uniquely ergodic Cantor attractors).
After this, we show how Martens’ decompositions can be used to
study the differentiable structure of the limit set of renormalization.
We prove that each point in the limit set has a global two-dimensional
unstable manifold which is a graph and that the intersection of an unstable manifold with the domain of renormalization is a Cantor set.
All results in this part are stated for arbitrary real critical exponents
α > 1.
In the second part we give a computer assisted proof of the existence of a hyperbolic fixed point for the renormalization operator
on Lorenz maps of the simplest possible nonunimodal combinatorial
type. We then show how this can be used to deduce both universality
and rigidity for maps with the same combinatorial type as the fixed
point. The results in this part are only stated for critical exponent
α = 2.
Sammanfattning

This thesis is a study of the renormalization operator on Lorenz maps which have a critical point. Such maps arise naturally as first-return maps of geometric Lorenz flows in three dimensions. Renormalization is a tool which can be used to analyze the microscopic geometry of dynamical systems undergoing a phase transition.

In Part One we develop new tools for analyzing the limit set of the renormalization operator on Lorenz maps whose combinatorics satisfy a long return time condition. This condition is used to construct a relatively compact set of Lorenz maps which is essentially invariant under renormalization. From this we can prove topological properties of the limit set (e.g. existence of periodic points of the renormalization operator) as well as measure-theoretic properties of infinitely renormalizable Lorenz maps (e.g. existence of uniquely ergodic Cantor attractors). After that we show how Martens' decompositions can be used to analyze the differentiable structure of the limit set of renormalization. We show that every point in the limit set has a two-dimensional global unstable manifold which is a graph, and that the intersection of the unstable manifold with the domain of renormalization is a Cantor set. All results in this part hold for arbitrary real critical exponents α > 1.

In Part Two we prove that the renormalization operator has a hyperbolic fixed point of the simplest possible nonunimodal combinatorics. The proof relies on a computer program to carry out certain rigorous estimates. We also show how the existence of such a fixed point implies universality and rigidity for maps of the same combinatorics. This part holds only for critical exponent α = 2.
Acknowledgements
I would like to thank my supervisors Marco Martens and Michael
Benedicks, and my assistant supervisor Masha Saprykina. I would
also like to thank Kristian Bjerklöv, Kostya Khanin, Denis Gaidashev,
and the dynamical systems group at Stony Brook.
A large part of this thesis was conceived during my stay at Institut
Mittag-Leffler (Djursholm, Sweden) in the spring of 2010. I gratefully
acknowledge their support.
Contents

1 Introduction
  1.1 Background
  1.2 Renormalization
  1.3 Lorenz flows
  1.4 Statement of results
  1.5 Previous results
  1.6 Future work

I Renormalization of maps with long return time

2 Preliminaries
  2.1 The renormalization operator
  2.2 Generalized renormalization
  2.3 Invariant measures

3 Invariance
  3.1 The invariant set
  3.2 A priori bounds
  3.3 Periodic points of the renormalization operator

4 Decompositions
  4.1 Decompositions
  4.2 Renormalization of decomposed maps

5 Differentiable structure
  5.1 The derivative
  5.2 Archipelagos in the parameter plane
  5.3 Invariant cone field
  5.4 Unstable manifolds

II Existence of a hyperbolic renormalization fixed point

6 Computer assisted proof
  6.1 Existence of a hyperbolic fixed point
  6.2 Consequences
  6.3 Outline of the computer assisted proof
  6.4 The proof

7 Implementation of estimates
  7.1 Verification of contraction
  7.2 Computation with floating point numbers
  7.3 Computation with polynomials
  7.4 Computation with analytic functions
  7.5 Linear algebra routines
  7.6 Supporting functions
  7.7 Input to the main program
  7.8 Running the main program
  7.9 Haskell mini-reference

A Background material
  A.1 A fixed point theorem
  A.2 The nonlinearity operator
  A.3 The Schwarzian derivative

Bibliography

Index
Chapter 1

Introduction

1.1 Background
To get a feel for the subject of this thesis before getting into the details I will
begin by relating something of a mystery that occurs in the physical world.
Imagine a narrow tap which has very precise control over the flow of
water coming out of it. The flow is controlled by turning a knob which has
a dial indicating the angle the knob is turned. To begin with no water is
flowing. After turning the knob ever so slightly water starts dripping. At
this point the first interesting thing happens: as the knob is slowly being
turned the frequency with which the water drips will not change, until all
of a sudden the water starts dripping twice as fast as before. Now this pattern repeats itself; as the knob is turned the frequency of the drips does not
change, until all of a sudden the frequency doubles again. This frequency
doubling can be observed a couple of times until the water starts flowing in
a steady stream. In very general terms we call this a phase transition (with
dripping and flowing water being the two phases) via a period-doubling1
cascade.
Let a1 denote the angle the knob was turned when the tap started dripping, a2 the angle of the knob when the frequency doubled for the first time, a3 the angle of the second frequency doubling, etc. If this experiment was performed on another tap the recorded angles would most likely differ wildly.[1] However, and this is the mysterious part, the asymptotic relative distance between these angles is independent of the tap! That is, if d1 = a2 − a1 is the distance between the first two recorded angles, d2 = a3 − a2 the distance between the next two angles, and so on, then the sequence of ratios d1/d2, d2/d3, . . . approaches a number which has nothing to do with the tap used for the experiment:

    di/di+1 → δ ≈ 4.6692 . . .

[1] The way we have presented this example it is the frequency which is doubling. However, if we perform the experiment "in reverse," then it is the period which is doubling.
The number δ is called the Feigenbaum delta (or the first[2] Feigenbaum constant). The particular value of δ is not so interesting on its own, but what is interesting is that it has a tendency to turn up in (dissipative) systems that undergo a phase transition via a period-doubling cascade.

[2] There is also a second Feigenbaum constant but its description is less intuitive.
Another, seemingly unrelated, system where the Feigenbaum delta appears is the oscillating electric circuit. This example is not as appealing to
the intuition as the dripping tap so I will not go into too many details. The
setup is simple: connect a resistor, an inductor, and a diode, and feed this circuit with a sinusoidal signal. By increasing the amplitude of the input signal, the voltage measured across the diode will exhibit the same frequency
doubling characteristic as the dripping tap. That is, if the input amplitude
is increased slowly, the voltage measured across the diode will double in frequency at specific values of the amplitude; call these V1, V2, V3, etc. As
before, if we form relative distances between these values d1 = V2 − V1 ,
d2 = V3 − V2 , etc., then di /di+1 → δ.
The topic of this thesis, renormalization, is a tool for analyzing systems
which, like the above examples, are undergoing a phase transition. On one
level it explains why seemingly unrelated systems such as the dripping tap
and the oscillating circuit exhibit the same behavior during a transition, but
on a deeper level it allows us to give a precise description of the mathematics underlying this phenomenon.
1.2 Renormalization
The phenomenon presented in the previous section was not discovered by some mad scientist playing with dripping taps in a lab — it was first observed in the computer experiment outlined below. Only later did people come up with clever ways of reproducing the results in physical systems
(see e.g. Libchaber and Maurer, 1979; Linsay, 1981; Martien et al., 1985). The
computer experiment itself was inspired by phase transitions in statistical
mechanics.
The computer experiment goes something like this: pick a family of
quadratic maps which depend on one parameter, for example the logistic
family
f µ ( x ) = µx (1 − x ), x ∈ [0, 1], µ > 0.
Now iterate the critical point (x = 1/2) of this map and see where it ends up
after many iterations for different values of the parameter µ. The observed
behavior is shown in Figure 1.1 on the following page. For small values of µ
the critical point always ends up in a fixed point whose position is given by
the curve coming in from the left in the figure; after µ = 3 all of a sudden
the critical point ends up in a period two orbit given by the split into two
curves; a little later the curves split again and now the critical point follows
a period four orbit, and so on. Around µ = 3.6 it is no longer clear where
the critical point ends up — it may tend to a periodic orbit but the map may
also be chaotic.
The reason why we look at the fate of the critical point is because it can
be shown that it reflects the behavior of almost all other points in [0, 1] (the
obvious exceptions are x = 0 and x = 1, but there may be others). Another
way of saying this is that Figure 1.1 shows what the attractor of f µ looks
like for different values of the parameter µ.
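This experiment is easy to repeat. Below is a minimal sketch in Haskell (the language used for the computer-assisted estimates in Part II); the parameter grid and the iteration counts are arbitrary choices, not taken from the original experiments.

    -- For each parameter mu: iterate the critical point x = 1/2, throw away
    -- a transient, and print the points that are visited afterwards.
    -- Plotting the printed (mu, x) pairs reproduces Figure 1.1.
    main :: IO ()
    main = mapM_ sample [2.4, 2.405 .. 4.0]
      where
        logistic mu x = mu * x * (1 - x)
        sample mu =
          let attractor = take 50 (drop 1000 (iterate (logistic mu) 0.5))
          in mapM_ (\x -> putStrLn (show mu ++ " " ++ show x)) attractor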
Now let µ1 be the parameter value where the curve coming in from the
left splits in two, µ2 where the two curves split into four, etc. (that is, the parameter values µi record where a bifurcation takes place). As before, form
distances d1 = µ2 − µ1 , d2 = µ3 − µ2 , . . . , then consider the ratios d1 /d2 ,
d2 /d3 , and so on. By now it should come as no surprise that the ratios converge to the Feigenbaum delta. Also, if the experiment is repeated for any
other one-parameter family of maps with a quadratic critical point3 (e.g.
families of sine functions) then the same behavior will be observed. The exact values where the bifurcations take place will be different, but the ratio
of distances will still converge to the Feigenbaum delta. This phenomenon
is called universality.[4]

[3] That is, any family which can be written h(x²) in a neighborhood of the critical point, where h is a homeomorphism.
[4] Actually, there is more to universality. What we describe here is known as universality in the parameter plane, but there is also universality in the phase space, or metric universality, which is described in Remark 6.2.4.
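The convergence can be checked directly: the first few bifurcation parameters of the logistic family are known numerically, and the values below are standard approximations (not computed in this thesis).

    -- Successive ratios d_i/d_{i+1} formed from the first period-doubling
    -- parameters mu_1, ..., mu_5 of the logistic family; the ratios are
    -- already close to delta ~ 4.6692.
    main :: IO ()
    main = print [ d1 / d2 | (d1, d2) <- zip ds (tail ds) ]
      where
        mus = [3.0, 3.449490, 3.544090, 3.564407, 3.568759]
        ds  = zipWith (-) (tail mus) mus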
Figure 1.1: Bifurcation diagram for the logistic family x ↦ µx(1 − x).
Given a value for the parameter µ on the horizontal axis, the vertical axis
shows where the critical point ends up after many iterations. For µ < 3 it
goes to one spot, after that it may be in one of two spots, then four, then
eight, then . . . chaos.
The above experiment was carried out by Coullet and Tresser (1978),
and independently by Feigenbaum (1978, 1979). A crucial insight of Coullet
and Tresser was that they anticipated that the observations made in this
computer experiment would occur in the real world. Nowadays this might
be taken for granted but at the time it was new.
In order to explain universality the above authors introduced the period-doubling operator T which is loosely defined as follows. Consider the space U of real-analytic unimodal[5] maps with a quadratic critical point. Take some f ∈ U. If there is an interval C around the critical point such that the restriction f²|C is affinely conjugate to some g ∈ U, then define T f = g (one should also assume that C is maximal for T to be well-defined). Conjugations preserve periodic points so a property of T is that f has a periodic point of period 2n if and only if T f has a periodic point of period n. Hence, if Σn ⊂ U denotes the codimension one[6] surface of maps undergoing a bifurcation from 2^{n−1}–periodic behavior to 2^n–periodic behavior (corresponding to a split in the bifurcation diagram of Figure 1.1), then T(Σn+1) ⊂ Σn, for n ≥ 1.

[5] Unimodal here means with exactly one turning point.
[6] The bifurcation surface Σn is characterized by a condition on the critical value. This is a one-dimensional condition, which explains why Σn has codimension one.
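To make the definition concrete, here is a sketch of T in the other common normalization, where unimodal maps are even maps on [−1, 1] with f(0) = 1; in that normalization the rescaling is exactly multiplication by a = f(1). The polynomial g below is a crude, well-known truncation of the fixed point, included only to illustrate approximate invariance; neither it nor this normalization is used later in the thesis.

    -- (T f)(x) = f(f(a*x))/a with a = f 1 < 0: an affine rescaling of the
    -- second iterate of f.
    doubling :: (Double -> Double) -> (Double -> Double)
    doubling f = \x -> f (f (a * x)) / a
      where a = f 1

    -- A crude truncation of the Feigenbaum fixed point; 'doubling g' should
    -- be close to g, with a defect that is visible due to the truncation.
    g :: Double -> Double
    g x = 1 - 1.5276330 * x^2 + 0.1048151 * x^4

    main :: IO ()
    main = mapM_ (\x -> print (doubling g x - g x)) [0, 0.25, 0.5, 0.75, 1.0]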
Now assume that: (i) T has a hyperbolic fixed point f ∗ , (ii) the spectrum of the derivative of T at f ∗ is discrete with one expanding eigenvalue
equal to δ and all other eigenvalues strictly contained in the complex unit
disc, and (iii) the surface Σn transversally intersects the unstable manifold
of f ∗ for n sufficiently large. These assumptions, called the renormalization
conjectures by the above authors, allowed them to explain universality as
follows.
Since T (Σn+1 ) ⊂ Σn the λ–lemma implies that the surfaces converge
toward the stable manifold of f ∗ and that the rate of convergence is determined by δ (since the action of T on a local unstable manifold is simply
multiplication by δ). A one-parameter family of unimodal maps is a curve
in U , so we can record parameter values an at which the curve crosses Σn .
These values converge (for generic[7] families) and the rate of convergence, which is described by the ratios (an+1 − an)/(an − an−1), asymptotically equals 1/δ. In particular, this only depends on properties of the fixed point and not on the family under consideration, thereby explaining universality.

[7] By generic we essentially mean families which cross Σn transversally for all n ≥ N, where N is some large number.
Proving the renormalization conjectures turned out to be difficult and when the first partial proof was announced by Lanford (1982, 1984) it necessitated the use of a computer in order to perform some of the calculations.[8] A novelty of Lanford's proof was the use of interval arithmetic in order to make rigorous computer estimates on an infinite-dimensional space. Note that Lanford did not prove the transversality conjecture; this was done by Eckmann and Wittwer (1987), also using the computer-assisted methods pioneered by Lanford. In Part II we recreate Lanford's methods to prove a similar result for maps with a discontinuity.

[8] As Lanford (1982) puts it in a remark: "Although done by computer, the computations involved in proving the results stated are just on the boundary of what it is feasible to verify by hand. I estimate that a carefully chosen minimal set of estimates sufficient to prove Theorems 1 and 3 could be carried out, with the aid only of a nonprogrammable calculator, in a few days."
The period-doubling operator is defined in terms of the second iterate
of maps; this is a restriction of the more general renormalization operator R. Briefly, f ∈ U is renormalizable if there exists an interval C around
the critical point such that f n |C is affinely conjugate to a map g ∈ U for
some n > 1, and we define R f = g (take C maximal for R to be well-defined). Also, there is no reason to insist that the critical point is quadratic,
instead we should consider the space Uα of unimodal maps with critical
exponent α > 1, that is if f ∈ Uα then | f ( x ) − f (c)| = h(| x − c|α ) in a
neighborhood of the critical point c for some homeomorphism h.
Intuitively we think of renormalization as a microscope. If f is renormalizable, then this means that the interesting dynamics for f takes place
on the subset C (together with the forward orbit of C). Renormalization
takes C and “zooms in” on this interval and gives us a new map R f which
describes the dynamics on C. For certain f (the so-called infinitely renormalizable maps) this zooming can be repeated countably many times on
smaller and smaller intervals and this typically means that the dynamics
of f lives on a complicated fractal set. The geometry of such a fractal set
is intimately linked to the topology of the map itself. Renormalization allows us to study the microscopic geometry of maps; it is particularly useful
in connection with phase transitions between maps of different topological types. The prototypical example is that of the period-doubling phase
transition to chaos in the logistic family that we described above.
Let us now get back to the historical development of the theory of renormalization. Even though computer-assisted methods can be successfully
employed in the special case of the period-doubling operator, they are not
of much use when trying to say something about the renormalization operator. Other methods were needed in order to proceed.
The first conceptual description of the renormalization operator was
given by Sullivan (1992) where he introduced new complex analytic tools
in order to study the limit set of the renormalization operator (note that
the limit set contains the period-doubling fixed point). The idea is that the
limit set is a hyperbolic Cantor set and that the renormalization operator
acts as the full shift on the limit set (this is colloquially referred to as the
renormalization horseshoe). This was proved over the course of several
years by different authors, most notably Sullivan (1992); McMullen (1996);
Lyubich (1999).
A downside with the approach of the above authors is that it relies on
complex analytic methods which only work for even critical exponents (we
will have more to say about why this is dissatisfying in the next section).
For this reason people have been looking for alternative proofs which do
not rely on complex analysis, without much success so far. A notable exception is Martens (1998) who proves the existence of periodic points of
the renormalization operator on unimodal maps for any critical exponent.
However, that paper does not touch on the subject of hyperbolicity of the
limit set of renormalization.
In Part I we give a (partial) proof of the existence of a renormalization
horseshoe for a class of maps with a discontinuity. Our approach does
not use complex analysis and works for any critical exponent α > 1. To
our knowledge this is the first result about the structure of a limit set of
renormalization which works for any critical exponent. Moreover, it is one
of very few results on maps having both a discontinuity and a critical point
that goes beyond the purely topological aspects of the dynamics of such
maps.
1.3 Lorenz flows
The previous section discussed the renormalization part of the title of this
thesis. This section concerns the second part, that is Lorenz maps, but first
we need to talk about Lorenz flows.
Historically speaking, a Lorenz flow is a flow associated with the following system of ordinary differential equations in three dimensions called
the Lorenz equations:
    ẋ = −σx + σy,
    ẏ = −xz + rx − y,    (1.1)
    ż = xy − bz.
This system describes a simplified model of convection in the atmosphere.
It was investigated by Lorenz (1963) and has played an important role in
the development of the subject of Dynamical Systems (see Viana, 2000, for
an overview and recent results). For parameter values σ = 10, b = 8/3
and r = 28 this system exhibits the well-known Lorenz attractor depicted
in Figure 1.2 on the following page.
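The system is easy to explore numerically; the following is a minimal sketch using a fixed-step fourth-order Runge–Kutta scheme (the step size, initial condition, and number of steps are arbitrary choices). Plotting the printed points reproduces a picture like Figure 1.2.

    type V3 = (Double, Double, Double)

    -- The vector field (1.1) with the classical parameters.
    lorenz :: V3 -> V3
    lorenz (x, y, z) = (-sigma*x + sigma*y, -x*z + r*x - y, x*y - b*z)
      where sigma = 10; r = 28; b = 8/3

    -- One step of the classical fourth-order Runge-Kutta scheme.
    rk4 :: Double -> V3 -> V3
    rk4 h v = v .+. scale (h/6) (k1 .+. scale 2 k2 .+. scale 2 k3 .+. k4)
      where
        k1 = lorenz v
        k2 = lorenz (v .+. scale (h/2) k1)
        k3 = lorenz (v .+. scale (h/2) k2)
        k4 = lorenz (v .+. scale h k3)
        (a, b, c) .+. (d, e, f) = (a + d, b + e, c + f)
        scale s (a, b, c) = (s*a, s*b, s*c)

    main :: IO ()
    main = mapM_ print (take 5000 (iterate (rk4 0.01) (1, 1, 1)))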
Figure 1.2: The Lorenz attractor.
The Lorenz attractor is the prototypical example of a nonperiodic and
chaotic attractor. Let us take some time to explain what we mean by this.
By attractor we essentially mean a set with an open neighborhood such that
any trajectory that passes through this neighborhood approaches the attractor; that is, it has a large basin of attraction. In order for a set to be
an attractor it is also usually assumed that it cannot be decomposed into
smaller parts. This is true for the Lorenz attractor because it has a dense
orbit. By chaotic we mean that the system exhibits sensitive dependence on
initial conditions; any two points will eventually separate under time evolution of the system, no matter how close they initially started out. Sensitive
dependence on initial conditions epitomizes chaotic behavior in dynamical
systems. Finally, by nonperiodic we mean that the attractor is not simply a
periodic orbit. Typically this means that the appearance of the attractor is
very intricate and one glance at Figure 1.2 should be enough to convince
the reader that this certainly is the case for the Lorenz attractor. Note that a nonperiodic attractor can exhibit chaotic behavior (think of the doubling map on a circle) even though geometrically it will look very tame. See Milnor
(1985) for a classical discussion on the concept of attractor.
A rigorous mathematical analysis of the system (1.1) for the parameters
given above proved to be quite difficult. It was unknown for a very long
time whether (1.1) really did exhibit a nonperiodic chaotic attractor or if
computer generated pictures like Figure 1.2 on the preceding page were in
fact only showing solution curves approaching a very long stable periodic
orbit. A proof that the Lorenz attractor is nonperiodic and chaotic was
finally announced by Tucker (1998). The proof uses a computer to make
rigorous estimates that would take too long to verify by hand, much like
the first proof of the renormalization conjectures.
Long before the proof was announced it was already known that flows
having the same geometric characteristic as the flow of (1.1) do exhibit nonperiodic chaotic attractors. Such geometric Lorenz flows were introduced in
Guckenheimer (1976). These days the name Lorenz flow usually refers to a
geometrically defined flow, the construction of which we will now discuss
(a detailed description can be found in Guckenheimer and Williams, 1979).
Figure 1.3 on the following page illustrates the construction of a geometric Lorenz flow. Roughly speaking, it is defined by the associated vector
field X having an equilibrium point of saddle type at the origin. This saddle should have a two-dimensional stable manifold and a one-dimensional
unstable manifold. The vector field X is chosen to be linear near the origin
but away from the origin it is nonlinear so that it returns in a controlled
manner. Namely, there should be a two-dimensional domain S which intersects the stable manifold in a curve and any flow trajectory which hits S
outside this curve does so transversally and eventually returns to S. This
means that the first-return map F to S is a well-defined map off the stable manifold. To simplify the analysis of the geometric flow further, it is
assumed that there exists a foliation of S which is F–invariant and whose
leaves are exponentially contracted by F. Hence the first-return map acts
on the leaves of the foliation and by taking a quotient over leaves we are
left with an interval map which is undefined at the point corresponding to
the stable manifold. Such a map is called a Lorenz map.
Figure 1.3: Illustration of a geometric Lorenz flow. The origin O is an equilibrium point of saddle type. The plane cutting the figure in two is part
of the stable manifold W^s and the curves emanating left and right from O form the unstable manifold W^u. The flow is linear on the middle piece which
looks like an inverted T and nonlinear in the two hooks. The nonlinear part
forces the flow to return to the domain S. This ensures that the first-return
map to S is well-defined outside the line down the middle of S, which represents the intersection S ∩ W s . Points on this line tend to the origin. One
should also imagine that there is an invariant foliation of S, for example by
lines parallel to the line which cuts S in half.
Figure 1.4 on the next page shows the graphs of two Lorenz maps. The
discontinuity in this figure corresponds to the crossing of the domain S
and the stable manifold in the associated flow. This discontinuity appears
because trajectories which hit S on the right side of the stable manifold
will flow through the right hook, whereas trajectories to the left will flow
through the left hook (see Figure 1.3). On both one-sided neighborhoods
of the point of discontinuity a Lorenz map equals | x |α near the origin up to
coordinate changes in the domain and range. The number α is called the
critical exponent and is given by the absolute value of the ratio between the
weak stable and unstable eigenvalues at the saddle point of the associated
flow. In particular, α may assume any positive real value. An important
consequence of this is that if we want to be able to understand physical
systems (such as the Lorenz system) then we need to be able to analyze
maps having any positive real critical exponent, since nature does not have a preferred value for α. We stress this point since it explains why the search for renormalization tools that work for any critical exponent is not purely academic.

Figure 1.4: A Lorenz map for different values of the critical exponent α (α = 1/2 and α = 2).
Guckenheimer and Williams (1979) show that there is an open set of
vector fields on R3 with the structure of a geometric Lorenz flow (that is,
there are many such flows, intuitively speaking). They consider the case
α < 1 with the additional assumption that the Lorenz map is expanding in the sense that the derivative is bounded from below by √2. In this situation
they use symbolic dynamics on the Lorenz map to show that the associated
flow supports a nonperiodic chaotic attractor.
We will consider the case when α > 1 which is significantly harder to
analyze due to the presence of contraction around the point of discontinuity (think tent maps vs. unimodal maps). Instead of focusing on properties
of the associated flow we will concentrate on the Lorenz map itself, with
the understanding that results about Lorenz maps can be transferred to
results about Lorenz flows. The presence of contraction leads to a much
wider variety of dynamics than in the more traditional expanding case,
which makes these Lorenz maps an interesting study in their own right.
Perhaps more important is that we wish to further the understanding of
one-dimensional dynamical systems, and in this sense the study of Lorenz
maps can be seen as a natural next step now that unimodal dynamics is
well understood. Of course, Lorenz maps are also important due to their
connection with flows on R3 . The presence of a discontinuity introduces
significant new difficulties that do not appear for unimodal maps. We have
been forced to invent new tools to study the renormalization of Lorenz
maps and our hope has been to perhaps use the insight gained from this
to better understand unimodal renormalization for arbitrary critical exponents. Our idea of using decompositions (see the next section) to compute
the derivative of the renormalization operator seems promising in this respect.
1.4 Statement of results
A Lorenz map f on [0, 1] \ {c} is a monotone increasing differentiable map
which fixes 0 and 1, and which has a unique critical point at c ∈ (0, 1).
Note that c is not in the domain of f , so when saying that c is a critical
point of f what we mean is that D f ( x ) → 0 as x → c. Figure 1.5 on the
next page illustrates the graph of a typical Lorenz map. The critical point
has critical exponent α > 1 which will be fixed once and for all (noninteger α are permitted).[9] We will generally assume that f(c−) > c > f(c+), where

    f(c−) = lim_{x↑c} f(x)   and   f(c+) = lim_{x↓c} f(x)

denote the critical values of f. If f(c−) ≤ c or f(c+) ≥ c, then f is trivial, otherwise it is nontrivial.[10] Note that f maps its domain [0, 1] \ {c}
onto [0, 1] and that the inverse of f has two branches, unless f is trivial.
For the sake of simplicity we will assume that f is C 3 and has negative
Schwarzian derivative even though most of our arguments work, or can be
made to work, for f ∈ C^2.

[9] One could also consider Lorenz maps with different critical exponents on either side of the critical point. We choose not to pursue this generalization. Note that the first-return maps of geometric Lorenz flows have the same critical exponent on both sides of the critical point so this generalization is somewhat unnatural.
[10] All points of a trivial map converge to a fixed point under iteration, which explains the name trivial.

Figure 1.5: The graph of a Lorenz map on [0, 1] \ {c}.
Given an interval C ⊂ [0, 1] we define the first-return map f̃ to C for f by f̃(x) = f^{n(x)}(x), for x ∈ C \ {c}, where n(x) is the smallest positive integer such that f^{n(x)}(x) ∈ C.
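As a sketch, the first-return map can be computed by brute force (the helper below is hypothetical, and it assumes the orbit actually returns to C):

    -- First-return map to C = (l, r): iterate f until the orbit re-enters C,
    -- returning the return time n(x) together with the point f^{n(x)}(x).
    -- Does not terminate if the orbit never returns to C.
    firstReturn :: (Double -> Double) -> (Double, Double) -> Double
                -> (Int, Double)
    firstReturn f (l, r) x = go 1 (f x)
      where
        go n y
          | l < y && y < r = (n, y)
          | otherwise      = go (n + 1) (f y)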
A Lorenz map is renormalizable if there exists an interval C ⊂ (0, 1) containing c such that the first-return map to C is affinely conjugate to a nontrivial Lorenz map g on [0, 1] \ {c′} for some c′ ∈ (0, 1). The renormalization of f is then defined by R f = g, where C is assumed to be maximal for R to be well-defined. Note that c′ ≠ c in general; this turns out to be one
of the most difficult problems to handle for the renormalization of Lorenz
maps.
The type of renormalization is given by the pair of words ω = (ω − , ω + )
where
    ω−(i) = 0 if f^i(c−) < c,   and   ω−(i) = 1 if f^i(c−) > c,
    ω+(j) = 0 if f^j(c+) < c,   and   ω+(j) = 1 if f^j(c+) > c,
for i = 0, . . . , a and j = 0, . . . , b, and where a and b are the smallest positive
integers such that f a+1 (c− ) ∈ C and f b+1 (c+ ) ∈ C, respectively. The type
ω = (ω − , ω + ) is called monotone if ω − = 01 · · · 1 and if ω + = 10 · · · 0.
Note that the combinatorial description of Lorenz maps is simplified due
to the fact that Lorenz maps are increasing so there is no need to introduce
permutations as in the case of unimodal maps.
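In other words, ω− and ω+ record the itineraries of the two sides of the critical point. A minimal sketch (the helper name is hypothetical):

    -- The itinerary word of x0 under f relative to the critical point c:
    -- '0' when an iterate lies to the left of c, '1' when it lies to the
    -- right. Applied to points immediately to the left and right of c, with
    -- n = a+1 and n = b+1, this produces the pair of words (omega-, omega+).
    itinerary :: (Double -> Double) -> Double -> Int -> Double -> String
    itinerary f c n x0 =
      take n [ if x < c then '0' else '1' | x <- iterate f x0 ]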
If Rk f is renormalizable for k = 0, . . . , n − 1 then we say that f is n times
renormalizable; if this holds for all n ≥ 1, then we say that f is infinitely
renormalizable. In the latter case we define the combinatorial type of f to
be the sequence ω̄ = (ω0, ω1, . . . ) such that R^k f is ωk–renormalizable[11] for all k ≥ 0. If the lengths of both words of ωi are uniformly bounded (in i) then we say that f has bounded combinatorial type.

[11] "ω–renormalizable" is just shorthand for "renormalizable of type ω."
We will now fix a finite set Ω of monotone types satisfying the long
return condition of Section 3.1. Roughly speaking, Ω satisfies this condition
if
    ⌊α⌋ ≤ |ω−| − 1 ≤ ⌊2α − 1⌋   and   b− ≤ |ω+| − 1 ≤ b+,
for all ω = (ω − , ω + ) ∈ Ω. The constant b− must be sufficiently large
and b+ depends on the choice of b− . We emphasize that Ω essentially only
depends on b− , which needs to be chosen large (compared to 2α − 1) but
any b− works so long as it is not too small. It is also worth noting that b+
can in general be chosen much larger than b− so the set Ω is not small.
The condition on Ω may seem artificial, and in all honesty it is, but
it allows us to make some serious estimates. The idea is that increasing
the return time of one branch while keeping the other constant forces the
critical point (of the renormalization) to move into a corner and in this way
we get some control over where the critical point is. Also, by increasing
the return time we push one critical value closer to a fixed point in the
boundary which causes the derivative along the orbit of this critical value
to grow. The combination of control over the position of the critical point
and large return derivatives is what makes all our estimates work.
In Section 3.1 it is shown that there exists a relatively compact set K
such that if f ∈ K is twice renormalizable, then R f ∈ K. Note that here
and from now on when we say “renormalizable” we mean “renormalizable
of type ω with ω ∈ Ω,” unless we specify otherwise. Because of this result
we colloquially call K an “invariant set” even though it is not really invariant (since R f need not be twice renormalizable!). The proof is a series of
estimates that rely on the conditions on Ω; in fact, the conditions on Ω were
chosen so that these estimates work.
From this result we are immediately able to prove existence of the so-called a priori bounds (see Section 3.2):

Theorem A (A priori bounds). If f ∈ K is infinitely renormalizable with combinatorial type in Ω^N, then {R^n f}_{n≥0} is a relatively compact family.
The name a priori (real) bounds was coined by Sullivan (1992). In the
unimodal case Sullivan proves these bounds by employing a shortest interval argument; however, in the Lorenz case this argument breaks down
(essentially due to the fact that the critical point is not fixed under renormalization) which is why we have to work fairly hard for a proof.
The a priori bounds are a basis for understanding the structure of infinitely renormalizable maps. With them we can prove the following:
Theorem B (Infinitely renormalizable maps). If f ∈ K is infinitely renormalizable with combinatorial type in Ω^N, then f is ergodic and has no wandering
intervals. Furthermore, if Λ denotes the closure of the critical orbits of f , then:
1. Λ is a Cantor set with zero Lebesgue measure and Hausdorff dimension
in (0, 1),
2. the basin of Λ has full Lebesgue measure, and
3. Λ is uniquely ergodic.
The proof of ergodicity and nonexistence of wandering intervals uses
the concept of generalized renormalization introduced by Martens (1994).
Specifically, we adapt the definition of the weak Markov property to Lorenz
maps (see Section 2.2). Note that Lorenz maps may have wandering intervals in general, for example if they renormalize to a trivial map (regardless of whether the critical point is flat or not). Finding necessary conditions for a
Lorenz map not to have wandering intervals is still an open problem.
Since Lorenz maps have two critical orbits it is possible to construct
Lorenz maps with a Cantor attractor which supports two ergodic invariant
probability measures. In Section 2.3 we adapt the techniques of Gambaudo
and Martens (2006) to Lorenz maps and use this to construct such a map
as well as to show that bounded combinatorial type is sufficient but not
necessary for the Cantor attractor to be uniquely ergodic.
Having described the structure of the infinitely renormalizable maps
we turn to the more complex task of studying the renormalization operator
itself. Ultimately we want to describe the limit set of the renormalization
operator, but before we get there we address the question of existence of
periodic points (note that periodic points are contained in the limit set).
Martens (1998) proves the existence of periodic points for the renormalization operator on unimodal maps. This is done by studying the action
of the renormalization operator on the boundary of the domain of renormalization and then using a mapping degree argument to show existence
of finite dimensional approximate periodic points in a purely topological
way (the “bottom-down, top-up” lemma). The actual periodic points are
then found as limits of the approximate periodic points.
Figure 1.6: Illustration of the action of a renormalization operator R (unimodal, Lorenz, . . . ). The domain of R is the product of an n–dimensional
ball Bn and an infinite-dimensional space X. The action of R is to wrap the
domain around the outside of itself in the parameter directions. By Theorem A.1.1 there is a fixed point of R in the intersection of the two boxes.
This is just an intuitive picture, in general the domain might be more complicated (although in our situation it is essentially this simple).
A natural generalization of the “bottom-down, top-up” lemma which
works for any dimension of the parameter plane is given in Appendix A.1.
The general idea is that renormalization acts on a space Bn × X, where the
ball Bn ⊂ Rn represents parameter space and X is some infinite dimensional function space. The parameters in this situation are the critical values, so n = 1 for unimodal maps and n = 2 for Lorenz maps. The action
of renormalization on sections Bn × { x } is to stretch in the general direction of Bn and wrap the boundary Sn−1 × { x } around the outside of Bn × X
in such a way that the degree of this map is nonzero. Also, there should
be a bounded subset of X which is invariant under renormalization. See
Figure 1.6 for an illustration. This is a rough description of how both the
unimodal and Lorenz renormalization operators act and it seems natural
to think that other renormalization operators are similar. Theorem A.1.1 is
directly suited to this situation.
We apply this fixed point theorem to find periodic points of the renormalization operator. The relatively compact "invariant" set K allows us to
construct the periodic points directly without having to take limits of approximate periodic points like Martens (1998). This simplifies the argument
considerably. In Section 3.3 we prove the following result:
Theorem C (Periodic points of renormalization). The renormalization operator has a periodic point for every periodic combinatorial type (ω0, . . . , ωn−1)^∞ ∈ Ω^N.
Note that we only prove existence and not uniqueness (even though
presumably there is a unique periodic point for each combinatorial type).
After this we turn to studying the limit set of the renormalization operator. Let AΩ denote the maps with a complete past, and let BΩ denote the
maps with a complete future. That is, f ∈ AΩ if

    f = R_{ω−1} f−1,   f−1 = R_{ω−2} f−2,   f−2 = R_{ω−3} f−3,   . . . ,

for some left infinite sequence (. . . , f−2, f−1) of maps with ω−i ∈ Ω, and f ∈ BΩ if f is infinitely renormalizable with combinatorial type in Ω^N.[12]
Think of AΩ as the “attractor” for R. The limit set of renormalization is
the intersection ΛΩ = AΩ ∩ BΩ . We should perhaps not say the limit set
here since there are in fact many such sets (we can choose Ω in countably
many different ways!) but we think of Ω as being fixed so the terminology
should not cause any confusion.
Every f ∈ ΛΩ has a bi-infinite sequence of words
(. . . , ω−2, ω−1, ω0, ω1, ω2, . . . ),
associated to it as explained above; explicitly, Rn f is ωn –renormalizable,
for every n ∈ Z. Note that if f is associated with the sequence {ωi }i∈Z , then
R f is associated with the sequence {ωi+1 }i∈Z , so R shifts the sequence to
the left. This indicates that R should be conjugate to a shift operator, and
conjecturally R is conjugate to the full shift on symbols in Ω. Unfortunately, we cannot prove that each f ∈ ΛΩ has a unique bi-infinite sequence
associated with it, nor can we prove that each bi-infinite sequence has a
unique f ∈ ΛΩ associated with it, so we are unable to define the potential
conjugacy. However, R|ΛΩ is at least semi-conjugate to the one-sided shift
on symbols in Ω via the map which sends f ∈ BΩ to its combinatorial type.
[12] R_ω is shorthand for the restriction of R to the set of ω–renormalizable maps.
The above paragraph describes the topological structure of the limit set
but we are also able to say something about its differentiable structure.
Conjecturally, the limit set is a horseshoe, that is a hyperbolic set on which
R is conjugate to the full shift. We only prove “half” of the hyperbolicity
statement, namely that each point f ∈ ΛΩ has a global unstable manifold
W u ( f ). By global we essentially mean that the unstable manifold stretches
across all domains of renormalization, that is
    W^u(f) ∩ Lω̄ ≠ ∅   and   ∂W^u(f) ∩ Lω̄ = ∅,   for all ω̄ ∈ ⋃_{n>0} Ω^n,
where Lω̄ denotes the set of Lorenz maps which are renormalizable of type ω̄.[13]

[13] That is, if ω̄ = (ω0, . . . , ωn−1), then f ∈ Lω̄ if R^i f is ωi–renormalizable for i = 0, . . . , n − 1.
Theorem D (Renormalization horseshoe). The renormalization operator on
the limit set ΛΩ is semi-conjugate to the one-sided shift on symbols in Ω.
For every f ∈ ΛΩ there exists a unique global two-dimensional unstable manifold W u ( f ) which is C 1 .
The intersection of W u ( f ) with the renormalizable maps of a given type in Ω is
diffeomorphic to a square, the intersection with the infinitely renormalizable maps
of a given combinatorial type in ΩN is a point, and the union of all such points is
a Cantor set.
The proof of existence of unstable manifolds can be found in Section 5.4.
It is based on results in Section 5.3 where we show that there exists a cone
field which is invariant and expanded under the action of D R. In order to
prove this we compute the derivative of R in Section 5.1.
We are able to prove the uniqueness statement on the intersections of
renormalizable maps with the unstable manifolds by carefully looking at
the structure of the parameter plane of families of Lorenz maps (see Section 5.2). The parameters in this respect are essentially the critical values.
In order to compute the derivative of the renormalization operator we
introduce the machinery of decompositions (Martens, 1998). Since this represents a certain amount of effort let us discuss why we choose to take this
route. First of all, the renormalization operator is not differentiable on a
C k –space (since it is essentially just a composition operator) so we need
to make some restriction on the space in order to compute the derivative.
With decompositions we need not worry about this too much as the renormalization operator on decompositions contracts exponentially to a subset
where it is differentiable (see Proposition 4.2.8). The second, more fundamental problem is that there are “too many” directions in which to deform a general diffeomorphism thereby making estimates on the derivative
very difficult to handle. With decompositions there are “only” countably
many directions to deform in and, more importantly, all deformations are
monotone (in a sense that will be explained in Section 5.1) which makes the
estimates manageable. Any results obtained for decompositions can be automatically transferred back to Lorenz maps by composing as explained
in Section 4.1.
This concludes the results of Part I. In Part II we make something of a historical detour. We consider the renormalization operator Rω of the simplest possible nonunimodal[14] type, namely ω = (01, 100). We then recreate Lanford's computer-assisted proof of the existence of a fixed point of the period-doubling operator on unimodal maps and adapt it to Rω. Note that the results of Part I do not cover this case, so there is no overlap.

[14] Exactly what we mean by nonunimodal is explained in Remark 6.1.4.
Theorem E (Hyperbolic renormalization fixed point). Let ω = (01, 100). The restriction of Rω acting on a space of real analytic Lorenz maps with quadratic critical point has a fixed point f⋆. The derivative DRω(f⋆) at f⋆ is compact and has no eigenvalues on the unit circle.
This theorem has two shortcomings: (i) it does not say anything about
the number of unstable eigenvalues (i.e. the dimension of the unstable manifold), and (ii) there is no conclusion regarding the intersection of the unstable manifold with the bifurcation surfaces Σn as in the original renormalization conjectures. The first item is a problem with our proof, the second is
a shortcoming of Lanford’s method which can be corrected as in Eckmann
and Wittwer (1987) but we chose not to do this as it makes the computer estimates more difficult.
The proof basically amounts to turning Rω into a contraction (without
changing the set of fixed points) via a Newton iteration and then using a
variant of the contraction mapping theorem. The verification that the modified operator is a contraction uses a computer to make rigorous estimates.
We provide the source code used to make these estimates in Chapter 7.
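Schematically, in one dimension the same device looks as follows; this toy example is an illustrative assumption, not the operator from Chapter 6.

    -- To find a fixed point of r, pass to phi(x) = x - (r x - x)/(dr0 - 1),
    -- where dr0 approximates the derivative of r at the fixed point. For
    -- dr0 /= 1 the map phi has the same fixed points as r but is a strong
    -- contraction near them, so a contraction mapping theorem applies. The
    -- thesis performs the analogous construction on a space of analytic
    -- Lorenz maps and verifies the contraction with rigorous estimates.
    newtonLike :: (Double -> Double) -> Double -> (Double -> Double)
    newtonLike r dr0 x = x - (r x - x) / (dr0 - 1)

    -- Example: x = 2 cos x has a fixed point near 1.03, where the slope of
    -- 2 cos is roughly -1.7.
    main :: IO ()
    main =
      mapM_ print (take 8 (iterate (newtonLike (\x -> 2 * cos x) (-1.7)) 1.0))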
Given a copy of this chapter it is possible to feed it into a compiler and get
an executable that will perform the estimates. Of course, reading source
code is always rather difficult but we would like to stress that the amount
of code is small enough to include in its entirety complete with documentation.
Finally, in Section 6.2 we discuss consequences of the existence of a hyperbolic renormalization fixed point. Some of these topics have already
been discussed in Part I, such as the existence of a Cantor attractor for infinitely renormalizable maps. Hyperbolicity allows us to go a little further
and show that the conjugacy between the Cantor attractors of two infinitely
renormalizable maps of combinatorial type (01, 100)^∞ extends to a differentiable map whose derivative is Hölder continuous. This important result
is known as rigidity.
1.5 Previous results
Lorenz maps and geometric Lorenz flows were introduced by Guckenheimer (1976), but the first investigations of critical Lorenz maps seem to be
by Arneodo, Coullet, and Tresser (1981); Collet, Coullet, and Tresser (1985).
There is a vast literature on expanding Lorenz maps, probably because
these are the ones that arise naturally in the traditional Lorenz system, but
not much seems to have been published on critical Lorenz maps.
Martens and de Melo (2001) contains many results and ideas used in
this thesis. Their paper contains a proof of the full family theorem, a proof
of density of hyperbolicity, a description of quasi-conjugacy classes, as well
as a description of the archipelago structure of domains of renormalizability in the parameter plane.
Another source of results for Lorenz maps with a critical point is the
PhD thesis of St. Pierre (1999). It contains, among other things, a construction of Markov extensions on Lorenz maps, and an admissibility condition
for kneading invariants.
1.6 Future work
The most important piece that is missing from this thesis is the proof of
existence of a codimension two stable manifold at each point in the limit
set of renormalization. Some progress toward this result has been made
but is not yet completed.
Other than that it would of course be desirable to prove the results in
this thesis for any finite set Ω of renormalization types (without the restrictions imposed by monotone combinatorics and the long return condition).
However, this will require new methods as this work cannot handle the situation when both return times are comparable. It may however be possible
to use the idea of looking at pure decomposed maps in order to compute
the derivative of the renormalization operator for more general combinatorics.
It would be interesting to get a complete classification of which Lorenz
maps satisfy the weak Markov property as is done for unimodal maps in
Martens (1994). The fact that Lorenz maps have two independent cycles of
renormalization means that the shortest interval argument no longer works
which makes things a lot more difficult.
I think that it should be relatively straightforward to use the methods in
this thesis to give a description of the limit set of renormalization for unimodal maps with long return time. This has partly been done in Eckmann
et al. (1984) but their result only holds for critical exponent α = 2 and their
method does not generalize to arbitrary α > 1.
Part I

Renormalization of maps with long return time
Chapter 2

Preliminaries
This chapter serves as an introduction to Lorenz maps with adaptations of
some well known results for unimodal maps. In Section 2.1 we define the
renormalization operator on Lorenz maps and introduce notation that will
be used throughout the thesis. This is followed by a description of generalized renormalization for Lorenz maps in Section 2.2 which is used to derive
ergodicity and non-existence of wandering intervals. Finally, Section 2.3
discusses invariant measures on Cantor attractors for Lorenz maps.
2.1 The renormalization operator
In this section we define the renormalization operator on Lorenz maps and
introduce notation that will be used throughout.
Definition 2.1.1. The standard Lorenz family (u, v, c) ↦ Q is defined by

    Q(x) = u · (1 − ((c − x)/c)^α),             if x ∈ [0, c),
    Q(x) = 1 + v · (−1 + ((x − c)/(1 − c))^α),  if x ∈ (c, 1],      (2.1)

where u ∈ [0, 1], v ∈ [0, 1], c ∈ (0, 1), and α > 1. The parameter α is called the critical exponent and will be fixed once and for all.
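A direct transcription of (2.1), as a sketch with α = 2 hard-coded (this is not the representation used by the main program of Part II):

    -- The standard Lorenz family Q = Q(u, v, c); undefined at x == c.
    standardQ :: Double -> Double -> Double -> Double -> Double
    standardQ u v c x
      | x < c     = u * (1 - ((c - x) / c) ** alpha)
      | x > c     = 1 + v * (-1 + ((x - c) / (1 - c)) ** alpha)
      | otherwise = error "Q is undefined at the critical point"
      where alpha = 2    -- the critical exponent, fixed once and for all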
Remark 2.1.2. The parameters (u, v, c) are chosen so that: (i) u is the length
of the image of [0, c), (ii) v is the length of the image of (c, 1], (iii) c is the
critical point (which is the same as the point of discontinuity). Note that u and 1 − v are the critical values of Q.

Figure 2.1: Illustration of the graph of a (01, 1000)–renormalizable Lorenz map.
Definition 2.1.3. A C k –Lorenz map f on [0, 1] \ {c} is any map which can
be written as
    f(x) = φ ◦ Q(x), if x ∈ [0, c),
    f(x) = ψ ◦ Q(x), if x ∈ (c, 1],      (2.2)
where φ, ψ ∈ D k are orientation-preserving C k –diffeomorphisms on [0, 1],
called the diffeomorphic parts of f . See Figure 2.1 for an illustration of a
Lorenz map. The set of C k –Lorenz maps is denoted Lk ; the subset LS ⊂ L3
denotes the Lorenz maps with negative Schwarzian derivative (see Appendix A.3 for more information on the Schwarzian derivative).
A Lorenz map has two critical values which we denote
    c1− = lim_{x↑c} f(x)   and   c1+ = lim_{x↓c} f(x).
If c1+ < c < c1− then f is nontrivial, otherwise all points converge to some
fixed point under iteration and for this reason f is called trivial. Unless
otherwise noted, we will always assume all maps to be nontrivial.
We make the identification
    L^k = [0, 1]^2 × (0, 1) × D^k × D^k,
by sending (u, v, c, φ, ψ) to f defined by (2.2). Note that (u, v, c) defines
Q in (2.2) according to (2.1). For k ≥ 2 this identification turns Lk into a
subset of the Banach space R3 × D k × D k . Here D k is endowed with the
Banach space structure of C k−2 via the nonlinearity operator. In particular,
this turns Lk into a metric space. For k < 2 we turn Lk into a metric space
by using the usual C k metric on D k . See Appendix A.2 for more information
on the Banach space D k .
Remark 2.1.4. It may be worth emphasizing that for k ≥ 2 we are not using
the linear structure induced from C k on the diffeomorphisms D k . Explicitly,
if φ, ψ ∈ D k and N denotes the nonlinearity operator, then
    aφ + bψ = N^{−1}(aNφ + bNψ),   for all a, b ∈ R,

and

    ‖φ‖_{D^k} = ‖Nφ‖_{C^{k−2}}.

We call this norm on D^k the C^{k−2}–nonlinearity norm. The nonlinearity operator N : D^k → C^{k−2} is a bijection and is defined by

    Nφ(x) = D log Dφ(x).
See Appendix A.2 for more details on the nonlinearity operator.
We now define the renormalization operator for Lorenz maps.
Definition 2.1.5. A Lorenz map f is renormalizable if there exists an interval C ( [0, 1] (properly containing c) such that the first-return map to C is
affinely conjugate to a nontrivial Lorenz map. Choose C so that it is maximal with respect to these properties. The first-return map affinely rescaled
to [0, 1] is called the renormalization of f and is denoted R f . The operator R which sends f to its renormalization is called the renormalization
operator.
Explicitly, if f is renormalizable then there exist minimal positive integers a and b such that the first-return map f̃ to C is given by

    f̃(x) = f^{a+1}(x), if x ∈ L,
    f̃(x) = f^{b+1}(x), if x ∈ R,
where L and R are the left and right components of C \ {c}, respectively.
The renormalization of f is defined by
    R f(x) = h^{−1} ◦ f̃ ◦ h(x),   x ∈ [0, 1] \ {h^{−1}(c)},
where h : [0, 1] → C is the affine orientation-preserving map taking [0, 1]
to C. Note that C is chosen maximal so that R f is uniquely defined.
Remark 2.1.6. We would like to emphasize that the renormalization is assumed to be a nontrivial Lorenz map. It is possible to define the renormalization operator for maps whose renormalization is trivial but we choose
not to include these in our definition. Such maps can be thought of as degenerate and including them makes some arguments more difficult which
is why we choose to exclude them.
Next, we wish to describe the combinatorial information encoded in a
renormalizable map.
Definition 2.1.7. A branch of f n is a maximal open interval B on which f n
is monotone (here maximality means that if A is an open interval which
properly contains B, then f n is not monotone on A).
To each branch B of f n we associate a word w( B) = σ0 · · · σn−1 on symbols {0, 1} by
    σj = 0 if f^j(B) ⊂ (0, c),   and   σj = 1 if f^j(B) ⊂ (c, 1),
for j = 0, . . . , n − 1.
Definition 2.1.8. Assume f is renormalizable and let a, b, L and R be as
in Definition 2.1.5. The forward orbits of L and R induce a pair of words
ω = (w( L̂), w( R̂)) called the type of renormalization, where L̂ is the branch
of f a+1 containing L and R̂ is the branch of f b+1 containing R. In this situation we say that f is ω–renormalizable. See Figure 2.2 on the facing page
for an illustration of these definitions.
Figure 2.2: Illustration of the dynamical intervals of a Lorenz map which is
ω–renormalizable, with ω = (011, 100000), a = 2, b = 5.
Let ω̄ = (ω0 , ω1 , . . . ). If Rn f is ωn –renormalizable for n = 0, 1, . . . , then
we say that f is infinitely renormalizable and that f has combinatorial type
ω̄. If the length of both words of ωk is uniformly bounded in k, then f is
said to have bounded combinatorial type.
The set of ω–renormalizable Lorenz maps is denoted L_ω. We will use variations of this notation as well; for ω̄ = (ω_0, …, ω_{n−1}) we let L_ω̄ denote the set of Lorenz maps f such that R^i f is ω_i–renormalizable, for i = 0, …, n − 1, and similarly if n = ∞. Furthermore, if Ω is a set of types of renormalization, then L_Ω denotes the set of Lorenz maps which are ω–renormalizable for some ω ∈ Ω.
We will almost exclusively restrict our attention to monotone combinatorics, that is renormalizations of type
$$\omega = (0\,\overbrace{1\cdots 1}^{a},\ 1\,\overbrace{0\cdots 0}^{b}).$$
In what follows we will need to know how the five-tuple representation of a Lorenz map changes under renormalization. It is not difficult to
write down the formula for any type of renormalization but it becomes a
bit messy so we restrict ourselves to monotone combinatorics. However,
first we need to introduce the zoom operator.
Definition 2.1.9. The zoom operator Z takes a diffeomorphism and rescales
it affinely to a diffeomorphism on [0, 1]. Explicitly, let g be a map and I an
interval such that g| I is an orientation-preserving diffeomorphism. Define
$$Z(g; I) = \zeta_{g(I)}^{-1}\circ g\circ\zeta_I,$$
where ζ A : [0, 1] → A is the orientation-preserving affine map which takes
[0, 1] onto A. See Appendix A.2 for more information on zoom operators.
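A minimal implementation of the zoom operator, for illustration only (the function names are ours):

```python
def zeta(A):
    """The orientation-preserving affine map zeta_A : [0,1] -> A = (a0, a1),
    returned together with its inverse."""
    a0, a1 = A
    return (lambda t: a0 + (a1 - a0) * t), (lambda y: (y - a0) / (a1 - a0))

def zoom(g, I):
    """Z(g; I) = zeta_{g(I)}^{-1} o g o zeta_I, assuming g restricted to I
    is an orientation-preserving diffeomorphism onto g(I)."""
    to_I, _ = zeta(I)
    _, from_gI = zeta((g(I[0]), g(I[1])))    # endpoints of g(I), in order
    return lambda t: from_gI(g(to_I(t)))
```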
30
Chapter 2. Preliminaries
Remark 2.1.10. The terminology “zoom operator” is taken from Martens
(1998), but our definition is somewhat simpler since we only deal with
orientation-preserving diffeomorphisms. We will use the words ‘rescale’
and ‘zoom’ synonymously.
Lemma 2.1.11. If f = (u, v, c, φ, ψ) is renormalizable of monotone combinatorics, then Rf = (u′, v′, c′, φ′, ψ′) is given by
$$u' = \frac{|Q(L)|}{|U|}, \qquad v' = \frac{|Q(R)|}{|V|}, \qquad c' = \frac{|L|}{|C|}, \qquad \varphi' = Z(f_1^a\circ\varphi;\,U), \qquad \psi' = Z(f_0^b\circ\psi;\,V),$$
where U = φ^{-1} ∘ f_1^{-a}(C) and V = ψ^{-1} ∘ f_0^{-b}(C).
Proof. This follows from two properties of zoom operators: (i) the map q(x) = x^α on [0, 1] is 'fixed' under zooming on intervals adjacent to the critical point, that is Z(q; (0, t)) = q for t ∈ (0, 1) (technically speaking we have not defined Z in this situation, but applying the formula for Z will give this result), and (ii) zoom operators satisfy Z(h ∘ g; I) = Z(h; g(I)) ∘ Z(g; I).
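Both properties are easy to check numerically with the zoom sketch above; the maps and intervals below are arbitrary sample choices of ours.

```python
import numpy as np

alpha, t = 2.0, 0.3
q = lambda s: s ** alpha
xs = np.linspace(0.01, 0.99, 99)

# (i) q is 'fixed' under zooming on intervals adjacent to the critical point:
Zq = zoom(q, (0.0, t))
print(np.max(np.abs(Zq(xs) - q(xs))))        # ~ 0 up to rounding

# (ii) Z(h o g; I) = Z(h; g(I)) o Z(g; I) for sample diffeomorphisms:
g = lambda s: s / (2.0 - s)
h = lambda s: (np.exp(s) - 1) / (np.e - 1)
I = (0.2, 0.7)
lhs = zoom(lambda s: h(g(s)), I)
rhs = lambda s: zoom(h, (g(I[0]), g(I[1])))(zoom(g, I)(s))
print(np.max(np.abs(lhs(xs) - rhs(xs))))     # ~ 0
```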
Notation. The notation introduced in this section will be used repeatedly
throughout. Here is a quick summary.
A Lorenz map is denoted either f or (u, v, c, φ, ψ) and these two notations are used interchangeably. Sometimes we write f_0 or f_1 to specify that we are talking about the left or right branch of f, respectively. Similarly, when talking about the inverse branches of f, we write f_0^{-1} and f_1^{-1}. The subscript notation is also used for the standard family Q (so Q_0 denotes the left branch, etc.).
A Lorenz map has one critical point c and two critical values which we denote c_1^- = lim_{x↑c} f(x) and c_1^+ = lim_{x↓c} f(x). The critical exponent is denoted α and is always assumed to be fixed to some α > 1.
In general we use primes for variables associated with the renormalization of f. For example (u′, v′, c′, φ′, ψ′) = Rf. Sometimes we use parentheses instead of primes, for example c_1^-(Rf) denotes the left critical value of Rf. In order to avoid confusion, we try to use D consistently to denote derivative instead of using primes.
With a renormalizable f we associate a return interval C such that C \ {c} has two components which we denote L and R. We use the notation a + 1 and b + 1 to denote the return times of the first-return map to C from L and R, respectively. The letters U and V are reserved to denote the pull-backs of C as in Lemma 2.1.11. We let U_1 = φ(U), U_{i+1} = f^i(U_1) for i = 1, …, a, and V_1 = ψ(V), V_{j+1} = f^j(V_1) for j = 1, …, b (note that U_{a+1} = C = V_{b+1}). We call {U_i} and {V_j} the cycles of renormalization.
2.2 Generalized renormalization
In this section we adapt the idea of generalized renormalization introduced
by Martens (1994). The central concept is the weak Markov property which
is related to the distortion of the monotone branches of iterates of a map.
Definition 2.2.1. An interval C is called a nice interval of f if: (i) C is open,
(ii) the critical point of f is contained in C, and (iii) the orbit of the boundary
of C is disjoint from C.
Remark 2.2.2. A ‘nice interval’ is analogous to a ‘nice point’ for unimodal
maps (see Martens, 1994). The difference is that for unimodal maps one
point suffices to define an interval around the critical point (the ‘other’
boundary point is a preimage of the first), whereas for Lorenz maps the
boundary points of a nice interval are independent. The term ‘nice’ is perhaps a bit vague but its use has become established by now.
Definition 2.2.3. Fix f and a nice interval C. The transfer map to C induced by f,
$$T : \bigcup_{n\ge 0} f^{-n}(C) \to C,$$
is defined by T(x) = f^{τ(x)}(x), where
$$\tau : \bigcup_{n\ge 0} f^{-n}(C) \to \mathbb{N}$$
is the transfer time to C; that is, τ(x) is the smallest nonnegative integer n such that f^n(x) ∈ C.
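For illustration, the transfer time can be computed by naive forward iteration; this sketch (ours) returns None when the orbit fails to enter C within a cutoff, which is how points outside the domain of T manifest numerically.

```python
def transfer_time(f, x, C, max_iter=10 ** 6):
    """Smallest n >= 0 with f^n(x) in C = (c0, c1), or None if the orbit
    does not enter C within max_iter steps (x may lie outside dom T)."""
    c0, c1 = C
    for n in range(max_iter):
        if c0 < x < c1:
            return n
        x = f(x)
    return None

# The transfer map itself is then T(x) = f^{tau(x)}(x).
```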
Remark 2.2.4. Note that: (i) the domain of T is open, since C is open by assumption, and f^{-1}(U) is open if U is open (even if U contains a critical value), since the point of discontinuity of f is not in the domain of f, (ii) T is defined on C and T|_C equals the identity map on C.
Proposition 2.2.5. Let T be the transfer map of f to a nice interval C. If I is a component of the domain of T, then τ|_I is constant and I is mapped monotonically onto C by f^{τ(I)}. Furthermore I, f(I), …, f^{τ(I)}(I) are pairwise disjoint.
Remark 2.2.6. This means in particular that the components of the domain
of T are the same as the branches of T. In what follows we will use the
terminology “a branch of T” interchangeably with “a component of the
domain of T”.
Proof. If I = C then the proposition is trivial since T|_C is the identity map on C, so assume that I ≠ C.
Pick some x ∈ I and let n = τ(x). Note that n > 0 since I ≠ C. We claim that the branch B of f^n containing x is mapped over C. From this it immediately follows that τ|_I = n and f^n(I) = C.
Since f^n|_B is monotone and f^n(x) ∈ C it suffices to show that f^n(∂B) ∩ C = ∅. To this end, let y ∈ ∂B. Then there exists 0 ≤ i < n such that f^i(y) ∈ {0, c, 1}.
If f^i(y) ∈ {0, 1} then we are done, since these points are fixed by f. So assume that f^i(y) = c and let J = (x, y). Then f^i(J) ∩ ∂C ≠ ∅ since f^i(x) ∉ C by minimality of τ(x). Consequently f^n(y) ∉ C, otherwise f^n(J) ⊂ C which would imply f^{n−i}(∂C) ∩ C ≠ ∅. But this is impossible since C is nice and hence the claim follows.
From τ(I) = n it follows that I, …, f^n(I) are pairwise disjoint. Suppose not; then J = f^i(I) ∩ f^j(I) is nonempty for some 0 ≤ i < j ≤ n. But then the transfer time on I ∩ f^{-i}(J) is at most i + (n − j), which is strictly smaller than n, and this contradicts the fact that τ(I) = n.
Proposition 2.2.7. Assume that f has no periodic attractors and that Sf < 0. Let T be the transfer map of f to a nice interval C. Then the complement of the domain of T is a compact, f–invariant and hyperbolic set (and consequently it has zero Lebesgue measure).
Proof. Let U = dom T and let Γ = [0, 1] \ U. Since U is open, Γ is closed and hence compact (since it is obviously bounded). By definition f^{-1}(U) ⊂ U, which implies f(Γ) ⊂ Γ.
We can characterize Γ as the set of points x such that f^n(x) ∉ C for all n ≥ 0. Since Sf < 0 it follows that f cannot have nonhyperbolic periodic points (Misiurewicz, 1981, Theorem 1.3) and by assumption f has no periodic attractors so Γ must be hyperbolic (de Melo and van Strien, 1993, Lemma III.2.1).¹
Finally, it is well known that a compact, invariant and hyperbolic set has zero Lebesgue measure if f is at least C^{1+Hölder} (de Melo and van Strien, 1993, Theorem III.2.6).¹
Definition 2.2.8. A map f is said to satisfy the weak Markov property if
there exists a K < ∞ and a decreasing sequence C1 ⊃ C2 ⊃ · · · of nice intervals whose lengths tend to 0, such that the transfer map to Cn is defined
almost everywhere and has distortion bounded by K, for every n.
Remark 2.2.9. That a “transfer map T has distortion bounded by K” is simply a convenient way of saying that T | B has distortion bounded by K, for
every branch B of T.
Theorem 2.2.10. If f satisfies the weak Markov property, then f has no wandering
intervals.
Proof. In order to reach a contradiction assume that there exists a wandering interval W which is not contained in a strictly larger wandering interval.
Note that W must accumulate on at least one side of c. Otherwise there would exist an interval I disjoint from the orbit of W with c ∈ cl I. We could then modify f on I in such a way that the resulting map would be a bimodal C²–map with nonflat critical points and W would still be a wandering interval for the modified map, see Figure 2.3. However, such maps do not have wandering intervals (Martens et al., 1992).
Now let {C_k} be the sequence of nice intervals that we get from the weak Markov property. Since W accumulates on at least one side of c there exists a sequence of nonnegative integers {n_k} such that f^{n_k}(W) ⊂ C_k. Let B_k be the branch of f^{n_k} containing W.
The weak Markov property now shows that the distortion of f^{n_k} : W → C_k is uniformly bounded (in k), so there exists a δ > 0 (independent of k)
¹ The theorems from de Melo and van Strien (1993) that are referenced in this proof are stated for maps whose domain is an interval but their proofs go through, mutatis mutandis, for Lorenz maps.
Figure 2.3: Illustration showing why wandering intervals must accumulate on the critical point. If f has a wandering interval whose orbit does not intersect some (one-sided) neighborhood I of the critical point, then by modifying f on I according to the gray curve we create a bimodal map with a wandering interval. This is impossible since bimodal maps with nonflat critical points do not have wandering intervals.
such that f^{n_k}(B_k) contains a δ–scaled neighborhood of C_k. This Koebe space can be pulled back and by applying the Macroscopic Koebe Lemma we see that B_k contains a δ′–scaled neighborhood of W, for every k (where δ′ only depends on δ).
The above argument shows that B = ∩_k B_k strictly contains W. Since c ∉ B_k for any k by definition, we also have that B is a wandering interval. Thus B is a wandering interval which strictly contains the wandering interval W, but this contradicts the maximality of W. Hence f cannot have wandering intervals.
Theorem 2.2.11. If f satisfies the weak Markov property, then f is ergodic.
Proof. In order to reach a contradiction, assume that there exist two invariant sets X and Y such that | X | > 0, |Y | > 0 and | X ∩ Y | = 0. Let {Ck } be
the sequence of nice intervals that we get from the weak Markov property.
We claim that
$$\frac{|X \cap C_k|}{|C_k|} \to 1 \quad\text{and}\quad \frac{|Y \cap C_k|}{|C_k|} \to 1, \qquad\text{as } k \to \infty.$$
Thus we arrive at a contradiction since this shows that |X ∩ Y| > 0.
Let Γ_k be the complement of the domain of the transfer map to C_k. By the weak Markov property |Γ_k| = 0, hence ∪_k Γ_k also has zero measure. This and the assumption that |X| > 0 implies that there exists a density point x which lies in X as well as in the domain of the transfer map to C_k, for every k.
Let B_k be the branch of the transfer map to C_k containing x, and let τ_k be the transfer time for B_k. We contend that |B_k| → 0. If not, there would exist a subsequence {k_i} such that B = ∩_i B_{k_i} had positive measure, and thus B would be contained in a wandering interval (which is impossible by Theorem 2.2.10). Here we have used that C_k is a nice interval so the orbit of B_k satisfies the disjointness property of Proposition 2.2.5.
Since f^{τ_k}(B_k) = C_k, since f(X) ⊂ X, and since there exists a K < ∞ bounding the distortion of each transfer map we get
$$\frac{|C_k \setminus X|}{|C_k|} \le \frac{|f^{\tau_k}(B_k \setminus X)|}{|f^{\tau_k}(B_k)|} \le K\,\frac{|B_k \setminus X|}{|B_k|} \to 0, \qquad\text{as } k \to \infty.$$
The last step follows from x being a density point, since | Bk | → 0.
Now apply the same argument to Y and the claim follows.
2.3 Invariant measures
Let f be an infinitely renormalizable map of any combinatorial type² and
let Λ be the closure of the orbits of the critical values and assume that this
is a Cantor set (in Section 3.2 we prove that Λ is a Cantor set, although
our proof is only valid for some combinatorial types). In this section we
describe the invariant measures supported on Λ. The techniques employed
are an adaptation of (Gambaudo and Martens, 2006) which also contains
proofs for all the statements we choose not to provide proofs for here.
² Contrary to other sections, in this section we consider general combinatorial types and not just monotone types.
Theorem 2.3.1. If f is infinitely renormalizable (of any combinatorial type) with
a Cantor attractor Λ, then Λ supports one or two ergodic invariant probability
measures.
If the combinatorial type of f is bounded then Λ is uniquely ergodic. Furthermore, it is possible to choose the combinatorial type so that Λ has two distinct
ergodic invariant probability measures.
Remark 2.3.2. Bounded combinatorial type is sufficient for Λ to be uniquely
ergodic, but is not necessary as Example 2.3.13 shows.
An infinitely renormalizable map naturally defines a sequence of finer
and finer covers of Λ. We now describe the construction of these covers
and how they in turn can be identified with certain directed graphs.
Definition 2.3.3. Since f is infinitely renormalizable we get a nested sequence of nice intervals {C_n}. Let a_n = τ(c_1^-) and b_n = τ(c_1^+) denote the transfer times of the critical values to the nice interval C_n. Let {U_n^i} and {V_n^i} be the pull-backs of C_n along the orbits of the critical values, that is
$$f^{i-1}(c_1^-) \in U_n^i \quad\text{and}\quad f^{a_n+1-i}(U_n^i) = C_n, \qquad i = 1, \ldots, a_n + 1,$$
$$f^{i-1}(c_1^+) \in V_n^i \quad\text{and}\quad f^{b_n+1-i}(V_n^i) = C_n, \qquad i = 1, \ldots, b_n + 1.$$
The intervals {U_n^i} and {V_n^j} cover the Cantor set Λ and they satisfy a disjointness property expressed by the following lemma. Intuitively, for a fixed n these sets are pairwise disjoint except that if they overlap at some time, then all remaining intervals follow the same orbit.
Lemma 2.3.4. There exists k_n ≥ 0 such that U_n^{a_n+1-i} = V_n^{b_n+1-i} for i = 0, …, k_n, and
$$\Lambda_n' = \{U_n^i\}_{i=1}^{a_n - k_n} \cup \{V_n^i\}_{i=1}^{b_n+1}$$
is a pairwise disjoint cover of Λ for all n.
Remark 2.3.5. Note that k_n = 0 if f has monotone combinatorial type.
Proof. Since C_n is nice it follows that if U_n^{a_n+1-k} ∩ V_n^{b_n+1-k} ≠ ∅ for some k, then U_n^{a_n+1-i} = V_n^{b_n+1-i} for all i = 0, …, k. Define k_n to be the largest such k (which exists since U_n^{a_n+1} = C_n = V_n^{b_n+1}).
By Proposition 2.2.5 {U_n^i}_i is a pairwise disjoint collection and so is {V_n^i}_i, which proves the disjointness property of Λ_n'.
Finally, Λ_n' covers Λ since the critical values are contained in U_n^1 and V_n^1, both of which are eventually mapped inside C_n.
Definition 2.3.6. The n–th cover of Λ is defined by
$$\Lambda_n = \{I \cap \Lambda \mid I \in \Lambda_n'\}.$$
The covers come with natural projections
$$\pi_{ij} : \Lambda_j \to \Lambda_i, \qquad i \le j,$$
defined by I = π_{ij}(J), where I ∈ Λ_i is the unique element which contains J ∈ Λ_j. These projections satisfy π_{ii} = id and π_{ij} = π_{ik} ∘ π_{kj} if i ≤ k ≤ j. Hence we get an inverse system ({Λ_i}_i, {π_{ij}}_{i≤j}). The inverse limit of this system can be identified with Λ via the natural projections
$$p_n : \Lambda \to \Lambda_n,$$
defined by p_n(x) = I, where I ∈ Λ_n is the unique element containing x ∈ Λ. Explicitly, p : Λ → lim← Λ_n is defined by
$$p(x) = (p_1(x), p_2(x), \ldots).$$
Remark 2.3.7. Note that p is: (i) well-defined, since p_i = π_{ij} ∘ p_j, (ii) surjective, since if (x_i)_i ∈ lim← Λ_n then x = lim x_i exists and p(x) = (x_1, x_2, …), (iii) injective, since if x, y ∈ Λ are distinct then p_n(x) ≠ p_n(y) for some n due to the fact that the diameter of elements in Λ_n tends to zero as n → ∞.
We think of Λ_n as the directed graph where each element in Λ_n is a node and where an edge connects I to J if and only if f(I) ∩ J ≠ ∅. Note that if there is an edge from I to J, then f(I) = J unless I = Z_n = C_n ∩ Λ. The image of Z_n is contained in the nodes E_n^1 = U_n^1 ∩ Λ and E_n^2 = V_n^1 ∩ Λ. For example, Λ_n could look like:
[Diagram: a directed graph on the nodes E_n^1 = U_n^1 ∩ Λ, U_n^2 ∩ Λ, U_n^3 ∩ Λ = V_n^5 ∩ Λ, V_n^4 ∩ Λ, V_n^3 ∩ Λ, V_n^2 ∩ Λ, E_n^2 = V_n^1 ∩ Λ and Z_n = C_n ∩ Λ.]
We now describe how the invariant measures on Λ can be identified with a subset of an inverse limit of "almost invariant" measures on Λ_n. We define the "almost invariant" signed measures on Λ_n as follows.
Definition 2.3.8. Let Σ_n be the σ–algebra generated by Λ_n. Note that Λ_n consists of exactly two loops, by which we mean a maximal collection {I_k ∈ Λ_n}_k such that f(I_k) = I_{k+1} or I_k = Z_n. These loops start in E_n^i and terminate in Z_n. Each loop defines a loop measure,
$$\nu_n^i : \Sigma_n \to \mathbb{R},$$
by ν_n^i(I) = 1 if and only if I is a node on the i–th loop of Λ_n.
Let H_1(Λ_n) denote the R–module generated by {ν_n^1, ν_n^2} and let M(Λ) denote the R–module of signed invariant measures on Λ.
Remark 2.3.9. Note that the (signed) measures in H_1(Λ_n) are almost invariant since ν_n^i(f^{-1}(I)) = ν_n^i(I) if I ∈ Λ_n \ {E_n^1, E_n^2}, but invariance fails since f^{-1}(E_n^i) ∉ Σ_n. The notation H_1(Λ_n) comes from the fact that H_1(Λ_n) is isomorphic to the first homology module of the graph Λ_n.
Now consider the push-forward (p_n)_* μ = μ ∘ p_n^{-1} of μ ∈ M(Λ) under the projection p_n:
Lemma 2.3.10. The push-forward of p_n is a homomorphism (p_n)_* : M(Λ) → H_1(Λ_n) and
$$(p_n)_*(\mu) = \mu(E_n^1)\,\nu_n^1 + \mu(E_n^2)\,\nu_n^2.$$
The push-forward under the projections π_{ij} induces an inverse system
$$(\{H_1(\Lambda_i)\}_i,\ \{(\pi_{ij})_*\}_{i\le j}).$$
Because of the previous lemma and the identification of Λ with lim← Λ_n, the inverse limit lim← H_1(Λ_n) is isomorphic to M(Λ) via the isomorphism
$$p_*(\mu) = \big((p_1)_*\mu,\ (p_2)_*\mu,\ \ldots\big).$$
Definition 2.3.11. Let I(Λ) ⊂ M(Λ) denote the subset of positive invariant measures.
Using the above one can check that I(Λ) is identified with lim← H_1^+(Λ_n), where
$$H_1^+(\Lambda_n) = \{x_1\nu_n^1 + x_2\nu_n^2 \mid x_i \ge 0\}$$
are the (positive) almost invariant measures on Λ_n.
Lemma 2.3.12. Let W_n = (w_{ij}) be the winding matrix defined by
$$w_{ij} = \#\{I \subset E_n^i \mid I \text{ is an element of the } j\text{–th loop of } \Lambda_{n+1}\}.$$
Then W_n is the representation of the push-forward (π_{n,n+1})_* if we use the loop measures {ν_k^1, ν_k^2} as bases of H_1(Λ_k), for k = n, n + 1.
Proof of Theorem 2.3.1. By the above every invariant measure is represented by an inverse limit
$$\{(z_1, z_2, \ldots) \mid z_i = W_i z_{i+1},\ z_i \in K\},$$
where K ⊂ R² is the cone {(x_1, x_2) | x_i ≥ 0}. This suggests that we should look at the sets
$$I_n = \bigcap_{m > n} W_n \cdots W_{m-1} K,$$
since z_n ∈ I_n. The winding matrices have positive integer entries, so either I_n is a one-dimensional or a two-dimensional subspace, for all n ≥ 1.
If In has dimension two then it is the convex hull of two points and
hence In ∩ { x1 + x2 = 1} has exactly two extremal points. This implies that
there are two ergodic invariant probability measures. Since I_n cannot have dimension higher than two, this also shows that there can be no more than two ergodic invariant probability measures. In Example 2.3.13 we
construct a map with exactly two ergodic invariant probability measures.
To see that Λ is uniquely ergodic in the bounded combinatorial type
case we introduce the Hilbert metric (also known as the hyperbolic metric)
on the interior of K. This metric is defined as in the following figure:
x2
w
n
d(z, z0 ) = log 1 +
z
z0
w0
x1
|z−z0 ||w−w0 |
|w−z||z0 −w0 |
o
Here w and w′ are the points on ∂K closest to z and z′, respectively, on the line through z and z′. Note that d(z, z′) equals the log of one plus the cross-ratio of (w, z, z′, w′). The cross-ratio is well-defined since these four points are collinear. The Hilbert metric is contracted by positive matrices because: (i) linear maps preserve cross-ratio, (ii) WK ⊂ K if W is a positive matrix, and (iii) the cross-ratio of (w, z, z′, w′) decreases if w is moved further away from z on the line through these four points (and similarly for w′ and z′).
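For the two-dimensional cone K the Hilbert metric reduces to a ratio of coordinates, and contraction by a positive matrix is easy to observe numerically. The following sketch (ours; the matrix entries are arbitrary samples) compares the empirical contraction factor with the exact bound that appears in Example 2.3.13 below.

```python
import numpy as np

def hilbert_dist(z, w):
    """Hilbert metric on the interior of K = {x1, x2 >= 0}; for the planar
    cone it reduces to this ratio form, equivalent to the cross-ratio
    definition above."""
    return abs(np.log((z[0] * w[1]) / (z[1] * w[0])))

def empirical_contraction(W, trials=1000):
    """Worst observed ratio d(Wz, Wz') / d(z, z') over random interior pairs."""
    rng = np.random.default_rng(0)
    worst = 0.0
    for _ in range(trials):
        z, w = rng.uniform(0.1, 10.0, size=(2, 2))
        d0 = hilbert_dist(z, w)
        if d0 > 1e-12:
            worst = max(worst, hilbert_dist(W @ z, W @ w) / d0)
    return worst

a, b = 5, 7                                       # sample matrix entries
W = np.array([[1.0, b], [a, 1.0]])
k = (np.sqrt(a * b) - 1) / (np.sqrt(a * b) + 1)   # exact bound (Bushell, 1973)
print(empirical_contraction(W), k)                # empirical <= exact
```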
If f has bounded combinatorial type, then there is a bound on the contraction constant for the winding matrices Wn which is independent of n.
This implies that In is one-dimensional and hence there is a unique ergodic
probability measure.
Example 2.3.13. If f is of monotone combinatorial type {(ω_n^-, ω_n^+)}_{n≥0} with a_n = |ω_n^-| − 1 and b_n = |ω_n^+| − 1, then we can compute the winding matrix:
$$W_n = \begin{pmatrix} 1 & b_{n+1} \\ a_{n+1} & 1 \end{pmatrix}.$$
Thus
$$W_{n-1} W_n = \begin{pmatrix} 1 + a_{n+1} b_n & b_n + b_{n+1} \\ a_n + a_{n+1} & 1 + a_n b_{n+1} \end{pmatrix} = a_n b_n \begin{pmatrix} s_n + (a_n b_n)^{-1} & a_n^{-1}(1 + r_n) \\ b_n^{-1}(1 + s_n) & r_n + (a_n b_n)^{-1} \end{pmatrix},$$
where s_n = a_{n+1}/a_n and r_n = b_{n+1}/b_n. Assuming a_n, b_n → ∞ we see that the above matrix modulo the multiplicative term is asymptotically equal to
$$\begin{pmatrix} s_n & 0 \\ 0 & r_n \end{pmatrix}.$$
If we let
$$I_n^m = \bigcap_{i=n}^{m} W_n W_{n+1} \cdots W_i K,$$
then I_n^m is two-dimensional for all m if we let a_n and b_n grow sufficiently fast. Thus the sets I_n in the proof of Theorem 2.3.1 will be two-dimensional, giving rise to two ergodic invariant probability measures.
It is possible to compute the contraction constant of the Hilbert metric in this example (see Bushell, 1973). It is given by
$$k_{n-1} = \frac{\sqrt{a_n b_n} - 1}{\sqrt{a_n b_n} + 1}.$$
This constant is an exact bound on the contraction, meaning that there exist x, y such that d(W_n x, W_n y) = k_n d(x, y).
By choosing a_n and b_n such that ∏ k_n = 0 we get that the sets I_n are lines, which means that Λ is uniquely ergodic. In particular, we could choose {a_n} and {b_n} unbounded but growing slowly enough for ∏ k_n = 0 to hold, hence showing that bounded combinatorial type is not necessary for Λ to be uniquely ergodic.
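Numerically, the dichotomy in this example is visible directly from the product of the contraction constants; the growth rates below are our sample choices.

```python
import numpy as np

def k_factor(a, b):
    """Exact Hilbert-metric contraction factor of W = [[1, b], [a, 1]]."""
    s = np.sqrt(a * b)
    return (s - 1) / (s + 1)

# Unbounded but slowly growing return times: the product of the k_n tends
# to 0, so the attractor is uniquely ergodic despite the unbounded type.
slow = np.prod([k_factor(n, n) for n in range(2, 2000)])

# Very fast growing return times: the partial products stay bounded away
# from 0, leaving room for two ergodic invariant probability measures.
fast = np.prod([k_factor(2.0 ** 2 ** n, 2.0 ** 2 ** n) for n in range(1, 6)])

print(slow, fast)    # slow ~ 0, fast clearly positive
```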
Chapter 3. Invariance
The central result of this chapter is the existence of an ‘invariant’ set K
for the renormalization operator in Section 3.1. This result is exploited in
Section 3.2 to prove a priori bounds, which in turn have consequences for the
Cantor attractor for infinitely renormalizable maps. A careful investigation
of the action of the renormalization operator on K in Section 3.3 is then
used to exhibit periodic points of the renormalization operator.
3.1 The invariant set
In this section we construct an ‘invariant’ and relatively compact set for the
renormalization operator. This construction works for types of renormalization where the return time of one branch is much longer than the other.
This result will be exploited in the following sections.
Definition 3.1.1. Let ε = 1 − c. This notation will be used from now on.
Note that ε depends on f . Define
K = {f ∈ L^1 | ε^- ≤ ε ≤ ε^+, Dist φ ≤ δ, Dist ψ ≤ δ}
and
$$\Omega = \{(0\,\overbrace{1\cdots 1}^{a},\ 1\,\overbrace{0\cdots 0}^{b}) \mid a^- \le a \le a^+,\ b^- \le b \le b^+\}.$$
We are going to show how K and Ω can be chosen so that K is invariant1
under the restriction of R to types in Ω. As always, assume that the critical
exponent α > 1 has been fixed once and for all. The critical exponent will essentially determine a^- and a^+ (this is not entirely natural as we would expect to be able to choose a^+ as large as we please but some estimates will not work then). Finally we are left with a free parameter b^- which when chosen large enough will give us the invariance we are after.
¹ Exactly what we mean by 'invariant' is expressed in Theorem 3.1.5.
Remark 3.1.2. Ideally we would like to let Ω be any finite set of types of
renormalization. However, our approach relies heavily on the fact that we
can choose b− large since this allows us to do some analysis. Also, we
restrict ourselves to monotone combinatorics (i.e. of type (01 · · · 1, 10 · · · 0))
since arbitrary types are more difficult to handle.
We will now show how to choose the constants defining K and Ω. Let
$$\varepsilon^- = \alpha^{-\sigma b^-}, \qquad \varepsilon^+ = \kappa\,\alpha^{-(b^-(1-\sigma r^+) - a^+)/\alpha}, \qquad (3.1)$$
where σ ∈ (0, 1), r^+ = a^+ + 1 − α − α^{-b^+}, and κ is a constant which is defined in (3.14).
The parameter δ will be assumed to be small; δ = o(1/b^-) suffices. However, δ must not be smaller than the bounds for the distortion of Rf in (3.15). For example, we may pick δ = (1/b^-)².
Assume that σ, a^- and a^+ have been chosen so that
$$a^- > \alpha - 1 + \alpha^{-b^-}, \qquad a^+ \le 2\alpha - 1, \qquad \frac{\alpha - r^-}{\alpha^2 - r^- r^+} < \sigma \le \frac{1}{a^+ + 1 - \alpha}, \qquad (3.2)$$
where r^- = a^- + 1 − α − α^{-b^-}. Finally, choose b^+ such that
$$\frac{1}{\alpha}\big(r^-(1 - \sigma r^+) + \alpha^2\sigma\big)\,b^- - b^+ + a^- - \frac{a^+ r^-}{\alpha} \to \infty, \qquad b^- \to \infty. \qquad (3.3)$$
Remark 3.1.3. Let us briefly discuss the choice of constants. It is important
to realize that they do not represent necessary conditions and as such are
not optimal in any way.
In order for ε^- < ε^+ to hold we need 1 − σr^+ > 0 and σ ≥ (1 − σr^+)/α, both of which follow from the bounds on σ in (3.2).
The lower bound on a^- is used to control the distortion of the return maps in the proof of Theorem 3.1.5. The upper bound on a^+ ensures that the lower bound on σ in (3.2) is positive.
The lower bound on σ is equivalent to the constant in front of b^- in (3.3) being strictly larger than 1. This shows that b^+ can be chosen so that b^+ − b^- → ∞ as b^- grows, which is important since we want Ω to be "as large as possible."
Finally, the condition (3.3) is used to prove that ε(Rf) ≥ ε^-.
Example 3.1.4. One possible choice that will satisfy the above constraints is
$$\sigma = 1/\alpha, \qquad a^- = \lfloor\alpha\rfloor, \qquad a^+ = \lfloor 2\alpha - 1\rfloor.$$
(If α = n − λ for some n ∈ N and λ > 0 very close to 0, then b^- may need to be increased so that the lower bound on a^- holds.)
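A quick numerical sanity check of the constraints (3.2) for a sample critical exponent (the concrete numbers below are ours; the formulas are the ones displayed above):

```python
import numpy as np

alpha = 2.2                               # sample critical exponent
b_minus, b_plus = 50, 60                  # sample return-time bounds
sigma = 1.0 / alpha
a_minus = int(np.floor(alpha))            # = 2
a_plus = int(np.floor(2 * alpha - 1))     # = 3

r_minus = a_minus + 1 - alpha - alpha ** (-b_minus)
r_plus = a_plus + 1 - alpha - alpha ** (-b_plus)

lower = (alpha - r_minus) / (alpha ** 2 - r_minus * r_plus)
upper = 1.0 / (a_plus + 1 - alpha)

assert a_minus > alpha - 1 + alpha ** (-b_minus)
assert a_plus <= 2 * alpha - 1
assert lower < sigma <= upper
print(lower, sigma, upper)                # the sigma-window of (3.2)
```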
We now state the theorem which exactly expresses what kind of 'invariance' we have for K.
Theorem 3.1.5 (Invariance). If f ∈ L^S_Ω and c_1^+(Rf) ≤ 1/2 ≤ c_1^-(Rf), then
$$f \in K \implies Rf \in K,$$
for b^- large enough.
The condition on the critical values of the renormalization excludes maps which are degenerate in some sense. There is nothing magical about the number 1/2 here; all we want is for c_1^-(Rf) to be bounded away from 0 and c_1^+(Rf) to be bounded away from 1. An alternative (weaker) statement which we will also use is:
Corollary 3.1.6. If both f ∈ L^S_Ω and Rf ∈ L^S_Ω, then
$$f \in K \implies Rf \in K,$$
for b^- large enough.
Proof. Since f ∈ L^S_Ω ∩ K we can apply Lemma 3.1.9, which shows that c_1^-(Rf) → 1 and c_1^+(Rf) → 0 as b^- → ∞. Hence Theorem 3.1.5 applies.
Remark 3.1.7. The full family theorem (Martens and de Melo, 2001) implies
that there exists f which satisfies the conditions of the corollary. This shows
that both the corollary and the theorem are not vacuous.
The main reason for introducing the set K is the following:
Proposition 3.1.8. K is relatively compact in L^0.
Proof. Clearly [ε^-, ε^+] is compact in (0, 1). Hence we only need to show that the ball B = {φ ∈ D^1([0,1]) | Dist φ ≤ δ} is relatively compact in D^0([0,1]). This is an application of the Arzelà–Ascoli theorem; if {φ_n} ⊂ B then |φ_n(y) − φ_n(x)| ≤ e^δ|y − x|, hence this sequence is equicontinuous (as well as uniformly bounded), so it has a uniformly convergent subsequence.
The rest of this section is devoted to the proof of Theorem 3.1.5. We will need the following expressions for the inverse branches of f:
$$f_0^{-1}(x) = c - c\left(\frac{|\varphi^{-1}([x, c_1^-])|}{|\varphi^{-1}([0, c_1^-])|}\right)^{1/\alpha}, \qquad (3.4)$$
$$f_1^{-1}(x) = c + (1 - c)\left(1 - \frac{|\psi^{-1}([x, 1])|}{|\psi^{-1}([c_1^+, 1])|}\right)^{1/\alpha}. \qquad (3.5)$$
Lemma 3.1.9. There exists K such that if f ∈ L^1_Ω ∩ K, then 1 − c_1^- < Kε². Also, c_1^+ → 0 exponentially in b^- as b^- → ∞.
Proof. For monotone combinatorics c_1^+ < f_0^{-b}(c) and c_1^- > f_1^{-a}(c), so the idea is to look for bounds on the backward orbits of c. We claim that
$$\frac{f_1^{-1}(c) - c}{\varepsilon} \ge \mu, \qquad (3.6)$$
$$\frac{c - f_0^{-n}(c)}{c} \ge (\mu\varepsilon)^{\alpha^{-n}}\cdot\big(c/e^\delta\big)^{1/(\alpha-1)}, \qquad (3.7)$$
where μ ≥ 1 − Kε.
Assume that the claim is true. Then (3.6) shows that
$$1 - c_1^- < 1 - f_1^{-1}(c) \le 1 - c - \mu\varepsilon = \varepsilon(1 - \mu) \le K\varepsilon^2,$$
which proves the statement about c_1^-.
Next, let n = ⌈log_α b^-⌉. Then α^{-n} ≤ 1/b^- and (με)^{α^{-n}} ≥ (με)^{1/b^-}, so
$$\frac{f_0^{-n}(c)}{c} \le 1 - (\mu\varepsilon^-)^{1/b^-}\big(c/e^\delta\big)^{1/(\alpha-1)} \le 1 - \mu\big((1 - \varepsilon^+)/e^\delta\big)^{1/(\alpha-1)}\alpha^{-\sigma}.$$
Thus f_0^{-n}(c) is a uniform distance away from c. Since b^- − ⌈log_α b^-⌉ → ∞, and since 0 is an attracting fixed point for f_0^{-1} with uniform bound on the multiplier, it follows that f_0^{-b}(c) approaches 0 exponentially as b^- → ∞. This proves the statement about c_1^+.
We now prove the claim. We first show that
$$f_1^{-1}(c) - c \ge \varepsilon\cdot\left(1 - \frac{e^\delta\varepsilon}{c - f_0^{-b}(c)}\right)^{1/\alpha}, \qquad (3.8)$$
$$c - f_0^{-n}(c) \ge \big(c^\alpha/e^\delta\big)^{1/(\alpha-1)}\cdot\big(f_1^{-1}(c) - c\big)^{\alpha^{-n}}. \qquad (3.9)$$
Equation (3.8) follows from a computation using (3.5) and the fact that 1 − c_1^+ > c − f_0^{-b}(c) holds for monotone combinatorics.
To prove (3.9), first apply (3.4) to get
$$f_0^{-1}(x) \le c - c\left(e^{-\delta}\,\frac{c_1^- - x}{c_1^-}\right)^{1/\alpha}.$$
This gives
$$f_0^{-1}(c) \le c - c\cdot e^{-\delta/\alpha}(c_1^- - c)^{1/\alpha}, \qquad (3.10)$$
and if x < c then
$$f_0^{-1}(x) \le c - c\cdot e^{-\delta/\alpha}\left(1 - \frac{x}{c}\right)^{1/\alpha}.$$
An induction argument on the last inequality leads to
$$f_0^{-n}(x) \le c - c\,e^{-(\delta/\alpha)(1 + \cdots + \alpha^{-(n-1)})}\cdot\left(1 - \frac{x}{c}\right)^{\alpha^{-n}},$$
which together with (3.10), 1 + ⋯ + α^{-n} < α/(α − 1), and c_1^- > f_1^{-1}(c) proves (3.9).
Having proved (3.8) and (3.9) we now continue the proof of the claim. Note that the left-hand side of (3.8) appears in the right-hand side of (3.9) and vice versa. Thus we can iterate these inequalities once we have some bound for either of them. To this end, suppose f_1^{-1}(c) − c ≥ tε, for some t > 0. If we plug this into (3.9) and then plug the resulting bound into (3.8), we will get a new bound on f_1^{-1}(c) − c. This defines a map
$$t \mapsto h(t) = \left(1 - \left(\frac{e^\delta}{c}\right)^{\alpha/(\alpha-1)}\frac{\varepsilon^{1-\alpha^{-b}}}{t^{\alpha^{-b}}}\right)^{1/\alpha}.$$
One can check that h is increasing and has two fixed points: one close to 0 which is repelling, and another close to 1 which is attracting. Explicitly, the fixed point equation t = h(t) gives
$$t^{\alpha^{-b}}(1 - t^\alpha) = \big(e^\delta/c\big)^{\alpha/(\alpha-1)}\varepsilon^{1-\alpha^{-b}}.$$
If t ↑ 1 then the solution is approximately
$$t_1 = \left(1 - \big(e^\delta/c\big)^{\alpha/(\alpha-1)}\varepsilon^{1-\alpha^{-b}}\right)^{1/\alpha},$$
and if t ↓ 0 then the approximate solution is
$$t_0 = \big(e^\delta/c\big)^{\alpha^{b+1}/(\alpha-1)}\,\varepsilon^{\alpha^b - 1}.$$
Hence the proof is complete if we have some initial bound f_1^{-1}(c) − c ≥ t⁰ε such that t⁰ > t_0, because then h^n(t⁰) → t_1.
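The behaviour of h is easy to observe numerically; in the following sketch the parameter values are arbitrary sample choices of ours.

```python
import numpy as np

alpha, b, delta = 2.0, 10, 0.01           # sample values
eps, c = 1e-4, 0.999

def h(t):
    return (1 - (np.exp(delta) / c) ** (alpha / (alpha - 1))
              * eps ** (1 - alpha ** (-b)) / t ** (alpha ** (-b))) ** (1 / alpha)

t = 0.5                                   # any start above the repelling t_0
for _ in range(100):
    t = h(t)
print(t)                                  # settles at the attracting t_1 ~ 1
```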
To get an initial bound we use the fact that f_1^{-1}(c) − c > |R| and look for a bound on |R|. Since Rf is nontrivial we have f^{b+1}(R) ⊃ R, which implies
$$|R| \le |f^b(f(R))| \le \Big(\max_{x<c} Df(x)\Big)^b\cdot e^\delta\,|Q(R)| \le \big(e^\delta u\alpha/c\big)^b\, e^\delta v\,(|R|/\varepsilon)^\alpha,$$
and thus
$$f_1^{-1}(c) - c > |R| \ge \varepsilon\cdot\left(\frac{c\,\varepsilon^{1/b}}{\alpha\,e^{\delta(b+1)/b}}\right)^{b/(\alpha-1)} = \varepsilon\,t^0.$$
Here t⁰ is of the order ε^{1/(α−1)}α^{-b/(α−1)} whereas t_0 is of the order ε^{α^b}, so t⁰ > t_0 for b^- large enough (since ε ∼ α^{-b^-⋯}).
Lemma 3.1.10. There exists K such that if f ∈ L^1_Ω ∩ K then
$$K^{-1} \le Df_1^{-a}(x)\cdot\alpha^a\varepsilon^{-a} \le K, \qquad \forall x > f_0^{-1}(c),$$
$$K^{-1} \le Df_0^{-b}(x)\cdot\alpha^b\varepsilon^{1-\alpha^{-b}} \le K, \qquad \forall x \le c.$$
Proof. The proof makes use of the following expressions for the derivatives of f^{-1}:
$$Df_0^{-1}(x) = \frac{c}{\alpha}\cdot\frac{D\varphi^{-1}(x)}{u}\left(\frac{|\varphi^{-1}([0, c_1^-])|}{|\varphi^{-1}([x, c_1^-])|}\right)^{1-1/\alpha}, \qquad (3.11)$$
$$Df_1^{-1}(x) = \frac{\varepsilon}{\alpha}\cdot\frac{D\psi^{-1}(x)}{v}\left(\frac{|\psi^{-1}([c_1^+, 1])|}{|\psi^{-1}([c_1^+, x])|}\right)^{1-1/\alpha}. \qquad (3.12)$$
We start by proving the lower bound on Df_1^{-a}. From (3.12) we get Df_1^{-1}(x) ≥ e^{-δ}ε/α and hence
$$Df_1^{-a}(x) \ge e^{-a\delta}(\varepsilon/\alpha)^a, \qquad \forall x \in [c_1^+, 1].$$
Note that e^{-aδ} has a uniform bound since a < b^- and δb^- → 0 by assumption.
Next consider the upper bound on Df_1^{-a}. By assumption
$$x > f_0^{-1}(c) \ge c\big(1 - (e^\delta\varepsilon)^{1/\alpha}\big),$$
where we have used (3.4). This together with (3.12) implies
$$Df_1^{-a}(x) \le (\varepsilon/\alpha)^a\,(e^\delta/v)^a\,e^{a\delta}\,\big(f_0^{-1}(c) - c_1^+\big)^{-a(1-1/\alpha)} \le K\,(\varepsilon/\alpha)^a.$$
It remains to show that K is uniformly bounded. Briefly, this follows from
$$v^a \ge (1 - e^\delta c_1^+)^a \ge 1 - ae^\delta c_1^+ \to 1$$
and from
$$\big(f_0^{-1}(c) - c_1^+\big)^{-a} \le \Big[(1-\varepsilon)^a\big(1 - (e^\delta\varepsilon)^{1/\alpha}\big)^a\big(1 - c_1^+/f_0^{-1}(c)\big)^a\Big]^{-1} \le \Big[(1 - a\varepsilon)\big(1 - a(e^\delta\varepsilon)^{1/\alpha}\big)\big(1 - ac_1^+/f_0^{-1}(c)\big)\Big]^{-1} \to 1,$$
as b^- → ∞. (We have used Lemma 3.1.9 to get ac_1^+ → 0; aδ → 0 and aε → 0 follow from the choice of K and Ω.)
We now turn to proving the bounds on Df_0^{-b}. We claim that
$$e^{-\delta/(\alpha-1)}\left(\frac{c_1^- - x}{c_1^-}\right)^{\alpha^{-n}} \le \frac{c_1^- - f_0^{-n}(x)}{c_1^-} \le \Big[e^{\delta/\alpha}\big(1 + O(\varepsilon^{1-1/\alpha})\big)\Big]^{\alpha/(\alpha-1)}\left(\frac{c_1^- - x}{c_1^-}\right)^{\alpha^{-n}}, \qquad (3.13)$$
which together with (3.11) gives
$$\left(\frac{c}{c_1^-\,e^{2\delta}\big(1 + O(\varepsilon^{1-1/\alpha})\big)}\right)^b \le Df_0^{-b}(x)\cdot\alpha^b\prod_{i=0}^{b-1}\left(\frac{c_1^- - x}{c_1^-}\right)^{(\alpha-1)/\alpha^{i+1}} \le \left(\frac{e^{2\delta}\,c}{c_1^-}\right)^b.$$
The product can be rewritten as ((c_1^- − x)/c_1^-)^{1-α^{-b}}, which is proportional to ε^{1-α^{-b}} by Lemma 3.1.9, so all we need to do is to prove that the constants above are uniformly bounded. This follows from a similar argument to the above and the fact that bε^t → 0 for every t > 0, and bδ → 0, as b^- → ∞ (by the assumptions made on K and Ω).
To finish the proof, let us prove the claim (3.13). The lower bound follows from (3.4) and the estimate
$$\frac{c_1^- - f_0^{-1}(x)}{c_1^-} > 1 - \frac{f_0^{-1}(x)}{c} \ge e^{-\delta/\alpha}\left(\frac{c_1^- - x}{c_1^-}\right)^{1/\alpha}.$$
An induction argument finishes the proof of the lower bound.
For the upper bound we assume x ≤ c and use (3.4) again to get
$$\frac{c_1^- - f_0^{-1}(x)}{c_1^-} = \frac{c}{c_1^-}\cdot\frac{c_1^- - f_0^{-1}(x)}{c - f_0^{-1}(x)}\cdot\frac{c - f_0^{-1}(x)}{c} \le \mu\,e^{\delta/\alpha}\left(\frac{c_1^- - x}{c_1^-}\right)^{1/\alpha},$$
where
$$\mu = \frac{c}{c_1^-}\cdot\frac{c_1^- - f_0^{-1}(c)}{c - f_0^{-1}(c)} = \frac{c}{c_1^-} + \frac{c}{c_1^-}\cdot\frac{c_1^- - c}{c - f_0^{-1}(c)} < 1 + \varepsilon^{-1/\alpha}e^{-\delta}\,\frac{c_1^- - c}{c_1^-} \le 1 + O(\varepsilon^{1-1/\alpha}).$$
Another induction argument finishes the proof for the upper bound.
Lemma 3.1.11. If f ∈ L^1_Ω ∩ K and c_1^+(Rf) ≤ 1/2 ≤ c_1^-(Rf), then
$$\frac{1}{2e^{2\delta}} \le \frac{v}{u}\cdot\frac{Df^b(y)}{Df^a(x)}\cdot\left(\frac{c}{\varepsilon}\cdot\frac{|R|}{|L|}\right)^\alpha \le 2e^{2\delta},$$
for some x ∈ f(L) and y ∈ f(R).
Remark 3.1.12. This lemma can be stated in much greater generality (the proof remains unchanged). In particular, we do not use the bounds on ε and it works for any type of renormalization.
Proof. From 0 ≤ c_1^+(Rf) ≤ 1/2 ≤ c_1^-(Rf) ≤ 1 we get
$$1/2 \le \frac{|f^{a+1}(L)|}{|C|} \le 1 \quad\text{and}\quad 1/2 \le \frac{|f^{b+1}(R)|}{|C|} \le 1.$$
The mean value theorem implies that there exists x ∈ f(L) and y ∈ f(R) such that
$$Df^a(x)\,|f(L)| = |f^{a+1}(L)| \quad\text{and}\quad Df^b(y)\,|f(R)| = |f^{b+1}(R)|.$$
From f(L) = φ ∘ Q_0(L) and f(R) = ψ ∘ Q_1(R) we get
$$e^{-\delta} \le \frac{|f(L)|}{u\,(|L|/c)^\alpha} \le e^\delta \quad\text{and}\quad e^{-\delta} \le \frac{|f(R)|}{v\,(|R|/\varepsilon)^\alpha} \le e^\delta.$$
Now combine these equations to finish the proof:
$$\frac{v}{u}\left(\frac{c}{\varepsilon}\cdot\frac{|R|}{|L|}\right)^\alpha \le e^{2\delta}\,\frac{|f(R)|}{|f(L)|} \le e^{2\delta}\,\frac{Df^a(x)}{Df^b(y)}\cdot\frac{|f^{b+1}(R)|}{|f^{a+1}(L)|} \le 2e^{2\delta}\,\frac{Df^a(x)}{Df^b(y)}.$$
The lower bound follows from a similar argument.
Proof of Theorem 3.1.5. The proof is divided up into three steps: (1) show that the distortion of f^b|_{f(R)} is small, (2) show that ε^- ≤ ε(Rf) ≤ ε^+, (3) determine explicit bounds on the distortion for the renormalization.
Step 1. The map f_0^{-b}|_C extends continuously to [0, c_1^-] so in order to use the Koebe lemma we need to show that both components of [0, c_1^-] \ C are large compared to C. The relative length of the left component is large since |C| is at most of order ε^{1/α}, so we focus on the right component only.
There are two cases to consider: either |L| ≥ |R|, or |R| > |L| (the latter will turn out not to hold, but we do not know that yet).
Assume |R| > |L|. For monotone combinatorics we have
$$f(R) \subset \big(f_0^{-b-1}(c),\ f_0^{-b+1}(c)\big),$$
thus
$$|f_0^{-b+1}(c) - f_0^{-b-1}(c)| \ge |f(R)| \ge e^{-\delta}v\,(|R|/\varepsilon)^\alpha,$$
and consequently
$$\frac{|R|}{\varepsilon} \le \left(\frac{e^\delta}{v}\cdot|f_0^{-b+1}(c) - f_0^{-b-1}(c)|\right)^{1/\alpha} \to 0, \qquad\text{as } b^- \to \infty.$$
This shows that the relative length of the right component of [0, c_1^-] \ C tends to infinity and hence the distortion of f^b|_{f(R)} tends to zero.
Next, assume that |L| ≥ |R|. Since f is renormalizable f(L) ⊂ C and hence
$$2|L| \ge |C| = Df^a(x)\,|f(L)| \ge Df^a(x)\,e^{-\delta}u\,(|L|/c)^\alpha,$$
for some x ∈ f(L). Now apply Lemma 3.1.10 to get that
$$|L| \le K\varepsilon^{a/(\alpha-1)}.$$
By assumption (3.2) a > α − 1, so once again we get that the relative length of the right component tends to infinity as we increase b^-.
Step 2. We first show that ε(Rf) ≤ ε^+. Apply Lemma 3.1.11 to get
$$\varepsilon(Rf) = \frac{|R|}{|L| + |R|} < \frac{|R|}{|L|} \le \frac{\varepsilon}{c}\left(\frac{u}{v}\cdot\frac{Df_0^{-b}(y)}{Df_1^{-a}(x)}\cdot 2e^{2\delta}\right)^{1/\alpha},$$
for some x ∈ f(L) and some y ∈ f(R). Now apply Lemma 3.1.10 and Step 1² to get
$$\varepsilon(Rf) \le \kappa\left(\alpha^{-b+a}\,\varepsilon^{\alpha-a-1+\alpha^{-b}}\right)^{1/\alpha}. \qquad (3.14)$$
(This defines the constant κ of (3.1).) The exponent of ε is negative by assumption (3.2), so inserting ε ≥ ε^- gives us ε(Rf) ≤ κα^{-t}, where
$$t = \frac{b^-(1 - \sigma r^+) - a^+}{\alpha}.$$
This shows that ε(Rf) ≤ ε^+.
We now show that ε(Rf) ≥ ε^-. A similar argument to the above shows that |R|/|L| ≥ kα^{-t}, where
$$t = \frac{b^+ - a^-}{\alpha} - \frac{r^-(1 - \sigma r^+)\,b^- - a^+ r^-}{\alpha^2}.$$
Recall that ε^- = α^{-σb^-}, so we would like σb^- > t, which is equivalent to
$$\frac{1}{\alpha}\big(r^-(1 - \sigma r^+) + \alpha^2\sigma\big)\,b^- - b^+ + a^- - \frac{a^+ r^-}{\alpha} > 0.$$
² We need Step 1 to get a bound on Df_0^{-b}(y), since we do not know if y ≤ c and this is the only case Lemma 3.1.10 treats.
By assumption (3.3) the left-hand side tends to ∞ as b^- grows. Hence
$$|R|/|L| \ge k\,\alpha^{\sigma b^- - t}\,\varepsilon^-,$$
so the right-hand side is greater than ε^- if b^- is sufficiently large. Consequently this is also true for ε(Rf), since
$$\varepsilon(Rf) = \frac{|R|}{|L| + |R|} = \frac{|R|}{|L|}\cdot\left(1 + \frac{|R|}{|L|}\right)^{-1}$$
and |R|/|L| is small. Thus, ε(Rf) ≥ ε^- if b^- is sufficiently large.
Step 3. We now use the Koebe lemma to get explicit bounds on the distortion for Rf. From Step 2 we know that |L| > |R| and thus the arguments of Step 1 show that |C| is at most of the order ε^{a/(α−1)}. Hence Lemma 3.1.9 shows that the right component of (c_1^+, c_1^-) \ C has length of order ε and the left component has length of order 1.
The inverses of the return maps f^a|_{f(L)} and f^b|_{f(R)} extend continuously (at least) to (c_1^+, c_1^-), so the Koebe lemma implies that the distortion of the return maps is of the order ε^t, where t = −1 + a/(α − 1) > 0. That is,
$$\operatorname{Dist}\varphi(Rf) \le K\varepsilon^t \quad\text{and}\quad \operatorname{Dist}\psi(Rf) \le K\varepsilon^t. \qquad (3.15)$$
Note that Kε^t ≪ δ if we e.g. choose δ = (1/b^-)².
This concludes the proof of Theorem 3.1.5.
Many of the results used to prove Theorem 3.1.5 do not rely on the assumption that c_1^+(Rf) ≤ 1/2 ≤ c_1^-(Rf). Note that without this assumption we cannot say anything about the critical point of the renormalization, nor can we ensure that the distortion of the diffeomorphic parts shrinks under renormalization. In other words, we cannot prove invariance of K without this extra assumption. We collect these results and state them here as one proposition as they will be needed later.
Proposition 3.1.13. If f ∈ L^S_Ω ∩ K, then
1. 1 − c_1^- < Kε² for some K not depending on f,
2. c_1^+ → 0 exponentially in b^- as b^- → ∞,
3. Df_1^{-a}|_U ≍ (ε/α)^a,
4. Df_0^{-b}|_V ≍ α^{-b}ε^{-1+α^{-b}}.
Remark 3.1.14. We use the notation g(x) ≍ y to mean that there exists K < ∞ not depending on g such that K^{-1}y ≤ g(x) ≤ Ky for all x in the domain of g.
Proof. The first two items are proven in Lemma 3.1.9. The last two follow from Lemma 3.1.10 and Step 1 of the proof of Theorem 3.1.5.
3.2 A priori bounds
In this section we begin exploiting the existence of the relatively compact
‘invariant’ set of Theorem 3.1.5. An important consequence of this theorem is the existence of so-called a priori bounds (or real bounds) for infinitely
renormalizable maps. We use the a priori bounds to analyze infinitely
renormalizable maps and their attractors.
Theorem 3.2.1 (A priori bounds). If f ∈ L^S_ω̄ ∩ K is infinitely renormalizable with ω̄ ∈ Ω^N, then {R^n f}_{n≥0} is a relatively compact family (in L^0).
Proof. This is a consequence of Corollary 3.1.6 and Proposition 3.1.8.
Theorem 3.2.2. If f ∈ L^S_ω̄ ∩ K is infinitely renormalizable with ω̄ ∈ Ω^N, then f satisfies the weak Markov property.
Before giving the proof we need the following lemma. Intuitively, it
states that if f is renormalizable and I is a branch of f n , then f n ( I ) is large
compared with the return interval C, in the sense that f n ( I ) \ C contains
intervals from both cycles of renormalization. (See the end of Section 2.1
for an explanation of the notation used.)
Lemma 3.2.3. Assume that f is renormalizable. Let C = L ∪ {c} ∪ R be the return interval, let a + 1 be the return time of L, let b + 1 be the return time of R. If I is a branch of the transfer map to C, if J ⊃ I is a branch of f^n, and if J is disjoint from C, then there exist i ∈ {1, …, a} and j ∈ {1, …, b} such that f^i(L) is contained in the right component of f^n(J) \ C and f^j(R) is contained in the left component.
Proof. Since J is a branch of f^n, either ∂^-J = 0 or there exists 0 ≤ l < n such that ∂^-f^l(J) = c. In the former case the left component of f^n(J) \ C contains f(R). Assume the latter case holds. By Proposition 2.2.5 f^l(J) ⊃ R, since f^l(I) must be disjoint from C. Hence the left component of f^n(J) \ C
contains f^{n−l}(R). Note that n − l ≤ b + 1 since f^{b+1}(R) is mapped over the critical point (so f is not monotone on f^{b+1}(R)). Furthermore, n − l ≠ b + 1, since f^l(I) ∩ R = ∅ and thus f^{l+b+1}(I) ∩ C = ∅.
Now repeat the same argument for the other boundary point of J.
Proof of Theorem 3.2.2. Since f is infinitely renormalizable there exists a sequence C0 ⊃ C1 ⊃ · · · of nice intervals whose lengths tend to zero (i.e. Cn
is the range of the n–th first-return map and this interval is nice since the
boundary consists of periodic points).
Let Tn denote the transfer map to Cn . We must show that Tn is defined
almost everywhere and that it has uniformly bounded distortion.
By a theorem of Singer³ f cannot have a periodic attractor since it would attract at least one of the critical values. This does not happen for infinitely renormalizable maps since the critical orbits have subsequences which converge on the critical point. Thus Proposition 2.2.7 shows that T_n is defined almost everywhere.
In order to prove that T_n has bounded distortion, pick any branch I of T_n with positive transfer time i = τ(I), and let J be the branch of f^i containing I. By Lemma 3.2.3 both components of f^i(J) \ C_n contain intervals from the forward orbit of C_n = cl L_n ∪ R_n, say f^{l_n}(L_n) and f^{r_n}(R_n) (note that these do not depend on the choice of branch J). We contend that
$$\inf_n |f^{l_n}(L_n)|/|C_n| > 0 \quad\text{and}\quad \inf_n |f^{r_n}(R_n)|/|C_n| > 0. \qquad (3.16)$$
Suppose not, and consider the C^0–closure of {R^n f}. The a priori bounds show that this set is compact and hence there exists a subsequence {R^{n_k} f} which converges to some f_*. But then f_* is a renormalizable map whose cycles of renormalization contain an interval of zero diameter. This is impossible, hence (3.16) must hold.
This shows that f i ( J ) contains a δ–scaled neighborhood of Cn and that
δ does not depend on n or J. The Koebe lemma now implies that Tn has
bounded distortion and that the bound does not depend on n.
Theorem 3.2.4. Assume f ∈ L^S_ω̄ ∩ K is infinitely renormalizable with ω̄ ∈ Ω^N. Let Λ be the closure of the orbits of the critical values. Then:
• Λ is a Cantor set,
• Λ has Lebesgue measure zero,
• the Hausdorff dimension of Λ is strictly inside (0, 1),
• Λ is uniquely ergodic,
• the complement of the basin of attraction of Λ has zero Lebesgue measure.
³ Singer's theorem is stated for unimodal maps but the statement and proof can easily be adapted to Lorenz maps.
Proof. Let L_n and R_n denote the left and right half of the return interval of the n–th first-return map, let i_n and j_n be the return times for L_n and R_n, let Λ_0 = [0, 1], and let
$$\Lambda_n = \bigcup_{i=0}^{i_n-1}\operatorname{cl} f^i(L_n)\ \cup\ \bigcup_{j=0}^{j_n-1}\operatorname{cl} f^j(R_n), \qquad n = 1, 2, \ldots$$
Components of Λn are called intervals of generation n and components of
Λn−1 \ Λn are called gaps of generation n (see Figure 3.1 on the facing page).
Let I be an interval of generation n, let J ⊂ I be an interval of generation n + 1, and let G ⊂ I be a gap of generation n + 1. We claim that there exist constants 0 < μ < λ < 1 such that
$$\mu < |J|/|I| < \lambda \quad\text{and}\quad \mu < |G|/|I| < \lambda,$$
where μ and λ do not depend on I, J and G. To see this, take the L^0–closure of {R^n f}. This set is compact in L^0, so the infimum and supremum of |J|/|I| over all I and J as above are bounded away from 0 and 1 (otherwise there would exist an infinitely renormalizable map in L^0 with I and J as above such that |J| = 0 or |I| = |J|). The same argument holds for I and G. Since {R^n f} is a subset of the closure the claim follows.
Next we claim that Λ = ∩Λ_n. Clearly Λ ⊂ ∩Λ_n (since the critical values are contained in the closure of f(L_n) ∪ f(R_n) for each n). From the previous claim |Λ_n| < λ|Λ_{n−1}|, so the lengths of the intervals of generation n tend to 0 as n → ∞. Hence Λ = ∩Λ_n.
It now follows from standard arguments that Λ is a Cantor set of zero
measure with Hausdorff dimension in (0, 1). That Λ is uniquely ergodic
follows from Theorem 2.3.1 since f has bounded combinatorics due to the
fact that Ω is finite.
It only remains to prove that almost all points are attracted to Λ. Let Tn
denote the transfer map to the n–th return interval Cn . By Proposition 2.2.7
the domain of Tn has full measure for every n and hence almost every point
visits every Cn . This finishes the proof.
Figure 3.1: Illustration of the intervals of generations 0, 1 and 2 for a (01, 100)–renormalizable map. Here L_n^i = f^i(L_n) and R_n^i = f^i(R_n). The intersection of all levels n = 0, 1, 2, … is a Cantor set, see Theorem 3.2.4.
3.3 Periodic points of the renormalization operator
In this section we prove the existence of periodic points of the renormalization operator. The argument is topological and does not imply uniqueness even though we believe the periodic points to be unique within each combinatorial class.⁴
The notation used here is the same as in Section 3.1, in particular the
sets Ω and K are the same as in that section. The constants ε− , ε+ and δ
appear in the definition of K.
Theorem 3.3.1. For every periodic combinatorial type ω̄ ∈ Ω^N there exists a periodic point of R in L_ω̄.
Remark 3.3.2. We are not saying anything about the periods of the periodic points. For example, we are not asserting that there exists a period-two point of type (ω, ω)^∞ for some ω ∈ Ω; all we say is that there is a fixed point of type (ω)^∞. The point here is that (ω, ω)^∞ is just another way to write (ω)^∞ so these two types are the same.
To begin with we will consider the restriction R_ω of R to some ω ∈ Ω and show that R_ω has a fixed point. Let
$$Y = L^S_\omega \cap K, \quad\text{and}\quad Y' = \{f \in Y \mid c_1^+(Rf) \le 1/2 \le c_1^-(Rf)\}.$$
The proof is based on a careful investigation of the boundary of Y and the action of R on this boundary. However, we need to introduce the set Y′ because we do not have enough information on the renormalization of maps in Y, see Theorem 3.1.5.
⁴ The conjecture is that the restriction of R to the set of infinitely renormalizable maps should contract maps of the same combinatorial type and this would imply uniqueness.
Definition 3.3.3. A branch B of f n is full if f n maps B onto the domain of f ;
B is trivial if f n fixes both endpoints of B.
Proposition 3.3.4. The boundary of Y consists of three parts, namely f ∈ ∂Y if and only if at least one of the following conditions hold:
(Y1) the left or right branch of Rf is full or trivial,
(Y2) 1 − c(f) = ε(f) ∈ {ε^-, ε^+},
(Y3) Dist φ(f) = δ or Dist ψ(f) = δ.
Also, each condition occurs somewhere on ∂Y.
Before giving the proof we need to introduce some new concepts and
recall some established facts about families of Lorenz maps.
Definition 3.3.5. A slice (in the parameter plane) is any set of the form
$$S = [0, 1]^2 \times \{c\} \times \{\varphi\} \times \{\psi\},$$
where c, φ and ψ are fixed. We will permit ourselves to be a bit sloppy with notation and write (u, v) ∈ S when it is clear which slice we are talking about (or if it is irrelevant).
A slice S = [0, 1]² × {c} × {φ} × {ψ} induces a family of Lorenz maps
$$S \ni (u, v) \mapsto f_{u,v} = (u, v, c, \varphi, \psi) \in L.$$
Any family induced from a slice is full, by which we mean that it realizes all possible combinatorics. See (Martens and de Melo, 2001) for a precise definition and a proof of this statement. For our discussion the only important fact is the following:
Proposition 3.3.6. Let (u, v) ↦ f_{u,v} be a family induced by a slice. Then this family intersects L_ω̄ for every ω̄ such that L_ω̄ ≠ ∅. Note that ω̄ can be finite or infinite.
Proof. This follows from (Martens and de Melo, 2001, Theorem A).
Recall that C = cl L ∪ R is the return interval for a renormalizable map,
and the return times for L and R are a + 1 and b + 1, respectively (see the
end of Section 2.1).
Lemma 3.3.7. Assume that f is renormalizable. Let (l, c) be the branch of f^{a+1} containing L and let (c, r) be the branch of f^{b+1} containing R. Then
$$f^{a+1}(l) \le l \quad\text{and}\quad f^{b+1}(r) \ge r.$$
Proof. This is a special case of (Martens and de Melo, 2001, Lemma 4.1).
Proof of Proposition 3.3.4. Let us first consider the boundary of L^0_ω. If either branch of Rf is full or trivial, then we can perturb f in C^0 so that it no longer is renormalizable. Hence (Y1) holds on ∂L^0_ω. If f ∈ L^0_ω does not satisfy (Y1) then any sufficiently small C^0–perturbation of f will still be renormalizable by Lemma 3.3.7. Hence the boundary of renormalization is exactly characterized by (Y1).
Conditions (Y2) and (Y3) are part of the boundary of K. These boundaries intersect L^S_ω by Proposition 3.3.6 and hence these conditions are also boundary conditions for Y.
Fix 1 − c_0 = ε_0 ∈ (ε^-, ε^+) and let S = [0, 1]² × {c_0} × {id} × {id}. Let ρ_t be the deformation retract onto S defined by
$$\rho_t(u, v, c, \varphi, \psi) = (u, v, c + t(c_0 - c), (1-t)\varphi, (1-t)\psi), \qquad t \in [0, 1].$$
In order to make sense of this formula it is important to note that the linear structure on the diffeomorphisms is that induced from C^0 via the nonlinearity operator N (see Remark 2.1.4). Hence, for example, tφ is by definition the diffeomorphism N^{-1}(tNφ). Let
$$R_t = \rho_t \circ R.$$
The choice of slice is somewhat arbitrary in what follows, except that we will have to be a little bit careful when choosing c_0, as will be pointed out in the proof of the next lemma. However, it is important to note that the slice intersects Y.
Lemma 3.3.8. It is possible to choose c_0 so that R_t has a fixed point on ∂Y′ for some t ∈ [0, 1] if and only if R has a fixed point on ∂Y′.
Remark 3.3.9. The condition c_1^+(Rf) ≤ 1/2 ≤ c_1^-(Rf) roughly states that u(Rf) ≥ 1/2 and v(Rf) ≥ 1/2. Thus Y′ has another boundary condition given by equality in either of these two inequalities. Instead of treating these as separate boundary conditions we subsume them into (Y1) by saying that the left branch is trivial also if c_1^-(Rf) = 1/2 and similarly for the right branch.
Figure 3.2: Illustration of the action of ρ_1 ∘ R|_S. The shaded area corresponds to a full island. The boxes show what the branches of ρ_1 ∘ Rf look like on each boundary piece.
Proof. The 'if' statement is obvious since R = R_0, so assume that R has no fixed point on ∂Y′. Let f ∈ ∂Y′ and assume that R_t f = f for some t > 0. We will show that this is impossible.
To start off choose ε_0 ∈ (ε^-, ε^+) and let c_0 = 1 − ε_0 as usual (we will be more specific about the choice of ε_0 later).
Note that (Y2) cannot hold for R_t f: since ε_0 ∈ (ε^-, ε^+), t > 0, and ε(Rf) ∈ [ε^-, ε^+] by Theorem 3.1.5, the quantity ε(R_t f) lies in the interior of [ε^-, ε^+].
Similarly, (Y3) cannot hold for R_t f since the distortion of the diffeomorphic parts of Rf are not greater than δ (by Theorem 3.1.5) and hence the distortion of the diffeomorphic parts of R_t f are strictly smaller than δ (since t > 0).⁵
The only possibility is that f = R_t f belongs to the boundary part described by condition (Y1).
If either branch of Rf is full then the corresponding branch of R_t f is full as well, which shows that f cannot be fixed by R_t, since a renormalizable map cannot have a full branch. Thus one of the branches of Rf must be trivial.
⁵ This follows from Dist (1 − t)φ < Dist φ if t > 0 and Dist φ > 0.
Assume that the left branch of Rf is trivial (see Remark 3.3.9). Then c_1^-(Rf) = c(Rf) since c(Rf) > 1/2. In particular, Rf is not renormalizable since c_1^- for a renormalizable map is away from the critical point by Lemma 3.1.9. Because of this lemma we can assure that R_s f is not renormalizable for all s ∈ [0, 1] by choosing ε_0 close to ε^-. In particular, R_t f is not renormalizable and hence cannot equal f.
Assume that the right branch of Rf is trivial (see Remark 3.3.9). Then c_1^+(Rf) = 1/2 since c(Rf) > 1/2. In particular Rf is not renormalizable since that requires c_1^+(Rf) to be close to 0 by Lemma 3.1.9. The same holds for R_s f for all s ∈ [0, 1] since 1/2 > ε^+. In particular, f cannot be fixed by R_t since f is renormalizable.
We have shown that f ∉ ∂Y′, which is a contradiction, and hence we conclude that R_t f ≠ f for all t ∈ [0, 1].
The slice S intersects the set L_ω of renormalizable maps of type ω by Proposition 3.3.6. This intersection can in general be a complicated set, but there will always be at least one connected component I of the interior such that the restricted family I ∋ (u, v) ↦ f_{u,v} is full (see Martens and de Melo, 2001, Theorem B). Such a set I is called a full island. The action of R on a full island is illustrated in Figure 3.2. Note that the action of R on the boundary of I is given by (Y1), which also explains this figure.
Lemma 3.3.10. Any extension of R_1|_{∂Y′} to Y′ has a fixed point.
Proof. If R_1 has a fixed point on ∂Y′ then there is nothing to prove, so assume that this is not the case.
Let S = [0, 1]² × {c_0} × {id} × {id}. By the above discussion there is a full island I ⊂ S. Note that ∂I ⊂ ∂Y′.
Pick any R : I → S such that R|_{∂I} = R_1|_{∂I}. Now define the displacement map δ : ∂I → S¹ by
$$\delta(x) = \frac{x - R(x)}{|x - R(x)|}.$$
This map is well-defined since R_1 was assumed not to have any fixed points on ∂Y′ and ∂I ⊂ ∂Y′. The degree of δ is nonzero since I is a full island. This implies that R has a fixed point in I, otherwise δ would extend to all of I, which would imply that the degree of δ was zero. This finishes the proof since R was an arbitrary extension of R_1|_{∂I} and ∂I ⊂ ∂Y′.
Proposition 3.3.11. R_ω has a fixed point.
Proof. By the previous two lemmas either R_ω has a fixed point on ∂Y′ or we can apply Theorem A.1.1. In both cases R_ω has a fixed point.
Proof of Theorem 3.3.1. Pick any sequence (ω_0, …, ω_{n−1}) with ω_i ∈ Ω. The proof of the previous proposition can be repeated with
$$R' = R_{\omega_{n-1}} \circ \cdots \circ R_{\omega_0}$$
in place of R to see that R′ has a fixed point f_*. But then f_* is a periodic point of R and its combinatorial type is (ω_0, …, ω_{n−1})^∞.
Chapter 4. Decompositions
This chapter introduces decompositions in Section 4.1. Decompositions can be thought of as generalizations of diffeomorphisms, or perhaps more fundamentally as the internal structure of the diffeomorphisms that appear naturally in the study of composition operators. In Section 4.2 the renormalization operator is lifted to decomposed Lorenz maps and it is shown that renormalization contracts to the subset of pure decomposed Lorenz maps.
4.1 Decompositions
In this section we introduce the notion of a decomposition. We show how
to lift operators from diffeomorphisms to decompositions and also how
decompositions can be composed in order to recover a diffeomorphism.
This section is an adaptation of techniques introduced in Martens (1998).
Definition 4.1.1. A decomposition φ̄ : T → D²([0, 1]) is an ordered sequence of diffeomorphisms labelled by a totally ordered and at most countable set T. Any such set T will be called a time set. The space D is defined in Appendix A.2.
The space of decompositions D̄_T over T is the direct product
$$\bar D_T = \prod_T D^2([0, 1])$$
together with the ℓ¹–norm
$$\|\bar\varphi\| = \sum_{\tau\in T}\|\varphi_\tau\|.$$
The notation here is φ_τ = φ̄(τ). The distortion of a decomposition is defined similarly:
$$\operatorname{Dist}\bar\varphi = \sum_{\tau\in T}\operatorname{Dist}\varphi_\tau.$$
The sum of two time sets T_0 ⊕ T_1 is the disjoint union
$$T_0 \oplus T_1 = \{(x, i) \mid x \in T_i,\ i = 0, 1\},$$
with order (x, i) < (y, i) if and only if x < y, and (x, 0) < (y, 1) for all x, y. The sum of two decompositions
$$\bar\varphi_0 \oplus \bar\varphi_1 \in \bar D_{T_0\oplus T_1},$$
where φ̄_i ∈ D̄_{T_i}, is defined by φ̄_0 ⊕ φ̄_1(x, i) = φ̄_i(x). In other words, φ̄_0 ⊕ φ̄_1 consists of the diffeomorphisms of φ̄_0 in the order of T_0, followed by the diffeomorphisms of φ̄_1 in the order of T_1.
Note that ⊕ is noncommutative on time sets as well as on decompositions.
Remark 4.1.2. Our approach to decompositions is somewhat different from
that of Martens (1998). In particular, we require a lot less structure on time
sets and as such our definition is much more suitable to general combinatorics. Intuitively speaking, the structure that Martens (1998) puts on time
sets is recovered from limits of the renormalization operator so we will also
get this structure when looking at maps in the limit set of renormalization.
We simply choose not to make it part of the definition to gain some flexibility.
Proposition 4.1.3. The space of decompositions D̄_T is a Banach space.
Proof. The nonlinearity operator takes D²([0, 1]) bijectively to C⁰([0, 1]; R). The latter is a Banach space so the same holds for D̄_T.
Definition 4.1.4. Let T be a finite time set (i.e. of finite cardinality) so that we can label T = {0, 1, …, n − 1} with the usual order of elements. The composition operator O : D̄_T → D² is defined by
$$O\bar\varphi = \varphi_{n-1}\circ\cdots\circ\varphi_0.$$
The composition operator composes all maps in a decomposition in the order of T. We can also define partial composition operators
$$O_{[j,k]}\bar\varphi = \varphi_k\circ\cdots\circ\varphi_j, \qquad 0 \le j \le k < n.$$
As a notational convenience we will write O_{≤k} instead of O_{[0,k]}, etc.
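Over a finite time set the composition operator is simply a fold over the ordered list of maps. A self-contained sketch (again illustrative Python with names of my own choosing, not thesis notation):

    from functools import reduce

    # A decomposition over the finite time set T = {0, 1, 2}:
    # an ordered list of diffeomorphisms of [0,1] (Mobius maps fixing 0 and 1).
    phis = [lambda x: 2 * x / (1 + x),
            lambda x: x / (2 - x),
            lambda x: 1.5 * x / (1 + 0.5 * x)]

    def compose(maps):
        # The composition operator O: phi_{n-1} o ... o phi_0.
        return reduce(lambda f, g: (lambda x: g(f(x))), maps, lambda x: x)

    def compose_partial(maps, j, k):
        # The partial composition O_[j,k] = phi_k o ... o phi_j, 0 <= j <= k < n.
        return compose(maps[j:k + 1])

    f = compose(phis)
    print(f(0.0), f(0.5), f(1.0))        # 0.0 ... 1.0: again a diffeomorphism of [0,1]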
Next, we would like to extend the composition operator to countable time sets, but unfortunately this is not possible in general. Instead of D² we will work with the space D³ with the C¹–nonlinearity norm:

    ‖φ‖₁ = ‖Nφ‖_{C¹} = max_{k=0,1} sup |D^k(Nφ)|,    φ ∈ D³.

Define D̄³_T = {φ̄ : T → D³ | ‖φ̄‖₁ < ∞}, where

    ‖φ̄‖₁ = ∑_{τ∈T} ‖φ_τ‖₁.

Note that ‖·‖ will still be used to denote the C⁰–nonlinearity norm.
Proposition 4.1.5. The composition operator O : D̄³_T → D² continuously extends to decompositions over countable time sets T.
Remark 4.1.6. It is important to note that there is an inherent loss of smoothness when composing a decomposition over a countable time set. Starting with a bound on the C¹–nonlinearity norm we only conclude a bound on the C⁰–nonlinearity norm of the composed map. This can be generalized: starting with a bound on the C^{k+1}–nonlinearity norm, we can conclude a bound on the C^k–nonlinearity norm for the composed map.
The reason why we lose one degree of smoothness is that we use the mean value theorem for one estimate in the Sandwich Lemma 4.1.9. If necessary it should be possible to replace this with, for example, a Hölder estimate, which would lead to a slightly stronger statement.
In order to prove this proposition we will need the Sandwich Lemma
which in itself relies on the following properties of the composition operator.
Lemma 4.1.7. Let φ̄ ∈ D̄_T be a decomposition over a finite time set T, and let φ = Oφ̄. Then

    e^{−‖φ̄‖} ≤ |φ′| ≤ e^{‖φ̄‖},    |φ″| ≤ ‖φ̄‖ e^{2‖φ̄‖},    and    ‖φ‖ ≤ ‖φ̄‖ e^{‖φ̄‖}.

If furthermore φ̄ ∈ D̄³_T, then

    ‖φ‖₁ ≤ (1 + ‖φ̄‖) e^{2‖φ̄‖} ‖φ̄‖₁.
Remark 4.1.8. Note that the lemma is stated for finite time sets, but the way
we define the composition operator for countable time sets (see the proof
of Proposition 4.1.5) will mean that the lemma also holds for countable time
sets.
Proof. The bounds on |φ′| and |φ″| follow from an induction argument using only Lemma A.2.11.
Since T is finite we can label φ̄ so that φ = φ_{n−1} ∘ ⋯ ∘ φ₀. Let ψ_i = O_{<i}(φ̄) and let ψ₀ = id. Now the bound on ‖φ‖ follows from

    Nφ(x) = ∑_{i=0}^{n−1} Nφ_i(ψ_i(x)) ψ_i′(x),

which in itself is obtained from an induction argument using the chain rule for nonlinearities (see Lemma A.2.8).
Finally, take the derivative of the above equation to get

    (Nφ)′(x) = ∑_{i=0}^{n−1} [ (Nφ_i)′(ψ_i(x)) ψ_i′(x)² + Nφ_i(ψ_i(x)) ψ_i″(x) ].

From this the bound on ‖φ‖₁ follows.
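The displayed chain rule for nonlinearities is easy to sanity-check numerically. The sketch below (my own illustration, not part of the thesis) uses Möbius maps fixing 0 and 1, whose nonlinearity N = φ″/φ′ has the closed form −2c/(1+cx), and verifies the two-map case of the formula.

    import numpy as np

    def mobius(c):      return lambda x: (1 + c) * x / (1 + c * x)
    def mobius_d(c):    return lambda x: (1 + c) / (1 + c * x) ** 2      # derivative
    def mobius_N(c):    return lambda x: -2 * c / (1 + c * x)            # nonlinearity

    c0, c1 = 0.5, -0.3                   # phi_0 = mobius(c0), phi_1 = mobius(c1)
    x = np.linspace(0.0, 1.0, 9)

    # n = 2 case of the formula: N(phi_1 o phi_0) = (N phi_1 o phi_0) * phi_0' + N phi_0.
    # The composition of these two maps is mobius(c0 + c1 + c0*c1) again.
    lhs = mobius_N(c0 + c1 + c0 * c1)(x)
    rhs = mobius_N(c1)(mobius(c0)(x)) * mobius_d(c0)(x) + mobius_N(c0)(x)
    print(np.allclose(lhs, rhs))         # True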
Lemma 4.1.9 (Sandwich Lemma). Let φ = φ_{n−1} ∘ ⋯ ∘ φ₀ and let ψ be obtained by "sandwiching γ inside φ"; that is,

    ψ = φ_{n−1} ∘ ⋯ ∘ φ_i ∘ γ ∘ φ_{i−1} ∘ ⋯ ∘ φ₀,

for some i ∈ {0, …, n} (with the convention that φ_n = φ_{−1} = id).
For every λ there exists K such that if γ, φ_i ∈ D³ and if ‖γ‖₁ + ∑‖φ_i‖₁ ≤ λ, then ‖ψ − φ‖ ≤ K‖γ‖.
Proof. Let φ₊ = φ_n ∘ ⋯ ∘ φ_i, and let φ₋ = φ_{i−1} ∘ ⋯ ∘ φ_{−1}. Two applications of the chain rule for nonlinearities give

    Nψ(x) − Nφ(x) = [ N(φ₊ ∘ γ)(φ₋(x)) − Nφ₊(φ₋(x)) ] · |φ₋′(x)|
                  = [ Nφ₊(γ(y)) γ′(y) − Nφ₊(y) + Nγ(y) ] · |φ₋′(x)|,

where y = φ₋(x). By assumption Nφ₊ ∈ C¹, so by the mean value theorem there exists η ∈ [0,1] such that

    Nφ₊(γ(y)) = Nφ₊(y) + (Nφ₊)′(η) · (γ(y) − y).
Hence

    |Nψ(x) − Nφ(x)| ≤ |φ₋′(x)| · [ |Nφ₊(y)| · |γ′(y) − 1| + γ′(y) · |(Nφ₊)′(η)| · |γ(y) − y| + |Nγ(y)| ]
                    ≤ K₁ · [ K₂ (e^{‖γ‖} − 1) + K₃ (e^{2‖γ‖} − 1) + ‖γ‖ ] ≤ K‖γ‖.

The constants K_i only depend on λ by Lemma 4.1.7. We have also used Lemma A.2.11 and Lemma A.2.12 in the penultimate inequality.
Proof of Proposition 4.1.5. Let φ̄ ∈ D̄³_T and choose an enumeration θ : ℕ → T. Let ψ_n denote the composition of {φ_{θ(0)}, …, φ_{θ(n−1)}} in the order induced by T.
We claim that {ψ_n} is a Cauchy sequence in D². Indeed, by applying the Sandwich Lemma with λ = ‖φ̄‖₁ we get a constant K, only depending on λ, such that

    ‖ψ_{m+n} − ψ_m‖ ≤ ∑_{i=m}^{m+n−1} ‖ψ_{i+1} − ψ_i‖ ≤ K ∑_{i=m}^{m+n−1} ‖φ_{θ(i)}‖ → 0,    as m, n → ∞.

Hence φ = lim ψ_n exists and φ ∈ D². This also shows that φ is independent of the enumeration θ and hence we can define Oφ̄ = φ.
We can now use the composition operator to lift operators from D to D̄ T ,
starting with the zoom operators of Definition 2.1.9.
Definition 4.1.10. Let I ⊂ [0,1] be an interval, let φ̄ ∈ D̄³_T and let I_τ be the image of I under the diffeomorphism O_{<τ}(φ̄). Define Z(φ̄; I) = ψ̄, where ψ_τ = Z(φ_τ; I_τ) for every τ ∈ T.
Remark 4.1.11. An equivalent way of defining the zoom operators on D̄³_T is to let I_τ = ψ_τ^{−1}(J), where ψ_τ = O_{≥τ}(φ̄), J = φ(I), and φ = Oφ̄. This is equivalent since Oφ̄ = O_{≥τ}(φ̄) ∘ O_{<τ}(φ̄).
The original definition takes the view of zooming in on an interval in the domain of the decomposition, whereas the latter takes the view of zooming in on an interval in the range of the decomposition. We will make use of both of these points of view.
Zoom operators on diffeomorphisms are contractions for a fixed interval I by Lemma A.2.16. A similar statement holds for decompositions:
Lemma 4.1.12. Let I ⊂ [0,1] be an interval. If φ̄ ∈ D̄³_T then

    ‖Z(φ̄; I)‖ ≤ e^{‖φ̄‖} · min{|I|, |φ(I)|} · ‖φ̄‖,

where φ = Oφ̄.
Remark 4.1.13. Since we are only dealing with decompositions with very small norm, this lemma is enough for our purposes. However, in more general situations the constant in front of ‖φ̄‖ may not be small enough. A way around this is to consider decompositions which compose to diffeomorphisms with negative Schwarzian derivative. Then all the intervals I_τ will have hyperbolic lengths bounded by that of J (notation as in Remark 4.1.11). This can then be used to show that zoom operators contract, and the contraction can be bounded in terms of the hyperbolic length of J.
Proof. Using the notation of Definition 4.1.10 we have

    ‖Z(φ̄; I)‖ = ∑_{τ∈T} ‖Z(φ_τ; I_τ)‖ ≤ ∑_{τ∈T} |I_τ| · ‖φ_τ‖ ≤ sup_{τ∈T} |I_τ| · ‖φ̄‖.

For every τ there exists ξ_τ ∈ I such that |I_τ| = (O_{<τ}(φ̄))′(ξ_τ) · |I|, which together with Lemma 4.1.7 implies that |I_τ| ≤ e^{‖φ̄‖} · |I|. Similarly, there exists η_τ ∈ I_τ such that |φ(I)| = (O_{≥τ}(φ̄))′(η_τ) · |I_τ|, so by Lemma 4.1.7 |I_τ| ≤ e^{‖φ̄‖} · |φ(I)| as well.
This contraction property of the zoom operators leads us to introduce
the subspace of pure decompositions (the intuition is that renormalization
contracts towards the pure subspace, see Proposition 4.2.8).
Definition 4.1.14. The subspace of pure decompositions Q̄_T ⊂ D̄_T consists of all decompositions φ̄ such that φ_τ is a pure map for every τ ∈ T.
The subspace of pure maps Q ⊂ D^∞ consists of restrictions of x^α away from the critical point, that is,

    Q = { Z(x|x|^{α−1}; I) | int I ∌ 0 }.
A property of pure maps is that they can be parametrized by one real variable. We choose to parametrize the pure maps by their distortion with a sign and call this parameter s. The sign of s is positive for I to the right of 0 and negative for I to the left of 0. With this convention the graphs of pure maps look like Figure 4.1.
[Figure 4.1: The graphs of a pure map µ_s for different values of the signed distortion s (s > 0, s = 0, s < 0).]
Remark 4.1.15. Let µ_s ∈ Q. A calculation shows that

    Dist µ_s = |log µ_s′(1)/µ_s′(0)|

and from this it is possible to deduce an expression for µ_s:

    (4.1)    µ_s(x) = [ (1 + (exp{s/(α−1)} − 1) x)^α − 1 ] / [ exp{αs/(α−1)} − 1 ],    x ∈ [0,1], s ≠ 0,

and µ₀ = id. We emphasize that the parametrization is chosen so that |s| equals the distortion of µ_s. For this reason we call s the signed distortion of µ_s. Figure 4.1 shows the graphs of µ_s for different values of s. Equation (4.1) may at first seem to indicate that there is some sort of singular behavior at s = 0, but this is not the case; the family s ↦ µ_s is smooth.
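As a quick numerical illustration (mine, not the thesis'): evaluating the derivative of (4.1) confirms that log(µ_s′(1)/µ_s′(0)) = s for any α > 1, i.e. that |s| really is the distortion of µ_s.

    import numpy as np

    def mu_prime(s, alpha):
        # Derivative of the pure map (4.1).
        r = np.exp(s / (alpha - 1)) - 1
        return lambda x: alpha * r * (1 + r * x) ** (alpha - 1) / (np.exp(alpha * s / (alpha - 1)) - 1)

    for alpha in (1.7, 2.0):
        for s in (-1.0, 0.3, 2.0):
            d = mu_prime(s, alpha)
            print(np.log(d(1.0) / d(0.0)))   # prints s: |s| is the distortion of mu_s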
The next two lemmas are needed in preparation for Proposition 4.2.8.
Lemma 4.1.16. Let φ ∈ D² and let I ⊂ [0,1] be an interval. Then

    d(Z(φ; I), Q) ≤ |I| · d(φ, Q),

where the distance d(·,·) is induced by the C⁰–nonlinearity norm.

Proof. A calculation shows that

    Nµ_s(x) = r_s(α − 1) / (1 + r_s x),    r_s = exp{s/(α−1)} − 1.
Let I = [a, b] and let ζ_I(x) = a + |I| · x. Then

    d(Z(φ; I), Q) = inf_{s∈ℝ} max_{x∈[0,1]} | N(Z(φ; I))(x) − Nµ_s(x) |
                  = inf_{r>−1} max_{x∈[0,1]} | |I| · Nφ(ζ_I(x)) − r(α−1)/(1+rx) |
                  = inf_{r>−1} max_{x∈[0,1]} | |I| · Nφ(ζ_I(x)) − r(α−1)/(1 + r(ζ_I(x) − a)/|I|) |
                  = |I| · inf_{ρ∉[−1/b,−1/a]} max_{x∈I} | Nφ(x) − ρ(α−1)/(1+ρx) |,

where ρ = r/(b − (1+r)a). Note that 1 + ρx has a zero in [0,1] if ρ ≤ −1, so the infimum is assumed for ρ > −1. Thus

    d(Z(φ; I), Q) = |I| · inf_{ρ>−1} max_{x∈I} | Nφ(x) − ρ(α−1)/(1+ρx) |.

Taking the max over x ∈ [0,1] instead of over x ∈ I can only increase the right-hand side, so d(Z(φ; I), Q) ≤ |I| · d(φ, Q), which finishes the proof.
Lemma 4.1.17. Let φ̄ ∈ D̄³_T and let I ⊂ [0,1] be an interval. Then

    d(Z(φ̄; I), Q̄_T) ≤ e^{‖φ̄‖} · min{|I|, |φ(I)|} · d(φ̄, Q̄_T),

where φ = Oφ̄.

Proof. Use Lemma 4.1.16 and a similar argument to that employed in the proof of Lemma 4.1.12.
The pure decompositions have some very nice properties which we will
make use of repeatedly.
Proposition 4.1.18. If φ̄ ∈ Q̄_T and ‖φ̄‖ < ∞, then φ = Oφ̄ is in D^∞ and φ has nonpositive Schwarzian derivative.

Remark 4.1.19. Note that ‖φ̄‖ < ∞ is equivalent to Dist φ̄ < ∞, since

    Dist µ = | ∫₀¹ Nµ(x) dx |,

for pure maps µ. Hence the norm bound can be replaced by a distortion bound and the above proposition still holds.
Proof. Let η be the nonlinearity of a pure map. A computation gives

    D^k η(x) = ((−1)^k k! / (α−1)^k) · η(x)^{k+1}.

Hence, if η is bounded then so are all of its derivatives (of course, the bound depends on k). Thus Proposition 4.1.5 shows that φ = Oφ̄ is well-defined and φ ∈ D^k for all k ≥ 2 (use Remark 4.1.6).
Finally, every pure map has negative Schwarzian derivative, so φ must have nonpositive Schwarzian derivative, since negative Schwarzian is preserved under composition by Lemma A.3.4.
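This identity is mechanical to verify symbolically; a short check (my own illustration) for the first few k, using the closed form Nµ_s(x) = r(α−1)/(1+rx) from the proof of Lemma 4.1.16:

    import sympy as sp

    x, r, alpha = sp.symbols('x r alpha', positive=True)
    eta = r * (alpha - 1) / (1 + r * x)      # nonlinearity of a pure map

    for k in range(1, 4):
        lhs = sp.diff(eta, x, k)
        rhs = (-1) ** k * sp.factorial(k) / (alpha - 1) ** k * eta ** (k + 1)
        print(k, sp.simplify(lhs - rhs) == 0)    # True for each k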
Notation. We put a bar over objects associated with decompositions to distinguish them from diffeomorphisms. Hence φ̄ denotes a decomposition,
whereas φ denotes a diffeomorphism. Similarly, D̄ denotes a set of decompositions, whereas D is a set of diffeomorphisms.
Given a decomposition φ̄ : T → D , we use the notation φτ to mean φ̄(τ )
and we call this the diffeomorphism at time τ. Moreover, when talking
about φ̄ we consistently write φ to denote the composed map Oφ̄.
We will frequently consider the disjoint union of all decompositions instead of decompositions over some fixed time set T, and for this reason we introduce the notation

    D̄ = ⊔_T D̄_T    and    Q̄ = ⊔_T Q̄_T.

4.2 Renormalization of decomposed maps
In this section we lift the renormalization operator to the space of decomposed Lorenz maps (i.e. Lorenz maps whose diffeomorphic parts are replaced with decompositions). We prove that renormalization contracts towards the subspace of pure decomposed maps. This will be used in later
sections to compute the derivative of R on its limit set.
Definition 4.2.1. Let T = (T₀, T₁) be a pair of time sets, and let D̄_T denote the product D̄_{T₀} × D̄_{T₁}. The space of decomposed Lorenz maps L̄_T over T is the set [0,1]² × (0,1) × D̄_T together with the structure induced from the Banach space ℝ³ × D̄_T with the max norm of the products.
Definition 4.2.2. The composition operator induces a map L̄³_T → L², which (by slight abuse of notation) we will also denote O. Explicitly, if f̄ = (u, v, c, φ̄, ψ̄) ∈ L̄_T, then f = O f̄ is defined by f = (u, v, c, Oφ̄, Oψ̄).
We will now define the renormalization operator on the space of decomposed Lorenz maps. Formally, the definition is identical to the definition of the renormalization operator on Lorenz maps. To illustrate this, let f = O f̄ be renormalizable. Then, by Lemma 2.1.11, R f = (u′, v′, c′, φ′, ψ′), where

    (4.2)    u′ = |Q(L)|/|U|,    v′ = |Q(R)|/|V|,    c′ = |L|/|C|,

φ′ = Z(f^a ∘ φ; U) and ψ′ = Z(f^b ∘ ψ; V). Zoom operators satisfy

    Z(g ∘ h; I) = Z(g; h(I)) ∘ Z(h; I),

so we can write

    φ′ = Z(ψ; Q(U_a)) ∘ Z(Q; U_a) ∘ ⋯ ∘ Z(ψ; Q(U₁)) ∘ Z(Q; U₁) ∘ Z(φ; U),
    ψ′ = Z(φ; Q(V_b)) ∘ Z(Q; V_b) ∘ ⋯ ∘ Z(φ; Q(V₁)) ∘ Z(Q; V₁) ∘ Z(ψ; V).
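The factorization above rests only on the identity Z(g∘h; I) = Z(g; h(I)) ∘ Z(h; I). Assuming Z(φ; I) denotes the affine rescaling of φ restricted to I to a map of [0,1] (the convention of Definition 2.1.9), the identity is straightforward to confirm numerically; a sketch under that assumption:

    import numpy as np

    def zoom(phi, a, b):
        # Z(phi; [a,b]): affinely rescale phi restricted to [a,b] to a map of [0,1].
        # (Assumes phi is increasing, so phi([a,b]) = [phi(a), phi(b)].)
        pa, pb = phi(a), phi(b)
        return lambda x: (phi(a + (b - a) * x) - pa) / (pb - pa)

    h = lambda x: x / (2 - x)                # two diffeomorphisms of [0,1]
    g = lambda x: 2 * x / (1 + x)
    a, b = 0.2, 0.7                          # I = [a, b]

    lhs = zoom(lambda x: g(h(x)), a, b)      # Z(g o h; I)
    rhs_in = zoom(h, a, b)                   # Z(h; I)
    rhs_out = zoom(g, h(a), h(b))            # Z(g; h(I))
    x = np.linspace(0.0, 1.0, 7)
    print(np.allclose(lhs(x), rhs_out(rhs_in(x))))   # True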
Definition 4.2.3. Define R f̄ = (u′, v′, c′, φ̄′, ψ̄′), where u′, v′, c′ are given by (4.2) and

    φ̄′ = Z(φ̄; U) ⊕ Z(Q; U₁) ⊕ Z(ψ̄; Q(U₁)) ⊕ ⋯ ⊕ Z(Q; U_a) ⊕ Z(ψ̄; Q(U_a)),
    ψ̄′ = Z(ψ̄; V) ⊕ Z(Q; V₁) ⊕ Z(φ̄; Q(V₁)) ⊕ ⋯ ⊕ Z(Q; V_b) ⊕ Z(φ̄; Q(V_b)),

where Z(Q; ·) is now interpreted as a decomposition over a singleton time set. See Figure 4.2 for an illustration of the action of R.
Definition 4.2.4. The domain of R on decomposed Lorenz maps is contained in the disjoint union L̄ = ⊔_T L̄_T over all time sets T. Just as before we let L̄_ω denote all ω–renormalizable maps in L̄; L̄_ω̄ denotes all maps f̄ in L̄ such that R^i f̄ ∈ L̄_{ω_i}, where ω̄ = (ω₀, ω₁, …); and L̄_Ω = ∪_{ω∈Ω} L̄_ω.
Remark 4.2.5. Note that R takes the renormalizable maps of L̄_T into L̄_{T′}, where T′ ≠ T in general. This is the reason why we have to work with the disjoint union ⊔_T L̄_T.
[Figure 4.2: Illustration of the renormalization operator acting on decomposed Lorenz maps. First the decompositions are 'glued' to each other with Q according to the type of renormalization; here the type is (01, 100). Then the interval C is pulled back, creating the shaded areas in the picture. The maps following the dashed arrows from U to C and from V to C represent the new decompositions before rescaling.]
Lemma 4.2.6. The composition operator is a semi-conjugacy. That is, the following square commutes

    L̄³_ω ---R---> L̄³
      |            |
      O            O
      v            v
    L²_ω ---R---> L²

and O is surjective.
Remark 4.2.7. This lemma shows that we can use the composition operator to transfer results about decomposed Lorenz maps to Lorenz maps.

Proof. The square commutes by definition, so let us focus on the surjectivity. Fix τ ∈ T and define a map Γ_τ : D → D̄_T by sending φ ∈ D to the decomposition φ̄ : T → D defined by φ̄(t) = φ if t = τ, and φ̄(t) = id otherwise. Then O ∘ Γ_τ = id, which proves that O is surjective on D̄_T, and hence it is also surjective on L̄_T.
The main result for the renormalization operator on Lorenz maps was
the existence of the invariant set K for types in the set Ω, see Section 3.1.
It should come as no surprise that K and Ω will be central to our discussion on decomposed maps as well. The first result in this direction is the
following.
Proposition 4.2.8. If f̄ ∈ L̄³_ω̄ is infinitely renormalizable with ω̄ ∈ Ω^ℕ, if ‖φ̄‖ ≤ K and ‖ψ̄‖ ≤ K, and if O f̄ ∈ K ∩ L^S, then the decompositions of R^n f̄ are uniformly contracted towards the subset of pure decompositions.
Proof. From the definition of the renormalization operator (and using the fact that d(Z(Q; I), Q) = 0) we get

    d(φ̄′, Q̄) = ∑_{i=1}^{a} d(Z(ψ̄; Q(U_i)), Q̄) + d(Z(φ̄; U), Q̄).

Now apply Lemma 4.1.17 to get

    d(φ̄′, Q̄) ≤ e^{‖ψ̄‖} ∑_{i=2}^{a+1} |U_i| d(ψ̄, Q̄) + e^{‖φ̄‖} |U₁| d(φ̄, Q̄).

From Section 3.1 we get that ∑|U_i| and ∑|V_i| may be chosen arbitrarily small (by choosing the return times sufficiently large). Now make these sums small compared with max{e^{‖φ̄‖}, e^{‖ψ̄‖}} to see that there exists µ < 1 (only depending on K) such that

    d(φ̄′, Q̄) + d(ψ̄′, Q̄) ≤ µ ( d(φ̄, Q̄) + d(ψ̄, Q̄) ).
Our main goal is to understand the limit set of the renormalization operator and the above proposition will be central to this discussion.
Definition 4.2.9. The set of forward limits of R restricted to types in Ω is defined by

    A_Ω = ∩_{n≥1} R^n ( ∪_{ω̄∈Ω^n} L̄_ω̄ ).
Remark 4.2.10. In other words, A_Ω consists of all maps f̄ which have a complete past:

    f̄ = R_{ω_{−1}} f̄_{−1},    f̄_{−1} = R_{ω_{−2}} f̄_{−2},    …,    ω_i ∈ Ω.

This also describes how we can associate each f̄ ∈ A_Ω with a left infinite sequence (…, ω_{−2}, ω_{−1}).
Proposition 4.2.11. AΩ is contained in the subset of pure decomposed Lorenz
maps.
Proof. This is a direct consequence of Proposition 4.2.8.
Since AΩ is contained in the set of pure decomposed maps we will restrict our attention to this subset from now on. This is extremely convenient
since pure decompositions satisfy some very strong properties, see Proposition 4.1.18, and it will allow us to compute the derivative at all points
in AΩ in Section 5.1.
Next we would like to lift the invariant set K to the decomposed maps, but simply taking the preimage O⁻¹(K) will yield a set which is too large¹, so we will have to be a bit careful.
Definition 4.2.12. Define

    K̄ = { (u, v, c, φ̄, ψ̄) | ε₋ ≤ 1 − c ≤ ε₊, Dist φ̄ ≤ δ, Dist ψ̄ ≤ δ, φ̄, ψ̄ ∈ Q̄ },

and

    K̄_Ω = { f̄ ∈ K̄ ∩ L̄_Ω | c₁⁺(R f̄) ≤ 1/2 ≤ c₁⁻(R f̄) }.

Note that K̄ is defined analogously to K but with the additional assumption that the decompositions are pure. The notation used here is the same as that of Section 3.1.
Proposition 4.2.13. If b− is sufficiently large, then R(K̄Ω ) ⊂ K̄.
Proof. Let f = O f̄ = (u, v, c, φ, ψ). Note first of all that Dist φ̄ ≤ δ implies that Dist φ ≤ δ, since Dist satisfies the subadditivity property

    Dist γ₂ ∘ γ₁ ≤ Dist γ₁ + Dist γ₂.

Hence f automatically satisfies the conditions of Theorem 3.1.5, so all we need to prove is that Dist φ̄′ ≤ δ and Dist ψ̄′ ≤ δ. This is the reason why we define K̄ by a distortion bound instead of a norm bound. Note that f has nonpositive Schwarzian since the decompositions are pure, see Proposition 4.1.18.
¹ Any preimage under O contains decompositions whose norm is arbitrarily large. As an example of how things can go wrong, fix K > 0 and consider φ̄ : ℕ → D defined by φ_{n+1} = φ_n^{−1} and ‖φ_n‖ = K for every n. Then φ_{2n−1} ∘ ⋯ ∘ φ₀ = id for every n, but ∑‖φ_n‖ = ∞.
We will first show that the norm is invariant; then we transfer this invariance to the distortion. The reason why we consider the norm first is that it satisfies the contraction property in Lemma 4.1.12, which makes it easier to work with.
From the definition of R and Lemma 4.1.12 we get

    ‖φ̄′‖ = ‖Z(φ̄; U)‖ + ∑_{i=1}^{a} [ ‖Z(ψ̄; Q(U_i))‖ + ‖Z(Q; U_i)‖ ]
          ≤ e^{‖φ̄‖} ‖φ̄‖ · |U₁| + e^{‖ψ̄‖} ‖ψ̄‖ ∑_{i=2}^{a+1} |U_i| + ∑_{i=1}^{a} ‖Z(Q; U_i)‖.

The norm of a pure map is determined by how far away its domain is from the critical point. More precisely, we have that

    ∑_{i=1}^{a} ‖Z(Q; U_i)‖ = (α − 1) ∑_{i=1}^{a} |U_i| / d(c, U_i).

Each term in this sum is bounded by the cross-ratio of U_i inside [c, 1]. Since maps with positive Schwarzian contract cross-ratio, since S f < 0, and since U_i is a pull-back of C under an iterate of f, this cross-ratio is bounded by the cross-ratio χ of C inside [c₁⁺, 1]. Thus the above sum is bounded by a(α − 1)χ. From the proof of Theorem 3.1.5 we know that χ is of the order ε^t for some t > 0. Since a < b₋ and b₋ ε^t → 0, we see that the above sum has a uniform bound which tends to zero as b₋ → ∞.
A similar argument for ψ̄′ gives

    ‖φ̄′‖ + ‖ψ̄′‖ ≤ (‖φ̄‖ + ‖ψ̄‖) exp{‖φ̄‖ + ‖ψ̄‖} ( ∑|U_i| + ∑|V_i| ) + m
               = k (‖φ̄‖ + ‖ψ̄‖) + m,

where m = ∑‖Z(Q; U_i)‖ + ∑‖Z(Q; V_i)‖. The arguments above and the proof of Theorem 3.1.5 show that we can choose b₋ and δ so that

    δ ≥ m/(1 − k),

which proves that

    ‖φ̄‖ + ‖ψ̄‖ ≤ δ    ⟹    ‖φ̄′‖ + ‖ψ̄′‖ ≤ δ.
The final observation, which we use to finish the proof, is that if γ ∈ Q then

    ‖γ‖ = (α − 1) · ( exp{ Dist γ/(α − 1) } − 1 ).

That is, ‖γ‖ ≈ Dist γ for pure maps γ with small distortion. This allows us to slightly modify the above invariance argument for the norm so that it holds for the distortion as well.
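This relation between norm and distortion of a pure map can be checked directly from the closed forms derived earlier (the sup of |Nµ_s| against Dist µ_s = |s|); a small sketch, with sample values of α and s of my choosing:

    import numpy as np

    alpha = 1.8
    x = np.linspace(0.0, 1.0, 100001)
    for s in (0.05, 0.5, 2.0, -1.0):
        r = np.exp(s / (alpha - 1)) - 1
        norm = np.max(np.abs(r * (alpha - 1) / (1 + r * x)))        # sup |N mu_s|
        predicted = (alpha - 1) * (np.exp(abs(s) / (alpha - 1)) - 1)
        print(np.isclose(norm, predicted))                          # True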
Chapter 5

Differentiable structure
This chapter begins with a calculation of the derivative of the renormalization operator on a subset of the pure decomposed Lorenz maps in Section 5.1. The derivative restricted to the parameter plane is orientation-preserving, and this turns out to have strong consequences for the geometry of the domains of renormalization. This is discussed in Section 5.2. After this, the estimates on the norm of the derivative are used to construct an expanding invariant cone field in Section 5.3. The cone field is then used in Section 5.4 to construct unstable manifolds at each point in the limit set of renormalization.
5.1 The derivative
The tangent space of R on the pure decomposed Lorenz maps can be written X × Y, where X = ℝ² and Y = ℝ × ℓ¹ × ℓ¹. The coordinates on X correspond to the (u, v) coordinates on L̄_T. Let (x, y) ∈ X × Y denote the coordinates on the tangent space and recall that we are using the max norm on the products. The derivative of R at f̄ is denoted

    D R_{f̄} = M = ( M₁  M₂ ; M₃  M₄ ),

where M₁ : ℝ² → ℝ², M₂ : Y → ℝ², M₃ : ℝ² → Y and M₄ : Y → Y are bounded linear operators.
Remark 5.1.1. The fact that the derivative on the pure decomposed maps can
be written as an infinite matrix is one of the reasons why we restrict ourselves to the pure decompositions. Deformations of pure decompositions
are also easy to deal with since they are ‘monotone’ in the sense that the
dynamical intervals that define the renormalization move monotonically
under such deformations. This makes it possible to estimate the elements
of the derivative matrix.
Theorem 5.1.2. There exist constants k and K such that if f̄ ∈ K̄ ∩ L̄_Ω, then

    ‖M₁x‖ ≥ k min{|U|⁻¹, |V|⁻¹} · ‖x‖,        ‖M₂‖ ≤ K|C|⁻¹,
    ‖M₃x‖ ≤ Kρ ( |x₁|/|U| + |x₂|/|V| ),       ‖M₄‖ ≤ Kρ|C|⁻¹,

where ρ = max{ε′, Dist φ̄, Dist ψ̄} and b₋ is sufficiently large.
Remark 5.1.3. The sets K̄ and K̄Ω are introduced in Definition 4.2.12 and
Ω is given by Section 3.1 as always. Some results in this section are stated
for K̄ but others are only valid for the subset K̄Ω . The main difference
between these two sets is that maps in K̄Ω have good bounds on u, v and
c for the renormalization due to Proposition 4.2.13, whereas for maps in K̄
we cannot say much about the renormalization.
Proof. The proof of this theorem is split up into a few propositions that
are in this section. The estimate for M1 is given in Corollary 5.1.11. The
estimates for M2 and M4 follow from Propositions 5.1.12 and 5.1.15. Finally,
the estimate for M3 follows from Propositions 5.1.12 and 5.1.13.
Notation. Let f̄ = (u, v, c, φ̄, ψ̄) and as always use primes to denote the renormalization R f̄ = (u′, v′, c′, φ̄′, ψ̄′). We introduce special notation for the diffeomorphic parts of the renormalization before rescaling:

    (5.1)    Φ = f₁^a ∘ φ,    Ψ = f₀^b ∘ ψ,

so that Φ : U → C, Ψ : V → C, and C = (p, q). Note that p and q are by definition periodic points of periods a + 1 and b + 1, respectively.
We will use the notation ∂_s t to denote the partial derivative of t with respect to s. In the formulas below we write ∂t to mean the partial derivative of t with respect to any direction.
The notation g(x) ≍ y means that there exists K < ∞, not depending on g, such that K⁻¹y ≤ g(x) ≤ Ky for all x in the domain of g.
The ∂ operator satisfies the following rules:

Lemma 5.1.4. The following expressions hold whenever they make sense:

    (5.2)    ∂(f ∘ g)(x) = ∂f(g(x)) + f′(g(x)) ∂g(x),
    (5.3)    ∂f^{n+1}(x) = ∑_{i=0}^{n} Df^{n−i}(f^{i+1}(x)) · ∂f(f^i(x)),
    (5.4)    ∂f^{−1}(x) = − ∂f(f^{−1}(x)) / f′(f^{−1}(x)).

Furthermore, if f(p) = p then

    (5.5)    ∂p = − ∂f(p) / (f′(p) − 1).

Remark 5.1.5. The ∂ operator clearly also satisfies the product rule

    (5.6)    ∂(f · g)(x) = ∂f(x) g(x) + f(x) ∂g(x).

This and the chain rule give the quotient rule

    (5.7)    ∂(f/g)(x) = ( ∂f(x) g(x) − f(x) ∂g(x) ) / g(x)².

Proof. Equation (5.2) implies the other three. The second equation follows from an induction argument, and the last two follow from

    0 = ∂(x) = ∂(f ∘ f^{−1})(x) = ∂f(f^{−1}(x)) + f′(f^{−1}(x)) ∂f^{−1}(x),

and

    ∂(p) = ∂(f(p)) = ∂f(p) + f′(p) ∂p.

Equation (5.2) itself can be proved by writing f_ε(x) = f(x) + ε f̂(x), g_ε(x) = g(x) + ε ĝ(x) and using Taylor expansion:

    f_ε(g_ε(x)) = f_ε(g(x)) + ε f_ε′(g(x)) ĝ(x) + O(ε²)
               = f(g(x)) + ε [ f̂(g(x)) + f′(g(x)) ĝ(x) ] + O(ε²).
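Rule (5.5) is the one used repeatedly below, so here is a throwaway numerical check (my own; the logistic family and the constant perturbation are arbitrary choices): for f_ε = f + ε·f̂ the fixed point moves at rate −∂f(p)/(f′(p) − 1) to first order in ε.

    import numpy as np
    from scipy.optimize import brentq

    lam = 3.2
    f = lambda x: lam * x * (1 - x)
    p = 1 - 1 / lam                          # fixed point of f
    fp = lam * (1 - 2 * p)                   # f'(p)

    eps = 1e-6                               # perturbation: f_eps = f + eps (so df = 1)
    p_eps = brentq(lambda x: f(x) + eps - x, 0.5, 0.99)

    print((p_eps - p) / eps)                 # numerical derivative of the fixed point
    print(-1.0 / (fp - 1))                   # rule (5.5): both are ~0.4545...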
We now turn to computing the derivative matrix M. The first three rows
of M are given by the following formulas.
Lemma 5.1.6. The partial derivatives of u′, v′ and c′ are given by

    ∂u′ = [ ∂(Q₀(c) − Q₀(p)) − u′ · ∂(Φ⁻¹(q) − Φ⁻¹(p)) ] / |U|,
    ∂v′ = [ ∂(Q₁(q) − Q₁(c)) − v′ · ∂(Ψ⁻¹(q) − Ψ⁻¹(p)) ] / |V|,
    ∂c′ = [ ∂(c − p) − c′ · ∂(q − p) ] / |C|.
Proof. Use (4.2), Lemma 5.1.4 and Remark 5.1.5.
Let us first consider how to use these formulas when deforming in the u, v or c directions (i.e. the first three columns of M). Almost everything in these formulas is completely explicit: we have expressions for Q₀ and Q₁, so evaluating for example ∂_u Q₀(c) is routine. In order to evaluate for example the term ∂_u Ψ⁻¹(q) we make use of (5.4) and (5.3). This involves estimating the sum in (5.3), which can be done with mean value theorem estimates. The terms ∂p and ∂q are evaluated using (5.5) and the fact that p = Φ ∘ Q₀(p) and q = Ψ ∘ Q₁(q). There are a few shortcuts which make the calculations simpler as well; for example, ∂_u Φ = 0 since Φ does not contain Q₀, which is the only term that depends on u, and so on.
Deforming in the φ̄ or ψ̄ directions (there are countably many such directions) is similar. Here we make use of the fact that the decompositions are pure, and we have an explicit formula (4.1) for pure maps in which the free parameter represents the signed distortion (see Remark 4.1.15), so we can compute their derivative, partial derivative with respect to distortion, etc. These deformations will affect the partial derivatives of any expression involving Φ or Ψ, but all others will not 'see' these deformations. The calculations involved do not make any particular use of which direction we deform in, so even though there are countably many directions we essentially only need to perform one calculation for φ̄ and another for ψ̄.
We now turn to computing the partial derivatives of φ̄′ and ψ̄′.

Lemma 5.1.7. Let µ_{s′} = Z(µ_s; I), where µ_s, µ_{s′} ∈ Q and I = [x, y]. Then

    ∂s′ = Nµ_s(y) ∂y − Nµ_s(x) ∂x + ∂(Dµ_s)(y)/Dµ_s(y) − ∂(Dµ_s)(x)/Dµ_s(x).
Proof. By definition s = log{Dµ_s(1)/Dµ_s(0)}. Distortion is invariant under zooming, so this shows that s′ = log{Dµ_s(y)/Dµ_s(x)}. A calculation gives

    ∂ log Dµ_s(x) = ∂(Dµ_s)(x)/Dµ_s(x) + Nµ_s(x) ∂x.
By definition φ̄′ consists of maps of the form Z(µ_s; I) (as well as finitely many of the form Z(Q; I), but these can be thought of as lim_{s→±∞} Z(µ_s; I)). Hence the above lemma shows us how to compute the partial derivatives at each time in φ̄′. Note that we implicitly identify ℝ with Q via s ↦ µ_s.
In order to use the lemma we also need a way to evaluate the terms ∂x and ∂y. One way to do this is to express these in terms of ∂p and ∂q, which have already been computed at this stage. If we let T : I → [p, q] denote the 'transfer map' to C, then p = T(x) and hence (5.2) shows that

    ∂x = (∂p − ∂T(x)) / DT(x).

The terms ∂T and DT can be bounded by ∂Φ and DΦ (or ∂Ψ and DΨ), all of which have already been computed as well.
Proposition 5.1.8. If f ∈ K ∩ L_Ω, then

    M₁ = ( m₁₁  m₁₂ ; m₂₁  m₂₂ ) + M₁ᵉ,

where

    m₁₁ = (1/|U|) [ 1 + (1 − u′) Q(p) / ( u (Df^{a+1}(p) − 1) ) ],
    m₁₂ = −(1/|U|) (u′/v) · ( DΨ(Ψ⁻¹(q)) / DΦ(Φ⁻¹(q)) ) · (1 − Q(q)) / (Df^{b+1}(q) − 1),
    m₂₁ = −(1/|V|) (v′/u) · ( DΦ(Φ⁻¹(p)) / DΨ(Ψ⁻¹(p)) ) · Q(p) / (Df^{a+1}(p) − 1),
    m₂₂ = (1/|V|) [ 1 + (1 − v′)(1 − Q(q)) / ( v (Df^{b+1}(q) − 1) ) ],

and the error term M₁ᵉ is negligible.
Remark 5.1.9. Note that this proposition does not need any assumptions on
the critical values of the renormalization (cf. Theorem 3.1.5). This will be
important later on when we discuss the structure of the parameter plane.
Also note that the M1 part of the derivative matrix has nothing to do with
decompositions so it is stated for nondecomposed Lorenz maps.
Proof. We begin by computing ∂p and ∂q. Use Φ ∘ Q₀(p) = p, Ψ ∘ Q₁(q) = q, and (5.5) to get

    (5.8)    ∂_u p = − DΦ(Q₀(p)) ∂_u Q₀(p) / (Df^{a+1}(p) − 1),    ∂_u q = − ∂_u Ψ(Q₁(q)) / (Df^{b+1}(q) − 1),
    (5.9)    ∂_v p = − ∂_v Φ(Q₀(p)) / (Df^{a+1}(p) − 1),           ∂_v q = − DΨ(Q₁(q)) ∂_v Q₁(q) / (Df^{b+1}(q) − 1).
Here we have used that ∂u Φ = 0 and ∂v Ψ = 0.
Next, let us estimate ∂_u Ψ. Let x ∈ V and let x_i = f^i ∘ ψ(x). From (5.3) we get

    ∂_u Ψ(x) = ∂_u(f^b ∘ ψ)(x) = ∂_u f(x_{b−1}) + ∑_{i=1}^{b−1} Df^{b−i}(x_i) ∂_u f(x_{i−1}),

where ∂_u f(x) = φ′(Q₀(x)) Q₀(x)/u. Note that ∂_u f(x_{i−1}) ≤ e^{2δ} x_i/u. In order to bound the sum we divide the estimate into two parts. Let n < b be the smallest integer such that Df(x_i) ≤ 1 for all i ≥ n. In the part where i < n we estimate

    Df^{b−i}(x_i) x_i = Df^{n−i}(x_i) Df^{b−n}(x_n) x_i ≤ K₁ (x_n/x_i) Df(x_{b−1}) x_i ≤ K₂ ε^{1−1/α}.

Here we have used the mean value theorem to find ξ_i ≤ x_i such that Df^{n−i}(ξ_i) = x_n/x_i and Df^{n−i}(x_i) ≤ K₁ Df^{n−i}(ξ_i), since φ has very small distortion. In the part where i ≥ n we estimate

    Df^{b−i}(x_i) x_i ≤ Df(x_{b−1}) ≤ K ε^{1−1/α}.

Summing over the two parts gives us the estimate

    ∑_{i=1}^{b−1} Df^{b−i}(x_i) ∂_u f(x_{i−1}) ≤ K(b − 1) ε^{1−1/α}.

Hence

    (5.10)    ∂_u Ψ(x) = ∂_u f(f^{b−1} ∘ ψ(x)) + O(bε^{1−1/α}) ≈ 1.
We will now estimate ∂_v Φ. Let x ∈ U and let x_i = f^i ∘ φ(x). Similarly to the above, we have

    ∂_v Φ(x) = ∂_v f(x_{a−1}) + ∑_{i=1}^{a−1} Df^{a−i}(x_i) ∂_v f(x_{i−1}),

where ∂_v f(x) = −ψ′(Q₁(x))(1 − Q₁(x))/v. By the mean value theorem there exists ξ_i ∈ [x_i, 1] such that Df^{a−i}(ξ_i) = (1 − x_a)/(1 − x_i), since f^{a−i}(x_i) = x_a. From Lemma 3.1.10 it follows that Df^{a−i}(x_i) ≍ Df^{a−i}(ξ_i). Putting all of this together we get that the sum above is proportional to

    ∑_{i=1}^{a−1} Df^{a−i}(ξ_i)(1 − x_i) = (a − 1)(1 − x_a).

Thus

    (5.11)    ∂_v Φ(x) ≍ −aε,

since x_a ∈ C and hence 1 − x_a = ε + O(|C|) ≈ ε.
We now have all the ingredients we need to compute M₁. Lemma 5.1.6 shows that

    |U| ∂_u u′ = ∂_u Q₀(c) − ∂_u Q₀(p) − Q₀′(p) ∂_u p − u′ ( DΦ⁻¹(q) ∂_u q − DΦ⁻¹(p) ∂_u p ).

Here we have used ∂_u Φ = 0. Now use (5.8) to get

    Q₀′(p) ∂_u p = −∂_u Q₀(p) · Df^{a+1}(p)/(Df^{a+1}(p) − 1),    DΦ⁻¹(p) ∂_u p = −∂_u Q₀(p)/(Df^{a+1}(p) − 1).

Thus

    (5.12)    |U| ∂_u u′ = 1 + (1 − u′) ∂_u Q₀(p)/(Df^{a+1}(p) − 1) + u′ ∂_u Ψ(Q₁(q)) / ( DΦ(Φ⁻¹(q)) (Df^{b+1}(q) − 1) ).

The last term is much smaller than one because of (5.10) and since |DΦ| ≫ 1 (and also Df^{b+1}(q) ≈ α/ε′ > α).
From Lemma 5.1.6 we get

    |V| ∂_v v′ = ∂_v Q₁(q) + Q₁′(q) ∂_v q − ∂_v Q₁(c) − v′ ( DΨ⁻¹(q) ∂_v q − DΨ⁻¹(p) ∂_v p ).

Here we have used ∂_v Ψ = 0. Now use (5.9) to get

    Q₁′(q) ∂_v q = −∂_v Q₁(q) · Df^{b+1}(q)/(Df^{b+1}(q) − 1),    DΨ⁻¹(q) ∂_v q = −∂_v Q₁(q)/(Df^{b+1}(q) − 1).

Thus

    (5.13)    |V| ∂_v v′ = 1 − (1 − v′) ∂_v Q₁(q)/(Df^{b+1}(q) − 1) − v′ ∂_v Φ(Q₀(p)) / ( DΨ(Ψ⁻¹(p)) (Df^{a+1}(p) − 1) ).

The last term is much smaller than one by (5.11) and since |DΨ| ≫ 1 (and also Df^{a+1}(p) ≈ α/c′ > α).
From Lemma 5.1.6 we get

    |U| ∂_v u′ = −Q₀′(p) ∂_v p − u′ ( ∂_v Φ⁻¹(q) + DΦ⁻¹(q) ∂_v q − ∂_v Φ⁻¹(p) − DΦ⁻¹(p) ∂_v p ).

Let us prove that the dominating term is the one with ∂_v q. From (5.9) we get

    ∂_v q = −( ∂_v Q₁(q)/Q₁′(q) ) · Df^{b+1}(q)/(Df^{b+1}(q) − 1),

which diverges as b₋ → ∞, since |R|/ε → 0 and hence Q₁′(q) → 0 (by Proposition 3.1.13). From (5.9) and (5.11) we get that ∂_v p → 0, which shows that the last term is dominated by the term with ∂_v q. Now, ∂_v Φ⁻¹(x) = −∂_v Φ(y)/DΦ(y) with y = Φ⁻¹(x), which combined with (5.11) shows that the term with ∂_v q dominates the two terms with ∂_v Φ⁻¹. Furthermore,

    Q₀′(p) ∂_v p = −( ∂_v Φ(Q₀(p))/DΦ(Q₀(p)) ) · Df^{a+1}(p)/(Df^{a+1}(p) − 1),

which combined with (5.11) shows that the term with ∂_v q dominates the above term. Thus

    (5.14)    |U| ∂_v u′ = u′ ( DΨ(Ψ⁻¹(q))/DΦ(Φ⁻¹(q)) ) · ∂_v Q₁(q)/(Df^{b+1}(q) − 1) + e,

where the error term e is tiny compared with the other term on the right-hand side.
From Lemma 5.1.6 we get

    |V| ∂_u v′ = Q₁′(q) ∂_u q − v′ ( ∂_u Ψ⁻¹(q) + DΨ⁻¹(q) ∂_u q − ∂_u Ψ⁻¹(p) − DΨ⁻¹(p) ∂_u p ).

Let us prove that the dominating term is the one with ∂_u p. From (5.8) we get

    ∂_u p = −( ∂_u Q₀(p)/Q₀′(p) ) · Df^{a+1}(p)/(Df^{a+1}(p) − 1),

which diverges as b₋ → ∞, since |L|/c → 0 and hence Q₀′(p) → 0. From (5.8) and (5.10) we get that ∂_u q is bounded, and hence the ∂_u p term dominates the second term involving ∂_u q. Now, ∂_u Ψ⁻¹(x) = −∂_u Ψ(y)/DΨ(y) with y = Ψ⁻¹(x), which combined with (5.10) shows that the ∂_u p term dominates the two terms involving ∂_u Ψ⁻¹. Furthermore,

    Q₁′(q) ∂_u q = −( ∂_u Ψ(Q₁(q))/DΨ(Q₁(q)) ) · Df^{b+1}(q)/(Df^{b+1}(q) − 1),

which combined with (5.10) shows that the ∂_u p term dominates the above term. Thus

    (5.15)    |V| ∂_u v′ = −v′ ( DΦ(Φ⁻¹(p))/DΨ(Ψ⁻¹(p)) ) · ∂_u Q₀(p)/(Df^{a+1}(p) − 1) + e,

where the error term e is tiny compared with the other term on the right-hand side.
Corollary 5.1.10. If f ∈ K ∩ L_Ω, then det M₁ > 0 for b₋ large enough.

Proof. Let

    t = [ DΦ(Φ⁻¹(p)) DΨ(Ψ⁻¹(q)) ] / [ DΦ(Φ⁻¹(q)) DΨ(Ψ⁻¹(p)) ].

From Lemma 3.1.10 and Proposition 3.1.13 we know that the distortions of Φ and Ψ tend to zero as b₋ → ∞. Hence t → 1.
From Proposition 5.1.8 we get

    |U||V| det M₁ > 1 − t (u′v′/uv) · Q(p)(1 − Q(q)) / [ (Df^{a+1}(p) − 1)(Df^{b+1}(q) − 1) ].

Note that Df^{a+1}(p) ≈ α/c′ and Df^{b+1}(q) ≈ α/ε′. If u′, v′ ≥ 1/2, then ε′ ≪ 1 by Theorem 3.1.5 and so det M₁ > 0. If not, then we can estimate

    |U||V| det M₁ > 1 − (t/(2uv)) · c′ε′/((α − c′)(α − ε′)) > 1 − (t/(2uv)) · 1/(4(α − 1/2)²) > 0,

since u and v are close to one and t can be assumed to be close to one by the above.
Corollary 5.1.11. There exists k > 0 such that if f is as above, then

    ‖M₁x‖ ≥ k · min{|U|⁻¹, |V|⁻¹} · ‖x‖.

Proof. Write M₁ as

    M₁ = ( a/|U|  −b/|V| ; −c/|U|  d/|V| ).

(Here we have used that the distortions of Φ and Ψ are small, so DΦ/DΨ ≍ |V|/|U|.) Then

    M₁⁻¹ = (ad − bc)⁻¹ ( d|U|  b|U| ; c|V|  a|V| ).

We are using the max-norm, hence

    ‖M₁⁻¹‖ = (ad − bc)⁻¹ · max{ (b + d)|U|, (c + a)|V| }.

It can be checked that (b + d)/(ad − bc) and (a + c)/(ad − bc) are bounded by some K. Let k = 1/K to finish the proof.
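A two-line numerical check of the inverse formula and the max-norm identity used here (illustration only; the values of a, b, c, d, |U|, |V| are arbitrary samples):

    import numpy as np

    a, b, c, d = 1.0, 0.3, 0.2, 1.1          # sample entries with ad - bc > 0
    U, V = 1e-3, 5e-4                        # sample interval lengths |U|, |V|

    M1 = np.array([[a / U, -b / V], [-c / U, d / V]])
    claimed = np.array([[d * U, b * U], [c * V, a * V]]) / (a * d - b * c)
    print(np.allclose(np.linalg.inv(M1), claimed))                  # True

    # Operator norm w.r.t. the max-norm = maximal absolute row sum:
    norm_inv = np.max(np.abs(claimed).sum(axis=1))
    print(np.isclose(norm_inv, max((b + d) * U, (a + c) * V) / (a * d - b * c)))  # True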
Proposition 5.1.12. If f ∈ K ∩ L_Ω, then

    ∂_c u′ ≍ −|C|⁻¹,    ∂_c v′ ≍ |C|⁻¹,    ∂_c c′ ≍ −c′ε′|C|⁻¹,
    ∂_u c′ ≍ c′ε′|U|⁻¹,    ∂_v c′ ≍ −c′ε′|V|⁻¹.
Proof. A straightforward calculation shows that

    (5.16)    ∂_c Q₀(x)/Q₀′(x) = −x/c    and    ∂_c Q₁(x)/Q₁′(x) = −(1 − x)/(1 − c).

This together with Φ ∘ Q₀(p) = p, Ψ ∘ Q₁(q) = q, (5.1) and (5.5) gives

    ∂_c p = [ (p/c) Df^{a+1}(p) − ∂_c Φ(Q₀(p)) ] / (Df^{a+1}(p) − 1),
    ∂_c q = [ ((1 − q)/ε) Df^{b+1}(q) − ∂_c Ψ(Q₁(q)) ] / (Df^{b+1}(q) − 1).

From (5.3) and (5.16) we get

    ∂_c Φ(x) = −(1/ε) ∑_{i=0}^{a−1} Df^{a−i}(x_i) · (1 − x_i),    x_i = f^i ∘ φ(x), x ∈ U,
    ∂_c Ψ(x) = −(1/c) ∑_{i=0}^{b−1} Df^{b−i}(x_i) · x_i,          x_i = f^i ∘ ψ(x), x ∈ V.
Using a similar argument as in the proof of Proposition 5.1.8, this shows that

    ∂_c Φ(x) ≍ −a    and    ∂_c Ψ(x) = −O(bε^{1−1/α}),

and hence ∂_c p ≍ 1 and ∂_c q ≍ 1.
Now apply Lemma 5.1.6, using the fact that Φ⁻¹(p) = Q₀(p), to get

    |U| ∂_c u′ = −(1 − u′) ∂_c Q₀(p) − u′ ∂_c Φ⁻¹(q).

A calculation gives

    ∂_c Q₀(p) = Df^{a+1}(p) ( p/c − ∂_c Φ(Q₀(p)) ) / ( DΦ(Q₀(p)) (Df^{a+1}(p) − 1) )

and

    ∂_c Φ⁻¹(q) = ( ∂_c q − ∂_c Φ(Φ⁻¹(q)) ) / DΦ(Φ⁻¹(q)).

(In particular, both terms have the same sign.) But DΦ(x) ≍ |C|/|U|, so this gives ∂_c u′ ≍ −|C|⁻¹. The proof that ∂_c v′ ≍ |C|⁻¹ is almost identical.
From Lemma 5.1.6 we get

    |C| ∂_c c′ = c′(1 − ∂_c q) + ε′(1 − ∂_c p),

and hence

    ∂_c c′ = (c′/ε) · [ ε′ Df^{b+1}(q) − ε|C|⁻¹ (1 − ∂_c Ψ(Q₁(q))) ] / (Df^{b+1}(q) − 1)
           + (ε′/c) · [ c′ Df^{a+1}(p) − c|C|⁻¹ (1 − ∂_c Φ(Q₀(p))) ] / (Df^{a+1}(p) − 1)
           = − c′ (1 − ∂_c Ψ(Q₁(q))) / ( |C| (Df^{b+1}(q) − 1) ) − ε′ (1 − ∂_c Φ(Q₀(p))) / ( |C| (Df^{a+1}(p) − 1) ) + O(c′ε′/ε).

Now use Df^{b+1}(q) ≈ α/ε′ and Df^{a+1}(p) ≈ α/c′ to get ∂_c c′ ≍ −c′ε′|C|⁻¹. (Note that ∂_c Φ(x) < 0 and ∂_c Ψ(x) < 0.)
Apply Lemma 5.1.6 to get

    |C| ∂_u c′ = −c′ ∂_u q − ε′ ∂_u p.

This and the proof of Proposition 5.1.8 show that

    ∂_u c′ = [ c′ (Df^{a+1}(p) − 1) ∂_u Ψ(Q₁(q)) + ε′ (Df^{b+1}(q) − 1) DΦ(Q₀(p)) ∂_u Q₀(p) ] / [ |C| (Df^{a+1}(p) − 1)(Df^{b+1}(q) − 1) ].
Since c′(Df^{a+1}(p) − 1) ≍ α − c′, ε′(Df^{b+1}(q) − 1) ≍ α − ε′, |∂_u Ψ| ≪ |DΦ|, and ∂_u Q₀(p) ≈ 1, this shows that

    ∂_u c′ ≍ c′ε′ DΦ(Q₀(p))/|C| ≍ c′ε′/|U|.

The proof that ∂_v c′ ≍ −c′ε′|V|⁻¹ is almost identical.
Notation. We need some new notation to state the remaining propositions. Each pure map φ_σ in the decomposition φ̄ can be identified with a real number, which we denote s_σ ∈ ℝ, and each ψ_τ in the decomposition ψ̄ can be identified with a real number t_τ ∈ ℝ:

    ℝ ∋ s_σ ↔ φ_σ = φ̄(σ) ∈ Q,    ℝ ∋ t_τ ↔ ψ_τ = ψ̄(τ) ∈ Q.

We put primes on these numbers to denote that they come from the renormalization, so s′_{σ′} ∈ ℝ is identified with φ̄′(σ′) and t′_{τ′} ∈ ℝ is identified with ψ̄′(τ′). Note that σ, σ′ are used to denote times for φ̄, φ̄′, and τ, τ′ are used to denote times for ψ̄, ψ̄′, respectively.
Proposition 5.1.13. There exists K such that if f̄ ∈ K̄ ∩ L̄_Ω, then

    |∂_u s′_{σ′}| ≤ K|s′_{σ′}|/|U|,    |∂_v s′_{σ′}| ≤ K|s′_{σ′}|/|V|,    |∂_c s′_{σ′}| ≤ K|s′_{σ′}|/|C|,
    |∂_u t′_{τ′}| ≤ K|t′_{τ′}|/|U|,    |∂_v t′_{τ′}| ≤ K|t′_{τ′}|/|V|,    |∂_c t′_{τ′}| ≤ K|t′_{τ′}|/|C|.
Proof. We will compute ∂_v s′_{σ′}; the other calculations are almost identical. There are four cases to consider, depending on which time in the decomposition φ̄′ we are looking at: (1) φ̄′(σ′) = Z(φ_σ; I), (2) φ̄′(σ′) = Z(ψ_τ; I), (3) φ̄′(σ′) = Z(Q₀; I), (4) φ̄′(σ′) = Z(Q₁; I). In each case let I = [x, y] and let T : I → C be the 'transfer map' to C. This means that T = f^i ∘ γ for some i, where γ is a partial composition (e.g. γ = O_{≥σ}(φ̄) in case 1) or a pure map (in cases 3 and 4).
In case 1 Lemma 5.1.7 gives

    ∂_v s′_{σ′} = ( Nφ_σ(y)/DT(y) ) (∂_v q − ∂_v T(y)) − ( Nφ_σ(x)/DT(x) ) (∂_v p − ∂_v T(x)).

By Lemma A.2.16, Nφ_σ(y) = Nφ′_{σ′}(1)/|I| and hence

    Nφ_σ(y)/DT(y) ≍ ( Nφ′_{σ′}(1)/|I| ) / ( |C|/|I| ) ≍ s′_{σ′}/|C|.
Here we have used that the nonlinearity of φ′_{σ′} does not change sign, so s′_{σ′} = ∫ Nφ′_{σ′}, and that ∫ Nφ′_{σ′} ≈ Nφ′_{σ′}(1), since the nonlinearity is close to being constant (which is true since φ̄′ is pure and has very small norm).
We now need to estimate ∂_v T, but this can very roughly be bounded by ∂_v Φ since

    ∂_v T(y) = ∂_v f₁^i(γ(y)),

so the estimate that was used for ∂_v Φ in the proof of Proposition 5.1.8 can be employed. From the same proof we thus get that ∂_v q dominates both ∂_v p and ∂_v T.
The above arguments show that

    ∂_v s′_{σ′} ≍ (s′_{σ′}/|C|) ∂_v q ≍ −(s′_{σ′}/|C|) · DΨ(Q₁(q))/(Df^{b+1}(q) − 1) ≍ −(s′_{σ′}/|V|) · 1/(Df^{b+1}(q) − 1).

This concludes the calculations for case 1.
Case 2 is almost identical to case 1. Case 4 differs in that Lemma 5.1.7 now gives two extra terms:

    ∂_v s′_{σ′} = ( NQ₁(y)/DT(y) ) (∂_v q − ∂_v T(y)) − ( NQ₁(x)/DT(x) ) (∂_v p − ∂_v T(x))
                + ∂_v (DQ₁)(y)/DQ₁(y) − ∂_v (DQ₁)(x)/DQ₁(x).

However, ∂_v (DQ₁)/DQ₁ = 1/v independently of the point, so the last two terms cancel. The rest of the calculations go exactly like in case 1. Case 3 is similar to case 4.
Remark 5.1.14. A key point in the above proof is that deformation in a decomposition direction is monotone. This is what allowed us to estimate the partial derivatives of the 'transfer map' T by the partial derivatives of Φ or Ψ.
Proposition 5.1.15. There exist K and ρ > 0 such that if f̄ ∈ K̄ ∩ L̄_Ω, then

    |∂_⋆ u′| ≤ Kε^ρ/|C|,    |∂_⋆ v′| ≤ Kε^ρ/|C|,    |∂_⋆ c′| ≤ Kc′ε′ε^ρ/|C|,
    |∂_⋆ s′_{σ′}| ≤ Kε^ρ |s′_{σ′}|/|C|,    |∂_⋆ t′_{τ′}| ≤ Kε^ρ |t′_{τ′}|/|C|,

for ⋆ ∈ {s_σ, t_τ}.
Proof. Let us first consider ∂_{s_σ}, that is, deformations in the direction of φ_σ. Since φ_σ is pure we can use (4.1) to compute

    (5.17)    ∂_{s_σ} φ_σ(x) ≍ −x(1 − x).

From (5.5) we get

    ∂_{s_σ} p = − ∂_{s_σ} Φ(Q₀(p)) / (Df^{a+1}(p) − 1)    and    ∂_{s_σ} q = − ∂_{s_σ} Ψ(Q₁(q)) / (Df^{b+1}(q) − 1),

so the first thing to do is to calculate the partial derivatives of Φ and Ψ.
Let x ∈ U; then

    ∂_{s_σ} Φ(x) = ∂_{s_σ} [ f₁^a ∘ O_{>σ}(φ̄) ∘ φ_σ ∘ O_{<σ}(φ̄) ](x)
                = D[ f₁^a ∘ O_{>σ}(φ̄) ]( O_{≤σ}(φ̄)(x) ) · ∂_{s_σ} φ_σ( O_{<σ}(φ̄)(x) ).

Note that we have used that f₁ does not depend on s_σ. From (5.17) we thus get that

    (5.18)    |∂_{s_σ} Φ(x)| ≤ K′ · DΦ(x)(1 − x) ≤ Kε.

Let x ∈ V and let x_i = f₀^i ∘ ψ(x). As in the proof of Proposition 5.1.8 we have

    ∂_{s_σ} Ψ(x) = ∂_{s_σ} f₀(x_{b−1}) + ∑_{i=1}^{b−1} Df₀^{b−i}(x_i) ∂_{s_σ} f₀(x_{i−1}).

From (5.17) we get

    |∂_{s_σ} f₀(x_{i−1})| = | D[O_{>σ}(φ̄)]( O_{≤σ}(φ̄) ∘ Q₀(x_{i−1}) ) | · | ∂_{s_σ} φ_σ( O_{<σ}(φ̄) ∘ Q₀(x_{i−1}) ) | ≤ K|x_i|.

Using the same estimate as in the proof of Proposition 5.1.8, this shows that

    (5.19)    |∂_{s_σ} Ψ(x)| ≤ K′(1 − x_b) + O(bε^{1−1/α}) = O(bε^{1−1/α}).
We can now argue as in the proof of Proposition 5.1.8 to find bounds on ∂_{s_σ}⋆ for ⋆ ∈ {u′, v′, c′}. From Lemma 5.1.6 we get

    ∂_{s_σ} u′ = ((1 − u′)/|U|) · ( ∂_{s_σ} Φ(Q(p))/DΦ(Q(p)) ) · ( Df^{a+1}(p)/(Df^{a+1}(p) − 1) )
               + (u′/|U|) · ( ∂_{s_σ} Φ(Φ⁻¹(q)) − ∂_{s_σ} q ) / DΦ(Φ⁻¹(q)),

    −∂_{s_σ} v′ = ((1 − v′)/|V|) · ( ∂_{s_σ} Ψ(Q(q))/DΨ(Q(q)) ) · ( Df^{b+1}(q)/(Df^{b+1}(q) − 1) )
               + (v′/|V|) · ( ∂_{s_σ} Ψ(Ψ⁻¹(p)) − ∂_{s_σ} p ) / DΨ(Ψ⁻¹(p)),

    ∂_{s_σ} c′ = c′ · ∂_{s_σ} Ψ(Q₁(q)) / (Df^{b+1}(q) − 1) + ε′ · ∂_{s_σ} Φ(Q₀(p)) / (Df^{a+1}(p) − 1).
Use that DΦ ≍ |C|/|U|, DΨ ≍ |C|/|V|, Df^{a+1}(p) ≍ α/c′ and Df^{b+1}(q) ≍ α/ε′ to finish the estimates for ∂_{s_σ} u′, ∂_{s_σ} v′ and ∂_{s_σ} c′. Note that bε^r → 0 for any r > 0, so it is clear from (5.18) and (5.19) that we can find ρ > 0 such that |∂_{s_σ} Φ| < Kε^ρ and |∂_{s_σ} Ψ| < Kε^ρ.
In order to find bounds for ⋆ ∈ {s′_{σ′}, t′_{τ′}} we argue as in the proof of Proposition 5.1.13. The last two terms from Lemma 5.1.7 are slightly different (when nonzero). In this case they are given by

    ∂_{s_σ}(Dφ_σ)(y)/Dφ_σ(y) − ∂_{s_σ}(Dφ_σ)(x)/Dφ_σ(x).

Using (5.17) we can calculate this difference. For |s_σ| ≪ 1 it is close to y − x, which turns out to be negligible. All other details are exactly like the proof of Proposition 5.1.13.
The estimates for ∂_{t_τ} are handled similarly. The only difference is the estimates of the partial derivatives of Φ and Ψ. These can be determined by arguing as in the above and the proof of Proposition 5.1.8, which results in

    (5.20)    |∂_{t_τ} Φ(x)| ≤ Kε^{1−1/α}    and    |∂_{t_τ} Ψ(y)| ≤ Kaε,

for x ∈ U and y ∈ V. The remaining estimates are handled identically to the above.
5.2 Archipelagos in the parameter plane
The term archipelago was introduced by Martens and de Melo (2001) to describe the structure of the domains of renormalizability in the parameter
plane for families of Lorenz maps. In this section we show how the information we have on the derivative of the renormalization operator can be
used to prove that the structure of archipelagos must be very rigid.
Fix c*, φ*, ψ* and let F : [0,1]² → L denote the associated family of Lorenz maps

    (u, v) = λ ↦ F_λ = (u, v, c*, φ*, ψ*).

We will assume that: (i) Sφ* < 0 and Sψ* < 0, (ii) Dist φ* ≤ δ and Dist ψ* ≤ δ, and (iii) ε₋ ≤ 1 − c* ≤ ε₊. These conditions ensure that F_λ ∈ K. The notation Ω, K, δ, ε₋ and ε₊ is introduced in Section 3.1.
Definition 5.2.1. An archipelago Aω ⊂ [0, 1]2 of type ω ∈ Ω is the set of
λ such that Fλ is ω–renormalizable. An island of Aω is the interior of a
connected component of Aω .
For the family λ ↦ F_λ we have the following very strong structure theorem for archipelagos (this should be contrasted with Martens and de Melo, 2001). Note that c*, φ* and ψ* are arbitrary, so the results in this section hold for any family which satisfies conditions (i) to (iii) above.
Theorem 5.2.2. For every ω ∈ Ω there exists a unique island I such that the
archipelago Aω equals the closure of I. Furthermore, I is diffeomorphic to a square.
Remark 5.2.3. This theorem shows that the structure of Aω is very rigid.
Note that the structure of archipelagos is much more complicated in general. There may be multiple islands, islands need not be square, there may
be isolated points, etc.
Corollary 5.2.4. For every ω̄ ∈ ΩN there exists a unique λ such that Fλ has
combinatorial type ω̄. The set of all such λ is a Cantor set.
Proof. By Theorem 5.2.2 there exists a unique sequence of nested squares¹

    I₀ ⊃ I₁ ⊃ I₂ ⊃ ⋯

such that λ ∈ I_k implies that F_λ is renormalizable of type (ω₀, …, ω_{k−1}). We contend that the relative diameter of I_{k+1} inside I_k is uniformly bounded (from above and from below). Otherwise the a priori bounds would give us a subsequence {λ_{k(j)} ∈ I_{k(j)}} such that R^{k(j)} F_{λ_{k(j)}} converges (in L⁰) to a renormalizable Lorenz map with a critical value on the boundary. This is a contradiction, since a map cannot be renormalizable if it has critical values in the boundary. We conclude that the intersection ∩I_k is a point.
The bound on the relative diameters is uniform in both ω and F_λ, since Ω is finite and K is relatively compact. It now follows from a standard argument that the union of the above intersections over all combinatorial types in Ω^ℕ is a Cantor set.
The family F_λ is monotone, by which we mean that u ↦ F_{(u,v)}(x) is strictly increasing for x ∈ (0, c*), and v ↦ F_{(u,v)}(x) is strictly decreasing for x ∈ (c*, 1). As a consequence, if we let

    M⁺_{(u,v)} = {(x, y) | x ≥ u, y ≤ v}    and    M⁻_{(u,v)} = {(x, y) | x ≤ u, y ≥ v},

¹ By a square we mean any set diffeomorphic to the unit square.
then

    µ ∈ M_λ⁺ ⟹ F_µ(x) > F_λ(x)    and    µ ∈ M_λ⁻ ⟹ F_µ(x) < F_λ(x),

for all x ∈ (0,1) \ {c}. In other words, deformations in M_λ⁺ move both branches up, and deformations in M_λ⁻ move both branches down. This simple observation is key to analyzing the structure of archipelagos.
Definition 5.2.5. Let π_S : ℝ³ → ℝ² be the projection which takes the rectangle [c, 1] × [1 − c, 1] × {c} onto S = [1/2, 1]²,

    π_S(x, y, c) = ( 1 − (1 − x)/(2(1 − c)), 1 − (1 − y)/(2c) ),

and let H be the map which takes (u, v, c, φ, ψ) to the height of its branches (c is kept around because π_S needs it):

    H(u, v, c, φ, ψ) = (φ(u), 1 − ψ(1 − v), c).

Now define R : A_ω → S by

    R(λ) = π_S ∘ H ∘ R(F_λ).
Remark 5.2.6. The action of R can be understood by looking at Figure 5.1. The boundary of an island I is mapped into the boundary of the wedge W by the map H ∘ R. The four boundary pieces of the wedge correspond to the renormalization having at least one full or trivial branch. Note that the image of ∂I in ∂W will not in general lie in a plane; instead it will be bent around somewhat. For this reason we project down to the square S via the projection π_S. This gives us the final operator R : A_ω → S.
Proposition 5.2.7. Let I ⊂ Aω be an island. Then R is an orientation-preserving
diffeomorphism that takes the closure of I onto S.
Remark 5.2.8. This already shows that the structure of archipelagos is very rigid. First of all every island is full, but there is also exactly one of each type of extremal point, and exactly one of each type of vertex. In other words, there are no degenerate islands of any type! Extremal points and vertices are defined in Martens and de Melo (2001); see also the caption of Figure 5.2.
[Figure 5.1: Illustration of the action of R on the family F_λ. The dark gray island is mapped onto a set which is wrapped around the wedge W; that is, the boundary of the island is mapped into the boundary of W with nonzero degree. In this illustration the image of R is projected to ℝ³ via the map H. The maps H and G convert between critical values (c₁⁻, c₁⁺) and (u, v)–parameters; explicitly, G(c₁⁻, 1 − c₁⁺, c*) = (φ*⁻¹(c₁⁻), 1 − ψ*⁻¹(c₁⁺), c*, φ*, ψ*).]
Proof. By definition R maps I into S and ∂I into ∂S. We claim that DR_λ is orientation-preserving for every λ ∈ cl I.² Assume that the claim holds (we will prove this soon).
We contend that R maps cl I onto S. If not, then R(∂I) must be strictly contained in ∂S, since the boundaries are homeomorphic to the circle and R is continuous. But then DR_λ must be singular for some λ ∈ ∂I, which contradicts the claim.
Hence R : cl I → S maps a simply connected domain onto a simply connected domain, and DR is a local isomorphism. Thus R is in fact a diffeomorphism.
We now prove the claim. A computation gives

    Dπ_S(x, y, c) = ( (2(1 − c))⁻¹  0  ⋆ ; 0  (2c)⁻¹  ⋆ ),

² The notation DR_λ is used to denote the derivative of R at the point λ.
[Figure 5.2: Illustration of a full island for the family F_λ. The boundary corresponds to when at least one branch of the renormalization R F_λ is either full or trivial. The top right and bottom left corners are extremal points; the top left and bottom right corners are vertices.]
and

    DH_{(u,v,c,φ,ψ)} = ( φ′(u)  0  … ; 0  ψ′(1 − v)  … ; ⋆  ⋆  … ).

The top-left 2 × 2 matrix is orientation-preserving in both cases and the same is true for D R by Corollary 5.1.10. Thus DR_λ is orientation-preserving.
Lemma 5.2.9. Assume f^m(c₁⁻) = c = f^n(c₁⁺) for some m, n > 0. Let (l, c) and (c, r) be branches of f^m and f^n, respectively. Then f^m(l) ≤ l and f^n(r) ≥ r. In particular, f is renormalizable to a map with trivial branches.

Proof. In order to reach a contradiction we assume that f^m(l) > l. Then f^{im}(l) ↑ x for some point x ∈ (l, c] as i → ∞, since f^m(c₁⁻) = c. Since l is the left endpoint of a branch there exists t such that f^t(l) = c₁⁺. Hence f^{m−t}(c₁⁺) = l, so the orbit of c₁⁺ contains the orbit of l. But the orbit of c₁⁺ was periodic by assumption, which contradicts f^{im}(l) ↑ x. Hence f^m(l) ≤ l. Now repeat this argument for r to complete the proof.
Definition 5.2.10. Define

    γ⁻_triv = { λ ∈ [0,1]² | F_λ^{a+1}(c*⁻) = c* and F_λ^i(c*⁻) > c*, i = 1, …, a },
    γ⁺_triv = { λ ∈ [0,1]² | F_λ^{b+1}(c*⁺) = c* and F_λ^i(c*⁺) < c*, i = 1, …, b }.

(The notation here is g(c*⁻) = lim_{x↑c*} g(x) and g(c*⁺) = lim_{x↓c*} g(x).)
Lemma 5.2.11. The set γ⁻_triv is the image of a curve v ↦ (g(v), v). The map g is differentiable and takes [1 − ψ*⁻¹(c*), 1] into [φ*⁻¹(c*), 1).
Similarly, γ⁺_triv is the image of a curve u ↦ (u, h(u)), where h is differentiable and takes [φ*⁻¹(c*), 1] into [1 − ψ*⁻¹(c*), 1).
Proof. Define

    g(v) = φ*⁻¹ ∘ (ψ* ∘ Q₁)^{−a}(c*)    and    h(u) = 1 − ψ*⁻¹ ∘ (φ* ∘ Q₀)^{−b}(c*).

Note that Q₁ depends on v and Q₀ depends on u, so g and h are well-defined maps. It can now be checked that these maps define γ⁻_triv and γ⁺_triv.
Lemma 5.2.12. Assume that γ⁻_triv crosses γ⁺_triv and let λ ∈ γ⁻_triv ∩ γ⁺_triv. Then the crossing is transversal and there exists ρ > 0 such that if r < ρ, then the complement of γ⁻_triv ∪ γ⁺_triv inside the ball B_r(λ) consists of four components, and exactly one of these components is contained in the archipelago A_ω.
Proof. To begin with, assume that the crossing is transversal, so that the complement of γ⁻_triv ∪ γ⁺_triv in B_r(λ) automatically consists of four components for r small enough. Note that γ⁻_triv ∪ γ⁺_triv does not intersect M_λ⁺ ∪ M_λ⁻ \ {λ}. Hence, precisely one component will have a boundary point µ ∈ γ⁻_triv such that γ⁺_triv intersects M_µ⁺. Denote this component by N. Note that if we move from µ inside N ∩ M_µ⁺, then the left critical value of the return map moves above the diagonal. If we move in N ∩ M_µ⁺ from a point in γ⁺_triv, then the right critical value of the return map moves below the diagonal.
By Lemma 5.2.9, F_λ is renormalizable, and moreover the periodic points p_λ and q_λ that define the return interval of F_λ are hyperbolic repelling by the minimum principle. Hence, if we deform F_λ into N it will still be renormalizable, since N consists of µ such that F_µ^{a+1}(c⁻) is above the diagonal and F_µ^{b+1}(c⁺) is below the diagonal. By choosing r small enough all of N will be contained in A_ω.
Note that if we deform into any other component (other than N), then at least one of the critical values of the return map will be on the wrong side of the diagonal, and hence the corresponding map is not renormalizable. Thus only the component N intersects A_ω.
[Figure 5.3: Illustration of the proof of Theorem 5.2.2. Both λ and µ must be in the boundary of islands, which lie inside the shaded areas. These two islands have opposite orientation, which is impossible.]
Now assume that the crossing is not transversal. Then we may pick λ in the intersection γ⁻_triv ∩ γ⁺_triv so that it is on the boundary of an island (by the above argument). But then λ must be at a transversal intersection, since islands are squares by Proposition 5.2.7 and the curves γ⁻_triv and γ⁺_triv are differentiable. Hence every crossing is transversal.
Proof of Theorem 5.2.2. From Proposition 5.2.7 we know that every island must contain an extremal point which renormalizes to a map with only trivial branches, and hence every island must be adjacent to a crossing between the curves γ⁻_triv and γ⁺_triv. We claim that there can be only one such crossing, and hence uniqueness of islands follows. Note that there is always at least one island by Proposition 3.3.6.
By Lemma 5.2.11, γ⁻_triv and γ⁺_triv terminate in the upper and right boundary of [0,1]², respectively. Let λ be the crossing nearest the points of termination in these boundaries. Let E be the component of the complement of γ⁻_triv ∪ γ⁺_triv in [0,1]² that contains the point (1,1). The geometrical configuration of γ⁻_triv and γ⁺_triv is such that E must contain the piece of A_ω adjacent to λ as in Lemma 5.2.12. To see this, use the fact that deformations in the cone M_λ⁺ move both branches of F_λ up.
In order to reach a contradiction, assume that there exists another crossing µ between γ⁻_triv and γ⁺_triv (see Figure 5.3). By Lemma 5.2.12 there is an island attached to this crossing, but the configuration of γ⁻_triv and γ⁺_triv at µ is such that this island is oriented opposite to the island inside E. But R is orientation-preserving, so both islands must be oriented the same way, and hence we reach a contradiction. The conclusion is that there can be no more than one crossing between γ⁻_triv and γ⁺_triv, as claimed.
Finally, the entire archipelago equals the closure of the island, since the derivative of R is nonsingular at every point in the archipelago. Hence every point in the archipelago must either be contained in an island or on the boundary of an island.
5.3 Invariant cone field
A standard way of showing hyperbolicity of a linear map is to find an invariant cone field with expansion inside the cones and contraction in the
complement of the cones. In this section we show that the derivative of the
renormalization operator has an invariant cone field and that it expands
these cones. However, our estimates on the derivative are not sufficient to
prove contraction in the complement of the cones so we cannot conclude
that the derivative is hyperbolic. The results in this section are used in Section 5.4 to construct unstable manifolds in the limit set of renormalization.
Let

    H(f̄, κ) = {(x, y) | ‖y‖ ≤ κ‖x‖}

denote the standard horizontal κ–cone on the tangent space at f̄ ∈ K̄_Ω, where

    K̄_Ω = { f̄ ∈ K̄ ∩ L̄_Ω | c₁⁺(R f̄) ≤ 1/2 ≤ c₁⁻(R f̄) }.

As always, K and Ω are the same as in Section 3.1. Recall that we decompose the tangent space into a two-dimensional subspace with coordinate x and a codimension-two subspace with coordinate y. The x–coordinate corresponds to the (u, v)–subspace in K̄_Ω. We use the max-norm, so if z = (x, y) then ‖z‖ = max{‖x‖, ‖y‖}.
Proposition 5.3.1. Assume f̄ ∈ K̄_Ω and define

    κ⁻(f̄) = K⁻ max{ε, Dist φ̄, Dist ψ̄}    and    κ⁺(f̄) = K⁺ min{ |C|/|U|, |C|/|V| }.
It is possible to choose K⁺, K⁻ (not depending on f̄) such that if κ ≤ κ⁺(f̄), then

    D R_{f̄} H(f̄, κ) ⊂ H(R f̄, κ⁻(R f̄)).

In particular, the cone field f̄ ↦ H(f̄, 1) is mapped strictly into itself by D R.
Remark 5.3.2. Note that as b₋ increases, κ⁻ ↓ 0 and κ⁺ ↑ ∞. Thus a fatter and fatter cone is mapped into a thinner and thinner cone. In particular, the invariant subspaces inside the thin cone and the complement of the fat cone eventually line up with the coordinate axes.
Proof. Assume ‖y‖ ≤ κ‖x‖. Let z′ = Mz, where z′ = (x′, y′) and z = (x, y). Then

    ‖x′‖/‖y′‖ ≥ ( ‖M₁x‖ − ‖M₂‖‖y‖ ) / ( ‖M₃x‖ + ‖M₄‖‖y‖ ) ≥ ( ‖M₁ x/‖x‖‖ − κ‖M₂‖ ) / ( ‖M₃ x/‖x‖‖ + κ‖M₄‖ ).

We are interested in a lower bound on ‖x′‖/‖y′‖, so this shows that we need to minimize

    g(x) = ( ‖M₁x‖ − κ‖M₂‖ ) / ( ‖M₃x‖ + κ‖M₄‖ ),

subject to the constraint ‖x‖ = max{|x₁|, |x₂|} = 1.
By Proposition 5.1.8, asymptotically
0x x1
ε
x
x
2
2
1
, −
.
k M1 x k = max −
+
|U | α | V |
(α − 1)|U | |V | Here we have used that f¯ ∈ K̄Ω and hence 1 − u0 1, 1 − v0 1,
D f a+1 ( p) α, D f b+1 (q) α/ε0 , and DΦ( x )/DΨ (y) |V |/|U | for x ∈ U,
y ∈ V.
Let ρ0 = max{ε0 , Dist φ̄0 , Dist ψ̄0 }. By Theorem 5.1.2
A | x1 | B | x2 |
0
+
,
k M3 x k ≤ ρ
|U |
|V |
so we are lead to minimize
o
n
0 max |U1 | − αε|Vt | , (α−11)|U | − |Vt | − κ k M2 k
g1 ( t ) =
,
A
Bt
0
+
+
κ
k
M
k
/ρ
ρ 0 |U
4
|
|V |
102
Chapter 5. Differentiable structure
(corresponding to x1 = ±1) and
o
n
0 max |Ut | − α|εV | , (α−1t )|U | − |V1 | − κ k M2 k
g2 ( t ) =
,
B
0
ρ0 |At
+
+
κ
k
M
k
/ρ
4
U|
|V |
(corresponding to x2 = ±1) over t ∈ [0, 1]. Note that we have to assume
that α(α − 1) > ε0 here or the numerators might be zero (we can assure that
this is the case by increasing b− if necessary). These maps are piecewise
monotone so the minimum is assumed on a boundary point. The points
t = 0, t = 1 are boundary points of both g1 and g2 . Let
t0 =
|V | 2 − α
α
·
·
|U | α − 1 α − ε 0
and
t1 =
|V |
α
α
.
·
·
|U | α − 1 α + ε 0
If α < 2, then t0 is a boundary point of g1 if t0 < 1, otherwise t0−1 is a
boundary point of g2 . For α > 1, if t1 < 1 then t1 is a boundary point of g1 ,
else t1−1 is a boundary point of g2 .
It is now routine to check the values of gi in these boundary points.
Instead of writing down all the calculations, let us just do one:
g2 ( 0 ) =
1 − κK2 |V |/|C |
|V |−1 − κ k M2 k
≥ 0
.
−
1
0
0
ρ ( B + κK4 |V |/|C |)
ρ ( B|V | + κ k M4 k/ρ )
Hence, if for example κ < |C |/(2K2 |V |), then g2 (0) ≥ (ρ0 (2B + K4 /K2 ))−1 .
The other boundary points will lead to similar conclusions, but perhaps
with |U |/|C | instead of |V |/|C | dictating the choice of κ. Now define K +
as the smallest constant in the bound on κ and define K − as the largest
constant next to ρ0 that comes out of the evaluations at the boundary points.
Proposition 5.3.3. Let f¯ ∈ K̄Ω . Then D R is strongly expanding on the cone
field f¯ 7→ H ( f¯, 1). Specifically, there exists k > 0 (not depending on f¯) such that
k D R f¯ zk ≥ k · min{|U |−1 , |V |−1 } · kzk,
∀z ∈ H ( f¯, 1) \ {0}.
Proof. Use Corollary 5.1.11 to get
k Mzk ≥ k M1 x + M2 yk ≥ k · min{|U |−1 , |V |−1 } − k M2 k · k x k.
Now use the fact that kzk = k x k for z ∈ H ( f¯, 1) to finish the proof.
103
5.4. Unstable manifolds
5.4
Unstable manifolds
The norm used on the tangent space does not give good enough estimates
to see a contracting subspace so we cannot quite prove that the limit set
of R is hyperbolic. However, these estimates did give an expanding invariant cone field and in this section we will show how this gives us unstable
manifolds at each point of the limit set.
Instead of trying to appeal to the stable and unstable manifold theorem
for dominated splittings to get local unstable manifolds we directly construct global unstable manifolds by using all the information we have about
the renormalization operator and its derivative. This is done by defining a
graph transform and showing that it contracts some suitable metric similarly to the Hadamard proof of the stable and unstable manifold theorem.
We are only able to show that the resulting graphs are C 1 since we do not
have hyperbolicity. Our proof is an adaptation of the proof of Theorem 6.2.8
in Katok and Hasselblatt (1995).
Definition 5.4.1. Let AΩ be as in Definition 4.2.9 and define the limit set of
renormalization for types in Ω by
ΛΩ = AΩ ∩ L̄ΩN .
Remark 5.4.2. Here L̄ΩN denotes the set of infinitely renormalizable maps
with combinatorial type in ΩN and AΩ can intuitively be thought of as the
attractor for R. The set Ω is the same as in Section 3.1, as always.
Note that
ΛΩ ⊂ [0, 1]2 × (0, 1) × Q̄2 ,
where Q̄ denotes the set of pure decompositions, see Definition 4.1.14.
Theorem 5.4.3. For every f¯ = (u, v, c, φ̄, ψ̄) ∈ ΛΩ there exists a unique global
unstable manifold W u ( f¯). The unstable manifold is a graph
W u ( f¯) = { λ, σ(λ) | λ ∈ I },
where σ : I → (0, 1) × Q̄ × Q̄ is κ–Lipschitz for some κ 1 (not depending
on f¯). The domain I is essentially given by
π R(L̄ω ) ∩ [0, 1]2 × {c} × {φ̄} × {ψ̄} ,
where π is the projection onto the (u, v)–plane, and ω is defined by f¯ being in the
image R(L̄ω ). Additionally, W u is C 1 .
104
Chapter 5. Differentiable structure
Remark 5.4.4. Note that in stark contrast to the situation in the ‘regular’ stable and unstable manifold theorem we get global unstable manifolds which
are graphs and that these are almost completely straight due to the Lipschitz
constant being very small. The statement about the domain I is basically
that I is “as large as possible.” This will be elaborated on in the proof.
Another thing to note is that we cannot say anything about the uniqueness of f¯ ∈ ΛΩ for a given combinatorics. That is, given
ω̄ = (. . . , ω−1 , ω0 , ω1 , . . . )
we cannot prove that there exists a unique f¯ ∈ ΛΩ realizing this combinatorics. Instead we see a foliation of the set of maps with type ω̄ by unstable
manifolds. If we had a hyperbolic structure on ΛΩ this problem would go
away.
Corollary 5.4.5. Let f¯ ∈ ΛΩ and let ω̄ ∈ ΩN . Then W u ( f¯) intersects the set of
infinitely renormalizable maps of combinatorial type ω̄ in a unique point, and the
union of all such points over ω̄ ∈ ΩN is a Cantor set.
Proof. Theorem 5.4.3 shows that the unstable manifolds are straight (see
the above remark) and hence Lemma 5.4.6 enables us to apply the same
arguments as in Corollary 5.2.4.
Lemma 5.4.6. There exists κ close to 1 such that if γ : [0, 1]2 → (0, 1) × Q̄2 is
κ–Lipschitz and graph γ ⊂ K̄, then L̄ω ∩ graph γ is diffeomorphic to a square,
for every ω ∈ Ω.
Proof. By Theorem 5.2.2 the set L̄ω ∩ K̄ is a tube for every ω ∈ Ω. Take a
tangent vector at a point in the image of ∂L̄ω ∩ K̄. Such a tangent will lie
in the complement of a cone Hκ = {kyk ≤ κ k x k} for κ < 1 close to 1, since
the projection of the image of a tube to the (u, v, c)–subspace will look like
a (slightly deformed) cut-off part of the wedge in Figure 5.1 on page 96.
By Proposition 5.3.1, R−1 maps the complement of Hκ into itself and hence
every tube “lies in the complement of Hκ ”. That is, a tangent vector at a
point in the boundary of a tube lies in Hκ , so the tubes cut the (u, v)–plane
at an angle which is smaller than 1/κ.
Now if we choose κ as above, then the graph of κ will also intersect every tube on an angle. Hence the intersection is diffeomorphic to a square.
The main point here is that with κ chosen properly, γ cannot ‘fold over’
a tube and in such a way create an intersection which is not simply connected.
105
5.4. Unstable manifolds
Proof of Theorem 5.4.3. The proof is divided into three steps: (1) definition
of the graph transform Γ, (2) showing that Γ is a contraction, (3) proof of
C 1 –smoothness of the unstable manifold.
Step 1. From Proposition 3.1.13 we know that the critical values for any
map in K̄ are uniformly close to 1 so there exists µ 1 such that if we
define the ‘block’
B̄ = [1 − µ, 1]2 × (0, 1) × Q̄2 ∩ K̄,
then L̄Ω ∩ K̄ ⊂ B̄ , 1 − µ > φ−1 (c) and µ > ψ−1 (c) for all (u, v, c, φ, ψ) ∈
O(B̄). In other words, the block B̄ is defined so that it contains all maps
in K̄ which are renormalizable of type in Ω and the square [1 − µ, 1]2 is
contained in the projection of the image R(L̄Ω ∩ K̄) onto the (u, v)–plane.
Fix f¯0 ∈ ΛΩ and κ ∈ (κ − , 1), where κ − is the supremum of κ − ( f¯) defined in Proposition 5.3.1 and κ is small enough so that Lemma 5.4.6 applies. Associated with f¯0 are two bi-infinite sequences {ωi }i∈Z and { f¯i }i∈Z
such that Rωi f¯i = f¯i+1 for all i ∈ Z. Now define Gi , the “unstable graphs
centered on f¯i ,” as the set of κ–Lipschitz maps γi : [1 − µ, 1]2 → (0, 1) × Q̄2
such that graph γi ⊂ B̄ and γi (λi ) = (ci , φ̄i , ψ̄i ), where f¯i = (λi , ci , φ̄i , ψ̄i ).
Let G = ∏i Gi . We will now define a metric on G . Let
d i ( γi , θ i ) =
sup
λ∈[1−µ,1]2
|γi (λ) − θi (λ)|
,
| λ − λi |
γi , θ i ∈ G i ,
and define
d(γ, θ ) = sup di (γi , θi ),
i ∈Z
γ, θ ∈ G .
This metric turns (G , d) into a complete metric space. Note that it is not
enough to simply use a C 0 –metric since we do not have a contracting subspace of D R. The denominator in the definition of di is thus necessary to
turn the graph transform into a contraction.
We can now define the graph transform Γ : G → G for f¯0 . Let γi ∈ Gi
and define Γi (γi ) to be the γi0+1 ∈ Gi+1 such that
graph γi0+1 = Rωi (graph γi ∩ L̄ωi ) ∩ B̄ .
Let us discuss why this is a well-defined map Γi : Gi → Gi+1 . Lemma 5.4.6
shows that Rωi (graph γi ∩ L̄ωi ) is the graph of some map I ⊂ R2 →
106
Chapter 5. Differentiable structure
(0, 1) × Q̄2 , where I is simply connected. That I ⊃ [1 − µ, 1]2 is a consequence of how B̄ was chosen. Finally, this map is κ–Lipschitz by Proposition 5.3.1.
Actually, we have cheated a little bit here since Proposition 5.3.1 is
stated for maps satisfying the extra condition
c1+ (R f ) ≤ 1/2 ≤ c1− (R f ).
In defining the graph transform we should intersect L̄ωi with the set defined by this condition before mapping it forward by Rωi . Otherwise we do
not have enough information to deduce that the entire image is κ–Lipschitz
as well. However, this problem is artificial. We could have chosen the constant 1/2 closer to 1 and still gotten the invariant cone field. All this means
is that domain I of the theorem is slightly smaller than it should be (we
have to cut out a small part of the graph where v is very close to 0 but v is
still allowed to range all the way up to 1 so this amounts to a very small
part of the domain). This is one reason why we say that “I is essentially
given by . . . ” in the statement of the theorem. The other reason is that the
intersection with R(L̄ω ) should be taken with a surface with a small angle
and not a surface which is parallel to the (u, v)–plane.
The graph transform is now defined by
Γ (γ) = Γi (γi ) i∈Z ,
γ = { γi } i ∈ Z ∈ G .
We claim that Γ is a contraction on (G , d) and hence the contraction
mapping theorem implies that Γ has a unique fixed point γ∗ ∈ G . The
global unstable manifolds along { f¯i } are then given by
W u ( f¯i+1 ) = graph Γi (γi∗ ),
∀i ∈ Z.
In particular, this proves existence and uniqueness of the global unstable
manifold at f¯0 . That these are the global unstable manifolds is a consequence of L̄Ω ∩ K̄ ⊂ B̄ . Furthermore, the Lipschitz constant for these
graphs is much smaller than 1 since we can pick κ close to κ − . Again,
we are cheating a little bit here since we have to cut out a small part of the
domain of the graph as discussed above.
Step 2. We now prove that Γ is a contraction. The focus will be on
Γi for now and to avoid clutter we will drop subscripts on elements of Gi
107
5.4. Unstable manifolds
and Gi+1 . Pick γ, θ ∈ Gi and let γ0 = Γi (γ) and θ 0 = Γi (θ ). Note that
γ 0 , θ 0 ∈ G i +1 .
We write
R f¯ = ( A(λ, η ), B(λ, η )),
where f¯ = (λ, η ), λ ∈ R2 and A(λ, η ) ∈ R2 . Let Aγ (λ) = A(λ, γ(λ)) and
similarly Bγ (λ) = B(λ, γ(λ)). With this notation the action of Γi is given
by
λ, γ(λ) 7→ Aγ (λ), Bγ (λ) = λ0 , γ0 (λ0 ) .
Hence
di+1 (γ0 , θ 0 ) = sup
λ0
kγ0 ◦ Aγ (λ) − θ 0 ◦ Aγ (λ)k
kγ0 (λ0 ) − θ 0 (λ0 )k
=
sup
.
k λ 0 − λ i +1 k
k Aγ (λ) − Aγ (λi )k
Aγ (λ)
Recall that the notation here is (λi , γ(λi )) = f¯i and (λi+1 , γ0 (λi+1 ) = f¯i+1 .
The last numerator can be estimated by
kγ0 ◦ Aγ (λ) − θ 0 ◦ Aγ (λ)k
≤ kγ0 ◦ Aγ (λ) − θ 0 ◦ Aθ (λ)k + kθ 0 ◦ Aγ (λ) − θ 0 ◦ Aθ (λ)k
≤ k Bγ (λ) − Bθ (λ)k + κ k Aγ (λ) − Aθ (λ)k
≤ (k M4 k + κ k M2 k) kγ(λ) − θ (λ)k.
The denominator can bounded by Proposition 5.3.3
k Aγ (λ) − Aγ (λi )k ≥ k · min{|U |−1 , |V |−1 } · kλ − λi k.
Thus
d i +1 ( γ 0 , θ 0 ) ≤
(k M4 k + κ k M2 k)
di (γ, θ ) = νdi (γ, θ ).
k · min{|U |−1 , |V |−1 }
Theorem 5.1.2 shows that ν 1 uniformly in the index i. Hence Γ is a
(very strong) contraction.
Step 3. Going from Lipschitz to C 1 smoothness of the unstable manifold is a standard argument. See for example Katok and Hasselblatt (1995,
Chapter 6.2).
Part II
Existence of a hyperbolic
renormalization fixed point
109
C HAPTER
Computer assisted proof
This chapter contains excerpts from (Winckler, 2010).
The main result of this chapter is that the renormalization operator has
a hyperbolic fixed point of combinatorial type (01, 100)∞ , which is proved
using the contraction mapping theorem on an associated operator. We use
a computer to rigorously compute estimates to show that this associated
operator is indeed a contraction. This method was pioneered by Lanford
(1982) when he proved the existence of a fixed point of the period-doubling
operator on unimodal maps (see also Lanford, 1984). However, Lanford’s
paper only gives a brief outline of the method he employs without an actual proof, so we have gone through quite a lot of pains to include all the
missing details (many of which were borrowed from Koch et al., 1996).
6.1
Existence of a hyperbolic fixed point
We choose a different set of coordinates for Lorenz maps in this part in
order to simplify the implementation of the computer estimates. Instead
of keeping the domain fixed and letting the critical point vary we fix the
critical point at 0 and let the domain vary. We also choose the domain to
be the smallest invariant domain which contains the critical point. Lastly,
instead of considering C k maps we will only consider maps whose branches
are restrictions of analytic maps.
111
6
112
Chapter 6. Computer assisted proof
0.4
0.2
0
−0.2
−0.4
−0.6
−0.8
−1
−1 −0.8 −0.6 −0.4 −0.2
0
0.2
0.4
Figure 6.1: A Lorenz map in the “smallest invariant domain coordinates” of
Definition 6.1.1. This is the actual graph of the fixed point of Theorem 6.1.3.
Definition 6.1.1. A Lorenz map f on a closed interval I = [l, r ], where
l < 0 < r, is a monotone increasing continuous function from I \ {0} to I
such that f (0− ) = r, f (0+ ) = l (see Figure 6.1).
We require that f ( x ) = ϕ(| x |α ) for all x ∈ (l, 0), where ϕ is a symmetric1
analytic map defined on some complex neighborhood of [l, 0], and similarly
f ( x ) = ψ(| x |α ) for x ∈ (0, r ), where ψ is a symmetric analytic map defined
on some complex neighborhood of [0, r ].
The definition of the renormalization operator for this choice of coordinates is almost identical to the definition in Section 2.1 so we avoid restating
it here. However, we would like to make one important remark regarding
the choice of smoothness.
Remark 6.1.2. When defined on the space of Lorenz maps with analytic
branches the renormalization operator is differentiable and its derivative
is a compact linear operator. This follows from the fact that R f only evaluates f on a strict subset of the domain of f (see Sections 7.4.4 and 7.4.5). On
1 Here
‘symmetric’ means ϕ(z̄) = ϕ̄(z), where bar denotes complex conjugation.
113
6.2. Consequences
the other hand, if we only were to demand C r –smoothness for the branches
of our Lorenz maps then R would no longer be differentiable, see de Faria
et al. (2006) and de Melo and van Strien (1993, Ch. VI.1.1).
We now state the main theorem of this part:
Theorem 6.1.3. Let ω = (01, 100). The restricted renormalization operator Rω
acting on the space of Lorenz maps with critical exponent α = 2 has a hyperbolic
fixed point.
Remark 6.1.4. The fixed point of Theorem 6.1.3 is the simplest nonunimodal
fixed point of R. By this we mean that if ω = (01, 10) then the fixed point
of the period-doubling operator on unimodal maps corresponds to a fixed
point for Rω as follows.
Let g : [−1, 1] → [−1, 1] be the fixed point for the period-doubling
operator normalized so that g(0) = 1. Then g is an even map that satisfies
the Cvitanović–Feigenbaum functional equation
g( x ) = −λ−1 g2 (λx ),
λ = − g (1).
Now define a Lorenz map f by f |[−1,0) = g and f |(0,1] = − g. It is easy
to check that the first-return map f˜ to U = [−λ, λ] is f˜ = f 2 and that U is
maximal. Thus
(
−λ−1 g2 (λx ) = g( x ) if x < 0,
R f ( x ) = λ−1 f˜(λx ) =
λ−1 g2 (λx ) = − g( x ) if x > 0,
which shows that f is a fixed point of Rω .
6.2
Consequences
The existence of a hyperbolic renormalization fixed point has very strong
dynamical consequences, some of which we will give a brief overview of
here. Throughout this section let R denote the restricted renormalization
operator Rω , where ω = (01, 100), and let f ? denote the fixed point of
Theorem 6.1.3.
s at f
Corollary 6.2.1 (Stable manifold). There exists a local stable manifold Wloc
?
consisting of maps in a neighbourhood of f ? which under iteration of R remain in
this neighbourhood and converge with an exponential rate to f ? .
114
Chapter 6. Computer assisted proof
0
L0
R11
R12
L32
R21
R22
L1
L42
R32 R52
R0
L11
R1
L2
R2
L22
R42
R62
L12
Figure 6.2: Illustration of the dynamical intervals of generations 0, 1, 2 for
renormalization of type ω = (01, 100).
The local stable manifold extends to a global stable manifold W s consisting of
maps which converge to f ? under iteration of R. If f ∈ W s then f is infinitely
renormalizable.
Proof. The existence of a stable manifold is a direct consequence of the stable and unstable manifold theorem. If f converges to f ? , then Rn f is defined for all k > 0, which is the same as saying that f is infinitely renormalizable.
We now turn to studying the dynamical properties of maps on the stable
manifold. Let f ∈ W s , then the times of closest return ( an , bn ) are given by
the recursion
(
a n + 1 = a n + bn ,
a1 = 2,
bn+1 = 2an + bn , b1 = 3,
These determine the first-return interval Un = cl{ Ln ∪ Rn } for the n–th
renormalization, by
L n = f bn ( 0 + ) , 0 ,
In other words: the first-return map
x ∈ Ln and f˜n ( x ) = f bn ( x ) if x ∈ Rn .
Define
(
Lkn = f k ( Ln ),
Rkn = f k ( Rn ),
Rn = 0, f an (0− ) .
f˜n to Un is given by f˜n ( x ) = f an ( x ) if
k = 0, . . . , an − 1,
k = 0, . . . , bn − 1.
The collection of these intervals (over k) form a pairwise disjoint collection
for each n, called the intervals of generation n (see Figure 6.2).
115
6.2. Consequences
Theorem 6.2.2 (Cantor attractor). If f ∈ W s then the closure of the critical
orbits of f is a measure zero Cantor set Λ f which attracts almost every point in
the domain of f .
Proof. The critical orbits make up the endpoints of the dynamical intervals
{ Lkn } ∪ { Rkn }, so Λ f is contained in
(6.1)
\
n
cl{ En ∪ Fn },
where En =
[
k
Lkn and Fn =
[
Rkn .
k
Note that En+1 ⊂ En and Fn+1 ⊂ Fn .
First assume that f = f ? . Then the first-return maps f˜n are all equal to f
itself (up to a linear change of coordinates) so the total lengths of En and Fn
shrink with an exponential rate (the position of Un+1 inside Un is the same
for all n, so we can apply the Macroscopic and Infinitesimal Koebe principles as in the proof of the “real bounds” in de Melo and van Strien, 1993).
Hence the intersection (6.1) is a measure zero Cantor set, and consequently
Λ f is as well.
Now, if f is an arbitrary map in W s then R f converges to f ? . In other
words, the first-return maps { f˜n } converge to f ? (up to a linear change of
coordinates). Now use the same arguments as above.
Finally, the above arguments can be adapted to prove that f satisfies the
weak Markov property as in Theorem 3.2.2 and hence the basin of Λ f has
full measure.
Theorem 6.2.3 (Rigidity). If f , g ∈ W s then there exists a homeomorphism
h : Λ f → Λ g conjugating f and g on their respective Cantor attractors. If furs , then h extends to a C 1+α diffeomorphism on the entire
thermore f , g ∈ Wloc
domain of f .
Proof. Define h( f n (0− )) = gn (0− ) and h( f n (0+ )) = gn (0+ ). This extends
continuously to a map on Λ f as in the proof of de Melo and van Strien
(1993, Proposition VI.1.4).
s then there exists C > 0 and λ < 1 such that d ( f n , gn ) <
If f , g ∈ Wloc
n
Cλ so we can use an argument similar to that in de Melo and van Strien
(1993, Theorem VI.9.4) to prove the second statement.
Remark 6.2.4 (Universality). The second conclusion of Theorem 6.2.3 is a
strong version of what is known as “metric universality”: the small scale
116
Chapter 6. Computer assisted proof
geometric structure of the Cantor attractor does not depend on the map itself (only on the combinatorial type and the critical exponent). That is, if
s and zoom in around the same spot on their
we take two maps f , g ∈ Wloc
Cantor attractors then their structures are almost identical since a differentiable map (i.e. the extended h) is almost linear if one zooms in closely
enough.
For example, the limit of | Ln+1 |/| Ln | as n → ∞ exists and is independent of f (it equals the ratio | L2 ( f ? )|/| L1 ( f ? )| for f ? ). More generally, the
multifractal spectrum (and Hausdorff measure in particular) of Λ f does not
depend on f (only on f ? ).
6.3
Outline of the computer assisted proof
In this section we give a brief outline of the method of proof and how to
calculate rigorous estimates with a computer.
6.3.1
Method of proof
Given a Fréchet differentiable operator T with compact derivative on a Banach space X of analytic functions we would like to prove that T has a
hyperbolic fixed point. The main tool is the following consequence of the
contraction mapping theorem:
Proposition 6.3.1. Let Φ be a Fréchet differentiable operator on a Banach space X,
let f 0 ∈ X, and let Br ( f 0 ) ⊂ X be the closed ball of radius r centered on f 0 . If there
are positive numbers ε, θ such that
1. k DΦ f k < θ, for all f ∈ Br ( f 0 ),
2. kΦ f 0 − f 0 k < ε,
3. ε < (1 − θ )r,
then there exists f ? ∈ Br ( f 0 ) such that Φ f ? = f ? and Φ has no other fixed points
inside Br ( f 0 ). Furthermore, k f ? − f 0 k < ε/(1 − θ ).
Our strategy is to find a good approximation f 0 of a fixed point of T
and then use a computer to verify that the conditions on r, ε, θ hold if r is
chosen small enough. Unfortunately, this is not possible for T itself since in
our case it is not a contraction so first we have to turn T into a contraction
without changing the set of fixed points. This is done by using Newton’s
117
6.3. Outline of the computer assisted proof
method to solve the equation T f − f = 0, which results in the iteration
f 7→ f − ( DT f − I )−1 ( T f − f ),
where I denotes the identity operator on X. The operator we use is a slight
simplification of this, namely
Φ f = f − ( Γ − I ) −1 ( T f − f ) = ( Γ − I ) −1 ( Γ − T ) f ,
where Γ is a finite-rank linear approximation of DT f0 (chosen so that Γ − I
is invertible). The operator Φ is a contraction if f 0 and Γ are chosen carefully.2 Note that Φ f = f if and only if T f = f , so once we verify that the
conditions of Proposition 6.3.1 hold for Φ it follows that T has a fixed point.
To prove hyperbolicity we need to do some extra work. The derivative
of Φ is
DΦ f = ( Γ − I )−1 ( Γ − DT f ).
At this stage we will have already checked that the norm of this is bounded
from above by 1. By strengthening this estimate to
k( Γ − eit I )−1 ( Γ − DT f )k < 1,
∀t ∈ R, ∀ f ∈ Br ( f 0 ),
we also get that DT f ? is hyperbolic at the fixed point f ? . To see this, assume that eit is an eigenvalue of DT f ? with eigenvector h normalized so
that khk = 1. Then
k( Γ − eit I )−1 ( Γ − DT f ? )hk = k( Γ − eit I )−1 ( Γ − eit I )hk = khk = 1,
which is impossible. Since DT was assumed to be compact we know that
the spectrum is discrete, so the lack of eigenvalues on the unit circle implies
hyperbolicity.
6.3.2
Rigorous computer estimates
In order to verify the above estimates on a computer we are faced with
two fundamental problems: (i) arithmetic operations on real numbers are
carried out with finite precision which leads to rounding problems, (ii) the
2 Here is how to choose f and Γ: use the Newton iteration on polynomials of some
0
fixed degree to determine f 0 and set Γ = DT f0 . The hardest part is finding an initial guess
such that the iteration converges.
118
Chapter 6. Computer assisted proof
space of analytic functions is infinite dimensional so any representation of
an analytic function needs to be truncated.
The general idea to deal with these problems is to compute with sets
which are guaranteed to contain the exact result instead of computing with
points: real numbers are replaced with intervals, analytic functions are replaced with rectangle sets A0 × · · · × Ak × {C } in Rn representing all functions of the form
{ a0 + · · · + ak zk + zd h(z) | a j ∈ A j , j = 0, . . . , k, khk ≤ C },
where { A j } are intervals. This takes care of the truncation problem and the
rounding problem is taken care of roughly by “rounding outwards” (lower
bounds are rounded down, upper bounds are rounded up). Once these
set representations have been chosen we lift operations on points to operations on sets. Since the form of these sets are most likely not preserved by
such operations, this lifting involves finding bounds by sets of the chosen
form (e.g. if F and G are rectangle sets of analytic functions and we want
to lift composition of functions, then we have to find a rectangle set which
contains the set { f ◦ g | f ∈ F, g ∈ G }.)
Section 7.2 contains all the details for computing with intervals and Section 7.4 contains all the details for computing with rectangle sets of analytic
functions.
Let us make one final remark concerning the evaluation of the operator
norm of a linear operator L on the space of analytic functions. In order to
get good enough bounds on the estimate of the operator norm we will use
the `1 –norm on the Taylor coefficients of analytic functions. The reason for
this is that estimating the operator norm with
k Lk = sup k L f k
k f k≤1
will usually result in really bad estimates. With the `1 –norm, if we think
of L as an infinite matrix (in the basis {zk }), the operator norm is found by
taking the supremum over the norms of the columns of this matrix, that is
k Lk = supk Lξ k k,
ξ k (z) = zk .
k ≥0
Evaluating the norms of columns gives much better estimates and for this
reason we choose this norm. See Section 7.4.11 for the specifics.
119
6.4. The proof
6.4
The proof
First we restate the definition of the restricted renormalization operator,
then we change coordinates and restate Theorem 6.1.3.
6.4.1
Definition of the operator
From now on we fix the domain of our Lorenz maps to some interval
[−1, r ]. The right endpoint cannot be fixed since it generally changes under renormalization (we will soon change coordinates so that the domain
is fixed).
Instead of dealing with functions with a discontinuity we represent a
Lorenz map F by a pair ( f , g), with f : [−1, 0] → [−1, r ], f (0) = r, and
g : [0, r ] → [−1, r ], g(0) = −1.
With this notation, the first-return map to some interval U will be of the
type ( F a , F b )|U if F is renormalizable. For the type ω = (01, 100), we can
be more precise: in this case a = 2, b = 3 and the first-return map is of the
form ( g ◦ f , f ◦ f ◦ g)|U if it is renormalizable.
Let T denote the restricted renormalization operator Rω , and fix the
critical exponent α = 2. If T ( f , g) = ( fˆ, ĝ) then T is defined by
fˆ(z) = λ−1 g ◦ f (λz),
ĝ(z) = λ−1 f ◦ f ◦ g(λz),
λ = − f 2 (−1).
6.4.2
Changing coordinates
To ensure the correct normalization (g(0) = −1) and the correct critical
exponent (α = 2) we make two coordinate changes and calculate how the
operator T transforms. We will also carefully choose the domain of T so
that all compositions are well-defined (e.g. λz is in the domain of f etc.).
This is checked automatically by the computer (and also shows that T is
differentiable with compact derivative, since f and g are analytic). Finally,
it is important to realize that the choice of coordinates may greatly affect the
operator norm of the derivative; not every choice will give a good enough
estimate.
The domain of T is chosen to be contained in the set of Lorenz maps
( f , g) with representation f (z) = φ(z2 ) and g(z) = ψ(z2 ), where φ and ψ
120
Chapter 6. Computer assisted proof
have domains {z : |z − 1| < s} and {z : |z| < t}, respectively (the constants
s and t will soon be specified). Rewriting T in terms of φ and ψ gives
φ̂(z) = λ−1 ψ(φ(λ2 z)2 )
ψ̂(z) = λ−1 φ(φ(ψ(λ2 z)2 )2 )
λ = − φ ( φ (1)2 )
This coordinate change ensures the correct critical exponent.
The next coordinate change is to fix the normalization and also to bring
the domain of all functions to the unit disk. Fixing the normalization has
the benefit that the error involved in the evaluation of λ is minimized (since
we only need to evaluate f close to z = 0, see Section 7.4.8). Changing all
domains to the unit disk simplifies the implementation of the computer
estimates.
Definition 6.4.1. Define X to be the Banach space of symmetric (with respect to the real axis) analytic maps on the unit disk with finite `1 –norm.
That is, if f ∈ X then f (z) = ∑ ak zk with ak ∈ R and k f k = ∑| ak | < ∞.
Definition 6.4.2. Define Y = X × X with the norm
k( f , g)kY = k f k X + k gk X
and with linear structure defined by
α( f , g) + β( f 0 , g0 ) = (α f + β f 0 , αg + βg0 ).
Clearly Y is a Banach space (since X is).
Change coordinates from φ, ψ to ( f , g) ∈ Y (note that f and g are not
the same as above) as follows
φ(z) = f ([z − 1]/s),
ψ(z) = −1 + z · g(z/t),
where we will choose s = 2.2 and t = 0.5. Rewriting T in terms of f and g
gives
n
2 2 o
fˆ(w) = λ−1 −1 + f λ2 w + 1s − 1s · g 1t f λ2 w + 1s − 1s
,
o
2
1 n
ĝ(w) =
1 + λ−1 f 1s f λ2 twg λ2 w λ2 twg λ2 w − 2 1s − 1s
,
tw
λ = − f [ f (0)2 − 1]/s .
This is the final form of the operator that will be studied.
121
6.4. The proof
6.4.3
Computing the derivative
In order to simplify the computation of the derivative of T we break the
computation down into several steps as follows:
p f ( w ) = λ 2 · ( w + s −1 ) − s −1
f1 = f ◦ p f
f2 =
p g ( w ) = λ2 w
g1 = g ◦ p g
f 12
g2 = t · p g · g1
f 3 = f 2 /t
g3 = g2 · ( g2 − 2)/s
f4 = g ◦ f3
g4 = f ◦ g3
f 5 = −1 + f 2 · f 4
g5 = ( g42 − 1)/s
f 6 = f 5 /λ
g6 = f ◦ g5
g7 = g6 /λ
g8 ( w ) = ( g7 ( w ) + 1 ) / ( t · w )
With this notation we have that T ( f , g) = ( f 6 , g8 ). Note that the result of
g7 (w) + 1 is a function with zero as constant coefficient so in the implementation of g8 we will not actually divide by w, instead we will ‘shift’ the
coefficients to the left.
It is now fairly easy to derive expressions for the derivative. If f is
perturbed by δ f and g is perturbed by δg, then the above functions are
perturbed as follows:
δp f (w) = 2 · λ · δλ · (w + s−1 )
δp g (w) = 2 · λ · δλ · w
δ f 1 = D f ◦ p f · δp f + δ f ◦ p f
δg1 = Dg ◦ p g · δp g + δg ◦ p g
δ f2 = 2 f1 δ f1
δg2 = t · (δp g · g1 + p g · δg1 )
δ f 3 = δ f 2 /t
δg3 = δg2 · ( g2 − 2)/s + g2 · δg2 /s
δ f 4 = Dg ◦ f 3 · δ f 3 + δg ◦ f 3
δg4 = D f ◦ g3 · δg3 + δ f ◦ g3
δ f5 = δ f2 · f4 + f2 · δ f4
δg5 = 2 · g4 · δg4 /s
δ f 6 = δ f 5 /λ − f 5 · δλ/λ
2
δg6 = D f ◦ g5 · δg5 + δ f ◦ g5
δg7 = δg6 /λ − g6 · δλ/λ2
δg8 (w) = δg7 (w)/(t · w)
With this notation we have that DT( f ,g) (δ f , δg) = (δ f 6 , δg8 ).
122
6.4.4
Chapter 6. Computer assisted proof
New statement
We now state Theorem 6.1.3 in the form it will be proved. The discussion
in Section 6.3.1 shows how this result can be used to deduce Theorem 6.1.3.
Theorem 6.4.3. There exists a Lorenz map F0 and a matrix Γ such that the simplified Newton operator Φ = ( Γ − I )−1 ( Γ − T ) is well-defined and satisfies:
1. k DΦF k < 0.2, for all k F − F0 k ≤ 10−7 ,
2. kΦF0 − F0 k < 5 · 10−9 .
3. k( Γ − eit I )−1 ( Γ − DTF )k < 0.9, for all t ∈ R, k F − F0 k ≤ 10−7 .
Proof. The remainder of this part is dedicated to rigorously checking the
first two estimates with a computer. The third estimate is verified by covering the unit circle with small rectangles and using the same techniques
as in the first two estimates to get rigorous upper bounds on the operator
norm. However, we have left out the source code for this estimate to keep
the page count down and also because the running time of the program
went from a few seconds to several hours (we had to cover the unit circle
with 50000 rectangles in order for the estimate to work).
Remark 6.4.4. The approximate fixed point F0 and approximate derivative Γ
at the fixed point are found by performing a Newton iteration eight times
on an initial guess (which was found by trial-and-error). We will not spend
too much time talking about these approximations but they could potentially be used to compute e.g. the Hausdorff dimension of the Cantor attractor of maps on the local stable manifold.
We did however compute the eigenvalues of Γ and it turns out that Γ
has two simple expanding eigenvalues λs ≈ 23.36530 and λw ≈ 12.11202,
and the rest of the spectrum is strictly contained in the unit disk. Since Γ is a
good approximation of DT f ? and both operators are compact it seems clear
that the spectrum of DT f ? also must have exactly two unstable eigenvalues.
Lanford (1984) claims that in the case of the period-doubling operator
if an analog of the third estimate of Theorem 6.4.3 holds and “if Γ has spectrum inside the unit disk except for a single simple expanding eigenvalue,
then the same will be true for DT f ? .” It seems plausible that a similar statement holds in the present situation with two simple expanding eigenvalues
but have not yet managed to prove this (it is easy to see that if Γ and DT f ?
were both diagonal then the third estimate would imply that they have the
same number of unstable eigenvalues).
C HAPTER
Implementation of estimates
This chapter was previously published online as “supplementary material”
to (Winckler, 2010).
In this chapter we implement the computer program which performs
the estimates needed to prove Theorem 6.1.3. The literature on this type of
computer assisted proof seems to have a tradition of never including these
details, most likely because it would require an order of thousands of lines
of source code. We make a conscious break from this tradition and show
how to implement all estimates in only 166 lines of source code.1 The key
behind this reduction in size is to use a pure functional programming language since it allows us to program in a declarative style: we specify what
the program does, not how it is accomplished. Purity means that functions
cannot have side-effects (the output from a function only depends on its input) which makes it easier to reason about the source code. In our context
this is important since it means that we can check the correctness of each
function in complete isolation from the rest of the source code (and a typical function is only one or two lines long which simplifies the verification
of individual functions). To further minimize the risk of programming errors we choose a strongly typed language since these are good at catching
common programming errors during compilation.
We would like to take this opportunity to advocate the programming
language Haskell for tasks similar to the one at hand — it has all the benefits
1 This includes: definition of main operator and its derivative (40 lines), an interval
arithmetic library (30 lines), a library for computing with analytic functions (65 lines), a
linear equation solver (15 lines).
123
7
124
Chapter 7. Implementation of estimates
mentioned above and more, but at the same time manages to produce code
which runs very fast (thanks to the GHC compiler). Unfortunately, many
readers will probably have had little prior exposure to Haskell and for this
reason we have in Section 7.9 included a brief overview of Haskell as well
as a table highlighting its syntax to aid the reader in understanding the
source code.
7.1
Verification of contraction
In this section we implement the main operator and compute the estimates
of Theorem 6.4.3. Before reading this section it may be a good idea to take a
quick glance at the beginning of Section 7.4 in order to understand the way
analytic functions are represented. It may also be helpful to use Table 7.1 in
Section 7.9 to look up unfamiliar syntax in the source code.
7.1.1
The main program
To begin with we import two functions from the standard library that will
be needed later:
1
import Data.List (maximumBy,transpose)
The entry point of the program is the function main, all that is done here
is to print the result of the computations to follow:
2
main = do putStrLn $ "radius
= " ++ show beta
putStrLn $ "|Phi(f)-f| < " ++ show eps
putStrLn $ "|DPhi|
< " ++ show theta
The initial guess2 is first improved by iterating a polynomial approximation3 of the operator Φ eight times (the derivative is recomputed in each
iteration, so this is a Newton iteration):
5
approxFixedPt = iterate (\t -> approx $ opPhi (gamma t) t) guess !! 8
Compute the approximation Γ of the derivative DT at the approximate
fixed point:
6
approxDeriv = gamma approxFixedPt
2 See
3 See
Section 7.7
Section 7.4.10
7.1. Verification of contraction
125
Compute an upper bound on the distance4 between the approximate fixed
point and its image under Φ:
7
eps = upper $ dist approxFixedPt (opPhi approxDeriv approxFixedPt)
Construct a ball5 of radius β centered around the approximate fixed point
and then compute the supremum of the operator norm6 of the derivative
on this ball:
8
theta = opnorm $ opDPhi approxDeriv (ball beta approxFixedPt)
The rest of this section will detail the implementation of the operator
Φ and its derivative. The generic routines for rigorous computation with
floating point numbers and analytic functions are discussed in the sections
that follow. All input to the program (d, sf, sg, guess, beta) is collected
in Section 7.7. Instructions on how to run the program and the output it
produces is given in Section 7.8.
7.1.2
The main operator
The operator T is computed in a function called mainOp which takes a
Lorenz map ( f , g) ∈ Y and a sequence of tangent vectors {(δ f k , δgk ) ∈
Y }nk=1 and returns ( T ( f , g), { DT( f ,g) (δ f k , δgk )}nk=1 ). We perform both computations in one function since the derivative uses a lot of intermediate
results from the computation of T ( f , g).
Given a Lorenz map (f,g) and a list of tangent vectors ds, first compute
f 6 and g8 as in Section 6.4.3 and split the result so that the polynomial parts
have degree at most d − 1. Then compute the derivatives and return the
result of these two computations in a pair:
9
mainOp
l
pf
pg
f1
f2
f3
f4
f5
f6
4 See
(f,g) ds = ((split d f6,split d g8), mainDer ds) where
= lambda f
= F [(l^2-1)/sf,l^2] 0 ; g1 = compose g pg
= F [0,l^2] 0
; g2 = pg * g1 .* sg
= compose f pf
; g3 = g2 * (g2 - 2) ./ sf
= f1^2
; g4 = compose f g3
= f2 ./ sg
; g5 = (g4^2 - 1) ./ sf
= compose g f3
; g6 = compose f g5
= -1 + f2*f4
; g7 = g6 ./ l
= f5 ./ l
; g8 = lshift g7 ./ sg
Section 7.4.3
Section 7.4.12
6 See Section 7.4.11
5 See
126
Chapter 7. Implementation of estimates
The actual computation of the derivative is performed next inside a local
function to mainOp. If there are no tangent vectors, no computation is performed:
19
mainDer [] = []
Otherwise, recurse over the list of tangent vectors and compute δ f 6 and
δg8 and again split the result so that the polynomial parts have degree at
most d − 1:
20
mainDer
dl
dpf
dpg
df1
df2
df3
df4
df5
df6
((df,dg):ds) = (split d df6,split d dg8) : mainDer ds where
= dlambda f df
= F ([2*l*dl]*[1/sf,1]) 0 ; dg1 = dcompose g pg dg dpg
= F [0,2*l*dl] 0
; dg2 = (dpg*g1 + pg*dg1) .* sg
= dcompose f pf df dpf
; dg3 = 2*(dg2*g2 - dg2) ./ sf
= 2*f1*df1
; dg4 = dcompose f g3 df dg3
= df2 ./ sg
; dg5 = 2*g4*dg4 ./ sf
= dcompose g f3 dg df3
; dg6 = dcompose f g5 df dg5
= df2*f4 + f2*df4
; dg7 = dg6./l - g6.*(dl/l^2)
= df5./l - f5.*(dl/l^2)
; dg8 = lshift dg7 ./ sg
Note that the constants s and t of Section 6.4.3 are called sf and sg respectively in the source code.
The above function can be used to compute the action of T by passing an
empty list of tangent vectors and extracting the first element of the returned
pair:
30
opT fg = fst $ mainOp fg []
Similarly, we can evaluate DT by extracting the second element:
31
opDT fg ds = snd $ mainOp fg ds
Using this function we compute an approximation Γ of DT( f ,g) by evaluating the derivative at the 2d first basis vectors7 of Y and approximating the
result with polynomials and packing them into a 2d × 2d matrix (transposing the resulting matrix is necessary because the linear algebra routines8
we use require the matrix to be stored in row-major order):
32
gamma fg = transpose $ map (interleavePoly . approx)
$ opDT fg (take (2*d) basis)
7 See
8 See
Section 7.4.11.
Section 7.5.
7.2. Computation with floating point numbers
127
Finally, the operator Φ (and its derivative) is implemented by taking a
Newton step9 with T (for convenience we pass the approximate derivative
as the parameter m):
34
opPhi m x
= newton m (opT x) x
opDPhi m x ds = [ newton m a b | (a,b) <- zip (opDT x ds) ds ]
7.1.3
The rescaling factor
With our choice of coordinates the rescaling factor λ only depends on f
(and not on g):
λ( f ) = − f [ f (0)2 − 1]/s
The implementation is straightforward:
36
lambda f = -eval f (((eval f 0)^2-1)/sf)
If f ∈ X is perturbed by δ f ∈ X then λ is perturbed by δλ, where
δλ = −2 · s−1 · f (0) · δ f (0) · D f [ f (0)2 − 1]/s − δ f [ f (0)2 − 1]/s .
Derivative evaluation has to be handled carefully since we are using the
`1 –norm, see Section 7.4.4. If y = [ f (0)2 − 1]/s, then y lies in the closed
disk of radius |y| but since we need to evaluate the derivative on an open
disk we first enlarge the bound on |y| to get the radius µ and then evaluate
the derivative on this slightly larger disk:
37
dlambda f df =
where f0 =
y =
mu =
7.2
-2/sf * f0 * eval df 0 * eval (deriv mu f) y - eval df y
eval f 0
(f0^2 - 1)/sf
enlarge $ abs y
Computation with floating point numbers
We discuss how to control rounding and avoid overflow and underflow
when computing with floating point numbers. We show how to lift operations on floating point numbers to intervals and then how to bound these
operations.
9 See
Section 7.4.13.
128
Chapter 7. Implementation of estimates
7.2.1
Safe numbers
In order to avoid overflow and underflow during the course of the proof
we restrict all computations to the set of safe numbers (Koch et al., 1996)
which we define as the subset of double precision floating point numbers
(referred to as floats from now on) x such that x = 0 or 2−500 < | x | < 2500 .
We say that y is a safe upper bound on x 6= 0 iff x < y (strict inequality)
and y is a safe number; safe lower bounds are defined analogously. If x = 0,
then y = 0 is both a safe upper and lower bound and there are no other safe
bounds on x (this will make sense after reading the assumption below).
Safe numbers allow us to perform rigorous computations on any computer conforming to the IEEE 754 standard since such a computer must
satisfy the following assumption:10
Assumption. Let x̄ be a float resulting from an arithmetic operation on safe numbers performed by the computer and let x be the exact result of the same operation.
If x̄ 6= 0 then either x̄ = x − or x̄ = x + , where x − is the largest float such that
x − ≤ x and x + is the smallest float such that x ≤ x + . Furthermore, x̄ = 0 if and
only if x = 0.
Under this assumption we know that the exact result must lie within
any safe upper and lower bounds on x̄, and we know that when the computer returns a result of 0 then the computation must be exact.
Given a float x we now show how to find safe upper and lower bounds
on x.
Check if a number is safe:
41
isSafe x = let ax = abs x in x == 0 || (ax > 2^^(-500) && ax < 2^500)
Use this function to assert that a number is safe, abort the program otherwise:
42
assertSafe x | isSafe x = x
| otherwise = error "assertSafe: not a safe number"
Given a float we can ‘step’ to an adjacent float as follows:
44
stepFloat n 0 = 0
stepFloat n x = let (s,e) = decodeFloat x in encodeFloat (s+n) e
10 This statement follows from: (1) the fact that IEEE 754 guarantees correct rounding,
(2) the result of an arithmetic operation on safe numbers is a normalized float so silent
underflow to zero cannot occur.
7.2. Computation with floating point numbers
129
That is, stepFloat 1 x is the smallest float larger than x, and similarly
stepFloat (-1) x is the largest float smaller than x, unless x = 0 in which
case x is returned in both cases. (The function decodeFloat converts a float
to the form s · 2e , where s, e ∈ Z, and encodeFloat converts back to a float.)
Now finding a safe upper or lower bound is easy, just step to the next
float and assert that it is safe:
46
safeUpperBound = assertSafe . stepFloat 1
safeLowerBound = assertSafe . stepFloat (-1)
7.2.2
The Scalar data type
The Scalar data type represents safe lower and upper bounds on a number:
48
data Scalar = S !Double !Double deriving (Show,Eq)
The first number is the lower bound, the second the upper bound. The
following function returns the upper bound:
49
upper (S _ u) = u
We bound operations on real numbers by first lifting them to operations
on Scalar values and then bound the resulting operations by enlarging the
bound to safe lower and upper bounds. An operation is exact if it does not
involve any rounding (in which case there is no need to enlarge a bound).
The function that takes a Scalar with lower bound l and upper bound
u, then finds a safe lower bound on l and a safe upper bound on u is implemented as follows:
50
enlarge (S l u) = S (safeLowerBound l) (safeUpperBound u)
For convenience we provide a function to convert a number x to a
Scalar with x as both lower and upper bound:
51
toScalar x = S x x
7.2.3
Arithmetic on scalars
We make Scalar an instance of the Num type class so that we can perform
arithmetic on scalars (addition (+), subtraction (-), negation, multiplication (*) and nonnegative integer exponentiation (^)).
52
instance Num Scalar where
130
Chapter 7. Implementation of estimates
If x ∈ [l, u] then − x ∈ [−u, −l ]; negation is exact on safe number so we do
not need to enlarge the bound:
53
negate (S l u) = S (-u) (-l)
If x ∈ [l, u] then | x | ∈ [max{0, l, −u}, − min{0, l, −u}] (it is easy to check
that this is correct regardless of the signs of l and r). All operations involved
are exact on safe numbers so we do not need to enlarge the bound:
54
abs (S l u) = S (maximum xs) (-minimum xs)
where xs = [0, l, -u]
If x ∈ [l, u] and y ∈ [l 0 , u0 ], then x + y ∈ [l + l 0 , u + u0 ]. This operation is not
exact so we enlarge the bound:
56
(S l u) + (S l’ u’) = enlarge (S (l + l’) (u + u’))
If x ∈ [l, u] and y ∈ [l 0 , u0 ], then x ∗ y ∈ [ a, b] where a is the minimum of
the numbers {l ∗ l 0 , l ∗ u0 , u ∗ l 0 , u ∗ u0 } and b is the maximum of the same
numbers. This operation is not exact so we enlarge the bound:
57
(S l u) * (S l’ u’) = enlarge (S (minimum xs) (maximum xs))
where xs = [l*l’, l*u’, u*l’, u*u’]
The last two methods are required to complete the implementation of the
Num instance (fromInteger provides implicit conversion of integer literals
to Scalar values):
59
fromInteger
= toScalar . fromInteger
signum (S l u) = error "S.signum: not defined"
In order to be able to divide Scalar values using (/) we must also add
Scalar to the Fractional type class.
61
instance Fractional Scalar where
If x ∈ [l, u] and if l, u have the same sign, then the reciprocal is well-defined
for x and 1/x ∈ [1/u, 1/l ]. This operation is not exact so we enlarge the
bound:
62
recip (S l u) | l*u > 0
= enlarge (S (1/u) (1/l))
| otherwise = error "S.recip: not well-defined"
The last method is required; it provides implicit conversion of decimal literals to Scalar values:
64
fromRational = toScalar . fromRational
131
7.3. Computation with polynomials
7.2.4
Ordering of scalars
In order to be able to compare Scalar values, e.g. using (<), we add Scalar
to the Ord type class. If two bounds overlap we declare them incomparable
and halt the program, otherwise comparison is implemented in the obvious
way.
65
instance Ord Scalar where
compare (S l u) (S l’ u’)
| u < l’
=
| l > u’
=
| l == l’ && u == u’ =
| otherwise
=
7.3
LT
GT
EQ
error "S.compare: uncomparable"
Computation with polynomials
We show how to lift operations on polynomials (of degree d − 1) to rectangle sets in Rd and then how to bound these operations.
7.3.1
Representation of polynomials
Polynomials are represented as a list of Scalar values (with the first element representing the constant coefficient). Hence what we refer to as a
‘polynomial’ of degree d − 1 is actually a rectangle set in Rd . In this section
we lift operations on actual polynomials to such rectangles. We do not need
to find any bounds on these lifts since this was already done implicitly in
the previous section.
7.3.2
Arithmetic with polynomials
Add polynomials to the Num type class so that we can perform arithmetic
operations on polynomials. (This implementation is a bit more general
since it adds [a] to the Num type class for any type a in the Num type class.)
71
instance (Num a) => Num [a] where
Addition: [c1 + zq1 (z)] + [c2 + zq2 (z)] = [c1 + c2 ] + z[q1 (z) + q2 (z)].
72
(c1:q1) + (c2:q2) = c1 + c2 : q1 + q2
[]
+ p2
= p2
p1
+ []
= p1
Multiplication: [c1 + zq1 (z)] · [c2 + zq2 (z)] = [c1 · c2 ] + z[c1 · q2 (z) + q1 (z) ·
p2 (z)], where p2 (z) = c2 + zq2 (z).
132
Chapter 7. Implementation of estimates
(c1:q1) * [email protected](c2:q2) = c1*c2 : [c1]*q2 + q1*p2
_
* _
= []
75
The remaining methods are straightforward:
negate p
= map negate p
fromInteger c = [fromInteger c]
abs
= error "abs not implemented for polynomials"
signum
= error "signum not defined for polynomials"
77
7.3.3
Polynomial evaluation
Evaluation of the polynomial c + zq(z) at the point t is done using the obvious recursion:
81
peval (c:q) t = c + t * peval q t
peval []
_ = 0
7.3.4
Norm of polynomial
We use the `1 –norm on polynomials, i.e. k a0 + · · · + an zn k = | a0 | + · · · +
| a n |:
83
pnorm = sum . map abs
7.3.5
Derivative of polynomial
The derivative of c + zq(z) is implemented using the recursion suggested
by D (c + zq(z)) = q(z) + zDq(z):
84
pderiv (c:q) = q + (0 : pderiv q)
pderiv []
= []
7.4
Computation with analytic functions
We show how to lift operations on analytic functions to rectangle subsets
in X and how to bound these operations.
7.4.1
The Function data type
Functions in X are represented as
f ( z ) = p ( z ) + z d h ( z ),
7.4. Computation with analytic functions
133
where p is a polynomial (not necessarily of degree less than d) and k hk < K,
where h ∈ X. We refer to p as the polynomial part of f and h is called the
error of f . The value for the degree d is specified in Appendix 7.7.
The Function data type represents an analytic function on the above
form (the first parameter is the polynomial part, the second parameter is
the bound on the error):
86
data Function = F ![Scalar] !Scalar deriving (Show,Eq)
That is, Function represents rectangle subsets of X of the form
{ a0 + · · · + an zn + zd h(z) | ak ∈ Ak , k = 0, . . . , n, khk ∈ I },
where { Ak } and I are intervals. Only the upper bound on the error term
is needed so we do not take care to ensure that the lower bound is correct.
Hence, the lower bound will be meaningless in general.
Note that we do allow n ≥ d in the above representation but in general we adjust our computations to ensure n < d. We call this operation
splitting: if
f ( z ) = a0 + · · · + a n z n + z d h ( z ),
with n ≥ k ≥ d, then we can split f at degree k into
f (z) = a0 + · · · + ak−1 zk−1 + zd [ ak zk−d + · · · + an zn−d + h(z)]
= p0 (z) + zd [r (z) + h(z)].
Thus the polynomial part of f after splitting is p0 and the error is bounded
by kr k + k hk (by the triangle inequality). The implementation of this operation is:
87
split k (F p e) = let (p’,r) = splitAt k p in F p’ (e + pnorm r)
We will now lift operations on analytic functions to the above type of
rectangles and then find bounds on these operations.
7.4.2
Arithmetic with analytic functions
In what follows we let f i (z) = pi (z) + zd hi (z) for i = 1, 2, 3, and let f 1 f 2 =
f 3 where is the operation under consideration.
Make Function an instance of the Num type class so that we can perform
arithmetic operations on functions (addition (+), subtraction (-), negation,
multiplication (*) and nonnegative integer exponentiation (^)).
134
88
Chapter 7. Implementation of estimates
instance Num Function where
Addition of two functions is performed by adding the polynomial part and
the error separately, p3 = p1 + p2 and h3 = h1 + h2 , so that k h3 k ≤ k h1 k +
kh2 k by the triangle inequality:
89
(F p1 e1) + (F p2 e2) = F (p1 + p2) (e1 + e2)
Multiplication of two analytic functions is given by the equation
i
h
f 1 ( z ) f 2 ( z ) = p1 ( z ) p2 ( z ) + z d p1 ( z ) h2 ( z ) + p2 ( z ) h1 ( z ) + z d h1 ( z ) h2 ( z ) ,
so that k h3 k ≤ k p1 kkh2 k + k p2 kkh1 k + k h1 kkh2 k. To ensure that the degree
of the polynomial part does not increase too much we split it at degree
d + 1:11
90
(F p1 e1) * (F p2 e2) = split (d+1) (F (p1*p2) e3)
where e3 = e2*pnorm p1 + e1*pnorm p2 + e1*e2
The negation of f is − p(z) + zd (−h(z)) but the error is unchanged since we
only keep a bound on its norm:
92
negate (F p e) = F (negate p) e
The remaining methods are required to complete the implementation of the
Num instance (fromInteger provides implicit conversion of integer numerals to Function values):
93
fromInteger c = F [fromInteger c] 0
abs
= error "abs not implemented for Function"
signum
= error "signum not defined for Function"
7.4.3
Norm of analytic functions
The triangle inequality gives k p(z) + zd h(z)k ≤ k pk + k hk (since |z| < 1):
96
norm (F p e) = pnorm p + e
The norm on the Cartesian product Y = X × X is k( f , g)k = k f k + k gk:
97
sumnorm (f,g) = norm f + norm g
The distance induced by the norm on Y:
98
dist (f,g) (f’,g’) = sumnorm (f-f’,g-g’)
11 We choose to split at degree d + 1 (instead of the perhaps more natural choice of degree d) because the division by z in the definition of the operator T would otherwise cause
g8 to have degree at most d − 2.
135
7.4. Computation with analytic functions
7.4.4
Differentiation
The implementation of differentiation of f ∈ X is complicated by the use
of the `1 –norm on X, since k f k < ∞ does not imply that k D f k < ∞.
This problem is overcome by only computing the derivative of functions
restricted to a disk of radius strictly smaller than one. That is, we need to
know a-priori that the function we are differentiating only will be evaluated
on this smaller disk. Usually we get this information from the fact that we
compute derivatives like D f 1 ◦ f 2 and we have bounds on the image of f 2 .
Given f ∈ X we will estimate D f |{|z|<µ} where µ < 1. If f (z) = p(z) +
d
z h(z), then D f (z) = Dp(z) + dh(z)zd−1 + zd Dh(z) = p1 (z) + zd h1 (z).
Here we are faced with the problem that we only know the norm of h so
all we can say about the polynomial part is that p1 (z) = Dp(z) + sdzd−1 ,
where s ∈ [−khk, k hk].
Let h(z) = ∑ ak zk , then the error can be crudely approximated as follows:
k Dh(z) |{|z|<µ} k = k Dh(µz)k =
∑ kµk−1 |ak | ≤ sup kµk−1 khk
k ≥1
k ≥1
≤ khk ∑ kµk−1 =
k ≥1
khk
(1 − µ )2
Putting all this together we arrive at the following implementation:
99
deriv mu (F p e) | mu < 1
= F p1 e1
| otherwise = error "deriv: mu is not < 1"
where p1 = pderiv p + [S (-s) s] * [0,1]^(d-1)
e1 = e / (1 - mu)^2
s = fromIntegral d * upper e
Note that µ is passed as a parameter by the caller of this function (it
is not a constant). As mentioned earlier, usually this function is used to
compute expressions like D f 1 ◦ f 2 in which case µ will be an upper bound
on the radius of a disk containing the image of f 2 .
7.4.5
Composition
The implementation of composition of analytic functions f 1 ◦ f 2 is split up
into two parts. First we consider the special case when f 1 = p1 is a polynomial, then we treat the general case.
136
Chapter 7. Implementation of estimates
Polynomials are defined on all of C so the composition p1 ◦ f 2 is always
defined. If p1 (z) = c + zq(z) then we may use the recursion suggested by
p1 ◦ f 2 ( z ) = c + f 2 ( z ) · q ◦ f 2 ( z ):
104
compose’ (c:q) f2 = (F [c] 0) + f2 * compose’ q f2
The recursion ends when the polynomial is the zero polynomial, in which
case p1 ◦ f 2 = 0:
105
compose’ []
_
= 0
In the general case we have to take care to ensure that the image of f 2
is contained in the domain of f 1 for the composition to be well-defined. A
sufficient condition for this to hold is k f 2 k < 1 since the domain of f 1 is
the unit disk. Under this assumption we compute f 1 ◦ f 2 = p1 ◦ f 2 + ( f 2 )d ·
h1 ◦ f 2 . These two terms are split at degree d + 1 to get p1 ◦ f 2 (z) = p̃1 (z) +
zd h̃1 (z) and f 2 (z)d = p̃2 (z) + zd h̃2 (z).12 Then f 1 ◦ f 2 (z) = p3 (z) + zd h3 (z)
with p3 = p̃1 + p̃2 · h1 ◦ f 2 and h3 = h̃1 + h̃2 · h1 ◦ f 2 . Only the norm of h1 is
given so from this we can only draw the conclusion that p3 (z) = p̃1 (z) + s ·
p̃2 (z) for s ∈ [−kh1 k, k h1 k] (s is in fact a function but we may think of it as a
constant since we are really computing with sets of polynomials). The error
is approximated using the triangle inequality, k h3 k ≤ kh̃1 k + k h̃2 kkh1 k.
compose (F p1 e1) f2 | norm f2 < 1 = F (p1' + [s]*p2') (e1' + e1*e2')
                     | otherwise   = error "compose: |f2| is too large"
  where (F p1' e1') = split (d+1) (compose' p1 f2)
        (F p2' e2') = split (d+1) (f2^d)
        s           = S (-upper e1) (upper e1)
The term s · p̃₂(z) can introduce devastating errors into the computation, since s lies in an interval which has a positive upper bound and a negative lower bound (if p̃₂ has a coefficient with small error but a large magnitude relative to s, then after multiplying with s that coefficient will have an error that is bigger than the magnitude of the coefficient). We work around this problem by choosing the degree d large, since this tends to make the term s smaller. Another way to deal with this problem is to include a “general error” term in the representation of analytic functions (Koch et al., 1996).
¹² See the footnote near the definition of multiplication of analytic functions for an explanation of the choice of degree d + 1.
7.4.6 Derivative of the composition operator
Let S(f, g) = f ◦ g; then the derivative is given by

    DS_{(f,g)}(δf, δg) = Df ◦ g · δg + δf ◦ g.
Note that when computing Df we must specify as a first parameter the radius of a disk strictly contained in the unit disk to which Df is restricted (see Section 7.4.4). In the present situation we know that the image of g is contained in a disk with radius ‖g‖, so Df only needs to be evaluated on the disk of radius ‖g‖:
dcompose f g df dg = (deriv (norm g) f `compose` g) * dg + (df `compose` g)
7.4.7 Division by z
If f(z) = a₁z + · · · + aₙz^n + z^d h(z), then

    f(z)/z = a₁ + · · · + aₙz^{n−1} + z^{d−1} h(0) + z^d h̃(z),

where |h(0)| ≤ ‖h‖ and ‖h̃‖ ≤ ‖h‖. Since we do not know the value of h(0) we estimate this coefficient with s ∈ [−‖h‖, ‖h‖]. We think of this operation as a “left shift”, whence the name of this function:
lshift (F (c:q) e) = F (q + [0,1]^(d-1) * [S (-upper e) (upper e)]) e
lshift (F [] e)    = F ([0,1]^(d-1) * [S (-upper e) (upper e)]) e
If the polynomial part of f has a constant coefficient a₀ ≠ 0, then this function will not return the correct result, so we take care to only use it when we know that a₀ = 0.
7.4.8 Point evaluation
If f(z) = p(z) + z^d h(z), then f(t) = p(t) + t^d · s for some s ∈ [−‖h‖, ‖h‖]. We also check that t is in the unit disk; otherwise the program is terminated with an error:
eval (F p e) t | abs t < 1 = peval p t + t^d * (S (-upper e) (upper e))
               | otherwise = error ("eval: not in domain t=" ++ show t)
Note that the further away t is from 0, the more error is introduced in the
evaluation. For t = 0 the error term has no influence on the evaluation.
7.4.9 Scaling
As a convenience we define operators to scale an analytic function by a scalar on the right. The precedence for these operators is the same as for their ‘normal’ counterparts.
Multiplication satisfies [p(z) + z^d h(z)] · x = x·p(z) + z^d·[x·h(z)], and division is handled similarly. Note that the error term is affected:
infixl 7 .*, ./
(F p e) .* x = F (p * [x]) (e * abs x)
(F p e) ./ x = F (p * [1/x]) (e / abs x)
7.4.10 Polynomial approximation
Let f(z) = p(z) + z^d h(z). To approximate f by a polynomial we first discard the error term z^d h(z), then we disregard the errors in the coefficients of p. That is, for p(z) = a₀ + · · · + aₙz^n with aₖ ∈ [aₖ⁻, aₖ⁺] we replace aₖ with the mean ãₖ = (aₖ⁻ + aₖ⁺)/2 (we ‘collapse’ the bounds on aₖ). Finally, we lift this operation to pairs of functions:
approx (f,g) = (approx' f, approx' g)
  where approx' (F p _) = F (map (toScalar . collapse) p) 0
        collapse (S l u) = (l+u)/2
7.4.11 Operator norm
Let ξₖ(z) = z^k, so that {ξₖ}_{k≥0} is a basis for X. A basis for Y is {ηₖ}_{k≥0}, where η₂ₖ = (ξₖ, 0) and η₂ₖ₊₁ = (0, ξₖ). This set is implemented as follows:
basis = interleave (zip basis' (repeat 0)) (zip (repeat 0) basis')
  where basis' = map xi [0..]
        xi k   = F (replicate k 0 ++ [1]) 0
Proposition 7.4.1. If L : Y → Y is a linear and bounded operator, then

    ‖L‖ = max{‖Lη₀‖, . . . , ‖Lη_{2d−1}‖, sup_{h∈B_d} ‖L(h, 0)‖, sup_{h∈B_d} ‖L(0, h)‖},

where B_d = {z^d h(z) : ‖h‖ < 1}.

This is a consequence of using the ℓ¹-norm on X.
Given a linear operator op acting on a list of tangent vectors,¹³ we estimate the operator norm by applying it to the first 2d basis vectors and to the sets B_d × 0 and 0 × B_d. Then we compute the upper bound of the norm of the results and take the maximum:
opnorm op = maximum $ map (upper . sumnorm) $ op tangents
  where tangents = (F [] 1,0) : (0,F [] 1) : take (2*d) basis
Note that B_d is represented by the set of functions with no polynomial part and an error bounded by 1, which is the same as F [] 1.
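A finite-dimensional analogue may make the proposition more transparent: on (ℝⁿ, ‖·‖₁) the operator norm of a matrix is the maximum over the basis vectors of the ℓ¹-norm of their images, i.e. the largest absolute column sum. A small illustration (not part of the thesis code; opNormL1 is a hypothetical name):

opNormL1 :: [[Double]] -> Double
opNormL1 m = maximum [columnSum j | j <- [0 .. cols - 1]]
  where cols        = length (head m)
        -- ||M e_j||_1 is the sum of absolute values in column j
        columnSum j = sum [abs (row !! j) | row <- m]

In X the same principle applies, except that everything of degree at least d is represented only through the norm of h, which is why the two suprema over B_d appear.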
7.4.12 Construction of balls
We cannot exactly represent arbitrary balls in X with the Function type. Instead we construct a rectangle set which is guaranteed to contain the ball. Thus, a bound on a ball of radius r centered at an analytic function (in our case it is always a polynomial, i.e. e=0) can be implemented as follows:
ball r (f,g) = (ball' r f, ball' r g)
  where ball' r (F p e) = F (map (+ S (-r) r) p) (e + toScalar r)
7.4.13 Newton's method

This is our variant of Newton's method on Y:

    (f, g) ↦ (M − I)⁻¹(M − T)(f, g),
where M is a 2d × 2d matrix passed as the first parameter. The second parameter is T(f, g) and the third parameter is (f, g). When lifting M into Y we project the error term to zero by letting s = 0, and when lifting (M − I)⁻¹ we preserve the error term by letting s = 1 (see below for how this lifting is done):
newton m (tf,tg) fg = fg'
  where (mf,mg) = liftPolyOp 0 (apply m) fg
        fg'     = liftPolyOp 1 (solve $ subtractDiag m 1) (mf-tf,mg-tg)
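A one-dimensional caricature of this scheme (illustrative only, not part of the thesis code): with a fixed slope estimate m ≈ T′ at the fixed point, Newton's method for the fixed point equation T(x) = x reads x ↦ (m·x − T(x))/(m − 1), which is the formula above with scalars in place of operators:

newtonStep :: Double -> (Double -> Double) -> Double -> Double
newtonStep m t x = (m * x - t x) / (m - 1)

For example, iterate (newtonStep (-0.67) cos) 0.5 converges rapidly to the fixed point of cos (approximately 0.739), since the derivative of cos there is approximately −0.67.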
Let f = p(z) + z^d h(z), where deg p < d, and let A be the linear operator represented (in the basis {z^k}) by the infinite matrix

    ( M  0  )
    ( 0  sI )

where M is a d × d matrix and I is the infinite identity matrix. We lift the linear operator A into X by

    A f(z) = M p(z) + z^d (s · h(z)).

¹³ The linear operator acts on a sequence of tangent vectors since this is how we have implemented the derivative of the main operator.
The following function implements this lifting into Y. We split f and g to ensure that their degrees are at most d − 1, and since our linear algebra routines require their input as one vector we interleave the polynomial parts. Also, instead of passing M we pass a linear operator op, which allows us to use one function to lift both matrix multiplication (apply) and solution of linear equations (solve):
liftPolyOp s op (f,g) = (F pf' (s*ef), F pg' (s*eg))
  where fg@(F pf ef, F pg eg) = (split d f, split d g)
        (pf',pg') = uninterleave $ op (interleavePoly fg)
When interleaving the polynomial parts of two functions we first pad the polynomials with zeros to ensure their lengths are exactly d (e.g. a₀ + a₁z is padded to a₀ + a₁z + 0z² + · · · + 0z^{d−1}). Hence the resulting vector always has length 2d:
interleavePoly (F p _, F q _) = interleave (pad p) (pad q)
  where pad x = take d $ x ++ (repeat 0)
7.5 Linear algebra routines
In this section we implement a simple linear algebra library to compute
matrix-vector products and to solve linear equations.
A matrix is represented as a list of its rows and a row is a list of its
elements. A vector is just a list of elements (we think of them as column
vectors). This is a very simplistic library so no checking is done to ensure
that matrices have the correct dimensions (e.g. it is quite possible to create
a ‘matrix’ with rows of differing lengths).
7.5.1 Matrix-vector product
Computing the matrix-vector product Mx is fairly straightforward:
apply m x = map (dotProduct x) m
The dot product of vectors a and b:
dotProduct a b = sum $ zipWith (*) a b
Note that if a and b have different lengths, then the above function will
treat the longer vector as if it had the same length as the shorter.
7.5.2 Linear equation solver
The following function solves the linear system of equations Mx = b. It is
a simple wrapper around a function which solves a linear system given an
augmented matrix.
solve m b = solveAugmented $ augmentedMatrix m b
The augmented matrix for M and b is (M | b), i.e. the matrix with b appended as the last column of M:
augmentedMatrix = zipWith (\x y -> x ++ [y])
We now implement a linear equation solver which takes an augmented
matrix as its only parameter. It is implemented using Gaussian elimination with partial pivoting. The only novelty compared with a traditional
imperative implementation is that we solve the equations recursively.
Given an n × (n + 1) augmented matrix M′, first perform partial pivoting, i.e. move the row whose first element has the largest magnitude to the top, forming the matrix M. Assuming that we already have the solution for x₂, . . . , xₙ we can compute

    x₁ = (m_{1,n+1} − ∑_{j=2}^{n} m_{1j}·x_j) / m₁₁

and we are done. The solution for x₂, . . . , xₙ is found recursively as follows: perform a Gaussian elimination on M to ensure that all rows except the first start with a zero, giving a matrix M̃. Throw away the first row and column of M̃ to get an (n − 1) × n matrix N′ and solve the linear system with augmented matrix N′. The solution to this system is x₂, . . . , xₙ.
solveAugmented [] = []
solveAugmented m' = (last m1t - dotProduct m1t x) / m11 : x
  where m@((m11:m1t):_) = partialPivot m'
        x = solveAugmented $ eliminate m
Partial pivoting is done by first computing the list of all possible ways to split the matrix M into a top and a bottom half. This list is searched for the split whose bottom half starts with the row whose first element has maximal magnitude. The maximal split is then reassembled into one matrix by moving the top row of the bottom half to the top of the matrix.
partialPivot m = piv:mtop ++ mbot
  where (mtop,piv:mbot) = maximumBy comparePivotElt (splits m)
        comparePivotElt (_,(a:_):_) (_,(b:_):_) = compare (abs a) (abs b)
The following routine uses Gaussian elimination to ensure that every row except the first starts with a zero. That is, we add a suitable multiple of the first row to each of the other rows, one at a time:
eliminate ((m11:m1t):mbot) = foldl appendScaledRow [] mbot
  where appendScaledRow a (r:rs) = a ++ [scaleAndAdd (-r/m11) m1t rs]
        scaleAndAdd s a b = zipWith (+) (map (*s) a) b
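With the definitions above in scope, the solver can be exercised on plain numbers as a quick sanity check (a hypothetical test, not part of the verified estimates). The system 2x + y = 5, x + 3y = 10 has solution x = 1, y = 3:

checkSolve :: IO ()
checkSolve = print (solve [[2,1],[1,3]] [5,10 :: Double])
-- prints [1.0,3.0]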
Remark 7.5.1. When using the above linear equation solver with matrices and vectors over intervals (of type Scalar) there is a question of what the ‘solution’ represents. As always, we are computing bounds on solutions: if M is in some rectangle set [M] of matrices and b is in some rectangle set [b] of vectors, then the above routine will compute a rectangle set [x] such that if x is a solution to Mx = b, then x ∈ [x].
Note that our solver will compute rather loose bounds on the solution set; see e.g. Jansson and Rump (1991) for ways of finding sharper bounds.
7.6 Supporting functions
Given a square matrix M and a number x compute M − xI, i.e. subtract x
from every diagonal element of M:
subtractDiag m x = foldl f [] (zip m [0..])
  where f m' (r,k) = let (h,t:ts) = splitAt k r
                     in m' ++ [h ++ [t-x] ++ ts]
Given a list, return all possible ways to split the list in two:
splits x = splits' [] x
  where splits' _ []        = []
        splits' x y@(yh:yt) = (x,y) : splits' (x ++ [yh]) yt
Interleave two lists a and b, i.e. construct a new list by taking the first element from a, then the first element from b and repeating for the remaining
elements.
interleave a b = concat $ zipWith (\x y -> [x,y]) a b
Perform the ‘inverse’ of the above function, i.e. take a list c and construct a pair of lists (a,b) such that interleave a b = c:
uninterleave = unzip . pairs
Given a list, partition it into pairs of adjacent elements:
pairs [] = []
pairs (x:y:rest) = (x,y) : pairs rest
pairs _ = error "list must have even length"
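A small round-trip check of the three functions above (illustrative only): interleaving two equal-length lists and then uninterleaving recovers the original pair:

checkInterleave :: Bool
checkInterleave = uninterleave (interleave [1,3] [2,4 :: Int]) == ([1,3],[2,4])
-- evaluates to True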
7.7 Input to the main program
The degree of the error term in our representation of analytic functions:
d = 13 :: Int
The radius of the ball on which Φ is a contraction:
beta = 1.0e-7 :: Double
The radii for the domains of φ and ψ:
sf = 2.2 :: Scalar
sg = 0.5 :: Scalar
The initial guess for the fixed point:
guess = (F [-0.75, -2.5] 0, F [6.2,-2.1] 0)
7.8 Running the main program
This document contains all the Haskell source code needed to compile the program into an executable. Given a copy of the LaTeX source of this document (assuming the file is named lmca.lhs), use the following command to compile it:¹⁴
ghc --make -O2 lmca.lhs
This produces an executable called lmca (or lmca.exe if you are using Windows) which when called will execute the main function.
Here is the output of running the main program:
radius     = 1.0e-7,
|Phi(f)-f| < 4.830032057738462e-9,
|DPhi|     < 0.1584543808202988
This output was taken from a sample run using GHC 6.12.1 on Mac OS X 10.6.2. The running time on a 1.8 GHz Intel Core 2 Duo was less than 10 seconds.
¹⁴ The GHC compiler can be downloaded for free from http://haskell.org.
7.9 Haskell mini-reference
This section introduces some of the features and syntax of Haskell to help
anybody unfamiliar with the language read the source code. It is assumed
that the reader has some prior experience with an imperative language
(Java, C, etc.) but is new to functional programming languages. Table 7.1
below collects examples of Haskell syntax used in the source code and can
be used to look up unfamiliar expressions. For more information on the
Haskell language go to http://haskell.org.
Haskell is a functional language. Such languages differ from imperative languages in several significant ways: for example, there are no control structures such as for loops, and data is immutable, so there is no concept of variables (memory locations) that can be written to.
Basic types include: booleans (True, False), numbers (e.g. -1, 2.3e3; integers of any magnitude are supported), tuples (e.g. (1,'a',0.3); elements can have different types), and lists (e.g. [1,2,3]; all elements must have the same type). Functions are on the same level as basic types, so they can e.g. be passed as parameters to other functions.
Functions are defined like f parameters = expression where f is the function name and there can be zero or more parameters. Note that there are
no parentheses around parameters and that parameters are separated by
spaces. Function calls have very high precedence, so f x^2 is the same as
(f x)^2, not f (x^2). The keywords let .. in and where can be used to
bind expressions to function-local definitions (i.e. local functions or variables).
New data types can be defined using the data construct. For example,
data Interval = I Double Double defines a type called Interval which
consists of two double precision floating point numbers (i.e. the endpoints
of the interval). New values of this type are constructed using the value
constructor which we called I, e.g. I 0 1 defines the unit interval.
Functions can be defined with pattern matching on built-in and custom
data types. For example, len (I a b) = abs (b-a) defines a function len
which returns the length of an interval (for the custom data type Interval).
We often use pattern matching on lists, where [] matches the empty list and (x:xs) matches a list with at least one element and binds the first element to x and the rest to xs (read as plural of x). The notation v@(x:xs) can be used to bind the entire list to v on a match.
The notation _ may be used to match anything without binding the
match to a variable, e.g. firstZero (x:_) = x == 0 defines a function
which returns True if the first element of a nonempty list is equal to zero
(and throws an exception if called on the empty list []).
Type classes are a way of declaring that a custom data type supports
a certain predefined collection of functions and also allow for ‘overloading’ of functions (and operators, which can be turned into functions as
noted in the example for (+) in Table 7.1). We only mention type classes
because we come across them when implementing Scalar and Function.
The pre-defined type classes we use are Num (for (+), (-), (*), (^), abs),
Fractional (for (/), (^^)), Eq (test for equality), and Show (for conversion
to strings).
Expression                Description
f1 x = 2*x                define a function f1 which doubles its argument
f2 x y = x+y              define a function f2 which adds its two arguments
f1 3                      apply f1 to 3 (=6)
f2 3 4                    apply f2 to 3 and 4 (=7)
f2 2 (f1 3)               apply f2 to 2 and 6 (the result of f1 3) (=8)
f2 2 $ f1 3               same as above (the operator $ is often used in this way to avoid overuse of parentheses)
f2 2 f1 3                 error (this means: compute f2 2 f1 and apply the result to 3, but 2+f1 does not make sense)
\x -> 2*x                 define the anonymous function x ↦ 2x
f2 3                      apply f2 to 3 (= the function \x -> 3+x)
f1 . f2 3                 composition (= the function \x -> 2*(3+x))
3 `f2` 4                  turn function (in backticks) into an operator (=7)
(+) 3 4                   turn operator (in parentheses) into a function (=7)
(3*)                      fix first parameter to 3 (= the function \x -> 3*x)
g x | x<0 = -1            define the sign function g using guards (the | symbols)
    | x>0 = 1
    | x==0 = 0
num [] = 0                define a function which counts the number of elements in a list using pattern matching
num (_:xs) = 1+num xs
[1,2]                     a list (all elements must have the same type)
[]                        the empty list
[1..]                     the list of all positive integers
[2..5]                    list enumeration with bounds (=[2,3,4,5])
1 : [2,3]                 append element to beginning of list (=[1,2,3])
[1,2,3] !! 0              access list elements by zero-based index (=1)
[1,2] ++ [3]              concatenate two lists (=[1,2,3])
'a'                       a character
"abc"                     a string, i.e. a list of characters (=['a','b','c'])
2^3                       nonnegative integer exponentiation (=8)
2^^(-1)                   integer exponentiation (=0.5)
('a',2)                   a pair (the elements need not have the same type)
fst ('a',2)               access first element in a pair (='a')
snd ('a',2)               access second element in a pair (=2)
map f1 [1..]              apply f1 to all elements in the list (=[2,4,6,..])
[f2 a b | a <- [1,2], b <- [3..5]]
                          list comprehension, i.e. { f2(a,b) | a ∈ {1,2}, b ∈ {3,4,5} } (=[4,5,6,5,6,7])
foldl f2 1 [3,5]          fold over list (compute f2 1 3 = 4, then f2 4 5) (=9)
iterate f1 1              compute orbit of 1 under f1 (=[1,2,4,8,..])
maximum [1,4,2]           return maximum element in a list (=4)
maximumBy f x             as above, but using f to compare elements of the list x
minimum [1,4,2]           return minimum element in a list (=1)
splitAt 2 [1,4,2]         split list in two at given index (=([1,4],[2]))
take 3 [7..]              take the first 3 elements from the list (=[7,8,9])
zip [1..] [3,4]           join two lists into a list of pairs (=[(1,3),(2,4)])
unzip [(1,3),(2,4)]       'inverse' of zip (=([1,2],[3,4]))
zipWith f2 [1..] [3,4]    like zip, but use f2 to join elements (=[4,6])
repeat 0                  infinite list with one element repeated (=[0,0,..])
replicate 3 0             finite list with one element repeated (=[0,0,0])
sum [3,-1,4]              sum of elements in list (=6)
transpose m               the transpose of the matrix m (m is a list of lists)
putStrLn "hi"             print hi to standard out and append a new line
show 1.2                  turn the number 1.2 into the string "1.2"
error "ohno"              abort program with error message ohno

Table 7.1: Examples of Haskell syntax used in the source code.
Appendix A

Background material
The purpose of this appendix is to collect some background material that is used throughout the text. Section A.1 contains a topological fixed point theorem which is suitable for proving the existence of periodic points for renormalization operators in general. Sections A.2 and A.3 contain basic facts about the nonlinearity operator and the Schwarzian derivative, respectively.
A.1 A fixed point theorem
The following theorem is an adaptation of Granas and Dugundji (2003,
Theorem 4.7).
Theorem A.1.1. Let X ⊂ Y where X is closed and Y is a normal topological space. If f : X → Y is homotopic to a map g : X → Y with the property that every extension of g|∂X to X has a fixed point in X, and if the homotopy hₜ has no fixed point on ∂X for every t ∈ [0, 1], then f has a fixed point in X.
Remark A.1.2. Note that the statement forces X to have nonempty interior: g has a fixed point (being an extension of g|∂X), but the requirement on the homotopy implies that g has no fixed point on ∂X, so the fixed point must lie in the interior of X.
Proof. Let Fₜ be the set of fixed points of hₜ and let F = ⋃ Fₜ. Since g must have a fixed point, F is nonempty. Since hₜ has no fixed points on ∂X for every t, F and ∂X are disjoint.
We claim that F is closed. To see this, let {xₙ} ⊂ F be a convergent sequence and let x = lim xₙ. Note that x ∈ X since F ⊂ X and X is closed. By definition there exist tₙ ∈ [0, 1] such that xₙ = h(xₙ, tₙ). Pick a convergent subsequence t_{nₖ} → t. Since xₙ is convergent, h(x_{nₖ}, t_{nₖ}) = x_{nₖ} → x, but at the same time h(x_{nₖ}, t_{nₖ}) → h(x, t) since h is continuous. Hence h(x, t) = x, that is x ∈ F, which proves the claim.
Since Y is normal and ∂X and F are disjoint closed sets, there exists a map λ : X → [0, 1] such that λ|F = 0 and λ|∂X = 1. Define ḡ(x) = h(x, λ(x)). Then ḡ is an extension of g|∂X since if x ∈ ∂X, then ḡ(x) = h(x, 1) = g(x). Hence ḡ has a fixed point p ∈ X. However, p must also be a fixed point of f, since p = ḡ(p) = h(p, λ(p)) so that p ∈ F and consequently p = ḡ(p) = h(p, 0) = f(p).
A.2 The nonlinearity operator
Definition A.2.1. Let C^k(A; B) denote the set of k times continuously differentiable maps f : A → B and let D^k(A; B) ⊂ C^k(A; B) denote the subset of orientation-preserving homeomorphisms whose inverses lie in C^k(B; A). As a notational convenience we write C^k(A) instead of C^k(A; A), and C^k instead of C^k(A; B) if there is no need to specify A and B (and similarly for D^k).
Definition A.2.2. The nonlinearity operator N : D²(A; B) → C⁰(A; ℝ) is defined by

(A.1)    Nφ = D log Dφ.

We say that Nφ is the nonlinearity of φ.
Remark A.2.3. Note that Nφ = D²φ / Dφ.
Definition A.2.4. The distortion of φ ∈ D¹(A; B) is defined by

    Dist φ = sup_{x,y∈A} log ( Dφ(x) / Dφ(y) ).
Remark A.2.5. We think of the nonlinearity of φ ∈ D²(A; B) as the density for its distortion. To understand this remark, let dµ = Nφ(t)dt. Assuming Nφ is a positive function, µ is a measure and

    Dist φ = ∫_A dµ,

since by (A.1)

    ∫ₓ^y Nφ(t)dt = log ( Dφ(y) / Dφ(x) ).

If Nφ is negative, then −Nφ(t) is a density. The only problem with the interpretation of Nφ as a density occurs when it changes sign. Intuitively speaking, we can still think of the nonlinearity as a local density of the distortion (away from the zeros of Nφ).

Note that Nφ does not change sign in the important special case of φ being a pure map (i.e. a restriction of x^α). So the (absolute value of the) nonlinearity is the density for the distortion of pure maps.
Lemma A.2.6. The kernel of N : D²(A; B) → C⁰(A; ℝ) consists precisely of the orientation-preserving affine map that takes A onto B.
Lemma A.2.7. The nonlinearity operator N : D²(A; B) → C⁰(A; ℝ) is a bijection. In the specific case of A = B = [0, 1] the inverse is given by

(A.2)    N⁻¹f(x) = ∫₀ˣ exp{∫₀ˢ f(t)dt} ds / ∫₀¹ exp{∫₀ˢ f(t)dt} ds.
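In the spirit of Chapter 7, formula (A.2) is straightforward to evaluate numerically. The following rough sketch (not part of the thesis code; ninv is a hypothetical name) approximates N⁻¹f on [0, 1] with left-endpoint Riemann sums on n grid points:

ninv :: Int -> (Double -> Double) -> Double -> Double
ninv n f x = num / den
  where
    -- left-endpoint Riemann sum of g over [a,b] on n subintervals
    integral a b g =
      (b - a) / fromIntegral n *
        sum [g (a + (b - a) * fromIntegral i / fromIntegral n) | i <- [0 .. n - 1]]
    inner s = exp (integral 0 s f)
    num     = integral 0 x inner
    den     = integral 0 1 inner

For f ≡ 0 this returns (up to discretization error) the identity, in accordance with the kernel described in Lemma A.2.6: ninv 1000 (const 0) 0.5 is approximately 0.5.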
Lemma A.2.8 (The chain rule for the nonlinearity operator). If φ, ψ ∈ D², then

(A.3)    N(ψ ◦ φ) = Nψ ◦ φ · Dφ + Nφ.
Proof. Use the chain rule of differentiation:

    N(ψ ◦ φ) = D log D(ψ ◦ φ) = D log(Dψ ◦ φ · Dφ)
             = D log(Dψ ◦ φ) + D log Dφ = (D²ψ ◦ φ · Dφ)/(Dψ ◦ φ) + Nφ
             = (D log Dψ) ◦ φ · Dφ + Nφ = Nψ ◦ φ · Dφ + Nφ.
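Identities such as (A.3) are easy to spot-check numerically with finite differences; a throwaway sketch (not part of the thesis code; nl and chainRuleGap are hypothetical names):

-- finite-difference nonlinearity N f = D^2 f / D f at a point
nl :: (Double -> Double) -> Double -> Double
nl f x = d2 / d1
  where h  = 1e-4
        d1 = (f (x + h) - f (x - h)) / (2 * h)
        d2 = (f (x + h) - 2 * f x + f (x - h)) / h ^ 2

-- difference between the two sides of (A.3); should be near zero
chainRuleGap :: (Double -> Double) -> (Double -> Double) -> Double -> Double
chainRuleGap psi phi x = nl (psi . phi) x - (nl psi (phi x) * dphi + nl phi x)
  where h    = 1e-4
        dphi = (phi (x + h) - phi (x - h)) / (2 * h)

For example, chainRuleGap exp (\x -> x^2 + x) 0.3 is close to zero.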
Definition A.2.9. We turn D²(A; B) into a Banach space by inducing the usual linear structure and uniform norm of C⁰(A; ℝ) via the nonlinearity operator. That is, we define

(A.4)    αφ + βψ = N⁻¹(αNφ + βNψ),
(A.5)    ‖φ‖ = sup_{t∈A} |Nφ(t)|,

for φ, ψ ∈ D²(A; B) and α, β ∈ ℝ.
Lemma A.2.10. If φ ∈ D²(A; B), then

(A.6)    N(φ⁻¹)(y) = − Nφ(φ⁻¹(y)) / Dφ(φ⁻¹(y)),    ∀y ∈ B.
Proof. Let x = φ⁻¹(y); then

    N(φ⁻¹)(y) = D log D(φ⁻¹)(y) = D log [Dφ(x)⁻¹]
              = − (D²φ(x)/Dφ(x)) · D(φ⁻¹)(y) = − Nφ(x)/Dφ(x).
Lemma A.2.11. If φ ∈ D²(A; B), then

(A.7)    e^{−|y−x|·‖φ‖} ≤ Dφ(y)/Dφ(x) ≤ e^{|y−x|·‖φ‖},
(A.8)    (|B|/|A|)·e^{−‖φ‖} ≤ Dφ(x) ≤ (|B|/|A|)·e^{‖φ‖},
(A.9)    |D²φ(x)| ≤ (|B|/|A|)·‖φ‖·e^{‖φ‖},

for all x, y ∈ A.
Proof. Integrate the nonlinearity to get

    ∫ₓ^y Nφ(t)dt = log ( Dφ(y) / Dφ(x) ),

as well as

    ∫ₓ^y Nφ(t)dt ≤ |y − x|·‖φ‖.
Combine these two equations to get (A.7). By the mean value theorem we may choose y such that Dφ(y) = |B|/|A|, so (A.8) follows from (A.7). Finally, since

    Nφ(x) = D log Dφ(x) = D²φ(x)/Dφ(x),

we get |D²φ(x)| ≤ |Dφ(x)|·‖φ‖. Now apply (A.8) to get (A.9).
Lemma A.2.12. If φ, ψ ∈ D²(A; B), then

(A.10)    |φ(x) − ψ(x)| ≤ (e^{2‖φ−ψ‖} − 1) · min{φ(x), 1 − φ(x)},
(A.11)    e^{−‖φ−ψ‖} ≤ Dφ(x)/Dψ(x) ≤ e^{‖φ−ψ‖},

for all x ∈ A.
Lemma A.2.13. The set B = {φ ∈ D² : ‖φ‖ ≤ K} is relatively compact in C⁰.

Proof. Maps in B are uniformly bounded by definition, and since maps in B have uniformly bounded derivative (by (A.8)) they are equicontinuous. The theorem of Arzelà–Ascoli now says that any sequence in B has a subsequence which converges uniformly (to a map in C⁰). Thus, the C⁰-closure of B is compact in C⁰.
Definition A.2.14. Let ζ_J : [0, 1] → J be the affine orientation-preserving map taking [0, 1] onto an interval J. Define the zoom operator Z : D²(A; B) → D²([0, 1]) by

(A.12)    Zφ = ζ_B⁻¹ ◦ φ ◦ ζ_A.

Remark A.2.15. Note that if φ ∈ D(A; B), then B = φ(A), so Zφ only depends on φ and A (not on B). We will often write Z(φ; A) instead of Zφ in order to emphasize the dependence on A.
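Concretely, for an increasing map f on an interval [a, b] ⊂ ℝ the zoom is just an affine rescaling of domain and range. A minimal numerical sketch (illustrative only, not part of the thesis code):

-- Z(f; [a,b]) as a map [0,1] -> [0,1], for f increasing on [a,b]
zoom :: (Double -> Double) -> (Double, Double) -> Double -> Double
zoom f (a, b) x = (f (zetaA x) - f a) / (f b - f a)
  where zetaA t = a + (b - a) * t   -- zeta_A : [0,1] -> [a,b]

For example, zoom (^2) (1, 3) sends x to ((1 + 2x)² − 1)/8.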
Lemma A.2.16. If φ ∈ D²(A; B), then

(A.13)    Z(φ⁻¹) = (Zφ)⁻¹,
(A.14)    N(Zφ) = |A| · Nφ ◦ ζ_A,
(A.15)    ‖Zφ‖ = |A| · ‖φ‖.
Proof. The first equation is just a calculation:

    Z(φ⁻¹) = ζ_A⁻¹ ◦ φ⁻¹ ◦ ζ_B = (ζ_B⁻¹ ◦ φ ◦ ζ_A)⁻¹ = (Zφ)⁻¹.

To see the second equation, apply the chain rule for nonlinearities and use the fact that affine maps have zero nonlinearity:

    N(Zφ) = N(ζ_B⁻¹ ◦ φ ◦ ζ_A) = N(φ ◦ ζ_A) = Nφ ◦ ζ_A · Dζ_A = |A| · Nφ ◦ ζ_A,

since Dζ_A ≡ |A|. This implies the third equation:

    ‖Zφ‖ = sup_{x∈[0,1]} |Nφ ◦ ζ_A(x)| · |A| = sup_{x∈A} |Nφ(x)| · |A| = ‖φ‖ · |A|.
A.3 The Schwarzian derivative
In this appendix we collect some results on the Schwarzian derivative.
Proofs can be found in de Melo and van Strien (1993, Chapter IV).
Definition A.3.1. The Schwarzian derivative S : D³(A; B) → C⁰(A; ℝ) is defined by

(A.16)    Sf = D(Nf) − ½ (Nf)².
Remark A.3.2. Note that

    Sf = D³f/Df − (3/2) (D²f/Df)².
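As with the nonlinearity, this expression can be spot-checked with finite differences; a crude sketch (not part of the thesis code; schwarzian is a hypothetical name), which by Lemma A.3.3 below should return values near zero for Möbius maps:

schwarzian :: (Double -> Double) -> Double -> Double
schwarzian f x = d3 / d1 - 1.5 * (d2 / d1) ^ 2
  where h  = 1e-3
        d1 = (f (x + h) - f (x - h)) / (2 * h)
        d2 = (f (x + h) - 2 * f x + f (x - h)) / h ^ 2
        d3 = (f (x + 2*h) - 2 * f (x + h) + 2 * f (x - h) - f (x - 2*h)) / (2 * h ^ 3)

For example, schwarzian (\x -> (2*x + 1) / (x + 3)) 0.5 is close to zero.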
Lemma A.3.3. The kernel of S : D³(A; B) → C⁰(A; ℝ) is the set of orientation-preserving Möbius maps which take A onto B.
Lemma A.3.4 (The chain rule for the Schwarzian derivative). If f, g ∈ D³, then

(A.17)    S(f ◦ g) = Sf ◦ g · (Dg)² + Sg.
Proof. Use the chain rule for nonlinearities:

    S(f ◦ g) = D(N(f ◦ g)) − ½ (N(f ◦ g))²
             = D(Nf ◦ g · Dg + Ng) − ½ (Nf ◦ g · Dg + Ng)²
             = D(Nf) ◦ g · (Dg)² + Nf ◦ g · D²g + D(Ng)
               − ½ (Nf ◦ g)² · (Dg)² − Nf ◦ g · Dg · Ng − ½ (Ng)²
             = [D(Nf) ◦ g − ½ (Nf ◦ g)²] · (Dg)² + Sg
             = Sf ◦ g · (Dg)² + Sg.

(In the fourth step we used the fact that Dg · Ng = D²g.)
Lemma A.3.5. Sf < 0 if and only if S(f⁻¹) > 0.
Lemma A.3.6 (Koebe Lemma). If f ∈ D³((a, b); ℝ) and Sf ≥ 0, then

(A.18)    |Nf(x)| ≤ 2 · (min{|x − a|, |x − b|})⁻¹.
Proof. A proof for this particular statement of the Koebe lemma can be
found in Jiang (1996, Lemma 2.4). A more general version of the Koebe
lemma can be found in de Melo and van Strien (1993, Section IV.3).
Corollary A.3.7. Let τ > 0 and let f ∈ D³(A; B). If f extends to a map F ∈ D³(I; J) with SF < 0, and if J \ B has two components, each having length at least τ·|B|, then

    ‖Zf‖ ≤ e^{2/τ} · 2/τ.
Proof. Since SF < 0 it follows that S(F⁻¹) ≥ 0, so the Koebe lemma and (A.15) imply that

    ‖Z(f⁻¹)‖ = |B| · ‖f⁻¹‖ ≤ |B| · 2/(τ·|B|) = 2/τ.

Now apply Lemmas A.2.10, A.2.11 (A.8) and A.2.16 (A.13):

    ‖Zf‖ ≤ exp{‖(Zf)⁻¹‖} · ‖(Zf)⁻¹‖ = exp{‖Z(f⁻¹)‖} · ‖Z(f⁻¹)‖ ≤ e^{2/τ} · 2/τ.
Bibliography
A. Arneodo, P. Coullet, and C. Tresser. 1981. A possible new mechanism
for the onset of turbulence. Phys. Lett. A, 81(4):197–201.
P. J. Bushell. 1973. Hilbert’s metric and positive contraction mappings in a
Banach space. Arch. Rational Mech. Anal., 52:330–338.
Pierre Collet, Pierre Coullet, and Charles Tresser. 1985. Scenarios under
constraint. J. Physique Lett., 46(4):143–147.
Pierre Coullet and Charles Tresser. 1978. Itérations d’endomorphismes et
groupe de renormalisation. C. R. Acad. Sci. Paris Sér. A-B, 287(7):A577–
A580.
Edson de Faria, Welington de Melo, and Alberto Pinto. 2006. Global hyperbolicity of renormalization for C^r unimodal mappings. Ann. of Math. (2), 164(3):731–824.
Welington de Melo and Sebastian van Strien. 1993. One-dimensional dynamics, volume 25 of Ergebnisse der Mathematik und ihrer Grenzgebiete (3) [Results in Mathematics and Related Areas (3)]. Springer-Verlag, Berlin.
J.-P. Eckmann, H. Epstein, and P. Wittwer. 1984. Fixed points of Feigenbaum's type for the equation f^p(λx) ≡ λf(x). Comm. Math. Phys., 93(4):495–516.
Jean-Pierre Eckmann and Peter Wittwer. 1987. A complete proof of the
Feigenbaum conjectures. J. Statist. Phys., 46(3-4):455–475.
Mitchell J. Feigenbaum. 1978. Quantitative universality for a class of nonlinear transformations. J. Statist. Phys., 19(1):25–52.
Mitchell J. Feigenbaum. 1979. The universal metric properties of nonlinear
transformations. J. Statist. Phys., 21(6):669–706.
Jean-Marc Gambaudo and Marco Martens. 2006. Algebraic topology for
minimal Cantor sets. Ann. Henri Poincaré, 7(3):423–446.
Andrzej Granas and James Dugundji. 2003. Fixed point theory. Springer
Monographs in Mathematics. Springer-Verlag, New York.
John Guckenheimer. 1976. A strange, strange attractor. In Jerrold E. Marsden and Marjorie McCracken, editors, The Hopf bifurcation and its applications, pages 368–381. Springer-Verlag, New York.
John Guckenheimer and R. F. Williams. 1979. Structural stability of Lorenz
attractors. Inst. Hautes Études Sci. Publ. Math., 50:59–72.
C. Jansson and S. M. Rump. 1991. Rigorous solution of linear programming
problems with uncertain data. Z. Oper. Res., 35(2):87–111.
Yunping Jiang. 1996. Renormalization and geometry in one-dimensional and
complex dynamics, volume 10 of Advanced Series in Nonlinear Dynamics.
World Scientific Publishing Co. Inc., River Edge, NJ.
Anatole Katok and Boris Hasselblatt. 1995. Introduction to the modern theory of dynamical systems, volume 54 of Encyclopedia of Mathematics and its
Applications. Cambridge University Press, Cambridge.
Hans Koch, Alain Schenkel, and Peter Wittwer. 1996. Computer-assisted
proofs in analysis and programming in logic: a case study. SIAM Rev., 38
(4):565–604.
Oscar E. Lanford, III. 1982. A computer-assisted proof of the Feigenbaum
conjectures. Bull. Amer. Math. Soc. (N.S.), 6(3):427–434.
Oscar E. Lanford, III. 1984. Computer-assisted proofs in analysis. Phys. A,
124(1-3):465–470. Mathematical physics, VII (Boulder, Colo., 1983).
A. Libchaber and J. Maurer. 1979. Rayleigh–Bénard experiment in liquid helium; frequency locking and the onset of turbulence. Journal de
Physique — Lettres, 40(16):419–423.
Paul S. Linsay. 1981. Period doubling and chaotic behavior in a driven
anharmonic oscillator. Phys. Rev. Lett., 47(19):1349–1352.
Edward N. Lorenz. 1963. Deterministic nonperiodic flow. Journal of the
Atmospheric Sciences, 20:130–141.
Mikhail Lyubich. 1999. Feigenbaum-Coullet-Tresser universality and Milnor’s hairiness conjecture. Ann. of Math. (2), 149(2):319–420.
M. Martens, W. de Melo, and S. van Strien. 1992. Julia-Fatou-Sullivan theory for real one-dimensional dynamics. Acta Math., 168(3-4):273–318.
Marco Martens. 1994. Distortion results and invariant Cantor sets of unimodal maps. Ergodic Theory Dynam. Systems, 14(2):331–349.
Marco Martens. 1998. The periodic points of renormalization. Ann. of Math.
(2), 147(3):543–584.
Marco Martens and Welington de Melo. 2001. Universal models for Lorenz
maps. Ergodic Theory Dynam. Systems, 21(3):833–860.
P. Martien, S. C. Pope, P. L. Scott, and R. S. Shaw. 1985. The chaotic behavior
of the leaky faucet. Physics Letters A, 110:399–404.
Curtis T. McMullen. 1996. Renormalization and 3-manifolds which fiber over the
circle, volume 142 of Annals of Mathematics Studies. Princeton University
Press, Princeton, NJ.
John Milnor. 1985. On the concept of attractor. Comm. Math. Phys., 99(2):
177–195.
Michał Misiurewicz. 1981. Absolutely continuous measures for certain
maps of an interval. Inst. Hautes Études Sci. Publ. Math., 53:17–51.
Matthias St. Pierre. 1999. Topological and measurable dynamics of Lorenz
maps. Dissertationes Math. (Rozprawy Mat.), 382:134.
Dennis Sullivan. 1992. Bounds, quadratic differentials, and renormalization
conjectures. In American Mathematical Society centennial publications, Vol.
II (Providence, RI, 1988), pages 417–466. Amer. Math. Soc., Providence, RI.
Warwick Bryan Tucker. 1998. The Lorenz attractor exists. ProQuest LLC, Ann
Arbor, MI. Thesis (Ph.D.)–Uppsala Universitet (Sweden).
Marcelo Viana. 2000. What's new on Lorenz strange attractors? Math. Intelligencer, 22(3):6–19.
Björn Winckler. 2010. A renormalization fixed point for Lorenz maps. Nonlinearity, 23(6):1291–1302.
Index

A
a priori bounds, 54
archipelago, 93
attractor, 7

B
basin of attraction, 7
bifurcation diagram, 4
branch, 28
    full, 58
    trivial, 58

C
chain rule
    for nonlinearities, 151
    for Schwarzian derivative, 154
combinatorial type, 13, 29
    bounded, 14, 29
composition operator, 64
    on decomposed maps, 72
    partial, 64
cone field, 100
critical exponent, 10, 25
critical point, 26
critical values, 26
cycles of renormalization, 31

D
decomposition, 63
    pure, 68
diffeomorphic parts, 26
distortion, 150
    of a decomposition, 64
    signed, 69

E
ergodic, 34
extremal point, 97

F
Feigenbaum delta, 2
first-return map, 12
float, 128
full family, 58

G
gaps of generation n, 56

H
Haskell, 123, 144
Hilbert metric, 39
hyperbolic metric, 39

I
infinitely renormalizable, 13, 29
intervals of generation n, 56
invariant measure, 35
island, 93

K
Koebe Lemma, 155

L
limit set of renormalization, 17, 103
logistic family, 3
loop measure, 38
Lorenz attractor, 7
Lorenz equations, 7
Lorenz flow, 7
    geometric, 9
Lorenz map, 9, 26
    decomposed, 71
    nontrivial, 12, 27
    trivial, 12, 27

M
monotone combinatorics, 29
monotone family, 94
monotone type, 13

N
nice interval, 31
nonlinearity, 150
    as density for distortion, 151
nonlinearity norm, 27
nonlinearity operator, 27, 150
nonunimodal, 113

P
period-doubling cascade, 1
period-doubling operator, 4
phase transition, 1
pure map, 68
push-forward, 38

R
renormalizable, 6, 12, 27
    n times, 13
renormalization, 12, 27
    boundary of, 58
    type of, 13, 28
renormalization conjectures, 5
renormalization horseshoe, 6, 18
renormalization operator, 6, 27
    on decomposed maps, 72
rigidity, 20, 115

S
safe bound, 128
safe number, 128
Sandwich Lemma, 66
Schwarzian derivative, 154
slice, 58
stable manifold, 113
standard Lorenz family, 25

T
time set, 63
transfer map, 31
transfer time, 31

U
uniquely ergodic, 35
universality, 3
    in the parameter plane, 3
    metric, 115

V
vertex, 97

W
wandering interval, 33
weak Markov property, 33

Z
zoom operator, 29, 153
    on decompositions, 67