Renormalization of Lorenz Maps

BJÖRN WINCKLER

Doctoral Thesis
Stockholm, Sweden, 2011

TRITA-MAT-11-MA-02, ISSN 1401-2278, ISRN KTH/MAT/DA 11/01-SE, ISBN 978-91-7501-011-3
Institutionen för matematik, KTH, 100 44 Stockholm

Academic dissertation which, with the permission of the Royal Institute of Technology (KTH), is submitted for public examination for the degree of Doctor of Technology in Mathematics, on Tuesday 23 August 2011, in lecture hall D3.

© Björn Winckler, August 2011
Printed by Universitetsservice US-AB

Abstract

This thesis is a study of the renormalization operator on Lorenz maps with a critical point. Lorenz maps arise naturally as first-return maps for three-dimensional geometric Lorenz flows. Renormalization is a tool for analyzing the microscopic geometry of dynamical systems undergoing a phase transition.

In the first part we develop new tools to study the limit set of renormalization for Lorenz maps whose combinatorics satisfy a long return condition. This combinatorial condition leads to the construction of a relatively compact subset of Lorenz maps which is essentially invariant under renormalization. From here we can deduce topological properties of the limit set (e.g. existence of periodic points of renormalization) as well as measure-theoretic properties of infinitely renormalizable maps (e.g. existence of uniquely ergodic Cantor attractors). After this, we show how Martens' decompositions can be used to study the differentiable structure of the limit set of renormalization. We prove that each point in the limit set has a global two-dimensional unstable manifold which is a graph, and that the intersection of an unstable manifold with the domain of renormalization is a Cantor set. All results in this part are stated for arbitrary real critical exponents α > 1.

In the second part we give a computer-assisted proof of the existence of a hyperbolic fixed point for the renormalization operator on Lorenz maps of the simplest possible nonunimodal combinatorial type.
We then show how this can be used to deduce both universality and rigidity for maps with the same combinatorial type as the fixed point. The results in this part are only stated for critical exponent α = 2.

Summary

This thesis is a study of the renormalization operator on Lorenz maps with a critical point. Such maps arise naturally as first-return maps for geometric Lorenz flows in three dimensions. Renormalization is a tool that can be used to analyze the microscopic geometry of dynamical systems undergoing a phase transition.

In Part I we develop new tools for analyzing the limit set of the renormalization operator on Lorenz maps whose combinatorics satisfy a long return time condition. This condition is used to construct a relatively compact set of Lorenz maps which is essentially invariant under renormalization. From this we can prove topological properties of the limit set (e.g. existence of periodic points of the renormalization operator) as well as measure-theoretic properties of infinitely renormalizable Lorenz maps (e.g. existence of uniquely ergodic Cantor attractors). We then show how Martens' decompositions can be used to analyze the differentiable structure of the limit set of renormalization. We show that every point in the limit set has a two-dimensional global unstable manifold which is a graph, and that the intersection of the unstable manifold with the domain of renormalization is a Cantor set. All results in this part hold for arbitrary real critical exponents α > 1.

In Part II we prove that the renormalization operator has a hyperbolic fixed point of the simplest possible nonunimodal combinatorial type. The proof relies on a computer program to carry out certain rigorous estimates. We also show how the existence of such a fixed point implies universality and rigidity for maps of the same combinatorial type.
This part holds only for critical exponent α = 2.

Acknowledgements

I would like to thank my supervisors Marco Martens and Michael Benedicks, and my assistant supervisor Masha Saprykina. I would also like to thank Kristian Bjerklöv, Kostya Khanin, Denis Gaidashev, and the dynamical systems group at Stony Brook. A large part of this thesis was conceived during my stay at Institut Mittag-Leffler (Djursholm, Sweden) in the spring of 2010. I gratefully acknowledge their support.

Contents

1 Introduction
  1.1 Background
  1.2 Renormalization
  1.3 Lorenz flows
  1.4 Statement of results
  1.5 Previous results
  1.6 Future work

I Renormalization of maps with long return time

2 Preliminaries
  2.1 The renormalization operator
  2.2 Generalized renormalization
  2.3 Invariant measures

3 Invariance
  3.1 The invariant set
  3.2 A priori bounds
  3.3 Periodic points of the renormalization operator

4 Decompositions
  4.1 Decompositions
  4.2 Renormalization of decomposed maps

5 Differentiable structure
  5.1 The derivative
  5.2 Archipelagos in the parameter plane
  5.3 Invariant cone field
  5.4 Unstable manifolds

II Existence of a hyperbolic renormalization fixed point

6 Computer assisted proof
  6.1 Existence of a hyperbolic fixed point
  6.2 Consequences
  6.3 Outline of the computer assisted proof
  6.4 The proof

7 Implementation of estimates
  7.1 Verification of contraction
  7.2 Computation with floating point numbers
  7.3 Computation with polynomials
  7.4 Computation with analytic functions
  7.5 Linear algebra routines
  7.6 Supporting functions
  7.7 Input to the main program
  7.8 Running the main program
  7.9 Haskell mini-reference

A Background material
  A.1 A fixed point theorem
  A.2 The nonlinearity operator
  A.3 The Schwarzian derivative

Bibliography

Index

Chapter 1

Introduction

1.1 Background

To get a feel for the subject of this thesis before getting into the details, I will begin by relating something of a mystery that occurs in the physical world. Imagine a narrow tap which has very precise control over the flow of water coming out of it. The flow is controlled by turning a knob which has a dial indicating the angle the knob is turned. To begin with no water is flowing. After turning the knob ever so slightly, water starts dripping.
At this point the first interesting thing happens: as the knob is slowly turned, the frequency with which the water drips does not change, until all of a sudden the water starts dripping twice as fast as before. Now this pattern repeats itself: as the knob is turned the frequency of the drips does not change, until all of a sudden the frequency doubles again. This frequency doubling can be observed a couple of times until the water starts flowing in a steady stream. In very general terms we call this a phase transition (with dripping and flowing water being the two phases) via a period-doubling[1] cascade.

Let a1 denote the angle the knob was turned when the tap started dripping, a2 the angle of the knob when the frequency doubled for the first time, a3 the angle of the second frequency doubling, and so on. If this experiment were performed on another tap the recorded angles would most likely differ wildly. However, and this is the mysterious part, the asymptotic relative distance between these angles is independent of the tap! That is, if d1 = a2 − a1 is the distance between the first two recorded angles, d2 = a3 − a2 the distance between the next two angles, and so on, then the sequence of ratios d1/d2, d2/d3, ... approaches a number which has nothing to do with the tap used for the experiment:

    d_i / d_{i+1} → δ ≈ 4.6692...

The number δ is called the Feigenbaum delta (or the first[2] Feigenbaum constant). The particular value of δ is not so interesting on its own, but what is interesting is that it has a tendency to turn up in (dissipative) systems that undergo a phase transition via a period-doubling cascade. Another, seemingly unrelated, system where the Feigenbaum delta appears is oscillating electric circuits.

[1] The way we have presented this example it is the frequency which is doubling. However, if we perform the experiment "in reverse," then it is the period which is doubling.
This example is not as appealing to the intuition as the dripping tap, so I will not go into too many details. The setup is simple: connect a resistor, an inductor, and a diode, and feed the circuit with a sinusoidal signal. By increasing the amplitude of the input signal, the voltage measured across the diode will exhibit the same frequency-doubling characteristic as the dripping tap. That is, if the input amplitude is increased slowly, the voltage measured across the diode will double in frequency at specific values of the amplitude; call these V1, V2, V3, and so on. As before, if we form relative distances between these values, d1 = V2 − V1, d2 = V3 − V2, etc., then d_i/d_{i+1} → δ.

The topic of this thesis, renormalization, is a tool for analyzing systems which, like the above examples, are undergoing a phase transition. On one level it explains why seemingly unrelated systems such as the dripping tap and the oscillating circuit exhibit the same behavior during a transition, but on a deeper level it allows us to give a precise description of the mathematics underlying this phenomenon.

[2] There is also a second Feigenbaum constant but its description is less intuitive.

1.2 Renormalization

The phenomenon presented in the previous section was not discovered by some mad scientist playing with dripping taps in a lab; it was first observed in the computer experiment outlined below. Only later did people come up with clever ways of reproducing the results in physical systems (see e.g. Libchaber and Maurer, 1979; Linsay, 1981; Martien et al., 1985). The computer experiment itself was inspired by phase transitions in statistical mechanics.

The computer experiment goes something like this: pick a family of quadratic maps which depend on one parameter, for example the logistic family

    f_µ(x) = µx(1 − x),  x ∈ [0, 1],  µ > 0.
Now iterate the critical point (x = 1/2) of this map and see where it ends up after many iterations for different values of the parameter µ. The observed behavior is shown in Figure 1.1. For small values of µ the critical point always ends up in a fixed point whose position is given by the curve coming in from the left in the figure; after µ = 3 all of a sudden the critical point ends up in a period-two orbit, given by the split into two curves; a little later the curves split again and now the critical point follows a period-four orbit, and so on. Around µ = 3.6 it is no longer clear where the critical point ends up: it may tend to a periodic orbit but the map may also be chaotic. The reason why we look at the fate of the critical point is that it can be shown to reflect the behavior of almost all other points in [0, 1] (the obvious exceptions are x = 0 and x = 1, but there may be others). Another way of saying this is that Figure 1.1 shows what the attractor of f_µ looks like for different values of the parameter µ.

Now let µ1 be the parameter value where the curve coming in from the left splits in two, µ2 where the two curves split into four, etc. (that is, the parameter values µi record where a bifurcation takes place). As before, form distances d1 = µ2 − µ1, d2 = µ3 − µ2, ..., then consider the ratios d1/d2, d2/d3, and so on. By now it should come as no surprise that the ratios converge to the Feigenbaum delta. Also, if the experiment is repeated for any other one-parameter family of maps with a quadratic critical point[3] (e.g. families of sine functions) then the same behavior will be observed. The exact values where the bifurcations take place will be different, but the ratios of distances will still converge to the Feigenbaum delta. This phenomenon is called universality.[4]

[3] That is, any family which near the critical point can be written h(x²), where h is a homeomorphism.
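The experiment just described can be reproduced in a few lines. The sketch below is an illustration only, not code from the thesis: instead of the bifurcation values µ_n it locates the superstable parameters s_n, where the critical point is periodic of period 2^n, since these are easier to pin down numerically; the ratios of successive gaps converge to the same δ. Newton's method and all seeding constants are my own choices.

```python
def orbit_and_derivative(mu, n):
    """Return f_mu^(2^n)(1/2) - 1/2 and its derivative with respect to mu,
    where f_mu(x) = mu*x*(1-x), computed by forward iteration."""
    x, dx = 0.5, 0.0
    for _ in range(2 ** n):
        # d/dmu of mu*x*(1-x), by the chain rule (uses x before it is updated)
        dx = x * (1.0 - x) + mu * (1.0 - 2.0 * x) * dx
        x = mu * x * (1.0 - x)
    return x - 0.5, dx

def superstable(seed, n, steps=40):
    """Newton's method for the parameter at which the critical point
    is periodic of period 2^n (a 'superstable' parameter)."""
    mu = seed
    for _ in range(steps):
        g, dg = orbit_and_derivative(mu, n)
        if abs(g) < 1e-14:
            break
        mu -= g / dg
    return mu

# s[0] = 2 (critical point fixed) and s[1] = 1 + sqrt(5) are known exactly;
# subsequent seeds extrapolate the previous gap shrunk by roughly delta.
s = [2.0, 1.0 + 5.0 ** 0.5]
for n in range(2, 6):
    s.append(superstable(s[-1] + (s[-1] - s[-2]) / 4.67, n))

delta_estimate = (s[4] - s[3]) / (s[5] - s[4])
print(delta_estimate)  # close to 4.669
```

With only a handful of doublings the ratio already agrees with δ to two or three decimal places; pushing further requires higher-precision arithmetic.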
[4] Actually, there is more to universality. What we describe here is known as universality in the parameter plane, but there is also universality in the phase space, or metric universality, which is described in Remark 6.2.4.

Figure 1.1: Bifurcation diagram for the logistic family x ↦ µx(1 − x). Given a value for the parameter µ on the horizontal axis, the vertical axis shows where the critical point ends up after many iterations. For µ < 3 it goes to one spot, after that it may be in one of two spots, then four, then eight, then ... chaos.

The above experiment was carried out by Coullet and Tresser (1978), and independently by Feigenbaum (1978, 1979). A crucial insight of Coullet and Tresser was that they anticipated that the observations made in this computer experiment would occur in the real world. Nowadays this might be taken for granted, but at the time it was new.

In order to explain universality the above authors introduced the period-doubling operator T, which is loosely defined as follows. Consider the space U of real-analytic unimodal[5] maps with a quadratic critical point. Take some f ∈ U. If there is an interval C around the critical point such that the restriction f²|C is affinely conjugate to some g ∈ U, then define Tf = g (one should also assume that C is maximal for T to be well-defined). Conjugation preserves periodic points, so a property of T is that f has a periodic point of period 2n if and only if Tf has a periodic point of period n. Hence, if Σ_n ⊂ U denotes the codimension-one[6] surface of maps undergoing a bifurcation from 2^{n−1}-periodic behavior to 2^n-periodic behavior (corresponding to a split in the bifurcation diagram), then T(Σ_{n+1}) ⊂ Σ_n, for n ≥ 1.

[5] Unimodal here means having exactly one turning point.
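The loose definition of T can be made concrete in the standard normalization where unimodal maps are defined on [−1, 1] with f(0) = 1: one takes λ = f(1) and Tf(x) = f(f(λx))/λ. The sketch below (my illustration, not code from the thesis) iterates this operator on the quadratic map at the accumulation of period doubling; the rescaling factors f(1) settle near the universal value −0.3995..., the (negative) reciprocal of the second Feigenbaum constant 2.5029....

```python
MU_INF = 1.401155189  # accumulation of period doubling for f(x) = 1 - mu*x^2

def doubling(f):
    """One application of the period-doubling operator in the
    normalization f(0) = 1 on [-1, 1]: Tf(x) = f(f(lam*x))/lam, lam = f(1)."""
    lam = f(1.0)
    return lambda x: f(f(lam * x)) / lam

f = lambda x: 1.0 - MU_INF * x * x
for _ in range(6):
    f = doubling(f)

print(f(0.0))  # stays at 1.0 by construction
print(f(1.0))  # rescaling factor, near -0.3995
```

Each application doubles the cost of evaluating the iterate, so this naive closure-based scheme is only usable for a few steps; Lanford's actual proof instead represents maps by rigorously controlled polynomial data.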
Now assume that:

(i) T has a hyperbolic fixed point f*;
(ii) the spectrum of the derivative of T at f* is discrete, with one expanding eigenvalue equal to δ and all other eigenvalues strictly contained in the complex unit disc; and
(iii) the surface Σ_n transversally intersects the unstable manifold of f* for n sufficiently large.

These assumptions, called the renormalization conjectures by the above authors, allowed them to explain universality as follows. Since T(Σ_{n+1}) ⊂ Σ_n, the λ-lemma implies that the surfaces converge toward the stable manifold of f* and that the rate of convergence is determined by δ (since the action of T on a local unstable manifold is simply multiplication by δ). A one-parameter family of unimodal maps is a curve in U, so we can record the parameter values a_n at which the curve crosses Σ_n. These values converge (for generic[7] families) and the rate of convergence, which is described by the ratios (a_{n+1} − a_n)/(a_n − a_{n−1}), asymptotically equals 1/δ. In particular, this only depends on properties of the fixed point and not on the family under consideration, thereby explaining universality.

Proving the renormalization conjectures turned out to be difficult, and when the first partial proof was announced by Lanford (1982, 1984) it necessitated the use of a computer in order to perform some of the calculations.[8] A novelty of Lanford's proof was the use of interval arithmetic in order to make rigorous computer estimates on an infinite-dimensional space. Note that Lanford did not prove the transversality conjecture; this was done by Eckmann and Wittwer (1987), also using the computer-assisted methods pioneered by Lanford. In Part II we recreate Lanford's methods to prove a similar result for maps with a discontinuity.

[6] The bifurcation surface Σ_n is characterized by a condition on the critical value. This is a one-dimensional condition, which explains why Σ_n has codimension one.

[7] By generic we essentially mean families which cross Σ_n transversally for all n ≥ N, where N is some large number.

[8] As Lanford (1982) puts it in a remark: "Although done by computer, the computations involved in proving the results stated are just on the boundary of what it is feasible to verify by hand. I estimate that a carefully chosen minimal set of estimates sufficient to prove Theorems 1 and 3 could be carried out, with the aid only of a nonprogrammable calculator, in a few days."

The period-doubling operator is defined in terms of the second iterate of maps; this is a restriction of the more general renormalization operator R. Briefly, f ∈ U is renormalizable if there exists an interval C around the critical point such that f^n|C is affinely conjugate to a map g ∈ U for some n > 1, and we define Rf = g (take C maximal for R to be well-defined). Also, there is no reason to insist that the critical point is quadratic; instead we should consider the space U_α of unimodal maps with critical exponent α > 1, that is, if f ∈ U_α then |f(x) − f(c)| = h(|x − c|^α) in a neighborhood of the critical point c, for some homeomorphism h.

Intuitively we think of renormalization as a microscope. If f is renormalizable, then the interesting dynamics of f takes place on the subset C (together with the forward orbit of C). Renormalization takes C and "zooms in" on this interval, giving us a new map Rf which describes the dynamics on C. For certain f (the so-called infinitely renormalizable maps) this zooming can be repeated countably many times on smaller and smaller intervals, and this typically means that the dynamics of f lives on a complicated fractal set. The geometry of such a fractal set is intimately linked to the topology of the map itself. Renormalization allows us to study the microscopic geometry of maps; it is particularly useful in connection with phase transitions between maps of different topological types.
The prototypical example is that of the period-doubling phase transition to chaos in the logistic family that we described above.

Let us now get back to the historical development of the theory of renormalization. Even though computer-assisted methods can be successfully employed in the special case of the period-doubling operator, they are not of much use when trying to say something about the renormalization operator. Other methods were needed in order to proceed. The first conceptual description of the renormalization operator was given by Sullivan (1992), where he introduced new complex analytic tools in order to study the limit set of the renormalization operator (note that the limit set contains the period-doubling fixed point). The idea is that the limit set is a hyperbolic Cantor set and that the renormalization operator acts as the full shift on the limit set (this is colloquially referred to as the renormalization horseshoe). This was proved over the course of several years by different authors, most notably Sullivan (1992), McMullen (1996), and Lyubich (1999).

A downside with the approach of the above authors is that it relies on complex analytic methods which only work for even critical exponents (we will have more to say about why this is dissatisfying in the next section). For this reason people have been looking for alternative proofs which do not rely on complex analysis, without much success so far. A notable exception is Martens (1998), who proves the existence of periodic points of the renormalization operator on unimodal maps for any critical exponent. However, that paper does not touch on the subject of hyperbolicity of the limit set of renormalization. In Part I we give a (partial) proof of the existence of a renormalization horseshoe for a class of maps with a discontinuity. Our approach does not use complex analysis and works for any critical exponent α > 1.
To our knowledge this is the first result about the structure of a limit set of renormalization which works for any critical exponent. Moreover, it is one of very few results on maps having both a discontinuity and a critical point that go beyond the purely topological aspects of the dynamics of such maps.

1.3 Lorenz flows

The previous section discussed the renormalization part of the title of this thesis. This section concerns the second part, that is, Lorenz maps, but first we need to talk about Lorenz flows. Historically speaking, a Lorenz flow is a flow associated with the following system of ordinary differential equations in three dimensions, called the Lorenz equations:

    ẋ = −σx + σy,
    ẏ = −xz + rx − y,        (1.1)
    ż = xy − bz.

This system describes a simplified model of convection in the atmosphere. It was investigated by Lorenz (1963) and has played an important role in the development of the subject of dynamical systems (see Viana, 2000, for an overview and recent results). For the parameter values σ = 10, b = 8/3, and r = 28 this system exhibits the well-known Lorenz attractor depicted in Figure 1.2.

Figure 1.2: The Lorenz attractor.

The Lorenz attractor is the prototypical example of a nonperiodic and chaotic attractor. Let us take some time to explain what we mean by this. By attractor we essentially mean a set with an open neighborhood such that any trajectory that passes through this neighborhood approaches the attractor; that is, it has a large basin of attraction. In order for a set to be an attractor it is also usually assumed that it cannot be decomposed into smaller parts. This is true for the Lorenz attractor because it has a dense orbit. By chaotic we mean that the system exhibits sensitive dependence on initial conditions; any two points will eventually separate under time evolution of the system, no matter how close they initially started out.
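The sensitive dependence just described is easy to observe numerically for the classical parameters. The sketch below (an illustration; the integrator, step size, and time horizon are my own choices) integrates (1.1) from two initial conditions a distance 10⁻⁹ apart and watches the separation grow to the size of the attractor.

```python
def lorenz(state, sigma=10.0, b=8.0 / 3.0, r=28.0):
    """Right-hand side of the Lorenz equations (1.1)."""
    x, y, z = state
    return (-sigma * x + sigma * y, -x * z + r * x - y, x * y - b * z)

def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = lorenz(state)
    k2 = lorenz([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = lorenz([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = lorenz([s + dt * k for s, k in zip(state, k3)])
    return [s + dt * (a + 2.0 * (c + d) + e) / 6.0
            for s, a, c, d, e in zip(state, k1, k2, k3, k4)]

def distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

p, q = [1.0, 1.0, 1.0], [1.0, 1.0, 1.0 + 1e-9]
sep_max = 0.0
for _ in range(6000):  # integrate to t = 30 with dt = 0.005
    p, q = rk4_step(p, 0.005), rk4_step(q, 0.005)
    sep_max = max(sep_max, distance(p, q))

print(sep_max)  # the 1e-9 gap has grown to the order of the attractor's size
```

The separation grows roughly exponentially until it saturates at the diameter of the attractor, which is exactly what sensitive dependence predicts.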
Sensitive dependence on initial conditions epitomizes chaotic behavior in dynamical systems. Finally, by nonperiodic we mean that the attractor is not simply a periodic orbit. Typically this means that the appearance of the attractor is very intricate, and one glance at Figure 1.2 should be enough to convince the reader that this certainly is the case for the Lorenz attractor. Note that an attractor can exhibit chaotic behavior (think of the doubling map on a circle) even though geometrically it looks very tame. See Milnor (1985) for a classical discussion of the concept of attractor.

A rigorous mathematical analysis of the system (1.1) for the parameters given above proved to be quite difficult. It was unknown for a very long time whether (1.1) really did exhibit a nonperiodic chaotic attractor or if computer-generated pictures like Figure 1.2 were in fact only showing solution curves approaching a very long stable periodic orbit. A proof that the Lorenz attractor is nonperiodic and chaotic was finally announced by Tucker (1998). The proof uses a computer to make rigorous estimates that would take too long to verify by hand, much like the first proof of the renormalization conjectures.

Long before the proof was announced it was already known that flows having the same geometric characteristics as the flow of (1.1) do exhibit nonperiodic chaotic attractors. Such geometric Lorenz flows were introduced in Guckenheimer (1976). These days the name Lorenz flow usually refers to a geometrically defined flow, the construction of which we will now discuss (a detailed description can be found in Guckenheimer and Williams, 1979).

Figure 1.3 illustrates the construction of a geometric Lorenz flow. Roughly speaking, it is defined by the associated vector field X having an equilibrium point of saddle type at the origin.
This saddle should have a two-dimensional stable manifold and a one-dimensional unstable manifold. The vector field X is chosen to be linear near the origin, but away from the origin it is nonlinear so that the flow returns in a controlled manner. Namely, there should be a two-dimensional domain S which intersects the stable manifold in a curve, and any flow trajectory which hits S outside this curve does so transversally and eventually returns to S. This means that the first-return map F to S is a well-defined map off the stable manifold. To simplify the analysis of the geometric flow further, it is assumed that there exists a foliation of S which is F-invariant and whose leaves are exponentially contracted by F. Hence the first-return map acts on the leaves of the foliation, and by taking a quotient over leaves we are left with an interval map which is undefined at the point corresponding to the stable manifold. Such a map is called a Lorenz map.

Figure 1.3: Illustration of a geometric Lorenz flow. The origin O is an equilibrium point of saddle type. The plane cutting the figure in two is part of the stable manifold W^s, and the curves emanating left and right from O are the unstable manifold W^u. The flow is linear on the middle piece, which looks like an inverted T, and nonlinear in the two hooks. The nonlinear part forces the flow to return to the domain S. This ensures that the first-return map to S is well-defined outside the line down the middle of S, which represents the intersection S ∩ W^s. Points on this line tend to the origin. One should also imagine that there is an invariant foliation of S, for example by lines parallel to the line which cuts S in half.

Figure 1.4 shows the graphs of two Lorenz maps. The discontinuity in this figure corresponds to the crossing of the domain S and the stable manifold in the associated flow.
This discontinuity appears because trajectories which hit S on the right side of the stable manifold will flow through the right hook, whereas trajectories to the left will flow through the left hook (see Figure 1.3). On both one-sided neighborhoods of the point of discontinuity a Lorenz map equals |x|^α near the origin, up to coordinate changes in the domain and range. The number α is called the critical exponent and is given by the absolute value of the ratio between the weak stable and unstable eigenvalues at the saddle point of the associated flow. In particular, α may assume any positive real value. An important consequence of this is that if we want to be able to understand physical systems (such as the Lorenz system) then we need to be able to analyze maps having any positive real critical exponent, since nature does not have a preferred value for α. We stress this point since it explains why the search for renormalization tools that work for any critical exponent is not purely academic.

Figure 1.4: A Lorenz map for two values of the critical exponent: α = 1/2 and α = 2.

Guckenheimer and Williams (1979) show that there is an open set of vector fields on R³ with the structure of a geometric Lorenz flow (that is, there are many such flows, intuitively speaking). They consider the case α < 1, with the additional assumption that the Lorenz map is expanding in the sense that the derivative is bounded from below by √2. In this situation they use symbolic dynamics on the Lorenz map to show that the associated flow supports a nonperiodic chaotic attractor. We will consider the case α > 1, which is significantly harder to analyze due to the presence of contraction around the point of discontinuity (think tent maps vs. unimodal maps). Instead of focusing on properties of the associated flow we will concentrate on the Lorenz map itself, with the understanding that results about Lorenz maps can be transferred to results about Lorenz flows.
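As a concrete check of the eigenvalue description of α given above, the linearization of (1.1) at the origin can be diagonalized by hand: the z-equation decouples with eigenvalue −b, and the (x, y) block [[−σ, σ], [r, −1]] has eigenvalues (−(σ+1) ± √((σ+1)² + 4σ(r−1)))/2. The sketch below carries this out for the classical parameters (my computation, for illustration only):

```python
sigma, b, r = 10.0, 8.0 / 3.0, 28.0

# Eigenvalues of the linearization of the Lorenz equations at the origin.
# The z-direction decouples; the (x, y) block is [[-sigma, sigma], [r, -1]].
disc = ((sigma + 1.0) ** 2 + 4.0 * sigma * (r - 1.0)) ** 0.5
lam_unstable = (-(sigma + 1.0) + disc) / 2.0       # roughly 11.83
lam_strong_stable = (-(sigma + 1.0) - disc) / 2.0  # roughly -22.83
lam_weak_stable = -b                               # roughly -2.67

# critical exponent of the associated Lorenz map:
alpha = abs(lam_weak_stable) / lam_unstable
print(alpha)  # about 0.23
```

For the classical parameters α < 1, which is the expanding case treated by Guckenheimer and Williams; the contracting case α > 1 studied in this thesis corresponds to saddles whose weak stable eigenvalue dominates the unstable one in absolute value.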
The presence of contraction leads to a much wider variety of dynamics than in the more traditional expanding case, which makes these Lorenz maps an interesting study in their own right. Perhaps more important is that we wish to further the understanding of one-dimensional dynamical systems, and in this sense the study of Lorenz maps can be seen as a natural next step now that unimodal dynamics is well understood. Of course, Lorenz maps are also important due to their connection with flows on R³. The presence of a discontinuity introduces significant new difficulties that do not appear for unimodal maps. We have been forced to invent new tools to study the renormalization of Lorenz maps, and our hope has been to perhaps use the insight gained from this to better understand unimodal renormalization for arbitrary critical exponents. Our idea of using decompositions (see the next section) to compute the derivative of the renormalization operator seems promising in this respect.

1.4 Statement of results

A Lorenz map f on [0, 1] \ {c} is a monotone increasing differentiable map which fixes 0 and 1, and which has a unique critical point at c ∈ (0, 1). Note that c is not in the domain of f, so when saying that c is a critical point of f what we mean is that Df(x) → 0 as x → c. Figure 1.5 illustrates the graph of a typical Lorenz map. The critical point has critical exponent α > 1, which will be fixed once and for all (noninteger α are permitted).[9] We will generally assume that f(c−) > c > f(c+), where

    f(c−) = lim_{x↑c} f(x)  and  f(c+) = lim_{x↓c} f(x)

denote the critical values of f. If f(c−) ≤ c or f(c+) ≥ c, then f is trivial; otherwise it is nontrivial.[10] Note that f maps its domain [0, 1] \ {c} onto [0, 1] and that the inverse of f has two branches, unless f is trivial.
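The definition can be made concrete with a hypothetical two-branch family (my construction for illustration, not a family used in the thesis): on [0, c) take f(x) = v(1 − (1 − x/c)^α) and on (c, 1] take f(x) = w + (1 − w)((x − c)/(1 − c))^α, with critical values v = f(c−) and w = f(c+). Both branches are monotone increasing, fix 0 and 1 respectively, and have derivative tending to 0 at c whenever α > 1.

```python
ALPHA, C, V, W = 2.0, 0.5, 0.8, 0.2  # illustrative choices with V > C > W, so f is nontrivial

def f(x):
    """A sketch of a Lorenz map on [0,1] \\ {c} with critical exponent ALPHA."""
    if x < C:
        return V * (1.0 - (1.0 - x / C) ** ALPHA)            # left branch: f(0) = 0, f(c-) = V
    return W + (1.0 - W) * ((x - C) / (1.0 - C)) ** ALPHA    # right branch: f(1) = 1, f(c+) = W

# defining properties, checked numerically
eps = 1e-6
print(f(0.0), f(1.0))          # fixes 0 and 1
print(f(C - eps), f(C + eps))  # critical values near V and W, with V > C > W
slope = (f(C - eps) - f(C - 2 * eps)) / eps
print(slope)                   # derivative tends to 0 as x -> c
```

Note the two one-sided critical values: the map jumps from near V down to near W as x crosses c, which is exactly the discontinuity inherited from the flow.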
For the sake of simplicity we will assume that f is C³ and has negative Schwarzian derivative, even though most of our arguments work, or can be made to work, for f ∈ C². Given an interval C ⊂ [0, 1] we define the first-return map f̃ to C for f by f̃(x) = f^{n(x)}(x), for x ∈ C \ {c}, where n(x) is the smallest positive integer such that f^{n(x)}(x) ∈ C.

A Lorenz map is renormalizable if there exists an interval C ⊂ (0, 1) containing c such that the first-return map to C is affinely conjugate to a nontrivial Lorenz map g on [0, 1] \ {c′} for some c′ ∈ (0, 1). The renormalization of f is then defined by Rf = g, where C is assumed to be maximal for R to be well-defined. Note that c′ ≠ c in general; this turns out to be one of the most difficult problems to handle for the renormalization of Lorenz maps.

[9] One could also consider Lorenz maps with different critical exponents on either side of the critical point. We choose not to pursue this generalization. Note that the first-return maps of geometric Lorenz flows have the same critical exponent on both sides of the critical point, so this generalization is somewhat unnatural.

[10] All points of a trivial map converge to a fixed point under iteration, which explains the name trivial.

Figure 1.5: The graph of a Lorenz map on [0, 1] \ {c}, with critical values f(c−) and f(c+).

The type of renormalization is given by the pair of words ω = (ω−, ω+), where

    ω−(i) = 0 if f^i(c−) < c, and ω−(i) = 1 if f^i(c−) > c, for i = 0, ..., a,
    ω+(j) = 0 if f^j(c+) < c, and ω+(j) = 1 if f^j(c+) > c, for j = 0, ..., b,

and where a and b are the smallest positive integers such that f^{a+1}(c−) ∈ C and f^{b+1}(c+) ∈ C, respectively. The type ω = (ω−, ω+) is called monotone if ω− = 01···1 and ω+ = 10···0. Note that the combinatorial description of Lorenz maps is simplified by the fact that Lorenz maps are increasing, so there is no need to introduce permutations as in the case of unimodal maps.
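The words ω− and ω+ are itineraries of the two one-sided critical values relative to c, prefixed by the side of c on which the orbit starts. Ignoring the stopping times a and b (which depend on the return interval C), the first few symbols can be computed as in this sketch; the map used in the demonstration is a hypothetical Lorenz map built for illustration, not one from the thesis.

```python
def word(f, first_symbol, v, c, n):
    """First n symbols of omega: the given side symbol for the one-sided
    limit at c, followed by the itinerary of the critical value v under f."""
    symbols, x = [first_symbol], v
    for _ in range(n - 1):
        symbols.append('0' if x < c else '1')
        x = f(x)
    return ''.join(symbols)

# a hypothetical nontrivial Lorenz map with critical exponent 2
c = 0.5
def f(x):
    if x < c:
        return 0.8 * (1.0 - (1.0 - x / c) ** 2)
    return 0.2 + 0.8 * ((x - c) / (1.0 - c)) ** 2

w_minus = word(f, '0', 0.8, c, 4)  # starts "01...": c- lies left of c, f(c-) > c
w_plus = word(f, '1', 0.2, c, 4)   # starts "10...": c+ lies right of c, f(c+) < c
print(w_minus, w_plus)
```

For any nontrivial Lorenz map the two words necessarily begin "01" and "10" respectively, which is why the monotone types 01···1 and 10···0 are the combinatorially simplest possibilities.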
If R^k f is renormalizable for k = 0, …, n − 1 then we say that f is n times renormalizable; if this holds for all n ≥ 1, then we say that f is infinitely renormalizable. In the latter case we define the combinatorial type of f to be the sequence ω̄ = (ω_0, ω_1, …) such that R^k f is ω_k–renormalizable for all k ≥ 0. (Here “ω–renormalizable” is just shorthand for “renormalizable of type ω.”) If the lengths of both words of ω_i are uniformly bounded (in i) then we say that f has bounded combinatorial type.

We will now fix a finite set Ω of monotone types satisfying the long return condition of Section 3.1. Roughly speaking, Ω satisfies this condition if

⌊α⌋ ≤ |ω^-| − 1 ≤ ⌊2α − 1⌋ and b^- ≤ |ω^+| − 1 ≤ b^+,

for all ω = (ω^-, ω^+) ∈ Ω. The constant b^- must be sufficiently large and b^+ depends on the choice of b^-. We emphasize that Ω essentially only depends on b^-, which needs to be chosen large (compared to 2α − 1), but any b^- works so long as it is not too small. It is also worth noting that b^+ can in general be chosen much larger than b^-, so the set Ω is not small.

The condition on Ω may seem artificial, and in all honesty it is, but it allows us to make some serious estimates. The idea is that increasing the return time of one branch while keeping the other constant forces the critical point (of the renormalization) to move into a corner, and in this way we get some control over where the critical point is. Also, by increasing the return time we push one critical value closer to a fixed point in the boundary, which causes the derivative along the orbit of this critical value to grow. The combination of control over the position of the critical point and large return derivatives is what makes all our estimates work.

In Section 3.1 it is shown that there exists a relatively compact set K such that if f ∈ K is twice renormalizable, then R f ∈ K.
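The rough shape of the long return condition quoted above can be checked mechanically for a candidate type ω = (ω^-, ω^+). In the sketch below the thresholds b_minus and b_plus are free parameters; the thesis requires b^- large compared to 2α − 1, with b^+ depending on b^-, so the numeric values used here are arbitrary choices:

```python
import math

def satisfies_long_return(omega, alpha, b_minus, b_plus):
    """Check the rough shape of the long return condition for a monotone
    type omega = (w_minus, w_plus):
        floor(alpha) <= |w_minus| - 1 <= floor(2*alpha - 1)   and
        b_minus      <= |w_plus|  - 1 <= b_plus."""
    w_minus, w_plus = omega
    a, b = len(w_minus) - 1, len(w_plus) - 1
    return (math.floor(alpha) <= a <= math.floor(2 * alpha - 1)
            and b_minus <= b <= b_plus)

# For alpha = 2 the condition forces 2 <= |w_minus| - 1 <= 3.
assert satisfies_long_return(("011", "1" + "0" * 50), 2.0, 25, 100)
assert not satisfies_long_return(("01111", "1" + "0" * 50), 2.0, 25, 100)
```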
Note that here and from now on when we say “renormalizable” we mean “renormalizable of type ω with ω ∈ Ω,” unless we specify otherwise. Because of this result we colloquially call K an “invariant set” even though it is not really invariant (since R f need not be twice renormalizable!). The proof is a series of estimates that rely on the conditions on Ω; in fact, the conditions on Ω were chosen so that these estimates work. From this result we are immediately able to prove existence of the so-called a priori bounds (see Section 3.2):

Theorem A (A priori bounds). If f ∈ K is infinitely renormalizable with combinatorial type in Ω^ℕ, then {R^n f}_{n≥0} is a relatively compact family.

The name a priori (real) bounds was coined by Sullivan (1992). In the unimodal case Sullivan proves these bounds by employing a shortest interval argument; in the Lorenz case, however, this argument breaks down (essentially due to the fact that the critical point is not fixed under renormalization), which is why we have to work fairly hard for a proof. The a priori bounds are a basis for understanding the structure of infinitely renormalizable maps. With them we can prove the following:

Theorem B (Infinitely renormalizable maps). If f ∈ K is infinitely renormalizable with combinatorial type in Ω^ℕ, then f is ergodic and has no wandering intervals. Furthermore, if Λ denotes the closure of the critical orbits of f, then:

1. Λ is a Cantor set with zero Lebesgue measure and Hausdorff dimension in (0, 1),
2. the basin of Λ has full Lebesgue measure, and
3. Λ is uniquely ergodic.

The proof of ergodicity and nonexistence of wandering intervals uses the concept of generalized renormalization introduced by Martens (1994). Specifically, we adapt the definition of the weak Markov property to Lorenz maps (see Section 2.2).
Note that Lorenz maps may have wandering intervals in general, for example if they renormalize to a trivial map (regardless of whether the critical point is flat or not). Finding necessary conditions for a Lorenz map not to have wandering intervals is still an open problem. Since Lorenz maps have two critical orbits it is possible to construct Lorenz maps with a Cantor attractor which supports two ergodic invariant probability measures. In Section 2.3 we adapt the techniques of Gambaudo and Martens (2006) to Lorenz maps and use this to construct such a map, as well as to show that bounded combinatorial type is sufficient but not necessary for the Cantor attractor to be uniquely ergodic.

Having described the structure of the infinitely renormalizable maps we turn to the more complex task of studying the renormalization operator itself. Ultimately we want to describe the limit set of the renormalization operator, but before we get there we address the question of existence of periodic points (note that periodic points are contained in the limit set). Martens (1998) proves the existence of periodic points for the renormalization operator on unimodal maps. This is done by studying the action of the renormalization operator on the boundary of the domain of renormalization and then using a mapping degree argument to show existence of finite-dimensional approximate periodic points in a purely topological way (the “bottom-down, top-up” lemma). The actual periodic points are then found as limits of the approximate periodic points.

Figure 1.6: Illustration of the action of a renormalization operator R (unimodal, Lorenz, …). The domain of R is the product of an n–dimensional ball B^n and an infinite-dimensional space X. The action of R is to wrap the domain around the outside of itself in the parameter directions. By Theorem A.1.1 there is a fixed point of R in the intersection of the two boxes.
This is just an intuitive picture; in general the domain might be more complicated (although in our situation it is essentially this simple). A natural generalization of the “bottom-down, top-up” lemma which works for any dimension of the parameter plane is given in Appendix A.1. The general idea is that renormalization acts on a space B^n × X, where the ball B^n ⊂ ℝ^n represents parameter space and X is some infinite-dimensional function space. The parameters in this situation are the critical values, so n = 1 for unimodal maps and n = 2 for Lorenz maps. The action of renormalization on sections B^n × {x} is to stretch in the general direction of B^n and wrap the boundary S^{n−1} × {x} around the outside of B^n × X in such a way that the degree of this map is nonzero. Also, there should be a bounded subset of X which is invariant under renormalization. See Figure 1.6 for an illustration. This is a rough description of how both the unimodal and Lorenz renormalization operators act, and it seems natural to think that other renormalization operators are similar. Theorem A.1.1 is directly suited to this situation.

We apply this fixed point theorem to find periodic points of the renormalization operator. The relatively compact “invariant” set K allows us to construct the periodic points directly, without having to take limits of approximate periodic points as in Martens (1998). This simplifies the argument considerably. In Section 3.3 we prove the following result:

Theorem C (Periodic points of renormalization). The renormalization operator has a periodic point for every periodic combinatorial type (ω_0, …, ω_{n−1})^∞ ∈ Ω^ℕ.

Note that we only prove existence and not uniqueness (even though presumably there is a unique periodic point for each combinatorial type). After this we turn to studying the limit set of the renormalization operator. Let A_Ω denote the maps with a complete past, and let B_Ω denote the maps with a complete future.
That is, f ∈ A_Ω if

f = R_{ω_{−1}} f_{−1},  f_{−1} = R_{ω_{−2}} f_{−2},  f_{−2} = R_{ω_{−3}} f_{−3},  …,

for some left-infinite sequence (…, f_{−2}, f_{−1}) of maps with ω_{−i} ∈ Ω, and f ∈ B_Ω if f is infinitely renormalizable with combinatorial type in Ω^ℕ. (Here R_ω is shorthand for the restriction of R to the set of ω–renormalizable maps.) Think of A_Ω as the “attractor” for R. The limit set of renormalization is the intersection Λ_Ω = A_Ω ∩ B_Ω. We should perhaps not say the limit set here since there are in fact many such sets (we can choose Ω in countably many different ways!), but we think of Ω as being fixed so the terminology should not cause any confusion.

Every f ∈ Λ_Ω has a bi-infinite sequence of words

(…, ω_{−2}, ω_{−1}, ω_0, ω_1, ω_2, …)

associated to it as explained above; explicitly, R^n f is ω_n–renormalizable for every n ∈ ℤ. Note that if f is associated with the sequence {ω_i}_{i∈ℤ}, then R f is associated with the sequence {ω_{i+1}}_{i∈ℤ}, so R shifts the sequence to the left. This indicates that R should be conjugate to a shift operator, and conjecturally R is conjugate to the full shift on symbols in Ω. Unfortunately, we cannot prove that each f ∈ Λ_Ω has a unique bi-infinite sequence associated with it, nor can we prove that each bi-infinite sequence has a unique f ∈ Λ_Ω associated with it, so we are unable to define the potential conjugacy. However, R|_{Λ_Ω} is at least semi-conjugate to the one-sided shift on symbols in Ω via the map which sends f ∈ B_Ω to its combinatorial type.

The above paragraph describes the topological structure of the limit set, but we are also able to say something about its differentiable structure. Conjecturally, the limit set is a horseshoe, that is, a hyperbolic set on which R is conjugate to the full shift. We only prove “half” of the hyperbolicity statement, namely that each point f ∈ Λ_Ω has a global unstable manifold W^u(f).
By global we essentially mean that the unstable manifold stretches across all domains of renormalization, that is,

W^u(f) ∩ L_ω̄ ≠ ∅ and ∂W^u(f) ∩ L_ω̄ = ∅, for all ω̄ ∈ ⋃_{n>0} Ω^n,

where L_ω̄ denotes the set of Lorenz maps which are renormalizable of type ω̄. (That is, if ω̄ = (ω_0, …, ω_{n−1}), then f ∈ L_ω̄ if R^i f is ω_i–renormalizable for i = 0, …, n − 1.)

Theorem D (Renormalization horseshoe). The renormalization operator on the limit set Λ_Ω is semi-conjugate to the one-sided shift on symbols in Ω. For every f ∈ Λ_Ω there exists a unique global two-dimensional unstable manifold W^u(f) which is C^1. The intersection of W^u(f) with the renormalizable maps of a given type in Ω is diffeomorphic to a square, the intersection with the infinitely renormalizable maps of a given combinatorial type in Ω^ℕ is a point, and the union of all such points is a Cantor set.

The proof of existence of unstable manifolds can be found in Section 5.4. It is based on results in Section 5.3, where we show that there exists a cone field which is invariant and expanded under the action of D R. In order to prove this we compute the derivative of R in Section 5.1. We are able to prove the uniqueness statement on the intersections of renormalizable maps with the unstable manifolds by carefully looking at the structure of the parameter plane of families of Lorenz maps (see Section 5.2). The parameters in this respect are essentially the critical values.

In order to compute the derivative of the renormalization operator we introduce the machinery of decompositions (Martens, 1998). Since this represents a certain amount of effort, let us discuss why we choose to take this route. First of all, the renormalization operator is not differentiable on a C^k–space (since it is essentially just a composition operator), so we need to make some restriction on the space in order to compute the derivative.
With decompositions we need not worry about this too much, as the renormalization operator on decompositions contracts exponentially to a subset where it is differentiable (see Proposition 4.2.8). The second, more fundamental problem is that there are “too many” directions in which to deform a general diffeomorphism, thereby making estimates on the derivative very difficult to handle. With decompositions there are “only” countably many directions to deform in and, more importantly, all deformations are monotone (in a sense that will be explained in Section 5.1), which makes the estimates manageable. Any results obtained for decompositions can be automatically transferred back to Lorenz maps by composing, as explained in Section 4.1. This concludes the results of Part I.

In Part II we make something of a historical detour. We consider the renormalization operator R_ω of the simplest possible nonunimodal type, namely ω = (01, 100). We then recreate Lanford’s computer-assisted proof of the existence of a fixed point of the period-doubling operator on unimodal maps and adapt it to R_ω. Note that the results of Part I do not cover this case, so there is no overlap.

Theorem E (Hyperbolic renormalization fixed point). Let ω = (01, 100). The restriction of R_ω acting on a space of real analytic Lorenz maps with quadratic critical point has a fixed point f⋆. The derivative D R_ω(f⋆) at f⋆ is compact and has no eigenvalues on the unit circle.

This theorem has two shortcomings: (i) it does not say anything about the number of unstable eigenvalues (i.e. the dimension of the unstable manifold), and (ii) there is no conclusion regarding the intersection of the unstable manifold with the bifurcation surfaces Σ_n as in the original renormalization conjectures.
The first item is a problem with our proof; the second is a shortcoming of Lanford’s method which can be corrected as in Eckmann and Wittwer (1987), but we chose not to do this as it makes the computer estimates more difficult. The proof basically amounts to turning R_ω into a contraction (without changing the set of fixed points) via a Newton iteration and then using a variant of the contraction mapping theorem. The verification that the modified operator is a contraction uses a computer to make rigorous estimates. We provide the source code used to make these estimates in Chapter 7.

(Exactly what we mean by nonunimodal is explained in Remark 6.1.4.)

Given a copy of this chapter it is possible to feed it into a compiler and get an executable that will perform the estimates. Of course, reading source code is always rather difficult, but we would like to stress that the amount of code is small enough to include in its entirety, complete with documentation.

Finally, in Section 6.2 we discuss consequences of the existence of a hyperbolic renormalization fixed point. Some of these topics have already been discussed in Part I, such as the existence of a Cantor attractor for infinitely renormalizable maps. Hyperbolicity allows us to go a little further and show that the conjugacy between the Cantor attractors of two infinitely renormalizable maps of combinatorial type (01, 100)^∞ extends to a differentiable map whose derivative is Hölder continuous. This important result is known as rigidity.

1.5 Previous results

Lorenz maps and geometric Lorenz flows were introduced by Guckenheimer (1976), but the first investigations of critical Lorenz maps seem to be by Arneodo, Coullet, and Tresser (1981) and Collet, Coullet, and Tresser (1985). There is a vast literature on expanding Lorenz maps, probably because these are the ones that arise naturally in the traditional Lorenz system, but not much seems to have been published on critical Lorenz maps.
Martens and de Melo (2001) contains many results and ideas used in this thesis. Their paper contains a proof of the full family theorem, a proof of density of hyperbolicity, a description of quasi-conjugacy classes, as well as a description of the archipelago structure of domains of renormalizability in the parameter plane. Another source of results for Lorenz maps with a critical point is the PhD thesis of St. Pierre (1999). It contains, among other things, a construction of Markov extensions of Lorenz maps, and an admissibility condition for kneading invariants.

1.6 Future work

The most important piece that is missing from this thesis is the proof of existence of a codimension-two stable manifold at each point in the limit set of renormalization. Some progress toward this result has been made but is not yet completed. Other than that, it would of course be desirable to prove the results in this thesis for any finite set Ω of renormalization types (without the restrictions imposed by monotone combinatorics and the long return condition). However, this will require new methods, as this work cannot handle the situation when both return times are comparable. It may however be possible to use the idea of looking at pure decomposed maps in order to compute the derivative of the renormalization operator for more general combinatorics.

It would be interesting to get a complete classification of which Lorenz maps satisfy the weak Markov property, as is done for unimodal maps in Martens (1994). The fact that Lorenz maps have two independent cycles of renormalization means that the shortest interval argument no longer works, which makes things a lot more difficult. I think that it should be relatively straightforward to use the methods in this thesis to give a description of the limit set of renormalization for unimodal maps with long return time. This has partly been done in Eckmann et al.
(1984), but their result only holds for critical exponent α = 2 and their method does not generalize to arbitrary α > 1.

Part I

Renormalization of maps with long return time

Chapter 2

Preliminaries

This chapter serves as an introduction to Lorenz maps, with adaptations of some well-known results for unimodal maps. In Section 2.1 we define the renormalization operator on Lorenz maps and introduce notation that will be used throughout the thesis. This is followed by a description of generalized renormalization for Lorenz maps in Section 2.2, which is used to derive ergodicity and non-existence of wandering intervals. Finally, Section 2.3 discusses invariant measures on Cantor attractors for Lorenz maps.

2.1 The renormalization operator

In this section we define the renormalization operator on Lorenz maps and introduce notation that will be used throughout.

Definition 2.1.1. The standard Lorenz family (u, v, c) ↦ Q is defined by

Q(x) = u · (1 − ((c − x)/c)^α),  if x ∈ [0, c),
Q(x) = 1 + v · (−1 + ((x − c)/(1 − c))^α),  if x ∈ (c, 1],   (2.1)

where u ∈ [0, 1], v ∈ [0, 1], c ∈ (0, 1), and α > 1. The parameter α is called the critical exponent and will be fixed once and for all.

Remark 2.1.2. The parameters (u, v, c) are chosen so that: (i) u is the length of the image of [0, c), (ii) v is the length of the image of (c, 1], (iii) c is the critical point (which is the same as the point of discontinuity). Note that u and 1 − v are the critical values of Q.

Figure 2.1: Illustration of the graph of a (01, 1000)–renormalizable Lorenz map.

Definition 2.1.3. A C^k–Lorenz map f on [0, 1] \ {c} is any map which can be written as

f(x) = φ ∘ Q(x),  if x ∈ [0, c),
f(x) = ψ ∘ Q(x),  if x ∈ (c, 1],   (2.2)

where φ, ψ ∈ D^k are orientation-preserving C^k–diffeomorphisms on [0, 1], called the diffeomorphic parts of f. See Figure 2.1 for an illustration of a Lorenz map.
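Formula (2.1) is straightforward to transcribe into code. A sketch, defaulting to α = 2 for convenience and leaving Q undefined at the critical point, as in the text:

```python
def Q(x, u, v, c, alpha=2.0):
    """Standard Lorenz family (2.1).  The left branch maps [0, c) onto [0, u)
    and the right branch maps (c, 1] onto (1 - v, 1]; Q is undefined at c."""
    if x < c:
        return u * (1.0 - ((c - x) / c) ** alpha)
    if x > c:
        return 1.0 + v * (-1.0 + ((x - c) / (1.0 - c)) ** alpha)
    raise ValueError("Q is undefined at the critical point c")
```

Then Q(0) = 0 and Q(1) = 1, and the one-sided limits at c are u and 1 − v, matching Remark 2.1.2.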
The set of C^k–Lorenz maps is denoted L^k; the subset L^S ⊂ L^3 denotes the Lorenz maps with negative Schwarzian derivative (see Appendix A.3 for more information on the Schwarzian derivative). A Lorenz map has two critical values, which we denote

c_1^- = lim_{x↑c} f(x) and c_1^+ = lim_{x↓c} f(x).

If c_1^+ < c < c_1^- then f is nontrivial; otherwise all points converge to some fixed point under iteration, and for this reason f is called trivial. Unless otherwise noted, we will always assume all maps to be nontrivial.

We make the identification

L^k = [0, 1]^2 × (0, 1) × D^k × D^k,

by sending (u, v, c, φ, ψ) to f defined by (2.2). Note that (u, v, c) defines Q in (2.2) according to (2.1). For k ≥ 2 this identification turns L^k into a subset of the Banach space ℝ^3 × D^k × D^k. Here D^k is endowed with the Banach space structure of C^{k−2} via the nonlinearity operator. In particular, this turns L^k into a metric space. For k < 2 we turn L^k into a metric space by using the usual C^k metric on D^k. See Appendix A.2 for more information on the Banach space D^k.

Remark 2.1.4. It may be worth emphasizing that for k ≥ 2 we are not using the linear structure induced from C^k on the diffeomorphisms D^k. Explicitly, if φ, ψ ∈ D^k and N denotes the nonlinearity operator, then

aφ + bψ = N^{−1}(aNφ + bNψ), for all a, b ∈ ℝ, and ‖φ‖_{D^k} = ‖Nφ‖_{C^{k−2}}.

We call this norm on D^k the C^{k−2}–nonlinearity norm. The nonlinearity operator N : D^k → C^{k−2} is a bijection and is defined by Nφ(x) = D log Dφ(x). See Appendix A.2 for more details on the nonlinearity operator.

We now define the renormalization operator for Lorenz maps.

Definition 2.1.5. A Lorenz map f is renormalizable if there exists an interval C ⊊ [0, 1] (properly containing c) such that the first-return map to C is affinely conjugate to a nontrivial Lorenz map. Choose C so that it is maximal with respect to these properties.
The first-return map affinely rescaled to [0, 1] is called the renormalization of f and is denoted R f. The operator R which sends f to its renormalization is called the renormalization operator.

Explicitly, if f is renormalizable then there exist minimal positive integers a and b such that the first-return map f̃ to C is given by

f̃(x) = f^{a+1}(x),  if x ∈ L,
f̃(x) = f^{b+1}(x),  if x ∈ R,

where L and R are the left and right components of C \ {c}, respectively. The renormalization of f is defined by

R f(x) = h^{−1} ∘ f̃ ∘ h(x),  x ∈ [0, 1] \ {h^{−1}(c)},

where h : [0, 1] → C is the affine orientation-preserving map taking [0, 1] to C. Note that C is chosen maximal so that R f is uniquely defined.

Remark 2.1.6. We would like to emphasize that the renormalization is assumed to be a nontrivial Lorenz map. It is possible to define the renormalization operator for maps whose renormalization is trivial, but we choose not to include these in our definition. Such maps can be thought of as degenerate, and including them makes some arguments more difficult, which is why we choose to exclude them.

Next, we wish to describe the combinatorial information encoded in a renormalizable map.

Definition 2.1.7. A branch of f^n is a maximal open interval B on which f^n is monotone (here maximality means that if A is an open interval which properly contains B, then f^n is not monotone on A). To each branch B of f^n we associate a word w(B) = σ_0 ⋯ σ_{n−1} on symbols {0, 1} by

σ_j = 0 if f^j(B) ⊂ (0, c),
σ_j = 1 if f^j(B) ⊂ (c, 1),

for j = 0, …, n − 1.

Definition 2.1.8. Assume f is renormalizable and let a, b, L and R be as in Definition 2.1.5. The forward orbits of L and R induce a pair of words ω = (w(L̂), w(R̂)) called the type of renormalization, where L̂ is the branch of f^{a+1} containing L and R̂ is the branch of f^{b+1} containing R. In this situation we say that f is ω–renormalizable.
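The return times a + 1 and b + 1 appearing above can be found numerically by brute force. A minimal sketch, using the expanding map f(x) = 2x mod 1 (c = 1/2, no critical point, so this only illustrates the return-time mechanics, not renormalization proper); first_return is a hypothetical helper and the affine rescaling by h is omitted:

```python
def first_return(f, C, x, max_iter=10_000):
    """Iterate f until the orbit of x (starting in C) re-enters the open
    interval C = (l, r); return the landing point and the return time."""
    l, r = C
    y = f(x)
    for n in range(1, max_iter + 1):
        if l < y < r:
            return y, n
        y = f(y)
    raise RuntimeError("orbit did not return to C")

# Doubling-type expanding Lorenz map with c = 1/2 (illustration only).
f = lambda x: 2 * x if x < 0.5 else 2 * x - 1
```

For example, starting from x = 0.41 with C = (0.4, 0.6), the orbit 0.82, 0.64, 0.28, 0.56 re-enters C after 4 steps.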
See Figure 2.2 for an illustration of these definitions.

Figure 2.2: Illustration of the dynamical intervals of a Lorenz map which is ω–renormalizable, with ω = (011, 100000), a = 2, b = 5.

Let ω̄ = (ω_0, ω_1, …). If R^n f is ω_n–renormalizable for n = 0, 1, …, then we say that f is infinitely renormalizable and that f has combinatorial type ω̄. If the length of both words of ω_k is uniformly bounded in k, then f is said to have bounded combinatorial type.

The set of ω–renormalizable Lorenz maps is denoted L_ω. We will use variations of this notation as well; for ω̄ = (ω_0, …, ω_{n−1}) we let L_ω̄ denote the set of Lorenz maps f such that R^i f is ω_i–renormalizable, for i = 0, …, n − 1, and similarly if n = ∞. Furthermore, if Ω is a set of types of renormalization, then L_Ω denotes the set of Lorenz maps which are ω–renormalizable for some ω ∈ Ω. We will almost exclusively restrict our attention to monotone combinatorics, that is, renormalizations of type

ω = (01⋯1, 10⋯0),

where the first word contains a ones and the second word contains b zeros.

In what follows we will need to know how the five-tuple representation of a Lorenz map changes under renormalization. It is not difficult to write down the formula for any type of renormalization, but it becomes a bit messy, so we restrict ourselves to monotone combinatorics. However, first we need to introduce the zoom operator.

Definition 2.1.9. The zoom operator Z takes a diffeomorphism and rescales it affinely to a diffeomorphism on [0, 1]. Explicitly, let g be a map and I an interval such that g|_I is an orientation-preserving diffeomorphism. Define

Z(g; I) = ζ_{g(I)}^{−1} ∘ g ∘ ζ_I,

where ζ_A : [0, 1] → A is the orientation-preserving affine map which takes [0, 1] onto A. See Appendix A.2 for more information on zoom operators.

Remark 2.1.10.
The terminology “zoom operator” is taken from Martens (1998), but our definition is somewhat simpler since we only deal with orientation-preserving diffeomorphisms. We will use the words ‘rescale’ and ‘zoom’ synonymously.

Lemma 2.1.11. If f = (u, v, c, φ, ψ) is renormalizable of monotone combinatorics, then R f = (u′, v′, c′, φ′, ψ′) is given by

u′ = |Q(L)|/|U|,  v′ = |Q(R)|/|V|,  c′ = |L|/|C|,
φ′ = Z(f_1^a ∘ φ; U),  ψ′ = Z(f_0^b ∘ ψ; V),

where U = φ^{−1} ∘ f_1^{−a}(C) and V = ψ^{−1} ∘ f_0^{−b}(C).

Proof. This follows from two properties of zoom operators: (i) the map q(x) = x^α on [0, 1] is ‘fixed’ under zooming on intervals adjacent to the critical point, that is, Z(q; (0, t)) = q for t ∈ (0, 1) (technically speaking we have not defined Z in this situation, but applying the formula for Z will give this result), and (ii) zoom operators satisfy Z(h ∘ g; I) = Z(h; g(I)) ∘ Z(g; I).

Notation. The notation introduced in this section will be used repeatedly throughout. Here is a quick summary. A Lorenz map is denoted either f or (u, v, c, φ, ψ), and these two notations are used interchangeably. Sometimes we write f_0 or f_1 to specify that we are talking about the left or right branch of f, respectively. Similarly, when talking about the inverse branches of f, we write f_0^{−1} and f_1^{−1}. The subscript notation is also used for the standard family Q (so Q_0 denotes the left branch, etc.). A Lorenz map has one critical point c and two critical values, which we denote c_1^- = lim_{x↑c} f(x) and c_1^+ = lim_{x↓c} f(x). The critical exponent is denoted α and is always assumed to be fixed to some α > 1. In general we use primes for variables associated with the renormalization of f. For example, (u′, v′, c′, φ′, ψ′) = R f. Sometimes we use parentheses instead of primes; for example, c_1^-(R f) denotes the left critical value of R f. In order to avoid confusion, we try to use D consistently to denote derivative instead of using primes.
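The zoom operator is concrete enough to sketch in code. Below, zeta and zoom are hypothetical helper names; the final assert checks property (i) from the proof of Lemma 2.1.11, that q(x) = x^α is ‘fixed’ under zooming on intervals adjacent to the critical point (here with α = 2):

```python
def zeta(A):
    """Orientation-preserving affine map taking [0, 1] onto the interval A = (a, b)."""
    a, b = A
    return lambda t: a + (b - a) * t

def zoom(g, I):
    """Zoom operator of Definition 2.1.9: Z(g; I) = zeta_{g(I)}^{-1} o g o zeta_I,
    assuming g restricted to I is an orientation-preserving diffeomorphism."""
    a, b = I
    ga, gb = g(a), g(b)  # g is increasing on I, so g(I) = (g(a), g(b))
    return lambda t: (g(zeta(I)(t)) - ga) / (gb - ga)

# Property (i) in the proof of Lemma 2.1.11: Z(q; (0, t)) = q for q(x) = x^2.
q = lambda x: x ** 2
assert abs(zoom(q, (0.0, 0.3))(0.5) - q(0.5)) < 1e-12
```

Note also that zooming an affine map yields the identity, which is why the rescalings in Lemma 2.1.11 only record the nonlinear parts.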
With a renormalizable f we associate a return interval C such that C \ {c} has two components, which we denote L and R. We use the notation a + 1 and b + 1 to denote the return times of the first-return map to C from L and R, respectively. The letters U and V are reserved to denote the pullbacks of C as in Lemma 2.1.11. We let U_1 = φ(U), U_{i+1} = f^i(U_1) for i = 1, …, a, and V_1 = ψ(V), V_{j+1} = f^j(V_1) for j = 1, …, b (note that U_{a+1} = C = V_{b+1}). We call {U_i} and {V_j} the cycles of renormalization.

2.2 Generalized renormalization

In this section we adapt the idea of generalized renormalization introduced by Martens (1994). The central concept is the weak Markov property, which is related to the distortion of the monotone branches of iterates of a map.

Definition 2.2.1. An interval C is called a nice interval of f if: (i) C is open, (ii) the critical point of f is contained in C, and (iii) the orbit of the boundary of C is disjoint from C.

Remark 2.2.2. A ‘nice interval’ is analogous to a ‘nice point’ for unimodal maps (see Martens, 1994). The difference is that for unimodal maps one point suffices to define an interval around the critical point (the ‘other’ boundary point is a preimage of the first), whereas for Lorenz maps the boundary points of a nice interval are independent. The term ‘nice’ is perhaps a bit vague, but its use has become established by now.

Definition 2.2.3. Fix f and a nice interval C. The transfer map to C induced by f,

T : ⋃_{n≥0} f^{−n}(C) → C,

is defined by T(x) = f^{τ(x)}(x), where

τ : ⋃_{n≥0} f^{−n}(C) → ℕ

is the transfer time to C; that is, τ(x) is the smallest nonnegative integer n such that f^n(x) ∈ C.

Remark 2.2.4. Note that: (i) the domain of T is open, since C is open by assumption, and f^{−1}(U) is open if U is open (even if U contains a critical
value), since the point of discontinuity of f is not in the domain of f, and (ii) T is defined on C and T|_C equals the identity map on C.

Proposition 2.2.5. Let T be the transfer map of f to a nice interval C. If I is a component of the domain of T, then τ|_I is constant and I is mapped monotonically onto C by f^{τ(I)}. Furthermore, I, f(I), …, f^{τ(I)}(I) are pairwise disjoint.

Remark 2.2.6. This means in particular that the components of the domain of T are the same as the branches of T. In what follows we will use the terminology “a branch of T” interchangeably with “a component of the domain of T”.

Proof. If I = C then the proposition is trivial since T|_C is the identity map on C, so assume that I ≠ C. Pick some x ∈ I and let n = τ(x). Note that n > 0 since I ≠ C. We claim that the branch B of f^n containing x is mapped over C. From this it immediately follows that τ|_I = n and f^n(I) = C. Since f^n|_B is monotone and f^n(x) ∈ C, it suffices to show that f^n(∂B) ∩ C = ∅. To this end, let y ∈ ∂B. Then there exists 0 ≤ i < n such that f^i(y) ∈ {0, c, 1}. If f^i(y) ∈ {0, 1} then we are done, since these points are fixed by f. So assume that f^i(y) = c and let J = (x, y). Then f^i(J) ∩ ∂C ≠ ∅, since f^i(x) ∉ C by minimality of τ(x). Consequently f^n(y) ∉ C, for otherwise f^n(J) ⊂ C, which would imply f^{n−i}(∂C) ∩ C ≠ ∅. But this is impossible since C is nice, and hence the claim follows.

From τ(I) = n it follows that I, …, f^n(I) are pairwise disjoint. Suppose not; then J = f^i(I) ∩ f^j(I) is nonempty for some 0 ≤ i < j ≤ n. But then the transfer time on I ∩ f^{−i}(J) is at most i + (n − j), which is strictly smaller than n, and this contradicts the fact that τ(I) = n.

Proposition 2.2.7. Assume that f has no periodic attractors and that S f < 0. Let T be the transfer map of f to a nice interval C.
Then the complement of the domain of T is a compact, f–invariant and hyperbolic set (and consequently it has zero Lebesgue measure).

Proof. Let U = dom T and let Γ = [0, 1] \ U. Since U is open, Γ is closed and hence compact (since it is obviously bounded). By definition f^{−1}(U) ⊂ U, which implies f(Γ) ⊂ Γ.

We can characterize Γ as the set of points x such that f^n(x) ∉ C for all n ≥ 0. Since S f < 0 it follows that f cannot have nonhyperbolic periodic points (Misiurewicz, 1981, Theorem 1.3), and by assumption f has no periodic attractors, so Γ must be hyperbolic (de Melo and van Strien, 1993, Lemma III.2.1). Finally, it is well known that a compact, invariant and hyperbolic set has zero Lebesgue measure if f is at least C^{1+Hölder} (de Melo and van Strien, 1993, Theorem III.2.6).

Definition 2.2.8. A map f is said to satisfy the weak Markov property if there exists a K < ∞ and a decreasing sequence C_1 ⊃ C_2 ⊃ ⋯ of nice intervals whose lengths tend to 0, such that the transfer map to C_n is defined almost everywhere and has distortion bounded by K, for every n.

Remark 2.2.9. That a “transfer map T has distortion bounded by K” is simply a convenient way of saying that T|_B has distortion bounded by K, for every branch B of T.

Theorem 2.2.10. If f satisfies the weak Markov property, then f has no wandering intervals.

Proof. In order to reach a contradiction, assume that there exists a wandering interval W which is not contained in a strictly larger wandering interval. Note that W must accumulate on at least one side of c. Otherwise there would exist an interval I disjoint from the orbit of W with c ∈ cl I. We could then modify f on I in such a way that the resulting map would be a bimodal C^2–map with nonflat critical points, and W would still be a wandering interval for the modified map; see Figure 2.3. However, such maps do not have wandering intervals (Martens et al., 1992).
Now let {C_k} be the sequence of nice intervals that we get from the weak Markov property. Since W accumulates on at least one side of c, there exists a sequence of nonnegative integers {n_k} such that f^{n_k}(W) ⊂ C_k. Let B_k be the branch of f^{n_k} containing W. The weak Markov property now shows that the distortion of f^{n_k}: W → C_k is uniformly bounded (in k), so there exists a δ > 0 (independent of k) such that f^{n_k}(B_k) contains a δ–scaled neighborhood of C_k. This Koebe space can be pulled back, and by applying the Macroscopic Koebe Lemma we see that B_k contains a δ′–scaled neighborhood of W, for every k (where δ′ only depends on δ).

The above argument shows that B = ⋂ B_k strictly contains W. Since c ∉ B_k for any k, by definition we also have that B is a wandering interval. Thus B is a wandering interval which strictly contains the wandering interval W, but this contradicts the maximality of W. Hence f cannot have wandering intervals.

¹ The theorems from de Melo and van Strien (1993) that are referenced in this proof are stated for maps whose domain is an interval, but their proofs go through, mutatis mutandis, for Lorenz maps.

Figure 2.3: Illustration showing why wandering intervals must accumulate on the critical point. If f has a wandering interval whose orbit does not intersect some (one-sided) neighborhood I of the critical point, then by modifying f on I according to the gray curve we create a bimodal map with a wandering interval. This is impossible since bimodal maps with nonflat critical points do not have wandering intervals.

Theorem 2.2.11. If f satisfies the weak Markov property, then f is ergodic.

Proof. In order to reach a contradiction, assume that there exist two invariant sets X and Y such that |X| > 0, |Y| > 0 and |X ∩ Y| = 0. Let {C_k} be the sequence of nice intervals that we get from the weak Markov property. We claim that

|X ∩ C_k|/|C_k| → 1 and |Y ∩ C_k|/|C_k| → 1, as k → ∞.
Thus we arrive at a contradiction, since this shows that |X ∩ Y| > 0.

Let Γ_k be the complement of the domain of the transfer map to C_k. By the weak Markov property |Γ_k| = 0, hence ⋃ Γ_k also has zero measure. This and the assumption that |X| > 0 imply that there exists a density point x which lies in X as well as in the domain of the transfer map to C_k, for every k. Let B_k be the branch of the transfer map to C_k containing x, and let τ_k be the transfer time for B_k.

We contend that |B_k| → 0. If not, there would exist a subsequence {k_i} such that B = ⋂ B_{k_i} had positive measure, and thus B would be contained in a wandering interval (which is impossible by Theorem 2.2.10). Here we have used that C_k is a nice interval, so the orbit of B_k satisfies the disjointness property of Proposition 2.2.5.

Since f^{τ_k}(B_k) = C_k, since f(X) ⊂ X, and since there exists a K < ∞ bounding the distortion of each transfer map, we get

|C_k \ X|/|C_k| ≤ |f^{τ_k}(B_k \ X)|/|f^{τ_k}(B_k)| ≤ K·|B_k \ X|/|B_k| → 0, as k → ∞.

The last step follows from x being a density point, since |B_k| → 0. Now apply the same argument to Y and the claim follows.

2.3 Invariant measures

Let f be an infinitely renormalizable map of any combinatorial type² and let Λ be the closure of the orbits of the critical values, and assume that this is a Cantor set (in Section 3.2 we prove that Λ is a Cantor set, although our proof is only valid for some combinatorial types). In this section we describe the invariant measures supported on Λ. The techniques employed are an adaptation of (Gambaudo and Martens, 2006), which also contains proofs for all the statements we choose not to provide proofs for here.

² Contrary to other sections, in this section we consider general combinatorial types and not just monotone types.

Theorem 2.3.1.
If f is infinitely renormalizable (of any combinatorial type) with a Cantor attractor Λ, then Λ supports one or two ergodic invariant probability measures. If the combinatorial type of f is bounded, then Λ is uniquely ergodic. Furthermore, it is possible to choose the combinatorial type so that Λ has two distinct ergodic invariant probability measures.

Remark 2.3.2. Bounded combinatorial type is sufficient for Λ to be uniquely ergodic, but it is not necessary, as Example 2.3.13 shows.

An infinitely renormalizable map naturally defines a sequence of finer and finer covers of Λ. We now describe the construction of these covers and how they in turn can be identified with certain directed graphs.

Definition 2.3.3. Since f is infinitely renormalizable we get a nested sequence of nice intervals {C_n}. Let a_n = τ(c_1^-) and b_n = τ(c_1^+) denote the transfer times of the critical values to the nice interval C_n. Let {U_n^i} and {V_n^i} be the pull-backs of C_n along the orbits of the critical values, that is

f^{i−1}(c_1^-) ∈ U_n^i and f^{a_n+1−i}(U_n^i) = C_n, i = 1, …, a_n + 1,
f^{i−1}(c_1^+) ∈ V_n^i and f^{b_n+1−i}(V_n^i) = C_n, i = 1, …, b_n + 1.

The intervals {U_n^i} and {V_n^j} cover the Cantor set Λ and they satisfy a disjointness property expressed by the following lemma. Intuitively, for a fixed n these sets are pairwise disjoint, except that if they overlap at some time, then all remaining intervals follow the same orbit.

Lemma 2.3.4. There exists k_n ≥ 0 such that U_n^{a_n+1−i} = V_n^{b_n+1−i} for i = 0, …, k_n, and

Λ′_n = {U_n^i}_{i=1}^{a_n − k_n} ∪ {V_n^i}_{i=1}^{b_n+1}

is a pairwise disjoint cover of Λ for all n.

Remark 2.3.5. Note that k_n = 0 if f has monotone combinatorial type.

Proof. Since C_n is nice it follows that if U_n^{a_n+1−k} ∩ V_n^{b_n+1−k} ≠ ∅ for some k, then U_n^{a_n+1−i} = V_n^{b_n+1−i} for all i = 0, …, k. Define k_n to be the largest such k (which exists since U_n^{a_n+1} = C_n = V_n^{b_n+1}).
By Proposition 2.2.5, {U_n^i}_i is a pairwise disjoint collection and so is {V_n^i}_i, which proves the disjointness property of Λ′_n. Finally, Λ′_n covers Λ since the critical values are contained in U_n^1 and V_n^1, both of which are eventually mapped inside C_n.

Definition 2.3.6. The n–th cover of Λ is defined by Λ_n = {I ∩ Λ | I ∈ Λ′_n}.

The covers come with natural projections π_{ij}: Λ_j → Λ_i, i ≤ j, defined by I = π_{ij}(J), where I ∈ Λ_i is the unique element which contains J ∈ Λ_j. These projections satisfy π_{ii} = id and π_{ij} = π_{ik} ∘ π_{kj} if i ≤ k ≤ j. Hence we get an inverse system ({Λ_i}_i, {π_{ij}}_{i≤j}). The inverse limit of this system can be identified with Λ via the natural projections p_n: Λ → Λ_n, defined by p_n(x) = I, where I ∈ Λ_n is the unique element containing x ∈ Λ. Explicitly, p: Λ → lim← Λ_n is defined by p(x) = (p_1(x), p_2(x), …).

Remark 2.3.7. Note that p is: (i) well-defined, since p_i = π_{ij} ∘ p_j, (ii) surjective, since if (x_i)_i ∈ lim← Λ_n then x = lim x_i exists and p(x) = (x_1, x_2, …), (iii) injective, since if x, y ∈ Λ are distinct then p_n(x) ≠ p_n(y) for some n, due to the fact that the diameter of elements in Λ_n tends to zero as n → ∞.

We think of Λ_n as the directed graph where each element in Λ_n is a node and where an edge connects I to J if and only if f(I) ∩ J ≠ ∅. Note that if there is an edge from I to J, then f(I) = J unless I = Z_n = C_n ∩ Λ. The image of Z_n is contained in the nodes E_n^1 = U_n^1 ∩ Λ and E_n^2 = V_n^1 ∩ Λ. For example, Λ_n could look like the directed graph with nodes U_n^2 ∩ Λ, U_n^3 ∩ Λ = V_n^5 ∩ Λ, V_n^4 ∩ Λ, E_n^1 = U_n^1 ∩ Λ, Z_n = C_n ∩ Λ, V_n^3 ∩ Λ, V_n^2 ∩ Λ and E_n^2 = V_n^1 ∩ Λ (diagram omitted).

We now describe how the invariant measures on Λ can be identified with a subset of an inverse limit of "almost invariant" measures on Λ_n. We define the "almost invariant" signed measures on Λ_n as follows.

Definition 2.3.8. Let Σ_n be the σ–algebra generated by Λ_n.
Note that Λ_n consists of exactly two loops, by which we mean a maximal collection {I_k ∈ Λ_n}_k such that f(I_k) = I_{k+1} or I_k = Z_n. These loops start in E_n^i and terminate in Z_n. Each loop defines a loop measure, ν_n^i: Σ_n → R, by ν_n^i(I) = 1 if and only if I is a node on the i–th loop of Λ_n. Let H_1(Λ_n) denote the R–module generated by {ν_n^1, ν_n^2} and let M(Λ) denote the R–module of signed invariant measures on Λ.

Remark 2.3.9. Note that the (signed) measures in H_1(Λ_n) are almost invariant, since ν_n^i(f^{−1}(I)) = ν_n^i(I) if I ∈ Λ_n \ {E_n^1, E_n^2}, but invariance fails since f^{−1}(E_n^i) ∉ Σ_n. The notation H_1(Λ_n) comes from the fact that H_1(Λ_n) is isomorphic to the first homology module of the graph Λ_n.

Now consider the push-forward (p_n)_* µ = µ ∘ p_n^{−1} of µ ∈ M(Λ) under the projection p_n:

Lemma 2.3.10. The push-forward of p_n is a homomorphism (p_n)_*: M(Λ) → H_1(Λ_n) and

(p_n)_*(µ) = µ(E_n^1) ν_n^1 + µ(E_n^2) ν_n^2.

The push-forward under the projections π_{ij} induces an inverse system ({H_1(Λ_i)}_i, {(π_{ij})_*}_{i≤j}). Because of the previous lemma and the identification of Λ with lim← Λ_n, the inverse limit lim← H_1(Λ_n) is isomorphic to M(Λ) via the isomorphism

p_*(µ) = ((p_1)_* µ, (p_2)_* µ, …).

Definition 2.3.11. Let I(Λ) ⊂ M(Λ) denote the subset of positive invariant measures.

Using the above one can check that I(Λ) is identified with lim← H_1^+(Λ_n), where H_1^+(Λ_n) = {x_1 ν_n^1 + x_2 ν_n^2 | x_i ≥ 0} are the (positive) almost invariant measures on Λ_n.

Lemma 2.3.12. Let W_n = (w_{ij}) be the winding matrix defined by

w_{ij} = #{ I ⊂ E_n^i | I is an element of the j–th loop of Λ_{n+1} }.

Then W_n is the representation of the push-forward (π_{n,n+1})_* if we use the loop measures {ν_k^1, ν_k^2} as bases of H_1(Λ_k), for k = n, n + 1.

Proof of Theorem 2.3.1. By the above, every invariant measure is represented by an inverse limit {(z_1, z_2, …) | z_i = W_i z_{i+1}, z_i ∈ K}, where K ⊂ R² is the cone {(x_1, x_2) | x_i ≥ 0}.
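The cone contraction that drives the remainder of this proof can be checked numerically. The following sketch is our own illustration, not part of the thesis: it implements the cross-ratio definition of the Hilbert metric introduced below, for points normalized to the slice x_1 + x_2 = 1 of K, verifies that a positive matrix contracts it, and multiplies winding matrices of the monotone form W_n = ( 1, b_{n+1} ; a_{n+1}, 1 ) (our reading of the matrix computed in Example 2.3.13) with slowly growing a_n = b_n = n + 2, so that the image cones collapse to a line.

```python
import math

def normalize(v):
    """Scale a point of the open quadrant onto the slice x1 + x2 = 1."""
    s = v[0] + v[1]
    return (v[0] / s, v[1] / s)

def hilbert_metric(z, zp):
    """Hilbert metric on K = {(x1, x2) : xi > 0} via the cross-ratio
    definition d(z,z') = log(1 + |z-z'||w-w'| / (|w-z||z'-w'|)), where
    w, w' are the boundary points of K on the line through z and z'.
    Assumes this line meets both boundary rays, which always holds for
    distinct points on the slice x1 + x2 = 1."""
    dx, dy = zp[0] - z[0], zp[1] - z[1]
    ts = sorted([-z[0] / dx, -z[1] / dy])        # hits of {x1=0} and {x2=0}
    w = (z[0] + ts[0] * dx, z[1] + ts[0] * dy)   # boundary point behind z
    wp = (z[0] + ts[1] * dx, z[1] + ts[1] * dy)  # boundary point beyond z'
    dist = lambda p, q: math.hypot(p[0] - q[0], p[1] - q[1])
    return math.log(1 + dist(z, zp) * dist(w, wp)
                    / (dist(w, z) * dist(zp, wp)))

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def cone_diameter(P):
    """Hilbert distance between the extremal rays (columns) of P*K."""
    return hilbert_metric(normalize((P[0][0], P[1][0])),
                          normalize((P[0][1], P[1][1])))

# Winding matrices of monotone type with a_n = b_n = n + 2, so that the
# contraction constants k_n = ((n+3) - 1)/((n+3) + 1) have product 0:
def winding(n):
    return [[1, n + 3], [n + 3, 1]]

diams = []
P = winding(1)
for n in range(2, 21):
    P = mat_mul(P, winding(n))
    diams.append(cone_diameter(P))
```

Here `diams` is strictly decreasing and tends to 0: the cones W_1 ⋯ W_m K flatten onto a single line, reflecting ∏ k_n = 0 for these slowly growing return times. The thesis shows that sufficiently fast growth keeps the limit set two-dimensional instead, which is the two-measure situation of Example 2.3.13.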
This suggests that we should look at the sets

I_n = ⋂_{m>n} W_n ⋯ W_{m−1} K,

since z_n ∈ I_n. The winding matrices have positive integer entries, so I_n is either one-dimensional or two-dimensional, for all n ≥ 1. If I_n has dimension two, then it is the convex hull of two points, and hence I_n ∩ {x_1 + x_2 = 1} has exactly two extremal points. This implies that there are two ergodic invariant probability measures. Since I_n cannot have dimension higher than two, this also shows that there can be no more than two ergodic invariant probability measures. In Example 2.3.13 we construct a map with exactly two ergodic invariant probability measures.

To see that Λ is uniquely ergodic in the bounded combinatorial type case we introduce the Hilbert metric (also known as the hyperbolic metric) on the interior of K. It is defined by

d(z, z′) = log( 1 + |z − z′||w − w′| / (|w − z||z′ − w′|) ),

where w and w′ are the points on ∂K closest to z and z′, respectively, on the line through z and z′. Note that d(z, z′) equals the log of one plus the cross-ratio of (w, z, z′, w′). The cross-ratio is well-defined since these four points are collinear. The Hilbert metric is contracted by positive matrices because: (i) linear maps preserve cross-ratio, (ii) WK ⊂ K if W is a positive matrix, and (iii) the cross-ratio of (w, z, z′, w′) decreases if w is moved further away from z on the line through these four points (and similarly for w′ and z′).

If f has bounded combinatorial type, then there is a bound on the contraction constant for the winding matrices W_n which is independent of n. This implies that I_n is one-dimensional and hence there is a unique ergodic probability measure.

Example 2.3.13. If f is of monotone combinatorial type {(ω_n^-, ω_n^+)}_{n≥0} with a_n = |ω_n^-| − 1 and b_n = |ω_n^+| − 1, then we can compute the winding matrix (rows separated by semicolons):

W_n = ( 1, b_{n+1} ; a_{n+1}, 1 ).
Thus

W_{n−1} W_n = ( 1 + a_{n+1} b_n, b_n + b_{n+1} ; a_n + a_{n+1}, 1 + a_n b_{n+1} )
            = a_n b_n ( s_n + (a_n b_n)^{−1}, a_n^{−1}(1 + r_n) ; b_n^{−1}(1 + s_n), r_n + (a_n b_n)^{−1} ),

where s_n = a_{n+1}/a_n and r_n = b_{n+1}/b_n. Assuming a_n, b_n → ∞, we see that the above matrix modulo the multiplicative term is asymptotically equal to

( s_n, 0 ; 0, r_n ).

If we let

I_n^m = W_n W_{n+1} ⋯ W_m K,

then I_n^m is two-dimensional for all m if we let a_n and b_n grow sufficiently fast. Thus the sets I_n in the proof of Theorem 2.3.1 will be two-dimensional, giving rise to two ergodic invariant probability measures.

It is possible to compute the contraction constant of the Hilbert metric in this example (see Bushell, 1973). It is given by

k_{n−1} = (√(a_n b_n) − 1) / (√(a_n b_n) + 1).

This constant is an exact bound on the contraction, meaning that there exist x, y such that d(W_n x, W_n y) = k_n d(x, y). By choosing a_n and b_n such that ∏ k_n = 0, we get that the sets I_n are lines, which means that Λ is uniquely ergodic. In particular, we could choose {a_n} and {b_n} unbounded but growing slowly enough for ∏ k_n = 0 to hold, hence showing that bounded combinatorial type is not necessary for Λ to be uniquely ergodic.

Chapter 3. Invariance

The central result of this chapter is the existence of an 'invariant' set K for the renormalization operator in Section 3.1. This result is exploited in Section 3.2 to prove a priori bounds, which in turn have consequences for the Cantor attractor of infinitely renormalizable maps. A careful investigation of the action of the renormalization operator on K in Section 3.3 is then used to exhibit periodic points of the renormalization operator.

3.1 The invariant set

In this section we construct an 'invariant' and relatively compact set for the renormalization operator. This construction works for types of renormalization where the return time of one branch is much longer than the other. This result will be exploited in the following sections.

Definition 3.1.1.
Let ε = 1 − c. This notation will be used from now on. Note that ε depends on f. Define

K = { f ∈ L¹ | ε^- ≤ ε ≤ ε^+, Dist φ ≤ δ, Dist ψ ≤ δ }

and

Ω = { (0 1⋯1, 1 0⋯0) | a^- ≤ a ≤ a^+, b^- ≤ b ≤ b^+ },

where the first word contains a ones and the second word contains b zeros. We are going to show how K and Ω can be chosen so that K is invariant¹ under the restriction of R to types in Ω. As always, assume that the critical exponent α > 1 has been fixed once and for all. The critical exponent will essentially determine a^- and a^+ (this is not entirely natural, as we would expect to be able to choose a^+ as large as we please, but some estimates will not work then). Finally we are left with a free parameter b^- which, when chosen large enough, will give us the invariance we are after.

¹ Exactly what we mean by 'invariant' is expressed in Theorem 3.1.5.

Remark 3.1.2. Ideally we would like to let Ω be any finite set of types of renormalization. However, our approach relies heavily on the fact that we can choose b^- large, since this allows us to do some analysis. Also, we restrict ourselves to monotone combinatorics (i.e. of type (01⋯1, 10⋯0)) since arbitrary types are more difficult to handle.

We will now show how to choose the constants defining K and Ω. Let

(3.1) ε^- = α^{-σb^-}, ε^+ = κ α^{-(b^-(1−σr^+) − a^+)/α},

where σ ∈ (0, 1), r^+ = a^+ + 1 − α − α^{-b^+}, and κ is a constant which is defined in (3.14). The parameter δ will be assumed to be small; δ = o(1/b^-) suffices. However, δ must not be smaller than the bounds for the distortion of Rf in (3.15). For example, we may pick δ = (1/b^-)². Assume that σ, a^- and a^+ have been chosen so that

(3.2) a^- > α − 1 + α^{-b^-}, a^+ ≤ 2α − 1, (α − r^-)/(α² − r^- r^+) < σ ≤ 1/(a^+ + 1 − α),

where r^- = a^- + 1 − α − α^{-b^-}. Finally, choose b^+ such that

(3.3) (r^-(1 − σr^+) + α²σ)/α · b^- − b^+ + a^- − (a^+ r^-)/α → ∞, as b^- → ∞.

Remark 3.1.3. Let us briefly discuss the choice of constants.
It is important to realize that they do not represent necessary conditions and as such are not optimal in any way. In order for ε^- < ε^+ to hold we need 1 − σr^+ > 0 and σ ≥ (1 − σr^+)/α, both of which follow from the bounds on σ in (3.2). The lower bound on a^- is used to control the distortion of the return maps in the proof of Theorem 3.1.5. The upper bound on a^+ ensures that the lower bound on σ in (3.2) is positive. The lower bound on σ is equivalent to the constant in front of b^- in (3.3) being strictly larger than 1. This shows that b^+ can be chosen so that b^+ − b^- → ∞ as b^- grows, which is important since we want Ω to be "as large as possible." Finally, the condition (3.3) is used to prove that ε(Rf) ≥ ε^-.

Example 3.1.4. One possible choice that will satisfy the above constraints is σ = 1/α, a^- = ⌊α⌋, a^+ = ⌊2α − 1⌋. (If α = n − λ for some n ∈ N and λ > 0 very close to 0, then b^- may need to be increased so that the lower bound on a^- holds.)

We now state the theorem which expresses exactly what kind of 'invariance' we have for K.

Theorem 3.1.5 (Invariance). If f ∈ L^S_Ω and c_1^+(Rf) ≤ 1/2 ≤ c_1^-(Rf), then

f ∈ K ⟹ Rf ∈ K,

for b^- large enough.

The condition on the critical values of the renormalization excludes maps which are degenerate in some sense. There is nothing magical about the number 1/2 here; all we want is for c_1^-(Rf) to be bounded away from 0 and c_1^+(Rf) to be bounded away from 1. An alternative (weaker) statement which we will also use is:

Corollary 3.1.6. If both f ∈ L^S_Ω and Rf ∈ L^S_Ω, then

f ∈ K ⟹ Rf ∈ K,

for b^- large enough.

Proof. Since f ∈ L^S_Ω ∩ K we can apply Lemma 3.1.9, which shows that c_1^-(Rf) → 1 and c_1^+(Rf) → 0 as b^- → ∞. Hence Theorem 3.1.5 applies.

Remark 3.1.7. The full family theorem (Martens and de Melo, 2001) implies that there exists f which satisfies the conditions of the corollary. This shows that both the corollary and the theorem are not vacuous.
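The constraint system (3.2)–(3.3) is easy to sanity-check numerically for the choices of Example 3.1.4. The snippet below encodes our reading of those inequalities (so it tests that reading as much as the constants); the function name and the sample parameter values are illustrative, not from the thesis.

```python
import math

def check_example_314(alpha, b_minus, b_plus):
    """Check Example 3.1.4's choices (sigma = 1/alpha, a^- = floor(alpha),
    a^+ = floor(2*alpha - 1)) against our reading of the inequalities (3.2),
    and compute the coefficient of b^- in (3.3), which should exceed 1."""
    sigma = 1.0 / alpha
    a_m = math.floor(alpha)             # a^-
    a_p = math.floor(2 * alpha - 1)     # a^+
    r_m = a_m + 1 - alpha - alpha ** (-b_minus)   # r^-
    r_p = a_p + 1 - alpha - alpha ** (-b_plus)    # r^+
    ok_32 = (a_m > alpha - 1 + alpha ** (-b_minus)
             and a_p <= 2 * alpha - 1
             and (alpha - r_m) / (alpha ** 2 - r_m * r_p)
                 < sigma <= 1.0 / (a_p + 1 - alpha))
    # coefficient of b^- in (3.3); the lower bound on sigma makes it > 1
    coeff_33 = (r_m * (1 - sigma * r_p) + alpha ** 2 * sigma) / alpha
    return ok_32, coeff_33
```

For instance, `check_example_314(1.7, 50, 200)` returns `(True, 1.0415…)`, so the coefficient of b^- in (3.3) indeed exceeds 1. Note that α = 2 is borderline for this check: there the lower bound on σ equals 1/α up to a term of size α^{−b^+}, which floating point cannot resolve.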
The main reason for introducing the set K is the following:

Proposition 3.1.8. K is relatively compact in L⁰.

Proof. Clearly [ε^-, ε^+] is compact in (0, 1). Hence we only need to show that the ball B = {φ ∈ D¹([0, 1]) | Dist φ ≤ δ} is relatively compact in D⁰([0, 1]). This is an application of the Arzelà–Ascoli theorem; if {φ_n ∈ B}, then |φ_n(y) − φ_n(x)| ≤ e^δ |y − x|, hence this sequence is equicontinuous (as well as uniformly bounded), so it has a uniformly convergent subsequence.

The rest of this section is devoted to the proof of Theorem 3.1.5. We will need the following expressions for the inverse branches of f:

(3.4) f_0^{−1}(x) = c − c·( |φ^{−1}([x, c_1^-])| / |φ^{−1}([0, c_1^-])| )^{1/α},
(3.5) f_1^{−1}(x) = c + (1 − c)·( 1 − |ψ^{−1}([x, 1])| / |ψ^{−1}([c_1^+, 1])| )^{1/α}.

Lemma 3.1.9. There exists K such that if f ∈ L¹_Ω ∩ K, then 1 − c_1^- < Kε². Also, c_1^+ → 0 exponentially in b^- as b^- → ∞.

Proof. For monotone combinatorics c_1^+ < f_0^{−b}(c) and c_1^- > f_1^{−a}(c), so the idea is to look for bounds on the backward orbits of c. We claim that

(3.6) f_1^{−1}(c) − c ≥ µε,
(3.7) (c − f_0^{−n}(c))/c ≥ (µε)^{α^{−n}} · (c/e^δ)^{1/(α−1)},

where µ ≥ 1 − Kε. Assume that the claim is true. Then (3.6) shows that

1 − c_1^- < 1 − f_1^{−1}(c) ≤ 1 − c − µε = ε(1 − µ) ≤ Kε²,

which proves the statement about c_1^-.

Next, let n = ⌈log_α b^-⌉. Then α^{−n} ≤ 1/b^- and (µε)^{α^{−n}} ≥ (µε)^{1/b^-}, so

f_0^{−n}(c)/c ≤ 1 − (µε^-)^{1/b^-} (c/e^δ)^{1/(α−1)} ≤ 1 − µ α^{−σ} ((1 − ε^+)/e^δ)^{1/(α−1)}.

Thus f_0^{−n}(c) is a uniform distance away from c. Since b^- − ⌈log_α b^-⌉ → ∞, and since 0 is an attracting fixed point for f_0^{−1} with a uniform bound on the multiplier, it follows that f_0^{−b}(c) approaches 0 exponentially as b^- → ∞. This proves the statement about c_1^+.

We now prove the claim. We first show that

(3.8) f_1^{−1}(c) − c ≥ ε·( 1 − e^δ ε / (c − f_0^{−b}(c)) )^{1/α},
(3.9) c − f_0^{−n}(c) ≥ (c^α/e^δ)^{1/(α−1)} · (f_1^{−1}(c) − c)^{α^{−n}}.
Equation (3.8) follows from a computation using (3.5) and the fact that 1 − c_1^+ > c − f_0^{−b}(c) holds for monotone combinatorics. To prove (3.9), first apply (3.4) to get

f_0^{−1}(x) ≤ c − c·e^{−δ/α} ((c_1^- − x)/c_1^-)^{1/α}.

This gives

(3.10) f_0^{−1}(c) ≤ c − c·e^{−δ/α} ((c_1^- − c)/c_1^-)^{1/α},

and if x < c then

f_0^{−1}(x) ≤ c − c·e^{−δ/α} (1 − x/c)^{1/α}.

An induction argument on the last inequality leads to

f_0^{−n}(x) ≤ c − c·e^{−δ(1+⋯+α^{−(n−1)})/α} · (1 − x/c)^{α^{−n}},

which together with (3.10), 1 + ⋯ + α^{−n} < α/(α − 1), and c_1^- > f_1^{−1}(c) proves (3.9).

Having proved (3.8) and (3.9) we now continue the proof of the claim. Note that the left-hand side of (3.8) appears in the right-hand side of (3.9) and vice versa. Thus we can iterate these inequalities once we have some bound for either of them. To this end, suppose f_1^{−1}(c) − c ≥ tε for some t > 0. If we plug this into (3.9) and then plug the resulting bound into (3.8), we will get a new bound on f_1^{−1}(c) − c. This defines a map

t ↦ h(t) = ( 1 − (e^δ/c)^{α/(α−1)} · ε^{1−α^{−b}} / t^{α^{−b}} )^{1/α}.

One can check that h is increasing and has two fixed points: one close to 0 which is repelling, and another close to 1 which is attracting. Explicitly, the fixed point equation t = h(t) gives

t^{α^{−b}} (1 − t^α) = ε^{1−α^{−b}} (e^δ/c)^{α/(α−1)}.

If t ↑ 1 then the solution is approximately

t_1 = ( 1 − (e^δ/c)^{α/(α−1)} ε^{1−α^{−b}} )^{1/α},

and if t ↓ 0 then the approximate solution is

t_0 = (e^δ/c)^{α^{b+1}/(α−1)} ε^{α^b − 1}.

Hence the proof is complete if we have some initial bound f_1^{−1}(c) − c ≥ t⁰ε such that t⁰ > t_0, because then h^n(t⁰) → t_1. To get an initial bound we use the fact that f_1^{−1}(c) − c > |R| and look for a bound on |R|. Since Rf is nontrivial we have f^{b+1}(R) ⊃ R, which implies

|R| ≤ |f^b(f(R))| ≤ max_{x<c} f′(x)^b · e^δ |Q(R)| ≤ (e^δ uα/c)^b e^δ v (|R|/ε)^α,

and thus

f_1^{−1}(c) − c > |R| ≥ ε·( cε^{1/b} / (αe^{δ(b+1)/b}) )^{b/(α−1)} = ε t⁰.
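The two-fixed-point picture for t ↦ h(t) can be observed directly. The sketch below is our illustration; the parameter values (α = 2, b = 10, δ = 0.01, with ε taken of the order α^{−b}) are ad hoc choices, not from the thesis. The repelling fixed point t_0 is of the order ε^{α^b} and thus far below floating-point resolution here, so any representable positive starting value already exceeds it and the iteration converges to the attracting fixed point near 1.

```python
import math

ALPHA, B, DELTA = 2.0, 10, 0.01   # illustrative parameters, not from the thesis
EPS = ALPHA ** (-B)               # mimic eps of the order alpha^{-b}
C = 1.0 - EPS                     # c = 1 - eps

def h(t):
    """The map t -> h(t) from the proof, for the sample parameters above."""
    A = (math.exp(DELTA) / C) ** (ALPHA / (ALPHA - 1))
    return (1 - A * EPS ** (1 - ALPHA ** (-B)) / t ** (ALPHA ** (-B))) ** (1 / ALPHA)

def t1_approx():
    """The approximate attracting fixed point t1 from the text."""
    A = (math.exp(DELTA) / C) ** (ALPHA / (ALPHA - 1))
    return (1 - A * EPS ** (1 - ALPHA ** (-B))) ** (1 / ALPHA)

t = 0.01                          # a crude initial bound t^0 (here t^0 >> t_0)
for _ in range(60):
    t = h(t)
```

Starting from t = 0.01 the iterates jump close to 1 in a single step and settle at the attracting fixed point, which agrees with `t1_approx()` (≈ 0.9995 for these parameters) to about nine decimals.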
Here t⁰ is of the order ε^{1/(α−1)} α^{−b}, whereas t_0 is of the order ε^{α^b}, so t⁰ > t_0 for b^- large enough (since ε ∼ α^{−b^-⋯}).

Lemma 3.1.10. There exists K such that if f ∈ L¹_Ω ∩ K then

K^{−1} ≤ Df_1^{−a}(x) · α^a ε^{−a} ≤ K, ∀x > f_0^{−1}(c),
K^{−1} ≤ Df_0^{−b}(x) · α^b ε^{1−α^{−b}} ≤ K, ∀x ≤ c.

Proof. The proof makes use of the following expressions for the derivatives of f^{−1}:

(3.11) Df_0^{−1}(x) = (c/α) · (Dφ^{−1}(x)/u) · ( |φ^{−1}([0, c_1^-])| / |φ^{−1}([x, c_1^-])| )^{1−1/α},
(3.12) Df_1^{−1}(x) = (ε/α) · (Dψ^{−1}(x)/v) · ( |ψ^{−1}([c_1^+, 1])| / |ψ^{−1}([c_1^+, x])| )^{1−1/α}.

We start by proving the lower bound on Df_1^{−a}. From (3.12) we get Df_1^{−1}(x) ≥ e^{−δ} ε/α and hence

Df_1^{−a}(x) ≥ e^{−aδ} (ε/α)^a, ∀x ∈ [c_1^+, 1].

Note that e^{−aδ} has a uniform bound since a < b^- and δb^- → 0 by assumption.

Next consider the upper bound on Df_1^{−a}. By assumption x > f_0^{−1}(c) ≥ c(1 − (e^δ ε)^{1/α}), where we have used (3.4). This together with (3.12) implies

Df_1^{−a}(x) ≤ (ε/α)^a (e^δ/v)^a e^{aδ} (f_0^{−1}(c) − c_1^+)^{−a} ≤ K(ε/α)^a.

It remains to show that K is uniformly bounded. Briefly, this follows from v^a ≥ (1 − e^δ c_1^+)^a ≥ 1 − ae^δ c_1^+ → 1 and from

(f_0^{−1}(c) − c_1^+)^{−a} ≤ [ (1 − ε)(1 − (e^δ ε)^{1/α})(1 − c_1^+/f_0^{−1}(c)) ]^{−a} ≤ [ (1 − aε)(1 − a(e^δ ε)^{1/α})(1 − ac_1^+/f_0^{−1}(c)) ]^{−1} → 1,

as b^- → ∞. (We have used Lemma 3.1.9 to get ac_1^+ → 0; aδ → 0 and aε → 0 follow from the choice of K and Ω.)

We now turn to proving the bounds on Df_0^{−b}. We claim that

(3.13) e^{−δ/(α−1)} ((c_1^- − x)/c_1^-)^{α^{−n}} ≤ (c_1^- − f_0^{−n}(x))/c_1^- ≤ ( e^{δ/α}(1 + O(ε^{1−1/α})) )^{α/(α−1)} ((c_1^- − x)/c_1^-)^{α^{−n}},

which together with (3.11) gives

( e^{−2δ} (c/c_1^-)(1 + O(ε^{1−1/α})) )^b ≤ Df_0^{−b}(x) · α^b · ∏_{i=0}^{b−1} ((c_1^- − x)/c_1^-)^{(α−1)/α^{i+1}} ≤ ( e^{2δ} c/c_1^- )^b.

The product can be rewritten as ((c_1^- − x)/c_1^-)^{1−α^{−b}}, which is proportional to ε^{1−α^{−b}} by Lemma 3.1.9, so all we need to do is to prove that the constants above are uniformly bounded.
This follows from an argument similar to the above and the fact that bε^t → 0 for every t > 0, and bδ → 0, as b^- → ∞ (by the assumptions made on K and Ω).

To finish the proof, let us prove the claim (3.13). The lower bound follows from (3.4) and the estimate

(c_1^- − f_0^{−1}(x))/c_1^- > 1 − f_0^{−1}(x)/c ≥ e^{−δ/α} ((c_1^- − x)/c_1^-)^{1/α}.

An induction argument finishes the proof of the lower bound. For the upper bound we assume x ≤ c and use (3.4) again to get

(c_1^- − f_0^{−1}(x))/c_1^- = (c/c_1^-) · ((c_1^- − f_0^{−1}(x))/(c − f_0^{−1}(x))) · ((c − f_0^{−1}(x))/c) ≤ µ e^{δ/α} ((c_1^- − x)/c_1^-)^{1/α},

where

µ = (c/c_1^-) · (c_1^- − f_0^{−1}(c))/(c − f_0^{−1}(c)) = (c_1^- − c)/c_1^- · c/(c − f_0^{−1}(c)) + c/c_1^- < 1 + ε e^δ ((c_1^- − c)/c_1^-)^{−1/α} ≤ 1 + O(ε^{1−1/α}).

Another induction argument finishes the proof of the upper bound.

Lemma 3.1.11. If f ∈ L¹_Ω ∩ K and c_1^+(Rf) ≤ 1/2 ≤ c_1^-(Rf), then

1/(2e^{2δ}) ≤ (v/u) · ( c|R| / (ε|L|) )^α · Df^b(y)/Df^a(x) ≤ 2e^{2δ},

for some x ∈ f(L) and y ∈ f(R).

Remark 3.1.12. This lemma can be stated in much greater generality (the proof remains unchanged). In particular, we do not use the bounds on ε, and it works for any type of renormalization.

Proof. From 0 ≤ c_1^+(Rf) ≤ 1/2 ≤ c_1^-(Rf) ≤ 1 we get

1/2 ≤ |f^{a+1}(L)|/|C| ≤ 1 and 1/2 ≤ |f^{b+1}(R)|/|C| ≤ 1.

The mean value theorem implies that there exist x ∈ f(L) and y ∈ f(R) such that Df^a(x)|f(L)| = |f^{a+1}(L)| and Df^b(y)|f(R)| = |f^{b+1}(R)|. From f(L) = φ ∘ Q_0(L) and f(R) = ψ ∘ Q_1(R) we get

e^{−δ} ≤ |f(L)| / (u·(|L|/c)^α) ≤ e^δ and e^{−δ} ≤ |f(R)| / (v·(|R|/ε)^α) ≤ e^δ.

Now combine these equations to finish the proof:

(v/u)·( c|R| / (ε|L|) )^α ≤ e^{2δ} |f(R)|/|f(L)| = e^{2δ} · (Df^a(x)/Df^b(y)) · |f^{b+1}(R)|/|f^{a+1}(L)| ≤ 2e^{2δ} · Df^a(x)/Df^b(y).

The lower bound follows from a similar argument.

Proof of Theorem 3.1.5.
The proof is divided into three steps: (1) show that the distortion of f^b|_{f(R)} is small; (2) show that ε^- ≤ ε(Rf) ≤ ε^+; (3) determine explicit bounds on the distortion for the renormalization.

Step 1. The map f_0^{−b}|_C extends continuously to [0, c_1^-], so in order to use the Koebe lemma we need to show that both components of [0, c_1^-] \ C are large compared to C. The relative length of the left component is large since |C| is at most of order ε^{1/α}, so we focus on the right component only. There are two cases to consider: either |L| ≥ |R|, or |R| > |L| (the latter will turn out not to hold, but we do not know that yet).

Assume |R| > |L|. For monotone combinatorics we have f(R) ⊂ (f_0^{−b−1}(c), f_0^{−b+1}(c)), thus

|f_0^{−b+1}(c) − f_0^{−b−1}(c)| ≥ |f(R)| ≥ e^{−δ} v (|R|/ε)^α,

and consequently

|R| ≤ ε·( (e^δ/v)·|f_0^{−b+1}(c) − f_0^{−b−1}(c)| )^{1/α} → 0, as b^- → ∞.

This shows that the relative length of the right component of [0, c_1^-] \ C tends to infinity, and hence the distortion of f^b|_{f(R)} tends to zero.

Next, assume that |L| ≥ |R|. Since f is renormalizable, f(L) ⊂ C and hence

2|L| ≥ |C| ≥ Df^a(x)|f(L)| ≥ Df^a(x) e^{−δ} u (|L|/c)^α,

for some x ∈ f(L). Now apply Lemma 3.1.10 to get that |L| ≤ Kε^{a/(α−1)}. By assumption (3.2), a > α − 1, so once again we get that the relative length of the right component tends to infinity as we increase b^-.

Step 2. We first show that ε(Rf) ≤ ε^+. Apply Lemma 3.1.11 to get

ε(Rf) = |R|/(|L| + |R|) < |R|/|L| ≤ (ε/c)·( (u/v)·(Df_0^{−b}(y)/Df_1^{−a}(x))·2e^{2δ} )^{1/α},

for some x ∈ f(L) and some y ∈ f(R). Now apply Lemma 3.1.10 and Step 1² to get

(3.14) ε(Rf) ≤ κ ( α^{−b+a} ε^{α−a−1+α^{−b}} )^{1/α}.

(This defines the constant κ of (3.1).) The exponent of ε is negative by assumption (3.2), so inserting ε ≥ ε^- gives us ε(Rf) ≤ κα^{−t}, where

t = (b^-(1 − σr^+) − a^+)/α.

This shows that ε(Rf) ≤ ε^+. We now show that ε(Rf) ≥ ε^-.
A similar argument to the above shows that |R|/|L| ≥ kα^{−t}, where

t = (b^+ − a^-)/α + (a^+ r^-)/α² − (r^-(1 − σr^+)/α²)·b^-.

Recall that ε^- = α^{−σb^-}, so we would like σb^- > t, which is equivalent to

(r^-(1 − σr^+) + α²σ)/α · b^- − b^+ + a^- − (a^+ r^-)/α > 0.

By assumption (3.3) the left-hand side tends to ∞ as b^- grows. Hence |R|/|L| ≥ kα^{σb^- − t} ε^-, so the right-hand side is greater than ε^- if b^- is sufficiently large. Consequently this is also true for ε(Rf), since

ε(Rf) = |R|/(|L| + |R|) = (|R|/|L|)·( 1 + |R|/|L| )^{−1}

and |R|/|L| is small. Thus ε(Rf) ≥ ε^- if b^- is sufficiently large.

² We need Step 1 to get a bound on Df_0^{−b}(y), since we do not know if y ≤ c and this is the only case Lemma 3.1.10 treats.

Step 3. We now use the Koebe lemma to get explicit bounds on the distortion for Rf. From Step 2 we know that |L| > |R|, and thus the arguments of Step 1 show that |C| is at most of the order ε^{a/(α−1)}. Hence Lemma 3.1.9 shows that the right component of (c_1^+, c_1^-) \ C has length of order ε and the left component has length of order 1. The inverses of the return maps f^a|_{f(L)} and f^b|_{f(R)} extend continuously (at least) to (c_1^+, c_1^-), so the Koebe lemma implies that the distortion of the return maps is of the order ε^t, where t = −1 + a/(α − 1) > 0. That is,

(3.15) Dist φ(Rf) ≤ Kε^t and Dist ψ(Rf) ≤ Kε^t.

Note that Kε^t ≪ δ if we e.g. choose δ = (1/b^-)². This concludes the proof of Theorem 3.1.5.

Many of the results used to prove Theorem 3.1.5 do not rely on the assumption that c_1^+(Rf) ≤ 1/2 ≤ c_1^-(Rf). Note that without this assumption we cannot say anything about the critical point of the renormalization, nor can we ensure that the distortion of the diffeomorphic parts shrinks under renormalization. In other words, we cannot prove invariance of K without this extra assumption. We collect these results and state them here as one proposition, as they will be needed later.

Proposition 3.1.13.
If f ∈ L^S_Ω ∩ K, then

1. 1 − c_1^- < Kε² for some K not depending on f,
2. c_1^+ → 0 exponentially in b^- as b^- → ∞,
3. Df_1^{−a}|_U ≍ (ε/α)^a,
4. Df_0^{−b}|_V ≍ α^{−b} ε^{−1+α^{−b}}.

Remark 3.1.14. We use the notation g(x) ≍ y to mean that there exists K < ∞ not depending on g such that K^{−1} y ≤ g(x) ≤ Ky for all x in the domain of g.

Proof. The first two items are proven in Lemma 3.1.9. The last two follow from Lemma 3.1.10 and Step 1 of the proof of Theorem 3.1.5.

3.2 A priori bounds

In this section we begin exploiting the existence of the relatively compact 'invariant' set of Theorem 3.1.5. An important consequence of this theorem is the existence of so-called a priori bounds (or real bounds) for infinitely renormalizable maps. We use the a priori bounds to analyze infinitely renormalizable maps and their attractors.

Theorem 3.2.1 (A priori bounds). If f ∈ L^S_ω̄ ∩ K is infinitely renormalizable with ω̄ ∈ Ω^N, then {R^n f}_{n≥0} is a relatively compact family (in L⁰).

Proof. This is a consequence of Corollary 3.1.6 and Proposition 3.1.8.

Theorem 3.2.2. If f ∈ L^S_ω̄ ∩ K is infinitely renormalizable with ω̄ ∈ Ω^N, then f satisfies the weak Markov property.

Before giving the proof we need the following lemma. Intuitively, it states that if f is renormalizable and I is a branch of f^n, then f^n(I) is large compared with the return interval C, in the sense that f^n(I) \ C contains intervals from both cycles of renormalization. (See the end of Section 2.1 for an explanation of the notation used.)

Lemma 3.2.3. Assume that f is renormalizable. Let C = L ∪ {c} ∪ R be the return interval, let a + 1 be the return time of L, and let b + 1 be the return time of R. If I is a branch of the transfer map to C, if J ⊃ I is a branch of f^n, and if J is disjoint from C, then there exist i ∈ {1, …, a} and j ∈ {1, …, b} such that f^i(L) is contained in the right component of f^n(J) \ C and f^j(R) is contained in the left component.

Proof.
Since J is a branch of f^n, either ∂⁻J = 0 or there exists 0 ≤ l < n such that ∂⁻f^l(J) = c. In the former case the left component of f^n(J) \ C contains f(R). Assume the latter case holds. By Proposition 2.2.5, f^l(J) ⊃ R, since f^l(I) must be disjoint from C. Hence the left component of f^n(J) \ C contains f^{n−l}(R). Note that n − l ≤ b + 1, since f^{b+1}(R) is mapped over the critical point (so f is not monotone on f^{b+1}(R)). Furthermore, n − l ≠ b + 1, since f^l(I) ∩ R = ∅ and thus f^{l+b+1}(I) ∩ C = ∅. Now repeat the same argument for the other boundary point of J.

Proof of Theorem 3.2.2. Since f is infinitely renormalizable there exists a sequence C_0 ⊃ C_1 ⊃ ⋯ of nice intervals whose lengths tend to zero (i.e. C_n is the range of the n–th first-return map, and this interval is nice since its boundary consists of periodic points). Let T_n denote the transfer map to C_n. We must show that T_n is defined almost everywhere and that it has uniformly bounded distortion.

By a theorem of Singer,³ f cannot have a periodic attractor, since it would attract at least one of the critical values. This does not happen for infinitely renormalizable maps, since the critical orbits have subsequences which converge on the critical point. Thus Proposition 2.2.7 shows that T_n is defined almost everywhere.

In order to prove that T_n has bounded distortion, pick any branch I of T_n with positive transfer time i = τ(I), and let J be the branch of f^i containing I. By Lemma 3.2.3 both components of f^i(J) \ C_n contain intervals from the forward orbit of C_n = cl(L_n ∪ R_n), say f^{l_n}(L_n) and f^{r_n}(R_n) (note that these do not depend on the choice of branch J). We contend that

(3.16) inf_n |f^{l_n}(L_n)|/|C_n| > 0 and inf_n |f^{r_n}(R_n)|/|C_n| > 0.

Suppose not, and consider the C⁰–closure of {R^n f}. The a priori bounds show that this set is compact, and hence there exists a subsequence {R^{n_k} f} which converges to some f_*.
But then f* is a renormalizable map whose cycles of renormalization contain an interval of zero diameter. This is impossible, hence (3.16) must hold. This shows that fⁱ(J) contains a δ-scaled neighborhood of Cₙ and that δ does not depend on n or J. The Koebe lemma now implies that Tₙ has bounded distortion and that the bound does not depend on n.

Theorem 3.2.4. Assume f ∈ LS_ω̄ ∩ K is infinitely renormalizable with ω̄ ∈ Ω^ℕ. Let Λ be the closure of the orbits of the critical values. Then:

• Λ is a Cantor set,
• Λ has Lebesgue measure zero,
• the Hausdorff dimension of Λ is strictly inside (0, 1),
• Λ is uniquely ergodic,
• the complement of the basin of attraction of Λ has zero Lebesgue measure.

³ Singer's theorem is stated for unimodal maps but the statement and proof can easily be adapted to Lorenz maps.

Proof. Let Lₙ and Rₙ denote the left and right half of the return interval of the n-th first-return map, let iₙ and jₙ be the return times for Lₙ and Rₙ, let Λ₀ = [0, 1], and let

Λₙ = ⋃_{i=0}^{iₙ−1} cl fⁱ(Lₙ) ∪ ⋃_{j=0}^{jₙ−1} cl fʲ(Rₙ), n = 1, 2, …

Components of Λₙ are called intervals of generation n, and components of Λₙ₋₁ \ Λₙ are called gaps of generation n (see Figure 3.1). Let I be an interval of generation n, let J ⊂ I be an interval of generation n + 1, and let G ⊂ I be a gap of generation n + 1. We claim that there exist constants 0 < µ < λ < 1 such that µ < |J|/|I| < λ and µ < |G|/|I| < λ, where µ and λ do not depend on I, J and G. To see this, take the L⁰-closure of {Rⁿf}. This set is compact in L⁰, so the infimum and supremum of |J|/|I| over all I and J as above are bounded away from 0 and 1 (otherwise there would exist an infinitely renormalizable map in L⁰ with I and J as above such that |J| = 0 or |I| = |J|). The same argument holds for I and G. Since {Rⁿf} is a subset of the closure, the claim follows.

Next we claim that Λ = ⋂ Λₙ.
Clearly Λ ⊂ ⋂ Λₙ (since the critical values are contained in the closure of f(Lₙ) ∪ f(Rₙ) for each n). From the previous claim |Λₙ| < λ|Λₙ₋₁|, so the lengths of the intervals of generation n tend to 0 as n → ∞. Hence Λ = ⋂ Λₙ. It now follows from standard arguments that Λ is a Cantor set of zero measure with Hausdorff dimension in (0, 1). That Λ is uniquely ergodic follows from Theorem 2.3.1, since f has bounded combinatorics due to the fact that Ω is finite. It only remains to prove that almost all points are attracted to Λ. Let Tₙ denote the transfer map to the n-th return interval Cₙ. By Proposition 2.2.7 the domain of Tₙ has full measure for every n and hence almost every point visits every Cₙ. This finishes the proof.

Figure 3.1: Illustration of the intervals of generations 0, 1 and 2 for a (01, 100)-renormalizable map. Here Lₙⁱ = fⁱ(Lₙ) and Rₙⁱ = fⁱ(Rₙ). The intersection of all levels n = 0, 1, 2, … is a Cantor set, see Theorem 3.2.4.

3.3 Periodic points of the renormalization operator

In this section we prove the existence of periodic points of the renormalization operator. The argument is topological and does not imply uniqueness, even though we believe the periodic points to be unique within each combinatorial class.⁴ The notation used here is the same as in Section 3.1; in particular the sets Ω and K are the same as in that section. The constants ε₋, ε₊ and δ appear in the definition of K.

Theorem 3.3.1. For every periodic combinatorial type ω̄ ∈ Ω^ℕ there exists a periodic point of R in L_ω̄.

Remark 3.3.2. We are not saying anything about the periods of the periodic points. For example, we are not asserting that there exists a period-two point of type (ω, ω)^∞ for some ω ∈ Ω — all we say is that there is a fixed point of type (ω)^∞.
The point here is that (ω, ω)^∞ is just another way to write (ω)^∞, so these two types are the same.

To begin with we will consider the restriction R_ω of R to some ω ∈ Ω and show that R_ω has a fixed point. Let Y = LS_ω ∩ K, and

Y′ = { f ∈ Y | c₁₊(Rf) ≤ 1/2 ≤ c₁₋(Rf) }.

The proof is based on a careful investigation of the boundary of Y and the action of R on this boundary. However, we need to introduce the set Y′ because we do not have enough information on the renormalization of maps in Y, see Theorem 3.1.5.

⁴ The conjecture is that the restriction of R to the set of infinitely renormalizable maps should contract maps of the same combinatorial type, and this would imply uniqueness.

Definition 3.3.3. A branch B of fⁿ is full if fⁿ maps B onto the domain of f; B is trivial if fⁿ fixes both endpoints of B.

Proposition 3.3.4. The boundary of Y consists of three parts, namely f ∈ ∂Y if and only if at least one of the following conditions holds:

(Y1) the left or right branch of Rf is full or trivial,
(Y2) 1 − c(f) = ε(f) ∈ {ε₋, ε₊},
(Y3) Dist φ(f) = δ or Dist ψ(f) = δ.

Also, each condition occurs somewhere on ∂Y.

Before giving the proof we need to introduce some new concepts and recall some established facts about families of Lorenz maps.

Definition 3.3.5. A slice (in the parameter plane) is any set of the form S = [0, 1]² × {c} × {φ} × {ψ}, where c, φ and ψ are fixed. We will permit ourselves to be a bit sloppy with notation and write (u, v) ∈ S when it is clear which slice we are talking about (or if it is irrelevant).

A slice S = [0, 1]² × {c} × {φ} × {ψ} induces a family of Lorenz maps S ∋ (u, v) ↦ f_{u,v} = (u, v, c, φ, ψ) ∈ L. Any family induced from a slice is full, by which we mean that it realizes all possible combinatorics. See (Martens and de Melo, 2001) for a precise definition and a proof of this statement. For our discussion the only important fact is the following:

Proposition 3.3.6.
Let (u, v) ↦ f_{u,v} be a family induced by a slice. Then this family intersects L_ω̄ for every ω̄ such that L_ω̄ ≠ ∅. Note that ω̄ can be finite or infinite.

Proof. This follows from (Martens and de Melo, 2001, Theorem A).

Recall that C = cl L ∪ R is the return interval for a renormalizable map, and the return times for L and R are a + 1 and b + 1, respectively (see the end of Section 2.1).

Lemma 3.3.7. Assume that f is renormalizable. Let (l, c) be the branch of f^{a+1} containing L and let (c, r) be the branch of f^{b+1} containing R. Then

f^{a+1}(l) ≤ l and f^{b+1}(r) ≥ r.

Proof. This is a special case of (Martens and de Melo, 2001, Lemma 4.1).

Proof of Proposition 3.3.4. Let us first consider the boundary of L⁰_ω. If either branch of Rf is full or trivial, then we can perturb f in C⁰ so that it no longer is renormalizable. Hence (Y1) holds on ∂L⁰_ω. If f ∈ L⁰_ω does not satisfy (Y1), then any sufficiently small C⁰-perturbation of f will still be renormalizable by Lemma 3.3.7. Hence the boundary of renormalization is exactly characterized by (Y1). Conditions (Y2) and (Y3) are part of the boundary of K. These boundaries intersect LS_ω by Proposition 3.3.6 and hence these conditions are also boundary conditions for Y.

Fix 1 − c₀ = ε₀ ∈ (ε₋, ε₊) and let S = [0, 1]² × {c₀} × {id} × {id}. Let ρₜ be the deformation retraction onto S defined by

ρₜ(u, v, c, φ, ψ) = (u, v, c + t(c₀ − c), (1 − t)φ, (1 − t)ψ), t ∈ [0, 1].

In order to make sense of this formula it is important to note that the linear structure on the diffeomorphisms is the one induced from C⁰ via the nonlinearity operator N (see Remark 2.1.4). Hence, for example, tφ is by definition the diffeomorphism N⁻¹(tNφ). Let Rₜ = ρₜ ∘ R. The choice of slice is somewhat arbitrary in what follows, except that we will have to be a little bit careful when choosing c₀, as will be pointed out in the proof of the next lemma.
However, it is important to note that the slice intersects Y.

Lemma 3.3.8. It is possible to choose c₀ so that Rₜ has a fixed point on ∂Y′ for some t ∈ [0, 1] if and only if R has a fixed point on ∂Y′.

Remark 3.3.9. The condition c₁₊(Rf) ≤ 1/2 ≤ c₁₋(Rf) roughly states that u(Rf) ≥ 1/2 and v(Rf) ≥ 1/2. Thus Y′ has another boundary condition given by equality in either of these two inequalities. Instead of treating these as separate boundary conditions we subsume them into (Y1) by saying that the left branch is trivial also if c₁₋(Rf) = 1/2, and similarly for the right branch.

Figure 3.2: Illustration of the action of ρ₁ ∘ R|_S. The shaded area corresponds to a full island. The boxes show what the branches of ρ₁ ∘ Rf look like on each boundary piece.

Proof. The 'if' statement is obvious since R = R₀, so assume that R has no fixed point on ∂Y′. Let f ∈ ∂Y′ and assume that Rₜf = f for some t > 0. We will show that this is impossible. To start off, choose ε₀ ∈ (ε₋, ε₊) and let c₀ = 1 − ε₀ as usual (we will be more specific about the choice of ε₀ later). Note that (Y2) cannot hold for Rₜf: ε(Rf) ∈ [ε₋, ε₊] by Theorem 3.1.5 and ε₀ ∈ (ε₋, ε₊), hence ε(Rₜf) lies strictly inside (ε₋, ε₊), since t > 0. Similarly, (Y3) cannot hold for Rₜf, since the distortions of the diffeomorphic parts of Rf are not greater than δ (by Theorem 3.1.5) and hence the distortions of the diffeomorphic parts of Rₜf are strictly smaller than δ (since t > 0).⁵ The only possibility is that f = Rₜf belongs to the boundary part described by condition (Y1). If either branch of Rf is full then the corresponding branch of Rₜf is full as well, which shows that f cannot be fixed by Rₜ, since a renormalizable map cannot have a full branch. Thus one of the branches of Rf must be trivial.

⁵ This follows from Dist (1 − t)φ < Dist φ if t > 0 and Dist φ > 0.
Assume that the left branch of Rf is trivial (see Remark 3.3.9). Then c₁₋(Rf) = c(Rf), since c(Rf) > 1/2. In particular, Rf is not renormalizable, since c₁₋ for a renormalizable map is away from the critical point by Lemma 3.1.9. Because of this lemma we can ensure that Rₛf is not renormalizable for all s ∈ [0, 1] by choosing ε₀ close to ε₋. In particular, Rₜf is not renormalizable and hence cannot equal f.

Assume that the right branch of Rf is trivial (see Remark 3.3.9). Then c₁₊(Rf) = 1/2, since c(Rf) > 1/2. In particular, Rf is not renormalizable, since that requires c₁₊(Rf) to be close to 0 by Lemma 3.1.9. The same holds for Rₛf for all s ∈ [0, 1], since 1/2 > ε₊. In particular, f cannot be fixed by Rₜ, since f is renormalizable.

We have shown that f ∉ ∂Y′, which is a contradiction, and hence we conclude that Rₜf ≠ f for all t ∈ [0, 1].

The slice S intersects the set L_ω of renormalizable maps of type ω by Proposition 3.3.6. This intersection can in general be a complicated set, but there will always be at least one connected component I of the interior such that the restricted family I ∋ (u, v) ↦ f_{u,v} is full (see Martens and de Melo, 2001, Theorem B). Such a set I is called a full island. The action of R on a full island is illustrated in Figure 3.2. Note that the action of R on the boundary of I is given by (Y1), which also explains this figure.

Lemma 3.3.10. Any extension of R₁|_{∂Y′} to Y′ has a fixed point.

Proof. If R₁ has a fixed point on ∂Y′ then there is nothing to prove, so assume that this is not the case. Let S = [0, 1]² × {c₀} × {id} × {id}. By the above discussion there is a full island I ⊂ S. Note that ∂I ⊂ ∂Y′. Pick any R : I → S such that R|_{∂I} = R₁|_{∂I}. Now define the displacement map δ : ∂I → S¹ by

δ(x) = (x − R(x)) / |x − R(x)|.

This map is well-defined, since R₁ was assumed not to have any fixed points on ∂Y′ and ∂I ⊂ ∂Y′.
The degree of δ is nonzero since I is a full island. This implies that R has a fixed point in I; otherwise δ would extend to all of I, which would imply that the degree of δ was zero. This finishes the proof, since R was an arbitrary extension of R₁|_{∂I} and ∂I ⊂ ∂Y′.

Proposition 3.3.11. R_ω has a fixed point.

Proof. By the previous two lemmas either R_ω has a fixed point on ∂Y′ or we can apply Theorem A.1.1. In both cases R_ω has a fixed point.

Proof of Theorem 3.3.1. Pick any sequence (ω₀, …, ω_{n−1}) with ωᵢ ∈ Ω. The proof of the previous proposition can be repeated with

R′ = R_{ω_{n−1}} ∘ ⋯ ∘ R_{ω₀}

in place of R to see that R′ has a fixed point f*. But then f* is a periodic point of R and its combinatorial type is (ω₀, …, ω_{n−1})^∞.

Chapter 4. Decompositions

This chapter introduces decompositions in Section 4.1. Decompositions can be thought of as generalizations of diffeomorphisms, or perhaps more fundamentally as the internal structure of the diffeomorphisms that appear naturally in the study of composition operators. In Section 4.2 the renormalization operator is lifted to decomposed Lorenz maps and it is shown that renormalization contracts to the subset of pure decomposed Lorenz maps.

4.1 Decompositions

In this section we introduce the notion of a decomposition. We show how to lift operators from diffeomorphisms to decompositions and also how decompositions can be composed in order to recover a diffeomorphism. This section is an adaptation of techniques introduced in Martens (1998).

Definition 4.1.1. A decomposition φ̄ : T → D²([0, 1]) is an ordered sequence of diffeomorphisms labelled by a totally ordered and at most countable set T. Any such set T will be called a time set. The space D is defined in Appendix A.2. The space of decompositions D̄_T over T is the direct product

D̄_T = ∏_T D²([0, 1])

together with the ℓ¹-norm ‖φ̄‖ = ∑_{τ∈T} ‖φ_τ‖.

The notation here is φ_τ = φ̄(τ).
The distortion of a decomposition is defined similarly: Dist φ̄ = ∑_{τ∈T} Dist φ_τ.

The sum of two time sets T₀ ⊕ T₁ is the disjoint union

T₀ ⊕ T₁ = {(x, i) | x ∈ Tᵢ, i = 0, 1},

with order (x, i) < (y, i) if and only if x < y, and (x, 0) < (y, 1) for all x, y. The sum of two decompositions φ̄₀ ⊕ φ̄₁ ∈ D̄_{T₀⊕T₁}, where φ̄ᵢ ∈ D̄_{Tᵢ}, is defined by (φ̄₀ ⊕ φ̄₁)(x, i) = φ̄ᵢ(x). In other words, φ̄₀ ⊕ φ̄₁ consists of the diffeomorphisms of φ̄₀ in the order of T₀, followed by the diffeomorphisms of φ̄₁ in the order of T₁. Note that ⊕ is noncommutative on time sets as well as on decompositions.

Remark 4.1.2. Our approach to decompositions is somewhat different from that of Martens (1998). In particular, we require a lot less structure on time sets, and as such our definition is much more suitable to general combinatorics. Intuitively speaking, the structure that Martens (1998) puts on time sets is recovered from limits of the renormalization operator, so we will also get this structure when looking at maps in the limit set of renormalization. We simply choose not to make it part of the definition, to gain some flexibility.

Proposition 4.1.3. The space of decompositions D̄_T is a Banach space.

Proof. The nonlinearity operator takes D²([0, 1]) bijectively to C⁰([0, 1]; ℝ). The latter is a Banach space, so the same holds for D̄_T.

Definition 4.1.4. Let T be a finite time set (i.e. of finite cardinality), so that we can label T = {0, 1, …, n − 1} with the usual order of elements. The composition operator O : D̄_T → D² is defined by

Oφ̄ = φ_{n−1} ∘ ⋯ ∘ φ₀.

The composition operator composes all maps in a decomposition in the order of T. We can also define partial composition operators

O_{[j,k]}φ̄ = φ_k ∘ ⋯ ∘ φ_j, 0 ≤ j ≤ k < n.

As a notational convenience we will write O_{≤k} instead of O_{[0,k]}, etc.

Next, we would like to extend the composition operator to countable time sets, but unfortunately this is not possible in general.
Instead of D² we will work with the space D³ with the C¹-nonlinearity norm:

‖φ‖₁ = ‖Nφ‖_{C¹} = max_{k=0,1} sup |D^k(Nφ)|, φ ∈ D³.

Define D̄³_T = {φ̄ : T → D³ | ‖φ̄‖₁ < ∞}, where ‖φ̄‖₁ = ∑ ‖φ_τ‖₁. Note that ‖·‖ will still be used to denote the C⁰-nonlinearity norm.

Proposition 4.1.5. The composition operator O : D̄³_T → D² continuously extends to decompositions over countable time sets T.

Remark 4.1.6. It is important to note that there is an inherent loss of smoothness when composing a decomposition over a countable time set. Starting with a bound on the C¹-nonlinearity norm we only conclude a bound on the C⁰-nonlinearity norm of the composed map. This can be generalized: starting with a bound on the C^{k+1}-nonlinearity norm, we can conclude a bound on the C^k-nonlinearity norm for the composed map. The reason why we lose one degree of smoothness is that we use the mean value theorem for one estimate in the Sandwich Lemma 4.1.9. If necessary, it should be possible to replace this with, for example, a Hölder estimate, which would lead to a slightly stronger statement.

In order to prove this proposition we will need the Sandwich Lemma, which in itself relies on the following properties of the composition operator.

Lemma 4.1.7. Let φ̄ ∈ D̄_T be a decomposition over a finite time set T, and let φ = Oφ̄. Then

e^{−‖φ̄‖} ≤ |φ′| ≤ e^{‖φ̄‖}, |φ″| ≤ ‖φ̄‖e^{2‖φ̄‖}, and ‖φ‖ ≤ ‖φ̄‖e^{‖φ̄‖}.

If furthermore φ̄ ∈ D̄³_T, then

‖φ‖₁ ≤ (1 + ‖φ̄‖)e^{2‖φ̄‖}‖φ̄‖₁.

Remark 4.1.8. Note that the lemma is stated for finite time sets, but the way we define the composition operator for countable time sets (see the proof of Proposition 4.1.5) will mean that the lemma also holds for countable time sets.

Proof. The bounds on |φ′| and |φ″| follow from an induction argument using only Lemma A.2.11. Since T is finite we can label φ̄ so that φ = φ_{n−1} ∘ ⋯ ∘ φ₀. Let ψᵢ = O_{<i}(φ̄) and let ψ₀ = id.
Now the bound on ‖φ‖ follows from

Nφ(x) = ∑_{i=0}^{n−1} Nφᵢ(ψᵢ(x)) ψᵢ′(x),

which in itself is obtained from an induction argument using the chain rule for nonlinearities (see Lemma A.2.8). Finally, take the derivative of the above equation to get

(Nφ)′(x) = ∑_{i=0}^{n−1} (Nφᵢ)′(ψᵢ(x)) ψᵢ′(x)² + Nφᵢ(ψᵢ(x)) ψᵢ″(x).

From this the bound on ‖φ‖₁ follows.

Lemma 4.1.9 (Sandwich Lemma). Let φ = φ_{n−1} ∘ ⋯ ∘ φ₀ and let ψ be obtained by "sandwiching γ inside φ"; that is,

ψ = φ_{n−1} ∘ ⋯ ∘ φᵢ ∘ γ ∘ φ_{i−1} ∘ ⋯ ∘ φ₀,

for some i ∈ {0, …, n} (with the convention that φₙ = φ₋₁ = id). For every λ there exists K such that if γ, φᵢ ∈ D³ and if ‖γ‖₁ + ∑‖φᵢ‖₁ ≤ λ, then ‖ψ − φ‖ ≤ K‖γ‖.

Proof. Let φ₊ = φₙ ∘ ⋯ ∘ φᵢ, and let φ₋ = φ_{i−1} ∘ ⋯ ∘ φ₋₁. Two applications of the chain rule for nonlinearities give

Nψ(x) − Nφ(x) = [N(φ₊ ∘ γ)(φ₋(x)) − Nφ₊(φ₋(x))] · |φ₋′(x)|
= [Nφ₊(γ(y))γ′(y) − Nφ₊(y) + Nγ(y)] · |φ₋′(x)|,

where y = φ₋(x). By assumption Nφ₊ ∈ C¹, so by the mean value theorem there exists η ∈ [0, 1] such that

Nφ₊(γ(y)) = Nφ₊(y) + (Nφ₊)′(η) · (γ(y) − y).

Hence

|Nψ(x) − Nφ(x)| ≤ |φ₋′(x)| · ( |Nφ₊(y)| · |γ′(y) − 1| + |γ′(y)| · |(Nφ₊)′(η)| · |γ(y) − y| + |Nγ(y)| )
≤ K₁ · ( K₂(e^{‖γ‖} − 1) + K₃(e^{2‖γ‖} − 1) + ‖γ‖ ) ≤ K‖γ‖.

The constants Kᵢ only depend on λ by Lemma 4.1.7. We have also used Lemma A.2.11 and Lemma A.2.12 in the penultimate inequality.

Proof of Proposition 4.1.5. Let φ̄ ∈ D̄³_T and choose an enumeration θ : ℕ → T. Let ψₙ denote the composition of {φ_{θ(0)}, …, φ_{θ(n−1)}} in the order induced by T. We claim that {ψₙ} is a Cauchy sequence in D². Indeed, by applying the Sandwich Lemma with λ = ‖φ̄‖₁ we get a constant K, only depending on λ, such that for n > m

‖ψₙ − ψₘ‖ ≤ ∑_{i=m}^{n−1} ‖ψ_{i+1} − ψᵢ‖ ≤ K ∑_{i=m}^{n−1} ‖φ_{θ(i)}‖ → 0, as m, n → ∞.

Hence φ = lim ψₙ exists and φ ∈ D². This also shows that φ is independent of the enumeration θ, and hence we can define Oφ̄ = φ.
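The Cauchy-sequence argument above is easy to see numerically. The sketch below is illustrative only: the quadratic perturbations φ_a(x) = x + a·x(1 − x) and the plain C⁰ distance on a grid stand in for the nonlinearity-norm setup of the text, and the names are ours. It composes longer and longer truncations of a decomposition with summable coefficients and checks that successive compositions form a Cauchy sequence, mirroring the estimate ‖ψ_{i+1} − ψᵢ‖ ≤ K‖φ_{θ(i)}‖.

```python
# Numerical illustration of Proposition 4.1.5: truncated compositions of a
# decomposition with summable "sizes" converge uniformly.
# Illustrative maps: phi_a(x) = x + a*x*(1-x) is an increasing
# diffeomorphism of [0,1] fixing both endpoints whenever |a| < 1.

def phi(a):
    return lambda x: x + a * x * (1.0 - x)

def compose(fs):
    """Compose a finite list of maps in order: fs[-1] o ... o fs[0]."""
    def composed(x):
        for f in fs:
            x = f(x)
        return x
    return composed

# Summable coefficients, so the partial compositions should be Cauchy in C^0.
coeffs = [0.5 * 2.0 ** (-n) for n in range(20)]
maps = [phi(a) for a in coeffs]

grid = [i / 200.0 for i in range(201)]

def c0_dist(f, g):
    return max(abs(f(x) - g(x)) for x in grid)

# Successive truncations psi_n = phi_{n-1} o ... o phi_0.
psis = [compose(maps[:n]) for n in range(1, len(maps) + 1)]
gaps = [c0_dist(psis[n + 1], psis[n]) for n in range(len(psis) - 1)]

# Each C^0 gap is dominated by a constant times the next coefficient,
# so the truncations form a Cauchy sequence.
for gap, a in zip(gaps, coeffs[1:]):
    assert gap <= 2.0 * a

print("last gap:", gaps[-1])
```

Since ψ_{n+1} = φ_{a_{n+1}} ∘ ψₙ here, the gap at step n is at most a_{n+1}/4, which is exactly the summable-tail behavior the proof exploits.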
We can now use the composition operator to lift operators from D to D̄_T, starting with the zoom operators of Definition 2.1.9.

Definition 4.1.10. Let I ⊂ [0, 1] be an interval, let φ̄ ∈ D̄³_T, and let I_τ be the image of I under the diffeomorphism O_{<τ}(φ̄). Define Z(φ̄; I) = ψ̄, where

ψ_τ = Z(φ_τ; I_τ), for every τ ∈ T.

Remark 4.1.11. An equivalent way of defining the zoom operators on D̄³_T is to let I_τ = ψ_τ⁻¹(J), where ψ_τ = O_{≥τ}(φ̄), J = φ(I), and φ = Oφ̄. This is equivalent since Oφ̄ = O_{≥τ}(φ̄) ∘ O_{<τ}(φ̄). The original definition takes the view of zooming in on an interval in the domain of the decomposition, whereas the latter takes the view of zooming in on an interval in the range of the decomposition. We will make use of both of these points of view.

Zoom operators on diffeomorphisms are contractions for a fixed interval I by Lemma A.2.16. A similar statement holds for decompositions:

Lemma 4.1.12. Let I ⊂ [0, 1] be an interval. If φ̄ ∈ D̄³_T then

‖Z(φ̄; I)‖ ≤ e^{‖φ̄‖} · min{|I|, |φ(I)|} · ‖φ̄‖,

where φ = Oφ̄.

Remark 4.1.13. Since we are only dealing with decompositions with very small norm, this lemma is enough for our purposes. However, in more general situations the constant in front of ‖φ̄‖ may not be small enough. A way around this is to consider decompositions which compose to diffeomorphisms with negative Schwarzian derivative. Then all the intervals I_τ will have hyperbolic lengths bounded by that of J (notation as in Remark 4.1.11). This can then be used to show that zoom operators contract, and the contraction can be bounded in terms of the hyperbolic length of J.

Proof. Using the notation of Definition 4.1.10 we have

‖Z(φ̄; I)‖ = ∑_{τ∈T} ‖Z(φ_τ; I_τ)‖ ≤ ∑_{τ∈T} |I_τ| · ‖φ_τ‖ ≤ sup_{τ∈T} |I_τ| · ‖φ̄‖.

For every τ there exists ξ_τ ∈ I such that |I_τ| = (O_{<τ}(φ̄))′(ξ_τ) · |I|, which together with Lemma 4.1.7 implies that |I_τ| ≤ e^{‖φ̄‖} · |I|.
Similarly, there exists η_τ ∈ φ(I) such that |φ(I)| = (O_{≥τ}(φ̄))′(η_τ) · |I_τ|, so by Lemma 4.1.7 |I_τ| ≤ e^{‖φ̄‖} · |φ(I)| as well.

This contraction property of the zoom operators leads us to introduce the subspace of pure decompositions (the intuition is that renormalization contracts towards the pure subspace, see Proposition 4.2.8).

Definition 4.1.14. The subspace of pure decompositions Q̄_T ⊂ D̄_T consists of all decompositions φ̄ such that φ_τ is a pure map for every τ ∈ T. The subspace of pure maps Q ⊂ D^∞ consists of restrictions of x^α away from the critical point, that is

Q = { Z(x|x|^{α−1}; I) | int I ∌ 0 }.

A property of pure maps is that they can be parametrized by one real variable. We choose to parametrize the pure maps by their distortion with a sign, and call this parameter s. The sign of s is positive for I to the right of 0 and negative for I to the left of 0. With this convention the graphs of pure maps look like Figure 4.1.

Figure 4.1: The graphs of a pure map µ_s for different values of the signed distortion s (s > 0, s = 0 and s < 0).

Remark 4.1.15. Let µ_s ∈ Q. A calculation shows that Dist µ_s = |log(µ_s′(1)/µ_s′(0))|, and from this it is possible to deduce an expression for µ_s:

(4.1)  µ_s(x) = [ (1 + (exp{s/(α−1)} − 1) x)^α − 1 ] / ( exp{αs/(α−1)} − 1 ), x ∈ [0, 1], s ≠ 0,

and µ₀ = id. We emphasize that the parametrization is chosen so that |s| equals the distortion of µ_s. For this reason we call s the signed distortion of µ_s. Figure 4.1 shows the graphs of µ_s for different values of s. Equation (4.1) may at first seem to indicate that there is some sort of singular behavior at s = 0, but this is not the case; the family s ↦ µ_s is smooth.

The next two lemmas are needed in preparation for Proposition 4.2.8.

Lemma 4.1.16. Let φ ∈ D² and let I ⊂ [0, 1] be an interval. Then

d(Z(φ; I), Q) ≤ |I| · d(φ, Q),

where the distance d(·, ·) is induced by the C⁰-nonlinearity norm.

Proof.
A calculation shows that

Nµ_s(x) = r_s(α − 1) / (1 + r_s x), where r_s = exp{s/(α−1)} − 1.

Let I = [a, b] and let ζ_I(x) = a + |I| · x. Then

d(Z(φ; I), Q) = inf_{s∈ℝ} max_{x∈[0,1]} |N(Z(φ; I))(x) − Nµ_s(x)|
= inf_{r>−1} max_{x∈[0,1]} | |I| · Nφ(ζ_I(x)) − r(α−1)/(1 + rx) |
= inf_{r>−1} max_{x∈[0,1]} | |I| · Nφ(ζ_I(x)) − r(α−1)/(1 + r(ζ_I(x) − a)/|I|) |
= |I| · inf_{ρ∉[−1/b,−1/a]} max_{x∈I} | Nφ(x) − ρ(α−1)/(1 + ρx) |,

where ρ = r/(b − (1 + r)a). Note that 1 + ρx has a zero in [0, 1] if ρ ≤ −1, so the infimum is assumed for ρ > −1. Thus

d(Z(φ; I), Q) = |I| · inf_{ρ>−1} max_{x∈I} | Nφ(x) − ρ(α−1)/(1 + ρx) |.

Taking the max over x ∈ [0, 1] finishes the proof.

Lemma 4.1.17. Let φ̄ ∈ D̄³_T and let I ⊂ [0, 1] be an interval. Then

d(Z(φ̄; I), Q̄_T) ≤ e^{‖φ̄‖} · min{|I|, |φ(I)|} · d(φ̄, Q̄_T),

where φ = Oφ̄.

Proof. Use Lemma 4.1.16 and a similar argument to that employed in the proof of Lemma 4.1.12.

The pure decompositions have some very nice properties which we will make use of repeatedly.

Proposition 4.1.18. If φ̄ ∈ Q̄_T and ‖φ̄‖ < ∞, then φ = Oφ̄ is in D^∞ and φ has nonpositive Schwarzian derivative.

Remark 4.1.19. Note that ‖φ̄‖ < ∞ is equivalent to Dist φ̄ < ∞, since

Dist µ = |∫₀¹ Nµ(x) dx|

for pure maps µ. Hence the norm bound can be replaced by a distortion bound and the above proposition still holds.

Proof. Let η be the nonlinearity of a pure map. A computation gives

D^k η(x) = (−1)^k k! · η(x)^{k+1} / (α − 1)^k.

Hence, if η is bounded then so are all of its derivatives (of course, the bound depends on k). Thus Proposition 4.1.5 shows that φ = Oφ̄ is well-defined and φ ∈ D^k for all k ≥ 2 (use Remark 4.1.6). Finally, every pure map has negative Schwarzian derivative, so φ must have nonpositive Schwarzian derivative, since negative Schwarzian is preserved under composition by Lemma A.3.4.

Notation.
We put a bar over objects associated with decompositions to distinguish them from diffeomorphisms. Hence φ̄ denotes a decomposition, whereas φ denotes a diffeomorphism. Similarly, D̄ denotes a set of decompositions, whereas D is a set of diffeomorphisms. Given a decomposition φ̄ : T → D, we use the notation φ_τ to mean φ̄(τ), and we call this the diffeomorphism at time τ. Moreover, when talking about φ̄ we consistently write φ to denote the composed map Oφ̄. We will frequently consider the disjoint union of all decompositions instead of decompositions over some fixed time set T, and for this reason we introduce the notation

D̄ = ⊔_T D̄_T and Q̄ = ⊔_T Q̄_T.

4.2 Renormalization of decomposed maps

In this section we lift the renormalization operator to the space of decomposed Lorenz maps (i.e. Lorenz maps whose diffeomorphic parts are replaced with decompositions). We prove that renormalization contracts towards the subspace of pure decomposed maps. This will be used in later sections to compute the derivative of R on its limit set.

Definition 4.2.1. Let T = (T₀, T₁) be a pair of time sets, and let D̄_T denote the product D̄_{T₀} × D̄_{T₁}. The space of decomposed Lorenz maps L̄_T over T is the set [0, 1]² × (0, 1) × D̄_T, together with the structure induced from the Banach space ℝ³ × D̄_T with the max norm of the products.

Definition 4.2.2. The composition operator induces a map L̄³_T → L² which (by slight abuse of notation) we will also denote O. Explicitly, if f̄ = (u, v, c, φ̄, ψ̄) ∈ L̄_T, then f = Of̄ is defined by f = (u, v, c, Oφ̄, Oψ̄).

We will now define the renormalization operator on the space of decomposed Lorenz maps. Formally, the definition is identical to the definition of the renormalization operator on Lorenz maps. To illustrate this, let f = Of̄ be renormalizable.
Then, by Lemma 2.1.11, Rf = (u′, v′, c′, φ′, ψ′), where

(4.2)  u′ = |Q(L)|/|U|, v′ = |Q(R)|/|V|, c′ = |L|/|C|,
φ′ = Z(f^a ∘ φ; U) and ψ′ = Z(f^b ∘ ψ; V).

Zoom operators satisfy Z(g ∘ h; I) = Z(g; h(I)) ∘ Z(h; I), so we can write

φ′ = Z(ψ; Q(U_a)) ∘ Z(Q; U_a) ∘ ⋯ ∘ Z(ψ; Q(U₁)) ∘ Z(Q; U₁) ∘ Z(φ; U),
ψ′ = Z(φ; Q(V_b)) ∘ Z(Q; V_b) ∘ ⋯ ∘ Z(φ; Q(V₁)) ∘ Z(Q; V₁) ∘ Z(ψ; V).

Definition 4.2.3. Define Rf̄ = (u′, v′, c′, φ̄′, ψ̄′), where u′, v′, c′ are given by (4.2) and

φ̄′ = Z(φ̄; U) ⊕ Z(Q; U₁) ⊕ Z(ψ̄; Q(U₁)) ⊕ ⋯ ⊕ Z(Q; U_a) ⊕ Z(ψ̄; Q(U_a)),
ψ̄′ = Z(ψ̄; V) ⊕ Z(Q; V₁) ⊕ Z(φ̄; Q(V₁)) ⊕ ⋯ ⊕ Z(Q; V_b) ⊕ Z(φ̄; Q(V_b)),

where Z(Q; ·) is now interpreted as a decomposition over a singleton time set. See Figure 4.2 for an illustration of the action of R.

Definition 4.2.4. The domain of R on decomposed Lorenz maps is contained in the disjoint union L̄ = ⊔_T L̄_T over all time sets T. Just as before, we let L̄_ω denote all ω-renormalizable maps in L̄; L̄_ω̄ denotes all maps in L̄ such that Rⁱf̄ ∈ L̄_{ωᵢ}, where ω̄ = (ω₀, ω₁, …); and L̄_Ω = ⋃_{ω∈Ω} L̄_ω.

Remark 4.2.5. Note that R takes the renormalizable maps of L̄_T into L̄_{T′}, where T′ ≠ T in general. This is the reason why we have to work with the disjoint union ⊔_T L̄_T.

Figure 4.2: Illustration of the renormalization operator acting on decomposed Lorenz maps. First the decompositions are 'glued' to each other with Q according to the type of renormalization; here the type is (01, 100). Then the interval C is pulled back, creating the shaded areas in the picture. The maps following the dashed arrows from U to C and from V to C represent the new decompositions before rescaling.

Lemma 4.2.6. The composition operator is a semi-conjugacy.
That is, the following square commutes

L̄³_ω --R--> L̄³
  |O          |O
  v           v
L²_ω --R--> L²

and O is surjective.

Remark 4.2.7. This lemma shows that we can use the composition operator to transfer results about decomposed Lorenz maps to Lorenz maps.

Proof. The square commutes by definition, so let us focus on the surjectivity. Fix τ ∈ T and define a map Γ_τ : D → D̄_T by sending φ ∈ D to the decomposition φ̄ : T → D defined by

φ̄(t) = φ if t = τ, and φ̄(t) = id otherwise.

Then O ∘ Γ_τ = id, which proves that O is surjective on D̄_T, and hence it is also surjective on L̄_T.

The main result for the renormalization operator on Lorenz maps was the existence of the invariant set K for types in the set Ω, see Section 3.1. It should come as no surprise that K and Ω will be central to our discussion on decomposed maps as well. The first result in this direction is the following.

Proposition 4.2.8. If f̄ ∈ L̄³_ω̄ is infinitely renormalizable with ω̄ ∈ Ω^ℕ, if ‖φ̄‖ ≤ K and ‖ψ̄‖ ≤ K, and if Of̄ ∈ K ∩ LS, then the decompositions of Rⁿf̄ are uniformly contracted towards the subset of pure decompositions.

Proof. From the definition of the renormalization operator (and using the fact that d(Z(Q; I), Q) = 0) we get

d(φ̄′, Q̄) = ∑_{i=1}^{a} d(Z(ψ̄; Q(Uᵢ)), Q̄) + d(Z(φ̄; U), Q̄).

Now apply Lemma 4.1.17 to get

d(φ̄′, Q̄) ≤ e^{‖ψ̄‖} ∑_{i=2}^{a+1} |Uᵢ| d(ψ̄, Q̄) + e^{‖φ̄‖} |U₁| d(φ̄, Q̄).

From Section 3.1 we get that ∑|Uᵢ| and ∑|Vᵢ| may be chosen arbitrarily small (by choosing the return times sufficiently large). Now make these sums small compared with max{e^{‖φ̄‖}, e^{‖ψ̄‖}} to see that there exists µ < 1 (only depending on K) such that

d(φ̄′, Q̄) + d(ψ̄′, Q̄) ≤ µ (d(φ̄, Q̄) + d(ψ̄, Q̄)).

Our main goal is to understand the limit set of the renormalization operator, and the above proposition will be central to this discussion.

Definition 4.2.9. The set of forward limits of R restricted to types in Ω is defined by

A_Ω = ⋂_{n≥1} ⋃_{ω̄∈Ωⁿ} Rⁿ(L̄_ω̄).

Remark 4.2.10.
In other words, A_Ω consists of all maps f̄ which have a complete past:

f̄ = R_{ω₋₁} f̄₋₁, f̄₋₁ = R_{ω₋₂} f̄₋₂, …, ωᵢ ∈ Ω.

This also describes how we can associate each f̄ ∈ A_Ω with a left infinite sequence (…, ω₋₂, ω₋₁).

Proposition 4.2.11. A_Ω is contained in the subset of pure decomposed Lorenz maps.

Proof. This is a direct consequence of Proposition 4.2.8.

Since A_Ω is contained in the set of pure decomposed maps we will restrict our attention to this subset from now on. This is extremely convenient since pure decompositions satisfy some very strong properties, see Proposition 4.1.18, and it will allow us to compute the derivative at all points in A_Ω in Section 5.1.

Next we would like to lift the invariant set K to the decomposed maps, but simply taking the preimage O⁻¹(K) will yield a set which is too large,¹ so we will have to be a bit careful.

Definition 4.2.12. Define

K̄ = { (u, v, c, φ̄, ψ̄) | ε₋ ≤ 1 − c ≤ ε₊, Dist φ̄ ≤ δ, Dist ψ̄ ≤ δ, φ̄, ψ̄ ∈ Q̄ },

and K̄_Ω = { f̄ ∈ K̄ ∩ L̄_Ω | c₁₊(Rf̄) ≤ 1/2 ≤ c₁₋(Rf̄) }. Note that K̄ is defined analogously to K, but with the additional assumption that the decompositions are pure. The notation used here is the same as that of Section 3.1.

Proposition 4.2.13. If b₋ is sufficiently large, then R(K̄_Ω) ⊂ K̄.

Proof. Let f = Of̄ = (u, v, c, φ, ψ). Note first of all that Dist φ̄ ≤ δ implies that Dist φ ≤ δ, since Dist satisfies the subadditivity property Dist γ₂ ∘ γ₁ ≤ Dist γ₁ + Dist γ₂. Hence f automatically satisfies the conditions of Theorem 3.1.5, so all we need to prove is that Dist φ̄′ ≤ δ and Dist ψ̄′ ≤ δ. This is the reason why we define K̄ by a distortion bound instead of a norm bound. Note that f has nonpositive Schwarzian since the decompositions are pure, see Proposition 4.1.18.

¹ Any preimage under O contains decompositions whose norm is arbitrarily large.
As an example of how things can go wrong, fix K > 0 and consider φ̄ : N → D defined by φ_{n+1} = φ_n^{−1} and ‖φ_n‖ = K for every n. Then φ_{2n−1} ∘ ⋯ ∘ φ_0 = id for every n, but Σ‖φ_n‖ = ∞.

We will first show that the norm is invariant; then we transfer this invariance to the distortion. The reason why we consider the norm first is that it satisfies the contraction property in Lemma 4.1.12, which makes it easier to work with. From the definition of R and Lemma 4.1.12 we get

    ‖φ̄′‖ = ‖Z(φ̄; U)‖ + Σ_{i=1}^{a} ( ‖Z(ψ̄; Q(U_i))‖ + ‖Z(Q; U_i)‖ )
          ≤ e^{‖φ̄‖} ‖φ̄‖ · |U_1| + e^{‖ψ̄‖} ‖ψ̄‖ Σ_{i=2}^{a+1} |U_i| + Σ_{i=1}^{a} ‖Z(Q; U_i)‖.

The norm of a pure map is determined by how far away its domain is from the critical point. More precisely, we have that

    Σ_{i=1}^{a} ‖Z(Q; U_i)‖ = (α − 1) Σ_{i=1}^{a} |U_i| / d(c, U_i).

Each term in this sum is bounded by the cross-ratio of U_i inside [c, 1]. Since maps with positive Schwarzian contract cross-ratios, since Sf < 0, and since U_i is a pull-back of C under an iterate of f, this cross-ratio is bounded by the cross-ratio χ of C inside [c₁⁺, 1]. Thus the above sum is bounded by a(α − 1)χ. From the proof of Theorem 3.1.5 we know that χ is of the order ε^t for some t > 0. Since a < b₋ and b₋ε^t → 0, we see that the above sum has a uniform bound which tends to zero as b₋ → ∞.

A similar argument for ψ̄′ gives

    ‖φ̄′‖ + ‖ψ̄′‖ ≤ (‖φ̄‖ + ‖ψ̄‖) exp{‖φ̄‖ + ‖ψ̄‖} (Σ|U_i| + Σ|V_i|) + m = k(‖φ̄‖ + ‖ψ̄‖) + m,

where m = Σ‖Z(Q; U_i)‖ + Σ‖Z(Q; V_i)‖. The arguments above and the proof of Theorem 3.1.5 show that we can choose b₋ and δ so that

    δ ≥ m / (1 − k),

which proves that ‖φ̄‖ + ‖ψ̄‖ ≤ δ ⟹ ‖φ̄′‖ + ‖ψ̄′‖ ≤ δ.

The final observation which we use to finish the proof is that if γ ∈ Q then

    ‖γ‖ = (α − 1) · (exp{Dist γ/(α − 1)} − 1).

That is, ‖γ‖ ≈ Dist γ for pure maps γ with small distortion.
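The identity above can be sanity-checked numerically. The sketch below (Python, with a hypothetical critical exponent α = 2 chosen only for illustration) evaluates ‖γ‖ = (α − 1)(exp{Dist γ/(α − 1)} − 1) and confirms that ‖γ‖ ≈ Dist γ to first order as the distortion shrinks.

```python
import math

def pure_norm(dist, alpha):
    # ||gamma|| = (alpha - 1) * (exp(Dist gamma / (alpha - 1)) - 1)
    return (alpha - 1.0) * (math.exp(dist / (alpha - 1.0)) - 1.0)

alpha = 2.0  # hypothetical critical exponent, not fixed by the text here
for dist in [1e-1, 1e-2, 1e-3]:
    n = pure_norm(dist, alpha)
    # the relative difference between norm and distortion shrinks with dist
    print(dist, n, abs(n - dist) / dist)
```

Since exp(x) − 1 = x + O(x²), the relative difference is about Dist γ/(2(α − 1)), which is exactly the "small distortion" regime used in the proof.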
This allows us to slightly modify the above invariance argument for the norm so that it holds for the distortion as well.

Chapter 5. Differentiable structure

This chapter begins with a calculation of the derivative of the renormalization operator on a subset of the pure decomposed Lorenz maps in Section 5.1. The derivative restricted to the parameter plane is orientation-preserving, and this turns out to have strong consequences for the geometry of the domains of renormalization. This is discussed in Section 5.2. After this, the estimates on the norm of the derivative are used to construct an expanding invariant cone field in Section 5.3. The cone field is then used in Section 5.4 to construct unstable manifolds at each point in the limit set of renormalization.

5.1 The derivative

The tangent space of R on the pure decomposed Lorenz maps can be written X × Y, where X = R² and Y = R × ℓ¹ × ℓ¹. The coordinates on X correspond to the (u, v) coordinates on L̄^T. Let (x, y) ∈ X × Y denote the coordinates on the tangent space and recall that we are using the max norm on the products. The derivative of R at f̄ is denoted

    DR_{f̄} = M = ( M₁  M₂ )
                  ( M₃  M₄ ),

where M₁ : R² → R², M₂ : Y → R², M₃ : R² → Y and M₄ : Y → Y are bounded linear operators.

Remark 5.1.1. The fact that the derivative on the pure decomposed maps can be written as an infinite matrix is one of the reasons why we restrict ourselves to the pure decompositions. Deformations of pure decompositions are also easy to deal with since they are ‘monotone’ in the sense that the dynamical intervals that define the renormalization move monotonically under such deformations. This makes it possible to estimate the elements of the derivative matrix.

Theorem 5.1.2.
There exist constants k and K such that if f̄ ∈ K̄ ∩ L̄_Ω, then

    ‖M₁x‖ ≥ k min{|U|⁻¹, |V|⁻¹} · ‖x‖,        ‖M₂‖ ≤ K|C|⁻¹,
    ‖M₃x‖ ≤ Kρ (|x₁|/|U| + |x₂|/|V|),          ‖M₄‖ ≤ Kρ|C|⁻¹,

where ρ = max{ε′, Dist φ̄, Dist ψ̄} and b₋ is sufficiently large.

Remark 5.1.3. The sets K̄ and K̄_Ω are introduced in Definition 4.2.12, and Ω is given by Section 3.1 as always. Some results in this section are stated for K̄ but others are only valid for the subset K̄_Ω. The main difference between these two sets is that maps in K̄_Ω have good bounds on u, v and c for the renormalization due to Proposition 4.2.13, whereas for maps in K̄ we cannot say much about the renormalization.

Proof. The proof of this theorem is split up into a few propositions that are in this section. The estimate for M₁ is given in Corollary 5.1.11. The estimates for M₂ and M₄ follow from Propositions 5.1.12 and 5.1.15. Finally, the estimate for M₃ follows from Propositions 5.1.12 and 5.1.13.

Notation. Let f̄ = (u, v, c, φ̄, ψ̄) and as always use primes to denote the renormalization R f̄ = (u′, v′, c′, φ̄′, ψ̄′). We introduce special notation for the diffeomorphic parts of the renormalization before rescaling:

(5.1)    Φ = f₁^a ∘ φ,    Ψ = f₀^b ∘ ψ,

so that Φ : U → C, Ψ : V → C, and C = (p, q). Note that p and q are by definition periodic points of periods a + 1 and b + 1, respectively.

We will use the notation ∂_s t to denote the partial derivative of t with respect to s. In the formulas below we write ∂t to mean the partial derivative of t with respect to any direction. The notation g(x) ≍ y is used to mean that there exists K < ∞ not depending on g such that K⁻¹y ≤ g(x) ≤ Ky for all x in the domain of g.

The ∂ operator satisfies the following rules:

Lemma 5.1.4.
The following expressions hold whenever they make sense:

(5.2)    ∂(f ∘ g)(x) = ∂f(g(x)) + f′(g(x)) ∂g(x),
(5.3)    ∂f^{n+1}(x) = Σ_{i=0}^{n} Df^{n−i}(f^{i+1}(x)) · ∂f(f^i(x)),
(5.4)    ∂f^{−1}(x) = − ∂f(f^{−1}(x)) / f′(f^{−1}(x)).

Furthermore, if f(p) = p then

(5.5)    ∂p = − ∂f(p) / (f′(p) − 1).

Remark 5.1.5. The ∂ operator clearly also satisfies the product rule

(5.6)    ∂(f · g)(x) = ∂f(x) g(x) + f(x) ∂g(x).

This and the chain rule give the quotient rule

(5.7)    ∂(f/g)(x) = (∂f(x) g(x) − f(x) ∂g(x)) / g(x)².

Proof. Equation (5.2) implies the other three. The second equation is an induction argument and the last two follow from

    0 = ∂(x) = ∂(f ∘ f^{−1})(x) = ∂f(f^{−1}(x)) + f′(f^{−1}(x)) ∂f^{−1}(x),

and

    ∂p = ∂(f(p)) = ∂f(p) + f′(p) ∂p.

Equation (5.2) itself can be proved by writing f_ε(x) = f(x) + ε f̂(x), g_ε(x) = g(x) + ε ĝ(x) and using a Taylor expansion:

    f_ε(g_ε(x)) = f_ε(g(x)) + ε f_ε′(g(x)) ĝ(x) + O(ε²)
                = f(g(x)) + ε ( f̂(g(x)) + f′(g(x)) ĝ(x) ) + O(ε²).

We now turn to computing the derivative matrix M. The first three rows of M are given by the following formulas.

Lemma 5.1.6. The partial derivatives of u′, v′ and c′ are given by

    ∂u′ = ( ∂(Q₀(c) − Q₀(p)) − u′ · ∂(Φ^{−1}(q) − Φ^{−1}(p)) ) / |U|,
    ∂v′ = ( ∂(Q₁(q) − Q₁(c)) − v′ · ∂(Ψ^{−1}(q) − Ψ^{−1}(p)) ) / |V|,
    ∂c′ = ( ∂(c − p) − c′ · ∂(q − p) ) / |C|.

Proof. Use (4.2), Lemma 5.1.4 and Remark 5.1.5.

Let us first consider how to use these formulas when deforming in the u, v or c directions (i.e. the first three columns of M). Almost everything in these formulas is completely explicit — we have expressions for Q₀ and Q₁, so evaluating for example ∂_u Q₀(c) is routine. In order to evaluate for example the term ∂_u Ψ^{−1}(q), we make use of (5.4) and (5.3). This involves estimating the sum in (5.3), which can be done with mean value theorem estimates.
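The rules in Lemma 5.1.4 are easy to verify by finite differences. The following sketch (Python; the deformation families f_ε, g_ε are hypothetical toy choices, not taken from the text) checks the chain rule (5.2) and the fixed point rule (5.5):

```python
import math

eps = 1e-6

# Toy one-parameter deformations (hypothetical choices, for illustration):
#   f_eps(x) = sin(x) + eps*x^3,   g_eps(x) = cos(x) + eps*exp(x)
f = math.sin; fhat = lambda x: x ** 3
g = math.cos; ghat = math.exp

def compose(e, x):
    gx = g(x) + e * ghat(x)
    return f(gx) + e * fhat(gx)

x = 0.7
# finite-difference value of d/de [f_eps(g_eps(x))] at eps = 0
fd = (compose(eps, x) - compose(-eps, x)) / (2 * eps)
# rule (5.2): d(f o g)(x) = fhat(g(x)) + f'(g(x)) * ghat(x), with f' = cos
formula = fhat(g(x)) + math.cos(g(x)) * ghat(x)
print(fd, formula)

# Rule (5.5): f_eps(x) = x/2 + 1 + eps*x^2 has a fixed point p(eps) near 2,
# and dp = -df(p) / (f'(p) - 1) = -(p^2) / (1/2 - 1) = 8 at eps = 0.
def fixed_point(e):
    p = 0.0
    for _ in range(200):          # contraction, so plain iteration converges
        p = p / 2 + 1 + e * p * p
    return p

p0 = fixed_point(0.0)             # p = 2
fd_p = (fixed_point(eps) - fixed_point(-eps)) / (2 * eps)
formula_p = -(p0 * p0) / (0.5 - 1.0)
print(fd_p, formula_p)
```

Both finite-difference values agree with the closed-form expressions to high accuracy, which is a useful cross-check before the longer computations below.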
The terms ∂p and ∂q are evaluated using (5.5) and the fact that p = Φ ∘ Q₀(p) and q = Ψ ∘ Q₁(q). There are a few shortcuts which make the calculations simpler as well; for example ∂_u Φ = 0, since Φ does not contain Q₀, which is the only term that depends on u, and so on.

Deforming in the φ̄ or ψ̄ directions (there are countably many such directions) is similar. Here we make use of the fact that the decompositions are pure and that we have an explicit formula (4.1) for pure maps in which the free parameter represents the signed distortion (see Remark 4.1.15), so we can compute their derivative, partial derivative with respect to distortion, etc. These deformations will affect the partial derivatives of any expression involving Φ or Ψ, but all others will not ‘see’ them. The calculations involved do not make any particular use of which direction we deform in, so even though there are countably many directions we essentially only need to perform one calculation for φ̄ and another for ψ̄.

We now turn to computing the partial derivatives of φ̄′ and ψ̄′.

Lemma 5.1.7. Let µ_{s′} = Z(µ_s; I), where µ_s, µ_{s′} ∈ Q and I = [x, y]. Then

    ∂s′ = Nµ_s(y) ∂y − Nµ_s(x) ∂x + ∂(Dµ_s)(y)/Dµ_s(y) − ∂(Dµ_s)(x)/Dµ_s(x).

Proof. By definition s = log{Dµ_s(1)/Dµ_s(0)}. Distortion is invariant under zooming, so this shows that s′ = log{Dµ_s(y)/Dµ_s(x)}. A calculation gives

    ∂ log Dµ_s(x) = ∂(Dµ_s)(x)/Dµ_s(x) + Nµ_s(x) ∂x.

By definition φ̄′ consists of maps of the form Z(µ_s; I) (as well as finitely many of the form Z(Q; I), but these can be thought of as lim_{s→±∞} Z(µ_s; I)). Hence the above lemma shows us how to compute the partial derivatives at each time in φ̄′. Note that we implicitly identify R with Q via s ↦ µ_s.

In order to use the lemma we also need a way to evaluate the terms ∂x and ∂y. One way to do this is to express these in terms of ∂p and ∂q, which have already been computed at this stage.
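Lemma 5.1.7 can likewise be checked by finite differences. The sketch below uses the constant-nonlinearity family µ_s(x) = (e^{sx} − 1)/(e^s − 1), a standard model of a pure one-dimensional diffeomorphism with Nµ_s ≡ s and Dist µ_s = s; whether this coincides with the thesis's formula (4.1) is an assumption made only for this illustration. The parameter s and the interval I = [x, y] are deformed along a hypothetical one-parameter family:

```python
import math

# Constant-nonlinearity family (an assumed model of a pure map):
#   mu_s(x) = (exp(s*x) - 1) / (exp(s) - 1),  N mu_s = s,  Dist mu_s = s.
def D(s, x):   # derivative of mu_s at x
    return s * math.exp(s * x) / (math.exp(s) - 1)

# One-parameter deformation of s and of the interval I = [x, y]:
s_ = lambda e: 1.0 + e
x_ = lambda e: 0.2 + 2 * e
y_ = lambda e: 0.7 - e

def s_prime(e):                     # s' = log( D mu_s(y) / D mu_s(x) )
    return math.log(D(s_(e), y_(e)) / D(s_(e), x_(e)))

h = 1e-6
fd = (s_prime(h) - s_prime(-h)) / (2 * h)      # direct finite difference

# Lemma 5.1.7: ds' = N(y) dy - N(x) dx + d(D mu)(y)/D mu(y) - d(D mu)(x)/D mu(x)
N = s_(0)                                       # constant nonlinearity
dlogD = lambda t: (math.log(D(s_(h), t)) - math.log(D(s_(-h), t))) / (2 * h)
formula = N * (-1) - N * 2 + dlogD(y_(0)) - dlogD(x_(0))
print(fd, formula)
```

For this family s′ = s(y − x), so both quantities should equal d/dε[s(ε)(y(ε) − x(ε))] = −2.5 at ε = 0, and the two computations agree.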
If we let T : I → [p, q] denote the ‘transfer map’ to C, then p = T(x) and hence (5.2) shows that

    ∂x = (∂p − ∂T(x)) / DT(x).

The terms ∂T and DT can be bounded by ∂Φ and DΦ (or ∂Ψ and DΨ), all of which have already been computed as well.

Proposition 5.1.8. If f ∈ K ∩ L_Ω, then M₁ equals

    ( (1/|U|)·[1 + ((1−u′)/u)·Q(p)/(Df^{a+1}(p) − 1)]                  −(u′/(v|U|))·(DΨ(Ψ^{−1}(q))/DΦ(Φ^{−1}(q)))·(1 − Q(q))/(Df^{b+1}(q) − 1) )
    ( −(v′/(u|V|))·(DΦ(Φ^{−1}(p))/DΨ(Ψ^{−1}(p)))·Q(p)/(Df^{a+1}(p) − 1)    (1/|V|)·[1 − ((1−v′)/v)·(1 − Q(q))/(Df^{b+1}(q) − 1)] )

plus an error term M₁ᵉ, which is negligible.

Remark 5.1.9. Note that this proposition does not need any assumptions on the critical values of the renormalization (cf. Theorem 3.1.5). This will be important later on when we discuss the structure of the parameter plane. Also note that the M₁ part of the derivative matrix has nothing to do with decompositions, so it is stated for nondecomposed Lorenz maps.

Proof. We begin by computing ∂p and ∂q. Use Φ ∘ Q₀(p) = p, Ψ ∘ Q₁(q) = q, and (5.5) to get

(5.8)    ∂_u p = − DΦ(Q₀(p)) ∂_u Q₀(p) / (Df^{a+1}(p) − 1),        ∂_u q = − ∂_u Ψ(Q₁(q)) / (Df^{b+1}(q) − 1),
(5.9)    ∂_v p = − ∂_v Φ(Q₀(p)) / (Df^{a+1}(p) − 1),               ∂_v q = − DΨ(Q₁(q)) ∂_v Q₁(q) / (Df^{b+1}(q) − 1).

Here we have used that ∂_u Φ = 0 and ∂_v Ψ = 0.

Next, let us estimate ∂_u Ψ. Let x ∈ V and let x_i = f^i ∘ ψ(x). From (5.3) we get

    ∂_u Ψ(x) = ∂_u (f^b ∘ ψ)(x) = ∂_u f(x_{b−1}) + Σ_{i=1}^{b−1} Df^{b−i}(x_i) ∂_u f(x_{i−1}),

where ∂_u f(x) = φ′(Q₀(x)) Q₀(x)/u. Note that ∂_u f(x_{i−1}) ≤ e^{2δ} x_i/u. In order to bound the sum we divide the estimate into two parts. Let n < b be the smallest integer such that Df(x_i) ≤ 1 for all i ≥ n. In the part where i < n we estimate

    Df^{b−i}(x_i) x_i = Df^{n−i}(x_i) Df^{b−n}(x_n) x_i ≤ K₁ (x_n/x_i) Df(x_{b−1}) x_i ≤ K₂ ε^{1−1/α}.

Here we have used the mean value theorem to find ξ_i ≤ x_i such that Df^{n−i}(ξ_i) = x_n/x_i, and Df^{n−i}(x_i) ≤ K₁ Df^{n−i}(ξ_i) since φ has very small distortion. In the part where i ≥ n we estimate

    Df^{b−i}(x_i) x_i ≤ Df(x_{b−1}) ≤ Kε^{1−1/α}.

Summing over the two parts gives us the estimate

    Σ_{i=1}^{b−1} Df^{b−i}(x_i) ∂_u f(x_{i−1}) ≤ K(b − 1) ε^{1−1/α}.

Hence

(5.10)    ∂_u Ψ(x) = ∂_u f(f^{b−1} ∘ ψ(x)) + O(bε^{1−1/α}) ≈ 1.

We will now estimate ∂_v Φ. Let x ∈ U and let x_i = f^i ∘ φ(x). Similarly to the above, we have

    ∂_v Φ(x) = ∂_v f(x_{a−1}) + Σ_{i=1}^{a−1} Df^{a−i}(x_i) ∂_v f(x_{i−1}),

where ∂_v f(x) = −ψ′(Q₁(x))(1 − Q₁(x))/v. By the mean value theorem there exists ξ_i ∈ [x_i, 1] such that Df^{a−i}(ξ_i) = (1 − x_a)/(1 − x_i), since f^{a−i}(x_i) = x_a. From Lemma 3.1.10 it follows that Df^{a−i}(x_i) ≍ Df^{a−i}(ξ_i). Putting all of this together we get that the sum above is proportional to

    Σ_{i=1}^{a−1} Df^{a−i}(ξ_i)(1 − x_i) = (a − 1)(1 − x_a).

Thus

(5.11)    ∂_v Φ(x) ≍ −aε,

since x_a ∈ C and hence 1 − x_a = ε + O(|C|) ≈ ε.

We now have all the ingredients we need to compute M₁. Lemma 5.1.6 shows that

    |U| ∂_u u′ = ∂_u Q₀(c) − ∂_u Q₀(p) − Q₀′(p) ∂_u p − u′ (DΦ^{−1}(q) ∂_u q − DΦ^{−1}(p) ∂_u p).

Here we have used ∂_u Φ = 0. Now use (5.8) to get

    Q₀′(p) ∂_u p = − ∂_u Q₀(p) · Df^{a+1}(p)/(Df^{a+1}(p) − 1),        DΦ^{−1}(p) ∂_u p = − ∂_u Q₀(p)/(Df^{a+1}(p) − 1).

Thus

(5.12)    |U| ∂_u u′ = 1 + (1 − u′) ∂_u Q₀(p)/(Df^{a+1}(p) − 1) + u′ ∂_u Ψ(Q₁(q)) / (DΦ(Φ^{−1}(q))(Df^{b+1}(q) − 1)).

The last term is much smaller than one because of (5.10) and since |DΦ| ≫ 1 (and also Df^{b+1}(q) ≈ α/ε′ > α). From Lemma 5.1.6 we get

    |V| ∂_v v′ = ∂_v Q₁(q) + Q₁′(q) ∂_v q − ∂_v Q₁(c) − v′ (DΨ^{−1}(q) ∂_v q − DΨ^{−1}(p) ∂_v p).

Here we have used ∂_v Ψ = 0. Now use (5.9) to get

    Q₁′(q) ∂_v q = − ∂_v Q₁(q) · Df^{b+1}(q)/(Df^{b+1}(q) − 1),        DΨ^{−1}(q) ∂_v q = − ∂_v Q₁(q)/(Df^{b+1}(q) − 1).
Thus

(5.13)    |V| ∂_v v′ = 1 − (1 − v′) ∂_v Q₁(q)/(Df^{b+1}(q) − 1) − v′ ∂_v Φ(Q₀(p)) / (DΨ(Ψ^{−1}(p))(Df^{a+1}(p) − 1)).

The last term is much smaller than one by (5.11) and since |DΨ| ≫ 1 (and also Df^{a+1}(p) ≈ α/c′ > α). From Lemma 5.1.6 we get

    |U| ∂_v u′ = − Q₀′(p) ∂_v p − u′ (∂_v Φ^{−1}(q) + DΦ^{−1}(q) ∂_v q − ∂_v Φ^{−1}(p) − DΦ^{−1}(p) ∂_v p).

Let us prove that the dominating term is the one with ∂_v q. From (5.9) we get

    ∂_v q = − (∂_v Q₁(q)/Q₁′(q)) · Df^{b+1}(q)/(Df^{b+1}(q) − 1),

which diverges as b₋ → ∞, since |R|/ε → 0 and hence Q₁′(q) → 0 (by Proposition 3.1.13). From (5.9) and (5.11) we get that ∂_v p → 0, which shows that the last term is dominated by the term with ∂_v q. Now ∂_v Φ^{−1}(x) = −∂_v Φ(Φ^{−1}(x))/DΦ(Φ^{−1}(x)), which combined with (5.11) shows that the term with ∂_v q dominates the two terms with ∂_v Φ^{−1}. Furthermore

    Q₀′(p) ∂_v p = − (∂_v Φ(Q₀(p))/DΦ(Q₀(p))) · Df^{a+1}(p)/(Df^{a+1}(p) − 1),

which combined with (5.11) shows that the term with ∂_v q dominates the above term. Thus

(5.14)    |U| ∂_v u′ = u′ (DΨ(Ψ^{−1}(q))/DΦ(Φ^{−1}(q))) · ∂_v Q₁(q)/(Df^{b+1}(q) − 1) + e,

where the error term e is tiny compared with the other term on the right-hand side. From Lemma 5.1.6 we get

    |V| ∂_u v′ = Q₁′(q) ∂_u q − v′ (∂_u Ψ^{−1}(q) + DΨ^{−1}(q) ∂_u q − ∂_u Ψ^{−1}(p) − DΨ^{−1}(p) ∂_u p).

Let us prove that the dominating term is the one with ∂_u p. From (5.8) we get

    ∂_u p = − (∂_u Q₀(p)/Q₀′(p)) · Df^{a+1}(p)/(Df^{a+1}(p) − 1),

which diverges as b₋ → ∞, since |L|/c → 0 and hence Q₀′(p) → 0. From (5.8) and (5.10) we get that ∂_u q is bounded, and hence the ∂_u p term dominates the second term, involving ∂_u q. Now ∂_u Ψ^{−1}(x) = −∂_u Ψ(y)/DΨ(y), y = Ψ^{−1}(x), which combined with (5.10) shows that the ∂_u p term dominates the two terms involving ∂_u Ψ^{−1}. Furthermore

    Q₁′(q) ∂_u q = − (∂_u Ψ(Q₁(q))/DΨ(Q₁(q))) · Df^{b+1}(q)/(Df^{b+1}(q) − 1),

which combined with (5.10) shows that the ∂_u p term dominates the above term.
Thus

(5.15)    |V| ∂_u v′ = − v′ (DΦ(Φ^{−1}(p))/DΨ(Ψ^{−1}(p))) · ∂_u Q₀(p)/(Df^{a+1}(p) − 1) + e,

where the error term e is tiny compared with the other term on the right-hand side.

Corollary 5.1.10. If f ∈ K ∩ L_Ω, then det M₁ > 0 for b₋ large enough.

Proof. Let

    t = DΦ(Φ^{−1}(p)) DΨ(Ψ^{−1}(q)) / (DΦ(Φ^{−1}(q)) DΨ(Ψ^{−1}(p))).

From Lemma 3.1.10 and Proposition 3.1.13 we know that the distortions of Φ and Ψ tend to zero as b₋ → ∞. Hence t → 1. From Proposition 5.1.8 we get

    |U||V| det M₁ > 1 − t (u′v′/uv) · Q(p)(1 − Q(q)) / ((Df^{a+1}(p) − 1)(Df^{b+1}(q) − 1)).

Note that Df^{a+1}(p) ≈ α/c′ and Df^{b+1}(q) ≈ α/ε′. If u′, v′ ≥ 1/2, then ε′ ≪ 1 by Theorem 3.1.5 and so det M₁ > 0. If not, then we can estimate

    |U||V| det M₁ > 1 − (t/2uv) · c′ε′/((α − c′)(α − ε′)) > 1 − (t/2uv) · 1/(4(α − 1/2)²) > 0,

since u and v are close to one and t can be assumed to be close to one by the above.

Corollary 5.1.11. There exists k > 0 such that if f is as above, then

    ‖M₁x‖ ≥ k · min{|U|⁻¹, |V|⁻¹} · ‖x‖.

Proof. Write M₁ as

    M₁ = (  a/|U|   −b/|V| )
         ( −c/|U|    d/|V| ).

(Here we have used that the distortions of Φ and Ψ are small, so DΦ/DΨ ≍ |V|/|U|.) Then

    M₁^{−1} = (ad − bc)^{−1} ( d|U|   b|U| )
                             ( c|V|   a|V| ).

We are using the max-norm, hence

    ‖M₁^{−1}‖ = (ad − bc)^{−1} · max{(b + d)|U|, (c + a)|V|}.

It can be checked that (b + d)/(ad − bc) and (a + c)/(ad − bc) are bounded by some K. Let k = 1/K to finish the proof.

Proposition 5.1.12. If f ∈ K ∩ L_Ω, then

    ∂_c u′ ≍ −|C|⁻¹,    ∂_c v′ ≍ |C|⁻¹,    ∂_c c′ ≍ −c′ε′|C|⁻¹,
    ∂_u c′ ≍ c′ε′|U|⁻¹,    ∂_v c′ ≍ −c′ε′|V|⁻¹.

Proof. A straightforward calculation shows that

(5.16)    ∂_c Q₀(x)/Q₀′(x) = −x/c    and    ∂_c Q₁(x)/Q₁′(x) = −(1 − x)/(1 − c).

This together with Φ ∘ Q₀(p) = p, Ψ ∘ Q₁(q) = q, (5.1) and (5.5) gives

    ∂_c p = ((p/c) Df^{a+1}(p) − ∂_c Φ(Q₀(p))) / (Df^{a+1}(p) − 1),
    ∂_c q = (((1 − q)/ε) Df^{b+1}(q) − ∂_c Ψ(Q₁(q))) / (Df^{b+1}(q) − 1).
From (5.3) and (5.16) we get

    ∂_c Φ(x) = − (1/ε) Σ_{i=0}^{a−1} Df^{a−i}(x_i) · (1 − x_i),    x_i = f^i ∘ φ(x), x ∈ U,
    ∂_c Ψ(x) = − (1/c) Σ_{i=0}^{b−1} Df^{b−i}(x_i) · x_i,          x_i = f^i ∘ ψ(x), x ∈ V.

Using a similar argument as in the proof of Proposition 5.1.8, this shows that ∂_c Φ(x) ≍ −a and ∂_c Ψ(x) = −O(bε^{1−1/α}), and hence ∂_c p ≍ 1 and ∂_c q ≍ 1.

Now apply Lemma 5.1.6, using the fact that Φ^{−1}(p) = Q₀(p), to get

    |U| ∂_c u′ = −(1 − u′) ∂_c(Q₀(p)) − u′ ∂_c(Φ^{−1}(q)).

A calculation gives

    ∂_c(Q₀(p)) = (1/DΦ(Q₀(p))) · ((p/c) Df^{a+1}(p) − ∂_c Φ(Q₀(p))) / (Df^{a+1}(p) − 1)

and

    ∂_c(Φ^{−1}(q)) = (1/DΦ(Φ^{−1}(q))) · (∂_c q − ∂_c Φ(Φ^{−1}(q))).

(In particular, both terms have the same sign.) But DΦ(x) ≍ |C|/|U|, so this gives ∂_c u′ ≍ −|C|⁻¹. The proof that ∂_c v′ ≍ |C|⁻¹ is almost identical.

From Lemma 5.1.6 we get

    |C| ∂_c c′ = c′(1 − ∂_c q) + ε′(1 − ∂_c p),

and hence

    ∂_c c′ = − c′(1 − ∂_c Ψ(Q₁(q))) / (|C|(Df^{b+1}(q) − 1)) − ε′(1 − ∂_c Φ(Q₀(p))) / (|C|(Df^{a+1}(p) − 1)) + O(c′ε′/ε).

Now use Df^{b+1}(q) ≈ α/ε′ and Df^{a+1}(p) ≈ α/c′ to get ∂_c c′ ≍ −c′ε′|C|⁻¹. (Note that ∂_c Φ(x) < 0 and ∂_c Ψ(x) < 0.)

Apply Lemma 5.1.6 to get |C| ∂_u c′ = −c′ ∂_u q − ε′ ∂_u p. This and the proof of Proposition 5.1.8 show that

    ∂_u c′ = ( c′ (Df^{a+1}(p) − 1) ∂_u Ψ(Q₁(q)) + ε′ (Df^{b+1}(q) − 1) DΦ(Q₀(p)) ∂_u Q₀(p) ) / ( |C| (Df^{a+1}(p) − 1)(Df^{b+1}(q) − 1) ).

Since c′(Df^{a+1}(p) − 1) ≍ α − c′, ε′(Df^{b+1}(q) − 1) ≍ α − ε′, |∂_u Ψ| ≪ |DΦ|, and ∂_u Q₀(p) ≈ 1, this shows that

    ∂_u c′ ≍ c′ε′ DΦ(Q₀(p))/|C| ≍ c′ε′/|U|.

The proof that ∂_v c′ ≍ −c′ε′|V|⁻¹ is almost identical.

Notation. We need some new notation to state the remaining propositions.
Each pure map φ_σ in the decomposition φ̄ can be identified with a real number which we denote s_σ ∈ R, and each ψ_τ in the decomposition ψ̄ can be identified with a real number t_τ ∈ R:

    R ∋ s_σ ↔ φ_σ = φ̄(σ) ∈ Q,    R ∋ t_τ ↔ ψ_τ = ψ̄(τ) ∈ Q.

We put primes on these numbers to denote that they come from the renormalization, so s′_{σ′} ∈ R is identified with φ̄′(σ′) and t′_{τ′} ∈ R is identified with ψ̄′(τ′). Note that σ, σ′ are used to denote times for φ̄, φ̄′, and τ, τ′ are used to denote times for ψ̄, ψ̄′, respectively.

Proposition 5.1.13. There exists K such that if f̄ ∈ K̄ ∩ L̄_Ω, then

    |∂_u s′_{σ′}| ≤ K|s′_{σ′}|/|U|,    |∂_v s′_{σ′}| ≤ K|s′_{σ′}|/|V|,    |∂_c s′_{σ′}| ≤ K|s′_{σ′}|/|C|,
    |∂_u t′_{τ′}| ≤ K|t′_{τ′}|/|U|,    |∂_v t′_{τ′}| ≤ K|t′_{τ′}|/|V|,    |∂_c t′_{τ′}| ≤ K|t′_{τ′}|/|C|.

Proof. We will compute ∂_v s′_{σ′}; the other calculations are almost identical. There are four cases to consider, depending on which time in the decomposition φ̄′ we are looking at:

    (1) φ̄′(σ′) = Z(φ_σ; I),    (2) φ̄′(σ′) = Z(ψ_τ; I),    (3) φ̄′(σ′) = Z(Q₀; I),    (4) φ̄′(σ′) = Z(Q₁; I).

In each case let I = [x, y] and let T : I → C be the ‘transfer map’ to C. This means that T = f^i ∘ γ for some i, where γ is a partial composition (e.g. γ = O_{≥σ}(φ̄) in case 1) or a pure map (in cases 3 and 4).

In case 1 Lemma 5.1.7 gives

    ∂_v s′_{σ′} = (Nφ_σ(y)/DT(y)) (∂_v q − ∂_v T(y)) − (Nφ_σ(x)/DT(x)) (∂_v p − ∂_v T(x)).

By Lemma A.2.16, Nφ_σ(y) = Nφ′_{σ′}(1)/|I| and hence

    Nφ_σ(y)/DT(y) ≍ (Nφ′_{σ′}(1)/|I|) / (|C|/|I|) ≍ s′_{σ′}/|C|.

Here we used that the nonlinearity of φ′_{σ′} does not change sign, so that s′_{σ′} = ∫ Nφ′_{σ′}, and that ∫ Nφ′_{σ′} ≈ Nφ′_{σ′}(1) since the nonlinearity is close to being constant (which is true since φ̄′ is pure and has very small norm). We now need to estimate ∂_v T, but this can very roughly be bounded by ∂_v Φ since ∂_v T(y) = ∂_v f₁^i(γ(y)), so the estimate that was used for ∂_v Φ in the proof of Proposition 5.1.8 can be employed.
From the same proof we thus get that ∂_v q dominates both ∂_v p and ∂_v T. The above arguments show that

    ∂_v s′_{σ′} ≍ (s′_{σ′}/|C|) ∂_v q ≍ − (s′_{σ′}/|C|) · DΨ(Q₁(q))/(Df^{b+1}(q) − 1) ≍ − (s′_{σ′}/|V|) · 1/(Df^{b+1}(q) − 1).

This concludes the calculations for case 1. Case 2 is almost identical to case 1. Case 4 differs in that Lemma 5.1.7 now gives two extra terms:

    ∂_v s′_{σ′} = (NQ₁(y)/DT(y)) (∂_v q − ∂_v T(y)) − (NQ₁(x)/DT(x)) (∂_v p − ∂_v T(x)) + ∂_v(Q₁′)(y)/Q₁′(y) − ∂_v(Q₁′)(x)/Q₁′(x).

However, ∂_v(Q₁′)(x)/Q₁′(x) = 1/v independently of x, so the last two terms cancel. The rest of the calculations go exactly like in case 1. Case 3 is similar to case 4.

Remark 5.1.14. A key point in the above proof is that deformation in a decomposition direction is monotone. This is what allowed us to estimate the partial derivatives of the ‘transfer map’ T by the partial derivatives of Φ or Ψ.

Proposition 5.1.15. There exist K and ρ > 0 such that if f̄ ∈ K̄ ∩ L̄_Ω, then

    |∂_⋆ u′| ≤ Kε^ρ/|C|,    |∂_⋆ v′| ≤ Kε^ρ/|C|,    |∂_⋆ c′| ≤ Kc′ε′ε^ρ/|C|,
    |∂_⋆ s′_{σ′}| ≤ Kε^ρ |s′_{σ′}|/|C|,    |∂_⋆ t′_{τ′}| ≤ Kε^ρ |t′_{τ′}|/|C|,

for ⋆ ∈ {s_σ, t_τ}.

Proof. Let us first consider ∂_{s_σ}, that is, deformations in the direction of φ_σ. Since φ_σ is pure we can use (4.1) to compute

(5.17)    ∂_{s_σ} φ_σ(x) ≍ −x(1 − x).

From (5.5) we get

    ∂_{s_σ} p = − ∂_{s_σ}Φ(Q₀(p))/(Df^{a+1}(p) − 1)    and    ∂_{s_σ} q = − ∂_{s_σ}Ψ(Q₁(q))/(Df^{b+1}(q) − 1),

so the first thing to do is to calculate the partial derivatives of Φ and Ψ. Let x ∈ U; then

    ∂_{s_σ}Φ(x) = ∂_{s_σ}(f₁^a ∘ O_{>σ}(φ̄) ∘ φ_σ ∘ O_{<σ}(φ̄))(x) = D(f₁^a ∘ O_{>σ}(φ̄))(O_{≤σ}(φ̄)(x)) · ∂_{s_σ}φ_σ(O_{<σ}(φ̄)(x)).

Note that we have used that f₁ does not depend on s_σ. From (5.17) we thus get that

(5.18)    |∂_{s_σ}Φ(x)| ≤ K′ · DΦ(x)(1 − x) ≤ Kε.

Let x ∈ V and let x_i = f₀^i ∘ ψ(x). As in the proof of Proposition 5.1.8 we have

    ∂_{s_σ}Ψ(x) = ∂_{s_σ}f₀(x_{b−1}) + Σ_{i=1}^{b−1} Df₀^{b−i}(x_i) ∂_{s_σ}f₀(x_{i−1}).

From (5.17) we get

    |∂_{s_σ}f₀(x_{i−1})| = |D(O_{>σ}(φ̄))(O_{≤σ}(φ̄) ∘ Q₀(x_{i−1}))| · |∂_{s_σ}φ_σ(O_{<σ}(φ̄) ∘ Q₀(x_{i−1}))| ≤ K|x_i|.

Using the same estimate as in the proof of Proposition 5.1.8, this shows that

(5.19)    |∂_{s_σ}Ψ(x)| ≤ K′(1 − x_b) + O(bε^{1−1/α}) = O(bε^{1−1/α}).

We can now argue as in the proof of Proposition 5.1.8 to find bounds on ∂_{s_σ}⋆ for ⋆ ∈ {u′, v′, c′}. From Lemma 5.1.6 we get

    ∂_{s_σ}u′ = ((1 − u′)/|U|) · (∂_{s_σ}Φ(Q(p))/DΦ(Q(p))) · Df^{a+1}(p)/(Df^{a+1}(p) − 1) + (u′/|U|) · (∂_{s_σ}Φ(Φ^{−1}(q)) − ∂_{s_σ}q)/DΦ(Φ^{−1}(q)),

    ∂_{s_σ}v′ = −((1 − v′)/|V|) · (∂_{s_σ}Ψ(Q(q))/DΨ(Q(q))) · Df^{b+1}(q)/(Df^{b+1}(q) − 1) − (v′/|V|) · (∂_{s_σ}Ψ(Ψ^{−1}(p)) − ∂_{s_σ}p)/DΨ(Ψ^{−1}(p)),

    ∂_{s_σ}c′ = ( c′ · ∂_{s_σ}Ψ(Q₁(q))/(Df^{b+1}(q) − 1) + ε′ · ∂_{s_σ}Φ(Q₀(p))/(Df^{a+1}(p) − 1) ) / |C|.

Use that DΦ ≍ |C|/|U|, DΨ ≍ |C|/|V|, Df^{a+1}(p) ≍ α/c′ and Df^{b+1}(q) ≍ α/ε′ to finish the estimates for ∂_{s_σ}u′, ∂_{s_σ}v′ and ∂_{s_σ}c′. Note that bε^r → 0 for any r > 0, so it is clear from (5.18) and (5.19) that we can find ρ > 0 such that |∂_{s_σ}Φ| < Kε^ρ and |∂_{s_σ}Ψ| < Kε^ρ.

In order to find bounds for ⋆ ∈ {s′_{σ′}, t′_{τ′}} we argue as in the proof of Proposition 5.1.13. The last two terms from Lemma 5.1.7 are slightly different (when nonzero). In this case they are given by

    ∂_{s_σ}(Dφ_σ)(y)/Dφ_σ(y) − ∂_{s_σ}(Dφ_σ)(x)/Dφ_σ(x).

Using (5.17) we can calculate this difference. For |s_σ| ≪ 1 it is close to y − x, which turns out to be negligible. All other details are exactly like the proof of Proposition 5.1.13.

The estimates for ∂_{t_τ} are handled similarly. The only difference is in the estimates of the partial derivatives of Φ and Ψ. These can be determined by arguing as in the above and the proof of Proposition 5.1.8, which results in

(5.20)    |∂_{t_τ}Φ(x)| ≤ Kε^{1−1/α}    and    |∂_{t_τ}Ψ(y)| ≤ Kaε,

for x ∈ U and y ∈ V. The remaining estimates are handled identically to the above.
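The elementary 2 × 2 linear algebra behind Corollary 5.1.11 can be replayed numerically: for the operator norm induced by the max-norm on R², ‖A‖ is the maximal absolute row sum, which gives exactly the expression for ‖M₁⁻¹‖ used in that proof. The entries a, b, c, d and the lengths |U|, |V| below are hypothetical positive values with ad − bc > 0:

```python
# Corollary 5.1.11, sketched numerically (hypothetical entries and lengths):
a, b, c, d = 1.1, 0.3, 0.2, 0.9
U, V = 0.05, 0.02           # |U|, |V|

M1 = [[ a / U, -b / V],
      [-c / U,  d / V]]

det = a * d - b * c
# inverse of M1, written as in the proof: (ad-bc)^{-1} * [[d|U|, b|U|], [c|V|, a|V|]]
M1inv = [[d * U / det, b * U / det],
         [c * V / det, a * V / det]]

# operator norm induced by the max-norm = maximal absolute row sum
row_sum = lambda M: max(abs(M[i][0]) + abs(M[i][1]) for i in (0, 1))
norm_inv = row_sum(M1inv)
assert abs(norm_inv - max((b + d) * U, (c + a) * V) / det) < 1e-12

# hence ||M1 x|| >= ||x|| / ||M1^{-1}|| >= k * min(1/|U|, 1/|V|) * ||x||
k = det / max(b + d, a + c)
print(norm_inv, 1 / norm_inv >= k * min(1 / U, 1 / V) - 1e-9)
```

The final inequality is exactly the lower bound of the corollary, with k = 1/K for K bounding (b + d)/(ad − bc) and (a + c)/(ad − bc).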
5.2 Archipelagos in the parameter plane

The term archipelago was introduced by Martens and de Melo (2001) to describe the structure of the domains of renormalizability in the parameter plane for families of Lorenz maps. In this section we show how the information we have on the derivative of the renormalization operator can be used to prove that the structure of archipelagos must be very rigid.

Fix c∗, φ∗, ψ∗ and let F : [0, 1]² → L denote the associated family of Lorenz maps

    (u, v) = λ ↦ F_λ = (u, v, c∗, φ∗, ψ∗).

We will assume that: (i) Sφ∗ < 0 and Sψ∗ < 0, (ii) Dist φ∗ ≤ δ and Dist ψ∗ ≤ δ, and (iii) ε₋ ≤ 1 − c∗ ≤ ε₊. These conditions ensure that F_λ ∈ K. The notation Ω, K, δ, ε₋ and ε₊ is introduced in Section 3.1.

Definition 5.2.1. An archipelago A_ω ⊂ [0, 1]² of type ω ∈ Ω is the set of λ such that F_λ is ω–renormalizable. An island of A_ω is the interior of a connected component of A_ω.

For the family λ ↦ F_λ we have the following very strong structure theorem for archipelagos (this should be contrasted with Martens and de Melo, 2001). Note that c∗, φ∗ and ψ∗ are arbitrary, so the results in this section hold for any family which satisfies conditions (i) to (iii) above.

Theorem 5.2.2. For every ω ∈ Ω there exists a unique island I such that the archipelago A_ω equals the closure of I. Furthermore, I is diffeomorphic to a square.

Remark 5.2.3. This theorem shows that the structure of A_ω is very rigid. Note that the structure of archipelagos is much more complicated in general: there may be multiple islands, islands need not be square, there may be isolated points, etc.

Corollary 5.2.4. For every ω̄ ∈ Ω^N there exists a unique λ such that F_λ has combinatorial type ω̄. The set of all such λ is a Cantor set.

Proof. By Theorem 5.2.2 there exists a unique sequence of nested squares (by a square we mean any set diffeomorphic to the unit square) I₀ ⊃ I₁ ⊃ I₂ ⊃ ⋯ such that λ ∈ I_k implies that F_λ is renormalizable of type (ω₀, …, ω_{k−1}).
We contend that the relative diameter of I_{k+1} inside I_k is uniformly bounded (from above and from below). Otherwise the a priori bounds would give us a subsequence {λ_{k(j)} ∈ I_{k(j)}} such that R^{k(j)} F_{λ_{k(j)}} converges (in L⁰) to a renormalizable Lorenz map with a critical value on the boundary. This is a contradiction, since a map cannot be renormalizable if it has critical values in the boundary. We conclude that the intersection ⋂ I_k is a point. The bound on the relative diameters is uniform in both ω and F_λ, since Ω is finite and K is relatively compact. It now follows from a standard argument that the union of the above intersections over all combinatorial types in Ω^N is a Cantor set.

The family F_λ is monotone, by which we mean that u ↦ F_{(u,v)}(x) is strictly increasing for x ∈ (0, c∗), and v ↦ F_{(u,v)}(x) is strictly decreasing for x ∈ (c∗, 1). As a consequence, if we let

    M⁺_{(u,v)} = {(x, y) | x ≥ u, y ≤ v}    and    M⁻_{(u,v)} = {(x, y) | x ≤ u, y ≥ v},

then

    µ ∈ M⁺_λ ⟹ F_µ(x) > F_λ(x)    and    µ ∈ M⁻_λ ⟹ F_µ(x) < F_λ(x),

for all x ∈ (0, 1) \ {c}. In other words, deformations in M⁺_λ move both branches up, and deformations in M⁻_λ move both branches down. This simple observation is key to analyzing the structure of archipelagos.

Definition 5.2.5. Let π_S : R³ → R² be the projection which takes the rectangle [c, 1] × [1 − c, 1] × {c} onto S = [1/2, 1]²,

    π_S(x, y, c) = ( 1 − (1 − x)/(2(1 − c)), 1 − (1 − y)/(2c) ),

and let H be the map which takes (u, v, c, φ, ψ) to the heights of its branches (c is kept around because π_S needs it),

    H(u, v, c, φ, ψ) = (φ(u), 1 − ψ(1 − v), c).

Now define R : A_ω → S by R(λ) = π_S ∘ H ∘ R(F_λ).

Remark 5.2.6. The action of R can be understood by looking at Figure 5.1. The boundary of an island I is mapped into the boundary of the wedge W by the map H ∘ R.
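The formula for π_S, as reconstructed above from the extracted text, can be sanity-checked numerically: it should map the rectangle [c, 1] × [1 − c, 1] × {c} onto S = [1/2, 1]², corners to corners. A small sketch with a hypothetical value of c:

```python
# pi_S(x, y, c) = (1 - (1-x)/(2(1-c)), 1 - (1-y)/(2c)); this should take the
# rectangle [c,1] x [1-c,1] x {c} onto S = [1/2,1]^2 (cf. Definition 5.2.5).
def pi_S(x, y, c):
    return (1 - (1 - x) / (2 * (1 - c)), 1 - (1 - y) / (2 * c))

c = 0.7  # hypothetical critical point
corners = [(c, 1 - c), (c, 1.0), (1.0, 1 - c), (1.0, 1.0)]
expected = [(0.5, 0.5), (0.5, 1.0), (1.0, 0.5), (1.0, 1.0)]
for (x, y), (ex, ey) in zip(corners, expected):
    px, py = pi_S(x, y, c)
    assert abs(px - ex) < 1e-12 and abs(py - ey) < 1e-12
print("corner check passed")
```

The corners of the rectangle land on the corners of [1/2, 1]², consistent with the stated range of π_S.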
The four boundary pieces of the wedge correspond to the renormalization having at least one full or trivial branch. Note that the image of ∂I in ∂W will not in general lie in a plane; instead it will be bent around somewhat. For this reason we project down to the square S via the projection π_S. This gives us the final operator R : A_ω → S.

Proposition 5.2.7. Let I ⊂ A_ω be an island. Then R is an orientation-preserving diffeomorphism that takes the closure of I onto S.

Remark 5.2.8. This already shows that the structure of archipelagos is very rigid. First of all every island is full, but there is also exactly one of each type of extremal point, and exactly one of each type of vertex. In other words, there are no degenerate islands of any type! Extremal points and vertices are defined in Martens and de Melo (2001); see also the caption of Figure 5.2.

[Figure 5.1: Illustration of the action of R on the family F_λ. The wedge W has its vertex at (c∗, 1 − c∗, c∗), with axes labelled c₁⁻, 1 − c₁⁺ and c. The dark gray island is mapped onto a set which is wrapped around the wedge W under H ∘ R ∘ G. That is, the boundary of the island is mapped into the boundary of W with nonzero degree. Note that in this illustration we project the image of R to R³ via the map H. The maps H and G convert between critical values (c₁⁻, c₁⁺) and (u, v)–parameters. Explicitly, G(c₁⁻, 1 − c₁⁺, c∗) = (φ∗⁻¹(c₁⁻), 1 − ψ∗⁻¹(c₁⁺), c∗, φ∗, ψ∗).]

Proof. By definition R maps I into S and ∂I into ∂S. We claim that DR_λ is orientation-preserving for every λ ∈ cl I (the notation DR_λ is used to denote the derivative of R at the point λ). Assume that the claim holds (we will prove this soon). We contend that R maps cl I onto S. If not, then R(∂I) must be strictly contained in ∂S, since the boundaries are homeomorphic to the circle and R is continuous. But then DR_λ must be singular for some λ ∈ ∂I, which contradicts the claim. Hence R : cl I → S maps a simply connected domain onto a simply connected domain, and DR is a local isomorphism. Thus R is in fact a diffeomorphism.
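The monotonicity of the family F_λ used throughout this section is easy to check on a concrete example. The sketch below uses the toy branch form F_{(u,v)}(x) = u(x/c)^α for x < c and 1 − v((1 − x)/(1 − c))^α for x > c (identity diffeomorphic parts; a hypothetical instance of a family satisfying (i)–(iii), not the thesis's exact normalization):

```python
# Toy monotone family (hypothetical normalization, for illustration only):
alpha, c = 2.0, 0.5

def F(u, v, x):
    if x < c:                      # left branch: strictly increasing in u
        return u * (x / c) ** alpha
    return 1 - v * ((1 - x) / (1 - c)) ** alpha   # right branch: moves up as v decreases

lam = (0.7, 0.8)
mu_plus  = (0.75, 0.7)   # in M^+_lambda : x >= u, y <= v  -> both branches move up
mu_minus = (0.6, 0.9)    # in M^-_lambda : x <= u, y >= v  -> both branches move down

for x in [0.1, 0.3, 0.6, 0.9]:
    assert F(*mu_plus, x)  > F(*lam, x)
    assert F(*mu_minus, x) < F(*lam, x)
print("monotonicity checks passed")
```

Deforming into M⁺ raises both branches and deforming into M⁻ lowers them, which is exactly the observation used in the proofs of Lemma 5.2.12 and Theorem 5.2.2 below.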
We now prove the claim. A computation gives

    Dπ_S(x, y, c) = ( (2(1 − c))⁻¹    0          * )
                    ( 0               (2c)⁻¹     * )

and

    DH_{(u,v,c,φ,ψ)} = ( φ′(u)    0            ⋯ )
                       ( 0        ψ′(1 − v)    ⋯ )
                       ( *        *            ⋯ ).

The top-left 2 × 2 matrix is orientation-preserving in both cases, and the same is true for DR by Corollary 5.1.10. Thus DR_λ is orientation-preserving.

[Figure 5.2: Illustration of a full island for the family F_λ, with corners λ₁, …, λ₄ and the corresponding renormalizations RF_{λ₁}, …, RF_{λ₄} indicated; the parameter region has corners (φ∗⁻¹(c∗), 1 − ψ∗⁻¹(c∗)) and (1, 1). The boundary corresponds to when at least one branch of the renormalization RF_λ is either full or trivial. The top right and bottom left corners are extremal points; the top left and bottom right corners are vertices.]

Lemma 5.2.9. Assume f^m(c₁⁻) = c = f^n(c₁⁺) for some m, n > 0. Let (l, c) and (c, r) be branches of f^m and f^n, respectively. Then f^m(l) ≤ l and f^n(r) ≥ r. In particular, f is renormalizable to a map with trivial branches.

Proof. In order to reach a contradiction, assume that f^m(l) > l. Then f^{im}(l) ↑ x for some point x ∈ (l, c] as i → ∞, since f^m(c₁⁻) = c. Since l is the left endpoint of a branch, there exists t such that f^t(l) = c₁⁺. Hence f^{m−t}(c₁⁺) = l, so the orbit of c₁⁺ contains the orbit of l. But the orbit of c₁⁺ was periodic by assumption, which contradicts f^{im}(l) ↑ x. Hence f^m(l) ≤ l. Now repeat this argument for r to complete the proof.

Definition 5.2.10. Define

    γ⁻_triv = {λ ∈ [0, 1]² | F_λ^{a+1}(c∗⁻) = c∗ and F_λ^i(c∗⁻) > c∗, i = 1, …, a},
    γ⁺_triv = {λ ∈ [0, 1]² | F_λ^{b+1}(c∗⁺) = c∗ and F_λ^i(c∗⁺) < c∗, i = 1, …, b}.

(The notation here is g(c∗⁻) = lim_{x↑c∗} g(x) and g(c∗⁺) = lim_{x↓c∗} g(x).)

Lemma 5.2.11. The set γ⁻_triv is the image of a curve v ↦ (g(v), v). The map g is differentiable and takes [1 − ψ∗⁻¹(c∗), 1] into [φ∗⁻¹(c∗), 1).
Similarly, γ⁺triv is the image of a curve u ↦ (u, h(u)), where h is differentiable and takes [φ∗⁻¹(c∗), 1] into [1 − ψ∗⁻¹(c∗), 1).

Proof. Define g(v) = φ∗⁻¹ ∘ (ψ∗ ∘ Q₁)⁻ᵃ(c∗) and h(u) = 1 − ψ∗⁻¹ ∘ (φ∗ ∘ Q₀)⁻ᵇ(c∗). Note that Q₁ depends on v and Q₀ depends on u, so g and h are well-defined maps. It can now be checked that these maps define γ⁻triv and γ⁺triv.

Lemma 5.2.12. Assume that γ⁻triv crosses γ⁺triv and let λ ∈ γ⁻triv ∩ γ⁺triv. Then the crossing is transversal, and there exists ρ > 0 such that if r < ρ, then the complement of γ⁻triv ∪ γ⁺triv inside the ball B_r(λ) consists of four components and exactly one of these components is contained in the archipelago Aω.

Proof. To begin with, assume that the crossing is transversal, so that the complement of γ⁻triv ∪ γ⁺triv in B_r(λ) automatically consists of four components for r small enough. Note that γ⁻triv ∪ γ⁺triv does not intersect M⁺λ ∪ M⁻λ \ {λ}. Hence precisely one component will have a boundary point µ ∈ γ⁻triv such that γ⁺triv intersects M⁺µ. Denote this component by N. Note that if we move from µ inside N ∩ M⁺µ then the left critical value of the return map moves above the diagonal. If we move in N ∩ M⁺µ from a point in γ⁺triv then the right critical value of the return map moves below the diagonal. By Lemma 5.2.9, Fλ is renormalizable and, moreover, the periodic points pλ and qλ that define the return interval of Fλ are hyperbolic repelling by the minimum principle. Hence, if we deform Fλ into N it will still be renormalizable, since N consists of µ such that F_µ^{a+1}(c⁻) is above the diagonal and F_µ^{b+1}(c⁺) is below the diagonal. By choosing r small enough, all of N will be contained in Aω. Note that if we deform into any other component (other than N), then at least one of the critical values of the return map will be on the wrong side

Figure 5.3: Illustration of the proof of Theorem 5.2.2.
Both λ and µ must be in the boundary of islands, which lie inside the shaded areas. These two islands have opposite orientation, which is impossible.

of the diagonal, and hence the corresponding map is not renormalizable. Thus only the component N intersects Aω.

Now assume that the crossing is not transversal. Then we may pick λ in the intersection γ⁻triv ∩ γ⁺triv so that it is on the boundary of an island (by the above argument). But then λ must be at a transversal intersection, since islands are square by Proposition 5.2.7 and the curves γ⁻triv and γ⁺triv are differentiable. Hence every crossing is transversal.

Proof of Theorem 5.2.2. From Proposition 5.2.7 we know that every island must contain an extremal point which renormalizes to a map with only trivial branches, and hence every island must be adjacent to a crossing between the curves γ⁻triv and γ⁺triv. We claim that there can be only one such crossing, and hence uniqueness of islands follows. Note that there is always at least one island, by Proposition 3.3.6.

By Lemma 5.2.11, γ⁻triv and γ⁺triv terminate in the upper and right boundary of [0, 1]², respectively. Let λ be the crossing nearest the points of termination in these boundaries. Let E be the component of the complement of γ⁻triv ∪ γ⁺triv in [0, 1]² that contains the point (1, 1). The geometric configuration of γ⁻triv and γ⁺triv is such that E must contain the piece of Aω adjacent to λ as in Lemma 5.2.12. To see this, use the fact that deformations in the cones M⁺λ move both branches of Fλ up.

In order to reach a contradiction, assume that there exists another crossing µ between γ⁻triv and γ⁺triv (see Figure 5.3). By Lemma 5.2.12 there is an island attached to this crossing, but the configuration of γ⁻triv and γ⁺triv at µ is such that this island is oriented opposite to the island inside E.
But R is orientation-preserving, so both islands must be oriented the same way and hence we reach a contradiction. The conclusion is that there can be no more than one crossing between γ⁻triv and γ⁺triv, as claimed.

Finally, the entire archipelago equals the closure of the island, since the derivative of R is nonsingular at every point in the archipelago. Hence every point in the archipelago must either be contained in an island or lie on the boundary of an island.

5.3 Invariant cone field

A standard way of showing hyperbolicity of a linear map is to find an invariant cone field with expansion inside the cones and contraction in the complement of the cones. In this section we show that the derivative of the renormalization operator has an invariant cone field and that it expands these cones. However, our estimates on the derivative are not sufficient to prove contraction in the complement of the cones, so we cannot conclude that the derivative is hyperbolic. The results in this section are used in Section 5.4 to construct unstable manifolds in the limit set of renormalization.

Let H(f̄, κ) = {(x, y) | ‖y‖ ≤ κ‖x‖} denote the standard horizontal κ–cone on the tangent space at f̄ ∈ K̄Ω, where

K̄Ω = { f̄ ∈ K̄ ∩ L̄Ω | c1+(Rf̄) ≤ 1/2 ≤ c1−(Rf̄) }.

As always, K and Ω are the same as in Section 3.1. Recall that we decompose the tangent space into a two-dimensional subspace with coordinate x and a codimension-two subspace with coordinate y. The x–coordinate corresponds to the (u, v)–subspace in K̄Ω. We use the max-norm, so if z = (x, y) then ‖z‖ = max{‖x‖, ‖y‖}.

Proposition 5.3.1. Assume f̄ ∈ K̄Ω and define

κ⁻(f̄) = K⁻ · max{ε, Dist φ̄, Dist ψ̄}  and  κ⁺(f̄) = K⁺ · min{ |C|/|U|, |C|/|V| }.

It is possible to choose K⁺, K⁻ (not depending on f̄) such that if κ ≤ κ⁺(f̄), then

DR_f̄ H(f̄, κ) ⊂ H(Rf̄, κ⁻(Rf̄)).

In particular, the cone field f̄ ↦ H(f̄, 1) is mapped strictly into itself by DR.

Remark 5.3.2.
Note that as b⁻ increases, κ⁻ ↓ 0 and κ⁺ ↑ ∞. Thus a fatter and fatter cone is mapped into a thinner and thinner cone. In particular, the invariant subspaces inside the thin cone and the complement of the fat cone eventually line up with the coordinate axes.

Proof. Assume ‖y‖ ≤ κ‖x‖. Let z′ = Mz, where z′ = (x′, y′) and z = (x, y). Then

‖x′‖/‖y′‖ ≥ ( ‖M₁x‖ − ‖M₂‖‖y‖ ) / ( ‖M₃x‖ + ‖M₄‖‖y‖ ) ≥ ( ‖M₁(x/‖x‖)‖ − κ‖M₂‖ ) / ( ‖M₃(x/‖x‖)‖ + κ‖M₄‖ ).

We are interested in a lower bound on ‖x′‖/‖y′‖, so this shows that we need to minimize

g(x) = ( ‖M₁x‖ − κ‖M₂‖ ) / ( ‖M₃x‖ + κ‖M₄‖ ),

subject to the constraint ‖x‖ = max{|x₁|, |x₂|} = 1. By Proposition 5.1.8, asymptotically

‖M₁x‖ = max{ | x₁/|U| − ε′x₂/(α|V|) | , | x₁/((α−1)|U|) − x₂/|V| | }.

Here we have used that f̄ ∈ K̄Ω, and hence 1 − u′ ≍ 1, 1 − v′ ≍ 1, Df^{a+1}(p) ≍ α, Df^{b+1}(q) ≍ α/ε′, and DΦ(x)/DΨ(y) ≍ |V|/|U| for x ∈ U, y ∈ V. Let ρ′ = max{ε′, Dist φ̄′, Dist ψ̄′}. By Theorem 5.1.2,

‖M₃x‖ ≤ ρ′ ( A|x₁|/|U| + B|x₂|/|V| ),

so we are led to minimize

g₁(t) = ( max{ 1/|U| − ε′t/(α|V|), 1/((α−1)|U|) − t/|V| } − κ‖M₂‖ ) / ( ρ′( A/|U| + Bt/|V| + κ‖M₄‖/ρ′ ) )

(corresponding to x₁ = ±1) and

g₂(t) = ( max{ t/|U| − ε′/(α|V|), t/((α−1)|U|) − 1/|V| } − κ‖M₂‖ ) / ( ρ′( At/|U| + B/|V| + κ‖M₄‖/ρ′ ) )

(corresponding to x₂ = ±1) over t ∈ [0, 1]. Note that we have to assume that α(α − 1) > ε′ here, or the numerators might be zero (we can ensure that this is the case by increasing b⁻ if necessary). These maps are piecewise monotone, so the minimum is attained at a boundary point. The points t = 0, t = 1 are boundary points of both g₁ and g₂. Let

t₀ = (|V|/|U|) · ((2 − α)/(α − 1)) · (α/(α − ε′))  and  t₁ = (|V|/|U|) · (α/(α − 1)) · (α/(α + ε′)).

If α < 2, then t₀ is a boundary point of g₁ if t₀ < 1; otherwise t₀⁻¹ is a boundary point of g₂. For α > 1, if t₁ < 1 then t₁ is a boundary point of g₁, else t₁⁻¹ is a boundary point of g₂.
It is now routine to check the values of the gᵢ at these boundary points. Instead of writing down all the calculations, let us just do one:

g₂(0) = ( |V|⁻¹ − κ‖M₂‖ ) / ( ρ′( B|V|⁻¹ + κ‖M₄‖/ρ′ ) ) ≥ ( 1 − κK₂|V|/|C| ) / ( ρ′( B + κK₄|V|/|C| ) ).

Hence, if for example κ < |C|/(2K₂|V|), then g₂(0) ≥ (ρ′(2B + K₄/K₂))⁻¹. The other boundary points lead to similar conclusions, but perhaps with |U|/|C| instead of |V|/|C| dictating the choice of κ. Now define K⁺ as the smallest constant in the bound on κ, and define K⁻ as the largest constant next to ρ′ that comes out of the evaluations at the boundary points.

Proposition 5.3.3. Let f̄ ∈ K̄Ω. Then DR is strongly expanding on the cone field f̄ ↦ H(f̄, 1). Specifically, there exists k > 0 (not depending on f̄) such that

‖DR_f̄ z‖ ≥ k · min{|U|⁻¹, |V|⁻¹} · ‖z‖,  ∀z ∈ H(f̄, 1) \ {0}.

Proof. Use Corollary 5.1.11 to get

‖Mz‖ ≥ ‖M₁x + M₂y‖ ≥ ( k · min{|U|⁻¹, |V|⁻¹} − ‖M₂‖ ) · ‖x‖.

Now use the fact that ‖z‖ = ‖x‖ for z ∈ H(f̄, 1) to finish the proof.

5.4 Unstable manifolds

The norm used on the tangent space does not give good enough estimates to see a contracting subspace, so we cannot quite prove that the limit set of R is hyperbolic. However, these estimates did give an expanding invariant cone field, and in this section we show how this yields unstable manifolds at each point of the limit set. Instead of trying to appeal to the stable and unstable manifold theorem for dominated splittings to get local unstable manifolds, we directly construct global unstable manifolds by using all the information we have about the renormalization operator and its derivative. This is done by defining a graph transform and showing that it contracts a suitable metric, similarly to the Hadamard proof of the stable and unstable manifold theorem. We are only able to show that the resulting graphs are C¹, since we do not have hyperbolicity.
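The cone-field mechanism recalled above can be illustrated in the simplest possible setting, a single linear map on ℝ². The matrix below is a made-up example, not one of the renormalization derivatives; it only demonstrates the two properties used in Sections 5.3–5.4: the horizontal cone H(1) is mapped strictly into a thinner cone, and every vector inside the cone is expanded in the max-norm.

```python
# Toy illustration (made-up matrix, not a renormalization derivative) of
# the cone-field criterion: the map sends the horizontal cone
# H(1) = {|y| <= |x|} strictly into the thinner cone H(0.2) and expands
# every vector inside H(1) in the max-norm, as in the text.

M = ((3.0, 0.1),
     (0.1, 0.2))

def apply(M, x, y):
    return (M[0][0] * x + M[0][1] * y, M[1][0] * x + M[1][1] * y)

for i in range(21):                  # sample the cone H(1)
    y = -1.0 + 0.1 * i               # vectors (1, y) with |y| <= 1
    x2, y2 = apply(M, 1.0, y)
    assert abs(y2) <= 0.2 * abs(x2)  # image lies in the thinner cone
    assert max(abs(x2), abs(y2)) >= 2.0  # expansion in the max-norm
```

No contraction is claimed in the complement of the cone, matching the situation in the text where only expansion inside the cones is available.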
Our proof is an adaptation of the proof of Theorem 6.2.8 in Katok and Hasselblatt (1995).

Definition 5.4.1. Let AΩ be as in Definition 4.2.9 and define the limit set of renormalization for types in Ω by

ΛΩ = AΩ ∩ L̄Ω^ℕ.

Remark 5.4.2. Here L̄Ω^ℕ denotes the set of infinitely renormalizable maps with combinatorial type in Ω^ℕ, and AΩ can intuitively be thought of as the attractor for R. The set Ω is the same as in Section 3.1, as always. Note that ΛΩ ⊂ [0, 1]² × (0, 1) × Q̄², where Q̄ denotes the set of pure decompositions, see Definition 4.1.14.

Theorem 5.4.3. For every f̄ = (u, v, c, φ̄, ψ̄) ∈ ΛΩ there exists a unique global unstable manifold W^u(f̄). The unstable manifold is a graph

W^u(f̄) = { (λ, σ(λ)) | λ ∈ I },

where σ : I → (0, 1) × Q̄ × Q̄ is κ–Lipschitz for some κ ≪ 1 (not depending on f̄). The domain I is essentially given by

π( R(L̄ω) ∩ ([0, 1]² × {c} × {φ̄} × {ψ̄}) ),

where π is the projection onto the (u, v)–plane, and ω is defined by f̄ being in the image R(L̄ω). Additionally, W^u is C¹.

Remark 5.4.4. Note that, in stark contrast to the situation in the 'regular' stable and unstable manifold theorem, we get global unstable manifolds which are graphs, and these are almost completely straight due to the Lipschitz constant being very small. The statement about the domain I is basically that I is "as large as possible." This will be elaborated on in the proof.

Another thing to note is that we cannot say anything about the uniqueness of f̄ ∈ ΛΩ for a given combinatorics. That is, given ω̄ = (…, ω₋₁, ω₀, ω₁, …) we cannot prove that there exists a unique f̄ ∈ ΛΩ realizing this combinatorics. Instead we see a foliation of the set of maps with type ω̄ by unstable manifolds. If we had a hyperbolic structure on ΛΩ this problem would go away.

Corollary 5.4.5. Let f̄ ∈ ΛΩ and let ω̄ ∈ Ω^ℕ.
Then W^u(f̄) intersects the set of infinitely renormalizable maps of combinatorial type ω̄ in a unique point, and the union of all such points over ω̄ ∈ Ω^ℕ is a Cantor set.

Proof. Theorem 5.4.3 shows that the unstable manifolds are straight (see the above remark), and hence Lemma 5.4.6 enables us to apply the same arguments as in Corollary 5.2.4.

Lemma 5.4.6. There exists κ close to 1 such that if γ : [0, 1]² → (0, 1) × Q̄² is κ–Lipschitz and graph γ ⊂ K̄, then L̄ω ∩ graph γ is diffeomorphic to a square, for every ω ∈ Ω.

Proof. By Theorem 5.2.2 the set L̄ω ∩ K̄ is a tube for every ω ∈ Ω. Take a tangent vector at a point in the image of ∂L̄ω ∩ K̄. Such a tangent will lie in the complement of a cone Hκ = {‖y‖ ≤ κ‖x‖} for κ < 1 close to 1, since the projection of the image of a tube to the (u, v, c)–subspace will look like a (slightly deformed) cut-off part of the wedge in Figure 5.1. By Proposition 5.3.1, R⁻¹ maps the complement of Hκ into itself and hence every tube "lies in the complement of Hκ". That is, a tangent vector at a point in the boundary of a tube lies in the complement of Hκ, so the tubes cut the (u, v)–plane at an angle which is smaller than 1/κ. Now if we choose κ as above, then the graph of γ will also intersect every tube at an angle. Hence the intersection is diffeomorphic to a square. The main point here is that with κ chosen properly, γ cannot 'fold over' a tube and in such a way create an intersection which is not simply connected.

Proof of Theorem 5.4.3. The proof is divided into three steps: (1) definition of the graph transform Γ; (2) showing that Γ is a contraction; (3) proof of C¹–smoothness of the unstable manifold.

Step 1. From Proposition 3.1.13 we know that the critical values of any map in K̄ are uniformly close to 1, so there exists µ ≪ 1 such that if we define the 'block'

B̄ = ([1 − µ, 1]² × (0, 1) × Q̄²) ∩ K̄,

then L̄Ω ∩ K̄ ⊂ B̄, 1 − µ > φ⁻¹(c) and µ > ψ⁻¹(c) for all (u, v, c, φ, ψ) ∈ O(B̄).
In other words, the block B̄ is defined so that it contains all maps in K̄ which are renormalizable of type in Ω, and the square [1 − µ, 1]² is contained in the projection of the image R(L̄Ω ∩ K̄) onto the (u, v)–plane.

Fix f̄₀ ∈ ΛΩ and κ ∈ (κ⁻, 1), where κ⁻ is the supremum of κ⁻(f̄) defined in Proposition 5.3.1 and κ is small enough that Lemma 5.4.6 applies. Associated with f̄₀ are two bi-infinite sequences {ωᵢ}_{i∈ℤ} and {f̄ᵢ}_{i∈ℤ} such that R_{ωᵢ} f̄ᵢ = f̄ᵢ₊₁ for all i ∈ ℤ. Now define Gᵢ, the "unstable graphs centered on f̄ᵢ," as the set of κ–Lipschitz maps γᵢ : [1 − µ, 1]² → (0, 1) × Q̄² such that graph γᵢ ⊂ B̄ and γᵢ(λᵢ) = (cᵢ, φ̄ᵢ, ψ̄ᵢ), where f̄ᵢ = (λᵢ, cᵢ, φ̄ᵢ, ψ̄ᵢ). Let G = ∏ᵢ Gᵢ.

We will now define a metric on G. Let

dᵢ(γᵢ, θᵢ) = sup_{λ ∈ [1−µ,1]², λ ≠ λᵢ} ‖γᵢ(λ) − θᵢ(λ)‖ / ‖λ − λᵢ‖,  γᵢ, θᵢ ∈ Gᵢ,

and define

d(γ, θ) = sup_{i∈ℤ} dᵢ(γᵢ, θᵢ),  γ, θ ∈ G.

This metric turns (G, d) into a complete metric space. Note that it is not enough to simply use a C⁰–metric, since we do not have a contracting subspace of DR. The denominator in the definition of dᵢ is thus necessary to turn the graph transform into a contraction.

We can now define the graph transform Γ : G → G for f̄₀. Let γᵢ ∈ Gᵢ and define Γᵢ(γᵢ) to be the γ′ᵢ₊₁ ∈ Gᵢ₊₁ such that

graph γ′ᵢ₊₁ = R_{ωᵢ}(graph γᵢ ∩ L̄_{ωᵢ}) ∩ B̄.

Let us discuss why this is a well-defined map Γᵢ : Gᵢ → Gᵢ₊₁. Lemma 5.4.6 shows that R_{ωᵢ}(graph γᵢ ∩ L̄_{ωᵢ}) is the graph of some map defined on a simply connected set I ⊂ ℝ², with values in (0, 1) × Q̄². That I ⊃ [1 − µ, 1]² is a consequence of how B̄ was chosen. Finally, this map is κ–Lipschitz by Proposition 5.3.1.

Actually, we have cheated a little bit here, since Proposition 5.3.1 is stated for maps satisfying the extra condition c1+(Rf) ≤ 1/2 ≤ c1−(Rf). In defining the graph transform we should intersect L̄_{ωᵢ} with the set defined by this condition before mapping it forward by R_{ωᵢ}.
Otherwise we do not have enough information to deduce that the entire image is κ–Lipschitz as well. However, this problem is artificial: we could have chosen the constant 1/2 closer to 1 and still obtained the invariant cone field. All this means is that the domain I of the theorem is slightly smaller than it should be (we have to cut out a small part of the graph where v is very close to 0, but v is still allowed to range all the way up to 1, so this amounts to a very small part of the domain). This is one reason why we say that "I is essentially given by …" in the statement of the theorem. The other reason is that the intersection with R(L̄ω) should be taken with a surface at a small angle, not a surface which is parallel to the (u, v)–plane.

The graph transform is now defined by

Γ(γ) = ( Γᵢ(γᵢ) )_{i∈ℤ},  γ = {γᵢ}_{i∈ℤ} ∈ G.

We claim that Γ is a contraction on (G, d), and hence the contraction mapping theorem implies that Γ has a unique fixed point γ* ∈ G. The global unstable manifolds along {f̄ᵢ} are then given by

W^u(f̄ᵢ₊₁) = graph Γᵢ(γᵢ*),  ∀i ∈ ℤ.

In particular, this proves existence and uniqueness of the global unstable manifold at f̄₀. That these are the global unstable manifolds is a consequence of L̄Ω ∩ K̄ ⊂ B̄. Furthermore, the Lipschitz constant for these graphs is much smaller than 1, since we can pick κ close to κ⁻. Again, we are cheating a little bit here, since we have to cut out a small part of the domain of the graph as discussed above.

Step 2. We now prove that Γ is a contraction. The focus will be on Γᵢ for now, and to avoid clutter we drop subscripts on elements of Gᵢ and Gᵢ₊₁. Pick γ, θ ∈ Gᵢ and let γ′ = Γᵢ(γ) and θ′ = Γᵢ(θ). Note that γ′, θ′ ∈ Gᵢ₊₁.

We write Rf̄ = (A(λ, η), B(λ, η)), where f̄ = (λ, η), λ ∈ ℝ² and A(λ, η) ∈ ℝ². Let A_γ(λ) = A(λ, γ(λ)) and similarly B_γ(λ) = B(λ, γ(λ)). With this notation the action of Γᵢ is given by

(λ, γ(λ)) ↦ (A_γ(λ), B_γ(λ)) = (λ′, γ′(λ′)).
Hence

dᵢ₊₁(γ′, θ′) = sup_{λ′} ‖γ′(λ′) − θ′(λ′)‖ / ‖λ′ − λᵢ₊₁‖ = sup_{A_γ(λ)} ‖γ′∘A_γ(λ) − θ′∘A_γ(λ)‖ / ‖A_γ(λ) − A_γ(λᵢ)‖.

Recall that the notation here is (λᵢ, γ(λᵢ)) = f̄ᵢ and (λᵢ₊₁, γ′(λᵢ₊₁)) = f̄ᵢ₊₁. The last numerator can be estimated by

‖γ′∘A_γ(λ) − θ′∘A_γ(λ)‖ ≤ ‖γ′∘A_γ(λ) − θ′∘A_θ(λ)‖ + ‖θ′∘A_γ(λ) − θ′∘A_θ(λ)‖
  ≤ ‖B_γ(λ) − B_θ(λ)‖ + κ‖A_γ(λ) − A_θ(λ)‖
  ≤ ( ‖M₄‖ + κ‖M₂‖ ) · ‖γ(λ) − θ(λ)‖.

The denominator can be bounded by Proposition 5.3.3:

‖A_γ(λ) − A_γ(λᵢ)‖ ≥ k · min{|U|⁻¹, |V|⁻¹} · ‖λ − λᵢ‖.

Thus

dᵢ₊₁(γ′, θ′) ≤ ( ( ‖M₄‖ + κ‖M₂‖ ) / ( k · min{|U|⁻¹, |V|⁻¹} ) ) · dᵢ(γ, θ) = ν dᵢ(γ, θ).

Theorem 5.1.2 shows that ν ≪ 1, uniformly in the index i. Hence Γ is a (very strong) contraction.

Step 3. Going from Lipschitz to C¹ smoothness of the unstable manifold is a standard argument. See for example Katok and Hasselblatt (1995, Chapter 6.2).

Part II: Existence of a hyperbolic renormalization fixed point

Chapter 6. Computer assisted proof

This chapter contains excerpts from (Winckler, 2010). The main result of this chapter is that the renormalization operator has a hyperbolic fixed point of combinatorial type (01, 100)∞, which is proved using the contraction mapping theorem on an associated operator. We use a computer to rigorously compute estimates to show that this associated operator is indeed a contraction. This method was pioneered by Lanford (1982) when he proved the existence of a fixed point of the period-doubling operator on unimodal maps (see also Lanford, 1984). However, Lanford's paper only gives a brief outline of the method he employs without an actual proof, so we have gone through quite a lot of pains to include all the missing details (many of which were borrowed from Koch et al., 1996).

6.1 Existence of a hyperbolic fixed point

We choose a different set of coordinates for Lorenz maps in this part in order to simplify the implementation of the computer estimates.
Instead of keeping the domain fixed and letting the critical point vary, we fix the critical point at 0 and let the domain vary. We also choose the domain to be the smallest invariant domain which contains the critical point. Lastly, instead of considering C^k maps we only consider maps whose branches are restrictions of analytic maps.

Figure 6.1: A Lorenz map in the "smallest invariant domain coordinates" of Definition 6.1.1. This is the actual graph of the fixed point of Theorem 6.1.3.

Definition 6.1.1. A Lorenz map f on a closed interval I = [l, r], where l < 0 < r, is a monotone increasing continuous function from I \ {0} to I such that f(0⁻) = r, f(0⁺) = l (see Figure 6.1). We require that f(x) = ϕ(|x|^α) for all x ∈ (l, 0), where ϕ is a symmetric analytic map defined on some complex neighborhood of [l, 0], and similarly f(x) = ψ(|x|^α) for x ∈ (0, r), where ψ is a symmetric analytic map defined on some complex neighborhood of [0, r]. (Here 'symmetric' means ϕ(z̄) = ϕ̄(z), where bar denotes complex conjugation.)

The definition of the renormalization operator for this choice of coordinates is almost identical to the definition in Section 2.1, so we avoid restating it here. However, we would like to make one important remark regarding the choice of smoothness.

Remark 6.1.2. When defined on the space of Lorenz maps with analytic branches, the renormalization operator is differentiable and its derivative is a compact linear operator. This follows from the fact that Rf only evaluates f on a strict subset of the domain of f (see Sections 7.4.4 and 7.4.5). On the other hand, if we were to demand only C^r–smoothness of the branches of our Lorenz maps, then R would no longer be differentiable; see de Faria et al. (2006) and de Melo and van Strien (1993, Ch. VI.1.1).
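Definition 6.1.1 is concrete enough to sketch directly. The snippet below builds a toy Lorenz map in these coordinates; the branch maps phi, psi and the endpoints l, r are made-up illustrations (not the fixed point of Theorem 6.1.3), chosen only so that the boundary conditions f(0⁻) = r, f(0⁺) = l hold and both branches are increasing.

```python
# Sketch (not from the thesis): a toy Lorenz map in the coordinates of
# Definition 6.1.1, with critical exponent alpha = 2.  phi, psi, l, r
# are made-up stand-ins chosen to satisfy the boundary conditions.

l, r = -1.0, 0.5
alpha = 2.0

def phi(u):                # analytic map defining the left branch
    return r - 1.2 * u     # phi(0) = r; decreasing in u = |x|^alpha

def psi(u):                # analytic map defining the right branch
    return l + 4.0 * u     # psi(0) = l; increasing in u

def f(x):
    if x < 0:
        return phi(abs(x) ** alpha)
    elif x > 0:
        return psi(abs(x) ** alpha)
    raise ValueError("f is undefined at the critical point 0")

# Boundary behaviour required by the definition: f(0-) = r, f(0+) = l.
eps = 1e-9
assert abs(f(-eps) - r) < 1e-6 and abs(f(eps) - l) < 1e-6
# Monotone increasing on each side of the critical point:
assert f(-1.0) < f(-0.5) < f(-eps) and f(eps) < f(0.25) < f(0.5)
```

Note how the apparent decrease of phi in u is exactly what makes x ↦ phi(|x|^α) increasing on (l, 0), since u = |x|^α decreases as x increases towards 0.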
We now state the main theorem of this part.

Theorem 6.1.3. Let ω = (01, 100). The restricted renormalization operator Rω acting on the space of Lorenz maps with critical exponent α = 2 has a hyperbolic fixed point.

Remark 6.1.4. The fixed point of Theorem 6.1.3 is the simplest nonunimodal fixed point of R. By this we mean that if ω = (01, 10), then the fixed point of the period-doubling operator on unimodal maps corresponds to a fixed point for Rω, as follows. Let g : [−1, 1] → [−1, 1] be the fixed point of the period-doubling operator, normalized so that g(0) = 1. Then g is an even map that satisfies the Cvitanović–Feigenbaum functional equation

g(x) = −λ⁻¹ g²(λx),  λ = −g(1).

Now define a Lorenz map f by f|[−1,0) = g and f|(0,1] = −g. It is easy to check that the first-return map f̃ to U = [−λ, λ] is f̃ = f² and that U is maximal. Thus

Rf(x) = λ⁻¹ f̃(λx) = { −λ⁻¹ g²(λx) = g(x) if x < 0;  λ⁻¹ g²(λx) = −g(x) if x > 0 },

which shows that f is a fixed point of Rω.

6.2 Consequences

The existence of a hyperbolic renormalization fixed point has very strong dynamical consequences, some of which we briefly survey here. Throughout this section let R denote the restricted renormalization operator Rω, where ω = (01, 100), and let f⋆ denote the fixed point of Theorem 6.1.3.

Corollary 6.2.1 (Stable manifold). There exists a local stable manifold W^s_loc at f⋆ consisting of maps in a neighbourhood of f⋆ which under iteration of R remain in this neighbourhood and converge at an exponential rate to f⋆.

Figure 6.2: Illustration of the dynamical intervals of generations 0, 1, 2 for renormalization of type ω = (01, 100).

The local stable manifold extends to a global stable manifold W^s consisting of maps which converge to f⋆ under iteration of R. If f ∈ W^s then f is infinitely renormalizable.

Proof.
The existence of a stable manifold is a direct consequence of the stable and unstable manifold theorem. If f converges to f⋆, then R^k f is defined for all k > 0, which is the same as saying that f is infinitely renormalizable.

We now turn to studying the dynamical properties of maps on the stable manifold. Let f ∈ W^s. The times of closest return (a_n, b_n) are given by the recursion

a_{n+1} = a_n + b_n,  a₁ = 2,
b_{n+1} = 2a_n + b_n,  b₁ = 3.

These determine the first-return interval U_n = cl{L_n ∪ R_n} for the n–th renormalization, by

L_n = [ f^{b_n}(0⁺), 0 ),  R_n = ( 0, f^{a_n}(0⁻) ].

In other words, the first-return map f̃_n to U_n is given by f̃_n(x) = f^{a_n}(x) if x ∈ L_n and f̃_n(x) = f^{b_n}(x) if x ∈ R_n. Define

L_n^k = f^k(L_n), k = 0, …, a_n − 1,  and  R_n^k = f^k(R_n), k = 0, …, b_n − 1.

The collection of these intervals (over k) forms a pairwise disjoint collection for each n, called the intervals of generation n (see Figure 6.2).

Theorem 6.2.2 (Cantor attractor). If f ∈ W^s then the closure of the critical orbits of f is a measure zero Cantor set Λ_f which attracts almost every point in the domain of f.

Proof. The critical orbits make up the endpoints of the dynamical intervals {L_n^k} ∪ {R_n^k}, so Λ_f is contained in

(6.1)  ⋂_n cl{E_n ∪ F_n},  where E_n = ⋃_k L_n^k and F_n = ⋃_k R_n^k.

Note that E_{n+1} ⊂ E_n and F_{n+1} ⊂ F_n. First assume that f = f⋆. Then the first-return maps f̃_n are all equal to f itself (up to a linear change of coordinates), so the total lengths of E_n and F_n shrink at an exponential rate (the position of U_{n+1} inside U_n is the same for all n, so we can apply the Macroscopic and Infinitesimal Koebe principles as in the proof of the "real bounds" in de Melo and van Strien, 1993). Hence the intersection (6.1) is a measure zero Cantor set, and consequently Λ_f is as well.

Now, if f is an arbitrary map in W^s then R^n f converges to f⋆. In other words, the first-return maps {f̃_n} converge to f⋆
(up to a linear change of coordinates). Now use the same arguments as above. Finally, the above arguments can be adapted to prove that f satisfies the weak Markov property as in Theorem 3.2.2, and hence the basin of Λ_f has full measure.

Theorem 6.2.3 (Rigidity). If f, g ∈ W^s then there exists a homeomorphism h : Λ_f → Λ_g conjugating f and g on their respective Cantor attractors. If furthermore f, g ∈ W^s_loc, then h extends to a C^{1+α} diffeomorphism on the entire domain of f.

Proof. Define h(f^n(0⁻)) = g^n(0⁻) and h(f^n(0⁺)) = g^n(0⁺). This extends continuously to a map on Λ_f as in the proof of de Melo and van Strien (1993, Proposition VI.1.4). If f, g ∈ W^s_loc then there exist C > 0 and λ < 1 such that d(f_n, g_n) < Cλⁿ, so we can use an argument similar to that in de Melo and van Strien (1993, Theorem VI.9.4) to prove the second statement.

Remark 6.2.4 (Universality). The second conclusion of Theorem 6.2.3 is a strong version of what is known as "metric universality": the small scale geometric structure of the Cantor attractor does not depend on the map itself, only on the combinatorial type and the critical exponent. That is, if we take two maps f, g ∈ W^s_loc and zoom in around the same spot on their Cantor attractors, then their structures are almost identical, since a differentiable map (i.e. the extended h) is almost linear if one zooms in closely enough. For example, the limit of |L_{n+1}|/|L_n| as n → ∞ exists and is independent of f (it equals the ratio |L₂(f⋆)|/|L₁(f⋆)| for f⋆). More generally, the multifractal spectrum (and the Hausdorff measure in particular) of Λ_f does not depend on f (only on f⋆).

6.3 Outline of the computer assisted proof

In this section we give a brief outline of the method of proof and of how to calculate rigorous estimates with a computer.
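Before turning to the rigorous machinery, the correspondence of Remark 6.1.4 above can be illustrated with a quick non-rigorous check: a truncated polynomial approximation of the period-doubling fixed point g should approximately satisfy the Cvitanović–Feigenbaum equation g(x) = −λ⁻¹g²(λx) with λ = −g(1). The coefficients below are approximations as commonly quoted in the literature, and the tolerance is a loose guess for this truncation; both are assumptions, and nothing here is a rigorous bound.

```python
# Non-rigorous sanity check of the Cvitanovic-Feigenbaum equation using
# a truncated even polynomial approximation of the period-doubling
# fixed point g (approximate coefficients from the literature).

coeffs = [1.0, -1.527633, 0.104815, 0.026706, -0.003527]  # of x^0, x^2, ...

def g(x):
    x2 = x * x
    s, p = 0.0, 1.0
    for c in coeffs:
        s += c * p
        p *= x2
    return s

lam = -g(1.0)
assert 0.35 < lam < 0.45          # lam is approximately 0.3995

for k in range(11):               # sample points in [0, 1]
    x = k / 10
    residual = g(x) + g(g(lam * x)) / lam
    assert abs(residual) < 0.05   # small residual for this truncation
```

The rigorous framework of the next subsections replaces exactly this kind of floating-point spot check with verified interval bounds.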
6.3.1 Method of proof

Given a Fréchet differentiable operator T with compact derivative on a Banach space X of analytic functions, we would like to prove that T has a hyperbolic fixed point. The main tool is the following consequence of the contraction mapping theorem:

Proposition 6.3.1. Let Φ be a Fréchet differentiable operator on a Banach space X, let f₀ ∈ X, and let B_r(f₀) ⊂ X be the closed ball of radius r centered on f₀. If there are positive numbers ε, θ such that

1. ‖DΦ_f‖ < θ for all f ∈ B_r(f₀),
2. ‖Φf₀ − f₀‖ < ε,
3. ε < (1 − θ)r,

then there exists f⋆ ∈ B_r(f₀) such that Φf⋆ = f⋆, and Φ has no other fixed points inside B_r(f₀). Furthermore, ‖f⋆ − f₀‖ < ε/(1 − θ).

Our strategy is to find a good approximation f₀ of a fixed point of T and then use a computer to verify that the conditions on r, ε, θ hold if r is chosen small enough. Unfortunately, this is not possible for T itself, since in our case it is not a contraction, so first we have to turn T into a contraction without changing the set of fixed points. This is done by using Newton's method to solve the equation Tf − f = 0, which results in the iteration

f ↦ f − (DT_f − I)⁻¹(Tf − f),

where I denotes the identity operator on X. The operator we use is a slight simplification of this, namely

Φf = f − (Γ − I)⁻¹(Tf − f) = (Γ − I)⁻¹(Γ − T)f,

where Γ is a finite-rank linear approximation of DT_{f₀} (chosen so that Γ − I is invertible). The operator Φ is a contraction if f₀ and Γ are chosen carefully. Note that Φf = f if and only if Tf = f, so once we verify that the conditions of Proposition 6.3.1 hold for Φ, it follows that T has a fixed point.

To prove hyperbolicity we need to do some extra work. The derivative of Φ is

DΦ_f = (Γ − I)⁻¹(Γ − DT_f).

At this stage we will have already checked that the norm of this is bounded from above by 1.
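Once rigorous upper bounds θ for ‖DΦ_f‖ on the ball and ε for the defect ‖Φf₀ − f₀‖ are in hand, verifying the hypotheses of Proposition 6.3.1 reduces to comparing three numbers. The sketch below shows that final comparison; the bound values in the usage example are made-up placeholders, not the ones produced in the thesis.

```python
# Sketch of the final, finite check behind Proposition 6.3.1: given
# rigorous bounds theta > ||D Phi_f|| on B_r(f0) and eps > ||Phi f0 - f0||,
# the three hypotheses certify a unique fixed point in B_r(f0).

def contraction_conditions_hold(theta_bound, eps_bound, r):
    """True if the bounds certify a unique fixed point in B_r(f0)."""
    return (
        theta_bound < 1.0                        # Phi contracts on the ball
        and eps_bound < (1.0 - theta_bound) * r  # ball is mapped into itself
    )

def fixed_point_error(theta_bound, eps_bound):
    """Upper bound on ||f_star - f0|| from the proposition."""
    return eps_bound / (1.0 - theta_bound)

# Hypothetical placeholder bounds theta < 0.2, eps < 1e-5:
assert contraction_conditions_hold(0.2, 1e-5, r=1e-3)
assert fixed_point_error(0.2, 1e-5) < 1e-3
# With a ball that is too small the criterion fails:
assert not contraction_conditions_hold(0.2, 1e-5, r=1e-6)
```

In the actual proof, theta_bound and eps_bound are themselves produced by the interval computations of Section 6.3.2, so that the comparisons above are rigorous.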
By strengthening this estimate to

‖(Γ − e^{it}I)⁻¹(Γ − DT_f)‖ < 1,  ∀t ∈ ℝ, ∀f ∈ B_r(f₀),

we also get that DT_{f⋆} is hyperbolic at the fixed point f⋆. To see this, assume that e^{it} is an eigenvalue of DT_{f⋆} with eigenvector h, normalized so that ‖h‖ = 1. Then

‖(Γ − e^{it}I)⁻¹(Γ − DT_{f⋆})h‖ = ‖(Γ − e^{it}I)⁻¹(Γ − e^{it}I)h‖ = ‖h‖ = 1,

which is impossible. Since DT was assumed to be compact, we know that the spectrum is discrete, so the lack of eigenvalues on the unit circle implies hyperbolicity.

(Here is how to choose f₀ and Γ: use the Newton iteration on polynomials of some fixed degree to determine f₀ and set Γ = DT_{f₀}. The hardest part is finding an initial guess such that the iteration converges.)

6.3.2 Rigorous computer estimates

In order to verify the above estimates on a computer we are faced with two fundamental problems: (i) arithmetic operations on real numbers are carried out with finite precision, which leads to rounding problems; (ii) the space of analytic functions is infinite dimensional, so any representation of an analytic function needs to be truncated. The general idea for dealing with these problems is to compute with sets which are guaranteed to contain the exact result, instead of computing with points: real numbers are replaced with intervals, and analytic functions are replaced with rectangle sets A₀ × ⋯ × A_k × {C} representing all functions of the form

{ a₀ + ⋯ + a_k z^k + z^d h(z) | a_j ∈ A_j, j = 0, …, k, ‖h‖ ≤ C },

where the {A_j} are intervals. This takes care of the truncation problem, and the rounding problem is taken care of, roughly, by "rounding outwards" (lower bounds are rounded down, upper bounds are rounded up). Once these set representations have been chosen, we lift operations on points to operations on sets. Since the form of these sets is most likely not preserved by such operations, this lifting involves finding bounds by sets of the chosen form (e.g.
if F and G are rectangle sets of analytic functions and we want to lift composition of functions, then we have to find a rectangle set which contains the set { f ∘ g | f ∈ F, g ∈ G }). Section 7.2 contains all the details for computing with intervals, and Section 7.4 contains all the details for computing with rectangle sets of analytic functions.

Let us make one final remark concerning the evaluation of the operator norm of a linear operator L on the space of analytic functions. In order to get good enough bounds on the estimate of the operator norm we use the ℓ¹–norm on the Taylor coefficients of analytic functions. The reason for this is that estimating the operator norm with

‖L‖ = sup_{‖f‖≤1} ‖Lf‖

will usually result in very bad estimates. With the ℓ¹–norm, if we think of L as an infinite matrix (in the basis {z^k}), the operator norm is found by taking the supremum over the norms of the columns of this matrix, that is,

‖L‖ = sup_{k≥0} ‖Lξ_k‖,  ξ_k(z) = z^k.

Evaluating the norms of columns gives much better estimates, and for this reason we choose this norm. See Section 7.4.11 for the specifics.

6.4 The proof

First we restate the definition of the restricted renormalization operator, then we change coordinates and restate Theorem 6.1.3.

6.4.1 Definition of the operator

From now on we fix the domain of our Lorenz maps to some interval [−1, r]. The right endpoint cannot be fixed, since it generally changes under renormalization (we will soon change coordinates so that the domain is fixed). Instead of dealing with functions with a discontinuity, we represent a Lorenz map F by a pair (f, g), with f : [−1, 0] → [−1, r], f(0) = r, and g : [0, r] → [−1, r], g(0) = −1. With this notation, the first-return map to some interval U will be of the type (F^a, F^b)|_U if F is renormalizable.
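The correspondence between a combinatorial type and the branch compositions of the first-return map (F^a, F^b)|_U can be sketched directly: reading a word over {0, 1} as an itinerary, 0 means "apply the left branch f" and 1 means "apply the right branch g". The branch functions below are arbitrary affine stand-ins, used only to check the order of composition.

```python
# Sketch: build the branches of a first-return map from the words of a
# combinatorial type, reading 0 as "apply f" and 1 as "apply g".  The
# functions f, g are arbitrary stand-ins; only composition order matters.

def word_to_map(word, f, g):
    def first_return(x):
        for symbol in word:          # leftmost symbol acts first
            x = f(x) if symbol == "0" else g(x)
        return x
    return first_return

f = lambda x: 0.5 * x + 0.3          # stand-in for the left branch
g = lambda x: -0.8 * x - 0.1         # stand-in for the right branch

left = word_to_map("01", f, g)       # equals g o f, so a = 2
right = word_to_map("100", f, g)     # equals f o f o g, so b = 3

for x in (-0.7, -0.2, 0.1, 0.4):
    assert abs(left(x) - g(f(x))) < 1e-12
    assert abs(right(x) - f(f(g(x)))) < 1e-12
```

For ω = (01, 100) this reproduces the return times a = 2, b = 3 and the compositions (g ∘ f, f ∘ f ∘ g) used in the next subsection.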
For the type ω = (01, 100) we can be more precise: in this case a = 2, b = 3 and the first-return map is of the form (g ∘ f, f ∘ f ∘ g)|_U if F is renormalizable. Let T denote the restricted renormalization operator R_ω, and fix the critical exponent α = 2. If T(f, g) = (f̂, ĝ) then T is defined by

f̂(z) = λ⁻¹ g ∘ f(λz), ĝ(z) = λ⁻¹ f ∘ f ∘ g(λz), λ = −f²(−1),

where f² denotes the second iterate of f.

6.4.2 Changing coordinates

To ensure the correct normalization (g(0) = −1) and the correct critical exponent (α = 2) we make two coordinate changes and calculate how the operator T transforms. We also carefully choose the domain of T so that all compositions are well-defined (e.g. λz is in the domain of f, etc.). This is checked automatically by the computer (and also shows that T is differentiable with compact derivative, since f and g are analytic). Finally, it is important to realize that the choice of coordinates may greatly affect the operator norm of the derivative; not every choice will give a good enough estimate.

The domain of T is chosen to be contained in the set of Lorenz maps (f, g) with representation f(z) = φ(z²) and g(z) = ψ(z²), where φ and ψ have domains {z : |z − 1| < s} and {z : |z| < t}, respectively (the constants s and t will soon be specified). Rewriting T in terms of φ and ψ gives

φ̂(z) = λ⁻¹ ψ(φ(λ²z)²),
ψ̂(z) = λ⁻¹ φ(φ(ψ(λ²z)²)²),
λ = −φ(φ(1)²).

This coordinate change ensures the correct critical exponent. The next coordinate change is to fix the normalization and also to bring the domain of all functions to the unit disk. Fixing the normalization has the benefit that the error involved in the evaluation of λ is minimized (since we only need to evaluate f close to z = 0, see Section 7.4.8). Changing all domains to the unit disk simplifies the implementation of the computer estimates.

Definition 6.4.1.
Define X to be the Banach space of symmetric (with respect to the real axis) analytic maps on the unit disk with finite ℓ¹-norm. That is, if f ∈ X then f(z) = ∑ aₖz^k with aₖ ∈ ℝ and ‖f‖ = ∑ |aₖ| < ∞.

Definition 6.4.2. Define Y = X × X with the norm ‖(f, g)‖_Y = ‖f‖_X + ‖g‖_X and with linear structure defined by α(f, g) + β(f′, g′) = (αf + βf′, αg + βg′).

Clearly Y is a Banach space (since X is). Change coordinates from φ, ψ to (f, g) ∈ Y (note that f and g are not the same as above) as follows:

φ(z) = f([z − 1]/s), ψ(z) = −1 + z · g(z/t),

where we will choose s = 2.2 and t = 0.5. Rewriting T in terms of f and g gives

f̂(w) = λ⁻¹ { −1 + f(λ²(w + 1/s) − 1/s)² · g( (1/t) f(λ²(w + 1/s) − 1/s)² ) },

ĝ(w) = (tw)⁻¹ { 1 + λ⁻¹ f( (1/s) f( (1/s) λ²tw·g(λ²w)·(λ²tw·g(λ²w) − 2) )² − 1/s ) },

λ = −f([f(0)² − 1]/s).

This is the final form of the operator that will be studied.

6.4.3 Computing the derivative

In order to simplify the computation of the derivative of T we break the computation down into several steps as follows:

p_f(w) = λ²·(w + s⁻¹) − s⁻¹        p_g(w) = λ²w
f₁ = f ∘ p_f                        g₁ = g ∘ p_g
f₂ = f₁²                            g₂ = t · p_g · g₁
f₃ = f₂/t                           g₃ = g₂·(g₂ − 2)/s
f₄ = g ∘ f₃                         g₄ = f ∘ g₃
f₅ = −1 + f₂·f₄                     g₅ = (g₄² − 1)/s
f₆ = f₅/λ                           g₆ = f ∘ g₅
                                    g₇ = g₆/λ
                                    g₈(w) = (g₇(w) + 1)/(t·w)

With this notation we have that T(f, g) = (f₆, g₈). Note that g₇(w) + 1 is a function with zero as constant coefficient, so in the implementation of g₈ we will not actually divide by w; instead we will ‘shift’ the coefficients to the left. It is now fairly easy to derive expressions for the derivative.
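Before turning to the derivative, the step decomposition can be sanity-checked numerically: with plain Doubles and hypothetical smooth stand-ins for f and g (so none of the domain or normalization conditions hold), the pipeline f₁, …, f₆ must agree with the closed-form expression for f̂ from Section 6.4.2.

```haskell
-- Sanity check: step decomposition vs. closed form for f-hat, with
-- hypothetical stand-ins for f and g (plain Doubles, no rigor).
s, t :: Double
s = 2.2
t = 0.5

f, g :: Double -> Double
f x = exp x + 0.5    -- hypothetical stand-in for f
g x = sin x          -- hypothetical stand-in for g

lam :: Double
lam = - f ((f 0 ^ 2 - 1) / s)    -- lambda = -f([f(0)^2 - 1]/s)

-- f-hat via the steps p_f, f1, f2, f3, f4, f5, f6:
fhatSteps :: Double -> Double
fhatSteps w = f5 / lam
  where
    pf = lam ^ 2 * (w + 1 / s) - 1 / s
    f1 = f pf
    f2 = f1 ^ 2
    f4 = g (f2 / t)      -- f4 = g(f3), f3 = f2/t
    f5 = -1 + f2 * f4

-- f-hat from the closed-form expression in Section 6.4.2:
fhatDirect :: Double -> Double
fhatDirect w = (-1 + y ^ 2 * g (y ^ 2 / t)) / lam
  where y = f (lam ^ 2 * (w + 1 / s) - 1 / s)
```

The two expressions perform the same arithmetic in the same order, so they agree up to rounding.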
If f is perturbed by δf and g is perturbed by δg, then the above functions are perturbed as follows:

δp_f(w) = 2·λ·δλ·(w + s⁻¹)            δp_g(w) = 2·λ·δλ·w
δf₁ = Df ∘ p_f · δp_f + δf ∘ p_f       δg₁ = Dg ∘ p_g · δp_g + δg ∘ p_g
δf₂ = 2·f₁·δf₁                         δg₂ = t·(δp_g·g₁ + p_g·δg₁)
δf₃ = δf₂/t                            δg₃ = δg₂·(g₂ − 2)/s + g₂·δg₂/s
δf₄ = Dg ∘ f₃ · δf₃ + δg ∘ f₃          δg₄ = Df ∘ g₃ · δg₃ + δf ∘ g₃
δf₅ = δf₂·f₄ + f₂·δf₄                  δg₅ = 2·g₄·δg₄/s
δf₆ = δf₅/λ − f₅·δλ/λ²                 δg₆ = Df ∘ g₅ · δg₅ + δf ∘ g₅
                                       δg₇ = δg₆/λ − g₆·δλ/λ²
                                       δg₈(w) = δg₇(w)/(t·w)

With this notation we have that DT_{(f,g)}(δf, δg) = (δf₆, δg₈).

6.4.4 New statement

We now state Theorem 6.1.3 in the form in which it will be proved. The discussion in Section 6.3.1 shows how this result can be used to deduce Theorem 6.1.3.

Theorem 6.4.3. There exist a Lorenz map F₀ and a matrix Γ such that the simplified Newton operator Φ = (Γ − I)⁻¹(Γ − T) is well-defined and satisfies:

1. ‖DΦ_F‖ < 0.2, for all ‖F − F₀‖ ≤ 10⁻⁷;
2. ‖ΦF₀ − F₀‖ < 5 · 10⁻⁹;
3. ‖(Γ − e^{it}I)⁻¹(Γ − DT_F)‖ < 0.9, for all t ∈ ℝ and all ‖F − F₀‖ ≤ 10⁻⁷.

Proof. The remainder of this part is dedicated to rigorously checking the first two estimates with a computer. The third estimate is verified by covering the unit circle with small rectangles and using the same techniques as in the first two estimates to get rigorous upper bounds on the operator norm. However, we have left out the source code for this estimate to keep the page count down, and also because the running time of the program went from a few seconds to several hours (we had to cover the unit circle with 50000 rectangles in order for the estimate to work).

Remark 6.4.4. The approximate fixed point F₀ and approximate derivative Γ at the fixed point are found by performing a Newton iteration eight times on an initial guess (which was found by trial-and-error).
We will not spend too much time on these approximations, but they could potentially be used to compute e.g. the Hausdorff dimension of the Cantor attractor of maps on the local stable manifold. We did however compute the eigenvalues of Γ and it turns out that Γ has two simple expanding eigenvalues, λ_s ≈ 23.36530 and λ_w ≈ 12.11202, while the rest of the spectrum is strictly contained in the unit disk. Since Γ is a good approximation of DT_{f⋆} and both operators are compact, it seems clear that the spectrum of DT_{f⋆} must also have exactly two unstable eigenvalues. Lanford (1984) claims, in the case of the period-doubling operator, that if an analog of the third estimate of Theorem 6.4.3 holds and “if Γ has spectrum inside the unit disk except for a single simple expanding eigenvalue, then the same will be true for DT_{f⋆}.” It seems plausible that a similar statement holds in the present situation with two simple expanding eigenvalues, but we have not yet managed to prove this (it is easy to see that if Γ and DT_{f⋆} were both diagonal then the third estimate would imply that they have the same number of unstable eigenvalues).

CHAPTER 7

Implementation of estimates

This chapter was previously published online as “supplementary material” to (Winckler, 2010).

In this chapter we implement the computer program which performs the estimates needed to prove Theorem 6.1.3. The literature on this type of computer assisted proof seems to have a tradition of never including these details, most likely because doing so would require on the order of thousands of lines of source code. We make a conscious break from this tradition and show how to implement all estimates in only 166 lines of source code.¹ The key behind this reduction in size is the use of a pure functional programming language, since it allows us to program in a declarative style: we specify what the program does, not how it is accomplished.
Purity means that functions cannot have side-effects (the output from a function depends only on its input), which makes it easier to reason about the source code. In our context this is important since it means that we can check the correctness of each function in complete isolation from the rest of the source code (and a typical function is only one or two lines long, which simplifies the verification of individual functions). To further minimize the risk of programming errors we choose a strongly typed language, since these are good at catching common programming errors during compilation.

We would like to take this opportunity to advocate the programming language Haskell for tasks similar to the one at hand — it has all the benefits mentioned above and more, but at the same time manages to produce code which runs very fast (thanks to the GHC compiler). Unfortunately, many readers will probably have had little prior exposure to Haskell, and for this reason we have included in Section 7.9 a brief overview of Haskell as well as a table highlighting its syntax to aid the reader in understanding the source code.

¹ This includes: definition of the main operator and its derivative (40 lines), an interval arithmetic library (30 lines), a library for computing with analytic functions (65 lines), and a linear equation solver (15 lines).

7.1 Verification of contraction

In this section we implement the main operator and compute the estimates of Theorem 6.4.3. Before reading this section it may be a good idea to take a quick glance at the beginning of Section 7.4 in order to understand the way analytic functions are represented. It may also be helpful to use Table 7.1 in Section 7.9 to look up unfamiliar syntax in the source code.
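As a warm-up, the contraction mechanism behind the simplified Newton operator Φ = (Γ − I)⁻¹(Γ − T) can be illustrated in a scalar toy setting (the map T and the slope gamma below are hypothetical; gamma only roughly approximates T′ at the fixed point):

```haskell
-- Scalar toy version of the simplified Newton operator: fixed points of
-- phi coincide with fixed points of tOp, and phi contracts strongly when
-- gamma approximates the derivative of tOp at the fixed point.
phi :: Double -> (Double -> Double) -> Double -> Double
phi gamma tOp x = (gamma * x - tOp x) / (gamma - 1)

-- Hypothetical example: T x = cos x, with gamma close to T' at the fixed
-- point (T'(x*) = -sin x* ~ -0.67).
fixedPt :: Double
fixedPt = iterate (phi (-0.67) cos) 0.5 !! 50
```

Since Φ′(x) = (γ − T′(x))/(γ − 1), the better γ approximates T′ at the fixed point, the smaller the contraction constant; this is why the estimates in Theorem 6.4.3 use a good numerical approximation Γ of DT.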
7.1.1 The main program To begin with we import two functions from the standard library that will be needed later: 1 import Data.List (maximumBy,transpose) The entry point of the program is the function main, all that is done here is to print the result of the computations to follow: 2 main = do putStrLn $ "radius = " ++ show beta putStrLn $ "|Phi(f)-f| < " ++ show eps putStrLn $ "|DPhi| < " ++ show theta The initial guess2 is first improved by iterating a polynomial approximation3 of the operator Φ eight times (the derivative is recomputed in each iteration, so this is a Newton iteration): 5 approxFixedPt = iterate (\t -> approx $ opPhi (gamma t) t) guess !! 8 Compute the approximation Γ of the derivative DT at the approximate fixed point: 6 approxDeriv = gamma approxFixedPt 2 See 3 See Section 7.7 Section 7.4.10 7.1. Verification of contraction 125 Compute an upper bound on the distance4 between the approximate fixed point and its image under Φ: 7 eps = upper $ dist approxFixedPt (opPhi approxDeriv approxFixedPt) Construct a ball5 of radius β centered around the approximate fixed point and then compute the supremum of the operator norm6 of the derivative on this ball: 8 theta = opnorm $ opDPhi approxDeriv (ball beta approxFixedPt) The rest of this section will detail the implementation of the operator Φ and its derivative. The generic routines for rigorous computation with floating point numbers and analytic functions are discussed in the sections that follow. All input to the program (d, sf, sg, guess, beta) is collected in Section 7.7. Instructions on how to run the program and the output it produces is given in Section 7.8. 7.1.2 The main operator The operator T is computed in a function called mainOp which takes a Lorenz map ( f , g) ∈ Y and a sequence of tangent vectors {(δ f k , δgk ) ∈ Y }nk=1 and returns ( T ( f , g), { DT( f ,g) (δ f k , δgk )}nk=1 ). 
We perform both computations in one function since the derivative uses many intermediate results from the computation of T(f, g). Given a Lorenz map (f,g) and a list of tangent vectors ds, first compute f₆ and g₈ as in Section 6.4.3 and split the result so that the polynomial parts have degree at most d − 1. Then compute the derivatives and return the result of these two computations in a pair:

9   mainOp (f,g) ds = ((split d f6, split d g8), mainDer ds)
      where
        l  = lambda f
        pf = F [(l^2-1)/sf, l^2] 0   ; g1 = compose g pg
        pg = F [0, l^2] 0            ; g2 = pg * g1 .* sg
        f1 = compose f pf            ; g3 = g2 * (g2 - 2) ./ sf
        f2 = f1^2                    ; g4 = compose f g3
        f3 = f2 ./ sg                ; g5 = (g4^2 - 1) ./ sf
        f4 = compose g f3            ; g6 = compose f g5
        f5 = -1 + f2*f4              ; g7 = g6 ./ l
        f6 = f5 ./ l                 ; g8 = lshift g7 ./ sg

The actual computation of the derivative is performed next, inside a function local to mainOp. If there are no tangent vectors, no computation is performed:

19      mainDer [] = []

Otherwise, recurse over the list of tangent vectors, compute δf₆ and δg₈, and again split the result so that the polynomial parts have degree at most d − 1:

20      mainDer ((df,dg):ds) = (split d df6, split d dg8) : mainDer ds
          where
            dl  = dlambda f df
            dpf = F ([2*l*dl]*[1/sf,1]) 0   ; dg1 = dcompose g pg dg dpg
            dpg = F [0, 2*l*dl] 0           ; dg2 = (dpg*g1 + pg*dg1) .* sg
            df1 = dcompose f pf df dpf      ; dg3 = 2*(dg2*g2 - dg2) ./ sf
            df2 = 2*f1*df1                  ; dg4 = dcompose f g3 df dg3
            df3 = df2 ./ sg                 ; dg5 = 2*g4*dg4 ./ sf
            df4 = dcompose g f3 dg df3      ; dg6 = dcompose f g5 df dg5
            df5 = df2*f4 + f2*df4           ; dg7 = dg6./l - g6.*(dl/l^2)
            df6 = df5./l - f5.*(dl/l^2)     ; dg8 = lshift dg7 ./ sg

Note that the constants s and t of Section 6.4.3 are called sf and sg respectively in the source code.

⁴ See Section 7.4.3. ⁵ See Section 7.4.12. ⁶ See Section 7.4.11.
The above function can be used to compute the action of T by passing an empty list of tangent vectors and extracting the first element of the returned pair: 30 opT fg = fst $ mainOp fg [] Similarly, we can evaluate DT by extracting the second element: 31 opDT fg ds = snd $ mainOp fg ds Using this function we compute an approximation Γ of DT( f ,g) by evaluating the derivative at the 2d first basis vectors7 of Y and approximating the result with polynomials and packing them into a 2d × 2d matrix (transposing the resulting matrix is necessary because the linear algebra routines8 we use require the matrix to be stored in row-major order): 32 gamma fg = transpose $ map (interleavePoly . approx) $ opDT fg (take (2*d) basis) 7 See 8 See Section 7.4.11. Section 7.5. 7.2. Computation with floating point numbers 127 Finally, the operator Φ (and its derivative) is implemented by taking a Newton step9 with T (for convenience we pass the approximate derivative as the parameter m): 34 opPhi m x = newton m (opT x) x opDPhi m x ds = [ newton m a b | (a,b) <- zip (opDT x ds) ds ] 7.1.3 The rescaling factor With our choice of coordinates the rescaling factor λ only depends on f (and not on g): λ( f ) = − f [ f (0)2 − 1]/s The implementation is straightforward: 36 lambda f = -eval f (((eval f 0)^2-1)/sf) If f ∈ X is perturbed by δ f ∈ X then λ is perturbed by δλ, where δλ = −2 · s−1 · f (0) · δ f (0) · D f [ f (0)2 − 1]/s − δ f [ f (0)2 − 1]/s . Derivative evaluation has to be handled carefully since we are using the `1 –norm, see Section 7.4.4. 
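The formula for δλ can be checked non-rigorously by comparing it against a finite-difference quotient, using plain Doubles and hypothetical smooth test functions (with the derivative Df hardcoded):

```haskell
-- Finite-difference check of the delta-lambda formula (plain Doubles;
-- fv, dfv and the derivative dF are hypothetical test data).
sConst :: Double
sConst = 2.2

lamOf :: (Double -> Double) -> Double
lamOf ff = - ff ((ff 0 ^ 2 - 1) / sConst)

fv, dF, dfv :: Double -> Double
fv x  = exp x + 0.5    -- test function f
dF x  = exp x          -- its derivative Df (hardcoded)
dfv x = cos x          -- test perturbation delta-f

-- delta-lambda = -2/s * f(0) * df(0) * Df(y) - df(y),  y = (f(0)^2 - 1)/s
dLamFormula :: Double
dLamFormula = -2 / sConst * f0 * dfv 0 * dF y - dfv y
  where
    f0 = fv 0
    y  = (f0 ^ 2 - 1) / sConst

-- Central difference of lambda in the direction delta-f:
dLamNumeric :: Double
dLamNumeric = (lamOf (\x -> fv x + eps * dfv x)
             - lamOf (\x -> fv x - eps * dfv x)) / (2 * eps)
  where eps = 1e-6
```

The rigorous implementation in dlambda replaces these pointwise evaluations by interval-bounded ones, which is why the disk radius µ has to be handled explicitly there.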
If y = [ f (0)2 − 1]/s, then y lies in the closed disk of radius |y| but since we need to evaluate the derivative on an open disk we first enlarge the bound on |y| to get the radius µ and then evaluate the derivative on this slightly larger disk: 37 dlambda f df = where f0 = y = mu = 7.2 -2/sf * f0 * eval df 0 * eval (deriv mu f) y - eval df y eval f 0 (f0^2 - 1)/sf enlarge $ abs y Computation with floating point numbers We discuss how to control rounding and avoid overflow and underflow when computing with floating point numbers. We show how to lift operations on floating point numbers to intervals and then how to bound these operations. 9 See Section 7.4.13. 128 Chapter 7. Implementation of estimates 7.2.1 Safe numbers In order to avoid overflow and underflow during the course of the proof we restrict all computations to the set of safe numbers (Koch et al., 1996) which we define as the subset of double precision floating point numbers (referred to as floats from now on) x such that x = 0 or 2−500 < | x | < 2500 . We say that y is a safe upper bound on x 6= 0 iff x < y (strict inequality) and y is a safe number; safe lower bounds are defined analogously. If x = 0, then y = 0 is both a safe upper and lower bound and there are no other safe bounds on x (this will make sense after reading the assumption below). Safe numbers allow us to perform rigorous computations on any computer conforming to the IEEE 754 standard since such a computer must satisfy the following assumption:10 Assumption. Let x̄ be a float resulting from an arithmetic operation on safe numbers performed by the computer and let x be the exact result of the same operation. If x̄ 6= 0 then either x̄ = x − or x̄ = x + , where x − is the largest float such that x − ≤ x and x + is the smallest float such that x ≤ x + . Furthermore, x̄ = 0 if and only if x = 0. 
Under this assumption we know that the exact result must lie within any safe upper and lower bounds on x̄, and we know that when the computer returns a result of 0 then the computation must be exact. Given a float x we now show how to find safe upper and lower bounds on x. Check if a number is safe: 41 isSafe x = let ax = abs x in x == 0 || (ax > 2^^(-500) && ax < 2^500) Use this function to assert that a number is safe, abort the program otherwise: 42 assertSafe x | isSafe x = x | otherwise = error "assertSafe: not a safe number" Given a float we can ‘step’ to an adjacent float as follows: 44 stepFloat n 0 = 0 stepFloat n x = let (s,e) = decodeFloat x in encodeFloat (s+n) e 10 This statement follows from: (1) the fact that IEEE 754 guarantees correct rounding, (2) the result of an arithmetic operation on safe numbers is a normalized float so silent underflow to zero cannot occur. 7.2. Computation with floating point numbers 129 That is, stepFloat 1 x is the smallest float larger than x, and similarly stepFloat (-1) x is the largest float smaller than x, unless x = 0 in which case x is returned in both cases. (The function decodeFloat converts a float to the form s · 2e , where s, e ∈ Z, and encodeFloat converts back to a float.) Now finding a safe upper or lower bound is easy, just step to the next float and assert that it is safe: 46 safeUpperBound = assertSafe . stepFloat 1 safeLowerBound = assertSafe . stepFloat (-1) 7.2.2 The Scalar data type The Scalar data type represents safe lower and upper bounds on a number: 48 data Scalar = S !Double !Double deriving (Show,Eq) The first number is the lower bound, the second the upper bound. The following function returns the upper bound: 49 upper (S _ u) = u We bound operations on real numbers by first lifting them to operations on Scalar values and then bound the resulting operations by enlarging the bound to safe lower and upper bounds. 
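Reusing the stepFloat definition from the listing above, one can observe the stepping behaviour directly (a plain demonstration, outside the rigorous framework):

```haskell
-- Demonstration of stepping to adjacent floats (stepFloat reproduced from
-- the listing above).
stepFloat :: Integer -> Double -> Double
stepFloat _ 0 = 0
stepFloat n x = let (m, e) = decodeFloat x in encodeFloat (m + n) e
```

For example, stepFloat 1 1.0 is strictly greater than 1.0 and stepFloat (-1) 1.0 is strictly smaller, so enclosing a computed value between a downward and an upward step always contains the exact value under the assumption above.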
An operation is exact if it does not involve any rounding (in which case there is no need to enlarge a bound). The function that takes a Scalar with lower bound l and upper bound u, then finds a safe lower bound on l and a safe upper bound on u is implemented as follows: 50 enlarge (S l u) = S (safeLowerBound l) (safeUpperBound u) For convenience we provide a function to convert a number x to a Scalar with x as both lower and upper bound: 51 toScalar x = S x x 7.2.3 Arithmetic on scalars We make Scalar an instance of the Num type class so that we can perform arithmetic on scalars (addition (+), subtraction (-), negation, multiplication (*) and nonnegative integer exponentiation (^)). 52 instance Num Scalar where 130 Chapter 7. Implementation of estimates If x ∈ [l, u] then − x ∈ [−u, −l ]; negation is exact on safe number so we do not need to enlarge the bound: 53 negate (S l u) = S (-u) (-l) If x ∈ [l, u] then | x | ∈ [max{0, l, −u}, − min{0, l, −u}] (it is easy to check that this is correct regardless of the signs of l and r). All operations involved are exact on safe numbers so we do not need to enlarge the bound: 54 abs (S l u) = S (maximum xs) (-minimum xs) where xs = [0, l, -u] If x ∈ [l, u] and y ∈ [l 0 , u0 ], then x + y ∈ [l + l 0 , u + u0 ]. This operation is not exact so we enlarge the bound: 56 (S l u) + (S l’ u’) = enlarge (S (l + l’) (u + u’)) If x ∈ [l, u] and y ∈ [l 0 , u0 ], then x ∗ y ∈ [ a, b] where a is the minimum of the numbers {l ∗ l 0 , l ∗ u0 , u ∗ l 0 , u ∗ u0 } and b is the maximum of the same numbers. This operation is not exact so we enlarge the bound: 57 (S l u) * (S l’ u’) = enlarge (S (minimum xs) (maximum xs)) where xs = [l*l’, l*u’, u*l’, u*u’] The last two methods are required to complete the implementation of the Num instance (fromInteger provides implicit conversion of integer literals to Scalar values): 59 fromInteger = toScalar . 
fromInteger signum (S l u) = error "S.signum: not defined" In order to be able to divide Scalar values using (/) we must also add Scalar to the Fractional type class. 61 instance Fractional Scalar where If x ∈ [l, u] and if l, u have the same sign, then the reciprocal is well-defined for x and 1/x ∈ [1/u, 1/l ]. This operation is not exact so we enlarge the bound: 62 recip (S l u) | l*u > 0 = enlarge (S (1/u) (1/l)) | otherwise = error "S.recip: not well-defined" The last method is required; it provides implicit conversion of decimal literals to Scalar values: 64 fromRational = toScalar . fromRational 131 7.3. Computation with polynomials 7.2.4 Ordering of scalars In order to be able to compare Scalar values, e.g. using (<), we add Scalar to the Ord type class. If two bounds overlap we declare them incomparable and halt the program, otherwise comparison is implemented in the obvious way. 65 instance Ord Scalar where compare (S l u) (S l’ u’) | u < l’ = | l > u’ = | l == l’ && u == u’ = | otherwise = 7.3 LT GT EQ error "S.compare: uncomparable" Computation with polynomials We show how to lift operations on polynomials (of degree d − 1) to rectangle sets in Rd and then how to bound these operations. 7.3.1 Representation of polynomials Polynomials are represented as a list of Scalar values (with the first element representing the constant coefficient). Hence what we refer to as a ‘polynomial’ of degree d − 1 is actually a rectangle set in Rd . In this section we lift operations on actual polynomials to such rectangles. We do not need to find any bounds on these lifts since this was already done implicitly in the previous section. 7.3.2 Arithmetic with polynomials Add polynomials to the Num type class so that we can perform arithmetic operations on polynomials. (This implementation is a bit more general since it adds [a] to the Num type class for any type a in the Num type class.) 
71    instance (Num a) => Num [a] where

Addition: [c₁ + zq₁(z)] + [c₂ + zq₂(z)] = [c₁ + c₂] + z[q₁(z) + q₂(z)].

72      (c1:q1) + (c2:q2) = c1 + c2 : q1 + q2
        [] + p2 = p2
        p1 + [] = p1

Multiplication: [c₁ + zq₁(z)] · [c₂ + zq₂(z)] = [c₁·c₂] + z[c₁·q₂(z) + q₁(z)·p₂(z)], where p₂(z) = c₂ + zq₂(z).

75      (c1:q1) * p2@(c2:q2) = c1*c2 : [c1]*q2 + q1*p2
        _ * _ = []

The remaining methods are straightforward:

77      negate p = map negate p
        fromInteger c = [fromInteger c]
        abs = error "abs not implemented for polynomials"
        signum = error "signum not defined for polynomials"

7.3.3 Polynomial evaluation

Evaluation of the polynomial c + zq(z) at the point t is done using the obvious recursion:

81    peval (c:q) t = c + t * peval q t
      peval [] _ = 0

7.3.4 Norm of polynomial

We use the ℓ¹-norm on polynomials, i.e. ‖a₀ + · · · + aₙzⁿ‖ = |a₀| + · · · + |aₙ|:

83    pnorm = sum . map abs

7.3.5 Derivative of polynomial

The derivative of c + zq(z) is implemented using the recursion suggested by D(c + zq(z)) = q(z) + zDq(z):

84    pderiv (c:q) = q + (0 : pderiv q)
      pderiv [] = []

7.4 Computation with analytic functions

We show how to lift operations on analytic functions to rectangle subsets of X and how to bound these operations.

7.4.1 The Function data type

Functions in X are represented as

f(z) = p(z) + z^d h(z),

where p is a polynomial (not necessarily of degree less than d) and ‖h‖ < K, with h ∈ X. We refer to p as the polynomial part of f, and h is called the error of f. The value of the degree d is specified in Appendix 7.7. The Function data type represents an analytic function of the above form (the first parameter is the polynomial part, the second parameter is the bound on the error):

86    data Function = F ![Scalar] !Scalar deriving (Show,Eq)

That is, Function represents rectangle subsets of X of the form

{ a₀ + · · · + aₙzⁿ + z^d h(z) | aₖ ∈ Aₖ, k = 0, …, n, ‖h‖ ∈ I },

where the {Aₖ} and I are intervals. Only the upper bound on the error term is needed, so we do not take care to ensure that the lower bound is correct. Hence, the lower bound will be meaningless in general.

Note that we do allow n ≥ d in the above representation, but in general we adjust our computations to ensure n < d. We call this operation splitting: if

f(z) = a₀ + · · · + aₙzⁿ + z^d h(z),

with n ≥ k ≥ d, then we can split f at degree k into

f(z) = a₀ + · · · + aₖ₋₁z^{k−1} + z^d [aₖz^{k−d} + · · · + aₙz^{n−d} + h(z)] = p₀(z) + z^d [r(z) + h(z)].

Thus the polynomial part of f after splitting is p₀ and the error is bounded by ‖r‖ + ‖h‖ (by the triangle inequality). The implementation of this operation is:

87    split k (F p e) = let (p',r) = splitAt k p in F p' (e + pnorm r)

We will now lift operations on analytic functions to the above type of rectangles and then find bounds on these operations.

7.4.2 Arithmetic with analytic functions

In what follows we let fᵢ(z) = pᵢ(z) + z^d hᵢ(z) for i = 1, 2, 3, and let f₁ ⋆ f₂ = f₃, where ⋆ is the operation under consideration. Make Function an instance of the Num type class so that we can perform arithmetic operations on functions (addition (+), subtraction (-), negation, multiplication (*) and nonnegative integer exponentiation (^)).

88    instance Num Function where

Addition of two functions is performed by adding the polynomial part and the error separately, p₃ = p₁ + p₂ and h₃ = h₁ + h₂, so that ‖h₃‖ ≤ ‖h₁‖ + ‖h₂‖ by the triangle inequality:

89      (F p1 e1) + (F p2 e2) = F (p1 + p2) (e1 + e2)

Multiplication of two analytic functions is given by the equation

f₁(z)f₂(z) = p₁(z)p₂(z) + z^d [p₁(z)h₂(z) + p₂(z)h₁(z) + z^d h₁(z)h₂(z)],

so that ‖h₃‖ ≤ ‖p₁‖‖h₂‖ + ‖p₂‖‖h₁‖ + ‖h₁‖‖h₂‖.
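The coefficient-list polynomial arithmetic of Section 7.3, on which these Function operations are built, can be exercised on plain Doubles (the real code computes with Scalar intervals as coefficients instead):

```haskell
-- The list-as-polynomial arithmetic of Section 7.3 on plain Doubles
-- (the real code uses Scalar intervals as coefficients).
instance Num a => Num [a] where
  (c1:q1) + (c2:q2)    = c1 + c2 : q1 + q2
  [] + p2              = p2
  p1 + []              = p1
  (c1:q1) * p2@(c2:q2) = c1*c2 : [c1]*q2 + q1*p2
  _ * _                = []
  negate               = map negate
  fromInteger c        = [fromInteger c]
  abs                  = error "abs not implemented for polynomials"
  signum               = error "signum not defined for polynomials"

peval :: Num a => [a] -> a -> a
peval (c:q) t = c + t * peval q t
peval [] _    = 0
```

For example, [1,1] * [1,1] gives [1,2,1], i.e. (1 + z)² = 1 + 2z + z², and evaluating that result at a point agrees with evaluating the factors separately.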
To ensure that the degree of the polynomial part does not increase too much, we split it at degree d + 1:¹¹

90      (F p1 e1) * (F p2 e2) = split (d+1) (F (p1*p2) e3)
          where e3 = e2*pnorm p1 + e1*pnorm p2 + e1*e2

The negation of f is −p(z) + z^d(−h(z)), but the error is unchanged since we only keep a bound on its norm:

92      negate (F p e) = F (negate p) e

The remaining methods are required to complete the implementation of the Num instance (fromInteger provides implicit conversion of integer numerals to Function values):

93      fromInteger c = F [fromInteger c] 0
        abs = error "abs not implemented for Function"
        signum = error "signum not defined for Function"

¹¹ We choose to split at degree d + 1 (instead of the perhaps more natural choice of degree d) because the division by z in the definition of the operator T would otherwise cause g₈ to have degree at most d − 2.

7.4.3 Norm of analytic functions

The triangle inequality gives ‖p(z) + z^d h(z)‖ ≤ ‖p‖ + ‖h‖ (since |z| < 1):

96    norm (F p e) = pnorm p + e

The norm on the Cartesian product Y = X × X is ‖(f, g)‖ = ‖f‖ + ‖g‖:

97    sumnorm (f,g) = norm f + norm g

The distance induced by the norm on Y:

98    dist (f,g) (f',g') = sumnorm (f-f',g-g')

7.4.4 Differentiation

The implementation of differentiation of f ∈ X is complicated by the use of the ℓ¹-norm on X, since ‖f‖ < ∞ does not imply that ‖Df‖ < ∞. This problem is overcome by only computing the derivative of functions restricted to a disk of radius strictly smaller than one. That is, we need to know a priori that the function we are differentiating will only be evaluated on this smaller disk. Usually we get this information from the fact that we compute derivatives like Df₁ ∘ f₂ and we have bounds on the image of f₂.

Given f ∈ X we will estimate Df restricted to {|z| < µ}, where µ < 1. If f(z) = p(z) + z^d h(z), then

Df(z) = Dp(z) + dh(z)z^{d−1} + z^d Dh(z) = p₁(z) + z^d h₁(z).
Here we are faced with the problem that we only know the norm of h, so all we can say about the polynomial part is that p₁(z) = Dp(z) + sdz^{d−1}, where s ∈ [−‖h‖, ‖h‖]. Let h(z) = ∑ aₖz^k; then the error can be crudely approximated as follows:

‖Dh(z)|_{|z|<µ}‖ = ‖Dh(µz)‖ = ∑_{k≥1} kµ^{k−1}|aₖ| ≤ ‖h‖ ∑_{k≥1} kµ^{k−1} = ‖h‖/(1 − µ)².

Putting all this together we arrive at the following implementation:

99    deriv mu (F p e)
        | mu < 1    = F p1 e1
        | otherwise = error "deriv: mu is not < 1"
        where p1 = pderiv p + [S (-s) s] * [0,1]^(d-1)
              e1 = e / (1 - mu)^2
              s  = fromIntegral d * upper e

Note that µ is passed as a parameter by the caller of this function (it is not a constant). As mentioned earlier, this function is usually used to compute expressions like Df₁ ∘ f₂, in which case µ will be an upper bound on the radius of a disk containing the image of f₂.

7.4.5 Composition

The implementation of composition of analytic functions f₁ ∘ f₂ is split into two parts. First we consider the special case when f₁ = p₁ is a polynomial, then we treat the general case.

Polynomials are defined on all of ℂ, so the composition p₁ ∘ f₂ is always defined. If p₁(z) = c + zq(z) then we may use the recursion suggested by p₁ ∘ f₂(z) = c + f₂(z) · q ∘ f₂(z):

104   compose' (c:q) f2 = (F [c] 0) + f2 * compose' q f2

The recursion ends when the polynomial is the zero polynomial, in which case p₁ ∘ f₂ = 0:

105   compose' [] _ = 0

In the general case we have to take care to ensure that the image of f₂ is contained in the domain of f₁ for the composition to be well-defined. A sufficient condition for this to hold is ‖f₂‖ < 1, since the domain of f₁ is the unit disk. Under this assumption we compute

f₁ ∘ f₂ = p₁ ∘ f₂ + (f₂)^d · h₁ ∘ f₂.
These two terms are split at degree d + 1 to get p1 ◦ f 2 (z) = p̃1 (z) + zd h̃1 (z) and f 2 (z)d = p̃2 (z) + zd h̃2 (z).12 Then f 1 ◦ f 2 (z) = p3 (z) + zd h3 (z) with p3 = p̃1 + p̃2 · h1 ◦ f 2 and h3 = h̃1 + h̃2 · h1 ◦ f 2 . Only the norm of h1 is given so from this we can only draw the conclusion that p3 (z) = p̃1 (z) + s · p̃2 (z) for s ∈ [−kh1 k, k h1 k] (s is in fact a function but we may think of it as a constant since we are really computing with sets of polynomials). The error is approximated using the triangle inequality, k h3 k ≤ kh̃1 k + k h̃2 kkh1 k. 106 compose (F p1 e1) f2 | norm f2 < 1 = F (p1’ + [s]*p2’) (e1’ + e1*e2’) | otherwise = error "compose: |f2| is too large" where (F p1’ e1’) = split (d+1) (compose’ p1 f2) (F p2’ e2’) = split (d+1) (f2^d) s = S (-upper e1) (upper e1) The term s · p̃2 (z) can introduce devastating errors into the computation since s lies in an interval which has positive upper bound and a negative lower bound (if p̃2 has a coefficient with small error but a large magnitude relative to s, then after multiplying with s that coefficient will have an error that is bigger than the magnitude of the coefficient). We work around this problem by choosing the degree d large, since this tends to make the term s smaller. Another way to deal with this problem is to include a “general error” term in the representation of analytic functions (Koch et al., 1996). 12 See the footnote near the definition of multiplication of analytic functions for an expla- nation of the choice of degree d + 1. 7.4. Computation with analytic functions 7.4.6 137 Derivative of the composition operator Let S( f , g) = f ◦ g, then the derivative is given by DS( f ,g) (δ f , δg) = D f ◦ g · δg + δ f ◦ g. Note that when computing D f we must specify as a first parameter the radius of a disk strictly contained in the unit disk to which D f is restricted (see Section 7.4.4). 
In the present situation we know that the image of g is contained in a disk with radius k gk, so D f only needs to be evaluated on the disk of radius k gk: 111 dcompose f g df dg = (deriv (norm g) f ‘compose‘ g) * dg + (df ‘compose‘ g) 7.4.7 Division by z If f (z) = a1 z + · · · + an zn + zd h(z), then f (z)/z = a1 + · · · + an zn−1 + zd−1 h(0) + zd h̃(z), where |h(0)| ≤ khk and kh̃k ≤ k hk. Since we do not know the value of h(0) we estimate the coefficient it with s ∈ [−khk, k hk]. We think of this operation as a “left shift”, whence the name of this function: 112 lshift (F (c:q) e) = F (q + [0,1]^(d-1) * [S (-upper e) (upper e)]) e lshift (F [] e) = F ([0,1]^(d-1) * [S (-upper e) (upper e)]) e If the polynomial part of f has a constant coefficient a0 6= 0 then this function will not return the correct result, so we take care to only use it when we know that a0 = 0. 7.4.8 Point evaluation If f (z) = p(z) + zd h(z), then f (t) = p(t) + td · s for some s ∈ [−khk, k hk]. We also check that t is in the unit disk otherwise the program is terminated with an error: 114 eval (F p e) t | abs t < 1 = peval p t + t^d * (S (-upper e) (upper e)) | otherwise = error ("eval: not in domain t=" ++ show t) Note that the further away t is from 0, the more error is introduced in the evaluation. For t = 0 the error term has no influence on the evaluation. 138 Chapter 7. Implementation of estimates 7.4.9 Scaling As a convenience we define operators to scale an analytic function by a scalar on the on the right. The precedence for these operators are the same as for their ’normal’ counterparts. Multiplication satisfies [ p(z) + zd h(z)] · x = x · p(z) + zd [ x · h(z)] and division is handled similarly. Note that the error term is affected: 116 infixl 7 .*, ./ (F p e) .* x = F (p * [x]) (e * abs x) (F p e) ./ x = F (p * [1/x]) (e / abs x) 7.4.10 Polynomial approximation Let f (z) = p(z) + zd h(z). 
To approximate f by a polynomial we first discard the error term z^d h(z), then we disregard the errors in the coefficients of p. That is, for p(z) = a0 + · · · + an z^n with ak ∈ [ak⁻, ak⁺] we replace ak with the mean ãk = (ak⁻ + ak⁺)/2 (we 'collapse' the bounds on ak). Finally, we lift this operation to pairs of functions:

119 approx (f,g) = (approx' f, approx' g)
      where
        approx' (F p _) = F (map (toScalar . collapse) p) 0
        collapse (S l u) = (l+u)/2

7.4.11 Operator norm

Let ξk(z) = z^k so that {ξk}k≥0 is a basis for X. A basis for Y is {ηk}k≥0, where η2k = (ξk, 0) and η2k+1 = (0, ξk). This set is implemented as follows:

122 basis = interleave (zip basis' (repeat 0)) (zip (repeat 0) basis')
      where
        basis' = map xi [0..]
        xi k = F (replicate k 0 ++ [1]) 0

Proposition 7.4.1. If L : Y → Y is a linear and bounded operator, then

‖L‖ = max{‖Lη0‖, …, ‖Lη(2d−1)‖, sup{‖L(h, 0)‖ : h ∈ Bd}, sup{‖L(0, h)‖ : h ∈ Bd}},

where Bd = {z^d h(z) : ‖h‖ < 1}. This is a consequence of using the ℓ1-norm on X.

Given a linear operator op acting on a list of tangent vectors¹³, we estimate the operator norm by applying it to the first 2d basis vectors and to the sets Bd × 0 and 0 × Bd. Then we compute the upper bound of the norm of the results and take the maximum:

125 opnorm op = maximum $ map (upper . sumnorm) $ op tangents
      where tangents = (F [] 1,0) : (0,F [] 1) : take (2*d) basis

Note that Bd is represented by the set of functions with no polynomial part and an error bounded by 1, which is the same as F [] 1.

7.4.12 Construction of balls

We cannot exactly represent arbitrary balls in X with the Function type. Instead we construct a rectangle set which is guaranteed to contain the ball. Thus, a bound on a ball of radius r centered on an analytic function (in our case it is always a polynomial, i.e.
e = 0) can be implemented as follows:

127 ball r (f,g) = (ball' r f, ball' r g)
      where ball' r (F p e) = F (map (+ S (-r) r) p) (e + toScalar r)

7.4.13 Newton's method

This is our variant of Newton's method on Y: (f, g) ↦ (M − I)⁻¹(M − T)(f, g), where M is a 2d × 2d matrix passed as the first parameter. The second parameter is T(f, g) and the third parameter is (f, g). When lifting M into Y we project the error term to zero by letting s = 0, and when lifting (M − I)⁻¹ we preserve the error term by letting s = 1 (see below for how this lifting is done):

129 newton m (tf,tg) fg = fg'
      where
        (mf,mg) = liftPolyOp 0 (apply m) fg
        fg' = liftPolyOp 1 (solve $ subtractDiag m 1) (mf-tf,mg-tg)

Let f = p(z) + z^d h(z) where deg p < d and let A be the linear operator represented (in the basis {z^k}) by the infinite matrix

    ( M  0  )
    ( 0  sI )

where M is a d × d matrix and I is the infinite identity matrix. We lift the linear operator A into X by A f(z) = Mp(z) + z^d (s · h(z)).

13 The linear operator acts on a sequence of tangent vectors since this is how we have implemented the derivative of the main operator.

The following function implements this lifting into Y. We split f and g to ensure that their degrees are at most d − 1, and since our linear algebra routines require their input in one vector we interleave the polynomial parts. Also, instead of passing M we pass a linear operator op, which allows us to use one function to lift both matrix multiplication (apply) and solution of linear equations (solve):

132 liftPolyOp s op (f,g) = (F pf' (s*ef), F pg' (s*eg))
      where
        fg@(F pf ef, F pg eg) = (split d f, split d g)
        (pf',pg') = uninterleave $ op (interleavePoly fg)

When interleaving the polynomial parts of two functions we first pad the polynomials with zeros to ensure their lengths are exactly d (e.g. a0 + a1 z is padded to a0 + a1 z + 0z^2 + · · · + 0z^(d−1)).
Hence the resulting vector always has length 2d:

135 interleavePoly (F p _, F q _) = interleave (pad p) (pad q)
      where pad x = take d $ x ++ (repeat 0)

7.5 Linear algebra routines

In this section we implement a simple linear algebra library to compute matrix-vector products and to solve linear equations. A matrix is represented as a list of its rows and a row is a list of its elements. A vector is just a list of elements (we think of them as column vectors). This is a very simplistic library, so no checking is done to ensure that matrices have the correct dimensions (e.g. it is quite possible to create a 'matrix' with rows of differing lengths).

7.5.1 Matrix-vector product

Computing the matrix-vector product Mx is fairly straightforward:

137 apply m x = map (dotProduct x) m

The dot product of vectors a and b:

138 dotProduct a b = sum $ zipWith (*) a b

Note that if a and b have different lengths, then the above function will treat the longer vector as if it had the same length as the shorter.

7.5.2 Linear equation solver

The following function solves the linear system of equations Mx = b. It is a simple wrapper around a function which solves a linear system given an augmented matrix.

139 solve m b = solveAugmented $ augmentedMatrix m b

The augmented matrix for M and b is (M | b), i.e. the matrix with b appended as the last column of M:

140 augmentedMatrix = zipWith (\x y -> x ++ [y])

We now implement a linear equation solver which takes an augmented matrix as its only parameter. It is implemented using Gaussian elimination with partial pivoting. The only novelty compared with a traditional imperative implementation is that we solve the equations recursively. Given an n × (n + 1) augmented matrix M', first perform partial pivoting, i.e. move the row whose first element has the largest magnitude to the top to form the matrix M. Assuming that we already have the solution for x2, …, xn, we can compute x1 = (m_{1,n+1} − ∑_{j=2}^{n} m_{1j} x_j)/m_{11} and we are done. The solution for x2, …, xn is found recursively as follows: perform a Gaussian elimination on M to ensure that all rows except the first start with a zero, giving a matrix M̃. Throw away the first row and column of M̃ to get an (n − 1) × n matrix N' and solve the linear system with augmented matrix N'. The solution to this system is x2, …, xn.

141 solveAugmented [] = []
    solveAugmented m' = (last m1t - dotProduct m1t x) / m11 : x
      where
        m@((m11:m1t):_) = partialPivot m'
        x = solveAugmented $ eliminate m

Partial pivoting is done by first finding a list of all possible ways to split the matrix M into a top and a bottom half. This list is searched for the split which has a maximal first element in the bottom half. The maximal split is then reassembled into one matrix by moving the top row of the bottom half to the top of the matrix.

145 partialPivot m = piv:mtop ++ mbot
      where
        (mtop,piv:mbot) = maximumBy comparePivotElt (splits m)
        comparePivotElt (_,(a:_):_) (_,(b:_):_) = compare (abs a) (abs b)

The following routine uses Gaussian elimination to ensure that all rows except the first start with a zero. That is, we add a suitable multiple of the first row to the other rows, one at a time:

148 eliminate ((m11:m1t):mbot) = foldl appendScaledRow [] mbot
      where
        appendScaledRow a (r:rs) = a ++ [scaleAndAdd (-r/m11) m1t rs]
        scaleAndAdd s a b = zipWith (+) (map (*s) a) b

Remark 7.5.1. When using the above linear equation solver with matrices and vectors over intervals (of type Scalar) there is a question of what the 'solution' represents. As always, we are computing bounds on solutions: if M lies in some rectangle set [M] of matrices and b lies in some rectangle set [b] of vectors, then the above routine will compute a rectangle set [x] such that if x is a solution to Mx = b, then x ∈ [x].
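This enclosure property can be illustrated in isolation. The following sketch is not part of the thesis program: the type Iv below is a deliberately naive stand-in for Scalar (in particular it does not round outward), and the 2 × 2 system is an invented example; it only demonstrates that eliminating with interval coefficients yields an enclosure of the exact solution.

```haskell
-- Naive interval type [lower, upper]; a simplified stand-in for Scalar.
data Iv = Iv Double Double deriving Show

instance Num Iv where
  Iv a b + Iv c d = Iv (a + c) (b + d)
  Iv a b - Iv c d = Iv (a - d) (b - c)
  Iv a b * Iv c d = Iv (minimum ps) (maximum ps)
    where ps = [a*c, a*d, b*c, b*d]
  abs (Iv a b) | a >= 0    = Iv a b
               | b <= 0    = Iv (-b) (-a)
               | otherwise = Iv 0 (max (-a) b)
  signum = error "signum: not needed"
  fromInteger n = Iv (fromInteger n) (fromInteger n)

instance Fractional Iv where
  recip (Iv a b) | a > 0 || b < 0 = Iv (1/b) (1/a)
                 | otherwise      = error "interval contains zero"
  x / y = x * recip y
  fromRational r = Iv (fromRational r) (fromRational r)

contains :: Iv -> Double -> Bool
contains (Iv a b) x = a <= x && x <= b

-- Widen a point value into a small interval around it.
wide :: Double -> Iv
wide v = Iv (v - 1e-9) (v + 1e-9)

-- The system 2x + y = 5, x + 3y = 10 has exact solution (x, y) = (1, 3);
-- eliminating with interval coefficients must enclose it.
xEnc, yEnc :: Iv
(xEnc, yEnc) = (xe, ye)
  where
    [[m11, m12], [m21, m22]] = [[wide 2, wide 1], [wide 1, wide 3]]
    [b1, b2] = [wide 5, wide 10]
    r  = m21 / m11                          -- elimination multiplier
    ye = (b2 - b1 * r) / (m22 - m12 * r)    -- back substitution
    xe = (b1 - m12 * ye) / m11

main :: IO ()
main = print (contains xEnc 1 && contains yEnc 3)  -- prints True
```

Since interval arithmetic is inclusion monotone, the computed intervals necessarily contain the exact solution, exactly as described in Remark 7.5.1.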
Note that our solver will compute rather loose bounds on the solution set; see e.g. Jansson and Rump (1991) for ways of finding sharper bounds.

7.6 Supporting functions

Given a square matrix M and a number x, compute M − xI, i.e. subtract x from every diagonal element of M:

151 subtractDiag m x = foldl f [] (zip m [0..])
      where f m' (r,k) = let (h,t:ts) = splitAt k r in m' ++ [h ++ [t-x] ++ ts]

Given a list, return all possible ways to split the list in two:

154 splits x = splits' [] x
      where
        splits' _ [] = []
        splits' x y@(yh:yt) = (x,y) : splits' (x ++ [yh]) yt

Interleave two lists a and b, i.e. construct a new list by taking the first element from a, then the first element from b, and repeating for the remaining elements:

157 interleave a b = concat $ zipWith (\x y -> [x,y]) a b

Perform the 'inverse' of the above function, i.e. take a list c and construct a pair of lists (a,b) such that interleave a b = c:

158 uninterleave = unzip . pairs

Given a list, partition it into pairs of adjacent elements:

159 pairs [] = []
    pairs (x:y:rest) = (x,y) : pairs rest
    pairs _ = error "list must have even length"

7.7 Input to the main program

The degree of the error term in our representation of analytic functions:

162 d = 13 :: Int

The radius of the ball on which Φ is a contraction:

163 beta = 1.0e-7 :: Double

The radii for the domains of φ and ψ:

164 sf = 2.2 :: Scalar
    sg = 0.5 :: Scalar

The initial guess for the fixed point:

166 guess = (F [-0.75, -2.5] 0, F [6.2,-2.1] 0)

7.8 Running the main program

This document contains all the Haskell source code needed to compile the program into an executable. Given a copy of the LaTeX source of this document (assuming the file is named lmca.lhs), use the following command to compile it:¹⁴

ghc --make -O2 lmca.lhs

This produces an executable called lmca (or lmca.exe if you are using Windows) which, when called, will execute the main function.
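The bounds printed by a run of the program can also be checked by hand against the standard contraction-mapping criterion (an assumption here: the chapter's precise fixed-point argument may be stated slightly differently). If ‖DΦ‖ ≤ κ < 1 on the ball of radius β around the initial guess and ‖Φ(f) − f‖ ≤ (1 − κ)β, then Φ has a unique fixed point in the ball. A minimal sketch with the numbers from the sample run reported below:

```haskell
-- Sanity check of the printed bounds against the standard
-- contraction-mapping criterion (assumed form; see lead-in).
kappa, betaR, residual :: Double
kappa    = 0.1584543808202988    -- printed bound on |DPhi|
betaR    = 1.0e-7                -- radius of the ball
residual = 4.830032057738462e-9  -- printed bound on |Phi(f)-f|

main :: IO ()
main = print (kappa < 1 && residual <= (1 - kappa) * betaR)  -- prints True
```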
Here is the output of running the main program:

radius = 1.0e-7, |Phi(f)-f| < 4.830032057738462e-9, |DPhi| < 0.1584543808202988

This output was taken from a sample run using GHC 6.12.1 on Mac OS X 10.6.2. The running time on a 1.8 GHz Intel Core 2 Duo was less than 10 seconds.

14 The GHC compiler can be downloaded for free from http://haskell.org.

7.9 Haskell mini-reference

This section introduces some of the features and syntax of Haskell to help anybody unfamiliar with the language read the source code. It is assumed that the reader has some prior experience with an imperative language (Java, C, etc.) but is new to functional programming languages. Table 7.1 below collects examples of Haskell syntax used in the source code and can be used to look up unfamiliar expressions. For more information on the Haskell language go to http://haskell.org.

Haskell is a functional language. Such languages differ from imperative languages in several significant ways: for example, there are no control structures such as for loops, and data is immutable, so there is no concept of variables (memory locations) that can be written to. Basic types include: booleans (True, False), numbers (e.g. -1, 2.3e3; integers of any magnitude are supported), tuples (e.g. (1,'a',0.3); elements can have different types), and lists (e.g. [1,2,3]; all elements must have the same type). Functions are on the same level as basic types, so they can e.g. be passed as parameters to other functions.

Functions are defined like f parameters = expression, where f is the function name and there can be zero or more parameters. Note that there are no parentheses around parameters and that parameters are separated by spaces. Function calls have very high precedence, so f x^2 is the same as (f x)^2, not f (x^2). The keywords let .. in and where can be used to bind expressions to function-local definitions (i.e. local functions or variables).
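To make these conventions concrete, here is a small standalone example (not part of the program in this chapter; the function classify is invented for illustration) that combines a type signature, guards and a where binding:

```haskell
-- Classify a point by its distance from the origin, using guards
-- and a local definition bound with 'where'.
classify :: Double -> String
classify x
  | d < 1     = "inside"
  | d == 1    = "boundary"
  | otherwise = "outside"
  where
    d = abs x  -- local definition, visible in all guards above

main :: IO ()
main = putStrLn (classify 0.5)  -- prints: inside
```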
New data types can be defined using the data construct. For example, data Interval = I Double Double defines a type called Interval which consists of two double precision floating point numbers (i.e. the endpoints of the interval). New values of this type are constructed using the value constructor, which we called I; e.g. I 0 1 defines the unit interval.

Functions can be defined with pattern matching on built-in and custom data types. For example, len (I a b) = abs (b-a) defines a function len which returns the length of an interval (for the custom data type Interval). We often use pattern matching on lists, where [] matches the empty list and (x:xs) matches a list with at least one element and binds the first element to x and the rest to xs (read as the plural of x). The notation v@(x:xs) can be used to bind the entire list to v on a match. The notation _ may be used to match anything without binding the match to a variable; e.g. firstZero (x:_) = x == 0 defines a function which returns True if the first element of a nonempty list is equal to zero (and throws an exception if called on the empty list []).

Type classes are a way of declaring that a custom data type supports a certain predefined collection of functions, and they also allow for 'overloading' of functions (and operators, which can be turned into functions as noted in the example for (+) in Table 7.1). We only mention type classes because we come across them when implementing Scalar and Function. The predefined type classes we use are Num (for (+), (-), (*), (^), abs), Fractional (for (/), (^^)), Eq (test for equality), and Show (for conversion to strings).

f1 x = 2*x              define a function f1 which doubles its argument
f2 x y = x+y            define a function f2 which adds its two arguments
f1 3                    apply f1 to 3 (=6)
f2 3 4                  apply f2 to 3 and 4 (=7)
f2 2 (f1 3)             apply f2 to 2 and 6 (the result of f1 3) (=8)
f2 2 $ f1 3             same as above (the operator $ is often used in this
                        way to avoid overuse of parentheses)
f2 2 f1 3               error (this means: compute f2 2 f1 and apply the
                        result to 3, but 2+f1 does not make sense)
\x -> 2*x               define the anonymous function x ↦ 2x
f2 3                    apply f2 to 3 (= the function \x -> 3+x)
f1 . f2 3               composition (= the function \x -> 2*(3+x))
3 `f2` 4                turn function (in backticks) into an operator (=7)
(+) 3 4                 turn operator (in parentheses) into a function (=7)
(3*)                    fix first parameter to 3 (= the function \x -> 3*x)
g x | x<0 = -1          define the sign function g using guards
    | x>0 = 1           (the | symbols)
    | x==0 = 0
num [] = 0              define a function which counts the number of
num (_:xs) = 1+num xs   elements in a list using pattern matching
[1,2]                   a list (all elements must have the same type)
[]                      the empty list
[1..]                   the list of all positive integers
[2..5]                  list enumeration with bounds (=[2,3,4,5])
1 : [2,3]               append element to beginning of list (=[1,2,3])
[1,2,3] !! 0            access list elements by zero-based index (=1)
[1,2] ++ [3]            concatenate two lists (=[1,2,3])
'a'                     a character
"abc"                   a string, i.e. a list of characters (=['a','b','c'])
2^3                     nonnegative integer exponentiation (=8)
2^^(-1)                 integer exponentiation (=0.5)
('a',2)                 a pair (the elements need not have the same type)
fst ('a',2)             access first element in a pair (='a')
snd ('a',2)             access second element in a pair (=2)
map f1 [1..]            apply f1 to all elements in the list (=[2,4,8,..])
[f2 a b | a <- [1,2], b <- [3..5]]
                        list comprehension, i.e.
                        { f2(a,b) | a ∈ {1,2}, b ∈ {3,4,5} } (=[4,5,6,5,6,7])
foldl f2 1 [3,5]        fold over list (compute f2 1 3 = 4, then f2 4 5) (=9)
iterate f1 1            compute orbit of 1 under f1 (=[1,2,4,8,..])
maximum [1,4,2]         return maximum element in a list (=4)
maximumBy f x           as above, but using f to compare elements of the list x
minimum [1,4,2]         return minimum element in a list (=1)
splitAt 2 [1,4,2]       split list in two at given index (=([1,4],[2]))
take 3 [7..]            take the first 3 elements from the list (=[7,8,9])
zip [1..] [3,4]         join two lists into a list of pairs (=[(1,3),(2,4)])
unzip [(1,3),(2,4)]     'inverse' of zip (=([1,2],[3,4]))
zipWith f2 [1..] [3,4]  like zip, but use f2 to join elements (=[4,6])
repeat 0                infinite list with one element repeated (=[0,0,..])
replicate 3 0           finite list with one element repeated (=[0,0,0])
sum [3,-1,4]            sum of elements in list (=6)
transpose m             the transpose of the matrix m (m is a list of lists)
putStrLn "hi"           print hi to standard out and append a new line
show 1.2                turn the number 1.2 into the string "1.2"
error "ohno"            abort program with error message ohno

Table 7.1: Examples of Haskell syntax used in the source code.

Appendix A

Background material

The purpose of this appendix is to collect some background material that is used throughout the text. Section A.1 contains a topological fixed point theorem which is suitable for proving the existence of periodic points for renormalization operators in general. Sections A.2 and A.3 contain basic facts about the nonlinearity operator and the Schwarzian derivative, respectively.

A.1 A fixed point theorem

The following theorem is an adaptation of Granas and Dugundji (2003, Theorem 4.7).

Theorem A.1.1. Let X ⊂ Y where X is closed and Y is a normal topological space. If f : X → Y is homotopic to a map g : X → Y with the property that every extension of g|∂X to X has a fixed point in X, and if the homotopy ht has no fixed point on ∂X for every t ∈ [0, 1], then f has a fixed point in X.
Remark A.1.2. Note that the statement is such that X must have nonempty interior. This follows from the assumption that g has a fixed point (since g is an extension of g|∂X), while the requirement on the homotopy implies that g has no fixed point on ∂X.

Proof. Let Ft be the set of fixed points of ht and let F = ⋃ Ft. Since g must have a fixed point, F is nonempty. Since ht has no fixed points on ∂X for every t, F and ∂X are disjoint.

We claim that F is closed. To see this, let {xn} ⊂ F be a convergent sequence and let x = lim xn. Note that x ∈ X since F ⊂ X and X is closed. By definition there exist tn ∈ [0, 1] such that xn = h(xn, tn). Pick a convergent subsequence tnk → t. Since xn is convergent, h(xnk, tnk) = xnk → x, but at the same time h(xnk, tnk) → h(x, t) since h is continuous. Hence h(x, t) = x, that is x ∈ F, which proves the claim.

Since Y is normal and ∂X and F are disjoint closed sets, there exists a map λ : X → [0, 1] such that λ|F = 0 and λ|∂X = 1. Define ḡ(x) = h(x, λ(x)). Then ḡ is an extension of g|∂X since if x ∈ ∂X, then ḡ(x) = h(x, 1) = g(x). Hence ḡ has a fixed point p ∈ X. However, p must also be a fixed point of f, since p = ḡ(p) = h(p, λ(p)) implies p ∈ F, and consequently p = ḡ(p) = h(p, 0) = f(p).

A.2 The nonlinearity operator

Definition A.2.1. Let C^k(A; B) denote the set of k times continuously differentiable maps f : A → B and let D^k(A; B) ⊂ C^k(A; B) denote the subset of orientation-preserving homeomorphisms whose inverses lie in C^k(B; A). As a notational convenience we write C^k(A) instead of C^k(A; A), and C^k instead of C^k(A; B) if there is no need to specify A and B (and similarly for D^k).

Definition A.2.2. The nonlinearity operator N : D^2(A; B) → C^0(A; R) is defined by

(A.1)  Nφ = D log Dφ.

We say that Nφ is the nonlinearity of φ.

Remark A.2.3. Note that Nφ = D^2φ / Dφ.

Definition A.2.4. The distortion of φ ∈ D^1(A; B) is defined by

Dist φ = sup_{x,y ∈ A} log [Dφ(x) / Dφ(y)].

Remark A.2.5. We think of the nonlinearity of φ ∈ D^2(A; B) as the density for its distortion. To understand this remark, let dµ = Nφ(t)dt. Assuming Nφ is a positive function, µ is a measure and

Dist φ = ∫_A dµ,

since by (A.1)

∫_x^y Nφ(t)dt = log [Dφ(y) / Dφ(x)].

If Nφ is negative, then −Nφ(t) is a density. The only problem with the interpretation of Nφ as a density occurs when it changes sign. Intuitively speaking, we can still think of the nonlinearity as a local density of the distortion (away from the zeros of Nφ). Note that Nφ does not change sign in the important special case of φ being a pure map (i.e. a restriction of x^α). So the (absolute value of the) nonlinearity is the density for the distortion of pure maps.

Lemma A.2.6. The kernel of N : D^2(A; B) → C^0(A; R) equals the orientation-preserving affine map that takes A onto B.

Lemma A.2.7. The nonlinearity operator N : D^2(A; B) → C^0(A; R) is a bijection. In the specific case of A = B = [0, 1] the inverse is given by

(A.2)  N^{-1}f(x) = ∫_0^x exp{∫_0^s f(t)dt} ds / ∫_0^1 exp{∫_0^s f(t)dt} ds.

Lemma A.2.8 (The chain rule for the nonlinearity operator). If φ, ψ ∈ D^2 then

(A.3)  N(ψ ∘ φ) = Nψ ∘ φ · Dφ + Nφ.

Proof. Use the chain rule of differentiation:

N(ψ ∘ φ) = D log D(ψ ∘ φ) = D log(Dψ ∘ φ · Dφ)
  = D log(Dψ ∘ φ) + D log Dφ
  = (D^2ψ ∘ φ · Dφ) / (Dψ ∘ φ) + Nφ
  = (D log Dψ) ∘ φ · Dφ + Nφ
  = Nψ ∘ φ · Dφ + Nφ.

Definition A.2.9. We turn D^2(A; B) into a Banach space by inducing the usual linear structure and uniform norm of C^0(A; R) via the nonlinearity operator. That is, we define

(A.4)  αφ + βψ = N^{-1}(αNφ + βNψ),
(A.5)  ‖φ‖ = sup_{t ∈ A} |Nφ(t)|,

for φ, ψ ∈ D^2(A; B) and α, β ∈ R.

Lemma A.2.10. If φ ∈ D^2(A; B) then

(A.6)  N(φ^{-1})(y) = − Nφ(φ^{-1}(y)) / Dφ(φ^{-1}(y)),  ∀y ∈ B.

Proof.
Let x = φ^{-1}(y); then

N(φ^{-1})(y) = D log D(φ^{-1})(y) = D log [Dφ(x)^{-1}]
  = −(D^2φ(x) / Dφ(x)) · D(φ^{-1})(y) = − Nφ(x) / Dφ(x).

Lemma A.2.11. If φ ∈ D^2(A; B) then

(A.7)  e^{−|y−x|·‖φ‖} ≤ Dφ(y)/Dφ(x) ≤ e^{|y−x|·‖φ‖},
(A.8)  (|B|/|A|) · e^{−‖φ‖} ≤ Dφ(x) ≤ (|B|/|A|) · e^{‖φ‖},
(A.9)  |D^2φ(x)| ≤ (|B|/|A|) · ‖φ‖ · e^{‖φ‖},

for all x, y ∈ A.

Proof. Integrate the nonlinearity to get

∫_x^y Nφ(t)dt = log [Dφ(y)/Dφ(x)],

as well as

∫_x^y Nφ(t)dt ≤ |y − x| · ‖φ‖.

Combine these two equations to get (A.7). By the mean value theorem we may choose y such that Dφ(y) = |B|/|A|, so (A.8) follows from (A.7). Finally, since

Nφ(x) = D log Dφ(x) = D^2φ(x) / Dφ(x),

we get |D^2φ(x)| ≤ |Dφ(x)| · ‖φ‖. Now apply (A.8) to get (A.9).

Lemma A.2.12. If φ, ψ ∈ D^2(A; B) then

(A.10)  |φ(x) − ψ(x)| ≤ (e^{2‖φ−ψ‖} − 1) · min{φ(x), 1 − φ(x)},
(A.11)  e^{−‖φ−ψ‖} ≤ Dφ(x)/Dψ(x) ≤ e^{‖φ−ψ‖},

for all x ∈ A.

Lemma A.2.13. The set B = {φ ∈ D^2 : ‖φ‖ ≤ K} is relatively compact in C^0.

Proof. Maps in B are uniformly bounded by definition, and since maps in B have uniformly bounded derivative (by (A.8)) they are equicontinuous. The theorem of Arzelà–Ascoli now says that any sequence in B has a subsequence which converges uniformly (to a map in C^0). Thus, the C^0-closure of B is compact in C^0.

Definition A.2.14. Let ζ_J : [0, 1] → J be the affine orientation-preserving map taking [0, 1] onto an interval J. Define the zoom operator Z : D^2(A; B) → D^2([0, 1]) by

(A.12)  Zφ = ζ_B^{-1} ∘ φ ∘ ζ_A.

Remark A.2.15. Note that if φ ∈ D(A; B), then B = φ(A), so Zφ only depends on φ and A (not on B). We will often write Z(φ; A) instead of Zφ in order to emphasize the dependence on A.

Lemma A.2.16. If φ ∈ D^2(A; B) then

(A.13)  Z(φ^{-1}) = (Zφ)^{-1},
(A.14)  N(Zφ) = |A| · Nφ ∘ ζ_A,
(A.15)  ‖Zφ‖ = |A| · ‖φ‖.

Proof. The first equation is just a calculation:

Z(φ^{-1}) = ζ_A^{-1} ∘ φ^{-1} ∘ ζ_B = (ζ_B^{-1} ∘ φ ∘ ζ_A)^{-1} = (Zφ)^{-1}.

To see the second equation, apply the chain rule for nonlinearities and use the fact that affine maps have zero nonlinearity:

N(Zφ) = N(ζ_B^{-1} ∘ φ ∘ ζ_A) = N(φ ∘ ζ_A) = Nφ ∘ ζ_A · Dζ_A,

where Dζ_A = |A|. This implies the third equation:

‖Zφ‖ = sup_{x ∈ [0,1]} |Nφ ∘ ζ_A(x)| · |A| = sup_{x ∈ A} |Nφ(x)| · |A| = ‖φ‖ · |A|.

A.3 The Schwarzian derivative

In this appendix we collect some results on the Schwarzian derivative. Proofs can be found in de Melo and van Strien (1993, Chapter IV).

Definition A.3.1. The Schwarzian derivative S : D^3(A; B) → C^0(A; R) is defined by

(A.16)  Sf = D(Nf) − (1/2)(Nf)^2.

Remark A.3.2. Note that

Sf = D^3f/Df − (3/2)(D^2f/Df)^2.

Lemma A.3.3. The kernel of S : D^3(A; B) → C^0(A; R) is the set of orientation-preserving Möbius maps which take A onto B.

Lemma A.3.4 (The chain rule for the Schwarzian derivative). If f, g ∈ D^3, then

(A.17)  S(f ∘ g) = Sf ∘ g · (Dg)^2 + Sg.

Proof. Use the chain rule for nonlinearities:

S(f ∘ g) = D(N(f ∘ g)) − (1/2)[N(f ∘ g)]^2
  = D(Nf ∘ g · Dg + Ng) − (1/2)[Nf ∘ g · Dg + Ng]^2
  = D(Nf) ∘ g · (Dg)^2 + Nf ∘ g · D^2g + D(Ng)
    − (1/2)(Nf ∘ g)^2 · (Dg)^2 − Nf ∘ g · Dg · Ng − (1/2)(Ng)^2
  = [D(Nf) ∘ g − (1/2)(Nf ∘ g)^2] · (Dg)^2 + Sg
  = Sf ∘ g · (Dg)^2 + Sg.

(In the fourth step we used the fact that Dg · Ng = D^2g.)

Lemma A.3.5. Sf < 0 if and only if S(f^{-1}) ≥ 0.

Lemma A.3.6 (Koebe Lemma). If f ∈ D^3((a, b); R) and Sf ≥ 0, then

(A.18)  |Nf(x)| ≤ 2 · min{|x − a|, |x − b|}^{-1}.

Proof. A proof for this particular statement of the Koebe lemma can be found in Jiang (1996, Lemma 2.4). A more general version of the Koebe lemma can be found in de Melo and van Strien (1993, Section IV.3).

Corollary A.3.7. Let τ > 0 and let f ∈ D^3(A; B). If f extends to a map F ∈ D^3(I; J) with SF < 0 and if J \ B has two components, each having length at least τ|B|, then ‖Zf‖ ≤ e^{2/τ} · 2/τ.

Proof. Since SF < 0 it follows that S(F^{-1}) ≥ 0, so the Koebe lemma and (A.15) imply that

‖Z(f^{-1})‖ = |B| · ‖f^{-1}‖ ≤ |B| · 2/(τ|B|) = 2/τ.

Now apply Lemmas A.2.10, A.2.11 (A.8) and A.2.16 (A.13):

‖Zf‖ ≤ exp{‖(Zf)^{-1}‖} · ‖(Zf)^{-1}‖ = exp{‖Z(f^{-1})‖} · ‖Z(f^{-1})‖ ≤ e^{2/τ} · 2/τ.

Bibliography

A. Arneodo, P. Coullet, and C. Tresser. 1981. A possible new mechanism for the onset of turbulence. Phys. Lett. A, 81(4):197–201.

P. J. Bushell. 1973. Hilbert's metric and positive contraction mappings in a Banach space. Arch. Rational Mech. Anal., 52:330–338.

Pierre Collet, Pierre Coullet, and Charles Tresser. 1985. Scenarios under constraint. J. Physique Lett., 46(4):143–147.

Pierre Coullet and Charles Tresser. 1978. Itérations d'endomorphismes et groupe de renormalisation. C. R. Acad. Sci. Paris Sér. A-B, 287(7):A577–A580.

Edson de Faria, Welington de Melo, and Alberto Pinto. 2006. Global hyperbolicity of renormalization for C^r unimodal mappings. Ann. of Math. (2), 164(3):731–824.

Welington de Melo and Sebastian van Strien. 1993. One-dimensional dynamics, volume 25 of Ergebnisse der Mathematik und ihrer Grenzgebiete (3) [Results in Mathematics and Related Areas (3)]. Springer-Verlag, Berlin.

J.-P. Eckmann, H. Epstein, and P. Wittwer. 1984. Fixed points of Feigenbaum's type for the equation f^p(λx) ≡ λf(x). Comm. Math. Phys., 93(4):495–516.

Jean-Pierre Eckmann and Peter Wittwer. 1987. A complete proof of the Feigenbaum conjectures. J. Statist. Phys., 46(3-4):455–475.

Mitchell J. Feigenbaum. 1978. Quantitative universality for a class of nonlinear transformations. J. Statist. Phys., 19(1):25–52.

Mitchell J. Feigenbaum. 1979. The universal metric properties of nonlinear transformations. J. Statist. Phys., 21(6):669–706.

Jean-Marc Gambaudo and Marco Martens. 2006. Algebraic topology for minimal Cantor sets. Ann. Henri Poincaré, 7(3):423–446.

Andrzej Granas and James Dugundji. 2003. Fixed point theory. Springer Monographs in Mathematics.
Springer-Verlag, New York.

John Guckenheimer. 1976. A strange, strange attractor. In Jerrold E. Marsden and Marjorie McCracken, editors, The Hopf bifurcation and its applications, pages 368–381. Springer-Verlag, New York.

John Guckenheimer and R. F. Williams. 1979. Structural stability of Lorenz attractors. Inst. Hautes Études Sci. Publ. Math., 50:59–72.

C. Jansson and S. M. Rump. 1991. Rigorous solution of linear programming problems with uncertain data. Z. Oper. Res., 35(2):87–111.

Yunping Jiang. 1996. Renormalization and geometry in one-dimensional and complex dynamics, volume 10 of Advanced Series in Nonlinear Dynamics. World Scientific Publishing Co. Inc., River Edge, NJ.

Anatole Katok and Boris Hasselblatt. 1995. Introduction to the modern theory of dynamical systems, volume 54 of Encyclopedia of Mathematics and its Applications. Cambridge University Press, Cambridge.

Hans Koch, Alain Schenkel, and Peter Wittwer. 1996. Computer-assisted proofs in analysis and programming in logic: a case study. SIAM Rev., 38(4):565–604.

Oscar E. Lanford, III. 1982. A computer-assisted proof of the Feigenbaum conjectures. Bull. Amer. Math. Soc. (N.S.), 6(3):427–434.

Oscar E. Lanford, III. 1984. Computer-assisted proofs in analysis. Phys. A, 124(1-3):465–470. Mathematical physics, VII (Boulder, Colo., 1983).

A. Libchaber and J. Maurer. 1979. Rayleigh–Bénard experiment in liquid helium; frequency locking and the onset of turbulence. Journal de Physique — Lettres, 40(16):419–423.

Paul S. Linsay. 1981. Period doubling and chaotic behavior in a driven anharmonic oscillator. Phys. Rev. Lett., 47(19):1349–1352.

Edward N. Lorenz. 1963. Deterministic nonperiodic flow. Journal of the Atmospheric Sciences, 20:130–141.

Mikhail Lyubich. 1999. Feigenbaum-Coullet-Tresser universality and Milnor's hairiness conjecture. Ann. of Math. (2), 149(2):319–420.

M. Martens, W. de Melo, and S. van Strien. 1992. Julia-Fatou-Sullivan theory for real one-dimensional dynamics.
Acta Math., 168(3-4):273–318.

Marco Martens. 1994. Distortion results and invariant Cantor sets of unimodal maps. Ergodic Theory Dynam. Systems, 14(2):331–349.

Marco Martens. 1998. The periodic points of renormalization. Ann. of Math. (2), 147(3):543–584.

Marco Martens and Welington de Melo. 2001. Universal models for Lorenz maps. Ergodic Theory Dynam. Systems, 21(3):833–860.

P. Martien, S. C. Pope, P. L. Scott, and R. S. Shaw. 1985. The chaotic behavior of the leaky faucet. Physics Letters A, 110:399–404.

Curtis T. McMullen. 1996. Renormalization and 3-manifolds which fiber over the circle, volume 142 of Annals of Mathematics Studies. Princeton University Press, Princeton, NJ.

John Milnor. 1985. On the concept of attractor. Comm. Math. Phys., 99(2):177–195.

Michał Misiurewicz. 1981. Absolutely continuous measures for certain maps of an interval. Inst. Hautes Études Sci. Publ. Math., 53:17–51.

Matthias St. Pierre. 1999. Topological and measurable dynamics of Lorenz maps. Dissertationes Math. (Rozprawy Mat.), 382:134.

Dennis Sullivan. 1992. Bounds, quadratic differentials, and renormalization conjectures. In American Mathematical Society centennial publications, Vol. II (Providence, RI, 1988), pages 417–466. Amer. Math. Soc., Providence, RI.

Warwick Bryan Tucker. 1998. The Lorenz attractor exists. ProQuest LLC, Ann Arbor, MI. Thesis (Ph.D.)–Uppsala Universitet (Sweden).

Marcelo Viana. 2000. What's new on Lorenz strange attractors? Math. Intelligencer, 22(3):6–19.

Björn Winckler. 2010. A renormalization fixed point for Lorenz maps. Nonlinearity, 23(6):1291–1302.
Index

A
a priori bounds, 54
archipelago, 93
attractor, 7

B
basin of attraction, 7
bifurcation diagram, 4
branch, 28
  full, 58
  trivial, 58

C
chain rule
  for nonlinearities, 151
  for Schwarzian derivative, 154
combinatorial type, 13, 29
  bounded, 14, 29
composition operator, 64
  on decomposed maps, 72
  partial, 64
cone field, 100
critical exponent, 10, 25
critical point, 26
critical values, 26
cycles of renormalization, 31

D
decomposition, 63
  pure, 68
diffeomorphic parts, 26
distortion, 150
  of a decomposition, 64
  signed, 69

E
ergodic, 34
extremal point, 97

F
Feigenbaum delta, 2
first-return map, 12
float, 128
full family, 58

G
gaps of generation n, 56

H
Haskell, 123, 144
Hilbert metric, 39
hyperbolic metric, 39

I
infinitely renormalizable, 13, 29
intervals of generation n, 56
invariant measure, 35
island, 93

K
Koebe Lemma, 155

L
limit set of renormalization, 17, 103
logistic family, 3
loop measure, 38
Lorenz attractor, 7
Lorenz equations, 7
Lorenz flow, 7
  geometric, 9
Lorenz map, 9, 26
  decomposed, 71
  nontrivial, 12, 27
  trivial, 12, 27

M
monotone combinatorics, 29
monotone family, 94
monotone type, 13

N
nice interval, 31
nonlinearity, 150
  as density for distortion, 151
nonlinearity norm, 27
nonlinearity operator, 27, 150
nonunimodal, 113

P
period-doubling cascade, 1
period-doubling operator, 4
phase transition, 1
pure map, 68
push-forward, 38

R
renormalizable, 6, 12, 27
  n times, 13
renormalization, 12, 27
  boundary of, 58
  type of, 13, 28
renormalization conjectures, 5
renormalization horseshoe, 6, 18
renormalization operator, 6, 27
  on decomposed maps, 72
rigidity, 20, 115

S
safe bound, 128
safe number, 128
Sandwich Lemma, 66
Schwarzian derivative, 154
slice, 58
stable manifold, 113
standard Lorenz family, 25

T
time set, 63
transfer map, 31
transfer time, 31

U
uniquely ergodic, 35
universality, 3
  in the parameter plane, 3
  metric, 115
unstable manifold, 103

V
vertex, 97

W
wandering interval, 33
weak Markov property, 33

Z
zoom operator, 29, 153
  on decompositions, 67
