Stable reconstructions in Hilbert spaces and the resolution of the Gibbs phenomenon

Ben Adcock
Department of Mathematics, Simon Fraser University, Burnaby, BC V5A 1S6, Canada

Anders C. Hansen
DAMTP, Centre for Mathematical Sciences, University of Cambridge, Wilberforce Rd, Cambridge CB3 0WA, United Kingdom

November 30, 2010

Abstract

We introduce a method to reconstruct an element of a Hilbert space in terms of an arbitrary finite collection of linearly independent reconstruction vectors, given a finite number of its samples with respect to any Riesz basis. As we establish, provided the dimension of the reconstruction space is chosen suitably in relation to the number of samples, this procedure can be numerically implemented in a stable manner. Moreover, the accuracy of the resulting approximation is completely determined by the choice of reconstruction basis, meaning that the reconstruction vectors can be tailored to the particular problem at hand. An important example of this approach is the accurate recovery of a piecewise analytic function from its first few Fourier coefficients. Whilst the standard Fourier projection suffers from the Gibbs phenomenon, by reconstructing in a piecewise polynomial basis, we obtain an approximation with root exponential accuracy in terms of the number of Fourier samples and exponential accuracy in terms of the degree of the reconstruction function. Numerical examples illustrate the advantage of this approach over other existing methods.

1 Introduction

Suppose that $H$ is a separable Hilbert space with inner product $\langle \cdot, \cdot \rangle$ and corresponding norm $\|\cdot\|$. In this paper, we consider the following problem: given the first $m$ samples $\{\langle f, \psi_j \rangle\}_{j=1}^{m}$ of an element $f \in H$ with respect to some Riesz basis $\{\psi_j\}_{j=1}^{\infty}$ of $H$ (the sampling basis), reconstruct $f$ to high accuracy.
Not only does such a problem lie at the heart of modern sampling theory [21, 47], it also occurs in a myriad of applications, including image processing (in particular, Magnetic Resonance Imaging) and the numerical solution of hyperbolic partial differential equations (PDEs).

In practice, straightforward reconstruction of $f$ may be achieved via orthogonal projection with respect to the sampling basis. Indeed, for an arbitrary $f \in H$, this is the best possible strategy. However, in many important circumstances, this approximation converges only slowly in $m$ when measured in the norm on $H$, or not at all if a stronger norm (for example, the uniform norm) is considered.

A prominent instance of this problem is the recovery of a function $f : \mathbb{T} \to \mathbb{R}$ from its first $m$ Fourier coefficients (here $\mathbb{T} = [-1, 1)$ is the unit torus). In this instance, $H = L^2(\mathbb{T})$ is the space of all 2-periodic, square-integrable functions of one variable. Provided $f$ is analytic, it is well known that its Fourier series (the orthogonal projection with respect to the Fourier basis) converges exponentially fast. However, whenever $f$ has a jump discontinuity, its Fourier series suffers from the well-known Gibbs phenomenon [41]. Whilst convergence occurs in the $L^2$ norm, uniform convergence is lacking, and the approximation is polluted by characteristic $\mathcal{O}(1)$ oscillations near the discontinuity. Moreover, the rate of convergence is also slow: only $\mathcal{O}(m^{-1/2})$ when measured in the $L^2$ norm, and $\mathcal{O}(m^{-1})$ pointwise away from the discontinuity.

Needless to say, the Gibbs phenomenon is a significant blight on many practical applications of Fourier series [35]. It is a testament to its importance that the design of effective techniques for its removal remains an active area of inquiry [30, 45].

Returning to the general form of the problem, let us now suppose that some additional information is known about the function $f$. For example, $f$ may be sparse in a particular basis (e.g.
a polynomial of low degree) or may possess certain regularity. In particular, in the Fourier instance, we may know that $f$ is piecewise analytic with jump discontinuities at known locations in $\mathbb{T}$. In this circumstance, it seems plausible that a better approximation to $f$ can be obtained by expanding in a different basis (e.g. a piecewise polynomial basis). To this end, let us introduce the so-called reconstruction space (of dimension $n$) and seek to approximate $f$ by an element $f_{n,m}$ of this space, written as a linear combination of $n$ linearly independent elements.

As we will show in due course, provided reconstruction is carried out in a certain manner, a suitable approximation $f_{n,m}$ can always be found. Essential to this approach is that $m$ (the number of samples) is chosen sufficiently large in comparison to $n$ (equivalently, $n$ is chosen sufficiently small in comparison to $m$). Provided this is the case, the approximation $f_{n,m}$ inherits the principal features of the reconstruction space. In particular, $f_{n,m}$ is quasi-optimal (or, under certain conditions, asymptotically optimal), in the sense that the error $\|f - f_{n,m}\|$ can be bounded by a constant multiple of $\|f - \mathcal{Q}_n f\|$, where $\mathcal{Q}_n f$ is the orthogonal projection of $f$ onto the reconstruction space (in other words, the best approximation to $f$ from this space). Moreover, from a practical standpoint, this method can be implemented by solving a linear least squares problem. Whenever the reconstruction vectors are suitably chosen (e.g. if they form a Riesz basis), the corresponding linear system is well-conditioned and the least squares problem can be solved in $\mathcal{O}(mn)$ operations by standard iterative techniques.

Consider once more the example of Fourier series. Let $f : \mathbb{T} \to \mathbb{R}$ be an analytic function with a jump discontinuity at $x = -1$ (equivalently, an analytic, nonperiodic function). As mentioned, the Fourier series of $f$ lacks uniform convergence. However, since $f$ is analytic, it makes sense to seek to reconstruct $f$ in a system of polynomials.
It is well known that the best $n$th degree polynomial approximation of an analytic function converges exponentially fast in $n$ [9]. As we shall prove, with $n = \mathcal{O}(\sqrt{m})$, we obtain a quasi-optimal polynomial approximation $f_{n,m}$ to $f$ from only its first $m$ Fourier coefficients, regardless of the particular family of polynomials used. This results in exponential convergence of $f_{n,m}$ to $f$ in $n$ (the polynomial degree), or root exponential convergence in $m$ (the number of Fourier samples). Moreover, whenever Legendre polynomials are used, the approximation is computable in a stable manner in only $\mathcal{O}(mn)$ operations. The use of, for example, Chebyshev polynomials results in a method with $\mathcal{O}(n)$ condition number that can be implemented in $\mathcal{O}(mn^{3/2})$ operations. Furthermore, whilst retaining the aforementioned features, this procedure can be easily generalised to the recovery of a piecewise analytic function of one variable (using piecewise polynomial bases), and to the case of multivariate functions defined in tensor-product regions.

There are a number of existing algorithms for the removal of the Gibbs phenomenon from Fourier series. One of the most well known, which also provides a polynomial approximant, is the Gegenbauer reconstruction technique [29, 30, 31]. As we discuss further in Section 3, the method developed in this paper has a number of key advantages over this device. Numerical results also indicate its superior performance for a number of test problems.

The method developed in this paper was previously introduced by the authors in [3] within the context of abstract sampling theory. Whilst this problem, in this abstract form, has been extensively studied in the last couple of decades (in particular, by Eldar et al [20, 21], see also [47]), to the best of our knowledge this particular method does not appear in any existing literature. For a more detailed discussion of the relation of this approach to existing schemes we refer the reader to [3].
Conversely, in this paper, after presenting the general version of the method in abstract terms, we will focus primarily on its application to the Fourier coefficient reconstruction problem. On this topic, a similar approach, but only dealing with reconstructions in Legendre polynomials from Fourier samples of analytic functions, was discussed in [32]. This can be viewed as a special case of our general framework. Furthermore, by examining this example as part of this framework, we are able to extend and improve the work of [32] in the following ways: (i) we derive a procedure allowing for reconstructions in any polynomial basis, not just Legendre polynomials, (ii) we extend this approach to reconstructions of piecewise smooth functions using (arbitrary) piecewise polynomial bases, (iii) we generalise this work to smooth functions of arbitrary numbers of variables, and (iv) we obtain improved estimates for both the error and the necessary scaling $n = \mathcal{O}(\sqrt{m})$ required for implementation.

Aside from yielding these improvements, a great benefit of the general framework presented in this paper is that it is immediately applicable to a whole host of other reconstruction problems. To illustrate this generality, in the final part of this paper we consider its application to the accurate reconstruction of a piecewise analytic function from its orthogonal polynomial expansion coefficients. Such a problem is typical of that occurring in the application of polynomial spectral methods to hyperbolic PDEs [27], where shock formation inhibits fast convergence of the polynomial approximation. As we highlight, this issue can be overcome in a completely stable fashion by reconstructing in a piecewise polynomial basis.

The outline of the remainder of this paper is as follows. In Section 2 we introduce the reconstruction procedure and establish both stability and error estimates. Section 3 is devoted to (piecewise) polynomial reconstructions from Fourier samples.
In Section 4 we consider reconstructions from tensor-product spaces, and in Section 5 we discuss other reconstruction problems. Finally, in Section 6 we present open problems and challenges.

2 General theory of reconstruction

In this section, we describe the reconstruction procedure in its full generality. To this end, suppose that $\{\psi_j\}_{j=1}^{\infty}$ is a Riesz basis (the sampling basis) for a separable Hilbert space $H$ over the field $\mathbb{C}$. Let $\langle \cdot, \cdot \rangle$ be the inner product on $H$, with associated norm $\|\cdot\|$. Recall that, by definition, $\mathrm{span}\{\psi_1, \psi_2, \ldots\}$ is dense in $H$ and
\[ c_1 \sum_{j=1}^{\infty} |\alpha_j|^2 \le \Big\| \sum_{j=1}^{\infty} \alpha_j \psi_j \Big\|^2 \le c_2 \sum_{j=1}^{\infty} |\alpha_j|^2, \quad \forall \alpha = \{\alpha_1, \alpha_2, \ldots\} \in l^2(\mathbb{N}), \tag{2.1} \]
for positive constants $c_1, c_2$. Equivalently, $\psi_j = \mathcal{U}(\Psi_j)$, where $\{\Psi_j\}_{j=1}^{\infty}$ is an orthonormal basis for $H$ and $\mathcal{U} : H \to H$ is a bounded, bijective operator. Using this definition, it is easy to deduce that $\{\psi_j\}_{j=1}^{\infty}$ also satisfies the frame property
\[ d_1 \|f\|^2 \le \sum_{j=1}^{\infty} |\langle f, \psi_j \rangle|^2 \le d_2 \|f\|^2, \quad \forall f \in H, \tag{2.2} \]
for $d_1, d_2 > 0$, where the smallest possible value for $d_2$ is $\|\mathcal{U}\|^2_{H \to H}$ and the largest possible value for $d_1$ is $\|\mathcal{U}^{-1}\|^{-2}_{H \to H}$ [17].

Suppose now that the first $m$ coefficients of an element $f \in H$ with respect to the sampling basis are given:
\[ \hat{f}_j = \langle f, \psi_j \rangle, \quad j = 1, \ldots, m. \tag{2.3} \]
Set $S_m = \mathrm{span}\{\psi_1, \ldots, \psi_m\}$ and let $\mathcal{P}_m : H \to S_m$ be the mapping
\[ f \mapsto \mathcal{P}_m f = \sum_{j=1}^{m} \langle f, \psi_j \rangle \psi_j. \tag{2.4} \]
We now seek to reconstruct $f$ in a different basis. To this end, suppose that $\{\phi_1, \ldots, \phi_n\}$ are linearly independent reconstruction vectors and define $T_n = \mathrm{span}\{\phi_1, \ldots, \phi_n\}$. Let $\mathcal{Q}_n : H \to T_n$ be the orthogonal projection onto $T_n$. Direct computation of $\mathcal{Q}_n f$, the best approximation to $f$ from $T_n$, is not possible, since the coefficients $\langle f, \phi_j \rangle$ are unknown. Instead, we seek to use the values (2.3) to compute an approximation $f_{n,m} \in T_n$ that is quasi-optimal, i.e. $\|f - \mathcal{Q}_n f\| \le \|f - f_{n,m}\| \le C \|f - \mathcal{Q}_n f\|$ for some constant $C > 0$ independent of $f$, $n$ and $m$.
To do this, we introduce the sesquilinear form $a_m : H \times H \to \mathbb{C}$, given by
\[ a_m(g, h) = \langle \mathcal{P}_m g, h \rangle, \quad \forall g, h \in H. \tag{2.5} \]
Note that, since
\[ \langle \mathcal{P}_m f, g \rangle = \sum_{j=1}^{m} \langle f, \psi_j \rangle \overline{\langle g, \psi_j \rangle} = \overline{a_m(g, f)}, \quad \forall f, g \in H, \]
$a_m$ is a Hermitian form on $H \times H$ (here $\bar{z}$ is the complex conjugate of $z \in \mathbb{C}$). With this to hand, we now define $f_{n,m}$ by the following condition:
\[ a_m(f_{n,m}, \phi) = a_m(f, \phi), \quad \forall \phi \in T_n. \tag{2.6} \]
Upon setting $\phi = \phi_j$, $j = 1, \ldots, n$, this becomes an $n \times n$ linear system of equations for the coefficients $\alpha_1, \ldots, \alpha_n$ of $f_{n,m} = \sum_{j=1}^{n} \alpha_j \phi_j$. We shall defer a discussion of the computation of this approximation to Section 2.3: in the remainder of this section we consider the analysis of $f_{n,m}$.

Before proving the main theorem regarding (2.6), let us first give an explanation as to why this approach works. As mentioned, key to this technique is that the parameter $m$ is sufficiently large in comparison to $n$. To this end, let $n$ be fixed and suppose that $m \to \infty$. Due to (2.1), the mappings $\mathcal{P}_m$ converge strongly to a bounded, linear operator $\mathcal{P}$, given by
\[ \mathcal{P} f = \sum_{j=1}^{\infty} \langle f, \psi_j \rangle \psi_j, \quad \forall f \in H. \tag{2.7} \]
Hence, for large $m$, the equations (2.6) defining $f_{n,m}$ resemble the equations
\[ a(\tilde{f}_n, \phi) = a(f, \phi), \quad \forall \phi \in T_n, \tag{2.8} \]
where $a : H \times H \to \mathbb{C}$ is the Hermitian form $a(f, g) = \langle \mathcal{P} f, g \rangle$. Thus, it is reasonable to expect that $f_{n,m} \to \tilde{f}_n$ as $m \to \infty$, provided such a function $\tilde{f}_n$ exists. Indeed, we have:

Theorem 2.1. For all $n \in \mathbb{N}$, the function $\tilde{f}_n$ exists and is unique. Moreover,
\[ \|f - \tilde{f}_n\| \le \frac{d_2}{d_1} \|f - \mathcal{Q}_n f\|, \tag{2.9} \]
where $d_1$ and $d_2$ arise from (2.2).

This theorem can be established with a straightforward application of the Lax–Milgram theorem and its counterpart, Céa's lemma [25]. Indeed, due to (2.2) and (2.7),
\[ d_1 \|g\|^2 \le a(g, g) = \sum_{j=1}^{\infty} |\langle g, \psi_j \rangle|^2 \le d_2 \|g\|^2, \quad \forall g \in H. \tag{2.10} \]
Hence the form $a(\cdot, \cdot)$ defines an equivalent inner product on $H$. Nonetheless, we shall present a self-contained proof, since similar techniques will be used subsequently.

Proof. Let $\mathcal{U} : T_n \to \mathbb{C}^n$ be the linear mapping $g \mapsto \{\langle \mathcal{P} g, \phi_j \rangle\}_{j=1}^{n}$.
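To prove existence and uniqueness of $\tilde{f}_n$ it suffices to show that $\mathcal{U}$ is invertible, upon which it follows that $\tilde{f}_n = \mathcal{U}^{-1} \{\langle \mathcal{P} f, \phi_j \rangle\}_{j=1}^{n}$. Suppose that $\mathcal{U} g = 0$. Then, by definition, $\langle \mathcal{P} g, \phi_j \rangle = 0$ for $j = 1, \ldots, n$. Using linearity, we deduce that $\langle \mathcal{P} g, g \rangle = 0$. Now, it follows from (2.10) that $0 = \langle \mathcal{P} g, g \rangle \ge d_1 \|g\|^2$, giving $g = 0$. Hence, $\mathcal{U}$ is invertible and $\tilde{f}_n$ exists and is unique.

Now consider the error estimate (2.9). Using (2.2) once more, we obtain
\[ \|f - \tilde{f}_n\|^2 \le \frac{1}{d_1} \big\langle \mathcal{P}(f - \tilde{f}_n), f - \tilde{f}_n \big\rangle. \]
By definition of $\tilde{f}_n$, $\langle \mathcal{P}(f - \tilde{f}_n), \phi \rangle = 0$ for all $\phi \in T_n$. In particular, setting $\phi = \tilde{f}_n - \mathcal{Q}_n f$ yields
\[ \|f - \tilde{f}_n\|^2 \le \frac{1}{d_1} \big\langle \mathcal{P}(f - \tilde{f}_n), f - \mathcal{Q}_n f \big\rangle. \]
Since $a(\cdot, \cdot)$ forms an equivalent inner product on $H$, an application of the Cauchy–Schwarz inequality gives
\[ \|f - \tilde{f}_n\|^2 \le \frac{1}{d_1} \Big[ \big\langle \mathcal{P}(f - \tilde{f}_n), f - \tilde{f}_n \big\rangle \big\langle \mathcal{P}(f - \mathcal{Q}_n f), f - \mathcal{Q}_n f \big\rangle \Big]^{1/2} \le \frac{d_2}{d_1} \|f - \tilde{f}_n\| \|f - \mathcal{Q}_n f\|, \]
as required.

This theorem establishes existence and quasi-optimality of $\tilde{f}_n \approx f_{n,m}$, thereby giving an intuitive argument for the success of this method. We now wish to fully confirm this observation. To this end, let
\[ C_{n,m} = \inf_{\substack{\phi \in T_n \\ \|\phi\| = 1}} \sum_{j=1}^{m} |\langle \phi, \psi_j \rangle|^2 = \inf_{\substack{\phi \in T_n \\ \|\phi\| = 1}} \langle \mathcal{P}_m \phi, \phi \rangle = \inf_{\substack{\phi \in T_n \\ \|\phi\| = 1}} a_m(\phi, \phi). \tag{2.11} \]
The quantity $C_{n,m}$ plays a fundamental role in this paper. Note that:

Lemma 2.2. For all $n, m \in \mathbb{N}$, $0 \le C_{n,m} \le d_2$. Moreover, for each $n$, $C_{n,m} \to C_n^* \ge d_1$ as $m \to \infty$, where
\[ C_n^* = \inf_{\substack{\phi \in T_n \\ \|\phi\| = 1}} \sum_{j=1}^{\infty} |\langle \phi, \psi_j \rangle|^2 = \inf_{\substack{\phi \in T_n \\ \|\phi\| = 1}} \langle \mathcal{P} \phi, \phi \rangle = \inf_{\substack{\phi \in T_n \\ \|\phi\| = 1}} a(\phi, \phi), \]
and $d_1$ is defined in (2.2).

Proof. Consider first the quantity
\[ \epsilon_{n,m} = \sup_{\substack{\phi \in T_n \\ \|\phi\| = 1}} \sum_{j > m} |\langle \phi, \psi_j \rangle|^2 = \sup_{\substack{\phi \in T_n \\ \|\phi\| = 1}} \langle \mathcal{P} \phi - \mathcal{P}_m \phi, \phi \rangle. \tag{2.12} \]
Due to (2.2), the infinite sum is finite for any fixed $\phi$, and tends to zero as $m \to \infty$. Now let $\{\Phi_j\}_{j=1}^{n}$ be an orthonormal basis for $T_n$ and set $\phi = \sum_{j=1}^{n} \alpha_j \Phi_j$. Two applications of the Cauchy–Schwarz inequality give
\[ \sum_{j > m} |\langle \phi, \psi_j \rangle|^2 \le \|\phi\|^2 \sum_{k=1}^{n} \sum_{j > m} |\langle \Phi_k, \psi_j \rangle|^2. \]
Hence $\epsilon_{n,m} \le \sum_{k=1}^{n} \sum_{j > m} |\langle \Phi_k, \psi_j \rangle|^2$, and we deduce that $\epsilon_{n,m}$ is finite and that $\epsilon_{n,m} \to 0$ as $m \to \infty$.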
Noticing that $|C_{n,m} - C_n^*| \le \epsilon_{n,m}$ for all $n, m \in \mathbb{N}$ gives the first part of the proof. For the second, we merely use (2.2).

Aside from $C_{n,m}$, we also define the quantity $D_{n,m}$ by
\[ D_{n,m} = \sup_{\substack{f \in T_n^{\perp} \\ \|f\| = 1}} \sup_{\substack{g \in T_n \\ \|g\| = 1}} |\langle \mathcal{P}_m f, g \rangle|. \tag{2.13} \]
For this, we have the following lemma:

Lemma 2.3. For all $m, n \in \mathbb{N}$, $0 \le D_{n,m} \le d_2$. Moreover, suppose that $\mathcal{P}$ is such that $\mathcal{P}(T_n) \subseteq T_n$ (for example, when $\mathcal{P} = \mathcal{I}$ is the identity). Then $D_{n,m}^2 \le c_2 \epsilon_{n,m}$, where $\epsilon_{n,m}$ is as in (2.12). In particular, for fixed $n$, $D_{n,m} \to 0$ as $m \to \infty$.

Proof. Let $f, g \in H$. By definition,
\[ |\langle \mathcal{P}_m f, g \rangle| = \Big| \sum_{j=1}^{m} \langle f, \psi_j \rangle \overline{\langle g, \psi_j \rangle} \Big| \le \Big( \sum_{j=1}^{m} |\langle f, \psi_j \rangle|^2 \Big)^{1/2} \Big( \sum_{j=1}^{m} |\langle g, \psi_j \rangle|^2 \Big)^{1/2}. \tag{2.14} \]
Hence (2.2) now gives the first result. Now suppose that $f \in T_n^{\perp}$ and $\mathcal{P}(T_n) \subseteq T_n$. Since $\mathcal{P}_m$ is self-adjoint, we have $\langle \mathcal{P}_m f, g \rangle = \langle f, \mathcal{P}_m g \rangle = \langle f, \mathcal{P}_m g - \mathcal{P} g \rangle$. Here the second equality is due to the fact that $f \perp T_n$ and $\mathcal{P} g \in T_n$ for $g \in T_n$. By the Cauchy–Schwarz inequality, we obtain
\[ D_{n,m} \le \sup_{\substack{g \in T_n \\ \|g\| = 1}} \|\mathcal{P} g - \mathcal{P}_m g\|. \]
For $g \in T_n$, we note from (2.1) that
\[ \|\mathcal{P} g - \mathcal{P}_m g\|^2 \le c_2 \sum_{j > m} |\langle g, \psi_j \rangle|^2 = c_2 \langle \mathcal{P} g - \mathcal{P}_m g, g \rangle \le c_2 \epsilon_{n,m} \|g\|^2, \]
where the final inequality follows from the definition of $\epsilon_{n,m}$.

At this moment, we mention the following important fact. The constants $C_{n,m}$, $D_{n,m}$ and $C_n^*$, as well as the approximation $f_{n,m}$, are determined only by the space $T_n$, not by the choice of reconstruction vectors $\phi_1, \ldots, \phi_n$ themselves. As we shall discuss later, the choice of reconstruction vectors only affects the stability of the scheme. This observation aside, we are now able to state the main theorem of this section:

Theorem 2.4. For every $n \in \mathbb{N}$ there exists an $m_0$ such that the approximation $f_{n,m}$, defined by (2.6), exists and is unique for all $m \ge m_0$, and satisfies the stability estimate $\|f_{n,m}\| \le d_2 C_{n,m}^{-1} \|f\|$. Furthermore,
\[ \|f - \mathcal{Q}_n f\| \le \|f - f_{n,m}\| \le K_{n,m} \|f - \mathcal{Q}_n f\|, \quad K_{n,m} = \sqrt{1 + D_{n,m}^2 C_{n,m}^{-2}}. \tag{2.15} \]
Specifically, the parameter $m_0$ is the least value of $m$ such that $C_{n,m} > 0$.
To prove this theorem, we first recall that a Hermitian form $a : H \times H \to \mathbb{C}$ is said to be continuous if, for some constant $\gamma > 0$, $|a(f, g)| \le \gamma \|f\| \|g\|$ for all $f, g \in H$. Moreover, $a$ is coercive provided $a(f, f) \ge \omega \|f\|^2$ for all $f \in H$, for some constant $\omega > 0$ [25]. We now require the following lemma:

Lemma 2.5. Suppose that $a_m : H \times H \to \mathbb{C}$ is the sesquilinear form $a_m(f, g) = \langle \mathcal{P}_m f, g \rangle$. Then $a_m$ is continuous with constant $\gamma \le d_2$. Moreover, for every $n \in \mathbb{N}$ there exists an $m_0$ such that the restriction of $a_m$ to $T_n \times T_n$ is coercive for all $m \ge m_0$. Specifically, if $C_{n,m}$ is given by (2.11), then $m_0$ is the least value of $m$ such that $C_{n,m} > 0$, and, for all $m \ge m_0$,
\[ a_m(f, f) \ge C_{n,m} \|f\|^2, \quad \forall f \in T_n. \]
Finally, for all $f \in H$ and $g \in T_n$, we have $|a_m(f - \mathcal{Q}_n f, g)| \le D_{n,m} \|f - \mathcal{Q}_n f\| \|g\|$.

Proof. Continuity follows immediately from (2.14). For the second and final results, we merely use the definitions (2.11) and (2.13) of $C_{n,m}$ and $D_{n,m}$ respectively.

Proof of Theorem 2.4. To establish existence and uniqueness, it suffices to prove that the linear operator $\mathcal{U} : T_n \to \mathbb{C}^n$, $g \mapsto \{\langle \mathcal{P}_m g, \phi_j \rangle\}_{j=1}^{n}$, is invertible. Suppose that $g \in T_n$ with $\mathcal{U} g = 0$. By definition, we have $\langle \mathcal{P}_m g, \phi_j \rangle = 0$ for $j = 1, \ldots, n$. Using linearity, it follows that $\langle \mathcal{P}_m g, g \rangle = 0$. Lemma 2.5 now gives $0 \le C_{n,m} \|g\|^2 \le 0$. Hence $g = 0$, and therefore $\mathcal{U}$ has trivial kernel.

Stability of $f_{n,m}$ is easily established from the continuity and coercivity conditions. Setting $\phi = f_{n,m}$ in (2.6) gives
\[ C_{n,m} \|f_{n,m}\|^2 \le a_m(f_{n,m}, f_{n,m}) = a_m(f, f_{n,m}) \le d_2 \|f\| \|f_{n,m}\|, \]
as required. Now consider the error estimate (2.15). Suppose that we define $e_{n,m} = f_{n,m} - \mathcal{Q}_n f \in T_n$. Then, by definition of $f_{n,m}$, we have $a_m(e_{n,m}, \phi) = a_m(f - \mathcal{Q}_n f, \phi)$ for all $\phi \in T_n$. In particular, setting $\phi = e_{n,m}$, we obtain
\[ \|e_{n,m}\| \le C_{n,m}^{-1} D_{n,m} \|f - \mathcal{Q}_n f\|. \tag{2.16} \]
Since $\mathcal{Q}_n f$ is the orthogonal projection of $f$ onto $T_n$, we have $\|f - f_{n,m}\|^2 = \|e_{n,m}\|^2 + \|f - \mathcal{Q}_n f\|^2$, which gives the full result.
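As an illustration of the defining equations (2.6), the following is a minimal numerical sketch and not the authors' implementation. It assumes the Fourier setting described in the introduction: $H = L^2(-1, 1)$, the orthonormal Fourier sampling basis $\psi_j(x) = e^{i \pi j x} / \sqrt{2}$ restricted to the $m$ lowest frequencies (with $m$ odd), and orthonormal Legendre polynomials as reconstruction vectors. The samples are generated by Gauss-Legendre quadrature purely for demonstration, and (2.6) is solved in least-squares form for the matrix with entries $\langle \phi_k, \psi_j \rangle$. The names `reconstruct` and `quad_pts` are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre as leg


def reconstruct(f, n, m, quad_pts=200):
    """Return a callable approximating f_{n,m} from m Fourier samples of f.

    Assumptions (illustrative only): m is odd, the samples are the Fourier
    coefficients with frequencies |j| <= (m - 1) / 2, and all inner products
    are evaluated with Gauss-Legendre quadrature on [-1, 1].
    """
    x, w = leg.leggauss(quad_pts)              # quadrature nodes and weights
    half = (m - 1) // 2
    freqs = np.arange(-half, half + 1)
    # Row j holds conj(psi_j(x)) * w, so a matrix product yields <g, psi_j>.
    psi_conj_w = np.exp(-1j * np.pi * np.outer(freqs, x)) / np.sqrt(2.0) * w
    # Column k holds the orthonormal Legendre polynomial sqrt(k + 1/2) P_k(x).
    scale = np.sqrt(np.arange(n) + 0.5)
    phi = leg.legvander(x, n - 1) * scale
    U = psi_conj_w @ phi                       # U[j, k] = <phi_k, psi_j>
    fhat = psi_conj_w @ f(x)                   # samples <f, psi_j>
    # Least-squares solution of U alpha ~ fhat, whose normal equations are (2.6).
    alpha, *_ = np.linalg.lstsq(U, fhat, rcond=None)
    # For real f and a symmetric frequency set, alpha is real up to rounding.
    coeffs = (alpha * scale).real              # coefficients of P_0, ..., P_{n-1}
    return lambda t: leg.legval(t, coeffs)
```

For instance, `reconstruct(np.exp, 8, 65)` returns a polynomial of degree 7 built from 65 Fourier coefficients of the nonperiodic function $e^x$. Under these assumptions, a polynomial of degree less than $n$ is recovered up to rounding error whenever $C_{n,m} > 0$, which is what the error bound (2.15) predicts when $f \in T_n$.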
Given that $a_m$ has been shown to be continuous and coercive, it may be tempting to seek to apply the Lax–Milgram theorem and Céa's lemma to obtain Theorem 2.4. However, the Hermitian form $a_m$, when considered as a mapping $H \times H \to \mathbb{C}$, will not, in general, be coercive. This is readily seen from the definition of $C_{n,m}$. The finite-dimensional operator $\mathcal{P}_m|_{T_n}$ converges uniformly to $\mathcal{P}|_{T_n}$, whereas its infinite-dimensional counterpart $\mathcal{P}_m$ typically does not (for example, when $\mathcal{P}_m$ is the Fourier projection operator and $H = L^2(\mathbb{T})$). Hence, $a_m$ only becomes coercive when restricted to $T_n \times T_n$, and these standard results do not automatically apply.

Though Theorem 2.4 establishes an estimate for the error $f - f_{n,m}$ measured in the natural norm on $H$, it is also useful to derive a result valid for any other norm defined on a suitable subspace of $H$ (for example, this may be the uniform norm on $\mathbb{T}$ in the case of Fourier series). To this end, let $|||\cdot|||$ be such a norm and define $G = \{g \in H : |||g||| < \infty\}$. We have:

Corollary 2.6. Suppose that $f \in G$, $T_n \subseteq G$ and that $f_{n,m}$ is defined by (2.6). Then, for all $m \ge m_0$,
\[ |||f - f_{n,m}||| \le |||f - \mathcal{Q}_n f||| + k_n \frac{D_{n,m}}{C_{n,m}} \|f - \mathcal{Q}_n f\|, \tag{2.17} \]
where $k_n = \sup_{\phi \in T_n, \|\phi\| = 1} |||\phi|||$ and $C_{n,m}$, $D_{n,m}$ are given by (2.11) and (2.13) respectively.

Proof. Let $e_{n,m} = f_{n,m} - \mathcal{Q}_n f$ once more. Since $e_{n,m} \in T_n$, it follows from the definition of $k_n$ and the inequality (2.16) that
\[ |||e_{n,m}||| \le k_n \|e_{n,m}\| \le k_n \frac{D_{n,m}}{C_{n,m}} \|f - \mathcal{Q}_n f\|. \]
The full result is obtained from the triangle inequality $|||f - f_{n,m}||| \le |||e_{n,m}||| + |||f - \mathcal{Q}_n f|||$.

Remark 1. In practice, it is useful to have an upper bound for the constant $k_n$. A simple exercise gives $k_n \le \sum_{j=1}^{n} |||\Phi_j|||$, where $\{\Phi_j\}_{j=1}^{n}$ is any orthonormal basis for $T_n$.

Theorem 2.4 confirms that the approximation $f_{n,m}$ is quasi-optimal whenever $m$ is sufficiently large in comparison to $n$. In other words, the error $\|f - f_{n,m}\|$ can be bounded by a constant multiple (independent of $n$) of $\|f - \mathcal{Q}_n f\|$. Naturally, whenever {φ1 , . . .
, φn } are the first $n$ vectors in an infinite sequence $\phi_1, \phi_2, \ldots$ that form a basis for $H$ (a natural assumption to make from a practical standpoint), we find that $f_{n,m}$ converges to $f$ at the same rate as $\mathcal{Q}_n f$, provided $m$ scales appropriately with $n$. Under the same assumptions, Corollary 2.6 also verifies convergence of $f_{n,m}$ to $f$ in $|||\cdot|||$, whenever $\mathcal{Q}_n f \to f$ in this norm and $k_n \|f - \mathcal{Q}_n f\| \to 0$ as $n \to \infty$.

Note one other feature of this framework. Provided $m$ is such that $C_{n,m} > 0$, any function $f \in T_n$ will be recovered exactly by $f_{n,m}$. In the language of sampling theory, this property is commonly referred to as perfect reconstruction.

Remark 2. Whilst this framework relies on the fact that $m$ can range independently of $n$, a natural question to ask is what happens if $m$ is set equal to $n$ (note that, in this case, one recovers the well-known consistent sampling framework of Eldar et al [21, 47]). This question was discussed in detail in [3], where it was demonstrated that such an approach often leads to severe ill-conditioning as $n = m \to \infty$. Additionally, stringent restrictions are placed on the types of vectors $f$ that can be reconstructed (see also Section 3.6). Conversely, by allowing $m$ to vary independently of $n$, we obtain a reconstruction $f_{n,m}$ that is guaranteed to converge for any vector $f \in H$. Moreover, as we discuss in Section 2.3, provided the reconstruction vectors are suitably chosen, the computation of $f_{n,m}$ is completely stable.

2.1 The case of an orthogonal sampling basis

In the previous setup, the sampling basis was assumed to be a Riesz basis. The particular case of $\{\psi_j\}_{j=1}^{\infty}$ being an orthonormal basis warrants further study, since this situation often arises in applications (e.g. Fourier sampling). In this setting, both (2.1) and (2.2) (which are now identical) hold with equality and with all constants being precisely one. In other words, Parseval's identity
\[ \|f\|^2 = \sum_{j=1}^{\infty} |\langle f, \psi_j \rangle|^2, \quad \forall f \in H, \]
holds.
Our first result provides a geometric interpretation for the constant $C_{n,m}$ in terms of subspace angles. Recall that if $U$ and $V$ are subspaces of $H$, then the angle $\theta_{UV}$ is given by
\[ \cos(\theta_{UV}) = \inf_{\substack{u \in U \\ \|u\| = 1}} \|\mathcal{Q}_V u\|, \tag{2.18} \]
where $\mathcal{Q}_V : H \to V$ is the orthogonal projection. We have:

Lemma 2.7. Suppose that $\{\psi_j\}_{j=1}^{\infty}$ is an orthonormal basis of $H$. Then $C_{n,m} = \cos^2(\theta)$, where $\theta = \theta_{T_n S_m}$ is the angle between the subspaces $T_n$ and $S_m$.

Proof. Whenever $\{\psi_j\}_{j=1}^{\infty}$ is orthonormal, the operator $\mathcal{P}_m$ is the orthogonal projection onto $S_m$. Hence the result follows immediately from (2.11) and (2.18).

Whilst this property is interesting, the following result has much greater significance:

Theorem 2.8. Suppose that $\{\psi_j\}_{j=1}^{\infty}$ is an orthonormal basis of $H$. Then, for all $m \ge m_0$, the approximation $f_{n,m}$ satisfies
\[ \|f - \mathcal{Q}_n f\| \le \|f - f_{n,m}\| \le K_{n,m} \|f - \mathcal{Q}_n f\|, \tag{2.19} \]
where
\[ K_{n,m} = \sqrt{1 + (1 - C_{n,m}) C_{n,m}^{-2}} = \sqrt{1 + \tan^2\theta \, \sec^2\theta}, \]
and $\theta$ is as in Lemma 2.7. In particular, for fixed $n$, $K_{n,m} \to 1$ as $m \to \infty$, and hence $f_{n,m} \to \mathcal{Q}_n f$ as $m \to \infty$.

Proof. Note that $\mathcal{P} = \mathcal{I}$ for an orthonormal sampling basis. In particular, $\epsilon_{n,m} = C_n^* - C_{n,m}$ and $C_n^* = 1$. Hence, using Lemma 2.3, we find that $D_{n,m}^2 \le 1 - C_{n,m}$. The result now follows from Theorem 2.4.

From this we conclude the following: not only is $f_{n,m}$ quasi-optimal, it is also asymptotically optimal in the sense that $f_{n,m} \to \mathcal{Q}_n f$, the best approximation to $f$ from $T_n$, as $m \to \infty$. Hence, using this approach, we can recover an approximation to $f$ that is arbitrarily close to the error-minimising approximant (which, as mentioned, cannot be computed directly from the given samples). Moreover, the rate of convergence of $f_{n,m}$ to $\mathcal{Q}_n f$ is completely independent of the particular vector $f$, and relies only on the rate of decay of the parameter $1 - C_{n,m}$.

Note that asymptotic optimality also occurs for general Riesz bases whenever $\mathcal{P}(T_n) \subseteq T_n$, in which case $D_{n,m} \to 0$ as $m \to \infty$ (see Lemma 2.3).
The case of orthonormal sampling vectors presents the most obvious example of a basis satisfying this condition.

Remark 3. Whenever the vectors $\{\psi_j\}_{j=1}^{\infty}$ are not orthonormal, a natural question to ask is whether we can modify the approach for computing $f_{n,m}$ to recover asymptotic optimality. This can be easily done, at least in theory, by replacing the operator $\mathcal{P}_m$, given by (2.4), with the orthogonal projection $H \to S_m$. In this case, both Lemma 2.7 and Theorem 2.8 will hold for nonorthogonal sampling bases. The downside of the approach is that it requires additional computational cost to compute $f_{n,m}$, as we explain at the end of the next section.

Another potential means to recover asymptotic optimality is to define $\mathcal{P}_m g = \sum_{j=1}^{m} \langle g, \psi_j \rangle \psi_j^*$, where $\{\psi_j^*\}$ is the set of dual vectors to the sampling vectors $\{\psi_j\}$. In this case, $\mathcal{P}_m \to \mathcal{I}$ strongly, and asymptotic optimality follows. In practice, however, one may not have access to the dual vectors, thus this approach cannot necessarily be readily implemented.

2.2 Oblique asymptotic optimality

Whenever the sampling basis is orthonormal, $f_{n,m}$ converges to the best approximation $\mathcal{Q}_n f$ as $m \to \infty$. An obvious question to ask is what can be said in the general case, where the vectors $\{\psi_j\}$ only form a Riesz basis. The intuitive explanation given previously indicated that $f_{n,m} \approx \tilde{f}_n$, where $\tilde{f}_n$ is defined by (2.8). We now wish to confirm this observation. In fact, as in the orthonormal case, we may demonstrate a stronger result: namely, for fixed $n \in \mathbb{N}$, $f_{n,m} \to \tilde{f}_n$ as $m \to \infty$ at a rate that is independent of the particular vector $f \in H$.

Recall that the form $a(\cdot, \cdot)$ yields an equivalent inner product on $H$. Since $\tilde{f}_n$ is defined by the equations $a(\tilde{f}_n, \phi) = a(f, \phi)$, $\forall \phi \in T_n$, the mapping $f \mapsto \tilde{f}_n$ is the orthogonal projection onto $T_n$ with respect to this inner product. Letting $\|g\|_a = \sqrt{a(g, g)}$ be the corresponding norm on $H$, we now define the constants
\[ \tilde{C}_{n,m} = \inf_{\substack{\phi \in T_n \\ \|\phi\|_a = 1}} \langle \mathcal{P}_m \phi, \phi \rangle, \quad \tilde{D}_{n,m} = \sup_{\substack{f \in T_n^{\perp} \\ \|f\|_a = 1}} \sup_{\substack{g \in T_n \\ \|g\|_a = 1}} |\langle \mathcal{P}_m f, g \rangle|. \tag{2.20} \]
In this instance, $T_n^{\perp}$ is defined with respect to the $a$-inner product, i.e. $T_n^{\perp} = \{f \in H : a(f, \phi) = 0, \ \forall \phi \in T_n\}$. Conversely, when considered with respect to the canonical inner product, this subspace is precisely $\mathcal{P}(T_n)^{\perp} = \{f \in H : \langle f, \phi \rangle = 0, \ \forall \phi \in \mathcal{P}(T_n)\}$.

Note the similarity between $\tilde{C}_{n,m}$ and $\tilde{D}_{n,m}$ and the quantities $C_{n,m}$ and $D_{n,m}$ defined in (2.11) and (2.13) respectively. Roughly speaking, the latter measure the deviation of $f_{n,m}$ from $\mathcal{Q}_n f$, whereas, as we will subsequently show, the former determine the deviation of $f_{n,m}$ from $\tilde{f}_n$. With these definitions to hand, identical arguments to those given in the proofs of Lemmas 2.2 and 2.3 now yield:

Lemma 2.9. For all $m, n \in \mathbb{N}$, $\tilde{C}_{n,m} \ge \frac{1}{d_2} C_{n,m}$, where $C_{n,m}$ is as in (2.11). Moreover, for fixed $n$, $\tilde{C}_{n,m} \to 1$ as $m \to \infty$.

Lemma 2.10. For all $m, n \in \mathbb{N}$, $\tilde{D}_{n,m} \le d_2$ and $\tilde{D}_{n,m}^2 \le c_2 (1 - \tilde{C}_{n,m})$. In particular, for fixed $n$, $\tilde{D}_{n,m} \to 0$ as $m \to \infty$.

Using these lemmas, we deduce:

Corollary 2.11. If $f_{n,m}$ and $\tilde{f}_n$ are given by (2.6) and (2.8) respectively, then
\[ \|f_{n,m} - \tilde{f}_n\|_a \le \frac{\tilde{D}_{n,m}}{\tilde{C}_{n,m}} \|f - \tilde{f}_n\|_a, \]
and we have the error estimate
\[ \|f - \tilde{f}_n\|_a \le \|f - f_{n,m}\|_a \le \tilde{K}_{n,m} \|f - \tilde{f}_n\|_a, \quad \tilde{K}_{n,m} = \sqrt{1 + \tilde{D}_{n,m}^2 \tilde{C}_{n,m}^{-2}}. \]
In particular, for any $f \in H$, $f_{n,m} \to \tilde{f}_n$ as $m \to \infty$.

Proof. Since $(f_{n,m} - \tilde{f}_n) \in T_n$, we have
\[ \tilde{C}_{n,m} \|f_{n,m} - \tilde{f}_n\|_a^2 \le \big\langle \mathcal{P}_m (f_{n,m} - \tilde{f}_n), f_{n,m} - \tilde{f}_n \big\rangle. \]
Moreover, because $\langle \mathcal{P}_m f_{n,m}, \phi \rangle = \langle \mathcal{P}_m f, \phi \rangle$ for all $\phi \in T_n$, we deduce that
\[ \tilde{C}_{n,m} \|f_{n,m} - \tilde{f}_n\|_a^2 \le \big\langle \mathcal{P}_m (f - \tilde{f}_n), f_{n,m} - \tilde{f}_n \big\rangle \le \tilde{D}_{n,m} \|f - \tilde{f}_n\|_a \|f_{n,m} - \tilde{f}_n\|_a, \]
where the second inequality follows from the definition (2.20) of $\tilde{D}_{n,m}$ and the fact that $(f - \tilde{f}_n) \in T_n^{\perp}$, the orthogonal complement of $T_n$ with respect to the $a$-inner product.

Note that the mapping $\mathcal{W}_n : f \mapsto \tilde{f}_n$ is an oblique projection with respect to the inner product $\langle \cdot, \cdot \rangle$ on $H$. In particular, $\mathcal{W}_n$ has range $T_n$ and kernel $\mathcal{P}(T_n)^{\perp}$, and we have the decomposition $H = T_n \oplus \mathcal{P}(T_n)^{\perp}$. For this reason, we say that $f_{n,m}$ possesses oblique asymptotic optimality.
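The oblique projection $\mathcal{W}_n$ can be made concrete in a finite-dimensional toy model. The sketch below is an assumption-laden illustration rather than part of the original analysis: it takes $H = \mathbb{R}^N$, uses the columns of a randomly perturbed identity matrix as a stand-in Riesz basis, and draws the reconstruction vectors at random. Because $\mathcal{P}_N = \mathcal{P}$ in finite dimensions, the limit $f_{n,m} \to \tilde{f}_n$ is attained exactly at $m = N$. The function name `oblique_demo` and all parameter values are hypothetical.

```python
import numpy as np


def oblique_demo(N=12, n=3, seed=0):
    """Toy check in H = R^N that f_{n,m} coincides with W_n f once m = N.

    Hypothetical setup: the columns of Psi play the role of the sampling
    basis and the columns of Phi span the reconstruction space T_n.
    """
    rng = np.random.default_rng(seed)
    Psi = np.eye(N) + 0.1 * rng.standard_normal((N, N))  # sampling vectors psi_j
    Phi = rng.standard_normal((N, n))                    # reconstruction vectors phi_k
    f = rng.standard_normal(N)

    P = Psi @ Psi.T                                      # P f = sum_j <f, psi_j> psi_j
    # W f solves a(W f, phi) = a(f, phi) for all phi in T_n, cf. (2.8).
    W = Phi @ np.linalg.solve(Phi.T @ P @ Phi, Phi.T @ P)

    errors = []
    for m in range(n, N + 1):
        U = Psi[:, :m].T @ Phi                           # U[j, k] = <phi_k, psi_j>
        fhat = Psi[:, :m].T @ f                          # first m samples of f
        alpha, *_ = np.linalg.lstsq(U, fhat, rcond=None) # least-squares form of (2.6)
        errors.append(np.linalg.norm(Phi @ alpha - W @ f))
    return W, errors
```

The returned matrix `W` is idempotent but in general not symmetric, which is the matrix counterpart of a projection that is oblique with respect to the canonical inner product. The last entry of `errors` is at the level of rounding error.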
2.3 Computation of $f_{n,m}$

Recall that the computation of the approximation $f_{n,m}$ involves solving the system of equations (2.6). This can be interpreted as the normal equations of a least squares problem. Suppose that $f_{n,m} = \sum_{j=1}^{n} \alpha_j \phi_j$, $\alpha = (\alpha_1, \ldots, \alpha_n) \in \mathbb{C}^n$ and $\hat{f} = (\hat{f}_1, \ldots, \hat{f}_m)$. If $U$ is the $m \times n$ matrix with $(j, k)$th entry $\langle \phi_k, \psi_j \rangle$, then (2.6) is given exactly by $A \alpha = U^{\dagger} \hat{f}$, where $A = U^{\dagger} U$ and $U^{\dagger}$ is the adjoint of $U$. In other words, the vector $\alpha$ is the least squares solution of the problem $U \alpha \approx \hat{f}$.

This system can be solved iteratively by applying conjugate gradient iterations to the normal equations, for example. The number of required iterations is dependent on the condition number $\kappa(A)$ of the matrix $A$. Specifically, the number of iterations required to obtain numerical convergence (i.e. to within a prescribed tolerance) is proportional to $\sqrt{\kappa(A)}$ [26]. In particular, if $\kappa(A)$ is $\mathcal{O}(1)$ for all $n$ and $m \ge m_0$, then the number of iterations is also $\mathcal{O}(1)$ for all $n$. Hence, the cost of computing $f_{n,m}$ is determined solely by the number of operations required to perform matrix-vector multiplications involving $U$. In other words, only $\mathcal{O}(mn)$ operations.

Naturally, aside from this consideration, the condition number of $A$ is also important since it determines the susceptibility of the numerical computation to both round-off error and noise. Specifically, an error of magnitude $\epsilon$ in the inputs (i.e. the samples $\hat{f}_j$, $j = 1, \ldots, m$) will yield an error of magnitude roughly $\kappa(A) \epsilon$ in the output $f_{n,m}$.

For these reasons it is of utmost importance to study the condition number of $A$. For this, we first introduce the Hermitian matrix $\tilde{A} \in \mathbb{C}^{n \times n}$ with $(j, k)$th entry $\langle \phi_j, \phi_k \rangle$. Note that $\tilde{A}$ is the Gram matrix of the vectors $\{\phi_1, \ldots, \phi_n\}$. In particular, $\kappa(\tilde{A})$ is a measure of the suitability of the particular vectors in which to compute $\mathcal{Q}_n f$. With $\tilde{A}$ to hand, we also introduce the related matrix $\tilde{A}_a \in \mathbb{C}^{n \times n}$ with $(j, k)$th entry $a(\phi_j, \phi_k) = \langle \mathcal{P} \phi_j, \phi_k \rangle$, i.e.
the Gram matrix with respect to the inner product $a(\cdot, \cdot)$. The following lemma comes as no surprise:

Lemma 2.12. The matrices $\tilde{A}$ and $\tilde{A}_a$ are spectrally equivalent. In particular, for all $n \in \mathbb{N}$,
\[ \frac{d_1}{d_2} \kappa(\tilde{A}) \le \kappa(\tilde{A}_a) \le \frac{d_2}{d_1} \kappa(\tilde{A}). \]

Proof. For any Hermitian matrix $B$, the condition number is the ratio of the largest and smallest eigenvalues (in absolute value). Moreover, if $B$ is positive definite, then
\[ \inf_{\substack{\alpha \in \mathbb{C}^n \\ \alpha \ne 0}} \frac{\alpha^{\dagger} B \alpha}{\alpha^{\dagger} \alpha} = \lambda_{\min}(B), \quad \sup_{\substack{\alpha \in \mathbb{C}^n \\ \alpha \ne 0}} \frac{\alpha^{\dagger} B \alpha}{\alpha^{\dagger} \alpha} = \lambda_{\max}(B). \tag{2.21} \]
If $\phi = \sum_{j=1}^{n} \alpha_j \phi_j$, then $\alpha^{\dagger} \tilde{A} \alpha = \|\phi\|^2$ and $\alpha^{\dagger} \tilde{A}_a \alpha = a(\phi, \phi)$. Hence, spectral equivalence now follows immediately from (2.10).

Concerning the condition number of the matrix $A$, we now have the following:

Lemma 2.13. Suppose that $m \ge m_0$, where $m_0$ is as in Theorem 2.4, and $C_{n,m}$ and $\tilde{C}_{n,m}$ are given by (2.11) and (2.20) respectively. Then
\[ \tilde{C}_{n,m} \, \kappa(\tilde{A}_a) \le \kappa(A) \le \frac{1}{\tilde{C}_{n,m}} \kappa(\tilde{A}_a), \quad \frac{C_{n,m}}{d_2} \kappa(\tilde{A}) \le \kappa(A) \le \frac{d_2}{C_{n,m}} \kappa(\tilde{A}). \]
Moreover, for fixed $n$, $A \to \tilde{A}_a$ as $m \to \infty$, and, if $\mathcal{P} = \mathcal{I}$, $A \to \tilde{A} = \tilde{A}_a$.

Proof. The matrix $A$ is Hermitian and, provided $m \ge m_0$, positive definite. Hence, its eigenvalues are given by (2.21). For $\phi = \sum_{j=1}^{n} \alpha_j \phi_j$, we have $\alpha^{\dagger} A \alpha = \langle \mathcal{P}_m \phi, \phi \rangle$. By definition of $C_{n,m}$ and $\tilde{C}_{n,m}$, we find that $\lambda_{\min}(A) \ge C_{n,m} \lambda_{\min}(\tilde{A})$ and $\lambda_{\min}(A) \ge \tilde{C}_{n,m} \lambda_{\min}(\tilde{A}_a)$. Moreover, by (2.2) we have $\lambda_{\max}(A) \le d_2 \lambda_{\max}(\tilde{A})$ and $\lambda_{\max}(A) \le \lambda_{\max}(\tilde{A}_a)$. The first result now follows immediately from (2.21). For the second, we merely note that each entry of $A$ converges to the corresponding entry of $\tilde{A}_a$ as $m \to \infty$.

Note the important conclusion of this lemma: computing $f_{n,m}$ from (2.6) is no more ill-conditioned than the computation of the orthogonal projection $\mathcal{Q}_n f$ or the oblique projection $\mathcal{W}_n f$ in terms of the vectors $\{\phi_1, \ldots, \phi_n\}$. In practice, it is often true that these vectors correspond to the first $n$ vectors in a basis $\{\phi_j\}_{j=1}^{\infty}$ of $H$ with additional structure. Whenever this is the case, as the following trivial corollary indicates, we can expect good conditioning:

Corollary 2.14.
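Suppose that $\{\phi_j\}_{j=1}^{\infty}$ is a Riesz basis for $\mathrm{H}$ (with respect to $\langle\cdot,\cdot\rangle$) with constants $c'_1$ and $c'_2$. Then
\[ \kappa(A) \leq \frac{c'_2\, d_2}{c'_1\, C_{n,m}}. \]

Proof. This follows immediately from (2.1) and Lemma 2.13.

Put together, a fundamental conclusion of Theorem 2.4, Lemma 2.13 and Corollary 2.14 is the following: for a given reconstruction space $\mathrm{T}_n$, the individual vectors $\phi_1, \ldots, \phi_n$ can be chosen arbitrarily, without altering the analysis of the approximation $f_{n,m}$ (which itself does not depend on the individual vectors used to represent it). The choice of vectors only becomes important when considering the condition number of the linear system to be solved. Moreover, the quality of a system of vectors for the reconstruction problem is completely intrinsic, in that it is determined only by the corresponding Gram matrix (in particular, it is independent of the sampling space).

Corollary 2.14 confirms that the approximation $f_{n,m}$ can be readily computed in a stable manner for many choices of reconstruction basis. However, to fully implement this method, as we discuss further in the next section, it is useful to have a numerical way of computing $C_{n,m}$. The following lemma provides such a means:

Lemma 2.15. The quantity $C_{n,m}$ is given by $C_{n,m} = \lambda_{\min}(\tilde{A}^{-1}A)$. Moreover, if $\tilde{A}$ and $A$ commute, then $C_{n,m} = 1 - \|I - \tilde{A}^{-1}A\|$. In particular, if $\{\phi_j\}_{j=1}^{n}$ is an orthonormal basis, then $C_{n,m} = \lambda_{\min}(A) = 1 - \|I - A\|$.

Proof. By definition $C_{n,m} = \inf_{\phi \in \mathrm{T}_n,\ \|\phi\|=1} \langle \mathcal{P}_m\phi, \phi\rangle$. Letting $\phi = \sum_{j=1}^{n}\alpha_j\phi_j$, we find that
\[ C_{n,m} = \inf_{\substack{\alpha \in \mathbb{C}^n \\ \alpha \neq 0}} \frac{\sum_{j,k=1}^{n} \alpha_j \overline{\alpha_k}\, \langle \mathcal{P}_m\phi_j, \phi_k\rangle}{\sum_{j,k=1}^{n} \alpha_j \overline{\alpha_k}\, \langle \phi_j, \phi_k\rangle} = \inf_{\substack{\alpha \in \mathbb{C}^n \\ \alpha \neq 0}} \frac{\alpha^{\dagger}A\alpha}{\alpha^{\dagger}\tilde{A}\alpha}. \]
We now claim that, for arbitrary Hermitian positive definite matrices $B$ and $C$ with $B$ nonsingular, the following holds:
\[ \inf_{\substack{\alpha \in \mathbb{C}^n \\ \alpha \neq 0}} \frac{\alpha^{\dagger}C\alpha}{\alpha^{\dagger}B\alpha} = \lambda_{\min}(B^{-1}C), \qquad \sup_{\substack{\alpha \in \mathbb{C}^n \\ \alpha \neq 0}} \frac{\alpha^{\dagger}C\alpha}{\alpha^{\dagger}B\alpha} = \lambda_{\max}(B^{-1}C). \]
To do so, write $B = D^{\dagger}D$, with $D$ nonsingular. Then, after rearranging, we obtain
\[ \inf_{\substack{\alpha \in \mathbb{C}^n \\ \alpha \neq 0}} \frac{\alpha^{\dagger}C\alpha}{\alpha^{\dagger}B\alpha} = \inf_{\substack{\beta \in \mathbb{C}^n \\ \beta \neq 0}} \frac{\beta^{\dagger}D^{-\dagger}CD^{-1}\beta}{\beta^{\dagger}\beta} = \lambda_{\min}(D^{-\dagger}CD^{-1}), \]
for example.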
However, a trivial calculation confirms that the eigenvalues of $D^{-\dagger}CD^{-1}$ are identical to those of $B^{-1}C$, thus establishing the claim. Since $\tilde{A}$ is nonsingular, this confirms that $C_{n,m} = \lambda_{\min}(\tilde{A}^{-1}A)$. For the second result, we merely notice that $\lambda_{\min}(B) = 1 - \lambda_{\max}(I - B) = 1 - \|I - B\|$, whenever $B$ is Hermitian.

In Section 2.1, we briefly discussed a modified approach where the operator $\mathcal{P}_m$, usually given by (2.4), was replaced by the orthogonal projection operator. The advantage of this approach is that it guarantees asymptotic optimality. However, the downside is additional computational expense. Indeed, the corresponding matrix is of the form $A = U^{\dagger}V^{-1}U$, where $V \in \mathbb{C}^{m \times m}$ has $(j,k)$th entry $\langle \psi_j, \psi_k\rangle$. Hence, if conjugate gradient iterations are used, at each stage we are required to compute matrix-vector products involving the $m \times m$ matrix $V^{-1}$ (assuming that $V^{-1}$ has been precomputed). In general, this requires $\mathcal{O}(m^2)$ operations. Thus, we incur a cost of $\mathcal{O}(m^2)$, as opposed to $\mathcal{O}(mn)$ for the original algorithm. Hence, in practice it may be better to settle for only quasi- and oblique asymptotic optimality, whilst retaining a lower computational cost.

2.4 Conditions for guaranteed, quasi-optimal recovery

Let us return to the standard form of the method once more. To implement this method, it is necessary to have conditions that guarantee nonsingularity, stability and quasi-optimal recovery. In other words, for given sampling and reconstruction bases, we wish to study the quantity
\[ \Theta(n;\theta) = \min\{m \in \mathbb{N} : C_{n,m} \geq \theta\}, \qquad \theta \in (0, d_2), \tag{2.22} \]
where $C_{n,m}$ is given by (2.11) and $d_2$ stems from (2.2). Note that $C_{n,m} \leq d_2$ by (2.2), thereby explaining the stated range of $\theta$. Also, by Lemma 2.2, we have that $\lim_{m\to\infty} C_{n,m} \geq d_1 > 0$, thus $\Theta$ is well-defined. By definition, $\Theta(n;\theta)$ is the least $m$ such that $\|f - f_{n,m}\| \leq c(\theta)\|f - \mathcal{Q}_n f\|$, where
\[ c(\theta) = \sqrt{1 + d_2^2\,\theta^{-2}}, \tag{2.23} \]
or $c(\theta) = \sqrt{1 + (1-\theta)\theta^{-2}}$ whenever the sampling basis is orthonormal.
In other words, $\Theta(n;\theta)$ is the least $m$ required for quasi-optimal recovery with constant $c(\theta)$. Thus, provided $m \geq \Theta(n;\theta)$, the approximation $f_{n,m}$ converges at the same rate as $\mathcal{Q}_n f$ as $n \to \infty$. In addition, $m \geq \Theta(n;\theta)$ guarantees that $\|f_{n,m}\| \leq d_2\theta^{-1}\|f\|$ and $\kappa(A) \leq d_2\theta^{-1}\kappa(\tilde{A})$, thus making the linear system for $f_{n,m}$ solvable in a number of operations proportional to $\sqrt{d_2\theta^{-1}\kappa(\tilde{A})}$.

Note that $\Theta(n;\theta)$ is determined only by the sampling and reconstruction spaces $\mathrm{S}_m$ and $\mathrm{T}_n$. Whilst $\Theta(n;\theta)$ can be numerically computed for any pair of spaces via the expression given in Lemma 2.15, analytical bounds must be determined on a case-by-case basis. In the next section, where we consider the recovery of functions from their Fourier samples using (piecewise) polynomial bases, we are able to derive explicit forms for such bounds.

Remark 4 As mentioned, the framework developed in this section was first introduced by the authors in [3]. Whilst a result similar to Theorem 2.4 was proved, there are a number of important improvements offered by the theory presented in this paper:

1. In [3] it was assumed that the reconstruction vectors $\phi_1, \ldots, \phi_n$ were the first $n$ in an infinite sequence of vectors that formed a Riesz basis for $\mathrm{H}$. Conversely, Theorem 2.4 depends only on the subspace $\mathrm{T}_n$, and thus the individual reconstruction vectors can be chosen arbitrarily.

2. The constants $K_{n,m}$ and $C_{n,m}$ are known exactly in terms of the sampling and reconstruction bases, and, whenever the sampling vectors are orthogonal, they possess a simple geometric interpretation in terms of subspace angles. Moreover, these constants can be computed by determining either the minimal eigenvalue or, in certain cases, the norm of an $n \times n$ matrix.

3. Simple, explicit bounds for the condition number of the matrix $A$ are known in terms of the constant $C_{n,m}$ and the Gram matrix $\tilde{A}$.

4. The behaviour of $f_{n,m}$ as $m \to \infty$ (for $n$ fixed) can be fully explained in terms of oblique asymptotic optimality.
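The least squares formulation of Section 2.3 can be made concrete with a short sketch. The code below is a minimal illustration, not the authors' implementation: it applies conjugate gradients to the normal equations $U^{\dagger}U\alpha = U^{\dagger}\hat{f}$ without ever forming $A$, so that each iteration costs two matrix-vector products with $U$. The function names, the dense list-of-rows storage and the stopping rule are assumptions made for this example.

```python
def matvec(U, x):
    """Return U x, with U stored as a list of m rows of length n."""
    return [sum(u * xk for u, xk in zip(row, x)) for row in U]


def adjoint_matvec(U, y):
    """Return U^dagger y (conjugate transpose of U applied to y)."""
    n = len(U[0])
    return [sum(row[k].conjugate() * yj for row, yj in zip(U, y)) for k in range(n)]


def cg_normal_equations(U, f_hat, tol=1e-12, max_iter=200):
    """Conjugate gradients on U^dagger U alpha = U^dagger f_hat (illustrative sketch).

    Stops when the residual of the normal equations has been reduced by the
    factor tol relative to its initial size. Returns (alpha, iterations used).
    """
    n = len(U[0])
    alpha = [0j] * n
    r = adjoint_matvec(U, f_hat)          # residual at the starting guess alpha = 0
    p = list(r)                           # first search direction
    rs_old = sum(abs(x) ** 2 for x in r)
    rs_start = rs_old
    iterations = 0
    while iterations < max_iter and rs_old > (tol ** 2) * rs_start:
        Ap = adjoint_matvec(U, matvec(U, p))      # A p, using only products with U
        pAp = sum((a * b.conjugate()).real for a, b in zip(Ap, p))
        step = rs_old / pAp
        alpha = [a + step * pk for a, pk in zip(alpha, p)]
        r = [rk - step * ak for rk, ak in zip(r, Ap)]
        rs_new = sum(abs(x) ** 2 for x in r)
        p = [rk + (rs_new / rs_old) * pk for rk, pk in zip(r, p)]
        rs_old = rs_new
        iterations += 1
    return alpha, iterations
```

For a well-conditioned $A$ the loop terminates after a small number of passes, in line with the $\sqrt{\kappa(A)}$ iteration count discussed in Section 2.3. A production code would more likely call a library least squares solver.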
3 Polynomial reconstructions from Fourier samples

One of the most important examples of this procedure is the reconstruction of an analytic, but nonperiodic function $f$ to high accuracy from its Fourier coefficients. Direct expansion in Fourier series converges only slowly in the $L^2(-1,1)$ norm, and suffers from the Gibbs phenomenon near the domain boundary. Hence, given the first $m$ Fourier coefficients of $f$, we now seek to reconstruct $f$ to high accuracy in another basis.

Let $\mathrm{H} = L^2(-1,1)$ (the space of square integrable functions on $(-1,1)$), $f : (-1,1) \to \mathbb{R}$ and
\[ \psi_j(x) = \frac{1}{\sqrt{2}}\, e^{ij\pi x}, \qquad j \in \mathbb{Z}, \]
be the standard Fourier basis functions. For $m \geq 2$, we assume that the coefficients
\[ \hat{f}_j = \int_{-1}^{1} f(x)\,\overline{\psi_j(x)}\,dx, \qquad j = -\left\lfloor \frac{m}{2} \right\rfloor + 1, \ldots, \left\lfloor \frac{m}{2} \right\rfloor - 1, \]
are known (note that, whenever $m$ is even, this means that the first $m-1$ Fourier coefficients of $f$ are given. We will allow this minor discrepancy since it simplifies the ensuing analysis).

As a consequence of Theorem 2.4, we are free to choose the reconstruction space. The orthogonal projection of an analytic function onto the space $\mathbb{P}_{n-1}$ of polynomials of degree less than $n$ is known to converge exponentially fast at rate $\rho^{-n}$, where $\rho > 1$ is determined from the largest Bernstein ellipse within which $f$ is analytic [9]. Hence, we let $\mathrm{T}_n = \mathbb{P}_{n-1}$. Note that an orthonormal basis for $\mathrm{T}_n$ is given by the functions
\[ \phi_j(x) = \sqrt{j + \tfrac{1}{2}}\; P_j(x), \qquad j \in \mathbb{N}, \tag{3.1} \]
where $P_j$ is the $j$th Legendre polynomial. Moreover, if $\mathcal{Q}_n$ is the orthogonal projection onto $\mathrm{T}_n$, then it is well-known that
\[ \|f - \mathcal{Q}_n f\| \leq c_f \sqrt{n}\,\rho^{-n}, \tag{3.2} \]
where $c_f$ depends only on the maximal value of $f$ on the Bernstein ellipse indexed by $\rho$. Naturally, we could also assume finite regularity of $f$ throughout, with suitable adjustments made to the various error estimates. However, for simplicity we shall not do this.
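As a small illustration of the basis (3.1), the following sketch evaluates the scaled Legendre functions $\phi_j$ by the standard three-term recurrence and checks their orthonormality numerically. It is only a sanity check under stated assumptions: the function names are invented for this example, and composite Simpson quadrature is used purely for convenience (it is not part of the method described here).

```python
import math


def scaled_legendre(n, x):
    """Return [phi_0(x), ..., phi_{n-1}(x)] with phi_j = sqrt(j + 1/2) * P_j."""
    values = []
    p_prev, p_curr = 1.0, x              # P_0(x) and P_1(x)
    for j in range(n):
        if j == 0:
            pj = 1.0
        elif j == 1:
            pj = x
        else:
            # Three-term recurrence: j P_j = (2j - 1) x P_{j-1} - (j - 1) P_{j-2}
            pj = ((2 * j - 1) * x * p_curr - (j - 1) * p_prev) / j
            p_prev, p_curr = p_curr, pj
        values.append(math.sqrt(j + 0.5) * pj)
    return values


def gram_matrix(n, panels=2000):
    """Approximate the Gram matrix <phi_a, phi_b> on (-1, 1) by composite Simpson.

    `panels` must be even; the result should be close to the identity matrix.
    """
    h = 2.0 / panels
    gram = [[0.0] * n for _ in range(n)]
    for i in range(panels + 1):
        x = -1.0 + i * h
        weight = 1 if i in (0, panels) else (4 if i % 2 else 2)
        phi = scaled_legendre(n, x)
        for a in range(n):
            for b in range(n):
                gram[a][b] += weight * phi[a] * phi[b]
    return [[g * h / 3 for g in row] for row in gram]
```

Calling `gram_matrix(5)` should return a matrix that agrees with the identity up to quadrature error, which is the statement that $\tilde{A} = I$ for this choice of basis.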
With this to hand, provided $m \geq \Theta(n;\theta)$ for some $\theta > 0$, where $\Theta(n;\theta)$ is defined in (2.22), the approximation $f_{n,m}$ obtained from the reconstruction procedure satisfies $\|f - f_{n,m}\| \leq c(\theta)\|f - \mathcal{Q}_n f\|$ (see Theorem 2.4). In particular, $\|f - f_{n,m}\| \leq c(\theta)\,c_f \sqrt{n}\,\rho^{-n}$. Hence, $f_{n,m}$ converges exponentially fast in $n$. The key question remaining is how large $m$ must be in comparison to $n$ to ensure such rates of convergence. Resolving this question involves estimating the quantity $C_{n,m}$, a task we next pursue.

3.1 Estimates for $\Theta(n;\theta)$

For both numerical and analytical estimates of $\Theta(n;\theta)$ we need to select an appropriate basis of $\mathbb{P}_{n-1}$ (recall from Section 2 that $\Theta(n;\theta)$ is independent of the basis used). A natural choice is the orthonormal basis (3.1) of scaled Legendre polynomials. Fortunately, in this case, the inner products $\langle \phi_k, \psi_j\rangle$ (i.e. the entries of the matrix $U$) are known in closed form:
\[ \langle \phi_k, \psi_j \rangle = (-i)^k \sqrt{\frac{k + \tfrac{1}{2}}{j}}\; J_{k+\frac{1}{2}}(j\pi), \qquad j \in \mathbb{Z},\ k \in \mathbb{N}, \tag{3.3} \]
where $J_{\nu}$ is the Bessel function of the first kind. This follows directly from
\[ j_m(z) = \frac{(-i)^m}{2} \int_{-1}^{1} e^{izx} P_m(x)\,dx, \qquad \forall z \in \mathbb{C}, \tag{3.4} \]
(see [1, 10.1.14]), where $j_m$ is the spherical Bessel function of the first kind, given by
\[ j_m(z) = \sqrt{\frac{\pi}{2z}}\; J_{m+\frac{1}{2}}(z). \]

With this to hand, we may compute $C_{n,m}$ (and, in turn, $\Theta(n;\theta)$) via the expression given in Lemma 2.15. In Figure 1 we display the functions $\Theta(n;\tfrac{1}{2})$ and $\Theta(n;\tfrac{1}{4})$ against $n$. Immediately, we witness quadratic growth of $\Theta(n;\theta)$ with $n$, a result we verify in this section.

Figure 1: The functions $\Theta(n;\theta)$ (left) and $n^{-2}\Theta(n;\theta)$ (right) for $\theta = \tfrac{1}{2}$ (squares) and $\theta = \tfrac{1}{4}$ (circles).

In doing so, we shall also derive an upper bound for $\Theta(n;\theta)$ in terms of $n$ and $\theta$. This gives an explicit, analytic condition for quasi-optimal recovery. Whilst such a bound is completely robust (in that it holds for all $n$), we notice from Figure 1 that $\Theta(n;\theta)$, when scaled by $n^{-2}$, quickly converges to an asymptotic limit.
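The numerical computation of $C_{n,m}$ and $\Theta(n;\theta)$ just described can be sketched as follows for small $n$. This is an illustrative sketch under several assumptions, not the code used to produce Figure 1: the entries of $U$ are written in the equivalent spherical Bessel form $\langle\phi_k,\psi_j\rangle = (-i)^k\sqrt{2k+1}\,j_k(j\pi)$ obtained from (3.4); $j_k$ is generated by upward recurrence, which is only reliable for small $k$; and the smallest eigenvalue of $A$ is obtained as $1 - \|I - A\|$ (Lemma 2.15, orthonormal case) by a plain power iteration. All function names are hypothetical.

```python
import math


def spherical_bessel(n, z):
    """Return [j_0(z), ..., j_{n-1}(z)] by upward recurrence (small n only)."""
    if z == 0.0:
        return [1.0] + [0.0] * (n - 1)
    vals = [math.sin(z) / z]
    if n > 1:
        vals.append(math.sin(z) / z ** 2 - math.cos(z) / z)
    for k in range(1, n - 1):
        # j_{k+1}(z) = (2k + 1) / z * j_k(z) - j_{k-1}(z)
        vals.append((2 * k + 1) / z * vals[k] - vals[k - 1])
    return vals


def sampling_matrix(n, m):
    """Matrix U with rows j = -floor(m/2)+1, ..., floor(m/2)-1 and columns k = 0..n-1."""
    half = m // 2
    U = []
    for j in range(-half + 1, half):
        jk = spherical_bessel(n, j * math.pi)
        U.append([(-1j) ** k * math.sqrt(2 * k + 1) * jk[k] for k in range(n)])
    return U


def c_nm(n, m, iters=500):
    """Estimate C_{n,m} = 1 - ||I - A|| for A = U^dagger U via power iteration."""
    U = sampling_matrix(n, m)
    A = [[sum(row[k].conjugate() * row[l] for row in U) for l in range(n)]
         for k in range(n)]
    B = [[(1.0 if k == l else 0.0) - A[k][l] for l in range(n)] for k in range(n)]
    v = [1.0 / (k + 1) for k in range(n)]      # generic starting vector
    for _ in range(iters):
        w = [sum(B[k][l] * v[l] for l in range(n)) for k in range(n)]
        norm = math.sqrt(sum(abs(x) ** 2 for x in w))
        if norm == 0.0:                        # I - A vanishes, so C_{n,m} = 1
            return 1.0
        v = [x / norm for x in w]
    Bv = [sum(B[k][l] * v[l] for l in range(n)) for k in range(n)]
    lam = sum((Bv[k] * v[k].conjugate()).real for k in range(n))   # Rayleigh quotient
    return 1.0 - lam


def theta_min_m(n, theta, m_max=400):
    """Smallest m <= m_max with C_{n,m} >= theta, or None if there is none."""
    for m in range(2, m_max + 1):
        if c_nm(n, m) >= theta:
            return m
    return None
```

For larger $n$ the upward recurrence loses accuracy and a library routine for Bessel functions and eigenvalues would be the safer choice. The sketch is intended only to show how (3.3), Lemma 2.15 and the definition (2.22) fit together.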
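In practice it is wasteful to use a larger value of $m$ than necessary (or, conversely, for fixed $m$, an overly pessimistic value of $n$). Hence, in the second part of this section, we will also derive an asymptotic bound for $\Theta(n;\theta)$. We commence as follows:

Lemma 3.1. Suppose that $\mathrm{T}_n = \mathbb{P}_{n-1}$, $\mathrm{S}_m$ is the space spanned by the first $m$ Fourier basis functions and $m \geq \max\{2, \tfrac{2}{\pi}n\}$. Then $C_{n,m}$ satisfies
\[ C_{n,m} \geq 1 - \frac{4(\pi-2)n^2}{\pi^2\left(2\lfloor \frac{m}{2} \rfloor - 1\right)}. \]

Proof. From the definition of $C_{n,m}$ and the fact that $\{\psi_j\}$ is an orthonormal basis we have
\[ 1 - C_{n,m} = 1 - \inf_{\substack{\phi \in \mathrm{T}_n \\ \|\phi\|=1}} \langle \mathcal{P}_m\phi, \phi\rangle = \sup_{\substack{\phi \in \mathrm{T}_n \\ \|\phi\|=1}} \langle \phi - \mathcal{P}_m\phi, \phi\rangle = \sup_{\substack{\phi \in \mathrm{T}_n \\ \|\phi\|=1}} \|\phi - \mathcal{P}_m\phi\|^2, \]
where $\mathcal{P}_m$ is the Fourier projection operator. It now follows that $1 - C_{n,m} \leq \sum_{k=0}^{n-1} \|\phi_k - \mathcal{P}_m\phi_k\|^2$, where $\phi_k$ is given by (3.1). By Parseval's theorem and the expression (3.3), we find that
\[ \|\phi_k - \mathcal{P}_m\phi_k\|^2 = \sum_{|j| \geq \lfloor \frac{m}{2} \rfloor} \frac{k + \frac{1}{2}}{|j|}\, \left|J_{k+\frac{1}{2}}(j\pi)\right|^2. \]
Now, using a known result for Bessel functions [32], it can be shown that
\[ \frac{k + \frac{1}{2}}{j}\, \left|J_{k+\frac{1}{2}}(j\pi)\right|^2 \leq \frac{2k+1}{j\pi\sqrt{j^2\pi^2 - (k+\frac{1}{2})^2}}, \]
provided $j\pi > k + \tfrac{1}{2}$. Hence, for $m > \tfrac{2}{\pi}n$,
\[ \|\phi_k - \mathcal{P}_m\phi_k\|^2 \leq \frac{2(2k+1)}{\pi^2} \sum_{j \geq \lfloor \frac{m}{2} \rfloor} \frac{1}{j\sqrt{j^2 - \frac{(k+\frac{1}{2})^2}{\pi^2}}}. \tag{3.5} \]
Now, it was shown in [32] that
\[ \sum_{j \geq m} \frac{1}{j\sqrt{j^2 - c^2}} \leq \frac{1}{c}\arcsin\frac{c}{m - \frac{1}{2}}, \qquad \text{whenever } m \geq c + \tfrac{1}{2}. \]
This gives
\[ \|\phi_k - \mathcal{P}_m\phi_k\|^2 \leq \frac{4}{\pi}\arcsin\frac{2k+1}{(2\lfloor \frac{m}{2} \rfloor - 1)\pi}, \]
and
\[ 1 - C_{n,m} \leq \frac{4}{\pi}\sum_{k=0}^{n-1}\arcsin\frac{2k+1}{(2\lfloor \frac{m}{2} \rfloor - 1)\pi}. \]
We estimate this sum by the integral of $\arcsin t$. We have
\[ 1 - C_{n,m} \leq 2\left(2\left\lfloor \tfrac{m}{2} \right\rfloor - 1\right) \int_0^{\frac{2n}{(2\lfloor m/2 \rfloor - 1)\pi}} \arcsin t \, dt. \]
Now, $\arcsin x \leq (\arcsin 1)\,x$ for $0 \leq x \leq 1$. It follows that $F(x) = \int_0^x \arcsin t\,dt \leq F(1)x^2$. Computing this integral exactly, we arrive at $F(1) = \tfrac{\pi}{2} - 1$. Upon substituting $x = \frac{2n}{(2\lfloor m/2 \rfloor - 1)\pi}$, this completes the proof.

This lemma confirms that it is sufficient for $m$ to scale quadratically with $n$ for quasi-optimal recovery. Using this result, we find that

Theorem 3.2. Suppose that $\mathrm{T}_n$ and $\mathrm{S}_m$ are as in Lemma 3.1. Then, for $n \geq 2$, $\Theta(n;\theta)$ satisfies
\[ \Theta(n;\theta) \leq 2\left\lceil \frac{1}{2} + \frac{2(\pi-2)}{\pi^2(1-\theta)}\,n^2 \right\rceil, \qquad \forall n \in \mathbb{N}. \]

Proof.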
Suppose that $m \geq \max\{2, \tfrac{2}{\pi}n\}$. Then, by Lemma 3.1, $C_{n,m} \geq \theta$ if
\[ 1 - \frac{4(\pi-2)n^2}{\pi^2\left(2\lfloor \frac{m}{2} \rfloor - 1\right)} \geq \theta. \]
Rearranging, we find that this is the condition
\[ 2\left\lfloor \frac{m}{2} \right\rfloor \geq 1 + \frac{4(\pi-2)n^2}{\pi^2(1-\theta)}, \]
which holds whenever
\[ m \geq 2\left\lceil \frac{1}{2} + \frac{2(\pi-2)}{\pi^2(1-\theta)}\,n^2 \right\rceil, \]
and the theorem is proved, provided the right-hand side exceeds $\max\{2, \tfrac{2}{\pi}n\}$. Since $n \geq 2$, the right-hand side is certainly greater than 2. Moreover,
\[ 1 + \frac{4(\pi-2)n^2}{\pi^2(1-\theta)} \geq \frac{8(\pi-2)n}{\pi^2} > \frac{2n}{\pi}, \]
as required.

Using a similar approach, we are also able to obtain an asymptotic bound for $\Theta(n;\theta)$ that is sharper than if we were to use Theorem 3.2 directly:

Theorem 3.3. Suppose that $\mathrm{T}_n$ and $\mathrm{S}_m$ are as in Lemma 3.1. Then the function $\Theta(n;\theta)$ satisfies
\[ n^{-2}\,\Theta(n;\theta) \leq \frac{4}{\pi^2(1-\theta)} + \mathcal{O}(n^{-2}), \qquad n \to \infty. \]

Proof. Suppose that $m = cn^2$ and recall (3.5). Since $k < n$ and $j \geq \lfloor \tfrac{1}{2}cn^2 \rfloor$, we deduce that
\[ \|\phi_k - \mathcal{P}_m\phi_k\|^2 \leq \frac{2(2k+1)}{\pi^2} \sum_{j \geq \lfloor \frac{m}{2} \rfloor} \frac{1}{j^2} + \mathcal{O}(n^{-4}) = \frac{4(2k+1)}{c\pi^2 n^2} + \mathcal{O}(n^{-4}). \]
Hence
\[ 1 - C_{n,cn^2} \leq \frac{4}{c\pi^2 n^2} \sum_{k=0}^{n-1} (2k+1) + \mathcal{O}(n^{-2}) = \frac{4}{c\pi^2} + \mathcal{O}(n^{-2}). \]
Rearranging now gives the result.

In Figure 2 we compare the function $n^{-2}\Theta(n;\theta)$ for $\theta = \tfrac{1}{2}, \tfrac{1}{4}$ and the global and asymptotic bounds of Theorems 3.2 and 3.3. Note that both bounds are reasonably sharp in comparison to the computed values. In particular, as $n \to \infty$, $n^{-2}\Theta(n;\tfrac{1}{2})$ quickly approaches the limiting value $c \approx 0.38$, whereas the global and asymptotic upper bounds are 0.93 and 0.81 respectively.

At this moment, we reiterate an important point. Whilst Legendre polynomials were used in the proof of Lemma 3.1, the constant $C_{n,m}$ is independent of the particular reconstruction basis, and is only determined by the space $\mathrm{T}_n$. Hence, Theorems 3.2 and 3.3 provide a priori estimates regardless of the particular implementation of the reconstruction procedure. In the next section, we discuss the choice of polynomial basis and its effect on the numerical method.

Figure 2: The function $n^{-2}\Theta(n;\theta)$ (squares), the global bound (circles) and the asymptotic bound (crosses), for $n = 2, \ldots$
, $80$ and $\theta = \tfrac{1}{2}$ (left), $\theta = \tfrac{1}{4}$ (right).

Remark 5 In some applications, medical imaging, for example, oversampling is common. Formally speaking, this is the situation where we wish to recover a function $f$ with support in $[-1,1]$ from its Fourier samples taken over an extended interval $K \supseteq [-1,1]$ (e.g. $K = [-\tfrac{1}{\epsilon}, \tfrac{1}{\epsilon}]$ for some $0 < \epsilon \leq 1$). In this case, proceeding in a similar manner to before, we let $\mathrm{H} = L^2(K)$,
\[ \psi_j(x) = \sqrt{\frac{c}{2}}\; e^{icj\pi x}, \qquad x \in K, \]
where $c = 2/|K|$ (so that $c = \epsilon$ in the example above), and $\mathrm{T}_n = \{\phi : \phi|_{[-1,1]} \in \mathbb{P}_{n-1},\ \mathrm{supp}(\phi) \subseteq [-1,1]\}$. Using similar arguments to those of Lemma 3.1, one can also derive estimates for $C_{n,m}$ and $\Theta(n;\theta)$ in this case. In fact,
\[ C_{n,m} \geq 1 - \frac{4(\pi-2)n^2}{c\pi^2(m-1)}, \tag{3.6} \]
and
\[ \Theta(n;\theta) \leq 2\left\lceil \frac{1}{2} + \frac{2(\pi-2)}{c\pi^2(1-\theta)}\,n^2 \right\rceil, \quad \forall n \in \mathbb{N}, \qquad n^{-2}\,\Theta(n;\theta) \leq \frac{4}{c\pi^2(1-\theta)} + \mathcal{O}(n^{-2}), \quad n \to \infty. \]
In particular, we retain the scaling $m = \mathcal{O}(n^2)$, regardless of the size of the interval $K$.

3.2 Choice of polynomial basis

The results proved in this section are independent of the polynomial basis used for implementation. In selecting such a basis, there are two questions which must be resolved. First, how stable is the resultant method, and second, how can the entries of the matrix $U$ (as defined in Section 2.3) be computed?

A straightforward choice is the orthonormal basis of scaled Legendre polynomials (3.1). In this case, $\tilde{A} = I$, where $\tilde{A}$ is the Gram matrix for $\{\phi_0, \ldots, \phi_{n-1}\}$, making the method well-conditioned (Lemma 2.13). Moreover, the entries of $U$ are known explicitly via (3.3).

Having said this, there is also interest in reconstructing in other polynomial bases. In many circumstances it may be advantageous to have an approximation $f_{n,m}$ that is easily manipulable. In this sense, an approximant composed of Legendre polynomials is not as convenient as one consisting of Chebyshev polynomials (of the first or second kind); the latter being easy to manipulate via the Fast Fourier Transform.
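To this end, the purpose of this section is to detail the implementation of this method in terms of general Gegenbauer polynomials. Gegenbauer polynomials arise as orthogonal polynomials with respect to the inner product
\[ \langle f, g \rangle_{\lambda} = \int_{-1}^{1} f(x)\,\overline{g(x)}\,(1-x^2)^{\lambda - \frac{1}{2}}\,dx, \qquad \lambda > -\tfrac{1}{2}. \]
For given $\lambda$, we denote the $j$th such polynomial by $C_j^{\lambda} \in \mathbb{P}_j$. Important special cases are the Legendre polynomials ($\lambda = \tfrac{1}{2}$), and Chebyshev polynomials of the first ($\lambda = 0$) and second ($\lambda = 1$) kind. By convention [7] (see also [31]), each polynomial $C_j^{\lambda}$ is normalised so that
\[ C_j^{\lambda}(1) = \frac{\Gamma(j+2\lambda)}{j!\,\Gamma(2\lambda)}, \tag{3.7} \]
where $\Gamma$ is the Gamma function, in which case it is known that (see [7, p.174])
\[ \|C_j^{\lambda}\|_{\lambda}^2 = \frac{\sqrt{\pi}\,\Gamma(j+2\lambda)\,\Gamma(\lambda+\frac{1}{2})}{j!\,\Gamma(2\lambda)\,\Gamma(\lambda)\,(j+\lambda)}, \tag{3.8} \]
where $\|f\|_{\lambda} = \sqrt{\langle f, f\rangle_{\lambda}}$. With this to hand, we now define
\[ \phi_j = \frac{1}{\|C_j^{\lambda}\|_{\lambda}}\,C_j^{\lambda}, \qquad j = 0, 1, 2, \ldots, \tag{3.9} \]
and seek to reconstruct $f$ in this basis. Our first task is to compute the entries of the matrix $U$. For this, we need to compute integrals of the form
\[ I_k(z) = \int_{-1}^{1} C_k^{\lambda}(x)\,e^{izx}\,dx, \qquad k = 0, 1, 2, \ldots, \]
where $z \in \mathbb{R}$. Fortunately, such integrals obey a simple recurrence relation:

Lemma 3.4. For $z \neq 0$, the integrals $I_k(z)$ satisfy
\[ I_0(z) = 2C_0^{\lambda}(1)\,\frac{\sin z}{z}, \qquad I_1(z) = 2i\,C_1^{\lambda}(1)\,\frac{\sin z - z\cos z}{z^2}, \]
\[ I_{k+1}(z) = \frac{2i(k+\lambda)}{z}\,I_k(z) + I_{k-1}(z) - i\,\frac{e^{iz} + (-1)^k e^{-iz}}{z}\left(C_{k+1}^{\lambda}(1) - C_{k-1}^{\lambda}(1)\right), \qquad k = 1, 2, \ldots. \]
When $z = 0$, we have
\[ I_0(0) = 2C_0^{\lambda}(1), \qquad I_k(0) = \frac{1 + (-1)^k}{2(k+\lambda)}\left(C_{k+1}^{\lambda}(1) - C_{k-1}^{\lambda}(1)\right), \qquad k = 1, 2, \ldots. \]

Proof. Recall the identity (see [7, p.176])
\[ C_j^{\lambda}(x) = \frac{1}{2(j+\lambda)}\,\frac{d}{dx}\left[C_{j+1}^{\lambda}(x) - C_{j-1}^{\lambda}(x)\right], \qquad j = 1, 2, \ldots. \]
Substituting this into the expression for $I_k(z)$ and integrating by parts gives
\[ I_k(z) = \frac{1}{2(k+\lambda)}\left[\left(C_{k+1}^{\lambda}(x) - C_{k-1}^{\lambda}(x)\right)e^{izx}\right]_{x=-1}^{1} - \frac{iz}{2(k+\lambda)}\left[I_{k+1}(z) - I_{k-1}(z)\right]. \]
Rearranging, and using $C_j^{\lambda}(-1) = (-1)^j C_j^{\lambda}(1)$, now gives the general recurrence for $k \geq 1$. For $k = 0, 1$, we merely note that $C_0^{\lambda}(x) = C_0^{\lambda}(1)$, $C_1^{\lambda}(x) = C_1^{\lambda}(1)\,x$ and that
\[ \int_{-1}^{1} e^{izx}\,dx = 2\,\frac{\sin z}{z}, \qquad \int_{-1}^{1} x\,e^{izx}\,dx = 2i\,\frac{\sin z - z\cos z}{z^2}. \]
The result for $z = 0$ is derived in a similar manner.

Using this recurrence formula, the matrix $U$ can be formed in $\mathcal{O}(mn)$ operations. With this to hand, we now turn our attention to the condition number of $\tilde{A}$:

Theorem 3.5. Let $\tilde{A}$ be the Gram matrix for the vectors $\{\phi_0, \ldots, \phi_{n-1}\}$, where $\phi_j$ is given by (3.9). Then, $\kappa(\tilde{A}) = \mathcal{O}(n^{|2\lambda - 1|})$ as $n \to \infty$. In particular, whenever $\phi_0, \ldots, \phi_{n-1}$ arise from Chebyshev polynomials (of the first or second kinds), then $\kappa(\tilde{A}) = \mathcal{O}(n)$.

To prove this theorem, we first require the following two lemmas. For convenience, we will write $L^2_{\lambda}(-1,1)$, $\lambda > -\tfrac{1}{2}$, for the space of square-integrable functions with respect to the Gegenbauer weight function $(1-x^2)^{\lambda - \frac{1}{2}}$.

Lemma 3.6. Suppose that $-\tfrac{1}{2} < \lambda < \tfrac{1}{2}$. Then, for all $g \in L^{\infty}(-1,1)$, we have $\|g\| \leq \|g\|_{\lambda}$ and, for some $c_{\lambda} > 0$ independent of $g$,
\[ \|g\|_{\lambda} \leq c_{\lambda}\,\|g\|^{\lambda + \frac{1}{2}}\,\|g\|_{\infty}^{\frac{1}{2} - \lambda}. \tag{3.10} \]
Conversely, if $\lambda \geq \tfrac{1}{2}$ then, for all $g \in L^{\infty}(-1,1)$, $\|g\|_{\lambda} \leq \|g\|$ and
\[ \|g\| \leq c_{\lambda}\,\|g\|_{\lambda}^{\frac{1}{\lambda + \frac{1}{2}}}\,\|g\|_{\infty}^{\frac{\lambda - \frac{1}{2}}{\lambda + \frac{1}{2}}}. \tag{3.11} \]

Proof. Suppose first that $-\tfrac{1}{2} < \lambda < \tfrac{1}{2}$. Trivially, $\|g\| \leq \|g\|_{\lambda}$. Now consider the other inequality. For any $0 < \epsilon < 1$, we have
\[ \|g\|_{\lambda}^2 = \int_{-1}^{1} |g(x)|^2 (1-x^2)^{\lambda - \frac{1}{2}}\,dx = \int_{|x| \leq 1-\epsilon} |g(x)|^2 (1-x^2)^{\lambda - \frac{1}{2}}\,dx + \int_{1-\epsilon < |x| \leq 1} |g(x)|^2 (1-x^2)^{\lambda - \frac{1}{2}}\,dx \]
\[ \leq \left(1 - (1-\epsilon)^2\right)^{\lambda - \frac{1}{2}}\|g\|^2 + 2\|g\|_{\infty}^2 \int_{1-\epsilon}^{1} (1-x^2)^{\lambda - \frac{1}{2}}\,dx, \]
where $\|\cdot\|_{\infty}$ is the uniform norm on $[-1,1]$. Note that $(1 - (1-y)^2)^{\lambda - \frac{1}{2}} < y^{\lambda - \frac{1}{2}}$, $\forall y \in (0,1)$. It now follows that
\[ \|g\|_{\lambda}^2 \leq \epsilon^{\lambda - \frac{1}{2}}\|g\|^2 + \frac{2\,\epsilon^{\lambda + \frac{1}{2}}}{\lambda + \frac{1}{2}}\,\|g\|_{\infty}^2, \qquad 0 < \epsilon < 1. \]
Let $c > 2$ be arbitrary. Then $\|g\|^2 < c\|g\|_{\infty}^2$, so we may let $\epsilon = \frac{\|g\|^2}{c\|g\|_{\infty}^2}$. Substituting this into the previous expression immediately gives (3.10).

Now suppose that $\lambda > \tfrac{1}{2}$. Once more, trivial arguments give that $\|g\|_{\lambda} \leq \|g\|$. For the other inequality, we proceed in a similar manner. We have $\|g\|^2 \leq \epsilon^{\frac{1}{2} - \lambda}\|g\|_{\lambda}^2 + 2\epsilon\|g\|_{\infty}^2$. For $c > 2$ we now set $\epsilon = \left(\frac{\|g\|_{\lambda}}{c\|g\|_{\infty}}\right)^{\frac{2}{\lambda + \frac{1}{2}}}$, which gives (3.11).

Lemma 3.7. Let $\lambda \geq \tfrac{1}{2}$.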
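Then, for all $\phi \in \mathbb{P}_{n-1}$, $\|\phi\|_{\infty} \leq k_{n,\lambda}\|\phi\|_{\lambda}$, where $k_{n,\lambda}$ depends only on $n$ and $\lambda$ and satisfies $k_{n,\lambda} = \mathcal{O}(n^{\lambda + \frac{1}{2}})$ as $n \to \infty$.

Proof. Let $\{\phi_0, \ldots, \phi_{n-1}\}$ be given by (3.9), and write an arbitrary $\phi \in \mathbb{P}_{n-1}$ as $\phi = \sum_{j=0}^{n-1} a_j\phi_j$, where $\sum_{j=0}^{n-1} |a_j|^2 = \|\phi\|_{\lambda}^2$. By the Cauchy–Schwarz inequality,
\[ \|\phi\|_{\infty}^2 \leq \|\phi\|_{\lambda}^2 \sum_{j=0}^{n-1} \|\phi_j\|_{\infty}^2 = k_{n,\lambda}^2\,\|\phi\|_{\lambda}^2. \]
We now wish to estimate $\|\phi_j\|_{\infty}$. Recall that $\|C_j^{\lambda}\|_{\infty} = C_j^{\lambda}(1)$ [7, p.206]. Hence, by (3.7) and (3.8), we have
\[ \|\phi_j\|_{\infty}^2 = \frac{\Gamma(j+2\lambda)\,\Gamma(\lambda)\,(j+\lambda)}{j!\,\sqrt{\pi}\,\Gamma(2\lambda)\,\Gamma(\lambda + \frac{1}{2})}. \]
Consider the ratio $\frac{\Gamma(j+2\lambda)}{j!}$. By Stirling's formula,
\[ \frac{\Gamma(j+2\lambda)}{j!} = \mathcal{O}(j^{2\lambda - 1}), \qquad j \to \infty. \]
Hence $\|\phi_j\|_{\infty}^2 = \mathcal{O}(j^{2\lambda})$, which gives $k_{n,\lambda}^2 = \mathcal{O}(n^{2\lambda+1})$, as required.

Figure 3: Error in approximating $f(x) = e^{-x}\cos 4x$ by $f_{n,m}(x)$ for $n = 1, \ldots, 40$. Left: log error $\log_{10}\|f - f_{n,m}\|_{\infty}$ (squares) and $\log_{10}\|f - f_{n,m}\|$ (circles) against $n$. Right: log error against $m = 0.2n^2$.

Proof of Theorem 3.5. Since $\tilde{A}$ is Hermitian and positive definite, its condition number is the ratio of its maximum and minimum eigenvalues. By a simple argument, we find that
\[ \lambda_{\max}(\tilde{A}) = \sup_{\substack{\phi \in \mathbb{P}_{n-1} \\ \phi \neq 0}} \frac{\|\phi\|^2}{\|\phi\|_{\lambda}^2}, \qquad \lambda_{\min}(\tilde{A}) = \inf_{\substack{\phi \in \mathbb{P}_{n-1} \\ \phi \neq 0}} \frac{\|\phi\|^2}{\|\phi\|_{\lambda}^2}. \]
Consider the case $\lambda > \tfrac{1}{2}$. By Lemma 3.6, we have $\lambda_{\min}(\tilde{A}) \geq 1$ and
\[ \lambda_{\max}(\tilde{A}) \leq c_{\lambda}^2 \sup_{\substack{\phi \in \mathbb{P}_{n-1} \\ \phi \neq 0}} \left(\frac{\|\phi\|_{\infty}}{\|\phi\|_{\lambda}}\right)^{\frac{2(\lambda - \frac{1}{2})}{\lambda + \frac{1}{2}}}. \]
Using Lemma 3.7, we deduce that $\lambda_{\max}(\tilde{A}) = \mathcal{O}(n^{2\lambda - 1})$, as required. For the case $-\tfrac{1}{2} < \lambda < \tfrac{1}{2}$, we proceed in a similar manner.

This theorem confirms that the method can be implemented using Chebyshev polynomials whilst incurring only a mild growth in the condition number. It follows that, if conjugate gradients are used to compute the approximation, the total computational cost of forming $f_{n,m}$ is $\mathcal{O}(mn^{\frac{3}{2}})$, as opposed to $\mathcal{O}(mn)$ in the Legendre polynomial case. In the next section we present several examples of this implementation.

Remark 6 Whilst Theorem 3.5 provides an asymptotic estimate for $\kappa(\tilde{A})$ (and hence $\kappa(A)$), it may also be useful to derive global bounds.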
With effort, one could obtain versions of Lemmas 3.6 and 3.7 involving explicit bounds. For the sake of brevity, we shall not do this. However, whenever Chebyshev polynomials are used (arguably the most important case), it is a relatively simple exercise to confirm that
\[ \kappa(\tilde{A}) \leq 2\sqrt{2}\,n, \qquad \kappa(\tilde{A}) \leq 3^{\frac{1}{3}}\,\pi^{-\frac{2}{3}}\left[n\left(n+\tfrac{1}{2}\right)(n+1)\right]^{\frac{1}{3}}, \]
in the first and second kind cases respectively.

3.3 Numerical examples

We now present several numerical examples of this method. All examples employ the value $m = 0.2n^2$, and the first series of examples consider the implementation using Legendre polynomials. In Figure 3 we consider the function $f(x) = e^{-x}\cos 4x$. Since $f$ is analytic in this case, we witness exponential convergence in terms of $n$ and root exponential convergence in terms of $m$. Note the effectiveness of the method: using fewer than 100 Fourier coefficients, we obtain an approximation with 13 digits of accuracy.

As indicated by Theorem 2.4, the approximation $f_{n,m}$ is quasi-optimal. To highlight this feature of the method, Figure 4 displays both the error in approximating $f$ by $f_{n,m}$ and by the best approximation $\mathcal{Q}_n f$. Note the very close correspondence of the two graphs.

Figure 4: Error in approximating $f(x) = e^{-x}\cos 4x$ by $f_{n,m}(x)$ (squares) and $\mathcal{Q}_n f(x)$ (circles) for $n = 1, \ldots, 40$. Left: log uniform error. Right: log $L^2$ error.

Figure 5: Error in approximating $f(x) = \frac{1}{1+x^2}$ by $f_{n,m}(x)$ for $n = 1, \ldots, 80$. Left: log error $\log_{10}\|f - f_{n,m}\|_{\infty}$ (squares) and $\log_{10}\|f - f_{n,m}\|$ (circles) against $n$. Right: log error against $m = 0.2n^2$.

The function in Figures 3 and 4 is, in fact, entire. Hence, the approximation $f_{n,m}$ converges super-geometrically in $n$ (as seen in Figure 3). For a meromorphic function, with complex singularity lying outside $[-1,1]$, the convergence rate is truly exponential at a rate $\rho$.
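To make the rate $\rho$ concrete, the following back-of-the-envelope calculation takes as an illustration the function $f(x) = \frac{1}{1+x^2}$, whose poles lie at $x = \pm i$. It is a rough consistency check rather than part of the analysis: it assumes the error behaves like $\rho^{-n}$ and ignores the factors $c(\theta)$, $c_f$ and $\sqrt{n}$ in the error bound, so the resulting numbers are only indicative. The Bernstein ellipse with parameter $\rho$ has foci $\pm 1$ and semi-minor axis $\tfrac{1}{2}(\rho - \rho^{-1})$, and the largest ellipse of analyticity passes through the poles:

\[ \tfrac{1}{2}\left(\rho - \rho^{-1}\right) = 1 \;\Longrightarrow\; \rho = 1 + \sqrt{2} \approx 2.41, \qquad \rho^{-n} \leq 10^{-13} \iff n \geq \frac{13}{\log_{10}\rho} \approx 34, \qquad m = 0.2\,n^2 \approx 231. \]

Under these simplifications, roughly 230 Fourier coefficients would be expected to give 13 digits of accuracy for this function.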
This is demonstrated in Figure 5, the approximated function being $f(x) = \frac{1}{1+x^2}$. Note that, despite the poles at $x = \pm i$, the approximation $f_{n,m}$ still obtains 13 digits of accuracy using only 250 Fourier coefficients.

Next we consider reconstructions in other polynomial bases using the work of the previous section. In Table 1 we give the error in approximating the function $f(x) = e^{-x}\cos 4x$ with Chebyshev polynomials of the first and second kinds. Note that the resulting uniform error is virtually identical to the case of the Legendre polynomial implementation (unsurprisingly, since all three implementations compute exactly the same approximation $f_{n,m}$, up to numerical error). Moreover, as evidenced by Table 2, the price paid is only mild growth in the condition number $\kappa(A)$.

n     (a)         (b)         (c)
5     1.45e0      1.45e0      1.45e0
10    1.85e-3     1.85e-3     1.85e-3
15    3.03e-7     3.03e-7     3.03e-7
20    2.53e-12    2.53e-12    2.49e-12
25    1.06e-14    3.51e-14    6.76e-14
30    8.42e-14    1.16e-13    7.33e-14
35    4.06e-14    4.57e-14    6.40e-14
40    5.31e-14    7.70e-14    5.15e-14

Table 1: Comparison of the error $\|f - f_{n,m}\|_{\infty}$ with $m = 0.2n^2$, where $f_{n,m}$ is formed from (a) Legendre polynomials and Chebyshev polynomials of the (b) first and (c) second kinds.

n     (a)      (b)       (c)
5     3.57     13.74     3.90
10    5.55     49.99     5.67
15    4.21     52.63     7.25
20    5.20     91.89     9.33
25    4.40     92.89     11.91
30    5.06     133.02    13.96
35    4.50     133.49    16.56
40    6.77     191.19    18.92

Table 2: Comparison of the condition number $\kappa(A)$ with $m = 0.2n^2$, where $A$ is formed from (a) Legendre polynomials and Chebyshev polynomials of the (b) first and (c) second kinds.

3.4 Connections to earlier work

Rather than choosing $m$ such that $C_{n,m} \geq \theta$, it may appear advantageous to find the minimum $m$ such that $C_{n,m} > 0$. In other words, the smallest $m$ such that $f_{n,m}$ is guaranteed to exist. Letting $\theta = 0$ in Theorems 3.2 and 3.3, we immediately obtain a sufficient condition of the form $m \geq cn^2$, for some $c > 0$. However, this result is far too pessimistic: it is known that reconstruction is always possible, provided $m \geq n$ [32]. For this reason, it may appear favourable to reconstruct using $m = n$, leading to the so-called inverse polynomial reconstruction method [37, 38].

Unfortunately, however, this approach is extremely unstable. The linear system has geometrically large condition number, making the procedure extremely sensitive to both noise and round-off error. Moreover, a continuous analogue of the Runge phenomenon occurs. Roughly speaking, the approximation $f_{n,m}$ only converges to $f$ if the geometric decay of $\|f - \mathcal{Q}_n f\|$ is faster than the geometric growth of $\|A^{-1}\|$, meaning that only functions analytic in sufficiently large complex regions can be approximated by this procedure (as discussed in detail in [3], this behaviour can be understood in terms of the operator-theoretic properties of finite sections of certain non-Hermitian infinite matrices). On the other hand, by allowing $m$ to range independently of $n$, we overcome all these difficulties, and obtain a stable method whose convergence is completely determined by the convergence of $\mathcal{Q}_n f$ to $f$.

For the specific example of Legendre polynomial reconstructions from Fourier samples, this approach has also been recently considered in [32]. Therein, the estimate $m = \mathcal{O}(n^2)$ was derived, along with bounds for the error. Naturally, this problem is just one specific example of our general framework. However, within this context, our work improves and extends the results of [32] in the following ways:

1. Reconstruction is completely independent of the particular polynomial basis used. In particular, the estimates for $\Theta(n;\theta)$ and $\|f - f_{n,m}\|$ are determined only by the spaces $\mathrm{T}_n$ and $\mathrm{S}_m$.
This allows for analysis of reconstructions in arbitrary polynomial bases, not just the Legendre polynomials used in [32].

2. The estimates for $\Theta(n;\theta)$ in Theorems 3.2 and 3.3 improve those given in [32]. In particular, it was shown in [32, Theorem 4.2] that
\[ C_{n,\alpha n^2} \geq 1 - \frac{8}{\pi}\arcsin\frac{1}{\pi\alpha}, \qquad \forall n \in \mathbb{N},\ \alpha \geq 1, \tag{3.12} \]
(our constant $C_{n,m}$ corresponds to the quantity $\sigma_{n,m}^2$ in [32]). Conversely, Lemma 3.1 leads to the improved bounds
\[ C_{n,\alpha n^2} \geq 1 - \frac{4(\pi-2)}{\pi^2(\alpha - n^{-2})}, \qquad \forall n \geq \max\left\{\frac{2}{\pi\alpha}, \sqrt{\frac{2}{\alpha}}\right\},\ \alpha > 0, \tag{3.13} \]
and
\[ C_{n,\alpha n^2} \geq 1 - \frac{4}{\pi^2\alpha} + \mathcal{O}(n^{-2}), \qquad n \to \infty,\ \alpha > 0. \tag{3.14} \]
Not only are these bounds sharper, they also hold for a greater range of $\alpha$, thus permitting reconstruction with $m = \alpha n^2$ for any $\alpha > 0$, as opposed to just $\alpha \geq 1$. This leads to savings in computational cost, and, in cases where $m$ is fixed, allows larger values of $n$ to be used, thereby increasing accuracy. To illustrate this improvement, note that (3.12) gives the estimate $\kappa(A) \leq 5.71$ when $m = n^2$. Conversely, our estimate (3.13) yields the bound 2.61 for $n \geq 2$, and (3.14) gives the asymptotic bound 1.68. To compare, direct computation of $\kappa(A)$ indicates that $\kappa(A) \leq 1.32$ for all $n$, and $\kappa(A) \to 1.2$ as $m \to \infty$.

3. Piecewise analytic functions and functions of arbitrary numbers of variables can be recovered in an analogous fashion, with similar analysis (see Sections 3.5 and 4 respectively).

Figure 6: Error in approximating the function (3.15) by $f_{n,m}(x)$ for $n = 1, \ldots, 80$. Left: the function $f(x)$. Right: log error $\log_{10}\|f - f_{n,m}\|_{\infty}$ (squares) and $\log_{10}\|f - f_{n,m}\|$ (circles) against $m = 0.2n^2$.

3.5 Reconstruction of piecewise analytic functions

Naturally, whenever the approximated function is not analytic, the convergence rate of the polynomial approximant $f_{n,m}$ to $f$ is not exponential.
For example, consider the function
\[ f(x) = \begin{cases} \left(2e^{2\pi(x+1)} - 1 - e^{\pi}\right)\left(e^{\pi} - 1\right)^{-1}, & x \in [-1, -\tfrac{1}{2}), \\[4pt] -\sin\left(\frac{2\pi x}{3} + \frac{\pi}{3}\right), & x \in [-\tfrac{1}{2}, 1]. \end{cases} \tag{3.15} \]
This function was put forth in [46] to test algorithms for overcoming the Gibbs phenomenon. Aside from the discontinuity, its sharp peak makes it a challenging function to reconstruct accurately. Since this function is discontinuous, we expect only low-order, algebraic convergence of $f_{n,m}$ in the $L^2$ norm, but no uniform convergence, an observation confirmed in Figure 6.

However, by reconstructing this function in a polynomial basis, we are not exploiting the known information about $f$: namely, the jump discontinuity at $x = -\tfrac{1}{2}$. The general procedure set out in Section 2 allows us to use such information in designing a reconstruction basis. Naturally, since $f$ is analytic in the subintervals $[-1, -\tfrac{1}{2}]$ and $[-\tfrac{1}{2}, 1]$, a better choice is to reconstruct $f$ in a piecewise polynomial basis. The aim of this section is to describe this procedure.

Seeking generality, suppose that $f : [-1,1] \to \mathbb{R}$ is piecewise analytic with jump discontinuities at $-1 < x_1 < \ldots < x_l < 1$. Let $x_0 = -1$ and $x_{l+1} = 1$. We assume that $f$ has been sampled via $\hat{f}_j = \langle f, \psi_j\rangle$, $j = 1, \ldots, m$, where $\langle\cdot,\cdot\rangle$ is the Euclidean inner product on $L^2(-1,1)$. In examples, these will be the Fourier samples of $f$, but the construction described below holds for arbitrary sampling bases consisting of functions defined on $[-1,1]$.

Throughout we shall assume that the discontinuity locations $x_1, \ldots, x_l$ are known exactly. Although this may be a reasonable assumption in many applications, a fully-automated algorithm must also incorporate a scheme for singularity detection. We shall not discuss possible approaches for doing this, and we refer the reader to [45] for further details.

Given the additional information about the location of the singularities of $f$, we now design a reconstruction basis to mirror this feature.
We shall construct such a basis via local co-ordinate mappings. To this end, let $I_r = [x_r, x_{r+1}]$, $c_r = \tfrac{1}{2}(x_{r+1} - x_r)$ and define $\Lambda_r(x) = \frac{x - x_r}{c_r} - 1$, so that $\Lambda_r(I_r) = [-1,1]$. Suppose now that $\mathrm{T}'_n$ is a space of functions defined on $[-1,1]$ (e.g. the polynomial space $\mathbb{P}_{n-1}$). By convention, we assume that each $\phi \in \mathrm{T}'_n$ is extended by zero to the whole real line, i.e. $\phi(x) = 0$ for $x \in \mathbb{R} \setminus [-1,1]$. Let $\mathrm{T}_{n,r}$ be the space of functions defined on $I_r$, given by $\mathrm{T}_{n,r} = \{\phi \circ \Lambda_r : \phi \in \mathrm{T}'_n\}$. We now define the new reconstruction space in the obvious manner:
\[ \mathrm{T}_n = \left\{\phi : \phi|_{I_r} \in \mathrm{T}_{n_r,r},\ r = 0, \ldots, l\right\}, \qquad n = \sum_{r=0}^{l} n_r, \]
and seek an approximation $f_{n,m} \in \mathrm{T}_n$ to $f$ via the conditions
\[ a_m(f_{n,m}, \phi) = a_m(f, \phi), \qquad \forall \phi \in \mathrm{T}_n, \]
where $a_m$ is defined in (2.5).

Suppose now that $\{\phi_1, \ldots, \phi_n\}$ is a collection of linearly independent reconstruction functions with $\mathrm{T}'_n = \mathrm{span}\{\phi_1, \ldots, \phi_n\}$. We construct a basis for $\mathrm{T}_n$ by scaling. To this end, we let $\phi_{r,j} = \frac{1}{\sqrt{c_r}}\,\phi_j \circ \Lambda_r$, and notice that $\mathrm{T}_n = \mathrm{span}\{\phi_{r,j} : j = 1, \ldots, n_r,\ r = 0, \ldots, l\}$. Note that, if $\{\phi_j\}$ are orthonormal, then so are $\{\phi_{r,j}\}$. With this basis in hand, the approximation $f_{n,m}$ is now given by
\[ f_{n,m} = \sum_{r=0}^{l}\sum_{j=1}^{n_r} \alpha_{r,j}\,\phi_{r,j}, \]
where the coefficients $\alpha_{r,j}$ are determined by the aforementioned equations. As before, this is equivalent to the least squares problem $U\alpha \approx \hat{f}$ with block matrix $U = [U_0, \ldots, U_l]$, where $U_r$ is the $m \times n_r$ matrix with $(j,k)$th entry
\[ \langle \phi_{r,k}, \psi_j\rangle = \frac{1}{\sqrt{c_r}} \int_{x_r}^{x_{r+1}} \phi_k(\Lambda_r(x))\,\overline{\psi_j(x)}\,dx. \]
Here $\hat{f} = (\hat{f}_1, \ldots, \hat{f}_m)^{\top}$, $\alpha = [\alpha_0, \ldots, \alpha_l]$ and $\alpha_r = (\alpha_{r,1}, \ldots, \alpha_{r,n_r})^{\top}$.

Naturally, estimation of the constant $C_{n,m}$ is vital. The following lemma aids in this task:

Lemma 3.8. The constant $C_{n,m}$ satisfies
\[ C_{n,m} \geq 1 - \sum_{r=0}^{l}\left(1 - C_{n_r,m,r}\right), \]
where $C_{n_r,m,r} = \inf_{\phi \in \mathrm{T}_{n_r,r},\ \|\phi\|=1} \langle \mathcal{P}_m\phi, \phi\rangle$.

Proof. For $\phi \in \mathrm{T}_n$, denote $\phi|_{I_r}$ by $\phi^{[r]}$. Assume that $\phi^{[r]}$ is extended to $[-1,1]$ by zero, so that $\phi = \sum_{r=0}^{l} \phi^{[r]}$.
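Since $\phi^{[r]} \perp \phi^{[s]}$ for $r \neq s$, it follows that
\[
1 - C_{n,m} = \sup \left\{ \frac{\sum_{r=0}^{l} \langle \phi^{[r]} - P_m \phi^{[r]}, \phi^{[r]} \rangle}{\sum_{r=0}^{l} \|\phi^{[r]}\|^2} : \phi^{[r]} \in \mathrm{T}_{n_r,r},\ r = 0, \ldots, l,\ \sum_{r=0}^{l} \|\phi^{[r]}\|^2 \neq 0 \right\}.
\]
Note that, for $a_r \geq 0$ and $b_r > 0$, $r = 0, \ldots, l$, the inequality
\[
\frac{\sum_{r=0}^{l} a_r}{\sum_{r=0}^{l} b_r} \leq \sum_{r=0}^{l} \frac{a_r}{b_r}
\]
holds. Setting $a_r = \langle \phi^{[r]} - P_m \phi^{[r]}, \phi^{[r]} \rangle$ and $b_r = \|\phi^{[r]}\|^2$ and using this inequality gives
\[
1 - C_{n,m} \leq \sup \left\{ \sum_{r=0}^{l} \frac{\langle \phi^{[r]} - P_m \phi^{[r]}, \phi^{[r]} \rangle}{\|\phi^{[r]}\|^2} : \phi^{[r]} \in \mathrm{T}_{n_r,r},\ r = 0, \ldots, l,\ \sum_{r=0}^{l} \|\phi^{[r]}\|^2 \neq 0 \right\}
\leq \sum_{r=0}^{l} \sup \left\{ \frac{\langle \phi - P_m \phi, \phi \rangle}{\|\phi\|^2} : \phi \in \mathrm{T}_{n_r,r},\ \|\phi\| \neq 0 \right\},
\]
and this is precisely $\sum_{r=0}^{l} (1 - C_{n_r,m,r})$.

Let us now focus on piecewise polynomial reconstructions from Fourier samples, in which case $\mathrm{T}_n$ is the space of functions $\phi$ with $\phi|_{I_r}$ a polynomial of degree at most $n_r - 1$. Regarding the rate of convergence of the resulting approximation $f_{n,m}$, it is a simple exercise to confirm that
\[
\| f - f_{n,m} \| \leq c(\theta) c_f \sum_{r=0}^{l} \sqrt{n_r} \, \rho_r^{-n_r},
\]
where $c(\theta)$ is defined in (2.23), $c_f$ is a constant depending on $f$ only and $\rho_r$ is determined by the largest Bernstein ellipse (appropriately scaled) within which the function $f|_{I_r}$ is analytic. Hence $f_{n,m}$ converges exponentially to $f$. The main question remaining is that of estimating the function $\Theta(n; \theta)$ for this reconstruction procedure. For this, we have the following result, which extends Theorems 3.2 and 3.3 to this case:

Theorem 3.9. The function $\Theta(n; \theta)$ satisfies
\[
\Theta(n; \theta) \leq 2 \left\lceil \frac12 + \frac{2(\pi - 2)}{\pi^2 (1 - \theta)} \sum_{r=0}^{l} \frac{n_r^2}{c_r} \right\rceil, \qquad \forall n = \sum_{r=0}^{l} n_r, \quad n_0, \ldots, n_l \in \mathbb{N},
\]
and
\[
\Theta(n; \theta) \leq \frac{4}{\pi^2 (1 - \theta)} \sum_{r=0}^{l} \frac{n_r^2}{c_r} + O(1), \qquad n_0, \ldots, n_l \to \infty.
\]

Proof. In view of Lemma 3.8, it suffices to consider $C_{n_r,m,r}$. To this end, let $J = [\alpha, \beta] \subseteq [-1, 1]$, let $\mathrm{T}_{n,J}$ be the space of functions $\phi$ with $\mathrm{supp}(\phi) \subseteq J$ and $\phi|_J \in \mathbb{P}_{n-1}$, and define
\[
E_{n,m} = \sup_{\substack{\phi \in \mathrm{T}_{n,J} \\ \|\phi\| = 1}} \langle \phi - P_m \phi, \phi \rangle.
\]
Let $\Lambda(x) = \frac{x - \alpha}{c} - 1$, where $c = \tfrac12 (\beta - \alpha)$, and write $\phi = \Phi \circ \Lambda$, where $\mathrm{supp}(\Phi) \subseteq [-1, 1]$. Consider the quantity $\langle \phi, \psi_j \rangle$.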
By definition of $\psi_j$, we have
\[
\langle \phi, \psi_j \rangle = \frac{1}{\sqrt{2}} \int_{-1}^{1} \phi(x) e^{-\mathrm{i} j \pi x} \, dx = \frac{c}{\sqrt{2}} e^{-\mathrm{i} j \pi (\alpha + c)} \int_{\Lambda(-1)}^{\Lambda(1)} \Phi(y) e^{-\mathrm{i} j \pi c y} \, dy.
\]
Let $K = [\Lambda(-1), \Lambda(1)] = \Lambda([-1, 1]) \supseteq [-1, 1]$ and let $P_{m,K}$ be the Fourier projection operator based on the interval $K$. It now follows that
\[
E_{n,m} = \sup \left\{ \langle \Phi - P_{m,K} \Phi, \Phi \rangle : \mathrm{supp}(\Phi) \subseteq [-1, 1],\ \Phi|_{[-1,1]} \in \mathbb{P}_{n-1},\ \|\Phi\| = 1 \right\}.
\]
This is now precisely the setup of Remark 5. Using (3.6), we therefore deduce that
\[
E_{n,m} \leq \frac{4 (\pi - 2) n^2}{c \pi^2 \left( 2 \lfloor \frac{m}{2} \rfloor - 1 \right)}.
\]
Letting $J = I_r$, $c = c_r$ and using Lemma 3.8, we now obtain
\[
C_{n,m} \geq 1 - \frac{4 (\pi - 2)}{\pi^2 \left( 2 \lfloor \frac{m}{2} \rfloor - 1 \right)} \sum_{r=0}^{l} \frac{n_r^2}{c_r}, \tag{3.16}
\]
from which the result follows immediately.

To implement this scheme, it is necessary to compute the values (3.5). By changing variables, it is easily seen that
\[
\langle \phi_{r,k}, \psi_j \rangle = \sqrt{\frac{c_r}{2}} \, e^{-\mathrm{i} j \pi d_r} \int_{-1}^{1} \phi_k(y) e^{-\mathrm{i} j \pi c_r y} \, dy,
\]
where $d_r = \tfrac12 (x_{r+1} + x_r)$. Since (3.4) holds for all $z \in \mathbb{C}$, it follows that
\[
\langle \phi_{r,k}, \psi_j \rangle = e^{-\mathrm{i} j \pi d_r} (-\mathrm{i})^k \sqrt{\frac{k + \frac12}{j}} \, J_{k + \frac12}(j \pi c_r), \tag{3.17}
\]
whenever the functions $\phi_{r,k}$ arise from scaled Legendre polynomials. Naturally, if the functions $\phi_{r,k}$ arise from arbitrary scaled Gegenbauer polynomials, computation of the values (3.5) can be carried out recursively via the algorithm described in Section 3.2 (for appropriate choices of $z$).

In Figure 7 we apply this method to the function (3.15) using the orthonormal basis of scaled Legendre polynomials. The improvement over Figure 6 is dramatic: using only $m \approx 250$ (with $n_0 = n_1 = 16$) we obtain 13 digits of accuracy. Note that, as expected, root exponential convergence occurs. Moreover, as predicted by (3.16), and illustrated in Table 3, the condition number of the matrix $A$ remains small.

[Figure 7: Error in approximating the function (3.15) by $f_{n,m}(x)$. Left: log error $\log_{10} \|f - f_{n,m}\|_\infty$ (squares) and $\log_{10} \|f - f_{n,m}\|$ (circles) against $m = 1, \ldots, 500$, with $n_0 = n_1$ chosen so that $m = \frac15 \left( \frac{n_0^2}{c_0} + \frac{n_1^2}{c_1} \right)$. Right: the error $\log_{10} |f(x) - f_{n,m}(x)|$ against $x \in [-1, 1]$ for $m = 20, 40, 80, 160$.]

m       $C_{n,m}$   $\kappa(A)$
10      0.05        19.88
20      0.34        2.92
40      0.33        3.06
80      0.44        2.27
160     0.44        2.27
320     0.47        2.11
640     0.49        2.03
1280    0.50        1.98

Table 3: The constant $C_{n,m}$ and the condition number $\kappa(A)$ against $m$.

3.6 Comparison to existing methods

Numerous methods exist for recovering functions to high accuracy from their Fourier data. Applications are myriad, and range from medical imaging [5, 6] to postprocessing of numerical solutions of hyperbolic PDEs [27, 36]. Prominent examples which deliver high global accuracy (in contrast to standard filtering techniques, which only yield high accuracy away from the singularities of a function [30]) include Gegenbauer reconstruction [29, 30, 31], techniques based on extrapolation [19], Padé methods [18] and Fourier extension methods [10, 33] (for a more complete list, see [12] and references therein).

Whilst many of these methods deliver exponential convergence in terms of $m$ (the number of Fourier coefficients), they all suffer from ill-conditioning. This comes as no surprise: the problem of reconstructing a function from its Fourier coefficients can be viewed as a continuous analogue of the recovery of a function from $m$ equidistant pointwise values. As proved in [42], any method for this problem that converges exponentially fast in $m$ will suffer from exponentially poor conditioning. We conjecture that a similar result holds in the continuous case. Aside from increased susceptibility to round-off error and noise, ill-conditioning often makes so-called inverse methods (e.g. extrapolation and Fourier extension methods) costly to implement.

Conversely, the method proposed in this paper does not suffer from any ill-conditioning. This negative consequence of [42] is circumvented, precisely because we witness only root exponential convergence in $m$.
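The conditioning claim can be checked on a small instance. The following is a minimal sketch of the piecewise Legendre reconstruction of Section 3.5 applied to (3.15), under several assumptions of ours: NumPy is available, the Fourier samples are indexed symmetrically as $j = -100, \ldots, 100$ (so $m = 201$), $n_0 = n_1 = 8$, and all inner products are computed by Gauss-Legendre quadrature rather than by the Bessel-function formula (3.17). The variable names are illustrative only.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legvander

def f(x):
    """Test function (3.15)."""
    x = np.asarray(x, dtype=float)
    left = (2.0 * np.exp(2.0 * np.pi * (x + 1.0)) - 1.0 - np.exp(np.pi)) / (np.exp(np.pi) - 1.0)
    right = -np.sin(2.0 * np.pi * x / 3.0 + np.pi / 3.0)
    return np.where(x < -0.5, left, right)

def phi(y, n):
    """Orthonormal Legendre functions sqrt(k + 1/2) P_k(y), k = 0, ..., n-1."""
    return legvander(y, n - 1) * np.sqrt(np.arange(n) + 0.5)

breaks = np.array([-1.0, -0.5, 1.0])   # x_0, x_1, x_2: jump location assumed known
n_r = [8, 8]                           # basis functions per subinterval (our choice)
j = np.arange(-100, 101)               # m = 201 Fourier modes (assumed symmetric indexing)
y, w = leggauss(400)                   # quadrature nodes and weights on [-1, 1]

blocks = []
fhat = np.zeros(j.size, dtype=complex)
for r, n in enumerate(n_r):
    c = 0.5 * (breaks[r + 1] - breaks[r])
    d = 0.5 * (breaks[r + 1] + breaks[r])
    x = c * y + d                                                  # nodes mapped into I_r
    psi_conj = np.exp(-1j * np.pi * np.outer(j, x)) / np.sqrt(2.0)
    # <phi_{r,k}, psi_j> = sqrt(c) * int_{-1}^{1} phi_k(y) conj(psi_j(c y + d)) dy
    blocks.append(np.sqrt(c) * (psi_conj * w) @ phi(y, n))
    # contribution of I_r to the Fourier sample <f, psi_j>
    fhat += c * (psi_conj * w) @ f(x)

U = np.hstack(blocks)
alpha = np.linalg.lstsq(U, fhat, rcond=None)[0].real   # least squares problem U alpha ~ fhat
kappa = np.linalg.cond(U.conj().T @ U)                 # condition number of A = U^dagger U

# Uniform error of f_{n,m}, measured on a grid inside each subinterval.
err, offset = 0.0, 0
t = np.linspace(-0.999, 0.999, 400)
for r, n in enumerate(n_r):
    c = 0.5 * (breaks[r + 1] - breaks[r])
    d = 0.5 * (breaks[r + 1] + breaks[r])
    approx = phi(t, n) @ alpha[offset:offset + n] / np.sqrt(c)
    err = max(err, float(np.max(np.abs(approx - f(c * t + d)))))
    offset += n
print(kappa, err)
```

For these parameters the bound (3.16) gives $C_{n,m} \geq 1 - 157.9/199 \approx 0.21$, and since the basis is orthonormal the eigenvalues of $A$ lie in $[C_{n,m}, 1]$, so the printed condition number should be below 5. The sketch oversamples relative to the scaling used for Figure 7, purely so that this guarantee applies.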
However, an advantage of this approach is that it delivers exponential convergence in $n$, the degree of the final approximant $f_{n,m}$. In many applications it may be necessary to manipulate $f_{n,m}$, its relatively low degree making such operations reasonably cheap. Thus, this method has the advantage of compression, a feature not shared by the majority of the other methods mentioned previously.

A well-established and widely used alternative to this method is the Gegenbauer reconstruction technique, proposed by Gottlieb et al. [31]. Much like this approach, it computes a polynomial approximation. Yet it stands out as being direct, meaning that no linear system or least squares problem is required to be solved. Whilst the original method has been shown to suffer from a generalised Runge phenomenon [11], thereby severely affecting its applicability, an improved approach based on Freud polynomials was recently developed in [24]. Comparatively speaking, this approach delivers exponential convergence in $O(m^2)$ operations. On the other hand, our method obtains root exponential convergence at a cost of $O(m^{3/2})$ operations. However, despite being theoretically more efficient, the various constants involved in the Gegenbauer/Freud procedure tend to be rather large.

m       (a)         (b)
64      8.90e-01    2.40e-04
128     1.37e-01    8.36e-09
256     1.84e-04    2.40e-14
512     1.01e-07    1.38e-14
1024    9.33e-13    1.74e-14
2048    5.27e-13    2.26e-14
4096    5.23e-14    2.59e-14

Table 4: Comparison of the (a) Freud polynomial method and (b) generalised reconstruction method applied to (3.15). Here $m$ is the total number of Fourier samples.

Indeed, in Table 4 we compare the error in approximating the function (3.15) using both procedures (the data for the Freud method is taken from [24, Table 1]. Note that the parameter $N$ used therein is such that $2N = m$ is the total number of Fourier coefficients).
As is evident, the method proposed in this paper obtains an error of order $10^{-14}$ using fewer than 256 Fourier coefficients, whereas the Freud procedure does not reach this value until more than 1024 coefficients are used.

The most likely reason for this improvement is that the generalised reconstruction method is quasi-optimal, thereby delivering a near-optimal polynomial approximation, whereas the Gegenbauer and Freud procedures do not possess this feature. Indeed, in the Gegenbauer case at least, we can quantitatively explain this phenomenon (the corresponding results for the Freud polynomial case are not so well understood). It is known that, for analytic functions $f$, the approximation $F_{n,m}$ obtained from the Gegenbauer procedure converges geometrically at rate $\gamma^m$, where
\[
\gamma = \max \left\{ \rho^{-1}, \frac{(\beta + 2\alpha)^{\beta + 2\alpha}}{(2 \pi e)^{\alpha} \alpha^{\alpha} \beta^{\beta}} \right\}
\]
(see [30, eqn. (4.12)]). Here $\lambda = \alpha m$ is the parameter of the Gegenbauer polynomials used and $n = \beta m$ is the degree of the polynomial $F_{n,m}$. Thus, in practice, the Gegenbauer method may converge much more slowly than the best polynomial approximation $Q_m f$, which converges at rate $\rho^{-m}$. Conversely, our approach offers root exponential convergence at a rate determined solely by $\rho$: $\|f - f_{n,m}\| \sim \rho^{-\sqrt{m}}$.

Suppose now that the number of Fourier coefficients is fixed. Then, ignoring various constants, the Gegenbauer method will offer a lower error than the generalised reconstruction procedure only when $\gamma^m \lesssim \rho^{-\sqrt{m}}$. In other words,
\[
m \gtrsim \left( \frac{\log \rho}{\log \gamma} \right)^2.
\]
In typical examples (see [30, 31]), the parameter values $\alpha = \beta = \tfrac14$ have been used, giving the condition $m \gtrsim 19 (\log \rho)^2$. Thus, for reasonably analytic functions $f$ (i.e. analytic in a sufficiently large Bernstein ellipse), the Gegenbauer procedure will only begin to outperform the generalised reconstruction method when the parameter $m$ is rather large.
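The constant 19 can be recovered by direct arithmetic. The short sketch below evaluates the second argument of the maximum defining $\gamma$ at $\alpha = \beta = \tfrac14$ and then forms $1 / (\log \gamma)^2$; the function name `gegenbauer_rate` is ours, and the calculation assumes $\rho$ is large enough (roughly $\rho \geq 1.27$) that this second argument, rather than $\rho^{-1}$, attains the maximum.

```python
import math

def gegenbauer_rate(alpha, beta):
    """Second argument of the max in the formula for gamma (independent of rho)."""
    s = beta + 2.0 * alpha
    return s**s / ((2.0 * math.pi * math.e)**alpha * alpha**alpha * beta**beta)

g = gegenbauer_rate(0.25, 0.25)
# gamma^m <~ rho^(-sqrt(m))  is equivalent to  m >~ (log rho)^2 / (log gamma)^2
crossover_constant = 1.0 / math.log(g)**2
print(g, crossover_constant)
```

This gives $\gamma \approx 0.79$ and a constant of roughly 18.6, which rounds up to the value 19 quoted in the text.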
Moreover, whenever $f$ is entire (as is the case with the example in Table 4), the generalised reconstruction procedure will converge super-geometrically (in $n = O(\sqrt{m})$), whereas the Gegenbauer method may still converge only exponentially at rate $\gamma$. Thus, in this situation the Gegenbauer method may never outperform the generalised reconstruction method for any finite $m$.

Aside from improved numerical performance, let us mention several other benefits. First, as discussed, the final approximation consists of only $O(\sqrt{m})$ terms, as opposed to $O(m)$. Second, the basis for the polynomial reconstruction space $\mathrm{T}_n$ can be chosen arbitrarily (in particular, independently of $m$) without affecting the convergence. The only downside is a mild increase in condition number if nonorthogonal polynomials are employed. In contrast, for the Freud/Gegenbauer procedure, only very specific types of polynomials can be used (which may not be simple to construct or manipulate [24]), and, whenever the number of samples $m$ is varied, all polynomials used for reconstruction must be changed.

To somewhat temper our claims, we mention that we have only considered one particular test function. There may well be problems where the Freud/Gegenbauer procedure performs better, and the intention of future work is to present a more thorough comparison of the two methods. That aside, one advantage of the Gegenbauer method is that it is local: the approximation in each subdomain is computed separately and independently of any other subdomain. Conversely, with our approach, the computations are inherently coupled. Nevertheless, it may be possible to devise a local version of our approach, a question we intend to explore in future investigations.

4 Reconstructions in tensor-product spaces

Thus far, we have focused on the reconstruction of univariate functions from Fourier samples. A simple extension of this approach, via tensor products, is to functions defined in cubes.
The aim of this section is to detail this generalisation.

To formalise this idea, let us return to the general perspective of Section 2. Suppose that the Hilbert space $\mathrm{H}$ can be expressed as a tensor-product $\mathrm{H} = \mathrm{H}_1 \otimes \cdots \otimes \mathrm{H}_d$ of Hilbert spaces $\mathrm{H}_i$, $i = 1, \ldots, d$, each having inner product $\langle \cdot, \cdot \rangle_i$. Note that, for $f = f_1 \otimes \cdots \otimes f_d \in \mathrm{H}$ and $g = g_1 \otimes \cdots \otimes g_d \in \mathrm{H}$, we have
\[
\langle f, g \rangle = \prod_{i=1}^{d} \langle f_i, g_i \rangle_i.
\]
Now suppose that the sampling basis consists of tensor-product functions. To this end, let $j = (j_1, \ldots, j_d) \in \mathbb{N}^d$, $\psi_j = \psi_{1,j_1} \otimes \cdots \otimes \psi_{d,j_d}$, and, for $m = (m_1, \ldots, m_d) \in \mathbb{N}^d$, set
\[
\mathrm{S}_m = \mathrm{span} \left\{ \psi_j : j = (j_1, \ldots, j_d),\ 1 \leq j_i \leq m_i,\ i = 1, \ldots, d \right\}.
\]
We assume throughout that the collection $\{\psi_{i,j}\}_{j=1}^{\infty}$ is a Riesz basis for $\mathrm{H}_i$ for $i = 1, \ldots, d$ (hence $\{\psi_j\}$ is a Riesz basis for $\mathrm{H}$). With this in hand, we define the operator $P_m : \mathrm{H} \to \mathrm{S}_m$ by
\[
P_m f = \sum_{j_1 = 1}^{m_1} \cdots \sum_{j_d = 1}^{m_d} \langle f, \psi_j \rangle \psi_j.
\]
Note that $P_m (f_1 \otimes \cdots \otimes f_d) = P_{1,m_1} f_1 \otimes \cdots \otimes P_{d,m_d} f_d$, where $P_{i,m_i} : \mathrm{H}_i \to \mathrm{S}_{i,m_i}$ is defined in the obvious manner. In a similar fashion, we introduce the reconstruction vectors $\phi_j = \phi_{1,j_1} \otimes \cdots \otimes \phi_{d,j_d}$, which form a basis for the reconstruction space
\[
\mathrm{T}_n = \mathrm{span} \left\{ \phi_j : j = (j_1, \ldots, j_d),\ 1 \leq j_i \leq n_i,\ i = 1, \ldots, d \right\}, \qquad n = (n_1, \ldots, n_d) \in \mathbb{N}^d.
\]
As before, we construct the approximation $f_{n,m} \in \mathrm{T}_n$ via the equations
\[
a_m(f_{n,m}, \phi) = a_m(f, \phi), \qquad \forall \phi \in \mathrm{T}_n,
\]
where $a_m(f, g) = \langle P_m f, g \rangle$, $\forall f, g \in \mathrm{H}$.

To cast this problem in a form suitable for computations, let $U^{[i]} \in \mathbb{C}^{m_i \times n_i}$ be the matrix with $(j,k)$th entry $\langle \psi_{i,j}, \phi_{i,k} \rangle_i$. Let $U \in \mathbb{C}^{\bar{m} \times \bar{n}}$ be the matrix of the $d$-variate reconstruction method, where $\bar{m} = m_1 \cdots m_d$ and $\bar{n} = n_1 \cdots n_d$. Then, it is easily shown that
\[
U = \bigotimes_{i=1}^{d} U^{[i]}, \qquad A = \bigotimes_{i=1}^{d} A^{[i]},
\]
where $A = U^{\dagger} U$ and $A^{[i]} = (U^{[i]})^{\dagger} U^{[i]}$, and, in this case, $B_1 \otimes B_2$ denotes the Kronecker product of the matrices $B_1$ and $B_2$. By a trivial argument, we conclude that the number of operations required to compute $f_{n,m}$ is of order $(n_1 m_1) \cdots (n_d m_d) \sqrt{\kappa(A)}$. Recall that the spectrum of the Kronecker product matrix $B_1 \otimes B_2$ consists precisely of the products $\lambda \mu$, where $\lambda$ is an eigenvalue of $B_1$ and $\mu$ is an eigenvalue of $B_2$. From this, we easily deduce that
\[
\kappa(A) = \prod_{i=1}^{d} \kappa(A^{[i]}).
\]
Hence, $\kappa(A)$ is completely determined by the matrices $A^{[i]}$, with the $i$th such matrix corresponding to the univariate reconstruction problem with sampling basis $\{\psi_{i,j}\}_{j=1}^{m_i}$ and reconstruction basis $\{\phi_{i,j}\}_{j=1}^{n_i}$. Unsurprisingly, a similar observation also holds for the quantity $C_{n,m}$:

Lemma 4.1. The constant $C_{n,m}$ satisfies $C_{n,m} = \prod_{i=1}^{d} C_{i,n_i,m_i}$.

Proof. By Lemma 2.15, $C_{n,m} = \lambda_{\min}(\tilde{A}^{-1} A)$ and $C_{i,n_i,m_i} = \lambda_{\min}((\tilde{A}^{[i]})^{-1} A^{[i]})$, $i = 1, \ldots, d$, where $\tilde{A}$ and $\tilde{A}^{[i]}$ are defined in the obvious manner. Since $\tilde{A} = \tilde{A}^{[1]} \otimes \cdots \otimes \tilde{A}^{[d]}$, the matrix $\tilde{A}^{-1}$ is the Kronecker product of the matrices $(\tilde{A}^{[i]})^{-1}$. The result now follows immediately.

4.1 Reconstruction of piecewise smooth functions

Having presented the general case, we now turn our attention to the reconstruction of a piecewise smooth function $f : [-1, 1]^d \to \mathbb{R}$. We shall make the significant (see later discussion) assumption that $f$ is smooth in hyper-rectangular subregions of $[-1, 1]^d$. To this end, let $l_i \in \mathbb{N}$ for $i = 1, \ldots, d$ and suppose that
\[
-1 = x_{0,i} < x_{1,i} < \ldots < x_{l_i,i} < x_{l_i+1,i} = 1,
\]
and define $I_{r,i} = [x_{r,i}, x_{r+1,i}]$ for $r = 0, \ldots, l_i$. For $r = (r_1, \ldots, r_d)$, we write $I_r = I_{r_1,1} \times \cdots \times I_{r_d,d}$, so that the collection
\[
\left\{ I_r : r = (r_1, \ldots, r_d),\ r_i = 0, \ldots, l_i,\ i = 1, \ldots, d \right\}
\]
consists of disjoint sets whose union is $[-1, 1]^d$. We assume that $f$ is smooth within each subdomain $I_r$. In addition, for $r_i = 0, \ldots, l_i$ and $i = 1, \ldots, d$, let $c_{r_i,i} = \tfrac12 (x_{r_i+1,i} - x_{r_i,i})$ and set $\Lambda_{r_i,i}(x) = \frac{x - x_{r_i,i}}{c_{r_i,i}} - 1$, $x \in I_{r_i,i}$. Note that $\Lambda_{r_i,i}(I_{r_i,i}) = [-1, 1]$.

We now construct a reconstruction space. To this end, for $n \in \mathbb{N}$ let $\mathrm{T}'_n$, $\dim \mathrm{T}'_n = n$, be a space of functions $\phi : \mathbb{R} \to \mathbb{C}$ with $\mathrm{supp}(\phi) \subseteq [-1, 1]$. Define
\[
\mathrm{T}_{n,r,i} = \left\{ \phi \circ \Lambda_{r,i} : \phi \in \mathrm{T}'_n \right\}, \qquad n \in \mathbb{N}.
\]
Now suppose that $n$ is the vector $(n_1, \ldots, n_d)$, where
\[
n_i = \sum_{r=0}^{l_i} n_{r,i}, \qquad i = 1, \ldots, d,
\]
for some $n_{r,i} \in \mathbb{N}$. We now define the reconstruction space $\mathrm{T}_n$ by
\[
\mathrm{T}_n = \bigotimes_{i=1}^{d} \bigoplus_{r=0}^{l_i} \mathrm{T}_{n_{r,i},r,i}.
\]
We seek a basis for this space. Let $\{\phi_1, \ldots, \phi_n\}$, $n \in \mathbb{N}$, be a basis for $\mathrm{T}'_n$, and set
\[
\phi_{r,j,i} = \frac{1}{\sqrt{c_{r,i}}} \phi_j \circ \Lambda_{r,i}.
\]
A basis for $\mathrm{T}_n$ is now given by
\[
\left\{ \phi_{r_1,j_1,1} \otimes \cdots \otimes \phi_{r_d,j_d,d} : j_i = 1, \ldots, n_{r_i,i},\ r_i = 0, \ldots, l_i,\ i = 1, \ldots, d \right\}.
\]
This framework gives a general means by which to construct reconstruction bases in the tensor-product case for functions which are piecewise smooth with discontinuities parallel to the co-ordinate axes.

Suppose now that we consider the recovery of such a function from its Fourier samples. Using the above framework, we are easily able to construct a basis consisting of piecewise polynomials of several variables. The main question remaining is that of estimating the function $C_{n,m}$. However, in view of Lemma 4.1 and the results derived in Section 3.1, a simple argument gives Theorem 4.2 below.

[Figure 8: The errors $\log_{10} \|f - f_{n,m}\|$ (squares) and $\log_{10} \|f - f_{n,m}\|_\infty$ (circles) for $n_1 = n_2 = 1, \ldots, 25$, where $f(x, y) = e^{x^2 y}$ (left) and $f(x, y) = \sin 3xy$ (right).]

Theorem 4.2. Suppose that $n = (n_1, \ldots, n_d)$, where $n_i = \sum_{r=0}^{l_i} n_{r,i}$, and let
\[
\Theta(n; \theta) = \min \left\{ m = (m_1, \ldots, m_d) : C_{n,m} \geq \theta \right\}, \qquad \theta \in (0, 1).
\]
If $\theta_1, \ldots, \theta_d \in (0, 1)$ satisfy $\theta = \theta_1 \cdots \theta_d$, then we may write $\Theta(n; \theta) = (\Theta_1(n_1; \theta_1), \ldots, \Theta_d(n_d; \theta_d))$, where, for $i = 1, \ldots, d$,
\[
\Theta_i(n_i; \theta_i) \leq 2 \left\lceil \frac12 + \frac{2(\pi - 2)}{\pi^2 (1 - \theta_i)} \sum_{r=0}^{l_i} \frac{n_{r,i}^2}{c_{r,i}} \right\rceil,
\]
and
\[
\Theta_i(n_i; \theta_i) \leq \frac{4}{\pi^2 (1 - \theta_i)} \sum_{r=0}^{l_i} \frac{n_{r,i}^2}{c_{r,i}} + O(1), \qquad n_{0,i}, \ldots, n_{l_i,i} \to \infty.
\]

The main consequence of this theorem is the following: regardless of the dimension, the variables $m_1, \ldots, m_d$ must scale quadratically with $n_1, \ldots, n_d$ to ensure quasi-optimal recovery.
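To make the quadratic scaling concrete, the explicit (non-asymptotic) bound of Theorem 4.2 can be evaluated directly. The following is a minimal sketch; the function names `sample_bound` and `tensor_sample_bound`, and the choice $\theta_i = \theta^{1/d}$, are our own illustrative assumptions. Note that it returns a sufficient number of samples, not the true value of $\Theta$, which numerical experiments suggest is noticeably smaller.

```python
import math

def sample_bound(n_r, c_r, theta):
    """Explicit univariate bound 2*ceil(1/2 + 2(pi-2)/(pi^2 (1-theta)) * sum n_r^2/c_r)."""
    s = sum(n * n / c for n, c in zip(n_r, c_r))
    return 2 * math.ceil(0.5 + 2.0 * (math.pi - 2.0) * s / (math.pi**2 * (1.0 - theta)))

def tensor_sample_bound(n, c, thetas):
    """Apply the univariate bound in each co-ordinate direction (Theorem 4.2)."""
    return tuple(sample_bound(ni, ci, ti) for ni, ci, ti in zip(n, c, thetas))

# Univariate example: breakpoint at x = -1/2, so c_0 = 1/4 and c_1 = 3/4.
m_piecewise = sample_bound([8, 8], [0.25, 0.75], 0.5)

# Bivariate smooth example: one piece per direction (c = 1), 20 polynomials each,
# with theta = 1/2 split evenly as theta_1 = theta_2 = sqrt(1/2).
t = math.sqrt(0.5)
m_tensor = tensor_sample_bound([[20], [20]], [[1.0], [1.0]], [t, t])
print(m_piecewise, m_tensor)
```

Doubling every $n_{r,i}$ roughly quadruples each component of the result, which is the quadratic scaling referred to above.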
Consider now the simplest example of this approach: namely, where $f$ is smooth in $[-1, 1]^d$, so that $\mathrm{T}_n$ consists of multivariate polynomials. In Figure 8 we plot the error in approximating the functions $f(x, y) = e^{x^2 y}$ and $f(x, y) = \sin 3xy$, using parameters $m_1 = m_2 = 0.5 n_1^2$ and $n_2 = n_1$. As in the univariate case, we observe the high accuracy of this technique. For example, using only $m_1 = m_2 \approx 200$ and $n_1 = n_2 \approx 20$ we obtain an error of order $10^{-14}$.

Remark 7 This approach (and many others based on tensor-product formulations) has the significant shortcoming that it requires the singularities of the function to lie in regions parallel to the co-ordinate axes. Naturally, this is a rather restrictive condition. For a function with singularities lying on a curve (in two dimensions, for example), one potential alternative is to apply the one-dimensional method along horizontal and vertical slices, and recover the two-dimensional function from the resulting one-dimensional reconstructions. However, the generality of the reconstruction framework presented in this paper allows for another approach. Given that we can reconstruct in any basis, an obvious alternative is to subdivide the domain into triangular elements and use a finite element basis for reconstruction. This is a subject for future investigation.

5 Other sampling problems

Overcoming the Gibbs phenomenon in Fourier series is an obvious application of the general framework developed in Section 2. However, there is no reason to restrict to this case, and this framework can be readily applied to design effective methods for a variety of other problems. In this section we describe several related problems, and the application of this framework therein.

5.1 Modified Fourier sampling

Modified Fourier series were proposed in [34] as a minor adjustment of Fourier series.
In the domain $[-1, 1]$, rather than expanding a function $f$ in the classical Fourier basis
\[
\{ \cos j \pi x : j \in \mathbb{N} \} \cup \{ \sin j \pi x : j \in \mathbb{N}_+ \},
\]
we construct the modified Fourier expansion using the basis
\[
\{ \cos j \pi x : j \in \mathbb{N} \} \cup \{ \sin (j - \tfrac12) \pi x : j \in \mathbb{N}_+ \}
\]
instead. Though this basis arises from only a minor adjustment of the Fourier basis, the result is an improved approximation: the modified Fourier series of a smooth, nonperiodic function converges uniformly at a rate of $O(m^{-1})$, whilst the Fourier series suffers from the Gibbs phenomenon. Although the convergence rate remains slow, this improvement over Fourier series, together with the fact that many of their principal benefits are retained, has given rise to a number of applications of such expansions. For a more detailed survey, we refer the reader to [4].

We shall consider modified Fourier expansions in a somewhat different context. Given the similarity between the two bases, it is reasonable to assume that any sampling procedure (e.g. an MRI scanner) can be adapted to compute the modified Fourier coefficients of a given function (or image/signal), as opposed to the standard Fourier samples. Indeed, if
\[
\mathcal{F} f(t) = \int_{-1}^{1} f(x) e^{-\mathrm{i} \pi t x} \, dx
\]
is the Fourier transform of $f$, then the modified Fourier coefficients are precisely
\[
\hat{f}^C_j = \int_{-1}^{1} f(x) \cos j \pi x \, dx = \frac12 \left[ \mathcal{F} f(j) + \mathcal{F} f(-j) \right],
\]
\[
\hat{f}^S_j = \int_{-1}^{1} f(x) \sin (j - \tfrac12) \pi x \, dx = \frac{\mathrm{i}}{2} \left[ \mathcal{F} f(j - \tfrac12) - \mathcal{F} f(\tfrac12 - j) \right],
\]
and hence can be computed from samples of the Fourier transform.

This raises the question: given that the general framework can handle sampling in either, is there an advantage gained from sampling in the modified Fourier basis, as opposed to the Fourier basis? As we shall show, provided the function is analytic and nonperiodic, this is indeed the case. Specifically, when we reconstruct in a polynomial basis, we require fewer samples to obtain quasi-optimal recovery to within a prescribed degree.
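The relation between modified Fourier coefficients and samples of the Fourier transform is easy to verify numerically. The sketch below is a minimal check, assuming NumPy, for the function $f(x) = e^{-x} \cos 8x$ considered later in this section; the integrals are approximated by Gauss-Legendre quadrature and the helper name `F` is ours. The sine coefficient is formed as the antisymmetric combination $\frac{\mathrm{i}}{2} [\mathcal{F} f(j - \tfrac12) - \mathcal{F} f(\tfrac12 - j)]$, which is what the identity $\sin \theta = (e^{\mathrm{i} \theta} - e^{-\mathrm{i} \theta}) / (2 \mathrm{i})$ gives.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

x, w = leggauss(200)   # quadrature nodes and weights on [-1, 1]

def f(x):
    return np.exp(-x) * np.cos(8.0 * x)

def F(t):
    """Fourier transform Ff(t) = int_{-1}^{1} f(x) exp(-i pi t x) dx, by quadrature."""
    return np.sum(w * f(x) * np.exp(-1j * np.pi * t * x))

j = 3
# Modified Fourier coefficients computed directly from their definitions.
cos_direct = np.sum(w * f(x) * np.cos(j * np.pi * x))
sin_direct = np.sum(w * f(x) * np.sin((j - 0.5) * np.pi * x))
# The same coefficients assembled from four samples of the Fourier transform.
cos_from_F = 0.5 * (F(j) + F(-j))
sin_from_F = 0.5j * (F(j - 0.5) - F(0.5 - j))
print(abs(cos_direct - cos_from_F), abs(sin_direct - sin_from_F))
```

Both differences are at the level of rounding error, and the assembled values have negligible imaginary part, as they must for real $f$.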
Suppose that we carry out the reconstruction procedure as in Section 3 but using modified Fourier samples instead of Fourier samples. For this, we set
\[
P_m f(x) = \frac12 \hat{f}^C_0 + \sum_{j=1}^{\lfloor m/2 \rfloor} \left[ \hat{f}^C_j \cos j \pi x + \hat{f}^S_j \sin (j - \tfrac12) \pi x \right].
\]
Naturally, we consider the function $\Theta(n; \theta)$ once more. In Figure 9 we plot the function $\Theta(n; \theta)$ for the modified Fourier basis. Upon comparison with Figure 2, we conclude the following: using modified Fourier sampling, as opposed to standard Fourier sampling, leads to a noticeable improvement. Specifically, $n^{-2} \Theta(n; \tfrac12)$ is approximately 0.15 for large $n$ in the modified Fourier case, as opposed to 0.4 in the Fourier case. This result means that, if the number of samples $m$ is fixed, we are able to take a much larger value of $n$ in the modified Fourier case, whilst retaining quasi-optimal recovery (with constant $c(\theta)$). As a result, we obtain better, guaranteed accuracy.

To illustrate this improvement, in Figure 10 we compare the errors in approximating the function $f(x) = e^{-x} \cos 8x$ from either its Fourier or modified Fourier data. In both cases the number of samples $m$ was fixed, and $n$ was chosen so that the parameter $C_{n,m} \geq \tfrac12$. As is evident, the method based on modified Fourier samples greatly outperforms the other. For example, using only $m = 120$ samples, we obtain an error of order $10^{-14}$ for the former, in comparison to only $10^{-4}$ for the latter.

[Figure 9: The function $n^{-2} \Theta(n; \theta)$ (squares), the global bound (circles) and the asymptotic bound (crosses) for $n = 1, \ldots, 80$ and $\theta = \tfrac12$ (left), $\theta = \tfrac34$ (right).]

[Figure 10: The errors $\log_{10} \|f - f_{n,m}\|_\infty$ (left) and $\log_{10} \|f - f_{n,m}\|$ (right) against $m = 5, 10, 15, \ldots, 200$, where $f_{n,m}$ is computed from modified Fourier (squares) or Fourier (circles) samples.]

As in the Fourier case, to implement the modified Fourier-based approach it is necessary to have estimates for the function $\Theta(n; \theta)$. These are particularly simple to derive:

Lemma 5.1. Let $\mathrm{T}_n = \mathbb{P}_{n-1}$ and $\mathrm{S}_m$ be the space spanned by the first $m$ modified Fourier basis functions. Then the function $\Theta(n; \theta)$ satisfies
\[
\Theta(n; \theta) \leq \frac{2}{\pi \sqrt{1 - \theta}} \, k_n n^2, \qquad \text{where} \quad k_n = n^{-2} \sup_{\substack{\phi \in \mathbb{P}_{n-1} \\ \|\phi\| = 1}} \|\phi'\|.
\]

Proof. In [2] it was shown that $\|\phi - P_m \phi\| \leq \frac{2}{m \pi} \|\phi'\|$ for all sufficiently smooth functions $\phi$. The result now follows immediately from the definition of $C_{n,m}$.

As a result of this lemma, analytical bounds for $\Theta(n; \theta)$ are dependent solely on the constant $k_n$ of the Markov inequality $\|\phi'\| \leq k_n n^2 \|\phi\|$, $\forall \phi \in \mathbb{P}_{n-1}$. Markov inequalities and their constants are well understood. The question of determining $k_n$ was first studied by Schmidt [43], who derived the estimates
\[
k_n \leq \frac{1}{\sqrt{2}}, \quad \forall n, \qquad k_n \to \frac{1}{\pi}, \quad n \to \infty. \tag{5.1}
\]
In [44] the following improved asymptotic estimate was also obtained:
\[
k_n n^2 = \frac{(n + \frac12)^2}{\pi} \left( 1 - \frac{\pi^2 - 3}{12 (n + \frac12)^2} + \frac{R_n}{(n + \frac12)^4} \right)^{-1}, \qquad n \geq 5, \tag{5.2}
\]
where $-6 < R_n < 13$ (we refer the reader to [8] for a more thorough discussion of both these results and more recent work on this topic). Returning to $\Theta(n; \theta)$, we now substitute (5.1) and (5.2) into the result of Lemma 5.1 to obtain the global and asymptotic bounds. In Figure 9 we compare these bounds to the numerically computed values of $\Theta(n; \theta)$. The relative sharpness of such estimates is once more observed.

5.2 Polynomial sampling

The primary concern of this paper has been reconstruction from Fourier (or Fourier-like) samples. However, in several circumstances, most notably the spectral approximation of PDEs with discontinuous solutions [23, 27], the problem arises where a piecewise analytic function has been sampled in an orthogonal polynomial basis. As previously noted, this approximation will converge slowly (and suffer from a Gibbs-type phenomenon near discontinuities), hence it is necessary to seek a new approximant with faster convergence.
Whilst a version of the Gegenbauer reconstruction method has been developed for this task [28, 29], the advantages of the method proposed in this paper (see Section 3.6) make it a potential alternative to this existing approach. Hence, the purpose of this section is to give a brief overview of this application.

It is beyond the scope of this paper to develop this example of the reconstruction procedure in its full generality. Instead, we consider only the recovery of a piecewise analytic function $f : [-1, 1] \to \mathbb{R}$ from its first $m$ Legendre polynomial coefficients
\[
\hat{f}_j = \langle f, \psi_j \rangle, \qquad j = 0, \ldots, m - 1,
\]
where $\psi_j = (j + \tfrac12)^{\frac12} P_j(x)$ is the $j$th normalised Legendre polynomial. Proceeding as in Section 3.5, we assume that $f$ has jump discontinuities at $-1 < x_1 < \ldots < x_l < 1$, and seek an approximation of the form
\[
f_{n,m}(x) = \sum_{r=0}^{l} \sum_{j=0}^{n_r - 1} \alpha_{r,j} \phi_{r,j}(x), \qquad n = \sum_{r=0}^{l} n_r,
\]
where $\phi_{r,j}(x) = \frac{1}{\sqrt{c_r}} \phi_j(\Lambda_r(x))$, $\Lambda_r(x) = \frac{x - x_r}{c_r} - 1$, $c_r = \tfrac12 (x_{r+1} - x_r)$ and $\{\phi_0, \ldots, \phi_{n-1}\}$ is a system of polynomials on $[-1, 1]$. Since $f$ is piecewise analytic, we expect exponential convergence of $f_{n,m}$ to $f$, provided $m$ is sufficiently large in comparison to $n$.

Aside from determining how large $m$ must be in comparison to $n$ for recovery, the main question remaining is that of implementation, i.e. how to compute the entries of the matrix $U$. This requires evaluation of the integrals
\[
\int_{x_r}^{x_{r+1}} \psi_j(x) \phi_{r,k}(x) \, dx, \qquad j = 0, \ldots, m - 1, \quad k = 0, \ldots, n - 1.
\]
Whenever the reconstruction functions $\phi_{r,k}$ arise from Gegenbauer polynomials, these calculations can be done iteratively. For the sake of brevity, we will not describe this computation in full generality. Instead, we consider only the situation where the functions $\phi_{r,k}$ arise from Legendre polynomials, in which case we are required to compute the integrals
\[
\int_{x_r}^{x_{r+1}} P_j(x) P_k(\Lambda_r(x)) \, dx, \qquad j = 0, \ldots, m - 1, \quad k = 0, \ldots, n - 1, \quad r = 0, \ldots, l.
\]
We have

Lemma 5.2. Let
\[
u_{j,k} = \int_a^b P_j(x) P_k(cx + d) \, dx, \qquad j, k = 0, 1, 2, \ldots, \tag{5.3}
\]
where $ca + d = -1$ and $cb + d = 1$. Then $u_{0,0} = b - a$,
\[
u_{0,k} = \frac{2}{c} \delta_{0,k}, \qquad u_{1,k} = \frac{2}{3c^2} \delta_{1,k} - \frac{2d}{c^2} \delta_{0,k}, \qquad k = 0, 1, 2, \ldots,
\]
\[
u_{j,0} = \frac{1}{j} \Big[ P_{j+1}(x) - x P_j(x) \Big]_{x=a}^{b}, \qquad j = 1, 2, \ldots,
\]
and, for $j \geq 2$ and $k \geq 1$,
\[
u_{j,k} = \frac{(2j - 1)(k + 1)}{cj(2k + 1)} u_{j-1,k+1} + \frac{(2j - 1)k}{cj(2k + 1)} u_{j-1,k-1} - \frac{d(2j - 1)}{cj} u_{j-1,k} - \frac{j - 1}{j} u_{j-2,k}. \tag{5.4}
\]

Proof. Recall the recurrence relation $j P_j(x) = (2j - 1) x P_{j-1}(x) - (j - 1) P_{j-2}(x)$, $j = 2, 3, \ldots$, for Legendre polynomials [1, chpt 22]. Substituting this into (5.3) gives
\[
u_{j,k} = \frac{2j - 1}{j} \int_a^b x P_{j-1}(x) P_k(cx + d) \, dx - \frac{j - 1}{j} u_{j-2,k}. \tag{5.5}
\]
Letting $x \mapsto cx + d$ in the recurrence relation and rearranging, we find that
\[
x P_k(cx + d) = \frac{k + 1}{c(2k + 1)} P_{k+1}(cx + d) - \frac{d}{c} P_k(cx + d) + \frac{k}{c(2k + 1)} P_{k-1}(cx + d).
\]
The recurrence (5.4) now follows upon substituting this into the previous expression.

Next consider $u_{j,0}$. Since $P_0 \equiv 1$, we have $u_{0,0} = b - a$ and $u_{j,0} = \int_a^b P_j(x) \, dx$ for $j \geq 1$. Recall that the $j$th Legendre polynomial satisfies the Legendre differential equation [1, chpt 22]
\[
\left( (1 - x^2) P_j'(x) \right)' = -j(j + 1) P_j(x).
\]
Substituting for $P_j$ in $\int_a^b P_j(x) \, dx$ and integrating gives
\[
u_{j,0} = \frac{1}{j(j + 1)} \Big[ (x^2 - 1) P_j'(x) \Big]_{x=a}^{b}.
\]
The result now follows directly from the expression
\[
(1 - x^2) P_j'(x) = (j + 1) \left( x P_j(x) - P_{j+1}(x) \right), \qquad j = 0, 1, 2, \ldots.
\]
To complete the proof, we consider $u_{0,k}$ and $u_{1,k}$. By the assumptions on $a, b, c, d$, we find that
\[
u_{0,k} = \frac{1}{c} \int_{-1}^{1} P_k(x) \, dx.
\]
Orthogonality now gives $u_{0,k} = \frac{2}{c} \delta_{0,k}$, as required. For $u_{1,k}$, the substitution $y = cx + d$ gives $x = (y - d)/c$ and $dx = dy/c$, so that
\[
u_{1,k} = \frac{1}{c^2} \int_{-1}^{1} (y - d) P_k(y) \, dy = \frac{1}{c^2} \left( \frac23 \delta_{1,k} - 2d \delta_{0,k} \right),
\]
as required.

[Figure 11: Left: the error $\log_{10} |f(x) - f_{n,m}(x)|$ for $-1 \leq x \leq 1$ and $m = 20, 40, 80, 160$. Right: log errors $\log_{10} \|f - f_{n,m}\|_\infty$ (squares) and $\log_{10} \|f - f_{n,m}\|$ (circles) against $m$.]

In Figure 11 we consider the approximation of the function
\[
f(x) = \begin{cases} \sin(\cos x), & -\tfrac12 \leq x < \tfrac12, \\ 0, & \text{otherwise}, \end{cases} \tag{5.6}
\]
by the aforementioned method, using parameter values $m = \tfrac18 n^2$, $n_0 = n_2 = \tfrac14 n$ and $n_1 = \tfrac12 n$. As exhibited, we obtain 13 digits of accuracy using only $m \approx 120$ Legendre coefficients of (5.6). Note that, although we have not shown it, the scaling $m = O(n^2)$ appears to be sufficient for recovery. Numerical results supporting this hypothesis are given in Table 5.

n     $C_{n,m}$   $\kappa(U^* U)$
8     0.98        19.97
16    0.87        4.17
24    0.85        3.57
32    0.85        3.57
40    0.85        3.50
48    0.84        3.43
56    0.84        3.43
64    0.84        3.41
72    0.84        3.38
80    0.84        3.38

Table 5: The values $C_{n,m}$ and $\kappa(U^* U)$ against $n$, where $m = \tfrac18 n^2$.

The function (5.6) was introduced in [28] to test the Gegenbauer reconstruction method when applied to this type of problem. As shown in Figure 11, we obtain a uniform error of roughly $10^{-8}$ using only $m = 40$ coefficients, and when $m = 120$, the corresponding value is $10^{-14}$. In comparison, the Gegenbauer method gives errors of roughly $10^{-3}$ and $10^{-7}$ for these values of $m$ (see [28, Fig. 3]), the latter being $10^7$ times larger.

Whilst this method appears to be a promising alternative, it should be mentioned that the recursive scheme to compute the entries of $U$ requires $O(m^2)$ operations. Since only $O(mn)$ operations are required to compute the approximation $f_{n,m}$, this is clearly less than ideal. Having said that, the Gegenbauer reconstruction method requires $O(m^2)$ operations to compute each approximant, whereas with this scheme this higher cost is only incurred in a preprocessing step.

6 Conclusions and future work

We have presented a reconstruction procedure to recover a function using any collection of linearly independent vectors, given a finite number of its samples with respect to an arbitrary Riesz basis. This approach is both stable and quasi-optimal, provided the number of samples $m$ is sufficiently large in comparison to the number of reconstruction vectors $n$. Moreover, this condition can be estimated numerically or, in certain circumstances, analytically.
A prominent example of this approach is the reconstruction of a piecewise analytic function from its Fourier samples. Using a piecewise polynomial basis, this results in an approximation that converges root exponentially in terms of $m$, or exponentially in terms of $n$.

There are many potential avenues to pursue in extending this work, as we now detail:

1. Piecewise polynomial reconstructions from polynomial samples. In the final section of the paper we detailed the recovery of a piecewise analytic function in a piecewise polynomial basis, given its Legendre polynomial expansion coefficients. Herein, an important open problem is verifying that the scaling $m = O(n^2)$ is sufficient for reconstruction. Other challenges involve devising an iterative scheme for computing the entries of $U$ that is valid for reconstructions in arbitrary Gegenbauer polynomials and involves only $O(mn)$ operations, as opposed to $O(m^2)$. Naturally, future work will also investigate the extension of this approach to reconstruction from arbitrary Gegenbauer polynomial expansion coefficients (as opposed to just Legendre polynomial expansion coefficients).

2. Gegenbauer polynomial reconstructions from Fourier samples. As shown, the reconstruction procedure can be implemented with arbitrary Gegenbauer polynomials. However, unless Legendre polynomials are used, the reconstruction is not stable. This problem arises because Gegenbauer polynomials do not form a Riesz basis for the space $L^2(-1, 1)$ unless $\lambda = \tfrac12$. However, Gegenbauer polynomials do form an orthogonal basis for the weighted space $L^2_\omega(-1, 1)$, where $\omega(x) = (1 - x^2)^{\lambda - \frac12}$. Hence, it is natural to ask whether the reconstruction procedure can be adjusted to incorporate this additional structure, thereby yielding a stable method. It turns out that this can be done, with the first step being the derivation of an extended abstract framework along similar lines to Section 2.
We are currently compiling results in this case, and will report the details in a future paper.

3. Applications. Aside from the obvious applications in image and signal processing, there are a number of other potential uses of the procedure. First, it may be applicable to the spectral discretisation of PDEs. Spectral methods are extremely efficient for solving problems with smooth solutions. However, for problems that develop discontinuities, e.g. hyperbolic conservation laws, a postprocessor is required to recover high accuracy [27]. The Gegenbauer reconstruction technique has recently been applied to such problems (see [23, 27] and references therein). Given the potential advantages of the method developed in this paper (see Section 3.6), it is of significant interest to apply this approach to these problems. Aside from high accuracy, a pertinent issue in the use of spectral approximations for nonsmooth problems is the question of stability [27]. Since the generalised reconstruction procedure is well-conditioned, there may also be a benefit in this regard. Outside of PDEs, the Gegenbauer reconstruction technique has also been extended to other types of series, including radial basis functions [39], Fourier–Bessel series [40] and spherical harmonics [22]. Future work will also consider generalisation of the method of this paper along these lines.

4. Spline and wavelet-based reconstructions. Reconstructions in spline and wavelet bases are important in numerous applications. In [3], the authors gave a first insight into the application of such bases to the Fourier sampling problem. However, the theory is far from complete. Moreover, the use of more exotic objects, such as curvelets [16] and ridgelets [15], remains to be investigated.

5. Recovery from pointwise samples. The discrete analogue of the Fourier coefficient recovery problem involves the reconstruction of a function from m equispaced samples in [−1, 1].
This problem has received more attention of late [13, 42] than the continuous case considered in this paper. In particular, the so-called mock–Chebyshev method [14] can be viewed as a discrete analogue of this approach. Whilst the mock–Chebyshev method is well understood, there remain a number of other problems of interest concerning reconstruction from pointwise samples. One such problem, with application to spectral collocation schemes based on Chebyshev or Legendre polynomials, is the recovery of a piecewise analytic function from its values at Gauss or Gauss–Lobatto nodes. Future work will consider this problem.

References

[1] M. Abramowitz and I. A. Stegun. Handbook of Mathematical Functions. Dover, 1974.
[2] B. Adcock. Multivariate modified Fourier series and application to boundary value problems. Numer. Math., 115(4):511–552, 2010.
[3] B. Adcock and A. C. Hansen. A generalized sampling theorem for stable reconstructions in arbitrary bases. Technical report NA2010/07, DAMTP, University of Cambridge, 2010.
[4] B. Adcock and D. Huybrechs. Multivariate modified Fourier expansions. In E. Rønquist et al., editors, Proceedings of the International Conference on Spectral and High Order Methods (to appear), 2010.
[5] R. Archibald, K. Chen, A. Gelb, and R. Renault. Improving tissue segmentation of human brain MRI through preprocessing by the Gegenbauer reconstruction method. NeuroImage, 20(1):489–502, 2003.
[6] R. Archibald and A. Gelb. A method to reduce the Gibbs ringing artifact in MRI scans while keeping tissue boundary integrity. IEEE Transactions on Medical Imaging, 21(4):305–319, 2002.
[7] H. Bateman. Higher Transcendental Functions. Vol. 2, McGraw–Hill, New York, 1953.
[8] A. Böttcher and P. Dörfler. Weighted Markov-type inequalities, norms of Volterra operators, and zeros of Bessel functions. Math. Nachr., 283(1):40–57, 2010.
[9] J. P. Boyd. Chebyshev and Fourier Spectral Methods. Springer–Verlag, 1989.
[10] J. P. Boyd.
A comparison of numerical algorithms for Fourier Extension of the first, second, and third kinds. J. Comput. Phys., 178:118–160, 2002.
[11] J. P. Boyd. Trouble with Gegenbauer reconstruction for defeating Gibbs phenomenon: Runge phenomenon in the diagonal limit of Gegenbauer polynomial approximations. J. Comput. Phys., 204(1):253–264, 2005.
[12] J. P. Boyd. Acceleration of algebraically-converging Fourier series when the coefficients have series in powers of 1/n. J. Comput. Phys., 228:1404–1411, 2009.
[13] J. P. Boyd and J. R. Ong. Exponentially-convergent strategies for defeating the Runge phenomenon for the approximation of non-periodic functions. I. Single-interval schemes. Commun. Comput. Phys., 5(2–4):484–497, 2009.
[14] J. P. Boyd and F. Xu. Divergence (Runge phenomenon) for least-squares polynomial approximation on an equispaced grid and mock-Chebyshev subset interpolation. Appl. Math. Comput., 210(1):158–168, 2009.
[15] E. J. Candès and D. L. Donoho. Ridgelets: a key to higher-dimensional intermittency? Phil. Trans. R. Soc. Lond. A, 357(10):2495–2509, 1999.
[16] E. J. Candès and D. L. Donoho. New tight frames of curvelets and optimal representations of objects with piecewise $C^2$ singularities. Comm. Pure Appl. Math., 57(2):219–266, 2004.
[17] O. Christensen. An Introduction to Frames and Riesz Bases. Birkhäuser, 2003.
[18] T. A. Driscoll and B. Fornberg. A Padé-based algorithm for overcoming the Gibbs phenomenon. Numer. Algorithms, 26:77–92, 2001.
[19] K. S. Eckhoff. On a high order numerical method for functions with singularities. Math. Comp., 67(223):1063–1087, 1998.
[20] Y. C. Eldar. Sampling without input constraints: Consistent reconstruction in arbitrary spaces. Sampling, Wavelets and Tomography, 2003.
[21] Y. C. Eldar and T. Werther. General framework for consistent sampling in Hilbert spaces. Int. J. Wavelets Multiresolut. Inf. Process., 3(3):347, 2005.
[22] A. Gelb. The resolution of the Gibbs phenomenon for spherical harmonics. Math. Comp., 66(218):699–717, 1997.
[23] A. Gelb and S. Gottlieb. The resolution of the Gibbs phenomenon for Fourier spectral methods. In A. Jerri, editor, Advances in The Gibbs Phenomenon. Sampling Publishing, Potsdam, New York, 2007.
[24] A. Gelb and J. Tanner. Robust reprojection methods for the resolution of the Gibbs phenomenon. Appl. Comput. Harmon. Anal., 20:3–25, 2006.
[25] D. Gilbarg and N. S. Trudinger. Elliptic Partial Differential Equations of Second Order. Springer–Verlag, 2001.
[26] G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins University Press, Baltimore, 2nd edition, 1989.
[27] D. Gottlieb and J. S. Hesthaven. Spectral methods for hyperbolic problems. J. Comput. Appl. Math., 128(1–2):83–131, 2001.
[28] D. Gottlieb and C-W. Shu. On the Gibbs phenomenon IV: Recovering exponential accuracy in a subinterval from a Gegenbauer partial sum of a piecewise analytic function. Math. Comp., 64(211):1081–1095, 1995.
[29] D. Gottlieb and C-W. Shu. On the Gibbs phenomenon III: Recovering exponential accuracy in a subinterval from a spectral partial sum of a piecewise analytic function. SIAM J. Num. Anal., 33(1):280–290, 1996.
[30] D. Gottlieb and C-W. Shu. On the Gibbs’ phenomenon and its resolution. SIAM Rev., 39(4):644–668, 1997.
[31] D. Gottlieb, C-W. Shu, A. Solomonoff, and H. Vandeven. On the Gibbs phenomenon I: Recovering exponential accuracy from the Fourier partial sum of a nonperiodic analytic function. J. Comput. Appl. Math., 43(1–2):91–98, 1992.
[32] T. Hrycak and K. Gröchenig. Pseudospectral Fourier reconstruction with the modified inverse polynomial reconstruction method. J. Comput. Phys., 229(3):933–946, 2010.
[33] D. Huybrechs. On the Fourier extension of non-periodic functions. SIAM J. Numer. Anal., 47(6):4326–4355, 2010.
[34] A. Iserles and S. P. Nørsett. From high oscillation to rapid approximation I: Modified Fourier expansions. IMA J. Num. Anal., 28:862–887, 2008.
[35] A. J. Jerri, editor. The Gibbs phenomenon in Fourier Analysis, Splines, and Wavelet Approximations. Kluwer Academic, Dordrecht, The Netherlands, 1998.
[36] A. J. Jerri, editor. Advances in the Gibbs Phenomenon. Sampling Publishing, Potsdam, New York, 2007.
[37] J.-H. Jung and B. D. Shizgal. Towards the resolution of the Gibbs phenomena. J. Comput. Appl. Math., 161(1):41–65, 2003.
[38] J.-H. Jung and B. D. Shizgal. Generalization of the inverse polynomial reconstruction method in the resolution of the Gibbs phenomenon. J. Comput. Appl. Math., 172(1):131–151, 2004.
[39] J. H. Jung, S. Gottlieb, S. O. Kim, C. L. Bresten, and D. Higgs. Recovery of high order accuracy in radial basis function approximations of discontinuous problems. J. Sci. Comput., 45:359–381, 2010.
[40] J. R. Kamm, T. O. Williams, J. S. Brock, and S. Li. Application of Gegenbauer polynomial expansions to mitigate Gibbs phenomenon in Fourier–Bessel series solutions of a dynamic sphere problem. Int. J. Numer. Meth. Biomed. Engng., 26:1276–1292, 2010.
[41] T. W. Körner. Fourier Analysis. Cambridge University Press, 1988.
[42] R. Platte, L. N. Trefethen, and A. Kuijlaars. Impossibility of fast stable approximation of analytic functions from equispaced samples. SIAM Rev. (to appear), 2010.
[43] E. Schmidt. Die asymptotische Bestimmung des Maximums des Integrals über das Quadrat der Ableitung eines normierten Polynoms, dessen Grad ins Unendliche wächst. Sitzungsber. Preuss. Akad. Wiss., page 287, 1932.
[44] E. Schmidt. Über die nebst ihren Ableitungen orthogonalen Polynomensysteme und das zugehörige Extremum. Math. Ann., 119:165–204, 1944.
[45] E. Tadmor. Filters, mollifiers and the computation of the Gibbs’ phenomenon. Acta Numerica, 16:305–378, 2007.
[46] E. Tadmor and J. Tanner. Adaptive mollifiers for high resolution recovery of piecewise smooth data from its spectral information. Foundations of Computational Mathematics, 2(2):155–189, 2002.
[47] M. Unser. Sampling–50 years after Shannon. Proc. IEEE, 88(4):569–587, 2000.
